Message ID | 20230303223056.966286-1-richard.henderson@linaro.org |
---|---|
Headers | show |
Series | tcg: Merge two sequential labels | expand |
> -----Original Message----- > From: Richard Henderson <richard.henderson@linaro.org> > Sent: Friday, March 3, 2023 3:31 PM > To: qemu-devel@nongnu.org > Cc: anjo@rev.ng; Taylor Simpson <tsimpson@quicinc.com> > Subject: [PATCH 0/2] tcg: Merge two sequential labels > > WARNING: This email originated from outside of Qualcomm. Please be wary > of any links or attachments, and do not enable macros. > > Supercedes: 20230301142221.24495-1-anjo@rev.ng > ("[PATCH] tcg: `reachable_code_pass()` remove empty else-branch") > > Anton has probably seen this already but his patch now has conflicts with > master. Rather than a while loop looking back across labels, let's fold all > adjacent labels into a single label, at which point the current branch-to-next > optimization works as is. > > Perhaps I missed it, but I didn't see how to reproduce the issue being > resolved by the hexagon front end? So I haven't this trigger, but it passes > the testsuite. :-) > > > r~ Hi Richard, Thanks!! You need this series of patches to see this pattern from Hexagon https://patchew.org/QEMU/20230131225647.25274-1-tsimpson@quicinc.com/ I have an update to that series for your tcg_temp_local_* series, but I'm wondering if I should wait for your tcg_temp_free patches to land first. Please advise. Anyway, I tested your patch, and can see the optimization happening. Here is the input code and unoptimized TCG: ---------------- IN: 0x00400094: 0xf900c102 { if (P0) R2 = and(R0,R1) } OP: ld_i32 loc0,env,$0xfffffffffffffff0 brcond_i32 loc0,$0x0,lt,$L0 ---- 00400094 and_i32 loc2,p0,$0x1 brcond_i32 loc2,$0x0,eq,$L1 and_i32 loc4,r0,r1 mov_i32 r2,loc4 br $L2 set_label $L1 set_label $L2 add_i32 pkt_cnt,pkt_cnt,$0x1 add_i32 insn_cnt,insn_cnt,$0x1 mov_i32 pc,$0x400098 exit_tb $0x0 set_label $L0 exit_tb $0x7fea80000043 Here is the optimized TCG after your patches. The two labels have been merged, and the unnecessary branch has been removed. OP after optimization and liveness analysis: ld_i32 tmp0,env,$0xfffffffffffffff0 dead: 1 pref=0xffff brcond_i32 tmp0,$0x0,lt,$L0 dead: 0 ---- 00400094 and_i32 tmp2,p0,$0x1 dead: 1 2 pref=0xffff brcond_i32 tmp2,$0x0,eq,$L2 dead: 0 1 and_i32 tmp4,r0,r1 dead: 1 2 pref=0xffff mov_i32 r2,tmp4 sync: 0 dead: 0 1 pref=0xffff set_label $L2 add_i32 pkt_cnt,pkt_cnt,$0x1 sync: 0 dead: 0 1 pref=0xffff add_i32 insn_cnt,insn_cnt,$0x1 sync: 0 dead: 0 1 2 pref=0xffff mov_i32 pc,$0x400098 sync: 0 dead: 0 1 pref=0xffff exit_tb $0x0 set_label $L0 exit_tb $0x7fea80000043 Tested-by: Taylor Simpson <tsimpson@quicinc.com>
On 3/3/23 16:27, Taylor Simpson wrote: > You need this series of patches to see this pattern from Hexagon > https://patchew.org/QEMU/20230131225647.25274-1-tsimpson@quicinc.com/ > > I have an update to that series for your tcg_temp_local_* series, but I'm wondering if I should wait for your tcg_temp_free patches to land first. Please advise. I'm hoping to get the tcg_temp_free patches in asap, as the potential conflicts are legion. Thanks for the test pointer and quick review. r~