Message ID | 5649E333.4090904@arm.com |
---|---|
State | Superseded |
Headers | show |
Hi Bernd, On 16/11/15 18:40, Bernd Schmidt wrote: > On 11/16/2015 03:07 PM, Kyrill Tkachov wrote: > >> I've explained in the comments in the patch what's going on but the >> short version is trying to change the destination of a defining insn >> that feeds into an extend insn is not valid if the defining insn >> doesn't feed directly into the extend insn. In the ree pass the only >> way this can happen is if there is an intermediate conditional move >> that the pass tries to handle in a special way. An equivalent fix >> would have been to check on that path (when copy_needed in >> combine_reaching_defs is true) that the state->copies_list vector >> (that contains the conditional move insns feeding into the extend >> insn) is empty. > > I ran this through gdb, and I think I see what's going on. For reference, here's a comment from the source: > > /* Considering transformation of > (set (reg1) (expression)) > ... > (set (reg2) (any_extend (reg1))) > > into > > (set (reg2) (any_extend (expression))) > (set (reg1) (reg2)) > ... */ > > I was thinking that another possible fix would be to also check !reg_used_between_p for reg1 to ensure it's not used. I'm thinking this might be a little clearer - what is your opinion? Yes, I had considered that as well. It should be equivalent. I didn't use !reg_used_between_p because I thought it'd be more expensive than checking reg_overlap_mentioned_p since we must iterate over a number of instructions and call reg_overlap_mentioned_p on each one. But I suppose this case is rare enough that it wouldn't make any measurable difference. Would you prefer to use !reg_used_between_p here? > > The added comment could lead to some confusion since it's placed in front of an existing if statement that also tests a different condition. Also, if we go with your fix, > >> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) > > Shouldn't this really be !rtx_equal_p? > Maybe, will it behave the right way if the two regs have different modes or when subregs are involved? (will we even hit such a case in this path?) Thanks, Kyrill > > Bernd >
On 17/11/15 09:08, Kyrill Tkachov wrote: > Hi Bernd, > > On 16/11/15 18:40, Bernd Schmidt wrote: >> On 11/16/2015 03:07 PM, Kyrill Tkachov wrote: >> >>> I've explained in the comments in the patch what's going on but the >>> short version is trying to change the destination of a defining insn >>> that feeds into an extend insn is not valid if the defining insn >>> doesn't feed directly into the extend insn. In the ree pass the only >>> way this can happen is if there is an intermediate conditional move >>> that the pass tries to handle in a special way. An equivalent fix >>> would have been to check on that path (when copy_needed in >>> combine_reaching_defs is true) that the state->copies_list vector >>> (that contains the conditional move insns feeding into the extend >>> insn) is empty. >> >> I ran this through gdb, and I think I see what's going on. For reference, here's a comment from the source: >> >> /* Considering transformation of >> (set (reg1) (expression)) >> ... >> (set (reg2) (any_extend (reg1))) >> >> into >> >> (set (reg2) (any_extend (expression))) >> (set (reg1) (reg2)) >> ... */ >> >> I was thinking that another possible fix would be to also check !reg_used_between_p for reg1 to ensure it's not used. I'm thinking this might be a little clearer - what is your opinion? > > Yes, I had considered that as well. It should be equivalent. I didn't use !reg_used_between_p because I thought > it'd be more expensive than checking reg_overlap_mentioned_p since we must iterate over a number of instructions > and call reg_overlap_mentioned_p on each one. But I suppose this case is rare enough that it wouldn't make any > measurable difference. Actually, I tried it out. And while a check reg_used_between_p fixed the testcase, it caused code quality regressions on aarch64. Seems it's too aggressive in restricting ree. I'll have a closer look. Kyrill > > Would you prefer to use !reg_used_between_p here? > >> >> The added comment could lead to some confusion since it's placed in front of an existing if statement that also tests a different condition. Also, if we go with your fix, >> >>> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) >> >> Shouldn't this really be !rtx_equal_p? >> > > Maybe, will it behave the right way if the two regs have different modes or when subregs are involved? > (will we even hit such a case in this path?) > > Thanks, > Kyrill >> >> Bernd >> >
On 17/11/15 09:49, Kyrill Tkachov wrote: > > On 17/11/15 09:08, Kyrill Tkachov wrote: >> Hi Bernd, >> >> On 16/11/15 18:40, Bernd Schmidt wrote: >>> On 11/16/2015 03:07 PM, Kyrill Tkachov wrote: >>> >>>> I've explained in the comments in the patch what's going on but the >>>> short version is trying to change the destination of a defining insn >>>> that feeds into an extend insn is not valid if the defining insn >>>> doesn't feed directly into the extend insn. In the ree pass the only >>>> way this can happen is if there is an intermediate conditional move >>>> that the pass tries to handle in a special way. An equivalent fix >>>> would have been to check on that path (when copy_needed in >>>> combine_reaching_defs is true) that the state->copies_list vector >>>> (that contains the conditional move insns feeding into the extend >>>> insn) is empty. >>> >>> I ran this through gdb, and I think I see what's going on. For reference, here's a comment from the source: >>> >>> /* Considering transformation of >>> (set (reg1) (expression)) >>> ... >>> (set (reg2) (any_extend (reg1))) >>> >>> into >>> >>> (set (reg2) (any_extend (expression))) >>> (set (reg1) (reg2)) >>> ... */ >>> >>> I was thinking that another possible fix would be to also check !reg_used_between_p for reg1 to ensure it's not used. I'm thinking this might be a little clearer - what is your opinion? >> >> Yes, I had considered that as well. It should be equivalent. I didn't use !reg_used_between_p because I thought >> it'd be more expensive than checking reg_overlap_mentioned_p since we must iterate over a number of instructions >> and call reg_overlap_mentioned_p on each one. But I suppose this case is rare enough that it wouldn't make any >> measurable difference. > > Actually, I tried it out. And while a check reg_used_between_p fixed the testcase, it caused code quality regressions > on aarch64. Seems it's too aggressive in restricting ree. > > I'll have a closer look. Ok, so the testcases that regress code-quality-wise on aarch64 look like this before ree: (insn 48 57 49 7 (set (reg:SI 7 x7) (zero_extend:SI (mem:QI (plus:DI (reg/f:DI 1 x1) (const_int 2 [0x2])))))) (insn 49 48 52 7 (set (reg/v:SI 2 x2) (reg:SI 7 x7))) (insn 52 49 53 7 (set (reg:DI 8 x8) (zero_extend:DI (reg:SI 7 x7)))) ree wants to transform this into: (insn 48 57 296 7 (set (reg:DI 8 x8) (zero_extend:DI (mem:QI (plus:DI (reg/f:DI 1 x1) (const_int 2 [0x2])))))) (insn 296 48 49 7 (set (reg:DI 7 x7) (reg:DI 8 x8))) (insn 49 296 53 7 (set (reg/v:SI 2 x2) (reg:SI 7 x7))) which is valid, but we reject that with the reg_used_between_p check because x7 is used in the intermediate insn 49. Note that no conditional move is present here. So, I think that the crucial element here is that the destination of the def_insn should feed directly into the extend, and that is what my original patch was testing for. So, I'd like to keep my original proposed patch as is, although I think I'll add a couple of testcases from the duplicate PRs to it for the testsuite. What do you think? Thanks, Kyrill > > Kyrill > >> >> Would you prefer to use !reg_used_between_p here? >> >>> >>> The added comment could lead to some confusion since it's placed in front of an existing if statement that also tests a different condition. Also, if we go with your fix, >>> >>>> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) >>> >>> Shouldn't this really be !rtx_equal_p? >>> >> >> Maybe, will it behave the right way if the two regs have different modes or when subregs are involved? >> (will we even hit such a case in this path?) >> >> Thanks, >> Kyrill >>> >>> Bernd >>> >> >
On 17/11/15 12:10, Bernd Schmidt wrote: > On 11/17/2015 10:08 AM, Kyrill Tkachov wrote: >> Yes, I had considered that as well. It should be equivalent. I didn't >> use !reg_used_between_p because I thought >> it'd be more expensive than checking reg_overlap_mentioned_p since we >> must iterate over a number of instructions >> and call reg_overlap_mentioned_p on each one. But I suppose this case is >> rare enough that it wouldn't make any >> measurable difference. >> >> Would you prefer to use !reg_used_between_p here? > > I would but apparently it doesn't work, so that's kind of neither here nor there. > >>> The added comment could lead to some confusion since it's placed in >>> front of an existing if statement that also tests a different >>> condition. Also, if we go with your fix, >>> >>>> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN >>>> (cand->insn)))) >>> >>> Shouldn't this really be !rtx_equal_p? >>> >> >> Maybe, will it behave the right way if the two regs have different modes >> or when subregs are involved? > > It would return false, in which case we'll conservatively fail here. I think that's desirable? > Well, I think the statement we want to make is "return false from this function if the two expressions contain the same register number". if (!rtx_equal_p (..., ...)) return false; will only return false if the two expressions are the same REG with the same mode. if (!reg_overlap_mentioned_p (..., ...)) return false; should return false even if the modes are different or one is a subreg, which is what we want. I did not see any codegen regressions using reg_overlap_mentioned_p on aarch64, so I don't think it will restrict any legitimate cases. Thanks, Kyrill > > Bernd >
On 17/11/15 23:11, Bernd Schmidt wrote: > On 11/17/2015 02:03 PM, Kyrill Tkachov wrote: >> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) >> return false; > >> Well, I think the statement we want to make is >> "return false from this function if the two expressions contain the same >> register number". > > I looked at it again and I think I'll need more time to really figure out what's going on in this pass. > > However, about the above... either I'm very confused, or the actual statement here is "return false _unless_ the two expressions contain the same register number". In the testcase, the regs in question are ax and bx, which is then > rejected if the patch has been applied. I'm sorry, it is my mistake in explaining. I meant to say: "return false from this function unless the two expressions contain the same register number" > > (gdb) p tmp_reg > (reg:SI 0 ax) > (gdb) p cand->insn > (insn 59 117 60 7 (set (reg:SI 4 si [orig:107 D.1813 ] [107]) > (sign_extend:SI (reg:HI 3 bx [orig:99 D.1813 ] [99]))) > > And I think it would be better to strengthen that to "return false unless registers are identical". (From the above it's clear however that a plain rtx_equal_p wouldn't work since there's an extension in the operand). So the three relevant instructions are: I1: (set (reg:HI 0 ax) (mem:HI (symbol_ref:DI ("f")))) I2: (set (reg:SI 3 bx) (if_then_else:SI (eq (reg:CCZ 17 flags) (const_int 0)) (reg:SI 0 ax) (reg:SI 3 bx))) I3: (set (reg:SI 4 si) (sign_extend:SI (reg:HI 3 bx))) I1 is def_insn, I3 is cand->insn. tmp_reg is 'ax'. What we want to do is reject this transformation because the destination of def_insn (aka I1), that is 'ax', is not the operand of the extend operation in cand->insn (aka I3). As you said, rtx_equal won't work on just SET_SRC (PATTERN (cand->insn)) because it's an extend operation. So reg_overlap_mentioned should be appropriate. Hope this helps. > > Also, I had another look at the testcase. It uses __builtin_printf and dg-output, which is at least unusual. Try to use the normal "if (cond) abort ()". > I did not try to remove the __builtin_printf because the use of 'f' there is needed to trigger this. There are at least two other dups of this PR: 68328 and 68185 the testcases for which have an "if (cond) abort ();" form. I'll use them instead. > > Bernd >
commit 33131f774705b936afc1a26c145e1214b388771f Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Fri Nov 13 15:01:47 2015 +0000 [RTL-ree] PR rtl-optimization/68194: Restrict copy instruction in presence of conditional moves diff --git a/gcc/ree.c b/gcc/ree.c index b8436f2..e91d164 100644 --- a/gcc/ree.c +++ b/gcc/ree.c @@ -814,7 +814,30 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state) rtx tmp_reg = gen_rtx_REG (GET_MODE (SET_DEST (PATTERN (cand->insn))), REGNO (SET_DEST (*dest_sub_rtx))); - if (reg_overlap_mentioned_p (tmp_reg, SET_DEST (PATTERN (cand->insn)))) + + /* When transforming: + (set (reg1) (expression)) + ... + (set (reg2) (any_extend (reg1))) + + into + + (set (reg2) (any_extend (expression))) + (set (reg1) (reg2)) + make sure that reg1 from the first set feeds directly into the extend. + This may not hold in a situation with an intermediate + conditional copy i.e. + I1: (set (reg3) (expression)) + I2: (set (reg1) (cond ? reg3 : reg1)) + I3: (set (reg2) (any_extend (reg1))) + + where I3 is cand, I1 is def_insn and I2 is a conditional copy. + We want to avoid transforming that into: + (set (reg2) (any_extend (expression))) + (set (reg1) (reg2)) + (set (reg1) (cond ? reg3 : reg1)). */ + if (reg_overlap_mentioned_p (tmp_reg, SET_DEST (PATTERN (cand->insn))) + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) return false; /* The destination register of the extension insn must not be diff --git a/gcc/testsuite/gcc.dg/pr68194.c b/gcc/testsuite/gcc.dg/pr68194.c new file mode 100644 index 0000000..b4855ea --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr68194.c @@ -0,0 +1,35 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +int a, c, d, e, g, h; +short f; + +short +fn1 (void) +{ + int j[2]; + for (; e; e++) + if (j[0]) + for (;;) + ; + if (!g) + return f; +} + +int +main (void) +{ + for (; a < 1; a++) + { + for (c = 0; c < 2; c++) + { + d && (f = 0); + h = fn1 (); + } + __builtin_printf ("%d\n", (char) f); + } + + return 0; +} + +/* { dg-output "0" } */