Message ID | CAHz1=dWb73CaAJCywt0+uwKx+9TE2wo1dXGmMDW8G753dvh2xA@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Mon, Sep 26, 2011 at 7:31 AM, Revital Eres <revital.eres@linaro.org> wrote: > Hello, > > This patch extends the implementation to support instructions with > REG_INC notes. > It addresses the comments from the previous submission: > http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01299.html. > ok, so if we have an auto-inc'ing insn which defines (auto-inc's) an addr register and another (say, result) register, we want to allow the result register to have life ranges in excess of ii (by eliminating anti-dep edges of distance 1 from uses to def, and generating reg_moves if/where needed), but avoid having such life ranges of addr (by retaining such anti-dep edges). Right? > btw, regarding your previous question about the usage of the address > register been auto inc, apparently it can be used as follows: > > insn 1) def1 = MEM [ pre_dec (addr) ] > out edges: [1 -(T,2,1)-> 1] [1 -(T,2,0)-> 2] > in edges: [1 -(T,2,1)-> 1] [2 -(T,2,1)-> 1] [2 -(A,0,1)->1] > > insn 2) MEM [ addr + 3 ] = def1 > out edges: [2 -(T,2,1)-> 1] [2 -(A,0,1)->1] > in edges: [1 -(T,2,0)-> 2] > Are these all the edges? We have only one True dependence edge from insn 1 to insn 2, but insn 1 is setting two registers both used by insn 2 (regardless of what we decide to do with Anti-deps). As for Anti-deps of distance 1, we have only one going back from insn 2 to insn 1, perhaps corresponding to addr, allowing reg_moves for def1(?). But, it won't help def1, because this other Anti-dep will force them to be scheduled w/o reg_moves. > Reg-moves were not created for the address when testing on ppc. ok; note that the example above shows one could potentially have an insn 2 scheduled in a later stage than insn 1, wanting to feed off an earlier version of the pre_dec'd addr. (In this case, it would probably be better to update the offset (3) than use a reg_move for the addr. (Such folding might work for other cases as well(?))). > > Tested and bootstrap with patch 1 on ppc64-redhat-linux > enabling SMS on loops with SC 1. On arm-linux-gnueabi bootstrap > c on top of the set of patches that support do-loop pattern > (http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01807.html) which solves > the bootstrap failure on ARM with SMS flags. > > OK for mainline? > OK, with the following comments: Make sure reg_moves are generated for the correct (result, not addr) register, in generate_reg_moves(). "been">>"being" (multiple appearances). Add a note that autoinc_var_is_used_p (rtx def_insn, rtx use_insn) doesn't need to consider the specific address register; no reg_moves will be allowed for any life range defined by def_insn and used by use_insn, if use_insn uses an address register auto-inc'ed by def_insn. In other words, one would expect to see two Anti-dep edges from insn 2 to insn 1, right? Ayal. > Thanks, > Revital > > > * ddg.c (autoinc_var_is_used_p): New function. > (create_ddg_dep_from_intra_loop_link, > add_cross_iteration_register_deps): Call it. > * modulo-sched.c (sms_schedule): Handle instructions with REG_INC. >
Hello, > ok, so if we have an auto-inc'ing insn which defines (auto-inc's) an > addr register and another (say, result) register, we want to allow the > result register to have life ranges in excess of ii (by eliminating > anti-dep edges of distance 1 from uses to def, and generating > reg_moves if/where needed), but avoid having such life ranges of addr > (by retaining such anti-dep edges). Right? Yes. > > Are these all the edges? We have only one True dependence edge from > insn 1 to insn 2, but insn 1 is setting two registers both used by > insn 2 (regardless of what we decide to do with Anti-deps). As for > Anti-deps of distance 1, we have only one going back from insn 2 to > insn 1, perhaps corresponding to addr, allowing reg_moves for def1(?). > But, it won't help def1, because this other Anti-dep will force them > to be scheduled w/o reg_moves. Please ignore the edges in the previous example. It indeed was a mistake, sorry about the confusion. Here are two examples taken from bootstrap on PPC of how the address is used; with the current patch applied and running with -fmodulo-sched-allow-regmoves: Node num: 2 (insn 3681 3678 3682 500 (set (reg:QI 2914 [ MEM[base: D.9586_4130, offset: 0B] ]) (mem:QI (pre_dec:DI (reg:DI 1644 [ ivtmp.687 ])) [0 MEM[base: D.9586_4130, offset: 0B]+0 S1 A8])) ../../gcc/libiberty/regex.c:4259 358 {*movqi_internal} (expr_list:REG_INC (reg:DI 1644 [ ivtmp.687 ]) (nil))) OUT ARCS: [3681 -(T,2,1)-> 3681] [3681 -(T,2,0)-> 3682] IN ARCS: [3682 -(A,0,1)-> 3681] [3681 -(T,2,1)-> 3681] [3682 -(A,0,1)-> 3681] [3682 -(T,2,1)-> 3681] Node num: 3 (insn 3682 3681 3683 500 (set (mem:QI (plus:DI (reg:DI 1644 [ ivtmp.687 ]) (const_int 3 [0x3])) [0 MEM[base: D.9586_4130, offset: 3B]+0 S1 A8]) (reg:QI 2914 [ MEM[base: D.9586_4130, offset: 0B] ])) ../../gcc/libiberty/regex.c:4259 358 {*movqi_internal} (expr_list:REG_DEAD (reg:QI 2914 [ MEM[base: D.9586_4130, offset: 0B] ]) (nil))) OUT ARCS: [3682 -(A,0,1)-> 3681] [3682 -(A,0,1)-> 3681] [3682 -(O,0,0)-> 7263] [3682 -(A,0,0)-> 3683] [3682 -(T,2,1)-> 3681] IN ARCS: [3681 -(T,2,0)-> 3682] Another example of usage is as follows (the address register is not used in MEM): Node num: 0 (insn 1419 1415 1423 9 (set (mem/f:DI (pre_inc:DI (reg:DI 1882 [ ivtmp.1636 ])) [3 MEM[base: D.10911_2945, offset: 0B]+0 S8 A64]) (reg/f:DI 3923)) ../../gcc/libiberty/regex.c:5788 378 {*movdi_internal64} (expr_list:REG_INC (reg:DI 1882 [ ivtmp.1636 ]) (nil))) OUT ARCS: [1419 -(T,2,1)-> 1419] [1419 -(O,0,0)-> 5932] [1419 -(O,0,0)-> 1449] [1419 -(T,2,1)-> 1434] [1419 -(T,2,0)-> 1434] [1419 -(T,2,0)-> 1432] [1419 -(O,0,0)-> 1431] [1419 -(O,0,0)-> 1427] [1419 -(O,0,0)-> 1423] IN ARCS: [1419 -(T,2,1)-> 1419] [1432 -(A,0,1)-> 1419] [1449 -(O,0,1)-> 1419] [1434 -(A,0,1)-> 1419] [1431 -(O,0,1)-> 1419] [1427 -(O,0,1)-> 1419] [1423 -(O,0,1)-> 1419] Node num: 4 (insn 1432 1431 1433 9 (set (reg:DI 2632) (plus:DI (reg/v/f:DI 1058 [ reg_info ]) (reg:DI 1882 [ ivtmp.1636 ]))) ../../gcc/libiberty/regex.c:5543 79 {*adddi3_internal1} (nil)) OUT ARCS: [1432 -(A,0,1)-> 1419] [1432 -(T,1,0)-> 1433] IN ARCS: [1419 -(T,2,0)-> 1432] >> OK for mainline? >> > > OK, with the following comments: Thanks, will address the comments and re-submit. > In other words, one would expect to see two Anti-dep edges from insn 2 > to insn 1, right? Yes, that's indeed the case in the first example above. Thanks, Revital
On Tue, Sep 27, 2011 at 9:47 AM, Revital Eres <revital.eres@linaro.org> wrote: > Hello, > >> ok, so if we have an auto-inc'ing insn which defines (auto-inc's) an >> addr register and another (say, result) register, we want to allow the >> result register to have life ranges in excess of ii (by eliminating >> anti-dep edges of distance 1 from uses to def, and generating >> reg_moves if/where needed), but avoid having such life ranges of addr >> (by retaining such anti-dep edges). Right? > > Yes. >> >> Are these all the edges? We have only one True dependence edge from >> insn 1 to insn 2, but insn 1 is setting two registers both used by >> insn 2 (regardless of what we decide to do with Anti-deps). As for >> Anti-deps of distance 1, we have only one going back from insn 2 to >> insn 1, perhaps corresponding to addr, allowing reg_moves for def1(?). >> But, it won't help def1, because this other Anti-dep will force them >> to be scheduled w/o reg_moves. > > Please ignore the edges in the previous example. It indeed was a mistake, > sorry about the confusion. Here are two examples taken from bootstrap > on PPC of how the address is used; with the current patch applied and > running with -fmodulo-sched-allow-regmoves: > Ok, this does have two anti-dep edges. But still, only a single true dependence(?) ... can you see why? Thanks, Ayal. > Node num: 2 > (insn 3681 3678 3682 500 (set (reg:QI 2914 [ MEM[base: D.9586_4130, > offset: 0B] ]) > (mem:QI (pre_dec:DI (reg:DI 1644 [ ivtmp.687 ])) [0 MEM[base: > D.9586_4130, offset: 0B]+0 S1 A8])) ../../gcc/libiberty/regex.c:4259 > 358 {*movqi_internal} > (expr_list:REG_INC (reg:DI 1644 [ ivtmp.687 ]) > (nil))) > OUT ARCS: [3681 -(T,2,1)-> 3681] [3681 -(T,2,0)-> 3682] > IN ARCS: [3682 -(A,0,1)-> 3681] [3681 -(T,2,1)-> 3681] [3682 > -(A,0,1)-> 3681] [3682 -(T,2,1)-> 3681] > Node num: 3 > (insn 3682 3681 3683 500 (set (mem:QI (plus:DI (reg:DI 1644 [ ivtmp.687 ]) > (const_int 3 [0x3])) [0 MEM[base: D.9586_4130, offset: > 3B]+0 S1 A8]) > (reg:QI 2914 [ MEM[base: D.9586_4130, offset: 0B] ])) > ../../gcc/libiberty/regex.c:4259 358 {*movqi_internal} > (expr_list:REG_DEAD (reg:QI 2914 [ MEM[base: D.9586_4130, offset: 0B] ]) > (nil))) > OUT ARCS: [3682 -(A,0,1)-> 3681] [3682 -(A,0,1)-> 3681] [3682 > -(O,0,0)-> 7263] [3682 -(A,0,0)-> 3683] [3682 -(T,2,1)-> 3681] > IN ARCS: [3681 -(T,2,0)-> 3682] > > Another example of usage is as follows (the address register is not > used in MEM): > > Node num: 0 > (insn 1419 1415 1423 9 (set (mem/f:DI (pre_inc:DI (reg:DI 1882 [ > ivtmp.1636 ])) [3 MEM[base: D.10911_2945, offset: 0B]+0 S8 A64]) > (reg/f:DI 3923)) ../../gcc/libiberty/regex.c:5788 378 > {*movdi_internal64} > (expr_list:REG_INC (reg:DI 1882 [ ivtmp.1636 ]) > (nil))) > OUT ARCS: [1419 -(T,2,1)-> 1419] [1419 -(O,0,0)-> 5932] [1419 > -(O,0,0)-> 1449] [1419 -(T,2,1)-> 1434] [1419 -(T,2,0)-> 1434] > [1419 -(T,2,0)-> 1432] [1419 -(O,0,0)-> 1431] [1419 -(O,0,0)-> 1427] > [1419 -(O,0,0)-> 1423] > IN ARCS: [1419 -(T,2,1)-> 1419] [1432 -(A,0,1)-> 1419] [1449 > -(O,0,1)-> 1419] [1434 -(A,0,1)-> 1419] [1431 -(O,0,1)-> 1419] > [1427 -(O,0,1)-> 1419] [1423 -(O,0,1)-> 1419] > Node num: 4 > (insn 1432 1431 1433 9 (set (reg:DI 2632) > (plus:DI (reg/v/f:DI 1058 [ reg_info ]) > (reg:DI 1882 [ ivtmp.1636 ]))) > ../../gcc/libiberty/regex.c:5543 79 {*adddi3_internal1} > (nil)) > OUT ARCS: [1432 -(A,0,1)-> 1419] [1432 -(T,1,0)-> 1433] > IN ARCS: [1419 -(T,2,0)-> 1432] > >>> OK for mainline? >>> >> >> OK, with the following comments: > > Thanks, will address the comments and re-submit. > >> In other words, one would expect to see two Anti-dep edges from insn 2 >> to insn 1, right? > > Yes, that's indeed the case in the first example above. > > Thanks, > Revital >
Hello, > Ok, this does have two anti-dep edges. But still, only a single true > dependence(?) ... can you see why? The intra edge [3681 -(T,2,0)-> 3682] was created by haifa-sched and I guess that because both of the expected true-dep edges (one for the target and one for the address) are identical only one is created. The rest of the inter edges were created in ddg.c where we do not check for multiply identical edges. Thanks, Revital > > Thanks, > Ayal. > > >> Node num: 2 >> (insn 3681 3678 3682 500 (set (reg:QI 2914 [ MEM[base: D.9586_4130, >> offset: 0B] ]) >> (mem:QI (pre_dec:DI (reg:DI 1644 [ ivtmp.687 ])) [0 MEM[base: >> D.9586_4130, offset: 0B]+0 S1 A8])) ../../gcc/libiberty/regex.c:4259 >> 358 {*movqi_internal} >> (expr_list:REG_INC (reg:DI 1644 [ ivtmp.687 ]) >> (nil))) >> OUT ARCS: [3681 -(T,2,1)-> 3681] [3681 -(T,2,0)-> 3682] >> IN ARCS: [3682 -(A,0,1)-> 3681] [3681 -(T,2,1)-> 3681] [3682 >> -(A,0,1)-> 3681] [3682 -(T,2,1)-> 3681] >> Node num: 3 >> (insn 3682 3681 3683 500 (set (mem:QI (plus:DI (reg:DI 1644 [ ivtmp.687 ]) >> (const_int 3 [0x3])) [0 MEM[base: D.9586_4130, offset: >> 3B]+0 S1 A8]) >> (reg:QI 2914 [ MEM[base: D.9586_4130, offset: 0B] ])) >> ../../gcc/libiberty/regex.c:4259 358 {*movqi_internal} >> (expr_list:REG_DEAD (reg:QI 2914 [ MEM[base: D.9586_4130, offset: 0B] ]) >> (nil))) >> OUT ARCS: [3682 -(A,0,1)-> 3681] [3682 -(A,0,1)-> 3681] [3682 >> -(O,0,0)-> 7263] [3682 -(A,0,0)-> 3683] [3682 -(T,2,1)-> 3681] >> IN ARCS: [3681 -(T,2,0)-> 3682] >> >> Another example of usage is as follows (the address register is not >> used in MEM): >> >> Node num: 0 >> (insn 1419 1415 1423 9 (set (mem/f:DI (pre_inc:DI (reg:DI 1882 [ >> ivtmp.1636 ])) [3 MEM[base: D.10911_2945, offset: 0B]+0 S8 A64]) >> (reg/f:DI 3923)) ../../gcc/libiberty/regex.c:5788 378 >> {*movdi_internal64} >> (expr_list:REG_INC (reg:DI 1882 [ ivtmp.1636 ]) >> (nil))) >> OUT ARCS: [1419 -(T,2,1)-> 1419] [1419 -(O,0,0)-> 5932] [1419 >> -(O,0,0)-> 1449] [1419 -(T,2,1)-> 1434] [1419 -(T,2,0)-> 1434] >> [1419 -(T,2,0)-> 1432] [1419 -(O,0,0)-> 1431] [1419 -(O,0,0)-> 1427] >> [1419 -(O,0,0)-> 1423] >> IN ARCS: [1419 -(T,2,1)-> 1419] [1432 -(A,0,1)-> 1419] [1449 >> -(O,0,1)-> 1419] [1434 -(A,0,1)-> 1419] [1431 -(O,0,1)-> 1419] >> [1427 -(O,0,1)-> 1419] [1423 -(O,0,1)-> 1419] >> Node num: 4 >> (insn 1432 1431 1433 9 (set (reg:DI 2632) >> (plus:DI (reg/v/f:DI 1058 [ reg_info ]) >> (reg:DI 1882 [ ivtmp.1636 ]))) >> ../../gcc/libiberty/regex.c:5543 79 {*adddi3_internal1} >> (nil)) >> OUT ARCS: [1432 -(A,0,1)-> 1419] [1432 -(T,1,0)-> 1433] >> IN ARCS: [1419 -(T,2,0)-> 1432] >> >>>> OK for mainline? >>>> >>> >>> OK, with the following comments: >> >> Thanks, will address the comments and re-submit. >> >>> In other words, one would expect to see two Anti-dep edges from insn 2 >>> to insn 1, right? >> >> Yes, that's indeed the case in the first example above. >> >> Thanks, >> Revital >> >
Index: ddg.c =================================================================== --- ddg.c (revision 179138) +++ ddg.c (working copy) @@ -145,6 +145,23 @@ mem_access_insn_p (rtx insn) return rtx_mem_access_p (PATTERN (insn)); } +/* Return true if DEF_INSN contains address been auto-inc or auto-dec + which is used in USE_INSN. Otherwise return false. The result is + been used to decide whether to remove the edge between def_insn and + use_insn when -fmodulo-sched-allow-regmoves is set. */ +static bool +autoinc_var_is_used_p (rtx def_insn, rtx use_insn) +{ + rtx note; + + for (note = REG_NOTES (def_insn); note; note = XEXP (note, 1)) + if (REG_NOTE_KIND (note) == REG_INC + && reg_referenced_p (XEXP (note, 0), PATTERN (use_insn))) + return true; + + return false; +} + /* Computes the dependence parameters (latency, distance etc.), creates a ddg_edge and adds it to the given DDG. */ static void @@ -173,10 +190,15 @@ create_ddg_dep_from_intra_loop_link (ddg compensate for that by generating reg-moves based on the life-range analysis. The anti-deps that will be deleted are the ones which have true-deps edges in the opposite direction (in other words - the kernel has only one def of the relevant register). TODO: - support the removal of all anti-deps edges, i.e. including those + the kernel has only one def of the relevant register). + If the address that is been auto-inc or auto-dec in DEST_NODE + is used in SRC_NODE then do not remove the edge to make sure + reg-moves will not be created for this address. + TODO: support the removal of all anti-deps edges, i.e. including those whose register has multiple defs in the loop. */ - if (flag_modulo_sched_allow_regmoves && (t == ANTI_DEP && dt == REG_DEP)) + if (flag_modulo_sched_allow_regmoves + && (t == ANTI_DEP && dt == REG_DEP) + && !autoinc_var_is_used_p (dest_node->insn, src_node->insn)) { rtx set; @@ -302,10 +324,14 @@ add_cross_iteration_register_deps (ddg_p gcc_assert (first_def_node); /* Always create the edge if the use node is a branch in - order to prevent the creation of reg-moves. */ + order to prevent the creation of reg-moves. + If the address that is been auto-inc or auto-dec in LAST_DEF + is used in USE_INSN then do not remove the edge to make sure + reg-moves will not be created for that address. */ if (DF_REF_ID (last_def) != DF_REF_ID (first_def) || !flag_modulo_sched_allow_regmoves - || JUMP_P (use_node->insn)) + || JUMP_P (use_node->insn) + || autoinc_var_is_used_p (DF_REF_INSN (last_def), use_insn)) create_ddg_dep_no_link (g, use_node, first_def_node, ANTI_DEP, REG_DEP, 1); Index: modulo-sched.c =================================================================== --- modulo-sched.c (revision 179138) +++ modulo-sched.c (working copy) @@ -1266,12 +1271,10 @@ sms_schedule (void) continue; } - /* Don't handle BBs with calls or barriers or auto-increment insns - (to avoid creating invalid reg-moves for the auto-increment insns), + /* Don't handle BBs with calls or barriers or !single_set with the exception of instructions that include count_reg---these instructions are part of the control part that do-loop recognizes. - ??? Should handle auto-increment insns. ??? Should handle insns defining subregs. */ for (insn = head; insn != NEXT_INSN (tail); insn = NEXT_INSN (insn)) { @@ -1282,7 +1285,6 @@ sms_schedule (void) || (NONDEBUG_INSN_P (insn) && !JUMP_P (insn) && !single_set (insn) && GET_CODE (PATTERN (insn)) != USE && !reg_mentioned_p (count_reg, insn)) - || (FIND_REG_INC_NOTE (insn, NULL_RTX) != 0) || (INSN_P (insn) && (set = single_set (insn)) && GET_CODE (SET_DEST (set)) == SUBREG)) break; @@ -1296,8 +1298,6 @@ sms_schedule (void) fprintf (dump_file, "SMS loop-with-call\n"); else if (BARRIER_P (insn)) fprintf (dump_file, "SMS loop-with-barrier\n"); - else if (FIND_REG_INC_NOTE (insn, NULL_RTX) != 0) - fprintf (dump_file, "SMS reg inc\n"); else if ((NONDEBUG_INSN_P (insn) && !JUMP_P (insn) && !single_set (insn) && GET_CODE (PATTERN (insn)) != USE)) fprintf (dump_file, "SMS loop-with-not-single-set\n");