Message ID | a808712a-94ad-17f0-bacb-a90c12629968@foss.arm.com |
---|---|
State | Superseded |
Headers | show |
On 12/01/2016 12:40 PM, Thomas Preudhomme wrote: > Hi, > > When considering a candidate for rematerialization, LRA verifies if > the candidate clobbers a live register before going forward with the > rematerialization (see code starting with comment "Check clobbers do > not kill something living."). To do this check, the set of live > registers at any given instruction needs to be maintained. This is > done by initializing the set of live registers when starting the > forward scan of instruction in a basic block and updating the set by > looking at REG_DEAD notes and destination register at the end of an > iteration of the scan loop. > > However the initialization suffers from 2 issues: > > 1) it is done from the live out set rather than live in (uses > df_get_live_out (bb)) > 2) it ignores pseudo registers that have already been allocated a hard > register (uses REG_SET_TO_HARD_REG_SET that only looks at hard > register and does not look at reg_renumber for pseudo registers) > > This patch changes the code to use df_get_live_in (bb) to initialize > the live_hard_regs variable using a loop to check reg_renumber for > pseudo registers. Please let me know if there is a macro to do that, I > failed to find one. > > ChangeLog entries are as follow: > > gcc/testsuite/ChangeLog: > > 2016-12-01 Thomas Preud'homme <thomas.preudhomme@arm.com> > > PR rtl-optimization/78617 > * gcc.c-torture/execute/pr78617.c: New test. > > > gcc/ChangeLog: > > 2016-12-01 Thomas Preud'homme <thomas.preudhomme@arm.com> > > PR rtl-optimization/78617 > * lra-remat.c (do_remat): Initialize live_hard_regs from live in > registers, also setting hard registers mapped to pseudo > registers. > > > Note however that as explained in the problem report, the testcase > does not trigger the bug on GCC 7 due to better optimization before > LRA rematerialization is reached. > > Testing: testsuite shows no regression when run using: > + an arm-none-eabi GCC cross-compiler targeting Cortex-M0 and Cortex-M3 > + a bootstrapped arm-linux-gnueabihf GCC native compiler > + a bootstrapped x86_64-linux-gnu GCC native compiler > > Is this ok for stage3? Yes. Thomas, thank you very much for the patch. Using live_out was my typo, not using reg_renumber is my more serious mistake (although it is non-trivial case with unused pseudo).
diff --git a/gcc/lra-remat.c b/gcc/lra-remat.c index f01c6644c428fd9b5efdf6cc98788e5f6fadba62..cdd7057f602098d33ec3acfdaaac66556640bd82 100644 --- a/gcc/lra-remat.c +++ b/gcc/lra-remat.c @@ -1047,6 +1047,7 @@ update_scratch_ops (rtx_insn *remat_insn) static bool do_remat (void) { + unsigned regno; rtx_insn *insn; basic_block bb; bitmap_head avail_cands; @@ -1054,12 +1055,21 @@ do_remat (void) bool changed_p = false; /* Living hard regs and hard registers of living pseudos. */ HARD_REG_SET live_hard_regs; + bitmap_iterator bi; bitmap_initialize (&avail_cands, ®_obstack); bitmap_initialize (&active_cands, ®_obstack); FOR_EACH_BB_FN (bb, cfun) { - REG_SET_TO_HARD_REG_SET (live_hard_regs, df_get_live_out (bb)); + CLEAR_HARD_REG_SET (live_hard_regs); + EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), 0, regno, bi) + { + int hard_regno = regno < FIRST_PSEUDO_REGISTER + ? regno + : reg_renumber[regno]; + if (hard_regno >= 0) + SET_HARD_REG_BIT (live_hard_regs, hard_regno); + } bitmap_and (&avail_cands, &get_remat_bb_data (bb)->avin_cands, &get_remat_bb_data (bb)->livein_cands); /* Activating insns are always in the same block as their corresponding diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78617.c b/gcc/testsuite/gcc.c-torture/execute/pr78617.c new file mode 100644 index 0000000000000000000000000000000000000000..89c4f6dea8cb507b963f91debb94cbe16eb1db90 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/pr78617.c @@ -0,0 +1,25 @@ +int a = 0; +int d = 1; +int f = 1; + +int fn1() { + return a || 1 >> a; +} + +int fn2(int p1, int p2) { + return p2 >= 2 ? p1 : p1 >> 1; +} + +int fn3(int p1) { + return d ^ p1; +} + +int fn4(int p1, int p2) { + return fn3(!d > fn2((f = fn1() - 1000) || p2, p1)); +} + +int main() { + if (fn4(0, 0) != 1) + __builtin_abort (); + return 0; +}