From patchwork Mon Sep 26 04:31:04 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Revital Eres
X-Patchwork-Id: 4318
Date: Mon, 26 Sep 2011 07:31:04 +0300
Subject: [PATCH, SMS 1/2] Avoid generating redundant reg-moves
From: Revital Eres
To: Ayal Zaks
Cc: gcc-patches@gcc.gnu.org, Patch Tracking

Hello,

The attached patch contains a fix to the generate_reg_moves function.
Currently we can generate reg-moves for stores, and these reg-moves are
later eliminated. This happens when we have a mem dependency with
distance 1; as a result the number of reg-moves is at least 1, based on
the following calculation taken from generate_reg_moves ():

  if (e->distance == 1)
    nreg_moves4e = (SCHED_TIME (e->dest) - SCHED_TIME (e->src) + ii) / ii;

This is an example of a register move generated in such cases:

reg_move = (insn 152 119 75 4 (set (reg:SI 231)
        (mem:SI (pre_modify:DI (reg:DI 215)
                (plus:DI (reg:DI 215)
                    (reg:DI 171 [ ivtmp.42 ]))) [3 MEM[base: pretmp.27_65, index: ivtmp.42_9, offset: 0B]+0 S4 A32])) -1
     (nil))

When REG_INC instructions were not handled this was not a problem, as
these reg-moves were removed by dead code elimination. For example:

insn 1) mem[x] = ...
insn 2) .. = mem[y]

When the reg-move reg1 = mem[x] was generated, mem[x] was not used in
insn 2 and thus reg1 could be eliminated. With REG_INC this is
different, because the reg-move instruction remains and leads to bad
code generation. The attached testcase captures this case.

Tested and bootstrapped together with patch 2 on ppc64-redhat-linux,
enabling SMS on loops with SC 1. On arm-linux-gnueabi, bootstrapped C on
top of the set of patches that support the do-loop pattern
(http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01807.html), which solves
the bootstrap failure on ARM with SMS flags.

OK for mainline?

Thanks,
Revital

gcc/
        * modulo-sched.c (generate_reg_moves): Skip instructions that
        do not set a register.

testsuite/
        * gcc.dg/sms-10.c: New file.
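As a quick illustration of the distance-1 calculation quoted in the message, here is a small standalone sketch. It is not code from modulo-sched.c: the helper name nreg_moves_distance1 and the sample cycle numbers are made up for the example, and it models only the single formula shown above, none of the other adjustments generate_reg_moves may apply.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC.  It mirrors only the quoted
   formula for an edge with distance 1:
     (SCHED_TIME (dest) - SCHED_TIME (src) + ii) / ii
   src_time and dest_time stand in for the SCHED_TIME values and ii is
   the initiation interval.  */
static int
nreg_moves_distance1 (int src_time, int dest_time, int ii)
{
  assert (ii > 0);  /* The initiation interval must be positive.  */
  return (dest_time - src_time + ii) / ii;
}
```

With ii = 4 and the source at cycle 2, destinations at cycles 2, 5 and 7 give 1, 1 and 2 moves respectively. So whenever the destination is scheduled no earlier than the source, the count is at least 1, even if the source is a store that defines no register.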
/* { dg-do run } */
/* { dg-options "-O2 -fmodulo-sched -fmodulo-sched-allow-regmoves -fdump-rtl-sms" } */

typedef __SIZE_TYPE__ size_t;
extern void *malloc (size_t);
extern void free (void *);
extern void abort (void);

struct regstat_n_sets_and_refs_t
{
  int sets;
  int refs;
};

struct regstat_n_sets_and_refs_t *regstat_n_sets_and_refs;

struct df_reg_info
{
  unsigned int n_refs;
};

struct df_d
{
  struct df_reg_info **def_regs;
  struct df_reg_info **use_regs;
};
struct df_d *df;

static inline int
REG_N_SETS (int regno)
{
  return regstat_n_sets_and_refs[regno].sets;
}

__attribute__ ((noinline))
     int max_reg_num (void)
{
  return 100;
}

__attribute__ ((noinline))
     void regstat_init_n_sets_and_refs (void)
{
  unsigned int i;
  unsigned int max_regno = max_reg_num ();

  for (i = 0; i < max_regno; i++)
    {
      (regstat_n_sets_and_refs[i].sets = (df->def_regs[(i)]->n_refs));
      (regstat_n_sets_and_refs[i].refs =
       (df->use_regs[(i)]->n_refs) + REG_N_SETS (i));
    }
}

int a_sets[100] =
  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
  10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
  20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
  30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
  40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
  50, 51, 52, 53, 54, 55, 56, 57, 58, 59,
  60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
  70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
  80, 81, 82, 83, 84, 85, 86, 87, 88, 89,
  90, 91, 92, 93, 94, 95, 96, 97, 98, 99
};

int a_refs[100] =
  { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18,
  20, 22, 24, 26, 28, 30, 32, 34, 36, 38,
  40, 42, 44, 46, 48, 50, 52, 54, 56, 58,
  60, 62, 64, 66, 68, 70, 72, 74, 76, 78,
  80, 82, 84, 86, 88, 90, 92, 94, 96, 98,
  100, 102, 104, 106, 108, 110, 112, 114, 116, 118,
  120, 122, 124, 126, 128, 130, 132, 134, 136, 138,
  140, 142, 144, 146, 148, 150, 152, 154, 156, 158,
  160, 162, 164, 166, 168, 170, 172, 174, 176, 178,
  180, 182, 184, 186, 188, 190, 192, 194, 196, 198
};

int
main ()
{
  struct df_reg_info *b[100], *c[100];
  struct df_d df1;
  size_t s = sizeof (struct df_reg_info);
  struct regstat_n_sets_and_refs_t a[100];

  df = &df1;
  regstat_n_sets_and_refs = a;
  int i;

  for (i = 0; i < 100; i++)
    {
      b[i] = (struct df_reg_info *) malloc (s);
      b[i]->n_refs = i;
      c[i] = (struct df_reg_info *) malloc (s);
      c[i]->n_refs = i;
    }

  df1.def_regs = b;
  df1.use_regs = c;
  regstat_init_n_sets_and_refs ();

  for (i = 0; i < 100; i++)
    if ((a[i].sets != a_sets[i]) || (a[i].refs != a_refs[i]))
      abort ();

  for (i = 0; i < 100; i++)
    {
      free (b[i]);
      free (c[i]);
    }

  return 0;
}

/* { dg-final { scan-rtl-dump-times "SMS succeeded" 1 "sms" { target powerpc*-*-* } } } */
/* { dg-final { cleanup-rtl-dump "sms" } } */

Index: modulo-sched.c
===================================================================
--- modulo-sched.c	(revision 179138)
+++ modulo-sched.c	(working copy)
@@ -476,7 +476,12 @@ generate_reg_moves (partial_schedule_ptr
       sbitmap *uses_of_defs;
       rtx last_reg_move;
       rtx prev_reg, old_reg;
-
+      rtx set = single_set (u->insn);
+
+      /* Skip instructions that do not set a register.  */
+      if (set && !REG_P (SET_DEST (set)))
+        continue;
+
       /* Compute the number of reg_moves needed for u, by looking at life
          ranges started at u (excluding self-loops).  */
       for (e = u->out; e; e = e->next_out)
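
For readers who want to see the new check in isolation, the condition added by the hunk above can be modelled outside GCC roughly as follows. This is only a sketch under stated assumptions: the toy_* types and toy_skip_reg_moves are hypothetical stand-ins invented for the example, not GCC's rtx representation; a NULL set pointer plays the role of single_set () finding no single set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT GCC types.  */
enum toy_dest_kind { TOY_DEST_REG, TOY_DEST_MEM };

struct toy_set
{
  enum toy_dest_kind dest;      /* Models the kind of SET_DEST (set).  */
};

struct toy_insn
{
  const struct toy_set *set;    /* NULL models "no single set".  */
};

/* Mirrors the shape of the patch's condition
     set && !REG_P (SET_DEST (set))
   Returns true when the insn would be skipped for reg-move
   generation.  */
static bool
toy_skip_reg_moves (const struct toy_insn *insn)
{
  assert (insn != NULL);
  const struct toy_set *set = insn->set;
  return set != NULL && set->dest != TOY_DEST_REG;
}
```

In this model a store (destination is memory) is skipped, while an insn that sets a register, or one with no single set at all, is still processed as before.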