From patchwork Thu May 8 08:06:42 2014
X-Patchwork-Submitter: Zhenqiang Chen
X-Patchwork-Id: 29821
Date: Thu, 8 May 2014 16:06:42 +0800
Subject: [PATCH, 1/2] shrink wrap a function with a single loop: copy propagation
From: Zhenqiang Chen
To: "gcc-patches@gcc.gnu.org"
Cc: Jeff Law

Hi,

A similar issue was discussed in the thread
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01145.html. The patches are
close to Jeff's suggestion: "sink just the moves out of the incoming
argument registers".

This patch and the following one try to shrink-wrap a function with a
single loop, which cannot be handled by split_live_ranges_for_shrink_wrap
and prepare_shrink_wrap, since the induction variable has more than one
definition.

Taking the test case in the patch as an example, the pseudo code before
shrink-wrapping is:

    p = p2
    if (!p) goto return
  L1:
    ...
    p = ...
    ...
    goto L1
    ...
  return:

Function prepare_shrink_wrap does a PRE-like optimization to sink some
copies from the entry block into successor blocks. The patches enhance
prepare_shrink_wrap to:

(1) Replace the references to p with p2 in the entry block.  (This patch)
(2) Create a new basic block on the live edge to hold the copy "p = p2".
    (Next patch)

After shrink-wrapping, the pseudo code would look like:

    if (!p2) goto return
    p = p2
  L1:
    ...
    p = ...
    ...
    goto L1
  return:
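For concreteness, the function shape in question looks like the C code
below; it is essentially the new test case added at the end of this patch,
and the inline comments are only annotations relating it to the pseudo
code above:

/* Mirrors gcc.dg/shrink-wrap-loop.c from this patch.  */
int foo (int *p1, int *p2);

int
test (int *p1, int *p2)
{
  int *p;

  /* Entry block: p = p2; if (!p) goto return.  The call to foo below
     needs a stack frame, so the early-return path can skip the prologue
     once the copy "p = p2" is propagated into the test.  */
  for (p = p2; p != 0; p++)
    {
      if (!foo (p, p1))
        return 0;
    }

  return 1;
}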
Bootstrapped, and no make check regressions on X86-64 and ARM. No Spec2K
INT performance regression on X86-64 or ARM with -O3.

With the two patches, the number of shrink-wrapped functions for X86-64
Spec2K INT increases from 619 to 671. For 453.povray in Spec2006, X86-64
is ~1% better and ARM Thumb mode is ~4% faster. There is no improvement
for ARM mode, since it uses the mov (subs) to set the CC and there is no
way to sink it out of the entry block.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-05-08  Zhenqiang Chen  <zhenqiang.chen@linaro.org>

	* function.c (last_or_compare_p, try_copy_prop): New functions.
	(move_insn_for_shrink_wrap): Try copy propagation.
	(prepare_shrink_wrap): Separate last_uses from uses.

testsuite/ChangeLog:
2014-05-08  Zhenqiang Chen  <zhenqiang.chen@linaro.org>

	* gcc.dg/shrink-wrap-loop.c: New test case.

diff --git a/gcc/function.c b/gcc/function.c
index 383a52a..764ac82 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5421,14 +5421,139 @@ next_block_for_reg (basic_block bb, int regno, int end_regno)
   return live_edge->dest;
 }
 
-/* Try to move INSN from BB to a successor.  Return true on success.
-   USES and DEFS are the set of registers that are used and defined
-   after INSN in BB.  */
+/* Check whether INSN is the last insn in BB or
+   a COMPARE for the last insn in BB.  */
+
+static bool
+last_or_compare_p (basic_block bb, rtx insn)
+{
+  rtx x = single_set (insn);
+
+  if ((insn == BB_END (bb))
+      || ((insn == PREV_INSN (BB_END (bb)))
+          && x && REG_P (SET_DEST (x))
+          && GET_MODE_CLASS (GET_MODE (SET_DEST (x))) == MODE_CC))
+    return true;
+
+  return false;
+}
+
+/* Try to copy propagate INSN (a copy from SRC to DEST) in BB into the
+   last COMPARE or JUMP insn, which uses registers in LAST_USES.  */
+
+static bool
+try_copy_prop (basic_block bb, rtx insn, rtx src, rtx dest,
+               HARD_REG_SET *last_uses)
+{
+  bool ret = false;
+  bool changed, is_asm;
+  unsigned i, alt, n_ops, dregno, sregno;
+
+  rtx x, r, n, tmp;
+
+  if (GET_CODE (dest) == SUBREG || GET_CODE (src) == SUBREG
+      || insn == BB_END (bb))
+    return false;
+
+  x = NEXT_INSN (insn);
+  sregno = REGNO (src);
+  dregno = REGNO (dest);
+
+  while (x != NULL_RTX)
+    {
+      tmp = NEXT_INSN (x);
+
+      if (BARRIER_P (x))
+        return false;
+
+      /* Skip other insns, since dregno is not referred to in them
+         according to the previous checks.  */
+      if (!last_or_compare_p (bb, x))
+        {
+          x = tmp;
+          continue;
+        }
+      changed = false;
+      extract_insn (x);
+      if (!constrain_operands (1))
+        fatal_insn_not_found (x);
+      preprocess_constraints ();
+      alt = which_alternative;
+      n_ops = recog_data.n_operands;
+
+      is_asm = asm_noperands (PATTERN (x)) >= 0;
+      if (is_asm)
+        return false;
+
+      for (i = 0; i < n_ops; i++)
+        {
+          r = recog_data.operand[i];
+          if (REG_P (r) && REGNO (r) == dregno)
+            {
+              enum reg_class cl = recog_op_alt[i][alt].cl;
+              if (GET_MODE (r) != GET_MODE (src)
+                  || !in_hard_reg_set_p (reg_class_contents[cl],
+                                         GET_MODE (r), sregno)
+                  || recog_op_alt[i][alt].earlyclobber)
+                {
+                  if (changed)
+                    cancel_changes (0);
+                  return false;
+                }
+              n = gen_rtx_raw_REG (GET_MODE (r), sregno);
+              if (!validate_unshare_change (x, recog_data.operand_loc[i],
+                                            n, true))
+                {
+                  cancel_changes (0);
+                  return false;
+                }
+
+              ORIGINAL_REGNO (n) = ORIGINAL_REGNO (r);
+              REG_ATTRS (n) = REG_ATTRS (r);
+              REG_POINTER (n) = REG_POINTER (r);
+              changed = true;
+            }
+        }
+
+      if (changed)
+        {
+          if (!verify_changes (0))
+            {
+              cancel_changes (0);
+              return false;
+            }
+          else
+            {
+              confirm_change_group ();
+              df_insn_rescan (x);
+              ret = true;
+            }
+        }
+
+      if (x == BB_END (bb))
+        break;
+
+      x = tmp;
+    }
+
+  if (ret)
+    {
+      CLEAR_HARD_REG_BIT (*last_uses, dregno);
+      SET_HARD_REG_BIT (*last_uses, sregno);
+      df_set_bb_dirty (bb);
+    }
+  return ret;
+}
+
+/* Try to move INSN from BB to a successor.  Return true on success.
+   USES and DEFS are the set of registers that are used and defined
+   after INSN in BB.  */
 
 static bool
 move_insn_for_shrink_wrap (basic_block bb, rtx insn,
                            const HARD_REG_SET uses,
-                           const HARD_REG_SET defs)
+                           const HARD_REG_SET defs,
+                           HARD_REG_SET *last_uses)
 {
   rtx set, src, dest;
   bitmap live_out, live_in, bb_uses, bb_defs;
@@ -5460,6 +5585,12 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn,
   /* See whether there is a successor block to which we could move INSN.  */
   next_block = next_block_for_reg (bb, dregno, end_dregno);
   if (!next_block)
+    return false;
+
+  /* If the destination register is referred to in a later insn,
+     try to forward it.  */
+  if (overlaps_hard_reg_set_p (*last_uses, GET_MODE (dest), dregno)
+      && !try_copy_prop (bb, insn, src, dest, last_uses))
     return false;
 
   /* At this point we are committed to moving INSN, but let's try to
@@ -5551,14 +5682,18 @@ static void
 prepare_shrink_wrap (basic_block entry_block)
 {
   rtx insn, curr, x;
-  HARD_REG_SET uses, defs;
+  HARD_REG_SET uses, defs, last_uses;
   df_ref *ref;
 
+  if (!JUMP_P (BB_END (entry_block)))
+    return;
+
+  CLEAR_HARD_REG_SET (last_uses);
   CLEAR_HARD_REG_SET (uses);
   CLEAR_HARD_REG_SET (defs);
   FOR_BB_INSNS_REVERSE_SAFE (entry_block, insn, curr)
     if (NONDEBUG_INSN_P (insn)
-        && !move_insn_for_shrink_wrap (entry_block, insn, uses, defs))
+        && !move_insn_for_shrink_wrap (entry_block, insn, uses, defs,
+                                       &last_uses))
       {
        /* Add all defined registers to DEFs.  */
        for (ref = DF_INSN_DEFS (insn); *ref; ref++)
@@ -5568,6 +5703,19 @@ prepare_shrink_wrap (basic_block entry_block)
            SET_HARD_REG_BIT (defs, REGNO (x));
          }
 
+       /* Add all used registers for the last insn and the compare insn
+          in BB.  These insns cannot be sunk out of the ENTRY_BLOCK.  */
+       if (last_or_compare_p (entry_block, insn))
+         {
+           for (ref = DF_INSN_USES (insn); *ref; ref++)
+             {
+               x = DF_REF_REG (*ref);
+               if (REG_P (x) && HARD_REGISTER_P (x))
+                 SET_HARD_REG_BIT (last_uses, REGNO (x));
+             }
+           continue;
+         }
+
        /* Add all used registers to USESs.  */
        for (ref = DF_INSN_USES (insn); *ref; ref++)
          {
diff --git a/gcc/testsuite/gcc.dg/shrink-wrap-loop.c b/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
new file mode 100644
index 0000000..17dca4e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/shrink-wrap-loop.c
@@ -0,0 +1,20 @@
+/* { dg-do compile { target { { x86_64-*-* } || { arm_thumb2 } } } } */
+/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue" } */
+
+int foo (int *p1, int *p2);
+
+int
+test (int *p1, int *p2)
+{
+  int *p;
+
+  for (p = p2; p != 0; p++)
+    {
+      if (!foo (p, p1))
+        return 0;
+    }
+
+  return 1;
+}
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue" } } */
+/* { dg-final { cleanup-rtl-dump "pro_and_epilogue" } } */