From patchwork Fri Sep 15 01:30:15 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kugan Vivekanandarajah
X-Patchwork-Id: 112668
From: Kugan Vivekanandarajah
Date: Fri, 15 Sep 2017 11:30:15 +1000
Subject: [RFC][PACH 3/5] Prevent tree unroller from completely unrolling inner loops if that results in excessive strided-loads in outer loop
To: "gcc-patches@gcc.gnu.org"

This patch prevents the tree unroller from completely unrolling inner
loops when doing so would leave the outer loop with an excessive number
of strided loads, that is, more load streams than the target's hardware
prefetchers can handle.

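To illustrate the situation being targeted, here is a small hand-written
sketch. It is not taken from the patch or from any benchmark, and the names
ROWS, COLS, fill_matrix, sum_rolled, sum_unrolled and check_same_result are
made up for the example. The inner loop has a constant trip count, so it is a
candidate for complete unrolling; once unrolled, the outer loop body contains
ROWS separate loads, each advancing through its own row as the outer index
grows.

```c
#include <assert.h>

#define ROWS 4      /* Constant inner trip count (hypothetical).  */
#define COLS 1024   /* Outer trip count (hypothetical).  */

static int a[ROWS][COLS];

/* Fill the matrix with a simple, predictable pattern.  */
static void
fill_matrix (void)
{
  for (int j = 0; j < ROWS; j++)
    for (int i = 0; i < COLS; i++)
      a[j][i] = j + i;
}

/* Original nest: the inner loop contains a single load.  */
static long
sum_rolled (void)
{
  long sum = 0;
  for (int i = 0; i < COLS; i++)
    for (int j = 0; j < ROWS; j++)
      sum += a[j][i];
  return sum;
}

/* Inner loop completely unrolled by hand: the outer loop body now
   carries ROWS independent load streams, one per row.  */
static long
sum_unrolled (void)
{
  long sum = 0;
  for (int i = 0; i < COLS; i++)
    {
      sum += a[0][i];
      sum += a[1][i];
      sum += a[2][i];
      sum += a[3][i];
    }
  return sum;
}

/* Both forms compute the same value; only the number of load
   streams seen in the outer loop differs.  */
static void
check_same_result (void)
{
  fill_matrix ();
  assert (sum_rolled () == sum_unrolled ());
}
```

With a larger ROWS the unrolled form would carry correspondingly more
streams, which is the case the new hook is meant to let a target veto.
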
Thanks,
Kugan

gcc/ChangeLog:

2017-09-12  Kugan Vivekanandarajah

        * config/aarch64/aarch64.c (count_mem_load_streams): New.
        (aarch64_ok_to_unroll): New.
        * doc/tm.texi (ok_to_unroll): Define new target hook.
        * doc/tm.texi.in (ok_to_unroll): Likewise.
        * target.def (ok_to_unroll): Likewise.
        * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Use
        ok_to_unroll while unrolling.

>From 5de245bbf6ba1768e8206a61feb0f42c106a1d94 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah
Date: Fri, 18 Aug 2017 16:41:13 +1000
Subject: [PATCH 3/5] tree unroller limit strided loads

---
 gcc/config/aarch64/aarch64.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
 gcc/doc/tm.texi              |  4 +++
 gcc/doc/tm.texi.in           |  2 ++
 gcc/target.def               |  8 +++++
 gcc/tree-ssa-loop-ivcanon.c  |  8 +++++
 5 files changed, 92 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 7d1ee70..e88bb6c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -64,6 +64,7 @@
 #include "sched-int.h"
 #include "target-globals.h"
 #include "common/common-target.h"
+#include "tree-scalar-evolution.h"
 #include "selftest.h"
 #include "selftest-rtl.h"
 
@@ -15122,6 +15123,72 @@ aarch64_sched_can_speculate_insn (rtx_insn *insn)
     }
 }
 
+/* Count the strided loads in LOOP with respect to OUT_LOOP.
+   Once the count reaches MAX_STRIDED_LOADS, we don't need to
+   examine the remaining loads.  */
+
+static unsigned
+count_mem_load_streams (struct loop *out_loop,
+                        struct loop *loop,
+                        unsigned max_strided_loads)
+{
+  basic_block *bbs = get_loop_body (loop);
+  unsigned nbbs = loop->num_nodes;
+  gimple_stmt_iterator gsi;
+  unsigned count = 0;
+
+  for (unsigned i = 0; i < nbbs; i++)
+    {
+      bool ok;
+      basic_block bb = bbs[i];
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+           gsi_next (&gsi))
+        {
+          gimple *stmt = gsi_stmt (gsi);
+          if (!is_gimple_assign (stmt)
+              || !gimple_vuse (stmt))
+            continue;
+          tree op = gimple_assign_rhs1 (stmt);
+          if (!INDIRECT_REF_P (op)
+              && TREE_CODE (op) != MEM_REF
+              && TREE_CODE (op) != TARGET_MEM_REF)
+            continue;
+          op = TREE_OPERAND (op, 0);
+          tree ev = analyze_scalar_evolution (out_loop, op);
+          ev = instantiate_parameters (loop, ev);
+          if (no_evolution_in_loop_p (ev, out_loop->num, &ok) && !ok)
+            count++;
+          if (count >= max_strided_loads)
+            return count;
+        }
+    }
+  return count;
+}
+
+/* Target hook that prevents complete loop unrolling if this would give
+   the outer loop more prefetch streams than the hardware can handle.  */
+
+static bool
+aarch64_ok_to_unroll (struct loop *loop, unsigned HOST_WIDE_INT nunroll)
+{
+  struct loop *loop_father;
+  unsigned loads;
+  unsigned outter_loads;
+
+  if (aarch64_tune_params.prefetch->hw_prefetchers_avail == -1)
+    return true;
+
+  if ((loop_father = loop_outer (loop)))
+    {
+      unsigned max_strided_loads = aarch64_tune_params.prefetch->hw_prefetchers_avail;
+      loads = count_mem_load_streams (loop_father, loop, max_strided_loads);
+      outter_loads = count_mem_load_streams (loop_father, loop_father, max_strided_loads);
+      if ((outter_loads + (nunroll - 1) * loads) > max_strided_loads)
+        return false;
+    }
+  return true;
+}
+
 /* Target-specific selftests.  */
 
 #if CHECKING_P
@@ -15550,6 +15617,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS
 #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 4
 
+#undef TARGET_OK_TO_UNROLL
+#define TARGET_OK_TO_UNROLL aarch64_ok_to_unroll
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 795e492..45cea4c 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11617,6 +11617,10 @@ is required only when the target has special constraints like maximum
 number of memory accesses.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_OK_TO_UNROLL (struct loop *@var{loop_info}, unsigned HOST_WIDE_INT @var{nunroll})
+This hook should return false if the target prefers that the loop not be unrolled.
+@end deftypefn
+
 @defmac POWI_MAX_MULTS
 If defined, this macro is interpreted as a signed integer C expression
 that specifies the maximum number of floating point multiplications
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 98f2e6b..64dfa51 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -8155,6 +8155,8 @@ build_type_attribute_variant (@var{mdecl},
 
 @hook TARGET_LOOP_UNROLL_ADJUST
 
+@hook TARGET_OK_TO_UNROLL
+
 @defmac POWI_MAX_MULTS
 If defined, this macro is interpreted as a signed integer C expression
 that specifies the maximum number of floating point multiplications
diff --git a/gcc/target.def b/gcc/target.def
index bbd9c01..2f62328 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5120,6 +5120,14 @@ hardware divmod insn but defines target-specific divmod libfuncs.",
  void, (rtx libfunc, machine_mode mode, rtx op0, rtx op1, rtx *quot, rtx *rem),
  NULL)
 
+/* Target function to check whether completely unrolling the loop is profitable.  */
+DEFHOOK
+(ok_to_unroll,
+ "This hook should return false if the target prefers that the loop not be unrolled.",
+ bool,
+ (struct loop *loop_info, unsigned HOST_WIDE_INT nunroll),
+ NULL)
+
 /* Return the class for a secondary reload, and fill in extra information.  */
 DEFHOOK
 (secondary_reload,
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index efb199a..c2016458 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-inline.h"
 #include "tree-cfgcleanup.h"
 #include "builtins.h"
+#include "target.h"
 
 /* Specifies types of loops that may be unrolled.  */
 
@@ -855,6 +856,13 @@ try_unroll_loop_completely (struct loop *loop,
                      loop->num);
           return false;
         }
+
+      if (targetm.ok_to_unroll
+          && !targetm.ok_to_unroll (loop, n_unroll))
+        {
+          return false;
+        }
+
       if (!n_unroll)
         dump_printf_loc (report_flags, locus,
                          "loop turned into non-loop; it never loops.\n");
-- 
2.7.4
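
For readers who want to experiment with the threshold test outside of GCC,
the arithmetic in aarch64_ok_to_unroll can be modelled as a stand-alone
function. This is only a sketch under stated assumptions: plain integers
stand in for GCC's loop structures and tuning tables, the name
model_ok_to_unroll and its parameters are hypothetical, and the early return
for nunroll == 0 is an addition of the sketch (in unsigned arithmetic,
nunroll - 1 would otherwise wrap around), not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the hook's decision.  OUTER_LOADS and
   INNER_LOADS stand for the two count_mem_load_streams results,
   HW_PREFETCHERS_AVAIL for the tuning value (-1 meaning unknown).  */
static bool
model_ok_to_unroll (unsigned outer_loads, unsigned inner_loads,
                    unsigned long nunroll, int hw_prefetchers_avail)
{
  /* No prefetcher count in the tuning data: never veto.  */
  if (hw_prefetchers_avail == -1)
    return true;

  /* Assumption of this sketch only: avoid wrap-around in nunroll - 1.  */
  if (nunroll == 0)
    return true;

  unsigned long max_streams = (unsigned long) hw_prefetchers_avail;
  unsigned long total = outer_loads + (nunroll - 1) * inner_loads;

  /* Unrolling is acceptable while the projected number of streams
     does not exceed what the prefetchers can follow.  */
  return total <= max_streams;
}

/* A few sample decisions with 8 available prefetchers.  */
static void
model_examples (void)
{
  assert (model_ok_to_unroll (2, 1, 4, 8));     /* 2 + 3 * 1 = 5   */
  assert (model_ok_to_unroll (2, 2, 4, 8));     /* 2 + 3 * 2 = 8   */
  assert (!model_ok_to_unroll (2, 3, 4, 8));    /* 2 + 3 * 3 = 11  */
  assert (model_ok_to_unroll (50, 50, 50, -1)); /* unknown: allow  */
}
```

The factor (nunroll - 1) rather than nunroll presumably reflects that the
outer-loop count already includes one copy of the inner loop's loads, since
the outer loop's body contains the inner loop's blocks.
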