From patchwork Mon Feb 12 23:55:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kugan Vivekanandarajah X-Patchwork-Id: 128177 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp3700509ljc; Mon, 12 Feb 2018 15:56:23 -0800 (PST) X-Google-Smtp-Source: AH8x226xx+xBp+Nm6kDelF5EPd0q33z1hHE3R4PEEFcsa1xDFAvhgsKxEPXMzNX+30CukEmus4sG X-Received: by 2002:a17:902:6884:: with SMTP id i4-v6mr9212410plk.259.1518479783054; Mon, 12 Feb 2018 15:56:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518479783; cv=none; d=google.com; s=arc-20160816; b=rTzXvrHNltw338PoKa/jJ+Kay90hrnUiuretGTnvzwsD/KOzi3axJdoqEwpVe6EVOJ dBbY9K8PNbiKJ19R8d9hCPWroXA5WUFZf0ncBDi18fk65FTzo4gn4Drxw25+1awhHdva zdiEVhgIqLAm40uA1LdkaYnH723Ietd4lBnzH6oXvxcwLhd2Sk9uYA7WPq23lWXsWUz7 uSVzMauIR9e+3mu27fu9dgBgNkYfFrbSvD+DT4uS4m6r8F3lW9kLebOkCRhpek1zb/fx Wwnzp7Zv3WFUTBN/rlHIJrA1Zd2H40/YV7NP9Ferwzi+QxN87BE61+SrEzhIb3c0h26i Uzvg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=to:subject:message-id:date:from:mime-version:delivered-to:sender :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:mailing-list:dkim-signature:domainkey-signature :arc-authentication-results; bh=pogpQ1PmMgqsAAVyWCe8BOnJbncP4k5kDTOz8IbfdwE=; b=jNLM7hHJYicvmoGI7R+GWMBDBShQ44yef/8eoUzgepk81KK8avlHXXeQcv72URhkZr dmTmUCT0ONWQJ9wk9tc43gfW5ASuqWsyZIQMf/1WL0KAgvFiAhVFrwoN7sjQ4e9MNeqs CeRdWtL04ARGQYp+iFeGyOmWoDmjbeWhrNpvfKktiLnATCP2qfP1+bqO1VInFNnIGcIc FvW6Sno0BZkwLKOeAvuoPQTmbazvLuw7KC28/Vx7LKKDM+5g7dFpkOF97hWv6i6kYzdB A/FzmhzkZWR0grbzRTj/CYWD8ax0vDlEzsaM/S1QCHlAGtrzy20ff8PTClT7vZMQ1UPu 1lfA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=NTwXa5Nr; spf=pass (google.com: domain of gcc-patches-return-473126-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-473126-patch=linaro.org@gcc.gnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id e71si1057941pgc.571.2018.02.12.15.56.22 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 12 Feb 2018 15:56:23 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-return-473126-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=NTwXa5Nr; spf=pass (google.com: domain of gcc-patches-return-473126-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-473126-patch=linaro.org@gcc.gnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:from:date:message-id:subject:to:content-type; q= dns; s=default; b=g9EOko853a+odOpwPM+MFuPAWHLEPGyfjFqvYb5OOzescK +21s7zISHs8DmN/paS+Ds+CeWpBNYUhqGrZVvqr4/V4TaW43kdIX78JzNQl89j6X V/Z5idGJtFg51g2nLXKCZV5v3RImXFDFOkVfeSNAVQg7gFTZDBDS5sL8c7xkQ= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:from:date:message-id:subject:to:content-type; s= default; bh=DLXnJWrxbGnTzwtSShJ8Qjz2E80=; b=NTwXa5NrPuaC9oUCM238 3sX8JOB6f0vAb8gAkc/W8EdoawhdUx/MO4BYYIvFwykvbBn7K8wLjWMc2ISRxBhA PZuGywXZAE+tXuRjXSxSwfy8wDCPar5Dcvh3qgbH1dCXnIVuqReyDUlBhloArc2x HA3ye4nRlDWEaNbI/ki3n1s= Received: (qmail 17234 invoked by alias); 12 Feb 2018 23:56:11 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 17222 invoked by uid 89); 12 Feb 2018 23:56:10 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.2 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=sk:number X-HELO: mail-qt0-f175.google.com Received: from mail-qt0-f175.google.com (HELO mail-qt0-f175.google.com) (209.85.216.175) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 12 Feb 2018 23:56:07 +0000 Received: by mail-qt0-f175.google.com with SMTP id c19so1694287qtm.7 for ; Mon, 12 Feb 2018 15:56:07 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=PyTlHimGrHCr00NecVJq8la5BtuOOa452m1Ou5PPB4U=; b=N8LzUhQ63LdikK8u4KS3yhpm16cYR+lQWS2G1D8Er6RFANaw9SRD+2lzYSnZA2QDJm sCQdkixoyTdRSNT7A3YVtWGfC60iwYt2aIPoBXcBBHkXvNqaayppviJalBH52it10cmn /L/MJktvrLZtvFPlRa3yRHooUDWzUkC2ug7c1exMK0aNpFdgMYu1M2OO7q+U5LbfqFDI 4VOf5jCCXtrpE1Tf4gybJpOvkjBFf/FGqctGqUDaH9ZR9WxM/WgsFiOIqrldwNFphwCS sY4rK/+iG14IGeWNbUA46dClQZ0QighWbreWqwDgtjvWTPubFVUfaNPHJka+C+0UT5Oq ekSw== X-Gm-Message-State: APf1xPBbW9pMrJdSr9eSxviaVtkaBlpYC0tQbeGp/PF455CoO8y25YQo 3dMGZED596PfAK3sv5PbcI9IY0TBY+Tx58f57y9x9Z2qIUM= X-Received: by 10.200.59.65 with SMTP id r1mr21338912qtf.18.1518479765943; Mon, 12 Feb 2018 15:56:05 -0800 (PST) MIME-Version: 1.0 Received: by 10.200.36.185 with HTTP; Mon, 12 Feb 2018 15:55:25 -0800 (PST) From: Kugan Vivekanandarajah Date: Tue, 13 Feb 2018 10:55:25 +1100 Message-ID: Subject: [RFC] Tree Loop Unroller Pass To: GCC Patches X-IsSubscribed: yes Implements tree loop unroller using the infrastructure provided. gcc/ChangeLog: 2018-02-12 Kugan Vivekanandarajah * Makefile.in (OBJS): Add tree-ssa-loop-unroll.o. * common.opt (ftree-loop-unroll): New option. * passes.def: Add pass_tree_loop_uroll * timevar.def (TV_TREE_LOOP_UNROLL): Add. * tree-pass.h (make_pass_tree_loop_unroll): Declare. * tree-ssa-loop-unroll.c: New file. >From 71baaf8393dd79a98b4c0216e56d87083caf0177 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Mon, 12 Feb 2018 10:44:00 +1100 Subject: [PATCH 2/4] tree-loop-unroller Change-Id: I58c25b5f2e796d4166af3ea4e50a0f4a3078b6c2 --- gcc/Makefile.in | 1 + gcc/common.opt | 4 + gcc/passes.def | 1 + gcc/timevar.def | 1 + gcc/tree-pass.h | 1 + gcc/tree-ssa-loop-unroll.c | 268 +++++++++++++++++++++++++++++++++++++++++++++ 6 files changed, 276 insertions(+) create mode 100644 gcc/tree-ssa-loop-unroll.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 374bf3e..de3c146 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1536,6 +1536,7 @@ OBJS = \ tree-ssa-loop-im.o \ tree-ssa-loop-ivcanon.o \ tree-ssa-loop-ivopts.o \ + tree-ssa-loop-unroll.o \ tree-ssa-loop-manip.o \ tree-ssa-loop-niter.o \ tree-ssa-loop-prefetch.o \ diff --git a/gcc/common.opt b/gcc/common.opt index b20a9aa..ea47b8c 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1770,6 +1770,10 @@ fivopts Common Report Var(flag_ivopts) Init(1) Optimization Optimize induction variables on trees. +ftree-loop-unroll +Common Report Var(flag_tree_loop_unroll) Init(1) Optimization +Perform loop unrolling in gimple. + fjump-tables Common Var(flag_jump_tables) Init(1) Optimization Use jump tables for sufficiently large switch statements. diff --git a/gcc/passes.def b/gcc/passes.def index 9802f08..57f7cc2 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -302,6 +302,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_predcom); NEXT_PASS (pass_complete_unroll); NEXT_PASS (pass_slp_vectorize); + NEXT_PASS (pass_tree_loop_unroll); NEXT_PASS (pass_loop_prefetch); /* Run IVOPTs after the last pass that uses data-reference analysis as that doesn't handle TARGET_MEM_REFs. */ diff --git a/gcc/timevar.def b/gcc/timevar.def index 91221ae..a6bb847 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -202,6 +202,7 @@ DEFTIMEVAR (TV_TREE_LOOP_DISTRIBUTION, "tree loop distribution") DEFTIMEVAR (TV_CHECK_DATA_DEPS , "tree check data dependences") DEFTIMEVAR (TV_TREE_PREFETCH , "tree prefetching") DEFTIMEVAR (TV_TREE_LOOP_IVOPTS , "tree iv optimization") +DEFTIMEVAR (TV_TREE_LOOP_UNROLL , "tree loop unroll") DEFTIMEVAR (TV_PREDCOM , "predictive commoning") DEFTIMEVAR (TV_TREE_CH , "tree copy headers") DEFTIMEVAR (TV_TREE_SSA_UNCPROP , "tree SSA uncprop") diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 93a6a99..2c0740f 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -388,6 +388,7 @@ extern gimple_opt_pass *make_pass_complete_unrolli (gcc::context *ctxt); extern gimple_opt_pass *make_pass_parallelize_loops (gcc::context *ctxt); extern gimple_opt_pass *make_pass_loop_prefetch (gcc::context *ctxt); extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_tree_loop_unroll (gcc::context *ctxt); extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt); diff --git a/gcc/tree-ssa-loop-unroll.c b/gcc/tree-ssa-loop-unroll.c new file mode 100644 index 0000000..04cf092 --- /dev/null +++ b/gcc/tree-ssa-loop-unroll.c @@ -0,0 +1,268 @@ + +/* Tree Loop Unroller. + Copyright (C) 2017 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 3, or (at your option) any +later version. + +GCC is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "hash-set.h" +#include "machmode.h" +#include "tree.h" +#include "tree-pass.h" +#include "target.h" +#include "gimple.h" +#include "cfgloop.h" +#include "tree-ssa-loop.h" +#include "tree-scalar-evolution.h" +#include "tree-ssa-loop-manip.h" +#include "tree-ssa-loop-niter.h" +#include "tree-ssa-loop-ivopts.h" +#include "gimple.h" +#include "predict.h" +#include "cfghooks.h" +#include "tree-inline.h" +#include "gimple-iterator.h" +#include "fold-const.h" +#include "tree-data-ref.h" +#include "params.h" + +/* Count the load memory streams in the LOOP. If the streams are larger + (compared to MAX_LOAD_STREAMS), we dont need to compute all of + them. This is used to limit the partial unrolling factor to avoid + too many memory load streams. */ + +static signed +count_mem_load_streams (struct loop *loop, signed max_load_streams) +{ + basic_block *bbs = get_loop_body (loop); + unsigned nbbs = loop->num_nodes; + gimple_stmt_iterator gsi; + signed count = 0; + hash_set *dr_set = new hash_set (); + vec datarefs_vec; + datarefs_vec.create (20); + unsigned k; + + for (unsigned i = 0; i < nbbs; i++) + { + basic_block bb = bbs[i]; + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); + gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + tree access_fn; + vec access_fns; + datarefs_vec.truncate (0); + + if (!find_data_references_in_stmt (loop, stmt, &datarefs_vec)) + continue; + + for (unsigned j = 0; j < datarefs_vec.length (); ++j) + { + data_reference_p dr = datarefs_vec[j]; + if (!DR_IS_READ (dr)) + continue; + access_fns = DR_ACCESS_FNS (dr); + + FOR_EACH_VEC_ELT (access_fns, k, access_fn) + { + if (TREE_CODE (access_fn) == POLYNOMIAL_CHREC) + { + if (!dr_set->add (dr)) + count++; + break; + } + } + } + + if (count > max_load_streams / 2) + break; + } + } + free_data_refs (datarefs_vec); + free (dr_set); + return count; +} + +/* Verify that loop in a form that is suitable for unrolling. */ + +static bool +loop_form_ok_p (struct loop *loop) +{ + if (!single_exit (loop)) + return false; + + /* Loop has preheader. */ + if (!loop_preheader_edge (loop)) + return false; + + return true; +} + +/* Return false if the cost model does not prefer LOOP to be unrolled. + Return true and update the DESC if we decided that the LOOP + is to be unrolled. Also, in this case, determine the factor (NUNROLL) + by which the LOOP should be unrolled. */ + +static bool +decide_unroll_iterations (struct loop *loop, + struct tree_niter_desc *desc, + signed *nunroll) +{ + /* nunroll = total number of copies of the original loop body in + unrolled loop (i.e. if it is 2, we have to duplicate loop body once. */ + signed ninsns = tree_num_loop_insns (loop, &eni_size_weights); + signed ninsns_outer = tree_num_loop_insns (loop_outer (loop), + &eni_size_weights); + signed n_unroll = PARAM_VALUE (PARAM_MAX_UNROLLED_INSNS) / ninsns; + signed load_streams = 0; + signed max_load_streams = -1; + signed n_unroll2; + + if (ninsns_outer >= PARAM_VALUE (PARAM_MAX_UNROLLED_INSNS)) + return false; + ninsns_outer -= ninsns; + n_unroll2 = (PARAM_VALUE (PARAM_MAX_UNROLLED_INSNS) - ninsns_outer)/ ninsns; + if (n_unroll2 < n_unroll) + n_unroll = n_unroll2; + + if (n_unroll > PARAM_VALUE (PARAM_MAX_UNROLL_TIMES)) + n_unroll = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES); + + /* Check for simple loops. */ + edge exit = single_exit (loop); + if (!number_of_iterations_exit (loop, exit, + desc, false, false) + || !integer_onep (desc->assumptions) + || !integer_zerop (desc->may_be_zero) + || !desc->niter) + return false; + + /* Now force nunroll to be power of 2. */ + n_unroll = 1 << (floor_log2 (n_unroll)); + if (targetm.hw_max_mem_read_streams + && (max_load_streams = targetm.hw_max_mem_read_streams ()) > 0) + { + load_streams = count_mem_load_streams (loop, max_load_streams); + if (load_streams > 0) + { + signed t = 1 << (floor_log2 (max_load_streams / load_streams)); + if (t < n_unroll) + n_unroll = t; + } + } + + if (!can_unroll_loop_p (loop, n_unroll, desc)) + return false; + + *nunroll = n_unroll; + return true; +} + +/* Unroll LOOP based on the cost model. */ + +static bool +unroll_loop (struct loop *loop) +{ + struct tree_niter_desc desc; + signed n_unroll; + + if (!decide_unroll_iterations (loop, &desc, &n_unroll)) + return false; + if (n_unroll <= 1) + return false; + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "\nunrollong loop=%d times=%d\n", loop->num, n_unroll); + initialize_original_copy_tables (); + tree_unroll_loop (loop, n_unroll, + single_dom_exit (loop), &desc); + free_original_copy_tables (); + return true; +} + +static unsigned +execute_tree_loop_unroll () +{ + struct loop *loop; + bool changed = false; + int todo_flags = 0; + if (optimize_function_for_size_p (cfun)) + return 0; + + estimate_numbers_of_iterations (cfun); + FOR_EACH_LOOP (loop, LI_FROM_INNERMOST) + { + if (!loop_form_ok_p (loop)) + continue; + if (unroll_loop (loop)) + { + changed |= true; + free_numbers_of_iterations_estimates (loop); + } + } + + if (changed) + { + scev_reset (); + todo_flags |= TODO_cleanup_cfg; + } + + return todo_flags; +} + +namespace { +const pass_data pass_data_tree_loop_unroll = +{ + GIMPLE_PASS, /* type */ + "tree-loop-unroll", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_TREE_LOOP_UNROLL, /* tv_id */ + PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + (TODO_update_ssa | TODO_verify_all), +}; + +class pass_tree_loop_unroll : public gimple_opt_pass +{ +public: + pass_tree_loop_unroll (gcc::context *ctxt) + : gimple_opt_pass (pass_data_tree_loop_unroll, ctxt) + {} + + /* opt_pass methods: */ + opt_pass * clone () { return new pass_tree_loop_unroll (m_ctxt); } + virtual bool gate (function *) { return flag_tree_loop_unroll != 0; } + virtual unsigned int execute (function *) + { + return execute_tree_loop_unroll (); + } + +}; // class pass_tree_loop_unroll + +} // anon namespace + +gimple_opt_pass * +make_pass_tree_loop_unroll (gcc::context *ctxt) +{ + return new pass_tree_loop_unroll (ctxt); +} + -- 2.7.4