From patchwork Wed Nov 22 10:09:08 2017
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 119458
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@linaro.org
Subject: Replace REDUC_*_EXPRs with internal functions.
Date: Wed, 22 Nov 2017 10:09:08 +0000
Message-ID: <87shd6zku3.fsf@linaro.org>

This patch replaces the REDUC_*_EXPR tree codes with internal functions.
This is needed so that the support for in-order reductions can also use
internal functions without too much complication.

This came out of the review for:

    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01516.html

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?

Richard


2017-11-22  Richard Sandiford

gcc/
	* tree.def (REDUC_MAX_EXPR, REDUC_MIN_EXPR, REDUC_PLUS_EXPR): Delete.
	* cfgexpand.c (expand_debug_expr): Remove handling for them.
	* expr.c (expand_expr_real_2): Likewise.
	* fold-const.c (const_unop): Likewise.
	* optabs-tree.c (optab_for_tree_code): Likewise.
	* tree-cfg.c (verify_gimple_assign_unary): Likewise.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	(op_code_prio): Likewise.
	(op_symbol_code): Likewise.
	* internal-fn.def (DEF_INTERNAL_SIGNED_OPTAB_FN): Define.
	(IFN_REDUC_PLUS, IFN_REDUC_MAX, IFN_REDUC_MIN): New internal functions.
	* internal-fn.c (direct_internal_fn_optab): New function.
	(direct_internal_fn_array, direct_internal_fn_supported_p)
	(internal_fn_expanders): Handle DEF_INTERNAL_SIGNED_OPTAB_FN.
	* fold-const-call.c (fold_const_reduction): New function.
	(fold_const_call): Handle CFN_REDUC_PLUS, CFN_REDUC_MAX and
	CFN_REDUC_MIN.
	* tree-vect-loop.c: Include internal-fn.h.
	(reduction_code_for_scalar_code): Rename to...
	(reduction_fn_for_scalar_code): ...this and return an internal
	function.
	(vect_model_reduction_cost): Take an internal_fn rather than
	a tree_code.
	(vect_create_epilog_for_reduction): Likewise.  Build calls rather
	than assignments.
	(vectorizable_reduction): Use internal functions rather than tree
	codes for the reduction operation.  Update calls to the functions
	above.
	* config/aarch64/aarch64-builtins.c (aarch64_gimple_fold_builtin):
	Use calls to internal functions rather than REDUC tree codes.
	* config/aarch64/aarch64-simd.md: Update comment accordingly.

Index: gcc/tree.def
===================================================================
--- gcc/tree.def	2017-11-22 10:05:56.012967507 +0000
+++ gcc/tree.def	2017-11-22 10:06:16.885024316 +0000
@@ -1260,18 +1260,6 @@ DEFTREECODE (OMP_CLAUSE, "omp_clause", t
    Operand 0: BODY: contains body of the transaction.  */
 DEFTREECODE (TRANSACTION_EXPR, "transaction_expr", tcc_expression, 1)
 
-/* Reduction operations.
-   Operations that take a vector of elements and "reduce" it to a scalar
-   result (e.g. summing the elements of the vector, finding the minimum over
-   the vector elements, etc).
-   Operand 0 is a vector.
-   The expression returns a scalar, with type the same as the elements of the
-   vector, holding the result of the reduction of all elements of the operand.
-   */
-DEFTREECODE (REDUC_MAX_EXPR, "reduc_max_expr", tcc_unary, 1)
-DEFTREECODE (REDUC_MIN_EXPR, "reduc_min_expr", tcc_unary, 1)
-DEFTREECODE (REDUC_PLUS_EXPR, "reduc_plus_expr", tcc_unary, 1)
-
 /* Widening dot-product.
    The first two arguments are of type t1.
    The third argument and the result are of type t2, such that t2 is at least
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/cfgexpand.c	2017-11-22 10:06:16.878456941 +0000
@@ -5050,9 +5050,6 @@ expand_debug_expr (tree exp)
 
     /* Vector stuff.  For most of the codes we don't have rtl codes.  */
     case REALIGN_LOAD_EXPR:
-    case REDUC_MAX_EXPR:
-    case REDUC_MIN_EXPR:
-    case REDUC_PLUS_EXPR:
     case VEC_COND_EXPR:
     case VEC_PACK_FIX_TRUNC_EXPR:
     case VEC_PACK_SAT_EXPR:
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/expr.c	2017-11-22 10:06:16.880333334 +0000
@@ -9367,26 +9367,6 @@ #define REDUCE_BIT_FIELD(expr) (reduce_b
         return target;
       }
 
-    case REDUC_MAX_EXPR:
-    case REDUC_MIN_EXPR:
-    case REDUC_PLUS_EXPR:
-      {
-        op0 = expand_normal (treeop0);
-        this_optab = optab_for_tree_code (code, type, optab_default);
-        machine_mode vec_mode = TYPE_MODE (TREE_TYPE (treeop0));
-
-        struct expand_operand ops[2];
-        enum insn_code icode = optab_handler (this_optab, vec_mode);
-
-        create_output_operand (&ops[0], target, mode);
-        create_input_operand (&ops[1], op0, vec_mode);
-        expand_insn (icode, 2, ops);
-        target = ops[0].value;
-        if (GET_MODE (target) != mode)
-          return gen_lowpart (tmode, target);
-        return target;
-      }
-
     case VEC_UNPACK_HI_EXPR:
     case VEC_UNPACK_LO_EXPR:
       {
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/fold-const.c	2017-11-22 10:06:16.882209727 +0000
@@ -1707,36 +1707,6 @@ const_unop (enum tree_code code, tree ty
         return build_vector (type, elts);
       }
 
-    case REDUC_MIN_EXPR:
-    case REDUC_MAX_EXPR:
-    case REDUC_PLUS_EXPR:
-      {
-        unsigned int nelts, i;
-        enum tree_code subcode;
-
-        if (TREE_CODE (arg0) != VECTOR_CST)
-          return NULL_TREE;
-        nelts = VECTOR_CST_NELTS (arg0);
-
-        switch (code)
-          {
-          case REDUC_MIN_EXPR: subcode = MIN_EXPR; break;
-          case REDUC_MAX_EXPR: subcode = MAX_EXPR; break;
-          case REDUC_PLUS_EXPR: subcode = PLUS_EXPR; break;
-          default: gcc_unreachable ();
-          }
-
-        tree res = VECTOR_CST_ELT (arg0, 0);
-        for (i = 1; i < nelts; i++)
-          {
-            res = const_binop (subcode, res, VECTOR_CST_ELT (arg0, i));
-            if (res == NULL_TREE || !CONSTANT_CLASS_P (res))
-              return NULL_TREE;
-          }
-
-        return res;
-      }
-
     default:
       break;
     }
Index: gcc/optabs-tree.c
===================================================================
--- gcc/optabs-tree.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/optabs-tree.c	2017-11-22 10:06:16.882209727 +0000
@@ -146,17 +146,6 @@ optab_for_tree_code (enum tree_code code
     case FMA_EXPR:
       return fma_optab;
 
-    case REDUC_MAX_EXPR:
-      return TYPE_UNSIGNED (type)
-             ? reduc_umax_scal_optab : reduc_smax_scal_optab;
-
-    case REDUC_MIN_EXPR:
-      return TYPE_UNSIGNED (type)
-             ? reduc_umin_scal_optab : reduc_smin_scal_optab;
-
-    case REDUC_PLUS_EXPR:
-      return reduc_plus_scal_optab;
-
     case VEC_WIDEN_MULT_HI_EXPR:
       return TYPE_UNSIGNED (type) ?
         vec_widen_umult_hi_optab : vec_widen_smult_hi_optab;
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/tree-cfg.c	2017-11-22 10:06:16.883147923 +0000
@@ -3773,18 +3773,6 @@ verify_gimple_assign_unary (gassign *stm
 
         return false;
       }
-    case REDUC_MAX_EXPR:
-    case REDUC_MIN_EXPR:
-    case REDUC_PLUS_EXPR:
-      if (!VECTOR_TYPE_P (rhs1_type)
-          || !useless_type_conversion_p (lhs_type, TREE_TYPE (rhs1_type)))
-        {
-          error ("reduction should convert from vector to element type");
-          debug_generic_expr (lhs_type);
-          debug_generic_expr (rhs1_type);
-          return true;
-        }
-      return false;
 
     case VEC_UNPACK_HI_EXPR:
     case VEC_UNPACK_LO_EXPR:
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/tree-inline.c	2017-11-22 10:06:16.884086120 +0000
@@ -3874,9 +3874,6 @@ estimate_operator_cost (enum tree_code c
 
     case REALIGN_LOAD_EXPR:
 
-    case REDUC_MAX_EXPR:
-    case REDUC_MIN_EXPR:
-    case REDUC_PLUS_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case DOT_PROD_EXPR:
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/tree-pretty-print.c	2017-11-22 10:06:16.884086120 +0000
@@ -3208,24 +3208,6 @@ dump_generic_node (pretty_printer *pp, t
       is_expr = false;
       break;
 
-    case REDUC_MAX_EXPR:
-      pp_string (pp, " REDUC_MAX_EXPR < ");
-      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
-      pp_string (pp, " > ");
-      break;
-
-    case REDUC_MIN_EXPR:
-      pp_string (pp, " REDUC_MIN_EXPR < ");
-      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
-      pp_string (pp, " > ");
-      break;
-
-    case REDUC_PLUS_EXPR:
-      pp_string (pp, " REDUC_PLUS_EXPR < ");
-      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
-      pp_string (pp, " > ");
-      break;
-
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
@@ -3604,9 +3586,6 @@ op_code_prio (enum tree_code code)
     case ABS_EXPR:
     case REALPART_EXPR:
     case IMAGPART_EXPR:
-    case REDUC_MAX_EXPR:
-    case REDUC_MIN_EXPR:
-    case REDUC_PLUS_EXPR:
     case VEC_UNPACK_HI_EXPR:
     case VEC_UNPACK_LO_EXPR:
     case VEC_UNPACK_FLOAT_HI_EXPR:
@@ -3725,9 +3704,6 @@ op_symbol_code (enum tree_code code)
     case PLUS_EXPR:
       return "+";
 
-    case REDUC_PLUS_EXPR:
-      return "r+";
-
     case WIDEN_SUM_EXPR:
       return "w+";
 
Index: gcc/internal-fn.def
===================================================================
--- gcc/internal-fn.def	2017-11-22 10:05:56.012967507 +0000
+++ gcc/internal-fn.def	2017-11-22 10:06:16.882209727 +0000
@@ -30,6 +30,8 @@ along with GCC; see the file COPYING3.
 
      DEF_INTERNAL_FN (NAME, FLAGS, FNSPEC)
      DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
+     DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SIGNED_OPTAB,
+                                   UNSIGNED_OPTAB, TYPE)
      DEF_INTERNAL_FLT_FN (NAME, FLAGS, OPTAB, TYPE)
      DEF_INTERNAL_INT_FN (NAME, FLAGS, OPTAB, TYPE)
 
@@ -49,6 +51,12 @@ along with GCC; see the file COPYING3.
    - mask_store: currently just maskstore
    - store_lanes: currently just vec_store_lanes
 
+   DEF_INTERNAL_SIGNED_OPTAB_FN defines an internal function that
+   maps to one of two optabs, depending on the signedness of an input.
+   SIGNED_OPTAB and UNSIGNED_OPTAB are the optabs for signed and
+   unsigned inputs respectively, both without the trailing "_optab".
+   SELECTOR says which type in the tree_pair determines the signedness.
+
    DEF_INTERNAL_FLT_FN is like DEF_INTERNAL_OPTAB_FN, but in addition,
    the function implements the computational part of a built-in math
    function BUILT_IN_{F,,L}.  Unlike some built-in functions,
@@ -75,6 +83,12 @@ along with GCC; see the file COPYING3.
   DEF_INTERNAL_FN (NAME, FLAGS | ECF_LEAF, NULL)
 #endif
 
+#ifndef DEF_INTERNAL_SIGNED_OPTAB_FN
+#define DEF_INTERNAL_SIGNED_OPTAB_FN(NAME, FLAGS, SELECTOR, SIGNED_OPTAB, \
+                                     UNSIGNED_OPTAB, TYPE) \
+  DEF_INTERNAL_FN (NAME, FLAGS | ECF_LEAF, NULL)
+#endif
+
 #ifndef DEF_INTERNAL_FLT_FN
 #define DEF_INTERNAL_FLT_FN(NAME, FLAGS, OPTAB, TYPE) \
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
@@ -98,6 +112,13 @@ DEF_INTERNAL_OPTAB_FN (STORE_LANES, ECF_
 
 DEF_INTERNAL_OPTAB_FN (RSQRT, ECF_CONST, rsqrt, unary)
 
+DEF_INTERNAL_OPTAB_FN (REDUC_PLUS, ECF_CONST | ECF_NOTHROW,
+                       reduc_plus_scal, unary)
+DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MAX, ECF_CONST | ECF_NOTHROW, first,
+                              reduc_smax_scal, reduc_umax_scal, unary)
+DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MIN, ECF_CONST | ECF_NOTHROW, first,
+                              reduc_smin_scal, reduc_umin_scal, unary)
+
 /* Unary math functions.  */
 DEF_INTERNAL_FLT_FN (ACOS, ECF_CONST, acos, unary)
 DEF_INTERNAL_FLT_FN (ASIN, ECF_CONST, asin, unary)
@@ -236,5 +257,6 @@ DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF
 #undef DEF_INTERNAL_INT_FN
 #undef DEF_INTERNAL_FLT_FN
 #undef DEF_INTERNAL_FLT_FLOATN_FN
+#undef DEF_INTERNAL_SIGNED_OPTAB_FN
 #undef DEF_INTERNAL_OPTAB_FN
 #undef DEF_INTERNAL_FN
Index: gcc/internal-fn.c
===================================================================
--- gcc/internal-fn.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/internal-fn.c	2017-11-22 10:06:16.882209727 +0000
@@ -90,6 +90,8 @@ #define binary_direct { 0, 0, true }
 const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
 #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) not_direct,
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) TYPE##_direct,
+#define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
+                                     UNSIGNED_OPTAB, TYPE) TYPE##_direct,
 #include "internal-fn.def"
   not_direct
 };
@@ -2818,6 +2820,30 @@ #define direct_load_lanes_optab_supporte
 #define direct_mask_store_optab_supported_p direct_optab_supported_p
 #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p
 
+/* Return the optab used by internal function FN.  */
+
+static optab
+direct_internal_fn_optab (internal_fn fn, tree_pair types)
+{
+  switch (fn)
+    {
+#define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) \
+    case IFN_##CODE: break;
+#define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \
+    case IFN_##CODE: return OPTAB##_optab;
+#define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
+                                     UNSIGNED_OPTAB, TYPE) \
+    case IFN_##CODE: return (TYPE_UNSIGNED (types.SELECTOR) \
+                             ? UNSIGNED_OPTAB ## _optab \
+                             : SIGNED_OPTAB ## _optab);
+#include "internal-fn.def"
+
+    case IFN_LAST:
+      break;
+    }
+  gcc_unreachable ();
+}
+
 /* Return true if FN is supported for the types in TYPES when the
    optimization type is OPT_TYPE.  The types are those associated with
    the "type0" and "type1" fields of FN's direct_internal_fn_info
@@ -2835,6 +2861,16 @@ #define DEF_INTERNAL_OPTAB_FN(CODE, FLAG
     case IFN_##CODE: \
       return direct_##TYPE##_optab_supported_p (OPTAB##_optab, types, \
                                                 opt_type);
+#define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
+                                     UNSIGNED_OPTAB, TYPE) \
+    case IFN_##CODE: \
+      { \
+        optab which_optab = (TYPE_UNSIGNED (types.SELECTOR) \
+                             ? UNSIGNED_OPTAB ## _optab \
+                             : SIGNED_OPTAB ## _optab); \
+        return direct_##TYPE##_optab_supported_p (which_optab, types, \
+                                                  opt_type); \
+      }
 #include "internal-fn.def"
 
     case IFN_LAST:
@@ -2874,6 +2910,15 @@ #define DEF_INTERNAL_OPTAB_FN(CODE, FLAG
   { \
     expand_##TYPE##_optab_fn (fn, stmt, OPTAB##_optab); \
   }
+#define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
+                                     UNSIGNED_OPTAB, TYPE) \
+  static void \
+  expand_##CODE (internal_fn fn, gcall *stmt) \
+  { \
+    tree_pair types = direct_internal_fn_types (fn, stmt); \
+    optab which_optab = direct_internal_fn_optab (fn, types); \
+    expand_##TYPE##_optab_fn (fn, stmt, which_optab); \
+  }
 #include "internal-fn.def"
 
 /* Routines to expand each internal function, indexed by function number.
Index: gcc/fold-const-call.c
===================================================================
--- gcc/fold-const-call.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/fold-const-call.c	2017-11-22 10:06:16.880333334 +0000
@@ -583,6 +583,25 @@ fold_const_builtin_nan (tree type, tree
   return NULL_TREE;
 }
 
+/* Fold a call to IFN_REDUC_ (ARG), returning a value of type TYPE.  */
+
+static tree
+fold_const_reduction (tree type, tree arg, tree_code code)
+{
+  if (TREE_CODE (arg) != VECTOR_CST)
+    return NULL_TREE;
+
+  tree res = VECTOR_CST_ELT (arg, 0);
+  unsigned int nelts = VECTOR_CST_NELTS (arg);
+  for (unsigned int i = 1; i < nelts; i++)
+    {
+      res = const_binop (code, type, res, VECTOR_CST_ELT (arg, i));
+      if (res == NULL_TREE || !CONSTANT_CLASS_P (res))
+        return NULL_TREE;
+    }
+  return res;
+}
+
 /* Try to evaluate:
 
       *RESULT = FN (*ARG)
@@ -1148,6 +1167,15 @@ fold_const_call (combined_fn fn, tree ty
     CASE_FLT_FN_FLOATN_NX (CFN_BUILT_IN_NANS):
       return fold_const_builtin_nan (type, arg, false);
 
+    case CFN_REDUC_PLUS:
+      return fold_const_reduction (type, arg, PLUS_EXPR);
+
+    case CFN_REDUC_MAX:
+      return fold_const_reduction (type, arg, MAX_EXPR);
+
+    case CFN_REDUC_MIN:
+      return fold_const_reduction (type, arg, MIN_EXPR);
+
     default:
       return fold_const_call_1 (fn, type, arg);
     }
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2017-11-22 10:05:56.012967507 +0000
+++ gcc/tree-vect-loop.c	2017-11-22 10:06:16.885024316 +0000
@@ -50,6 +50,7 @@ Software Foundation; either version 3, o
 #include "cgraph.h"
 #include "tree-cfg.h"
 #include "tree-if-conv.h"
+#include "internal-fn.h"
 
 /* Loop Vectorization Pass.
 
@@ -2376,35 +2377,34 @@ vect_analyze_loop (struct loop *loop, lo
 }
 
 
-/* Function reduction_code_for_scalar_code
+/* Function reduction_fn_for_scalar_code
 
    Input:
    CODE - tree_code of a reduction operations.
 
    Output:
-   REDUC_CODE - the corresponding tree-code to be used to reduce the
-      vector of partial results into a single scalar result, or ERROR_MARK
+   REDUC_FN - the corresponding internal function to be used to reduce the
+      vector of partial results into a single scalar result, or IFN_LAST
       if the operation is a supported reduction operation, but does not have
-      such a tree-code.
+      such an internal function.
 
    Return FALSE if CODE currently cannot be vectorized as reduction.  */
 
 static bool
-reduction_code_for_scalar_code (enum tree_code code,
-                                enum tree_code *reduc_code)
+reduction_fn_for_scalar_code (enum tree_code code, internal_fn *reduc_fn)
 {
   switch (code)
     {
       case MAX_EXPR:
-        *reduc_code = REDUC_MAX_EXPR;
+        *reduc_fn = IFN_REDUC_MAX;
         return true;
 
       case MIN_EXPR:
-        *reduc_code = REDUC_MIN_EXPR;
+        *reduc_fn = IFN_REDUC_MIN;
         return true;
 
       case PLUS_EXPR:
-        *reduc_code = REDUC_PLUS_EXPR;
+        *reduc_fn = IFN_REDUC_PLUS;
         return true;
 
       case MULT_EXPR:
@@ -2412,7 +2412,7 @@ reduction_code_for_scalar_code (enum tre
       case BIT_IOR_EXPR:
       case BIT_XOR_EXPR:
       case BIT_AND_EXPR:
-        *reduc_code = ERROR_MARK;
+        *reduc_fn = IFN_LAST;
         return true;
 
       default:
@@ -3745,7 +3745,7 @@ have_whole_vector_shift (machine_mode mo
    the loop, and the epilogue code that must be generated.  */
 
 static void
-vect_model_reduction_cost (stmt_vec_info stmt_info, enum tree_code reduc_code,
+vect_model_reduction_cost (stmt_vec_info stmt_info, internal_fn reduc_fn,
                            int ncopies)
 {
   int prologue_cost = 0, epilogue_cost = 0;
@@ -3799,7 +3799,7 @@ vect_model_reduction_cost (stmt_vec_info
 
   if (!loop || !nested_in_vect_loop_p (loop, orig_stmt))
     {
-      if (reduc_code != ERROR_MARK)
+      if (reduc_fn != IFN_LAST)
         {
           if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION)
             {
@@ -4266,7 +4266,7 @@ get_initial_defs_for_reduction (slp_tree
      we have to generate more than one vector stmt - i.e - we need to "unroll"
      the vector stmt by a factor VF/nunits.  For more details see documentation
      in vectorizable_operation.
-   REDUC_CODE is the tree-code for the epilog reduction.
+   REDUC_FN is the internal function for the epilog reduction.
    REDUCTION_PHIS is a list of the phi-nodes that carry the reduction
      computation.
    REDUC_INDEX is the index of the operand in the right hand side of the
@@ -4282,7 +4282,7 @@ get_initial_defs_for_reduction (slp_tree
         The loop-latch argument is taken from VECT_DEFS - the vector of partial
         sums.
      2. "Reduces" each vector of partial results VECT_DEFS into a single result,
-        by applying the operation specified by REDUC_CODE if available, or by
+        by calling the function specified by REDUC_FN if available, or by
         other means (whole-vector shifts or a scalar loop).
         The function also creates a new phi node at the loop exit to preserve
         loop-closed form, as illustrated below.
@@ -4317,7 +4317,7 @@ get_initial_defs_for_reduction (slp_tree
 static void
 vect_create_epilog_for_reduction (vec vect_defs, gimple *stmt,
                                   gimple *reduc_def_stmt,
-                                  int ncopies, enum tree_code reduc_code,
+                                  int ncopies, internal_fn reduc_fn,
                                   vec reduction_phis,
                                   bool double_reduc,
                                   slp_tree slp_node,
@@ -4569,7 +4569,7 @@ vect_create_epilog_for_reduction (vec2"
 ;; 'across lanes' max and min ops.
 ;; Template for outputting a scalar, so we can create __builtins which can be
-;; gimple_fold'd to the REDUC_(MAX|MIN)_EXPR tree code.  (This is FP smax/smin).
+;; gimple_fold'd to the IFN_REDUC_(MAX|MIN) function.  (This is FP smax/smin).
 (define_expand "reduc__scal_"
   [(match_operand: 0 "register_operand")
    (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand")]