From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@linaro.org
Subject: Add VEC_DUPLICATE_{CST,EXPR} and associated optab
Date: Mon, 25 Sep 2017 12:08:00 +0100
Message-ID: <874lrrjasv.fsf@linaro.org>
X-Patchwork-Id: 114172

SVE needs a way of broadcasting a scalar to a variable-length vector.
This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
equivalent of the existing rtl code VEC_DUPLICATE.

Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
to mark constant nodes, but in response to last year's RFC, Richard B.
suggested it would be better to have separate codes for the constant
and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
as a normal unary operation and avoids the previous need for treating
it as a GIMPLE_SINGLE_RHS.

It might make sense to use VEC_DUPLICATE_CST for all duplicated
vector constants, since it's a bit more compact than VECTOR_CST
in that case, and is potentially more efficient to process.
I don't have any specific plans to do that though.  We'll need
to keep both types of constant around whatever happens.

The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?

Richard


2017-09-25  Richard Sandiford
	    Alan Hayward
	    David Sherwood

gcc/
	* doc/generic.texi (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): Document.
	(VEC_COND_EXPR): Add missing @tindex.
	* doc/md.texi (vec_duplicate@var{m}): Document.
	* tree.def (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): New tree codes.
	* tree-core.h (tree_base): Document that u.nelts and TREE_OVERFLOW
	are used for VEC_DUPLICATE_CST as well.
	(tree_vector): Access base.u.nelts directly.
	* tree.h (TREE_OVERFLOW): Add VEC_DUPLICATE_CST to the list of
	valid codes.
	(VEC_DUPLICATE_CST_ELT): New macro.
	(build_vec_duplicate_cst): Declare.
	* tree.c (tree_node_structure_for_code, tree_code_size, tree_size)
	(integer_zerop, integer_onep, integer_all_onesp, integer_truep)
	(real_zerop, real_onep, real_minus_onep, add_expr, initializer_zerop)
	(walk_tree_1, drop_tree_overflow): Handle VEC_DUPLICATE_CST.
	(build_vec_duplicate_cst): New function.
	(uniform_vector_p): Handle the new codes.
	(test_vec_duplicate_predicates_int): New function.
	(test_vec_duplicate_predicates_float): Likewise.
	(test_vec_duplicate_predicates): Likewise.
	(tree_c_tests): Call test_vec_duplicate_predicates.
	* cfgexpand.c (expand_debug_expr): Handle the new codes.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* dwarf2out.c (rtl_for_decl_init): Handle VEC_DUPLICATE_CST.
	* gimple-expr.h (is_gimple_constant): Likewise.
	* gimplify.c (gimplify_expr): Likewise.
	* graphite-isl-ast-to-gimple.c
	(translate_isl_ast_to_gimple::is_constant): Likewise.
	* graphite-scop-detection.c (scan_tree_for_params): Likewise.
	* ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Likewise.
	(func_checker::compare_operand): Likewise.
	* ipa-icf.c (sem_item::add_expr, sem_variable::equals): Likewise.
	* match.pd (negate_expr_p): Likewise.
	* print-tree.c (print_node): Likewise.
	* tree-chkp.c (chkp_find_bounds_1): Likewise.
	* tree-data-ref.c (data_ref_compare_tree): Likewise.
	* tree-loop-distribution.c (const_with_all_bytes_same): Likewise.
	* tree-ssa-loop.c (for_each_index): Likewise.
	* tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
	* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
	(ao_ref_init_from_vn_reference): Likewise.
	* tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
	* varasm.c (const_hash_1, compare_constant): Likewise.
	* fold-const.c (negate_expr_p, fold_negate_expr_1, const_binop)
	(fold_convert_const, operand_equal_p, fold_view_convert_expr)
	(exact_inverse, fold_checksum_tree): Likewise.
	(const_unop): Likewise.  Fold VEC_DUPLICATE_EXPRs of a constant.
	(test_vec_duplicate_folding): New function.
	(fold_const_c_tests): Call it.
	* optabs.def (vec_duplicate_optab): New optab.
	* optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
	* optabs.h (expand_vector_broadcast): Declare.
	* optabs.c (expand_vector_broadcast): Make non-static.  Try using
	vec_duplicate_optab.
	* expr.c (store_constructor): Try using vec_duplicate_optab for
	uniform vectors.
	(const_vector_element): New function, split out from...
	(const_vector_from_tree): ...here.
	(expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.
	(expand_expr_real_1): Handle VEC_DUPLICATE_CST.
	* internal-fn.c (expand_vector_ubsan_overflow): Use CONSTANT_P
	instead of checking for VECTOR_CST.
	* tree-cfg.c (verify_gimple_assign_unary): Handle VEC_DUPLICATE_EXPR.
	(verify_gimple_assign_single): Handle VEC_DUPLICATE_CST.
	* tree-inline.c (estimate_operator_cost): Handle VEC_DUPLICATE_EXPR.

Index: gcc/doc/generic.texi
===================================================================
--- gcc/doc/generic.texi	2017-09-04 08:29:12.853103383 +0100
+++ gcc/doc/generic.texi	2017-09-25 12:03:06.688818488 +0100
@@ -1036,6 +1036,7 @@ As this example indicates, the operands
 @tindex FIXED_CST
 @tindex COMPLEX_CST
 @tindex VECTOR_CST
+@tindex VEC_DUPLICATE_CST
 @tindex STRING_CST
 @findex TREE_STRING_LENGTH
 @findex TREE_STRING_POINTER
@@ -1089,6 +1090,14 @@ constant nodes.  Each individual constan
 double constant node.  The first operand is a @code{TREE_LIST} of the
 constant nodes and is accessed through @code{TREE_VECTOR_CST_ELTS}.
 
+@item VEC_DUPLICATE_CST
+These nodes represent a vector constant in which every element has the
+same scalar value.  At present only variable-length vectors use
+@code{VEC_DUPLICATE_CST}; constant-length vectors use @code{VECTOR_CST}
+instead.  The scalar element value is given by
+@code{VEC_DUPLICATE_CST_ELT} and has the same restrictions as the
+element of a @code{VECTOR_CST}.
+
 @item STRING_CST
 These nodes represent string-constants.  The @code{TREE_STRING_LENGTH}
 returns the length of the string, as an @code{int}.  The
@@ -1692,6 +1701,7 @@ a value from @code{enum annot_expr_kind}
 
 @node Vectors
 @subsection Vectors
+@tindex VEC_DUPLICATE_EXPR
 @tindex VEC_LSHIFT_EXPR
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
@@ -1703,9 +1713,14 @@ a value from @code{enum annot_expr_kind}
 @tindex VEC_PACK_TRUNC_EXPR
 @tindex VEC_PACK_SAT_EXPR
 @tindex VEC_PACK_FIX_TRUNC_EXPR
+@tindex VEC_COND_EXPR
 @tindex SAD_EXPR
 
 @table @code
+@item VEC_DUPLICATE_EXPR
+This node has a single operand and represents a vector in which every
+element is equal to that operand.
+
 @item VEC_LSHIFT_EXPR
 @itemx VEC_RSHIFT_EXPR
 These nodes represent whole vector left and right shifts, respectively.
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	2017-09-04 11:49:42.934500723 +0100
+++ gcc/doc/md.texi	2017-09-25 12:03:06.693818177 +0100
@@ -4888,6 +4888,17 @@ and operand 1 is parallel containing val
 the vector mode @var{m}, or a vector mode with the same element mode and
 smaller number of elements.
 
+@cindex @code{vec_duplicate@var{m}} instruction pattern
+@item @samp{vec_duplicate@var{m}}
+Initialize vector output operand 0 so that each element has the value given
+by scalar input operand 1.  The vector has mode @var{m} and the scalar has
+the mode appropriate for one element of @var{m}.
+
+This pattern only handles duplicates of non-constant inputs.  Constant
+vectors go through the @code{mov@var{m}} pattern instead.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
 @item @samp{vec_cmp@var{m}@var{n}}
 Output a vector comparison.  Operand 0 of mode @var{n} is the destination for
Index: gcc/tree.def
===================================================================
--- gcc/tree.def	2017-07-27 10:37:56.369045398 +0100
+++ gcc/tree.def	2017-09-25 12:03:06.739815314 +0100
@@ -304,6 +304,10 @@ DEFTREECODE (COMPLEX_CST, "complex_cst",
 /* Contents are in VECTOR_CST_ELTS field.  */
 DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
 
+/* Represents a vector constant in which every element is equal to
+   VEC_DUPLICATE_CST_ELT.  */
+DEFTREECODE (VEC_DUPLICATE_CST, "vec_duplicate_cst", tcc_constant, 0)
+
 /* Contents are TREE_STRING_LENGTH and the actual contents of the string.  */
 DEFTREECODE (STRING_CST, "string_cst", tcc_constant, 0)
 
@@ -534,6 +538,9 @@ DEFTREECODE (TARGET_EXPR, "target_expr",
    1 and 2 are NULL.  The operands are then taken from the cfg edges.  */
 DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)
 
+/* Represents a vector in which every element is equal to operand 0.  */
+DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)
+
 /* Vector conditional expression.  It is like COND_EXPR, but with
    vector operands.
 
Index: gcc/tree-core.h
===================================================================
--- gcc/tree-core.h	2017-09-14 16:25:43.864400951 +0100
+++ gcc/tree-core.h	2017-09-25 12:03:06.723816310 +0100
@@ -975,7 +975,8 @@ struct GTY(()) tree_base {
     /* VEC length.  This field is only used with TREE_VEC.  */
     int length;
 
-    /* Number of elements.  This field is only used with VECTOR_CST.  */
+    /* Number of elements.  This field is only used with VECTOR_CST
+       and VEC_DUPLICATE_CST.  It is always 1 for VEC_DUPLICATE_CST.  */
     unsigned int nelts;
 
     /* SSA version number.  This field is only used with SSA_NAME.  */
@@ -1062,7 +1063,7 @@ struct GTY(()) tree_base {
    public_flag:
 
        TREE_OVERFLOW in
-           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST
+           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST, VEC_DUPLICATE_CST
 
        TREE_PUBLIC in
            VAR_DECL, FUNCTION_DECL
@@ -1329,7 +1330,7 @@ struct GTY(()) tree_complex {
 
 struct GTY(()) tree_vector {
   struct tree_typed typed;
-  tree GTY ((length ("VECTOR_CST_NELTS ((tree) &%h)"))) elts[1];
+  tree GTY ((length ("((tree) &%h)->base.u.nelts"))) elts[1];
 };
 
 struct GTY(()) tree_identifier {
Index: gcc/tree.h
===================================================================
--- gcc/tree.h	2017-09-14 16:45:44.200520742 +0100
+++ gcc/tree.h	2017-09-25 12:03:06.741815189 +0100
@@ -730,8 +730,8 @@ #define TREE_SYMBOL_REFERENCED(NODE) \
 #define TYPE_REF_CAN_ALIAS_ALL(NODE) \
   (PTR_OR_REF_CHECK (NODE)->base.static_flag)
 
-/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, or VECTOR_CST, this means
-   there was an overflow in folding.  */
+/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST or VEC_DUPLICATE_CST,
+   this means there was an overflow in folding.  */
 
 #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag)
 
@@ -1030,6 +1030,10 @@ #define VECTOR_CST_NELTS(NODE) (VECTOR_C
 #define VECTOR_CST_ELTS(NODE) (VECTOR_CST_CHECK (NODE)->vector.elts)
 #define VECTOR_CST_ELT(NODE,IDX) (VECTOR_CST_CHECK (NODE)->vector.elts[IDX])
 
+/* In a VEC_DUPLICATE_CST node.  */
+#define VEC_DUPLICATE_CST_ELT(NODE) \
+  (VEC_DUPLICATE_CST_CHECK (NODE)->vector.elts[0])
+
 /* Define fields and accessors for some special-purpose tree nodes.  */
 
 #define IDENTIFIER_LENGTH(NODE) \
@@ -4026,6 +4030,7 @@ extern tree build_int_cst (tree, HOST_WI
 extern tree build_int_cstu (tree type, unsigned HOST_WIDE_INT cst);
 extern tree build_int_cst_type (tree, HOST_WIDE_INT);
 extern tree make_vector (unsigned CXX_MEM_STAT_INFO);
+extern tree build_vec_duplicate_cst (tree, tree CXX_MEM_STAT_INFO);
 extern tree build_vector (tree, vec CXX_MEM_STAT_INFO);
 extern tree build_vector_from_ctor (tree, vec *);
 extern tree build_vector_from_val (tree, tree);
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	2017-09-21 12:06:40.939511360 +0100
+++ gcc/tree.c	2017-09-25 12:03:06.737815438 +0100
@@ -464,6 +464,7 @@ tree_node_structure_for_code (enum tree_
     case FIXED_CST: return TS_FIXED_CST;
     case COMPLEX_CST: return TS_COMPLEX;
     case VECTOR_CST: return TS_VECTOR;
+    case VEC_DUPLICATE_CST: return TS_VECTOR;
     case STRING_CST: return TS_STRING;
       /* tcc_exceptional cases.  */
     case ERROR_MARK: return TS_COMMON;
@@ -816,6 +817,7 @@ tree_code_size (enum tree_code code)
         case FIXED_CST: return sizeof (struct tree_fixed_cst);
         case COMPLEX_CST: return sizeof (struct tree_complex);
         case VECTOR_CST: return sizeof (struct tree_vector);
+        case VEC_DUPLICATE_CST: return sizeof (struct tree_vector);
         case STRING_CST: gcc_unreachable ();
         default:
           return lang_hooks.tree_size (code);
@@ -875,6 +877,9 @@ tree_size (const_tree node)
       return (sizeof (struct tree_vector)
               + (VECTOR_CST_NELTS (node) - 1) * sizeof (tree));
 
+    case VEC_DUPLICATE_CST:
+      return sizeof (struct tree_vector);
+
     case STRING_CST:
       return TREE_STRING_LENGTH (node) + offsetof (struct tree_string, str) + 1;
 
@@ -1682,6 +1687,30 @@ cst_and_fits_in_hwi (const_tree x)
           && (tree_fits_shwi_p (x) || tree_fits_uhwi_p (x)));
 }
 
+/* Build a new VEC_DUPLICATE_CST with type TYPE and operand EXP.
+
+   Note that this function is only suitable for callers that specifically
+   need a VEC_DUPLICATE_CST node.  Use build_vector_from_val to duplicate
+   a general scalar into a general vector type.  */
+
+tree
+build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)
+{
+  int length = sizeof (struct tree_vector);
+
+  record_node_allocation_statistics (VEC_DUPLICATE_CST, length);
+
+  tree t = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);
+
+  TREE_SET_CODE (t, VEC_DUPLICATE_CST);
+  TREE_TYPE (t) = type;
+  t->base.u.nelts = 1;
+  VEC_DUPLICATE_CST_ELT (t) = exp;
+  TREE_CONSTANT (t) = 1;
+
+  return t;
+}
+
 /* Build a newly constructed VECTOR_CST node of length LEN.  */
 
 tree
@@ -2343,6 +2372,8 @@ integer_zerop (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2369,6 +2400,8 @@ integer_onep (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2407,6 +2440,9 @@ integer_all_onesp (const_tree expr)
       return 1;
     }
 
+  else if (TREE_CODE (expr) == VEC_DUPLICATE_CST)
+    return integer_all_onesp (VEC_DUPLICATE_CST_ELT (expr));
+
   else if (TREE_CODE (expr) != INTEGER_CST)
     return 0;
 
@@ -2462,7 +2498,7 @@ integer_nonzerop (const_tree expr)
 int
 integer_truep (const_tree expr)
 {
-  if (TREE_CODE (expr) == VECTOR_CST)
+  if (TREE_CODE (expr) == VECTOR_CST || TREE_CODE (expr) == VEC_DUPLICATE_CST)
     return integer_all_onesp (expr);
   return integer_onep (expr);
 }
@@ -2633,6 +2669,8 @@ real_zerop (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2661,6 +2699,8 @@ real_onep (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2688,6 +2728,8 @@ real_minus_onep (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_minus_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -7090,6 +7132,9 @@ add_expr (const_tree t, inchash::hash &h
             inchash::add_expr (VECTOR_CST_ELT (t, i), hstate, flags);
           return;
         }
+    case VEC_DUPLICATE_CST:
+      inchash::add_expr (VEC_DUPLICATE_CST_ELT (t), hstate);
+      return;
     case SSA_NAME:
       /* We can just compare by pointer.  */
       hstate.add_wide_int (SSA_NAME_VERSION (t));
@@ -10344,6 +10389,9 @@ initializer_zerop (const_tree init)
         return true;
       }
 
+    case VEC_DUPLICATE_CST:
+      return initializer_zerop (VEC_DUPLICATE_CST_ELT (init));
+
     case CONSTRUCTOR:
       {
         unsigned HOST_WIDE_INT idx;
@@ -10389,7 +10437,13 @@ uniform_vector_p (const_tree vec)
 
   gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec)));
 
-  if (TREE_CODE (vec) == VECTOR_CST)
+  if (TREE_CODE (vec) == VEC_DUPLICATE_CST)
+    return VEC_DUPLICATE_CST_ELT (vec);
+
+  else if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR)
+    return TREE_OPERAND (vec, 0);
+
+  else if (TREE_CODE (vec) == VECTOR_CST)
     {
       first = VECTOR_CST_ELT (vec, 0);
       for (i = 1; i < VECTOR_CST_NELTS (vec); ++i)
@@ -11094,6 +11148,7 @@ #define WALK_SUBTREE_TAIL(NODE) \
     case REAL_CST:
     case FIXED_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case BLOCK:
     case PLACEHOLDER_EXPR:
@@ -12380,6 +12435,12 @@ drop_tree_overflow (tree t)
             elt = drop_tree_overflow (elt);
         }
     }
+  if (TREE_CODE (t) == VEC_DUPLICATE_CST)
+    {
+      tree *elt = &VEC_DUPLICATE_CST_ELT (t);
+      if (TREE_OVERFLOW (*elt))
+        *elt = drop_tree_overflow (*elt);
+    }
   return t;
 }
 
@@ -13797,6 +13858,92 @@ test_integer_constants ()
   ASSERT_EQ (type, TREE_TYPE (zero));
 }
 
+/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs
+   for integral type TYPE.  */
+
+static void
+test_vec_duplicate_predicates_int (tree type)
+{
+  tree vec_type = build_vector_type (type, 4);
+
+  tree zero = build_zero_cst (type);
+  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
+  ASSERT_TRUE (integer_zerop (vec_zero));
+  ASSERT_FALSE (integer_onep (vec_zero));
+  ASSERT_FALSE (integer_minus_onep (vec_zero));
+  ASSERT_FALSE (integer_all_onesp (vec_zero));
+  ASSERT_FALSE (integer_truep (vec_zero));
+  ASSERT_TRUE (initializer_zerop (vec_zero));
+
+  tree one = build_one_cst (type);
+  tree vec_one = build_vec_duplicate_cst (vec_type, one);
+  ASSERT_FALSE (integer_zerop (vec_one));
+  ASSERT_TRUE (integer_onep (vec_one));
+  ASSERT_FALSE (integer_minus_onep (vec_one));
+  ASSERT_FALSE (integer_all_onesp (vec_one));
+  ASSERT_FALSE (integer_truep (vec_one));
+  ASSERT_FALSE (initializer_zerop (vec_one));
+
+  tree minus_one = build_minus_one_cst (type);
+  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
+  ASSERT_FALSE (integer_zerop (vec_minus_one));
+  ASSERT_FALSE (integer_onep (vec_minus_one));
+  ASSERT_TRUE (integer_minus_onep (vec_minus_one));
+  ASSERT_TRUE (integer_all_onesp (vec_minus_one));
+  ASSERT_TRUE (integer_truep (vec_minus_one));
+  ASSERT_FALSE (initializer_zerop (vec_minus_one));
+
+  tree x = create_tmp_var_raw (type, "x");
+  tree vec_x = build1 (VEC_DUPLICATE_EXPR, vec_type, x);
+  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
+  ASSERT_EQ (uniform_vector_p (vec_one), one);
+  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
+  ASSERT_EQ (uniform_vector_p (vec_x), x);
+}
+
+/* Verify predicate handling of VEC_DUPLICATE_CSTs for floating-point
+   type TYPE.  */
+
+static void
+test_vec_duplicate_predicates_float (tree type)
+{
+  tree vec_type = build_vector_type (type, 4);
+
+  tree zero = build_zero_cst (type);
+  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
+  ASSERT_TRUE (real_zerop (vec_zero));
+  ASSERT_FALSE (real_onep (vec_zero));
+  ASSERT_FALSE (real_minus_onep (vec_zero));
+  ASSERT_TRUE (initializer_zerop (vec_zero));
+
+  tree one = build_one_cst (type);
+  tree vec_one = build_vec_duplicate_cst (vec_type, one);
+  ASSERT_FALSE (real_zerop (vec_one));
+  ASSERT_TRUE (real_onep (vec_one));
+  ASSERT_FALSE (real_minus_onep (vec_one));
+  ASSERT_FALSE (initializer_zerop (vec_one));
+
+  tree minus_one = build_minus_one_cst (type);
+  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
+  ASSERT_FALSE (real_zerop (vec_minus_one));
+  ASSERT_FALSE (real_onep (vec_minus_one));
+  ASSERT_TRUE (real_minus_onep (vec_minus_one));
+  ASSERT_FALSE (initializer_zerop (vec_minus_one));
+
+  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
+  ASSERT_EQ (uniform_vector_p (vec_one), one);
+  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
+}
+
+/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */
+
+static void
+test_vec_duplicate_predicates ()
+{
+  test_vec_duplicate_predicates_int (integer_type_node);
+  test_vec_duplicate_predicates_float (float_type_node);
+}
+
 /* Verify identifiers.  */
 
 static void
@@ -13825,6 +13972,7 @@ test_labels ()
 tree_c_tests ()
 {
   test_integer_constants ();
+  test_vec_duplicate_predicates ();
   test_identifiers ();
   test_labels ();
 }
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c	2017-09-14 16:25:43.861637270 +0100
+++ gcc/cfgexpand.c	2017-09-25 12:03:06.687818551 +0100
@@ -5049,6 +5049,8 @@ expand_debug_expr (tree exp)
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
     case VEC_PERM_EXPR:
+    case VEC_DUPLICATE_CST:
+    case VEC_DUPLICATE_EXPR:
       return NULL;
 
     /* Misc codes.
*/ Index: gcc/tree-pretty-print.c =================================================================== --- gcc/tree-pretty-print.c 2017-08-24 08:46:01.758139665 +0100 +++ gcc/tree-pretty-print.c 2017-09-25 12:03:06.728815998 +0100 @@ -1800,6 +1800,12 @@ dump_generic_node (pretty_printer *pp, t } break; + case VEC_DUPLICATE_CST: + pp_string (pp, "{ "); + dump_generic_node (pp, VEC_DUPLICATE_CST_ELT (node), spc, flags, false); + pp_string (pp, ", ... }"); + break; + case FUNCTION_TYPE: case METHOD_TYPE: dump_generic_node (pp, TREE_TYPE (node), spc, flags, false); @@ -3230,6 +3236,15 @@ dump_generic_node (pretty_printer *pp, t pp_string (pp, " > "); break; + case VEC_DUPLICATE_EXPR: + pp_space (pp); + for (str = get_tree_code_name (code); *str; str++) + pp_character (pp, TOUPPER (*str)); + pp_string (pp, " < "); + dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false); + pp_string (pp, " > "); + break; + case VEC_UNPACK_HI_EXPR: pp_string (pp, " VEC_UNPACK_HI_EXPR < "); dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false); Index: gcc/dwarf2out.c =================================================================== --- gcc/dwarf2out.c 2017-09-21 11:53:16.380966799 +0100 +++ gcc/dwarf2out.c 2017-09-25 12:03:06.704817493 +0100 @@ -18862,6 +18862,7 @@ rtl_for_decl_init (tree init, tree type) switch (TREE_CODE (init)) { case VECTOR_CST: + case VEC_DUPLICATE_CST: break; case CONSTRUCTOR: if (TREE_CONSTANT (init)) Index: gcc/gimple-expr.h =================================================================== --- gcc/gimple-expr.h 2017-02-23 19:54:20.000000000 +0000 +++ gcc/gimple-expr.h 2017-09-25 12:03:06.708817243 +0100 @@ -134,6 +134,7 @@ is_gimple_constant (const_tree t) case FIXED_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: case STRING_CST: return true; Index: gcc/gimplify.c =================================================================== --- gcc/gimplify.c 2017-08-29 08:47:13.282917702 +0100 +++ gcc/gimplify.c 2017-09-25 
12:03:06.711817057 +0100 @@ -11501,6 +11501,7 @@ gimplify_expr (tree *expr_p, gimple_seq case STRING_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: /* Drop the overflow flag on constants, we do not want that in the GIMPLE IL. */ if (TREE_OVERFLOW_P (*expr_p)) Index: gcc/graphite-isl-ast-to-gimple.c =================================================================== --- gcc/graphite-isl-ast-to-gimple.c 2017-09-22 17:22:08.334305773 +0100 +++ gcc/graphite-isl-ast-to-gimple.c 2017-09-25 12:03:06.712816994 +0100 @@ -245,7 +245,8 @@ enum phi_node_kind return TREE_CODE (op) == INTEGER_CST || TREE_CODE (op) == REAL_CST || TREE_CODE (op) == COMPLEX_CST - || TREE_CODE (op) == VECTOR_CST; + || TREE_CODE (op) == VECTOR_CST + || TREE_CODE (op) == VEC_DUPLICATE_CST; } private: Index: gcc/graphite-scop-detection.c =================================================================== --- gcc/graphite-scop-detection.c 2017-09-22 17:22:08.510305732 +0100 +++ gcc/graphite-scop-detection.c 2017-09-25 12:03:06.712816994 +0100 @@ -1447,6 +1447,7 @@ scan_tree_for_params (sese_info_p s, tre case REAL_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: break; default: Index: gcc/ipa-icf-gimple.c =================================================================== --- gcc/ipa-icf-gimple.c 2017-08-30 16:25:16.913251173 +0100 +++ gcc/ipa-icf-gimple.c 2017-09-25 12:03:06.714816870 +0100 @@ -333,6 +333,7 @@ func_checker::compare_cst_or_decl (tree case INTEGER_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: case STRING_CST: case REAL_CST: { @@ -528,6 +529,7 @@ func_checker::compare_operand (tree t1, case INTEGER_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: case STRING_CST: case REAL_CST: case FUNCTION_DECL: Index: gcc/ipa-icf.c =================================================================== --- gcc/ipa-icf.c 2017-06-07 07:42:16.940073012 +0100 +++ gcc/ipa-icf.c 2017-09-25 12:03:06.715816808 +0100 @@ -1478,6 +1478,7 @@ 
sem_item::add_expr (const_tree exp, inch case STRING_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: inchash::add_expr (exp, hstate); break; case CONSTRUCTOR: @@ -2030,6 +2031,9 @@ sem_variable::equals (tree t1, tree t2) return 1; } + case VEC_DUPLICATE_CST: + return sem_variable::equals (VEC_DUPLICATE_CST_ELT (t1), + VEC_DUPLICATE_CST_ELT (t2)); case ARRAY_REF: case ARRAY_RANGE_REF: { Index: gcc/match.pd =================================================================== --- gcc/match.pd 2017-09-21 11:17:14.827201204 +0100 +++ gcc/match.pd 2017-09-25 12:03:06.716816745 +0100 @@ -944,6 +944,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (match negate_expr_p VECTOR_CST (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type)))) +(match negate_expr_p + VEC_DUPLICATE_CST + (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type)))) /* (-A) * (-B) -> A * B */ (simplify Index: gcc/print-tree.c =================================================================== --- gcc/print-tree.c 2017-08-21 10:42:05.815630531 +0100 +++ gcc/print-tree.c 2017-09-25 12:03:06.719816559 +0100 @@ -783,6 +783,10 @@ print_node (FILE *file, const char *pref } break; + case VEC_DUPLICATE_CST: + print_node (file, "elt", VEC_DUPLICATE_CST_ELT (node), indent + 4); + break; + case COMPLEX_CST: print_node (file, "real", TREE_REALPART (node), indent + 4); print_node (file, "imag", TREE_IMAGPART (node), indent + 4); Index: gcc/tree-chkp.c =================================================================== --- gcc/tree-chkp.c 2017-08-16 08:50:32.376422338 +0100 +++ gcc/tree-chkp.c 2017-09-25 12:03:06.722816372 +0100 @@ -3800,6 +3800,7 @@ chkp_find_bounds_1 (tree ptr, tree ptr_s case INTEGER_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: if (integer_zerop (ptr_src)) bounds = chkp_get_none_bounds (); else Index: gcc/tree-data-ref.c =================================================================== --- gcc/tree-data-ref.c 2017-08-29 20:01:07.143372092 
+0100 +++ gcc/tree-data-ref.c 2017-09-25 12:03:06.724816248 +0100 @@ -1223,6 +1223,7 @@ data_ref_compare_tree (tree t1, tree t2) case STRING_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: { hashval_t h1 = iterative_hash_expr (t1, 0); hashval_t h2 = iterative_hash_expr (t2, 0); Index: gcc/tree-loop-distribution.c =================================================================== --- gcc/tree-loop-distribution.c 2017-08-29 20:01:07.143372092 +0100 +++ gcc/tree-loop-distribution.c 2017-09-25 12:03:06.727816061 +0100 @@ -935,6 +935,9 @@ const_with_all_bytes_same (tree val) && CONSTRUCTOR_NELTS (val) == 0)) return 0; + if (TREE_CODE (val) == VEC_DUPLICATE_CST) + return const_with_all_bytes_same (VEC_DUPLICATE_CST_ELT (val)); + if (real_zerop (val)) { /* Only return 0 for +0.0, not for -0.0, which doesn't have Index: gcc/tree-ssa-loop.c =================================================================== --- gcc/tree-ssa-loop.c 2017-08-10 14:36:07.892477227 +0100 +++ gcc/tree-ssa-loop.c 2017-09-25 12:03:06.728815998 +0100 @@ -616,6 +616,7 @@ for_each_index (tree *addr_p, bool (*cbc case STRING_CST: case RESULT_DECL: case VECTOR_CST: + case VEC_DUPLICATE_CST: case COMPLEX_CST: case INTEGER_CST: case REAL_CST: Index: gcc/tree-ssa-pre.c =================================================================== --- gcc/tree-ssa-pre.c 2017-09-13 18:03:48.390469882 +0100 +++ gcc/tree-ssa-pre.c 2017-09-25 12:03:06.729815936 +0100 @@ -2675,6 +2675,7 @@ create_component_ref_by_pieces_1 (basic_ case INTEGER_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: case REAL_CST: case CONSTRUCTOR: case VAR_DECL: Index: gcc/tree-ssa-sccvn.c =================================================================== --- gcc/tree-ssa-sccvn.c 2017-09-21 11:53:16.339540234 +0100 +++ gcc/tree-ssa-sccvn.c 2017-09-25 12:03:06.731815812 +0100 @@ -858,6 +858,7 @@ copy_reference_ops_from_ref (tree ref, v case INTEGER_CST: case COMPLEX_CST: case VECTOR_CST: + case 
VEC_DUPLICATE_CST:
         case REAL_CST:
         case FIXED_CST:
         case CONSTRUCTOR:
@@ -1050,6 +1051,7 @@ ao_ref_init_from_vn_reference (ao_ref *r
         case INTEGER_CST:
         case COMPLEX_CST:
         case VECTOR_CST:
+        case VEC_DUPLICATE_CST:
         case REAL_CST:
         case CONSTRUCTOR:
         case CONST_DECL:
Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c 2017-09-14 17:04:19.082694343 +0100
+++ gcc/tree-vect-generic.c 2017-09-25 12:03:06.731815812 +0100
@@ -1419,6 +1419,7 @@ lower_vec_perm (gimple_stmt_iterator *gs
 ssa_uniform_vector_p (tree op)
 {
   if (TREE_CODE (op) == VECTOR_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_CST
       || TREE_CODE (op) == CONSTRUCTOR)
     return uniform_vector_p (op);
   if (TREE_CODE (op) == SSA_NAME)
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c 2017-09-22 17:43:06.658083770 +0100
+++ gcc/varasm.c 2017-09-25 12:03:06.743815065 +0100
@@ -3068,6 +3068,9 @@ const_hash_1 (const tree exp)
     CASE_CONVERT:
       return const_hash_1 (TREE_OPERAND (exp, 0)) * 7 + 2;
+    case VEC_DUPLICATE_CST:
+      return const_hash_1 (VEC_DUPLICATE_CST_ELT (exp)) * 7 + 3;
+
     default:
       /* A language specific constant. Just hash the code.
*/
       return code;
@@ -3158,6 +3161,10 @@ compare_constant (const tree t1, const t
         return 1;
       }
+    case VEC_DUPLICATE_CST:
+      return compare_constant (VEC_DUPLICATE_CST_ELT (t1),
+                               VEC_DUPLICATE_CST_ELT (t2));
+
     case CONSTRUCTOR:
       {
         vec *v1, *v2;
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c 2017-09-14 17:04:19.080694343 +0100
+++ gcc/fold-const.c 2017-09-25 12:03:06.708817243 +0100
@@ -418,6 +418,9 @@ negate_expr_p (tree t)
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return negate_expr_p (VEC_DUPLICATE_CST_ELT (t));
+
     case COMPLEX_EXPR:
       return negate_expr_p (TREE_OPERAND (t, 0))
              && negate_expr_p (TREE_OPERAND (t, 1));
@@ -577,6 +580,14 @@ fold_negate_expr_1 (location_t loc, tree
         return build_vector (type, elts);
       }
+    case VEC_DUPLICATE_CST:
+      {
+        tree sub = fold_negate_expr (loc, VEC_DUPLICATE_CST_ELT (t));
+        if (!sub)
+          return NULL_TREE;
+        return build_vector_from_val (type, sub);
+      }
+
     case COMPLEX_EXPR:
       if (negate_expr_p (t))
         return fold_build2_loc (loc, COMPLEX_EXPR, type,
@@ -1433,6 +1444,16 @@ const_binop (enum tree_code code, tree a
       return build_vector (type, elts);
     }
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == VEC_DUPLICATE_CST)
+    {
+      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1),
+                              VEC_DUPLICATE_CST_ELT (arg2));
+      if (!sub)
+        return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
+
   /* Shifts allow a scalar offset for a vector.
*/
   if (TREE_CODE (arg1) == VECTOR_CST
       && TREE_CODE (arg2) == INTEGER_CST)
@@ -1456,6 +1477,15 @@ const_binop (enum tree_code code, tree a
       return build_vector (type, elts);
     }
+
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == INTEGER_CST)
+    {
+      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1), arg2);
+      if (!sub)
+        return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
   return NULL_TREE;
 }
@@ -1649,6 +1679,13 @@ const_unop (enum tree_code code, tree ty
           if (i == count)
             return build_vector (type, elements);
         }
+      else if (TREE_CODE (arg0) == VEC_DUPLICATE_CST)
+        {
+          tree sub = const_unop (BIT_NOT_EXPR, TREE_TYPE (type),
+                                 VEC_DUPLICATE_CST_ELT (arg0));
+          if (sub)
+            return build_vector_from_val (type, sub);
+        }
       break;
     case TRUTH_NOT_EXPR:
@@ -1734,6 +1771,11 @@ const_unop (enum tree_code code, tree ty
         return res;
       }
+    case VEC_DUPLICATE_EXPR:
+      if (CONSTANT_CLASS_P (arg0))
+        return build_vector_from_val (type, arg0);
+      return NULL_TREE;
+
     default:
       break;
     }
@@ -2164,6 +2206,15 @@ fold_convert_const (enum tree_code code,
             }
           return build_vector (type, v);
         }
+      if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+          && (TYPE_VECTOR_SUBPARTS (type)
+              == TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1))))
+        {
+          tree sub = fold_convert_const (code, TREE_TYPE (type),
+                                         VEC_DUPLICATE_CST_ELT (arg1));
+          if (sub)
+            return build_vector_from_val (type, sub);
+        }
     }
   return NULL_TREE;
 }
@@ -2950,6 +3001,10 @@ operand_equal_p (const_tree arg0, const_
             return 1;
           }
+        case VEC_DUPLICATE_CST:
+          return operand_equal_p (VEC_DUPLICATE_CST_ELT (arg0),
+                                  VEC_DUPLICATE_CST_ELT (arg1), flags);
+
         case COMPLEX_CST:
           return (operand_equal_p (TREE_REALPART (arg0), TREE_REALPART (arg1),
                                    flags)
@@ -7504,6 +7559,20 @@ can_native_encode_string_p (const_tree e
 static tree
 fold_view_convert_expr (tree type, tree expr)
 {
+  /* Recurse on duplicated vectors if the target type is also a vector
+     and if the elements line up.
*/
+  tree expr_type = TREE_TYPE (expr);
+  if (TREE_CODE (expr) == VEC_DUPLICATE_CST
+      && VECTOR_TYPE_P (type)
+      && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (expr_type)
+      && TYPE_SIZE (TREE_TYPE (type)) == TYPE_SIZE (TREE_TYPE (expr_type)))
+    {
+      tree sub = fold_view_convert_expr (TREE_TYPE (type),
+                                         VEC_DUPLICATE_CST_ELT (expr));
+      if (sub)
+        return build_vector_from_val (type, sub);
+    }
+
   /* We support up to 512-bit values (for V8DFmode). */
   unsigned char buffer[64];
   int len;
@@ -8903,6 +8972,15 @@ exact_inverse (tree type, tree cst)
         return build_vector (type, elts);
       }
+    case VEC_DUPLICATE_CST:
+      {
+        tree sub = exact_inverse (TREE_TYPE (type),
+                                  VEC_DUPLICATE_CST_ELT (cst));
+        if (!sub)
+          return NULL_TREE;
+        return build_vector_from_val (type, sub);
+      }
+
     default:
       return NULL_TREE;
     }
@@ -12097,6 +12175,9 @@ fold_checksum_tree (const_tree expr, str
           for (i = 0; i < (int) VECTOR_CST_NELTS (expr); ++i)
             fold_checksum_tree (VECTOR_CST_ELT (expr, i), ctx, ht);
           break;
+        case VEC_DUPLICATE_CST:
+          fold_checksum_tree (VEC_DUPLICATE_CST_ELT (expr), ctx, ht);
+          break;
         default:
           break;
         }
@@ -14563,6 +14644,36 @@ test_vector_folding ()
   ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));
 }
+/* Verify folding of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.
*/
+
+static void
+test_vec_duplicate_folding ()
+{
+  tree type = build_vector_type (ssizetype, 4);
+  tree dup5 = build_vec_duplicate_cst (type, ssize_int (5));
+  tree dup3 = build_vec_duplicate_cst (type, ssize_int (3));
+
+  tree neg_dup5 = fold_unary (NEGATE_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (neg_dup5), ssize_int (-5));
+
+  tree not_dup5 = fold_unary (BIT_NOT_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (not_dup5), ssize_int (-6));
+
+  tree dup5_plus_dup3 = fold_binary (PLUS_EXPR, type, dup5, dup3);
+  ASSERT_EQ (uniform_vector_p (dup5_plus_dup3), ssize_int (8));
+
+  tree dup5_lsl_2 = fold_binary (LSHIFT_EXPR, type, dup5, ssize_int (2));
+  ASSERT_EQ (uniform_vector_p (dup5_lsl_2), ssize_int (20));
+
+  tree size_vector = build_vector_type (sizetype, 4);
+  tree size_dup5 = fold_convert (size_vector, dup5);
+  ASSERT_EQ (uniform_vector_p (size_dup5), size_int (5));
+
+  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));
+  tree dup5_cst = build_vector_from_val (type, ssize_int (5));
+  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));
+}
+
 /* Run all of the selftests within this file.
*/
 void
@@ -14570,6 +14681,7 @@ fold_const_c_tests ()
 {
   test_arithmetic_folding ();
   test_vector_folding ();
+  test_vec_duplicate_folding ();
 }
 } // namespace selftest
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def 2017-08-10 14:36:07.448493264 +0100
+++ gcc/optabs.def 2017-09-25 12:03:06.718816621 +0100
@@ -364,3 +364,5 @@ OPTAB_D (atomic_xor_optab, "atomic_xor$I
 OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")
 OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")
+
+OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
Index: gcc/optabs-tree.c
===================================================================
--- gcc/optabs-tree.c 2017-06-22 12:22:57.735313105 +0100
+++ gcc/optabs-tree.c 2017-09-25 12:03:06.716816745 +0100
@@ -210,6 +210,9 @@ optab_for_tree_code (enum tree_code code
       return TYPE_UNSIGNED (type) ?
         vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
+    case VEC_DUPLICATE_EXPR:
+      return vec_duplicate_optab;
+
     default:
       break;
     }
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h 2017-06-30 12:50:37.492697279 +0100
+++ gcc/optabs.h 2017-09-25 12:03:06.719816559 +0100
@@ -181,6 +181,7 @@ extern rtx simplify_expand_binop (machin
                                   enum optab_methods methods);
 extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,
                                 enum optab_methods);
+extern rtx expand_vector_broadcast (machine_mode, rtx);
 /* Generate code for a simple binary or unary operation. "Simple" in
    this case means "can be unambiguously described by a (mode, code)
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c 2017-09-23 10:28:11.672861860 +0100
+++ gcc/optabs.c 2017-09-25 12:03:06.718816621 +0100
@@ -367,7 +367,7 @@ force_expand_binop (machine_mode mode, o
    mode of OP must be the element mode of VMODE. If OP is a constant,
    then the return value will be a constant.
*/
-static rtx
+rtx
 expand_vector_broadcast (machine_mode vmode, rtx op)
 {
   enum insn_code icode;
@@ -385,6 +385,16 @@ expand_vector_broadcast (machine_mode vm
   if (CONSTANT_P (op))
     return gen_rtx_CONST_VECTOR (vmode, vec);
+  icode = optab_handler (vec_duplicate_optab, vmode);
+  if (icode != CODE_FOR_nothing)
+    {
+      struct expand_operand ops[2];
+      create_output_operand (&ops[0], NULL_RTX, vmode);
+      create_input_operand (&ops[1], op, GET_MODE (op));
+      expand_insn (icode, 2, ops);
+      return ops[0].value;
+    }
+
   /* ??? If the target doesn't have a vec_init, then we have no easy way
      of performing this operation. Most of this sort of generic support
      is hidden away in the vector lowering support in gimple. */
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-09-23 10:27:39.925846365 +0100
+++ gcc/expr.c 2017-09-25 12:03:06.705817430 +0100
@@ -6572,7 +6572,8 @@ store_constructor (tree exp, rtx target,
         constructor_elt *ce;
         int i;
         int need_to_clear;
-        int icode = CODE_FOR_nothing;
+        insn_code icode = CODE_FOR_nothing;
+        tree elt;
         tree elttype = TREE_TYPE (type);
         int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));
         machine_mode eltmode = TYPE_MODE (elttype);
@@ -6582,13 +6583,30 @@ store_constructor (tree exp, rtx target,
         unsigned n_elts;
         alias_set_type alias;
         bool vec_vec_init_p = false;
+        machine_mode mode = GET_MODE (target);
         gcc_assert (eltmode != BLKmode);
+        /* Try using vec_duplicate_optab for uniform vectors.
*/
+        if (!TREE_SIDE_EFFECTS (exp)
+            && VECTOR_MODE_P (mode)
+            && eltmode == GET_MODE_INNER (mode)
+            && ((icode = optab_handler (vec_duplicate_optab, mode))
+                != CODE_FOR_nothing)
+            && (elt = uniform_vector_p (exp)))
+          {
+            struct expand_operand ops[2];
+            create_output_operand (&ops[0], target, mode);
+            create_input_operand (&ops[1], expand_normal (elt), eltmode);
+            expand_insn (icode, 2, ops);
+            if (!rtx_equal_p (target, ops[0].value))
+              emit_move_insn (target, ops[0].value);
+            break;
+          }
+
         n_elts = TYPE_VECTOR_SUBPARTS (type);
-        if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
+        if (REG_P (target) && VECTOR_MODE_P (mode))
           {
-            machine_mode mode = GET_MODE (target);
             machine_mode emode = eltmode;
             if (CONSTRUCTOR_NELTS (exp)
@@ -6600,7 +6618,7 @@ store_constructor (tree exp, rtx target,
                             == n_elts);
                 emode = TYPE_MODE (etype);
               }
-            icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
+            icode = convert_optab_handler (vec_init_optab, mode, emode);
             if (icode != CODE_FOR_nothing)
               {
                 unsigned int i, n = n_elts;
@@ -6648,7 +6666,7 @@ store_constructor (tree exp, rtx target,
         if (need_to_clear && size > 0 && !vector)
           {
             if (REG_P (target))
-              emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+              emit_move_insn (target, CONST0_RTX (mode));
             else
               clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
             cleared = 1;
@@ -6656,7 +6674,7 @@ store_constructor (tree exp, rtx target,
         /* Inform later passes that the old value is dead. */
         if (!cleared && !vector && REG_P (target))
-          emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+          emit_move_insn (target, CONST0_RTX (mode));
         if (MEM_P (target))
           alias = MEM_ALIAS_SET (target);
@@ -6707,8 +6725,7 @@ store_constructor (tree exp, rtx target,
         if (vector)
           emit_insn (GEN_FCN (icode) (target,
-                                      gen_rtx_PARALLEL (GET_MODE (target),
-                                                        vector)));
+                                      gen_rtx_PARALLEL (mode, vector)));
         break;
       }
@@ -7683,6 +7700,19 @@ expand_operands (tree exp0, tree exp1, r
 }
+/* Expand constant vector element ELT, which has mode MODE.
This is used
+   for members of VECTOR_CST and VEC_DUPLICATE_CST. */
+
+static rtx
+const_vector_element (scalar_mode mode, const_tree elt)
+{
+  if (TREE_CODE (elt) == REAL_CST)
+    return const_double_from_real_value (TREE_REAL_CST (elt), mode);
+  if (TREE_CODE (elt) == FIXED_CST)
+    return CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), mode);
+  return immed_wide_int_const (elt, mode);
+}
+
 /* Return a MEM that contains constant EXP. DEFER is as for
    output_constant_def and MODIFIER is as for expand_expr. */
@@ -9548,6 +9578,12 @@ #define REDUCE_BIT_FIELD(expr) (reduce_b
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
+    case VEC_DUPLICATE_EXPR:
+      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
+      target = expand_vector_broadcast (mode, op0);
+      gcc_assert (target);
+      return target;
+
     case BIT_INSERT_EXPR:
       {
         unsigned bitpos = tree_to_uhwi (treeop2);
@@ -9981,6 +10017,11 @@ expand_expr_real_1 (tree exp, rtx target
                                   tmode, modifier);
       }
+    case VEC_DUPLICATE_CST:
+      op0 = const_vector_element (GET_MODE_INNER (mode),
+                                  VEC_DUPLICATE_CST_ELT (exp));
+      return gen_const_vec_duplicate (mode, op0);
+
     case CONST_DECL:
       if (modifier == EXPAND_WRITE)
         {
@@ -11742,8 +11783,7 @@ const_vector_from_tree (tree exp)
 {
   rtvec v;
   unsigned i, units;
-  tree elt;
-  machine_mode inner, mode;
+  machine_mode mode;
   mode = TYPE_MODE (TREE_TYPE (exp));
@@ -11754,23 +11794,12 @@ const_vector_from_tree (tree exp)
     return const_vector_mask_from_tree (exp);
   units = VECTOR_CST_NELTS (exp);
-  inner = GET_MODE_INNER (mode);
   v = rtvec_alloc (units);
   for (i = 0; i < units; ++i)
-    {
-      elt = VECTOR_CST_ELT (exp, i);
-
-      if (TREE_CODE (elt) == REAL_CST)
-        RTVEC_ELT (v, i) = const_double_from_real_value (TREE_REAL_CST (elt),
-                                                         inner);
-      else if (TREE_CODE (elt) == FIXED_CST)
-        RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
-                                                         inner);
-      else
-        RTVEC_ELT (v, i) = immed_wide_int_const (elt, inner);
-    }
+    RTVEC_ELT (v, i) = const_vector_element
(GET_MODE_INNER (mode),
+                                             VECTOR_CST_ELT (exp, i));
   return gen_rtx_CONST_VECTOR (mode, v);
 }
Index: gcc/internal-fn.c
===================================================================
--- gcc/internal-fn.c 2017-09-21 11:17:14.803201205 +0100
+++ gcc/internal-fn.c 2017-09-25 12:03:06.713816932 +0100
@@ -1911,12 +1911,12 @@ expand_vector_ubsan_overflow (location_t
           emit_move_insn (cntvar, const0_rtx);
           emit_label (loop_lab);
         }
-      if (TREE_CODE (arg0) != VECTOR_CST)
+      if (!CONSTANT_CLASS_P (arg0))
         {
           rtx arg0r = expand_normal (arg0);
           arg0 = make_tree (TREE_TYPE (arg0), arg0r);
         }
-      if (TREE_CODE (arg1) != VECTOR_CST)
+      if (!CONSTANT_CLASS_P (arg1))
         {
           rtx arg1r = expand_normal (arg1);
           arg1 = make_tree (TREE_TYPE (arg1), arg1r);
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c 2017-09-13 18:03:48.394093241 +0100
+++ gcc/tree-cfg.c 2017-09-25 12:03:06.721816434 +0100
@@ -3803,6 +3803,17 @@ verify_gimple_assign_unary (gassign *stm
     case CONJ_EXPR:
       break;
+    case VEC_DUPLICATE_EXPR:
+      if (TREE_CODE (lhs_type) != VECTOR_TYPE
+          || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))
+        {
+          error ("vec_duplicate should be from a scalar to a like vector");
+          debug_generic_expr (lhs_type);
+          debug_generic_expr (rhs1_type);
+          return true;
+        }
+      return false;
+
     default:
       gcc_unreachable ();
     }
@@ -4473,6 +4484,7 @@ verify_gimple_assign_single (gassign *st
     case FIXED_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
       return res;
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c 2017-09-21 22:35:16.975368768 +0100
+++ gcc/tree-inline.c 2017-09-25 12:03:06.726816123 +0100
@@ -4002,6 +4002,7 @@ estimate_operator_cost (enum tree_code c
     case VEC_PACK_FIX_TRUNC_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
+    case VEC_DUPLICATE_EXPR:
       return 1;