From patchwork Sun Oct 2 08:30:50 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Ira Rosen
X-Patchwork-Id: 4460
Return-Path:
X-Original-To: patchwork@peony.canonical.com
Delivered-To: patchwork@peony.canonical.com
Received: from fiordland.canonical.com (fiordland.canonical.com [91.189.94.145])
 by peony.canonical.com (Postfix) with ESMTP id 01BE123EFA
 for ; Sun, 2 Oct 2011 08:30:55 +0000 (UTC)
Received: from mail-bw0-f52.google.com (mail-bw0-f52.google.com [209.85.214.52])
 by fiordland.canonical.com (Postfix) with ESMTP id D6AFBA185D2
 for ; Sun, 2 Oct 2011 08:30:55 +0000 (UTC)
Received: by bke5 with SMTP id 5so5291145bke.11
 for ; Sun, 02 Oct 2011 01:30:55 -0700 (PDT)
Received: by 10.223.55.136 with SMTP id u8mr18817137fag.46.1317544253924;
 Sun, 02 Oct 2011 01:30:53 -0700 (PDT)
X-Forwarded-To: linaro-patchwork@canonical.com
X-Forwarded-For: patch@linaro.org linaro-patchwork@canonical.com
Delivered-To: patches@linaro.org
Received: by 10.152.3.234 with SMTP id f10cs17978laf;
 Sun, 2 Oct 2011 01:30:52 -0700 (PDT)
Received: by 10.101.88.16 with SMTP id q16mr12065181anl.106.1317544251397;
 Sun, 02 Oct 2011 01:30:51 -0700 (PDT)
Received: from mail-gy0-f178.google.com (mail-gy0-f178.google.com [209.85.160.178])
 by mx.google.com with ESMTPS id l20si1770530anm.184.2011.10.02.01.30.50
 (version=TLSv1/SSLv3 cipher=OTHER);
 Sun, 02 Oct 2011 01:30:51 -0700 (PDT)
Received-SPF: neutral (google.com: 209.85.160.178 is neither permitted nor
 denied by best guess record for domain of ira.rosen@linaro.org)
 client-ip=209.85.160.178;
Authentication-Results: mx.google.com; spf=neutral (google.com: 209.85.160.178
 is neither permitted nor denied by best guess record for domain of
 ira.rosen@linaro.org) smtp.mail=ira.rosen@linaro.org
Received: by gyf1 with SMTP id 1so3151587gyf.37
 for ; Sun, 02 Oct 2011 01:30:50 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.236.133.145 with SMTP id q17mr10034829yhi.58.1317544250487;
 Sun, 02 Oct 2011 01:30:50 -0700 (PDT)
Received: by 10.147.99.14 with HTTP; Sun, 2 Oct 2011 01:30:50 -0700 (PDT)
In-Reply-To:
References:
Date: Sun, 2 Oct 2011 11:30:50 +0300
Message-ID:
Subject: Re: [patch] Support vectorization of widening shifts
From: Ira Rosen
To: Ramana Radhakrishnan
Cc: gcc-patches@gcc.gnu.org, Patch Tracking

On 29 September 2011 17:30, Ramana Radhakrishnan wrote:
> On 19 September 2011 08:54, Ira Rosen wrote:
>
>>
>> Bootstrapped on powerpc64-suse-linux, tested on powerpc64-suse-linux
>> and arm-linux-gnueabi
>> OK for mainline?
>
> Sorry I missed this patch. Is there any reason why we need unspecs in
> this case ? Can't this be represented by subregs and zero/ sign
> extensions in RTL without the UNSPECs ?

Like this:

; because the ordering of vector elements in Q registers is different from what
; the semantics of the instructions require.
?

Thanks,
Ira

>
> cheers
> Ramana
>
>>
>> Thanks,
>> Ira
>>
>> ChangeLog:
>>
>>        * doc/md.texi (vec_widen_ushiftl_hi, vec_widen_ushiftl_lo,
>> vec_widen_sshiftl_hi,
>>        vec_widen_sshiftl_lo): Document.
>>        * tree-pretty-print.c (dump_generic_node): Handle WIDEN_SHIFT_LEFT_EXPR,
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        (op_code_prio): Likewise.
>>        (op_symbol_code): Handle WIDEN_SHIFT_LEFT_EXPR.
>>        * optabs.c (optab_for_tree_code): Handle
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        (init-optabs): Initialize optab codes for vec_widen_u/sshiftl_hi/lo.
>>        * optabs.h (enum optab_index): Add OTI_vec_widen_u/sshiftl_hi/lo.
>>        * genopinit.c (optabs): Initialize the new optabs.
>>        * expr.c (expand_expr_real_2): Handle
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        * gimple-pretty-print.c (dump_binary_rhs): Likewise.
>>        * tree-vectorizer.h (NUM_PATTERNS): Increase to 6.
>>        * tree.def (WIDEN_SHIFT_LEFT_EXPR, VEC_WIDEN_SHIFT_LEFT_HI_EXPR,
>>        VEC_WIDEN_SHIFT_LEFT_LO_EXPR): New.
>>        * cfgexpand.c (expand_debug_expr):  Handle new tree codes.
>>        * tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
>>        vect_recog_widen_shift_pattern.
>>        (vect_handle_widen_mult_by_const): Rename...
>>        (vect_handle_widen_op_by_const): ...to this.  Handle shifts.
>>        Add a new argument, update documentation.
>>        (vect_recog_widen_mult_pattern): Assume that only second
>>        operand can be constant.  Update call to
>>        vect_handle_widen_op_by_const.
>>        (vect_operation_fits_smaller_type): Add the already existing
>>        def stmt to the list of pattern statements.
>>        (vect_recog_widen_shift_pattern): New.
>>        * tree-vect-stmts.c (vectorizable_type_promotion): Handle
>>        widening shifts.
>>        (supportable_widening_operation): Likewise.
>>        * tree-inline.c (estimate_operator_cost): Handle new tree codes.
>>        * tree-vect-generic.c (expand_vector_operations_1): Likewise.
>>        * tree-cfg.c (verify_gimple_assign_binary): Likewise.
>>        * config/arm/neon.md (neon_vec_shiftl_lo_): New.
>>        (vec_widen_shiftl_lo_, neon_vec_shiftl_hi_,
>>        vec_widen_shiftl_hi_, neon_vec_shift_left_):
>>        Likewise.
>>        * tree-vect-slp.c (vect_build_slp_tree): Require same shift operand
>>        for widening shift.
>>
>> testsuite/ChangeLog:
>>
>>       * gcc.dg/vect/vect-widen-shift-s16.c: New.
>>       * gcc.dg/vect/vect-widen-shift-s8.c: New.
>>       * gcc.dg/vect/vect-widen-shift-u16.c: New.
>>       * gcc.dg/vect/vect-widen-shift-u8.c: New.
>>
>

Index: config/arm/neon.md
===================================================================
--- config/arm/neon.md (revision 178942)
+++ config/arm/neon.md (working copy)
@@ -5550,6 +5550,46 @@
   }
 )
 
+(define_insn "neon_vec_shiftl_"
+ [(set (match_operand: 0 "register_operand" "=w")
+       (SE: (match_operand:VW 1 "register_operand" "w")))
+       (match_operand:SI 2 "immediate_operand" "i")]
+  "TARGET_NEON"
+{
+  /* The boundaries are: 0 < imm <= size.  */
+  neon_const_bounds (operands[2], 0, neon_element_bits (mode) + 1);
+  return "vshll. %q0, %P1, %2";
+}
+  [(set_attr "neon_type" "neon_shift_1")]
+)
+
+(define_expand "vec_widen_shiftl_lo_"
+  [(match_operand: 0 "register_operand" "")
+   (SE: (match_operand:VU 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON && !BYTES_BIG_ENDIAN"
+ {
+  emit_insn (gen_neon_vec_shiftl_ (operands[0],
+             simplify_gen_subreg (mode, operands[1], mode, 0),
+             operands[2]));
+   DONE;
+ }
+)
+
+(define_expand "vec_widen_shiftl_hi_"
+  [(match_operand: 0 "register_operand" "")
+   (SE: (match_operand:VU 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON && !BYTES_BIG_ENDIAN"
+ {
+  emit_insn (gen_neon_vec_shiftl_ (operands[0],
+             simplify_gen_subreg (mode, operands[1], mode,
+                                  GET_MODE_SIZE (mode)),
+             operands[2]));
+   DONE;
+ }
+)
+
 ;; Vectorize for non-neon-quad case
 (define_insn "neon_unpack_"
  [(set (match_operand: 0 "register_operand" "=w")
@@ -5626,6 +5666,34 @@
   }
 )
 
+(define_expand "vec_widen_shiftl_hi_"
+  [(match_operand: 0 "register_operand" "")
+   (SE: (match_operand:VDI 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON"
+ {
+   rtx tmpreg = gen_reg_rtx (mode);
+   emit_insn (gen_neon_vec_shiftl_ (tmpreg, operands[1], operands[2]));
+   emit_insn (gen_neon_vget_high (operands[0], tmpreg));
+
+   DONE;
+ }
+)
+
+(define_expand "vec_widen_shiftl_lo_"
+  [(match_operand: 0 "register_operand" "")
+   (SE: (match_operand:VDI 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON"
+ {
+   rtx tmpreg = gen_reg_rtx (mode);
+   emit_insn (gen_neon_vec_shiftl_ (tmpreg, operands[1], operands[2]));
+   emit_insn (gen_neon_vget_low (operands[0], tmpreg));
+
+   DONE;
+ }
+)
+
 ; FIXME: These instruction patterns can't be used safely in big-endian mode
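
For orientation, the scalar loop shape that vect_recog_widen_shift_pattern
from the ChangeLog is aimed at can be sketched in C roughly as follows.
This is an illustration only, not taken from the patch or from the new
gcc.dg/vect tests: the function names and the SHIFT constant are
hypothetical, and the lo/hi split assumes little-endian element order
(the Q-register expanders in the patch are guarded by !BYTES_BIG_ENDIAN).

```c
#include <assert.h>
#include <stdint.h>

#define SHIFT 3 /* hypothetical constant shift amount, 0 < SHIFT <= 8 */

/* Scalar widening shift: every 8-bit element is first extended to
   16 bits and only then shifted left, so no bits are lost.  A loop of
   this shape is what a widening-shift pattern would be expected to
   match (assumption, not the patch's test code).  */
static void
widen_shift_u8 (const uint8_t *src, uint16_t *dst, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = (uint16_t) ((uint16_t) src[i] << SHIFT);
}

/* Model of one lo/hi step: a vector of N narrow elements yields two
   vectors of N/2 wide elements.  HIGH == 0 selects the first half of
   the input, HIGH != 0 the second half (little-endian order assumed).  */
static void
widen_shift_half_u8 (const uint8_t *src, uint16_t *dst, int n, int high)
{
  assert (n % 2 == 0);
  const uint8_t *half = src + (high ? n / 2 : 0);
  for (int i = 0; i < n / 2; i++)
    dst[i] = (uint16_t) ((uint16_t) half[i] << SHIFT);
}
```

Calling widen_shift_half_u8 once with high == 0 and once with high != 0
produces the same values as a single widen_shift_u8 call over the whole
input, which is the relationship the _lo and _hi expanders are meant to
preserve.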