From patchwork Mon Feb 15 06:31:57 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Collison
X-Patchwork-Id: 61922
Subject: Re: [ARM] Use vector wide add for mixed-mode adds
To: Kyrill Tkachov, gcc Patches, Ramana Radhakrishnan
References: <565BA3CC.3050800@linaro.org> <566995BE.8040206@arm.com>
 <5671FB8A.4000004@linaro.org> <56BA1356.4060506@foss.arm.com>
From: Michael Collison
Message-ID: <56C170DD.1040303@linaro.org>
Date: Sun, 14 Feb 2016 23:31:57 -0700
In-Reply-To: <56BA1356.4060506@foss.arm.com>

Hi Kyrill,

I made the following changes based on your comments:

1. I rebased the patch so that it applies cleanly on trunk
2. Fixed the dg-add-options in my new test cases as requested
3. Fixed the GNU style issues identified by ./contrib/check_GNU_style.sh

The failure you are seeing on slp-reduc-3.c is a known failure. The test
case has an xfail with 'xfail { vect_widen_sum_hi_to_si_pattern' which I
added in my patch. Richard Biener resolved some of these issues with
PR 68333, but 'slp-reduc-3.c' still fails. I will create a new PR.

I retested on the Linaro testing infrastructure with the latest trunk and
the only failure is 'slp-reduc-3.c'.

Okay for GCC 7?

2016-02-12  Michael Collison

        * config/arm/neon.md (widen_sum): New patterns where
        mode is VQI to improve mixed mode vectorization.
        * config/arm/neon.md (vec_sel_widen_ssum_lo3): New
        define_insn to match low half of signed vaddw.
        * config/arm/neon.md (vec_sel_widen_ssum_hi3): New
        define_insn to match high half of signed vaddw.
        * config/arm/neon.md (vec_sel_widen_usum_lo3): New
        define_insn to match low half of unsigned vaddw.
        * config/arm/neon.md (vec_sel_widen_usum_hi3): New
        define_insn to match high half of unsigned vaddw.
        * config/arm/arm.c (arm_simd_vect_par_cnst_half): New function.
        (arm_simd_check_vect_par_cnst_half_p): Likewise.
        * config/arm/arm-protos.h (arm_simd_vect_par_cnst_half): Prototype
        for new function.
        (arm_simd_check_vect_par_cnst_half_p): Likewise.
        * config/arm/predicates.md (vect_par_constant_high): Support
        big endian and simplify by calling
        arm_simd_check_vect_par_cnst_half_p.
        (vect_par_constant_low): Likewise.
        * testsuite/gcc.target/arm/neon-vaddws16.c: New test.
        * testsuite/gcc.target/arm/neon-vaddws32.c: New test.
        * testsuite/gcc.target/arm/neon-vaddwu16.c: New test.
        * testsuite/gcc.target/arm/neon-vaddwu32.c: New test.
        * testsuite/gcc.target/arm/neon-vaddwu8.c: New test.
        * testsuite/lib/target-supports.exp
        (check_effective_target_vect_widen_sum_hi_to_si_pattern): Indicate
        that arm neon supports vector widen sum of HImode to SImode.

On 02/09/2016 09:27 AM, Kyrill Tkachov wrote:
> Hi Michael,
>
> On 17/12/15 00:02, Michael Collison wrote:
>> Kyrill,
>>
>> I have attached a patch that addresses your comments. The only change
>> I would ask you to reconsider is renaming the function 'bool
>> aarch32_simd_check_vect_par_cnst_half'. This function was copied from
>> the aarch64 port and I thought it was important to match the naming
>> for maintenance purposes. I did rename the function to 'bool
>> arm_simd_check_vect_par_cnst_half_p'. I changed 'aarch32' to 'arm'
>> and added '_p' per your suggestions. Is this okay?
>>
>
> Ok, that's fine with me.
>
>> I implemented all your other change suggestions.
>>
>
> Thanks, sorry it took a long time to get back to this, I was busy with
> regression-fixing patches as we're in bug-fixing mode...
>
>> 2015-12-16  Michael Collison
>>
>>         * config/arm/neon.md (widen_sum): New patterns where
>>         mode is VQI to improve mixed mode vectorization.
>>         * config/arm/neon.md (vec_sel_widen_ssum_lo3): New
>>         define_insn to match low half of signed vaddw.
>>         * config/arm/neon.md (vec_sel_widen_ssum_hi3): New
>>         define_insn to match high half of signed vaddw.
>>         * config/arm/neon.md (vec_sel_widen_usum_lo3): New
>>         define_insn to match low half of unsigned vaddw.
>>         * config/arm/neon.md (vec_sel_widen_usum_hi3): New
>>         define_insn to match high half of unsigned vaddw.
>>         * config/arm/arm.c (arm_simd_vect_par_cnst_half): New function.
>>         (arm_simd_check_vect_par_cnst_half_p): Likewise.
>>         * config/arm/arm-protos.h (arm_simd_vect_par_cnst_half): Prototype
>>         for new function.
>>         (arm_simd_check_vect_par_cnst_half_p): Likewise.
>>         * config/arm/predicates.md (vect_par_constant_high): Support
>>         big endian and simplify by calling
>>         arm_simd_check_vect_par_cnst_half
>>         (vect_par_constant_low): Likewise.
>>         * testsuite/gcc.target/arm/neon-vaddws16.c: New test.
>>         * testsuite/gcc.target/arm/neon-vaddws32.c: New test.
>>         * testsuite/gcc.target/arm/neon-vaddwu16.c: New test.
>>         * testsuite/gcc.target/arm/neon-vaddwu32.c: New test.
>>         * testsuite/gcc.target/arm/neon-vaddwu8.c: New test.
>>         * testsuite/lib/target-supports.exp
>>         (check_effective_target_vect_widen_sum_hi_to_si_pattern): Indicate
>>         that arm neon support vector widen sum of HImode TO SImode.
>>
>
> I've tried this out and I have a few comments.
> The arm.c hunk doesn't apply to current trunk anymore due to context.
> Can you please rebase the patch?
> I've fixed it up manually in my tree so I can build it.
> With this patch I'm seeing two PASS->FAIL on arm-none-eabi:
> FAIL: gcc.dg/vect/slp-reduc-3.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> FAIL: gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> My compiler is configured with --with-float=hard --with-cpu=cortex-a9
> --with-fpu=neon --with-mode=thumb
> Can you please look into these? Maybe it's just the tests that need
> adjustment?
>
> Also, I'm seeing the new tests give an error:
> ERROR: gcc.target/arm/neon-vaddws16.c: Unrecognized option type: arm_neon_ok for " dg-add-options 3 arm_neon_ok "
> UNRESOLVED: gcc.target/arm/neon-vaddws16.c: Unrecognized option type: arm_neon_ok for " dg-add-options 3 arm_neon_ok "
>
> That's because the dg-add-options argument should be arm_neon rather
> than arm_neon_ok.
> Also, since the new tests are compile-only the effective target check
> should be arm_neon_ok rather than arm_neon_hw.
>
> I also see ./contrib/check_GNU_style.sh complaining about some minor
> style issues like trailing whitespace and blocks of whitespace that
> should be replaced with tabs.
>
> In any case, this patch is GCC 7 material at this point, so I think
> with the above issues resolved (and the FAILs investigated) this
> should be in good shape.
>
> Thanks,
> Kyrill

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org

>From f3d167389cce45ecbd62bb4b1da754ba629ce32f Mon Sep 17 00:00:00 2001
From: Michael Collison
Date: Wed, 10 Feb 2016 22:13:26 -0700
Subject: [PATCH] patches for tcwg 833

Fix GNU style issues
GNU formatting changes pt. 2
GNU formatting changes pt. 3
GNU formatting changes pt. 4
Fix inadvertent change
Fix trailing whitespace
Fix another inadvertent change
Fix incorrect application of patch
Fix > 80 character line length issue
Fix trailing whitespace
Fix order of dg-options
---
 gcc/config/arm/arm-protos.h                  |   4 +-
 gcc/config/arm/arm.c                         |  76 ++++++++++++++++
 gcc/config/arm/neon.md                       | 125 ++++++++++++++++++++++++++-
 gcc/config/arm/predicates.md                 |  50 +----------
 gcc/testsuite/gcc.target/arm/neon-vaddws16.c |  19 ++++
 gcc/testsuite/gcc.target/arm/neon-vaddws32.c |  18 ++++
 gcc/testsuite/gcc.target/arm/neon-vaddwu16.c |  18 ++++
 gcc/testsuite/gcc.target/arm/neon-vaddwu32.c |  18 ++++
 gcc/testsuite/gcc.target/arm/neon-vaddwu8.c  |  19 ++++
 gcc/testsuite/lib/target-supports.exp        |   2 +
 10 files changed, 296 insertions(+), 53 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-vaddws16.c
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-vaddws32.c
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-vaddwu16.c
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-vaddwu32.c
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-vaddwu8.c

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 0083673..d8179c4 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -50,7 +50,9 @@ extern tree arm_builtin_decl (unsigned code, bool initialize_p
                               ATTRIBUTE_UNUSED);
 extern void arm_init_builtins (void);
 extern void arm_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update);
-
+extern rtx arm_simd_vect_par_cnst_half (machine_mode mode, bool high);
+extern bool arm_simd_check_vect_par_cnst_half_p (rtx op, machine_mode mode,
+                                                 bool high);
 #ifdef RTX_CODE
 extern bool arm_vector_mode_supported_p (machine_mode);
 extern bool arm_small_register_classes_for_mode_p (machine_mode);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 82becef..7ac34bb 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -30239,4 +30239,80 @@ arm_sched_fusion_priority (rtx_insn *insn, int max_pri,
   return;
 }
 
+
+/* Construct and return a PARALLEL RTX vector with elements numbering the
+   lanes of either the high (HIGH == TRUE) or low (HIGH == FALSE) half of
+   the vector - from the perspective of the architecture.  This does not
+   line up with GCC's perspective on lane numbers, so we end up with
+   different masks depending on our target endian-ness.  The diagram
+   below may help.  We must draw the distinction when building masks
+   which select one half of the vector.  An instruction selecting
+   architectural low-lanes for a big-endian target, must be described using
+   a mask selecting GCC high-lanes.
+
+                 Big-Endian             Little-Endian
+
+GCC             0   1   2   3           3   2   1   0
+              | x | x | x | x |       | x | x | x | x |
+Architecture    3   2   1   0           3   2   1   0
+
+Low Mask:         { 2, 3 }                { 0, 1 }
+High Mask:        { 0, 1 }                { 2, 3 }
+*/
+
+rtx
+arm_simd_vect_par_cnst_half (machine_mode mode, bool high)
+{
+  int nunits = GET_MODE_NUNITS (mode);
+  rtvec v = rtvec_alloc (nunits / 2);
+  int high_base = nunits / 2;
+  int low_base = 0;
+  int base;
+  rtx t1;
+  int i;
+
+  if (BYTES_BIG_ENDIAN)
+    base = high ? low_base : high_base;
+  else
+    base = high ? high_base : low_base;
+
+  for (i = 0; i < nunits / 2; i++)
+    RTVEC_ELT (v, i) = GEN_INT (base + i);
+
+  t1 = gen_rtx_PARALLEL (mode, v);
+  return t1;
+}
+
+/* Check OP for validity as a PARALLEL RTX vector with elements
+   numbering the lanes of either the high (HIGH == TRUE) or low lanes,
+   from the perspective of the architecture.  See the diagram above
+   arm_simd_vect_par_cnst_half for more details.  */
+
+bool
+arm_simd_check_vect_par_cnst_half_p (rtx op, machine_mode mode,
+                                     bool high)
+{
+  rtx ideal = arm_simd_vect_par_cnst_half (mode, high);
+  HOST_WIDE_INT count_op = XVECLEN (op, 0);
+  HOST_WIDE_INT count_ideal = XVECLEN (ideal, 0);
+  int i = 0;
+
+  if (!VECTOR_MODE_P (mode))
+    return false;
+
+  if (count_op != count_ideal)
+    return false;
+
+  for (i = 0; i < count_ideal; i++)
+    {
+      rtx elt_op = XVECEXP (op, 0, i);
+      rtx elt_ideal = XVECEXP (ideal, 0, i);
+
+      if (!CONST_INT_P (elt_op)
+          || INTVAL (elt_ideal) != INTVAL (elt_op))
+        return false;
+    }
+  return true;
+}
+
 #include "gt-arm.h"
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index f495d40..754d394 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1204,16 +1204,133 @@
 
 ;; Widening operations
 
+(define_expand "widen_ssum3"
+  [(set (match_operand: 0 "s_register_operand" "")
+        (plus:
+         (sign_extend:
+          (match_operand:VQI 1 "s_register_operand" ""))
+         (match_operand: 2 "s_register_operand" "")))]
+  "TARGET_NEON"
+  {
+    machine_mode mode = GET_MODE (operands[1]);
+    rtx p1, p2;
+
+    p1 = arm_simd_vect_par_cnst_half (mode, false);
+    p2 = arm_simd_vect_par_cnst_half (mode, true);
+
+    if (operands[0] != operands[2])
+      emit_move_insn (operands[0], operands[2]);
+
+    emit_insn (gen_vec_sel_widen_ssum_lo3 (operands[0],
+                                           operands[1],
+                                           p1,
+                                           operands[0]));
+    emit_insn (gen_vec_sel_widen_ssum_hi3 (operands[0],
+                                           operands[1],
+                                           p2,
+                                           operands[0]));
+    DONE;
+  }
+)
+
+(define_insn "vec_sel_widen_ssum_lo3"
+  [(set (match_operand: 0 "s_register_operand" "=w")
+        (plus:
+         (sign_extend:
+          (vec_select:VW
+           (match_operand:VQI 1 "s_register_operand" "%w")
+           (match_operand:VQI 2 "vect_par_constant_low" "")))
+         (match_operand: 3 "s_register_operand" "0")))]
+  "TARGET_NEON"
+{
+  return BYTES_BIG_ENDIAN ? "vaddw.\t%q0, %q3, %f1" :
+    "vaddw.\t%q0, %q3, %e1";
+}
+  [(set_attr "type" "neon_add_widen")])
+
+(define_insn "vec_sel_widen_ssum_hi3"
+  [(set (match_operand: 0 "s_register_operand" "=w")
+        (plus:
+         (sign_extend:
+          (vec_select:VW (match_operand:VQI 1 "s_register_operand" "%w")
+                         (match_operand:VQI 2 "vect_par_constant_high" "")))
+         (match_operand: 3 "s_register_operand" "0")))]
+  "TARGET_NEON"
+{
+  return BYTES_BIG_ENDIAN ? "vaddw.\t%q0, %q3, %e1" :
+    "vaddw.\t%q0, %q3, %f1";
+}
+  [(set_attr "type" "neon_add_widen")])
+
 (define_insn "widen_ssum3"
   [(set (match_operand: 0 "s_register_operand" "=w")
-        (plus: (sign_extend:
-                (match_operand:VW 1 "s_register_operand" "%w"))
-               (match_operand: 2 "s_register_operand" "w")))]
+        (plus:
+         (sign_extend:
+          (match_operand:VW 1 "s_register_operand" "%w"))
+         (match_operand: 2 "s_register_operand" "w")))]
   "TARGET_NEON"
   "vaddw.\t%q0, %q2, %P1"
   [(set_attr "type" "neon_add_widen")]
 )
 
+(define_expand "widen_usum3"
+  [(set (match_operand: 0 "s_register_operand" "")
+        (plus:
+         (zero_extend:
+          (match_operand:VQI 1 "s_register_operand" ""))
+         (match_operand: 2 "s_register_operand" "")))]
+  "TARGET_NEON"
+  {
+    machine_mode mode = GET_MODE (operands[1]);
+    rtx p1, p2;
+
+    p1 = arm_simd_vect_par_cnst_half (mode, false);
+    p2 = arm_simd_vect_par_cnst_half (mode, true);
+
+    if (operands[0] != operands[2])
+      emit_move_insn (operands[0], operands[2]);
+
+    emit_insn (gen_vec_sel_widen_usum_lo3 (operands[0],
+                                           operands[1],
+                                           p1,
+                                           operands[0]));
+    emit_insn (gen_vec_sel_widen_usum_hi3 (operands[0],
+                                           operands[1],
+                                           p2,
+                                           operands[0]));
+    DONE;
+  }
+)
+
+(define_insn "vec_sel_widen_usum_lo3"
+  [(set (match_operand: 0 "s_register_operand" "=w")
+        (plus:
+         (zero_extend:
+          (vec_select:VW
+           (match_operand:VQI 1 "s_register_operand" "%w")
+           (match_operand:VQI 2 "vect_par_constant_low" "")))
+         (match_operand: 3 "s_register_operand" "0")))]
+  "TARGET_NEON"
+{
+  return BYTES_BIG_ENDIAN ? "vaddw.\t%q0, %q3, %f1" :
+    "vaddw.\t%q0, %q3, %e1";
+}
+  [(set_attr "type" "neon_add_widen")])
+
+(define_insn "vec_sel_widen_usum_hi3"
+  [(set (match_operand: 0 "s_register_operand" "=w")
+        (plus:
+         (zero_extend:
+          (vec_select:VW (match_operand:VQI 1 "s_register_operand" "%w")
+                         (match_operand:VQI 2 "vect_par_constant_high" "")))
+         (match_operand: 3 "s_register_operand" "0")))]
+  "TARGET_NEON"
+{
+  return BYTES_BIG_ENDIAN ? "vaddw.\t%q0, %q3, %e1" :
+    "vaddw.\t%q0, %q3, %f1";
+}
+  [(set_attr "type" "neon_add_widen")])
+
 (define_insn "widen_usum3"
   [(set (match_operand: 0 "s_register_operand" "=w")
         (plus: (zero_extend:
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 41a6ea4..a21f675 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -605,59 +605,13 @@
 (define_special_predicate "vect_par_constant_high"
   (match_code "parallel")
 {
-  HOST_WIDE_INT count = XVECLEN (op, 0);
-  int i;
-  int base = GET_MODE_NUNITS (mode);
-
-  if ((count < 1)
-      || (count != base/2))
-    return false;
-
-  if (!VECTOR_MODE_P (mode))
-    return false;
-
-  for (i = 0; i < count; i++)
-    {
-      rtx elt = XVECEXP (op, 0, i);
-      int val;
-
-      if (!CONST_INT_P (elt))
-        return false;
-
-      val = INTVAL (elt);
-      if (val != (base/2) + i)
-        return false;
-    }
-  return true;
+  return arm_simd_check_vect_par_cnst_half_p (op, mode, true);
 })
 
 (define_special_predicate "vect_par_constant_low"
   (match_code "parallel")
 {
-  HOST_WIDE_INT count = XVECLEN (op, 0);
-  int i;
-  int base = GET_MODE_NUNITS (mode);
-
-  if ((count < 1)
-      || (count != base/2))
-    return false;
-
-  if (!VECTOR_MODE_P (mode))
-    return false;
-
-  for (i = 0; i < count; i++)
-    {
-      rtx elt = XVECEXP (op, 0, i);
-      int val;
-
-      if (!CONST_INT_P (elt))
-        return false;
-
-      val = INTVAL (elt);
-      if (val != i)
-        return false;
-    }
-  return true;
+  return arm_simd_check_vect_par_cnst_half_p (op, mode, false);
 })
 
 (define_predicate "const_double_vcvt_power_of_two_reciprocal"
diff --git a/gcc/testsuite/gcc.target/arm/neon-vaddws16.c b/gcc/testsuite/gcc.target/arm/neon-vaddws16.c
new file mode 100644
index 0000000..8281134
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vaddws16.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_neon } */
+
+
+
+int
+t6 (int len, void * dummy, short * __restrict x)
+{
+  len = len & ~31;
+  int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "vaddw\.s16" } } */
diff --git a/gcc/testsuite/gcc.target/arm/neon-vaddws32.c b/gcc/testsuite/gcc.target/arm/neon-vaddws32.c
new file mode 100644
index 0000000..8c18691
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vaddws32.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_neon } */
+
+
+int
+t6 (int len, void * dummy, int * __restrict x)
+{
+  len = len & ~31;
+  long long result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "vaddw\.s32" } } */
diff --git a/gcc/testsuite/gcc.target/arm/neon-vaddwu16.c b/gcc/testsuite/gcc.target/arm/neon-vaddwu16.c
new file mode 100644
index 0000000..580bb06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vaddwu16.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_neon } */
+
+
+int
+t6 (int len, void * dummy, unsigned short * __restrict x)
+{
+  len = len & ~31;
+  unsigned int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "vaddw\.u16" } } */
diff --git a/gcc/testsuite/gcc.target/arm/neon-vaddwu32.c b/gcc/testsuite/gcc.target/arm/neon-vaddwu32.c
new file mode 100644
index 0000000..21b0633
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vaddwu32.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_neon } */
+
+
+int
+t6 (int len, void * dummy, unsigned int * __restrict x)
+{
+  len = len & ~31;
+  unsigned long long result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "vaddw\.u32" } } */
diff --git a/gcc/testsuite/gcc.target/arm/neon-vaddwu8.c b/gcc/testsuite/gcc.target/arm/neon-vaddwu8.c
new file mode 100644
index 0000000..d350ed5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vaddwu8.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O3" } */
+/* { dg-add-options arm_neon } */
+
+
+
+int
+t6 (int len, void * dummy, char * __restrict x)
+{
+  len = len & ~31;
+  unsigned short result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "vaddw\.u8" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 645981a..01d72a5 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4313,6 +4313,8 @@ proc check_effective_target_vect_widen_sum_hi_to_si_pattern { } {
         set et_vect_widen_sum_hi_to_si_pattern_saved 0
         if { [istarget powerpc*-*-*]
              || [istarget aarch64*-*-*]
+             || ([istarget arm*-*-*] &&
+                 [check_effective_target_arm_neon_ok])
              || [istarget ia64-*-*] } {
             set et_vect_widen_sum_hi_to_si_pattern_saved 1
         }
-- 
1.9.1
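
For anyone reviewing the endianness handling outside of a GCC build, the lane-mask selection in arm_simd_vect_par_cnst_half can be sketched in plain C. This is only an illustration under stated assumptions, not part of the patch: lane_mask_half, lane_mask_half_p, MAX_LANES and the big_endian_p parameter are hypothetical names, plain int arrays stand in for the PARALLEL rtx, and big_endian_p replaces the BYTES_BIG_ENDIAN target macro so that both cases can be tried on any host.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for this sketch: a 128-bit NEON quad register holds at most
   16 byte-sized lanes, so 16 is a safe upper bound on the lane count.  */
#define MAX_LANES 16

/* Hypothetical stand-in for arm_simd_vect_par_cnst_half.  Fill MASK with
   the GCC lane numbers that select the architectural high (HIGH == true)
   or low half of a vector with NUNITS lanes.  */
void
lane_mask_half (int nunits, bool high, bool big_endian_p, int *mask)
{
  assert (nunits > 0 && nunits <= MAX_LANES && nunits % 2 == 0);

  int half = nunits / 2;
  int base;

  /* On big-endian the architectural high half is GCC's low lanes.  */
  if (big_endian_p)
    base = high ? 0 : half;
  else
    base = high ? half : 0;

  for (int i = 0; i < half; i++)
    mask[i] = base + i;
}

/* Hypothetical stand-in for arm_simd_check_vect_par_cnst_half_p.  Return
   true if the COUNT lane numbers in OP are exactly the expected mask.  */
bool
lane_mask_half_p (const int *op, int count, int nunits, bool high,
                  bool big_endian_p)
{
  int ideal[MAX_LANES / 2];

  if (nunits <= 0 || nunits > MAX_LANES || nunits % 2 != 0)
    return false;
  if (count != nunits / 2)
    return false;

  lane_mask_half (nunits, high, big_endian_p, ideal);
  for (int i = 0; i < count; i++)
    if (op[i] != ideal[i])
      return false;
  return true;
}
```

With nunits == 4, the little-endian case yields { 0, 1 } for the low half and { 2, 3 } for the high half, and the big-endian case swaps them, which agrees with the diagram in the arm.c comment. The predicate version mirrors how vect_par_constant_low and vect_par_constant_high now delegate to a single checker instead of duplicating the loop.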