From patchwork Mon Nov 9 06:51:47 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Collison
X-Patchwork-Id: 56179
Message-ID: <56404283.5070503@linaro.org>
Date: Sun, 08 Nov 2015 23:51:47 -0700
From: Michael Collison
To: gcc Patches, Richard Biener, James Greenhalgh
Subject: Re: [Aarch64] Use vector wide add for mixed-mode adds

This is a follow-up to my earlier patch:

https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00408.html

and to the comments on it:

https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01300.html

This patch fixes the failure in slp-reduc-3.c by adding aarch64 support to
check_effective_target_vect_widen_sum_hi_to_si_pattern in target-supports.exp.
The remaining failures in slp-multitypes-[45].c and vect-125.c appear to be
deficiencies in the vectorizer, as the same failures are seen on PowerPC and
ia64:

PowerPC: https://gcc.gnu.org/ml/gcc-testresults/2015-10/msg03293.html
ia64: https://gcc.gnu.org/ml/gcc-testresults/2015-10/msg03176.html

Thanks to James Greenhalgh at Arm for pointing this out. My patch XFAILs the
vectorizer dump scans in these tests on targets with widening adds that
support V8HI to V4SI (vect_widen_sum_hi_to_si_pattern).

Tested on aarch64-none-elf, aarch64_be-none-elf, and aarch64-none-linux-gnu.

2015-11-06  Michael Collison

	* config/aarch64/aarch64-simd.md (widen_ssum, widen_usum)
	(aarch64_w_internal): New patterns.
	* config/aarch64/iterators.md (Vhalf, VDBLW): New mode attributes.
	* gcc.target/aarch64/saddw-1.c: New test.
	* gcc.target/aarch64/saddw-2.c: New test.
	* gcc.target/aarch64/uaddw-1.c: New test.
	* gcc.target/aarch64/uaddw-2.c: New test.
	* gcc.target/aarch64/uaddw-3.c: New test.
	* gcc.dg/vect/slp-multitypes-4.c: Disable test for targets with
	widening adds from V8HI=>V4SI.
	* gcc.dg/vect/slp-multitypes-5.c: Ditto.
	* gcc.dg/vect/vect-125.c: Ditto.
	* lib/target-supports.exp
	(check_effective_target_vect_widen_sum_hi_to_si_pattern): Add aarch64
	to list of supported targets.

Okay to commit?

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 65a2b6f..acb7cf0 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2750,6 +2750,60 @@
 
 ;; w.
 
+(define_expand "widen_ssum3"
+  [(set (match_operand: 0 "register_operand" "")
+        (plus: (sign_extend: (match_operand:VQW 1 "register_operand" ""))
+               (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (mode, false);
+    rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
+
+    emit_insn (gen_aarch64_saddw_internal (temp, operands[2],
+                                           operands[1], p));
+    emit_insn (gen_aarch64_saddw2 (operands[0], temp, operands[1]));
+    DONE;
+  }
+)
+
+(define_expand "widen_ssum3"
+  [(set (match_operand: 0 "register_operand" "")
+        (plus: (sign_extend:
+                 (match_operand:VD_BHSI 1 "register_operand" ""))
+               (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+{
+  emit_insn (gen_aarch64_saddw (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "widen_usum3"
+  [(set (match_operand: 0 "register_operand" "")
+        (plus: (zero_extend: (match_operand:VQW 1 "register_operand" ""))
+               (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (mode, false);
+    rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
+
+    emit_insn (gen_aarch64_uaddw_internal (temp, operands[2],
+                                           operands[1], p));
+    emit_insn (gen_aarch64_uaddw2 (operands[0], temp, operands[1]));
+    DONE;
+  }
+)
+
+(define_expand "widen_usum3"
+  [(set (match_operand: 0 "register_operand" "")
+        (plus: (zero_extend:
+                 (match_operand:VD_BHSI 1 "register_operand" ""))
+               (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+{
+  emit_insn (gen_aarch64_uaddw (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
 (define_insn "aarch64_w"
   [(set (match_operand: 0 "register_operand" "=w")
         (ADDSUB: (match_operand: 1 "register_operand" "w")
@@ -2760,6 +2814,18 @@
   [(set_attr "type" "neon__widen")]
 )
 
+(define_insn "aarch64_w_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+        (ADDSUB: (match_operand: 1 "register_operand" "w")
+                 (ANY_EXTEND:
+                   (vec_select:
+                     (match_operand:VQW 2 "register_operand" "w")
+                     (match_operand:VQW 3 "vect_par_cnst_lo_half" "")))))]
+  "TARGET_SIMD"
+  "w\\t%0., %1., %2."
+  [(set_attr "type" "neon__widen")]
+)
+
 (define_insn "aarch64_w2_internal"
   [(set (match_operand: 0 "register_operand" "=w")
         (ADDSUB: (match_operand: 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 964f8f1..f851dca 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -455,6 +455,13 @@
                          (V4SF "V2SF") (V4HF "V2HF")
                          (V8HF "V4HF") (V2DF "DF")])
 
+;; Half modes of all vector modes, in lower-case.
+(define_mode_attr Vhalf [(V8QI "v4qi") (V16QI "v8qi")
+                         (V4HI "v2hi") (V8HI "v4hi")
+                         (V2SI "si") (V4SI "v2si")
+                         (V2DI "di") (V2SF "sf")
+                         (V4SF "v2sf") (V2DF "df")])
+
 ;; Double modes of vector modes.
 (define_mode_attr VDBL [(V8QI "V16QI") (V4HI "V8HI")
                         (V4HF "V8HF")
@@ -472,6 +479,11 @@
                         (SI "v2si") (DI "v2di")
                         (DF "v2df")])
 
+;; Modes with double-width elements.
+(define_mode_attr VDBLW [(V8QI "V4HI") (V16QI "V8HI")
+                         (V4HI "V2SI") (V8HI "V4SI")
+                         (V2SI "DI") (V4SI "V2DI")])
+
 ;; Narrowed modes for VDN.
 (define_mode_attr VNARROWD [(V4HI "V8QI") (V2SI "V4HI")
                             (DI "V2SI")])
diff --git a/gcc/testsuite/gcc.dg/vect/slp-multitypes-4.c b/gcc/testsuite/gcc.dg/vect/slp-multitypes-4.c
index faf17d6..fa3b9e2 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-multitypes-4.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-multitypes-4.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_unpack } */
 
 #include
 #include "tree-vect.h"
@@ -51,6 +52,6 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_unpack } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_unpack } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_widen_sum_hi_to_si_pattern } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail vect_widen_sum_hi_to_si_pattern } } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/slp-multitypes-5.c b/gcc/testsuite/gcc.dg/vect/slp-multitypes-5.c
index fb4f720..42a3b4b 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-multitypes-5.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-multitypes-5.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_pack_trunc } */
 
 #include
 #include "tree-vect.h"
@@ -51,6 +52,6 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_pack_trunc } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_widen_sum_hi_to_si_pattern } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail vect_widen_sum_hi_to_si_pattern } } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-125.c b/gcc/testsuite/gcc.dg/vect/vect-125.c
index 4a3c0dc..a1d1e88 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-125.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-125.c
@@ -16,4 +16,4 @@ void train(short *t, short *w, int n, int err)
   }
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail vect_no_int_min_max } } } */
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { vect_widen_sum_hi_to_si_pattern || vect_no_int_min_max } } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/saddw-1.c b/gcc/testsuite/gcc.target/aarch64/saddw-1.c
new file mode 100644
index 0000000..9db5d00
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/saddw-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+
+int
+t6(int len, void * dummy, short * __restrict x)
+{
+  len = len & ~31;
+  int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "saddw" } } */
+/* { dg-final { scan-assembler "saddw2" } } */
+
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/saddw-2.c b/gcc/testsuite/gcc.target/aarch64/saddw-2.c
new file mode 100644
index 0000000..6f8c8fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/saddw-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int
+t6(int len, void * dummy, int * __restrict x)
+{
+  len = len & ~31;
+  long long result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "saddw" } } */
+/* { dg-final { scan-assembler "saddw2" } } */
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-1.c b/gcc/testsuite/gcc.target/aarch64/uaddw-1.c
new file mode 100644
index 0000000..e34574f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/uaddw-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+
+int
+t6(int len, void * dummy, unsigned short * __restrict x)
+{
+  len = len & ~31;
+  unsigned int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "uaddw" } } */
+/* { dg-final { scan-assembler "uaddw2" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-2.c b/gcc/testsuite/gcc.target/aarch64/uaddw-2.c
new file mode 100644
index 0000000..fd3b578
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/uaddw-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int
+t6(int len, void * dummy, unsigned short * __restrict x)
+{
+  len = len & ~31;
+  unsigned int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "uaddw" } } */
+/* { dg-final { scan-assembler "uaddw2" } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-3.c b/gcc/testsuite/gcc.target/aarch64/uaddw-3.c
new file mode 100644
index 0000000..04bc7c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/uaddw-3.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+
+int
+t6(int len, void * dummy, char * __restrict x)
+{
+  len = len & ~31;
+  unsigned short result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "uaddw" } } */
+/* { dg-final { scan-assembler "uaddw2" } } */
+
+
+
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index b543519..46f41a1 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3943,6 +3943,7 @@ proc check_effective_target_vect_widen_sum_hi_to_si_pattern { } {
     } else {
         set et_vect_widen_sum_hi_to_si_pattern_saved 0
         if { [istarget powerpc*-*-*]
+             || [istarget aarch64*-*-*]
              || [istarget ia64-*-*] } {
             set et_vect_widen_sum_hi_to_si_pattern_saved 1
         }
-- 
1.9.1
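
For reviewers less familiar with the instructions, here is a minimal scalar sketch in C of the arithmetic the VQW form of the signed expander is meant to produce for V8HI inputs: SADDW adds the sign-extended low four halfwords into a V4SI accumulator, and SADDW2 adds the high four. The function names (model_saddw, model_saddw2, model_widen_sum) are hypothetical and exist only for this illustration. The code models the arithmetic only, not the RTL the expander emits, and it assumes the loop length is a multiple of 8.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of SADDW: add the sign-extended low four
   halfwords of SRC to the four-word accumulator ACC.  */
static void
model_saddw (int32_t acc[4], const int16_t src[8])
{
  for (int i = 0; i < 4; i++)
    acc[i] += (int32_t) src[i];
}

/* Hypothetical scalar model of SADDW2: the same operation on the high
   four halfwords of SRC.  */
static void
model_saddw2 (int32_t acc[4], const int16_t src[8])
{
  for (int i = 0; i < 4; i++)
    acc[i] += (int32_t) src[i + 4];
}

/* Sum LEN halfwords with one SADDW/SADDW2 pair per eight-element vector,
   followed by a horizontal reduction of the accumulator.  LEN is assumed
   to be a non-negative multiple of 8.  */
static int32_t
model_widen_sum (const int16_t *x, int len)
{
  int32_t acc[4] = { 0, 0, 0, 0 };

  assert (len >= 0 && len % 8 == 0);
  for (int i = 0; i < len; i += 8)
    {
      model_saddw (acc, x + i);
      model_saddw2 (acc, x + i);
    }
  return acc[0] + acc[1] + acc[2] + acc[3];
}
```

Calling model_widen_sum on eight halfwords that all hold 32767 returns 262136, which does not fit in 16 bits; the widening step is what keeps the partial sums from wrapping.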