From patchwork Mon Nov  9 06:51:47 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Collison <michael.collison@linaro.org>
X-Patchwork-Id: 56179
Delivered-To: patch@linaro.org
Received: by 10.112.155.196 with SMTP id vy4csp18326lbb;
 Sun, 8 Nov 2015 22:52:17 -0800 (PST)
X-Received: by 10.68.135.199 with SMTP id pu7mr4849958pbb.98.1447051937597; 
 Sun, 08 Nov 2015 22:52:17 -0800 (PST)
Return-Path: <gcc-patches-return-413182-patch=linaro.org@gcc.gnu.org>
Received: from sourceware.org (server1.sourceware.org. [209.132.180.131])
 by mx.google.com with ESMTPS id
 ps9si20418786pac.87.2015.11.08.22.52.17 for <patch@linaro.org>
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Sun, 08 Nov 2015 22:52:17 -0800 (PST)
Received-SPF: pass (google.com: domain of
 gcc-patches-return-413182-patch=linaro.org@gcc.gnu.org
 designates 209.132.180.131 as permitted sender)
 client-ip=209.132.180.131; 
Authentication-Results: mx.google.com; spf=pass (google.com: domain of
 gcc-patches-return-413182-patch=linaro.org@gcc.gnu.org
 designates 209.132.180.131 as permitted sender)
 smtp.mailfrom=gcc-patches-return-413182-patch=linaro.org@gcc.gnu.org;
 dkim=pass header.i=@gcc.gnu.org
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender
 :message-id:date:from:mime-version:to:subject:content-type; q=
 dns; s=default; b=p8kA8zkRKtiO89KwU81h+AmXiRUBrdwBpoPgrb7cKc7jXW
 9oqKOuOkkSPU10pLzkXFrcQuh2QQynXxY4b+EcYQNjhw3FS+UV5HqhgPlxbC4yTI
 62Y8egE3xV7FGHP3F1a1UTgf4xqNyYdt6VcZ4kBl+lJ3iedf7YoLuB2Mc53VE=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender
 :message-id:date:from:mime-version:to:subject:content-type; s=
 default; bh=WFlqvdOCOdpwKV4WmBmUo4aFXz4=; b=DYDaZX63f55hVATT2VjT
 YWxk0qn/Zkz7nA8CRqgqRFf2fWkInIFgQbBsZ0u7TD6RYkOJ6eZ+BBPbetVe+dme
 UzKGRNhztCB5GGHkvobvuVoAIpKJZe1QvLeS6nw/5UxVXYKDiXiE8Z2rIO1SYv9d
 dTpgoyFTRR3ly3RhEb2rj0I=
Received: (qmail 124197 invoked by alias); 9 Nov 2015 06:52:02 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <mailto:gcc-patches-unsubscribe-patch=linaro.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 124170 invoked by uid 89); 9 Nov 2015 06:52:00 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL, BAYES_00,
 RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-pa0-f43.google.com
Received: from mail-pa0-f43.google.com (HELO mail-pa0-f43.google.com)
 (209.85.220.43) by sourceware.org
 (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256
 encrypted) ESMTPS; Mon, 09 Nov 2015 06:51:58 +0000
Received: by padhx2 with SMTP id hx2so181096893pad.1 for
 <gcc-patches@gcc.gnu.org>; Sun, 08 Nov 2015 22:51:56 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net;
 s=20130820;
 h=x-gm-message-state:message-id:date:from:organization:user-agent
 :mime-version:to:subject:content-type;
 bh=fdVf2H9UyV8+E8fY4jk1aPXhzkOJmez3ltZC4sopIN0=;
 b=Ey/+cVJInhN3AfyCXT0vCTR0IKtSKZ/XxcOjdYegVNFc5u726KTmnAs+ypKgWvl92Q
 bmZ1pdwYDQuPf+CcWkmB/mhmzYpOCNdJZJwbGExGnQlX1t2q2EaizlvAHAlQNsCNTInT
 lDHeInvAmx83yBqVYKi/q9dtzUC758sXqGCGLsssDHabLqwEnNRobFFeJ3Es7oJb0WtU
 dFDYzC66dfcYMDBJ8hZMIEcQEnywHVvyThRAxIehp6CC34RYpn/XYNJOjqergVqXiejX
 O7wFrs1VpwzDFgbifkf0LRLEukQyY/Xzq6VsdxHk54G6skpP8P0YVLSH1iUkplHTm/lF
 WEmg==
X-Gm-Message-State: ALoCoQkvcIJRW/6hicbR4k1Zj3mkgjxVdyat6rjDkHUnxFwHXLNKoZIvExhnTmqxw6DlQEtbP29L
X-Received: by 10.66.227.170 with SMTP id sb10mr24423207pac.80.1447051916529; 
 Sun, 08 Nov 2015 22:51:56 -0800 (PST)
Received: from [192.168.1.14] (ip70-176-202-128.ph.ph.cox.net.
 [70.176.202.128]) by smtp.googlemail.com with ESMTPSA id
 ws6sm14170602pbc.33.2015.11.08.22.51.55 (version=TLSv1/SSLv3
 cipher=OTHER); Sun, 08 Nov 2015 22:51:55 -0800 (PST)
Message-ID: <56404283.5070503@linaro.org>
Date: Sun, 08 Nov 2015 23:51:47 -0700
From: Michael Collison <michael.collison@linaro.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: gcc Patches <gcc-patches@gcc.gnu.org>,
 Richard Biener <richard.guenther@gmail.com>,
 James Greenhalgh <james.greenhalgh@arm.com>
Subject: Re: [Aarch64] Use vector wide add for mixed-mode adds

This is a followup patch to my earlier patch here:

https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00408.html

and comments here:

https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01300.html

This patches fixes the failure in slp-reduc-3.c by adding aarch64 support in
check_effective_target_vect_widen_sum_hi_to_si_pattern in 
target-supports.exp.
The remaining failures in slp-multitypes-[45].c and vect-125.c appear to 
be deficiencies in
the vectorizer, as the same failures are seen on PowerPC and ia64. See here:

PowerPC: https://gcc.gnu.org/ml/gcc-testresults/2015-10/msg03293.html
ia64: https://gcc.gnu.org/ml/gcc-testresults/2015-10/msg03176.html

Thanks to James Greenhalgh at Arm for pointing this out. My patch 
disables these tests for targets with
widening adds that support V8HI to V4SI. Tested on aarch64-none-elf, 
aarch64_be-none-elf, and aarch64-none-linus-gnu.

2015-11-06  Michael Collison <Michael.Collison@linaro.org>
     * config/aarch64/aarch64-simd.md (widen_ssum, widen_usum)
(aarch64_<ANY_EXTEND:su><ADDSUB:optab>w<mode>_internal): New patterns
     * config/aarch64/iterators.md (Vhalf, VDBLW): New mode attributes.
     * gcc.target/aarch64/saddw-1.c: New test.
     * gcc.target/aarch64/saddw-2.c: New test.
     * gcc.target/aarch64/uaddw-1.c: New test.
     * gcc.target/aarch64/uaddw-2.c: New test.
     * gcc.target/aarch64/uaddw-3.c: New test.
     * gcc.dg/vect/slp-multitypes-4.c: Disable test for
     targets with widening adds from V8HI=>V4SI.
     * gcc.dg/vect/slp-multitypes-5.c: Ditto.
     * gcc.dg/vect/vect-125.c: Ditto.
     * lib/target-support.exp
     (check_effective_target_vect_widen_sum_hi_to_si_pattern):
     Add aarch64 to list of support targets.

Okay to commit?

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 65a2b6f..acb7cf0 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2750,6 +2750,60 @@
 
 ;; <su><addsub>w<q>.
 
+(define_expand "widen_ssum<mode>3"
+  [(set (match_operand:<VDBLW> 0 "register_operand" "")
+	(plus:<VDBLW> (sign_extend:<VDBLW> (match_operand:VQW 1 "register_operand" ""))
+		      (match_operand:<VDBLW> 2 "register_operand" "")))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, false);
+    rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
+
+    emit_insn (gen_aarch64_saddw<mode>_internal (temp, operands[2],
+						operands[1], p));
+    emit_insn (gen_aarch64_saddw2<mode> (operands[0], temp, operands[1]));
+    DONE;
+  }
+)
+
+(define_expand "widen_ssum<mode>3"
+  [(set (match_operand:<VWIDE> 0 "register_operand" "")
+	(plus:<VWIDE> (sign_extend:<VWIDE>
+		       (match_operand:VD_BHSI 1 "register_operand" ""))
+		      (match_operand:<VWIDE> 2 "register_operand" "")))]
+  "TARGET_SIMD"
+{
+  emit_insn (gen_aarch64_saddw<mode> (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "widen_usum<mode>3"
+  [(set (match_operand:<VDBLW> 0 "register_operand" "")
+	(plus:<VDBLW> (zero_extend:<VDBLW> (match_operand:VQW 1 "register_operand" ""))
+		      (match_operand:<VDBLW> 2 "register_operand" "")))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, false);
+    rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
+
+    emit_insn (gen_aarch64_uaddw<mode>_internal (temp, operands[2],
+						 operands[1], p));
+    emit_insn (gen_aarch64_uaddw2<mode> (operands[0], temp, operands[1]));
+    DONE;
+  }
+)
+
+(define_expand "widen_usum<mode>3"
+  [(set (match_operand:<VWIDE> 0 "register_operand" "")
+	(plus:<VWIDE> (zero_extend:<VWIDE>
+		       (match_operand:VD_BHSI 1 "register_operand" ""))
+		      (match_operand:<VWIDE> 2 "register_operand" "")))]
+  "TARGET_SIMD"
+{
+  emit_insn (gen_aarch64_uaddw<mode> (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
 (define_insn "aarch64_<ANY_EXTEND:su><ADDSUB:optab>w<mode>"
   [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
         (ADDSUB:<VWIDE> (match_operand:<VWIDE> 1 "register_operand" "w")
@@ -2760,6 +2814,18 @@
   [(set_attr "type" "neon_<ADDSUB:optab>_widen")]
 )
 
+(define_insn "aarch64_<ANY_EXTEND:su><ADDSUB:optab>w<mode>_internal"
+  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
+        (ADDSUB:<VWIDE> (match_operand:<VWIDE> 1 "register_operand" "w")
+			(ANY_EXTEND:<VWIDE>
+			  (vec_select:<VHALF>
+			   (match_operand:VQW 2 "register_operand" "w")
+			   (match_operand:VQW 3 "vect_par_cnst_lo_half" "")))))]
+  "TARGET_SIMD"
+  "<ANY_EXTEND:su><ADDSUB:optab>w\\t%0.<Vwtype>, %1.<Vwtype>, %2.<Vhalftype>"
+  [(set_attr "type" "neon_<ADDSUB:optab>_widen")]
+)
+
 (define_insn "aarch64_<ANY_EXTEND:su><ADDSUB:optab>w2<mode>_internal"
   [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
         (ADDSUB:<VWIDE> (match_operand:<VWIDE> 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 964f8f1..f851dca 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -455,6 +455,13 @@
 			 (V4SF "V2SF")  (V4HF "V2HF")
 			 (V8HF "V4HF")  (V2DF  "DF")])
 
+;; Half modes of all vector modes, in lower-case.
+(define_mode_attr Vhalf [(V8QI "v4qi")  (V16QI "v8qi")
+			 (V4HI "v2hi")  (V8HI  "v4hi")
+			 (V2SI "si")    (V4SI  "v2si")
+			 (V2DI "di")    (V2SF  "sf")
+			 (V4SF "v2sf")  (V2DF  "df")])
+
 ;; Double modes of vector modes.
 (define_mode_attr VDBL [(V8QI "V16QI") (V4HI "V8HI")
 			(V4HF "V8HF")
@@ -472,6 +479,11 @@
 			(SI   "v2si")  (DI   "v2di")
 			(DF   "v2df")])
 
+;; Modes with double-width elements.
+(define_mode_attr VDBLW [(V8QI "V4HI") (V16QI "V8HI")
+                  (V4HI "V2SI") (V8HI "V4SI")
+                  (V2SI "DI")   (V4SI "V2DI")])
+
 ;; Narrowed modes for VDN.
 (define_mode_attr VNARROWD [(V4HI "V8QI") (V2SI "V4HI")
 			    (DI   "V2SI")])
diff --git a/gcc/testsuite/gcc.dg/vect/slp-multitypes-4.c b/gcc/testsuite/gcc.dg/vect/slp-multitypes-4.c
index faf17d6..fa3b9e2 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-multitypes-4.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-multitypes-4.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_unpack } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -51,6 +52,6 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target vect_unpack } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect"  { target vect_unpack } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_widen_sum_hi_to_si_pattern } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail vect_widen_sum_hi_to_si_pattern } } } */
   
diff --git a/gcc/testsuite/gcc.dg/vect/slp-multitypes-5.c b/gcc/testsuite/gcc.dg/vect/slp-multitypes-5.c
index fb4f720..42a3b4b 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-multitypes-5.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-multitypes-5.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_pack_trunc } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -51,6 +52,6 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_pack_trunc } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_widen_sum_hi_to_si_pattern  } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail vect_widen_sum_hi_to_si_pattern } } } */
   
diff --git a/gcc/testsuite/gcc.dg/vect/vect-125.c b/gcc/testsuite/gcc.dg/vect/vect-125.c
index 4a3c0dc..a1d1e88 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-125.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-125.c
@@ -16,4 +16,4 @@ void train(short *t, short *w, int n, int err)
     }
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail vect_no_int_min_max } } } */
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { vect_widen_sum_hi_to_si_pattern || vect_no_int_min_max } } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/saddw-1.c b/gcc/testsuite/gcc.target/aarch64/saddw-1.c
new file mode 100644
index 0000000..9db5d00
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/saddw-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+
+int 
+t6(int len, void * dummy, short * __restrict x)
+{
+  len = len & ~31;
+  int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "saddw" } } */
+/* { dg-final { scan-assembler "saddw2" } } */
+
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/saddw-2.c b/gcc/testsuite/gcc.target/aarch64/saddw-2.c
new file mode 100644
index 0000000..6f8c8fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/saddw-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int 
+t6(int len, void * dummy, int * __restrict x)
+{
+  len = len & ~31;
+  long long result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "saddw" } } */
+/* { dg-final { scan-assembler "saddw2" } } */
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-1.c b/gcc/testsuite/gcc.target/aarch64/uaddw-1.c
new file mode 100644
index 0000000..e34574f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/uaddw-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+
+int 
+t6(int len, void * dummy, unsigned short * __restrict x)
+{
+  len = len & ~31;
+  unsigned int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "uaddw" } } */
+/* { dg-final { scan-assembler "uaddw2" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-2.c b/gcc/testsuite/gcc.target/aarch64/uaddw-2.c
new file mode 100644
index 0000000..fd3b578
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/uaddw-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int 
+t6(int len, void * dummy, unsigned short * __restrict x)
+{
+  len = len & ~31;
+  unsigned int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "uaddw" } } */
+/* { dg-final { scan-assembler "uaddw2" } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-3.c b/gcc/testsuite/gcc.target/aarch64/uaddw-3.c
new file mode 100644
index 0000000..04bc7c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/uaddw-3.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+
+int 
+t6(int len, void * dummy, char * __restrict x)
+{
+  len = len & ~31;
+  unsigned short result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "uaddw" } } */
+/* { dg-final { scan-assembler "uaddw2" } } */
+
+
+
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index b543519..46f41a1 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3943,6 +3943,7 @@ proc check_effective_target_vect_widen_sum_hi_to_si_pattern { } {
     } else {
         set et_vect_widen_sum_hi_to_si_pattern_saved 0
         if { [istarget powerpc*-*-*]
+              || [istarget aarch64*-*-*]
              || [istarget ia64-*-*] } {
             set et_vect_widen_sum_hi_to_si_pattern_saved 1
         }
-- 
1.9.1