From patchwork Mon Sep 7 05:54:30 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Collison X-Patchwork-Id: 53210 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-lb0-f197.google.com (mail-lb0-f197.google.com [209.85.217.197]) by patches.linaro.org (Postfix) with ESMTPS id 4E0D821354 for ; Mon, 7 Sep 2015 05:54:54 +0000 (UTC) Received: by lbbmp1 with SMTP id mp1sf22244838lbb.2 for ; Sun, 06 Sep 2015 22:54:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:mailing-list:precedence:list-id :list-unsubscribe:list-archive:list-post:list-help:sender :delivered-to:message-id:date:from:user-agent:mime-version:to :subject:content-type:content-transfer-encoding:x-original-sender :x-original-authentication-results; bh=Ul7qXfVyBnSPMRuSH/jOrznvRdYLUWUH4qDnj1rMwzU=; b=ZoSjq4uiMesjvzNRjhYrVf5N2scJUAjGEcIaKIKWCJVXZN2vYdHMzInBlJw5pc5pWU XbCqqtyBQJ9i9R16MjRblMnk3kDzb7RR0iZnNWVPDykYuy/jUVr44U2y6p5d1QExR7Lm p0mOuwNxYqt4b5ngS42PxdXZTbHBJIC/9lQ5o2h6Ul6HbsQRSt2qqQl6ubBINqF91Lr2 3uadj8JCTNTzK8MUW1yOwutwuz3cvi9fboZtLjASqxpyhJzjFcpyyVDACAxvgKt+Z4Fs OUBumIAHXQ7FlsmcoFnIyuiRIu33r8HlDhrPj3uESM4TJi1EybLT+tZDD3NTD2TyjqUB hAWQ== X-Gm-Message-State: ALoCoQkQlgkymBAxLhigbuL5qSgu+G0rIGaBG8Y55wPkzCu52e9JtmXeuDgU2ex+PyVRk9F20Afl X-Received: by 10.180.87.199 with SMTP id ba7mr4519218wib.5.1441605293031; Sun, 06 Sep 2015 22:54:53 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.198.136 with SMTP id jc8ls263079lac.50.gmail; Sun, 06 Sep 2015 22:54:52 -0700 (PDT) X-Received: by 10.112.171.68 with SMTP id as4mr15322391lbc.64.1441605292733; Sun, 06 Sep 2015 22:54:52 -0700 (PDT) Received: from mail-lb0-x233.google.com (mail-lb0-x233.google.com. 
[2a00:1450:4010:c04::233]) by mx.google.com with ESMTPS id jd6si9488316lac.171.2015.09.06.22.54.52 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 06 Sep 2015 22:54:52 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c04::233 as permitted sender) client-ip=2a00:1450:4010:c04::233; Received: by lbpo4 with SMTP id o4so33852512lbp.2 for ; Sun, 06 Sep 2015 22:54:52 -0700 (PDT) X-Received: by 10.112.166.106 with SMTP id zf10mr15354047lbb.36.1441605292608; Sun, 06 Sep 2015 22:54:52 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.59.35 with SMTP id w3csp172333lbq; Sun, 6 Sep 2015 22:54:51 -0700 (PDT) X-Received: by 10.66.147.131 with SMTP id tk3mr42530510pab.104.1441605291271; Sun, 06 Sep 2015 22:54:51 -0700 (PDT) Received: from sourceware.org (server1.sourceware.org. 
[209.132.180.131]) by mx.google.com with ESMTPS id k1si18263388pdg.58.2015.09.06.22.54.50 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 06 Sep 2015 22:54:51 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-return-406784-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Received: (qmail 52017 invoked by alias); 7 Sep 2015 05:54:38 -0000 Mailing-List: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 52002 invoked by uid 89); 7 Sep 2015 05:54:37 -0000 X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL, BAYES_00, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=no version=3.3.2 X-HELO: mail-pa0-f47.google.com Received: from mail-pa0-f47.google.com (HELO mail-pa0-f47.google.com) (209.85.220.47) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Mon, 07 Sep 2015 05:54:35 +0000 Received: by padhy16 with SMTP id hy16so84164560pad.1 for ; Sun, 06 Sep 2015 22:54:33 -0700 (PDT) X-Received: by 10.68.218.136 with SMTP id pg8mr42038822pbc.169.1441605273644; Sun, 06 Sep 2015 22:54:33 -0700 (PDT) Received: from [192.168.1.14] (ip70-176-202-128.ph.ph.cox.net. 
[70.176.202.128]) by smtp.googlemail.com with ESMTPSA id n9sm10467626pdi.88.2015.09.06.22.54.32 for (version=TLSv1/SSLv3 cipher=OTHER); Sun, 06 Sep 2015 22:54:33 -0700 (PDT)
Message-ID: <55ED2696.8020102@linaro.org>
Date: Sun, 06 Sep 2015 22:54:30 -0700
From: Michael Collison
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: GCC Patches
Subject: [Aarch64] Use vector wide add for mixed-mode adds
X-Original-Sender: michael.collison@linaro.org
X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c04::233 as permitted sender) smtp.mailfrom=patch+caf_=patchwork-forward=linaro.org@linaro.org; dkim=pass header.i=@gcc.gnu.org
X-Google-Group-Id: 836684582541

This patch addresses code that was not being vectorized because widening
patterns were missing from the aarch64 backend. Code such as:

int
t6(int len, void * dummy, short * __restrict x)
{
  len = len & ~31;
  int result = 0;
  __asm volatile ("");
  for (int i = 0; i < len; i++)
    result += x[i];
  return result;
}

Validated on aarch64-none-elf, aarch64_be-none-elf, and aarch64-none-linux-gnu.

Note that there are three non-execution tree dump vectorization regressions,
where code that was previously vectorized no longer is.
They are:

Passed now fails  [PASS => FAIL]:
  gcc.dg/vect/slp-multitypes-4.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorized 1 loops" 1
  gcc.dg/vect/slp-multitypes-4.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-multitypes-4.c scan-tree-dump-times vect "vectorized 1 loops" 1
  gcc.dg/vect/slp-multitypes-4.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-multitypes-5.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorized 1 loops" 1
  gcc.dg/vect/slp-multitypes-5.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorized 1 loops" 1
  gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-reduc-3.c -flto -ffat-lto-objects  scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
  gcc.dg/vect/vect-125.c -flto -ffat-lto-objects  scan-tree-dump vect "vectorized 1 loops"
  gcc.dg/vect/vect-125.c scan-tree-dump vect "vectorized 1 loops"

I would like to treat these as separate bugs and resolve them separately.

------------------------------------------------------------------------

2015-09-04  Michael Collison

	* config/aarch64/aarch64-simd.md (widen_ssum, widen_usum,
	aarch64_w_internal): New patterns.
	* config/aarch64/iterators.md (Vhalf, VDBLW): New mode attributes.
	* gcc.target/aarch64/saddw-1.c: New test.
	* gcc.target/aarch64/saddw-2.c: New test.
	* gcc.target/aarch64/uaddw-1.c: New test.
	* gcc.target/aarch64/uaddw-2.c: New test.
	* gcc.target/aarch64/uaddw-3.c: New test.

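For readers unfamiliar with the instructions involved, the following is a
scalar C sketch of what one step of the widen_ssum expansion appears to compute
for a full-width (VQW) input, assuming eight 16-bit inputs and four 32-bit
accumulator lanes: a widening add of the low half (as saddw does) followed by a
widening add of the high half (as saddw2 does). It is illustrative only and not
part of the patch; the model_* names are hypothetical, and treating the "low
half" as the first four array elements is a simplification that ignores
endianness.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed shape for this sketch: 8 x int16 input, 4 x int32 accumulator.  */
enum { LANES = 4 };

/* Models saddw: out[i] = acc[i] + sign_extend (low half of x)[i].  */
static void
model_saddw (int32_t out[LANES], const int32_t acc[LANES],
             const int16_t x[2 * LANES])
{
  for (int i = 0; i < LANES; i++)
    out[i] = acc[i] + (int32_t) x[i];
}

/* Models saddw2: out[i] = acc[i] + sign_extend (high half of x)[i].  */
static void
model_saddw2 (int32_t out[LANES], const int32_t acc[LANES],
              const int16_t x[2 * LANES])
{
  for (int i = 0; i < LANES; i++)
    out[i] = acc[i] + (int32_t) x[LANES + i];
}

/* One widening-sum step: accumulate the low half into a temporary,
   then the high half into the final accumulator.  */
static void
model_widen_ssum (int32_t acc[LANES], const int16_t x[2 * LANES])
{
  int32_t temp[LANES];
  model_saddw (temp, acc, x);
  model_saddw2 (acc, temp, x);
}

/* Sum LEN int16 values (LEN assumed to be a multiple of 8) the way a
   vectorized reduction loop would, then add the lanes together.  */
static int32_t
model_sum (const int16_t *x, size_t len)
{
  int32_t acc[LANES] = { 0, 0, 0, 0 };
  for (size_t i = 0; i < len; i += 2 * LANES)
    model_widen_ssum (acc, x + i);
  return acc[0] + acc[1] + acc[2] + acc[3];
}
```

Under these assumptions, model_sum gives the same result as the scalar loop in
t6 above whenever len is a multiple of eight.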
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9777418..d6c5d61 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2636,6 +2636,60 @@
 
 ;; w.
 
+(define_expand "widen_ssum3"
+  [(set (match_operand: 0 "register_operand" "")
+        (plus: (sign_extend: (match_operand:VQW 1 "register_operand" ""))
+               (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (mode, false);
+    rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
+
+    emit_insn (gen_aarch64_saddw_internal (temp, operands[2],
+                                           operands[1], p));
+    emit_insn (gen_aarch64_saddw2 (operands[0], temp, operands[1]));
+    DONE;
+  }
+)
+
+(define_expand "widen_ssum3"
+  [(set (match_operand: 0 "register_operand" "")
+        (plus: (sign_extend:
+                 (match_operand:VD_BHSI 1 "register_operand" ""))
+               (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+{
+  emit_insn (gen_aarch64_saddw (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "widen_usum3"
+  [(set (match_operand: 0 "register_operand" "=&w")
+        (plus: (zero_extend: (match_operand:VQW 1 "register_operand" ""))
+               (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+  {
+    rtx p = aarch64_simd_vect_par_cnst_half (mode, false);
+    rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
+
+    emit_insn (gen_aarch64_uaddw_internal (temp, operands[2],
+                                           operands[1], p));
+    emit_insn (gen_aarch64_uaddw2 (operands[0], temp, operands[1]));
+    DONE;
+  }
+)
+
+(define_expand "widen_usum3"
+  [(set (match_operand: 0 "register_operand" "")
+        (plus: (zero_extend:
+                 (match_operand:VD_BHSI 1 "register_operand" ""))
+               (match_operand: 2 "register_operand" "")))]
+  "TARGET_SIMD"
+{
+  emit_insn (gen_aarch64_uaddw (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
 (define_insn "aarch64_w"
   [(set (match_operand: 0 "register_operand" "=w")
         (ADDSUB: (match_operand: 1 "register_operand" "w")
@@ -2646,6 +2700,18 @@
   [(set_attr "type" "neon__widen")]
 )
 
+(define_insn "aarch64_w_internal"
+  [(set (match_operand: 0 "register_operand" "=w")
+        (ADDSUB: (match_operand: 1 "register_operand" "w")
+                 (ANY_EXTEND:
+                   (vec_select:
+                    (match_operand:VQW 2 "register_operand" "w")
+                    (match_operand:VQW 3 "vect_par_cnst_lo_half" "")))))]
+  "TARGET_SIMD"
+  "w\\t%0., %1., %2."
+  [(set_attr "type" "neon__widen")]
+)
+
 (define_insn "aarch64_w2_internal"
   [(set (match_operand: 0 "register_operand" "=w")
         (ADDSUB: (match_operand: 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index b8a45d1..cd2914e 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -427,6 +427,13 @@
                          (V2DI "DI") (V2SF "SF")
                          (V4SF "V2SF") (V2DF "DF")])
 
+;; Half modes of all vector modes, in lower-case.
+(define_mode_attr Vhalf [(V8QI "v4qi") (V16QI "v8qi")
+                         (V4HI "v2hi") (V8HI "v4hi")
+                         (V2SI "si") (V4SI "v2si")
+                         (V2DI "di") (V2SF "sf")
+                         (V4SF "v2sf") (V2DF "df")])
+
 ;; Double modes of vector modes.
 (define_mode_attr VDBL [(V8QI "V16QI") (V4HI "V8HI")
                         (V2SI "V4SI") (V2SF "V4SF")
@@ -439,6 +446,11 @@
                         (SI "v2si") (DI "v2di")
                         (DF "v2df")])
 
+;; Modes with double-width elements.
+(define_mode_attr VDBLW [(V8QI "V4HI") (V16QI "V8HI")
+                         (V4HI "V2SI") (V8HI "V4SI")
+                         (V2SI "DI") (V4SI "V2DI")])
+
 ;; Narrowed modes for VDN.
 (define_mode_attr VNARROWD [(V4HI "V8QI") (V2SI "V4HI")
                             (DI "V2SI")])
diff --git a/gcc/testsuite/gcc.target/aarch64/saddw-1.c b/gcc/testsuite/gcc.target/aarch64/saddw-1.c
new file mode 100644
index 0000000..9db5d00
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/saddw-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+
+int
+t6(int len, void * dummy, short * __restrict x)
+{
+  len = len & ~31;
+  int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "saddw" } } */
+/* { dg-final { scan-assembler "saddw2" } } */
+
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/saddw-2.c b/gcc/testsuite/gcc.target/aarch64/saddw-2.c
new file mode 100644
index 0000000..6f8c8fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/saddw-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int
+t6(int len, void * dummy, int * __restrict x)
+{
+  len = len & ~31;
+  long long result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "saddw" } } */
+/* { dg-final { scan-assembler "saddw2" } } */
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-1.c b/gcc/testsuite/gcc.target/aarch64/uaddw-1.c
new file mode 100644
index 0000000..e34574f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/uaddw-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+
+int
+t6(int len, void * dummy, unsigned short * __restrict x)
+{
+  len = len & ~31;
+  unsigned int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "uaddw" } } */
+/* { dg-final { scan-assembler "uaddw2" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-2.c b/gcc/testsuite/gcc.target/aarch64/uaddw-2.c
new file mode 100644
index 0000000..fd3b578
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/uaddw-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int
+t6(int len, void * dummy, unsigned short * __restrict x)
+{
+  len = len & ~31;
+  unsigned int result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "uaddw" } } */
+/* { dg-final { scan-assembler "uaddw2" } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-3.c b/gcc/testsuite/gcc.target/aarch64/uaddw-3.c
new file mode 100644
index 0000000..04bc7c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/uaddw-3.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+
+int
+t6(int len, void * dummy, char * __restrict x)
+{
+  len = len & ~31;
+  unsigned short result = 0;
+  __asm volatile ("");
+  for (int i = 0; i < len; i++)
+    result += x[i];
+  return result;
+}
+
+/* { dg-final { scan-assembler "uaddw" } } */
+/* { dg-final { scan-assembler "uaddw2" } } */
+
+
+