From patchwork Fri Oct 16 12:58:35 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrylo Tkachov
X-Patchwork-Id: 55099
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Message-ID: <5620F47B.9010107@arm.com>
Date: Fri, 16 Oct 2015 13:58:35 +0100
From: Kyrill Tkachov
To: GCC Patches
CC: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh
Subject: [PATCH][AArch64] Add support for 64-bit vector-mode ldp/stp
X-Original-Sender: kyrylo.tkachov@arm.com

Hi all,

We already support load/store-pair operations on the D-registers when they
contain an FP value, but the peephole/sched-fusion machinery that does all
the hard work currently ignores 64-bit vector modes.

This patch adds support for fusing loads/stores of 64-bit vector operands
into ldp and stp instructions.

I've seen this trigger a few times in SPEC2006. Not often, but where it did
trigger the code was better, i.e. long sequences of ldr and str instructions
were essentially halved in size.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2015-10-16  Kyrylo Tkachov

    * config/aarch64/aarch64.c (aarch64_mode_valid_for_sched_fusion_p):
    New function.
    (fusion_load_store): Use it.
    * config/aarch64/aarch64-ldpstp.md: Add new peephole2s for
    ldp and stp in VD modes.
    * config/aarch64/aarch64-simd.md (load_pair<mode>, VD): New pattern.
    (store_pair<mode>, VD): Likewise.

2015-10-16  Kyrylo Tkachov

    * gcc.target/aarch64/stp_vec_64_1.c: New test.
    * gcc.target/aarch64/ldp_vec_64_1.c: New test.

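For readers who want the pairing rule in isolation, here is a minimal
standalone C sketch of the check the new patterns rely on: the two accesses
must hit adjacent slots of the mode size off the same base, and the
lower-offset access is placed first. The names mem_access,
accesses_adjacent_p and try_pair are hypothetical and do not appear in the
patch. The real check lives in aarch64_operands_ok_for_ldpstp and the insn
conditions, which also cover things this sketch ignores (offset range,
register classes, volatile accesses).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one D-register load or store: the register
   number and the byte offset from a base address shared by both
   accesses.  Not a GCC data structure.  */
struct mem_access
{
  int reg;
  int64_t offset;
};

/* Return true if A and B touch adjacent SIZE-byte slots, in either
   order.  SIZE plays the role of GET_MODE_SIZE (8 for a VD mode).  */
static bool
accesses_adjacent_p (const struct mem_access *a,
                     const struct mem_access *b, int64_t size)
{
  assert (size > 0);
  return a->offset + size == b->offset || b->offset + size == a->offset;
}

/* If FIRST and SECOND can form a pair, put the lower-offset access in
   FIRST (the same reordering the peephole does with std::swap) and
   return true.  Otherwise leave both untouched and return false.  */
static bool
try_pair (struct mem_access *first, struct mem_access *second,
          int64_t size)
{
  if (!accesses_adjacent_p (first, second, size))
    return false;

  if (first->offset > second->offset)
    {
      struct mem_access tmp = *first;
      *first = *second;
      *second = tmp;
    }
  return true;
}
```

With size 8, accesses at offsets 16 and 8 pair up and are reordered so the
offset-8 access comes first, while accesses at offsets 0 and 24 are left
alone.
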
commit b5f4a5b87a7315fb8a4d88da3e4c4afc52d16052
Author: Kyrylo Tkachov
Date:   Tue Oct 6 12:08:24 2015 +0100

    [AArch64] Add support for 64-bit vector-mode ldp/stp

diff --git a/gcc/config/aarch64/aarch64-ldpstp.md b/gcc/config/aarch64/aarch64-ldpstp.md
index 8d6d882..458829c 100644
--- a/gcc/config/aarch64/aarch64-ldpstp.md
+++ b/gcc/config/aarch64/aarch64-ldpstp.md
@@ -98,6 +98,47 @@ (define_peephole2
     }
 })
 
+(define_peephole2
+  [(set (match_operand:VD 0 "register_operand" "")
+        (match_operand:VD 1 "aarch64_mem_pair_operand" ""))
+   (set (match_operand:VD 2 "register_operand" "")
+        (match_operand:VD 3 "memory_operand" ""))]
+  "aarch64_operands_ok_for_ldpstp (operands, true, <MODE>mode)"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+              (set (match_dup 2) (match_dup 3))])]
+{
+  rtx base, offset_1, offset_2;
+
+  extract_base_offset_in_addr (operands[1], &base, &offset_1);
+  extract_base_offset_in_addr (operands[3], &base, &offset_2);
+  if (INTVAL (offset_1) > INTVAL (offset_2))
+    {
+      std::swap (operands[0], operands[2]);
+      std::swap (operands[1], operands[3]);
+    }
+})
+
+(define_peephole2
+  [(set (match_operand:VD 0 "aarch64_mem_pair_operand" "")
+        (match_operand:VD 1 "register_operand" ""))
+   (set (match_operand:VD 2 "memory_operand" "")
+        (match_operand:VD 3 "register_operand" ""))]
+  "TARGET_SIMD && aarch64_operands_ok_for_ldpstp (operands, false, <MODE>mode)"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+              (set (match_dup 2) (match_dup 3))])]
+{
+  rtx base, offset_1, offset_2;
+
+  extract_base_offset_in_addr (operands[0], &base, &offset_1);
+  extract_base_offset_in_addr (operands[2], &base, &offset_2);
+  if (INTVAL (offset_1) > INTVAL (offset_2))
+    {
+      std::swap (operands[0], operands[2]);
+      std::swap (operands[1], operands[3]);
+    }
+})
+
+
 ;; Handle sign/zero extended consecutive load/store.
 
 (define_peephole2
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 6a2ab61..bf051c3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -153,6 +153,34 @@ (define_insn "*aarch64_simd_mov<mode>"
    (set_attr "length" "4,4,4,8,8,8,4")]
 )
 
+(define_insn "load_pair<mode>"
+  [(set (match_operand:VD 0 "register_operand" "=w")
+        (match_operand:VD 1 "aarch64_mem_pair_operand" "Ump"))
+   (set (match_operand:VD 2 "register_operand" "=w")
+        (match_operand:VD 3 "memory_operand" "m"))]
+  "TARGET_SIMD
+   && rtx_equal_p (XEXP (operands[3], 0),
+                   plus_constant (Pmode,
+                                  XEXP (operands[1], 0),
+                                  GET_MODE_SIZE (<MODE>mode)))"
+  "ldp\\t%d0, %d2, %1"
+  [(set_attr "type" "neon_ldp")]
+)
+
+(define_insn "store_pair<mode>"
+  [(set (match_operand:VD 0 "aarch64_mem_pair_operand" "=Ump")
+        (match_operand:VD 1 "register_operand" "w"))
+   (set (match_operand:VD 2 "memory_operand" "=m")
+        (match_operand:VD 3 "register_operand" "w"))]
+  "TARGET_SIMD
+   && rtx_equal_p (XEXP (operands[2], 0),
+                   plus_constant (Pmode,
+                                  XEXP (operands[0], 0),
+                                  GET_MODE_SIZE (<MODE>mode)))"
+  "stp\\t%d1, %d3, %0"
+  [(set_attr "type" "neon_stp")]
+)
+
 (define_split
   [(set (match_operand:VQ 0 "register_operand" "")
         (match_operand:VQ 1 "register_operand" ""))]
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d7d05b8..7682417 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3491,6 +3491,18 @@ offset_12bit_unsigned_scaled_p (machine_mode mode, HOST_WIDE_INT offset)
           && offset % GET_MODE_SIZE (mode) == 0);
 }
 
+/* Return true if MODE is one of the modes for which we
+   support LDP/STP operations.  */
+
+static bool
+aarch64_mode_valid_for_sched_fusion_p (machine_mode mode)
+{
+  return mode == SImode || mode == DImode
+         || mode == SFmode || mode == DFmode
+         || (aarch64_vector_mode_supported_p (mode)
+             && GET_MODE_SIZE (mode) == 8);
+}
+
 /* Return true if X is a valid address for machine mode MODE.  If it is,
    fill in INFO appropriately.  STRICT_P is true if REG_OK_STRICT is in
    effect.  OUTER_CODE is PARALLEL for a load/store pair.  */
@@ -12863,8 +12875,9 @@ fusion_load_store (rtx_insn *insn, rtx *base, rtx *offset)
   src = SET_SRC (x);
   dest = SET_DEST (x);
 
-  if (GET_MODE (dest) != SImode && GET_MODE (dest) != DImode
-      && GET_MODE (dest) != SFmode && GET_MODE (dest) != DFmode)
+  machine_mode dest_mode = GET_MODE (dest);
+
+  if (!aarch64_mode_valid_for_sched_fusion_p (dest_mode))
     return SCHED_FUSION_NONE;
 
   if (GET_CODE (src) == SIGN_EXTEND)
diff --git a/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c b/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c
new file mode 100644
index 0000000..62213f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ldp_vec_64_1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+typedef int int32x2_t __attribute__ ((__vector_size__ ((8))));
+
+void
+foo (int32x2_t *foo, int32x2_t *bar)
+{
+  int i = 0;
+  int32x2_t val = { 3, 2 };
+
+  for (i = 0; i < 1024; i+=2)
+    foo[i] = bar[i] + bar[i + 1];
+}
+
+/* { dg-final { scan-assembler "ldp\td\[0-9\]+, d\[0-9\]" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/stp_vec_64_1.c b/gcc/testsuite/gcc.target/aarch64/stp_vec_64_1.c
new file mode 100644
index 0000000..11e757a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stp_vec_64_1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+
+typedef int int32x2_t __attribute__ ((__vector_size__ ((8))));
+
+void
+bar (int32x2_t *foo)
+{
+  int i = 0;
+  int32x2_t val = { 3, 2 };
+
+  for (i = 0; i < 256; i+=2)
+    {
+      foo[i] = val;
+      foo[i+1] = val;
+    }
+}
+
+/* { dg-final { scan-assembler "stp\td\[0-9\]+, d\[0-9\]" } } */