From patchwork Thu Sep 10 14:02:50 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Greenhalgh X-Patchwork-Id: 53382 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-lb0-f198.google.com (mail-lb0-f198.google.com [209.85.217.198]) by patches.linaro.org (Postfix) with ESMTPS id 5315E22B19 for ; Thu, 10 Sep 2015 14:03:33 +0000 (UTC) Received: by lbcao8 with SMTP id ao8sf14372299lbc.1 for ; Thu, 10 Sep 2015 07:03:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:mailing-list:precedence:list-id :list-unsubscribe:list-archive:list-post:list-help:sender :delivered-to:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-type:x-original-sender :x-original-authentication-results; bh=H/3Y//ug4GoJrmYkGIzLlDMXMrpp7AK19SChXA28Y0s=; b=Je8k/j1debApO0neSQwK7QlGNx7LYS0CCQDVrZLwxMWTCNchUgvzU4NCJMmFseouXH 5hpFvPPSeETfFtGV8l//fGPaQ2hOIWDp1ya/rVkv7Vpij6QdooAaQclQRHAyLqLkB+O2 eUZEAGf8QtXpDmSxvd7k3ulin3yavH4jv7qDR2WJ1sTLq6v9Sy9InQX+JWBeLFe+NkMl 4QpbKw6ckQkVWmKp2vOqzs/9K3ccQhiJ7WEhrIuNH/hI/uutAdX9c5+fHUEgILa041EM 6TX/2KgtRi7XomLYfwMt6eta78uFHU34UF3t4XbNOHsEAG98Wn6RjMAJHr3dP0ieK7OI Gt8Q== X-Gm-Message-State: ALoCoQnGzvjvuxpF4uG/8lm020uq8b5y44bFsPZaoLF13Ed1r1urZDeI9UEcJKgZNlrajUGYbhTc X-Received: by 10.152.45.101 with SMTP id l5mr9808219lam.7.1441893812265; Thu, 10 Sep 2015 07:03:32 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.37.36 with SMTP id v4ls201826laj.109.gmail; Thu, 10 Sep 2015 07:03:32 -0700 (PDT) X-Received: by 10.152.30.98 with SMTP id r2mr34150442lah.14.1441893812111; Thu, 10 Sep 2015 07:03:32 -0700 (PDT) Received: from mail-lb0-x229.google.com (mail-lb0-x229.google.com. [2a00:1450:4010:c04::229]) by mx.google.com with ESMTPS id lg6si10536184lab.59.2015.09.10.07.03.32 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 10 Sep 2015 07:03:32 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c04::229 as permitted sender) client-ip=2a00:1450:4010:c04::229; Received: by lbpo4 with SMTP id o4so23731067lbp.2 for ; Thu, 10 Sep 2015 07:03:32 -0700 (PDT) X-Received: by 10.112.169.66 with SMTP id ac2mr35459045lbc.32.1441893811114; Thu, 10 Sep 2015 07:03:31 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.59.35 with SMTP id w3csp997523lbq; Thu, 10 Sep 2015 07:03:27 -0700 (PDT) X-Received: by 10.68.220.132 with SMTP id pw4mr82794231pbc.149.1441893807042; Thu, 10 Sep 2015 07:03:27 -0700 (PDT) Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id ey7si19677364pab.142.2015.09.10.07.03.26 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 10 Sep 2015 07:03:27 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-return-407058-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Received: (qmail 61061 invoked by alias); 10 Sep 2015 14:03:13 -0000 Mailing-List: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 61044 invoked by uid 89); 10 Sep 2015 14:03:12 -0000 X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL, BAYES_00, SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (146.101.78.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 10 Sep 2015 14:03:04 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-37-fTBG6UsvR3iLaYXIv-gwLg-1; Thu, 10 Sep 2015 15:02:58 +0100 Received: from e107456-lin.cambridge.arm.com ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Thu, 10 Sep 2015 15:02:58 +0100 From: James Greenhalgh To: gcc-patches@gcc.gnu.org Cc: christophe.lyon@linaro.org, marcus.shawcroft@arm.com, tejas.belagod@arm.com, alan.lawrence@arm.com Subject: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics. Date: Thu, 10 Sep 2015 15:02:50 +0100 Message-Id: <1441893770-27517-1-git-send-email-james.greenhalgh@arm.com> In-Reply-To: References: MIME-Version: 1.0 X-MC-Unique: fTBG6UsvR3iLaYXIv-gwLg-1 X-IsSubscribed: yes X-Original-Sender: james.greenhalgh@arm.com X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c04::229 as permitted sender) smtp.mailfrom=patch+caf_=patchwork-forward=linaro.org@linaro.org; dkim=pass header.i=@gcc.gnu.org X-Google-Group-Id: 836684582541 On Wed, Sep 09, 2015 at 10:28:28AM +0100, Christophe Lyon wrote: > On 9 September 2015 at 10:31, James Greenhalgh wrote: > > > > Hi, > > > > This patch clears up some remaining confusion in the vector lane orderings > > for the two intrinsics mentioned in the title. > > > > Bootstrapped on aarch64-none-linux-gnu and regression tested for > > aarch64_be-none-elf with no issues. > > > > Does this actually fix an existing testcase? Yes, of course, sorry - that was a useless introduction to the patch! First, I've updated the patch with a testcase, which fails for me on aarch64_be-none-elf but not aarch64-*-*. The issue is that the RTL folding routines will happily fold through a vec_concat or a vec_select, which we have given the wrong operands to when in BYTES_BIG_ENDIAN mode. The fix is similar to that which we have elsewhere in aarch64-simd.md, which is to split out the big and little endian forms of the patterns which need vec_concat, and to build a vec_par_cnst_*_half mask for the patterns which need vec_select. This keeps us in the GCC-view of lane ordering. There is test coverage that these patterns do the right thing for the vectorizer (I know, because I initially typoed s/le/be and saw tests gcc.dg/vect fall over), and the new testcase adds coverage for the expansion path through intrinsics. I've rebased on top of Alan's patch, which goes halfway to fixing the issue, but which didn't fix the float_truncate patterns, which had an incorrect vec_concat. That simplifies the patch considerably. Rechecked on aarch64_be-none-elf and aarch64-none-linux-gnu with no issues. OK? Thanks, James --- gcc/ 2015-09-09 James Greenhalgh * config/aarch64/aarch64-simd.md (aarch64_float_truncate_hi_v4sf): Rewrite as an expand. (aarch64_float_truncate_hi_v4sf_le): New. (aarch64_float_truncate_hi_v4sf_be): Likewise. gcc/testsuite/ 2015-09-09 James Greenhalgh * gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c: New. diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index a4eaeca..8be9b97 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -1703,6 +1703,13 @@ [(set_attr "type" "neon_fp_cvt_widen_s")] ) +;; ??? Note that the vectorizer usage of the vec_unpacks_[lo/hi] patterns +;; is inconsistent with vector ordering elsewhere in the compiler, in that +;; the meaning of HI and LO is always taken with a little-endian view of +;; the vector. Thus, while the patterns below look incorrect in that +;; vec_unpacks_hi always extracts the high *architectural* lanes of a +;; vector, their behaviour is as required. + (define_expand "vec_unpacks_lo_" [(match_operand: 0 "register_operand" "") (match_operand:VQ_HSF 1 "register_operand" "")] @@ -1757,17 +1764,42 @@ [(set_attr "type" "neon_fp_cvt_narrow_d_q")] ) -(define_insn "aarch64_float_truncate_hi_" +(define_insn "aarch64_float_truncate_hi__le" [(set (match_operand: 0 "register_operand" "=w") (vec_concat: (match_operand:VDF 1 "register_operand" "0") (float_truncate:VDF (match_operand: 2 "register_operand" "w"))))] - "TARGET_SIMD" + "TARGET_SIMD && !BYTES_BIG_ENDIAN" "fcvtn2\\t%0., %2" [(set_attr "type" "neon_fp_cvt_narrow_d_q")] ) +(define_insn "aarch64_float_truncate_hi__be" + [(set (match_operand: 0 "register_operand" "=w") + (vec_concat: + (float_truncate:VDF + (match_operand: 2 "register_operand" "w")) + (match_operand:VDF 1 "register_operand" "0")))] + "TARGET_SIMD && BYTES_BIG_ENDIAN" + "fcvtn2\\t%0., %2" + [(set_attr "type" "neon_fp_cvt_narrow_d_q")] +) + +(define_expand "aarch64_float_truncate_hi_" + [(match_operand: 0 "register_operand" "=w") + (match_operand:VDF 1 "register_operand" "0") + (match_operand: 2 "register_operand" "w")] + "TARGET_SIMD" +{ + rtx (*gen) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN + ? gen_aarch64_float_truncate_hi__be + : gen_aarch64_float_truncate_hi__le; + emit_insn (gen (operands[0], operands[1], operands[2])); + DONE; +} +) + (define_expand "vec_pack_trunc_v2df" [(set (match_operand:V4SF 0 "register_operand") (vec_concat:V4SF diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c new file mode 100644 index 0000000..492d6fd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c @@ -0,0 +1,97 @@ +#include "arm_neon.h" +#include + +void abort (void); + +void +foo (void) +{ + /* Test vcvt_high_f32_f64. */ + float32x2_t arg1; + float64x2_t arg2; + float32x4_t result; + arg1 = vcreate_f32 (UINT64_C (0x3f0db5793f6e1892)); + arg2 = vcombine_f64 (vcreate_f64 (UINT64_C (0x3fe8e49d23fb575d)), + vcreate_f64 (UINT64_C (0x3fd921291b3df73e))); + // Expect: "result" = 3ec909483f4724e93f0db5793f6e1892 + result = vcvt_high_f32_f64 (arg1, arg2); + float32_t got; + float32_t exp; + + /* Lane 0. */ + got = vgetq_lane_f32 (result, 0); + exp = ((float32_t) 0.9300624132156372); + if (((((exp / got) < ((float32_t) 0.999)) + || ((exp / got) > ((float32_t) 1.001))) + && (((exp - got) < ((float32_t) -1.0e-4)) + || ((exp - got) > ((float32_t) 1.0e-4))))) + abort (); + + /* Lane 1. */ + got = vgetq_lane_f32 (result, 1); + exp = ((float32_t) 0.5535503029823303); + if (((((exp / got) < ((float32_t) 0.999)) + || ((exp / got) > ((float32_t) 1.001))) + && (((exp - got) < ((float32_t) -1.0e-4)) + || ((exp - got) > ((float32_t) 1.0e-4))))) + abort (); + + /* Lane 2. */ + got = vgetq_lane_f32 (result, 2); + exp = ((float32_t) 0.7779069617051665); + if (((((exp / got) < ((float32_t) 0.999)) + || ((exp / got) > ((float32_t) 1.001))) + && (((exp - got) < ((float32_t) -1.0e-4)) + || ((exp - got) > ((float32_t) 1.0e-4))))) + abort (); + + /* Lane 2. */ + got = vgetq_lane_f32 (result, 3); + exp = ((float32_t) 0.3926489606891329); + if (((((exp / got) < ((float32_t) 0.999)) + || ((exp / got) > ((float32_t) 1.001))) + && (((exp - got) < ((float32_t) -1.0e-4)) + || ((exp - got) > ((float32_t) 1.0e-4))))) + abort (); +} + +void +bar (void) +{ + /* Test vcvt_high_f64_f32. */ + float32x4_t arg1; + float64x2_t result; + arg1 = vcombine_f32 (vcreate_f32 (UINT64_C (0x3f7c5cf13f261f74)), + vcreate_f32 (UINT64_C (0x3e3a7bc03f6ccc1d))); + // Expect: "result" = 3fc74f78000000003fed9983a0000000 + result = vcvt_high_f64_f32 (arg1); + + float64_t got; + float64_t exp; + + /* Lane 0. */ + got = vgetq_lane_f64 (result, 0); + exp = 0.9249895215034485; + if (((((exp / got) < 0.999) + || ((exp / got) > 1.001)) + && (((exp - got) < -1.0e-4) + || ((exp - got) > 1.0e-4)))) + abort (); + + /* Lane 0. */ + got = vgetq_lane_f64 (result, 1); + exp = 0.1821126937866211; + if (((((exp / got) < 0.999) + || ((exp / got) > 1.001)) + && (((exp - got) < -1.0e-4) + || ((exp - got) > 1.0e-4)))) + abort (); +} + +int +main (int argc, char **argv) +{ + foo (); + bar (); + return 0; +}