From patchwork Mon Sep 21 14:38:25 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Greenhalgh X-Patchwork-Id: 53963 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-wi0-f197.google.com (mail-wi0-f197.google.com [209.85.212.197]) by patches.linaro.org (Postfix) with ESMTPS id 39AA622B1E for ; Mon, 21 Sep 2015 14:39:16 +0000 (UTC) Received: by wisv5 with SMTP id v5sf32068515wis.0 for ; Mon, 21 Sep 2015 07:39:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:mailing-list:precedence:list-id :list-unsubscribe:list-archive:list-post:list-help:sender :delivered-to:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-type:x-original-sender :x-original-authentication-results; bh=c2wQ8s+rJdCGDwwBXe02WiGnTL0RCFuPja2xvYU9ac0=; b=nBJF4niIjrtlAQZermeJ0oRKe63mS++/KnIujVw9VnDsxtqE5GUgSNAFLot360ieke SyIp8qAGDArvyLYWaw3OCX/kvAN81kI4lkMrKf2ue5FfKrxhvA7YnGb/rAmQYHXKeFlL Q2omxDo7+SephAatoG1UkSZHS4J/yP48JLjGQrnUH4x3IQmsYSUeELkMohSgVgexoLnR q6xnPayIOIhV2CUL+dtBRQg9AAdpMuurOpiCvcQBrcSzc/1rmphhTFkxots7vkVmPZfm 3G63AZKnvFI3bNYcYpmlFN71UieAX2+eX+57VKKVmj2F7sMlqLAjgtv5hItSksCC2+Qr n5DQ== X-Gm-Message-State: ALoCoQlLoozDSivyfmG3iFybG68WTyNgDR01efR8UGc/yhBCEHIx9GD4CKsgRfZndxTfwXyH/3jj X-Received: by 10.112.130.41 with SMTP id ob9mr3270318lbb.17.1442846355518; Mon, 21 Sep 2015 07:39:15 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.237.36 with SMTP id uz4ls493724lac.6.gmail; Mon, 21 Sep 2015 07:39:15 -0700 (PDT) X-Received: by 10.112.52.138 with SMTP id t10mr6139126lbo.99.1442846355380; Mon, 21 Sep 2015 07:39:15 -0700 (PDT) Received: from mail-lb0-x234.google.com (mail-lb0-x234.google.com. [2a00:1450:4010:c04::234]) by mx.google.com with ESMTPS id o9si14559583lag.30.2015.09.21.07.39.15 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 21 Sep 2015 07:39:15 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c04::234 as permitted sender) client-ip=2a00:1450:4010:c04::234; Received: by lbbmp1 with SMTP id mp1so52225012lbb.1 for ; Mon, 21 Sep 2015 07:39:15 -0700 (PDT) X-Received: by 10.112.130.70 with SMTP id oc6mr7925036lbb.32.1442846355252; Mon, 21 Sep 2015 07:39:15 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.59.35 with SMTP id w3csp1752201lbq; Mon, 21 Sep 2015 07:39:13 -0700 (PDT) X-Received: by 10.50.45.106 with SMTP id l10mr10545884igm.57.1442846353660; Mon, 21 Sep 2015 07:39:13 -0700 (PDT) Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id 35si17192564iok.27.2015.09.21.07.39.13 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 21 Sep 2015 07:39:13 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-return-407956-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Received: (qmail 3140 invoked by alias); 21 Sep 2015 14:38:41 -0000 Mailing-List: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 3042 invoked by uid 89); 21 Sep 2015 14:38:40 -0000 X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL, BAYES_00, SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 21 Sep 2015 14:38:36 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-16-xSxt9aQ6Rn2zGUGqNK8G_g-1; Mon, 21 Sep 2015 15:38:30 +0100 Received: from e107456-lin.cambridge.arm.com ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Mon, 21 Sep 2015 15:38:30 +0100 From: James Greenhalgh To: gcc-patches@gcc.gnu.org Cc: alalaw01@arm.com, marsha01@arm.com, tbelagod@arm.com, christophe.lyon@linaro.org Subject: Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics. Date: Mon, 21 Sep 2015 15:38:25 +0100 Message-Id: <1442846305-39006-1-git-send-email-james.greenhalgh@arm.com> In-Reply-To: References: MIME-Version: 1.0 X-MC-Unique: xSxt9aQ6Rn2zGUGqNK8G_g-1 X-IsSubscribed: yes X-Original-Sender: james.greenhalgh@arm.com X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c04::234 as permitted sender) smtp.mailfrom=patch+caf_=patchwork-forward=linaro.org@linaro.org; dkim=pass header.i=@gcc.gnu.org X-Google-Group-Id: 836684582541 On Mon, Sep 21, 2015 at 10:44:32AM +0100, Alan Lawrence wrote: > [Resending in plain text] This makes sense to me now, although I find > your comment slightly confusing: > > [....] in that > +;; the meaning of HI and LO is always taken with a little-endian view of > +;; the vector > > You mean vec_unpacks_{hi,lo} (which seems to go against the > *architectural* bit after this), or hi/lo in cases other than > vec_unpack (=> not "always"), or something else? > > maybe s/always/usually/ or s/always/otherwise/ ? > What I was aiming for is a description that our implementation of these standard pattern names looks wrong, because "hi" always extracts the architectural high lanes, in other big-endian patterns we make the adjustment that higher numbered lanes map to the low architectural lanes. I've tried to reword the comment to make it clearer, but I'm assuming some familiarity with our overall big-endian vector model. I've also updated the testcase to skip it if we are targetting AArch32, which does not provide these intrinsics. OK? Thanks, James --- gcc/ 2015-09-21 James Greenhalgh * config/aarch64/aarch64-simd.md (aarch64_float_truncate_hi_v4sf): Rewrite as an expand. (aarch64_float_truncate_hi_v4sf_le): New. (aarch64_float_truncate_hi_v4sf_be): Likewise. gcc/testsuite/ 2015-09-21 James Greenhalgh * gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c: New. diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index dbe5259..5ab2f2b 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -1703,6 +1703,15 @@ [(set_attr "type" "neon_fp_cvt_widen_s")] ) +;; ??? Note that the vectorizer usage of the vec_unpacks_[lo/hi] patterns +;; is inconsistent with vector ordering elsewhere in the compiler, in that +;; the meaning of HI and LO changes depending on the target endianness. +;; While elsewhere we map the higher numbered elements of a vector to +;; the lower architectural lanes of the vector, for these patterns we want +;; to always treat "hi" as referring to the higher architectural lanes. +;; Consequently, while the patterns below look inconsistent with our +;; other big-endian patterns their behaviour is as required. + (define_expand "vec_unpacks_lo_" [(match_operand: 0 "register_operand" "") (match_operand:VQ_HSF 1 "register_operand" "")] @@ -1757,17 +1766,42 @@ [(set_attr "type" "neon_fp_cvt_narrow_d_q")] ) -(define_insn "aarch64_float_truncate_hi_" +(define_insn "aarch64_float_truncate_hi__le" [(set (match_operand: 0 "register_operand" "=w") (vec_concat: (match_operand:VDF 1 "register_operand" "0") (float_truncate:VDF (match_operand: 2 "register_operand" "w"))))] - "TARGET_SIMD" + "TARGET_SIMD && !BYTES_BIG_ENDIAN" "fcvtn2\\t%0., %2" [(set_attr "type" "neon_fp_cvt_narrow_d_q")] ) +(define_insn "aarch64_float_truncate_hi__be" + [(set (match_operand: 0 "register_operand" "=w") + (vec_concat: + (float_truncate:VDF + (match_operand: 2 "register_operand" "w")) + (match_operand:VDF 1 "register_operand" "0")))] + "TARGET_SIMD && BYTES_BIG_ENDIAN" + "fcvtn2\\t%0., %2" + [(set_attr "type" "neon_fp_cvt_narrow_d_q")] +) + +(define_expand "aarch64_float_truncate_hi_" + [(match_operand: 0 "register_operand" "=w") + (match_operand:VDF 1 "register_operand" "0") + (match_operand: 2 "register_operand" "w")] + "TARGET_SIMD" +{ + rtx (*gen) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN + ? gen_aarch64_float_truncate_hi__be + : gen_aarch64_float_truncate_hi__le; + emit_insn (gen (operands[0], operands[1], operands[2])); + DONE; +} +) + (define_expand "vec_pack_trunc_v2df" [(set (match_operand:V4SF 0 "register_operand") (vec_concat:V4SF diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c new file mode 100644 index 0000000..4691da3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c @@ -0,0 +1,99 @@ +/* { dg-skip-if "" { arm*-*-* } } */ + +#include "arm_neon.h" +#include + +void abort (void); + +void +foo (void) +{ + /* Test vcvt_high_f32_f64. */ + float32x2_t arg1; + float64x2_t arg2; + float32x4_t result; + arg1 = vcreate_f32 (UINT64_C (0x3f0db5793f6e1892)); + arg2 = vcombine_f64 (vcreate_f64 (UINT64_C (0x3fe8e49d23fb575d)), + vcreate_f64 (UINT64_C (0x3fd921291b3df73e))); + // Expect: "result" = 3ec909483f4724e93f0db5793f6e1892 + result = vcvt_high_f32_f64 (arg1, arg2); + float32_t got; + float32_t exp; + + /* Lane 0. */ + got = vgetq_lane_f32 (result, 0); + exp = ((float32_t) 0.9300624132156372); + if (((((exp / got) < ((float32_t) 0.999)) + || ((exp / got) > ((float32_t) 1.001))) + && (((exp - got) < ((float32_t) -1.0e-4)) + || ((exp - got) > ((float32_t) 1.0e-4))))) + abort (); + + /* Lane 1. */ + got = vgetq_lane_f32 (result, 1); + exp = ((float32_t) 0.5535503029823303); + if (((((exp / got) < ((float32_t) 0.999)) + || ((exp / got) > ((float32_t) 1.001))) + && (((exp - got) < ((float32_t) -1.0e-4)) + || ((exp - got) > ((float32_t) 1.0e-4))))) + abort (); + + /* Lane 2. */ + got = vgetq_lane_f32 (result, 2); + exp = ((float32_t) 0.7779069617051665); + if (((((exp / got) < ((float32_t) 0.999)) + || ((exp / got) > ((float32_t) 1.001))) + && (((exp - got) < ((float32_t) -1.0e-4)) + || ((exp - got) > ((float32_t) 1.0e-4))))) + abort (); + + /* Lane 3. */ + got = vgetq_lane_f32 (result, 3); + exp = ((float32_t) 0.3926489606891329); + if (((((exp / got) < ((float32_t) 0.999)) + || ((exp / got) > ((float32_t) 1.001))) + && (((exp - got) < ((float32_t) -1.0e-4)) + || ((exp - got) > ((float32_t) 1.0e-4))))) + abort (); +} + +void +bar (void) +{ + /* Test vcvt_high_f64_f32. */ + float32x4_t arg1; + float64x2_t result; + arg1 = vcombine_f32 (vcreate_f32 (UINT64_C (0x3f7c5cf13f261f74)), + vcreate_f32 (UINT64_C (0x3e3a7bc03f6ccc1d))); + // Expect: "result" = 3fc74f78000000003fed9983a0000000 + result = vcvt_high_f64_f32 (arg1); + + float64_t got; + float64_t exp; + + /* Lane 0. */ + got = vgetq_lane_f64 (result, 0); + exp = 0.9249895215034485; + if (((((exp / got) < 0.999) + || ((exp / got) > 1.001)) + && (((exp - got) < -1.0e-4) + || ((exp - got) > 1.0e-4)))) + abort (); + + /* Lane 1. */ + got = vgetq_lane_f64 (result, 1); + exp = 0.1821126937866211; + if (((((exp / got) < 0.999) + || ((exp / got) > 1.001)) + && (((exp - got) < -1.0e-4) + || ((exp - got) > 1.0e-4)))) + abort (); +} + +int +main (int argc, char **argv) +{ + foo (); + bar (); + return 0; +}