From patchwork Thu Jun 18 04:25:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 191029 Delivered-To: patch@linaro.org Received: by 2002:a92:cf06:0:0:0:0:0 with SMTP id c6csp1074177ilo; Wed, 17 Jun 2020 21:57:04 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz5juWmy1fq0hQd1dokl1OHW1JgdVr+0DBTTxrZY/CDH1jCmsIF9Z3r/TvfSb4VUEC8HpCr X-Received: by 2002:a25:a443:: with SMTP id f61mr3610719ybi.225.1592456224717; Wed, 17 Jun 2020 21:57:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1592456224; cv=none; d=google.com; s=arc-20160816; b=TbVFilbc97p96BO1GYwGMYX56FTuvh6JsHk5iGFmKANkfTBlWHRv5C3bvHY4a14ulR Tno+9eX0Z/zMu74B5divc1Oayg8vNXByDQGoykvQUMHH243Rfw3MWXbOrzcID4vvMMmG sTXkoLmXiVROzFnHKOZvVUCbYS2TbqmVVbgUMAx/3voDbrIhT6ELY16XR/02SWmpfHQw L78rqQEe8hptyZJNS+QbjY39WgGUaD/NgKFFkL11dt4ltxBpDgVFh+jKfkQfwdrQQbZv T6iX5mBaOTz+LnPQJyc5sAbq/TfrFELmx8xIIpa4KokpcOPEio5kwYGrirOOZcwYjBCs IlAg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=3jyMDBvSXdkWweNTlJ10emIpFoS2H8wIpT4CHdEpe/I=; b=yk9TDkMfFV+/63czOZ8HkezANv1uGbkJ84Yj+5yMcPIzpVuda1E7NQMOKOMHuaPacD gkPh9C8VuUGj882jgV4TRMQn52A2f5dBfA9ujpXBZoqroxwc73sQUc+1HANt6unFiiob mOaWO08nBH7htS+3LhuiAx1Y1Yym2EdzVXW0YofhcZH8xK3ttvYsXvUQHkQZwnSu9xR/ u5XvFpcVMkNbn1+UabVLkfJtHn+txD2DxC7lzMRn6Ij7g5uwsJyNwV+P9oWtoPXsGuCn KWiCzZw5oHIa3/ns3CpP3xuLd9zchDHS5Kp9MMaGEmBaWwduGaIKaz7neWlrVw4SYgAz 1Mxg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=iU66whJt; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id 187si1469759yba.204.2020.06.17.21.57.04 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Wed, 17 Jun 2020 21:57:04 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=iU66whJt; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:48794 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jlmbo-0003cS-1r for patch@linaro.org; Thu, 18 Jun 2020 00:57:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:32858) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jlmCC-0001x2-91 for qemu-devel@nongnu.org; Thu, 18 Jun 2020 00:30:37 -0400 Received: from mail-pj1-x102d.google.com ([2607:f8b0:4864:20::102d]:39570) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jlmC8-00037r-EP for qemu-devel@nongnu.org; Thu, 18 Jun 2020 00:30:35 -0400 Received: by mail-pj1-x102d.google.com with SMTP id h95so2079945pje.4 for ; Wed, 17 Jun 2020 21:30:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=3jyMDBvSXdkWweNTlJ10emIpFoS2H8wIpT4CHdEpe/I=; b=iU66whJtwT6nC4OqrLXuCBcJvaA3E18zJ3gs8CgBx/L8u6Yoop3J031FWQgCDsqsTw yLszbxWjg+25NWes5cnLyNookK6jivsgQp0HS7D57Omi6AlH9cNqQajwV6H51qBLMA+h S5z3acsBKy9+VJXQtNeBFE+JtkJxxq2KtBIp1qA/yicOsPXbCoiJ8jC95dFFyiD7nBup x703thmcYcjYIDLzP79pufTbZiLuIWhaLlIGTM8K4du6PcIUICquQxIiiuXCw5BzqpkZ Er4gkQFjsn6K1cfJ04v//eJ8ZLPC0iPQlzzL/TceJz0kpSgZewKkf0PR5/I2/Bi8XWrh trWQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3jyMDBvSXdkWweNTlJ10emIpFoS2H8wIpT4CHdEpe/I=; b=b9p77Zh0no3nvYKK5gcfXdpilP3fUN/rhnl/drolvWMcDx+cX5DZ6CKyrorezll/5f JpzNaDM021aEFHvxo6kirHkOfZ/F5RnqGGLUS0ULGJKJHBgnc+ilCT1qCiG/K44gs3c7 HcVb2+Pro05yBPwTXR7MpY+FkQ3uH/PWiXMf+5GhMhVpRnXtkGbGgLDvQHkLS69vSmjR H5pepOpyjLiysA6lA2sLZUegVCPpCXyocP0P/E/fIeZICCufBk3mfzOSaO2NM/xRiail 5YZVlQ34PWLQgMdCHhrl2+QYtHHGCxTSA0ZG3EQVLzecSAKrUAPa48OoB49tFka7jSwE mmtw== X-Gm-Message-State: AOAM531EdxYBDoj/R6ZiNhWXJvS6A0Rxs0kFu0znjmcaX8qAFZmyyoWc hmdJIvu4RY4iSkHEUkjL8wnJKAgppLM= X-Received: by 2002:a17:902:b403:: with SMTP id x3mr2123053plr.240.1592454630551; Wed, 17 Jun 2020 21:30:30 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id i191sm1298861pfe.99.2020.06.17.21.30.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Jun 2020 21:30:29 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 053/100] target/arm: Implement SVE2 complex integer multiply-add Date: Wed, 17 Jun 2020 21:25:57 -0700 Message-Id: <20200618042644.1685561-54-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200618042644.1685561-1-richard.henderson@linaro.org> References: <20200618042644.1685561-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102d; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102d.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org, steplong@quicinc.com Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix do_sqrdmlah_d (laurent desnogues) --- target/arm/helper-sve.h | 18 ++++++++++++++++ target/arm/vec_internal.h | 5 +++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 42 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 32 +++++++++++++++++++++++++++++ target/arm/vec_helper.c | 15 +++++++------- 6 files changed, 109 insertions(+), 8 deletions(-) -- 2.25.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 8fc8b856e7..4029093564 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2113,3 +2113,21 @@ DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h index 372fe76523..38ce31b4ca 100644 --- a/target/arm/vec_internal.h +++ b/target/arm/vec_internal.h @@ -168,4 +168,9 @@ static inline int64_t do_suqrshl_d(int64_t src, int64_t shift, return do_uqrshl_d(src, shift, round, sat); } +int8_t do_sqrdmlah_b(int8_t, int8_t, int8_t, bool, bool); +int16_t do_sqrdmlah_h(int16_t, int16_t, int16_t, bool, bool, uint32_t *); +int32_t do_sqrdmlah_s(int32_t, int32_t, int32_t, bool, bool, uint32_t *); +int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool); + #endif /* TARGET_ARM_VEC_INTERNALS_H */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 19c5013ddd..a03d6107da 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1362,3 +1362,8 @@ SMLSLB_zzzw 01000100 .. 0 ..... 010 100 ..... ..... @rda_rn_rm SMLSLT_zzzw 01000100 .. 0 ..... 010 101 ..... ..... @rda_rn_rm UMLSLB_zzzw 01000100 .. 0 ..... 010 110 ..... ..... @rda_rn_rm UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm + +## SVE2 complex integer multiply-add + +CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx +SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index dbf378d214..b4613d90dc 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1448,6 +1448,48 @@ DO_SQDMLAL(sve2_sqdmlsl_zzzw_d, int64_t, int32_t, , H1_4, #undef DO_SQDMLAL +#define DO_CMLA(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(TYPE); \ + int rot = simd_data(desc); \ + int sel_a = rot & 1, sel_b = sel_a ^ 1; \ + bool sub_r = rot == 1 || rot == 2; \ + bool sub_i = rot >= 2; \ + TYPE *d = vd, *n = vn, *m = vm, *a = va; \ + for (i = 0; i < opr_sz; i += 2) { \ + TYPE elt1_a = n[H(i + sel_a)]; \ + TYPE elt2_a = m[H(i + sel_a)]; \ + TYPE elt2_b = m[H(i + sel_b)]; \ + d[H(i)] = OP(elt1_a, elt2_a, a[H(i)], sub_r); \ + d[H(i + 1)] = OP(elt1_a, elt2_b, a[H(i + 1)], sub_i); \ + } \ +} + +#define do_cmla(N, M, A, S) (A + (N * M) * (S ? -1 : 1)) + +DO_CMLA(sve2_cmla_zzzz_b, uint8_t, H1, do_cmla) +DO_CMLA(sve2_cmla_zzzz_h, uint16_t, H2, do_cmla) +DO_CMLA(sve2_cmla_zzzz_s, uint32_t, H4, do_cmla) +DO_CMLA(sve2_cmla_zzzz_d, uint64_t, , do_cmla) + +#define DO_SQRDMLAH_B(N, M, A, S) \ + do_sqrdmlah_b(N, M, A, S, true) +#define DO_SQRDMLAH_H(N, M, A, S) \ + ({ uint32_t discard; do_sqrdmlah_h(N, M, A, S, true, &discard); }) +#define DO_SQRDMLAH_S(N, M, A, S) \ + ({ uint32_t discard; do_sqrdmlah_s(N, M, A, S, true, &discard); }) +#define DO_SQRDMLAH_D(N, M, A, S) \ + do_sqrdmlah_d(N, M, A, S, true) + +DO_CMLA(sve2_sqrdcmlah_zzzz_b, int8_t, H1, DO_SQRDMLAH_B) +DO_CMLA(sve2_sqrdcmlah_zzzz_h, int16_t, H2, DO_SQRDMLAH_H) +DO_CMLA(sve2_sqrdcmlah_zzzz_s, int32_t, H4, DO_SQRDMLAH_S) +DO_CMLA(sve2_sqrdcmlah_zzzz_d, int64_t, , DO_SQRDMLAH_D) + +#undef do_cmla +#undef DO_CMLA + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 054c9d4799..0ad55ad243 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7217,3 +7217,35 @@ static bool trans_UMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) { return do_umlsl_zzzw(s, a, true); } + +static bool trans_CMLA_zzzz(DisasContext *s, arg_CMLA_zzzz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_cmla_zzzz_b, gen_helper_sve2_cmla_zzzz_h, + gen_helper_sve2_cmla_zzzz_s, gen_helper_sve2_cmla_zzzz_d, + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fns[a->esz], a->rd, a->rn, a->rm, a->ra, a->rot); + } + return true; +} + +static bool trans_SQRDCMLAH_zzzz(DisasContext *s, arg_SQRDCMLAH_zzzz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_sqrdcmlah_zzzz_b, gen_helper_sve2_sqrdcmlah_zzzz_h, + gen_helper_sve2_sqrdcmlah_zzzz_s, gen_helper_sve2_sqrdcmlah_zzzz_d, + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fns[a->esz], a->rd, a->rn, a->rm, a->ra, a->rot); + } + return true; +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 4b7afd7be5..f016aa7978 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -38,8 +38,8 @@ #endif /* Signed saturating rounding doubling multiply-accumulate high half, 8-bit */ -static int8_t do_sqrdmlah_b(int8_t src1, int8_t src2, int8_t src3, - bool neg, bool round) +int8_t do_sqrdmlah_b(int8_t src1, int8_t src2, int8_t src3, + bool neg, bool round) { /* * Simplify: @@ -82,8 +82,8 @@ void HELPER(sve2_sqrdmlsh_b)(void *vd, void *vn, void *vm, } /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */ -static int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, - bool neg, bool round, uint32_t *sat) +int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, + bool neg, bool round, uint32_t *sat) { /* Simplify similarly to do_sqrdmlah_b above. */ int32_t ret = (int32_t)src1 * src2; @@ -175,8 +175,8 @@ void HELPER(sve2_sqrdmlsh_h)(void *vd, void *vn, void *vm, } /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ -static int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, - bool neg, bool round, uint32_t *sat) +int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, + bool neg, bool round, uint32_t *sat) { /* Simplify similarly to do_sqrdmlah_b above. */ int64_t ret = (int64_t)src1 * src2; @@ -273,8 +273,7 @@ static int64_t do_sat128_d(Int128 r) return ls; } -static int64_t do_sqrdmlah_d(int64_t n, int64_t m, int64_t a, - bool neg, bool round) +int64_t do_sqrdmlah_d(int64_t n, int64_t m, int64_t a, bool neg, bool round) { uint64_t l, h; Int128 r, t;