From patchwork Tue May 25 01:03:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 447106 Delivered-To: patch@linaro.org Received: by 2002:a02:7a1b:0:0:0:0:0 with SMTP id a27csp3827856jac; Mon, 24 May 2021 18:38:31 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwWjccgrt4cXi2vcswAQDaEHk9kKWIUnopWHtDsJbJbp9itwHQhC2qYLinqzrYh8eAcFQ/T X-Received: by 2002:a05:6e02:1a0c:: with SMTP id s12mr17951924ild.201.1621906711329; Mon, 24 May 2021 18:38:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1621906711; cv=none; d=google.com; s=arc-20160816; b=uZZScSLQfbE+hOZTlCx5WkL8utX1gBtRE6DN2HQjUFohQ9+tAl+Ewt+VOIzfwovBVg OhYi9HaWBBpOssu4CaEPPFqguZufWqKEan5uBFBNP47u3s0wD7p3DXXfDhvEh6zXga4u 36BVtwxcYf2sPlf0fUZ9WCRiP43T46DulPuVhyga+vvkR0q2UKWanLhiDYwDgxkIpG0C LnAmMyV1R9dMKsr5HETJXy9YV4TkhqMZt1QDUjiWe0a1ufb+lfcKq5pkfO3mtE+tD4o1 cJm/5KoSWaCvGX15S2nAtU+SzrOFd99sTVn4eu+G0D/LpW+WlhEVG9l2ScM2iYfJCvWh pG9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=I8/Dcr1RlD62cWLa9TUHbYmDfTs7MOj8zYzO3BzWCao=; b=ykKyDPfJcQ3Pv6P9Q/MgtotT43ywH3HRwG3OQXpH99lh4L9FHhdGiEpq30SE+DcVBb PTzSSjMiIi6t3aaA3o9kYdMJjMDzYwUbKLRCCoWFdbFDlLsDD0rDWQ50rifHwLXbkwY2 nDOzafvtw1zyaGfRa77EpOPRpza1ecvgzhY3I+Yz0DyATml3FUFi2uOr6jMAkgIKqEUU R/cpjlAZSZcB9lV28GVSyqe9HUVqZ+9nmOabCLcIuvxJemU1xCTFO3BOkM2QFmUXxAMv jdLozoRIQ22CiIDLguzdB2ygOJz9wS4Es91h7gBwwLxXDTVfYSbE+lHbO6hQofpTC6tn XyZg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="Vl/DRBDt"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id l13si2400129ilv.61.2021.05.24.18.38.31 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Mon, 24 May 2021 18:38:31 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="Vl/DRBDt"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:44138 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM1d-0005x3-Qd for patch@linaro.org; Mon, 24 May 2021 21:38:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55046) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXu-0007sE-Af for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:46 -0400 Received: from mail-pf1-x42d.google.com ([2607:f8b0:4864:20::42d]:38908) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXc-00040N-4x for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:46 -0400 Received: by mail-pf1-x42d.google.com with SMTP id e17so11674229pfl.5 for ; Mon, 24 May 2021 18:07:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=I8/Dcr1RlD62cWLa9TUHbYmDfTs7MOj8zYzO3BzWCao=; b=Vl/DRBDtjyAvgDJzLlhjjpbrvCF1MqIviLVwNgvZllcHh/J4xBQMw4bxI+MvwNdBFp Tkg5pvTN0BRBEWpVrOsTCSkWs/C6RnjO8J1NZeOCjSVeX77p2y1CSzxkfoAY/0Xf3nfF 3TQc+RVpshj4iNsYXEZhqABiSUGyCFM3MLx41F9Q0h7sqlC5vhmN6qLGZmstG2dj9kav f2D0L49ei9qh8S9CdltxZIaK+DWrEpCHhYqQFZySupRY9PNZffAuzRFPbE62NASTf0Yk NLZ030vASyXupMMBPPIj/Jq7Vr3VpJJV1rAPJ6SrOPw01oryQQoujVNJ/8i/ZnIfFqJe Ky6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=I8/Dcr1RlD62cWLa9TUHbYmDfTs7MOj8zYzO3BzWCao=; b=SvYRM0qf+lAczhOJo0Lsld5XlNUeHDKfNu8hTy8b5+uqmDb6dGVbXfAveocErxroUB gbUQULEAcw2AmCu+n7c6XAaOP2hMxdvkoAOt8daqG36DhtqKhRV56gLHBVJNc+ecOABX LTndONssUoeZPHf5L+AbmOsWm4lvvgv55VV4ZrkC/O0BTIDHK5jgQgXFYxyloqPuxABX yqKtXJ9tVEvW5KMxGUrMVs/tGVqCcWLgh7LQpBX9M5BfHzwHgVPtFlMWvVbkJ9fijit3 glntrJROyli7J/E747XANvFixzSrWYskf5bCN3opQqi5QNgzmxfHSw1QfrMcxBl+VqmC ryhw== X-Gm-Message-State: AOAM530cy5+wTeZurk6kFld6aCDrt1cjEbXxxAjeXDYet804CSc2ZzSZ d6ZeaZI8bAq0xNYAPlBOC291eu2MzioEyA== X-Received: by 2002:a63:b507:: with SMTP id y7mr8934679pge.74.1621904846375; Mon, 24 May 2021 18:07:26 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 43/92] target/arm: Implement SVE2 XAR Date: Mon, 24 May 2021 18:03:09 -0700 Message-Id: <20210525010358.152808-44-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42d; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" In addition, use the same vector generator interface for AdvSIMD. This fixes a bug in which the AdvSIMD insn failed to clear the high bits of the SVE register. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 ++ target/arm/helper.h | 2 + target/arm/translate-a64.h | 3 ++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 39 ++++++++++++++ target/arm/translate-a64.c | 25 ++------- target/arm/translate-sve.c | 104 +++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 12 +++++ 8 files changed, 172 insertions(+), 21 deletions(-) -- 2.25.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 507a2fea8e..28b8f00201 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2558,6 +2558,10 @@ DEF_HELPER_FLAGS_5(sve2_histcnt_d, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_4(sve2_histseg, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/helper.h b/target/arm/helper.h index 6bb0b0ddc0..23a7ec5638 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -953,6 +953,8 @@ DEF_HELPER_FLAGS_5(neon_sqrdmulh_h, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(neon_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index 89437276e7..58f50abca4 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -120,5 +120,8 @@ bool disas_sve(DisasContext *, uint32_t); void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, int64_t shift, + uint32_t opr_sz, uint32_t max_sz); #endif /* TARGET_ARM_TRANSLATE_A64_H */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 8f501a083c..7645587469 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -65,6 +65,7 @@ &rr_dbm rd rn dbm &rrri rd rn rm imm &rri_esz rd rn imm esz +&rrri_esz rd rn rm imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rpr_s rd pg rn s @@ -384,6 +385,9 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 +XAR 00000100 .. 1 ..... 001 101 rm:5 rd:5 &rrri_esz \ + rn=%reg_movprfx esz=%tszimm16_esz imm=%tszimm16_shr + # SVE2 bitwise ternary operations EOR3 00000100 00 1 ..... 001 110 ..... ..... @rdn_ra_rm_e0 BSL 00000100 00 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 662ed80b1c..5b6292929e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7202,3 +7202,42 @@ void HELPER(sve2_histseg)(void *vd, void *vn, void *vm, uint32_t desc) *(uint64_t *)(vd + i + 8) = out1; } } + +void HELPER(sve2_xar_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + int shr = simd_data(desc); + int shl = 8 - shr; + uint64_t mask = dup_const(MO_8, 0xff >> shr); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + uint64_t t = n[i] ^ m[i]; + d[i] = ((t >> shr) & mask) | ((t << shl) & ~mask); + } +} + +void HELPER(sve2_xar_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + int shr = simd_data(desc); + int shl = 16 - shr; + uint64_t mask = dup_const(MO_16, 0xffff >> shr); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + uint64_t t = n[i] ^ m[i]; + d[i] = ((t >> shr) & mask) | ((t << shl) & ~mask); + } +} + +void HELPER(sve2_xar_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 4; + int shr = simd_data(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ror32(n[i] ^ m[i], shr); + } +} diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 0c80d0b505..bdd47208b1 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -14349,8 +14349,6 @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn) int imm6 = extract32(insn, 10, 6); int rn = extract32(insn, 5, 5); int rd = extract32(insn, 0, 5); - TCGv_i64 tcg_op1, tcg_op2, tcg_res[2]; - int pass; if (!dc_isar_feature(aa64_sha3, s)) { unallocated_encoding(s); @@ -14361,25 +14359,10 @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn) return; } - tcg_op1 = tcg_temp_new_i64(); - tcg_op2 = tcg_temp_new_i64(); - tcg_res[0] = tcg_temp_new_i64(); - tcg_res[1] = tcg_temp_new_i64(); - - for (pass = 0; pass < 2; pass++) { - read_vec_element(s, tcg_op1, rn, pass, MO_64); - read_vec_element(s, tcg_op2, rm, pass, MO_64); - - tcg_gen_xor_i64(tcg_res[pass], tcg_op1, tcg_op2); - tcg_gen_rotri_i64(tcg_res[pass], tcg_res[pass], imm6); - } - write_vec_element(s, tcg_res[0], rd, 0, MO_64); - write_vec_element(s, tcg_res[1], rd, 1, MO_64); - - tcg_temp_free_i64(tcg_op1); - tcg_temp_free_i64(tcg_op2); - tcg_temp_free_i64(tcg_res[0]); - tcg_temp_free_i64(tcg_res[1]); + gen_gvec_xar(MO_64, vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), imm6, 16, + vec_full_reg_size(s)); } /* Crypto three-reg imm2 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 13f84d14d3..ba39ff84a5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -340,6 +340,110 @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a) return do_zzz_fn(s, a, tcg_gen_gvec_andc); } +static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + uint64_t mask = dup_const(MO_8, 0xff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 8 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + uint64_t mask = dup_const(MO_16, 0xffff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 16 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh) +{ + tcg_gen_xor_i32(d, n, m); + tcg_gen_rotri_i32(d, d, sh); +} + +static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + tcg_gen_xor_i64(d, n, m); + tcg_gen_rotri_i64(d, d, sh); +} + +static void gen_xar_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, int64_t sh) +{ + tcg_gen_xor_vec(vece, d, n, m); + tcg_gen_rotri_vec(vece, d, d, sh); +} + +void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, int64_t shift, + uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop[] = { INDEX_op_rotli_vec, 0 }; + static const GVecGen3i ops[4] = { + { .fni8 = gen_xar8_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_b, + .opt_opc = vecop, + .vece = MO_8 }, + { .fni8 = gen_xar16_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_h, + .opt_opc = vecop, + .vece = MO_16 }, + { .fni4 = gen_xar_i32, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_s, + .opt_opc = vecop, + .vece = MO_32 }, + { .fni8 = gen_xar_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_gvec_xar_d, + .opt_opc = vecop, + .vece = MO_64 } + }; + int esize = 8 << vece; + + /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */ + tcg_debug_assert(shift >= 0); + tcg_debug_assert(shift <= esize); + shift &= esize - 1; + + if (shift == 0) { + /* xar with no rotate devolves to xor. */ + tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, + shift, &ops[vece]); + } +} + +static bool trans_XAR(DisasContext *s, arg_rrri_esz *a) +{ + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + gen_gvec_xar(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), a->imm, vsz, vsz); + } + return true; +} + static bool do_sve2_zzzz_fn(DisasContext *s, arg_rrrr_esz *a, GVecGen4Fn *fn) { if (!dc_isar_feature(aa64_sve2, s)) { diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 19006f50f7..a3d80ecad0 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -2253,3 +2253,15 @@ void HELPER(gvec_umulh_d)(void *vd, void *vn, void *vm, uint32_t desc) } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +void HELPER(gvec_xar_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + int shr = simd_data(desc); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ror64(n[i] ^ m[i], shr); + } + clear_tail(d, opr_sz * 8, simd_maxsz(desc)); +}