From patchwork Wed May 13 16:32:30 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186648
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 01/16] target/arm: Create gen_gvec_[us]sra
Date: Wed, 13 May 2020 09:32:30 -0700
Message-Id: <20200513163245.17915-2-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.

Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  10 +++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  15 +---
 target/arm/translate.c     | 161 ++++++++++++++++++++++---------------
 target/arm/vec_helper.c    |  25 ++++++
 5 files changed, 139 insertions(+), 79 deletions(-)

--
2.20.1

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 5817626b20..9bc162345c 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -691,6 +691,16 @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index cb7925ea46..1839a59a8e 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
 extern const GVecGen3 sshl_op[4];
 extern const GVecGen3 ushl_op[4];
-extern const GVecGen2i ssra_op[4];
-extern const GVecGen2i usra_op[4];
 extern const GVecGen2i sri_op[4];
 extern const GVecGen2i sli_op[4];
 extern const GVecGen4 uqadd_op[4];
@@ -299,6 +297,11 @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 
+void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 62e5729904..315de9a9b6 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -10188,19 +10188,8 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
 
     switch (opcode) {
     case 0x02: /* SSRA / USRA (accumulate) */
-        if (is_u) {
-            /* Shift count same as element size produces zero to add. */
-            if (shift == 8 << size) {
-                goto done;
-            }
-            gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]);
-        } else {
-            /* Shift count same as element size produces all sign to add. */
-            if (shift == 8 << size) {
-                shift -= 1;
-            }
-            gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]);
-        }
+        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
         return;
     case 0x08: /* SRI */
         /* Shift count same as element size is valid but does nothing. */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 74fac1d09c..c18140f2e6 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3874,33 +3874,51 @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
     tcg_gen_add_vec(vece, d, d, a);
 }
 
-static const TCGOpcode vecop_list_ssra[] = {
-    INDEX_op_sari_vec, INDEX_op_add_vec, 0
-};
+void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_ssra8_i64,
+          .fniv = gen_ssra_vec,
+          .fno = gen_helper_gvec_ssra_b,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_ssra16_i64,
+          .fniv = gen_ssra_vec,
+          .fno = gen_helper_gvec_ssra_h,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_ssra32_i32,
+          .fniv = gen_ssra_vec,
+          .fno = gen_helper_gvec_ssra_s,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_ssra64_i64,
+          .fniv = gen_ssra_vec,
+          .fno = gen_helper_gvec_ssra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
 
-const GVecGen2i ssra_op[4] = {
-    { .fni8 = gen_ssra8_i64,
-      .fniv = gen_ssra_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_ssra,
-      .vece = MO_8 },
-    { .fni8 = gen_ssra16_i64,
-      .fniv = gen_ssra_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_ssra,
-      .vece = MO_16 },
-    { .fni4 = gen_ssra32_i32,
-      .fniv = gen_ssra_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_ssra,
-      .vece = MO_32 },
-    { .fni8 = gen_ssra64_i64,
-      .fniv = gen_ssra_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .opt_opc = vecop_list_ssra,
-      .load_dest = true,
-      .vece = MO_64 },
-};
+    /* tszimm encoding produces immediates in the range [1..esize]. */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    /*
+     * Shifts larger than the element size are architecturally valid.
+     * Signed results in all sign bits.
+     */
+    shift = MIN(shift, (8 << vece) - 1);
+    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+}
 
 static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
@@ -3932,33 +3950,55 @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
     tcg_gen_add_vec(vece, d, d, a);
 }
 
-static const TCGOpcode vecop_list_usra[] = {
-    INDEX_op_shri_vec, INDEX_op_add_vec, 0
-};
+void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_usra8_i64,
+          .fniv = gen_usra_vec,
+          .fno = gen_helper_gvec_usra_b,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8, },
+        { .fni8 = gen_usra16_i64,
+          .fniv = gen_usra_vec,
+          .fno = gen_helper_gvec_usra_h,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16, },
+        { .fni4 = gen_usra32_i32,
+          .fniv = gen_usra_vec,
+          .fno = gen_helper_gvec_usra_s,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32, },
+        { .fni8 = gen_usra64_i64,
+          .fniv = gen_usra_vec,
+          .fno = gen_helper_gvec_usra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64, },
+    };
 
-const GVecGen2i usra_op[4] = {
-    { .fni8 = gen_usra8_i64,
-      .fniv = gen_usra_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_usra,
-      .vece = MO_8, },
-    { .fni8 = gen_usra16_i64,
-      .fniv = gen_usra_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_usra,
-      .vece = MO_16, },
-    { .fni4 = gen_usra32_i32,
-      .fniv = gen_usra_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_usra,
-      .vece = MO_32, },
-    { .fni8 = gen_usra64_i64,
-      .fniv = gen_usra_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_usra,
-      .vece = MO_64, },
-};
+    /* tszimm encoding produces immediates in the range [1..esize]. */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    /*
+     * Shifts larger than the element size are architecturally valid.
+     * Unsigned results in all zeros as input to accumulate: nop.
+     */
+    if (shift < (8 << vece)) {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    } else {
+        /* Nop, but we do need to clear the tail. */
+        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+    }
+}
 
 static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
@@ -5220,19 +5260,12 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case 1: /* VSRA */
                     /* Right shift comes here negative. */
                     shift = -shift;
-                    /* Shifts larger than the element size are architecturally
-                     * valid. Unsigned results in all zeros; signed results
-                     * in all sign bits.
-                     */
-                    if (!u) {
-                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
-                                        MIN(shift, (8 << size) - 1),
-                                        &ssra_op[size]);
-                    } else if (shift >= 8 << size) {
-                        /* rd += 0 */
+                    if (u) {
+                        gen_gvec_usra(size, rd_ofs, rm_ofs, shift,
+                                      vec_size, vec_size);
                     } else {
-                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
-                                        shift, &usra_op[size]);
+                        gen_gvec_ssra(size, rd_ofs, rm_ofs, shift,
+                                      vec_size, vec_size);
                     }
                     return 0;
 
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 3d534188a8..230085b35e 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -899,6 +899,31 @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn,
     clear_tail(d, oprsz, simd_maxsz(desc));
 }
 
+
+#define DO_SRA(NAME, TYPE)                              \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
+{                                                       \
+    intptr_t i, oprsz = simd_oprsz(desc);               \
+    int shift = simd_data(desc);                        \
+    TYPE *d = vd, *n = vn;                              \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
+        d[i] += n[i] >> shift;                          \
+    }                                                   \
+    clear_tail(d, oprsz, simd_maxsz(desc));             \
+}
+
+DO_SRA(gvec_ssra_b, int8_t)
+DO_SRA(gvec_ssra_h, int16_t)
+DO_SRA(gvec_ssra_s, int32_t)
+DO_SRA(gvec_ssra_d, int64_t)
+
+DO_SRA(gvec_usra_b, uint8_t)
+DO_SRA(gvec_usra_h, uint16_t)
+DO_SRA(gvec_usra_s, uint32_t)
+DO_SRA(gvec_usra_d, uint64_t)
+
+#undef DO_SRA
+
 /*
  * Convert float16 to float32, raising no exceptions and
  * preserving exceptional values, including SNaN.

From patchwork Wed May 13 16:32:31 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186651
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 02/16] target/arm: Create gen_gvec_{u,s}{rshr,rsra}
Date: Wed, 13 May 2020 09:32:31 -0700
Message-Id: <20200513163245.17915-3-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Create vectorized versions of handle_shri_with_rndacc
for shift+round and shift+round+accumulate. Add out-of-line
helpers in preparation for longer vector lengths from SVE.
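
For illustration only (this sketch is not part of the patch), the
shift+round operation can be modelled per element in plain C. It assumes
the usual definition of a rounding right shift, (x + (1 << (sh - 1))) >> sh,
and computes it as (x >> sh) + ((x >> (sh - 1)) & 1) so that no
intermediate sum can overflow the element. The names srshr8 and urshr8 are
hypothetical, and the signed variant assumes that >> on a negative value is
an arithmetic shift, which C leaves implementation-defined.

```c
#include <stdint.h>

/*
 * Hypothetical scalar model of a signed rounding shift right on one
 * 8-bit element, for 1 <= sh <= 8.  The operand is promoted to int,
 * so sh == 8 needs no special case: the shift yields 0 or -1 and the
 * rounding bit (the sign bit) brings the result back to 0.
 */
int8_t srshr8(int8_t x, unsigned sh)
{
    int rnd = (x >> (sh - 1)) & 1;      /* last bit shifted out */
    return (int8_t)((x >> sh) + rnd);   /* truncated shift plus rounding */
}

/*
 * Unsigned counterpart.  For sh == 8 the shifted value is 0 and the
 * result is just the most significant bit of the input.
 */
uint8_t urshr8(uint8_t x, unsigned sh)
{
    unsigned rnd = (x >> (sh - 1)) & 1;
    return (uint8_t)((x >> sh) + rnd);
}
```

The accumulating forms would then add this result to the destination
element with wraparound.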

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  20 ++
 target/arm/translate.h     |   9 +
 target/arm/translate-a64.c |  11 +-
 target/arm/translate.c     | 463 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c    |  50 ++++
 5 files changed, 527 insertions(+), 26 deletions(-)

--
2.20.1

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 9bc162345c..aeb1f52455 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -701,6 +701,26 @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 1839a59a8e..1db3b43a61 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -302,6 +302,15 @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 315de9a9b6..50949d306b 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -10218,10 +10218,15 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
         return;
 
     case 0x04: /* SRSHR / URSHR (rounding) */
-        break;
+        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
+        return;
+
     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        accumulate = true;
-        break;
+        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
+        return;
+
     default:
         g_assert_not_reached();
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index c18140f2e6..aa03dc236b 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4000,6 +4000,422 @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
     }
 }
 
+/*
+ * Shift one less than the requested amount, and the low bit is
+ * the rounding bit. For the 8 and 16-bit operations, because we
+ * mask the low bit, we can perform a normal integer shift instead
+ * of a vector shift.
+ */
+static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_sar8i_i64(d, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_sar16i_i64(d, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_extract_i32(t, a, sh - 1, 1);
+    tcg_gen_sari_i32(d, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_extract_i64(t, a, sh - 1, 1);
+    tcg_gen_sari_i64(d, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_shri_vec(vece, t, a, sh - 1);
+    tcg_gen_dupi_vec(vece, ones, 1);
+    tcg_gen_and_vec(vece, t, t, ones);
+    tcg_gen_sari_vec(vece, d, a, sh);
+    tcg_gen_add_vec(vece, d, d, t);
+
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(ones);
+}
+
+void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_srshr8_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_srshr16_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_srshr32_i32,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_srshr64_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    if (shift == (8 << vece)) {
+        /*
+         * Shifts larger than the element size are architecturally valid.
+         * Signed results in all sign bits. With rounding, this produces
+         * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
+         * I.e. always zero.
+         */
+        tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    gen_srshr8_i64(t, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    gen_srshr16_i64(t, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    gen_srshr32_i32(t, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    gen_srshr64_i64(t, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    gen_srshr_vec(vece, t, a, sh);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_srsra8_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fni8 = gen_srsra16_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_srsra32_i32,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_srsra64_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    /*
+     * Shifts larger than the element size are architecturally valid.
+     * Signed results in all sign bits. With rounding, this produces
+     * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
+     * I.e. always zero. With accumulation, this leaves D unchanged.
+     */
+    if (shift == (8 << vece)) {
+        /* Nop, but we do need to clear the tail. */
+        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_shr8i_i64(d, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_shr16i_i64(d, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_extract_i32(t, a, sh - 1, 1);
+    tcg_gen_shri_i32(d, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_extract_i64(t, a, sh - 1, 1);
+    tcg_gen_shri_i64(d, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_shri_vec(vece, t, a, shift - 1);
+    tcg_gen_dupi_vec(vece, ones, 1);
+    tcg_gen_and_vec(vece, t, t, ones);
+    tcg_gen_shri_vec(vece, d, a, shift);
+    tcg_gen_add_vec(vece, d, d, t);
+
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(ones);
+}
+
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_urshr8_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_urshr16_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_urshr32_i32,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_urshr64_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    if (shift == (8 << vece)) {
+        /*
+         * Shifts larger than the element size are architecturally valid.
+         * Unsigned results in zero. With rounding, this produces a
+         * copy of the most significant bit.
+         */
+        tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 8) {
+        tcg_gen_vec_shr8i_i64(t, a, 7);
+    } else {
+        gen_urshr8_i64(t, a, sh);
+    }
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 16) {
+        tcg_gen_vec_shr16i_i64(t, a, 15);
+    } else {
+        gen_urshr16_i64(t, a, sh);
+    }
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    if (sh == 32) {
+        tcg_gen_shri_i32(t, a, 31);
+    } else {
+        gen_urshr32_i32(t, a, sh);
+    }
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 64) {
+        tcg_gen_shri_i64(t, a, 63);
+    } else {
+        gen_urshr64_i64(t, a, sh);
+    }
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void
gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + if (sh == (8 << vece)) { + tcg_gen_shri_vec(vece, t, a, sh - 1); + } else { + gen_urshr_vec(vece, t, a, sh); + } + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ursra8_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_ursra16_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_ursra32_i32, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_ursra64_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} + static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { uint64_t mask = dup_const(MO_8, 0xff >> shift); @@ -5269,6 +5685,30 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } return 0; + case 2: /* VRSHR */ + /* Right shift comes here negative. */ + shift = -shift; + if (u) { + gen_gvec_urshr(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } else { + gen_gvec_srshr(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } + return 0; + + case 3: /* VRSRA */ + /* Right shift comes here negative. 
*/ + shift = -shift; + if (u) { + gen_gvec_ursra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } else { + gen_gvec_srsra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } + return 0; + case 4: /* VSRI */ if (!u) { return 1; @@ -5320,13 +5760,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) neon_load_reg64(cpu_V0, rm + pass); tcg_gen_movi_i64(cpu_V1, imm); switch (op) { - case 2: /* VRSHR */ - case 3: /* VRSRA */ - if (u) - gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1); - else - gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1); - break; case 6: /* VQSHLU */ gen_helper_neon_qshlu_s64(cpu_V0, cpu_env, cpu_V0, cpu_V1); @@ -5343,11 +5776,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) default: g_assert_not_reached(); } - if (op == 3) { - /* Accumulate. */ - neon_load_reg64(cpu_V1, rd + pass); - tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1); - } neon_store_reg64(cpu_V0, rd + pass); } else { /* size < 3 */ /* Operands in T0 and T1. */ @@ -5355,10 +5783,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) tmp2 = tcg_temp_new_i32(); tcg_gen_movi_i32(tmp2, imm); switch (op) { - case 2: /* VRSHR */ - case 3: /* VRSRA */ - GEN_NEON_INTEGER_OP(rshl); - break; case 6: /* VQSHLU */ switch (size) { case 0: @@ -5384,13 +5808,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) g_assert_not_reached(); } tcg_temp_free_i32(tmp2); - - if (op == 3) { - /* Accumulate. 
*/ - tmp2 = neon_load_reg(rd, pass); - gen_neon_add(size, tmp, tmp2); - tcg_temp_free_i32(tmp2); - } neon_store_reg(rd, pass, tmp); } } /* for pass */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 230085b35e..fd8b2bff49 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -924,6 +924,56 @@ DO_SRA(gvec_usra_d, uint64_t) #undef DO_SRA +#define DO_RSHR(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] = (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSHR(gvec_srshr_b, int8_t) +DO_RSHR(gvec_srshr_h, int16_t) +DO_RSHR(gvec_srshr_s, int32_t) +DO_RSHR(gvec_srshr_d, int64_t) + +DO_RSHR(gvec_urshr_b, uint8_t) +DO_RSHR(gvec_urshr_h, uint16_t) +DO_RSHR(gvec_urshr_s, uint32_t) +DO_RSHR(gvec_urshr_d, uint64_t) + +#undef DO_RSHR + +#define DO_RSRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] += (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSRA(gvec_srsra_b, int8_t) +DO_RSRA(gvec_srsra_h, int16_t) +DO_RSRA(gvec_srsra_s, int32_t) +DO_RSRA(gvec_srsra_d, int64_t) + +DO_RSRA(gvec_ursra_b, uint8_t) +DO_RSRA(gvec_ursra_h, uint16_t) +DO_RSRA(gvec_ursra_s, uint32_t) +DO_RSRA(gvec_ursra_d, uint64_t) + +#undef DO_RSRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. 
From patchwork Wed May 13 16:32:32 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186652
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 03/16] target/arm: Create gen_gvec_{sri,sli}
Date: Wed, 13 May 2020 09:32:32 -0700
Message-Id: <20200513163245.17915-4-richard.henderson@linaro.org>
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.

Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  10 ++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  20 +---
 target/arm/translate.c     | 186 +++++++++++++++++++++----------------
 target/arm/vec_helper.c    |  38 ++++++++
 5 files changed, 160 insertions(+), 101 deletions(-)

--
2.20.1

diff --git a/target/arm/helper.h b/target/arm/helper.h
index aeb1f52455..33c76192d2 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -721,6 +721,16 @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 1db3b43a61..fa5c3f12b9 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
 extern const GVecGen3 sshl_op[4];
 extern const GVecGen3 ushl_op[4];
-extern const GVecGen2i sri_op[4];
-extern const GVecGen2i sli_op[4];
 extern const GVecGen4 uqadd_op[4];
 extern const GVecGen4 sqadd_op[4];
 extern const GVecGen4 uqsub_op[4];
@@ -311,6 +309,11 @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                     int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 50949d306b..2d7dad6c3f 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -585,16 +585,6 @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
                    is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
 }
 
-/* Expand a 2-operand + immediate AdvSIMD vector operation using
- * an op descriptor.
- */
-static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd,
-                          int rn, int64_t imm, const GVecGen2i *gvec_op)
-{
-    tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                    is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op);
-}
-
 /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
 static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                          int rn, int rm, const GVecGen3 *gvec_op)
@@ -10191,12 +10181,9 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
         gen_gvec_fn2i(s, is_q, rd, rn, shift,
                       is_u ? gen_gvec_usra : gen_gvec_ssra, size);
         return;
+
     case 0x08: /* SRI */
-        /* Shift count same as element size is valid but does nothing. */
-        if (shift == 8 << size) {
-            goto done;
-        }
-        gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]);
+        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
         return;
 
     case 0x00: /* SSHR / USHR */
@@ -10247,7 +10234,6 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     }
     tcg_temp_free_i64(tcg_round);
 
- done:
     clear_vec_high(s, is_q, rd);
 }
 
@@ -10272,7 +10258,7 @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
     }
 
     if (insert) {
-        gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
+        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size);
     } else {
         gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index aa03dc236b..3c489852dc 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4454,47 +4454,62 @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 
 static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 {
-    if (sh == 0) {
-        tcg_gen_mov_vec(d, a);
-    } else {
-        TCGv_vec t = tcg_temp_new_vec_matching(d);
-        TCGv_vec m = tcg_temp_new_vec_matching(d);
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec m = tcg_temp_new_vec_matching(d);
 
-        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
-        tcg_gen_shri_vec(vece, t, a, sh);
-        tcg_gen_and_vec(vece, d, d, m);
-        tcg_gen_or_vec(vece, d, d, t);
+    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
+    tcg_gen_shri_vec(vece, t, a, sh);
+    tcg_gen_and_vec(vece, d, d, m);
+    tcg_gen_or_vec(vece, d, d, t);
 
-        tcg_temp_free_vec(t);
-        tcg_temp_free_vec(m);
-    }
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(m);
 }
 
-static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 };
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
+    const GVecGen2i ops[4] = {
+        { .fni8 = gen_shr8_ins_i64,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_b,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_shr16_ins_i64,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_h,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_shr32_ins_i32,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_s,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_shr64_ins_i64,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
 
-const GVecGen2i sri_op[4] = {
-    { .fni8 = gen_shr8_ins_i64,
-      .fniv = gen_shr_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_8 },
-    { .fni8 = gen_shr16_ins_i64,
-      .fniv = gen_shr_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_16 },
-    { .fni4 = gen_shr32_ins_i32,
-      .fniv = gen_shr_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_32 },
-    { .fni8 = gen_shr64_ins_i64,
-      .fniv = gen_shr_ins_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_64 },
-};
+    /* tszimm encoding produces immediates in the range [1..esize]. */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    /* Shift of esize leaves destination unchanged. */
+    if (shift < (8 << vece)) {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    } else {
+        /* Nop, but we do need to clear the tail. */
+        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+    }
+}
 
 static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
@@ -4532,47 +4547,60 @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 
 static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 {
-    if (sh == 0) {
-        tcg_gen_mov_vec(d, a);
-    } else {
-        TCGv_vec t = tcg_temp_new_vec_matching(d);
-        TCGv_vec m = tcg_temp_new_vec_matching(d);
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec m = tcg_temp_new_vec_matching(d);
 
-        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
-        tcg_gen_shli_vec(vece, t, a, sh);
-        tcg_gen_and_vec(vece, d, d, m);
-        tcg_gen_or_vec(vece, d, d, t);
+    tcg_gen_shli_vec(vece, t, a, sh);
+    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
+    tcg_gen_and_vec(vece, d, d, m);
+    tcg_gen_or_vec(vece, d, d, t);
 
-        tcg_temp_free_vec(t);
-        tcg_temp_free_vec(m);
-    }
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(m);
 }
 
-static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 };
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
+    const GVecGen2i ops[4] = {
+        { .fni8 = gen_shl8_ins_i64,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_b,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_shl16_ins_i64,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_h,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_shl32_ins_i32,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_s,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_shl64_ins_i64,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
 
-const GVecGen2i sli_op[4] = {
-    { .fni8 = gen_shl8_ins_i64,
-      .fniv = gen_shl_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_8 },
-    { .fni8 = gen_shl16_ins_i64,
-      .fniv = gen_shl_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_16 },
-    { .fni4 = gen_shl32_ins_i32,
-      .fniv = gen_shl_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_32 },
-    { .fni8 = gen_shl64_ins_i64,
-      .fniv = gen_shl_ins_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_64 },
-};
+    /* tszimm encoding produces immediates in the range [0..esize-1]. */
+    tcg_debug_assert(shift >= 0);
+    tcg_debug_assert(shift < (8 << vece));
+
+    if (shift == 0) {
+        tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
 
 static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 {
@@ -5715,20 +5743,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 }
                 /* Right shift comes here negative. */
                 shift = -shift;
-                /* Shift out of range leaves destination unchanged. */
-                if (shift < 8 << size) {
-                    tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
-                                    shift, &sri_op[size]);
-                }
+                gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
+                             vec_size, vec_size);
                 return 0;
 
             case 5: /* VSHL, VSLI */
                 if (u) { /* VSLI */
-                    /* Shift out of range leaves destination unchanged. */
-                    if (shift < 8 << size) {
-                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
-                                        vec_size, shift, &sli_op[size]);
-                    }
+                    gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
+                                 vec_size, vec_size);
                 } else { /* VSHL */
                     /* Shifts larger than the element size are
                      * architecturally valid and results in zero.
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index fd8b2bff49..096fea67ef 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -974,6 +974,44 @@ DO_RSRA(gvec_ursra_d, uint64_t)
 
 #undef DO_RSRA
 
+#define DO_SRI(NAME, TYPE) \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc); \
+    int shift = simd_data(desc); \
+    TYPE *d = vd, *n = vn; \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
+        d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \
+    } \
+    clear_tail(d, oprsz, simd_maxsz(desc)); \
+}
+
+DO_SRI(gvec_sri_b, uint8_t)
+DO_SRI(gvec_sri_h, uint16_t)
+DO_SRI(gvec_sri_s, uint32_t)
+DO_SRI(gvec_sri_d, uint64_t)
+
+#undef DO_SRI
+
+#define DO_SLI(NAME, TYPE) \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc); \
+    int shift = simd_data(desc); \
+    TYPE *d = vd, *n = vn; \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
+        d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \
+    } \
+    clear_tail(d, oprsz, simd_maxsz(desc)); \
+}
+
+DO_SLI(gvec_sli_b, uint8_t)
+DO_SLI(gvec_sli_h, uint16_t)
+DO_SLI(gvec_sli_s, uint32_t)
+DO_SLI(gvec_sli_d, uint64_t)
+
+#undef DO_SLI
+
 /*
  * Convert float16 to float32, raising no exceptions and
  * preserving exceptional values, including SNaN.
From patchwork Wed May 13 16:32:33 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186655
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 04/16] target/arm: Remove unnecessary range check for VSHL
Date: Wed, 13 May 2020 09:32:33 -0700
Message-Id: <20200513163245.17915-5-richard.henderson@linaro.org>
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

In 1dc8425e551, while converting to gvec, I added an extra range check
against the shift count. This was unnecessary because the encoding of
the shift count produces 0 to the element size - 1.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/translate.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

--
2.20.1

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 3c489852dc..2eec689c5e 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5752,16 +5752,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
                                      vec_size, vec_size);
                     } else { /* VSHL */
-                        /* Shifts larger than the element size are
-                         * architecturally valid and results in zero.
-                         */
-                        if (shift >= 8 << size) {
-                            tcg_gen_gvec_dup_imm(size, rd_ofs,
-                                                 vec_size, vec_size, 0);
-                        } else {
-                            tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
-                                              vec_size, vec_size);
-                        }
+                        tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
+                                          vec_size, vec_size);
                     }
                     return 0;
                 }

From patchwork Wed May 13 16:32:34 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186658
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 05/16] target/arm: Tidy handle_vec_simd_shri
Date: Wed, 13 May 2020 09:32:34 -0700
Message-Id: <20200513163245.17915-6-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Now that we've converted all cases to gvec, there is quite a bit of
dead code at the end of the function. Remove it.

Sink the call to gen_gvec_fn2i to the end, loading a function pointer
within the switch statement.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.c | 56 ++++++++++----------------------------
 1 file changed, 14 insertions(+), 42 deletions(-)

--
2.20.1

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 2d7dad6c3f..d5e77f34a7 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -10155,16 +10155,7 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     int size = 32 - clz32(immh) - 1;
     int immhb = immh << 3 | immb;
     int shift = 2 * (8 << size) - immhb;
-    bool accumulate = false;
-    int dsize = is_q ? 128 : 64;
-    int esize = 8 << size;
-    int elements = dsize/esize;
-    MemOp memop = size | (is_u ? 0 : MO_SIGN);
-    TCGv_i64 tcg_rn = new_tmp_a64(s);
-    TCGv_i64 tcg_rd = new_tmp_a64(s);
-    TCGv_i64 tcg_round;
-    uint64_t round_const;
-    int i;
+    GVecGen2iFn *gvec_fn;
 
     if (extract32(immh, 3, 1) && !is_q) {
         unallocated_encoding(s);
@@ -10178,13 +10169,12 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
 
     switch (opcode) {
     case 0x02: /* SSRA / USRA (accumulate) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
+        break;
 
     case 0x08: /* SRI */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
-        return;
+        gvec_fn = gen_gvec_sri;
+        break;
 
     case 0x00: /* SSHR / USHR */
         if (is_u) {
@@ -10192,49 +10182,31 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                 /* Shift count the same size as element size produces zero. */
                 tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
                                      is_q ? 16 : 8, vec_full_reg_size(s), 0);
-            } else {
-                gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
+                return;
             }
+            gvec_fn = tcg_gen_gvec_shri;
         } else {
             /* Shift count the same size as element size produces all sign. */
             if (shift == 8 << size) {
                 shift -= 1;
             }
-            gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
+            gvec_fn = tcg_gen_gvec_sari;
         }
-        return;
+        break;
 
     case 0x04: /* SRSHR / URSHR (rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
+        break;
 
     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
+        break;
 
     default:
         g_assert_not_reached();
     }
 
-    round_const = 1ULL << (shift - 1);
-    tcg_round = tcg_const_i64(round_const);
-
-    for (i = 0; i < elements; i++) {
-        read_vec_element(s, tcg_rn, rn, i, memop);
-        if (accumulate) {
-            read_vec_element(s, tcg_rd, rd, i, memop);
-        }
-
-        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
-                                accumulate, is_u, size, shift);
-
-        write_vec_element(s, tcg_rd, rd, i, size);
-    }
-    tcg_temp_free_i64(tcg_round);
-
-    clear_vec_high(s, is_q, rd);
+    gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
 }
 
 /* SHL/SLI - Vector shift left */

From patchwork Wed May 13 16:32:35 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186659
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 06/16] target/arm: Create gen_gvec_{ceq, clt, cle, cgt, cge}0
Date: Wed, 13 May 2020 09:32:35 -0700
Message-Id: <20200513163245.17915-7-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Macro-ize the 5 nearly identical comparisons. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.h | 16 ++- target/arm/translate-a64.c | 22 ++-- target/arm/translate.c | 254 ++++++++----------------------------- 3 files changed, 74 insertions(+), 218 deletions(-) -- 2.20.1 diff --git a/target/arm/translate.h b/target/arm/translate.h index fa5c3f12b9..e35c812cc5 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -275,11 +275,17 @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex) uint64_t vfp_expand_imm(int size, uint8_t imm8); /* Vector operations shared between ARM and AArch64. 
*/ -extern const GVecGen2 ceq0_op[4]; -extern const GVecGen2 clt0_op[4]; -extern const GVecGen2 cgt0_op[4]; -extern const GVecGen2 cle0_op[4]; -extern const GVecGen2 cge0_op[4]; +void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); + extern const GVecGen3 mla_op[4]; extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index d5e77f34a7..fef93dc27a 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -577,14 +577,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm, is_q ? 16 : 8, vec_full_reg_size(s)); } -/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */ -static void gen_gvec_op2(DisasContext *s, bool is_q, int rd, - int rn, const GVecGen2 *gvec_op) -{ - tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - is_q ? 16 : 8, vec_full_reg_size(s), gvec_op); -} - /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, int rn, int rm, const GVecGen3 *gvec_op) @@ -12310,13 +12302,21 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) } break; case 0x8: /* CMGT, CMGE */ - gen_gvec_op2(s, is_q, rd, rn, u ? 
&cge0_op[size] : &cgt0_op[size]); + if (u) { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size); + } else { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size); + } return; case 0x9: /* CMEQ, CMLE */ - gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]); + if (u) { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size); + } else { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size); + } return; case 0xa: /* CMLT */ - gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]); + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size); return; case 0xb: if (u) { /* ABS, NEG */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 2eec689c5e..010a158e63 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3645,204 +3645,59 @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn, return 1; } -static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero); - tcg_temp_free_vec(zero); -} +#define GEN_CMP0(NAME, COND) \ + static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a) \ + { \ + tcg_gen_setcondi_i32(COND, d, a, 0); \ + tcg_gen_neg_i32(d, d); \ + } \ + static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a) \ + { \ + tcg_gen_setcondi_i64(COND, d, a, 0); \ + tcg_gen_neg_i64(d, d); \ + } \ + static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \ + { \ + TCGv_vec zero = tcg_const_zeros_vec_matching(d); \ + tcg_gen_cmp_vec(COND, vece, d, a, zero); \ + tcg_temp_free_vec(zero); \ + } \ + void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m, \ + uint32_t opr_sz, uint32_t max_sz) \ + { \ + const GVecGen2 op[4] = { \ + { .fno = gen_helper_gvec_##NAME##0_b, 
\ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_8 }, \ + { .fno = gen_helper_gvec_##NAME##0_h, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_16 }, \ + { .fni4 = gen_##NAME##0_i32, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_32 }, \ + { .fni8 = gen_##NAME##0_i64, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .prefer_i64 = TCG_TARGET_REG_BITS == 64, \ + .vece = MO_64 }, \ + }; \ + tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]); \ + } static const TCGOpcode vecop_list_cmp[] = { INDEX_op_cmp_vec, 0 }; -const GVecGen2 ceq0_op[4] = { - { .fno = gen_helper_gvec_ceq0_b, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_ceq0_h, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_ceq0_i32, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_ceq0_i64, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; +GEN_CMP0(ceq, TCG_COND_EQ) +GEN_CMP0(cle, TCG_COND_LE) +GEN_CMP0(cge, TCG_COND_GE) +GEN_CMP0(clt, TCG_COND_LT) +GEN_CMP0(cgt, TCG_COND_GT) -static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cle0_op[4] = { - { .fno = gen_helper_gvec_cle0_b, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cle0_h, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cle0_i32, - .fniv = gen_cle0_vec, - 
.opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cle0_i64, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cge0_op[4] = { - { .fno = gen_helper_gvec_cge0_b, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cge0_h, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cge0_i32, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cge0_i64, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 clt0_op[4] = { - { .fno = gen_helper_gvec_clt0_b, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_clt0_h, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_clt0_i32, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_clt0_i64, - .fniv = gen_clt0_vec, - 
.opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cgt0_op[4] = { - { .fno = gen_helper_gvec_cgt0_b, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cgt0_h, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cgt0_i32, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cgt0_i64, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; +#undef GEN_CMP0 static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -6772,24 +6627,19 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; case NEON_2RM_VCEQ0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &ceq0_op[size]); + gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCGT0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cgt0_op[size]); + gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCLE0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cle0_op[size]); + gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCGE0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cge0_op[size]); + gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCLT0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &clt0_op[size]); + gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, 
vec_size); break; default: From patchwork Wed May 13 16:32:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 186661 Delivered-To: patch@linaro.org Received: by 2002:a92:5b0a:0:0:0:0:0 with SMTP id p10csp623226ilb; Wed, 13 May 2020 09:47:03 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxp3VdWrf+JTxQhZFr58HwrysPCT1NATCR3TXg6mVhy8pK5aIZ0btc0RleIfY+PH2VhVLC0 X-Received: by 2002:a37:a9d6:: with SMTP id s205mr586186qke.380.1589388422842; Wed, 13 May 2020 09:47:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1589388422; cv=none; d=google.com; s=arc-20160816; b=Cci89iAD4O4kKAn/GsVzRqaZJs8Ruv/eRmlDIK2KxfL3UWF1Gc7YHFzJq+pd6QOQWw APPhOfEZv/vsdhYH9BVk4p6t4gkkYNga6vKnqTcXewtDNlYTEwu75hmHOdP1rk+GBdyC 9PkWcP5GjOA46btPH/seZzlIxdu2acSqhE/wMrs8VLtOZ9m69sS4/L4z06z5VKThrpAh ki3rK0rJWC1+y1Ta/VW/EjfkwYmKkP2q/eV+HdQkMXkw2ENbHXW+X8nWxC7mOcIMJTyL cXLz42Mmdmg7yHD1NmT5g7nIs2lateoA9CxWcvSgYG5zolhNCSG/zB0lzRC/zLupimxt /o4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=R4kMTERo8NxbHW9D1mlQVpVF5MPRlqOT3EITQBFEuRw=; b=JNw/6I8mM+P4uFk5I5kPBhs1zyxWmrrODTOp2aswelWuI3jqmfUgdXRYGKfI3ysX8+ vMjAkk5DENm8xGAwROScyoVf/7tiXbgWcAaAUhL3x9V+349gr8xrfzDvDvR1LwLD32f5 +h6FB7gqhrTe5g7tJlYvrjmFv0Cz/TITBooJqjHIZpOr2GVZk6zCpb1D9oWuSqlqfnag CWfP6nZDV4G8rogyj7EpmAGtx4Fgf2BZYUly95L5iONWNAQ5/6duWfRkEQl3b5hjFNBV I64FOCJmSZxCMLr+pZg1Gw3u66RXcBX2cWeUKIzNDjpxs9aaTwhWV0MnH4LbWDBBvrHP RsYQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Fzg6MFP6; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) 
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 07/16] target/arm: Create gen_gvec_{mla,mls}
Date: Wed, 13 May 2020 09:32:36 -0700
Message-Id: <20200513163245.17915-8-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/translate.h          |   7 +-
 target/arm/translate-a64.c      |   4 +-
 target/arm/translate-neon.inc.c |  16 +----
 target/arm/translate.c          | 117 +++++++++++++++++---------------
 4 files changed, 71 insertions(+), 73 deletions(-)

--
2.20.1

diff --git a/target/arm/translate.h b/target/arm/translate.h
index e35c812cc5..9354ceba35 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -286,8 +286,11 @@ void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen3 mla_op[4];
-extern const GVecGen3 mls_op[4];
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 extern const GVecGen3 cmtst_op[4];
 extern const GVecGen3 sshl_op[4];
 extern const GVecGen3 ushl_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index fef93dc27a..ab9df12e44 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11226,9 +11226,9 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
         return;
     case 0x12: /* MLA, MLS */
         if (u) {
-            gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]);
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size);
         } else {
-            gen_gvec_op3(s, is_q, rd, rn, rm, &mla_op[size]);
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size);
         }
         return;
     case 0x11:
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 50b77b6d71..aefeff498a 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -632,6 +632,8 @@ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
 DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
 DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
+DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
+DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
 
 #define DO_3SAME_CMP(INSN, COND)                                        \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -685,20 +687,6 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
     return do_3same(s, a, gen_VMUL_p_3s);
 }
 
-#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY)                           \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-                                uint32_t oprsz, uint32_t maxsz)         \
-    {                                                                   \
-        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-                       oprsz, maxsz, &OPARRAY[vece]);                   \
-    }                                                                   \
-    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
-
-
-DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
-DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
-
 #define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 010a158e63..face89a1f7 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4520,62 +4520,69 @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 /* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
  * these tables are shared with AArch64 which does support them.
  */
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_mul_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fni4 = gen_mla8_i32,
+          .fniv = gen_mla_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni4 = gen_mla16_i32,
+          .fniv = gen_mla_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_mla32_i32,
+          .fniv = gen_mla_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_mla64_i64,
+          .fniv = gen_mla_vec,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
-static const TCGOpcode vecop_list_mla[] = {
-    INDEX_op_mul_vec, INDEX_op_add_vec, 0
-};
-
-static const TCGOpcode vecop_list_mls[] = {
-    INDEX_op_mul_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen3 mla_op[4] = {
-    { .fni4 = gen_mla8_i32,
-      .fniv = gen_mla_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mla,
-      .vece = MO_8 },
-    { .fni4 = gen_mla16_i32,
-      .fniv = gen_mla_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mla,
-      .vece = MO_16 },
-    { .fni4 = gen_mla32_i32,
-      .fniv = gen_mla_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mla,
-      .vece = MO_32 },
-    { .fni8 = gen_mla64_i64,
-      .fniv = gen_mla_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_mla,
-      .vece = MO_64 },
-};
-
-const GVecGen3 mls_op[4] = {
-    { .fni4 = gen_mls8_i32,
-      .fniv = gen_mls_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mls,
-      .vece = MO_8 },
-    { .fni4 = gen_mls16_i32,
-      .fniv = gen_mls_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mls,
-      .vece = MO_16 },
-    { .fni4 = gen_mls32_i32,
-      .fniv = gen_mls_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mls,
-      .vece = MO_32 },
-    { .fni8 = gen_mls64_i64,
-      .fniv = gen_mls_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_mls,
-      .vece = MO_64 },
-};
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_mul_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fni4 = gen_mls8_i32,
+          .fniv = gen_mls_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni4 = gen_mls16_i32,
+          .fniv = gen_mls_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_mls32_i32,
+          .fniv = gen_mls_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_mls64_i64,
+          .fniv = gen_mls_vec,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 /* CMTST : test is "if (X & Y != 0)". */
 static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)

From patchwork Wed May 13 16:32:37 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186662
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 08/16] target/arm: Swap argument order for VSHL during decode
Date: Wed, 13 May 2020 09:32:37 -0700
Message-Id: <20200513163245.17915-9-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Rather than perform the argument swap during code generation,
perform it during decode. This means it doesn't have to be
special cased later, and we can share code with aarch64 code
generation. Hopefully the decode comment addresses any confusion
that might arise in between.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/neon-dp.decode       | 17 +++++++++++++++--
 target/arm/translate-neon.inc.c |  3 +--
 2 files changed, 16 insertions(+), 4 deletions(-)

--
2.20.1

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index ec3a92fe75..593f7fff03 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -65,8 +65,21 @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
 VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
 VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
 
-VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
-VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
+# The _rev suffix indicates that Vn and Vm are reversed. This is
+# the case for shifts. In the Arm ARM these insns are documented
+# with the Vm and Vn fields in their usual places, but in the
+# assembly the operands are listed "backwards", ie in the order
+# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose
+# to consider Vm and Vn as being in different fields in the insn,
+# which allows us to avoid special-casing shifts in the trans_
+# function code. We would otherwise need to manually swap the operands
+# over to call Neon helper functions that are shared with AArch64,
+# which does not have this odd reversed-operand situation.
+@3same_rev       .... ... . . . size:2 .... .... .... . q:1 . . .... \
+                 &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
+
+VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
+VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
 
 VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
 VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index aefeff498a..416302bcc7 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -692,8 +692,7 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
                                 uint32_t oprsz, uint32_t maxsz)         \
     {                                                                   \
-        /* Note the operation is vshl vd,vm,vn */                       \
-        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
                        oprsz, maxsz, &OPARRAY[vece]);                   \
     }                                                                   \
     DO_3SAME(INSN, gen_##INSN##_3s)

From patchwork Wed May 13 16:32:38 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186654
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 09/16] target/arm: Create gen_gvec_{cmtst,ushl,sshl}
Date: Wed, 13 May 2020 09:32:38 -0700
Message-Id: <20200513163245.17915-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/translate.h          |  10 ++-
 target/arm/translate-a64.c      |  18 ++--
 target/arm/translate-neon.inc.c |  23 +----
 target/arm/translate.c          | 146 +++++++++++++++++---------------
 4 files changed, 95 insertions(+), 102 deletions(-)

--
2.20.1

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 9354ceba35..a02a54cabf 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -291,9 +291,13 @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen3 cmtst_op[4];
-extern const GVecGen3 sshl_op[4];
-extern const GVecGen3 ushl_op[4];
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 extern const GVecGen4 uqadd_op[4];
 extern const GVecGen4 sqadd_op[4];
 extern const GVecGen4 uqsub_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index ab9df12e44..3956c19ed8 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -577,15 +577,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
             is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
-/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
-static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
-                         int rn, int rm, const GVecGen3 *gvec_op)
-{
-    tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                   vec_full_reg_offset(s, rm), is_q ? 16 : 8,
-                   vec_full_reg_size(s), gvec_op);
-}
-
 /* Expand a 3-operand operation using an out-of-line helper. */
 static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
                              int rn, int rm, int data, gen_helper_gvec_3 *fn)
@@ -11193,8 +11184,11 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                        (u ? uqsub_op : sqsub_op) + size);
         return;
     case 0x08: /* SSHL, USHL */
-        gen_gvec_op3(s, is_q, rd, rn, rm,
-                     u ? &ushl_op[size] : &sshl_op[size]);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size);
+        }
         return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
@@ -11233,7 +11227,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
         return;
     case 0x11:
         if (!u) { /* CMTST */
-            gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]);
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
             return;
         }
         /* else CMEQ */
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 416302bcc7..e16475c212 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -603,6 +603,8 @@ DO_3SAME(VBIC, tcg_gen_gvec_andc)
 DO_3SAME(VORR, tcg_gen_gvec_or)
 DO_3SAME(VORN, tcg_gen_gvec_orc)
 DO_3SAME(VEOR, tcg_gen_gvec_xor)
+DO_3SAME(VSHL_S, gen_gvec_sshl)
+DO_3SAME(VSHL_U, gen_gvec_ushl)
 
 /* These insns are all gvec_bitsel but with the inputs in various orders. */
 #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -634,6 +636,7 @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
 DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
 DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
+DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
 
 #define DO_3SAME_CMP(INSN, COND)                                        \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -650,13 +653,6 @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
 DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
 DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 
-static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-                        uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
-{
-    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
-}
-DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
-
 #define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
@@ -686,16 +682,3 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
     }
     return do_3same(s, a, gen_VMUL_p_3s);
 }
-
-#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-                                uint32_t oprsz, uint32_t maxsz)         \
-    {                                                                   \
-        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-                       oprsz, maxsz, &OPARRAY[vece]);                   \
-    }                                                                   \
-    DO_3SAME(INSN, gen_##INSN##_3s)
-
-DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
-DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index face89a1f7..df91ff73e3 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4606,27 +4606,31 @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
     tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
 }
 
-static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 };
-
-const GVecGen3 cmtst_op[4] = {
-    { .fni4 = gen_helper_neon_tst_u8,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_8 },
-    { .fni4 = gen_helper_neon_tst_u16,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_16 },
-    { .fni4 = gen_cmtst_i32,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_32 },
-    { .fni8 = gen_cmtst_i64,
-      .fniv = gen_cmtst_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_64 },
-};
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
+    static const GVecGen3 ops[4] = {
+        { .fni4 = gen_helper_neon_tst_u8,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni4 = gen_helper_neon_tst_u16,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_cmtst_i32,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_cmtst_i64,
+          .fniv = gen_cmtst_vec,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 {
@@ -4744,29 +4748,33 @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
     tcg_temp_free_vec(rsh);
 }
 
-static const TCGOpcode ushl_list[] = {
-    INDEX_op_neg_vec, INDEX_op_shlv_vec,
-    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
-};
-
-const GVecGen3 ushl_op[4] = {
-    { .fniv = gen_ushl_vec,
-      .fno = gen_helper_gvec_ushl_b,
-      .opt_opc = ushl_list,
-      .vece = MO_8 },
-    { .fniv = gen_ushl_vec,
-      .fno = gen_helper_gvec_ushl_h,
-      .opt_opc = ushl_list,
-      .vece = MO_16 },
-    { .fni4 = gen_ushl_i32,
-      .fniv = gen_ushl_vec,
-      .opt_opc = ushl_list,
-      .vece = MO_32 },
-    { .fni8 = gen_ushl_i64,
-      .fniv = gen_ushl_vec,
-      .opt_opc = ushl_list,
-      .vece = MO_64 },
-};
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_neg_vec, INDEX_op_shlv_vec,
+        INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_ushl_vec,
+          .fno = gen_helper_gvec_ushl_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_ushl_vec,
+          .fno = gen_helper_gvec_ushl_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_ushl_i32,
+          .fniv = gen_ushl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_ushl_i64,
+          .fniv = gen_ushl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 {
@@ -4878,29 +4886,33 @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
     tcg_temp_free_vec(tmp);
 }
 
-static const TCGOpcode sshl_list[] = {
-    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
-    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
-};
-
-const GVecGen3 sshl_op[4] = {
-    { .fniv = gen_sshl_vec,
-      .fno = gen_helper_gvec_sshl_b,
-      .opt_opc = sshl_list,
-      .vece = MO_8 },
-    { .fniv = gen_sshl_vec,
-      .fno = gen_helper_gvec_sshl_h,
-      .opt_opc = sshl_list,
-      .vece = MO_16 },
-    { .fni4 = gen_sshl_i32,
-      .fniv = gen_sshl_vec,
-      .opt_opc = sshl_list,
-      .vece = MO_32 },
-    { .fni8 = gen_sshl_i64,
-      .fniv = gen_sshl_vec,
-      .opt_opc = sshl_list,
-      .vece = MO_64 },
-};
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
+        INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_sshl_vec,
+          .fno = gen_helper_gvec_sshl_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_sshl_vec,
+          .fno = gen_helper_gvec_sshl_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_sshl_i32,
+          .fniv = gen_sshl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_sshl_i64,
+          .fniv = gen_sshl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs,
rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) From patchwork Wed May 13 16:32:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 186657 Delivered-To: patch@linaro.org Received: by 2002:a92:5b0a:0:0:0:0:0 with SMTP id p10csp619322ilb; Wed, 13 May 2020 09:41:39 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzRVXr6+uxK0PaFArIpV4ui6C14AMflLzEs6GN11r3xGO2LYuXpe04n0Vzulnk8VA4zMc1J X-Received: by 2002:a05:620a:c0e:: with SMTP id l14mr550511qki.231.1589388099335; Wed, 13 May 2020 09:41:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1589388099; cv=none; d=google.com; s=arc-20160816; b=WF43ko8/KCspkbfNeTvgQPsrfrEsec+7CYe8KkILfmxetsjDdy47/Yi3I84hFnN696 TQE9rPBWQmViDw+hDkQqKzX4RnbhavBjC8V0ffuiFDMvfNBM+AflzV38TVGFvf6V6xKG 62gsCN/uX7BZ20Gykqc2+rLWyrLwUC0oez48nO9jQbKL4XySr2ZmATsT5pIDpgU+ClQ7 CvPOHQH/LAq0P59Ve5OfCAXqb00jXZS/B0xUyDnxzWKdUMKj2m/YnReyrcvNnGCXC9rS 2gDO9w5t5uD8RWEFASVm6TUotA3gf4If57huepC5fAiLzb9qk/czKuFo5uaCSQeXHLyx P2ig== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=skj7edYF/xFborw3YWQ6+XILqBi0CxcV8pcPVCKN/pk=; b=DREOhstEsVJlHmfEckPQUI6vFVC/kwLBQeioG/ud2kv8eL10cDPLDK9UjkXIRZMPM6 nN0rvhnjiHAjpG2o97k3XS7llf4Ghb/hPSXset9C8xiaKoZD6XVbWF+E2xRvj8wecmcv seaQOh/gKGBVPhwUItEq3HkBGZuE+KmTG7Lp0vj+/eMaDy/8PMS0mymjCuR56aZGKxJK k6amIQXyIDdRhoUNSijd7WP7I9YjM9Ng6i/GhdqWpBftWyJwHyAjmE0xK/wT49ARyG1b eSqasy8s7fYYd037FyxApnA3wdpdXu7FLdW3K1v9MUWJpD7iv+9fdo2VZuzh+feS383e OGqg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Cwft4MyW; spf=pass (google.com: domain of 
qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id w23si191284qtk.116.2020.05.13.09.41.39 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Wed, 13 May 2020 09:41:39 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Cwft4MyW; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:49702 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuRu-0008FM-O5 for patch@linaro.org; Wed, 13 May 2020 12:41:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48024) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJf-0003lX-4c for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:07 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:40274) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJa-00022f-9Y for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:06 -0400 Received: by mail-pj1-x1042.google.com with SMTP id fu13so11215888pjb.5 for ; Wed, 13 May 2020 09:33:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; 
bh=skj7edYF/xFborw3YWQ6+XILqBi0CxcV8pcPVCKN/pk=; b=Cwft4MyWGS5EkJ6akZStYYsNvTqVL1GFNIBdbKsSmsMh4qFa8blzv6nwcXJhAneuW8 hK7aNYUaR/YEp/jwS3SoRMxxxtZeYA+eBmWTFqRNX09w7w96fP4PnUVEfhx5jsOckSmb 5xnk8homEU7F3pdW0NUojX4/pTgVRKHXGWvuTQ0QuUJK21BbMkrxQsk6zg+8qEMIDvSu 3qrLsnN0zABzstxR/L6dLui2CRX7Kk8VswrrC5mNjPYKuVFa2hDjbbjKokcVRxyaYp88 2jgsV4APba0OvbvFSJSPEd1ItTGG560JPNEDdNLSPfZaMcirnxzGzWYGyPKzBrV2ude9 D/cg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=skj7edYF/xFborw3YWQ6+XILqBi0CxcV8pcPVCKN/pk=; b=cjDTrglOfK7qgeUXA0vAdZU+8rGOmPT7aamhXy5z+VUIUsD4kZQGxOKjOOBBxP58SG ZklWldYelimo4nQlgYvDDvmA6hKMeVD9macdD7TMlPaK9PCU9Q9NJPI9J99HF4h4ODiT N4sJjZC3fVk8sXxogF7O4o0csfrQrbAAODV4DFVYB0X7jCag3q/zbkd+8ibiVVeqsyku NFxeQ7hiPtEiZHGNVCUkngMQhxlXT+INx63ae3lPy1B5xP/eE3crQrIUqVZKsrNbrezz lwvBNMLQrjFzWuadSbOrBIEhQwKP10wVBHVQLMBDellkvl8u+kFAKJsmLzBwLMdvTGED y9mw== X-Gm-Message-State: AGi0PubSeESgLMADR2xVjY7o5diQPkiLRyOUU1ZPD4+mb2/Aoez6t1nv YkCi/fuk7Hm2Og7P7ewG+eIUHBqV4qU= X-Received: by 2002:a17:90a:1b67:: with SMTP id q94mr34892204pjq.84.1589387580340; Wed, 13 May 2020 09:33:00 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. 
[174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:59 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 10/16] target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub} Date: Wed, 13 May 2020 09:32:39 -0700 Message-Id: <20200513163245.17915-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. 
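
As a rough illustration of what "functional interface" means here, the sketch below models the pattern on single scalar elements rather than on real TCG vectors. The per-element-size table becomes a function-local static, and callers only ever see one function that takes vece. Every toy_* name and the ToyGen type are hypothetical and not part of QEMU; the real gen_gvec_*_qc functions emit TCG ops and pass the offset of vfp.qc instead of a bool pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor: one entry per element size (not QEMU's GVecGen4). */
typedef struct {
    uint64_t (*fn)(uint64_t a, uint64_t b, bool *qc);
    unsigned vece;              /* log2 of the element size in bytes */
} ToyGen;

/* Unsigned saturating add of two values that are each <= max. */
static uint64_t toy_uqadd(uint64_t a, uint64_t b, uint64_t max, bool *qc)
{
    uint64_t r = a + b;

    if (r < a || r > max) {     /* wrapped at 64 bits, or exceeded the lane */
        *qc = true;             /* sticky flag: only ever set, never cleared */
        return max;
    }
    return r;
}

static uint64_t toy_uqadd_b(uint64_t a, uint64_t b, bool *qc)
{
    return toy_uqadd(a, b, UINT8_MAX, qc);
}

static uint64_t toy_uqadd_h(uint64_t a, uint64_t b, bool *qc)
{
    return toy_uqadd(a, b, UINT16_MAX, qc);
}

static uint64_t toy_uqadd_s(uint64_t a, uint64_t b, bool *qc)
{
    return toy_uqadd(a, b, UINT32_MAX, qc);
}

static uint64_t toy_uqadd_d(uint64_t a, uint64_t b, bool *qc)
{
    return toy_uqadd(a, b, UINT64_MAX, qc);
}

/* The table is private to the function; only this entry point is exported. */
uint64_t toy_gvec_uqadd_qc(unsigned vece, uint64_t a, uint64_t b, bool *qc)
{
    static const ToyGen ops[4] = {
        { .fn = toy_uqadd_b, .vece = 0 },
        { .fn = toy_uqadd_h, .vece = 1 },
        { .fn = toy_uqadd_s, .vece = 2 },
        { .fn = toy_uqadd_d, .vece = 3 },
    };

    assert(vece < 4);
    return ops[vece].fn(a, b, qc);
}
```

With this shape a caller no longer needs to know that a table exists or how the extra QC operand is wired in; it passes vece and the operands, which is why the A64 and Neon decoders in this patch can share one function pointer per operation.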
Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/translate.h          |  13 +-
 target/arm/translate-a64.c      |  22 ++-
 target/arm/translate-neon.inc.c |  19 +--
 target/arm/translate.c          | 228 +++++++++++++++++---------------
 4 files changed, 147 insertions(+), 135 deletions(-)

-- 
2.20.1

diff --git a/target/arm/translate.h b/target/arm/translate.h
index a02a54cabf..4e1778c5e0 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -298,16 +298,21 @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen4 uqadd_op[4];
-extern const GVecGen4 sqadd_op[4];
-extern const GVecGen4 uqsub_op[4];
-extern const GVecGen4 sqsub_op[4];
 void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 3956c19ed8..ea5f6ceadc 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11168,20 +11168,18 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 
     switch (opcode) {
     case 0x01: /* SQADD, UQADD */
-        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
-                       offsetof(CPUARMState, vfp.qc),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm),
-                       is_q ? 16 : 8, vec_full_reg_size(s),
-                       (u ? uqadd_op : sqadd_op) + size);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size);
+        }
         return;
     case 0x05: /* SQSUB, UQSUB */
-        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
-                       offsetof(CPUARMState, vfp.qc),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm),
-                       is_q ? 16 : 8, vec_full_reg_size(s),
-                       (u ? uqsub_op : sqsub_op) + size);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size);
+        }
         return;
     case 0x08: /* SSHL, USHL */
         if (u) {
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index e16475c212..099491b16f 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -605,6 +605,10 @@ DO_3SAME(VORN, tcg_gen_gvec_orc)
 DO_3SAME(VEOR, tcg_gen_gvec_xor)
 DO_3SAME(VSHL_S, gen_gvec_sshl)
 DO_3SAME(VSHL_U, gen_gvec_ushl)
+DO_3SAME(VQADD_S, gen_gvec_sqadd_qc)
+DO_3SAME(VQADD_U, gen_gvec_uqadd_qc)
+DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc)
+DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc)
 
 /* These insns are all gvec_bitsel but with the inputs in various orders. */
 #define DO_3SAME_BITSEL(INSN, O1, O2, O3) \
@@ -653,21 +657,6 @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
 DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
 DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 
-#define DO_3SAME_GVEC4(INSN, OPARRAY) \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
-                                uint32_t rn_ofs, uint32_t rm_ofs, \
-                                uint32_t oprsz, uint32_t maxsz) \
-    { \
-        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), \
-                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]); \
-    } \
-    DO_3SAME(INSN, gen_##INSN##_3s)
-
-DO_3SAME_GVEC4(VQADD_S, sqadd_op)
-DO_3SAME_GVEC4(VQADD_U, uqadd_op)
-DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
-DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
-
 static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                           uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
 {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index df91ff73e3..7eb30cde60 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4925,32 +4925,37 @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_uqadd[] = {
-    INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-};
-
-const GVecGen4 uqadd_op[4] = {
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_b,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_8 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_h,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_16 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_s,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_32 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_d,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_64 },
-};
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_b,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_h,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_s,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_d,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -4963,32 +4968,37 @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_sqadd[] = {
-    INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-};
-
-const GVecGen4 sqadd_op[4] = {
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_b,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_h,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_s,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_d,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -5001,32 +5011,37 @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_uqsub[] = {
-    INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen4 uqsub_op[4] = {
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_b,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_h,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_s,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_d,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -5039,32 +5054,37 @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_sqsub[] = {
-    INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen4 sqsub_op[4] = {
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_b,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_h,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_s,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_d,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.

From patchwork Wed May 13 16:32:40 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186649
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH v4 11/16] target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
Date: Wed, 13 May 2020 09:32:40 -0700
Message-Id: <20200513163245.17915-12-richard.henderson@linaro.org>
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>

These operations do not touch fp_status.
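
To make the reasoning concrete: a helper that is a pure function of its integer input returns the same value whether or not a status pointer is handed to it, so the argument can be dropped without changing behaviour. The sketch below shows that with hypothetical toy_* functions and a made-up placeholder computation; it is not the actual recpe_u32 or rsqrte_u32 estimate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up integer-only computation standing in for the real estimate. */
static uint32_t toy_core(uint32_t a)
{
    if ((a & 0x80000000u) == 0) {
        return 0xffffffffu;
    }
    return (a >> 23) << 23;     /* placeholder: keep only the top 9 bits */
}

/* Old shape: the status pointer is accepted but never dereferenced. */
uint32_t toy_estimate_old(uint32_t a, void *fpstp)
{
    (void)fpstp;
    return toy_core(a);
}

/* New shape: the unused parameter is gone; the result is unchanged. */
uint32_t toy_estimate_new(uint32_t a)
{
    return toy_core(a);
}

/* Returns 1 when both shapes agree for input a, whatever pointer is passed. */
int toy_shapes_agree(uint32_t a)
{
    int dummy_status = 0;

    return toy_estimate_old(a, NULL) == toy_estimate_new(a)
        && toy_estimate_old(a, &dummy_status) == toy_estimate_new(a);
}
```

Callers then have one less temporary to allocate and free, which is what the translate.c hunk of this patch removes.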
Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  4 ++--
 target/arm/translate-a64.c |  5 ++---
 target/arm/translate.c     | 12 ++----------
 target/arm/vfp_helper.c    |  5 ++---
 4 files changed, 8 insertions(+), 18 deletions(-)

-- 
2.20.1

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 33c76192d2..aed3050965 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -211,8 +211,8 @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
 DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr)
 DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
 DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr)
-DEF_HELPER_2(recpe_u32, i32, i32, ptr)
-DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr)
+DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32)
+DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32)
 DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32)
 
 DEF_HELPER_3(shl_cc, i32, env, i32, i32)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index ea5f6ceadc..367fa403ae 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -9699,7 +9699,7 @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode,
 
             switch (opcode) {
             case 0x3c: /* URECPE */
-                gen_helper_recpe_u32(tcg_res, tcg_op, fpst);
+                gen_helper_recpe_u32(tcg_res, tcg_op);
                 break;
             case 0x3d: /* FRECPE */
                 gen_helper_recpe_f32(tcg_res, tcg_op, fpst);
@@ -12244,7 +12244,6 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
                 unallocated_encoding(s);
                 return;
             }
-            need_fpstatus = true;
             break;
         case 0x1e: /* FRINT32Z */
         case 0x1f: /* FRINT64Z */
@@ -12412,7 +12411,7 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
                     gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus);
                     break;
                 case 0x7c: /* URSQRTE */
-                    gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus);
+                    gen_helper_rsqrte_u32(tcg_res, tcg_op);
                     break;
                 case 0x1e: /* FRINT32Z */
                 case 0x5e: /* FRINT32X */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 7eb30cde60..391a09b439 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -6875,19 +6875,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             break;
                         }
                         case NEON_2RM_VRECPE:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            gen_helper_recpe_u32(tmp, tmp, fpstatus);
-                            tcg_temp_free_ptr(fpstatus);
+                            gen_helper_recpe_u32(tmp, tmp);
                             break;
-                        }
                         case NEON_2RM_VRSQRTE:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            gen_helper_rsqrte_u32(tmp, tmp, fpstatus);
-                            tcg_temp_free_ptr(fpstatus);
+                            gen_helper_rsqrte_u32(tmp, tmp);
                             break;
-                        }
                         case NEON_2RM_VRECPE_F:
                         {
                             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index 930d6e747f..ec007fce25 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -1023,9 +1023,8 @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp)
     return make_float64(val);
 }
 
-uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
+uint32_t HELPER(recpe_u32)(uint32_t a)
 {
-    /* float_status *s = fpstp; */
     int input, estimate;
 
     if ((a & 0x80000000) == 0) {
@@ -1038,7 +1037,7 @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp)
     return deposit32(0, (32 - 9), 9, estimate);
 }
 
-uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp)
+uint32_t HELPER(rsqrte_u32)(uint32_t a)
 {
     int estimate;
 

From patchwork Wed May 13 16:32:41 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186650
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 12/16] target/arm: Create gen_gvec_{qrdmla,qrdmls}
Date: Wed, 13 May 2020 09:32:41 -0700
Message-Id: <20200513163245.17915-13-richard.henderson@linaro.org>
In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org>
References: <20200513163245.17915-1-richard.henderson@linaro.org>
X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.h | 5 ++++ target/arm/translate-a64.c | 34 ++---------------------- target/arm/translate.c | 54 +++++++++++++++++++------------------- 3 files changed, 34 insertions(+), 59 deletions(-) -- 2.20.1 diff --git a/target/arm/translate.h b/target/arm/translate.h index 4e1778c5e0..aea8a9759d 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -332,6 +332,11 @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 367fa403ae..4577df3cf4 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -587,18 +587,6 @@ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd, is_q ? 
16 : 8, vec_full_reg_size(s), data, fn); } -/* Expand a 3-operand + env pointer operation using - * an out-of-line helper. - */ -static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd, - int rn, int rm, gen_helper_gvec_3_ptr *fn) -{ - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), cpu_env, - is_q ? 16 : 8, vec_full_reg_size(s), 0, fn); -} - /* Expand a 3-operand + fpstatus pointer + simd data value operation using * an out-of-line helper. */ @@ -11693,29 +11681,11 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) switch (opcode) { case 0x0: /* SQRDMLAH (vector) */ - switch (size) { - case 1: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16); - break; - case 2: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32); - break; - default: - g_assert_not_reached(); - } + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size); return; case 0x1: /* SQRDMLSH (vector) */ - switch (size) { - case 1: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16); - break; - case 2: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32); - break; - default: - g_assert_not_reached(); - } + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size); return; case 0x2: /* SDOT / UDOT */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 391a09b439..39626e0df9 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3629,20 +3629,26 @@ static const uint8_t neon_2rm_sizes[] = { [NEON_2RM_VCVT_UF] = 0x4, }; - -/* Expand v8.1 simd helper. 
*/ -static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn, - int q, int rd, int rn, int rm) +void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { - if (dc_isar_feature(aa32_rdm, s)) { - int opr_sz = (1 + q) * 8; - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), - vfp_reg_offset(1, rn), - vfp_reg_offset(1, rm), cpu_env, - opr_sz, opr_sz, 0, fn); - return 0; - } - return 1; + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32 + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, + opr_sz, max_sz, 0, fns[vece - 1]); +} + +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32 + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, + opr_sz, max_sz, 0, fns[vece - 1]); } #define GEN_CMP0(NAME, COND) \ @@ -5197,13 +5203,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; /* VPADD */ } /* VQRDMLAH */ - switch (size) { - case 1: - return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16, - q, rd, rn, rm); - case 2: - return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32, - q, rd, rn, rm); + if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) { + gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + return 0; } return 1; @@ -5216,13 +5219,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; } /* VQRDMLSH */ - switch (size) { - case 1: - return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16, - q, rd, rn, rm); - case 2: - return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32, - q, rd, rn, rm); + if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) { + gen_gvec_sqrdmlsh_qc(size, 
rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + return 0; } return 1; From patchwork Wed May 13 16:32:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 186663 Delivered-To: patch@linaro.org Received: by 2002:a92:5b0a:0:0:0:0:0 with SMTP id p10csp627078ilb; Wed, 13 May 2020 09:52:45 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyj/NZFhwrxsL9UrhszHL7P2Z+BN6ahF9/1YvUr9GN6xTs7aAZytixeu1INQ29qn4AqrthM X-Received: by 2002:a37:6e42:: with SMTP id j63mr557381qkc.495.1589388764921; Wed, 13 May 2020 09:52:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1589388764; cv=none; d=google.com; s=arc-20160816; b=fpoRd6S/37QSVYiY7X39Xy0qquhHih4Nkz5xeLkX0W3/V/0alfpISKDs7PD0a0Q4b/ TAXW3Zq3KaSA8zqutnxFqEVT+dQbVLOzhX24ec7MTPYC4MIBuOoYmaRoXzP5OBzZ2w+7 bED8aFILVqaie/d4DVU/icFd+FK3OkmPRPl2HyZSnWLYmUIU3ctvPeD8A5jYjwvafNgY 4YpJ8swpMjMDI8D4nE/6XY+wVgjDH+ZytzCX4+kE7etuXr4nzoG/CJ0yebGti7233vtm O1e0O8hH33+hlK7B4+MOyLMP6w5vHXigtWOCA5eX4sfzgFRnHmR2OGyxLA9fsKsv0And X2kA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=oVwVd/Y2egFp31An/r8mYLXImVhx97R6Jw16V2BZktI=; b=Bv530Wz1Wb9j9LCQ88TO0/0Z6uhnxOlSRwnOfrXqUuo6PfuM6aUUcoULYEUXCJbHe0 8NimjQkYA7Mj94XV5AGNLWW+Wibcb6ki1X3B1bayY6o8jeWn4MTVwrFlmEhYrQzHK0Sk 382FRhnESkEwHdvF82rVER25H97AGvmmDZ/PGSb0dYvLRQOsFBDP3quL/imbsNcTdBzb YmeJPF+F7eli5fjiVVqixBe9ynv/UlI2ePlmyryguZjMi9P5zbvav9F45eI7LKUSMAmr CHWHorNvZOZrcyJeGBj9zrGu6MN/gA0CZOtAw2LVgRqUn0r9ML8vkur9FYwP/ICFGRLA 91Vg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=XOSOiSkN; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted 
sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id w14si4021qka.309.2020.05.13.09.52.44 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Wed, 13 May 2020 09:52:44 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=XOSOiSkN; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:55274 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuce-0007Pj-AZ for patch@linaro.org; Wed, 13 May 2020 12:52:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48030) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJg-0003m2-23 for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:09 -0400 Received: from mail-pl1-x62b.google.com ([2607:f8b0:4864:20::62b]:46225) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJe-00023P-AU for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:07 -0400 Received: by mail-pl1-x62b.google.com with SMTP id b12so30685plz.13 for ; Wed, 13 May 2020 09:33:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oVwVd/Y2egFp31An/r8mYLXImVhx97R6Jw16V2BZktI=; b=XOSOiSkNedW0jdr5QVyeVcqKdDBxdSz/KRpA0mrgmfRd7ex48UBbJYNb71LKTKlrjL 
+K6lTxZKDpMaE8B5KSR0Oj6elxlLbsdIfnfAmVUZuh+iYYgV/fa3OhifF4hKH8y9THl1 r7Sd6BApEtIeOqO81HRRxAxShZM7ZlKgC99C+35HVQcU0cFOznVHq0AFkS0VtwQaQm9z Lx0zWoukkKxu8KREjYD7gkc9f4wgClWYsx0i1BmzQNBjOta1+Bp62KtkbkMqzSEDTfih sZcGMDAEfHdXbVXDfJkciyE+/+QPUQMXdB81sO6ZDrpmUN14GbACrPSXPMNSSg4g02m4 5ooQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oVwVd/Y2egFp31An/r8mYLXImVhx97R6Jw16V2BZktI=; b=MJag6ehAn431a8tdT5L8BSDYaknMrvQkqfFsPD1QmfmDZnSAVogGvNzkKRQzwvNC6t lHG9Gm9ni7v6kuhLu88xIs1ceIaf+HjXdD2TNDse+wLEJTZeAASdDG3WYHsP87GlJLHm /M3I0D660qK+uKGkoGnPHAkjfRRSUneNKMdZwjTWdAiM7JO728U8f3z+d+PenQ8DABA6 2NpmE4eMMHcYdIjcPi2fYKni9o/tkk6i/R4y2z5nU/MqVO879Mkl7xpwDAnI8DmqRZKZ MBW+lWYdB6qlaHD9/fHBRwNtpmYt4tdnyibHoz2WY1qpJoHnc1OOkPSUfPsob/mC2i4G VeLw== X-Gm-Message-State: AOAM532NiiQwGj0F19GKULxgV+l5js1zfxlIXboGX3OfG6TN73wsQGO6 hyTGHpNe+gYaLX1q8YEVMEN5kUg7mJg= X-Received: by 2002:a17:902:6949:: with SMTP id k9mr30318plt.47.1589387584548; Wed, 13 May 2020 09:33:04 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 13/16] target/arm: Pass pointer to qc to qrdmla/qrdmls Date: Wed, 13 May 2020 09:32:42 -0700 Message-Id: <20200513163245.17915-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62b; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62b.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. 
That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Pass a pointer directly to env->vfp.qc[0], rather than env. This will allow SVE2, which does not modify QC, to pass a pointer to dummy storage. Change the return type of inl_qrdml.h_s16 to match the sense of the operation: signed. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.c | 18 ++++++++--- target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------ 2 files changed, 54 insertions(+), 34 deletions(-) -- 2.20.1 diff --git a/target/arm/translate.c b/target/arm/translate.c index 39626e0df9..21529a9b8f 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3629,6 +3629,18 @@ static const uint8_t neon_2rm_sizes[] = { [NEON_2RM_VCVT_UF] = 0x4, }; +static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz, + gen_helper_gvec_3_ptr *fn) +{ + TCGv_ptr qc_ptr = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc)); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr, + opr_sz, max_sz, 0, fn); + tcg_temp_free_ptr(qc_ptr); +} + void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { @@ -3636,8 +3648,7 @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32 }; tcg_debug_assert(vece >= 1 && vece <= 2); - 
tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, - opr_sz, max_sz, 0, fns[vece - 1]); + gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); } void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, @@ -3647,8 +3658,7 @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32 }; tcg_debug_assert(vece >= 1 && vece <= 2); - tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, - opr_sz, max_sz, 0, fns[vece - 1]); + gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); } #define GEN_CMP0(NAME, COND) \ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 096fea67ef..6aa2ca0827 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -36,8 +36,6 @@ #define H4(x) (x) #endif -#define SET_QC() env->vfp.qc[0] = 1 - static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz) { uint64_t *d = vd + opr_sz; @@ -49,8 +47,8 @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz) } /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */ -static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1, - int16_t src2, int16_t src3) +static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2, + int16_t src3, uint32_t *sat) { /* Simplify: * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16 @@ -60,7 +58,7 @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1, ret = ((int32_t)src3 << 15) + ret + (1 << 14); ret >>= 15; if (ret != (int16_t)ret) { - SET_QC(); + *sat = 1; ret = (ret < 0 ? 
-0x8000 : 0x7fff); } return ret; @@ -69,30 +67,30 @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1, uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1, uint32_t src2, uint32_t src3) { - uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3); - uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16); + uint32_t *sat = &env->vfp.qc[0]; + uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat); + uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat); return deposit32(e1, 16, 16, e2); } void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm, - void *ve, uint32_t desc) + void *vq, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); int16_t *d = vd; int16_t *n = vn; int16_t *m = vm; - CPUARMState *env = ve; uintptr_t i; for (i = 0; i < opr_sz / 2; ++i) { - d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]); + d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq); } clear_tail(d, opr_sz, simd_maxsz(desc)); } /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */ -static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1, - int16_t src2, int16_t src3) +static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2, + int16_t src3, uint32_t *sat) { /* Similarly, using subtraction: * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16 @@ -102,7 +100,7 @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1, ret = ((int32_t)src3 << 15) - ret + (1 << 14); ret >>= 15; if (ret != (int16_t)ret) { - SET_QC(); + *sat = 1; ret = (ret < 0 ? 
-0x8000 : 0x7fff); } return ret; @@ -111,85 +109,97 @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1, uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1, uint32_t src2, uint32_t src3) { - uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3); - uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16); + uint32_t *sat = &env->vfp.qc[0]; + uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat); + uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat); return deposit32(e1, 16, 16, e2); } void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm, - void *ve, uint32_t desc) + void *vq, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); int16_t *d = vd; int16_t *n = vn; int16_t *m = vm; - CPUARMState *env = ve; uintptr_t i; for (i = 0; i < opr_sz / 2; ++i) { - d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]); + d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq); } clear_tail(d, opr_sz, simd_maxsz(desc)); } /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ -uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1, - int32_t src2, int32_t src3) +static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2, + int32_t src3, uint32_t *sat) { /* Simplify similarly to int_qrdmlah_s16 above. */ int64_t ret = (int64_t)src1 * src2; ret = ((int64_t)src3 << 31) + ret + (1 << 30); ret >>= 31; if (ret != (int32_t)ret) { - SET_QC(); + *sat = 1; ret = (ret < 0 ? 
INT32_MIN : INT32_MAX); } return ret; } +uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1, + int32_t src2, int32_t src3) +{ + uint32_t *sat = &env->vfp.qc[0]; + return inl_qrdmlah_s32(src1, src2, src3, sat); +} + void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm, - void *ve, uint32_t desc) + void *vq, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); int32_t *d = vd; int32_t *n = vn; int32_t *m = vm; - CPUARMState *env = ve; uintptr_t i; for (i = 0; i < opr_sz / 4; ++i) { - d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]); + d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq); } clear_tail(d, opr_sz, simd_maxsz(desc)); } /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */ -uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1, - int32_t src2, int32_t src3) +static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2, + int32_t src3, uint32_t *sat) { /* Simplify similarly to int_qrdmlsh_s16 above. */ int64_t ret = (int64_t)src1 * src2; ret = ((int64_t)src3 << 31) - ret + (1 << 30); ret >>= 31; if (ret != (int32_t)ret) { - SET_QC(); + *sat = 1; ret = (ret < 0 ? 
INT32_MIN : INT32_MAX); } return ret; } +uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1, + int32_t src2, int32_t src3) +{ + uint32_t *sat = &env->vfp.qc[0]; + return inl_qrdmlsh_s32(src1, src2, src3, sat); +} + void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm, - void *ve, uint32_t desc) + void *vq, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); int32_t *d = vd; int32_t *n = vn; int32_t *m = vm; - CPUARMState *env = ve; uintptr_t i; for (i = 0; i < opr_sz / 4; ++i) { - d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]); + d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq); } clear_tail(d, opr_sz, simd_maxsz(desc)); } From patchwork Wed May 13 16:32:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 186653 Delivered-To: patch@linaro.org Received: by 2002:a92:5b0a:0:0:0:0:0 with SMTP id p10csp616748ilb; Wed, 13 May 2020 09:38:03 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyY29MSvh3TBtABOP7j78EZa9rgZDfE3yUjd9n8wX86wqVutUogz1X1/zwFfPwBfOtwVdDS X-Received: by 2002:a05:620a:142:: with SMTP id e2mr511623qkn.331.1589387883805; Wed, 13 May 2020 09:38:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1589387883; cv=none; d=google.com; s=arc-20160816; b=NvfWGT3Mjxr0GEH8LCGIvxulg1EZRLMd6PbXYsuR/8RoOtuSdXwiXFHiUSUcMYRxVD 5EB6XzV9jkKsDdn5mRmKqBSPnRm4oGnyvf8CfxuvlUPUnio7upClcHZe6jrYSg0s0saB /LEtWLqiZlBAbWgsURv9WKdBC5gMoimQsLFU349XSGwIrlJg1JXRG+a9r8PT2s9LmpfR FaPHng8F9NOVlieA000mmiPSaWUBbmYF8E1Fg7GJFeoDnlxYpp4i88sML9RRwcqgPhLE tIB8VwNhAZ6rwYPoRzMvaEdU+GPtLKr3czxqVbAPE7CUNHCT8QlX6hX0mGmN04KU7G9Y zvMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=zUV0ojLtw3LPZ1kr/+clAulYjRQCrH9O1wPeh4lYV08=; 
b=IZ3ifxuuI4VgSm/0gInLK/HJlCRd3e2jHUwqD8QO99w5k+4maFmhVW/O92VGuLUFLq BZl73VUgPHr1YDnTtM6luR4e4aEqrs2zmEGlrWiHtm+ZOWoNHMCM2oYrLC9qs/Imk2nX WgRpTXKeFA0a9N+ciohj/xAkq0lkRkqC+jvnzU0hIUxhEXzrQQ8KeqEatbhGJpkAcxJP itIizlqG13zZSLR1NrmYcQWHhI37Yjm3OK5touBJuW4HzpwmnZJGIOmV5+iaRc6a0gUj IqjrJEAmOJYYHE2SVtj0vC9twXpqHEI4husoJPGS89Z3c/s1p0WIZObR2NuzOJ8MRApV HuDw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=fUtoNy5w; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id w33si154029qtd.386.2020.05.13.09.38.03 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Wed, 13 May 2020 09:38:03 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=fUtoNy5w; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:36506 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuOR-0008PI-6c for patch@linaro.org; Wed, 13 May 2020 12:38:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48038) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJh-0003my-UK for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:11 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:36449) by 
eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJf-00023Z-OS for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:09 -0400 Received: by mail-pj1-x1044.google.com with SMTP id q24so11219843pjd.1 for ; Wed, 13 May 2020 09:33:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=zUV0ojLtw3LPZ1kr/+clAulYjRQCrH9O1wPeh4lYV08=; b=fUtoNy5wCQjWdkWWSLSHHPB8aJ5tVXP5PTWRvBowJoIsQ/1vb6a34/SYrwoujjuA5d +IMnBJLmJLatuIDiWYPGZdSGHghDATgrSsgr17VeRQxl+S2NApqx8GpuMsdzpc5G2kXu 1n1g7f4LKzvIMkNaWYfi/N2qibIjUXeW4x/4xzmolM6oAq147UUD49pgKQ8QUz+40wBd hjZwxEGryw0Nj6u5hJeX3ZoixztdjVEUOfBH1fGMWWsVDFq9SrLXKqoAEZdgseKhO8lg 4jezj+myFVla8Ck2Z1O+YWjIe8nSDcys13q5NcT+mZrWxYSAvkOuQq+bRi6Y7DK/xRey jC7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=zUV0ojLtw3LPZ1kr/+clAulYjRQCrH9O1wPeh4lYV08=; b=VUMyNcyA2lYgI75lW5onhPWf0I7GRqXM6soRs5qwLfl7GT19Ba7vaoTenFzOS33RtY /mP9E4ZfzsQc+fNnzGRwy2jR1wsef0Kdyj0Srg2S2FqE/SxfeuN5mINagxU3UlmMutTB VrGv+dNqBQfoh1aJQuupab5JG+qQ/6qpaWH+6TuobhNcaOLf0ZF4srnkvJoc2SthA3e8 3biaNbM8G4SELviR3JKcXRJWfW1H2Z8Bv7V934OLZs31AlrXN91YJuvxNjAp0ee1Heu8 R7ApOym3e9DCVi4v0VIE88Vft3hFGzfK0ZOqO+5cS8OS6FZj0QgdGeykw4JpdUY3Mh67 D6+A== X-Gm-Message-State: AGi0PubLXuPwVUTEVFGoCeeYGm8OIFaZkRQHw3A9oNEEbRucT2dUMxLJ c0w77TX3/7ECG+nmBekZCDfS/HguViM= X-Received: by 2002:a17:90b:3115:: with SMTP id gc21mr33829486pjb.183.1589387585798; Wed, 13 May 2020 09:33:05 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. 
[174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:04 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 14/16] target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_* Date: Wed, 13 May 2020 09:32:43 -0700 Message-Id: <20200513163245.17915-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1044; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1044.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-stable@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Must clear the tail for AdvSIMD when SVE is enabled. 

Fixes: ca40a6e6e39
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/vec_helper.c | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.20.1

diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 6aa2ca0827..a483841add 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -747,6 +747,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
             d[i + j] = TYPE##_mul(n[i + j], mm, stat);                     \
         }                                                                  \
     }                                                                      \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
 }
 
 DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
@@ -771,6 +772,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va,                  \
                                      mm, a[i + j], 0, stat);               \
         }                                                                  \
     }                                                                      \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
 }
 
 DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)

From patchwork Wed May 13 16:32:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186660
Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net.
[174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:06 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 15/16] target/arm: Vectorize SABD/UABD Date: Wed, 13 May 2020 09:32:44 -0700 Message-Id: <20200513163245.17915-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::641; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x641.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Include 64-bit element size in preparation for SVE2. 
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++ target/arm/translate.h | 5 ++ target/arm/translate-a64.c | 8 ++- target/arm/translate.c | 133 ++++++++++++++++++++++++++++++++++++- target/arm/vec_helper.c | 24 +++++++ 5 files changed, 176 insertions(+), 4 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index aed3050965..4678d3a6f4 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -731,6 +731,16 @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index aea8a9759d..bbfe3d1393 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -337,6 +337,11 @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); 
+ /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 4577df3cf4..54b06553a6 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -11190,6 +11190,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size); } return; + case 0xe: /* SABD, UABD */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size); + } + return; case 0x10: /* ADD, SUB */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size); @@ -11322,7 +11329,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genenvfn = fns[size][u]; break; } - case 0xe: /* SABD, UABD */ case 0xf: /* SABA, UABA */ { static NeonGenTwoOpFn * const fns[3][2] = { diff --git a/target/arm/translate.c b/target/arm/translate.c index 21529a9b8f..d288721c23 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -5102,6 +5102,126 @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sub_i32(t, a, b); + tcg_gen_sub_i32(d, b, a); + tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t); + tcg_temp_free_i32(t); +} + +static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t, a, b); + tcg_gen_sub_i64(d, b, a); + tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t); + tcg_temp_free_i64(t); +} + +static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_smin_vec(vece, t, a, b); + tcg_gen_smax_vec(vece, d, a, b); + tcg_gen_sub_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, 
uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_sabd_i32, + .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_sabd_i64, + .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sub_i32(t, a, b); + tcg_gen_sub_i32(d, b, a); + tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t); + tcg_temp_free_i32(t); +} + +static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t, a, b); + tcg_gen_sub_i64(d, b, a); + tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t); + tcg_temp_free_i64(t); +} + +static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_umin_vec(vece, t, a, b); + tcg_gen_umax_vec(vece, d, a, b); + tcg_gen_sub_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_h, + .opt_opc = vecop_list, + .vece = 
MO_16 }, + { .fni4 = gen_uabd_i32, + .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_uabd_i64, + .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. We process data in a mixture of 32-bit and 64-bit chunks. @@ -5236,6 +5356,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } return 1; + case NEON_3R_VABD: + if (u) { + gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } + return 0; + case NEON_3R_VADD_VSUB: case NEON_3R_LOGIC: case NEON_3R_VMAX: @@ -5380,9 +5510,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_3R_VQRSHL: GEN_NEON_INTEGER_OP_ENV(qrshl); break; - case NEON_3R_VABD: - GEN_NEON_INTEGER_OP(abd); - break; case NEON_3R_VABA: GEN_NEON_INTEGER_OP(abd); tcg_temp_free_i32(tmp2); diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index a483841add..a4972d02fc 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1407,3 +1407,27 @@ DO_CMP0(gvec_cgt0_h, int16_t, >) DO_CMP0(gvec_cge0_h, int16_t, >=) #undef DO_CMP0 + +#define DO_ABD(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + TYPE *d = vd, *n = vn, *m = vm; \ + \ + for (i = 0; i < opr_sz / sizeof(TYPE); ++i) { \ + d[i] = n[i] < m[i] ? 
m[i] - n[i] : n[i] - m[i]; \ + } \ + clear_tail(d, opr_sz, simd_maxsz(desc)); \ +} + +DO_ABD(gvec_sabd_b, int8_t) +DO_ABD(gvec_sabd_h, int16_t) +DO_ABD(gvec_sabd_s, int32_t) +DO_ABD(gvec_sabd_d, int64_t) + +DO_ABD(gvec_uabd_b, uint8_t) +DO_ABD(gvec_uabd_h, uint16_t) +DO_ABD(gvec_uabd_s, uint32_t) +DO_ABD(gvec_uabd_d, uint64_t) + +#undef DO_ABD From patchwork Wed May 13 16:32:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 186656 Delivered-To: patch@linaro.org Received: by 2002:a92:5b0a:0:0:0:0:0 with SMTP id p10csp617902ilb; Wed, 13 May 2020 09:39:40 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyLDh3VSwv6uxcfJEoYQD0s5clQhVutJ5sWMuG2nEwwrBtvwB0zglfp/nR0JoI9vRbaLFmt X-Received: by 2002:ac8:341d:: with SMTP id u29mr4398497qtb.282.1589387980831; Wed, 13 May 2020 09:39:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1589387980; cv=none; d=google.com; s=arc-20160816; b=tB7oRRrOimF7XzmOU0CVvCZoA9oPWih0AKfYv77eWl9wm9qb4oF3+D4cvV9CRD0Muf D2PkHuqLZaAMndDEvmBfRhbkLi56rb7mBujGoDbooQI/Zw04r7vX9daXPN+EhkAZ/ZEW rXC+Avtu9YhLfBuSZhj8goe9opp01FBsQZGfNilHIKRcnYoE1quZ6w+Loq5/NCPdApE8 /al1VJ0pi0pzN11Rspdtzy9jeB0SD7cxp++99lDDcgfidiYUzpOSQq1DDOCyNM79i30T 3O1Hxd+b67sfcLLY1cbqiuFpc+Mxy4G4LAH7pq36RoKKhXBEASv8aLk+A0p5dBO+9HaY 6SXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=hVHTEirUzJ20bAilpSh5nRxHdFXu/fWlu+BQCZdG8wI=; b=OGZXBHThYUWSqEmpfn5kpvJzOygXnYnKsFIJe5BbWX3b8WMHE3Z0h47cwBpuSw8WJQ 29mwB6cN7PZsQjhTVVBgtAIQr3nZmVkUb4vTKlxWuGoFit6ZxKa2BQq3TeMTQPUPQjDt BHiVothRoVnsUzxmcK1yIpbavVKWHc5T1/iZDJ9hCVZ0N/UyzeQyKyDNUs92FwZRukEt NxuFi//NoQwhatCCP7FrL0SUYVncgayhNjClVQQjBu4e4nlvBTdr6xWtW0pMam26kvFs 
z46jJYISDiwTq2tnipVuIPMQ7XB25fsG902+sXNSFFabWVkWQ0byOWAwOfyZNT4m9ADM CDCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=X2N1z92H; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id x42si181890qta.103.2020.05.13.09.39.40 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Wed, 13 May 2020 09:39:40 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=X2N1z92H; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:44226 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuQ0-0005nv-75 for patch@linaro.org; Wed, 13 May 2020 12:39:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48054) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJk-0003oq-EE for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:12 -0400 Received: from mail-pj1-x1041.google.com ([2607:f8b0:4864:20::1041]:51783) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJi-00024N-0K for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:12 -0400 Received: by mail-pj1-x1041.google.com with SMTP id mq3so11344061pjb.1 for ; Wed, 13 May 2020 09:33:09 
-0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=hVHTEirUzJ20bAilpSh5nRxHdFXu/fWlu+BQCZdG8wI=; b=X2N1z92Hh3WJ7vtCRPIyz5OW5ZQqXBvZcJuxOr1p03Ok1f3zjOOOhcm3QhtN9pf2n8 sJdYPuG8bPGGfm4MDawILYEODVnGPcyuOjT6Vj7gypDXPbBVCBRexH0A76ro2ttMtSjV NnOqtt/Qrq9LCW51kfQOpixKJrng2rjFZvsoHr3+z+OJabLvpBnlUA0agR66mZaDh+Ad GN9twoZhaL5SblFPpVUBznhhkvDVjJ8H3NYsfCKjaXct+Ya5yR6sZwAal5ECsuVNAYLq jHfEsJCYmTqMI8VTXzKJ7tO8kX00FhOPjzPtRDsHzGRZfAA7IAyIgFD3z5rOzoatNQhG 6Ibg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hVHTEirUzJ20bAilpSh5nRxHdFXu/fWlu+BQCZdG8wI=; b=PaxZLCjG6d9OHOoYXVSzFOBUQ1Qz6pSFJvOBevG1VFcXkXfVR3vypaJE4DHLGPyB3i N6/Sd6HUrc29ZqqApzPIMaWINqUn1IEbC7p46/JyL0Fc9vmo+0t+HADF/y/ZRd/vM6kB 6oBykdeQysKVlD+KYzNbkZHncMRvPk7atNIOKHAxkNux3AnmXzABfsQLYzKP32lqg9OW kAuT+eKpIZSGKV1SCXMgBESPB8bsmWGalKG71VWHf3sZU7kt8/EhCpuEzoinLU7MrtuN uDJtEUwO4zYAia8zTJWMQR/XLtD8z/41WdcqvESFxXFlsXglQJCaHoSdcdy8pr/jLz7k Gteg== X-Gm-Message-State: AOAM531XF7LIlewgEt+B9MVr+71/0TxOfl1VSEMbjUe/XtXA+E7dOWgT xkoSyv0jX+QoHsInsrpMk5ZLcmUQrFY= X-Received: by 2002:a17:902:8494:: with SMTP id c20mr2897plo.305.1589387588187; Wed, 13 May 2020 09:33:08 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. 
[174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:07 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 16/16] target/arm: Vectorize SABA/UABA Date: Wed, 13 May 2020 09:32:45 -0700 Message-Id: <20200513163245.17915-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1041; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1041.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Include 64-bit element size in preparation for SVE2. 
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 17 +++-- target/arm/translate.h | 5 ++ target/arm/neon_helper.c | 10 --- target/arm/translate-a64.c | 17 ++--- target/arm/translate.c | 134 +++++++++++++++++++++++++++++++++++-- target/arm/vec_helper.c | 24 +++++++ 6 files changed, 174 insertions(+), 33 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index 4678d3a6f4..1857f4ee46 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -284,13 +284,6 @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32) DEF_HELPER_2(neon_pmax_u16, i32, i32, i32) DEF_HELPER_2(neon_pmax_s16, i32, i32, i32) -DEF_HELPER_2(neon_abd_u8, i32, i32, i32) -DEF_HELPER_2(neon_abd_s8, i32, i32, i32) -DEF_HELPER_2(neon_abd_u16, i32, i32, i32) -DEF_HELPER_2(neon_abd_s16, i32, i32, i32) -DEF_HELPER_2(neon_abd_u32, i32, i32, i32) -DEF_HELPER_2(neon_abd_s32, i32, i32, i32) - DEF_HELPER_2(neon_shl_u16, i32, i32, i32) DEF_HELPER_2(neon_shl_s16, i32, i32, i32) DEF_HELPER_2(neon_rshl_u8, i32, i32, i32) @@ -741,6 +734,16 @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git 
a/target/arm/translate.h b/target/arm/translate.h index bbfe3d1393..c937dfe9bf 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -342,6 +342,11 @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index 448be93fa1..2ef75e04c8 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -576,16 +576,6 @@ NEON_POP(pmax_s16, neon_s16, 2) NEON_POP(pmax_u16, neon_u16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) \ - dest = (src1 > src2) ? (src1 - src2) : (src2 - src1) -NEON_VOP(abd_s8, neon_s8, 4) -NEON_VOP(abd_u8, neon_u8, 4) -NEON_VOP(abd_s16, neon_s16, 2) -NEON_VOP(abd_u16, neon_u16, 2) -NEON_VOP(abd_s32, neon_s32, 1) -NEON_VOP(abd_u32, neon_u32, 1) -#undef NEON_FN - #define NEON_FN(dest, src1, src2) do { \ int8_t tmp; \ tmp = (int8_t)src2; \ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 54b06553a6..991e451644 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -11197,6 +11197,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size); } return; + case 0xf: /* SABA, UABA */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size); + } + return; case 0x10: /* ADD, SUB */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size); @@ -11329,16 +11336,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t 
insn) genenvfn = fns[size][u]; break; } - case 0xf: /* SABA, UABA */ - { - static NeonGenTwoOpFn * const fns[3][2] = { - { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 }, - { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 }, - { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 }, - }; - genfn = fns[size][u]; - break; - } case 0x16: /* SQDMULH, SQRDMULH */ { static NeonGenTwoOpEnvFn * const fns[2][2] = { diff --git a/target/arm/translate.c b/target/arm/translate.c index d288721c23..e3d37ef2e9 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -5222,6 +5222,124 @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_sabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_sabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_sabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_smin_vec, INDEX_op_smax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_saba_i32, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { 
.fni8 = gen_saba_i64, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_uabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_uabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_uabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_umin_vec, INDEX_op_umax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_uaba_i32, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_uaba_i64, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. We process data in a mixture of 32-bit and 64-bit chunks. 
@@ -5366,6 +5484,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } return 0; + case NEON_3R_VABA: + if (u) { + gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } + return 0; + case NEON_3R_VADD_VSUB: case NEON_3R_LOGIC: case NEON_3R_VMAX: @@ -5510,12 +5638,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_3R_VQRSHL: GEN_NEON_INTEGER_OP_ENV(qrshl); break; - case NEON_3R_VABA: - GEN_NEON_INTEGER_OP(abd); - tcg_temp_free_i32(tmp2); - tmp2 = neon_load_reg(rd, pass); - gen_neon_add(size, tmp, tmp2); - break; case NEON_3R_VPMAX: GEN_NEON_INTEGER_OP(pmax); break; diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index a4972d02fc..fa33df859e 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1431,3 +1431,27 @@ DO_ABD(gvec_uabd_s, uint32_t) DO_ABD(gvec_uabd_d, uint64_t) #undef DO_ABD + +#define DO_ABA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + TYPE *d = vd, *n = vn, *m = vm; \ + \ + for (i = 0; i < opr_sz / sizeof(TYPE); ++i) { \ + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; \ + } \ + clear_tail(d, opr_sz, simd_maxsz(desc)); \ +} + +DO_ABA(gvec_saba_b, int8_t) +DO_ABA(gvec_saba_h, int16_t) +DO_ABA(gvec_saba_s, int32_t) +DO_ABA(gvec_saba_d, int64_t) + +DO_ABA(gvec_uaba_b, uint8_t) +DO_ABA(gvec_uaba_h, uint16_t) +DO_ABA(gvec_uaba_s, uint32_t) +DO_ABA(gvec_uaba_d, uint64_t) + +#undef DO_ABA