From patchwork Sat May 2 22:44:49 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186064
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH v2 01/15] target/arm: Create gen_gvec_[us]sra
Date: Sat, 2 May 2020 15:44:49 -0700
Message-Id: <20200502224503.2282-2-richard.henderson@linaro.org>
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>

The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef.

Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-op-gvec may decide to use an out-of-line helper due to longer vector lengths.
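For reference, the special cases being folded into gen_gvec_ssra and gen_gvec_usra can be sketched as a scalar model of a single 8-bit lane. This is only an illustration under stated assumptions: the names ref_ssra8 and ref_usra8 are hypothetical and do not appear in the patch, the shift immediate is assumed to lie in [1..8], and right-shifting a negative value is assumed to be an arithmetic shift (as it is with the compilers QEMU supports).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical scalar reference model for one 8-bit lane.
 * ref_ssra8/ref_usra8 are illustrative names, not part of the patch.
 * The shift immediate is assumed to be in the range [1..8].
 */
uint8_t ref_ssra8(uint8_t d, int8_t n, int shift)
{
    /* shift == 8 leaves only sign bits, the same result as shifting by 7. */
    if (shift > 7) {
        shift = 7;
    }
    /* Assumes >> on a negative value is an arithmetic shift. */
    return (uint8_t)(d + (n >> shift));
}

uint8_t ref_usra8(uint8_t d, uint8_t n, int shift)
{
    /* shift == 8 shifts every bit out, so nothing is accumulated. */
    if (shift >= 8) {
        return d;
    }
    return (uint8_t)(d + (n >> shift));
}
```

With this model, a signed shift by the full element size accumulates either 0 or -1, while the unsigned one leaves the destination lane unchanged. That matches the clamp in the signed expander and the move-only path in the unsigned one.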
Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++ target/arm/translate.h | 7 +- target/arm/translate-a64.c | 15 +--- target/arm/translate.c | 161 ++++++++++++++++++++++--------------- target/arm/vec_helper.c | 25 ++++++ 5 files changed, 139 insertions(+), 79 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index 5817626b20..9bc162345c 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -691,6 +691,16 @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 98b319f3f6..a39cf22666 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i ssra_op[4]; -extern const GVecGen2i usra_op[4]; extern const GVecGen2i sri_op[4]; extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; @@ -299,6 +297,11 @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + 
int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 010b36633e..03f4dc5805 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10205,19 +10205,8 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, switch (opcode) { case 0x02: /* SSRA / USRA (accumulate) */ - if (is_u) { - /* Shift count same as element size produces zero to add. */ - if (shift == 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]); - } else { - /* Shift count same as element size produces all sign to add. */ - if (shift == 8 << size) { - shift -= 1; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]); - } + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_usra : gen_gvec_ssra, size); return; case 0x08: /* SRI */ /* Shift count same as element size is valid but does nothing. 
*/ diff --git a/target/arm/translate.c b/target/arm/translate.c index a96899549b..04114906d7 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4146,33 +4146,51 @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_ssra[] = { - INDEX_op_sari_vec, INDEX_op_add_vec, 0 -}; +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ssra8_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_ssra16_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_ssra32_i32, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_ssra64_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; -const GVecGen2i ssra_op[4] = { - { .fni8 = gen_ssra8_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_8 }, - { .fni8 = gen_ssra16_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_16 }, - { .fni4 = gen_ssra32_i32, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_32 }, - { .fni8 = gen_ssra64_i64, - .fniv = gen_ssra_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list_ssra, - .load_dest = true, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. 
*/ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. + */ + shift = MIN(shift, (8 << vece) - 1); + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -4204,33 +4222,55 @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_usra[] = { - INDEX_op_shri_vec, INDEX_op_add_vec, 0 -}; +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_usra8_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8, }, + { .fni8 = gen_usra16_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16, }, + { .fni4 = gen_usra32_i32, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32, }, + { .fni8 = gen_usra64_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64, }, + }; -const GVecGen2i usra_op[4] = { - { .fni8 = gen_usra8_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_8, }, - { .fni8 = gen_usra16_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_16, }, - { .fni4 = gen_usra32_i32, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_32, }, - { .fni8 = gen_usra64_i64, - .fniv = gen_usra_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest 
= true, - .opt_opc = vecop_list_usra, - .vece = MO_64, }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in all zeros as input to accumulate: nop. + */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -5596,19 +5636,12 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case 1: /* VSRA */ /* Right shift comes here negative. */ shift = -shift; - /* Shifts larger than the element size are architecturally - * valid. Unsigned results in all zeros; signed results - * in all sign bits. - */ - if (!u) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - MIN(shift, (8 << size) - 1), - &ssra_op[size]); - } else if (shift >= 8 << size) { - /* rd += 0 */ + if (u) { + gen_gvec_usra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &usra_op[size]); + gen_gvec_ssra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } return 0; diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 3d534188a8..230085b35e 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -899,6 +899,31 @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn, clear_tail(d, oprsz, simd_maxsz(desc)); } + +#define DO_SRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] += n[i] >> shift; \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SRA(gvec_ssra_b, int8_t) 
+DO_SRA(gvec_ssra_h, int16_t) +DO_SRA(gvec_ssra_s, int32_t) +DO_SRA(gvec_ssra_d, int64_t) + +DO_SRA(gvec_usra_b, uint8_t) +DO_SRA(gvec_usra_h, uint16_t) +DO_SRA(gvec_usra_s, uint32_t) +DO_SRA(gvec_usra_d, uint64_t) + +#undef DO_SRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN.

From patchwork Sat May 2 22:44:50 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186063
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH v2 02/15] target/arm: Create gen_gvec_{u,s}{rshr,rsra}
Date: Sat, 2 May 2020 15:44:50 -0700
Message-Id: <20200502224503.2282-3-richard.henderson@linaro.org>
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>

Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vector lengths from SVE.
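As a sanity check on the shift+round form, the sketch below models one 8-bit lane and compares "shift, then add the bit just below the cut" against the textbook "add half, then shift" computed at wider precision. It is an illustration only: ref_urshr8, ref_srshr8 and count_rshr8_mismatches are hypothetical names that are not in the patch, the shift is assumed to lie in [1..8], and right-shifting a negative value is assumed to be arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar models for one 8-bit lane; shift assumed in [1..8]. */
uint8_t ref_urshr8(uint8_t n, int sh)
{
    if (sh == 8) {
        /* Everything is shifted out; only the rounding bit (bit 7) remains. */
        return n >> 7;
    }
    /* Plain shift plus the bit just below the cut, used as the rounding bit. */
    return (uint8_t)((n >> sh) + ((n >> (sh - 1)) & 1));
}

int8_t ref_srshr8(int8_t n, int sh)
{
    if (sh == 8) {
        /* All sign bits plus the rounding bit always cancel to zero. */
        return 0;
    }
    /* Assumes >> on a negative value is an arithmetic shift. */
    return (int8_t)((n >> sh) + ((n >> (sh - 1)) & 1));
}

/* Count inputs where the split form disagrees with add-half-then-shift. */
int count_rshr8_mismatches(void)
{
    int bad = 0;
    for (int sh = 1; sh <= 8; sh++) {
        int round = 1 << (sh - 1);
        for (int v = -128; v <= 127; v++) {
            int8_t s = (int8_t)v;
            uint8_t u = (uint8_t)(v & 0xff);
            bad += ref_srshr8(s, sh) != ((v + round) >> sh);
            bad += ref_urshr8(u, sh) != ((u + round) >> sh);
        }
    }
    return bad;
}
```

The accumulate variants would simply add these results to the destination lane. For the signed case with a shift equal to the element size, the addend is always zero, so the destination is left unchanged.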
Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 ++ target/arm/translate.h | 9 + target/arm/translate-a64.c | 11 +- target/arm/translate.c | 461 +++++++++++++++++++++++++++++++++++-- target/arm/vec_helper.c | 50 ++++ 5 files changed, 525 insertions(+), 26 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index 9bc162345c..aeb1f52455 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -701,6 +701,26 @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index a39cf22666..823821f82c 100644 --- 
a/target/arm/translate.h +++ b/target/arm/translate.h @@ -302,6 +302,15 @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 03f4dc5805..1ef05d5ce1 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10235,10 +10235,15 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, return; case 0x04: /* SRSHR / URSHR (rounding) */ - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_urshr : gen_gvec_srshr, size); + return; + case 0x06: /* SRSRA / URSRA (accum + rounding) */ - accumulate = true; - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_ursra : gen_gvec_srsra, size); + return; + default: g_assert_not_reached(); } diff --git a/target/arm/translate.c b/target/arm/translate.c index 04114906d7..d724022cb6 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4272,6 +4272,422 @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, } } +/* + * Shift one less than the requested amount, and the low bit is + * the rounding bit. For the 8 and 16-bit operations, because we + * mask the low bit, we can perform a normal integer shift instead + * of a vector shift. 
+ */ +static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_sar8i_i64(d, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_sar16i_i64(d, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_extract_i32(t, a, sh - 1, 1); + tcg_gen_sari_i32(d, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_extract_i64(t, a, sh - 1, 1); + tcg_gen_sari_i64(d, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, t, a, sh - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, t, ones); + tcg_gen_sari_vec(vece, d, a, sh); + tcg_gen_add_vec(vece, d, d, t); + + tcg_temp_free_vec(t); + tcg_temp_free_vec(ones); +} + +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srshr8_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_srshr16_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = 
gen_srshr32_i32, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_srshr64_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. + */ + tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr8_i64(t, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr16_i64(t, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + gen_srshr32_i32(t, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr64_i64(t, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + gen_srshr_vec(vece, t, a, sh); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + 
INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srsra8_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_srsra16_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_srsra32_i32, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_srsra64_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. With accumulation, this leaves D unchanged. + */ + if (shift == (8 << vece)) { + /* Nop, but we do need to clear the tail. 
*/ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_shr8i_i64(d, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_shr16i_i64(d, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_extract_i32(t, a, sh - 1, 1); + tcg_gen_shri_i32(d, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_extract_i64(t, a, sh - 1, 1); + tcg_gen_shri_i64(d, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, t, a, shift - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, t, ones); + tcg_gen_shri_vec(vece, d, a, shift); + tcg_gen_add_vec(vece, d, d, t); + + tcg_temp_free_vec(t); + tcg_temp_free_vec(ones); +} + +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_urshr8_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = 
gen_urshr16_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_urshr32_i32, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_urshr64_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in zero. With rounding, this produces a + * copy of the most significant bit. + */ + tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 8) { + tcg_gen_vec_shr8i_i64(t, a, 7); + } else { + gen_urshr8_i64(t, a, sh); + } + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 16) { + tcg_gen_vec_shr16i_i64(t, a, 15); + } else { + gen_urshr16_i64(t, a, sh); + } + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + if (sh == 32) { + tcg_gen_shri_i32(t, a, 31); + } else { + gen_urshr32_i32(t, a, sh); + } + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 64) { + tcg_gen_shri_i64(t, a, 63); + } else { + gen_urshr64_i64(t, a, sh); + } + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void 
gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + if (sh == (8 << vece)) { + tcg_gen_shri_vec(vece, t, a, sh - 1); + } else { + gen_urshr_vec(vece, t, a, sh); + } + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ursra8_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_ursra16_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_ursra32_i32, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_ursra64_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} + static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { uint64_t mask = dup_const(MO_8, 0xff >> shift); @@ -5645,6 +6061,28 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } return 0; + case 2: /* VRSHR */ + /* Right shift comes here negative. 
*/ + shift = -shift; + if (u) { + gen_gvec_urshr(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } else { + gen_gvec_srshr(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } + return 0; + + case 3: /* VRSRA */ + if (u) { + gen_gvec_ursra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } else { + gen_gvec_srsra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } + return 0; + case 4: /* VSRI */ if (!u) { return 1; @@ -5696,13 +6134,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) neon_load_reg64(cpu_V0, rm + pass); tcg_gen_movi_i64(cpu_V1, imm); switch (op) { - case 2: /* VRSHR */ - case 3: /* VRSRA */ - if (u) - gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1); - else - gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1); - break; case 6: /* VQSHLU */ gen_helper_neon_qshlu_s64(cpu_V0, cpu_env, cpu_V0, cpu_V1); @@ -5719,11 +6150,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) default: g_assert_not_reached(); } - if (op == 3) { - /* Accumulate. */ - neon_load_reg64(cpu_V1, rd + pass); - tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1); - } neon_store_reg64(cpu_V0, rd + pass); } else { /* size < 3 */ /* Operands in T0 and T1. */ @@ -5731,10 +6157,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) tmp2 = tcg_temp_new_i32(); tcg_gen_movi_i32(tmp2, imm); switch (op) { - case 2: /* VRSHR */ - case 3: /* VRSRA */ - GEN_NEON_INTEGER_OP(rshl); - break; case 6: /* VQSHLU */ switch (size) { case 0: @@ -5760,13 +6182,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) g_assert_not_reached(); } tcg_temp_free_i32(tmp2); - - if (op == 3) { - /* Accumulate. 
*/ - tmp2 = neon_load_reg(rd, pass); - gen_neon_add(size, tmp, tmp2); - tcg_temp_free_i32(tmp2); - } neon_store_reg(rd, pass, tmp); } } /* for pass */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 230085b35e..fd8b2bff49 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -924,6 +924,56 @@ DO_SRA(gvec_usra_d, uint64_t) #undef DO_SRA +#define DO_RSHR(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] = (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSHR(gvec_srshr_b, int8_t) +DO_RSHR(gvec_srshr_h, int16_t) +DO_RSHR(gvec_srshr_s, int32_t) +DO_RSHR(gvec_srshr_d, int64_t) + +DO_RSHR(gvec_urshr_b, uint8_t) +DO_RSHR(gvec_urshr_h, uint16_t) +DO_RSHR(gvec_urshr_s, uint32_t) +DO_RSHR(gvec_urshr_d, uint64_t) + +#undef DO_RSHR + +#define DO_RSRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] += (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSRA(gvec_srsra_b, int8_t) +DO_RSRA(gvec_srsra_h, int16_t) +DO_RSRA(gvec_srsra_s, int32_t) +DO_RSRA(gvec_srsra_d, int64_t) + +DO_RSRA(gvec_ursra_b, uint8_t) +DO_RSRA(gvec_ursra_h, uint16_t) +DO_RSRA(gvec_ursra_s, uint32_t) +DO_RSRA(gvec_ursra_d, uint64_t) + +#undef DO_RSRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. 
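The DO_RSHR helper above computes a rounding right shift as (tmp >> 1) + (tmp & 1), with tmp = n >> (shift - 1). As a rough, standalone sketch of that arithmetic on a single 8-bit lane (the names srshr8 and urshr8 are hypothetical and not part of the patch; like the helper, it assumes the compiler implements >> on negative values as an arithmetic shift):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Signed rounding shift right of one 8-bit lane.
 * Hypothetical illustration only; mirrors the DO_RSHR arithmetic.
 * Assumes >> on a negative value is an arithmetic shift.
 */
static int8_t srshr8(int8_t n, int shift)
{
    /* Same range that the tcg_debug_assert() calls enforce: [1..esize]. */
    assert(shift >= 1 && shift <= 8);
    int8_t tmp = n >> (shift - 1);      /* keep the rounding bit in bit 0 */
    return (int8_t)((tmp >> 1) + (tmp & 1));
}

/* Unsigned rounding shift right of one 8-bit lane. */
static uint8_t urshr8(uint8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    uint8_t tmp = n >> (shift - 1);
    return (uint8_t)((tmp >> 1) + (tmp & 1));
}
```

With shift equal to the element size (8 here), srshr8() returns 0 for every input, while urshr8() returns the most significant bit of the input. That seems to be why gen_gvec_srshr stores zero and gen_gvec_urshr emits a plain shift by esize - 1 in that case.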
From patchwork Sat May 2 22:44:51 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186071
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 03/15] target/arm: Create gen_gvec_{sri,sli}
Date: Sat, 2 May 2020 15:44:51 -0700
Message-Id: <20200502224503.2282-4-richard.henderson@linaro.org>
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef.

Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths.

Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 ++ target/arm/translate.h | 7 +- target/arm/translate-a64.c | 20 +--- target/arm/translate.c | 186 +++++++++++++++++++++---------------- target/arm/vec_helper.c | 38 ++++++++ 5 files changed, 160 insertions(+), 101 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index aeb1f52455..33c76192d2 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -721,6 +721,16 @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 823821f82c..7a2008f0dd 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i sri_op[4]; -extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; extern const GVecGen4 sqadd_op[4]; extern const GVecGen4 uqsub_op[4]; @@ -311,6 +309,11 @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t 
max_sz); +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 1ef05d5ce1..bc326dadda 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -602,16 +602,6 @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd, is_q ? 16 : 8, vec_full_reg_size(s), gvec_op); } -/* Expand a 2-operand + immediate AdvSIMD vector operation using - * an op descriptor. - */ -static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd, - int rn, int64_t imm, const GVecGen2i *gvec_op) -{ - tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op); -} - /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, int rn, int rm, const GVecGen3 *gvec_op) @@ -10208,12 +10198,9 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, gen_gvec_fn2i(s, is_q, rd, rn, shift, is_u ? gen_gvec_usra : gen_gvec_ssra, size); return; + case 0x08: /* SRI */ - /* Shift count same as element size is valid but does nothing. 
*/ - if (shift == 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]); + gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size); return; case 0x00: /* SSHR / USHR */ @@ -10264,7 +10251,6 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, } tcg_temp_free_i64(tcg_round); - done: clear_vec_high(s, is_q, rd); } @@ -10289,7 +10275,7 @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert, } if (insert) { - gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]); + gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size); } else { gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size); } diff --git a/target/arm/translate.c b/target/arm/translate.c index d724022cb6..f730eb5b75 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4726,47 +4726,62 @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) { - if (sh == 0) { - tcg_gen_mov_vec(d, a); - } else { - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); - tcg_gen_shri_vec(vece, t, a, sh); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); + tcg_gen_shri_vec(vece, t, a, sh); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); - tcg_temp_free_vec(t); - tcg_temp_free_vec(m); - } + tcg_temp_free_vec(t); + tcg_temp_free_vec(m); } -static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 }; +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = gen_shr8_ins_i64, + 
.fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shr16_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shr32_ins_i32, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shr64_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; -const GVecGen2i sri_op[4] = { - { .fni8 = gen_shr8_ins_i64, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_8 }, - { .fni8 = gen_shr16_ins_i64, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_16 }, - { .fni4 = gen_shr32_ins_i32, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_32 }, - { .fni8 = gen_shr64_ins_i64, - .fniv = gen_shr_ins_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* Shift of esize leaves destination unchanged. */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. 
*/ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -4804,47 +4819,60 @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) { - if (sh == 0) { - tcg_gen_mov_vec(d, a); - } else { - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); - tcg_gen_shli_vec(vece, t, a, sh); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); + tcg_gen_shli_vec(vece, t, a, sh); + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); - tcg_temp_free_vec(t); - tcg_temp_free_vec(m); - } + tcg_temp_free_vec(t); + tcg_temp_free_vec(m); } -static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 }; +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = gen_shl8_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shl16_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shl32_ins_i32, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shl64_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; -const GVecGen2i sli_op[4] = { - { .fni8 = gen_shl8_ins_i64, - .fniv = gen_shl_ins_vec, 
- .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_8 }, - { .fni8 = gen_shl16_ins_i64, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_16 }, - { .fni4 = gen_shl32_ins_i32, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_32 }, - { .fni8 = gen_shl64_ins_i64, - .fniv = gen_shl_ins_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [0..esize-1]. */ + tcg_debug_assert(shift >= 0); + tcg_debug_assert(shift < (8 << vece)); + + if (shift == 0) { + tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) { @@ -6089,20 +6117,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } /* Right shift comes here negative. */ shift = -shift; - /* Shift out of range leaves destination unchanged. */ - if (shift < 8 << size) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &sri_op[size]); - } + gen_gvec_sri(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); return 0; case 5: /* VSHL, VSLI */ if (u) { /* VSLI */ - /* Shift out of range leaves destination unchanged. */ - if (shift < 8 << size) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, - vec_size, shift, &sli_op[size]); - } + gen_gvec_sli(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { /* VSHL */ /* Shifts larger than the element size are * architecturally valid and results in zero. 
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index fd8b2bff49..096fea67ef 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -974,6 +974,44 @@ DO_RSRA(gvec_ursra_d, uint64_t) #undef DO_RSRA +#define DO_SRI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SRI(gvec_sri_b, uint8_t) +DO_SRI(gvec_sri_h, uint16_t) +DO_SRI(gvec_sri_s, uint32_t) +DO_SRI(gvec_sri_d, uint64_t) + +#undef DO_SRI + +#define DO_SLI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SLI(gvec_sli_b, uint8_t) +DO_SLI(gvec_sli_h, uint16_t) +DO_SLI(gvec_sli_s, uint32_t) +DO_SLI(gvec_sli_d, uint64_t) + +#undef DO_SLI + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. 
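The DO_SRI and DO_SLI helpers above express shift-and-insert as a bit-field deposit into the destination. A minimal sketch of the same idea on one 8-bit lane follows; deposit8, sri8 and sli8 are hypothetical names standing in for QEMU's deposit64() and the generated helpers, and the early return for shift == 8 is this sketch's assumption, mirroring the nop case that gen_gvec_sri handles before it calls the expander:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a deposit operation: replace `len` bits of
 * `val`, starting at bit `start`, with the low bits of `field`.
 */
static uint8_t deposit8(uint8_t val, int start, int len, uint8_t field)
{
    assert(start >= 0 && len > 0 && len <= 8 - start);
    uint8_t mask = (uint8_t)((0xffu >> (8 - len)) << start);
    return (uint8_t)((val & ~mask) | (((unsigned)field << start) & mask));
}

/* SRI: shift right and insert; the immediate is in [1..8] for this lane. */
static uint8_t sri8(uint8_t d, uint8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    if (shift == 8) {
        return d;               /* nothing inserted: destination unchanged */
    }
    return deposit8(d, 0, 8 - shift, (uint8_t)(n >> shift));
}

/* SLI: shift left and insert; the immediate is in [0..7] for this lane. */
static uint8_t sli8(uint8_t d, uint8_t n, int shift)
{
    assert(shift >= 0 && shift <= 7);
    return deposit8(d, shift, 8 - shift, n);
}
```

For example, sri8(0xAB, 0xFF, 4) keeps the top four bits of the destination and yields 0xAF, while sli8(0xAB, 0xFF, 4) keeps the low four bits and yields 0xFB. A shift of 0 in sli8() degenerates to a plain copy of the source, which matches the tcg_gen_gvec_mov special case in gen_gvec_sli.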
From patchwork Sat May 2 22:44:52 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186062
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 04/15] target/arm: Remove unnecessary range check for VSHL
Date: Sat, 2 May 2020 15:44:52 -0700
Message-Id: <20200502224503.2282-5-richard.henderson@linaro.org>
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

In 1dc8425e551, while converting to gvec, I added an extra range check against the shift count. This was unnecessary because the encoding of the shift count produces 0 to the element size - 1.

Signed-off-by: Richard Henderson
---
target/arm/translate.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
-- 2.20.1
diff --git a/target/arm/translate.c b/target/arm/translate.c index f730eb5b75..f082384117 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -6126,16 +6126,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) gen_gvec_sli(size, rd_ofs, rm_ofs, shift, vec_size, vec_size); } else { /* VSHL */ - /* Shifts larger than the element size are - * architecturally valid and results in zero. 
- */ - if (shift >= 8 << size) { - tcg_gen_gvec_dup_imm(size, rd_ofs, - vec_size, vec_size, 0); - } else { - tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift, - vec_size, vec_size); - } + tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } return 0; } From patchwork Sat May 2 22:44:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 186073 Delivered-To: patch@linaro.org Received: by 2002:a92:3d9a:0:0:0:0:0 with SMTP id k26csp2288283ilf; Sat, 2 May 2020 15:52:42 -0700 (PDT) X-Google-Smtp-Source: APiQypKpha13pQQaTc4jTt0c3S/6AB06LQvUgybf7K9y0HY58UFgerRnsZUTmtouDqQqZmVjGMQS X-Received: by 2002:ac8:44aa:: with SMTP id a10mr10661651qto.230.1588459962668; Sat, 02 May 2020 15:52:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1588459962; cv=none; d=google.com; s=arc-20160816; b=BWolvsiFhxvjAmu5kCASSzCTAJB6ezO4qzBOPgeD7z7P9LN/YlGPpiTu3t9kZzhnGw tEu3Ht1OmeB+jScSxiDiCfmP+5eNsJRahAWdPfzLN3cUXbSYBB306gLGLRp6GgRfYSdV xwehniJ9JbhELdhN9kwTU/0he4Umotbn3r49mwrLf0rfMmpvxAabeCFvq3ITIDW1hDDV N9twaUEAVm8kj7G32EvYyT/ZOeDzGCCBWlR4ct80w0/hpiqJb37btU94tBixPjMRRWSG KInBRF7Ar/D9LUuyheDlcjcTnrKqWBceqQo7OpKIIOy/f4FCk2F24C2wBRSdtZwiCq/t FQWw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=3UIiAqWxRQSzQef8dIuVVnVIZspSlNbG4WcR7Cqf/AI=; b=CnFgbqeDzion6qwIN55Sl2Tvd5JfAL9/62pqlMAC2C8vCPBIQDZ+i61VA/NZNg+As9 ZEYfXJujSZmNuVTghy9a79kgLAHfj/DJibtYR9o34uzL2qLU1ZgLuWpa44Si3c9WVeN7 BzRTffT27do2K6KaTDYFdoPRBsSnv/clNj/tXEn3yCwZCqpNKUo4mZeAoSg7nvtjQKTd uYubp7p6Z8orAsCxnMZv+tX8lBR+Wvs+A0vGb2JYOvMCGYDqFGEbzovbkYw98BcSzG8R /LpV866MErckD1zEQjw7rrzivZFyath6ieiZq76xvidLcvZqXOTyhVJf+6txZUGCDvqG iE9A== ARC-Authentication-Results: i=1; 
mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=C188vxqX; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:470:142::17]) by mx.google.com with ESMTPS id i186si4054136qke.308.2020.05.02.15.52.42 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Sat, 02 May 2020 15:52:42 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) client-ip=2001:470:142::17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=C188vxqX; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:58752 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jV0zy-0000nH-4N for patch@linaro.org; Sat, 02 May 2020 18:52:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:51566) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jV0sl-0003BH-9a for qemu-devel@nongnu.org; Sat, 02 May 2020 18:45:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.90_1) (envelope-from ) id 1jV0sk-0004c0-7a for qemu-devel@nongnu.org; Sat, 02 May 2020 18:45:14 -0400 Received: from mail-pl1-x643.google.com ([2607:f8b0:4864:20::643]:44523) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jV0sj-0004ZK-OV for qemu-devel@nongnu.org; Sat, 02 May 2020 18:45:13 -0400 Received: by 
mail-pl1-x643.google.com with SMTP id h11so5176587plr.11 for ; Sat, 02 May 2020 15:45:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=3UIiAqWxRQSzQef8dIuVVnVIZspSlNbG4WcR7Cqf/AI=; b=C188vxqXhvrX8VRnIDVlQkbXfft2SY0Pny/vL0K80OzUbMAU2BYM9Xe2aBXu3gI9Ju Gm2HOupePMmO7WOok8NpgPt13J89fcTcA8F6XCKBbQPWVMRNKr9YTa8MTjPPggFzYnVU 0a7wz9AYIchnqbEQhlwRTYbBqAL98nqpve2uHQX2bGovjz5lHSnZ8TiEtdnL5xdDRqi1 05Q1PjQVumWFI+NvR/skhdWPW6cJBtzfLDOzD6UDLAjlu96Mssydl1Peu06mqgwN2NaR xx+APlX65kDfVN/RVNgsshnn8FtNrArwjIDGHmU/InlznsHexzebApK7kq2Me2Uicp61 +MkA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3UIiAqWxRQSzQef8dIuVVnVIZspSlNbG4WcR7Cqf/AI=; b=dKNhFeYbX9htPYcnvEC+KwXZHMy6tjqE0QI2FlrzLOYmQFLdmt9Nv+J5tqw8bPoBN3 Q5hOjOReXB/PEw9hJrOa3v/PGk8cg7sdZOVWzzez4A2BhxpfFeIsNleX1vdgRVvxWgnh PfKFndLAhGl+cvNxtzM11kapuNybslSRlknS9SjQH9hWLcLsZ7Hqo5ov+o4ZbA+Q72Q7 epDUp3mtZoTLUil7w41EJZ7T7kIhaHoMedb3QtPRNkI3N90JiTr4NAoJ50E8MKJSt/S/ 10aUPgkBOFviz871ZD2/KYzHc/8n8Ab7KN+NJ1CsKHecozXyWzpo++eVbcTpMG4HRZ6b rmXQ== X-Gm-Message-State: AGi0PubzC+H2/vfRwSJkhed0pUkZudCBLzrAMwNApGYks78LK63BCCHj y49qNPqhjKVCTTufFONUlwjsRld/qUM= X-Received: by 2002:a17:902:7042:: with SMTP id h2mr3730319plt.204.1588459512053; Sat, 02 May 2020 15:45:12 -0700 (PDT) Received: from localhost.localdomain (174-21-149-226.tukw.qwest.net. 
[174.21.149.226]) by smtp.gmail.com with ESMTPSA id h5sm2956182pjv.4.2020.05.02.15.45.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 02 May 2020 15:45:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 05/15] target/arm: Tidy handle_vec_simd_shri Date: Sat, 2 May 2020 15:44:53 -0700 Message-Id: <20200502224503.2282-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org> References: <20200502224503.2282-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::643; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x643.google.com X-detected-operating-system: by eggs.gnu.org: Error: [-] PROGRAM ABORT : Malformed IPv6 address (bad octet value). Location : parse_addr6(), p0f-client.c:67 X-Received-From: 2607:f8b0:4864:20::643 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Now that we've converted all cases to gvec, there is quite a bit of dead code at the end of the function. Remove it. Sink the call to gen_gvec_fn2i to the end, loading a function pointer within the switch statement. Signed-off-by: Richard Henderson --- target/arm/translate-a64.c | 56 ++++++++++---------------------------- 1 file changed, 14 insertions(+), 42 deletions(-) -- 2.20.1 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index bc326dadda..5937069992 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10172,16 +10172,7 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, int size = 32 - clz32(immh) - 1; int immhb = immh << 3 | immb; int shift = 2 * (8 << size) - immhb; - bool accumulate = false; - int dsize = is_q ? 
128 : 64; - int esize = 8 << size; - int elements = dsize/esize; - MemOp memop = size | (is_u ? 0 : MO_SIGN); - TCGv_i64 tcg_rn = new_tmp_a64(s); - TCGv_i64 tcg_rd = new_tmp_a64(s); - TCGv_i64 tcg_round; - uint64_t round_const; - int i; + GVecGen2iFn *gvec_fn; if (extract32(immh, 3, 1) && !is_q) { unallocated_encoding(s); @@ -10195,13 +10186,12 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, switch (opcode) { case 0x02: /* SSRA / USRA (accumulate) */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, - is_u ? gen_gvec_usra : gen_gvec_ssra, size); - return; + gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra; + break; case 0x08: /* SRI */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size); - return; + gvec_fn = gen_gvec_sri; + break; case 0x00: /* SSHR / USHR */ if (is_u) { @@ -10209,49 +10199,31 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, /* Shift count the same size as element size produces zero. */ tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd), is_q ? 16 : 8, vec_full_reg_size(s), 0); - } else { - gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size); + return; } + gvec_fn = tcg_gen_gvec_shri; } else { /* Shift count the same size as element size produces all sign. */ if (shift == 8 << size) { shift -= 1; } - gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size); + gvec_fn = tcg_gen_gvec_sari; } - return; + break; case 0x04: /* SRSHR / URSHR (rounding) */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, - is_u ? gen_gvec_urshr : gen_gvec_srshr, size); - return; + gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr; + break; case 0x06: /* SRSRA / URSRA (accum + rounding) */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, - is_u ? gen_gvec_ursra : gen_gvec_srsra, size); - return; + gvec_fn = is_u ? 
gen_gvec_ursra : gen_gvec_srsra; + break; default: g_assert_not_reached(); } - round_const = 1ULL << (shift - 1); - tcg_round = tcg_const_i64(round_const); - - for (i = 0; i < elements; i++) { - read_vec_element(s, tcg_rn, rn, i, memop); - if (accumulate) { - read_vec_element(s, tcg_rd, rd, i, memop); - } - - handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round, - accumulate, is_u, size, shift); - - write_vec_element(s, tcg_rd, rd, i, size); - } - tcg_temp_free_i64(tcg_round); - - clear_vec_high(s, is_q, rd); + gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size); } /* SHL/SLI - Vector shift left */ From patchwork Sat May 2 22:44:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 186061 Delivered-To: patch@linaro.org Received: by 2002:a92:3d9a:0:0:0:0:0 with SMTP id k26csp2284096ilf; Sat, 2 May 2020 15:46:26 -0700 (PDT) X-Google-Smtp-Source: APiQypLUhs3UzHQYVMYR/wJUxUjgM/kipc6odnrGQqgCOMZYwbBm8s7WjRoJH2McX4q4KOfSbS2R X-Received: by 2002:a37:9e4a:: with SMTP id h71mr9544791qke.341.1588459586325; Sat, 02 May 2020 15:46:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1588459586; cv=none; d=google.com; s=arc-20160816; b=IOWstgYRbqnpqwiFP1D0h8/MB9Zr99rHQsoI9BaFV0ZKa8dclc+2Pkye/uedUR+HSj PIrn5V3gIgz0ISN/Xs8ajuNbsYEEleeMmk6QLeshaqvMhlJNr2eZzW6owtd6COGDfr00 tdffPgNhH8tu39Rj+pvdHI58LayUeeVFKzt/ZDpuZGdhB8bQhCXu+I1u7Vxjv8mcO4bf /dGcCqJQYwpptpe/Zcdw7uTyX2x222OQ7/J0Z1T+haQSUumENH8MhIXFQ37rJRJwpu+e I0aFODQl3bi3AoWPMI1PqWmbXc8ICpaGorDG4QM5FOHHoBvB1Rmu9foJjKGGH1fwFkds ip8A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=z1GrNs/INZtuwm/NWd3uanX5FvDBMbWqe89R1ET5KkI=; 
b=YPQ2idCirdnH1h3tXZMHCc1PvVJsu6nOAi3O5zm1uYco7t+6ktkGI7iY6TB/ZmelWe wYJw98k30o3yc4UrnFk0eE2vOkLEVQbl8V19K+dazUFiQE9G9RMUXkvDhOSFjM4DBCMF BvBztcdVAoG6+KFOmY8O6YuEqottBEF32pFOur2m5T9ek7GJ7a2D6wSsDuhhKAZ2PQ48 HWp8SRZamI+9+Ndl+BhInWXvW0rT7JsFIhzT/MyXFeK2DjaNAmEFXUiB09LV3PftBc0v bDIkhqHZA3d1NH08MaiNkzMLN10TQtLRkHU/vDKhk65KXo9+LvZujvApJxuV3LE3yrFf AI9Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=CifyB9et; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:470:142::17]) by mx.google.com with ESMTPS id a8si4453985qtm.141.2020.05.02.15.46.26 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Sat, 02 May 2020 15:46:26 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) client-ip=2001:470:142::17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=CifyB9et; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:59164 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jV0tt-0003Ff-Kc for patch@linaro.org; Sat, 02 May 2020 18:46:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:51590) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jV0sm-0003DL-Py for qemu-devel@nongnu.org; Sat, 02 May 2020 18:45:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 
4.90_1) (envelope-from ) id 1jV0sl-0004eb-ID for qemu-devel@nongnu.org; Sat, 02 May 2020 18:45:16 -0400 Received: from mail-pj1-x1041.google.com ([2607:f8b0:4864:20::1041]:40086) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jV0sl-0004ce-4X for qemu-devel@nongnu.org; Sat, 02 May 2020 18:45:15 -0400 Received: by mail-pj1-x1041.google.com with SMTP id fu13so1859724pjb.5 for ; Sat, 02 May 2020 15:45:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=z1GrNs/INZtuwm/NWd3uanX5FvDBMbWqe89R1ET5KkI=; b=CifyB9etfxF/QDkjcP6N+pO6NyJSrQC6kJQNsJSWZvoE2xSatxFf/mtKdjew1fzxDR 39qvqsGaXKj2+MuL7RsroIcASXmInud+o5UEPFCrsWIDGg/mjBghDXhenmTZkSR83e/w nbu+5ZyIThBy+l7wnOWh7BQZPzprc/9+JGZFQhv9Lv5OLdnay8H0RZ6LTVT9uXkgSMaV Gz1Sch2Lc9xAkOxBExRHrOL/hLxEk06LYC8WPNmTzH9FD2XygHbeYZ0lvl8R4zU1mRD7 wUk7EPCPbZ8kbh+ir61vxFIwEN0mKDEP7iGcf05WaF2maMoL2Jpc2mZzwDfA1EN2h9DR 7P0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=z1GrNs/INZtuwm/NWd3uanX5FvDBMbWqe89R1ET5KkI=; b=NXkiPTV6YH7CpRO44NI1dO1CEUbjA7a8sgQzNEULFL+4XEwuvmkW29FNJMJZAYVmAt 64GifGmngJgbmQIQfFUyaI2u7ltL9iO42Ck6DQABspn7Gp4Zr1iG0/YjIt2sfOXcH8iW Fvdiio60rG2quVvCNUTXB3EQU7mG+oM9LTZHJPlrJT3kjwdEh/oPHZHTkJcbbNDTSCp5 6hMOdUX+CjtqNJ30u25AEGbOGdr/3MY+o+BbWwRinDWbubysKEdQmiIyI3kPbftAc7H5 syKlIYn7eui9qnyk/YzmiFkCIuosO05FpOYuqXWHY3ZVJ5Mbsr7QMRgQfEaDNZorp0Iz s1Ew== X-Gm-Message-State: AGi0PuZSKeKn5PDrvnSm2NZWgodJ74DB6LOk8R1n6dlYBZOFn+BiDHJd Lk+OOXq+6f0ISNORr2Bczt72azacs40= X-Received: by 2002:a17:902:6947:: with SMTP id k7mr10640747plt.298.1588459513284; Sat, 02 May 2020 15:45:13 -0700 (PDT) Received: from localhost.localdomain (174-21-149-226.tukw.qwest.net. 
[174.21.149.226]) by smtp.gmail.com with ESMTPSA id h5sm2956182pjv.4.2020.05.02.15.45.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 02 May 2020 15:45:12 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 06/15] target/arm: Wrap vector compare zero GVecGen2 in GVecGen2Fn Date: Sat, 2 May 2020 15:44:54 -0700 Message-Id: <20200502224503.2282-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org> References: <20200502224503.2282-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1041; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1041.google.com X-detected-operating-system: by eggs.gnu.org: Error: [-] PROGRAM ABORT : Malformed IPv6 address (bad octet value). Location : parse_addr6(), p0f-client.c:67 X-Received-From: 2607:f8b0:4864:20::1041 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Macro-ize the 5 nearly identical comparisons. Signed-off-by: Richard Henderson --- target/arm/translate.h | 16 ++- target/arm/translate-a64.c | 22 ++-- target/arm/translate.c | 254 ++++++++----------------------------- 3 files changed, 74 insertions(+), 218 deletions(-) -- 2.20.1 diff --git a/target/arm/translate.h b/target/arm/translate.h index 7a2008f0dd..20ec9cedd7 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -275,11 +275,17 @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex) uint64_t vfp_expand_imm(int size, uint8_t imm8); /* Vector operations shared between ARM and AArch64. 
*/ -extern const GVecGen2 ceq0_op[4]; -extern const GVecGen2 clt0_op[4]; -extern const GVecGen2 cgt0_op[4]; -extern const GVecGen2 cle0_op[4]; -extern const GVecGen2 cge0_op[4]; +void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); + extern const GVecGen3 mla_op[4]; extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 5937069992..8208651394 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -594,14 +594,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm, is_q ? 16 : 8, vec_full_reg_size(s)); } -/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */ -static void gen_gvec_op2(DisasContext *s, bool is_q, int rd, - int rn, const GVecGen2 *gvec_op) -{ - tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - is_q ? 16 : 8, vec_full_reg_size(s), gvec_op); -} - /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, int rn, int rm, const GVecGen3 *gvec_op) @@ -12327,13 +12319,21 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) } break; case 0x8: /* CMGT, CMGE */ - gen_gvec_op2(s, is_q, rd, rn, u ? 
&cge0_op[size] : &cgt0_op[size]); + if (u) { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size); + } else { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size); + } return; case 0x9: /* CMEQ, CMLE */ - gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]); + if (u) { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size); + } else { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size); + } return; case 0xa: /* CMLT */ - gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]); + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size); return; case 0xb: if (u) { /* ABS, NEG */ diff --git a/target/arm/translate.c b/target/arm/translate.c index f082384117..b08c4a2527 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3917,204 +3917,59 @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn, return 1; } -static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero); - tcg_temp_free_vec(zero); -} +#define GEN_CMP0(NAME, COND) \ + static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a) \ + { \ + tcg_gen_setcondi_i32(COND, d, a, 0); \ + tcg_gen_neg_i32(d, d); \ + } \ + static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a) \ + { \ + tcg_gen_setcondi_i64(COND, d, a, 0); \ + tcg_gen_neg_i64(d, d); \ + } \ + static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \ + { \ + TCGv_vec zero = tcg_const_zeros_vec_matching(d); \ + tcg_gen_cmp_vec(COND, vece, d, a, zero); \ + tcg_temp_free_vec(zero); \ + } \ + void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m, \ + uint32_t opr_sz, uint32_t max_sz) \ + { \ + const GVecGen2 op[4] = { \ + { .fno = gen_helper_gvec_##NAME##0_b, 
\ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_8 }, \ + { .fno = gen_helper_gvec_##NAME##0_h, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_16 }, \ + { .fni4 = gen_##NAME##0_i32, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_32 }, \ + { .fni8 = gen_##NAME##0_i64, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .prefer_i64 = TCG_TARGET_REG_BITS == 64, \ + .vece = MO_64 }, \ + }; \ + tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]); \ + } static const TCGOpcode vecop_list_cmp[] = { INDEX_op_cmp_vec, 0 }; -const GVecGen2 ceq0_op[4] = { - { .fno = gen_helper_gvec_ceq0_b, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_ceq0_h, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_ceq0_i32, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_ceq0_i64, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; +GEN_CMP0(ceq, TCG_COND_EQ) +GEN_CMP0(cle, TCG_COND_LE) +GEN_CMP0(cge, TCG_COND_GE) +GEN_CMP0(clt, TCG_COND_LT) +GEN_CMP0(cgt, TCG_COND_GT) -static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cle0_op[4] = { - { .fno = gen_helper_gvec_cle0_b, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cle0_h, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cle0_i32, - .fniv = gen_cle0_vec, - 
.opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cle0_i64, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cge0_op[4] = { - { .fno = gen_helper_gvec_cge0_b, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cge0_h, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cge0_i32, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cge0_i64, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 clt0_op[4] = { - { .fno = gen_helper_gvec_clt0_b, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_clt0_h, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_clt0_i32, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_clt0_i64, - .fniv = gen_clt0_vec, - 
.opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cgt0_op[4] = { - { .fno = gen_helper_gvec_cgt0_b, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cgt0_h, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cgt0_i32, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cgt0_i64, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; +#undef GEN_CMP0 static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -7146,24 +7001,19 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; case NEON_2RM_VCEQ0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &ceq0_op[size]); + gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCGT0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cgt0_op[size]); + gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCLE0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cle0_op[size]); + gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCGE0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cge0_op[size]); + gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCLT0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &clt0_op[size]); + gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, 
vec_size); break; default:
From patchwork Sat May 2 22:44:55 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186066
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 07/15] target/arm: Wrap vector mla/mls GVecGen3 in GVecGen3Fn
Date: Sat, 2 May 2020 15:44:55 -0700
Message-Id: <20200502224503.2282-8-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations.
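
For readers who are new to the series, the shape of the change can be sketched outside of QEMU roughly as follows. This is a standalone illustration and not code from the patch: Desc3, expand3, Expand3Fn, gen_mla_sketch and dispatch3 are hypothetical stand-ins for GVecGen3, tcg_gen_gvec_3, GVecGen3Fn, gen_gvec_mla and gen_gvec_fn3, and the arithmetic only mimics multiply-accumulate on a single element.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a GVecGen3 descriptor: one entry per element size. */
typedef struct {
    uint64_t (*fn)(uint64_t d, uint64_t a, uint64_t b);
    unsigned bits;                      /* element width in bits */
} Desc3;

/* Hypothetical stand-in for tcg_gen_gvec_3(): apply one descriptor. */
static uint64_t expand3(uint64_t d, uint64_t a, uint64_t b, const Desc3 *op)
{
    uint64_t mask = op->bits == 64 ? UINT64_MAX
                                   : (UINT64_C(1) << op->bits) - 1;
    return op->fn(d & mask, a & mask, b & mask) & mask;
}

/* Multiply-accumulate on one element: d += a * b. */
static uint64_t mla(uint64_t d, uint64_t a, uint64_t b)
{
    return d + a * b;
}

/*
 * Functional interface: the descriptor table is private to the wrapper,
 * so callers pass an element-size index instead of a table pointer.
 */
uint64_t gen_mla_sketch(unsigned vece, uint64_t d, uint64_t a, uint64_t b)
{
    static const Desc3 ops[4] = {
        { mla, 8 }, { mla, 16 }, { mla, 32 }, { mla, 64 },
    };
    assert(vece < 4);
    return expand3(d, a, b, &ops[vece]);
}

/* Hypothetical analogue of the GVecGen3Fn function type. */
typedef uint64_t Expand3Fn(unsigned vece, uint64_t d, uint64_t a, uint64_t b);

/* Hypothetical analogue of gen_gvec_fn3(): callers hand over a function. */
uint64_t dispatch3(Expand3Fn *fn, unsigned vece,
                   uint64_t d, uint64_t a, uint64_t b)
{
    return fn(vece, d, a, b);
}
```

With this shape the table no longer needs an extern declaration in a shared header, and a caller such as dispatch3(gen_mla_sketch, 2, d, a, b) cannot index the table out of range without tripping the assertion in the wrapper.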
Signed-off-by: Richard Henderson --- target/arm/translate.h | 7 ++- target/arm/translate-a64.c | 4 +- target/arm/translate.c | 124 ++++++++++++++++++++----------------- 3 files changed, 74 insertions(+), 61 deletions(-) -- 2.20.1 diff --git a/target/arm/translate.h b/target/arm/translate.h index 20ec9cedd7..4fbcdf1294 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -286,8 +286,11 @@ void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); -extern const GVecGen3 mla_op[4]; -extern const GVecGen3 mls_op[4]; +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 8208651394..2b5ae4d43a 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -11243,9 +11243,9 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) return; case 0x12: /* MLA, MLS */ if (u) { - gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]); + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size); } else { - gen_gvec_op3(s, is_q, rd, rn, rm, &mla_op[size]); + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size); } return; case 0x11: diff --git a/target/arm/translate.c b/target/arm/translate.c index b08c4a2527..da807242ff 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4792,62 +4792,69 @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) /* Note that while NEON does not support VMLA and VMLS as 64-bit ops, * these tables are shared with AArch64 which does support them. 
*/ +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fni4 = gen_mla8_i32, + .fniv = gen_mla_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni4 = gen_mla16_i32, + .fniv = gen_mla_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_mla32_i32, + .fniv = gen_mla_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_mla64_i64, + .fniv = gen_mla_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} -static const TCGOpcode vecop_list_mla[] = { - INDEX_op_mul_vec, INDEX_op_add_vec, 0 -}; - -static const TCGOpcode vecop_list_mls[] = { - INDEX_op_mul_vec, INDEX_op_sub_vec, 0 -}; - -const GVecGen3 mla_op[4] = { - { .fni4 = gen_mla8_i32, - .fniv = gen_mla_vec, - .load_dest = true, - .opt_opc = vecop_list_mla, - .vece = MO_8 }, - { .fni4 = gen_mla16_i32, - .fniv = gen_mla_vec, - .load_dest = true, - .opt_opc = vecop_list_mla, - .vece = MO_16 }, - { .fni4 = gen_mla32_i32, - .fniv = gen_mla_vec, - .load_dest = true, - .opt_opc = vecop_list_mla, - .vece = MO_32 }, - { .fni8 = gen_mla64_i64, - .fniv = gen_mla_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_mla, - .vece = MO_64 }, -}; - -const GVecGen3 mls_op[4] = { - { .fni4 = gen_mls8_i32, - .fniv = gen_mls_vec, - .load_dest = true, - .opt_opc = vecop_list_mls, - .vece = MO_8 }, - { .fni4 = gen_mls16_i32, - .fniv = gen_mls_vec, - .load_dest = true, - .opt_opc = vecop_list_mls, - .vece = MO_16 }, - { .fni4 = gen_mls32_i32, - .fniv = gen_mls_vec, - .load_dest = true, - .opt_opc = vecop_list_mls, - .vece = MO_32 }, - { .fni8 = gen_mls64_i64, - .fniv = gen_mls_vec, - 
.prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_mls, - .vece = MO_64 }, -}; +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fni4 = gen_mls8_i32, + .fniv = gen_mls_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni4 = gen_mls16_i32, + .fniv = gen_mls_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_mls32_i32, + .fniv = gen_mls_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_mls64_i64, + .fniv = gen_mls_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} /* CMTST : test is "if (X & Y != 0)". */ static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) @@ -5529,8 +5536,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) return 0; case NEON_3R_VML: /* VMLA, VMLS */ - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size, - u ? 
&mls_op[size] : &mla_op[size]); + if (u) { + gen_gvec_mls(size, rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size); + } else { + gen_gvec_mla(size, rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size); + } return 0; case NEON_3R_VTST_VCEQ:
From patchwork Sat May 2 22:44:56 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186067
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 08/15] target/arm: Wrap vector cmtst/ushl/sshl GVecGen3 in GVecGen3Fn
Date: Sat, 2 May 2020 15:44:56 -0700
Message-Id: <20200502224503.2282-9-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations.
Signed-off-by: Richard Henderson --- target/arm/translate.h | 10 ++- target/arm/translate-a64.c | 18 ++--- target/arm/translate.c | 159 ++++++++++++++++++++----------------- 3 files changed, 101 insertions(+), 86 deletions(-) -- 2.20.1 diff --git a/target/arm/translate.h b/target/arm/translate.h index 4fbcdf1294..b3e47e7a7f 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -291,9 +291,13 @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); -extern const GVecGen3 cmtst_op[4]; -extern const GVecGen3 sshl_op[4]; -extern const GVecGen3 ushl_op[4]; +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + extern const GVecGen4 uqadd_op[4]; extern const GVecGen4 sqadd_op[4]; extern const GVecGen4 uqsub_op[4]; diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 2b5ae4d43a..2be6ab541e 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -594,15 +594,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm, is_q ? 16 : 8, vec_full_reg_size(s)); } -/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ -static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, - int rn, int rm, const GVecGen3 *gvec_op) -{ - tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), is_q ? 16 : 8, - vec_full_reg_size(s), gvec_op); -} - /* Expand a 3-operand operation using an out-of-line helper. 
*/ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd, int rn, int rm, int data, gen_helper_gvec_3 *fn) @@ -11210,8 +11201,11 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) (u ? uqsub_op : sqsub_op) + size); return; case 0x08: /* SSHL, USHL */ - gen_gvec_op3(s, is_q, rd, rn, rm, - u ? &ushl_op[size] : &sshl_op[size]); + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size); + } return; case 0x0c: /* SMAX, UMAX */ if (u) { @@ -11250,7 +11244,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) return; case 0x11: if (!u) { /* CMTST */ - gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]); + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size); return; } /* else CMEQ */ diff --git a/target/arm/translate.c b/target/arm/translate.c index da807242ff..e5aa78c88a 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4878,27 +4878,31 @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a); } -static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 }; - -const GVecGen3 cmtst_op[4] = { - { .fni4 = gen_helper_neon_tst_u8, - .fniv = gen_cmtst_vec, - .opt_opc = vecop_list_cmtst, - .vece = MO_8 }, - { .fni4 = gen_helper_neon_tst_u16, - .fniv = gen_cmtst_vec, - .opt_opc = vecop_list_cmtst, - .vece = MO_16 }, - { .fni4 = gen_cmtst_i32, - .fniv = gen_cmtst_vec, - .opt_opc = vecop_list_cmtst, - .vece = MO_32 }, - { .fni8 = gen_cmtst_i64, - .fniv = gen_cmtst_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list_cmtst, - .vece = MO_64 }, -}; +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 }; + static const GVecGen3 ops[4] = { + { .fni4 = gen_helper_neon_tst_u8, + .fniv = gen_cmtst_vec, + .opt_opc = 
vecop_list, + .vece = MO_8 }, + { .fni4 = gen_helper_neon_tst_u16, + .fniv = gen_cmtst_vec, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_cmtst_i32, + .fniv = gen_cmtst_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_cmtst_i64, + .fniv = gen_cmtst_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift) { @@ -5016,29 +5020,33 @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst, tcg_temp_free_vec(rsh); } -static const TCGOpcode ushl_list[] = { - INDEX_op_neg_vec, INDEX_op_shlv_vec, - INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0 -}; - -const GVecGen3 ushl_op[4] = { - { .fniv = gen_ushl_vec, - .fno = gen_helper_gvec_ushl_b, - .opt_opc = ushl_list, - .vece = MO_8 }, - { .fniv = gen_ushl_vec, - .fno = gen_helper_gvec_ushl_h, - .opt_opc = ushl_list, - .vece = MO_16 }, - { .fni4 = gen_ushl_i32, - .fniv = gen_ushl_vec, - .opt_opc = ushl_list, - .vece = MO_32 }, - { .fni8 = gen_ushl_i64, - .fniv = gen_ushl_vec, - .opt_opc = ushl_list, - .vece = MO_64 }, -}; +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_neg_vec, INDEX_op_shlv_vec, + INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_ushl_i32, + .fniv = gen_ushl_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_ushl_i64, + .fniv = gen_ushl_vec, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift) { @@ 
-5150,29 +5158,33 @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst, tcg_temp_free_vec(tmp); } -static const TCGOpcode sshl_list[] = { - INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec, - INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0 -}; - -const GVecGen3 sshl_op[4] = { - { .fniv = gen_sshl_vec, - .fno = gen_helper_gvec_sshl_b, - .opt_opc = sshl_list, - .vece = MO_8 }, - { .fniv = gen_sshl_vec, - .fno = gen_helper_gvec_sshl_h, - .opt_opc = sshl_list, - .vece = MO_16 }, - { .fni4 = gen_sshl_i32, - .fniv = gen_sshl_vec, - .opt_opc = sshl_list, - .vece = MO_32 }, - { .fni8 = gen_sshl_i64, - .fniv = gen_sshl_vec, - .opt_opc = sshl_list, - .vece = MO_64 }, -}; +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec, + INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_sshl_i32, + .fniv = gen_sshl_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_sshl_i64, + .fniv = gen_sshl_vec, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) @@ -5548,8 +5560,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) tcg_gen_gvec_cmp(TCG_COND_EQ, size, rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size); } else { /* VTST */ - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, - vec_size, vec_size, &cmtst_op[size]); + gen_gvec_cmtst(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); } return 0; @@ -5584,8 +5596,13 @@ static int disas_neon_data_insn(DisasContext 
*s, uint32_t insn) case NEON_3R_VSHL: /* Note the operation is vshl vd,vm,vn */ - tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size, - u ? &ushl_op[size] : &sshl_op[size]); + if (u) { + gen_gvec_ushl(size, rd_ofs, rm_ofs, rn_ofs, + vec_size, vec_size); + } else { + gen_gvec_sshl(size, rd_ofs, rm_ofs, rn_ofs, + vec_size, vec_size); + } return 0; }
From patchwork Sat May 2 22:44:57 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186070
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 09/15] target/arm: Wrap vector uqadd/sqadd/uqsub/sqsub GVecGen4 in GVecGen3Fn
Date: Sat, 2 May 2020 15:44:57 -0700
Message-Id: <20200502224503.2282-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations.
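
The _qc wrappers in this patch differ from the earlier ones in one respect: the underlying expansion takes four operands, the second being the offset of the cumulative saturation flag (vfp.qc), while the wrapper still exposes the three-operand signature. A standalone sketch of that idea follows. It is an illustration under assumptions and not QEMU code: CpuSketch, expand4_uqadd8 and uqadd8_qc_sketch are hypothetical names, and the state is modelled as a small struct of bytes instead of CPUARMState.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU state: a sticky saturation flag plus four 8-bit registers. */
typedef struct {
    uint8_t qc;
    uint8_t regs[4];
} CpuSketch;

/*
 * Hypothetical 4-operand expander, loosely analogous to tcg_gen_gvec_4():
 * every operand is a byte offset into the state structure.
 */
static void expand4_uqadd8(CpuSketch *env, size_t d_ofs, size_t qc_ofs,
                           size_t n_ofs, size_t m_ofs)
{
    uint8_t *base = (uint8_t *)env;
    unsigned sum = base[n_ofs] + base[m_ofs];

    assert(d_ofs < sizeof(*env) && qc_ofs < sizeof(*env));
    if (sum > UINT8_MAX) {
        sum = UINT8_MAX;        /* saturate the result */
        base[qc_ofs] |= 1;      /* sticky: set here, never cleared here */
    }
    base[d_ofs] = (uint8_t)sum;
}

/* 3-operand wrapper: the flag offset is supplied internally. */
void uqadd8_qc_sketch(CpuSketch *env, size_t d_ofs, size_t n_ofs, size_t m_ofs)
{
    expand4_uqadd8(env, d_ofs, offsetof(CpuSketch, qc), n_ofs, m_ofs);
}
```

Because the flag offset is fixed inside the wrapper, the saturating operations can be passed to the same three-operand dispatcher as the non-saturating ones, which is what lets the translate-a64.c hunks below replace the open-coded four-operand calls.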
Signed-off-by: Richard Henderson --- target/arm/translate.h | 13 +- target/arm/translate-a64.c | 22 ++-- target/arm/translate.c | 248 +++++++++++++++++++++---------------- 3 files changed, 157 insertions(+), 126 deletions(-) -- 2.20.1 diff --git a/target/arm/translate.h b/target/arm/translate.h index b3e47e7a7f..ada84d411d 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -298,16 +298,21 @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); -extern const GVecGen4 uqadd_op[4]; -extern const GVecGen4 sqadd_op[4]; -extern const GVecGen4 uqsub_op[4]; -extern const GVecGen4 sqsub_op[4]; void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 2be6ab541e..eeaa92b9f1 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -11185,20 +11185,18 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) switch (opcode) { 
case 0x01: /* SQADD, UQADD */ - tcg_gen_gvec_4(vec_full_reg_offset(s, rd), - offsetof(CPUARMState, vfp.qc), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), - is_q ? 16 : 8, vec_full_reg_size(s), - (u ? uqadd_op : sqadd_op) + size); + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size); + } return; case 0x05: /* SQSUB, UQSUB */ - tcg_gen_gvec_4(vec_full_reg_offset(s, rd), - offsetof(CPUARMState, vfp.qc), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), - is_q ? 16 : 8, vec_full_reg_size(s), - (u ? uqsub_op : sqsub_op) + size); + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size); + } return; case 0x08: /* SSHL, USHL */ if (u) { diff --git a/target/arm/translate.c b/target/arm/translate.c index e5aa78c88a..8e6c6f7b00 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -5197,32 +5197,37 @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } -static const TCGOpcode vecop_list_uqadd[] = { - INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 -}; - -const GVecGen4 uqadd_op[4] = { - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_b, - .write_aofs = true, - .opt_opc = vecop_list_uqadd, - .vece = MO_8 }, - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_h, - .write_aofs = true, - .opt_opc = vecop_list_uqadd, - .vece = MO_16 }, - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_s, - .write_aofs = true, - .opt_opc = vecop_list_uqadd, - .vece = MO_32 }, - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_d, - .write_aofs = true, - .opt_opc = vecop_list_uqadd, - .vece = MO_64 }, -}; +void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_usadd_vec, 
INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_b, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_h, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_s, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_d, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) @@ -5235,32 +5240,37 @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } -static const TCGOpcode vecop_list_sqadd[] = { - INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 -}; - -const GVecGen4 sqadd_op[4] = { - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_b, - .opt_opc = vecop_list_sqadd, - .write_aofs = true, - .vece = MO_8 }, - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_h, - .opt_opc = vecop_list_sqadd, - .write_aofs = true, - .vece = MO_16 }, - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_s, - .opt_opc = vecop_list_sqadd, - .write_aofs = true, - .vece = MO_32 }, - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_d, - .opt_opc = vecop_list_sqadd, - .write_aofs = true, - .vece = MO_64 }, -}; +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = 
gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) @@ -5273,32 +5283,37 @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } -static const TCGOpcode vecop_list_uqsub[] = { - INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 -}; - -const GVecGen4 uqsub_op[4] = { - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_b, - .opt_opc = vecop_list_uqsub, - .write_aofs = true, - .vece = MO_8 }, - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_h, - .opt_opc = vecop_list_uqsub, - .write_aofs = true, - .vece = MO_16 }, - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_s, - .opt_opc = vecop_list_uqsub, - .write_aofs = true, - .vece = MO_32 }, - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_d, - .opt_opc = vecop_list_uqsub, - .write_aofs = true, - .vece = MO_64 }, -}; +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_s, + .opt_opc = vecop_list, + .write_aofs = true, + 
.vece = MO_32 }, + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) @@ -5311,32 +5326,37 @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } -static const TCGOpcode vecop_list_sqsub[] = { - INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 -}; - -const GVecGen4 sqsub_op[4] = { - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_b, - .opt_opc = vecop_list_sqsub, - .write_aofs = true, - .vece = MO_8 }, - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_h, - .opt_opc = vecop_list_sqsub, - .write_aofs = true, - .vece = MO_16 }, - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_s, - .opt_opc = vecop_list_sqsub, - .write_aofs = true, - .vece = MO_32 }, - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_d, - .opt_opc = vecop_list_sqsub, - .write_aofs = true, - .vece = MO_64 }, -}; +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, 
opr_sz, max_sz, &ops[vece]); +} /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. @@ -5522,15 +5542,23 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) return 0; case NEON_3R_VQADD: - tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), - rn_ofs, rm_ofs, vec_size, vec_size, - (u ? uqadd_op : sqadd_op) + size); + if (u) { + gen_gvec_uqadd_qc(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + gen_gvec_sqadd_qc(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } return 0; case NEON_3R_VQSUB: - tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), - rn_ofs, rm_ofs, vec_size, vec_size, - (u ? uqsub_op : sqsub_op) + size); + if (u) { + gen_gvec_uqsub_qc(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + gen_gvec_sqsub_qc(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } return 0; case NEON_3R_VMUL: /* VMUL */
From patchwork Sat May 2 22:44:58 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186074
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 10/15] target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
Date: Sat, 2 May 2020 15:44:58 -0700
Message-Id: <20200502224503.2282-11-richard.henderson@linaro.org>
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

These operations do not touch fp_status.
Signed-off-by: Richard Henderson --- target/arm/helper.h | 4 ++-- target/arm/translate-a64.c | 5 ++--- target/arm/translate.c | 12 ++---------- target/arm/vfp_helper.c | 4 ++-- 4 files changed, 8 insertions(+), 17 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index 33c76192d2..aed3050965 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -211,8 +211,8 @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr) DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr) DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr) DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr) -DEF_HELPER_2(recpe_u32, i32, i32, ptr) -DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr) +DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32) +DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32) DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32) DEF_HELPER_3(shl_cc, i32, env, i32, i32) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index eeaa92b9f1..4d5cdcef2f 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -9716,7 +9716,7 @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode, switch (opcode) { case 0x3c: /* URECPE */ - gen_helper_recpe_u32(tcg_res, tcg_op, fpst); + gen_helper_recpe_u32(tcg_res, tcg_op); break; case 0x3d: /* FRECPE */ gen_helper_recpe_f32(tcg_res, tcg_op, fpst); @@ -12261,7 +12261,6 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) unallocated_encoding(s); return; } - need_fpstatus = true; break; case 0x1e: /* FRINT32Z */ case 0x1f: /* FRINT64Z */ @@ -12429,7 +12428,7 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus); break; case 0x7c: /* URSQRTE */ - gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus); + gen_helper_rsqrte_u32(tcg_res, tcg_op); break; case 0x1e: /* FRINT32Z */ case 0x5e: /* FRINT32X */ diff 
--git a/target/arm/translate.c b/target/arm/translate.c index 8e6c6f7b00..a5bb4b0040 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -7265,19 +7265,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; } case NEON_2RM_VRECPE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_recpe_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_recpe_u32(tmp, tmp); break; - } case NEON_2RM_VRSQRTE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_rsqrte_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_rsqrte_u32(tmp, tmp); break; - } case NEON_2RM_VRECPE_F: { TCGv_ptr fpstatus = get_fpstatus_ptr(1); diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c index 930d6e747f..a792661166 100644 --- a/target/arm/vfp_helper.c +++ b/target/arm/vfp_helper.c @@ -1023,7 +1023,7 @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp) return make_float64(val); } -uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(recpe_u32)(uint32_t a) { /* float_status *s = fpstp; */ int input, estimate; @@ -1038,7 +1038,7 @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) return deposit32(0, (32 - 9), 9, estimate); } -uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(rsqrte_u32)(uint32_t a) { int estimate;
From patchwork Sat May 2 22:44:59 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186076
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 11/15] target/arm: Wrap vector qrdmla/qrdmls in GVecGen3Fn
Date: Sat, 2 May 2020 15:44:59 -0700
Message-Id: <20200502224503.2282-12-richard.henderson@linaro.org>
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations.
Signed-off-by: Richard Henderson --- target/arm/translate.h | 5 ++++ target/arm/translate-a64.c | 34 ++---------------------- target/arm/translate.c | 54 +++++++++++++++++++------------------- 3 files changed, 34 insertions(+), 59 deletions(-) -- 2.20.1 diff --git a/target/arm/translate.h b/target/arm/translate.h index ada84d411d..76cd3d31c7 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -332,6 +332,11 @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 4d5cdcef2f..1821a8e09d 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -604,18 +604,6 @@ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd, is_q ? 16 : 8, vec_full_reg_size(s), data, fn); } -/* Expand a 3-operand + env pointer operation using - * an out-of-line helper. - */ -static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd, - int rn, int rm, gen_helper_gvec_3_ptr *fn) -{ - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), cpu_env, - is_q ? 16 : 8, vec_full_reg_size(s), 0, fn); -} - /* Expand a 3-operand + fpstatus pointer + simd data value operation using * an out-of-line helper. 
*/ @@ -11710,29 +11698,11 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) switch (opcode) { case 0x0: /* SQRDMLAH (vector) */ - switch (size) { - case 1: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16); - break; - case 2: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32); - break; - default: - g_assert_not_reached(); - } + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size); return; case 0x1: /* SQRDMLSH (vector) */ - switch (size) { - case 1: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16); - break; - case 2: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32); - break; - default: - g_assert_not_reached(); - } + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size); return; case 0x2: /* SDOT / UDOT */ diff --git a/target/arm/translate.c b/target/arm/translate.c index a5bb4b0040..8d20726dcf 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3901,20 +3901,26 @@ static const uint8_t neon_2rm_sizes[] = { [NEON_2RM_VCVT_UF] = 0x4, }; - -/* Expand v8.1 simd helper. 
*/ -static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn, - int q, int rd, int rn, int rm) +void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { - if (dc_isar_feature(aa32_rdm, s)) { - int opr_sz = (1 + q) * 8; - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), - vfp_reg_offset(1, rn), - vfp_reg_offset(1, rm), cpu_env, - opr_sz, opr_sz, 0, fn); - return 0; - } - return 1; + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32 + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, + opr_sz, max_sz, 0, fns[vece - 1]); +} + +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32 + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, + opr_sz, max_sz, 0, fns[vece - 1]); } #define GEN_CMP0(NAME, COND) \ @@ -5465,13 +5471,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; /* VPADD */ } /* VQRDMLAH */ - switch (size) { - case 1: - return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16, - q, rd, rn, rm); - case 2: - return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32, - q, rd, rn, rm); + if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) { + gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + return 0; } return 1; @@ -5484,13 +5487,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; } /* VQRDMLSH */ - switch (size) { - case 1: - return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16, - q, rd, rn, rm); - case 2: - return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32, - q, rd, rn, rm); + if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) { + gen_gvec_sqrdmlsh_qc(size, 
rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + return 0; } return 1;

From patchwork Sat May 2 22:45:00 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186065
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 12/15] target/arm: Pass pointer to qc to qrdmla/qrdmls
Date: Sat, 2 May 2020 15:45:00 -0700
Message-Id: <20200502224503.2282-13-richard.henderson@linaro.org>
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Pass a pointer directly to env->vfp.qc[0], rather than env.  This will
allow SVE2, which does not modify QC, to pass a pointer to dummy storage.
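
For illustration only (not part of the patch), a minimal standalone sketch of the calling convention described above, assuming plain C outside QEMU. The names qrdmlah_s16_sketch, qc_flag, mla_recording_qc and mla_ignoring_qc are hypothetical; the arithmetic follows the 16-bit helper touched by this patch, except that the left shift is written as a multiply so the sketch stays well defined without -fwrapv.

```c
#include <stdint.h>

/*
 * Sketch of a signed saturating rounding doubling multiply-accumulate
 * (high half) on one 16-bit lane.  The caller decides where the
 * saturation flag is stored.  Assumes arithmetic right shift of
 * negative values, as provided by common compilers.
 */
static int16_t qrdmlah_s16_sketch(int16_t src1, int16_t src2,
                                  int16_t src3, uint32_t *sat)
{
    int32_t ret = (int32_t)src1 * src2;

    /* Multiply rather than shift left so a negative src3 is well defined. */
    ret = (int32_t)src3 * (1 << 15) + ret + (1 << 14);
    ret >>= 15;
    if (ret != (int16_t)ret) {
        *sat = 1;                       /* sticky: set, never cleared here */
        ret = (ret < 0 ? -0x8000 : 0x7fff);
    }
    return (int16_t)ret;
}

/* Hypothetical stand-in for env->vfp.qc[0]. */
static uint32_t qc_flag;

/* AdvSIMD-style caller: saturation is recorded in the real flag. */
static int16_t mla_recording_qc(int16_t a, int16_t b, int16_t acc)
{
    return qrdmlah_s16_sketch(a, b, acc, &qc_flag);
}

/* SVE2-style caller: saturation goes to throwaway storage. */
static int16_t mla_ignoring_qc(int16_t a, int16_t b, int16_t acc)
{
    uint32_t dummy = 0;

    return qrdmlah_s16_sketch(a, b, acc, &dummy);
}
```

Both callers share the same arithmetic; only the destination of the saturation flag differs, which is the point of passing a pointer instead of env.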
Signed-off-by: Richard Henderson
---
 target/arm/translate.c  | 18 ++++++++---
 target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------
 2 files changed, 54 insertions(+), 34 deletions(-)

--
2.20.1

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 8d20726dcf..532768d65f 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3901,6 +3901,18 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VCVT_UF] = 0x4,
 };

+static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
+                            uint32_t opr_sz, uint32_t max_sz,
+                            gen_helper_gvec_3_ptr *fn)
+{
+    TCGv_ptr qc_ptr = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
+    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
+                       opr_sz, max_sz, 0, fn);
+    tcg_temp_free_ptr(qc_ptr);
+}
+
 void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 {
@@ -3908,8 +3920,7 @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
         gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
     };
     tcg_debug_assert(vece >= 1 && vece <= 2);
-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-                       opr_sz, max_sz, 0, fns[vece - 1]);
+    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
 }

 void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
@@ -3919,8 +3930,7 @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
         gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
     };
     tcg_debug_assert(vece >= 1 && vece <= 2);
-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-                       opr_sz, max_sz, 0, fns[vece - 1]);
+    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
 }

 #define GEN_CMP0(NAME, COND)                                            \
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 096fea67ef..6aa2ca0827 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -36,8 +36,6 @@
 #define H4(x) (x)
 #endif

-#define SET_QC() env->vfp.qc[0] = 1
-
 static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
 {
     uint64_t *d = vd + opr_sz;
@@ -49,8 +47,8 @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
 }

 /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
-static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
-                                int16_t src2, int16_t src3)
+static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
+                               int16_t src3, uint32_t *sat)
 {
     /* Simplify:
      * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -60,7 +58,7 @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
     ret = ((int32_t)src3 << 15) + ret + (1 << 14);
     ret >>= 15;
     if (ret != (int16_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? -0x8000 : 0x7fff);
     }
     return ret;
@@ -69,30 +67,30 @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
 uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
                                   uint32_t src2, uint32_t src3)
 {
-    uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
-    uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    uint32_t *sat = &env->vfp.qc[0];
+    uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
+    uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
     return deposit32(e1, 16, 16, e2);
 }

 void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int16_t *d = vd;
     int16_t *n = vn;
     int16_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;

     for (i = 0; i < opr_sz / 2; ++i) {
-        d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }

 /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
-static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
-                                int16_t src2, int16_t src3)
+static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
+                               int16_t src3, uint32_t *sat)
 {
     /* Similarly, using subtraction:
      * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -102,7 +100,7 @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
     ret = ((int32_t)src3 << 15) - ret + (1 << 14);
     ret >>= 15;
     if (ret != (int16_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? -0x8000 : 0x7fff);
     }
     return ret;
@@ -111,85 +109,97 @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
 uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
                                   uint32_t src2, uint32_t src3)
 {
-    uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
-    uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    uint32_t *sat = &env->vfp.qc[0];
+    uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
+    uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
     return deposit32(e1, 16, 16, e2);
 }

 void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int16_t *d = vd;
     int16_t *n = vn;
     int16_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;

     for (i = 0; i < opr_sz / 2; ++i) {
-        d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }

 /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
-uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
-                                  int32_t src2, int32_t src3)
+static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
+                               int32_t src3, uint32_t *sat)
 {
     /* Simplify similarly to int_qrdmlah_s16 above. */
     int64_t ret = (int64_t)src1 * src2;
     ret = ((int64_t)src3 << 31) + ret + (1 << 30);
     ret >>= 31;
     if (ret != (int32_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? INT32_MIN : INT32_MAX);
     }
     return ret;
 }

+uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    uint32_t *sat = &env->vfp.qc[0];
+    return inl_qrdmlah_s32(src1, src2, src3, sat);
+}
+
 void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int32_t *d = vd;
     int32_t *n = vn;
     int32_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;

     for (i = 0; i < opr_sz / 4; ++i) {
-        d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }

 /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
-uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
-                                  int32_t src2, int32_t src3)
+static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
+                               int32_t src3, uint32_t *sat)
 {
     /* Simplify similarly to int_qrdmlsh_s16 above. */
     int64_t ret = (int64_t)src1 * src2;
     ret = ((int64_t)src3 << 31) - ret + (1 << 30);
     ret >>= 31;
     if (ret != (int32_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? INT32_MIN : INT32_MAX);
     }
     return ret;
 }

+uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    uint32_t *sat = &env->vfp.qc[0];
+    return inl_qrdmlsh_s32(src1, src2, src3, sat);
+}
+
 void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int32_t *d = vd;
     int32_t *n = vn;
     int32_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;

     for (i = 0; i < opr_sz / 4; ++i) {
-        d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }

From patchwork Sat May 2 22:45:01 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186075
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 13/15] target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
Date: Sat, 2 May 2020 15:45:01 -0700
Message-Id: <20200502224503.2282-14-richard.henderson@linaro.org>
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Must clear the tail for AdvSIMD when SVE is enabled.
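
For illustration only (not part of the patch), a simplified standalone sketch of what "clearing the tail" means, assuming plain C outside QEMU. The names clear_tail_sketch and fmul_idx_sketch are hypothetical, and the loop uses one indexed element for the whole operation rather than the per-segment indexing of the real helpers: the operation writes opr_sz bytes, and the remaining bytes up to max_sz of the wider register must be zeroed.

```c
#include <stdint.h>
#include <string.h>

/*
 * Sketch: zero the bytes of a vector register between the operation
 * size and the maximum register size.  Hypothetical stand-in for a
 * tail-clearing helper; QEMU's own version is not reproduced here.
 */
static void clear_tail_sketch(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
{
    if (max_sz > opr_sz) {
        memset((uint8_t *)vd + opr_sz, 0, max_sz - opr_sz);
    }
}

/*
 * Multiply each float lane in the first opr_sz bytes of n by one
 * indexed element of m, then clear the rest of the destination.
 * Omitting the final call would leave stale data above opr_sz.
 */
static void fmul_idx_sketch(float *d, const float *n, const float *m,
                            unsigned idx, uintptr_t opr_sz, uintptr_t max_sz)
{
    float mm = m[idx];
    uintptr_t i;

    for (i = 0; i < opr_sz / sizeof(float); ++i) {
        d[i] = n[i] * mm;
    }
    clear_tail_sketch(d, opr_sz, max_sz);
}
```

With a 16-byte operation on a 32-byte register, the upper four float lanes read back as zero afterwards, whatever they held before.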
Fixes: ca40a6e6e39
Signed-off-by: Richard Henderson
---
 target/arm/vec_helper.c | 2 ++
 1 file changed, 2 insertions(+)

--
2.20.1

diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 6aa2ca0827..a483841add 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -747,6 +747,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
             d[i + j] = TYPE##_mul(n[i + j], mm, stat);                     \
         }                                                                  \
     }                                                                      \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
 }

 DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
@@ -771,6 +772,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va,                  \
                                      mm, a[i + j], 0, stat);               \
         }                                                                  \
     }                                                                      \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                \
 }

 DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)

From patchwork Sat May 2 22:45:02 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186069
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 14/15] target/arm: Vectorize SABD/UABD
Date: Sat, 2 May 2020 15:45:02 -0700
Message-Id: <20200502224503.2282-15-richard.henderson@linaro.org>
In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org>
References: <20200502224503.2282-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Include 64-bit element size in preparation for SVE2.
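
For illustration only (not part of the patch), a minimal standalone sketch of absolute difference on 64-bit elements, assuming plain C outside QEMU. The function names are hypothetical. The first two functions follow the compare-and-subtract shape of the out-of-line helper in this patch; the third follows the max-minus-min shape of the vector expansion. The signed variant subtracts in unsigned arithmetic so that widely separated operands do not overflow.

```c
#include <stdint.h>

/* Signed absolute difference; the result is returned as raw 64 bits. */
static uint64_t sabd64_sketch(int64_t n, int64_t m)
{
    /* Unsigned subtraction wraps instead of overflowing. */
    return n < m ? (uint64_t)m - (uint64_t)n
                 : (uint64_t)n - (uint64_t)m;
}

/* Unsigned absolute difference via compare and subtract. */
static uint64_t uabd64_sketch(uint64_t n, uint64_t m)
{
    return n < m ? m - n : n - m;
}

/* The same unsigned result expressed as max(n, m) - min(n, m). */
static uint64_t uabd64_minmax_sketch(uint64_t n, uint64_t m)
{
    uint64_t lo = n < m ? n : m;
    uint64_t hi = n < m ? m : n;

    return hi - lo;
}
```

The two unsigned forms agree for every input, which is why a backend can pick whichever maps better onto the host instruction set.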
Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  10 +++
 target/arm/translate.h     |   5 ++
 target/arm/translate-a64.c |   8 ++-
 target/arm/translate.c     | 133 ++++++++++++++++++++++++++++++++++++-
 target/arm/vec_helper.c    |  24 +++++++
 5 files changed, 176 insertions(+), 4 deletions(-)

--
2.20.1

diff --git a/target/arm/helper.h b/target/arm/helper.h
index aed3050965..4678d3a6f4 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -731,6 +731,16 @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)

+DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 76cd3d31c7..d4c4111a5c 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -337,6 +337,11 @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);

+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 1821a8e09d..38f72bf550 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11207,6 +11207,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size);
         }
         return;
+    case 0xe: /* SABD, UABD */
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
+        }
+        return;
     case 0x10: /* ADD, SUB */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -11339,7 +11346,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             genenvfn = fns[size][u];
             break;
         }
-        case 0xe: /* SABD, UABD */
         case 0xf: /* SABA, UABA */
         {
             static NeonGenTwoOpFn * const fns[3][2] = {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 532768d65f..e0c4de2898 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5374,6 +5374,126 @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 }

+static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_sub_i32(t, a, b);
+    tcg_gen_sub_i32(d, b, a);
+    tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_sub_i64(t, a, b);
+    tcg_gen_sub_i64(d, b, a);
+    tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_smin_vec(vece, t, a, b);
+    tcg_gen_smax_vec(vece, d, a, b);
+    tcg_gen_sub_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_sabd_i32,
+          .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_sabd_i64,
+          .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_sub_i32(t, a, b);
+    tcg_gen_sub_i32(d, b, a);
+    tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_sub_i64(t, a, b);
+    tcg_gen_sub_i64(d, b, a);
+    tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_umin_vec(vece, t, a, b);
+    tcg_gen_umax_vec(vece, d, a, b);
+    tcg_gen_sub_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_uabd_i32,
+          .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_uabd_i64,
+          .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
 /* Translate a NEON data processing instruction. Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -5642,6 +5762,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                    vec_size, vec_size);
             }
             return 0;
+
+        case NEON_3R_VABD:
+            if (u) {
+                gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            } else {
+                gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            }
+            return 0;
         }

         if (size == 3) {
@@ -5772,9 +5902,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQRSHL:
             GEN_NEON_INTEGER_OP_ENV(qrshl);
             break;
-        case NEON_3R_VABD:
-            GEN_NEON_INTEGER_OP(abd);
-            break;
         case NEON_3R_VABA:
             GEN_NEON_INTEGER_OP(abd);
             tcg_temp_free_i32(tmp2);
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index a483841add..a4972d02fc 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1407,3 +1407,27 @@ DO_CMP0(gvec_cgt0_h, int16_t, >)
 DO_CMP0(gvec_cge0_h, int16_t, >=)

 #undef DO_CMP0
+
+#define DO_ABD(NAME, TYPE)                                      \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
+{                                                               \
+    intptr_t i, opr_sz = simd_oprsz(desc);                      \
+    TYPE *d = vd, *n = vn, *m = vm;                             \
+                                                                \
+    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
+        d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];         \
+    }                                                           \
+    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
+}
+
+DO_ABD(gvec_sabd_b, int8_t)
+DO_ABD(gvec_sabd_h, int16_t)
+DO_ABD(gvec_sabd_s, int32_t)
+DO_ABD(gvec_sabd_d, int64_t)
+
+DO_ABD(gvec_uabd_b, uint8_t)
+DO_ABD(gvec_uabd_h, uint16_t)
+DO_ABD(gvec_uabd_s, uint32_t)
+DO_ABD(gvec_uabd_d, uint64_t)
+
+#undef DO_ABD

From patchwork Sat May 2 22:45:03 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186072
) id 1jV0sz-0004s4-3Q for qemu-devel@nongnu.org; Sat, 02 May 2020 18:45:29 -0400 Received: by mail-pj1-x1041.google.com with SMTP id e6so1858756pjt.4 for ; Sat, 02 May 2020 15:45:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6qcliVaePe6nHnDvPPzkZjTUmRFhb2EK04O2+8P0WDs=; b=cFSSfl6brnL1I5iRetZgNowAIefhuuoyKUJra5RgM6E2nnoh2dJE/Y9AyviJQ4jEPQ a3ACRmR7jykaQ0rG7ehzIWyPV0uxtNCbHA9uM0uPWTnr+pAnv4dn/t3exyEslYkApbEx h3UEqOVdY4Bq9Xp7WmYN8jc3vxMCMbnNs+u84p3hnr10/FQAvpBvUHVaqKhjcNQpzMVc A7Pc47rRzVCO2OZIz2OHixHiwUcW64hjIXBnl1ERiPDS7Ns7pIVjXehkRSzl6vo5JATx LgzJ1ByVGeAw+VRqPQzETmv/VsXYyz6UZjilZjIvzRu9O0WswjsdpftaaaT++DQB4zSr BwvA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6qcliVaePe6nHnDvPPzkZjTUmRFhb2EK04O2+8P0WDs=; b=qI36Dncb0Q4WyI0tyVmElTRLs/4g/VKeohhWNOhTUHKbUvSCM3yzawgIJUUNQjXwrd 1uT5JgiPK19KlsVcrUwQScGRdZWpVQxi7yB242StBlDa7j4awYdLwvFTWEInMM3kb/7T Gr50EUEg4NNIkV7vn7sSvziwAZSqAMu9Z2XlGvcJiPEhlpsJCUSeP9XTZiSqdHcYj9gF aNmj5KbQ2dS2EYq/zPsRqOdD/S7X/aTF/pBaKR9f4Z82/mKqWYZ2KTDbXQzAZHhNrp+/ 2Rmtw14RiTy03LiJKTV/RUMi4RlkNcvnTbNNRro9jHdks988oqkITul99AS8piLL0nCK 8trg== X-Gm-Message-State: AGi0PuY+JPtBjh7tYziXHkk39+GsUDtZ3ajhU/XdKPrwNZjp3Vg5yHE7 ph0qDJ6iVJkZUDo1ypY6ewLlFH+sEaQ= X-Received: by 2002:a17:90a:df88:: with SMTP id p8mr8285165pjv.119.1588459526988; Sat, 02 May 2020 15:45:26 -0700 (PDT) Received: from localhost.localdomain (174-21-149-226.tukw.qwest.net. 
[174.21.149.226]) by smtp.gmail.com with ESMTPSA id h5sm2956182pjv.4.2020.05.02.15.45.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 02 May 2020 15:45:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 15/15] target/arm: Vectorize SABA/UABA Date: Sat, 2 May 2020 15:45:03 -0700 Message-Id: <20200502224503.2282-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200502224503.2282-1-richard.henderson@linaro.org> References: <20200502224503.2282-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1041; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1041.google.com X-detected-operating-system: by eggs.gnu.org: Error: [-] PROGRAM ABORT : Malformed IPv6 address (bad octet value). Location : parse_addr6(), p0f-client.c:67 X-Received-From: 2607:f8b0:4864:20::1041 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Include 64-bit element size in preparation for SVE2. 

Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  17 +++--
 target/arm/translate.h     |   5 ++
 target/arm/neon_helper.c   |  10 ---
 target/arm/translate-a64.c |  17 ++---
 target/arm/translate.c     | 134 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c    |  24 +++++++
 6 files changed, 174 insertions(+), 33 deletions(-)

--
2.20.1

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 4678d3a6f4..1857f4ee46 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -284,13 +284,6 @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32)
 DEF_HELPER_2(neon_pmax_u16, i32, i32, i32)
 DEF_HELPER_2(neon_pmax_s16, i32, i32, i32)
 
-DEF_HELPER_2(neon_abd_u8, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s8, i32, i32, i32)
-DEF_HELPER_2(neon_abd_u16, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
-DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
-
 DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
 DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
 DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
@@ -741,6 +734,16 @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index d4c4111a5c..70139efcee 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -342,6 +342,11 @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
index 448be93fa1..2ef75e04c8 100644
--- a/target/arm/neon_helper.c
+++ b/target/arm/neon_helper.c
@@ -576,16 +576,6 @@ NEON_POP(pmax_s16, neon_s16, 2)
 NEON_POP(pmax_u16, neon_u16, 2)
 #undef NEON_FN
 
-#define NEON_FN(dest, src1, src2) \
-    dest = (src1 > src2) ? (src1 - src2) : (src2 - src1)
-NEON_VOP(abd_s8, neon_s8, 4)
-NEON_VOP(abd_u8, neon_u8, 4)
-NEON_VOP(abd_s16, neon_s16, 2)
-NEON_VOP(abd_u16, neon_u16, 2)
-NEON_VOP(abd_s32, neon_s32, 1)
-NEON_VOP(abd_u32, neon_u32, 1)
-#undef NEON_FN
-
 #define NEON_FN(dest, src1, src2) do { \
     int8_t tmp; \
     tmp = (int8_t)src2; \
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 38f72bf550..da140d8b91 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11214,6 +11214,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
         }
         return;
+    case 0xf: /* SABA, UABA */
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
+        }
+        return;
     case 0x10: /* ADD, SUB */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -11346,16 +11353,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                 genenvfn = fns[size][u];
                 break;
             }
-            case 0xf: /* SABA, UABA */
-            {
-                static NeonGenTwoOpFn * const fns[3][2] = {
-                    { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 },
-                    { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 },
-                    { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 },
-                };
-                genfn = fns[size][u];
-                break;
-            }
             case 0x16: /* SQDMULH, SQRDMULH */
             {
                 static NeonGenTwoOpEnvFn * const fns[2][2] = {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index e0c4de2898..4af52ab7e8 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5494,6 +5494,124 @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 }
 
+static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    gen_sabd_i32(t, a, b);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_sabd_i64(t, a, b);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    gen_sabd_vec(vece, t, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_add_vec,
+        INDEX_op_smin_vec, INDEX_op_smax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_saba_i32,
+          .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_saba_i64,
+          .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    gen_uabd_i32(t, a, b);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_uabd_i64(t, a, b);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    gen_uabd_vec(vece, t, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_add_vec,
+        INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_uaba_i32,
+          .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_uaba_i64,
+          .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -5772,6 +5890,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                               vec_size, vec_size);
             }
             return 0;
+
+        case NEON_3R_VABA:
+            if (u) {
+                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            } else {
+                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            }
+            return 0;
         }
 
         if (size == 3) {
@@ -5902,12 +6030,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQRSHL:
             GEN_NEON_INTEGER_OP_ENV(qrshl);
             break;
-        case NEON_3R_VABA:
-            GEN_NEON_INTEGER_OP(abd);
-            tcg_temp_free_i32(tmp2);
-            tmp2 = neon_load_reg(rd, pass);
-            gen_neon_add(size, tmp, tmp2);
-            break;
         case NEON_3R_VPMAX:
             GEN_NEON_INTEGER_OP(pmax);
             break;
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index a4972d02fc..1be41a8baf 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1431,3 +1431,27 @@ DO_ABD(gvec_uabd_s, uint32_t)
 DO_ABD(gvec_uabd_d, uint64_t)
 
 #undef DO_ABD
+
+#define DO_ABA(NAME, TYPE)                                      \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
+{                                                               \
+    intptr_t i, opr_sz = simd_oprsz(desc);                      \
+    TYPE *d = vd, *n = vn, *m = vm;                             \
+                                                                \
+    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
+        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];        \
+    }                                                           \
+    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
+}
+
+DO_ABA(gvec_saba_b, int8_t)
+DO_ABA(gvec_saba_h, int16_t)
+DO_ABA(gvec_saba_s, int32_t)
+DO_ABA(gvec_saba_d, int64_t)
+
+DO_ABA(gvec_uaba_b, uint8_t)
+DO_ABA(gvec_uaba_h, uint16_t)
+DO_ABA(gvec_uaba_s, uint32_t)
+DO_ABA(gvec_uaba_d, uint64_t)
+
+#undef DO_ABA
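
Note for readers following along outside the QEMU tree: the out-of-line
helpers in vec_helper.c accumulate an absolute difference per lane,
d[i] += |n[i] - m[i]|, with the signedness of the compare coming from the
element type. The sketch below is only a stand-alone model of that loop
body. The names DEMO_ABA, demo_saba_b, demo_uaba_b and demo_abd_minmax are
made up for illustration and are not part of the patch; the QEMU-specific
pieces (HELPER(), simd_oprsz(), simd_maxsz(), clear_tail()) are left out;
and only 8-bit lanes are modelled, so that the subtraction happens in int
after promotion and cannot overflow. The max-minus-min form in
demo_abd_minmax is an assumption inferred from the smin/smax and umin/umax
entries in vecop_list, since gen_sabd_vec and gen_uabd_vec are not shown
in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone model of the DO_ABA loop body:
 * d[i] += |n[i] - m[i]| for each lane.  'count' is a number of
 * elements, not a byte size as in the QEMU helper descriptor.
 */
#define DEMO_ABA(NAME, TYPE)                                        \
void NAME(TYPE *d, const TYPE *n, const TYPE *m, size_t count)      \
{                                                                   \
    assert(d != NULL && n != NULL && m != NULL);                    \
    for (size_t i = 0; i < count; ++i) {                            \
        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];            \
    }                                                               \
}

DEMO_ABA(demo_saba_b, int8_t)   /* signed compare, modelled on gvec_saba_b */
DEMO_ABA(demo_uaba_b, uint8_t)  /* unsigned compare, modelled on gvec_uaba_b */

#undef DEMO_ABA

/*
 * Absolute difference written as max(a, b) - min(a, b).  This is an
 * assumed formulation of what a min/max/sub vector expansion computes;
 * it is meant for small values where hi - lo cannot overflow int.
 */
int demo_abd_minmax(int a, int b)
{
    int lo = a < b ? a : b;
    int hi = a < b ? b : a;
    return hi - lo;
}
```

With d = {250}, n = {3}, m = {13}, demo_uaba_b leaves d[0] == 4, because
the 8-bit accumulator wraps modulo 256. That matches the modular add the
inline expansions perform with tcg_gen_add_i32/i64/vec.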