From patchwork Fri May 8 15:21:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186375
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH v3 01/16] target/arm: Create gen_gvec_[us]sra
Date: Fri, 8 May 2020 08:21:45 -0700
Message-Id: <20200508152200.6547-2-richard.henderson@linaro.org>
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>

The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef.

Add out-of-line helpers. We got away with only having inline expanders because the Neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-op-gvec may decide to use an out-of-line helper due to longer vector lengths.

Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  10 +++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  15 +---
 target/arm/translate.c     | 161 ++++++++++++++++++++++---------------
 target/arm/vec_helper.c    |  25 ++++++
 5 files changed, 139 insertions(+), 79 deletions(-)

--
2.20.1
Reviewed-by: Peter Maydell

diff --git a/target/arm/helper.h b/target/arm/helper.h index 5817626b20..9bc162345c 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -691,6 +691,16 @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_h, 
TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index cb7925ea46..1839a59a8e 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i ssra_op[4]; -extern const GVecGen2i usra_op[4]; extern const GVecGen2i sri_op[4]; extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; @@ -299,6 +297,11 @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 62e5729904..315de9a9b6 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10188,19 +10188,8 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, switch (opcode) { case 0x02: /* SSRA / USRA (accumulate) */ - if (is_u) { - /* Shift count same as element size produces zero to add. */ - if (shift == 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]); - } else { - /* Shift count same as element size produces all sign to add. */ - if (shift == 8 << size) { - shift -= 1; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]); - } + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? 
gen_gvec_usra : gen_gvec_ssra, size); return; case 0x08: /* SRI */ /* Shift count same as element size is valid but does nothing. */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 74fac1d09c..c18140f2e6 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3874,33 +3874,51 @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_ssra[] = { - INDEX_op_sari_vec, INDEX_op_add_vec, 0 -}; +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ssra8_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_ssra16_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_ssra32_i32, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_ssra64_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; -const GVecGen2i ssra_op[4] = { - { .fni8 = gen_ssra8_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_8 }, - { .fni8 = gen_ssra16_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_16 }, - { .fni4 = gen_ssra32_i32, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_32 }, - { .fni8 = gen_ssra64_i64, - .fniv = gen_ssra_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list_ssra, - .load_dest = true, - .vece = MO_64 }, -}; + /* tszimm encoding 
produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. + */ + shift = MIN(shift, (8 << vece) - 1); + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -3932,33 +3950,55 @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_usra[] = { - INDEX_op_shri_vec, INDEX_op_add_vec, 0 -}; +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_usra8_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8, }, + { .fni8 = gen_usra16_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16, }, + { .fni4 = gen_usra32_i32, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32, }, + { .fni8 = gen_usra64_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64, }, + }; -const GVecGen2i usra_op[4] = { - { .fni8 = gen_usra8_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_8, }, - { .fni8 = gen_usra16_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_16, }, - { .fni4 = gen_usra32_i32, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_32, }, - { .fni8 = gen_usra64_i64, - .fniv = gen_usra_vec, - 
.prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_64, }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in all zeros as input to accumulate: nop. + */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -5220,19 +5260,12 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case 1: /* VSRA */ /* Right shift comes here negative. */ shift = -shift; - /* Shifts larger than the element size are architecturally - * valid. Unsigned results in all zeros; signed results - * in all sign bits. - */ - if (!u) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - MIN(shift, (8 << size) - 1), - &ssra_op[size]); - } else if (shift >= 8 << size) { - /* rd += 0 */ + if (u) { + gen_gvec_usra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &usra_op[size]); + gen_gvec_ssra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } return 0; diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 3d534188a8..230085b35e 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -899,6 +899,31 @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn, clear_tail(d, oprsz, simd_maxsz(desc)); } + +#define DO_SRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] += n[i] >> shift; \ + } \ + clear_tail(d, oprsz, 
simd_maxsz(desc)); \ +} + +DO_SRA(gvec_ssra_b, int8_t) +DO_SRA(gvec_ssra_h, int16_t) +DO_SRA(gvec_ssra_s, int32_t) +DO_SRA(gvec_ssra_d, int64_t) + +DO_SRA(gvec_usra_b, uint8_t) +DO_SRA(gvec_usra_h, uint16_t) +DO_SRA(gvec_usra_s, uint32_t) +DO_SRA(gvec_usra_d, uint64_t) + +#undef DO_SRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN.

From patchwork Fri May 8 15:21:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186378
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH v3 02/16] target/arm: Create gen_gvec_{u,s}{rshr,rsra}
Date: Fri, 8 May 2020 08:21:46 -0700
Message-Id: <20200508152200.6547-3-richard.henderson@linaro.org>
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>

Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vector lengths from SVE.
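
As a scalar illustration of the rounding rule used by these expanders (not part of this patch), here is a minimal sketch for one 8-bit lane. The names urshr8_ref, srshr8_ref and rshr8_selfcheck are hypothetical, and the signed variant assumes that ">>" on a negative int is an arithmetic shift, which QEMU also relies on.

```c
#include <stdint.h>

/* Unsigned rounding shift right of one 8-bit lane, for 1 <= sh <= 8. */
static uint8_t urshr8_ref(uint8_t x, int sh)
{
    /* The bit just below the new least significant bit is the rounding bit. */
    unsigned round = (x >> (sh - 1)) & 1;

    /* x is promoted to int before shifting, so sh == 8 is well defined. */
    return (uint8_t)((x >> sh) + round);
}

/* Signed variant; assumes ">>" on a negative int shifts arithmetically. */
static int8_t srshr8_ref(int8_t x, int sh)
{
    int round = (x >> (sh - 1)) & 1;

    return (int8_t)((x >> sh) + round);
}

/*
 * Compare both helpers against the textbook form
 * (x + 2^(sh-1)) >> sh, evaluated in int so nothing overflows.
 * Returns 1 if every lane value and shift count agrees, else 0.
 */
static int rshr8_selfcheck(void)
{
    for (int sh = 1; sh <= 8; sh++) {
        int bias = 1 << (sh - 1);

        for (int v = 0; v <= 255; v++) {
            if (urshr8_ref((uint8_t)v, sh) != (uint8_t)((v + bias) >> sh)) {
                return 0;
            }
        }
        for (int v = -128; v <= 127; v++) {
            if (srshr8_ref((int8_t)v, sh) != (int8_t)((v + bias) >> sh)) {
                return 0;
            }
        }
    }
    return 1;
}
```

For sh == 8 the unsigned form reduces to a copy of the most significant bit and the signed form to zero, which is why gen_gvec_urshr and gen_gvec_srshr can special-case a shift equal to the element size.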
Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 ++ target/arm/translate.h | 9 + target/arm/translate-a64.c | 11 +- target/arm/translate.c | 461 +++++++++++++++++++++++++++++++++++-- target/arm/vec_helper.c | 50 ++++ 5 files changed, 525 insertions(+), 26 deletions(-) -- 2.20.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper.h b/target/arm/helper.h index 9bc162345c..aeb1f52455 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -701,6 +701,26 @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 
1839a59a8e..1db3b43a61 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -302,6 +302,15 @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 315de9a9b6..50949d306b 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10218,10 +10218,15 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, return; case 0x04: /* SRSHR / URSHR (rounding) */ - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_urshr : gen_gvec_srshr, size); + return; + case 0x06: /* SRSRA / URSRA (accum + rounding) */ - accumulate = true; - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_ursra : gen_gvec_srsra, size); + return; + default: g_assert_not_reached(); } diff --git a/target/arm/translate.c b/target/arm/translate.c index c18140f2e6..19bd514a84 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4000,6 +4000,422 @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, } } +/* + * Shift one less than the requested amount, and the low bit is + * the rounding bit. For the 8 and 16-bit operations, because we + * mask the low bit, we can perform a normal integer shift instead + * of a vector shift. 
+ */ +static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_sar8i_i64(d, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_sar16i_i64(d, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_extract_i32(t, a, sh - 1, 1); + tcg_gen_sari_i32(d, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_extract_i64(t, a, sh - 1, 1); + tcg_gen_sari_i64(d, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, t, a, sh - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, t, ones); + tcg_gen_sari_vec(vece, d, a, sh); + tcg_gen_add_vec(vece, d, d, t); + + tcg_temp_free_vec(t); + tcg_temp_free_vec(ones); +} + +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srshr8_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_srshr16_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = 
gen_srshr32_i32, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_srshr64_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. + */ + tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr8_i64(t, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr16_i64(t, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + gen_srshr32_i32(t, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr64_i64(t, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + gen_srshr_vec(vece, t, a, sh); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + 
INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srsra8_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_srsra16_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_srsra32_i32, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_srsra64_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. With accumulation, this leaves D unchanged. + */ + if (shift == (8 << vece)) { + /* Nop, but we do need to clear the tail. 
*/ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_shr8i_i64(d, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_shr16i_i64(d, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_extract_i32(t, a, sh - 1, 1); + tcg_gen_shri_i32(d, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_extract_i64(t, a, sh - 1, 1); + tcg_gen_shri_i64(d, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, t, a, shift - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, t, ones); + tcg_gen_shri_vec(vece, d, a, shift); + tcg_gen_add_vec(vece, d, d, t); + + tcg_temp_free_vec(t); + tcg_temp_free_vec(ones); +} + +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_urshr8_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = 
gen_urshr16_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_urshr32_i32, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_urshr64_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in zero. With rounding, this produces a + * copy of the most significant bit. + */ + tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 8) { + tcg_gen_vec_shr8i_i64(t, a, 7); + } else { + gen_urshr8_i64(t, a, sh); + } + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 16) { + tcg_gen_vec_shr16i_i64(t, a, 15); + } else { + gen_urshr16_i64(t, a, sh); + } + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + if (sh == 32) { + tcg_gen_shri_i32(t, a, 31); + } else { + gen_urshr32_i32(t, a, sh); + } + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 64) { + tcg_gen_shri_i64(t, a, 63); + } else { + gen_urshr64_i64(t, a, sh); + } + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void 
gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + if (sh == (8 << vece)) { + tcg_gen_shri_vec(vece, t, a, sh - 1); + } else { + gen_urshr_vec(vece, t, a, sh); + } + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ursra8_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_ursra16_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_ursra32_i32, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_ursra64_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} + static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { uint64_t mask = dup_const(MO_8, 0xff >> shift); @@ -5269,6 +5685,28 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } return 0; + case 2: /* VRSHR */ + /* Right shift comes here negative. 
*/ + shift = -shift; + if (u) { + gen_gvec_urshr(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } else { + gen_gvec_srshr(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } + return 0; + + case 3: /* VRSRA */ + if (u) { + gen_gvec_ursra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } else { + gen_gvec_srsra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } + return 0; + case 4: /* VSRI */ if (!u) { return 1; @@ -5320,13 +5758,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) neon_load_reg64(cpu_V0, rm + pass); tcg_gen_movi_i64(cpu_V1, imm); switch (op) { - case 2: /* VRSHR */ - case 3: /* VRSRA */ - if (u) - gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1); - else - gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1); - break; case 6: /* VQSHLU */ gen_helper_neon_qshlu_s64(cpu_V0, cpu_env, cpu_V0, cpu_V1); @@ -5343,11 +5774,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) default: g_assert_not_reached(); } - if (op == 3) { - /* Accumulate. */ - neon_load_reg64(cpu_V1, rd + pass); - tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1); - } neon_store_reg64(cpu_V0, rd + pass); } else { /* size < 3 */ /* Operands in T0 and T1. */ @@ -5355,10 +5781,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) tmp2 = tcg_temp_new_i32(); tcg_gen_movi_i32(tmp2, imm); switch (op) { - case 2: /* VRSHR */ - case 3: /* VRSRA */ - GEN_NEON_INTEGER_OP(rshl); - break; case 6: /* VQSHLU */ switch (size) { case 0: @@ -5384,13 +5806,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) g_assert_not_reached(); } tcg_temp_free_i32(tmp2); - - if (op == 3) { - /* Accumulate. 
*/ - tmp2 = neon_load_reg(rd, pass); - gen_neon_add(size, tmp, tmp2); - tcg_temp_free_i32(tmp2); - } neon_store_reg(rd, pass, tmp); } } /* for pass */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 230085b35e..fd8b2bff49 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -924,6 +924,56 @@ DO_SRA(gvec_usra_d, uint64_t) #undef DO_SRA +#define DO_RSHR(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] = (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSHR(gvec_srshr_b, int8_t) +DO_RSHR(gvec_srshr_h, int16_t) +DO_RSHR(gvec_srshr_s, int32_t) +DO_RSHR(gvec_srshr_d, int64_t) + +DO_RSHR(gvec_urshr_b, uint8_t) +DO_RSHR(gvec_urshr_h, uint16_t) +DO_RSHR(gvec_urshr_s, uint32_t) +DO_RSHR(gvec_urshr_d, uint64_t) + +#undef DO_RSHR + +#define DO_RSRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] += (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSRA(gvec_srsra_b, int8_t) +DO_RSRA(gvec_srsra_h, int16_t) +DO_RSRA(gvec_srsra_s, int32_t) +DO_RSRA(gvec_srsra_d, int64_t) + +DO_RSRA(gvec_ursra_b, uint8_t) +DO_RSRA(gvec_ursra_h, uint16_t) +DO_RSRA(gvec_ursra_s, uint32_t) +DO_RSRA(gvec_ursra_d, uint64_t) + +#undef DO_RSRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. 
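Note on the DO_RSHR and DO_RSRA helpers above: the loop body computes a rounding right shift without widening the element type. It shifts by shift - 1 first, so the last bit that will be discarded is still present as bit 0, then finishes the shift and adds that bit back. What follows is a minimal standalone sketch of that identity for 8-bit elements, not code from the patch. The names srshr8, urshr8 and check_rshr8 are hypothetical, and the sketch assumes that >> on a negative value is an arithmetic shift, as the helpers do.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical single-element stand-ins for the DO_RSHR loop body.
 * Assumes 1 <= shift <= 8 and an arithmetic >> for negative values.
 */
static int8_t srshr8(int8_t n, int shift)
{
    int8_t tmp = n >> (shift - 1);          /* keep one extra low bit */
    return (int8_t)((tmp >> 1) + (tmp & 1)); /* add it back as the rounding increment */
}

static uint8_t urshr8(uint8_t n, int shift)
{
    uint8_t tmp = n >> (shift - 1);
    return (uint8_t)((tmp >> 1) + (tmp & 1));
}

/*
 * Compare against the textbook form (n + 2^(shift-1)) >> shift,
 * evaluated in int so the addition cannot overflow.
 */
static void check_rshr8(void)
{
    for (int shift = 1; shift <= 8; shift++) {
        int round = 1 << (shift - 1);

        for (int n = -128; n <= 127; n++) {
            assert(srshr8((int8_t)n, shift) == (int8_t)((n + round) >> shift));
        }
        for (int n = 0; n <= 255; n++) {
            assert(urshr8((uint8_t)n, shift) == (uint8_t)((n + round) >> shift));
        }
    }
}
```

With shift equal to the element size, srshr8 returns 0 for every input and urshr8 returns the most significant bit. These are the two cases that gen_gvec_srshr and gen_gvec_urshr expand directly instead of going through the ops table.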
From patchwork Fri May 8 15:21:47 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186380
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH v3 03/16] target/arm: Create gen_gvec_{sri,sli}
Date: Fri, 8 May 2020 08:21:47 -0700
Message-Id: <20200508152200.6547-4-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>

The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.

Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.

Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  10 ++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  20 +---
 target/arm/translate.c     | 186 +++++++++++++++++++++----------------
 target/arm/vec_helper.c    |  38 ++++++++
 5 files changed, 160 insertions(+), 101 deletions(-)

-- 
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper.h b/target/arm/helper.h
index aeb1f52455..33c76192d2 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -721,6 +721,16 @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 1db3b43a61..fa5c3f12b9 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i sri_op[4]; -extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; extern const GVecGen4 sqadd_op[4]; extern const GVecGen4 uqsub_op[4]; @@ -311,6 +309,11 @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 50949d306b..2d7dad6c3f 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -585,16 +585,6 @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd, is_q ? 16 : 8, vec_full_reg_size(s), gvec_op); } -/* Expand a 2-operand + immediate AdvSIMD vector operation using - * an op descriptor. - */ -static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd, - int rn, int64_t imm, const GVecGen2i *gvec_op) -{ - tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - is_q ? 
16 : 8, vec_full_reg_size(s), imm, gvec_op); -} - /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, int rn, int rm, const GVecGen3 *gvec_op) @@ -10191,12 +10181,9 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, gen_gvec_fn2i(s, is_q, rd, rn, shift, is_u ? gen_gvec_usra : gen_gvec_ssra, size); return; + case 0x08: /* SRI */ - /* Shift count same as element size is valid but does nothing. */ - if (shift == 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]); + gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size); return; case 0x00: /* SSHR / USHR */ @@ -10247,7 +10234,6 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, } tcg_temp_free_i64(tcg_round); - done: clear_vec_high(s, is_q, rd); } @@ -10272,7 +10258,7 @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert, } if (insert) { - gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]); + gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size); } else { gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size); } diff --git a/target/arm/translate.c b/target/arm/translate.c index 19bd514a84..e221d0c959 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4454,47 +4454,62 @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) { - if (sh == 0) { - tcg_gen_mov_vec(d, a); - } else { - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); - tcg_gen_shri_vec(vece, t, a, sh); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); + 
tcg_gen_shri_vec(vece, t, a, sh); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); - tcg_temp_free_vec(t); - tcg_temp_free_vec(m); - } + tcg_temp_free_vec(t); + tcg_temp_free_vec(m); } -static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 }; +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = gen_shr8_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shr16_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shr32_ins_i32, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shr64_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; -const GVecGen2i sri_op[4] = { - { .fni8 = gen_shr8_ins_i64, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_8 }, - { .fni8 = gen_shr16_ins_i64, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_16 }, - { .fni4 = gen_shr32_ins_i32, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_32 }, - { .fni8 = gen_shr64_ins_i64, - .fniv = gen_shr_ins_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* Shift of esize leaves destination unchanged. 
*/ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -4532,47 +4547,60 @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) { - if (sh == 0) { - tcg_gen_mov_vec(d, a); - } else { - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); - tcg_gen_shli_vec(vece, t, a, sh); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); + tcg_gen_shli_vec(vece, t, a, sh); + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); - tcg_temp_free_vec(t); - tcg_temp_free_vec(m); - } + tcg_temp_free_vec(t); + tcg_temp_free_vec(m); } -static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 }; +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = gen_shl8_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shl16_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shl32_ins_i32, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shl64_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + 
.load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; -const GVecGen2i sli_op[4] = { - { .fni8 = gen_shl8_ins_i64, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_8 }, - { .fni8 = gen_shl16_ins_i64, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_16 }, - { .fni4 = gen_shl32_ins_i32, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_32 }, - { .fni8 = gen_shl64_ins_i64, - .fniv = gen_shl_ins_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [0..esize-1]. */ + tcg_debug_assert(shift >= 0); + tcg_debug_assert(shift < (8 << vece)); + + if (shift == 0) { + tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) { @@ -5713,20 +5741,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } /* Right shift comes here negative. */ shift = -shift; - /* Shift out of range leaves destination unchanged. */ - if (shift < 8 << size) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &sri_op[size]); - } + gen_gvec_sri(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); return 0; case 5: /* VSHL, VSLI */ if (u) { /* VSLI */ - /* Shift out of range leaves destination unchanged. */ - if (shift < 8 << size) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, - vec_size, shift, &sli_op[size]); - } + gen_gvec_sli(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { /* VSHL */ /* Shifts larger than the element size are * architecturally valid and results in zero. 
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index fd8b2bff49..096fea67ef 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -974,6 +974,44 @@ DO_RSRA(gvec_ursra_d, uint64_t) #undef DO_RSRA +#define DO_SRI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SRI(gvec_sri_b, uint8_t) +DO_SRI(gvec_sri_h, uint16_t) +DO_SRI(gvec_sri_s, uint32_t) +DO_SRI(gvec_sri_d, uint64_t) + +#undef DO_SRI + +#define DO_SLI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SLI(gvec_sli_b, uint8_t) +DO_SLI(gvec_sli_h, uint16_t) +DO_SLI(gvec_sli_s, uint32_t) +DO_SLI(gvec_sli_d, uint64_t) + +#undef DO_SLI + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. 
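Note on the DO_SRI and DO_SLI helpers above: both are bit-field inserts. SRI writes n >> shift into the low (esize - shift) bits of each destination element and keeps the top shift bits. SLI writes n into the bits from position shift upward and keeps the low shift bits. The sketch below restates that for 8-bit elements using plain masks rather than QEMU's deposit64. It is illustrative only, and the names sri8 and sli8 are hypothetical. The two boundary cases mirror what gen_gvec_sri and gen_gvec_sli do before the helper is reached.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical single-element models of SRI and SLI on 8-bit lanes.
 * Valid ranges follow the asserts in the expanders:
 * SRI takes 1 <= shift <= 8, SLI takes 0 <= shift <= 7.
 */
static uint8_t sri8(uint8_t d, uint8_t n, int shift)
{
    if (shift == 8) {
        return d;   /* shift of esize leaves the destination unchanged */
    }
    uint8_t keep = (uint8_t)(0xffu << (8 - shift)); /* top `shift` bits of d survive */
    return (uint8_t)((d & keep) | (n >> shift));
}

static uint8_t sli8(uint8_t d, uint8_t n, int shift)
{
    uint8_t keep = (uint8_t)((1u << shift) - 1);    /* low `shift` bits of d survive */
    return (uint8_t)((d & keep) | (uint8_t)(n << shift));
}
```

For example, sri8(0xAB, 0xFF, 4) gives 0xAF and sli8(0xAB, 0xFF, 4) gives 0xFB. A shift of 0 in sli8 returns n unchanged, which matches the tcg_gen_gvec_mov from rm_ofs to rd_ofs in gen_gvec_sli.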
From patchwork Fri May 8 15:21:48 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186385
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH v3 04/16] target/arm: Remove unnecessary range check for VSHL
Date: Fri, 8 May 2020 08:21:48 -0700
Message-Id: <20200508152200.6547-5-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>

In 1dc8425e551, while converting to gvec, I added an extra range check
against the shift count. This was unnecessary because the encoding of
the shift count produces 0 to the element size - 1.

Signed-off-by: Richard Henderson
---
 target/arm/translate.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

-- 
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate.c b/target/arm/translate.c
index e221d0c959..967108b3f4 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5750,16 +5750,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
gen_gvec_sli(size, rd_ofs, rm_ofs, shift, vec_size, vec_size); } else { /* VSHL */ - /* Shifts larger than the element size are - * architecturally valid and results in zero. 
- */ - if (shift >= 8 << size) { - tcg_gen_gvec_dup_imm(size, rd_ofs, - vec_size, vec_size, 0); - } else { - tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift, - vec_size, vec_size); - } + tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } return 0; } From patchwork Fri May 8 15:21:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 186388 Delivered-To: patch@linaro.org Received: by 2002:a92:8d81:0:0:0:0:0 with SMTP id w1csp98349ill; Fri, 8 May 2020 08:42:24 -0700 (PDT) X-Google-Smtp-Source: APiQypILYdQ4Q+GuzOaZ9YYjBBKiN2vktb7xGvqDr/ti4dIhbskh9d1rujw4v2kjGnfm1hbtn0D4 X-Received: by 2002:ad4:4744:: with SMTP id c4mr3437409qvx.203.1588952544241; Fri, 08 May 2020 08:42:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1588952544; cv=none; d=google.com; s=arc-20160816; b=k5HljC5owozbYuS4Li2ErdpULUPkhALwoI+DJPG1cKhP2F/nHOFYGF9HXp+ds+SpxD 9zB0KQFLnXOOJt/4MouK8V9ZU1JZEfgx2anxhS9BYjgSYleFPXryQ6wg8Z4tXDTLoDpE TROhv3nmvCFLh0XWEcWs0SB/z/e2oazs+FwP9SE0h3P7tycKV0QP6j9+zfL+7fhHu/7U psmkW5sRkBpTitB2Pc7A1gv3aGJTeMb7jwY/1ETLOcc7jsJnJBqvKDecvdeBYRqSsuom ZIXUAi7u2YJ6woJPwQwM/N8ACsOxl4iggK62GES245LVkY4ygLu0l1VXDrdHHV7zaDyq Qxxg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=ggARKVO3fE1KslGSPmXFgU6A6xt9PP9kAyW6YMyEtPE=; b=DIikxHvki50wWByYCf8/GwO00MetMdwfkr1yE3hcvfLL5w5eEp0+Cpd1EDpEvq039g kQeGLrD7N9N5Ai3Cx7FXjtfOaBOl6hwP5KCRJtX5wL+efgfqUteiIvKNuwwAu+8oe8Dz oUvpBph3IvVT0bt0IB/idMzjCsTLRfLj/Y3gbM9qERXdYgp7HUiVPN1fSUunJHoWYUBh EJGNl+rKbAEZ6iCqExjk3klzRWdUoLhXG/+b07T+ofbO/ywKMWT6svjL3/i26A/rcGvv C16IynrKck/QeT+sYdBCouCbb6ksTZUsCwMFnlajhOcSx6cSHf7/E+eJM6UTghRqd6Iq kbxQ== ARC-Authentication-Results: i=1; 
mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=FF7s2NHr; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:470:142::17]) by mx.google.com with ESMTPS id q11si1155980qtk.235.2020.05.08.08.42.24 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Fri, 08 May 2020 08:42:24 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) client-ip=2001:470:142::17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=FF7s2NHr; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:33506 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jX58p-00032g-Ov for patch@linaro.org; Fri, 08 May 2020 11:42:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:34938) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jX4pH-0004Vv-LZ for qemu-devel@nongnu.org; Fri, 08 May 2020 11:22:11 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:35698) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jX4pG-00075C-KN for qemu-devel@nongnu.org; Fri, 08 May 2020 11:22:11 -0400 Received: by mail-pf1-x442.google.com with SMTP id r14so1098490pfg.2 for ; Fri, 08 May 2020 08:22:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ggARKVO3fE1KslGSPmXFgU6A6xt9PP9kAyW6YMyEtPE=; b=FF7s2NHrtpYP6Staa8rGRTTe+4iyT4m59qcA6WFiR6CFqOvr0kuSRvoA2e3KFhHpU2 cPoN/0yO7+nZGpXWScwUer2FdoPmTXGMaaaoHIL7ebuqcRhYNsKLK1EnDlMoBk390bsI t849JZ0A4qJGLC2jQ30RToCf8Y3jlGUHry6yHQB4ZiSBdYgiYte6sjUaO5sK79JFeLAa ufsgtjtrBxbk+QSue0TQmiYD/oRxUjwauCurMmPCDgahnOHvgvFAg+hrGzlwA/fPxOGZ sHYxytZhoJFkMLqUHutD6uiHgCtfDhQp0Gn0pko+Hb0yybM8x1XyhsH+qPK/nvzSYbdb uIkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ggARKVO3fE1KslGSPmXFgU6A6xt9PP9kAyW6YMyEtPE=; b=pEfHnsHuWyOspT2Kh0d3j10hQekQlyBfa9UXeya5PxRrDk0/e99yVVB8V5Ffz0W623 DQi3ezccVWzERtWqzzuXzaUMuY064QynplGlKnczOX4MFCUCalWkO7uOmIRHbblRBDuc /hQ508ga6u8+Q+7GO8aHs5S0a7Mr5o3ROQ14LKC8sLYVang/pqyKzEjeTuIv7vhjPaV/ Qj6d4bxMhPgKyxnyPYbZHNpOjyCSEDYSIUNhebPPts7WJ9gqEa8pl5bXTQ4ha4H4cCB+ VagowY3h9zf5VG5MJVsUc9qkrGiZ4bnpi/Bk6tJOlONYSh9FQM/dyo0bWV8f8WO08Sb+ cfdg== X-Gm-Message-State: AGi0Pub8JvJyM8Dcee55hESVeRwV3bN0Keb+pZCDrbbKahYmPB/F90Wm u+7xINEugflca1Xaw4ndtOXzJqUNdIg= X-Received: by 2002:a62:5487:: with SMTP id i129mr3222318pfb.77.1588951328913; Fri, 08 May 2020 08:22:08 -0700 (PDT) Received: from localhost.localdomain (174-21-149-226.tukw.qwest.net. 
[174.21.149.226]) by smtp.gmail.com with ESMTPSA id n16sm2104575pfq.61.2020.05.08.08.22.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 08 May 2020 08:22:08 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 05/16] target/arm: Tidy handle_vec_simd_shri Date: Fri, 8 May 2020 08:21:49 -0700 Message-Id: <20200508152200.6547-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org> References: <20200508152200.6547-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::442; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x442.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Now that we've converted all cases to gvec, there is quite a bit of dead code at the end of the function. Remove it. Sink the call to gen_gvec_fn2i to the end, loading a function pointer within the switch statement. 
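
For readers unfamiliar with the pattern, the shape this patch moves to can
be sketched outside of QEMU as below.  This is only an illustration under
stated assumptions: ShiftFn, op_ushr, op_usra, op_urshr and
shift_right_demo are hypothetical stand-ins operating on one 32-bit
element, not the gvec expanders or any QEMU API.  Each case only selects
a function pointer, the one special case returns early, and a single call
site at the end does the work.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical function type, playing the role of GVecGen2iFn. */
typedef uint32_t ShiftFn(uint32_t acc, uint32_t val, int shift);

/* Plain unsigned shift right; the caller guarantees shift < 32. */
static uint32_t op_ushr(uint32_t acc, uint32_t val, int shift)
{
    (void)acc;
    return val >> shift;
}

/* Shift right and accumulate; 64-bit intermediate keeps shift == 32 defined. */
static uint32_t op_usra(uint32_t acc, uint32_t val, int shift)
{
    return acc + (uint32_t)((uint64_t)val >> shift);
}

/* Rounding shift right: add half of the discarded range, then shift. */
static uint32_t op_urshr(uint32_t acc, uint32_t val, int shift)
{
    (void)acc;
    return (uint32_t)(((uint64_t)val + (1ull << (shift - 1))) >> shift);
}

/* shift is expected to be in the range 1..32 for a 32-bit element. */
uint32_t shift_right_demo(int opcode, uint32_t acc, uint32_t val, int shift)
{
    ShiftFn *fn;

    switch (opcode) {
    case 0x00: /* plain shift */
        if (shift == 32) {
            /* Shift count equal to the element size produces zero. */
            return 0;
        }
        fn = op_ushr;
        break;
    case 0x02: /* accumulate */
        fn = op_usra;
        break;
    case 0x04: /* rounding */
        fn = op_urshr;
        break;
    default:
        /* Plays the role of g_assert_not_reached(). */
        assert(0 && "unhandled opcode");
        abort();
    }

    /* The single call site, sunk to the end of the function. */
    return fn(acc, val, shift);
}
```

With this layout, adding a new opcode means adding one case that assigns
the pointer; the argument list of the final call is written once.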

Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.c | 56 ++++++++++----------------------------
 1 file changed, 14 insertions(+), 42 deletions(-)

-- 
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 2d7dad6c3f..d5e77f34a7 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -10155,16 +10155,7 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     int size = 32 - clz32(immh) - 1;
     int immhb = immh << 3 | immb;
     int shift = 2 * (8 << size) - immhb;
-    bool accumulate = false;
-    int dsize = is_q ? 128 : 64;
-    int esize = 8 << size;
-    int elements = dsize/esize;
-    MemOp memop = size | (is_u ? 0 : MO_SIGN);
-    TCGv_i64 tcg_rn = new_tmp_a64(s);
-    TCGv_i64 tcg_rd = new_tmp_a64(s);
-    TCGv_i64 tcg_round;
-    uint64_t round_const;
-    int i;
+    GVecGen2iFn *gvec_fn;
 
     if (extract32(immh, 3, 1) && !is_q) {
         unallocated_encoding(s);
@@ -10178,13 +10169,12 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
 
     switch (opcode) {
     case 0x02: /* SSRA / USRA (accumulate) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
+        break;
 
     case 0x08: /* SRI */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
-        return;
+        gvec_fn = gen_gvec_sri;
+        break;
 
     case 0x00: /* SSHR / USHR */
         if (is_u) {
@@ -10192,49 +10182,31 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                 /* Shift count the same size as element size produces zero. */
                 tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
                                      is_q ? 16 : 8, vec_full_reg_size(s), 0);
-            } else {
-                gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
+                return;
             }
+            gvec_fn = tcg_gen_gvec_shri;
         } else {
             /* Shift count the same size as element size produces all sign. */
             if (shift == 8 << size) {
                 shift -= 1;
             }
-            gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
+            gvec_fn = tcg_gen_gvec_sari;
         }
-        return;
+        break;
 
     case 0x04: /* SRSHR / URSHR (rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
+        break;
 
     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
+        break;
 
     default:
         g_assert_not_reached();
     }
 
-    round_const = 1ULL << (shift - 1);
-    tcg_round = tcg_const_i64(round_const);
-
-    for (i = 0; i < elements; i++) {
-        read_vec_element(s, tcg_rn, rn, i, memop);
-        if (accumulate) {
-            read_vec_element(s, tcg_rd, rd, i, memop);
-        }
-
-        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
-                                accumulate, is_u, size, shift);
-
-        write_vec_element(s, tcg_rd, rd, i, size);
-    }
-    tcg_temp_free_i64(tcg_round);
-
-    clear_vec_high(s, is_q, rd);
+    gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
 }
 
 /* SHL/SLI - Vector shift left */

From patchwork Fri May 8 15:21:50 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186377
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 06/16] target/arm: Create gen_gvec_{ceq, clt, cle, cgt, cge}0
Date: Fri, 8 May 2020 08:21:50 -0700
Message-Id: <20200508152200.6547-7-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>
X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Macro-ize the 5 nearly identical comparisons. Signed-off-by: Richard Henderson --- target/arm/translate.h | 16 ++- target/arm/translate-a64.c | 22 ++-- target/arm/translate.c | 254 ++++++++----------------------------- 3 files changed, 74 insertions(+), 218 deletions(-) -- 2.20.1 Reviewed-by: Peter Maydell diff --git a/target/arm/translate.h b/target/arm/translate.h index fa5c3f12b9..e35c812cc5 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -275,11 +275,17 @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex) uint64_t vfp_expand_imm(int size, uint8_t imm8); /* Vector operations shared between ARM and AArch64. 
*/ -extern const GVecGen2 ceq0_op[4]; -extern const GVecGen2 clt0_op[4]; -extern const GVecGen2 cgt0_op[4]; -extern const GVecGen2 cle0_op[4]; -extern const GVecGen2 cge0_op[4]; +void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); + extern const GVecGen3 mla_op[4]; extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index d5e77f34a7..fef93dc27a 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -577,14 +577,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm, is_q ? 16 : 8, vec_full_reg_size(s)); } -/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */ -static void gen_gvec_op2(DisasContext *s, bool is_q, int rd, - int rn, const GVecGen2 *gvec_op) -{ - tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - is_q ? 16 : 8, vec_full_reg_size(s), gvec_op); -} - /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, int rn, int rm, const GVecGen3 *gvec_op) @@ -12310,13 +12302,21 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) } break; case 0x8: /* CMGT, CMGE */ - gen_gvec_op2(s, is_q, rd, rn, u ? 
&cge0_op[size] : &cgt0_op[size]); + if (u) { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size); + } else { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size); + } return; case 0x9: /* CMEQ, CMLE */ - gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]); + if (u) { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size); + } else { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size); + } return; case 0xa: /* CMLT */ - gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]); + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size); return; case 0xb: if (u) { /* ABS, NEG */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 967108b3f4..45df3743f6 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3645,204 +3645,59 @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn, return 1; } -static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero); - tcg_temp_free_vec(zero); -} +#define GEN_CMP0(NAME, COND) \ + static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a) \ + { \ + tcg_gen_setcondi_i32(COND, d, a, 0); \ + tcg_gen_neg_i32(d, d); \ + } \ + static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a) \ + { \ + tcg_gen_setcondi_i64(COND, d, a, 0); \ + tcg_gen_neg_i64(d, d); \ + } \ + static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \ + { \ + TCGv_vec zero = tcg_const_zeros_vec_matching(d); \ + tcg_gen_cmp_vec(COND, vece, d, a, zero); \ + tcg_temp_free_vec(zero); \ + } \ + void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m, \ + uint32_t opr_sz, uint32_t max_sz) \ + { \ + const GVecGen2 op[4] = { \ + { .fno = gen_helper_gvec_##NAME##0_b, 
\ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_8 }, \ + { .fno = gen_helper_gvec_##NAME##0_h, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_16 }, \ + { .fni4 = gen_##NAME##0_i32, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_32 }, \ + { .fni8 = gen_##NAME##0_i64, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .prefer_i64 = TCG_TARGET_REG_BITS == 64, \ + .vece = MO_64 }, \ + }; \ + tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]); \ + } static const TCGOpcode vecop_list_cmp[] = { INDEX_op_cmp_vec, 0 }; -const GVecGen2 ceq0_op[4] = { - { .fno = gen_helper_gvec_ceq0_b, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_ceq0_h, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_ceq0_i32, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_ceq0_i64, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; +GEN_CMP0(ceq, TCG_COND_EQ) +GEN_CMP0(cle, TCG_COND_LE) +GEN_CMP0(cge, TCG_COND_GE) +GEN_CMP0(clt, TCG_COND_LT) +GEN_CMP0(cgt, TCG_COND_GT) -static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cle0_op[4] = { - { .fno = gen_helper_gvec_cle0_b, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cle0_h, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cle0_i32, - .fniv = gen_cle0_vec, - 
.opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cle0_i64, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cge0_op[4] = { - { .fno = gen_helper_gvec_cge0_b, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cge0_h, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cge0_i32, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cge0_i64, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 clt0_op[4] = { - { .fno = gen_helper_gvec_clt0_b, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_clt0_h, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_clt0_i32, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_clt0_i64, - .fniv = gen_clt0_vec, - 
.opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cgt0_op[4] = { - { .fno = gen_helper_gvec_cgt0_b, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cgt0_h, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cgt0_i32, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cgt0_i64, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; +#undef GEN_CMP0 static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -6770,24 +6625,19 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; case NEON_2RM_VCEQ0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &ceq0_op[size]); + gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCGT0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cgt0_op[size]); + gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCLE0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cle0_op[size]); + gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCGE0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cge0_op[size]); + gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCLT0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &clt0_op[size]); + gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, 
                          vec_size);
         break;
     default:

From patchwork Fri May 8 15:21:51 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186383
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 07/16] target/arm: Create gen_gvec_{mla,mls}
Date: Fri, 8 May 2020 08:21:51 -0700
Message-Id: <20200508152200.6547-8-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
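
To illustrate the shape of the interface described above, here is a
minimal standalone sketch in plain C. It is not QEMU code: ToyGen3,
toy_gvec_mla, toy_gvec_mls and toy_expand3 are hypothetical stand-ins
for GVecGen3, gen_gvec_mla, gen_gvec_mls and a gen_gvec_fn3-style
caller, and the "expansion" simply runs the operation on host arrays
instead of emitting TCG ops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for GVecGen3: one descriptor per element size. */
typedef struct {
    void (*fn)(void *d, const void *n, const void *m, size_t bytes);
    unsigned vece;              /* log2 of the element size in bytes */
} ToyGen3;

/* d[i] += n[i] * m[i] on 8-bit elements, wrapping modulo 2^8. */
static void toy_mla8(void *vd, const void *vn, const void *vm, size_t bytes)
{
    uint8_t *d = vd;
    const uint8_t *n = vn, *m = vm;
    for (size_t i = 0; i < bytes; i++) {
        d[i] = (uint8_t)(d[i] + n[i] * m[i]);
    }
}

/* Same on 16-bit elements; the cast avoids signed overflow. */
static void toy_mla16(void *vd, const void *vn, const void *vm, size_t bytes)
{
    uint16_t *d = vd;
    const uint16_t *n = vn, *m = vm;
    for (size_t i = 0; i < bytes / 2; i++) {
        d[i] = (uint16_t)(d[i] + (uint32_t)n[i] * m[i]);
    }
}

/* d[i] -= n[i] * m[i], 8-bit and 16-bit. */
static void toy_mls8(void *vd, const void *vn, const void *vm, size_t bytes)
{
    uint8_t *d = vd;
    const uint8_t *n = vn, *m = vm;
    for (size_t i = 0; i < bytes; i++) {
        d[i] = (uint8_t)(d[i] - n[i] * m[i]);
    }
}

static void toy_mls16(void *vd, const void *vn, const void *vm, size_t bytes)
{
    uint16_t *d = vd;
    const uint16_t *n = vn, *m = vm;
    for (size_t i = 0; i < bytes / 2; i++) {
        d[i] = (uint16_t)(d[i] - (uint32_t)n[i] * m[i]);
    }
}

/* Common signature shared by every functional expander. */
typedef void ToyGVecFn3(unsigned vece, void *d, const void *n,
                        const void *m, size_t opr_sz);

/* The descriptor table is private to the function, as in the patch. */
void toy_gvec_mla(unsigned vece, void *d, const void *n,
                  const void *m, size_t opr_sz)
{
    static const ToyGen3 ops[2] = {
        { .fn = toy_mla8,  .vece = 0 },
        { .fn = toy_mla16, .vece = 1 },
    };
    assert(vece < 2);
    ops[vece].fn(d, n, m, opr_sz);
}

void toy_gvec_mls(unsigned vece, void *d, const void *n,
                  const void *m, size_t opr_sz)
{
    static const ToyGen3 ops[2] = {
        { .fn = toy_mls8,  .vece = 0 },
        { .fn = toy_mls16, .vece = 1 },
    };
    assert(vece < 2);
    ops[vece].fn(d, n, m, opr_sz);
}

/* Caller that only needs a function pointer and an element size. */
void toy_expand3(ToyGVecFn3 *fn, unsigned vece, void *d,
                 const void *n, const void *m, size_t opr_sz)
{
    fn(vece, d, n, m, opr_sz);
}
```

A caller picks the operation with a function pointer, for example
toy_expand3(u ? toy_gvec_mls : toy_gvec_mla, size, d, n, m, 8), and
never sees the per-size table.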

Signed-off-by: Richard Henderson
---
 target/arm/translate.h          |   7 +-
 target/arm/translate-a64.c      |   4 +-
 target/arm/translate-neon.inc.c |  16 +----
 target/arm/translate.c          | 117 +++++++++++++++++---------------
 4 files changed, 71 insertions(+), 73 deletions(-)

--
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate.h b/target/arm/translate.h
index e35c812cc5..9354ceba35 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -286,8 +286,11 @@ void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen3 mla_op[4];
-extern const GVecGen3 mls_op[4];
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 extern const GVecGen3 cmtst_op[4];
 extern const GVecGen3 sshl_op[4];
 extern const GVecGen3 ushl_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index fef93dc27a..ab9df12e44 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11226,9 +11226,9 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
         return;
     case 0x12: /* MLA, MLS */
         if (u) {
-            gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]);
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size);
         } else {
-            gen_gvec_op3(s, is_q, rd, rn, rm, &mla_op[size]);
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size);
         }
         return;
     case 0x11:
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 50b77b6d71..aefeff498a 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -632,6 +632,8 @@ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax)
 DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin)
 DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
+DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
+DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
 
 #define DO_3SAME_CMP(INSN, COND)                                        \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -685,20 +687,6 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
     return do_3same(s, a, gen_VMUL_p_3s);
 }
 
-#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY)                           \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-                                uint32_t oprsz, uint32_t maxsz)         \
-    {                                                                   \
-        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-                       oprsz, maxsz, &OPARRAY[vece]);                   \
-    }                                                                   \
-    DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s)
-
-
-DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op)
-DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op)
-
 #define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 45df3743f6..52b6d32cf1 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4520,62 +4520,69 @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
 /* Note that while NEON does not support VMLA and VMLS as 64-bit ops,
  * these tables are shared with AArch64 which does support them.
  */
+void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_mul_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fni4 = gen_mla8_i32,
+          .fniv = gen_mla_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni4 = gen_mla16_i32,
+          .fniv = gen_mla_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_mla32_i32,
+          .fniv = gen_mla_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_mla64_i64,
+          .fniv = gen_mla_vec,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
-static const TCGOpcode vecop_list_mla[] = {
-    INDEX_op_mul_vec, INDEX_op_add_vec, 0
-};
-
-static const TCGOpcode vecop_list_mls[] = {
-    INDEX_op_mul_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen3 mla_op[4] = {
-    { .fni4 = gen_mla8_i32,
-      .fniv = gen_mla_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mla,
-      .vece = MO_8 },
-    { .fni4 = gen_mla16_i32,
-      .fniv = gen_mla_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mla,
-      .vece = MO_16 },
-    { .fni4 = gen_mla32_i32,
-      .fniv = gen_mla_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mla,
-      .vece = MO_32 },
-    { .fni8 = gen_mla64_i64,
-      .fniv = gen_mla_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_mla,
-      .vece = MO_64 },
-};
-
-const GVecGen3 mls_op[4] = {
-    { .fni4 = gen_mls8_i32,
-      .fniv = gen_mls_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mls,
-      .vece = MO_8 },
-    { .fni4 = gen_mls16_i32,
-      .fniv = gen_mls_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mls,
-      .vece = MO_16 },
-    { .fni4 = gen_mls32_i32,
-      .fniv = gen_mls_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_mls,
-      .vece = MO_32 },
-    { .fni8 = gen_mls64_i64,
-      .fniv = gen_mls_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_mls,
-      .vece = MO_64 },
-};
+void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_mul_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fni4 = gen_mls8_i32,
+          .fniv = gen_mls_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni4 = gen_mls16_i32,
+          .fniv = gen_mls_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_mls32_i32,
+          .fniv = gen_mls_vec,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_mls64_i64,
+          .fniv = gen_mls_vec,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 /* CMTST : test is "if (X & Y != 0)". */
 static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)

From patchwork Fri May 8 15:21:52 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186381
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 08/16] target/arm: Swap argument order for VSHL during decode
Date: Fri, 8 May 2020 08:21:52 -0700
Message-Id: <20200508152200.6547-9-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Rather than perform the argument swap during code generation,
perform it during decode. This means it doesn't have to be
special cased later, and we can share code with aarch64 code
generation. Hopefully the decode comment addresses any confusion
that might arise in between.

Signed-off-by: Richard Henderson
---
 target/arm/neon-dp.decode       | 9 +++++++--
 target/arm/translate-neon.inc.c | 3 +--
 2 files changed, 8 insertions(+), 4 deletions(-)

--
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index ec3a92fe75..6b0b6566d6 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -65,8 +65,13 @@ VCGT_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same
 VCGE_S_3s        1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same
 VCGE_U_3s        1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same
 
-VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same
-VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same
+# The shift operations are of the form Vd = Vm << Vn.
+# By reversing the names of the fields here, we can use standard expanders.
+@3same_rev       .... ... . . . size:2 .... .... .... . q:1 . . .... \
+                 &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp
+
+VSHL_S_3s        1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev
+VSHL_U_3s        1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev
 
 VMAX_S_3s        1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same
 VMAX_U_3s        1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index aefeff498a..416302bcc7 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -692,8 +692,7 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
                                 uint32_t oprsz, uint32_t maxsz)         \
     {                                                                   \
-        /* Note the operation is vshl vd,vm,vn */                       \
-        tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs,                          \
+        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
                        oprsz, maxsz, &OPARRAY[vece]);                   \
     }                                                                   \
     DO_3SAME(INSN, gen_##INSN##_3s)

From patchwork Fri May 8 15:21:53 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186386
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 09/16] target/arm: Create gen_gvec_{cmtst,ushl,sshl}
Date: Fri, 8 May 2020 08:21:53 -0700
Message-Id: <20200508152200.6547-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
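
For orientation, the per-element behaviour of two of the operations
wrapped here can be sketched in plain C. This is an illustrative
reference only, not the TCG expansion from the patch. ref_cmtst32
follows the CMTST comment in translate.c (the result is all ones when
the operands share a set bit). ref_ushl32 assumes the Arm definition
of USHL, where the shift count is the signed low byte of the second
operand and a negative count shifts right. Both function names are
hypothetical, and SSHL is omitted.

```c
#include <stdint.h>

/* CMTST on one 32-bit element: all ones if (a & b) != 0, else zero. */
uint32_t ref_cmtst32(uint32_t a, uint32_t b)
{
    return (a & b) != 0 ? UINT32_MAX : 0;
}

/*
 * USHL on one 32-bit element (assumed architectural behaviour):
 * only the low byte of 'shift' is used, interpreted as signed.
 * Counts of 32 or more in either direction produce zero.
 */
uint32_t ref_ushl32(uint32_t src, uint32_t shift)
{
    int sh = (int)(shift & 0xff);

    if (sh > 127) {
        sh -= 256;              /* sign-extend the low byte */
    }
    if (sh >= 32 || sh <= -32) {
        return 0;
    }
    return sh >= 0 ? src << sh : src >> -sh;
}
```

For example, a shift operand of 0xfc is read as -4, so
ref_ushl32(0x80, 0xfc) shifts right by four bits.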

Signed-off-by: Richard Henderson
---
 target/arm/translate.h          |  10 ++-
 target/arm/translate-a64.c      |  18 ++--
 target/arm/translate-neon.inc.c |  23 +----
 target/arm/translate.c          | 146 +++++++++++++++++---------------
 4 files changed, 95 insertions(+), 102 deletions(-)

--
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 9354ceba35..a02a54cabf 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -291,9 +291,13 @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen3 cmtst_op[4];
-extern const GVecGen3 sshl_op[4];
-extern const GVecGen3 ushl_op[4];
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 extern const GVecGen4 uqadd_op[4];
 extern const GVecGen4 sqadd_op[4];
 extern const GVecGen4 uqsub_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index ab9df12e44..3956c19ed8 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -577,15 +577,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
             is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
-/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
-static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
-                         int rn, int rm, const GVecGen3 *gvec_op)
-{
-    tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                   vec_full_reg_offset(s, rm), is_q ? 16 : 8,
-                   vec_full_reg_size(s), gvec_op);
-}
-
 /* Expand a 3-operand operation using an out-of-line helper. */
 static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
                              int rn, int rm, int data, gen_helper_gvec_3 *fn)
@@ -11193,8 +11184,11 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                        (u ? uqsub_op : sqsub_op) + size);
         return;
     case 0x08: /* SSHL, USHL */
-        gen_gvec_op3(s, is_q, rd, rn, rm,
-                     u ? &ushl_op[size] : &sshl_op[size]);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size);
+        }
         return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
@@ -11233,7 +11227,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
         return;
     case 0x11:
         if (!u) { /* CMTST */
-            gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]);
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
             return;
         }
         /* else CMEQ */
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 416302bcc7..e16475c212 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -603,6 +603,8 @@ DO_3SAME(VBIC, tcg_gen_gvec_andc)
 DO_3SAME(VORR, tcg_gen_gvec_or)
 DO_3SAME(VORN, tcg_gen_gvec_orc)
 DO_3SAME(VEOR, tcg_gen_gvec_xor)
+DO_3SAME(VSHL_S, gen_gvec_sshl)
+DO_3SAME(VSHL_U, gen_gvec_ushl)
 
 /* These insns are all gvec_bitsel but with the inputs in various orders. */
 #define DO_3SAME_BITSEL(INSN, O1, O2, O3)                               \
@@ -634,6 +636,7 @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin)
 DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul)
 DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla)
 DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls)
+DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst)
 
 #define DO_3SAME_CMP(INSN, COND)                                        \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
@@ -650,13 +653,6 @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
 DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
 DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 
-static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
-                         uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
-{
-    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]);
-}
-DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s)
-
 #define DO_3SAME_GVEC4(INSN, OPARRAY)                                   \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
                                 uint32_t rn_ofs, uint32_t rm_ofs,       \
@@ -686,16 +682,3 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a)
     }
     return do_3same(s, a, gen_VMUL_p_3s);
 }
-
-#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY)                             \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs,         \
-                                uint32_t rn_ofs, uint32_t rm_ofs,       \
-                                uint32_t oprsz, uint32_t maxsz)         \
-    {                                                                   \
-        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,                          \
-                       oprsz, maxsz, &OPARRAY[vece]);                   \
-    }                                                                   \
-    DO_3SAME(INSN, gen_##INSN##_3s)
-
-DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op)
-DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 52b6d32cf1..e366281274 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4606,27 +4606,31 @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
     tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a);
 }
 
-static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 };
-
-const GVecGen3 cmtst_op[4] = {
-    { .fni4 = gen_helper_neon_tst_u8,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_8 },
-    { .fni4 = gen_helper_neon_tst_u16,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_16 },
-    { .fni4 = gen_cmtst_i32,
-      .fniv = gen_cmtst_vec,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_32 },
-    { .fni8 = gen_cmtst_i64,
-      .fniv = gen_cmtst_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .opt_opc = vecop_list_cmtst,
-      .vece = MO_64 },
-};
+void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 };
+    static const GVecGen3 ops[4] = {
+        { .fni4 = gen_helper_neon_tst_u8,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni4 = gen_helper_neon_tst_u16,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_cmtst_i32,
+          .fniv = gen_cmtst_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_cmtst_i64,
+          .fniv = gen_cmtst_vec,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 {
@@ -4744,29 +4748,33 @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst,
     tcg_temp_free_vec(rsh);
 }
 
-static const TCGOpcode ushl_list[] = {
-    INDEX_op_neg_vec, INDEX_op_shlv_vec,
-    INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
-};
-
-const GVecGen3 ushl_op[4] = {
-    { .fniv = gen_ushl_vec,
-      .fno = gen_helper_gvec_ushl_b,
-      .opt_opc = ushl_list,
-      .vece = MO_8 },
-    { .fniv = gen_ushl_vec,
-      .fno = gen_helper_gvec_ushl_h,
-      .opt_opc = ushl_list,
-      .vece = MO_16 },
-    { .fni4 = gen_ushl_i32,
-      .fniv = gen_ushl_vec,
-      .opt_opc = ushl_list,
-      .vece = MO_32 },
-    { .fni8 = gen_ushl_i64,
-      .fniv = gen_ushl_vec,
-      .opt_opc = ushl_list,
-      .vece = MO_64 },
-};
+void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_neg_vec, INDEX_op_shlv_vec,
+        INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_ushl_vec,
+          .fno = gen_helper_gvec_ushl_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_ushl_vec,
+          .fno = gen_helper_gvec_ushl_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_ushl_i32,
+          .fniv = gen_ushl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_ushl_i64,
+          .fniv = gen_ushl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift)
 {
@@ -4878,29 +4886,33 @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst,
     tcg_temp_free_vec(tmp);
 }
 
-static const TCGOpcode sshl_list[] = {
-    INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
-    INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
-};
-
-const GVecGen3 sshl_op[4] = {
-    { .fniv = gen_sshl_vec,
-      .fno = gen_helper_gvec_sshl_b,
-      .opt_opc = sshl_list,
-      .vece = MO_8 },
-    { .fniv = gen_sshl_vec,
-      .fno = gen_helper_gvec_sshl_h,
-      .opt_opc = sshl_list,
-      .vece = MO_16 },
-    { .fni4 = gen_sshl_i32,
-      .fniv = gen_sshl_vec,
-      .opt_opc = sshl_list,
-      .vece = MO_32 },
-    { .fni8 = gen_sshl_i64,
-      .fniv = gen_sshl_vec,
-      .opt_opc = sshl_list,
-      .vece = MO_64 },
-};
+void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec,
+        INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_sshl_vec,
+          .fno = gen_helper_gvec_sshl_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_sshl_vec,
+          .fno = gen_helper_gvec_sshl_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_sshl_i32,
+          .fniv = gen_sshl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_sshl_i64,
+          .fniv = gen_sshl_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)

From patchwork Fri May 8 15:21:54 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186384
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding;
bh=wmlmEvDjH/qmc42bbx9VZXnXkTk1yDurmZ9MBCyd0Cw=; b=lPcA/yrMY46r1uejyXfemiobLrZgp9Q2QgS4Tl1dknPDWX4jwIROIxS7McmqpKINT8 4/WNLmYvVryzN33Tv3woxJzUO9raRAV4MhCro0lHPQJYnQxVOzqIxKiNu/BNUiLVGGOo 3MGLj3HXdqJ8x4itIP2suPoZNbgVsF/jmWmJeWC5MQ6JFwuVUSNYcA1dSuTtc0wpmwLN PvQc/kxVukh6yCxa1XDKgtloKW/nuBs2DSRfwQd2kSe4g+wL7AQxhGQ1/2wN85ilV4RD G8Zt8ya05UL2l3fs1RxvOi/RV9cRfF7Haqg4/21ITUOPYiQwwzQhqRDYfICMSuvqWVlc rC/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=wmlmEvDjH/qmc42bbx9VZXnXkTk1yDurmZ9MBCyd0Cw=; b=BkyAVKPJ+QNpnzxQdspmfWHacNOsT1Kr2u2OuUQdNAWLsEXH/ccCU77qSwm77Ypsff EuE/xtVxeDO6DlRp7qJPUt9HRauhpjTWPoVw/hhJqWJ629PHxDb03MfAH3Mf6zGIkUpZ NsNoDZdipgoQ7JtW9p9vpAv7/mmOybDQGoboVv0gZyFIQiAyGZJEn/UaCOklxHEIki/o lhGS12NlOoNAmlj4+I1T0tZbv+zmdGsT6gNTb0WdsdFx7gcHbDoqkAPpD9nLE4MN2qhy tZ7cbYpvHCh0Krwk0+5ZUS4Nvpfyv4LZHrEG4L/EAtMWclA5Xy83UYLvpA10qUT+emYY OZUQ== X-Gm-Message-State: AGi0PubbGA5Z4V34bh/wnrgvxQ70hH+fb5nsy3y6C5HRPnew7i3pSvJX qmTX3jZOIv8GxAP6wNn5ho+k2wZ/11Y= X-Received: by 2002:a17:90a:648d:: with SMTP id h13mr6507135pjj.12.1588951336008; Fri, 08 May 2020 08:22:16 -0700 (PDT) Received: from localhost.localdomain (174-21-149-226.tukw.qwest.net. 
[174.21.149.226]) by smtp.gmail.com with ESMTPSA id n16sm2104575pfq.61.2020.05.08.08.22.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 08 May 2020 08:22:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 10/16] target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub} Date: Fri, 8 May 2020 08:21:54 -0700 Message-Id: <20200508152200.6547-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org> References: <20200508152200.6547-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1041; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1041.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. 
Signed-off-by: Richard Henderson
---
 target/arm/translate.h          |  13 +-
 target/arm/translate-a64.c      |  22 ++-
 target/arm/translate-neon.inc.c |  19 +--
 target/arm/translate.c          | 228 +++++++++++++++++---------------
 4 files changed, 147 insertions(+), 135 deletions(-)

-- 
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate.h b/target/arm/translate.h
index a02a54cabf..4e1778c5e0 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -298,16 +298,21 @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
-extern const GVecGen4 uqadd_op[4];
-extern const GVecGen4 sqadd_op[4];
-extern const GVecGen4 uqsub_op[4];
-extern const GVecGen4 sqsub_op[4];
 void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
 void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                    int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 3956c19ed8..ea5f6ceadc 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11168,20 +11168,18 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 
     switch (opcode) {
     case 0x01: /* SQADD, UQADD */
-        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
-                       offsetof(CPUARMState, vfp.qc),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm),
-                       is_q ? 16 : 8, vec_full_reg_size(s),
-                       (u ? uqadd_op : sqadd_op) + size);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size);
+        }
         return;
     case 0x05: /* SQSUB, UQSUB */
-        tcg_gen_gvec_4(vec_full_reg_offset(s, rd),
-                       offsetof(CPUARMState, vfp.qc),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm),
-                       is_q ? 16 : 8, vec_full_reg_size(s),
-                       (u ? uqsub_op : sqsub_op) + size);
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size);
+        }
         return;
     case 0x08: /* SSHL, USHL */
         if (u) {
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index e16475c212..099491b16f 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -605,6 +605,10 @@ DO_3SAME(VORN, tcg_gen_gvec_orc)
 DO_3SAME(VEOR, tcg_gen_gvec_xor)
 DO_3SAME(VSHL_S, gen_gvec_sshl)
 DO_3SAME(VSHL_U, gen_gvec_ushl)
+DO_3SAME(VQADD_S, gen_gvec_sqadd_qc)
+DO_3SAME(VQADD_U, gen_gvec_uqadd_qc)
+DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc)
+DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc)
 
 /* These insns are all gvec_bitsel but with the inputs in various orders. */
 #define DO_3SAME_BITSEL(INSN, O1, O2, O3) \
@@ -653,21 +657,6 @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE)
 DO_3SAME_CMP(VCGE_U, TCG_COND_GEU)
 DO_3SAME_CMP(VCEQ, TCG_COND_EQ)
 
-#define DO_3SAME_GVEC4(INSN, OPARRAY) \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
-                                uint32_t rn_ofs, uint32_t rm_ofs, \
-                                uint32_t oprsz, uint32_t maxsz) \
-    { \
-        tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), \
-                       rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]); \
-    } \
-    DO_3SAME(INSN, gen_##INSN##_3s)
-
-DO_3SAME_GVEC4(VQADD_S, sqadd_op)
-DO_3SAME_GVEC4(VQADD_U, uqadd_op)
-DO_3SAME_GVEC4(VQSUB_S, sqsub_op)
-DO_3SAME_GVEC4(VQSUB_U, uqsub_op)
-
 static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                           uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz)
 {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index e366281274..72c3aab544 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4925,32 +4925,37 @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_uqadd[] = {
-    INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-};
-
-const GVecGen4 uqadd_op[4] = {
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_b,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_8 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_h,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_16 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_s,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_32 },
-    { .fniv = gen_uqadd_vec,
-      .fno = gen_helper_gvec_uqadd_d,
-      .write_aofs = true,
-      .opt_opc = vecop_list_uqadd,
-      .vece = MO_64 },
-};
+void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_b,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_h,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_s,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fniv = gen_uqadd_vec,
+          .fno = gen_helper_gvec_uqadd_d,
+          .write_aofs = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -4963,32 +4968,37 @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_sqadd[] = {
-    INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
-};
-
-const GVecGen4 sqadd_op[4] = {
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_b,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_h,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_s,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_sqadd_vec,
-      .fno = gen_helper_gvec_sqadd_d,
-      .opt_opc = vecop_list_sqadd,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_sqadd_vec,
+          .fno = gen_helper_gvec_sqadd_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -5001,32 +5011,37 @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_uqsub[] = {
-    INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen4 uqsub_op[4] = {
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_b,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_h,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_s,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_uqsub_vec,
-      .fno = gen_helper_gvec_uqsub_d,
-      .opt_opc = vecop_list_uqsub,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_uqsub_vec,
+          .fno = gen_helper_gvec_uqsub_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
                           TCGv_vec a, TCGv_vec b)
@@ -5039,32 +5054,37 @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat,
     tcg_temp_free_vec(x);
 }
 
-static const TCGOpcode vecop_list_sqsub[] = {
-    INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
-};
-
-const GVecGen4 sqsub_op[4] = {
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_b,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_8 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_h,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_16 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_s,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_32 },
-    { .fniv = gen_sqsub_vec,
-      .fno = gen_helper_gvec_sqsub_d,
-      .opt_opc = vecop_list_sqsub,
-      .write_aofs = true,
-      .vece = MO_64 },
-};
+void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen4 ops[4] = {
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_b,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_8 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_h,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_16 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_s,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_32 },
+        { .fniv = gen_sqsub_vec,
+          .fno = gen_helper_gvec_sqsub_d,
+          .opt_opc = vecop_list,
+          .write_aofs = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc),
+                   rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
 
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.

From patchwork Fri May 8 15:21:55 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186387
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 11/16] target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
Date: Fri, 8 May 2020 08:21:55 -0700
Message-Id: <20200508152200.6547-12-richard.henderson@linaro.org>
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

These operations do not touch fp_status.

Signed-off-by: Richard Henderson --- target/arm/helper.h | 4 ++-- target/arm/translate-a64.c | 5 ++--- target/arm/translate.c | 12 ++---------- target/arm/vfp_helper.c | 4 ++-- 4 files changed, 8 insertions(+), 17 deletions(-) -- 2.20.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper.h b/target/arm/helper.h index 33c76192d2..aed3050965 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -211,8 +211,8 @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr) DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr) DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr) DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr) -DEF_HELPER_2(recpe_u32, i32, i32, ptr) -DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr) +DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32) +DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32) DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32) DEF_HELPER_3(shl_cc, i32, env, i32, i32) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index ea5f6ceadc..367fa403ae 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -9699,7 +9699,7 @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode, switch (opcode) { case 0x3c: /* URECPE */ - gen_helper_recpe_u32(tcg_res, tcg_op, fpst); + gen_helper_recpe_u32(tcg_res, tcg_op); break; case 0x3d: /* FRECPE */ gen_helper_recpe_f32(tcg_res, tcg_op, fpst); @@ -12244,7 +12244,6 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) unallocated_encoding(s); return; } - need_fpstatus = true; break; case 0x1e: /* FRINT32Z */ case 0x1f: /* FRINT64Z */ @@ -12412,7 +12411,7 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus); break; case 0x7c: /* URSQRTE */ - gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus); + gen_helper_rsqrte_u32(tcg_res, tcg_op); break; case 0x1e: /* FRINT32Z */ case 
0x5e: /* FRINT32X */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 72c3aab544..676701143b 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -6873,19 +6873,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; } case NEON_2RM_VRECPE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_recpe_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_recpe_u32(tmp, tmp); break; - } case NEON_2RM_VRSQRTE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_rsqrte_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_rsqrte_u32(tmp, tmp); break; - } case NEON_2RM_VRECPE_F: { TCGv_ptr fpstatus = get_fpstatus_ptr(1); diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c index 930d6e747f..a792661166 100644 --- a/target/arm/vfp_helper.c +++ b/target/arm/vfp_helper.c @@ -1023,7 +1023,7 @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp) return make_float64(val); } -uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(recpe_u32)(uint32_t a) { /* float_status *s = fpstp; */ int input, estimate; @@ -1038,7 +1038,7 @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) return deposit32(0, (32 - 9), 9, estimate); } -uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(rsqrte_u32)(uint32_t a) { int estimate; From patchwork Fri May 8 15:21:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 186389 Delivered-To: patch@linaro.org Received: by 2002:a92:8d81:0:0:0:0:0 with SMTP id w1csp98515ill; Fri, 8 May 2020 08:42:38 -0700 (PDT) X-Google-Smtp-Source: APiQypKEuQJaT+BaBCEgTB2RIf3la9OgtVzmZi3MbglwsIunBMcdeZljFHStMIBDzR/JQRh/liT4 X-Received: by 2002:ac8:3733:: with SMTP id o48mr3636046qtb.149.1588952558061; Fri, 08 May 2020 08:42:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1588952558; cv=none; d=google.com; s=arc-20160816; 
b=vRxlZOI+nyFhTkLlqt3D7eInUXG9qG0syxZ0gtUgLU39DNUZxH0OQ7bqtB3fCyzEWC rVvOalq6LypBZVsNfOV9Es7oXeDr2Rwj0YqjnE/iWcvCSkb22QFtkr+qHp9rbfR7Qq2x iXWdfX41H9yjV2S2dScbKcZtCtBPn262ehglVrymYLqk4DK7MaqCGs++AbzX9lt8BERg SEXwUuuVOBzVk2OZg+Iy8aGW6HCCPhfyXm6ice+VQ511Za2uSbyE6A7zqsw3AUw4y8mQ HvhBCYFFHnhRIUCfpwfIfLTm8YVdyEe+cH2i2RZfXP9SLGFYaPiVXYa6zmd1aS0RQujG n8WQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=gc/m33i2xUJ86Yrdh+2LNrQ2J4l5DWq2aMDMhprlab0=; b=Ku7cq1/9dEp55Cf3kxfYBpWFyLmrrNh1jCoUTYjG7vmodekTe8JDazuQnqYJlFH7i/ QAt24mGMJd2Q8POg+VDkVsO2mrDZH6etqBrnwG8sOXF9dwQi2Y7r0ubDTCWDF/CVySn7 cZHW7dohuRNX+GivN8LcotouvIvpTzZ+gXbU5MN6QoPkMuDIiH8qK8uENvCaWJcJbhff fF5MhkFejMpWNo5VCP7XYrYKU6rt+TgepDROVfCIgOkunNjoSbdAUlEOTk1RoTeaSj+Y uIfoAmRnPl5oe4IBSbs/U7N2p5pOgu+U1rV2c49xJNeRIJ7iojjeXi0LiJ6wOoyNnuVT ccBw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=wXIErqxc; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. 
[2001:470:142::17]) by mx.google.com with ESMTPS id o48si1159597qtk.221.2020.05.08.08.42.37 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Fri, 08 May 2020 08:42:38 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) client-ip=2001:470:142::17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=wXIErqxc; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:470:142::17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:34568 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jX593-0003Xh-Ha for patch@linaro.org; Fri, 08 May 2020 11:42:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35064) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jX4pU-0004kB-2J for qemu-devel@nongnu.org; Fri, 08 May 2020 11:22:24 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:43802) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jX4pQ-0007Cl-Ty for qemu-devel@nongnu.org; Fri, 08 May 2020 11:22:23 -0400 Received: by mail-pl1-x644.google.com with SMTP id z6so849477plk.10 for ; Fri, 08 May 2020 08:22:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gc/m33i2xUJ86Yrdh+2LNrQ2J4l5DWq2aMDMhprlab0=; b=wXIErqxcZOpLpdxbx05NwoJz5pSTUsHZ4MIZI8VA/slJN11b5FK/qaEVT01oT9nDZW mZUPgL/1wnLQp14hhsY/9RVaU1Izuw4MzB40hGx6gWqSwIMsEo53F86f0LeJo45eiJgb iYHdxEgTYPlOHEeJ387C2YRk5IxqzyCv3hqGFXAEJ40cKfdVYVHJxy6uPB9Flo6QicWL 
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 12/16] target/arm: Create gen_gvec_{qrdmla,qrdmls}
Date: Fri, 8 May 2020 08:21:56 -0700
Message-Id: <20200508152200.6547-13-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Cc: peter.maydell@linaro.org

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Signed-off-by: Richard Henderson
---
 target/arm/translate.h     |  5 ++++
 target/arm/translate-a64.c | 34 ++----------------------
 target/arm/translate.c     | 54 +++++++++++++++++++-------------------
 3 files changed, 34 insertions(+), 59 deletions(-)

-- 
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 4e1778c5e0..aea8a9759d 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -332,6 +332,11 @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                   int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 367fa403ae..4577df3cf4 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -587,18 +587,6 @@ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
                        is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
 }
 
-/* Expand a 3-operand + env pointer operation using
- * an out-of-line helper.
- */
-static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd,
-                             int rn, int rm, gen_helper_gvec_3_ptr *fn)
-{
-    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
-                       vec_full_reg_offset(s, rn),
-                       vec_full_reg_offset(s, rm), cpu_env,
-                       is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
-}
-
 /* Expand a 3-operand + fpstatus pointer + simd data value operation using
  * an out-of-line helper.
  */
@@ -11693,29 +11681,11 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
 
     switch (opcode) {
     case 0x0: /* SQRDMLAH (vector) */
-        switch (size) {
-        case 1:
-            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16);
-            break;
-        case 2:
-            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32);
-            break;
-        default:
-            g_assert_not_reached();
-        }
+        gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size);
         return;
 
     case 0x1: /* SQRDMLSH (vector) */
-        switch (size) {
-        case 1:
-            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16);
-            break;
-        case 2:
-            gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32);
-            break;
-        default:
-            g_assert_not_reached();
-        }
+        gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size);
         return;
 
     case 0x2: /* SDOT / UDOT */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 676701143b..8321644f25 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3629,20 +3629,26 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VCVT_UF] = 0x4,
 };
 
-
-/* Expand v8.1 simd helper.  */
-static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
-                         int q, int rd, int rn, int rm)
+void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 {
-    if (dc_isar_feature(aa32_rdm, s)) {
-        int opr_sz = (1 + q) * 8;
-        tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
-                           vfp_reg_offset(1, rn),
-                           vfp_reg_offset(1, rm), cpu_env,
-                           opr_sz, opr_sz, 0, fn);
-        return 0;
-    }
-    return 1;
+    static gen_helper_gvec_3_ptr * const fns[2] = {
+        gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
+    };
+    tcg_debug_assert(vece >= 1 && vece <= 2);
+    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
+                       opr_sz, max_sz, 0, fns[vece - 1]);
+}
+
+void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                          uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static gen_helper_gvec_3_ptr * const fns[2] = {
+        gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
+    };
+    tcg_debug_assert(vece >= 1 && vece <= 2);
+    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
+                       opr_sz, max_sz, 0, fns[vece - 1]);
 }
 
 #define GEN_CMP0(NAME, COND) \
@@ -5197,13 +5203,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 break;  /* VPADD */
             }
             /* VQRDMLAH */
-            switch (size) {
-            case 1:
-                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16,
-                                     q, rd, rn, rm);
-            case 2:
-                return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32,
-                                     q, rd, rn, rm);
+            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
+                gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs,
+                                     vec_size, vec_size);
+                return 0;
             }
             return 1;
 
@@ -5216,13 +5219,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 break;
             }
             /* VQRDMLSH */
-            switch (size) {
-            case 1:
-                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16,
-                                     q, rd, rn, rm);
-            case 2:
-                return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32,
-                                     q, rd, rn, rm);
+            if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) {
+                gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs,
+                                     vec_size, vec_size);
+                return 0;
             }
             return 1;
 

From patchwork Fri May 8 15:21:57 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186391
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 13/16] target/arm: Pass pointer to qc to qrdmla/qrdmls
Date: Fri, 8 May 2020 08:21:57 -0700
Message-Id: <20200508152200.6547-14-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Cc: peter.maydell@linaro.org

Pass a pointer directly to env->vfp.qc[0], rather than env.
This will allow SVE2, which does not modify QC, to pass a
pointer to dummy storage.
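
For readers following along outside the QEMU tree, the idea can be sketched in standalone C. This is only an illustration modelled on inl_qrdmlah_s16 from the patch; the names qrdmlah_s16_sketch and qrdmlah_s16_vec_sketch are made up for this note, and the left shift is written as a multiplication so the sketch does not rely on QEMU's build flags. The point is that the element routine only needs somewhere to record saturation, so the caller can hand it the real QC flag or a throwaway word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Per-element step, modelled on inl_qrdmlah_s16 in the patch.
 * Assumption: right shift of a negative int32_t is arithmetic,
 * as it is with GCC and Clang.
 */
static int16_t qrdmlah_s16_sketch(int16_t src1, int16_t src2,
                                  int16_t src3, uint32_t *sat)
{
    int32_t ret = (int32_t)src1 * src2;
    ret = (int32_t)src3 * 32768 + ret + (1 << 14);
    ret >>= 15;
    if (ret < INT16_MIN || ret > INT16_MAX) {
        *sat = 1;   /* sticky: only ever set, never cleared here */
        ret = (ret < 0 ? INT16_MIN : INT16_MAX);
    }
    return (int16_t)ret;
}

/*
 * Hypothetical vector loop: the caller chooses where saturation is
 * recorded, either a real flag or dummy storage.
 */
static void qrdmlah_s16_vec_sketch(int16_t *d, const int16_t *n,
                                   const int16_t *m, size_t count,
                                   uint32_t *sat)
{
    for (size_t i = 0; i < count; ++i) {
        d[i] = qrdmlah_s16_sketch(n[i], m[i], d[i], sat);
    }
}
```

A caller that wants the sticky flag passes its address; a caller that does not care passes the address of a local scratch variable and ignores it afterwards. The element results are the same in both cases.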

Signed-off-by: Richard Henderson
---
 target/arm/translate.c  | 18 ++++++++---
 target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------
 2 files changed, 54 insertions(+), 34 deletions(-)

-- 
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 8321644f25..0e2cf6028a 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3629,6 +3629,18 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VCVT_UF] = 0x4,
 };
 
+static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
+                            uint32_t opr_sz, uint32_t max_sz,
+                            gen_helper_gvec_3_ptr *fn)
+{
+    TCGv_ptr qc_ptr = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc));
+    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr,
+                       opr_sz, max_sz, 0, fn);
+    tcg_temp_free_ptr(qc_ptr);
+}
+
 void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
 {
@@ -3636,8 +3648,7 @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
         gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32
     };
     tcg_debug_assert(vece >= 1 && vece <= 2);
-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-                       opr_sz, max_sz, 0, fns[vece - 1]);
+    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
 }
 
 void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
@@ -3647,8 +3658,7 @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
         gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32
     };
     tcg_debug_assert(vece >= 1 && vece <= 2);
-    tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env,
-                       opr_sz, max_sz, 0, fns[vece - 1]);
+    gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]);
 }
 
 #define GEN_CMP0(NAME, COND) \
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 096fea67ef..6aa2ca0827 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -36,8 +36,6 @@
 #define H4(x) (x)
 #endif
 
-#define SET_QC() env->vfp.qc[0] = 1
-
 static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
 {
     uint64_t *d = vd + opr_sz;
@@ -49,8 +47,8 @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz)
 }
 
 /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */
-static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
-                                int16_t src2, int16_t src3)
+static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2,
+                               int16_t src3, uint32_t *sat)
 {
     /* Simplify:
      * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -60,7 +58,7 @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
     ret = ((int32_t)src3 << 15) + ret + (1 << 14);
     ret >>= 15;
     if (ret != (int16_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? -0x8000 : 0x7fff);
     }
     return ret;
@@ -69,30 +67,30 @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1,
 uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1,
                                   uint32_t src2, uint32_t src3)
 {
-    uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3);
-    uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    uint32_t *sat = &env->vfp.qc[0];
+    uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat);
+    uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
     return deposit32(e1, 16, 16, e2);
 }
 
 void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int16_t *d = vd;
     int16_t *n = vn;
     int16_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 2; ++i) {
-        d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
 /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */
-static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
-                                int16_t src2, int16_t src3)
+static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2,
+                               int16_t src3, uint32_t *sat)
 {
     /* Similarly, using subtraction:
      * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16
@@ -102,7 +100,7 @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
     ret = ((int32_t)src3 << 15) - ret + (1 << 14);
     ret >>= 15;
     if (ret != (int16_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? -0x8000 : 0x7fff);
     }
     return ret;
@@ -111,85 +109,97 @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1,
 uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1,
                                   uint32_t src2, uint32_t src3)
 {
-    uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3);
-    uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16);
+    uint32_t *sat = &env->vfp.qc[0];
+    uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat);
+    uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat);
     return deposit32(e1, 16, 16, e2);
 }
 
 void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int16_t *d = vd;
     int16_t *n = vn;
     int16_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 2; ++i) {
-        d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
 /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */
-uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
-                                  int32_t src2, int32_t src3)
+static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2,
+                               int32_t src3, uint32_t *sat)
 {
     /* Simplify similarly to int_qrdmlah_s16 above.  */
     int64_t ret = (int64_t)src1 * src2;
     ret = ((int64_t)src3 << 31) + ret + (1 << 30);
     ret >>= 31;
     if (ret != (int32_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? INT32_MIN : INT32_MAX);
     }
     return ret;
 }
 
+uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    uint32_t *sat = &env->vfp.qc[0];
+    return inl_qrdmlah_s32(src1, src2, src3, sat);
+}
+
 void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int32_t *d = vd;
     int32_t *n = vn;
     int32_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 4; ++i) {
-        d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
 /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */
-uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
-                                  int32_t src2, int32_t src3)
+static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2,
+                               int32_t src3, uint32_t *sat)
 {
     /* Simplify similarly to int_qrdmlsh_s16 above.  */
     int64_t ret = (int64_t)src1 * src2;
     ret = ((int64_t)src3 << 31) - ret + (1 << 30);
     ret >>= 31;
     if (ret != (int32_t)ret) {
-        SET_QC();
+        *sat = 1;
         ret = (ret < 0 ? INT32_MIN : INT32_MAX);
     }
     return ret;
 }
 
+uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1,
+                                  int32_t src2, int32_t src3)
+{
+    uint32_t *sat = &env->vfp.qc[0];
+    return inl_qrdmlsh_s32(src1, src2, src3, sat);
+}
+
 void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
-                              void *ve, uint32_t desc)
+                              void *vq, uint32_t desc)
 {
     uintptr_t opr_sz = simd_oprsz(desc);
     int32_t *d = vd;
     int32_t *n = vn;
     int32_t *m = vm;
-    CPUARMState *env = ve;
     uintptr_t i;
 
     for (i = 0; i < opr_sz / 4; ++i) {
-        d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]);
+        d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq);
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }

From patchwork Fri May 8 15:21:58 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186392
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 14/16] target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
Date: Fri, 8 May 2020 08:21:58 -0700
Message-Id: <20200508152200.6547-15-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Cc: peter.maydell@linaro.org

Must clear the tail for AdvSIMD when SVE is enabled.
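
As a rough standalone illustration of what "clearing the tail" means: the helper computes opr_sz bytes of result, but the destination register may be max_sz bytes wide, and the bytes in between must read as zero afterwards. The sketch below is not the QEMU code; clear_tail_sketch and mul_idx_sketch are hypothetical names, and integer lanes stand in for the floating-point lanes the real helpers use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for clear_tail() in vec_helper.c: zero every
 * byte of the destination between the operation size and the maximum
 * vector size.
 */
static void clear_tail_sketch(void *vd, size_t opr_sz, size_t max_sz)
{
    if (max_sz > opr_sz) {
        memset((uint8_t *)vd + opr_sz, 0, max_sz - opr_sz);
    }
}

/*
 * Hypothetical indexed multiply on 32-bit integer lanes: compute
 * opr_sz bytes of result, then clear the rest of the register.
 */
static void mul_idx_sketch(uint32_t *d, const uint32_t *n, uint32_t mm,
                           size_t opr_sz, size_t max_sz)
{
    for (size_t i = 0; i < opr_sz / sizeof(uint32_t); ++i) {
        d[i] = n[i] * mm;
    }
    clear_tail_sketch(d, opr_sz, max_sz);
}
```

Without the final call, lanes beyond opr_sz would keep whatever the destination held before the operation, which is the situation the patch addresses for the indexed multiply helpers.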

Fixes: ca40a6e6e39
Signed-off-by: Richard Henderson
---
 target/arm/vec_helper.c | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 6aa2ca0827..a483841add 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -747,6 +747,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
             d[i + j] = TYPE##_mul(n[i + j], mm, stat); \
         } \
     } \
+    clear_tail(d, oprsz, simd_maxsz(desc)); \
 }
 
 DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
@@ -771,6 +772,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \
                                      mm, a[i + j], 0, stat); \
         } \
     } \
+    clear_tail(d, oprsz, simd_maxsz(desc)); \
 }
 
 DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2)

From patchwork Fri May 8 15:21:59 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186394
[174.21.149.226]) by smtp.gmail.com with ESMTPSA id n16sm2104575pfq.61.2020.05.08.08.22.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 08 May 2020 08:22:22 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 15/16] target/arm: Vectorize SABD/UABD Date: Fri, 8 May 2020 08:21:59 -0700 Message-Id: <20200508152200.6547-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org> References: <20200508152200.6547-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::643; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x643.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Include 64-bit element size in preparation for SVE2. 
Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  10 +++
 target/arm/translate.h     |   5 ++
 target/arm/translate-a64.c |   8 ++-
 target/arm/translate.c     | 133 ++++++++++++++++++++++++++++++++++++-
 target/arm/vec_helper.c    |  24 +++++++
 5 files changed, 176 insertions(+), 4 deletions(-)

-- 
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper.h b/target/arm/helper.h
index aed3050965..4678d3a6f4 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -731,6 +731,16 @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index aea8a9759d..bbfe3d1393 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -337,6 +337,11 @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                           uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 4577df3cf4..54b06553a6 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11190,6 +11190,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size);
         }
         return;
+    case 0xe: /* SABD, UABD */
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
+        }
+        return;
     case 0x10: /* ADD, SUB */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -11322,7 +11329,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                 genenvfn = fns[size][u];
                 break;
             }
-            case 0xe: /* SABD, UABD */
             case 0xf: /* SABA, UABA */
             {
                 static NeonGenTwoOpFn * const fns[3][2] = {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 0e2cf6028a..0ed6f49f42 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5102,6 +5102,126 @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 }
 
+static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_sub_i32(t, a, b);
+    tcg_gen_sub_i32(d, b, a);
+    tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_sub_i64(t, a, b);
+    tcg_gen_sub_i64(d, b, a);
+    tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_smin_vec(vece, t, a, b);
+    tcg_gen_smax_vec(vece, d, a, b);
+    tcg_gen_sub_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_sabd_i32,
+          .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_sabd_i64,
+          .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_sub_i32(t, a, b);
+    tcg_gen_sub_i32(d, b, a);
+    tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_sub_i64(t, a, b);
+    tcg_gen_sub_i64(d, b, a);
+    tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_umin_vec(vece, t, a, b);
+    tcg_gen_umax_vec(vece, d, a, b);
+    tcg_gen_sub_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_uabd_i32,
+          .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_uabd_i64,
+          .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -5236,6 +5356,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
             return 1;
 
+        case NEON_3R_VABD:
+            if (u) {
+                gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            } else {
+                gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            }
+            return 0;
+
         case NEON_3R_VADD_VSUB:
         case NEON_3R_LOGIC:
         case NEON_3R_VMAX:
@@ -5380,9 +5510,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQRSHL:
             GEN_NEON_INTEGER_OP_ENV(qrshl);
             break;
-        case NEON_3R_VABD:
-            GEN_NEON_INTEGER_OP(abd);
-            break;
         case NEON_3R_VABA:
             GEN_NEON_INTEGER_OP(abd);
             tcg_temp_free_i32(tmp2);
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index a483841add..a4972d02fc 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1407,3 +1407,27 @@ DO_CMP0(gvec_cgt0_h, int16_t, >)
 DO_CMP0(gvec_cge0_h, int16_t, >=)
 
 #undef DO_CMP0
+
+#define DO_ABD(NAME, TYPE)                                      \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
+{                                                               \
+    intptr_t i, opr_sz = simd_oprsz(desc);                      \
+    TYPE *d = vd, *n = vn, *m = vm;                             \
+                                                                \
+    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
+        d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];         \
+    }                                                           \
+    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
+}
+
+DO_ABD(gvec_sabd_b, int8_t)
+DO_ABD(gvec_sabd_h, int16_t)
+DO_ABD(gvec_sabd_s, int32_t)
+DO_ABD(gvec_sabd_d, int64_t)
+
+DO_ABD(gvec_uabd_b, uint8_t)
+DO_ABD(gvec_uabd_h, uint16_t)
+DO_ABD(gvec_uabd_s, uint32_t)
+DO_ABD(gvec_uabd_d, uint64_t)
+
+#undef DO_ABD

From patchwork Fri May 8 15:22:00 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 186390
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 16/16] target/arm: Vectorize SABA/UABA
Date: Fri, 8 May 2020 08:22:00 -0700
Message-Id: <20200508152200.6547-17-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200508152200.6547-1-richard.henderson@linaro.org>
References: <20200508152200.6547-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

Include 64-bit element size in preparation for SVE2.
Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  17 +++--
 target/arm/translate.h     |   5 ++
 target/arm/neon_helper.c   |  10 ---
 target/arm/translate-a64.c |  17 ++---
 target/arm/translate.c     | 134 +++++++++++++++++++++++++++++++++++--
 target/arm/vec_helper.c    |  24 +++++++
 6 files changed, 174 insertions(+), 33 deletions(-)

-- 
2.20.1

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 4678d3a6f4..1857f4ee46 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -284,13 +284,6 @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32)
 DEF_HELPER_2(neon_pmax_u16, i32, i32, i32)
 DEF_HELPER_2(neon_pmax_s16, i32, i32, i32)
 
-DEF_HELPER_2(neon_abd_u8, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s8, i32, i32, i32)
-DEF_HELPER_2(neon_abd_u16, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s16, i32, i32, i32)
-DEF_HELPER_2(neon_abd_u32, i32, i32, i32)
-DEF_HELPER_2(neon_abd_s32, i32, i32, i32)
-
 DEF_HELPER_2(neon_shl_u16, i32, i32, i32)
 DEF_HELPER_2(neon_shl_s16, i32, i32, i32)
 DEF_HELPER_2(neon_rshl_u8, i32, i32, i32)
@@ -741,6 +734,16 @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index bbfe3d1393..c937dfe9bf 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -342,6 +342,11 @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
index 448be93fa1..2ef75e04c8 100644
--- a/target/arm/neon_helper.c
+++ b/target/arm/neon_helper.c
@@ -576,16 +576,6 @@ NEON_POP(pmax_s16, neon_s16, 2)
 NEON_POP(pmax_u16, neon_u16, 2)
 #undef NEON_FN
 
-#define NEON_FN(dest, src1, src2) \
-    dest = (src1 > src2) ? (src1 - src2) : (src2 - src1)
-NEON_VOP(abd_s8, neon_s8, 4)
-NEON_VOP(abd_u8, neon_u8, 4)
-NEON_VOP(abd_s16, neon_s16, 2)
-NEON_VOP(abd_u16, neon_u16, 2)
-NEON_VOP(abd_s32, neon_s32, 1)
-NEON_VOP(abd_u32, neon_u32, 1)
-#undef NEON_FN
-
 #define NEON_FN(dest, src1, src2) do { \
     int8_t tmp; \
     tmp = (int8_t)src2; \
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 54b06553a6..991e451644 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11197,6 +11197,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size);
         }
         return;
+    case 0xf: /* SABA, UABA */
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
+        }
+        return;
     case 0x10: /* ADD, SUB */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -11329,16 +11336,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                 genenvfn = fns[size][u];
                 break;
             }
-            case 0xf: /* SABA, UABA */
-            {
-                static NeonGenTwoOpFn * const fns[3][2] = {
-                    { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 },
-                    { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 },
-                    { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 },
-                };
-                genfn = fns[size][u];
-                break;
-            }
             case 0x16: /* SQDMULH, SQRDMULH */
             {
                 static NeonGenTwoOpEnvFn * const fns[2][2] = {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 0ed6f49f42..bb7a731b21 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5222,6 +5222,124 @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 }
 
+static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    gen_sabd_i32(t, a, b);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_sabd_i64(t, a, b);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    gen_sabd_vec(vece, t, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_add_vec,
+        INDEX_op_smin_vec, INDEX_op_smax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_saba_i32,
+          .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_saba_i64,
+          .fniv = gen_saba_vec,
+          .fno = gen_helper_gvec_saba_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    gen_uabd_i32(t, a, b);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_uabd_i64(t, a, b);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    gen_uabd_vec(vece, t, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                   uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_add_vec,
+        INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_uaba_i32,
+          .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_uaba_i64,
+          .fniv = gen_uaba_vec,
+          .fno = gen_helper_gvec_uaba_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -5366,6 +5484,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             }
             return 0;
 
+        case NEON_3R_VABA:
+            if (u) {
+                gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            } else {
+                gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs,
+                              vec_size, vec_size);
+            }
+            return 0;
+
         case NEON_3R_VADD_VSUB:
         case NEON_3R_LOGIC:
         case NEON_3R_VMAX:
@@ -5510,12 +5638,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         case NEON_3R_VQRSHL:
             GEN_NEON_INTEGER_OP_ENV(qrshl);
             break;
-        case NEON_3R_VABA:
-            GEN_NEON_INTEGER_OP(abd);
-            tcg_temp_free_i32(tmp2);
-            tmp2 = neon_load_reg(rd, pass);
-            gen_neon_add(size, tmp, tmp2);
-            break;
         case NEON_3R_VPMAX:
             GEN_NEON_INTEGER_OP(pmax);
             break;
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index a4972d02fc..fa33df859e 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1431,3 +1431,27 @@ DO_ABD(gvec_uabd_s, uint32_t)
 DO_ABD(gvec_uabd_d, uint64_t)
 
 #undef DO_ABD
+
+#define DO_ABA(NAME, TYPE)                                      \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)  \
+{                                                               \
+    intptr_t i, opr_sz = simd_oprsz(desc);                      \
+    TYPE *d = vd, *n = vn, *m = vm;                             \
+                                                                \
+    for (i = 0; i < opr_sz / sizeof(TYPE); ++i) {               \
+        d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i];        \
+    }                                                           \
+    clear_tail(d, opr_sz, simd_maxsz(desc));                    \
+}
+
+DO_ABA(gvec_saba_b, int8_t)
+DO_ABA(gvec_saba_h, int16_t)
+DO_ABA(gvec_saba_s, int32_t)
+DO_ABA(gvec_saba_d, int64_t)
+
+DO_ABA(gvec_uaba_b, uint8_t)
+DO_ABA(gvec_uaba_h, uint16_t)
+DO_ABA(gvec_uaba_s, uint32_t)
+DO_ABA(gvec_uaba_d, uint64_t)
+
+#undef DO_ABA