From patchwork Fri Feb 7 21:49:16 2014
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 24322
From: Peter Maydell
To: qemu-devel@nongnu.org
Cc: patches@linaro.org, Alexander Graf, Michael Matz, Claudio Fontana,
 Dirk Mueller, Laurent Desnogues, kvmarm@lists.cs.columbia.edu,
 Richard Henderson, Alex Bennée, Christoffer Dall, Will Newton,
 Peter Crosthwaite
Subject: [PATCH 1/8] target-arm: A64: Implement plain vector SIMD indexed element insns
Date: Fri, 7 Feb 2014 21:49:16 +0000
Message-Id: <1391809763-11251-2-git-send-email-peter.maydell@linaro.org>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1391809763-11251-1-git-send-email-peter.maydell@linaro.org>
References:
 <1391809763-11251-1-git-send-email-peter.maydell@linaro.org>

Implement all the SIMD vector x indexed element instructions
in the subcategory which are not 'long' ops.

Signed-off-by: Peter Maydell
---
 target-arm/helper-a64.c    |  26 +++++
 target-arm/helper-a64.h    |   2 +
 target-arm/translate-a64.c | 245 ++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 272 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index 6ca958a..fe90a5c 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -123,6 +123,32 @@ uint64_t HELPER(vfp_cmped_a64)(float64 x, float64 y, void *fp_status)
     return float_rel_to_flags(float64_compare(x, y, fp_status));
 }
 
+float32 HELPER(vfp_mulxs)(float32 a, float32 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+
+    if ((float32_is_zero(a) && float32_is_infinity(b)) ||
+        (float32_is_infinity(a) && float32_is_zero(b))) {
+        /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
+        return make_float32((1U << 30) |
+                            ((float32_val(a) ^ float32_val(b)) & (1U << 31)));
+    }
+    return float32_mul(a, b, fpst);
+}
+
+float64 HELPER(vfp_mulxd)(float64 a, float64 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+
+    if ((float64_is_zero(a) && float64_is_infinity(b)) ||
+        (float64_is_infinity(a) && float64_is_zero(b))) {
+        /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
+        return make_float64((1ULL << 62) |
+                            ((float64_val(a) ^ float64_val(b)) & (1ULL << 63)));
+    }
+    return float64_mul(a, b, fpst);
+}
+
 uint64_t HELPER(simd_tbl)(CPUARMState *env, uint64_t result, uint64_t indices,
                           uint32_t rn, uint32_t numregs)
 {
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index 99832ee..84310e8 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -27,3 +27,5 @@ DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
 DEF_HELPER_3(vfp_cmped_a64, i64, f64, f64, ptr)
 DEF_HELPER_FLAGS_5(simd_tbl, TCG_CALL_NO_RWG_SE, i64, env, i64, i64, i32, i32)
+DEF_HELPER_FLAGS_3(vfp_mulxs, TCG_CALL_NO_RWG, f32, f32, f32, ptr)
+DEF_HELPER_FLAGS_3(vfp_mulxd, TCG_CALL_NO_RWG, f64, f64, f64, ptr)
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index d60223a..b7f1ecf 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -7813,7 +7813,250 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_indexed_vector(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    /* This encoding has two kinds of instruction:
+     *  normal, where we perform elt x idxelt => elt for each
+     *     element in the vector
+     *  long, where we perform elt x idxelt and generate a result of
+     *     double the width of the input element
+     * The long ops have a 'part' specifier (ie come in INSN, INSN2 pairs).
+     */
+    bool is_q = extract32(insn, 30, 1);
+    bool u = extract32(insn, 29, 1);
+    int size = extract32(insn, 22, 2);
+    int l = extract32(insn, 21, 1);
+    int m = extract32(insn, 20, 1);
+    /* Note that the Rm field here is only 4 bits, not 5 as it usually is */
+    int rm = extract32(insn, 16, 4);
+    int opcode = extract32(insn, 12, 4);
+    int h = extract32(insn, 11, 1);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+    bool is_long = false;
+    bool is_fp = false;
+    int index;
+    TCGv_ptr fpst;
+
+    switch (opcode) {
+    case 0x0: /* MLA */
+    case 0x4: /* MLS */
+        if (!u) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    case 0x2: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
+    case 0x6: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
+    case 0xa: /* SMULL, SMULL2, UMULL, UMULL2 */
+        is_long = true;
+        break;
+    case 0x3: /* SQDMLAL, SQDMLAL2 */
+    case 0x7: /* SQDMLSL, SQDMLSL2 */
+    case 0xb: /* SQDMULL, SQDMULL2 */
+        is_long = true;
+        /* fall through */
+    case 0xc: /* SQDMULH */
+    case 0xd: /* SQRDMULH */
+    case 0x8: /* MUL */
+        if (u) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    case 0x1: /* FMLA */
+    case 0x5: /* FMLS */
+        if (u) {
+            unallocated_encoding(s);
+            return;
+        }
+        /* fall through */
+    case 0x9: /* FMUL, FMULX */
+        if (!extract32(size, 1, 1)) {
+            unallocated_encoding(s);
+            return;
+        }
+        is_fp = true;
+        break;
+    }
+
+    if (is_fp) {
+        /* low bit of size indicates single/double */
+        size = extract32(size, 0, 1) ? 3 : 2;
+        if (size == 2) {
+            index = h << 1 | l;
+        } else {
+            if (l || !is_q) {
+                unallocated_encoding(s);
+                return;
+            }
+            index = h;
+        }
+        rm |= (m << 4);
+    } else {
+        switch (size) {
+        case 1:
+            index = h << 2 | l << 1 | m;
+            break;
+        case 2:
+            index = h << 1 | l;
+            rm |= (m << 4);
+            break;
+        default:
+            unallocated_encoding(s);
+            return;
+        }
+    }
+
+    if (is_long) {
+        unsupported_encoding(s, insn);
+        return;
+    }
+
+    if (is_fp) {
+        fpst = get_fpstatus_ptr();
+    } else {
+        TCGV_UNUSED_PTR(fpst);
+    }
+
+    if (size == 3) {
+        TCGv_i64 tcg_idx = tcg_temp_new_i64();
+        int pass;
+
+        assert(is_fp && is_q && !is_long);
+
+        read_vec_element(s, tcg_idx, rm, index, MO_64);
+
+        for (pass = 0; pass < 2; pass++) {
+            TCGv_i64 tcg_op = tcg_temp_new_i64();
+            TCGv_i64 tcg_res = tcg_temp_new_i64();
+
+            read_vec_element(s, tcg_op, rn, pass, MO_64);
+
+            switch (opcode) {
+            case 0x5: /* FMLS */
+                /* As usual for ARM, separate negation for fused multiply-add */
+                gen_helper_vfp_negd(tcg_op, tcg_op);
+                /* fall through */
+            case 0x1: /* FMLA */
+                read_vec_element(s, tcg_res, rd, pass, MO_64);
+                gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
+                break;
+            case 0x9: /* FMUL, FMULX */
+                if (u) {
+                    gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
+                } else {
+                    gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
+                }
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+            write_vec_element(s, tcg_res, rd, pass, MO_64);
+            tcg_temp_free_i64(tcg_op);
+            tcg_temp_free_i64(tcg_res);
+        }
+
+        tcg_temp_free_i64(tcg_idx);
+    } else if (!is_long) {
+        /* 32 bit floating point, or 16 or 32 bit integer */
+        TCGv_i32 tcg_idx = tcg_temp_new_i32();
+        int pass;
+
+        read_vec_element_i32(s, tcg_idx, rm, index, size);
+
+        if (size == 1) {
+            /* The simplest way to handle the 16x16 indexed ops is to duplicate
+             * the index into both halves of the 32 bit tcg_idx and then use
+             * the usual Neon helpers.
+             */
+            tcg_gen_deposit_i32(tcg_idx, tcg_idx, tcg_idx, 16, 16);
+        }
+
+        for (pass = 0; pass < (is_q ? 4 : 2); pass++) {
+            TCGv_i32 tcg_op = tcg_temp_new_i32();
+            TCGv_i32 tcg_res = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, tcg_op, rn, pass, MO_32);
+
+            switch (opcode) {
+            case 0x0: /* MLA */
+            case 0x4: /* MLS */
+            case 0x8: /* MUL */
+            {
+                static NeonGenTwoOpFn * const fns[2][2] = {
+                    { gen_helper_neon_add_u16, gen_helper_neon_sub_u16 },
+                    { tcg_gen_add_i32, tcg_gen_sub_i32 },
+                };
+                NeonGenTwoOpFn *genfn;
+                bool is_sub = opcode == 0x4;
+
+                if (size == 1) {
+                    gen_helper_neon_mul_u16(tcg_res, tcg_op, tcg_idx);
+                } else {
+                    tcg_gen_mul_i32(tcg_res, tcg_op, tcg_idx);
+                }
+                if (opcode == 0x8) {
+                    break;
+                }
+                read_vec_element_i32(s, tcg_op, rd, pass, MO_32);
+                genfn = fns[size - 1][is_sub];
+                genfn(tcg_res, tcg_op, tcg_res);
+                break;
+            }
+            case 0x5: /* FMLS */
+                /* As usual for ARM, separate negation for fused multiply-add */
+                gen_helper_vfp_negs(tcg_op, tcg_op);
+                /* fall through */
+            case 0x1: /* FMLA */
+                read_vec_element_i32(s, tcg_res, rd, pass, MO_32);
+                gen_helper_vfp_muladds(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
+                break;
+            case 0x9: /* FMUL, FMULX */
+                if (u) {
+                    gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
+                } else {
+                    gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst);
+                }
+                break;
+            case 0xc: /* SQDMULH */
+                if (size == 1) {
+                    gen_helper_neon_qdmulh_s16(tcg_res, cpu_env,
+                                               tcg_op, tcg_idx);
+                } else {
+                    gen_helper_neon_qdmulh_s32(tcg_res, cpu_env,
+                                               tcg_op, tcg_idx);
+                }
+                break;
+            case 0xd: /* SQRDMULH */
+                if (size == 1) {
+                    gen_helper_neon_qrdmulh_s16(tcg_res, cpu_env,
+                                                tcg_op, tcg_idx);
+                } else {
+                    gen_helper_neon_qrdmulh_s32(tcg_res, cpu_env,
+                                                tcg_op, tcg_idx);
+                }
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+            write_vec_element_i32(s, tcg_res, rd, pass, MO_32);
+            tcg_temp_free_i32(tcg_op);
+            tcg_temp_free_i32(tcg_res);
+        }
+
+        tcg_temp_free_i32(tcg_idx);
+
+        if (!is_q) {
+            clear_vec_high(s, rd);
+        }
+    } else {
+        /* long ops: 16x16->32 or 32x32->64 */
+    }
+
+    if (!TCGV_IS_UNUSED_PTR(fpst)) {
+        tcg_temp_free_ptr(fpst);
+    }
 }
 
 /* C3.6.19 Crypto AES