From patchwork Thu Feb 20 11:17:06 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 25025
From: Peter Maydell
To: Anthony Liguori
Date: Thu, 20 Feb 2014 11:17:06 +0000
Message-Id: <1392895054-13232-3-git-send-email-peter.maydell@linaro.org>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1392895054-13232-1-git-send-email-peter.maydell@linaro.org>
References: <1392895054-13232-1-git-send-email-peter.maydell@linaro.org>
Cc: Blue Swirl, qemu-devel@nongnu.org, Aurelien Jarno
Subject: [Qemu-devel] [PULL 02/30] target-arm: A64: Implement plain vector
 SIMD indexed element insns

Implement all the SIMD vector x indexed element instructions
in the subcategory which are not 'long' ops.
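
Illustrative note (not part of the original commit): the vfp_mulxs and
vfp_mulxd helpers added in the diff differ from an ordinary multiply only
when one operand is zero and the other is infinity, in which case they
return 2.0 with the sign set to sign(A) XOR sign(B). A minimal standalone
sketch of that rule follows. It assumes host IEEE single-precision floats
instead of QEMU's softfloat types, so it does not model float_status or
exception flags; mulx_f32 and mulx_f32_demo are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-float model of the FMULX special case (not QEMU code). */
float mulx_f32(float a, float b)
{
    uint32_t ua, ub, ur;
    float r;

    /* Reinterpret the operands as raw IEEE 754 bit patterns. */
    memcpy(&ua, &a, sizeof(ua));
    memcpy(&ub, &b, sizeof(ub));

    if ((a == 0.0f && isinf(b)) || (isinf(a) && b == 0.0f)) {
        /* 1U << 30 is the bit pattern of 2.0f; bit 31 is the sign,
         * set here to sign(a) XOR sign(b).
         */
        ur = (1U << 30) | ((ua ^ ub) & (1U << 31));
        memcpy(&r, &ur, sizeof(r));
        return r;
    }
    /* Every other case is a plain multiply (flags are not modelled). */
    return a * b;
}

/* Small usage example exercising the special case and the normal path. */
void mulx_f32_demo(void)
{
    assert(mulx_f32(0.0f, INFINITY) == 2.0f);
    assert(mulx_f32(-0.0f, INFINITY) == -2.0f);
    assert(mulx_f32(3.0f, 4.0f) == 12.0f);
}
```

The double-precision helper follows the same pattern with bit 62 for 2.0
and bit 63 for the sign.
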

Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target-arm/helper-a64.c    |  26 +++++
 target-arm/helper-a64.h    |   2 +
 target-arm/translate-a64.c | 248 ++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 275 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index 6ca958a..fe90a5c 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -123,6 +123,32 @@ uint64_t HELPER(vfp_cmped_a64)(float64 x, float64 y, void *fp_status)
     return float_rel_to_flags(float64_compare(x, y, fp_status));
 }
 
+float32 HELPER(vfp_mulxs)(float32 a, float32 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+
+    if ((float32_is_zero(a) && float32_is_infinity(b)) ||
+        (float32_is_infinity(a) && float32_is_zero(b))) {
+        /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
+        return make_float32((1U << 30) |
+                            ((float32_val(a) ^ float32_val(b)) & (1U << 31)));
+    }
+    return float32_mul(a, b, fpst);
+}
+
+float64 HELPER(vfp_mulxd)(float64 a, float64 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+
+    if ((float64_is_zero(a) && float64_is_infinity(b)) ||
+        (float64_is_infinity(a) && float64_is_zero(b))) {
+        /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
+        return make_float64((1ULL << 62) |
+                            ((float64_val(a) ^ float64_val(b)) & (1ULL << 63)));
+    }
+    return float64_mul(a, b, fpst);
+}
+
 uint64_t HELPER(simd_tbl)(CPUARMState *env, uint64_t result, uint64_t indices,
                           uint32_t rn, uint32_t numregs)
 {
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index 99832ee..84310e8 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -27,3 +27,5 @@ DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
 DEF_HELPER_3(vfp_cmped_a64, i64, f64, f64, ptr)
 DEF_HELPER_FLAGS_5(simd_tbl, TCG_CALL_NO_RWG_SE, i64, env, i64, i64, i32, i32)
+DEF_HELPER_FLAGS_3(vfp_mulxs, TCG_CALL_NO_RWG, f32, f32, f32, ptr)
+DEF_HELPER_FLAGS_3(vfp_mulxd, TCG_CALL_NO_RWG, f64, f64, f64, ptr)
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index d60223a..a96ee4a 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -7813,7 +7813,253 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_indexed_vector(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    /* This encoding has two kinds of instruction:
+     *  normal, where we perform elt x idxelt => elt for each
+     *     element in the vector
+     *  long, where we perform elt x idxelt and generate a result of
+     *     double the width of the input element
+     * The long ops have a 'part' specifier (ie come in INSN, INSN2 pairs).
+     */
+    bool is_q = extract32(insn, 30, 1);
+    bool u = extract32(insn, 29, 1);
+    int size = extract32(insn, 22, 2);
+    int l = extract32(insn, 21, 1);
+    int m = extract32(insn, 20, 1);
+    /* Note that the Rm field here is only 4 bits, not 5 as it usually is */
+    int rm = extract32(insn, 16, 4);
+    int opcode = extract32(insn, 12, 4);
+    int h = extract32(insn, 11, 1);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+    bool is_long = false;
+    bool is_fp = false;
+    int index;
+    TCGv_ptr fpst;
+
+    switch (opcode) {
+    case 0x0: /* MLA */
+    case 0x4: /* MLS */
+        if (!u) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    case 0x2: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
+    case 0x6: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
+    case 0xa: /* SMULL, SMULL2, UMULL, UMULL2 */
+        is_long = true;
+        break;
+    case 0x3: /* SQDMLAL, SQDMLAL2 */
+    case 0x7: /* SQDMLSL, SQDMLSL2 */
+    case 0xb: /* SQDMULL, SQDMULL2 */
+        is_long = true;
+        /* fall through */
+    case 0xc: /* SQDMULH */
+    case 0xd: /* SQRDMULH */
+    case 0x8: /* MUL */
+        if (u) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    case 0x1: /* FMLA */
+    case 0x5: /* FMLS */
+        if (u) {
+            unallocated_encoding(s);
+            return;
+        }
+        /* fall through */
+    case 0x9: /* FMUL, FMULX */
+        if (!extract32(size, 1, 1)) {
+            unallocated_encoding(s);
+            return;
+        }
+        is_fp = true;
+        break;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+
+    if (is_fp) {
+        /* low bit of size indicates single/double */
+        size = extract32(size, 0, 1) ? 3 : 2;
+        if (size == 2) {
+            index = h << 1 | l;
+        } else {
+            if (l || !is_q) {
+                unallocated_encoding(s);
+                return;
+            }
+            index = h;
+        }
+        rm |= (m << 4);
+    } else {
+        switch (size) {
+        case 1:
+            index = h << 2 | l << 1 | m;
+            break;
+        case 2:
+            index = h << 1 | l;
+            rm |= (m << 4);
+            break;
+        default:
+            unallocated_encoding(s);
+            return;
+        }
+    }
+
+    if (is_long) {
+        unsupported_encoding(s, insn);
+        return;
+    }
+
+    if (is_fp) {
+        fpst = get_fpstatus_ptr();
+    } else {
+        TCGV_UNUSED_PTR(fpst);
+    }
+
+    if (size == 3) {
+        TCGv_i64 tcg_idx = tcg_temp_new_i64();
+        int pass;
+
+        assert(is_fp && is_q && !is_long);
+
+        read_vec_element(s, tcg_idx, rm, index, MO_64);
+
+        for (pass = 0; pass < 2; pass++) {
+            TCGv_i64 tcg_op = tcg_temp_new_i64();
+            TCGv_i64 tcg_res = tcg_temp_new_i64();
+
+            read_vec_element(s, tcg_op, rn, pass, MO_64);
+
+            switch (opcode) {
+            case 0x5: /* FMLS */
+                /* As usual for ARM, separate negation for fused multiply-add */
+                gen_helper_vfp_negd(tcg_op, tcg_op);
+                /* fall through */
+            case 0x1: /* FMLA */
+                read_vec_element(s, tcg_res, rd, pass, MO_64);
+                gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
+                break;
+            case 0x9: /* FMUL, FMULX */
+                if (u) {
+                    gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
+                } else {
+                    gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
+                }
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+            write_vec_element(s, tcg_res, rd, pass, MO_64);
+            tcg_temp_free_i64(tcg_op);
+            tcg_temp_free_i64(tcg_res);
+        }
+
+        tcg_temp_free_i64(tcg_idx);
+    } else if (!is_long) {
+        /* 32 bit floating point, or 16 or 32 bit integer */
+        TCGv_i32 tcg_idx = tcg_temp_new_i32();
+        int pass;
+
+        read_vec_element_i32(s, tcg_idx, rm, index, size);
+
+        if (size == 1) {
+            /* The simplest way to handle the 16x16 indexed ops is to duplicate
+             * the index into both halves of the 32 bit tcg_idx and then use
+             * the usual Neon helpers.
+             */
+            tcg_gen_deposit_i32(tcg_idx, tcg_idx, tcg_idx, 16, 16);
+        }
+
+        for (pass = 0; pass < (is_q ? 4 : 2); pass++) {
+            TCGv_i32 tcg_op = tcg_temp_new_i32();
+            TCGv_i32 tcg_res = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, tcg_op, rn, pass, MO_32);
+
+            switch (opcode) {
+            case 0x0: /* MLA */
+            case 0x4: /* MLS */
+            case 0x8: /* MUL */
+            {
+                static NeonGenTwoOpFn * const fns[2][2] = {
+                    { gen_helper_neon_add_u16, gen_helper_neon_sub_u16 },
+                    { tcg_gen_add_i32, tcg_gen_sub_i32 },
+                };
+                NeonGenTwoOpFn *genfn;
+                bool is_sub = opcode == 0x4;
+
+                if (size == 1) {
+                    gen_helper_neon_mul_u16(tcg_res, tcg_op, tcg_idx);
+                } else {
+                    tcg_gen_mul_i32(tcg_res, tcg_op, tcg_idx);
+                }
+                if (opcode == 0x8) {
+                    break;
+                }
+                read_vec_element_i32(s, tcg_op, rd, pass, MO_32);
+                genfn = fns[size - 1][is_sub];
+                genfn(tcg_res, tcg_op, tcg_res);
+                break;
+            }
+            case 0x5: /* FMLS */
+                /* As usual for ARM, separate negation for fused multiply-add */
+                gen_helper_vfp_negs(tcg_op, tcg_op);
+                /* fall through */
+            case 0x1: /* FMLA */
+                read_vec_element_i32(s, tcg_res, rd, pass, MO_32);
+                gen_helper_vfp_muladds(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
+                break;
+            case 0x9: /* FMUL, FMULX */
+                if (u) {
+                    gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
+                } else {
+                    gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst);
+                }
+                break;
+            case 0xc: /* SQDMULH */
+                if (size == 1) {
+                    gen_helper_neon_qdmulh_s16(tcg_res, cpu_env,
+                                               tcg_op, tcg_idx);
+                } else {
+                    gen_helper_neon_qdmulh_s32(tcg_res, cpu_env,
+                                               tcg_op, tcg_idx);
+                }
+                break;
+            case 0xd: /* SQRDMULH */
+                if (size == 1) {
+                    gen_helper_neon_qrdmulh_s16(tcg_res, cpu_env,
+                                                tcg_op, tcg_idx);
+                } else {
+                    gen_helper_neon_qrdmulh_s32(tcg_res, cpu_env,
+                                                tcg_op, tcg_idx);
+                }
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+            write_vec_element_i32(s, tcg_res, rd, pass, MO_32);
+            tcg_temp_free_i32(tcg_op);
+            tcg_temp_free_i32(tcg_res);
+        }
+
+        tcg_temp_free_i32(tcg_idx);
+
+        if (!is_q) {
+            clear_vec_high(s, rd);
+        }
+    } else {
+        /* long ops: 16x16->32 or 32x32->64 */
+    }
+
+    if (!TCGV_IS_UNUSED_PTR(fpst)) {
+        tcg_temp_free_ptr(fpst);
+    }
 }
 
 /* C3.6.19 Crypto AES