From patchwork Thu Feb 20 11:17:06 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell <peter.maydell@linaro.org>
X-Patchwork-Id: 25025
Return-Path: <patchwork-forward+bncBC6Z756YVMIBB5WNS6MAKGQEPP6OCNA@linaro.org>
X-Original-To: linaro@patches.linaro.org
Delivered-To: linaro@patches.linaro.org
Received: from mail-pa0-f72.google.com (mail-pa0-f72.google.com
 [209.85.220.72])
 by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 41733203BE
 for <linaro@patches.linaro.org>; Thu, 20 Feb 2014 11:28:55 +0000 (UTC)
Received: by mail-pa0-f72.google.com with SMTP id rd3sf4449861pab.11
 for <linaro@patches.linaro.org>; Thu, 20 Feb 2014 03:28:54 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:delivered-to:from:to:date
 :message-id:in-reply-to:references:cc:subject:precedence:list-id
 :list-unsubscribe:list-archive:list-post:list-help:list-subscribe
 :errors-to:sender:x-original-sender
 :x-original-authentication-results:mailing-list;
 bh=DeSep1o6y1Ys47SJqdy5O+myiGUM+ztKIeBi/MbQlvA=;
 b=Kva4PZkgO0fxXeRgUwhozTw12BT5Hiib7MOdavD0Gi+6UgR89W2GrmTCIkG0X/k8CW
 6sfeQv94u/hfYFj4MHuClaV2swDeA2sXxsdjJe8SMPCOJZ5ebc6km5AXgDzYKoiEYxt/
 A2/WGqn9QSP7AbmmJIPJdyFnFyNSQ1gn0qbxg9YAxx5ZWj0e0YB9mBcAK3xDLsUOMeOa
 0GN1HPhwXOhZmz7tcn3gJm4FzJNdadz6qjQz/9C5IfzQlC4Ieq0RlOdlNPFo0M45y5mU
 fEXl+2s8Jt07aoChhZOfQcONn6vRUBRGu4JJmaUy8HNBg12Ro3rMXkR5jGKx5A1klf8R
 i5Qg==
X-Gm-Message-State: ALoCoQk50m4GVppyrq2m2HHrnupFvSXh30bj1gcnS7v++AicV8/0Gfx8IzpQIBdKohGA9cD2u/lE
X-Received: by 10.66.102.36 with SMTP id fl4mr522140pab.20.1392895734370;
 Thu, 20 Feb 2014 03:28:54 -0800 (PST)
MIME-Version: 1.0
X-BeenThere: patchwork-forward@linaro.org
Received: by 10.140.23.175 with SMTP id 44ls136458qgp.12.gmail; Thu, 20 Feb
 2014 03:28:54 -0800 (PST)
X-Received: by 10.220.252.134 with SMTP id mw6mr659826vcb.51.1392895734154; 
 Thu, 20 Feb 2014 03:28:54 -0800 (PST)
Received: from mail-vc0-f181.google.com (mail-vc0-f181.google.com
 [209.85.220.181]) by mx.google.com with ESMTPS id
 qi7si1326773veb.148.2014.02.20.03.28.54
 for <patchwork-forward@linaro.org>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Thu, 20 Feb 2014 03:28:54 -0800 (PST)
Received-SPF: neutral (google.com: 209.85.220.181 is neither permitted nor
 denied by best guess record for domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org)
 client-ip=209.85.220.181; 
Received: by mail-vc0-f181.google.com with SMTP id ie18so1677628vcb.40
 for <patchwork-forward@linaro.org>;
 Thu, 20 Feb 2014 03:28:54 -0800 (PST)
X-Received: by 10.220.106.84 with SMTP id w20mr737703vco.18.1392895734014;
 Thu, 20 Feb 2014 03:28:54 -0800 (PST)
X-Forwarded-To: patchwork-forward@linaro.org
X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org
Delivered-To: patch@linaro.org
Received: by 10.220.174.196 with SMTP id u4csp50796vcz;
 Thu, 20 Feb 2014 03:28:53 -0800 (PST)
X-Received: by 10.224.14.2 with SMTP id e2mr731344qaa.73.1392895733664;
 Thu, 20 Feb 2014 03:28:53 -0800 (PST)
Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11])
 by mx.google.com with ESMTPS id d2si942730qag.16.2014.02.20.03.28.53
 for <patch@linaro.org> (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Thu, 20 Feb 2014 03:28:53 -0800 (PST)
Received-SPF: pass (google.com: domain of
 qemu-devel-bounces+patch=linaro.org@nongnu.org designates
 2001:4830:134:3::11 as permitted sender)
 client-ip=2001:4830:134:3::11; 
Received: from localhost ([::1]:37637 helo=lists.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <qemu-devel-bounces+patch=linaro.org@nongnu.org>)
 id 1WGRoD-00022f-4v
 for patch@linaro.org; Thu, 20 Feb 2014 06:28:53 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:58927)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <pm215@archaic.org.uk>) id 1WGRei-000452-IO
 for qemu-devel@nongnu.org; Thu, 20 Feb 2014 06:19:06 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <pm215@archaic.org.uk>) id 1WGReg-0004xZ-Jz
 for qemu-devel@nongnu.org; Thu, 20 Feb 2014 06:19:04 -0500
Received: from mnementh.archaic.org.uk ([2001:8b0:1d0::1]:46035)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <pm215@archaic.org.uk>) id 1WGReg-0004wV-BM
 for qemu-devel@nongnu.org; Thu, 20 Feb 2014 06:19:02 -0500
Received: from pm215 by mnementh.archaic.org.uk with local (Exim 4.80)
 (envelope-from <pm215@archaic.org.uk>)
 id 1WGRdG-0003S1-K5; Thu, 20 Feb 2014 11:17:34 +0000
From: Peter Maydell <peter.maydell@linaro.org>
To: Anthony Liguori <aliguori@amazon.com>
Date: Thu, 20 Feb 2014 11:17:06 +0000
Message-Id: <1392895054-13232-3-git-send-email-peter.maydell@linaro.org>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1392895054-13232-1-git-send-email-peter.maydell@linaro.org>
References: <1392895054-13232-1-git-send-email-peter.maydell@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address
 (bad octet value).
X-Received-From: 2001:8b0:1d0::1
Cc: Blue Swirl <blauwirbel@gmail.com>, qemu-devel@nongnu.org,
 Aurelien Jarno <aurelien@aurel32.net>
Subject: [Qemu-devel] [PULL 02/30] target-arm: A64: Implement plain vector
 SIMD indexed element insns
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: <patchwork-forward.linaro.org>
List-Unsubscribe: <http://groups.google.com/a/linaro.org/group/patchwork-forward/subscribe>, 
 <mailto:googlegroups-manage+836684582541+unsubscribe@googlegroups.com>
List-Archive: <http://groups.google.com/a/linaro.org/group/patchwork-forward/>
List-Post: <http://groups.google.com/a/linaro.org/group/patchwork-forward/post>, 
 <mailto:patchwork-forward@linaro.org>
List-Help: <http://support.google.com/a/linaro.org/bin/topic.py?topic=25838>, 
 <mailto:patchwork-forward+help@linaro.org>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org
X-Removed-Original-Auth: Dkim didn't pass.
X-Original-Sender: peter.maydell@linaro.org
X-Original-Authentication-Results: mx.google.com;       spf=neutral
 (google.com: 209.85.220.181 is neither permitted nor denied by best
 guess record for domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org)
 smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org
Mailing-list: list patchwork-forward@linaro.org;
 contact patchwork-forward+owners@linaro.org
X-Google-Group-Id: 836684582541

Implement all the SIMD vector x indexed element instructions
in the subcategory which are not 'long' ops.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/helper-a64.c    |  26 +++++
 target-arm/helper-a64.h    |   2 +
 target-arm/translate-a64.c | 248 ++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 275 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index 6ca958a..fe90a5c 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -123,6 +123,32 @@ uint64_t HELPER(vfp_cmped_a64)(float64 x, float64 y, void *fp_status)
     return float_rel_to_flags(float64_compare(x, y, fp_status));
 }
 
+float32 HELPER(vfp_mulxs)(float32 a, float32 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+
+    if ((float32_is_zero(a) && float32_is_infinity(b)) ||
+        (float32_is_infinity(a) && float32_is_zero(b))) {
+        /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
+        return make_float32((1U << 30) |
+                            ((float32_val(a) ^ float32_val(b)) & (1U << 31)));
+    }
+    return float32_mul(a, b, fpst);
+}
+
+float64 HELPER(vfp_mulxd)(float64 a, float64 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+
+    if ((float64_is_zero(a) && float64_is_infinity(b)) ||
+        (float64_is_infinity(a) && float64_is_zero(b))) {
+        /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
+        return make_float64((1ULL << 62) |
+                            ((float64_val(a) ^ float64_val(b)) & (1ULL << 63)));
+    }
+    return float64_mul(a, b, fpst);
+}
+
 uint64_t HELPER(simd_tbl)(CPUARMState *env, uint64_t result, uint64_t indices,
                           uint32_t rn, uint32_t numregs)
 {
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index 99832ee..84310e8 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -27,3 +27,5 @@ DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
 DEF_HELPER_3(vfp_cmped_a64, i64, f64, f64, ptr)
 DEF_HELPER_FLAGS_5(simd_tbl, TCG_CALL_NO_RWG_SE, i64, env, i64, i64, i32, i32)
+DEF_HELPER_FLAGS_3(vfp_mulxs, TCG_CALL_NO_RWG, f32, f32, f32, ptr)
+DEF_HELPER_FLAGS_3(vfp_mulxd, TCG_CALL_NO_RWG, f64, f64, f64, ptr)
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index d60223a..a96ee4a 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -7813,7 +7813,253 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_indexed_vector(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    /* This encoding has two kinds of instruction:
+     *  normal, where we perform elt x idxelt => elt for each
+     *     element in the vector
+     *  long, where we perform elt x idxelt and generate a result of
+     *     double the width of the input element
+     * The long ops have a 'part' specifier (ie come in INSN, INSN2 pairs).
+     */
+    bool is_q = extract32(insn, 30, 1);
+    bool u = extract32(insn, 29, 1);
+    int size = extract32(insn, 22, 2);
+    int l = extract32(insn, 21, 1);
+    int m = extract32(insn, 20, 1);
+    /* Note that the Rm field here is only 4 bits, not 5 as it usually is */
+    int rm = extract32(insn, 16, 4);
+    int opcode = extract32(insn, 12, 4);
+    int h = extract32(insn, 11, 1);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+    bool is_long = false;
+    bool is_fp = false;
+    int index;
+    TCGv_ptr fpst;
+
+    switch (opcode) {
+    case 0x0: /* MLA */
+    case 0x4: /* MLS */
+        if (!u) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    case 0x2: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
+    case 0x6: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
+    case 0xa: /* SMULL, SMULL2, UMULL, UMULL2 */
+        is_long = true;
+        break;
+    case 0x3: /* SQDMLAL, SQDMLAL2 */
+    case 0x7: /* SQDMLSL, SQDMLSL2 */
+    case 0xb: /* SQDMULL, SQDMULL2 */
+        is_long = true;
+        /* fall through */
+    case 0xc: /* SQDMULH */
+    case 0xd: /* SQRDMULH */
+    case 0x8: /* MUL */
+        if (u) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    case 0x1: /* FMLA */
+    case 0x5: /* FMLS */
+        if (u) {
+            unallocated_encoding(s);
+            return;
+        }
+        /* fall through */
+    case 0x9: /* FMUL, FMULX */
+        if (!extract32(size, 1, 1)) {
+            unallocated_encoding(s);
+            return;
+        }
+        is_fp = true;
+        break;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+
+    if (is_fp) {
+        /* low bit of size indicates single/double */
+        size = extract32(size, 0, 1) ? 3 : 2;
+        if (size == 2) {
+            index = h << 1 | l;
+        } else {
+            if (l || !is_q) {
+                unallocated_encoding(s);
+                return;
+            }
+            index = h;
+        }
+        rm |= (m << 4);
+    } else {
+        switch (size) {
+        case 1:
+            index = h << 2 | l << 1 | m;
+            break;
+        case 2:
+            index = h << 1 | l;
+            rm |= (m << 4);
+            break;
+        default:
+            unallocated_encoding(s);
+            return;
+        }
+    }
+
+    if (is_long) {
+        unsupported_encoding(s, insn);
+        return;
+    }
+
+    if (is_fp) {
+        fpst = get_fpstatus_ptr();
+    } else {
+        TCGV_UNUSED_PTR(fpst);
+    }
+
+    if (size == 3) {
+        TCGv_i64 tcg_idx = tcg_temp_new_i64();
+        int pass;
+
+        assert(is_fp && is_q && !is_long);
+
+        read_vec_element(s, tcg_idx, rm, index, MO_64);
+
+        for (pass = 0; pass < 2; pass++) {
+            TCGv_i64 tcg_op = tcg_temp_new_i64();
+            TCGv_i64 tcg_res = tcg_temp_new_i64();
+
+            read_vec_element(s, tcg_op, rn, pass, MO_64);
+
+            switch (opcode) {
+            case 0x5: /* FMLS */
+                /* As usual for ARM, separate negation for fused multiply-add */
+                gen_helper_vfp_negd(tcg_op, tcg_op);
+                /* fall through */
+            case 0x1: /* FMLA */
+                read_vec_element(s, tcg_res, rd, pass, MO_64);
+                gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
+                break;
+            case 0x9: /* FMUL, FMULX */
+                if (u) {
+                    gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
+                } else {
+                    gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
+                }
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+            write_vec_element(s, tcg_res, rd, pass, MO_64);
+            tcg_temp_free_i64(tcg_op);
+            tcg_temp_free_i64(tcg_res);
+        }
+
+        tcg_temp_free_i64(tcg_idx);
+    } else if (!is_long) {
+        /* 32 bit floating point, or 16 or 32 bit integer */
+        TCGv_i32 tcg_idx = tcg_temp_new_i32();
+        int pass;
+
+        read_vec_element_i32(s, tcg_idx, rm, index, size);
+
+        if (size == 1) {
+            /* The simplest way to handle the 16x16 indexed ops is to duplicate
+             * the index into both halves of the 32 bit tcg_idx and then use
+             * the usual Neon helpers.
+             */
+            tcg_gen_deposit_i32(tcg_idx, tcg_idx, tcg_idx, 16, 16);
+        }
+
+        for (pass = 0; pass < (is_q ? 4 : 2); pass++) {
+            TCGv_i32 tcg_op = tcg_temp_new_i32();
+            TCGv_i32 tcg_res = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, tcg_op, rn, pass, MO_32);
+
+            switch (opcode) {
+            case 0x0: /* MLA */
+            case 0x4: /* MLS */
+            case 0x8: /* MUL */
+            {
+                static NeonGenTwoOpFn * const fns[2][2] = {
+                    { gen_helper_neon_add_u16, gen_helper_neon_sub_u16 },
+                    { tcg_gen_add_i32, tcg_gen_sub_i32 },
+                };
+                NeonGenTwoOpFn *genfn;
+                bool is_sub = opcode == 0x4;
+
+                if (size == 1) {
+                    gen_helper_neon_mul_u16(tcg_res, tcg_op, tcg_idx);
+                } else {
+                    tcg_gen_mul_i32(tcg_res, tcg_op, tcg_idx);
+                }
+                if (opcode == 0x8) {
+                    break;
+                }
+                read_vec_element_i32(s, tcg_op, rd, pass, MO_32);
+                genfn = fns[size - 1][is_sub];
+                genfn(tcg_res, tcg_op, tcg_res);
+                break;
+            }
+            case 0x5: /* FMLS */
+                /* As usual for ARM, separate negation for fused multiply-add */
+                gen_helper_vfp_negs(tcg_op, tcg_op);
+                /* fall through */
+            case 0x1: /* FMLA */
+                read_vec_element_i32(s, tcg_res, rd, pass, MO_32);
+                gen_helper_vfp_muladds(tcg_res, tcg_op, tcg_idx, tcg_res, fpst);
+                break;
+            case 0x9: /* FMUL, FMULX */
+                if (u) {
+                    gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
+                } else {
+                    gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst);
+                }
+                break;
+            case 0xc: /* SQDMULH */
+                if (size == 1) {
+                    gen_helper_neon_qdmulh_s16(tcg_res, cpu_env,
+                                               tcg_op, tcg_idx);
+                } else {
+                    gen_helper_neon_qdmulh_s32(tcg_res, cpu_env,
+                                               tcg_op, tcg_idx);
+                }
+                break;
+            case 0xd: /* SQRDMULH */
+                if (size == 1) {
+                    gen_helper_neon_qrdmulh_s16(tcg_res, cpu_env,
+                                                tcg_op, tcg_idx);
+                } else {
+                    gen_helper_neon_qrdmulh_s32(tcg_res, cpu_env,
+                                                tcg_op, tcg_idx);
+                }
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+            write_vec_element_i32(s, tcg_res, rd, pass, MO_32);
+            tcg_temp_free_i32(tcg_op);
+            tcg_temp_free_i32(tcg_res);
+        }
+
+        tcg_temp_free_i32(tcg_idx);
+
+        if (!is_q) {
+            clear_vec_high(s, rd);
+        }
+    } else {
+        /* long ops: 16x16->32 or 32x32->64 */
+    }
+
+    if (!TCGV_IS_UNUSED_PTR(fpst)) {
+        tcg_temp_free_ptr(fpst);
+    }
 }
 
 /* C3.6.19 Crypto AES