From patchwork Wed Jan 29 13:39:59 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 23851 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-ve0-f198.google.com (mail-ve0-f198.google.com [209.85.128.198]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 9BE67202FA for ; Wed, 29 Jan 2014 13:47:21 +0000 (UTC) Received: by mail-ve0-f198.google.com with SMTP id pa12sf4382374veb.5 for ; Wed, 29 Jan 2014 05:47:20 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:from:to:date:message-id:in-reply-to :references:mime-version:cc:subject:precedence:list-id :list-unsubscribe:list-archive:list-post:list-help:list-subscribe :errors-to:sender:x-original-sender :x-original-authentication-results:mailing-list:content-type :content-transfer-encoding; bh=N1qjubqKjy9lVnHmpBTk4yg+9mR404j7rh9r2173boM=; b=FJVd6RvQKGeAucb+MwO8q9k00gZgn8Wyexvbsfomqaly5Ypj6GQTxUxw0E19J79v1r LXOJ5zN9h+6BesvSZvfAt+D6llJBVY6WAzLxWlrOOjpXPz0xtcLCvTQKPXjRJl9MhMyq /IjaFGI5nDJo6z8h+pZx6IVZOGOIF78i5czYdb90DhVbxXUkaC81FFwVOkNUDgElOSk9 jeIvZpMdDMco5OQzaB7lDTo+yp10Nsw51UNCp8AjOEE0B9x2OqOZVCgqFIeJWbGmB0Sr UzaZesn3vJAYiEOnXDOZxqaz6ShAGVUz91dYPckIh1LAkxPTcj76igzqaQluq/y/+eZo URJA== X-Gm-Message-State: ALoCoQnuhEvHcSieeX6yeceVpmI78KIjP4Koo75H0zxb6ZMLy0nVocL+va44G5Zr9NTI1jNnLNw8 X-Received: by 10.236.111.73 with SMTP id v49mr2420887yhg.46.1391003240490; Wed, 29 Jan 2014 05:47:20 -0800 (PST) X-BeenThere: patchwork-forward@linaro.org Received: by 10.140.109.137 with SMTP id l9ls163051qgf.40.gmail; Wed, 29 Jan 2014 05:47:20 -0800 (PST) X-Received: by 10.52.171.39 with SMTP id ar7mr1817891vdc.5.1391003240359; Wed, 29 Jan 2014 05:47:20 -0800 (PST) Received: from mail-ve0-f174.google.com (mail-ve0-f174.google.com [209.85.128.174]) by mx.google.com with ESMTPS id tj7si805036vdc.72.2014.01.29.05.47.20 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 29 Jan 2014 05:47:20 -0800 (PST) Received-SPF: neutral (google.com: 209.85.128.174 is neither permitted nor denied by best guess record for domain of patch+caf_=patchwork-forward=linaro.org@linaro.org) client-ip=209.85.128.174; Received: by mail-ve0-f174.google.com with SMTP id pa12so1166232veb.5 for ; Wed, 29 Jan 2014 05:47:20 -0800 (PST) X-Received: by 10.220.29.200 with SMTP id r8mr6824355vcc.9.1391003240256; Wed, 29 Jan 2014 05:47:20 -0800 (PST) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.220.174.196 with SMTP id u4csp113980vcz; Wed, 29 Jan 2014 05:47:19 -0800 (PST) X-Received: by 10.224.55.197 with SMTP id v5mr12112958qag.9.1391003239102; Wed, 29 Jan 2014 05:47:19 -0800 (PST) Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id fy9si1760988qab.149.2014.01.29.05.47.18 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Wed, 29 Jan 2014 05:47:18 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Received: from localhost ([::1]:42364 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W8VU6-0002SR-8t for patch@linaro.org; Wed, 29 Jan 2014 08:47:18 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:48950) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W8VNe-0001Fn-LY for qemu-devel@nongnu.org; Wed, 29 Jan 2014 08:40:40 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1W8VNc-0001Gz-3Q for qemu-devel@nongnu.org; Wed, 29 Jan 2014 08:40:38 -0500 Received: from mnementh.archaic.org.uk ([2001:8b0:1d0::1]:45249) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W8VNb-0001GJ-Ev for qemu-devel@nongnu.org; Wed, 29 Jan 2014 08:40:35 -0500 Received: from pm215 by mnementh.archaic.org.uk with local (Exim 4.80) (envelope-from ) id 1W8VN9-0006z9-0p; Wed, 29 Jan 2014 13:40:07 +0000 From: Peter Maydell To: Anthony Liguori Date: Wed, 29 Jan 2014 13:39:59 +0000 Message-Id: <1391002805-26596-33-git-send-email-peter.maydell@linaro.org> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1391002805-26596-1-git-send-email-peter.maydell@linaro.org> References: <1391002805-26596-1-git-send-email-peter.maydell@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:8b0:1d0::1 Cc: Blue Swirl , qemu-devel@nongnu.org, Aurelien Jarno Subject: [Qemu-devel] [PULL 32/38] target-arm: A64: Add SIMD shift by immediate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , List-Subscribe: , Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: peter.maydell@linaro.org X-Original-Authentication-Results: mx.google.com; spf=neutral (google.com: 209.85.128.174 is neither permitted nor denied by best guess record for domain of patch+caf_=patchwork-forward=linaro.org@linaro.org) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 From: Alex Bennée This implements a subset of the AdvSIMD shift operations (namely all the none saturating or narrowing ones). The actual shift generation code itself is common for both the scalar and vector cases but wrapped with either vector element iteration or the fp reg access. The rounding operations need to take special care to correctly reflect the result of adding rounding bits on high bits as the intermediates do not truncate. Signed-off-by: Alex Bennée Reviewed-by: Richard Henderson Signed-off-by: Peter Maydell --- target-arm/translate-a64.c | 375 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 373 insertions(+), 2 deletions(-) diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c index 509982c..6c1ec1e 100644 --- a/target-arm/translate-a64.c +++ b/target-arm/translate-a64.c @@ -5503,15 +5503,216 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) unsupported_encoding(s, insn); } +/* + * Common SSHR[RA]/USHR[RA] - Shift right (optional rounding/accumulate) + * + * This code is handles the common shifting code and is used by both + * the vector and scalar code. + */ +static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src, + TCGv_i64 tcg_rnd, bool accumulate, + bool is_u, int size, int shift) +{ + bool extended_result = false; + bool round = !TCGV_IS_UNUSED_I64(tcg_rnd); + int ext_lshift = 0; + TCGv_i64 tcg_src_hi; + + if (round && size == 3) { + extended_result = true; + ext_lshift = 64 - shift; + tcg_src_hi = tcg_temp_new_i64(); + } else if (shift == 64) { + if (!accumulate && is_u) { + /* result is zero */ + tcg_gen_movi_i64(tcg_res, 0); + return; + } + } + + /* Deal with the rounding step */ + if (round) { + if (extended_result) { + TCGv_i64 tcg_zero = tcg_const_i64(0); + if (!is_u) { + /* take care of sign extending tcg_res */ + tcg_gen_sari_i64(tcg_src_hi, tcg_src, 63); + tcg_gen_add2_i64(tcg_src, tcg_src_hi, + tcg_src, tcg_src_hi, + tcg_rnd, tcg_zero); + } else { + tcg_gen_add2_i64(tcg_src, tcg_src_hi, + tcg_src, tcg_zero, + tcg_rnd, tcg_zero); + } + tcg_temp_free_i64(tcg_zero); + } else { + tcg_gen_add_i64(tcg_src, tcg_src, tcg_rnd); + } + } + + /* Now do the shift right */ + if (round && extended_result) { + /* extended case, >64 bit precision required */ + if (ext_lshift == 0) { + /* special case, only high bits matter */ + tcg_gen_mov_i64(tcg_src, tcg_src_hi); + } else { + tcg_gen_shri_i64(tcg_src, tcg_src, shift); + tcg_gen_shli_i64(tcg_src_hi, tcg_src_hi, ext_lshift); + tcg_gen_or_i64(tcg_src, tcg_src, tcg_src_hi); + } + } else { + if (is_u) { + if (shift == 64) { + /* essentially shifting in 64 zeros */ + tcg_gen_movi_i64(tcg_src, 0); + } else { + tcg_gen_shri_i64(tcg_src, tcg_src, shift); + } + } else { + if (shift == 64) { + /* effectively extending the sign-bit */ + tcg_gen_sari_i64(tcg_src, tcg_src, 63); + } else { + tcg_gen_sari_i64(tcg_src, tcg_src, shift); + } + } + } + + if (accumulate) { + tcg_gen_add_i64(tcg_res, tcg_res, tcg_src); + } else { + tcg_gen_mov_i64(tcg_res, tcg_src); + } + + if (extended_result) { + tcg_temp_free_i64(tcg_src_hi); + } +} + +/* Common SHL/SLI - Shift left with an optional insert */ +static void handle_shli_with_ins(TCGv_i64 tcg_res, TCGv_i64 tcg_src, + bool insert, int shift) +{ + if (insert) { /* SLI */ + tcg_gen_deposit_i64(tcg_res, tcg_res, tcg_src, shift, 64 - shift); + } else { /* SHL */ + tcg_gen_shli_i64(tcg_res, tcg_src, shift); + } +} + +/* SSHR[RA]/USHR[RA] - Scalar shift right (optional rounding/accumulate) */ +static void handle_scalar_simd_shri(DisasContext *s, + bool is_u, int immh, int immb, + int opcode, int rn, int rd) +{ + const int size = 3; + int immhb = immh << 3 | immb; + int shift = 2 * (8 << size) - immhb; + bool accumulate = false; + bool round = false; + TCGv_i64 tcg_rn; + TCGv_i64 tcg_rd; + TCGv_i64 tcg_round; + + if (!extract32(immh, 3, 1)) { + unallocated_encoding(s); + return; + } + + switch (opcode) { + case 0x02: /* SSRA / USRA (accumulate) */ + accumulate = true; + break; + case 0x04: /* SRSHR / URSHR (rounding) */ + round = true; + break; + case 0x06: /* SRSRA / URSRA (accum + rounding) */ + accumulate = round = true; + break; + } + + if (round) { + uint64_t round_const = 1ULL << (shift - 1); + tcg_round = tcg_const_i64(round_const); + } else { + TCGV_UNUSED_I64(tcg_round); + } + + tcg_rn = read_fp_dreg(s, rn); + tcg_rd = accumulate ? read_fp_dreg(s, rd) : tcg_temp_new_i64(); + + handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round, + accumulate, is_u, size, shift); + + write_fp_dreg(s, rd, tcg_rd); + + tcg_temp_free_i64(tcg_rn); + tcg_temp_free_i64(tcg_rd); + if (round) { + tcg_temp_free_i64(tcg_round); + } +} + +/* SHL/SLI - Scalar shift left */ +static void handle_scalar_simd_shli(DisasContext *s, bool insert, + int immh, int immb, int opcode, + int rn, int rd) +{ + int size = 32 - clz32(immh) - 1; + int immhb = immh << 3 | immb; + int shift = immhb - (8 << size); + TCGv_i64 tcg_rn = new_tmp_a64(s); + TCGv_i64 tcg_rd = new_tmp_a64(s); + + if (!extract32(immh, 3, 1)) { + unallocated_encoding(s); + return; + } + + tcg_rn = read_fp_dreg(s, rn); + tcg_rd = insert ? read_fp_dreg(s, rd) : tcg_temp_new_i64(); + + handle_shli_with_ins(tcg_rd, tcg_rn, insert, shift); + + write_fp_dreg(s, rd, tcg_rd); + + tcg_temp_free_i64(tcg_rn); + tcg_temp_free_i64(tcg_rd); +} + /* C3.6.9 AdvSIMD scalar shift by immediate * 31 30 29 28 23 22 19 18 16 15 11 10 9 5 4 0 * +-----+---+-------------+------+------+--------+---+------+------+ * | 0 1 | U | 1 1 1 1 1 0 | immh | immb | opcode | 1 | Rn | Rd | * +-----+---+-------------+------+------+--------+---+------+------+ + * + * This is the scalar version so it works on a fixed sized registers */ static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn) { - unsupported_encoding(s, insn); + int rd = extract32(insn, 0, 5); + int rn = extract32(insn, 5, 5); + int opcode = extract32(insn, 11, 5); + int immb = extract32(insn, 16, 3); + int immh = extract32(insn, 19, 4); + bool is_u = extract32(insn, 29, 1); + + switch (opcode) { + case 0x00: /* SSHR / USHR */ + case 0x02: /* SSRA / USRA */ + case 0x04: /* SRSHR / URSHR */ + case 0x06: /* SRSRA / URSRA */ + handle_scalar_simd_shri(s, is_u, immh, immb, opcode, rn, rd); + break; + case 0x0a: /* SHL / SLI */ + handle_scalar_simd_shli(s, is_u, immh, immb, opcode, rn, rd); + break; + default: + unsupported_encoding(s, insn); + break; + } } /* C3.6.10 AdvSIMD scalar three different @@ -5816,6 +6017,148 @@ static void disas_simd_scalar_indexed(DisasContext *s, uint32_t insn) unsupported_encoding(s, insn); } +/* SSHR[RA]/USHR[RA] - Vector shift right (optional rounding/accumulate) */ +static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, + int immh, int immb, int opcode, int rn, int rd) +{ + int size = 32 - clz32(immh) - 1; + int immhb = immh << 3 | immb; + int shift = 2 * (8 << size) - immhb; + bool accumulate = false; + bool round = false; + int dsize = is_q ? 128 : 64; + int esize = 8 << size; + int elements = dsize/esize; + TCGMemOp memop = size | (is_u ? 0 : MO_SIGN); + TCGv_i64 tcg_rn = new_tmp_a64(s); + TCGv_i64 tcg_rd = new_tmp_a64(s); + TCGv_i64 tcg_round; + int i; + + if (extract32(immh, 3, 1) && !is_q) { + unallocated_encoding(s); + return; + } + + if (size > 3 && !is_q) { + unallocated_encoding(s); + return; + } + + switch (opcode) { + case 0x02: /* SSRA / USRA (accumulate) */ + accumulate = true; + break; + case 0x04: /* SRSHR / URSHR (rounding) */ + round = true; + break; + case 0x06: /* SRSRA / URSRA (accum + rounding) */ + accumulate = round = true; + break; + } + + if (round) { + uint64_t round_const = 1ULL << (shift - 1); + tcg_round = tcg_const_i64(round_const); + } else { + TCGV_UNUSED_I64(tcg_round); + } + + for (i = 0; i < elements; i++) { + read_vec_element(s, tcg_rn, rn, i, memop); + if (accumulate) { + read_vec_element(s, tcg_rd, rd, i, memop); + } + + handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round, + accumulate, is_u, size, shift); + + write_vec_element(s, tcg_rd, rd, i, size); + } + + if (!is_q) { + clear_vec_high(s, rd); + } + + if (round) { + tcg_temp_free_i64(tcg_round); + } +} + +/* SHL/SLI - Vector shift left */ +static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert, + int immh, int immb, int opcode, int rn, int rd) +{ + int size = 32 - clz32(immh) - 1; + int immhb = immh << 3 | immb; + int shift = immhb - (8 << size); + int dsize = is_q ? 128 : 64; + int esize = 8 << size; + int elements = dsize/esize; + TCGv_i64 tcg_rn = new_tmp_a64(s); + TCGv_i64 tcg_rd = new_tmp_a64(s); + int i; + + if (extract32(immh, 3, 1) && !is_q) { + unallocated_encoding(s); + return; + } + + if (size > 3 && !is_q) { + unallocated_encoding(s); + return; + } + + for (i = 0; i < elements; i++) { + read_vec_element(s, tcg_rn, rn, i, size); + if (insert) { + read_vec_element(s, tcg_rd, rd, i, size); + } + + handle_shli_with_ins(tcg_rd, tcg_rn, insert, shift); + + write_vec_element(s, tcg_rd, rd, i, size); + } + + if (!is_q) { + clear_vec_high(s, rd); + } +} + +/* USHLL/SHLL - Vector shift left with widening */ +static void handle_vec_simd_wshli(DisasContext *s, bool is_q, bool is_u, + int immh, int immb, int opcode, int rn, int rd) +{ + int size = 32 - clz32(immh) - 1; + int immhb = immh << 3 | immb; + int shift = immhb - (8 << size); + int dsize = 64; + int esize = 8 << size; + int elements = dsize/esize; + TCGv_i64 tcg_rn = new_tmp_a64(s); + TCGv_i64 tcg_rd = new_tmp_a64(s); + int i; + + if (size >= 3) { + unallocated_encoding(s); + return; + } + + /* For the LL variants the store is larger than the load, + * so if rd == rn we would overwrite parts of our input. + * So load everything right now and use shifts in the main loop. + */ + read_vec_element(s, tcg_rn, rn, is_q ? 1 : 0, MO_64); + + for (i = 0; i < elements; i++) { + tcg_gen_shri_i64(tcg_rd, tcg_rn, i * esize); + ext_and_shift_reg(tcg_rd, tcg_rd, size | (!is_u << 2), 0); + tcg_gen_shli_i64(tcg_rd, tcg_rd, shift); + write_vec_element(s, tcg_rd, rd, i, size + 1); + } +} + + /* C3.6.14 AdvSIMD shift by immediate * 31 30 29 28 23 22 19 18 16 15 11 10 9 5 4 0 * +---+---+---+-------------+------+------+--------+---+------+------+ @@ -5824,7 +6167,35 @@ static void disas_simd_scalar_indexed(DisasContext *s, uint32_t insn) */ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn) { - unsupported_encoding(s, insn); + int rd = extract32(insn, 0, 5); + int rn = extract32(insn, 5, 5); + int opcode = extract32(insn, 11, 5); + int immb = extract32(insn, 16, 3); + int immh = extract32(insn, 19, 4); + bool is_u = extract32(insn, 29, 1); + bool is_q = extract32(insn, 30, 1); + + switch (opcode) { + case 0x00: /* SSHR / USHR */ + case 0x02: /* SSRA / USRA (accumulate) */ + case 0x04: /* SRSHR / URSHR (rounding) */ + case 0x06: /* SRSRA / URSRA (accum + rounding) */ + handle_vec_simd_shri(s, is_q, is_u, immh, immb, opcode, rn, rd); + break; + case 0x0a: /* SHL / SLI */ + handle_vec_simd_shli(s, is_q, is_u, immh, immb, opcode, rn, rd); + break; + case 0x14: /* SSHLL / USHLL */ + handle_vec_simd_wshli(s, is_q, is_u, immh, immb, opcode, rn, rd); + break; + default: + /* We don't currently implement any of the Narrow or saturating shifts; + * nor do we implement the fixed-point conversions in this + * encoding group (SCVTF, FCVTZS, UCVTF, FCVTZU). + */ + unsupported_encoding(s, insn); + return; + } } static void handle_3rd_widening(DisasContext *s, int is_q, int is_u, int size,