From patchwork Sat Feb 17 18:23:05 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128733
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:05 -0800
Message-Id: <20180217182323.25885-50-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 49/67] target/arm: Implement SVE FP Multiply-Add Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 16 ++++++++++++++
 target/arm/sve_helper.c    | 53 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 41 +++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 17 +++++++++++++++
 4 files changed, 127 insertions(+)

-- 
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 84d0a8978c..a95f077c7f 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -827,6 +827,22 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
 DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index d80babfae7..6622275b44 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2948,6 +2948,59 @@ DO_ZPZ_FP_D(sve_ucvt_dd, uint64_t, uint64_to_float64)
 #undef DO_ZPZ_FP
 #undef DO_ZPZ_FP_D
 
+/* 4-operand predicated multiply-add.  This requires 7 operands to pass
+ * "properly", so we need to encode some of the registers into DESC.
+ */
+QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32);
+
+#define DO_FMLA(NAME, N, H, NEG1, NEG3)                                     \
+void HELPER(NAME)(CPUARMState *env, void *vg, uint32_t desc)                \
+{                                                                           \
+    intptr_t i = 0, opr_sz = simd_oprsz(desc);                              \
+    unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);                      \
+    unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);                  \
+    unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);                 \
+    unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);                 \
+    void *vd = &env->vfp.zregs[rd];                                         \
+    void *vn = &env->vfp.zregs[rn];                                         \
+    void *vm = &env->vfp.zregs[rm];                                         \
+    void *va = &env->vfp.zregs[ra];                                         \
+    do {                                                                    \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));                     \
+        do {                                                                \
+            if (likely(pg & 1)) {                                           \
+                float##N e1 = *(uint##N##_t *)(vn + H(i));                  \
+                float##N e2 = *(uint##N##_t *)(vm + H(i));                  \
+                float##N e3 = *(uint##N##_t *)(va + H(i));                  \
+                float##N r;                                                 \
+                if (NEG1) e1 = float##N##_chs(e1);                          \
+                if (NEG3) e3 = float##N##_chs(e3);                          \
+                r = float##N##_muladd(e1, e2, e3, 0, &env->vfp.fp_status);  \
+                *(uint##N##_t *)(vd + H(i)) = r;                            \
+            }                                                               \
+            i += sizeof(float##N), pg >>= sizeof(float##N);                 \
+        } while (i & 15);                                                   \
+    } while (i < opr_sz);                                                   \
+}
+
+DO_FMLA(sve_fmla_zpzzz_h, 16, H1_2, 0, 0)
+DO_FMLA(sve_fmla_zpzzz_s, 32, H1_4, 0, 0)
+DO_FMLA(sve_fmla_zpzzz_d, 64, , 0, 0)
+
+DO_FMLA(sve_fmls_zpzzz_h, 16, H1_2, 1, 0)
+DO_FMLA(sve_fmls_zpzzz_s, 32, H1_4, 1, 0)
+DO_FMLA(sve_fmls_zpzzz_d, 64, , 1, 0)
+
+DO_FMLA(sve_fnmla_zpzzz_h, 16, H1_2, 1, 1)
+DO_FMLA(sve_fnmla_zpzzz_s, 32, H1_4, 1, 1)
+DO_FMLA(sve_fnmla_zpzzz_d, 64, , 1, 1)
+
+DO_FMLA(sve_fnmls_zpzzz_h, 16, H1_2, 0, 1)
+DO_FMLA(sve_fnmls_zpzzz_s, 32, H1_4, 0, 1)
+DO_FMLA(sve_fnmls_zpzzz_d, 64, , 0, 1)
+
+#undef DO_FMLA
+
 /*
  * Load contiguous data, protected by a governing predicate.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 1692980d20..3124368fb5 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3208,6 +3208,47 @@ DO_FP3(FMULX, fmulx)
 
 #undef DO_FP3
 
+typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32);
+
+static void do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    unsigned desc;
+    TCGv_i32 t_desc;
+    TCGv_ptr pg = tcg_temp_new_ptr();
+
+    /* We would need 7 operands to pass these arguments "properly".
+     * So we encode all the register numbers into the descriptor.
+     */
+    desc = deposit32(a->rd, 5, 5, a->rn);
+    desc = deposit32(desc, 10, 5, a->rm);
+    desc = deposit32(desc, 15, 5, a->ra);
+    desc = simd_desc(vsz, vsz, desc);
+
+    t_desc = tcg_const_i32(desc);
+    tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
+    fn(cpu_env, pg, t_desc);
+    tcg_temp_free_i32(t_desc);
+    tcg_temp_free_ptr(pg);
+}
+
+#define DO_FMLA(NAME, name)                                                 \
+static void trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn)  \
+{                                                                           \
+    static gen_helper_sve_fmla * const fns[4] = {                           \
+        NULL, gen_helper_sve_##name##_h,                                    \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d                \
+    };                                                                      \
+    do_fmla(s, a, fns[a->esz]);                                             \
+}
+
+DO_FMLA(FMLA_zpzzz, fmla_zpzzz)
+DO_FMLA(FMLS_zpzzz, fmls_zpzzz)
+DO_FMLA(FNMLA_zpzzz, fnmla_zpzzz)
+DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz)
+
+#undef DO_FMLA
+
 /*
  *** SVE Floating Point Unary Operations Prediated Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 1a13c603ff..817833f96e 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -129,6 +129,8 @@
                 &rprrr_esz ra=%reg_movprfx
 @rdn_pg_ra_rm   ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \
                 &rprrr_esz rn=%reg_movprfx
+@rdn_pg_rm_ra   ........ esz:2 . ra:5 ... pg:3 rm:5 rd:5 \
+                &rprrr_esz rn=%reg_movprfx
 
 # One register operand, with governing predicate, vector element size
 @rd_pg_rn       ........ esz:2 ... ... ... pg:3 rn:5 rd:5       &rpr_esz
@@ -709,6 +711,21 @@ FMULX           01100101 .. 00 1010 100 ... ..... .....    @rdn_pg_rm
 FDIV            01100101 .. 00 1100 100 ... ..... .....    @rdm_pg_rn # FDIVR
 FDIV            01100101 .. 00 1101 100 ... ..... .....    @rdn_pg_rm
 
+### SVE FP Multiply-Add Group
+
+# SVE floating-point multiply-accumulate writing addend
+FMLA_zpzzz      01100101 .. 1 ..... 000 ... ..... .....    @rda_pg_rn_rm
+FMLS_zpzzz      01100101 .. 1 ..... 001 ... ..... .....    @rda_pg_rn_rm
+FNMLA_zpzzz     01100101 .. 1 ..... 010 ... ..... .....    @rda_pg_rn_rm
+FNMLS_zpzzz     01100101 .. 1 ..... 011 ... ..... .....    @rda_pg_rn_rm
+
+# SVE floating-point multiply-accumulate writing multiplicand
+# FMAD, FMSB, FNMAD, FNMSB
+FMLA_zpzzz      01100101 .. 1 ..... 100 ... ..... .....    @rdn_pg_rm_ra
+FMLS_zpzzz      01100101 .. 1 ..... 101 ... ..... .....    @rdn_pg_rm_ra
+FNMLA_zpzzz     01100101 .. 1 ..... 110 ... ..... .....    @rdn_pg_rm_ra
+FNMLS_zpzzz     01100101 .. 1 ..... 111 ... ..... .....    @rdn_pg_rm_ra
+
 ### SVE FP Unary Operations Predicated Group
 
 # SVE integer convert to floating-point
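
Note on the descriptor encoding: the patch passes the four Z register numbers (rd, rn, rm, ra) to the helper by packing them into the data field of the SIMD descriptor, five bits each, and the helper recovers them with extract32. The standalone sketch below illustrates that round trip outside QEMU. It is only an illustration under stated assumptions: DEMO_DATA_SHIFT is a hypothetical stand-in for SIMD_DATA_SHIFT (the value 10 is assumed here), and the demo_* functions are simplified local versions written for this note, not QEMU's own deposit32/extract32 or simd_desc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's SIMD_DATA_SHIFT; the value is assumed. */
#define DEMO_DATA_SHIFT 10

/* Simplified local bit-field helpers, written only for this sketch. */
static uint32_t demo_extract32(uint32_t value, int start, int length)
{
    return (value >> start) & ((1u << length) - 1);
}

static uint32_t demo_deposit32(uint32_t value, int start, int length,
                               uint32_t field)
{
    uint32_t mask = ((1u << length) - 1) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/*
 * Pack four 5-bit register numbers above the low DEMO_DATA_SHIFT bits,
 * in the same rd/rn/rm/ra order that do_fmla() uses in the patch.
 * low_bits stands in for the operand-size fields of the descriptor.
 */
static uint32_t demo_pack_regs(uint32_t low_bits, unsigned rd, unsigned rn,
                               unsigned rm, unsigned ra)
{
    uint32_t data;

    assert(rd < 32 && rn < 32 && rm < 32 && ra < 32);
    assert(low_bits < (1u << DEMO_DATA_SHIFT));
    data = demo_deposit32(rd, 5, 5, rn);
    data = demo_deposit32(data, 10, 5, rm);
    data = demo_deposit32(data, 15, 5, ra);
    return low_bits | (data << DEMO_DATA_SHIFT);
}

/* Recover one register number; index 0..3 selects rd, rn, rm, ra. */
static unsigned demo_unpack_reg(uint32_t desc, int index)
{
    return demo_extract32(desc, DEMO_DATA_SHIFT + 5 * index, 5);
}
```

For instance, after desc = demo_pack_regs(0, 1, 2, 3, 4), demo_unpack_reg(desc, 2) yields 3, the rm field. With the assumed shift of 10, the four fields occupy bits 10 through 29, which is the kind of bound the QEMU_BUILD_BUG_ON in the patch checks against the real SIMD_DATA_SHIFT.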