From patchwork Tue Feb 27 14:38:36 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 129806
From: Alex Bennée
To: qemu-arm@nongnu.org
Date: Tue, 27 Feb 2018 14:38:36 +0000
Message-Id: <20180227143852.11175-16-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.15.1
In-Reply-To: <20180227143852.11175-1-alex.bennee@linaro.org>
References: <20180227143852.11175-1-alex.bennee@linaro.org>
Subject: [Qemu-devel] [PATCH v4 15/31] arm/translate-a64: add FP16 x2 ops for simd_indexed
Cc: Alex Bennée, richard.henderson@linaro.org, qemu-devel@nongnu.org, Peter Maydell

A bunch of the vectorised bitwise operations just operate on larger
chunks at a time. We can do the same for the new half-precision
operations by introducing some TWOHALFOP helpers which work on each
half of a pair of half-precision operations at once.

Hopefully all this hoop jumping will get simpler once we have
generically vectorised helpers here.

Signed-off-by: Alex Bennée
Reviewed-by: Richard Henderson
---
v2
  - checkpatch fixes
---
 target/arm/helper-a64.c    | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 target/arm/helper-a64.h    | 10 ++++++++++
 target/arm/translate-a64.c | 26 +++++++++++++++++++++-----
 3 files changed, 76 insertions(+), 6 deletions(-)

--
2.15.1

diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index 8fdbe034f3..4d5ae96d8f 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -629,8 +629,32 @@ ADVSIMD_HALFOP(max)
 ADVSIMD_HALFOP(minnum)
 ADVSIMD_HALFOP(maxnum)
 
+#define ADVSIMD_TWOHALFOP(name) \
+uint32_t ADVSIMD_HELPER(name, 2h)(uint32_t two_a, uint32_t two_b, void *fpstp) \
+{ \
+    float16 a1, a2, b1, b2; \
+    uint32_t r1, r2; \
+    float_status *fpst = fpstp; \
+    a1 = extract32(two_a, 0, 16); \
+    a2 = extract32(two_a, 16, 16); \
+    b1 = extract32(two_b, 0, 16); \
+    b2 = extract32(two_b, 16, 16); \
+    r1 = float16_ ## name(a1, b1, fpst); \
+    r2 = float16_ ## name(a2, b2, fpst); \
+    return deposit32(r1, 16, 16, r2); \
+}
+
+ADVSIMD_TWOHALFOP(add)
+ADVSIMD_TWOHALFOP(sub)
+ADVSIMD_TWOHALFOP(mul)
+ADVSIMD_TWOHALFOP(div)
+ADVSIMD_TWOHALFOP(min)
+ADVSIMD_TWOHALFOP(max)
+ADVSIMD_TWOHALFOP(minnum)
+ADVSIMD_TWOHALFOP(maxnum)
+
 /* Data processing - scalar floating-point and advanced SIMD */
-float16 HELPER(advsimd_mulxh)(float16 a, float16 b, void *fpstp)
+static float16 float16_mulx(float16 a, float16 b, void *fpstp)
 {
     float_status *fpst = fpstp;
 
@@ -646,6 +670,9 @@ float16 HELPER(advsimd_mulxh)(float16 a, float16 b, void *fpstp)
     return float16_mul(a, b, fpst);
 }
 
+ADVSIMD_HALFOP(mulx)
+ADVSIMD_TWOHALFOP(mulx)
+
 /* fused multiply-accumulate */
 float16 HELPER(advsimd_muladdh)(float16 a, float16 b, float16 c, void *fpstp)
 {
@@ -653,6 +680,23 @@ float16 HELPER(advsimd_muladdh)(float16 a, float16 b, float16 c, void *fpstp)
     return float16_muladd(a, b, c, 0, fpst);
 }
 
+uint32_t HELPER(advsimd_muladd2h)(uint32_t two_a, uint32_t two_b,
+                                  uint32_t two_c, void *fpstp)
+{
+    float_status *fpst = fpstp;
+    float16 a1, a2, b1, b2, c1, c2;
+    uint32_t r1, r2;
+    a1 = extract32(two_a, 0, 16);
+    a2 = extract32(two_a, 16, 16);
+    b1 = extract32(two_b, 0, 16);
+    b2 = extract32(two_b, 16, 16);
+    c1 = extract32(two_c, 0, 16);
+    c2 = extract32(two_c, 16, 16);
+    r1 = float16_muladd(a1, b1, c1, 0, fpst);
+    r2 = float16_muladd(a2, b2, c2, 0, fpst);
+    return deposit32(r1, 16, 16, r2);
+}
+
 /*
  * Floating point comparisons produce an integer result. Softfloat
  * routines return float_relation types which we convert to the 0/-1
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 79012eee9d..003ffa582f 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -65,3 +65,13 @@ DEF_HELPER_3(advsimd_acge_f16, i32, f16, f16, ptr)
 DEF_HELPER_3(advsimd_acgt_f16, i32, f16, f16, ptr)
 DEF_HELPER_3(advsimd_mulxh, f16, f16, f16, ptr)
 DEF_HELPER_4(advsimd_muladdh, f16, f16, f16, f16, ptr)
+DEF_HELPER_3(advsimd_add2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_sub2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_mul2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_div2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_max2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_min2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_maxnum2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_minnum2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_mulx2h, i32, i32, i32, ptr)
+DEF_HELPER_4(advsimd_muladd2h, i32, i32, i32, i32, ptr)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 6a264bc134..3487c0430f 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11417,8 +11417,13 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                          * multiply-add */
                         tcg_gen_xori_i32(tcg_op, tcg_op, 0x80008000);
                     }
-                    gen_helper_advsimd_muladdh(tcg_res, tcg_op, tcg_idx,
-                                               tcg_res, fpst);
+                    if (is_scalar) {
+                        gen_helper_advsimd_muladdh(tcg_res, tcg_op, tcg_idx,
+                                                   tcg_res, fpst);
+                    } else {
+                        gen_helper_advsimd_muladd2h(tcg_res, tcg_op, tcg_idx,
+                                                    tcg_res, fpst);
+                    }
                     break;
                 case 2:
                     if (opcode == 0x5) {
@@ -11437,10 +11442,21 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                 switch (size) {
                 case 1:
                     if (u) {
-                        gen_helper_advsimd_mulxh(tcg_res, tcg_op, tcg_idx,
-                                                 fpst);
+                        if (is_scalar) {
+                            gen_helper_advsimd_mulxh(tcg_res, tcg_op,
+                                                     tcg_idx, fpst);
+                        } else {
+                            gen_helper_advsimd_mulx2h(tcg_res, tcg_op,
+                                                      tcg_idx, fpst);
+                        }
                     } else {
-                        g_assert_not_reached();
+                        if (is_scalar) {
+                            gen_helper_advsimd_mulh(tcg_res, tcg_op,
+                                                    tcg_idx, fpst);
+                        } else {
+                            gen_helper_advsimd_mul2h(tcg_res, tcg_op,
+                                                     tcg_idx, fpst);
+                        }
                     }
                     break;
                 case 2:
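
For readers who want to poke at the lane packing used by the TWOHALFOP
helpers outside of QEMU, here is a minimal standalone sketch. It is not
part of the patch. The extract32()/deposit32() functions are local
stand-ins written to behave like the QEMU bitops helpers of the same
name, and two_lane_op(), lane_op and lane_add() are hypothetical names
made up for this illustration. A wrapping 16-bit integer add takes the
place of the softfloat float16_* calls, so only the split/operate/repack
pattern is shown, not any floating-point behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for QEMU's extract32(): return 'length' bits of
 * 'value' starting at bit 'start'. */
static uint32_t extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Local stand-in for QEMU's deposit32(): replace 'length' bits of
 * 'value' starting at bit 'start' with the low bits of 'fieldval'. */
static uint32_t deposit32(uint32_t value, int start, int length,
                          uint32_t fieldval)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Hypothetical per-lane operation; stands in for float16_add() etc. */
typedef uint16_t (*lane_op)(uint16_t a, uint16_t b);

/* Assumption for the demo: plain wrapping add instead of softfloat. */
static uint16_t lane_add(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* Same shape as ADVSIMD_TWOHALFOP: split both 32-bit inputs into two
 * 16-bit lanes, apply the op to each lane independently, then repack
 * with lane 1 in bits [15:0] and lane 2 in bits [31:16]. */
static uint32_t two_lane_op(uint32_t two_a, uint32_t two_b, lane_op op)
{
    uint16_t a1 = (uint16_t)extract32(two_a, 0, 16);
    uint16_t a2 = (uint16_t)extract32(two_a, 16, 16);
    uint16_t b1 = (uint16_t)extract32(two_b, 0, 16);
    uint16_t b2 = (uint16_t)extract32(two_b, 16, 16);
    uint32_t r1 = op(a1, b1);
    uint32_t r2 = op(a2, b2);

    return deposit32(r1, 16, 16, r2);
}
```

Calling two_lane_op(0x00010002, 0x00030004, lane_add) processes the low
lanes (0x0002, 0x0004) and the high lanes (0x0001, 0x0003) separately.
Because each lane result is truncated to 16 bits before repacking, an
overflow in the low lane cannot carry into the high lane.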