From patchwork Sun Mar 17 09:08:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 160470 Delivered-To: patch@linaro.org Received: by 2002:a02:5cc1:0:0:0:0:0 with SMTP id w62csp1409759jad; Sun, 17 Mar 2019 02:17:43 -0700 (PDT) X-Google-Smtp-Source: APXvYqxjnf2nC2m8gBlUyveKuH0MZ5VS/zK06YMJXQepcJfh7vPv0uOJpySWLrQiGIXun5vknUk7 X-Received: by 2002:adf:fe89:: with SMTP id l9mr8539958wrr.286.1552814263397; Sun, 17 Mar 2019 02:17:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1552814263; cv=none; d=google.com; s=arc-20160816; b=lsiHzHNX3zsgpiOj/5C7qDhpQK5To3VbqKDy/r7LaL1sumbaw//icpdPQNfZhR1g68 RMwx+L1UK2tbFaG8tL5q8u2LnZD8s7vn0kFwZ4JWsCT6CDnXOk1zhtDBULxJ9OiZg4mF UWh2foTOlyDSIwHFT0OjQqApnxW3zyQ+uFFOAyH3wStZo0cDRSurRisuYorWBlnOkpP+ 4FTXJdijf7S9Tux3w765sQMqEbjJDrxR3FfzGrZa4f8v8hGiB88DPpEG31EZATO42HLU 8VogGq7+zbpSNMWa7e/FhebbZpSaCVhCeGf31k8Le3rS1iO2IoI3a/0CAEjmth1TCqIu vcpw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=bBFnZ0BADMOEz5d0Rm7CJ8evRxG1eZY9LhoqyBMSN6s=; b=u6bspa0/tLtZcS+BkkGPjU+ecLLsfSeCYuZg/+XMLW8is71QGQVQ+F1Hb+r1qzMaIb pE5i5XbvARBC3tG7JIj4le0yBJxWmpcY3NM4Xo2LZZu44S7mGPXd+KxjZTsElJzv3S1Y vV7vSxcGlR1TvXmTMY6x5zW77JFs02wBmFRE7aeXeI3FPqgZ4GPhA6FlQAhEncN9Sob4 Cy/bWmlY5mGQ9mXvLuMlSt/OjFzMNhA1iJoq4aOZRVQELtNOQPaXKd5Iwp/YXjma2AXO 4JRxUfdlGeLMzqFd7bgf3zyA1dgPOskhqgApamdUEo25t/E5BKStfG5c33nUmUL94n0O +OWQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="zjtHqY/v"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id o12si4423613wme.76.2019.03.17.02.17.43 for (version=TLS1 cipher=AES128-SHA bits=128/128); Sun, 17 Mar 2019 02:17:43 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="zjtHqY/v"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:52198 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1h5RvK-0004PY-AD for patch@linaro.org; Sun, 17 Mar 2019 05:17:42 -0400 Received: from eggs.gnu.org ([209.51.188.92]:47780) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1h5Rmr-0005oG-In for qemu-devel@nongnu.org; Sun, 17 Mar 2019 05:08:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1h5Rmp-0004EY-5s for qemu-devel@nongnu.org; Sun, 17 Mar 2019 05:08:57 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:44613) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1h5Rmo-0004EA-Lq for qemu-devel@nongnu.org; Sun, 17 Mar 2019 05:08:55 -0400 Received: by mail-pg1-x544.google.com with SMTP id h34so9305186pgh.11 for ; Sun, 17 Mar 2019 02:08:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=bBFnZ0BADMOEz5d0Rm7CJ8evRxG1eZY9LhoqyBMSN6s=; b=zjtHqY/v1pblRSiZQ7u49s+r4DbZXAx3VLnmf731jIXna25ErUhTqJZxob0K3T+sVP m2iArHk4Jjozlfqvq4MExw7LjHV6OqMFORzr4Tp3PxtdkVFD8djsamVWBSf2OM2t0Ipw MlF3HJpfs9TY9LQSRp/U+VQxRc0ARu0pPv8Z9MP+TM5zDQguEaISgAy/sIMzdN8qPQYl kQlRlH8OKhf9WKtedPW2QYb3J/IvWnZyfvHFpemT9VDRxTNOysS2miwJnUeVs076Qs2j DDqTLvLDvvdTlnDgL6Y6LddaCwHqkX79bEvbPQTdZ/pL6s/U24iFAUKSIwNsGkht8gdY 0qow== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=bBFnZ0BADMOEz5d0Rm7CJ8evRxG1eZY9LhoqyBMSN6s=; b=GZoJPtRbv8KSviWQg4KcYHms3ZRSJNSXhODV2cCTjW/0I8GjKugRtpDHz5r5xZmr0q 0nNw5MSMuy43IwYh7J3oZ7kpw97jBbBamMk0NMb67nqQWeEw1kB14utdYnWC/O0sgnld HGs8nrKqOpAL7yuHvcHth7/kutFTnNGNHRweXlNl1V73TKUcFJKl2fzF/xoUKDpjfTAv fcvCjUGfeA7RHQLf3H+w15Zvif3P4WrqiBP8MdvLX0yvRFQrJ9slbDDcAQKmvomjQTGf lltZqa7RtdusXzZvVFY1LvjhhIWOo/I4wKUrI2Q3PwCK/H7/DajSCtDw7y4XzCr7K38Z BYOA== X-Gm-Message-State: APjAAAW6eFxAhuucQQjWjSdHR6a0z5O+kF3AjRa+h/k74FczuWibuPLU cHoWoUbUvTI90Y0AOQ1j9rdAs41imUw= X-Received: by 2002:a17:902:9307:: with SMTP id bc7mr13632407plb.234.1552813733377; Sun, 17 Mar 2019 02:08:53 -0700 (PDT) Received: from cloudburst.twiddle.net (97-113-188-82.tukw.qwest.net. [97.113.188.82]) by smtp.gmail.com with ESMTPSA id b85sm19378435pfj.56.2019.03.17.02.08.52 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sun, 17 Mar 2019 02:08:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sun, 17 Mar 2019 02:08:33 -0700 Message-Id: <20190317090834.5552-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190317090834.5552-1-richard.henderson@linaro.org> References: <20190317090834.5552-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::544 Subject: [Qemu-devel] [PATCH for-4.1 v2 12/13] tcg/ppc: Update vector support to v2.07 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.cave-ayland@ilande.co.uk, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" This includes single-word loads and stores, lots of double-word arithmetic, and a few extra logical operations. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.h | 3 +- tcg/ppc/tcg-target.inc.c | 155 +++++++++++++++++++++++++++++++++------ 2 files changed, 134 insertions(+), 24 deletions(-) -- 2.17.2 diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h index 5797ad35d5..4bbb33df7e 100644 --- a/tcg/ppc/tcg-target.h +++ b/tcg/ppc/tcg-target.h @@ -60,6 +60,7 @@ typedef enum { extern bool have_isa_altivec; extern bool have_isa_2_06; +extern bool have_isa_2_07_vsx; extern bool have_isa_3_00; /* optional instructions automatically implemented */ @@ -142,7 +143,7 @@ extern bool have_isa_3_00; #endif #define TCG_TARGET_HAS_andc_vec 1 -#define TCG_TARGET_HAS_orc_vec 0 +#define TCG_TARGET_HAS_orc_vec have_isa_2_07_vsx #define TCG_TARGET_HAS_not_vec 1 #define TCG_TARGET_HAS_neg_vec 0 #define TCG_TARGET_HAS_shi_vec 0 diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index e6f3dca394..b8df9b55cf 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -67,6 +67,7 @@ static tcg_insn_unit *tb_ret_addr; bool have_isa_altivec; bool have_isa_2_06; bool have_isa_2_06_vsx; +bool have_isa_2_07_vsx; bool have_isa_3_00; #define HAVE_ISA_2_06 have_isa_2_06 @@ -473,10 +474,12 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define LVEWX XO31(71) #define LXSDX XO31(588) /* v2.06 */ #define LXVDSX XO31(332) /* v2.06 */ +#define LXSIWZX XO31(12) /* v2.07 */ #define STVX XO31(231) #define STVEWX XO31(199) #define STXSDX XO31(716) /* v2.06 */ +#define STXSIWX XO31(140) /* v2.07 */ #define VADDSBS VX4(768) #define VADDUBS VX4(512) @@ -487,6 +490,7 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VADDSWS VX4(896) #define VADDUWS VX4(640) #define VADDUWM VX4(128) +#define VADDUDM VX4(192) /* v2.07 */ #define VSUBSBS VX4(1792) #define VSUBUBS VX4(1536) @@ -497,47 +501,62 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VSUBSWS VX4(1920) #define VSUBUWS VX4(1664) #define VSUBUWM VX4(1152) +#define VSUBUDM VX4(1216) /* v2.07 */ #define VMAXSB VX4(258) #define VMAXSH VX4(322) #define VMAXSW VX4(386) +#define VMAXSD VX4(450) /* v2.07 */ #define VMAXUB VX4(2) #define VMAXUH VX4(66) #define VMAXUW VX4(130) +#define VMAXUD VX4(194) /* v2.07 */ #define VMINSB VX4(770) #define VMINSH VX4(834) #define VMINSW VX4(898) +#define VMINSD VX4(962) /* v2.07 */ #define VMINUB VX4(514) #define VMINUH VX4(578) #define VMINUW VX4(642) +#define VMINUD VX4(706) /* v2.07 */ #define VCMPEQUB VX4(6) #define VCMPEQUH VX4(70) #define VCMPEQUW VX4(134) +#define VCMPEQUD VX4(199) /* v2.07 */ #define VCMPGTSB VX4(774) #define VCMPGTSH VX4(838) #define VCMPGTSW VX4(902) +#define VCMPGTSD VX4(967) /* v2.07 */ #define VCMPGTUB VX4(518) #define VCMPGTUH VX4(582) #define VCMPGTUW VX4(646) +#define VCMPGTUD VX4(711) /* v2.07 */ #define VSLB VX4(260) #define VSLH VX4(324) #define VSLW VX4(388) +#define VSLD VX4(1476) /* v2.07 */ #define VSRB VX4(516) #define VSRH VX4(580) #define VSRW VX4(644) +#define VSRD VX4(1732) /* v2.07 */ #define VSRAB VX4(772) #define VSRAH VX4(836) #define VSRAW VX4(900) +#define VSRAD VX4(964) /* v2.07 */ #define VRLB VX4(4) #define VRLH VX4(68) #define VRLW VX4(132) +#define VRLD VX4(196) /* v2.07 */ #define VMULEUB VX4(520) #define VMULEUH VX4(584) +#define VMULEUW VX4(648) /* v2.07 */ #define VMULOUB VX4(8) #define VMULOUH VX4(72) +#define VMULOUW VX4(136) /* v2.07 */ +#define VMULUWM VX4(137) /* v2.07 */ #define VMSUMUHM VX4(38) #define VMRGHB VX4(12) @@ -555,6 +574,9 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define VNOR VX4(1284) #define VOR VX4(1156) #define VXOR VX4(1220) +#define VEQV VX4(1668) /* v2.07 */ +#define VNAND VX4(1412) /* v2.07 */ +#define VORC VX4(1348) /* v2.07 */ #define VSPLTB VX4(524) #define VSPLTH VX4(588) @@ -568,6 +590,11 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, #define XXPERMDI (OPCD(60) | (10 << 3)) /* 2.06 */ +#define MFVSRD XO31(51) /* v2.07 */ +#define MFVSRWZ XO31(115) /* v2.07 */ +#define MTVSRD XO31(179) /* v2.07 */ +#define MTVSRWZ XO31(179) /* v2.07 */ + #define RT(r) ((r)<<21) #define RS(r) ((r)<<21) #define RA(r) ((r)<<16) @@ -691,7 +718,15 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) if (ret < 32 && arg < 32) { tcg_out32(s, OR | SAB(arg, ret, arg)); break; - } else if (ret < 32 || arg < 32) { + } else if (ret < 32 && have_isa_2_07_vsx) { + tcg_out32(s, (type == TCG_TYPE_I32 ? MFVSRWZ : MFVSRD) + | VRT(arg) | RA(ret) | 1); + break; + } else if (arg < 32 && have_isa_2_07_vsx) { + tcg_out32(s, (type == TCG_TYPE_I32 ? MTVSRWZ : MTVSRD) + | VRT(ret) | RA(arg) | 1); + break; + } else { /* Altivec does not support vector/integer moves. */ return false; } @@ -1113,6 +1148,10 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, tcg_out_mem_long(s, LWZ, LWZX, ret, base, offset); break; } + if (have_isa_2_07_vsx) { + tcg_out_mem_long(s, 0, LXSIWZX | 1, ret & 31, base, offset); + break; + } assert((offset & 3) == 0); tcg_out_mem_long(s, 0, LVEWX, ret & 31, base, offset); shift = (offset - 4) & 0xc; @@ -1159,6 +1198,10 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, tcg_out_mem_long(s, STW, STWX, arg, base, offset); break; } + if (have_isa_2_07_vsx) { + tcg_out_mem_long(s, 0, STXSIWX | 1, arg & 31, base, offset); + break; + } assert((offset & 3) == 0); shift = (offset - 4) & 0xc; if (shift) { @@ -2891,26 +2934,39 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_not_vec: case INDEX_op_dupm_vec: return 1; + case INDEX_op_orc_vec: + return have_isa_2_07_vsx; case INDEX_op_add_vec: case INDEX_op_sub_vec: case INDEX_op_smax_vec: case INDEX_op_smin_vec: case INDEX_op_umax_vec: case INDEX_op_umin_vec: + case INDEX_op_shlv_vec: + case INDEX_op_shrv_vec: + case INDEX_op_sarv_vec: + return vece <= MO_32 || have_isa_2_07_vsx; case INDEX_op_ssadd_vec: case INDEX_op_sssub_vec: case INDEX_op_usadd_vec: case INDEX_op_ussub_vec: - case INDEX_op_shlv_vec: - case INDEX_op_shrv_vec: - case INDEX_op_sarv_vec: return vece <= MO_32; case INDEX_op_cmp_vec: - case INDEX_op_mul_vec: case INDEX_op_shli_vec: case INDEX_op_shri_vec: case INDEX_op_sari_vec: - return vece <= MO_32 ? -1 : 0; + return vece <= MO_32 || have_isa_2_07_vsx ? -1 : 0; + case INDEX_op_mul_vec: + switch (vece) { + case MO_8: + case MO_16: + return -1; + case MO_32: + return have_isa_2_07_vsx ? 1 : -1; + case MO_64: + return have_isa_2_07_vsx ? -1 : 0; + } + return 0; default: return 0; } @@ -2974,28 +3030,28 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, const int *const_args) { static const uint32_t - add_op[4] = { VADDUBM, VADDUHM, VADDUWM, 0 }, - sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, 0 }, - eq_op[4] = { VCMPEQUB, VCMPEQUH, VCMPEQUW, 0 }, - gts_op[4] = { VCMPGTSB, VCMPGTSH, VCMPGTSW, 0 }, - gtu_op[4] = { VCMPGTUB, VCMPGTUH, VCMPGTUW, 0 }, + add_op[4] = { VADDUBM, VADDUHM, VADDUWM, VADDUDM }, + sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, VSUBUDM }, + eq_op[4] = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD }, + gts_op[4] = { VCMPGTSB, VCMPGTSH, VCMPGTSW, VCMPGTSD }, + gtu_op[4] = { VCMPGTUB, VCMPGTUH, VCMPGTUW, VCMPGTUD }, ssadd_op[4] = { VADDSBS, VADDSHS, VADDSWS, 0 }, usadd_op[4] = { VADDUBS, VADDUHS, VADDUWS, 0 }, sssub_op[4] = { VSUBSBS, VSUBSHS, VSUBSWS, 0 }, ussub_op[4] = { VSUBUBS, VSUBUHS, VSUBUWS, 0 }, - umin_op[4] = { VMINUB, VMINUH, VMINUW, 0 }, - smin_op[4] = { VMINSB, VMINSH, VMINSW, 0 }, - umax_op[4] = { VMAXUB, VMAXUH, VMAXUW, 0 }, - smax_op[4] = { VMAXSB, VMAXSH, VMAXSW, 0 }, - shlv_op[4] = { VSLB, VSLH, VSLW, 0 }, - shrv_op[4] = { VSRB, VSRH, VSRW, 0 }, - sarv_op[4] = { VSRAB, VSRAH, VSRAW, 0 }, + umin_op[4] = { VMINUB, VMINUH, VMINUW, VMINUD }, + smin_op[4] = { VMINSB, VMINSH, VMINSW, VMINSD }, + umax_op[4] = { VMAXUB, VMAXUH, VMAXUW, VMAXUD }, + smax_op[4] = { VMAXSB, VMAXSH, VMAXSW, VMAXSD }, + shlv_op[4] = { VSLB, VSLH, VSLW, VSLD }, + shrv_op[4] = { VSRB, VSRH, VSRW, VSRD }, + sarv_op[4] = { VSRAB, VSRAH, VSRAW, VSRAD }, mrgh_op[4] = { VMRGHB, VMRGHH, VMRGHW, 0 }, mrgl_op[4] = { VMRGLB, VMRGLH, VMRGLW, 0 }, - muleu_op[4] = { VMULEUB, VMULEUH, 0, 0 }, - mulou_op[4] = { VMULOUB, VMULOUH, 0, 0 }, + muleu_op[4] = { VMULEUB, VMULEUH, VMULEUW, 0 }, + mulou_op[4] = { VMULOUB, VMULOUH, VMULOUW, 0 }, pkum_op[4] = { VPKUHUM, VPKUWUM, 0, 0 }, - rotl_op[4] = { VRLB, VRLH, VRLW, 0 }; + rotl_op[4] = { VRLB, VRLH, VRLW, VRLD }; TCGType type = vecl + TCG_TYPE_V64; TCGArg a0 = args[0], a1 = args[1], a2 = args[2]; @@ -3017,6 +3073,10 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_sub_vec: insn = sub_op[vece]; break; + case INDEX_op_mul_vec: + tcg_debug_assert(vece == MO_32 && have_isa_2_07_vsx); + insn = VMULUWM; + break; case INDEX_op_ssadd_vec: insn = ssadd_op[vece]; break; @@ -3066,8 +3126,28 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, insn = VNOR; a2 = a1; break; + case INDEX_op_orc_vec: + insn = VORC; + break; case INDEX_op_dup_vec: + if (a1 < 32) { + bool ok; + switch (vece) { + case MO_64: + ok = tcg_out_mov(s, TCG_TYPE_I64, a0, a1); + break; + case MO_32: + case MO_16: + case MO_8: + ok = tcg_out_mov(s, TCG_TYPE_I32, a0, a1); + break; + default: + g_assert_not_reached(); + } + tcg_debug_assert(ok); + a1 = a0; + } /* Recall we use VSX integer loads, so the integer is right justified within the left (zero-index) double-word. */ switch (vece) { @@ -3165,7 +3245,7 @@ static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, { bool need_swap = false, need_inv = false; - tcg_debug_assert(vece <= MO_32); + tcg_debug_assert(vece <= MO_32 || have_isa_2_07_vsx); switch (cond) { case TCG_COND_EQ: @@ -3229,6 +3309,7 @@ static void expand_vec_mul(TCGType type, unsigned vece, TCGv_vec v0, break; case MO_32: + tcg_debug_assert(!have_isa_2_07_vsx); t3 = tcg_temp_new_vec(type); t4 = tcg_temp_new_vec(type); tcg_gen_dupi_vec(MO_8, t4, -16); @@ -3246,6 +3327,27 @@ static void expand_vec_mul(TCGType type, unsigned vece, TCGv_vec v0, tcg_temp_free_vec(t4); break; + case MO_64: + tcg_debug_assert(have_isa_2_07_vsx); + t3 = tcg_temp_new_vec(type); + t4 = tcg_temp_new_vec(type); + tcg_gen_dupi_vec(MO_8, t4, 32); + vec_gen_3(INDEX_op_ppc_rotl_vec, type, MO_64, tcgv_vec_arg(t2), + tcgv_vec_arg(v2), tcgv_vec_arg(t4)); + vec_gen_3(INDEX_op_ppc_mulou_vec, type, MO_32, tcgv_vec_arg(t3), + tcgv_vec_arg(v1), tcgv_vec_arg(t2)); + vec_gen_3(INDEX_op_ppc_muleu_vec, type, MO_32, tcgv_vec_arg(t2), + tcgv_vec_arg(v1), tcgv_vec_arg(t2)); + vec_gen_3(INDEX_op_ppc_mulou_vec, type, MO_32, tcgv_vec_arg(t1), + tcgv_vec_arg(v1), tcgv_vec_arg(v2)); + tcg_gen_add_vec(MO_64, t2, t2, t3); + vec_gen_3(INDEX_op_shlv_vec, type, MO_32, tcgv_vec_arg(t2), + tcgv_vec_arg(t2), tcgv_vec_arg(t4)); + tcg_gen_add_vec(MO_64, v0, t1, t2); + tcg_temp_free_vec(t3); + tcg_temp_free_vec(t4); + break; + default: g_assert_not_reached(); } @@ -3327,6 +3429,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) static const TCGTargetOpDef sub2 = { .args_ct_str = { "r", "r", "rI", "rZM", "r", "r" } }; static const TCGTargetOpDef v_r = { .args_ct_str = { "v", "r" } }; + static const TCGTargetOpDef v_vr = { .args_ct_str = { "v", "vr" } }; static const TCGTargetOpDef v_v = { .args_ct_str = { "v", "v" } }; static const TCGTargetOpDef v_v_v = { .args_ct_str = { "v", "v", "v" } }; static const TCGTargetOpDef v_v_v_v @@ -3494,8 +3597,9 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_ppc_rotl_vec: return &v_v_v; case INDEX_op_not_vec: - case INDEX_op_dup_vec: return &v_v; + case INDEX_op_dup_vec: + return have_isa_2_07_vsx ? &v_vr : &v_v; case INDEX_op_ld_vec: case INDEX_op_st_vec: case INDEX_op_dupm_vec: @@ -3522,6 +3626,11 @@ static void tcg_target_init(TCGContext *s) have_isa_2_06_vsx = true; } } + if (hwcap2 & PPC_FEATURE2_ARCH_2_07) { + if (hwcap & PPC_FEATURE_HAS_VSX) { + have_isa_2_07_vsx = true; + } + } #ifdef PPC_FEATURE2_ARCH_3_00 if (hwcap2 & PPC_FEATURE2_ARCH_3_00) { have_isa_3_00 = true;