From patchwork Tue Mar 19 17:21:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 160607
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 19 Mar 2019 10:21:26 -0700
Message-Id: <20190319172126.7502-18-richard.henderson@linaro.org>
In-Reply-To: <20190319172126.7502-1-richard.henderson@linaro.org>
References: <20190319172126.7502-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH for-4.1 v3 17/17] tcg/ppc: Update vector support to v3.00
Cc: mark.cave-ayland@ilande.co.uk, david@gibson.dropbear.id.au

This includes vector load/store with immediate offset, some extra
move and splat insns, compare ne, and negate.

Signed-off-by: Richard Henderson
---
 tcg/ppc/tcg-target.h     |   3 +-
 tcg/ppc/tcg-target.inc.c | 103 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 94 insertions(+), 12 deletions(-)

-- 
2.17.2

diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 4546fcd83c..f3bc7fdc51 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -63,6 +63,7 @@ extern bool have_isa_2_06;
 extern bool have_isa_2_06_vsx;
 extern bool have_isa_2_07_vsx;
 extern bool have_isa_3_00;
+extern bool have_isa_3_00_vsx;

 /* optional instructions automatically implemented */
 #define TCG_TARGET_HAS_ext8u_i32        0 /* andi */
@@ -147,7 +148,7 @@ extern bool have_isa_3_00;
 #define TCG_TARGET_HAS_andc_vec         1
 #define TCG_TARGET_HAS_orc_vec          have_isa_2_07_vsx
 #define TCG_TARGET_HAS_not_vec          1
-#define TCG_TARGET_HAS_neg_vec          0
+#define TCG_TARGET_HAS_neg_vec          have_isa_3_00_vsx
 #define TCG_TARGET_HAS_shi_vec          0
 #define TCG_TARGET_HAS_shs_vec          0
 #define TCG_TARGET_HAS_shv_vec          1
diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c
index c340806158..d53aad9d5d 100644
--- a/tcg/ppc/tcg-target.inc.c
+++ b/tcg/ppc/tcg-target.inc.c
@@ -69,6 +69,7 @@ bool have_isa_2_06;
 bool have_isa_2_06_vsx;
 bool have_isa_2_07_vsx;
 bool have_isa_3_00;
+bool have_isa_3_00_vsx;

 #define HAVE_ISA_2_06  have_isa_2_06
 #define HAVE_ISEL      have_isa_2_06
@@ -475,11 +476,16 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type,
 #define LXSDX      XO31(588)       /* v2.06 */
 #define LXVDSX     XO31(332)       /* v2.06 */
 #define LXSIWZX    XO31(12)        /* v2.07 */
+#define LXV        (OPCD(61) | 1)  /* v3.00 */
+#define LXSD       (OPCD(57) | 2)  /* v3.00 */
+#define LXVWSX     XO31(364)       /* v3.00 */

 #define STVX       XO31(231)
 #define STVEWX     XO31(199)
 #define STXSDX     XO31(716)       /* v2.06 */
 #define STXSIWX    XO31(140)       /* v2.07 */
+#define STXV       (OPCD(61) | 5)  /* v3.00 */
+#define STXSD      (OPCD(61) | 2)  /* v3.00 */

 #define VADDSBS    VX4(768)
 #define VADDUBS    VX4(512)
@@ -503,6 +509,9 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type,
 #define VSUBUWM    VX4(1152)
 #define VSUBUDM    VX4(1216)       /* v2.07 */

+#define VNEGW      (VX4(1538) | (6 << 16))  /* v3.00 */
+#define VNEGD      (VX4(1538) | (7 << 16))  /* v3.00 */
+
 #define VMAXSB     VX4(258)
 #define VMAXSH     VX4(322)
 #define VMAXSW     VX4(386)
@@ -532,6 +541,9 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type,
 #define VCMPGTUH   VX4(582)
 #define VCMPGTUW   VX4(646)
 #define VCMPGTUD   VX4(711)        /* v2.07 */
+#define VCMPNEB    VX4(7)          /* v3.00 */
+#define VCMPNEH    VX4(71)         /* v3.00 */
+#define VCMPNEW    VX4(135)        /* v3.00 */

 #define VSLB       VX4(260)
 #define VSLH       VX4(324)
@@ -588,11 +600,14 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type,
 #define VSLDOI     VX4(44)

 #define XXPERMDI   (OPCD(60) | (10 << 3))   /* v2.06 */
+#define XXSPLTIB   (OPCD(60) | (360 << 1))  /* v3.00 */

 #define MFVSRD     XO31(51)        /* v2.07 */
 #define MFVSRWZ    XO31(115)       /* v2.07 */
 #define MTVSRD     XO31(179)       /* v2.07 */
 #define MTVSRWZ    XO31(179)       /* v2.07 */
+#define MTVSRDD    XO31(435)       /* v3.00 */
+#define MTVSRWS    XO31(403)       /* v3.00 */

 #define RT(r) ((r)<<21)
 #define RS(r) ((r)<<21)
@@ -919,6 +934,10 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, TCGReg ret,
             return;
         }
     }
+    if (have_isa_3_00_vsx && val == (tcg_target_long)dup_const(MO_8, val)) {
+        tcg_out32(s, XXSPLTIB | VRT(ret) | ((val & 0xff) << 11) | 1);
+        return;
+    }

     /*
      * Otherwise we must load the value from the constant pool.
@@ -1104,7 +1123,7 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt,
                              TCGReg base, tcg_target_long offset)
 {
     tcg_target_long orig = offset, l0, l1, extra = 0, align = 0;
-    bool is_store = false;
+    bool is_int_store = false;
     TCGReg rs = TCG_REG_TMP1;

     switch (opi) {
@@ -1117,11 +1136,20 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt,
             break;
         }
         break;
+    case LXSD:
+    case STXSD:
+        align = 3;
+        break;
+    case LXV: case LXV | 8:
+    case STXV: case STXV | 8:
+        /* The |8 cases force altivec registers. */
+        align = 15;
+        break;
     case STD:
         align = 3;
         /* FALLTHRU */
     case STB: case STH: case STW:
-        is_store = true;
+        is_int_store = true;
         break;
     }

@@ -1130,7 +1158,7 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt,
         if (rs == base) {
             rs = TCG_REG_R0;
         }
-        tcg_debug_assert(!is_store || rs != rt);
+        tcg_debug_assert(!is_int_store || rs != rt);
         tcg_out_movi(s, TCG_TYPE_PTR, rs, orig);
         tcg_out32(s, opx | TAB(rt, base, rs));
         return;
@@ -1194,7 +1222,8 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret,
     case TCG_TYPE_V64:
         tcg_debug_assert(ret >= 32);
         if (have_isa_2_06_vsx) {
-            tcg_out_mem_long(s, 0, LXSDX | 1, ret & 31, base, offset);
+            tcg_out_mem_long(s, have_isa_3_00_vsx ? LXSD : 0, LXSDX | 1,
+                             ret & 31, base, offset);
             break;
         }
         assert((offset & 7) == 0);
@@ -1206,7 +1235,8 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret,
     case TCG_TYPE_V128:
         tcg_debug_assert(ret >= 32);
         assert((offset & 15) == 0);
-        tcg_out_mem_long(s, 0, LVX, ret & 31, base, offset);
+        tcg_out_mem_long(s, have_isa_3_00_vsx ? LXV | 8 : 0, LVX,
+                         ret & 31, base, offset);
         break;
     default:
         g_assert_not_reached();
@@ -1245,7 +1275,8 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
     case TCG_TYPE_V64:
         tcg_debug_assert(arg >= 32);
         if (have_isa_2_06_vsx) {
-            tcg_out_mem_long(s, 0, STXSDX | 1, arg & 31, base, offset);
+            tcg_out_mem_long(s, have_isa_3_00_vsx ? STXSD : 0,
+                             STXSDX | 1, arg & 31, base, offset);
             break;
         }
         assert((offset & 7) == 0);
@@ -1258,7 +1289,8 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
         break;
     case TCG_TYPE_V128:
         tcg_debug_assert(arg >= 32);
-        tcg_out_mem_long(s, 0, STVX, arg & 31, base, offset);
+        tcg_out_mem_long(s, have_isa_3_00_vsx ? STXV | 8 : 0, STVX,
+                         arg & 31, base, offset);
         break;
     default:
         g_assert_not_reached();
@@ -2979,6 +3011,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
     case INDEX_op_shri_vec:
     case INDEX_op_sari_vec:
         return vece <= MO_32 || have_isa_2_07_vsx ? -1 : 0;
+    case INDEX_op_neg_vec:
+        return vece >= MO_32 && have_isa_3_00_vsx;
     case INDEX_op_mul_vec:
         switch (vece) {
         case MO_8:
@@ -2997,7 +3031,22 @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece,
                             TCGReg dst, TCGReg src)
 {
     tcg_debug_assert(dst >= 32);
-    tcg_debug_assert(src >= 32);
+
+    /* Splat from integer reg allowed via constraints for v3.00. */
+    if (src < 32) {
+        tcg_debug_assert(have_isa_3_00_vsx);
+        switch (vece) {
+        case MO_64:
+            tcg_out32(s, MTVSRDD | 1 | VRT(dst) | RA(src) | RB(src));
+            return true;
+        case MO_32:
+            tcg_out32(s, MTVSRWS | 1 | VRT(dst) | RA(src));
+            return true;
+        default:
+            /* Fail, so that we fall back on either dupm or mov+dup. */
+            return false;
+        }
+    }

     /*
      * Recall we use (or emulate) VSX integer loads, so the integer is
@@ -3036,7 +3085,11 @@ static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
     out &= 31;
     switch (vece) {
     case MO_8:
-        tcg_out_mem_long(s, 0, LVEBX, out, base, offset);
+        if (have_isa_3_00_vsx) {
+            tcg_out_mem_long(s, LXV | 8, LVX, out, base, offset & -16);
+        } else {
+            tcg_out_mem_long(s, 0, LVEBX, out, base, offset);
+        }
         elt = extract32(offset, 0, 4);
 #ifndef HOST_WORDS_BIGENDIAN
         elt ^= 15;
@@ -3045,7 +3098,11 @@ static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
         break;
     case MO_16:
         assert((offset & 1) == 0);
-        tcg_out_mem_long(s, 0, LVEHX, out, base, offset);
+        if (have_isa_3_00_vsx) {
+            tcg_out_mem_long(s, LXV | 8, LVX, out, base, offset & -16);
+        } else {
+            tcg_out_mem_long(s, 0, LVEHX, out, base, offset);
+        }
         elt = extract32(offset, 1, 3);
 #ifndef HOST_WORDS_BIGENDIAN
         elt ^= 7;
@@ -3053,6 +3110,10 @@ static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
         tcg_out32(s, VSPLTH | VRT(out) | VRB(out) | (elt << 16));
         break;
     case MO_32:
+        if (have_isa_3_00_vsx) {
+            tcg_out_mem_long(s, 0, LXVWSX | 1, out, base, offset);
+            break;
+        }
         assert((offset & 3) == 0);
         tcg_out_mem_long(s, 0, LVEWX, out, base, offset);
         elt = extract32(offset, 2, 2);
@@ -3092,7 +3153,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
     static const uint32_t
         add_op[4] = { VADDUBM, VADDUHM, VADDUWM, VADDUDM },
         sub_op[4] = { VSUBUBM, VSUBUHM, VSUBUWM, VSUBUDM },
+        neg_op[4] = { 0, 0, VNEGW, VNEGD },
         eq_op[4]  = { VCMPEQUB, VCMPEQUH, VCMPEQUW, VCMPEQUD },
+        ne_op[4]  = { VCMPNEB, VCMPNEH, VCMPNEW, 0 },
         gts_op[4] = { VCMPGTSB, VCMPGTSH, VCMPGTSW, VCMPGTSD },
         gtu_op[4] = { VCMPGTUB, VCMPGTUH, VCMPGTUW, VCMPGTUD },
         ssadd_op[4] = { VADDSBS, VADDSHS, VADDSWS, 0 },
@@ -3134,6 +3197,11 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_sub_vec:
         insn = sub_op[vece];
         break;
+    case INDEX_op_neg_vec:
+        insn = neg_op[vece];
+        a2 = a1;
+        a1 = 0;
+        break;
     case INDEX_op_mul_vec:
         tcg_debug_assert(vece == MO_32 && have_isa_2_07_vsx);
         insn = VMULUWM;
@@ -3196,6 +3264,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
         case TCG_COND_EQ:
             insn = eq_op[vece];
             break;
+        case TCG_COND_NE:
+            insn = ne_op[vece];
+            break;
         case TCG_COND_GT:
             insn = gts_op[vece];
             break;
@@ -3266,6 +3337,10 @@ static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0,
     case TCG_COND_GTU:
         break;
     case TCG_COND_NE:
+        if (have_isa_3_00_vsx && vece <= MO_32) {
+            break;
+        }
+        /* fall through */
     case TCG_COND_LE:
     case TCG_COND_LEU:
         need_inv = true;
@@ -3421,6 +3496,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
     static const TCGTargetOpDef sub2
         = { .args_ct_str = { "r", "r", "rI", "rZM", "r", "r" } };
     static const TCGTargetOpDef v_r = { .args_ct_str = { "v", "r" } };
+    static const TCGTargetOpDef v_vr = { .args_ct_str = { "v", "vr" } };
     static const TCGTargetOpDef v_v = { .args_ct_str = { "v", "v" } };
     static const TCGTargetOpDef v_v_v = { .args_ct_str = { "v", "v", "v" } };
     static const TCGTargetOpDef v_v_v_v
@@ -3588,8 +3664,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
     case INDEX_op_ppc_rotl_vec:
         return &v_v_v;
     case INDEX_op_not_vec:
-    case INDEX_op_dup_vec:
+    case INDEX_op_neg_vec:
         return &v_v;
+    case INDEX_op_dup_vec:
+        return have_isa_3_00_vsx ? &v_vr : &v_v;
     case INDEX_op_ld_vec:
     case INDEX_op_st_vec:
     case INDEX_op_dupm_vec:
@@ -3624,6 +3702,9 @@ static void tcg_target_init(TCGContext *s)
 #ifdef PPC_FEATURE2_ARCH_3_00
     if (hwcap2 & PPC_FEATURE2_ARCH_3_00) {
         have_isa_3_00 = true;
+        if (hwcap & PPC_FEATURE_HAS_VSX) {
+            have_isa_3_00_vsx = true;
+        }
     }
 #endif

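
Reviewer note on the XXSPLTIB path in tcg_out_dupi_vec: the standalone sketch below is not part of the patch. It only illustrates the byte-splat test and the instruction word that the new hunk assembles. The names splat_byte, is_byte_splat, opcd, vrt and encode_xxspltib are hypothetical. splat_byte stands in for dup_const(MO_8, val), and opcd/vrt assume that OPCD() shifts the primary opcode left by 26 and that VRT() masks the register to five bits and shifts it left by 21 (the diff itself only shows RT(r) as (r)<<21).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for dup_const(MO_8, val): replicate the low byte. */
static uint64_t splat_byte(uint64_t val)
{
    return (val & 0xff) * UINT64_C(0x0101010101010101);
}

/* True when all eight bytes of val equal its low byte. */
static bool is_byte_splat(uint64_t val)
{
    return val == splat_byte(val);
}

/* Assumed field helpers, modelled on OPCD() and VRT() as used in the patch. */
static uint32_t opcd(uint32_t opc)
{
    return opc << 26;
}

static uint32_t vrt(uint32_t reg)
{
    return (reg & 31) << 21;
}

/*
 * Build the word the hunk emits:
 *   XXSPLTIB | VRT(ret) | ((val & 0xff) << 11) | 1
 * with XXSPLTIB defined as OPCD(60) | (360 << 1).
 */
static uint32_t encode_xxspltib(uint32_t vreg, uint64_t val)
{
    uint32_t xxspltib = opcd(60) | (UINT32_C(360) << 1);

    return xxspltib | vrt(vreg) | ((uint32_t)(val & 0xff) << 11) | 1;
}
```

Under those assumptions, assert(is_byte_splat(UINT64_C(0x4242424242424242))) holds while 0x4242424242424243 is rejected, and encode_xxspltib(32, 0xff) yields 0xF007FAD1. The low bit set by the trailing "| 1" is what selects the upper half of the VSX register file, i.e. the Altivec registers that TCG numbers from 32.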