From patchwork Fri Oct 30 02:26:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 319979 Delivered-To: patch@linaro.org Received: by 2002:a92:7b12:0:0:0:0:0 with SMTP id w18csp1013238ilc; Thu, 29 Oct 2020 19:26:40 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxKrTYl0kOS+oVbqKYw7RJj2u3g3P5gEVzRJ0S5IzijD87A0QwREml8L6jdCfS75xeiO8OQ X-Received: by 2002:a25:d40f:: with SMTP id m15mr520058ybf.393.1604024800183; Thu, 29 Oct 2020 19:26:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1604024800; cv=none; d=google.com; s=arc-20160816; b=Z5IEkwYAR4uIkdRp0ojy7Y6kEqTzXJk2TtEMgRek4+AM4sugYwbVOPBzbsMvBqW3sX cuK7EAQmTX5jKPuOAkmGdpn3It5/6yOkFq8kiFDxTjK8mklVWC5m1ch2ng7k4bNr87n+ jcgAquiYf1A7Z6BFV3vN0eY6hntj1JEzQjEi0diocw74PfLEZxdFKc0Ae8f6VdjsJ2ES tjj8AHybmg+u6uyg9ICfqDDWXm+XavgLUhqY05B0ryGrthRUCTdlAdNmJino7jVyB0Ys /NYmveFwhXaoj/a4tYLvEWCn1CTazOb+A8SuDcR1PvdGhg1FrRkec0bL5EhsNt0nsaaN fFZg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=pKwAEGss07SFEqvTkuenDxl2hW2mchkMUugv+tWaYr8=; b=VUERuKpi0vZQryPO+7MCGgnGx/01lQ/z+98uNVZjL/paQnlIM3vWAg+tiAZTCVnJsG fkIIlZlYrMlBP/aH9VmWBCjvgOcWDpois+3LBx8PZ+vuyMoIP9AtxPyCmDPBeZtid0oM ld/YC0/IgXZbHSoq6om3MJ5GVSBBhDXoJ7sfQ+Xpt9kVq0RhC+1Xuj2CqX8oumwHQn+y E5mGNEr1AOiMM7XW7S16KjYGKAyHhQgOfJdfbnaqWrxWI8eMOmK4xeSdL0asMt42nQZO LRHCVXv/e+Vb+VAcPquTMdSK9ktLzyZN96RDdDMbw9HxCaAkhilw1SwUpFo5tL+77nY6 G6YA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Pb6+X5fQ; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id e9si4915103ybo.281.2020.10.29.19.26.40 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Thu, 29 Oct 2020 19:26:40 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Pb6+X5fQ; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:49512 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYK7j-0004yv-Iv for patch@linaro.org; Thu, 29 Oct 2020 22:26:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39764) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7V-0004yj-Cq for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:25 -0400 Received: from mail-pl1-x643.google.com ([2607:f8b0:4864:20::643]:38793) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7T-0005uD-Es for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:25 -0400 Received: by mail-pl1-x643.google.com with SMTP id f21so2248346plr.5 for ; Thu, 29 Oct 2020 19:26:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=pKwAEGss07SFEqvTkuenDxl2hW2mchkMUugv+tWaYr8=; b=Pb6+X5fQwVWKsS3DbmfE6X7NPWGv8SEqx+8vr4/g5N2W7xi8DAgrDCwWKO84/WgNjw Qu9M0l7Rdd439+CNt0gc3s3OEIyv7P4ivgUG3cUe9m8wvAIpshePRM7qh89w7tReP2UM iaEgFSMBekfUkpEJIBfDTzo0CNByF5PW79NYTGewhfQenrLTfjlcdKZXN6MqRz/LzwF5 vupJHxAcgxpQD/gAD83Nd8Tvfl7caLNua68tX2DbQWteSzrpInIyQgm5pDn19L92zsgw lL9epjZYvWKi89N19lFPwmSYlsMsBGmalYK4CBs1cAzH3tbUXpWSCepStgc6EaCh+yQ8 YEXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=pKwAEGss07SFEqvTkuenDxl2hW2mchkMUugv+tWaYr8=; b=aIeamkedfAj7u0oMKhQL47BYOjYGm5GHdHqT4Rk4/vstdqEMt1DBSB7QP32SyJCy6E YO+OpIdpMM7w3Doiforzeky2zXyKiVJ2AFd5ZduLQ4B65udUmiNE46BTEwcgBWsmnGXU 9dsRz8QRUSd6TOiIxGdmWHcsIQYThiH27LpDFOBDfAZ9LI4l2J7m9EyQm6Sjk9pB800Y gxF1/4F1c0U1SFnwTrn/Z2qwY+nRj5M6bTUncAilYgWlqSZW7gy1rSVVUR+HoSBR7iS3 Lo34NVdvd/yuZF/J40JidkSDCUg3iGNwQdAsMnKFdQIKh/56jbFGMP5k+aIweQ+3xVdn qXzg== X-Gm-Message-State: AOAM5314DKKH9NEShshPkXA0IDumIvsACcLQzFK0uxs20hsShb1VDFUH ue6QMi3TIH2XtONx+ObT7B/HTC6mm/YJew== X-Received: by 2002:a17:902:fe07:b029:d6:88c5:f5d5 with SMTP id g7-20020a170902fe07b02900d688c5f5d5mr6013516plj.63.1604024781479; Thu, 29 Oct 2020 19:26:21 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:20 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 01/11] target/arm: Introduce neon_full_reg_offset Date: Thu, 29 Oct 2020 19:26:08 -0700 Message-Id: <20201030022618.785675-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::643; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x643.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" This function makes it clear that we're talking about the whole register, and not the 32-bit piece at index 0. This fixes a bug when running on a big-endian host. Signed-off-by: Richard Henderson --- target/arm/translate.c | 8 ++++++ target/arm/translate-neon.c.inc | 44 ++++++++++++++++----------------- target/arm/translate-vfp.c.inc | 2 +- 3 files changed, 31 insertions(+), 23 deletions(-) -- 2.25.1 diff --git a/target/arm/translate.c b/target/arm/translate.c index 38371db540..1b61e50f9c 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1094,6 +1094,14 @@ static inline void gen_hlt(DisasContext *s, int imm) unallocated_encoding(s); } +/* + * Return the offset of a "full" NEON Dreg. + */ +static long neon_full_reg_offset(unsigned reg) +{ + return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]); +} + static inline long vfp_reg_offset(bool dp, unsigned reg) { if (dp) { diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc index 4d1a292981..e259e24c05 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -76,7 +76,7 @@ neon_element_offset(int reg, int element, MemOp size) ofs ^= 8 - element_size; } #endif - return neon_reg_offset(reg, 0) + ofs; + return neon_full_reg_offset(reg) + ofs; } static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop) @@ -585,12 +585,12 @@ static bool trans_VLD_all_lanes(DisasContext *s, arg_VLD_all_lanes *a) * We cannot write 16 bytes at once because the * destination is unaligned. */ - tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0), + tcg_gen_gvec_dup_i32(size, neon_full_reg_offset(vd), 8, 8, tmp); - tcg_gen_gvec_mov(0, neon_reg_offset(vd + 1, 0), - neon_reg_offset(vd, 0), 8, 8); + tcg_gen_gvec_mov(0, neon_full_reg_offset(vd + 1), + neon_full_reg_offset(vd), 8, 8); } else { - tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0), + tcg_gen_gvec_dup_i32(size, neon_full_reg_offset(vd), vec_size, vec_size, tmp); } tcg_gen_addi_i32(addr, addr, 1 << size); @@ -691,9 +691,9 @@ static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a) static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn) { int vec_size = a->q ? 16 : 8; - int rd_ofs = neon_reg_offset(a->vd, 0); - int rn_ofs = neon_reg_offset(a->vn, 0); - int rm_ofs = neon_reg_offset(a->vm, 0); + int rd_ofs = neon_full_reg_offset(a->vd); + int rn_ofs = neon_full_reg_offset(a->vn); + int rm_ofs = neon_full_reg_offset(a->vm); if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { return false; @@ -1177,8 +1177,8 @@ static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn) { /* Handle a 2-reg-shift insn which can be vectorized. */ int vec_size = a->q ? 16 : 8; - int rd_ofs = neon_reg_offset(a->vd, 0); - int rm_ofs = neon_reg_offset(a->vm, 0); + int rd_ofs = neon_full_reg_offset(a->vd); + int rm_ofs = neon_full_reg_offset(a->vm); if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { return false; @@ -1620,8 +1620,8 @@ static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a, { /* FP operations in 2-reg-and-shift group */ int vec_size = a->q ? 16 : 8; - int rd_ofs = neon_reg_offset(a->vd, 0); - int rm_ofs = neon_reg_offset(a->vm, 0); + int rd_ofs = neon_full_reg_offset(a->vd); + int rm_ofs = neon_full_reg_offset(a->vm); TCGv_ptr fpst; if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { @@ -1756,7 +1756,7 @@ static bool do_1reg_imm(DisasContext *s, arg_1reg_imm *a, return true; } - reg_ofs = neon_reg_offset(a->vd, 0); + reg_ofs = neon_full_reg_offset(a->vd); vec_size = a->q ? 16 : 8; imm = asimd_imm_const(a->imm, a->cmode, a->op); @@ -2300,9 +2300,9 @@ static bool trans_VMULL_P_3d(DisasContext *s, arg_3diff *a) return true; } - tcg_gen_gvec_3_ool(neon_reg_offset(a->vd, 0), - neon_reg_offset(a->vn, 0), - neon_reg_offset(a->vm, 0), + tcg_gen_gvec_3_ool(neon_full_reg_offset(a->vd), + neon_full_reg_offset(a->vn), + neon_full_reg_offset(a->vm), 16, 16, 0, fn_gvec); return true; } @@ -2445,8 +2445,8 @@ static bool do_2scalar_fp_vec(DisasContext *s, arg_2scalar *a, { /* Two registers and a scalar, using gvec */ int vec_size = a->q ? 16 : 8; - int rd_ofs = neon_reg_offset(a->vd, 0); - int rn_ofs = neon_reg_offset(a->vn, 0); + int rd_ofs = neon_full_reg_offset(a->vd); + int rn_ofs = neon_full_reg_offset(a->vn); int rm_ofs; int idx; TCGv_ptr fpstatus; @@ -2477,7 +2477,7 @@ static bool do_2scalar_fp_vec(DisasContext *s, arg_2scalar *a, /* a->vm is M:Vm, which encodes both register and index */ idx = extract32(a->vm, a->size + 2, 2); a->vm = extract32(a->vm, 0, a->size + 2); - rm_ofs = neon_reg_offset(a->vm, 0); + rm_ofs = neon_full_reg_offset(a->vm); fpstatus = fpstatus_ptr(a->size == 1 ? FPST_STD_F16 : FPST_STD); tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, fpstatus, @@ -2923,7 +2923,7 @@ static bool trans_VDUP_scalar(DisasContext *s, arg_VDUP_scalar *a) return true; } - tcg_gen_gvec_dup_mem(a->size, neon_reg_offset(a->vd, 0), + tcg_gen_gvec_dup_mem(a->size, neon_full_reg_offset(a->vd), neon_element_offset(a->vm, a->index, a->size), a->q ? 16 : 8, a->q ? 16 : 8); return true; @@ -3412,8 +3412,8 @@ static bool trans_VCVT_F32_F16(DisasContext *s, arg_2misc *a) static bool do_2misc_vec(DisasContext *s, arg_2misc *a, GVecGen2Fn *fn) { int vec_size = a->q ? 16 : 8; - int rd_ofs = neon_reg_offset(a->vd, 0); - int rm_ofs = neon_reg_offset(a->vm, 0); + int rd_ofs = neon_full_reg_offset(a->vd); + int rm_ofs = neon_full_reg_offset(a->vm); if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { return false; diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc index a7ed9bc81b..368bae0a73 100644 --- a/target/arm/translate-vfp.c.inc +++ b/target/arm/translate-vfp.c.inc @@ -653,7 +653,7 @@ static bool trans_VDUP(DisasContext *s, arg_VDUP *a) } tmp = load_reg(s, a->rt); - tcg_gen_gvec_dup_i32(size, neon_reg_offset(a->vn, 0), + tcg_gen_gvec_dup_i32(size, neon_full_reg_offset(a->vn), vec_size, vec_size, tmp); tcg_temp_free_i32(tmp); From patchwork Fri Oct 30 02:26:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 319982 Delivered-To: patch@linaro.org Received: by 2002:a92:7b12:0:0:0:0:0 with SMTP id w18csp1014194ilc; Thu, 29 Oct 2020 19:28:54 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzJEnMAcOGpjfNCNapZEYAOVNGpp0sBZMI+ztKr/PUOqpAZftksELI/zTjJUUFC4RlOqhXq X-Received: by 2002:a25:cc8e:: with SMTP id l136mr587830ybf.10.1604024934489; Thu, 29 Oct 2020 19:28:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1604024934; cv=none; d=google.com; s=arc-20160816; b=Q9P9CBdubd5MZBbwwujeaQIBfxAyoqiS6NPPDK3c7HdaXJfgOl3v0izYW6+1HIJyfv 3sYwCRAZpLhMN0s2/z76Dh/xwpZaz8gtzeyTp3rf8dAe8/zyP3AWXWeOlbuWtiVtZaqo 4WHCGA6cxy/zLLZgAAc0phqwmOjLlDzDec+OI3evrAkRguel0jNStbzaKClCMQhkJpRv A0N9AkdYaNEr1Zw/RLD2zoArUO7dL/ngtYj72H40PS0c9HhktG59qlaQUPLtdkWrN5RN XfDH/0AFTvUWpdMN+wd46zF7nuRpiaOs38fYXmjz171dORM9yp/DokDg3pxCiWC917o2 Jj4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=d+BfMFnltv94D9h3YrYpUyscB1U2BxczjPk7SPCCLHg=; b=Q6Rnr7hBco/aDTtS6GcbsdsUsZdnWgG8szOzm+C/IeeIce0FPgCV140k6mAZDFFq3d twMkkn6K7uV0cGoS1ZQSjPBIoAAXDi0lHjDfHC2U5vFlHv+fQ68DK4Rux2obMpSxBzjl OdtcS8HGm9Skx99/554z6Do+1+dgr1fg7qPR28yMwzJ0aDKwdCrwNDQBTkZ2TAxHdiiD 863oihCuxXKjfUba++hGtAs7fT0WfI41HYz2rRugIz+NZZuO04b8/4C/SwuwVf05a4bi fgfCGRtqpl4DCYvAE0q1v9ZK0MS9p/XHwee0u5kz/6Ma5bI9QZc4NBFaDLgbEMEDWdCl Q/iA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=WmXeekiO; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id w8si5010576ybs.13.2020.10.29.19.28.54 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Thu, 29 Oct 2020 19:28:54 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=WmXeekiO; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:56470 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYK9t-0007rJ-TL for patch@linaro.org; Thu, 29 Oct 2020 22:28:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39790) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7W-0004yw-Kg for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:26 -0400 Received: from mail-pg1-x52f.google.com ([2607:f8b0:4864:20::52f]:42755) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7U-0005uL-GK for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:26 -0400 Received: by mail-pg1-x52f.google.com with SMTP id k9so2224676pgt.9 for ; Thu, 29 Oct 2020 19:26:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=d+BfMFnltv94D9h3YrYpUyscB1U2BxczjPk7SPCCLHg=; b=WmXeekiOayWAzZXmzjasX34a63S1llh07JYPPRrtY2FQvl40jDxg0TS4XjjGYVOFFE HiuMJWbhaRZeeh6KhgRDy1/zg6Qo5xDAvhKnxMce8wh6GmzLAIJ+H4Egn4NsbVOHrhrm p6OABvxb/wQVPAwK0Q2bUL8kTuQZwufwxFzUkt2rUQz3EtMzqE1bRoCRv9orEaFWfalD qiG0AUhiVwg8E/K3+WvXW28UrYZIbY9qjfANO+3dnszuyRucG5Eaz0ffYnj2slFzlCji yIzE+4Mapd4eZKy6tA7xGVFoXxOHtiVabYE3aavAVbJzHJ0gRH3shdAFaDBTVulqbXBI qAhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=d+BfMFnltv94D9h3YrYpUyscB1U2BxczjPk7SPCCLHg=; b=fKhajSbE4G8i0tbeNQrjdnczX8xKW96RYW81sJpXFem37sNtdLjEkMEZUikEkhCITa UUKX9g70xmMcZKjkg8a4r5dJ06ij5hCKk6YyvWnbOlKkowS7exRe86vzVwuA0+UGPDcq Ln9FD7uykq8AUkIVZVPIlNTnUDqeLuJuSRzOFN45tCianNdnnOSgzWJQLXI1LrzJ0zgj 3boqRkt7MwvOg3ilvTfrn5RfeMNHL6hRm2u96ad0oF42CbnwheNTiJ9fDINLtWKBsQQf NhzfgAqxVAv1Zv9+DgTwRyYZrnjGmmfg3Li5B/0nXAcnOAkorIesFtA832iPe7Sh5BLU 0/ew== X-Gm-Message-State: AOAM532M5Af1kRMecq/B4BY7JtO1gcULeodoNfqYA1pNGf+mLKDYchnz 946oCkVeXVstSz6K9+xoJ77zlacLiK+D1Q== X-Received: by 2002:a17:90a:3e09:: with SMTP id j9mr116490pjc.192.1604024782736; Thu, 29 Oct 2020 19:26:22 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:22 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 02/11] target/arm: Move neon_element_offset to translate.c Date: Thu, 29 Oct 2020 19:26:09 -0700 Message-Id: <20201030022618.785675-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52f; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52f.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" This will shortly have users outside of translate-neon.c.inc. Signed-off-by: Richard Henderson --- target/arm/translate.c | 20 ++++++++++++++++++++ target/arm/translate-neon.c.inc | 19 ------------------- 2 files changed, 20 insertions(+), 19 deletions(-) -- 2.25.1 diff --git a/target/arm/translate.c b/target/arm/translate.c index 1b61e50f9c..bf0b5cac61 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1102,6 +1102,26 @@ static long neon_full_reg_offset(unsigned reg) return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]); } +/* + * Return the offset of a 2**SIZE piece of a NEON register, at index ELE, + * where 0 is the least significant end of the register. + */ +static long neon_element_offset(int reg, int element, MemOp size) +{ + int element_size = 1 << size; + int ofs = element * element_size; +#ifdef HOST_WORDS_BIGENDIAN + /* + * Calculate the offset assuming fully little-endian, + * then XOR to account for the order of the 8-byte units. + */ + if (element_size < 8) { + ofs ^= 8 - element_size; + } +#endif + return neon_full_reg_offset(reg) + ofs; +} + static inline long vfp_reg_offset(bool dp, unsigned reg) { if (dp) { diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc index e259e24c05..96ab2248fc 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -60,25 +60,6 @@ static inline int neon_3same_fp_size(DisasContext *s, int x) #include "decode-neon-ls.c.inc" #include "decode-neon-shared.c.inc" -/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE, - * where 0 is the least significant end of the register. - */ -static inline long -neon_element_offset(int reg, int element, MemOp size) -{ - int element_size = 1 << size; - int ofs = element * element_size; -#ifdef HOST_WORDS_BIGENDIAN - /* Calculate the offset assuming fully little-endian, - * then XOR to account for the order of the 8-byte units. - */ - if (element_size < 8) { - ofs ^= 8 - element_size; - } -#endif - return neon_full_reg_offset(reg) + ofs; -} - static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop) { long offset = neon_element_offset(reg, ele, mop & MO_SIZE); From patchwork Fri Oct 30 02:26:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 316539 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91566C4363A for ; Fri, 30 Oct 2020 02:28:03 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0470820756 for ; Fri, 30 Oct 2020 02:28:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="bkm3wJea" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0470820756 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53106 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYK93-0006VO-TI for qemu-devel@archiver.kernel.org; Thu, 29 Oct 2020 22:28:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39802) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7X-000500-Fg for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:27 -0400 Received: from mail-pg1-x543.google.com ([2607:f8b0:4864:20::543]:41445) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7V-0005uY-T4 for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:27 -0400 Received: by mail-pg1-x543.google.com with SMTP id g12so3914825pgm.8 for ; Thu, 29 Oct 2020 19:26:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=PzEKwCdPZHtQpaS3rSzSooMjXk0UfsubT3zCzhcVGPc=; b=bkm3wJea1zNcaojLSIBcUZoqK8kiSOY5nOwAXlfCTiPs8D2D1/kDSc/t+ci9c/hXhW dbM3qnSPoEePGmQDPsgb9eoQDpVrWEnb2M7ouc/3Uj3lVOKunVFFiee87Hx5Jj6g68n5 OAQTHmDDsULgzFuZSWa1IAEqCu8v/ikTmCsyHdesUF1QB+7b+d3VOSCkCqdXDWUjOzxK 85uzDMrQvteYOXOin5mSJMpiggOaYI6Efpx0oNFSsVbbIVpYQ9Zeai9KyvbTYlqAI4dV suLXHn0py0E2BYb49uHlOypYh/mSvtnALCeN49L/WX7jIdAT8zZvtxldX2XgYx90eBKu c3RA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=PzEKwCdPZHtQpaS3rSzSooMjXk0UfsubT3zCzhcVGPc=; b=lohll/K2PL379tnCxADl0nCKxyVmKkDn7UZ3890w5+NEGhy7WwRSO0BQWEcE3uD2HX QAfMkemizQjC1PviYQyHqMry9yjuLcwDbOP8XP6fHuIdDuVxLD9lSIiFUNEhqbJgYrm0 XrbBQpsQErefCHtI1VwXeGlcp76rgzOQFnnzKHlBKfHZUimJjTdhNanaEOuCySVdDWXB WwTKj/po5A3pyRGlNpFvDAmROfrWLPWqoc736PGWvOuZ78dxvJ8CQd6lHQok2N9FQkTE nsr7Z1eIPGTVloXBGzNRhY84420+kLjfccpBCX+Ufdp2ZiINomg/hXUcQbXbM9iCqN+V Buag== X-Gm-Message-State: AOAM5321tqZ96GA1DSv+/G+xVvzT5IorfPBgRn0I4CIOsxrwzfav65gy 9fLAY/6CPPo0WvhRG5EEqvGqBdwutkOZ1A== X-Google-Smtp-Source: ABdhPJyfpVn6MZvV6omJsoDUwzsUXiAYaUjeybWkdVwPqX3IBtVWg583+MGh3+oyhccK7QCuX/S1XQ== X-Received: by 2002:a17:90a:5b15:: with SMTP id o21mr92188pji.45.1604024784188; Thu, 29 Oct 2020 19:26:24 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:23 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 03/11] target/arm: Use neon_element_offset in neon_load/store_reg Date: Thu, 29 Oct 2020 19:26:10 -0700 Message-Id: <20201030022618.785675-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::543; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x543.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" These are the only users of neon_reg_offset, so remove that. Signed-off-by: Richard Henderson --- target/arm/translate.c | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index bf0b5cac61..88a926d1df 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1137,26 +1137,16 @@ static inline long vfp_reg_offset(bool dp, unsigned reg) } } -/* Return the offset of a 32-bit piece of a NEON register. - zero is the least significant end of the register. */ -static inline long -neon_reg_offset (int reg, int n) -{ - int sreg; - sreg = reg * 2 + n; - return vfp_reg_offset(0, sreg); -} - static TCGv_i32 neon_load_reg(int reg, int pass) { TCGv_i32 tmp = tcg_temp_new_i32(); - tcg_gen_ld_i32(tmp, cpu_env, neon_reg_offset(reg, pass)); + tcg_gen_ld_i32(tmp, cpu_env, neon_element_offset(reg, pass, MO_32)); return tmp; } static void neon_store_reg(int reg, int pass, TCGv_i32 var) { - tcg_gen_st_i32(var, cpu_env, neon_reg_offset(reg, pass)); + tcg_gen_st_i32(var, cpu_env, neon_element_offset(reg, pass, MO_32)); tcg_temp_free_i32(var); } From patchwork Fri Oct 30 02:26:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 319985 Delivered-To: patch@linaro.org Received: by 2002:a92:7b12:0:0:0:0:0 with SMTP id w18csp1015056ilc; Thu, 29 Oct 2020 19:30:42 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwJPyBWSEDrTXOv+rhSIyqSLXj9buydZtCEmiGZKvpL7UvQiygqVz+5HqP6VpeJMRdTRVVU X-Received: by 2002:a25:be90:: with SMTP id i16mr497419ybk.189.1604025042506; Thu, 29 Oct 2020 19:30:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1604025042; cv=none; d=google.com; s=arc-20160816; b=VBpDpc6fHrrE4bex6OmUPA0NSVuUG0MpVTeC2ZTxFUkd7+e3WyigVYTcWb0pbHAP4Z Y/K761Vnc1UdNMwhFhmL+KriM5aRKiobM/xetGXqzr6u9Osg2g5L/QdrNL3e/1FHQS2b 7mv5gR5e0PyWJ2a1h3NrwpK5P7TBy7qdOIqMV+HXqxMShbf4hVXTZIVn5bfuaOBxVrfm Z1U6W///pyAs3/oOtBh/bX9iB4sgNf73yTl02bH1fTreJalvA3mwqJ1zFs/6aule6ebq GvvJDbaGe3gNPLxvH+UzrIA0v0AR+V/LQ5+d0cjcsglgCyYFiVSdVsjlc09KQ6lNpV5Q 4iTg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=h4nXGEfFKdYYxAxVCeZfgKCWCHDlbl0aB+zYefgN5rQ=; b=P6wf3GXa252kABVzXn9mi+/fngAi2ZVljhsB4LvYSv8ptEcvvG6HMxO+GYPfO3V3yI s34XvIn3pMx/i1GKMFqVnZU9p6u2mhYOhODvgqGp2iwxzCrXr5VsPICk5eQHdQD7NbUr 2E2c7wrLOgCrvwKflQUpbHfjVE98UY/BmbZhIHn3xzHgw2IdlTzs4dm8gK3cKkMmGbqT H/b5koBH9HoK/mTsyOCJFykrJ+wU98/WMiBrnrnqDCz4q+bloaIJP4MLA87/poLjVhKV Z4/zvrd6I5a3tFVfY2nQXVlgZqY8cHAbHpV/NtU4DOqc6OL4k/PctCaUXzAFdKIm706R 58nA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=beTkeJwF; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id e4si5044201ybp.361.2020.10.29.19.30.42 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Thu, 29 Oct 2020 19:30:42 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=beTkeJwF; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:34252 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKBd-0001t7-V6 for patch@linaro.org; Thu, 29 Oct 2020 22:30:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39812) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7Y-000513-H5 for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:28 -0400 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:39071) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7X-0005un-2B for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:28 -0400 Received: by mail-pf1-x444.google.com with SMTP id e15so3979424pfh.6 for ; Thu, 29 Oct 2020 19:26:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=h4nXGEfFKdYYxAxVCeZfgKCWCHDlbl0aB+zYefgN5rQ=; b=beTkeJwF0RNwyM5S8cby2EtTfgia+VK55dpYJO451RHn6Suyf1C/HNkT9a+FT1dDyQ CbSYtq529xfbp08/Nbw5kKYqfjWICC5UOpRhLdT7IIPt71CbDwsHaG8QIjioW8wxhemS TmYEvg29N3/NPZ83iZcOodJUubgQxQGqzF7JYIL0TQmBJRvaLTCxJ9fRw1ITp+Kjr+DW 5W64PD7kDbe/kTEroV5e/7qC9p1Qz2IqYdyYylxlVSGy49VgG++YrGIanbR15btrkaHG jvQ24pA8btIMd31qxV3NUKlZZaMgHJGQVjLOsvWbWTJxQSGlHw8PwQSizgmoldfFLzEc a/dA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=h4nXGEfFKdYYxAxVCeZfgKCWCHDlbl0aB+zYefgN5rQ=; b=mSp6wMGfHDFnceNqns6HuAl+hqXYB/lO5SlGqNgd/8EF2QO9jcqEjgjoh4YToJjgCl 15+y7y380QZs80u3irWJIkKbv85xzqaXklHVJAo7PbfG5n8wzoQofWjJb2eoy/vEBP38 Yhgo2HeP8r3wXLFL57NFMdtZ+j1+CSRQ4JFQj6/Uye4ViG9OfRmVpHpMEGOmUFMJLcSE wHWQymT0ImSGdY8xxpXMALCWG76BulxbkbiH8QxkesViw4XCS2BDBqzJjUwmHqFdMlFT /GEaVBNkF0p9LNOb52pCQMGjhMc8+a5K7mSc7JIe4hoES3N41rV1GY9KOqV+AC5erSI7 d64Q== X-Gm-Message-State: AOAM533c5X4PR/K2kXRcS3CvxaQjSLyE8SdW1HQdXtfkLkGTnukpWSgc I78clPhswLlNRHMG/6mHdqE193kHroSBYQ== X-Received: by 2002:a17:90a:fa93:: with SMTP id cu19mr89530pjb.117.1604024785397; Thu, 29 Oct 2020 19:26:25 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:24 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 04/11] target/arm: Use neon_element_offset in vfp_reg_offset Date: Thu, 29 Oct 2020 19:26:11 -0700 Message-Id: <20201030022618.785675-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::444; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x444.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" This seems a bit more readable than using offsetof CPU_DoubleU. Signed-off-by: Richard Henderson --- target/arm/translate.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) -- 2.25.1 diff --git a/target/arm/translate.c b/target/arm/translate.c index 88a926d1df..88ded4ac2c 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1122,18 +1122,13 @@ static long neon_element_offset(int reg, int element, MemOp size) return neon_full_reg_offset(reg) + ofs; } -static inline long vfp_reg_offset(bool dp, unsigned reg) +/* Return the offset of a VFP Dreg (dp = true) or VFP Sreg (dp = false). */ +static long vfp_reg_offset(bool dp, unsigned reg) { if (dp) { - return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]); + return neon_element_offset(reg, 0, MO_64); } else { - long ofs = offsetof(CPUARMState, vfp.zregs[reg >> 2].d[(reg >> 1) & 1]); - if (reg & 1) { - ofs += offsetof(CPU_DoubleU, l.upper); - } else { - ofs += offsetof(CPU_DoubleU, l.lower); - } - return ofs; + return neon_element_offset(reg >> 1, reg & 1, MO_32); } } From patchwork Fri Oct 30 02:26:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 319981 Delivered-To: patch@linaro.org Received: by 2002:a92:7b12:0:0:0:0:0 with SMTP id w18csp1013634ilc; Thu, 29 Oct 2020 19:27:38 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzq4ulQPOt2AcDujDsE/xIHeCuRNbtcb+88lTsEZXFLY5eFxXPrPzzzYVz+hr3jngWDtf4q X-Received: by 2002:a25:3892:: with SMTP id f140mr611862yba.150.1604024858084; Thu, 29 Oct 2020 19:27:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1604024858; cv=none; d=google.com; s=arc-20160816; b=08ILylegOZ3w+FzKi9jNMn8GXCV5/581X68rJg5wtezubeW9RgQnuh6A7q7U/fT3rM jKhbvgeOM4Bu5WeqCLJo90Mp3qIYMcUDkfMSl/KkY6q7YLHdlniyEGR1Q9rBhY7Mxwqd 3d4nlKTSbbCufqRAIEnE9zm86FxpkknyJG3VPm1Q7vL1aGK10wpcDZ4I2+58+1L6kpN7 QHYPpZcPkxdRmSP5nbSj1qO9xBYGbYd9qKjHVpLabfmX8myIzYEfbrtx/7Kp2bRacunh vwYwKlPpUUwS37XgYbpwMHW5puNXxI/CQfV/S8F33IYmlXpRKTfd2Wt/VsLdXzLxea0d AmWQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=q8xYB3OrHbx1hmimSvEy8fL3WL3tKZ5pAYr2yKnH6ks=; b=xY9rSK79LV/CbnGq7QmVX39kCc2a/sfoTjKslCbj9o+eZdrAVyWW9fWcexTnwj4jHL Yr4niJTlwhn408Ytm/jKNKlZ2exc40Gktq2T57jPVqV8k5+I8D63YiI3Vmn4Wbz4z4bf 3xdLQP6g0cOxgsPFWwyO20roU/jbZki94cboQci5fXz/E/zr0iGy+9HJvbQaL4lhzO6p t5TZk450nX0oy6pHOghvmQ6i6YaQJigYjeVNa5danF1slMzZxdhC3lmbOHvS8pbbsK1i 7k9B51I5mCnzuwHnZUqO33hJ+mOx+Bhcxb+FqOhi6rEJVA7fbI8eORAbOCgQ0QoQnKD1 zz6w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=C8jYjr8X; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id 4si5305848ybe.256.2020.10.29.19.27.37 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Thu, 29 Oct 2020 19:27:38 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=C8jYjr8X; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:49898 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYK8f-0005Ag-FX for patch@linaro.org; Thu, 29 Oct 2020 22:27:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39838) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7b-00057I-LU for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:31 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:42706) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7Y-0005v5-Us for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:31 -0400 Received: by mail-pf1-x443.google.com with SMTP id x13so3965086pfa.9 for ; Thu, 29 Oct 2020 19:26:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=q8xYB3OrHbx1hmimSvEy8fL3WL3tKZ5pAYr2yKnH6ks=; b=C8jYjr8X25v5WoD6bjpaRE+WPA59vGRyTYZCTXbqE4gXBoNujPYcaXGuHMvJrrkQPy BkOEqV2zD7CizzhXFZxOyVSidd23Q9K+M7UtMDG4BPVlYsuyuIbK5pAN5j9moRwn7wHQ vmeyWRo911XGiqbuO5L72IFQFSC25acAC/sWFzc/SbyN9bkTzKa3LF38YtNveBc/DypM DqSuzE5E/plyv7FJtAiyadRa3ZUMR9C+rzcYuawcGO1SVmVq5VoAm0TiempUzj1ow4Pa 330yGoaOUYiXRC3VQ+9ivTOhL+S2oi2CeRrb3JDpY33ziCrD+5xKFTDTnglWmdcZg/Z2 zpeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=q8xYB3OrHbx1hmimSvEy8fL3WL3tKZ5pAYr2yKnH6ks=; b=iPcRKxROcw0wNGbdpYNLqMWm5zbG9GAgqFn2wTRwvEdQz1RUr0g75cxXgeZT1QMruS Ty7BaqgxVAFNZVZv9X2FdflbzSsYv1brKJBPUlhmC72bONqSCMtJTB45xZQT7T/DbB+x WO0xFQyIu0XZguB2FSv4x60Ewq6lV8fGtLQ9iJo5hyAsKB6e8ALhRYB7/SDd6Lv5cxRZ LYrnhfQwg0qeCiLPVhkqY4mVArkShwb4MOi6s3HeIxhs4abDvctMu6F5IWNZBHlXpSL2 rjACMiG5jUPmcoFEqhUpD6+61ImzYPmbv2I3/1bzAGW68Ved37+m32ITgnVP1TOH/mfQ L4vQ== X-Gm-Message-State: AOAM531K2dPbRdOHZB6z2tl42FdIqYX2yJmVasxxFq6huaNeMn6grSiS vnFedP84n/XxKGDRhXnCDSJ/MgMw82+VSw== X-Received: by 2002:a62:dd56:0:b029:155:8165:c6c2 with SMTP id w83-20020a62dd560000b02901558165c6c2mr7001060pff.3.1604024786853; Thu, 29 Oct 2020 19:26:26 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 05/11] target/arm: Add read/write_neon_element32 Date: Thu, 29 Oct 2020 19:26:12 -0700 Message-Id: <20201030022618.785675-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::443; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x443.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Model these off the aa64 read/write_vec_element functions. Use it within translate-neon.c.inc. The new functions do not allocate or free temps, so this rearranges the calling code a bit. Signed-off-by: Richard Henderson --- target/arm/translate.c | 26 ++++ target/arm/translate-neon.c.inc | 256 ++++++++++++++++++++------------ 2 files changed, 183 insertions(+), 99 deletions(-) -- 2.25.1 diff --git a/target/arm/translate.c b/target/arm/translate.c index 88ded4ac2c..0ed9eab0b0 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1165,6 +1165,32 @@ static inline void neon_store_reg32(TCGv_i32 var, int reg) tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg)); } +static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp size) +{ + long off = neon_element_offset(reg, ele, size); + + switch (size) { + case MO_32: + tcg_gen_ld_i32(dest, cpu_env, off); + break; + default: + g_assert_not_reached(); + } +} + +static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp size) +{ + long off = neon_element_offset(reg, ele, size); + + switch (size) { + case MO_32: + tcg_gen_st_i32(src, cpu_env, off); + break; + default: + g_assert_not_reached(); + } +} + static TCGv_ptr vfp_reg_ptr(bool dp, int reg) { TCGv_ptr ret = tcg_temp_new_ptr(); diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc index 96ab2248fc..549381703e 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -956,18 +956,24 @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn) * early. Since Q is 0 there are always just two passes, so instead * of a complicated loop over each pass we just unroll. */ - tmp = neon_load_reg(a->vn, 0); - tmp2 = neon_load_reg(a->vn, 1); + tmp = tcg_temp_new_i32(); + tmp2 = tcg_temp_new_i32(); + tmp3 = tcg_temp_new_i32(); + + read_neon_element32(tmp, a->vn, 0, MO_32); + read_neon_element32(tmp2, a->vn, 1, MO_32); fn(tmp, tmp, tmp2); - tcg_temp_free_i32(tmp2); - tmp3 = neon_load_reg(a->vm, 0); - tmp2 = neon_load_reg(a->vm, 1); + read_neon_element32(tmp3, a->vm, 0, MO_32); + read_neon_element32(tmp2, a->vm, 1, MO_32); fn(tmp3, tmp3, tmp2); - tcg_temp_free_i32(tmp2); - neon_store_reg(a->vd, 0, tmp); - neon_store_reg(a->vd, 1, tmp3); + write_neon_element32(tmp, a->vd, 0, MO_32); + write_neon_element32(tmp3, a->vd, 1, MO_32); + + tcg_temp_free_i32(tmp); + tcg_temp_free_i32(tmp2); + tcg_temp_free_i32(tmp3); return true; } @@ -1275,7 +1281,7 @@ static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a, * 2-reg-and-shift operations, size < 3 case, where the * helper needs to be passed cpu_env. */ - TCGv_i32 constimm; + TCGv_i32 constimm, tmp; int pass; if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { @@ -1301,12 +1307,14 @@ static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a, * by immediate using the variable shift operations. */ constimm = tcg_const_i32(dup_const(a->size, a->shift)); + tmp = tcg_temp_new_i32(); for (pass = 0; pass < (a->q ? 4 : 2); pass++) { - TCGv_i32 tmp = neon_load_reg(a->vm, pass); + read_neon_element32(tmp, a->vm, pass, MO_32); fn(tmp, cpu_env, tmp, constimm); - neon_store_reg(a->vd, pass, tmp); + write_neon_element32(tmp, a->vd, pass, MO_32); } + tcg_temp_free_i32(tmp); tcg_temp_free_i32(constimm); return true; } @@ -1364,21 +1372,21 @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a, constimm = tcg_const_i64(-a->shift); rm1 = tcg_temp_new_i64(); rm2 = tcg_temp_new_i64(); + rd = tcg_temp_new_i32(); /* Load both inputs first to avoid potential overwrite if rm == rd */ neon_load_reg64(rm1, a->vm); neon_load_reg64(rm2, a->vm + 1); shiftfn(rm1, rm1, constimm); - rd = tcg_temp_new_i32(); narrowfn(rd, cpu_env, rm1); - neon_store_reg(a->vd, 0, rd); + write_neon_element32(rd, a->vd, 0, MO_32); shiftfn(rm2, rm2, constimm); - rd = tcg_temp_new_i32(); narrowfn(rd, cpu_env, rm2); - neon_store_reg(a->vd, 1, rd); + write_neon_element32(rd, a->vd, 1, MO_32); + tcg_temp_free_i32(rd); tcg_temp_free_i64(rm1); tcg_temp_free_i64(rm2); tcg_temp_free_i64(constimm); @@ -1428,10 +1436,14 @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a, constimm = tcg_const_i32(imm); /* Load all inputs first to avoid potential overwrite */ - rm1 = neon_load_reg(a->vm, 0); - rm2 = neon_load_reg(a->vm, 1); - rm3 = neon_load_reg(a->vm + 1, 0); - rm4 = neon_load_reg(a->vm + 1, 1); + rm1 = tcg_temp_new_i32(); + rm2 = tcg_temp_new_i32(); + rm3 = tcg_temp_new_i32(); + rm4 = tcg_temp_new_i32(); + read_neon_element32(rm1, a->vm, 0, MO_32); + read_neon_element32(rm2, a->vm, 1, MO_32); + read_neon_element32(rm3, a->vm, 2, MO_32); + read_neon_element32(rm4, a->vm, 3, MO_32); rtmp = tcg_temp_new_i64(); shiftfn(rm1, rm1, constimm); @@ -1441,7 +1453,8 @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a, tcg_temp_free_i32(rm2); narrowfn(rm1, cpu_env, rtmp); - neon_store_reg(a->vd, 0, rm1); + write_neon_element32(rm1, a->vd, 0, MO_32); + tcg_temp_free_i32(rm1); shiftfn(rm3, rm3, constimm); shiftfn(rm4, rm4, constimm); @@ -1452,7 +1465,8 @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a, narrowfn(rm3, cpu_env, rtmp); tcg_temp_free_i64(rtmp); - neon_store_reg(a->vd, 1, rm3); + write_neon_element32(rm3, a->vd, 1, MO_32); + tcg_temp_free_i32(rm3); return true; } @@ -1553,8 +1567,10 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a, widen_mask = dup_const(a->size + 1, widen_mask); } - rm0 = neon_load_reg(a->vm, 0); - rm1 = neon_load_reg(a->vm, 1); + rm0 = tcg_temp_new_i32(); + rm1 = tcg_temp_new_i32(); + read_neon_element32(rm0, a->vm, 0, MO_32); + read_neon_element32(rm1, a->vm, 1, MO_32); tmp = tcg_temp_new_i64(); widenfn(tmp, rm0); @@ -1808,11 +1824,13 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, if (src1_wide) { neon_load_reg64(rn0_64, a->vn); } else { - TCGv_i32 tmp = neon_load_reg(a->vn, 0); + TCGv_i32 tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vn, 0, MO_32); widenfn(rn0_64, tmp); tcg_temp_free_i32(tmp); } - rm = neon_load_reg(a->vm, 0); + rm = tcg_temp_new_i32(); + read_neon_element32(rm, a->vm, 0, MO_32); widenfn(rm_64, rm); tcg_temp_free_i32(rm); @@ -1825,11 +1843,13 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, if (src1_wide) { neon_load_reg64(rn1_64, a->vn + 1); } else { - TCGv_i32 tmp = neon_load_reg(a->vn, 1); + TCGv_i32 tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vn, 1, MO_32); widenfn(rn1_64, tmp); tcg_temp_free_i32(tmp); } - rm = neon_load_reg(a->vm, 1); + rm = tcg_temp_new_i32(); + read_neon_element32(rm, a->vm, 1, MO_32); neon_store_reg64(rn0_64, a->vd); @@ -1922,9 +1942,11 @@ static bool do_narrow_3d(DisasContext *s, arg_3diff *a, narrowfn(rd1, rn_64); - neon_store_reg(a->vd, 0, rd0); - neon_store_reg(a->vd, 1, rd1); + write_neon_element32(rd0, a->vd, 0, MO_32); + write_neon_element32(rd1, a->vd, 1, MO_32); + tcg_temp_free_i32(rd0); + tcg_temp_free_i32(rd1); tcg_temp_free_i64(rn_64); tcg_temp_free_i64(rm_64); @@ -1999,14 +2021,14 @@ static bool do_long_3d(DisasContext *s, arg_3diff *a, rd0 = tcg_temp_new_i64(); rd1 = tcg_temp_new_i64(); - rn = neon_load_reg(a->vn, 0); - rm = neon_load_reg(a->vm, 0); + rn = tcg_temp_new_i32(); + rm = tcg_temp_new_i32(); + read_neon_element32(rn, a->vn, 0, MO_32); + read_neon_element32(rm, a->vm, 0, MO_32); opfn(rd0, rn, rm); - tcg_temp_free_i32(rn); - tcg_temp_free_i32(rm); - rn = neon_load_reg(a->vn, 1); - rm = neon_load_reg(a->vm, 1); + read_neon_element32(rn, a->vn, 1, MO_32); + read_neon_element32(rm, a->vm, 1, MO_32); opfn(rd1, rn, rm); tcg_temp_free_i32(rn); tcg_temp_free_i32(rm); @@ -2308,16 +2330,16 @@ static void gen_neon_dup_high16(TCGv_i32 var) static inline TCGv_i32 neon_get_scalar(int size, int reg) { - TCGv_i32 tmp; - if (size == 1) { - tmp = neon_load_reg(reg & 7, reg >> 4); + TCGv_i32 tmp = tcg_temp_new_i32(); + if (size == MO_16) { + read_neon_element32(tmp, reg & 7, reg >> 4, MO_32); if (reg & 8) { gen_neon_dup_high16(tmp); } else { gen_neon_dup_low16(tmp); } } else { - tmp = neon_load_reg(reg & 15, reg >> 4); + read_neon_element32(tmp, reg & 15, reg >> 4, MO_32); } return tmp; } @@ -2331,7 +2353,7 @@ static bool do_2scalar(DisasContext *s, arg_2scalar *a, * perform an accumulation operation of that result into the * destination. */ - TCGv_i32 scalar; + TCGv_i32 scalar, tmp; int pass; if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { @@ -2358,17 +2380,20 @@ static bool do_2scalar(DisasContext *s, arg_2scalar *a, } scalar = neon_get_scalar(a->size, a->vm); + tmp = tcg_temp_new_i32(); for (pass = 0; pass < (a->q ? 4 : 2); pass++) { - TCGv_i32 tmp = neon_load_reg(a->vn, pass); + read_neon_element32(tmp, a->vn, pass, MO_32); opfn(tmp, tmp, scalar); if (accfn) { - TCGv_i32 rd = neon_load_reg(a->vd, pass); + TCGv_i32 rd = tcg_temp_new_i32(); + read_neon_element32(rd, a->vd, pass, MO_32); accfn(tmp, rd, tmp); tcg_temp_free_i32(rd); } - neon_store_reg(a->vd, pass, tmp); + write_neon_element32(tmp, a->vd, pass, MO_32); } + tcg_temp_free_i32(tmp); tcg_temp_free_i32(scalar); return true; } @@ -2523,7 +2548,7 @@ static bool do_vqrdmlah_2sc(DisasContext *s, arg_2scalar *a, * performs a kind of fused op-then-accumulate using a helper * function that takes all of rd, rn and the scalar at once. */ - TCGv_i32 scalar; + TCGv_i32 scalar, rn, rd; int pass; if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { @@ -2554,14 +2579,17 @@ static bool do_vqrdmlah_2sc(DisasContext *s, arg_2scalar *a, } scalar = neon_get_scalar(a->size, a->vm); + rn = tcg_temp_new_i32(); + rd = tcg_temp_new_i32(); for (pass = 0; pass < (a->q ? 4 : 2); pass++) { - TCGv_i32 rn = neon_load_reg(a->vn, pass); - TCGv_i32 rd = neon_load_reg(a->vd, pass); + read_neon_element32(rn, a->vn, pass, MO_32); + read_neon_element32(rd, a->vd, pass, MO_32); opfn(rd, cpu_env, rn, scalar, rd); - tcg_temp_free_i32(rn); - neon_store_reg(a->vd, pass, rd); + write_neon_element32(rd, a->vd, pass, MO_32); } + tcg_temp_free_i32(rn); + tcg_temp_free_i32(rd); tcg_temp_free_i32(scalar); return true; @@ -2628,12 +2656,12 @@ static bool do_2scalar_long(DisasContext *s, arg_2scalar *a, scalar = neon_get_scalar(a->size, a->vm); /* Load all inputs before writing any outputs, in case of overlap */ - rn = neon_load_reg(a->vn, 0); + rn = tcg_temp_new_i32(); + read_neon_element32(rn, a->vn, 0, MO_32); rn0_64 = tcg_temp_new_i64(); opfn(rn0_64, rn, scalar); - tcg_temp_free_i32(rn); - rn = neon_load_reg(a->vn, 1); + read_neon_element32(rn, a->vn, 1, MO_32); rn1_64 = tcg_temp_new_i64(); opfn(rn1_64, rn, scalar); tcg_temp_free_i32(rn); @@ -2857,30 +2885,34 @@ static bool trans_VTBL(DisasContext *s, arg_VTBL *a) return false; } n <<= 3; + tmp = tcg_temp_new_i32(); if (a->op) { - tmp = neon_load_reg(a->vd, 0); + read_neon_element32(tmp, a->vd, 0, MO_32); } else { - tmp = tcg_temp_new_i32(); tcg_gen_movi_i32(tmp, 0); } - tmp2 = neon_load_reg(a->vm, 0); + tmp2 = tcg_temp_new_i32(); + read_neon_element32(tmp2, a->vm, 0, MO_32); ptr1 = vfp_reg_ptr(true, a->vn); tmp4 = tcg_const_i32(n); gen_helper_neon_tbl(tmp2, tmp2, tmp, ptr1, tmp4); - tcg_temp_free_i32(tmp); + if (a->op) { - tmp = neon_load_reg(a->vd, 1); + read_neon_element32(tmp, a->vd, 1, MO_32); } else { - tmp = tcg_temp_new_i32(); tcg_gen_movi_i32(tmp, 0); } - tmp3 = neon_load_reg(a->vm, 1); + tmp3 = tcg_temp_new_i32(); + read_neon_element32(tmp3, a->vm, 1, MO_32); gen_helper_neon_tbl(tmp3, tmp3, tmp, ptr1, tmp4); + tcg_temp_free_i32(tmp); tcg_temp_free_i32(tmp4); tcg_temp_free_ptr(ptr1); - neon_store_reg(a->vd, 0, tmp2); - neon_store_reg(a->vd, 1, tmp3); - tcg_temp_free_i32(tmp); + + write_neon_element32(tmp2, a->vd, 0, MO_32); + write_neon_element32(tmp3, a->vd, 1, MO_32); + tcg_temp_free_i32(tmp2); + tcg_temp_free_i32(tmp3); return true; } @@ -2913,6 +2945,7 @@ static bool trans_VDUP_scalar(DisasContext *s, arg_VDUP_scalar *a) static bool trans_VREV64(DisasContext *s, arg_VREV64 *a) { int pass, half; + TCGv_i32 tmp[2]; if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { return false; @@ -2936,11 +2969,12 @@ static bool trans_VREV64(DisasContext *s, arg_VREV64 *a) return true; } - for (pass = 0; pass < (a->q ? 2 : 1); pass++) { - TCGv_i32 tmp[2]; + tmp[0] = tcg_temp_new_i32(); + tmp[1] = tcg_temp_new_i32(); + for (pass = 0; pass < (a->q ? 2 : 1); pass++) { for (half = 0; half < 2; half++) { - tmp[half] = neon_load_reg(a->vm, pass * 2 + half); + read_neon_element32(tmp[half], a->vm, pass * 2 + half, MO_32); switch (a->size) { case 0: tcg_gen_bswap32_i32(tmp[half], tmp[half]); @@ -2954,9 +2988,12 @@ static bool trans_VREV64(DisasContext *s, arg_VREV64 *a) g_assert_not_reached(); } } - neon_store_reg(a->vd, pass * 2, tmp[1]); - neon_store_reg(a->vd, pass * 2 + 1, tmp[0]); + write_neon_element32(tmp[1], a->vd, pass * 2, MO_32); + write_neon_element32(tmp[0], a->vd, pass * 2 + 1, MO_32); } + + tcg_temp_free_i32(tmp[0]); + tcg_temp_free_i32(tmp[1]); return true; } @@ -3001,12 +3038,14 @@ static bool do_2misc_pairwise(DisasContext *s, arg_2misc *a, rm0_64 = tcg_temp_new_i64(); rm1_64 = tcg_temp_new_i64(); rd_64 = tcg_temp_new_i64(); - tmp = neon_load_reg(a->vm, pass * 2); + + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vm, pass * 2, MO_32); widenfn(rm0_64, tmp); - tcg_temp_free_i32(tmp); - tmp = neon_load_reg(a->vm, pass * 2 + 1); + read_neon_element32(tmp, a->vm, pass * 2 + 1, MO_32); widenfn(rm1_64, tmp); tcg_temp_free_i32(tmp); + opfn(rd_64, rm0_64, rm1_64); tcg_temp_free_i64(rm0_64); tcg_temp_free_i64(rm1_64); @@ -3219,8 +3258,10 @@ static bool do_vmovn(DisasContext *s, arg_2misc *a, narrowfn(rd0, cpu_env, rm); neon_load_reg64(rm, a->vm + 1); narrowfn(rd1, cpu_env, rm); - neon_store_reg(a->vd, 0, rd0); - neon_store_reg(a->vd, 1, rd1); + write_neon_element32(rd0, a->vd, 0, MO_32); + write_neon_element32(rd1, a->vd, 1, MO_32); + tcg_temp_free_i32(rd0); + tcg_temp_free_i32(rd1); tcg_temp_free_i64(rm); return true; } @@ -3277,9 +3318,11 @@ static bool trans_VSHLL(DisasContext *s, arg_2misc *a) } rd = tcg_temp_new_i64(); + rm0 = tcg_temp_new_i32(); + rm1 = tcg_temp_new_i32(); - rm0 = neon_load_reg(a->vm, 0); - rm1 = neon_load_reg(a->vm, 1); + read_neon_element32(rm0, a->vm, 0, MO_32); + read_neon_element32(rm1, a->vm, 1, MO_32); widenfn(rd, rm0); tcg_gen_shli_i64(rd, rd, 8 << a->size); @@ -3320,21 +3363,25 @@ static bool trans_VCVT_F16_F32(DisasContext *s, arg_2misc *a) fpst = fpstatus_ptr(FPST_STD); ahp = get_ahp_flag(); - tmp = neon_load_reg(a->vm, 0); + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vm, 0, MO_32); gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp); - tmp2 = neon_load_reg(a->vm, 1); + tmp2 = tcg_temp_new_i32(); + read_neon_element32(tmp2, a->vm, 1, MO_32); gen_helper_vfp_fcvt_f32_to_f16(tmp2, tmp2, fpst, ahp); tcg_gen_shli_i32(tmp2, tmp2, 16); tcg_gen_or_i32(tmp2, tmp2, tmp); - tcg_temp_free_i32(tmp); - tmp = neon_load_reg(a->vm, 2); + read_neon_element32(tmp, a->vm, 2, MO_32); gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp); - tmp3 = neon_load_reg(a->vm, 3); - neon_store_reg(a->vd, 0, tmp2); + tmp3 = tcg_temp_new_i32(); + read_neon_element32(tmp3, a->vm, 3, MO_32); + write_neon_element32(tmp2, a->vd, 0, MO_32); + tcg_temp_free_i32(tmp2); gen_helper_vfp_fcvt_f32_to_f16(tmp3, tmp3, fpst, ahp); tcg_gen_shli_i32(tmp3, tmp3, 16); tcg_gen_or_i32(tmp3, tmp3, tmp); - neon_store_reg(a->vd, 1, tmp3); + write_neon_element32(tmp3, a->vd, 1, MO_32); + tcg_temp_free_i32(tmp3); tcg_temp_free_i32(tmp); tcg_temp_free_i32(ahp); tcg_temp_free_ptr(fpst); @@ -3369,21 +3416,25 @@ static bool trans_VCVT_F32_F16(DisasContext *s, arg_2misc *a) fpst = fpstatus_ptr(FPST_STD); ahp = get_ahp_flag(); tmp3 = tcg_temp_new_i32(); - tmp = neon_load_reg(a->vm, 0); - tmp2 = neon_load_reg(a->vm, 1); + tmp2 = tcg_temp_new_i32(); + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vm, 0, MO_32); + read_neon_element32(tmp2, a->vm, 1, MO_32); tcg_gen_ext16u_i32(tmp3, tmp); gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp); - neon_store_reg(a->vd, 0, tmp3); + write_neon_element32(tmp3, a->vd, 0, MO_32); tcg_gen_shri_i32(tmp, tmp, 16); gen_helper_vfp_fcvt_f16_to_f32(tmp, tmp, fpst, ahp); - neon_store_reg(a->vd, 1, tmp); - tmp3 = tcg_temp_new_i32(); + write_neon_element32(tmp, a->vd, 1, MO_32); + tcg_temp_free_i32(tmp); tcg_gen_ext16u_i32(tmp3, tmp2); gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp); - neon_store_reg(a->vd, 2, tmp3); + write_neon_element32(tmp3, a->vd, 2, MO_32); + tcg_temp_free_i32(tmp3); tcg_gen_shri_i32(tmp2, tmp2, 16); gen_helper_vfp_fcvt_f16_to_f32(tmp2, tmp2, fpst, ahp); - neon_store_reg(a->vd, 3, tmp2); + write_neon_element32(tmp2, a->vd, 3, MO_32); + tcg_temp_free_i32(tmp2); tcg_temp_free_i32(ahp); tcg_temp_free_ptr(fpst); @@ -3489,6 +3540,7 @@ DO_2M_CRYPTO(SHA256SU0, aa32_sha2, 2) static bool do_2misc(DisasContext *s, arg_2misc *a, NeonGenOneOpFn *fn) { + TCGv_i32 tmp; int pass; /* Handle a 2-reg-misc operation by iterating 32 bits at a time */ @@ -3514,11 +3566,13 @@ static bool do_2misc(DisasContext *s, arg_2misc *a, NeonGenOneOpFn *fn) return true; } + tmp = tcg_temp_new_i32(); for (pass = 0; pass < (a->q ? 4 : 2); pass++) { - TCGv_i32 tmp = neon_load_reg(a->vm, pass); + read_neon_element32(tmp, a->vm, pass, MO_32); fn(tmp, tmp); - neon_store_reg(a->vd, pass, tmp); + write_neon_element32(tmp, a->vd, pass, MO_32); } + tcg_temp_free_i32(tmp); return true; } @@ -3871,25 +3925,29 @@ static bool trans_VTRN(DisasContext *s, arg_2misc *a) return true; } - if (a->size == 2) { + tmp = tcg_temp_new_i32(); + tmp2 = tcg_temp_new_i32(); + if (a->size == MO_32) { for (pass = 0; pass < (a->q ? 4 : 2); pass += 2) { - tmp = neon_load_reg(a->vm, pass); - tmp2 = neon_load_reg(a->vd, pass + 1); - neon_store_reg(a->vm, pass, tmp2); - neon_store_reg(a->vd, pass + 1, tmp); + read_neon_element32(tmp, a->vm, pass, MO_32); + read_neon_element32(tmp2, a->vd, pass + 1, MO_32); + write_neon_element32(tmp2, a->vm, pass, MO_32); + write_neon_element32(tmp, a->vd, pass + 1, MO_32); } } else { for (pass = 0; pass < (a->q ? 4 : 2); pass++) { - tmp = neon_load_reg(a->vm, pass); - tmp2 = neon_load_reg(a->vd, pass); - if (a->size == 0) { + read_neon_element32(tmp, a->vm, pass, MO_32); + read_neon_element32(tmp2, a->vd, pass, MO_32); + if (a->size == MO_8) { gen_neon_trn_u8(tmp, tmp2); } else { gen_neon_trn_u16(tmp, tmp2); } - neon_store_reg(a->vm, pass, tmp2); - neon_store_reg(a->vd, pass, tmp); + write_neon_element32(tmp2, a->vm, pass, MO_32); + write_neon_element32(tmp, a->vd, pass, MO_32); } } + tcg_temp_free_i32(tmp); + tcg_temp_free_i32(tmp2); return true; } From patchwork Fri Oct 30 02:26:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 316537 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB076C4363A for ; Fri, 30 Oct 2020 02:29:59 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 60DA52075E for ; Fri, 30 Oct 2020 02:29:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="KHigW7mj" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 60DA52075E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60250 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKAw-0000zk-3v for qemu-devel@archiver.kernel.org; Thu, 29 Oct 2020 22:29:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39844) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7c-00058a-2t for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:32 -0400 Received: from mail-pl1-x641.google.com ([2607:f8b0:4864:20::641]:41565) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7a-0005vG-2J for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:31 -0400 Received: by mail-pl1-x641.google.com with SMTP id w11so2239568pll.8 for ; Thu, 29 Oct 2020 19:26:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NlmDoKUnT++Y4HE65X/AaQrAu9Ha43jQPJiWJ9I7vlY=; b=KHigW7mjKR+Fy7c+w9VywdlAxsBf2oVHbvmbXG1N0XPLZ4GCT19rWVRzP1DXzl0xGK 1/gCor8LznWoq6NLSEJRbf66v2MDO5Bp0uacAFIYet+f5ahYTPbvAMXUu8XKpzR6D2Nf kJm0EeAwCYFjnsM6nb8A60PM89TZiHDfnMMo9nMhUZmdp0e9aiMbFm/9NfNG7vrcG1DY XIjNCKWaf/1ATiHwOQlSxYfvNJQeavOIuU60Hf98hd8HZ5Ivw4DJw+sG7aPqcBMjC5ZS I9BHZrBHTpvTWwlg9lEE6O2ZHvAelQ9uX+BlBlYnjf98QGhr3E/eJDrE1uZbZ0MUvN29 Y/wg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NlmDoKUnT++Y4HE65X/AaQrAu9Ha43jQPJiWJ9I7vlY=; b=EgQXIFuuqAK1FzJPglG/Bxwf8cOAyNm4vNTkbUMbdThNzJXDseHzoGM8zelHcBHv4m J7hH73frLVztwZaU0m7e0HKPE2qyW+nxvMBg5ZoaPOhIWK81iYcUHzC1XevTQ8v6noFZ vq6KuUCIbDIRIntvBHQrakuiCZKwO8M3o6fKTKTa+rpRvorFu3nWWY9g8CEFrfI5i5xx FXcaDqUSJTvbkqZaWcQ6mPtmYy7OCuu4NsMboYAofHzHGjkHA2bbOAMo8wa/xd1WgRN8 3WBPHIrv9jirSq466hpeNxCdOUpmVuu+1BBvJgZiCHAUch3374Wv49bWwvXGKwax9olW /naQ== X-Gm-Message-State: AOAM532OF8AnPbEUQPOfHv2+/+Mk7DovbektlocnSxWJUaJFwWz59aXC Gq6GnPyjh0CiVBz5+cZz400Zrt9LXr/XBg== X-Google-Smtp-Source: ABdhPJyEIX+O20CAsufJ6Cm81UPx/84DLuFVPFBqnNjvHr6YmoXyrdLyUhCw222+KeEdFInATVD+aA== X-Received: by 2002:a17:902:7485:b029:d6:9c14:f376 with SMTP id h5-20020a1709027485b02900d69c14f376mr1696859pll.62.1604024788307; Thu, 29 Oct 2020 19:26:28 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:27 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 06/11] target/arm: Expand read/write_neon_element32 to all MemOp Date: Thu, 29 Oct 2020 19:26:13 -0700 Message-Id: <20201030022618.785675-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::641; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x641.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" We can then use this to improve VMOV (scalar to gp) and VMOV (gp to scalar) so that we simply perform the memory operation that we wanted, rather than inserting or extracting from a 32-bit quantity. These were the last uses of neon_load/store_reg, so remove them. Signed-off-by: Richard Henderson --- target/arm/translate.c | 50 +++++++++++++----------- target/arm/translate-vfp.c.inc | 71 +++++----------------------------- 2 files changed, 37 insertions(+), 84 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 0ed9eab0b0..55d5f4ed73 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1106,9 +1106,9 @@ static long neon_full_reg_offset(unsigned reg) * Return the offset of a 2**SIZE piece of a NEON register, at index ELE, * where 0 is the least significant end of the register. */ -static long neon_element_offset(int reg, int element, MemOp size) +static long neon_element_offset(int reg, int element, MemOp memop) { - int element_size = 1 << size; + int element_size = 1 << (memop & MO_SIZE); int ofs = element * element_size; #ifdef HOST_WORDS_BIGENDIAN /* @@ -1132,19 +1132,6 @@ static long vfp_reg_offset(bool dp, unsigned reg) } } -static TCGv_i32 neon_load_reg(int reg, int pass) -{ - TCGv_i32 tmp = tcg_temp_new_i32(); - tcg_gen_ld_i32(tmp, cpu_env, neon_element_offset(reg, pass, MO_32)); - return tmp; -} - -static void neon_store_reg(int reg, int pass, TCGv_i32 var) -{ - tcg_gen_st_i32(var, cpu_env, neon_element_offset(reg, pass, MO_32)); - tcg_temp_free_i32(var); -} - static inline void neon_load_reg64(TCGv_i64 var, int reg) { tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg)); @@ -1165,12 +1152,25 @@ static inline void neon_store_reg32(TCGv_i32 var, int reg) tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg)); } -static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp size) +static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp memop) { - long off = neon_element_offset(reg, ele, size); + long off = neon_element_offset(reg, ele, memop); - switch (size) { - case MO_32: + switch (memop) { + case MO_SB: + tcg_gen_ld8s_i32(dest, cpu_env, off); + break; + case MO_UB: + tcg_gen_ld8u_i32(dest, cpu_env, off); + break; + case MO_SW: + tcg_gen_ld16s_i32(dest, cpu_env, off); + break; + case MO_UW: + tcg_gen_ld16u_i32(dest, cpu_env, off); + break; + case MO_UL: + case MO_SL: tcg_gen_ld_i32(dest, cpu_env, off); break; default: @@ -1178,11 +1178,17 @@ static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp size) } } -static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp size) +static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop) { - long off = neon_element_offset(reg, ele, size); + long off = neon_element_offset(reg, ele, memop); - switch (size) { + switch (memop) { + case MO_8: + tcg_gen_st8_i32(src, cpu_env, off); + break; + case MO_16: + tcg_gen_st16_i32(src, cpu_env, off); + break; case MO_32: tcg_gen_st_i32(src, cpu_env, off); break; diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc index 368bae0a73..28f22f9872 100644 --- a/target/arm/translate-vfp.c.inc +++ b/target/arm/translate-vfp.c.inc @@ -511,11 +511,9 @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a) { /* VMOV scalar to general purpose register */ TCGv_i32 tmp; - int pass; - uint32_t offset; - /* SIZE == 2 is a VFP instruction; otherwise NEON. */ - if (a->size == 2 + /* SIZE == MO_32 is a VFP instruction; otherwise NEON. */ + if (a->size == MO_32 ? !dc_isar_feature(aa32_fpsp_v2, s) : !arm_dc_feature(s, ARM_FEATURE_NEON)) { return false; @@ -526,44 +524,12 @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a) return false; } - offset = a->index << a->size; - pass = extract32(offset, 2, 1); - offset = extract32(offset, 0, 2) * 8; - if (!vfp_access_check(s)) { return true; } - tmp = neon_load_reg(a->vn, pass); - switch (a->size) { - case 0: - if (offset) { - tcg_gen_shri_i32(tmp, tmp, offset); - } - if (a->u) { - gen_uxtb(tmp); - } else { - gen_sxtb(tmp); - } - break; - case 1: - if (a->u) { - if (offset) { - tcg_gen_shri_i32(tmp, tmp, 16); - } else { - gen_uxth(tmp); - } - } else { - if (offset) { - tcg_gen_sari_i32(tmp, tmp, 16); - } else { - gen_sxth(tmp); - } - } - break; - case 2: - break; - } + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vn, a->index, a->size | (a->u ? 0 : MO_SIGN)); store_reg(s, a->rt, tmp); return true; @@ -572,12 +538,10 @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a) static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a) { /* VMOV general purpose register to scalar */ - TCGv_i32 tmp, tmp2; - int pass; - uint32_t offset; + TCGv_i32 tmp; - /* SIZE == 2 is a VFP instruction; otherwise NEON. */ - if (a->size == 2 + /* SIZE == MO_32 is a VFP instruction; otherwise NEON. */ + if (a->size == MO_32 ? !dc_isar_feature(aa32_fpsp_v2, s) : !arm_dc_feature(s, ARM_FEATURE_NEON)) { return false; @@ -588,30 +552,13 @@ static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a) return false; } - offset = a->index << a->size; - pass = extract32(offset, 2, 1); - offset = extract32(offset, 0, 2) * 8; - if (!vfp_access_check(s)) { return true; } tmp = load_reg(s, a->rt); - switch (a->size) { - case 0: - tmp2 = neon_load_reg(a->vn, pass); - tcg_gen_deposit_i32(tmp, tmp2, tmp, offset, 8); - tcg_temp_free_i32(tmp2); - break; - case 1: - tmp2 = neon_load_reg(a->vn, pass); - tcg_gen_deposit_i32(tmp, tmp2, tmp, offset, 16); - tcg_temp_free_i32(tmp2); - break; - case 2: - break; - } - neon_store_reg(a->vn, pass, tmp); + write_neon_element32(tmp, a->vn, a->index, a->size); + tcg_temp_free_i32(tmp); return true; } From patchwork Fri Oct 30 02:26:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 316536 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, UNWANTED_LANGUAGE_BODY,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9ABEBC4363A for ; Fri, 30 Oct 2020 02:31:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D615220756 for ; Fri, 30 Oct 2020 02:31:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="kuaN1nxT" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D615220756 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:37676 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKCc-0003ON-R4 for qemu-devel@archiver.kernel.org; Thu, 29 Oct 2020 22:31:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39866) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7e-0005D1-7v for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:38 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:38429) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7b-0005vM-Hf for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:33 -0400 Received: by mail-pf1-x443.google.com with SMTP id 10so3983648pfp.5 for ; Thu, 29 Oct 2020 19:26:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=bhlXbEbWYnNuol5HsH6fu3cqY3M1Y2WKuzmjH4v4ad4=; b=kuaN1nxTxx4+GDRhdqgo7BxE6807MamHOEA3UKsxs4J5ebLgEnYEW3ZocLPCd1ARgQ ZcJTUobJVUvRnsv1Mi3Jp5R/oyPsyoDzkp9+QFOQm8/K4ON3t+53Hjw0jTYb1gjU5Tu8 k82cIwpZUT2m+DB+h0+9MqAidskA0dIiJBdFIVUa4Qw/lnNhQ/R6gVNXkgUnvCmdASk7 /HAyq1Dx53qgQk8UM9qpE+4KYLTn4ni/FAKHAUZHxqVvbo28AMOzemqprX6w2sQlshSt icwflbq1Ug4XTG3QtvgvPkbJ16oNDDZFAojMRAoBr2zlnkB/JNLRbZbC/OC6slSRMPke fPSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bhlXbEbWYnNuol5HsH6fu3cqY3M1Y2WKuzmjH4v4ad4=; b=Z+sqjGvN92lGQsj43/fcG/VzGcDlfXQL9iJHDArELC9+QNWemnNui7n0xXsYmBkAfu NZtkKiTtuIb5m5LhRooag6kdx5fpjrdgukPBpFVjmbA46US+Q+AAzWEV3EVOdICVVlhg 8m9MORVrtqJvU8NutAlscVznxC/L01Mq5aTa5mIWdmbzj/rsPnuRTNEEFV/WsbTXJvss WPEfv57EHrERtg37FloQPkOwBE6icPd71y8ePdXlr0JP9G5EkEMr3Kl9KlEmLlBNrBPZ otAP65GveL7hzEsLnfJoAURuRAvQaN0/30uHktrmqae/HZpOzdVQA11VJ0KUgdl7L3Te 7q/g== X-Gm-Message-State: AOAM532tWxBgoqcygxevSqWjQRfO9oG/J6d3VvtVWggXeszM0PyDiu4V U071qmi4D/8QGbJ7AUiPj24I96toN2yTKQ== X-Google-Smtp-Source: ABdhPJzk+IED5cTEpCKiyVtEnda6Z+tkqYUPpj12iHUkag8MO8KrlEtvPD6VWI80LUiimXImV5ziyw== X-Received: by 2002:a17:90a:fe82:: with SMTP id co2mr133926pjb.22.1604024789658; Thu, 29 Oct 2020 19:26:29 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:28 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 07/11] target/arm: Rename neon_load_reg32 to vfp_load_reg32 Date: Thu, 29 Oct 2020 19:26:14 -0700 Message-Id: <20201030022618.785675-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::443; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x443.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The only uses of this function are for loading VFP single-precision values, and nothing to do with NEON. Signed-off-by: Richard Henderson --- target/arm/translate.c | 4 +- target/arm/translate-vfp.c.inc | 184 ++++++++++++++++----------------- 2 files changed, 94 insertions(+), 94 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 55d5f4ed73..8491ab705b 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1142,12 +1142,12 @@ static inline void neon_store_reg64(TCGv_i64 var, int reg) tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(1, reg)); } -static inline void neon_load_reg32(TCGv_i32 var, int reg) +static inline void vfp_load_reg32(TCGv_i32 var, int reg) { tcg_gen_ld_i32(var, cpu_env, vfp_reg_offset(false, reg)); } -static inline void neon_store_reg32(TCGv_i32 var, int reg) +static inline void vfp_store_reg32(TCGv_i32 var, int reg) { tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg)); } diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc index 28f22f9872..d2a9b658bb 100644 --- a/target/arm/translate-vfp.c.inc +++ b/target/arm/translate-vfp.c.inc @@ -283,8 +283,8 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a) frn = tcg_temp_new_i32(); frm = tcg_temp_new_i32(); dest = tcg_temp_new_i32(); - neon_load_reg32(frn, rn); - neon_load_reg32(frm, rm); + vfp_load_reg32(frn, rn); + vfp_load_reg32(frm, rm); switch (a->cc) { case 0: /* eq: Z */ tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero, @@ -315,7 +315,7 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a) if (sz == 1) { tcg_gen_andi_i32(dest, dest, 0xffff); } - neon_store_reg32(dest, rd); + vfp_store_reg32(dest, rd); tcg_temp_free_i32(frn); tcg_temp_free_i32(frm); tcg_temp_free_i32(dest); @@ -395,13 +395,13 @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a) TCGv_i32 tcg_res; tcg_op = tcg_temp_new_i32(); tcg_res = tcg_temp_new_i32(); - neon_load_reg32(tcg_op, rm); + vfp_load_reg32(tcg_op, rm); if (sz == 1) { gen_helper_rinth(tcg_res, tcg_op, fpst); } else { gen_helper_rints(tcg_res, tcg_op, fpst); } - neon_store_reg32(tcg_res, rd); + vfp_store_reg32(tcg_res, rd); tcg_temp_free_i32(tcg_op); tcg_temp_free_i32(tcg_res); } @@ -470,7 +470,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) gen_helper_vfp_tould(tcg_res, tcg_double, tcg_shift, fpst); } tcg_gen_extrl_i64_i32(tcg_tmp, tcg_res); - neon_store_reg32(tcg_tmp, rd); + vfp_store_reg32(tcg_tmp, rd); tcg_temp_free_i32(tcg_tmp); tcg_temp_free_i64(tcg_res); tcg_temp_free_i64(tcg_double); @@ -478,7 +478,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) TCGv_i32 tcg_single, tcg_res; tcg_single = tcg_temp_new_i32(); tcg_res = tcg_temp_new_i32(); - neon_load_reg32(tcg_single, rm); + vfp_load_reg32(tcg_single, rm); if (sz == 1) { if (is_signed) { gen_helper_vfp_toslh(tcg_res, tcg_single, tcg_shift, fpst); @@ -492,7 +492,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst); } } - neon_store_reg32(tcg_res, rd); + vfp_store_reg32(tcg_res, rd); tcg_temp_free_i32(tcg_res); tcg_temp_free_i32(tcg_single); } @@ -776,14 +776,14 @@ static bool trans_VMOV_half(DisasContext *s, arg_VMOV_single *a) if (a->l) { /* VFP to general purpose register */ tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vn); + vfp_load_reg32(tmp, a->vn); tcg_gen_andi_i32(tmp, tmp, 0xffff); store_reg(s, a->rt, tmp); } else { /* general purpose register to VFP */ tmp = load_reg(s, a->rt); tcg_gen_andi_i32(tmp, tmp, 0xffff); - neon_store_reg32(tmp, a->vn); + vfp_store_reg32(tmp, a->vn); tcg_temp_free_i32(tmp); } @@ -805,7 +805,7 @@ static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a) if (a->l) { /* VFP to general purpose register */ tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vn); + vfp_load_reg32(tmp, a->vn); if (a->rt == 15) { /* Set the 4 flag bits in the CPSR. */ gen_set_nzcv(tmp); @@ -816,7 +816,7 @@ static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a) } else { /* general purpose register to VFP */ tmp = load_reg(s, a->rt); - neon_store_reg32(tmp, a->vn); + vfp_store_reg32(tmp, a->vn); tcg_temp_free_i32(tmp); } @@ -842,18 +842,18 @@ static bool trans_VMOV_64_sp(DisasContext *s, arg_VMOV_64_sp *a) if (a->op) { /* fpreg to gpreg */ tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); store_reg(s, a->rt, tmp); tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm + 1); + vfp_load_reg32(tmp, a->vm + 1); store_reg(s, a->rt2, tmp); } else { /* gpreg to fpreg */ tmp = load_reg(s, a->rt); - neon_store_reg32(tmp, a->vm); + vfp_store_reg32(tmp, a->vm); tcg_temp_free_i32(tmp); tmp = load_reg(s, a->rt2); - neon_store_reg32(tmp, a->vm + 1); + vfp_store_reg32(tmp, a->vm + 1); tcg_temp_free_i32(tmp); } @@ -885,18 +885,18 @@ static bool trans_VMOV_64_dp(DisasContext *s, arg_VMOV_64_dp *a) if (a->op) { /* fpreg to gpreg */ tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm * 2); + vfp_load_reg32(tmp, a->vm * 2); store_reg(s, a->rt, tmp); tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm * 2 + 1); + vfp_load_reg32(tmp, a->vm * 2 + 1); store_reg(s, a->rt2, tmp); } else { /* gpreg to fpreg */ tmp = load_reg(s, a->rt); - neon_store_reg32(tmp, a->vm * 2); + vfp_store_reg32(tmp, a->vm * 2); tcg_temp_free_i32(tmp); tmp = load_reg(s, a->rt2); - neon_store_reg32(tmp, a->vm * 2 + 1); + vfp_store_reg32(tmp, a->vm * 2 + 1); tcg_temp_free_i32(tmp); } @@ -927,9 +927,9 @@ static bool trans_VLDR_VSTR_hp(DisasContext *s, arg_VLDR_VSTR_sp *a) tmp = tcg_temp_new_i32(); if (a->l) { gen_aa32_ld16u(s, tmp, addr, get_mem_index(s)); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); } else { - neon_load_reg32(tmp, a->vd); + vfp_load_reg32(tmp, a->vd); gen_aa32_st16(s, tmp, addr, get_mem_index(s)); } tcg_temp_free_i32(tmp); @@ -961,9 +961,9 @@ static bool trans_VLDR_VSTR_sp(DisasContext *s, arg_VLDR_VSTR_sp *a) tmp = tcg_temp_new_i32(); if (a->l) { gen_aa32_ld32u(s, tmp, addr, get_mem_index(s)); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); } else { - neon_load_reg32(tmp, a->vd); + vfp_load_reg32(tmp, a->vd); gen_aa32_st32(s, tmp, addr, get_mem_index(s)); } tcg_temp_free_i32(tmp); @@ -1066,10 +1066,10 @@ static bool trans_VLDM_VSTM_sp(DisasContext *s, arg_VLDM_VSTM_sp *a) if (a->l) { /* load */ gen_aa32_ld32u(s, tmp, addr, get_mem_index(s)); - neon_store_reg32(tmp, a->vd + i); + vfp_store_reg32(tmp, a->vd + i); } else { /* store */ - neon_load_reg32(tmp, a->vd + i); + vfp_load_reg32(tmp, a->vd + i); gen_aa32_st32(s, tmp, addr, get_mem_index(s)); } tcg_gen_addi_i32(addr, addr, offset); @@ -1285,15 +1285,15 @@ static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn, fd = tcg_temp_new_i32(); fpst = fpstatus_ptr(FPST_FPCR); - neon_load_reg32(f0, vn); - neon_load_reg32(f1, vm); + vfp_load_reg32(f0, vn); + vfp_load_reg32(f1, vm); for (;;) { if (reads_vd) { - neon_load_reg32(fd, vd); + vfp_load_reg32(fd, vd); } fn(fd, f0, f1, fpst); - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); if (veclen == 0) { break; @@ -1303,10 +1303,10 @@ static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn, veclen--; vd = vfp_advance_sreg(vd, delta_d); vn = vfp_advance_sreg(vn, delta_d); - neon_load_reg32(f0, vn); + vfp_load_reg32(f0, vn); if (delta_m) { vm = vfp_advance_sreg(vm, delta_m); - neon_load_reg32(f1, vm); + vfp_load_reg32(f1, vm); } } @@ -1349,14 +1349,14 @@ static bool do_vfp_3op_hp(DisasContext *s, VFPGen3OpSPFn *fn, fd = tcg_temp_new_i32(); fpst = fpstatus_ptr(FPST_FPCR_F16); - neon_load_reg32(f0, vn); - neon_load_reg32(f1, vm); + vfp_load_reg32(f0, vn); + vfp_load_reg32(f1, vm); if (reads_vd) { - neon_load_reg32(fd, vd); + vfp_load_reg32(fd, vd); } fn(fd, f0, f1, fpst); - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); tcg_temp_free_i32(f0); tcg_temp_free_i32(f1); @@ -1489,11 +1489,11 @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm) f0 = tcg_temp_new_i32(); fd = tcg_temp_new_i32(); - neon_load_reg32(f0, vm); + vfp_load_reg32(f0, vm); for (;;) { fn(fd, f0); - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); if (veclen == 0) { break; @@ -1503,7 +1503,7 @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm) /* single source one-many */ while (veclen--) { vd = vfp_advance_sreg(vd, delta_d); - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); } break; } @@ -1512,7 +1512,7 @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm) veclen--; vd = vfp_advance_sreg(vd, delta_d); vm = vfp_advance_sreg(vm, delta_m); - neon_load_reg32(f0, vm); + vfp_load_reg32(f0, vm); } tcg_temp_free_i32(f0); @@ -1545,9 +1545,9 @@ static bool do_vfp_2op_hp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm) } f0 = tcg_temp_new_i32(); - neon_load_reg32(f0, vm); + vfp_load_reg32(f0, vm); fn(f0, f0); - neon_store_reg32(f0, vd); + vfp_store_reg32(f0, vd); tcg_temp_free_i32(f0); return true; @@ -2037,20 +2037,20 @@ static bool do_vfm_hp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d) vm = tcg_temp_new_i32(); vd = tcg_temp_new_i32(); - neon_load_reg32(vn, a->vn); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vn, a->vn); + vfp_load_reg32(vm, a->vm); if (neg_n) { /* VFNMS, VFMS */ gen_helper_vfp_negh(vn, vn); } - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ gen_helper_vfp_negh(vd, vd); } fpst = fpstatus_ptr(FPST_FPCR_F16); gen_helper_vfp_muladdh(vd, vn, vm, vd, fpst); - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(vn); @@ -2102,20 +2102,20 @@ static bool do_vfm_sp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d) vm = tcg_temp_new_i32(); vd = tcg_temp_new_i32(); - neon_load_reg32(vn, a->vn); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vn, a->vn); + vfp_load_reg32(vm, a->vm); if (neg_n) { /* VFNMS, VFMS */ gen_helper_vfp_negs(vn, vn); } - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ gen_helper_vfp_negs(vd, vd); } fpst = fpstatus_ptr(FPST_FPCR); gen_helper_vfp_muladds(vd, vn, vm, vd, fpst); - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(vn); @@ -2230,7 +2230,7 @@ static bool trans_VMOV_imm_hp(DisasContext *s, arg_VMOV_imm_sp *a) } fd = tcg_const_i32(vfp_expand_imm(MO_16, a->imm)); - neon_store_reg32(fd, a->vd); + vfp_store_reg32(fd, a->vd); tcg_temp_free_i32(fd); return true; } @@ -2270,7 +2270,7 @@ static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a) fd = tcg_const_i32(vfp_expand_imm(MO_32, a->imm)); for (;;) { - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); if (veclen == 0) { break; @@ -2397,11 +2397,11 @@ static bool trans_VCMP_hp(DisasContext *s, arg_VCMP_sp *a) vd = tcg_temp_new_i32(); vm = tcg_temp_new_i32(); - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); if (a->z) { tcg_gen_movi_i32(vm, 0); } else { - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); } if (a->e) { @@ -2436,11 +2436,11 @@ static bool trans_VCMP_sp(DisasContext *s, arg_VCMP_sp *a) vd = tcg_temp_new_i32(); vm = tcg_temp_new_i32(); - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); if (a->z) { tcg_gen_movi_i32(vm, 0); } else { - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); } if (a->e) { @@ -2519,7 +2519,7 @@ static bool trans_VCVT_f32_f16(DisasContext *s, arg_VCVT_f32_f16 *a) /* The T bit tells us if we want the low or high 16 bits of Vm */ tcg_gen_ld16u_i32(tmp, cpu_env, vfp_f16_offset(a->vm, a->t)); gen_helper_vfp_fcvt_f16_to_f32(tmp, tmp, fpst, ahp_mode); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_i32(ahp_mode); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); @@ -2583,7 +2583,7 @@ static bool trans_VCVT_f16_f32(DisasContext *s, arg_VCVT_f16_f32 *a) ahp_mode = get_ahp_flag(); tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp_mode); tcg_gen_st16_i32(tmp, cpu_env, vfp_f16_offset(a->vd, a->t)); tcg_temp_free_i32(ahp_mode); @@ -2645,10 +2645,10 @@ static bool trans_VRINTR_hp(DisasContext *s, arg_VRINTR_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); gen_helper_rinth(tmp, tmp, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); return true; @@ -2668,10 +2668,10 @@ static bool trans_VRINTR_sp(DisasContext *s, arg_VRINTR_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); gen_helper_rints(tmp, tmp, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); return true; @@ -2724,13 +2724,13 @@ static bool trans_VRINTZ_hp(DisasContext *s, arg_VRINTZ_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); tcg_rmode = tcg_const_i32(float_round_to_zero); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); gen_helper_rinth(tmp, tmp, fpst); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tcg_rmode); tcg_temp_free_i32(tmp); @@ -2752,13 +2752,13 @@ static bool trans_VRINTZ_sp(DisasContext *s, arg_VRINTZ_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); tcg_rmode = tcg_const_i32(float_round_to_zero); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); gen_helper_rints(tmp, tmp, fpst); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tcg_rmode); tcg_temp_free_i32(tmp); @@ -2816,10 +2816,10 @@ static bool trans_VRINTX_hp(DisasContext *s, arg_VRINTX_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); gen_helper_rinth_exact(tmp, tmp, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); return true; @@ -2839,10 +2839,10 @@ static bool trans_VRINTX_sp(DisasContext *s, arg_VRINTX_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); gen_helper_rints_exact(tmp, tmp, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); return true; @@ -2900,7 +2900,7 @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a) vm = tcg_temp_new_i32(); vd = tcg_temp_new_i64(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); gen_helper_vfp_fcvtds(vd, vm, cpu_env); neon_store_reg64(vd, a->vd); tcg_temp_free_i32(vm); @@ -2930,7 +2930,7 @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a) vm = tcg_temp_new_i64(); neon_load_reg64(vm, a->vm); gen_helper_vfp_fcvtsd(vd, vm, cpu_env); - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); tcg_temp_free_i64(vm); return true; @@ -2950,7 +2950,7 @@ static bool trans_VCVT_int_hp(DisasContext *s, arg_VCVT_int_sp *a) } vm = tcg_temp_new_i32(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); if (a->s) { /* i32 -> f16 */ @@ -2959,7 +2959,7 @@ static bool trans_VCVT_int_hp(DisasContext *s, arg_VCVT_int_sp *a) /* u32 -> f16 */ gen_helper_vfp_uitoh(vm, vm, fpst); } - neon_store_reg32(vm, a->vd); + vfp_store_reg32(vm, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_ptr(fpst); return true; @@ -2979,7 +2979,7 @@ static bool trans_VCVT_int_sp(DisasContext *s, arg_VCVT_int_sp *a) } vm = tcg_temp_new_i32(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); fpst = fpstatus_ptr(FPST_FPCR); if (a->s) { /* i32 -> f32 */ @@ -2988,7 +2988,7 @@ static bool trans_VCVT_int_sp(DisasContext *s, arg_VCVT_int_sp *a) /* u32 -> f32 */ gen_helper_vfp_uitos(vm, vm, fpst); } - neon_store_reg32(vm, a->vd); + vfp_store_reg32(vm, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_ptr(fpst); return true; @@ -3015,7 +3015,7 @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a) vm = tcg_temp_new_i32(); vd = tcg_temp_new_i64(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); fpst = fpstatus_ptr(FPST_FPCR); if (a->s) { /* i32 -> f64 */ @@ -3057,7 +3057,7 @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a) vd = tcg_temp_new_i32(); neon_load_reg64(vm, a->vm); gen_helper_vjcvt(vd, vm, cpu_env); - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i64(vm); tcg_temp_free_i32(vd); return true; @@ -3080,7 +3080,7 @@ static bool trans_VCVT_fix_hp(DisasContext *s, arg_VCVT_fix_sp *a) frac_bits = (a->opc & 1) ? (32 - a->imm) : (16 - a->imm); vd = tcg_temp_new_i32(); - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); fpst = fpstatus_ptr(FPST_FPCR_F16); shift = tcg_const_i32(frac_bits); @@ -3115,7 +3115,7 @@ static bool trans_VCVT_fix_hp(DisasContext *s, arg_VCVT_fix_sp *a) g_assert_not_reached(); } - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); tcg_temp_free_i32(shift); tcg_temp_free_ptr(fpst); @@ -3139,7 +3139,7 @@ static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a) frac_bits = (a->opc & 1) ? (32 - a->imm) : (16 - a->imm); vd = tcg_temp_new_i32(); - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); fpst = fpstatus_ptr(FPST_FPCR); shift = tcg_const_i32(frac_bits); @@ -3174,7 +3174,7 @@ static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a) g_assert_not_reached(); } - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); tcg_temp_free_i32(shift); tcg_temp_free_ptr(fpst); @@ -3261,7 +3261,7 @@ static bool trans_VCVT_hp_int(DisasContext *s, arg_VCVT_sp_int *a) fpst = fpstatus_ptr(FPST_FPCR_F16); vm = tcg_temp_new_i32(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); if (a->s) { if (a->rz) { @@ -3276,7 +3276,7 @@ static bool trans_VCVT_hp_int(DisasContext *s, arg_VCVT_sp_int *a) gen_helper_vfp_touih(vm, vm, fpst); } } - neon_store_reg32(vm, a->vd); + vfp_store_reg32(vm, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_ptr(fpst); return true; @@ -3297,7 +3297,7 @@ static bool trans_VCVT_sp_int(DisasContext *s, arg_VCVT_sp_int *a) fpst = fpstatus_ptr(FPST_FPCR); vm = tcg_temp_new_i32(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); if (a->s) { if (a->rz) { @@ -3312,7 +3312,7 @@ static bool trans_VCVT_sp_int(DisasContext *s, arg_VCVT_sp_int *a) gen_helper_vfp_touis(vm, vm, fpst); } } - neon_store_reg32(vm, a->vd); + vfp_store_reg32(vm, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_ptr(fpst); return true; @@ -3355,7 +3355,7 @@ static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a) gen_helper_vfp_touid(vd, vm, fpst); } } - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); tcg_temp_free_i64(vm); tcg_temp_free_ptr(fpst); @@ -3468,10 +3468,10 @@ static bool trans_VINS(DisasContext *s, arg_VINS *a) /* Insert low half of Vm into high half of Vd */ rm = tcg_temp_new_i32(); rd = tcg_temp_new_i32(); - neon_load_reg32(rm, a->vm); - neon_load_reg32(rd, a->vd); + vfp_load_reg32(rm, a->vm); + vfp_load_reg32(rd, a->vd); tcg_gen_deposit_i32(rd, rd, rm, 16, 16); - neon_store_reg32(rd, a->vd); + vfp_store_reg32(rd, a->vd); tcg_temp_free_i32(rm); tcg_temp_free_i32(rd); return true; @@ -3495,9 +3495,9 @@ static bool trans_VMOVX(DisasContext *s, arg_VINS *a) /* Set Vd to high half of Vm */ rm = tcg_temp_new_i32(); - neon_load_reg32(rm, a->vm); + vfp_load_reg32(rm, a->vm); tcg_gen_shri_i32(rm, rm, 16); - neon_store_reg32(rm, a->vd); + vfp_store_reg32(rm, a->vd); tcg_temp_free_i32(rm); return true; } From patchwork Fri Oct 30 02:26:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 316535 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62A0EC4363A for ; Fri, 30 Oct 2020 02:33:09 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 733C320739 for ; Fri, 30 Oct 2020 02:33:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="wSPJS9jp" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 733C320739 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:42950 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKDx-0005Xw-Af for qemu-devel@archiver.kernel.org; Thu, 29 Oct 2020 22:33:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39880) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7i-0005EA-SH for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:38 -0400 Received: from mail-pl1-x643.google.com ([2607:f8b0:4864:20::643]:42499) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7c-0005va-Pp for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:38 -0400 Received: by mail-pl1-x643.google.com with SMTP id t22so2234252plr.9 for ; Thu, 29 Oct 2020 19:26:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KtpYW6ejYaGItCqJaQW/H6G3iR7ZEPtLWbUtROL9oZM=; b=wSPJS9jpNWoeqBkG0S4rciJVWdp4h4/mtWB2NNLsbDa2H8ouyzVt6NWtc+naaFFnVX 5BULNEe2/PVXCp5yfwplE68NTEpEbX0eFGoxgdItwPjHfLx2ninMxNnFvNHStQQjj9lo 1k/yLHKtvQN8Q7qFgCdhDw1yrY5F6/Kj1P0aH+gUQDw6ZjFwVLjDzoEn9/3j8eXw5P0v CVdLORwNsIXYLm+sKMa5VspvQW21zPJYj3kHCQmUlTl4QV0paAzzQJ80r52o3DfUbLc6 1X2eBeT28WaHoKEYh4yC/Y7F7my/iYza4CRfqV0bB+WLUjG5gtCa5wYNeNk4tdActwOc UAYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KtpYW6ejYaGItCqJaQW/H6G3iR7ZEPtLWbUtROL9oZM=; b=P8PhV+kt4CmJvFvs947FDxxEcwsZIz/O85WFScP1AkS9WqD7uj8JIlgZaC1YH18H78 cWxB1ZTWb48CzUP4c4ceegN3JzrPzEk24wkR/8CehArj6IdGR98hPEoAdcVeZU06MRm4 wkLzJOBY6dK/6Ms6xqn3EYN0truiK8NSnFgv/Xhl72bXhOZ/Rvxhrzyps2sapKWmVnQj 3HaomKPH44gcp9wjbBJ/zy9J8iFB+Rs+CYvyTGVj7PLHH/HRr5gQi6Y1O3rmlgR9O6k9 6jFb5c1R8CkQaLnLXeqdPBGnt+uXHK2PT92H9DB9KIF3UH+dQQWPnVbnqwhfNKYXtRbS re5A== X-Gm-Message-State: AOAM532BccDW/RFi/E4JUH+yfLoDmcDcvtNwJYBocbkSnT+DVeufGLSo aKHt6WOi2nm1zWqPjf4AKs97yXe79bh1sQ== X-Google-Smtp-Source: ABdhPJzp9lWm0pNg+VaRZjhSV71ZeMlwcrrcQ+LI8KpCBgmmopwtCLVMoVhjbETn01f2SESIsYBX2g== X-Received: by 2002:a17:902:ed54:b029:d3:d0bc:e41d with SMTP id y20-20020a170902ed54b02900d3d0bce41dmr7153448plb.13.1604024791022; Thu, 29 Oct 2020 19:26:31 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:30 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 08/11] target/arm: Add read/write_neon_element64 Date: Thu, 29 Oct 2020 19:26:15 -0700 Message-Id: <20201030022618.785675-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::643; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x643.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Replace all uses of neon_load/store_reg64 within translate-neon.c.inc. Signed-off-by: Richard Henderson --- target/arm/translate.c | 26 +++++++++ target/arm/translate-neon.c.inc | 94 ++++++++++++++++----------------- 2 files changed, 73 insertions(+), 47 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 8491ab705b..4fb0a62200 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1178,6 +1178,19 @@ static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp memop) } } +static void read_neon_element64(TCGv_i64 dest, int reg, int ele, MemOp memop) +{ + long off = neon_element_offset(reg, ele, memop); + + switch (memop) { + case MO_Q: + tcg_gen_ld_i64(dest, cpu_env, off); + break; + default: + g_assert_not_reached(); + } +} + static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop) { long off = neon_element_offset(reg, ele, memop); @@ -1197,6 +1210,19 @@ static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop) } } +static void write_neon_element64(TCGv_i64 src, int reg, int ele, MemOp memop) +{ + long off = neon_element_offset(reg, ele, memop); + + switch (memop) { + case MO_64: + tcg_gen_st_i64(src, cpu_env, off); + break; + default: + g_assert_not_reached(); + } +} + static TCGv_ptr vfp_reg_ptr(bool dp, int reg) { TCGv_ptr ret = tcg_temp_new_ptr(); diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc index 549381703e..c2d67160f9 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -1265,9 +1265,9 @@ static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a, for (pass = 0; pass < a->q + 1; pass++) { TCGv_i64 tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vm + pass); + read_neon_element64(tmp, a->vm, pass, MO_64); fn(tmp, cpu_env, tmp, constimm); - neon_store_reg64(tmp, a->vd + pass); + write_neon_element64(tmp, a->vd, pass, MO_64); tcg_temp_free_i64(tmp); } tcg_temp_free_i64(constimm); @@ -1375,8 +1375,8 @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a, rd = tcg_temp_new_i32(); /* Load both inputs first to avoid potential overwrite if rm == rd */ - neon_load_reg64(rm1, a->vm); - neon_load_reg64(rm2, a->vm + 1); + read_neon_element64(rm1, a->vm, 0, MO_64); + read_neon_element64(rm2, a->vm, 1, MO_64); shiftfn(rm1, rm1, constimm); narrowfn(rd, cpu_env, rm1); @@ -1579,7 +1579,7 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a, tcg_gen_shli_i64(tmp, tmp, a->shift); tcg_gen_andi_i64(tmp, tmp, ~widen_mask); } - neon_store_reg64(tmp, a->vd); + write_neon_element64(tmp, a->vd, 0, MO_64); widenfn(tmp, rm1); tcg_temp_free_i32(rm1); @@ -1587,7 +1587,7 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a, tcg_gen_shli_i64(tmp, tmp, a->shift); tcg_gen_andi_i64(tmp, tmp, ~widen_mask); } - neon_store_reg64(tmp, a->vd + 1); + write_neon_element64(tmp, a->vd, 1, MO_64); tcg_temp_free_i64(tmp); return true; } @@ -1822,7 +1822,7 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, rm_64 = tcg_temp_new_i64(); if (src1_wide) { - neon_load_reg64(rn0_64, a->vn); + read_neon_element64(rn0_64, a->vn, 0, MO_64); } else { TCGv_i32 tmp = tcg_temp_new_i32(); read_neon_element32(tmp, a->vn, 0, MO_32); @@ -1841,7 +1841,7 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, * avoid incorrect results if a narrow input overlaps with the result. */ if (src1_wide) { - neon_load_reg64(rn1_64, a->vn + 1); + read_neon_element64(rn1_64, a->vn, 1, MO_64); } else { TCGv_i32 tmp = tcg_temp_new_i32(); read_neon_element32(tmp, a->vn, 1, MO_32); @@ -1851,12 +1851,12 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, rm = tcg_temp_new_i32(); read_neon_element32(rm, a->vm, 1, MO_32); - neon_store_reg64(rn0_64, a->vd); + write_neon_element64(rn0_64, a->vd, 0, MO_64); widenfn(rm_64, rm); tcg_temp_free_i32(rm); opfn(rn1_64, rn1_64, rm_64); - neon_store_reg64(rn1_64, a->vd + 1); + write_neon_element64(rn1_64, a->vd, 1, MO_64); tcg_temp_free_i64(rn0_64); tcg_temp_free_i64(rn1_64); @@ -1928,15 +1928,15 @@ static bool do_narrow_3d(DisasContext *s, arg_3diff *a, rd0 = tcg_temp_new_i32(); rd1 = tcg_temp_new_i32(); - neon_load_reg64(rn_64, a->vn); - neon_load_reg64(rm_64, a->vm); + read_neon_element64(rn_64, a->vn, 0, MO_64); + read_neon_element64(rm_64, a->vm, 0, MO_64); opfn(rn_64, rn_64, rm_64); narrowfn(rd0, rn_64); - neon_load_reg64(rn_64, a->vn + 1); - neon_load_reg64(rm_64, a->vm + 1); + read_neon_element64(rn_64, a->vn, 1, MO_64); + read_neon_element64(rm_64, a->vm, 1, MO_64); opfn(rn_64, rn_64, rm_64); @@ -2036,16 +2036,16 @@ static bool do_long_3d(DisasContext *s, arg_3diff *a, /* Don't store results until after all loads: they might overlap */ if (accfn) { tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vd); + read_neon_element64(tmp, a->vd, 0, MO_64); accfn(tmp, tmp, rd0); - neon_store_reg64(tmp, a->vd); - neon_load_reg64(tmp, a->vd + 1); + write_neon_element64(tmp, a->vd, 0, MO_64); + read_neon_element64(tmp, a->vd, 1, MO_64); accfn(tmp, tmp, rd1); - neon_store_reg64(tmp, a->vd + 1); + write_neon_element64(tmp, a->vd, 1, MO_64); tcg_temp_free_i64(tmp); } else { - neon_store_reg64(rd0, a->vd); - neon_store_reg64(rd1, a->vd + 1); + write_neon_element64(rd0, a->vd, 0, MO_64); + write_neon_element64(rd1, a->vd, 1, MO_64); } tcg_temp_free_i64(rd0); @@ -2669,16 +2669,16 @@ static bool do_2scalar_long(DisasContext *s, arg_2scalar *a, if (accfn) { TCGv_i64 t64 = tcg_temp_new_i64(); - neon_load_reg64(t64, a->vd); + read_neon_element64(t64, a->vd, 0, MO_64); accfn(t64, t64, rn0_64); - neon_store_reg64(t64, a->vd); - neon_load_reg64(t64, a->vd + 1); + write_neon_element64(t64, a->vd, 0, MO_64); + read_neon_element64(t64, a->vd, 1, MO_64); accfn(t64, t64, rn1_64); - neon_store_reg64(t64, a->vd + 1); + write_neon_element64(t64, a->vd, 1, MO_64); tcg_temp_free_i64(t64); } else { - neon_store_reg64(rn0_64, a->vd); - neon_store_reg64(rn1_64, a->vd + 1); + write_neon_element64(rn0_64, a->vd, 0, MO_64); + write_neon_element64(rn1_64, a->vd, 1, MO_64); } tcg_temp_free_i64(rn0_64); tcg_temp_free_i64(rn1_64); @@ -2812,10 +2812,10 @@ static bool trans_VEXT(DisasContext *s, arg_VEXT *a) right = tcg_temp_new_i64(); dest = tcg_temp_new_i64(); - neon_load_reg64(right, a->vn); - neon_load_reg64(left, a->vm); + read_neon_element64(right, a->vn, 0, MO_64); + read_neon_element64(left, a->vm, 0, MO_64); tcg_gen_extract2_i64(dest, right, left, a->imm * 8); - neon_store_reg64(dest, a->vd); + write_neon_element64(dest, a->vd, 0, MO_64); tcg_temp_free_i64(left); tcg_temp_free_i64(right); @@ -2831,21 +2831,21 @@ static bool trans_VEXT(DisasContext *s, arg_VEXT *a) destright = tcg_temp_new_i64(); if (a->imm < 8) { - neon_load_reg64(right, a->vn); - neon_load_reg64(middle, a->vn + 1); + read_neon_element64(right, a->vn, 0, MO_64); + read_neon_element64(middle, a->vn, 1, MO_64); tcg_gen_extract2_i64(destright, right, middle, a->imm * 8); - neon_load_reg64(left, a->vm); + read_neon_element64(left, a->vm, 0, MO_64); tcg_gen_extract2_i64(destleft, middle, left, a->imm * 8); } else { - neon_load_reg64(right, a->vn + 1); - neon_load_reg64(middle, a->vm); + read_neon_element64(right, a->vn, 1, MO_64); + read_neon_element64(middle, a->vm, 0, MO_64); tcg_gen_extract2_i64(destright, right, middle, (a->imm - 8) * 8); - neon_load_reg64(left, a->vm + 1); + read_neon_element64(left, a->vm, 1, MO_64); tcg_gen_extract2_i64(destleft, middle, left, (a->imm - 8) * 8); } - neon_store_reg64(destright, a->vd); - neon_store_reg64(destleft, a->vd + 1); + write_neon_element64(destright, a->vd, 0, MO_64); + write_neon_element64(destleft, a->vd, 1, MO_64); tcg_temp_free_i64(destright); tcg_temp_free_i64(destleft); @@ -3052,11 +3052,11 @@ static bool do_2misc_pairwise(DisasContext *s, arg_2misc *a, if (accfn) { TCGv_i64 tmp64 = tcg_temp_new_i64(); - neon_load_reg64(tmp64, a->vd + pass); + read_neon_element64(tmp64, a->vd, pass, MO_64); accfn(rd_64, tmp64, rd_64); tcg_temp_free_i64(tmp64); } - neon_store_reg64(rd_64, a->vd + pass); + write_neon_element64(rd_64, a->vd, pass, MO_64); tcg_temp_free_i64(rd_64); } return true; @@ -3254,9 +3254,9 @@ static bool do_vmovn(DisasContext *s, arg_2misc *a, rd0 = tcg_temp_new_i32(); rd1 = tcg_temp_new_i32(); - neon_load_reg64(rm, a->vm); + read_neon_element64(rm, a->vm, 0, MO_64); narrowfn(rd0, cpu_env, rm); - neon_load_reg64(rm, a->vm + 1); + read_neon_element64(rm, a->vm, 1, MO_64); narrowfn(rd1, cpu_env, rm); write_neon_element32(rd0, a->vd, 0, MO_32); write_neon_element32(rd1, a->vd, 1, MO_32); @@ -3326,10 +3326,10 @@ static bool trans_VSHLL(DisasContext *s, arg_2misc *a) widenfn(rd, rm0); tcg_gen_shli_i64(rd, rd, 8 << a->size); - neon_store_reg64(rd, a->vd); + write_neon_element64(rd, a->vd, 0, MO_64); widenfn(rd, rm1); tcg_gen_shli_i64(rd, rd, 8 << a->size); - neon_store_reg64(rd, a->vd + 1); + write_neon_element64(rd, a->vd, 1, MO_64); tcg_temp_free_i64(rd); tcg_temp_free_i32(rm0); @@ -3847,10 +3847,10 @@ static bool trans_VSWP(DisasContext *s, arg_2misc *a) rm = tcg_temp_new_i64(); rd = tcg_temp_new_i64(); for (pass = 0; pass < (a->q ? 2 : 1); pass++) { - neon_load_reg64(rm, a->vm + pass); - neon_load_reg64(rd, a->vd + pass); - neon_store_reg64(rm, a->vd + pass); - neon_store_reg64(rd, a->vm + pass); + read_neon_element64(rm, a->vm, pass, MO_64); + read_neon_element64(rd, a->vd, pass, MO_64); + write_neon_element64(rm, a->vd, pass, MO_64); + write_neon_element64(rd, a->vm, pass, MO_64); } tcg_temp_free_i64(rm); tcg_temp_free_i64(rd); From patchwork Fri Oct 30 02:26:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 319987 Delivered-To: patch@linaro.org Received: by 2002:a92:7b12:0:0:0:0:0 with SMTP id w18csp1015148ilc; Thu, 29 Oct 2020 19:30:55 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzZIeKVPfI7cx2IWEz2a+mW2U2Acgph20gJWn5WXIubMvvr6E6O0Qsk8SoNjyJaSzycWBMC X-Received: by 2002:a25:2846:: with SMTP id o67mr659732ybo.164.1604025055231; Thu, 29 Oct 2020 19:30:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1604025055; cv=none; d=google.com; s=arc-20160816; b=QuIxTO3E0ydDu/NjDJmCqG8VqJZmkXpGdYUnzmSyeCtH+yDk2tpj/LYJhFOnlWjNf2 PQy/unVV8PguMA/PBKwKqR3t8BK0ExWg2tpVVdaswte9zaui13cV4FolJUVDB0NXLpp4 hsRsB6581on33YQjMiZu6USp3+OrFSaSSy0yfMgnfKqilf/qIBRyYFnDHImWEuIE+NfB W+ogqV+7xXs+dPNsA2MXuJgtYRE53ftrYe2vEU8BiawEP56qx1IijWhrap3cC56WDA3R elhiee4XIg40oFZlo20bd1sCla8qHXXeX5RQUNBFNFlnFrlv6SQkUUeNh2EtJCHlZGKy BtQA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=I1RxcADQNWXG6kJXZJzeM0kiw0mxZyWZrvQsQoREMmo=; b=MczWv5b63OwkAwLpQX4aw0mwMniHM1y9kDJKWiCmhXtb1BxdeiiphE452Pmmn6Gzk3 nuoD+QmD16whQO+d+0uPj1kkaDZg/DirD1YVTq9Uo9at4mMiz0FgIoJJFOBTd22+sbXg dEj80qVWuVN2t5WvEw68uflMjRh2flsqo/J68OpTMBYXwq8Uo9BtqBWNJCr+wKf7UqpT 0BpauLKl8XeNRaHDnOo6HLduV7ymW2Ul5jKh3J/esseh6AeoPmwu/tnK/6eDezR6qyan uAqV1M0h2dGr/QpRFJiZBz6cOX2vP9PNDYNMjpfdM4/gq4KHR/ptpWT3kgr89MoQ35so b7rA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=juHt5YLh; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id x61si6020315ybh.247.2020.10.29.19.30.55 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Thu, 29 Oct 2020 19:30:55 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=juHt5YLh; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:35376 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKBq-0002Ss-MG for patch@linaro.org; Thu, 29 Oct 2020 22:30:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39906) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7k-0005IP-NA for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:40 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:41445) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7i-0005vh-Dc for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:40 -0400 Received: by mail-pg1-x541.google.com with SMTP id g12so3915066pgm.8 for ; Thu, 29 Oct 2020 19:26:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=I1RxcADQNWXG6kJXZJzeM0kiw0mxZyWZrvQsQoREMmo=; b=juHt5YLhMN+yVkGrtrmS2L/TtkZNE0I4Vp48kLgxgrwcq/zqYmAlOZJwVjoVAEpRDc vqLWpy+cHahlrbncZF1lSEvycl6ZweWaKHhTNDK5l7o+FFnDOITqxxWBJSehl/KgUs8J FQzxY6mU4DBppD0aBoRr/VOmNTGZHGnczG+UKAuZyfUgEHPAxKjmsBPX2CQXtAVSmRhR KlXIhmWL1qzKbjP3axaw/Vsing/AM47cEWiUfRffvu/L3TwkCVWMvR0epdfSOdDs0izc GKJsa8JQKGVoqaj+k0oiKwuB/GD6rPeugWGZpQD9QyOOTQPn2CeCBXHrrapCdvxxkrYe 6xkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=I1RxcADQNWXG6kJXZJzeM0kiw0mxZyWZrvQsQoREMmo=; b=rus87r0h0iwwwAyqtuPSqcQf08grB21aSutdYWaYhMBwwIqfSsq/eAadebJm0W5U9s 2YntvZh1r6+dr4F+w/K7RIHdT3pFB7+nSf6RS6pCictfUoGUS3tbLVM3zR/rSTbCrR7u P/HiUTXwy+XwYmqDKWNam7WYJFtEh1qoUsbaZPihdme3d1YWBpr929D2cJPhEOcUmBNx NyBsU2UWl6YbhViSbWu+cqJLUl9RZEV00SJi9qsFt37L4lynMPtltjPM+QsnkNwcPwE9 jpe/9h1CwBhx4ka1+Rtn/QvNl3Q1hy0G1gsZv+LHs6n8eIYHfiLynFmieZo1g4art6eP 54wg== X-Gm-Message-State: AOAM53116qCPlpbtAt1B8tFCLfttw+LoSKnGXqlyPduE3KcTt0gaDiVa 5a3avG5NBmAwb6D+92oJvy9Pks0Jq8ZqBA== X-Received: by 2002:a17:90b:14e:: with SMTP id em14mr113660pjb.186.1604024792423; Thu, 29 Oct 2020 19:26:32 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:31 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 09/11] target/arm: Rename neon_load_reg64 to vfp_load_reg64 Date: Thu, 29 Oct 2020 19:26:16 -0700 Message-Id: <20201030022618.785675-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::541; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x541.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" The only uses of this function are for loading VFP double-precision values, and nothing to do with NEON. Signed-off-by: Richard Henderson --- target/arm/translate.c | 8 ++-- target/arm/translate-vfp.c.inc | 84 +++++++++++++++++----------------- 2 files changed, 46 insertions(+), 46 deletions(-) -- 2.25.1 diff --git a/target/arm/translate.c b/target/arm/translate.c index 4fb0a62200..7611c1f0f1 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1132,14 +1132,14 @@ static long vfp_reg_offset(bool dp, unsigned reg) } } -static inline void neon_load_reg64(TCGv_i64 var, int reg) +static inline void vfp_load_reg64(TCGv_i64 var, int reg) { - tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg)); + tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(true, reg)); } -static inline void neon_store_reg64(TCGv_i64 var, int reg) +static inline void vfp_store_reg64(TCGv_i64 var, int reg) { - tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(1, reg)); + tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(true, reg)); } static inline void vfp_load_reg32(TCGv_i32 var, int reg) diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc index d2a9b658bb..f966de5b1f 100644 --- a/target/arm/translate-vfp.c.inc +++ b/target/arm/translate-vfp.c.inc @@ -236,8 +236,8 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a) tcg_gen_ext_i32_i64(nf, cpu_NF); tcg_gen_ext_i32_i64(vf, cpu_VF); - neon_load_reg64(frn, rn); - neon_load_reg64(frm, rm); + vfp_load_reg64(frn, rn); + vfp_load_reg64(frm, rm); switch (a->cc) { case 0: /* eq: Z */ tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero, @@ -264,7 +264,7 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a) tcg_temp_free_i64(tmp); break; } - neon_store_reg64(dest, rd); + vfp_store_reg64(dest, rd); tcg_temp_free_i64(frn); tcg_temp_free_i64(frm); tcg_temp_free_i64(dest); @@ -385,9 +385,9 @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a) TCGv_i64 tcg_res; tcg_op = tcg_temp_new_i64(); tcg_res = tcg_temp_new_i64(); - neon_load_reg64(tcg_op, rm); + vfp_load_reg64(tcg_op, rm); gen_helper_rintd(tcg_res, tcg_op, fpst); - neon_store_reg64(tcg_res, rd); + vfp_store_reg64(tcg_res, rd); tcg_temp_free_i64(tcg_op); tcg_temp_free_i64(tcg_res); } else { @@ -463,7 +463,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) tcg_double = tcg_temp_new_i64(); tcg_res = tcg_temp_new_i64(); tcg_tmp = tcg_temp_new_i32(); - neon_load_reg64(tcg_double, rm); + vfp_load_reg64(tcg_double, rm); if (is_signed) { gen_helper_vfp_tosld(tcg_res, tcg_double, tcg_shift, fpst); } else { @@ -1002,9 +1002,9 @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_dp *a) tmp = tcg_temp_new_i64(); if (a->l) { gen_aa32_ld64(s, tmp, addr, get_mem_index(s)); - neon_store_reg64(tmp, a->vd); + vfp_store_reg64(tmp, a->vd); } else { - neon_load_reg64(tmp, a->vd); + vfp_load_reg64(tmp, a->vd); gen_aa32_st64(s, tmp, addr, get_mem_index(s)); } tcg_temp_free_i64(tmp); @@ -1149,10 +1149,10 @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a) if (a->l) { /* load */ gen_aa32_ld64(s, tmp, addr, get_mem_index(s)); - neon_store_reg64(tmp, a->vd + i); + vfp_store_reg64(tmp, a->vd + i); } else { /* store */ - neon_load_reg64(tmp, a->vd + i); + vfp_load_reg64(tmp, a->vd + i); gen_aa32_st64(s, tmp, addr, get_mem_index(s)); } tcg_gen_addi_i32(addr, addr, offset); @@ -1416,15 +1416,15 @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn, fd = tcg_temp_new_i64(); fpst = fpstatus_ptr(FPST_FPCR); - neon_load_reg64(f0, vn); - neon_load_reg64(f1, vm); + vfp_load_reg64(f0, vn); + vfp_load_reg64(f1, vm); for (;;) { if (reads_vd) { - neon_load_reg64(fd, vd); + vfp_load_reg64(fd, vd); } fn(fd, f0, f1, fpst); - neon_store_reg64(fd, vd); + vfp_store_reg64(fd, vd); if (veclen == 0) { break; @@ -1433,10 +1433,10 @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn, veclen--; vd = vfp_advance_dreg(vd, delta_d); vn = vfp_advance_dreg(vn, delta_d); - neon_load_reg64(f0, vn); + vfp_load_reg64(f0, vn); if (delta_m) { vm = vfp_advance_dreg(vm, delta_m); - neon_load_reg64(f1, vm); + vfp_load_reg64(f1, vm); } } @@ -1599,11 +1599,11 @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm) f0 = tcg_temp_new_i64(); fd = tcg_temp_new_i64(); - neon_load_reg64(f0, vm); + vfp_load_reg64(f0, vm); for (;;) { fn(fd, f0); - neon_store_reg64(fd, vd); + vfp_store_reg64(fd, vd); if (veclen == 0) { break; @@ -1613,7 +1613,7 @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm) /* single source one-many */ while (veclen--) { vd = vfp_advance_dreg(vd, delta_d); - neon_store_reg64(fd, vd); + vfp_store_reg64(fd, vd); } break; } @@ -1622,7 +1622,7 @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm) veclen--; vd = vfp_advance_dreg(vd, delta_d); vd = vfp_advance_dreg(vm, delta_m); - neon_load_reg64(f0, vm); + vfp_load_reg64(f0, vm); } tcg_temp_free_i64(f0); @@ -2173,20 +2173,20 @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d) vm = tcg_temp_new_i64(); vd = tcg_temp_new_i64(); - neon_load_reg64(vn, a->vn); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vn, a->vn); + vfp_load_reg64(vm, a->vm); if (neg_n) { /* VFNMS, VFMS */ gen_helper_vfp_negd(vn, vn); } - neon_load_reg64(vd, a->vd); + vfp_load_reg64(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ gen_helper_vfp_negd(vd, vd); } fpst = fpstatus_ptr(FPST_FPCR); gen_helper_vfp_muladdd(vd, vn, vm, vd, fpst); - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i64(vn); @@ -2325,7 +2325,7 @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a) fd = tcg_const_i64(vfp_expand_imm(MO_64, a->imm)); for (;;) { - neon_store_reg64(fd, vd); + vfp_store_reg64(fd, vd); if (veclen == 0) { break; @@ -2480,11 +2480,11 @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a) vd = tcg_temp_new_i64(); vm = tcg_temp_new_i64(); - neon_load_reg64(vd, a->vd); + vfp_load_reg64(vd, a->vd); if (a->z) { tcg_gen_movi_i64(vm, 0); } else { - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); } if (a->e) { @@ -2557,7 +2557,7 @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a) tcg_gen_ld16u_i32(tmp, cpu_env, vfp_f16_offset(a->vm, a->t)); vd = tcg_temp_new_i64(); gen_helper_vfp_fcvt_f16_to_f64(vd, tmp, fpst, ahp_mode); - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_i32(ahp_mode); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); @@ -2621,7 +2621,7 @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a) tmp = tcg_temp_new_i32(); vm = tcg_temp_new_i64(); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); gen_helper_vfp_fcvt_f64_to_f16(tmp, vm, fpst, ahp_mode); tcg_temp_free_i64(vm); tcg_gen_st16_i32(tmp, cpu_env, vfp_f16_offset(a->vd, a->t)); @@ -2700,10 +2700,10 @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a) } tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vm); + vfp_load_reg64(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); gen_helper_rintd(tmp, tmp, fpst); - neon_store_reg64(tmp, a->vd); + vfp_store_reg64(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i64(tmp); return true; @@ -2789,13 +2789,13 @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a) } tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vm); + vfp_load_reg64(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); tcg_rmode = tcg_const_i32(float_round_to_zero); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); gen_helper_rintd(tmp, tmp, fpst); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); - neon_store_reg64(tmp, a->vd); + vfp_store_reg64(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i64(tmp); tcg_temp_free_i32(tcg_rmode); @@ -2871,10 +2871,10 @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a) } tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vm); + vfp_load_reg64(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); gen_helper_rintd_exact(tmp, tmp, fpst); - neon_store_reg64(tmp, a->vd); + vfp_store_reg64(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i64(tmp); return true; @@ -2902,7 +2902,7 @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a) vd = tcg_temp_new_i64(); vfp_load_reg32(vm, a->vm); gen_helper_vfp_fcvtds(vd, vm, cpu_env); - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_i64(vd); return true; @@ -2928,7 +2928,7 @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a) vd = tcg_temp_new_i32(); vm = tcg_temp_new_i64(); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); gen_helper_vfp_fcvtsd(vd, vm, cpu_env); vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); @@ -3024,7 +3024,7 @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a) /* u32 -> f64 */ gen_helper_vfp_uitod(vd, vm, fpst); } - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_i64(vd); tcg_temp_free_ptr(fpst); @@ -3055,7 +3055,7 @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a) vm = tcg_temp_new_i64(); vd = tcg_temp_new_i32(); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); gen_helper_vjcvt(vd, vm, cpu_env); vfp_store_reg32(vd, a->vd); tcg_temp_free_i64(vm); @@ -3204,7 +3204,7 @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a) frac_bits = (a->opc & 1) ? (32 - a->imm) : (16 - a->imm); vd = tcg_temp_new_i64(); - neon_load_reg64(vd, a->vd); + vfp_load_reg64(vd, a->vd); fpst = fpstatus_ptr(FPST_FPCR); shift = tcg_const_i32(frac_bits); @@ -3239,7 +3239,7 @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a) g_assert_not_reached(); } - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_i64(vd); tcg_temp_free_i32(shift); tcg_temp_free_ptr(fpst); @@ -3340,7 +3340,7 @@ static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a) fpst = fpstatus_ptr(FPST_FPCR); vm = tcg_temp_new_i64(); vd = tcg_temp_new_i32(); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); if (a->s) { if (a->rz) { From patchwork Fri Oct 30 02:26:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 319988 Delivered-To: patch@linaro.org Received: by 2002:a92:7b12:0:0:0:0:0 with SMTP id w18csp1016024ilc; Thu, 29 Oct 2020 19:32:36 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxL+rwFmibHgb+TmZB3Jylke8kzkhZZEHClPtcWbCizdTgdZAuVAccS6YFTDXVUk4YmXgJ0 X-Received: by 2002:a05:6902:52d:: with SMTP id y13mr559339ybs.364.1604025156222; Thu, 29 Oct 2020 19:32:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1604025156; cv=none; d=google.com; s=arc-20160816; b=Ieyv8bHgfaNfvpGdRc3GVPY6KzkpV9V/JfV5XrA1woaczPB4XJSjZ0/jTpP5e95Go7 8cKl+dWub8/Nxk7D5LxjF14NlmR6CLYxEm0R8FvWTzZ9TMxsh/RXgFOdmcjFdFRW8lwO Zc6Hiwhdh5dvHi+k8akkPfUeACDm9blNZqcvCWzSjK7yBGnVMWTeNleVNnzCICt4bEou kO0XA/qJFtlaTqVuW5Ni42olttj7OzlmAMqRGgtqRQvOy5HApEuJ1MpiLCWl9K4AfW6f A7TY3Wo3kd9Uu679eKt2EgVfb91CmuLBMHToTsYQKiBCox5I8Au7dE367rPEFleVDaiF D5IQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=5ctNmX935/85OBzb+3yZO19vwGcb5IjbGDXkuOiA+8s=; b=vjr3FnnZQjdr7XBWMJfn+X4OQbmrOLceN5CdxmU6g9hVPnHkUPdGwL5yUyQdTtXopJ 4l448P6HDySLxs/828x2+mfL+qLWUwV10S7BwfJ+SK3cXhwlCwJJqEijcVZIZnJlOgbq v/U3ZW6QUuA3y5RucnX/uodtBdhSNMCBtdCs1yIqyWBuR2QiAdC5ZGs2SjZ8bm8oTiFF Duiyykx9HeKd2C/RFI0WQ8AGVrR31PpwH8NEfiXhXug8FNAKjratdgIrmi3jAXygINei UowMCXUIgQcso6i4GlkNLRFXylEOMHNQF9IK3BTp+cuBygzbPHdAEtvmxBDzUSrziP5n NIuA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=fA6Cp8KT; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id t60si6279393ybi.18.2020.10.29.19.32.36 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Thu, 29 Oct 2020 19:32:36 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=fA6Cp8KT; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:41062 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKDT-0004lz-Na for patch@linaro.org; Thu, 29 Oct 2020 22:32:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39900) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7k-0005HC-5m for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:40 -0400 Received: from mail-pl1-x643.google.com ([2607:f8b0:4864:20::643]:40753) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7i-0005vr-IA for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:39 -0400 Received: by mail-pl1-x643.google.com with SMTP id j5so2243464plk.7 for ; Thu, 29 Oct 2020 19:26:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5ctNmX935/85OBzb+3yZO19vwGcb5IjbGDXkuOiA+8s=; b=fA6Cp8KTlfM9f4jjDGWhBRINgApSSCd4hcIlys/IMuxBB97mnHCitWV6TQQhSj28ZI uFB4BrKaUSI1jF+FPwZCgEtlG5+CQGlIY9oRn5WhWtREuxtXUMHfcTFrKJ0ua/EgbDO3 g1MPh2JBCxim9BpwaKXPElyCPiKsSLWJGKv+8KfEuuLFO3w0ocW4STXhGvFxXFU8nug8 5fGE+385YXidNxMujonxZDbmtDBEBwaJQapCYsVYuiOQPTttnV4XXoz+Y0Ig5Sxcf3FH VEz36k9jc1p5vya1wz/G85aBZz+lHzGnQITwSK4fDiSh7PCcmt/DC1Hml5YF31QA6bz0 c5Dg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5ctNmX935/85OBzb+3yZO19vwGcb5IjbGDXkuOiA+8s=; b=DnVjBViWwlgIUCTxdJhmrClevT4DlO00qaGf/Rg+06v/GTQqkGpHCcnXItZ+rX6jgO By+VBqqNBL3ZGjF9/EQCaKGshTQSjMNoC8qN8G3/+YOm/8aVkgtkw3lHo7d85eeJdFzy SA7UomXArOl8nmYPVhElgdcidfbpsqV0546YViApJo2YtJuBnSnJqTduec9CbyAwx+C2 Km9hcmOKxjuZ5YfMOJuwP38QEDsitB9p8bidvnoXJ8snFjFV4Pa24lNLsFIdrM/VIyXN d1JfQHhJxIKzphqR9BlM0Rug7PqYO1M1SWZRxINsRetwZ6c6UP0apUx9zxyt8RGEHTLv 7gBw== X-Gm-Message-State: AOAM531HDuDtASTJ0J6/KRcRWbaoUtDjiIWLgq5zDmUPhGD70VtkYgMG BRIM7HLkrzMqp+/xiOLkZmCVJTps9hbDZg== X-Received: by 2002:a17:902:7d89:b029:d5:cfb6:e44 with SMTP id a9-20020a1709027d89b02900d5cfb60e44mr7136707plm.28.1604024793684; Thu, 29 Oct 2020 19:26:33 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:32 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 10/11] target/arm: Simplify do_long_3d and do_2scalar_long Date: Thu, 29 Oct 2020 19:26:17 -0700 Message-Id: <20201030022618.785675-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::643; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x643.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" In both cases, we can sink the write-back and perform the accumulate into the normal destination temps. Signed-off-by: Richard Henderson --- target/arm/translate-neon.c.inc | 23 +++++++++-------------- 1 file changed, 9 insertions(+), 14 deletions(-) -- 2.25.1 diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc index c2d67160f9..1c16c56e7e 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -2037,17 +2037,14 @@ static bool do_long_3d(DisasContext *s, arg_3diff *a, if (accfn) { tmp = tcg_temp_new_i64(); read_neon_element64(tmp, a->vd, 0, MO_64); - accfn(tmp, tmp, rd0); - write_neon_element64(tmp, a->vd, 0, MO_64); + accfn(rd0, tmp, rd0); read_neon_element64(tmp, a->vd, 1, MO_64); - accfn(tmp, tmp, rd1); - write_neon_element64(tmp, a->vd, 1, MO_64); + accfn(rd1, tmp, rd1); tcg_temp_free_i64(tmp); - } else { - write_neon_element64(rd0, a->vd, 0, MO_64); - write_neon_element64(rd1, a->vd, 1, MO_64); } + write_neon_element64(rd0, a->vd, 0, MO_64); + write_neon_element64(rd1, a->vd, 1, MO_64); tcg_temp_free_i64(rd0); tcg_temp_free_i64(rd1); @@ -2670,16 +2667,14 @@ static bool do_2scalar_long(DisasContext *s, arg_2scalar *a, if (accfn) { TCGv_i64 t64 = tcg_temp_new_i64(); read_neon_element64(t64, a->vd, 0, MO_64); - accfn(t64, t64, rn0_64); - write_neon_element64(t64, a->vd, 0, MO_64); + accfn(rn0_64, t64, rn0_64); read_neon_element64(t64, a->vd, 1, MO_64); - accfn(t64, t64, rn1_64); - write_neon_element64(t64, a->vd, 1, MO_64); + accfn(rn1_64, t64, rn1_64); tcg_temp_free_i64(t64); - } else { - write_neon_element64(rn0_64, a->vd, 0, MO_64); - write_neon_element64(rn1_64, a->vd, 1, MO_64); } + + write_neon_element64(rn0_64, a->vd, 0, MO_64); + write_neon_element64(rn1_64, a->vd, 1, MO_64); tcg_temp_free_i64(rn0_64); tcg_temp_free_i64(rn1_64); return true; From patchwork Fri Oct 30 02:26:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 316533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C985BC4363A for ; Fri, 30 Oct 2020 02:35:27 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 17DC820739 for ; Fri, 30 Oct 2020 02:35:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="RUI9yWx5" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 17DC820739 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:51140 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKGE-0000VI-0L for qemu-devel@archiver.kernel.org; Thu, 29 Oct 2020 22:35:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39920) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYK7l-0005Kn-Pa for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:41 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:37071) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kYK7i-0005vy-II for qemu-devel@nongnu.org; Thu, 29 Oct 2020 22:26:41 -0400 Received: by mail-pg1-x541.google.com with SMTP id h6so3924595pgk.4 for ; Thu, 29 Oct 2020 19:26:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/srvkEMVHhjFuX4GEztdA7BtJNAKhyCjy2OjjjYFdsY=; b=RUI9yWx5Ixh8M4p1inbRmcRCRBOSB1rxWNuVyYBrxxZZzmNzXSaPvYj9pccDCn7B6D dk/WwcOF1Ff4CCgVyw89CBegAP/BhIx4MKGiGsNz6SO1HyYTSQPjLg/t4LArloBYb0tX 3o/stimnlG8DZAAOu1b8HRRYJ3qw/CSo95hIFhf8LAkTp3IBKzstiQeZYtQJaO/3y0tJ 5mtEvwbKr4jZ3b4NzSquIHXJXWBofKA/qZ7idF0/EUXUtwPaywzn7XMWPPbaPoBfNiGI pXqtoM7CDdK0+QM9tOTx5hGkbd/3yCj8jauni6wzzygK4Y0jTfwHOMaf+WU17ZKxqy5b hXNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/srvkEMVHhjFuX4GEztdA7BtJNAKhyCjy2OjjjYFdsY=; b=Np98nVNhTMrpHwpHNP/EuFthRKvNEeqG7F+WbBs4iBBe7BAPchni7ebhCGnEAKk5Dq wgNQWqB/S0YAuv30zCKhh8y58v1HxR7OcK+Rstk7eEsNTiyAfa6GO5S9lwttxSp2Kd3Q 5kcrjbdwRtTjNRrTqnfCBDzsbWOnhhbyj+q9p7oCR8Kk1G8fm3UQnKXHqf5YSCWeeU2d bSZT60o/uYRbrdv9TWjGAXuG3AaQyU06pcxVMW8XiLkq1Y/S8G0mXQqXDRMB8CEUCn04 oOKsXDJZcnEtEqfsC+DMUfStyeKkULH1zJBbjWt1F21KA5G+GeU9nPq0HgwEINA6fN6p Qn1g== X-Gm-Message-State: AOAM530tjoeuqNevJXA36U/73WfxabL1LXufC+3HXDrZOvNjTbG13bBM IBRjtvkk+kJznGcJSAEWwAxiHAt2lSPatg== X-Google-Smtp-Source: ABdhPJzKzloWAgDl8KsMNg+x9WyjzJ9YGL3SskZv9lNO6pX6C5tMytxDEyhmrven5nN75H7GzBQdBA== X-Received: by 2002:a17:90a:1903:: with SMTP id 3mr127263pjg.74.1604024795293; Thu, 29 Oct 2020 19:26:35 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id b7sm4446517pfr.171.2020.10.29.19.26.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Oct 2020 19:26:34 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 11/11] target/arm: Improve do_prewiden_3d Date: Thu, 29 Oct 2020 19:26:18 -0700 Message-Id: <20201030022618.785675-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201030022618.785675-1-richard.henderson@linaro.org> References: <20201030022618.785675-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::541; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x541.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" We can use proper widening loads to extend 32-bit inputs, and skip the "widenfn" step. Signed-off-by: Richard Henderson --- target/arm/translate.c | 6 +++ target/arm/translate-neon.c.inc | 66 ++++++++++++++++++--------------- 2 files changed, 43 insertions(+), 29 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 7611c1f0f1..29ea1eb781 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1183,6 +1183,12 @@ static void read_neon_element64(TCGv_i64 dest, int reg, int ele, MemOp memop) long off = neon_element_offset(reg, ele, memop); switch (memop) { + case MO_SL: + tcg_gen_ld32s_i64(dest, cpu_env, off); + break; + case MO_UL: + tcg_gen_ld32u_i64(dest, cpu_env, off); + break; case MO_Q: tcg_gen_ld_i64(dest, cpu_env, off); break; diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc index 1c16c56e7e..59368cb243 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -1788,11 +1788,10 @@ static bool trans_Vimm_1r(DisasContext *s, arg_1reg_imm *a) static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, NeonGenWidenFn *widenfn, NeonGenTwo64OpFn *opfn, - bool src1_wide) + int src1_mop, int src2_mop) { /* 3-regs different lengths, prewidening case (VADDL/VSUBL/VAADW/VSUBW) */ TCGv_i64 rn0_64, rn1_64, rm_64; - TCGv_i32 rm; if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { return false; @@ -1804,12 +1803,12 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, return false; } - if (!widenfn || !opfn) { + if (!opfn) { /* size == 3 case, which is an entirely different insn group */ return false; } - if ((a->vd & 1) || (src1_wide && (a->vn & 1))) { + if ((a->vd & 1) || (src1_mop == MO_Q && (a->vn & 1))) { return false; } @@ -1821,40 +1820,48 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, rn1_64 = tcg_temp_new_i64(); rm_64 = tcg_temp_new_i64(); - if (src1_wide) { - read_neon_element64(rn0_64, a->vn, 0, MO_64); + if (src1_mop >= 0) { + read_neon_element64(rn0_64, a->vn, 0, src1_mop); } else { TCGv_i32 tmp = tcg_temp_new_i32(); read_neon_element32(tmp, a->vn, 0, MO_32); widenfn(rn0_64, tmp); tcg_temp_free_i32(tmp); } - rm = tcg_temp_new_i32(); - read_neon_element32(rm, a->vm, 0, MO_32); + if (src2_mop >= 0) { + read_neon_element64(rm_64, a->vm, 0, src2_mop); + } else { + TCGv_i32 tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vm, 0, MO_32); + widenfn(rm_64, tmp); + tcg_temp_free_i32(tmp); + } - widenfn(rm_64, rm); - tcg_temp_free_i32(rm); opfn(rn0_64, rn0_64, rm_64); /* * Load second pass inputs before storing the first pass result, to * avoid incorrect results if a narrow input overlaps with the result. */ - if (src1_wide) { - read_neon_element64(rn1_64, a->vn, 1, MO_64); + if (src1_mop >= 0) { + read_neon_element64(rn1_64, a->vn, 1, src1_mop); } else { TCGv_i32 tmp = tcg_temp_new_i32(); read_neon_element32(tmp, a->vn, 1, MO_32); widenfn(rn1_64, tmp); tcg_temp_free_i32(tmp); } - rm = tcg_temp_new_i32(); - read_neon_element32(rm, a->vm, 1, MO_32); + if (src2_mop >= 0) { + read_neon_element64(rm_64, a->vm, 1, src2_mop); + } else { + TCGv_i32 tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vm, 1, MO_32); + widenfn(rm_64, tmp); + tcg_temp_free_i32(tmp); + } write_neon_element64(rn0_64, a->vd, 0, MO_64); - widenfn(rm_64, rm); - tcg_temp_free_i32(rm); opfn(rn1_64, rn1_64, rm_64); write_neon_element64(rn1_64, a->vd, 1, MO_64); @@ -1865,14 +1872,13 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, return true; } -#define DO_PREWIDEN(INSN, S, EXT, OP, SRC1WIDE) \ +#define DO_PREWIDEN(INSN, S, OP, SRC1WIDE, SIGN) \ static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a) \ { \ static NeonGenWidenFn * const widenfn[] = { \ gen_helper_neon_widen_##S##8, \ gen_helper_neon_widen_##S##16, \ - tcg_gen_##EXT##_i32_i64, \ - NULL, \ + NULL, NULL, \ }; \ static NeonGenTwo64OpFn * const addfn[] = { \ gen_helper_neon_##OP##l_u16, \ @@ -1880,18 +1886,20 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, tcg_gen_##OP##_i64, \ NULL, \ }; \ - return do_prewiden_3d(s, a, widenfn[a->size], \ - addfn[a->size], SRC1WIDE); \ + int narrow_mop = a->size == MO_32 ? MO_32 | SIGN : -1; \ + return do_prewiden_3d(s, a, widenfn[a->size], addfn[a->size], \ + SRC1WIDE ? MO_Q : narrow_mop, \ + narrow_mop); \ } -DO_PREWIDEN(VADDL_S, s, ext, add, false) -DO_PREWIDEN(VADDL_U, u, extu, add, false) -DO_PREWIDEN(VSUBL_S, s, ext, sub, false) -DO_PREWIDEN(VSUBL_U, u, extu, sub, false) -DO_PREWIDEN(VADDW_S, s, ext, add, true) -DO_PREWIDEN(VADDW_U, u, extu, add, true) -DO_PREWIDEN(VSUBW_S, s, ext, sub, true) -DO_PREWIDEN(VSUBW_U, u, extu, sub, true) +DO_PREWIDEN(VADDL_S, s, add, false, MO_SIGN) +DO_PREWIDEN(VADDL_U, u, add, false, 0) +DO_PREWIDEN(VSUBL_S, s, sub, false, MO_SIGN) +DO_PREWIDEN(VSUBL_U, u, sub, false, 0) +DO_PREWIDEN(VADDW_S, s, add, true, MO_SIGN) +DO_PREWIDEN(VADDW_U, u, add, true, 0) +DO_PREWIDEN(VSUBW_S, s, sub, true, MO_SIGN) +DO_PREWIDEN(VSUBW_U, u, sub, true, 0) static bool do_narrow_3d(DisasContext *s, arg_3diff *a, NeonGenTwo64OpFn *opfn, NeonGenNarrowFn *narrowfn)