From patchwork Wed Oct 28 03:26:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 311525 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B915AC388F9 for ; Wed, 28 Oct 2020 03:31:30 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0F3922242B for ; Wed, 28 Oct 2020 03:31:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="wSsNrr2g" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0F3922242B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33578 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kXcBN-0001J5-1P for qemu-devel@archiver.kernel.org; Tue, 27 Oct 2020 23:31:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:33876) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kXc7E-0005j4-2i for qemu-devel@nongnu.org; Tue, 27 Oct 2020 23:27:12 -0400 Received: from mail-pj1-x1043.google.com ([2607:f8b0:4864:20::1043]:53609) by eggs.gnu.org with 
esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kXc7B-0005mZ-U8 for qemu-devel@nongnu.org; Tue, 27 Oct 2020 23:27:11 -0400 Received: by mail-pj1-x1043.google.com with SMTP id m17so907381pjz.3 for ; Tue, 27 Oct 2020 20:27:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=pKwAEGss07SFEqvTkuenDxl2hW2mchkMUugv+tWaYr8=; b=wSsNrr2g87F9GYXP1g6yxqUuds8kkQcnpNqwgVZ1mPf5NTiSsorzJMXVjY9RK9Y6TU 16hLID3vglwPXRga50sTPvPeViezo4SKGMndHjmuXrICfjnUtENCpH+BT74QCVPD3krM UvxzNrt91fI7oTUY8fn+rbdBVGO/fMatZad54i/ftbaRuF6zpIPbonNdULz9yynHwTVu yQK4LJXOVdtDkb+NKes1ZRz7J1o+vZjiw+wHalSlLtfLuLd1mUWfxTDvMTq63wClC0jo 2/6HUpwUg/+x7XaoueoQ1YyoZaBC8kco2J7d356jZy3bhp42l0SPWIK8dcfw/3rQ3k1A Zflw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=pKwAEGss07SFEqvTkuenDxl2hW2mchkMUugv+tWaYr8=; b=ctWNeNgoVFZitiFjqSGi46S6GvT7fqSAZuhCfGLUUKEkZMQew0BpG5/Xxn6WOqoU6U 4VtJxnPmEjkK4WHboF2/mI/irdhf8IX9cVPWKyz90likqRCZ3tQ1B3nDJvY0hNbKdW1C 3gF9YvxLvXsXCQM8RaV6zM6vtQqBoTMeDYEsidh+qWlBAdG5qSlvaqQHvGhjHz4zvvtU tiNxdkYj1AKco1SBn5E48UuJUaKjyogPEjOl6tHRo9d+tGFkEMUXRQ56fyjLqMcstDEV 2F4Vj/KsVwxFqxM34G139jG1NHajAtBIw+MnXKgwaUAC/n6xe9XoK3WFQsLlCAqFcKeQ o9wg== X-Gm-Message-State: AOAM533hdX99yGseaK2rV7jiuVT7iks58s5BlZqA1PaEfLHd6oqeoo7x MRwb2W6QsQudQYeVVWRpcY+P7hi2k2vNOQ== X-Google-Smtp-Source: ABdhPJxk+1dZne93FjSdoxRBGDFP+uXQCCPsdSpOWh+nmXXimN8vrD3wk5F5vb1azLLTO9iVwpez6g== X-Received: by 2002:a17:90a:ab86:: with SMTP id n6mr4972954pjq.82.1603855627591; Tue, 27 Oct 2020 20:27:07 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id d26sm3764413pfo.82.2020.10.27.20.27.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Oct 
2020 20:27:06 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 01/11] target/arm: Introduce neon_full_reg_offset
Date: Tue, 27 Oct 2020 20:26:53 -0700
Message-Id: <20201028032703.201526-2-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org>
References: <20201028032703.201526-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::1043; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1043.google.com
X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know.
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: peter.maydell@linaro.org
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"

This function makes it clear that we're talking about the whole
register, and not the 32-bit piece at index 0. This fixes a bug
when running on a big-endian host.

Signed-off-by: Richard Henderson
---
 target/arm/translate.c          |  8 ++++++
 target/arm/translate-neon.c.inc | 44 ++++++++++++++++-----------------
 target/arm/translate-vfp.c.inc  |  2 +-
 3 files changed, 31 insertions(+), 23 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 38371db540..1b61e50f9c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1094,6 +1094,14 @@ static inline void gen_hlt(DisasContext *s, int imm)
     unallocated_encoding(s);
 }
 
+/*
+ * Return the offset of a "full" NEON Dreg.
+ */
+static long neon_full_reg_offset(unsigned reg)
+{
+    return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]);
+}
+
 static inline long vfp_reg_offset(bool dp, unsigned reg)
 {
     if (dp) {
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index 4d1a292981..e259e24c05 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -76,7 +76,7 @@ neon_element_offset(int reg, int element, MemOp size)
         ofs ^= 8 - element_size;
     }
 #endif
-    return neon_reg_offset(reg, 0) + ofs;
+    return neon_full_reg_offset(reg) + ofs;
 }
 
 static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop)
@@ -585,12 +585,12 @@ static bool trans_VLD_all_lanes(DisasContext *s, arg_VLD_all_lanes *a)
              * We cannot write 16 bytes at once because the
              * destination is unaligned.
              */
-            tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
+            tcg_gen_gvec_dup_i32(size, neon_full_reg_offset(vd),
                                  8, 8, tmp);
-            tcg_gen_gvec_mov(0, neon_reg_offset(vd + 1, 0),
-                             neon_reg_offset(vd, 0), 8, 8);
+            tcg_gen_gvec_mov(0, neon_full_reg_offset(vd + 1),
+                             neon_full_reg_offset(vd), 8, 8);
         } else {
-            tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
+            tcg_gen_gvec_dup_i32(size, neon_full_reg_offset(vd),
                                  vec_size, vec_size, tmp);
         }
         tcg_gen_addi_i32(addr, addr, 1 << size);
@@ -691,9 +691,9 @@ static bool trans_VLDST_single(DisasContext *s, arg_VLDST_single *a)
 static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
 {
     int vec_size = a->q ? 16 : 8;
-    int rd_ofs = neon_reg_offset(a->vd, 0);
-    int rn_ofs = neon_reg_offset(a->vn, 0);
-    int rm_ofs = neon_reg_offset(a->vm, 0);
+    int rd_ofs = neon_full_reg_offset(a->vd);
+    int rn_ofs = neon_full_reg_offset(a->vn);
+    int rm_ofs = neon_full_reg_offset(a->vm);
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return false;
@@ -1177,8 +1177,8 @@ static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
 {
     /* Handle a 2-reg-shift insn which can be vectorized. */
     int vec_size = a->q ? 16 : 8;
-    int rd_ofs = neon_reg_offset(a->vd, 0);
-    int rm_ofs = neon_reg_offset(a->vm, 0);
+    int rd_ofs = neon_full_reg_offset(a->vd);
+    int rm_ofs = neon_full_reg_offset(a->vm);
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return false;
@@ -1620,8 +1620,8 @@ static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a,
 {
     /* FP operations in 2-reg-and-shift group */
     int vec_size = a->q ? 16 : 8;
-    int rd_ofs = neon_reg_offset(a->vd, 0);
-    int rm_ofs = neon_reg_offset(a->vm, 0);
+    int rd_ofs = neon_full_reg_offset(a->vd);
+    int rm_ofs = neon_full_reg_offset(a->vm);
     TCGv_ptr fpst;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
@@ -1756,7 +1756,7 @@ static bool do_1reg_imm(DisasContext *s, arg_1reg_imm *a,
         return true;
     }
 
-    reg_ofs = neon_reg_offset(a->vd, 0);
+    reg_ofs = neon_full_reg_offset(a->vd);
     vec_size = a->q ? 16 : 8;
     imm = asimd_imm_const(a->imm, a->cmode, a->op);
 
@@ -2300,9 +2300,9 @@ static bool trans_VMULL_P_3d(DisasContext *s, arg_3diff *a)
         return true;
     }
 
-    tcg_gen_gvec_3_ool(neon_reg_offset(a->vd, 0),
-                       neon_reg_offset(a->vn, 0),
-                       neon_reg_offset(a->vm, 0),
+    tcg_gen_gvec_3_ool(neon_full_reg_offset(a->vd),
+                       neon_full_reg_offset(a->vn),
+                       neon_full_reg_offset(a->vm),
                        16, 16, 0, fn_gvec);
     return true;
 }
@@ -2445,8 +2445,8 @@ static bool do_2scalar_fp_vec(DisasContext *s, arg_2scalar *a,
 {
     /* Two registers and a scalar, using gvec */
     int vec_size = a->q ? 16 : 8;
-    int rd_ofs = neon_reg_offset(a->vd, 0);
-    int rn_ofs = neon_reg_offset(a->vn, 0);
+    int rd_ofs = neon_full_reg_offset(a->vd);
+    int rn_ofs = neon_full_reg_offset(a->vn);
     int rm_ofs;
     int idx;
     TCGv_ptr fpstatus;
@@ -2477,7 +2477,7 @@ static bool do_2scalar_fp_vec(DisasContext *s, arg_2scalar *a,
     /* a->vm is M:Vm, which encodes both register and index */
     idx = extract32(a->vm, a->size + 2, 2);
     a->vm = extract32(a->vm, 0, a->size + 2);
-    rm_ofs = neon_reg_offset(a->vm, 0);
+    rm_ofs = neon_full_reg_offset(a->vm);
 
     fpstatus = fpstatus_ptr(a->size == 1 ? FPST_STD_F16 : FPST_STD);
     tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, fpstatus,
@@ -2923,7 +2923,7 @@ static bool trans_VDUP_scalar(DisasContext *s, arg_VDUP_scalar *a)
         return true;
     }
 
-    tcg_gen_gvec_dup_mem(a->size, neon_reg_offset(a->vd, 0),
+    tcg_gen_gvec_dup_mem(a->size, neon_full_reg_offset(a->vd),
                          neon_element_offset(a->vm, a->index, a->size),
                          a->q ? 16 : 8, a->q ? 16 : 8);
     return true;
@@ -3412,8 +3412,8 @@ static bool trans_VCVT_F32_F16(DisasContext *s, arg_2misc *a)
 static bool do_2misc_vec(DisasContext *s, arg_2misc *a, GVecGen2Fn *fn)
 {
     int vec_size = a->q ? 16 : 8;
-    int rd_ofs = neon_reg_offset(a->vd, 0);
-    int rm_ofs = neon_reg_offset(a->vm, 0);
+    int rd_ofs = neon_full_reg_offset(a->vd);
+    int rm_ofs = neon_full_reg_offset(a->vm);
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return false;
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
index a7ed9bc81b..368bae0a73 100644
--- a/target/arm/translate-vfp.c.inc
+++ b/target/arm/translate-vfp.c.inc
@@ -653,7 +653,7 @@ static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
     }
 
     tmp = load_reg(s, a->rt);
-    tcg_gen_gvec_dup_i32(size, neon_reg_offset(a->vn, 0),
+    tcg_gen_gvec_dup_i32(size, neon_full_reg_offset(a->vn),
                          vec_size, vec_size, tmp);
     tcg_temp_free_i32(tmp);

From patchwork Wed Oct 28 03:26:54 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 311526
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH 02/11] target/arm: Move neon_element_offset to translate.c
Date: Tue, 27 Oct 2020 20:26:54 -0700
Message-Id: <20201028032703.201526-3-richard.henderson@linaro.org>
In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org>
References: <20201028032703.201526-1-richard.henderson@linaro.org>

This will shortly have users outside of translate-neon.c.inc.

Signed-off-by: Richard Henderson
---
 target/arm/translate.c          | 20 ++++++++++++++++++++
 target/arm/translate-neon.c.inc | 19 -------------------
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 1b61e50f9c..bf0b5cac61 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1102,6 +1102,26 @@ static long neon_full_reg_offset(unsigned reg)
     return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]);
 }
 
+/*
+ * Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
+ * where 0 is the least significant end of the register.
+ */
+static long neon_element_offset(int reg, int element, MemOp size)
+{
+    int element_size = 1 << size;
+    int ofs = element * element_size;
+#ifdef HOST_WORDS_BIGENDIAN
+    /*
+     * Calculate the offset assuming fully little-endian,
+     * then XOR to account for the order of the 8-byte units.
+     */
+    if (element_size < 8) {
+        ofs ^= 8 - element_size;
+    }
+#endif
+    return neon_full_reg_offset(reg) + ofs;
+}
+
 static inline long vfp_reg_offset(bool dp, unsigned reg)
 {
     if (dp) {
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index e259e24c05..96ab2248fc 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -60,25 +60,6 @@ static inline int neon_3same_fp_size(DisasContext *s, int x)
 #include "decode-neon-ls.c.inc"
 #include "decode-neon-shared.c.inc"
 
-/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
- * where 0 is the least significant end of the register.
- */
-static inline long
-neon_element_offset(int reg, int element, MemOp size)
-{
-    int element_size = 1 << size;
-    int ofs = element * element_size;
-#ifdef HOST_WORDS_BIGENDIAN
-    /* Calculate the offset assuming fully little-endian,
-     * then XOR to account for the order of the 8-byte units.
-     */
-    if (element_size < 8) {
-        ofs ^= 8 - element_size;
-    }
-#endif
-    return neon_full_reg_offset(reg) + ofs;
-}
-
 static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop)
 {
     long offset = neon_element_offset(reg, ele, mop & MO_SIZE);

From patchwork Wed Oct 28 03:26:55 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 301744
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH 03/11] target/arm: Use neon_element_offset in neon_load/store_reg
Date: Tue, 27 Oct 2020 20:26:55 -0700
Message-Id: <20201028032703.201526-4-richard.henderson@linaro.org>
In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org>
References: <20201028032703.201526-1-richard.henderson@linaro.org>

These are the only users of neon_reg_offset, so remove that.

Signed-off-by: Richard Henderson
---
 target/arm/translate.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index bf0b5cac61..88a926d1df 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1137,26 +1137,16 @@ static inline long vfp_reg_offset(bool dp, unsigned reg)
     }
 }
 
-/* Return the offset of a 32-bit piece of a NEON register.
-   zero is the least significant end of the register. */
-static inline long
-neon_reg_offset (int reg, int n)
-{
-    int sreg;
-    sreg = reg * 2 + n;
-    return vfp_reg_offset(0, sreg);
-}
-
 static TCGv_i32 neon_load_reg(int reg, int pass)
 {
     TCGv_i32 tmp = tcg_temp_new_i32();
-    tcg_gen_ld_i32(tmp, cpu_env, neon_reg_offset(reg, pass));
+    tcg_gen_ld_i32(tmp, cpu_env, neon_element_offset(reg, pass, MO_32));
     return tmp;
 }
 
 static void neon_store_reg(int reg, int pass, TCGv_i32 var)
 {
-    tcg_gen_st_i32(var, cpu_env, neon_reg_offset(reg, pass));
+    tcg_gen_st_i32(var, cpu_env, neon_element_offset(reg, pass, MO_32));
     tcg_temp_free_i32(var);
 }

From patchwork Wed Oct 28 03:26:56 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 301743
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH 04/11] target/arm: Use neon_element_offset in vfp_reg_offset
Date: Tue, 27 Oct 2020 20:26:56 -0700
Message-Id: <20201028032703.201526-5-richard.henderson@linaro.org>
In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org>
References: <20201028032703.201526-1-richard.henderson@linaro.org>

This seems a bit more readable than using offsetof CPU_DoubleU.

Signed-off-by: Richard Henderson
---
 target/arm/translate.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 88a926d1df..88ded4ac2c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1122,18 +1122,13 @@ static long neon_element_offset(int reg, int element, MemOp size)
     return neon_full_reg_offset(reg) + ofs;
 }
 
-static inline long vfp_reg_offset(bool dp, unsigned reg)
+/* Return the offset of a VFP Dreg (dp = true) or VFP Sreg (dp = false). */
+static long vfp_reg_offset(bool dp, unsigned reg)
 {
     if (dp) {
-        return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]);
+        return neon_element_offset(reg, 0, MO_64);
     } else {
-        long ofs = offsetof(CPUARMState, vfp.zregs[reg >> 2].d[(reg >> 1) & 1]);
-        if (reg & 1) {
-            ofs += offsetof(CPU_DoubleU, l.upper);
-        } else {
-            ofs += offsetof(CPU_DoubleU, l.lower);
-        }
-        return ofs;
+        return neon_element_offset(reg >> 1, reg & 1, MO_32);
     }
 }

From patchwork Wed Oct 28 03:26:57 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 301742
4.90_1) (envelope-from ) id 1kXcG0-0006oj-0H for qemu-devel@archiver.kernel.org; Tue, 27 Oct 2020 23:36:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:33946) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kXc7M-0005vn-TH for qemu-devel@nongnu.org; Tue, 27 Oct 2020 23:27:20 -0400 Received: from mail-pj1-x1031.google.com ([2607:f8b0:4864:20::1031]:55637) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kXc7I-0005oy-W7 for qemu-devel@nongnu.org; Tue, 27 Oct 2020 23:27:20 -0400 Received: by mail-pj1-x1031.google.com with SMTP id c17so1838403pjo.5 for ; Tue, 27 Oct 2020 20:27:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=O09AiNAMHQ639q4rE7lgjXhBjUkXojtW1+aHXGPZyuc=; b=GbKJ/Z4NzCJloKeUsg2L7JzWRl9T5tdgdEbtOAz8F/8fXWwU1dG7HvuEGDMc2lMlJL QXgqE0FeHyTdSXfyluwoY+xnaiHBtKcofQzdOxdpWn4f67m7qc6DKh9XWRnT1CmasMyB 0mE+ty4z4DE4f4PMWZJH8tyuLzf72Oa0JquWi42445bWlmw1vMQDODnAGu0whSje3FSL prP02O3eAhE12Ae/2LeMuueUJm08cTIrWvzLyaWha1z1W5DGy1/Q43sKWKJOomfZxynl /56kwS75QZfaZ9kT6tfvmbFduDv/bkrPOb0sxfG6MduQn13E9ePAJp1Gzh7/Q+9hDpAg JFMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=O09AiNAMHQ639q4rE7lgjXhBjUkXojtW1+aHXGPZyuc=; b=nmcNND0xXYWiee70wIcIMEl5CS+Q8LkDphhFSZAbMM5i5e+P0RSp5+FG4E4qaFdEnW 7oAjXG7VPsfRbR16ChnNDQk8uli3w7pR128uDmWBh1LdQFgAoR0EyPBpDpaktR/4elZ5 Hs9sqV14r8zvXqL1Tc6NwRY1ISCEELCjxajmVgIlhghjuFRo90vUFXVHujOg19hgKMs3 Grqg21AuW62gq2MqkLBu9OhJ6kNnN54UmA7VwkNzmADyDWrGaZBWRO+DLpfy/q/sfBMI OuO2Dfw4ciM/EIii0OajGShrV1ESQlNvoYgUQsIGNv0bJbpKsNeehcHiBtQFefXas72H e1mQ== X-Gm-Message-State: AOAM533paiGmSBl5Y0KcyV2Q6hPBgsBN2h8RQUFTLDfOiU+oys83D3b+ 
a11l8D7YzrT5wbR7HVZ6JH2HjKodQSyCiA== X-Google-Smtp-Source: ABdhPJwoAmQWeAq91/P5eGgJ2SjtG8y0ncSjuw8g2xM8nbo6/ckS34lBj/i2Bo/+Auec0xNHcYCOnA== X-Received: by 2002:a17:902:7606:b029:d3:d2dd:2b3b with SMTP id k6-20020a1709027606b02900d3d2dd2b3bmr5453695pll.67.1603855634593; Tue, 27 Oct 2020 20:27:14 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id d26sm3764413pfo.82.2020.10.27.20.27.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Oct 2020 20:27:13 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 05/11] target/arm: Add read/write_neon_element32 Date: Tue, 27 Oct 2020 20:26:57 -0700 Message-Id: <20201028032703.201526-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org> References: <20201028032703.201526-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1031; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1031.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Model these off the aa64 read/write_vec_element functions. Use them within translate-neon.c.inc. The new functions do not allocate or free temps, so this rearranges the calling code a bit.
Signed-off-by: Richard Henderson --- target/arm/translate.c | 26 ++++ target/arm/translate-neon.c.inc | 248 +++++++++++++++++++------------- 2 files changed, 177 insertions(+), 97 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 88ded4ac2c..0ed9eab0b0 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1165,6 +1165,32 @@ static inline void neon_store_reg32(TCGv_i32 var, int reg) tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg)); } +static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp size) +{ + long off = neon_element_offset(reg, ele, size); + + switch (size) { + case MO_32: + tcg_gen_ld_i32(dest, cpu_env, off); + break; + default: + g_assert_not_reached(); + } +} + +static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp size) +{ + long off = neon_element_offset(reg, ele, size); + + switch (size) { + case MO_32: + tcg_gen_st_i32(src, cpu_env, off); + break; + default: + g_assert_not_reached(); + } +} + static TCGv_ptr vfp_reg_ptr(bool dp, int reg) { TCGv_ptr ret = tcg_temp_new_ptr(); diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc index 96ab2248fc..3f8a0bb88b 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -956,18 +956,24 @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn) * early. Since Q is 0 there are always just two passes, so instead * of a complicated loop over each pass we just unroll. 
*/ - tmp = neon_load_reg(a->vn, 0); - tmp2 = neon_load_reg(a->vn, 1); + tmp = tcg_temp_new_i32(); + tmp2 = tcg_temp_new_i32(); + tmp3 = tcg_temp_new_i32(); + + read_neon_element32(tmp, a->vn, 0, MO_32); + read_neon_element32(tmp2, a->vn, 1, MO_32); fn(tmp, tmp, tmp2); - tcg_temp_free_i32(tmp2); - tmp3 = neon_load_reg(a->vm, 0); - tmp2 = neon_load_reg(a->vm, 1); + read_neon_element32(tmp3, a->vm, 0, MO_32); + read_neon_element32(tmp2, a->vm, 1, MO_32); fn(tmp3, tmp3, tmp2); - tcg_temp_free_i32(tmp2); - neon_store_reg(a->vd, 0, tmp); - neon_store_reg(a->vd, 1, tmp3); + write_neon_element32(tmp, a->vd, 0, MO_32); + write_neon_element32(tmp3, a->vd, 1, MO_32); + + tcg_temp_free_i32(tmp); + tcg_temp_free_i32(tmp2); + tcg_temp_free_i32(tmp3); return true; } @@ -1275,7 +1281,7 @@ static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a, * 2-reg-and-shift operations, size < 3 case, where the * helper needs to be passed cpu_env. */ - TCGv_i32 constimm; + TCGv_i32 constimm, tmp; int pass; if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { @@ -1301,12 +1307,14 @@ static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a, * by immediate using the variable shift operations. */ constimm = tcg_const_i32(dup_const(a->size, a->shift)); + tmp = tcg_temp_new_i32(); for (pass = 0; pass < (a->q ? 
4 : 2); pass++) { - TCGv_i32 tmp = neon_load_reg(a->vm, pass); + read_neon_element32(tmp, a->vm, pass, MO_32); fn(tmp, cpu_env, tmp, constimm); - neon_store_reg(a->vd, pass, tmp); + write_neon_element32(tmp, a->vd, pass, MO_32); } + tcg_temp_free_i32(tmp); tcg_temp_free_i32(constimm); return true; } @@ -1364,21 +1372,21 @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a, constimm = tcg_const_i64(-a->shift); rm1 = tcg_temp_new_i64(); rm2 = tcg_temp_new_i64(); + rd = tcg_temp_new_i32(); /* Load both inputs first to avoid potential overwrite if rm == rd */ neon_load_reg64(rm1, a->vm); neon_load_reg64(rm2, a->vm + 1); shiftfn(rm1, rm1, constimm); - rd = tcg_temp_new_i32(); narrowfn(rd, cpu_env, rm1); - neon_store_reg(a->vd, 0, rd); + write_neon_element32(rd, a->vd, 0, MO_32); shiftfn(rm2, rm2, constimm); - rd = tcg_temp_new_i32(); narrowfn(rd, cpu_env, rm2); - neon_store_reg(a->vd, 1, rd); + write_neon_element32(rd, a->vd, 1, MO_32); + tcg_temp_free_i32(rd); tcg_temp_free_i64(rm1); tcg_temp_free_i64(rm2); tcg_temp_free_i64(constimm); @@ -1428,10 +1436,14 @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a, constimm = tcg_const_i32(imm); /* Load all inputs first to avoid potential overwrite */ - rm1 = neon_load_reg(a->vm, 0); - rm2 = neon_load_reg(a->vm, 1); - rm3 = neon_load_reg(a->vm + 1, 0); - rm4 = neon_load_reg(a->vm + 1, 1); + rm1 = tcg_temp_new_i32(); + rm2 = tcg_temp_new_i32(); + rm3 = tcg_temp_new_i32(); + rm4 = tcg_temp_new_i32(); + read_neon_element32(rm1, a->vm, 0, MO_32); + read_neon_element32(rm2, a->vm, 1, MO_32); + read_neon_element32(rm3, a->vm, 2, MO_32); + read_neon_element32(rm4, a->vm, 3, MO_32); rtmp = tcg_temp_new_i64(); shiftfn(rm1, rm1, constimm); @@ -1441,7 +1453,8 @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a, tcg_temp_free_i32(rm2); narrowfn(rm1, cpu_env, rtmp); - neon_store_reg(a->vd, 0, rm1); + write_neon_element32(rm1, a->vd, 0, MO_32); + tcg_temp_free_i32(rm1); shiftfn(rm3, rm3, 
constimm); shiftfn(rm4, rm4, constimm); @@ -1452,7 +1465,8 @@ static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a, narrowfn(rm3, cpu_env, rtmp); tcg_temp_free_i64(rtmp); - neon_store_reg(a->vd, 1, rm3); + write_neon_element32(rm3, a->vd, 1, MO_32); + tcg_temp_free_i32(rm3); return true; } @@ -1553,8 +1567,10 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a, widen_mask = dup_const(a->size + 1, widen_mask); } - rm0 = neon_load_reg(a->vm, 0); - rm1 = neon_load_reg(a->vm, 1); + rm0 = tcg_temp_new_i32(); + rm1 = tcg_temp_new_i32(); + read_neon_element32(rm0, a->vm, 0, MO_32); + read_neon_element32(rm1, a->vm, 1, MO_32); tmp = tcg_temp_new_i64(); widenfn(tmp, rm0); @@ -1808,11 +1824,13 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, if (src1_wide) { neon_load_reg64(rn0_64, a->vn); } else { - TCGv_i32 tmp = neon_load_reg(a->vn, 0); + TCGv_i32 tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vn, 0, MO_32); widenfn(rn0_64, tmp); tcg_temp_free_i32(tmp); } - rm = neon_load_reg(a->vm, 0); + rm = tcg_temp_new_i32(); + read_neon_element32(rm, a->vm, 0, MO_32); widenfn(rm_64, rm); tcg_temp_free_i32(rm); @@ -1825,11 +1843,13 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, if (src1_wide) { neon_load_reg64(rn1_64, a->vn + 1); } else { - TCGv_i32 tmp = neon_load_reg(a->vn, 1); + TCGv_i32 tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vn, 1, MO_32); widenfn(rn1_64, tmp); tcg_temp_free_i32(tmp); } - rm = neon_load_reg(a->vm, 1); + rm = tcg_temp_new_i32(); + read_neon_element32(rm, a->vm, 1, MO_32); neon_store_reg64(rn0_64, a->vd); @@ -1922,9 +1942,11 @@ static bool do_narrow_3d(DisasContext *s, arg_3diff *a, narrowfn(rd1, rn_64); - neon_store_reg(a->vd, 0, rd0); - neon_store_reg(a->vd, 1, rd1); + write_neon_element32(rd0, a->vd, 0, MO_32); + write_neon_element32(rd1, a->vd, 1, MO_32); + tcg_temp_free_i32(rd0); + tcg_temp_free_i32(rd1); tcg_temp_free_i64(rn_64); tcg_temp_free_i64(rm_64); @@ -1999,14 +2021,14 @@ 
static bool do_long_3d(DisasContext *s, arg_3diff *a, rd0 = tcg_temp_new_i64(); rd1 = tcg_temp_new_i64(); - rn = neon_load_reg(a->vn, 0); - rm = neon_load_reg(a->vm, 0); + rn = tcg_temp_new_i32(); + rm = tcg_temp_new_i32(); + read_neon_element32(rn, a->vn, 0, MO_32); + read_neon_element32(rm, a->vm, 0, MO_32); opfn(rd0, rn, rm); - tcg_temp_free_i32(rn); - tcg_temp_free_i32(rm); - rn = neon_load_reg(a->vn, 1); - rm = neon_load_reg(a->vm, 1); + read_neon_element32(rn, a->vn, 1, MO_32); + read_neon_element32(rm, a->vm, 1, MO_32); opfn(rd1, rn, rm); tcg_temp_free_i32(rn); tcg_temp_free_i32(rm); @@ -2308,16 +2330,16 @@ static void gen_neon_dup_high16(TCGv_i32 var) static inline TCGv_i32 neon_get_scalar(int size, int reg) { - TCGv_i32 tmp; - if (size == 1) { - tmp = neon_load_reg(reg & 7, reg >> 4); + TCGv_i32 tmp = tcg_temp_new_i32(); + if (size == MO_16) { + read_neon_element32(tmp, reg & 7, reg >> 4, MO_32); if (reg & 8) { gen_neon_dup_high16(tmp); } else { gen_neon_dup_low16(tmp); } } else { - tmp = neon_load_reg(reg & 15, reg >> 4); + read_neon_element32(tmp, reg & 15, reg >> 4, MO_32); } return tmp; } @@ -2331,7 +2353,7 @@ static bool do_2scalar(DisasContext *s, arg_2scalar *a, * perform an accumulation operation of that result into the * destination. */ - TCGv_i32 scalar; + TCGv_i32 scalar, tmp; int pass; if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { @@ -2358,17 +2380,20 @@ static bool do_2scalar(DisasContext *s, arg_2scalar *a, } scalar = neon_get_scalar(a->size, a->vm); + tmp = tcg_temp_new_i32(); for (pass = 0; pass < (a->q ? 
4 : 2); pass++) { - TCGv_i32 tmp = neon_load_reg(a->vn, pass); + read_neon_element32(tmp, a->vn, pass, MO_32); opfn(tmp, tmp, scalar); if (accfn) { - TCGv_i32 rd = neon_load_reg(a->vd, pass); + TCGv_i32 rd = tcg_temp_new_i32(); + read_neon_element32(rd, a->vd, pass, MO_32); accfn(tmp, rd, tmp); tcg_temp_free_i32(rd); } - neon_store_reg(a->vd, pass, tmp); + write_neon_element32(tmp, a->vd, pass, MO_32); } + tcg_temp_free_i32(tmp); tcg_temp_free_i32(scalar); return true; } @@ -2523,7 +2548,7 @@ static bool do_vqrdmlah_2sc(DisasContext *s, arg_2scalar *a, * performs a kind of fused op-then-accumulate using a helper * function that takes all of rd, rn and the scalar at once. */ - TCGv_i32 scalar; + TCGv_i32 scalar, rn, rd; int pass; if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { @@ -2554,14 +2579,17 @@ static bool do_vqrdmlah_2sc(DisasContext *s, arg_2scalar *a, } scalar = neon_get_scalar(a->size, a->vm); + rn = tcg_temp_new_i32(); + rd = tcg_temp_new_i32(); for (pass = 0; pass < (a->q ? 4 : 2); pass++) { - TCGv_i32 rn = neon_load_reg(a->vn, pass); - TCGv_i32 rd = neon_load_reg(a->vd, pass); + read_neon_element32(rn, a->vn, pass, MO_32); + read_neon_element32(rd, a->vd, pass, MO_32); opfn(rd, cpu_env, rn, scalar, rd); - tcg_temp_free_i32(rn); - neon_store_reg(a->vd, pass, rd); + write_neon_element32(rd, a->vd, pass, MO_32); } + tcg_temp_free_i32(rn); + tcg_temp_free_i32(rd); tcg_temp_free_i32(scalar); return true; @@ -2628,12 +2656,12 @@ static bool do_2scalar_long(DisasContext *s, arg_2scalar *a, scalar = neon_get_scalar(a->size, a->vm); /* Load all inputs before writing any outputs, in case of overlap */ - rn = neon_load_reg(a->vn, 0); + rn = tcg_temp_new_i32(); + read_neon_element32(rn, a->vn, 0, MO_32); rn0_64 = tcg_temp_new_i64(); opfn(rn0_64, rn, scalar); - tcg_temp_free_i32(rn); - rn = neon_load_reg(a->vn, 1); + read_neon_element32(rn, a->vn, 1, MO_32); rn1_64 = tcg_temp_new_i64(); opfn(rn1_64, rn, scalar); tcg_temp_free_i32(rn); @@ -2857,30 +2885,34 @@ static bool 
trans_VTBL(DisasContext *s, arg_VTBL *a) return false; } n <<= 3; + tmp = tcg_temp_new_i32(); if (a->op) { - tmp = neon_load_reg(a->vd, 0); + read_neon_element32(tmp, a->vd, 0, MO_32); } else { - tmp = tcg_temp_new_i32(); tcg_gen_movi_i32(tmp, 0); } - tmp2 = neon_load_reg(a->vm, 0); + tmp2 = tcg_temp_new_i32(); + read_neon_element32(tmp2, a->vm, 0, MO_32); ptr1 = vfp_reg_ptr(true, a->vn); tmp4 = tcg_const_i32(n); gen_helper_neon_tbl(tmp2, tmp2, tmp, ptr1, tmp4); - tcg_temp_free_i32(tmp); + if (a->op) { - tmp = neon_load_reg(a->vd, 1); + read_neon_element32(tmp, a->vd, 1, MO_32); } else { - tmp = tcg_temp_new_i32(); tcg_gen_movi_i32(tmp, 0); } - tmp3 = neon_load_reg(a->vm, 1); + tmp3 = tcg_temp_new_i32(); + read_neon_element32(tmp3, a->vm, 1, MO_32); gen_helper_neon_tbl(tmp3, tmp3, tmp, ptr1, tmp4); + tcg_temp_free_i32(tmp); tcg_temp_free_i32(tmp4); tcg_temp_free_ptr(ptr1); - neon_store_reg(a->vd, 0, tmp2); - neon_store_reg(a->vd, 1, tmp3); - tcg_temp_free_i32(tmp); + + write_neon_element32(tmp2, a->vd, 0, MO_32); + write_neon_element32(tmp3, a->vd, 1, MO_32); + tcg_temp_free_i32(tmp2); + tcg_temp_free_i32(tmp3); return true; } @@ -2939,8 +2971,11 @@ static bool trans_VREV64(DisasContext *s, arg_VREV64 *a) for (pass = 0; pass < (a->q ? 
2 : 1); pass++) { TCGv_i32 tmp[2]; + tmp[0] = tcg_temp_new_i32(); + tmp[1] = tcg_temp_new_i32(); + for (half = 0; half < 2; half++) { - tmp[half] = neon_load_reg(a->vm, pass * 2 + half); + read_neon_element32(tmp[half], a->vm, pass * 2 + half, MO_32); switch (a->size) { case 0: tcg_gen_bswap32_i32(tmp[half], tmp[half]); @@ -2954,8 +2989,8 @@ static bool trans_VREV64(DisasContext *s, arg_VREV64 *a) g_assert_not_reached(); } } - neon_store_reg(a->vd, pass * 2, tmp[1]); - neon_store_reg(a->vd, pass * 2 + 1, tmp[0]); + write_neon_element32(tmp[1], a->vd, pass * 2, MO_32); + write_neon_element32(tmp[0], a->vd, pass * 2 + 1, MO_32); } return true; } @@ -3001,12 +3036,14 @@ static bool do_2misc_pairwise(DisasContext *s, arg_2misc *a, rm0_64 = tcg_temp_new_i64(); rm1_64 = tcg_temp_new_i64(); rd_64 = tcg_temp_new_i64(); - tmp = neon_load_reg(a->vm, pass * 2); + + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vm, pass * 2, MO_32); widenfn(rm0_64, tmp); - tcg_temp_free_i32(tmp); - tmp = neon_load_reg(a->vm, pass * 2 + 1); + read_neon_element32(tmp, a->vm, pass * 2 + 1, MO_32); widenfn(rm1_64, tmp); tcg_temp_free_i32(tmp); + opfn(rd_64, rm0_64, rm1_64); tcg_temp_free_i64(rm0_64); tcg_temp_free_i64(rm1_64); @@ -3219,8 +3256,10 @@ static bool do_vmovn(DisasContext *s, arg_2misc *a, narrowfn(rd0, cpu_env, rm); neon_load_reg64(rm, a->vm + 1); narrowfn(rd1, cpu_env, rm); - neon_store_reg(a->vd, 0, rd0); - neon_store_reg(a->vd, 1, rd1); + write_neon_element32(rd0, a->vd, 0, MO_32); + write_neon_element32(rd1, a->vd, 1, MO_32); + tcg_temp_free_i32(rd0); + tcg_temp_free_i32(rd1); tcg_temp_free_i64(rm); return true; } @@ -3277,9 +3316,11 @@ static bool trans_VSHLL(DisasContext *s, arg_2misc *a) } rd = tcg_temp_new_i64(); + rm0 = tcg_temp_new_i32(); + rm1 = tcg_temp_new_i32(); - rm0 = neon_load_reg(a->vm, 0); - rm1 = neon_load_reg(a->vm, 1); + read_neon_element32(rm0, a->vm, 0, MO_32); + read_neon_element32(rm1, a->vm, 1, MO_32); widenfn(rd, rm0); tcg_gen_shli_i64(rd, rd, 8 << 
a->size); @@ -3320,21 +3361,25 @@ static bool trans_VCVT_F16_F32(DisasContext *s, arg_2misc *a) fpst = fpstatus_ptr(FPST_STD); ahp = get_ahp_flag(); - tmp = neon_load_reg(a->vm, 0); + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vm, 0, MO_32); gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp); - tmp2 = neon_load_reg(a->vm, 1); + tmp2 = tcg_temp_new_i32(); + read_neon_element32(tmp2, a->vm, 1, MO_32); gen_helper_vfp_fcvt_f32_to_f16(tmp2, tmp2, fpst, ahp); tcg_gen_shli_i32(tmp2, tmp2, 16); tcg_gen_or_i32(tmp2, tmp2, tmp); - tcg_temp_free_i32(tmp); - tmp = neon_load_reg(a->vm, 2); + read_neon_element32(tmp, a->vm, 2, MO_32); gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp); - tmp3 = neon_load_reg(a->vm, 3); - neon_store_reg(a->vd, 0, tmp2); + tmp3 = tcg_temp_new_i32(); + read_neon_element32(tmp3, a->vm, 3, MO_32); + write_neon_element32(tmp2, a->vd, 0, MO_32); + tcg_temp_free_i32(tmp2); gen_helper_vfp_fcvt_f32_to_f16(tmp3, tmp3, fpst, ahp); tcg_gen_shli_i32(tmp3, tmp3, 16); tcg_gen_or_i32(tmp3, tmp3, tmp); - neon_store_reg(a->vd, 1, tmp3); + write_neon_element32(tmp3, a->vd, 1, MO_32); + tcg_temp_free_i32(tmp3); tcg_temp_free_i32(tmp); tcg_temp_free_i32(ahp); tcg_temp_free_ptr(fpst); @@ -3369,21 +3414,25 @@ static bool trans_VCVT_F32_F16(DisasContext *s, arg_2misc *a) fpst = fpstatus_ptr(FPST_STD); ahp = get_ahp_flag(); tmp3 = tcg_temp_new_i32(); - tmp = neon_load_reg(a->vm, 0); - tmp2 = neon_load_reg(a->vm, 1); + tmp2 = tcg_temp_new_i32(); + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vm, 0, MO_32); + read_neon_element32(tmp2, a->vm, 1, MO_32); tcg_gen_ext16u_i32(tmp3, tmp); gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp); - neon_store_reg(a->vd, 0, tmp3); + write_neon_element32(tmp3, a->vd, 0, MO_32); tcg_gen_shri_i32(tmp, tmp, 16); gen_helper_vfp_fcvt_f16_to_f32(tmp, tmp, fpst, ahp); - neon_store_reg(a->vd, 1, tmp); - tmp3 = tcg_temp_new_i32(); + write_neon_element32(tmp, a->vd, 1, MO_32); + tcg_temp_free_i32(tmp); 
tcg_gen_ext16u_i32(tmp3, tmp2); gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp); - neon_store_reg(a->vd, 2, tmp3); + write_neon_element32(tmp3, a->vd, 2, MO_32); + tcg_temp_free_i32(tmp3); tcg_gen_shri_i32(tmp2, tmp2, 16); gen_helper_vfp_fcvt_f16_to_f32(tmp2, tmp2, fpst, ahp); - neon_store_reg(a->vd, 3, tmp2); + write_neon_element32(tmp2, a->vd, 3, MO_32); + tcg_temp_free_i32(tmp2); tcg_temp_free_i32(ahp); tcg_temp_free_ptr(fpst); @@ -3489,6 +3538,7 @@ DO_2M_CRYPTO(SHA256SU0, aa32_sha2, 2) static bool do_2misc(DisasContext *s, arg_2misc *a, NeonGenOneOpFn *fn) { + TCGv_i32 tmp; int pass; /* Handle a 2-reg-misc operation by iterating 32 bits at a time */ @@ -3514,11 +3564,13 @@ static bool do_2misc(DisasContext *s, arg_2misc *a, NeonGenOneOpFn *fn) return true; } + tmp = tcg_temp_new_i32(); for (pass = 0; pass < (a->q ? 4 : 2); pass++) { - TCGv_i32 tmp = neon_load_reg(a->vm, pass); + read_neon_element32(tmp, a->vm, pass, MO_32); fn(tmp, tmp); - neon_store_reg(a->vd, pass, tmp); + write_neon_element32(tmp, a->vd, pass, MO_32); } + tcg_temp_free_i32(tmp); return true; } @@ -3871,24 +3923,26 @@ static bool trans_VTRN(DisasContext *s, arg_2misc *a) return true; } - if (a->size == 2) { + tmp = tcg_temp_new_i32(); + tmp2 = tcg_temp_new_i32(); + if (a->size == MO_32) { for (pass = 0; pass < (a->q ? 4 : 2); pass += 2) { - tmp = neon_load_reg(a->vm, pass); - tmp2 = neon_load_reg(a->vd, pass + 1); - neon_store_reg(a->vm, pass, tmp2); - neon_store_reg(a->vd, pass + 1, tmp); + read_neon_element32(tmp, a->vm, pass, MO_32); + read_neon_element32(tmp2, a->vd, pass + 1, MO_32); + write_neon_element32(tmp2, a->vm, pass, MO_32); + write_neon_element32(tmp, a->vd, pass + 1, MO_32); } } else { for (pass = 0; pass < (a->q ? 
4 : 2); pass++) { - tmp = neon_load_reg(a->vm, pass); - tmp2 = neon_load_reg(a->vd, pass); - if (a->size == 0) { + read_neon_element32(tmp, a->vm, pass, MO_32); + read_neon_element32(tmp2, a->vd, pass, MO_32); + if (a->size == MO_8) { gen_neon_trn_u8(tmp, tmp2); } else { gen_neon_trn_u16(tmp, tmp2); } - neon_store_reg(a->vm, pass, tmp2); - neon_store_reg(a->vd, pass, tmp); + write_neon_element32(tmp2, a->vm, pass, MO_32); + write_neon_element32(tmp, a->vd, pass, MO_32); } } return true; From patchwork Wed Oct 28 03:26:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 311523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BA3EEC388F9 for ; Wed, 28 Oct 2020 03:33:41 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 288A82242F for ; Wed, 28 Oct 2020 03:33:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="ZoS1iQ3T" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 288A82242F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39922 
helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kXcDU-0003xo-58 for qemu-devel@archiver.kernel.org; Tue, 27 Oct 2020 23:33:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:33940) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kXc7L-0005tr-JG for qemu-devel@nongnu.org; Tue, 27 Oct 2020 23:27:19 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:43107) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kXc7J-0005pG-M0 for qemu-devel@nongnu.org; Tue, 27 Oct 2020 23:27:19 -0400 Received: by mail-pl1-x644.google.com with SMTP id o9so1800500plx.10 for ; Tue, 27 Oct 2020 20:27:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NlmDoKUnT++Y4HE65X/AaQrAu9Ha43jQPJiWJ9I7vlY=; b=ZoS1iQ3T33oGbjteeKwJK/ZCChN8xz344QrrFqZP/6fVnjwArTBNd6F4agfo3jZXFb 6Uh7oZ5bRZakzb8olVdYZGzdBDKtXUQje1JJrOi1quzkTcq0Be2hD6QVzxGcRhE9VKnX erHgP/6TfXM7kegpq8auEAcZ0LntGkb+sfcPAfa3LlyyvWcT4g5cXphaslD1tLi3Zquv er+C/J8Arq0KEyNCA5CpHzNkaIJkszy7CedtTuiQrVxE4nx2t7p22BpBomaNTD0bvqQO YFXQieYgANVu1r05DtQcw3U882I06reXjqfDSnzLS8hZIt6LD8LP0LigJmUIpCDqjxA5 Zk7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NlmDoKUnT++Y4HE65X/AaQrAu9Ha43jQPJiWJ9I7vlY=; b=eBgBlh2qN1Wmd7BI3bvvjjDGYROuBHAcop4JC7D/FpuK+AXe1HPEyRM2Pr2PjVgxZd snVLKZtPLSJVBg9MCyDtU9hbaOLXuB1mG9fJPJATsUno9TiLzinewModWjH6pu6l+H4w QcCZcPJqWcjk/lBcI+gGLzUI1G0ZlTRmCuNaLnO4I4GnixUwXou5kSXsAGHIP7V7QHZS eeuIONjTiR/r9vhEfsQYgnM/ewApFlQ00UhNuZd7IvsVLHBOmRstCVXolWx/mr/XnP/e iFukhAB2/HRxRbUMIH8GMBGEWU1idCA/i6rn5vqsVEAPlnbysFgTiRitDjE26gpji3qK NlBw== X-Gm-Message-State: 
AOAM533tQfB/FExpbmcdfVkMg9CJAugw5n1OP6C+RwgT6Ixh9trICQve OAsmfE3MRMu8i+gJBWxBLGSurSOKG3DjLw== X-Google-Smtp-Source: ABdhPJzXo266fh5quVImjkfLczkve0pjau6+PryKTtTvJ6O3xReOHF50k/irC2gwKGU601jqWbeppQ== X-Received: by 2002:a17:902:a9c9:b029:d6:2d8f:f7b4 with SMTP id b9-20020a170902a9c9b02900d62d8ff7b4mr5705024plr.2.1603855635796; Tue, 27 Oct 2020 20:27:15 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id d26sm3764413pfo.82.2020.10.27.20.27.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Oct 2020 20:27:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 06/11] target/arm: Expand read/write_neon_element32 to all MemOp Date: Tue, 27 Oct 2020 20:26:58 -0700 Message-Id: <20201028032703.201526-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org> References: <20201028032703.201526-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::644; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x644.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. 
X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" We can then use this to improve VMOV (scalar to gp) and VMOV (gp to scalar) so that we simply perform the memory operation that we wanted, rather than inserting or extracting from a 32-bit quantity. These were the last uses of neon_load/store_reg, so remove them. Signed-off-by: Richard Henderson --- target/arm/translate.c | 50 +++++++++++++----------- target/arm/translate-vfp.c.inc | 71 +++++----------------------------- 2 files changed, 37 insertions(+), 84 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 0ed9eab0b0..55d5f4ed73 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1106,9 +1106,9 @@ static long neon_full_reg_offset(unsigned reg) * Return the offset of a 2**SIZE piece of a NEON register, at index ELE, * where 0 is the least significant end of the register. 
*/ -static long neon_element_offset(int reg, int element, MemOp size) +static long neon_element_offset(int reg, int element, MemOp memop) { - int element_size = 1 << size; + int element_size = 1 << (memop & MO_SIZE); int ofs = element * element_size; #ifdef HOST_WORDS_BIGENDIAN /* @@ -1132,19 +1132,6 @@ static long vfp_reg_offset(bool dp, unsigned reg) } } -static TCGv_i32 neon_load_reg(int reg, int pass) -{ - TCGv_i32 tmp = tcg_temp_new_i32(); - tcg_gen_ld_i32(tmp, cpu_env, neon_element_offset(reg, pass, MO_32)); - return tmp; -} - -static void neon_store_reg(int reg, int pass, TCGv_i32 var) -{ - tcg_gen_st_i32(var, cpu_env, neon_element_offset(reg, pass, MO_32)); - tcg_temp_free_i32(var); -} - static inline void neon_load_reg64(TCGv_i64 var, int reg) { tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg)); @@ -1165,12 +1152,25 @@ static inline void neon_store_reg32(TCGv_i32 var, int reg) tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg)); } -static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp size) +static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp memop) { - long off = neon_element_offset(reg, ele, size); + long off = neon_element_offset(reg, ele, memop); - switch (size) { - case MO_32: + switch (memop) { + case MO_SB: + tcg_gen_ld8s_i32(dest, cpu_env, off); + break; + case MO_UB: + tcg_gen_ld8u_i32(dest, cpu_env, off); + break; + case MO_SW: + tcg_gen_ld16s_i32(dest, cpu_env, off); + break; + case MO_UW: + tcg_gen_ld16u_i32(dest, cpu_env, off); + break; + case MO_UL: + case MO_SL: tcg_gen_ld_i32(dest, cpu_env, off); break; default: @@ -1178,11 +1178,17 @@ static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp size) } } -static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp size) +static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop) { - long off = neon_element_offset(reg, ele, size); + long off = neon_element_offset(reg, ele, memop); - switch (size) { + switch 
(memop) { + case MO_8: + tcg_gen_st8_i32(src, cpu_env, off); + break; + case MO_16: + tcg_gen_st16_i32(src, cpu_env, off); + break; case MO_32: tcg_gen_st_i32(src, cpu_env, off); break; diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc index 368bae0a73..28f22f9872 100644 --- a/target/arm/translate-vfp.c.inc +++ b/target/arm/translate-vfp.c.inc @@ -511,11 +511,9 @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a) { /* VMOV scalar to general purpose register */ TCGv_i32 tmp; - int pass; - uint32_t offset; - /* SIZE == 2 is a VFP instruction; otherwise NEON. */ - if (a->size == 2 + /* SIZE == MO_32 is a VFP instruction; otherwise NEON. */ + if (a->size == MO_32 ? !dc_isar_feature(aa32_fpsp_v2, s) : !arm_dc_feature(s, ARM_FEATURE_NEON)) { return false; @@ -526,44 +524,12 @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a) return false; } - offset = a->index << a->size; - pass = extract32(offset, 2, 1); - offset = extract32(offset, 0, 2) * 8; - if (!vfp_access_check(s)) { return true; } - tmp = neon_load_reg(a->vn, pass); - switch (a->size) { - case 0: - if (offset) { - tcg_gen_shri_i32(tmp, tmp, offset); - } - if (a->u) { - gen_uxtb(tmp); - } else { - gen_sxtb(tmp); - } - break; - case 1: - if (a->u) { - if (offset) { - tcg_gen_shri_i32(tmp, tmp, 16); - } else { - gen_uxth(tmp); - } - } else { - if (offset) { - tcg_gen_sari_i32(tmp, tmp, 16); - } else { - gen_sxth(tmp); - } - } - break; - case 2: - break; - } + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, a->vn, a->index, a->size | (a->u ? 0 : MO_SIGN)); store_reg(s, a->rt, tmp); return true; @@ -572,12 +538,10 @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a) static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a) { /* VMOV general purpose register to scalar */ - TCGv_i32 tmp, tmp2; - int pass; - uint32_t offset; + TCGv_i32 tmp; - /* SIZE == 2 is a VFP instruction; otherwise NEON. 
*/ - if (a->size == 2 + /* SIZE == MO_32 is a VFP instruction; otherwise NEON. */ + if (a->size == MO_32 ? !dc_isar_feature(aa32_fpsp_v2, s) : !arm_dc_feature(s, ARM_FEATURE_NEON)) { return false; @@ -588,30 +552,13 @@ static bool trans_VMOV_from_gp(DisasContext *s, arg_VMOV_from_gp *a) return false; } - offset = a->index << a->size; - pass = extract32(offset, 2, 1); - offset = extract32(offset, 0, 2) * 8; - if (!vfp_access_check(s)) { return true; } tmp = load_reg(s, a->rt); - switch (a->size) { - case 0: - tmp2 = neon_load_reg(a->vn, pass); - tcg_gen_deposit_i32(tmp, tmp2, tmp, offset, 8); - tcg_temp_free_i32(tmp2); - break; - case 1: - tmp2 = neon_load_reg(a->vn, pass); - tcg_gen_deposit_i32(tmp, tmp2, tmp, offset, 16); - tcg_temp_free_i32(tmp2); - break; - case 2: - break; - } - neon_store_reg(a->vn, pass, tmp); + write_neon_element32(tmp, a->vn, a->index, a->size); + tcg_temp_free_i32(tmp); return true; } From patchwork Wed Oct 28 03:26:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 311522 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D81FC388F9 for ; Wed, 28 Oct 2020 03:36:17 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C16132242B for ; Wed, 28 Oct 2020 03:36:16 +0000 (UTC) Authentication-Results: 
mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="qnyEiYjK" From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 07/11] target/arm: Rename neon_load_reg32 to vfp_load_reg32 Date: Tue, 27 Oct 2020 20:26:59 -0700 Message-Id: <20201028032703.201526-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org> References: <20201028032703.201526-1-richard.henderson@linaro.org> MIME-Version: 1.0
X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The only uses of this function are for loading VFP single-precision values, and nothing to do with NEON. Signed-off-by: Richard Henderson --- target/arm/translate.c | 4 +- target/arm/translate-vfp.c.inc | 184 ++++++++++++++++----------------- 2 files changed, 94 insertions(+), 94 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 55d5f4ed73..8491ab705b 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1142,12 +1142,12 @@ static inline void neon_store_reg64(TCGv_i64 var, int reg) tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(1, reg)); } -static inline void neon_load_reg32(TCGv_i32 var, int reg) +static inline void vfp_load_reg32(TCGv_i32 var, int reg) { tcg_gen_ld_i32(var, cpu_env, vfp_reg_offset(false, reg)); } -static inline void neon_store_reg32(TCGv_i32 var, int reg) +static inline void vfp_store_reg32(TCGv_i32 var, int reg) { tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg)); } diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc index 28f22f9872..d2a9b658bb 100644 --- a/target/arm/translate-vfp.c.inc +++ b/target/arm/translate-vfp.c.inc @@ -283,8 +283,8 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a) frn = tcg_temp_new_i32(); frm = tcg_temp_new_i32(); dest = tcg_temp_new_i32(); - neon_load_reg32(frn, rn); - neon_load_reg32(frm, rm); + vfp_load_reg32(frn, rn); + vfp_load_reg32(frm, rm); switch (a->cc) { 
case 0: /* eq: Z */ tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero, @@ -315,7 +315,7 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a) if (sz == 1) { tcg_gen_andi_i32(dest, dest, 0xffff); } - neon_store_reg32(dest, rd); + vfp_store_reg32(dest, rd); tcg_temp_free_i32(frn); tcg_temp_free_i32(frm); tcg_temp_free_i32(dest); @@ -395,13 +395,13 @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a) TCGv_i32 tcg_res; tcg_op = tcg_temp_new_i32(); tcg_res = tcg_temp_new_i32(); - neon_load_reg32(tcg_op, rm); + vfp_load_reg32(tcg_op, rm); if (sz == 1) { gen_helper_rinth(tcg_res, tcg_op, fpst); } else { gen_helper_rints(tcg_res, tcg_op, fpst); } - neon_store_reg32(tcg_res, rd); + vfp_store_reg32(tcg_res, rd); tcg_temp_free_i32(tcg_op); tcg_temp_free_i32(tcg_res); } @@ -470,7 +470,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) gen_helper_vfp_tould(tcg_res, tcg_double, tcg_shift, fpst); } tcg_gen_extrl_i64_i32(tcg_tmp, tcg_res); - neon_store_reg32(tcg_tmp, rd); + vfp_store_reg32(tcg_tmp, rd); tcg_temp_free_i32(tcg_tmp); tcg_temp_free_i64(tcg_res); tcg_temp_free_i64(tcg_double); @@ -478,7 +478,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) TCGv_i32 tcg_single, tcg_res; tcg_single = tcg_temp_new_i32(); tcg_res = tcg_temp_new_i32(); - neon_load_reg32(tcg_single, rm); + vfp_load_reg32(tcg_single, rm); if (sz == 1) { if (is_signed) { gen_helper_vfp_toslh(tcg_res, tcg_single, tcg_shift, fpst); @@ -492,7 +492,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst); } } - neon_store_reg32(tcg_res, rd); + vfp_store_reg32(tcg_res, rd); tcg_temp_free_i32(tcg_res); tcg_temp_free_i32(tcg_single); } @@ -776,14 +776,14 @@ static bool trans_VMOV_half(DisasContext *s, arg_VMOV_single *a) if (a->l) { /* VFP to general purpose register */ tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vn); + vfp_load_reg32(tmp, a->vn); tcg_gen_andi_i32(tmp, tmp, 0xffff); store_reg(s, a->rt, tmp); } else { /* 
general purpose register to VFP */ tmp = load_reg(s, a->rt); tcg_gen_andi_i32(tmp, tmp, 0xffff); - neon_store_reg32(tmp, a->vn); + vfp_store_reg32(tmp, a->vn); tcg_temp_free_i32(tmp); } @@ -805,7 +805,7 @@ static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a) if (a->l) { /* VFP to general purpose register */ tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vn); + vfp_load_reg32(tmp, a->vn); if (a->rt == 15) { /* Set the 4 flag bits in the CPSR. */ gen_set_nzcv(tmp); @@ -816,7 +816,7 @@ static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a) } else { /* general purpose register to VFP */ tmp = load_reg(s, a->rt); - neon_store_reg32(tmp, a->vn); + vfp_store_reg32(tmp, a->vn); tcg_temp_free_i32(tmp); } @@ -842,18 +842,18 @@ static bool trans_VMOV_64_sp(DisasContext *s, arg_VMOV_64_sp *a) if (a->op) { /* fpreg to gpreg */ tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); store_reg(s, a->rt, tmp); tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm + 1); + vfp_load_reg32(tmp, a->vm + 1); store_reg(s, a->rt2, tmp); } else { /* gpreg to fpreg */ tmp = load_reg(s, a->rt); - neon_store_reg32(tmp, a->vm); + vfp_store_reg32(tmp, a->vm); tcg_temp_free_i32(tmp); tmp = load_reg(s, a->rt2); - neon_store_reg32(tmp, a->vm + 1); + vfp_store_reg32(tmp, a->vm + 1); tcg_temp_free_i32(tmp); } @@ -885,18 +885,18 @@ static bool trans_VMOV_64_dp(DisasContext *s, arg_VMOV_64_dp *a) if (a->op) { /* fpreg to gpreg */ tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm * 2); + vfp_load_reg32(tmp, a->vm * 2); store_reg(s, a->rt, tmp); tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm * 2 + 1); + vfp_load_reg32(tmp, a->vm * 2 + 1); store_reg(s, a->rt2, tmp); } else { /* gpreg to fpreg */ tmp = load_reg(s, a->rt); - neon_store_reg32(tmp, a->vm * 2); + vfp_store_reg32(tmp, a->vm * 2); tcg_temp_free_i32(tmp); tmp = load_reg(s, a->rt2); - neon_store_reg32(tmp, a->vm * 2 + 1); + vfp_store_reg32(tmp, a->vm * 2 + 1); 
tcg_temp_free_i32(tmp); } @@ -927,9 +927,9 @@ static bool trans_VLDR_VSTR_hp(DisasContext *s, arg_VLDR_VSTR_sp *a) tmp = tcg_temp_new_i32(); if (a->l) { gen_aa32_ld16u(s, tmp, addr, get_mem_index(s)); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); } else { - neon_load_reg32(tmp, a->vd); + vfp_load_reg32(tmp, a->vd); gen_aa32_st16(s, tmp, addr, get_mem_index(s)); } tcg_temp_free_i32(tmp); @@ -961,9 +961,9 @@ static bool trans_VLDR_VSTR_sp(DisasContext *s, arg_VLDR_VSTR_sp *a) tmp = tcg_temp_new_i32(); if (a->l) { gen_aa32_ld32u(s, tmp, addr, get_mem_index(s)); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); } else { - neon_load_reg32(tmp, a->vd); + vfp_load_reg32(tmp, a->vd); gen_aa32_st32(s, tmp, addr, get_mem_index(s)); } tcg_temp_free_i32(tmp); @@ -1066,10 +1066,10 @@ static bool trans_VLDM_VSTM_sp(DisasContext *s, arg_VLDM_VSTM_sp *a) if (a->l) { /* load */ gen_aa32_ld32u(s, tmp, addr, get_mem_index(s)); - neon_store_reg32(tmp, a->vd + i); + vfp_store_reg32(tmp, a->vd + i); } else { /* store */ - neon_load_reg32(tmp, a->vd + i); + vfp_load_reg32(tmp, a->vd + i); gen_aa32_st32(s, tmp, addr, get_mem_index(s)); } tcg_gen_addi_i32(addr, addr, offset); @@ -1285,15 +1285,15 @@ static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn, fd = tcg_temp_new_i32(); fpst = fpstatus_ptr(FPST_FPCR); - neon_load_reg32(f0, vn); - neon_load_reg32(f1, vm); + vfp_load_reg32(f0, vn); + vfp_load_reg32(f1, vm); for (;;) { if (reads_vd) { - neon_load_reg32(fd, vd); + vfp_load_reg32(fd, vd); } fn(fd, f0, f1, fpst); - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); if (veclen == 0) { break; @@ -1303,10 +1303,10 @@ static bool do_vfp_3op_sp(DisasContext *s, VFPGen3OpSPFn *fn, veclen--; vd = vfp_advance_sreg(vd, delta_d); vn = vfp_advance_sreg(vn, delta_d); - neon_load_reg32(f0, vn); + vfp_load_reg32(f0, vn); if (delta_m) { vm = vfp_advance_sreg(vm, delta_m); - neon_load_reg32(f1, vm); + vfp_load_reg32(f1, vm); } } @@ -1349,14 +1349,14 @@ static bool 
do_vfp_3op_hp(DisasContext *s, VFPGen3OpSPFn *fn, fd = tcg_temp_new_i32(); fpst = fpstatus_ptr(FPST_FPCR_F16); - neon_load_reg32(f0, vn); - neon_load_reg32(f1, vm); + vfp_load_reg32(f0, vn); + vfp_load_reg32(f1, vm); if (reads_vd) { - neon_load_reg32(fd, vd); + vfp_load_reg32(fd, vd); } fn(fd, f0, f1, fpst); - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); tcg_temp_free_i32(f0); tcg_temp_free_i32(f1); @@ -1489,11 +1489,11 @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm) f0 = tcg_temp_new_i32(); fd = tcg_temp_new_i32(); - neon_load_reg32(f0, vm); + vfp_load_reg32(f0, vm); for (;;) { fn(fd, f0); - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); if (veclen == 0) { break; @@ -1503,7 +1503,7 @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm) /* single source one-many */ while (veclen--) { vd = vfp_advance_sreg(vd, delta_d); - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); } break; } @@ -1512,7 +1512,7 @@ static bool do_vfp_2op_sp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm) veclen--; vd = vfp_advance_sreg(vd, delta_d); vm = vfp_advance_sreg(vm, delta_m); - neon_load_reg32(f0, vm); + vfp_load_reg32(f0, vm); } tcg_temp_free_i32(f0); @@ -1545,9 +1545,9 @@ static bool do_vfp_2op_hp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm) } f0 = tcg_temp_new_i32(); - neon_load_reg32(f0, vm); + vfp_load_reg32(f0, vm); fn(f0, f0); - neon_store_reg32(f0, vd); + vfp_store_reg32(f0, vd); tcg_temp_free_i32(f0); return true; @@ -2037,20 +2037,20 @@ static bool do_vfm_hp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d) vm = tcg_temp_new_i32(); vd = tcg_temp_new_i32(); - neon_load_reg32(vn, a->vn); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vn, a->vn); + vfp_load_reg32(vm, a->vm); if (neg_n) { /* VFNMS, VFMS */ gen_helper_vfp_negh(vn, vn); } - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ gen_helper_vfp_negh(vd, vd); } fpst = 
fpstatus_ptr(FPST_FPCR_F16); gen_helper_vfp_muladdh(vd, vn, vm, vd, fpst); - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(vn); @@ -2102,20 +2102,20 @@ static bool do_vfm_sp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d) vm = tcg_temp_new_i32(); vd = tcg_temp_new_i32(); - neon_load_reg32(vn, a->vn); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vn, a->vn); + vfp_load_reg32(vm, a->vm); if (neg_n) { /* VFNMS, VFMS */ gen_helper_vfp_negs(vn, vn); } - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ gen_helper_vfp_negs(vd, vd); } fpst = fpstatus_ptr(FPST_FPCR); gen_helper_vfp_muladds(vd, vn, vm, vd, fpst); - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(vn); @@ -2230,7 +2230,7 @@ static bool trans_VMOV_imm_hp(DisasContext *s, arg_VMOV_imm_sp *a) } fd = tcg_const_i32(vfp_expand_imm(MO_16, a->imm)); - neon_store_reg32(fd, a->vd); + vfp_store_reg32(fd, a->vd); tcg_temp_free_i32(fd); return true; } @@ -2270,7 +2270,7 @@ static bool trans_VMOV_imm_sp(DisasContext *s, arg_VMOV_imm_sp *a) fd = tcg_const_i32(vfp_expand_imm(MO_32, a->imm)); for (;;) { - neon_store_reg32(fd, vd); + vfp_store_reg32(fd, vd); if (veclen == 0) { break; @@ -2397,11 +2397,11 @@ static bool trans_VCMP_hp(DisasContext *s, arg_VCMP_sp *a) vd = tcg_temp_new_i32(); vm = tcg_temp_new_i32(); - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); if (a->z) { tcg_gen_movi_i32(vm, 0); } else { - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); } if (a->e) { @@ -2436,11 +2436,11 @@ static bool trans_VCMP_sp(DisasContext *s, arg_VCMP_sp *a) vd = tcg_temp_new_i32(); vm = tcg_temp_new_i32(); - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); if (a->z) { tcg_gen_movi_i32(vm, 0); } else { - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); } if (a->e) { @@ -2519,7 +2519,7 @@ static bool trans_VCVT_f32_f16(DisasContext *s, 
arg_VCVT_f32_f16 *a) /* The T bit tells us if we want the low or high 16 bits of Vm */ tcg_gen_ld16u_i32(tmp, cpu_env, vfp_f16_offset(a->vm, a->t)); gen_helper_vfp_fcvt_f16_to_f32(tmp, tmp, fpst, ahp_mode); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_i32(ahp_mode); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); @@ -2583,7 +2583,7 @@ static bool trans_VCVT_f16_f32(DisasContext *s, arg_VCVT_f16_f32 *a) ahp_mode = get_ahp_flag(); tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp_mode); tcg_gen_st16_i32(tmp, cpu_env, vfp_f16_offset(a->vd, a->t)); tcg_temp_free_i32(ahp_mode); @@ -2645,10 +2645,10 @@ static bool trans_VRINTR_hp(DisasContext *s, arg_VRINTR_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); gen_helper_rinth(tmp, tmp, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); return true; @@ -2668,10 +2668,10 @@ static bool trans_VRINTR_sp(DisasContext *s, arg_VRINTR_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); gen_helper_rints(tmp, tmp, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); return true; @@ -2724,13 +2724,13 @@ static bool trans_VRINTZ_hp(DisasContext *s, arg_VRINTZ_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); tcg_rmode = tcg_const_i32(float_round_to_zero); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); gen_helper_rinth(tmp, tmp, fpst); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tcg_rmode); tcg_temp_free_i32(tmp); @@ -2752,13 +2752,13 @@ 
static bool trans_VRINTZ_sp(DisasContext *s, arg_VRINTZ_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); tcg_rmode = tcg_const_i32(float_round_to_zero); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); gen_helper_rints(tmp, tmp, fpst); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tcg_rmode); tcg_temp_free_i32(tmp); @@ -2816,10 +2816,10 @@ static bool trans_VRINTX_hp(DisasContext *s, arg_VRINTX_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); gen_helper_rinth_exact(tmp, tmp, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); return true; @@ -2839,10 +2839,10 @@ static bool trans_VRINTX_sp(DisasContext *s, arg_VRINTX_sp *a) } tmp = tcg_temp_new_i32(); - neon_load_reg32(tmp, a->vm); + vfp_load_reg32(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); gen_helper_rints_exact(tmp, tmp, fpst); - neon_store_reg32(tmp, a->vd); + vfp_store_reg32(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); return true; @@ -2900,7 +2900,7 @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a) vm = tcg_temp_new_i32(); vd = tcg_temp_new_i64(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); gen_helper_vfp_fcvtds(vd, vm, cpu_env); neon_store_reg64(vd, a->vd); tcg_temp_free_i32(vm); @@ -2930,7 +2930,7 @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a) vm = tcg_temp_new_i64(); neon_load_reg64(vm, a->vm); gen_helper_vfp_fcvtsd(vd, vm, cpu_env); - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); tcg_temp_free_i64(vm); return true; @@ -2950,7 +2950,7 @@ static bool trans_VCVT_int_hp(DisasContext *s, arg_VCVT_int_sp *a) } vm = tcg_temp_new_i32(); - neon_load_reg32(vm, a->vm); + 
vfp_load_reg32(vm, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); if (a->s) { /* i32 -> f16 */ @@ -2959,7 +2959,7 @@ static bool trans_VCVT_int_hp(DisasContext *s, arg_VCVT_int_sp *a) /* u32 -> f16 */ gen_helper_vfp_uitoh(vm, vm, fpst); } - neon_store_reg32(vm, a->vd); + vfp_store_reg32(vm, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_ptr(fpst); return true; @@ -2979,7 +2979,7 @@ static bool trans_VCVT_int_sp(DisasContext *s, arg_VCVT_int_sp *a) } vm = tcg_temp_new_i32(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); fpst = fpstatus_ptr(FPST_FPCR); if (a->s) { /* i32 -> f32 */ @@ -2988,7 +2988,7 @@ static bool trans_VCVT_int_sp(DisasContext *s, arg_VCVT_int_sp *a) /* u32 -> f32 */ gen_helper_vfp_uitos(vm, vm, fpst); } - neon_store_reg32(vm, a->vd); + vfp_store_reg32(vm, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_ptr(fpst); return true; @@ -3015,7 +3015,7 @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a) vm = tcg_temp_new_i32(); vd = tcg_temp_new_i64(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); fpst = fpstatus_ptr(FPST_FPCR); if (a->s) { /* i32 -> f64 */ @@ -3057,7 +3057,7 @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a) vd = tcg_temp_new_i32(); neon_load_reg64(vm, a->vm); gen_helper_vjcvt(vd, vm, cpu_env); - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i64(vm); tcg_temp_free_i32(vd); return true; @@ -3080,7 +3080,7 @@ static bool trans_VCVT_fix_hp(DisasContext *s, arg_VCVT_fix_sp *a) frac_bits = (a->opc & 1) ? 
(32 - a->imm) : (16 - a->imm); vd = tcg_temp_new_i32(); - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); fpst = fpstatus_ptr(FPST_FPCR_F16); shift = tcg_const_i32(frac_bits); @@ -3115,7 +3115,7 @@ static bool trans_VCVT_fix_hp(DisasContext *s, arg_VCVT_fix_sp *a) g_assert_not_reached(); } - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); tcg_temp_free_i32(shift); tcg_temp_free_ptr(fpst); @@ -3139,7 +3139,7 @@ static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a) frac_bits = (a->opc & 1) ? (32 - a->imm) : (16 - a->imm); vd = tcg_temp_new_i32(); - neon_load_reg32(vd, a->vd); + vfp_load_reg32(vd, a->vd); fpst = fpstatus_ptr(FPST_FPCR); shift = tcg_const_i32(frac_bits); @@ -3174,7 +3174,7 @@ static bool trans_VCVT_fix_sp(DisasContext *s, arg_VCVT_fix_sp *a) g_assert_not_reached(); } - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); tcg_temp_free_i32(shift); tcg_temp_free_ptr(fpst); @@ -3261,7 +3261,7 @@ static bool trans_VCVT_hp_int(DisasContext *s, arg_VCVT_sp_int *a) fpst = fpstatus_ptr(FPST_FPCR_F16); vm = tcg_temp_new_i32(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); if (a->s) { if (a->rz) { @@ -3276,7 +3276,7 @@ static bool trans_VCVT_hp_int(DisasContext *s, arg_VCVT_sp_int *a) gen_helper_vfp_touih(vm, vm, fpst); } } - neon_store_reg32(vm, a->vd); + vfp_store_reg32(vm, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_ptr(fpst); return true; @@ -3297,7 +3297,7 @@ static bool trans_VCVT_sp_int(DisasContext *s, arg_VCVT_sp_int *a) fpst = fpstatus_ptr(FPST_FPCR); vm = tcg_temp_new_i32(); - neon_load_reg32(vm, a->vm); + vfp_load_reg32(vm, a->vm); if (a->s) { if (a->rz) { @@ -3312,7 +3312,7 @@ static bool trans_VCVT_sp_int(DisasContext *s, arg_VCVT_sp_int *a) gen_helper_vfp_touis(vm, vm, fpst); } } - neon_store_reg32(vm, a->vd); + vfp_store_reg32(vm, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_ptr(fpst); return true; @@ -3355,7 +3355,7 @@ static bool 
trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a) gen_helper_vfp_touid(vd, vm, fpst); } } - neon_store_reg32(vd, a->vd); + vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); tcg_temp_free_i64(vm); tcg_temp_free_ptr(fpst); @@ -3468,10 +3468,10 @@ static bool trans_VINS(DisasContext *s, arg_VINS *a) /* Insert low half of Vm into high half of Vd */ rm = tcg_temp_new_i32(); rd = tcg_temp_new_i32(); - neon_load_reg32(rm, a->vm); - neon_load_reg32(rd, a->vd); + vfp_load_reg32(rm, a->vm); + vfp_load_reg32(rd, a->vd); tcg_gen_deposit_i32(rd, rd, rm, 16, 16); - neon_store_reg32(rd, a->vd); + vfp_store_reg32(rd, a->vd); tcg_temp_free_i32(rm); tcg_temp_free_i32(rd); return true; @@ -3495,9 +3495,9 @@ static bool trans_VMOVX(DisasContext *s, arg_VINS *a) /* Set Vd to high half of Vm */ rm = tcg_temp_new_i32(); - neon_load_reg32(rm, a->vm); + vfp_load_reg32(rm, a->vm); tcg_gen_shri_i32(rm, rm, 16); - neon_store_reg32(rm, a->vd); + vfp_store_reg32(rm, a->vd); tcg_temp_free_i32(rm); return true; } From patchwork Wed Oct 28 03:27:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 301745 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94F30C388F9 for ; Wed, 28 Oct 2020 03:29:04 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 
DD0B42242B for ; Wed, 28 Oct 2020 03:29:03 +0000 (UTC) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 08/11] target/arm: Add read/write_neon_element64 Date: Tue, 27 Oct 2020 20:27:00 -0700 Message-Id: <20201028032703.201526-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org> References: <20201028032703.201526-1-richard.henderson@linaro.org> MIME-Version: 1.0
X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Replace all uses of neon_load/store_reg64 within translate-neon.c.inc. Signed-off-by: Richard Henderson --- target/arm/translate.c | 26 +++++++++ target/arm/translate-neon.c.inc | 94 ++++++++++++++++----------------- 2 files changed, 73 insertions(+), 47 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 8491ab705b..4fb0a62200 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1178,6 +1178,19 @@ static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp memop) } } +static void read_neon_element64(TCGv_i64 dest, int reg, int ele, MemOp memop) +{ + long off = neon_element_offset(reg, ele, memop); + + switch (memop) { + case MO_Q: + tcg_gen_ld_i64(dest, cpu_env, off); + break; + default: + g_assert_not_reached(); + } +} + static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop) { long off = neon_element_offset(reg, ele, memop); @@ -1197,6 +1210,19 @@ static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop) } } +static void write_neon_element64(TCGv_i64 src, int reg, int ele, MemOp memop) +{ + long off = neon_element_offset(reg, ele, memop); + + switch (memop) { + case MO_64: + tcg_gen_st_i64(src, cpu_env, off); + break; + default: + g_assert_not_reached(); + } +} + static TCGv_ptr vfp_reg_ptr(bool dp, int reg) { TCGv_ptr ret = tcg_temp_new_ptr(); diff --git a/target/arm/translate-neon.c.inc 
b/target/arm/translate-neon.c.inc index 3f8a0bb88b..7a0269970c 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -1265,9 +1265,9 @@ static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a, for (pass = 0; pass < a->q + 1; pass++) { TCGv_i64 tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vm + pass); + read_neon_element64(tmp, a->vm, pass, MO_64); fn(tmp, cpu_env, tmp, constimm); - neon_store_reg64(tmp, a->vd + pass); + write_neon_element64(tmp, a->vd, pass, MO_64); tcg_temp_free_i64(tmp); } tcg_temp_free_i64(constimm); @@ -1375,8 +1375,8 @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a, rd = tcg_temp_new_i32(); /* Load both inputs first to avoid potential overwrite if rm == rd */ - neon_load_reg64(rm1, a->vm); - neon_load_reg64(rm2, a->vm + 1); + read_neon_element64(rm1, a->vm, 0, MO_64); + read_neon_element64(rm2, a->vm, 1, MO_64); shiftfn(rm1, rm1, constimm); narrowfn(rd, cpu_env, rm1); @@ -1579,7 +1579,7 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a, tcg_gen_shli_i64(tmp, tmp, a->shift); tcg_gen_andi_i64(tmp, tmp, ~widen_mask); } - neon_store_reg64(tmp, a->vd); + write_neon_element64(tmp, a->vd, 0, MO_64); widenfn(tmp, rm1); tcg_temp_free_i32(rm1); @@ -1587,7 +1587,7 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a, tcg_gen_shli_i64(tmp, tmp, a->shift); tcg_gen_andi_i64(tmp, tmp, ~widen_mask); } - neon_store_reg64(tmp, a->vd + 1); + write_neon_element64(tmp, a->vd, 1, MO_64); tcg_temp_free_i64(tmp); return true; } @@ -1822,7 +1822,7 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, rm_64 = tcg_temp_new_i64(); if (src1_wide) { - neon_load_reg64(rn0_64, a->vn); + read_neon_element64(rn0_64, a->vn, 0, MO_64); } else { TCGv_i32 tmp = tcg_temp_new_i32(); read_neon_element32(tmp, a->vn, 0, MO_32); @@ -1841,7 +1841,7 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, * avoid incorrect results if a narrow input overlaps with the result. 
*/ if (src1_wide) { - neon_load_reg64(rn1_64, a->vn + 1); + read_neon_element64(rn1_64, a->vn, 1, MO_64); } else { TCGv_i32 tmp = tcg_temp_new_i32(); read_neon_element32(tmp, a->vn, 1, MO_32); @@ -1851,12 +1851,12 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a, rm = tcg_temp_new_i32(); read_neon_element32(rm, a->vm, 1, MO_32); - neon_store_reg64(rn0_64, a->vd); + write_neon_element64(rn0_64, a->vd, 0, MO_64); widenfn(rm_64, rm); tcg_temp_free_i32(rm); opfn(rn1_64, rn1_64, rm_64); - neon_store_reg64(rn1_64, a->vd + 1); + write_neon_element64(rn1_64, a->vd, 1, MO_64); tcg_temp_free_i64(rn0_64); tcg_temp_free_i64(rn1_64); @@ -1928,15 +1928,15 @@ static bool do_narrow_3d(DisasContext *s, arg_3diff *a, rd0 = tcg_temp_new_i32(); rd1 = tcg_temp_new_i32(); - neon_load_reg64(rn_64, a->vn); - neon_load_reg64(rm_64, a->vm); + read_neon_element64(rn_64, a->vn, 0, MO_64); + read_neon_element64(rm_64, a->vm, 0, MO_64); opfn(rn_64, rn_64, rm_64); narrowfn(rd0, rn_64); - neon_load_reg64(rn_64, a->vn + 1); - neon_load_reg64(rm_64, a->vm + 1); + read_neon_element64(rn_64, a->vn, 1, MO_64); + read_neon_element64(rm_64, a->vm, 1, MO_64); opfn(rn_64, rn_64, rm_64); @@ -2036,16 +2036,16 @@ static bool do_long_3d(DisasContext *s, arg_3diff *a, /* Don't store results until after all loads: they might overlap */ if (accfn) { tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vd); + read_neon_element64(tmp, a->vd, 0, MO_64); accfn(tmp, tmp, rd0); - neon_store_reg64(tmp, a->vd); - neon_load_reg64(tmp, a->vd + 1); + write_neon_element64(tmp, a->vd, 0, MO_64); + read_neon_element64(tmp, a->vd, 1, MO_64); accfn(tmp, tmp, rd1); - neon_store_reg64(tmp, a->vd + 1); + write_neon_element64(tmp, a->vd, 1, MO_64); tcg_temp_free_i64(tmp); } else { - neon_store_reg64(rd0, a->vd); - neon_store_reg64(rd1, a->vd + 1); + write_neon_element64(rd0, a->vd, 0, MO_64); + write_neon_element64(rd1, a->vd, 1, MO_64); } tcg_temp_free_i64(rd0); @@ -2669,16 +2669,16 @@ static bool 
do_2scalar_long(DisasContext *s, arg_2scalar *a, if (accfn) { TCGv_i64 t64 = tcg_temp_new_i64(); - neon_load_reg64(t64, a->vd); + read_neon_element64(t64, a->vd, 0, MO_64); accfn(t64, t64, rn0_64); - neon_store_reg64(t64, a->vd); - neon_load_reg64(t64, a->vd + 1); + write_neon_element64(t64, a->vd, 0, MO_64); + read_neon_element64(t64, a->vd, 1, MO_64); accfn(t64, t64, rn1_64); - neon_store_reg64(t64, a->vd + 1); + write_neon_element64(t64, a->vd, 1, MO_64); tcg_temp_free_i64(t64); } else { - neon_store_reg64(rn0_64, a->vd); - neon_store_reg64(rn1_64, a->vd + 1); + write_neon_element64(rn0_64, a->vd, 0, MO_64); + write_neon_element64(rn1_64, a->vd, 1, MO_64); } tcg_temp_free_i64(rn0_64); tcg_temp_free_i64(rn1_64); @@ -2812,10 +2812,10 @@ static bool trans_VEXT(DisasContext *s, arg_VEXT *a) right = tcg_temp_new_i64(); dest = tcg_temp_new_i64(); - neon_load_reg64(right, a->vn); - neon_load_reg64(left, a->vm); + read_neon_element64(right, a->vn, 0, MO_64); + read_neon_element64(left, a->vm, 0, MO_64); tcg_gen_extract2_i64(dest, right, left, a->imm * 8); - neon_store_reg64(dest, a->vd); + write_neon_element64(dest, a->vd, 0, MO_64); tcg_temp_free_i64(left); tcg_temp_free_i64(right); @@ -2831,21 +2831,21 @@ static bool trans_VEXT(DisasContext *s, arg_VEXT *a) destright = tcg_temp_new_i64(); if (a->imm < 8) { - neon_load_reg64(right, a->vn); - neon_load_reg64(middle, a->vn + 1); + read_neon_element64(right, a->vn, 0, MO_64); + read_neon_element64(middle, a->vn, 1, MO_64); tcg_gen_extract2_i64(destright, right, middle, a->imm * 8); - neon_load_reg64(left, a->vm); + read_neon_element64(left, a->vm, 0, MO_64); tcg_gen_extract2_i64(destleft, middle, left, a->imm * 8); } else { - neon_load_reg64(right, a->vn + 1); - neon_load_reg64(middle, a->vm); + read_neon_element64(right, a->vn, 1, MO_64); + read_neon_element64(middle, a->vm, 0, MO_64); tcg_gen_extract2_i64(destright, right, middle, (a->imm - 8) * 8); - neon_load_reg64(left, a->vm + 1); + read_neon_element64(left, a->vm, 
1, MO_64); tcg_gen_extract2_i64(destleft, middle, left, (a->imm - 8) * 8); } - neon_store_reg64(destright, a->vd); - neon_store_reg64(destleft, a->vd + 1); + write_neon_element64(destright, a->vd, 0, MO_64); + write_neon_element64(destleft, a->vd, 1, MO_64); tcg_temp_free_i64(destright); tcg_temp_free_i64(destleft); @@ -3050,11 +3050,11 @@ static bool do_2misc_pairwise(DisasContext *s, arg_2misc *a, if (accfn) { TCGv_i64 tmp64 = tcg_temp_new_i64(); - neon_load_reg64(tmp64, a->vd + pass); + read_neon_element64(tmp64, a->vd, pass, MO_64); accfn(rd_64, tmp64, rd_64); tcg_temp_free_i64(tmp64); } - neon_store_reg64(rd_64, a->vd + pass); + write_neon_element64(rd_64, a->vd, pass, MO_64); tcg_temp_free_i64(rd_64); } return true; @@ -3252,9 +3252,9 @@ static bool do_vmovn(DisasContext *s, arg_2misc *a, rd0 = tcg_temp_new_i32(); rd1 = tcg_temp_new_i32(); - neon_load_reg64(rm, a->vm); + read_neon_element64(rm, a->vm, 0, MO_64); narrowfn(rd0, cpu_env, rm); - neon_load_reg64(rm, a->vm + 1); + read_neon_element64(rm, a->vm, 1, MO_64); narrowfn(rd1, cpu_env, rm); write_neon_element32(rd0, a->vd, 0, MO_32); write_neon_element32(rd1, a->vd, 1, MO_32); @@ -3324,10 +3324,10 @@ static bool trans_VSHLL(DisasContext *s, arg_2misc *a) widenfn(rd, rm0); tcg_gen_shli_i64(rd, rd, 8 << a->size); - neon_store_reg64(rd, a->vd); + write_neon_element64(rd, a->vd, 0, MO_64); widenfn(rd, rm1); tcg_gen_shli_i64(rd, rd, 8 << a->size); - neon_store_reg64(rd, a->vd + 1); + write_neon_element64(rd, a->vd, 1, MO_64); tcg_temp_free_i64(rd); tcg_temp_free_i32(rm0); @@ -3845,10 +3845,10 @@ static bool trans_VSWP(DisasContext *s, arg_2misc *a) rm = tcg_temp_new_i64(); rd = tcg_temp_new_i64(); for (pass = 0; pass < (a->q ? 
2 : 1); pass++) { - neon_load_reg64(rm, a->vm + pass); - neon_load_reg64(rd, a->vd + pass); - neon_store_reg64(rm, a->vd + pass); - neon_store_reg64(rd, a->vm + pass); + read_neon_element64(rm, a->vm, pass, MO_64); + read_neon_element64(rd, a->vd, pass, MO_64); + write_neon_element64(rm, a->vd, pass, MO_64); + write_neon_element64(rd, a->vm, pass, MO_64); } tcg_temp_free_i64(rm); tcg_temp_free_i64(rd);

From patchwork Wed Oct 28 03:27:01 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 301741
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 09/11] target/arm: Rename neon_load_reg64 to vfp_load_reg64
Date: Tue, 27 Oct 2020 20:27:01 -0700
Message-Id: <20201028032703.201526-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org>
References: <20201028032703.201526-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

The only uses of this function are for loading VFP double-precision
values, and nothing to do with NEON.
Signed-off-by: Richard Henderson --- target/arm/translate.c | 8 ++-- target/arm/translate-vfp.c.inc | 84 +++++++++++++++++----------------- 2 files changed, 46 insertions(+), 46 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 4fb0a62200..7611c1f0f1 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1132,14 +1132,14 @@ static long vfp_reg_offset(bool dp, unsigned reg) } } -static inline void neon_load_reg64(TCGv_i64 var, int reg) +static inline void vfp_load_reg64(TCGv_i64 var, int reg) { - tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg)); + tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(true, reg)); } -static inline void neon_store_reg64(TCGv_i64 var, int reg) +static inline void vfp_store_reg64(TCGv_i64 var, int reg) { - tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(1, reg)); + tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(true, reg)); } static inline void vfp_load_reg32(TCGv_i32 var, int reg) diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc index d2a9b658bb..f966de5b1f 100644 --- a/target/arm/translate-vfp.c.inc +++ b/target/arm/translate-vfp.c.inc @@ -236,8 +236,8 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a) tcg_gen_ext_i32_i64(nf, cpu_NF); tcg_gen_ext_i32_i64(vf, cpu_VF); - neon_load_reg64(frn, rn); - neon_load_reg64(frm, rm); + vfp_load_reg64(frn, rn); + vfp_load_reg64(frm, rm); switch (a->cc) { case 0: /* eq: Z */ tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero, @@ -264,7 +264,7 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a) tcg_temp_free_i64(tmp); break; } - neon_store_reg64(dest, rd); + vfp_store_reg64(dest, rd); tcg_temp_free_i64(frn); tcg_temp_free_i64(frm); tcg_temp_free_i64(dest); @@ -385,9 +385,9 @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a) TCGv_i64 tcg_res; tcg_op = tcg_temp_new_i64(); tcg_res = tcg_temp_new_i64(); - neon_load_reg64(tcg_op, rm); + vfp_load_reg64(tcg_op, rm); gen_helper_rintd(tcg_res, tcg_op, fpst); - neon_store_reg64(tcg_res, 
rd); + vfp_store_reg64(tcg_res, rd); tcg_temp_free_i64(tcg_op); tcg_temp_free_i64(tcg_res); } else { @@ -463,7 +463,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) tcg_double = tcg_temp_new_i64(); tcg_res = tcg_temp_new_i64(); tcg_tmp = tcg_temp_new_i32(); - neon_load_reg64(tcg_double, rm); + vfp_load_reg64(tcg_double, rm); if (is_signed) { gen_helper_vfp_tosld(tcg_res, tcg_double, tcg_shift, fpst); } else { @@ -1002,9 +1002,9 @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_dp *a) tmp = tcg_temp_new_i64(); if (a->l) { gen_aa32_ld64(s, tmp, addr, get_mem_index(s)); - neon_store_reg64(tmp, a->vd); + vfp_store_reg64(tmp, a->vd); } else { - neon_load_reg64(tmp, a->vd); + vfp_load_reg64(tmp, a->vd); gen_aa32_st64(s, tmp, addr, get_mem_index(s)); } tcg_temp_free_i64(tmp); @@ -1149,10 +1149,10 @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a) if (a->l) { /* load */ gen_aa32_ld64(s, tmp, addr, get_mem_index(s)); - neon_store_reg64(tmp, a->vd + i); + vfp_store_reg64(tmp, a->vd + i); } else { /* store */ - neon_load_reg64(tmp, a->vd + i); + vfp_load_reg64(tmp, a->vd + i); gen_aa32_st64(s, tmp, addr, get_mem_index(s)); } tcg_gen_addi_i32(addr, addr, offset); @@ -1416,15 +1416,15 @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn, fd = tcg_temp_new_i64(); fpst = fpstatus_ptr(FPST_FPCR); - neon_load_reg64(f0, vn); - neon_load_reg64(f1, vm); + vfp_load_reg64(f0, vn); + vfp_load_reg64(f1, vm); for (;;) { if (reads_vd) { - neon_load_reg64(fd, vd); + vfp_load_reg64(fd, vd); } fn(fd, f0, f1, fpst); - neon_store_reg64(fd, vd); + vfp_store_reg64(fd, vd); if (veclen == 0) { break; @@ -1433,10 +1433,10 @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn, veclen--; vd = vfp_advance_dreg(vd, delta_d); vn = vfp_advance_dreg(vn, delta_d); - neon_load_reg64(f0, vn); + vfp_load_reg64(f0, vn); if (delta_m) { vm = vfp_advance_dreg(vm, delta_m); - neon_load_reg64(f1, vm); + vfp_load_reg64(f1, vm); } } @@ -1599,11 +1599,11 
@@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm) f0 = tcg_temp_new_i64(); fd = tcg_temp_new_i64(); - neon_load_reg64(f0, vm); + vfp_load_reg64(f0, vm); for (;;) { fn(fd, f0); - neon_store_reg64(fd, vd); + vfp_store_reg64(fd, vd); if (veclen == 0) { break; @@ -1613,7 +1613,7 @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm) /* single source one-many */ while (veclen--) { vd = vfp_advance_dreg(vd, delta_d); - neon_store_reg64(fd, vd); + vfp_store_reg64(fd, vd); } break; } @@ -1622,7 +1622,7 @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm) veclen--; vd = vfp_advance_dreg(vd, delta_d); vd = vfp_advance_dreg(vm, delta_m); - neon_load_reg64(f0, vm); + vfp_load_reg64(f0, vm); } tcg_temp_free_i64(f0); @@ -2173,20 +2173,20 @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d) vm = tcg_temp_new_i64(); vd = tcg_temp_new_i64(); - neon_load_reg64(vn, a->vn); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vn, a->vn); + vfp_load_reg64(vm, a->vm); if (neg_n) { /* VFNMS, VFMS */ gen_helper_vfp_negd(vn, vn); } - neon_load_reg64(vd, a->vd); + vfp_load_reg64(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ gen_helper_vfp_negd(vd, vd); } fpst = fpstatus_ptr(FPST_FPCR); gen_helper_vfp_muladdd(vd, vn, vm, vd, fpst); - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i64(vn); @@ -2325,7 +2325,7 @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a) fd = tcg_const_i64(vfp_expand_imm(MO_64, a->imm)); for (;;) { - neon_store_reg64(fd, vd); + vfp_store_reg64(fd, vd); if (veclen == 0) { break; @@ -2480,11 +2480,11 @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a) vd = tcg_temp_new_i64(); vm = tcg_temp_new_i64(); - neon_load_reg64(vd, a->vd); + vfp_load_reg64(vd, a->vd); if (a->z) { tcg_gen_movi_i64(vm, 0); } else { - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); } if (a->e) { @@ -2557,7 
+2557,7 @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a) tcg_gen_ld16u_i32(tmp, cpu_env, vfp_f16_offset(a->vm, a->t)); vd = tcg_temp_new_i64(); gen_helper_vfp_fcvt_f16_to_f64(vd, tmp, fpst, ahp_mode); - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_i32(ahp_mode); tcg_temp_free_ptr(fpst); tcg_temp_free_i32(tmp); @@ -2621,7 +2621,7 @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a) tmp = tcg_temp_new_i32(); vm = tcg_temp_new_i64(); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); gen_helper_vfp_fcvt_f64_to_f16(tmp, vm, fpst, ahp_mode); tcg_temp_free_i64(vm); tcg_gen_st16_i32(tmp, cpu_env, vfp_f16_offset(a->vd, a->t)); @@ -2700,10 +2700,10 @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a) } tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vm); + vfp_load_reg64(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); gen_helper_rintd(tmp, tmp, fpst); - neon_store_reg64(tmp, a->vd); + vfp_store_reg64(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i64(tmp); return true; @@ -2789,13 +2789,13 @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a) } tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vm); + vfp_load_reg64(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); tcg_rmode = tcg_const_i32(float_round_to_zero); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); gen_helper_rintd(tmp, tmp, fpst); gen_helper_set_rmode(tcg_rmode, tcg_rmode, fpst); - neon_store_reg64(tmp, a->vd); + vfp_store_reg64(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i64(tmp); tcg_temp_free_i32(tcg_rmode); @@ -2871,10 +2871,10 @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a) } tmp = tcg_temp_new_i64(); - neon_load_reg64(tmp, a->vm); + vfp_load_reg64(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR); gen_helper_rintd_exact(tmp, tmp, fpst); - neon_store_reg64(tmp, a->vd); + vfp_store_reg64(tmp, a->vd); tcg_temp_free_ptr(fpst); tcg_temp_free_i64(tmp); return true; @@ -2902,7 +2902,7 @@ 
static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a) vd = tcg_temp_new_i64(); vfp_load_reg32(vm, a->vm); gen_helper_vfp_fcvtds(vd, vm, cpu_env); - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_i64(vd); return true; @@ -2928,7 +2928,7 @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a) vd = tcg_temp_new_i32(); vm = tcg_temp_new_i64(); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); gen_helper_vfp_fcvtsd(vd, vm, cpu_env); vfp_store_reg32(vd, a->vd); tcg_temp_free_i32(vd); @@ -3024,7 +3024,7 @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a) /* u32 -> f64 */ gen_helper_vfp_uitod(vd, vm, fpst); } - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_i32(vm); tcg_temp_free_i64(vd); tcg_temp_free_ptr(fpst); @@ -3055,7 +3055,7 @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a) vm = tcg_temp_new_i64(); vd = tcg_temp_new_i32(); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); gen_helper_vjcvt(vd, vm, cpu_env); vfp_store_reg32(vd, a->vd); tcg_temp_free_i64(vm); @@ -3204,7 +3204,7 @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a) frac_bits = (a->opc & 1) ? 
(32 - a->imm) : (16 - a->imm); vd = tcg_temp_new_i64(); - neon_load_reg64(vd, a->vd); + vfp_load_reg64(vd, a->vd); fpst = fpstatus_ptr(FPST_FPCR); shift = tcg_const_i32(frac_bits); @@ -3239,7 +3239,7 @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a) g_assert_not_reached(); } - neon_store_reg64(vd, a->vd); + vfp_store_reg64(vd, a->vd); tcg_temp_free_i64(vd); tcg_temp_free_i32(shift); tcg_temp_free_ptr(fpst); @@ -3340,7 +3340,7 @@ static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a) fpst = fpstatus_ptr(FPST_FPCR); vm = tcg_temp_new_i64(); vd = tcg_temp_new_i32(); - neon_load_reg64(vm, a->vm); + vfp_load_reg64(vm, a->vm); if (a->s) { if (a->rz) {

From patchwork Wed Oct 28 03:27:02 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 311521
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 10/11] target/arm: Simplify do_long_3d and do_2scalar_long
Date: Tue, 27 Oct 2020 20:27:02 -0700
Message-Id: <20201028032703.201526-11-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org>
References: <20201028032703.201526-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

In both cases, we can sink the write-back and perform
the accumulate into the normal destination temps.

Signed-off-by: Richard Henderson
---
 target/arm/translate-neon.c.inc | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index 7a0269970c..7cd41c79ec 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -2037,17 +2037,14 @@ static bool do_long_3d(DisasContext *s, arg_3diff *a,
     if (accfn) {
         tmp = tcg_temp_new_i64();
         read_neon_element64(tmp, a->vd, 0, MO_64);
-        accfn(tmp, tmp, rd0);
-        write_neon_element64(tmp, a->vd, 0, MO_64);
+        accfn(rd0, tmp, rd0);
         read_neon_element64(tmp, a->vd, 1, MO_64);
-        accfn(tmp, tmp, rd1);
-        write_neon_element64(tmp, a->vd, 1, MO_64);
+        accfn(rd1, tmp, rd1);
         tcg_temp_free_i64(tmp);
-    } else {
-        write_neon_element64(rd0, a->vd, 0, MO_64);
-        write_neon_element64(rd1, a->vd, 1, MO_64);
     }
 
+    write_neon_element64(rd0, a->vd, 0, MO_64);
+    write_neon_element64(rd1, a->vd, 1, MO_64);
     tcg_temp_free_i64(rd0);
     tcg_temp_free_i64(rd1);
 
@@ -2670,16 +2667,14 @@ static bool do_2scalar_long(DisasContext *s, arg_2scalar *a,
     if (accfn) {
         TCGv_i64 t64 = tcg_temp_new_i64();
         read_neon_element64(t64, a->vd, 0, MO_64);
-        accfn(t64, t64, rn0_64);
-        write_neon_element64(t64, a->vd, 0, MO_64);
+        accfn(rn0_64, t64, rn0_64);
         read_neon_element64(t64, a->vd, 1, MO_64);
-        accfn(t64, t64, rn1_64);
-        write_neon_element64(t64, a->vd, 1, MO_64);
+        accfn(rn1_64, t64, rn1_64);
         tcg_temp_free_i64(t64);
-    } else {
-        write_neon_element64(rn0_64, a->vd, 0, MO_64);
-        write_neon_element64(rn1_64, a->vd, 1, MO_64);
     }
+
+    write_neon_element64(rn0_64, a->vd, 0, MO_64);
+    write_neon_element64(rn1_64, a->vd, 1, MO_64);
     tcg_temp_free_i64(rn0_64);
     tcg_temp_free_i64(rn1_64);
     return true;

From patchwork Wed Oct 28 03:27:03 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 311524
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 11/11] target/arm: Improve do_prewiden_3d
Date: Tue, 27 Oct 2020 20:27:03 -0700
Message-Id: <20201028032703.201526-12-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20201028032703.201526-1-richard.henderson@linaro.org>
References: <20201028032703.201526-1-richard.henderson@linaro.org>
Cc: peter.maydell@linaro.org

We can use proper widening loads to extend 32-bit inputs,
and skip the "widenfn" step.
Signed-off-by: Richard Henderson
---
 target/arm/translate.c          |  6 +++
 target/arm/translate-neon.c.inc | 66 ++++++++++++++++++---------------
 2 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 7611c1f0f1..29ea1eb781 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1183,6 +1183,12 @@ static void read_neon_element64(TCGv_i64 dest, int reg, int ele, MemOp memop)
     long off = neon_element_offset(reg, ele, memop);
 
     switch (memop) {
+    case MO_SL:
+        tcg_gen_ld32s_i64(dest, cpu_env, off);
+        break;
+    case MO_UL:
+        tcg_gen_ld32u_i64(dest, cpu_env, off);
+        break;
     case MO_Q:
         tcg_gen_ld_i64(dest, cpu_env, off);
         break;
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index 7cd41c79ec..8f33b54067 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -1788,11 +1788,10 @@ static bool trans_Vimm_1r(DisasContext *s, arg_1reg_imm *a)
 static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
                            NeonGenWidenFn *widenfn,
                            NeonGenTwo64OpFn *opfn,
-                           bool src1_wide)
+                           int src1_mop, int src2_mop)
 {
     /* 3-regs different lengths, prewidening case (VADDL/VSUBL/VAADW/VSUBW) */
     TCGv_i64 rn0_64, rn1_64, rm_64;
-    TCGv_i32 rm;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return false;
@@ -1804,12 +1803,12 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
         return false;
     }
 
-    if (!widenfn || !opfn) {
+    if (!opfn) {
         /* size == 3 case, which is an entirely different insn group */
         return false;
     }
 
-    if ((a->vd & 1) || (src1_wide && (a->vn & 1))) {
+    if ((a->vd & 1) || (src1_mop == MO_Q && (a->vn & 1))) {
         return false;
     }
 
@@ -1821,40 +1820,48 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
     rn1_64 = tcg_temp_new_i64();
     rm_64 = tcg_temp_new_i64();
 
-    if (src1_wide) {
-        read_neon_element64(rn0_64, a->vn, 0, MO_64);
+    if (src1_mop >= 0) {
+        read_neon_element64(rn0_64, a->vn, 0, src1_mop);
     } else {
         TCGv_i32 tmp = tcg_temp_new_i32();
         read_neon_element32(tmp, a->vn, 0, MO_32);
         widenfn(rn0_64, tmp);
         tcg_temp_free_i32(tmp);
     }
-    rm = tcg_temp_new_i32();
-    read_neon_element32(rm, a->vm, 0, MO_32);
+    if (src2_mop >= 0) {
+        read_neon_element64(rm_64, a->vm, 0, src2_mop);
+    } else {
+        TCGv_i32 tmp = tcg_temp_new_i32();
+        read_neon_element32(tmp, a->vm, 0, MO_32);
+        widenfn(rm_64, tmp);
+        tcg_temp_free_i32(tmp);
+    }
 
-    widenfn(rm_64, rm);
-    tcg_temp_free_i32(rm);
     opfn(rn0_64, rn0_64, rm_64);
 
     /*
      * Load second pass inputs before storing the first pass result, to
      * avoid incorrect results if a narrow input overlaps with the result.
      */
-    if (src1_wide) {
-        read_neon_element64(rn1_64, a->vn, 1, MO_64);
+    if (src1_mop >= 0) {
+        read_neon_element64(rn1_64, a->vn, 1, src1_mop);
     } else {
         TCGv_i32 tmp = tcg_temp_new_i32();
         read_neon_element32(tmp, a->vn, 1, MO_32);
         widenfn(rn1_64, tmp);
         tcg_temp_free_i32(tmp);
     }
-    rm = tcg_temp_new_i32();
-    read_neon_element32(rm, a->vm, 1, MO_32);
+    if (src2_mop >= 0) {
+        read_neon_element64(rm_64, a->vm, 1, src2_mop);
+    } else {
+        TCGv_i32 tmp = tcg_temp_new_i32();
+        read_neon_element32(tmp, a->vm, 1, MO_32);
+        widenfn(rm_64, tmp);
+        tcg_temp_free_i32(tmp);
+    }
 
     write_neon_element64(rn0_64, a->vd, 0, MO_64);
 
-    widenfn(rm_64, rm);
-    tcg_temp_free_i32(rm);
     opfn(rn1_64, rn1_64, rm_64);
     write_neon_element64(rn1_64, a->vd, 1, MO_64);
 
@@ -1865,14 +1872,13 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
     return true;
 }
 
-#define DO_PREWIDEN(INSN, S, EXT, OP, SRC1WIDE)                         \
+#define DO_PREWIDEN(INSN, S, OP, SRC1WIDE, SIGN)                        \
     static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)        \
     {                                                                   \
         static NeonGenWidenFn * const widenfn[] = {                     \
             gen_helper_neon_widen_##S##8,                               \
             gen_helper_neon_widen_##S##16,                              \
-            tcg_gen_##EXT##_i32_i64,                                    \
-            NULL,                                                       \
+            NULL, NULL,                                                 \
         };                                                              \
         static NeonGenTwo64OpFn * const addfn[] = {                     \
             gen_helper_neon_##OP##l_u16,                                \
@@ -1880,18 +1886,20 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
             tcg_gen_##OP##_i64,                                         \
             NULL,                                                       \
         };                                                              \
-        return do_prewiden_3d(s, a, widenfn[a->size],                   \
-                              addfn[a->size], SRC1WIDE);                \
+        int narrow_mop = a->size == MO_32 ? MO_32 | SIGN : -1;          \
+        return do_prewiden_3d(s, a, widenfn[a->size], addfn[a->size],   \
+                              SRC1WIDE ? MO_Q : narrow_mop,             \
+                              narrow_mop);                              \
     }
 
-DO_PREWIDEN(VADDL_S, s, ext, add, false)
-DO_PREWIDEN(VADDL_U, u, extu, add, false)
-DO_PREWIDEN(VSUBL_S, s, ext, sub, false)
-DO_PREWIDEN(VSUBL_U, u, extu, sub, false)
-DO_PREWIDEN(VADDW_S, s, ext, add, true)
-DO_PREWIDEN(VADDW_U, u, extu, add, true)
-DO_PREWIDEN(VSUBW_S, s, ext, sub, true)
-DO_PREWIDEN(VSUBW_U, u, extu, sub, true)
+DO_PREWIDEN(VADDL_S, s, add, false, MO_SIGN)
+DO_PREWIDEN(VADDL_U, u, add, false, 0)
+DO_PREWIDEN(VSUBL_S, s, sub, false, MO_SIGN)
+DO_PREWIDEN(VSUBL_U, u, sub, false, 0)
+DO_PREWIDEN(VADDW_S, s, add, true, MO_SIGN)
+DO_PREWIDEN(VADDW_U, u, add, true, 0)
+DO_PREWIDEN(VSUBW_S, s, sub, true, MO_SIGN)
+DO_PREWIDEN(VSUBW_U, u, sub, true, 0)
 
 static bool do_narrow_3d(DisasContext *s, arg_3diff *a,
                          NeonGenTwo64OpFn *opfn, NeonGenNarrowFn *narrowfn)