[v2,09/29] target/arm: Simplify do_reduction_op

Message ID	20240909162240.647173-10-richard.henderson@linaro.org
State	Superseded
Headers	show Delivered-To: patch@linaro.org Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; From: Richard Henderson <richard.henderson@linaro.org> To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell <peter.maydell@linaro.org> Subject: [PATCH v2 09/29] target/arm: Simplify do_reduction_op Date: Mon, 9 Sep 2024 09:22:19 -0700 Message-ID: <20240909162240.647173-10-richard.henderson@linaro.org> In-Reply-To: <20240909162240.647173-1-richard.henderson@linaro.org> References: <20240909162240.647173-1-richard.henderson@linaro.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=2607:f8b0:4864:20::636; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x636.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action Precedence: list Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org
Series	target/arm: AdvSIMD decodetree conversion, part 4 \| expand [v2,00/29] target/arm: AdvSIMD decodetree conversion, part 4 [v2,01/29] target/arm: Replace tcg_gen_dupi_vec with constants in gengvec.c [v2,02/29] target/arm: Replace tcg_gen_dupi_vec with constants in translate-sve.c [v2,03/29] target/arm: Use cmpsel in gen_ushl_vec [v2,04/29] target/arm: Use cmpsel in gen_sshl_vec [v2,05/29] target/arm: Use tcg_gen_extract2_i64 for EXT [v2,06/29] target/arm: Convert EXT to decodetree [v2,07/29] target/arm: Convert TBL, TBX to decodetree [v2,08/29] target/arm: Convert UZP, TRN, ZIP to decodetree [v2,09/29] target/arm: Simplify do_reduction_op [v2,10/29] target/arm: Convert ADDV, ADDLV, MAXV, *MINV to decodetree [v2,11/29] target/arm: Convert FMAXNMV, FMINNMV, FMAXV, FMINV to decodetree [v2,12/29] target/arm: Convert FMOVI (scalar, immediate) to decodetree [v2,13/29] target/arm: Convert MOVI, FMOV, ORR, BIC (vector immediate) to decodetree [v2,14/29] target/arm: Introduce gen_gvec_sshr, gen_gvec_ushr [v2,15/29] target/arm: Fix whitespace near gen_srshr64_i64 [v2,16/29] target/arm: Convert handle_vec_simd_shri to decodetree [v2,17/29] target/arm: Convert handle_vec_simd_shli to decodetree [v2,18/29] target/arm: Use {, s}extract in handle_vec_simd_wshli [v2,19/29] target/arm: Convert SSHLL, USHLL to decodetree [v2,20/29] target/arm: Push tcg_rnd into handle_shri_with_rndacc [v2,21/29] target/arm: Split out subroutines of handle_shri_with_rndacc [v2,22/29] target/arm: Convert SHRN, RSHRN to decodetree [v2,23/29] target/arm: Convert handle_scalar_simd_shri to decodetree [v2,24/29] target/arm: Convert handle_scalar_simd_shli to decodetree [v2,25/29] target/arm: Convert VQSHL, VQSHLU to gvec [v2,26/29] target/arm: Widen NeonGenNarrowEnvFn return to 64 bits [v2,27/29] target/arm: Convert SQSHL, UQSHL, SQSHLU (immediate) to decodetree [v2,28/29] target/arm: Convert vector [US]QSHRN, [US]QRSHRN, SQSHRUN to decodetree [v2,29/29] target/arm: Convert scalar [US]QSHRN, [US]QRSHRN, SQSHRUN to decodetree

Message ID

20240909162240.647173-10-richard.henderson@linaro.org

State

Superseded

Headers

Received-SPF: pass (google.com: domain of
 qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as
 permitted sender) client-ip=209.51.188.17;
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org,
	Peter Maydell <peter.maydell@linaro.org>
Subject: [PATCH v2 09/29] target/arm: Simplify do_reduction_op
Date: Mon,  9 Sep 2024 09:22:19 -0700
Message-ID: <20240909162240.647173-10-richard.henderson@linaro.org>
In-Reply-To: <20240909162240.647173-1-richard.henderson@linaro.org>
References: <20240909162240.647173-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=2607:f8b0:4864:20::636;
 envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x636.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org

Series

target/arm: AdvSIMD decodetree conversion, part 4 | expand

Commit Message

Richard Henderson Sept. 9, 2024, 4:22 p.m. UTC

Use simple shift and add instead of ctpop, ctz, shift and mask.
Unlike SVE, there is no predicate to disable elements.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-a64.c | 40 +++++++++++-----------------------
 1 file changed, 13 insertions(+), 27 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 04160b2513..74efb35164 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -9027,34 +9027,23 @@  static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
  * important for correct NaN propagation that we do these
  * operations in exactly the order specified by the pseudocode.
  *
- * This is a recursive function, TCG temps should be freed by the
- * calling function once it is done with the values.
+ * This is a recursive function.
  */
 static TCGv_i32 do_reduction_op(DisasContext *s, int fpopcode, int rn,
-                                int esize, int size, int vmap, TCGv_ptr fpst)
+                                MemOp esz, int ebase, int ecount, TCGv_ptr fpst)
 {
-    if (esize == size) {
-        int element;
-        MemOp msize = esize == 16 ? MO_16 : MO_32;
-        TCGv_i32 tcg_elem;
-
-        /* We should have one register left here */
-        assert(ctpop8(vmap) == 1);
-        element = ctz32(vmap);
-        assert(element < 8);
-
-        tcg_elem = tcg_temp_new_i32();
-        read_vec_element_i32(s, tcg_elem, rn, element, msize);
+    if (ecount == 1) {
+        TCGv_i32 tcg_elem = tcg_temp_new_i32();
+        read_vec_element_i32(s, tcg_elem, rn, ebase, esz);
         return tcg_elem;
     } else {
-        int bits = size / 2;
-        int shift = ctpop8(vmap) / 2;
-        int vmap_lo = (vmap >> shift) & vmap;
-        int vmap_hi = (vmap & ~vmap_lo);
+        int half = ecount >> 1;
         TCGv_i32 tcg_hi, tcg_lo, tcg_res;
 
-        tcg_hi = do_reduction_op(s, fpopcode, rn, esize, bits, vmap_hi, fpst);
-        tcg_lo = do_reduction_op(s, fpopcode, rn, esize, bits, vmap_lo, fpst);
+        tcg_hi = do_reduction_op(s, fpopcode, rn, esz,
+                                 ebase + half, half, fpst);
+        tcg_lo = do_reduction_op(s, fpopcode, rn, esz,
+                                 ebase, half, fpst);
         tcg_res = tcg_temp_new_i32();
 
         switch (fpopcode) {
@@ -9105,7 +9094,6 @@  static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
     bool is_u = extract32(insn, 29, 1);
     bool is_fp = false;
     bool is_min = false;
-    int esize;
     int elements;
     int i;
     TCGv_i64 tcg_res, tcg_elt;
@@ -9152,8 +9140,7 @@  static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
         return;
     }
 
-    esize = 8 << size;
-    elements = (is_q ? 128 : 64) / esize;
+    elements = (is_q ? 16 : 8) >> size;
 
     tcg_res = tcg_temp_new_i64();
     tcg_elt = tcg_temp_new_i64();
@@ -9208,9 +9195,8 @@  static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
          */
         TCGv_ptr fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
         int fpopcode = opcode | is_min << 4 | is_u << 5;
-        int vmap = (1 << elements) - 1;
-        TCGv_i32 tcg_res32 = do_reduction_op(s, fpopcode, rn, esize,
-                                             (is_q ? 128 : 64), vmap, fpst);
+        TCGv_i32 tcg_res32 = do_reduction_op(s, fpopcode, rn, size,
+                                             0, elements, fpst);
         tcg_gen_extu_i32_i64(tcg_res, tcg_res32);
     }

[v2,09/29] target/arm: Simplify do_reduction_op

Commit Message

Patch