| Message ID | 20230307183503.2512684-21-richard.henderson@linaro.org |
|---|---|
| State | Superseded |
| Series | tcg: Remove tcg_const_* |
On 3/7/23 15:34, Richard Henderson wrote:
> Compute all carry bits in parallel instead of a loop.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Hmmmm the function was added by 6addef4d27268 9 months ago. All tcg ops you used
here were available back then.

I guess this existing implementation was an oversight on our end.

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
On 3/7/23 13:51, Daniel Henrique Barboza wrote:
>
> On 3/7/23 15:34, Richard Henderson wrote:
>> Compute all carry bits in parallel instead of a loop.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>
> Hmmmm the function was added by 6addef4d27268 9 months ago. All tcg ops you used
> here were available back then.
>
> I guess this existing implementation was an oversight on our end.

I seem to have missed the whole patch set, which is a shame -- I could have
pointed you to a similar function for hppa.

r~
```diff
diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
index 20ea484c3d..02d86b77a8 100644
--- a/target/ppc/translate/fixedpoint-impl.c.inc
+++ b/target/ppc/translate/fixedpoint-impl.c.inc
@@ -484,33 +484,35 @@ static bool trans_PEXTD(DisasContext *ctx, arg_X *a)
 
 static bool trans_ADDG6S(DisasContext *ctx, arg_X *a)
 {
-    const uint64_t carry_bits = 0x1111111111111111ULL;
-    TCGv t0, t1, carry, zero = tcg_constant_tl(0);
+    const target_ulong carry_bits = (target_ulong)-1 / 0xf;
+    TCGv in1, in2, carryl, carryh, tmp;
+    TCGv zero = tcg_constant_tl(0);
 
     REQUIRE_INSNS_FLAGS2(ctx, BCDA_ISA206);
 
-    t0 = tcg_temp_new();
-    t1 = tcg_const_tl(0);
-    carry = tcg_const_tl(0);
+    in1 = cpu_gpr[a->ra];
+    in2 = cpu_gpr[a->rb];
+    tmp = tcg_temp_new();
+    carryl = tcg_temp_new();
+    carryh = tcg_temp_new();
 
-    for (int i = 0; i < 16; i++) {
-        tcg_gen_shri_tl(t0, cpu_gpr[a->ra], i * 4);
-        tcg_gen_andi_tl(t0, t0, 0xf);
-        tcg_gen_add_tl(t1, t1, t0);
+    /* Addition with carry. */
+    tcg_gen_add2_tl(carryl, carryh, in1, zero, in2, zero);
+    /* Addition without carry. */
+    tcg_gen_xor_tl(tmp, in1, in2);
+    /* Difference between the two is carry in to each bit. */
+    tcg_gen_xor_tl(carryl, carryl, tmp);
 
-        tcg_gen_shri_tl(t0, cpu_gpr[a->rb], i * 4);
-        tcg_gen_andi_tl(t0, t0, 0xf);
-        tcg_gen_add_tl(t1, t1, t0);
+    /*
+     * The carry-out that we're looking for is the carry-in to
+     * the next nibble.  Shift the double-word down one nibble,
+     * which puts all of the bits back into one word.
+     */
+    tcg_gen_extract2_tl(carryl, carryl, carryh, 4);
 
-        tcg_gen_andi_tl(t1, t1, 0x10);
-        tcg_gen_setcond_tl(TCG_COND_NE, t1, t1, zero);
-
-        tcg_gen_shli_tl(t0, t1, i * 4);
-        tcg_gen_or_tl(carry, carry, t0);
-    }
-
-    tcg_gen_xori_tl(carry, carry, (target_long)carry_bits);
-    tcg_gen_muli_tl(cpu_gpr[a->rt], carry, 6);
+    /* Invert, isolate the carry bits, and produce 6's. */
+    tcg_gen_andc_tl(carryl, tcg_constant_tl(carry_bits), carryl);
+    tcg_gen_muli_tl(cpu_gpr[a->rt], carryl, 6);
     return true;
 }
```
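The new sequence relies on the half-adder identity: each bit of a sum is `a ^ b ^ carry_in`, so `(a + b) ^ a ^ b` yields the carry-in to every bit position at once; the carry out of nibble i is just the carry in to bit 4*(i+1), which the `tcg_gen_extract2_tl` funnel shift moves back down into place. As an illustration only (not part of the patch, and with a made-up helper name), the same computation on plain 64-bit host integers looks roughly like this:

```c
#include <stdint.h>

/* Sketch of the parallel ADDG6S computation on host integers. */
uint64_t addg6s_parallel(uint64_t a, uint64_t b)
{
    const uint64_t nibble_lsbs = UINT64_MAX / 0xf;   /* 0x1111...1111 */

    uint64_t sum = a + b;                  /* addition with carry (tcg_gen_add2_tl) */
    uint64_t carry_out = sum < a;          /* carry out of bit 63 */
    uint64_t no_carry = a ^ b;             /* addition without carry */
    uint64_t carry_in = sum ^ no_carry;    /* carry-in to each bit */

    /*
     * The carry out of nibble i is the carry in to bit 4*(i+1).
     * Shift the 65-bit carry vector down one nibble so that bit 4*i
     * holds the carry out of nibble i (tcg_gen_extract2_tl).
     */
    uint64_t nibble_carry = (carry_in >> 4) | (carry_out << 60);

    /* Invert, isolate the carry bits, and produce 6's. */
    return (nibble_lsbs & ~nibble_carry) * 6;
}
```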
Compute all carry bits in parallel instead of a loop.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: Cédric Le Goater <clg@kaod.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Greg Kurz <groug@kaod.org>
Cc: qemu-ppc@nongnu.org
---
 target/ppc/translate/fixedpoint-impl.c.inc | 44 +++++++++++-----------
 1 file changed, 23 insertions(+), 21 deletions(-)
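To make the "instead of a loop" claim concrete, here is a nibble-at-a-time reference in the spirit of the removed implementation, cross-checked against the parallel formulation sketched above. Again, this is only an illustrative test harness with invented names, not QEMU code; it assumes the earlier snippet is compiled into the same program.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

uint64_t addg6s_parallel(uint64_t a, uint64_t b);   /* from the sketch above */

/* Nibble-at-a-time reference, mirroring the removed loop. */
static uint64_t addg6s_loop(uint64_t a, uint64_t b)
{
    uint64_t carries = 0;
    unsigned carry = 0;

    for (int i = 0; i < 16; i++) {
        unsigned da = (a >> (i * 4)) & 0xf;
        unsigned db = (b >> (i * 4)) & 0xf;

        carry = (da + db + carry) >> 4;              /* carry out of nibble i */
        carries |= (uint64_t)carry << (i * 4);
    }

    /* A 6 in every nibble that did not produce a carry, 0 elsewhere. */
    return (0x1111111111111111ull & ~carries) * 6;
}

int main(void)
{
    uint64_t samples[] = {
        0, 1, 0xf, 0x9999999999999999ull, 0x123456789abcdef0ull, UINT64_MAX,
    };
    size_t n = sizeof(samples) / sizeof(samples[0]);

    /* A couple of easily hand-checked values... */
    assert(addg6s_loop(0, 0) == 0x6666666666666666ull);
    assert(addg6s_loop(0xf, 1) == 0x6666666666666660ull);

    /* ...and agreement between the loop and parallel formulations. */
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            assert(addg6s_loop(samples[i], samples[j]) ==
                   addg6s_parallel(samples[i], samples[j]));
        }
    }
    printf("loop and parallel versions agree\n");
    return 0;
}
```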