
tcg/riscv: Fix reg overlap case in tcg_out_addsub2

Message ID 20221020233836.2341671-1-richard.henderson@linaro.org
State Accepted
Commit 9b246685b3dbbf21800e3a9a09f8bed384a1fb37
Series tcg/riscv: Fix reg overlap case in tcg_out_addsub2

Commit Message

Richard Henderson Oct. 20, 2022, 11:38 p.m. UTC
There was a typo using opc_addi instead of opc_add with the
two registers.  While we're at it, simplify the gating test
to al == bl to improve dynamic scheduling even when the
output register does not overlap the inputs.

Reported-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
Supersedes: 20221020104154.4276-4-zhiwei_liu@linux.alibaba.com
("[RFC PATCH 3/3] tcg/riscv: Remove a wrong optimization for addsub2")
---
 tcg/riscv/tcg-target.c.inc | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
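
For readers following along, the reasoning in the commit message can be checked with a small standalone model. This is only a sketch and not QEMU code: the register file, the TMP0 index and the function names are hypothetical, and the generic path is assumed to compare the sum against the bl register, since the comparison operand is cut off in the hunk. It models the low-half add and carry computation for the case where both inputs are the same register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 4-entry register file; index 3 stands in for TCG_REG_TMP0. */
enum { NREGS = 4, TMP0 = 3 };

/*
 * Generic path (assumed shape): add rl, al, bl ; sltu tmp0, rl, bl.
 * The carry is recovered by comparing the sum with an input, which
 * only works if that input has not been overwritten by the add.
 */
static uint64_t carry_generic(uint64_t *regs, int rl, int al, int bl)
{
    regs[rl] = regs[al] + regs[bl];
    regs[TMP0] = regs[rl] < regs[bl];
    return regs[TMP0];
}

/*
 * Doubling path (al == bl): slti tmp0, al, 0 ; add rl, al, al.
 * The carry-out of a + a is the msb of a, so it is read from the
 * input before the add can clobber it.
 */
static uint64_t carry_doubling(uint64_t *regs, int rl, int al)
{
    regs[TMP0] = (int64_t)regs[al] < 0;
    regs[rl] = regs[al] + regs[al];
    return regs[TMP0];
}

/* Reference carry-out of an unsigned 64-bit addition. */
static uint64_t carry_reference(uint64_t a, uint64_t b)
{
    return (uint64_t)(a + b) < a;
}
```

With rl == al == bl and the msb set in the input, carry_generic compares the doubled value with itself and reports no carry, while carry_doubling agrees with carry_reference. The doubling path also gives the right answer when rl is a different register, which is consistent with gating on al == bl alone.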

Comments

Alistair Francis Oct. 25, 2022, 2:20 a.m. UTC | #1
On Fri, Oct 21, 2022 at 9:47 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> There was a typo using opc_addi instead of opc_add with the
> two registers.  While we're at it, simplify the gating test
> to al == bl to improve dynamic scheduling even when the
> output register does not overlap the inputs.
>
> Reported-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>

Alistair

> ---
> Supersedes: 20221020104154.4276-4-zhiwei_liu@linux.alibaba.com
> ("[RFC PATCH 3/3] tcg/riscv: Remove a wrong optimization for addsub2")
> ---
>  tcg/riscv/tcg-target.c.inc | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 81a83e45b1..1cdaf7b57b 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -687,9 +687,15 @@ static void tcg_out_addsub2(TCGContext *s,
>          if (cbl) {
>              tcg_out_opc_imm(s, opc_addi, rl, al, bl);
>              tcg_out_opc_imm(s, OPC_SLTIU, TCG_REG_TMP0, rl, bl);
> -        } else if (rl == al && rl == bl) {
> +        } else if (al == bl) {
> +            /*
> +             * If the input regs overlap, this is a simple doubling
> +             * and carry-out is the input msb.  This special case is
> +             * required when the output reg overlaps the input,
> +             * but we might as well use it always.
> +             */
>              tcg_out_opc_imm(s, OPC_SLTI, TCG_REG_TMP0, al, 0);
> -            tcg_out_opc_reg(s, opc_addi, rl, al, bl);
> +            tcg_out_opc_reg(s, opc_add, rl, al, al);
>          } else {
>              tcg_out_opc_reg(s, opc_add, rl, al, bl);
>              tcg_out_opc_reg(s, OPC_SLTU, TCG_REG_TMP0,
> --
> 2.34.1
>
>
Alistair Francis Oct. 25, 2022, 3:12 a.m. UTC | #2
On Fri, Oct 21, 2022 at 9:47 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> There was a typo using opc_addi instead of opc_add with the
> two registers.  While we're at it, simplify the gating test
> to al == bl to improve dynamic scheduling even when the
> output register does not overlap the inputs.
>
> Reported-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Thanks!

Applied to riscv-to-apply.next

Alistair


Patch

diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 81a83e45b1..1cdaf7b57b 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -687,9 +687,15 @@  static void tcg_out_addsub2(TCGContext *s,
         if (cbl) {
             tcg_out_opc_imm(s, opc_addi, rl, al, bl);
             tcg_out_opc_imm(s, OPC_SLTIU, TCG_REG_TMP0, rl, bl);
-        } else if (rl == al && rl == bl) {
+        } else if (al == bl) {
+            /*
+             * If the input regs overlap, this is a simple doubling
+             * and carry-out is the input msb.  This special case is
+             * required when the output reg overlaps the input,
+             * but we might as well use it always.
+             */
             tcg_out_opc_imm(s, OPC_SLTI, TCG_REG_TMP0, al, 0);
-            tcg_out_opc_reg(s, opc_addi, rl, al, bl);
+            tcg_out_opc_reg(s, opc_add, rl, al, al);
         } else {
             tcg_out_opc_reg(s, opc_add, rl, al, bl);
             tcg_out_opc_reg(s, OPC_SLTU, TCG_REG_TMP0,