diff mbox

[01/13] target-arm: A64: add support for conditional select

Message ID 1386280289-27636-2-git-send-email-peter.maydell@linaro.org
State Superseded
Headers show

Commit Message

Peter Maydell Dec. 5, 2013, 9:51 p.m. UTC
From: Claudio Fontana <claudio.fontana@linaro.org>

This patch adds support for the instruction group "C3.5.6
Conditional select": CSEL, CSINC, CSINV, CSNEG.

Signed-off-by: Claudio Fontana <claudio.fontana@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c |   56 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 54 insertions(+), 2 deletions(-)

Comments

Richard Henderson Dec. 5, 2013, 10:26 p.m. UTC | #1
On 12/06/2013 10:51 AM, Peter Maydell wrote:
> +    if (cond >= 0x0e) { /* condition "always" */
> +        tcg_src = read_cpu_reg(s, rn, sf);
> +        tcg_gen_mov_i64(tcg_rd, tcg_src);

I wonder if it's worth adding that 0x0[ef] case to the generic condition
processing rather than keep replicating it everywhere.

> +    } else {
> +        /* OPTME: we could use movcond here, at the cost of duplicating
> +         * a lot of the arm_gen_test_cc() logic.
> +         */

Honestly, arm_gen_test_cc should get refactored to a real test (as opposed to
branch) sooner rather than later.

Longer term it's probably worth recognizing the special case of Rm==31 &&
Rn==31 && else_inc as setcond as opposed to movcond.

> +        arm_gen_test_cc(cond, label_match);
> +        /* nomatch: */
> +        tcg_src = read_cpu_reg(s, rm, sf);
> +        tcg_gen_mov_i64(tcg_rd, tcg_src);
> +        if (else_inv) {
> +            tcg_gen_not_i64(tcg_rd, tcg_rd);
> +        }
> +        if (else_inc) {
> +            tcg_gen_addi_i64(tcg_rd, tcg_rd, 1);
> +        }

I think better as

  if (else_inv && else_inc) {
      tcg_gen_neg_i64(tcg_rd, tcg_src);
  } else if (else_inv) {
      tcg_gen_not_i64(tcg_rd, tcg_src);
  } else if (else_inc) {
      tcg_gen_addi_i64(tcg_rd, tcg_src, 1);
  } else {
      tcg_gen_mov_i64(tcg_rd, tcg_src);
  }

> +        if (!sf) {
> +            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
> +        }

I do wonder about the usefulness of passing SF (as opposed to hardcoding 1) to
read_cpu_reg to begin, since the ext32u that it generates is redundant with the
one here at the end, and likely cannot be optimized away.


r~
Peter Maydell Dec. 5, 2013, 10:31 p.m. UTC | #2
On 5 December 2013 22:26, Richard Henderson <rth@twiddle.net> wrote:
> On 12/06/2013 10:51 AM, Peter Maydell wrote:
>> +    if (cond >= 0x0e) { /* condition "always" */
>> +        tcg_src = read_cpu_reg(s, rn, sf);
>> +        tcg_gen_mov_i64(tcg_rd, tcg_src);
>
> I wonder if it's worth adding that 0x0[ef] case to the generic condition
> processing rather than keep replicating it everywhere.
>
>> +    } else {
>> +        /* OPTME: we could use movcond here, at the cost of duplicating
>> +         * a lot of the arm_gen_test_cc() logic.
>> +         */
>
> Honestly, arm_gen_test_cc should get refactored to a real test (as opposed to
> branch) sooner rather than later.

By "sooner rather than later" do you mean "as part of this patch series" ?

> Longer term it's probably worth recognizing the special case of Rm==31 &&
> Rn==31 && else_inc as setcond as opposed to movcond.
>
>> +        arm_gen_test_cc(cond, label_match);
>> +        /* nomatch: */
>> +        tcg_src = read_cpu_reg(s, rm, sf);
>> +        tcg_gen_mov_i64(tcg_rd, tcg_src);
>> +        if (else_inv) {
>> +            tcg_gen_not_i64(tcg_rd, tcg_rd);
>> +        }
>> +        if (else_inc) {
>> +            tcg_gen_addi_i64(tcg_rd, tcg_rd, 1);
>> +        }
>
> I think better as
>
>   if (else_inv && else_inc) {
>       tcg_gen_neg_i64(tcg_rd, tcg_src);
>   } else if (else_inv) {
>       tcg_gen_not_i64(tcg_rd, tcg_src);
>   } else if (else_inc) {
>       tcg_gen_addi_i64(tcg_rd, tcg_src, 1);
>   } else {
>       tcg_gen_mov_i64(tcg_rd, tcg_src);
>   }
>
>> +        if (!sf) {
>> +            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
>> +        }
>
> I do wonder about the usefulness of passing SF (as opposed to hardcoding 1) to
> read_cpu_reg to begin, since the ext32u that it generates is redundant with the
> one here at the end, and likely cannot be optimized away.

Mmm. I did think that was a little unfortunate but didn't think of the
idea of just passing 1 to read_cpu_reg() for some reason.

-- PMM
Richard Henderson Dec. 5, 2013, 10:40 p.m. UTC | #3
On 12/06/2013 11:31 AM, Peter Maydell wrote:
> On 5 December 2013 22:26, Richard Henderson <rth@twiddle.net> wrote:
>> On 12/06/2013 10:51 AM, Peter Maydell wrote:
>>> +    if (cond >= 0x0e) { /* condition "always" */
>>> +        tcg_src = read_cpu_reg(s, rn, sf);
>>> +        tcg_gen_mov_i64(tcg_rd, tcg_src);
>>
>> I wonder if it's worth adding that 0x0[ef] case to the generic condition
>> processing rather than keep replicating it everywhere.
>>
>>> +    } else {
>>> +        /* OPTME: we could use movcond here, at the cost of duplicating
>>> +         * a lot of the arm_gen_test_cc() logic.
>>> +         */
>>
>> Honestly, arm_gen_test_cc should get refactored to a real test (as opposed to
>> branch) sooner rather than later.
> 
> By "sooner rather than later" do you mean "as part of this patch series" ?

It might make later patch series easier.  But I won't insist.


r~
Peter Maydell Dec. 6, 2013, 12:45 p.m. UTC | #4
On 5 December 2013 22:26, Richard Henderson <rth@twiddle.net> wrote:
> On 12/06/2013 10:51 AM, Peter Maydell wrote:
>> +    if (cond >= 0x0e) { /* condition "always" */
>> +        tcg_src = read_cpu_reg(s, rn, sf);
>> +        tcg_gen_mov_i64(tcg_rd, tcg_src);
>
> I wonder if it's worth adding that 0x0[ef] case to the generic condition
> processing rather than keep replicating it everywhere.

I think "always true" is a special case anyway because you don't
want to emit any kind of branching/label logic at all.

>> +    } else {
>> +        /* OPTME: we could use movcond here, at the cost of duplicating
>> +         * a lot of the arm_gen_test_cc() logic.
>> +         */
>
> Honestly, arm_gen_test_cc should get refactored to a real test (as opposed to
> branch) sooner rather than later.

I had a think about this and I couldn't really come up with a particularly
nice looking API for it, given the way that movcondi/setcondi/brcondi work.
The best I could come up with was something like:

enum ARMCondType {
     CondSimple;
     CondAnd;
     CondOr;
};
typedef struct ARMCondState {
     int type;
     TCGv_i32 cmpop;
     TCGCond cond;
     bool cmpop_is_temp;
} ARMCondState;

/**
 * arm_cond_gen_test_cc:
 * @condstate: filled in with information about condition to check
 * @armcc: ARM condition value
 * @round: which of the (max 2) tests to deal with
 *
 * Decode the ARM condition value and identify which TCG condition
 * and operation to use. May emit code to set up the condition arguments.
 * The general idea is that you call with:
 *   arm_test_gen_cc(&condstate, armcc, 1);
 * and then emit a brcondi/movcondi/setcondi against zero with the
 * filled in condstate.cmpop and condstate.cond. If condstate.cmpop_is_temp
 * is true you then have to free cmpop with tcg_temp_free_i32().
 * If the condstate.type is CondAnd or CondOr then this is a condition
 * which is in two parts (eg gt: which is !Z && N == V). In this case then
 * (a) you are expected to set up multiple labels, feed movcond output
 * into a second movcond, etc to achieve the AND/OR effect and (b)
 * you should call the function a second time with round == 2 to get the
 * information for the second branch/comparison/test.
 */
void arm_gen_test_cc(ARMCondState *condstate, int armcc, int round);

But that seems pretty ugly to me.

The other awkwardness is that there's no easy way to do a "conditional
move of 64 bit values based on 32 bit comparison", which means you need
to sign extend the condstate.cmpop for 64 bit versions.

So I definitely think I'd rather postpone this for now, unless you have
a neat idea that I've missed for making it look nice.

>> +        arm_gen_test_cc(cond, label_match);
>> +        /* nomatch: */
>> +        tcg_src = read_cpu_reg(s, rm, sf);
>> +        tcg_gen_mov_i64(tcg_rd, tcg_src);
>> +        if (else_inv) {
>> +            tcg_gen_not_i64(tcg_rd, tcg_rd);
>> +        }
>> +        if (else_inc) {
>> +            tcg_gen_addi_i64(tcg_rd, tcg_rd, 1);
>> +        }
>
> I think better as
>
>   if (else_inv && else_inc) {
>       tcg_gen_neg_i64(tcg_rd, tcg_src);
>   } else if (else_inv) {
>       tcg_gen_not_i64(tcg_rd, tcg_src);
>   } else if (else_inc) {
>       tcg_gen_addi_i64(tcg_rd, tcg_src, 1);
>   } else {
>       tcg_gen_mov_i64(tcg_rd, tcg_src);
>   }

Agreed, will fix.

>> +        if (!sf) {
>> +            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
>> +        }
>
> I do wonder about the usefulness of passing SF (as opposed to hardcoding 1) to
> read_cpu_reg to begin, since the ext32u that it generates is redundant with the
> one here at the end, and likely cannot be optimized away.

Will fix.

thanks
-- PMM
Richard Henderson Dec. 6, 2013, 4:44 p.m. UTC | #5
On 12/07/2013 01:45 AM, Peter Maydell wrote:
> On 5 December 2013 22:26, Richard Henderson <rth@twiddle.net> wrote:
>> On 12/06/2013 10:51 AM, Peter Maydell wrote:
>>> +    if (cond >= 0x0e) { /* condition "always" */
>>> +        tcg_src = read_cpu_reg(s, rn, sf);
>>> +        tcg_gen_mov_i64(tcg_rd, tcg_src);
>>
>> I wonder if it's worth adding that 0x0[ef] case to the generic condition
>> processing rather than keep replicating it everywhere.
> 
> I think "always true" is a special case anyway because you don't
> want to emit any kind of branching/label logic at all.

Sure, but unlike unconditional branches, which are useful to special-case, one
sort of expects never to see an unconditional conditional move.  Given
TCG_COND_ALWAYS, we can re-use generic logic and have things fall out
relatively easily.

> I had a think about this and I couldn't really come up with a particularly
> nice looking API for it, given the way that movcondi/setcondi/brcondi work.
> The best I could come up with was something like:

The s390 target has an example with DisasCompare and disas_jcc.
The i386 target has another example with CCPrepare and gen_prepare_cc.

The i386 port uses similar flags to ARM, but represents them differently.  I
suppose good ideas could be taken from either port.


> So I definitely think I'd rather postpone this for now, unless you have
> a neat idea that I've missed for making it look nice.

Let's just postpone for now.


r~
Peter Maydell Dec. 6, 2013, 5:23 p.m. UTC | #6
On 6 December 2013 16:44, Richard Henderson <rth@twiddle.net> wrote:
> On 12/07/2013 01:45 AM, Peter Maydell wrote:
>> On 5 December 2013 22:26, Richard Henderson <rth@twiddle.net> wrote:
>>> On 12/06/2013 10:51 AM, Peter Maydell wrote:
>>>> +    if (cond >= 0x0e) { /* condition "always" */
>>>> +        tcg_src = read_cpu_reg(s, rn, sf);
>>>> +        tcg_gen_mov_i64(tcg_rd, tcg_src);
>>>
>>> I wonder if it's worth adding that 0x0[ef] case to the generic condition
>>> processing rather than keep replicating it everywhere.
>>
>> I think "always true" is a special case anyway because you don't
>> want to emit any kind of branching/label logic at all.
>
> Sure, but unlike unconditional branches, which are useful to special-case, one
> sort of expects never to see an unconditional conditional move.  Given
> TCG_COND_ALWAYS, we can re-use generic logic and have things fall out
> relatively easily.

I guess. It doesn't seem much worth adding extra code to arm_gen_test_cc
that we expect to become redundant if/when we do these insns "properly"
with setcond/movcond, though, so I think it's OK like this for now.

thanks
-- PMM
diff mbox

Patch

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index fdc3ed8..a9c1803 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -724,10 +724,62 @@  static void disas_cc_reg(DisasContext *s, uint32_t insn)
     unsupported_encoding(s, insn);
 }
 
-/* Conditional select */
+/* C3.5.6 Conditional select
+ *   31   30  29  28             21 20  16 15  12 11 10 9    5 4    0
+ * +----+----+---+-----------------+------+------+-----+------+------+
+ * | sf | op | S | 1 1 0 1 0 1 0 0 |  Rm  | cond | op2 |  Rn  |  Rd  |
+ * +----+----+---+-----------------+------+------+-----+------+------+
+ */
 static void disas_cond_select(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    unsigned int sf, else_inv, rm, cond, else_inc, rn, rd;
+    TCGv_i64 tcg_rd, tcg_src;
+
+    if (extract32(insn, 29, 1) || extract32(insn, 11, 1)) {
+        /* S == 1 or op2<1> == 1 */
+        unallocated_encoding(s);
+        return;
+    }
+    sf = extract32(insn, 31, 1);
+    else_inv = extract32(insn, 30, 1);
+    rm = extract32(insn, 16, 5);
+    cond = extract32(insn, 12, 4);
+    else_inc = extract32(insn, 10, 1);
+    rn = extract32(insn, 5, 5);
+    rd = extract32(insn, 0, 5);
+    tcg_rd = cpu_reg(s, rd);
+
+    if (cond >= 0x0e) { /* condition "always" */
+        tcg_src = read_cpu_reg(s, rn, sf);
+        tcg_gen_mov_i64(tcg_rd, tcg_src);
+    } else {
+        /* OPTME: we could use movcond here, at the cost of duplicating
+         * a lot of the arm_gen_test_cc() logic.
+         */
+        int label_match = gen_new_label();
+        int label_continue = gen_new_label();
+
+        arm_gen_test_cc(cond, label_match);
+        /* nomatch: */
+        tcg_src = read_cpu_reg(s, rm, sf);
+        tcg_gen_mov_i64(tcg_rd, tcg_src);
+        if (else_inv) {
+            tcg_gen_not_i64(tcg_rd, tcg_rd);
+        }
+        if (else_inc) {
+            tcg_gen_addi_i64(tcg_rd, tcg_rd, 1);
+        }
+        if (!sf) {
+            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+        }
+        tcg_gen_br(label_continue);
+        /* match: */
+        gen_set_label(label_match);
+        tcg_src = read_cpu_reg(s, rn, sf);
+        tcg_gen_mov_i64(tcg_rd, tcg_src);
+        /* continue: */
+        gen_set_label(label_continue);
+    }
 }
 
 /* Data-processing (1 source) */