Message ID | 20210430202610.1136687-78-richard.henderson@linaro.org |
---|---|
State | New |
Headers | show |
Series | target/arm: Implement SVE2 | expand |
On Fri, 30 Apr 2021 at 22:23, Richard Henderson <richard.henderson@linaro.org> wrote: > > We were extracting the M register twice, once incorrectly > as M:vm and once correctly as rm. Remove the incorrect > name and remove the incorrect decode. > > Signed-off-by: Richard Henderson <richard.henderson@linaro.org> > --- > target/arm/neon-shared.decode | 4 +- > target/arm/translate-neon.c | 90 +++++++++++++++-------------------- > 2 files changed, 40 insertions(+), 54 deletions(-) > > diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode > index ca0c699072..facb621450 100644 > --- a/target/arm/neon-shared.decode > +++ b/target/arm/neon-shared.decode > @@ -61,8 +61,8 @@ VCMLA_scalar 1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \ > VCMLA_scalar 1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \ > vm=%vm_dp vn=%vn_dp vd=%vd_dp size=2 index=0 > > -VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \ > - vm=%vm_dp vn=%vn_dp vd=%vd_dp > +VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 vm:4 \ > + vn=%vn_dp vd=%vd_dp Is it possible to make this kind of bug a decodetree error? It seems unlikely that there's a use for having a bit which is decoded both by a %foo field specification and also in some other way... > > %vfml_scalar_q0_rm 0:3 5:1 > %vfml_scalar_q1_index 5:1 3:1 > diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c > index a0e267694b..52b75ff76f 100644 > --- a/target/arm/translate-neon.c > +++ b/target/arm/translate-neon.c > @@ -151,6 +151,36 @@ static void neon_store_element64(int reg, int ele, MemOp size, TCGv_i64 var) > } > } > > +static bool do_neon_ddda(DisasContext *s, int q, int vd, int vn, int vm, > + int data, gen_helper_gvec_4 *fn_gvec) This patch seems to be doing more than its commit message suggests. If we want to share code between trans_VDOT and trans_VDOT_scalar can we do that refactoring in its own patch, please ? thanks -- PMM
On 5/13/21 2:25 PM, Peter Maydell wrote: >> -VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \ >> - vm=%vm_dp vn=%vn_dp vd=%vd_dp >> +VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 vm:4 \ >> + vn=%vn_dp vd=%vd_dp > > Is it possible to make this kind of bug a decodetree error? > It seems unlikely that there's a use for having a bit which is > decoded both by a %foo field specification and also in some > other way... That's not what's happening here. This has separate fields "rm" and "vm" decoded in different ways. r~
On 5/13/21 2:25 PM, Peter Maydell wrote: >> +static bool do_neon_ddda(DisasContext *s, int q, int vd, int vn, int vm, >> + int data, gen_helper_gvec_4 *fn_gvec) > > This patch seems to be doing more than its commit message suggests. > If we want to share code between trans_VDOT and trans_VDOT_scalar > can we do that refactoring in its own patch, please ? It appears as if a rebasing error squashed two patches together (git commit --amend vs git rebase --continue after conflicts). I nearly made the identical mistake while splitting this apart again just now. :-P r~
On Sat, 15 May 2021 at 18:13, Richard Henderson <richard.henderson@linaro.org> wrote: > > On 5/13/21 2:25 PM, Peter Maydell wrote: > >> -VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \ > >> - vm=%vm_dp vn=%vn_dp vd=%vd_dp > >> +VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 vm:4 \ > >> + vn=%vn_dp vd=%vd_dp > > > > Is it possible to make this kind of bug a decodetree error? > > It seems unlikely that there's a use for having a bit which is > > decoded both by a %foo field specification and also in some > > other way... > > That's not what's happening here. This has separate fields "rm" and "vm" > decoded in different ways. But they overlap: rm:4 in the pattern itself is using bits [3:0], and "vm=%vm_dp" is also using [3:0] because the %vm_dp field specifier is defined as "5:1 0:4". I'm suggesting that if the pattern uses a field specifier we should check that none of the bits in that field specifier are used in the pattern for some other purpose (here 'u' and 'rm'). thanks -- PMM
On 5/16/21 11:09 AM, Peter Maydell wrote: > On Sat, 15 May 2021 at 18:13, Richard Henderson > <richard.henderson@linaro.org> wrote: >> >> On 5/13/21 2:25 PM, Peter Maydell wrote: >>>> -VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \ >>>> - vm=%vm_dp vn=%vn_dp vd=%vd_dp >>>> +VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 vm:4 \ >>>> + vn=%vn_dp vd=%vd_dp >>> >>> Is it possible to make this kind of bug a decodetree error? >>> It seems unlikely that there's a use for having a bit which is >>> decoded both by a %foo field specification and also in some >>> other way... >> >> That's not what's happening here. This has separate fields "rm" and "vm" >> decoded in different ways. > > But they overlap: rm:4 in the pattern itself is using bits [3:0], > and "vm=%vm_dp" is also using [3:0] because the %vm_dp field > specifier is defined as "5:1 0:4". I'm suggesting that if the > pattern uses a field specifier we should check that none of the > bits in that field specifier are used in the pattern for some > other purpose (here 'u' and 'rm'). We do this, more or less, for sve: # Three register operand, with governing predicate, vector element size @rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \ &rprrr_esz ra=%reg_movprfx where ra and rd overlap. Though ra and rd overlap exactly, so perhaps that's not quite the same as vm above, overlapping both rm and index. r~
diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode index ca0c699072..facb621450 100644 --- a/target/arm/neon-shared.decode +++ b/target/arm/neon-shared.decode @@ -61,8 +61,8 @@ VCMLA_scalar 1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \ VCMLA_scalar 1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \ vm=%vm_dp vn=%vn_dp vd=%vd_dp size=2 index=0 -VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \ - vm=%vm_dp vn=%vn_dp vd=%vd_dp +VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 vm:4 \ + vn=%vn_dp vd=%vd_dp %vfml_scalar_q0_rm 0:3 5:1 %vfml_scalar_q1_index 5:1 3:1 diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index a0e267694b..52b75ff76f 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -151,6 +151,36 @@ static void neon_store_element64(int reg, int ele, MemOp size, TCGv_i64 var) } } +static bool do_neon_ddda(DisasContext *s, int q, int vd, int vn, int vm, + int data, gen_helper_gvec_4 *fn_gvec) +{ + /* UNDEF accesses to D16-D31 if they don't exist. */ + if (((vd | vn | vm) & 0x10) && !dc_isar_feature(aa32_simd_r32, s)) { + return false; + } + + /* + * UNDEF accesses to odd registers for each bit of Q. + * Q will be 0b111 for all Q-reg instructions, otherwise + * when we have mixed Q- and D-reg inputs. + */ + if (((vd & 1) * 4 | (vn & 1) * 2 | (vm & 1)) & q) { + return false; + } + + if (!vfp_access_check(s)) { + return true; + } + + int opr_sz = q ? 16 : 8; + tcg_gen_gvec_4_ool(vfp_reg_offset(1, vd), + vfp_reg_offset(1, vn), + vfp_reg_offset(1, vm), + vfp_reg_offset(1, vd), + opr_sz, opr_sz, data, fn_gvec); + return true; +} + static bool do_neon_ddda_fpst(DisasContext *s, int q, int vd, int vn, int vm, int data, ARMFPStatusFlavour fp_flavor, gen_helper_gvec_4_ptr *fn_gvec_ptr) @@ -241,35 +271,13 @@ static bool trans_VCADD(DisasContext *s, arg_VCADD *a) static bool trans_VDOT(DisasContext *s, arg_VDOT *a) { - int opr_sz; - gen_helper_gvec_4 *fn_gvec; - if (!dc_isar_feature(aa32_dp, s)) { return false; } - - /* UNDEF accesses to D16-D31 if they don't exist. */ - if (!dc_isar_feature(aa32_simd_r32, s) && - ((a->vd | a->vn | a->vm) & 0x10)) { - return false; - } - - if ((a->vn | a->vm | a->vd) & a->q) { - return false; - } - - if (!vfp_access_check(s)) { - return true; - } - - opr_sz = (1 + a->q) * 8; - fn_gvec = a->u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b; - tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), - vfp_reg_offset(1, a->vn), - vfp_reg_offset(1, a->vm), - vfp_reg_offset(1, a->vd), - opr_sz, opr_sz, 0, fn_gvec); - return true; + return do_neon_ddda(s, a->q * 7, a->vd, a->vn, a->vm, 0, + a->u + ? gen_helper_gvec_udot_b + : gen_helper_gvec_sdot_b); } static bool trans_VFML(DisasContext *s, arg_VFML *a) @@ -323,35 +331,13 @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) { - gen_helper_gvec_4 *fn_gvec; - int opr_sz; - if (!dc_isar_feature(aa32_dp, s)) { return false; } - - /* UNDEF accesses to D16-D31 if they don't exist. */ - if (!dc_isar_feature(aa32_simd_r32, s) && - ((a->vd | a->vn) & 0x10)) { - return false; - } - - if ((a->vd | a->vn) & a->q) { - return false; - } - - if (!vfp_access_check(s)) { - return true; - } - - fn_gvec = a->u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b; - opr_sz = (1 + a->q) * 8; - tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), - vfp_reg_offset(1, a->vn), - vfp_reg_offset(1, a->rm), - vfp_reg_offset(1, a->vd), - opr_sz, opr_sz, a->index, fn_gvec); - return true; + return do_neon_ddda(s, a->q * 6, a->vd, a->vn, a->vm, a->index, + a->u + ? gen_helper_gvec_udot_idx_b + : gen_helper_gvec_sdot_idx_b); } static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
We were extracting the M register twice, once incorrectly as M:vm and once correctly as rm. Remove the incorrect name and remove the incorrect decode. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- target/arm/neon-shared.decode | 4 +- target/arm/translate-neon.c | 90 +++++++++++++++-------------------- 2 files changed, 40 insertions(+), 54 deletions(-) -- 2.25.1