Message ID | 586F858B.5050209@foss.arm.com |
---|---|
State | New |
Headers | show |
Ping. https://gcc.gnu.org/ml/gcc-patches/2017-01/msg00381.html Thanks, Kyrill On 06/01/17 11:54, Kyrill Tkachov wrote: > Hi all, > > In this wrong-code issue the RTL tries to load a const_vector: > (const_vector:V8QI [ > (const_int 1 [0x1]) > (const_int 0 [0]) > (const_int 1 [0x1]) > (const_int 0 [0]) > (const_int 1 [0x1]) > (const_int 0 [0]) > (const_int 1 [0x1]) > (const_int 0 [0]) > ]) > > into a NEON register. The logic for that is in neon_valid_immediate which does a number of tricks > to decide what value and of what size to do a VMOV on to load the correct vector immediate into the register. > It goes wrong on big-endian. On both big and little-endian this outputs: > vmov.i16 d18, #0x1 > > This is wrong on big-endian. The vector layout has to be such as if loaded from memory. > I've tried various approaches of fixing neon_valid_immediate to generate the correct immediate but have been unsuccessful, > resulting in regressions in various parts of the testsuite or making a big mess of the function. > > Given that armeb is not a target of major concern I believe the safest route at this stage is to only allow vector constants > that will obviously work on big-endian, that is the ones that are just a single element duplicated in all lanes. > > This patch fixes the execution failures: > FAIL: gfortran.dg/intrinsic_pack_1.f90 > FAIL: gfortran.dg/c_f_pointer_logical.f03 > FAIL: gcc.dg/torture/pr60183.c > FAIL: gcc.dg/vect/pr51581-4.c > > on armeb-none-eabi. > > Ok for trunk? > > Thanks, > Kyrill > > 2017-01-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR target/71270 > * config/arm/arm.c (neon_valid_immediate): Reject vector constants > in big-endian mode when they are not a single duplicated value.
On 06/01/17 11:54, Kyrill Tkachov wrote: > Hi all, > > In this wrong-code issue the RTL tries to load a const_vector: > (const_vector:V8QI [ > (const_int 1 [0x1]) > (const_int 0 [0]) > (const_int 1 [0x1]) > (const_int 0 [0]) > (const_int 1 [0x1]) > (const_int 0 [0]) > (const_int 1 [0x1]) > (const_int 0 [0]) > ]) > > into a NEON register. The logic for that is in neon_valid_immediate > which does a number of tricks > to decide what value and of what size to do a VMOV on to load the > correct vector immediate into the register. > It goes wrong on big-endian. On both big and little-endian this outputs: > vmov.i16 d18, #0x1 > > This is wrong on big-endian. The vector layout has to be such as if > loaded from memory. > I've tried various approaches of fixing neon_valid_immediate to generate > the correct immediate but have been unsuccessful, > resulting in regressions in various parts of the testsuite or making a > big mess of the function. > > Given that armeb is not a target of major concern I believe the safest > route at this stage is to only allow vector constants > that will obviously work on big-endian, that is the ones that are just a > single element duplicated in all lanes. > > This patch fixes the execution failures: > FAIL: gfortran.dg/intrinsic_pack_1.f90 > FAIL: gfortran.dg/c_f_pointer_logical.f03 > FAIL: gcc.dg/torture/pr60183.c > FAIL: gcc.dg/vect/pr51581-4.c > > on armeb-none-eabi. > > Ok for trunk? > > Thanks, > Kyrill > > 2017-01-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR target/71270 > * config/arm/arm.c (neon_valid_immediate): Reject vector constants > in big-endian mode when they are not a single duplicated value. > Ok, but if you plan to close the PR above, please create a new 'enhancement' PR to fix the missed optimization. R. > armeb-vec.patch > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index 45e0a3cd11fa650f456f7382ffbbcd1c932b95eb..2beee8ebe94eeadd6fb1d5b119e3b474e1ab902a 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -11542,6 +11542,12 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse, > return 18; > } > > + /* The tricks done in the code below apply for little-endian vector layout. > + For big-endian vectors only allow vectors of the form { a, a, a..., a }. > + FIXME: Implement logic for big-endian vectors. */ > + if (BYTES_BIG_ENDIAN && vector && !const_vec_duplicate_p (op)) > + return -1; > + > /* Splat vector constant out into a byte vector. */ > for (i = 0; i < n_elts; i++) > { >
On 20/01/17 14:33, Richard Earnshaw (lists) wrote: > On 06/01/17 11:54, Kyrill Tkachov wrote: >> Hi all, >> >> In this wrong-code issue the RTL tries to load a const_vector: >> (const_vector:V8QI [ >> (const_int 1 [0x1]) >> (const_int 0 [0]) >> (const_int 1 [0x1]) >> (const_int 0 [0]) >> (const_int 1 [0x1]) >> (const_int 0 [0]) >> (const_int 1 [0x1]) >> (const_int 0 [0]) >> ]) >> >> into a NEON register. The logic for that is in neon_valid_immediate >> which does a number of tricks >> to decide what value and of what size to do a VMOV on to load the >> correct vector immediate into the register. >> It goes wrong on big-endian. On both big and little-endian this outputs: >> vmov.i16 d18, #0x1 >> >> This is wrong on big-endian. The vector layout has to be such as if >> loaded from memory. >> I've tried various approaches of fixing neon_valid_immediate to generate >> the correct immediate but have been unsuccessful, >> resulting in regressions in various parts of the testsuite or making a >> big mess of the function. >> >> Given that armeb is not a target of major concern I believe the safest >> route at this stage is to only allow vector constants >> that will obviously work on big-endian, that is the ones that are just a >> single element duplicated in all lanes. >> >> This patch fixes the execution failures: >> FAIL: gfortran.dg/intrinsic_pack_1.f90 >> FAIL: gfortran.dg/c_f_pointer_logical.f03 >> FAIL: gcc.dg/torture/pr60183.c >> FAIL: gcc.dg/vect/pr51581-4.c >> >> on armeb-none-eabi. >> >> Ok for trunk? >> >> Thanks, >> Kyrill >> >> 2017-01-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> PR target/71270 >> * config/arm/arm.c (neon_valid_immediate): Reject vector constants >> in big-endian mode when they are not a single duplicated value. >> > Ok, but if you plan to close the PR above, please create a new > 'enhancement' PR to fix the missed optimization. Thanks, I've committed it as r244716 and opened PR 79166 to track the missed optimisation. Kyrill > R. > >> armeb-vec.patch >> >> >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c >> index 45e0a3cd11fa650f456f7382ffbbcd1c932b95eb..2beee8ebe94eeadd6fb1d5b119e3b474e1ab902a 100644 >> --- a/gcc/config/arm/arm.c >> +++ b/gcc/config/arm/arm.c >> @@ -11542,6 +11542,12 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse, >> return 18; >> } >> >> + /* The tricks done in the code below apply for little-endian vector layout. >> + For big-endian vectors only allow vectors of the form { a, a, a..., a }. >> + FIXME: Implement logic for big-endian vectors. */ >> + if (BYTES_BIG_ENDIAN && vector && !const_vec_duplicate_p (op)) >> + return -1; >> + >> /* Splat vector constant out into a byte vector. */ >> for (i = 0; i < n_elts; i++) >> { >>
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 45e0a3cd11fa650f456f7382ffbbcd1c932b95eb..2beee8ebe94eeadd6fb1d5b119e3b474e1ab902a 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -11542,6 +11542,12 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse, return 18; } + /* The tricks done in the code below apply for little-endian vector layout. + For big-endian vectors only allow vectors of the form { a, a, a..., a }. + FIXME: Implement logic for big-endian vectors. */ + if (BYTES_BIG_ENDIAN && vector && !const_vec_duplicate_p (op)) + return -1; + /* Splat vector constant out into a byte vector. */ for (i = 0; i < n_elts; i++) {