Message ID | 55FBAFD1.9080300@linaro.org |
---|---|
State | New |
Headers | show |
On Thu, 17 Sep 2015, Michael Collison wrote: > Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR > expressions. I need some assistance; this test case will fail on targets that > don't have support for MIN/MAX such as 68k. Is there any way to remedy this > short of enumerating whether a target support MIN/MAX in > testsuite/lib/target_support? > > 2015-07-24 Michael Collison <michael.collison@linaro.org> > Andrew Pinski <andrew.pinski@caviumnetworks.com> > > * match.pd ((x < y) && (x < z) -> x < min (y,z), > (x > y) and (x > z) -> x > max (y,z)) > * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test. > > diff --git a/gcc/match.pd b/gcc/match.pd > index 5e8fd32..8691710 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3. If not see > (convert (bit_and (op (convert:utype @0) (convert:utype @1)) > (convert:utype @4))))))) > > + > +/* Transform (@0 < @1 and @0 < @2) to use min */ > +(for op (lt le) > +(simplify You seem to be missing all indentation. > +(bit_and:c (op @0 @1) (op @0 @2)) :c seems useless here. On the other hand, it might make sense to use op:s since this is mostly useful if it removes the 2 original comparisons. > +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) How did you chose this restriction? It seems safe enough, but the transformation could make sense in other cases as well. It can always be generalized later though. > +(op @0 (min @1 @2))))) > + > +/* Transform (@0 > @1 and @0 > @2) to use max */ > +(for op (gt ge) Note that you could unify the patterns with something like: (for op (lt le gt ge) ext (min min max max) (simplify ... > +(simplify > +(bit_and:c (op @0 @1) (op @0 @2)) > +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) > +(op @0 (max @1 @2))))) > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c > b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c > new file mode 100644 > index 0000000..cc0189a > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c > @@ -0,0 +1,23 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > + > +#define N 1024 > + > +int a[N], b[N], c[N]; > + > +void add (unsigned int m, unsigned int n) > +{ > + unsigned int i; > + for (i = 0; i < m && i < n; ++i) Maybe writing '&' instead of '&&' would make it depend less on the target. Also, both tests seem to be for GENERIC (i.e. I expect that you are already seeing the optimized version with -fdump-tree-original or -fdump-tree-gimple). Maybe something as simple as: int f(long a, long b, long c) { int cmp1 = a < b; int cmp2 = a < c; return cmp1 & cmp2; } > + a[i] = b[i] + c[i]; > +} > + > +void add2 (unsigned int m, unsigned int n) > +{ > + unsigned int i; > + for (i = N-1; i > m && i > n; --i) > + a[i] = b[i] + c[i]; > +} > + > +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */ > +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
Just a couple extra points. We can end up with a mix of < and >, which might prevent from matching: _3 = b_1(D) > a_2(D); _5 = a_2(D) < c_4(D); _8 = _3 & _5; Just like with &, we could also transform: x < y | x < z ---> x < max(y, z) (but maybe wait to make sure reviewers are ok with the first transformation before generalizing) On Fri, 18 Sep 2015, Marc Glisse wrote: > On Thu, 17 Sep 2015, Michael Collison wrote: > >> Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR >> expressions. I need some assistance; this test case will fail on targets >> that don't have support for MIN/MAX such as 68k. Is there any way to remedy >> this short of enumerating whether a target support MIN/MAX in >> testsuite/lib/target_support? >> >> 2015-07-24 Michael Collison <michael.collison@linaro.org> >> Andrew Pinski <andrew.pinski@caviumnetworks.com> >> >> * match.pd ((x < y) && (x < z) -> x < min (y,z), >> (x > y) and (x > z) -> x > max (y,z)) >> * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test. >> >> diff --git a/gcc/match.pd b/gcc/match.pd >> index 5e8fd32..8691710 100644 >> --- a/gcc/match.pd >> +++ b/gcc/match.pd >> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3. If not see >> (convert (bit_and (op (convert:utype @0) (convert:utype @1)) >> (convert:utype @4))))))) >> >> + >> +/* Transform (@0 < @1 and @0 < @2) to use min */ >> +(for op (lt le) >> +(simplify > > You seem to be missing all indentation. > >> +(bit_and:c (op @0 @1) (op @0 @2)) > > :c seems useless here. On the other hand, it might make sense to use op:s > since this is mostly useful if it removes the 2 original comparisons. > >> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) > > How did you chose this restriction? It seems safe enough, but the > transformation could make sense in other cases as well. It can always be > generalized later though. > >> +(op @0 (min @1 @2))))) >> + >> +/* Transform (@0 > @1 and @0 > @2) to use max */ >> +(for op (gt ge) > > Note that you could unify the patterns with something like: > (for op (lt le gt ge) > ext (min min max max) > (simplify ... > >> +(simplify >> +(bit_and:c (op @0 @1) (op @0 @2)) >> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >> +(op @0 (max @1 @2))))) >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >> new file mode 100644 >> index 0000000..cc0189a >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >> @@ -0,0 +1,23 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2 -fdump-tree-optimized" } */ >> + >> +#define N 1024 >> + >> +int a[N], b[N], c[N]; >> + >> +void add (unsigned int m, unsigned int n) >> +{ >> + unsigned int i; >> + for (i = 0; i < m && i < n; ++i) > > Maybe writing '&' instead of '&&' would make it depend less on the target. > Also, both tests seem to be for GENERIC (i.e. I expect that you are already > seeing the optimized version with -fdump-tree-original or > -fdump-tree-gimple). Maybe something as simple as: > int f(long a, long b, long c) { > int cmp1 = a < b; > int cmp2 = a < c; > return cmp1 & cmp2; > } > >> + a[i] = b[i] + c[i]; >> +} >> + >> +void add2 (unsigned int m, unsigned int n) >> +{ >> + unsigned int i; >> + for (i = N-1; i > m && i > n; --i) >> + a[i] = b[i] + c[i]; >> +} >> + >> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */ >> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */ > >
On Fri, Sep 18, 2015 at 9:38 AM, Marc Glisse <marc.glisse@inria.fr> wrote: > Just a couple extra points. We can end up with a mix of < and >, which might > prevent from matching: > > _3 = b_1(D) > a_2(D); > _5 = a_2(D) < c_4(D); > _8 = _3 & _5; > > Just like with &, we could also transform: > x < y | x < z ---> x < max(y, z) > > (but maybe wait to make sure reviewers are ok with the first transformation > before generalizing) Please merge the patterns as suggested and do the :c/:s changes as well. The issue with getting mixed < and > is indeed there - I've wanted to extend :c to handle tcc_comparison in some way at some point but didn't get to how best to implement that yet... So to fix that currently you have to replicate the merged pattern with swapped comparison operands. Otherwise I'm fine with the general approach. Richard. > > On Fri, 18 Sep 2015, Marc Glisse wrote: > >> On Thu, 17 Sep 2015, Michael Collison wrote: >> >>> Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR >>> expressions. I need some assistance; this test case will fail on targets >>> that don't have support for MIN/MAX such as 68k. Is there any way to remedy >>> this short of enumerating whether a target support MIN/MAX in >>> testsuite/lib/target_support? >>> >>> 2015-07-24 Michael Collison <michael.collison@linaro.org> >>> Andrew Pinski <andrew.pinski@caviumnetworks.com> >>> >>> * match.pd ((x < y) && (x < z) -> x < min (y,z), >>> (x > y) and (x > z) -> x > max (y,z)) >>> * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test. >>> >>> diff --git a/gcc/match.pd b/gcc/match.pd >>> index 5e8fd32..8691710 100644 >>> --- a/gcc/match.pd >>> +++ b/gcc/match.pd >>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3. If not see >>> (convert (bit_and (op (convert:utype @0) (convert:utype @1)) >>> (convert:utype @4))))))) >>> >>> + >>> +/* Transform (@0 < @1 and @0 < @2) to use min */ >>> +(for op (lt le) >>> +(simplify >> >> >> You seem to be missing all indentation. >> >>> +(bit_and:c (op @0 @1) (op @0 @2)) >> >> >> :c seems useless here. On the other hand, it might make sense to use op:s >> since this is mostly useful if it removes the 2 original comparisons. >> >>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >> >> >> How did you chose this restriction? It seems safe enough, but the >> transformation could make sense in other cases as well. It can always be >> generalized later though. >> >>> +(op @0 (min @1 @2))))) >>> + >>> +/* Transform (@0 > @1 and @0 > @2) to use max */ >>> +(for op (gt ge) >> >> >> Note that you could unify the patterns with something like: >> (for op (lt le gt ge) >> ext (min min max max) >> (simplify ... >> >>> +(simplify >>> +(bit_and:c (op @0 @1) (op @0 @2)) >>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >>> +(op @0 (max @1 @2))))) >>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >>> new file mode 100644 >>> index 0000000..cc0189a >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >>> @@ -0,0 +1,23 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O2 -fdump-tree-optimized" } */ >>> + >>> +#define N 1024 >>> + >>> +int a[N], b[N], c[N]; >>> + >>> +void add (unsigned int m, unsigned int n) >>> +{ >>> + unsigned int i; >>> + for (i = 0; i < m && i < n; ++i) >> >> >> Maybe writing '&' instead of '&&' would make it depend less on the target. >> Also, both tests seem to be for GENERIC (i.e. I expect that you are already >> seeing the optimized version with -fdump-tree-original or >> -fdump-tree-gimple). Maybe something as simple as: >> int f(long a, long b, long c) { >> int cmp1 = a < b; >> int cmp2 = a < c; >> return cmp1 & cmp2; >> } >> >>> + a[i] = b[i] + c[i]; >>> +} >>> + >>> +void add2 (unsigned int m, unsigned int n) >>> +{ >>> + unsigned int i; >>> + for (i = N-1; i > m && i > n; --i) >>> + a[i] = b[i] + c[i]; >>> +} >>> + >>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */ >>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */ >> >> >> > > -- > Marc Glisse
Marc, Can you elaborate on merging the patterns using 'ext' as mentioned in your post? I don't see any documentation or examples. On 09/18/2015 12:00 AM, Marc Glisse wrote: > On Thu, 17 Sep 2015, Michael Collison wrote: > >> Here is the the patch modified with test cases for MIN_EXPR and >> MAX_EXPR expressions. I need some assistance; this test case will >> fail on targets that don't have support for MIN/MAX such as 68k. Is >> there any way to remedy this short of enumerating whether a target >> support MIN/MAX in testsuite/lib/target_support? >> >> 2015-07-24 Michael Collison <michael.collison@linaro.org> >> Andrew Pinski <andrew.pinski@caviumnetworks.com> >> >> * match.pd ((x < y) && (x < z) -> x < min (y,z), >> (x > y) and (x > z) -> x > max (y,z)) >> * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test. >> >> diff --git a/gcc/match.pd b/gcc/match.pd >> index 5e8fd32..8691710 100644 >> --- a/gcc/match.pd >> +++ b/gcc/match.pd >> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3. If not see >> (convert (bit_and (op (convert:utype @0) (convert:utype @1)) >> (convert:utype @4))))))) >> >> + >> +/* Transform (@0 < @1 and @0 < @2) to use min */ >> +(for op (lt le) >> +(simplify > > You seem to be missing all indentation. > >> +(bit_and:c (op @0 @1) (op @0 @2)) > > :c seems useless here. On the other hand, it might make sense to use > op:s since this is mostly useful if it removes the 2 original > comparisons. > >> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) > > How did you chose this restriction? It seems safe enough, but the > transformation could make sense in other cases as well. It can always > be generalized later though. > >> +(op @0 (min @1 @2))))) >> + >> +/* Transform (@0 > @1 and @0 > @2) to use max */ >> +(for op (gt ge) > > Note that you could unify the patterns with something like: > (for op (lt le gt ge) > ext (min min max max) > (simplify ... > >> +(simplify >> +(bit_and:c (op @0 @1) (op @0 @2)) >> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >> +(op @0 (max @1 @2))))) >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >> new file mode 100644 >> index 0000000..cc0189a >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >> @@ -0,0 +1,23 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2 -fdump-tree-optimized" } */ >> + >> +#define N 1024 >> + >> +int a[N], b[N], c[N]; >> + >> +void add (unsigned int m, unsigned int n) >> +{ >> + unsigned int i; >> + for (i = 0; i < m && i < n; ++i) > > Maybe writing '&' instead of '&&' would make it depend less on the > target. Also, both tests seem to be for GENERIC (i.e. I expect that > you are already seeing the optimized version with -fdump-tree-original > or -fdump-tree-gimple). Maybe something as simple as: > int f(long a, long b, long c) { > int cmp1 = a < b; > int cmp2 = a < c; > return cmp1 & cmp2; > } > >> + a[i] = b[i] + c[i]; >> +} >> + >> +void add2 (unsigned int m, unsigned int n) >> +{ >> + unsigned int i; >> + for (i = N-1; i > m && i > n; --i) >> + a[i] = b[i] + c[i]; >> +} >> + >> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */ >> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */ >
On Fri, 18 Sep 2015, Michael Collison wrote: > Can you elaborate on merging the patterns using 'ext' as mentioned in your > post? I don't see any documentation or examples. https://gcc.gnu.org/onlinedocs/gccint/The-Language.html "for" lets you iterate on several variables at the same time. For instance, (for bitop (bit_and bit_ior) rbitop (bit_ior bit_and) (simplify (bitop:c (rbitop:c @0 @1) (bit_not@2 @0)) (bitop @1 @2))) expands to (simplify (bit_and:c (bit_ior:c @0 @1) (bit_not@2 @0)) (bit_and @1 @2)) (simplify (bit_ior:c (bit_and:c @0 @1) (bit_not@2 @0)) (bit_ior @1 @2)) there are other examples using (for cmp (eq ne) icmp (ne eq) or (for cmp (simple_comparison) scmp (swapped_simple_comparison) and you can have repetitions (for logs (LOG LOG LOG2 LOG2 LOG10 LOG10) exps (SQRT CBRT) (the iteration wraps around for exps).
Richard and Marc, What is ':s'? I don't see any documentation for it. So you would like me to remove :c and add :s? On 09/18/2015 02:23 AM, Richard Biener wrote: > On Fri, Sep 18, 2015 at 9:38 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >> Just a couple extra points. We can end up with a mix of < and >, which might >> prevent from matching: >> >> _3 = b_1(D) > a_2(D); >> _5 = a_2(D) < c_4(D); >> _8 = _3 & _5; >> >> Just like with &, we could also transform: >> x < y | x < z ---> x < max(y, z) >> >> (but maybe wait to make sure reviewers are ok with the first transformation >> before generalizing) > Please merge the patterns as suggested and do the :c/:s changes as well. > > The issue with getting mixed < and > is indeed there - I've wanted to > extend :c to handle tcc_comparison in some way at some point but > didn't get to how best to implement that yet... > > So to fix that currently you have to replicate the merged pattern > with swapped comparison operands. > > Otherwise I'm fine with the general approach. > > Richard. > >> On Fri, 18 Sep 2015, Marc Glisse wrote: >> >>> On Thu, 17 Sep 2015, Michael Collison wrote: >>> >>>> Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR >>>> expressions. I need some assistance; this test case will fail on targets >>>> that don't have support for MIN/MAX such as 68k. Is there any way to remedy >>>> this short of enumerating whether a target support MIN/MAX in >>>> testsuite/lib/target_support? >>>> >>>> 2015-07-24 Michael Collison <michael.collison@linaro.org> >>>> Andrew Pinski <andrew.pinski@caviumnetworks.com> >>>> >>>> * match.pd ((x < y) && (x < z) -> x < min (y,z), >>>> (x > y) and (x > z) -> x > max (y,z)) >>>> * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test. >>>> >>>> diff --git a/gcc/match.pd b/gcc/match.pd >>>> index 5e8fd32..8691710 100644 >>>> --- a/gcc/match.pd >>>> +++ b/gcc/match.pd >>>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3. If not see >>>> (convert (bit_and (op (convert:utype @0) (convert:utype @1)) >>>> (convert:utype @4))))))) >>>> >>>> + >>>> +/* Transform (@0 < @1 and @0 < @2) to use min */ >>>> +(for op (lt le) >>>> +(simplify >>> >>> You seem to be missing all indentation. >>> >>>> +(bit_and:c (op @0 @1) (op @0 @2)) >>> >>> :c seems useless here. On the other hand, it might make sense to use op:s >>> since this is mostly useful if it removes the 2 original comparisons. >>> >>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >>> >>> How did you chose this restriction? It seems safe enough, but the >>> transformation could make sense in other cases as well. It can always be >>> generalized later though. >>> >>>> +(op @0 (min @1 @2))))) >>>> + >>>> +/* Transform (@0 > @1 and @0 > @2) to use max */ >>>> +(for op (gt ge) >>> >>> Note that you could unify the patterns with something like: >>> (for op (lt le gt ge) >>> ext (min min max max) >>> (simplify ... >>> >>>> +(simplify >>>> +(bit_and:c (op @0 @1) (op @0 @2)) >>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >>>> +(op @0 (max @1 @2))))) >>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >>>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >>>> new file mode 100644 >>>> index 0000000..cc0189a >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >>>> @@ -0,0 +1,23 @@ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */ >>>> + >>>> +#define N 1024 >>>> + >>>> +int a[N], b[N], c[N]; >>>> + >>>> +void add (unsigned int m, unsigned int n) >>>> +{ >>>> + unsigned int i; >>>> + for (i = 0; i < m && i < n; ++i) >>> >>> Maybe writing '&' instead of '&&' would make it depend less on the target. >>> Also, both tests seem to be for GENERIC (i.e. I expect that you are already >>> seeing the optimized version with -fdump-tree-original or >>> -fdump-tree-gimple). Maybe something as simple as: >>> int f(long a, long b, long c) { >>> int cmp1 = a < b; >>> int cmp2 = a < c; >>> return cmp1 & cmp2; >>> } >>> >>>> + a[i] = b[i] + c[i]; >>>> +} >>>> + >>>> +void add2 (unsigned int m, unsigned int n) >>>> +{ >>>> + unsigned int i; >>>> + for (i = N-1; i > m && i > n; --i) >>>> + a[i] = b[i] + c[i]; >>>> +} >>>> + >>>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */ >>>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */ >>> >>> >> -- >> Marc Glisse
On Wed, Sep 30, 2015 at 9:29 AM, Michael Collison <michael.collison@linaro.org> wrote: > Richard and Marc, > > What is ':s'? I don't see any documentation for it. So you would like me to > remove :c and add :s? There is documentation for both in the internals manual. I don't have enough context to say whether you should remove "them" or not. What's the current patch? If you made the suggested changes you should be left with only required :s and :c. Richard. > > > On 09/18/2015 02:23 AM, Richard Biener wrote: >> >> On Fri, Sep 18, 2015 at 9:38 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> >>> Just a couple extra points. We can end up with a mix of < and >, which >>> might >>> prevent from matching: >>> >>> _3 = b_1(D) > a_2(D); >>> _5 = a_2(D) < c_4(D); >>> _8 = _3 & _5; >>> >>> Just like with &, we could also transform: >>> x < y | x < z ---> x < max(y, z) >>> >>> (but maybe wait to make sure reviewers are ok with the first >>> transformation >>> before generalizing) >> >> Please merge the patterns as suggested and do the :c/:s changes as well. >> >> The issue with getting mixed < and > is indeed there - I've wanted to >> extend :c to handle tcc_comparison in some way at some point but >> didn't get to how best to implement that yet... >> >> So to fix that currently you have to replicate the merged pattern >> with swapped comparison operands. >> >> Otherwise I'm fine with the general approach. >> >> Richard. >> >>> On Fri, 18 Sep 2015, Marc Glisse wrote: >>> >>>> On Thu, 17 Sep 2015, Michael Collison wrote: >>>> >>>>> Here is the the patch modified with test cases for MIN_EXPR and >>>>> MAX_EXPR >>>>> expressions. I need some assistance; this test case will fail on >>>>> targets >>>>> that don't have support for MIN/MAX such as 68k. Is there any way to >>>>> remedy >>>>> this short of enumerating whether a target support MIN/MAX in >>>>> testsuite/lib/target_support? >>>>> >>>>> 2015-07-24 Michael Collison <michael.collison@linaro.org> >>>>> Andrew Pinski <andrew.pinski@caviumnetworks.com> >>>>> >>>>> * match.pd ((x < y) && (x < z) -> x < min (y,z), >>>>> (x > y) and (x > z) -> x > max (y,z)) >>>>> * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test. >>>>> >>>>> diff --git a/gcc/match.pd b/gcc/match.pd >>>>> index 5e8fd32..8691710 100644 >>>>> --- a/gcc/match.pd >>>>> +++ b/gcc/match.pd >>>>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3. If not >>>>> see >>>>> (convert (bit_and (op (convert:utype @0) (convert:utype @1)) >>>>> (convert:utype @4))))))) >>>>> >>>>> + >>>>> +/* Transform (@0 < @1 and @0 < @2) to use min */ >>>>> +(for op (lt le) >>>>> +(simplify >>>> >>>> >>>> You seem to be missing all indentation. >>>> >>>>> +(bit_and:c (op @0 @1) (op @0 @2)) >>>> >>>> >>>> :c seems useless here. On the other hand, it might make sense to use >>>> op:s >>>> since this is mostly useful if it removes the 2 original comparisons. >>>> >>>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >>>> >>>> >>>> How did you chose this restriction? It seems safe enough, but the >>>> transformation could make sense in other cases as well. It can always be >>>> generalized later though. >>>> >>>>> +(op @0 (min @1 @2))))) >>>>> + >>>>> +/* Transform (@0 > @1 and @0 > @2) to use max */ >>>>> +(for op (gt ge) >>>> >>>> >>>> Note that you could unify the patterns with something like: >>>> (for op (lt le gt ge) >>>> ext (min min max max) >>>> (simplify ... >>>> >>>>> +(simplify >>>>> +(bit_and:c (op @0 @1) (op @0 @2)) >>>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >>>>> +(op @0 (max @1 @2))))) >>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >>>>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >>>>> new file mode 100644 >>>>> index 0000000..cc0189a >>>>> --- /dev/null >>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c >>>>> @@ -0,0 +1,23 @@ >>>>> +/* { dg-do compile } */ >>>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */ >>>>> + >>>>> +#define N 1024 >>>>> + >>>>> +int a[N], b[N], c[N]; >>>>> + >>>>> +void add (unsigned int m, unsigned int n) >>>>> +{ >>>>> + unsigned int i; >>>>> + for (i = 0; i < m && i < n; ++i) >>>> >>>> >>>> Maybe writing '&' instead of '&&' would make it depend less on the >>>> target. >>>> Also, both tests seem to be for GENERIC (i.e. I expect that you are >>>> already >>>> seeing the optimized version with -fdump-tree-original or >>>> -fdump-tree-gimple). Maybe something as simple as: >>>> int f(long a, long b, long c) { >>>> int cmp1 = a < b; >>>> int cmp2 = a < c; >>>> return cmp1 & cmp2; >>>> } >>>> >>>>> + a[i] = b[i] + c[i]; >>>>> +} >>>>> + >>>>> +void add2 (unsigned int m, unsigned int n) >>>>> +{ >>>>> + unsigned int i; >>>>> + for (i = N-1; i > m && i > n; --i) >>>>> + a[i] = b[i] + c[i]; >>>>> +} >>>>> + >>>>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */ >>>>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */ >>>> >>>> >>>> >>> -- >>> Marc Glisse > > > -- > Michael Collison > Linaro Toolchain Working Group > michael.collison@linaro.org >
diff --git a/gcc/match.pd b/gcc/match.pd index 5e8fd32..8691710 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3. If not see (convert (bit_and (op (convert:utype @0) (convert:utype @1)) (convert:utype @4))))))) + +/* Transform (@0 < @1 and @0 < @2) to use min */ +(for op (lt le) +(simplify +(bit_and:c (op @0 @1) (op @0 @2)) +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) +(op @0 (min @1 @2))))) + +/* Transform (@0 > @1 and @0 > @2) to use max */ +(for op (gt ge) +(simplify +(bit_and:c (op @0 @1) (op @0 @2)) +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) +(op @0 (max @1 @2))))) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c new file mode 100644 index 0000000..cc0189a --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +#define N 1024 + +int a[N], b[N], c[N]; + +void add (unsigned int m, unsigned int n) +{ + unsigned int i; + for (i = 0; i < m && i < n; ++i) + a[i] = b[i] + c[i]; +} + +void add2 (unsigned int m, unsigned int n) +{ + unsigned int i; + for (i = N-1; i > m && i > n; --i) + a[i] = b[i] + c[i]; +} + +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */