Message ID | CAAgBjM=yhvx93aAZE8Q9Y2fkY5AQrvujJKDoCBc3eKPYmROUoQ@mail.gmail.com |
---|---|
State | New |
Series | PR91166 - Unfolded ZIPs of constants |
Not really my area, but FWIW...

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> Hi,
> The attached patch tries to fix PR91166.
> Does it look OK ?
> Bootstrap+test in progress on aarch64-linux-gnu and x86_64-unknown-linux-gnu.
>
> Thanks,
> Prathamesh
>
> 2019-07-17 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
>
> PR middle-end/91166
> * match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
> (define_predicates): Add entry for uniform_vector_p.
>
> testsuite/
> * gcc.target/aarch64/sve/pr91166.c: New test.
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 4a7aa0185d8..2ad98c28fd8 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -36,7 +36,8 @@ along with GCC; see the file COPYING3. If not see
>     integer_valued_real_p
>     integer_pow2p
>     uniform_integer_cst_p
> -   HONOR_NANS)
> +   HONOR_NANS
> +   uniform_vector_p)
>
>  /* Operator lists. */
>  (define_operator_list tcc_comparison
> @@ -5568,3 +5569,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>      { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
>     (if (changed)
>      (vec_perm { op0; } { op1; } { op2; }))))))))))
> +
> +/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element. */
> +(simplify
> + (vec_perm (vec_duplicate@0 @1) @0 @2)
> + { @0; })
> +
> +(simplify
> + (vec_perm uniform_vector_p@0 @0 @1)
> + { @0; })

No need for the curly braces here, can use "@0" as the target of
the simplification.

It'd probably be worth using (match ...) to define a new predicate
that handles (vec_duplicate ...), VECTOR_CST and CONSTRUCTOR,
calling into uniform_vector_p for the latter two.

Thanks,
Richard

> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
> new file mode 100644
> index 00000000000..42654be3b31
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
> +
> +void
> +f1 (double x[][4])
> +{
> +  for (int i = 0; i < 4; ++i)
> +    for (int j = 0; j < 4; ++j)
> +      x[i][j] = 0;
> +}
> +
> +void
> +f2 (double x[][4], double y)
> +{
> +  for (int i = 0; i < 4; ++i)
> +    for (int j = 0; j < 4; ++j)
> +      x[i][j] = y;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */
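The reason the fold is safe regardless of the mask can be sketched in plain C. This is only a scalar model, not GCC code: `vec_perm_model`, `is_uniform` and `perm_is_identity` are hypothetical helper names, and the selector semantics (each mask element taken modulo 2*N, indexing the concatenation of the two inputs) are assumed to follow the description of VEC_PERM_EXPR in GCC's tree.def.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of VEC_PERM_EXPR <a, b, mask> on N-element
   vectors: each mask element, reduced modulo 2*N, selects one element
   from the concatenation of A and B.  */
static void
vec_perm_model (const double *a, const double *b, const unsigned *mask,
                double *out, size_t n)
{
  assert (n > 0);
  for (size_t i = 0; i < n; ++i)
    {
      size_t idx = mask[i] % (2 * n);
      out[i] = idx < n ? a[idx] : b[idx - n];
    }
}

/* Return true if all N elements of V are equal.  */
static bool
is_uniform (const double *v, size_t n)
{
  for (size_t i = 1; i < n; ++i)
    if (v[i] != v[0])
      return false;
  return true;
}

/* Return true if permuting V with itself under MASK gives back V.
   The fixed-size buffer limits this sketch to N <= 16.  */
static bool
perm_is_identity (const double *v, const unsigned *mask, size_t n)
{
  double out[16];
  assert (n <= 16);
  vec_perm_model (v, v, mask, out, n);
  for (size_t i = 0; i < n; ++i)
    if (out[i] != v[i])
      return false;
  return true;
}
```

When `is_uniform` holds, every selected element equals every other, so `perm_is_identity` holds for any mask; for a non-uniform input the result depends on the mask, which is why the pattern needs the predicate.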
On Fri, 19 Jul 2019 at 18:12, Richard Sandiford <richard.sandiford@arm.com> wrote:
> No need for the curly braces here, can use "@0" as the target of
> the simplification.
>
> It'd probably be worth using (match ...) to define a new predicate
> that handles (vec_duplicate ...), VECTOR_CST and CONSTRUCTOR,
> calling into uniform_vector_p for the latter two.

Hi,
Thanks for the suggestions.
Does this version look OK ?

Thanks,
Prathamesh

2019-07-23 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>

PR middle-end/91166
* match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
(define_predicates): Add entry for uniform_vector_p.
(vec_same_elem_p): New match pattern.

testsuite/
* gcc.target/aarch64/sve/pr91166.c: New test.

diff --git a/gcc/match.pd b/gcc/match.pd
index 4a7aa0185d8..f14670a7982 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -36,7 +36,8 @@ along with GCC; see the file COPYING3. If not see
    integer_valued_real_p
    integer_pow2p
    uniform_integer_cst_p
-   HONOR_NANS)
+   HONOR_NANS
+   uniform_vector_p)

 /* Operator lists. */
 (define_operator_list tcc_comparison
@@ -5568,3 +5569,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
    (if (changed)
     (vec_perm { op0; } { op1; } { op2; }))))))))))
+
+/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element. */
+
+(match (vec_same_elem_p @0)
+ uniform_vector_p@0)
+
+(match (vec_same_elem_p @0)
+ (vec_duplicate @0))
+
+(simplify
+ (vec_perm (vec_same_elem_p@0 @1) @0 @2)
+ @0)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
new file mode 100644
index 00000000000..42654be3b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
+
+void
+f1 (double x[][4])
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = 0;
+}
+
+void
+f2 (double x[][4], double y)
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = y;
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */
On Tue, 23 Jul 2019, Prathamesh Kulkarni wrote:
> Hi,
> Thanks for the suggestions.
> Does this version look OK ?

Can you write

+(simplify
+ (vec_perm (vec_same_elem_p@0 @1) @0 @2)
+ @0)

as

 (vec_perm vec_same_elem_p@0 @0 @1)

? Otherwise looks OK.

Thanks,
Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
On Tue, 23 Jul 2019 at 16:36, Richard Biener <rguenther@suse.de> wrote:
> Can you write
>
> +(simplify
> + (vec_perm (vec_same_elem_p@0 @1) @0 @2)
> + @0)
>
> as
>
> (vec_perm vec_same_elem_p@0 @0 @1)
>
> ?

(simplify
 (vec_perm vec_same_elem_p@0 @0 @1)
 @0)

results in:
gimple-match.c: In function ‘bool gimple_simplify_VEC_PERM_EXPR(gimple_match_op*, gimple**, tree_node* (*)(tree), code_helper, tree, tree, tree, tree)’:
gimple-match.c:103031:36: error: cannot convert ‘tree_node* (*)(tree)’ {aka ‘tree_node* (*)(tree_node*)’} to ‘tree_node**’
  if (gimple_vec_same_elem_p (op0, valueize))
                                   ^~~~~~~~

because gimple_vec_same_elem_p has tree *res_ops as 2nd param and
we're passing valueize as 2nd arg.

Thanks,
Prathamesh

> Otherwise looks OK.
>
> Thanks,
> Richard.
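The mismatch reported above can be pictured with a small C sketch. It assumes, based only on the error message and the explanation in this message, that a predicate declared with captures in its head gets an extra `tree *res_ops` parameter while a plain predicate does not. The `struct node` type and both function names below are hypothetical stand-ins, not the code genmatch generates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's tree type, for illustration only.  */
struct node
{
  int uniform;          /* Nonzero if this node is a uniform vector.  */
  struct node *operand; /* The duplicated element, if any.  */
};
typedef struct node *tree;
typedef tree (*valueize_fn) (tree);

/* Shape assumed for a predicate declared as (match (name @0) ...):
   captured operands are handed back through RES_OPS, so callers must
   pass three arguments.  */
static bool
pred_with_captures (tree t, tree *res_ops, valueize_fn valueize)
{
  (void) valueize;
  assert (t != NULL && res_ops != NULL);
  if (!t->uniform)
    return false;
  res_ops[0] = t->operand;
  return true;
}

/* Shape assumed for a predicate declared as (match name ...): nothing
   is captured, so the valueize callback is the second argument.  */
static bool
pred_plain (tree t, valueize_fn valueize)
{
  (void) valueize;
  assert (t != NULL);
  return t->uniform != 0;
}

/* Trivial valueize callback used by callers of the sketch.  */
static tree
identity_valueize (tree t)
{
  return t;
}
```

Calling the first shape with only `(t, valueize)` would put a function pointer where a `tree *` is expected, which is the kind of conversion error quoted above.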
On Tue, 23 Jul 2019, Prathamesh Kulkarni wrote:
> (simplify
>  (vec_perm vec_same_elem_p@0 @0 @1)
>  @0)
>
> results in:
> gimple-match.c: In function ‘bool
> gimple_simplify_VEC_PERM_EXPR(gimple_match_op*, gimple**, tree_node*
> (*)(tree), code_helper, tree, tree, tree, tree)’:
> gimple-match.c:103031:36: error: cannot convert ‘tree_node* (*)(tree)’
> {aka ‘tree_node* (*)(tree_node*)’} to ‘tree_node**’
>   if (gimple_vec_same_elem_p (op0, valueize))
>                                    ^~~~~~~~
>
> because gimple_vec_same_elem_p has tree *res_ops as 2nd param and
> we're passing valueize as 2nd arg.

Ah, you need the

(match vec_same_elem_p
 @0
 (if (uniform_vector_p (@0))))

(match vec_same_elem_p
 (vec_duplicate @0))

form then.
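As a rough picture of what the two-clause predicate accepts, here is a scalar C model. It is a sketch under stated assumptions rather than the behaviour of uniform_vector_p itself: the `vec_kind` enum, `struct vec_node` and both functions are hypothetical names, and the model simply treats a duplicate node as always uniform and a constant or constructor as uniform when all of its elements compare equal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kinds of vector operand named in the
   thread.  */
enum vec_kind
{
  KIND_VEC_DUPLICATE,
  KIND_VECTOR_CST,
  KIND_CONSTRUCTOR,
  KIND_OTHER
};

struct vec_node
{
  enum vec_kind kind;
  size_t nelts;
  const double *elts; /* Unused for KIND_VEC_DUPLICATE.  */
};

/* Return true if all N elements of ELTS are equal.  */
static bool
all_elts_equal (const double *elts, size_t n)
{
  assert (n > 0);
  for (size_t i = 1; i < n; ++i)
    if (elts[i] != elts[0])
      return false;
  return true;
}

/* Model of the combined predicate: one clause for duplicates, one for
   constants and constructors whose elements are all the same.  */
static bool
vec_same_elem_model (const struct vec_node *v)
{
  switch (v->kind)
    {
    case KIND_VEC_DUPLICATE:
      return true;
    case KIND_VECTOR_CST:
    case KIND_CONSTRUCTOR:
      return all_elts_equal (v->elts, v->nelts);
    default:
      return false;
    }
}
```

In the test case from the patch, `f1` would roughly correspond to the constant clause (a vector of zeros) and `f2` to the duplicate clause (a broadcast of `y`).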
On Tue, 23 Jul 2019 at 17:48, Richard Biener <rguenther@suse.de> wrote:
> Ah, you need the
>
> (match vec_same_elem_p
>  @0
>  (if (uniform_vector_p (@0))))
>
> (match vec_same_elem_p
>  (vec_duplicate @0))
>
> form then.

Thanks, that worked.
Is the attached patch OK to commit ?

Thanks,
Prathamesh

diff --git a/gcc/match.pd b/gcc/match.pd
index 4a7aa0185d8..c5c6a041cfc 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -36,7 +36,8 @@ along with GCC; see the file COPYING3. If not see
    integer_valued_real_p
    integer_pow2p
    uniform_integer_cst_p
-   HONOR_NANS)
+   HONOR_NANS
+   uniform_vector_p)

 /* Operator lists. */
 (define_operator_list tcc_comparison
@@ -5568,3 +5569,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
    (if (changed)
     (vec_perm { op0; } { op1; } { op2; }))))))))))
+
+/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element. */
+
+(match vec_same_elem_p
+ @0
+ (if (uniform_vector_p (@0))))
+
+(match vec_same_elem_p
+ (vec_duplicate @0))
+
+(simplify
+ (vec_perm vec_same_elem_p@0 @0 @1)
+ @0)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
new file mode 100644
index 00000000000..42654be3b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
+
+void
+f1 (double x[][4])
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = 0;
+}
+
+void
+f2 (double x[][4], double y)
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = y;
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */
On Tue, 23 Jul 2019, Prathamesh Kulkarni wrote:
> Thanks, that worked.
> Is the attached patch OK to commit ?

Yes.

Thanks,
Richard.
diff --git a/gcc/match.pd b/gcc/match.pd
index 4a7aa0185d8..2ad98c28fd8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -36,7 +36,8 @@ along with GCC; see the file COPYING3. If not see
    integer_valued_real_p
    integer_pow2p
    uniform_integer_cst_p
-   HONOR_NANS)
+   HONOR_NANS
+   uniform_vector_p)

 /* Operator lists. */
 (define_operator_list tcc_comparison
@@ -5568,3 +5569,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
    (if (changed)
     (vec_perm { op0; } { op1; } { op2; }))))))))))
+
+/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element. */
+(simplify
+ (vec_perm (vec_duplicate@0 @1) @0 @2)
+ { @0; })
+
+(simplify
+ (vec_perm uniform_vector_p@0 @0 @1)
+ { @0; })
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
new file mode 100644
index 00000000000..42654be3b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
+
+void
+f1 (double x[][4])
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = 0;
+}
+
+void
+f2 (double x[][4], double y)
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = y;
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */