[match.pd] Fix for PR35691

Message ID CAAgBjMkJnnkCZMH9RtFkQCeZ2E2U_6B0HXwYYnWL2d+XsYJFnw@mail.gmail.com
State Superseded

Commit Message

Prathamesh Kulkarni Nov. 3, 2016, 10:21 a.m. UTC
Hi Richard,
The attached patch tries to fix PR35691, by adding the following two
transforms to match.pd:
(x == 0 && y == 0) -> (x | typeof(x)(y)) == 0.
(x != 0 || y != 0) -> (x | typeof(x)(y)) != 0.
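
As a purely illustrative example of the effect at the source level (the
function names below are made up, not part of the patch):

  /* With x signed and y unsigned but of equal precision, (x == 0) && (y == 0)
     holds exactly when every bit of x and of y is zero, so one test of the
     bitwise OR suffices; the unsigned-to-int conversion only reinterprets
     the bits under GCC's modulo conversion semantics.  */
  int separate_tests (int x, unsigned y) { return x == 0 && y == 0; }
  int combined_test  (int x, unsigned y) { return (x | (int) y) == 0; }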

For GENERIC, the "and" operator is truth_andif_expr, and it seems for GIMPLE,
it gets transformed to bit_and_expr
so to match for both GENERIC and GIMPLE, I had to guard the for-stmt:

#if GENERIC
(for op (truth_andif truth_orif)
#elif GIMPLE
(for op (bit_and bit_ior)
#endif

Is that OK ?
Bootstrap+test running on x86_64-unknown-linux-gnu.

Thanks,
Prathamesh
2016-11-03  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	PR middle-end/35691
	* match.pd: Add following two patterns:
	(x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
	(x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.

testsuite/
	* gcc.dg/pr35691-1.c: New test-case.
	* gcc.dg/pr35691-2.c: Likewise.
	* gcc.dg/pr35691-3.c: Likewise.
	* gcc.dg/pr35691-4.c: Likewise.

Comments

Richard Biener Nov. 3, 2016, 10:43 a.m. UTC | #1
On Thu, 3 Nov 2016, Prathamesh Kulkarni wrote:

> Hi Richard,
> The attached patch tries to fix PR35691, by adding the following two
> transforms to match.pd:
> (x == 0 && y == 0) -> (x | typeof(x)(y)) == 0.
> (x != 0 || y != 0) -> (x | typeof(x)(y)) != 0.
>
> For GENERIC, the "and" operator is truth_andif_expr, and it seems for GIMPLE,
> it gets transformed to bit_and_expr
> so to match for both GENERIC and GIMPLE, I had to guard the for-stmt:
>
> #if GENERIC
> (for op (truth_andif truth_orif)
> #elif GIMPLE
> (for op (bit_and bit_ior)
> #endif
>
> Is that OK ?

As you are not removing the fold-const.c variant I'd say you should
simply not look for truth_* and only handle GIMPLE.  Note that we
have tree-ssa-ifcombine.c which should handle the variant with
control-flow (but I guess it does not and your patch wouldn't help
it either).
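
For reference, a minimal sketch of the control-flow variant meant here
(illustrative only; the helper function is hypothetical):

  extern void do_something (void);

  void control_flow_variant (int x, int y)
  {
    if (x == 0)            /* two nested conditions with separate branches...  */
      if (y == 0)
        do_something ();   /* ...which could become one test of (x | y) == 0  */
  }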

The transform would also work for vectors (element_precision for
the test but also a value-matching zero which should ensure the
same number of elements).
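
A sketch of the vector form, using GCC vector extensions (illustrative
only, not a proposed testcase):

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si vector_variant (v4si x, v4si y)
  {
    v4si zero = { 0, 0, 0, 0 };
    /* Element-wise masks; reusing the same zero operand is what would
       guarantee that both comparisons have the same number of elements.  */
    return (x == zero) & (y == zero);
  }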

Richard.
Prathamesh Kulkarni Nov. 3, 2016, 2:24 p.m. UTC | #2
On 3 November 2016 at 16:13, Richard Biener <rguenther@suse.de> wrote:
> On Thu, 3 Nov 2016, Prathamesh Kulkarni wrote:
>
>> Hi Richard,
>> The attached patch tries to fix PR35691, by adding the following two
>> transforms to match.pd:
>> (x == 0 && y == 0) -> (x | typeof(x)(y)) == 0.
>> (x != 0 || y != 0) -> (x | typeof(x)(y)) != 0.
>>
>> For GENERIC, the "and" operator is truth_andif_expr, and it seems for GIMPLE,
>> it gets transformed to bit_and_expr
>> so to match for both GENERIC and GIMPLE, I had to guard the for-stmt:
>>
>> #if GENERIC
>> (for op (truth_andif truth_orif)
>> #elif GIMPLE
>> (for op (bit_and bit_ior)
>> #endif
>>
>> Is that OK ?
>
> As you are not removing the fold-const.c variant I'd say you should
> simply not look for truth_* and only handle GIMPLE.  Note that we
> have tree-ssa-ifcombine.c which should handle the variant with
> control-flow (but I guess it does not and your patch wouldn't help
> it either).
>
> The transform would also work for vectors (element_precision for
> the test but also a value-matching zero which should ensure the
> same number of elements).

Um sorry, I didn't get how to check vectors to be of equal length by a
matching zero.
Could you please elaborate on that ?
Thanks!
>
> Richard.
Marc Glisse Nov. 3, 2016, 5:11 p.m. UTC | #3
On Thu, 3 Nov 2016, Prathamesh Kulkarni wrote:

> On 3 November 2016 at 16:13, Richard Biener <rguenther@suse.de> wrote:
>> On Thu, 3 Nov 2016, Prathamesh Kulkarni wrote:
>>
>>> Hi Richard,
>>> The attached patch tries to fix PR35691, by adding the following two
>>> transforms to match.pd:
>>> (x == 0 && y == 0) -> (x | typeof(x)(y)) == 0.
>>> (x != 0 || y != 0) -> (x | typeof(x)(y)) != 0.
>>>
>>> For GENERIC, the "and" operator is truth_andif_expr, and it seems for GIMPLE,
>>> it gets transformed to bit_and_expr
>>> so to match for both GENERIC and GIMPLE, I had to guard the for-stmt:
>>>
>>> #if GENERIC
>>> (for op (truth_andif truth_orif)
>>> #elif GIMPLE
>>> (for op (bit_and bit_ior)
>>> #endif
>>>
>>> Is that OK ?
>>
>> As you are not removing the fold-const.c variant I'd say you should
>> simply not look for truth_* and only handle GIMPLE.  Note that we
>> have tree-ssa-ifcombine.c which should handle the variant with
>> control-flow (but I guess it does not and your patch wouldn't help
>> it either).
>>
>> The transform would also work for vectors (element_precision for
>> the test but also a value-matching zero which should ensure the
>> same number of elements).
>
> Um sorry, I didn't get how to check vectors to be of equal length by a
> matching zero.
> Could you please elaborate on that ?

He may have meant something like:

   (op (cmp @0 integer_zerop@2) (cmp @1 @2))

So the last operand is checked with operand_equal_p instead of 
integer_zerop. But the fact that we could compute bit_ior on the 
comparison results should already imply that the number of elements is the 
same. This would also prevent the case where one vector is signed and the 
other unsigned, which requires a view_convert (I don't remember if convert 
automatically becomes view_convert here as in fold_convert or not).

For (some_int == 0) & (some_long == 0), doing ((long)some_int | some_long) 
== 0 should also work (and it doesn't matter if we pick a sign- or 
zero-extension), but that's more complicated, not necessary for a first 
version.
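
A hypothetical illustration of that mixed-width case (function name made up):

  /* With x an int and y a long, (x == 0) & (y == 0) can be tested with a
     single comparison; sign- vs. zero-extension of x is irrelevant because
     only equality with zero is checked.  */
  int mixed_width (int x, long y)
  {
    return ((long) x | y) == 0;
  }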

On platforms that have IOR on floats (at least x86 with SSE, maybe some 
vector mode on s390?), it would be cool to do the same for floats (most 
likely at the RTL level).

-- 
Marc Glisse
Marc Glisse Nov. 3, 2016, 7:11 p.m. UTC | #4
On Thu, 3 Nov 2016, Richard Biener wrote:

> On Thu, 3 Nov 2016, Prathamesh Kulkarni wrote:
>
>> Hi Richard,
>> The attached patch tries to fix PR35691, by adding the following two
>> transforms to match.pd:
>> (x == 0 && y == 0) -> (x | typeof(x)(y)) == 0.
>> (x != 0 || y != 0) -> (x | typeof(x)(y)) != 0.
>>
>> For GENERIC, the "and" operator is truth_andif_expr, and it seems for GIMPLE,
>> it gets transformed to bit_and_expr
>> so to match for both GENERIC and GIMPLE, I had to guard the for-stmt:
>>
>> #if GENERIC
>> (for op (truth_andif truth_orif)
>> #elif GIMPLE
>> (for op (bit_and bit_ior)
>> #endif
>>
>> Is that OK ?
>
> As you are not removing the fold-const.c variant I'd say you should
> simply not look for truth_* and only handle GIMPLE.  Note that we
> have tree-ssa-ifcombine.c which should handle the variant with
> control-flow (but I guess it does not and your patch wouldn't help
> it either).
>
> The transform would also work for vectors (element_precision for
> the test but also a value-matching zero which should ensure the
> same number of elements).

On the other hand, now that we are using VEC_COND_EXPR all over the place, 
it may be hard to write a testcase that makes this kind of pattern 
appear...

(related to PR 68714)

-- 
Marc Glisse
Richard Biener Nov. 3, 2016, 7:13 p.m. UTC | #5
On November 3, 2016 6:11:07 PM GMT+01:00, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Thu, 3 Nov 2016, Prathamesh Kulkarni wrote:
>
>> On 3 November 2016 at 16:13, Richard Biener <rguenther@suse.de> wrote:
>>> On Thu, 3 Nov 2016, Prathamesh Kulkarni wrote:
>>>
>>>> Hi Richard,
>>>> The attached patch tries to fix PR35691, by adding the following two
>>>> transforms to match.pd:
>>>> (x == 0 && y == 0) -> (x | typeof(x)(y)) == 0.
>>>> (x != 0 || y != 0) -> (x | typeof(x)(y)) != 0.
>>>>
>>>> For GENERIC, the "and" operator is truth_andif_expr, and it seems for GIMPLE,
>>>> it gets transformed to bit_and_expr
>>>> so to match for both GENERIC and GIMPLE, I had to guard the for-stmt:
>>>>
>>>> #if GENERIC
>>>> (for op (truth_andif truth_orif)
>>>> #elif GIMPLE
>>>> (for op (bit_and bit_ior)
>>>> #endif
>>>>
>>>> Is that OK ?
>>>
>>> As you are not removing the fold-const.c variant I'd say you should
>>> simply not look for truth_* and only handle GIMPLE.  Note that we
>>> have tree-ssa-ifcombine.c which should handle the variant with
>>> control-flow (but I guess it does not and your patch wouldn't help
>>> it either).
>>>
>>> The transform would also work for vectors (element_precision for
>>> the test but also a value-matching zero which should ensure the
>>> same number of elements).
>> Um sorry, I didn't get how to check vectors to be of equal length by a
>> matching zero.
>> Could you please elaborate on that ?
>
> He may have meant something like:
>
>   (op (cmp @0 integer_zerop@2) (cmp @1 @2))

I meant with one being @@2 to allow signed vs. unsigned @0/@1, which was the point of the pattern.
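
An illustrative sketch (GCC vector extensions, hypothetical) of the
signed-vs-unsigned operands that @@2 is meant to allow:

  typedef int v4si __attribute__ ((vector_size (16)));
  typedef unsigned v4su __attribute__ ((vector_size (16)));

  v4si mixed_sign (v4si x, v4su y)
  {
    v4si zero_s = { 0, 0, 0, 0 };
    v4su zero_u = { 0, 0, 0, 0 };
    /* Both comparisons produce a signed mask vector of the same shape, so
       combining them with & is well formed even though the two zero
       operands have different types.  */
    return (x == zero_s) & (y == zero_u);
  }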

> So the last operand is checked with operand_equal_p instead of
> integer_zerop. But the fact that we could compute bit_ior on the
> comparison results should already imply that the number of elements is
> the same.

Though for equality compares we also allow scalar results IIRC.

> This would also prevent the case where one vector is signed and the
> other unsigned, which requires a view_convert (I don't remember if
> convert automatically becomes view_convert here as in fold_convert or not).

No, but the other way around (for sign changes).

> For (some_int == 0) & (some_long == 0), doing ((long)some_int | some_long)
> == 0 should also work (and it doesn't matter if we pick a sign- or
> zero-extension), but that's more complicated, not necessary for a first
> version.

Yeah.

> On platforms that have IOR on floats (at least x86 with SSE, maybe some
> vector mode on s390?), it would be cool to do the same for floats (most
> likely at the RTL level).

On GIMPLE view-converts could come to the rescue here as well.  Or we can just allow bit-and/or on floats as much as we allow them on pointers.

Richard.
Marc Glisse Nov. 3, 2016, 7:57 p.m. UTC | #6
On Thu, 3 Nov 2016, Richard Biener wrote:

>>>> The transform would also work for vectors (element_precision for
>>>> the test but also a value-matching zero which should ensure the
>>>> same number of elements).
>>> Um sorry, I didn't get how to check vectors to be of equal length by a
>>> matching zero.
>>> Could you please elaborate on that ?
>>
>> He may have meant something like:
>>
>>   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
>
> I meant with one being @@2 to allow signed vs. unsigned @0/@1, which was the point of the pattern.

Oops, that's what I had written first, and then I somehow managed to
confuse myself enough to remove it so as to remove the call to types_match 
:-(

>> So the last operand is checked with operand_equal_p instead of
>> integer_zerop. But the fact that we could compute bit_ior on the
>> comparison results should already imply that the number of elements is
>> the same.
>
> Though for equality compares we also allow scalar results IIRC.

Oh, right, I keep forgetting that :-( And I have no idea how to generate 
one for a testcase, at least until the GIMPLE FE lands...

>> On platforms that have IOR on floats (at least x86 with SSE, maybe some
>> vector mode on s390?), it would be cool to do the same for floats (most
>> likely at the RTL level).
>
> On GIMPLE view-converts could come to the rescue here as well.  Or we
> can just allow bit-and/or on floats as much as we allow them on
> pointers.

Would that generate sensible code on targets that do not have logic insns 
for floats? Actually, even on x86_64 that generates inefficient code, so 
there would be some work (for instance grep finds no gen_iordf3, only 
gen_iorv2df3).

I am also a bit wary of doing those obfuscating optimizations too early... 
a==0 is something that other optimizations might use. long 
c=(long&)a|(long&)b; (double&)c==0; less so...

(and I am assuming that signaling NaNs don't make the whole transformation 
impossible, which might be wrong)

-- 
Marc Glisse
Richard Biener Nov. 4, 2016, 8:11 a.m. UTC | #7
On Thu, 3 Nov 2016, Marc Glisse wrote:

> On Thu, 3 Nov 2016, Richard Biener wrote:
>
> > > > > The transform would also work for vectors (element_precision for
> > > > > the test but also a value-matching zero which should ensure the
> > > > > same number of elements).
> > > > Um sorry, I didn't get how to check vectors to be of equal length by a
> > > > matching zero.
> > > > Could you please elaborate on that ?
> > >
> > > He may have meant something like:
> > >
> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
> >
> > I meant with one being @@2 to allow signed vs. unsigned @0/@1, which was the
> > point of the pattern.
>
> Oops, that's what I had written first, and then I somehow managed to confuse
> myself enough to remove it so as to remove the call to types_match :-(
>
> > > So the last operand is checked with operand_equal_p instead of
> > > integer_zerop. But the fact that we could compute bit_ior on the
> > > comparison results should already imply that the number of elements is the
> > > same.
> >
> > Though for equality compares we also allow scalar results IIRC.
>
> Oh, right, I keep forgetting that :-( And I have no idea how to generate one
> for a testcase, at least until the GIMPLE FE lands...
>
> > > On platforms that have IOR on floats (at least x86 with SSE, maybe some
> > > vector mode on s390?), it would be cool to do the same for floats (most
> > > likely at the RTL level).
> >
> > On GIMPLE view-converts could come to the rescue here as well.  Or we can
> > just allow bit-and/or on floats as much as we allow them on pointers.
>
> Would that generate sensible code on targets that do not have logic insns for
> floats? Actually, even on x86_64 that generates inefficient code, so there
> would be some work (for instance grep finds no gen_iordf3, only gen_iorv2df3).
>
> I am also a bit wary of doing those obfuscating optimizations too early...
> a==0 is something that other optimizations might use. long
> c=(long&)a|(long&)b; (double&)c==0; less so...
>
> (and I am assuming that signaling NaNs don't make the whole transformation
> impossible, which might be wrong)

Yeah.  I also think it's not that important - I just wanted to mention
vectors...

Btw, I still think we need a more sensible infrastructure for passes
to gather, analyze and modify complex conditions.  (I'm always pointing
to tree-affine.c as an, albeit not very good, example for handling
a similar problem)

Richard.

Patch

diff --git a/gcc/match.pd b/gcc/match.pd
index 48f7351..65930bb 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -519,6 +519,23 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (TYPE_UNSIGNED (type))
     (bit_and @0 (bit_not (lshift { build_all_ones_cst (type); } @1)))))
 
+/* PR35691: Transform
+   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
+   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
+
+#if GENERIC
+(for op (truth_andif truth_orif)
+#elif GIMPLE
+(for op (bit_and bit_ior)
+#endif
+     cmp (eq ne)
+ (simplify
+  (op (cmp @0 integer_zerop) (cmp @1 integer_zerop))
+   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+       && INTEGRAL_TYPE_P (TREE_TYPE (@1))
+       && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1)))
+    (cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0)); }))))
+
 /* Fold (A & ~B) - (A & B) into (A ^ B) - B.  */
 (simplify
  (minus (bit_and:cs @0 (bit_not @1)) (bit_and:cs @0 @1))
diff --git a/gcc/testsuite/gcc.dg/pr35691-1.c b/gcc/testsuite/gcc.dg/pr35691-1.c
new file mode 100644
index 0000000..25a7ace
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr35691-1.c
@@ -0,0 +1,9 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-gimple" } */
+
+int foo1(int z0, unsigned z1) 
+{
+  return (z0 == 0) && (z1 == 0);
+}
+
+/* { dg-final { scan-tree-dump-not "z1.\[0-9\]*_\[0-9\]* = (int) z1" "gimple" } } */
diff --git a/gcc/testsuite/gcc.dg/pr35691-2.c b/gcc/testsuite/gcc.dg/pr35691-2.c
new file mode 100644
index 0000000..5211f815
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr35691-2.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-forwprop-details" } */
+
+int foo(int z0, unsigned z1)
+{
+  int t0 = (z0 == 0);
+  int t1 = (z1 == 0);
+  int t2 = (t0 && t1);
+  return t2;
+}
+
+/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) z1_\[0-9\]*\\(D\\);" "forwprop1" } } */
diff --git a/gcc/testsuite/gcc.dg/pr35691-3.c b/gcc/testsuite/gcc.dg/pr35691-3.c
new file mode 100644
index 0000000..134bbdf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr35691-3.c
@@ -0,0 +1,9 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-gimple" } */
+
+int foo1(int z0, unsigned z1) 
+{
+  return (z0 != 0) || (z1 != 0);
+}
+
+/* { dg-final { scan-tree-dump-not "z1.\[0-9\]*_\[0-9\]* = (int) z1" "gimple" } } */
diff --git a/gcc/testsuite/gcc.dg/pr35691-4.c b/gcc/testsuite/gcc.dg/pr35691-4.c
new file mode 100644
index 0000000..90cbf6d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr35691-4.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-forwprop-details" } */
+
+int foo(int z0, unsigned z1)
+{
+  int t0 = (z0 != 0);
+  int t1 = (z1 != 0);
+  int t2 = (t0 || t1);
+  return t2;
+}
+
+/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) z1_\[0-9\]*\\(D\\);" "forwprop1" } } */