diff mbox

Simplify X /[ex] 8 == 0

Message ID alpine.DEB.2.02.1611041255001.15240@laptop-mg.saclay.inria.fr
State Accepted
Commit 40fd269ab128d1f5fa7688d7d5e7298744c08cdd
Headers show

Commit Message

Marc Glisse Nov. 4, 2016, 12:34 p.m. UTC
Hello,

this kind of simplification is already handled by fold_comparison, but 
the code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In 
particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 
can be turned into X==0.

When we have an explicit division by 0, there is a better optimisation 
possible (insert __builtin_unreachable() or __builtin_trap() after that 
statement, as in find_explicit_erroneous_behavior), so I don't touch it.

For the overflow inequality case, it would have been a bit clearer to 
generate
   (cmp { build_zero_cst (TREE_TYPE (@0)); } @2)
and let that be constant folded instead of having that ugly and 
error-prone test in constant_boolean_node, but I saved one tree ;-)

Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions 
are in the libitm part of the testsuite, they should be fixed by 
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the 
testsuite when that patch is in.

2016-11-07  Marc Glisse  <marc.glisse@inria.fr>

gcc/
 	* fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR.
 	* match.pd (A /[ex] B CMP C): New simplifications.

gcc/testsuite/
 	* gcc.dg/tree-ssa/cmpexactdiv.c: New file.

-- 
Marc Glisse

Comments

Richard Biener Nov. 4, 2016, 1:15 p.m. UTC | #1
On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,

>

> this kind of simplification is already handled by fold_comparison, but the

> code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In

> particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can

> be turned into X==0.

>

> When we have an explicit division by 0, there is a better optimisation

> possible (insert __builtin_unreachable() or __builtin_trap() after that

> statement, as in find_explicit_erroneous_behavior), so I don't touch it.

>

> For the overflow inequality case, it would have been a bit clearer to

> generate

>   (cmp { build_zero_cst (TREE_TYPE (@0)); } @2)

> and let that be constant folded instead of having that ugly and error-prone

> test in constant_boolean_node, but I saved one tree ;-)

>

> Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are

> in the libitm part of the testsuite, they should be fixed by

> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the

> testsuite when that patch is in.


Ok.

You fail to handle A /[ex] -2 < 2?  Is that on purpose?  Or just lazy so you
dont' have to handle inverting the comparison?

Thanks,
Richard.

> 2016-11-07  Marc Glisse  <marc.glisse@inria.fr>

>

> gcc/

>         * fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR.

>         * match.pd (A /[ex] B CMP C): New simplifications.

>

> gcc/testsuite/

>         * gcc.dg/tree-ssa/cmpexactdiv.c: New file.

>

> --

> Marc Glisse
Richard Biener Nov. 4, 2016, 1:15 p.m. UTC | #2
On Fri, Nov 4, 2016 at 2:15 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse <marc.glisse@inria.fr> wrote:

>> Hello,

>>

>> this kind of simplification is already handled by fold_comparison, but the

>> code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In

>> particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can

>> be turned into X==0.

>>

>> When we have an explicit division by 0, there is a better optimisation

>> possible (insert __builtin_unreachable() or __builtin_trap() after that

>> statement, as in find_explicit_erroneous_behavior), so I don't touch it.

>>

>> For the overflow inequality case, it would have been a bit clearer to

>> generate

>>   (cmp { build_zero_cst (TREE_TYPE (@0)); } @2)

>> and let that be constant folded instead of having that ugly and error-prone

>> test in constant_boolean_node, but I saved one tree ;-)

>>

>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are

>> in the libitm part of the testsuite, they should be fixed by

>> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the

>> testsuite when that patch is in.

>

> Ok.

>

> You fail to handle A /[ex] -2 < 2?  Is that on purpose?  Or just lazy so you

> dont' have to handle inverting the comparison?


Oh, I suppose nothing generates exact divisions by negative numbers.

Richard.

> Thanks,

> Richard.

>

>> 2016-11-07  Marc Glisse  <marc.glisse@inria.fr>

>>

>> gcc/

>>         * fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR.

>>         * match.pd (A /[ex] B CMP C): New simplifications.

>>

>> gcc/testsuite/

>>         * gcc.dg/tree-ssa/cmpexactdiv.c: New file.

>>

>> --

>> Marc Glisse
Marc Glisse Nov. 4, 2016, 1:32 p.m. UTC | #3
On Fri, 4 Nov 2016, Richard Biener wrote:

> On Fri, Nov 4, 2016 at 2:15 PM, Richard Biener

> <richard.guenther@gmail.com> wrote:

>> On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse <marc.glisse@inria.fr> wrote:

>>> Hello,

>>>

>>> this kind of simplification is already handled by fold_comparison, but the

>>> code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In

>>> particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can

>>> be turned into X==0.

>>>

>>> When we have an explicit division by 0, there is a better optimisation

>>> possible (insert __builtin_unreachable() or __builtin_trap() after that

>>> statement, as in find_explicit_erroneous_behavior), so I don't touch it.

>>>

>>> For the overflow inequality case, it would have been a bit clearer to

>>> generate

>>>   (cmp { build_zero_cst (TREE_TYPE (@0)); } @2)

>>> and let that be constant folded instead of having that ugly and error-prone

>>> test in constant_boolean_node, but I saved one tree ;-)

>>>

>>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are

>>> in the libitm part of the testsuite, they should be fixed by

>>> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the

>>> testsuite when that patch is in.

>>

>> Ok.

>>

>> You fail to handle A /[ex] -2 < 2?  Is that on purpose?  Or just lazy so you

>> dont' have to handle inverting the comparison?

>

> Oh, I suppose nothing generates exact divisions by negative numbers.


Yes, that :-) I think I'd rather wait until I can test it before 
simplifying those too much.

-- 
Marc Glisse
diff mbox

Patch

Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(revision 241840)
+++ gcc/fold-const.c	(working copy)
@@ -8740,22 +8740,21 @@  fold_comparison (location_t loc, enum tr
 		  SET_EXPR_LOCATION (tem, loc);
 		  return tem;
 		}
 	      return fold_build2_loc (loc, code, type, cval1, cval2);
 	    }
 	}
     }
 
   /* We can fold X/C1 op C2 where C1 and C2 are integer constants
      into a single range test.  */
-  if ((TREE_CODE (arg0) == TRUNC_DIV_EXPR
-       || TREE_CODE (arg0) == EXACT_DIV_EXPR)
+  if (TREE_CODE (arg0) == TRUNC_DIV_EXPR
       && TREE_CODE (arg1) == INTEGER_CST
       && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
       && !integer_zerop (TREE_OPERAND (arg0, 1))
       && !TREE_OVERFLOW (TREE_OPERAND (arg0, 1))
       && !TREE_OVERFLOW (arg1))
     {
       tem = fold_div_compare (loc, code, type, arg0, arg1);
       if (tem != NULL_TREE)
 	return tem;
     }
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 241840)
+++ gcc/match.pd	(working copy)
@@ -2314,20 +2314,50 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	   (ne @0 { build_real (TREE_TYPE (@0), c2); }))))
 	/* sqrt(x) < c is the same as x < c*c, if we ignore NaNs.  */
 	(if (! HONOR_NANS (@0))
 	 (cmp @0 { build_real (TREE_TYPE (@0), c2); })
 	 /* sqrt(x) < c is the same as x >= 0 && x < c*c.  */
 	 (if (GENERIC)
 	  (truth_andif
 	   (ge @0 { build_real (TREE_TYPE (@0), dconst0); })
 	   (cmp @0 { build_real (TREE_TYPE (@0), c2); }))))))))))))
 
+/* Fold A /[ex] B CMP C to A CMP B * C.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (exact_div @0 @1) INTEGER_CST@2)
+  (if (!integer_zerop (@1))
+   (if (wi::eq_p (@2, 0))
+    (cmp @0 @2)
+    (if (TREE_CODE (@1) == INTEGER_CST)
+     (with
+      {
+	bool ovf;
+	wide_int prod = wi::mul (@2, @1, TYPE_SIGN (TREE_TYPE (@1)), &ovf);
+      }
+      (if (ovf)
+       { constant_boolean_node (cmp == NE_EXPR, type); }
+       (cmp @0 { wide_int_to_tree (TREE_TYPE (@0), prod); }))))))))
+(for cmp (lt le gt ge)
+ (simplify
+  (cmp (exact_div @0 INTEGER_CST@1) INTEGER_CST@2)
+  (if (wi::gt_p (@1, 0, TYPE_SIGN (TREE_TYPE (@1))))
+   (with
+    {
+      bool ovf;
+      wide_int prod = wi::mul (@2, @1, TYPE_SIGN (TREE_TYPE (@1)), &ovf);
+    }
+    (if (ovf)
+     { constant_boolean_node (wi::lt_p (@2, 0, TYPE_SIGN (TREE_TYPE (@2)))
+			      != (cmp == LT_EXPR || cmp == LE_EXPR), type); }
+     (cmp @0 { wide_int_to_tree (TREE_TYPE (@0), prod); }))))))
+
 /* Unordered tests if either argument is a NaN.  */
 (simplify
  (bit_ior (unordered @0 @0) (unordered @1 @1))
  (if (types_match (@0, @1))
   (unordered @0 @1)))
 (simplify
  (bit_and (ordered @0 @0) (ordered @1 @1))
  (if (types_match (@0, @1))
   (ordered @0 @1)))
 (simplify
Index: gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv.c	(working copy)
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int f(int *p, int *q){
+  __SIZE_TYPE__ n = q - p;
+  return n == 0;
+}
+
+int g(int *p, int *q){
+  __PTRDIFF_TYPE__ n = q - p;
+  return n <= 2;
+}
+
+int h(long *p, long *q){
+  __SIZE_TYPE__ n = q - p;
+  return n == (__SIZE_TYPE__)(-1)/2;
+}
+
+/* { dg-final { scan-tree-dump-not "== 0" "optimized" } } */
+/* { dg-final { scan-tree-dump "<= 8" "optimized" { target { int32 } } } } */
+/* { dg-final { scan-tree-dump "return 0" "optimized" } } */