Message ID | CAAgBjMniAxbrvydpQE61_xbs_PN-=3P4eSQNx6+37fi1Krbamw@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Wed, 9 Nov 2016, Prathamesh Kulkarni wrote: > On 8 November 2016 at 16:46, Richard Biener <rguenther@suse.de> wrote: > > On Tue, 8 Nov 2016, Prathamesh Kulkarni wrote: > > > >> On 8 November 2016 at 13:23, Richard Biener <rguenther@suse.de> wrote: > >> > On Mon, 7 Nov 2016, Prathamesh Kulkarni wrote: > >> > > >> >> On 7 November 2016 at 23:06, Prathamesh Kulkarni > >> >> <prathamesh.kulkarni@linaro.org> wrote: > >> >> > On 7 November 2016 at 15:43, Richard Biener <rguenther@suse.de> wrote: > >> >> >> On Fri, 4 Nov 2016, Prathamesh Kulkarni wrote: > >> >> >> > >> >> >>> On 4 November 2016 at 13:41, Richard Biener <rguenther@suse.de> wrote: > >> >> >>> > On Thu, 3 Nov 2016, Marc Glisse wrote: > >> >> >>> > > >> >> >>> >> On Thu, 3 Nov 2016, Richard Biener wrote: > >> >> >>> >> > >> >> >>> >> > > > > The transform would also work for vectors (element_precision for > >> >> >>> >> > > > > the test but also a value-matching zero which should ensure the > >> >> >>> >> > > > > same number of elements). > >> >> >>> >> > > > Um sorry, I didn't get how to check vectors to be of equal length by a > >> >> >>> >> > > > matching zero. > >> >> >>> >> > > > Could you please elaborate on that ? > >> >> >>> >> > > > >> >> >>> >> > > He may have meant something like: > >> >> >>> >> > > > >> >> >>> >> > > (op (cmp @0 integer_zerop@2) (cmp @1 @2)) > >> >> >>> >> > > >> >> >>> >> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which was the > >> >> >>> >> > point of the pattern. > >> >> >>> >> > >> >> >>> >> Oups, that's what I had written first, and then I somehow managed to confuse > >> >> >>> >> myself enough to remove it so as to remove the call to types_match :-( > >> >> >>> >> > >> >> >>> >> > > So the last operand is checked with operand_equal_p instead of > >> >> >>> >> > > integer_zerop. But the fact that we could compute bit_ior on the > >> >> >>> >> > > comparison results should already imply that the number of elements is the > >> >> >>> >> > > same. > >> >> >>> >> > > >> >> >>> >> > Though for equality compares we also allow scalar results IIRC. > >> >> >>> >> > >> >> >>> >> Oh, right, I keep forgetting that :-( And I have no idea how to generate one > >> >> >>> >> for a testcase, at least until the GIMPLE FE lands... > >> >> >>> >> > >> >> >>> >> > > On platforms that have IOR on floats (at least x86 with SSE, maybe some > >> >> >>> >> > > vector mode on s390?), it would be cool to do the same for floats (most > >> >> >>> >> > > likely at the RTL level). > >> >> >>> >> > > >> >> >>> >> > On GIMPLE view-converts could come to the rescue here as well. Or we cab > >> >> >>> >> > just allow bit-and/or on floats as much as we allow them on pointers. > >> >> >>> >> > >> >> >>> >> Would that generate sensible code on targets that do not have logic insns for > >> >> >>> >> floats? Actually, even on x86_64 that generates inefficient code, so there > >> >> >>> >> would be some work (for instance grep finds no gen_iordf3, only gen_iorv2df3). > >> >> >>> >> > >> >> >>> >> I am also a bit wary of doing those obfuscating optimizations too early... > >> >> >>> >> a==0 is something that other optimizations might use. long > >> >> >>> >> c=(long&)a|(long&)b; (double&)c==0; less so... > >> >> >>> >> > >> >> >>> >> (and I am assuming that signaling NaNs don't make the whole transformation > >> >> >>> >> impossible, which might be wrong) > >> >> >>> > > >> >> >>> > Yeah. I also think it's not so much important - I just wanted to mention > >> >> >>> > vectors... > >> >> >>> > > >> >> >>> > Btw, I still think we need a more sensible infrastructure for passes > >> >> >>> > to gather, analyze and modify complex conditions. (I'm always pointing > >> >> >>> > to tree-affine.c as an, albeit not very good, example for handling > >> >> >>> > a similar problem) > >> >> >>> Thanks for mentioning the value-matching capture @@, I wasn't aware of > >> >> >>> this match.pd feature. > >> >> >>> The current patch keeps it restricted to only bitwise operators on integers. > >> >> >>> Bootstrap+test running on x86_64-unknown-linux-gnu. > >> >> >>> OK to commit if passes ? > >> >> >> > >> >> >> +/* PR35691: Transform > >> >> >> + (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0. > >> >> >> + (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0. */ > >> >> >> + > >> >> >> > >> >> >> Please omit the vertical space > >> >> >> > >> >> >> +(for bitop (bit_and bit_ior) > >> >> >> + cmp (eq ne) > >> >> >> + (simplify > >> >> >> + (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop)) > >> >> >> > >> >> >> if you capture the first integer_zerop as @2 then you can re-use it... > >> >> >> > >> >> >> + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > >> >> >> + && INTEGRAL_TYPE_P (TREE_TYPE (@1)) > >> >> >> + && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE > >> >> >> (@1))) > >> >> >> + (cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0)); > >> >> >> > >> >> >> ... here inplace of the { build_zero_cst ... }. > >> >> >> > >> >> >> Ok with that changes. > >> >> > Thanks, committed the attached version as r241915. > >> >> ugh, the svn commit message has: > >> >> > >> >> testsuite/ > >> >> * gcc.dg/pr35691-1.c: New test-case. > >> >> * gcc.dg/pr35691-4.c: Likewise. > >> >> > >> >> pr35691-4.c was a typo, should be pr35691-2.c :/ > >> >> However testsuite/ChangeLog correctly has entry for pr35691-2.c > >> >> Is it possible to edit the commit message for r241915 ? > >> >> Sorry about this. > >> > > >> > No, just leave it as-is. > >> Hi, > >> Chritstophe reported to me that the commit caused test-cases > >> pr35691-1.c and pr35691-2.c (which were added by the commit) > >> to FAIL for cortex-a5: > >> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/241915/arm-none-linux-gnueabihf/diff-gcc-rh60-arm-none-linux-gnueabihf-arm-cortex-a5-vfpv3-d16-fp16.txt > >> > >> It seems truth_andif_expr is not simplified to bit_and_expr on > >> cortex-a5 as for x86_64 (and other arm variants). > >> The differences in dumps start from 004t.gimple for pr35691-1.c: > >> > >> x86_64 gimple dump: > >> foo (int z0, unsigned int z1) > >> { > >> int D.1800; > >> int t0; > >> int t1; > >> int t2; > >> > >> _1 = z0 == 0; > >> t0 = (int) _1; > >> _2 = z1 == 0; > >> t1 = (int) _2; > >> _3 = t0 != 0; > >> _4 = t1 != 0; > >> _5 = _3 & _4; > >> t2 = (int) _5; > >> D.1800 = t2; > >> return D.1800; > >> } > >> > >> cortex-a5 gimple dump: > >> foo (int z0, unsigned int z1) > >> { > >> int iftmp.0; > >> int D.4176; > >> int t0; > >> int t1; > >> int t2; > >> > >> _1 = z0 == 0; > >> t0 = (int) _1; > >> _2 = z1 == 0; > >> t1 = (int) _2; > >> if (t0 != 0) goto <D.4174>; else goto <D.4172>; > >> <D.4174>: > >> if (t1 != 0) goto <D.4175>; else goto <D.4172>; > >> <D.4175>: > >> iftmp.0 = 1; > >> goto <D.4173>; > >> <D.4172>: > >> iftmp.0 = 0; > >> <D.4173>: > >> t2 = iftmp.0; > >> D.4176 = t2; > >> return D.4176; > >> } > >> > >> Since the pattern expects truth_andif_expr to be converted to bit_and_expr, > >> it fails to match for cortex-a5. > >> This seems to happen only for cortex-a5 (the other variants a9, a15, > >> a57 are OK). > >> > >> Is my assumption that truth_andif_expr would be always converted to bit_and_expr > >> for above case incorrect ? > > > > Yes, it depends on LOGICAL_OP_SHORT_CIRCUIT. > Thanks, I wasn't aware of that. I adjusted the test-case in the attached patch, > to always contain bit_and_expr/bit_ior_expr, which passes on cortex-a5 > and powerpc. > I will update PR35691 with the comment that it's partially fixed, for > targets when LOGICAL_OP_NON_SHORT_CIRCUIT is true. > Cross-testing on arm*-*-* in progress. > Ok to commit if passes ? Ok. Richard.
diff --git a/gcc/testsuite/gcc.dg/pr35691-1.c b/gcc/testsuite/gcc.dg/pr35691-1.c index 5211f815..125923d 100644 --- a/gcc/testsuite/gcc.dg/pr35691-1.c +++ b/gcc/testsuite/gcc.dg/pr35691-1.c @@ -5,7 +5,7 @@ int foo(int z0, unsigned z1) { int t0 = (z0 == 0); int t1 = (z1 == 0); - int t2 = (t0 && t1); + int t2 = (t0 & t1); return t2; } diff --git a/gcc/testsuite/gcc.dg/pr35691-2.c b/gcc/testsuite/gcc.dg/pr35691-2.c index 90cbf6d..70f68a6 100644 --- a/gcc/testsuite/gcc.dg/pr35691-2.c +++ b/gcc/testsuite/gcc.dg/pr35691-2.c @@ -5,7 +5,7 @@ int foo(int z0, unsigned z1) { int t0 = (z0 != 0); int t1 = (z1 != 0); - int t2 = (t0 || t1); + int t2 = (t0 | t1); return t2; }