Message ID | 87lggu41h9.fsf@linaro.org |
---|---|
State | Accepted |
Commit | 09a7858b2c53eccf28f780f5f3e4f2764f440eb1 |
Headers | show |
Series | Check whether any statements need masking (PR 83922) | expand |
On Fri, Jan 19, 2018 at 10:55 AM, Richard Sandiford <richard.sandiford@linaro.org> wrote: > This PR is an odd case in which, due to the low optimisation level, > we enter vectorisation with: > > outer1: > x_1 = PHI <x_3(outer2), ...>; > ... > > inner: > x_2 = 0; > ... > > outer2: > x_3 = PHI <x_2(inner)>; > > These statements are tentatively treated as a double reduction by > vect_force_simple_reduction, but in the end only x_3 and x_2 are marked > as relevant. vect_analyze_loop_operations skips over x_3, leaving the > vectorizable_reduction check to a presumed future test of x_1, which > in this case never happens. We therefore end up vectorising x_2 only > (complete with peeling for niters!) and leave the scalar x_3 in place. > > This caused a segfault in the support for fully-masked loops, > since there were no statements that needed masking. Fixed by > checking for that. > > But I think this is also a flaw in vect_analyze_loop_operations. > Outer loop vectorisation reduces the number of times that the > inner loop is executed, so it wouldn't necessarily be valid > to leave the scalar x_3 in place for all vectorisable x_2. > There's already code to forbid that when x_1 isn't present: > > /* FORNOW: we currently don't support the case that these phis > are not used in the outerloop (unless it is double reduction, > i.e., this phi is vect_reduction_def), cause this case > requires to actually do something here. */ > > I think we need to do the same if x_1 is present but not relevant. Hmm, yeah... > Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu. > OK to install? Ok. Richard. > Richard > > > 2018-01-19 Richard Sandiford <richard.sandiford@linaro.org> > > gcc/ > PR tree-optimization/83922 > * tree-vect-loop.c (vect_verify_full_masking): Return false if > there are no statements that need masking. > (vect_active_double_reduction_p): New function. > (vect_analyze_loop_operations): Use it when handling phis that > are not in the loop header. > > gcc/testsuite/ > PR tree-optimization/83922 > * gcc.dg/pr83922.c: New test. > > Index: gcc/tree-vect-loop.c > =================================================================== > --- gcc/tree-vect-loop.c 2018-01-19 09:36:33.409191362 +0000 > +++ gcc/tree-vect-loop.c 2018-01-19 09:52:00.681330865 +0000 > @@ -1294,6 +1294,12 @@ vect_verify_full_masking (loop_vec_info > struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); > unsigned int min_ni_width; > > + /* Use a normal loop if there are no statements that need masking. > + This only happens in rare degenerate cases: it means that the loop > + has no loads, no stores, and no live-out values. */ > + if (LOOP_VINFO_MASKS (loop_vinfo).is_empty ()) > + return false; > + > /* Get the maximum number of iterations that is representable > in the counter type. */ > tree ni_type = TREE_TYPE (LOOP_VINFO_NITERSM1 (loop_vinfo)); > @@ -1739,6 +1745,33 @@ vect_update_vf_for_slp (loop_vec_info lo > } > } > > +/* Return true if STMT_INFO describes a double reduction phi and if > + the other phi in the reduction is also relevant for vectorization. > + This rejects cases such as: > + > + outer1: > + x_1 = PHI <x_3(outer2), ...>; > + ... > + > + inner: > + x_2 = ...; > + ... > + > + outer2: > + x_3 = PHI <x_2(inner)>; > + > + if nothing in x_2 or elsewhere makes x_1 relevant. */ > + > +static bool > +vect_active_double_reduction_p (stmt_vec_info stmt_info) > +{ > + if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_double_reduction_def) > + return false; > + > + gimple *other_phi = STMT_VINFO_REDUC_DEF (stmt_info); > + return STMT_VINFO_RELEVANT_P (vinfo_for_stmt (other_phi)); > +} > + > /* Function vect_analyze_loop_operations. > > Scan the loop stmts and make sure they are all vectorizable. */ > @@ -1786,8 +1819,7 @@ vect_analyze_loop_operations (loop_vec_i > i.e., this phi is vect_reduction_def), cause this case > requires to actually do something here. */ > if (STMT_VINFO_LIVE_P (stmt_info) > - && STMT_VINFO_DEF_TYPE (stmt_info) > - != vect_double_reduction_def) > + && !vect_active_double_reduction_p (stmt_info)) > { > if (dump_enabled_p ()) > dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > Index: gcc/testsuite/gcc.dg/pr83922.c > =================================================================== > --- /dev/null 2018-01-19 09:30:49.543814408 +0000 > +++ gcc/testsuite/gcc.dg/pr83922.c 2018-01-19 09:52:00.680331041 +0000 > @@ -0,0 +1,21 @@ > +/* { dg-options "-O -ftree-vectorize" } */ > + > +int j4; > + > +void > +k1 (int ak) > +{ > + while (ak < 1) > + { > + int ur; > + > + for (ur = 0; ur < 2; ++ur) > + { > + ++j4; > + if (j4 != 0) > + j4 = 0; > + } > + > + ++ak; > + } > +}
Index: gcc/tree-vect-loop.c =================================================================== --- gcc/tree-vect-loop.c 2018-01-19 09:36:33.409191362 +0000 +++ gcc/tree-vect-loop.c 2018-01-19 09:52:00.681330865 +0000 @@ -1294,6 +1294,12 @@ vect_verify_full_masking (loop_vec_info struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); unsigned int min_ni_width; + /* Use a normal loop if there are no statements that need masking. + This only happens in rare degenerate cases: it means that the loop + has no loads, no stores, and no live-out values. */ + if (LOOP_VINFO_MASKS (loop_vinfo).is_empty ()) + return false; + /* Get the maximum number of iterations that is representable in the counter type. */ tree ni_type = TREE_TYPE (LOOP_VINFO_NITERSM1 (loop_vinfo)); @@ -1739,6 +1745,33 @@ vect_update_vf_for_slp (loop_vec_info lo } } +/* Return true if STMT_INFO describes a double reduction phi and if + the other phi in the reduction is also relevant for vectorization. + This rejects cases such as: + + outer1: + x_1 = PHI <x_3(outer2), ...>; + ... + + inner: + x_2 = ...; + ... + + outer2: + x_3 = PHI <x_2(inner)>; + + if nothing in x_2 or elsewhere makes x_1 relevant. */ + +static bool +vect_active_double_reduction_p (stmt_vec_info stmt_info) +{ + if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_double_reduction_def) + return false; + + gimple *other_phi = STMT_VINFO_REDUC_DEF (stmt_info); + return STMT_VINFO_RELEVANT_P (vinfo_for_stmt (other_phi)); +} + /* Function vect_analyze_loop_operations. Scan the loop stmts and make sure they are all vectorizable. */ @@ -1786,8 +1819,7 @@ vect_analyze_loop_operations (loop_vec_i i.e., this phi is vect_reduction_def), cause this case requires to actually do something here. */ if (STMT_VINFO_LIVE_P (stmt_info) - && STMT_VINFO_DEF_TYPE (stmt_info) - != vect_double_reduction_def) + && !vect_active_double_reduction_p (stmt_info)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, Index: gcc/testsuite/gcc.dg/pr83922.c =================================================================== --- /dev/null 2018-01-19 09:30:49.543814408 +0000 +++ gcc/testsuite/gcc.dg/pr83922.c 2018-01-19 09:52:00.680331041 +0000 @@ -0,0 +1,21 @@ +/* { dg-options "-O -ftree-vectorize" } */ + +int j4; + +void +k1 (int ak) +{ + while (ak < 1) + { + int ur; + + for (ur = 0; ur < 2; ++ur) + { + ++j4; + if (j4 != 0) + j4 = 0; + } + + ++ak; + } +}