From patchwork Sun Nov 6 09:02:00 2011
X-Patchwork-Submitter: Ira Rosen
X-Patchwork-Id: 4943
Date: Sun, 6 Nov 2011 11:02:00 +0200
Subject: [patch] SLP conditions
From: Ira Rosen
To: gcc-patches@gcc.gnu.org
Cc: Patch Tracking

Hi,

This patch adds support for conditions in SLP. It also fixes a bug in
pattern handling in SLP (pattern statements, rather than the original
statements, should be placed in the root node), and allows pattern
def-stmts in SLP.

Bootstrapped on powerpc64-suse-linux and tested on powerpc64-suse-linux
and x86_64-suse-linux. Committed.

Ira

ChangeLog:

        * tree-vectorizer.h (vectorizable_condition): Add argument.
        * tree-vect-loop.c (vectorizable_reduction): Fail for condition
        in SLP. Update calls to vectorizable_condition.
        * tree-vect-stmts.c (vect_is_simple_cond): Add basic block info to
        the arguments. Pass it to vect_is_simple_use_1.
        (vectorizable_condition): Add slp_node to the arguments. Support
        vectorization of basic blocks. Fail for reduction in SLP. Update
        calls to vect_is_simple_cond and vect_is_simple_use. Support SLP:
        call vect_get_slp_defs to get vector operands.
        (vect_analyze_stmt): Update calls to vectorizable_condition.
        (vect_transform_stmt): Likewise.
        * tree-vect-slp.c (vect_create_new_slp_node): Handle COND_EXPR.
        (vect_get_and_check_slp_defs): Handle COND_EXPR. Allow pattern
        def stmts.
        (vect_build_slp_tree): Handle COND_EXPR.
        (vect_analyze_slp_instance): Push pattern statements to root node.
        (vect_get_constant_vectors): Fix comments. Handle COND_EXPR.

testsuite/ChangeLog:

        * gcc.dg/vect/bb-slp-cond-1.c: New test.
        * gcc.dg/vect/slp-cond-1.c: New test.
        * gcc.dg/vect/slp-cond-2.c: New test.

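For readers who want a concrete picture of what "conditions in SLP" means, here is a minimal sketch modeled on f2 from the new slp-cond-1.c test. It is not part of the patch. The names select_pairs and check_select_pairs and the input pattern are hypothetical, chosen only for illustration. Each loop iteration performs two adjacent conditional stores, which is the kind of statement group the SLP vectorizer may now consider; whether it is actually vectorized depends on the target and compiler options.

```c
#include <assert.h>

#define N 32

int a[N], b[N], k[N];

/* Illustrative only: two adjacent conditional stores per iteration form
   a group of size two.  Both statements use the same comparison code (<)
   but different constants in the then/else arms.  */
__attribute__((noinline)) void
select_pairs (void)
{
  int i;

  for (i = 0; i < N / 2; ++i)
    {
      k[2*i]     = a[2*i]     < b[2*i]     ? 0 : 24;
      k[2*i + 1] = a[2*i + 1] < b[2*i + 1] ? 7 : 4;
    }
}

/* Fill the inputs with an arbitrary pattern, run the loop, and compare
   every element against a scalar recomputation of the same selection.  */
void
check_select_pairs (void)
{
  int i;

  for (i = 0; i < N; i++)
    {
      a[i] = i % 3;
      b[i] = 1;
    }

  select_pairs ();

  for (i = 0; i < N; i++)
    {
      int less = a[i] < b[i];
      int expected = (i % 2 == 0) ? (less ? 0 : 24) : (less ? 7 : 4);
      assert (k[i] == expected);
    }
}
```

Calling check_select_pairs () from a small main and compiling with something like `gcc -O2 -ftree-vectorize -fdump-tree-vect-details` should show in the dump whether the statements were vectorized using SLP on a given target; the result is the same either way, since vectorization must not change the values stored.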
Index: testsuite/gcc.dg/vect/bb-slp-cond-1.c
===================================================================
--- testsuite/gcc.dg/vect/bb-slp-cond-1.c (revision 0)
+++ testsuite/gcc.dg/vect/bb-slp-cond-1.c (revision 0)
@@ -0,0 +1,46 @@
+/* { dg-require-effective-target vect_condition } */
+
+#include "tree-vect.h"
+
+#define N 128
+
+__attribute__((noinline, noclone)) void
+foo (int *a, int stride)
+{
+  int i;
+
+  for (i = 0; i < N/stride; i++, a += stride)
+   {
+     a[0] = a[0] ? 1 : 5;
+     a[1] = a[1] ? 2 : 6;
+     a[2] = a[2] ? 3 : 7;
+     a[3] = a[3] ? 4 : 8;
+   }
+}
+
+
+int a[N];
+int main ()
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+    a[i] = i;
+
+  foo (a, 4);
+
+  for (i = 1; i < N; i++)
+    if (a[i] != i%4 + 1)
+      abort ();
+
+  if (a[0] != 5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_element_align } } } */
+/* { dg-final { cleanup-tree-dump "slp" } } */
+
Index: testsuite/gcc.dg/vect/slp-cond-1.c
===================================================================
--- testsuite/gcc.dg/vect/slp-cond-1.c (revision 0)
+++ testsuite/gcc.dg/vect/slp-cond-1.c (revision 0)
@@ -0,0 +1,126 @@
+/* { dg-require-effective-target vect_condition } */
+#include "tree-vect.h"
+
+#define N 32
+int a[N], b[N];
+int d[N], e[N];
+int k[N];
+
+__attribute__((noinline, noclone)) void
+f1 (void)
+{
+  int i;
+  for (i = 0; i < N/4; i++)
+    {
+      k[4*i] = a[4*i] < b[4*i] ? 17 : 0;
+      k[4*i+1] = a[4*i+1] < b[4*i+1] ? 17 : 0;
+      k[4*i+2] = a[4*i+2] < b[4*i+2] ? 17 : 0;
+      k[4*i+3] = a[4*i+3] < b[4*i+3] ? 17 : 0;
+    }
+}
+
+__attribute__((noinline, noclone)) void
+f2 (void)
+{
+  int i;
+  for (i = 0; i < N/2; ++i)
+    {
+      k[2*i] = a[2*i] < b[2*i] ? 0 : 24;
+      k[2*i+1] = a[2*i+1] < b[2*i+1] ? 7 : 4;
+    }
+}
+
+__attribute__((noinline, noclone)) void
+f3 (void)
+{
+  int i;
+  for (i = 0; i < N/2; ++i)
+    {
+      k[2*i] = a[2*i] < b[2*i] ? 51 : 12;
+      k[2*i+1] = a[2*i+1] > b[2*i+1] ? 51 : 12;
+    }
+}
+
+__attribute__((noinline, noclone)) void
+f4 (void)
+{
+  int i;
+  for (i = 0; i < N/2; ++i)
+    {
+      int d0 = d[2*i], e0 = e[2*i];
+      int d1 = d[2*i+1], e1 = e[2*i+1];
+      k[2*i] = a[2*i] >= b[2*i] ? d0 : e0;
+      k[2*i+1] = a[2*i+1] >= b[2*i+1] ? d1 : e1;
+    }
+}
+
+int
+main ()
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+    {
+      switch (i % 9)
+        {
+        case 0: asm (""); a[i] = - i - 1; b[i] = i + 1; break;
+        case 1: a[i] = 0; b[i] = 0; break;
+        case 2: a[i] = i + 1; b[i] = - i - 1; break;
+        case 3: a[i] = i; b[i] = i + 7; break;
+        case 4: a[i] = i; b[i] = i; break;
+        case 5: a[i] = i + 16; b[i] = i + 3; break;
+        case 6: a[i] = - i - 5; b[i] = - i; break;
+        case 7: a[i] = - i; b[i] = - i; break;
+        case 8: a[i] = - i; b[i] = - i - 7; break;
+        }
+      d[i] = i;
+      e[i] = 2 * i;
+    }
+  f1 ();
+  for (i = 0; i < N; i++)
+    if (k[i] != ((i % 3) == 0 ? 17 : 0))
+      abort ();
+
+  f2 ();
+  for (i = 0; i < N; i++)
+    {
+      switch (i % 9)
+        {
+        case 0:
+        case 6:
+          if (k[i] != ((i/9 % 2) == 0 ? 0 : 7))
+            abort ();
+          break;
+        case 1:
+        case 5:
+        case 7:
+          if (k[i] != ((i/9 % 2) == 0 ? 4 : 24))
+            abort ();
+          break;
+        case 2:
+        case 4:
+        case 8:
+          if (k[i] != ((i/9 % 2) == 0 ? 24 : 4))
+            abort ();
+          break;
+        case 3:
+          if (k[i] != ((i/9 % 2) == 0 ? 7 : 0))
+            abort ();
+          break;
+        }
+    }
+
+  f3 ();
+
+  f4 ();
+  for (i = 0; i < N; i++)
+    if (k[i] != ((i % 3) == 0 ? e[i] : d[i]))
+      abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/slp-cond-2.c
===================================================================
--- testsuite/gcc.dg/vect/slp-cond-2.c (revision 0)
+++ testsuite/gcc.dg/vect/slp-cond-2.c (revision 0)
@@ -0,0 +1,127 @@
+/* { dg-require-effective-target vect_cond_mixed } */
+#include "tree-vect.h"
+
+#define N 32
+int d[N], e[N], f[N];
+unsigned char k[N];
+float a[N], b[N];
+
+__attribute__((noinline, noclone)) void
+f1 (void)
+{
+  int i;
+  for (i = 0; i < N/4; i++)
+    {
+      k[4*i] = a[4*i] < b[4*i] ? 17 : 0;
+      k[4*i+1] = a[4*i+1] < b[4*i+1] ? 17 : 0;
+      k[4*i+2] = a[4*i+2] < b[4*i+2] ? 17 : 0;
+      k[4*i+3] = a[4*i+3] < b[4*i+3] ? 17 : 0;
+    }
+}
+
+__attribute__((noinline, noclone)) void
+f2 (void)
+{
+  int i;
+  for (i = 0; i < N/2; ++i)
+    {
+      k[2*i] = a[2*i] < b[2*i] ? 0 : 24;
+      k[2*i+1] = a[2*i+1] < b[2*i+1] ? 7 : 4;
+    }
+}
+
+__attribute__((noinline, noclone)) void
+f3 (void)
+{
+  int i;
+  for (i = 0; i < N/2; ++i)
+    {
+      k[2*i] = a[2*i] < b[2*i] ? 51 : 12;
+      k[2*i+1] = a[2*i+1] > b[2*i+1] ? 51 : 12;
+    }
+}
+
+__attribute__((noinline, noclone)) void
+f4 (void)
+{
+  int i;
+  for (i = 0; i < N/2; ++i)
+    {
+      int d0 = d[2*i], e0 = e[2*i];
+      int d1 = d[2*i+1], e1 = e[2*i+1];
+      f[2*i] = a[2*i] >= b[2*i] ? d0 : e0;
+      f[2*i+1] = a[2*i+1] >= b[2*i+1] ? d1 : e1;
+    }
+}
+
+int
+main ()
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+    {
+      switch (i % 9)
+        {
+        case 0: asm (""); a[i] = - i - 1; b[i] = i + 1; break;
+        case 1: a[i] = 0; b[i] = 0; break;
+        case 2: a[i] = i + 1; b[i] = - i - 1; break;
+        case 3: a[i] = i; b[i] = i + 7; break;
+        case 4: a[i] = i; b[i] = i; break;
+        case 5: a[i] = i + 16; b[i] = i + 3; break;
+        case 6: a[i] = - i - 5; b[i] = - i; break;
+        case 7: a[i] = - i; b[i] = - i; break;
+        case 8: a[i] = - i; b[i] = - i - 7; break;
+        }
+      d[i] = i;
+      e[i] = 2 * i;
+    }
+
+  f1 ();
+  for (i = 0; i < N; i++)
+    if (k[i] != ((i % 3) == 0 ? 17 : 0))
+      abort ();
+
+  f2 ();
+  for (i = 0; i < N; i++)
+    {
+      switch (i % 9)
+        {
+        case 0:
+        case 6:
+          if (k[i] != ((i/9 % 2) == 0 ? 0 : 7))
+            abort ();
+          break;
+        case 1:
+        case 5:
+        case 7:
+          if (k[i] != ((i/9 % 2) == 0 ? 4 : 24))
+            abort ();
+          break;
+        case 2:
+        case 4:
+        case 8:
+          if (k[i] != ((i/9 % 2) == 0 ? 24 : 4))
+            abort ();
+          break;
+        case 3:
+          if (k[i] != ((i/9 % 2) == 0 ? 7 : 0))
+            abort ();
+          break;
+        }
+    }
+
+  f3 ();
+
+  f4 ();
+  for (i = 0; i < N; i++)
+    if (f[i] != ((i % 3) == 0 ? e[i] : d[i]))
+      abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
Index: tree-vectorizer.h
===================================================================
--- tree-vectorizer.h (revision 181025)
+++ tree-vectorizer.h (working copy)
@@ -837,7 +837,7 @@ extern bool vect_transform_stmt (gimple, gimple_st
 extern void vect_remove_stores (gimple);
 extern bool vect_analyze_stmt (gimple, bool *, slp_tree);
 extern bool vectorizable_condition (gimple, gimple_stmt_iterator *, gimple *,
-                                    tree, int);
+                                    tree, int, slp_tree);
 extern void vect_get_load_cost (struct data_reference *, int, bool,
                                 unsigned int *, unsigned int *);
 extern void vect_get_store_cost (struct data_reference *, int, unsigned int *);
Index: tree-vect-loop.c
===================================================================
--- tree-vect-loop.c (revision 181025)
+++ tree-vect-loop.c (working copy)
@@ -4416,6 +4416,9 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
       gcc_unreachable ();
     }
 
+  if (code == COND_EXPR && slp_node)
+    return false;
+
   scalar_dest = gimple_assign_lhs (stmt);
   scalar_type = TREE_TYPE (scalar_dest);
   if (!POINTER_TYPE_P (scalar_type) && !INTEGRAL_TYPE_P (scalar_type)
@@ -4502,7 +4505,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
 
   if (code == COND_EXPR)
     {
-      if (!vectorizable_condition (stmt, gsi, NULL, ops[reduc_index], 0))
+      if (!vectorizable_condition (stmt, gsi, NULL, ops[reduc_index], 0, NULL))
         {
           if (vect_print_dump_info (REPORT_DETAILS))
             fprintf (vect_dump, "unsupported condition in reduction");
@@ -4774,7 +4777,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_i
           gcc_assert (!slp_node);
           vectorizable_condition (stmt, gsi, vec_stmt,
                                   PHI_RESULT (VEC_index (gimple, phis, 0)),
-                                  reduc_index);
+                                  reduc_index, NULL);
           /* Multiple types are not supported for condition. */
           break;
         }
Index: tree-vect-stmts.c
===================================================================
--- tree-vect-stmts.c (revision 181025)
+++ tree-vect-stmts.c (working copy)
@@ -4606,7 +4606,8 @@ vectorizable_load (gimple stmt, gimple_stmt_iterat
    condition operands are supportable using vec_is_simple_use. */
 
 static bool
-vect_is_simple_cond (tree cond, loop_vec_info loop_vinfo, tree *comp_vectype)
+vect_is_simple_cond (tree cond, loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
+                     tree *comp_vectype)
 {
   tree lhs, rhs;
   tree def;
@@ -4622,7 +4623,7 @@ static bool
   if (TREE_CODE (lhs) == SSA_NAME)
     {
       gimple lhs_def_stmt = SSA_NAME_DEF_STMT (lhs);
-      if (!vect_is_simple_use_1 (lhs, loop_vinfo, NULL, &lhs_def_stmt, &def,
+      if (!vect_is_simple_use_1 (lhs, loop_vinfo, bb_vinfo, &lhs_def_stmt, &def,
                                  &dt, &vectype1))
         return false;
     }
@@ -4633,11 +4634,11 @@ static bool
   if (TREE_CODE (rhs) == SSA_NAME)
     {
       gimple rhs_def_stmt = SSA_NAME_DEF_STMT (rhs);
-      if (!vect_is_simple_use_1 (rhs, loop_vinfo, NULL, &rhs_def_stmt, &def,
+      if (!vect_is_simple_use_1 (rhs, loop_vinfo, bb_vinfo, &rhs_def_stmt, &def,
                                  &dt, &vectype2))
         return false;
     }
-  else if (TREE_CODE (rhs) != INTEGER_CST && TREE_CODE (rhs) != REAL_CST
+  else if (TREE_CODE (rhs) != INTEGER_CST && TREE_CODE (rhs) != REAL_CST
            && TREE_CODE (rhs) != FIXED_CST)
     return false;
 
@@ -4660,7 +4661,8 @@ static bool
 
 bool
 vectorizable_condition (gimple stmt, gimple_stmt_iterator *gsi,
-                        gimple *vec_stmt, tree reduc_def, int reduc_index)
+                        gimple *vec_stmt, tree reduc_def, int reduc_index,
+                        slp_tree slp_node)
 {
   tree scalar_dest = NULL_TREE;
   tree vec_dest = NULL_TREE;
@@ -4676,25 +4678,29 @@ vectorizable_condition (gimple stmt, gimple_stmt_i
   tree def;
   enum vect_def_type dt, dts[4];
   int nunits = TYPE_VECTOR_SUBPARTS (vectype);
-  int ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
+  int ncopies;
   enum tree_code code;
   stmt_vec_info prev_stmt_info = NULL;
-  int j;
+  int i, j;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
+  VEC (tree, heap) *vec_oprnds0 = NULL, *vec_oprnds1 = NULL;
+  VEC (tree, heap) *vec_oprnds2 = NULL, *vec_oprnds3 = NULL;
 
-  /* FORNOW: unsupported in basic block SLP. */
-  gcc_assert (loop_vinfo);
+  if (slp_node || PURE_SLP_STMT (stmt_info))
+    ncopies = 1;
+  else
+    ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
 
-  /* FORNOW: SLP not supported. */
-  if (STMT_SLP_TYPE (stmt_info))
-    return false;
-
   gcc_assert (ncopies >= 1);
   if (reduc_index && ncopies > 1)
     return false; /* FORNOW */
 
-  if (!STMT_VINFO_RELEVANT_P (stmt_info))
+  if (reduc_index && STMT_SLP_TYPE (stmt_info))
     return false;
 
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
+    return false;
+
   if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
       && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
            && reduc_def))
@@ -4721,14 +4727,14 @@ vectorizable_condition (gimple stmt, gimple_stmt_i
   then_clause = gimple_assign_rhs2 (stmt);
   else_clause = gimple_assign_rhs3 (stmt);
 
-  if (!vect_is_simple_cond (cond_expr, loop_vinfo, &comp_vectype)
+  if (!vect_is_simple_cond (cond_expr, loop_vinfo, bb_vinfo, &comp_vectype)
       || !comp_vectype)
     return false;
 
   if (TREE_CODE (then_clause) == SSA_NAME)
     {
       gimple then_def_stmt = SSA_NAME_DEF_STMT (then_clause);
-      if (!vect_is_simple_use (then_clause, loop_vinfo, NULL,
+      if (!vect_is_simple_use (then_clause, loop_vinfo, bb_vinfo,
                                &then_def_stmt, &def, &dt))
         return false;
     }
@@ -4740,7 +4746,7 @@ vectorizable_condition (gimple stmt, gimple_stmt_i
   if (TREE_CODE (else_clause) == SSA_NAME)
     {
       gimple else_def_stmt = SSA_NAME_DEF_STMT (else_clause);
-      if (!vect_is_simple_use (else_clause, loop_vinfo, NULL,
+      if (!vect_is_simple_use (else_clause, loop_vinfo, bb_vinfo,
                                &else_def_stmt, &def, &dt))
         return false;
     }
@@ -4755,8 +4761,16 @@ vectorizable_condition (gimple stmt, gimple_stmt_i
       return expand_vec_cond_expr_p (vectype, comp_vectype);
     }
 
-  /* Transform */
+  /* Transform. */
 
+  if (!slp_node)
+    {
+      vec_oprnds0 = VEC_alloc (tree, heap, 1);
+      vec_oprnds1 = VEC_alloc (tree, heap, 1);
+      vec_oprnds2 = VEC_alloc (tree, heap, 1);
+      vec_oprnds3 = VEC_alloc (tree, heap, 1);
+    }
+
   /* Handle def. */
   scalar_dest = gimple_assign_lhs (stmt);
   vec_dest = vect_create_destination_var (scalar_dest, vectype);
@@ -4764,67 +4778,118 @@ vectorizable_condition (gimple stmt, gimple_stmt_i
   /* Handle cond expr. */
   for (j = 0; j < ncopies; j++)
     {
-      gimple new_stmt;
+      gimple new_stmt = NULL;
       if (j == 0)
         {
-          gimple gtemp;
-          vec_cond_lhs =
+          if (slp_node)
+            {
+              VEC (tree, heap) *ops = VEC_alloc (tree, heap, 4);
+              VEC (slp_void_p, heap) *vec_defs;
+
+              vec_defs = VEC_alloc (slp_void_p, heap, 4);
+              VEC_safe_push (tree, heap, ops, TREE_OPERAND (cond_expr, 0));
+              VEC_safe_push (tree, heap, ops, TREE_OPERAND (cond_expr, 1));
+              VEC_safe_push (tree, heap, ops, then_clause);
+              VEC_safe_push (tree, heap, ops, else_clause);
+              vect_get_slp_defs (ops, slp_node, &vec_defs, -1);
+              vec_oprnds3 = (VEC (tree, heap) *) VEC_pop (slp_void_p, vec_defs);
+              vec_oprnds2 = (VEC (tree, heap) *) VEC_pop (slp_void_p, vec_defs);
+              vec_oprnds1 = (VEC (tree, heap) *) VEC_pop (slp_void_p, vec_defs);
+              vec_oprnds0 = (VEC (tree, heap) *) VEC_pop (slp_void_p, vec_defs);
+
+              VEC_free (tree, heap, ops);
+              VEC_free (slp_void_p, heap, vec_defs);
+            }
+          else
+            {
+              gimple gtemp;
+              vec_cond_lhs =
               vect_get_vec_def_for_operand (TREE_OPERAND (cond_expr, 0),
                                             stmt, NULL);
-          vect_is_simple_use (TREE_OPERAND (cond_expr, 0), loop_vinfo,
+              vect_is_simple_use (TREE_OPERAND (cond_expr, 0), loop_vinfo,
                                   NULL, &gtemp, &def, &dts[0]);
-          vec_cond_rhs =
-              vect_get_vec_def_for_operand (TREE_OPERAND (cond_expr, 1),
-                                            stmt, NULL);
-          vect_is_simple_use (TREE_OPERAND (cond_expr, 1), loop_vinfo,
-                              NULL, &gtemp, &def, &dts[1]);
-          if (reduc_index == 1)
-            vec_then_clause = reduc_def;
-          else
-            {
-              vec_then_clause = vect_get_vec_def_for_operand (then_clause,
+
+              vec_cond_rhs =
+                vect_get_vec_def_for_operand (TREE_OPERAND (cond_expr, 1),
+                                              stmt, NULL);
+              vect_is_simple_use (TREE_OPERAND (cond_expr, 1), loop_vinfo,
+                                  NULL, &gtemp, &def, &dts[1]);
+              if (reduc_index == 1)
+                vec_then_clause = reduc_def;
+              else
+                {
+                  vec_then_clause = vect_get_vec_def_for_operand (then_clause,
+                                                                  stmt, NULL);
+                  vect_is_simple_use (then_clause, loop_vinfo,
+                                      NULL, &gtemp, &def, &dts[2]);
+                }
+              if (reduc_index == 2)
+                vec_else_clause = reduc_def;
+              else
+                {
+                  vec_else_clause = vect_get_vec_def_for_operand (else_clause,
                                                               stmt, NULL);
-              vect_is_simple_use (then_clause, loop_vinfo,
-                                  NULL, &gtemp, &def, &dts[2]);
-            }
-          if (reduc_index == 2)
-            vec_else_clause = reduc_def;
-          else
-            {
-              vec_else_clause = vect_get_vec_def_for_operand (else_clause,
-                                                              stmt, NULL);
-              vect_is_simple_use (else_clause, loop_vinfo,
+                  vect_is_simple_use (else_clause, loop_vinfo,
                                   NULL, &gtemp, &def, &dts[3]);
+                }
             }
         }
       else
         {
-          vec_cond_lhs = vect_get_vec_def_for_stmt_copy (dts[0], vec_cond_lhs);
-          vec_cond_rhs = vect_get_vec_def_for_stmt_copy (dts[1], vec_cond_rhs);
+          vec_cond_lhs = vect_get_vec_def_for_stmt_copy (dts[0],
+                                                VEC_pop (tree, vec_oprnds0));
+          vec_cond_rhs = vect_get_vec_def_for_stmt_copy (dts[1],
+                                                VEC_pop (tree, vec_oprnds1));
           vec_then_clause = vect_get_vec_def_for_stmt_copy (dts[2],
-                                                            vec_then_clause);
+                                                VEC_pop (tree, vec_oprnds2));
           vec_else_clause = vect_get_vec_def_for_stmt_copy (dts[3],
-                                                            vec_else_clause);
+                                                VEC_pop (tree, vec_oprnds3));
         }
 
+      if (!slp_node)
+        {
+          VEC_quick_push (tree, vec_oprnds0, vec_cond_lhs);
+          VEC_quick_push (tree, vec_oprnds1, vec_cond_rhs);
+          VEC_quick_push (tree, vec_oprnds2, vec_then_clause);
+          VEC_quick_push (tree, vec_oprnds3, vec_else_clause);
+        }
+
       /* Arguments are ready. Create the new vector stmt. */
-      vec_compare = build2 (TREE_CODE (cond_expr), vectype,
-                            vec_cond_lhs, vec_cond_rhs);
-      vec_cond_expr = build3 (VEC_COND_EXPR, vectype,
-                              vec_compare, vec_then_clause, vec_else_clause);
+      FOR_EACH_VEC_ELT (tree, vec_oprnds0, i, vec_cond_lhs)
+        {
+          vec_cond_rhs = VEC_index (tree, vec_oprnds1, i);
+          vec_then_clause = VEC_index (tree, vec_oprnds2, i);
+          vec_else_clause = VEC_index (tree, vec_oprnds3, i);
 
-      new_stmt = gimple_build_assign (vec_dest, vec_cond_expr);
-      new_temp = make_ssa_name (vec_dest, new_stmt);
-      gimple_assign_set_lhs (new_stmt, new_temp);
-      vect_finish_stmt_generation (stmt, new_stmt, gsi);
-      if (j == 0)
-        STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
-      else
-        STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
+          vec_compare = build2 (TREE_CODE (cond_expr), vectype,
+                                vec_cond_lhs, vec_cond_rhs);
+          vec_cond_expr = build3 (VEC_COND_EXPR, vectype,
+                         vec_compare, vec_then_clause, vec_else_clause);
 
-      prev_stmt_info = vinfo_for_stmt (new_stmt);
+          new_stmt = gimple_build_assign (vec_dest, vec_cond_expr);
+          new_temp = make_ssa_name (vec_dest, new_stmt);
+          gimple_assign_set_lhs (new_stmt, new_temp);
+          vect_finish_stmt_generation (stmt, new_stmt, gsi);
+          if (slp_node)
+            VEC_quick_push (gimple, SLP_TREE_VEC_STMTS (slp_node), new_stmt);
+        }
+
+        if (slp_node)
+          continue;
+
+        if (j == 0)
+          STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
+        else
+          STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
+
+        prev_stmt_info = vinfo_for_stmt (new_stmt);
     }
 
+  VEC_free (tree, heap, vec_oprnds0);
+  VEC_free (tree, heap, vec_oprnds1);
+  VEC_free (tree, heap, vec_oprnds2);
+  VEC_free (tree, heap, vec_oprnds3);
+
   return true;
 }
 
@@ -4996,7 +5061,7 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vect
             || vectorizable_call (stmt, NULL, NULL)
             || vectorizable_store (stmt, NULL, NULL, NULL)
             || vectorizable_reduction (stmt, NULL, NULL, NULL)
-            || vectorizable_condition (stmt, NULL, NULL, NULL, 0));
+            || vectorizable_condition (stmt, NULL, NULL, NULL, 0, NULL));
     else
       {
         if (bb_vinfo)
@@ -5005,7 +5070,8 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vect
                 || vectorizable_operation (stmt, NULL, NULL, node)
                 || vectorizable_assignment (stmt, NULL, NULL, node)
                 || vectorizable_load (stmt, NULL, NULL, node, NULL)
-                || vectorizable_store (stmt, NULL, NULL, node));
+                || vectorizable_store (stmt, NULL, NULL, node)
+                || vectorizable_condition (stmt, NULL, NULL, NULL, 0, node));
       }
 
   if (!ok)
@@ -5113,8 +5179,7 @@ vect_transform_stmt (gimple stmt, gimple_stmt_iter
       break;
 
     case condition_vec_info_type:
-      gcc_assert (!slp_node);
-      done = vectorizable_condition (stmt, gsi, &vec_stmt, NULL, 0);
+      done = vectorizable_condition (stmt, gsi, &vec_stmt, NULL, 0, slp_node);
       gcc_assert (done);
       break;
 
Index: tree-vect-slp.c
===================================================================
--- tree-vect-slp.c (revision 181025)
+++ tree-vect-slp.c (working copy)
@@ -109,7 +109,11 @@ vect_create_new_slp_node (VEC (gimple, heap) *scal
   if (is_gimple_call (stmt))
     nops = gimple_call_num_args (stmt);
   else if (is_gimple_assign (stmt))
-    nops = gimple_num_ops (stmt) - 1;
+    {
+      nops = gimple_num_ops (stmt) - 1;
+      if (gimple_assign_rhs_code (stmt) == COND_EXPR)
+        nops++;
+    }
   else
     return NULL;
 
@@ -191,20 +195,41 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vi
   bool different_types = false;
   bool pattern = false;
   slp_oprnd_info oprnd_info, oprnd0_info, oprnd1_info;
+  int op_idx = 1;
+  tree compare_rhs = NULL_TREE;
 
   if (loop_vinfo)
     loop = LOOP_VINFO_LOOP (loop_vinfo);
 
   if (is_gimple_call (stmt))
     number_of_oprnds = gimple_call_num_args (stmt);
+  else if (is_gimple_assign (stmt))
+    {
+      number_of_oprnds = gimple_num_ops (stmt) - 1;
+      if (gimple_assign_rhs_code (stmt) == COND_EXPR)
+        number_of_oprnds++;
+    }
   else
-    number_of_oprnds = gimple_num_ops (stmt) - 1;
+    return false;
 
   for (i = 0; i < number_of_oprnds; i++)
     {
-      oprnd = gimple_op (stmt, i + 1);
+      if (compare_rhs)
+        {
+          oprnd = compare_rhs;
+          compare_rhs = NULL_TREE;
+        }
+      else
+        oprnd = gimple_op (stmt, op_idx++);
+
       oprnd_info = VEC_index (slp_oprnd_info, *oprnds_info, i);
 
+      if (COMPARISON_CLASS_P (oprnd))
+        {
+          compare_rhs = TREE_OPERAND (oprnd, 1);
+          oprnd = TREE_OPERAND (oprnd, 0);
+        }
+
       if (!vect_is_simple_use (oprnd, loop_vinfo, bb_vinfo, &def_stmt, &def,
                                &dt)
           || (!def_stmt && dt != vect_constant_def))
@@ -244,8 +269,7 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vi
           def_stmt = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (def_stmt));
           dt = STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def_stmt));
 
-          if (dt == vect_unknown_def_type
-              || STMT_VINFO_PATTERN_DEF_STMT (vinfo_for_stmt (def_stmt)))
+          if (dt == vect_unknown_def_type)
             {
               if (vect_print_dump_info (REPORT_DETAILS))
                 fprintf (vect_dump, "Unsupported pattern.");
@@ -424,6 +448,7 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, bb_
   VEC (gimple, heap) *stmts = SLP_TREE_SCALAR_STMTS (*node);
   gimple stmt = VEC_index (gimple, stmts, 0);
   enum tree_code first_stmt_code = ERROR_MARK, rhs_code = ERROR_MARK;
+  enum tree_code first_cond_code = ERROR_MARK;
   tree lhs;
   bool stop_recursion = false, need_same_oprnds = false;
   tree vectype, scalar_type, first_op1 = NULL_TREE;
@@ -440,11 +465,18 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, bb_
   VEC (slp_oprnd_info, heap) *oprnds_info;
   unsigned int nops;
   slp_oprnd_info oprnd_info;
+  tree cond;
 
   if (is_gimple_call (stmt))
     nops = gimple_call_num_args (stmt);
+  else if (is_gimple_assign (stmt))
+    {
+      nops = gimple_num_ops (stmt) - 1;
+      if (gimple_assign_rhs_code (stmt) == COND_EXPR)
+        nops++;
+    }
   else
-    nops = gimple_num_ops (stmt) - 1;
+    return false;
 
   oprnds_info = vect_create_oprnd_info (nops, group_size);
 
@@ -485,6 +517,22 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, bb_
           return false;
         }
 
+      if (is_gimple_assign (stmt)
+          && gimple_assign_rhs_code (stmt) == COND_EXPR
+          && (cond = gimple_assign_rhs1 (stmt))
+          && !COMPARISON_CLASS_P (cond))
+        {
+          if (vect_print_dump_info (REPORT_SLP))
+            {
+              fprintf (vect_dump,
+                       "Build SLP failed: condition is not comparison ");
+              print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM);
+            }
+
+          vect_free_oprnd_info (&oprnds_info, true);
+          return false;
+        }
+
       scalar_type = vect_get_smallest_scalar_type (stmt, &dummy, &dummy);
       vectype = get_vectype_for_scalar_type (scalar_type);
       if (!vectype)
@@ -737,7 +785,8 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, bb_
 
           /* Not memory operation. */
           if (TREE_CODE_CLASS (rhs_code) != tcc_binary
-              && TREE_CODE_CLASS (rhs_code) != tcc_unary)
+              && TREE_CODE_CLASS (rhs_code) != tcc_unary
+              && rhs_code != COND_EXPR)
             {
               if (vect_print_dump_info (REPORT_SLP))
                 {
@@ -750,6 +799,26 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, bb_
               return false;
             }
 
+          if (rhs_code == COND_EXPR)
+            {
+              tree cond_expr = gimple_assign_rhs1 (stmt);
+
+              if (i == 0)
+                first_cond_code = TREE_CODE (cond_expr);
+              else if (first_cond_code != TREE_CODE (cond_expr))
+                {
+                  if (vect_print_dump_info (REPORT_SLP))
+                    {
+                      fprintf (vect_dump, "Build SLP failed: different"
+                                          " operation");
+                      print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM);
+                    }
+
+                  vect_free_oprnd_info (&oprnds_info, true);
+                  return false;
+                }
+            }
+
           /* Find the def-stmts. */
           if (!vect_get_and_check_slp_defs (loop_vinfo, bb_vinfo, *node, stmt,
                                             ncopies_for_cost, (i == 0),
@@ -1402,7 +1471,12 @@ vect_analyze_slp_instance (loop_vec_info loop_vinf
       /* Collect the stores and store them in SLP_TREE_SCALAR_STMTS. */
       while (next)
         {
-          VEC_safe_push (gimple, heap, scalar_stmts, next);
+          if (STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (next))
+              && STMT_VINFO_RELATED_STMT (vinfo_for_stmt (next)))
+            VEC_safe_push (gimple, heap, scalar_stmts,
+                        STMT_VINFO_RELATED_STMT (vinfo_for_stmt (next)));
+          else
+            VEC_safe_push (gimple, heap, scalar_stmts, next);
           next = GROUP_NEXT_ELEMENT (vinfo_for_stmt (next));
         }
     }
@@ -1411,7 +1485,7 @@ vect_analyze_slp_instance (loop_vec_info loop_vinf
       /* Collect reduction statements. */
       VEC (gimple, heap) *reductions = LOOP_VINFO_REDUCTIONS (loop_vinfo);
       for (i = 0; VEC_iterate (gimple, reductions, i, next); i++)
-        VEC_safe_push (gimple, heap, scalar_stmts, next);
+        VEC_safe_push (gimple, heap, scalar_stmts, next);
     }
 
   node = vect_create_new_slp_node (scalar_stmts);
@@ -2150,15 +2224,15 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
 
      For example, we have two scalar operands, s1 and s2 (e.g., group of
      strided accesses of size two), while NUNITS is four (i.e., four scalars
-     of this type can be packed in a vector). The output vector will contain
-     two copies of each scalar operand: {s1, s2, s1, s2}. (NUMBER_OF_COPIES
+     of this type can be packed in a vector). The output vector will contain
+     two copies of each scalar operand: {s1, s2, s1, s2}. (NUMBER_OF_COPIES
      will be 2).
 
      If GROUP_SIZE > NUNITS, the scalars will be split into several vectors
      containing the operands.
 
      For example, NUNITS is four as before, and the group size is 8
-     (s1, s2, ..., s8). We will create two vectors {s1, s2, s3, s4} and
+     (s1, s2, ..., s8). We will create two vectors {s1, s2, s3, s4} and
      {s5, s6, s7, s8}. */
 
   number_of_copies = least_common_multiple (nunits, group_size) / group_size;
@@ -2170,8 +2244,23 @@ vect_get_constant_vectors (tree op, slp_tree slp_n
         {
           if (is_store)
             op = gimple_assign_rhs1 (stmt);
-          else
+          else if (gimple_assign_rhs_code (stmt) != COND_EXPR)
             op = gimple_op (stmt, op_num + 1);
+          else
+            {
+              if (op_num == 0 || op_num == 1)
+                {
+                  tree cond = gimple_assign_rhs1 (stmt);
+                  op = TREE_OPERAND (cond, op_num);
+                }
+              else
+                {
+                  if (op_num == 2)
+                    op = gimple_assign_rhs2 (stmt);
+                  else
+                    op = gimple_assign_rhs3 (stmt);
+                }
+            }
 
           if (reduc_index != -1)
             {
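
Note on operand numbering: the COND_EXPR handling in vect_get_constant_vectors above treats a conditional assignment as having four SLP operands, where operands 0 and 1 are the two sides of the comparison and operands 2 and 3 are the then and else values. The sketch below restates that mapping outside of GCC. It is only an illustration: the toy_* types and functions are hypothetical stand-ins and are not the real gimple or tree API.

```c
#include <assert.h>

/* Hypothetical stand-in for the comparison held in rhs1 of a COND_EXPR.  */
struct toy_compare
{
  int lhs;
  int rhs;
};

/* Hypothetical stand-in for "x = cond ? then_value : else_value".  */
struct toy_cond_assign
{
  struct toy_compare cond;  /* plays the role of gimple_assign_rhs1 */
  int then_value;           /* plays the role of gimple_assign_rhs2 */
  int else_value;           /* plays the role of gimple_assign_rhs3 */
};

/* Return SLP operand OP_NUM (0..3) of STMT, following the same case
   split as the patch: 0 and 1 come from the comparison, 2 is the then
   value, and anything else is the else value.  */
int
toy_cond_operand (const struct toy_cond_assign *stmt, int op_num)
{
  if (op_num == 0 || op_num == 1)
    return op_num == 0 ? stmt->cond.lhs : stmt->cond.rhs;
  else if (op_num == 2)
    return stmt->then_value;
  else
    return stmt->else_value;
}

/* Small self-check that walks all four operand positions.  */
void
toy_cond_operand_check (void)
{
  struct toy_cond_assign s = { { 10, 20 }, 30, 40 };

  assert (toy_cond_operand (&s, 0) == 10);
  assert (toy_cond_operand (&s, 1) == 20);
  assert (toy_cond_operand (&s, 2) == 30);
  assert (toy_cond_operand (&s, 3) == 40);
}
```

This is also why the patch increments nops (and number_of_oprnds) by one for COND_EXPR: the statement has three right-hand-side operands in GIMPLE, but the comparison contributes two SLP operands.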