From patchwork Thu Aug 4 06:36:18 2016
X-Patchwork-Submitter: Prathamesh Kulkarni
X-Patchwork-Id: 73239
From: Prathamesh Kulkarni
Date: Thu, 4 Aug 2016 12:06:18 +0530
Subject: [RFC] ipa bitwise constant propagation
To: Richard Biener, Jan Hubicka, Martin Jambor, Kugan Vivekanandarajah, gcc Patches

Hi,

This is a prototype patch for propagating known/unknown bits inter-procedurally for integral types, based on the information obtained from get_nonzero_bits ().

The patch required making the following changes:

a) To make the info from get_nonzero_bits () available to IPA, I had to remove the guard !nonzero_p in ccp_finalize. However, that triggered the following ICE in get_ptr_info () for default_none.f95 (and several other Fortran tests) with the options -fopenacc -O2:
ICE: http://pastebin.com/KjD7HMQi
I confirmed with Richard that this is a latent issue.

b) I chose widest_int for representing value and mask in ipcp_bits_lattice, and correspondingly changed the declarations of bit_value_unop_1/bit_value_binop_1 to take precision and sign instead of a type (those are the only two fields that were used). Both functions are now exported via tree-ssa-ccp.h. I hope that's OK?

c) Changed streamer_read_wi/streamer_write_wi to non-static. Ah, I see Kugan has submitted a patch for this, so I will drop this hunk.

d) We have the following in tree-ssa-ccp.c:get_default_value ():

  if (flag_tree_bit_ccp)
    {
      wide_int nonzero_bits = get_nonzero_bits (var);
      if (nonzero_bits != -1)
        {
          val.lattice_val = CONSTANT;
          val.value = build_zero_cst (TREE_TYPE (var));
          val.mask = extend_mask (nonzero_bits);
        }
    }

extend_mask () sets all upper bits of the mask to 1, i.e., varying in terms of bit-CCP. I suppose that within tree-ssa-ccp we need to extend the mask when var is a parameter, since we don't know in advance what values it will receive from different callers, so marking all upper bits as 1 is the safe thing to do. However, with IPA we can determine exactly which bits of a parameter are constant, so setting all upper bits to 1 becomes unnecessary. For example, consider the following artificial test-case:

int f(int x)
{
  if (x > 300)
    return 1;
  else
    return 2;
}

int main(int argc, char **argv)
{
  return f (argc & 0xc) + f (argc & 0x3);
}

For x, the mask is the meet of <0, 0xc> and <0, 0x3>:
  (0x3 | 0xc) | (0 ^ 0) == 0xf
and ipcp_update_bits () sets nonzero_bits for x to 0xf (a small standalone sketch of this meet rule appears further below). However, get_default_value () then calls extend_mask (0xf), resulting in all upper bits being set to 1, and consequently the condition if (x > 300) doesn't get folded. To resolve this, I added a new flag "set_by_ipa" to decl_common, which is set to true if the mask of a parameter was determined by ipa-cp, and the condition changes to:

  if (SSA_NAME_VAR (var)
      && TREE_CODE (SSA_NAME_VAR (var)) == PARM_DECL
      && DECL_SET_BY_IPA (SSA_NAME_VAR (var)))
    val.mask = widest_int::from (nonzero_bits,
                                 TYPE_SIGN (TREE_TYPE (SSA_NAME_VAR (var))));
  else
    val.mask = extend_mask (nonzero_bits);

I am not sure whether adding a new flag to decl_common is a good idea. How do other IPA passes deal with this or similar issues?

I suppose we would want to gate this on some flag, say -fipa-bit-cp? I haven't gated it on the flag yet; I will do so in the next version of the patch.

I have added some very simple test-cases and will try to add more meaningful ones.
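As an aside, here is a tiny self-contained illustration of the <value, mask> meet rule used above. The struct and function names are made up for this example and are not part of the patch:

#include <stdint.h>
#include <stdio.h>

/* A <value, mask> pair in the bit-CCP sense: bits that are 0 in mask are
   known to equal the corresponding bit of value; bits that are 1 in mask
   are unknown.  */
struct bits
{
  uint64_t value;
  uint64_t mask;
};

/* Meet of two lattice values: a bit stays known only if it is known in
   both operands and the two values agree on it, which is exactly
   mask = (m1 | m2) | (v1 ^ v2).  */
static struct bits
bits_meet (struct bits a, struct bits b)
{
  struct bits r;
  r.mask = (a.mask | b.mask) | (a.value ^ b.value);
  r.value = a.value & ~r.mask;
  return r;
}

int
main (void)
{
  /* <0, 0xc> meet <0, 0x3>, i.e. argc & 0xc meeting argc & 0x3.  */
  struct bits x = { 0, 0xc }, y = { 0, 0x3 };
  struct bits m = bits_meet (x, y);
  printf ("mask = 0x%llx\n", (unsigned long long) m.mask);
  return 0;
}

Running it prints mask = 0xf, which matches what ipcp_update_bits () stores for x in the example above.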
Patch passes bootstrap+test on x86_64-unknown-linux-gnu and cross-tested on arm*-*-* and aarch64*-*-* with the exception of some fortran tests failing due to above ICE. As next steps, I am planning to extend it to handle alignment propagation, and do further testing (lto-bootstrap, chromium). I would be grateful for feedback on the current patch. Thanks, Prathamesh diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 5b6cb9a..b770f6a 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -120,6 +120,7 @@ along with GCC; see the file COPYING3. If not see #include "params.h" #include "ipa-inline.h" #include "ipa-utils.h" +#include "tree-ssa-ccp.h" template class ipcp_value; @@ -266,6 +267,40 @@ private: bool meet_with_1 (unsigned new_align, unsigned new_misalign); }; +/* Lattice of known bits, only capable of holding one value. + Similar to ccp_prop_value_t, mask represents which bits of value are constant. + If a bit in mask is set to 0, then the corresponding bit in + value is known to be constant. */ + +class ipcp_bits_lattice +{ +public: + bool bottom_p () { return lattice_val == IPA_BITS_VARYING; } + bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; } + bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; } + bool set_to_bottom (); + bool set_to_constant (widest_int, widest_int, signop, unsigned); + + widest_int get_value () { return value; } + widest_int get_mask () { return mask; } + signop get_sign () { return sgn; } + unsigned get_precision () { return precision; } + + bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree); + bool meet_with (widest_int, widest_int, signop, unsigned); + + void print (FILE *); + +private: + enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val; + widest_int value, mask; + signop sgn; + unsigned precision; + + bool meet_with_1 (widest_int, widest_int); + void get_value_and_mask (tree, widest_int *, widest_int *); +}; + /* Structure containing lattices for a parameter itself and for pieces of aggregates that are passed in the parameter or by a reference in a parameter plus some other useful flags. */ @@ -281,6 +316,8 @@ public: ipcp_agg_lattice *aggs; /* Lattice describing known alignment. */ ipcp_alignment_lattice alignment; + /* Lattice describing known bits. */ + ipcp_bits_lattice bits_lattice; /* Number of aggregate lattices */ int aggs_count; /* True if aggregate data were passed by reference (as opposed to by @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f) fprintf (f, " Alignment %u, misalignment %u\n", align, misalign); } +void +ipcp_bits_lattice::print (FILE *f) +{ + if (top_p ()) + fprintf (f, " Bits unknown (TOP)\n"); + else if (bottom_p ()) + fprintf (f, " Bits unusable (BOTTOM)\n"); + else + { + fprintf (f, " Bits: value = "); print_hex (get_value (), f); + fprintf (f, ", mask = "); print_hex (get_mask (), f); + fprintf (f, "\n"); + } +} + /* Print all ipcp_lattices of all functions to F. */ static void @@ -484,6 +536,7 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits) fprintf (f, " ctxs: "); plats->ctxlat.print (f, dump_sources, dump_benefits); plats->alignment.print (f); + plats->bits_lattice.print (f); if (plats->virt_call) fprintf (f, " virt_call flag set\n"); @@ -911,6 +964,161 @@ ipcp_alignment_lattice::meet_with (const ipcp_alignment_lattice &other, return meet_with_1 (other.align, adjusted_misalign); } +/* Set lattice value to bottom, if it already isn't the case. 
*/ + +bool +ipcp_bits_lattice::set_to_bottom () +{ + if (bottom_p ()) + return false; + lattice_val = IPA_BITS_VARYING; + value = 0; + mask = -1; + return true; +} + +/* Set to constant if it isn't already. Only meant to be called + when switching state from TOP. */ + +bool +ipcp_bits_lattice::set_to_constant (widest_int value, widest_int mask, + signop sgn, unsigned precision) +{ + gcc_assert (top_p ()); + this->lattice_val = IPA_BITS_CONSTANT; + this->value = value; + this->mask = mask; + this->sgn = sgn; + this->precision = precision; + return true; +} + +/* Convert operand to value, mask form. */ + +void +ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp) +{ + wide_int get_nonzero_bits (const_tree); + + if (TREE_CODE (operand) == INTEGER_CST) + { + *valuep = wi::to_widest (operand); + *maskp = 0; + } + else if (TREE_CODE (operand) == SSA_NAME) + { + *valuep = 0; + *maskp = widest_int::from (get_nonzero_bits (operand), UNSIGNED); + } + else + gcc_unreachable (); +} + +/* Meet operation, similar to ccp_lattice_meet, we xor values + if this->value, value have different values at same bit positions, we want + to drop that bit to varying. Return true if mask is changed. + This function assumes that the lattice value is in CONSTANT state */ + +bool +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask) +{ + gcc_assert (constant_p ()); + + widest_int old_mask = this->mask; + this->mask = (this->mask | mask) | (this->value ^ value); + + if (wi::sext (this->mask, this->precision) == -1) + return set_to_bottom (); + + bool changed = this->mask != old_mask; + return changed; +} + +/* Meet the bits lattice with operand + described by ctxlat.set_contains_variable (); ret |= set_agg_lats_contain_variable (plats); ret |= plats->alignment.set_to_bottom (); + ret |= plats->bits_lattice.set_to_bottom (); return ret; } @@ -1003,6 +1212,7 @@ initialize_node_lattices (struct cgraph_node *node) plats->ctxlat.set_to_bottom (); set_agg_lats_to_bottom (plats); plats->alignment.set_to_bottom (); + plats->bits_lattice.set_to_bottom (); } else set_all_contains_variable (plats); @@ -1621,6 +1831,57 @@ propagate_alignment_accross_jump_function (cgraph_edge *cs, } } +/* Propagate bits across jfunc that is associated with + edge cs and update dest_lattice accordingly. */ + +bool +propagate_bits_accross_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc, + ipcp_bits_lattice *dest_lattice) +{ + if (dest_lattice->bottom_p ()) + return false; + + if (jfunc->type == IPA_JF_PASS_THROUGH) + { + struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller); + enum tree_code code = ipa_get_jf_pass_through_operation (jfunc); + tree operand = NULL_TREE; + + if (code != NOP_EXPR) + operand = ipa_get_jf_pass_through_operand (jfunc); + + int src_idx = ipa_get_jf_pass_through_formal_id (jfunc); + struct ipcp_param_lattices *src_lats + = ipa_get_parm_lattices (caller_info, src_idx); + + /* Try to proapgate bits if src_lattice is bottom, but jfunc is known. + for eg consider: + int f(int x) + { + g (x & 0xff); + } + Assume lattice for x is bottom, however we can still propagate + result of x & 0xff == 0xff, which gets computed during ccp1 pass + and we store it in jump function during analysis stage. 
*/ + + if (src_lats->bits_lattice.bottom_p () + && jfunc->bits.known) + return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask, + jfunc->bits.sgn, jfunc->bits.precision); + else + return dest_lattice->meet_with (src_lats->bits_lattice, code, operand); + } + + else if (jfunc->type == IPA_JF_ANCESTOR) + return dest_lattice->set_to_bottom (); + + else if (jfunc->bits.known) + return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask, + jfunc->bits.sgn, jfunc->bits.precision); + else + return dest_lattice->set_to_bottom (); +} + /* If DEST_PLATS already has aggregate items, check that aggs_by_ref matches NEW_AGGS_BY_REF and if not, mark all aggs as bottoms and return true (in all other cases, return false). If there are no aggregate items, set @@ -1968,6 +2229,8 @@ propagate_constants_accross_call (struct cgraph_edge *cs) &dest_plats->ctxlat); ret |= propagate_alignment_accross_jump_function (cs, jump_func, &dest_plats->alignment); + ret |= propagate_bits_accross_jump_function (cs, jump_func, + &dest_plats->bits_lattice); ret |= propagate_aggs_accross_jump_function (cs, jump_func, dest_plats); } @@ -4592,6 +4855,74 @@ ipcp_store_alignment_results (void) } } +/* Look up all the bits information that we have discovered and copy it over + to the transformation summary. */ + +static void +ipcp_store_bits_results (void) +{ + cgraph_node *node; + + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) + { + ipa_node_params *info = IPA_NODE_REF (node); + bool dumped_sth = false; + bool found_useful_result = false; + + if (info->ipcp_orig_node) + info = IPA_NODE_REF (info->ipcp_orig_node); + + unsigned count = ipa_get_param_count (info); + for (unsigned i = 0; i < count; i++) + { + ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i); + if (plats->bits_lattice.constant_p ()) + { + found_useful_result = true; + break; + } + } + + if (!found_useful_result) + continue; + + ipcp_grow_transformations_if_necessary (); + ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node); + vec_safe_reserve_exact (ts->bits, count); + + for (unsigned i = 0; i < count; i++) + { + ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i); + ipa_bits bits_jfunc; + + if (plats->bits_lattice.constant_p ()) + { + bits_jfunc.known = true; + bits_jfunc.value = plats->bits_lattice.get_value (); + bits_jfunc.mask = plats->bits_lattice.get_mask (); + bits_jfunc.sgn = plats->bits_lattice.get_sign (); + bits_jfunc.precision = plats->bits_lattice.get_precision (); + } + else + bits_jfunc.known = false; + + ts->bits->quick_push (bits_jfunc); + if (!dump_file || !bits_jfunc.known) + continue; + if (!dumped_sth) + { + fprintf (dump_file, "Propagated bits info for function %s/%i:\n", + node->name (), node->order); + dumped_sth = true; + } + fprintf (dump_file, " param %i: value = ", i); + print_hex (bits_jfunc.value, dump_file); + fprintf (dump_file, ", mask = "); + print_hex (bits_jfunc.mask, dump_file); + fprintf (dump_file, "\n"); + } + } +} /* The IPCP driver. */ static unsigned int @@ -4625,6 +4956,8 @@ ipcp_driver (void) ipcp_decision_stage (&topo); /* Store results of alignment propagation. */ ipcp_store_alignment_results (); + /* Store results of bits propagation. */ + ipcp_store_bits_results (); /* Free all IPCP structures. 
*/ free_toporder_info (&topo); diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c index 132b622..0913cc5 100644 --- a/gcc/ipa-prop.c +++ b/gcc/ipa-prop.c @@ -302,6 +302,15 @@ ipa_print_node_jump_functions_for_edge (FILE *f, struct cgraph_edge *cs) } else fprintf (f, " Unknown alignment\n"); + + if (jump_func->bits.known) + { + fprintf (f, " value: "); print_hex (jump_func->bits.value, f); + fprintf (f, ", mask: "); print_hex (jump_func->bits.mask, f); + fprintf (f, "\n"); + } + else + fprintf (f, " Unknown bits\n"); } } @@ -381,6 +390,7 @@ ipa_set_jf_unknown (struct ipa_jump_func *jfunc) { jfunc->type = IPA_JF_UNKNOWN; jfunc->alignment.known = false; + jfunc->bits.known = false; } /* Set JFUNC to be a copy of another jmp (to be used by jump function @@ -1674,6 +1684,27 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi, else gcc_assert (!jfunc->alignment.known); + if (INTEGRAL_TYPE_P (TREE_TYPE (arg)) + && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST)) + { + jfunc->bits.known = true; + jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg)); + jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg)); + + if (TREE_CODE (arg) == SSA_NAME) + { + jfunc->bits.value = 0; + jfunc->bits.mask = widest_int::from (get_nonzero_bits (arg), UNSIGNED); + } + else + { + jfunc->bits.value = wi::to_widest (arg); + jfunc->bits.mask = 0; + } + } + else + gcc_assert (!jfunc->bits.known); + if (is_gimple_ip_invariant (arg) || (TREE_CODE (arg) == VAR_DECL && is_global_var (arg) @@ -3690,6 +3721,18 @@ ipa_node_params_t::duplicate(cgraph_node *src, cgraph_node *dst, for (unsigned i = 0; i < src_alignments->length (); ++i) dst_alignments->quick_push ((*src_alignments)[i]); } + + if (src_trans && vec_safe_length (src_trans->bits) > 0) + { + ipcp_grow_transformations_if_necessary (); + src_trans = ipcp_get_transformation_summary (src); + const vec *src_bits = src_trans->bits; + vec *&dst_bits + = ipcp_get_transformation_summary (dst)->bits; + vec_safe_reserve_exact (dst_bits, src_bits->length ()); + for (unsigned i = 0; i < src_bits->length (); ++i) + dst_bits->quick_push ((*src_bits)[i]); + } } /* Register our cgraph hooks if they are not already there. */ @@ -4609,6 +4652,17 @@ ipa_write_jump_function (struct output_block *ob, streamer_write_uhwi (ob, jump_func->alignment.align); streamer_write_uhwi (ob, jump_func->alignment.misalign); } + + bp = bitpack_create (ob->main_stream); + bp_pack_value (&bp, jump_func->bits.known, 1); + streamer_write_bitpack (&bp); + if (jump_func->bits.known) + { + streamer_write_wi (ob, jump_func->bits.value); + streamer_write_wi (ob, jump_func->bits.mask); + streamer_write_enum (ob->main_stream, signop, UNSIGNED + 1, jump_func->bits.sgn); + streamer_write_uhwi (ob, jump_func->bits.precision); + } } /* Read in jump function JUMP_FUNC from IB. 
*/ @@ -4685,6 +4739,19 @@ ipa_read_jump_function (struct lto_input_block *ib, } else jump_func->alignment.known = false; + + bp = streamer_read_bitpack (ib); + bool bits_known = bp_unpack_value (&bp, 1); + if (bits_known) + { + jump_func->bits.known = true; + jump_func->bits.value = streamer_read_wi (ib); + jump_func->bits.mask = streamer_read_wi (ib); + jump_func->bits.sgn = streamer_read_enum (ib, signop, UNSIGNED + 1); + jump_func->bits.precision = streamer_read_uhwi (ib); + } + else + jump_func->bits.known = false; } /* Stream out parts of cgraph_indirect_call_info corresponding to CS that are @@ -5050,6 +5117,31 @@ write_ipcp_transformation_info (output_block *ob, cgraph_node *node) } else streamer_write_uhwi (ob, 0); + + ts = ipcp_get_transformation_summary (node); + if (ts && vec_safe_length (ts->bits) > 0) + { + count = ts->bits->length (); + streamer_write_uhwi (ob, count); + + for (unsigned i = 0; i < count; ++i) + { + const ipa_bits& bits_jfunc = (*ts->bits)[i]; + struct bitpack_d bp = bitpack_create (ob->main_stream); + bp_pack_value (&bp, bits_jfunc.known, 1); + streamer_write_bitpack (&bp); + if (bits_jfunc.known) + { + streamer_write_wi (ob, bits_jfunc.value); + streamer_write_wi (ob, bits_jfunc.mask); + streamer_write_enum (ob->main_stream, signop, + UNSIGNED + 1, bits_jfunc.sgn); + streamer_write_uhwi (ob, bits_jfunc.precision); + } + } + } + else + streamer_write_uhwi (ob, 0); } /* Stream in the aggregate value replacement chain for NODE from IB. */ @@ -5102,6 +5194,28 @@ read_ipcp_transformation_info (lto_input_block *ib, cgraph_node *node, } } } + + count = streamer_read_uhwi (ib); + if (count > 0) + { + ipcp_grow_transformations_if_necessary (); + ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node); + vec_safe_grow_cleared (ts->bits, count); + + for (i = 0; i < count; i++) + { + ipa_bits& bits_jfunc = (*ts->bits)[i]; + struct bitpack_d bp = streamer_read_bitpack (ib); + bits_jfunc.known = bp_unpack_value (&bp, 1); + if (bits_jfunc.known) + { + bits_jfunc.value = streamer_read_wi (ib); + bits_jfunc.mask = streamer_read_wi (ib); + bits_jfunc.sgn = streamer_read_enum (ib, signop, UNSIGNED + 1); + bits_jfunc.precision = streamer_read_uhwi (ib); + } + } + } } /* Write all aggregate replacement for nodes in set. */ @@ -5404,6 +5518,55 @@ ipcp_update_alignments (struct cgraph_node *node) } } +/* Update bits info of formal parameters as described in + ipcp_transformation_summary. 
*/ + +static void +ipcp_update_bits (struct cgraph_node *node) +{ + tree parm = DECL_ARGUMENTS (node->decl); + tree next_parm = parm; + ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node); + + if (!ts || vec_safe_length (ts->bits) == 0) + return; + + vec &bits = *ts->bits; + unsigned count = bits.length (); + + for (unsigned i = 0; i < count; ++i, parm = next_parm) + { + if (node->clone.combined_args_to_skip + && bitmap_bit_p (node->clone.combined_args_to_skip, i)) + continue; + + gcc_checking_assert (parm); + next_parm = DECL_CHAIN (parm); + + if (!bits[i].known + || !INTEGRAL_TYPE_P (TREE_TYPE (parm)) + || !is_gimple_reg (parm)) + continue; + + tree ddef = ssa_default_def (DECL_STRUCT_FUNCTION (node->decl), parm); + if (!ddef) + continue; + + if (dump_file) + { + fprintf (dump_file, "Adjusting mask for param %u to ", i); + print_hex (bits[i].mask, dump_file); + fprintf (dump_file, "\n"); + } + + unsigned prec = TYPE_PRECISION (TREE_TYPE (ddef)); + wide_int nonzero_bits = wide_int::from (bits[i].mask, prec, UNSIGNED) + | wide_int::from (bits[i].value, prec, bits[i].sgn); + set_nonzero_bits (ddef, nonzero_bits); + DECL_SET_BY_IPA (parm) = 1; + } +} + /* IPCP transformation phase doing propagation of aggregate values. */ unsigned int @@ -5423,6 +5586,7 @@ ipcp_transform_function (struct cgraph_node *node) node->name (), node->order); ipcp_update_alignments (node); + ipcp_update_bits (node); aggval = ipa_get_agg_replacements_for_node (node); if (!aggval) return 0; diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h index e32d078..d69a071 100644 --- a/gcc/ipa-prop.h +++ b/gcc/ipa-prop.h @@ -154,6 +154,16 @@ struct GTY(()) ipa_alignment unsigned misalign; }; +/* Information about zero/non-zero bits. */ +struct GTY(()) ipa_bits +{ + bool known; + widest_int value; + widest_int mask; + enum signop sgn; + unsigned precision; +}; + /* A jump function for a callsite represents the values passed as actual arguments of the callsite. See enum jump_func_type for the various types of jump functions supported. */ @@ -166,6 +176,9 @@ struct GTY (()) ipa_jump_func /* Information about alignment of pointers. */ struct ipa_alignment alignment; + /* Information about zero/non-zero bits. */ + struct ipa_bits bits; + enum jump_func_type type; /* Represents a value of a jump function. pass_through is used only in jump function context. constant represents the actual constant in constant jump @@ -482,6 +495,8 @@ struct GTY(()) ipcp_transformation_summary ipa_agg_replacement_value *agg_values; /* Alignment information for pointers. */ vec *alignments; + /* Known bits information. */ + vec *bits; }; void ipa_set_node_agg_value_chain (struct cgraph_node *node, diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c index 1d56d21..01462e2 100644 --- a/gcc/lto-streamer-in.c +++ b/gcc/lto-streamer-in.c @@ -712,7 +712,7 @@ make_new_block (struct function *fn, unsigned int index) /* Read a wide-int. */ -static widest_int +widest_int streamer_read_wi (struct lto_input_block *ib) { HOST_WIDE_INT a[WIDE_INT_MAX_ELTS]; diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c index aa6b589..8fbd882 100644 --- a/gcc/lto-streamer-out.c +++ b/gcc/lto-streamer-out.c @@ -1830,7 +1830,7 @@ output_ssa_names (struct output_block *ob, struct function *fn) /* Output a wide-int. 
*/ -static void +void streamer_write_wi (struct output_block *ob, const widest_int &w) { diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h index ecc1e5d..4da89d0 100644 --- a/gcc/lto-streamer.h +++ b/gcc/lto-streamer.h @@ -1225,4 +1225,7 @@ DEFINE_DECL_STREAM_FUNCS (TYPE_DECL, type_decl) DEFINE_DECL_STREAM_FUNCS (NAMESPACE_DECL, namespace_decl) DEFINE_DECL_STREAM_FUNCS (LABEL_DECL, label_decl) +widest_int streamer_read_wi (struct lto_input_block *); +void streamer_write_wi (struct output_block *, const widest_int &); + #endif /* GCC_LTO_STREAMER_H */ diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-1.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-1.c new file mode 100644 index 0000000..2389d9f --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-1.c @@ -0,0 +1,33 @@ +/* Propagate 0xff from main to f3 to f2. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp -fdump-tree-optimized" } */ + +int pass_test(void); +int fail_test(void); + +__attribute__((noinline, noclone)) +static int f2(int x) +{ + if (x > 300) + return fail_test(); + else + return pass_test(); +} + +__attribute__((noinline, noclone)) +static int f3(int y) +{ + int k = f2(y); + return k; +} + +int main(int argc) +{ + int k = argc & 0xff; + int a = f3(k); + return a; +} + +/* { dg-final { scan-ipa-dump-times "Adjusting mask for param 0 to 0xff" 2 "cp" } } */ +/* { dg-final { scan-tree-dump-not "fail_test" "optimized" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-2.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-2.c new file mode 100644 index 0000000..8704cce --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-2.c @@ -0,0 +1,37 @@ +/* x's mask should be meet(0xc, 0x3) == 0xf */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */ + +__attribute__((noinline)) +static int f1(int x) +{ + if (x > 300) + return 1; + else + return 2; +} + +__attribute__((noinline)) +static int f2(int y) +{ + return f1(y & 0x03); +} + +__attribute__((noinline)) +static int f3(int z) +{ + return f1(z & 0xc); +} + +extern int a; +extern int b; + +int main(void) +{ + int k = f2(a); + int l = f3(b); + return k + l; +} + +/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0xf" "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-3.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-3.c new file mode 100644 index 0000000..1d0a2ab --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-3.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */ + +__attribute__((noinline)) +static int f(int x) +{ + int f2(int); + + if (x > 300) + { + int z = f(x + 1); + return f2 (z); + } + else + return 2; +} + +int main(int argc, char **argv) +{ + int k = f(argc & 0xff); + return k; +} + +/* { dg-final { scan-ipa-dump-not "Adjusting mask for" "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-4.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-4.c new file mode 100644 index 0000000..135cde9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-4.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */ + +__attribute__((noinline)) +static int f(int x) +{ + if (x > 300) + return 1; + else + return 2; +} + +int main(void) +{ + int a = f(1); + int b = f(2); + int c = f(4); + return a + b + c; +} + +/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0x7" "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-5.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-5.c new file mode 100644 index 0000000..731f307 --- 
/dev/null +++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-5.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */ + +__attribute__((noinline)) +static int f1(int x) +{ + if (x > 20) + return 1; + else + return 2; +} + +__attribute__((noinline)) +static int f2(int y) +{ + return f1 (y & 0x3); +} + +int main(int argc, char **argv) +{ + int z = f2 (argc & 0xff); + int k = f1 (argc & 0xc); + return z + k; +} + +/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0xf" "cp" } } */ diff --git a/gcc/tree-core.h b/gcc/tree-core.h index 6e8595c..9369cf7 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -1558,7 +1558,8 @@ struct GTY(()) tree_decl_common { /* DECL_ALIGN. It should have the same size as TYPE_ALIGN. */ unsigned int align : 6; - /* 20 bits unused. */ + unsigned set_by_ipa: 1; + /* 19 bits unused. */ /* UID for points-to sets, stable over copying from inlining. */ unsigned int pt_uid; diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c index ae120a8..c75e9ce 100644 --- a/gcc/tree-ssa-ccp.c +++ b/gcc/tree-ssa-ccp.c @@ -142,7 +142,7 @@ along with GCC; see the file COPYING3. If not see #include "cfgloop.h" #include "stor-layout.h" #include "optabs-query.h" - +#include "tree-ssa-ccp.h" /* Possible lattice values. */ typedef enum @@ -287,7 +287,11 @@ get_default_value (tree var) { val.lattice_val = CONSTANT; val.value = build_zero_cst (TREE_TYPE (var)); - val.mask = extend_mask (nonzero_bits); + if (SSA_NAME_VAR (var) && TREE_CODE (SSA_NAME_VAR (var)) == PARM_DECL + && DECL_SET_BY_IPA (SSA_NAME_VAR (var))) + val.mask = widest_int::from (nonzero_bits, TYPE_SIGN (TREE_TYPE (SSA_NAME_VAR (var)))); + else + val.mask = extend_mask (nonzero_bits); } } } @@ -537,9 +541,9 @@ set_lattice_value (tree var, ccp_prop_value_t *new_val) static ccp_prop_value_t get_value_for_expr (tree, bool); static ccp_prop_value_t bit_value_binop (enum tree_code, tree, tree, tree); -static void bit_value_binop_1 (enum tree_code, tree, widest_int *, widest_int *, - tree, const widest_int &, const widest_int &, - tree, const widest_int &, const widest_int &); +void bit_value_binop_1 (enum tree_code, signop, unsigned, widest_int *, widest_int *, + signop, unsigned, const widest_int &, const widest_int &, + signop, unsigned, const widest_int &, const widest_int &); /* Return a widest_int that can be used for bitwise simplifications from VAL. */ @@ -895,7 +899,7 @@ do_dbg_cnt (void) Return TRUE when something was optimized. */ static bool -ccp_finalize (bool nonzero_p) +ccp_finalize (bool nonzero_p ATTRIBUTE_UNUSED) { bool something_changed; unsigned i; @@ -913,10 +917,7 @@ ccp_finalize (bool nonzero_p) if (!name || (!POINTER_TYPE_P (TREE_TYPE (name)) - && (!INTEGRAL_TYPE_P (TREE_TYPE (name)) - /* Don't record nonzero bits before IPA to avoid - using too much memory. */ - || !nonzero_p))) + && (!INTEGRAL_TYPE_P (TREE_TYPE (name))))) continue; val = get_value (name); @@ -1225,10 +1226,11 @@ ccp_fold (gimple *stmt) RVAL and RMASK representing a value of type RTYPE and set the value, mask pair *VAL and *MASK to the result. 
*/ -static void -bit_value_unop_1 (enum tree_code code, tree type, +void +bit_value_unop_1 (enum tree_code code, signop type_sgn, unsigned type_precision, widest_int *val, widest_int *mask, - tree rtype, const widest_int &rval, const widest_int &rmask) + signop rtype_sgn, unsigned rtype_precision, + const widest_int &rval, const widest_int &rmask) { switch (code) { @@ -1241,25 +1243,23 @@ bit_value_unop_1 (enum tree_code code, tree type, { widest_int temv, temm; /* Return ~rval + 1. */ - bit_value_unop_1 (BIT_NOT_EXPR, type, &temv, &temm, type, rval, rmask); - bit_value_binop_1 (PLUS_EXPR, type, val, mask, - type, temv, temm, type, 1, 0); + bit_value_unop_1 (BIT_NOT_EXPR, type_sgn, type_precision, &temv, &temm, + type_sgn, type_precision, rval, rmask); + bit_value_binop_1 (PLUS_EXPR, type_sgn, type_precision, val, mask, + type_sgn, type_precision, temv, temm, + type_sgn, type_precision, 1, 0); break; } CASE_CONVERT: { - signop sgn; - /* First extend mask and value according to the original type. */ - sgn = TYPE_SIGN (rtype); - *mask = wi::ext (rmask, TYPE_PRECISION (rtype), sgn); - *val = wi::ext (rval, TYPE_PRECISION (rtype), sgn); + *mask = wi::ext (rmask, rtype_precision, rtype_sgn); + *val = wi::ext (rval, rtype_precision, rtype_sgn); /* Then extend mask and value according to the target type. */ - sgn = TYPE_SIGN (type); - *mask = wi::ext (*mask, TYPE_PRECISION (type), sgn); - *val = wi::ext (*val, TYPE_PRECISION (type), sgn); + *mask = wi::ext (*mask, type_precision, type_sgn); + *val = wi::ext (*val, type_precision, type_sgn); break; } @@ -1273,15 +1273,16 @@ bit_value_unop_1 (enum tree_code code, tree type, R1VAL, R1MASK and R2VAL, R2MASK representing a values of type R1TYPE and R2TYPE and set the value, mask pair *VAL and *MASK to the result. */ -static void -bit_value_binop_1 (enum tree_code code, tree type, +void +bit_value_binop_1 (enum tree_code code, signop type_sgn, unsigned type_precision, widest_int *val, widest_int *mask, - tree r1type, const widest_int &r1val, - const widest_int &r1mask, tree r2type, + signop r1type_sgn, unsigned r1type_precision, + const widest_int &r1val, const widest_int &r1mask, + signop r2type_sgn, unsigned r2type_precision, const widest_int &r2val, const widest_int &r2mask) { - signop sgn = TYPE_SIGN (type); - int width = TYPE_PRECISION (type); + signop sgn = type_sgn; + int width = (int) type_precision; bool swap_p = false; /* Assume we'll get a constant result. Use an initial non varying @@ -1407,11 +1408,11 @@ bit_value_binop_1 (enum tree_code code, tree type, case MINUS_EXPR: { widest_int temv, temm; - bit_value_unop_1 (NEGATE_EXPR, r2type, &temv, &temm, - r2type, r2val, r2mask); - bit_value_binop_1 (PLUS_EXPR, type, val, mask, - r1type, r1val, r1mask, - r2type, temv, temm); + bit_value_unop_1 (NEGATE_EXPR, r2type_sgn, r2type_precision, &temv, &temm, + r2type_sgn, r2type_precision, r2val, r2mask); + bit_value_binop_1 (PLUS_EXPR, type_sgn, type_precision, val, mask, + r1type_sgn, r1type_precision, r1val, r1mask, + r2type_sgn, r2type_precision, temv, temm); break; } @@ -1473,7 +1474,7 @@ bit_value_binop_1 (enum tree_code code, tree type, break; /* For comparisons the signedness is in the comparison operands. 
*/ - sgn = TYPE_SIGN (r1type); + sgn = r1type_sgn; /* If we know the most significant bits we know the values value ranges by means of treating varying bits as zero @@ -1526,8 +1527,9 @@ bit_value_unop (enum tree_code code, tree type, tree rhs) gcc_assert ((rval.lattice_val == CONSTANT && TREE_CODE (rval.value) == INTEGER_CST) || wi::sext (rval.mask, TYPE_PRECISION (TREE_TYPE (rhs))) == -1); - bit_value_unop_1 (code, type, &value, &mask, - TREE_TYPE (rhs), value_to_wide_int (rval), rval.mask); + bit_value_unop_1 (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask, + TYPE_SIGN (TREE_TYPE (rhs)), TYPE_PRECISION (TREE_TYPE (rhs)), + value_to_wide_int (rval), rval.mask); if (wi::sext (mask, TYPE_PRECISION (type)) != -1) { val.lattice_val = CONSTANT; @@ -1572,9 +1574,11 @@ bit_value_binop (enum tree_code code, tree type, tree rhs1, tree rhs2) && TREE_CODE (r2val.value) == INTEGER_CST) || wi::sext (r2val.mask, TYPE_PRECISION (TREE_TYPE (rhs2))) == -1); - bit_value_binop_1 (code, type, &value, &mask, - TREE_TYPE (rhs1), value_to_wide_int (r1val), r1val.mask, - TREE_TYPE (rhs2), value_to_wide_int (r2val), r2val.mask); + bit_value_binop_1 (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask, + TYPE_SIGN (TREE_TYPE (rhs1)), TYPE_PRECISION (TREE_TYPE (rhs1)), + value_to_wide_int (r1val), r1val.mask, + TYPE_SIGN (TREE_TYPE (rhs2)), TYPE_PRECISION (TREE_TYPE (rhs2)), + value_to_wide_int (r2val), r2val.mask); if (wi::sext (mask, TYPE_PRECISION (type)) != -1) { val.lattice_val = CONSTANT; @@ -1673,9 +1677,9 @@ bit_value_assume_aligned (gimple *stmt, tree attr, ccp_prop_value_t ptrval, align = build_int_cst_type (type, -aligni); alignval = get_value_for_expr (align, true); - bit_value_binop_1 (BIT_AND_EXPR, type, &value, &mask, - type, value_to_wide_int (ptrval), ptrval.mask, - type, value_to_wide_int (alignval), alignval.mask); + bit_value_binop_1 (BIT_AND_EXPR, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask, + TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (ptrval), ptrval.mask, + TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (alignval), alignval.mask); if (wi::sext (mask, TYPE_PRECISION (type)) != -1) { val.lattice_val = CONSTANT; diff --git a/gcc/tree-ssa-ccp.h b/gcc/tree-ssa-ccp.h new file mode 100644 index 0000000..b76a834 --- /dev/null +++ b/gcc/tree-ssa-ccp.h @@ -0,0 +1,30 @@ +/* Copyright (C) 2016-2016 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 3, or (at your option) any +later version. + +GCC is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. 
*/ + +#ifndef TREE_SSA_CCP_H +#define TREE_SSA_CCP_H + +void bit_value_binop_1 (enum tree_code, signop, unsigned, widest_int *, widest_int *, + signop, unsigned, const widest_int &, const widest_int &, + signop, unsigned, const widest_int &, const widest_int &); + +void bit_value_unop_1 (enum tree_code, signop, unsigned, widest_int *, widest_int *, + signop, unsigned, const widest_int &, const widest_int &); + + +#endif diff --git a/gcc/tree.h b/gcc/tree.h index fff65d6..e21b31d 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -2346,6 +2346,9 @@ extern machine_mode element_mode (const_tree t); #define DECL_IGNORED_P(NODE) \ (DECL_COMMON_CHECK (NODE)->decl_common.ignored_flag) +#define DECL_SET_BY_IPA(NODE) \ + (DECL_COMMON_CHECK (NODE)->decl_common.set_by_ipa) + /* Nonzero for a given ..._DECL node means that this node represents an "abstract instance" of the given declaration (e.g. in the original declaration of an inline function). When generating symbolic debugging