From patchwork Tue Sep 22 13:36:24 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrylo Tkachov
X-Patchwork-Id: 53997
Message-ID: <56015958.3090009@arm.com>
Date: Tue, 22 Sep 2015 14:36:24 +0100
From: Kyrill Tkachov
To: GCC Patches
CC: Richard Biener
Subject: [RFC] PR tree-optimization/67628: Make tree ifcombine more symmetric and interactions with dom

Hi all,

I'm looking into improving usage of aarch64 conditional compare
instructions and PR 67628 is one area of improvement I identified.
The problem there is different tree-level behaviour for the expressions:

  (a > b && b <= c) && c > d

vs.

  a > b && (b <= c && c > d)

The second variant generates worse code for aarch64 (and x86_64) because
tree ifcombine doesn't do as good a job at merging basic blocks.
I've tracked this down to the function ifcombine_ifandif in
tree-ssa-ifcombine.c.

With this patch I get the good codegen from that PR: both forms are
combined into a single basic block and the conditional compare expansion
code in ccmp.c does a good job at generating the ccmp instructions.
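
For readers who want to reproduce the comparison, the two forms can be written as a pair of C functions. This is only a minimal sketch: the function names form_left and form_right are hypothetical and are not taken from the PR, which may use a different testcase.

```c
#include <stdbool.h>

/* Hypothetical names: only the two expressions come from the message.  */

/* Form 1: (a > b && b <= c) && c > d  */
bool
form_left (int a, int b, int c, int d)
{
  return (a > b && b <= c) && c > d;
}

/* Form 2: a > b && (b <= c && c > d)  */
bool
form_right (int a, int b, int c, int d)
{
  return a > b && (b <= c && c > d);
}
```

Both functions compute the same value for every input, so any difference in the generated code comes from how the compiler groups the comparisons rather than from the semantics.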

I also see about 2% more ccmp instructions being generated for SPEC2006,
with the code generally looking better as well. (If I remove the
single-conditional restriction completely I get 19% more conditional
compares in SPEC2006, but that's something I want to investigate
separately.)

The idea is that for the two expressions given above the control flow
and the contents of the basic blocks are the same, but when
ifcombine_ifandif is called the inner_cond_bb and outer_cond_bb
arguments are reversed. For the first expression inner_cond_bb contains
the single comparison c > d and outer_cond_bb contains the multiple
comparisons a > b && b <= c. In the second case the outer block contains
the single comparison a > b and the inner block this time contains the
multiple comparisons b <= c && c > d, thus failing the
"Only do this optimization if the inner bb contains only the conditional"
check. The patch makes this condition more symmetrical by rejecting the
optimisation only if neither the inner nor the outer condition block is
simple.

Unfortunately, I see a testsuite regression with this patch:

FAIL: gcc.dg/pr66299-2.c scan-tree-dump-not optimized "<<"

The reduced part of that test is:

void
test1 (int x, unsigned u)
{
  if ((1U << x) != 64
      || (2 << x) != u
      || (x << x) != 384
      || (3 << x) == 9
      || (x << 14) != 98304U
      || (1 << x) == 14
      || (3 << 2) != 12)
    __builtin_abort ();
}

The patched ifcombine pass works more or less as expected and produces
fewer basic blocks.
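
As an aside, the change to the check described above can be modelled outside of GCC. The sketch below is an assumption-laden illustration, not GCC code: struct cond_block and both function names are hypothetical, and the boolean field stands in for the result of gsi_one_before_end_p on the block's first non-debug statement.

```c
#include <stdbool.h>

/* Hypothetical model of a condition block.  The field records whether the
   block contains nothing but its final conditional.  */
struct cond_block
{
  bool only_conditional;
};

/* Check before the patch: give up unless the inner block is simple.
   The outer block is not consulted at all.  */
bool
rejects_before_patch (struct cond_block inner, struct cond_block outer)
{
  (void) outer;
  return !inner.only_conditional;
}

/* Check after the patch: give up only if neither block is simple.  */
bool
rejects_after_patch (struct cond_block inner, struct cond_block outer)
{
  return !inner.only_conditional && !outer.only_conditional;
}
```

With a multi-statement inner block and a simple outer block (the second expression form), the first function rejects the combination and the second one accepts it.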

Before this patch a relevant part of the ifcombine dump for test1 is:

;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot
  if (x_1(D) != 6)
    goto ;
  else
    goto ;

;; basic block 3, loop depth 0, count 0, freq 9996, maybe hot
  _2 = 2 << x_1(D);
  _3 = (unsigned intD.10) _2;
  if (_3 != u_4(D))
    goto ;
  else
    goto ;

After this patch it is:

;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot
  _2 = 2 << x_1(D);
  _3 = (unsigned intD.10) _2;
  _9 = _3 != u_4(D);
  _10 = x_1(D) != 6;
  _11 = _9 | _10;
  if (_11 != 0)
    goto ;
  else
    goto ;

The second form ends up generating worse codegen however, and the
badness starts with the dom1 pass. In the unpatched case it manages to
deduce that x must be 6 by the time it reaches basic block 3 and uses
that information to eliminate the shift in "_2 = 2 << x_1(D)" from
basic block 3. In the patched case it is unable to make that call, I
think because the x != 6 condition is IORed with another test.

I'm not familiar with the internals of the dom pass, so I'm not sure
where to go looking for a fix for this. Is the ifcombine change a step
in the right direction? If so, what would need to be done to fix the
issue with the dom pass?

I suppose what we want is to not combine basic blocks if the sequence
and conditions of the basic blocks are such that dom can potentially
exploit them, but how do we express that?

Thanks,
Kyrill

P.S. I can provide more complete tree dumps for the examples if needed.

2015-09-22  Kyrylo Tkachov

    PR tree-optimization/67628
    * tree-ssa-ifcombine.c (ifcombine_ifandif): Allow optimization
    when either the inner or outer block contain a single conditional.

diff --git a/gcc/tree-ssa-ifcombine.c b/gcc/tree-ssa-ifcombine.c
index 9f04174..bfe17ba 100644
--- a/gcc/tree-ssa-ifcombine.c
+++ b/gcc/tree-ssa-ifcombine.c
@@ -511,8 +511,12 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool inner_inv,
       gimple_stmt_iterator gsi;
       if (!LOGICAL_OP_NON_SHORT_CIRCUIT)
         return false;
-      /* Only do this optimization if the inner bb contains only the conditional. */
-      if (!gsi_one_before_end_p (gsi_start_nondebug_after_labels_bb (inner_cond_bb)))
+      /* Only do this optimization if inner or outer bb contains
+         only the conditional. */
+      if (!gsi_one_before_end_p (gsi_start_nondebug_after_labels_bb (
+                                                                     inner_cond_bb))
+          && !gsi_one_before_end_p (gsi_start_nondebug_after_labels_bb (
+                                                                        outer_cond_bb)))
         return false;
       t1 = fold_build2_loc (gimple_location (inner_cond),
                             inner_cond_code,
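
To make the dom interaction described in the message easier to see, here is a source-level analogue of the two GIMPLE shapes for the first two tests of test1. It is a hand-written sketch under stated assumptions, not compiler output: the function names are hypothetical, each function returns true where test1 would call __builtin_abort, and callers are assumed to pass 0 <= x <= 29 so that the shift is well defined in both versions.

```c
#include <stdbool.h>

/* Shape before the patch: two separate tests.  The shift is only reached
   on the path where x == 6, so a pass that tracks that fact can replace
   2 << x with the constant 128.  */
bool
aborts_branchy (int x, unsigned u)
{
  if (x != 6)
    return true;
  /* On this path x is known to be 6.  */
  return (unsigned) (2 << x) != u;
}

/* Shape after the patch: both tests live in one block and are joined
   with a bitwise OR, so the shift is evaluated for every x and nothing
   is known about x at that point.  Assumes 0 <= x <= 29.  */
bool
aborts_combined (int x, unsigned u)
{
  bool shift_differs = (unsigned) (2 << x) != u;
  bool x_differs = x != 6;
  return shift_differs | x_differs;
}
```

The two functions agree on every valid input; the difference is only in what an optimizer can deduce about x at the point where the shift is computed.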