From patchwork Tue Aug 5 09:31:22 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhenqiang Chen
X-Patchwork-Id: 34901
Date: Tue, 5 Aug 2014 17:31:22 +0800
Subject: [PATCH, ARM] Keep constants in register when expanding
From: Zhenqiang Chen
To: "gcc-patches@gcc.gnu.org"
X-Original-Sender: zhenqiang.chen@linaro.org

Hi,

For some large constants, ARM splits them during expand, which makes it
impossible to hoist them out of a loop or to share them between
different references (see the test case in the patch).

The patch keeps some constants in registers. If keeping the constant in
a register brings no benefit, the cprop and combine passes can fold it
back into the instruction, giving the same result as the current expand
pass, via

  define_insn_and_split "*arm_subsi3_insn"
  define_insn_and_split "*arm_andsi3_insn"
  define_insn_and_split "*iorsi3_insn"
  define_insn_and_split "*arm_xorsi3"

The patch does not modify addsi3, since the define_insn_and_split
"*arm_addsi3" is only valid when

  (reload_completed || !arm_eliminable_register (operands[1]))

In that case the cprop and combine passes cannot fold the large constant
back once it is in a register, which would lead to regressions.

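For readers less familiar with which constants count as "large" here: in
ARM (A32) state a data-processing immediate is an 8-bit value rotated
right by an even amount. Below is a minimal sketch of that rule only. The
name a32_immediate_ok is hypothetical and not part of GCC; the real
const_ok_for_arm also handles Thumb-2 encodings, which this sketch
ignores.

```c
#include <stdint.h>

/* Hypothetical stand-in for the A32 part of GCC's const_ok_for_arm:
   return 1 if V is an 8-bit value rotated right by an even amount,
   0 otherwise.  Not the GCC implementation.  */
static int
a32_immediate_ok (uint32_t v)
{
  for (unsigned rot = 0; rot < 32; rot += 2)
    {
      /* Rotating V left by ROT undoes a right-rotation by ROT.
         ROT == 0 is handled separately to avoid a shift by 32.  */
      uint32_t undone = rot ? (v << rot) | (v >> (32 - rot)) : v;
      if (undone <= 0xff)
        return 1;
    }
  return 0;
}
```

Under this rule 0xff and 0xfe0000 are each encodable, but the test case's
MASK value 0xfe00ff is not, so it has to be either split into several
instructions or loaded into a register.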
For logic operators, the patch leaves constants unchanged when

  INTVAL (operands[2]) < 0 && const_ok_for_arm (-INTVAL (operands[2]))

because the expand pass always sign-extends the value for RTL
(trunc_int_for_mode, called from immed_wide_int_const), and logs show
that most of these negative values are UNSIGNED when they are TREE
nodes. The combine pass is smart enough to turn the negative value back
into a positive one.

Bootstrapped on a Chromebook with no make check regressions. For
coremark, dhrystone and eembcv1 there is no code size or performance
change on Cortex-M4. No file in CSiBE changes code size for Cortex-A15
or Cortex-M4. There is no Spec2000 performance regression on the
Chromebook, and the dumped assembly shows only a few differences.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-08-05  Zhenqiang Chen

        * config/arm/arm.md (subsi3, andsi3, iorsi3, xorsi3): Keep some
        large constants in register rather than splitting them.

testsuite/ChangeLog:
2014-08-05  Zhenqiang Chen

        * gcc.target/arm/maskdata.c: New test.

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index bd8ea8f..c8b3001 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1162,9 +1162,16 @@
   {
   if (TARGET_32BIT)
     {
-      arm_split_constant (MINUS, SImode, NULL_RTX,
-                          INTVAL (operands[1]), operands[0],
-                          operands[2], optimize && can_create_pseudo_p ());
+      if (!const_ok_for_arm (INTVAL (operands[1]))
+          && can_create_pseudo_p ())
+        {
+          operands[1] = force_reg (SImode, operands[1]);
+          emit_insn (gen_subsi3 (operands[0], operands[1], operands[2]));
+        }
+      else
+        arm_split_constant (MINUS, SImode, NULL_RTX,
+                            INTVAL (operands[1]), operands[0], operands[2],
+                            optimize && can_create_pseudo_p ());
       DONE;
     }
   else /* TARGET_THUMB1 */
@@ -2077,6 +2084,17 @@
           emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
                                                      operands[1]));
         }
+      else if (!(const_ok_for_arm (INTVAL (operands[2]))
+                 || const_ok_for_arm (~INTVAL (operands[2]))
+                 /* zero_extendhi instruction is efficient enough.  */
+                 || INTVAL (operands[2]) == 0xffff
+                 || (INTVAL (operands[2]) < 0
+                     && const_ok_for_arm (-INTVAL (operands[2]))))
+               && can_create_pseudo_p ())
+        {
+          operands[2] = force_reg (SImode, operands[2]);
+          emit_insn (gen_andsi3 (operands[0], operands[1], operands[2]));
+        }
       else
         arm_split_constant (AND, SImode, NULL_RTX,
                             INTVAL (operands[2]), operands[0],
@@ -2882,9 +2900,20 @@
   {
   if (TARGET_32BIT)
     {
-      arm_split_constant (IOR, SImode, NULL_RTX,
-                          INTVAL (operands[2]), operands[0], operands[1],
-                          optimize && can_create_pseudo_p ());
+      if (!(const_ok_for_arm (INTVAL (operands[2]))
+            || (TARGET_THUMB2
+                && const_ok_for_arm (~INTVAL (operands[2])))
+            || (INTVAL (operands[2]) < 0
+                && const_ok_for_arm (-INTVAL (operands[2]))))
+          && can_create_pseudo_p ())
+        {
+          operands[2] = force_reg (SImode, operands[2]);
+          emit_insn (gen_iorsi3 (operands[0], operands[1], operands[2]));
+        }
+      else
+        arm_split_constant (IOR, SImode, NULL_RTX,
+                            INTVAL (operands[2]), operands[0], operands[1],
+                            optimize && can_create_pseudo_p ());
       DONE;
     }
   else /* TARGET_THUMB1 */
@@ -3052,9 +3081,18 @@
   {
   if (TARGET_32BIT)
     {
-      arm_split_constant (XOR, SImode, NULL_RTX,
-                          INTVAL (operands[2]), operands[0], operands[1],
-                          optimize && can_create_pseudo_p ());
+      if (!(const_ok_for_arm (INTVAL (operands[2]))
+            || (INTVAL (operands[2]) < 0
+                && const_ok_for_arm (-INTVAL (operands[2]))))
+          && can_create_pseudo_p ())
+        {
+          operands[2] = force_reg (SImode, operands[2]);
+          emit_insn (gen_xorsi3 (operands[0], operands[1], operands[2]));
+        }
+      else
+        arm_split_constant (XOR, SImode, NULL_RTX,
+                            INTVAL (operands[2]), operands[0], operands[1],
+                            optimize && can_create_pseudo_p ());
       DONE;
     }
   else /* TARGET_THUMB1 */
diff --git a/gcc/testsuite/gcc.target/arm/maskdata.c b/gcc/testsuite/gcc.target/arm/maskdata.c
new file mode 100644
index 0000000..b4231a4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/maskdata.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options " -O2 -fno-gcse " } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+
+#define MASK 0xfe00ff
+void maskdata (int * data, int len)
+{
+  int i = len;
+  for (; i > 0; i -= 2)
+    {
+      data[i] &= MASK;
+      data[i + 1] &= MASK;
+    }
+}
+/* { dg-final { scan-assembler "254" } } */
+/* { dg-final { scan-assembler "255" } } */
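The exclusion for negative constants described in this message can be
illustrated with a small sketch. It assumes a 32-bit SImode constant and
reuses the A32-only immediate rule (8-bit value rotated right by an even
amount); the helper names simode_value, a32_immediate_ok and
negated_immediate_ok are hypothetical and do not exist in GCC.

```c
#include <stdint.h>

/* Hypothetical: the sign-extended value RTL would hold for a 32-bit
   constant with bit pattern BITS.  */
static int64_t
simode_value (uint32_t bits)
{
  return bits >= 0x80000000u ? (int64_t) bits - 4294967296LL
                             : (int64_t) bits;
}

/* Hypothetical A32-only immediate check, not GCC's const_ok_for_arm.  */
static int
a32_immediate_ok (uint32_t v)
{
  for (unsigned rot = 0; rot < 32; rot += 2)
    {
      uint32_t undone = rot ? (v << rot) | (v >> (32 - rot)) : v;
      if (undone <= 0xff)
        return 1;
    }
  return 0;
}

/* Mirrors the shape of the condition
   INTVAL (x) < 0 && const_ok_for_arm (-INTVAL (x)).  */
static int
negated_immediate_ok (uint32_t bits)
{
  int64_t value = simode_value (bits);
  return value < 0 && a32_immediate_ok ((uint32_t) -value);
}
```

For example, the unsigned pattern 0xffffff00 is held as -256, and 256 is
encodable, so such a constant would be left to the existing split path
under this sketch; 0xfe00ff is positive and is not affected by this
exclusion.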