From patchwork Mon Aug 11 02:35:15 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhenqiang Chen X-Patchwork-Id: 35196 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-yk0-f197.google.com (mail-yk0-f197.google.com [209.85.160.197]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 5A581205D8 for ; Mon, 11 Aug 2014 02:35:43 +0000 (UTC) Received: by mail-yk0-f197.google.com with SMTP id 142sf25057548ykq.8 for ; Sun, 10 Aug 2014 19:35:43 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:mailing-list:precedence:list-id :list-unsubscribe:list-archive:list-post:list-help:sender :delivered-to:mime-version:in-reply-to:references:date:message-id :subject:from:to:cc:x-original-sender :x-original-authentication-results:content-type; bh=OWqmmASlyTO6B+A63O/YUjg3gvv8gWhZrvMCfDMIH3w=; b=m7K+LOIgDabiK/4quKyAiep2Cez0AGLw081mV1Dy5bmDHc0oLwlM7E3zjJDI9ZolyD P8bOoSzdIZ2CzKIET9QYaO+pS8k8HtfaGdX96/h1S7hTUbaqLgnPSYO3tkm6QV8sxfFK WLEGf/9WNUr3jUMS+CBXv463Zr+0p2Xt2WmlTudpeUJOl/9SqPUOj+JRjGVjm2u1bsXi rF4kcEqlhn1qd+d0OD2gDthSJ2F2qj75QiryopsmitDyFZyH6j0OFRS2xyiNpCQwNDet NeDrymf13qwPbF9xxk8kWg0gQCywQKHNzjaQ4yHib1O+xAO0A2YMAtGMMHV8B2Nb/vxx kYuw== X-Gm-Message-State: ALoCoQld+4XM7BlXp3H7T8C2DyOMDWNUjH3hMUyL9IxkLbMBrhW309TwKKzrCWJjC1xjPH3NSAfu X-Received: by 10.52.186.103 with SMTP id fj7mr1800772vdc.8.1407724542990; Sun, 10 Aug 2014 19:35:42 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.140.21.84 with SMTP id 78ls1277973qgk.79.gmail; Sun, 10 Aug 2014 19:35:42 -0700 (PDT) X-Received: by 10.52.32.230 with SMTP id m6mr1121635vdi.74.1407724542908; Sun, 10 Aug 2014 19:35:42 -0700 (PDT) Received: from mail-vc0-x22e.google.com (mail-vc0-x22e.google.com [2607:f8b0:400c:c03::22e]) by mx.google.com with ESMTPS id cf1si5820320vdc.84.2014.08.10.19.35.42 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sun, 10 Aug 2014 19:35:42 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2607:f8b0:400c:c03::22e as permitted sender) client-ip=2607:f8b0:400c:c03::22e; Received: by mail-vc0-f174.google.com with SMTP id la4so11021787vcb.33 for ; Sun, 10 Aug 2014 19:35:42 -0700 (PDT) X-Received: by 10.52.129.165 with SMTP id nx5mr17071230vdb.25.1407724542703; Sun, 10 Aug 2014 19:35:42 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.221.37.5 with SMTP id tc5csp130984vcb; Sun, 10 Aug 2014 19:35:41 -0700 (PDT) X-Received: by 10.68.225.105 with SMTP id rj9mr39502614pbc.108.1407724541121; Sun, 10 Aug 2014 19:35:41 -0700 (PDT) Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id pu2si12210381pbb.53.2014.08.10.19.35.40 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 10 Aug 2014 19:35:41 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-return-374770-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Received: (qmail 10230 invoked by alias); 11 Aug 2014 02:35:27 -0000 Mailing-List: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 10203 invoked by uid 89); 11 Aug 2014 02:35:21 -0000 X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.5 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-lb0-f174.google.com Received: from mail-lb0-f174.google.com (HELO mail-lb0-f174.google.com) (209.85.217.174) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Mon, 11 Aug 2014 02:35:19 +0000 Received: by mail-lb0-f174.google.com with SMTP id c11so5373590lbj.19 for ; Sun, 10 Aug 2014 19:35:16 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.112.40.161 with SMTP id y1mr34549045lbk.61.1407724516089; Sun, 10 Aug 2014 19:35:16 -0700 (PDT) Received: by 10.112.140.169 with HTTP; Sun, 10 Aug 2014 19:35:15 -0700 (PDT) In-Reply-To: References: Date: Mon, 11 Aug 2014 10:35:15 +0800 Message-ID: Subject: Re: [PATCH, ARM] Keep constants in register when expanding From: Zhenqiang Chen To: Ramana Radhakrishnan Cc: "gcc-patches@gcc.gnu.org" X-IsSubscribed: yes X-Original-Sender: zhenqiang.chen@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2607:f8b0:400c:c03::22e as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org; dkim=pass header.i=@gcc.gnu.org X-Google-Group-Id: 836684582541 On 8 August 2014 23:22, Ramana Radhakrishnan wrote: > On Tue, Aug 5, 2014 at 10:31 AM, Zhenqiang Chen > wrote: >> Hi, >> >> For some large constants, ARM will split them during expanding, which >> makes impossible to hoist them out the loop or shared by different >> references (refer the test case in the patch). >> >> The patch keeps some constants in registers. If the constant can not >> be optimized, the cprop and combine passes can optimize them as what >> we do in current expand pass with >> >> define_insn_and_split "*arm_subsi3_insn" >> define_insn_and_split "*arm_andsi3_insn" >> define_insn_and_split "*iorsi3_insn" >> define_insn_and_split "*arm_xorsi3" >> >> The patch does not modify addsi3 since the define_insn_and_split >> "*arm_addsi3" is only valid when (reload_completed || >> !arm_eliminable_register (operands[1])). The cprop and combine passes >> can not optimize the large constant if we put it in register, which >> will lead to regression. >> >> For logic operators, the patch skips changes for constants: >> >> INTVAL (operands[2]) < 0 && const_ok_for_arm (-INTVAL (operands[2]) >> >> since expand pass always uses "sign-extend" to get the value >> (trunc_int_for_mode called from immed_wide_int_const) for rtl, and >> logs show most negative values are UNSIGNED when they are TREE node. >> And combine pass is smart enough to recover the negative value to >> positive value. > > I am unable to verify any change in code generation for this testcase > with and without the patch when I had a play with the patch. > > what gives ? Thanks for trying the patch. Do you add option -fno-gcse which is mentioned in dg-options " -O2 -fno-gcse "? Without it, there is no change for the testcase since cprop pass will propagate the constant to AND expr (A patch to enhance cprop pass was discussed at https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01321.html). In addition, if the constant can not be hoisted out the loop, later combine pass can also optimize it. These (cprop and combine) are reasons why the patch itself has little impact on current tests. Test case in the patch is updated since it does not work for armv7-m. Thanks! -Zhenqiang > Ramana > > >> >> Bootstrap and no make check regression on Chrome book. >> For coremark, dhrystone and eembcv1, no any code size and performance >> change on Cortex-M4. >> No any file in CSiBE has code size change for Cortex-A15 and Cortex-M4. >> No Spec2000 performance regression on Chrome book and dumped assemble >> codes only show very few difference. >> >> OK for trunk? >> >> Thanks! >> -Zhenqiang >> >> ChangeLog: >> 2014-08-05 Zhenqiang Chen >> >> * config/arm/arm.md (subsi3, andsi3, iorsi3, xorsi3): Keep some >> large constants in register other than split them. >> >> testsuite/ChangeLog: >> 2014-08-05 Zhenqiang Chen >> >> * gcc.target/arm/maskdata.c: New test. diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index bd8ea8f..c8b3001 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -1162,9 +1162,16 @@ { if (TARGET_32BIT) { - arm_split_constant (MINUS, SImode, NULL_RTX, - INTVAL (operands[1]), operands[0], - operands[2], optimize && can_create_pseudo_p ()); + if (!const_ok_for_arm (INTVAL (operands[1])) + && can_create_pseudo_p ()) + { + operands[1] = force_reg (SImode, operands[1]); + emit_insn (gen_subsi3 (operands[0], operands[1], operands[2])); + } + else + arm_split_constant (MINUS, SImode, NULL_RTX, + INTVAL (operands[1]), operands[0], operands[2], + optimize && can_create_pseudo_p ()); DONE; } else /* TARGET_THUMB1 */ @@ -2077,6 +2084,17 @@ emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0], operands[1])); } + else if (!(const_ok_for_arm (INTVAL (operands[2])) + || const_ok_for_arm (~INTVAL (operands[2])) + /* zero_extendhi instruction is efficient enough. */ + || INTVAL (operands[2]) == 0xffff + || (INTVAL (operands[2]) < 0 + && const_ok_for_arm (-INTVAL (operands[2])))) + && can_create_pseudo_p ()) + { + operands[2] = force_reg (SImode, operands[2]); + emit_insn (gen_andsi3 (operands[0], operands[1], operands[2])); + } else arm_split_constant (AND, SImode, NULL_RTX, INTVAL (operands[2]), operands[0], @@ -2882,9 +2900,20 @@ { if (TARGET_32BIT) { - arm_split_constant (IOR, SImode, NULL_RTX, - INTVAL (operands[2]), operands[0], operands[1], - optimize && can_create_pseudo_p ()); + if (!(const_ok_for_arm (INTVAL (operands[2])) + || (TARGET_THUMB2 + && const_ok_for_arm (~INTVAL (operands[2]))) + || (INTVAL (operands[2]) < 0 + && const_ok_for_arm (-INTVAL (operands[2])))) + && can_create_pseudo_p ()) + { + operands[2] = force_reg (SImode, operands[2]); + emit_insn (gen_iorsi3 (operands[0], operands[1], operands[2])); + } + else + arm_split_constant (IOR, SImode, NULL_RTX, + INTVAL (operands[2]), operands[0], operands[1], + optimize && can_create_pseudo_p ()); DONE; } else /* TARGET_THUMB1 */ @@ -3052,9 +3081,18 @@ { if (TARGET_32BIT) { - arm_split_constant (XOR, SImode, NULL_RTX, - INTVAL (operands[2]), operands[0], operands[1], - optimize && can_create_pseudo_p ()); + if (!(const_ok_for_arm (INTVAL (operands[2])) + || (INTVAL (operands[2]) < 0 + && const_ok_for_arm (-INTVAL (operands[2])))) + && can_create_pseudo_p ()) + { + operands[2] = force_reg (SImode, operands[2]); + emit_insn (gen_xorsi3 (operands[0], operands[1], operands[2])); + } + else + arm_split_constant (XOR, SImode, NULL_RTX, + INTVAL (operands[2]), operands[0], operands[1], + optimize && can_create_pseudo_p ()); DONE; } else /* TARGET_THUMB1 */ diff --git a/gcc/testsuite/gcc.target/arm/maskdata.c b/gcc/testsuite/gcc.target/arm/maskdata.c new file mode 100644 index 0000000..6960c75 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/maskdata.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options " -O2 -fno-gcse " } */ +/* { dg-require-effective-target arm_thumb2_ok } */ + +#define MASK 0xfe00ff +void maskdata (int * data, int len) +{ + int i = len; + for (; i > 0; i -= 2) + { + data[i] &= MASK; + data[i + 1] &= MASK; + } +} +/* { dg-final { scan-assembler-not "65536" } } */