From patchwork Fri Nov 6 14:19:10 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyrylo Tkachov X-Patchwork-Id: 56130 Delivered-To: patch@linaro.org Received: by 10.112.61.134 with SMTP id p6csp1039986lbr; Fri, 6 Nov 2015 06:19:30 -0800 (PST) X-Received: by 10.66.221.71 with SMTP id qc7mr18047866pac.82.1446819569946; Fri, 06 Nov 2015 06:19:29 -0800 (PST) Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id xk9si484628pab.38.2015.11.06.06.19.29 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 06 Nov 2015 06:19:29 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-return-412929-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-return-412929-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-412929-patch=linaro.org@gcc.gnu.org; dkim=pass header.i=@gcc.gnu.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; q=dns; s=default; b=uMrJ9aM6a6Sct+45K Zz6y9NVf8YWURfG3qWg/amsjXe1w7h7E8xXhqsSkuGmmWLjS+RL65Qotedx5ZVEB UMjkzLdKGV+xPilRTTHn7DhP0jA6VTIMkNEbDGTo1FbpIQ5sLRNxMyCUj+0aQcL5 8MF2iUyP5C/RfI4XIEImd7zG4U= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type; s=default; bh=wJg/7yaUuNrlfbntuE7UoXE IJpg=; b=t2YSfqYFIOuP0t1zLGcQGoutdiVfg5A8R28JPAGOODDDW5i0URZSOLn URL/7z+7Ypv+9STgplyPWC42UiYpAd7C4WhNivXsNlA1EoE5vnJwRXCyTrsYy7Z5 oQRtZvdmSJG7efrwlE2b7ptqT4OFltIzvSbFJs0VZHbGsf3EZ7JM= Received: (qmail 45478 invoked by alias); 6 Nov 2015 14:19:18 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 45466 invoked by uid 89); 6 Nov 2015 14:19:17 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL, BAYES_00, SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (146.101.78.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 06 Nov 2015 14:19:16 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-23-MLY_SRSuSdmEkt2edR6ErA-1; Fri, 06 Nov 2015 14:19:10 +0000 Received: from [10.2.206.200] ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Fri, 6 Nov 2015 14:19:10 +0000 Message-ID: <563CB6DE.7070106@arm.com> Date: Fri, 06 Nov 2015 14:19:10 +0000 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: Segher Boessenkool CC: gcc Patches Subject: Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults References: <56376FFF.3070008@arm.com> <20151104235015.GA13203@gate.crashing.org> <563B4516.5090001@arm.com> <20151106005636.GA31412@gate.crashing.org> In-Reply-To: <20151106005636.GA31412@gate.crashing.org> X-MC-Unique: MLY_SRSuSdmEkt2edR6ErA-1 X-IsSubscribed: yes On 06/11/15 00:56, Segher Boessenkool wrote: > On Thu, Nov 05, 2015 at 12:01:26PM +0000, Kyrill Tkachov wrote: >> Thanks, that looks less intrusive. I did try it out on arm and aarch64. >> It does work on the aarch64 testcase. However, there's also a correctness >> regression, I'll try to explain inline.... >>> diff --git a/gcc/combine.c b/gcc/combine.c >>> index c3db2e0..3bf7ffb 100644 >>> --- a/gcc/combine.c >>> +++ b/gcc/combine.c >>> @@ -5284,6 +5284,15 @@ subst (rtx x, rtx from, rtx to, int in_dest, int >>> in_cond, int unique_copy) >>> || GET_CODE (SET_DEST (x)) == PC)) >>> fmt = "ie"; >>> >>> + /* Substituting into the operands of a widening MULT is not likely >>> + to create RTL matching a machine insn. */ >>> + if (code == MULT >>> + && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND >>> + || GET_CODE (XEXP (x, 0)) == SIGN_EXTEND) >>> + && (GET_CODE (XEXP (x, 1)) == ZERO_EXTEND >>> + || GET_CODE (XEXP (x, 1)) == SIGN_EXTEND)) >>> + return x; >> I think we should also add: >> && REG_P (XEXP (XEXP (x, 0), 0)) >> && REG_P (XEXP (XEXP (x, 1), 0)) >> >> to the condition. Otherwise I've seen regressions in the arm testsuite, the >> gcc.target/arm/smlatb-1.s test in particular that tries to match the pattern >> (define_insn "*maddhisi4tb" >> [(set (match_operand:SI 0 "s_register_operand" "=r") >> (plus:SI (mult:SI (ashiftrt:SI >> (match_operand:SI 1 "s_register_operand" "r") >> (const_int 16)) >> (sign_extend:SI >> (match_operand:HI 2 "s_register_operand" "r"))) >> (match_operand:SI 3 "s_register_operand" "r")))] >> >> >> There we have a sign_extend of a shift that we want to convert to the form >> that the pattern expects. So adding the checks for REG_P fixes that for me. > I'll have a look at this; I thought it should be handled with the new patch > (attached), but maybe not. > >> For the correctness issue I saw on aarch64 the shortest case I could reduce >> is: >> short int a[16], b[16]; >> void >> f5 (void) >> { >> a[0] = b[0] / 14; >> } > (Without -mcpu=cortex-a53, or you just get a divide insn). > >> Is there a way that subst can signal some kind of "failed to substitute" >> result? > Yep, see new patch. The "from == to" condition is for when subst is called > just to simplify some code (normally with pc_rtx, pc_rtx). Indeed, this looks better but it still needs the REG_P checks for the inner operands of the extends to not screw up the arm case. Here's my proposed extension. I've also modified the testcase slightly to not use an unitialised variable. It still demonstrates the issue we're trying to solve. Bootstrapped and tested on arm and aarch64. I'll let you put it through it's paces on your setup :) P.S. Do we want to restrict this to targets that have a widening mul optab like I did in the original patch? Thanks, Kyrill 2015-11-06 Segher Boessenkool Kyrylo Tkachov * combine.c (subst): Don't substitute or simplify when handling register-wise widening multiply. (force_to_mode): Likewise. 2015-11-06 Kyrylo Tkachov * gcc.target/aarch64/umaddl_combine_1.c: New test. >> If not, I got it to work by using that check to set the in_dest variable to >> the >> subsequent recursive call to subst, in a similar way to my original patch, >> but I was >> hoping to avoid overloading the meaning of in_dest. > Yes me too :-) > > > Segher > > > diff --git a/gcc/combine.c b/gcc/combine.c > index c3db2e0..c806db9 100644 > --- a/gcc/combine.c > +++ b/gcc/combine.c > @@ -5284,6 +5284,20 @@ subst (rtx x, rtx from, rtx to, int in_dest, int in_cond, int unique_copy) > || GET_CODE (SET_DEST (x)) == PC)) > fmt = "ie"; > > + /* Substituting into the operands of a widening MULT is not likely > + to create RTL matching a machine insn. */ > + if (code == MULT > + && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND > + || GET_CODE (XEXP (x, 0)) == SIGN_EXTEND) > + && (GET_CODE (XEXP (x, 1)) == ZERO_EXTEND > + || GET_CODE (XEXP (x, 1)) == SIGN_EXTEND)) > + { > + if (from == to) > + return x; > + else > + return gen_rtx_CLOBBER (GET_MODE (x), const0_rtx); > + } > + > /* Get the mode of operand 0 in case X is now a SIGN_EXTEND of a > constant. */ > if (fmt[0] == 'e') > @@ -8455,6 +8469,15 @@ force_to_mode (rtx x, machine_mode mode, unsigned HOST_WIDE_INT mask, > /* ... fall through ... */ > > case MULT: > + /* Substituting into the operands of a widening MULT is not likely to > + create RTL matching a machine insn. */ > + if (code == MULT > + && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND > + || GET_CODE (XEXP (x, 0)) == SIGN_EXTEND) > + && (GET_CODE (XEXP (x, 1)) == ZERO_EXTEND > + || GET_CODE (XEXP (x, 1)) == SIGN_EXTEND)) > + return gen_lowpart_or_truncate (mode, x); > + > /* For PLUS, MINUS and MULT, we need any bits less significant than the > most significant bit in MASK since carries from those bits will > affect the bits we are interested in. */ commit 24d4a7f9f32e9290e95f940c1ac57fc9be26687c Author: Kyrylo Tkachov Date: Fri Oct 30 13:56:10 2015 +0000 [combine] Don't transform sign and zero extends inside mults diff --git a/gcc/combine.c b/gcc/combine.c index fa1a93f..26b14bf 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -5286,6 +5286,22 @@ subst (rtx x, rtx from, rtx to, int in_dest, int in_cond, int unique_copy) || GET_CODE (SET_DEST (x)) == PC)) fmt = "ie"; + /* Substituting into the operands of a widening MULT is not likely + to create RTL matching a machine insn. */ + if (code == MULT + && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND + || GET_CODE (XEXP (x, 0)) == SIGN_EXTEND) + && (GET_CODE (XEXP (x, 1)) == ZERO_EXTEND + || GET_CODE (XEXP (x, 1)) == SIGN_EXTEND) + && REG_P (XEXP (XEXP (x, 0), 0)) + && REG_P (XEXP (XEXP (x, 1), 0))) + { + if (from == to) + return x; + else + return gen_rtx_CLOBBER (GET_MODE (x), const0_rtx); + } + /* Get the mode of operand 0 in case X is now a SIGN_EXTEND of a constant. */ if (fmt[0] == 'e') @@ -8447,6 +8463,17 @@ force_to_mode (rtx x, machine_mode mode, unsigned HOST_WIDE_INT mask, /* ... fall through ... */ case MULT: + /* Substituting into the operands of a widening MULT is not likely to + create RTL matching a machine insn. */ + if (code == MULT + && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND + || GET_CODE (XEXP (x, 0)) == SIGN_EXTEND) + && (GET_CODE (XEXP (x, 1)) == ZERO_EXTEND + || GET_CODE (XEXP (x, 1)) == SIGN_EXTEND) + && REG_P (XEXP (XEXP (x, 0), 0)) + && REG_P (XEXP (XEXP (x, 1), 0))) + return gen_lowpart_or_truncate (mode, x); + /* For PLUS, MINUS and MULT, we need any bits less significant than the most significant bit in MASK since carries from those bits will affect the bits we are interested in. */ diff --git a/gcc/testsuite/gcc.target/aarch64/umaddl_combine_1.c b/gcc/testsuite/gcc.target/aarch64/umaddl_combine_1.c new file mode 100644 index 0000000..430277f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/umaddl_combine_1.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=cortex-a53" } */ + +enum reg_class +{ + NO_REGS, + AD_REGS, + ALL_REGS, LIM_REG_CLASSES +}; + +extern enum reg_class + reg_class_subclasses[((int) LIM_REG_CLASSES)][((int) LIM_REG_CLASSES)]; + +void +init_reg_sets_1 (unsigned int i) +{ + unsigned int j; + { + for (j = i + 1; j < ((int) LIM_REG_CLASSES); j++) + { + enum reg_class *p; + p = ®_class_subclasses[j][0]; + while (*p != LIM_REG_CLASSES) + p++; + } + } +} + +/* { dg-final { scan-assembler-not "umull\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } } */