From patchwork Wed Feb 17 10:20:19 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 62097 Delivered-To: patch@linaro.org Received: by 10.112.43.199 with SMTP id y7csp2193018lbl; Wed, 17 Feb 2016 02:20:48 -0800 (PST) X-Received: by 10.98.12.8 with SMTP id u8mr977212pfi.36.1455704447954; Wed, 17 Feb 2016 02:20:47 -0800 (PST) Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id o12si1051043pfa.162.2016.02.17.02.20.47 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 17 Feb 2016 02:20:47 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-return-421598-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-return-421598-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-421598-patch=linaro.org@gcc.gnu.org; dkim=pass header.i=@gcc.gnu.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; q=dns; s=default; b=Bx74G7o8VzDmTw9ebA t/DMOjuswmmyrGk4i7lmix7BnzmCZkFhhJgWKht9IRO3e7gNhLIj0eQA2J5XR9aE SVVD06zGp0Zp0fXVVsEagaXT5zJHWvbSuadeARje89qoEDUvxStFBranIUz7EO1y +nDmx7HMt0P7FDVg7Jq7nTJVo= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; s=default; bh=voS6LOWG65L/HvohPn6PvufR 0Wo=; b=XmJEjV7peEVOtqrgMYqhTj2ziEpDfvfxCTKOeidpctfZUsGyymK/hujt 3Fn3IEx1FJNnEaVLmIhFkKgccOgVhuzEPbrLYIwgT5aIsXQPWvR8wsZ+aP+NDplA sfSwlGL0m1N60V0NR1IWe7t6s7bJL5/qiqemX3zyEfhbl7geFZ0= Received: 
(qmail 78876 invoked by alias); 17 Feb 2016 10:20:26 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 78863 invoked by uid 89); 17 Feb 2016 10:20:25 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy=costing, msg01714html, 201601, 2016-01 X-HELO: mail-qk0-f182.google.com Received: from mail-qk0-f182.google.com (HELO mail-qk0-f182.google.com) (209.85.220.182) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Wed, 17 Feb 2016 10:20:23 +0000 Received: by mail-qk0-f182.google.com with SMTP id x1so4010594qkc.1 for ; Wed, 17 Feb 2016 02:20:22 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=2YtqgrUO3XnXr1UhYelb8gnlFb+L7Fi1GTVZVRSk/eQ=; b=AhNQXs9i4r2Lqa3XUtNUWGQioznyT4j9MuKf+85CDI9BIgN2q3YeMPLpgFdf1Q9XyB laQ8h8+7QZd0DAFdHvy+WgJDm7j191SKaBvi+Pf4nIol6MjDblJW56u+Tws+Z4rNsHxm xC5idQNFdKtjlgHnEBgD4mmYWEjlJe68rbLyUD7/kTfZWfo7drRzPG70GIkdeN32Me5E 3cMNVi1mGw8F08blBeYUUe4EQFdNKQjI/rkFx0ZhPxib/o1Qq6VgkU1Ne1HO2FwVvNsU /72A3y9n3fzHzM7BMlqyOBtRL/wa80mFkdKXYSBPTbOyUWsm/OAvnET1Nshw+aCv1eOx d2Aw== X-Gm-Message-State: AG10YOQA6+9WCsuK/nK41UQGUQZ3d2Nx/Eye3zl8w+oJqHcOgx82+O4fmqtMNOn38ndoI/XBsuHVhfZZOP/7E7Ov MIME-Version: 1.0 X-Received: by 10.55.73.199 with SMTP id w190mr746164qka.77.1455704420145; Wed, 17 Feb 2016 02:20:20 -0800 (PST) Received: by 10.140.90.84 with HTTP; Wed, 17 Feb 2016 02:20:19 -0800 (PST) In-Reply-To: <56C445F6.6040004@foss.arm.com> References: <56C1B74D.4070009@foss.arm.com> <56C445F6.6040004@foss.arm.com> 
Date: Wed, 17 Feb 2016 11:20:19 +0100 Message-ID: Subject: Re: [PATCH, ARM] stop changing signedness in PROMOTE_MODE From: Christophe Lyon To: Kyrill Tkachov Cc: Ramana Radhakrishnan , Jim Wilson , "gcc-patches@gcc.gnu.org" X-IsSubscribed: yes

On 17 February 2016 at 11:05, Kyrill Tkachov wrote:
>
> On 17/02/16 10:03, Christophe Lyon wrote:
>>
>> On 15 February 2016 at 12:32, Kyrill Tkachov
>> wrote:
>>>
>>> On 04/02/16 08:58, Ramana Radhakrishnan wrote:
>>>>
>>>> On Tue, Jun 30, 2015 at 2:15 AM, Jim Wilson
>>>> wrote:
>>>>>
>>>>> This is my suggested fix for PR 65932, which is a Linux kernel
>>>>> miscompile with gcc-5.1.
>>>>>
>>>>> The problem here is caused by a chain of events. The first is that
>>>>> the relatively new eipa_sra pass creates fake parameters that behave
>>>>> slightly differently than normal parameters. The second is that the
>>>>> optimizer creates phi nodes that copy local variables to fake
>>>>> parameters and/or vice versa. The third is that the out-of-ssa pass
>>>>> assumes that it can emit simple move instructions for these phi nodes.
>>>>> And the fourth is that the ARM port has a PROMOTE_MODE macro that
>>>>> forces QImode and HImode to unsigned, but a
>>>>> TARGET_PROMOTE_FUNCTION_MODE hook that does not. So signed char and
>>>>> short parameters have different in-register representations than local
>>>>> variables, and require a conversion when copying between them, a
>>>>> conversion that the out-of-ssa pass can't easily emit.
>>>>>
>>>>> Ultimately, I think this is a problem in the arm backend. It should
>>>>> not have a PROMOTE_MODE macro that is changing the sign of char and
>>>>> short local variables. I also think that we should merge the
>>>>> PROMOTE_MODE macro with the TARGET_PROMOTE_FUNCTION_MODE hook to
>>>>> prevent this from happening again.
>>>>>
>>>>> I see four general problems with the current ARM PROMOTE_MODE
>>>>> definition.
>>>>> 1) Unsigned char is only faster for armv5 and earlier, before the sxtb
>>>>> instruction was added. It is a loss for armv6 and later.
>>>>> 2) Unsigned short was only faster for targets that don't support
>>>>> unaligned accesses. Support for these targets was removed a while
>>>>> ago, and this PROMOTE_MODE hunk should have been removed at the same
>>>>> time. It was accidentally left behind.
>>>>> 3) TARGET_PROMOTE_FUNCTION_MODE used to be a boolean hook; when it was
>>>>> converted to a function, the PROMOTE_MODE code was copied without the
>>>>> UNSIGNEDP changes. Thus it is only an accident that
>>>>> TARGET_PROMOTE_FUNCTION_MODE and PROMOTE_MODE disagree. Changing
>>>>> TARGET_PROMOTE_FUNCTION_MODE is an ABI change, so only PROMOTE_MODE
>>>>> changes to resolve the difference are safe.
>>>>> 4) There is a general principle that you should only change signedness
>>>>> in PROMOTE_MODE if the hardware forces it, as otherwise this results
>>>>> in extra conversion instructions that make code slower. The mips64
>>>>> hardware for instance requires that 32-bit values be sign-extended
>>>>> regardless of type, and instructions may trap if this is not true.
>>>>> However, it has a set of 32-bit instructions that operate on these
>>>>> values, and hence no conversions are required. There is no similar
>>>>> case on ARM. Thus the conversions are unnecessary and unwise. This
>>>>> can be seen in the testcases where gcc emits both a zero-extend and a
>>>>> sign-extend inside a loop, as the sign-extend is required for a
>>>>> compare, and the zero-extend is required by PROMOTE_MODE.
>>>>
>>>> Given Kyrill's testing with the patch and the reasonably detailed
>>>> check of the effects of code generation changes - The arm.h hunk is ok
>>>> - I do think we should make this explicit in the documentation that
>>>> PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE should agree and
>>>> better still maybe put in a checking assert for the same in the
>>>> mid-end but that could be the subject of a follow-up patch.
>>>>
>>>> Ok to apply just the arm.h hunk as I think Kyrill has taken care of
>>>> the testsuite fallout separately.
>>>
>>> Hi all,
>>>
>>> I'd like to backport the arm.h hunk from this (r233130) to the GCC 5
>>> branch. As the CSE patch from my series had some fallout on x86_64
>>> due to a deficiency in the AVX patterns that is too invasive to fix
>>> at this stage (and presumably backport), I'd like to just backport
>>> this arm.h fix and adjust the tests to XFAIL the fallout that comes
>>> with not applying the CSE patch. The attached patch does that.
>>>
>>> The code quality fallout on code outside the testsuite is not
>>> that great. The SPEC benchmarks are not affected by not applying
>>> the CSE change, and only a single sequence in a popular embedded
>>> benchmark shows some degradation for -mtune=cortex-a9 in the same
>>> way as the wmul-1.c and wmul-2.c tests.
>>>
>>> I think that's a fair tradeoff for fixing the wrong code bug on that
>>> branch.
>>>
>>> Ok to backport r233130 and the attached testsuite patch to the GCC 5
>>> branch?
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2016-02-15 Kyrylo Tkachov
>>>
>>>     PR target/65932
>>>     * gcc.target/arm/wmul-1.c: Add -mtune=cortex-a9 to dg-options.
>>>     xfail the scan-assembler test.
>>>     * gcc.target/arm/wmul-2.c: Likewise.
>>>     * gcc.target/arm/wmul-3.c: Simplify test to generate a single
>>>     smulbb.
>>>
>>>
>> Hi Kyrill,
>>
>> I've noticed that wmul-3 still fails on the gcc-5 branch when forcing GCC
>> configuration to:
>> --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16
>> (target arm-none-linux-gnueabihf)
>>
>> The generated code is:
>>         sxth    r0, r0
>>         sxth    r1, r1
>>         mul     r0, r1, r0
>> instead of
>>         smulbb  r0, r1, r0
>> on trunk.
>>
>> I guess we don't worry?
>
>
> Hi Christophe,
> Hmmm, I suspect we might want to backport
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01714.html
> to fix the backend costing logic of smulbb.
> Could you please try that patch to see if it helps?
>

Ha indeed, with the attached patch, we now generate smulbb.
I didn't run a full make check though.

OK with a suitable ChangeLog entry?

Christophe.

> Thanks,
> Kyrill
>
>
>>>
>>>> regards
>>>> Ramana
>>>>
>>>>
>>>>
>>>>
>>>>> My change was tested with an arm bootstrap, make check, and SPEC
>>>>> CPU2000 run. The original poster verified that this gives a Linux
>>>>> kernel that boots correctly.
>>>>>
>>>>> The PROMOTE_MODE change causes 3 testsuite testcases to fail. These
>>>>> are tests to verify that smulbb and/or smlabb are generated.
>>>>> Eliminating the unnecessary sign conversions causes us to get better
>>>>> code that doesn't include the smulbb and smlabb instructions. I had
>>>>> to modify the testcases to get them to emit the desired instructions.
>>>>> With the testcase changes there are no additional testsuite failures,
>>>>> though I'm concerned that these testcases with the changes may be
>>>>> fragile, and future changes may break them again.
>>>>
>>>>
>>>>
>>>>> If there are ARM parts where smulbb/smlabb are faster than mul/mla,
>>>>> then maybe we should try to add new patterns to get the instructions
>>>>> emitted again for the unmodified testcases.
>>>>>
>>>>> Jim
>>>
>>>
>
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c (revision 233484)
+++ gcc/config/arm/arm.c (working copy)
@@ -10306,8 +10306,10 @@
               /* SMUL[TB][TB]. */
               if (speed_p)
                 *cost += extra_cost->mult[0].extend;
-              *cost += (rtx_cost (XEXP (x, 0), SIGN_EXTEND, 0, speed_p)
-                        + rtx_cost (XEXP (x, 1), SIGN_EXTEND, 0, speed_p));
+              *cost += rtx_cost (XEXP (XEXP (x, 0), 0),
+                                 SIGN_EXTEND, 0, speed_p);
+              *cost += rtx_cost (XEXP (XEXP (x, 1), 0),
+                                 SIGN_EXTEND, 1, speed_p);
               return true;
             }
           if (speed_p)