From patchwork Fri Nov 13 16:01:39 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Charles Baylis
X-Patchwork-Id: 56521
Delivered-To: patch@linaro.org
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Fri, 13 Nov 2015 16:01:39 +0000
Subject: [PATCH v2] [ARM] PR61551 RFC: Improve costs for NEON addressing modes
From: Charles Baylis
To: Ramana Radhakrishnan, Kyrylo Tkachov, Richard Earnshaw
Cc: GCC Patches
X-IsSubscribed: yes

Hi

Following on from previous discussion:
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03464.html
and IRC.

I'm going to try once more to make the case for fixing the worst problem for
GCC 6, pending a rewrite of the address_cost infrastructure for GCC 7. I think
the rewrite you're describing is overkill for this problem.

There is one specific problem which I would like to fix for GCC 6, and that is
the failure of the ARM backend to allow use of post-indexed addressing for
some vector modes.

Test program:

#include <arm_neon.h>

char *f(char *p, int8x8x4_t v, int r)
{
    vst4_s8(p, v);
    p+=32;
    return p;
}

Desired code:

f:
        vst4.8  {d0-d3}, [r0]!
        bx      lr

Currently generated code:

f:
        mov     r3, r0
        adds    r0, r0, #32
        vst4.8  {d0-d3}, [r3]
        bx      lr

The auto-inc-dec pass does not apply in this case, because the costs for RTXs
which use POST_INC are wrong. Using gdb to poke at this, we can see:

$ arm-unknown-linux-gnueabihf-gcc -mfpu=neon -O3 -S /tmp/foo.c -wrapper gdb,--args
GNU gdb (Ubuntu 7.9-1ubuntu1) 7.9
Reading symbols from /home/charles.baylis/tools/tools-arm-unknown-linux-gnueabihf-git/bin/../libexec/gcc/arm-unknown-linux-gnueabihf/6.0.0/cc1...done.
(gdb) b auto-inc-dec.c:473
Breakpoint 1 at 0x102c253: file /home/charles.baylis/srcarea/gcc/gcc-git/gcc/auto-inc-dec.c, line 473.
(gdb) r
(gdb) print debug_rtx(mem)
(mem:OI (reg/v/f:SI 112 [ p ]) [0 MEM[(signed char[32] *)p_2(D)]+0 S32 A8])
$1 = void
(gdb) print rtx_cost(mem, V16QImode, SET, 1, false)
$2 = 4
(gdb) print debug_rtx(mem_tmp)
(mem:OI (post_inc:SI (reg/f:SI 115 [ p ])) [0 S32 A64])
$3 = void
(gdb) print rtx_cost(mem_tmp, V16QImode, SET, 1, false)
$4 = 32

So, the cost of (mem:OI (reg/v/f:SI 112 [ p ])) is 4, while the cost of
(mem:OI (post_inc:SI (reg/f:SI 115 [ p ]))) is 32. That is a difference of 28,
equivalent to 7 insns (COSTS_N_INSNS (1) is 4), which has no basis in reality.
It is just a bug.

Addressing some specific review points from the previous version.

> > +    {
> > +      0,
> > +      COSTS_N_INSNS (15),
> > +      COSTS_N_INSNS (15),
> > +      COSTS_N_INSNS (15),
> > +      COSTS_N_INSNS (15)
> > +    } /* vec512 */
> >    }
> >  };
>
> I'm curious as to the numbers here - The costs should reflect the relative costs of the
> addressing modes not the costs of the loads and stores - thus having high numbers
> here for vector modes may just prevent this from even triggering in auto-inc-dec
> code ? In my experience with GCC I've never satisfactorily answered the question
> whether these should be comparable to rtx_costs or not. In an ideal world they should
> be but I'm never sure. IOW I'm not sure if using COSTS_N_INSNS or plain numbers
> here is appropriate.

That's the point of the patch. These numbers give the same behaviour as the
current arm_rtx_costs code, and they are obviously wrong.

> 17:45 < ramana> My problem is that the mid-end in a number of other places
> compares the cost coming out of rtx_cost and address_cost and if the 2
> are not in sync we get funny values.

There is already no correspondence at all between the two at present. My patch
doesn't fix this, but I think it must at least make it better.

However, I don't really understand this comment - as you point out above,
address_cost and rtx_cost return values measured in different units. I don't
see how they can be made to correspond, given that.

> Right, but this does not change arm_address_costs - so how is this going to work?
> I would like this moved into a new function aarch_address_costs and that replacing
> arm_address_costs only to be called from here.

I could do that, but if I did, I would have to resubmit the patch at
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00387.html
along with a reimplementation of arm_address_costs which used a table without
changing its numerical results (pending subsequent tuning).
Since the former would already solve my problem, and the latter would then be
a pure code clean up of a separate function, why not accept the '387 patch as
is, and leave the clean up until GCC 7?

Alternatively, this is an updated patch series which changes the costs for
MEMs in arm_rtx_costs using the table.

Passes make check with no regressions for arm-unknown-linux-gnueabihf on qemu.

From 68f4318327e75a709f6a3bea327915c0558127df Mon Sep 17 00:00:00 2001
From: Charles Baylis
Date: Fri, 13 Nov 2015 12:24:08 +0000
Subject: [PATCH 4/4] Use integer costs for soft float

If using soft float, then the cost of accessing FP values is actually the
same as the cost of accessing integers of the same size.

Change-Id: Icb672b2b599ea4e433bc0b29c228e9f910aeb3ee
---
 gcc/config/arm/arm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 101ff28..726a385 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9579,15 +9579,15 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
                         : ARM_NUM_REGS (mode) <= 12 ? extra_cost->extra_mem->vec384[op]
                         : extra_cost->extra_mem->vec512[op]);
             }
-          else if (FLOAT_MODE_P (mode))
+          else if (mode == BLKmode)
+            *cost += extra_cost->extra_mem->blk[op];
+          else if (FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT)
             {
               *cost += (ARM_NUM_REGS (mode) <= 1 ? extra_cost->extra_mem->sf[op]
                         : ARM_NUM_REGS (mode) <= 2 ? extra_cost->extra_mem->df[op]
                         : extra_cost->extra_mem->cdf[op]);
             }
-          else if (mode == BLKmode)
-            *cost += extra_cost->extra_mem->blk[op];
           else
             { /* integer modes */
               *cost +=
-- 
1.9.1