From patchwork Fri Dec 16 12:21:52 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrill Tkachov
X-Patchwork-Id: 88293
Message-ID: <5853DC60.4060000@foss.arm.com>
Date: Fri, 16 Dec 2016 12:21:52 +0000
From: Kyrill Tkachov
To: James Greenhalgh
CC: GCC Patches, Marcus Shawcroft, Richard Earnshaw, nd@arm.com
Subject: Re: [PATCH][AArch64] Split X-reg UBFX into W-reg LSR when possible
References: <5849294C.6060109@foss.arm.com> <20161215115346.GA14755@arm.com>
In-Reply-To: <20161215115346.GA14755@arm.com>

On 15/12/16 11:53, James Greenhalgh wrote:
> On Thu, Dec 08, 2016 at 09:35:08AM +0000, Kyrill Tkachov wrote:
>> Hi all,
>>
>> In this patch we split X-register UBFX instructions that extract up to the
>> edge of a W-register into a W-register LSR instruction.
>> So for the example in the testcase instead of:
>> UBFX X0, X0, 24, 8
>>
>> we'd generate:
>> LSR w0, w0, 24
>>
>> An LSR is a simpler instruction and there's a higher chance that it can be
>> combined with other instructions.
>>
>> To do this the patch separates the sign_extract case from the zero_extract
>> case in the * ANY_EXTRACT pattern and further splits the
>> SImode/DImode patterns from the resulting zero_extract pattern.
>> The DImode zero_extract pattern then becomes a define_insn_and_split that
>> splits into a zero_extend of an lshiftrt when the bitposition and width of
>> the zero_extract add up to 32.
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>
>> Since we're in stage 3 perhaps this is not for GCC 6, but it is fairly low
>> risk. I'm happy for it to wait for the next release if necessary.
> I'm OK with the idea, and I'm OK taking this in for Stage 3, but I'm not
> convinced by the implementation.
>
>> 2016-12-08 Kyrylo Tkachov
>>
>> * config/aarch64/aarch64.md (*): Split into...
>> (*extv): ...This...
>> (*extzvsi): ...This...
>> (*extzvdi:): ... And this. Add splitting to lshiftrt when possible.
> Why do we want to do it this way, rather than simply defining a single
> "split" which works in the case you're trying to catch?
>
> i.e. (untested)
>
> (define_split
>   [(set (match_operand:DI 0 "register_operand")
>         (zero_extract:DI (match_operand:DI 1 "register_operand")
>                          (match_operand 2
>                            "aarch64_simd_shift_imm_offset_di")
>                          (match_operand 3
>                            "aarch64_simd_shift_imm_di")))]
>   "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
>              1, GET_MODE_BITSIZE (DImode) - 1)
>    && (INTVAL (operands[2]) + INTVAL (operands[3]))
>        == GET_MODE_BITSIZE (SImode)"
>   [(set (match_dup 0)
>         (zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))]
>   {
>     operands[4] = gen_lowpart (SImode, operands[1]);
>   }
> )
>
> Thanks,
> James

Yes, that works and is simpler.
Is this ok then?
Bootstrapped and tested on aarch64-none-linux-gnu.

Thanks,
Kyrill

2016-12-16  Kyrylo Tkachov

    * config/aarch64/aarch64.md: New define_split above insv.

2016-12-16  Kyrylo Tkachov

    * gcc.target/aarch64/ubfx_lsr_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 65ea04587442b0cab18b1e4652537524b82d5b86..5a40ee6abd5e123116aaaa478dced2207dd59478 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4340,6 +4340,26 @@ (define_insn "*"
   [(set_attr "type" "bfx")]
 )
 
+;; When the bitposition and width add up to 32 we can use a W-reg LSR
+;; instruction taking advantage of the implicit zero-extension of the X-reg.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (zero_extract:DI (match_operand:DI 1 "register_operand")
+                         (match_operand 2
+                           "aarch64_simd_shift_imm_offset_di")
+                         (match_operand 3
+                           "aarch64_simd_shift_imm_di")))]
+  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), 1,
+             GET_MODE_BITSIZE (DImode) - 1)
+   && (INTVAL (operands[2]) + INTVAL (operands[3]))
+       == GET_MODE_BITSIZE (SImode)"
+  [(set (match_dup 0)
+        (zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))]
+  {
+    operands[4] = gen_lowpart (SImode, operands[1]);
+  }
+)
+
 ;; Bitfield Insert (insv)
 (define_expand "insv"
   [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand")
diff --git a/gcc/testsuite/gcc.target/aarch64/ubfx_lsr_1.c b/gcc/testsuite/gcc.target/aarch64/ubfx_lsr_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..f6f72b074e1fc6bcb1976eee6c545e9781b4bed6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ubfx_lsr_1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check that an X-reg UBFX can be simplified into a W-reg LSR.  */
+
+int
+f (unsigned long long x)
+{
+  x = (x >> 24) & 255;
+  return x + 1;
+}
+
+/* { dg-final { scan-assembler "lsr\tw" } } */
+/* { dg-final { scan-assembler-not "ubfx\tx" } } */
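
For reference, the identity the split relies on can be sanity-checked outside the compiler. The following is only a rough sketch in plain C, not part of the patch: ubfx64, lsr32_zext and check_split_equivalence are hypothetical helper names that model the instruction semantics as I understand them (a 64-bit bitfield extract, and a 32-bit logical shift right whose result is zero-extended to 64 bits). It checks that the two agree whenever the bit position and width add up to 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of UBFX Xd, Xn, #lsb, #width: extract WIDTH bits
   starting at bit LSB of a 64-bit value and zero-extend the result.
   Assumes 1 <= width and lsb + width <= 64.  */
uint64_t
ubfx64 (uint64_t x, unsigned lsb, unsigned width)
{
  uint64_t mask = width == 64 ? ~UINT64_C (0) : (UINT64_C (1) << width) - 1;
  return (x >> lsb) & mask;
}

/* Hypothetical model of LSR Wd, Wn, #shift: shift the low 32 bits right
   and rely on the implicit zero-extension into the X register.
   Assumes shift <= 31.  */
uint64_t
lsr32_zext (uint64_t x, unsigned shift)
{
  return (uint64_t) ((uint32_t) x >> shift);
}

/* Compare both models for every (lsb, width) pair with lsb + width == 32,
   which is the condition the define_split tests.  Returns the number of
   pairs checked; aborts via assert on the first mismatch.  */
unsigned
check_split_equivalence (uint64_t x)
{
  unsigned checked = 0;
  for (unsigned lsb = 0; lsb < 32; lsb++)
    {
      unsigned width = 32 - lsb;
      assert (ubfx64 (x, lsb, width) == lsr32_zext (x, lsb));
      checked++;
    }
  return checked;
}
```

With lsb 24 and width 8 this is the case from the testcase: both helpers keep only bits 24..31 of the input. Calling check_split_equivalence on a handful of inputs from a small driver is enough to exercise all 32 position/width pairs.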