From patchwork Tue Nov 24 09:42:29 2015
X-Patchwork-Submitter: Kyrylo Tkachov
X-Patchwork-Id: 57209
Message-ID: <56543105.9090104@arm.com>
Date: Tue, 24 Nov 2015 09:42:29 +0000
From: Kyrill Tkachov
To: James Greenhalgh
CC: GCC Patches, Marcus Shawcroft, Richard Earnshaw
Subject: Re: [PATCH][AArch64][v2] Improve comparison with complex immediates followed by branch/cset
References: <5638D61C.5060100@arm.com> <20151112120543.GA22716@arm.com> <5652EB5D.8040002@arm.com> <20151123145800.GB14088@arm.com> <56532A43.3000500@arm.com>
In-Reply-To: <56532A43.3000500@arm.com>

On 23/11/15 15:01, Kyrill Tkachov wrote:
>
> On 23/11/15 14:58, James Greenhalgh wrote:
>> On Mon, Nov 23, 2015 at 10:33:01AM +0000, Kyrill Tkachov wrote:
>>> On 12/11/15 12:05, James Greenhalgh wrote:
>>>> On Tue, Nov 03, 2015 at 03:43:24PM
+0000, Kyrill Tkachov wrote:
>>>>> Hi all,
>>>>>
>>>>> Bootstrapped and tested on aarch64.
>>>>>
>>>>> Ok for trunk?
>>>>
>>>> Comments in-line.
>>>>
>>> Here's an updated patch according to your comments.
>>> Sorry it took so long to respin it; I had other things to deal with,
>>> with stage1 closing...
>>>
>>> I've indented the sample code sequences and used valid mnemonics.
>>> These patterns can only match during combine, so I'd expect them to
>>> always split during combine or immediately after, but I don't think
>>> that's a documented guarantee, so I've gated them on !reload_completed.
>>>
>>> I've used IN_RANGE in the predicates.md hunk and added scan-assembler
>>> checks in the tests.
>>>
>>> Is this ok?
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2015-11-20  Kyrylo Tkachov
>>>
>>>     * config/aarch64/aarch64.md (*condjump): Rename to...
>>>     (condjump): ... This.
>>>     (*compare_condjump<mode>): New define_insn_and_split.
>>>     (*compare_cstore<mode>_insn): Likewise.
>>>     (*cstore<mode>_insn): Rename to...
>>>     (aarch64_cstore<mode>): ... This.
>>>     * config/aarch64/iterators.md (CMP): Handle ne code.
>>>     * config/aarch64/predicates.md (aarch64_imm24): New predicate.
>>>
>>> 2015-11-20  Kyrylo Tkachov
>>>
>>>     * gcc.target/aarch64/cmpimm_branch_1.c: New test.
>>>     * gcc.target/aarch64/cmpimm_cset_1.c: Likewise.
>>>
>>> commit bb44feed4e6beaae25d9bdffa45073dc61c65838
>>> Author: Kyrylo Tkachov
>>> Date:   Mon Sep 21 10:56:47 2015 +0100
>>>
>>>     [AArch64] Improve comparison with complex immediates
>>>
>>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>>> index 11f6387..3e57d08 100644
>>> --- a/gcc/config/aarch64/aarch64.md
>>> +++ b/gcc/config/aarch64/aarch64.md
>>> @@ -372,7 +372,7 @@ (define_expand "mod<mode>3"
>>>  }
>>>  )
>>>
>>> -(define_insn "*condjump"
>>> +(define_insn "condjump"
>>>    [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
>>>                              [(match_operand 1 "cc_register" "") (const_int 0)])
>>>                             (label_ref (match_operand 2 "" ""))
>>> @@ -397,6 +397,41 @@ (define_insn "*condjump"
>>>                       (const_int 1)))]
>>>  )
>>>
>>> +;; For a 24-bit immediate CST we can optimize the compare for equality
>>> +;; and branch sequence from:
>>> +;;      mov     x0, #imm1
>>> +;;      movk    x0, #imm2, lsl 16 /* x0 contains CST.  */
>>> +;;      cmp     x1, x0
>>> +;;      b<ne,eq> .Label
>>> +;; into the shorter:
>>> +;;      sub     x0, x0, #(CST & 0xfff000)
>>> +;;      subs    x0, x0, #(CST & 0x000fff)
>>
>> sub     x0, x1, #(CST....) ?
>>
>> The transform doesn't make sense otherwise.
>
> Doh, yes. The source should be x1, of course.
> Here's what I'll be committing.

Thanks,
Kyrill

2015-11-24  Kyrylo Tkachov

    * config/aarch64/aarch64.md (*condjump): Rename to...
    (condjump): ... This.
    (*compare_condjump<mode>): New define_insn_and_split.
    (*compare_cstore<mode>_insn): Likewise.
    (*cstore<mode>_insn): Rename to...
    (aarch64_cstore<mode>): ... This.
    * config/aarch64/iterators.md (CMP): Handle ne code.
    * config/aarch64/predicates.md (aarch64_imm24): New predicate.

2015-11-24  Kyrylo Tkachov

    * gcc.target/aarch64/cmpimm_branch_1.c: New test.
    * gcc.target/aarch64/cmpimm_cset_1.c: Likewise.
> Kyrill
>
>>
>>> +;;      b<ne,eq> .Label
>>> +(define_insn_and_split "*compare_condjump<mode>"
>>> +  [(set (pc) (if_then_else (EQL
>>> +                             (match_operand:GPI 0 "register_operand" "r")
>>> +                             (match_operand:GPI 1 "aarch64_imm24" "n"))
>>> +                           (label_ref:P (match_operand 2 "" ""))
>>> +                           (pc)))]
>>> +  "!aarch64_move_imm (INTVAL (operands[1]), <MODE>mode)
>>> +   && !aarch64_plus_operand (operands[1], <MODE>mode)
>>> +   && !reload_completed"
>>> +  "#"
>>> +  "&& true"
>>> +  [(const_int 0)]
>>> +  {
>>> +    HOST_WIDE_INT lo_imm = UINTVAL (operands[1]) & 0xfff;
>>> +    HOST_WIDE_INT hi_imm = UINTVAL (operands[1]) & 0xfff000;
>>> +    rtx tmp = gen_reg_rtx (<MODE>mode);
>>> +    emit_insn (gen_add<mode>3 (tmp, operands[0], GEN_INT (-hi_imm)));
>>> +    emit_insn (gen_add<mode>3_compare0 (tmp, tmp, GEN_INT (-lo_imm)));
>>> +    rtx cc_reg = gen_rtx_REG (CC_NZmode, CC_REGNUM);
>>> +    rtx cmp_rtx = gen_rtx_fmt_ee (<EQL:CMP>, <MODE>mode, cc_reg, const0_rtx);
>>> +    emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[2]));
>>> +    DONE;
>>> +  }
>>> +)
>>> +
>>>  (define_expand "casesi"
>>>    [(match_operand:SI 0 "register_operand" "")  ; Index
>>>     (match_operand:SI 1 "const_int_operand" "") ; Lower bound
>>> @@ -2901,7 +2936,7 @@ (define_expand "cstore<mode>4"
>>>    "
>>>  )
>>>
>>> -(define_insn "*cstore<mode>_insn"
>>> +(define_insn "aarch64_cstore<mode>"
>>>    [(set (match_operand:ALLI 0 "register_operand" "=r")
>>>          (match_operator:ALLI 1 "aarch64_comparison_operator"
>>>           [(match_operand 2 "cc_register" "") (const_int 0)]))]
>>> @@ -2910,6 +2945,40 @@ (define_insn "*cstore<mode>_insn"
>>>    [(set_attr "type" "csel")]
>>>  )
>>>
>>> +;; For a 24-bit immediate CST we can optimize the compare for equality
>>> +;; and branch sequence from:
>>> +;;      mov     x0, #imm1
>>> +;;      movk    x0, #imm2, lsl 16 /* x0 contains CST.  */
>>> +;;      cmp     x1, x0
>>> +;;      cset    x2, <ne,eq>
>>> +;; into the shorter:
>>> +;;      sub     x0, x0, #(CST & 0xfff000)
>>> +;;      subs    x0, x0, #(CST & 0x000fff)
>>> +;;      cset    x1, <ne,eq>.
>>
>> Please fix the register allocation in your shorter sequence, these
>> are not equivalent.
>>
>>> +(define_insn_and_split "*compare_cstore<mode>_insn"
>>> +  [(set (match_operand:GPI 0 "register_operand" "=r")
>>> +        (EQL:GPI (match_operand:GPI 1 "register_operand" "r")
>>> +                 (match_operand:GPI 2 "aarch64_imm24" "n")))]
>>> +  "!aarch64_move_imm (INTVAL (operands[2]), <MODE>mode)
>>> +   && !aarch64_plus_operand (operands[2], <MODE>mode)
>>> +   && !reload_completed"
>>> +  "#"
>>> +  "&& true"
>>> +  [(const_int 0)]
>>> +  {
>>> +    HOST_WIDE_INT lo_imm = UINTVAL (operands[2]) & 0xfff;
>>> +    HOST_WIDE_INT hi_imm = UINTVAL (operands[2]) & 0xfff000;
>>> +    rtx tmp = gen_reg_rtx (<MODE>mode);
>>> +    emit_insn (gen_add<mode>3 (tmp, operands[1], GEN_INT (-hi_imm)));
>>> +    emit_insn (gen_add<mode>3_compare0 (tmp, tmp, GEN_INT (-lo_imm)));
>>> +    rtx cc_reg = gen_rtx_REG (CC_NZmode, CC_REGNUM);
>>> +    rtx cmp_rtx = gen_rtx_fmt_ee (<EQL:CMP>, <MODE>mode, cc_reg, const0_rtx);
>>> +    emit_insn (gen_aarch64_cstore<mode> (operands[0], cmp_rtx, cc_reg));
>>> +    DONE;
>>> +  }
>>> +  [(set_attr "type" "csel")]
>>> +)
>>> +
>>>  ;; zero_extend version of the above
>>>  (define_insn "*cstoresi_insn_uxtw"
>>>    [(set (match_operand:DI 0 "register_operand" "=r")
>>> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
>>> index c2eb7de..422bc87 100644
>>> --- a/gcc/config/aarch64/iterators.md
>>> +++ b/gcc/config/aarch64/iterators.md
>>> @@ -824,7 +824,8 @@ (define_code_attr cmp_2 [(lt "1") (le "1") (eq "2") (ge "2") (gt "2")
>>>                           (ltu "1") (leu "1") (geu "2") (gtu "2")])
>>>
>>>  (define_code_attr CMP [(lt "LT") (le "LE") (eq "EQ") (ge "GE") (gt "GT")
>>> -                       (ltu "LTU") (leu "LEU") (geu "GEU") (gtu "GTU")])
>>> +                       (ltu "LTU") (leu "LEU") (ne "NE") (geu "GEU")
>>> +                       (gtu "GTU")])
>>>
>>>  (define_code_attr fix_trunc_optab [(fix "fix_trunc")
>>>                                     (unsigned_fix "fixuns_trunc")])
>>> diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
>>> index e7f76e0..c0c3ff5 100644
>>> --- a/gcc/config/aarch64/predicates.md
>>> +++ b/gcc/config/aarch64/predicates.md
>>> @@ -145,6 +145,11 @@ (define_predicate "aarch64_imm3"
>>>    (and (match_code "const_int")
>>>         (match_test "(unsigned HOST_WIDE_INT) INTVAL (op) <= 4")))
>>>
>>> +;; An immediate that fits into 24 bits.
>>> +(define_predicate "aarch64_imm24"
>>> +  (and (match_code "const_int")
>>> +       (match_test "IN_RANGE (UINTVAL (op), 0, 0xffffff)")))
>>> +
>>>  (define_predicate "aarch64_pwr_imm3"
>>>    (and (match_code "const_int")
>>>         (match_test "INTVAL (op) != 0
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/cmpimm_branch_1.c b/gcc/testsuite/gcc.target/aarch64/cmpimm_branch_1.c
>>> new file mode 100644
>>> index 0000000..7ad736b
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/cmpimm_branch_1.c
>>> @@ -0,0 +1,26 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-save-temps -O2" } */
>>> +
>>> +/* Test that we emit a sub+subs sequence rather than mov+movk+cmp.  */
>>> +
>>> +void g (void);
>>> +void
>>> +foo (int x)
>>> +{
>>> +  if (x != 0x123456)
>>> +    g ();
>>> +}
>>> +
>>> +void
>>> +fool (long long x)
>>> +{
>>> +  if (x != 0x123456)
>>> +    g ();
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-not "cmp\tw\[0-9\]*.*" } } */
>>> +/* { dg-final { scan-assembler-not "cmp\tx\[0-9\]*.*" } } */
>>> +/* { dg-final { scan-assembler-times "sub\tw\[0-9\]+.*" 1 } } */
>>> +/* { dg-final { scan-assembler-times "sub\tx\[0-9\]+.*" 1 } } */
>>> +/* { dg-final { scan-assembler-times "subs\tw\[0-9\]+.*" 1 } } */
>>> +/* { dg-final { scan-assembler-times "subs\tx\[0-9\]+.*" 1 } } */
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/cmpimm_cset_1.c b/gcc/testsuite/gcc.target/aarch64/cmpimm_cset_1.c
>>> new file mode 100644
>>> index 0000000..6a03cc9
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/cmpimm_cset_1.c
>>> @@ -0,0 +1,23 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-save-temps -O2" } */
>>> +
>>> +/* Test that we emit a sub+subs sequence rather than mov+movk+cmp.
 */
>>> +
>>> +int
>>> +foo (int x)
>>> +{
>>> +  return x == 0x123456;
>>> +}
>>> +
>>> +long
>>> +fool (long x)
>>> +{
>>> +  return x == 0x123456;
>>> +}
>>> +
>>
>> This test will be broken for ILP32. This should be long long.
>>
>> OK with those comments fixed.
>
> Thanks, I'll prepare an updated version.
>
> Kyrill
>
>> Thanks,
>> James
>>

commit 30cc3774824ba7fd372111a223ade075ad7c49cc
Author: Kyrylo Tkachov
Date:   Mon Sep 21 10:56:47 2015 +0100

    [AArch64] Improve comparison with complex immediates

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 11f6387..3283cb2 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -372,7 +372,7 @@ (define_expand "mod<mode>3"
 }
 )

-(define_insn "*condjump"
+(define_insn "condjump"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
                             [(match_operand 1 "cc_register" "") (const_int 0)])
                            (label_ref (match_operand 2 "" ""))
@@ -397,6 +397,41 @@ (define_insn "*condjump"
                      (const_int 1)))]
 )

+;; For a 24-bit immediate CST we can optimize the compare for equality
+;; and branch sequence from:
+;;      mov     x0, #imm1
+;;      movk    x0, #imm2, lsl 16 /* x0 contains CST.
  */
+;;      cmp     x1, x0
+;;      b<ne,eq> .Label
+;; into the shorter:
+;;      sub     x0, x1, #(CST & 0xfff000)
+;;      subs    x0, x0, #(CST & 0x000fff)
+;;      b<ne,eq> .Label
+(define_insn_and_split "*compare_condjump<mode>"
+  [(set (pc) (if_then_else (EQL
+                             (match_operand:GPI 0 "register_operand" "r")
+                             (match_operand:GPI 1 "aarch64_imm24" "n"))
+                           (label_ref:P (match_operand 2 "" ""))
+                           (pc)))]
+  "!aarch64_move_imm (INTVAL (operands[1]), <MODE>mode)
+   && !aarch64_plus_operand (operands[1], <MODE>mode)
+   && !reload_completed"
+  "#"
+  "&& true"
+  [(const_int 0)]
+  {
+    HOST_WIDE_INT lo_imm = UINTVAL (operands[1]) & 0xfff;
+    HOST_WIDE_INT hi_imm = UINTVAL (operands[1]) & 0xfff000;
+    rtx tmp = gen_reg_rtx (<MODE>mode);
+    emit_insn (gen_add<mode>3 (tmp, operands[0], GEN_INT (-hi_imm)));
+    emit_insn (gen_add<mode>3_compare0 (tmp, tmp, GEN_INT (-lo_imm)));
+    rtx cc_reg = gen_rtx_REG (CC_NZmode, CC_REGNUM);
+    rtx cmp_rtx = gen_rtx_fmt_ee (<EQL:CMP>, <MODE>mode, cc_reg, const0_rtx);
+    emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[2]));
+    DONE;
+  }
+)
+
 (define_expand "casesi"
   [(match_operand:SI 0 "register_operand" "")  ; Index
    (match_operand:SI 1 "const_int_operand" "") ; Lower bound
@@ -2901,7 +2936,7 @@ (define_expand "cstore<mode>4"
   "
 )

-(define_insn "*cstore<mode>_insn"
+(define_insn "aarch64_cstore<mode>"
   [(set (match_operand:ALLI 0 "register_operand" "=r")
         (match_operator:ALLI 1 "aarch64_comparison_operator"
          [(match_operand 2 "cc_register" "") (const_int 0)]))]
@@ -2910,6 +2945,40 @@ (define_insn "*cstore<mode>_insn"
   [(set_attr "type" "csel")]
 )

+;; For a 24-bit immediate CST we can optimize the compare for equality
+;; and branch sequence from:
+;;      mov     x0, #imm1
+;;      movk    x0, #imm2, lsl 16 /* x0 contains CST.  */
+;;      cmp     x1, x0
+;;      cset    x2, <ne,eq>
+;; into the shorter:
+;;      sub     x0, x1, #(CST & 0xfff000)
+;;      subs    x0, x0, #(CST & 0x000fff)
+;;      cset    x2, <ne,eq>.
+(define_insn_and_split "*compare_cstore<mode>_insn"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (EQL:GPI (match_operand:GPI 1 "register_operand" "r")
+                 (match_operand:GPI 2 "aarch64_imm24" "n")))]
+  "!aarch64_move_imm (INTVAL (operands[2]), <MODE>mode)
+   && !aarch64_plus_operand (operands[2], <MODE>mode)
+   && !reload_completed"
+  "#"
+  "&& true"
+  [(const_int 0)]
+  {
+    HOST_WIDE_INT lo_imm = UINTVAL (operands[2]) & 0xfff;
+    HOST_WIDE_INT hi_imm = UINTVAL (operands[2]) & 0xfff000;
+    rtx tmp = gen_reg_rtx (<MODE>mode);
+    emit_insn (gen_add<mode>3 (tmp, operands[1], GEN_INT (-hi_imm)));
+    emit_insn (gen_add<mode>3_compare0 (tmp, tmp, GEN_INT (-lo_imm)));
+    rtx cc_reg = gen_rtx_REG (CC_NZmode, CC_REGNUM);
+    rtx cmp_rtx = gen_rtx_fmt_ee (<EQL:CMP>, <MODE>mode, cc_reg, const0_rtx);
+    emit_insn (gen_aarch64_cstore<mode> (operands[0], cmp_rtx, cc_reg));
+    DONE;
+  }
+  [(set_attr "type" "csel")]
+)
+
 ;; zero_extend version of the above
 (define_insn "*cstoresi_insn_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index c2eb7de..422bc87 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -824,7 +824,8 @@ (define_code_attr cmp_2 [(lt "1") (le "1") (eq "2") (ge "2") (gt "2")
                          (ltu "1") (leu "1") (geu "2") (gtu "2")])

 (define_code_attr CMP [(lt "LT") (le "LE") (eq "EQ") (ge "GE") (gt "GT")
-                       (ltu "LTU") (leu "LEU") (geu "GEU") (gtu "GTU")])
+                       (ltu "LTU") (leu "LEU") (ne "NE") (geu "GEU")
+                       (gtu "GTU")])

 (define_code_attr fix_trunc_optab [(fix "fix_trunc")
                                    (unsigned_fix "fixuns_trunc")])
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index e7f76e0..c0c3ff5 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -145,6 +145,11 @@ (define_predicate "aarch64_imm3"
   (and (match_code "const_int")
        (match_test "(unsigned HOST_WIDE_INT) INTVAL (op) <= 4")))

+;; An immediate that fits into 24 bits.
+(define_predicate "aarch64_imm24"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (UINTVAL (op), 0, 0xffffff)")))
+
 (define_predicate "aarch64_pwr_imm3"
   (and (match_code "const_int")
        (match_test "INTVAL (op) != 0
diff --git a/gcc/testsuite/gcc.target/aarch64/cmpimm_branch_1.c b/gcc/testsuite/gcc.target/aarch64/cmpimm_branch_1.c
new file mode 100644
index 0000000..7ad736b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cmpimm_branch_1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-save-temps -O2" } */
+
+/* Test that we emit a sub+subs sequence rather than mov+movk+cmp.  */
+
+void g (void);
+void
+foo (int x)
+{
+  if (x != 0x123456)
+    g ();
+}
+
+void
+fool (long long x)
+{
+  if (x != 0x123456)
+    g ();
+}
+
+/* { dg-final { scan-assembler-not "cmp\tw\[0-9\]*.*" } } */
+/* { dg-final { scan-assembler-not "cmp\tx\[0-9\]*.*" } } */
+/* { dg-final { scan-assembler-times "sub\tw\[0-9\]+.*" 1 } } */
+/* { dg-final { scan-assembler-times "sub\tx\[0-9\]+.*" 1 } } */
+/* { dg-final { scan-assembler-times "subs\tw\[0-9\]+.*" 1 } } */
+/* { dg-final { scan-assembler-times "subs\tx\[0-9\]+.*" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/cmpimm_cset_1.c b/gcc/testsuite/gcc.target/aarch64/cmpimm_cset_1.c
new file mode 100644
index 0000000..f6fd69f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cmpimm_cset_1.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-save-temps -O2" } */
+
+/* Test that we emit a sub+subs sequence rather than mov+movk+cmp.
 */
+
+int
+foo (int x)
+{
+  return x == 0x123456;
+}
+
+long long
+fool (long long x)
+{
+  return x == 0x123456;
+}
+
+/* { dg-final { scan-assembler-not "cmp\tw\[0-9\]*.*" } } */
+/* { dg-final { scan-assembler-not "cmp\tx\[0-9\]*.*" } } */
+/* { dg-final { scan-assembler-times "sub\tw\[0-9\]+.*" 1 } } */
+/* { dg-final { scan-assembler-times "sub\tx\[0-9\]+.*" 1 } } */
+/* { dg-final { scan-assembler-times "subs\tw\[0-9\]+.*" 1 } } */
+/* { dg-final { scan-assembler-times "subs\tx\[0-9\]+.*" 1 } } */