From patchwork Tue Mar 29 11:41:16 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Charles Baylis
X-Patchwork-Id: 64594
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Tue, 29 Mar 2016 12:41:16 +0100
Subject: [PATCH ARM v3] PR69770 -mlong-calls does not affect calls to
 __gnu_mcount_nc generated by -pg
From: Charles Baylis
To: Kugan, Kyrylo Tkachov, Ramana
 Radhakrishnan, Richard Earnshaw
Cc: GCC Patches
X-IsSubscribed: yes

On 29 March 2016 at 02:16, Kugan wrote:
>
> Hi Charles,
>
> +static void
> +arm_emit_long_call_profile_insn ()
> +{
> +  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
> +  /* if movt/movw are not available, use a constant pool */
> +  if (!arm_arch_thumb2)
>
> Should this be !TARGET_USE_MOVT?

Hi Kugan,

Thanks for the review.

TARGET_USE_MOVT has additional conditions which mean that it can be
false on targets with MOVW/MOVT depending on the tuning parameters for
the target CPU. Because this patch works in a slightly odd way, I think
it is better to use MOVW/MOVT where possible so that the slightly
hacky use of the literal pool is avoided. Since this only happens when
profiling, it is not essential to have the fully optimised code
sequence here. I'm happy to change it if anybody feels strongly
though.

I've noticed in the quoted snippet that there are some GNU coding
style errors, so I've respun the patch with those corrected.

gcc/ChangeLog:

2016-03-29  Charles Baylis

        * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
        * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
        (arm_expand_prologue): Likewise.
        (thumb1_expand_prologue): Likewise.
        (arm_output_long_call_to_profile_func): Likewise.
        (arm_emit_long_call_profile): Likewise.
        * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
        * config/arm/arm.md (arm_long_call_profile): New pattern.
        * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
        define.
        * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
        * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.

gcc/testsuite/ChangeLog:

2016-03-29  Charles Baylis

        * gcc.target/arm/pr69770.c: New test.
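
For readers following the MOVW/MOVT versus literal pool discussion above, here is a minimal standalone sketch of the instruction sequence the patch appears to emit in each case. It is not GCC code and not part of the patch: `model_long_call_sequence`, `append_insn` and the `.Lpool_entry` label are hypothetical names, and the three boolean parameters are assumed stand-ins for the conditions the patch tests (`arm_arch_thumb2`, `arm_arch5` and `push_scratch`). It only mirrors the order of the `output_asm_insn` calls in `arm_output_long_call_to_profile_func`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append one formatted instruction to BUF.
   FMT may contain at most one %s, which is replaced by REG.  */
static void
append_insn (char *buf, size_t size, const char *fmt, const char *reg)
{
  size_t used = strlen (buf);
  if (used < size)
    snprintf (buf + used, size - used, fmt, reg);
}

/* Hypothetical model of the long-call profiling sequence.  SCRATCH is
   the register that receives the address of __gnu_mcount_nc.  The
   flags are assumptions standing in for the compiler's own checks.  */
static void
model_long_call_sequence (char *buf, size_t size, const char *scratch,
                          bool have_movw_movt, bool have_blx,
                          bool push_scratch)
{
  assert (buf != NULL && size > 0);
  buf[0] = '\0';

  /* Preserve the scratch register if the caller cannot spare it.  */
  if (push_scratch)
    append_insn (buf, size, "push\t{%s}\n", scratch);
  append_insn (buf, size, "push\t{lr}\n", scratch);

  /* Load the address: MOVW/MOVT pair, or a literal pool load.  */
  if (have_movw_movt)
    {
      append_insn (buf, size, "movw\t%s, #:lower16:__gnu_mcount_nc\n", scratch);
      append_insn (buf, size, "movt\t%s, #:upper16:__gnu_mcount_nc\n", scratch);
    }
  else
    append_insn (buf, size, "ldr\t%s, .Lpool_entry\n", scratch);

  /* Call through the register: BLX if available, else set lr by hand.  */
  if (have_blx)
    append_insn (buf, size, "blx\t%s\n", scratch);
  else
    {
      append_insn (buf, size, "mov\tlr, pc\n", scratch);
      append_insn (buf, size, "mov\tpc, %s\n", scratch);
    }

  if (push_scratch)
    append_insn (buf, size, "pop\t{%s}\n", scratch);
}
```

Calling `model_long_call_sequence (buf, sizeof buf, "ip", true, true, false)` models the 32-bit case with MOVW/MOVT, while passing `"r7"` with `have_movw_movt` false and `push_scratch` true models the Thumb-1 literal pool path.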
>From 5785ddcfd518c44cf87b0fc74b4397fd98d1b0c1 Mon Sep 17 00:00:00 2001
From: Charles Baylis
Date: Tue, 29 Mar 2016 12:28:25 +0100
Subject: [PATCH] PR69770 -mlong-calls does not affect calls to
 __gnu_mcount_nc generated by -pg

gcc/ChangeLog:

2016-03-29  Charles Baylis

        * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
        * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
        (arm_expand_prologue): Likewise.
        (thumb1_expand_prologue): Likewise.
        (arm_output_long_call_to_profile_func): Likewise.
        (arm_emit_long_call_profile): Likewise.
        * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
        * config/arm/arm.md (arm_long_call_profile): New pattern.
        * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
        define.
        * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
        * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.

gcc/testsuite/ChangeLog:

2016-03-29  Charles Baylis

        * gcc.target/arm/pr69770.c: New test.

Change-Id: I9b8de01fea083f17f729c3801f83174bedb3b0c6

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 0083673..324c9f4 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -343,6 +343,7 @@ extern void arm_register_target_pragmas (void);
 extern void arm_cpu_cpp_builtins (struct cpp_reader *);

 extern bool arm_is_constant_pool_ref (rtx);
+void arm_emit_long_call_profile ();

 /* Flags used to identify the presence of processor capabilities.  */

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c868490..885657a 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21426,6 +21426,21 @@ output_probe_stack_range (rtx reg1, rtx reg2)

   return "";
 }
+static void
+arm_emit_long_call_profile_insn ()
+{
+  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
+  /* If movt/movw are not available, use a constant pool.  */
+  if (!arm_arch_thumb2)
+    {
+      sym_ref = force_const_mem (Pmode, sym_ref);
+    }
+  rtvec vec = gen_rtvec (1, sym_ref);
+  rtx tmp = gen_rtx_UNSPEC_VOLATILE (VOIDmode, vec, VUNSPEC_LONG_CALL_PROFILE);
+  emit_insn (tmp);
+}
+
+
 /* Generate the prologue instructions for entry into an ARM or Thumb-2
    function.  */
 void
@@ -21789,6 +21804,10 @@ arm_expand_prologue (void)
       arm_load_pic_register (mask);
     }

+  if (crtl->profile && TARGET_LONG_CALLS
+      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
+    arm_emit_long_call_profile_insn ();
+
   /* If we are profiling, make sure no instructions are scheduled before
      the call to mcount.  Similarly if the user has requested no
      scheduling in the prolog.  Similarly if we want non-call exceptions
@@ -24985,6 +25004,10 @@ thumb1_expand_prologue (void)
   if (frame_pointer_needed)
     thumb_set_frame_pointer (offsets);

+  if (crtl->profile && TARGET_LONG_CALLS
+      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
+    arm_emit_long_call_profile_insn ();
+
   /* If we are profiling, make sure no instructions are scheduled before
      the call to mcount.  Similarly if the user has requested no
      scheduling in the prolog.  Similarly if we want non-call exceptions
@@ -30289,4 +30312,70 @@ arm_sched_fusion_priority (rtx_insn *insn, int max_pri,
   return;
 }

+static void
+arm_output_long_call_to_profile_func (rtx * operands, bool push_scratch)
+{
+  /* operands[0] is the address of the __gnu_mcount_nc function.
+     operands[1] is the scratch register we use to load that address.  */
+  if (push_scratch)
+    output_asm_insn ("push\t{%1}", operands);
+  output_asm_insn ("push\t{lr}", operands);
+  if (GET_CODE (operands[0]) == SYMBOL_REF)
+    {
+      output_asm_insn ("movw\t%1, #:lower16:%c0", operands);
+      output_asm_insn ("movt\t%1, #:upper16:%c0", operands);
+    }
+  else
+    {
+      output_asm_insn ("ldr\t%1, %0", operands);
+    }
+  if (!arm_arch5)
+    {
+      output_asm_insn ("mov\tlr, pc", operands);
+      output_asm_insn ("mov\tpc, %1", operands);
+    }
+  else
+    output_asm_insn ("blx\t%1", operands);
+  if (push_scratch)
+    output_asm_insn ("pop\t{%1}", operands);
+}
+
+void
+arm_emit_long_call_profile ()
+{
+  rtx alcp = NULL;
+  rtx operands[2];
+  bool push_scratch;
+  /* Find the use of arm_long_call_profile.  */
+  for (rtx_insn * insn = get_insns (); insn; insn = NEXT_INSN (insn))
+    {
+      if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_PROLOGUE_END)
+        break;
+      if (INSN_CODE (insn) == CODE_FOR_arm_long_call_profile
+          || INSN_CODE (insn) == CODE_FOR_thumb1_long_call_profile)
+        {
+          alcp = PATTERN (insn);
+          break;
+        }
+    }
+  gcc_assert (alcp);
+
+  operands[0] = XEXP (XEXP (alcp, 0), 0);
+  if (TARGET_32BIT)
+    {
+      operands[1] = gen_rtx_REG (SImode, IP_REGNUM);
+      push_scratch = false;
+    }
+  else
+    {
+      /* for nested functions, we can set push_scratch to false, since
+         final.c:profile_function.c and ASM_OUTPUT_REG_PUSH preserve it as
+         part of the sequence to preserve ip across the call to the
+         profiling function.  */
+      operands[1] = gen_rtx_REG (SImode, R0_REGNUM + 7);
+      push_scratch = !IS_NESTED (arm_current_func_type ());
+    }
+  arm_output_long_call_to_profile_func (operands, push_scratch);
+}
+
 #include "gt-arm.h"
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 6352140..c8e2e47 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2044,7 +2044,9 @@ extern int making_const_table;
    that ASM_OUTPUT_REG_PUSH will be matched with ASM_OUTPUT_REG_POP, and
    that r7 isn't used by the function profiler, so we can use it as a
    scratch reg.  WARNING: This isn't safe in the general case!  It may be
-   sensitive to future changes in final.c:profile_function.  */
+   sensitive to future changes in final.c:profile_function.  This is also
+   relied on in arm_emit_long_call_profile () which assumes r7 can be
+   used as a scratch register to load the address of the function profiler.  */
 #define ASM_OUTPUT_REG_PUSH(STREAM, REGNO)              \
   do                                                    \
     {                                                   \
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 47171b9..0c9710b 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11424,6 +11424,15 @@
   DONE;
 })

+(define_insn "arm_long_call_profile"
+  [(unspec_volatile [(match_operand:SI 0 "general_operand" "ji,m")
+    ] VUNSPEC_LONG_CALL_PROFILE)]
+  "TARGET_32BIT"
+  "%@ arm_long_call_profile"
+  [(set_attr "arm_pool_range" "*,4096")
+   (set_attr "arm_neg_pool_range" "*,4084")]
+)
+
 ;; Vector bits common to IWMMXT and Neon
 (include "vec-common.md")
 ;; Load the Intel Wireless Multimedia Extension patterns
diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
index 5d6c4ed..89bf698 100644
--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -174,11 +174,20 @@

 #undef  NO_PROFILE_COUNTERS
 #define NO_PROFILE_COUNTERS 1
+#undef  ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS
+#define ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS 1
 #undef  ARM_FUNCTION_PROFILER
 #define ARM_FUNCTION_PROFILER(STREAM, LABELNO)          \
 {                                                       \
-  fprintf (STREAM, "\tpush\t{lr}\n");                   \
-  fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");          \
+  if (TARGET_LONG_CALLS)                                \
+    {                                                   \
+      arm_emit_long_call_profile ();                    \
+    }                                                   \
+  else                                                  \
+    {                                                   \
+      fprintf (STREAM, "\tpush\t{lr}\n");               \
+      fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");      \
+    }                                                   \
 }

 #undef SUBTARGET_FRAME_POINTER_REQUIRED
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 072ed4d..482d8cb 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1798,7 +1798,7 @@
   [(unspec_volatile [(match_operand:SI 0 "s_register_operand" "l")]
                     VUNSPEC_EH_RETURN)
    (clobber (match_scratch:SI 1 "=&l"))]
-  "TARGET_THUMB1"
+  "TARGET_THUMB1 && 0"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -1809,4 +1809,13 @@
   }"
   [(set_attr "type" "mov_reg")]
 )
+
+(define_insn "thumb1_long_call_profile"
+  [(unspec_volatile [(match_operand:SI 0 "general_operand" "j,m")
+    ] VUNSPEC_LONG_CALL_PROFILE)]
+  "TARGET_THUMB1"
+  "%@ thumb1_long_call_profile"
+  [(set_attr "pool_range" "1018")]
+)
+

diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 5744c62..d7ddc3a 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -148,6 +148,7 @@
   VUNSPEC_GET_FPSCR     ; Represent fetch of FPSCR content.
   VUNSPEC_SET_FPSCR     ; Represent assign of FPSCR content.
   VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
+  VUNSPEC_LONG_CALL_PROFILE ; Represent a long call to profile function
 ])

 ;; Enumerators for NEON unspecs.
diff --git a/gcc/testsuite/gcc.target/arm/pr69770.c b/gcc/testsuite/gcc.target/arm/pr69770.c
new file mode 100644
index 0000000..93b433d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr69770.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-pg -mlong-calls" } */
+
+extern void g (void);
+
+int
+f ()
+{
+  g ();
+  return 0;
+}
+
+
+/* { dg-final { scan-assembler-not "bl\[ \t\]+__gnu_mcount_nc" } } */
+/* { dg-final { scan-assembler "__gnu_mcount_nc" } } */
--
1.9.1