From patchwork Mon Jul 30 11:46:43 2012
X-Patchwork-Submitter: Ramana Radhakrishnan
X-Patchwork-Id: 10355
Date: Mon, 30 Jul 2012 12:46:43 +0100
Subject: [Patch ARM 3/6] Adjust costs for Large moves for ARM.
From: Ramana Radhakrishnan
To: gcc-patches@gcc.gnu.org
Cc: Patch Tracking, Richard Earnshaw

Hi,

lower-subreg.c goes completely bonkers at times with code that uses
the large vector modes, especially the vld3 / vst3 type operations.
In these cases the large modes are usually split into SImode moves,
which then causes massive spilling, and we end up generating really
bad code. The problem appears to be that we report the cost of a
reg-reg move as 0 and the cost of the alternative is also 0, which
means that by default we split in every large register case. I am a
bit unsure about DImode moves and whether they should be split or
not, which is why there is a FIXME in this particular case. With the
examples that I've tried out, which have been suitably complex NEON
intrinsics code, this appears to prevent the gratuitous splitting.

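To make the zero-cost tie described above concrete, here is a minimal
standalone sketch. It is not the lower-subreg.c code: it assumes the
pass splits a wide mode whenever doing the move as word-sized pieces
costs no more than doing it as one wide move. The name should_split
and the cost values used below are hypothetical.

```c
#include <stdbool.h>

/* Simplified model of the split decision (an assumption for
   illustration, not GCC code): split a wide move when performing it
   as `words` word-sized moves costs no more than one wide move.  */
static bool
should_split (int words, int word_move_cost, int wide_move_cost)
{
  return words * word_move_cost <= wide_move_cost;
}
```

Under this model, should_split (6, 0, 0) is true, so a 24-byte value
would be broken into six SImode moves when both costs are reported as
0. With a word move costed at 4 and the wide move at 8 (illustrative
numbers only), should_split (6, 4, 8) is false and the value stays
whole.
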
Of course, not splitting has its own problems, as we now have 3
contiguous registers with large values being allocated. However, I'm
not sure how this will hold up in practice and in real-life
applications, and if someone could provide some feedback on this it
would be great. If only smaller portions of those large registers are
used, it gets a bit harder for the register allocator to get this
right. So this is a patch that might need more tweaking and is
potentially the most contentious of the lot. In addition, the same
logic could be applied to arm_size_cost before I commit this patch.

regards,
Ramana

2012-07-27  Ramana Radhakrishnan

        * config/arm/arm.c (arm_rtx_costs_1): Adjust cost for register
        register moves.
        (arm_reg_reg_move_cost_for_mode): Use it.
---
 gcc/config/arm/arm.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b281485..c59184f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -268,6 +268,7 @@ static int arm_cortex_a5_branch_cost (bool, bool);
 static bool arm_vectorize_vec_perm_const_ok (enum machine_mode vmode,
                                              const unsigned char *sel);
+static int arm_reg_reg_move_cost_for_mode (enum machine_mode mode);
 /* Table of machine attributes.  */
@@ -7637,6 +7638,13 @@ arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed)
       return true;
 
     case SET:
+      if (s_register_operand (SET_DEST (x), GET_MODE (SET_DEST (x)))
+          && s_register_operand (SET_SRC (x), GET_MODE (SET_SRC (x))))
+        {
+          *total = COSTS_N_INSNS (arm_reg_reg_move_cost_for_mode
+                                  (GET_MODE (SET_DEST (x))));
+          return true;
+        }
       return false;
 
     case UNSPEC:
@@ -26501,4 +26509,42 @@ arm_split_tocx_imoves (rtx *operands, enum machine_mode mode)
 }
 
+static int
+arm_reg_reg_move_cost_for_mode (enum machine_mode mode)
+{
+  /* Check if this is a move between 2 pseudos and
+     2 hard registers will fall out from the stuff
+     below.  */
+  if (TARGET_NEON && TARGET_HARD_FLOAT)
+    {
+      /* FIXME - this is currently in only to prevent
+         the large register moves.  However in practice
+         preventing splitting of DImode values requires
+         more tuning.  */
+      if (mode != DImode
+          && (VALID_NEON_DREG_MODE (mode)
+              || VALID_NEON_QREG_MODE (mode)))
+        return 1;
+
+      /* The cost of moving a structure type size is the
+         number of 128 bit moves one needs to do in addition
+         to the number of 64 bit moves one needs to do in
+         case of the EImode values.  */
+      if (VALID_NEON_STRUCT_MODE (mode))
+        {
+          return ((GET_MODE_SIZE (mode) / GET_MODE_SIZE (TImode))
+                  + ((GET_MODE_SIZE (mode) / GET_MODE_SIZE (DImode)) & 1));
+        }
+    }
+
+  if (TARGET_HARD_FLOAT && TARGET_VFP)
+    {
+      if (mode == DFmode
+          || mode == SFmode)
+        return 1;
+    }
+
+  return ARM_NUM_REGS (mode);
+}
+
 #include "gt-arm.h"
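
As a worked example of the structure-mode arithmetic in the patch, the
following standalone sketch reproduces the same expression outside of
GCC. The helper name neon_struct_move_cost is hypothetical, and the
constants 16 and 8 stand in for GET_MODE_SIZE (TImode) and
GET_MODE_SIZE (DImode) in bytes.

```c
/* Standalone model of the VALID_NEON_STRUCT_MODE branch.  The name is
   hypothetical; 16 and 8 are the byte sizes assumed for TImode and
   DImode.  */
static unsigned
neon_struct_move_cost (unsigned size_bytes)
{
  /* Number of whole 128-bit (Q register) moves.  */
  unsigned quad_moves = size_bytes / 16;
  /* One extra 64-bit (D register) move when the size is an odd
     number of 64-bit units, as with a 24-byte value.  */
  unsigned odd_dreg_move = (size_bytes / 8) & 1;

  return quad_moves + odd_dreg_move;
}
```

For the NEON structure mode sizes of 16, 24, 32, 48 and 64 bytes
(TImode, EImode, OImode, CImode and XImode) this gives 1, 2, 2, 3 and
4 respectively. Only the 24-byte EImode case picks up the extra 64-bit
move.
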