From patchwork Mon Jul 30 11:45:44 2012
X-Patchwork-Submitter: Ramana Radhakrishnan
X-Patchwork-Id: 10354
Date: Mon, 30 Jul 2012 12:45:44 +0100
Subject: [Patch ARM 2/6] Fix Large struct mode splitters for cases where registers are not TImode.
From: Ramana Radhakrishnan
To: gcc-patches@gcc.gnu.org
Cc: Patch Tracking, Richard Earnshaw

> Patch 2 is a bug fix that fixes up the splitters so that they take
> into account the right register for the right mode. For instance a
> register not fit for a TImode value shouldn't be put in one even if
> the larger mode allows a different register. This is possible for
> OImode values or indeed HFA style values being passed around as
> parameters and is potentially an issue for folks building hard-float
> systems with neon and using some of the large structures.

The large struct mode splitters split a value into TImode pieces without checking whether the registers holding it are suitable for a TImode value, even though those registers are valid for the larger mode. This is possible in cases where you have an EImode, OImode, CImode or TImode value in the appropriate registers, as these could be passed in their corresponding neon D registers.
This was exposed by the tests for v{ld/st/tbl/tbx}2/3/4{lane/}* and friends in the new set of tests that follow at the end of this patch series. It is a problem for anyone using the new hard-float ABI and passing such values in registers, so it might not show up that much in practice, but it is certainly worth backporting after it has sat in trunk for a few days. It is not a regression, since the bug has always been there, but it is a fundamental correctness issue in the backend with respect to such splits, so I'd like some more consensus on whether this can be safely backported.

regards,
Ramana

2012-07-27  Ramana Radhakrishnan

	PR target/
	* config/arm/arm-protos.h (arm_split_eimoves): Declare.
	(arm_split_tocx_imoves): Declare.
	* config/arm/iterators.md (TOCXI): New.
	* config/arm/neon.md (EI TI OI CI XI mode splitters): Unify
	and use iterator.  Simplify EImode splitter.  Move logic to ...
	* config/arm/arm.c (arm_split_eimoves): ... here.  Handle case
	for EImode values in registers not suitable for splits into
	TImode values.
	(arm_split_tocx_imoves): Likewise.
---
 gcc/config/arm/arm-protos.h |    3 +
 gcc/config/arm/arm.c        |   91 +++++++++++++++++++++++++++++++++++++++++++
 gcc/config/arm/iterators.md |    3 +
 gcc/config/arm/neon.md      |   84 +++++-----------------------------------
 4 files changed, 107 insertions(+), 74 deletions(-)

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index c590ef4..dc93c5d 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -248,6 +248,9 @@ extern int vfp3_const_double_for_fract_bits (rtx);
 extern void arm_emit_coreregs_64bit_shift (enum rtx_code, rtx, rtx, rtx, rtx,
                                            rtx);
 extern bool arm_validize_comparison (rtx *, rtx *, rtx *);
+extern void arm_split_tocx_imoves (rtx *, enum machine_mode);
+extern void arm_split_eimoves (rtx *);
+
 #endif /* RTX_CODE */
 
 extern void arm_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 1f3f9b3..b281485 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -26410,4 +26410,95 @@ arm_validize_comparison (rtx *comparison, rtx * op1, rtx * op2)
 
 }
 
+/* EImode values are usually in 3 DImode registers. This could be suitably
+   split into TImode moves and DImode moves.  */
+void
+arm_split_eimoves (rtx *operands)
+{
+  int rdest = REGNO (operands[0]);
+  int rsrc = REGNO (operands[1]);
+  int count = 0;
+  int increment = 0;
+  rtx dest[3], src[3];
+  int i, j;
+
+  if (NEON_REGNO_OK_FOR_QUAD (rdest) && NEON_REGNO_OK_FOR_QUAD (rsrc))
+    {
+      dest[0] = gen_rtx_REG (TImode, rdest);
+      src[0] = gen_rtx_REG (TImode, rsrc);
+      count = 2;
+      increment = 4;
+    }
+  else
+    {
+      dest[0] = gen_rtx_REG (DImode, rdest);
+      src[0] = gen_rtx_REG (DImode, rsrc);
+      dest[1] = gen_rtx_REG (DImode, rdest + 2);
+      src[1] = gen_rtx_REG (DImode, rsrc + 2);
+      count = 3;
+      increment = 2;
+    }
+
+  dest[count - 1] = gen_rtx_REG (DImode, rdest + 4);
+  src[count - 1] = gen_rtx_REG (DImode, rsrc + 4);
+
+  neon_disambiguate_copy (operands, dest, src, count);
+
+  for (i = 0, j = 0 ; j < count ; i = i + 2, j++)
+    emit_move_insn (operands[i], operands[i + 1]);
+
+  return;
+}
+
+/* Split TI, CI, OI and XImode moves into appropriate smaller
+   forms.  */
+void
+arm_split_tocx_imoves (rtx *operands, enum machine_mode mode)
+{
+  int rdest = REGNO (operands[0]);
+  int rsrc = REGNO (operands[1]);
+  enum machine_mode split_mode;
+  int count = 0;
+  int factor = 0;
+  int j;
+  /* We never should need more than 8 DImode registers in the worst case.  */
+  rtx dest[8], src[8];
+  int i;
+
+  if (NEON_REGNO_OK_FOR_QUAD (rdest) && NEON_REGNO_OK_FOR_QUAD (rsrc))
+    {
+      split_mode = TImode;
+      if (dump_file)
+        fprintf (dump_file, "split_mode is TImode\n");
+    }
+  else
+    {
+      split_mode = DImode;
+      if (dump_file)
+        fprintf (dump_file, "split_mode is DImode\n");
+    }
+
+
+  count = GET_MODE_SIZE (mode) / GET_MODE_SIZE (split_mode);
+  factor = GET_MODE_SIZE (split_mode) / UNITS_PER_WORD;
+
+  if (dump_file)
+    fprintf (dump_file, "count %d factor %d\n", count, factor);
+
+  for (i = 0 ; i < count; i++)
+    {
+      dest[i] = gen_rtx_REG (split_mode, rdest + i * factor );
+      src[i] = gen_rtx_REG (split_mode, rsrc + i * factor);
+    }
+
+  neon_disambiguate_copy (operands, dest, src, count);
+  for (j = 0, i = 0 ; j < count ; j++, i = i + 2)
+    {
+      emit_move_insn (operands[i], operands[i + 1]);
+    }
+
+  return;
+
+}
+
 #include "gt-arm.h"
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index def8d9f..3474d16 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -89,6 +89,9 @@
 ;; Opaque structure types wider than TImode.
 (define_mode_iterator VSTRUCT [EI OI CI XI])
 
+;; Opaque structure types other than EImode.
+(define_mode_iterator TOCXI [TI OI CI XI])
+
 ;; Opaque structure types used in table lookups (except vtbl1/vtbx1).
 (define_mode_iterator VTAB [TI EI OI])
 
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 1ffbb7d..7434625 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -293,85 +293,21 @@
   [(set (match_operand:EI 0 "s_register_operand" "")
         (match_operand:EI 1 "s_register_operand" ""))]
   "TARGET_NEON && reload_completed"
-  [(set (match_dup 0) (match_dup 1))
-   (set (match_dup 2) (match_dup 3))]
+  [(const_int 0)]
 {
-  int rdest = REGNO (operands[0]);
-  int rsrc = REGNO (operands[1]);
-  rtx dest[2], src[2];
-
-  dest[0] = gen_rtx_REG (TImode, rdest);
-  src[0] = gen_rtx_REG (TImode, rsrc);
-  dest[1] = gen_rtx_REG (DImode, rdest + 4);
-  src[1] = gen_rtx_REG (DImode, rsrc + 4);
-
-  neon_disambiguate_copy (operands, dest, src, 2);
+  arm_split_eimoves (operands);
+  DONE;
 })
 
-(define_split
-  [(set (match_operand:OI 0 "s_register_operand" "")
-        (match_operand:OI 1 "s_register_operand" ""))]
+;; Splitter for TI, OI, CI and XI modes.
+(define_split ;; TI, OI, CI and XImode move split.
+  [(set (match_operand:TOCXI 0 "s_register_operand" "")
+        (match_operand:TOCXI 1 "s_register_operand" ""))]
   "TARGET_NEON && reload_completed"
-  [(set (match_dup 0) (match_dup 1))
-   (set (match_dup 2) (match_dup 3))]
+  [(const_int 0)]
 {
-  int rdest = REGNO (operands[0]);
-  int rsrc = REGNO (operands[1]);
-  rtx dest[2], src[2];
-
-  dest[0] = gen_rtx_REG (TImode, rdest);
-  src[0] = gen_rtx_REG (TImode, rsrc);
-  dest[1] = gen_rtx_REG (TImode, rdest + 4);
-  src[1] = gen_rtx_REG (TImode, rsrc + 4);
-
-  neon_disambiguate_copy (operands, dest, src, 2);
-})
-
-(define_split
-  [(set (match_operand:CI 0 "s_register_operand" "")
-        (match_operand:CI 1 "s_register_operand" ""))]
-  "TARGET_NEON && reload_completed"
-  [(set (match_dup 0) (match_dup 1))
-   (set (match_dup 2) (match_dup 3))
-   (set (match_dup 4) (match_dup 5))]
-{
-  int rdest = REGNO (operands[0]);
-  int rsrc = REGNO (operands[1]);
-  rtx dest[3], src[3];
-
-  dest[0] = gen_rtx_REG (TImode, rdest);
-  src[0] = gen_rtx_REG (TImode, rsrc);
-  dest[1] = gen_rtx_REG (TImode, rdest + 4);
-  src[1] = gen_rtx_REG (TImode, rsrc + 4);
-  dest[2] = gen_rtx_REG (TImode, rdest + 8);
-  src[2] = gen_rtx_REG (TImode, rsrc + 8);
-
-  neon_disambiguate_copy (operands, dest, src, 3);
-})
-
-(define_split
-  [(set (match_operand:XI 0 "s_register_operand" "")
-        (match_operand:XI 1 "s_register_operand" ""))]
-  "TARGET_NEON && reload_completed"
-  [(set (match_dup 0) (match_dup 1))
-   (set (match_dup 2) (match_dup 3))
-   (set (match_dup 4) (match_dup 5))
-   (set (match_dup 6) (match_dup 7))]
-{
-  int rdest = REGNO (operands[0]);
-  int rsrc = REGNO (operands[1]);
-  rtx dest[4], src[4];
-
-  dest[0] = gen_rtx_REG (TImode, rdest);
-  src[0] = gen_rtx_REG (TImode, rsrc);
-  dest[1] = gen_rtx_REG (TImode, rdest + 4);
-  src[1] = gen_rtx_REG (TImode, rsrc + 4);
-  dest[2] = gen_rtx_REG (TImode, rdest + 8);
-  src[2] = gen_rtx_REG (TImode, rsrc + 8);
-  dest[3] = gen_rtx_REG (TImode, rdest + 12);
-  src[3] = gen_rtx_REG (TImode, rsrc + 12);
-
-  neon_disambiguate_copy (operands, dest, src, 4);
+  arm_split_tocx_imoves (operands, mode);
+  DONE;
 })
 
 (define_expand "movmisalign"