From patchwork Thu Mar 24 15:40:53 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford <richard.sandiford@linaro.org>
X-Patchwork-Id: 769
Return-Path: <richard.sandiford@linaro.org>
Delivered-To: unknown
Received: from imap.gmail.com (74.125.159.109) by localhost6.localdomain6
 with IMAP4-SSL; 08 Jun 2011 14:45:40 -0000
Delivered-To: patches@linaro.org
Received: by 10.42.161.68 with SMTP id s4cs115605icx;
 Thu, 24 Mar 2011 08:40:59 -0700 (PDT)
Received: by 10.216.255.76 with SMTP id i54mr859715wes.26.1300981258687;
 Thu, 24 Mar 2011 08:40:58 -0700 (PDT)
Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50])
 by mx.google.com with ESMTPS id p60si30617weh.137.2011.03.24.08.40.57
 (version=TLSv1/SSLv3 cipher=OTHER);
 Thu, 24 Mar 2011 08:40:57 -0700 (PDT)
Received-SPF: neutral (google.com: 74.125.82.50 is neither permitted nor
 denied by best guess record for domain of
 richard.sandiford@linaro.org) client-ip=74.125.82.50; 
Authentication-Results: mx.google.com;
 spf=neutral (google.com: 74.125.82.50 is neither
 permitted nor denied by best guess record for domain of
 richard.sandiford@linaro.org)
 smtp.mail=richard.sandiford@linaro.org
Received: by wwc33 with SMTP id 33so101447wwc.31
 for <patches@linaro.org>; Thu, 24 Mar 2011 08:40:56 -0700 (PDT)
Received: by 10.216.142.35 with SMTP id h35mr863035wej.31.1300981256737;
 Thu, 24 Mar 2011 08:40:56 -0700 (PDT)
Received: from richards-thinkpad (gbibp9ph1--blueice2n1.emea.ibm.com
 [195.212.29.75])
 by mx.google.com with ESMTPS id d54sm10509wej.34.2011.03.24.08.40.54
 (version=TLSv1/SSLv3 cipher=OTHER);
 Thu, 24 Mar 2011 08:40:55 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@linaro.org>
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, patches@linaro.org,
 richard.sandiford@linaro.org
Subject: Tighten ARM's CANNOT_CHANGE_MODE_CLASS
cc: patches@linaro.org
Date: Thu, 24 Mar 2011 15:40:53 +0000
Message-ID: <g47hbov856.fsf@linaro.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)

We currently generate very poor code for tests like:

#include <arm_neon.h>

void
foo (uint32_t *a, uint32_t *b, uint32_t *c)
{
  uint32x4x3_t x, y;

  x = vld3q_u32 (a);
  y = vld3q_u32 (b);
  x.val[0] = vaddq_u32 (x.val[0], y.val[0]);
  x.val[1] = vaddq_u32 (x.val[1], y.val[1]);
  x.val[2] = vaddq_u32 (x.val[2], y.val[2]);
  vst3q_u32 (a, x);
}

This is because we force the uint32x4x3_t values to the stack and
then load and store the individual vectors.

What we actually want is for the uint32x4x3_t values to be stored
in registers, and for the individual vectors to be accessed as
subregs of those registers.  The first part involves some middle-end
mode changes (see recent gcc@ thread), while the second part requires
a change to ARM's CANNOT_CHANGE_MODE_CLASS.

CANNOT_CHANGE_MODE_CLASS is currently defined as:

/* FPA registers can't do subreg as all values are reformatted to internal
   precision.  VFP registers may only be accessed in the mode they
   were set.  */
#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)	\
  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)		\
   ? reg_classes_intersect_p (FPA_REGS, (CLASS))	\
     || reg_classes_intersect_p (VFP_REGS, (CLASS))	\
   : 0)

But this VFP restriction appears to apply only to VFPv1; thanks to
Peter Maydell for the archaeology.
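
For illustration only, the intended rule can be modelled as a standalone
C function.  This is a rough sketch, not the GCC implementation: the enum,
the struct and the function name below are hypothetical stand-ins for
reg_classes_intersect_p, TARGET_VFP and arm_fpu_desc, and each register
class is treated as a single value rather than a set of registers.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's register classes.  */
enum reg_class_model { MODEL_CORE_REGS, MODEL_FPA_REGS, MODEL_VFP_REGS };

/* Hypothetical stand-in for TARGET_VFP and arm_fpu_desc->rev.  */
struct fpu_model
{
  bool has_vfp;   /* Models TARGET_VFP.  */
  int rev;        /* Models arm_fpu_desc->rev (1 for VFPv1).  */
};

/* Return true if a value of FROM_SIZE bytes held in RCLASS must not be
   accessed as a value of TO_SIZE bytes.  Same-size changes are always
   allowed.  Size-changing accesses are rejected for FPA registers, and
   for VFP registers only when the FPU is VFPv1.  */
static bool
cannot_change_mode_class_model (unsigned from_size, unsigned to_size,
                                enum reg_class_model rclass,
                                const struct fpu_model *fpu)
{
  if (from_size == to_size)
    return false;
  if (rclass == MODEL_FPA_REGS)
    return true;
  return rclass == MODEL_VFP_REGS && fpu->has_vfp && fpu->rev == 1;
}
```

Under this model, taking a 16-byte vector out of a 48-byte uint32x4x3_t
value held in VFP registers is allowed when rev is greater than 1 and
rejected when rev is 1.  The existing macro rejects it in both cases.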

Tested on arm-linux-gnueabi.  OK to install?

On its own this patch has no direct benefit, but it needs to go in before
the middle-end mode change so that that change does not introduce regressions.

Richard


gcc/
	* config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Restrict VFP_REGS
	case to VFPv1.

Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h	2011-03-24 13:47:14.000000000 +0000
+++ gcc/config/arm/arm.h	2011-03-24 15:26:19.000000000 +0000
@@ -1167,12 +1167,14 @@ #define IRA_COVER_CLASSES						     \
 }
 
 /* FPA registers can't do subreg as all values are reformatted to internal
-   precision.  VFP registers may only be accessed in the mode they
+   precision.  VFPv1 registers may only be accessed in the mode they
    were set.  */
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)	\
-  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)		\
-   ? reg_classes_intersect_p (FPA_REGS, (CLASS))	\
-     || reg_classes_intersect_p (VFP_REGS, (CLASS))	\
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)		\
+  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)			\
+   ? (reg_classes_intersect_p (FPA_REGS, (CLASS))		\
+      || (TARGET_VFP						\
+	  && arm_fpu_desc->rev == 1				\
+	  && reg_classes_intersect_p (VFP_REGS, (CLASS))))	\
    : 0)
 
 /* The class value for index registers, and the one for base regs.  */