From patchwork Tue Jul 26 13:55:02 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Greenhalgh X-Patchwork-Id: 72813 Delivered-To: patch@linaro.org Received: by 10.140.29.52 with SMTP id a49csp1700922qga; Tue, 26 Jul 2016 06:55:48 -0700 (PDT) X-Received: by 10.98.20.201 with SMTP id 192mr39947344pfu.144.1469541348220; Tue, 26 Jul 2016 06:55:48 -0700 (PDT) Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id an6si848030pad.167.2016.07.26.06.55.47 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 26 Jul 2016 06:55:48 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-return-432548-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org; spf=pass (google.com: domain of gcc-patches-return-432548-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-432548-patch=linaro.org@gcc.gnu.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:mime-version:content-type; q=dns; s=default; b=VaoJwYYeeq6cNyKKFbtxrTMYGX/mcHR4TsP3I1z0JTsicPANrw bg5s6EreoLVgT6dET+o2by0f5EeaV8x4vaDQDtRRpHHK67ZvQG62v38Ih7MoSH7C zY8JDG2Nx6cBA/jL++nXyjZdKxqHK9zIwdKAX4jTuKkAvXJQ9/pydjTkY= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:mime-version:content-type; s= default; bh=y+JNU6RcefXjN7FMXaLZYvh8jbY=; b=VQbDvrXm9PPUGfhm/Tqe /HU0gw86FX66evn8313nwnpRfTtfpjzXQm9KCFEhk6vQETTqTedmUb77lTlfD8y9 FyAdedIgf5gwFPUrLqlO1aqNOBiOkRnAVAbWZUksCbu10UzArgaO5AyWHSjLHfUe rnkHscQm9AtZ9jIIaCp5X7U= Received: (qmail 12817 invoked by alias); 26 Jul 2016 13:55:36 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 12806 invoked by uid 89); 26 Jul 2016 13:55:35 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.6 required=5.0 tests=AWL, BAYES_00, SPF_PASS autolearn=ham version=3.3.2 spammy=4813, gimple_seq, xx, field_t X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 26 Jul 2016 13:55:24 +0000 Received: from EUR01-DB5-obe.outbound.protection.outlook.com (mail-db5eur01lp0182.outbound.protection.outlook.com [213.199.154.182]) (Using TLS) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-52-KgOtx8QmNC24usndNC-uBw-1; Tue, 26 Jul 2016 14:55:20 +0100 Received: from AM2PR08CA0041.eurprd08.prod.outlook.com (10.162.32.51) by VI1PR0801MB1951.eurprd08.prod.outlook.com (10.173.74.8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.544.10; Tue, 26 Jul 2016 13:55:17 +0000 Received: from AM1FFO11FD028.protection.gbl (2a01:111:f400:7e00::191) by AM2PR08CA0041.outlook.office365.com (2a01:111:e400:843e::51) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.549.15 via Frontend Transport; Tue, 26 Jul 2016 13:55:17 +0000 Received: from nebula.arm.com (217.140.96.140) by AM1FFO11FD028.mail.protection.outlook.com (10.174.64.217) with Microsoft SMTP Server (TLS) id 15.1.539.16 via Frontend Transport; Tue, 26 Jul 2016 13:55:18 +0000 Received: from e107456-lin.cambridge.arm.com (10.1.2.79) by mail.arm.com (10.1.105.66) with Microsoft SMTP Server id 14.3.294.0; Tue, 26 Jul 2016 14:55:04 +0100 From: James Greenhalgh To: CC: , , Subject: [AArch64] Handle HFAs of float16 types properly Date: Tue, 26 Jul 2016 14:55:02 +0100 Message-ID: <1469541302-17088-1-git-send-email-james.greenhalgh@arm.com> MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-Office365-Filtering-HT: Tenant X-Forefront-Antispam-Report: CIP:217.140.96.140; IPV:CAL; SCL:-1; CTRY:GB; EFV:NLI; SFV:NSPM; SFS:(10009020)(6009001)(7916002)(2980300002)(438002)(199003)(377424004)(189002)(4326007)(104016004)(77096005)(11100500001)(110136002)(586003)(450100001)(6806005)(2351001)(229853001)(106466001)(36756003)(568964002)(189998001)(5890100001)(26826002)(7696003)(5003600100003)(2476003)(512874002)(87936001)(84326002)(19580395003)(246002)(50986999)(356003)(19580405001)(8936002)(7846002)(2906002)(92566002)(4610100001)(8676002)(305945005)(86362001)(33646002)(50226002); DIR:OUT; SFP:1101; SCL:1; SRVR:VI1PR0801MB1951; H:nebula.arm.com; FPR:; SPF:Pass; PTR:fw-tnat.cambridge.arm.com; A:1; MX:1; LANG:en; X-Microsoft-Exchange-Diagnostics: 1; AM1FFO11FD028; 1:NcH1yw3+Rjj9E6Tg5gViPSGp15tinVTHYHvC/pZdN27EZRkBkonl5Y7xDRoh1/8s4srKflzgOfXokP2+q4dNqHxX880iJIiiDkwsFa2DGufIpTSt86XI16oJkseSzgfwSixn8/3IDM7sxmdbLev1c5R+d59MhBwhs3qe9FiHQZLk5Wn5p1e/4/tbiU1bkmoeUjoGOnvsO3GMAkxSqd9xkKFweXWvJe/Ac5s0o8ttFTXJuUnb50d2cGfkBrTMStTLHTVbD84TRpbjoRyYKCyX8fOh8SgthbMyeD/htxPx2J5gekTNk9rAOdO7J2RKF1r+tFKkIovAOWvKSAdHw1I+vTi+FcgP2krKQF8HZ8wPERl36O1+v7yaP7rcnEL7nsZrDCqfXgdziaI9MYsZQZOKH34V7jCQE7bJpRoifvYVnBIDEuUswVDkZ+5g0zioyq5TiDBPxviThJuiDZzed99mGPn+m0pgzYCwcuQeErqQ1WEc5xv2ELqKnmcSeHOSLRIwIlVfnCKREmaY2H/+8wta+C3rtqw7v49Cs6kAlp/rJIIe2+5vDy2Owxm5Y3KfkgB6 X-MS-Office365-Filtering-Correlation-Id: f37daca8-1dd2-4b7d-74d5-08d3b55c7a95 X-Microsoft-Exchange-Diagnostics: 1; VI1PR0801MB1951; 2:aessiLZur7z2Y/nOo9kqXNuDpl9JqFIDzsfxzFn3dXx0Gzh3WybDCf0Xbyt5xLPOh8fIbGxGhsQO7JqZXnEg39S3AXZfvK6/xm0QZxSSfQvHWfIY88ggL71ofB1b5oZEqwcedB9Fjuk0wbYTH0LQL1BZ1xs5ACrV3wiU66RvYIFQfp0x02YZ4Yv6XHiD6sgk; 3:q7U07vRg8IntltZ7aoYEbW/MR1VS8XsYykAnNbA94tiK+oWCwvzWHZXTUpkNNa2kT1ZJ/GjbstDiQTb3qkDic8pSyE2GAru1vQKLJNwiLfmvpUyRQ/nKj/M8bO8DD4NDwOok8pGWPlR8ghJHwm+wuuLiFwntBTWXBEEEm4nc5pTMsut5jcZvcXEahQLOVPuB9ysQrd2RABIB7N8/hnWuSHn4vu2Zhje23AA2mBspF8dYCrGdYG/bvhHyexEMPPBo7XOBpre8zJO7O6NRnuZW2g== X-Microsoft-Antispam: UriScan:; BCL:0; PCL:0; RULEID:(8251501002); SRVR:VI1PR0801MB1951; X-Microsoft-Exchange-Diagnostics: 1; VI1PR0801MB1951; 25:GxjotbuZSrTbR+vBvAuO1ypMjBk0E/lPE+pkLxs5Nio894a9Rb2VpbO5FRtXuZ/0OnnXFeCxyc5mHCHFK2UEL+hPE8L7w88eZZDFlKg0yFO9UxIGntp9Nv9FKRLuA4uVdIzo+f7XDBrfb7JcyNgxPzojMaQUgJ67rn/9hwwSrDPWIFwTsuatU4xUDyMLD43Es/YA+3FxnKEN8Q0opLfGMiFTtPBBljPo16rAxNXRcNVMMuEuFJ1hI4+04IrUi4Xqx9UOv1f//E+kltzU1857ZrWzGqksX55Qymqmhnz/NwULdflgpmn5oC/0Nzy6hVFwpanNLmwrQkub6ZMz8i84E7N3Q0BUtHu8Y8idibTQDzdMYQ8WRvcq/spt7Z4GQqpA+63Lh6IqZzMSHRo89dwROKtpgnm4HnTP5y0G1f0+thFTAzFRaj3MhHRe4nnP5rG9eaypFjNAKQD05LocfNmEmaaFV4qiKrOdJsHTGLWUg+SvqjNutIGAO+vIv1VSX6Pf; 31:TgWueghYZkxMOJA8tnnTW04ccWDm+OQe8T7M7Hjjz7ZfvfLYJF+UvsCwkEP71jcz92utKfq+GiT03hh/NrtV75UJiF7JEaQNEH0s7sdSLBjaULQTnXvZPjfnbYn4xgphe4tR0qIw+c291ktR0gbKfh6pbuiBMxzsH3ZZzEMsLwn4gNNUxTAHTb9oWqdnVcleNUGt5/Oi6bSc/tggQo28BA== NoDisclaimer: True X-Microsoft-Exchange-Diagnostics: 1; VI1PR0801MB1951; 20:f/Eub2e+XYAkEzVMvKb6+4i3zu4YipBa34g9knBwQJ/ltFq9zQpjRYD5yhXRtwmGq5R+9Pw3beckqzRnJiUpaFwX5CoOqHwJSbU2BHcn6KfyHUjyNJyNaz+nIZ6mr5K7UuFXvHVUTFtEVuyu5feCLzzWkiwGAPPzAAaqNw0iKNTTMALcieWF/aVIpWYD/T8EYSX6okqElO1c5W/9yVe5UngwRaR9agev6Ga6Joegctdsr3mgtL6z3yhhW1jyU9dG; 4:iy0eQJP4yuJHeIkS+rNpFilkCRw1QrVPXfGxr7eJYffhuceiIfG57NgO6az0/DH+jNLZ3T99rUW3FRGybjbQZxz7UWGu5FlG5MdSiMI070ijMQFJ50CfAYT//e1yiqyPJqli2VZ/ljQS8l/E03WIy9mFBqU1rzdjLVrwMxduQ2DT9PDu694iMBP1TsXA7F3w8BiLPpkW2LfbcVf41AzlhPdNtN0JESQNXq0l3VI3ftLw6ArqFOOveDdV+bfmmERh6kAUK53tdMd/Op93r5lbDAwOppewCwr39cWC1pAYqNJ9RCUVIlmZkWSbfrZH1i5f89nBDugACWGBNaGShvWTwm6aZNfIdQvLsXm4bHwlBVp/HYNDtXaTdKVaYWe5Jt05d+XF812G8kJpz66++8+oO7zm/YQEI/ZPyBtqxdcdOZziPujQY6h6Ri5YYq6pV5aCQ4GB2pMggrW3DUHcayOpnz8pRDuDEH0PY9JlhPBcpwHxpYUskRsbzJbZGjslYl+f5gzKaaEhDm8Th7/er08IPw== X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:(180628864354917); X-Exchange-Antispam-Report-CFA-Test: BCL:0; PCL:0; RULEID:(102415321)(601004)(2401047)(5005006)(8121501046)(13013025)(13023025)(13024025)(13020025)(3002001)(10201501046)(6055026); SRVR:VI1PR0801MB1951; BCL:0; PCL:0; RULEID:; SRVR:VI1PR0801MB1951; X-Forefront-PRVS: 00159D1518 X-Microsoft-Exchange-Diagnostics: =?us-ascii?Q?1; VI1PR0801MB1951; 23:GBidCIk9xr4inFX8Xz7o8trJRB9IDTgTqxPNs+0?= =?us-ascii?Q?tEK6nmGdjkYxlh/2dWkO+vOolPReDURo117GOdPjSNqVfxNcapioExlABOQM?= =?us-ascii?Q?XeWNfXeqsbkenHy7km9g/gbXmbhfq+CTvP9TwDs6Lk5CzSg2eOuIHgz/9tLY?= =?us-ascii?Q?bvycJN1cosKJUwsHsZFVMU87hdMIjOCUKPH2LY6z5nJKeBCqc85cqJzMIghw?= =?us-ascii?Q?U4yEUQ/UvUQD5sYwOvZ2Biu8+2oovUKN+TJSVosr8YoSCMonSufb9yYi1WE+?= =?us-ascii?Q?l8XwenS2VNYkJIifyv3B99HZR6Y1zRTJBGk05+13CWyMTuhiui8cw14nUtAL?= =?us-ascii?Q?dPlU/r/CMw6tkZfn+9RywXa/Jh1NqTRoSu1EYyMRpMX7XDyoAgzA3R14fbhn?= =?us-ascii?Q?/OzPeFmZvNRCa63g/9houGnjKet/yfqto3Q7c71uNsuPWKrWYR0RWfX0Vzis?= =?us-ascii?Q?d9XvVxPTCtoposYIt7lgSVwWbT8INYCuE+UAbWZ9lHqucC7G5qJHAaBA1Qei?= =?us-ascii?Q?BjPxcguuVpQCyz/6g3iY0nZ+Uhc24gB40ztnxzo2Jpvwrj5nCltIfq7pBwRC?= =?us-ascii?Q?Pzo4egoMb0Hu85B+9XlSt95IfeXP+sP7Grb9by9qy6YZ2EDU1EuiGg+CKNuj?= =?us-ascii?Q?jNUGsNUoaG4+BoygXDVubrEPiq3D+Zbc/KL+x8U4qIhEyDEJUvOFj3deCl2/?= =?us-ascii?Q?a3DW6i7+pqL1jn0rBdy5u9sK2VvCPAcVoF8puUaQbY+6evjMlKMXrp+Tn2OE?= =?us-ascii?Q?4DZZ8YkvYhwMDG80cyQA2/OQeIVTrhT7mJ5MdimOrQk9tBXAoPGuR+A8VMvo?= =?us-ascii?Q?04GPakosB/RngQ7tl7coELluGWhSGYecMdVBxti2if339gf54MiDmVzqYtB4?= =?us-ascii?Q?ifPJf5iNvlEZWpxmg0NH0iRmSe2mQanFFRe6TRW+Lp9p4QBwWXQz6fDNNJyd?= =?us-ascii?Q?cqQ9oB9Du1UwrEiS2ahCZ1sgZ7yWfiYiBIignMLxqeQUGwfddDLAkdqUv/ke?= =?us-ascii?Q?b5UcXLtErQbHGoDbQlxHlOwaI3OemEfin8ublyNNtIOyFMA60HCTZGXbwRO0?= =?us-ascii?Q?v6AUgyCExQ9MwdzxM9qDYXbgLkX83UNYpWzZ8k+XoKK5kgBYZ9JTVLDFYuI6?= =?us-ascii?Q?9AX8qFxhH1p8=3D?= X-Microsoft-Exchange-Diagnostics: 1; VI1PR0801MB1951; 6:QYurilZD0KvA24AbfYyXo9uDUt9irujD8yilAdY8t3ppWLGHGAN+xYVeWoRh9XDzxcUbiADzAV95orq7xisNJjDMC7t3vT7DJbpWzs4xgRfmI5UqEAcVYaol2zW67HQU/AnUCRygFxGlB7Is6bFzDTAbo7lxIQvEdjr+1m+7hPszzDgKqtm2i1rHenQeIUgjlf8dp3jrHwqIXOIynnDIiw7nnlME53oIxO0x5uR7Trh6ZqrdYXFPE6mpMUTLcdYEF4YIzy1y8xymFvirHSEqzTtH6UMzNTiym9C7hGUb9XN6gWYeyYpJsUqKYRTaPFMZxoRqRUbV+ZyEksnvuS5IAA==; 5:q8e1qjp1KkWUlyC6FGMThC2rcJpdFUdAZKcFyA6cyWroEaVReJQ4lzItf6hsh7iwBnPyx5AmwtMag0/wYHBINp3cWvkP657yd166DLcOmRbjT7fPuWalI4BYN28MWQbnpfVuxP4Uy7vtQIjWLFVxxw==; 24:u6VdL+FSoHzpk4vwDuCMU2D/gu/1+1pQh9N8l44nBFkuivDd+XS+e5TEAys3ZMa+M/2Flr1ei385dvo9Dd0ABgfsJF0LZ1w1pgrBo+zOmFE=; 7:sh9/HVqBOKrsAHGE+L+9yeMFcMkNqSwOXzAoz/yjEYMsQF9jS/71AdZivqBaWWE252RJGxL0sjRiozw7I0ARbW9VdGlFWVdaNV2uBbVP4OcUUU7qTdBL10f6E3jkLNW/IRdbmTeGnWrDYz65YaAnOHF6jymibR7REbtLXsdYttdlbKMu7sbJEYNJ8xdL1SHnEtk0+njQWeKeysYh8kXRYKazDQ6KusxrwRIdG5Jeu5uw3r9mbeDy9K/qZmydrHM+ SpamDiagnosticOutput: 1:99 SpamDiagnosticMetadata: NSPM X-Microsoft-Exchange-Diagnostics: 1; VI1PR0801MB1951; 20:HyoMxtVxy82hRqQ4HbeXiDk4gCHA9w9ppNXEOw1+483AUJ2xwtiXaj9sKO8HEu2RdoM985pPuEq+zwOKLply0hJjq2lNckzHlBNJB/QIZsJUyATuPFxkH5nFTuasFQ2osye1hIkKoTrsNMZveu7iklrjOLb8nTklcsEk9qz/HQU= X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 26 Jul 2016 13:55:18.5265 (UTC) X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[217.140.96.140]; Helo=[nebula.arm.com] X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR0801MB1951 X-MC-Unique: KgOtx8QmNC24usndNC-uBw-1 X-IsSubscribed: yes Hi, It looks like we've not been handling structures of 16-bit floating-point data correctly for AArch64. For some reason we end up passing them packed in to integer registers. That is to say, on trunk and GCC 6, for: struct x { __fp16 x[4]; }; __fp16 foo1 (struct x x) { return x.x[1]; } We generate: foo1: sbfx x0, x0, 16, 16 mov v0.h[0], w0 ret Which is wrong. This patch fixes that, so now we generate: foo1: umov w0, v1.h[0] sxth x0, w0 mov v0.h[0], w0 ret Far from optimal (I'll work on that...) but at least getting the data from the right register bank! To do this we need to keep around a reference to the fp16 type after we construct it. I've moved this initialisation to a new function aarch64_init_fp16_types in aarch64-builtins.c and made the references available through arm_neon.h. After that, we want to remove the #if 0 wrapping HFmode support in aarch64_gimplify_va_arg_expr in aarch64.c, and add HFmode to the REAL_TYPE and COMPLEX_TYPE support in aapcs_vfp_sub_candidate. Strictly speaking, we don't need the hunk regarding COMPLEX_TYPE. We can't build complex forms of __fp16. But, were we ever to support the _Float16 type we'd need this. Rather than leave the chance it will be forgotten about, I've just added it here. If the maintainers would prefer, I can change this to a TODO and put a sticky-note somewhere near my desk. With those simple changes, we fix the argument passing. The rest of the patch is an update to the various testcases in aapcs64.exp to fully cover various __fp16 cases (both naked, and within an HFA). Bootstrapped on aarch64-none-linux-gnu and tested with no issues. Also tested on aarch64_be-none-elf. All test came back clean. OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6, though it will apply cleanly there if the maintainers support that. Thanks, James --- gcc/ 2016-07-26 James Greenhalgh * config/aarch64/aarch64.h (aarch64_fp16_type_node): Declare. (aarch64_fp16_ptr_type_node): Likewise. * config/aarch64/aarch64-simd-builtins.c (aarch64_fp16_ptr_type_node): Define. (aarch64_init_fp16_types): New, refactored out of... (aarch64_init_builtins): ...here, update to call aarch64_init_fp16_types. * config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr): Handle HFmode. (aapcs_vfp_sub_candidate): Likewise. gcc/testsuite/ 2016-07-26 James Greenhalgh * gcc.target/aarch64/aapcs64/abitest-common.h: Define half-precision registers. * gcc.target/aarch64/aapcs64/abitest.S (dumpregs): Add assembly for saving the half-precision registers. * gcc.target/aarch64/aapcs64/func-ret-1.c: Test that an __fp16 value is returned in h0. * gcc.target/aarch64/aapcs64/test_2.c: Check that __FP16 arguments are passed in FP/SIMD registers. * gcc.target/aarch64/aapcs64/test_27.c: New, test that __fp16 HFA passing works corrcetly. * gcc.target/aarch64/aapcs64/type-def.h (hfa_f16x1_t): New. (hfa_f16x2_t): Likewise. (hfa_f16x3_t): Likewise. * gcc.target/aarch64/aapcs64/va_arg-1.c: Check that __fp16 values are promoted to double and passed in a double register. * gcc.target/aarch64/aapcs64/va_arg-2.c: Check that __fp16 values are promoted to double and stacked. * gcc.target/aarch64/aapcs64/va_arg-4.c: Check stacking of HFA of __fp16 data types. * gcc.target/aarch64/aapcs64/va_arg-5.c: Likewise. * gcc.target/aarch64/aapcs64/va_arg-16.c: New, check HFAs of __fp16 first get passed in FP/SIMD registers, then stacked. diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index ca91d91..1de325a 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -443,13 +443,15 @@ static struct aarch64_simd_type_info aarch64_simd_types [] = { }; #undef ENTRY -/* This type is not SIMD-specific; it is the user-visible __fp16. */ -static tree aarch64_fp16_type_node = NULL_TREE; - static tree aarch64_simd_intOI_type_node = NULL_TREE; static tree aarch64_simd_intCI_type_node = NULL_TREE; static tree aarch64_simd_intXI_type_node = NULL_TREE; +/* The user-visible __fp16 type, and a pointer to that type. Used + across the back-end. */ +tree aarch64_fp16_type_node = NULL_TREE; +tree aarch64_fp16_ptr_type_node = NULL_TREE; + static const char * aarch64_mangle_builtin_scalar_type (const_tree type) { @@ -883,6 +885,21 @@ aarch64_init_builtin_rsqrt (void) } } +/* Initialize the backend types that support the user-visible __fp16 + type, also initialize a pointer to that type, to be used when + forming HFAs. */ + +static void +aarch64_init_fp16_types (void) +{ + aarch64_fp16_type_node = make_node (REAL_TYPE); + TYPE_PRECISION (aarch64_fp16_type_node) = 16; + layout_type (aarch64_fp16_type_node); + + (*lang_hooks.types.register_builtin_type) (aarch64_fp16_type_node, "__fp16"); + aarch64_fp16_ptr_type_node = build_pointer_type (aarch64_fp16_type_node); +} + void aarch64_init_builtins (void) { @@ -904,11 +921,7 @@ aarch64_init_builtins (void) = add_builtin_function ("__builtin_aarch64_set_fpsr", ftype_set_fpr, AARCH64_BUILTIN_SET_FPSR, BUILT_IN_MD, NULL, NULL_TREE); - aarch64_fp16_type_node = make_node (REAL_TYPE); - TYPE_PRECISION (aarch64_fp16_type_node) = 16; - layout_type (aarch64_fp16_type_node); - - (*lang_hooks.types.register_builtin_type) (aarch64_fp16_type_node, "__fp16"); + aarch64_init_fp16_types (); if (TARGET_SIMD) aarch64_init_simd_builtins (); diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index fe2683e..addcf2c 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -9888,15 +9888,10 @@ aarch64_gimplify_va_arg_expr (tree valist, tree type, gimple_seq *pre_p, field_t = long_double_type_node; field_ptr_t = long_double_ptr_type_node; break; -/* The half precision and quad precision are not fully supported yet. Enable - the following code after the support is complete. Need to find the correct - type node for __fp16 *. */ -#if 0 case HFmode: - field_t = float_type_node; - field_ptr_t = float_ptr_type_node; + field_t = aarch64_fp16_type_node; + field_ptr_t = aarch64_fp16_ptr_type_node; break; -#endif case V2SImode: case V4SImode: { @@ -10058,7 +10053,8 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep) { case REAL_TYPE: mode = TYPE_MODE (type); - if (mode != DFmode && mode != SFmode && mode != TFmode) + if (mode != DFmode && mode != SFmode + && mode != TFmode && mode != HFmode) return -1; if (*modep == VOIDmode) @@ -10071,7 +10067,8 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep) case COMPLEX_TYPE: mode = TYPE_MODE (TREE_TYPE (type)); - if (mode != DFmode && mode != SFmode && mode != TFmode) + if (mode != DFmode && mode != SFmode + && mode != TFmode && mode != HFmode) return -1; if (*modep == VOIDmode) diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 1915980..9e26eb1 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -928,4 +928,9 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); #define ASM_OUTPUT_POOL_EPILOGUE aarch64_asm_output_pool_epilogue +/* This type is the user-visible __fp16, and a pointer to that type. We + need it in many places in the backend. Defined in aarch64-builtins.c. */ +extern tree aarch64_fp16_type_node; +extern tree aarch64_fp16_ptr_type_node; + #endif /* GCC_AARCH64_H */ diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-common.h b/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-common.h index 4e2ef0d..138de73 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-common.h +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-common.h @@ -57,7 +57,17 @@ #define X8 320 #define X9 328 -#define STACK 336 +#define H0 336 +#define H1 338 +#define H2 340 +#define H3 342 +#define H4 344 +#define H5 346 +#define H6 348 +#define H7 350 + + +#define STACK 352 /* The type of test. 'myfunc' in abitest.S needs to know which kind of test it is running to decide what to do at the runtime. Keep the diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest.S b/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest.S index c2fbd83..893e68c 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest.S +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest.S @@ -13,7 +13,12 @@ dumpregs: myfunc: mov x16, sp mov x17, sp - sub sp, sp, 352 // 336 for registers and 16 for old sp and lr + sub sp, sp, 368 // 352 for registers and 16 for old sp and lr + + sub x17, x17, 8 + st4 { v4.h, v5.h, v6.h, v7.h }[0], [x17] //344 + sub x17, x17, 8 + st4 { v0.h, v1.h, v2.h, v3.h }[0], [x17] //336 stp x8, x9, [x17, #-16]! //320 diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-1.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-1.c index a21c926..29a1ca6 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-1.c +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-1.c @@ -44,4 +44,5 @@ FUNC_VAL_CHECK (12, vf2_t, vf2, D0, f32in64) FUNC_VAL_CHECK (13, vi4_t, vi4, Q0, i32in128) FUNC_VAL_CHECK (14, int *, int_ptr, X0, flat) FUNC_VAL_CHECK (15, vlf1_t, vlf1, Q0, flat) +FUNC_VAL_CHECK (16, __fp16, 0xabcd, H0, flat) #endif diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/test_2.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/test_2.c index 94817ed..ce7c60a8 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/test_2.c +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/test_2.c @@ -12,5 +12,6 @@ ARG(double, 4.0, D1) ARG(float, 2.0f, S2) ARG(double, 5.0, D3) + ARG(__fp16, 8.0f, H4) LAST_ARG(int, 3, W0) #endif diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/test_27.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/test_27.c new file mode 100644 index 0000000..7bc79f5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/test_27.c @@ -0,0 +1,46 @@ +/* Test AAPCS64 layout + + Test named homogeneous floating-point aggregates of __fp16 data, + which should be passed in SIMD/FP registers or via the stack. */ + +/* { dg-do run { target aarch64*-*-* } } */ + +#ifndef IN_FRAMEWORK +#define TESTFILE "test_27.c" + +struct x0 +{ + __fp16 v[1]; +} f16x1; + +struct x1 +{ + __fp16 v[2]; +} f16x2; + +struct x2 +{ + __fp16 v[3]; +} f16x3; + +#define HAS_DATA_INIT_FUNC +void init_data () +{ + f16x1.v[0] = 2.0f; + f16x2.v[0] = 4.0f; + f16x2.v[1] = 8.0f; + f16x3.v[0] = 16.0f; + f16x3.v[1] = 32.0f; + f16x3.v[2] = 64.0f; +} + +#include "abitest.h" +#else +ARG (struct x0, f16x1, H0) +ARG (struct x1, f16x2, H1) +ARG (struct x2, f16x3, H3) +ARG (struct x1, f16x2, H6) +ARG (struct x0, f16x1, STACK) +ARG (int, 0xdeadbeef, W0) +LAST_ARG (double, 456.789, STACK+8) +#endif diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h b/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h index 3b9b349..ca1fa58 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h @@ -44,6 +44,24 @@ struct hfa_fx3_t float c; }; +struct hfa_f16x1_t +{ + __fp16 a; +}; + +struct hfa_f16x2_t +{ + __fp16 a; + __fp16 b; +}; + +struct hfa_f16x3_t +{ + __fp16 a; + __fp16 b; + __fp16 c; +}; + struct hfa_dx2_t { double a; diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-1.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-1.c index 4fb9a03..5b9e057 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-1.c +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-1.c @@ -19,6 +19,8 @@ signed short ss = 0xcba9; signed int ss_promoted = 0xffffcba9; float fp = 65432.12345f; double fp_promoted = (double)65432.12345f; +__fp16 fp16 = 2.0f; +__fp16 fp16_promoted = (double)2.0f; #define HAS_DATA_INIT_FUNC void init_data () @@ -46,9 +48,13 @@ void init_data () ANON ( long double , 98765432123456789.987654321L, Q2, 12) ANON ( vf2_t, vf2 , D3, 13) ANON ( vi4_t, vi4 , Q4, 14) + /* 7.2: For unprototyped (i.e. pre- ANSI or K&R C) and variadic functions, + in addition to the normal conversions and promotions, arguments of + type __fp16 are converted to type double. */ + ANON_PROMOTED( __fp16, fp16 , double, fp16_promoted, D5, 15) #ifndef __AAPCS64_BIG_ENDIAN__ - LAST_ANON ( int , 0xeeee, STACK+32,15) + LAST_ANON ( int , 0xeeee, STACK+32,16) #else - LAST_ANON ( int , 0xeeee, STACK+36,15) + LAST_ANON ( int , 0xeeee, STACK+36,16) #endif #endif diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-16.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-16.c new file mode 100644 index 0000000..73f8f1c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-16.c @@ -0,0 +1,28 @@ +/* Test AAPCS64 layout and __builtin_va_arg. + + This test is focused particularly on __fp16 unnamed homogeneous + floating-point aggregate types which should be passed in fp/simd + registers until we run out of those, then the stack. */ + +/* { dg-do run { target aarch64*-*-* } } */ + +#ifndef IN_FRAMEWORK +#define AAPCS64_TEST_STDARG +#define TESTFILE "va_arg-16.c" +#include "type-def.h" + +struct hfa_f16x1_t hfa_f16x1 = {2.0f}; +struct hfa_f16x2_t hfa_f16x2 = {4.0f, 8.0f}; +struct hfa_f16x3_t hfa_f16x3 = {16.0f, 32.0f, 64.0f}; + +#include "abitest.h" +#else + ARG (int, 1, W0, LAST_NAMED_ARG_ID) + DOTS + ANON (struct hfa_f16x1_t, hfa_f16x1, H0 , 0) + ANON (struct hfa_f16x2_t, hfa_f16x2, H1 , 1) + ANON (struct hfa_f16x3_t, hfa_f16x3, H3 , 2) + ANON (struct hfa_f16x2_t, hfa_f16x2, H6 , 3) + ANON (struct hfa_f16x1_t, hfa_f16x1, STACK , 4) + LAST_ANON(double , 1.0 , STACK+8, 5) +#endif diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-2.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-2.c index e972691..8f2f881 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-2.c +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-2.c @@ -19,6 +19,8 @@ signed short ss = 0xcba9; signed int ss_promoted = 0xffffcba9; float fp = 65432.12345f; double fp_promoted = (double)65432.12345f; +__fp16 fp16 = 2.0f; +__fp16 fp16_promoted = (double)2.0f; #define HAS_DATA_INIT_FUNC void init_data () @@ -64,9 +66,10 @@ void init_data () ANON ( long double , 98765432123456789.987654321L, STACK+80, 20) ANON ( vf2_t, vf2 , STACK+96, 21) ANON ( vi4_t, vi4 , STACK+112,22) + ANON_PROMOTED( __fp16 , fp16 , double, fp16_promoted, STACK+128,23) #ifndef __AAPCS64_BIG_ENDIAN__ - LAST_ANON ( int , 0xeeee, STACK+128,23) + LAST_ANON ( int , 0xeeee, STACK+136,24) #else - LAST_ANON ( int , 0xeeee, STACK+132,23) + LAST_ANON ( int , 0xeeee, STACK+140,24) #endif #endif diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-4.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-4.c index fab3575..010ad8b 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-4.c +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-4.c @@ -29,6 +29,8 @@ struct non_hfa_ffvf2_t non_hfa_ffvf2; struct non_hfa_fffd_t non_hfa_fffd = {33.f, 34.f, 35.f, 36.0}; union hfa_union_t hfa_union; union non_hfa_union_t non_hfa_union; +struct hfa_f16x2_t hfa_f16x2 = {2.0f, 4.0f}; +struct hfa_f16x3_t hfa_f16x3 = {2.0f, 4.0f, 8.0f}; #define HAS_DATA_INIT_FUNC void init_data () @@ -89,9 +91,12 @@ void init_data () PTR_ANON (struct non_hfa_ffs_t , non_hfa_ffs , STACK+120, 18) ANON (struct non_hfa_ffs_2_t, non_hfa_ffs_2, STACK+128, 19) ANON (union non_hfa_union_t, non_hfa_union, STACK+144, 20) + /* HFA of __fp16 passed on stack, directed __fp16 test is va_arg-10.c. */ + ANON (struct hfa_f16x2_t , hfa_f16x2 , STACK+152, 21) + ANON (struct hfa_f16x3_t , hfa_f16x3 , STACK+160, 22) #ifndef __AAPCS64_BIG_ENDIAN__ - LAST_ANON(int , 2 , STACK+152, 30) + LAST_ANON(int , 2 , STACK+168, 30) #else - LAST_ANON(int , 2 , STACK+156, 30) + LAST_ANON(int , 2 , STACK+172, 30) #endif #endif diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-5.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-5.c index 4853f92..e54f1f5 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-5.c +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-5.c @@ -17,6 +17,8 @@ struct hfa_dx4_t hfa_dx4 = {1234.123, 2345.234, 3456.345, 4567.456}; struct hfa_ldx3_t hfa_ldx3 = {123456.7890, 234567.8901, 345678.9012}; struct hfa_ffs_t hfa_ffs; union hfa_union_t hfa_union; +struct hfa_f16x2_t hfa_f16x2 = {2.0f, 4.0f}; +struct hfa_f16x3_t hfa_f16x3 = {2.0f, 4.0f, 8.0f}; #define HAS_DATA_INIT_FUNC void init_data () @@ -43,5 +45,8 @@ void init_data () ANON (struct hfa_fx1_t , hfa_fx1 , STACK+24, 4) ANON (struct hfa_fx2_t , hfa_fx2 , STACK+32, 5) ANON (struct hfa_dx2_t , hfa_dx2 , STACK+40, 6) - LAST_ANON(double , 1.0 , STACK+56, 7) + /* HFA of __fp16 passed on stack, directed __fp16 test is va_arg-10.c. */ + ANON (struct hfa_f16x2_t, hfa_f16x2, STACK+56, 7) + ANON (struct hfa_f16x3_t, hfa_f16x3, STACK+64, 8) + LAST_ANON(double , 1.0 , STACK+72, 9) #endif