From patchwork Mon Jan 11 11:53:39 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Greenhalgh
X-Patchwork-Id: 59457
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: James Greenhalgh
Subject: [Patch AArch64] Use software sqrt expansion always for
 -mlow-precision-recip-sqrt
Date: Mon, 11 Jan 2016 11:53:39 +0000
Message-ID: <1452513219-25168-1-git-send-email-james.greenhalgh@arm.com>

Hi,

I'd like to switch the logic around in aarch64.c such that
-mlow-precision-recip-sqrt causes us to always emit the low-precision
software expansion for reciprocal square root. I have two reasons to do
this; first is consistency across -mcpu targets, second is enabling more
-mcpu targets to use the flag for peak tuning.

I don't much like that the precision we use for -mlow-precision-recip-sqrt
differs between cores (and possibly compiler revisions). Yes, we're
under -ffast-math but I take this flag to mean the user explicitly wants
the low-precision expansion, and we should not diverge from that based on
an internal decision as to what is optimal for performance in the
high-precision case. I'd prefer to keep things as predictable as possible,
and here that means always emitting the low-precision expansion when asked.

Judging by the comments in the thread proposing the reciprocal square
root optimisation, this will benefit all cores currently supported by GCC.
To be clear, we would still not expand in the high-precision case for any
cores which do not explicitly ask for it. Currently the cores that do ask
for it are Cortex-A57 and xgene, though I will be proposing a patch to
remove Cortex-A57 from that list shortly.

Which gives my second motivation for this patch. -mlow-precision-recip-sqrt
is intended as a tuning flag for situations where performance is more
important than precision, but the current logic requires setting an
internal flag which also changes the performance characteristics where
high-precision is needed. This conflates two decisions the target might
want to make, and reduces the applicability of an option targets might
want to enable for performance.
In particular, I'd still like to see -mlow-precision-recip-sqrt continue
to emit the cheaper, low-precision sequence for floats under Cortex-A57.

Based on that reasoning, this patch makes the appropriate change to the
logic. I've checked with the current -mcpu values to ensure that behaviour
without -mlow-precision-recip-sqrt does not change, and that behaviour
with -mlow-precision-recip-sqrt is to emit the low precision sequences.

I've also put this through bootstrap and test on aarch64-none-linux-gnu
with no issues.

OK?

Thanks,
James

---
2015-12-10  James Greenhalgh

    * config/aarch64/aarch64.c (use_rsqrt_p): Always use software
    reciprocal sqrt for -mlow-precision-recip-sqrt.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9142ac0..1d5d898 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7485,8 +7485,9 @@ use_rsqrt_p (void)
 {
   return (!flag_trapping_math
           && flag_unsafe_math_optimizations
-          && (aarch64_tune_params.extra_tuning_flags
-              & AARCH64_EXTRA_TUNE_RECIP_SQRT));
+          && ((aarch64_tune_params.extra_tuning_flags
+               & AARCH64_EXTRA_TUNE_RECIP_SQRT)
+              || flag_mrecip_low_precision_sqrt));
 }
 
 /* Function to decide when to use
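
For illustration only, the logic change can be modelled outside GCC as two
boolean predicates. This is a rough sketch, not code from the patch: the
struct rsqrt_opts type and the functions use_rsqrt_old, use_rsqrt_new and
count_differences are hypothetical stand-ins, and the four fields merely
mirror the flags tested in use_rsqrt_p above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler state read by use_rsqrt_p.
   These are NOT the real GCC globals.  */
struct rsqrt_opts
{
  bool trapping_math;       /* models flag_trapping_math */
  bool unsafe_math;         /* models flag_unsafe_math_optimizations */
  bool tune_recip_sqrt;     /* models the AARCH64_EXTRA_TUNE_RECIP_SQRT bit */
  bool low_precision_sqrt;  /* models flag_mrecip_low_precision_sqrt */
};

/* Before the patch: only the per-core tuning bit enables the expansion.  */
static bool
use_rsqrt_old (const struct rsqrt_opts *o)
{
  return !o->trapping_math && o->unsafe_math && o->tune_recip_sqrt;
}

/* After the patch: the command-line flag enables the expansion as well.  */
static bool
use_rsqrt_new (const struct rsqrt_opts *o)
{
  return (!o->trapping_math && o->unsafe_math
          && (o->tune_recip_sqrt || o->low_precision_sqrt));
}

/* Count how many of the 16 flag combinations change behaviour.  */
static int
count_differences (void)
{
  int n = 0;
  for (int i = 0; i < 16; i++)
    {
      struct rsqrt_opts o = { i & 8, i & 4, i & 2, i & 1 };
      if (use_rsqrt_old (&o) != use_rsqrt_new (&o))
        n++;
    }
  return n;
}
```

In this model the two predicates disagree for exactly one combination:
non-trapping, unsafe math enabled, tuning bit clear, low-precision flag
set. That matches the stated intent that behaviour without
-mlow-precision-recip-sqrt does not change.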