From patchwork Wed Jul 20 09:51:33 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Greenhalgh
X-Patchwork-Id: 72412
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: James Greenhalgh
Subject: [RFC: Patch 2/2 v3] Introduce a new cost model for ifcvt.
Date: Wed, 20 Jul 2016 10:51:33 +0100
Message-ID: <1469008295-28884-2-git-send-email-james.greenhalgh@arm.com>
In-Reply-To: <1469008295-28884-1-git-send-email-james.greenhalgh@arm.com>
References: <0f3c74c6-0c1b-f2df-77ce-a2ffc112583d@redhat.com> <1469008295-28884-1-git-send-email-james.greenhalgh@arm.com>

Hi,

This patch modifies the way we calculate costs in ifcvt.c. Rather than
using a combination of magic numbers and approximations to decide if we
should perform the transformation before constructing the new RTL, we
instead construct the new RTL and use the cost of that to form our cost
model.

We want slightly different behaviour when compiling for speed than what
we want when compiling for size.

For size, we just want to look at what the size of code would have been
before the transformation, and what we plan to generate now. We need a
little bit of guesswork to try to figure what cost to assign to the
compare (for which we don't keep track of the full insn) and branch
(which insn_rtx_cost won't handle), but otherwise the cost model is easy
to calculate.

For speed, we want to use the max_noce_ifcvt_seq_cost hook defined in
patch 1/4. Here we don't care about the original cost; our hook is
defined in terms of how expensive the instructions which are brought
onto the unconditional path are permitted to be. For speed then, we have
a simple numerical comparison between the new cost and the cost returned
by the hook.

To achieve this, first we abstract all the cost logic into
noce_conversion_profitable_p.

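As a rough illustration of that decision (this is not the code from the
patch), the check can be sketched in standalone C. The struct and
function names below are hypothetical stand-ins for noce_if_info and
noce_conversion_profitable_p, and the cost of the new sequence is passed
in as a plain number instead of being computed by seq_cost:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cost-related fields of noce_if_info.
   The real structure in ifcvt.c carries much more state.  */
struct sketch_if_info
{
  bool speed_p;               /* Optimizing the block for speed?  */
  unsigned int original_cost; /* Compare + branch + THEN_BB + ELSE_BB.  */
  unsigned int max_seq_cost;  /* Budget reported by the target hook.  */
};

/* Return true if a replacement sequence costing NEW_COST is acceptable.
   NEW_COST is an assumption of this sketch: it stands in for the value
   the patch obtains from seq_cost (seq, speed_p).  */
bool
sketch_conversion_profitable_p (unsigned int new_cost,
                                const struct sketch_if_info *info)
{
  assert (info != NULL);

  /* For size, the new code must not be larger than what it replaces.  */
  if (!info->speed_p)
    return new_cost <= info->original_cost;

  /* For speed, only the budget from the target hook matters.  */
  return new_cost <= info->max_seq_cost;
}
```

With original_cost = 12 and max_seq_cost = 4, a sequence costing 8 would
be accepted when optimizing for size and rejected when optimizing for
speed, which is the asymmetry described above.
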
To get the size cost logic right, we need a few modifications to the
fields of noce_if_info. We're going to drop "then_cost" and "else_cost",
which will instead be covered by "original_cost", which is the sum of
these costs plus an extra magic COSTS_N_INSNS (2) to cover a compare and
branch. We're going to drop branch_cost, which was used by the old cost
model, and add max_seq_cost, which is defined in the new model. Finally,
we can factor out the repeated calculation of speed_p, and just store it
once in noce_if_info. This last point fixes the inconsistency of which
basic block we check optimize_bb_for_speed_p against.

To build the sum for "original_cost" we need to update
bb_valid_for_noce_process_p such that it adds to the cost pointer it
takes rather than overwriting it.

Having done that, we need to update all the cost models in the file to
check for profitability just after we check that if-conversion has
succeeded.

Finally, we use the params added in 1/4 to allow us to do something
sensible with the testcases that look for if-conversion. With these
tests we only care that the mechanics would work if the cost model were
permissive enough, not that a target has actually set the cost model
high enough, so we just set the parameters to their maximum values.

Bootstrapped on x86-64 and aarch64.

OK?

Thanks,
James

---
gcc/

2016-07-20  James Greenhalgh

	* ifcvt.c (noce_if_info): New fields: speed_p, original_cost,
	max_seq_cost.  Removed fields: then_cost, else_cost, branch_cost.
	(noce_conversion_profitable_p): New.
	(noce_try_store_flag_constants): Use it.
	(noce_try_addcc): Likewise.
	(noce_try_store_flag_mask): Likewise.
	(noce_try_cmove): Likewise.
	(noce_try_cmove_arith): Likewise.
	(bb_valid_for_noce_process_p): Add to the cost parameter rather than
	overwriting it.
	(noce_convert_multiple_sets): Move cost model to here, from...
	(bb_ok_for_noce_convert_multiple_sets) ...here.
	(noce_process_if_block): Update calls for above changes.
	(noce_find_if_block): Record new noce_if_info parameters.

gcc/testsuite/

2016-07-18  James Greenhalgh

	* gcc.dg/ifcvt-2.c: Use parameter to guide if-conversion heuristics.
	* gcc.dg/ifcvt-3.c: Use parameter to guide if-conversion heuristics.
	* gcc.dg/pr68435.c: Use parameter to guide if-conversion heuristics.
	* gcc.dg/ifcvt-4.c: Use parameter to guide if-conversion heuristics.
	* gcc.dg/ifcvt-5.c: Use parameter to guide if-conversion heuristics.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index a92ab6d..4e3d8f3 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -807,12 +807,17 @@ struct noce_if_info
   bool then_simple;
   bool else_simple;
 
-  /* The total rtx cost of the instructions in then_bb and else_bb.  */
-  unsigned int then_cost;
-  unsigned int else_cost;
+  /* True if we're optimizing the control block for speed, false if
+     we're optimizing for size.  */
+  bool speed_p;
 
-  /* Estimated cost of the particular branch instruction.  */
-  unsigned int branch_cost;
+  /* The combined cost of COND, JUMP and the costs for THEN_BB and
+     ELSE_BB.  */
+  unsigned int original_cost;
+
+  /* Maximum permissible cost for the unconditional sequence we should
+     generate to replace this branch.  */
+  unsigned int max_seq_cost;
 
   /* The name of the noce transform that succeeded in if-converting
      this structure.  Used for debugging.  */
@@ -835,6 +840,27 @@ static int noce_try_minmax (struct noce_if_info *);
 static int noce_try_abs (struct noce_if_info *);
 static int noce_try_sign_mask (struct noce_if_info *);
 
+/* Return TRUE if SEQ is a good candidate as a replacement for the
+   if-convertible sequence described in IF_INFO.  */
+
+inline static bool
+noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info)
+{
+  bool speed_p = if_info->speed_p;
+
+  /* Cost up the new sequence.  */
+  unsigned int cost = seq_cost (seq, speed_p);
+
+  /* When compiling for size, we can make a reasonably accurate guess
+     at the size growth.  */
+  if (!speed_p)
+    {
+      return cost <= if_info->original_cost;
+    }
+  else
+    return cost <= if_info->max_seq_cost;
+}
+
 /* Helper function for noce_try_store_flag*.  */
 
 static rtx
@@ -1319,8 +1345,7 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
          registers where we handle overlap below.  */
       && (REG_P (XEXP (a, 0))
           || (noce_operand_ok (XEXP (a, 0))
-              && ! reg_overlap_mentioned_p (if_info->x, XEXP (a, 0))))
-      && if_info->branch_cost >= 2)
+              && ! reg_overlap_mentioned_p (if_info->x, XEXP (a, 0)))))
     {
       common = XEXP (a, 0);
       a = XEXP (a, 1);
@@ -1393,22 +1418,24 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
           else
             gcc_unreachable ();
         }
+      /* Is this (cond) ? 2^n : 0?  */
       else if (ifalse == 0 && exact_log2 (itrue) >= 0
-               && (STORE_FLAG_VALUE == 1
-                   || if_info->branch_cost >= 2))
+               && STORE_FLAG_VALUE == 1)
         normalize = 1;
+      /* Is this (cond) ? 0 : 2^n?  */
       else if (itrue == 0 && exact_log2 (ifalse) >= 0 && can_reverse
-               && (STORE_FLAG_VALUE == 1 || if_info->branch_cost >= 2))
+               && STORE_FLAG_VALUE == 1)
         {
           normalize = 1;
           reversep = true;
         }
+      /* Is this (cond) ? -1 : x?  */
       else if (itrue == -1
-               && (STORE_FLAG_VALUE == -1
-                   || if_info->branch_cost >= 2))
+               && STORE_FLAG_VALUE == -1)
         normalize = -1;
+      /* Is this (cond) ? x : -1?  */
       else if (ifalse == -1 && can_reverse
-               && (STORE_FLAG_VALUE == -1 || if_info->branch_cost >= 2))
+               && STORE_FLAG_VALUE == -1)
         {
           normalize = -1;
           reversep = true;
@@ -1497,7 +1524,7 @@ noce_try_store_flag_constants (struct noce_if_info *if_info)
         noce_emit_move_insn (if_info->x, target);
 
       seq = end_ifcvt_sequence (if_info);
-      if (!seq)
+      if (!seq || !noce_conversion_profitable_p (seq, if_info))
         return FALSE;
 
       emit_insn_before_setloc (seq, if_info->jump,
@@ -1551,7 +1578,7 @@ noce_try_addcc (struct noce_if_info *if_info)
                 noce_emit_move_insn (if_info->x, target);
 
               seq = end_ifcvt_sequence (if_info);
-              if (!seq)
+              if (!seq || !noce_conversion_profitable_p (seq, if_info))
                 return FALSE;
 
               emit_insn_before_setloc (seq, if_info->jump,
@@ -1564,10 +1591,10 @@ noce_try_addcc (struct noce_if_info *if_info)
         }
 
       /* If that fails, construct conditional increment or decrement using
-         setcc.  */
-      if (if_info->branch_cost >= 2
-          && (XEXP (if_info->a, 1) == const1_rtx
-              || XEXP (if_info->a, 1) == constm1_rtx))
+         setcc.  We're changing a branch and an increment to a comparison and
+         an ADD/SUB.  */
+      if (XEXP (if_info->a, 1) == const1_rtx
+          || XEXP (if_info->a, 1) == constm1_rtx)
         {
           start_sequence ();
           if (STORE_FLAG_VALUE == INTVAL (XEXP (if_info->a, 1)))
@@ -1593,7 +1620,7 @@ noce_try_addcc (struct noce_if_info *if_info)
                 noce_emit_move_insn (if_info->x, target);
 
               seq = end_ifcvt_sequence (if_info);
-              if (!seq)
+              if (!seq || !noce_conversion_profitable_p (seq, if_info))
                 return FALSE;
 
               emit_insn_before_setloc (seq, if_info->jump,
@@ -1621,15 +1648,14 @@ noce_try_store_flag_mask (struct noce_if_info *if_info)
     return FALSE;
 
   reversep = 0;
-  if ((if_info->branch_cost >= 2
-       || STORE_FLAG_VALUE == -1)
-      && ((if_info->a == const0_rtx
-           && rtx_equal_p (if_info->b, if_info->x))
-          || ((reversep = (reversed_comparison_code (if_info->cond,
-                                                      if_info->jump)
-                           != UNKNOWN))
-              && if_info->b == const0_rtx
-              && rtx_equal_p (if_info->a, if_info->x))))
+
+  if ((if_info->a == const0_rtx
+       && rtx_equal_p (if_info->b, if_info->x))
+      || ((reversep = (reversed_comparison_code (if_info->cond,
+                                                  if_info->jump)
+                       != UNKNOWN))
+          && if_info->b == const0_rtx
+          && rtx_equal_p (if_info->a, if_info->x)))
     {
       start_sequence ();
       target = noce_emit_store_flag (if_info,
@@ -1643,22 +1669,11 @@ noce_try_store_flag_mask (struct noce_if_info *if_info)
 
       if (target)
         {
-          int old_cost, new_cost, insn_cost;
-          int speed_p;
-
           if (target != if_info->x)
             noce_emit_move_insn (if_info->x, target);
 
           seq = end_ifcvt_sequence (if_info);
-          if (!seq)
-            return FALSE;
-
-          speed_p = optimize_bb_for_speed_p (BLOCK_FOR_INSN (if_info->insn_a));
-          insn_cost = insn_rtx_cost (PATTERN (if_info->insn_a), speed_p);
-          old_cost = COSTS_N_INSNS (if_info->branch_cost) + insn_cost;
-          new_cost = seq_cost (seq, speed_p);
-
-          if (new_cost > old_cost)
+          if (!seq || !noce_conversion_profitable_p (seq, if_info))
             return FALSE;
 
           emit_insn_before_setloc (seq, if_info->jump,
@@ -1827,9 +1842,7 @@ noce_try_cmove (struct noce_if_info *if_info)
          we don't know about, so give them a chance before trying this
          approach.  */
       else if (!targetm.have_conditional_execution ()
-               && CONST_INT_P (if_info->a) && CONST_INT_P (if_info->b)
-               && ((if_info->branch_cost >= 2 && STORE_FLAG_VALUE == -1)
-                   || if_info->branch_cost >= 3))
+               && CONST_INT_P (if_info->a) && CONST_INT_P (if_info->b))
         {
           machine_mode mode = GET_MODE (if_info->x);
           HOST_WIDE_INT ifalse = INTVAL (if_info->a);
@@ -1865,7 +1878,7 @@ noce_try_cmove (struct noce_if_info *if_info)
             noce_emit_move_insn (if_info->x, target);
 
           seq = end_ifcvt_sequence (if_info);
-          if (!seq)
+          if (!seq || !noce_conversion_profitable_p (seq, if_info))
             return FALSE;
 
           emit_insn_before_setloc (seq, if_info->jump,
@@ -2078,11 +2091,9 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
      conditional on their addresses followed by a load.  Don't do this
      early because it'll screw alias analysis.  Note that we've
      already checked for no side effects.  */
-  /* ??? FIXME: Magic number 5.  */
   if (cse_not_expected
       && MEM_P (a) && MEM_P (b)
-      && MEM_ADDR_SPACE (a) == MEM_ADDR_SPACE (b)
-      && if_info->branch_cost >= 5)
+      && MEM_ADDR_SPACE (a) == MEM_ADDR_SPACE (b))
     {
       machine_mode address_mode = get_address_mode (a);
 
@@ -2114,23 +2125,6 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
   if (!can_conditionally_move_p (x_mode))
     return FALSE;
 
-  unsigned int then_cost;
-  unsigned int else_cost;
-  if (insn_a)
-    then_cost = if_info->then_cost;
-  else
-    then_cost = 0;
-
-  if (insn_b)
-    else_cost = if_info->else_cost;
-  else
-    else_cost = 0;
-
-  /* We're going to execute one of the basic blocks anyway, so
-     bail out if the most expensive of the two blocks is unacceptable.  */
-  if (MAX (then_cost, else_cost) > COSTS_N_INSNS (if_info->branch_cost))
-    return FALSE;
-
   /* Possibly rearrange operands to make things come out more natural.  */
   if (reversed_comparison_code (if_info->cond, if_info->jump) != UNKNOWN)
     {
@@ -2319,7 +2313,7 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
     noce_emit_move_insn (x, target);
 
   ifcvt_seq = end_ifcvt_sequence (if_info);
-  if (!ifcvt_seq)
+  if (!ifcvt_seq || !noce_conversion_profitable_p (ifcvt_seq, if_info))
     return FALSE;
 
   emit_insn_before_setloc (ifcvt_seq, if_info->jump,
@@ -2805,7 +2799,7 @@ noce_try_sign_mask (struct noce_if_info *if_info)
        && (if_info->insn_b == NULL_RTX
            || BLOCK_FOR_INSN (if_info->insn_b) == if_info->test_bb));
   if (!(t_unconditional
-        || (set_src_cost (t, mode, optimize_bb_for_speed_p (if_info->test_bb))
+        || (set_src_cost (t, mode, if_info->speed_p)
             < COSTS_N_INSNS (2))))
     return FALSE;
 
@@ -3034,8 +3028,8 @@ contains_mem_rtx_p (rtx x)
    x := a and all previous computations
    in TEST_BB don't produce any values that are live after TEST_BB.
    In other words, all the insns in TEST_BB are there only
-   to compute a value for x.  Put the rtx cost of the insns
-   in TEST_BB into COST.  Record whether TEST_BB is a single simple
+   to compute a value for x.  Add the rtx cost of the insns
+   in TEST_BB to COST.  Record whether TEST_BB is a single simple
    set instruction in SIMPLE_P.  */
 
 static bool
@@ -3067,7 +3061,7 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx cond,
   if (first_insn == last_insn)
     {
       *simple_p = noce_operand_ok (SET_DEST (first_set));
-      *cost = insn_rtx_cost (first_set, speed_p);
+      *cost += insn_rtx_cost (first_set, speed_p);
       return *simple_p;
     }
 
@@ -3114,7 +3108,7 @@ bb_valid_for_noce_process_p (basic_block test_bb, rtx cond,
     goto free_bitmap_and_fail;
 
   BITMAP_FREE (test_bb_temps);
-  *cost = potential_cost;
+  *cost += potential_cost;
   *simple_p = false;
   return true;
 
@@ -3290,9 +3284,15 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   for (int i = 0; i < count; i++)
     noce_emit_move_insn (targets[i], temporaries[i]);
 
-  /* Actually emit the sequence.  */
+  /* Actually emit the sequence if it isn't too expensive.  */
   rtx_insn *seq = get_insns ();
 
+  if (!noce_conversion_profitable_p (seq, if_info))
+    {
+      end_sequence ();
+      return FALSE;
+    }
+
   for (insn = seq; insn; insn = NEXT_INSN (insn))
     set_used_flags (insn);
 
@@ -3342,22 +3342,16 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 
 /* Return true iff basic block TEST_BB is comprised of only
    (SET (REG) (REG)) insns suitable for conversion to a series
-   of conditional moves.  FORNOW: Use II to find the expected cost of
-   the branch into/over TEST_BB.
-
-   TODO: This creates an implicit "magic number" for branch_cost.
-   II->branch_cost now guides the maximum number of set instructions in
-   a basic block which is considered profitable to completely
-   if-convert.  */
+   of conditional moves.  Also check that we have more than one set
+   (other routines can handle a single set better than we would), and
+   no more than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  */
 
 static bool
-bb_ok_for_noce_convert_multiple_sets (basic_block test_bb,
-                                      struct noce_if_info *ii)
+bb_ok_for_noce_convert_multiple_sets (basic_block test_bb)
 {
   rtx_insn *insn;
   unsigned count = 0;
   unsigned param = PARAM_VALUE (PARAM_MAX_RTL_IF_CONVERSION_INSNS);
-  unsigned limit = MIN (ii->branch_cost, param);
 
   FOR_BB_INSNS (test_bb, insn)
     {
@@ -3393,14 +3387,15 @@ bb_ok_for_noce_convert_multiple_sets (basic_block test_bb,
       if (!can_conditionally_move_p (GET_MODE (dest)))
         return false;
 
-      /* FORNOW: Our cost model is a count of the number of instructions we
-         would if-convert.  This is suboptimal, and should be improved as part
-         of a wider rework of branch_cost.  */
-      if (++count > limit)
-        return false;
+      count++;
     }
 
-  return count > 1;
+  /* If we would only put out one conditional move, the other strategies
+     this pass tries are better optimized and will be more appropriate.
+     Some targets want to strictly limit the number of conditional moves
+     that are emitted, they set this through PARAM, we need to respect
+     that.  */
+  return count > 1 && count <= param;
 }
 
 /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert
@@ -3436,7 +3431,7 @@ noce_process_if_block (struct noce_if_info *if_info)
   if (!else_bb
       && HAVE_conditional_move
       && !HAVE_cc0
-      && bb_ok_for_noce_convert_multiple_sets (then_bb, if_info))
+      && bb_ok_for_noce_convert_multiple_sets (then_bb))
     {
       if (noce_convert_multiple_sets (if_info))
         {
@@ -3447,12 +3442,12 @@ noce_process_if_block (struct noce_if_info *if_info)
         }
     }
 
-  if (! bb_valid_for_noce_process_p (then_bb, cond, &if_info->then_cost,
+  if (! bb_valid_for_noce_process_p (then_bb, cond, &if_info->original_cost,
                                      &if_info->then_simple))
     return false;
 
   if (else_bb
-      && ! bb_valid_for_noce_process_p (else_bb, cond, &if_info->else_cost,
+      && ! bb_valid_for_noce_process_p (else_bb, cond, &if_info->original_cost,
                                         &if_info->else_simple))
     return false;
 
@@ -3983,6 +3978,7 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
   rtx cond;
   rtx_insn *cond_earliest;
   struct noce_if_info if_info;
+  bool speed_p = optimize_bb_for_speed_p (test_bb);
 
   /* We only ever should get here before reload.  */
   gcc_assert (!reload_completed);
@@ -4074,8 +4070,16 @@ noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
   if_info.cond_earliest = cond_earliest;
   if_info.jump = jump;
   if_info.then_else_reversed = then_else_reversed;
-  if_info.branch_cost = BRANCH_COST (optimize_bb_for_speed_p (test_bb),
-                                     predictable_edge_p (then_edge));
+  if_info.speed_p = speed_p;
+  if_info.max_seq_cost
+    = targetm.max_noce_ifcvt_seq_cost (then_edge);
+  /* We'll add in the cost of THEN_BB and ELSE_BB later, when we check
+     that they are valid to transform.  We can't easily get back to the insn
+     for COND (and it may not exist if we had to canonicalize to get COND),
+     and jump_insns are always given a cost of 1 by seq_cost, so treat
+     both instructions as having cost COSTS_N_INSNS (1).  */
+  if_info.original_cost = COSTS_N_INSNS (2);
+
 
   /* Do the real work.  */
 
diff --git a/gcc/testsuite/gcc.dg/ifcvt-2.c b/gcc/testsuite/gcc.dg/ifcvt-2.c
index e0e1728..73e0dcc 100644
--- a/gcc/testsuite/gcc.dg/ifcvt-2.c
+++ b/gcc/testsuite/gcc.dg/ifcvt-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target aarch64*-*-* x86_64-*-* } } */
-/* { dg-options "-fdump-rtl-ce1 -O2" } */
+/* { dg-options "-fdump-rtl-ce1 -O2 --param max-rtl-if-conversion-unpredictable-cost=100" } */
 
 typedef unsigned char uint8_t;
 
diff --git a/gcc/testsuite/gcc.dg/ifcvt-3.c b/gcc/testsuite/gcc.dg/ifcvt-3.c
index 44233d4..b250bc1 100644
--- a/gcc/testsuite/gcc.dg/ifcvt-3.c
+++ b/gcc/testsuite/gcc.dg/ifcvt-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { { aarch64*-*-* i?86-*-* x86_64-*-* } && lp64 } } } */
-/* { dg-options "-fdump-rtl-ce1 -O2" } */
+/* { dg-options "-fdump-rtl-ce1 -O2 --param max-rtl-if-conversion-unpredictable-cost=100" } */
 
 typedef long long s64;
 
diff --git a/gcc/testsuite/gcc.dg/ifcvt-4.c b/gcc/testsuite/gcc.dg/ifcvt-4.c
index 319b583..0d1671c 100644
--- a/gcc/testsuite/gcc.dg/ifcvt-4.c
+++ b/gcc/testsuite/gcc.dg/ifcvt-4.c
@@ -1,4 +1,4 @@
-/* { dg-options "-fdump-rtl-ce1 -O2 --param max-rtl-if-conversion-insns=3" } */
+/* { dg-options "-fdump-rtl-ce1 -O2 --param max-rtl-if-conversion-insns=3 --param max-rtl-if-conversion-unpredictable-cost=100" } */
 /* { dg-additional-options "-misel" { target { powerpc*-*-* } } } */
 /* { dg-skip-if "Multiple set if-conversion not guaranteed on all subtargets" { "arm*-*-* hppa*64*-*-* visium-*-*" } } */
 
diff --git a/gcc/testsuite/gcc.dg/ifcvt-5.c b/gcc/testsuite/gcc.dg/ifcvt-5.c
index 818099a..d2a9476 100644
--- a/gcc/testsuite/gcc.dg/ifcvt-5.c
+++ b/gcc/testsuite/gcc.dg/ifcvt-5.c
@@ -1,7 +1,8 @@
 /* Check that multi-insn if-conversion is not done if the override
-   parameter would not allow it.  */
+   parameter would not allow it.  Set the cost parameter very high
+   to ensure that the limiting factor is actually the count parameter.  */
 
-/* { dg-options "-fdump-rtl-ce1 -O2 --param max-rtl-if-conversion-insns=1" } */
+/* { dg-options "-fdump-rtl-ce1 -O2 --param max-rtl-if-conversion-insns=1 --param max-rtl-if-conversion-unpredictable-cost=200" } */
 
 typedef int word __attribute__((mode(word)));
 
diff --git a/gcc/testsuite/gcc.dg/pr68435.c b/gcc/testsuite/gcc.dg/pr68435.c
index 765699a..f86b7f8 100644
--- a/gcc/testsuite/gcc.dg/pr68435.c
+++ b/gcc/testsuite/gcc.dg/pr68435.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target aarch64*-*-* x86_64-*-* } } */
-/* { dg-options "-fdump-rtl-ce1 -O2 -w" } */
+/* { dg-options "-fdump-rtl-ce1 -O2 -w --param max-rtl-if-conversion-unpredictable-cost=100" } */
 
 typedef struct cpp_reader cpp_reader;
 enum cpp_ttype