From patchwork Thu Dec 15 11:33:59 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Greenhalgh
X-Patchwork-Id: 88134
Delivered-To: patch@linaro.org
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: James Greenhalgh
Subject: [Patch] Undermine the jump threading cost model to fix PR77445.
Date: Thu, 15 Dec 2016 11:33:59 +0000
Message-ID: <1481801639-14286-1-git-send-email-james.greenhalgh@arm.com>
In-Reply-To: <20161018155510.GA11109@arm.com>
References: <20161018155510.GA11109@arm.com>

Hi,

As mentioned in PR77445, the improvements to the jump threading cost model
this year have caused substantial regressions in the amount of jump
threading we do and in the performance of workloads which rely on that
threading.

This patch represents the low bar in fixing the performance issues reported
in PR77445 - by weakening the cost model enough that we thread in a way
much closer to GCC 6. I don't think this patch is likely to be acceptable
for trunk, but I'm posting it for consideration regardless.

Under the new cost model, if the edge doesn't pass
optimize_edge_for_speed_p, then we don't thread. The problem in late
threading is that bad edge profile data makes the edge look cold, so it
fails optimize_edge_for_speed_p and is no longer considered a candidate for
threading.

As an aside, I think this is the wrong cost model for jump threading, where
you get the most impact if you can resolve unpredictable switch statements -
which by their nature may have multiple cold edges in need of threading.

Early threading should avoid these issues, as there is no edge profile info
yet. optimize_edge_for_speed_p is therefore more likely to hold, but the
condition for threading is:

  if (speed_p && optimize_edge_for_speed_p (taken_edge))
    {
      if (n_insns >= PARAM_VALUE (PARAM_MAX_FSM_THREAD_PATH_INSNS))
        {
          [...reject threading...]
        }
    }
  else if (n_insns > 1)
    {
      [...reject threading...]
    }

and speed_p is hardwired to false for the early threader
(pass_early_thread_jumps::execute):

  find_jump_threads_backwards (bb, false);

So we always fall to the n_insns > 1 case and thus only rarely get to
thread.

In this patch I change that call in pass_early_thread_jumps::execute to
pass optimize_bb_for_speed_p (bb) instead. That allows the speed_p check to
pass in the main threading cost model, and then the
optimize_edge_for_speed_p check can also pass. That gets the first stage of
jump threading working again in a proprietary benchmark which is sensitive
to this optimisation.

To get the rest of the required jump threading, I also have to weaken the
cost model - and this is obviously a hack! The easy hack is to special-case
a taken edge with frequency zero, and permit jump threading there.

I know this patch is likely not the preferred way to fix this. For me that
would be a change to the cost model, which as I mentioned above I think
misses the point about which edges we want to thread. By far the best fix
would be to the junk edge profiling data we create during threading.

However, this patch does fix the performance issues identified in PR77445,
and does highlight a fundamental issue with the early threader (which
doesn't seem to me like it will be effective while it sets speed_p to
false), so I'd like it to be considered for trunk if no better fix appears
before stage 4.

Bootstrapped on x86_64 with no issues. The testsuite changes just reshuffle
which passes spot the threading opportunities.

OK?

Thanks,
James

---
gcc/

2016-12-15  James Greenhalgh

    PR tree-optimization/77445
    * tree-ssa-threadbackward.c (profitable_jump_thread_path): Work
    around sometimes corrupt edge frequency data.
    (pass_early_thread_jumps::execute): Pass optimize_bb_for_speed_p
    as the speed_p parameter to find_jump_threads_backwards to enable
    threading in more cases.

gcc/testsuite/

2016-12-15  James Greenhalgh

    PR tree-optimization/77445
    * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust options and dump passes.
    * gcc.dg/tree-ssa/pr66752-3.c: Likewise.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
index 896c8bf..39ec3d6 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-dce2" } */
+/* { dg-options "-O2 -fdump-tree-ethread-details -fdump-tree-dce2" } */
 
 extern int status, pt;
 extern int count;
@@ -34,7 +34,7 @@ foo (int N, int c, int b, int *a)
 
 /* There are 4 FSM jump threading opportunities, all of which will be
    realized, which will eliminate testing of FLAG, completely.  */
-/* { dg-final { scan-tree-dump-times "Registering FSM" 4 "thread1"} } */
+/* { dg-final { scan-tree-dump-times "Registering FSM" 4 "ethread"} } */
 
 /* There should be no assignments or references to FLAG, verify they're
    eliminated as early as possible.  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index 9a9d1cb..5b087fb 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -1,8 +1,9 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
-/* { dg-final { scan-tree-dump "Jumps threaded: 16" "thread1" } } */
-/* { dg-final { scan-tree-dump "Jumps threaded: 9" "thread2" } } */
-/* { dg-final { scan-tree-dump "Jumps threaded: 3" "thread3" } } */
+/* { dg-options "-O2 -fdump-tree-ethread-stats -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 16" "ethread" } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 6" "thread1" } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 4" "thread2" } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 2" "thread3" } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded" "dom2" } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded" "dom3" } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded" "vrp2" } } */
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 203e20e..0d29ab5 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -311,7 +311,20 @@ profitable_jump_thread_path (vec *&path,
       return NULL;
     }
 
-  if (speed_p && optimize_edge_for_speed_p (taken_edge))
+
+  /* FIXME: Edge frequency can get badly out shape as a result of
+     the jump threading passes.  In those cases,
+     EDGE_FREQUENCY (taken_edge) == 0, and so trivially fails the
+     test for optimize_edge_for_speed_p.  The correct fix would
+     be to ensure that profiling information coming out of jump threading
+     is meaningful, but in lieu of that add a hack check to this cost model
+     which permits jump threading in the case EDGE_FREQUENCY has been
+     corrupted.  Only do this if the profile info is present and corrupt,
+     not if it is absent.  */
+  if (speed_p
+      && (optimize_edge_for_speed_p (taken_edge)
+          || (profile_status_for_fn (cfun) != PROFILE_ABSENT
+              && EDGE_FREQUENCY (taken_edge) == 0)))
     {
       if (n_insns >= PARAM_VALUE (PARAM_MAX_FSM_THREAD_PATH_INSNS))
         {
@@ -870,7 +883,7 @@ pass_early_thread_jumps::execute (function *fun)
   FOR_EACH_BB_FN (bb, fun)
     {
       if (EDGE_COUNT (bb->succs) > 1)
-        find_jump_threads_backwards (bb, false);
+        find_jump_threads_backwards (bb, optimize_bb_for_speed_p (bb));
     }
   thread_through_all_blocks (true);
   return 0;
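
For readers who want to see the effect of the two changes side by side, here is a minimal standalone sketch of the size check only. It is not GCC code: PathInfo, kMaxPathInsns, rejects_before and rejects_after are hypothetical names invented for illustration, the booleans stand in for the real predicates, and the value 100 is a placeholder for PARAM_MAX_FSM_THREAD_PATH_INSNS rather than a claim about its default. Other rejection reasons in profitable_jump_thread_path are ignored.

```cpp
// Toy model of the instruction-count check discussed in this patch.
// Every name here is a hypothetical stand-in, not a GCC API.

struct PathInfo {
  bool speed_p;          // value the calling pass hands to the cost model
  bool edge_hot;         // stands in for optimize_edge_for_speed_p (taken_edge)
  bool profile_present;  // stands in for profile_status_for_fn (cfun) != PROFILE_ABSENT
  int edge_frequency;    // stands in for EDGE_FREQUENCY (taken_edge)
  int n_insns;           // instructions that threading would duplicate
};

// Placeholder for PARAM_VALUE (PARAM_MAX_FSM_THREAD_PATH_INSNS).
constexpr int kMaxPathInsns = 100;

// Check as it stood before the patch: anything that is not both speed_p
// and "hot" is rejected as soon as it copies more than one instruction.
constexpr bool rejects_before(const PathInfo &p) {
  return (p.speed_p && p.edge_hot) ? p.n_insns >= kMaxPathInsns
                                   : p.n_insns > 1;
}

// Check with the patch applied: a zero frequency on a function that does
// have profile info is treated as corrupt data and takes the generous limit.
constexpr bool rejects_after(const PathInfo &p) {
  return (p.speed_p &&
          (p.edge_hot || (p.profile_present && p.edge_frequency == 0)))
             ? p.n_insns >= kMaxPathInsns
             : p.n_insns > 1;
}

// Early threader: speed_p used to be hardwired to false, so a 10-insn path
// was rejected even on a hot edge; passing a true speed_p accepts it.
static_assert(rejects_before(PathInfo{false, true, false, 1000, 10}),
              "old early threader rejects a 10-insn path");
static_assert(!rejects_after(PathInfo{true, true, false, 1000, 10}),
              "patched early threader accepts the same path");

// Late threader: an edge whose frequency was zeroed by earlier threading.
static_assert(rejects_before(PathInfo{true, false, true, 0, 10}),
              "old model rejects the zero-frequency edge");
static_assert(!rejects_after(PathInfo{true, false, true, 0, 10}),
              "patched model accepts the zero-frequency edge");
```

The static_asserts make the sketch self-checking at compile time. Note that with an absent profile the zero-frequency exception does not apply, so such a path is still rejected once it copies more than one instruction.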