From patchwork Mon Oct 17 18:38:07 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 77801
Delivered-To: patch@linaro.org
Received: by 10.140.97.247 with SMTP id m110csp525294qge;
        Mon, 17 Oct 2016 11:38:11 -0700 (PDT)
X-Received: by 10.99.140.76 with SMTP id q12mr33085257pgn.178.1476729491587;
        Mon, 17 Oct 2016 11:38:11 -0700 (PDT)
Return-Path:
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
        by mx.google.com with ESMTP id t5si28455519pgb.171.2016.10.17.11.38.11;
        Mon, 17 Oct 2016 11:38:11 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of
        linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as
        permitted sender) client-ip=209.132.180.67;
Authentication-Results: mx.google.com;
        spf=pass (google.com: best guess record for domain of
        linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as
        permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933875AbcJQSiI (ORCPT + 27 others);
        Mon, 17 Oct 2016 14:38:08 -0400
Received: from foss.arm.com ([217.140.101.70]:41154 "EHLO foss.arm.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S932930AbcJQSiH (ORCPT );
        Mon, 17 Oct 2016 14:38:07 -0400
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])
        by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EE57829;
        Mon, 17 Oct 2016 11:38:05 -0700 (PDT)
Received: from edgewater-inn.cambridge.arm.com
        (usa-sjc-imap-foss1.foss.arm.com [10.72.51.249])
        by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C181C3F3D6;
        Mon, 17 Oct 2016 11:38:05 -0700 (PDT)
Received: by edgewater-inn.cambridge.arm.com (Postfix, from userid 1000)
        id 0BE2B1AE3BDB; Mon, 17 Oct 2016 19:38:07 +0100 (BST)
Date: Mon, 17 Oct 2016 19:38:07 +0100
From: Will Deacon
To: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org,
        torvalds@linux-foundation.org, peterz@infradead.org, mingo@kernel.org,
        ard.biesheuvel@linaro.org, james.greenhalgh@arm.com,
        gregory.clement@free-electrons.com, sboyd@codeaurora.org
Subject: Build failure with v4.9-rc1 and GCC trunk -- compiler weirdness
Message-ID: <20161017183806.GG5601@arm.com>
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi all,

I'm seeing an arm64 build failure with -rc1 and GCC trunk, although I
believe that the new compiler behaviour at the heart of the problem has the
potential to affect other architectures and other pieces of kernel code
relying on dead-code elimination to remove deliberately undefined functions.

The failure looks like:

 | drivers/built-in.o: In function `armada_3700_add_composite_clk':
 |
 | linux/drivers/clk/mvebu/armada-37xx-periph.c:351:
 | undefined reference to `____ilog2_NaN'
 |
 | linux/drivers/clk/mvebu/armada-37xx-periph.c:351:(.text+0xc72e0):
 | relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol
 | `____ilog2_NaN'
 |
 | make: *** [vmlinux] Error 1

and if we look at the source for armada_3700_add_composite_clk, we see that
this is caused by:

	int table_size = 0;

	rate->reg = reg + (u64)rate->reg;
	for (clkt = rate->table; clkt->div; clkt++)
		table_size++;
	rate->width = order_base_2(table_size);

order_base_2 calls ilog2, which has the ____ilog2_NaN call:

	#define ilog2(n)				\
	(						\
		__builtin_constant_p(n) ? (		\
			(n) < 1 ? ____ilog2_NaN() :	\

This is because we're in a curious case where GCC has emitted a
special-cased version of armada_3700_add_composite_clk, with table_size
effectively constant-folded as 0. Whilst we shouldn't see this in a
non-buggy kernel (hence the deliberate call to the undefined function
____ilog2_NaN), it means that the final link fails because we have a
____ilog2_NaN in the code, with a runtime check on table_size.

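For anybody unfamiliar with the idiom, here is a minimal userspace sketch of
the shape of that macro. It is not the kernel's implementation: the
sketch_* names are made up for illustration, the constant path is collapsed
into the same loop as the runtime path, and the stand-in for ____ilog2_NaN is
given a definition (it just calls abort()) so that the snippet links at any
optimisation level. In the kernel the function is deliberately left
undefined, which is what turns a surviving call into the link error above.

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in for ____ilog2_NaN(). The kernel only declares the
 * function, so a call that survives dead-code elimination fails at link
 * time. It is defined here purely so that this sketch always links.
 */
static int sketch_ilog2_nan(void)
{
	abort();
}

/* Runtime fallback: index of the highest set bit (n must be non-zero). */
static int sketch_ilog2_runtime(unsigned long n)
{
	int log = -1;

	while (n) {
		n >>= 1;
		log++;
	}
	return log;
}

/*
 * Same structure as ilog2(): if the compiler claims n is a constant, an
 * invalid value is routed to the trap function; otherwise the runtime
 * helper is used. The scheme assumes the trap call is only ever emitted
 * for a genuinely constant, invalid n.
 */
#define sketch_ilog2(n)					\
(							\
	__builtin_constant_p(n) ? (			\
		(n) < 1 ? sketch_ilog2_nan() :		\
		sketch_ilog2_runtime(n)			\
	) :						\
	sketch_ilog2_runtime(n)				\
)
```

The point is that the trap call is guarded by __builtin_constant_p(n) and
(n) < 1 together, so it is expected to disappear unless n really is a
constant less than 1.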
In other words, __builtin_constant_p appears to be weaker than we've been
assuming. Talking to the compiler guys here, this is due to the
"jump-threading" optimisation pass, so the patch below disables that.

A simpler example is:

	int foo();
	int bar();

	int count(int *argc)
	{
		int table_size = 0;

		for (; *argc; argc++)
			table_size++;

		if (__builtin_constant_p(table_size))
			return table_size == 0 ? foo() : bar();

		return bar();
	}

which compiles to:

	count:
		ldr	w0, [x0]
		cbz	w0, .L4
		b	bar
		.p2align 3
	.L4:
		b	foo

and, with the "optimisation" disabled:

	count:
		b	bar

Thoughts? It feels awfully fragile disabling passes like this, but with
GCC transforming the code like this, I can't immediately think of a way
to preserve the intended behaviour of the code.

Will

--->8

diff --git a/Makefile b/Makefile
index 512e47a53e9a..750873d6d11e 100644
--- a/Makefile
+++ b/Makefile
@@ -641,6 +641,11 @@ endif
 # Tell gcc to never replace conditional load with a non-conditional one
 KBUILD_CFLAGS	+= $(call cc-option,--param=allow-store-data-races=0)
 
+# Stop gcc from converting switches into a form that defeats dead code
+# elimination and can subsequently lead to calls to intentionally
+# undefined functions appearing in the final link.
+KBUILD_CFLAGS	+= $(call cc-option,--param=max-fsm-thread-path-insns=1)
+
 include scripts/Makefile.gcc-plugins
 
 ifdef CONFIG_READABLE_ASM