From patchwork Wed Oct 19 14:59:54 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 78268 Delivered-To: patch@linaro.org Received: by 10.140.97.247 with SMTP id m110csp292637qge; Wed, 19 Oct 2016 08:00:19 -0700 (PDT) X-Received: by 10.99.221.85 with SMTP id g21mr9815991pgj.121.1476889219070; Wed, 19 Oct 2016 08:00:19 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id n9si37603468pgd.67.2016.10.19.08.00.18; Wed, 19 Oct 2016 08:00:19 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S943718AbcJSO77 (ORCPT + 27 others); Wed, 19 Oct 2016 10:59:59 -0400 Received: from mail-io0-f173.google.com ([209.85.223.173]:32853 "EHLO mail-io0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S943518AbcJSO74 (ORCPT ); Wed, 19 Oct 2016 10:59:56 -0400 Received: by mail-io0-f173.google.com with SMTP id q192so34508389iod.0 for ; Wed, 19 Oct 2016 07:59:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=+eS8J6fD1i+ZzvQ7Mux+MGkoRCzJacTvWhN7Fg/EWxo=; b=BIN1Grakh5mftCoJYrDH2Z347J5VhoHOjEtFIjxOeOqUeDgiN/LYkfFnDyXc1D4gYW DVC1D6tEDHtQrIuFl2Yw6MEJxuclEW16cRsBxbktIpINweN2kHDmc5DtiCGXh96TxFCU LGZQe/HfL/8uR7V5FSwKzofiWBDe+rU4ObpoU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=+eS8J6fD1i+ZzvQ7Mux+MGkoRCzJacTvWhN7Fg/EWxo=; b=JPltPjUWMAgERnCushgxndE8F6HPzOvD4wOhuWz04R8ofSZW3oYWZkcBzPNZaQYY/T /lf29hnuDaNLCtrfkdTefGPRpIaHrhpbRr/BW5hCi6rT+dMC8go6178mR/a3XmC/AYmx TafsgCsUZ6KWqtzpl9U0UKkplpsCPc2sZ5G57E5+RUh06V7qFS/RGiNZgM4vOjE3sEDA lzrY8CTvvIxBOJjz9dPQu8FZgngvxJ45OqKKJe5HLoCmpmXKOTxN663jiY7CjbZpY8j+ f7H4wY3lBHUj/cCHEiDe56nViGbuF7ika8NaFgaNnZlgpzYYsOnky4fBR272taulrq9D NCKw== X-Gm-Message-State: AA6/9RlEXeJm8iqWQNvxJ1nGxYCie6u/tudXv+HfN69VGrz+TpWDcl4Ctyicf6utD8a7OyfP8JEEA+G2Sx09BUy0 X-Received: by 10.107.28.148 with SMTP id c142mr6724870ioc.45.1476889195518; Wed, 19 Oct 2016 07:59:55 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.5.139 with HTTP; Wed, 19 Oct 2016 07:59:54 -0700 (PDT) In-Reply-To: <20161019133500.GQ9193@arm.com> References: <20161017183806.GG5601@arm.com> <20161019133500.GQ9193@arm.com> From: Ard Biesheuvel Date: Wed, 19 Oct 2016 15:59:54 +0100 Message-ID: Subject: Re: Build failure with v4.9-rc1 and GCC trunk -- compiler weirdness To: Will Deacon Cc: "linux-arm-kernel@lists.infradead.org" , "linux-kernel@vger.kernel.org" , Linus Torvalds , Peter Zijlstra , Ingo Molnar , james.greenhalgh@arm.com, Gregory CLEMENT , Stephen Boyd Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 19 October 2016 at 14:35, Will Deacon wrote: > Hi Ard, > > On Mon, Oct 17, 2016 at 08:43:19PM +0100, Ard Biesheuvel wrote: >> On 17 October 2016 at 19:38, Will Deacon wrote: >> > I'm seeing an arm64 build failure with -rc1 and GCC trunk, although I >> > believe that the new compiler behaviour at the heart of the problem >> > has the potential to affect other architectures and other pieces of >> > kernel code relying on dead-code elimination to remove deliberately >> > undefined functions. >> > >> > The failure looks like: >> > >> > | drivers/built-in.o: In function `armada_3700_add_composite_clk': >> > | >> > | linux/drivers/clk/mvebu/armada-37xx-periph.c:351: >> > | undefined reference to `____ilog2_NaN' >> > | >> > | linux/drivers/clk/mvebu/armada-37xx-periph.c:351:(.text+0xc72e0): >> > | relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol >> > | `____ilog2_NaN' >> > | >> > | make: *** [vmlinux] Error 1 >> > >> > and if we look at the source for armada_3700_add_composite_clk, we see >> > that this is caused by: >> > >> > int table_size = 0; >> > >> > rate->reg = reg + (u64)rate->reg; >> > for (clkt = rate->table; clkt->div; clkt++) >> > table_size++; >> > rate->width = order_base_2(table_size); >> > >> > order_base_2 calls ilog2, which has the ____ilog2_NaN call: >> > >> > #define ilog2(n) \ >> > ( \ >> > __builtin_constant_p(n) ? ( \ >> > (n) < 1 ? ____ilog2_NaN() : \ >> > >> > This is because we're in a curious case where GCC has emitted a >> > special-cased version of armada_3700_add_composite_clk, with table_size >> > effectively constant-folded as 0. Whilst we shouldn't see this in a >> > non-buggy kernel (hence the deliberate call to the undefined function >> > ____ilog2_NaN), it means that the final link fails because we have a >> > ____ilog2_NaN in the code, with a runtime check on table_size. >> > >> >> This is indeed an unintended side effect, but I would not call it >> weird behaviour at all. The code in its current form does not handle >> the case where it could end up passing 0 into order_base_2(), and we >> simply need to handle that case. > > The reasons I think it's weird are: > > (1) The optimisation doesn't generate better code in this case -- > optimising for the table_size == 0 case is uninformed, particularly > as that *cannot* happen at runtime (GCC probably can't tell, due > to things like container_of, but all the clock data is static). > AFAICT, the references to the static clock data are indirected via of_device_get_match_data(), which means there is no way the compiler can prove that table_size is always non-zero. > (2) __builtin_constant_p(n) could be interpreted by a developer as > "this code will execute with a constant n at runtime". With this > issue, GCC could (in theory) generate a specialisation for every > possible value of a variable, and return __builtin_constant_p as > true for all of them, which somewhat undermines the point of the > builtin. > Yes, and that would be perfectly legal from a correctness point of view, and would likely help performance as well. By using __builtin_constant_p(), you are choosing to perform a build time evaluation of an expression that would ordinarily be evaluated only at runtime. This implies that you have to address undefined behavior at build time rather than at runtime as well. >> If order_base_2() is not defined for input 0, it should BUG() in that >> case, and the associated __builtin_unreachable() should prevent the >> special version from being emitted. If order_base_2() is defined for input >> 0, it should not invoke ilog2() with that argument, and the problem should >> go away as well. > > I don't necessarily think it should BUG() if it's not defined for input > 0; things like __ffs don't do that and we'd be introducing conditional > checks for cases that should not happen. The comment above order_base_2 > does suggest that ob2(0) should return 0, but it can actually end up > invoking ilog2(-1), which is obviously wrong. > > I could update the comment, but that doesn't fix the build issue. > Fixing roundup_pow_of_two() [which is arguably incorrect] would probably fix the build issue as well, no? diff --git a/include/linux/log2.h b/include/linux/log2.h index fd7ff3d91e6a..8a4be5e4223b 100644 --- a/include/linux/log2.h +++ b/include/linux/log2.h @@ -168,7 +168,7 @@ unsigned long __rounddown_pow_of_two(unsigned long n) #define roundup_pow_of_two(n) \ ( \ __builtin_constant_p(n) ? ( \ - (n == 1) ? 1 : \ + (n <= 1) ? 1 : \ (1UL << (ilog2((n) - 1) + 1)) \ ) : \ __roundup_pow_of_two(n) \