Message ID | 20190307091514.2489338-1-arnd@arndb.de |
---|---|
State | New |
Headers | show |
Series | [1/2] futex: mark futex_detect_cmpxchg() as 'noinline' | expand |
On Thu, 2019-03-07 at 10:14 +0100, Arnd Bergmann wrote: > On 32-bit ARM, I got a link failure in futex_init() when building > with clang in some random configurations: > > kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text' > > As far as I can tell, the problem is that a branch is over 16MB > apart in those configurations, but only if it branches back to > the init text. > > Marking the futex_detect_cmpxchg() function as noinline and > not __init avoids the problem for me. Perhaps the __init and __exit #defines should be noinline to allow discarding of the code. These defines are already marked with __section so I'd've expected these to be noinline anyway. --- include/linux/init.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/init.h b/include/linux/init.h index 5255069f5a9f..806215a74064 100644 --- a/include/linux/init.h +++ b/include/linux/init.h @@ -47,7 +47,7 @@ /* These are for everybody (although not all archs will actually discard it in modules) */ -#define __init __section(.init.text) __cold __latent_entropy __noinitretpoline +#define __init __section(.init.text) __cold __latent_entropy __noinitretpoline noinline #define __initdata __section(.init.data) #define __initconst __section(.init.rodata) #define __exitdata __section(.exit.data) @@ -80,7 +80,7 @@ #define __exitused __used #endif -#define __exit __section(.exit.text) __exitused __cold notrace +#define __exit __section(.exit.text) __exitused __cold notrace noinline /* Used for MEMORY_HOTPLUG */ #define __meminit __section(.meminit.text) __cold notrace \
On Thu, Mar 07, 2019 at 09:19:04AM -0800, Joe Perches wrote: > On Thu, 2019-03-07 at 10:14 +0100, Arnd Bergmann wrote: > > On 32-bit ARM, I got a link failure in futex_init() when building > > with clang in some random configurations: > > > > kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text' > > > > As far as I can tell, the problem is that a branch is over 16MB > > apart in those configurations, but only if it branches back to > > the init text. > > > > Marking the futex_detect_cmpxchg() function as noinline and > > not __init avoids the problem for me. > > Perhaps the __init and __exit #defines should be noinline > to allow discarding of the code. How does that help this case? The problem is futex_detect_cmpxchg() being placed in .init.text which is too far from .text.fixup. Whether it is inlined or not is immaterial since both futex_detect_cmpxchg() and its only caller futex_init() are both __init functions. It seems to me to be completely sane to have: static void __init foo(...) { } static int __init foo_init(...) { foo(); } and have the expectation that the compiler _can_, if it desires, inline foo() into foo_init(). Let me re-iterate: the inlining of futex_detect_cmpxchg() into futex_init() is not the issue here. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
On Thu, 2019-03-07 at 17:25 +0000, Russell King - ARM Linux admin wrote: > On Thu, Mar 07, 2019 at 09:19:04AM -0800, Joe Perches wrote: > > On Thu, 2019-03-07 at 10:14 +0100, Arnd Bergmann wrote: > > > On 32-bit ARM, I got a link failure in futex_init() when building > > > with clang in some random configurations: > > > > > > kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text' > > > > > > As far as I can tell, the problem is that a branch is over 16MB > > > apart in those configurations, but only if it branches back to > > > the init text. > > > > > > Marking the futex_detect_cmpxchg() function as noinline and > > > not __init avoids the problem for me. > > > > Perhaps the __init and __exit #defines should be noinline > > to allow discarding of the code. > > How does that help this case? It doesn't particularly. It does help any other case that might arise. > It seems to me to be completely sane to have: > > static void __init foo(...) > { > } > > static int __init foo_init(...) > { > foo(); > } > > and have the expectation that the compiler _can_, if it desires, inline > foo() into foo_init(). True, but init function performance requirements are generally trivial and isolating the possibility of defects like this seems a reasonable trade-off to me.
On Thu, Mar 07, 2019 at 09:42:40AM -0800, Joe Perches wrote: > On Thu, 2019-03-07 at 17:25 +0000, Russell King - ARM Linux admin wrote: > > On Thu, Mar 07, 2019 at 09:19:04AM -0800, Joe Perches wrote: > > > On Thu, 2019-03-07 at 10:14 +0100, Arnd Bergmann wrote: > > > > On 32-bit ARM, I got a link failure in futex_init() when building > > > > with clang in some random configurations: > > > > > > > > kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text' > > > > > > > > As far as I can tell, the problem is that a branch is over 16MB > > > > apart in those configurations, but only if it branches back to > > > > the init text. > > > > > > > > Marking the futex_detect_cmpxchg() function as noinline and > > > > not __init avoids the problem for me. > > > > > > Perhaps the __init and __exit #defines should be noinline > > > to allow discarding of the code. > > > > How does that help this case? > > It doesn't particularly. > It does help any other case that might arise. Please describe what "any other case" you are thinking of - as already stated, your patch would have no beneficial effect for the observed issue. In fact, it _could_ do exactly the opposite. Where functions can be inlined, they may become smaller due to no need for function prologue and epilogue, and other optimisations. Forcing the compiler to avoid inlining them could result in larger code, which means a higher chance of triggering the "relocation truncated to fit" problem. The converse is also a possibility too - it all depends on the inlining behaviour of the compiler towards static functions, and the resulting size of the code. What may be right for one architecture and compiler may not be right for another. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up
On Thu, Mar 7, 2019 at 1:15 AM Arnd Bergmann <arnd@arndb.de> wrote: > > On 32-bit ARM, I got a link failure in futex_init() when building > with clang in some random configurations: > > kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text' Do we know what function from the fixup text section is calling futex_detect_cmpxchg? I'm curious if this is maybe another case of -Wsection where some function may be in the wrong section? > > As far as I can tell, the problem is that a branch is over 16MB > apart in those configurations, but only if it branches back to > the init text. > > Marking the futex_detect_cmpxchg() function as noinline and > not __init avoids the problem for me. > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > kernel/futex.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/futex.c b/kernel/futex.c > index c3b73b0311bc..dda77ed9f445 100644 > --- a/kernel/futex.c > +++ b/kernel/futex.c > @@ -3849,7 +3849,7 @@ SYSCALL_DEFINE6(futex_time32, u32 __user *, uaddr, int, op, u32, val, > } > #endif /* CONFIG_COMPAT_32BIT_TIME */ > > -static void __init futex_detect_cmpxchg(void) > +static noinline void futex_detect_cmpxchg(void) > { > #ifndef CONFIG_HAVE_FUTEX_CMPXCHG > u32 curval; > -- > 2.20.0 > -- Thanks, ~Nick Desaulniers
On Thu, Mar 07, 2019 at 10:12:11AM -0800, Nick Desaulniers wrote: > On Thu, Mar 7, 2019 at 1:15 AM Arnd Bergmann <arnd@arndb.de> wrote: > > > > On 32-bit ARM, I got a link failure in futex_init() when building > > with clang in some random configurations: > > > > kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text' > > Do we know what function from the fixup text section is calling > futex_detect_cmpxchg? I'm curious if this is maybe another case of > -Wsection where some function may be in the wrong section? > Looks like this is the call stack: futex_init -> futex_detect_cmpxchg -> cmpxchg_futex_value_locked -> futex_atomic_cmpxchg_inatomic This is the same issue I reported: https://github.com/ClangBuiltLinux/linux/issues/325 Marking arm's futex_atomic_cmpxchg_inatomic as noinline also fixes this so maybe that's it? Cheers, Nathan > > > > As far as I can tell, the problem is that a branch is over 16MB > > apart in those configurations, but only if it branches back to > > the init text. > > > > Marking the futex_detect_cmpxchg() function as noinline and > > not __init avoids the problem for me. > > > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > > --- > > kernel/futex.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/kernel/futex.c b/kernel/futex.c > > index c3b73b0311bc..dda77ed9f445 100644 > > --- a/kernel/futex.c > > +++ b/kernel/futex.c > > @@ -3849,7 +3849,7 @@ SYSCALL_DEFINE6(futex_time32, u32 __user *, uaddr, int, op, u32, val, > > } > > #endif /* CONFIG_COMPAT_32BIT_TIME */ > > > > -static void __init futex_detect_cmpxchg(void) > > +static noinline void futex_detect_cmpxchg(void) > > { > > #ifndef CONFIG_HAVE_FUTEX_CMPXCHG > > u32 curval; > > -- > > 2.20.0 > > > > > -- > Thanks, > ~Nick Desaulniers
On Thu, Mar 7, 2019 at 7:21 PM Nathan Chancellor <natechancellor@gmail.com> wrote: > > On Thu, Mar 07, 2019 at 10:12:11AM -0800, Nick Desaulniers wrote: > > On Thu, Mar 7, 2019 at 1:15 AM Arnd Bergmann <arnd@arndb.de> wrote: > > > > > > On 32-bit ARM, I got a link failure in futex_init() when building > > > with clang in some random configurations: > > > > > > kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text' > > > > Do we know what function from the fixup text section is calling > > futex_detect_cmpxchg? I'm curious if this is maybe another case of > > -Wsection where some function may be in the wrong section? > > > > Looks like this is the call stack: > > futex_init -> > futex_detect_cmpxchg -> > cmpxchg_futex_value_locked -> > futex_atomic_cmpxchg_inatomic > > This is the same issue I reported: https://github.com/ClangBuiltLinux/linux/issues/325 > > Marking arm's futex_atomic_cmpxchg_inatomic as noinline also fixes this > so maybe that's it? I think part of the problem is the linker sometimes failing to generate veneers for long calls. I've seen this in the past with relocations in non-executable sections, but it's possible that the problem here is having a relocation to a section that doesn't start with ".text". I have not analyzed the linker behavior in more detail, but have experimentally shown that the problem doesn't happen if the relocation is from the .text.fixup segment to .text rather than .init.text. It's also possible that this is simply a result of the relative distance of the locations in memory as Russell said, so it could come back if the kernel grows further. Arnd
diff --git a/kernel/futex.c b/kernel/futex.c index c3b73b0311bc..dda77ed9f445 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -3849,7 +3849,7 @@ SYSCALL_DEFINE6(futex_time32, u32 __user *, uaddr, int, op, u32, val, } #endif /* CONFIG_COMPAT_32BIT_TIME */ -static void __init futex_detect_cmpxchg(void) +static noinline void futex_detect_cmpxchg(void) { #ifndef CONFIG_HAVE_FUTEX_CMPXCHG u32 curval;
On 32-bit ARM, I got a link failure in futex_init() when building with clang in some random configurations: kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text' As far as I can tell, the problem is that a branch is over 16MB apart in those configurations, but only if it branches back to the init text. Marking the futex_detect_cmpxchg() function as noinline and not __init avoids the problem for me. Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- kernel/futex.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 2.20.0