Message ID: 20191001083701.27207-1-yamada.masahiro@socionext.com
State:      Superseded
Series:     [v2] ARM: add __always_inline to functions called from __get_user_check()
On Tue, 2019-10-01 at 17:37 +0900, Masahiro Yamada wrote:
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly") (https://lkml.org/lkml/2019/9/26/825).
>
> I also received a regression report from Nicolas Saenz Julienne
> (https://lkml.org/lkml/2019/9/27/263).
>
> This problem has cropped up on bcm2835_defconfig because it enables
> CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends to prefer not inlining
> functions with -Os. I was able to reproduce it with other boards and
> defconfig files by manually enabling CONFIG_CC_OPTIMIZE_FOR_SIZE.
>
> The __get_user_check() specifically uses r0, r1, r2 registers.
> So, uaccess_save_and_enable() and uaccess_restore() must be inlined.
> Otherwise, those register assignments would be entirely dropped,
> according to my analysis of the disassembly.
>
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
>
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.
>
> I also added __always_inline to 4 functions in the call-graph from the
> __get_user_check() macro.
>
> Fixes: 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" <bot@kernelci.org>
> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
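For readers following the register argument above, the standalone sketch below shows why values pinned to specific registers only survive if the surrounding helper is genuinely inlined. It is illustrative only: the file name, build command and helper names are made up, and it is not the kernel's actual __get_user_check() macro.

/*
 * reg_pin_sketch.c -- illustrative only, not kernel code.
 * Build for ARM (assumes an arm-linux-gnueabi cross toolchain is available):
 *   arm-linux-gnueabi-gcc -Os -S reg_pin_sketch.c
 *
 * GCC only guarantees that a local register variable still holds its value
 * when it is used as an operand of an asm statement; any real function call
 * in between may clobber r0-r3 per the ARM calling convention.  That is the
 * hazard described above when -Os declines to inline
 * uaccess_save_and_enable()/uaccess_restore().
 */
static inline unsigned int save_flags(void)  /* plain inline: -Os may emit a real call; */
{                                            /* __always_inline would remove that risk  */
        return 0;                            /* stand-in for the real domain-register code */
}

unsigned int read_word(const unsigned int *p, unsigned long limit)
{
        register const unsigned int *ptr asm("r0") = p;   /* pinned to r0 */
        register unsigned long lim asm("r1") = limit;     /* pinned to r1 */
        unsigned int flags = save_flags();  /* if this becomes a call, r0/r1 may be clobbered */
        unsigned int val;

        /* consumer of the pinned registers, standing in for the __get_user_N call */
        asm volatile("ldr %0, [%1]" : "=r" (val) : "r" (ptr), "r" (lim));

        (void)flags;
        return val;
}

Marking save_flags() __always_inline and comparing the -Os disassembly of both variants makes the dropped register assignments visible.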
On Tue, Oct 1, 2019 at 1:37 AM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
>
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly") (https://lkml.org/lkml/2019/9/26/825).
>
> I also received a regression report from Nicolas Saenz Julienne
> (https://lkml.org/lkml/2019/9/27/263).
>
> This problem has cropped up on bcm2835_defconfig because it enables
> CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends to prefer not inlining
> functions with -Os. I was able to reproduce it with other boards and
> defconfig files by manually enabling CONFIG_CC_OPTIMIZE_FOR_SIZE.
>
> The __get_user_check() specifically uses r0, r1, r2 registers.
> So, uaccess_save_and_enable() and uaccess_restore() must be inlined.
> Otherwise, those register assignments would be entirely dropped,
> according to my analysis of the disassembly.
>
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
>
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.

No, the C preprocessor is the only guaranteed way of inlining. I
preferred v1; if you're going to <strikethrough>play with
fire</strikethrough>write assembly, don't get burned.

>
> I also added __always_inline to 4 functions in the call-graph from the
> __get_user_check() macro.
>
> Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" <bot@kernelci.org>
> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
>
> Changes in v2:
>   - Use __always_inline instead of changing the function call places
>     (per Russell King)
>   - The previous submission is: https://lore.kernel.org/patchwork/patch/1132459/
>
>  arch/arm/include/asm/domain.h  | 8 ++++----
>  arch/arm/include/asm/uaccess.h | 4 ++--
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
> index 567dbede4785..f1d0a7807cd0 100644
> --- a/arch/arm/include/asm/domain.h
> +++ b/arch/arm/include/asm/domain.h
> @@ -82,7 +82,7 @@
>  #ifndef __ASSEMBLY__
>
>  #ifdef CONFIG_CPU_CP15_MMU
> -static inline unsigned int get_domain(void)
> +static __always_inline unsigned int get_domain(void)
>  {
>         unsigned int domain;
>
> @@ -94,7 +94,7 @@ static inline unsigned int get_domain(void)
>         return domain;
>  }
>
> -static inline void set_domain(unsigned val)
> +static __always_inline void set_domain(unsigned int val)
>  {
>         asm volatile(
>         "mcr    p15, 0, %0, c3, c0      @ set domain"
> @@ -102,12 +102,12 @@ static inline void set_domain(unsigned val)
>         isb();
>  }
>  #else
> -static inline unsigned int get_domain(void)
> +static __always_inline unsigned int get_domain(void)
>  {
>         return 0;
>  }
>
> -static inline void set_domain(unsigned val)
> +static __always_inline void set_domain(unsigned int val)
>  {
>  }
>  #endif
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index 303248e5b990..98c6b91be4a8 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -22,7 +22,7 @@
>   * perform such accesses (eg, via list poison values) which could then
>   * be exploited for priviledge escalation.
>   */
> -static inline unsigned int uaccess_save_and_enable(void)
> +static __always_inline unsigned int uaccess_save_and_enable(void)
>  {
>  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
>         unsigned int old_domain = get_domain();
> @@ -37,7 +37,7 @@ static inline unsigned int uaccess_save_and_enable(void)
>  #endif
>  }
>
> -static inline void uaccess_restore(unsigned int flags)
> +static __always_inline void uaccess_restore(unsigned int flags)
>  {
>  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
>         /* Restore the user access mask */
> --
> 2.17.1
>

--
Thanks,
~Nick Desaulniers
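For context on the objection above: a function-like macro is expanded by the preprocessor before the optimizer runs, so there is no call left for the compiler to decline to inline. A hypothetical macro form of the helper could look like the sketch below. It is not something either v1 or v2 of the patch proposes, and it assumes the usual domain_mask()/domain_val() helpers from asm/domain.h.

/* Hypothetical macro variant of uaccess_save_and_enable() -- illustrative
 * only, not proposed by any version of the patch.  Because the body is
 * pasted in place by the preprocessor, no compiler heuristic can turn it
 * back into a register-clobbering call.
 */
#ifdef CONFIG_CPU_SW_DOMAIN_PAN
#define uaccess_save_and_enable_m(old)                                  \
        do {                                                            \
                (old) = get_domain();                                   \
                set_domain(((old) & ~domain_mask(DOMAIN_USER)) |        \
                           domain_val(DOMAIN_USER, DOMAIN_CLIENT));     \
        } while (0)
#else
#define uaccess_save_and_enable_m(old)  ((old) = 0)
#endif

The trade-off is the usual loss of type checking and readability that comes with macros.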
On Tue, Oct 01, 2019 at 10:03:50AM -0700, Nick Desaulniers wrote:
> On Tue, Oct 1, 2019 at 1:37 AM Masahiro Yamada
> <yamada.masahiro@socionext.com> wrote:
> >
> > KernelCI reports that bcm2835_defconfig is no longer booting since
> > commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> > forcibly") (https://lkml.org/lkml/2019/9/26/825).
> >
> > I also received a regression report from Nicolas Saenz Julienne
> > (https://lkml.org/lkml/2019/9/27/263).
> >
> > This problem has cropped up on bcm2835_defconfig because it enables
> > CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends to prefer not inlining
> > functions with -Os. I was able to reproduce it with other boards and
> > defconfig files by manually enabling CONFIG_CC_OPTIMIZE_FOR_SIZE.
> >
> > The __get_user_check() specifically uses r0, r1, r2 registers.
> > So, uaccess_save_and_enable() and uaccess_restore() must be inlined.
> > Otherwise, those register assignments would be entirely dropped,
> > according to my analysis of the disassembly.
> >
> > Prior to commit 9012d011660e ("compiler: allow all arches to enable
> > CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> > inlining functions, except on x86.
> >
> > Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> > So, __always_inline is now the only guaranteed way of forcible inlining.
>
> No, the C preprocessor is the only guaranteed way of inlining. I
> preferred v1; if you're going to <strikethrough>play with
> fire</strikethrough>write assembly, don't get burned.

It seems we disagree on that.

Masahiro Yamada, please send this to the patch system, thanks.

>
> >
> > I also added __always_inline to 4 functions in the call-graph from the
> > __get_user_check() macro.
> >
> > Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> > Reported-by: "kernelci.org bot" <bot@kernelci.org>
> > Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> > Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> > ---
> >
> > Changes in v2:
> >   - Use __always_inline instead of changing the function call places
> >     (per Russell King)
> >   - The previous submission is: https://lore.kernel.org/patchwork/patch/1132459/
> >
> >  arch/arm/include/asm/domain.h  | 8 ++++----
> >  arch/arm/include/asm/uaccess.h | 4 ++--
> >  2 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
> > index 567dbede4785..f1d0a7807cd0 100644
> > --- a/arch/arm/include/asm/domain.h
> > +++ b/arch/arm/include/asm/domain.h
> > @@ -82,7 +82,7 @@
> >  #ifndef __ASSEMBLY__
> >
> >  #ifdef CONFIG_CPU_CP15_MMU
> > -static inline unsigned int get_domain(void)
> > +static __always_inline unsigned int get_domain(void)
> >  {
> >         unsigned int domain;
> >
> > @@ -94,7 +94,7 @@ static inline unsigned int get_domain(void)
> >         return domain;
> >  }
> >
> > -static inline void set_domain(unsigned val)
> > +static __always_inline void set_domain(unsigned int val)
> >  {
> >         asm volatile(
> >         "mcr    p15, 0, %0, c3, c0      @ set domain"
> > @@ -102,12 +102,12 @@ static inline void set_domain(unsigned val)
> >         isb();
> >  }
> >  #else
> > -static inline unsigned int get_domain(void)
> > +static __always_inline unsigned int get_domain(void)
> >  {
> >         return 0;
> >  }
> >
> > -static inline void set_domain(unsigned val)
> > +static __always_inline void set_domain(unsigned int val)
> >  {
> >  }
> >  #endif
> > diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> > index 303248e5b990..98c6b91be4a8 100644
> > --- a/arch/arm/include/asm/uaccess.h
> > +++ b/arch/arm/include/asm/uaccess.h
> > @@ -22,7 +22,7 @@
> >   * perform such accesses (eg, via list poison values) which could then
> >   * be exploited for priviledge escalation.
> >   */
> > -static inline unsigned int uaccess_save_and_enable(void)
> > +static __always_inline unsigned int uaccess_save_and_enable(void)
> >  {
> >  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
> >         unsigned int old_domain = get_domain();
> > @@ -37,7 +37,7 @@ static inline unsigned int uaccess_save_and_enable(void)
> >  #endif
> >  }
> >
> > -static inline void uaccess_restore(unsigned int flags)
> > +static __always_inline void uaccess_restore(unsigned int flags)
> >  {
> >  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
> >         /* Restore the user access mask */
> > --
> > 2.17.1
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers
>

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
On Wed, Oct 2, 2019 at 5:25 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
> Masahiro Yamada, please send this to the patch system, thanks.
Done. (8908/1)
Thanks.
--
Best Regards
Masahiro Yamada
Hi Nick,

On Wed, Oct 2, 2019 at 2:04 AM Nick Desaulniers <ndesaulniers@google.com> wrote:
> > Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> > So, __always_inline is now the only guaranteed way of forcible inlining.
>
> No, the C preprocessor is the only guaranteed way of inlining.

I do not think this is fatal, if I understood it correctly:
https://lore.kernel.org/patchwork/patch/1122097/#1331784

For GCC, we at least get a warning if a function with __always_inline
is not inlined. I am fine with adding -Werror=attributes to the top
Makefile (unless it has unpleasant side-effects).

You filed the bug for Clang, and hopefully it will be fixed in future
releases?

--
Best Regards
Masahiro Yamada
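For readers unfamiliar with the diagnostic being discussed, here is one -Wattributes case that -Werror=attributes would turn into a hard error. This is illustrative only: the file name is made up and it may not be the exact case examined in the linked thread.

/* attr_demo.c -- illustrative only.
 *   gcc -c -Wattributes attr_demo.c
 *
 * GCC warns "always_inline function might not be inlinable" for bad(),
 * because the attribute is applied without the inline keyword; under
 * -Werror=attributes that warning becomes an error.  good() matches the
 * form the kernel's __always_inline macro expands to and compiles silently.
 */
static __attribute__((always_inline)) int bad(void)
{
        return 0;
}

static inline __attribute__((always_inline)) int good(void)
{
        return 1;
}

int use(void)
{
        return bad() + good();
}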
diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 567dbede4785..f1d0a7807cd0 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -82,7 +82,7 @@
 #ifndef __ASSEMBLY__

 #ifdef CONFIG_CPU_CP15_MMU
-static inline unsigned int get_domain(void)
+static __always_inline unsigned int get_domain(void)
 {
        unsigned int domain;

@@ -94,7 +94,7 @@ static inline unsigned int get_domain(void)
        return domain;
 }

-static inline void set_domain(unsigned val)
+static __always_inline void set_domain(unsigned int val)
 {
        asm volatile(
        "mcr    p15, 0, %0, c3, c0      @ set domain"
@@ -102,12 +102,12 @@ static inline void set_domain(unsigned val)
        isb();
 }
 #else
-static inline unsigned int get_domain(void)
+static __always_inline unsigned int get_domain(void)
 {
        return 0;
 }

-static inline void set_domain(unsigned val)
+static __always_inline void set_domain(unsigned int val)
 {
 }
 #endif
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 303248e5b990..98c6b91be4a8 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -22,7 +22,7 @@
  * perform such accesses (eg, via list poison values) which could then
  * be exploited for priviledge escalation.
  */
-static inline unsigned int uaccess_save_and_enable(void)
+static __always_inline unsigned int uaccess_save_and_enable(void)
 {
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        unsigned int old_domain = get_domain();
@@ -37,7 +37,7 @@ static inline unsigned int uaccess_save_and_enable(void)
 #endif
 }

-static inline void uaccess_restore(unsigned int flags)
+static __always_inline void uaccess_restore(unsigned int flags)
 {
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        /* Restore the user access mask */
KernelCI reports that bcm2835_defconfig is no longer booting since
commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
forcibly") (https://lkml.org/lkml/2019/9/26/825).

I also received a regression report from Nicolas Saenz Julienne
(https://lkml.org/lkml/2019/9/27/263).

This problem has cropped up on bcm2835_defconfig because it enables
CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends to prefer not inlining
functions with -Os. I was able to reproduce it with other boards and
defconfig files by manually enabling CONFIG_CC_OPTIMIZE_FOR_SIZE.

The __get_user_check() specifically uses r0, r1, r2 registers.
So, uaccess_save_and_enable() and uaccess_restore() must be inlined.
Otherwise, those register assignments would be entirely dropped,
according to my analysis of the disassembly.

Prior to commit 9012d011660e ("compiler: allow all arches to enable
CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
inlining functions, except on x86.

Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
So, __always_inline is now the only guaranteed way of forcible inlining.

I also added __always_inline to 4 functions in the call-graph from the
__get_user_check() macro.

Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v2:
  - Use __always_inline instead of changing the function call places
    (per Russell King)
  - The previous submission is: https://lore.kernel.org/patchwork/patch/1132459/

 arch/arm/include/asm/domain.h  | 8 ++++----
 arch/arm/include/asm/uaccess.h | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

--
2.17.1
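As background for the 'inline' vs. __always_inline distinction above, here is a simplified sketch of the mechanism. It is not the kernel's exact include/linux/compiler_types.h definitions, which carry further attributes such as notrace and __gnu_inline.

/* Simplified sketch -- not the kernel's exact compiler_types.h definitions. */
#define __always_inline inline __attribute__((__always_inline__))

#ifndef CONFIG_OPTIMIZE_INLINING
/* Option disabled: every 'inline' carries the always_inline attribute,
 * so plain 'inline' was enough to guarantee inlining.
 */
#define inline  inline __attribute__((__always_inline__))
#endif
/* Option enabled: 'inline' stays a mere hint, which -Os is free to ignore;
 * only functions marked __always_inline, like the ones in this patch,
 * are still guaranteed to be expanded in place.
 */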