
[v2] ARM: add __always_inline to functions called from __get_user_check()

Message ID 20191001083701.27207-1-yamada.masahiro@socionext.com
State Superseded
Series [v2] ARM: add __always_inline to functions called from __get_user_check()

Commit Message

Masahiro Yamada Oct. 1, 2019, 8:37 a.m. UTC
KernelCI reports that bcm2835_defconfig is no longer booting since
commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
forcibly") (https://lkml.org/lkml/2019/9/26/825).

I also received a regression report from Nicolas Saenz Julienne
(https://lkml.org/lkml/2019/9/27/263).

This problem has cropped up on bcm2835_defconfig because it enables
CONFIG_CC_OPTIMIZE_FOR_SIZE. With -Os, the compiler tends not to inline
functions. I was able to reproduce it with other boards and defconfig
files by manually enabling CONFIG_CC_OPTIMIZE_FOR_SIZE.

The __get_user_check() macro specifically uses the r0, r1, and r2
registers, so uaccess_save_and_enable() and uaccess_restore() must be
inlined. Otherwise, those register assignments are entirely dropped,
according to my analysis of the disassembly.
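
For reference, a simplified paraphrase of the pattern __get_user_check()
relies on is shown below (illustrative only; the real macro in
arch/arm/include/asm/uaccess.h differs in details such as the type
handling and the exact arguments of its assembly helper):

/*
 * Simplified sketch, not the verbatim kernel macro.  The pointer, limit,
 * error code and result are pinned to r0-r2 with local register
 * variables because the out-of-line assembly helper expects them in
 * exactly those registers.  If uaccess_save_and_enable() or
 * uaccess_restore() is emitted as a real function call in between, the
 * call itself uses r0-r3, so the compiler may clobber or simply drop the
 * pinned assignments.
 */
#define __get_user_check(x, p)						\
({									\
	unsigned long __limit = current_thread_info()->addr_limit - 1;	\
	register typeof(*(p)) __user *__p asm("r0") = (p);		\
	register unsigned long __r2 asm("r2");				\
	register unsigned long __l asm("r1") = __limit;			\
	register int __e asm("r0");					\
	unsigned int __ua_flags = uaccess_save_and_enable();		\
	__get_user_x(__r2, __p, __e, __l);	/* asm trampoline */	\
	uaccess_restore(__ua_flags);					\
	(x) = __r2;							\
	__e;								\
})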

Prior to commit 9012d011660e ("compiler: allow all arches to enable
CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
inlining functions, except on x86.

Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
So, __always_inline is now the only guaranteed way of forcible inlining.
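
Roughly, the mechanism lives in include/linux/compiler_types.h
(paraphrased sketch, not a verbatim copy of the header):

/*
 * Paraphrased sketch of include/linux/compiler_types.h.  With
 * CONFIG_OPTIMIZE_INLINING disabled, the kernel's 'inline' keyword
 * itself carries the always_inline attribute, so every 'inline'
 * function is force-inlined.  With it enabled, the attribute is dropped
 * and the compiler is free to ignore the hint, which it often does
 * under -Os.
 */
#ifndef CONFIG_OPTIMIZE_INLINING
#define inline inline __attribute__((__always_inline__)) __gnu_inline notrace
#else
#define inline inline __gnu_inline notrace
#endif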

I also added __always_inline to 4 functions in the call-graph from the
__get_user_check() macro.

Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

---

Changes in v2:
  - Use __always_inline instead of changing the call sites
     (per Russell King)
  - The previous submission is: https://lore.kernel.org/patchwork/patch/1132459/

 arch/arm/include/asm/domain.h  | 8 ++++----
 arch/arm/include/asm/uaccess.h | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

-- 
2.17.1

Comments

Nicolas Saenz Julienne Oct. 1, 2019, 9:03 a.m. UTC | #1
On Tue, 2019-10-01 at 17:37 +0900, Masahiro Yamada wrote:
> [...]

Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Nick Desaulniers Oct. 1, 2019, 5:03 p.m. UTC | #2
On Tue, Oct 1, 2019 at 1:37 AM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
> [...]
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.

No, the C preprocessor is the only guaranteed way of inlining.  I
preferred v1; if you're going to <strikethrough>play with
fire</strikethrough>write assembly, don't get burned.
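
For readers following the argument: a function-like macro is expanded
textually by the preprocessor, so the compiler never gets the chance to
decline to inline it.  A hypothetical sketch of that alternative for one
of the helpers (CONFIG_CPU_SW_DOMAIN_PAN case only; this is not what was
merged):

/*
 * Hypothetical macro form of uaccess_save_and_enable() -- not the
 * merged fix, which kept the functions and marked them __always_inline.
 * A macro is expanded in place at every use, so no out-of-line call can
 * be generated regardless of the optimization level.
 */
#define uaccess_save_and_enable()					\
({									\
	unsigned int __old_domain = get_domain();			\
	set_domain((__old_domain & ~domain_mask(DOMAIN_USER)) |	\
		   domain_val(DOMAIN_USER, DOMAIN_CLIENT));		\
	__old_domain;							\
})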

-- 
Thanks,
~Nick Desaulniers
Russell King (Oracle) Oct. 2, 2019, 8:24 a.m. UTC | #3
On Tue, Oct 01, 2019 at 10:03:50AM -0700, Nick Desaulniers wrote:
> On Tue, Oct 1, 2019 at 1:37 AM Masahiro Yamada
> <yamada.masahiro@socionext.com> wrote:
> > [...]
> > Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> > So, __always_inline is now the only guaranteed way of forcible inlining.
>
> No, the C preprocessor is the only guaranteed way of inlining.  I
> preferred v1; if you're going to <strikethrough>play with
> fire</strikethrough>write assembly, don't get burned.

It seems we disagree on that.

Masahiro Yamada, please send this to the patch system, thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Masahiro Yamada Oct. 2, 2019, 10:45 a.m. UTC | #4
On Wed, Oct 2, 2019 at 5:25 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
> Masahiro Yamada, please send this to the patch system, thanks.


Done.  (8908/1)

Thanks.

-- 
Best Regards
Masahiro Yamada
Masahiro Yamada Oct. 2, 2019, 1:01 p.m. UTC | #5
Hi Nick,

On Wed, Oct 2, 2019 at 2:04 AM Nick Desaulniers <ndesaulniers@google.com> wrote:
> > Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> > So, __always_inline is now the only guaranteed way of forcible inlining.
>
> No, the C preprocessor is the only guaranteed way of inlining.

I do not think this is fatal if I understood this correctly:
https://lore.kernel.org/patchwork/patch/1122097/#1331784


For GCC, we at least get a warning if a function with
 __always_inline is not inlined.
I am fine with adding -Werror=attributes to the top Makefile
(unless we have unpleasant side-effects).

You filed the bug for Clang, and hopefully it will be OK
in the future releases?




--
Best Regards
Masahiro Yamada

Patch

diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
index 567dbede4785..f1d0a7807cd0 100644
--- a/arch/arm/include/asm/domain.h
+++ b/arch/arm/include/asm/domain.h
@@ -82,7 +82,7 @@ 
 #ifndef __ASSEMBLY__
 
 #ifdef CONFIG_CPU_CP15_MMU
-static inline unsigned int get_domain(void)
+static __always_inline unsigned int get_domain(void)
 {
 	unsigned int domain;
 
@@ -94,7 +94,7 @@  static inline unsigned int get_domain(void)
 	return domain;
 }
 
-static inline void set_domain(unsigned val)
+static __always_inline void set_domain(unsigned int val)
 {
 	asm volatile(
 	"mcr	p15, 0, %0, c3, c0	@ set domain"
@@ -102,12 +102,12 @@  static inline void set_domain(unsigned val)
 	isb();
 }
 #else
-static inline unsigned int get_domain(void)
+static __always_inline unsigned int get_domain(void)
 {
 	return 0;
 }
 
-static inline void set_domain(unsigned val)
+static __always_inline void set_domain(unsigned int val)
 {
 }
 #endif
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 303248e5b990..98c6b91be4a8 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -22,7 +22,7 @@ 
  * perform such accesses (eg, via list poison values) which could then
  * be exploited for priviledge escalation.
  */
-static inline unsigned int uaccess_save_and_enable(void)
+static __always_inline unsigned int uaccess_save_and_enable(void)
 {
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
 	unsigned int old_domain = get_domain();
@@ -37,7 +37,7 @@  static inline unsigned int uaccess_save_and_enable(void)
 #endif
 }
 
-static inline void uaccess_restore(unsigned int flags)
+static __always_inline void uaccess_restore(unsigned int flags)
 {
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
 	/* Restore the user access mask */