
[v2,1/2] bpf: don't rely on GCC __attribute__((optimize)) to disable GCSE

Message ID 20201028171506.15682-2-ardb@kernel.org
State New
Series get rid of GCC __attribute__((optimize)) for BPF

Commit Message

Ard Biesheuvel Oct. 28, 2020, 5:15 p.m. UTC
Commit 3193c0836 ("bpf: Disable GCC -fgcse optimization for
___bpf_prog_run()") introduced a __no_fgcse macro that expands to a
function scope __attribute__((optimize("-fno-gcse"))), to disable a
GCC specific optimization that was causing trouble on x86 builds, and
was not expected to have any positive effect in the first place.

However, as the GCC manual documents, __attribute__((optimize))
is not for production use, and results in all other optimization
options to be forgotten for the function in question. This can
cause all kinds of trouble, but in one particular reported case,
it causes -fno-asynchronous-unwind-tables to be disregarded,
resulting in .eh_frame info to be emitted for the function.

This reverts commit 3193c0836, and instead, it disables the -fgcse
optimization for the entire source file, but only when building for
X86 using GCC with CONFIG_BPF_JIT_ALWAYS_ON disabled. Note that the
original commit states that CONFIG_RETPOLINE=n triggers the issue,
whereas CONFIG_RETPOLINE=y performs better without the optimization,
so it is kept disabled in both cases.

Fixes: 3193c0836 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()")
Link: https://lore.kernel.org/lkml/CAMuHMdUg0WJHEcq6to0-eODpXPOywLot6UD2=GFHpzoj_hCoBQ@mail.gmail.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 include/linux/compiler-gcc.h   | 2 --
 include/linux/compiler_types.h | 4 ----
 kernel/bpf/Makefile            | 6 +++++-
 kernel/bpf/core.c              | 2 +-
 4 files changed, 6 insertions(+), 8 deletions(-)

Comments

Alexei Starovoitov Oct. 28, 2020, 9:39 p.m. UTC | #1
On Wed, Oct 28, 2020 at 06:15:05PM +0100, Ard Biesheuvel wrote:
> Commit 3193c0836 ("bpf: Disable GCC -fgcse optimization for
> ___bpf_prog_run()") introduced a __no_fgcse macro that expands to a
> function scope __attribute__((optimize("-fno-gcse"))), to disable a
> GCC specific optimization that was causing trouble on x86 builds, and
> was not expected to have any positive effect in the first place.
> 
> However, as the GCC manual documents, __attribute__((optimize))
> is not for production use, and results in all other optimization
> options to be forgotten for the function in question. This can
> cause all kinds of trouble, but in one particular reported case,
> it causes -fno-asynchronous-unwind-tables to be disregarded,
> resulting in .eh_frame info to be emitted for the function.
> 
> This reverts commit 3193c0836, and instead, it disables the -fgcse
> optimization for the entire source file, but only when building for
> X86 using GCC with CONFIG_BPF_JIT_ALWAYS_ON disabled. Note that the
> original commit states that CONFIG_RETPOLINE=n triggers the issue,
> whereas CONFIG_RETPOLINE=y performs better without the optimization,
> so it is kept disabled in both cases.
> 
> Fixes: 3193c0836 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()")
> Link: https://lore.kernel.org/lkml/CAMuHMdUg0WJHEcq6to0-eODpXPOywLot6UD2=GFHpzoj_hCoBQ@mail.gmail.com/
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  include/linux/compiler-gcc.h   | 2 --
>  include/linux/compiler_types.h | 4 ----
>  kernel/bpf/Makefile            | 6 +++++-
>  kernel/bpf/core.c              | 2 +-
>  4 files changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index d1e3c6896b71..5deb37024574 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -175,5 +175,3 @@
>  #else
>  #define __diag_GCC_8(s)
>  #endif
> -
> -#define __no_fgcse __attribute__((optimize("-fno-gcse")))

See my reply in the other thread.
I prefer
-#define __no_fgcse __attribute__((optimize("-fno-gcse")))
+#define __no_fgcse __attribute__((optimize("-fno-gcse,-fno-omit-frame-pointer")))

Potentially with -fno-asynchronous-unwind-tables.

__attribute__((optimize(""))) is not as broken as you're claiming it to be.
It has quirky gcc internal logic, but it's still widely used
in many software projects.
Ard Biesheuvel Oct. 28, 2020, 10:15 p.m. UTC | #2
On Wed, 28 Oct 2020 at 22:39, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Oct 28, 2020 at 06:15:05PM +0100, Ard Biesheuvel wrote:
> > Commit 3193c0836 ("bpf: Disable GCC -fgcse optimization for
> > ___bpf_prog_run()") introduced a __no_fgcse macro that expands to a
> > function scope __attribute__((optimize("-fno-gcse"))), to disable a
> > GCC specific optimization that was causing trouble on x86 builds, and
> > was not expected to have any positive effect in the first place.
> >
> > However, as the GCC manual documents, __attribute__((optimize))
> > is not for production use, and results in all other optimization
> > options to be forgotten for the function in question. This can
> > cause all kinds of trouble, but in one particular reported case,
> > it causes -fno-asynchronous-unwind-tables to be disregarded,
> > resulting in .eh_frame info to be emitted for the function.
> >
> > This reverts commit 3193c0836, and instead, it disables the -fgcse
> > optimization for the entire source file, but only when building for
> > X86 using GCC with CONFIG_BPF_JIT_ALWAYS_ON disabled. Note that the
> > original commit states that CONFIG_RETPOLINE=n triggers the issue,
> > whereas CONFIG_RETPOLINE=y performs better without the optimization,
> > so it is kept disabled in both cases.
> >
> > Fixes: 3193c0836 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()")
> > Link: https://lore.kernel.org/lkml/CAMuHMdUg0WJHEcq6to0-eODpXPOywLot6UD2=GFHpzoj_hCoBQ@mail.gmail.com/
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  include/linux/compiler-gcc.h   | 2 --
> >  include/linux/compiler_types.h | 4 ----
> >  kernel/bpf/Makefile            | 6 +++++-
> >  kernel/bpf/core.c              | 2 +-
> >  4 files changed, 6 insertions(+), 8 deletions(-)
> >
> > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> > index d1e3c6896b71..5deb37024574 100644
> > --- a/include/linux/compiler-gcc.h
> > +++ b/include/linux/compiler-gcc.h
> > @@ -175,5 +175,3 @@
> >  #else
> >  #define __diag_GCC_8(s)
> >  #endif
> > -
> > -#define __no_fgcse __attribute__((optimize("-fno-gcse")))
>
> See my reply in the other thread.
> I prefer
> -#define __no_fgcse __attribute__((optimize("-fno-gcse")))
> +#define __no_fgcse __attribute__((optimize("-fno-gcse,-fno-omit-frame-pointer")))
>
> Potentially with -fno-asynchronous-unwind-tables.
>

So how would that work? arm64 has the following:

KBUILD_CFLAGS += -fno-asynchronous-unwind-tables -fno-unwind-tables

ifeq ($(CONFIG_SHADOW_CALL_STACK), y)
KBUILD_CFLAGS += -ffixed-x18
endif

and it adds -fpatchable-function-entry=2 for compilers that support
it, but only when CONFIG_FTRACE is enabled.

Also, as Nick pointed out, -fno-gcse does not work on Clang.

Every architecture will have a different set of requirements here. And
there is no way of knowing which -f options are disregarded when you
use the function attribute.

So how on earth are you going to #define __no_fgcse correctly for
every configuration imaginable?

> __attribute__((optimize(""))) is not as broken as you're claiming it to be.
> It has quirky gcc internal logic, but it's still widely used
> in many software projects.

So it's fine because it is only a little bit broken? I'm sorry, but
that makes no sense whatsoever.

If you insist on sticking with this broken construct, can you please
make it GCC/x86-only at least?
Alexei Starovoitov Oct. 28, 2020, 10:59 p.m. UTC | #3
On Wed, Oct 28, 2020 at 11:15:04PM +0100, Ard Biesheuvel wrote:
> On Wed, 28 Oct 2020 at 22:39, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Oct 28, 2020 at 06:15:05PM +0100, Ard Biesheuvel wrote:
> > > Commit 3193c0836 ("bpf: Disable GCC -fgcse optimization for
> > > ___bpf_prog_run()") introduced a __no_fgcse macro that expands to a
> > > function scope __attribute__((optimize("-fno-gcse"))), to disable a
> > > GCC specific optimization that was causing trouble on x86 builds, and
> > > was not expected to have any positive effect in the first place.
> > >
> > > However, as the GCC manual documents, __attribute__((optimize))
> > > is not for production use, and results in all other optimization
> > > options to be forgotten for the function in question. This can
> > > cause all kinds of trouble, but in one particular reported case,
> > > it causes -fno-asynchronous-unwind-tables to be disregarded,
> > > resulting in .eh_frame info to be emitted for the function.
> > >
> > > This reverts commit 3193c0836, and instead, it disables the -fgcse
> > > optimization for the entire source file, but only when building for
> > > X86 using GCC with CONFIG_BPF_JIT_ALWAYS_ON disabled. Note that the
> > > original commit states that CONFIG_RETPOLINE=n triggers the issue,
> > > whereas CONFIG_RETPOLINE=y performs better without the optimization,
> > > so it is kept disabled in both cases.
> > >
> > > Fixes: 3193c0836 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()")
> > > Link: https://lore.kernel.org/lkml/CAMuHMdUg0WJHEcq6to0-eODpXPOywLot6UD2=GFHpzoj_hCoBQ@mail.gmail.com/
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > > ---
> > >  include/linux/compiler-gcc.h   | 2 --
> > >  include/linux/compiler_types.h | 4 ----
> > >  kernel/bpf/Makefile            | 6 +++++-
> > >  kernel/bpf/core.c              | 2 +-
> > >  4 files changed, 6 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> > > index d1e3c6896b71..5deb37024574 100644
> > > --- a/include/linux/compiler-gcc.h
> > > +++ b/include/linux/compiler-gcc.h
> > > @@ -175,5 +175,3 @@
> > >  #else
> > >  #define __diag_GCC_8(s)
> > >  #endif
> > > -
> > > -#define __no_fgcse __attribute__((optimize("-fno-gcse")))
> >
> > See my reply in the other thread.
> > I prefer
> > -#define __no_fgcse __attribute__((optimize("-fno-gcse")))
> > +#define __no_fgcse __attribute__((optimize("-fno-gcse,-fno-omit-frame-pointer")))
> >
> > Potentially with -fno-asynchronous-unwind-tables.
> >
>
> So how would that work? arm64 has the following:
>
> KBUILD_CFLAGS += -fno-asynchronous-unwind-tables -fno-unwind-tables
>
> ifeq ($(CONFIG_SHADOW_CALL_STACK), y)
> KBUILD_CFLAGS += -ffixed-x18
> endif
>
> and it adds -fpatchable-function-entry=2 for compilers that support
> it, but only when CONFIG_FTRACE is enabled.

I think you're assuming that GCC drops all flags when it sees __attribute__((optimize)).
That's not the case.

> Also, as Nick pointed out, -fno-gcse does not work on Clang.


Yes, and what's the point?
#define __no_fgcse is GCC only. Clang doesn't need this workaround.

> Every architecture will have a different set of requirements here. And
> there is no way of knowing which -f options are disregarded when you
> use the function attribute.
>
> So how on earth are you going to #define __no_fgcse correctly for
> every configuration imaginable?
>
> > __attribute__((optimize(""))) is not as broken as you're claiming it to be.
> > It has quirky gcc internal logic, but it's still widely used
> > in many software projects.
>
> So it's fine because it is only a little bit broken? I'm sorry, but
> that makes no sense whatsoever.
>
> If you insist on sticking with this broken construct, can you please
> make it GCC/x86-only at least?


I'm totally fine with making
#define __no_fgcse __attribute__((optimize("-fno-gcse,-fno-omit-frame-pointer")))
to be gcc+x86 only.
I'd like to get rid of it, but objtool is not smart enough to understand
generated asm without it.
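
A minimal sketch of what restricting the attribute to GCC on x86 could look like. The macro name demo_no_fgcse and the function demo_sum() are hypothetical and are not taken from the kernel tree; only the attribute string comes from the #define quoted above. On every other compiler or architecture the macro expands to nothing, in the same spirit as the empty fallback in compiler_types.h.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for __no_fgcse (the name is an assumption).
 * Clang also defines __GNUC__, so it has to be excluded explicitly.
 */
#if defined(__GNUC__) && !defined(__clang__) && \
	(defined(__x86_64__) || defined(__i386__))
#define demo_no_fgcse \
	__attribute__((optimize("-fno-gcse,-fno-omit-frame-pointer")))
#else
#define demo_no_fgcse	/* expands to nothing elsewhere */
#endif

/* Illustrative function only; any function could carry the annotation. */
static demo_no_fgcse long demo_sum(const long *vals, size_t n)
{
	long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += vals[i];
	return total;
}
```

This only narrows where the attribute is applied. It does not settle the question raised earlier in the thread about which command-line options GCC disregards for a function that carries the attribute.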
Geert Uytterhoeven Oct. 29, 2020, 8:25 a.m. UTC | #4
On Wed, Oct 28, 2020 at 6:15 PM Ard Biesheuvel <ardb@kernel.org> wrote:
> Commit 3193c0836 ("bpf: Disable GCC -fgcse optimization for
> ___bpf_prog_run()") introduced a __no_fgcse macro that expands to a
> function scope __attribute__((optimize("-fno-gcse"))), to disable a
> GCC specific optimization that was causing trouble on x86 builds, and
> was not expected to have any positive effect in the first place.
>
> However, as the GCC manual documents, __attribute__((optimize))
> is not for production use, and results in all other optimization
> options to be forgotten for the function in question. This can
> cause all kinds of trouble, but in one particular reported case,
> it causes -fno-asynchronous-unwind-tables to be disregarded,
> resulting in .eh_frame info to be emitted for the function.
>
> This reverts commit 3193c0836, and instead, it disables the -fgcse
> optimization for the entire source file, but only when building for
> X86 using GCC with CONFIG_BPF_JIT_ALWAYS_ON disabled. Note that the
> original commit states that CONFIG_RETPOLINE=n triggers the issue,
> whereas CONFIG_RETPOLINE=y performs better without the optimization,
> so it is kept disabled in both cases.
>
> Fixes: 3193c0836 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()")
> Link: https://lore.kernel.org/lkml/CAMuHMdUg0WJHEcq6to0-eODpXPOywLot6UD2=GFHpzoj_hCoBQ@mail.gmail.com/
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>


(probably you missed my tag on v1 due to kernel.org hiccups)

Thanks, this gets rid of the following warning, which you may
want to quote in the patch description:

    aarch64-linux-gnu-ld: warning: orphan section `.eh_frame' from `kernel/bpf/core.o' being placed in section `.eh_frame'

Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>


Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Nick Desaulniers Oct. 30, 2020, 12:34 a.m. UTC | #5
On Wed, Oct 28, 2020 at 10:15 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Commit 3193c0836 ("bpf: Disable GCC -fgcse optimization for
> ___bpf_prog_run()") introduced a __no_fgcse macro that expands to a
> function scope __attribute__((optimize("-fno-gcse"))), to disable a
> GCC specific optimization that was causing trouble on x86 builds, and
> was not expected to have any positive effect in the first place.
>
> However, as the GCC manual documents, __attribute__((optimize))
> is not for production use, and results in all other optimization
> options to be forgotten for the function in question. This can
> cause all kinds of trouble, but in one particular reported case,
> it causes -fno-asynchronous-unwind-tables to be disregarded,
> resulting in .eh_frame info to be emitted for the function.
>
> This reverts commit 3193c0836, and instead, it disables the -fgcse
> optimization for the entire source file, but only when building for
> X86 using GCC with CONFIG_BPF_JIT_ALWAYS_ON disabled. Note that the
> original commit states that CONFIG_RETPOLINE=n triggers the issue,
> whereas CONFIG_RETPOLINE=y performs better without the optimization,
> so it is kept disabled in both cases.
>
> Fixes: 3193c0836 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()")
> Link: https://lore.kernel.org/lkml/CAMuHMdUg0WJHEcq6to0-eODpXPOywLot6UD2=GFHpzoj_hCoBQ@mail.gmail.com/
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  include/linux/compiler-gcc.h   | 2 --
>  include/linux/compiler_types.h | 4 ----
>  kernel/bpf/Makefile            | 6 +++++-
>  kernel/bpf/core.c              | 2 +-
>  4 files changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index d1e3c6896b71..5deb37024574 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -175,5 +175,3 @@
>  #else
>  #define __diag_GCC_8(s)
>  #endif
> -
> -#define __no_fgcse __attribute__((optimize("-fno-gcse")))
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 6e390d58a9f8..ac3fa37a84f9 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -247,10 +247,6 @@ struct ftrace_likely_data {
>  #define asm_inline asm
>  #endif
>
> -#ifndef __no_fgcse
> -# define __no_fgcse
> -#endif
> -
>  /* Are two types/vars the same type (ignoring qualifiers)? */
>  #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
>
> diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
> index bdc8cd1b6767..c1b9f71ee6aa 100644
> --- a/kernel/bpf/Makefile
> +++ b/kernel/bpf/Makefile
> @@ -1,6 +1,10 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-y := core.o
> -CFLAGS_core.o += $(call cc-disable-warning, override-init)
> +ifneq ($(CONFIG_BPF_JIT_ALWAYS_ON),y)
> +# ___bpf_prog_run() needs GCSE disabled on x86; see 3193c0836f203 for details
> +cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse
> +endif
> +CFLAGS_core.o += $(call cc-disable-warning, override-init) $(cflags-nogcse-yy)


Writing multiple conditions in a conditional block in GNU make is
painful, hence the double `y` trick.  I feel like either three nested
conditionals (one each for CONFIG_BPF_JIT_ALWAYS_ON, CONFIG_X86, and
CONFIG_CC_IS_GCC) or a key built from three `y`s would have been
clearer than mixing an `if` with a double `y`, but regardless of
what color I think we should paint the bikeshed:

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>


This also doesn't resolve all issues here, but is a step in the right
direction, IMO.

>
>  obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o bpf_iter.o map_iter.o task_iter.o prog_iter.o
>  obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 9268d77898b7..55454d2278b1 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -1369,7 +1369,7 @@ u64 __weak bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
>   *
>   * Decode and execute eBPF instructions.
>   */
> -static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
> +static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
>  {
>  #define BPF_INSN_2_LBL(x, y)    [BPF_##x | BPF_##y] = &&x##_##y
>  #define BPF_INSN_3_LBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = &&x##_##y##_##z
> --
> 2.17.1
>


-- 
Thanks,
~Nick Desaulniers

Patch

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index d1e3c6896b71..5deb37024574 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -175,5 +175,3 @@ 
 #else
 #define __diag_GCC_8(s)
 #endif
-
-#define __no_fgcse __attribute__((optimize("-fno-gcse")))
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 6e390d58a9f8..ac3fa37a84f9 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -247,10 +247,6 @@  struct ftrace_likely_data {
 #define asm_inline asm
 #endif
 
-#ifndef __no_fgcse
-# define __no_fgcse
-#endif
-
 /* Are two types/vars the same type (ignoring qualifiers)? */
 #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
 
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index bdc8cd1b6767..c1b9f71ee6aa 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -1,6 +1,10 @@ 
 # SPDX-License-Identifier: GPL-2.0
 obj-y := core.o
-CFLAGS_core.o += $(call cc-disable-warning, override-init)
+ifneq ($(CONFIG_BPF_JIT_ALWAYS_ON),y)
+# ___bpf_prog_run() needs GCSE disabled on x86; see 3193c0836f203 for details
+cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse
+endif
+CFLAGS_core.o += $(call cc-disable-warning, override-init) $(cflags-nogcse-yy)
 
 obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o bpf_iter.o map_iter.o task_iter.o prog_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 9268d77898b7..55454d2278b1 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1369,7 +1369,7 @@  u64 __weak bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
  *
  * Decode and execute eBPF instructions.
  */
-static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
+static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
 {
 #define BPF_INSN_2_LBL(x, y)    [BPF_##x | BPF_##y] = &&x##_##y
 #define BPF_INSN_3_LBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = &&x##_##y##_##z
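
For context, the `&&label` initializers in the two macros above populate a jump table for GCC's labels-as-values extension (computed goto). The GCC manual notes that programs using computed gotos may get better run-time performance with -fno-gcse, which is the background for the flag this patch moves into the Makefile. A much-reduced sketch of the dispatch technique follows. The opcode names, struct demo_insn and demo_run() are hypothetical and are not the kernel's interpreter, and the sketch assumes every opcode it is given is in range.

```c
#include <stdint.h>

/* Hypothetical three-opcode instruction set, for illustration only. */
enum demo_op { DEMO_ADD, DEMO_MUL, DEMO_EXIT, DEMO_OP_COUNT };

struct demo_insn {
	uint8_t op;	/* one of enum demo_op; assumed valid */
	int64_t imm;	/* immediate operand */
};

static int64_t demo_run(const struct demo_insn *insn, int64_t acc)
{
	/* Jump table indexed by opcode, built from label addresses. */
	static const void * const jumptable[DEMO_OP_COUNT] = {
		[DEMO_ADD]  = &&op_add,
		[DEMO_MUL]  = &&op_mul,
		[DEMO_EXIT] = &&op_exit,
	};

/* Jump straight to the handler of the current instruction. */
#define DEMO_DISPATCH() goto *jumptable[insn->op]

	DEMO_DISPATCH();

op_add:
	acc += insn->imm;
	insn++;
	DEMO_DISPATCH();
op_mul:
	acc *= insn->imm;
	insn++;
	DEMO_DISPATCH();
op_exit:
	return acc;

#undef DEMO_DISPATCH
}
```

Each handler ends with its own indirect jump instead of returning to a central switch, which is the code shape that the -fno-gcse note in the GCC manual is about.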