Message ID | 20180926050355.32746-9-richard.henderson@linaro.org
State      | New
Series     | LSE atomics out-of-line
* rth:

> diff --git a/libgcc/config/aarch64/lse.c b/libgcc/config/aarch64/lse.c
> new file mode 100644
> index 00000000000..20f4bde741f
> --- /dev/null
> +++ b/libgcc/config/aarch64/lse.c

> +static void __attribute__((constructor))
> +init_have_atomics(void)
> +{
> +  unsigned long hwcap = getauxval(AT_HWCAP);
> +  __aa64_have_atomics = (hwcap & HWCAP_ATOMICS) != 0;
> +}

Is there an expectation that it is possible to use the atomics in IFUNC
resolvers?  Then this needs an explanation why it is safe to run with
the other kind of atomics until the initialization of
__aa64_have_atomics has happened.

(GNU style requires a space before a parenthesis, at least in a function
call or function declarator.)

Thanks,
Florian
On 9/26/18 1:59 AM, Florian Weimer wrote:
> * rth:
>
>> diff --git a/libgcc/config/aarch64/lse.c b/libgcc/config/aarch64/lse.c
>> new file mode 100644
>> index 00000000000..20f4bde741f
>> --- /dev/null
>> +++ b/libgcc/config/aarch64/lse.c
>
>> +static void __attribute__((constructor))
>> +init_have_atomics(void)
>> +{
>> +  unsigned long hwcap = getauxval(AT_HWCAP);
>> +  __aa64_have_atomics = (hwcap & HWCAP_ATOMICS) != 0;
>> +}
>
> Is there an expectation that it is possible to use the atomics in IFUNC
> resolvers?  Then this needs an explanation why it is safe to run with
> the other kind of atomics until the initialization of
> __aa64_have_atomics has happened.

Yes.  The explanation is simple, in that the !have_atomics path is
also atomic.  It will simply use the slower load/store-exclusive path.

Perhaps, despite the official ARMv8.1-Atomics name, LSE was in fact a
better choice for a name after all, as its lack does not imply a lack
of atomicity.  And a comment, to be sure.

> (GNU style requires a space before a parenthesis, at least in a function
> call or function declarator.)

Yes, of course.  It's no longer automatic for my fingers and eyes.

r~
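To spell out the safety argument Richard gives, here is a minimal sketch,
not from the patch as posted: the gate variable lives in .bss and therefore
reads as 0 until init_have_atomics runs, so any helper called before the
constructor (for example from an IFUNC resolver) falls through to the
load/store-exclusive sequence, which implements the same atomic operation,
only more slowly.  The function name swap8_relax is hypothetical; the two
code paths mirror the patch's swp helper for SIZE == 8, MODEL == 1.

    /* Zero before constructors run, so early callers take the LL/SC path.  */
    _Bool __aa64_have_atomics;

    asm (".arch armv8-a+lse");          /* let the assembler accept SWP */

    /* Hypothetical 8-byte relaxed swap, mirroring the patch's two paths.  */
    unsigned long long
    swap8_relax (unsigned long long new_val, unsigned long long *ptr)
    {
      unsigned long long old;
      unsigned tmp;

      if (__builtin_expect (__aa64_have_atomics, 0))
        /* LSE path: a single instruction, reached only after the HWCAP
           check has set the gate variable.  */
        __asm__ ("swp %2, %0, %1"
                 : "=r" (old), "+m" (*ptr) : "r" (new_val));
      else
        /* Pre-constructor (and non-LSE) path: exclusive-monitor loop,
           equally atomic.  */
        __asm__ ("0: ldxr %0, %1\n\t"
                 "stxr %w2, %3, %1\n\t"
                 "cbnz %w2, 0b"
                 : "=&r" (old), "+m" (*ptr), "=&r" (tmp) : "r" (new_val));
      return old;
    }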
* Richard Henderson:

> On 9/26/18 1:59 AM, Florian Weimer wrote:
>> Is there an expectation that it is possible to use the atomics in IFUNC
>> resolvers?  Then this needs an explanation why it is safe to run with
>> the other kind of atomics until the initialization of
>> __aa64_have_atomics has happened.
>
> Yes.  The explanation is simple, in that the !have_atomics path is
> also atomic.  It will simply use the slower load/store-exclusive path.
>
> Perhaps, despite the official ARMv8.1-Atomics name, LSE was in fact a
> better choice for a name after all, as its lack does not imply a lack
> of atomicity.  And a comment, to be sure.

That's not what I meant.  I'm curious whether LSE and non-LSE atomics on
the same location will still result in the expected memory ordering.  If
they don't, then this requires *some* explanation why this is okay.

Thanks,
Florian
On 9/26/18 7:33 AM, Florian Weimer wrote:
> That's not what I meant.  I'm curious whether LSE and non-LSE atomics on
> the same location will still result in the expected memory ordering.  If
> they don't, then this requires *some* explanation why this is okay.

Yes, they interoperate just fine.

r~
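One way to see the scenario Florian is asking about, as a sketch rather
than anything from the patch: compile the same fetch-and-add twice, once
with and once without LSE enabled, and let the two objects race on one
word.  The builtin is standard GCC; the two-translation-unit build setup
is the assumed scenario, and the claim under discussion is that both
resulting instruction sequences stay mutually atomic and correctly
ordered on the same location.

    /* interop.c -- build one copy with -march=armv8.1-a (GCC emits LDADDAL)
       and one with -march=armv8-a (GCC emits an LDAXR/STLXR loop).  */
    #include <stdint.h>

    uint32_t
    bump (uint32_t *p)
    {
      return __atomic_fetch_add (p, 1, __ATOMIC_ACQ_REL);
    }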
On 26/09/2018 06:03, rth7680@gmail.com wrote:
> From: Richard Henderson <richard.henderson@linaro.org>
>
> This is the libgcc part of the interface -- providing the functions.
> Rationale is provided at the top of libgcc/config/aarch64/lse.c.
>
> 	* config/aarch64/lse.c: New file.
> 	* config/aarch64/t-lse: New file.
> 	* config.host: Add t-lse to all aarch64 tuples.
>
> diff --git a/libgcc/config/aarch64/lse.c b/libgcc/config/aarch64/lse.c
> new file mode 100644
> index 00000000000..20f4bde741f
> --- /dev/null
> +++ b/libgcc/config/aarch64/lse.c

[snip]

> +/* Define or declare the symbol gating the LSE implementations.  */
> +#ifndef L_have_atomics
> +extern
> +#endif
> +_Bool __aa64_have_atomics __attribute__((visibility("hidden"), nocommon));

This needs to be able to build against glibc versions that do not have
HWCAP_ATOMICS available in the headers.  Thus initialize to 0?

[snip]

> +#ifdef L_have_atomics
> +/* Disable initialization of __aa64_have_atomics during bootstrap.  */
> +# ifndef inhibit_libc
> +#  include <sys/auxv.h>
> +
> +static void __attribute__((constructor))
> +init_have_atomics(void)
> +{
> +  unsigned long hwcap = getauxval(AT_HWCAP);
> +  __aa64_have_atomics = (hwcap & HWCAP_ATOMICS) != 0;
> +}

And then have the constructor run only when HWCAP_ATOMICS is defined?
I.e.:

  #ifdef HWCAP_ATOMICS
  /* constructor */
  #endif

> +# endif /* inhibit_libc */
> +#else
> +
> +/* Tell the assembler to accept LSE instructions.  */
> +asm(".arch armv8-a+lse");

Thankfully I think this is now well supported in most places, so we
don't need probe tests for it. :)  The tests that I'm running now will
be good enough for an armv8-a run.  I need to track down a machine
internally with the right sort of headers for a v8.1-A run; that's not
going to happen till next week.

regards
Ramana

[rest of quoted patch snipped; full diff below]
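Spelling out Ramana's two suggestions together, a sketch only: the
explicit zero initializer and the #ifdef HWCAP_ATOMICS guard are his
review proposals, not the posted patch, and Florian's space-before-paren
nit is applied as well.

    /* Explicit zero init so the object still builds and behaves the same
       against old glibc headers.  */
    _Bool __aa64_have_atomics = 0;

    # ifndef inhibit_libc
    #  include <sys/auxv.h>
    #  ifdef HWCAP_ATOMICS          /* skip the constructor entirely when
                                       the headers predate the macro */
    static void __attribute__ ((constructor))
    init_have_atomics (void)
    {
      unsigned long hwcap = getauxval (AT_HWCAP);
      __aa64_have_atomics = (hwcap & HWCAP_ATOMICS) != 0;
    }
    #  endif /* HWCAP_ATOMICS */
    # endif /* inhibit_libc */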
From: Richard Henderson <richard.henderson@linaro.org>

This is the libgcc part of the interface -- providing the functions.
Rationale is provided at the top of libgcc/config/aarch64/lse.c.

	* config/aarch64/lse.c: New file.
	* config/aarch64/t-lse: New file.
	* config.host: Add t-lse to all aarch64 tuples.
---
 libgcc/config/aarch64/lse.c | 258 ++++++++++++++++++++++++++++++++++++
 libgcc/config.host          |   4 +
 libgcc/config/aarch64/t-lse |  44 ++++++
 3 files changed, 306 insertions(+)
 create mode 100644 libgcc/config/aarch64/lse.c
 create mode 100644 libgcc/config/aarch64/t-lse

diff --git a/libgcc/config/aarch64/lse.c b/libgcc/config/aarch64/lse.c
new file mode 100644
index 00000000000..20f4bde741f
--- /dev/null
+++ b/libgcc/config/aarch64/lse.c
@@ -0,0 +1,258 @@
+/* Out-of-line LSE atomics for AArch64 architecture.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/*
+ * The problem that we are trying to solve is operating system deployment
+ * of ARMv8.1-Atomics, also known as Large System Extensions (LSE).
+ *
+ * There are a number of potential solutions for this problem which have
+ * been proposed and rejected for various reasons.  To recap:
+ *
+ * (1) Multiple builds.  The dynamic linker will examine /lib64/atomics/
+ * if HWCAP_ATOMICS is set, allowing entire libraries to be overwritten.
+ * However, not all Linux distributions are happy with multiple builds,
+ * and anyway it has no effect on main applications.
+ *
+ * (2) IFUNC.  We could put these functions into libgcc_s.so, and have
+ * a single copy of each function for all DSOs.  However, ARM is concerned
+ * that the branch-to-indirect-branch that is implied by using a PLT,
+ * as required by IFUNC, is too much overhead for smaller cpus.
+ *
+ * (3) Statically predicted direct branches.  This is the approach that
+ * is taken here.  These functions are linked into every DSO that uses them.
+ * All of the symbols are hidden, so that the functions are called via a
+ * direct branch.  The choice of LSE vs non-LSE is done via one byte load
+ * followed by a well-predicted direct branch.  The functions are compiled
+ * separately to minimize code size.
+ */
+
+/* Define or declare the symbol gating the LSE implementations.  */
+#ifndef L_have_atomics
+extern
+#endif
+_Bool __aa64_have_atomics __attribute__((visibility("hidden"), nocommon));
+
+/* The branch controlled by this test should be easily predicted, in that
+   it will, after constructors, always branch the same way.  The expectation
+   is that systems that implement ARMv8.1-Atomics are "beefier" than those
+   that omit the extension.  By arranging for the fall-through path to use
+   load-store-exclusive insns, we aid the branch predictor of the
+   smallest cpus.  */
+#define have_atomics  __builtin_expect(__aa64_have_atomics, 0)
+
+#ifdef L_have_atomics
+/* Disable initialization of __aa64_have_atomics during bootstrap.  */
+# ifndef inhibit_libc
+#  include <sys/auxv.h>
+
+static void __attribute__((constructor))
+init_have_atomics(void)
+{
+  unsigned long hwcap = getauxval(AT_HWCAP);
+  __aa64_have_atomics = (hwcap & HWCAP_ATOMICS) != 0;
+}
+# endif /* inhibit_libc */
+#else
+
+/* Tell the assembler to accept LSE instructions.  */
+asm(".arch armv8-a+lse");
+
+/* Turn size and memory model defines into mnemonic fragments.  */
+#if SIZE == 1
+# define S     "b"
+# define MASK  ", uxtb"
+#elif SIZE == 2
+# define S     "h"
+# define MASK  ", uxth"
+#elif SIZE == 4 || SIZE == 8
+# define S     ""
+# define MASK  ""
+#else
+# error
+#endif
+
+#if SIZE < 8
+# define T  unsigned int
+# define W  "w"
+#else
+# define T  unsigned long long
+# define W  ""
+#endif
+
+#if MODEL == 1
+# define SUFF  _relax
+# define A     ""
+# define L     ""
+#elif MODEL == 2
+# define SUFF  _acq
+# define A     "a"
+# define L     ""
+#elif MODEL == 3
+# define SUFF  _rel
+# define A     ""
+# define L     "l"
+#elif MODEL == 4
+# define SUFF  _acq_rel
+# define A     "a"
+# define L     "l"
+#else
+# error
+#endif
+
+#define NAME2(B, S, X)  __aa64_ ## B ## S ## X
+#define NAME1(B, S, X)  NAME2(B, S, X)
+#define NAME(BASE)      NAME1(BASE, SIZE, SUFF)
+
+#define str1(S)  #S
+#define str(S)   str1(S)
+
+#ifdef L_cas
+T NAME(cas)(T cmp, T new, T *ptr) __attribute__((visibility("hidden")));
+T NAME(cas)(T cmp, T new, T *ptr)
+{
+  T old;
+  unsigned tmp;
+
+  if (have_atomics)
+    __asm__("cas" A L S " %"W"0, %"W"2, %1"
+	    : "=r"(old), "+m"(*ptr) : "r"(new), "0"(cmp));
+  else
+    __asm__(
+	"0: "
+	"ld" A "xr"S" %"W"0, %1\n\t"
+	"cmp %"W"0, %"W"4" MASK "\n\t"
+	"bne 1f\n\t"
+	"st" L "xr"S" %w2, %"W"3, %1\n\t"
+	"cbnz %w2, 0b\n"
+	"1:"
+	: "=&r"(old), "+m"(*ptr), "=&r"(tmp) : "r"(new), "r"(cmp));
+
+  return old;
+}
+#endif
+
+#ifdef L_swp
+T NAME(swp)(T new, T *ptr) __attribute__((visibility("hidden")));
+T NAME(swp)(T new, T *ptr)
+{
+  T old;
+  unsigned tmp;
+
+  if (have_atomics)
+    __asm__("swp" A L S " %"W"2, %"W"0, %1"
+	    : "=r"(old), "+m"(*ptr) : "r"(new));
+  else
+    __asm__(
+	"0: "
+	"ld" A "xr"S" %"W"0, %1\n\t"
+	"st" L "xr"S" %w2, %"W"3, %1\n\t"
+	"cbnz %w2, 0b\n"
+	"1:"
+	: "=&r"(old), "+m"(*ptr), "=&r"(tmp) : "r"(new));
+
+  return old;
+}
+#endif
+
+#if defined(L_ldadd) || defined(L_ldclr) \
+    || defined(L_ldeor) || defined(L_ldset)
+
+#ifdef L_ldadd
+#define LDOP  ldadd
+#define OP    add
+#elif defined(L_ldclr)
+#define LDOP  ldclr
+#define OP    bic
+#elif defined(L_ldeor)
+#define LDOP  ldeor
+#define OP    eor
+#elif defined(L_ldset)
+#define LDOP  ldset
+#define OP    orr
+#else
+#error
+#endif
+
+T NAME(LDOP)(T val, T *ptr) __attribute__((visibility("hidden")));
+T NAME(LDOP)(T val, T *ptr)
+{
+  T old;
+  unsigned tmp;
+
+  if (have_atomics)
+    __asm__(str(LDOP) A L S " %"W"2, %"W"0, %1"
+	    : "=r"(old), "+m"(*ptr) : "r"(val));
+  else
+    __asm__(
+	"0: "
+	"ld" A "xr"S" %"W"0, %1\n\t"
+	str(OP) " %"W"2, %"W"0, %"W"3\n\t"
+	"st" L "xr"S" %w2, %"W"2, %1\n\t"
+	"cbnz %w2, 0b\n"
+	"1:"
+	: "=&r"(old), "+m"(*ptr), "=&r"(tmp) : "r"(val));
+
+  return old;
+}
+#endif
+
+#if defined(L_stadd) || defined(L_stclr) \
+    || defined(L_steor) || defined(L_stset)
+
+#ifdef L_stadd
+#define STOP  stadd
+#define OP    add
+#elif defined(L_stclr)
+#define STOP  stclr
+#define OP    bic
+#elif defined(L_steor)
+#define STOP  steor
+#define OP    eor
+#elif defined(L_stset)
+#define STOP  stset
+#define OP    orr
+#else
+#error
+#endif
+
+void NAME(STOP)(T val, T *ptr) __attribute__((visibility("hidden")));
+void NAME(STOP)(T val, T *ptr)
+{
+  unsigned tmp;
+
+  if (have_atomics)
+    __asm__(str(STOP) L S " %"W"1, %0" : "+m"(*ptr) : "r"(val));
+  else
+    __asm__(
+	"0: "
+	"ldxr"S" %"W"1, %0\n\t"
+	str(OP) " %"W"1, %"W"1, %"W"2\n\t"
+	"st" L "xr"S" %w1, %"W"1, %0\n\t"
+	"cbnz %w1, 0b\n"
+	"1:"
+	: "+m"(*ptr), "=&r"(tmp) : "r"(val));
+}
+#endif
+#endif /* L_have_atomics */
diff --git a/libgcc/config.host b/libgcc/config.host
index 029f6569caf..2c4a05d69c5 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -340,23 +340,27 @@ aarch64*-*-elf | aarch64*-*-rtems*)
 	extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/aarch64-unwind.h
 	;;
 aarch64*-*-freebsd*)
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/freebsd-unwind.h
 	;;
 aarch64*-*-fuchsia*)
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp"
 	;;
 aarch64*-*-linux*)
 	extra_parts="$extra_parts crtfastmath.o"
 	md_unwind_header=aarch64/linux-unwind.h
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	;;
 alpha*-*-linux*)
diff --git a/libgcc/config/aarch64/t-lse b/libgcc/config/aarch64/t-lse
new file mode 100644
index 00000000000..e862b0c2448
--- /dev/null
+++ b/libgcc/config/aarch64/t-lse
@@ -0,0 +1,44 @@
+# Out-of-line LSE atomics for AArch64 architecture.
+# Copyright (C) 2018 Free Software Foundation, Inc.
+# Contributed by Linaro Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# CAS, Swap, Load-and-operate have 4 sizes and 4 memory models.
+S1 := $(foreach s, 1 2 4 8, $(addsuffix _$(s), cas swp ldadd ldclr ldeor ldset))
+O1 := $(foreach m, 1 2 3 4, $(addsuffix _$(m)$(objext), $(S1)))
+
+# Store-and-operate has 4 sizes but only 2 memory models (relaxed, release).
+S2 := $(foreach s, 1 2 4 8, $(addsuffix _$(s), stadd stclr steor stset))
+O2 := $(foreach m, 1 3, $(addsuffix _$(m)$(objext), $(S2)))
+
+LSE_OBJS := $(O1) $(O2)
+
+libgcc-objects += $(LSE_OBJS) have_atomic$(objext)
+
+empty     =
+space     = $(empty) $(empty)
+PAT_SPLIT = $(subst _,$(space),$(*F))
+PAT_BASE  = $(word 1,$(PAT_SPLIT))
+PAT_N     = $(word 2,$(PAT_SPLIT))
+PAT_M     = $(word 3,$(PAT_SPLIT))
+
+have_atomic$(objext): $(srcdir)/config/aarch64/lse.c
+	$(gcc_compile) -DL_have_atomics -c $<
+
+$(LSE_OBJS): $(srcdir)/config/aarch64/lse.c
+	$(gcc_compile) -DL_$(PAT_BASE) -DSIZE=$(PAT_N) -DMODEL=$(PAT_M) -c $<
-- 
2.17.1
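To make the naming scheme in t-lse concrete: an object such as cas_4_2.o
is built with -DL_cas -DSIZE=4 -DMODEL=2, so NAME(cas) expands to
__aa64_cas4_acq with T = unsigned int.  The direct call below is purely
illustrative of the resulting ABI -- in practice the compiler, not the
user, emits these calls, and the symbols are hidden within each DSO; the
try_lock wrapper is a hypothetical name.

    /* The 4-byte acquire CAS helper produced by -DL_cas -DSIZE=4 -DMODEL=2.  */
    extern unsigned int __aa64_cas4_acq (unsigned int cmp, unsigned int new_val,
                                         unsigned int *ptr);

    int
    try_lock (unsigned int *lock)
    {
      /* CAS returns the old value; 0 means we won the race.  */
      return __aa64_cas4_acq (0, 1, lock) == 0;
    }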