Message ID | 1418396617-25844-1-git-send-email-maxim.uvarov@linaro.org |
---|---|
State | Accepted |
Commit | 223786242ef17204b17abea28de280a8f036e755 |
Headers | show |
On 12/12/2014 05:03 PM, Maxim Uvarov wrote: > Odp atomic operations based on compiler build-ins. Make > sure that compiler supports such operation at configure > stage. This check should be limited to platforms that use gcc atomics.
On 12/12/2014 06:47 PM, Taras Kondratiuk wrote: > On 12/12/2014 05:03 PM, Maxim Uvarov wrote: >> Odp atomic operations based on compiler build-ins. Make >> sure that compiler supports such operation at configure >> stage. > This check should be limited to platforms that use gcc atomics. Do we have such platforms? Maxim.
On 12/12/2014 05:51 PM, Maxim Uvarov wrote: > On 12/12/2014 06:47 PM, Taras Kondratiuk wrote: >> On 12/12/2014 05:03 PM, Maxim Uvarov wrote: >>> Odp atomic operations based on compiler build-ins. Make >>> sure that compiler supports such operation at configure >>> stage. >> This check should be limited to platforms that use gcc atomics. > Do we have such platforms? I assume you question was: do we have platform that *don't* use gcc atomics? Currently all public platform do use gcc atomics, but we shouldn't assume this for other platforms. Some of them may use older version of gcc which doesn't have these built-ins. Can this check be done/enabled per platform?
On 12 December 2014 at 10:51, Maxim Uvarov <maxim.uvarov@linaro.org> wrote: > > On 12/12/2014 06:47 PM, Taras Kondratiuk wrote: > >> On 12/12/2014 05:03 PM, Maxim Uvarov wrote: >> >>> Odp atomic operations based on compiler build-ins. Make >>> sure that compiler supports such operation at configure >>> stage. >>> >> This check should be limited to platforms that use gcc atomics. >> > Do we have such platforms? __OCTEON__ directly swappes out gcc, infact that pre processor switch affects other files in linux-generic too. static inline uint32_t odp_atomic_fetch_inc_u32(odp_atomic_u32_t *atom) { #if defined __OCTEON__ >-------uint32_t ret; >-------__asm__ __volatile__ ("syncws"); >-------__asm__ __volatile__ ("lai %0,(%2)" : "=r" (ret), "+m" (atom) : >------->------->------- "r" (atom)); >-------return ret; #else >-------return __atomic_fetch_add(&atom->v, 1, __ATOMIC_RELAXED); #endif } > > > Maxim. > > > _______________________________________________ > lng-odp mailing list > lng-odp@lists.linaro.org > http://lists.linaro.org/mailman/listinfo/lng-odp >
On 12/12/2014 06:57 PM, Taras Kondratiuk wrote: > On 12/12/2014 05:51 PM, Maxim Uvarov wrote: >> On 12/12/2014 06:47 PM, Taras Kondratiuk wrote: >>> On 12/12/2014 05:03 PM, Maxim Uvarov wrote: >>>> Odp atomic operations based on compiler build-ins. Make >>>> sure that compiler supports such operation at configure >>>> stage. >>> This check should be limited to platforms that use gcc atomics. >> Do we have such platforms? > I assume you question was: do we have platform that *don't* use gcc > atomics? Currently all public platform do use gcc atomics, but we > shouldn't assume this for other platforms. Some of them may use older > version of gcc which doesn't have these built-ins. > Can this check be done/enabled per platform? > Ok, might be need platform/linux-x/configure.ac which will be included from main configure.ac. In that case internal checks for compilers, bsp and etc will be in platform specific place. Maxim.
On 12 December 2014 at 16:59, Mike Holmes <mike.holmes@linaro.org> wrote: > > > On 12 December 2014 at 10:51, Maxim Uvarov <maxim.uvarov@linaro.org> wrote: >> >> On 12/12/2014 06:47 PM, Taras Kondratiuk wrote: >>> >>> On 12/12/2014 05:03 PM, Maxim Uvarov wrote: >>>> >>>> Odp atomic operations based on compiler build-ins. Make >>>> sure that compiler supports such operation at configure >>>> stage. >>> >>> This check should be limited to platforms that use gcc atomics. >> >> Do we have such platforms? > > > __OCTEON__ directly swappes out gcc, infact that pre processor switch > affects other files in linux-generic too. > > static inline uint32_t odp_atomic_fetch_inc_u32(odp_atomic_u32_t *atom) > { > #if defined __OCTEON__ >>-------uint32_t ret; >>-------__asm__ __volatile__ ("syncws"); >>-------__asm__ __volatile__ ("lai %0,(%2)" : "=r" (ret), "+m" (atom) : >>------->------->------- "r" (atom)); >>-------return ret; > #else >>-------return __atomic_fetch_add(&atom->v, 1, __ATOMIC_RELAXED); > #endif > } This is an OCTEON-special for just one function. I didn't want to remove it. But I am not sure it is actually needed. Shouldn't the compiler be able to generate the appropriate OCTEON-specific instructions (e.g. laa(d), saa(d), lai(d), lad(d) etc), gcc has supported OCTEON since version 4.4? Also I don't understand the reason for the syncw *before* the lai (load-atomic-increment) but perhaps the load-atomic instructions (which shouldn't linger in the write buffer like stores) need to be treated differently. I think this OCTEON-fix should either be removed or there should be specials for most if not all of the functions in odp_atomic.h > > >> >> >> >> Maxim. >> >> >> _______________________________________________ >> lng-odp mailing list >> lng-odp@lists.linaro.org >> http://lists.linaro.org/mailman/listinfo/lng-odp > > > > -- > Mike Holmes > Linaro Sr Technical Manager > LNG - ODP > > _______________________________________________ > lng-odp mailing list > lng-odp@lists.linaro.org > http://lists.linaro.org/mailman/listinfo/lng-odp >
On 15 December 2014 at 06:40, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote: > On Fri, Dec 12, 2014 at 10:56:26PM +0100, Ola Liljedahl wrote: >> On 12 December 2014 at 16:59, Mike Holmes <mike.holmes@linaro.org> wrote: >> > >> > >> > On 12 December 2014 at 10:51, Maxim Uvarov <maxim.uvarov@linaro.org> wrote: >> >> >> >> On 12/12/2014 06:47 PM, Taras Kondratiuk wrote: >> >>> >> >>> On 12/12/2014 05:03 PM, Maxim Uvarov wrote: >> >>>> >> >>>> Odp atomic operations based on compiler build-ins. Make >> >>>> sure that compiler supports such operation at configure >> >>>> stage. >> >>> >> >>> This check should be limited to platforms that use gcc atomics. >> >> >> >> Do we have such platforms? >> > >> > >> > __OCTEON__ directly swappes out gcc, infact that pre processor switch >> > affects other files in linux-generic too. >> > >> > static inline uint32_t odp_atomic_fetch_inc_u32(odp_atomic_u32_t *atom) >> > { >> > #if defined __OCTEON__ >> >>-------uint32_t ret; >> >>-------__asm__ __volatile__ ("syncws"); >> >>-------__asm__ __volatile__ ("lai %0,(%2)" : "=r" (ret), "+m" (atom) : >> >>------->------->------- "r" (atom)); >> >>-------return ret; >> > #else >> >>-------return __atomic_fetch_add(&atom->v, 1, __ATOMIC_RELAXED); >> > #endif >> > } >> This is an OCTEON-special for just one function. I didn't want to >> remove it. But I am not sure it is actually needed. Shouldn't the >> compiler be able to generate the appropriate OCTEON-specific >> instructions (e.g. laa(d), saa(d), lai(d), lad(d) etc), gcc has >> supported OCTEON since version 4.4? Also I don't understand the reason >> for the syncw *before* the lai (load-atomic-increment) but perhaps the >> load-atomic instructions (which shouldn't linger in the write buffer >> like stores) need to be treated differently. > > I generated the dis-assembly of octeon gcc generated output, > Compiler does generate appropriate instruction for __atomic_fetch* instructions. So no need for any OCTEON-specific inline assembler code then? > and if the contract is "__ATOMIC_RELAXED" then "syncws" can be removed. The ordering contract here is now relaxed. The public use case for odp_atomics.h is for software counters, no need to waste performance with unnecessary ordering requirements. Have you checked the code generation for the _odp_atomic_internal.h functions? E.g. for acquire and release memory orderings. I assume that GCC will generate correct code but don't know if it actually generates optimal code. > > Typical use case for putting "syncw" before the lai operation to make sure > cpu does not change the order of "lai" and "instructions before it" > as octeon is weakly ordered cpu for better performance. This would be release ordering which is often associated with unlock type of operations. But the type of ordering you need depends on the usage of the atomic operation which is unknown to the atomic operation itself. Is there any other case where "syncw" should still be used, e.g. for performance if not for correctness reasons? Flush the write buffer on release type of operations. We need a better name for this than odp_sync_stores(). I know MIPS is weakly ordered but I was under the impression that the OCTEON micro-architectures are more strongly ordered (at least the first OCTEON implementations were in-order). > >> >> I think this OCTEON-fix should either be removed or there should be >> specials for most if not all of the functions in odp_atomic.h > > Yes, It can be removed if the contract is "__ATOMIC_RELAXED". Great. I can post a patch. > > Do we have any specific reason to keep the API as odp_atomic_*, if its > an "__ATOMIC_RELAXED" memory model then its better to rename as counters. This was my original idea as well. > > >> > >> > >> >> >> >> >> >> >> >> Maxim. >> >> >> >> >> >> _______________________________________________ >> >> lng-odp mailing list >> >> lng-odp@lists.linaro.org >> >> http://lists.linaro.org/mailman/listinfo/lng-odp >> > >> > >> > >> > -- >> > Mike Holmes >> > Linaro Sr Technical Manager >> > LNG - ODP >> > >> > _______________________________________________ >> > lng-odp mailing list >> > lng-odp@lists.linaro.org >> > http://lists.linaro.org/mailman/listinfo/lng-odp >> >
diff --git a/configure.ac b/configure.ac index 8dbad4b..29387ee 100644 --- a/configure.ac +++ b/configure.ac @@ -40,6 +40,25 @@ AC_TYPE_INT32_T AC_TYPE_UINT32_T AC_TYPE_UINT64_T +AC_MSG_CHECKING(for MAX GCC atomic builtins) +AC_LINK_IFELSE( + [AC_LANG_SOURCE( + [[#include <stdint.h> + int main() { + volatile uint32_t v = 1; + __atomic_fetch_add(&v, 1, __ATOMIC_RELAXED); + __atomic_fetch_sub(&v, 1, __ATOMIC_RELAXED); + __atomic_store_n(&v, 1, __ATOMIC_RELAXED); + __atomic_load_n(&v, __ATOMIC_RELAXED); + return 0; + } + ]])], + AC_MSG_RESULT(yes), + AC_MSG_RESULT(no) + echo "Atomic operation are not supported by your compiller." + echo "Use newer version. For gcc > 4.7.3" + exit -1) + ########################################################################## # Determine which platform to build for ##########################################################################
Odp atomic operations based on compiler build-ins. Make sure that compiler supports such operation at configure stage. Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org> --- configure.ac | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)