diff mbox

linux-generic: timer: disable 128 bit timer optimization for clang

Message ID 1463584533-24004-1-git-send-email-maxim.uvarov@linaro.org
State Superseded
Headers show

Commit Message

Maxim Uvarov May 18, 2016, 3:15 p.m. UTC
Fix compilation error for clang with disabling 128 bit optimization.
In function `_odp_atomic_u128_xchg_mm':
undefined reference to `__atomic_exchange'

Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
---
 I need some quick way to make clang build happy. Clean patch can go later.

 Maxim.

 platform/linux-generic/include/odp_atomic_internal.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Mike Holmes May 18, 2016, 3:52 p.m. UTC | #1
On 18 May 2016 at 11:15, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

> Fix compilation error for clang with disabling 128 bit optimization.
> In function `_odp_atomic_u128_xchg_mm':
> undefined reference to `__atomic_exchange'
>
> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
> ---
>  I need some quick way to make clang build happy


Why not revert whatever introduced the issue ?


> . Clean patch can go later.
>

When is "later" defined to be ?


Why dont we just wait for the correct fix ?



>
>  Maxim.
>
>  platform/linux-generic/include/odp_atomic_internal.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/platform/linux-generic/include/odp_atomic_internal.h
> b/platform/linux-generic/include/odp_atomic_internal.h
> index 3c5606c..31c8059 100644
> --- a/platform/linux-generic/include/odp_atomic_internal.h
> +++ b/platform/linux-generic/include/odp_atomic_internal.h
> @@ -590,7 +590,9 @@ static inline void
> _odp_atomic_flag_clear(_odp_atomic_flag_t *flag)
>  /* Check if target and compiler supports 128-bit scalars and corresponding
>   * exchange and CAS operations */
>  /* GCC on x86-64 needs -mcx16 compiler option */
> -#if defined __SIZEOF_INT128__ && defined
> __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
> +#if defined(__SIZEOF_INT128__) && \
> +       defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && \
> +       !defined(__clang__)
>
>  /** Preprocessor symbol that indicates support for 128-bit atomics */
>  #define ODP_ATOMIC_U128
> --
> 2.7.1.250.gff4ea60
>
> _______________________________________________
> lng-odp mailing list
> lng-odp@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/lng-odp
>
Maxim Uvarov May 18, 2016, 3:56 p.m. UTC | #2
On 05/18/16 18:52, Mike Holmes wrote:
>
>
> On 18 May 2016 at 11:15, Maxim Uvarov <maxim.uvarov@linaro.org 
> <mailto:maxim.uvarov@linaro.org>> wrote:
>
>     Fix compilation error for clang with disabling 128 bit optimization.
>     In function `_odp_atomic_u128_xchg_mm':
>     undefined reference to `__atomic_exchange'
>
>     Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org
>     <mailto:maxim.uvarov@linaro.org>>
>     ---
>      I need some quick way to make clang build happy
>
>
> Why not revert whatever introduced the issue ?
>
>     . Clean patch can go later.
>
> When is "later" defined to be ?
>
>
> Why dont we just wait for the correct fix ?
to make -m32 work now.

Maxim.
>
>
>      Maxim.
>
>      platform/linux-generic/include/odp_atomic_internal.h | 4 +++-
>      1 file changed, 3 insertions(+), 1 deletion(-)
>
>     diff --git a/platform/linux-generic/include/odp_atomic_internal.h
>     b/platform/linux-generic/include/odp_atomic_internal.h
>     index 3c5606c..31c8059 100644
>     --- a/platform/linux-generic/include/odp_atomic_internal.h
>     +++ b/platform/linux-generic/include/odp_atomic_internal.h
>     @@ -590,7 +590,9 @@ static inline void
>     _odp_atomic_flag_clear(_odp_atomic_flag_t *flag)
>      /* Check if target and compiler supports 128-bit scalars and
>     corresponding
>       * exchange and CAS operations */
>      /* GCC on x86-64 needs -mcx16 compiler option */
>     -#if defined __SIZEOF_INT128__ && defined
>     __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
>     +#if defined(__SIZEOF_INT128__) && \
>     +       defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && \
>     +       !defined(__clang__)
>
>      /** Preprocessor symbol that indicates support for 128-bit atomics */
>      #define ODP_ATOMIC_U128
>     --
>     2.7.1.250.gff4ea60
>
>     _______________________________________________
>     lng-odp mailing list
>     lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>     https://lists.linaro.org/mailman/listinfo/lng-odp
>
>
>
>
> -- 
> Mike Holmes
> Technical Manager - Linaro Networking Group
> Linaro.org <http://www.linaro.org/>***│ *Open source software for ARM SoCs
> "Work should be fun and collaborative, the rest follows"
>
Mike Holmes May 18, 2016, 4:48 p.m. UTC | #3
On 18 May 2016 at 11:56, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

> On 05/18/16 18:52, Mike Holmes wrote:
>
>>
>>
>> On 18 May 2016 at 11:15, Maxim Uvarov <maxim.uvarov@linaro.org <mailto:
>> maxim.uvarov@linaro.org>> wrote:
>>
>>     Fix compilation error for clang with disabling 128 bit optimization.
>>     In function `_odp_atomic_u128_xchg_mm':
>>     undefined reference to `__atomic_exchange'
>>
>>     Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org
>>     <mailto:maxim.uvarov@linaro.org>>
>>     ---
>>      I need some quick way to make clang build happy
>>
>>
>> Why not revert whatever introduced the issue ?
>>
>>     . Clean patch can go later.
>>
>> When is "later" defined to be ?
>>
>>
>> Why dont we just wait for the correct fix ?
>>
> to make -m32 work now.
>

why now, why do we need a fix so urgently that we dont fix it properly.



>
> Maxim.
>
>>
>>
>>      Maxim.
>>
>>      platform/linux-generic/include/odp_atomic_internal.h | 4 +++-
>>      1 file changed, 3 insertions(+), 1 deletion(-)
>>
>>     diff --git a/platform/linux-generic/include/odp_atomic_internal.h
>>     b/platform/linux-generic/include/odp_atomic_internal.h
>>     index 3c5606c..31c8059 100644
>>     --- a/platform/linux-generic/include/odp_atomic_internal.h
>>     +++ b/platform/linux-generic/include/odp_atomic_internal.h
>>     @@ -590,7 +590,9 @@ static inline void
>>     _odp_atomic_flag_clear(_odp_atomic_flag_t *flag)
>>      /* Check if target and compiler supports 128-bit scalars and
>>     corresponding
>>       * exchange and CAS operations */
>>      /* GCC on x86-64 needs -mcx16 compiler option */
>>     -#if defined __SIZEOF_INT128__ && defined
>>     __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
>>     +#if defined(__SIZEOF_INT128__) && \
>>     +       defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && \
>>     +       !defined(__clang__)
>>
>>      /** Preprocessor symbol that indicates support for 128-bit atomics */
>>      #define ODP_ATOMIC_U128
>>     --
>>     2.7.1.250.gff4ea60
>>
>>     _______________________________________________
>>     lng-odp mailing list
>>     lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>>     https://lists.linaro.org/mailman/listinfo/lng-odp
>>
>>
>>
>>
>> --
>> Mike Holmes
>> Technical Manager - Linaro Networking Group
>> Linaro.org <http://www.linaro.org/>***│ *Open source software for ARM
>> SoCs
>> "Work should be fun and collaborative, the rest follows"
>>
>>
>
Maxim Uvarov May 18, 2016, 6:27 p.m. UTC | #4
On 05/18/16 19:48, Mike Holmes wrote:
>
>
> On 18 May 2016 at 11:56, Maxim Uvarov <maxim.uvarov@linaro.org 
> <mailto:maxim.uvarov@linaro.org>> wrote:
>
>     On 05/18/16 18:52, Mike Holmes wrote:
>
>
>
>         On 18 May 2016 at 11:15, Maxim Uvarov <maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>
>         <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>>> wrote:
>
>             Fix compilation error for clang with disabling 128 bit
>         optimization.
>             In function `_odp_atomic_u128_xchg_mm':
>             undefined reference to `__atomic_exchange'
>
>             Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>
>             <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>>>
>             ---
>              I need some quick way to make clang build happy
>
>
>         Why not revert whatever introduced the issue ?
>
>             . Clean patch can go later.
>
>         When is "later" defined to be ?
>
>
>         Why dont we just wait for the correct fix ?
>
>     to make -m32 work now.
>
>
> why now, why do we need a fix so urgently that we dont fix it properly.
>

There is no big urgent and patch can wait usual 24 hours. I might be 
clear describing
problem which patch fixes to me understandable for people who did not 
look into timer test.

I think that 2 Olas patches fixes issue with 128 bit optimization with 
gcc, but introduce some
other things which we did not capture on review process:

1) clangs (at least my Ubuntu clang version 3.4-1ubuntu3 
(tags/RELEASE_34/final) (based on LLVM 3.4)
does not link with gcc build in __atomic_exchange. That means clang 
build should use generic not optimized
for 128 bit version.

2) Build for odp-linux has to be reproducible. And at the same time run 
on any similar arch.
That means that all such optimizations should be under ./configure 
options (for timer it's #define ODP_ATOMIC_U128).
Only in that case we can be sure that x86 generic build (which will be 
in ubuntu, debian, redhat and etc) will run on
all machines, even which do not support intrisics.

3) configure compiller detection things has to be in: 
platform/linux-generic/m4/configure.m4


Current patch fixes (1).  But because (2) and (3) is only dance around 
configure.ac options and no functional change
expected, I think current patch might be the best approach for now. 
./configure.ac and reproducible build as well as
move timer arch code separate file is subject for other patch and it's 
not related to timer internal implementation.

What is your opinion?

Maxim.


>
>     Maxim.
>
>
>
>              Maxim.
>
>          platform/linux-generic/include/odp_atomic_internal.h | 4 +++-
>              1 file changed, 3 insertions(+), 1 deletion(-)
>
>             diff --git
>         a/platform/linux-generic/include/odp_atomic_internal.h
>         b/platform/linux-generic/include/odp_atomic_internal.h
>             index 3c5606c..31c8059 100644
>             --- a/platform/linux-generic/include/odp_atomic_internal.h
>             +++ b/platform/linux-generic/include/odp_atomic_internal.h
>             @@ -590,7 +590,9 @@ static inline void
>             _odp_atomic_flag_clear(_odp_atomic_flag_t *flag)
>              /* Check if target and compiler supports 128-bit scalars and
>             corresponding
>               * exchange and CAS operations */
>              /* GCC on x86-64 needs -mcx16 compiler option */
>             -#if defined __SIZEOF_INT128__ && defined
>             __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
>             +#if defined(__SIZEOF_INT128__) && \
>             +  defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && \
>             +       !defined(__clang__)
>
>              /** Preprocessor symbol that indicates support for
>         128-bit atomics */
>              #define ODP_ATOMIC_U128
>             --
>             2.7.1.250.gff4ea60
>
>             _______________________________________________
>             lng-odp mailing list
>         lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>         <mailto:lng-odp@lists.linaro.org
>         <mailto:lng-odp@lists.linaro.org>>
>         https://lists.linaro.org/mailman/listinfo/lng-odp
>
>
>
>
>         -- 
>         Mike Holmes
>         Technical Manager - Linaro Networking Group
>         Linaro.org <http://www.linaro.org/>***│ *Open source software
>         for ARM SoCs
>         "Work should be fun and collaborative, the rest follows"
>
>
>
>
>
> -- 
> Mike Holmes
> Technical Manager - Linaro Networking Group
> Linaro.org <http://www.linaro.org/>***│ *Open source software for ARM SoCs
> "Work should be fun and collaborative, the rest follows"
>
Mike Holmes May 18, 2016, 6:45 p.m. UTC | #5
Maxim and I had a chat, I think  this patch means to say, "for now clang
will use non optimised code, but deeper analysis is needed to optimise with
clang"

On 18 May 2016 at 14:27, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

> On 05/18/16 19:48, Mike Holmes wrote:
>
>>
>>
>> On 18 May 2016 at 11:56, Maxim Uvarov <maxim.uvarov@linaro.org <mailto:
>> maxim.uvarov@linaro.org>> wrote:
>>
>>     On 05/18/16 18:52, Mike Holmes wrote:
>>
>>
>>
>>         On 18 May 2016 at 11:15, Maxim Uvarov <maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>
>>         <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>>> wrote:
>>
>>             Fix compilation error for clang with disabling 128 bit
>>         optimization.
>>             In function `_odp_atomic_u128_xchg_mm':
>>             undefined reference to `__atomic_exchange'
>>
>>             Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>
>>             <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>>>
>>             ---
>>              I need some quick way to make clang build happy
>>
>>
>>         Why not revert whatever introduced the issue ?
>>
>>             . Clean patch can go later.
>>
>>         When is "later" defined to be ?
>>
>>
>>         Why dont we just wait for the correct fix ?
>>
>>     to make -m32 work now.
>>
>>
>> why now, why do we need a fix so urgently that we dont fix it properly.
>>
>>
> There is no big urgent and patch can wait usual 24 hours. I might be clear
> describing
> problem which patch fixes to me understandable for people who did not look
> into timer test.
>
> I think that 2 Olas patches fixes issue with 128 bit optimization with
> gcc, but introduce some
> other things which we did not capture on review process:
>
> 1) clangs (at least my Ubuntu clang version 3.4-1ubuntu3
> (tags/RELEASE_34/final) (based on LLVM 3.4)
> does not link with gcc build in __atomic_exchange. That means clang build
> should use generic not optimized
> for 128 bit version.
>
> 2) Build for odp-linux has to be reproducible. And at the same time run on
> any similar arch.
> That means that all such optimizations should be under ./configure options
> (for timer it's #define ODP_ATOMIC_U128).
> Only in that case we can be sure that x86 generic build (which will be in
> ubuntu, debian, redhat and etc) will run on
> all machines, even which do not support intrisics.
>
> 3) configure compiller detection things has to be in:
> platform/linux-generic/m4/configure.m4
>
>
> Current patch fixes (1).  But because (2) and (3) is only dance around
> configure.ac options and no functional change
> expected, I think current patch might be the best approach for now. ./
> configure.ac and reproducible build as well as
> move timer arch code separate file is subject for other patch and it's not
> related to timer internal implementation.
>
> What is your opinion?
>
> Maxim.
>
>
>
>>     Maxim.
>>
>>
>>
>>              Maxim.
>>
>>          platform/linux-generic/include/odp_atomic_internal.h | 4 +++-
>>              1 file changed, 3 insertions(+), 1 deletion(-)
>>
>>             diff --git
>>         a/platform/linux-generic/include/odp_atomic_internal.h
>>         b/platform/linux-generic/include/odp_atomic_internal.h
>>             index 3c5606c..31c8059 100644
>>             --- a/platform/linux-generic/include/odp_atomic_internal.h
>>             +++ b/platform/linux-generic/include/odp_atomic_internal.h
>>             @@ -590,7 +590,9 @@ static inline void
>>             _odp_atomic_flag_clear(_odp_atomic_flag_t *flag)
>>              /* Check if target and compiler supports 128-bit scalars and
>>             corresponding
>>               * exchange and CAS operations */
>>              /* GCC on x86-64 needs -mcx16 compiler option */
>>             -#if defined __SIZEOF_INT128__ && defined
>>             __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
>>             +#if defined(__SIZEOF_INT128__) && \
>>             +  defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && \
>>             +       !defined(__clang__)
>>
>>              /** Preprocessor symbol that indicates support for
>>         128-bit atomics */
>>              #define ODP_ATOMIC_U128
>>             --
>>             2.7.1.250.gff4ea60
>>
>>             _______________________________________________
>>             lng-odp mailing list
>>         lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>>         <mailto:lng-odp@lists.linaro.org
>>         <mailto:lng-odp@lists.linaro.org>>
>>         https://lists.linaro.org/mailman/listinfo/lng-odp
>>
>>
>>
>>
>>         --         Mike Holmes
>>         Technical Manager - Linaro Networking Group
>>         Linaro.org <http://www.linaro.org/>***│ *Open source software
>>         for ARM SoCs
>>         "Work should be fun and collaborative, the rest follows"
>>
>>
>>
>>
>>
>> --
>> Mike Holmes
>> Technical Manager - Linaro Networking Group
>> Linaro.org <http://www.linaro.org/>***│ *Open source software for ARM
>> SoCs
>> "Work should be fun and collaborative, the rest follows"
>>
>>
>
Maxim Uvarov May 18, 2016, 6:53 p.m. UTC | #6
On 05/18/16 21:45, Mike Holmes wrote:
> Maxim and I had a chat, I think  this patch means to say, "for now 
> clang will use non optimised code, but deeper analysis is needed to 
> optimise with clang"
>
yes, looks like comment under "---" was confusing. This patch is not a 
hack it's only routes clang generated code to lock path instead of lock 
free path with 128 bit instructions.

Maxim.

> On 18 May 2016 at 14:27, Maxim Uvarov <maxim.uvarov@linaro.org 
> <mailto:maxim.uvarov@linaro.org>> wrote:
>
>     On 05/18/16 19:48, Mike Holmes wrote:
>
>
>
>         On 18 May 2016 at 11:56, Maxim Uvarov <maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>
>         <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>>> wrote:
>
>             On 05/18/16 18:52, Mike Holmes wrote:
>
>
>
>                 On 18 May 2016 at 11:15, Maxim Uvarov
>         <maxim.uvarov@linaro.org <mailto:maxim.uvarov@linaro.org>
>                 <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>>
>                 <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>
>                 <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>>>> wrote:
>
>                     Fix compilation error for clang with disabling 128 bit
>                 optimization.
>                     In function `_odp_atomic_u128_xchg_mm':
>                     undefined reference to `__atomic_exchange'
>
>                     Signed-off-by: Maxim Uvarov
>         <maxim.uvarov@linaro.org <mailto:maxim.uvarov@linaro.org>
>                 <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>>
>                     <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>
>                 <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>>>>
>                     ---
>                      I need some quick way to make clang build happy
>
>
>                 Why not revert whatever introduced the issue ?
>
>                     . Clean patch can go later.
>
>                 When is "later" defined to be ?
>
>
>                 Why dont we just wait for the correct fix ?
>
>             to make -m32 work now.
>
>
>         why now, why do we need a fix so urgently that we dont fix it
>         properly.
>
>
>     There is no big urgent and patch can wait usual 24 hours. I might
>     be clear describing
>     problem which patch fixes to me understandable for people who did
>     not look into timer test.
>
>     I think that 2 Olas patches fixes issue with 128 bit optimization
>     with gcc, but introduce some
>     other things which we did not capture on review process:
>
>     1) clangs (at least my Ubuntu clang version 3.4-1ubuntu3
>     (tags/RELEASE_34/final) (based on LLVM 3.4)
>     does not link with gcc build in __atomic_exchange. That means
>     clang build should use generic not optimized
>     for 128 bit version.
>
>     2) Build for odp-linux has to be reproducible. And at the same
>     time run on any similar arch.
>     That means that all such optimizations should be under ./configure
>     options (for timer it's #define ODP_ATOMIC_U128).
>     Only in that case we can be sure that x86 generic build (which
>     will be in ubuntu, debian, redhat and etc) will run on
>     all machines, even which do not support intrisics.
>
>     3) configure compiller detection things has to be in:
>     platform/linux-generic/m4/configure.m4
>
>
>     Current patch fixes (1).  But because (2) and (3) is only dance
>     around configure.ac <http://configure.ac> options and no
>     functional change
>     expected, I think current patch might be the best approach for
>     now. ./configure.ac <http://configure.ac> and reproducible build
>     as well as
>     move timer arch code separate file is subject for other patch and
>     it's not related to timer internal implementation.
>
>     What is your opinion?
>
>     Maxim.
>
>
>
>             Maxim.
>
>
>
>                      Maxim.
>
>          platform/linux-generic/include/odp_atomic_internal.h | 4 +++-
>                      1 file changed, 3 insertions(+), 1 deletion(-)
>
>                     diff --git
>         a/platform/linux-generic/include/odp_atomic_internal.h
>         b/platform/linux-generic/include/odp_atomic_internal.h
>                     index 3c5606c..31c8059 100644
>                     ---
>         a/platform/linux-generic/include/odp_atomic_internal.h
>                     +++
>         b/platform/linux-generic/include/odp_atomic_internal.h
>                     @@ -590,7 +590,9 @@ static inline void
>                     _odp_atomic_flag_clear(_odp_atomic_flag_t *flag)
>                      /* Check if target and compiler supports 128-bit
>         scalars and
>                     corresponding
>                       * exchange and CAS operations */
>                      /* GCC on x86-64 needs -mcx16 compiler option */
>                     -#if defined __SIZEOF_INT128__ && defined
>                     __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
>                     +#if defined(__SIZEOF_INT128__) && \
>                     + defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && \
>                     +       !defined(__clang__)
>
>                      /** Preprocessor symbol that indicates support for
>                 128-bit atomics */
>                      #define ODP_ATOMIC_U128
>                     --
>                     2.7.1.250.gff4ea60
>
>         _______________________________________________
>                     lng-odp mailing list
>         lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>         <mailto:lng-odp@lists.linaro.org
>         <mailto:lng-odp@lists.linaro.org>>
>                 <mailto:lng-odp@lists.linaro.org
>         <mailto:lng-odp@lists.linaro.org>
>                 <mailto:lng-odp@lists.linaro.org
>         <mailto:lng-odp@lists.linaro.org>>>
>         https://lists.linaro.org/mailman/listinfo/lng-odp
>
>
>
>
>                 --         Mike Holmes
>                 Technical Manager - Linaro Networking Group
>                 Linaro.org <http://www.linaro.org/>***│ *Open source
>         software
>                 for ARM SoCs
>                 "Work should be fun and collaborative, the rest follows"
>
>
>
>
>
>         -- 
>         Mike Holmes
>         Technical Manager - Linaro Networking Group
>         Linaro.org <http://www.linaro.org/>***│ *Open source software
>         for ARM SoCs
>         "Work should be fun and collaborative, the rest follows"
>
>
>
>
>
> -- 
> Mike Holmes
> Technical Manager - Linaro Networking Group
> Linaro.org <http://www.linaro.org/>***│ *Open source software for ARM SoCs
> "Work should be fun and collaborative, the rest follows"
>
Ola Liljedahl May 18, 2016, 8:24 p.m. UTC | #7
Maxim,

I configured and built using clang and -m32 and it worked for me. clang
chooses the lock-based path.
This means __SIZEOF_INT128__ and __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 were
not defined so the ODP atomic support for 128-bit variables was not
enabled. The timer code fell back to the lock-based path.

I used clang 3.6.2-1 running on Ubuntu 15.10.

I think your clang 3.4 is buggy, it claims to support -mcx16 also for -m32
targets and then defines those preprocessor symbols above which are tested
by odp_atomic_internal.h. But when generating code, the compiler backend
realises that there is no suitable instruction (e.g. cmpxchg16) for
i386/686 targets so generates calls to external functions (e.g.
__atomic_exchange) instead. But those functions either do not exist or are
somehow not included in the linking (I would expect them to be located in
clang's equivalent to libgcc.a). There are some bug reports on missing
support for 128-bit atomics in clang and instead you get these calls to
non-existing functions.

Add an additional check to odp_atomic_internal.h that clang version must be
>= 3.6 (don't know about 3.5) for the ODP support for 128 bit atomics to be
enabled.

-- Ola


On 18 May 2016 at 20:53, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

> On 05/18/16 21:45, Mike Holmes wrote:
>
>> Maxim and I had a chat, I think  this patch means to say, "for now clang
>> will use non optimised code, but deeper analysis is needed to optimise with
>> clang"
>>
>> yes, looks like comment under "---" was confusing. This patch is not a
> hack it's only routes clang generated code to lock path instead of lock
> free path with 128 bit instructions.
>
> Maxim.
>
> On 18 May 2016 at 14:27, Maxim Uvarov <maxim.uvarov@linaro.org <mailto:
>> maxim.uvarov@linaro.org>> wrote:
>>
>>     On 05/18/16 19:48, Mike Holmes wrote:
>>
>>
>>
>>         On 18 May 2016 at 11:56, Maxim Uvarov <maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>
>>         <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>>> wrote:
>>
>>             On 05/18/16 18:52, Mike Holmes wrote:
>>
>>
>>
>>                 On 18 May 2016 at 11:15, Maxim Uvarov
>>         <maxim.uvarov@linaro.org <mailto:maxim.uvarov@linaro.org>
>>                 <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>>
>>                 <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>
>>                 <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>>>> wrote:
>>
>>                     Fix compilation error for clang with disabling 128 bit
>>                 optimization.
>>                     In function `_odp_atomic_u128_xchg_mm':
>>                     undefined reference to `__atomic_exchange'
>>
>>                     Signed-off-by: Maxim Uvarov
>>         <maxim.uvarov@linaro.org <mailto:maxim.uvarov@linaro.org>
>>                 <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>>
>>                     <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>
>>                 <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>>>>
>>                     ---
>>                      I need some quick way to make clang build happy
>>
>>
>>                 Why not revert whatever introduced the issue ?
>>
>>                     . Clean patch can go later.
>>
>>                 When is "later" defined to be ?
>>
>>
>>                 Why dont we just wait for the correct fix ?
>>
>>             to make -m32 work now.
>>
>>
>>         why now, why do we need a fix so urgently that we dont fix it
>>         properly.
>>
>>
>>     There is no big urgent and patch can wait usual 24 hours. I might
>>     be clear describing
>>     problem which patch fixes to me understandable for people who did
>>     not look into timer test.
>>
>>     I think that 2 Olas patches fixes issue with 128 bit optimization
>>     with gcc, but introduce some
>>     other things which we did not capture on review process:
>>
>>     1) clangs (at least my Ubuntu clang version 3.4-1ubuntu3
>>     (tags/RELEASE_34/final) (based on LLVM 3.4)
>>     does not link with gcc build in __atomic_exchange. That means
>>     clang build should use generic not optimized
>>     for 128 bit version.
>>
>>     2) Build for odp-linux has to be reproducible. And at the same
>>     time run on any similar arch.
>>     That means that all such optimizations should be under ./configure
>>     options (for timer it's #define ODP_ATOMIC_U128).
>>     Only in that case we can be sure that x86 generic build (which
>>     will be in ubuntu, debian, redhat and etc) will run on
>>     all machines, even which do not support intrisics.
>>
>>     3) configure compiller detection things has to be in:
>>     platform/linux-generic/m4/configure.m4
>>
>>
>>     Current patch fixes (1).  But because (2) and (3) is only dance
>>     around configure.ac <http://configure.ac> options and no
>>     functional change
>>     expected, I think current patch might be the best approach for
>>     now. ./configure.ac <http://configure.ac> and reproducible build
>>
>>     as well as
>>     move timer arch code separate file is subject for other patch and
>>     it's not related to timer internal implementation.
>>
>>     What is your opinion?
>>
>>     Maxim.
>>
>>
>>
>>             Maxim.
>>
>>
>>
>>                      Maxim.
>>
>>          platform/linux-generic/include/odp_atomic_internal.h | 4 +++-
>>                      1 file changed, 3 insertions(+), 1 deletion(-)
>>
>>                     diff --git
>>         a/platform/linux-generic/include/odp_atomic_internal.h
>>         b/platform/linux-generic/include/odp_atomic_internal.h
>>                     index 3c5606c..31c8059 100644
>>                     ---
>>         a/platform/linux-generic/include/odp_atomic_internal.h
>>                     +++
>>         b/platform/linux-generic/include/odp_atomic_internal.h
>>                     @@ -590,7 +590,9 @@ static inline void
>>                     _odp_atomic_flag_clear(_odp_atomic_flag_t *flag)
>>                      /* Check if target and compiler supports 128-bit
>>         scalars and
>>                     corresponding
>>                       * exchange and CAS operations */
>>                      /* GCC on x86-64 needs -mcx16 compiler option */
>>                     -#if defined __SIZEOF_INT128__ && defined
>>                     __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
>>                     +#if defined(__SIZEOF_INT128__) && \
>>                     + defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && \
>>                     +       !defined(__clang__)
>>
>>                      /** Preprocessor symbol that indicates support for
>>                 128-bit atomics */
>>                      #define ODP_ATOMIC_U128
>>                     --
>>                     2.7.1.250.gff4ea60
>>
>>         _______________________________________________
>>                     lng-odp mailing list
>>         lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>>         <mailto:lng-odp@lists.linaro.org
>>         <mailto:lng-odp@lists.linaro.org>>
>>                 <mailto:lng-odp@lists.linaro.org
>>         <mailto:lng-odp@lists.linaro.org>
>>                 <mailto:lng-odp@lists.linaro.org
>>         <mailto:lng-odp@lists.linaro.org>>>
>>         https://lists.linaro.org/mailman/listinfo/lng-odp
>>
>>
>>
>>
>>                 --         Mike Holmes
>>                 Technical Manager - Linaro Networking Group
>>                 Linaro.org <http://www.linaro.org/>***│ *Open source
>>         software
>>                 for ARM SoCs
>>                 "Work should be fun and collaborative, the rest follows"
>>
>>
>>
>>
>>
>>         --         Mike Holmes
>>         Technical Manager - Linaro Networking Group
>>         Linaro.org <http://www.linaro.org/>***│ *Open source software
>>         for ARM SoCs
>>         "Work should be fun and collaborative, the rest follows"
>>
>>
>>
>>
>>
>> --
>> Mike Holmes
>> Technical Manager - Linaro Networking Group
>> Linaro.org <http://www.linaro.org/>***│ *Open source software for ARM
>> SoCs
>> "Work should be fun and collaborative, the rest follows"
>>
>>
> _______________________________________________
> lng-odp mailing list
> lng-odp@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/lng-odp
>
Maxim Uvarov May 19, 2016, 6:36 a.m. UTC | #8
On 18 May 2016 at 23:24, Ola Liljedahl <ola.liljedahl@linaro.org> wrote:

> Maxim,
>
> I configured and built using clang and -m32 and it worked for me. clang
> chooses the lock-based path.
> This means __SIZEOF_INT128__ and __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 were
> not defined so the ODP atomic support for 128-bit variables was not
> enabled. The timer code fell back to the lock-based path.
>
> I used clang 3.6.2-1 running on Ubuntu 15.10.
>
>
mine is ubuntu 14.04 clang version 3.4 which looks like buggy.


> I think your clang 3.4 is buggy, it claims to support -mcx16 also for -m32
> targets and then defines those preprocessor symbols above which are tested
> by odp_atomic_internal.h. But when generating code, the compiler backend
> realises that there is no suitable instruction (e.g. cmpxchg16) for
> i386/686 targets so generates calls to external functions (e.g.
> __atomic_exchange) instead. But those functions either do not exist or are
> somehow not included in the linking (I would expect them to be located in
> clang's equivalent to libgcc.a). There are some bug reports on missing
> support for 128-bit atomics in clang and instead you get these calls to
> non-existing functions.
>
> Add an additional check to odp_atomic_internal.h that clang version must
> be >= 3.6 (don't know about 3.5) for the ODP support for 128 bit atomics to
> be enabled.
>
> -- Ola
>

Ok, thanks for explanation. I will send v2 of my patch with clang version.

Maxim.



>
>
> On 18 May 2016 at 20:53, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:
>
>> On 05/18/16 21:45, Mike Holmes wrote:
>>
>>> Maxim and I had a chat, I think  this patch means to say, "for now clang
>>> will use non optimised code, but deeper analysis is needed to optimise with
>>> clang"
>>>
>>> yes, looks like comment under "---" was confusing. This patch is not a
>> hack it's only routes clang generated code to lock path instead of lock
>> free path with 128 bit instructions.
>>
>> Maxim.
>>
>> On 18 May 2016 at 14:27, Maxim Uvarov <maxim.uvarov@linaro.org <mailto:
>>> maxim.uvarov@linaro.org>> wrote:
>>>
>>>     On 05/18/16 19:48, Mike Holmes wrote:
>>>
>>>
>>>
>>>         On 18 May 2016 at 11:56, Maxim Uvarov <maxim.uvarov@linaro.org
>>>         <mailto:maxim.uvarov@linaro.org>
>>>         <mailto:maxim.uvarov@linaro.org
>>>         <mailto:maxim.uvarov@linaro.org>>> wrote:
>>>
>>>             On 05/18/16 18:52, Mike Holmes wrote:
>>>
>>>
>>>
>>>                 On 18 May 2016 at 11:15, Maxim Uvarov
>>>         <maxim.uvarov@linaro.org <mailto:maxim.uvarov@linaro.org>
>>>                 <mailto:maxim.uvarov@linaro.org
>>>         <mailto:maxim.uvarov@linaro.org>>
>>>                 <mailto:maxim.uvarov@linaro.org
>>>         <mailto:maxim.uvarov@linaro.org>
>>>                 <mailto:maxim.uvarov@linaro.org
>>>         <mailto:maxim.uvarov@linaro.org>>>> wrote:
>>>
>>>                     Fix compilation error for clang with disabling 128
>>> bit
>>>                 optimization.
>>>                     In function `_odp_atomic_u128_xchg_mm':
>>>                     undefined reference to `__atomic_exchange'
>>>
>>>                     Signed-off-by: Maxim Uvarov
>>>         <maxim.uvarov@linaro.org <mailto:maxim.uvarov@linaro.org>
>>>                 <mailto:maxim.uvarov@linaro.org
>>>         <mailto:maxim.uvarov@linaro.org>>
>>>                     <mailto:maxim.uvarov@linaro.org
>>>         <mailto:maxim.uvarov@linaro.org>
>>>                 <mailto:maxim.uvarov@linaro.org
>>>         <mailto:maxim.uvarov@linaro.org>>>>
>>>                     ---
>>>                      I need some quick way to make clang build happy
>>>
>>>
>>>                 Why not revert whatever introduced the issue ?
>>>
>>>                     . Clean patch can go later.
>>>
>>>                 When is "later" defined to be ?
>>>
>>>
>>>                 Why dont we just wait for the correct fix ?
>>>
>>>             to make -m32 work now.
>>>
>>>
>>>         why now, why do we need a fix so urgently that we dont fix it
>>>         properly.
>>>
>>>
>>>     There is no big urgent and patch can wait usual 24 hours. I might
>>>     be clear describing
>>>     problem which patch fixes to me understandable for people who did
>>>     not look into timer test.
>>>
>>>     I think that 2 Olas patches fixes issue with 128 bit optimization
>>>     with gcc, but introduce some
>>>     other things which we did not capture on review process:
>>>
>>>     1) clangs (at least my Ubuntu clang version 3.4-1ubuntu3
>>>     (tags/RELEASE_34/final) (based on LLVM 3.4)
>>>     does not link with gcc build in __atomic_exchange. That means
>>>     clang build should use generic not optimized
>>>     for 128 bit version.
>>>
>>>     2) Build for odp-linux has to be reproducible. And at the same
>>>     time run on any similar arch.
>>>     That means that all such optimizations should be under ./configure
>>>     options (for timer it's #define ODP_ATOMIC_U128).
>>>     Only in that case we can be sure that x86 generic build (which
>>>     will be in ubuntu, debian, redhat and etc) will run on
>>>     all machines, even which do not support intrisics.
>>>
>>>     3) configure compiller detection things has to be in:
>>>     platform/linux-generic/m4/configure.m4
>>>
>>>
>>>     Current patch fixes (1).  But because (2) and (3) is only dance
>>>     around configure.ac <http://configure.ac> options and no
>>>     functional change
>>>     expected, I think current patch might be the best approach for
>>>     now. ./configure.ac <http://configure.ac> and reproducible build
>>>
>>>     as well as
>>>     move timer arch code separate file is subject for other patch and
>>>     it's not related to timer internal implementation.
>>>
>>>     What is your opinion?
>>>
>>>     Maxim.
>>>
>>>
>>>
>>>             Maxim.
>>>
>>>
>>>
>>>                      Maxim.
>>>
>>>          platform/linux-generic/include/odp_atomic_internal.h | 4 +++-
>>>                      1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>>                     diff --git
>>>         a/platform/linux-generic/include/odp_atomic_internal.h
>>>         b/platform/linux-generic/include/odp_atomic_internal.h
>>>                     index 3c5606c..31c8059 100644
>>>                     ---
>>>         a/platform/linux-generic/include/odp_atomic_internal.h
>>>                     +++
>>>         b/platform/linux-generic/include/odp_atomic_internal.h
>>>                     @@ -590,7 +590,9 @@ static inline void
>>>                     _odp_atomic_flag_clear(_odp_atomic_flag_t *flag)
>>>                      /* Check if target and compiler supports 128-bit
>>>         scalars and
>>>                     corresponding
>>>                       * exchange and CAS operations */
>>>                      /* GCC on x86-64 needs -mcx16 compiler option */
>>>                     -#if defined __SIZEOF_INT128__ && defined
>>>                     __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
>>>                     +#if defined(__SIZEOF_INT128__) && \
>>>                     + defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && \
>>>                     +       !defined(__clang__)
>>>
>>>                      /** Preprocessor symbol that indicates support for
>>>                 128-bit atomics */
>>>                      #define ODP_ATOMIC_U128
>>>                     --
>>>                     2.7.1.250.gff4ea60
>>>
>>>         _______________________________________________
>>>                     lng-odp mailing list
>>>         lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>>>         <mailto:lng-odp@lists.linaro.org
>>>         <mailto:lng-odp@lists.linaro.org>>
>>>                 <mailto:lng-odp@lists.linaro.org
>>>         <mailto:lng-odp@lists.linaro.org>
>>>                 <mailto:lng-odp@lists.linaro.org
>>>         <mailto:lng-odp@lists.linaro.org>>>
>>>         https://lists.linaro.org/mailman/listinfo/lng-odp
>>>
>>>
>>>
>>>
>>>                 --         Mike Holmes
>>>                 Technical Manager - Linaro Networking Group
>>>                 Linaro.org <http://www.linaro.org/>***│ *Open source
>>>         software
>>>                 for ARM SoCs
>>>                 "Work should be fun and collaborative, the rest follows"
>>>
>>>
>>>
>>>
>>>
>>>         --         Mike Holmes
>>>         Technical Manager - Linaro Networking Group
>>>         Linaro.org <http://www.linaro.org/>***│ *Open source software
>>>         for ARM SoCs
>>>         "Work should be fun and collaborative, the rest follows"
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Mike Holmes
>>> Technical Manager - Linaro Networking Group
>>> Linaro.org <http://www.linaro.org/>***│ *Open source software for ARM
>>> SoCs
>>> "Work should be fun and collaborative, the rest follows"
>>>
>>>
>> _______________________________________________
>> lng-odp mailing list
>> lng-odp@lists.linaro.org
>> https://lists.linaro.org/mailman/listinfo/lng-odp
>>
>
>
diff mbox

Patch

diff --git a/platform/linux-generic/include/odp_atomic_internal.h b/platform/linux-generic/include/odp_atomic_internal.h
index 3c5606c..31c8059 100644
--- a/platform/linux-generic/include/odp_atomic_internal.h
+++ b/platform/linux-generic/include/odp_atomic_internal.h
@@ -590,7 +590,9 @@  static inline void _odp_atomic_flag_clear(_odp_atomic_flag_t *flag)
 /* Check if target and compiler supports 128-bit scalars and corresponding
  * exchange and CAS operations */
 /* GCC on x86-64 needs -mcx16 compiler option */
-#if defined __SIZEOF_INT128__ && defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
+#if defined(__SIZEOF_INT128__) && \
+	defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) && \
+	!defined(__clang__)
 
 /** Preprocessor symbol that indicates support for 128-bit atomics */
 #define ODP_ATOMIC_U128