Message ID | 1403701970-21947-1-git-send-email-will.newton@linaro.org |
---|---|
State | Accepted |
Headers | show |
On Wed, 25 Jun 2014, Will Newton wrote: > Add support for the new HWCAP2 values for ARMv8 added in the > 3.15 kernel. Tested using QEMU which supports these extensions. > > ChangeLog: > > 2014-06-25 Will Newton <will.newton@linaro.org> > > * sysdeps/unix/sysv/linux/arm/dl-procinfo.c > (_dl_arm_cap_flags): Add HWCAP2 values. > * sysdeps/unix/sysv/linux/arm/dl-procinfo.h > (_DL_HWCAP_COUNT): Increase to 37. > (_DL_HWCAP_LAST): New define. > (_DL_HWCAP2_LAST): New define. > (_dl_procinfo): Add support for printing > AT_HWCAP2 entries. > (_dl_string_hwcap): Use _dl_hwcap_string. OK.
> + switch(type)
Missing space.
On 25/06/14 14:12, Will Newton wrote: > Add support for the new HWCAP2 values for ARMv8 added in the > 3.15 kernel. Tested using QEMU which supports these extensions. > > ChangeLog: > > 2014-06-25 Will Newton <will.newton@linaro.org> > > * sysdeps/unix/sysv/linux/arm/dl-procinfo.c > (_dl_arm_cap_flags): Add HWCAP2 values. > * sysdeps/unix/sysv/linux/arm/dl-procinfo.h > (_DL_HWCAP_COUNT): Increase to 37. > (_DL_HWCAP_LAST): New define. > (_DL_HWCAP2_LAST): New define. > (_dl_procinfo): Add support for printing > AT_HWCAP2 entries. > (_dl_string_hwcap): Use _dl_hwcap_string. I don't have a specific comment about this patch. I do have a general comment that I think the HWCAPs exported by the kernel for 32-bit ARM are a joke. The principle problem is that there is precisely zero way to determine the base architecture. You cannot even tell whether you are running on ARMv6 or ARMv7, let alone whether you have key features such as Thumb2. I've heard it suggested that you can part the architecture string (eg armv7l), but 1) the format of this string is not precisely defined in a way that allows you to predict what future cores will generate and 2) parsing strings in ifunc code when function calls can't be made is likely to be hairy at best. We really need a better solution than this. R.
On 26 June 2014 10:14, Richard Earnshaw <rearnsha@arm.com> wrote: > On 25/06/14 14:12, Will Newton wrote: >> Add support for the new HWCAP2 values for ARMv8 added in the >> 3.15 kernel. Tested using QEMU which supports these extensions. >> >> ChangeLog: >> >> 2014-06-25 Will Newton <will.newton@linaro.org> >> >> * sysdeps/unix/sysv/linux/arm/dl-procinfo.c >> (_dl_arm_cap_flags): Add HWCAP2 values. >> * sysdeps/unix/sysv/linux/arm/dl-procinfo.h >> (_DL_HWCAP_COUNT): Increase to 37. >> (_DL_HWCAP_LAST): New define. >> (_DL_HWCAP2_LAST): New define. >> (_dl_procinfo): Add support for printing >> AT_HWCAP2 entries. >> (_dl_string_hwcap): Use _dl_hwcap_string. > > I don't have a specific comment about this patch. > > I do have a general comment that I think the HWCAPs exported by the > kernel for 32-bit ARM are a joke. The principle problem is that there > is precisely zero way to determine the base architecture. You cannot > even tell whether you are running on ARMv6 or ARMv7, let alone whether > you have key features such as Thumb2. I agree, and the situation on AArch64 looks no better. It's pretty much impossible to determine the micro-architecture from userland too - which may not be that much of an issue for ARM, but AArch64 likely much more so > I've heard it suggested that you can part the architecture string (eg > armv7l), but 1) the format of this string is not precisely defined in a > way that allows you to predict what future cores will generate and 2) > parsing strings in ifunc code when function calls can't be made is > likely to be hairy at best. I think the platform is probably the best way to pass that info. The kernel currently sets it to: snprintf(elf_platform, ELF_PLATFORM_SIZE, "%s%c", list->elf_name, ENDIANNESS); Where elf_name is one of: v4 v5 v5t v6 v7 v7m That doesn't look too intractable, and we can work with the kernel guys to make sure nothing too surprising is added there. The string parsing of architecture revision is pretty trivial in those cases. Perhaps this is something we can discuss at the GNU Tools Cauldron next month. On AArch64 the platform string is hardcoded to "aarch64" or "aarch64_be". :-/
On 26/06/14 10:36, Will Newton wrote: > On 26 June 2014 10:14, Richard Earnshaw <rearnsha@arm.com> wrote: >> On 25/06/14 14:12, Will Newton wrote: >>> Add support for the new HWCAP2 values for ARMv8 added in the >>> 3.15 kernel. Tested using QEMU which supports these extensions. >>> >>> ChangeLog: >>> >>> 2014-06-25 Will Newton <will.newton@linaro.org> >>> >>> * sysdeps/unix/sysv/linux/arm/dl-procinfo.c >>> (_dl_arm_cap_flags): Add HWCAP2 values. >>> * sysdeps/unix/sysv/linux/arm/dl-procinfo.h >>> (_DL_HWCAP_COUNT): Increase to 37. >>> (_DL_HWCAP_LAST): New define. >>> (_DL_HWCAP2_LAST): New define. >>> (_dl_procinfo): Add support for printing >>> AT_HWCAP2 entries. >>> (_dl_string_hwcap): Use _dl_hwcap_string. >> >> I don't have a specific comment about this patch. >> >> I do have a general comment that I think the HWCAPs exported by the >> kernel for 32-bit ARM are a joke. The principle problem is that there >> is precisely zero way to determine the base architecture. You cannot >> even tell whether you are running on ARMv6 or ARMv7, let alone whether >> you have key features such as Thumb2. > > I agree, and the situation on AArch64 looks no better. It's pretty > much impossible to determine the micro-architecture from userland too > - which may not be that much of an issue for ARM, but AArch64 likely > much more so > There has to be a better way of addressing that issue than reading the microarchitecture name and then switching on that. The list is potentially unbounded: what do you do when you encounter a new micro-architecture? >> I've heard it suggested that you can part the architecture string (eg >> armv7l), but 1) the format of this string is not precisely defined in a >> way that allows you to predict what future cores will generate and 2) >> parsing strings in ifunc code when function calls can't be made is >> likely to be hairy at best. > > I think the platform is probably the best way to pass that info. The > kernel currently sets it to: > > snprintf(elf_platform, ELF_PLATFORM_SIZE, "%s%c", > list->elf_name, ENDIANNESS); > > Where elf_name is one of: > > v4 > v5 > v5t > v6 > v7 > v7m > > That doesn't look too intractable, and we can work with the kernel > guys to make sure nothing too surprising is added there. Until you realize that these do not have a total ordering; that is, while you can write v4 < v5 < v5t < v6 < v7 You cannot insert v7m in that list at any point, since it is both more and less than v6. In fact, it's both more and less than the baseline v7, since it also has a divide instruction. > The string > parsing of architecture revision is pretty trivial in those cases. > Perhaps this is something we can discuss at the GNU Tools Cauldron > next month. > > On AArch64 the platform string is hardcoded to "aarch64" or "aarch64_be". :-/ > Yeah, but then, I don't think reading this string is a useful way of solving this problem. R.
On 26 June 2014 11:01, Richard Earnshaw <rearnsha@arm.com> wrote: > On 26/06/14 10:36, Will Newton wrote: >> On 26 June 2014 10:14, Richard Earnshaw <rearnsha@arm.com> wrote: >>> On 25/06/14 14:12, Will Newton wrote: >>>> Add support for the new HWCAP2 values for ARMv8 added in the >>>> 3.15 kernel. Tested using QEMU which supports these extensions. >>>> >>>> ChangeLog: >>>> >>>> 2014-06-25 Will Newton <will.newton@linaro.org> >>>> >>>> * sysdeps/unix/sysv/linux/arm/dl-procinfo.c >>>> (_dl_arm_cap_flags): Add HWCAP2 values. >>>> * sysdeps/unix/sysv/linux/arm/dl-procinfo.h >>>> (_DL_HWCAP_COUNT): Increase to 37. >>>> (_DL_HWCAP_LAST): New define. >>>> (_DL_HWCAP2_LAST): New define. >>>> (_dl_procinfo): Add support for printing >>>> AT_HWCAP2 entries. >>>> (_dl_string_hwcap): Use _dl_hwcap_string. >>> >>> I don't have a specific comment about this patch. >>> >>> I do have a general comment that I think the HWCAPs exported by the >>> kernel for 32-bit ARM are a joke. The principle problem is that there >>> is precisely zero way to determine the base architecture. You cannot >>> even tell whether you are running on ARMv6 or ARMv7, let alone whether >>> you have key features such as Thumb2. >> >> I agree, and the situation on AArch64 looks no better. It's pretty >> much impossible to determine the micro-architecture from userland too >> - which may not be that much of an issue for ARM, but AArch64 likely >> much more so >> > > There has to be a better way of addressing that issue than reading the > microarchitecture name and then switching on that. The list is > potentially unbounded: what do you do when you encounter a new > micro-architecture? In the context of an ifunc resolver, do what you do now which is use the hwcap bits. The use case that I can imagine is where we have a microarchitecture that suffers from a particularly poor behaviour with e.g. the default memcpy then we can switch it to use a custom version. At the moment the only way I can see to deal with that is read /proc and stash that information inside ld.so somewhere at startup but that is rather ugly... >>> I've heard it suggested that you can part the architecture string (eg >>> armv7l), but 1) the format of this string is not precisely defined in a >>> way that allows you to predict what future cores will generate and 2) >>> parsing strings in ifunc code when function calls can't be made is >>> likely to be hairy at best. >> >> I think the platform is probably the best way to pass that info. The >> kernel currently sets it to: >> >> snprintf(elf_platform, ELF_PLATFORM_SIZE, "%s%c", >> list->elf_name, ENDIANNESS); >> >> Where elf_name is one of: >> >> v4 >> v5 >> v5t >> v6 >> v7 >> v7m >> >> That doesn't look too intractable, and we can work with the kernel >> guys to make sure nothing too surprising is added there. > > Until you realize that these do not have a total ordering; that is, > while you can write > > v4 < v5 < v5t < v6 < v7 > > You cannot insert v7m in that list at any point, since it is both more > and less than v6. In fact, it's both more and less than the baseline > v7, since it also has a divide instruction. I don't see why we would care about ordering these values - isn't it just a symbolic value? What use case do you have in mind? >> The string >> parsing of architecture revision is pretty trivial in those cases. >> Perhaps this is something we can discuss at the GNU Tools Cauldron >> next month. >> >> On AArch64 the platform string is hardcoded to "aarch64" or "aarch64_be". :-/ >> > > Yeah, but then, I don't think reading this string is a useful way of > solving this problem. Do you have an alternative proposal? ;-)
On 26/06/14 14:07, Will Newton wrote: > On 26 June 2014 11:01, Richard Earnshaw <rearnsha@arm.com> wrote: >> On 26/06/14 10:36, Will Newton wrote: >>> On 26 June 2014 10:14, Richard Earnshaw <rearnsha@arm.com> wrote: >>>> On 25/06/14 14:12, Will Newton wrote: >>>>> Add support for the new HWCAP2 values for ARMv8 added in the >>>>> 3.15 kernel. Tested using QEMU which supports these extensions. >>>>> >>>>> ChangeLog: >>>>> >>>>> 2014-06-25 Will Newton <will.newton@linaro.org> >>>>> >>>>> * sysdeps/unix/sysv/linux/arm/dl-procinfo.c >>>>> (_dl_arm_cap_flags): Add HWCAP2 values. >>>>> * sysdeps/unix/sysv/linux/arm/dl-procinfo.h >>>>> (_DL_HWCAP_COUNT): Increase to 37. >>>>> (_DL_HWCAP_LAST): New define. >>>>> (_DL_HWCAP2_LAST): New define. >>>>> (_dl_procinfo): Add support for printing >>>>> AT_HWCAP2 entries. >>>>> (_dl_string_hwcap): Use _dl_hwcap_string. >>>> >>>> I don't have a specific comment about this patch. >>>> >>>> I do have a general comment that I think the HWCAPs exported by the >>>> kernel for 32-bit ARM are a joke. The principle problem is that there >>>> is precisely zero way to determine the base architecture. You cannot >>>> even tell whether you are running on ARMv6 or ARMv7, let alone whether >>>> you have key features such as Thumb2. >>> >>> I agree, and the situation on AArch64 looks no better. It's pretty >>> much impossible to determine the micro-architecture from userland too >>> - which may not be that much of an issue for ARM, but AArch64 likely >>> much more so >>> >> >> There has to be a better way of addressing that issue than reading the >> microarchitecture name and then switching on that. The list is >> potentially unbounded: what do you do when you encounter a new >> micro-architecture? > > In the context of an ifunc resolver, do what you do now which is use > the hwcap bits. And that's my major gripe. They don't work, since they don't tell you what you really need to know. There's no HWCAP bit to say Thumb2; similarly there's no HWCAP bit to say that you've got the integer SIMD instructions or the v5e DSP instructions. It's assumed you can work this out from the architecture name; but there's insufficient specification for that to make it work going forwards to new architecture variants. The net result is that you have to start making inferences based on other HWCAPS (eg if I've got Neon, then I've got thumb2); but the inverse is not always implied (for example, if I have thumb2 I don't necessarily have Neon), so there will still be cases that are missed. It's also dangerous to do this since some of the inferences may end up being incorrect in the long run. I don't have an answer to any of this as of today, other than to go read the full hardware features bits out of CP15; but that's a privileged operation not available to user-space code. > The use case that I can imagine is where we have a > microarchitecture that suffers from a particularly poor behaviour with > e.g. the default memcpy then we can switch it to use a custom version. > At the moment the only way I can see to deal with that is read /proc > and stash that information inside ld.so somewhere at startup but that > is rather ugly... > For some of these, puting the relevant code in a VDSO might be a better approach; at least for critical system routines. >>>> I've heard it suggested that you can part the architecture string (eg >>>> armv7l), but 1) the format of this string is not precisely defined in a >>>> way that allows you to predict what future cores will generate and 2) >>>> parsing strings in ifunc code when function calls can't be made is >>>> likely to be hairy at best. >>> >>> I think the platform is probably the best way to pass that info. The >>> kernel currently sets it to: >>> >>> snprintf(elf_platform, ELF_PLATFORM_SIZE, "%s%c", >>> list->elf_name, ENDIANNESS); >>> >>> Where elf_name is one of: >>> >>> v4 >>> v5 >>> v5t >>> v6 >>> v7 >>> v7m >>> >>> That doesn't look too intractable, and we can work with the kernel >>> guys to make sure nothing too surprising is added there. >> >> Until you realize that these do not have a total ordering; that is, >> while you can write >> >> v4 < v5 < v5t < v6 < v7 >> >> You cannot insert v7m in that list at any point, since it is both more >> and less than v6. In fact, it's both more and less than the baseline >> v7, since it also has a divide instruction. > > I don't see why we would care about ordering these values - isn't it > just a symbolic value? What use case do you have in mind? Again, the sort of question like, can I use Thumb2 instructions? > >>> The string >>> parsing of architecture revision is pretty trivial in those cases. >>> Perhaps this is something we can discuss at the GNU Tools Cauldron >>> next month. >>> >>> On AArch64 the platform string is hardcoded to "aarch64" or "aarch64_be". :-/ >>> >> >> Yeah, but then, I don't think reading this string is a useful way of >> solving this problem. > > Do you have an alternative proposal? ;-) > Not really, at this time. Sorry. R.
diff --git a/sysdeps/unix/sysv/linux/arm/dl-procinfo.c b/sysdeps/unix/sysv/linux/arm/dl-procinfo.c index 113cda5..7cb3be9 100644 --- a/sysdeps/unix/sysv/linux/arm/dl-procinfo.c +++ b/sysdeps/unix/sysv/linux/arm/dl-procinfo.c @@ -23,7 +23,7 @@ If anything should be added here check whether the size of each string is still ok with the given array size. - All the #ifdefs in the definitions ar equite irritating but + All the #ifdefs in the definitions are quite irritating but necessary if we want to avoid duplicating the information. There are three different modes: @@ -46,13 +46,14 @@ #if !defined PROCINFO_DECL && defined SHARED ._dl_arm_cap_flags #else -PROCINFO_CLASS const char _dl_arm_cap_flags[22][10] +PROCINFO_CLASS const char _dl_arm_cap_flags[37][10] #endif #ifndef PROCINFO_DECL = { "swp", "half", "thumb", "26bit", "fastmult", "fpa", "vfp", "edsp", "java", "iwmmxt", "crunch", "thumbee", "neon", "vfpv3", "vfpv3d16", "tls", "vfpv4", "idiva", "idivt", "vfpd32", "lpae", "evtstrm", + "aes", "pmull", "sha1", "sha2", "crc32", } #endif #if !defined SHARED || defined PROCINFO_DECL diff --git a/sysdeps/unix/sysv/linux/arm/dl-procinfo.h b/sysdeps/unix/sysv/linux/arm/dl-procinfo.h index 20a3e92..f7557b9 100644 --- a/sysdeps/unix/sysv/linux/arm/dl-procinfo.h +++ b/sysdeps/unix/sysv/linux/arm/dl-procinfo.h @@ -23,32 +23,17 @@ #include <ldsodefs.h> #include <sysdep.h> -#define _DL_HWCAP_COUNT 22 +#define _DL_HWCAP_COUNT 37 -/* The kernel provides platform data but it is not interesting. */ -#define _DL_HWCAP_PLATFORM 0 - - -static inline int -__attribute__ ((unused)) -_dl_procinfo (unsigned int type, unsigned long int word) -{ - int i; - - /* Fallback to unknown output mechanism. */ - if (type == AT_HWCAP2) - return -1; - - _dl_printf ("AT_HWCAP: "); +/* Low 22 bits are allocated in HWCAP. */ +#define _DL_HWCAP_LAST 21 - for (i = 0; i < _DL_HWCAP_COUNT; ++i) - if (word & (1 << i)) - _dl_printf (" %s", GLRO(dl_arm_cap_flags)[i]); +/* Low 5 bits are allocated in HWCAP2. */ +#define _DL_HWCAP2_LAST 4 - _dl_printf ("\n"); +/* The kernel provides platform data but it is not interesting. */ +#define _DL_HWCAP_PLATFORM 0 - return 0; -} static inline const char * __attribute__ ((unused)) @@ -57,17 +42,47 @@ _dl_hwcap_string (int idx) return GLRO(dl_arm_cap_flags)[idx]; }; +static inline int +__attribute__ ((unused)) +_dl_procinfo (unsigned int type, unsigned long int word) +{ + switch(type) + { + case AT_HWCAP: + _dl_printf ("AT_HWCAP: "); + + for (int i = 0; i <= _DL_HWCAP_LAST; ++i) + if (word & (1 << i)) + _dl_printf (" %s", _dl_hwcap_string (i)); + break; + case AT_HWCAP2: + { + unsigned int offset = _DL_HWCAP_LAST + 1; + + _dl_printf ("AT_HWCAP2: "); + + for (int i = 0; i <= _DL_HWCAP2_LAST; ++i) + if (word & (1 << i)) + _dl_printf (" %s", _dl_hwcap_string (offset + i)); + break; + } + default: + /* This should not happen. */ + return -1; + } + _dl_printf ("\n"); + return 0; +} + #define HWCAP_IMPORTANT (HWCAP_ARM_VFP | HWCAP_ARM_NEON) static inline int __attribute__ ((unused)) _dl_string_hwcap (const char *str) { - int i; - - for (i = 0; i < _DL_HWCAP_COUNT; i++) + for (int i = 0; i < _DL_HWCAP_COUNT; i++) { - if (strcmp (str, GLRO(dl_arm_cap_flags)[i]) == 0) + if (strcmp (str, _dl_hwcap_string (i)) == 0) return i; } return -1;