Message ID: 20181011182352.25394-1-adhemerval.zanella@linaro.org
State:      Accepted
Commit:     c3d8dc45c9df199b8334599a6cbd98c9950dba62
Series:     x86: Fix Haswell strong flags (BZ#23709)
* Adhemerval Zanella:

> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index f4e0f5a2ed..80b3054cf8 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -316,7 +316,13 @@ init_cpu_features (struct cpu_features *cpu_features)
>  		      | bit_arch_Fast_Unaligned_Copy
>  		      | bit_arch_Prefer_PMINUB_for_stringop);
>  	  break;
> +	}
>
> +      /* Disable TSX on some Haswell processors to avoid TSX on kernels that
> +	 weren't updated with the latest microcode package (which disables
> +	 broken feature by default).  */
> +      switch (model)
> +	{
>  	case 0x3f:
>  	  /* Xeon E7 v3 with stepping >= 4 has working TSX.  */
>  	  if (stepping >= 4)

I think the change is okay as posted.  It will need some testing in the
field because the newly selected implementations could have unexpected
performance drawbacks.

Thanks,
Florian
On 23/10/2018 06:55, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
>> index f4e0f5a2ed..80b3054cf8 100644
>> --- a/sysdeps/x86/cpu-features.c
>> +++ b/sysdeps/x86/cpu-features.c
>> @@ -316,7 +316,13 @@ init_cpu_features (struct cpu_features *cpu_features)
>>  		      | bit_arch_Fast_Unaligned_Copy
>>  		      | bit_arch_Prefer_PMINUB_for_stringop);
>>  	  break;
>> +	}
>>
>> +      /* Disable TSX on some Haswell processors to avoid TSX on kernels that
>> +	 weren't updated with the latest microcode package (which disables
>> +	 broken feature by default).  */
>> +      switch (model)
>> +	{
>>  	case 0x3f:
>>  	  /* Xeon E7 v3 with stepping >= 4 has working TSX.  */
>>  	  if (stepping >= 4)
>
> I think the change is okay as posted.  It will need some testing in the
> field because the newly selected implementations could have unexpected
> performance drawbacks.

This fix only changes ifunc selection for the following Haswell chips [1]:

  Haswell (Client)
    GT3E        0 0x6 0x4 0x6    Family 6 Model 70
    ULT         0 0x6 0x4 0x5    Family 6 Model 69
    S           0 0x6 0x3 0xC    Family 6 Model 60
  Haswell (Server)
    E, EP, EX   0 0x6 0x3 0xF    Family 6 Model 63

On these chips the internal glibc flags bit_arch_Fast_Rep_String,
bit_arch_Fast_Unaligned_Load, bit_arch_Fast_Unaligned_Copy, and
bit_arch_Prefer_PMINUB_for_stringop won't be set:

* bit_arch_Fast_Rep_String is only used in ifunc selection on i686
  (32-bit), where it selects the *ssse3_rep* variants of memcpy, memmove,
  bcopy, and mempcpy.  If it is not set, the *ssse3* variants are used
  instead.  I am not sure what the performance difference between the two
  is, but my expectation is that ssse3_rep should be faster than ssse3.

* bit_arch_Fast_Unaligned_Load influences both i686 and x86_64.  For
  x86_64 it influences the selection of the SSE2 unaligned optimized
  variants of stpncpy, strcpy, strncpy, stpcpy, strncat, strcat, and
  strstr.  For all but strstr an ssse3 or sse2 variant is used instead.
  I am not sure what the performance difference is, but my expectation is
  that the unaligned versions should be faster.

* bit_arch_Fast_Unaligned_Copy influences mempcpy, memmove, and memcpy.
  If the chip has no SSE3, the bit will select either an ERMS or an
  unaligned variant.  For Haswell the *_avx_unaligned_erms variants will
  be selected, so this bit won't interfere with the best selection.

* bit_arch_Prefer_PMINUB_for_stringop is not used in ifunc selection.

[1] https://en.wikichip.org/wiki/intel/cpuid
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index f4e0f5a2ed..80b3054cf8 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -316,7 +316,13 @@ init_cpu_features (struct cpu_features *cpu_features)
 		      | bit_arch_Fast_Unaligned_Copy
 		      | bit_arch_Prefer_PMINUB_for_stringop);
 	  break;
+	}
 
+      /* Disable TSX on some Haswell processors to avoid TSX on kernels that
+	 weren't updated with the latest microcode package (which disables
+	 broken feature by default).  */
+      switch (model)
+	{
 	case 0x3f:
 	  /* Xeon E7 v3 with stepping >= 4 has working TSX.  */
 	  if (stepping >= 4)