Message ID | 20241220213711.1892696-4-sohil.mehta@intel.com |
---|---|
State | New |
Headers | show |
Series | Prepare for new Intel family models | expand |
On 12/20/24 13:36, Sohil Mehta wrote: > X86_FEATURE_REP_GOOD is only set for family 6 processors. Extend the > check to family numbers beyond 15. Could you explain why, please? > It is uncertain whether the Pentium 4s (family 15) should set the > feature flag as well. Commit 185f3b9da24c ("x86: make intel.c have > 64-bit support code") that originally set X86_FEATURE_REP_GOOD also set > the x86_cache_alignment preference for family 15 processors. The > omission of the family 15 seems intentional. > > Also, the 32-bit user copy alignment preference is only set for family 6 > and 15 processors. Extend the preference to family numbers beyond 15. Can you please provide some more context so it's clear which hunk this refers to? Alternatively, can you break this out into a separate patch?
On 12/20/2024 3:27 PM, Dave Hansen wrote: > On 12/20/24 13:36, Sohil Mehta wrote: >> X86_FEATURE_REP_GOOD is only set for family 6 processors. Extend the >> check to family numbers beyond 15. > > Could you explain why, please? > To answer this I was trying to understand where Fast string (X86_FEATURE_REP_GOOD) is used. It looks like copy_page() in lib/copy_page_64.S is the only place it really matters. clear_page() in include/asm/page_64.h would likely use Enhanced fast strings (ERMS) if available. Would it be correct to say that copy_page() and potentially clear_page() would be slower on Family 18/19 CPUs without the fix? >> It is uncertain whether the Pentium 4s (family 15) should set the >> feature flag as well. Commit 185f3b9da24c ("x86: make intel.c have >> 64-bit support code") that originally set X86_FEATURE_REP_GOOD also set >> the x86_cache_alignment preference for family 15 processors. The >> omission of the family 15 seems intentional. >> Analyzing more, it seems the below check in early_init_intel() is not really effective. /* * If fast string is not enabled in IA32_MISC_ENABLE for any reason, * clear the fast string and enhanced fast string CPU capabilities. */ if (c->x86_vfm >= INTEL_PENTIUM_M_DOTHAN) { rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable); if (!(misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING)) { pr_info("Disabled fast string operations\n"); setup_clear_cpu_cap(X86_FEATURE_REP_GOOD); setup_clear_cpu_cap(X86_FEATURE_ERMS); } } It gets overridden later in intel_init() with the below code: if (c->x86 == 6) set_cpu_cap(c, X86_FEATURE_REP_GOOD); Shouldn't the order of this be the other way around? >> Also, the 32-bit user copy alignment preference is only set for family 6 >> and 15 processors. Extend the preference to family numbers beyond 15. > > Can you please provide some more context so it's clear which hunk this > refers to? Alternatively, can you break this out into a separate patch? > This is referring to the below chunk. Separating it seems like a better idea to avoid ambiguity. > - > #ifdef CONFIG_X86_INTEL_USERCOPY > /* > * Set up the preferred alignment for movsl bulk memory moves > + * Family 4 - 486: untested > + * Family 5 - Pentium: untested > + * Family 6 - PII/PIII only like movsl with 8-byte alignment > + * Family 15 - P4 is OK down to 8-byte alignment > */ > - switch (c->x86) { > - case 4: /* 486: untested */ > - break; > - case 5: /* Old Pentia: untested */ > - break; > - case 6: /* PII/PIII only like movsl with 8-byte alignment */ > - movsl_mask.mask = 7; > - break; > - case 15: /* P4 is OK down to 8-byte alignment */ > + if (c->x86_vfm >= INTEL_PENTIUM_PRO) > movsl_mask.mask = 7; > - break; > - } > #endif
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 8ded9f859a3a..f44b2e618fb3 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -449,23 +449,16 @@ static void intel_workarounds(struct cpuinfo_x86 *c) (c->x86_stepping < 0x6 || c->x86_stepping == 0xb)) set_cpu_bug(c, X86_BUG_11AP); - #ifdef CONFIG_X86_INTEL_USERCOPY /* * Set up the preferred alignment for movsl bulk memory moves + * Family 4 - 486: untested + * Family 5 - Pentium: untested + * Family 6 - PII/PIII only like movsl with 8-byte alignment + * Family 15 - P4 is OK down to 8-byte alignment */ - switch (c->x86) { - case 4: /* 486: untested */ - break; - case 5: /* Old Pentia: untested */ - break; - case 6: /* PII/PIII only like movsl with 8-byte alignment */ - movsl_mask.mask = 7; - break; - case 15: /* P4 is OK down to 8-byte alignment */ + if (c->x86_vfm >= INTEL_PENTIUM_PRO) movsl_mask.mask = 7; - break; - } #endif intel_smp_check(c); @@ -563,7 +556,7 @@ static void init_intel(struct cpuinfo_x86 *c) #ifdef CONFIG_X86_64 if (c->x86 == 15) c->x86_cache_alignment = c->x86_clflush_size * 2; - if (c->x86 == 6) + if (c->x86 == 6 || c->x86 > 15) set_cpu_cap(c, X86_FEATURE_REP_GOOD); #else /*
X86_FEATURE_REP_GOOD is only set for family 6 processors. Extend the check to family numbers beyond 15. It is uncertain whether the Pentium 4s (family 15) should set the feature flag as well. Commit 185f3b9da24c ("x86: make intel.c have 64-bit support code") that originally set X86_FEATURE_REP_GOOD also set the x86_cache_alignment preference for family 15 processors. The omission of the family 15 seems intentional. Also, the 32-bit user copy alignment preference is only set for family 6 and 15 processors. Extend the preference to family numbers beyond 15. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> --- arch/x86/kernel/cpu/intel.c | 19 ++++++------------- 1 file changed, 6 insertions(+), 13 deletions(-)