Message ID | 20180606163846.495725-1-jeremy.linton@arm.com |
---|---|
State | Accepted |
Commit | e156ab71a974737c279530e3b868131291fe677e |
Headers | show |
Series | [v2] arm64: topology: Avoid checking numa mask for scheduler MC selection | expand |
On Wed, Jun 06, 2018 at 11:38:46AM -0500, Jeremy Linton wrote: > The numa mask subset check can often lead to system hang or crash during > CPU hotplug and system suspend operation if NUMA is disabled. This is > mostly observed on HMP systems where the CPU compute capacities are > different and ends up in different scheduler domains. Since > cpumask_of_node is returned instead core_sibling, the scheduler is > confused with incorrect cpumasks(e.g. one CPU in two different sched > domains at the same time) on CPU hotplug. > > Lets disable the NUMA siblings checks for the time being, as NUMA in > socket machines have LLC's that will assure that the scheduler topology > isn't "borken". > > The NUMA check exists to assure that if a LLC within a socket crosses > NUMA nodes/chiplets the scheduler domains remain consistent. This code will > likely have to be re-enabled in the near future once the NUMA mask story > is sorted. At the moment its not necessary because the NUMA in socket > machines LLC's are contained within the NUMA domains. > > Further, as a defensive mechanism during hot-plug, lets assure that the > LLC siblings are also masked. > > Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> > Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Thanks for this. I queued it for this merging window. -- Catalin -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index 7415c166281f..f845a8617812 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -215,13 +215,8 @@ EXPORT_SYMBOL_GPL(cpu_topology); const struct cpumask *cpu_coregroup_mask(int cpu) { - const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu)); + const cpumask_t *core_mask = &cpu_topology[cpu].core_sibling; - /* Find the smaller of NUMA, core or LLC siblings */ - if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) { - /* not numa in package, lets use the package siblings */ - core_mask = &cpu_topology[cpu].core_sibling; - } if (cpu_topology[cpu].llc_id != -1) { if (cpumask_subset(&cpu_topology[cpu].llc_siblings, core_mask)) core_mask = &cpu_topology[cpu].llc_siblings; @@ -239,8 +234,10 @@ static void update_siblings_masks(unsigned int cpuid) for_each_possible_cpu(cpu) { cpu_topo = &cpu_topology[cpu]; - if (cpuid_topo->llc_id == cpu_topo->llc_id) + if (cpuid_topo->llc_id == cpu_topo->llc_id) { cpumask_set_cpu(cpu, &cpuid_topo->llc_siblings); + cpumask_set_cpu(cpuid, &cpu_topo->llc_siblings); + } if (cpuid_topo->package_id != cpu_topo->package_id) continue;