x86, cpu: Fix cache topology for early P4-SMT

P4 systems with cpuid level < 4 can have SMT, but the cache topology
description available (cpuid2) does not include SMP information.
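
For illustration only (a userspace sketch using GCC's <cpuid.h> helpers,
not kernel code): the condition can be seen with the same cpuid leaves
the kernel consults. Leaf 0 reports the highest supported basic leaf,
and leaf 1 carries the HTT bit and the logical CPU count, so the
"SMT advertised but no leaf 4" case looks like this:

/* Userspace sketch (GCC <cpuid.h>); illustrative, not kernel code. */
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;
	unsigned int max_leaf = __get_cpuid_max(0, NULL);

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 1;

	printf("max basic cpuid leaf: %u\n", max_leaf);
	printf("HTT bit (leaf 1, edx[28]): %u\n", (edx >> 28) & 1);
	printf("logical CPUs per package (leaf 1, ebx[23:16]): %u\n",
	       (ebx >> 16) & 0xff);

	/* The problem case: SMT advertised, but no leaf 4 cache topology. */
	if (((edx >> 28) & 1) && max_leaf < 4)
		printf("SMT present but cpuid leaf 4 unavailable\n");

	return 0;
}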

Now we know that SMT shares all cache levels, and therefore we can
mark all available cache levels as shared.

We do this by setting cpu_llc_id to ->phys_proc_id, since that's the
same for each SMT thread. We can do this unconditionally since, if
there's no SMT, it's still true: the one CPU shares cache only with
itself.
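
A minimal standalone sketch of that idea, with simplified stand-in
types and names (not the kernel's per-cpu variables): once the LLC id
falls back to the package id, all SMT siblings compare equal and end
up in the same LLC mask.

/* Simplified stand-in for the per-cpu cpu_llc_id logic. */
#include <stdbool.h>

#define BAD_ID	(~0u)

struct cpu {
	unsigned int phys_proc_id;	/* package id, same for SMT siblings */
	unsigned int llc_id;		/* last-level-cache id */
};

/* cpuid leaf 4 would set llc_id; without it, fall back to the package id. */
static void fixup_llc_id(struct cpu *c)
{
	if (c->llc_id == BAD_ID)
		c->llc_id = c->phys_proc_id;
}

/* The LLC mask is built from this relation: equal llc_id => shared cache. */
static bool share_llc(const struct cpu *a, const struct cpu *b)
{
	return a->llc_id == b->llc_id;
}

int main(void)
{
	struct cpu t0 = { .phys_proc_id = 0, .llc_id = BAD_ID };
	struct cpu t1 = { .phys_proc_id = 0, .llc_id = BAD_ID };

	fixup_llc_id(&t0);
	fixup_llc_id(&t1);

	return share_llc(&t0, &t1) ? 0 : 1;	/* siblings now share the LLC */
}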

This fixes a problem where such CPUs report an incorrect LLC CPU mask.

This in turn fixes a crash in the scheduler, where the topology was
built incorrectly; the scheduler assumes the LLC mask includes at
least the SMT siblings.
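
The invariant in question, sketched with plain bitmasks standing in
for the kernel's cpumask API:

/* Plain-bitmask sketch of the invariant; not the kernel's cpumask API. */
#include <assert.h>

/* Each CPU's SMT sibling mask must be a subset of its LLC mask. */
static int llc_covers_smt(unsigned long smt_mask, unsigned long llc_mask)
{
	return (smt_mask & ~llc_mask) == 0;
}

int main(void)
{
	/* P4-SMT before the fix: two threads, but LLC mask only covers CPU0. */
	assert(!llc_covers_smt(0x3, 0x1));

	/* After the fix the LLC mask includes both SMT siblings again. */
	assert(llc_covers_smt(0x3, 0x3));

	return 0;
}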

Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Bruno Wolff III <bruno@wolff.to>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140722133514.GM12054@laptop.lan
Signed-off-by: H. Peter Anvin <hpa@zytor.com>


Changed files: +23 -11

arch/x86/kernel/cpu/intel.c: +11 -11

@@ -370,6 +370,17 @@
 	 */
 	detect_extended_topology(c);
 
+	if (!cpu_has(c, X86_FEATURE_XTOPOLOGY)) {
+		/*
+		 * let's use the legacy cpuid vector 0x1 and 0x4 for topology
+		 * detection.
+		 */
+		c->x86_max_cores = intel_num_cpu_cores(c);
+#ifdef CONFIG_X86_32
+		detect_ht(c);
+#endif
+	}
+
 	l2 = init_intel_cacheinfo(c);
 	if (c->cpuid_level > 9) {
 		unsigned eax = cpuid_eax(10);
@@ -448,17 +459,6 @@
 	if (c->x86 == 6)
 		set_cpu_cap(c, X86_FEATURE_P3);
 #endif
-
-	if (!cpu_has(c, X86_FEATURE_XTOPOLOGY)) {
-		/*
-		 * let's use the legacy cpuid vector 0x1 and 0x4 for topology
-		 * detection.
-		 */
-		c->x86_max_cores = intel_num_cpu_cores(c);
-#ifdef CONFIG_X86_32
-		detect_ht(c);
-#endif
-	}
 
 	/* Work around errata */
 	srat_detect_node(c);

arch/x86/kernel/cpu/intel_cacheinfo.c: +12 -0

@@ -730,6 +730,18 @@
 #endif
 	}
 
+#ifdef CONFIG_X86_HT
+	/*
+	 * If cpu_llc_id is not yet set, this means cpuid_level < 4 which in
+	 * turns means that the only possibility is SMT (as indicated in
+	 * cpuid1). Since cpuid2 doesn't specify shared caches, and we know
+	 * that SMT shares all caches, we can unconditionally set cpu_llc_id to
+	 * c->phys_proc_id.
+	 */
+	if (per_cpu(cpu_llc_id, cpu) == BAD_APICID)
+		per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
+#endif
+
 	c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
 
 	return l2;
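
As a reading of the intel.c hunk above: init_intel_cacheinfo() now
depends on ->phys_proc_id, which detect_ht() fills in, so the legacy
topology block has to run first. A stripped-down sketch of that
ordering, with placeholder bodies rather than the real implementations:

/* Placeholder bodies; only the call order matters here. */
struct cpuinfo { unsigned int phys_proc_id, llc_id; };

static void detect_ht(struct cpuinfo *c)
{
	c->phys_proc_id = 0;		/* derived from the APIC id in reality */
}

static void init_cacheinfo(struct cpuinfo *c)
{
	c->llc_id = c->phys_proc_id;	/* the fallback added above */
}

int main(void)
{
	struct cpuinfo c = { 0 };

	detect_ht(&c);		/* must run first so phys_proc_id is valid */
	init_cacheinfo(&c);	/* consumes phys_proc_id for the LLC id */

	return 0;
}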