perf events, x86: Work around the Nehalem AAJ80 erratum

On Nehalem CPUs the retired branch-misses event can be completely bogus
when no branch misses are occurring. When there are a lot of branch
misses the count is fairly accurate. Still, this leaves us with an
event that can over-count a lot.

Detect this erratum and work around it by using the BR_MISP_EXEC.ANY event.
This also counts speculated branches, but in practice it is still a lot
more precise than the architectural event.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-yyfg0bxo9jsqxd6a0ovfny27@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

+14 -2
arch/x86/kernel/cpu/perf_event_intel.c
···
 /*
  * Intel PerfMon, used on Core and later.
  */
-static const u64 intel_perfmon_event_map[] =
+static u64 intel_perfmon_event_map[PERF_COUNT_HW_MAX] __read_mostly =
 {
   [PERF_COUNT_HW_CPU_CYCLES] = 0x003c,
   [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
···
 	 * AJ106 could possibly be worked around by not allowing LBR
 	 * usage from PEBS, including the fixup.
 	 * AJ68 could possibly be worked around by always programming
-	 * a pebs_event_reset[0] value and coping with the lost events.
+	 * a pebs_event_reset[0] value and coping with the lost events.
 	 *
 	 * But taken together it might just make sense to not enable PEBS on
 	 * these chips.
···
 		x86_pmu.percore_constraints = intel_nehalem_percore_constraints;
 		x86_pmu.enable_all = intel_pmu_nhm_enable_all;
 		x86_pmu.extra_regs = intel_nehalem_extra_regs;
+
+		if (ebx & 0x40) {
+			/*
+			 * Erratum AAJ80 detected, we work it around by using
+			 * the BR_MISP_EXEC.ANY event. This will over-count
+			 * branch-misses, but it's still much better than the
+			 * architectural event which is often completely bogus:
+			 */
+			intel_perfmon_event_map[PERF_COUNT_HW_BRANCH_MISSES] = 0x7f89;
+
+			pr_cont("erratum AAJ80 worked around, ");
+		}
 		pr_cont("Nehalem events, ");
 		break;
 
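The remapping in the last hunk can be tried outside the kernel with a minimal user-space sketch. It is not the kernel code: `enum hw_event`, `event_map`, `apply_aaj80_workaround()` and the two decode helpers are hypothetical names made up for illustration. It assumes the usual Intel raw event layout (event select in bits 0-7, unit mask in bits 8-15), that 0x00c5 is the architectural branch-misses entry being replaced, and that `ebx` is the EBX output of CPUID leaf 0xA, where a set bit 6 marks the branch-mispredict-retired event as unavailable. The diff itself does not show where `ebx` comes from.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's PERF_COUNT_HW_* indices. */
enum hw_event { HW_CPU_CYCLES, HW_INSTRUCTIONS, HW_BRANCH_MISSES, HW_MAX };

#define ARCH_BR_MISP_RETIRED  0x00c5u /* assumed architectural event code */
#define BR_MISP_EXEC_ANY      0x7f89u /* replacement value from the patch */
#define EBX_NO_BR_MISP_RETIRED 0x40u  /* bit tested by the patch */

/*
 * The table must be writable for the fixup to work, which is why the
 * patch drops "const" from intel_perfmon_event_map.
 */
static uint64_t event_map[HW_MAX] = {
	[HW_CPU_CYCLES]    = 0x003c,
	[HW_INSTRUCTIONS]  = 0x00c0,
	[HW_BRANCH_MISSES] = ARCH_BR_MISP_RETIRED,
};

/* Remap branch-misses if the erratum bit is set; returns 1 if applied. */
static int apply_aaj80_workaround(uint32_t ebx)
{
	if (!(ebx & EBX_NO_BR_MISP_RETIRED))
		return 0;
	event_map[HW_BRANCH_MISSES] = BR_MISP_EXEC_ANY;
	return 1;
}

/* Decode helpers, assuming event select in bits 0-7, umask in bits 8-15. */
static unsigned int event_select(uint64_t config)
{
	return (unsigned int)(config & 0xff);
}

static unsigned int unit_mask(uint64_t config)
{
	return (unsigned int)((config >> 8) & 0xff);
}

/* Small self-check showing the intended use of the helpers above. */
static void self_check(void)
{
	event_map[HW_BRANCH_MISSES] = ARCH_BR_MISP_RETIRED;
	assert(apply_aaj80_workaround(0x00) == 0); /* bit clear: untouched */
	assert(apply_aaj80_workaround(0x40) == 1); /* bit set: remapped */
	assert(event_select(event_map[HW_BRANCH_MISSES]) == 0x89);
	assert(unit_mask(event_map[HW_BRANCH_MISSES]) == 0x7f);
}
```

Under these assumptions, 0x7f89 decodes to event select 0x89 with unit mask 0x7f, which matches the BR_MISP_EXEC.ANY name used in the changelog.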