Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf/x86: Fix spurious NMI with PEBS Load Latency event

Spurious NMIs will be observed with the following command:

  while :; do
    perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp" \
                  -e "cpu/umask=0x03,event=0x0/" \
                  -e "cpu/umask=0x02,event=0x0/" \
                  -e cycles,branches,cache-misses \
                  -e cache-references -- sleep 10
  done

The bug was introduced by commit:

8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")

That commit clears the status bits for the counters used for PEBS
events by masking status with the whole 64-bit pebs_enabled value.
However, only the low 32 bits of both status and pebs_enabled are
reserved for PEBS-able counters.

In status, bits 32-34 are the fixed counter overflow bits. In
pebs_enabled, bits 32-34 are for PEBS Load Latency.
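
The overlap between the two registers can be sketched in user-space C. This is an illustration only, not kernel code: the macro BITS_32_34 and both helper names are hypothetical, and only the bit positions are taken from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: bits 32-34, which mean different things per register. */
#define BITS_32_34 (0x7ULL << 32)

/* In the global status value, bits 32-34 report fixed counter overflows. */
static bool status_has_fixed_overflow(uint64_t status)
{
	return (status & BITS_32_34) != 0;
}

/* In pebs_enabled, the same bit positions are for PEBS Load Latency. */
static bool pebs_enabled_has_load_latency(uint64_t pebs_enabled)
{
	return (pebs_enabled & BITS_32_34) != 0;
}
```

Because both helpers test the same bit positions, using pebs_enabled unmodified as a mask over status also touches the fixed counter overflow bits.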

In the test case, the PEBS Load Latency event and a fixed counter event
could overflow at the same time. The fixed counter overflow bit is then
cleared by mistake. Once it is cleared, the fixed counter overflow is
never processed, which eventually triggers a spurious NMI.

Correct the PEBS enabled mask by ignoring the non-PEBS bits.
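
The difference between the two masking expressions can be reproduced outside the kernel. The sketch below is a minimal user-space model: the function names clear_pebs_status_old() and clear_pebs_status_new() are hypothetical, while MAX_PEBS_EVENTS, PEBS_COUNTER_MASK and the two expressions mirror the patch.

```c
#include <stdint.h>

#define MAX_PEBS_EVENTS   8
#define PEBS_COUNTER_MASK ((1ULL << MAX_PEBS_EVENTS) - 1)

/* Before the fix: every bit set in pebs_enabled is cleared from status. */
static uint64_t clear_pebs_status_old(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~pebs_enabled;
}

/* After the fix: only the PEBS counter bits are cleared from status. */
static uint64_t clear_pebs_status_new(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~(pebs_enabled & PEBS_COUNTER_MASK);
}
```

With bits 0 and 32 set in both status and pebs_enabled (a PEBS counter overflow together with a fixed counter overflow while Load Latency is enabled), the old expression returns 0 and drops the fixed counter overflow, while the new one returns a value with bit 32 still set.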

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Authored by Kan Liang and committed by Ingo Molnar
fd583ad1 18c5c7c6

+3 -2
+1 -1
arch/x86/events/intel/core.c
@@ -2151,7 +2151,7 @@
 	 * counters from the GLOBAL_STATUS mask and we always process PEBS
 	 * events via drain_pebs().
 	 */
-	status &= ~cpuc->pebs_enabled;
+	status &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
 
 	/*
 	 * PEBS overflow sets bit 62 in the global status register
+1 -1
arch/x86/events/intel/ds.c
@@ -1222,7 +1222,7 @@
 
 			/* clear non-PEBS bit and re-check */
 			pebs_status = p->status & cpuc->pebs_enabled;
-			pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
+			pebs_status &= PEBS_COUNTER_MASK;
 			if (pebs_status == (1 << bit))
 				return at;
 		}
+1
arch/x86/events/perf_event.h
@@ -79,6 +79,7 @@
 
 /* The maximal number of PEBS events: */
 #define MAX_PEBS_EVENTS		8
+#define PEBS_COUNTER_MASK	((1ULL << MAX_PEBS_EVENTS) - 1)
 
 /*
  * Flags PEBS can handle without an PMI.