Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf tools: Add support for pinned modifier

This commit adds support for a new modifier "D", which requests that the
event, or group of events, be pinned to the PMU.

The "p" modifier is already taken for precise, and "P" may be used in
future to mean "fully precise".

So we use "D", which stands for pinneD - and looks like a padlock, or if
you're using the ":D" syntax perf smiles at you.

This is an oft-requested feature from our HW folks, who want to be able
to run a large number of events, but also want 100% accurate results for
instructions per cycle.

Comparison of results with and without pinning:

$ perf stat -e '{cycles,instructions}:D' -e cycles,instructions,...

    79,590,480,683 cycles                    #    0.000 GHz
   166,123,716,524 instructions              #    2.09  insns per cycle
                                             #    0.11  stalled cycles per insn

    79,352,134,463 cycles                    #    0.000 GHz                     [11.11%]
   165,178,301,818 instructions              #    2.08  insns per cycle
                                             #    0.11  stalled cycles per insn [11.13%]

As you can see, although perf does a very good job of scaling the values
in the non-pinned case (the second pair, where each event was counting
only about 11% of the time), there is some small discrepancy.
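
For readers unfamiliar with the scaling mentioned above: when events are
multiplexed, a count is usually extrapolated from the time the event was
actually on the PMU to the time it was enabled. A minimal sketch of that
arithmetic follows. The function name `toy_scale_count` and the rounding
choice are assumptions made for illustration, not code from this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: extrapolate a raw count taken while the event was
 * scheduled for `running` time units out of `enabled` time units.
 */
static uint64_t toy_scale_count(uint64_t raw, uint64_t enabled, uint64_t running)
{
	assert(running <= enabled);

	/* The event never got onto the PMU: nothing to extrapolate from. */
	if (running == 0)
		return 0;

	/* Scale in floating point and round to the nearest integer. */
	return (uint64_t)((double)raw * (double)enabled / (double)running + 0.5);
}
```

A pinned event is not multiplexed this way, so enabled and running time
match and the raw count is reported unchanged.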

The patch is fairly straightforward; the one detail is that we need to
make sure we only request pinning for the group leader when we have a
group.
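
The leader-only rule can be sketched outside of perf as follows. This is
a simplified illustration, not the patch itself: `struct toy_evsel`,
`toy_is_group_leader()` and `toy_apply_pinned()` are hypothetical
stand-ins, and the sketch assumes that a leader (or an ungrouped event)
points to itself through its `leader` field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for perf's evsel; only the fields needed here. */
struct toy_evsel {
	const char *name;
	struct toy_evsel *leader;	/* self for a leader or ungrouped event */
	int pinned;
};

/* Assumed equivalent of the group-leader test used by the patch. */
static int toy_is_group_leader(const struct toy_evsel *evsel)
{
	return evsel->leader == evsel;
}

/* Apply a parsed "D" modifier to n events, touching group leaders only. */
static void toy_apply_pinned(struct toy_evsel *evsels, size_t n, int pinned)
{
	assert(evsels != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (toy_is_group_leader(&evsels[i]))
			evsels[i].pinned = pinned;
	}
}
```

With a two-event group such as '{cycles,instructions}:D', only the
cycles entry would end up with pinned set; the sibling keeps its default.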

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1375795686-4226-1-git-send-email-michael@ellerman.id.au
[ Use perf_evsel__is_group_leader instead of open coded equivalent, as
suggested by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Authored by Michael Ellerman and committed by Arnaldo Carvalho de Melo
e9a7c414 d50bf78f

+13 -2
+1
tools/perf/Documentation/perf-list.txt
···
  H - host counting (not in KVM guests)
  p - precise level
  S - read sample value (PERF_SAMPLE_READ)
+ D - pin the event to the PMU
 
 The 'p' modifier can be used for specifying how precise the instruction
 address should be. The 'p' modifier can be specified multiple times:
+10 -1
tools/perf/util/parse-events.c
···
 	int precise;
 	int exclude_GH;
 	int sample_read;
+	int pinned;
 };
 
 static int get_event_modifier(struct event_modifier *mod, char *str,
···
 	int eG = evsel ? evsel->attr.exclude_guest : 0;
 	int precise = evsel ? evsel->attr.precise_ip : 0;
 	int sample_read = 0;
+	int pinned = evsel ? evsel->attr.pinned : 0;
 
 	int exclude = eu | ek | eh;
 	int exclude_GH = evsel ? evsel->exclude_GH : 0;
···
 				eG = 1;
 		} else if (*str == 'S') {
 			sample_read = 1;
+		} else if (*str == 'D') {
+			pinned = 1;
 		} else
 			break;
 
···
 	mod->precise = precise;
 	mod->exclude_GH = exclude_GH;
 	mod->sample_read = sample_read;
+	mod->pinned = pinned;
+
 	return 0;
 }
 
···
 	char *p = str;
 
 	/* The sizeof includes 0 byte as well. */
-	if (strlen(str) > (sizeof("ukhGHpppS") - 1))
+	if (strlen(str) > (sizeof("ukhGHpppSD") - 1))
 		return -1;
 
 	while (*p) {
···
 		evsel->attr.exclude_guest = mod.eG;
 		evsel->exclude_GH = mod.exclude_GH;
 		evsel->sample_read = mod.sample_read;
+
+		if (perf_evsel__is_group_leader(evsel))
+			evsel->attr.pinned = mod.pinned;
 	}
 
 	return 0;
+2 -1
tools/perf/util/parse-events.l
···
 num_raw_hex	[a-fA-F0-9]+
 name		[a-zA-Z_*?][a-zA-Z0-9_*?]*
 name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?]*
-modifier_event	[ukhpGHS]+
+/* If you add a modifier you need to update check_modifier() */
+modifier_event	[ukhpGHSD]+
 modifier_bp	[rwx]{1,3}
 
 %%
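
The lexer's character class and the length bound in check_modifier() have
to be kept in sync, which is what the new comment warns about. The
standalone sketch below combines both checks to show how they relate. It
is an illustration only: `toy_check_modifier()` and `TOY_MAX_MODIFIERS`
are hypothetical names, and the real check_modifier() does more than is
visible in this diff.

```c
#include <assert.h>
#include <string.h>

/* Longest accepted modifier string, copied from the patched length check. */
#define TOY_MAX_MODIFIERS "ukhGHpppSD"

/* Returns 0 if str passes both checks sketched here, -1 otherwise. */
static int toy_check_modifier(const char *str)
{
	assert(str != NULL);

	/* sizeof counts the trailing NUL, so the bound is 10 characters. */
	if (strlen(str) > sizeof(TOY_MAX_MODIFIERS) - 1)
		return -1;

	/* Same character class as the lexer's modifier_event pattern. */
	if (*str == '\0' || strspn(str, "ukhpGHSD") != strlen(str))
		return -1;

	return 0;
}
```

Under these assumptions "D" and "ukhGHpppSD" are accepted, while an
eleven-character string or one containing an unknown letter is rejected.
Adding a modifier to only one of the two places would make the lexer and
the length check disagree.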