commits

Pull perf updates from Thomas Gleixner:
"A larger set of perf updates.

Not all of them are strictly fixes, but that's solely the tip
maintainers' fault as they let the timely -rc1 pull request fall
through the cracks for various reasons including travel. So I'm
sending this nevertheless because rebasing and disentangling fixes and
updates would be a mess and risky as well. As of tomorrow, a strict
fixes separation is happening again. Sorry for the slip-up.

Kernel:

- Handle RECORD_MMAP vs. RECORD_MMAP2 correctly so different
consumers of the mmap event get what they requested.

Tools:

- A larger set of updates to perf record/report/scripts vs. time
stamp handling

- More Python3 fixups

- A pile of memory leak plumbing

- perf BPF improvements and fixes

- Finalize the perf.data directory storage"

[ Note: the kernel part is strictly a fix, the updates are purely to
tooling - Linus ]

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
perf bpf: Show more BPF program info in print_bpf_prog_info()
perf bpf: Extract logic to create program names from perf_event__synthesize_one_bpf_prog()
perf tools: Save bpf_prog_info and BTF of new BPF programs
perf evlist: Introduce side band thread
perf annotate: Enable annotation of BPF programs
perf build: Check what binutils's 'disassembler()' signature to use
perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotation
perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO
perf feature detection: Add -lopcodes to feature-libbfd
perf top: Add option --no-bpf-event
perf bpf: Save BTF information as headers to perf.data
perf bpf: Save BTF in a rbtree in perf_env
perf bpf: Save bpf_prog_info information as headers to perf.data
perf bpf: Save bpf_prog_info in a rbtree in perf_env
perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool
perf bpf: Synthesize bpf events with bpf_program__get_prog_info_linear()
bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump()
tools lib bpf: Introduce bpf_program__get_prog_info_linear()
perf record: Replace option --bpf-event with --no-bpf-event
perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test()
...

6y ago

Valentin Schneider

b9a7b883

sched/fair: Skip LLC NOHZ logic for asymmetric systems

6y ago

zhangyi (F)

5e86bdda

ext4: cleanup bh release code in ext4_ind_remove_space()

6y ago

Linus Torvalds

19caf581

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

6y ago

Thomas Gleixner

d8b5297f

Merge tag 'perf-core-for-mingo-5.1-20190321' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

6y ago

Valentin Schneider

a0fe2cf0

sched/fair: Tune down misfit NOHZ kicks

6y ago

zhangyi (F)

674a2b27

ext4: brelse all indirect buffer in ext4_ind_remove_space()

6y ago

Linus Torvalds

a75eda7b

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

6y ago

Kairui Song

ffc8599a

x86/gart: Exclude GART aperture from kcore

6y ago

Thomas Gleixner

4a98be82

Merge tag 'perf-core-for-mingo-5.1-20190311' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/core improvements and fixes from Arnaldo:

kernel:

Stephane Eranian:

- Restore mmap record type correctly when handling PERF_RECORD_MMAP2
events, as the same template is used for all the threads interested
in mmap events, some may want just PERF_RECORD_MMAP, while some
may want the extra info in MMAP2 records.

perf probe:

Adrian Hunter:

- Fix getting the kernel map: since the changes related to x86 PTI
entry trampoline handling, there is more than one kernel map.

perf script:

Andi Kleen:

- Support insn output for normal samples, i.e.:

perf script -F ip,sym,insn --xed

Will fetch the sample IP from the thread address space and feed it
to Intel's XED disassembler, producing lines such as:

ffffffffa4068804 native_write_msr wrmsr
ffffffffa415b95e __hrtimer_next_event_base movq 0x18(%rax), %rdx

That matches the output of 'perf annotate'.

- Make the --cpu filter apply to PERF_RECORD_COMM/FORK/... events, in
addition to PERF_RECORD_SAMPLE.

perf report:

- Add a new --samples option to save a small random number of samples
per hist entry, using a reservoir technique to select a representative
number of samples.

Then allow browsing the samples using 'perf script' as part of the hist
entry context menu. This automatically adds the right filters, so only
the thread or CPU of the sample is displayed. Then we use less' search
functionality to directly jump to the time stamp of the selected sample.

It uses different menus for assembler and source display. Assembler
needs xed installed and source needs debuginfo.

- Fix the UI browser scripts pop up menu when there are many scripts
available.
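
As an illustration of the reservoir technique mentioned for the --samples
option above, here is a minimal sketch of the textbook "Algorithm R". It is
not taken from the perf sources: 'struct reservoir', reservoir_add(),
reservoir_len() and the capacity RES_CAP are hypothetical names picked for
this example, and rand() stands in for whatever random source a real
implementation would use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define RES_CAP 4	/* hypothetical: samples kept per hist entry */

struct reservoir {
	uint64_t slot[RES_CAP];	/* timestamps of the samples kept so far */
	size_t	 seen;		/* number of samples offered so far */
};

/*
 * Offer one sample. The first RES_CAP samples are always kept; after
 * that the n-th sample replaces a random slot with probability
 * RES_CAP / n, so every sample is equally likely to survive.
 * (rand() % n has a small modulo bias, acceptable for a sketch.)
 */
static void reservoir_add(struct reservoir *r, uint64_t sample)
{
	assert(r != NULL);

	if (r->seen < RES_CAP) {
		r->slot[r->seen] = sample;
	} else {
		size_t j = (size_t)rand() % (r->seen + 1);

		if (j < RES_CAP)
			r->slot[j] = sample;
	}
	r->seen++;
}

/* Number of valid entries in the reservoir. */
static size_t reservoir_len(const struct reservoir *r)
{
	return r->seen < RES_CAP ? r->seen : RES_CAP;
}
```

The memory used stays fixed at RES_CAP entries no matter how many samples
a hist entry receives, which is what makes the approach attractive for
keeping "a small random number of samples".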

perf report:

Andi Kleen:

- Add 'time' sort option. E.g.:

% perf report --sort time,overhead,symbol --time-quantum 1ms --stdio
...
0.67% 277061.87300 [.] _dl_start
0.50% 277061.87300 [.] f1
0.50% 277061.87300 [.] f2
0.33% 277061.87300 [.] main
0.29% 277061.87300 [.] _dl_lookup_symbol_x
0.29% 277061.87300 [.] dl_main
0.29% 277061.87300 [.] do_lookup_x
0.17% 277061.87300 [.] _dl_debug_initialize
0.17% 277061.87300 [.] _dl_init_paths
0.08% 277061.87300 [.] check_match
0.04% 277061.87300 [.] _dl_count_modids
1.33% 277061.87400 [.] f1
1.33% 277061.87400 [.] f2
1.33% 277061.87400 [.] main
1.17% 277061.87500 [.] main
1.08% 277061.87500 [.] f1
1.08% 277061.87500 [.] f2
1.00% 277061.87600 [.] main
0.83% 277061.87600 [.] f1
0.83% 277061.87600 [.] f2
1.00% 277061.87700 [.] main
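
The grouping seen in the output above, where all samples that fall into the
same 1ms window share one time value, can be sketched as below. This is only
an illustration under the assumption that timestamps are rounded down to the
start of their quantum; time_bucket() is a hypothetical helper, not a perf
function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a nanosecond timestamp down to the start of its quantum, so
 * that all samples inside one quantum get the same sort key.
 */
static uint64_t time_bucket(uint64_t ts_ns, uint64_t quantum_ns)
{
	assert(quantum_ns != 0);
	return ts_ns - ts_ns % quantum_ns;
}
```

With a quantum of 1000000 ns (1ms), two samples taken at 277061.873100 and
277061.873900 seconds land in the same bucket, while one taken at
277061.874000 starts the next.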

tools headers:

Arnaldo Carvalho de Melo:

- Update x86's syscall_64.tbl, no change in tools/perf behaviour.

- Sync the copies of asm-generic/unistd.h and linux/in with the kernel sources.

perf data:

Jiri Olsa:

- Prep work to support having perf.data stored as a directory, with one
file per CPU, that ultimately will allow having one ring buffer reading
thread per CPU.

Vendor events:

Martin Liška:

- perf PMU events for AMD Family 17h.

perf script python:

Tony Jones:

- Add python3 support for the remaining Intel PT related scripts, with
these we should have a clean build of perf with python3 while still
supporting the build with python2.

libbpf:

Arnaldo Carvalho de Melo:

- Fix the build on uClibc by adding the missing stdarg.h include, since
we use va_list in one typedef.
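
The kind of problem fixed here can be sketched as follows. The typedef and
helpers are hypothetical and only mirror the shape of the issue: any header
that names va_list in a typedef should include stdarg.h itself rather than
rely on another header pulling it in, which not every C library does.

```c
#include <assert.h>
#include <stdarg.h>	/* defines va_list; do not rely on other headers for it */
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type that takes a va_list, for illustration only. */
typedef int (*print_fn_t)(const char *fmt, va_list ap);

static char last_msg[128];

/* Example callback: format the message into a static buffer. */
static int capture_print(const char *fmt, va_list ap)
{
	return vsnprintf(last_msg, sizeof(last_msg), fmt, ap);
}

/* Variadic front end that forwards its arguments to the callback. */
static int emit(print_fn_t fn, const char *fmt, ...)
{
	va_list ap;
	int ret;

	assert(fn != NULL);
	va_start(ap, fmt);
	ret = fn(fmt, ap);
	va_end(ap);
	return ret;
}
```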

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

6y ago

Song Liu

f8dfeae0

perf bpf: Show more BPF program info in print_bpf_prog_info()

6y ago

Valentin Schneider +1

e25a7a94

sched/fair: Comment some nohz_balancer_kick() kick conditions

6y ago

Lukas Czerner

6c732840

ext4: report real fs size after failed resize

6y ago

Linus Torvalds

f6cc519b

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

6y ago

Alexander Shiyan

2a6a8e2d

clocksource/drivers/clps711x: Remove board support

6y ago

Thomas Gleixner

f7798711

Merge branch 'x86/cpu' into x86/urgent

6y ago

kbuild test robot

c634dc6b

perf/x86/intel: Make dev_attr_allow_tsx_force_abort static

6y ago

Arnaldo Carvalho de Melo

dfcbc2f2

tools lib bpf: Fix the build by adding a missing stdarg.h include

6y ago

Song Liu

fc462ac7

perf bpf: Extract logic to create program names from perf_event__synthesize_one_bpf_prog()

6y ago

Konstantin Khlebnikov

4c47acd8

sched/core: Fix buffer overflow in cgroup2 property cpu.max

6y ago

Lukas Czerner

d64264d6

ext4: add missing brelse() in add_new_gdb_meta_bg()

6y ago

Linus Torvalds

e08fef88

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

6y ago

Chen Jie

5a07168d

futex: Ensure that futex address is aligned in handle_futex_death()

6y ago

Atish Patra

32d0be01

clocksource/drivers/riscv: Fix clocksource mask

6y ago

Nathan Chancellor

e898e69d

x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error

6y ago

Ingo Molnar

266d63a7

x86/cpufeature: Fix various quality problems in the <asm/cpu_device_hd.h> header

7y ago

Peter Zijlstra

f764c58b

perf/x86: Fixup typo in stub functions

6y ago

Andi Kleen

e3b74de5

perf tools report: Add custom scripts to script menu

6y ago

Song Liu

d56354dc

perf tools: Save bpf_prog_info and BTF of new BPF programs

To fully annotate BPF programs with source code mapping, four different
pieces of information are needed:

1) PERF_RECORD_KSYMBOL
2) PERF_RECORD_BPF_EVENT
3) bpf_prog_info
4) btf

This patch handles 3) and 4) for BPF programs loaded after 'perf
record|top'.

For timely processing of this information, a dedicated event is added to
the side band evlist.

When PERF_RECORD_BPF_EVENT is received via the side band event, the
polling thread gathers 3) and 4) via sys_bpf and stores them in perf_env.

This information is saved to perf.data at the end of 'perf record'.

Committer testing:

The 'wakeup_watermark' member in 'struct perf_event_attr' is inside an
unnamed union, so it can't be used in a struct designated initializer
with older gccs. Take it out of the initializer and set it separately
as 'attr.wakeup_watermark = 1;' to work with all gcc versions.
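
A minimal sketch of that workaround, using a hypothetical 'struct demo_attr'
that only imitates the relevant shape of 'struct perf_event_attr' (a named
member next to an unnamed union); make_attr() is likewise made up for the
example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for 'struct perf_event_attr'; only the shape matters. */
struct demo_attr {
	uint32_t type;
	union {				/* unnamed union */
		uint32_t wakeup_events;
		uint32_t wakeup_watermark;
	};
};

/* Both union members alias the same storage. */
static_assert(offsetof(struct demo_attr, wakeup_events) ==
	      offsetof(struct demo_attr, wakeup_watermark),
	      "union members must overlap");

static struct demo_attr make_attr(void)
{
	/* Members outside the union are fine in a designated initializer. */
	struct demo_attr attr = { .type = 1 };

	/* Set the union member separately so older compilers accept it. */
	attr.wakeup_watermark = 1;
	return attr;
}
```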

We also need to add '--no-bpf-event' to the 'perf record'
perf_event_attr tests in 'perf test'. That test intercepts the events
being set up and checks whether they match the fields described in the
control files; since it now finds the side band event used to catch
PERF_RECORD_BPF_EVENT first, they all fail.

With these issues fixed:

Same scenario as for testing BPF programs loaded before 'perf record' or
'perf top' starts, only start the BPF programs after 'perf record|top',
so that their information gets collected by the sideband threads; the rest
works as for the programs loaded before monitoring starts.

Add the missing 'inline' to bpf_event__add_sb_event() for when
HAVE_LIBBPF_SUPPORT is not defined, fixing the build on systems without
binutils devel files installed.

Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190312053051.2690567-16-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

6y ago

Peter Zijlstra

a23314e9

sched/cpufreq: Fix 32-bit math overflow

6y ago

Jason Yan

7cf77140

ext4: remove useless ext4_pin_inode()

6y ago

Linus Torvalds

1ebf5afb

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

6y ago

Gustavo A. R. Silva

93417a3f

genirq: Mark expected switch case fall-through

6y ago

Bart Van Assche

82efcab3

workqueue: Only unregister a registered lockdep key

6y ago

YueHaibing

9039de40

clocksource/drivers/mips-gic-timer: Make gic_compare_irqaction static

6y ago

Valdis Kletnieks

4fe64a62

x86/mm/pti: Make local symbols static

6y ago

Kan Liang

0f42b790

x86/cpufeature: Add facility to check for min microcode revisions

7y ago

Peter Zijlstra

ede271b0

perf/x86/intel: Fix memory corruption

6y ago

Andi Kleen

59c24980

perf ui browser: Fix ui popup argv browser for many entries

6y ago

Song Liu

657ee553

perf evlist: Introduce side band thread

6y ago

Sebastian Andrzej Siewior

ad01423a

kthread: Do not use TIMER_IRQSAFE

6y ago

Jan Kara

1dc1097f

ext4: avoid panic during forced reboot

6y ago

Linus Torvalds

9fc13bbd

Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux

6y ago

Valdis Kletnieks

48084abf

watchdog/core: Make variables static

6y ago

Valdis Kletnieks

bb2e3205

genirq/devres: Remove excess parameter from kernel doc

6y ago

Linus Torvalds

9e98c678

Linux 5.1-rc1 v5.1-rc1

6y ago

YueHaibing

008258d9

clocksource/drivers/timer-ti-dm: Make omap_dm_timer_set_load_start() static

6y ago

Matthew Whitehead

0f4d3aa7

x86/cpu/cyrix: Remove {get,set}Cx86_old macros used for Cyrix processors

6y ago

Jiaxun Yang

02371991

x86/CPU/AMD: Set the CPB bit unconditionally on F17h

7y ago

Linus Torvalds

f261c4e5

Merge branch 'akpm' (patches from Andrew)

6y ago

Andi Kleen

905e4aff

perf script: Add array bound checking to list_scripts

6y ago

Song Liu

6987561c

perf annotate: Enable annotation of BPF programs

In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into
a new function symbol__disassemble_bpf(), where annotation line
information is filled based on the bpf_prog_info and btf data saved in
given perf_env.

symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf
programs.

Committer testing:

After fixing this:

- u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
+ u64 *addrs = (u64 *)(uintptr_t)(info_linear->info.jited_ksyms);

Detected when crossbuilding to a 32-bit arch.

And making all this dependent on HAVE_LIBBFD_SUPPORT and
HAVE_LIBBPF_SUPPORT:

1) Have a BPF program running, one that has BTF info, etc, I used
the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place
by 'perf trace'.

# grep -B1 augmented_raw ~/.perfconfig
[trace]
add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
#
# perf trace -e *mmsg
dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2
NetworkManager/10055 sendmmsg(22<socket:[1056822]>, 0x7f8126ad1bb0, 2, MSG_NOSIGNAL) = 2

2) Then do a 'perf record' system wide for a while:

# perf record -a
^C[ perf record: Woken up 68 times to write data ]
[ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ]
#

3) Check that we captured BPF and BTF info in the perf.data file:

# perf report --header-only | grep 'b[pt]f'
# event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
# bpf_prog_info of id 13
# bpf_prog_info of id 14
# bpf_prog_info of id 15
# bpf_prog_info of id 16
# bpf_prog_info of id 17
# bpf_prog_info of id 18
# bpf_prog_info of id 21
# bpf_prog_info of id 22
# bpf_prog_info of id 41
# bpf_prog_info of id 42
# btf info of id 2
#

4) Check which programs got recorded:

# perf report | grep bpf_prog | head
0.16% exe bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.14% exe bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.08% fuse-overlayfs bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.07% fuse-overlayfs bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.01% clang-4.0 bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.01% clang-4.0 bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% clang bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.00% runc bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% clang bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% sh bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
#

This was with the default --sort order for 'perf report', which is:

--sort comm,dso,symbol

If we just look for the symbol, for instance:

# perf report --sort symbol | grep bpf_prog | head
0.26% [k] bpf_prog_819967866022f1e1_sys_enter - -
0.24% [k] bpf_prog_c1bd85c092d6e4aa_sys_exit - -
#

or the DSO:

# perf report --sort dso | grep bpf_prog | head
0.26% bpf_prog_819967866022f1e1_sys_enter
0.24% bpf_prog_c1bd85c092d6e4aa_sys_exit
#

We'll see the two BPF programs that augmented_raw_syscalls.o puts in
place, one attached to the raw_syscalls:sys_enter and another to the
raw_syscalls:sys_exit tracepoints, as expected.

Now we can finally do, from the command line, annotation for one of
those two symbols, with the original BPF program source code intermixed
with the disassembled JITed code:

# perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter

Samples: 950 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 553756947, [percent: local period]
bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
Percent int sys_enter(struct syscall_enter_args *args)
53.41 push %rbp

0.63 mov %rsp,%rbp
0.31 sub $0x170,%rsp
1.93 sub $0x28,%rbp
7.02 mov %rbx,0x0(%rbp)
3.20 mov %r13,0x8(%rbp)
1.07 mov %r14,0x10(%rbp)
0.61 mov %r15,0x18(%rbp)
0.11 xor %eax,%eax
1.29 mov %rax,0x20(%rbp)
0.11 mov %rdi,%rbx
return bpf_get_current_pid_tgid();
2.02 → callq *ffffffffda6776d9
2.76 mov %eax,-0x148(%rbp)
mov %rbp,%rsi
int sys_enter(struct syscall_enter_args *args)
add $0xfffffffffffffeb8,%rsi
return bpf_map_lookup_elem(pids, &pid) != NULL;
movabs $0xffff975ac2607800,%rdi

1.26 → callq *ffffffffda6789e9
cmp $0x0,%rax
2.43 → je 0
add $0x38,%rax
0.21 xor %r13d,%r13d
if (pid_filter__has(&pids_filtered, getpid()))
0.81 cmp $0x0,%rax
→ jne 0
mov %rbp,%rdi
probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
2.22 add $0xfffffffffffffeb8,%rdi
0.11 mov $0x40,%esi
0.32 mov %rbx,%rdx
2.74 → callq *ffffffffda658409
syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
0.22 mov %rbp,%rsi
1.69 add $0xfffffffffffffec0,%rsi
syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
movabs $0xffff975bfcd36000,%rdi

add $0xd0,%rdi
0.21 mov 0x0(%rsi),%eax
0.93 cmp $0x200,%rax
→ jae 0
0.10 shl $0x3,%rax

0.11 add %rdi,%rax
0.11 → jmp 0
xor %eax,%eax
if (syscall == NULL || !syscall->enabled)
1.07 cmp $0x0,%rax
→ je 0
if (syscall == NULL || !syscall->enabled)
6.57 movzbq 0x0(%rax),%rdi

if (syscall == NULL || !syscall->enabled)
cmp $0x0,%rdi
0.95 → je 0
mov $0x40,%r8d
switch (augmented_args.args.syscall_nr) {
mov -0x140(%rbp),%rdi
switch (augmented_args.args.syscall_nr) {
cmp $0x2,%rdi
→ je 0
cmp $0x101,%rdi
→ je 0
cmp $0x15,%rdi
→ jne 0
case SYS_OPEN: filename_arg = (const void *)args->args[0];
mov 0x10(%rbx),%rdx
→ jmp 0
case SYS_OPENAT: filename_arg = (const void *)args->args[1];
mov 0x18(%rbx),%rdx
if (filename_arg != NULL) {
cmp $0x0,%rdx
→ je 0
xor %edi,%edi
augmented_args.filename.reserved = 0;
mov %edi,-0x104(%rbp)
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %rbp,%rdi
add $0xffffffffffffff00,%rdi
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov $0x100,%esi
→ callq *ffffffffda658499
mov $0x148,%r8d
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %eax,-0x108(%rbp)
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %rax,%rdi
shl $0x20,%rdi

shr $0x20,%rdi

if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
cmp $0xff,%rdi
→ ja 0
len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
add $0x48,%rax
len &= sizeof(augmented_args.filename.value) - 1;
and $0xff,%rax
mov %rax,%r8
mov %rbp,%rcx
return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len);
add $0xfffffffffffffeb8,%rcx
mov %rbx,%rdi
movabs $0xffff975fbd72d800,%rsi

mov $0xffffffff,%edx
→ callq *ffffffffda658ad9
mov %rax,%r13
}
mov %r13,%rax
0.72 mov 0x0(%rbp),%rbx
mov 0x8(%rbp),%r13
1.16 mov 0x10(%rbp),%r14
0.10 mov 0x18(%rbp),%r15
0.42 add $0x28,%rbp
0.54 leaveq
0.54 ← retq
#

Please see 'man perf-config' to see how to control what should be seen,
via ~/.perfconfig [annotate] section, for instance, one can suppress the
source code and see just the disassembly, etc.

Alternatively, use the TUI by just running 'perf annotate', press
'/bpf_prog' to see the bpf symbols, press enter and do the interactive
annotation, which allows for dumping to a file after selecting the
various output tunables, for instance, the above without source code
intermixed, plus showing all the instruction offsets:

# perf annotate bpf_prog_819967866022f1e1_sys_enter

Then press: 's' to hide the source code + 'O' twice to show all
instruction offsets, then 'P' to print to the
bpf_prog_819967866022f1e1_sys_enter.annotation file, which will have:

# cat bpf_prog_819967866022f1e1_sys_enter.annotation
bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
Event: cycles:ppp

53.41 0: push %rbp

0.63 1: mov %rsp,%rbp
0.31 4: sub $0x170,%rsp
1.93 b: sub $0x28,%rbp
7.02 f: mov %rbx,0x0(%rbp)
3.20 13: mov %r13,0x8(%rbp)
1.07 17: mov %r14,0x10(%rbp)
0.61 1b: mov %r15,0x18(%rbp)
0.11 1f: xor %eax,%eax
1.29 21: mov %rax,0x20(%rbp)
0.11 25: mov %rdi,%rbx
2.02 28: → callq *ffffffffda6776d9
2.76 2d: mov %eax,-0x148(%rbp)
33: mov %rbp,%rsi
36: add $0xfffffffffffffeb8,%rsi
3d: movabs $0xffff975ac2607800,%rdi

1.26 47: → callq *ffffffffda6789e9
4c: cmp $0x0,%rax
2.43 50: → je 0
52: add $0x38,%rax
0.21 56: xor %r13d,%r13d
0.81 59: cmp $0x0,%rax
5d: → jne 0
63: mov %rbp,%rdi
2.22 66: add $0xfffffffffffffeb8,%rdi
0.11 6d: mov $0x40,%esi
0.32 72: mov %rbx,%rdx
2.74 75: → callq *ffffffffda658409
0.22 7a: mov %rbp,%rsi
1.69 7d: add $0xfffffffffffffec0,%rsi
84: movabs $0xffff975bfcd36000,%rdi

8e: add $0xd0,%rdi
0.21 95: mov 0x0(%rsi),%eax
0.93 98: cmp $0x200,%rax
9f: → jae 0
0.10 a1: shl $0x3,%rax

0.11 a5: add %rdi,%rax
0.11 a8: → jmp 0
aa: xor %eax,%eax
1.07 ac: cmp $0x0,%rax
b0: → je 0
6.57 b6: movzbq 0x0(%rax),%rdi

bb: cmp $0x0,%rdi
0.95 bf: → je 0
c5: mov $0x40,%r8d
cb: mov -0x140(%rbp),%rdi
d2: cmp $0x2,%rdi
d6: → je 0
d8: cmp $0x101,%rdi
df: → je 0
e1: cmp $0x15,%rdi
e5: → jne 0
e7: mov 0x10(%rbx),%rdx
eb: → jmp 0
ed: mov 0x18(%rbx),%rdx
f1: cmp $0x0,%rdx
f5: → je 0
f7: xor %edi,%edi
f9: mov %edi,-0x104(%rbp)
ff: mov %rbp,%rdi
102: add $0xffffffffffffff00,%rdi
109: mov $0x100,%esi
10e: → callq *ffffffffda658499
113: mov $0x148,%r8d
119: mov %eax,-0x108(%rbp)
11f: mov %rax,%rdi
122: shl $0x20,%rdi

126: shr $0x20,%rdi

12a: cmp $0xff,%rdi
131: → ja 0
133: add $0x48,%rax
137: and $0xff,%rax
13d: mov %rax,%r8
140: mov %rbp,%rcx
143: add $0xfffffffffffffeb8,%rcx
14a: mov %rbx,%rdi
14d: movabs $0xffff975fbd72d800,%rsi

157: mov $0xffffffff,%edx
15c: → callq *ffffffffda658ad9
161: mov %rax,%r13
164: mov %r13,%rax
0.72 167: mov 0x0(%rbp),%rbx
16b: mov 0x8(%rbp),%r13
1.16 16f: mov 0x10(%rbp),%r14
0.10 173: mov 0x18(%rbp),%r15
0.42 177: add $0x28,%rbp
0.54 17b: leaveq
0.54 17c: ← retq

Another cool way to test all this is to simply use 'perf top', look for
those symbols, go there and press enter to annotate them live :-)

Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>