Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

tracing: Add 'hist' trigger Documentation

Add documentation and usage examples for 'hist' triggers.

Link: http://lkml.kernel.org/r/2e13f35f47fea6d647f0efefccfc9673ea84b29f.1457029949.git.tom.zanussi@linux.intel.com

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

authored by

Tom Zanussi and committed by
Steven Rostedt
0fc3813c d0bad49b

+1155
+1155
Documentation/trace/events.txt
··· 512 512 513 513 Note that there can be only one traceon or traceoff trigger per 514 514 triggering event. 515 + 516 + - hist 517 + 518 + This command aggregates event hits into a hash table keyed on one or 519 + more trace event format fields (or stacktrace) and a set of running 520 + totals derived from one or more trace event format fields and/or 521 + event counts (hitcount). 522 + 523 + The format of a hist trigger is as follows: 524 + 525 + hist:keys=<field1[,field2,...]>[:values=<field1[,field2,...]>] 526 + [:sort=<field1[,field2,...]>][:size=#entries][:pause][:continue] 527 + [:clear] [if <filter>] 528 + 529 + When a matching event is hit, an entry is added to a hash table 530 + using the key(s) and value(s) named. Keys and values correspond to 531 + fields in the event's format description. Values must correspond to 532 + numeric fields - on an event hit, the value(s) will be added to a 533 + sum kept for that field. The special string 'hitcount' can be used 534 + in place of an explicit value field - this is simply a count of 535 + event hits. If 'values' isn't specified, an implicit 'hitcount' 536 + value will be automatically created and used as the only value. 537 + Keys can be any field, or the special string 'stacktrace', which 538 + will use the event's kernel stacktrace as the key. The keywords 539 + 'keys' or 'key' can be used to specify keys, and the keywords 540 + 'values', 'vals', or 'val' can be used to specify values. Compound 541 + keys consisting of up to two fields can be specified by the 'keys' 542 + keyword. Hashing a compound key produces a unique entry in the 543 + table for each unique combination of component keys, and can be 544 + useful for providing more fine-grained summaries of event data. 545 + Additionally, sort keys consisting of up to two fields can be 546 + specified by the 'sort' keyword. If more than one field is 547 + specified, the result will be a 'sort within a sort': the first key 548 + is taken to be the primary sort key and the second the secondary 549 + key. 550 + 551 + 'hist' triggers add a 'hist' file to each event's subdirectory. 552 + Reading the 'hist' file for the event will dump the hash table in 553 + its entirety to stdout. Each printed hash table entry is a simple 554 + list of the keys and values comprising the entry; keys are printed 555 + first and are delineated by curly braces, and are followed by the 556 + set of value fields for the entry. By default, numeric fields are 557 + displayed as base-10 integers. This can be modified by appending 558 + any of the following modifiers to the field name: 559 + 560 + .hex display a number as a hex value 561 + .sym display an address as a symbol 562 + .sym-offset display an address as a symbol and offset 563 + .syscall display a syscall id as a system call name 564 + .execname display a common_pid as a program name 565 + 566 + Note that in general the semantics of a given field aren't 567 + interpreted when applying a modifier to it, but there are some 568 + restrictions to be aware of in this regard: 569 + 570 + - only the 'hex' modifier can be used for values (because values 571 + are essentially sums, and the other modifiers don't make sense 572 + in that context). 573 + - the 'execname' modifier can only be used on a 'common_pid'. The 574 + reason for this is that the execname is simply the 'comm' value 575 + saved for the 'current' process when an event was triggered, 576 + which is the same as the common_pid value saved by the event 577 + tracing code. Trying to apply that comm value to other pid 578 + values wouldn't be correct, and typically events that care save 579 + pid-specific comm fields in the event itself. 580 + 581 + A typical usage scenario would be the following to enable a hist 582 + trigger, read its current contents, and then turn it off: 583 + 584 + # echo 'hist:keys=skbaddr.hex:vals=len' > \ 585 + /sys/kernel/debug/tracing/events/net/netif_rx/trigger 586 + 587 + # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist 588 + 589 + # echo '!hist:keys=skbaddr.hex:vals=len' > \ 590 + /sys/kernel/debug/tracing/events/net/netif_rx/trigger 591 + 592 + The trigger file itself can be read to show the details of the 593 + currently attached hist trigger. This information is also displayed 594 + at the top of the 'hist' file when read. 595 + 596 + By default, the size of the hash table is 2048 entries. The 'size' 597 + parameter can be used to specify more or fewer than that. The units 598 + are in terms of hashtable entries - if a run uses more entries than 599 + specified, the results will show the number of 'drops', the number 600 + of hits that were ignored. The size should be a power of 2 between 601 + 128 and 131072 (any non- power-of-2 number specified will be rounded 602 + up). 603 + 604 + The 'sort' parameter can be used to specify a value field to sort 605 + on. The default if unspecified is 'hitcount' and the default sort 606 + order is 'ascending'. To sort in the opposite direction, append 607 + .descending' to the sort key. 608 + 609 + The 'pause' parameter can be used to pause an existing hist trigger 610 + or to start a hist trigger but not log any events until told to do 611 + so. 'continue' or 'cont' can be used to start or restart a paused 612 + hist trigger. 613 + 614 + The 'clear' parameter will clear the contents of a running hist 615 + trigger and leave its current paused/active state. 616 + 617 + Note that the 'pause', 'cont', and 'clear' parameters should be 618 + applied using 'append' shell operator ('>>') if applied to an 619 + existing trigger, rather than via the '>' operator, which will cause 620 + the trigger to be removed through truncation. 621 + 622 + - enable_hist/disable_hist 623 + 624 + The enable_hist and disable_hist triggers can be used to have one 625 + event conditionally start and stop another event's already-attached 626 + hist trigger. Any number of enable_hist and disable_hist triggers 627 + can be attached to a given event, allowing that event to kick off 628 + and stop aggregations on a host of other events. 629 + 630 + The format is very similar to the enable/disable_event triggers: 631 + 632 + enable_hist:<system>:<event>[:count] 633 + disable_hist:<system>:<event>[:count] 634 + 635 + Instead of enabling or disabling the tracing of the target event 636 + into the trace buffer as the enable/disable_event triggers do, the 637 + enable/disable_hist triggers enable or disable the aggregation of 638 + the target event into a hash table. 639 + 640 + A typical usage scenario for the enable_hist/disable_hist triggers 641 + would be to first set up a paused hist trigger on some event, 642 + followed by an enable_hist/disable_hist pair that turns the hist 643 + aggregation on and off when conditions of interest are hit: 644 + 645 + # echo 'hist:keys=skbaddr.hex:vals=len:pause' > \ 646 + /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger 647 + 648 + # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ 649 + /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger 650 + 651 + # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ 652 + /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger 653 + 654 + The above sets up an initially paused hist trigger which is unpaused 655 + and starts aggregating events when a given program is executed, and 656 + which stops aggregating when the process exits and the hist trigger 657 + is paused again. 658 + 659 + The examples below provide a more concrete illustration of the 660 + concepts and typical usage patterns discussed above. 661 + 662 + 663 + 6.2 'hist' trigger examples 664 + --------------------------- 665 + 666 + The first set of examples creates aggregations using the kmalloc 667 + event. The fields that can be used for the hist trigger are listed 668 + in the kmalloc event's format file: 669 + 670 + # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/format 671 + name: kmalloc 672 + ID: 374 673 + format: 674 + field:unsigned short common_type; offset:0; size:2; signed:0; 675 + field:unsigned char common_flags; offset:2; size:1; signed:0; 676 + field:unsigned char common_preempt_count; offset:3; size:1; signed:0; 677 + field:int common_pid; offset:4; size:4; signed:1; 678 + 679 + field:unsigned long call_site; offset:8; size:8; signed:0; 680 + field:const void * ptr; offset:16; size:8; signed:0; 681 + field:size_t bytes_req; offset:24; size:8; signed:0; 682 + field:size_t bytes_alloc; offset:32; size:8; signed:0; 683 + field:gfp_t gfp_flags; offset:40; size:4; signed:0; 684 + 685 + We'll start by creating a hist trigger that generates a simple table 686 + that lists the total number of bytes requested for each function in 687 + the kernel that made one or more calls to kmalloc: 688 + 689 + # echo 'hist:key=call_site:val=bytes_req' > \ 690 + /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger 691 + 692 + This tells the tracing system to create a 'hist' trigger using the 693 + call_site field of the kmalloc event as the key for the table, which 694 + just means that each unique call_site address will have an entry 695 + created for it in the table. The 'val=bytes_req' parameter tells 696 + the hist trigger that for each unique entry (call_site) in the 697 + table, it should keep a running total of the number of bytes 698 + requested by that call_site. 699 + 700 + We'll let it run for awhile and then dump the contents of the 'hist' 701 + file in the kmalloc event's subdirectory (for readability, a number 702 + of entries have been omitted): 703 + 704 + # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist 705 + # trigger info: hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] 706 + 707 + { call_site: 18446744072106379007 } hitcount: 1 bytes_req: 176 708 + { call_site: 18446744071579557049 } hitcount: 1 bytes_req: 1024 709 + { call_site: 18446744071580608289 } hitcount: 1 bytes_req: 16384 710 + { call_site: 18446744071581827654 } hitcount: 1 bytes_req: 24 711 + { call_site: 18446744071580700980 } hitcount: 1 bytes_req: 8 712 + { call_site: 18446744071579359876 } hitcount: 1 bytes_req: 152 713 + { call_site: 18446744071580795365 } hitcount: 3 bytes_req: 144 714 + { call_site: 18446744071581303129 } hitcount: 3 bytes_req: 144 715 + { call_site: 18446744071580713234 } hitcount: 4 bytes_req: 2560 716 + { call_site: 18446744071580933750 } hitcount: 4 bytes_req: 736 717 + . 718 + . 719 + . 720 + { call_site: 18446744072106047046 } hitcount: 69 bytes_req: 5576 721 + { call_site: 18446744071582116407 } hitcount: 73 bytes_req: 2336 722 + { call_site: 18446744072106054684 } hitcount: 136 bytes_req: 140504 723 + { call_site: 18446744072106224230 } hitcount: 136 bytes_req: 19584 724 + { call_site: 18446744072106078074 } hitcount: 153 bytes_req: 2448 725 + { call_site: 18446744072106062406 } hitcount: 153 bytes_req: 36720 726 + { call_site: 18446744071582507929 } hitcount: 153 bytes_req: 37088 727 + { call_site: 18446744072102520590 } hitcount: 273 bytes_req: 10920 728 + { call_site: 18446744071582143559 } hitcount: 358 bytes_req: 716 729 + { call_site: 18446744072106465852 } hitcount: 417 bytes_req: 56712 730 + { call_site: 18446744072102523378 } hitcount: 485 bytes_req: 27160 731 + { call_site: 18446744072099568646 } hitcount: 1676 bytes_req: 33520 732 + 733 + Totals: 734 + Hits: 4610 735 + Entries: 45 736 + Dropped: 0 737 + 738 + The output displays a line for each entry, beginning with the key 739 + specified in the trigger, followed by the value(s) also specified in 740 + the trigger. At the beginning of the output is a line that displays 741 + the trigger info, which can also be displayed by reading the 742 + 'trigger' file: 743 + 744 + # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger 745 + hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] 746 + 747 + At the end of the output are a few lines that display the overall 748 + totals for the run. The 'Hits' field shows the total number of 749 + times the event trigger was hit, the 'Entries' field shows the total 750 + number of used entries in the hash table, and the 'Dropped' field 751 + shows the number of hits that were dropped because the number of 752 + used entries for the run exceeded the maximum number of entries 753 + allowed for the table (normally 0, but if not a hint that you may 754 + want to increase the size of the table using the 'size' parameter). 755 + 756 + Notice in the above output that there's an extra field, 'hitcount', 757 + which wasn't specified in the trigger. Also notice that in the 758 + trigger info output, there's a parameter, 'sort=hitcount', which 759 + wasn't specified in the trigger either. The reason for that is that 760 + every trigger implicitly keeps a count of the total number of hits 761 + attributed to a given entry, called the 'hitcount'. That hitcount 762 + information is explicitly displayed in the output, and in the 763 + absence of a user-specified sort parameter, is used as the default 764 + sort field. 765 + 766 + The value 'hitcount' can be used in place of an explicit value in 767 + the 'values' parameter if you don't really need to have any 768 + particular field summed and are mainly interested in hit 769 + frequencies. 770 + 771 + To turn the hist trigger off, simply call up the trigger in the 772 + command history and re-execute it with a '!' prepended: 773 + 774 + # echo '!hist:key=call_site:val=bytes_req' > \ 775 + /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger 776 + 777 + Finally, notice that the call_site as displayed in the output above 778 + isn't really very useful. It's an address, but normally addresses 779 + are displayed in hex. To have a numeric field displayed as a hex 780 + value, simply append '.hex' to the field name in the trigger: 781 + 782 + # echo 'hist:key=call_site.hex:val=bytes_req' > \ 783 + /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger 784 + 785 + # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist 786 + # trigger info: hist:keys=call_site.hex:vals=bytes_req:sort=hitcount:size=2048 [active] 787 + 788 + { call_site: ffffffffa026b291 } hitcount: 1 bytes_req: 433 789 + { call_site: ffffffffa07186ff } hitcount: 1 bytes_req: 176 790 + { call_site: ffffffff811ae721 } hitcount: 1 bytes_req: 16384 791 + { call_site: ffffffff811c5134 } hitcount: 1 bytes_req: 8 792 + { call_site: ffffffffa04a9ebb } hitcount: 1 bytes_req: 511 793 + { call_site: ffffffff8122e0a6 } hitcount: 1 bytes_req: 12 794 + { call_site: ffffffff8107da84 } hitcount: 1 bytes_req: 152 795 + { call_site: ffffffff812d8246 } hitcount: 1 bytes_req: 24 796 + { call_site: ffffffff811dc1e5 } hitcount: 3 bytes_req: 144 797 + { call_site: ffffffffa02515e8 } hitcount: 3 bytes_req: 648 798 + { call_site: ffffffff81258159 } hitcount: 3 bytes_req: 144 799 + { call_site: ffffffff811c80f4 } hitcount: 4 bytes_req: 544 800 + . 801 + . 802 + . 803 + { call_site: ffffffffa06c7646 } hitcount: 106 bytes_req: 8024 804 + { call_site: ffffffffa06cb246 } hitcount: 132 bytes_req: 31680 805 + { call_site: ffffffffa06cef7a } hitcount: 132 bytes_req: 2112 806 + { call_site: ffffffff8137e399 } hitcount: 132 bytes_req: 23232 807 + { call_site: ffffffffa06c941c } hitcount: 185 bytes_req: 171360 808 + { call_site: ffffffffa06f2a66 } hitcount: 185 bytes_req: 26640 809 + { call_site: ffffffffa036a70e } hitcount: 265 bytes_req: 10600 810 + { call_site: ffffffff81325447 } hitcount: 292 bytes_req: 584 811 + { call_site: ffffffffa072da3c } hitcount: 446 bytes_req: 60656 812 + { call_site: ffffffffa036b1f2 } hitcount: 526 bytes_req: 29456 813 + { call_site: ffffffffa0099c06 } hitcount: 1780 bytes_req: 35600 814 + 815 + Totals: 816 + Hits: 4775 817 + Entries: 46 818 + Dropped: 0 819 + 820 + Even that's only marginally more useful - while hex values do look 821 + more like addresses, what users are typically more interested in 822 + when looking at text addresses are the corresponding symbols 823 + instead. To have an address displayed as symbolic value instead, 824 + simply append '.sym' or '.sym-offset' to the field name in the 825 + trigger: 826 + 827 + # echo 'hist:key=call_site.sym:val=bytes_req' > \ 828 + /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger 829 + 830 + # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist 831 + # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=hitcount:size=2048 [active] 832 + 833 + { call_site: [ffffffff810adcb9] syslog_print_all } hitcount: 1 bytes_req: 1024 834 + { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 835 + { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 836 + { call_site: [ffffffff8154acbe] usb_alloc_urb } hitcount: 1 bytes_req: 192 837 + { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 838 + { call_site: [ffffffff811e3a25] __seq_open_private } hitcount: 1 bytes_req: 40 839 + { call_site: [ffffffff8109524a] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 840 + { call_site: [ffffffff811febd5] fsnotify_alloc_group } hitcount: 2 bytes_req: 528 841 + { call_site: [ffffffff81440f58] __tty_buffer_request_room } hitcount: 2 bytes_req: 2624 842 + { call_site: [ffffffff81200ba6] inotify_new_group } hitcount: 2 bytes_req: 96 843 + { call_site: [ffffffffa05e19af] ieee80211_start_tx_ba_session [mac80211] } hitcount: 2 bytes_req: 464 844 + { call_site: [ffffffff81672406] tcp_get_metrics } hitcount: 2 bytes_req: 304 845 + { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 846 + { call_site: [ffffffff81089b05] sched_create_group } hitcount: 2 bytes_req: 1424 847 + . 848 + . 849 + . 850 + { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 1185 bytes_req: 123240 851 + { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl [drm] } hitcount: 1185 bytes_req: 104280 852 + { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 1402 bytes_req: 190672 853 + { call_site: [ffffffff812891ca] ext4_find_extent } hitcount: 1518 bytes_req: 146208 854 + { call_site: [ffffffffa029070e] drm_vma_node_allow [drm] } hitcount: 1746 bytes_req: 69840 855 + { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 2021 bytes_req: 792312 856 + { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 2592 bytes_req: 145152 857 + { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 2629 bytes_req: 378576 858 + { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2629 bytes_req: 3783248 859 + { call_site: [ffffffff81325607] apparmor_file_alloc_security } hitcount: 5192 bytes_req: 10384 860 + { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 5529 bytes_req: 110584 861 + { call_site: [ffffffff8131ebf7] aa_alloc_task_context } hitcount: 21943 bytes_req: 702176 862 + { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 55759 bytes_req: 5074265 863 + 864 + Totals: 865 + Hits: 109928 866 + Entries: 71 867 + Dropped: 0 868 + 869 + Because the default sort key above is 'hitcount', the above shows a 870 + the list of call_sites by increasing hitcount, so that at the bottom 871 + we see the functions that made the most kmalloc calls during the 872 + run. If instead we we wanted to see the top kmalloc callers in 873 + terms of the number of bytes requested rather than the number of 874 + calls, and we wanted the top caller to appear at the top, we can use 875 + the 'sort' parameter, along with the 'descending' modifier: 876 + 877 + # echo 'hist:key=call_site.sym:val=bytes_req:sort=bytes_req.descending' > \ 878 + /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger 879 + 880 + # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist 881 + # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] 882 + 883 + { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2186 bytes_req: 3397464 884 + { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 1790 bytes_req: 712176 885 + { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 8132 bytes_req: 513135 886 + { call_site: [ffffffff811e2a1b] seq_buf_alloc } hitcount: 106 bytes_req: 440128 887 + { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 2186 bytes_req: 314784 888 + { call_site: [ffffffff812891ca] ext4_find_extent } hitcount: 2174 bytes_req: 208992 889 + { call_site: [ffffffff811ae8e1] __kmalloc } hitcount: 8 bytes_req: 131072 890 + { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 859 bytes_req: 116824 891 + { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 1834 bytes_req: 102704 892 + { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 972 bytes_req: 101088 893 + { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl [drm] } hitcount: 972 bytes_req: 85536 894 + { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 3333 bytes_req: 66664 895 + { call_site: [ffffffff8137e559] sg_kmalloc } hitcount: 209 bytes_req: 61632 896 + . 897 + . 898 + . 899 + { call_site: [ffffffff81095225] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 900 + { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 901 + { call_site: [ffffffff812d8406] copy_semundo } hitcount: 2 bytes_req: 48 902 + { call_site: [ffffffff81200ba6] inotify_new_group } hitcount: 1 bytes_req: 48 903 + { call_site: [ffffffffa027121a] drm_getmagic [drm] } hitcount: 1 bytes_req: 48 904 + { call_site: [ffffffff811e3a25] __seq_open_private } hitcount: 1 bytes_req: 40 905 + { call_site: [ffffffff811c52f4] bprm_change_interp } hitcount: 2 bytes_req: 16 906 + { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 907 + { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 908 + { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 909 + 910 + Totals: 911 + Hits: 32133 912 + Entries: 81 913 + Dropped: 0 914 + 915 + To display the offset and size information in addition to the symbol 916 + name, just use 'sym-offset' instead: 917 + 918 + # echo 'hist:key=call_site.sym-offset:val=bytes_req:sort=bytes_req.descending' > \ 919 + /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger 920 + 921 + # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist 922 + # trigger info: hist:keys=call_site.sym-offset:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] 923 + 924 + { call_site: [ffffffffa046041c] i915_gem_execbuffer2+0x6c/0x2c0 [i915] } hitcount: 4569 bytes_req: 3163720 925 + { call_site: [ffffffffa0489a66] intel_ring_begin+0xc6/0x1f0 [i915] } hitcount: 4569 bytes_req: 657936 926 + { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23+0x694/0x1020 [i915] } hitcount: 1519 bytes_req: 472936 927 + { call_site: [ffffffffa045e646] i915_gem_do_execbuffer.isra.23+0x516/0x1020 [i915] } hitcount: 3050 bytes_req: 211832 928 + { call_site: [ffffffff811e2a1b] seq_buf_alloc+0x1b/0x50 } hitcount: 34 bytes_req: 148384 929 + { call_site: [ffffffffa04a580c] intel_crtc_page_flip+0xbc/0x870 [i915] } hitcount: 1385 bytes_req: 144040 930 + { call_site: [ffffffff811ae8e1] __kmalloc+0x191/0x1b0 } hitcount: 8 bytes_req: 131072 931 + { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl+0x282/0x360 [drm] } hitcount: 1385 bytes_req: 121880 932 + { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc+0x32/0x100 [drm] } hitcount: 1848 bytes_req: 103488 933 + { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state+0x2c/0xa0 [i915] } hitcount: 461 bytes_req: 62696 934 + { call_site: [ffffffffa029070e] drm_vma_node_allow+0x2e/0xd0 [drm] } hitcount: 1541 bytes_req: 61640 935 + { call_site: [ffffffff815f8d7b] sk_prot_alloc+0xcb/0x1b0 } hitcount: 57 bytes_req: 57456 936 + . 937 + . 938 + . 939 + { call_site: [ffffffff8109524a] alloc_fair_sched_group+0x5a/0x1a0 } hitcount: 2 bytes_req: 128 940 + { call_site: [ffffffffa027b921] drm_vm_open_locked+0x31/0xa0 [drm] } hitcount: 3 bytes_req: 96 941 + { call_site: [ffffffff8122e266] proc_self_follow_link+0x76/0xb0 } hitcount: 8 bytes_req: 96 942 + { call_site: [ffffffff81213e80] load_elf_binary+0x240/0x1650 } hitcount: 3 bytes_req: 84 943 + { call_site: [ffffffff8154bc62] usb_control_msg+0x42/0x110 } hitcount: 1 bytes_req: 8 944 + { call_site: [ffffffffa00bf6fe] hidraw_send_report+0x7e/0x1a0 [hid] } hitcount: 1 bytes_req: 7 945 + { call_site: [ffffffffa00bf1ca] hidraw_report_event+0x8a/0x120 [hid] } hitcount: 1 bytes_req: 7 946 + 947 + Totals: 948 + Hits: 26098 949 + Entries: 64 950 + Dropped: 0 951 + 952 + We can also add multiple fields to the 'values' parameter. For 953 + example, we might want to see the total number of bytes allocated 954 + alongside bytes requested, and display the result sorted by bytes 955 + allocated in a descending order: 956 + 957 + # echo 'hist:keys=call_site.sym:values=bytes_req,bytes_alloc:sort=bytes_alloc.descending' > \ 958 + /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger 959 + 960 + # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist 961 + # trigger info: hist:keys=call_site.sym:vals=bytes_req,bytes_alloc:sort=bytes_alloc.descending:size=2048 [active] 962 + 963 + { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 7403 bytes_req: 4084360 bytes_alloc: 5958016 964 + { call_site: [ffffffff811e2a1b] seq_buf_alloc } hitcount: 541 bytes_req: 2213968 bytes_alloc: 2228224 965 + { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 7404 bytes_req: 1066176 bytes_alloc: 1421568 966 + { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 1565 bytes_req: 557368 bytes_alloc: 1037760 967 + { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 9557 bytes_req: 595778 bytes_alloc: 695744 968 + { call_site: [ffffffffa045e646] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 5839 bytes_req: 430680 bytes_alloc: 470400 969 + { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 2388 bytes_req: 324768 bytes_alloc: 458496 970 + { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 3911 bytes_req: 219016 bytes_alloc: 250304 971 + { call_site: [ffffffff815f8d7b] sk_prot_alloc } hitcount: 235 bytes_req: 236880 bytes_alloc: 240640 972 + { call_site: [ffffffff8137e559] sg_kmalloc } hitcount: 557 bytes_req: 169024 bytes_alloc: 221760 973 + { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 9378 bytes_req: 187548 bytes_alloc: 206312 974 + { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 1519 bytes_req: 157976 bytes_alloc: 194432 975 + . 976 + . 977 + . 978 + { call_site: [ffffffff8109bd3b] sched_autogroup_create_attach } hitcount: 2 bytes_req: 144 bytes_alloc: 192 979 + { call_site: [ffffffff81097ee8] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 980 + { call_site: [ffffffff8109524a] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 981 + { call_site: [ffffffff81095225] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 982 + { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 983 + { call_site: [ffffffff81213e80] load_elf_binary } hitcount: 3 bytes_req: 84 bytes_alloc: 96 984 + { call_site: [ffffffff81079a2e] kthread_create_on_node } hitcount: 1 bytes_req: 56 bytes_alloc: 64 985 + { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 bytes_alloc: 8 986 + { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 bytes_alloc: 8 987 + { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 bytes_alloc: 8 988 + 989 + Totals: 990 + Hits: 66598 991 + Entries: 65 992 + Dropped: 0 993 + 994 + Finally, to finish off our kmalloc example, instead of simply having 995 + the hist trigger display symbolic call_sites, we can have the hist 996 + trigger additionally display the complete set of kernel stack traces 997 + that led to each call_site. To do that, we simply use the special 998 + value 'stacktrace' for the key parameter: 999 + 1000 + # echo 'hist:keys=stacktrace:values=bytes_req,bytes_alloc:sort=bytes_alloc' > \ 1001 + /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger 1002 + 1003 + The above trigger will use the kernel stack trace in effect when an 1004 + event is triggered as the key for the hash table. This allows the 1005 + enumeration of every kernel callpath that led up to a particular 1006 + event, along with a running total of any of the event fields for 1007 + that event. Here we tally bytes requested and bytes allocated for 1008 + every callpath in the system that led up to a kmalloc (in this case 1009 + every callpath to a kmalloc for a kernel compile): 1010 + 1011 + # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist 1012 + # trigger info: hist:keys=stacktrace:vals=bytes_req,bytes_alloc:sort=bytes_alloc:size=2048 [active] 1013 + 1014 + { stacktrace: 1015 + __kmalloc_track_caller+0x10b/0x1a0 1016 + kmemdup+0x20/0x50 1017 + hidraw_report_event+0x8a/0x120 [hid] 1018 + hid_report_raw_event+0x3ea/0x440 [hid] 1019 + hid_input_report+0x112/0x190 [hid] 1020 + hid_irq_in+0xc2/0x260 [usbhid] 1021 + __usb_hcd_giveback_urb+0x72/0x120 1022 + usb_giveback_urb_bh+0x9e/0xe0 1023 + tasklet_hi_action+0xf8/0x100 1024 + __do_softirq+0x114/0x2c0 1025 + irq_exit+0xa5/0xb0 1026 + do_IRQ+0x5a/0xf0 1027 + ret_from_intr+0x0/0x30 1028 + cpuidle_enter+0x17/0x20 1029 + cpu_startup_entry+0x315/0x3e0 1030 + rest_init+0x7c/0x80 1031 + } hitcount: 3 bytes_req: 21 bytes_alloc: 24 1032 + { stacktrace: 1033 + __kmalloc_track_caller+0x10b/0x1a0 1034 + kmemdup+0x20/0x50 1035 + hidraw_report_event+0x8a/0x120 [hid] 1036 + hid_report_raw_event+0x3ea/0x440 [hid] 1037 + hid_input_report+0x112/0x190 [hid] 1038 + hid_irq_in+0xc2/0x260 [usbhid] 1039 + __usb_hcd_giveback_urb+0x72/0x120 1040 + usb_giveback_urb_bh+0x9e/0xe0 1041 + tasklet_hi_action+0xf8/0x100 1042 + __do_softirq+0x114/0x2c0 1043 + irq_exit+0xa5/0xb0 1044 + do_IRQ+0x5a/0xf0 1045 + ret_from_intr+0x0/0x30 1046 + } hitcount: 3 bytes_req: 21 bytes_alloc: 24 1047 + { stacktrace: 1048 + kmem_cache_alloc_trace+0xeb/0x150 1049 + aa_alloc_task_context+0x27/0x40 1050 + apparmor_cred_prepare+0x1f/0x50 1051 + security_prepare_creds+0x16/0x20 1052 + prepare_creds+0xdf/0x1a0 1053 + SyS_capset+0xb5/0x200 1054 + system_call_fastpath+0x12/0x6a 1055 + } hitcount: 1 bytes_req: 32 bytes_alloc: 32 1056 + . 1057 + . 1058 + . 1059 + { stacktrace: 1060 + __kmalloc+0x11b/0x1b0 1061 + i915_gem_execbuffer2+0x6c/0x2c0 [i915] 1062 + drm_ioctl+0x349/0x670 [drm] 1063 + do_vfs_ioctl+0x2f0/0x4f0 1064 + SyS_ioctl+0x81/0xa0 1065 + system_call_fastpath+0x12/0x6a 1066 + } hitcount: 17726 bytes_req: 13944120 bytes_alloc: 19593808 1067 + { stacktrace: 1068 + __kmalloc+0x11b/0x1b0 1069 + load_elf_phdrs+0x76/0xa0 1070 + load_elf_binary+0x102/0x1650 1071 + search_binary_handler+0x97/0x1d0 1072 + do_execveat_common.isra.34+0x551/0x6e0 1073 + SyS_execve+0x3a/0x50 1074 + return_from_execve+0x0/0x23 1075 + } hitcount: 33348 bytes_req: 17152128 bytes_alloc: 20226048 1076 + { stacktrace: 1077 + kmem_cache_alloc_trace+0xeb/0x150 1078 + apparmor_file_alloc_security+0x27/0x40 1079 + security_file_alloc+0x16/0x20 1080 + get_empty_filp+0x93/0x1c0 1081 + path_openat+0x31/0x5f0 1082 + do_filp_open+0x3a/0x90 1083 + do_sys_open+0x128/0x220 1084 + SyS_open+0x1e/0x20 1085 + system_call_fastpath+0x12/0x6a 1086 + } hitcount: 4766422 bytes_req: 9532844 bytes_alloc: 38131376 1087 + { stacktrace: 1088 + __kmalloc+0x11b/0x1b0 1089 + seq_buf_alloc+0x1b/0x50 1090 + seq_read+0x2cc/0x370 1091 + proc_reg_read+0x3d/0x80 1092 + __vfs_read+0x28/0xe0 1093 + vfs_read+0x86/0x140 1094 + SyS_read+0x46/0xb0 1095 + system_call_fastpath+0x12/0x6a 1096 + } hitcount: 19133 bytes_req: 78368768 bytes_alloc: 78368768 1097 + 1098 + Totals: 1099 + Hits: 6085872 1100 + Entries: 253 1101 + Dropped: 0 1102 + 1103 + If you key a hist trigger on common_pid, in order for example to 1104 + gather and display sorted totals for each process, you can use the 1105 + special .execname modifier to display the executable names for the 1106 + processes in the table rather than raw pids. The example below 1107 + keeps a per-process sum of total bytes read: 1108 + 1109 + # echo 'hist:key=common_pid.execname:val=count:sort=count.descending' > \ 1110 + /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger 1111 + 1112 + # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/hist 1113 + # trigger info: hist:keys=common_pid.execname:vals=count:sort=count.descending:size=2048 [active] 1114 + 1115 + { common_pid: gnome-terminal [ 3196] } hitcount: 280 count: 1093512 1116 + { common_pid: Xorg [ 1309] } hitcount: 525 count: 256640 1117 + { common_pid: compiz [ 2889] } hitcount: 59 count: 254400 1118 + { common_pid: bash [ 8710] } hitcount: 3 count: 66369 1119 + { common_pid: dbus-daemon-lau [ 8703] } hitcount: 49 count: 47739 1120 + { common_pid: irqbalance [ 1252] } hitcount: 27 count: 27648 1121 + { common_pid: 01ifupdown [ 8705] } hitcount: 3 count: 17216 1122 + { common_pid: dbus-daemon [ 772] } hitcount: 10 count: 12396 1123 + { common_pid: Socket Thread [ 8342] } hitcount: 11 count: 11264 1124 + { common_pid: nm-dhcp-client. [ 8701] } hitcount: 6 count: 7424 1125 + { common_pid: gmain [ 1315] } hitcount: 18 count: 6336 1126 + . 1127 + . 1128 + . 1129 + { common_pid: postgres [ 1892] } hitcount: 2 count: 32 1130 + { common_pid: postgres [ 1891] } hitcount: 2 count: 32 1131 + { common_pid: gmain [ 8704] } hitcount: 2 count: 32 1132 + { common_pid: upstart-dbus-br [ 2740] } hitcount: 21 count: 21 1133 + { common_pid: nm-dispatcher.a [ 8696] } hitcount: 1 count: 16 1134 + { common_pid: indicator-datet [ 2904] } hitcount: 1 count: 16 1135 + { common_pid: gdbus [ 2998] } hitcount: 1 count: 16 1136 + { common_pid: rtkit-daemon [ 2052] } hitcount: 1 count: 8 1137 + { common_pid: init [ 1] } hitcount: 2 count: 2 1138 + 1139 + Totals: 1140 + Hits: 2116 1141 + Entries: 51 1142 + Dropped: 0 1143 + 1144 + Similarly, if you key a hist trigger on syscall id, for example to 1145 + gather and display a list of systemwide syscall hits, you can use 1146 + the special .syscall modifier to display the syscall names rather 1147 + than raw ids. The example below keeps a running total of syscall 1148 + counts for the system during the run: 1149 + 1150 + # echo 'hist:key=id.syscall:val=hitcount' > \ 1151 + /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger 1152 + 1153 + # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist 1154 + # trigger info: hist:keys=id.syscall:vals=hitcount:sort=hitcount:size=2048 [active] 1155 + 1156 + { id: sys_fsync [ 74] } hitcount: 1 1157 + { id: sys_newuname [ 63] } hitcount: 1 1158 + { id: sys_prctl [157] } hitcount: 1 1159 + { id: sys_statfs [137] } hitcount: 1 1160 + { id: sys_symlink [ 88] } hitcount: 1 1161 + { id: sys_sendmmsg [307] } hitcount: 1 1162 + { id: sys_semctl [ 66] } hitcount: 1 1163 + { id: sys_readlink [ 89] } hitcount: 3 1164 + { id: sys_bind [ 49] } hitcount: 3 1165 + { id: sys_getsockname [ 51] } hitcount: 3 1166 + { id: sys_unlink [ 87] } hitcount: 3 1167 + { id: sys_rename [ 82] } hitcount: 4 1168 + { id: unknown_syscall [ 58] } hitcount: 4 1169 + { id: sys_connect [ 42] } hitcount: 4 1170 + { id: sys_getpid [ 39] } hitcount: 4 1171 + . 1172 + . 1173 + . 1174 + { id: sys_rt_sigprocmask [ 14] } hitcount: 952 1175 + { id: sys_futex [202] } hitcount: 1534 1176 + { id: sys_write [ 1] } hitcount: 2689 1177 + { id: sys_setitimer [ 38] } hitcount: 2797 1178 + { id: sys_read [ 0] } hitcount: 3202 1179 + { id: sys_select [ 23] } hitcount: 3773 1180 + { id: sys_writev [ 20] } hitcount: 4531 1181 + { id: sys_poll [ 7] } hitcount: 8314 1182 + { id: sys_recvmsg [ 47] } hitcount: 13738 1183 + { id: sys_ioctl [ 16] } hitcount: 21843 1184 + 1185 + Totals: 1186 + Hits: 67612 1187 + Entries: 72 1188 + Dropped: 0 1189 + 1190 + The syscall counts above provide a rough overall picture of system 1191 + call activity on the system; we can see for example that the most 1192 + popular system call on this system was the 'sys_ioctl' system call. 1193 + 1194 + We can use 'compound' keys to refine that number and provide some 1195 + further insight as to which processes exactly contribute to the 1196 + overall ioctl count. 1197 + 1198 + The command below keeps a hitcount for every unique combination of 1199 + system call id and pid - the end result is essentially a table 1200 + that keeps a per-pid sum of system call hits. The results are 1201 + sorted using the system call id as the primary key, and the 1202 + hitcount sum as the secondary key: 1203 + 1204 + # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount' > \ 1205 + /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger 1206 + 1207 + # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist 1208 + # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 [active] 1209 + 1210 + { id: sys_read [ 0], common_pid: rtkit-daemon [ 1877] } hitcount: 1 1211 + { id: sys_read [ 0], common_pid: gdbus [ 2976] } hitcount: 1 1212 + { id: sys_read [ 0], common_pid: console-kit-dae [ 3400] } hitcount: 1 1213 + { id: sys_read [ 0], common_pid: postgres [ 1865] } hitcount: 1 1214 + { id: sys_read [ 0], common_pid: deja-dup-monito [ 3543] } hitcount: 2 1215 + { id: sys_read [ 0], common_pid: NetworkManager [ 890] } hitcount: 2 1216 + { id: sys_read [ 0], common_pid: evolution-calen [ 3048] } hitcount: 2 1217 + { id: sys_read [ 0], common_pid: postgres [ 1864] } hitcount: 2 1218 + { id: sys_read [ 0], common_pid: nm-applet [ 3022] } hitcount: 2 1219 + { id: sys_read [ 0], common_pid: whoopsie [ 1212] } hitcount: 2 1220 + . 1221 + . 1222 + . 1223 + { id: sys_ioctl [ 16], common_pid: bash [ 8479] } hitcount: 1 1224 + { id: sys_ioctl [ 16], common_pid: bash [ 3472] } hitcount: 12 1225 + { id: sys_ioctl [ 16], common_pid: gnome-terminal [ 3199] } hitcount: 16 1226 + { id: sys_ioctl [ 16], common_pid: Xorg [ 1267] } hitcount: 1808 1227 + { id: sys_ioctl [ 16], common_pid: compiz [ 2994] } hitcount: 5580 1228 + . 1229 + . 1230 + . 1231 + { id: sys_waitid [247], common_pid: upstart-dbus-br [ 2690] } hitcount: 3 1232 + { id: sys_waitid [247], common_pid: upstart-dbus-br [ 2688] } hitcount: 16 1233 + { id: sys_inotify_add_watch [254], common_pid: gmain [ 975] } hitcount: 2 1234 + { id: sys_inotify_add_watch [254], common_pid: gmain [ 3204] } hitcount: 4 1235 + { id: sys_inotify_add_watch [254], common_pid: gmain [ 2888] } hitcount: 4 1236 + { id: sys_inotify_add_watch [254], common_pid: gmain [ 3003] } hitcount: 4 1237 + { id: sys_inotify_add_watch [254], common_pid: gmain [ 2873] } hitcount: 4 1238 + { id: sys_inotify_add_watch [254], common_pid: gmain [ 3196] } hitcount: 6 1239 + { id: sys_openat [257], common_pid: java [ 2623] } hitcount: 2 1240 + { id: sys_eventfd2 [290], common_pid: ibus-ui-gtk3 [ 2760] } hitcount: 4 1241 + { id: sys_eventfd2 [290], common_pid: compiz [ 2994] } hitcount: 6 1242 + 1243 + Totals: 1244 + Hits: 31536 1245 + Entries: 323 1246 + Dropped: 0 1247 + 1248 + The above list does give us a breakdown of the ioctl syscall by 1249 + pid, but it also gives us quite a bit more than that, which we 1250 + don't really care about at the moment. Since we know the syscall 1251 + id for sys_ioctl (16, displayed next to the sys_ioctl name), we 1252 + can use that to filter out all the other syscalls: 1253 + 1254 + # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount if id == 16' > \ 1255 + /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger 1256 + 1257 + # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist 1258 + # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 if id == 16 [active] 1259 + 1260 + { id: sys_ioctl [ 16], common_pid: gmain [ 2769] } hitcount: 1 1261 + { id: sys_ioctl [ 16], common_pid: evolution-addre [ 8571] } hitcount: 1 1262 + { id: sys_ioctl [ 16], common_pid: gmain [ 3003] } hitcount: 1 1263 + { id: sys_ioctl [ 16], common_pid: gmain [ 2781] } hitcount: 1 1264 + { id: sys_ioctl [ 16], common_pid: gmain [ 2829] } hitcount: 1 1265 + { id: sys_ioctl [ 16], common_pid: bash [ 8726] } hitcount: 1 1266 + { id: sys_ioctl [ 16], common_pid: bash [ 8508] } hitcount: 1 1267 + { id: sys_ioctl [ 16], common_pid: gmain [ 2970] } hitcount: 1 1268 + { id: sys_ioctl [ 16], common_pid: gmain [ 2768] } hitcount: 1 1269 + . 1270 + . 1271 + . 1272 + { id: sys_ioctl [ 16], common_pid: pool [ 8559] } hitcount: 45 1273 + { id: sys_ioctl [ 16], common_pid: pool [ 8555] } hitcount: 48 1274 + { id: sys_ioctl [ 16], common_pid: pool [ 8551] } hitcount: 48 1275 + { id: sys_ioctl [ 16], common_pid: avahi-daemon [ 896] } hitcount: 66 1276 + { id: sys_ioctl [ 16], common_pid: Xorg [ 1267] } hitcount: 26674 1277 + { id: sys_ioctl [ 16], common_pid: compiz [ 2994] } hitcount: 73443 1278 + 1279 + Totals: 1280 + Hits: 101162 1281 + Entries: 103 1282 + Dropped: 0 1283 + 1284 + The above output shows that 'compiz' and 'Xorg' are far and away 1285 + the heaviest ioctl callers (which might lead to questions about 1286 + whether they really need to be making all those calls and to 1287 + possible avenues for further investigation.) 1288 + 1289 + The compound key examples used a key and a sum value (hitcount) to 1290 + sort the output, but we can just as easily use two keys instead. 1291 + Here's an example where we use a compound key composed of the the 1292 + common_pid and size event fields. Sorting with pid as the primary 1293 + key and 'size' as the secondary key allows us to display an 1294 + ordered summary of the recvfrom sizes, with counts, received by 1295 + each process: 1296 + 1297 + # echo 'hist:key=common_pid.execname,size:val=hitcount:sort=common_pid,size' > \ 1298 + /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/trigger 1299 + 1300 + # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/hist 1301 + # trigger info: hist:keys=common_pid.execname,size:vals=hitcount:sort=common_pid.execname,size:size=2048 [active] 1302 + 1303 + { common_pid: smbd [ 784], size: 4 } hitcount: 1 1304 + { common_pid: dnsmasq [ 1412], size: 4096 } hitcount: 672 1305 + { common_pid: postgres [ 1796], size: 1000 } hitcount: 6 1306 + { common_pid: postgres [ 1867], size: 1000 } hitcount: 10 1307 + { common_pid: bamfdaemon [ 2787], size: 28 } hitcount: 2 1308 + { common_pid: bamfdaemon [ 2787], size: 14360 } hitcount: 1 1309 + { common_pid: compiz [ 2994], size: 8 } hitcount: 1 1310 + { common_pid: compiz [ 2994], size: 20 } hitcount: 11 1311 + { common_pid: gnome-terminal [ 3199], size: 4 } hitcount: 2 1312 + { common_pid: firefox [ 8817], size: 4 } hitcount: 1 1313 + { common_pid: firefox [ 8817], size: 8 } hitcount: 5 1314 + { common_pid: firefox [ 8817], size: 588 } hitcount: 2 1315 + { common_pid: firefox [ 8817], size: 628 } hitcount: 1 1316 + { common_pid: firefox [ 8817], size: 6944 } hitcount: 1 1317 + { common_pid: firefox [ 8817], size: 408880 } hitcount: 2 1318 + { common_pid: firefox [ 8822], size: 8 } hitcount: 2 1319 + { common_pid: firefox [ 8822], size: 160 } hitcount: 2 1320 + { common_pid: firefox [ 8822], size: 320 } hitcount: 2 1321 + { common_pid: firefox [ 8822], size: 352 } hitcount: 1 1322 + . 1323 + . 1324 + . 1325 + { common_pid: pool [ 8923], size: 1960 } hitcount: 10 1326 + { common_pid: pool [ 8923], size: 2048 } hitcount: 10 1327 + { common_pid: pool [ 8924], size: 1960 } hitcount: 10 1328 + { common_pid: pool [ 8924], size: 2048 } hitcount: 10 1329 + { common_pid: pool [ 8928], size: 1964 } hitcount: 4 1330 + { common_pid: pool [ 8928], size: 1965 } hitcount: 2 1331 + { common_pid: pool [ 8928], size: 2048 } hitcount: 6 1332 + { common_pid: pool [ 8929], size: 1982 } hitcount: 1 1333 + { common_pid: pool [ 8929], size: 2048 } hitcount: 1 1334 + 1335 + Totals: 1336 + Hits: 2016 1337 + Entries: 224 1338 + Dropped: 0 1339 + 1340 + The above example also illustrates the fact that although a compound 1341 + key is treated as a single entity for hashing purposes, the sub-keys 1342 + it's composed of can be accessed independently. 1343 + 1344 + The next example uses a string field as the hash key and 1345 + demonstrates how you can manually pause and continue a hist trigger. 1346 + In this example, we'll aggregate fork counts and don't expect a 1347 + large number of entries in the hash table, so we'll drop it to a 1348 + much smaller number, say 256: 1349 + 1350 + # echo 'hist:key=child_comm:val=hitcount:size=256' > \ 1351 + /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger 1352 + 1353 + # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist 1354 + # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] 1355 + 1356 + { child_comm: dconf worker } hitcount: 1 1357 + { child_comm: ibus-daemon } hitcount: 1 1358 + { child_comm: whoopsie } hitcount: 1 1359 + { child_comm: smbd } hitcount: 1 1360 + { child_comm: gdbus } hitcount: 1 1361 + { child_comm: kthreadd } hitcount: 1 1362 + { child_comm: dconf worker } hitcount: 1 1363 + { child_comm: evolution-alarm } hitcount: 2 1364 + { child_comm: Socket Thread } hitcount: 2 1365 + { child_comm: postgres } hitcount: 2 1366 + { child_comm: bash } hitcount: 3 1367 + { child_comm: compiz } hitcount: 3 1368 + { child_comm: evolution-sourc } hitcount: 4 1369 + { child_comm: dhclient } hitcount: 4 1370 + { child_comm: pool } hitcount: 5 1371 + { child_comm: nm-dispatcher.a } hitcount: 8 1372 + { child_comm: firefox } hitcount: 8 1373 + { child_comm: dbus-daemon } hitcount: 8 1374 + { child_comm: glib-pacrunner } hitcount: 10 1375 + { child_comm: evolution } hitcount: 23 1376 + 1377 + Totals: 1378 + Hits: 89 1379 + Entries: 20 1380 + Dropped: 0 1381 + 1382 + If we want to pause the hist trigger, we can simply append :pause to 1383 + the command that started the trigger. Notice that the trigger info 1384 + displays as [paused]: 1385 + 1386 + # echo 'hist:key=child_comm:val=hitcount:size=256:pause' >> \ 1387 + /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger 1388 + 1389 + # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist 1390 + # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [paused] 1391 + 1392 + { child_comm: dconf worker } hitcount: 1 1393 + { child_comm: kthreadd } hitcount: 1 1394 + { child_comm: dconf worker } hitcount: 1 1395 + { child_comm: gdbus } hitcount: 1 1396 + { child_comm: ibus-daemon } hitcount: 1 1397 + { child_comm: Socket Thread } hitcount: 2 1398 + { child_comm: evolution-alarm } hitcount: 2 1399 + { child_comm: smbd } hitcount: 2 1400 + { child_comm: bash } hitcount: 3 1401 + { child_comm: whoopsie } hitcount: 3 1402 + { child_comm: compiz } hitcount: 3 1403 + { child_comm: evolution-sourc } hitcount: 4 1404 + { child_comm: pool } hitcount: 5 1405 + { child_comm: postgres } hitcount: 6 1406 + { child_comm: firefox } hitcount: 8 1407 + { child_comm: dhclient } hitcount: 10 1408 + { child_comm: emacs } hitcount: 12 1409 + { child_comm: dbus-daemon } hitcount: 20 1410 + { child_comm: nm-dispatcher.a } hitcount: 20 1411 + { child_comm: evolution } hitcount: 35 1412 + { child_comm: glib-pacrunner } hitcount: 59 1413 + 1414 + Totals: 1415 + Hits: 199 1416 + Entries: 21 1417 + Dropped: 0 1418 + 1419 + To manually continue having the trigger aggregate events, append 1420 + :cont instead. Notice that the trigger info displays as [active] 1421 + again, and the data has changed: 1422 + 1423 + # echo 'hist:key=child_comm:val=hitcount:size=256:cont' >> \ 1424 + /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger 1425 + 1426 + # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist 1427 + # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] 1428 + 1429 + { child_comm: dconf worker } hitcount: 1 1430 + { child_comm: dconf worker } hitcount: 1 1431 + { child_comm: kthreadd } hitcount: 1 1432 + { child_comm: gdbus } hitcount: 1 1433 + { child_comm: ibus-daemon } hitcount: 1 1434 + { child_comm: Socket Thread } hitcount: 2 1435 + { child_comm: evolution-alarm } hitcount: 2 1436 + { child_comm: smbd } hitcount: 2 1437 + { child_comm: whoopsie } hitcount: 3 1438 + { child_comm: compiz } hitcount: 3 1439 + { child_comm: evolution-sourc } hitcount: 4 1440 + { child_comm: bash } hitcount: 5 1441 + { child_comm: pool } hitcount: 5 1442 + { child_comm: postgres } hitcount: 6 1443 + { child_comm: firefox } hitcount: 8 1444 + { child_comm: dhclient } hitcount: 11 1445 + { child_comm: emacs } hitcount: 12 1446 + { child_comm: dbus-daemon } hitcount: 22 1447 + { child_comm: nm-dispatcher.a } hitcount: 22 1448 + { child_comm: evolution } hitcount: 35 1449 + { child_comm: glib-pacrunner } hitcount: 59 1450 + 1451 + Totals: 1452 + Hits: 206 1453 + Entries: 21 1454 + Dropped: 0 1455 + 1456 + The previous example showed how to start and stop a hist trigger by 1457 + appending 'pause' and 'continue' to the hist trigger command. A 1458 + hist trigger can also be started in a paused state by initially 1459 + starting the trigger with ':pause' appended. This allows you to 1460 + start the trigger only when you're ready to start collecting data 1461 + and not before. For example, you could start the trigger in a 1462 + paused state, then unpause it and do something you want to measure, 1463 + then pause the trigger again when done. 1464 + 1465 + Of course, doing this manually can be difficult and error-prone, but 1466 + it is possible to automatically start and stop a hist trigger based 1467 + on some condition, via the enable_hist and disable_hist triggers. 1468 + 1469 + For example, suppose we wanted to take a look at the relative 1470 + weights in terms of skb length for each callpath that leads to a 1471 + netif_receieve_skb event when downloading a decent-sized file using 1472 + wget. 1473 + 1474 + First we set up an initially paused stacktrace trigger on the 1475 + netif_receive_skb event: 1476 + 1477 + # echo 'hist:key=stacktrace:vals=len:pause' > \ 1478 + /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger 1479 + 1480 + Next, we set up an 'enable_hist' trigger on the sched_process_exec 1481 + event, with an 'if filename==/usr/bin/wget' filter. The effect of 1482 + this new trigger is that it will 'unpause' the hist trigger we just 1483 + set up on netif_receive_skb if and only if it sees a 1484 + sched_process_exec event with a filename of '/usr/bin/wget'. When 1485 + that happens, all netif_receive_skb events are aggregated into a 1486 + hash table keyed on stacktrace: 1487 + 1488 + # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ 1489 + /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger 1490 + 1491 + The aggregation continues until the netif_receive_skb is paused 1492 + again, which is what the following disable_hist event does by 1493 + creating a similar setup on the sched_process_exit event, using the 1494 + filter 'comm==wget': 1495 + 1496 + # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ 1497 + /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger 1498 + 1499 + Whenever a process exits and the comm field of the disable_hist 1500 + trigger filter matches 'comm==wget', the netif_receive_skb hist 1501 + trigger is disabled. 1502 + 1503 + The overall effect is that netif_receive_skb events are aggregated 1504 + into the hash table for only the duration of the wget. Executing a 1505 + wget command and then listing the 'hist' file will display the 1506 + output generated by the wget command: 1507 + 1508 + $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz 1509 + 1510 + # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist 1511 + # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] 1512 + 1513 + { stacktrace: 1514 + __netif_receive_skb_core+0x46d/0x990 1515 + __netif_receive_skb+0x18/0x60 1516 + netif_receive_skb_internal+0x23/0x90 1517 + napi_gro_receive+0xc8/0x100 1518 + ieee80211_deliver_skb+0xd6/0x270 [mac80211] 1519 + ieee80211_rx_handlers+0xccf/0x22f0 [mac80211] 1520 + ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211] 1521 + ieee80211_rx+0x31d/0x900 [mac80211] 1522 + iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm] 1523 + iwl_rx_dispatch+0x8e/0xf0 [iwldvm] 1524 + iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi] 1525 + irq_thread_fn+0x20/0x50 1526 + irq_thread+0x11f/0x150 1527 + kthread+0xd2/0xf0 1528 + ret_from_fork+0x42/0x70 1529 + } hitcount: 85 len: 28884 1530 + { stacktrace: 1531 + __netif_receive_skb_core+0x46d/0x990 1532 + __netif_receive_skb+0x18/0x60 1533 + netif_receive_skb_internal+0x23/0x90 1534 + napi_gro_complete+0xa4/0xe0 1535 + dev_gro_receive+0x23a/0x360 1536 + napi_gro_receive+0x30/0x100 1537 + ieee80211_deliver_skb+0xd6/0x270 [mac80211] 1538 + ieee80211_rx_handlers+0xccf/0x22f0 [mac80211] 1539 + ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211] 1540 + ieee80211_rx+0x31d/0x900 [mac80211] 1541 + iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm] 1542 + iwl_rx_dispatch+0x8e/0xf0 [iwldvm] 1543 + iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi] 1544 + irq_thread_fn+0x20/0x50 1545 + irq_thread+0x11f/0x150 1546 + kthread+0xd2/0xf0 1547 + } hitcount: 98 len: 664329 1548 + { stacktrace: 1549 + __netif_receive_skb_core+0x46d/0x990 1550 + __netif_receive_skb+0x18/0x60 1551 + process_backlog+0xa8/0x150 1552 + net_rx_action+0x15d/0x340 1553 + __do_softirq+0x114/0x2c0 1554 + do_softirq_own_stack+0x1c/0x30 1555 + do_softirq+0x65/0x70 1556 + __local_bh_enable_ip+0xb5/0xc0 1557 + ip_finish_output+0x1f4/0x840 1558 + ip_output+0x6b/0xc0 1559 + ip_local_out_sk+0x31/0x40 1560 + ip_send_skb+0x1a/0x50 1561 + udp_send_skb+0x173/0x2a0 1562 + udp_sendmsg+0x2bf/0x9f0 1563 + inet_sendmsg+0x64/0xa0 1564 + sock_sendmsg+0x3d/0x50 1565 + } hitcount: 115 len: 13030 1566 + { stacktrace: 1567 + __netif_receive_skb_core+0x46d/0x990 1568 + __netif_receive_skb+0x18/0x60 1569 + netif_receive_skb_internal+0x23/0x90 1570 + napi_gro_complete+0xa4/0xe0 1571 + napi_gro_flush+0x6d/0x90 1572 + iwl_pcie_irq_handler+0x92a/0x12f0 [iwlwifi] 1573 + irq_thread_fn+0x20/0x50 1574 + irq_thread+0x11f/0x150 1575 + kthread+0xd2/0xf0 1576 + ret_from_fork+0x42/0x70 1577 + } hitcount: 934 len: 5512212 1578 + 1579 + Totals: 1580 + Hits: 1232 1581 + Entries: 4 1582 + Dropped: 0 1583 + 1584 + The above shows all the netif_receive_skb callpaths and their total 1585 + lengths for the duration of the wget command. 1586 + 1587 + The 'clear' hist trigger param can be used to clear the hash table. 1588 + Suppose we wanted to try another run of the previous example but 1589 + this time also wanted to see the complete list of events that went 1590 + into the histogram. In order to avoid having to set everything up 1591 + again, we can just clear the histogram first: 1592 + 1593 + # echo 'hist:key=stacktrace:vals=len:clear' >> \ 1594 + /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger 1595 + 1596 + Just to verify that it is in fact cleared, here's what we now see in 1597 + the hist file: 1598 + 1599 + # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist 1600 + # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] 1601 + 1602 + Totals: 1603 + Hits: 0 1604 + Entries: 0 1605 + Dropped: 0 1606 + 1607 + Since we want to see the detailed list of every netif_receive_skb 1608 + event occurring during the new run, which are in fact the same 1609 + events being aggregated into the hash table, we add some additional 1610 + 'enable_event' events to the triggering sched_process_exec and 1611 + sched_process_exit events as such: 1612 + 1613 + # echo 'enable_event:net:netif_receive_skb if filename==/usr/bin/wget' > \ 1614 + /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger 1615 + 1616 + # echo 'disable_event:net:netif_receive_skb if comm==wget' > \ 1617 + /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger 1618 + 1619 + If you read the trigger files for the sched_process_exec and 1620 + sched_process_exit triggers, you should see two triggers for each: 1621 + one enabling/disabling the hist aggregation and the other 1622 + enabling/disabling the logging of events: 1623 + 1624 + # cat /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger 1625 + enable_event:net:netif_receive_skb:unlimited if filename==/usr/bin/wget 1626 + enable_hist:net:netif_receive_skb:unlimited if filename==/usr/bin/wget 1627 + 1628 + # cat /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger 1629 + enable_event:net:netif_receive_skb:unlimited if comm==wget 1630 + disable_hist:net:netif_receive_skb:unlimited if comm==wget 1631 + 1632 + In other words, whenever either of the sched_process_exec or 1633 + sched_process_exit events is hit and matches 'wget', it enables or 1634 + disables both the histogram and the event log, and what you end up 1635 + with is a hash table and set of events just covering the specified 1636 + duration. Run the wget command again: 1637 + 1638 + $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz 1639 + 1640 + Displaying the 'hist' file should show something similar to what you 1641 + saw in the last run, but this time you should also see the 1642 + individual events in the trace file: 1643 + 1644 + # cat /sys/kernel/debug/tracing/trace 1645 + 1646 + # tracer: nop 1647 + # 1648 + # entries-in-buffer/entries-written: 183/1426 #P:4 1649 + # 1650 + # _-----=> irqs-off 1651 + # / _----=> need-resched 1652 + # | / _---=> hardirq/softirq 1653 + # || / _--=> preempt-depth 1654 + # ||| / delay 1655 + # TASK-PID CPU# |||| TIMESTAMP FUNCTION 1656 + # | | | |||| | | 1657 + wget-15108 [000] ..s1 31769.606929: netif_receive_skb: dev=lo skbaddr=ffff88009c353100 len=60 1658 + wget-15108 [000] ..s1 31769.606999: netif_receive_skb: dev=lo skbaddr=ffff88009c353200 len=60 1659 + dnsmasq-1382 [000] ..s1 31769.677652: netif_receive_skb: dev=lo skbaddr=ffff88009c352b00 len=130 1660 + dnsmasq-1382 [000] ..s1 31769.685917: netif_receive_skb: dev=lo skbaddr=ffff88009c352200 len=138 1661 + ##### CPU 2 buffer started #### 1662 + irq/29-iwlwifi-559 [002] ..s. 31772.031529: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433d00 len=2948 1663 + irq/29-iwlwifi-559 [002] ..s. 31772.031572: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432200 len=1500 1664 + irq/29-iwlwifi-559 [002] ..s. 31772.032196: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433100 len=2948 1665 + irq/29-iwlwifi-559 [002] ..s. 31772.032761: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433000 len=2948 1666 + irq/29-iwlwifi-559 [002] ..s. 31772.033220: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432e00 len=1500 1667 + . 1668 + . 1669 + .