Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
"New tracing features:

- The ring buffer is no longer disabled when reading the trace file.

The trace_pipe file was made for live tracing and reading, as it
acts like a normal producer/consumer. As the trace file would not
consume the data, the easy way of handling it was to just disable
writes to the ring buffer.

This came as a surprise to the BPF folks, who complained about lost
events due to reading. This is no longer an issue. If someone wants
to keep the old disabling behavior, there's a new option
"pause-on-trace" that can be set.
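For a tool that relied on the old behavior, a minimal sketch of setting the new option from C (the tracefs mount point and the options/ path are assumptions based on where trace options normally live; requires root):

```c
#include <stdio.h>

/* Write a short string to a tracing control file, e.g. "1" to an
 * option file.  Returns 0 on success, -1 on error.  Works on any
 * writable file, so it can be exercised without tracefs. */
static int write_str(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs(val, f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```

For example, `write_str("/sys/kernel/tracing/options/pause-on-trace", "1")` before opening the trace file would pause writers while it is read, as described above (the path is an assumption; the option name comes from the text).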

- New set_ftrace_notrace_pid file. PIDs in this file will not be
traced by the function tracer.

Similar to set_ftrace_pid, which makes the function tracer only
trace those tasks with PIDs in the file, the set_ftrace_notrace_pid
does the reverse.

- New set_event_notrace_pid file. PIDs in this file will cause events
not to be traced if triggered by a task with a matching PID.

Similar to the set_event_pid file, but in reverse: listed PIDs are
not traced. Note, sched_waking and sched_switch events may still be
traced if one of the tasks referenced by those events has a PID that
is allowed to be traced.
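Both notrace pid files are written the same way as set_ftrace_pid. A hedged C sketch of adding a PID (the tracefs mount point is an assumption; note the file is deliberately opened without O_TRUNC, since truncating one of these files clears the whole list, as the documentation diff below spells out):

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Append one PID to a tracing pid file such as
 * set_ftrace_notrace_pid or set_event_notrace_pid.  Opening with
 * O_TRUNC would clear the existing list, so plain O_WRONLY is used. */
static int add_notrace_pid(const char *path, long pid)
{
	char buf[32];
	int n = snprintf(buf, sizeof(buf), "%ld\n", pid);
	int fd = open(path, O_WRONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = (write(fd, buf, n) == n) ? 0 : -1;
	close(fd);
	return ret;
}
```

For example, `add_notrace_pid("/sys/kernel/tracing/set_ftrace_notrace_pid", getpid())` would keep the calling task out of the function trace (path assumed; requires root).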

Tracing related features:

- New bootconfig option that is attached to the initrd file.

If bootconfig is on the kernel command line, then the initrd file
is searched for a bootconfig appended at its end.
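The appended bootconfig carries a small footer so the kernel can find it at the end of the initrd. The following sketch reflects my reading of the bootconfig code of this era: config text, a 32-bit size, a 32-bit byte-sum checksum, then the 12-byte magic "#BOOTCONFIG\n", with the integer fields written in host byte order here. Treat every layout detail as an assumption, not a spec:

```c
#include <stdint.h>
#include <string.h>

#define BOOTCONFIG_MAGIC     "#BOOTCONFIG\n"
#define BOOTCONFIG_MAGIC_LEN 12

/* Simple additive checksum over the config text (assumed to match
 * what the kernel verifies at boot). */
static uint32_t xbc_checksum(const char *data, uint32_t size)
{
	uint32_t sum = 0;

	while (size--)
		sum += (unsigned char)*data++;
	return sum;
}

/* Append "<config><size u32><csum u32><magic>" to buf, which already
 * holds an initrd image of initrd_len bytes.  Returns the new length.
 * The caller must ensure buf is large enough. */
static size_t append_bootconfig(char *buf, size_t initrd_len,
				const char *config, uint32_t csize)
{
	size_t off = initrd_len;
	uint32_t csum = xbc_checksum(config, csize);

	memcpy(buf + off, config, csize);
	off += csize;
	memcpy(buf + off, &csize, sizeof(csize));
	off += sizeof(csize);
	memcpy(buf + off, &csum, sizeof(csum));
	off += sizeof(csum);
	memcpy(buf + off, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN);
	return off + BOOTCONFIG_MAGIC_LEN;
}
```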

- New GPU tracepoint infrastructure to help gfx drivers get off
debugfs (acked by Greg Kroah-Hartman)

And other minor updates and fixes"

* tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits)
tracing: Do not allocate buffer in trace_find_next_entry() in atomic
tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
selftests/ftrace: Add test to test new set_event_notrace_pid file
selftests/ftrace: Add test to test new set_ftrace_notrace_pid file
tracing: Create set_event_notrace_pid to not trace tasks
ftrace: Create set_ftrace_notrace_pid to not trace tasks
ftrace: Make function trace pid filtering a bit more exact
ftrace/kprobe: Show the maxactive number on kprobe_events
tracing: Have the document reflect that the trace file keeps tracing enabled
ring-buffer/tracing: Have iterator acknowledge dropped events
tracing: Do not disable tracing when reading the trace file
ring-buffer: Do not disable recording when there is an iterator
ring-buffer: Make resize disable per cpu buffer instead of total buffer
ring-buffer: Optimize rb_iter_head_event()
ring-buffer: Do not die if rb_iter_peek() fails more than thrice
ring-buffer: Have rb_iter_head_event() handle concurrent writer
ring-buffer: Add page_stamp to iterator for synchronization
ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()
ring-buffer: Have ring_buffer_empty() not depend on tracing stopped
tracing: Save off entry when peeking at next entry
...

+1196 -258
+66 -16
Documentation/trace/ftrace.rst
··· 125 125 trace: 126 126 127 127 This file holds the output of the trace in a human 128 - readable format (described below). Note, tracing is temporarily 129 - disabled when the file is open for reading. Once all readers 130 - are closed, tracing is re-enabled. Opening this file for 128 + readable format (described below). Opening this file for 131 129 writing with the O_TRUNC flag clears the ring buffer content. 130 + Note, this file is not a consumer. If tracing is off 131 + (no tracer running, or tracing_on is zero), it will produce 132 + the same output each time it is read. When tracing is on, 133 + it may produce inconsistent results as it tries to read 134 + the entire buffer without consuming it. 132 135 133 136 trace_pipe: 134 137 ··· 145 142 will not be read again with a sequential read. The 146 143 "trace" file is static, and if the tracer is not 147 144 adding more data, it will display the same 148 - information every time it is read. Unlike the 149 - "trace" file, opening this file for reading will not 150 - temporarily disable tracing. 145 + information every time it is read. 151 146 152 147 trace_options: 153 148 ··· 263 262 traced by the function tracer as well. This option will also 264 263 cause PIDs of tasks that exit to be removed from the file. 265 264 265 + set_ftrace_notrace_pid: 266 + 267 + Have the function tracer ignore threads whose PID are listed in 268 + this file. 269 + 270 + If the "function-fork" option is set, then when a task whose 271 + PID is listed in this file forks, the child's PID will 272 + automatically be added to this file, and the child will not be 273 + traced by the function tracer as well. This option will also 274 + cause PIDs of tasks that exit to be removed from the file. 275 + 276 + If a PID is in both this file and "set_ftrace_pid", then this 277 + file takes precedence, and the thread will not be traced. 
278 + 266 279 set_event_pid: 267 280 268 281 Have the events only trace a task with a PID listed in this file. 269 282 Note, sched_switch and sched_wake_up will also trace events 270 283 listed in this file. 284 + 285 + To have the PIDs of children of tasks with their PID in this file 286 + added on fork, enable the "event-fork" option. That option will also 287 + cause the PIDs of tasks to be removed from this file when the task 288 + exits. 289 + 290 + set_event_notrace_pid: 291 + 292 + Have the events not trace a task with a PID listed in this file. 293 + Note, sched_switch and sched_wakeup will trace threads not listed 294 + in this file, even if a thread's PID is in the file if the 295 + sched_switch or sched_wakeup events also trace a thread that should 296 + be traced. 271 297 272 298 To have the PIDs of children of tasks with their PID in this file 273 299 added on fork, enable the "event-fork" option. That option will also ··· 1153 1125 the trace displays additional information about the 1154 1126 latency, as described in "Latency trace format". 1155 1127 1128 + pause-on-trace 1129 + When set, opening the trace file for read, will pause 1130 + writing to the ring buffer (as if tracing_on was set to zero). 1131 + This simulates the original behavior of the trace file. 1132 + When the file is closed, tracing will be enabled again. 1133 + 1156 1134 record-cmd 1157 1135 When any event or tracer is enabled, a hook is enabled 1158 1136 in the sched_switch trace point to fill comm cache ··· 1210 1176 tasks fork. Also, when tasks with PIDs in set_event_pid exit, 1211 1177 their PIDs will be removed from the file. 1212 1178 1179 + This affects PIDs listed in set_event_notrace_pid as well. 1180 + 1213 1181 function-trace 1214 1182 The latency tracers will enable function tracing 1215 1183 if this option is enabled (default it is). When ··· 1225 1189 when those tasks fork. 
Also, when tasks with PIDs in 1226 1190 set_ftrace_pid exit, their PIDs will be removed from the 1227 1191 file. 1192 + 1193 + This affects PIDs in set_ftrace_notrace_pid as well. 1228 1194 1229 1195 display-graph 1230 1196 When set, the latency tracers (irqsoff, wakeup, etc) will ··· 2164 2126 # cat trace 2165 2127 # tracer: hwlat 2166 2128 # 2129 + # entries-in-buffer/entries-written: 13/13 #P:8 2130 + # 2167 2131 # _-----=> irqs-off 2168 2132 # / _----=> need-resched 2169 2133 # | / _---=> hardirq/softirq ··· 2173 2133 # ||| / delay 2174 2134 # TASK-PID CPU# |||| TIMESTAMP FUNCTION 2175 2135 # | | | |||| | | 2176 - <...>-3638 [001] d... 19452.055471: #1 inner/outer(us): 12/14 ts:1499801089.066141940 2177 - <...>-3638 [003] d... 19454.071354: #2 inner/outer(us): 11/9 ts:1499801091.082164365 2178 - <...>-3638 [002] dn.. 19461.126852: #3 inner/outer(us): 12/9 ts:1499801098.138150062 2179 - <...>-3638 [001] d... 19488.340960: #4 inner/outer(us): 8/12 ts:1499801125.354139633 2180 - <...>-3638 [003] d... 19494.388553: #5 inner/outer(us): 8/12 ts:1499801131.402150961 2181 - <...>-3638 [003] d... 19501.283419: #6 inner/outer(us): 0/12 ts:1499801138.297435289 nmi-total:4 nmi-count:1 2136 + <...>-1729 [001] d... 678.473449: #1 inner/outer(us): 11/12 ts:1581527483.343962693 count:6 2137 + <...>-1729 [004] d... 689.556542: #2 inner/outer(us): 16/9 ts:1581527494.889008092 count:1 2138 + <...>-1729 [005] d... 714.756290: #3 inner/outer(us): 16/16 ts:1581527519.678961629 count:5 2139 + <...>-1729 [001] d... 718.788247: #4 inner/outer(us): 9/17 ts:1581527523.889012713 count:1 2140 + <...>-1729 [002] d... 719.796341: #5 inner/outer(us): 13/9 ts:1581527524.912872606 count:1 2141 + <...>-1729 [006] d... 844.787091: #6 inner/outer(us): 9/12 ts:1581527649.889048502 count:2 2142 + <...>-1729 [003] d... 849.827033: #7 inner/outer(us): 18/9 ts:1581527654.889013793 count:1 2143 + <...>-1729 [007] d... 
853.859002: #8 inner/outer(us): 9/12 ts:1581527658.889065736 count:1 2144 + <...>-1729 [001] d... 855.874978: #9 inner/outer(us): 9/11 ts:1581527660.861991877 count:1 2145 + <...>-1729 [001] d... 863.938932: #10 inner/outer(us): 9/11 ts:1581527668.970010500 count:1 nmi-total:7 nmi-count:1 2146 + <...>-1729 [007] d... 878.050780: #11 inner/outer(us): 9/12 ts:1581527683.385002600 count:1 nmi-total:5 nmi-count:1 2147 + <...>-1729 [007] d... 886.114702: #12 inner/outer(us): 9/12 ts:1581527691.385001600 count:1 2182 2148 2183 2149 2184 2150 The above output is somewhat the same in the header. All events will have ··· 2194 2148 This is the count of events recorded that were greater than the 2195 2149 tracing_threshold (See below). 2196 2150 2197 - inner/outer(us): 12/14 2151 + inner/outer(us): 11/11 2198 2152 2199 2153 This shows two numbers as "inner latency" and "outer latency". The test 2200 2154 runs in a loop checking a timestamp twice. The latency detected within ··· 2202 2156 after the previous timestamp and the next timestamp in the loop is 2203 2157 the "outer latency". 2204 2158 2205 - ts:1499801089.066141940 2159 + ts:1581527483.343962693 2206 2160 2207 - The absolute timestamp that the event happened. 2161 + The absolute timestamp that the first latency was recorded in the window. 2208 2162 2209 - nmi-total:4 nmi-count:1 2163 + count:6 2164 + 2165 + The number of times a latency was detected during the window. 2166 + 2167 + nmi-total:7 nmi-count:1 2210 2168 2211 2169 On architectures that support it, if an NMI comes in during the 2212 2170 test, the time spent in NMI is reported in "nmi-total" (in
+2
drivers/Kconfig
@@ -200,6 +200,8 @@
 
 source "drivers/android/Kconfig"
 
+source "drivers/gpu/trace/Kconfig"
+
 source "drivers/nvdimm/Kconfig"
 
 source "drivers/dax/Kconfig"
+1
drivers/gpu/Makefile
@@ -5,3 +5,4 @@
 obj-$(CONFIG_TEGRA_HOST1X) += host1x/
 obj-y += drm/ vga/
 obj-$(CONFIG_IMX_IPUV3_CORE) += ipu-v3/
+obj-$(CONFIG_TRACE_GPU_MEM) += trace/
+4
drivers/gpu/trace/Kconfig
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config TRACE_GPU_MEM
+	bool
+3
drivers/gpu/trace/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_TRACE_GPU_MEM) += trace_gpu_mem.o
+13
drivers/gpu/trace/trace_gpu_mem.c
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * GPU memory trace points
+ *
+ * Copyright (C) 2020 Google, Inc.
+ */
+
+#include <linux/module.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/gpu_mem.h>
+
+EXPORT_TRACEPOINT_SYMBOL(gpu_mem_total);
+2 -1
include/linux/bootconfig.h
@@ -216,7 +216,8 @@
 }
 
 /* XBC node initializer */
-int __init xbc_init(char *buf);
+int __init xbc_init(char *buf, const char **emsg, int *epos);
+
 
 /* XBC cleanup data structures */
 void __init xbc_destroy_all(void);
+2 -2
include/linux/ring_buffer.h
@@ -135,10 +135,10 @@
 
 struct ring_buffer_event *
 ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts);
-struct ring_buffer_event *
-ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts);
+void ring_buffer_iter_advance(struct ring_buffer_iter *iter);
 void ring_buffer_iter_reset(struct ring_buffer_iter *iter);
 int ring_buffer_iter_empty(struct ring_buffer_iter *iter);
+bool ring_buffer_iter_dropped(struct ring_buffer_iter *iter);
 
 unsigned long ring_buffer_size(struct trace_buffer *buffer, int cpu);
 
+2
include/linux/trace_events.h
@@ -85,6 +85,8 @@
 	struct mutex		mutex;
 	struct ring_buffer_iter	**buffer_iter;
 	unsigned long		iter_flags;
+	void			*temp;	/* temp holder */
+	unsigned int		temp_size;
 
 	/* trace_seq for __print_flags() and __print_symbolic() etc. */
 	struct trace_seq	tmp_seq;
+57
include/trace/events/gpu_mem.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * GPU memory trace points
+ *
+ * Copyright (C) 2020 Google, Inc.
+ */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM gpu_mem
+
+#if !defined(_TRACE_GPU_MEM_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_GPU_MEM_H
+
+#include <linux/tracepoint.h>
+
+/*
+ * The gpu_memory_total event indicates that there's an update to either the
+ * global or process total gpu memory counters.
+ *
+ * This event should be emitted whenever the kernel device driver allocates,
+ * frees, imports, unimports memory in the GPU addressable space.
+ *
+ * @gpu_id: This is the gpu id.
+ *
+ * @pid: Put 0 for global total, while positive pid for process total.
+ *
+ * @size: Virtual size of the allocation in bytes.
+ *
+ */
+TRACE_EVENT(gpu_mem_total,
+
+	TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
+
+	TP_ARGS(gpu_id, pid, size),
+
+	TP_STRUCT__entry(
+		__field(uint32_t, gpu_id)
+		__field(uint32_t, pid)
+		__field(uint64_t, size)
+	),
+
+	TP_fast_assign(
+		__entry->gpu_id = gpu_id;
+		__entry->pid = pid;
+		__entry->size = size;
+	),
+
+	TP_printk("gpu_id=%u pid=%u size=%llu",
+		__entry->gpu_id,
+		__entry->pid,
+		__entry->size)
+);
+
+#endif /* _TRACE_GPU_MEM_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
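In a driver, the generated trace_gpu_mem_total(gpu_id, pid, size) call would emit this event (a hedged note; actual call sites depend on the driver). The record layout from TP_STRUCT__entry and the TP_printk format can be mirrored in plain userspace C, e.g. in a tool that pretty-prints the event:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace mirror of the gpu_mem_total payload (struct name is
 * made up; the kernel generates its own entry struct). */
struct gpu_mem_total_entry {
	uint32_t gpu_id;
	uint32_t pid;   /* 0 = global total, >0 = per-process total */
	uint64_t size;  /* virtual size of the allocation, in bytes */
};

/* Render an entry the way TP_printk formats it. */
static int format_gpu_mem_total(char *out, size_t n,
				const struct gpu_mem_total_entry *e)
{
	return snprintf(out, n, "gpu_id=%" PRIu32 " pid=%" PRIu32
			" size=%" PRIu64, e->gpu_id, e->pid, e->size);
}
```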
+10 -4
init/main.c
··· 353 353 static void __init setup_boot_config(const char *cmdline) 354 354 { 355 355 static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata; 356 + const char *msg; 357 + int pos; 356 358 u32 size, csum; 357 359 char *data, *copy; 358 360 u32 *hdr; ··· 402 400 memcpy(copy, data, size); 403 401 copy[size] = '\0'; 404 402 405 - ret = xbc_init(copy); 406 - if (ret < 0) 407 - pr_err("Failed to parse bootconfig\n"); 408 - else { 403 + ret = xbc_init(copy, &msg, &pos); 404 + if (ret < 0) { 405 + if (pos < 0) 406 + pr_err("Failed to init bootconfig: %s.\n", msg); 407 + else 408 + pr_err("Failed to parse bootconfig: %s at %d.\n", 409 + msg, pos); 410 + } else { 409 411 pr_info("Load bootconfig: %d bytes %d nodes\n", size, ret); 410 412 /* keys starting with "kernel." are passed via cmdline */ 411 413 extra_command_line = xbc_make_cmdline("kernel");
+173 -27
kernel/trace/ftrace.c
··· 102 102 103 103 tr = ops->private; 104 104 105 - return tr->function_pids != NULL; 105 + return tr->function_pids != NULL || tr->function_no_pids != NULL; 106 106 } 107 107 108 108 static void ftrace_update_trampoline(struct ftrace_ops *ops); ··· 139 139 #endif 140 140 } 141 141 142 + #define FTRACE_PID_IGNORE -1 143 + #define FTRACE_PID_TRACE -2 144 + 142 145 static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip, 143 146 struct ftrace_ops *op, struct pt_regs *regs) 144 147 { 145 148 struct trace_array *tr = op->private; 149 + int pid; 146 150 147 - if (tr && this_cpu_read(tr->array_buffer.data->ftrace_ignore_pid)) 148 - return; 151 + if (tr) { 152 + pid = this_cpu_read(tr->array_buffer.data->ftrace_ignore_pid); 153 + if (pid == FTRACE_PID_IGNORE) 154 + return; 155 + if (pid != FTRACE_PID_TRACE && 156 + pid != current->pid) 157 + return; 158 + } 149 159 150 160 op->saved_func(ip, parent_ip, op, regs); 151 161 } ··· 6933 6923 { 6934 6924 struct trace_array *tr = data; 6935 6925 struct trace_pid_list *pid_list; 6926 + struct trace_pid_list *no_pid_list; 6936 6927 6937 6928 pid_list = rcu_dereference_sched(tr->function_pids); 6929 + no_pid_list = rcu_dereference_sched(tr->function_no_pids); 6938 6930 6939 - this_cpu_write(tr->array_buffer.data->ftrace_ignore_pid, 6940 - trace_ignore_this_task(pid_list, next)); 6931 + if (trace_ignore_this_task(pid_list, no_pid_list, next)) 6932 + this_cpu_write(tr->array_buffer.data->ftrace_ignore_pid, 6933 + FTRACE_PID_IGNORE); 6934 + else 6935 + this_cpu_write(tr->array_buffer.data->ftrace_ignore_pid, 6936 + next->pid); 6941 6937 } 6942 6938 6943 6939 static void ··· 6956 6940 6957 6941 pid_list = rcu_dereference_sched(tr->function_pids); 6958 6942 trace_filter_add_remove_task(pid_list, self, task); 6943 + 6944 + pid_list = rcu_dereference_sched(tr->function_no_pids); 6945 + trace_filter_add_remove_task(pid_list, self, task); 6959 6946 } 6960 6947 6961 6948 static void ··· 6968 6949 struct trace_array *tr = data; 
6969 6950 6970 6951 pid_list = rcu_dereference_sched(tr->function_pids); 6952 + trace_filter_add_remove_task(pid_list, NULL, task); 6953 + 6954 + pid_list = rcu_dereference_sched(tr->function_no_pids); 6971 6955 trace_filter_add_remove_task(pid_list, NULL, task); 6972 6956 } 6973 6957 ··· 6989 6967 } 6990 6968 } 6991 6969 6992 - static void clear_ftrace_pids(struct trace_array *tr) 6970 + static void clear_ftrace_pids(struct trace_array *tr, int type) 6993 6971 { 6994 6972 struct trace_pid_list *pid_list; 6973 + struct trace_pid_list *no_pid_list; 6995 6974 int cpu; 6996 6975 6997 6976 pid_list = rcu_dereference_protected(tr->function_pids, 6998 6977 lockdep_is_held(&ftrace_lock)); 6999 - if (!pid_list) 6978 + no_pid_list = rcu_dereference_protected(tr->function_no_pids, 6979 + lockdep_is_held(&ftrace_lock)); 6980 + 6981 + /* Make sure there's something to do */ 6982 + if (!pid_type_enabled(type, pid_list, no_pid_list)) 7000 6983 return; 7001 6984 7002 - unregister_trace_sched_switch(ftrace_filter_pid_sched_switch_probe, tr); 6985 + /* See if the pids still need to be checked after this */ 6986 + if (!still_need_pid_events(type, pid_list, no_pid_list)) { 6987 + unregister_trace_sched_switch(ftrace_filter_pid_sched_switch_probe, tr); 6988 + for_each_possible_cpu(cpu) 6989 + per_cpu_ptr(tr->array_buffer.data, cpu)->ftrace_ignore_pid = FTRACE_PID_TRACE; 6990 + } 7003 6991 7004 - for_each_possible_cpu(cpu) 7005 - per_cpu_ptr(tr->array_buffer.data, cpu)->ftrace_ignore_pid = false; 6992 + if (type & TRACE_PIDS) 6993 + rcu_assign_pointer(tr->function_pids, NULL); 7006 6994 7007 - rcu_assign_pointer(tr->function_pids, NULL); 6995 + if (type & TRACE_NO_PIDS) 6996 + rcu_assign_pointer(tr->function_no_pids, NULL); 7008 6997 7009 6998 /* Wait till all users are no longer using pid filtering */ 7010 6999 synchronize_rcu(); 7011 7000 7012 - trace_free_pid_list(pid_list); 7001 + if ((type & TRACE_PIDS) && pid_list) 7002 + trace_free_pid_list(pid_list); 7003 + 7004 + if ((type & 
TRACE_NO_PIDS) && no_pid_list) 7005 + trace_free_pid_list(no_pid_list); 7013 7006 } 7014 7007 7015 7008 void ftrace_clear_pids(struct trace_array *tr) 7016 7009 { 7017 7010 mutex_lock(&ftrace_lock); 7018 7011 7019 - clear_ftrace_pids(tr); 7012 + clear_ftrace_pids(tr, TRACE_PIDS | TRACE_NO_PIDS); 7020 7013 7021 7014 mutex_unlock(&ftrace_lock); 7022 7015 } 7023 7016 7024 - static void ftrace_pid_reset(struct trace_array *tr) 7017 + static void ftrace_pid_reset(struct trace_array *tr, int type) 7025 7018 { 7026 7019 mutex_lock(&ftrace_lock); 7027 - clear_ftrace_pids(tr); 7020 + clear_ftrace_pids(tr, type); 7028 7021 7029 7022 ftrace_update_pid_func(); 7030 7023 ftrace_startup_all(0); ··· 7103 7066 .show = fpid_show, 7104 7067 }; 7105 7068 7106 - static int 7107 - ftrace_pid_open(struct inode *inode, struct file *file) 7069 + static void *fnpid_start(struct seq_file *m, loff_t *pos) 7070 + __acquires(RCU) 7108 7071 { 7072 + struct trace_pid_list *pid_list; 7073 + struct trace_array *tr = m->private; 7074 + 7075 + mutex_lock(&ftrace_lock); 7076 + rcu_read_lock_sched(); 7077 + 7078 + pid_list = rcu_dereference_sched(tr->function_no_pids); 7079 + 7080 + if (!pid_list) 7081 + return !(*pos) ? 
FTRACE_NO_PIDS : NULL; 7082 + 7083 + return trace_pid_start(pid_list, pos); 7084 + } 7085 + 7086 + static void *fnpid_next(struct seq_file *m, void *v, loff_t *pos) 7087 + { 7088 + struct trace_array *tr = m->private; 7089 + struct trace_pid_list *pid_list = rcu_dereference_sched(tr->function_no_pids); 7090 + 7091 + if (v == FTRACE_NO_PIDS) { 7092 + (*pos)++; 7093 + return NULL; 7094 + } 7095 + return trace_pid_next(pid_list, v, pos); 7096 + } 7097 + 7098 + static const struct seq_operations ftrace_no_pid_sops = { 7099 + .start = fnpid_start, 7100 + .next = fnpid_next, 7101 + .stop = fpid_stop, 7102 + .show = fpid_show, 7103 + }; 7104 + 7105 + static int pid_open(struct inode *inode, struct file *file, int type) 7106 + { 7107 + const struct seq_operations *seq_ops; 7109 7108 struct trace_array *tr = inode->i_private; 7110 7109 struct seq_file *m; 7111 7110 int ret = 0; ··· 7152 7079 7153 7080 if ((file->f_mode & FMODE_WRITE) && 7154 7081 (file->f_flags & O_TRUNC)) 7155 - ftrace_pid_reset(tr); 7082 + ftrace_pid_reset(tr, type); 7156 7083 7157 - ret = seq_open(file, &ftrace_pid_sops); 7084 + switch (type) { 7085 + case TRACE_PIDS: 7086 + seq_ops = &ftrace_pid_sops; 7087 + break; 7088 + case TRACE_NO_PIDS: 7089 + seq_ops = &ftrace_no_pid_sops; 7090 + break; 7091 + } 7092 + 7093 + ret = seq_open(file, seq_ops); 7158 7094 if (ret < 0) { 7159 7095 trace_array_put(tr); 7160 7096 } else { ··· 7175 7093 return ret; 7176 7094 } 7177 7095 7096 + static int 7097 + ftrace_pid_open(struct inode *inode, struct file *file) 7098 + { 7099 + return pid_open(inode, file, TRACE_PIDS); 7100 + } 7101 + 7102 + static int 7103 + ftrace_no_pid_open(struct inode *inode, struct file *file) 7104 + { 7105 + return pid_open(inode, file, TRACE_NO_PIDS); 7106 + } 7107 + 7178 7108 static void ignore_task_cpu(void *data) 7179 7109 { 7180 7110 struct trace_array *tr = data; 7181 7111 struct trace_pid_list *pid_list; 7112 + struct trace_pid_list *no_pid_list; 7182 7113 7183 7114 /* 7184 7115 * This 
function is called by on_each_cpu() while the ··· 7199 7104 */ 7200 7105 pid_list = rcu_dereference_protected(tr->function_pids, 7201 7106 mutex_is_locked(&ftrace_lock)); 7107 + no_pid_list = rcu_dereference_protected(tr->function_no_pids, 7108 + mutex_is_locked(&ftrace_lock)); 7202 7109 7203 - this_cpu_write(tr->array_buffer.data->ftrace_ignore_pid, 7204 - trace_ignore_this_task(pid_list, current)); 7110 + if (trace_ignore_this_task(pid_list, no_pid_list, current)) 7111 + this_cpu_write(tr->array_buffer.data->ftrace_ignore_pid, 7112 + FTRACE_PID_IGNORE); 7113 + else 7114 + this_cpu_write(tr->array_buffer.data->ftrace_ignore_pid, 7115 + current->pid); 7205 7116 } 7206 7117 7207 7118 static ssize_t 7208 - ftrace_pid_write(struct file *filp, const char __user *ubuf, 7209 - size_t cnt, loff_t *ppos) 7119 + pid_write(struct file *filp, const char __user *ubuf, 7120 + size_t cnt, loff_t *ppos, int type) 7210 7121 { 7211 7122 struct seq_file *m = filp->private_data; 7212 7123 struct trace_array *tr = m->private; 7213 - struct trace_pid_list *filtered_pids = NULL; 7124 + struct trace_pid_list *filtered_pids; 7125 + struct trace_pid_list *other_pids; 7214 7126 struct trace_pid_list *pid_list; 7215 7127 ssize_t ret; 7216 7128 ··· 7226 7124 7227 7125 mutex_lock(&ftrace_lock); 7228 7126 7229 - filtered_pids = rcu_dereference_protected(tr->function_pids, 7127 + switch (type) { 7128 + case TRACE_PIDS: 7129 + filtered_pids = rcu_dereference_protected(tr->function_pids, 7230 7130 lockdep_is_held(&ftrace_lock)); 7131 + other_pids = rcu_dereference_protected(tr->function_no_pids, 7132 + lockdep_is_held(&ftrace_lock)); 7133 + break; 7134 + case TRACE_NO_PIDS: 7135 + filtered_pids = rcu_dereference_protected(tr->function_no_pids, 7136 + lockdep_is_held(&ftrace_lock)); 7137 + other_pids = rcu_dereference_protected(tr->function_pids, 7138 + lockdep_is_held(&ftrace_lock)); 7139 + break; 7140 + } 7231 7141 7232 7142 ret = trace_pid_write(filtered_pids, &pid_list, ubuf, cnt); 7233 7143 if 
(ret < 0) 7234 7144 goto out; 7235 7145 7236 - rcu_assign_pointer(tr->function_pids, pid_list); 7146 + switch (type) { 7147 + case TRACE_PIDS: 7148 + rcu_assign_pointer(tr->function_pids, pid_list); 7149 + break; 7150 + case TRACE_NO_PIDS: 7151 + rcu_assign_pointer(tr->function_no_pids, pid_list); 7152 + break; 7153 + } 7154 + 7237 7155 7238 7156 if (filtered_pids) { 7239 7157 synchronize_rcu(); 7240 7158 trace_free_pid_list(filtered_pids); 7241 - } else if (pid_list) { 7159 + } else if (pid_list && !other_pids) { 7242 7160 /* Register a probe to set whether to ignore the tracing of a task */ 7243 7161 register_trace_sched_switch(ftrace_filter_pid_sched_switch_probe, tr); 7244 7162 } ··· 7281 7159 return ret; 7282 7160 } 7283 7161 7162 + static ssize_t 7163 + ftrace_pid_write(struct file *filp, const char __user *ubuf, 7164 + size_t cnt, loff_t *ppos) 7165 + { 7166 + return pid_write(filp, ubuf, cnt, ppos, TRACE_PIDS); 7167 + } 7168 + 7169 + static ssize_t 7170 + ftrace_no_pid_write(struct file *filp, const char __user *ubuf, 7171 + size_t cnt, loff_t *ppos) 7172 + { 7173 + return pid_write(filp, ubuf, cnt, ppos, TRACE_NO_PIDS); 7174 + } 7175 + 7284 7176 static int 7285 7177 ftrace_pid_release(struct inode *inode, struct file *file) 7286 7178 { ··· 7313 7177 .release = ftrace_pid_release, 7314 7178 }; 7315 7179 7180 + static const struct file_operations ftrace_no_pid_fops = { 7181 + .open = ftrace_no_pid_open, 7182 + .write = ftrace_no_pid_write, 7183 + .read = seq_read, 7184 + .llseek = tracing_lseek, 7185 + .release = ftrace_pid_release, 7186 + }; 7187 + 7316 7188 void ftrace_init_tracefs(struct trace_array *tr, struct dentry *d_tracer) 7317 7189 { 7318 7190 trace_create_file("set_ftrace_pid", 0644, d_tracer, 7319 7191 tr, &ftrace_pid_fops); 7192 + trace_create_file("set_ftrace_notrace_pid", 0644, d_tracer, 7193 + tr, &ftrace_no_pid_fops); 7320 7194 } 7321 7195 7322 7196 void __init ftrace_init_tracefs_toplevel(struct trace_array *tr,
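The filtering rule that pid_write() and trace_ignore_this_task() implement above can be summarized as: a task is skipped if it appears in the no-pids set, or if a pids set exists and the task is not in it. A hedged userspace restatement of that rule (the struct is made up; the kernel backs its pid lists with bitmasks, not arrays):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's trace_pid_list. */
struct pid_set {
	const int *pids;
	size_t n;
};

static bool in_set(const struct pid_set *s, int pid)
{
	for (size_t i = 0; s && i < s->n; i++)
		if (s->pids[i] == pid)
			return true;
	return false;
}

/* Mirrors the combined set_ftrace_pid / set_ftrace_notrace_pid
 * semantics: the notrace list takes precedence. */
static bool ignore_this_task(const struct pid_set *pid_list,
			     const struct pid_set *no_pid_list, int pid)
{
	if (no_pid_list && in_set(no_pid_list, pid))
		return true;		/* listed in the notrace file */
	if (pid_list && !in_set(pid_list, pid))
		return true;		/* a filter exists; not in it */
	return false;
}
```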
+171 -72
kernel/trace/ring_buffer.c
··· 441 441 struct ring_buffer_per_cpu { 442 442 int cpu; 443 443 atomic_t record_disabled; 444 + atomic_t resize_disabled; 444 445 struct trace_buffer *buffer; 445 446 raw_spinlock_t reader_lock; /* serialize readers */ 446 447 arch_spinlock_t lock; ··· 485 484 unsigned flags; 486 485 int cpus; 487 486 atomic_t record_disabled; 488 - atomic_t resize_disabled; 489 487 cpumask_var_t cpumask; 490 488 491 489 struct lock_class_key *reader_lock_key; ··· 503 503 struct ring_buffer_iter { 504 504 struct ring_buffer_per_cpu *cpu_buffer; 505 505 unsigned long head; 506 + unsigned long next_event; 506 507 struct buffer_page *head_page; 507 508 struct buffer_page *cache_reader_page; 508 509 unsigned long cache_read; 509 510 u64 read_stamp; 511 + u64 page_stamp; 512 + struct ring_buffer_event *event; 513 + int missed_events; 510 514 }; 511 515 512 516 /** ··· 1741 1737 1742 1738 size = nr_pages * BUF_PAGE_SIZE; 1743 1739 1744 - /* 1745 - * Don't succeed if resizing is disabled, as a reader might be 1746 - * manipulating the ring buffer and is expecting a sane state while 1747 - * this is true. 1748 - */ 1749 - if (atomic_read(&buffer->resize_disabled)) 1750 - return -EBUSY; 1751 - 1752 1740 /* prevent another thread from changing buffer sizes */ 1753 1741 mutex_lock(&buffer->mutex); 1754 1742 1743 + 1755 1744 if (cpu_id == RING_BUFFER_ALL_CPUS) { 1745 + /* 1746 + * Don't succeed if resizing is disabled, as a reader might be 1747 + * manipulating the ring buffer and is expecting a sane state while 1748 + * this is true. 
1749 + */ 1750 + for_each_buffer_cpu(buffer, cpu) { 1751 + cpu_buffer = buffer->buffers[cpu]; 1752 + if (atomic_read(&cpu_buffer->resize_disabled)) { 1753 + err = -EBUSY; 1754 + goto out_err_unlock; 1755 + } 1756 + } 1757 + 1756 1758 /* calculate the pages to update */ 1757 1759 for_each_buffer_cpu(buffer, cpu) { 1758 1760 cpu_buffer = buffer->buffers[cpu]; ··· 1825 1815 1826 1816 if (nr_pages == cpu_buffer->nr_pages) 1827 1817 goto out; 1818 + 1819 + /* 1820 + * Don't succeed if resizing is disabled, as a reader might be 1821 + * manipulating the ring buffer and is expecting a sane state while 1822 + * this is true. 1823 + */ 1824 + if (atomic_read(&cpu_buffer->resize_disabled)) { 1825 + err = -EBUSY; 1826 + goto out_err_unlock; 1827 + } 1828 1828 1829 1829 cpu_buffer->nr_pages_to_update = nr_pages - 1830 1830 cpu_buffer->nr_pages; ··· 1905 1885 free_buffer_page(bpage); 1906 1886 } 1907 1887 } 1888 + out_err_unlock: 1908 1889 mutex_unlock(&buffer->mutex); 1909 1890 return err; 1910 1891 } ··· 1934 1913 cpu_buffer->reader_page->read); 1935 1914 } 1936 1915 1937 - static __always_inline struct ring_buffer_event * 1938 - rb_iter_head_event(struct ring_buffer_iter *iter) 1939 - { 1940 - return __rb_page_index(iter->head_page, iter->head); 1941 - } 1942 - 1943 1916 static __always_inline unsigned rb_page_commit(struct buffer_page *bpage) 1944 1917 { 1945 1918 return local_read(&bpage->page->commit); 1919 + } 1920 + 1921 + static struct ring_buffer_event * 1922 + rb_iter_head_event(struct ring_buffer_iter *iter) 1923 + { 1924 + struct ring_buffer_event *event; 1925 + struct buffer_page *iter_head_page = iter->head_page; 1926 + unsigned long commit; 1927 + unsigned length; 1928 + 1929 + if (iter->head != iter->next_event) 1930 + return iter->event; 1931 + 1932 + /* 1933 + * When the writer goes across pages, it issues a cmpxchg which 1934 + * is a mb(), which will synchronize with the rmb here. 
1935 + * (see rb_tail_page_update() and __rb_reserve_next()) 1936 + */ 1937 + commit = rb_page_commit(iter_head_page); 1938 + smp_rmb(); 1939 + event = __rb_page_index(iter_head_page, iter->head); 1940 + length = rb_event_length(event); 1941 + 1942 + /* 1943 + * READ_ONCE() doesn't work on functions and we don't want the 1944 + * compiler doing any crazy optimizations with length. 1945 + */ 1946 + barrier(); 1947 + 1948 + if ((iter->head + length) > commit || length > BUF_MAX_DATA_SIZE) 1949 + /* Writer corrupted the read? */ 1950 + goto reset; 1951 + 1952 + memcpy(iter->event, event, length); 1953 + /* 1954 + * If the page stamp is still the same after this rmb() then the 1955 + * event was safely copied without the writer entering the page. 1956 + */ 1957 + smp_rmb(); 1958 + 1959 + /* Make sure the page didn't change since we read this */ 1960 + if (iter->page_stamp != iter_head_page->page->time_stamp || 1961 + commit > rb_page_commit(iter_head_page)) 1962 + goto reset; 1963 + 1964 + iter->next_event = iter->head + length; 1965 + return iter->event; 1966 + reset: 1967 + /* Reset to the beginning */ 1968 + iter->page_stamp = iter->read_stamp = iter->head_page->page->time_stamp; 1969 + iter->head = 0; 1970 + iter->next_event = 0; 1971 + iter->missed_events = 1; 1972 + return NULL; 1946 1973 } 1947 1974 1948 1975 /* Size is determined by what has been committed */ ··· 2028 1959 else 2029 1960 rb_inc_page(cpu_buffer, &iter->head_page); 2030 1961 2031 - iter->read_stamp = iter->head_page->page->time_stamp; 1962 + iter->page_stamp = iter->read_stamp = iter->head_page->page->time_stamp; 2032 1963 iter->head = 0; 1964 + iter->next_event = 0; 2033 1965 } 2034 1966 2035 1967 /* ··· 3617 3547 /* Iterator usage is expected to have record disabled */ 3618 3548 iter->head_page = cpu_buffer->reader_page; 3619 3549 iter->head = cpu_buffer->reader_page->read; 3550 + iter->next_event = iter->head; 3620 3551 3621 3552 iter->cache_reader_page = iter->head_page; 3622 3553 
iter->cache_read = cpu_buffer->read; 3623 3554 3624 - if (iter->head) 3555 + if (iter->head) { 3625 3556 iter->read_stamp = cpu_buffer->read_stamp; 3626 - else 3557 + iter->page_stamp = cpu_buffer->reader_page->page->time_stamp; 3558 + } else { 3627 3559 iter->read_stamp = iter->head_page->page->time_stamp; 3560 + iter->page_stamp = iter->read_stamp; 3561 + } 3628 3562 } 3629 3563 3630 3564 /** ··· 3664 3590 struct buffer_page *reader; 3665 3591 struct buffer_page *head_page; 3666 3592 struct buffer_page *commit_page; 3593 + struct buffer_page *curr_commit_page; 3667 3594 unsigned commit; 3595 + u64 curr_commit_ts; 3596 + u64 commit_ts; 3668 3597 3669 3598 cpu_buffer = iter->cpu_buffer; 3670 - 3671 - /* Remember, trace recording is off when iterator is in use */ 3672 3599 reader = cpu_buffer->reader_page; 3673 3600 head_page = cpu_buffer->head_page; 3674 3601 commit_page = cpu_buffer->commit_page; 3675 - commit = rb_page_commit(commit_page); 3602 + commit_ts = commit_page->page->time_stamp; 3676 3603 3677 - return ((iter->head_page == commit_page && iter->head == commit) || 3604 + /* 3605 + * When the writer goes across pages, it issues a cmpxchg which 3606 + * is a mb(), which will synchronize with the rmb here. 
3607 + * (see rb_tail_page_update()) 3608 + */ 3609 + smp_rmb(); 3610 + commit = rb_page_commit(commit_page); 3611 + /* We want to make sure that the commit page doesn't change */ 3612 + smp_rmb(); 3613 + 3614 + /* Make sure commit page didn't change */ 3615 + curr_commit_page = READ_ONCE(cpu_buffer->commit_page); 3616 + curr_commit_ts = READ_ONCE(curr_commit_page->page->time_stamp); 3617 + 3618 + /* If the commit page changed, then there's more data */ 3619 + if (curr_commit_page != commit_page || 3620 + curr_commit_ts != commit_ts) 3621 + return 0; 3622 + 3623 + /* Still racy, as it may return a false positive, but that's OK */ 3624 + return ((iter->head_page == commit_page && iter->head >= commit) || 3678 3625 (iter->head_page == reader && commit_page == head_page && 3679 3626 head_page->read == commit && 3680 3627 iter->head == rb_page_commit(cpu_buffer->reader_page))); ··· 3923 3828 static void rb_advance_iter(struct ring_buffer_iter *iter) 3924 3829 { 3925 3830 struct ring_buffer_per_cpu *cpu_buffer; 3926 - struct ring_buffer_event *event; 3927 - unsigned length; 3928 3831 3929 3832 cpu_buffer = iter->cpu_buffer; 3833 + 3834 + /* If head == next_event then we need to jump to the next event */ 3835 + if (iter->head == iter->next_event) { 3836 + /* If the event gets overwritten again, there's nothing to do */ 3837 + if (rb_iter_head_event(iter) == NULL) 3838 + return; 3839 + } 3840 + 3841 + iter->head = iter->next_event; 3930 3842 3931 3843 /* 3932 3844 * Check if we are at the end of the buffer. 
3933 3845 */ 3934 - if (iter->head >= rb_page_size(iter->head_page)) { 3846 + if (iter->next_event >= rb_page_size(iter->head_page)) { 3935 3847 /* discarded commits can make the page empty */ 3936 3848 if (iter->head_page == cpu_buffer->commit_page) 3937 3849 return; ··· 3946 3844 return; 3947 3845 } 3948 3846 3949 - event = rb_iter_head_event(iter); 3950 - 3951 - length = rb_event_length(event); 3952 - 3953 - /* 3954 - * This should not be called to advance the header if we are 3955 - * at the tail of the buffer. 3956 - */ 3957 - if (RB_WARN_ON(cpu_buffer, 3958 - (iter->head_page == cpu_buffer->commit_page) && 3959 - (iter->head + length > rb_commit_index(cpu_buffer)))) 3960 - return; 3961 - 3962 - rb_update_iter_read_stamp(iter, event); 3963 - 3964 - iter->head += length; 3965 - 3966 - /* check for end of page padding */ 3967 - if ((iter->head >= rb_page_size(iter->head_page)) && 3968 - (iter->head_page != cpu_buffer->commit_page)) 3969 - rb_inc_iter(iter); 3847 + rb_update_iter_read_stamp(iter, iter->event); 3970 3848 } 3971 3849 3972 3850 static int rb_lost_events(struct ring_buffer_per_cpu *cpu_buffer) ··· 4034 3952 struct ring_buffer_per_cpu *cpu_buffer; 4035 3953 struct ring_buffer_event *event; 4036 3954 int nr_loops = 0; 3955 + bool failed = false; 4037 3956 4038 3957 if (ts) 4039 3958 *ts = 0; ··· 4061 3978 * to a data event, we should never loop more than three times. 4062 3979 * Once for going to next page, once on time extend, and 4063 3980 * finally once to get the event. 4064 - * (We never hit the following condition more than thrice). 3981 + * We should never hit the following condition more than thrice, 3982 + * unless the buffer is very small, and there's a writer 3983 + * that is causing the reader to fail getting an event. 
4065 3984 */ 4066 - if (RB_WARN_ON(cpu_buffer, ++nr_loops > 3)) 3985 + if (++nr_loops > 3) { 3986 + RB_WARN_ON(cpu_buffer, !failed); 4067 3987 return NULL; 3988 + } 4068 3989 4069 3990 if (rb_per_cpu_empty(cpu_buffer)) 4070 3991 return NULL; ··· 4079 3992 } 4080 3993 4081 3994 event = rb_iter_head_event(iter); 3995 + if (!event) { 3996 + failed = true; 3997 + goto again; 3998 + } 4082 3999 4083 4000 switch (event->type_len) { 4084 4001 case RINGBUF_TYPE_PADDING: ··· 4193 4102 return event; 4194 4103 } 4195 4104 4105 + /** ring_buffer_iter_dropped - report if there are dropped events 4106 + * @iter: The ring buffer iterator 4107 + * 4108 + * Returns true if there were dropped events since the last peek. 4109 + */ 4110 + bool ring_buffer_iter_dropped(struct ring_buffer_iter *iter) 4111 + { 4112 + bool ret = iter->missed_events != 0; 4113 + 4114 + iter->missed_events = 0; 4115 + return ret; 4116 + } 4117 + EXPORT_SYMBOL_GPL(ring_buffer_iter_dropped); 4118 + 4196 4119 /** 4197 4120 * ring_buffer_iter_peek - peek at the next event to be read 4198 4121 * @iter: The ring buffer iterator ··· 4313 4208 if (!cpumask_test_cpu(cpu, buffer->cpumask)) 4314 4209 return NULL; 4315 4210 4316 - iter = kmalloc(sizeof(*iter), flags); 4211 + iter = kzalloc(sizeof(*iter), flags); 4317 4212 if (!iter) 4318 4213 return NULL; 4214 + 4215 + iter->event = kmalloc(BUF_MAX_DATA_SIZE, flags); 4216 + if (!iter->event) { 4217 + kfree(iter); 4218 + return NULL; 4219 + } 4319 4220 4320 4221 cpu_buffer = buffer->buffers[cpu]; 4321 4222 4322 4223 iter->cpu_buffer = cpu_buffer; 4323 4224 4324 - atomic_inc(&buffer->resize_disabled); 4325 - atomic_inc(&cpu_buffer->record_disabled); 4225 + atomic_inc(&cpu_buffer->resize_disabled); 4326 4226 4327 4227 return iter; 4328 4228 } ··· 4400 4290 rb_check_pages(cpu_buffer); 4401 4291 raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); 4402 4292 4403 - atomic_dec(&cpu_buffer->record_disabled); 4404 - atomic_dec(&cpu_buffer->buffer->resize_disabled); 4293
+ atomic_dec(&cpu_buffer->resize_disabled); 4294 + kfree(iter->event); 4405 4295 kfree(iter); 4406 4296 } 4407 4297 EXPORT_SYMBOL_GPL(ring_buffer_read_finish); 4408 4298 4409 4299 /** 4410 - * ring_buffer_read - read the next item in the ring buffer by the iterator 4300 + * ring_buffer_iter_advance - advance the iterator to the next location 4411 4301 * @iter: The ring buffer iterator 4412 - * @ts: The time stamp of the event read. 4413 4302 * 4414 - * This reads the next event in the ring buffer and increments the iterator. 4303 + * Move the location of the iterator such that the next read will 4304 + * be the next location of the iterator. 4415 4305 */ 4416 - struct ring_buffer_event * 4417 - ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts) 4306 + void ring_buffer_iter_advance(struct ring_buffer_iter *iter) 4418 4307 { 4419 - struct ring_buffer_event *event; 4420 4308 struct ring_buffer_per_cpu *cpu_buffer = iter->cpu_buffer; 4421 4309 unsigned long flags; 4422 4310 4423 4311 raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags); 4424 - again: 4425 - event = rb_iter_peek(iter, ts); 4426 - if (!event) 4427 - goto out; 4428 - 4429 - if (event->type_len == RINGBUF_TYPE_PADDING) 4430 - goto again; 4431 4312 4432 4313 rb_advance_iter(iter); 4433 - out: 4434 - raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); 4435 4314 4436 - return event; 4315 + raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); 4437 4316 } 4438 - EXPORT_SYMBOL_GPL(ring_buffer_read); 4317 + EXPORT_SYMBOL_GPL(ring_buffer_iter_advance); 4439 4318 4440 4319 /** 4441 4320 * ring_buffer_size - return the size of the ring buffer (in bytes) ··· 4505 4406 if (!cpumask_test_cpu(cpu, buffer->cpumask)) 4506 4407 return; 4507 4408 4508 - atomic_inc(&buffer->resize_disabled); 4409 + atomic_inc(&cpu_buffer->resize_disabled); 4509 4410 atomic_inc(&cpu_buffer->record_disabled); 4510 4411 4511 4412 /* Make sure all commits have finished */ ··· 4526 4427 
raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); 4527 4428 4528 4429 atomic_dec(&cpu_buffer->record_disabled); 4529 - atomic_dec(&buffer->resize_disabled); 4430 + atomic_dec(&cpu_buffer->resize_disabled); 4530 4431 } 4531 4432 EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu); 4532 4433
+93 -17
kernel/trace/trace.c
··· 386 386 * Returns false if @task should be traced. 387 387 */ 388 388 bool 389 - trace_ignore_this_task(struct trace_pid_list *filtered_pids, struct task_struct *task) 389 + trace_ignore_this_task(struct trace_pid_list *filtered_pids, 390 + struct trace_pid_list *filtered_no_pids, 391 + struct task_struct *task) 390 392 { 391 393 /* 392 - * Return false, because if filtered_pids does not exist, 393 - * all pids are good to trace. 394 + * If filtered_no_pids is not empty, and the task's pid is listed 395 + * in filtered_no_pids, then return true. 396 + * Otherwise, if filtered_pids is empty, that means we can 397 + * trace all tasks. If it has content, then only trace pids 398 + * within filtered_pids. 394 399 */ 395 - if (!filtered_pids) 396 - return false; 397 400 398 - return !trace_find_filtered_pid(filtered_pids, task->pid); 401 + return (filtered_pids && 402 + !trace_find_filtered_pid(filtered_pids, task->pid)) || 403 + (filtered_no_pids && 404 + trace_find_filtered_pid(filtered_no_pids, task->pid)); 399 405 } 400 406 401 407 /** ··· 3384 3378 3385 3379 iter->idx++; 3386 3380 if (buf_iter) 3387 - ring_buffer_read(buf_iter, NULL); 3381 + ring_buffer_iter_advance(buf_iter); 3388 3382 } 3389 3383 3390 3384 static struct trace_entry * ··· 3394 3388 struct ring_buffer_event *event; 3395 3389 struct ring_buffer_iter *buf_iter = trace_buffer_iter(iter, cpu); 3396 3390 3397 - if (buf_iter) 3391 + if (buf_iter) { 3398 3392 event = ring_buffer_iter_peek(buf_iter, ts); 3399 - else 3393 + if (lost_events) 3394 + *lost_events = ring_buffer_iter_dropped(buf_iter) ?
3395 + (unsigned long)-1 : 0; 3396 + } else { 3400 3397 event = ring_buffer_peek(iter->array_buffer->buffer, cpu, ts, 3401 3398 lost_events); 3399 + } 3402 3400 3403 3401 if (event) { 3404 3402 iter->ent_size = ring_buffer_event_length(event); ··· 3472 3462 return next; 3473 3463 } 3474 3464 3465 + #define STATIC_TEMP_BUF_SIZE 128 3466 + static char static_temp_buf[STATIC_TEMP_BUF_SIZE]; 3467 + 3475 3468 /* Find the next real entry, without updating the iterator itself */ 3476 3469 struct trace_entry *trace_find_next_entry(struct trace_iterator *iter, 3477 3470 int *ent_cpu, u64 *ent_ts) 3478 3471 { 3479 - return __find_next_entry(iter, ent_cpu, NULL, ent_ts); 3472 + /* __find_next_entry will reset ent_size */ 3473 + int ent_size = iter->ent_size; 3474 + struct trace_entry *entry; 3475 + 3476 + /* 3477 + * If called from ftrace_dump(), then the iter->temp buffer 3478 + * will be the static_temp_buf and not created from kmalloc. 3479 + * If the entry size is greater than the buffer, we can 3480 + * not save it. Just return NULL in that case. This is only 3481 + * used to add markers when two consecutive events' time 3482 + * stamps have a large delta. See trace_print_lat_context() 3483 + */ 3484 + if (iter->temp == static_temp_buf && 3485 + STATIC_TEMP_BUF_SIZE < ent_size) 3486 + return NULL; 3487 + 3488 + /* 3489 + * The __find_next_entry() may call peek_next_entry(), which may 3490 + * call ring_buffer_peek() that may make the contents of iter->ent 3491 + * undefined. Need to copy iter->ent now. 
3492 + */ 3493 + if (iter->ent && iter->ent != iter->temp) { 3494 + if ((!iter->temp || iter->temp_size < iter->ent_size) && 3495 + !WARN_ON_ONCE(iter->temp == static_temp_buf)) { 3496 + kfree(iter->temp); 3497 + iter->temp = kmalloc(iter->ent_size, GFP_KERNEL); 3498 + if (!iter->temp) 3499 + return NULL; 3500 + } 3501 + memcpy(iter->temp, iter->ent, iter->ent_size); 3502 + iter->temp_size = iter->ent_size; 3503 + iter->ent = iter->temp; 3504 + } 3505 + entry = __find_next_entry(iter, ent_cpu, NULL, ent_ts); 3506 + /* Put back the original ent_size */ 3507 + iter->ent_size = ent_size; 3508 + 3509 + return entry; 3480 3510 } 3481 3511 3482 3512 /* Find the next real entry, and increment the iterator to the next entry */ ··· 3588 3538 if (ts >= iter->array_buffer->time_start) 3589 3539 break; 3590 3540 entries++; 3591 - ring_buffer_read(buf_iter, NULL); 3541 + ring_buffer_iter_advance(buf_iter); 3592 3542 } 3593 3543 3594 3544 per_cpu_ptr(iter->array_buffer->data, cpu)->skipped_entries = entries; ··· 4031 3981 enum print_line_t ret; 4032 3982 4033 3983 if (iter->lost_events) { 4034 - trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n", 4035 - iter->cpu, iter->lost_events); 3984 + if (iter->lost_events == (unsigned long)-1) 3985 + trace_seq_printf(&iter->seq, "CPU:%d [LOST EVENTS]\n", 3986 + iter->cpu); 3987 + else 3988 + trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n", 3989 + iter->cpu, iter->lost_events); 4036 3990 if (trace_seq_has_overflowed(&iter->seq)) 4037 3991 return TRACE_TYPE_PARTIAL_LINE; 4038 3992 } ··· 4252 4198 goto release; 4253 4199 4254 4200 /* 4201 + * trace_find_next_entry() may need to save off iter->ent. 4202 + * It will place it into the iter->temp buffer. As most 4203 + * events are less than 128, allocate a buffer of that size. 4204 + * If one is greater, then trace_find_next_entry() will 4205 + * allocate a new buffer to adjust for the bigger iter->ent. 4206 + * It's not critical if it fails to get allocated here. 
4207 + */ 4208 + iter->temp = kmalloc(128, GFP_KERNEL); 4209 + if (iter->temp) 4210 + iter->temp_size = 128; 4211 + 4212 + /* 4255 4213 * We make a copy of the current tracer to avoid concurrent 4256 4214 * changes on it while we are reading. 4257 4215 */ ··· 4303 4237 if (trace_clocks[tr->clock_id].in_ns) 4304 4238 iter->iter_flags |= TRACE_FILE_TIME_IN_NS; 4305 4239 4306 - /* stop the trace while dumping if we are not opening "snapshot" */ 4307 - if (!iter->snapshot) 4240 + /* 4241 + * If pause-on-trace is enabled, then stop the trace while 4242 + * dumping, unless this is the "snapshot" file 4243 + */ 4244 + if (!iter->snapshot && (tr->trace_flags & TRACE_ITER_PAUSE_ON_TRACE)) 4308 4245 tracing_stop_tr(tr); 4309 4246 4310 4247 if (iter->cpu_file == RING_BUFFER_ALL_CPUS) { ··· 4338 4269 fail: 4339 4270 mutex_unlock(&trace_types_lock); 4340 4271 kfree(iter->trace); 4272 + kfree(iter->temp); 4341 4273 kfree(iter->buffer_iter); 4342 4274 release: 4343 4275 seq_release_private(inode, file); ··· 4404 4334 if (iter->trace && iter->trace->close) 4405 4335 iter->trace->close(iter); 4406 4336 4407 - if (!iter->snapshot) 4337 + if (!iter->snapshot && tr->stop_count) 4408 4338 /* reenable tracing if it was previously enabled */ 4409 4339 tracing_start_tr(tr); 4410 4340 ··· 4414 4344 4415 4345 mutex_destroy(&iter->mutex); 4416 4346 free_cpumask_var(iter->started); 4347 + kfree(iter->temp); 4417 4348 kfree(iter->trace); 4418 4349 kfree(iter->buffer_iter); 4419 4350 seq_release_private(inode, file); ··· 5034 4963 #endif /* CONFIG_DYNAMIC_FTRACE */ 5035 4964 #ifdef CONFIG_FUNCTION_TRACER 5036 4965 " set_ftrace_pid\t- Write pid(s) to only function trace those pids\n" 4966 + "\t\t (function)\n" 4967 + " set_ftrace_notrace_pid\t- Write pid(s) to not function trace those pids\n" 5037 4968 "\t\t (function)\n" 5038 4969 #endif 5039 4970 #ifdef CONFIG_FUNCTION_GRAPH_TRACER ··· 9219 9146 9220 9147 /* Simulate the iterator */ 9221 9148 trace_init_global_iter(&iter); 9149 + /* Can not 
use kmalloc for iter.temp */ 9150 + iter.temp = static_temp_buf; 9151 + iter.temp_size = STATIC_TEMP_BUF_SIZE; 9222 9152 9223 9153 for_each_tracing_cpu(cpu) { 9224 9154 atomic_inc(&per_cpu_ptr(iter.array_buffer->data, cpu)->disabled); ··· 9410 9334 goto out_free_buffer_mask; 9411 9335 9412 9336 /* Only allocate trace_printk buffers if a trace_printk exists */ 9413 - if (__stop___trace_bprintk_fmt != __start___trace_bprintk_fmt) 9337 + if (&__stop___trace_bprintk_fmt != &__start___trace_bprintk_fmt) 9414 9338 /* Must be called before global_trace.buffer is allocated */ 9415 9339 trace_printk_init_buffers(); 9416 9340
+31 -8
kernel/trace/trace.h
··· 178 178 kuid_t uid; 179 179 char comm[TASK_COMM_LEN]; 180 180 181 - bool ignore_pid; 182 181 #ifdef CONFIG_FUNCTION_TRACER 183 - bool ftrace_ignore_pid; 182 + int ftrace_ignore_pid; 184 183 #endif 184 + bool ignore_pid; 185 185 }; 186 186 187 187 struct tracer; ··· 206 206 int pid_max; 207 207 unsigned long *pids; 208 208 }; 209 + 210 + enum { 211 + TRACE_PIDS = BIT(0), 212 + TRACE_NO_PIDS = BIT(1), 213 + }; 214 + 215 + static inline bool pid_type_enabled(int type, struct trace_pid_list *pid_list, 216 + struct trace_pid_list *no_pid_list) 217 + { 218 + /* Return true if the pid list in type has pids */ 219 + return ((type & TRACE_PIDS) && pid_list) || 220 + ((type & TRACE_NO_PIDS) && no_pid_list); 221 + } 222 + 223 + static inline bool still_need_pid_events(int type, struct trace_pid_list *pid_list, 224 + struct trace_pid_list *no_pid_list) 225 + { 226 + /* 227 + * Turning off what is in @type, return true if the "other" 228 + * pid list, still has pids in it. 229 + */ 230 + return (!(type & TRACE_PIDS) && pid_list) || 231 + (!(type & TRACE_NO_PIDS) && no_pid_list); 232 + } 209 233 210 234 typedef bool (*cond_update_fn_t)(struct trace_array *tr, void *cond_data); 211 235 ··· 309 285 #endif 310 286 #endif 311 287 struct trace_pid_list __rcu *filtered_pids; 288 + struct trace_pid_list __rcu *filtered_no_pids; 312 289 /* 313 290 * max_lock is used to protect the swapping of buffers 314 291 * when taking a max snapshot. The buffers themselves are ··· 356 331 #ifdef CONFIG_FUNCTION_TRACER 357 332 struct ftrace_ops *ops; 358 333 struct trace_pid_list __rcu *function_pids; 334 + struct trace_pid_list __rcu *function_no_pids; 359 335 #ifdef CONFIG_DYNAMIC_FTRACE 360 336 /* All of these are protected by the ftrace_lock */ 361 337 struct list_head func_probes; ··· 583 557 * caller, and we can skip the current check. 
584 558 */ 585 559 enum { 586 - TRACE_BUFFER_BIT, 587 - TRACE_BUFFER_NMI_BIT, 588 - TRACE_BUFFER_IRQ_BIT, 589 - TRACE_BUFFER_SIRQ_BIT, 590 - 591 - /* Start of function recursion bits */ 560 + /* Function recursion bits */ 592 561 TRACE_FTRACE_BIT, 593 562 TRACE_FTRACE_NMI_BIT, 594 563 TRACE_FTRACE_IRQ_BIT, ··· 808 787 bool trace_find_filtered_pid(struct trace_pid_list *filtered_pids, 809 788 pid_t search_pid); 810 789 bool trace_ignore_this_task(struct trace_pid_list *filtered_pids, 790 + struct trace_pid_list *filtered_no_pids, 811 791 struct task_struct *task); 812 792 void trace_filter_add_remove_task(struct trace_pid_list *pid_list, 813 793 struct task_struct *self, ··· 1329 1307 C(IRQ_INFO, "irq-info"), \ 1330 1308 C(MARKERS, "markers"), \ 1331 1309 C(EVENT_FORK, "event-fork"), \ 1310 + C(PAUSE_ON_TRACE, "pause-on-trace"), \ 1332 1311 FUNCTION_FLAGS \ 1333 1312 FGRAPH_FLAGS \ 1334 1313 STACK_FLAGS \
+3 -1
kernel/trace/trace_entries.h
··· 325 325 __field_desc( long, timestamp, tv_nsec ) 326 326 __field( unsigned int, nmi_count ) 327 327 __field( unsigned int, seqnum ) 328 + __field( unsigned int, count ) 328 329 ), 329 330 330 - F_printk("cnt:%u\tts:%010llu.%010lu\tinner:%llu\touter:%llu\tnmi-ts:%llu\tnmi-count:%u\n", 331 + F_printk("cnt:%u\tts:%010llu.%010lu\tinner:%llu\touter:%llu\tcount:%d\tnmi-ts:%llu\tnmi-count:%u\n", 331 332 __entry->seqnum, 332 333 __entry->tv_sec, 333 334 __entry->tv_nsec, 334 335 __entry->duration, 335 336 __entry->outer_duration, 337 + __entry->count, 336 338 __entry->nmi_total_ts, 337 339 __entry->nmi_count) 338 340 );
+217 -63
kernel/trace/trace_events.c
··· 232 232 { 233 233 struct trace_array *tr = trace_file->tr; 234 234 struct trace_array_cpu *data; 235 + struct trace_pid_list *no_pid_list; 235 236 struct trace_pid_list *pid_list; 236 237 237 238 pid_list = rcu_dereference_raw(tr->filtered_pids); 238 - if (!pid_list) 239 + no_pid_list = rcu_dereference_raw(tr->filtered_no_pids); 240 + 241 + if (!pid_list && !no_pid_list) 239 242 return false; 240 243 241 244 data = this_cpu_ptr(tr->array_buffer.data); ··· 513 510 514 511 pid_list = rcu_dereference_raw(tr->filtered_pids); 515 512 trace_filter_add_remove_task(pid_list, NULL, task); 513 + 514 + pid_list = rcu_dereference_raw(tr->filtered_no_pids); 515 + trace_filter_add_remove_task(pid_list, NULL, task); 516 516 } 517 517 518 518 static void ··· 527 521 struct trace_array *tr = data; 528 522 529 523 pid_list = rcu_dereference_sched(tr->filtered_pids); 524 + trace_filter_add_remove_task(pid_list, self, task); 525 + 526 + pid_list = rcu_dereference_sched(tr->filtered_no_pids); 530 527 trace_filter_add_remove_task(pid_list, self, task); 531 528 } 532 529 ··· 553 544 struct task_struct *prev, struct task_struct *next) 554 545 { 555 546 struct trace_array *tr = data; 547 + struct trace_pid_list *no_pid_list; 556 548 struct trace_pid_list *pid_list; 549 + bool ret; 557 550 558 551 pid_list = rcu_dereference_sched(tr->filtered_pids); 552 + no_pid_list = rcu_dereference_sched(tr->filtered_no_pids); 559 553 560 - this_cpu_write(tr->array_buffer.data->ignore_pid, 561 - trace_ignore_this_task(pid_list, prev) && 562 - trace_ignore_this_task(pid_list, next)); 554 + /* 555 + * Sched switch is funny, as we only want to ignore it 556 + * in the notrace case if both prev and next should be ignored. 
557 + */ 558 + ret = trace_ignore_this_task(NULL, no_pid_list, prev) && 559 + trace_ignore_this_task(NULL, no_pid_list, next); 560 + 561 + this_cpu_write(tr->array_buffer.data->ignore_pid, ret || 562 + (trace_ignore_this_task(pid_list, NULL, prev) && 563 + trace_ignore_this_task(pid_list, NULL, next))); 563 564 } 564 565 565 566 static void ··· 577 558 struct task_struct *prev, struct task_struct *next) 578 559 { 579 560 struct trace_array *tr = data; 561 + struct trace_pid_list *no_pid_list; 580 562 struct trace_pid_list *pid_list; 581 563 582 564 pid_list = rcu_dereference_sched(tr->filtered_pids); 565 + no_pid_list = rcu_dereference_sched(tr->filtered_no_pids); 583 566 584 567 this_cpu_write(tr->array_buffer.data->ignore_pid, 585 - trace_ignore_this_task(pid_list, next)); 568 + trace_ignore_this_task(pid_list, no_pid_list, next)); 586 569 } 587 570 588 571 static void 589 572 event_filter_pid_sched_wakeup_probe_pre(void *data, struct task_struct *task) 590 573 { 591 574 struct trace_array *tr = data; 575 + struct trace_pid_list *no_pid_list; 592 576 struct trace_pid_list *pid_list; 593 577 594 578 /* Nothing to do if we are already tracing */ ··· 599 577 return; 600 578 601 579 pid_list = rcu_dereference_sched(tr->filtered_pids); 580 + no_pid_list = rcu_dereference_sched(tr->filtered_no_pids); 602 581 603 582 this_cpu_write(tr->array_buffer.data->ignore_pid, 604 - trace_ignore_this_task(pid_list, task)); 583 + trace_ignore_this_task(pid_list, no_pid_list, task)); 605 584 } 606 585 607 586 static void 608 587 event_filter_pid_sched_wakeup_probe_post(void *data, struct task_struct *task) 609 588 { 610 589 struct trace_array *tr = data; 590 + struct trace_pid_list *no_pid_list; 611 591 struct trace_pid_list *pid_list; 612 592 613 593 /* Nothing to do if we are not tracing */ ··· 617 593 return; 618 594 619 595 pid_list = rcu_dereference_sched(tr->filtered_pids); 596 + no_pid_list = rcu_dereference_sched(tr->filtered_no_pids); 620 597 621 598 /* Set tracing if 
current is enabled */ 622 599 this_cpu_write(tr->array_buffer.data->ignore_pid, 623 - trace_ignore_this_task(pid_list, current)); 600 + trace_ignore_this_task(pid_list, no_pid_list, current)); 624 601 } 625 602 626 - static void __ftrace_clear_event_pids(struct trace_array *tr) 603 + static void unregister_pid_events(struct trace_array *tr) 627 604 { 628 - struct trace_pid_list *pid_list; 629 - struct trace_event_file *file; 630 - int cpu; 631 - 632 - pid_list = rcu_dereference_protected(tr->filtered_pids, 633 - lockdep_is_held(&event_mutex)); 634 - if (!pid_list) 635 - return; 636 - 637 605 unregister_trace_sched_switch(event_filter_pid_sched_switch_probe_pre, tr); 638 606 unregister_trace_sched_switch(event_filter_pid_sched_switch_probe_post, tr); 639 607 ··· 637 621 638 622 unregister_trace_sched_waking(event_filter_pid_sched_wakeup_probe_pre, tr); 639 623 unregister_trace_sched_waking(event_filter_pid_sched_wakeup_probe_post, tr); 624 + } 640 625 641 - list_for_each_entry(file, &tr->events, list) { 642 - clear_bit(EVENT_FILE_FL_PID_FILTER_BIT, &file->flags); 626 + static void __ftrace_clear_event_pids(struct trace_array *tr, int type) 627 + { 628 + struct trace_pid_list *pid_list; 629 + struct trace_pid_list *no_pid_list; 630 + struct trace_event_file *file; 631 + int cpu; 632 + 633 + pid_list = rcu_dereference_protected(tr->filtered_pids, 634 + lockdep_is_held(&event_mutex)); 635 + no_pid_list = rcu_dereference_protected(tr->filtered_no_pids, 636 + lockdep_is_held(&event_mutex)); 637 + 638 + /* Make sure there's something to do */ 639 + if (!pid_type_enabled(type, pid_list, no_pid_list)) 640 + return; 641 + 642 + if (!still_need_pid_events(type, pid_list, no_pid_list)) { 643 + unregister_pid_events(tr); 644 + 645 + list_for_each_entry(file, &tr->events, list) { 646 + clear_bit(EVENT_FILE_FL_PID_FILTER_BIT, &file->flags); 647 + } 648 + 649 + for_each_possible_cpu(cpu) 650 + per_cpu_ptr(tr->array_buffer.data, cpu)->ignore_pid = false; 643 651 } 644 652 645 - 
for_each_possible_cpu(cpu) 646 - per_cpu_ptr(tr->array_buffer.data, cpu)->ignore_pid = false; 653 + if (type & TRACE_PIDS) 654 + rcu_assign_pointer(tr->filtered_pids, NULL); 647 655 648 - rcu_assign_pointer(tr->filtered_pids, NULL); 656 + if (type & TRACE_NO_PIDS) 657 + rcu_assign_pointer(tr->filtered_no_pids, NULL); 649 658 650 659 /* Wait till all users are no longer using pid filtering */ 651 660 tracepoint_synchronize_unregister(); 652 661 653 - trace_free_pid_list(pid_list); 662 + if ((type & TRACE_PIDS) && pid_list) 663 + trace_free_pid_list(pid_list); 664 + 665 + if ((type & TRACE_NO_PIDS) && no_pid_list) 666 + trace_free_pid_list(no_pid_list); 654 667 } 655 668 656 - static void ftrace_clear_event_pids(struct trace_array *tr) 669 + static void ftrace_clear_event_pids(struct trace_array *tr, int type) 657 670 { 658 671 mutex_lock(&event_mutex); 659 - __ftrace_clear_event_pids(tr); 672 + __ftrace_clear_event_pids(tr, type); 660 673 mutex_unlock(&event_mutex); 661 674 } 662 675 ··· 1058 1013 } 1059 1014 1060 1015 static void * 1061 - p_next(struct seq_file *m, void *v, loff_t *pos) 1016 + __next(struct seq_file *m, void *v, loff_t *pos, int type) 1062 1017 { 1063 1018 struct trace_array *tr = m->private; 1064 - struct trace_pid_list *pid_list = rcu_dereference_sched(tr->filtered_pids); 1019 + struct trace_pid_list *pid_list; 1020 + 1021 + if (type == TRACE_PIDS) 1022 + pid_list = rcu_dereference_sched(tr->filtered_pids); 1023 + else 1024 + pid_list = rcu_dereference_sched(tr->filtered_no_pids); 1065 1025 1066 1026 return trace_pid_next(pid_list, v, pos); 1067 1027 } 1068 1028 1069 - static void *p_start(struct seq_file *m, loff_t *pos) 1029 + static void * 1030 + p_next(struct seq_file *m, void *v, loff_t *pos) 1031 + { 1032 + return __next(m, v, pos, TRACE_PIDS); 1033 + } 1034 + 1035 + static void * 1036 + np_next(struct seq_file *m, void *v, loff_t *pos) 1037 + { 1038 + return __next(m, v, pos, TRACE_NO_PIDS); 1039 + } 1040 + 1041 + static void 
*__start(struct seq_file *m, loff_t *pos, int type) 1070 1042 __acquires(RCU) 1071 1043 { 1072 1044 struct trace_pid_list *pid_list; ··· 1098 1036 mutex_lock(&event_mutex); 1099 1037 rcu_read_lock_sched(); 1100 1038 1101 - pid_list = rcu_dereference_sched(tr->filtered_pids); 1039 + if (type == TRACE_PIDS) 1040 + pid_list = rcu_dereference_sched(tr->filtered_pids); 1041 + else 1042 + pid_list = rcu_dereference_sched(tr->filtered_no_pids); 1102 1043 1103 1044 if (!pid_list) 1104 1045 return NULL; 1105 1046 1106 1047 return trace_pid_start(pid_list, pos); 1048 + } 1049 + 1050 + static void *p_start(struct seq_file *m, loff_t *pos) 1051 + __acquires(RCU) 1052 + { 1053 + return __start(m, pos, TRACE_PIDS); 1054 + } 1055 + 1056 + static void *np_start(struct seq_file *m, loff_t *pos) 1057 + __acquires(RCU) 1058 + { 1059 + return __start(m, pos, TRACE_NO_PIDS); 1107 1060 } 1108 1061 1109 1062 static void p_stop(struct seq_file *m, void *p) ··· 1665 1588 { 1666 1589 struct trace_array *tr = data; 1667 1590 struct trace_pid_list *pid_list; 1591 + struct trace_pid_list *no_pid_list; 1668 1592 1669 1593 /* 1670 1594 * This function is called by on_each_cpu() while the ··· 1673 1595 */ 1674 1596 pid_list = rcu_dereference_protected(tr->filtered_pids, 1675 1597 mutex_is_locked(&event_mutex)); 1598 + no_pid_list = rcu_dereference_protected(tr->filtered_no_pids, 1599 + mutex_is_locked(&event_mutex)); 1676 1600 1677 1601 this_cpu_write(tr->array_buffer.data->ignore_pid, 1678 - trace_ignore_this_task(pid_list, no_pid_list, current)); 1602 + trace_ignore_this_task(pid_list, no_pid_list, current)); 1603 + } 1604 + 1605 + static void register_pid_events(struct trace_array *tr) 1606 + { 1607 + /* 1608 + * Register a probe that is called before all other probes 1609 + * to set ignore_pid if next or prev do not match. 1610 + * Register a probe that is called after all other probes 1611 + * to only keep ignore_pid set if next pid matches.
1612 + */ 1613 + register_trace_prio_sched_switch(event_filter_pid_sched_switch_probe_pre, 1614 + tr, INT_MAX); 1615 + register_trace_prio_sched_switch(event_filter_pid_sched_switch_probe_post, 1616 + tr, 0); 1617 + 1618 + register_trace_prio_sched_wakeup(event_filter_pid_sched_wakeup_probe_pre, 1619 + tr, INT_MAX); 1620 + register_trace_prio_sched_wakeup(event_filter_pid_sched_wakeup_probe_post, 1621 + tr, 0); 1622 + 1623 + register_trace_prio_sched_wakeup_new(event_filter_pid_sched_wakeup_probe_pre, 1624 + tr, INT_MAX); 1625 + register_trace_prio_sched_wakeup_new(event_filter_pid_sched_wakeup_probe_post, 1626 + tr, 0); 1627 + 1628 + register_trace_prio_sched_waking(event_filter_pid_sched_wakeup_probe_pre, 1629 + tr, INT_MAX); 1630 + register_trace_prio_sched_waking(event_filter_pid_sched_wakeup_probe_post, 1631 + tr, 0); 1679 1632 } 1680 1633 1681 1634 static ssize_t 1682 - ftrace_event_pid_write(struct file *filp, const char __user *ubuf, 1683 - size_t cnt, loff_t *ppos) 1635 + event_pid_write(struct file *filp, const char __user *ubuf, 1636 + size_t cnt, loff_t *ppos, int type) 1684 1637 { 1685 1638 struct seq_file *m = filp->private_data; 1686 1639 struct trace_array *tr = m->private; 1687 1640 struct trace_pid_list *filtered_pids = NULL; 1641 + struct trace_pid_list *other_pids = NULL; 1688 1642 struct trace_pid_list *pid_list; 1689 1643 struct trace_event_file *file; 1690 1644 ssize_t ret; ··· 1730 1620 1731 1621 mutex_lock(&event_mutex); 1732 1622 1733 - filtered_pids = rcu_dereference_protected(tr->filtered_pids, 1734 - lockdep_is_held(&event_mutex)); 1623 + if (type == TRACE_PIDS) { 1624 + filtered_pids = rcu_dereference_protected(tr->filtered_pids, 1625 + lockdep_is_held(&event_mutex)); 1626 + other_pids = rcu_dereference_protected(tr->filtered_no_pids, 1627 + lockdep_is_held(&event_mutex)); 1628 + } else { 1629 + filtered_pids = rcu_dereference_protected(tr->filtered_no_pids, 1630 + lockdep_is_held(&event_mutex)); 1631 + other_pids = 
 		rcu_dereference_protected(tr->filtered_pids,
+					lockdep_is_held(&event_mutex));
+	}
 
 	ret = trace_pid_write(filtered_pids, &pid_list, ubuf, cnt);
 	if (ret < 0)
 		goto out;
 
-	rcu_assign_pointer(tr->filtered_pids, pid_list);
+	if (type == TRACE_PIDS)
+		rcu_assign_pointer(tr->filtered_pids, pid_list);
+	else
+		rcu_assign_pointer(tr->filtered_no_pids, pid_list);
 
 	list_for_each_entry(file, &tr->events, list) {
 		set_bit(EVENT_FILE_FL_PID_FILTER_BIT, &file->flags);
···
 	if (filtered_pids) {
 		tracepoint_synchronize_unregister();
 		trace_free_pid_list(filtered_pids);
-	} else if (pid_list) {
-		/*
-		 * Register a probe that is called before all other probes
-		 * to set ignore_pid if next or prev do not match.
-		 * Register a probe this is called after all other probes
-		 * to only keep ignore_pid set if next pid matches.
-		 */
-		register_trace_prio_sched_switch(event_filter_pid_sched_switch_probe_pre,
-						 tr, INT_MAX);
-		register_trace_prio_sched_switch(event_filter_pid_sched_switch_probe_post,
-						 tr, 0);
-
-		register_trace_prio_sched_wakeup(event_filter_pid_sched_wakeup_probe_pre,
-						 tr, INT_MAX);
-		register_trace_prio_sched_wakeup(event_filter_pid_sched_wakeup_probe_post,
-						 tr, 0);
-
-		register_trace_prio_sched_wakeup_new(event_filter_pid_sched_wakeup_probe_pre,
-						     tr, INT_MAX);
-		register_trace_prio_sched_wakeup_new(event_filter_pid_sched_wakeup_probe_post,
-						     tr, 0);
-
-		register_trace_prio_sched_waking(event_filter_pid_sched_wakeup_probe_pre,
-						 tr, INT_MAX);
-		register_trace_prio_sched_waking(event_filter_pid_sched_wakeup_probe_post,
-						 tr, 0);
+	} else if (pid_list && !other_pids) {
+		register_pid_events(tr);
 	}
 
 	/*
···
 	return ret;
 }
 
+static ssize_t
+ftrace_event_pid_write(struct file *filp, const char __user *ubuf,
+		       size_t cnt, loff_t *ppos)
+{
+	return event_pid_write(filp, ubuf, cnt, ppos, TRACE_PIDS);
+}
+
+static ssize_t
+ftrace_event_npid_write(struct file *filp, const char __user *ubuf,
+			size_t cnt, loff_t *ppos)
+{
+	return event_pid_write(filp, ubuf, cnt, ppos, TRACE_NO_PIDS);
+}
+
 static int ftrace_event_avail_open(struct inode *inode, struct file *file);
 static int ftrace_event_set_open(struct inode *inode, struct file *file);
 static int ftrace_event_set_pid_open(struct inode *inode, struct file *file);
+static int ftrace_event_set_npid_open(struct inode *inode, struct file *file);
 static int ftrace_event_release(struct inode *inode, struct file *file);
 
 static const struct seq_operations show_event_seq_ops = {
···
 	.stop = p_stop,
 };
 
+static const struct seq_operations show_set_no_pid_seq_ops = {
+	.start = np_start,
+	.next = np_next,
+	.show = trace_pid_show,
+	.stop = p_stop,
+};
+
 static const struct file_operations ftrace_avail_fops = {
 	.open = ftrace_event_avail_open,
 	.read = seq_read,
···
 	.open = ftrace_event_set_pid_open,
 	.read = seq_read,
 	.write = ftrace_event_pid_write,
+	.llseek = seq_lseek,
+	.release = ftrace_event_release,
+};
+
+static const struct file_operations ftrace_set_event_notrace_pid_fops = {
+	.open = ftrace_event_set_npid_open,
+	.read = seq_read,
+	.write = ftrace_event_npid_write,
 	.llseek = seq_lseek,
 	.release = ftrace_event_release,
 };
···
 
 	if ((file->f_mode & FMODE_WRITE) &&
 	    (file->f_flags & O_TRUNC))
-		ftrace_clear_event_pids(tr);
+		ftrace_clear_event_pids(tr, TRACE_PIDS);
+
+	ret = ftrace_event_open(inode, file, seq_ops);
+	if (ret < 0)
+		trace_array_put(tr);
+	return ret;
+}
+
+static int
+ftrace_event_set_npid_open(struct inode *inode, struct file *file)
+{
+	const struct seq_operations *seq_ops = &show_set_no_pid_seq_ops;
+	struct trace_array *tr = inode->i_private;
+	int ret;
+
+	ret = tracing_check_open_get_tr(tr);
+	if (ret)
+		return ret;
+
+	if ((file->f_mode & FMODE_WRITE) &&
+	    (file->f_flags & O_TRUNC))
+		ftrace_clear_event_pids(tr, TRACE_NO_PIDS);
 
 	ret = ftrace_event_open(inode, file, seq_ops);
 	if (ret < 0)
···
 	if (!entry)
 		pr_warn("Could not create tracefs 'set_event_pid' entry\n");
 
+	entry = tracefs_create_file("set_event_notrace_pid", 0644, parent,
+				    tr, &ftrace_set_event_notrace_pid_fops);
+	if (!entry)
+		pr_warn("Could not create tracefs 'set_event_notrace_pid' entry\n");
+
 	/* ring buffer internal formats */
 	entry = trace_create_file("header_page", 0444, d_events,
 				  ring_buffer_print_page_header,
···
 	clear_event_triggers(tr);
 
 	/* Clear the pid list */
-	__ftrace_clear_event_pids(tr);
+	__ftrace_clear_event_pids(tr, TRACE_PIDS | TRACE_NO_PIDS);
 
 	/* Disable any running events */
 	__ftrace_set_clr_event_nolock(tr, NULL, NULL, NULL, 0);
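For reference, a minimal usage sketch of the new set_event_notrace_pid file that the trace_events.c changes above create. This is not part of the patch: the tracefs path is just the conventional mount point, and the block is guarded so it degrades to a note where tracefs is absent or read-only:

```shell
# Hide the current shell's own events from event tracing.
# Assumes tracefs at its usual mount point; guarded so this is
# safe to run on systems where it is missing or not writable.
TRACEFS=/sys/kernel/tracing
if [ -w "$TRACEFS/set_event_notrace_pid" ]; then
	echo $$ > "$TRACEFS/set_event_notrace_pid"
	status="events from PID $$ are now filtered out"
else
	status="tracefs unavailable; commands shown for illustration"
fi
echo "$status"
```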
+1 -1
kernel/trace/trace_functions_graph.c
···
 
 	/* this is a leaf, now advance the iterator */
 	if (ring_iter)
-		ring_buffer_read(ring_iter, NULL);
+		ring_buffer_iter_advance(ring_iter);
 
 	return next;
 }
+17 -7
kernel/trace/trace_hwlat.c
···
 	u64			nmi_total_ts;	/* Total time spent in NMIs */
 	struct timespec64	timestamp;	/* wall time */
 	int			nmi_count;	/* # NMIs during this sample */
+	int			count;		/* # of iterations over thresh */
 };
 
 /* keep the global state somewhere. */
···
 	entry->timestamp		= sample->timestamp;
 	entry->nmi_total_ts		= sample->nmi_total_ts;
 	entry->nmi_count		= sample->nmi_count;
+	entry->count			= sample->count;
 
 	if (!call_filter_check_discard(call, entry, buffer, event))
 		trace_buffer_unlock_commit_nostack(buffer, event);
···
 static int get_sample(void)
 {
 	struct trace_array *tr = hwlat_trace;
+	struct hwlat_sample s;
 	time_type start, t1, t2, last_t2;
-	s64 diff, total, last_total = 0;
+	s64 diff, outer_diff, total, last_total = 0;
 	u64 sample = 0;
 	u64 thresh = tracing_thresh;
 	u64 outer_sample = 0;
 	int ret = -1;
+	unsigned int count = 0;
 
 	do_div(thresh, NSEC_PER_USEC); /* modifies interval value */
 
···
 
 	init_time(last_t2, 0);
 	start = time_get(); /* start timestamp */
+	outer_diff = 0;
 
 	do {
 
···
 
 		if (time_u64(last_t2)) {
 			/* Check the delta from outer loop (t2 to next t1) */
-			diff = time_to_us(time_sub(t1, last_t2));
+			outer_diff = time_to_us(time_sub(t1, last_t2));
 			/* This shouldn't happen */
-			if (diff < 0) {
+			if (outer_diff < 0) {
 				pr_err(BANNER "time running backwards\n");
 				goto out;
 			}
-			if (diff > outer_sample)
-				outer_sample = diff;
+			if (outer_diff > outer_sample)
+				outer_sample = outer_diff;
 		}
 		last_t2 = t2;
 
···
 
 		/* This checks the inner loop (t1 to t2) */
 		diff = time_to_us(time_sub(t2, t1)); /* current diff */
+
+		if (diff > thresh || outer_diff > thresh) {
+			if (!count)
+				ktime_get_real_ts64(&s.timestamp);
+			count++;
+		}
 
 		/* This shouldn't happen */
 		if (diff < 0) {
···
 
 	/* If we exceed the threshold value, we have found a hardware latency */
 	if (sample > thresh || outer_sample > thresh) {
-		struct hwlat_sample s;
 		u64 latency;
 
 		ret = 1;
···
 		s.seqnum = hwlat_data.count;
 		s.duration = sample;
 		s.outer_duration = outer_sample;
-		ktime_get_real_ts64(&s.timestamp);
 		s.nmi_total_ts = nmi_total_ts;
 		s.nmi_count = nmi_count;
+		s.count = count;
 		trace_hwlat_sample(&s);
 
 		latency = max(sample, outer_sample);
+2
kernel/trace/trace_kprobe.c
···
 	int i;
 
 	seq_putc(m, trace_kprobe_is_return(tk) ? 'r' : 'p');
+	if (trace_kprobe_is_return(tk) && tk->rp.maxactive)
+		seq_printf(m, "%d", tk->rp.maxactive);
 	seq_printf(m, ":%s/%s", trace_probe_group_name(&tk->tp),
 				trace_probe_name(&tk->tp));
 
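The trace_kprobe.c hunk above makes a kretprobe's maxactive count survive a read-back of kprobe_events: an event created as `r10:...` previously read back as plain `r:...`. A shell emulation of the new output format; the group/event name and probed symbol here are made up, and no live tracefs is touched:

```shell
# Mimic the print path: 'r' or 'p' first, then maxactive for kretprobes.
is_return=1
maxactive=10
line=$([ "$is_return" -eq 1 ] && echo r || echo p)
if [ "$is_return" -eq 1 ] && [ "$maxactive" -ne 0 ]; then
	line="${line}${maxactive}"	# the newly added seq_printf()
fi
line="${line}:kprobes/myretprobe do_sys_open"
echo "$line"	# r10:kprobes/myretprobe do_sys_open
```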
+8 -11
kernel/trace/trace_output.c
···
 
 int trace_print_lat_context(struct trace_iterator *iter)
 {
+	struct trace_entry *entry, *next_entry;
 	struct trace_array *tr = iter->tr;
-	/* trace_find_next_entry will reset ent_size */
-	int ent_size = iter->ent_size;
 	struct trace_seq *s = &iter->seq;
-	u64 next_ts;
-	struct trace_entry *entry = iter->ent,
-			   *next_entry = trace_find_next_entry(iter, NULL,
-							       &next_ts);
 	unsigned long verbose = (tr->trace_flags & TRACE_ITER_VERBOSE);
+	u64 next_ts;
 
-	/* Restore the original ent_size */
-	iter->ent_size = ent_size;
-
+	next_entry = trace_find_next_entry(iter, NULL, &next_ts);
 	if (!next_entry)
 		next_ts = iter->ts;
+
+	/* trace_find_next_entry() may change iter->ent */
+	entry = iter->ent;
 
 	if (verbose) {
 		char comm[TASK_COMM_LEN];
···
 
 	trace_assign_type(field, entry);
 
-	trace_seq_printf(s, "#%-5u inner/outer(us): %4llu/%-5llu ts:%lld.%09ld",
+	trace_seq_printf(s, "#%-5u inner/outer(us): %4llu/%-5llu ts:%lld.%09ld count:%d",
 			 field->seqnum,
 			 field->duration,
 			 field->outer_duration,
 			 (long long)field->timestamp.tv_sec,
-			 field->timestamp.tv_nsec);
+			 field->timestamp.tv_nsec, field->count);
 
 	if (field->nmi_count) {
 		/*
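With the count field appended, a hwlat sample line in the trace output grows a trailing `count:` column. A printf(1) emulation of the new format string (the sample values are fabricated, and shell printf lacks the kernel's ll length modifiers, so plain %u/%d stand in):

```shell
# Render one hwlat sample line in the post-patch format.
seqnum=3 duration=12 outer=14 sec=1585747124 nsec=123456789 count=2
out=$(printf '#%-5u inner/outer(us): %4u/%-5u ts:%d.%09d count:%d' \
	"$seqnum" "$duration" "$outer" "$sec" "$nsec" "$count")
echo "$out"
```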
+26 -9
lib/bootconfig.c
···
 static char *xbc_data __initdata;
 static size_t xbc_data_size __initdata;
 static struct xbc_node *last_parent __initdata;
+static const char *xbc_err_msg __initdata;
+static int xbc_err_pos __initdata;
 
 static int __init xbc_parse_error(const char *msg, const char *p)
 {
-	int pos = p - xbc_data;
+	xbc_err_msg = msg;
+	xbc_err_pos = (int)(p - xbc_data);
 
-	pr_err("Parse error at pos %d: %s\n", pos, msg);
 	return -EINVAL;
 }
 
···
 /**
  * xbc_init() - Parse given XBC file and build XBC internal tree
  * @buf: boot config text
+ * @emsg: A pointer of const char * to store the error message
+ * @epos: A pointer of int to store the error position
  *
  * This parses the boot config text in @buf. @buf must be a
  * null terminated string and smaller than XBC_DATA_MAX.
  * Return the number of stored nodes (>0) if succeeded, or -errno
  * if there is any error.
+ * In error cases, @emsg will be updated with an error message and
+ * @epos will be updated with the error position which is the byte offset
+ * of @buf. If the error is not a parser error, @epos will be -1.
  */
-int __init xbc_init(char *buf)
+int __init xbc_init(char *buf, const char **emsg, int *epos)
 {
 	char *p, *q;
 	int ret, c;
 
+	if (epos)
+		*epos = -1;
+
 	if (xbc_data) {
-		pr_err("Error: bootconfig is already initialized.\n");
+		if (emsg)
+			*emsg = "Bootconfig is already initialized";
 		return -EBUSY;
 	}
 
 	ret = strlen(buf);
 	if (ret > XBC_DATA_MAX - 1 || ret == 0) {
-		pr_err("Error: Config data is %s.\n",
-			ret ? "too big" : "empty");
+		if (emsg)
+			*emsg = ret ? "Config data is too big" :
+				"Config data is empty";
 		return -ERANGE;
 	}
 
 	xbc_nodes = memblock_alloc(sizeof(struct xbc_node) * XBC_NODE_MAX,
 				   SMP_CACHE_BYTES);
 	if (!xbc_nodes) {
-		pr_err("Failed to allocate memory for bootconfig nodes.\n");
+		if (emsg)
+			*emsg = "Failed to allocate bootconfig nodes";
 		return -ENOMEM;
 	}
 	memset(xbc_nodes, 0, sizeof(struct xbc_node) * XBC_NODE_MAX);
···
 	if (!ret)
 		ret = xbc_verify_tree();
 
-	if (ret < 0)
+	if (ret < 0) {
+		if (epos)
+			*epos = xbc_err_pos;
+		if (emsg)
+			*emsg = xbc_err_msg;
 		xbc_destroy_all();
-	else
+	} else
 		ret = xbc_node_num;
 
 	return ret;
+17 -10
tools/bootconfig/Makefile
···
 # SPDX-License-Identifier: GPL-2.0
 # Makefile for bootconfig command
+include ../scripts/Makefile.include
 
 bindir ?= /usr/bin
 
-HEADER = include/linux/bootconfig.h
-CFLAGS = -Wall -g -I./include
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(CURDIR)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+endif
 
-PROGS = bootconfig
+LIBSRC = $(srctree)/lib/bootconfig.c $(srctree)/include/linux/bootconfig.h
+CFLAGS = -Wall -g -I$(CURDIR)/include
 
-all: $(PROGS)
+ALL_TARGETS := bootconfig
+ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
 
-bootconfig: ../../lib/bootconfig.c main.c $(HEADER)
+all: $(ALL_PROGRAMS)
+
+$(OUTPUT)bootconfig: main.c $(LIBSRC)
 	$(CC) $(filter %.c,$^) $(CFLAGS) -o $@
 
-install: $(PROGS)
-	install bootconfig $(DESTDIR)$(bindir)
+test: $(ALL_PROGRAMS) test-bootconfig.sh
+	./test-bootconfig.sh $(OUTPUT)
 
-test: bootconfig
-	./test-bootconfig.sh
+install: $(ALL_PROGRAMS)
+	install $(OUTPUT)bootconfig $(DESTDIR)$(bindir)
 
 clean:
-	$(RM) -f *.o bootconfig
+	$(RM) -f $(OUTPUT)*.o $(ALL_PROGRAMS)
+31 -4
tools/bootconfig/main.c
···
 	int ret;
 	u32 size = 0, csum = 0, rcsum;
 	char magic[BOOTCONFIG_MAGIC_LEN];
+	const char *msg;
 
 	ret = fstat(fd, &stat);
 	if (ret < 0)
···
 		return -EINVAL;
 	}
 
-	ret = xbc_init(*buf);
+	ret = xbc_init(*buf, &msg, NULL);
 	/* Wrong data */
-	if (ret < 0)
+	if (ret < 0) {
+		pr_err("parse error: %s.\n", msg);
 		return ret;
+	}
 
 	return size;
 }
···
 	return ret;
 }
 
+static void show_xbc_error(const char *data, const char *msg, int pos)
+{
+	int lin = 1, col, i;
+
+	if (pos < 0) {
+		pr_err("Error: %s.\n", msg);
+		return;
+	}
+
+	/* Note that pos starts from 0 but lin and col should start from 1. */
+	col = pos + 1;
+	for (i = 0; i < pos; i++) {
+		if (data[i] == '\n') {
+			lin++;
+			col = pos - i;
+		}
+	}
+	pr_err("Parse Error: %s at %d:%d\n", msg, lin, col);
+
+}
+
 int apply_xbc(const char *path, const char *xbc_path)
 {
 	u32 size, csum;
 	char *buf, *data;
 	int ret, fd;
+	const char *msg;
+	int pos;
 
 	ret = load_xbc_file(xbc_path, &buf);
 	if (ret < 0) {
···
 	*(u32 *)(data + size + 4) = csum;
 
 	/* Check the data format */
-	ret = xbc_init(buf);
+	ret = xbc_init(buf, &msg, &pos);
 	if (ret < 0) {
-		pr_err("Failed to parse %s: %d\n", xbc_path, ret);
+		show_xbc_error(data, msg, pos);
 		free(data);
 		free(buf);
+
 		return ret;
 	}
 	printf("Apply %s to %s\n", xbc_path, path);
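show_xbc_error() above converts the 0-based byte offset reported through xbc_init() into a 1-based line:column pair. The same walk transliterated into awk, run over a made-up two-line config where offset 16 lands on the `s` of `syntax`:

```shell
# Replicate the C loop: count newlines before pos, resetting the column.
result=$(awk 'BEGIN {
	data = "key = value\nbad syntax here"
	pos = 16                     # byte offset, 0-based, as from xbc_init()
	lin = 1; col = pos + 1
	for (i = 1; i <= pos; i++) { # substr index i maps to C index i-1
		if (substr(data, i, 1) == "\n") {
			lin++
			col = pos - (i - 1)
		}
	}
	printf "%d:%d", lin, col
}')
echo "$result"	# 2:5 -> line 2, column 5
```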
+10 -4
tools/bootconfig/test-bootconfig.sh
···
 
 echo "Boot config test script"
 
-BOOTCONF=./bootconfig
-INITRD=`mktemp initrd-XXXX`
-TEMPCONF=`mktemp temp-XXXX.bconf`
+if [ -d "$1" ]; then
+  TESTDIR=$1
+else
+  TESTDIR=.
+fi
+BOOTCONF=${TESTDIR}/bootconfig
+
+INITRD=`mktemp ${TESTDIR}/initrd-XXXX`
+TEMPCONF=`mktemp ${TESTDIR}/temp-XXXX.bconf`
+OUTFILE=`mktemp ${TESTDIR}/tempout-XXXX`
 NG=0
 
 cleanup() {
···
 xpass test $new_size -eq $initrd_size
 
 echo "No error message while applying"
-OUTFILE=`mktemp tempout-XXXX`
 dd if=/dev/zero of=$INITRD bs=4096 count=1
 printf " \0\0\0 \0\0\0" >> $INITRD
 $BOOTCONF -a $TEMPCONF $INITRD > $OUTFILE 2>&1
+125
tools/testing/selftests/ftrace/test.d/event/event-no-pid.tc
···
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: event tracing - restricts events based on pid notrace filtering
+# flags: instance
+
+do_reset() {
+    echo > set_event
+    echo > set_event_pid
+    echo > set_event_notrace_pid
+    echo 0 > options/event-fork
+    echo 0 > events/enable
+    clear_trace
+    echo 1 > tracing_on
+}
+
+fail() { #msg
+    cat trace
+    do_reset
+    echo $1
+    exit_fail
+}
+
+count_pid() {
+    pid=$@
+    cat trace | grep -v '^#' | sed -e 's/[^-]*-\([0-9]*\).*/\1/' | grep $pid | wc -l
+}
+
+count_no_pid() {
+    pid=$1
+    cat trace | grep -v '^#' | sed -e 's/[^-]*-\([0-9]*\).*/\1/' | grep -v $pid | wc -l
+}
+
+enable_system() {
+    system=$1
+
+    if [ -d events/$system ]; then
+	echo 1 > events/$system/enable
+    fi
+}
+
+enable_events() {
+    echo 0 > tracing_on
+    # Enable common groups of events, as all events can allow for
+    # events to be traced via scheduling that we don't care to test.
+    enable_system syscalls
+    enable_system rcu
+    enable_system block
+    enable_system exceptions
+    enable_system irq
+    enable_system net
+    enable_system power
+    enable_system signal
+    enable_system sock
+    enable_system timer
+    enable_system thermal
+    echo 1 > tracing_on
+}
+
+if [ ! -f set_event -o ! -d events/sched ]; then
+    echo "event tracing is not supported"
+    exit_unsupported
+fi
+
+if [ ! -f set_event_pid -o ! -f set_event_notrace_pid ]; then
+    echo "event pid notrace filtering is not supported"
+    exit_unsupported
+fi
+
+echo 0 > options/event-fork
+
+do_reset
+
+read mypid rest < /proc/self/stat
+
+echo $mypid > set_event_notrace_pid
+grep -q $mypid set_event_notrace_pid
+
+enable_events
+
+yield
+
+echo 0 > tracing_on
+
+cnt=`count_pid $mypid`
+if [ $cnt -ne 0 ]; then
+    fail "Filtered out task has events"
+fi
+
+cnt=`count_no_pid $mypid`
+if [ $cnt -eq 0 ]; then
+    fail "No other events were recorded"
+fi
+
+do_reset
+
+echo $mypid > set_event_notrace_pid
+echo 1 > options/event-fork
+
+enable_events
+
+yield &
+child=$!
+echo "child = $child"
+wait $child
+
+echo 0 > tracing_on
+
+cnt=`count_pid $mypid`
+if [ $cnt -ne 0 ]; then
+    fail "Filtered out task has events"
+fi
+
+cnt=`count_pid $child`
+if [ $cnt -ne 0 ]; then
+    fail "Child of filtered out task has events"
+fi
+
+cnt=`count_no_pid $mypid`
+if [ $cnt -eq 0 ]; then
+    fail "No other events were recorded"
+fi
+
+do_reset
+
+exit 0
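The count_pid/count_no_pid helpers in the new test rely on a single sed expression to pull the PID out of the `comm-PID [cpu] ...` column of the trace file. Exercised here against a fabricated trace line, so no tracefs is required:

```shell
# Everything up to the first '-' is the comm; the digits after it are
# the PID; the rest of the line is discarded by the substitution.
line='            bash-1234  [002] d..2   100.000001: sched_waking: pid=56'
pid=$(echo "$line" | sed -e 's/[^-]*-\([0-9]*\).*/\1/')
echo "$pid"	# 1234
```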
+108
tools/testing/selftests/ftrace/test.d/ftrace/func-filter-notrace-pid.tc
···
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: ftrace - function pid notrace filters
+# flags: instance
+
+# Make sure that function pid matching filter with notrace works.
+
+if ! grep -q function available_tracers; then
+    echo "no function tracer configured"
+    exit_unsupported
+fi
+
+if [ ! -f set_ftrace_notrace_pid ]; then
+    echo "set_ftrace_notrace_pid not found? Is function tracer not set?"
+    exit_unsupported
+fi
+
+if [ ! -f set_ftrace_filter ]; then
+    echo "set_ftrace_filter not found? Is function tracer not set?"
+    exit_unsupported
+fi
+
+do_function_fork=1
+
+if [ ! -f options/function-fork ]; then
+    do_function_fork=0
+    echo "no option for function-fork found. Option will not be tested."
+fi
+
+read PID _ < /proc/self/stat
+
+if [ $do_function_fork -eq 1 ]; then
+    # default value of function-fork option
+    orig_value=`grep function-fork trace_options`
+fi
+
+do_reset() {
+    if [ $do_function_fork -eq 0 ]; then
+	return
+    fi
+
+    echo > set_ftrace_notrace_pid
+    echo $orig_value > trace_options
+}
+
+fail() { # msg
+    do_reset
+    echo $1
+    exit_fail
+}
+
+do_test() {
+    disable_tracing
+
+    echo do_execve* > set_ftrace_filter
+    echo *do_fork >> set_ftrace_filter
+
+    echo $PID > set_ftrace_notrace_pid
+    echo function > current_tracer
+
+    if [ $do_function_fork -eq 1 ]; then
+	# don't allow children to be traced
+	echo nofunction-fork > trace_options
+    fi
+
+    enable_tracing
+    yield
+
+    count_pid=`cat trace | grep -v ^# | grep $PID | wc -l`
+    count_other=`cat trace | grep -v ^# | grep -v $PID | wc -l`
+
+    # count_pid should be 0
+    if [ $count_pid -ne 0 -o $count_other -eq 0 ]; then
+	fail "PID filtering not working? traced task = $count_pid; other tasks = $count_other"
+    fi
+
+    disable_tracing
+    clear_trace
+
+    if [ $do_function_fork -eq 0 ]; then
+	return
+    fi
+
+    # allow children to be traced
+    echo function-fork > trace_options
+
+    # With pid in both set_ftrace_pid and set_ftrace_notrace_pid
+    # there should not be any tasks traced.
+
+    echo $PID > set_ftrace_pid
+
+    enable_tracing
+    yield
+
+    count_pid=`cat trace | grep -v ^# | grep $PID | wc -l`
+    count_other=`cat trace | grep -v ^# | grep -v $PID | wc -l`
+
+    # both should be zero
+    if [ $count_pid -ne 0 -o $count_other -ne 0 ]; then
+	fail "PID filtering not following fork? traced task = $count_pid; other tasks = $count_other"
+    fi
+}
+
+do_test
+
+do_reset
+
+exit 0
+1 -1
tools/testing/selftests/ftrace/test.d/ftrace/func_traceonoff_triggers.tc
···
 
 echo '** ENABLE EVENTS'
 
-echo 1 > events/enable
+echo 1 > events/sched/enable
 
 echo '** ENABLE TRACING'
 enable_tracing