Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tracing: Avoid soft lockup in trace_pipe

Running the following commands:

# enable the binary option
echo 1 > ./options/bin
# disable context info option
echo 0 > ./options/context-info
# tracing only events
echo 1 > ./events/enable
cat trace_pipe

plus forcing the system to generate many tracing events,
causes a lockup (on non-preemptive kernels) inside the
tracing_read_pipe function.

The issue is also easily reproduced by running the LTP stress
test (ftrace_stress_test.sh).

The reasons are:
- the bin/hex/raw output functions for events are set to
  trace_nop_print, which prints nothing and returns
  TRACE_TYPE_HANDLED
- the LOST EVENT trace does not handle trace_seq overflow

These reasons cause the while loop in tracing_read_pipe
never to break.

The attached patch fixes the handling of the lost event trace and
changes trace_nop_print to print minimal info, which is needed
for correct tracing_read_pipe processing.

v2 changes:
- omitted the cond_resched changes in favour of the
  trace_nop_print changes
- changed WARN to WARN_ONCE and added info to make it
  possible to find the culprit

v3 changes:
- made the patch comment more accurate

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <20110325110518.GC1922@jolsa.brq.redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Authored by Jiri Olsa, committed by Steven Rostedt
ee5e51f5 1813dc37

+15 -3

kernel/trace/trace.c (+12 -3)
@@ -2013,9 +2013,10 @@
 {
 	enum print_line_t ret;
 
-	if (iter->lost_events)
-		trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
-			    iter->cpu, iter->lost_events);
+	if (iter->lost_events &&
+	    !trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
+			      iter->cpu, iter->lost_events))
+		return TRACE_TYPE_PARTIAL_LINE;
 
 	if (iter->trace && iter->trace->print_line) {
 		ret = iter->trace->print_line(iter);
@@ -3230,6 +3229,14 @@
 
 	if (iter->seq.len >= cnt)
 		break;
+
+	/*
+	 * Setting the full flag means we reached the trace_seq buffer
+	 * size and we should leave by partial output condition above.
+	 * One of the trace_seq_* functions is not used properly.
+	 */
+	WARN_ONCE(iter->seq.full, "full flag set for trace type %d",
+		  iter->ent->type);
 }
 trace_access_unlock(iter->cpu_file);
 trace_event_read_unlock();
kernel/trace/trace_output.c (+3 -0)

@@ -830,6 +830,9 @@
 enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags,
 				  struct trace_event *event)
 {
+	if (!trace_seq_printf(&iter->seq, "type: %d\n", iter->ent->type))
+		return TRACE_TYPE_PARTIAL_LINE;
+
 	return TRACE_TYPE_HANDLED;
 }
 
··· 830 830 enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags, 831 831 struct trace_event *event) 832 832 { 833 + if (!trace_seq_printf(&iter->seq, "type: %d\n", iter->ent->type)) 834 + return TRACE_TYPE_PARTIAL_LINE; 835 + 833 836 return TRACE_TYPE_HANDLED; 834 837 } 835 838