
Merge tag 'trace-v7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

- Revert "tracing: Remove pid in task_rename tracing output"

A change was made to remove the pid field from the task_rename event
because it was thought that it was always done for the current task
and recording the pid would be redundant. This turned out to be
incorrect: there are a few corner cases where it is not true, and
the change caused regressions in tooling.

- Fix the reading from user space for migration

The read from user space uses seqlock-style logic: it uses a
per-cpu temporary buffer and disables migration, then enables
preemption, does the copy from user space, disables preemption
again, re-enables migration, and checks whether any context
switches happened while preemption was enabled. If there was a
context switch, the per-cpu buffer may have been corrupted and
the copy is retried. A protection check limits this to a hundred
tries; if that limit is hit, a warning is issued and the read
bails out to prevent a livelock.

This was triggered when the load balancer selected the task to be
migrated to another CPU. Every time preemption was enabled, the
migration thread would schedule in and try to migrate the task,
fail because migration was disabled, and let it run again. The
scheduler therefore scheduled the task out every time preemption
was enabled, and the loop never made progress (until the 100
iteration limit triggered).

Fix this by enabling and then disabling preemption, with
migration kept enabled, when the read from user space needs to be
retried. This lets the migration thread migrate the task, and the
copy from user space will likely succeed on the next iteration.

- Fix trace_marker copy option freeing

The "copy_trace_marker" option allows a tracing instance to get a
copy of a write to the trace_marker file of the top level instance.
This is managed by a linked list protected by RCU. When an
instance is removed, a check is made whether the option is set,
and if so synchronize_rcu() is called.

The problem is that the iteration that resets all the flags to
what they were when the instance was created (to perform clean
ups) ran before the check of the copy_trace_marker option. That
iteration cleared the option, so synchronize_rcu() was never
called.

Move the clearing of all the flags to after the check of
copy_trace_marker, so that the option is still set if it was
before and the synchronization is performed.

- Fix entries setting when validating the persistent ring buffer

When validating the persistent ring buffer on boot up, the number of
events per sub-buffer is added to the sub-buffer meta page. The
validator was updating cpu_buffer->head_page (the first
sub-buffer of the per-cpu buffer) and not the "head_page"
variable that was iterating the sub-buffers. As a result, the
entry count of every sub-buffer was assigned to the first
sub-buffer instead of the sub-buffer that was supposed to be
updated.

- Use "hash" value to update the direct callers

When updating the ftrace direct callers, a temporary callback was
assigned to all the callback functions of the ftrace_ops, not
just the functions represented by the passed-in hash. This caused
an unnecessary slowdown of the functions of the ftrace_ops that
were not being modified. Only update the functions that are
actually going to be modified to call the ftrace loop function,
so that the update is made on those functions alone.

* tag 'trace-v7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ftrace: Use hash argument for tmp_ops in update_ftrace_direct_mod
ring-buffer: Fix to update per-subbuf entries of persistent ring buffer
tracing: Fix trace_marker copy link list updates
tracing: Fix failure to read user space from system call trace events
tracing: Revert "tracing: Remove pid in task_rename tracing output"

+35 -14
+5 -2
include/trace/events/task.h
@@ -38,19 +38,22 @@
 	TP_ARGS(task, comm),
 
 	TP_STRUCT__entry(
+		__field(	pid_t,	pid)
 		__array(	char, oldcomm, TASK_COMM_LEN)
 		__array(	char, newcomm, TASK_COMM_LEN)
 		__field(	short,	oom_score_adj)
 	),
 
 	TP_fast_assign(
+		__entry->pid = task->pid;
 		memcpy(entry->oldcomm, task->comm, TASK_COMM_LEN);
 		strscpy(entry->newcomm, comm, TASK_COMM_LEN);
 		__entry->oom_score_adj = task->signal->oom_score_adj;
 	),
 
-	TP_printk("oldcomm=%s newcomm=%s oom_score_adj=%hd",
-		__entry->oldcomm, __entry->newcomm, __entry->oom_score_adj)
+	TP_printk("pid=%d oldcomm=%s newcomm=%s oom_score_adj=%hd",
+		__entry->pid, __entry->oldcomm,
+		__entry->newcomm, __entry->oom_score_adj)
 );
 
 /**
+2 -2
kernel/trace/ftrace.c
@@ -6606,9 +6606,9 @@
 	if (!orig_hash)
 		goto unlock;
 
-	/* Enable the tmp_ops to have the same functions as the direct ops */
+	/* Enable the tmp_ops to have the same functions as the hash object. */
 	ftrace_ops_init(&tmp_ops);
-	tmp_ops.func_hash = ops->func_hash;
+	tmp_ops.func_hash->filter_hash = hash;
 
 	err = register_ftrace_function_nolock(&tmp_ops);
 	if (err)
+1 -1
kernel/trace/ring_buffer.c
@@ -2053,7 +2053,7 @@
 
 		entries += ret;
 		entry_bytes += local_read(&head_page->page->commit);
-		local_set(&cpu_buffer->head_page->entries, ret);
+		local_set(&head_page->entries, ret);
 
 		if (head_page == cpu_buffer->commit_page)
 			break;
+27 -9
kernel/trace/trace.c
@@ -555,7 +555,7 @@
 	lockdep_assert_held(&event_mutex);
 
 	if (enabled) {
-		if (!list_empty(&tr->marker_list))
+		if (tr->trace_flags & TRACE_ITER(COPY_MARKER))
 			return false;
 
 		list_add_rcu(&tr->marker_list, &marker_copies);
@@ -563,10 +563,10 @@
 		return true;
 	}
 
-	if (list_empty(&tr->marker_list))
+	if (!(tr->trace_flags & TRACE_ITER(COPY_MARKER)))
 		return false;
 
-	list_del_init(&tr->marker_list);
+	list_del_rcu(&tr->marker_list);
 	tr->trace_flags &= ~TRACE_ITER(COPY_MARKER);
 	return true;
 }
@@ -6784,6 +6784,23 @@
 
 	do {
 		/*
+		 * It is possible that something is trying to migrate this
+		 * task. What happens then, is when preemption is enabled,
+		 * the migration thread will preempt this task, try to
+		 * migrate it, fail, then let it run again. That will
+		 * cause this to loop again and never succeed.
+		 * On failures, enabled and disable preemption with
+		 * migration enabled, to allow the migration thread to
+		 * migrate this task.
+		 */
+		if (trys) {
+			preempt_enable_notrace();
+			preempt_disable_notrace();
+			cpu = smp_processor_id();
+			buffer = per_cpu_ptr(tinfo->tbuf, cpu)->buf;
+		}
+
+		/*
 		 * If for some reason, copy_from_user() always causes a context
 		 * switch, this would then cause an infinite loop.
 		 * If this task is preempted by another user space task, it
@@ -9761,17 +9744,18 @@
 
 	list_del(&tr->list);
 
+	if (printk_trace == tr)
+		update_printk_trace(&global_trace);
+
+	/* Must be done before disabling all the flags */
+	if (update_marker_trace(tr, 0))
+		synchronize_rcu();
+
 	/* Disable all the flags that were enabled coming in */
 	for (i = 0; i < TRACE_FLAGS_MAX_SIZE; i++) {
 		if ((1ULL << i) & ZEROED_TRACE_FLAGS)
 			set_tracer_flag(tr, 1ULL << i, 0);
 	}
-
-	if (printk_trace == tr)
-		update_printk_trace(&global_trace);
-
-	if (update_marker_trace(tr, 0))
-		synchronize_rcu();
 
 	tracing_set_nop(tr);
 	clear_ftrace_function_probes(tr);