Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

eventfs/tracing: Add callback for release of an eventfs_inode

Synthetic events create and destroy tracefs files when they are created
and removed. The tracing subsystem has its own file descriptor
representing the state of the events attached to the tracefs files.
There's a race between the eventfs files and this file descriptor of the
tracing system where the following can cause an issue:

With two scripts 'A' and 'B' doing:

Script 'A':
echo "hello int aaa" > /sys/kernel/tracing/synthetic_events
while :
do
echo 0 > /sys/kernel/tracing/events/synthetic/hello/enable
done

Script 'B':
echo > /sys/kernel/tracing/synthetic_events

Script 'A' creates a synthetic event "hello" and then just writes zero
into its enable file.

Script 'B' removes all synthetic events (including the newly created
"hello" event).

What happens is that the opening of the "enable" file has:

{
struct trace_event_file *file = inode->i_private;
int ret;

ret = tracing_check_open_get_tr(file->tr);
[..]

But deleting the events frees the "file" descriptor, and a "use after
free" happens with the dereference at "file->tr".

The file descriptor does have a reference counter, but there needs to be a
way to decrement it from the eventfs when the eventfs_inode is removed
that represents this file descriptor.

Add an optional "release" callback to the eventfs_entry array structure,
that gets called when the eventfs file is about to be removed. This allows
for the creating on the eventfs file to increment the tracing file
descriptor ref counter. When the eventfs file is deleted, it can call the
release function that will call the put function for the tracing file
descriptor.

This will protect the tracing file from being freed while a eventfs file
that references it is being opened.

Link: https://lore.kernel.org/linux-trace-kernel/20240426073410.17154-1-Tze-nan.Wu@mediatek.com/
Link: https://lore.kernel.org/linux-trace-kernel/20240502090315.448cba46@gandalf.local.home

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Reported-by: Tze-nan wu <Tze-nan.Wu@mediatek.com>
Tested-by: Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

+36 -2
+21 -2
fs/tracefs/event_inode.c
··· 84 84 static void release_ei(struct kref *ref) 85 85 { 86 86 struct eventfs_inode *ei = container_of(ref, struct eventfs_inode, kref); 87 + const struct eventfs_entry *entry; 87 88 struct eventfs_root_inode *rei; 88 89 89 90 WARN_ON_ONCE(!ei->is_freed); 91 + 92 + for (int i = 0; i < ei->nr_entries; i++) { 93 + entry = &ei->entries[i]; 94 + if (entry->release) 95 + entry->release(entry->name, ei->data); 96 + } 90 97 91 98 kfree(ei->entry_attrs); 92 99 kfree_const(ei->name); ··· 116 109 if (ei) { 117 110 ei->is_freed = 1; 118 111 put_ei(ei); 112 + } 113 + } 114 + 115 + /* 116 + * Called when creation of an ei fails, do not call release() functions. 117 + */ 118 + static inline void cleanup_ei(struct eventfs_inode *ei) 119 + { 120 + if (ei) { 121 + /* Set nr_entries to 0 to prevent release() function being called */ 122 + ei->nr_entries = 0; 123 + free_ei(ei); 119 124 } 120 125 } 121 126 ··· 753 734 754 735 /* Was the parent freed? */ 755 736 if (list_empty(&ei->list)) { 756 - free_ei(ei); 737 + cleanup_ei(ei); 757 738 ei = NULL; 758 739 } 759 740 return ei; ··· 854 835 return ei; 855 836 856 837 fail: 857 - free_ei(ei); 838 + cleanup_ei(ei); 858 839 tracefs_failed_creating(dentry); 859 840 return ERR_PTR(-ENOMEM); 860 841 }
+3
include/linux/tracefs.h
··· 62 62 typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data, 63 63 const struct file_operations **fops); 64 64 65 + typedef void (*eventfs_release)(const char *name, void *data); 66 + 65 67 /** 66 68 * struct eventfs_entry - dynamically created eventfs file call back handler 67 69 * @name: Then name of the dynamic file in an eventfs directory ··· 74 72 struct eventfs_entry { 75 73 const char *name; 76 74 eventfs_callback callback; 75 + eventfs_release release; 77 76 }; 78 77 79 78 struct eventfs_inode;
+12
kernel/trace/trace_events.c
··· 2552 2552 return 0; 2553 2553 } 2554 2554 2555 + /* The file is incremented on creation and freeing the enable file decrements it */ 2556 + static void event_release(const char *name, void *data) 2557 + { 2558 + struct trace_event_file *file = data; 2559 + 2560 + event_file_put(file); 2561 + } 2562 + 2555 2563 static int 2556 2564 event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file) 2557 2565 { ··· 2574 2566 { 2575 2567 .name = "enable", 2576 2568 .callback = event_callback, 2569 + .release = event_release, 2577 2570 }, 2578 2571 { 2579 2572 .name = "filter", ··· 2642 2633 pr_warn("Could not initialize trace point events/%s\n", name); 2643 2634 return ret; 2644 2635 } 2636 + 2637 + /* Gets decremented on freeing of the "enable" file */ 2638 + event_file_get(file); 2645 2639 2646 2640 return 0; 2647 2641 }