[PATCH] Return probe redesign: architecture independent changes

The following is the second version of the function return probe patches
I sent out earlier this week. Changes since my last submission include:

* Fix in ppc64 code removing an unneeded call to re-enable preemption
* Fix a build problem in ia64 when kprobes was turned off
* Added another BUG_ON check to each of the architecture trampoline
handlers

My initial patch description ==>

From my experiences with adding return probes to x86_64 and ia64, and the
feedback on LKML to those patches, I think we can simplify the design
for return probes.

The following patch tweaks the original design such that:

* Instead of storing the stack address in the return probe instance, the
task pointer is stored. This gives us all we need in order to:
- find the correct return probe instance when we enter the trampoline
(even if we are recursing)
- find all left-over return probe instances when the task is going away

This has the side effect of simplifying the implementation: more work can
be done in kernel/kprobes.c, because architecture-specific knowledge of the
stack layout is no longer required. Specifically, we no longer have:
- arch_get_kprobe_task()
- arch_kprobe_flush_task()
- get_rp_inst_tsk()
- get_rp_inst()
- trampoline_post_handler() <see next bullet>
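The task-keyed bookkeeping that makes those functions unnecessary can be sketched in user-space C. Everything below (struct task, add_inst, find_inst, the 64-entry table) is an illustrative stand-in for the kernel's kretprobe_instance, hash_ptr() and kretprobe_inst_table, not the actual implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64

struct task { int pid; };            /* stand-in for struct task_struct */

struct rp_inst {
    struct rp_inst *next;            /* hash-bucket chain */
    struct task *task;               /* owning task (replaces stack_addr) */
    void *ret_addr;                  /* saved original return address */
};

static struct rp_inst *table[TABLE_SIZE];

/* Hash the task pointer itself, as hash_ptr(tsk, KPROBE_HASH_BITS) does. */
static unsigned int bucket_of(const struct task *t)
{
    return (unsigned int)(((uintptr_t)t >> 4) % TABLE_SIZE);
}

void add_inst(struct rp_inst *ri)
{
    unsigned int b = bucket_of(ri->task);
    ri->next = table[b];
    table[b] = ri;
}

/* Every instance for a task lands in one predictable bucket, so both the
 * trampoline handler and task-exit cleanup can find them with no
 * knowledge of the stack layout. */
struct rp_inst *find_inst(const struct task *t)
{
    struct rp_inst *ri;
    for (ri = table[bucket_of(t)]; ri != NULL; ri = ri->next)
        if (ri->task == t)
            return ri;
    return NULL;
}
```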

* Instead of splitting the return probe handling and cleanup logic across
the pre and post trampoline handlers, all the work is pushed into the
pre function (trampoline_probe_handler), and we then skip single stepping
the probed instruction. Since that instruction is just a NOP, we can do
without the extra interruption.
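The skip-the-single-step convention can be modeled as a pre-handler return value. This is a user-space sketch with invented names (kprobe_hit, ordinary_pre, trampoline_pre); in the real kernel the dispatcher is the kprobes fault handler, and the return value of the kprobe pre_handler plays this role:

```c
#include <assert.h>

static int single_steps;             /* counts simulated single-steps */

typedef int (*kprobe_pre_handler_t)(void);

/* Simplified kprobe hit path: a nonzero return from the pre-handler
 * tells the dispatcher the event is fully handled, so no single-step
 * of the probed instruction is set up. */
void kprobe_hit(kprobe_pre_handler_t pre)
{
    if (pre())
        return;                      /* handler consumed the event */
    single_steps++;                  /* otherwise single-step the insn */
}

int ordinary_pre(void)
{
    return 0;                        /* normal probe: still single-step */
}

/* Trampoline probe: the probed instruction is only a NOP, so report the
 * event handled and skip the extra interruption entirely. */
int trampoline_pre(void)
{
    return 1;
}

int single_step_count(void) { return single_steps; }
```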

The new flow of events for a return probe handler to execute when a target
function exits is:

* At system initialization time, a kprobe is inserted at the beginning of
kretprobe_trampoline. kernel/kprobes.c used to handle this on its own,
but ia64 needed to do this a little differently (on ia64 a function pointer
is really a pointer to a structure containing the instruction pointer and
a global pointer), so I added the notion of arch_init():
kernel/kprobes.c:init_kprobes() now allows architecture-specific
initialization by calling arch_init() before exiting. Each architecture
now registers a kprobe on its own trampoline function.
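A rough model of that initialization order, with the registration steps stubbed out (register_die_notifier_stub is a stand-in for the real notifier call, and this arch_init() stands in for any one architecture's version):

```c
#include <assert.h>

/* Track what a simulated init sequence did. */
static int trampoline_registered;
static int notifier_registered;

/* Stand-in for register_die_notifier(&kprobe_exceptions_nb). */
static int register_die_notifier_stub(void)
{
    notifier_registered = 1;
    return 0;
}

/* Each architecture's arch_init() registers the kprobe on its own
 * kretprobe_trampoline (ia64 needs this hook, since an ia64 "function
 * pointer" is really a descriptor holding ip and gp). */
int arch_init(void)
{
    trampoline_registered = 1;
    return 0;
}

/* Tail of a simplified init_kprobes(): arch-specific setup runs first,
 * and only on success do we continue with the common registration. */
int init_kprobes(void)
{
    int err = arch_init();
    if (!err)
        err = register_die_notifier_stub();
    return err;
}

int init_state(void) { return trampoline_registered && notifier_registered; }
```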

* register_kretprobe() will insert a kprobe at the beginning of the targeted
function with the kprobe pre_handler set to arch_prepare_kretprobe
(still no change)

* When the target function is entered, the kprobe is fired, calling
arch_prepare_kretprobe (still no change)

* In arch_prepare_kretprobe() we try to get a free instance and if one is
available then we fill out the instance with a pointer to the return probe,
the original return address, and a pointer to the task structure (instead
of the stack address). Just like before, we change the return address
to the trampoline function and mark the instance as used.

If multiple return probes are registered for a given target function,
then arch_prepare_kretprobe() will get called multiple times for the same
task (since our kprobe implementation is able to handle multiple kprobes
at the same address). Past the first call to arch_prepare_kretprobe(),
each instance ends up with its stored return address pointing to our
trampoline function. (This is a significant difference from the original
arch_prepare_kretprobe design.)
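The instance setup described above can be sketched as follows; regs is collapsed to a single return-address slot, and all names (prepare_kretprobe, init_pool, the 4-entry pool) are illustrative stand-ins for the kernel's arch_prepare_kretprobe and its free list:

```c
#include <assert.h>
#include <stddef.h>

void kretprobe_trampoline(void) { } /* stand-in trampoline entry point */

struct task { int pid; };

struct rp_inst {
    struct rp_inst *next_free;
    void *ret_addr;                  /* original return address */
    struct task *task;               /* owning task */
};

static struct rp_inst pool[4];
static struct rp_inst *free_list;

void init_pool(void)
{
    int i;
    free_list = NULL;
    for (i = 0; i < 4; i++) {
        pool[i].next_free = free_list;
        free_list = &pool[i];
    }
}

/* On each hit we record whatever is in the return slot -- which is
 * already the trampoline if another kretprobe on the same function
 * fired first -- then divert the return to the trampoline. */
struct rp_inst *prepare_kretprobe(void **ret_slot, struct task *tsk)
{
    struct rp_inst *ri = free_list;
    if (ri == NULL)
        return NULL;                 /* no free instance: the probe misses */
    free_list = ri->next_free;
    ri->ret_addr = *ret_slot;
    ri->task = tsk;
    *ret_slot = (void *)&kretprobe_trampoline;
    return ri;
}
```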

* Target function executes like normal and then returns to kretprobe_trampoline.

* kprobe inserted on the first instruction of kretprobe_trampoline is fired
and calls trampoline_probe_handler() (no change here)

* trampoline_probe_handler() consumes the instances associated with
the current task, calling the registered handler function and marking
each instance as unused, until it finds an instance with a return address
different from the trampoline function.

(change similar to my previous ia64 RFC)
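A user-space sketch of that consume-until-real-address loop (a singly linked list stands in for the kernel's hlists, and handler_hits stands in for the user's return probe handler):

```c
#include <assert.h>
#include <stddef.h>

void kretprobe_trampoline(void) { } /* stand-in trampoline entry point */

struct rp_inst {
    struct rp_inst *next;            /* this task's list, newest first */
    void *ret_addr;                  /* trampoline, or the real caller */
    int *handler_hits;               /* side effect of the user handler */
};

/* Walk the current task's instances, firing each handler and recycling
 * the instance, until one carries a return address other than the
 * trampoline -- that is the address the task must really return to. */
void *consume_instances(struct rp_inst **list)
{
    void *orig_ret = NULL;

    while (*list != NULL) {
        struct rp_inst *ri = *list;
        (*ri->handler_hits)++;       /* run the registered handler */
        orig_ret = ri->ret_addr;
        *list = ri->next;            /* recycle_rp_inst(ri) */
        if (orig_ret != (void *)&kretprobe_trampoline)
            break;                   /* found the real return address */
    }
    return orig_ret;
}
```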

* If the task is killed with some left-over return probe instances (meaning
that a target function was entered but never returned), then we just
free any instances associated with the task. (Not much different, except
that we can now handle this without calling architecture-specific
functions.)
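That cleanup can be sketched with safe list iteration, mirroring what kprobe_flush_task now does with hlist_for_each_entry_safe (all types here are simplified stand-ins for the kernel's):

```c
#include <assert.h>
#include <stddef.h>

struct task { int pid; };

struct rp_inst {
    struct rp_inst *next;
    struct task *task;
};

int recycled;                        /* counts instances handed back by
                                      * recycle_rp_inst() */

static void recycle_rp_inst(struct rp_inst *ri)
{
    (void)ri;                        /* real code re-links onto free list */
    recycled++;
}

/* Walk the dying task's hash bucket, unlinking and recycling every
 * instance it owns.  Safe iteration: advance via the link slot so
 * removing the current entry never derails the walk. */
void flush_task(struct rp_inst **bucket, struct task *tk)
{
    struct rp_inst **pp = bucket;

    while (*pp != NULL) {
        struct rp_inst *ri = *pp;
        if (ri->task == tk) {
            *pp = ri->next;          /* unlink before recycling */
            recycle_rp_inst(ri);
        } else {
            pp = &ri->next;
        }
    }
}
```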

There is a known problem that this patch does not yet solve: registering
a return probe on flush_old_exec or flush_thread will put us in a bad
state. Most likely the best way to handle this is to disallow registering
return probes on these two functions.

(Significant change)

This patch series applies to the 2.6.12-rc6-mm1 kernel, and provides:
* kernel/kprobes.c changes
* i386 patch of existing return probes implementation
* x86_64 patch of existing return probe implementation
* ia64 implementation
* ppc64 implementation (provided by Ananth)

This patch implements the architecture-independent changes for a reworking
of the kprobes based function return probes design. Changes include:

* Removing functions for querying a return probe instance off a stack address
* Removing the stack_addr field from the kretprobe_instance definition,
and adding a task pointer
* Adding architecture specific initialization via arch_init()
* Removing extern definitions for the architecture trampoline functions
(these aren't needed anymore, since each architecture now handles
initialization of the kprobe in its own return probe trampoline function.)

Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Authored by Rusty Lynch; committed by Linus Torvalds. (802eae7c, 9ec4b1f3)

2 files changed, 22 insertions(+), 75 deletions(-)

include/linux/kprobes.h (+3 -25):

--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ ... @@
 };
 
 #ifdef ARCH_SUPPORTS_KRETPROBES
-extern int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs);
-extern void trampoline_post_handler(struct kprobe *p, struct pt_regs *regs,
-				    unsigned long flags);
-extern struct task_struct *arch_get_kprobe_task(void *ptr);
 extern void arch_prepare_kretprobe(struct kretprobe *rp, struct pt_regs *regs);
-extern void arch_kprobe_flush_task(struct task_struct *tk);
 #else /* ARCH_SUPPORTS_KRETPROBES */
-static inline void kretprobe_trampoline(void)
-{
-}
-static inline int trampoline_probe_handler(struct kprobe *p,
-					   struct pt_regs *regs)
-{
-	return 0;
-}
-static inline void trampoline_post_handler(struct kprobe *p,
-					   struct pt_regs *regs, unsigned long flags)
-{
-}
 static inline void arch_prepare_kretprobe(struct kretprobe *rp,
 					  struct pt_regs *regs)
 {
 }
-static inline void arch_kprobe_flush_task(struct task_struct *tk)
-{
-}
-#define arch_get_kprobe_task(ptr) ((struct task_struct *)NULL)
 #endif /* ARCH_SUPPORTS_KRETPROBES */
 /*
  * Function-return probe -
@@ ... @@
 	struct hlist_node uflist; /* either on free list or used list */
 	struct hlist_node hlist;
 	struct kretprobe *rp;
-	void *ret_addr;
-	void *stack_addr;
+	kprobe_opcode_t *ret_addr;
+	struct task_struct *task;
 };
 
 #ifdef CONFIG_KPROBES
@@ ... @@
 extern void arch_arm_kprobe(struct kprobe *p);
 extern void arch_disarm_kprobe(struct kprobe *p);
 extern void arch_remove_kprobe(struct kprobe *p);
+extern int arch_init(void);
 extern void show_registers(struct pt_regs *regs);
 extern kprobe_opcode_t *get_insn_slot(void);
 extern void free_insn_slot(kprobe_opcode_t *slot);
@@ ... @@
 void unregister_kretprobe(struct kretprobe *rp);
 
 struct kretprobe_instance *get_free_rp_inst(struct kretprobe *rp);
-struct kretprobe_instance *get_rp_inst(void *sara);
-struct kretprobe_instance *get_rp_inst_tsk(struct task_struct *tk);
 void add_rp_inst(struct kretprobe_instance *ri);
 void kprobe_flush_task(struct task_struct *tk);
 void recycle_rp_inst(struct kretprobe_instance *ri);
kernel/kprobes.c (+19 -50):

--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ ... @@
 	return 0;
 }
 
-struct kprobe trampoline_p = {
-	.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
-	.pre_handler = trampoline_probe_handler,
-	.post_handler = trampoline_post_handler
-};
-
 struct kretprobe_instance *get_free_rp_inst(struct kretprobe *rp)
 {
 	struct hlist_node *node;
@@ ... @@
 	return NULL;
 }
 
-struct kretprobe_instance *get_rp_inst(void *sara)
-{
-	struct hlist_head *head;
-	struct hlist_node *node;
-	struct task_struct *tsk;
-	struct kretprobe_instance *ri;
-
-	tsk = arch_get_kprobe_task(sara);
-	head = &kretprobe_inst_table[hash_ptr(tsk, KPROBE_HASH_BITS)];
-	hlist_for_each_entry(ri, node, head, hlist) {
-		if (ri->stack_addr == sara)
-			return ri;
-	}
-	return NULL;
-}
-
 void add_rp_inst(struct kretprobe_instance *ri)
 {
-	struct task_struct *tsk;
 	/*
 	 * Remove rp inst off the free list -
 	 * Add it back when probed function returns
 	 */
 	hlist_del(&ri->uflist);
-	tsk = arch_get_kprobe_task(ri->stack_addr);
+
 	/* Add rp inst onto table */
 	INIT_HLIST_NODE(&ri->hlist);
 	hlist_add_head(&ri->hlist,
-		&kretprobe_inst_table[hash_ptr(tsk, KPROBE_HASH_BITS)]);
+		&kretprobe_inst_table[hash_ptr(ri->task, KPROBE_HASH_BITS)]);
 
 	/* Also add this rp inst to the used list. */
 	INIT_HLIST_NODE(&ri->uflist);
@@ ... @@
 	return &kretprobe_inst_table[hash_ptr(tsk, KPROBE_HASH_BITS)];
 }
 
-struct kretprobe_instance *get_rp_inst_tsk(struct task_struct *tk)
-{
-	struct task_struct *tsk;
-	struct hlist_head *head;
-	struct hlist_node *node;
-	struct kretprobe_instance *ri;
-
-	head = &kretprobe_inst_table[hash_ptr(tk, KPROBE_HASH_BITS)];
-
-	hlist_for_each_entry(ri, node, head, hlist) {
-		tsk = arch_get_kprobe_task(ri->stack_addr);
-		if (tsk == tk)
-			return ri;
-	}
-	return NULL;
-}
-
 /*
- * This function is called from do_exit or do_execv when task tk's stack is
- * about to be recycled. Recycle any function-return probe instances
- * associated with this task. These represent probed functions that have
- * been called but may never return.
+ * This function is called from exit_thread or flush_thread when task tk's
+ * stack is being recycled so that we can recycle any function-return probe
+ * instances associated with this task. These left over instances represent
+ * probed functions that have been called but will never return.
  */
 void kprobe_flush_task(struct task_struct *tk)
 {
+	struct kretprobe_instance *ri;
+	struct hlist_head *head;
+	struct hlist_node *node, *tmp;
 	unsigned long flags = 0;
+
 	spin_lock_irqsave(&kprobe_lock, flags);
-	arch_kprobe_flush_task(tk);
+	head = kretprobe_inst_table_head(current);
+	hlist_for_each_entry_safe(ri, node, tmp, head, hlist) {
+		if (ri->task == tk)
+			recycle_rp_inst(ri);
+	}
 	spin_unlock_irqrestore(&kprobe_lock, flags);
 }
@@ ... @@
 		INIT_HLIST_HEAD(&kretprobe_inst_table[i]);
 	}
 
-	err = register_die_notifier(&kprobe_exceptions_nb);
-	/* Register the trampoline probe for return probe */
-	register_kprobe(&trampoline_p);
+	err = arch_init();
+	if (!err)
+		err = register_die_notifier(&kprobe_exceptions_nb);
+
 	return err;
 }