perf: Fix contexted inheritance

Linus reported that the RCU lockdep annotation bits triggered for this
rcu_dereference() because we're not holding rcu_read_lock().

Going over the code I cannot convince myself it's correct:

- holding a ref on the parent_ctx doesn't prevent it from being uncloned
concurrently (as the comment says), so we can race with a free.

- holding parent_ctx->mutex doesn't prevent the above free from taking
place either; at best it would prevent parent_ctx itself from being freed.

I.e. the warning is correct. To fix the bug, serialize against the
unclone_ctx() call by extending the reach of the parent_ctx->lock.
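
The idea can be sketched in user space as follows. This is only an
illustration, not the kernel code: struct ctx, ctx_init(), unclone() and
inherit_link() are hypothetical stand-ins, a pthread mutex plays the role
of ctx->lock, and the sketch assumes the uncloning side always runs under
that same lock. The else branch is also an assumption, since the diff
elides that part of the function.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a perf event context. */
struct ctx {
	pthread_mutex_t lock;      /* plays the role of ctx->lock */
	struct ctx *parent_ctx;    /* non-NULL while this context is a clone */
	unsigned long parent_gen;
	unsigned long generation;
	atomic_int refcount;
};

static void ctx_init(struct ctx *c, unsigned long generation)
{
	pthread_mutex_init(&c->lock, NULL);
	c->parent_ctx = NULL;
	c->parent_gen = 0;
	c->generation = generation;
	atomic_init(&c->refcount, 1);
}

/* Analogue of uncloning: drops the clone link, always under c->lock. */
static void unclone(struct ctx *c)
{
	pthread_mutex_lock(&c->lock);
	if (c->parent_ctx) {
		atomic_fetch_sub(&c->parent_ctx->refcount, 1);
		c->parent_ctx = NULL;
	}
	pthread_mutex_unlock(&c->lock);
}

/*
 * Analogue of the fixed inheritance path: parent->lock is held across
 * the read of parent->parent_ctx AND the reference grab, so unclone()
 * cannot drop the link (and let the target go away) in between.
 */
static void inherit_link(struct ctx *child, struct ctx *parent)
{
	struct ctx *cloned;

	pthread_mutex_lock(&parent->lock);
	cloned = parent->parent_ctx;   /* plain read: writer is excluded */
	if (cloned) {
		child->parent_ctx = cloned;
		child->parent_gen = parent->parent_gen;
	} else {
		/* Assumed branch: parent is not a clone, link to it directly. */
		child->parent_ctx = parent;
		child->parent_gen = parent->generation;
	}
	assert(child->parent_ctx != NULL);
	atomic_fetch_add(&child->parent_ctx->refcount, 1);
	pthread_mutex_unlock(&parent->lock);
}
```

Because the reader and unclone() take the same lock, the pointer needs no
rcu_dereference()-style annotation in this sketch; that mirrors why the
patch can replace the rcu_dereference() with a plain load.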

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

authored by Peter Zijlstra and committed by Ingo Molnar c5ed5145 ad7f4e3f

+5 -6
kernel/perf_event.c
···

 	raw_spin_lock_irqsave(&parent_ctx->lock, flags);
 	parent_ctx->rotate_disable = 0;
-	raw_spin_unlock_irqrestore(&parent_ctx->lock, flags);

 	child_ctx = child->perf_event_ctxp[ctxn];

···
 		/*
 		 * Mark the child context as a clone of the parent
 		 * context, or of whatever the parent is a clone of.
-		 * Note that if the parent is a clone, it could get
-		 * uncloned at any point, but that doesn't matter
-		 * because the list of events and the generation
-		 * count can't have changed since we took the mutex.
+		 *
+		 * Note that if the parent is a clone, the holding of
+		 * parent_ctx->lock avoids it from being uncloned.
 		 */
-		cloned_ctx = rcu_dereference(parent_ctx->parent_ctx);
+		cloned_ctx = parent_ctx->parent_ctx;
 		if (cloned_ctx) {
 			child_ctx->parent_ctx = cloned_ctx;
 			child_ctx->parent_gen = parent_ctx->parent_gen;
···
 		get_ctx(child_ctx->parent_ctx);
 	}

+	raw_spin_unlock_irqrestore(&parent_ctx->lock, flags);
 	mutex_unlock(&parent_ctx->mutex);

 	perf_unpin_context(parent_ctx);