commits

The lockdep warning below detects a possible A->B/B->A locking
dependency of mm->mmap_sem and dcookie_mutex. The order in
sync_buffer() is mm->mmap_sem/dcookie_mutex, while in
sys_lookup_dcookie() it is vice versa.

Fixing it in sys_lookup_dcookie() by unlocking dcookie_mutex before
copy_to_user().

oprofiled/4432 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<ffffffff810b444b>] might_fault+0x53/0xa3

but task is already holding lock:
(dcookie_mutex){+.+.+.}, at: [<ffffffff81124d28>] sys_lookup_dcookie+0x45/0x149

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (dcookie_mutex){+.+.+.}:
[<ffffffff8106557f>] lock_acquire+0xf8/0x11e
[<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
[<ffffffff81124e5c>] get_dcookie+0x30/0x144
[<ffffffffa0000fba>] sync_buffer+0x196/0x3ec [oprofile]
[<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
[<ffffffff81467b96>] notifier_call_chain+0x37/0x63
[<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
[<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
[<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
[<ffffffff81039e8f>] do_exit+0x2a/0x6fc
[<ffffffff8103a5e4>] do_group_exit+0x83/0xae
[<ffffffff8103a626>] sys_exit_group+0x17/0x1b
[<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

-> #0 (&mm->mmap_sem){++++++}:
[<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
[<ffffffff8106557f>] lock_acquire+0xf8/0x11e
[<ffffffff810b4478>] might_fault+0x80/0xa3
[<ffffffff81124de7>] sys_lookup_dcookie+0x104/0x149
[<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

1 lock held by oprofiled/4432:
#0: (dcookie_mutex){+.+.+.}, at: [<ffffffff81124d28>] sys_lookup_dcookie+0x45/0x149

stack backtrace:
Pid: 4432, comm: oprofiled Not tainted 2.6.39-00008-ge5a450d #9
Call Trace:
[<ffffffff81063193>] print_circular_bug+0xae/0xbc
[<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
[<ffffffff8102ef13>] ? get_parent_ip+0x11/0x42
[<ffffffff810b444b>] ? might_fault+0x53/0xa3
[<ffffffff8106557f>] lock_acquire+0xf8/0x11e
[<ffffffff810b444b>] ? might_fault+0x53/0xa3
[<ffffffff810d7d54>] ? path_put+0x22/0x27
[<ffffffff810b4478>] might_fault+0x80/0xa3
[<ffffffff810b444b>] ? might_fault+0x53/0xa3
[<ffffffff81124de7>] sys_lookup_dcookie+0x104/0x149
[<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

References: https://bugzilla.kernel.org/show_bug.cgi?id=13809
Cc: <stable@kernel.org> # .27+
Signed-off-by: Robert Richter <robert.richter@amd.com>

14y ago

Sebastian Andrzej Siewior

1c3cc116

timers: Consider slack value in mod_timer()

14y ago

Mark Brown

d576fe49

rtc: Staticize non-exported __rtc_set_alarm()

14y ago

Dan Carpenter

e23ef227

ptp: Return -EFAULT on copy_to_user() errors

14y ago

Sage Weil

25845472

ceph: fix sync vs canceled write

14y ago

Linus Torvalds

cd1acdf1

Merge branch 'pnfs-submit' of git://git.open-osd.org/linux-open-osd

14y ago

Tyler Hicks

30632870

eCryptfs: Remove ecryptfs_header_cache_2

14y ago

Linus Torvalds

acb11bcd

Merge branch 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung

14y ago

Virupax Sadashivpetimath

e3f88ae9

spi-pl022: Add missing return value update

14y ago

Thomas Gleixner

fd8a7de1

x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU

14y ago

Linus Torvalds

7f45e5cd

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

14y ago

Ingo Molnar

86dd7909

Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent

14y ago

Steven Rostedt

a4f18ed1

ftrace: Revert 8ab2b7efd ftrace: Remove unnecessary disabling of irqs

14y ago

Robert Richter

130c5ce7

oprofile: Fix locking dependency in sync_start()

This fixes the A->B/B->A locking dependency, see the warning below.

The function task_exit_notify() is called with (task_exit_notifier)
.rwsem set and then calls sync_buffer() which locks buffer_mutex. In
sync_start() the buffer_mutex was set to prevent notifier functions to
be started before sync_start() is finished. But when registering the
notifier, (task_exit_notifier).rwsem is locked too, but now in
different order than in sync_buffer(). In theory this causes a locking
dependency, what does not occur in practice since task_exit_notify()
is always called after the notifier is registered which means the lock
is already released.

However, after checking the notifier functions it turned out the
buffer_mutex in sync_start() is unnecessary. This is because
sync_buffer() may be called from the notifiers even if sync_start()
did not finish yet, the buffers are already allocated but empty. No
need to protect this with the mutex.

So we fix this theoretical locking dependency by removing buffer_mutex
in sync_start(). This is similar to the implementation before commit:

750d857 oprofile: fix crash when accessing freed task structs

which introduced the locking dependency.

Lockdep warning:

oprofiled/4447 is trying to acquire lock:
(buffer_mutex){+.+...}, at: [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]

but task is already holding lock:
((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 ((task_exit_notifier).rwsem){++++..}:
[<ffffffff8106557f>] lock_acquire+0xf8/0x11e
[<ffffffff81463a2b>] down_write+0x44/0x67
[<ffffffff810581c0>] blocking_notifier_chain_register+0x52/0x8b
[<ffffffff8105a6ac>] profile_event_register+0x2d/0x2f
[<ffffffffa00013c1>] sync_start+0x47/0xc6 [oprofile]
[<ffffffffa00001bb>] oprofile_setup+0x60/0xa5 [oprofile]
[<ffffffffa00014e3>] event_buffer_open+0x59/0x8c [oprofile]
[<ffffffff810cd3b9>] __dentry_open+0x1eb/0x308
[<ffffffff810cd59d>] nameidata_to_filp+0x60/0x67
[<ffffffff810daad6>] do_last+0x5be/0x6b2
[<ffffffff810dbc33>] path_openat+0xc7/0x360
[<ffffffff810dbfc5>] do_filp_open+0x3d/0x8c
[<ffffffff810ccfd2>] do_sys_open+0x110/0x1a9
[<ffffffff810cd09e>] sys_open+0x20/0x22
[<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

-> #0 (buffer_mutex){+.+...}:
[<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
[<ffffffff8106557f>] lock_acquire+0xf8/0x11e
[<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
[<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
[<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
[<ffffffff81467b96>] notifier_call_chain+0x37/0x63
[<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
[<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
[<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
[<ffffffff81039e8f>] do_exit+0x2a/0x6fc
[<ffffffff8103a5e4>] do_group_exit+0x83/0xae
[<ffffffff8103a626>] sys_exit_group+0x17/0x1b
[<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

1 lock held by oprofiled/4447:
#0: ((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67

stack backtrace:
Pid: 4447, comm: oprofiled Not tainted 2.6.39-00007-gcf4d8d4 #10
Call Trace:
[<ffffffff81063193>] print_circular_bug+0xae/0xbc
[<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
[<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
[<ffffffff8106557f>] lock_acquire+0xf8/0x11e
[<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
[<ffffffff81062627>] ? mark_lock+0x42f/0x552
[<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
[<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
[<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
[<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
[<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67
[<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67
[<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
[<ffffffff81467b96>] notifier_call_chain+0x37/0x63
[<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
[<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
[<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
[<ffffffff81039e8f>] do_exit+0x2a/0x6fc
[<ffffffff81465031>] ? retint_swapgs+0xe/0x13
[<ffffffff8103a5e4>] do_group_exit+0x83/0xae
[<ffffffff8103a626>] sys_exit_group+0x17/0x1b
[<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b

Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Carl Love <carll@us.ibm.com>
Cc: <stable@kernel.org> # .36+
Signed-off-by: Robert Richter <robert.richter@amd.com>