Merge branch 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

* 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix deadlock in worker_maybe_bind_and_lock()
workqueue: Document debugging tricks

Fix up trivial spelling conflict in kernel/workqueue.c

+47 -1
+40
Documentation/workqueue.txt
···
 4. Application Programming Interface (API)
 5. Example Execution Scenarios
 6. Guidelines
+7. Debugging
 
 
 1. Introduction
···
 * Unless work items are expected to consume a huge amount of CPU
   cycles, using a bound wq is usually beneficial due to the increased
   level of locality in wq operations and work item execution.
+
+
+7. Debugging
+
+Because the work functions are executed by generic worker threads
+there are a few tricks needed to shed some light on misbehaving
+workqueue users.
+
+Worker threads show up in the process list as:
+
+root      5671  0.0  0.0      0     0 ?        S    12:07   0:00 [kworker/0:1]
+root      5672  0.0  0.0      0     0 ?        S    12:07   0:00 [kworker/1:2]
+root      5673  0.0  0.0      0     0 ?        S    12:12   0:00 [kworker/0:0]
+root      5674  0.0  0.0      0     0 ?        S    12:13   0:00 [kworker/1:0]
+
+If kworkers are going crazy (using too much cpu), there are two types
+of possible problems:
+
+	1. Something being scheduled in rapid succession
+	2. A single work item that consumes lots of cpu cycles
+
+The first one can be tracked using tracing:
+
+	$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
+	$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
+	(wait a few secs)
+	^C
+
+If something is busy looping on work queueing, it would be dominating
+the output and the offender can be determined with the work item
+function.
+
+For the second type of problems it should be possible to just check
+the stack trace of the offending worker thread.
+
+	$ cat /proc/THE_OFFENDING_KWORKER/stack
+
+The work item's function should be trivially visible in the stack
+trace.
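The new Debugging section says the offender "can be determined with the work item function" once out.txt has been captured. One possible way to do that is to count how often each function appears in the trace. The sketch below is not part of this merge. It assumes each workqueue_queue_work line carries a "function=<symbol>" field, which may differ between kernel versions, and the names parse_function, tally_add and tally_stream are hypothetical helpers made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FUNCS 256	/* assumed upper bound on distinct work functions */
#define NAME_LEN  64	/* assumed upper bound on symbol name length */

struct tally {
	char name[NAME_LEN];
	unsigned long count;
};

/* Copy the value of the "function=" field into out; return 1 if found. */
static int parse_function(const char *line, char *out, size_t outlen)
{
	const char *p = strstr(line, "function=");
	size_t n;

	if (!p)
		return 0;
	p += strlen("function=");
	n = strcspn(p, " \t\r\n");	/* value ends at the next whitespace */
	if (n == 0 || n >= outlen)
		return 0;
	memcpy(out, p, n);
	out[n] = '\0';
	return 1;
}

/* Record one hit for name; return the number of table entries in use. */
static size_t tally_add(struct tally *t, size_t used, const char *name)
{
	size_t i;

	for (i = 0; i < used; i++) {
		if (strcmp(t[i].name, name) == 0) {
			t[i].count++;
			return used;
		}
	}
	if (used < MAX_FUNCS) {		/* silently drop names past the bound */
		snprintf(t[used].name, NAME_LEN, "%s", name);
		t[used].count = 1;
		used++;
	}
	return used;
}

/* qsort() comparator: highest count first, ties broken by name. */
static int by_count_desc(const void *a, const void *b)
{
	const struct tally *x = a, *y = b;

	if (x->count != y->count)
		return x->count < y->count ? 1 : -1;
	return strcmp(x->name, y->name);
}

/* Read trace lines from in, print "count function" to out, busiest first. */
static size_t tally_stream(FILE *in, FILE *out)
{
	struct tally table[MAX_FUNCS];
	char line[1024], name[NAME_LEN];
	size_t used = 0, i;

	while (fgets(line, sizeof(line), in))
		if (parse_function(line, name, sizeof(name)))
			used = tally_add(table, used, name);

	qsort(table, used, sizeof(table[0]), by_count_desc);
	for (i = 0; i < used; i++)
		fprintf(out, "%8lu %s\n", table[i].count, table[i].name);
	return used;
}
```

Calling tally_stream(stdin, stdout) from a small main() and feeding it out.txt should put a function that is being queued in a tight loop at the top of the list. Lines without a "function=" field are skipped.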
+7 -1
kernel/workqueue.c
···
 			return true;
 		spin_unlock_irq(&gcwq->lock);
 
-		/* CPU has come up in between, retry migration */
+		/*
+		 * We've raced with CPU hot[un]plug.  Give it a breather
+		 * and retry migration.  cond_resched() is required here;
+		 * otherwise, we might deadlock against cpu_stop trying to
+		 * bring down the CPU on non-preemptive kernel.
+		 */
 		cpu_relax();
+		cond_resched();
 	}
 }
 