workqueue: Document debugging tricks

It is not obvious how to debug runaway workers.

These are some tips given by Tejun on lkml.

Signed-off-by: Florian Mickler <florian@mickler.org>
Signed-off-by: Tejun Heo <tj@kernel.org>

Authored by Florian Mickler, committed by Tejun Heo (e2de9e08, 6aba74f2)

 Documentation/workqueue.txt | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
--- a/Documentation/workqueue.txt
+++ b/Documentation/workqueue.txt
@@ -12,6 +12,7 @@
 4. Application Programming Interface (API)
 5. Example Execution Scenarios
 6. Guidelines
+7. Debugging
 
 
 1. Introduction
@@ -380,3 +381,42 @@
 * Unless work items are expected to consume a huge amount of CPU
   cycles, using a bound wq is usually beneficial due to the increased
   level of locality in wq operations and work item execution.
+
+
+7. Debugging
+
+Because the work functions are executed by generic worker threads
+there are a few tricks needed to shed some light on misbehaving
+workqueue users.
+
+Worker threads show up in the process list as:
+
+root      5671  0.0  0.0      0     0 ?        S    12:07   0:00 [kworker/0:1]
+root      5672  0.0  0.0      0     0 ?        S    12:07   0:00 [kworker/1:2]
+root      5673  0.0  0.0      0     0 ?        S    12:12   0:00 [kworker/0:0]
+root      5674  0.0  0.0      0     0 ?        S    12:13   0:00 [kworker/1:0]
+
+If kworkers are going crazy (using too much cpu), there are two types
+of possible problems:
+
+	1. Something being scheduled in rapid succession
+	2. A single work item that consumes lots of cpu cycles
+
+The first one can be tracked using tracing:
+
+	$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
+	$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
+	(wait a few secs)
+	^C
+
+If something is busy looping on work queueing, it would be dominating
+the output and the offender can be determined with the work item
+function.
+
+For the second type of problem it should be possible to just check
+the stack trace of the offending worker thread.
+
+	$ cat /proc/THE_OFFENDING_KWORKER/stack
+
+The work item's function should be trivially visible in the stack
+trace.
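For the first class of problem, the raw trace_pipe capture can get large. As a rough sketch (not part of the patch — the sample trace lines, PIDs, and function names below are made up for illustration), standard text tools can tally which work function dominates the capture:

```shell
# Fabricated stand-in for a real trace_pipe capture of the
# workqueue:workqueue_queue_work event; only the "function=" field
# matters for this analysis.
cat > out.txt <<'EOF'
kworker/0:1-5671 [000] 100.000001: workqueue_queue_work: work struct=ffff8800 function=flush_to_ldisc workqueue=ffff8801 req_cpu=0 cpu=0
kworker/0:1-5671 [000] 100.000002: workqueue_queue_work: work struct=ffff8800 function=flush_to_ldisc workqueue=ffff8801 req_cpu=0 cpu=0
kworker/1:2-5672 [001] 100.000003: workqueue_queue_work: work struct=ffff8802 function=vmstat_update workqueue=ffff8801 req_cpu=1 cpu=1
EOF

# Count queueing events per work item function; the most frequently
# queued function is printed first and is the likely offender.
grep -o 'function=[^ ]*' out.txt | sort | uniq -c | sort -rn
```

If one function accounts for nearly all queueing events, that is the work item to investigate.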