workqueue: Document debugging tricks

It is not obvious how to debug runaway workers.

These are some tips given by Tejun on lkml.

Signed-off-by: Florian Mickler <florian@mickler.org>
Signed-off-by: Tejun Heo <tj@kernel.org>

Authored by Florian Mickler, committed by Tejun Heo (e2de9e08, 6aba74f2)

 Documentation/workqueue.txt | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
--- a/Documentation/workqueue.txt
+++ b/Documentation/workqueue.txt
@@ -12,6 +12,7 @@
 4. Application Programming Interface (API)
 5. Example Execution Scenarios
 6. Guidelines
+7. Debugging
 
 
 1. Introduction
@@ -380,3 +381,42 @@
 * Unless work items are expected to consume a huge amount of CPU
   cycles, using a bound wq is usually beneficial due to the increased
   level of locality in wq operations and work item execution.
+
+
+7. Debugging
+
+Because the work functions are executed by generic worker threads
+there are a few tricks needed to shed some light on misbehaving
+workqueue users.
+
+Worker threads show up in the process list as:
+
+root      5671  0.0  0.0      0     0 ?        S    12:07   0:00 [kworker/0:1]
+root      5672  0.0  0.0      0     0 ?        S    12:07   0:00 [kworker/1:2]
+root      5673  0.0  0.0      0     0 ?        S    12:12   0:00 [kworker/0:0]
+root      5674  0.0  0.0      0     0 ?        S    12:13   0:00 [kworker/1:0]
+
+If kworkers are going crazy (using too much cpu), there are two types
+of possible problems:
+
+	1. Something being scheduled in rapid succession
+	2. A single work item that consumes lots of cpu cycles
+
+The first one can be tracked using tracing:
+
+	$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
+	$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
+	(wait a few secs)
+	^C
+
+If something is busy looping on work queueing, it would be dominating
+the output and the offender can be determined with the work item
+function.
+
+For the second type of problem it should be possible to just check
+the stack trace of the offending worker thread.
+
+	$ cat /proc/THE_OFFENDING_KWORKER/stack
+
+The work item's function should be trivially visible in the stack
+trace.
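For the first class of problem, the raw trace_pipe capture can get large. As a rough sketch (not part of the patch — the sample trace lines, PIDs, and function names below are made up for illustration), standard text tools can tally which work function dominates the capture:

```shell
# Fabricated stand-in for a real trace_pipe capture of the
# workqueue:workqueue_queue_work event; only the "function=" field
# matters for this analysis.
cat > out.txt <<'EOF'
kworker/0:1-5671 [000] 100.000001: workqueue_queue_work: work struct=ffff8800 function=flush_to_ldisc workqueue=ffff8801 req_cpu=0 cpu=0
kworker/0:1-5671 [000] 100.000002: workqueue_queue_work: work struct=ffff8800 function=flush_to_ldisc workqueue=ffff8801 req_cpu=0 cpu=0
kworker/1:2-5672 [001] 100.000003: workqueue_queue_work: work struct=ffff8802 function=vmstat_update workqueue=ffff8801 req_cpu=1 cpu=1
EOF

# Count queueing events per work item function; the most frequently
# queued function is printed first and is the likely offender.
grep -o 'function=[^ ]*' out.txt | sort | uniq -c | sort -rn
```

If one function accounts for nearly all queueing events, that is the work item to investigate.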