Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

docs: add workload-tracing document to admin-guide

Add a new section to the admin-guide with information of interest to
application developers and system integrators doing analysis of the
Linux kernel for safety critical applications.

This section will contain documents supporting analysis of kernel
interactions with applications, and key kernel subsystems expectations.

Add a new workload-tracing document to this new section.

Signed-off-by: Shefali Sharma <sshefali021@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230131221105.39216-1-skhan@linuxfoundation.org
[jc: tweaked the sphinx formatting a bit]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Authored by Shuah Khan and committed by Jonathan Corbet
b7cb8405 00cba6b6

+617
+11
Documentation/admin-guide/index.rst
···

    sysfs-rules

+This is the beginning of a section with information of interest to
+application developers and system integrators doing analysis of the
+Linux kernel for safety critical applications. Documents supporting
+analysis of kernel interactions with applications, and key kernel
+subsystems expectations will be found here.
+
+.. toctree::
+   :maxdepth: 1
+
+   workload-tracing
+
 The rest of this manual consists of various unordered guides on how to
 configure specific aspects of kernel behavior to your liking.
+606
Documentation/admin-guide/workload-tracing.rst
···
+.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
+
+======================================================
+Discovering Linux kernel subsystems used by a workload
+======================================================
+
+:Authors: - Shuah Khan <skhan@linuxfoundation.org>
+          - Shefali Sharma <sshefali021@gmail.com>
+:maintained-by: Shuah Khan <skhan@linuxfoundation.org>
+
+Key Points
+==========
+
+* Understanding system resources necessary to build and run a workload
+  is important.
+* Linux tracing and strace can be used to discover the system resources
+  in use by a workload. The completeness of the system usage information
+  depends on the completeness of coverage of a workload.
+* Performance and security of the operating system can be analyzed with
+  the help of tools such as:
+  `perf <https://man7.org/linux/man-pages/man1/perf.1.html>`_,
+  `stress-ng <https://www.mankier.com/1/stress-ng>`_,
+  `paxtest <https://github.com/opntr/paxtest-freebsd>`_.
+* Once we discover and understand the workload needs, we can focus on them
+  to avoid regressions and use them to evaluate safety considerations.
+
+Methodology
+===========
+
+`strace <https://man7.org/linux/man-pages/man1/strace.1.html>`_ is a
+diagnostic, instructional, and debugging tool and can be used to discover
+the system resources in use by a workload. Once we discover and understand
+the workload needs, we can focus on them to avoid regressions and use them
+to evaluate safety considerations. We use the strace tool to trace workloads.
+
+This method of tracing using strace tells us the system calls invoked by
+the workload and doesn't include all the system calls that can be invoked
+by it. In addition, this tracing method tells us just the code paths within
+these system calls that are invoked.
+As an example, if a workload opens a
+file and reads from it successfully, then the success path is the one that
+is traced. Any error paths in that system call will not be traced. If a
+test run provides full coverage of a workload, then the method
+outlined here will trace and find all possible code paths. The completeness
+of the system usage information depends on the completeness of coverage of a
+workload.
+
+The goal is tracing a workload on a system running a default kernel without
+requiring custom kernel installs.
+
+How do we gather fine-grained system information?
+=================================================
+
+The strace tool can be used to trace the system calls made by a process and
+the signals it receives. System calls are the fundamental interface between an
+application and the operating system kernel. They enable a program to
+request services from the kernel. For instance, the open() system call in
+Linux is used to provide access to a file in the file system. strace enables
+us to track all the system calls made by an application. It lists all the
+system calls made by a process and their resulting output.
+
+You can generate profiling data by combining the strace and perf record tools
+to record the events and information associated with a process. This provides
+insight into the process. The "perf annotate" tool generates the statistics of
+each instruction of the program. This document goes over the details of how
+to gather fine-grained information on a workload's usage of system resources.
+
+We used strace to trace the perf, stress-ng, paxtest workloads to illustrate
+our methodology to discover resources used by a workload. This process can
+be applied to trace other workloads.
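The coverage caveat from the Methodology section can be made concrete with a small test program. The following is a minimal sketch, not part of the patch: the helper names `read_first_bytes` and `try_read` and the file names are hypothetical. Run under strace, it should show the open, read, and close system calls for the file that exists, and the failing open only because the script deliberately asks for a missing file. A workload that never does so would leave that error path untraced.

```python
import errno
import os
import tempfile


def read_first_bytes(path, count=16):
    """Open, read, and close a file through thin wrappers around system calls."""
    fd = os.open(path, os.O_RDONLY)  # usually appears as openat() in strace output
    try:
        return os.read(fd, count)    # read()
    finally:
        os.close(fd)                 # close()


def try_read(path):
    """Return ("ok", data) on the success path, ("error", errno name) otherwise."""
    try:
        return ("ok", read_first_bytes(path))
    except OSError as err:
        return ("error", errno.errorcode.get(err.errno, str(err.errno)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        present = os.path.join(tmp, "present.txt")
        with open(present, "wb") as handle:
            handle.write(b"hello")
        print(try_read(present))                            # success path
        print(try_read(os.path.join(tmp, "missing.txt")))   # error path
```

Saving this as a script and running it with `strace -c python3 <script>` would count both the successful and the failing open in the summary.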
+
+Getting the system ready for tracing
+====================================
+
+Before we can get started we will show you how to get your system ready.
+We assume that you have a Linux distribution running on a physical system
+or a virtual machine. Most distributions include the strace command. Let’s
+install the other tools needed to build the Linux kernel, which aren’t
+usually included. Please note that the following works on Debian based
+distributions. You might have to find equivalent packages on other Linux
+distributions.
+
+Install tools to build Linux kernel and tools in kernel repository.
+scripts/ver_linux is a good way to check if your system already has
+the necessary tools::
+
+  sudo apt-get install build-essential flex bison yacc
+  sudo apt install libelf-dev systemtap-sdt-dev libaudit-dev libslang2-dev libperl-dev libdw-dev
+
+cscope is a good tool to browse kernel sources. Let's install it now::
+
+  sudo apt-get install cscope
+
+Install stress-ng and paxtest::
+
+  sudo apt-get install stress-ng
+  sudo apt-get install paxtest
+
+Workload overview
+=================
+
+As mentioned earlier, we used strace to trace perf bench, stress-ng and
+paxtest workloads to show how to analyze a workload and identify Linux
+subsystems used by these workloads. Let's start with an overview of these
+three workloads to get a better understanding of what they do and how to
+use them.
+
+perf bench (all) workload
+-------------------------
+
+The perf bench command contains multiple multi-threaded microkernel
+benchmarks for executing different subsystems in the Linux kernel and
+system calls. This allows us to easily measure the impact of changes,
+which can help mitigate performance regressions.
+It also acts as a common
+benchmarking framework, enabling developers to easily create test cases,
+integrate transparently, and use performance-rich tooling subsystems.
+
+Stress-ng netdev stressor workload
+----------------------------------
+
+stress-ng is used for performing stress testing on the kernel. It allows
+you to exercise various physical subsystems of the computer, as well as
+interfaces of the OS kernel, using "stressor-s". They are available for
+CPU, CPU cache, devices, I/O, interrupts, file system, memory, network,
+operating system, pipelines, schedulers, and virtual machines. Please refer
+to the `stress-ng man-page <https://www.mankier.com/1/stress-ng>`_ to
+find the description of all the available stressor-s. The netdev stressor
+starts a specified number (N) of workers that exercise various netdevice
+ioctl commands across all the available network devices.
+
+paxtest kiddie workload
+-----------------------
+
+paxtest is a program that tests buffer overflows in the kernel. It tests
+kernel enforcements over memory usage. Generally, execution in some memory
+segments makes buffer overflows possible. It runs a set of programs that
+attempt to subvert memory usage. It is used as a regression test suite for
+PaX, but might be useful to test other memory protection patches for the
+kernel. We used paxtest kiddie mode which looks for simple vulnerabilities.
+
+What is strace and how do we use it?
+====================================
+
+As mentioned earlier, strace is a useful diagnostic, instructional,
+and debugging tool that can be used to discover the system resources in use
+by a workload. It can be used:
+
+* To see how a process interacts with the kernel.
+* To see why a process is failing or hanging.
+* For reverse engineering a process.
+* To find the files on which a program depends.
+* For analyzing the performance of an application.
+* For troubleshooting various problems related to the operating system.
+
+In addition, strace can generate run-time statistics on times, calls, and
+errors for each system call and report a summary when the program exits,
+suppressing the regular output. This attempts to show system time (CPU time
+spent running in the kernel) independent of wall clock time. We plan to use
+these features to get information on workload system usage.
+
+The strace command supports basic, verbose, and stats modes. When run in
+verbose mode, strace gives more detailed information about the system calls
+invoked by a process.
+
+Running strace -c generates a report of the percentage of time spent in each
+system call, the total time in seconds, the microseconds per call, the total
+number of calls, the count of each system call that has failed with an error
+and the type of system call made.
+
+* Usage: strace <command we want to trace>
+* Verbose mode usage: strace -v <command>
+* Gather statistics: strace -c <command>
+
+We used the “-c” option to gather fine-grained run-time statistics on the
+system calls used by the three workloads we have chosen for this analysis.
+
+* perf
+* stress-ng
+* paxtest
+
+What is cscope and how do we use it?
+====================================
+
+Now let’s look at `cscope <https://cscope.sourceforge.net/>`_, a command
+line tool for browsing C, C++ or Java code-bases. We can use it to find
+all the references to a symbol, global definitions, functions called by a
+function, functions calling a function, text strings, regular expression
+patterns, files including a file.
+
+We can use cscope to find which system call belongs to which subsystem.
+This way we can find the kernel subsystems used by a process when it is
+executed.
+
+Let’s check out the latest Linux repository and build the cscope database::
+
+  git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux
+  cd linux
+  cscope -R -p10  # builds cscope.out database before starting browse session
+  cscope -d -p10  # starts browse session on cscope.out database
+
+Note: Run "cscope -R -p10" to build the database and "cscope -d -p10" to
+enter into the browsing session. cscope by default uses the cscope.out
+database. To get out of this mode press ctrl+d. The -p option is used to
+specify the number of file path components to display. -p10 is optimal for
+browsing kernel sources.
+
+What is perf and how do we use it?
+==================================
+
+Perf is an analysis tool based on Linux 2.6+ systems, which abstracts the
+CPU hardware difference in performance measurement in Linux, and provides
+a simple command line interface. Perf is based on the perf_events interface
+exported by the kernel. It is very useful for profiling the system and
+finding performance bottlenecks in an application.
+
+If you haven't already checked out the Linux mainline repository, you can do
+so and then build the kernel and the perf tool::
+
+  git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux
+  cd linux
+  make -j3 all
+  cd tools/perf
+  make
+
+Note: The perf command can be built without building the kernel in the
+repository and can be run on older kernels. However matching the kernel
+and perf revisions gives more accurate information on the subsystem usage.
+
+We used the "perf stat" and "perf bench" options. For detailed information on
+the perf tool, run "perf -h".
+
+perf stat
+---------
+The perf stat command generates a report of various hardware and software
+events. It does so with the help of hardware counter registers found in
+modern CPUs that keep the count of these activities. For example,
+"perf stat cal" shows the stats for the cal command.
+
+perf bench
+----------
+The perf bench command contains multiple multi-threaded microkernel
+benchmarks for executing different subsystems in the Linux kernel and
+system calls. This allows us to easily measure the impact of changes,
+which can help mitigate performance regressions. It also acts as a common
+benchmarking framework, enabling developers to easily create test cases,
+integrate transparently, and use performance-rich tooling.
+
+The "perf bench all" command runs the following benchmarks:
+
+* sched/messaging
+* sched/pipe
+* syscall/basic
+* mem/memcpy
+* mem/memset
+
+What is stress-ng and how do we use it?
+=======================================
+
+As mentioned earlier, stress-ng is used for performing stress testing on
+the kernel. It allows you to exercise various physical subsystems of the
+computer, as well as interfaces of the OS kernel, using stressor-s. They
+are available for CPU, CPU cache, devices, I/O, interrupts, file system,
+memory, network, operating system, pipelines, schedulers, and virtual
+machines.
+
+The netdev stressor starts N workers that exercise various netdevice ioctl
+commands across all the available network devices. The following ioctls are
+exercised:
+
+* SIOCGIFCONF, SIOCGIFINDEX, SIOCGIFNAME, SIOCGIFFLAGS
+* SIOCGIFADDR, SIOCGIFNETMASK, SIOCGIFMETRIC, SIOCGIFMTU
+* SIOCGIFHWADDR, SIOCGIFMAP, SIOCGIFTXQLEN
+
+The following command runs the stressor::
+
+  stress-ng --netdev 1 -t 60 --metrics
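To give a feel for what one of the netdevice ioctls listed above looks like from user space, here is a minimal sketch that queries the MTU of each interface with SIOCGIFMTU. It is an illustration only and is not how stress-ng itself is implemented. The helper names are hypothetical, the request number is the one defined in `<linux/sockios.h>`, and the 40-byte `struct ifreq` layout is an assumption that holds on 64-bit Linux.

```python
import fcntl
import socket
import struct

SIOCGIFMTU = 0x8921      # request number from <linux/sockios.h>
IFREQ_FORMAT = "16s24x"  # assumption: 16-byte name + 24-byte union (64-bit Linux)


def pack_ifreq(ifname):
    """Build a zero-filled struct ifreq carrying only the interface name."""
    name = ifname.encode()
    if len(name) > 15:  # leave room for the terminating NUL byte
        raise ValueError("interface name too long")
    return struct.pack(IFREQ_FORMAT, name)


def unpack_mtu(ifreq):
    """Read ifr_mtu, an int stored right after the 16-byte name field."""
    return struct.unpack_from("i", ifreq, 16)[0]


def get_mtu(ifname):
    """Issue one SIOCGIFMTU ioctl on a throwaway datagram socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        reply = fcntl.ioctl(sock.fileno(), SIOCGIFMTU, pack_ifreq(ifname))
    return unpack_mtu(reply)


if __name__ == "__main__":
    for _index, name in socket.if_nameindex():
        print(name, get_mtu(name))
```

Running this script under `strace -c` should show a handful of `socket`, `ioctl`, and `close` calls, which is a much smaller footprint than the stressor but touches the same kernel interface.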
+
+We can use the perf record command to record the events and information
+associated with a process. This command records the profiling data in the
+perf.data file in the same directory.
+
+Using the following commands you can record the events associated with the
+netdev stressor, view the report generated from perf.data, and annotate it
+to view the statistics of each instruction of the program::
+
+  perf record stress-ng --netdev 1 -t 60 --metrics
+  perf report
+  perf annotate
+
+What is paxtest and how do we use it?
+=====================================
+
+paxtest is a program that tests buffer overflows in the kernel. It tests
+kernel enforcements over memory usage. Generally, execution in some memory
+segments makes buffer overflows possible. It runs a set of programs that
+attempt to subvert memory usage. It is used as a regression test suite for
+PaX, and will be useful to test other memory protection patches for the
+kernel.
+
+paxtest provides kiddie and blackhat modes. The paxtest kiddie mode runs
+in normal mode, whereas the blackhat mode tries to get around the protection
+of the kernel testing for vulnerabilities. We focus on the kiddie mode here
+and combine "paxtest kiddie" run with "perf record" to collect CPU stack
+traces for the paxtest kiddie run to see which function is calling other
+functions in the performance profile. Then the "dwarf" (DWARF's Call Frame
+Information) mode can be used to unwind the stack.
+
+The following commands can be used to view the resulting report in call-graph
+format::
+
+  perf record --call-graph dwarf paxtest kiddie
+  perf report --stdio
+
+Tracing workloads
+=================
+
+Now that we understand the workloads, let's start tracing them.
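Each trace in this section is summarized as a table that pairs a system call with its call count and a Linux subsystem. The sketch below shows one way that bookkeeping could be automated. It is an assumption-laden illustration, not the authors' tooling: the function names are hypothetical, the mapping is only a subset of the pairs used in this document's tables, and the sample text follows the column layout of `strace -c` with made-up numbers.

```python
from collections import Counter

# Subset of the syscall-to-subsystem pairs used in this document's tables.
SUBSYSTEM = {
    "openat": "Filesystem",
    "read": "Filesystem",
    "close": "Filesystem",
    "mmap": "Memory Mgmt.",
    "brk": "Memory Mgmt.",
    "clone": "Process Mgmt.",
    "getpid": "Process Mgmt.",
    "rt_sigaction": "Signal",
    "wait4": "Time",
}

# Illustrative input only: the layout follows "strace -c", the values are made up.
SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.00    0.000500          11        44           wait4
 30.00    0.000300           7        38           clone
 15.00    0.000150          10        14         2 openat
  5.00    0.000050           7         7           mmap
------ ----------- ----------- --------- --------- ----------------
100.00    0.001000                   103         2 total
"""


def parse_strace_summary(text):
    """Return {syscall: calls} from the summary table printed by strace -c."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[-1] == "total":
            continue  # blank line, or the closing totals row
        try:
            float(fields[0])        # "% time" column; fails on header/separator
            calls = int(fields[3])  # "calls" column
        except ValueError:
            continue
        counts[fields[-1]] = calls  # the syscall name is always the last column
    return counts


def by_subsystem(counts):
    """Sum call counts per subsystem; unknown syscalls go under "Unmapped"."""
    totals = Counter()
    for name, calls in counts.items():
        totals[SUBSYSTEM.get(name, "Unmapped")] += calls
    return totals


if __name__ == "__main__":
    for subsystem, calls in by_subsystem(parse_strace_summary(SAMPLE)).most_common():
        print(f"{subsystem:15} {calls}")
```

Since strace prints the summary on standard error, a run such as `strace -c -o summary.txt <command>` is a convenient way to capture it in a file for this kind of post-processing. Syscalls reported as "Unmapped" are the ones still to be looked up with cscope.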
+
+Tracing perf bench all workload
+-------------------------------
+
+Run the following command to trace perf bench all workload::
+
+  strace -c perf bench all
+
+**System Calls made by the workload**
+
+The below table shows the system calls invoked by the workload, number of
+times each system call is invoked, and the corresponding Linux subsystem.
+
++-------------------+-----------+-----------------+-------------------------+
+| System Call       | # calls   | Linux Subsystem | System Call (API)       |
++===================+===========+=================+=========================+
+| getppid           | 10000001  | Process Mgmt.   | sys_getppid()           |
++-------------------+-----------+-----------------+-------------------------+
+| clone             | 1077      | Process Mgmt.   | sys_clone()             |
++-------------------+-----------+-----------------+-------------------------+
+| prctl             | 23        | Process Mgmt.   | sys_prctl()             |
++-------------------+-----------+-----------------+-------------------------+
+| prlimit64         | 7         | Process Mgmt.   | sys_prlimit64()         |
++-------------------+-----------+-----------------+-------------------------+
+| getpid            | 10        | Process Mgmt.   | sys_getpid()            |
++-------------------+-----------+-----------------+-------------------------+
+| uname             | 3         | Process Mgmt.   | sys_uname()             |
++-------------------+-----------+-----------------+-------------------------+
+| sysinfo           | 1         | Process Mgmt.   | sys_sysinfo()           |
++-------------------+-----------+-----------------+-------------------------+
+| getuid            | 1         | Process Mgmt.   | sys_getuid()            |
++-------------------+-----------+-----------------+-------------------------+
+| getgid            | 1         | Process Mgmt.   | sys_getgid()            |
++-------------------+-----------+-----------------+-------------------------+
+| geteuid           | 1         | Process Mgmt.   | sys_geteuid()           |
++-------------------+-----------+-----------------+-------------------------+
+| getegid           | 1         | Process Mgmt.   | sys_getegid()           |
++-------------------+-----------+-----------------+-------------------------+
+| close             | 49951     | Filesystem      | sys_close()             |
++-------------------+-----------+-----------------+-------------------------+
+| pipe              | 604       | Filesystem      | sys_pipe()              |
++-------------------+-----------+-----------------+-------------------------+
+| openat            | 48560     | Filesystem      | sys_openat()            |
++-------------------+-----------+-----------------+-------------------------+
+| fstat             | 8338      | Filesystem      | sys_fstat()             |
++-------------------+-----------+-----------------+-------------------------+
+| stat              | 1573      | Filesystem      | sys_stat()              |
++-------------------+-----------+-----------------+-------------------------+
+| pread64           | 9646      | Filesystem      | sys_pread64()           |
++-------------------+-----------+-----------------+-------------------------+
+| getdents64        | 1873      | Filesystem      | sys_getdents64()        |
++-------------------+-----------+-----------------+-------------------------+
+| access            | 3         | Filesystem      | sys_access()            |
++-------------------+-----------+-----------------+-------------------------+
+| lstat             | 1880      | Filesystem      | sys_lstat()             |
++-------------------+-----------+-----------------+-------------------------+
+| lseek             | 6         | Filesystem      | sys_lseek()             |
++-------------------+-----------+-----------------+-------------------------+
+| ioctl             | 3         | Filesystem      | sys_ioctl()             |
++-------------------+-----------+-----------------+-------------------------+
+| dup2              | 1         | Filesystem      | sys_dup2()              |
++-------------------+-----------+-----------------+-------------------------+
+| execve            | 2         | Filesystem      | sys_execve()            |
++-------------------+-----------+-----------------+-------------------------+
+| fcntl             | 8779      | Filesystem      | sys_fcntl()             |
++-------------------+-----------+-----------------+-------------------------+
+| statfs            | 1         | Filesystem      | sys_statfs()            |
++-------------------+-----------+-----------------+-------------------------+
+| epoll_create      | 2         | Filesystem      | sys_epoll_create()      |
++-------------------+-----------+-----------------+-------------------------+
+| epoll_ctl         | 64        | Filesystem      | sys_epoll_ctl()         |
++-------------------+-----------+-----------------+-------------------------+
+| newfstatat        | 8318      | Filesystem      | sys_newfstatat()        |
++-------------------+-----------+-----------------+-------------------------+
+| eventfd2          | 192       | Filesystem      | sys_eventfd2()          |
++-------------------+-----------+-----------------+-------------------------+
+| mmap              | 243       | Memory Mgmt.    | sys_mmap()              |
++-------------------+-----------+-----------------+-------------------------+
+| mprotect          | 32        | Memory Mgmt.    | sys_mprotect()          |
++-------------------+-----------+-----------------+-------------------------+
+| brk               | 21        | Memory Mgmt.    | sys_brk()               |
++-------------------+-----------+-----------------+-------------------------+
+| munmap            | 128       | Memory Mgmt.    | sys_munmap()            |
++-------------------+-----------+-----------------+-------------------------+
+| set_mempolicy     | 156       | Memory Mgmt.    | sys_set_mempolicy()     |
++-------------------+-----------+-----------------+-------------------------+
+| set_tid_address   | 1         | Process Mgmt.   | sys_set_tid_address()   |
++-------------------+-----------+-----------------+-------------------------+
+| set_robust_list   | 1         | Futex           | sys_set_robust_list()   |
++-------------------+-----------+-----------------+-------------------------+
+| futex             | 341       | Futex           | sys_futex()             |
++-------------------+-----------+-----------------+-------------------------+
+| sched_getaffinity | 79        | Scheduler       | sys_sched_getaffinity() |
++-------------------+-----------+-----------------+-------------------------+
+| sched_setaffinity | 223       | Scheduler       | sys_sched_setaffinity() |
++-------------------+-----------+-----------------+-------------------------+
+| socketpair        | 202       | Network         | sys_socketpair()        |
++-------------------+-----------+-----------------+-------------------------+
+| rt_sigprocmask    | 21        | Signal          | sys_rt_sigprocmask()    |
++-------------------+-----------+-----------------+-------------------------+
+| rt_sigaction      | 36        | Signal          | sys_rt_sigaction()      |
++-------------------+-----------+-----------------+-------------------------+
+| rt_sigreturn      | 2         | Signal          | sys_rt_sigreturn()      |
++-------------------+-----------+-----------------+-------------------------+
+| wait4             | 889       | Time            | sys_wait4()             |
++-------------------+-----------+-----------------+-------------------------+
+| clock_nanosleep   | 37        | Time            | sys_clock_nanosleep()   |
++-------------------+-----------+-----------------+-------------------------+
+| capget            | 4         | Capability      | sys_capget()            |
++-------------------+-----------+-----------------+-------------------------+
+
+Tracing stress-ng netdev stressor workload
+------------------------------------------
+
+Run the following command to trace stress-ng netdev stressor workload::
+
+  strace -c stress-ng --netdev 1 -t 60 --metrics
+
+**System Calls made by the workload**
+
+The below table
+shows the system calls invoked by the workload, number of
+times each system call is invoked, and the corresponding Linux subsystem.
+
++-------------------+-----------+-----------------+-------------------------+
+| System Call       | # calls   | Linux Subsystem | System Call (API)       |
++===================+===========+=================+=========================+
+| openat            | 74        | Filesystem      | sys_openat()            |
++-------------------+-----------+-----------------+-------------------------+
+| close             | 75        | Filesystem      | sys_close()             |
++-------------------+-----------+-----------------+-------------------------+
+| read              | 58        | Filesystem      | sys_read()              |
++-------------------+-----------+-----------------+-------------------------+
+| fstat             | 20        | Filesystem      | sys_fstat()             |
++-------------------+-----------+-----------------+-------------------------+
+| flock             | 10        | Filesystem      | sys_flock()             |
++-------------------+-----------+-----------------+-------------------------+
+| write             | 7         | Filesystem      | sys_write()             |
++-------------------+-----------+-----------------+-------------------------+
+| getdents64        | 8         | Filesystem      | sys_getdents64()        |
++-------------------+-----------+-----------------+-------------------------+
+| pread64           | 8         | Filesystem      | sys_pread64()           |
++-------------------+-----------+-----------------+-------------------------+
+| lseek             | 1         | Filesystem      | sys_lseek()             |
++-------------------+-----------+-----------------+-------------------------+
+| access            | 2         | Filesystem      | sys_access()            |
++-------------------+-----------+-----------------+-------------------------+
+| getcwd            | 1         | Filesystem      | sys_getcwd()            |
++-------------------+-----------+-----------------+-------------------------+
+| execve            | 1         | Filesystem      | sys_execve()            |
++-------------------+-----------+-----------------+-------------------------+
+| mmap              | 61        | Memory Mgmt.    | sys_mmap()              |
++-------------------+-----------+-----------------+-------------------------+
+| munmap            | 3         | Memory Mgmt.    | sys_munmap()            |
++-------------------+-----------+-----------------+-------------------------+
+| mprotect          | 20        | Memory Mgmt.    | sys_mprotect()          |
++-------------------+-----------+-----------------+-------------------------+
+| mlock             | 2         | Memory Mgmt.    | sys_mlock()             |
++-------------------+-----------+-----------------+-------------------------+
+| brk               | 3         | Memory Mgmt.    | sys_brk()               |
++-------------------+-----------+-----------------+-------------------------+
+| rt_sigaction      | 21        | Signal          | sys_rt_sigaction()      |
++-------------------+-----------+-----------------+-------------------------+
+| rt_sigprocmask    | 1         | Signal          | sys_rt_sigprocmask()    |
++-------------------+-----------+-----------------+-------------------------+
+| sigaltstack       | 1         | Signal          | sys_sigaltstack()       |
++-------------------+-----------+-----------------+-------------------------+
+| rt_sigreturn      | 1         | Signal          | sys_rt_sigreturn()      |
++-------------------+-----------+-----------------+-------------------------+
+| getpid            | 8         | Process Mgmt.   | sys_getpid()            |
++-------------------+-----------+-----------------+-------------------------+
+| prlimit64         | 5         | Process Mgmt.   | sys_prlimit64()         |
++-------------------+-----------+-----------------+-------------------------+
+| arch_prctl        | 2         | Process Mgmt.   | sys_arch_prctl()        |
++-------------------+-----------+-----------------+-------------------------+
+| sysinfo           | 2         | Process Mgmt.   | sys_sysinfo()           |
++-------------------+-----------+-----------------+-------------------------+
+| getuid            | 2         | Process Mgmt.   | sys_getuid()            |
++-------------------+-----------+-----------------+-------------------------+
+| uname             | 1         | Process Mgmt.   | sys_uname()             |
++-------------------+-----------+-----------------+-------------------------+
+| setpgid           | 1         | Process Mgmt.   | sys_setpgid()           |
++-------------------+-----------+-----------------+-------------------------+
+| getrusage         | 1         | Process Mgmt.   | sys_getrusage()         |
++-------------------+-----------+-----------------+-------------------------+
+| geteuid           | 1         | Process Mgmt.   | sys_geteuid()           |
++-------------------+-----------+-----------------+-------------------------+
+| getppid           | 1         | Process Mgmt.   | sys_getppid()           |
++-------------------+-----------+-----------------+-------------------------+
+| sendto            | 3         | Network         | sys_sendto()            |
++-------------------+-----------+-----------------+-------------------------+
+| connect           | 1         | Network         | sys_connect()           |
++-------------------+-----------+-----------------+-------------------------+
+| socket            | 1         | Network         | sys_socket()            |
++-------------------+-----------+-----------------+-------------------------+
+| clone             | 1         | Process Mgmt.   | sys_clone()             |
++-------------------+-----------+-----------------+-------------------------+
+| set_tid_address   | 1         | Process Mgmt.   | sys_set_tid_address()   |
++-------------------+-----------+-----------------+-------------------------+
+| wait4             | 2         | Time            | sys_wait4()             |
++-------------------+-----------+-----------------+-------------------------+
+| alarm             | 1         | Time            | sys_alarm()             |
++-------------------+-----------+-----------------+-------------------------+
+| set_robust_list   | 1         | Futex           | sys_set_robust_list()   |
++-------------------+-----------+-----------------+-------------------------+
+
+Tracing paxtest kiddie workload
+-------------------------------
+
+Run the following command to trace paxtest kiddie workload::
+
+  strace -c paxtest kiddie
+
+**System Calls made by the workload**
+
+The below table shows the system calls invoked by the workload, number of
+times each system call is invoked, and the corresponding Linux subsystem.
+
++-------------------+-----------+-----------------+----------------------+
+| System Call       | # calls   | Linux Subsystem | System Call (API)    |
++===================+===========+=================+======================+
+| read              | 3         | Filesystem      | sys_read()           |
++-------------------+-----------+-----------------+----------------------+
+| write             | 11        | Filesystem      | sys_write()          |
++-------------------+-----------+-----------------+----------------------+
+| close             | 41        | Filesystem      | sys_close()          |
++-------------------+-----------+-----------------+----------------------+
+| stat              | 24        | Filesystem      | sys_stat()           |
++-------------------+-----------+-----------------+----------------------+
+| fstat             | 2         | Filesystem      | sys_fstat()          |
++-------------------+-----------+-----------------+----------------------+
+| pread64           | 6         | Filesystem      | sys_pread64()        |
++-------------------+-----------+-----------------+----------------------+
+| access            | 1         | Filesystem      | sys_access()         |
++-------------------+-----------+-----------------+----------------------+
+| pipe              | 1         | Filesystem      | sys_pipe()           |
++-------------------+-----------+-----------------+----------------------+
+| dup2              | 24        | Filesystem      | sys_dup2()           |
++-------------------+-----------+-----------------+----------------------+
+| execve            | 1         | Filesystem      | sys_execve()         |
++-------------------+-----------+-----------------+----------------------+
+| fcntl             | 26        | Filesystem      | sys_fcntl()          |
++-------------------+-----------+-----------------+----------------------+
+| openat            | 14        | Filesystem      | sys_openat()         |
++-------------------+-----------+-----------------+----------------------+
+| rt_sigaction      | 7         | Signal          | sys_rt_sigaction()   |
++-------------------+-----------+-----------------+----------------------+
+| rt_sigreturn      | 38        | Signal          | sys_rt_sigreturn()   |
++-------------------+-----------+-----------------+----------------------+
+| clone             | 38        | Process Mgmt.   | sys_clone()          |
++-------------------+-----------+-----------------+----------------------+
+| wait4             | 44        | Time            | sys_wait4()          |
++-------------------+-----------+-----------------+----------------------+
+| mmap              | 7         | Memory Mgmt.    | sys_mmap()           |
++-------------------+-----------+-----------------+----------------------+
+| mprotect          | 3         | Memory Mgmt.    | sys_mprotect()       |
++-------------------+-----------+-----------------+----------------------+
+| munmap            | 1         | Memory Mgmt.    | sys_munmap()         |
++-------------------+-----------+-----------------+----------------------+
+| brk               | 3         | Memory Mgmt.    | sys_brk()            |
++-------------------+-----------+-----------------+----------------------+
+| getpid            | 1         | Process Mgmt.   | sys_getpid()         |
++-------------------+-----------+-----------------+----------------------+
+| getuid            | 1         | Process Mgmt.   | sys_getuid()         |
++-------------------+-----------+-----------------+----------------------+
+| getgid            | 1         | Process Mgmt.   | sys_getgid()         |
++-------------------+-----------+-----------------+----------------------+
+| geteuid           | 2         | Process Mgmt.   | sys_geteuid()        |
++-------------------+-----------+-----------------+----------------------+
+| getegid           | 1         | Process Mgmt.   | sys_getegid()        |
++-------------------+-----------+-----------------+----------------------+
+| getppid           | 1         | Process Mgmt.   | sys_getppid()        |
++-------------------+-----------+-----------------+----------------------+
+| arch_prctl        | 2         | Process Mgmt.   | sys_arch_prctl()     |
++-------------------+-----------+-----------------+----------------------+
+
+Conclusion
+==========
+
+This document is intended to be used as a guide on how to gather fine-grained
+information on the resources in use by workloads using strace.
+
+References
+==========
+
+* `Discovery Linux Kernel Subsystems used by OpenAPS <https://elisa.tech/blog/2022/02/02/discovery-linux-kernel-subsystems-used-by-openaps>`_
+* `ELISA-White-Papers-Discovering Linux kernel subsystems used by a workload <https://github.com/elisa-tech/ELISA-White-Papers/blob/master/Processes/Discovering_Linux_kernel_subsystems_used_by_a_workload.md>`_
+* `strace <https://man7.org/linux/man-pages/man1/strace.1.html>`_
+* `perf <https://man7.org/linux/man-pages/man1/perf.1.html>`_
+* `paxtest README <https://github.com/opntr/paxtest-freebsd/blob/hardenedbsd/0.9.14-hbsd/README>`_
+* `stress-ng <https://www.mankier.com/1/stress-ng>`_
+* `Monitoring and managing system status and performance <https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/index>`_