Documentation for /proc/sys/kernel/*    kernel version 2.2.10
        (c) 1998, 1999,  Rik van Riel <riel@nl.linux.org>
        (c) 2009,        Shen Feng<shen@cn.fujitsu.com>

For general info and legal blurb, please look in README.

==============================================================

This file contains documentation for the sysctl files in
/proc/sys/kernel/ and is valid for Linux kernel version 2.2.

The files in this directory can be used to tune and monitor
miscellaneous and general things in the operation of the Linux
kernel. Since some of the files _can_ be used to screw up your
system, it is advisable to read both documentation and source
before actually making adjustments.

Currently, these files might (depending on your configuration)
show up in /proc/sys/kernel:

- acct
- acpi_video_flags
- auto_msgmni
- bootloader_type             [ X86 only ]
- bootloader_version          [ X86 only ]
- callhome                    [ S390 only ]
- cap_last_cap
- core_pattern
- core_pipe_limit
- core_uses_pid
- ctrl-alt-del
- dmesg_restrict
- domainname
- hostname
- hotplug
- hardlockup_all_cpu_backtrace
- hung_task_panic
- hung_task_check_count
- hung_task_timeout_secs
- hung_task_warnings
- kexec_load_disabled
- kptr_restrict
- kstack_depth_to_print       [ X86 only ]
- l2cr                        [ PPC only ]
- modprobe                    ==> Documentation/debugging-modules.txt
- modules_disabled
- msg_next_id                 [ sysv ipc ]
- msgmax
- msgmnb
- msgmni
- nmi_watchdog
- osrelease
- ostype
- overflowgid
- overflowuid
- panic
- panic_on_oops
- panic_on_stackoverflow
- panic_on_unrecovered_nmi
- panic_on_warn
- perf_cpu_time_max_percent
- perf_event_paranoid
- pid_max
- powersave-nap               [ PPC only ]
- printk
- printk_delay
- printk_ratelimit
- printk_ratelimit_burst
- pty                         ==> Documentation/filesystems/devpts.txt
- randomize_va_space
- real-root-dev               ==> Documentation/initrd.txt
- reboot-cmd                  [ SPARC only ]
- rtsig-max
- rtsig-nr
- sem
- sem_next_id                 [ sysv ipc ]
- sg-big-buff                 [ generic SCSI device (sg) ]
- shm_next_id                 [ sysv ipc ]
- shm_rmid_forced
- shmall
- shmmax                      [ sysv ipc ]
- shmmni
- softlockup_all_cpu_backtrace
- soft_watchdog
- stop-a                      [ SPARC only ]
- sysrq                       ==> Documentation/sysrq.txt
- sysctl_writes_strict
- tainted
- threads-max
- unknown_nmi_panic
- watchdog
- watchdog_thresh
- version

==============================================================

acct:

highwater lowwater frequency

If BSD-style process accounting is enabled these values control
its behaviour. If free space on the filesystem where the log lives
goes below <lowwater>% accounting suspends. If free space gets
above <highwater>% accounting resumes. <Frequency> determines
how often we check the amount of free space (value is in
seconds). Default:
4 2 30
That is, suspend accounting if <= 2% free space is left; resume it
once >= 4% is free; consider information about the amount of free
space valid for 30 seconds.

==============================================================

acpi_video_flags:

flags

See Doc*/kernel/power/video.txt, it allows mode of video boot to be
set during run time.

==============================================================

auto_msgmni:

This variable has no effect and may be removed in future kernel
releases. Reading it always returns 0.
Up to Linux 3.17, it enabled/disabled automatic recomputing of msgmni
upon memory add/remove or upon ipc namespace creation/removal.
Echoing "1" into this file enabled msgmni automatic recomputing.
Echoing "0" turned it off. auto_msgmni default value was 1.
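
The suspend/resume rule described in the acct section above is a simple
hysteresis. The following is a minimal Python sketch of that rule only,
assuming the default "4 2 30" values; the function name acct_next_state
is hypothetical and this is not the kernel's implementation.

```python
def acct_next_state(active, free_percent, highwater=4, lowwater=2):
    """Return whether accounting is active after one free-space check.

    Models only the rule stated in the text: suspend when free space is
    at or below lowwater, resume when it is at or above highwater, and
    otherwise keep the current state. The name and signature are
    illustrative assumptions.
    """
    if active and free_percent <= lowwater:
        return False
    if not active and free_percent >= highwater:
        return True
    return active
```

Between the two thresholds the state does not change, which is what keeps
accounting from flapping when free space hovers around a single limit.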

==============================================================

bootloader_type:

x86 bootloader identification

This gives the bootloader type number as indicated by the bootloader,
shifted left by 4, and OR'd with the low four bits of the bootloader
version.  The reason for this encoding is that this used to match the
type_of_loader field in the kernel header; the encoding is kept for
backwards compatibility.  That is, if the full bootloader type number
is 0x15 and the full version number is 0x234, this file will contain
the value 340 = 0x154.

See the type_of_loader and ext_loader_type fields in
Documentation/x86/boot.txt for additional information.

==============================================================

bootloader_version:

x86 bootloader version

The complete bootloader version number.  In the example above, this
file will contain the value 564 = 0x234.

See the type_of_loader and ext_loader_ver fields in
Documentation/x86/boot.txt for additional information.

==============================================================

callhome:

Controls the kernel's callhome behavior in case of a kernel panic.

The s390 hardware allows an operating system to send a notification
to a service organization (callhome) in case of an operating system panic.

When the value in this file is 0 (which is the default behavior)
nothing happens in case of a kernel panic. If this value is set to "1"
the complete kernel oops message is sent to the IBM customer service
organization in case the mainframe the Linux operating system is running
on has a service contract with IBM.

==============================================================

cap_last_cap

Highest valid capability of the running kernel.  Exports
CAP_LAST_CAP from the kernel.
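
The bootloader_type encoding described above can be checked with a few
lines of Python. This is only a sketch of the arithmetic given in the
text; the function name bootloader_type_value is a hypothetical helper.

```python
def bootloader_type_value(loader_type, loader_version):
    """Encode as described in the text: the type shifted left by 4,
    OR'd with the low four bits of the version.

    Hypothetical helper for illustration, not a kernel interface.
    """
    return (loader_type << 4) | (loader_version & 0xF)


# The example from the text: type 0x15, version 0x234.
value = bootloader_type_value(0x15, 0x234)
print(value, hex(value))  # 340 0x154
```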

==============================================================

core_pattern:

core_pattern is used to specify a core dumpfile pattern name.
. max length 128 characters; default value is "core"
. core_pattern is used as a pattern template for the output filename;
  certain string patterns (beginning with '%') are substituted with
  their actual values.
. backward compatibility with core_uses_pid:
        If core_pattern does not include "%p" (default does not)
        and core_uses_pid is set, then .PID will be appended to
        the filename.
. corename format specifiers:
        %<NUL>  '%' is dropped
        %%      output one '%'
        %p      pid
        %P      global pid (init PID namespace)
        %i      tid
        %I      global tid (init PID namespace)
        %u      uid (in initial user namespace)
        %g      gid (in initial user namespace)
        %d      dump mode, matches PR_SET_DUMPABLE and
                /proc/sys/fs/suid_dumpable
        %s      signal number
        %t      UNIX time of dump
        %h      hostname
        %e      executable filename (may be shortened)
        %E      executable path
        %<OTHER> both are dropped
. If the first character of the pattern is a '|', the kernel will treat
  the rest of the pattern as a command to run.  The core dump will be
  written to the standard input of that program instead of to a file.

==============================================================

core_pipe_limit:

This sysctl is only applicable when core_pattern is configured to pipe
core files to a user space helper (when the first character of
core_pattern is a '|', see above).  When collecting cores via a pipe
to an application, it is occasionally useful for the collecting
application to gather data about the crashing process from its
/proc/pid directory.  In order to do this safely, the kernel must wait
for the collecting process to exit, so as not to remove the crashing
process's proc files prematurely.
This in turn creates the
possibility that a misbehaving userspace collecting process can block
the reaping of a crashed process simply by never exiting.  This sysctl
defends against that.  It defines how many concurrent crashing
processes may be piped to user space applications in parallel.  If
this value is exceeded, then those crashing processes above that value
are noted via the kernel log and their cores are skipped.  0 is a
special value, indicating that unlimited processes may be captured in
parallel, but that no waiting will take place (i.e. the collecting
process is not guaranteed access to /proc/<crashing pid>/).  This
value defaults to 0.

==============================================================

core_uses_pid:

The default coredump filename is "core".  By setting
core_uses_pid to 1, the coredump filename becomes core.PID.
If core_pattern does not include "%p" (default does not)
and core_uses_pid is set, then .PID will be appended to
the filename.

==============================================================

ctrl-alt-del:

When the value in this file is 0, ctrl-alt-del is trapped and
sent to the init(1) program to handle a graceful restart.
When, however, the value is > 0, Linux's reaction to a Vulcan
Nerve Pinch (tm) will be an immediate reboot, without even
syncing its dirty buffers.

Note: when a program (like dosemu) has the keyboard in 'raw'
mode, the ctrl-alt-del is intercepted by the program before it
ever reaches the kernel tty layer, and it's up to the program
to decide what to do with it.

==============================================================

dmesg_restrict:

This toggle indicates whether unprivileged users are prevented
from using dmesg(8) to view messages from the kernel's log buffer.
When dmesg_restrict is set to (0) there are no restrictions.
When dmesg_restrict is set to (1), users must have CAP_SYSLOG to use
dmesg(8).

The kernel config option CONFIG_SECURITY_DMESG_RESTRICT sets the
default value of dmesg_restrict.

==============================================================

domainname & hostname:

These files can be used to set the NIS/YP domainname and the
hostname of your box in exactly the same way as the commands
domainname and hostname, i.e.:
# echo "darkstar" > /proc/sys/kernel/hostname
# echo "mydomain" > /proc/sys/kernel/domainname
has the same effect as
# hostname "darkstar"
# domainname "mydomain"

Note, however, that the classic darkstar.frop.org has the
hostname "darkstar" and DNS (Internet Domain Name Server)
domainname "frop.org", not to be confused with the NIS (Network
Information Service) or YP (Yellow Pages) domainname. These two
domain names are in general different. For a detailed discussion
see the hostname(1) man page.

==============================================================

hardlockup_all_cpu_backtrace:

This value controls the hard lockup detector behavior when a hard
lockup condition is detected as to whether or not to gather further
debug information. If enabled, arch-specific all-CPU stack dumping
will be initiated.

0: do nothing. This is the default behavior.

1: on detection capture more debug information.

==============================================================

hotplug:

Path for the hotplug policy agent.
Default value is "/sbin/hotplug".

==============================================================

hung_task_panic:

Controls the kernel's behavior when a hung task is detected.
This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.

0: continue operation. This is the default behavior.

1: panic immediately.

==============================================================

hung_task_check_count:

The upper bound on the number of tasks that are checked.
This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.

==============================================================

hung_task_timeout_secs:

Check interval. When a task in D state has not been scheduled
for longer than this value, a warning is reported.
This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.

0: means infinite timeout - no checking done.
Possible values to set are in range {0..LONG_MAX/HZ}.

==============================================================

hung_task_warnings:

The maximum number of warnings to report. During a check interval
if a hung task is detected, this value is decreased by 1.
When this value reaches 0, no more warnings will be reported.
This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.

-1: report an infinite number of warnings.

==============================================================

kexec_load_disabled:

A toggle indicating if the kexec_load syscall has been disabled. This
value defaults to 0 (false: kexec_load enabled), but can be set to 1
(true: kexec_load disabled). Once true, kexec can no longer be used, and
the toggle cannot be set back to false. This allows a kexec image to be
loaded before disabling the syscall, allowing a system to set up (and
later use) an image without it being altered. Generally used together
with the "modules_disabled" sysctl.

==============================================================

kptr_restrict:

This toggle indicates whether restrictions are placed on
exposing kernel addresses via /proc and other interfaces.

When kptr_restrict is set to (0), the default, there are no restrictions.

When kptr_restrict is set to (1), kernel pointers printed using the %pK
format specifier will be replaced with 0's unless the user has CAP_SYSLOG
and effective user and group ids are equal to the real ids. This is
because %pK checks are done at read() time rather than open() time, so
if permissions are elevated between the open() and the read() (e.g. via
a setuid binary) then %pK will not leak kernel pointers to unprivileged
users. Note, this is a temporary solution only. The correct long-term
solution is to do the permission checks at open() time. Consider removing
world read permissions from files that use %pK, and using dmesg_restrict
to protect against uses of %pK in dmesg(8) if leaking kernel pointer
values to unprivileged users is a concern.

When kptr_restrict is set to (2), kernel pointers printed using
%pK will be replaced with 0's regardless of privileges.

==============================================================

kstack_depth_to_print: (X86 only)

Controls the number of words to print when dumping the raw
kernel stack.

==============================================================

l2cr: (PPC only)

This flag controls the L2 cache of G3 processor boards. If
0, the cache is disabled. Enabled if nonzero.

==============================================================

modules_disabled:

A toggle value indicating if modules are allowed to be loaded
in an otherwise modular kernel.  This toggle defaults to off
(0), but can be set true (1).  Once true, modules can be
neither loaded nor unloaded, and the toggle cannot be set back
to false.  Generally used with the "kexec_load_disabled" toggle.
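
The three kptr_restrict levels described in the kptr_restrict section
above can be summarised as a small decision function. This is a Python
sketch of the rules as stated in that text only; the function name and
parameters are hypothetical, and the kernel's real %pK handling is not
reproduced here.

```python
def pk_shows_pointer(kptr_restrict, has_cap_syslog, uid, euid, gid, egid):
    """Return True if a %pK pointer would be shown, False if zeroed.

    Follows only the rules stated in the text for levels 0, 1 and 2.
    All names are illustrative assumptions.
    """
    if kptr_restrict == 0:
        return True  # no restrictions
    if kptr_restrict == 1:
        # CAP_SYSLOG is required, and the effective ids must equal the
        # real ids.
        return has_cap_syslog and euid == uid and egid == gid
    if kptr_restrict == 2:
        return False  # always replaced with 0's
    raise ValueError("this sketch only covers kptr_restrict values 0, 1 and 2")
```

Note how level 1 still hides pointers from a process holding CAP_SYSLOG
when its effective ids differ from its real ids, which is the setuid case
the text calls out.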

==============================================================

msg_next_id, sem_next_id, and shm_next_id:

These three toggles allow the desired id to be specified for the next
allocated IPC object: message, semaphore or shared memory respectively.

By default they are equal to -1, which means generic allocation logic.
Possible values to set are in range {0..INT_MAX}.

Notes:
1) The kernel doesn't guarantee that the new object will have the
   desired id. So, it's up to userspace how to handle an object with
   the "wrong" id.
2) A toggle with a non-default value will be set back to -1 by the
   kernel after successful IPC object allocation.

==============================================================

nmi_watchdog:

This parameter can be used to control the NMI watchdog
(i.e. the hard lockup detector) on x86 systems.

   0 - disable the hard lockup detector
   1 - enable the hard lockup detector

The hard lockup detector monitors each CPU for its ability to respond to
timer interrupts. The mechanism utilizes CPU performance counter registers
that are programmed to generate Non-Maskable Interrupts (NMIs) periodically
while a CPU is busy. Hence, the alternative name 'NMI watchdog'.

The NMI watchdog is disabled by default if the kernel is running as a guest
in a KVM virtual machine. This default can be overridden by adding

   nmi_watchdog=1

to the guest kernel command line (see Documentation/kernel-parameters.txt).

==============================================================

numa_balancing

Enables/disables automatic page fault based NUMA memory
balancing. Memory is moved automatically to nodes
that access it often.

Enables/disables automatic NUMA memory balancing. On NUMA machines, there
is a performance penalty if remote memory is accessed by a CPU.
When this
feature is enabled the kernel samples what task thread is accessing memory
by periodically unmapping pages and later trapping a page fault. At the
time of the page fault, it is determined if the data being accessed should
be migrated to a local memory node.

The unmapping of pages and trapping faults incur additional overhead that
ideally is offset by improved memory locality but there is no universal
guarantee. If the target workload is already bound to NUMA nodes then this
feature should be disabled. Otherwise, if the system overhead from the
feature is too high then the rate the kernel samples for NUMA hinting
faults may be controlled by the numa_balancing_scan_period_min_ms,
numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms,
numa_balancing_scan_size_mb, and numa_balancing_settle_count sysctls.

==============================================================

numa_balancing_scan_period_min_ms, numa_balancing_scan_delay_ms,
numa_balancing_scan_period_max_ms, numa_balancing_scan_size_mb

Automatic NUMA balancing scans a task's address space and unmaps pages to
detect if pages are properly placed or if the data should be migrated to a
memory node local to where the task is running.  Every "scan delay" the task
scans the next "scan size" number of pages in its address space. When the
end of the address space is reached the scanner restarts from the beginning.

In combination, the "scan delay" and "scan size" determine the scan rate.
When "scan delay" decreases, the scan rate increases.  The scan delay and
hence the scan rate of every task is adaptive and depends on historical
behaviour. If pages are properly placed then the scan delay increases,
otherwise the scan delay decreases.  The "scan size" is not adaptive but
the higher the "scan size", the higher the scan rate.
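
As a rough illustration of how "scan delay" and "scan size" combine, the
Python sketch below computes a nominal scan rate as size divided by delay.
This is a simplification derived from the paragraph above, assuming one
"scan size" worth of pages is scanned every "scan delay"; the function
name is hypothetical and the input values in the tests are arbitrary, not
kernel defaults.

```python
def nominal_scan_rate_mb_per_s(scan_size_mb, scan_delay_ms):
    """Nominal scan rate in MB/s: scan_size_mb scanned once per scan_delay_ms.

    Simplified model for illustration only; the real scan delay is
    adaptive per task, as the text explains.
    """
    if scan_delay_ms <= 0:
        raise ValueError("scan_delay_ms must be positive")
    return scan_size_mb / (scan_delay_ms / 1000.0)
```

Halving the delay or doubling the size both double the nominal rate, which
matches the qualitative description in the text.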

Higher scan rates incur higher system overhead as page faults must be
trapped and potentially data must be migrated. However, the higher the scan
rate, the more quickly a task's memory is migrated to a local node if the
workload pattern changes, which minimises the performance impact of remote
memory accesses. These sysctls control the thresholds for scan delays and
the number of pages scanned.

numa_balancing_scan_period_min_ms is the minimum time in milliseconds to
scan a task's virtual memory. It effectively controls the maximum scanning
rate for each task.

numa_balancing_scan_delay_ms is the starting "scan delay" used for a task
when it initially forks.

numa_balancing_scan_period_max_ms is the maximum time in milliseconds to
scan a task's virtual memory. It effectively controls the minimum scanning
rate for each task.

numa_balancing_scan_size_mb is how many megabytes worth of pages are
scanned for a given scan.

==============================================================

osrelease, ostype & version:

# cat osrelease
2.1.88
# cat ostype
Linux
# cat version
#5 Wed Feb 25 21:49:24 MET 1998

The files osrelease and ostype should be clear enough. Version
needs a little more clarification however. The '#5' means that
this is the fifth kernel built from this source base and the
date behind it indicates the time the kernel was built.
The only way to tune these values is to rebuild the kernel :-)

==============================================================

overflowgid & overflowuid:

If your architecture did not always support 32-bit UIDs (i.e. arm,
i386, m68k, sh, and sparc32), a fixed UID and GID will be returned to
applications that use the old 16-bit UID/GID system calls, if the
actual UID or GID would exceed 65535.

These sysctls allow you to change the value of the fixed UID and GID.
The default is 65534.
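
The substitution described above can be pictured with a short Python
sketch. It only restates the rule from the text (ids above 65535 are
replaced by the overflow value); the function name is hypothetical and
the same logic would apply to GIDs with overflowgid.

```python
def uid_seen_by_16bit_syscall(uid, overflowuid=65534):
    """Return the UID an old 16-bit system call would report.

    Per the text: if the actual UID would exceed 65535, the fixed
    overflow value is returned instead. Illustrative helper only.
    """
    return uid if uid <= 65535 else overflowuid
```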

==============================================================

panic:

The value in this file represents the number of seconds the kernel
waits before rebooting on a panic. When you use the software watchdog,
the recommended setting is 60.

==============================================================

panic_on_io_nmi:

Controls the kernel's behavior when a CPU receives an NMI caused by
an IO error.

0: try to continue operation (default)

1: panic immediately. The IO error triggered an NMI. This indicates a
   serious system condition which could result in IO data corruption.
   Rather than continuing, panicking might be a better choice. Some
   servers issue this sort of NMI when the dump button is pushed,
   and you can use this option to take a crash dump.

==============================================================

panic_on_oops:

Controls the kernel's behaviour when an oops or BUG is encountered.

0: try to continue operation

1: panic immediately.  If the `panic' sysctl is also non-zero then the
   machine will be rebooted.

==============================================================

panic_on_stackoverflow:

Controls the kernel's behavior when detecting the overflows of
kernel, IRQ and exception stacks except a user stack.
This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled.

0: try to continue operation.

1: panic immediately.

==============================================================

panic_on_unrecovered_nmi:

The default Linux behaviour on an NMI of either memory or unknown is
to continue operation. For many environments such as scientific
computing it is preferable that the box is taken out and the error
dealt with than an uncorrected parity/ECC error get propagated.

A small number of systems do generate NMI's for bizarre random reasons
such as power management so the default is off. That sysctl works like
the existing panic controls already in that directory.

==============================================================

panic_on_warn:

Calls panic() in the WARN() path when set to 1.  This is useful to avoid
a kernel rebuild when attempting to kdump at the location of a WARN().

0: only WARN(), default behaviour.

1: call panic() after printing out WARN() location.

==============================================================

perf_cpu_time_max_percent:

Hints to the kernel how much CPU time it should be allowed to
use to handle perf sampling events.  If the perf subsystem
is informed that its samples are exceeding this limit, it
will drop its sampling frequency to attempt to reduce its CPU
usage.

Some perf sampling happens in NMIs.  If these samples
unexpectedly take too long to execute, the NMIs can become
stacked up next to each other so much that nothing else is
allowed to execute.

0: disable the mechanism.  Do not monitor or correct perf's
   sampling rate no matter how much CPU time it takes.

1-100: attempt to throttle perf's sample rate to this
   percentage of CPU.  Note: the kernel calculates an
   "expected" length of each sample event.  100 here means
   100% of that expected length.  Even if this is set to
   100, you may still see sample throttling if this
   length is exceeded.  Set to 0 if you truly do not care
   how much CPU is consumed.

==============================================================

perf_event_paranoid:

Controls use of the performance events system by unprivileged
users (without CAP_SYS_ADMIN).  The default value is 1.

 -1: Allow use of (almost) all events by all users
>=0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
>=1: Disallow CPU event access by users without CAP_SYS_ADMIN
>=2: Disallow kernel profiling by users without CAP_SYS_ADMIN

==============================================================

pid_max:

PID allocation wrap value.  When the kernel's next PID value
reaches this value, it wraps back to a minimum PID value.
PIDs of value pid_max or larger are not allocated.

==============================================================

ns_last_pid:

The last pid allocated in the current pid namespace (the one in which
the task using this sysctl lives). When selecting a pid for the next
task on fork, the kernel tries to allocate a number starting from this
one.

==============================================================

powersave-nap: (PPC only)

If set, Linux-PPC will use the 'nap' mode of powersaving,
otherwise the 'doze' mode will be used.

==============================================================

printk:

The four values in printk denote: console_loglevel,
default_message_loglevel, minimum_console_loglevel and
default_console_loglevel respectively.

These values influence printk() behavior when printing or
logging error messages. See 'man 2 syslog' for more info on
the different loglevels.

- console_loglevel: messages with a higher priority than
  this will be printed to the console
- default_message_loglevel: messages without an explicit priority
  will be printed with this priority
- minimum_console_loglevel: minimum (highest) value to which
  console_loglevel can be set
- default_console_loglevel: default value for console_loglevel

==============================================================

printk_delay:

Delay each printk message by printk_delay milliseconds.

Values from 0 to 10000 are allowed.

==============================================================

printk_ratelimit:

Some warning messages are rate limited. printk_ratelimit specifies
the minimum length of time between these messages (in seconds); by
default we allow one every 5 seconds.

A value of 0 will disable rate limiting.

==============================================================

printk_ratelimit_burst:

While long term we enforce one message per printk_ratelimit
seconds, we do allow a burst of messages to pass through.
printk_ratelimit_burst specifies the number of messages we can
send before ratelimiting kicks in.

==============================================================

randomize_va_space:

This option can be used to select the type of process address
space randomization that is used in the system, for architectures
that support this feature.

0 - Turn the process address space randomization off.  This is the
    default for architectures that do not support this feature anyways,
    and kernels that are booted with the "norandmaps" parameter.

1 - Make the addresses of mmap base, stack and VDSO page randomized.
    This, among other things, implies that shared libraries will be
    loaded to random addresses.  Also for PIE-linked binaries, the
    location of code start is randomized.
    This is the default if the
    CONFIG_COMPAT_BRK option is enabled.

2 - Additionally enable heap randomization.  This is the default if
    CONFIG_COMPAT_BRK is disabled.

    There are a few legacy applications out there (such as some ancient
    versions of libc.so.5 from 1996) that assume that brk area starts
    just after the end of the code+bss.  These applications break when
    start of the brk area is randomized.  There are however no known
    non-legacy applications that would be broken this way, so for most
    systems it is safe to choose full randomization.

    Systems with ancient and/or broken binaries should be configured
    with CONFIG_COMPAT_BRK enabled, which excludes the heap from process
    address space randomization.

==============================================================

reboot-cmd: (Sparc only)

??? This seems to be a way to give an argument to the Sparc
ROM/Flash boot loader. Maybe to tell it what to do after
rebooting. ???

==============================================================

rtsig-max & rtsig-nr:

The file rtsig-max can be used to tune the maximum number
of POSIX realtime (queued) signals that can be outstanding
in the system.

rtsig-nr shows the number of RT signals currently queued.

==============================================================

sched_schedstats:

Enables/disables scheduler statistics. Enabling this feature
incurs a small amount of overhead in the scheduler but is
useful for debugging and performance tuning.

==============================================================

sg-big-buff:

This file shows the size of the generic SCSI (sg) buffer.
You can't tune it just yet, but you could change it at
compile time by editing include/scsi/sg.h and changing
the value of SG_BIG_BUFF.

There shouldn't be any reason to change this value.
If
you can come up with one, you probably know what you
are doing anyway :)

==============================================================

shmall:

This parameter sets the total amount of shared memory pages that
can be used system wide. Hence, SHMALL should always be at least
ceil(shmmax/PAGE_SIZE).

If you are not sure what the default PAGE_SIZE is on your Linux
system, you can run the following command:

# getconf PAGE_SIZE

==============================================================

shmmax:

This value can be used to query and set the run time limit
on the maximum shared memory segment size that can be created.
Shared memory segments up to 1Gb are now supported in the
kernel.  This value defaults to SHMMAX.

==============================================================

shm_rmid_forced:

Linux lets you set resource limits, including how much memory one
process can consume, via setrlimit(2).  Unfortunately, shared memory
segments are allowed to exist without association with any process, and
thus might not be counted against any resource limits.  If enabled,
shared memory segments are automatically destroyed when their attach
count becomes zero after a detach or a process termination.  It will
also destroy segments that were created, but never attached to, on exit
from the process.  The only use left for IPC_RMID is to immediately
destroy an unattached segment.  Of course, this breaks the way things are
defined, so some applications might stop working.  Note that this
feature will do you no good unless you also configure your resource
limits (in particular, RLIMIT_AS and RLIMIT_NPROC).  Most systems don't
need this.

Note that if you change this from 0 to 1, already created segments
without users and with a dead originative process will be destroyed.
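
The lower bound for shmall given in the shmall section above,
ceil(shmmax/PAGE_SIZE), is easy to compute. The Python sketch below does
only that arithmetic; the function name min_shmall_pages is hypothetical,
and the page size is passed in explicitly (for example the value printed
by "getconf PAGE_SIZE").

```python
def min_shmall_pages(shmmax_bytes, page_size):
    """Smallest shmall (in pages) that can hold one segment of shmmax bytes.

    Computes ceil(shmmax_bytes / page_size) with integer arithmetic.
    Illustrative helper only.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return -(-shmmax_bytes // page_size)
```

Integer arithmetic is used so that very large shmmax values are not
rounded by floating-point division.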

==============================================================

sysctl_writes_strict:

Control how file position affects the behavior of updating sysctl values
via the /proc/sys interface:

  -1 - Legacy per-write sysctl value handling, with no printk warnings.
       Each write syscall must fully contain the sysctl value to be
       written, and multiple writes on the same sysctl file descriptor
       will rewrite the sysctl value, regardless of file position.
   0 - Same behavior as above, but warn about processes that perform writes
       to a sysctl file descriptor when the file position is not 0.
   1 - (default) Respect file position when writing sysctl strings. Multiple
       writes will append to the sysctl value buffer. Anything past the max
       length of the sysctl value buffer will be ignored. Writes to numeric
       sysctl entries must always be at file position 0 and the value must
       be fully contained in the buffer sent in the write syscall.

==============================================================

softlockup_all_cpu_backtrace:

This value controls the soft lockup detector thread's behavior
when a soft lockup condition is detected as to whether or not
to gather further debug information. If enabled, each cpu will
be issued an NMI and instructed to capture stack trace.

This feature is only applicable for architectures which support
NMI.

0: do nothing. This is the default behavior.

1: on detection capture more debug information.

==============================================================

soft_watchdog

This parameter can be used to control the soft lockup detector.

   0 - disable the soft lockup detector
   1 - enable the soft lockup detector

The soft lockup detector monitors CPUs for threads that are hogging the CPUs
without rescheduling voluntarily, and thus prevent the 'watchdog/N' threads
from running.
The mechanism depends on the CPUs' ability to respond to timer
interrupts, which are needed for the 'watchdog/N' threads to be woken up by
the watchdog timer function; otherwise the NMI watchdog - if enabled - can
detect a hard lockup condition.

==============================================================

tainted:

Non-zero if the kernel has been tainted.  Numeric values, which
can be ORed together:

     1 - A module with a non-GPL license has been loaded, this
         includes modules with no license.
         Set by modutils >= 2.4.9 and module-init-tools.
     2 - A module was force loaded by insmod -f.
         Set by modutils >= 2.4.9 and module-init-tools.
     4 - Unsafe SMP processors: SMP with CPUs not designed for SMP.
     8 - A module was forcibly unloaded from the system by rmmod -f.
    16 - A hardware machine check error occurred on the system.
    32 - A bad page was discovered on the system.
    64 - The user has asked that the system be marked "tainted".  This
         could be because they are running software that directly modifies
         the hardware, or for other reasons.
   128 - The system has died.
   256 - The ACPI DSDT has been overridden with one supplied by the user
         instead of using the one provided by the hardware.
   512 - A kernel warning has occurred.
  1024 - A module from drivers/staging was loaded.
  2048 - The system is working around a severe firmware bug.
  4096 - An out-of-tree module has been loaded.
  8192 - An unsigned module has been loaded in a kernel supporting module
         signature.
 16384 - A soft lockup has previously occurred on the system.
 32768 - The kernel has been live patched.

==============================================================

threads-max:

This value controls the maximum number of threads that can be created
using fork().

During initialization the kernel sets this value such that even if the
maximum number of threads is created, the thread structures occupy only
a part (1/8th) of the available RAM pages.

The minimum value that can be written to threads-max is 20.
The maximum value that can be written to threads-max is given by the
constant FUTEX_TID_MASK (0x3fffffff).
If a value outside of this range is written to threads-max, an EINVAL
error occurs.

The value written is checked against the available RAM pages. If the
thread structures would occupy too much (more than 1/8th) of the
available RAM pages, threads-max is reduced accordingly.

==============================================================

unknown_nmi_panic:

The value in this file affects the handling of NMIs. When the
value is non-zero, an unknown NMI is trapped and the kernel then panics.
At that time, kernel debugging information is displayed on the console.

For example, the NMI switch found on most IA32 servers raises an
unknown NMI. If a system hangs, try pressing the NMI switch.

==============================================================

watchdog:

This parameter can be used to disable or enable the soft lockup detector
_and_ the NMI watchdog (i.e. the hard lockup detector) at the same time.

   0 - disable both lockup detectors
   1 - enable both lockup detectors

The soft lockup detector and the NMI watchdog can also be disabled or
enabled individually, using the soft_watchdog and nmi_watchdog parameters.
If the watchdog parameter is read, for example by executing

   cat /proc/sys/kernel/watchdog

the output of this command (0 or 1) shows the logical OR of soft_watchdog
and nmi_watchdog.

==============================================================

watchdog_cpumask:

This value can be used to control on which cpus the watchdog may run.
The default cpumask is all possible cores, but if NO_HZ_FULL is
enabled in the kernel config, and cores are specified with the
nohz_full= boot argument, those cores are excluded by default.
Offline cores can be included in this mask, and if the core is later
brought online, the watchdog will be started based on the mask value.

Typically this value would only be touched in the nohz_full case
to re-enable cores that by default were not running the watchdog,
if a kernel lockup was suspected on those cores.

The argument value is the standard cpulist format for cpumasks,
so for example to enable the watchdog on cores 0, 2, 3, and 4 you
might say:

  echo 0,2-4 > /proc/sys/kernel/watchdog_cpumask

==============================================================

watchdog_thresh:

This value can be used to control the frequency of hrtimer and NMI
events and the soft and hard lockup thresholds. The default threshold
is 10 seconds.

The softlockup threshold is (2 * watchdog_thresh). Setting this
tunable to zero will disable lockup detection altogether.

==============================================================
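
To make the cpulist notation used by watchdog_cpumask above more concrete,
here is a rough sketch in Python that expands a string such as "0,2-4" into
individual CPU numbers.  The function name parse_cpulist is hypothetical,
and the sketch assumes only comma-separated single CPUs and inclusive
"first-last" ranges; it does not claim to cover every form the kernel's own
parser may accept.

```python
def parse_cpulist(cpulist):
    """Expand a simple cpulist string into a sorted list of CPU numbers.

    Handles comma-separated entries that are either a single CPU ("3")
    or an inclusive range ("2-4").  Hypothetical helper for illustration
    only; other cpulist forms are not handled.
    """
    cpus = set()
    for part in cpulist.strip().split(","):
        if not part:
            # Tolerate empty entries, e.g. from an empty string.
            continue
        if "-" in part:
            first_text, last_text = part.split("-", 1)
            first, last = int(first_text), int(last_text)
            if first > last:
                raise ValueError("invalid range: %r" % part)
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)
```

With this sketch, the string "0,2-4" from the echo example expands to CPUs
0, 2, 3 and 4, which matches the cores named in the text.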