Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:
"Nothing too interesting. Documentation updates and trivial changes;
however, this pull request does contain the previously discussed
dropping of __must_check from strscpy()"

* 'for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
Documentation: Fix 'file_mapped' -> 'mapped_file'
string: drop __must_check from strscpy() and restore strscpy() usages in cgroup
cgroup, docs: document the root cgroup behavior of cpu and io controllers
cgroup-v2.txt: fix typos
cgroup: Update documentation reference
Documentation/cgroup-v1: fix outdated programming details
cgroup, docs: document cgroup v2 device controller

+76 -22
+1 -6
Documentation/cgroup-v1/cgroups.txt
@@ -523,12 +523,7 @@
 Each subsystem should:
 
 - add an entry in linux/cgroup_subsys.h
-- define a cgroup_subsys object called <name>_subsys
-
-If a subsystem can be compiled as a module, it should also have in its
-module initcall a call to cgroup_load_subsys(), and in its exitcall a
-call to cgroup_unload_subsys(). It should also set its_subsys.module =
-THIS_MODULE in its .c file.
+- define a cgroup_subsys object called <name>_cgrp_subsys
 
 Each subsystem may export the following methods. The only mandatory
 methods are css_alloc/free. Any others that are null are presumed to
+2 -2
Documentation/cgroup-v1/memory.txt
@@ -524,9 +524,9 @@
 Only anonymous and swap cache memory is listed as part of 'rss' stat.
 This should not be confused with the true 'resident set size' or the
 amount of physical memory used by the cgroup.
-'rss + file_mapped" will give you resident set size of cgroup.
+'rss + mapped_file" will give you resident set size of cgroup.
 (Note: file and shmem may be shared among other cgroups. In that case,
-file_mapped is accounted only when the memory cgroup is owner of page
+mapped_file is accounted only when the memory cgroup is owner of page
 cache.)
 
 5.3 swappiness
+68 -9
Documentation/cgroup-v2.txt
@@ -53,10 +53,14 @@
 5-3-2. Writeback
 5-4. PID
 5-4-1. PID Interface Files
-5-5. RDMA
-5-5-1. RDMA Interface Files
-5-6. Misc
-5-6-1. perf_event
+5-5. Device
+5-6. RDMA
+5-6-1. RDMA Interface Files
+5-7. Misc
+5-7-1. perf_event
+5-N. Non-normative information
+5-N-1. CPU controller root cgroup process behaviour
+5-N-2. IO controller root cgroup process behaviour
 6. Namespace
 6-1. Basics
 6-2. The Root and Views
@@ -279,7 +283,7 @@
 exempt from this requirement.
 
 Topology-wise, a cgroup can be in an invalid state. Please consider
-the following toplogy::
+the following topology::
 
 A (threaded domain) - B (threaded) - C (domain, just created)
 
@@ -420,7 +424,9 @@
 processes and anonymous resource consumption which can't be associated
 with any other cgroups and requires special treatment from most
 controllers. How resource consumption in the root cgroup is governed
-is up to each controller.
+is up to each controller (for more information on this topic please
+refer to the Non-normative information section in the Controllers
+chapter).
 
 Note that the restriction doesn't get in the way if there is no
 enabled controller in the cgroup's "cgroup.subtree_control". This is
@@ -1063,10 +1069,10 @@
 reached the limit and allocation was about to fail.
 
 Depending on context result could be invocation of OOM
-killer and retrying allocation or failing alloction.
+killer and retrying allocation or failing allocation.
 
 Failed allocation in its turn could be returned into
-userspace as -ENOMEM or siletly ignored in cases like
+userspace as -ENOMEM or silently ignored in cases like
 disk readahead. For now OOM in memory cgroup kills
 tasks iff shortage has happened inside page fault.
 
@@ -1191,7 +1197,7 @@
 cgroups. The default is "max".
 
 Swap usage hard limit. If a cgroup's swap usage reaches this
-limit, anonymous meomry of the cgroup will not be swapped out.
+limit, anonymous memory of the cgroup will not be swapped out.
 
 
 Usage Guidelines
@@ -1429,6 +1435,30 @@
 of a new process would cause a cgroup policy to be violated.
 
 
+Device controller
+-----------------
+
+Device controller manages access to device files. It includes both
+creation of new device files (using mknod), and access to the
+existing device files.
+
+Cgroup v2 device controller has no interface files and is implemented
+on top of cgroup BPF. To control access to device files, a user may
+create bpf programs of the BPF_CGROUP_DEVICE type and attach them
+to cgroups. On an attempt to access a device file, corresponding
+BPF programs will be executed, and depending on the return value
+the attempt will succeed or fail with -EPERM.
+
+A BPF_CGROUP_DEVICE program takes a pointer to the bpf_cgroup_dev_ctx
+structure, which describes the device access attempt: access type
+(mknod/read/write) and device (type, major and minor numbers).
+If the program returns 0, the attempt fails with -EPERM, otherwise
+it succeeds.
+
+An example of BPF_CGROUP_DEVICE program may be found in the kernel
+source tree in the tools/testing/selftests/bpf/dev_cgroup.c file.
+
+
 RDMA
 ----
 
@@ -1479,6 +1509,35 @@
 automatically enabled on the v2 hierarchy so that perf events can
 always be filtered by cgroup v2 path. The controller can still be
 moved to a legacy hierarchy after v2 hierarchy is populated.
+
+
+Non-normative information
+-------------------------
+
+This section contains information that isn't considered to be a part of
+the stable kernel API and so is subject to change.
+
+
+CPU controller root cgroup process behaviour
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When distributing CPU cycles in the root cgroup each thread in this
+cgroup is treated as if it was hosted in a separate child cgroup of the
+root cgroup. This child cgroup weight is dependent on its thread nice
+level.
+
+For details of this mapping see sched_prio_to_weight array in
+kernel/sched/core.c file (values from this array should be scaled
+appropriately so the neutral - nice 0 - value is 100 instead of 1024).
+
+
+IO controller root cgroup process behaviour
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Root cgroup processes are hosted in an implicit leaf child node.
+When distributing IO resources this implicit child node is taken into
+account as if it was a normal child cgroup of the root cgroup with a
+weight value of 200.
 
 
 Namespace
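Note on the CPU controller text added above: the nice-to-weight scaling it describes can be sketched in userspace C as below. This is only an illustration, not kernel code. The table is transcribed from sched_prio_to_weight in kernel/sched/core.c; the helper name root_thread_weight() is hypothetical, and rounding to the nearest integer is an assumption, since the documentation only says the values "should be scaled appropriately".

```c
#include <assert.h>

/* Values transcribed from sched_prio_to_weight[] (nice -20 .. 19). */
static const int sketch_prio_to_weight[40] = {
	/* -20 */ 88761, 71755, 56483, 46273, 36291,
	/* -15 */ 29154, 23254, 18705, 14949, 11916,
	/* -10 */  9548,  7620,  6100,  4904,  3906,
	/*  -5 */  3121,  2501,  1991,  1586,  1277,
	/*   0 */  1024,   820,   655,   526,   423,
	/*   5 */   335,   272,   215,   172,   137,
	/*  10 */   110,    87,    70,    56,    45,
	/*  15 */    36,    29,    23,    18,    15,
};

/*
 * Hypothetical helper: weight of the implicit per-thread child cgroup
 * for a root-cgroup thread with the given nice level, rescaled so that
 * nice 0 maps to 100 instead of 1024. Round-to-nearest is an assumption.
 */
static int root_thread_weight(int nice)
{
	assert(nice >= -20 && nice <= 19);
	return (sketch_prio_to_weight[nice + 20] * 100 + 512) / 1024;
}
```

Under these assumptions a nice 0 thread gets weight 100, a nice -20 thread 8668, and a nice 19 thread 1.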
+1 -1
include/linux/cgroup-defs.h
@@ -561,7 +561,7 @@
 
 /*
  * Control Group subsystem type.
- * See Documentation/cgroups/cgroups.txt for details
+ * See Documentation/cgroup-v1/cgroups.txt for details
  */
 struct cgroup_subsys {
 	struct cgroup_subsys_state *(*css_alloc)(struct cgroup_subsys_state *parent_css);
+1 -1
include/linux/string.h
@@ -28,7 +28,7 @@
 size_t strlcpy(char *, const char *, size_t);
 #endif
 #ifndef __HAVE_ARCH_STRSCPY
-ssize_t __must_check strscpy(char *, const char *, size_t);
+ssize_t strscpy(char *, const char *, size_t);
 #endif
 #ifndef __HAVE_ARCH_STRCAT
 extern char * strcat(char *, const char *);
+3 -3
kernel/cgroup/cgroup.c
@@ -1397,7 +1397,7 @@
 			 cgroup_on_dfl(cgrp) ? ss->name : ss->legacy_name,
 			 cft->name);
 	else
-		strlcpy(buf, cft->name, CGROUP_FILE_NAME_MAX);
+		strscpy(buf, cft->name, CGROUP_FILE_NAME_MAX);
 	return buf;
 }
 
@@ -1864,9 +1864,9 @@
 
 	root->flags = opts->flags;
 	if (opts->release_agent)
-		strlcpy(root->release_agent_path, opts->release_agent, PATH_MAX);
+		strscpy(root->release_agent_path, opts->release_agent, PATH_MAX);
 	if (opts->name)
-		strlcpy(root->name, opts->name, MAX_CGROUP_ROOT_NAMELEN);
+		strscpy(root->name, opts->name, MAX_CGROUP_ROOT_NAMELEN);
 	if (opts->cpuset_clone_children)
 		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &root->cgrp.flags);
 }
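Note on the strscpy() call sites above: they ignore the return value, which the removal of __must_check permits. The userspace sketch below models the return convention as I understand it: the number of characters copied on success, or -E2BIG when the destination size is zero or the source had to be truncated. The function name sketch_strscpy() is hypothetical and this is not the kernel's implementation, which copies a word at a time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical userspace model of strscpy(): copy at most count - 1
 * bytes, always NUL-terminate when count > 0, and report truncation
 * with -E2BIG instead of returning the source length as strlcpy() does.
 */
static ssize_t sketch_strscpy(char *dst, const char *src, size_t count)
{
	size_t i;

	assert(dst != NULL && src != NULL);

	if (count == 0)
		return -E2BIG;

	for (i = 0; i < count - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* Stopped on the size limit rather than the end of src: truncated. */
	if (src[i] != '\0')
		return -E2BIG;

	return (ssize_t)i;
}
```

A caller that only needs a bounded, terminated copy, as in the cgroup hunks above, can discard the result; a caller that must detect truncation checks for a negative return.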