Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

cgroup, netclassid: periodically release file_lock on classid updating

In our production environment we ran into a problem: updating the classid
in a cgroup that contains heavy tasks causes a long freeze of the file
tables in those tasks. By heavy tasks we mean tasks with many threads and
open sockets (e.g. load balancers). This freeze leads to an increased
number of client timeouts.

This patch implements the following logic to fix the issue:
after iterating over 1000 file descriptors, the file table lock is
released, providing a time gap for socket creation/deletion.

The update is now non-atomic, and a socket may be skipped using the calls:

dup2(oldfd, newfd);
close(oldfd);

But this case is not typical. Moreover, even before this patch a skip was
possible by hiding a socket fd in a unix socket buffer.

New sockets will be allocated with the updated classid because the cgroup
state is updated before the file descriptor iteration starts.

So in common cases this patch has no side effects.

Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Dmitry Yakunin; committed by David S. Miller
018d26fc ce9a4186

+37 -10
net/core/netclassid_cgroup.c
···
 	kfree(css_cls_state(css));
 }
 
+/*
+ * To avoid freezing of sockets creation for tasks with big number of threads
+ * and opened sockets lets release file_lock every 1000 iterated descriptors.
+ * New sockets will already have been created with new classid.
+ */
+
+struct update_classid_context {
+	u32 classid;
+	unsigned int batch;
+};
+
+#define UPDATE_CLASSID_BATCH 1000
+
 static int update_classid_sock(const void *v, struct file *file, unsigned n)
 {
 	int err;
+	struct update_classid_context *ctx = (void *)v;
 	struct socket *sock = sock_from_file(file, &err);
 
 	if (sock) {
 		spin_lock(&cgroup_sk_update_lock);
-		sock_cgroup_set_classid(&sock->sk->sk_cgrp_data,
-					(unsigned long)v);
+		sock_cgroup_set_classid(&sock->sk->sk_cgrp_data, ctx->classid);
 		spin_unlock(&cgroup_sk_update_lock);
 	}
+	if (--ctx->batch == 0) {
+		ctx->batch = UPDATE_CLASSID_BATCH;
+		return n + 1;
+	}
 	return 0;
+}
+
+static void update_classid_task(struct task_struct *p, u32 classid)
+{
+	struct update_classid_context ctx = {
+		.classid = classid,
+		.batch = UPDATE_CLASSID_BATCH
+	};
+	unsigned int fd = 0;
+
+	do {
+		task_lock(p);
+		fd = iterate_fd(p->files, fd, update_classid_sock, &ctx);
+		task_unlock(p);
+		cond_resched();
+	} while (fd);
 }
 
 static void cgrp_attach(struct cgroup_taskset *tset)
···
 	struct task_struct *p;
 
 	cgroup_taskset_for_each(p, css, tset) {
-		task_lock(p);
-		iterate_fd(p->files, 0, update_classid_sock,
-			   (void *)(unsigned long)css_cls_state(css)->classid);
-		task_unlock(p);
+		update_classid_task(p, css_cls_state(css)->classid);
 	}
 }
···
 	css_task_iter_start(css, 0, &it);
 	while ((p = css_task_iter_next(&it))) {
-		task_lock(p);
-		iterate_fd(p->files, 0, update_classid_sock,
-			   (void *)(unsigned long)cs->classid);
-		task_unlock(p);
+		update_classid_task(p, cs->classid);
 		cond_resched();
 	}
 	css_task_iter_end(&it);