softlockup: check all tasks in hung_task

Impact: extend the scope of hung-task checks

Changed the default value of hung_task_check_count to PID_MAX_LIMIT.
hung_task_batch_count (HUNG_TASK_BATCHING in the patch) was added to put
an upper bound on the critical section. Every hung_task_batch_count
checks, the RCU read lock is dropped and reacquired, so it is never held
for too long.

Keeping the critical section small minimizes the time during which
preemption is disabled and keeps RCU grace periods short.

To prevent following a stale pointer, get_task_struct() is called on g and
t before leaving the critical section. To verify that g and t have not
been unhashed while outside the critical section, their task states are
checked for TASK_DEAD after re-entering it.

The design was proposed by Frédéric Weisbecker.

Signed-off-by: Mandeep Singh Baines <msb@google.com>
Suggested-by: Frédéric Weisbecker <fweisbec@gmail.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

authored by Mandeep Singh Baines and committed by Ingo Molnar ce9dbe24 5e54f598

+37 -2
kernel/hung_task.c
···
 #include <linux/sysctl.h>
 
 /*
- * Have a reasonable limit on the number of tasks checked:
+ * The number of tasks checked:
  */
-unsigned long __read_mostly sysctl_hung_task_check_count = 1024;
+unsigned long __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
+
+/*
+ * Limit number of tasks checked in a batch.
+ *
+ * This value controls the preemptibility of khungtaskd since preemption
+ * is disabled during the critical section. It also controls the size of
+ * the RCU grace period. So it needs to be upper-bound.
+ */
+#define HUNG_TASK_BATCHING 1024
 
 /*
  * Zero means infinite timeout - no checking done:
···
 }
 
 /*
+ * To avoid extending the RCU grace period for an unbounded amount of time,
+ * periodically exit the critical section and enter a new one.
+ *
+ * For preemptible RCU it is sufficient to call rcu_read_unlock in order
+ * exit the grace period. For classic RCU, a reschedule is required.
+ */
+static void rcu_lock_break(struct task_struct *g, struct task_struct *t)
+{
+	get_task_struct(g);
+	get_task_struct(t);
+	rcu_read_unlock();
+	cond_resched();
+	rcu_read_lock();
+	put_task_struct(t);
+	put_task_struct(g);
+}
+
+/*
  * Check whether a TASK_UNINTERRUPTIBLE does not get woken up for
  * a really long time (120 seconds). If that happens, print out
  * a warning.
···
 static void check_hung_uninterruptible_tasks(unsigned long timeout)
 {
 	int max_count = sysctl_hung_task_check_count;
+	int batch_count = HUNG_TASK_BATCHING;
 	unsigned long now = get_timestamp();
 	struct task_struct *g, *t;
 
···
 	do_each_thread(g, t) {
 		if (!--max_count)
 			goto unlock;
+		if (!--batch_count) {
+			batch_count = HUNG_TASK_BATCHING;
+			rcu_lock_break(g, t);
+			/* Exit if t or g was unhashed during refresh. */
+			if (t->state == TASK_DEAD || g->state == TASK_DEAD)
+				goto unlock;
+		}
 		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
 		if (t->state == TASK_UNINTERRUPTIBLE)
 			check_hung_task(t, now, timeout);