Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

[PATCH] sched: revert "filter affine wakeups"

Revert commit d7102e95b7b9c00277562c29aad421d2d521c5f6:

[PATCH] sched: filter affine wakeups

It apparently caused a performance regression of more than 10% on the aim7
benchmark. The setup in use is a 16-CPU HP rx8620 with 64GB of memory and 12
MSA1000s with 144 disks. Each disk is 72GB with a single ext3 filesystem
(courtesy of HP, who supplied the benchmark results).

The problem is that for aim7 the wake-up pattern is random, yet it still needs
load-balancing action in the wake-up path to achieve the best performance.
With the above commit, the lack of load balancing hurts that workload.

However, for workloads like database transaction processing, the requirement
is exactly the opposite. In the wake-up path, the best performance is achieved
with absolutely zero load balancing: we simply wake the process on the CPU
where it previously ran. The worst performance is obtained when we do load
balancing at wake-up.

There isn't an easy way to auto-detect the workload characteristics. Ingo's
earlier patch, which detects an idle CPU and decides whether or not to load
balance, doesn't perform well with aim7 either, since all CPUs are busy (it
causes an even bigger performance regression).

Revert commit d7102e95b7b9c00277562c29aad421d2d521c5f6, which causes a
performance regression of more than 10% with aim7.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Authored by Chen, Kenneth W and committed by Linus Torvalds
d6077cb8 f8225661

2 files changed: +2 -13

include/linux/sched.h: +1 -4

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -697,11 +697,8 @@
 
 	int lock_depth;		/* BKL lock depth */
 
-#if defined(CONFIG_SMP)
-	int last_waker_cpu;	/* CPU that last woke this task up */
-#if defined(__ARCH_WANT_UNLOCKED_CTXSW)
+#if defined(CONFIG_SMP) && defined(__ARCH_WANT_UNLOCKED_CTXSW)
 	int oncpu;
-#endif
 #endif
 	int prio, static_prio;
 	struct list_head run_list;
kernel/sched.c: +1 -9

--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1204,9 +1204,6 @@
 		}
 	}
 
-	if (p->last_waker_cpu != this_cpu)
-		goto out_set_cpu;
-
 	if (unlikely(!cpu_isset(this_cpu, p->cpus_allowed)))
 		goto out_set_cpu;
 
@@ -1273,8 +1276,6 @@
 		this_cpu = smp_processor_id();
 		cpu = task_cpu(p);
 	}
-
-	p->last_waker_cpu = this_cpu;
 
 out_activate:
 #endif /* CONFIG_SMP */
@@ -1355,11 +1360,8 @@
 #ifdef CONFIG_SCHEDSTATS
 	memset(&p->sched_info, 0, sizeof(p->sched_info));
 #endif
-#if defined(CONFIG_SMP)
-	p->last_waker_cpu = cpu;
-#if defined(__ARCH_WANT_UNLOCKED_CTXSW)
+#if defined(CONFIG_SMP) && defined(__ARCH_WANT_UNLOCKED_CTXSW)
 	p->oncpu = 0;
-#endif
 #endif
 #ifdef CONFIG_PREEMPT
 	/* Want to start with kernel preemption disabled. */