Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

[PATCH] x86-64: Increase TLB flush array size

The generic TLB flush functions kept up to 506 pages per
CPU to avoid sending IPIs too frequently.
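
The batching described above can be sketched roughly as follows. This is a minimal illustration, not the kernel's actual code: `struct gather`, `gather_add`, `gather_flush` and `flushes_for` are hypothetical names, and the "flush" here only counts how often a cross-CPU flush would be needed.

```c
#include <assert.h>
#include <stddef.h>

#define FREE_PTE_NR 506	/* batch capacity quoted in the text */

/* Hypothetical stand-in for the per-CPU gather state. */
struct gather {
	unsigned int nr;		/* pages currently queued */
	unsigned int flushes;		/* flushes issued so far */
	void *pages[FREE_PTE_NR];	/* pages waiting for the next flush */
};

/* Hypothetical: stands in for the IPI-driven TLB flush and freeing the batch. */
static void gather_flush(struct gather *g)
{
	if (g->nr == 0)
		return;		/* nothing queued, no IPI needed */
	g->flushes++;
	g->nr = 0;
}

/* Queue one page; a flush happens only once the array is full. */
static void gather_add(struct gather *g, void *page)
{
	assert(g->nr < FREE_PTE_NR);
	g->pages[g->nr++] = page;
	if (g->nr == FREE_PTE_NR)
		gather_flush(g);
}

/* Count the flushes needed to release npages pages. */
static unsigned int flushes_for(unsigned int npages)
{
	struct gather g = { 0 };
	unsigned int i;

	for (i = 0; i < npages; i++)
		gather_add(&g, NULL);	/* the page pointer is irrelevant here */
	gather_flush(&g);		/* final flush for a partial batch */
	return g.flushes;
}
```

Under these assumptions, `flushes_for(5350)` needs 11 flushes with a 506-entry array, while a 5350-entry array would need only one.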

This value was chosen to fit the L1 cache of older x86 CPUs,
but with modern CPUs it does not make much sense anymore.
TLB flushing is slow enough that using the L2 cache is fine.

This patch increases the flush array on x86-64 to cache
5350 pages. That is roughly 20MB with 4K pages. It speeds
up large munmaps in multithreaded processes on SMP considerably.

The cost is roughly 42k of memory per CPU, which is reasonable.
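
The two figures above can be checked with a short calculation. This is only a sketch: the helper names are hypothetical, and the 8-byte entry size assumes the array stores one pointer per page on x86-64.

```c
#define ARCH_FREE_PTE_NR	5350UL	/* batch size from this patch */
#define PAGE_SIZE_4K		4096UL	/* 4K pages, as stated in the text */
#define POINTER_BYTES		8UL	/* assumption: one 8-byte pointer per entry */

/* Bytes of address space released by one full batch. */
static unsigned long batch_coverage_bytes(void)
{
	return ARCH_FREE_PTE_NR * PAGE_SIZE_4K;	/* 21913600, about 20.9 MiB */
}

/* Bytes of per-CPU memory taken by the pointer array itself. */
static unsigned long batch_array_bytes(void)
{
	return ARCH_FREE_PTE_NR * POINTER_BYTES;	/* 42800, about 41.8 KiB */
}
```

Both results agree with the "roughly 20MB" and "roughly 42k" figures quoted in the description.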

I only increased it on x86-64 for now, but it would probably
make sense to increase it everywhere. Embedded architectures
with SMP may keep it smaller to save some memory per CPU.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Authored by Andi Kleen and committed by Linus Torvalds
2b4a0815 165aeb82

+9 -1
include/asm-generic/tlb.h (+5 -1)
@@ -23,7 +23,11 @@
  * and page free order so much..
  */
 #ifdef CONFIG_SMP
-  #define FREE_PTE_NR	506
+  #ifdef ARCH_FREE_PTE_NR
+    #define FREE_PTE_NR	ARCH_FREE_PTE_NR
+  #else
+    #define FREE_PTE_NR	506
+  #endif
   #define tlb_fast_mode(tlb) ((tlb)->nr == ~0U)
 #else
   #define FREE_PTE_NR	1
include/asm-x86_64/tlbflush.h (+4)
@@ -109,6 +109,10 @@
 #define TLBSTATE_OK 1
 #define TLBSTATE_LAZY 2
 
+/* Roughly an IPI every 20MB with 4k pages for freeing page table
+   ranges. Cost is about 42k of memory for each CPU. */
+#define ARCH_FREE_PTE_NR 5350
+
 #endif
 
 #define flush_tlb_kernel_range(start, end) flush_tlb_all()