Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

x86, mm: trace when an IPI is about to be sent

When unmapping pages it is necessary to flush the TLB. If the page was
accessed by another CPU then an IPI is used to flush the remote CPU's TLB.
That is a lot of IPIs if kswapd is scanning and unmapping >100K pages per
second.

There already is a window between when a page is unmapped and when it is
TLB flushed. This series increases the window so multiple pages can be
flushed using a single IPI. This should be safe or the kernel is hosed
already.

Patch 1 simply made the rest of the series easier to write as ftrace
could identify all the senders of TLB flush IPIs.

Patch 2 tracks what CPUs potentially map a PFN and then sends an IPI
to flush the entire TLB.

Patch 3 tracks when there potentially are writable TLB entries that
need to be batched differently.

Patch 4 increases SWAP_CLUSTER_MAX to further batch flushes.
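
The batching idea behind patches 2-4 can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: the
struct and function names below are hypothetical and do not come from the
series, CPUs are modelled as bits in a 64-bit mask, and "sending an IPI"
is reduced to counting the remote CPUs that would be interrupted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a deferred-flush batch; not the kernel's data structure. */
struct tlb_batch_sketch {
	uint64_t cpumask;      /* bit N set: CPU N may hold a stale TLB entry */
	unsigned int nr_pages; /* pages unmapped since the last flush */
};

/* Record an unmapped page together with the CPUs that may have mapped it. */
static void batch_add_page(struct tlb_batch_sketch *batch, uint64_t page_cpumask)
{
	batch->cpumask |= page_cpumask;
	batch->nr_pages++;
}

/* Count the set bits in a CPU mask. */
static unsigned int count_cpus(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Flush the whole batch: one IPI per remote CPU in the accumulated mask,
 * each standing in for a full TLB flush on that CPU. Returns the number
 * of IPIs that would be sent and resets the batch.
 */
static unsigned int batch_flush(struct tlb_batch_sketch *batch, unsigned int this_cpu)
{
	uint64_t remote;
	unsigned int ipis;

	assert(this_cpu < 64);
	remote = batch->cpumask & ~(UINT64_C(1) << this_cpu);
	ipis = count_cpus(remote);
	batch->cpumask = 0;
	batch->nr_pages = 0;
	return ipis;
}
```

In this model, three pages with masks 0x6, 0x4 and 0x9 flushed from CPU 0
cost 2 + 1 + 1 = 4 IPIs when flushed one page at a time, but a single
batch_flush() over the union of the masks costs 3. The saving grows with
the number of pages that share CPUs.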

The performance impact is documented in the changelogs, but in the optimistic
case on a 4-socket machine the full series reduces interrupts from 900K/second
to 60K/second.

This patch (of 4):

It is easy to trace when an IPI is received to flush a TLB but harder to
detect what event sent it. This patch makes it easy to identify the
source of IPIs being transmitted for TLB flushes on x86.
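
One way to consume the new event from user space is to filter trace
output for the "remote ipi send" reason string added by this patch. The
helper below is a minimal sketch, not part of the patch. It assumes each
trace line contains "tlb_flush:" followed by "pages:<n>" and
"reason:<string>" fields; that layout is an assumption about the event's
print format and should be checked against the running kernel. The
function name is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if the trace line is a tlb_flush event whose reason is
 * "remote ipi send", storing the reported page count in *pages.
 * The "pages:... reason:..." layout is an assumed format.
 */
static bool parse_send_ipi_event(const char *line, long *pages)
{
	const char *event = strstr(line, "tlb_flush:");
	const char *field;
	long count;

	if (!event)
		return false;

	field = strstr(event, "pages:");
	if (!field || sscanf(field, "pages:%ld", &count) != 1)
		return false;

	if (!strstr(field, "reason:remote ipi send"))
		return false;

	*pages = count; /* only written on a full match */
	return true;
}
```

Feeding each line of the trace buffer through such a helper and grouping
the matches by the task name at the start of the line would show which
contexts are sending the flush IPIs.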

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Mel Gorman and committed by Linus Torvalds
5b74283a c47174fc

+4 -1
+1
arch/x86/mm/tlb.c
···
 	info.flush_end = end;
 
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH);
+	trace_tlb_flush(TLB_REMOTE_SEND_IPI, end - start);
 	if (is_uv_system()) {
 		unsigned int cpu;
 
+1
include/linux/mm_types.h
···
 	TLB_REMOTE_SHOOTDOWN,
 	TLB_LOCAL_SHOOTDOWN,
 	TLB_LOCAL_MM_SHOOTDOWN,
+	TLB_REMOTE_SEND_IPI,
 	NR_TLB_FLUSH_REASONS,
 };
 
+2 -1
include/trace/events/tlb.h
···
 	EM(  TLB_FLUSH_ON_TASK_SWITCH,	"flush on task switch" )	\
 	EM(  TLB_REMOTE_SHOOTDOWN,	"remote shootdown" )		\
 	EM(  TLB_LOCAL_SHOOTDOWN,	"local shootdown" )		\
-	EMe( TLB_LOCAL_MM_SHOOTDOWN,	"local mm shootdown" )
+	EM(  TLB_LOCAL_MM_SHOOTDOWN,	"local mm shootdown" )		\
+	EMe( TLB_REMOTE_SEND_IPI,	"remote ipi send" )
 
 /*
  * First define the enums in TLB_FLUSH_REASON to be exported to userspace
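
The EM()/EMe() list in include/trace/events/tlb.h is an X-macro: the
same list is expanded more than once with different definitions of EM and
EMe, which is why the previous last entry has to change from EMe to EM
when a new entry is appended. The following user-space sketch shows the
general technique only. It is a simplification: the kernel defines the
enum separately in mm_types.h and uses the list for trace enum export and
symbolic printing, whereas here the list generates both the enum and a
name table. The reason_names table and reason_name() are hypothetical.

```c
#include <stddef.h>

/* One list of (enum value, printable name) pairs; EMe marks the last entry. */
#define TLB_FLUSH_REASON						\
	EM(  TLB_FLUSH_ON_TASK_SWITCH,	"flush on task switch" )	\
	EM(  TLB_REMOTE_SHOOTDOWN,	"remote shootdown" )		\
	EM(  TLB_LOCAL_SHOOTDOWN,	"local shootdown" )		\
	EM(  TLB_LOCAL_MM_SHOOTDOWN,	"local mm shootdown" )		\
	EMe( TLB_REMOTE_SEND_IPI,	"remote ipi send" )

/* First expansion: turn every entry into an enumerator. */
#define EM(a, b)	a,
#define EMe(a, b)	a,
enum tlb_flush_reason {
	TLB_FLUSH_REASON
	NR_TLB_FLUSH_REASONS
};
#undef EM
#undef EMe

/* Second expansion: turn every entry into a { value, name } pair. */
#define EM(a, b)	{ a, b },
#define EMe(a, b)	{ a, b }
static const struct {
	int value;
	const char *name;
} reason_names[] = { TLB_FLUSH_REASON };
#undef EM
#undef EMe

/* Look up the printable name for a flush reason. */
static const char *reason_name(int reason)
{
	size_t i;

	for (i = 0; i < sizeof(reason_names) / sizeof(reason_names[0]); i++)
		if (reason_names[i].value == reason)
			return reason_names[i].name;
	return "unknown";
}
```

Because both expansions come from one list, adding TLB_REMOTE_SEND_IPI in
a single place keeps the enumerator and its printed string in step.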