x86/mm: Fix SMP ordering in switch_mm_irqs_off()

Stephen noted that it is possible for there to be no smp_mb() between
the loaded_mm store and the tlb_gen load in switch_mm(), which breaks
the ordering against flush_tlb_mm_range(): switch_mm() can fail to
observe a recent tlb_gen update and consequently fail to flush the
TLBs. (An illustrative userspace sketch of this pattern follows the
diff below.)

[ dhansen: merge conflict fixed by Ingo ]

Fixes: 209954cbc7d0 ("x86/mm/tlb: Update mm_cpumask lazily")
Reported-by: Stephen Dolan <sdolan@janestreet.com>
Closes: https://lore.kernel.org/all/CAHDw0oGd0B4=uuv8NGqbUQ_ZVmSheU2bN70e4QhFXWvuAZdt2w@mail.gmail.com/
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>


Changed files (+22 -2):
arch/x86/mm/tlb.c
@@ -911,11 +911,31 @@
 	 * CR3 and cpu_tlbstate.loaded_mm are not all in sync.
 	 */
 	this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
-	barrier();
 
-	/* Start receiving IPIs and then read tlb_gen (and LAM below) */
+	/*
+	 * Make sure this CPU is set in mm_cpumask() such that we'll
+	 * receive invalidation IPIs.
+	 *
+	 * Rely on the smp_mb() implied by cpumask_set_cpu()'s atomic
+	 * operation, or explicitly provide one. Such that:
+	 *
+	 *	switch_mm_irqs_off()				flush_tlb_mm_range()
+	 *	  smp_store_release(loaded_mm, SWITCHING);	  atomic64_inc_return(tlb_gen)
+	 *	  smp_mb();			// here		  // smp_mb() implied
+	 *	  atomic64_read(tlb_gen);			  this_cpu_read(loaded_mm);
+	 *
+	 * we properly order against flush_tlb_mm_range(), where the
+	 * loaded_mm load can happen in native_flush_tlb_multi() ->
+	 * should_flush_tlb().
+	 *
+	 * This way switch_mm() must see the new tlb_gen or
+	 * flush_tlb_mm_range() must see the new loaded_mm, or both.
+	 */
 	if (next != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next)))
 		cpumask_set_cpu(cpu, mm_cpumask(next));
+	else
+		smp_mb();
+
 	next_tlb_gen = atomic64_read(&next->context.tlb_gen);
 
 	ns = choose_new_asid(next, next_tlb_gen);
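
Two things are worth spelling out about the shape of the fix. The removed
barrier() is only a compiler barrier, so it never prevented the CPU from
reordering the loaded_mm store with the later tlb_gen load. The required
full barrier now comes from cpumask_set_cpu()'s atomic RMW when this CPU's
bit still has to be set in mm_cpumask(next) (the implied smp_mb() the new
comment refers to), and from the explicit smp_mb() in the else branch when
the bit is already set and no atomic operation runs.

As an editorial illustration (not part of the kernel change), the race has
the classic store-buffering shape: each side stores to one location and then
loads the other, and unless both sides have a full barrier between their
store and their load, both loads may return the old values. The userspace
C11 sketch below mimics that shape; the names loaded_mm and tlb_gen are
stand-ins, not the kernel objects, and whether the stale outcome is actually
observed depends on the CPU and compiler. Uncommenting the two
atomic_thread_fence(memory_order_seq_cst) lines plays the role of the
smp_mb() pairing and makes the "both stale" outcome impossible.

/* Store-buffering sketch, userspace only.  Build: cc -O2 -pthread sb_sketch.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define RUNS 200000

static atomic_int loaded_mm;   /* stand-in for cpu_tlbstate.loaded_mm */
static atomic_int tlb_gen;     /* stand-in for mm->context.tlb_gen    */
static int r_switch, r_flush;  /* what each side observed             */
static pthread_barrier_t sync_point;

/* Analogue of switch_mm_irqs_off(): store loaded_mm, then load tlb_gen. */
static void *switch_side(void *arg)
{
	(void)arg;
	for (int i = 1; i <= RUNS; i++) {
		pthread_barrier_wait(&sync_point);               /* start of run i   */
		atomic_store_explicit(&loaded_mm, i, memory_order_relaxed);
		/* atomic_thread_fence(memory_order_seq_cst);        <- smp_mb() analogue */
		r_switch = atomic_load_explicit(&tlb_gen, memory_order_relaxed);
		pthread_barrier_wait(&sync_point);               /* end of run i     */
		pthread_barrier_wait(&sync_point);               /* main checked     */
	}
	return NULL;
}

/* Analogue of flush_tlb_mm_range(): bump tlb_gen, then load loaded_mm. */
static void *flush_side(void *arg)
{
	(void)arg;
	for (int i = 1; i <= RUNS; i++) {
		pthread_barrier_wait(&sync_point);
		atomic_store_explicit(&tlb_gen, i, memory_order_relaxed);
		/* atomic_thread_fence(memory_order_seq_cst);        <- smp_mb() analogue */
		r_flush = atomic_load_explicit(&loaded_mm, memory_order_relaxed);
		pthread_barrier_wait(&sync_point);
		pthread_barrier_wait(&sync_point);
	}
	return NULL;
}

int main(void)
{
	pthread_t a, b;
	long both_stale = 0;

	pthread_barrier_init(&sync_point, NULL, 3);
	pthread_create(&a, NULL, switch_side, NULL);
	pthread_create(&b, NULL, flush_side, NULL);

	for (int i = 1; i <= RUNS; i++) {
		pthread_barrier_wait(&sync_point);       /* release both sides */
		pthread_barrier_wait(&sync_point);       /* both sides done    */
		/* Both loads returned last run's value: the "missed flush" case. */
		if (r_switch == i - 1 && r_flush == i - 1)
			both_stale++;
		pthread_barrier_wait(&sync_point);       /* let threads go on  */
	}

	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_barrier_destroy(&sync_point);
	printf("store->load reordering observed in %ld of %d runs\n",
	       both_stale, RUNS);
	return 0;
}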