tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference

Since commit 76ebbe78f739 ("locking/barriers: Add implicit
smp_read_barrier_depends() to READ_ONCE()") was merged for the 4.15
kernel, it has not been necessary to use smp_read_barrier_depends().
Similarly, commit 59ecbbe7b31c ("locking/barriers: Kill
lockless_dereference()") removed lockless_dereference() from the
kernel.

Since these primitives are no longer part of the kernel, they do not
belong in the Linux Kernel Memory Consistency Model. This patch
removes them, along with the internal rb-dep relation, and updates the
relevant documentation.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Cc: parri.andrea@gmail.com
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1519169112-20593-12-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
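
The practical effect described in the commit message can be sketched in userspace C. This is only an illustration under stated assumptions: the READ_ONCE() and WRITE_ONCE() macros below are simplified stand-ins rather than the kernel's definitions, and struct foo, item, gp, reader() and writer() are hypothetical names that do not appear in the patch.

```c
/*
 * Userspace sketch only. These macros are simplified stand-ins, NOT the
 * kernel's READ_ONCE()/WRITE_ONCE() definitions (assumption). They rely
 * on the GCC/Clang __typeof__ extension.
 */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

struct foo { int a; };          /* hypothetical payload type */

struct foo item = { .a = 42 };  /* hypothetical shared object */
struct foo *gp;                 /* hypothetical shared pointer, NULL at start */

/*
 * Before 4.15 a reader would have been written as either
 *     p = READ_ONCE(gp); smp_read_barrier_depends(); ... p->a ...
 * or
 *     p = lockless_dereference(gp); ... p->a ...
 * After the commits cited above, READ_ONCE() alone is enough, because
 * the kernel's READ_ONCE() carries the barrier implicitly.
 */
int reader(void)
{
	struct foo *p = READ_ONCE(gp);

	return p ? READ_ONCE(p->a) : -1;  /* -1: nothing published yet */
}

/*
 * item is statically initialised, so this sketch needs no release
 * ordering; a real publisher would normally use smp_store_release()
 * or rcu_assign_pointer().
 */
void writer(void)
{
	WRITE_ONCE(gp, &item);
}
```

Calling reader() before writer() yields -1, and calling it afterwards yields the published value.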

Authored by Alan Stern and committed by Ingo Molnar 8 years ago (bf28ae56, cac79a39)

5 changed files, +46 -48

tools/memory-model/Documentation/cheatsheet.txt
tools/memory-model/Documentation/explanation.txt
tools/memory-model/linux-kernel.bell
tools/memory-model/linux-kernel.cat
tools/memory-model/linux-kernel.def

+1 -2

tools/memory-model/Documentation/cheatsheet.txt

···
 Store, e.g., WRITE_ONCE() Y Y
 Load, e.g., READ_ONCE() Y Y Y
 Unsuccessful RMW operation Y Y Y
-smp_read_barrier_depends() Y Y Y
-*_dereference() Y Y Y Y
+rcu_dereference() Y Y Y Y
 Successful *_acquire() R Y Y Y Y Y Y
 Successful *_release() C Y Y Y W Y
 smp_rmb() Y R Y Y R

+43 -38

tools/memory-model/Documentation/explanation.txt

···
-Explanation of the Linux-Kernel Memory Model
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Explanation of the Linux-Kernel Memory Consistency Model
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 :Author: Alan Stern <stern@rowland.harvard.edu>
 :Created: October 2017
···
 INTRODUCTION
 ------------

-The Linux-kernel memory model (LKMM) is rather complex and obscure.
-This is particularly evident if you read through the linux-kernel.bell
-and linux-kernel.cat files that make up the formal version of the
-memory model; they are extremely terse and their meanings are far from
-clear.
+The Linux-kernel memory consistency model (LKMM) is rather complex and
+obscure. This is particularly evident if you read through the
+linux-kernel.bell and linux-kernel.cat files that make up the formal
+version of the model; they are extremely terse and their meanings are
+far from clear.

 This document describes the ideas underlying the LKMM. It is meant
-for people who want to understand how the memory model was designed.
-It does not go into the details of the code in the .bell and .cat
-files; rather, it explains in English what the code expresses
-symbolically.
+for people who want to understand how the model was designed. It does
+not go into the details of the code in the .bell and .cat files;
+rather, it explains in English what the code expresses symbolically.

 Sections 2 (BACKGROUND) through 5 (ORDERING AND CYCLES) are aimed
-toward beginners; they explain what memory models are and the basic
-notions shared by all such models. People already familiar with these
-concepts can skim or skip over them. Sections 6 (EVENTS) through 12
-(THE FROM_READS RELATION) describe the fundamental relations used in
-many memory models. Starting in Section 13 (AN OPERATIONAL MODEL),
-the workings of the LKMM itself are covered.
+toward beginners; they explain what memory consistency models are and
+the basic notions shared by all such models. People already familiar
+with these concepts can skim or skip over them. Sections 6 (EVENTS)
+through 12 (THE FROM_READS RELATION) describe the fundamental
+relations used in many models. Starting in Section 13 (AN OPERATIONAL
+MODEL), the workings of the LKMM itself are covered.

 Warning: The code examples in this document are not written in the
 proper format for litmus tests. They don't include a header line, the
···
 executed on C before the fence (i.e., those which precede the fence in
 program order).

-smp_read_barrier_depends(), rcu_read_lock(), rcu_read_unlock(), and
-synchronize_rcu() fences have other properties which we discuss later.
+read_lock(), rcu_read_unlock(), and synchronize_rcu() fences have
+other properties which we discuss later.


 PROPAGATION ORDER RELATION: cumul-fence
···
 section, is:

 	X and Y are both loads, X ->addr Y (i.e., there is an address
-	dependency from X to Y), and an smp_read_barrier_depends()
-	fence occurs between them.
+	dependency from X to Y), and X is a READ_ONCE() or an atomic
+	access.

 Dependencies can also cause instructions to be executed in program
 order. This is uncontroversial when the second instruction is a
···
 a particular location before it knows what that location is. However,
 the split-cache design used by Alpha can cause it to behave in a way
 that looks as if the loads were executed out of order (see the next
-section for more details). For this reason, the LKMM does not include
-address dependencies between read events in the ppo relation unless an
-smp_read_barrier_depends() fence is present.
+section for more details). The kernel includes a workaround for this
+problem when the loads come from READ_ONCE(), and therefore the LKMM
+includes address dependencies to loads in the ppo relation.

 On the other hand, dependencies can indirectly affect the ordering of
 two loads. This happens when there is a dependency from a load to a
···
 		int *r1;
 		int r2;

-		r1 = READ_ONCE(ptr);
+		r1 = ptr;
 		r2 = READ_ONCE(*r1);
 	}

-can malfunction on Alpha systems. It is quite possible that r1 = &x
+can malfunction on Alpha systems (notice that P1 uses an ordinary load
+to read ptr instead of READ_ONCE()). It is quite possible that r1 = &x
 and r2 = 0 at the end, in spite of the address dependency.

 At first glance this doesn't seem to make sense. We know that the
···
 incoming stores in FIFO order. In constrast, other architectures
 maintain at least the appearance of FIFO order.

-In practice, this difficulty is solved by inserting an
-smp_read_barrier_depends() fence between P1's two loads. The effect
-of this fence is to cause the CPU not to execute any po-later
-instructions until after the local cache has finished processing all
-the stores it has already received. Thus, if the code was changed to:
+In practice, this difficulty is solved by inserting a special fence
+between P1's two loads when the kernel is compiled for the Alpha
+architecture. In fact, as of version 4.15, the kernel automatically
+adds this fence (called smp_read_barrier_depends() and defined as
+nothing at all on non-Alpha builds) after every READ_ONCE() and atomic
+load. The effect of the fence is to cause the CPU not to execute any
+po-later instructions until after the local cache has finished
+processing all the stores it has already received. Thus, if the code
+was changed to:

 	P1()
 	{
···
 		int r2;

 		r1 = READ_ONCE(ptr);
-		smp_read_barrier_depends();
 		r2 = READ_ONCE(*r1);
 	}

 then we would never get r1 = &x and r2 = 0. By the time P1 executed
 its second load, the x = 1 store would already be fully processed by
-the local cache and available for satisfying the read request.
+the local cache and available for satisfying the read request. Thus
+we have yet another reason why shared data should always be read with
+READ_ONCE() or another synchronization primitive rather than accessed
+directly.

 The LKMM requires that smp_rmb(), acquire fences, and strong fences
 share this property with smp_read_barrier_depends(): They do not allow
···
 the value of x, there is nothing for the smp_rmb() fence to act on.

 The LKMM defines a few extra synchronization operations in terms of
-things we have already covered. In particular, rcu_dereference() and
-lockless_dereference() are both treated as a READ_ONCE() followed by
-smp_read_barrier_depends() -- which also happens to be how they are
-defined in include/linux/rcupdate.h and include/linux/compiler.h,
-respectively.
+things we have already covered. In particular, rcu_dereference() is
+treated as READ_ONCE() and rcu_assign_pointer() is treated as
+smp_store_release() -- which is basically how the Linux kernel treats
+them.

 There are a few oddball fences which need special treatment:
 smp_mb__before_atomic(), smp_mb__after_atomic(), and
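
The last hunk above says the model treats rcu_assign_pointer() as smp_store_release() and rcu_dereference() as READ_ONCE(). A rough userspace analogue using C11 atomics might look like the sketch below. It is not kernel code: memory_order_acquire is used as a conservative stand-in for the address-dependency ordering that READ_ONCE() provides in the kernel (an assumption, since C11 has no exact equivalent), and y, p0_publish(), p1_subscribe() and struct p1_result are hypothetical names. Only ptr, x, r1 and r2 come from the documentation's own example.

```c
#include <stdatomic.h>

int x;                      /* payload, initially 0 */
int y;                      /* hypothetical initial target of ptr */
_Atomic(int *) ptr = &y;    /* shared pointer read by P1 */

/*
 * Writer side (P0): initialise the payload, then publish it.
 * The release store models rcu_assign_pointer()/smp_store_release().
 */
void p0_publish(void)
{
	x = 1;
	atomic_store_explicit(&ptr, &x, memory_order_release);
}

struct p1_result {
	int *r1;
	int r2;
};

/*
 * Reader side (P1): load the pointer once, then dereference it.
 * The acquire load is a stronger stand-in for rcu_dereference(),
 * which the model treats as READ_ONCE() (assumption: C11 cannot
 * express plain dependency ordering portably).
 */
struct p1_result p1_subscribe(void)
{
	struct p1_result res;

	res.r1 = atomic_load_explicit(&ptr, memory_order_acquire);
	res.r2 = *res.r1;
	return res;
}
```

With this pairing, a reader that observes r1 == &x is also guaranteed to observe r2 == 1, which is the outcome the documentation says the READ_ONCE() version ensures.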

-1

tools/memory-model/linux-kernel.bell

···
 enum Barriers = 'wmb (*smp_wmb*) ||
 	'rmb (*smp_rmb*) ||
 	'mb (*smp_mb*) ||
-	'rb_dep (*smp_read_barrier_depends*) ||
 	'rcu-lock (*rcu_read_lock*) ||
 	'rcu-unlock (*rcu_read_unlock*) ||
 	'sync-rcu (*synchronize_rcu*) ||

+2 -5

tools/memory-model/linux-kernel.cat

···
 (*******************)

 (* Fences *)
-let rb-dep = [R] ; fencerel(Rb_dep) ; [R]
 let rmb = [R \ Noreturn] ; fencerel(Rmb) ; [R \ Noreturn]
 let wmb = [W] ; fencerel(Wmb) ; [W]
 let mb = ([M] ; fencerel(Mb) ; [M]) |
···
 let rwdep = (dep | ctrl) ; [W]
 let overwrite = co | fr
 let to-w = rwdep | (overwrite & int)
-let rrdep = addr | (dep ; rfi)
-let strong-rrdep = rrdep+ & rb-dep
-let to-r = strong-rrdep | rfi-rel-acq
+let to-r = addr | (dep ; rfi) | rfi-rel-acq
 let fence = strong-fence | wmb | po-rel | rmb | acq-po
-let ppo = rrdep* ; (to-r | to-w | fence)
+let ppo = to-r | to-w | fence

 (* Propagation: Ordering from release operations and strong fences. *)
 let A-cumul(r) = rfe? ; r

-2

tools/memory-model/linux-kernel.def

···
 smp_store_release(X,V) { __store{release}(*X,V); }
 smp_load_acquire(X) __load{acquire}(*X)
 rcu_assign_pointer(X,V) { __store{release}(X,V); }
-lockless_dereference(X) __load{lderef}(X)
 rcu_dereference(X) __load{deref}(X)

 // Fences
 smp_mb() { __fence{mb} ; }
 smp_rmb() { __fence{rmb} ; }
 smp_wmb() { __fence{wmb} ; }
-smp_read_barrier_depends() { __fence{rb_dep}; }
 smp_mb__before_atomic() { __fence{before-atomic} ; }
 smp_mb__after_atomic() { __fence{after-atomic} ; }
 smp_mb__after_spinlock() { __fence{after-spinlock} ; }