Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

smp: Document transitivity for memory barriers.

Transitivity is guaranteed only for full memory barriers (smp_mb()).

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Documentation/memory-barriers.txt (+58)
···
      - SMP barrier pairing.
      - Examples of memory barrier sequences.
      - Read memory barriers vs load speculation.
+     - Transitivity

  (*) Explicit kernel barriers.

···
        The speculation is discarded --->   --->| A->1  |------>|       |
        and an updated value is                 +-------+       |       |
        retrieved                       :       :               +-------+
+
+
+TRANSITIVITY
+------------
+
+Transitivity is a deeply intuitive notion about ordering that is not
+always provided by real computer systems. The following example
+demonstrates transitivity (also called "cumulativity"):
+
+        CPU 1                   CPU 2                   CPU 3
+        ======================= ======================= =======================
+                { X = 0, Y = 0 }
+        STORE X=1               LOAD X                  STORE Y=1
+                                <general barrier>       <general barrier>
+                                LOAD Y                  LOAD X
+
+Suppose that CPU 2's load from X returns 1 and its load from Y returns 0.
+This indicates that CPU 2's load from X in some sense follows CPU 1's
+store to X and that CPU 2's load from Y in some sense preceded CPU 3's
+store to Y. The question is then "Can CPU 3's load from X return 0?"
+
+Because CPU 2's load from X in some sense came after CPU 1's store, it
+is natural to expect that CPU 3's load from X must therefore return 1.
+This expectation is an example of transitivity: if a load executing on
+CPU A follows a load from the same variable executing on CPU B, then
+CPU A's load must either return the same value that CPU B's load did,
+or must return some later value.
+
+In the Linux kernel, use of general memory barriers guarantees
+transitivity. Therefore, in the above example, if CPU 2's load from X
+returns 1 and its load from Y returns 0, then CPU 3's load from X must
+also return 1.
+
+However, transitivity is -not- guaranteed for read or write barriers.
+For example, suppose that CPU 2's general barrier in the above example
+is changed to a read barrier as shown below:
+
+        CPU 1                   CPU 2                   CPU 3
+        ======================= ======================= =======================
+                { X = 0, Y = 0 }
+        STORE X=1               LOAD X                  STORE Y=1
+                                <read barrier>          <general barrier>
+                                LOAD Y                  LOAD X
+
+This substitution destroys transitivity: in this example, it is perfectly
+legal for CPU 2's load from X to return 1, its load from Y to return 0,
+and CPU 3's load from X to return 0.
+
+The key point is that although CPU 2's read barrier orders its pair
+of loads, it does not guarantee to order CPU 1's store. Therefore, if
+this example runs on a system where CPUs 1 and 2 share a store buffer
+or a level of cache, CPU 2 might have early access to CPU 1's writes.
+General barriers are therefore required to ensure that all CPUs agree
+on the combined order of CPU 1's and CPU 2's accesses.
+
+To reiterate, if your code requires transitivity, use general barriers
+throughout.


 ========================