···
      - SMP barrier pairing.
      - Examples of memory barrier sequences.
      - Read memory barriers vs load speculation.
+     - Transitivity
 
  (*) Explicit kernel barriers.
 
···
         The speculation is discarded --->   --->|  A->1 |------>|       |
         and an updated value is                +-------+       |       |
         retrieved                              :       :       +-------+
+
+
+TRANSITIVITY
+------------
+
+Transitivity is a deeply intuitive notion about ordering that is not
+always provided by real computer systems.  The following example
+demonstrates transitivity (also called "cumulativity"):
+
+        CPU 1                   CPU 2                   CPU 3
+        ======================= ======================= =======================
+                { X = 0, Y = 0 }
+        STORE X=1               LOAD X                  STORE Y=1
+                                <general barrier>       <general barrier>
+                                LOAD Y                  LOAD X
+
+Suppose that CPU 2's load from X returns 1 and its load from Y returns 0.
+This indicates that CPU 2's load from X in some sense follows CPU 1's
+store to X and that CPU 2's load from Y in some sense preceded CPU 3's
+store to Y.  The question is then "Can CPU 3's load from X return 0?"
+
+Because CPU 2's load from X in some sense came after CPU 1's store, it
+is natural to expect that CPU 3's load from X must therefore return 1.
+This expectation is an example of transitivity: if a load executing on
+CPU A follows a load from the same variable executing on CPU B, then
+CPU A's load must either return the same value that CPU B's load did,
+or must return some later value.
+
+In the Linux kernel, use of general memory barriers guarantees
+transitivity.  Therefore, in the above example, if CPU 2's load from X
+returns 1 and its load from Y returns 0, then CPU 3's load from X must
+also return 1.
+
+However, transitivity is -not- guaranteed for read or write barriers.
+For example, suppose that CPU 2's general barrier in the above example
+is changed to a read barrier as shown below:
+
+        CPU 1                   CPU 2                   CPU 3
+        ======================= ======================= =======================
+                { X = 0, Y = 0 }
+        STORE X=1               LOAD X                  STORE Y=1
+                                <read barrier>          <general barrier>
+                                LOAD Y                  LOAD X
+
+This substitution destroys transitivity: in this example, it is perfectly
+legal for CPU 2's load from X to return 1, its load from Y to return 0,
+and CPU 3's load from X to return 0.
+
+The key point is that although CPU 2's read barrier orders its pair
+of loads, it does not guarantee to order CPU 1's store.  Therefore, if
+this example runs on a system where CPUs 1 and 2 share a store buffer
+or a level of cache, CPU 2 might have early access to CPU 1's writes.
+General barriers are therefore required to ensure that all CPUs agree
+on the combined order of CPU 1's and CPU 2's accesses.
+
+To reiterate, if your code requires transitivity, use general barriers
+throughout.
 
 
 ========================
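
The two three-CPU examples in the added TRANSITIVITY section can be checked by brute force. The sketch below is not kernel code and uses no kernel API. It is a toy model, under stated assumptions, that enumerates every ordering of the events and counts those producing the outcome in question (CPU 2 reads X=1 and Y=0, CPU 3 reads X=0). The assumptions are: each CPU's two accesses stay in program order (which both the general and the read barrier provide); CPU 3's store to Y becomes visible everywhere at once; and CPU 1's store to X either becomes visible to all CPUs in one step (standing in for "general barriers throughout") or reaches CPU 2 and CPU 3 in two independent steps (standing in for CPU 2 having early access to CPU 1's write). All identifiers, including count_forbidden(), are hypothetical names chosen for this illustration.

```c
#include <stdbool.h>

/* Events in the toy model.  VIS_X_CPUn is the moment CPU 1's store X=1
 * becomes visible to CPU n (a modelling assumption, not a kernel concept). */
enum {
    VIS_X_CPU2, VIS_X_CPU3,   /* propagation of CPU 1's STORE X=1 */
    C2_LOAD_X,  C2_LOAD_Y,    /* CPU 2: LOAD X; <barrier>; LOAD Y  */
    C3_STORE_Y, C3_LOAD_X,    /* CPU 3: STORE Y=1; <barrier>; LOAD X */
    NEVENTS
};

static int  pos[NEVENTS];     /* pos[e] = time slot assigned to event e */
static bool used[NEVENTS];    /* used[s] = slot s already taken */

/* Return 1 if the current ordering yields r1 == 1, r2 == 0, r3 == 0. */
static int check(bool atomic_stores)
{
    /* Both barrier flavours keep each CPU's accesses in program order. */
    if (pos[C2_LOAD_X] > pos[C2_LOAD_Y])
        return 0;
    if (pos[C3_STORE_Y] > pos[C3_LOAD_X])
        return 0;

    /* With atomic stores, CPU 3 sees X=1 at the same instant CPU 2 does;
     * the separate VIS_X_CPU3 event is then ignored. */
    int vis3 = atomic_stores ? pos[VIS_X_CPU2] : pos[VIS_X_CPU3];

    int r1 = pos[VIS_X_CPU2] < pos[C2_LOAD_X];  /* CPU 2's load from X */
    int r2 = pos[C3_STORE_Y] < pos[C2_LOAD_Y];  /* CPU 2's load from Y */
    int r3 = vis3 < pos[C3_LOAD_X];             /* CPU 3's load from X */

    return r1 == 1 && r2 == 0 && r3 == 0;
}

/* Assign a time slot to event e and recurse over all permutations. */
static int place(int e, bool atomic_stores)
{
    if (e == NEVENTS)
        return check(atomic_stores);

    int count = 0;
    for (int slot = 0; slot < NEVENTS; slot++) {
        if (used[slot])
            continue;
        used[slot] = true;
        pos[e] = slot;
        count += place(e + 1, atomic_stores);
        used[slot] = false;
    }
    return count;
}

/* Number of orderings in which CPU 2 reads X=1, Y=0 and CPU 3 reads X=0. */
int count_forbidden(bool atomic_stores)
{
    return place(0, atomic_stores);
}
```

In this model, count_forbidden(true) finds no ordering with that outcome, matching the general-barrier example, while count_forbidden(false) finds one (X reaches CPU 2, CPU 2 loads X then Y, CPU 3 stores Y then loads X, and only then does X reach CPU 3), matching the read-barrier example. The model says nothing about which real architectures behave this way.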