Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

doc: Update RCU data-structure documentation for rcu_segcblist

The rcu_segcblist data structure, which contains segmented lists
of RCU callbacks, was recently added. This commit updates the
documentation accordingly.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

+156 -85
+144 -63
Documentation/RCU/Design/Data-Structures/Data-Structures.html
··· 19 19 The <tt>rcu_state</tt> Structure</a> 20 20 <li> <a href="#The rcu_node Structure"> 21 21 The <tt>rcu_node</tt> Structure</a> 22 + <li> <a href="#The rcu_segcblist Structure"> 23 + The <tt>rcu_segcblist</tt> Structure</a> 22 24 <li> <a href="#The rcu_data Structure"> 23 25 The <tt>rcu_data</tt> Structure</a> 24 26 <li> <a href="#The rcu_dynticks Structure"> ··· 843 841 Finally, lines&nbsp;64-66 produce an error if the maximum number of 844 842 CPUs is too large for the specified fanout. 845 843 844 + <h3><a name="The rcu_segcblist Structure"> 845 + The <tt>rcu_segcblist</tt> Structure</a></h3> 846 + 847 + The <tt>rcu_segcblist</tt> structure maintains a segmented list of 848 + callbacks as follows: 849 + 850 + <pre> 851 + 1 #define RCU_DONE_TAIL 0 852 + 2 #define RCU_WAIT_TAIL 1 853 + 3 #define RCU_NEXT_READY_TAIL 2 854 + 4 #define RCU_NEXT_TAIL 3 855 + 5 #define RCU_CBLIST_NSEGS 4 856 + 6 857 + 7 struct rcu_segcblist { 858 + 8 struct rcu_head *head; 859 + 9 struct rcu_head **tails[RCU_CBLIST_NSEGS]; 860 + 10 unsigned long gp_seq[RCU_CBLIST_NSEGS]; 861 + 11 long len; 862 + 12 long len_lazy; 863 + 13 }; 864 + </pre> 865 + 866 + <p> 867 + The segments are as follows: 868 + 869 + <ol> 870 + <li> <tt>RCU_DONE_TAIL</tt>: Callbacks whose grace periods have elapsed. 871 + These callbacks are ready to be invoked. 872 + <li> <tt>RCU_WAIT_TAIL</tt>: Callbacks that are waiting for the 873 + current grace period. 874 + Note that different CPUs can have different ideas about which 875 + grace period is current, hence the <tt>-&gt;gp_seq</tt> field. 876 + <li> <tt>RCU_NEXT_READY_TAIL</tt>: Callbacks waiting for the next 877 + grace period to start. 878 + <li> <tt>RCU_NEXT_TAIL</tt>: Callbacks that have not yet been 879 + associated with a grace period. 880 + </ol> 881 + 882 + <p> 883 + The <tt>-&gt;head</tt> pointer references the first callback or 884 + is <tt>NULL</tt> if the list contains no callbacks (which is 885 + <i>not</i> the same as being empty). 
886 + Each element of the <tt>-&gt;tails[]</tt> array references the 887 + <tt>-&gt;next</tt> pointer of the last callback in the corresponding 888 + segment of the list, or the list's <tt>-&gt;head</tt> pointer if 889 + that segment and all previous segments are empty. 890 + If the corresponding segment is empty but some previous segment is 891 + not empty, then the array element is identical to its predecessor. 892 + Older callbacks are closer to the head of the list, and new callbacks 893 + are added at the tail. 894 + This relationship between the <tt>-&gt;head</tt> pointer, the 895 + <tt>-&gt;tails[]</tt> array, and the callbacks is shown in this 896 + diagram: 897 + 898 + </p><p><img src="nxtlist.svg" alt="nxtlist.svg" width="40%"> 899 + 900 + </p><p>In this figure, the <tt>-&gt;head</tt> pointer references the 901 + first 902 + RCU callback in the list. 903 + The <tt>-&gt;tails[RCU_DONE_TAIL]</tt> array element references 904 + the <tt>-&gt;head</tt> pointer itself, indicating that none 905 + of the callbacks is ready to invoke. 906 + The <tt>-&gt;tails[RCU_WAIT_TAIL]</tt> array element references callback 907 + CB&nbsp;2's <tt>-&gt;next</tt> pointer, which indicates that 908 + CB&nbsp;1 and CB&nbsp;2 are both waiting on the current grace period, 909 + give or take possible disagreements about exactly which grace period 910 + is the current one. 911 + The <tt>-&gt;tails[RCU_NEXT_READY_TAIL]</tt> array element 912 + references the same RCU callback that <tt>-&gt;tails[RCU_WAIT_TAIL]</tt> 913 + does, which indicates that there are no callbacks waiting on the next 914 + RCU grace period. 915 + The <tt>-&gt;tails[RCU_NEXT_TAIL]</tt> array element references 916 + CB&nbsp;4's <tt>-&gt;next</tt> pointer, indicating that all the 917 + remaining RCU callbacks have not yet been assigned to an RCU grace 918 + period. 
919 + Note that the <tt>-&gt;tails[RCU_NEXT_TAIL]</tt> array element 920 + always references the last RCU callback's <tt>-&gt;next</tt> pointer 921 + unless the callback list is empty, in which case it references 922 + the <tt>-&gt;head</tt> pointer. 923 + 924 + <p> 925 + There is one additional important special case for the 926 + <tt>-&gt;tails[RCU_NEXT_TAIL]</tt> array element: It can be <tt>NULL</tt> 927 + when this list is <i>disabled</i>. 928 + Lists are disabled when the corresponding CPU is offline or when 929 + the corresponding CPU's callbacks are offloaded to a kthread, 930 + both of which are described elsewhere. 931 + 932 + </p><p>CPUs advance their callbacks from the 933 + <tt>RCU_NEXT_TAIL</tt> to the <tt>RCU_NEXT_READY_TAIL</tt> to the 934 + <tt>RCU_WAIT_TAIL</tt> to the <tt>RCU_DONE_TAIL</tt> list segments 935 + as grace periods advance. 936 + 937 + </p><p>The <tt>-&gt;gp_seq[]</tt> array records grace-period 938 + numbers corresponding to the list segments. 939 + This is what allows different CPUs to have different ideas as to 940 + which is the current grace period while still avoiding premature 941 + invocation of their callbacks. 942 + In particular, this allows CPUs that go idle for extended periods 943 + to determine which of their callbacks are ready to be invoked after 944 + reawakening. 945 + 946 + </p><p>The <tt>-&gt;len</tt> counter contains the number of 947 + callbacks in <tt>-&gt;head</tt>, and the 948 + <tt>-&gt;len_lazy</tt> contains the number of those callbacks that 949 + are known to only free memory, and whose invocation can therefore 950 + be safely deferred. 951 + 952 + <p><b>Important note</b>: It is the <tt>-&gt;len</tt> field that 953 + determines whether or not there are callbacks associated with 954 + this <tt>rcu_segcblist</tt> structure, <i>not</i> the <tt>-&gt;head</tt> 955 + pointer. 
956 + The reason for this is that all the ready-to-invoke callbacks 957 + (that is, those in the <tt>RCU_DONE_TAIL</tt> segment) are extracted 958 + all at once at callback-invocation time. 959 + If callback invocation must be postponed, for example, because a 960 + high-priority process just woke up on this CPU, then the remaining 961 + callbacks are placed back on the <tt>RCU_DONE_TAIL</tt> segment. 962 + Either way, the <tt>-&gt;len</tt> and <tt>-&gt;len_lazy</tt> counts 963 + are adjusted after the corresponding callbacks have been invoked, and so 964 + again it is the <tt>-&gt;len</tt> count that accurately reflects whether 965 + or not there are callbacks associated with this <tt>rcu_segcblist</tt> 966 + structure. 967 + Of course, off-CPU sampling of the <tt>-&gt;len</tt> count requires 968 + the use of appropriate synchronization, for example, memory barriers. 969 + This synchronization can be a bit subtle, particularly in the case 970 + of <tt>rcu_barrier()</tt>. 971 + 846 972 <h3><a name="The rcu_data Structure"> 847 973 The <tt>rcu_data</tt> Structure</a></h3> 848 974 ··· 1113 983 as follows: 1114 984 1115 985 <pre> 1116 - 1 struct rcu_head *nxtlist; 1117 - 2 struct rcu_head **nxttail[RCU_NEXT_SIZE]; 1118 - 3 unsigned long nxtcompleted[RCU_NEXT_SIZE]; 1119 - 4 long qlen_lazy; 1120 - 5 long qlen; 1121 - 6 long qlen_last_fqs_check; 986 + 1 struct rcu_segcblist cblist; 987 + 2 long qlen_last_fqs_check; 988 + 3 unsigned long n_cbs_invoked; 989 + 4 unsigned long n_nocbs_invoked; 990 + 5 unsigned long n_cbs_orphaned; 991 + 6 unsigned long n_cbs_adopted; 1122 992 7 unsigned long n_force_qs_snap; 1123 - 8 unsigned long n_cbs_invoked; 1124 - 9 unsigned long n_cbs_orphaned; 1125 - 10 unsigned long n_cbs_adopted; 1126 - 11 long blimit; 993 + 8 long blimit; 1127 994 </pre> 1128 995 1129 - <p>The <tt>-&gt;nxtlist</tt> pointer and the 1130 - <tt>-&gt;nxttail[]</tt> array form a four-segment list with 1131 - older callbacks near the head and newer ones near the tail. 
1132 - Each segment contains callbacks with the corresponding relationship 1133 - to the current grace period. 1134 - The pointer out of the end of each of the four segments is referenced 1135 - by the element of the <tt>-&gt;nxttail[]</tt> array indexed by 1136 - <tt>RCU_DONE_TAIL</tt> (for callbacks handled by a prior grace period), 1137 - <tt>RCU_WAIT_TAIL</tt> (for callbacks waiting on the current grace period), 1138 - <tt>RCU_NEXT_READY_TAIL</tt> (for callbacks that will wait on the next 1139 - grace period), and 1140 - <tt>RCU_NEXT_TAIL</tt> (for callbacks that are not yet associated 1141 - with a specific grace period) 1142 - respectively, as shown in the following figure. 1143 - 1144 - </p><p><img src="nxtlist.svg" alt="nxtlist.svg" width="40%"> 1145 - 1146 - </p><p>In this figure, the <tt>-&gt;nxtlist</tt> pointer references the 1147 - first 1148 - RCU callback in the list. 1149 - The <tt>-&gt;nxttail[RCU_DONE_TAIL]</tt> array element references 1150 - the <tt>-&gt;nxtlist</tt> pointer itself, indicating that none 1151 - of the callbacks is ready to invoke. 1152 - The <tt>-&gt;nxttail[RCU_WAIT_TAIL]</tt> array element references callback 1153 - CB&nbsp;2's <tt>-&gt;next</tt> pointer, which indicates that 1154 - CB&nbsp;1 and CB&nbsp;2 are both waiting on the current grace period. 1155 - The <tt>-&gt;nxttail[RCU_NEXT_READY_TAIL]</tt> array element 1156 - references the same RCU callback that <tt>-&gt;nxttail[RCU_WAIT_TAIL]</tt> 1157 - does, which indicates that there are no callbacks waiting on the next 1158 - RCU grace period. 1159 - The <tt>-&gt;nxttail[RCU_NEXT_TAIL]</tt> array element references 1160 - CB&nbsp;4's <tt>-&gt;next</tt> pointer, indicating that all the 1161 - remaining RCU callbacks have not yet been assigned to an RCU grace 1162 - period. 
1163 - Note that the <tt>-&gt;nxttail[RCU_NEXT_TAIL]</tt> array element 1164 - always references the last RCU callback's <tt>-&gt;next</tt> pointer 1165 - unless the callback list is empty, in which case it references 1166 - the <tt>-&gt;nxtlist</tt> pointer. 1167 - 1168 - </p><p>CPUs advance their callbacks from the 1169 - <tt>RCU_NEXT_TAIL</tt> to the <tt>RCU_NEXT_READY_TAIL</tt> to the 1170 - <tt>RCU_WAIT_TAIL</tt> to the <tt>RCU_DONE_TAIL</tt> list segments 1171 - as grace periods advance. 996 + <p>The <tt>-&gt;cblist</tt> structure is the segmented callback list 997 + described earlier. 1172 998 The CPU advances the callbacks in its <tt>rcu_data</tt> structure 1173 999 whenever it notices that another RCU grace period has completed. 1174 1000 The CPU detects the completion of an RCU grace period by noticing ··· 1135 1049 <tt>-&gt;completed</tt> field is updated at the end of each 1136 1050 grace period. 1137 1051 1138 - </p><p>The <tt>-&gt;nxtcompleted[]</tt> array records grace-period 1139 - numbers corresponding to the list segments. 1140 - This allows CPUs that go idle for extended periods to determine 1141 - which of their callbacks are ready to be invoked after reawakening. 1142 - 1143 - </p><p>The <tt>-&gt;qlen</tt> counter contains the number of 1144 - callbacks in <tt>-&gt;nxtlist</tt>, and the 1145 - <tt>-&gt;qlen_lazy</tt> contains the number of those callbacks that 1146 - are known to only free memory, and whose invocation can therefore 1147 - be safely deferred. 1052 + <p> 1148 1053 The <tt>-&gt;qlen_last_fqs_check</tt> and 1149 1054 <tt>-&gt;n_force_qs_snap</tt> coordinate the forcing of quiescent 1150 1055 states from <tt>call_rcu()</tt> and friends when callback ··· 1146 1069 fields count the number of callbacks invoked, 1147 1070 sent to other CPUs when this CPU goes offline, 1148 1071 and received from other CPUs when those other CPUs go offline. 
1072 + The <tt>-&gt;n_nocbs_invoked</tt> is used when the CPU's callbacks 1073 + are offloaded to a kthread. 1074 + 1075 + <p> 1149 1076 Finally, the <tt>-&gt;blimit</tt> counter is the maximum number of 1150 1077 RCU callbacks that may be invoked at a given time. 1151 1078
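Reviewer's sketch (not part of the patch): the ->tails[] rules documented above can be exercised in a small user-space model. This is only an illustration under stated assumptions. The sketch_*() helpers are hypothetical names and are not the kernel's API, struct rcu_head is reduced to its ->next pointer, and the advance step ignores ->gp_seq[] entirely, simply shifting each segment one step toward RCU_DONE_TAIL.

```c
#include <assert.h>
#include <stddef.h>

#define RCU_DONE_TAIL        0
#define RCU_WAIT_TAIL        1
#define RCU_NEXT_READY_TAIL  2
#define RCU_NEXT_TAIL        3
#define RCU_CBLIST_NSEGS     4

/* Simplified stand-in: the real rcu_head also carries a function pointer. */
struct rcu_head {
	struct rcu_head *next;
};

struct rcu_segcblist {
	struct rcu_head *head;
	struct rcu_head **tails[RCU_CBLIST_NSEGS];
	unsigned long gp_seq[RCU_CBLIST_NSEGS];
	long len;
	long len_lazy;
};

/* Hypothetical helper: empty list, so every tail references ->head. */
static void sketch_init(struct rcu_segcblist *l)
{
	int i;

	l->head = NULL;
	for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
		l->tails[i] = &l->head;
		l->gp_seq[i] = 0;
	}
	l->len = 0;
	l->len_lazy = 0;
}

/* Hypothetical helper: new callbacks go at the tail, in RCU_NEXT_TAIL. */
static void sketch_enqueue(struct rcu_segcblist *l, struct rcu_head *rhp)
{
	/* A NULL RCU_NEXT_TAIL element would mean the list is disabled. */
	assert(l->tails[RCU_NEXT_TAIL] != NULL);
	rhp->next = NULL;
	*l->tails[RCU_NEXT_TAIL] = rhp;
	l->tails[RCU_NEXT_TAIL] = &rhp->next;
	l->len++;
}

/* A segment is empty when its tail equals its predecessor's tail
 * (or equals &->head for the first segment). */
static int sketch_seg_empty(struct rcu_segcblist *l, int seg)
{
	if (seg == RCU_DONE_TAIL)
		return l->tails[seg] == &l->head;
	return l->tails[seg] == l->tails[seg - 1];
}

/* Count the callbacks between the previous segment's tail and this one's. */
static long sketch_seg_count(struct rcu_segcblist *l, int seg)
{
	struct rcu_head **pp;
	long n = 0;

	pp = seg == RCU_DONE_TAIL ? &l->head : l->tails[seg - 1];
	while (pp != l->tails[seg]) {
		n++;
		pp = &(*pp)->next;
	}
	return n;
}

/* Simplification: move every segment one step toward RCU_DONE_TAIL,
 * leaving RCU_NEXT_TAIL empty. No grace-period numbers are consulted. */
static void sketch_advance_one(struct rcu_segcblist *l)
{
	int i;

	for (i = RCU_DONE_TAIL; i < RCU_NEXT_TAIL; i++)
		l->tails[i] = l->tails[i + 1];
}

/* Rebuild the state in the figure: CB 1 and CB 2 waiting on the current
 * grace period, CB 3 and CB 4 not yet associated with a grace period. */
static void sketch_build_figure(struct rcu_segcblist *l, struct rcu_head cb[4])
{
	sketch_init(l);
	sketch_enqueue(l, &cb[0]);
	sketch_enqueue(l, &cb[1]);
	sketch_advance_one(l);	/* RCU_NEXT_TAIL -> RCU_NEXT_READY_TAIL */
	sketch_advance_one(l);	/* RCU_NEXT_READY_TAIL -> RCU_WAIT_TAIL */
	sketch_enqueue(l, &cb[2]);
	sketch_enqueue(l, &cb[3]);
}
```

Calling sketch_build_figure() on a local struct rcu_segcblist and a four-element array of struct rcu_head leaves ->tails[RCU_DONE_TAIL] referencing ->head, both ->tails[RCU_WAIT_TAIL] and ->tails[RCU_NEXT_READY_TAIL] referencing the second callback's ->next pointer, and ->tails[RCU_NEXT_TAIL] referencing the fourth callback's ->next pointer, which is the arrangement the documentation text describes for nxtlist.svg.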
+12 -22
Documentation/RCU/Design/Data-Structures/nxtlist.svg
··· 19 19 id="svg2" 20 20 version="1.1" 21 21 inkscape:version="0.48.4 r9939" 22 - sodipodi:docname="nxtlist.fig"> 22 + sodipodi:docname="segcblist.svg"> 23 23 <metadata 24 24 id="metadata94"> 25 25 <rdf:RDF> ··· 28 28 <dc:format>image/svg+xml</dc:format> 29 29 <dc:type 30 30 rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> 31 - <dc:title></dc:title> 31 + <dc:title /> 32 32 </cc:Work> 33 33 </rdf:RDF> 34 34 </metadata> ··· 241 241 xml:space="preserve" 242 242 x="225" 243 243 y="675" 244 - fill="#000000" 245 - font-family="Courier" 246 244 font-style="normal" 247 245 font-weight="bold" 248 246 font-size="324" 249 - text-anchor="start" 250 - id="text64">nxtlist</text> 247 + id="text64" 248 + style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;head</text> 251 249 <!-- Text --> 252 250 <text 253 251 xml:space="preserve" 254 252 x="225" 255 253 y="1800" 256 - fill="#000000" 257 - font-family="Courier" 258 254 font-style="normal" 259 255 font-weight="bold" 260 256 font-size="324" 261 - text-anchor="start" 262 - id="text66">nxttail[RCU_DONE_TAIL]</text> 257 + id="text66" 258 + style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;tails[RCU_DONE_TAIL]</text> 263 259 <!-- Text --> 264 260 <text 265 261 xml:space="preserve" 266 262 x="225" 267 263 y="2925" 268 - fill="#000000" 269 - font-family="Courier" 270 264 font-style="normal" 271 265 font-weight="bold" 272 266 font-size="324" 273 - text-anchor="start" 274 - id="text68">nxttail[RCU_WAIT_TAIL]</text> 267 + id="text68" 268 + style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;tails[RCU_WAIT_TAIL]</text> 275 269 <!-- Text --> 276 270 <text 277 271 xml:space="preserve" 278 272 x="225" 279 273 y="4050" 280 - fill="#000000" 281 - font-family="Courier" 282 274 font-style="normal" 283 275 font-weight="bold" 284 276 font-size="324" 285 - 
text-anchor="start" 286 - id="text70">nxttail[RCU_NEXT_READY_TAIL]</text> 277 + id="text70" 278 + style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;tails[RCU_NEXT_READY_TAIL]</text> 287 279 <!-- Text --> 288 280 <text 289 281 xml:space="preserve" 290 282 x="225" 291 283 y="5175" 292 - fill="#000000" 293 - font-family="Courier" 294 284 font-style="normal" 295 285 font-weight="bold" 296 286 font-size="324" 297 - text-anchor="start" 298 - id="text72">nxttail[RCU_NEXT_TAIL]</text> 287 + id="text72" 288 + style="font-size:324px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;font-family:Courier">-&gt;tails[RCU_NEXT_TAIL]</text> 299 289 <!-- Text --> 300 290 <text 301 291 xml:space="preserve"