===================================
NT synchronization primitive driver
===================================

This page documents the user-space API for the ntsync driver.

ntsync is a support driver for emulation of NT synchronization
primitives by user-space NT emulators. It exists because implementation
in user-space, using existing tools, cannot match Windows performance
while offering accurate semantics. It is implemented entirely in
software, and does not drive any hardware device.

This interface is meant as a compatibility tool only, and should not
be used for general synchronization. Instead use generic, versatile
interfaces such as futex(2) and poll(2).

Synchronization primitives
==========================

The ntsync driver exposes three types of synchronization primitives:
semaphores, mutexes, and events.

A semaphore holds a single volatile 32-bit counter, and a static 32-bit
integer denoting the maximum value. It is considered signaled (that is,
can be acquired without contention, or will wake up a waiting thread)
when the counter is nonzero. The counter is decremented by one when a
wait is satisfied. Both the initial and maximum count are established
when the semaphore is created.

A mutex holds a volatile 32-bit recursion count, and a volatile 32-bit
identifier denoting its owner. A mutex is considered signaled when its
owner is zero (indicating that it is not owned). The recursion count is
incremented when a wait is satisfied, and ownership is set to the given
identifier.

A mutex also holds an internal flag denoting whether its previous owner
has died; such a mutex is said to be abandoned. Owner death is not
tracked automatically based on thread death, but rather must be
communicated using ``NTSYNC_IOC_MUTEX_KILL``. An abandoned mutex is
inherently considered unowned.

Except for the "unowned" semantics of zero, the actual value of the
owner identifier is not interpreted by the ntsync driver at all. The
intended use is to store a thread identifier; however, the ntsync
driver does not actually validate that a calling thread provides
consistent or unique identifiers.

An event is similar to a semaphore with a maximum count of one. It holds
a volatile boolean state denoting whether it is signaled or not. There
are two types of events, auto-reset and manual-reset. An auto-reset
event is designaled when a wait is satisfied; a manual-reset event is
not. The event type is specified when the event is created.

Unless specified otherwise, all operations on an object are atomic and
totally ordered with respect to other operations on the same object.

Objects are represented by files. When all file descriptors to an
object are closed, that object is deleted.

Char device
===========

The ntsync driver creates a single char device /dev/ntsync. Each file
description opened on the device represents a unique instance intended
to back an individual NT virtual machine. Objects created by one ntsync
instance may only be used with other objects created by the same
instance.
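
As a minimal sketch of the instance model described above, the helper
below opens one instance of the device. The helper name
``ntsync_open_instance`` and the open flags (``O_RDONLY | O_CLOEXEC``)
are assumptions made for illustration; this document does not mandate
particular flags.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical helper: open one ntsync instance. Every successful call
 * creates a new file description and therefore a separate instance.
 * Passing NULL selects the default device node. Returns a file
 * descriptor, or -1 with errno set by open(2).
 */
static int ntsync_open_instance(const char *path)
{
    return open(path ? path : "/dev/ntsync", O_RDONLY | O_CLOEXEC);
}
```

An emulator would typically call such a helper once per emulated NT
environment and create all of that environment's objects through the
returned descriptor, since objects from different instances cannot be
mixed.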

ioctl reference
===============

All operations on the device are done through ioctls. There are four
structures used in ioctl calls::

   struct ntsync_sem_args {
       __u32 count;
       __u32 max;
   };

   struct ntsync_mutex_args {
       __u32 owner;
       __u32 count;
   };

   struct ntsync_event_args {
       __u32 signaled;
       __u32 manual;
   };

   struct ntsync_wait_args {
       __u64 timeout;
       __u64 objs;
       __u32 count;
       __u32 owner;
       __u32 index;
       __u32 alert;
       __u32 flags;
       __u32 pad;
   };

Depending on the ioctl, members of the structure may be used as input,
output, or not at all.
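
The layout of the largest structure can be checked at compile time. The
sketch below mirrors ``struct ntsync_wait_args`` with ``<stdint.h>``
types; the name ``wait_args_mirror`` is hypothetical, and real code
would include the kernel's UAPI header (assumed to be
``<linux/ntsync.h>``) rather than redeclare the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Local mirror of struct ntsync_wait_args as listed above, with the
 * kernel's __u32/__u64 replaced by fixed-width <stdint.h> types.
 */
struct wait_args_mirror {
    uint64_t timeout;
    uint64_t objs;
    uint32_t count;
    uint32_t owner;
    uint32_t index;
    uint32_t alert;
    uint32_t flags;
    uint32_t pad;
};

/* Two 64-bit members followed by six 32-bit members: 16 + 24 bytes. */
_Static_assert(sizeof(struct wait_args_mirror) == 40, "unexpected size");
_Static_assert(offsetof(struct wait_args_mirror, count) == 16,
               "unexpected offset of count");
_Static_assert(offsetof(struct wait_args_mirror, pad) == 36,
               "unexpected offset of pad");
```

Because every member has a fixed width and the total size is a multiple
of eight, the compiler has no reason to insert padding of its own, so
the layout should be the same for 32-bit and 64-bit callers.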

The ioctls on the device file are as follows:

.. c:macro:: NTSYNC_IOC_CREATE_SEM

  Create a semaphore object. Takes a pointer to struct
  :c:type:`ntsync_sem_args`, which is used as follows:

  .. list-table::

     * - ``count``
       - Initial count of the semaphore.
     * - ``max``
       - Maximum count of the semaphore.

  Fails with ``EINVAL`` if ``count`` is greater than ``max``.
  On success, returns a file descriptor for the created semaphore.

.. c:macro:: NTSYNC_IOC_CREATE_MUTEX

  Create a mutex object. Takes a pointer to struct
  :c:type:`ntsync_mutex_args`, which is used as follows:

  .. list-table::

     * - ``count``
       - Initial recursion count of the mutex.
     * - ``owner``
       - Initial owner of the mutex.

  If ``owner`` is nonzero and ``count`` is zero, or if ``owner`` is
  zero and ``count`` is nonzero, the function fails with ``EINVAL``.
  On success, returns a file descriptor for the created mutex.

.. c:macro:: NTSYNC_IOC_CREATE_EVENT

  Create an event object. Takes a pointer to struct
  :c:type:`ntsync_event_args`, which is used as follows:

  .. list-table::

     * - ``signaled``
       - If nonzero, the event is initially signaled, otherwise
         nonsignaled.
     * - ``manual``
       - If nonzero, the event is a manual-reset event, otherwise
         auto-reset.

  On success, returns a file descriptor for the created event.

The ioctls on the individual objects are as follows:

.. c:macro:: NTSYNC_IOC_SEM_POST

  Post to a semaphore object. Takes a pointer to a 32-bit integer,
  which on input holds the count to be added to the semaphore, and on
  output contains its previous count.

  If adding to the semaphore's current count would raise the latter
  past the semaphore's maximum count, the ioctl fails with
  ``EOVERFLOW`` and the semaphore is not affected. If raising the
  semaphore's count causes it to become signaled, eligible threads
  waiting on this semaphore will be woken and the semaphore's count
  decremented appropriately.
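
The overflow rule of ``NTSYNC_IOC_SEM_POST`` can be illustrated with a
small user-space model. This is a sketch of the documented behaviour
only, not the driver's code: ``struct sem_model`` and
``sem_model_post`` are hypothetical names, the model returns a negative
error code where the real ioctl would return -1 and set ``errno``, and
waking waiters is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a semaphore's state; count <= max always holds. */
struct sem_model {
    uint32_t count;
    uint32_t max;
};

/*
 * Add `add` to the count unless that would exceed max. On success,
 * store the previous count in *prev and return 0. On overflow, return
 * -EOVERFLOW and leave both the semaphore and *prev untouched.
 */
static int sem_model_post(struct sem_model *sem, uint32_t add, uint32_t *prev)
{
    /* Compare against the remaining headroom so the sum cannot wrap. */
    if (add > sem->max - sem->count)
        return -EOVERFLOW;

    *prev = sem->count;
    sem->count += add;
    return 0;
}
```

With a semaphore at count 1 and maximum 3, posting 2 succeeds and
reports a previous count of 1, while a further post of 1 is rejected
and leaves the count at 3.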

.. c:macro:: NTSYNC_IOC_MUTEX_UNLOCK

  Release a mutex object. Takes a pointer to struct
  :c:type:`ntsync_mutex_args`, which is used as follows:

  .. list-table::

     * - ``owner``
       - Specifies the owner trying to release this mutex.
     * - ``count``
       - On output, contains the previous recursion count.

  If ``owner`` is zero, the ioctl fails with ``EINVAL``. If ``owner``
  is not the current owner of the mutex, the ioctl fails with
  ``EPERM``.

  The mutex's count will be decremented by one. If decrementing the
  mutex's count causes it to become zero, the mutex is marked as
  unowned and signaled, and eligible threads waiting on it will be
  woken as appropriate.
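
The checks performed by ``NTSYNC_IOC_MUTEX_UNLOCK`` can be modelled in
the same way. Again this is only a sketch under stated assumptions:
``struct mutex_model`` and ``mutex_model_unlock`` are hypothetical
names, errors are returned as negative codes instead of through
``errno``, and waking waiters is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a mutex; owner is zero exactly when count is zero. */
struct mutex_model {
    uint32_t owner;  /* zero means unowned */
    uint32_t count;  /* recursion count */
};

/*
 * Release one level of recursion on behalf of `owner`. On success,
 * store the previous recursion count in *prev_count and return 0.
 */
static int mutex_model_unlock(struct mutex_model *m, uint32_t owner,
                              uint32_t *prev_count)
{
    if (owner == 0)
        return -EINVAL;
    if (m->owner != owner)
        return -EPERM;

    *prev_count = m->count;
    if (--m->count == 0)
        m->owner = 0;  /* now unowned, and therefore signaled */
    return 0;
}
```

A mutex acquired twice by owner 7 needs two unlocks from that owner
before it becomes unowned; a third unlock then fails with ``EPERM``
because owner 7 no longer holds it.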

.. c:macro:: NTSYNC_IOC_SET_EVENT

  Signal an event object. Takes a pointer to a 32-bit integer, which on
  output contains the previous state of the event.

  Eligible threads will be woken, and auto-reset events will be
  designaled appropriately.

.. c:macro:: NTSYNC_IOC_RESET_EVENT

  Designal an event object. Takes a pointer to a 32-bit integer, which
  on output contains the previous state of the event.

.. c:macro:: NTSYNC_IOC_PULSE_EVENT

  Wake threads waiting on an event object while leaving it in an
  unsignaled state. Takes a pointer to a 32-bit integer, which on
  output contains the previous state of the event.

  A pulse operation can be thought of as a set followed by a reset,
  performed as a single atomic operation. If two threads are waiting on
  an auto-reset event which is pulsed, only one will be woken. If two
  threads are waiting on a manual-reset event which is pulsed, both
  will be woken. However, in both cases, the event will be unsignaled
  afterwards, and a simultaneous read operation will always report the
  event as unsignaled.

.. c:macro:: NTSYNC_IOC_READ_SEM

  Read the current state of a semaphore object. Takes a pointer to
  struct :c:type:`ntsync_sem_args`, which is used as follows:

  .. list-table::

     * - ``count``
       - On output, contains the current count of the semaphore.
     * - ``max``
       - On output, contains the maximum count of the semaphore.

.. c:macro:: NTSYNC_IOC_READ_MUTEX

  Read the current state of a mutex object. Takes a pointer to struct
  :c:type:`ntsync_mutex_args`, which is used as follows:

  .. list-table::

     * - ``owner``
       - On output, contains the current owner of the mutex, or zero
         if the mutex is not currently owned.
     * - ``count``
       - On output, contains the current recursion count of the mutex.

  If the mutex is marked as abandoned, the function fails with
  ``EOWNERDEAD``. In this case, ``count`` and ``owner`` are set to
  zero.

.. c:macro:: NTSYNC_IOC_READ_EVENT

  Read the current state of an event object. Takes a pointer to struct
  :c:type:`ntsync_event_args`, which is used as follows:

  .. list-table::

     * - ``signaled``
       - On output, contains the current state of the event.
     * - ``manual``
       - On output, contains 1 if the event is a manual-reset event,
         and 0 otherwise.

.. c:macro:: NTSYNC_IOC_MUTEX_KILL

  Mark a mutex as unowned and abandoned if it is owned by the given
  owner. Takes an input-only pointer to a 32-bit integer denoting the
  owner. If the owner is zero, the ioctl fails with ``EINVAL``. If the
  owner does not own the mutex, the function fails with ``EPERM``.

  Eligible threads waiting on the mutex will be woken as appropriate
  (and such waits will fail with ``EOWNERDEAD``, as described below).

.. c:macro:: NTSYNC_IOC_WAIT_ANY

  Poll on any of a list of objects, atomically acquiring at most one.
  Takes a pointer to struct :c:type:`ntsync_wait_args`, which is
  used as follows:

  .. list-table::

     * - ``timeout``
       - Absolute timeout in nanoseconds. If ``NTSYNC_WAIT_REALTIME``
         is set, the timeout is measured against the REALTIME clock;
         otherwise it is measured against the MONOTONIC clock. If the
         timeout is equal to or earlier than the current time, the
         function returns immediately without sleeping. If ``timeout``
         is ``U64_MAX``, the function will sleep until an object is
         signaled, and will not fail with ``ETIMEDOUT``.
     * - ``objs``
       - Pointer to an array of ``count`` file descriptors
         (specified as an integer so that the structure has the same
         size regardless of architecture). If any object is
         invalid, the function fails with ``EINVAL``.
     * - ``count``
       - Number of objects specified in the ``objs`` array.
         If greater than ``NTSYNC_MAX_WAIT_COUNT``, the function fails
         with ``EINVAL``.
     * - ``owner``
       - Mutex owner identifier. If any object in ``objs`` is a mutex,
         the ioctl will attempt to acquire that mutex on behalf of
         ``owner``. If ``owner`` is zero, the ioctl fails with
         ``EINVAL``.
     * - ``index``
       - On success, contains the index (into ``objs``) of the object
         which was signaled. If ``alert`` was signaled instead,
         this contains ``count``.
     * - ``alert``
       - Optional event object file descriptor. If nonzero, this
         specifies an "alert" event object which, if signaled, will
         terminate the wait. If nonzero, the identifier must point to a
         valid event.
     * - ``flags``
       - Zero or more flags. Currently the only flag is
         ``NTSYNC_WAIT_REALTIME``, which causes the timeout to be
         measured against the REALTIME clock instead of MONOTONIC.
     * - ``pad``
       - Unused, must be set to zero.

  This function attempts to acquire one of the given objects. If unable
  to do so, it sleeps until an object becomes signaled, subsequently
  acquiring it, or the timeout expires. In the latter case the ioctl
  fails with ``ETIMEDOUT``. The function only acquires one object, even
  if multiple objects are signaled.

  A semaphore is considered to be signaled if its count is nonzero, and
  is acquired by decrementing its count by one. A mutex is considered
  to be signaled if it is unowned or if its owner matches the ``owner``
  argument, and is acquired by incrementing its recursion count by one
  and setting its owner to the ``owner`` argument. An auto-reset event
  is acquired by designaling it; a manual-reset event is not affected
  by acquisition.

  Acquisition is atomic and totally ordered with respect to other
  operations on the same object. If two wait operations (with different
  ``owner`` identifiers) are queued on the same mutex, only one is
  signaled. If two wait operations are queued on the same semaphore,
  and a value of one is posted to it, only one is signaled.

  If an abandoned mutex is acquired, the ioctl fails with
  ``EOWNERDEAD``. Although this is a failure return, the function may
  otherwise be considered successful. The mutex is marked as owned by
  the given owner (with a recursion count of 1) and as no longer
  abandoned, and ``index`` is still set to the index of the mutex.

  The ``alert`` argument is an "extra" event which can terminate the
  wait, independently of all other objects.

  It is valid to pass the same object more than once, including by
  passing the same event in the ``objs`` array and in ``alert``. If a
  wakeup occurs due to that object being signaled, ``index`` is set to
  the lowest index corresponding to that object.

  The function may fail with ``EINTR`` if a signal is received.
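
Two details of the ``NTSYNC_IOC_WAIT_ANY`` arguments are easy to get
wrong: ``timeout`` is an absolute time rather than a duration, and
``objs`` carries a pointer as a 64-bit integer. The sketch below shows
one way to prepare the input members. It uses a local mirror of the
structure (``wait_args_sketch``) and hypothetical helper names, and it
assumes the descriptor array has element type ``int``; real code would
use the kernel's UAPI definitions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical local mirror of struct ntsync_wait_args. */
struct wait_args_sketch {
    uint64_t timeout;
    uint64_t objs;
    uint32_t count;
    uint32_t owner;
    uint32_t index;
    uint32_t alert;
    uint32_t flags;
    uint32_t pad;
};

/* Absolute CLOCK_MONOTONIC deadline, `rel_ns` nanoseconds from now. */
static uint64_t monotonic_deadline_ns(uint64_t rel_ns)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (uint64_t)now.tv_sec * 1000000000ull +
           (uint64_t)now.tv_nsec + rel_ns;
}

/*
 * Fill the input members for a wait on `count` descriptors. flags stays
 * zero, so the timeout is measured against the MONOTONIC clock; alert
 * stays zero, meaning no alert event; pad stays zero as required.
 */
static void wait_args_fill(struct wait_args_sketch *args, const int *fds,
                           uint32_t count, uint32_t owner, uint64_t timeout)
{
    memset(args, 0, sizeof(*args));
    args->timeout = timeout;
    args->objs = (uint64_t)(uintptr_t)fds;  /* pointer passed as integer */
    args->count = count;
    args->owner = owner;
}
```

Passing ``UINT64_MAX`` as ``timeout`` instead of a computed deadline
corresponds to the ``U64_MAX`` case described above, in which the wait
never fails with ``ETIMEDOUT``.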

.. c:macro:: NTSYNC_IOC_WAIT_ALL

  Poll on a list of objects, atomically acquiring all of them. Takes a
  pointer to struct :c:type:`ntsync_wait_args`, which is used
  identically to ``NTSYNC_IOC_WAIT_ANY``, except that ``index`` is
  always filled with zero on success if not woken via alert.

  This function attempts to simultaneously acquire all of the given
  objects. If unable to do so, it sleeps until all objects become
  simultaneously signaled, subsequently acquiring them, or the timeout
  expires. In the latter case the ioctl fails with ``ETIMEDOUT`` and no
  objects are modified.

  Objects may become signaled and subsequently designaled (through
  acquisition by other threads) while this thread is sleeping. Only
  once all objects are simultaneously signaled does the ioctl acquire
  them and return. The entire acquisition is atomic and totally ordered
  with respect to other operations on any of the given objects.

  If an abandoned mutex is acquired, the ioctl fails with
  ``EOWNERDEAD``. Similarly to ``NTSYNC_IOC_WAIT_ANY``, all objects are
  nevertheless marked as acquired. Note that if multiple mutex objects
  are specified, there is no way to know which were marked as
  abandoned.

  As with "any" waits, the ``alert`` argument is an "extra" event which
  can terminate the wait. Critically, however, an "all" wait will
  succeed if all members in ``objs`` are signaled, *or* if ``alert`` is
  signaled. In the latter case ``index`` will be set to ``count``. As
  with "any" waits, if both conditions are met, the former takes
  priority, and objects in ``objs`` will be acquired.

  Unlike ``NTSYNC_IOC_WAIT_ANY``, it is not valid to pass the same
  object more than once, nor is it valid to pass the same object in
  ``objs`` and in ``alert``. If this is attempted, the function fails
  with ``EINVAL``.
385 with ``EINVAL``.