.\" $OpenBSD: kqueue.2,v 1.52 2025/05/10 09:44:39 visa Exp $
.\"
.\" Copyright (c) 2000 Jonathan Lemon
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD: src/lib/libc/sys/kqueue.2,v 1.18 2001/02/14 08:48:35 guido Exp $
.\"
.Dd $Mdocdate: May 10 2025 $
.Dt KQUEUE 2
.Os
.Sh NAME
.Nm kqueue ,
.Nm kqueue1 ,
.Nm kevent ,
.Nm EV_SET
.Nd kernel event notification mechanism
.Sh SYNOPSIS
.In sys/types.h
.In sys/event.h
.In sys/time.h
.Ft int
.Fn kqueue "void"
.Ft int
.Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout"
.Fn EV_SET "&kev" ident filter flags fflags data udata
.In sys/types.h
.In sys/event.h
.In sys/time.h
.In fcntl.h
.Ft int
.Fn kqueue1 "int flags"
.Sh DESCRIPTION
.Fn kqueue
provides a generic method of notifying the user when an event
happens or a condition holds, based on the results of small
pieces of kernel code termed
.Dq filters .
A kevent is identified by the (ident, filter) pair; there may only
be one unique kevent per kqueue.
.Pp
The filter is executed upon the initial registration of a kevent
in order to detect whether a preexisting condition is present, and is also
executed whenever an event is passed to the filter for evaluation.
If the filter determines that the condition should be reported,
then the kevent is placed on the kqueue for the user to retrieve.
.Pp
The filter is also run when the user attempts to retrieve the kevent
from the kqueue.
If the filter indicates that the condition that triggered
the event no longer holds, the kevent is removed from the kqueue and
is not returned.
.Pp
Multiple events which trigger the filter do not result in multiple
kevents being placed on the kqueue; instead, the filter will aggregate
the events into a single
.Vt struct kevent .
Calling
.Xr close 2
on a file descriptor will remove any kevents that reference the descriptor.
.Pp
.Fn kqueue
creates a new kernel event queue and returns a descriptor.
The queue is not inherited by a child created with
.Xr fork 2 .
Similarly, kqueues cannot be passed across UNIX-domain sockets.
.Pp
The
.Fn kqueue1
function is identical to
.Fn kqueue
except that the close-on-exec flag on the new file descriptor
is determined by the
.Dv O_CLOEXEC
flag
in the
.Fa flags
argument.
.Pp
.Fn kevent
is used to register events with the queue, and return any pending
events to the user.
.Fa changelist
is a pointer to an array of
.Vt kevent
structures, as defined in
.In sys/event.h .
All changes contained in the
.Fa changelist
are applied before any pending events are read from the queue.
.Fa nchanges
gives the size of
.Fa changelist .
.Fa eventlist
is a pointer to an array of
.Vt kevent
structures.
.Fa nevents
determines the size of
.Fa eventlist .
When
.Fa nevents
is zero,
.Fn kevent
will return immediately even if there is a
.Fa timeout
specified, unlike
.Xr select 2 .
If
.Fa timeout
is not
.Dv NULL ,
it specifies a maximum interval to wait
for an event, which will be interpreted as a
.Vt struct timespec .
If
.Fa timeout
is
.Dv NULL ,
.Fn kevent
waits indefinitely.
To effect a poll, the
.Fa timeout
argument should not be
.Dv NULL ,
pointing to a zero-valued
.Vt struct timespec .
The same array may be used for the
.Fa changelist
and
.Fa eventlist .
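.Pp
The timeout rules above can be illustrated with a small conversion helper.
This is only a sketch: ms_to_timespec() is a hypothetical name that is not
part of the kqueue API, and it assumes a non-negative millisecond count.
Passing 0 yields the zero-valued timespec that makes kevent() poll.

```c
#include <time.h>

/*
 * Hypothetical helper (not part of the kqueue API): convert a timeout
 * given in milliseconds into the struct timespec that kevent() expects.
 * Assumes ms >= 0.  ms_to_timespec(0) is the zero-valued timespec used
 * to poll without blocking.
 */
static struct timespec
ms_to_timespec(long ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	return ts;
}
```

A caller would pass the address of the result as the timeout argument,
or NULL to wait indefinitely.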
.Pp
.Fn EV_SET
is a macro which is provided for ease of initializing a
.Vt kevent
structure.
.Pp
The
.Vt kevent
structure is defined as:
.Bd -literal
struct kevent {
	uintptr_t  ident;	/* identifier for this event */
	short	   filter;	/* filter for event */
	u_short	   flags;	/* action flags for kqueue */
	u_int	   fflags;	/* filter flag value */
	int64_t	   data;	/* filter data value */
	void	  *udata;	/* opaque user data identifier */
};
.Ed
.Pp
The fields of
.Vt struct kevent
are:
.Bl -tag -width XXXfilter
.It Fa ident
Value used to identify this event.
The exact interpretation is determined by the attached filter,
but often is a file descriptor.
.It Fa filter
Identifies the kernel filter used to process this event.
The pre-defined system filters are described below.
.It Fa flags
Actions to perform on the event.
.It Fa fflags
Filter-specific flags.
.It Fa data
Filter-specific data value.
.It Fa udata
Opaque user-defined value passed through the kernel unchanged.
.El
.Pp
The
.Fa flags
field can contain the following values:
.Bl -tag -width XXXEV_ONESHOT
.It Dv EV_ADD
Adds the event to the kqueue.
Re-adding an existing event will modify the parameters of the original event,
and not result in a duplicate entry.
Adding an event automatically enables it, unless overridden by the
.Dv EV_DISABLE
flag.
.It Dv EV_ENABLE
Permit
.Fn kevent
to return the event if it is triggered.
.It Dv EV_DISABLE
Disable the event so
.Fn kevent
will not return it.
The filter itself is not disabled.
.It Dv EV_DISPATCH
Disable the event source immediately after delivery of an event.
See
.Dv EV_DISABLE
above.
.It Dv EV_DELETE
Removes the event from the kqueue.
Events which are attached to file descriptors are automatically deleted
on the last close of the descriptor.
.It Dv EV_RECEIPT
Causes
.Fn kevent
to return with
.Dv EV_ERROR
set without draining any pending events after updating events in the kqueue.
When a filter is successfully added, the
.Fa data
field will be zero.
This flag is useful for making bulk changes to a kqueue.
.It Dv EV_ONESHOT
Causes the event to return only the first occurrence of the filter
being triggered.
After the user retrieves the event from the kqueue, it is deleted.
.It Dv EV_CLEAR
After the event is retrieved by the user, its state is reset.
This is useful for filters which report state transitions
instead of the current state.
Note that some filters may automatically set this flag internally.
.It Dv EV_EOF
Filters may set this flag to indicate filter-specific EOF condition.
.It Dv EV_ERROR
See
.Sx RETURN VALUES
below.
.El
.Pp
The predefined system filters are listed below.
Arguments may be passed to and from the filter via the
.Fa fflags
and
.Fa data
fields in the
.Vt kevent
structure.
.Bl -tag -width EVFILT_SIGNAL
.It Dv EVFILT_READ
Takes a descriptor as the identifier, and returns whenever
there is data available to read.
The behavior of the filter is slightly different depending
on the descriptor type.
.Bl -tag -width 2n
.It Sockets
Sockets which have previously been passed to
.Xr listen 2
return when there is an incoming connection pending.
.Fa data
contains the size of the listen backlog.
.Pp
Other socket descriptors return when there is data to be read,
subject to the
.Dv SO_RCVLOWAT
value of the socket buffer.
This may be overridden with a per-filter low water mark at the
time the filter is added by setting the
.Dv NOTE_LOWAT
flag in
.Fa fflags ,
and specifying the new low water mark in
.Fa data .
On return,
.Fa data
contains the number of bytes in the socket buffer.
.Pp
If the read direction of the socket has been shut down, then the filter
also sets
.Dv EV_EOF
in
.Fa flags ,
and returns the socket error (if any) in
.Fa fflags .
It is possible for EOF to be returned (indicating the connection is gone)
while there is still data pending in the socket buffer.
.It Vnodes
Returns when the file pointer is not at the end of file.
.Fa data
contains the offset from current position to end of file,
and may be negative.
If
.Dv NOTE_EOF
is set in
.Fa fflags ,
.Fn kevent
will also return when the file pointer is at the end of file.
The end of file condition is indicated by the presence of
.Dv NOTE_EOF
in
.Fa fflags
on return.
.It "FIFOs, Pipes"
Returns when there is data to read;
.Fa data
contains the number of bytes available.
.Pp
When the last writer disconnects, the filter will set
.Dv EV_EOF
in
.Fa flags .
This may be cleared by passing in
.Dv EV_CLEAR ,
at which point the filter will resume waiting for data to become
available before returning.
.It "BPF devices"
Returns when the BPF buffer is full, the BPF timeout has expired, or
when the BPF has
.Dq immediate mode
enabled and there is any data to read;
.Fa data
contains the number of bytes available.
.El
.It Dv EVFILT_EXCEPT
Takes a descriptor as the identifier, and returns whenever one of the
specified exceptional conditions has occurred on the descriptor.
Conditions are specified in
.Fa fflags .
Currently, a filter can monitor the reception of out-of-band data
on a socket or pseudo terminal with
.Dv NOTE_OOB .
.It Dv EVFILT_WRITE
Takes a descriptor as the identifier, and returns whenever
it is possible to write to the descriptor.
For sockets, pipes, and FIFOs,
.Fa data
will contain the amount of space remaining in the write buffer.
The filter will set
.Dv EV_EOF
when the reader disconnects, and for the FIFO case,
this may be cleared by use of
.Dv EV_CLEAR .
Note that this filter is not supported for vnodes or BPF devices.
.Pp
For sockets, the low water mark and socket error handling are
identical to the
.Dv EVFILT_READ
case.
.\".It Dv EVFILT_AIO
.\"The sigevent portion of the AIO request is filled in, with
.\".Va sigev_notify_kqueue
.\"containing the descriptor of the kqueue that the event should
.\"be attached to,
.\".Va sigev_value
.\"containing the udata value, and
.\".Va sigev_notify
.\"set to
.\".Dv SIGEV_KEVENT .
.\"When the aio_* function is called, the event will be registered
.\"with the specified kqueue, and the
.\".Va ident
.\"argument set to the
.\".Li struct aiocb
.\"returned by the aio_* function.
.\"The filter returns under the same conditions as aio_error.
.\".Pp
.\"Alternatively, a kevent structure may be initialized, with
.\".Va ident
.\"containing the descriptor of the kqueue, and the
.\"address of the kevent structure placed in the
.\".Va aio_lio_opcode
.\"field of the AIO request.
.\"However, this approach will not work on architectures with 64-bit pointers,
.\"and should be considered deprecated.
.It Dv EVFILT_VNODE
Takes a file descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occurs on the descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_RENAME
.It Dv NOTE_DELETE
.Xr unlink 2
was called on the file referenced by the descriptor.
.It Dv NOTE_WRITE
A write occurred on the file referenced by the descriptor.
.It Dv NOTE_EXTEND
The file referenced by the descriptor was extended.
.It Dv NOTE_TRUNCATE
The file referenced by the descriptor was truncated.
.It Dv NOTE_ATTRIB
The file referenced by the descriptor had its attributes changed.
.It Dv NOTE_LINK
The link count on the file changed.
.It Dv NOTE_RENAME
The file referenced by the descriptor was renamed.
.It Dv NOTE_REVOKE
Access to the file was revoked via
.Xr revoke 2
or the underlying file system was unmounted.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_PROC
Takes the process ID to monitor as the identifier and the events to watch for
in
.Fa fflags ,
and returns when the process performs one or more of the requested events.
If a process can normally see another process, it can attach an event to it.
The events to monitor are:
.Bl -tag -width XXNOTE_TRACKERR
.It Dv NOTE_EXIT
The process has exited.
The exit status will be stored in
.Fa data
in the same format as the status set by
.Xr wait 2 .
.It Dv NOTE_FORK
The process has called
.Xr fork 2 .
.It Dv NOTE_EXEC
The process has executed a new process via
.Xr execve 2
or similar call.
.It Dv NOTE_TRACK
Follow a process across
.Xr fork 2
calls.
The parent process will return with
.Dv NOTE_FORK
set in the
.Fa fflags
field, while the child process will return with
.Dv NOTE_CHILD
set in
.Fa fflags
and the parent PID in
.Fa data .
.It Dv NOTE_TRACKERR
This flag is returned if the system was unable to attach an event to
the child process, usually due to resource limitations.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
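.Pp
Because the data field of a NOTE_EXIT event uses the wait(2) status
format, it can be decoded with the usual status macros.
The helper below is a sketch; describe_exit() is a hypothetical name and
the message wording is arbitrary.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a wait(2)-style status, such as the data field of a NOTE_EXIT
 * event, into buf.  describe_exit() is an illustrative name only.
 */
static void
describe_exit(int64_t data, char *buf, size_t len)
{
	int status = (int)data;

	if (WIFEXITED(status))
		snprintf(buf, len, "exited with status %d",
		    WEXITSTATUS(status));
	else if (WIFSIGNALED(status))
		snprintf(buf, len, "killed by signal %d",
		    WTERMSIG(status));
	else
		snprintf(buf, len, "unexpected status 0x%x",
		    (unsigned int)status);
}
```

A caller would pass the data field of the returned kevent together with
a buffer to receive the message.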
.It Dv EVFILT_SIGNAL
Takes the signal number to monitor as the identifier and returns
when the given signal is delivered to the process.
This coexists with the
.Xr signal 3
and
.Xr sigaction 2
facilities, and has a lower precedence.
The filter will record all attempts to deliver a signal to a process,
even if the signal has been marked as
.Dv SIG_IGN .
Event notification happens after normal signal delivery processing.
.Fa data
returns the number of times the signal has occurred since the last call to
.Fn kevent .
This filter automatically sets the
.Dv EV_CLEAR
flag internally.
.It Dv EVFILT_TIMER
Establishes an arbitrary timer identified by
.Fa ident .
When adding a timer,
.Fa data
specifies the timeout period in units described below or, if
.Dv NOTE_ABSTIME
is set in
.Va fflags ,
the absolute time at which the timer should fire.
The timer will repeat unless
.Dv EV_ONESHOT
is set in
.Va flags
or
.Dv NOTE_ABSTIME
is set in
.Va fflags .
On return,
.Fa data
contains the number of times the timeout has expired since the last call to
.Fn kevent .
This filter automatically sets
.Dv EV_CLEAR
in
.Va flags
for periodic timers.
Timers created with
.Dv NOTE_ABSTIME
remain activated on the kqueue once the absolute time has passed unless
.Dv EV_CLEAR
or
.Dv EV_ONESHOT
are also specified.
.Pp
The filter accepts the following flags in the
.Va fflags
argument:
.Bl -tag -width NOTE_MSECONDS
.It Dv NOTE_SECONDS
The timer value in
.Va data
is expressed in seconds.
.It Dv NOTE_MSECONDS
The timer value in
.Va data
is expressed in milliseconds.
.It Dv NOTE_USECONDS
The timer value in
.Va data
is expressed in microseconds.
.It Dv NOTE_NSECONDS
The timer value in
.Va data
is expressed in nanoseconds.
.It Dv NOTE_ABSTIME
The timer value is an absolute time with
.Dv CLOCK_REALTIME
as the reference clock.
.El
.Pp
Note that
.Dv NOTE_SECONDS ,
.Dv NOTE_MSECONDS ,
.Dv NOTE_USECONDS ,
and
.Dv NOTE_NSECONDS
are mutually exclusive; behavior is undefined if more than one is specified.
If a timer value unit is not specified, the default is
.Dv NOTE_MSECONDS .
.Pp
If an existing timer is re-added, the existing timer and related pending events
will be cancelled.
The timer will be re-started using the timeout period
.Fa data .
.It Dv EVFILT_DEVICE
Takes a descriptor as the identifier and the events to watch for in
.Fa fflags ,
and returns when one or more of the requested events occur on the
descriptor.
The events to monitor are:
.Bl -tag -width XXNOTE_CHANGE
.It Dv NOTE_CHANGE
A device change event has occurred,
e.g. an HDMI cable has been plugged in to a port.
.El
.Pp
On return,
.Fa fflags
contains the events which triggered the filter.
.It Dv EVFILT_USER
Establishes a user event identified by
.Va ident
which is not associated with any kernel mechanism but is triggered by
user level code.
The lower 24 bits of the
.Va fflags
may be used for user defined flags and manipulated using the following:
.Bl -tag -width XXNOTE_FFLAGSMASK
.It Dv NOTE_FFNOP
Ignore the input
.Va fflags .
.It Dv NOTE_FFAND
Bitwise AND
.Va fflags .
.It Dv NOTE_FFOR
Bitwise OR
.Va fflags .
.It Dv NOTE_FFCOPY
Copy
.Va fflags .
.It Dv NOTE_FFCTRLMASK
Control mask for
.Va fflags .
.It Dv NOTE_FFLAGSMASK
User defined flag mask for
.Va fflags .
.El
.Pp
A user event is triggered for output with the following:
.Bl -tag -width XXNOTE_FFLAGSMASK
.It Dv NOTE_TRIGGER
Cause the event to be triggered.
.El
.Pp
On return,
.Fa fflags
contains the user defined flags in the lower 24 bits.
.El
.Sh RETURN VALUES
.Fn kqueue
and
.Fn kqueue1
create a new kernel event queue and return a file descriptor.
If there was an error creating the kernel event queue, a value of -1 is
returned and
.Va errno
is set to indicate the error.
.Pp
.Fn kevent
returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Fa flags
and the system error in
.Fa data .
Otherwise, -1 will be returned, and
.Va errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh ERRORS
The
.Fn kqueue
and
.Fn kqueue1
functions fail if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El
.Pp
In addition,
.Fn kqueue1
fails if:
.Bl -tag -width Er
.It Bq Er EINVAL
.Fa flags
is invalid.
.El
.Pp
The
.Fn kevent
function fails if:
.Bl -tag -width Er
.It Bq Er EACCES
The process does not have permission to register a filter.
.It Bq Er EFAULT
There was an error reading or writing the
.Vt kevent
structure.
.It Bq Er EBADF
The specified descriptor is invalid.
.It Bq Er EINTR
A signal was delivered before the timeout expired and before any
events were placed on the kqueue for return.
.It Bq Er EINVAL
The specified time limit or filter is invalid.
.It Bq Er ENOENT
The event could not be found to be modified or deleted.
.It Bq Er ENOMEM
No memory was available to register the event.
.It Bq Er ESRCH
The specified process to attach to does not exist.
.El
.Sh SEE ALSO
.Xr clock_gettime 2 ,
.Xr poll 2 ,
.Xr read 2 ,
.Xr select 2 ,
.Xr sigaction 2 ,
.Xr wait 2 ,
.Xr write 2 ,
.Xr signal 3
.Sh HISTORY
The
.Fn kqueue
and
.Fn kevent
functions first appeared in
.Fx 4.1
and have been available since
.Ox 2.9 .
.Sh AUTHORS
The
.Fn kqueue
system and this manual page were written by
.An Jonathan Lemon Aq Mt jlemon@FreeBSD.org .