Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull notification queue from David Howells:
"This adds a general notification queue concept and an event
source for keys/keyrings, such as linking and unlinking keys and
changing their attributes.

Thanks to Debarshi Ray, we do have a pull request to use this to fix a
problem with gnome-online-accounts - as mentioned last time:

https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47

Without this, g-o-a has to constantly poll a keyring-based kerberos
cache to find out if kinit has changed anything.

[ There are other notifications pending: mount/sb fsinfo notifications
for libmount that Karel Zak and Ian Kent have been working on, and
Christian Brauner would like to use them in lxc, but let's see how
this one works first ]

LSM hooks are included:

- A set of hooks are provided that allow an LSM to rule on whether or
not a watch may be set. Each of these hooks takes a different
"watched object" parameter, so they're not really shareable. The
LSM should use current's credentials. [Wanted by SELinux & Smack]

- A hook is provided to allow an LSM to rule on whether or not a
particular message may be posted to a particular queue. This is
given the credentials from the event generator (which may be the
system) and the watch setter. [Wanted by Smack]

I've provided SELinux and Smack with implementations of some of these
hooks.

WHY
===

Key/keyring notifications are desirable because if you have your
kerberos tickets in a file/directory, your Gnome desktop will monitor
that using something like fanotify and tell you if your credentials
cache changes.

However, we also have the ability to cache your kerberos tickets in
the session, user or persistent keyring so that it isn't left around
on disk across a reboot or logout. Keyrings, however, cannot currently
be monitored asynchronously, so the desktop has to poll for it - not
so good on a laptop. This facility will allow the desktop to avoid the
need to poll.

DESIGN DECISIONS
================

- The notification queue is built on top of a standard pipe. Messages
are effectively spliced in. The pipe is opened with a special flag:

pipe2(fds, O_NOTIFICATION_PIPE);

The special flag has the same value as O_EXCL (which doesn't seem
like it will ever be applicable in this context)[?]. It is given up
front to make it a lot easier to prohibit splice&co from accessing
the pipe.

[?] Should this be done some other way? I'd rather not use up a new
O_* flag if I can avoid it - should I add a pipe3() system call
instead?

The pipe is then configured::

ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);

Messages are then read out of the pipe using read().

- It should be possible to allow write() to insert data into the
notification pipes too, but this is currently disabled as the
kernel has to be able to insert messages into the pipe *without*
holding pipe->mutex and the code to make this work needs careful
auditing.

- sendfile(), splice() and vmsplice() are disabled on notification
pipes because of the pipe->mutex issue and also because they
sometimes want to revert what they just did - but one or more
notification messages might've been interleaved in the ring.

- The kernel inserts messages with the wait queue spinlock held. This
means that pipe_read() and pipe_write() have to take the spinlock
to update the queue pointers.

- Records in the buffer are binary, typed and have a length so that
they can be of varying size.

This allows multiple heterogeneous sources to share a common
buffer; there are 16 million types available, of which I've used
just a few, so there is scope for others to be used. Tags may be
specified when a watchpoint is created to help distinguish the
sources.

- Records are filterable as types have up to 256 subtypes that can be
individually filtered. Other filtration is also available.

- Notification pipes don't interfere with each other; each may be
bound to a different set of watches. Any particular notification
will be copied to all the queues that are currently watching for it
- and only those that are watching for it.

- When recording a notification, the kernel will not sleep, but will
rather mark a queue as having lost a message if there's
insufficient space. read() will fabricate a loss notification
message at an appropriate point later.

- The notification pipe is created and then watchpoints are attached
to it, using one of:

keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);

where in each case, fd indicates the queue and the number after is
a tag between 0 and 255.

- Watches are removed if either the notification pipe is destroyed or
the watched object is destroyed. In the latter case, a message will
be generated indicating the enforced watch removal.

Things I want to avoid:

- Introducing features that make the core VFS dependent on the
network stack or networking namespaces (ie. usage of netlink).

- Dumping all this stuff into dmesg and having a daemon that sits
there parsing the output and distributing it as this then puts the
responsibility for security into userspace and makes handling
namespaces tricky. Further, dmesg might not exist or might be
inaccessible inside a container.

- Letting users see events they shouldn't be able to see.

TESTING AND MANPAGES
====================

- The keyutils tree has a pipe-watch branch that has keyctl commands
for making use of notifications. Proposed manual pages can also be
found on this branch, though a couple of them really need to go to
the main manpages repository instead.

If the kernel supports the watching of keys, then running "make
test" on that branch will cause the testing infrastructure to spawn
a side process that monitors a notification pipe for all the
key/keyring changes induced by the tests; these are then all
checked off to make sure they happened.

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch

- A test program is provided (samples/watch_queue/watch_test) that
can be used to monitor for keyring, mount and superblock events.
Information on the notifications is simply logged to stdout"

* tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
smack: Implement the watch_key and post_notification hooks
selinux: Implement the watch_key security hook
keys: Make the KEY_NEED_* perms an enum rather than a mask
pipe: Add notification lossage handling
pipe: Allow buffers to be marked read-whole-or-error for notifications
Add sample notification program
watch_queue: Add a key/keyring notification facility
security: Add hooks to rule on setting a watch
pipe: Add general notification queue support
pipe: Add O_NOTIFICATION_PIPE
security: Add a hook for the point of notification insertion
uapi: General notification queue definitions

+2181 -178
+57
Documentation/security/keys/core.rst
···
written into the output buffer.  Verification returns 0 on success.


 *  Watch a key or keyring for changes::

	long keyctl(KEYCTL_WATCH_KEY, key_serial_t key, int queue_fd,
		    const struct watch_notification_filter *filter);

    This will set or remove a watch for changes on the specified key or
    keyring.

    "key" is the ID of the key to be watched.

    "queue_fd" is a file descriptor referring to an open pipe in
    notification mode which manages the buffer into which notifications
    will be delivered.

    "filter" is either NULL to remove a watch or a filter specification to
    indicate what events are required from the key.

    See Documentation/watch_queue.rst for more information.

    Note that only one watch may be emplaced for any particular { key,
    queue_fd } combination.

    Notification records look like::

	struct key_notification {
		struct watch_notification watch;
		__u32 key_id;
		__u32 aux;
	};

    In this, watch::type will be "WATCH_TYPE_KEY_NOTIFY" and subtype will be
    one of::

	NOTIFY_KEY_INSTANTIATED
	NOTIFY_KEY_UPDATED
	NOTIFY_KEY_LINKED
	NOTIFY_KEY_UNLINKED
	NOTIFY_KEY_CLEARED
	NOTIFY_KEY_REVOKED
	NOTIFY_KEY_INVALIDATED
	NOTIFY_KEY_SETATTR

    Where these indicate a key being instantiated/rejected, updated, a link
    being made in a keyring, a link being removed from a keyring, a keyring
    being cleared, a key being revoked, a key being invalidated or a key
    having one of its attributes changed (user, group, perm, timeout,
    restriction).

    If a watched key is deleted, a basic watch_notification will be issued
    with "type" set to WATCH_TYPE_META and "subtype" set to
    WATCH_META_REMOVAL_NOTIFICATION.  The watchpoint ID will be set in the
    "info" field.

    This needs to be configured by enabling:

	"Provide key/keyring change notifications" (KEY_NOTIFICATIONS)


Kernel Services
===============
+1
Documentation/userspace-api/ioctl/ioctl-number.rst
··· 202 202 'W' 00-1F linux/wanrouter.h conflict! (pre 3.9) 203 203 'W' 00-3F sound/asound.h conflict! 204 204 'W' 40-5F drivers/pci/switch/switchtec.c 205 + 'W' 60-61 linux/watch_queue.h 205 206 'X' all fs/xfs/xfs_fs.h, conflict! 206 207 fs/xfs/linux-2.6/xfs_ioctl32.h, 207 208 include/linux/falloc.h,
+339
Documentation/watch_queue.rst
···
==============================
General notification mechanism
==============================

The general notification mechanism is built on top of the standard pipe driver
whereby it effectively splices notification messages from the kernel into pipes
opened by userspace.  This can be used in conjunction with::

  * Key/keyring notifications


The notification buffers can be enabled by:

	"General setup"/"General notification queue"
	(CONFIG_WATCH_QUEUE)

This document has the following sections:

.. contents:: :local:


Overview
========

This facility appears as a pipe that is opened in a special mode.  The pipe's
internal ring buffer is used to hold messages that are generated by the
kernel.  These messages are then read out by read().  Splice and similar are
disabled on such pipes due to them wanting to, under some circumstances,
revert their additions to the ring - which might end up interleaved with
notification messages.

The owner of the pipe has to tell the kernel which sources it would like to
watch through that pipe.  Only sources that have been connected to a pipe will
insert messages into it.  Note that a source may be bound to multiple pipes
and insert messages into all of them simultaneously.

Filters may also be emplaced on a pipe so that certain source types and
subevents can be ignored if they're not of interest.

A message will be discarded if there isn't a slot available in the ring or if
no preallocated message buffer is available.  In both of these cases, read()
will insert a WATCH_META_LOSS_NOTIFICATION message into the output buffer
after the last message currently in the buffer has been read.

Note that when producing a notification, the kernel does not wait for the
consumers to collect it, but rather just continues on.  This means that
notifications can be generated whilst spinlocks are held and also protects
the kernel from being held up indefinitely by a userspace malfunction.


Message Structure
=================

Notification messages begin with a short header::

	struct watch_notification {
		__u32 type:24;
		__u32 subtype:8;
		__u32 info;
	};

"type" indicates the source of the notification record and "subtype" indicates
the type of record from that source (see the Watch Sources section below).
The type may also be "WATCH_TYPE_META".  This is a special record type
generated internally by the watch queue itself.  There are two subtypes:

  * WATCH_META_REMOVAL_NOTIFICATION
  * WATCH_META_LOSS_NOTIFICATION

The first indicates that an object on which a watch was installed was removed
or destroyed and the second indicates that some messages have been lost.

"info" indicates a bunch of things, including:

  * The length of the message in bytes, including the header (mask with
    WATCH_INFO_LENGTH and shift by WATCH_INFO_LENGTH__SHIFT).  This indicates
    the size of the record, which may be between 8 and 127 bytes.

  * The watch ID (mask with WATCH_INFO_ID and shift by WATCH_INFO_ID__SHIFT).
    This indicates the caller's ID of the watch, which may be between 0 and
    255.  Multiple watches may share a queue, and this provides a means to
    distinguish them.

  * A type-specific field (WATCH_INFO_TYPE_INFO).  This is set by the
    notification producer to indicate some meaning specific to the type and
    subtype.

Everything in info apart from the length can be used for filtering.

The header can be followed by supplementary information.  The format of this
is defined by the type and subtype.


Watch List (Notification Source) API
====================================

A "watch list" is a list of watchers that are subscribed to a source of
notifications.  A list may be attached to an object (say a key or a
superblock) or may be global (say for device events).  From a userspace
perspective, a non-global watch list is typically referred to by reference to
the object it belongs to (such as using KEYCTL_WATCH_KEY and giving it a key
serial number to watch that specific key).

To manage a watch list, the following functions are provided:

  * ``void init_watch_list(struct watch_list *wlist,
			   void (*release_watch)(struct watch *wlist));``

    Initialise a watch list.  If ``release_watch`` is not NULL, then this
    indicates a function that should be called when the watch_list object is
    destroyed to discard any references the watch list holds on the watched
    object.

  * ``void remove_watch_list(struct watch_list *wlist);``

    This removes all of the watches subscribed to a watch_list and frees them
    and then destroys the watch_list object itself.


Watch Queue (Notification Output) API
=====================================

A "watch queue" is the buffer allocated by an application that notification
records will be written into.  The workings of this are hidden entirely
inside of the pipe device driver, but it is necessary to gain a reference to
it to set a watch.  These can be managed with:

  * ``struct watch_queue *get_watch_queue(int fd);``

    Since watch queues are indicated to the kernel by the fd of the pipe that
    implements the buffer, userspace must hand that fd through a system call.
    This can be used to look up an opaque pointer to the watch queue from the
    system call.

  * ``void put_watch_queue(struct watch_queue *wqueue);``

    This discards the reference obtained from ``get_watch_queue()``.


Watch Subscription API
======================

A "watch" is a subscription on a watch list, indicating the watch queue, and
thus the buffer, into which notification records should be written.  The
watch queue object may also carry filtering rules for that object, as set by
userspace.  Some parts of the watch struct can be set by the driver::

	struct watch {
		union {
			u32 info_id;	/* ID to be OR'd in to info field */
			...
		};
		void *private;	/* Private data for the watched object */
		u64 id;		/* Internal identifier */
		...
	};

The ``info_id`` value should be an 8-bit number obtained from userspace and
shifted by WATCH_INFO_ID__SHIFT.  This is OR'd into the WATCH_INFO_ID field
of struct watch_notification::info when and if the notification is written
into the associated watch queue buffer.

The ``private`` field is the driver's data associated with the watch_list and
is cleaned up by the ``watch_list::release_watch()`` method.

The ``id`` field is the source's ID.  Notifications that are posted with a
different ID are ignored.

The following functions are provided to manage watches:

  * ``void init_watch(struct watch *watch, struct watch_queue *wqueue);``

    Initialise a watch object, setting its pointer to the watch queue, using
    appropriate barriering to avoid lockdep complaints.

  * ``int add_watch_to_object(struct watch *watch, struct watch_list *wlist);``

    Subscribe a watch to a watch list (notification source).  The
    driver-settable fields in the watch struct must have been set before this
    is called.

  * ``int remove_watch_from_object(struct watch_list *wlist,
				   struct watch_queue *wqueue,
				   u64 id, false);``

    Remove a watch from a watch list, where the watch must match the
    specified watch queue (``wqueue``) and object identifier (``id``).  A
    notification (``WATCH_META_REMOVAL_NOTIFICATION``) is sent to the watch
    queue to indicate that the watch got removed.

  * ``int remove_watch_from_object(struct watch_list *wlist, NULL, 0, true);``

    Remove all the watches from a watch list.  It is expected that this will
    be called preparatory to destruction and that the watch list will be
    inaccessible to new watches by this point.  A notification
    (``WATCH_META_REMOVAL_NOTIFICATION``) is sent to the watch queue of each
    subscribed watch to indicate that the watch got removed.


Notification Posting API
========================

To post a notification to a watch list so that the subscribed watches can see
it, the following function should be used::

	void post_watch_notification(struct watch_list *wlist,
				     struct watch_notification *n,
				     const struct cred *cred,
				     u64 id);

The notification should be preformatted and a pointer to the header (``n``)
should be passed in.  The notification may be larger than this and the size
in units of buffer slots is noted in ``n->info & WATCH_INFO_LENGTH``.

The ``cred`` struct indicates the credentials of the source (subject) and is
passed to the LSMs, such as SELinux, to allow or suppress the recording of
the note in each individual queue according to the credentials of that queue
(object).

The ``id`` is the ID of the source object (such as the serial number on a
key).  Only watches that have the same ID set in them will see this
notification.


Watch Sources
=============

Any particular buffer can be fed from multiple sources.  Sources include:

  * WATCH_TYPE_KEY_NOTIFY

    Notifications of this type indicate changes to keys and keyrings,
    including the changes of keyring contents or the attributes of keys.

    See Documentation/security/keys/core.rst for more information.


Event Filtering
===============

Once a watch queue has been created, a set of filters can be applied to limit
the events that are received using::

	struct watch_notification_filter filter = {
		...
	};
	ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, &filter)

The filter description is a variable of type::

	struct watch_notification_filter {
		__u32 nr_filters;
		__u32 __reserved;
		struct watch_notification_type_filter filters[];
	};

Where "nr_filters" is the number of filters in filters[] and "__reserved"
should be 0.  The "filters" array has elements of the following type::

	struct watch_notification_type_filter {
		__u32 type;
		__u32 info_filter;
		__u32 info_mask;
		__u32 subtype_filter[8];
	};

Where:

  * ``type`` is the event type to filter for and should be something like
    "WATCH_TYPE_KEY_NOTIFY"

  * ``info_filter`` and ``info_mask`` act as a filter on the info field of
    the notification record.  The notification is only written into the
    buffer if::

	(watch.info & info_mask) == info_filter

    This could be used, for example, to ignore events that are not exactly on
    the watched point in a mount tree.

  * ``subtype_filter`` is a bitmask indicating the subtypes that are of
    interest.  Bit 0 of subtype_filter[0] corresponds to subtype 0, bit 1 to
    subtype 1, and so on.

If the argument to the ioctl() is NULL, then the filters will be removed and
all events from the watched sources will come through.


Userspace Code Example
======================

A buffer is created with something like the following::

	pipe2(fds, O_NOTIFICATION_PIPE);
	ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, 256);

It can then be set to receive keyring change notifications::

	keyctl(KEYCTL_WATCH_KEY, KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);

The notifications can then be consumed by something like the following::

	static void consumer(int rfd)
	{
		unsigned char buffer[128];
		ssize_t buf_len;

		while (buf_len = read(rfd, buffer, sizeof(buffer)),
		       buf_len > 0
		       ) {
			void *p = buffer;
			void *end = buffer + buf_len;
			while (p < end) {
				union {
					struct watch_notification n;
					unsigned char buf1[128];
				} n;
				size_t largest, len;

				largest = end - p;
				if (largest > 128)
					largest = 128;
				memcpy(&n, p, largest);

				len = (n.n.info & WATCH_INFO_LENGTH) >>
					WATCH_INFO_LENGTH__SHIFT;
				if (len == 0 || len > largest)
					return;

				switch (n.n.type) {
				case WATCH_TYPE_META:
					got_meta(&n.n);
					break;
				case WATCH_TYPE_KEY_NOTIFY:
					saw_key_change(&n.n);
					break;
				}

				p += len;
			}
		}
	}
+172 -70
fs/pipe.c
··· 24 24 #include <linux/syscalls.h> 25 25 #include <linux/fcntl.h> 26 26 #include <linux/memcontrol.h> 27 + #include <linux/watch_queue.h> 27 28 28 29 #include <linux/uaccess.h> 29 30 #include <asm/ioctls.h> ··· 260 259 unsigned int tail = pipe->tail; 261 260 unsigned int mask = pipe->ring_size - 1; 262 261 262 + #ifdef CONFIG_WATCH_QUEUE 263 + if (pipe->note_loss) { 264 + struct watch_notification n; 265 + 266 + if (total_len < 8) { 267 + if (ret == 0) 268 + ret = -ENOBUFS; 269 + break; 270 + } 271 + 272 + n.type = WATCH_TYPE_META; 273 + n.subtype = WATCH_META_LOSS_NOTIFICATION; 274 + n.info = watch_sizeof(n); 275 + if (copy_to_iter(&n, sizeof(n), to) != sizeof(n)) { 276 + if (ret == 0) 277 + ret = -EFAULT; 278 + break; 279 + } 280 + ret += sizeof(n); 281 + total_len -= sizeof(n); 282 + pipe->note_loss = false; 283 + } 284 + #endif 285 + 263 286 if (!pipe_empty(head, tail)) { 264 287 struct pipe_buffer *buf = &pipe->bufs[tail & mask]; 265 288 size_t chars = buf->len; 266 289 size_t written; 267 290 int error; 268 291 269 - if (chars > total_len) 292 + if (chars > total_len) { 293 + if (buf->flags & PIPE_BUF_FLAG_WHOLE) { 294 + if (ret == 0) 295 + ret = -ENOBUFS; 296 + break; 297 + } 270 298 chars = total_len; 299 + } 271 300 272 301 error = pipe_buf_confirm(pipe, buf); 273 302 if (error) { ··· 325 294 if (!buf->len) { 326 295 pipe_buf_release(pipe, buf); 327 296 spin_lock_irq(&pipe->rd_wait.lock); 297 + #ifdef CONFIG_WATCH_QUEUE 298 + if (buf->flags & PIPE_BUF_FLAG_LOSS) 299 + pipe->note_loss = true; 300 + #endif 328 301 tail++; 329 302 pipe->tail = tail; 330 303 spin_unlock_irq(&pipe->rd_wait.lock); ··· 439 404 ret = -EPIPE; 440 405 goto out; 441 406 } 407 + 408 + #ifdef CONFIG_WATCH_QUEUE 409 + if (pipe->watch_queue) { 410 + ret = -EXDEV; 411 + goto out; 412 + } 413 + #endif 442 414 443 415 /* 444 416 * Only wake up if the pipe started out empty, since ··· 616 574 int count, head, tail, mask; 617 575 618 576 switch (cmd) { 619 - case FIONREAD: 620 - 
__pipe_lock(pipe); 621 - count = 0; 622 - head = pipe->head; 623 - tail = pipe->tail; 624 - mask = pipe->ring_size - 1; 577 + case FIONREAD: 578 + __pipe_lock(pipe); 579 + count = 0; 580 + head = pipe->head; 581 + tail = pipe->tail; 582 + mask = pipe->ring_size - 1; 625 583 626 - while (tail != head) { 627 - count += pipe->bufs[tail & mask].len; 628 - tail++; 629 - } 630 - __pipe_unlock(pipe); 584 + while (tail != head) { 585 + count += pipe->bufs[tail & mask].len; 586 + tail++; 587 + } 588 + __pipe_unlock(pipe); 631 589 632 - return put_user(count, (int __user *)arg); 633 - default: 634 - return -ENOIOCTLCMD; 590 + return put_user(count, (int __user *)arg); 591 + 592 + #ifdef CONFIG_WATCH_QUEUE 593 + case IOC_WATCH_QUEUE_SET_SIZE: { 594 + int ret; 595 + __pipe_lock(pipe); 596 + ret = watch_queue_set_size(pipe, arg); 597 + __pipe_unlock(pipe); 598 + return ret; 599 + } 600 + 601 + case IOC_WATCH_QUEUE_SET_FILTER: 602 + return watch_queue_set_filter( 603 + pipe, (struct watch_notification_filter __user *)arg); 604 + #endif 605 + 606 + default: 607 + return -ENOIOCTLCMD; 635 608 } 636 609 } 637 610 ··· 757 700 return retval; 758 701 } 759 702 760 - static unsigned long account_pipe_buffers(struct user_struct *user, 761 - unsigned long old, unsigned long new) 703 + unsigned long account_pipe_buffers(struct user_struct *user, 704 + unsigned long old, unsigned long new) 762 705 { 763 706 return atomic_long_add_return(new - old, &user->pipe_bufs); 764 707 } 765 708 766 - static bool too_many_pipe_buffers_soft(unsigned long user_bufs) 709 + bool too_many_pipe_buffers_soft(unsigned long user_bufs) 767 710 { 768 711 unsigned long soft_limit = READ_ONCE(pipe_user_pages_soft); 769 712 770 713 return soft_limit && user_bufs > soft_limit; 771 714 } 772 715 773 - static bool too_many_pipe_buffers_hard(unsigned long user_bufs) 716 + bool too_many_pipe_buffers_hard(unsigned long user_bufs) 774 717 { 775 718 unsigned long hard_limit = READ_ONCE(pipe_user_pages_hard); 776 719 777 
720 return hard_limit && user_bufs > hard_limit; 778 721 } 779 722 780 - static bool is_unprivileged_user(void) 723 + bool pipe_is_unprivileged_user(void) 781 724 { 782 725 return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN); 783 726 } ··· 799 742 800 743 user_bufs = account_pipe_buffers(user, 0, pipe_bufs); 801 744 802 - if (too_many_pipe_buffers_soft(user_bufs) && is_unprivileged_user()) { 745 + if (too_many_pipe_buffers_soft(user_bufs) && pipe_is_unprivileged_user()) { 803 746 user_bufs = account_pipe_buffers(user, pipe_bufs, 1); 804 747 pipe_bufs = 1; 805 748 } 806 749 807 - if (too_many_pipe_buffers_hard(user_bufs) && is_unprivileged_user()) 750 + if (too_many_pipe_buffers_hard(user_bufs) && pipe_is_unprivileged_user()) 808 751 goto out_revert_acct; 809 752 810 753 pipe->bufs = kcalloc(pipe_bufs, sizeof(struct pipe_buffer), ··· 816 759 pipe->r_counter = pipe->w_counter = 1; 817 760 pipe->max_usage = pipe_bufs; 818 761 pipe->ring_size = pipe_bufs; 762 + pipe->nr_accounted = pipe_bufs; 819 763 pipe->user = user; 820 764 mutex_init(&pipe->mutex); 821 765 return pipe; ··· 834 776 { 835 777 int i; 836 778 837 - (void) account_pipe_buffers(pipe->user, pipe->ring_size, 0); 779 + #ifdef CONFIG_WATCH_QUEUE 780 + if (pipe->watch_queue) { 781 + watch_queue_clear(pipe->watch_queue); 782 + put_watch_queue(pipe->watch_queue); 783 + } 784 + #endif 785 + 786 + (void) account_pipe_buffers(pipe->user, pipe->nr_accounted, 0); 838 787 free_uid(pipe->user); 839 788 for (i = 0; i < pipe->ring_size; i++) { 840 789 struct pipe_buffer *buf = pipe->bufs + i; ··· 917 852 if (!inode) 918 853 return -ENFILE; 919 854 855 + if (flags & O_NOTIFICATION_PIPE) { 856 + #ifdef CONFIG_WATCH_QUEUE 857 + if (watch_queue_init(inode->i_pipe) < 0) { 858 + iput(inode); 859 + return -ENOMEM; 860 + } 861 + #else 862 + return -ENOPKG; 863 + #endif 864 + } 865 + 920 866 f = alloc_file_pseudo(inode, pipe_mnt, "", 921 867 O_WRONLY | (flags & (O_NONBLOCK | O_DIRECT)), 922 868 &pipefifo_fops); ··· 958 
 	int error;
 	int fdw, fdr;
 
-	if (flags & ~(O_CLOEXEC | O_NONBLOCK | O_DIRECT))
+	if (flags & ~(O_CLOEXEC | O_NONBLOCK | O_DIRECT | O_NOTIFICATION_PIPE))
 		return -EINVAL;
 
 	error = create_pipe_files(files, flags);
···
 }
 
 /*
- * Allocate a new array of pipe buffers and copy the info over. Returns the
- * pipe size if successful, or return -ERROR on error.
+ * Resize the pipe ring to a number of slots.
  */
-static long pipe_set_size(struct pipe_inode_info *pipe, unsigned long arg)
+int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots)
 {
 	struct pipe_buffer *bufs;
-	unsigned int size, nr_slots, head, tail, mask, n;
-	unsigned long user_bufs;
-	long ret = 0;
-
-	size = round_pipe_size(arg);
-	nr_slots = size >> PAGE_SHIFT;
-
-	if (!nr_slots)
-		return -EINVAL;
-
-	/*
-	 * If trying to increase the pipe capacity, check that an
-	 * unprivileged user is not trying to exceed various limits
-	 * (soft limit check here, hard limit check just below).
-	 * Decreasing the pipe capacity is always permitted, even
-	 * if the user is currently over a limit.
-	 */
-	if (nr_slots > pipe->ring_size &&
-	    size > pipe_max_size && !capable(CAP_SYS_RESOURCE))
-		return -EPERM;
-
-	user_bufs = account_pipe_buffers(pipe->user, pipe->ring_size, nr_slots);
-
-	if (nr_slots > pipe->ring_size &&
-	    (too_many_pipe_buffers_hard(user_bufs) ||
-	     too_many_pipe_buffers_soft(user_bufs)) &&
-	    is_unprivileged_user()) {
-		ret = -EPERM;
-		goto out_revert_acct;
-	}
+	unsigned int head, tail, mask, n;
 
 	/*
 	 * We can shrink the pipe, if arg is greater than the ring occupancy.
···
 	head = pipe->head;
 	tail = pipe->tail;
 	n = pipe_occupancy(pipe->head, pipe->tail);
-	if (nr_slots < n) {
-		ret = -EBUSY;
-		goto out_revert_acct;
-	}
+	if (nr_slots < n)
+		return -EBUSY;
 
 	bufs = kcalloc(nr_slots, sizeof(*bufs),
 		       GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
-	if (unlikely(!bufs)) {
-		ret = -ENOMEM;
-		goto out_revert_acct;
-	}
+	if (unlikely(!bufs))
+		return -ENOMEM;
 
 	/*
 	 * The pipe array wraps around, so just start the new one at zero
···
 	kfree(pipe->bufs);
 	pipe->bufs = bufs;
 	pipe->ring_size = nr_slots;
-	pipe->max_usage = nr_slots;
+	if (pipe->max_usage > nr_slots)
+		pipe->max_usage = nr_slots;
 	pipe->tail = tail;
 	pipe->head = head;
 
 	/* This might have made more room for writers */
 	wake_up_interruptible(&pipe->wr_wait);
+	return 0;
+}
+
+/*
+ * Allocate a new array of pipe buffers and copy the info over. Returns the
+ * pipe size if successful, or return -ERROR on error.
+ */
+static long pipe_set_size(struct pipe_inode_info *pipe, unsigned long arg)
+{
+	unsigned long user_bufs;
+	unsigned int nr_slots, size;
+	long ret = 0;
+
+#ifdef CONFIG_WATCH_QUEUE
+	if (pipe->watch_queue)
+		return -EBUSY;
+#endif
+
+	size = round_pipe_size(arg);
+	nr_slots = size >> PAGE_SHIFT;
+
+	if (!nr_slots)
+		return -EINVAL;
+
+	/*
+	 * If trying to increase the pipe capacity, check that an
+	 * unprivileged user is not trying to exceed various limits
+	 * (soft limit check here, hard limit check just below).
+	 * Decreasing the pipe capacity is always permitted, even
+	 * if the user is currently over a limit.
+	 */
+	if (nr_slots > pipe->max_usage &&
+	    size > pipe_max_size && !capable(CAP_SYS_RESOURCE))
+		return -EPERM;
+
+	user_bufs = account_pipe_buffers(pipe->user, pipe->nr_accounted, nr_slots);
+
+	if (nr_slots > pipe->max_usage &&
+	    (too_many_pipe_buffers_hard(user_bufs) ||
+	     too_many_pipe_buffers_soft(user_bufs)) &&
+	    pipe_is_unprivileged_user()) {
+		ret = -EPERM;
+		goto out_revert_acct;
+	}
+
+	ret = pipe_resize_ring(pipe, nr_slots);
+	if (ret < 0)
+		goto out_revert_acct;
+
+	pipe->max_usage = nr_slots;
+	pipe->nr_accounted = nr_slots;
 	return pipe->max_usage * PAGE_SIZE;
 
 out_revert_acct:
-	(void) account_pipe_buffers(pipe->user, nr_slots, pipe->ring_size);
+	(void) account_pipe_buffers(pipe->user, nr_slots, pipe->nr_accounted);
 	return ret;
 }
 
···
 * location, so checking ->i_pipe is not enough to verify that this is a
 * pipe.
 */
-struct pipe_inode_info *get_pipe_info(struct file *file)
+struct pipe_inode_info *get_pipe_info(struct file *file, bool for_splice)
 {
-	return file->f_op == &pipefifo_fops ? file->private_data : NULL;
+	struct pipe_inode_info *pipe = file->private_data;
+
+	if (file->f_op != &pipefifo_fops || !pipe)
+		return NULL;
+#ifdef CONFIG_WATCH_QUEUE
+	if (for_splice && pipe->watch_queue)
+		return NULL;
+#endif
+	return pipe;
 }
 
 long pipe_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
···
 	struct pipe_inode_info *pipe;
 	long ret;
 
-	pipe = get_pipe_info(file);
+	pipe = get_pipe_info(file, false);
 	if (!pipe)
 		return -EBADF;
 
+6 -6
fs/splice.c
···
 		     !(out->f_mode & FMODE_WRITE)))
 		return -EBADF;
 
-	ipipe = get_pipe_info(in);
-	opipe = get_pipe_info(out);
+	ipipe = get_pipe_info(in, true);
+	opipe = get_pipe_info(out, true);
 
 	if (ipipe && opipe) {
 		if (off_in || off_out)
···
 static long vmsplice_to_user(struct file *file, struct iov_iter *iter,
 			     unsigned int flags)
 {
-	struct pipe_inode_info *pipe = get_pipe_info(file);
+	struct pipe_inode_info *pipe = get_pipe_info(file, true);
 	struct splice_desc sd = {
 		.total_len = iov_iter_count(iter),
 		.flags = flags,
···
 	if (flags & SPLICE_F_GIFT)
 		buf_flag = PIPE_BUF_FLAG_GIFT;
 
-	pipe = get_pipe_info(file);
+	pipe = get_pipe_info(file, true);
 	if (!pipe)
 		return -EBADF;
 
···
  */
 long do_tee(struct file *in, struct file *out, size_t len, unsigned int flags)
 {
-	struct pipe_inode_info *ipipe = get_pipe_info(in);
-	struct pipe_inode_info *opipe = get_pipe_info(out);
+	struct pipe_inode_info *ipipe = get_pipe_info(in, true);
+	struct pipe_inode_info *opipe = get_pipe_info(out, true);
 	int ret = -EINVAL;
 
 	if (unlikely(!(in->f_mode & FMODE_READ) ||
+21 -12
include/linux/key.h
···
 
 #define KEY_PERM_UNDEF	0xffffffff
 
+/*
+ * The permissions required on a key that we're looking up.
+ */
+enum key_need_perm {
+	KEY_NEED_UNSPECIFIED,	/* Needed permission unspecified */
+	KEY_NEED_VIEW,		/* Require permission to view attributes */
+	KEY_NEED_READ,		/* Require permission to read content */
+	KEY_NEED_WRITE,		/* Require permission to update / modify */
+	KEY_NEED_SEARCH,	/* Require permission to search (keyring) or find (key) */
+	KEY_NEED_LINK,		/* Require permission to link */
+	KEY_NEED_SETATTR,	/* Require permission to change attributes */
+	KEY_NEED_UNLINK,	/* Require permission to unlink key */
+	KEY_SYSADMIN_OVERRIDE,	/* Special: override by CAP_SYS_ADMIN */
+	KEY_AUTHTOKEN_OVERRIDE,	/* Special: override by possession of auth token */
+	KEY_DEFER_PERM_CHECK,	/* Special: permission check is deferred */
+};
+
 struct seq_file;
 struct user_struct;
 struct signal_struct;
···
 		struct list_head graveyard_link;
 		struct rb_node	serial_node;
 	};
+#ifdef CONFIG_KEY_NOTIFICATIONS
+	struct watch_list	*watchers;	/* Entities watching this key for changes */
+#endif
 	struct rw_semaphore	sem;		/* change vs change sem */
 	struct key_user		*user;		/* owner of this key */
 	void			*security;	/* security data for this key */
···
 extern void key_set_timeout(struct key *, unsigned);
 
 extern key_ref_t lookup_user_key(key_serial_t id, unsigned long flags,
-				 key_perm_t perm);
+				 enum key_need_perm need_perm);
 extern void key_free_user_ns(struct user_namespace *);
-
-/*
- * The permissions required on a key that we're looking up.
- */
-#define	KEY_NEED_VIEW	0x01	/* Require permission to view attributes */
-#define	KEY_NEED_READ	0x02	/* Require permission to read content */
-#define	KEY_NEED_WRITE	0x04	/* Require permission to update / modify */
-#define	KEY_NEED_SEARCH	0x08	/* Require permission to search (keyring) or find (key) */
-#define	KEY_NEED_LINK	0x10	/* Require permission to link */
-#define	KEY_NEED_SETATTR 0x20	/* Require permission to change attributes */
-#define	KEY_NEED_ALL	0x3f	/* All the above permissions */
 
 static inline short key_read_state(const struct key *key)
 {
+1
include/linux/lsm_audit.h
···
 #define LSM_AUDIT_DATA_IBPKEY	13
 #define LSM_AUDIT_DATA_IBENDPORT 14
 #define LSM_AUDIT_DATA_LOCKDOWN 15
+#define LSM_AUDIT_DATA_NOTIFICATION 16
 	union 	{
 		struct path path;
 		struct dentry *dentry;
+9
include/linux/lsm_hook_defs.h
···
 LSM_HOOK(int, 0, inode_getsecctx, struct inode *inode, void **ctx,
 	 u32 *ctxlen)
 
+#if defined(CONFIG_SECURITY) && defined(CONFIG_WATCH_QUEUE)
+LSM_HOOK(int, 0, post_notification, const struct cred *w_cred,
+	 const struct cred *cred, struct watch_notification *n)
+#endif /* CONFIG_SECURITY && CONFIG_WATCH_QUEUE */
+
+#if defined(CONFIG_SECURITY) && defined(CONFIG_KEY_NOTIFICATIONS)
+LSM_HOOK(int, 0, watch_key, struct key *key)
+#endif /* CONFIG_SECURITY && CONFIG_KEY_NOTIFICATIONS */
+
 #ifdef CONFIG_SECURITY_NETWORK
 LSM_HOOK(int, 0, unix_stream_connect, struct sock *sock, struct sock *other,
 	 struct sock *newsk)
+14
include/linux/lsm_hooks.h
···
 *	@ctx is a pointer in which to place the allocated security context.
 *	@ctxlen points to the place to put the length of @ctx.
 *
+ * Security hooks for the general notification queue:
+ *
+ * @post_notification:
+ *	Check to see if a watch notification can be posted to a particular
+ *	queue.
+ *	@w_cred: The credentials of the whoever set the watch.
+ *	@cred: The event-triggerer's credentials
+ *	@n: The notification being posted
+ *
+ * @watch_key:
+ *	Check to see if a process is allowed to watch for event notifications
+ *	from a key or keyring.
+ *	@key: The key to watch.
+ *
 * Security hooks for using the eBPF maps and programs functionalities through
 * eBPF syscalls.
 *
+26 -1
include/linux/pipe_fs_i.h
···
 #define PIPE_BUF_FLAG_GIFT	0x04	/* page is a gift */
 #define PIPE_BUF_FLAG_PACKET	0x08	/* read() as a packet */
 #define PIPE_BUF_FLAG_CAN_MERGE	0x10	/* can merge buffers */
+#define PIPE_BUF_FLAG_WHOLE	0x20	/* read() must return entire buffer or error */
+#ifdef CONFIG_WATCH_QUEUE
+#define PIPE_BUF_FLAG_LOSS	0x40	/* Message loss happened after this buffer */
+#endif
 
 /**
  * struct pipe_buffer - a linux kernel pipe buffer
···
 * @wr_wait: writer wait point in case of full pipe
 * @head: The point of buffer production
 * @tail: The point of buffer consumption
+ * @note_loss: The next read() should insert a data-lost message
 * @max_usage: The maximum number of slots that may be used in the ring
 * @ring_size: total number of buffers (should be a power of 2)
+ * @nr_accounted: The amount this pipe accounts for in user->pipe_bufs
 * @tmp_page: cached released page
 * @readers: number of current readers of this pipe
 * @writers: number of current writers of this pipe
···
 * @fasync_writers: writer side fasync
 * @bufs: the circular array of pipe buffers
 * @user: the user who created this pipe
+ * @watch_queue: If this pipe is a watch_queue, this is the stuff for that
 **/
 struct pipe_inode_info {
 	struct mutex mutex;
···
 	unsigned int tail;
 	unsigned int max_usage;
 	unsigned int ring_size;
+#ifdef CONFIG_WATCH_QUEUE
+	bool note_loss;
+#endif
+	unsigned int nr_accounted;
 	unsigned int readers;
 	unsigned int writers;
 	unsigned int files;
···
 	struct fasync_struct *fasync_writers;
 	struct pipe_buffer *bufs;
 	struct user_struct *user;
+#ifdef CONFIG_WATCH_QUEUE
+	struct watch_queue *watch_queue;
+#endif
 };
 
 /*
···
 
 extern const struct pipe_buf_operations nosteal_pipe_buf_ops;
 
+#ifdef CONFIG_WATCH_QUEUE
+unsigned long account_pipe_buffers(struct user_struct *user,
+				   unsigned long old, unsigned long new);
+bool too_many_pipe_buffers_soft(unsigned long user_bufs);
+bool too_many_pipe_buffers_hard(unsigned long user_bufs);
+bool pipe_is_unprivileged_user(void);
+#endif
+
 /* for F_SETPIPE_SZ and F_GETPIPE_SZ */
+#ifdef CONFIG_WATCH_QUEUE
+int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots);
+#endif
 long pipe_fcntl(struct file *, unsigned int, unsigned long arg);
-struct pipe_inode_info *get_pipe_info(struct file *file);
+struct pipe_inode_info *get_pipe_info(struct file *file, bool for_splice);
 
 int create_pipe_files(struct file **, int);
 unsigned int round_pipe_size(unsigned long size);
+27 -3
include/linux/security.h
···
 struct fs_context;
 struct fs_parameter;
 enum fs_value_type;
+struct watch;
+struct watch_notification;
 
 /* Default (no) options for the capable function */
 #define CAP_OPT_NONE 0x0
···
 }
 #endif	/* CONFIG_SECURITY */
 
+#if defined(CONFIG_SECURITY) && defined(CONFIG_WATCH_QUEUE)
+int security_post_notification(const struct cred *w_cred,
+			       const struct cred *cred,
+			       struct watch_notification *n);
+#else
+static inline int security_post_notification(const struct cred *w_cred,
+					     const struct cred *cred,
+					     struct watch_notification *n)
+{
+	return 0;
+}
+#endif
+
+#if defined(CONFIG_SECURITY) && defined(CONFIG_KEY_NOTIFICATIONS)
+int security_watch_key(struct key *key);
+#else
+static inline int security_watch_key(struct key *key)
+{
+	return 0;
+}
+#endif
+
 #ifdef CONFIG_SECURITY_NETWORK
 
 int security_unix_stream_connect(struct sock *sock, struct sock *other, struct sock *newsk);
···
 
 int security_key_alloc(struct key *key, const struct cred *cred, unsigned long flags);
 void security_key_free(struct key *key);
-int security_key_permission(key_ref_t key_ref,
-			    const struct cred *cred, unsigned perm);
+int security_key_permission(key_ref_t key_ref, const struct cred *cred,
+			    enum key_need_perm need_perm);
 int security_key_getsecurity(struct key *key, char **_buffer);
 
 #else
···
 
 static inline int security_key_permission(key_ref_t key_ref,
 					  const struct cred *cred,
-					  unsigned perm)
+					  enum key_need_perm need_perm)
 {
 	return 0;
 }
+127
include/linux/watch_queue.h
···
+// SPDX-License-Identifier: GPL-2.0
+/* User-mappable watch queue
+ *
+ * Copyright (C) 2020 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * See Documentation/watch_queue.rst
+ */
+
+#ifndef _LINUX_WATCH_QUEUE_H
+#define _LINUX_WATCH_QUEUE_H
+
+#include <uapi/linux/watch_queue.h>
+#include <linux/kref.h>
+#include <linux/rcupdate.h>
+
+#ifdef CONFIG_WATCH_QUEUE
+
+struct cred;
+
+struct watch_type_filter {
+	enum watch_notification_type type;
+	__u32		subtype_filter[1];	/* Bitmask of subtypes to filter on */
+	__u32		info_filter;		/* Filter on watch_notification::info */
+	__u32		info_mask;		/* Mask of relevant bits in info_filter */
+};
+
+struct watch_filter {
+	union {
+		struct rcu_head	rcu;
+		unsigned long	type_filter[2];	/* Bitmask of accepted types */
+	};
+	u32		nr_filters;		/* Number of filters */
+	struct watch_type_filter filters[];
+};
+
+struct watch_queue {
+	struct rcu_head		rcu;
+	struct watch_filter __rcu *filter;
+	struct pipe_inode_info	*pipe;		/* The pipe we're using as a buffer */
+	struct hlist_head	watches;	/* Contributory watches */
+	struct page		**notes;	/* Preallocated notifications */
+	unsigned long		*notes_bitmap;	/* Allocation bitmap for notes */
+	struct kref		usage;		/* Object usage count */
+	spinlock_t		lock;
+	unsigned int		nr_notes;	/* Number of notes */
+	unsigned int		nr_pages;	/* Number of pages in notes[] */
+	bool			defunct;	/* T when queues closed */
+};
+
+/*
+ * Representation of a watch on an object.
+ */
+struct watch {
+	union {
+		struct rcu_head	rcu;
+		u32		info_id;	/* ID to be OR'd in to info field */
+	};
+	struct watch_queue __rcu *queue;	/* Queue to post events to */
+	struct hlist_node	queue_node;	/* Link in queue->watches */
+	struct watch_list __rcu	*watch_list;
+	struct hlist_node	list_node;	/* Link in watch_list->watchers */
+	const struct cred	*cred;		/* Creds of the owner of the watch */
+	void			*private;	/* Private data for the watched object */
+	u64			id;		/* Internal identifier */
+	struct kref		usage;		/* Object usage count */
+};
+
+/*
+ * List of watches on an object.
+ */
+struct watch_list {
+	struct rcu_head		rcu;
+	struct hlist_head	watchers;
+	void (*release_watch)(struct watch *);
+	spinlock_t		lock;
+};
+
+extern void __post_watch_notification(struct watch_list *,
+				      struct watch_notification *,
+				      const struct cred *,
+				      u64);
+extern struct watch_queue *get_watch_queue(int);
+extern void put_watch_queue(struct watch_queue *);
+extern void init_watch(struct watch *, struct watch_queue *);
+extern int add_watch_to_object(struct watch *, struct watch_list *);
+extern int remove_watch_from_object(struct watch_list *, struct watch_queue *, u64, bool);
+extern long watch_queue_set_size(struct pipe_inode_info *, unsigned int);
+extern long watch_queue_set_filter(struct pipe_inode_info *,
+				   struct watch_notification_filter __user *);
+extern int watch_queue_init(struct pipe_inode_info *);
+extern void watch_queue_clear(struct watch_queue *);
+
+static inline void init_watch_list(struct watch_list *wlist,
+				   void (*release_watch)(struct watch *))
+{
+	INIT_HLIST_HEAD(&wlist->watchers);
+	spin_lock_init(&wlist->lock);
+	wlist->release_watch = release_watch;
+}
+
+static inline void post_watch_notification(struct watch_list *wlist,
+					   struct watch_notification *n,
+					   const struct cred *cred,
+					   u64 id)
+{
+	if (unlikely(wlist))
+		__post_watch_notification(wlist, n, cred, id);
+}
+
+static inline void remove_watch_list(struct watch_list *wlist, u64 id)
+{
+	if (wlist) {
+		remove_watch_from_object(wlist, NULL, id, true);
+		kfree_rcu(wlist, rcu);
+	}
+}
+
+/**
+ * watch_sizeof - Calculate the information part of the size of a watch record,
+ * given the structure size.
+ */
+#define watch_sizeof(STRUCT) (sizeof(STRUCT) << WATCH_INFO_LENGTH__SHIFT)
+
+#endif
+
+#endif /* _LINUX_WATCH_QUEUE_H */
+2
include/uapi/linux/keyctl.h
···
 #define KEYCTL_RESTRICT_KEYRING		29	/* Restrict keys allowed to link to a keyring */
 #define KEYCTL_MOVE			30	/* Move keys between keyrings */
 #define KEYCTL_CAPABILITIES		31	/* Find capabilities of keyrings subsystem */
+#define KEYCTL_WATCH_KEY		32	/* Watch a key or ring of keys for changes */
 
 /* keyctl structures */
 struct keyctl_dh_params {
···
 #define KEYCTL_CAPS0_MOVE		0x80 /* KEYCTL_MOVE supported */
 #define KEYCTL_CAPS1_NS_KEYRING_NAME	0x01 /* Keyring names are per-user_namespace */
 #define KEYCTL_CAPS1_NS_KEY_TAG		0x02 /* Key indexing can include a namespace tag */
+#define KEYCTL_CAPS1_NOTIFICATIONS	0x04 /* Keys generate watchable notifications */
 
 #endif /*  _LINUX_KEYCTL_H */
+104
include/uapi/linux/watch_queue.h
···
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_WATCH_QUEUE_H
+#define _UAPI_LINUX_WATCH_QUEUE_H
+
+#include <linux/types.h>
+#include <linux/fcntl.h>
+#include <linux/ioctl.h>
+
+#define O_NOTIFICATION_PIPE	O_EXCL	/* Parameter to pipe2() selecting notification pipe */
+
+#define IOC_WATCH_QUEUE_SET_SIZE	_IO('W', 0x60)	/* Set the size in pages */
+#define IOC_WATCH_QUEUE_SET_FILTER	_IO('W', 0x61)	/* Set the filter */
+
+enum watch_notification_type {
+	WATCH_TYPE_META		= 0,	/* Special record */
+	WATCH_TYPE_KEY_NOTIFY	= 1,	/* Key change event notification */
+	WATCH_TYPE__NR		= 2
+};
+
+enum watch_meta_notification_subtype {
+	WATCH_META_REMOVAL_NOTIFICATION	= 0,	/* Watched object was removed */
+	WATCH_META_LOSS_NOTIFICATION	= 1,	/* Data loss occurred */
+};
+
+/*
+ * Notification record header.  This is aligned to 64-bits so that subclasses
+ * can contain __u64 fields.
+ */
+struct watch_notification {
+	__u32			type:24;	/* enum watch_notification_type */
+	__u32			subtype:8;	/* Type-specific subtype (filterable) */
+	__u32			info;
+#define WATCH_INFO_LENGTH	0x0000007f	/* Length of record */
+#define WATCH_INFO_LENGTH__SHIFT 0
+#define WATCH_INFO_ID		0x0000ff00	/* ID of watchpoint */
+#define WATCH_INFO_ID__SHIFT	8
+#define WATCH_INFO_TYPE_INFO	0xffff0000	/* Type-specific info */
+#define WATCH_INFO_TYPE_INFO__SHIFT 16
+#define WATCH_INFO_FLAG_0	0x00010000	/* Type-specific info, flag bit 0 */
+#define WATCH_INFO_FLAG_1	0x00020000	/* ... */
+#define WATCH_INFO_FLAG_2	0x00040000
+#define WATCH_INFO_FLAG_3	0x00080000
+#define WATCH_INFO_FLAG_4	0x00100000
+#define WATCH_INFO_FLAG_5	0x00200000
+#define WATCH_INFO_FLAG_6	0x00400000
+#define WATCH_INFO_FLAG_7	0x00800000
+};
+
+/*
+ * Notification filtering rules (IOC_WATCH_QUEUE_SET_FILTER).
+ */
+struct watch_notification_type_filter {
+	__u32	type;			/* Type to apply filter to */
+	__u32	info_filter;		/* Filter on watch_notification::info */
+	__u32	info_mask;		/* Mask of relevant bits in info_filter */
+	__u32	subtype_filter[8];	/* Bitmask of subtypes to filter on */
+};
+
+struct watch_notification_filter {
+	__u32	nr_filters;	/* Number of filters */
+	__u32	__reserved;	/* Must be 0 */
+	struct watch_notification_type_filter filters[];
+};
+
+
+/*
+ * Extended watch removal notification.  This is used optionally if the type
+ * wants to indicate an identifier for the object being watched, if there is
+ * such.  This can be distinguished by the length.
+ *
+ * type -> WATCH_TYPE_META
+ * subtype -> WATCH_META_REMOVAL_NOTIFICATION
+ */
+struct watch_notification_removal {
+	struct watch_notification watch;
+	__u64	id;		/* Type-dependent identifier */
+};
+
+/*
+ * Type of key/keyring change notification.
+ */
+enum key_notification_subtype {
+	NOTIFY_KEY_INSTANTIATED	= 0, /* Key was instantiated (aux is error code) */
+	NOTIFY_KEY_UPDATED	= 1, /* Key was updated */
+	NOTIFY_KEY_LINKED	= 2, /* Key (aux) was added to watched keyring */
+	NOTIFY_KEY_UNLINKED	= 3, /* Key (aux) was removed from watched keyring */
+	NOTIFY_KEY_CLEARED	= 4, /* Keyring was cleared */
+	NOTIFY_KEY_REVOKED	= 5, /* Key was revoked */
+	NOTIFY_KEY_INVALIDATED	= 6, /* Key was invalidated */
+	NOTIFY_KEY_SETATTR	= 7, /* Key's attributes got changed */
+};
+
+/*
+ * Key/keyring notification record.
+ * - watch.type = WATCH_TYPE_KEY_NOTIFY
+ * - watch.subtype = enum key_notification_type
+ */
+struct key_notification {
+	struct watch_notification watch;
+	__u32	key_id;		/* The key/keyring affected */
+	__u32	aux;		/* Per-type auxiliary data */
+};
+
+#endif /* _UAPI_LINUX_WATCH_QUEUE_H */
+12
init/Kconfig
···
 	depends on SYSCTL
 	default y
 
+config WATCH_QUEUE
+	bool "General notification queue"
+	default n
+	help
+
+	  This is a general notification queue for the kernel to pass events to
+	  userspace by splicing them into pipes.  It can be used in conjunction
+	  with watches for key/keyring change notifications and device
+	  notifications.
+
+	  See Documentation/watch_queue.rst
+
 config CROSS_MEMORY_ATTACH
 	bool "Enable process_vm_readv/writev syscalls"
 	depends on MMU
+1
kernel/Makefile
···
 
 obj-$(CONFIG_HAS_IOMEM) += iomem.o
 obj-$(CONFIG_RSEQ) += rseq.o
+obj-$(CONFIG_WATCH_QUEUE) += watch_queue.o
 
 obj-$(CONFIG_SYSCTL_KUNIT_TEST) += sysctl-test.o
 
+655
kernel/watch_queue.c
···
+// SPDX-License-Identifier: GPL-2.0
+/* Watch queue and general notification mechanism, built on pipes
+ *
+ * Copyright (C) 2020 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * See Documentation/watch_queue.rst
+ */
+
+#define pr_fmt(fmt) "watchq: " fmt
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/printk.h>
+#include <linux/miscdevice.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/poll.h>
+#include <linux/uaccess.h>
+#include <linux/vmalloc.h>
+#include <linux/file.h>
+#include <linux/security.h>
+#include <linux/cred.h>
+#include <linux/sched/signal.h>
+#include <linux/watch_queue.h>
+#include <linux/pipe_fs_i.h>
+
+MODULE_DESCRIPTION("Watch queue");
+MODULE_AUTHOR("Red Hat, Inc.");
+MODULE_LICENSE("GPL");
+
+#define WATCH_QUEUE_NOTE_SIZE 128
+#define WATCH_QUEUE_NOTES_PER_PAGE (PAGE_SIZE / WATCH_QUEUE_NOTE_SIZE)
+
+static void watch_queue_pipe_buf_release(struct pipe_inode_info *pipe,
+					 struct pipe_buffer *buf)
+{
+	struct watch_queue *wqueue = (struct watch_queue *)buf->private;
+	struct page *page;
+	unsigned int bit;
+
+	/* We need to work out which note within the page this refers to, but
+	 * the note might have been maximum size, so merely ANDing the offset
+	 * off doesn't work.  OTOH, the note must've been more than zero size.
+	 */
+	bit = buf->offset + buf->len;
+	if ((bit & (WATCH_QUEUE_NOTE_SIZE - 1)) == 0)
+		bit -= WATCH_QUEUE_NOTE_SIZE;
+	bit /= WATCH_QUEUE_NOTE_SIZE;
+
+	page = buf->page;
+	bit += page->index;
+
+	set_bit(bit, wqueue->notes_bitmap);
+}
+
+// No try_steal function => no stealing
+#define watch_queue_pipe_buf_try_steal NULL
+
+/* New data written to a pipe may be appended to a buffer with this type. */
+static const struct pipe_buf_operations watch_queue_pipe_buf_ops = {
+	.release	= watch_queue_pipe_buf_release,
+	.try_steal	= watch_queue_pipe_buf_try_steal,
+	.get		= generic_pipe_buf_get,
+};
+
+/*
+ * Post a notification to a watch queue.
+ */
+static bool post_one_notification(struct watch_queue *wqueue,
+				  struct watch_notification *n)
+{
+	void *p;
+	struct pipe_inode_info *pipe = wqueue->pipe;
+	struct pipe_buffer *buf;
+	struct page *page;
+	unsigned int head, tail, mask, note, offset, len;
+	bool done = false;
+
+	if (!pipe)
+		return false;
+
+	spin_lock_irq(&pipe->rd_wait.lock);
+
+	if (wqueue->defunct)
+		goto out;
+
+	mask = pipe->ring_size - 1;
+	head = pipe->head;
+	tail = pipe->tail;
+	if (pipe_full(head, tail, pipe->ring_size))
+		goto lost;
+
+	note = find_first_bit(wqueue->notes_bitmap, wqueue->nr_notes);
+	if (note >= wqueue->nr_notes)
+		goto lost;
+
+	page = wqueue->notes[note / WATCH_QUEUE_NOTES_PER_PAGE];
+	offset = note % WATCH_QUEUE_NOTES_PER_PAGE * WATCH_QUEUE_NOTE_SIZE;
+	get_page(page);
+	len = n->info & WATCH_INFO_LENGTH;
+	p = kmap_atomic(page);
+	memcpy(p + offset, n, len);
+	kunmap_atomic(p);
+
+	buf = &pipe->bufs[head & mask];
+	buf->page = page;
+	buf->private = (unsigned long)wqueue;
+	buf->ops = &watch_queue_pipe_buf_ops;
+	buf->offset = offset;
+	buf->len = len;
+	buf->flags = PIPE_BUF_FLAG_WHOLE;
+	pipe->head = head + 1;
+
+	if (!test_and_clear_bit(note, wqueue->notes_bitmap)) {
+		spin_unlock_irq(&pipe->rd_wait.lock);
+		BUG();
+	}
+	wake_up_interruptible_sync_poll_locked(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);
+	done = true;
+
+out:
+	spin_unlock_irq(&pipe->rd_wait.lock);
+	if (done)
+		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
+	return done;
+
+lost:
+	buf = &pipe->bufs[(head - 1) & mask];
+	buf->flags |= PIPE_BUF_FLAG_LOSS;
+	goto out;
+}
+
+/*
+ * Apply filter rules to a notification.
+ */
+static bool filter_watch_notification(const struct watch_filter *wf,
+				      const struct watch_notification *n)
+{
+	const struct watch_type_filter *wt;
+	unsigned int st_bits = sizeof(wt->subtype_filter[0]) * 8;
+	unsigned int st_index = n->subtype / st_bits;
+	unsigned int st_bit = 1U << (n->subtype % st_bits);
+	int i;
+
+	if (!test_bit(n->type, wf->type_filter))
+		return false;
+
+	for (i = 0; i < wf->nr_filters; i++) {
+		wt = &wf->filters[i];
+		if (n->type == wt->type &&
+		    (wt->subtype_filter[st_index] & st_bit) &&
+		    (n->info & wt->info_mask) == wt->info_filter)
+			return true;
+	}
+
+	return false; /* If there is a filter, the default is to reject. */
+}
+
+/**
+ * __post_watch_notification - Post an event notification
+ * @wlist: The watch list to post the event to.
+ * @n: The notification record to post.
+ * @cred: The creds of the process that triggered the notification.
+ * @id: The ID to match on the watch.
+ *
+ * Post a notification of an event into a set of watch queues and let the users
+ * know.
+ *
+ * The size of the notification should be set in n->info & WATCH_INFO_LENGTH and
+ * should be in units of sizeof(*n).
+ */
+void __post_watch_notification(struct watch_list *wlist,
+			       struct watch_notification *n,
+			       const struct cred *cred,
+			       u64 id)
+{
+	const struct watch_filter *wf;
+	struct watch_queue *wqueue;
+	struct watch *watch;
+
+	if (((n->info & WATCH_INFO_LENGTH) >> WATCH_INFO_LENGTH__SHIFT) == 0) {
+		WARN_ON(1);
+		return;
+	}
+
+	rcu_read_lock();
+
+	hlist_for_each_entry_rcu(watch, &wlist->watchers, list_node) {
+		if (watch->id != id)
+			continue;
+		n->info &= ~WATCH_INFO_ID;
+		n->info |= watch->info_id;
+
+		wqueue = rcu_dereference(watch->queue);
+		wf = rcu_dereference(wqueue->filter);
+		if (wf && !filter_watch_notification(wf, n))
+			continue;
+
+		if (security_post_notification(watch->cred, cred, n) < 0)
+			continue;
+
+		post_one_notification(wqueue, n);
+	}
+
+	rcu_read_unlock();
+}
+EXPORT_SYMBOL(__post_watch_notification);
+
+/*
+ * Allocate sufficient pages to preallocation for the requested number of
+ * notifications.
+ */
+long watch_queue_set_size(struct pipe_inode_info *pipe, unsigned int nr_notes)
+{
+	struct watch_queue *wqueue = pipe->watch_queue;
+	struct page **pages;
+	unsigned long *bitmap;
+	unsigned long user_bufs;
+	unsigned int bmsize;
+	int ret, i, nr_pages;
+
+	if (!wqueue)
+		return -ENODEV;
+	if (wqueue->notes)
+		return -EBUSY;
+
+	if (nr_notes < 1 ||
+	    nr_notes > 512) /* TODO: choose a better hard limit */
+		return -EINVAL;
+
+	nr_pages = (nr_notes + WATCH_QUEUE_NOTES_PER_PAGE - 1);
+	nr_pages /= WATCH_QUEUE_NOTES_PER_PAGE;
+	user_bufs = account_pipe_buffers(pipe->user, pipe->nr_accounted, nr_pages);
+
+	if (nr_pages > pipe->max_usage &&
+	    (too_many_pipe_buffers_hard(user_bufs) ||
+	     too_many_pipe_buffers_soft(user_bufs)) &&
+	    pipe_is_unprivileged_user()) {
+		ret = -EPERM;
+		goto error;
+	}
+
+	ret = pipe_resize_ring(pipe, nr_notes);
+	if (ret < 0)
+		goto error;
+
+	pages = kcalloc(sizeof(struct page *), nr_pages, GFP_KERNEL);
+	if (!pages)
+		goto error;
+
+	for (i = 0; i < nr_pages; i++) {
+		pages[i] = alloc_page(GFP_KERNEL);
+		if (!pages[i])
+			goto error_p;
+		pages[i]->index = i * WATCH_QUEUE_NOTES_PER_PAGE;
+	}
+
+	bmsize = (nr_notes + BITS_PER_LONG - 1) / BITS_PER_LONG;
+	bmsize *= sizeof(unsigned long);
+	bitmap = kmalloc(bmsize, GFP_KERNEL);
+	if (!bitmap)
+		goto error_p;
+
+	memset(bitmap, 0xff, bmsize);
+	wqueue->notes = pages;
+	wqueue->notes_bitmap = bitmap;
+	wqueue->nr_pages = nr_pages;
+	wqueue->nr_notes = nr_pages * WATCH_QUEUE_NOTES_PER_PAGE;
+	return 0;
+
+error_p:
+	for (i = 0; i < nr_pages; i++)
+		__free_page(pages[i]);
+	kfree(pages);
+error:
+	(void) account_pipe_buffers(pipe->user, nr_pages, pipe->nr_accounted);
+	return ret;
+}
+
+/*
+ * Set the filter on a watch queue.
+ */
+long watch_queue_set_filter(struct pipe_inode_info *pipe,
+			    struct watch_notification_filter __user *_filter)
+{
+	struct watch_notification_type_filter *tf;
+	struct watch_notification_filter filter;
+	struct watch_type_filter *q;
+	struct watch_filter *wfilter;
+	struct watch_queue *wqueue = pipe->watch_queue;
+	int ret, nr_filter = 0, i;
+
+	if (!wqueue)
+		return -ENODEV;
+
+	if (!_filter) {
+		/* Remove the old filter */
+		wfilter = NULL;
+		goto set;
+	}
+
+	/* Grab the user's filter specification */
+	if (copy_from_user(&filter, _filter, sizeof(filter)) != 0)
+		return -EFAULT;
+	if (filter.nr_filters == 0 ||
+	    filter.nr_filters > 16 ||
+	    filter.__reserved != 0)
+		return -EINVAL;
+
+	tf = memdup_user(_filter->filters, filter.nr_filters * sizeof(*tf));
+	if (IS_ERR(tf))
+		return PTR_ERR(tf);
+
+	ret = -EINVAL;
+	for (i = 0; i < filter.nr_filters; i++) {
+		if ((tf[i].info_filter & ~tf[i].info_mask) ||
+		    tf[i].info_mask & WATCH_INFO_LENGTH)
+			goto err_filter;
+		/* Ignore any unknown types */
+		if (tf[i].type >= sizeof(wfilter->type_filter) * 8)
+			continue;
+		nr_filter++;
+	}
+
+	/* Now we need to build the internal filter from only the relevant
+	 * user-specified filters.
+	 */
+	ret = -ENOMEM;
+	wfilter = kzalloc(struct_size(wfilter, filters, nr_filter), GFP_KERNEL);
+	if (!wfilter)
+		goto err_filter;
+	wfilter->nr_filters = nr_filter;
+
+	q = wfilter->filters;
+	for (i = 0; i < filter.nr_filters; i++) {
+		if (tf[i].type >= sizeof(wfilter->type_filter) * BITS_PER_LONG)
+			continue;
+
+		q->type			= tf[i].type;
+		q->info_filter		= tf[i].info_filter;
+		q->info_mask		= tf[i].info_mask;
+		q->subtype_filter[0]	= tf[i].subtype_filter[0];
+		__set_bit(q->type, wfilter->type_filter);
+		q++;
+	}
+
+	kfree(tf);
+set:
+	pipe_lock(pipe);
+	wfilter = rcu_replace_pointer(wqueue->filter, wfilter,
+				      lockdep_is_held(&pipe->mutex));
+	pipe_unlock(pipe);
+	if (wfilter)
+		kfree_rcu(wfilter, rcu);
+	return 0;
+
+err_filter:
+	kfree(tf);
+	return ret;
+}
+
+static void __put_watch_queue(struct kref *kref)
+{
+	struct watch_queue *wqueue =
+		container_of(kref, struct watch_queue, usage);
+	struct watch_filter *wfilter;
+	int i;
+
+	for (i = 0; i < wqueue->nr_pages; i++)
+		__free_page(wqueue->notes[i]);
+
+	wfilter = rcu_access_pointer(wqueue->filter);
+	if (wfilter)
+		kfree_rcu(wfilter, rcu);
+	kfree_rcu(wqueue, rcu);
+}
+
+/**
+ * put_watch_queue - Dispose of a ref on a watchqueue.
+ * @wqueue: The watch queue to unref.
+ */
+void put_watch_queue(struct watch_queue *wqueue)
+{
+	kref_put(&wqueue->usage, __put_watch_queue);
+}
+EXPORT_SYMBOL(put_watch_queue);
+
+static void free_watch(struct rcu_head *rcu)
+{
+	struct watch *watch = container_of(rcu, struct watch, rcu);
+
+	put_watch_queue(rcu_access_pointer(watch->queue));
+	put_cred(watch->cred);
+}
+
+static void __put_watch(struct kref *kref)
+{
+	struct watch *watch = container_of(kref, struct watch, usage);
+
+	call_rcu(&watch->rcu, free_watch);
+}
+
+/*
+ * Discard a watch.
+ */
+static void put_watch(struct watch *watch)
+{
+	kref_put(&watch->usage, __put_watch);
+}
+
+/**
+ * init_watch_queue - Initialise a watch
+ * @watch: The watch to initialise.
+ * @wqueue: The queue to assign.
+ *
+ * Initialise a watch and set the watch queue.
+ */
+void init_watch(struct watch *watch, struct watch_queue *wqueue)
+{
+	kref_init(&watch->usage);
+	INIT_HLIST_NODE(&watch->list_node);
+	INIT_HLIST_NODE(&watch->queue_node);
+	rcu_assign_pointer(watch->queue, wqueue);
+}
+
+/**
+ * add_watch_to_object - Add a watch on an object to a watch list
+ * @watch: The watch to add
+ * @wlist: The watch list to add to
+ *
+ * @watch->queue must have been set to point to the queue to post notifications
+ * to and the watch list of the object to be watched.  @watch->cred must also
+ * have been set to the appropriate credentials and a ref taken on them.
+ *
+ * The caller must pin the queue and the list both and must hold the list
+ * locked against racing watch additions/removals.
440 + */ 441 + int add_watch_to_object(struct watch *watch, struct watch_list *wlist) 442 + { 443 + struct watch_queue *wqueue = rcu_access_pointer(watch->queue); 444 + struct watch *w; 445 + 446 + hlist_for_each_entry(w, &wlist->watchers, list_node) { 447 + struct watch_queue *wq = rcu_access_pointer(w->queue); 448 + if (wqueue == wq && watch->id == w->id) 449 + return -EBUSY; 450 + } 451 + 452 + watch->cred = get_current_cred(); 453 + rcu_assign_pointer(watch->watch_list, wlist); 454 + 455 + spin_lock_bh(&wqueue->lock); 456 + kref_get(&wqueue->usage); 457 + kref_get(&watch->usage); 458 + hlist_add_head(&watch->queue_node, &wqueue->watches); 459 + spin_unlock_bh(&wqueue->lock); 460 + 461 + hlist_add_head(&watch->list_node, &wlist->watchers); 462 + return 0; 463 + } 464 + EXPORT_SYMBOL(add_watch_to_object); 465 + 466 + /** 467 + * remove_watch_from_object - Remove a watch or all watches from an object. 468 + * @wlist: The watch list to remove from 469 + * @wq: The watch queue of interest (ignored if @all is true) 470 + * @id: The ID of the watch to remove (ignored if @all is true) 471 + * @all: True to remove all objects 472 + * 473 + * Remove a specific watch or all watches from an object. A notification is 474 + * sent to the watcher to tell them that this happened. 
475 + */ 476 + int remove_watch_from_object(struct watch_list *wlist, struct watch_queue *wq, 477 + u64 id, bool all) 478 + { 479 + struct watch_notification_removal n; 480 + struct watch_queue *wqueue; 481 + struct watch *watch; 482 + int ret = -EBADSLT; 483 + 484 + rcu_read_lock(); 485 + 486 + again: 487 + spin_lock(&wlist->lock); 488 + hlist_for_each_entry(watch, &wlist->watchers, list_node) { 489 + if (all || 490 + (watch->id == id && rcu_access_pointer(watch->queue) == wq)) 491 + goto found; 492 + } 493 + spin_unlock(&wlist->lock); 494 + goto out; 495 + 496 + found: 497 + ret = 0; 498 + hlist_del_init_rcu(&watch->list_node); 499 + rcu_assign_pointer(watch->watch_list, NULL); 500 + spin_unlock(&wlist->lock); 501 + 502 + /* We now own the reference on watch that used to belong to wlist. */ 503 + 504 + n.watch.type = WATCH_TYPE_META; 505 + n.watch.subtype = WATCH_META_REMOVAL_NOTIFICATION; 506 + n.watch.info = watch->info_id | watch_sizeof(n.watch); 507 + n.id = id; 508 + if (id != 0) 509 + n.watch.info = watch->info_id | watch_sizeof(n); 510 + 511 + wqueue = rcu_dereference(watch->queue); 512 + 513 + /* We don't need the watch list lock for the next bit as RCU is 514 + * protecting *wqueue from deallocation. 
515 + */ 516 + if (wqueue) { 517 + post_one_notification(wqueue, &n.watch); 518 + 519 + spin_lock_bh(&wqueue->lock); 520 + 521 + if (!hlist_unhashed(&watch->queue_node)) { 522 + hlist_del_init_rcu(&watch->queue_node); 523 + put_watch(watch); 524 + } 525 + 526 + spin_unlock_bh(&wqueue->lock); 527 + } 528 + 529 + if (wlist->release_watch) { 530 + void (*release_watch)(struct watch *); 531 + 532 + release_watch = wlist->release_watch; 533 + rcu_read_unlock(); 534 + (*release_watch)(watch); 535 + rcu_read_lock(); 536 + } 537 + put_watch(watch); 538 + 539 + if (all && !hlist_empty(&wlist->watchers)) 540 + goto again; 541 + out: 542 + rcu_read_unlock(); 543 + return ret; 544 + } 545 + EXPORT_SYMBOL(remove_watch_from_object); 546 + 547 + /* 548 + * Remove all the watches that are contributory to a queue. This has the 549 + * potential to race with removal of the watches by the destruction of the 550 + * objects being watched or with the distribution of notifications. 551 + */ 552 + void watch_queue_clear(struct watch_queue *wqueue) 553 + { 554 + struct watch_list *wlist; 555 + struct watch *watch; 556 + bool release; 557 + 558 + rcu_read_lock(); 559 + spin_lock_bh(&wqueue->lock); 560 + 561 + /* Prevent new additions and prevent notifications from happening */ 562 + wqueue->defunct = true; 563 + 564 + while (!hlist_empty(&wqueue->watches)) { 565 + watch = hlist_entry(wqueue->watches.first, struct watch, queue_node); 566 + hlist_del_init_rcu(&watch->queue_node); 567 + /* We now own a ref on the watch. */ 568 + spin_unlock_bh(&wqueue->lock); 569 + 570 + /* We can't do the next bit under the queue lock as we need to 571 + * get the list lock - which would cause a deadlock if someone 572 + * was removing from the opposite direction at the same time or 573 + * posting a notification. 
574 + */ 575 + wlist = rcu_dereference(watch->watch_list); 576 + if (wlist) { 577 + void (*release_watch)(struct watch *); 578 + 579 + spin_lock(&wlist->lock); 580 + 581 + release = !hlist_unhashed(&watch->list_node); 582 + if (release) { 583 + hlist_del_init_rcu(&watch->list_node); 584 + rcu_assign_pointer(watch->watch_list, NULL); 585 + 586 + /* We now own a second ref on the watch. */ 587 + } 588 + 589 + release_watch = wlist->release_watch; 590 + spin_unlock(&wlist->lock); 591 + 592 + if (release) { 593 + if (release_watch) { 594 + rcu_read_unlock(); 595 + /* This might need to call dput(), so 596 + * we have to drop all the locks. 597 + */ 598 + (*release_watch)(watch); 599 + rcu_read_lock(); 600 + } 601 + put_watch(watch); 602 + } 603 + } 604 + 605 + put_watch(watch); 606 + spin_lock_bh(&wqueue->lock); 607 + } 608 + 609 + spin_unlock_bh(&wqueue->lock); 610 + rcu_read_unlock(); 611 + } 612 + 613 + /** 614 + * get_watch_queue - Get a watch queue from its file descriptor. 615 + * @fd: The fd to query. 616 + */ 617 + struct watch_queue *get_watch_queue(int fd) 618 + { 619 + struct pipe_inode_info *pipe; 620 + struct watch_queue *wqueue = ERR_PTR(-EINVAL); 621 + struct fd f; 622 + 623 + f = fdget(fd); 624 + if (f.file) { 625 + pipe = get_pipe_info(f.file, false); 626 + if (pipe && pipe->watch_queue) { 627 + wqueue = pipe->watch_queue; 628 + kref_get(&wqueue->usage); 629 + } 630 + fdput(f); 631 + } 632 + 633 + return wqueue; 634 + } 635 + EXPORT_SYMBOL(get_watch_queue); 636 + 637 + /* 638 + * Initialise a watch queue 639 + */ 640 + int watch_queue_init(struct pipe_inode_info *pipe) 641 + { 642 + struct watch_queue *wqueue; 643 + 644 + wqueue = kzalloc(sizeof(*wqueue), GFP_KERNEL); 645 + if (!wqueue) 646 + return -ENOMEM; 647 + 648 + wqueue->pipe = pipe; 649 + kref_init(&wqueue->usage); 650 + spin_lock_init(&wqueue->lock); 651 + INIT_HLIST_HEAD(&wqueue->watches); 652 + 653 + pipe->watch_queue = wqueue; 654 + return 0; 655 + }
samples/Kconfig (+7)
···
 	bool "watchdog sample"
 	depends on CC_CAN_LINK
 
+config SAMPLE_WATCH_QUEUE
+	bool "Build example /dev/watch_queue notification consumer"
+	depends on HEADERS_INSTALL
+	help
+	  Build example userspace program to use the new mount_notify(),
+	  sb_notify() syscalls and the KEYCTL_WATCH_KEY keyctl() function.
+
 endif # SAMPLES
samples/Makefile (+1)
···
 subdir-$(CONFIG_SAMPLE_VFS)		+= vfs
 obj-$(CONFIG_SAMPLE_INTEL_MEI)		+= mei/
 subdir-$(CONFIG_SAMPLE_WATCHDOG)	+= watchdog
+subdir-$(CONFIG_SAMPLE_WATCH_QUEUE)	+= watch_queue
samples/watch_queue/Makefile (+7, new file)
+# List of programs to build
+hostprogs := watch_test
+
+# Tell kbuild to always build the programs
+always-y := $(hostprogs)
+
+HOSTCFLAGS_watch_test.o += -I$(objtree)/usr/include
samples/watch_queue/watch_test.c (+186, new file)
+// SPDX-License-Identifier: GPL-2.0
+/* Use /dev/watch_queue to watch for notifications.
+ *
+ * Copyright (C) 2020 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#define _GNU_SOURCE
+#include <stdbool.h>
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <signal.h>
+#include <unistd.h>
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <limits.h>
+#include <linux/watch_queue.h>
+#include <linux/unistd.h>
+#include <linux/keyctl.h>
+
+#ifndef KEYCTL_WATCH_KEY
+#define KEYCTL_WATCH_KEY -1
+#endif
+#ifndef __NR_keyctl
+#define __NR_keyctl -1
+#endif
+
+#define BUF_SIZE 256
+
+static long keyctl_watch_key(int key, int watch_fd, int watch_id)
+{
+	return syscall(__NR_keyctl, KEYCTL_WATCH_KEY, key, watch_fd, watch_id);
+}
+
+static const char *key_subtypes[256] = {
+	[NOTIFY_KEY_INSTANTIATED]	= "instantiated",
+	[NOTIFY_KEY_UPDATED]		= "updated",
+	[NOTIFY_KEY_LINKED]		= "linked",
+	[NOTIFY_KEY_UNLINKED]		= "unlinked",
+	[NOTIFY_KEY_CLEARED]		= "cleared",
+	[NOTIFY_KEY_REVOKED]		= "revoked",
+	[NOTIFY_KEY_INVALIDATED]	= "invalidated",
+	[NOTIFY_KEY_SETATTR]		= "setattr",
+};
+
+static void saw_key_change(struct watch_notification *n, size_t len)
+{
+	struct key_notification *k = (struct key_notification *)n;
+
+	if (len != sizeof(struct key_notification)) {
+		fprintf(stderr, "Incorrect key message length\n");
+		return;
+	}
+
+	printf("KEY %08x change=%u[%s] aux=%u\n",
+	       k->key_id, n->subtype, key_subtypes[n->subtype], k->aux);
+}
+
+/*
+ * Consume and display events.
+ */
+static void consumer(int fd)
+{
+	unsigned char buffer[433], *p, *end;
+	union {
+		struct watch_notification n;
+		unsigned char buf1[128];
+	} n;
+	ssize_t buf_len;
+
+	for (;;) {
+		buf_len = read(fd, buffer, sizeof(buffer));
+		if (buf_len == -1) {
+			perror("read");
+			exit(1);
+		}
+
+		if (buf_len == 0) {
+			printf("-- END --\n");
+			return;
+		}
+
+		if (buf_len > sizeof(buffer)) {
+			fprintf(stderr, "Read buffer overrun: %zd\n", buf_len);
+			return;
+		}
+
+		printf("read() = %zd\n", buf_len);
+
+		p = buffer;
+		end = buffer + buf_len;
+		while (p < end) {
+			size_t largest, len;
+
+			largest = end - p;
+			if (largest > 128)
+				largest = 128;
+			if (largest < sizeof(struct watch_notification)) {
+				fprintf(stderr, "Short message header: %zu\n", largest);
+				return;
+			}
+			memcpy(&n, p, largest);
+
+			printf("NOTIFY[%03zx]: ty=%06x sy=%02x i=%08x\n",
+			       p - buffer, n.n.type, n.n.subtype, n.n.info);
+
+			len = n.n.info & WATCH_INFO_LENGTH;
+			if (len < sizeof(n.n) || len > largest) {
+				fprintf(stderr, "Bad message length: %zu/%zu\n", len, largest);
+				exit(1);
+			}
+
+			switch (n.n.type) {
+			case WATCH_TYPE_META:
+				switch (n.n.subtype) {
+				case WATCH_META_REMOVAL_NOTIFICATION:
+					printf("REMOVAL of watchpoint %08x\n",
+					       (n.n.info & WATCH_INFO_ID) >>
+					       WATCH_INFO_ID__SHIFT);
+					break;
+				case WATCH_META_LOSS_NOTIFICATION:
+					printf("-- LOSS --\n");
+					break;
+				default:
+					printf("other meta record\n");
+					break;
+				}
+				break;
+			case WATCH_TYPE_KEY_NOTIFY:
+				saw_key_change(&n.n, len);
+				break;
+			default:
+				printf("other type\n");
+				break;
+			}
+
+			p += len;
+		}
+	}
+}
+
+static struct watch_notification_filter filter = {
+	.nr_filters	= 1,
+	.filters = {
+		[0] = {
+			.type			= WATCH_TYPE_KEY_NOTIFY,
+			.subtype_filter[0]	= UINT_MAX,
+		},
+	},
+};
+
+int main(int argc, char **argv)
+{
+	int pipefd[2], fd;
+
+	if (pipe2(pipefd, O_NOTIFICATION_PIPE) == -1) {
+		perror("pipe2");
+		exit(1);
+	}
+	fd = pipefd[0];
+
+	if (ioctl(fd, IOC_WATCH_QUEUE_SET_SIZE, BUF_SIZE) == -1) {
+		perror("watch_queue(size)");
+		exit(1);
+	}
+
+	if (ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, &filter) == -1) {
+		perror("watch_queue(filter)");
+		exit(1);
+	}
+
+	if (keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fd, 0x01) == -1) {
+		perror("keyctl");
+		exit(1);
+	}
+
+	if (keyctl_watch_key(KEY_SPEC_USER_KEYRING, fd, 0x02) == -1) {
+		perror("keyctl");
+		exit(1);
+	}
+
+	consumer(fd);
+	exit(0);
+}
security/keys/Kconfig (+9)
···
 	  in the kernel.
 
 	  If you are unsure as to whether this is required, answer N.
+
+config KEY_NOTIFICATIONS
+	bool "Provide key/keyring change notifications"
+	depends on KEYS && WATCH_QUEUE
+	help
+	  This option provides support for getting change notifications on keys
+	  and keyrings on which the caller has View permission. This makes use
+	  of the /dev/watch_queue misc device to handle the notification
+	  buffer and provides KEYCTL_WATCH_KEY to enable/disable watches.
security/keys/compat.c (+3)
···
 	case KEYCTL_CAPABILITIES:
 		return keyctl_capabilities(compat_ptr(arg2), arg3);
 
+	case KEYCTL_WATCH_KEY:
+		return keyctl_watch_key(arg2, arg3, arg4);
+
 	default:
 		return -EOPNOTSUPP;
 	}
security/keys/gc.c (+5)
···
 		kdebug("- %u", key->serial);
 		key_check(key);
 
+#ifdef CONFIG_KEY_NOTIFICATIONS
+		remove_watch_list(key->watchers, key->serial);
+		key->watchers = NULL;
+#endif
+
 		/* Throw away the key data if the key is instantiated */
 		if (state == KEY_IS_POSITIVE && key->type->destroy)
 			key->type->destroy(key);
security/keys/internal.h (+33 -5)
···
 #include <linux/task_work.h>
 #include <linux/keyctl.h>
 #include <linux/refcount.h>
+#include <linux/watch_queue.h>
 #include <linux/compat.h>
 #include <linux/mm.h>
 #include <linux/vmalloc.h>
···
 			    const struct keyring_index_key *index_key,
 			    struct assoc_array_edit **_edit);
 extern int __key_link_check_live_key(struct key *keyring, struct key *key);
-extern void __key_link(struct key *key, struct assoc_array_edit **_edit);
+extern void __key_link(struct key *keyring, struct key *key,
+		       struct assoc_array_edit **_edit);
 extern void __key_link_end(struct key *keyring,
 			   const struct keyring_index_key *index_key,
 			   struct assoc_array_edit *edit);
···
 				  const struct key_match_data *match_data);
 #define KEY_LOOKUP_CREATE	0x01
 #define KEY_LOOKUP_PARTIAL	0x02
-#define KEY_LOOKUP_FOR_UNLINK	0x04
 
 extern long join_session_keyring(const char *name);
 extern void key_change_session_keyring(struct callback_head *twork);
···
 
 extern int key_task_permission(const key_ref_t key_ref,
 			       const struct cred *cred,
-			       key_perm_t perm);
+			       enum key_need_perm need_perm);
+
+static inline void notify_key(struct key *key,
+			      enum key_notification_subtype subtype, u32 aux)
+{
+#ifdef CONFIG_KEY_NOTIFICATIONS
+	struct key_notification n = {
+		.watch.type	= WATCH_TYPE_KEY_NOTIFY,
+		.watch.subtype	= subtype,
+		.watch.info	= watch_sizeof(n),
+		.key_id		= key_serial(key),
+		.aux		= aux,
+	};
+
+	post_watch_notification(key->watchers, &n.watch, current_cred(),
+				n.key_id);
+#endif
+}
 
 /*
  * Check to see whether permission is granted to use a key in the desired way.
  */
-static inline int key_permission(const key_ref_t key_ref, unsigned perm)
+static inline int key_permission(const key_ref_t key_ref,
+				 enum key_need_perm need_perm)
 {
-	return key_task_permission(key_ref, current_cred(), perm);
+	return key_task_permission(key_ref, current_cred(), need_perm);
 }
 
 extern struct key_type key_type_request_key_auth;
···
 #endif
 
 extern long keyctl_capabilities(unsigned char __user *_buffer, size_t buflen);
+
+#ifdef CONFIG_KEY_NOTIFICATIONS
+extern long keyctl_watch_key(key_serial_t, int, int);
+#else
+static inline long keyctl_watch_key(key_serial_t key_id, int watch_fd, int watch_id)
+{
+	return -EOPNOTSUPP;
+}
+#endif
 
 /*
  * Debugging key validation
security/keys/key.c (+23 -13)
···
 	/* mark the key as being instantiated */
 	atomic_inc(&key->user->nikeys);
 	mark_key_instantiated(key, 0);
+	notify_key(key, NOTIFY_KEY_INSTANTIATED, 0);
 
 	if (test_and_clear_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags))
 		awaken = 1;
···
 	if (test_bit(KEY_FLAG_KEEP, &keyring->flags))
 		set_bit(KEY_FLAG_KEEP, &key->flags);
 
-	__key_link(key, _edit);
+	__key_link(keyring, key, _edit);
 }
 
 /* disable the authorisation key */
···
 	/* mark the key as being negatively instantiated */
 	atomic_inc(&key->user->nikeys);
 	mark_key_instantiated(key, -error);
+	notify_key(key, NOTIFY_KEY_INSTANTIATED, -error);
 	key->expiry = ktime_get_real_seconds() + timeout;
 	key_schedule_gc(key->expiry + key_gc_delay);
···
 
 	/* and link it into the destination keyring */
 	if (keyring && link_ret == 0)
-		__key_link(key, &edit);
+		__key_link(keyring, key, &edit);
 
 	/* disable the authorisation key */
 	if (authkey)
···
 	down_write(&key->sem);
 
 	ret = key->type->update(key, prep);
-	if (ret == 0)
+	if (ret == 0) {
 		/* Updating a negative key positively instantiates it */
 		mark_key_instantiated(key, 0);
+		notify_key(key, NOTIFY_KEY_UPDATED, 0);
+	}
 
 	up_write(&key->sem);
···
 	down_write(&key->sem);
 
 	ret = key->type->update(key, &prep);
-	if (ret == 0)
+	if (ret == 0) {
 		/* Updating a negative key positively instantiates it */
 		mark_key_instantiated(key, 0);
+		notify_key(key, NOTIFY_KEY_UPDATED, 0);
+	}
 
 	up_write(&key->sem);
···
 	 * instantiated
 	 */
 	down_write_nested(&key->sem, 1);
-	if (!test_and_set_bit(KEY_FLAG_REVOKED, &key->flags) &&
-	    key->type->revoke)
-		key->type->revoke(key);
+	if (!test_and_set_bit(KEY_FLAG_REVOKED, &key->flags)) {
+		notify_key(key, NOTIFY_KEY_REVOKED, 0);
+		if (key->type->revoke)
+			key->type->revoke(key);
 
-	/* set the death time to no more than the expiry time */
-	time = ktime_get_real_seconds();
-	if (key->revoked_at == 0 || key->revoked_at > time) {
-		key->revoked_at = time;
-		key_schedule_gc(key->revoked_at + key_gc_delay);
+		/* set the death time to no more than the expiry time */
+		time = ktime_get_real_seconds();
+		if (key->revoked_at == 0 || key->revoked_at > time) {
+			key->revoked_at = time;
+			key_schedule_gc(key->revoked_at + key_gc_delay);
+		}
 	}
 
 	up_write(&key->sem);
···
 
 	if (!test_bit(KEY_FLAG_INVALIDATED, &key->flags)) {
 		down_write_nested(&key->sem, 1);
-		if (!test_and_set_bit(KEY_FLAG_INVALIDATED, &key->flags))
+		if (!test_and_set_bit(KEY_FLAG_INVALIDATED, &key->flags)) {
+			notify_key(key, NOTIFY_KEY_INVALIDATED, 0);
 			key_schedule_gc_links();
+		}
 		up_write(&key->sem);
 	}
 }
security/keys/keyctl.c (+105 -10)
···
 		       KEYCTL_CAPS0_MOVE
 		       ),
 	[1] = (KEYCTL_CAPS1_NS_KEYRING_NAME |
-	       KEYCTL_CAPS1_NS_KEY_TAG),
+	       KEYCTL_CAPS1_NS_KEY_TAG |
+	       (IS_ENABLED(CONFIG_KEY_NOTIFICATIONS) ? KEYCTL_CAPS1_NOTIFICATIONS : 0)
+	       ),
 };
 
 static int key_get_type_from_user(char *type,
···
 
 	/* Root is permitted to invalidate certain special keys */
 	if (capable(CAP_SYS_ADMIN)) {
-		key_ref = lookup_user_key(id, 0, 0);
+		key_ref = lookup_user_key(id, 0, KEY_SYSADMIN_OVERRIDE);
 		if (IS_ERR(key_ref))
 			goto error;
 		if (test_bit(KEY_FLAG_ROOT_CAN_INVAL,
···
 
 	/* Root is permitted to invalidate certain special keyrings */
 	if (capable(CAP_SYS_ADMIN)) {
-		keyring_ref = lookup_user_key(ringid, 0, 0);
+		keyring_ref = lookup_user_key(ringid, 0,
+					      KEY_SYSADMIN_OVERRIDE);
 		if (IS_ERR(keyring_ref))
 			goto error;
 		if (test_bit(KEY_FLAG_ROOT_CAN_CLEAR,
···
 		goto error;
 	}
 
-	key_ref = lookup_user_key(id, KEY_LOOKUP_FOR_UNLINK, 0);
+	key_ref = lookup_user_key(id, KEY_LOOKUP_PARTIAL, KEY_NEED_UNLINK);
 	if (IS_ERR(key_ref)) {
 		ret = PTR_ERR(key_ref);
 		goto error2;
···
 			key_put(instkey);
 			key_ref = lookup_user_key(keyid,
 						  KEY_LOOKUP_PARTIAL,
-						  0);
+						  KEY_AUTHTOKEN_OVERRIDE);
 			if (!IS_ERR(key_ref))
 				goto okay;
 		}
···
 	size_t key_data_len;
 
 	/* find the key first */
-	key_ref = lookup_user_key(keyid, 0, 0);
+	key_ref = lookup_user_key(keyid, 0, KEY_DEFER_PERM_CHECK);
 	if (IS_ERR(key_ref)) {
 		ret = -ENOKEY;
 		goto out;
···
 	if (group != (gid_t) -1)
 		key->gid = gid;
 
+	notify_key(key, NOTIFY_KEY_SETATTR, 0);
 	ret = 0;
 
 error_put:
···
 	/* if we're not the sysadmin, we can only change a key that we own */
 	if (capable(CAP_SYS_ADMIN) || uid_eq(key->uid, current_fsuid())) {
 		key->perm = perm;
+		notify_key(key, NOTIFY_KEY_SETATTR, 0);
 		ret = 0;
 	}
···
 			key_put(instkey);
 			key_ref = lookup_user_key(id,
 						  KEY_LOOKUP_PARTIAL,
-						  0);
+						  KEY_AUTHTOKEN_OVERRIDE);
 			if (!IS_ERR(key_ref))
 				goto okay;
 		}
···
 okay:
 	key = key_ref_to_ptr(key_ref);
 	ret = 0;
-	if (test_bit(KEY_FLAG_KEEP, &key->flags))
+	if (test_bit(KEY_FLAG_KEEP, &key->flags)) {
 		ret = -EPERM;
-	else
+	} else {
 		key_set_timeout(key, timeout);
+		notify_key(key, NOTIFY_KEY_SETATTR, 0);
+	}
 	key_put(key);
 
 error:
···
 			return PTR_ERR(instkey);
 		key_put(instkey);
 
-		key_ref = lookup_user_key(keyid, KEY_LOOKUP_PARTIAL, 0);
+		key_ref = lookup_user_key(keyid, KEY_LOOKUP_PARTIAL,
+					  KEY_AUTHTOKEN_OVERRIDE);
 		if (IS_ERR(key_ref))
 			return PTR_ERR(key_ref);
 	}
···
 	return ret;
 }
 
+#ifdef CONFIG_KEY_NOTIFICATIONS
+/*
+ * Watch for changes to a key.
+ *
+ * The caller must have View permission to watch a key or keyring.
+ */
+long keyctl_watch_key(key_serial_t id, int watch_queue_fd, int watch_id)
+{
+	struct watch_queue *wqueue;
+	struct watch_list *wlist = NULL;
+	struct watch *watch = NULL;
+	struct key *key;
+	key_ref_t key_ref;
+	long ret;
+
+	if (watch_id < -1 || watch_id > 0xff)
+		return -EINVAL;
+
+	key_ref = lookup_user_key(id, KEY_LOOKUP_CREATE, KEY_NEED_VIEW);
+	if (IS_ERR(key_ref))
+		return PTR_ERR(key_ref);
+	key = key_ref_to_ptr(key_ref);
+
+	wqueue = get_watch_queue(watch_queue_fd);
+	if (IS_ERR(wqueue)) {
+		ret = PTR_ERR(wqueue);
+		goto err_key;
+	}
+
+	if (watch_id >= 0) {
+		ret = -ENOMEM;
+		if (!key->watchers) {
+			wlist = kzalloc(sizeof(*wlist), GFP_KERNEL);
+			if (!wlist)
+				goto err_wqueue;
+			init_watch_list(wlist, NULL);
+		}
+
+		watch = kzalloc(sizeof(*watch), GFP_KERNEL);
+		if (!watch)
+			goto err_wlist;
+
+		init_watch(watch, wqueue);
+		watch->id = key->serial;
+		watch->info_id = (u32)watch_id << WATCH_INFO_ID__SHIFT;
+
+		ret = security_watch_key(key);
+		if (ret < 0)
+			goto err_watch;
+
+		down_write(&key->sem);
+		if (!key->watchers) {
+			key->watchers = wlist;
+			wlist = NULL;
+		}
+
+		ret = add_watch_to_object(watch, key->watchers);
+		up_write(&key->sem);
+
+		if (ret == 0)
+			watch = NULL;
+	} else {
+		ret = -EBADSLT;
+		if (key->watchers) {
+			down_write(&key->sem);
+			ret = remove_watch_from_object(key->watchers,
+						       wqueue, key_serial(key),
+						       false);
+			up_write(&key->sem);
+		}
+	}
+
+err_watch:
+	kfree(watch);
+err_wlist:
+	kfree(wlist);
+err_wqueue:
+	put_watch_queue(wqueue);
+err_key:
+	key_put(key);
+	return ret;
+}
+#endif /* CONFIG_KEY_NOTIFICATIONS */
+
 /*
  * Get keyrings subsystem capabilities.
  */
···
 
 	case KEYCTL_CAPABILITIES:
 		return keyctl_capabilities((unsigned char __user *)arg2, (size_t)arg3);
+
+	case KEYCTL_WATCH_KEY:
+		return keyctl_watch_key((key_serial_t)arg2, (int)arg3, (int)arg4);
 
 	default:
 		return -EOPNOTSUPP;
security/keys/keyring.c (+13 -7)
···
 	down_write(&keyring->sem);
 	down_write(&keyring_serialise_restrict_sem);
 
-	if (keyring->restrict_link)
+	if (keyring->restrict_link) {
 		ret = -EEXIST;
-	else if (keyring_detect_restriction_cycle(keyring, restrict_link))
+	} else if (keyring_detect_restriction_cycle(keyring, restrict_link)) {
 		ret = -EDEADLK;
-	else
+	} else {
 		keyring->restrict_link = restrict_link;
+		notify_key(keyring, NOTIFY_KEY_SETATTR, 0);
+	}
 
 	up_write(&keyring_serialise_restrict_sem);
 	up_write(&keyring->sem);
···
  * holds at most one link to any given key of a particular type+description
  * combination.
  */
-void __key_link(struct key *key, struct assoc_array_edit **_edit)
+void __key_link(struct key *keyring, struct key *key,
+		struct assoc_array_edit **_edit)
 {
 	__key_get(key);
 	assoc_array_insert_set_object(*_edit, keyring_key_to_ptr(key));
 	assoc_array_apply_edit(*_edit);
 	*_edit = NULL;
+	notify_key(keyring, NOTIFY_KEY_LINKED, key_serial(key));
 }
 
 /*
···
 	if (ret == 0)
 		ret = __key_link_check_live_key(keyring, key);
 	if (ret == 0)
-		__key_link(key, &edit);
+		__key_link(keyring, key, &edit);
 
 error_end:
 	__key_link_end(keyring, &key->index_key, edit);
···
 	struct assoc_array_edit *edit;
 
 	BUG_ON(*_edit != NULL);
- 
+
 	edit = assoc_array_delete(&keyring->keys, &keyring_assoc_array_ops,
 				  &key->index_key);
 	if (IS_ERR(edit))
···
 			  struct assoc_array_edit **_edit)
 {
 	assoc_array_apply_edit(*_edit);
+	notify_key(keyring, NOTIFY_KEY_UNLINKED, key_serial(key));
 	*_edit = NULL;
 	key_payload_reserve(keyring, keyring->datalen - KEYQUOTA_LINK_BYTES);
 }
···
 		goto error;
 
 	__key_unlink(from_keyring, key, &from_edit);
-	__key_link(key, &to_edit);
+	__key_link(to_keyring, key, &to_edit);
 error:
 	__key_link_end(to_keyring, &key->index_key, to_edit);
 	__key_unlink_end(from_keyring, key, from_edit);
···
 	} else {
 		if (edit)
 			assoc_array_apply_edit(edit);
+		notify_key(keyring, NOTIFY_KEY_CLEARED, 0);
 		key_payload_reserve(keyring, 0);
 		ret = 0;
 	}
security/keys/permission.c (+24 -7)
···
  * key_task_permission - Check a key can be used
  * @key_ref: The key to check.
  * @cred: The credentials to use.
- * @perm: The permissions to check for.
+ * @need_perm: The permission required.
  *
  * Check to see whether permission is granted to use a key in the desired way,
  * but permit the security modules to override.
···
  * permissions bits or the LSM check.
  */
 int key_task_permission(const key_ref_t key_ref, const struct cred *cred,
-			unsigned perm)
+			enum key_need_perm need_perm)
 {
 	struct key *key;
-	key_perm_t kperm;
+	key_perm_t kperm, mask;
 	int ret;
+
+	switch (need_perm) {
+	default:
+		WARN_ON(1);
+		return -EACCES;
+	case KEY_NEED_UNLINK:
+	case KEY_SYSADMIN_OVERRIDE:
+	case KEY_AUTHTOKEN_OVERRIDE:
+	case KEY_DEFER_PERM_CHECK:
+		goto lsm;
+
+	case KEY_NEED_VIEW:	mask = KEY_OTH_VIEW;	break;
+	case KEY_NEED_READ:	mask = KEY_OTH_READ;	break;
+	case KEY_NEED_WRITE:	mask = KEY_OTH_WRITE;	break;
+	case KEY_NEED_SEARCH:	mask = KEY_OTH_SEARCH;	break;
+	case KEY_NEED_LINK:	mask = KEY_OTH_LINK;	break;
+	case KEY_NEED_SETATTR:	mask = KEY_OTH_SETATTR;	break;
+	}
 
 	key = key_ref_to_ptr(key_ref);
···
 	if (is_key_possessed(key_ref))
 		kperm |= key->perm >> 24;
 
-	kperm = kperm & perm & KEY_NEED_ALL;
-
-	if (kperm != perm)
+	if ((kperm & mask) != mask)
 		return -EACCES;
 
 	/* let LSM be the final arbiter */
-	return security_key_permission(key_ref, cred, perm);
+lsm:
+	return security_key_permission(key_ref, cred, need_perm);
 }
 EXPORT_SYMBOL(key_task_permission);
+22 -24
security/keys/process_keys.c
···
  * returned key reference.
  */
 key_ref_t lookup_user_key(key_serial_t id, unsigned long lflags,
-			  key_perm_t perm)
+			  enum key_need_perm need_perm)
 {
 	struct keyring_search_context ctx = {
 		.match_data.cmp		= lookup_user_key_possessed,
···
 
 	/* unlink does not use the nominated key in any way, so can skip all
 	 * the permission checks as it is only concerned with the keyring */
-	if (lflags & KEY_LOOKUP_FOR_UNLINK) {
-		ret = 0;
-		goto error;
-	}
-
-	if (!(lflags & KEY_LOOKUP_PARTIAL)) {
-		ret = wait_for_key_construction(key, true);
-		switch (ret) {
-		case -ERESTARTSYS:
-			goto invalid_key;
-		default:
-			if (perm)
-				goto invalid_key;
-		case 0:
-			break;
-		}
-	} else if (perm) {
-		ret = key_validate(key);
-		if (ret < 0)
-			goto invalid_key;
-	}
-
-	ret = -EIO;
-	if (!(lflags & KEY_LOOKUP_PARTIAL) &&
-	    key_read_state(key) == KEY_IS_UNINSTANTIATED)
-		goto invalid_key;
+	if (need_perm != KEY_NEED_UNLINK) {
+		if (!(lflags & KEY_LOOKUP_PARTIAL)) {
+			ret = wait_for_key_construction(key, true);
+			switch (ret) {
+			case -ERESTARTSYS:
+				goto invalid_key;
+			default:
+				if (need_perm != KEY_AUTHTOKEN_OVERRIDE &&
+				    need_perm != KEY_DEFER_PERM_CHECK)
+					goto invalid_key;
+			case 0:
+				break;
+			}
+		} else if (need_perm != KEY_DEFER_PERM_CHECK) {
+			ret = key_validate(key);
+			if (ret < 0)
+				goto invalid_key;
+		}
+
+		ret = -EIO;
+		if (!(lflags & KEY_LOOKUP_PARTIAL) &&
+		    key_read_state(key) == KEY_IS_UNINSTANTIATED)
+			goto invalid_key;
+	}
 
 	/* check the permissions */
-	ret = key_task_permission(key_ref, ctx.cred, perm);
+	ret = key_task_permission(key_ref, ctx.cred, need_perm);
 	if (ret < 0)
 		goto invalid_key;
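Structurally, the `lookup_user_key()` rework folds the old early-exit for unlink (`ret = 0; goto error`) into one outer guard: all construction-wait, validation, and instantiation checks are skipped together when the request is `KEY_NEED_UNLINK`, and `KEY_DEFER_PERM_CHECK` skips only the validation step. A small userspace sketch of that decision ladder (not kernel code; the enum and helper are hypothetical stand-ins):

```c
#include <stdbool.h>

enum lookup_need { LOOKUP_READ, LOOKUP_UNLINK, LOOKUP_DEFER_PERM_CHECK };

/* Return 0 if the key may proceed to the final permission check, or a
 * negative error.  "partial" models KEY_LOOKUP_PARTIAL and
 * "uninstantiated" models key_read_state() == KEY_IS_UNINSTANTIATED. */
static int pre_perm_checks(enum lookup_need need, bool partial,
			   bool uninstantiated)
{
	if (need != LOOKUP_UNLINK) {
		if (partial && need != LOOKUP_DEFER_PERM_CHECK) {
			/* the kernel would run key_validate(key) here */
		}
		if (!partial && uninstantiated)
			return -5;	/* stands in for -EIO */
	}
	return 0;	/* on to key_task_permission() */
}
```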
+2 -2
security/keys/request_key.c
···
 		goto key_already_present;
 
 	if (dest_keyring)
-		__key_link(key, &edit);
+		__key_link(dest_keyring, key, &edit);
 
 	mutex_unlock(&key_construction_mutex);
 	if (dest_keyring)
···
 	if (dest_keyring) {
 		ret = __key_link_check_live_key(dest_keyring, key);
 		if (ret == 0)
-			__key_link(key, &edit);
+			__key_link(dest_keyring, key, &edit);
 		__key_link_end(dest_keyring, &ctx->index_key, edit);
 		if (ret < 0)
 			goto link_check_failed;
+19 -3
security/security.c
···
 }
 EXPORT_SYMBOL(security_inode_getsecctx);
 
+#ifdef CONFIG_WATCH_QUEUE
+int security_post_notification(const struct cred *w_cred,
+			       const struct cred *cred,
+			       struct watch_notification *n)
+{
+	return call_int_hook(post_notification, 0, w_cred, cred, n);
+}
+#endif /* CONFIG_WATCH_QUEUE */
+
+#ifdef CONFIG_KEY_NOTIFICATIONS
+int security_watch_key(struct key *key)
+{
+	return call_int_hook(watch_key, 0, key);
+}
+#endif
+
 #ifdef CONFIG_SECURITY_NETWORK
 
 int security_unix_stream_connect(struct sock *sock, struct sock *other, struct sock *newsk)
···
 	call_void_hook(key_free, key);
 }
 
-int security_key_permission(key_ref_t key_ref,
-			    const struct cred *cred, unsigned perm)
+int security_key_permission(key_ref_t key_ref, const struct cred *cred,
+			    enum key_need_perm need_perm)
 {
-	return call_int_hook(key_permission, 0, key_ref, cred, perm);
+	return call_int_hook(key_permission, 0, key_ref, cred, need_perm);
 }
 
 int security_key_getsecurity(struct key *key, char **_buffer)
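Both new entry points follow the kernel's `call_int_hook()` pattern: walk the registered LSM hooks for that slot, return the first non-zero verdict, and fall back to a default (0, i.e. allow) if every hook passes. A minimal plain-C sketch of that dispatch (not the kernel implementation; the function-pointer list below is a stand-in for the real hook list machinery):

```c
#include <stddef.h>

typedef int (*watch_key_fn)(int key_id);

static int hook_allow(int key_id) { (void)key_id; return 0; }
static int hook_deny(int key_id)  { (void)key_id; return -13; /* ~ -EACCES */ }

/* Equivalent of call_int_hook(watch_key, 0, key): default verdict 0,
 * first non-zero return short-circuits the walk. */
static int call_watch_key_hooks(const watch_key_fn *hooks, size_t n,
				int key_id)
{
	for (size_t i = 0; i < n; i++) {
		int rc = hooks[i](key_id);
		if (rc != 0)
			return rc;
	}
	return 0;
}
```

This is why stacked LSMs compose naturally here: SELinux and Smack can both register `watch_key`, and either one can veto the watch.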
+44 -7
security/selinux/hooks.c
···
 
 static int selinux_key_permission(key_ref_t key_ref,
 				  const struct cred *cred,
-				  unsigned perm)
+				  enum key_need_perm need_perm)
 {
 	struct key *key;
 	struct key_security_struct *ksec;
-	u32 sid;
+	u32 perm, sid;
 
-	/* if no specific permissions are requested, we skip the
-	   permission check. No serious, additional covert channels
-	   appear to be created. */
-	if (perm == 0)
+	switch (need_perm) {
+	case KEY_NEED_VIEW:
+		perm = KEY__VIEW;
+		break;
+	case KEY_NEED_READ:
+		perm = KEY__READ;
+		break;
+	case KEY_NEED_WRITE:
+		perm = KEY__WRITE;
+		break;
+	case KEY_NEED_SEARCH:
+		perm = KEY__SEARCH;
+		break;
+	case KEY_NEED_LINK:
+		perm = KEY__LINK;
+		break;
+	case KEY_NEED_SETATTR:
+		perm = KEY__SETATTR;
+		break;
+	case KEY_NEED_UNLINK:
+	case KEY_SYSADMIN_OVERRIDE:
+	case KEY_AUTHTOKEN_OVERRIDE:
+	case KEY_DEFER_PERM_CHECK:
 		return 0;
+	default:
+		WARN_ON(1);
+		return -EPERM;
+	}
 
 	sid = cred_sid(cred);
-
 	key = key_ref_to_ptr(key_ref);
 	ksec = key->security;
 
···
 	*_buffer = context;
 	return rc;
 }
+
+#ifdef CONFIG_KEY_NOTIFICATIONS
+static int selinux_watch_key(struct key *key)
+{
+	struct key_security_struct *ksec = key->security;
+	u32 sid = current_sid();
+
+	return avc_has_perm(&selinux_state,
+			    sid, ksec->sid, SECCLASS_KEY, KEY__VIEW, NULL);
+}
+#endif
 #endif
 
 #ifdef CONFIG_SECURITY_INFINIBAND
···
 	LSM_HOOK_INIT(key_free, selinux_key_free),
 	LSM_HOOK_INIT(key_permission, selinux_key_permission),
 	LSM_HOOK_INIT(key_getsecurity, selinux_key_getsecurity),
+#ifdef CONFIG_KEY_NOTIFICATIONS
+	LSM_HOOK_INIT(watch_key, selinux_watch_key),
+#endif
 #endif
 
 #ifdef CONFIG_AUDIT
+104 -8
security/smack/smack_lsm.c
···
 #include <linux/parser.h>
 #include <linux/fs_context.h>
 #include <linux/fs_parser.h>
+#include <linux/watch_queue.h>
 #include "smack.h"
 
 #define TRANS_TRUE	"TRUE"
···
  * smack_key_permission - Smack access on a key
  * @key_ref: gets to the object
  * @cred: the credentials to use
- * @perm: requested key permissions
+ * @need_perm: requested key permission
  *
  * Return 0 if the task has read and write to the object,
  * an error code otherwise
  */
 static int smack_key_permission(key_ref_t key_ref,
-				const struct cred *cred, unsigned perm)
+				const struct cred *cred,
+				enum key_need_perm need_perm)
 {
 	struct key *keyp;
 	struct smk_audit_info ad;
···
 	/*
 	 * Validate requested permissions
 	 */
-	if (perm & ~KEY_NEED_ALL)
+	switch (need_perm) {
+	case KEY_NEED_READ:
+	case KEY_NEED_SEARCH:
+	case KEY_NEED_VIEW:
+		request |= MAY_READ;
+		break;
+	case KEY_NEED_WRITE:
+	case KEY_NEED_LINK:
+	case KEY_NEED_SETATTR:
+		request |= MAY_WRITE;
+		break;
+	case KEY_NEED_UNSPECIFIED:
+	case KEY_NEED_UNLINK:
+	case KEY_SYSADMIN_OVERRIDE:
+	case KEY_AUTHTOKEN_OVERRIDE:
+	case KEY_DEFER_PERM_CHECK:
+		return 0;
+	default:
 		return -EINVAL;
+	}
 
 	keyp = key_ref_to_ptr(key_ref);
 	if (keyp == NULL)
···
 	if (tkp == NULL)
 		return -EACCES;
 
-	if (smack_privileged_cred(CAP_MAC_OVERRIDE, cred))
+	if (smack_privileged(CAP_MAC_OVERRIDE))
 		return 0;
 
 #ifdef CONFIG_AUDIT
···
 	ad.a.u.key_struct.key = keyp->serial;
 	ad.a.u.key_struct.key_desc = keyp->description;
 #endif
-	if (perm & (KEY_NEED_READ | KEY_NEED_SEARCH | KEY_NEED_VIEW))
-		request |= MAY_READ;
-	if (perm & (KEY_NEED_WRITE | KEY_NEED_LINK | KEY_NEED_SETATTR))
-		request |= MAY_WRITE;
 	rc = smk_access(tkp, keyp->security, request, &ad);
 	rc = smk_bu_note("key access", tkp, keyp->security, request, rc);
 	return rc;
···
 	return length;
 }
 
+#ifdef CONFIG_KEY_NOTIFICATIONS
+/**
+ * smack_watch_key - Smack access to watch a key for notifications.
+ * @key: The key to be watched
+ *
+ * Return 0 if the @watch->cred has permission to read from the key object and
+ * an error otherwise.
+ */
+static int smack_watch_key(struct key *key)
+{
+	struct smk_audit_info ad;
+	struct smack_known *tkp = smk_of_current();
+	int rc;
+
+	if (key == NULL)
+		return -EINVAL;
+	/*
+	 * If the key hasn't been initialized give it access so that
+	 * it may do so.
+	 */
+	if (key->security == NULL)
+		return 0;
+	/*
+	 * This should not occur
+	 */
+	if (tkp == NULL)
+		return -EACCES;
+
+	if (smack_privileged_cred(CAP_MAC_OVERRIDE, current_cred()))
+		return 0;
+
+#ifdef CONFIG_AUDIT
+	smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_KEY);
+	ad.a.u.key_struct.key = key->serial;
+	ad.a.u.key_struct.key_desc = key->description;
+#endif
+	rc = smk_access(tkp, key->security, MAY_READ, &ad);
+	rc = smk_bu_note("key watch", tkp, key->security, MAY_READ, rc);
+	return rc;
+}
+#endif /* CONFIG_KEY_NOTIFICATIONS */
 #endif /* CONFIG_KEYS */
 
+#ifdef CONFIG_WATCH_QUEUE
+/**
+ * smack_post_notification - Smack access to post a notification to a queue
+ * @w_cred: The credentials of the watcher.
+ * @cred: The credentials of the event source (may be NULL).
+ * @n: The notification message to be posted.
+ */
+static int smack_post_notification(const struct cred *w_cred,
+				   const struct cred *cred,
+				   struct watch_notification *n)
+{
+	struct smk_audit_info ad;
+	struct smack_known *subj, *obj;
+	int rc;
+
+	/* Always let maintenance notifications through. */
+	if (n->type == WATCH_TYPE_META)
+		return 0;
+
+	if (!cred)
+		return 0;
+	subj = smk_of_task(smack_cred(cred));
+	obj = smk_of_task(smack_cred(w_cred));
+
+	smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_NOTIFICATION);
+	rc = smk_access(subj, obj, MAY_WRITE, &ad);
+	rc = smk_bu_note("notification", subj, obj, MAY_WRITE, rc);
+	return rc;
+}
+#endif /* CONFIG_WATCH_QUEUE */
 
 /*
  * Smack Audit hooks
···
 	LSM_HOOK_INIT(key_free, smack_key_free),
 	LSM_HOOK_INIT(key_permission, smack_key_permission),
 	LSM_HOOK_INIT(key_getsecurity, smack_key_getsecurity),
+#ifdef CONFIG_KEY_NOTIFICATIONS
+	LSM_HOOK_INIT(watch_key, smack_watch_key),
+#endif
 #endif /* CONFIG_KEYS */
 
+#ifdef CONFIG_WATCH_QUEUE
+	LSM_HOOK_INIT(post_notification, smack_post_notification),
+#endif
 
 	/* Audit hooks */
 #ifdef CONFIG_AUDIT
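Where SELinux maps each `KEY_NEED_*` value to its own fine-grained `KEY__*` permission, Smack folds them into just two access requests, `MAY_READ` and `MAY_WRITE`, and skips the check entirely for the override/unlink cases. A small userspace sketch of that coarser mapping (not kernel code; the enum and constants are local stand-ins):

```c
enum need { NEED_VIEW, NEED_READ, NEED_SEARCH, NEED_WRITE, NEED_LINK,
	    NEED_SETATTR, NEED_UNLINK };

#define MAY_READ_BIT  0x1
#define MAY_WRITE_BIT 0x2

/* Return the MAY_* request bits for a key-permission request, or 0
 * where Smack skips the check (e.g. unlink, which is only concerned
 * with the keyring, not the key itself). */
static int need_to_request(enum need n)
{
	switch (n) {
	case NEED_VIEW:
	case NEED_READ:
	case NEED_SEARCH:
		return MAY_READ_BIT;
	case NEED_WRITE:
	case NEED_LINK:
	case NEED_SETATTR:
		return MAY_WRITE_BIT;
	default:
		return 0;
	}
}
```

The `smack_watch_key` hook above then reuses the read side of this mapping: watching a key is treated as `MAY_READ` access by the watch setter's label, which is why it audits with `"key watch"` rather than a new access kind.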