Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

pidfd: support PIDFD_NONBLOCK in pidfd_open()

Introduce PIDFD_NONBLOCK to support non-blocking pidfd file descriptors.

Ever since the introduction of pidfds and more advanced async I/O, various
programming languages such as Rust have grown support for async event
libraries. These libraries help build epoll-based event loops around file
descriptors. A common pattern is to automatically set O_NONBLOCK on every
file descriptor they manage.

For such libraries the EAGAIN error code is treated specially. When a function
returns EAGAIN, it isn't called again until the event loop indicates that the
file descriptor is ready. Supporting EAGAIN when waiting on pidfds makes such
libraries just work with little effort. In the following patch we will extend
waitid() internally to support non-blocking pidfds.
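
As a rough illustration of that EAGAIN convention, the sketch below reaps a
child through a non-blocking pidfd from userspace. It assumes a kernel that
carries both this patch and the follow-up waitid() change. The names
reap_via_pidfd() and demo_exit_status() are hypothetical, and the fallback
values (syscall number 434, P_PIDFD 3) are assumptions for older headers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallbacks for older headers; both values are assumptions. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef P_PIDFD
#define P_PIDFD 3
#endif

/*
 * Hypothetical helper: reap the child behind a non-blocking pidfd.
 * Returns 0 and fills *info once the child has exited, -1 on error.
 */
static int reap_via_pidfd(int pidfd, siginfo_t *info)
{
	struct pollfd pfd = { .fd = pidfd, .events = POLLIN };

	for (;;) {
		memset(info, 0, sizeof(*info));
		if (waitid(P_PIDFD, pidfd, info, WEXITED) == 0)
			return 0;
		if (errno == EINTR)
			continue;
		if (errno != EAGAIN)
			return -1;
		/* Child still running: sleep until the pidfd is readable. */
		if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
			return -1;
	}
}

/*
 * Demo: fork a child that exits with `code` and reap it through a
 * non-blocking pidfd. Returns the child's exit status, or -1 when the
 * pidfd could not be opened or the wait failed.
 */
static int demo_exit_status(int code)
{
	siginfo_t info;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);

	int pidfd = (int)syscall(SYS_pidfd_open, pid, O_NONBLOCK);
	if (pidfd < 0) {
		/* No pidfd_open() or no PIDFD_NONBLOCK: reap the classic way. */
		waitpid(pid, NULL, 0);
		return -1;
	}

	int ret = reap_via_pidfd(pidfd, &info);
	close(pidfd);
	if (ret < 0) {
		waitpid(pid, NULL, 0);
		return -1;
	}
	return info.si_status;
}
```

A real event loop would register the pidfd with epoll and retry waitid()
from its readiness callback; the poll() call here only stands in for that.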

This introduces a new flag PIDFD_NONBLOCK that is equivalent to O_NONBLOCK.
This follows the same pattern we have for other (anon inode) file descriptors,
such as EFD_NONBLOCK, IN_NONBLOCK, SFD_NONBLOCK and TFD_NONBLOCK, and for the
corresponding close-on-exec flags.
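
A minimal userspace sketch of the new flag, assuming a kernel with this patch
applied: it opens a pidfd with PIDFD_NONBLOCK through the raw syscall and
checks that O_NONBLOCK shows up in the file status flags. The helper names
open_pidfd_nonblock() and fd_is_nonblocking() are hypothetical, and the
fallback syscall number 434 is an assumption for older headers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older headers; the syscall number is an assumption. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef PIDFD_NONBLOCK
#define PIDFD_NONBLOCK O_NONBLOCK
#endif

/* Hypothetical wrapper: open a non-blocking pidfd for `pid`. */
static int open_pidfd_nonblock(pid_t pid)
{
	return (int)syscall(SYS_pidfd_open, pid, PIDFD_NONBLOCK);
}

/* Returns 1 if O_NONBLOCK is set on fd, 0 if not, -1 if fcntl() fails. */
static int fd_is_nonblocking(int fd)
{
	int fl = fcntl(fd, F_GETFL);

	if (fl < 0)
		return -1;
	return (fl & O_NONBLOCK) ? 1 : 0;
}
```

On a kernel without this patch the call is expected to fail with EINVAL,
since any non-zero flags value was rejected before.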

Suggested-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/lkml/20200811181236.GA18763@localhost/
Link: https://github.com/joshtriplett/async-pidfd
Link: https://lore.kernel.org/r/20200902102130.147672-2-christian.brauner@ubuntu.com

+19 -5
+12
include/uapi/linux/pidfd.h
···
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+#ifndef _UAPI_LINUX_PIDFD_H
+#define _UAPI_LINUX_PIDFD_H
+
+#include <linux/types.h>
+#include <linux/fcntl.h>
+
+/* Flags for pidfd_open(). */
+#define PIDFD_NONBLOCK O_NONBLOCK
+
+#endif /* _UAPI_LINUX_PIDFD_H */
+7 -5
kernel/pid.c
···
 #include <linux/sched/task.h>
 #include <linux/idr.h>
 #include <net/sock.h>
+#include <uapi/linux/pidfd.h>
 
 struct pid init_struct_pid = {
 	.count = REFCOUNT_INIT(1),
···
 /**
  * pidfd_create() - Create a new pid file descriptor.
  *
- * @pid:  struct pid that the pidfd will reference
+ * @pid:   struct pid that the pidfd will reference
+ * @flags: flags to pass
  *
  * This creates a new pid file descriptor with the O_CLOEXEC flag set.
  *
···
  * Return: On success, a cloexec pidfd is returned.
  *         On error, a negative errno number will be returned.
  */
-static int pidfd_create(struct pid *pid)
+static int pidfd_create(struct pid *pid, unsigned int flags)
 {
 	int fd;
 
 	fd = anon_inode_getfd("[pidfd]", &pidfd_fops, get_pid(pid),
-			      O_RDWR | O_CLOEXEC);
+			      flags | O_RDWR | O_CLOEXEC);
 	if (fd < 0)
 		put_pid(pid);
 
···
 	int fd;
 	struct pid *p;
 
-	if (flags)
+	if (flags & ~PIDFD_NONBLOCK)
 		return -EINVAL;
 
 	if (pid <= 0)
···
 		return -ESRCH;
 
 	if (pid_has_task(p, PIDTYPE_TGID))
-		fd = pidfd_create(p);
+		fd = pidfd_create(p, flags);
 	else
 		fd = -EINVAL;
 
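
The flag check that pidfd_open() gains in this diff can be modelled in
userspace as follows. This is only a sketch of the masking idiom, not kernel
code; check_pidfd_open_flags() is a hypothetical name.

```c
#include <errno.h>
#include <fcntl.h>

#ifndef PIDFD_NONBLOCK
#define PIDFD_NONBLOCK O_NONBLOCK
#endif

/*
 * Userspace model of the check in pidfd_open(): returns 0 when only
 * known flag bits are set and -EINVAL otherwise.
 */
static int check_pidfd_open_flags(unsigned int flags)
{
	/* Any bit outside PIDFD_NONBLOCK is unknown and must be rejected. */
	if (flags & ~PIDFD_NONBLOCK)
		return -EINVAL;
	return 0;
}
```

Rejecting unknown bits instead of ignoring them keeps those bits free for
future flags, because no existing caller can already be passing them.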