Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

fork: don't check parent_tidptr with CLONE_PIDFD

Give userspace a cheap and reliable way to tell whether CLONE_PIDFD is
supported by the kernel or not. The easiest way is to store an invalid
file descriptor value in the memory parent_tidptr points to, perform the
syscall and verify that the value has been changed to a valid file
descriptor.

CLONE_PIDFD uses parent_tidptr to return pidfds. CLONE_PARENT_SETTID
uses parent_tidptr to return the tid of the child to the parent. The two
flags cannot be used together. Old kernels that only support
CLONE_PARENT_SETTID will not verify the value pointed to by
parent_tidptr. This behavior is unchanged even with the introduction of
CLONE_PIDFD.
However, if CLONE_PIDFD is specified the kernel will currently check the
value pointed to by parent_tidptr before placing the pidfd in the memory
pointed to. EINVAL will be returned if the value in parent_tidptr is not
0.
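
The checks described above can be modelled in userspace as a rough sketch.
This is not kernel code: model_pidfd_checks() and its old_behavior switch
are hypothetical names used only for illustration, and the -EFAULT path
taken when parent_tidptr cannot be read is left out.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdbool.h>

/* Fallbacks for older libc headers; values match the kernel UAPI. */
#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000
#endif
#ifndef CLONE_DETACHED
#define CLONE_DETACHED 0x00400000
#endif

/*
 * Hypothetical userspace model of the CLONE_PIDFD argument checks.
 * slot_value is the int that parent_tidptr points to on entry.
 * old_behavior selects whether the "must be 0" check is applied.
 * Returns 0 if the arguments pass, or a negative errno value.
 */
static int model_pidfd_checks(unsigned long clone_flags, int slot_value,
			      bool old_behavior)
{
	if (!(clone_flags & CLONE_PIDFD))
		return 0;	/* nothing pidfd-specific to verify */

	/* These flags cannot be combined with CLONE_PIDFD. */
	if (clone_flags &
	    (CLONE_DETACHED | CLONE_PARENT_SETTID | CLONE_THREAD))
		return -EINVAL;

	/* The check under discussion: the slot must hold 0 on entry. */
	if (old_behavior && slot_value != 0)
		return -EINVAL;

	return 0;
}
```

With old_behavior set, a slot preloaded with -1 is rejected with -EINVAL,
so a caller cannot use an invalid fd value as a marker.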

If CLONE_PIDFD is supported and fd 0 is closed, then the returned pidfd
can and likely will be 0, leaving the value pointed to by parent_tidptr
unchanged. This means userspace must either check CLONE_PIDFD support
beforehand or check that fd 0 is not closed when invoking CLONE_PIDFD.
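
The ambiguity comes down to a marker comparison, sketched below. The
helper pidfd_was_written() is a hypothetical name, not an existing API: it
only compares the slot after the call against the value stored before it.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: did the kernel visibly overwrite the slot?
 * marker is the value stored before clone(), slot_after the value
 * found there once clone() has returned in the parent.
 */
static bool pidfd_was_written(int marker, int slot_after)
{
	return slot_after != marker;
}
```

With a marker of 0, a returned pidfd of 0 gives
pidfd_was_written(0, 0) == false, which looks the same as a kernel that
never touched the slot. A marker that can never be a valid fd, such as -1,
does not have this problem: pidfd_was_written(-1, 0) == true.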

The check for pidfd == 0 was introduced during the v5.2 merge window by
commit b3e583825266 ("clone: add CLONE_PIDFD") to ensure that
CLONE_PIDFD could be potentially extended by passing in flags through
the return argument.

However, that extension would look horrible, and with the upcoming
introduction of the clone3 syscall in v5.3 there is no need to extend
the legacy clone syscall this way. (Even if it needed to be extended,
CLONE_DETACHED can be reused with CLONE_PIDFD.)

So remove the pidfd == 0 check. Userspace that needs to be portable to
kernels without CLONE_PIDFD support can then be advised to initialize
pidfd to -1 and check the pidfd value returned by CLONE_PIDFD.
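
A minimal sketch of such a portable probe follows. It assumes the raw
clone syscall argument order used on x86-64 and arm64 (flags, stack,
parent_tidptr, ...); other architectures order the arguments differently.
probe_clone_pidfd() is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000	/* fallback for older libc headers */
#endif

/*
 * Hypothetical probe: returns 1 if the kernel handed back a pidfd,
 * 0 if the slot still holds -1 after a successful clone, and -1 if
 * the clone call itself failed.
 */
static int probe_clone_pidfd(void)
{
	int pidfd = -1;		/* can never be a valid fd */
	pid_t pid;

	/* Assumed argument order: flags, stack, parent_tidptr, then two
	 * arguments that are unused with these flags. */
	pid = (pid_t)syscall(SYS_clone,
			     (unsigned long)(CLONE_PIDFD | SIGCHLD),
			     NULL, &pidfd, NULL, 0UL);
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);	/* child: nothing to do */

	waitpid(pid, NULL, 0);	/* reap the child */

	if (pidfd < 0)
		return 0;	/* slot untouched: no pidfd returned */

	close(pidfd);
	return 1;
}
```

A caller would run the probe once at startup and fall back to pid-based
process management when it returns 0.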

Fixes: b3e583825266 ("clone: add CLONE_PIDFD")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Christian Brauner <christian@brauner.io>

Authored by Dmitry V. Levin and committed by Christian Brauner
9014143b 4b972a01

 kernel/fork.c | 12 ------------
 1 file changed, 12 deletions(-)
···
 	}
 
 	if (clone_flags & CLONE_PIDFD) {
-		int reserved;
-
 		/*
 		 * - CLONE_PARENT_SETTID is useless for pidfds and also
 		 *   parent_tidptr is used to return pidfds.
···
 		 */
 		if (clone_flags &
 		    (CLONE_DETACHED | CLONE_PARENT_SETTID | CLONE_THREAD))
-			return ERR_PTR(-EINVAL);
-
-		/*
-		 * Verify that parent_tidptr is sane so we can potentially
-		 * reuse it later.
-		 */
-		if (get_user(reserved, parent_tidptr))
-			return ERR_PTR(-EFAULT);
-
-		if (reserved != 0)
 			return ERR_PTR(-EINVAL);
 	}
 