Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

namei: LOOKUP_BENEATH: O_BENEATH-like scoped resolution

/* Background. */
There are many circumstances when userspace wants to resolve a path and
ensure that it doesn't go outside of a particular root directory during
resolution. Obvious examples include archive extraction tools, as well as
other security-conscious userspace programs. FreeBSD spun out O_BENEATH
from their Capsicum project[1,2], so it also seems reasonable to
implement similar functionality for Linux.

This is part of a refresh of Al's AT_NO_JUMPS patchset[3] (which was a
variation on David Drysdale's O_BENEATH patchset[4], which in turn was
based on the Capsicum project[5]).

/* Userspace API. */
LOOKUP_BENEATH will be exposed to userspace through openat2(2).
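
As a rough illustration of the userspace side: in kernels that ship openat2(2), the resolution flag corresponding to LOOKUP_BENEATH is named RESOLVE_BENEATH. The sketch below wraps the raw syscall. The helper name open_beneath(), the local struct beneath_open_how (a mirror of struct open_how from <linux/openat2.h>), and the hard-coded fallback values are assumptions made only to keep the example self-contained; real code should include the uapi header instead.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_openat2
#define SYS_openat2 437	/* assumption: generic syscall number (Linux 5.6+) */
#endif

/* Assumed local mirror of struct open_how from <linux/openat2.h>. */
struct beneath_open_how {
	uint64_t flags;		/* O_* flags */
	uint64_t mode;		/* only used with O_CREAT / O_TMPFILE */
	uint64_t resolve;	/* RESOLVE_* flags */
};

#define BENEATH_RESOLVE_FLAG 0x08	/* assumed value of RESOLVE_BENEATH */

/*
 * Hypothetical helper: open `path` read-only relative to `dirfd`, asking the
 * kernel to refuse any resolution that would leave `dirfd`.
 * Returns a new fd on success or a negative errno value on failure.
 */
static int open_beneath(int dirfd, const char *path)
{
	struct beneath_open_how how;
	long fd;

	assert(path != NULL);

	memset(&how, 0, sizeof(how));
	how.flags = O_RDONLY | O_CLOEXEC;
	how.resolve = BENEATH_RESOLVE_FLAG;

	fd = syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
	return fd < 0 ? -errno : (int)fd;
}
```

On a kernel with openat2(2), calling open_beneath(dirfd, "/etc") is expected to fail with -EXDEV regardless of what dirfd refers to, since an absolute path jumps to the real root. Older kernels report -ENOSYS for the unknown syscall.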

/* Semantics. */
Unlike most other LOOKUP flags (most notably LOOKUP_FOLLOW),
LOOKUP_BENEATH applies to all components of the path.

With LOOKUP_BENEATH, any path component which attempts to "escape" the
starting point of the filesystem lookup (the dirfd passed to openat)
will yield -EXDEV. Thus, all absolute paths and absolute symlinks are
disallowed.

Due to a security concern brought up by Jann[6], any ".." path
components are also blocked. This restriction will be lifted in a future
patch, but requires more work to ensure that permitting ".." is done
safely.
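
The purely lexical part of these rules (no leading "/", no ".." component) can be approximated in userspace as a pre-check. The sketch below is only an illustration of the semantics, not the kernel implementation: the helper name beneath_lexical_check() is hypothetical, and because it never touches the filesystem it cannot detect symlinks that escape the starting point.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: returns 0 if `path` passes the lexical part of the
 * rules described above, or -EXDEV if it is absolute or contains a ".."
 * component. It does not look at the filesystem at all.
 */
static int beneath_lexical_check(const char *path)
{
	const char *p = path;

	assert(path != NULL);

	if (*p == '/')
		return -EXDEV;	/* absolute paths jump to the real root */

	while (*p) {
		size_t len = strcspn(p, "/");	/* length of this component */

		if (len == 2 && p[0] == '.' && p[1] == '.')
			return -EXDEV;	/* ".." is blocked outright for now */

		p += len;
		while (*p == '/')	/* skip one or more separators */
			p++;
	}
	return 0;
}
```

A component such as "..b" is accepted, since only an exact ".." component is rejected.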

Magic-link jumps are also blocked, because they can beam the path lookup
across the starting point. It would be possible to detect and block
only the "bad" crossings with path_is_under() checks, but it's unclear
whether it makes sense to permit magic-links at all. Either way,
userspace should pass LOOKUP_NO_MAGICLINKS if it wants to ensure that
magic-link crossing is entirely disabled.
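
To make the idea of a path_is_under()-style check concrete, here is a toy model of an ancestry test. It is a sketch under heavy assumptions: struct toy_node and toy_is_under() are hypothetical names, the tree has no mounts, and nothing is locked, whereas the kernel's path_is_under() has to deal with mount boundaries and concurrent renames.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry: each node only knows its parent (NULL at the top). */
struct toy_node {
	const struct toy_node *parent;
};

/* Returns true if `node` is `root` itself or one of its descendants. */
static bool toy_is_under(const struct toy_node *node, const struct toy_node *root)
{
	assert(root != NULL);

	for (; node != NULL; node = node->parent) {
		if (node == root)
			return true;
	}
	return false;	/* walked to the top without meeting root */
}
```

In this model, a jump that lands on a node for which toy_is_under() is false would be a "bad" crossing; the patch sidesteps the question by rejecting every magic-link jump during a scoped lookup.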

/* Testing. */
LOOKUP_BENEATH is tested as part of the openat2(2) selftests.

[1]: https://reviews.freebsd.org/D2808
[2]: https://reviews.freebsd.org/D17547
[3]: https://lore.kernel.org/lkml/20170429220414.GT29622@ZenIV.linux.org.uk/
[4]: https://lore.kernel.org/lkml/1415094884-18349-1-git-send-email-drysdale@google.com/
[5]: https://lore.kernel.org/lkml/1404124096-21445-1-git-send-email-drysdale@google.com/
[6]: https://lore.kernel.org/lkml/CAG48ez1jzNvxB+bfOBnERFGp=oMM0vHWuLD6EULmne3R6xa53w@mail.gmail.com/

Cc: Christian Brauner <christian.brauner@ubuntu.com>
Suggested-by: David Drysdale <drysdale@google.com>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Authored by Aleksa Sarai and committed by Al Viro
adb21d2b 72ba2929

+78 -6
fs/namei.c (+74 -6)
···

 static bool legitimize_root(struct nameidata *nd)
 {
+	/*
+	 * For scoped-lookups (where nd->root has been zeroed), we need to
+	 * restart the whole lookup from scratch -- because set_root() is wrong
+	 * for these lookups (nd->dfd is the root, not the filesystem root).
+	 */
+	if (!nd->root.mnt && (nd->flags & LOOKUP_IS_SCOPED))
+		return false;
+	/* Nothing to do if nd->root is zero or is managed by the VFS user. */
 	if (!nd->root.mnt || (nd->flags & LOOKUP_ROOT))
 		return true;
 	nd->flags |= LOOKUP_ROOT_GRABBED;
···
 	int status;

 	if (nd->flags & LOOKUP_RCU) {
-		if (!(nd->flags & LOOKUP_ROOT))
+		/*
+		 * We don't want to zero nd->root for scoped-lookups or
+		 * externally-managed nd->root.
+		 */
+		if (!(nd->flags & (LOOKUP_ROOT | LOOKUP_IS_SCOPED)))
 			nd->root.mnt = NULL;
 		if (unlikely(unlazy_walk(nd)))
 			return -ECHILD;
+	}
+
+	if (unlikely(nd->flags & LOOKUP_IS_SCOPED)) {
+		/*
+		 * While the guarantee of LOOKUP_IS_SCOPED is (roughly) "don't
+		 * ever step outside the root during lookup" and should already
+		 * be guaranteed by the rest of namei, we want to avoid a namei
+		 * BUG resulting in userspace being given a path that was not
+		 * scoped within the root at some point during the lookup.
+		 *
+		 * So, do a final sanity-check to make sure that in the
+		 * worst-case scenario (a complete bypass of LOOKUP_IS_SCOPED)
+		 * we won't silently return an fd completely outside of the
+		 * requested root to userspace.
+		 *
+		 * Userspace could move the path outside the root after this
+		 * check, but as discussed elsewhere this is not a concern (the
+		 * resolved file was inside the root at some point).
+		 */
+		if (!path_is_under(&nd->path, &nd->root))
+			return -EXDEV;
 	}

 	if (likely(!(nd->flags & LOOKUP_JUMPED)))
···
 static int set_root(struct nameidata *nd)
 {
 	struct fs_struct *fs = current->fs;
+
+	/*
+	 * Jumping to the real root in a scoped-lookup is a BUG in namei, but we
+	 * still have to ensure it doesn't happen because it will cause a breakout
+	 * from the dirfd.
+	 */
+	if (WARN_ON(nd->flags & LOOKUP_IS_SCOPED))
+		return -ENOTRECOVERABLE;

 	if (nd->flags & LOOKUP_RCU) {
 		unsigned seq;
···

 static int nd_jump_root(struct nameidata *nd)
 {
+	if (unlikely(nd->flags & LOOKUP_BENEATH))
+		return -EXDEV;
 	if (unlikely(nd->flags & LOOKUP_NO_XDEV)) {
 		/* Absolute path arguments to path_init() are allowed. */
 		if (nd->path.mnt != NULL && nd->path.mnt != nd->root.mnt)
···
 		if (nd->path.mnt != path->mnt)
 			goto err;
 	}
+	/* Not currently safe for scoped-lookups. */
+	if (unlikely(nd->flags & LOOKUP_IS_SCOPED))
+		goto err;

 	path_put(&nd->path);
 	nd->path = *path;
···
 	struct inode *inode = nd->inode;

 	while (1) {
-		if (path_equal(&nd->path, &nd->root))
+		if (path_equal(&nd->path, &nd->root)) {
+			if (unlikely(nd->flags & LOOKUP_BENEATH))
+				return -ECHILD;
 			break;
+		}
 		if (nd->path.dentry != nd->path.mnt->mnt_root) {
 			struct dentry *old = nd->path.dentry;
 			struct dentry *parent = old->d_parent;
···

 static int follow_dotdot(struct nameidata *nd)
 {
-	while(1) {
-		if (path_equal(&nd->path, &nd->root))
+	while (1) {
+		if (path_equal(&nd->path, &nd->root)) {
+			if (unlikely(nd->flags & LOOKUP_BENEATH))
+				return -EXDEV;
 			break;
+		}
 		if (nd->path.dentry != nd->path.mnt->mnt_root) {
 			int ret = path_parent_directory(&nd->path);
 			if (ret)
···
 	if (type == LAST_DOTDOT) {
 		int error = 0;

+		/*
+		 * Scoped-lookup flags resolving ".." is not currently safe --
+		 * races can cause our parent to have moved outside of the root
+		 * and us to skip over it.
+		 */
+		if (unlikely(nd->flags & LOOKUP_IS_SCOPED))
+			return -EXDEV;
 		if (!nd->root.mnt) {
 			error = set_root(nd);
 			if (error)
···
 			get_fs_pwd(current->fs, &nd->path);
 			nd->inode = nd->path.dentry->d_inode;
 		}
-		return s;
 	} else {
 		/* Caller must check execute permissions on the starting path component */
 		struct fd f = fdget_raw(nd->dfd);
···
 			nd->inode = nd->path.dentry->d_inode;
 		}
 		fdput(f);
-		return s;
 	}
+	/* For scoped-lookups we need to set the root to the dirfd as well. */
+	if (flags & LOOKUP_IS_SCOPED) {
+		nd->root = nd->path;
+		if (flags & LOOKUP_RCU) {
+			nd->root_seq = nd->seq;
+		} else {
+			path_get(&nd->root);
+			nd->flags |= LOOKUP_ROOT_GRABBED;
+		}
+	}
+	return s;
 }

 static const char *trailing_symlink(struct nameidata *nd)

include/linux/namei.h (+4)
···
 #ifndef _LINUX_NAMEI_H
 #define _LINUX_NAMEI_H

+#include <linux/fs.h>
 #include <linux/kernel.h>
 #include <linux/path.h>
 #include <linux/fcntl.h>
···
 #define LOOKUP_NO_SYMLINKS	0x010000 /* No symlink crossing. */
 #define LOOKUP_NO_MAGICLINKS	0x020000 /* No nd_jump_link() crossing. */
 #define LOOKUP_NO_XDEV		0x040000 /* No mountpoint crossing. */
+#define LOOKUP_BENEATH		0x080000 /* No escaping from starting point. */
+/* LOOKUP_* flags which do scope-related checks based on the dirfd. */
+#define LOOKUP_IS_SCOPED LOOKUP_BENEATH

 extern int path_pts(struct path *path);
