Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

fuse: Add module param for CAP_SYS_ADMIN access bypassing allow_other

Since commit 73f03c2b4b52 ("fuse: Restrict allow_other to the superblock's
namespace or a descendant"), access to allow_other FUSE filesystems has
been limited to users in the mounting user namespace or descendants. This
prevents a process that is privileged in its userns - but not its parent
namespaces - from mounting a FUSE fs with allow_other that is accessible to
processes in parent namespaces.

While this restriction makes sense overall, it breaks a legitimate use case:
I have a tracing daemon which needs to peek into processes' open files in
order to symbolicate - similar to 'perf'. The daemon is a privileged
process in the root userns, but is unable to peek into FUSE filesystems
mounted by processes in child namespaces.

This patch adds a module param, allow_sys_admin_access, to act as an escape
hatch for this descendant userns logic and for the allow_other mount option
in general. Setting allow_sys_admin_access allows processes with
CAP_SYS_ADMIN in the initial userns to access FUSE filesystems irrespective
of the mounting userns or whether allow_other was set. A sysadmin setting
this param must trust the FUSE filesystems on the host not to DoS processes,
as described in 73f03c2b4b52.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
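
To illustrate how the escape hatch described above might be inspected at runtime, here is a minimal userspace sketch in C. It assumes the parameter is exposed at /sys/module/fuse/parameters/allow_sys_admin_access (the usual sysfs location for a module parameter; the module name "fuse" is an assumption, not something the commit states). The helper names parse_bool_param and read_bool_param are hypothetical.

```c
#include <stdio.h>

/* Assumed sysfs location: module parameters normally appear under
 * /sys/module/<module>/parameters/<name>. "fuse" as the module name is an
 * assumption of this sketch. */
#define FUSE_PARAM_PATH "/sys/module/fuse/parameters/allow_sys_admin_access"

/* Bool module parameters read back as "Y\n" or "N\n"; digits are accepted
 * here too for convenience.
 * Returns 1 for enabled, 0 for disabled, -1 for anything unexpected. */
static int parse_bool_param(const char *text)
{
	if (text == NULL)
		return -1;
	switch (text[0]) {
	case 'Y': case 'y': case '1':
		return 1;
	case 'N': case 'n': case '0':
		return 0;
	default:
		return -1;
	}
}

/* Reads the first line of 'path' and interprets it as a bool parameter.
 * Returns 1 or 0 as above, or -1 if the file cannot be read or parsed. */
static int read_bool_param(const char *path)
{
	char buf[8] = "";
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return -1;
	if (fgets(buf, sizeof(buf), f) == NULL)
		buf[0] = '\0';
	fclose(f);
	return parse_bool_param(buf);
}
```

Calling read_bool_param(FUSE_PARAM_PATH) would report the current setting on a kernel that carries this patch. Because the parameter is registered with mode 0644, root should also be able to change it at runtime by writing to the same file.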

Authored by Dave Marchevsky and committed by Miklos Szeredi
9ccf47b2 c6479780

+33 -5
+24 -5
Documentation/filesystems/fuse.rst
···
       the filesystem or not.
 
       Note that the *ptrace* check is not strictly necessary to
-      prevent B/2/i, it is enough to check if mount owner has enough
+      prevent C/2/i, it is enough to check if mount owner has enough
       privilege to send signal to the process accessing the
       filesystem, since *SIGSTOP* can be used to get a similar effect.
 
···
 
 If a sysadmin trusts the users enough, or can ensure through other
 measures, that system processes will never enter non-privileged
-mounts, it can relax the last limitation with a 'user_allow_other'
-config option. If this config option is set, the mounting user can
-add the 'allow_other' mount option which disables the check for other
-users' processes.
+mounts, it can relax the last limitation in several ways:
+
+  - With the 'user_allow_other' config option. If this config option is
+    set, the mounting user can add the 'allow_other' mount option which
+    disables the check for other users' processes.
+
+    User namespaces have an unintuitive interaction with 'allow_other':
+    an unprivileged user - normally restricted from mounting with
+    'allow_other' - could do so in a user namespace where they're
+    privileged. If any process could access such an 'allow_other' mount
+    this would give the mounting user the ability to manipulate
+    processes in user namespaces where they're unprivileged. For this
+    reason 'allow_other' restricts access to users in the same userns
+    or a descendant.
+
+  - With the 'allow_sys_admin_access' module option. If this option is
+    set, super user's processes have unrestricted access to mounts
+    irrespective of allow_other setting or user namespace of the
+    mounting user.
+
+Note that both of these relaxations expose the system to potential
+information leak or *DoS* as described in points B and C/2/i-ii in the
+preceding section.
 
 Kernel - userspace interface
 ============================
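
The "same userns or a descendant" rule in the documentation above can be pictured as a walk up the namespace parent chain. The following is a simplified userspace model in C, not the kernel's implementation: struct userns_model and in_userns_or_descendant are hypothetical names, and the only property modelled is the parent link.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a user namespace: only the parent link matters. */
struct userns_model {
	const struct userns_model *parent; /* NULL for the initial namespace */
};

/* True if 'ns' is 'target' itself or any descendant of 'target'.
 * Walks from 'ns' towards the root and stops on the first match. */
static bool in_userns_or_descendant(const struct userns_model *target,
				    const struct userns_model *ns)
{
	for (; ns != NULL; ns = ns->parent) {
		if (ns == target)
			return true;
	}
	return false;
}
```

In this model, a process in a child or grandchild of the mounting namespace passes the check, while a process in the parent or in a sibling namespace does not. The latter case is the one the tracing daemon in the commit message runs into.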
+9
fs/fuse/dir.c
···
 #include <linux/pagemap.h>
 #include <linux/file.h>
 #include <linux/fs_context.h>
+#include <linux/moduleparam.h>
 #include <linux/sched.h>
 #include <linux/namei.h>
 #include <linux/slab.h>
···
 #include <linux/security.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
+
+static bool __read_mostly allow_sys_admin_access;
+module_param(allow_sys_admin_access, bool, 0644);
+MODULE_PARM_DESC(allow_sys_admin_access,
+	"Allow users with CAP_SYS_ADMIN in initial userns to bypass allow_other access check");
 
 static void fuse_advise_use_readdirplus(struct inode *dir)
 {
···
 int fuse_allow_current_process(struct fuse_conn *fc)
 {
 	const struct cred *cred;
+
+	if (allow_sys_admin_access && capable(CAP_SYS_ADMIN))
+		return 1;
 
 	if (fc->allow_other)
 		return current_in_userns(fc->user_ns);
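
The order of checks in the patched fuse_allow_current_process() can be summarised with a small userspace model in C. This is a sketch under stated assumptions rather than kernel code: struct access_model and model_allow_current_process are hypothetical, each kernel predicate is reduced to a boolean, and the final mount-owner fallback is assumed because that part of the function lies outside the hunk shown.

```c
#include <stdbool.h>

/* Hypothetical flattening of the inputs to the access decision. */
struct access_model {
	bool param_enabled;         /* allow_sys_admin_access module param */
	bool cap_sys_admin_init_ns; /* capable(CAP_SYS_ADMIN) */
	bool allow_other;           /* fc->allow_other */
	bool in_mount_userns;       /* current_in_userns(fc->user_ns) */
	bool is_mount_owner;        /* assumed fallback, not shown in the hunk */
};

static bool model_allow_current_process(const struct access_model *m)
{
	/* New escape hatch: evaluated before any other check. */
	if (m->param_enabled && m->cap_sys_admin_init_ns)
		return true;

	/* allow_other mounts: restricted to the mounting userns or a descendant. */
	if (m->allow_other)
		return m->in_mount_userns;

	/* Assumption: otherwise only the mount owner gets access. */
	return m->is_mount_owner;
}
```

Because the new check comes first, a process with CAP_SYS_ADMIN in the initial userns is admitted whether or not allow_other was set, which matches the commit message's description of the parameter as an escape hatch "for the allow_other mount option in general".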