Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tmpfs,xattr: enable limited user extended attributes

Enable "user." extended attributes on tmpfs, limiting them by tracking
the space they occupy, and deducting that space from the limited ispace
(unless tmpfs mounted with nr_inodes=0 to leave that ispace unlimited).

tmpfs inodes and simple xattrs are both unswappable, and have to be in
lowmem on a 32-bit highmem kernel: so the ispace limit is appropriate
for xattrs, without any need for a further mount option.

Add simple_xattr_space() to give an approximate but deterministic estimate
of the space taken up by each xattr: with simple_xattrs_free() outputting
the space freed if required (but kernfs and even some tmpfs usages do not
require that, so don't waste time on strlen'ing if not needed).
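
As a rough illustration of the estimate described above, the following
userspace sketch mirrors the formula added by this patch: a fixed 40 bytes,
plus the value size, plus the length of the full name. The function name
xattr_space_estimate is hypothetical, chosen so as not to suggest that this
is the kernel function itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace sketch of the estimate made by simple_xattr_space() in this
 * patch. xattr_space_estimate is a hypothetical name for illustration.
 * The constant 40 stands in for sizeof(struct simple_xattr), so that the
 * result is the same on 32-bit and 64-bit builds.
 */
size_t xattr_space_estimate(const char *full_name, size_t value_size)
{
	return 40 + value_size + strlen(full_name);
}
```

For instance, a 5-byte value stored under "user.comment" (12 characters)
would be charged 40 + 5 + 12 = 57 bytes under this formula.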

Security and trusted xattrs were already supported: for consistency and
simplicity, account them from the same pool; though there's a small risk
that a tmpfs with enough space before would now be considered too small.

When extended attributes are used, "df -i" does show more IUsed and less
IFree than can be explained by the inodes: document that (manpage later).
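
To make the "df -i" effect concrete, here is a small arithmetic sketch. It
rests on two assumptions that are not shown in this patch: that each inode
is charged 1024 bytes of ispace (the value believed to be used for
BOGO_INODE_SIZE in mm/shmem.c), and that IFree is reported as the remaining
ispace divided by that per-inode charge. The names sketch_ifree and
SKETCH_BOGO_INODE_SIZE are hypothetical.

```c
#include <stddef.h>

/* Assumption: bytes of ispace charged per inode (BOGO_INODE_SIZE). */
#define SKETCH_BOGO_INODE_SIZE 1024

/*
 * Hypothetical helper: the IFree figure "df -i" might report, assuming it
 * is the remaining ispace divided by the per-inode charge, rounded down.
 */
size_t sketch_ifree(size_t max_inodes, size_t inodes_in_use,
		    size_t xattr_bytes)
{
	size_t total = max_inodes * SKETCH_BOGO_INODE_SIZE;
	size_t used = inodes_in_use * SKETCH_BOGO_INODE_SIZE + xattr_bytes;

	if (used >= total)
		return 0;
	return (total - used) / SKETCH_BOGO_INODE_SIZE;
}
```

Under these assumptions, a mount with nr_inodes=100 and 10 inodes in use
would show 90 free inodes with no xattrs, but 89 as soon as a single
57-byte xattr is stored, because the division rounds down.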

xfstests tests/generic which were not run on tmpfs before but now pass:
020 037 062 070 077 097 103 117 337 377 454 486 523 533 611 618 728
with no new failures.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Message-Id: <2e63b26e-df46-5baa-c7d6-f9a8dd3282c5@google.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

Authored by Hugh Dickins and committed by Christian Brauner
2daf18a7 e07c469e

+106 -16
+5 -2
Documentation/filesystems/tmpfs.rst
···
 fly using a remount ('mount -o remount ...') of the filesystem. A tmpfs
 filesystem can be resized but it cannot be resized to a size below its current
 usage. tmpfs also supports POSIX ACLs, and extended attributes for the
-trusted.* and security.* namespaces. ramfs does not use swap and you cannot
-modify any parameter for a ramfs filesystem. The size limit of a ramfs
+trusted.*, security.* and user.* namespaces. ramfs does not use swap and you
+cannot modify any parameter for a ramfs filesystem. The size limit of a ramfs
 filesystem is how much memory you have available, and so care must be taken if
 used so to not run out of memory.
 
···
 mount with such options, since it allows any user with write access to
 use up all the memory on the machine; but enhances the scalability of
 that instance in a system with many CPUs making intensive use of it.
+
+If nr_inodes is not 0, that limited space for inodes is also used up by
+extended attributes: "df -i"'s IUsed and IUse% increase, IFree decreases.
 
 tmpfs blocks may be swapped out, when there is a shortage of memory.
 tmpfs has a mount option to disable its use of swap:
+2 -2
fs/Kconfig
···
 	  Extended attributes are name:value pairs associated with inodes by
 	  the kernel or by users (see the attr(5) manual page for details).
 
-	  Currently this enables support for the trusted.* and
-	  security.* namespaces.
+	  This enables support for the trusted.*, security.* and user.*
+	  namespaces.
 
 	  You need this for POSIX ACL support on tmpfs.
 
+1 -1
fs/kernfs/dir.c
···
 	kfree_const(kn->name);
 
 	if (kn->iattr) {
-		simple_xattrs_free(&kn->iattr->xattrs);
+		simple_xattrs_free(&kn->iattr->xattrs, NULL);
 		kmem_cache_free(kernfs_iattrs_cache, kn->iattr);
 	}
 	spin_lock(&kernfs_idr_lock);
+27 -1
fs/xattr.c
···
 EXPORT_SYMBOL(xattr_full_name);
 
 /**
+ * simple_xattr_space - estimate the memory used by a simple xattr
+ * @name: the full name of the xattr
+ * @size: the size of its value
+ *
+ * This takes no account of how much larger the two slab objects actually are:
+ * that would depend on the slab implementation, when what is required is a
+ * deterministic number, which grows with name length and size and quantity.
+ *
+ * Return: The approximate number of bytes of memory used by such an xattr.
+ */
+size_t simple_xattr_space(const char *name, size_t size)
+{
+	/*
+	 * Use "40" instead of sizeof(struct simple_xattr), to return the
+	 * same result on 32-bit and 64-bit, and even if simple_xattr grows.
+	 */
+	return 40 + size + strlen(name);
+}
+
+/**
  * simple_xattr_free - free an xattr object
  * @xattr: the xattr object
  *
···
 /**
  * simple_xattrs_free - free xattrs
  * @xattrs: xattr header whose xattrs to destroy
+ * @freed_space: approximate number of bytes of memory freed from @xattrs
  *
  * Destroy all xattrs in @xattr. When this is called no one can hold a
  * reference to any of the xattrs anymore.
  */
-void simple_xattrs_free(struct simple_xattrs *xattrs)
+void simple_xattrs_free(struct simple_xattrs *xattrs, size_t *freed_space)
 {
 	struct rb_node *rbp;
 
+	if (freed_space)
+		*freed_space = 0;
 	rbp = rb_first(&xattrs->rb_root);
 	while (rbp) {
 		struct simple_xattr *xattr;
···
 		rbp_next = rb_next(rbp);
 		xattr = rb_entry(rbp, struct simple_xattr, rb_node);
 		rb_erase(&xattr->rb_node, &xattrs->rb_root);
+		if (freed_space)
+			*freed_space += simple_xattr_space(xattr->name,
+							   xattr->size);
 		simple_xattr_free(xattr);
 		rbp = rbp_next;
 	}
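
The optional out-parameter added to simple_xattrs_free() can be sketched in
userspace as follows. This is only an illustration of the pattern, not the
kernel code: it uses a singly linked list instead of the rbtree, and the
names sketch_xattr, sketch_add and sketch_free_all are hypothetical. The
point it tries to show is that the strlen() work is skipped entirely when
the caller passes NULL, as kernfs does.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct simple_xattr, kept on a simple list. */
struct sketch_xattr {
	struct sketch_xattr *next;
	char *name;
	size_t size;
};

/* Prepend a record; returns NULL (leaving head untouched) on failure. */
struct sketch_xattr *sketch_add(struct sketch_xattr *head, const char *name,
				size_t size)
{
	struct sketch_xattr *x = malloc(sizeof(*x));

	if (!x)
		return NULL;
	x->name = malloc(strlen(name) + 1);
	if (!x->name) {
		free(x);
		return NULL;
	}
	strcpy(x->name, name);
	x->size = size;
	x->next = head;
	return x;
}

/* Free the whole list; estimate the freed space only if the caller asks. */
void sketch_free_all(struct sketch_xattr *head, size_t *freed_space)
{
	if (freed_space)
		*freed_space = 0;
	while (head) {
		struct sketch_xattr *next = head->next;

		if (freed_space)
			*freed_space += 40 + head->size + strlen(head->name);
		free(head->name);
		free(head);
		head = next;
	}
}
```

A caller that tracks a space limit would pass the address of a size_t and
return the result to its pool; a caller with no limit would pass NULL.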
+2 -1
include/linux/xattr.h
···
 };
 
 void simple_xattrs_init(struct simple_xattrs *xattrs);
-void simple_xattrs_free(struct simple_xattrs *xattrs);
+void simple_xattrs_free(struct simple_xattrs *xattrs, size_t *freed_space);
+size_t simple_xattr_space(const char *name, size_t size);
 struct simple_xattr *simple_xattr_alloc(const void *value, size_t size);
 void simple_xattr_free(struct simple_xattr *xattr);
 int simple_xattr_get(struct simple_xattrs *xattrs, const char *name,
+69 -9
mm/shmem.c
···
 	return 0;
 }
 
-static void shmem_free_inode(struct super_block *sb)
+static void shmem_free_inode(struct super_block *sb, size_t freed_ispace)
 {
 	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
 	if (sbinfo->max_inodes) {
 		raw_spin_lock(&sbinfo->stat_lock);
-		sbinfo->free_ispace += BOGO_INODE_SIZE;
+		sbinfo->free_ispace += BOGO_INODE_SIZE + freed_ispace;
 		raw_spin_unlock(&sbinfo->stat_lock);
 	}
 }
···
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
+	size_t freed = 0;
 
 	if (shmem_mapping(inode->i_mapping)) {
 		shmem_unacct_size(info->flags, inode->i_size);
···
 		}
 	}
 
-	simple_xattrs_free(&info->xattrs);
+	simple_xattrs_free(&info->xattrs, sbinfo->max_inodes ? &freed : NULL);
+	shmem_free_inode(inode->i_sb, freed);
 	WARN_ON(inode->i_blocks);
-	shmem_free_inode(inode->i_sb);
 	clear_inode(inode);
 #ifdef CONFIG_TMPFS_QUOTA
 	dquot_free_inode(inode);
···
 	inode = new_inode(sb);
 
 	if (!inode) {
-		shmem_free_inode(sb);
+		shmem_free_inode(sb, 0);
 		return ERR_PTR(-ENOSPC);
 	}
 
···
 	ret = simple_offset_add(shmem_get_offset_ctx(dir), dentry);
 	if (ret) {
 		if (inode->i_nlink)
-			shmem_free_inode(inode->i_sb);
+			shmem_free_inode(inode->i_sb, 0);
 		goto out;
 	}
 
···
 	struct inode *inode = d_inode(dentry);
 
 	if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
-		shmem_free_inode(inode->i_sb);
+		shmem_free_inode(inode->i_sb, 0);
 
 	simple_offset_remove(shmem_get_offset_ctx(dir), dentry);
 
···
 				  void *fs_info)
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 	const struct xattr *xattr;
 	struct simple_xattr *new_xattr;
+	size_t ispace = 0;
 	size_t len;
+
+	if (sbinfo->max_inodes) {
+		for (xattr = xattr_array; xattr->name != NULL; xattr++) {
+			ispace += simple_xattr_space(xattr->name,
+				xattr->value_len + XATTR_SECURITY_PREFIX_LEN);
+		}
+		if (ispace) {
+			raw_spin_lock(&sbinfo->stat_lock);
+			if (sbinfo->free_ispace < ispace)
+				ispace = 0;
+			else
+				sbinfo->free_ispace -= ispace;
+			raw_spin_unlock(&sbinfo->stat_lock);
+			if (!ispace)
+				return -ENOSPC;
+		}
+	}
 
 	for (xattr = xattr_array; xattr->name != NULL; xattr++) {
 		new_xattr = simple_xattr_alloc(xattr->value, xattr->value_len);
 		if (!new_xattr)
-			return -ENOMEM;
+			break;
 
 		len = strlen(xattr->name) + 1;
 		new_xattr->name = kmalloc(XATTR_SECURITY_PREFIX_LEN + len,
 					  GFP_KERNEL);
 		if (!new_xattr->name) {
 			kvfree(new_xattr);
-			return -ENOMEM;
+			break;
 		}
 
 		memcpy(new_xattr->name, XATTR_SECURITY_PREFIX,
···
 		       xattr->name, len);
 
 		simple_xattr_add(&info->xattrs, new_xattr);
+	}
+
+	if (xattr->name != NULL) {
+		if (ispace) {
+			raw_spin_lock(&sbinfo->stat_lock);
+			sbinfo->free_ispace += ispace;
+			raw_spin_unlock(&sbinfo->stat_lock);
+		}
+		simple_xattrs_free(&info->xattrs, NULL);
+		return -ENOMEM;
 	}
 
 	return 0;
···
 				   size_t size, int flags)
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 	struct simple_xattr *old_xattr;
+	size_t ispace = 0;
 
 	name = xattr_full_name(handler, name);
+	if (value && sbinfo->max_inodes) {
+		ispace = simple_xattr_space(name, size);
+		raw_spin_lock(&sbinfo->stat_lock);
+		if (sbinfo->free_ispace < ispace)
+			ispace = 0;
+		else
+			sbinfo->free_ispace -= ispace;
+		raw_spin_unlock(&sbinfo->stat_lock);
+		if (!ispace)
+			return -ENOSPC;
+	}
+
 	old_xattr = simple_xattr_set(&info->xattrs, name, value, size, flags);
 	if (!IS_ERR(old_xattr)) {
+		ispace = 0;
+		if (old_xattr && sbinfo->max_inodes)
+			ispace = simple_xattr_space(old_xattr->name,
+						    old_xattr->size);
 		simple_xattr_free(old_xattr);
 		old_xattr = NULL;
 		inode->i_ctime = current_time(inode);
 		inode_inc_iversion(inode);
+	}
+	if (ispace) {
+		raw_spin_lock(&sbinfo->stat_lock);
+		sbinfo->free_ispace += ispace;
+		raw_spin_unlock(&sbinfo->stat_lock);
 	}
 	return PTR_ERR(old_xattr);
 }
···
 	.set = shmem_xattr_handler_set,
 };
 
+static const struct xattr_handler shmem_user_xattr_handler = {
+	.prefix = XATTR_USER_PREFIX,
+	.get = shmem_xattr_handler_get,
+	.set = shmem_xattr_handler_set,
+};
+
 static const struct xattr_handler *shmem_xattr_handlers[] = {
 	&shmem_security_xattr_handler,
 	&shmem_trusted_xattr_handler,
+	&shmem_user_xattr_handler,
 	NULL
 };
 
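
The accounting in shmem_xattr_handler_set() follows a reserve-then-refund
pattern: space for the new value is taken from the pool before the xattr is
set, and the space of any xattr it replaced is returned afterwards. A
minimal single-threaded userspace sketch of that pattern is given here. The
names sketch_pool, sketch_space, sketch_reserve and sketch_replace are
hypothetical, and the stat_lock locking used in the patch is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the free_ispace counter in shmem_sb_info. */
struct sketch_pool {
	size_t free_ispace;
};

/* Same estimate as simple_xattr_space() in this patch. */
static size_t sketch_space(const char *name, size_t size)
{
	return 40 + size + strlen(name);
}

/* Reserve space up front; fail with -ENOSPC if the pool is too small. */
int sketch_reserve(struct sketch_pool *pool, const char *name, size_t size)
{
	size_t need = sketch_space(name, size);

	if (pool->free_ispace < need)
		return -ENOSPC;
	pool->free_ispace -= need;
	return 0;
}

/*
 * Replace an existing value: charge the new value first, then refund the
 * old one. No locking here; the kernel code holds stat_lock around each
 * update of the counter.
 */
int sketch_replace(struct sketch_pool *pool, const char *name,
		   size_t old_size, size_t new_size)
{
	int err = sketch_reserve(pool, name, new_size);

	if (err)
		return err;
	pool->free_ispace += sketch_space(name, old_size);
	return 0;
}
```

One consequence of this ordering, at least in the sketch, is that replacing
a value needs enough free space for the new value even though the old one
is about to be refunded, so a replace can fail with -ENOSPC on a nearly
full pool.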