Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:
"This work from Amir adds NFS export capability to overlayfs. NFS
exporting an overlay filesystem is a challenge because we want to keep
track of any copy-up of a file or directory between encoding the file
handle and decoding it.

This is achieved by indexing copied up objects by lower layer file
handle. The index is already used for hard links; this patchset
extends the use to NFS file handle decoding"
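
The documentation added by this merge describes the index entry name as the hexadecimal representation of the copy up origin file handle. A minimal userspace sketch of that naming step follows. The function name fh_to_index_name, the lowercase digits, and the buffer handling are assumptions made for illustration; this is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): write the hex form of the
 * file handle bytes fh[0..len) into out as a NUL-terminated string.
 * Returns the number of characters written, or -1 if out is too small.
 * Lowercase digits are an assumption of this sketch.
 */
static int fh_to_index_name(const unsigned char *fh, size_t len,
			    char *out, size_t out_len)
{
	static const char digits[] = "0123456789abcdef";
	size_t i;

	assert(fh != NULL && out != NULL);

	/* Two hex characters per byte plus the terminating NUL */
	if (out_len < 2 * len + 1)
		return -1;

	for (i = 0; i < len; i++) {
		out[2 * i] = digits[fh[i] >> 4];
		out[2 * i + 1] = digits[fh[i] & 0x0f];
	}
	out[2 * len] = '\0';

	return (int)(2 * len);
}
```

Because the name is derived only from the handle bytes, the same lower object maps to the same index name before and after copy up, which is what lets a decoder look the entry up by name.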

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (51 commits)
ovl: check ERR_PTR() return value from ovl_encode_fh()
ovl: fix regression in fsnotify of overlay merge dir
ovl: wire up NFS export operations
ovl: lookup indexed ancestor of lower dir
ovl: lookup connected ancestor of dir in inode cache
ovl: hash non-indexed dir by upper inode for NFS export
ovl: decode pure lower dir file handles
ovl: decode indexed dir file handles
ovl: decode lower file handles of unlinked but open files
ovl: decode indexed non-dir file handles
ovl: decode lower non-dir file handles
ovl: encode lower file handles
ovl: copy up before encoding non-connectable dir file handle
ovl: encode non-indexed upper file handles
ovl: decode connected upper dir file handles
ovl: decode pure upper file handles
ovl: encode pure upper file handles
ovl: document NFS export
vfs: factor out helpers d_instantiate_anon() and d_alloc_anon()
ovl: store 'has_upper' and 'opaque' as bit flags
...

+1910 -414
+103 -3
Documentation/filesystems/overlayfs.txt
···
 Redirects are not created and not followed (equivalent to "redirect_dir=off"
 if "redirect_always_follow" feature is not enabled).
 
+When the NFS export feature is enabled, every copied up directory is
+indexed by the file handle of the lower inode and a file handle of the
+upper directory is stored in a "trusted.overlay.upper" extended attribute
+on the index entry. On lookup of a merged directory, if the upper
+directory does not match the file handle stores in the index, that is an
+indication that multiple upper directories may be redirected to the same
+lower directory. In that case, lookup returns an error and warns about
+a possible inconsistency.
+
+Because lower layer redirects cannot be verified with the index, enabling
+NFS export support on an overlay filesystem with no upper layer requires
+turning off redirect follow (e.g. "redirect_dir=nofollow").
+
+
 Non-directories
 ---------------
 
···
 
 Any open files referring to this inode will access the old data.
 
-If a file with multiple hard links is copied up, then this will
-"break" the link. Changes will not be propagated to other names
-referring to the same inode.
+Unless "inode index" feature is enabled, if a file with multiple hard
+links is copied up, then this will "break" the link. Changes will not be
+propagated to other names referring to the same inode.
 
 Unless "redirect_dir" feature is enabled, rename(2) on a lower or merged
 directory will fail with EXDEV.
···
 filesystem are not allowed. If the underlying filesystem is changed,
 the behavior of the overlay is undefined, though it will not result in
 a crash or deadlock.
+
+When the overlay NFS export feature is enabled, overlay filesystems
+behavior on offline changes of the underlying lower layer is different
+than the behavior when NFS export is disabled.
+
+On every copy_up, an NFS file handle of the lower inode, along with the
+UUID of the lower filesystem, are encoded and stored in an extended
+attribute "trusted.overlay.origin" on the upper inode.
+
+When the NFS export feature is enabled, a lookup of a merged directory,
+that found a lower directory at the lookup path or at the path pointed
+to by the "trusted.overlay.redirect" extended attribute, will verify
+that the found lower directory file handle and lower filesystem UUID
+match the origin file handle that was stored at copy_up time. If a
+found lower directory does not match the stored origin, that directory
+will not be merged with the upper directory.
+
+
+
+NFS export
+----------
+
+When the underlying filesystems supports NFS export and the "nfs_export"
+feature is enabled, an overlay filesystem may be exported to NFS.
+
+With the "nfs_export" feature, on copy_up of any lower object, an index
+entry is created under the index directory. The index entry name is the
+hexadecimal representation of the copy up origin file handle. For a
+non-directory object, the index entry is a hard link to the upper inode.
+For a directory object, the index entry has an extended attribute
+"trusted.overlay.upper" with an encoded file handle of the upper
+directory inode.
+
+When encoding a file handle from an overlay filesystem object, the
+following rules apply:
+
+1. For a non-upper object, encode a lower file handle from lower inode
+2. For an indexed object, encode a lower file handle from copy_up origin
+3. For a pure-upper object and for an existing non-indexed upper object,
+   encode an upper file handle from upper inode
+
+The encoded overlay file handle includes:
+ - Header including path type information (e.g. lower/upper)
+ - UUID of the underlying filesystem
+ - Underlying filesystem encoding of underlying inode
+
+This encoding format is identical to the encoding format file handles that
+are stored in extended attribute "trusted.overlay.origin".
+
+When decoding an overlay file handle, the following steps are followed:
+
+1. Find underlying layer by UUID and path type information.
+2. Decode the underlying filesystem file handle to underlying dentry.
+3. For a lower file handle, lookup the handle in index directory by name.
+4. If a whiteout is found in index, return ESTALE. This represents an
+   overlay object that was deleted after its file handle was encoded.
+5. For a non-directory, instantiate a disconnected overlay dentry from the
+   decoded underlying dentry, the path type and index inode, if found.
+6. For a directory, use the connected underlying decoded dentry, path type
+   and index, to lookup a connected overlay dentry.
+
+Decoding a non-directory file handle may return a disconnected dentry.
+copy_up of that disconnected dentry will create an upper index entry with
+no upper alias.
+
+When overlay filesystem has multiple lower layers, a middle layer
+directory may have a "redirect" to lower directory. Because middle layer
+"redirects" are not indexed, a lower file handle that was encoded from the
+"redirect" origin directory, cannot be used to find the middle or upper
+layer directory. Similarly, a lower file handle that was encoded from a
+descendant of the "redirect" origin directory, cannot be used to
+reconstruct a connected overlay path. To mitigate the cases of
+directories that cannot be decoded from a lower file handle, these
+directories are copied up on encode and encoded as an upper file handle.
+On an overlay filesystem with no upper layer this mitigation cannot be
+used NFS export in this setup requires turning off redirect follow (e.g.
+"redirect_dir=nofollow").
+
+The overlay filesystem does not support non-directory connectable file
+handles, so exporting with the 'subtree_check' exportfs configuration will
+cause failures to lookup files over NFS.
+
+When the NFS export feature is enabled, all directory index entries are
+verified on mount time to check that upper file handles are not stale.
+This verification may cause significant overhead in some cases.
+
 
 Testsuite
 ---------
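
The three encoding rules in the documentation hunk above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct obj_state, enum fh_kind and choose_fh_kind are hypothetical names, not kernel identifiers, and the sketch ignores the case where a lower directory is copied up before encoding because of lower layer redirects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum fh_kind { FH_LOWER, FH_UPPER };

struct obj_state {
	bool has_lower;	/* object has a lower (copy up origin) inode */
	bool has_upper;	/* object exists in the upper layer */
	bool indexed;	/* an index entry exists for the origin */
};

/* Pick the kind of file handle to encode, following rules 1-3 above. */
static enum fh_kind choose_fh_kind(const struct obj_state *o)
{
	assert(o != NULL);

	/* Rule 1: non-upper object, encode from the lower inode */
	if (!o->has_upper)
		return FH_LOWER;

	/* Rule 2: indexed object, encode from the copy up origin */
	if (o->has_lower && o->indexed)
		return FH_LOWER;

	/* Rule 3: pure upper or non-indexed upper, encode from upper inode */
	return FH_UPPER;
}
```

Encoding an indexed object from its origin keeps the handle identical before and after copy up, so a client holding the older handle can still reach the object.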
+61 -35
fs/dcache.c
···
 }
 EXPORT_SYMBOL(d_alloc);
 
+struct dentry *d_alloc_anon(struct super_block *sb)
+{
+	return __d_alloc(sb, NULL);
+}
+EXPORT_SYMBOL(d_alloc_anon);
+
 struct dentry *d_alloc_cursor(struct dentry * parent)
 {
-	struct dentry *dentry = __d_alloc(parent->d_sb, NULL);
+	struct dentry *dentry = d_alloc_anon(parent->d_sb);
 	if (dentry) {
 		dentry->d_flags |= DCACHE_RCUACCESS | DCACHE_DENTRY_CURSOR;
 		dentry->d_parent = dget(parent);
···
 	struct dentry *res = NULL;
 
 	if (root_inode) {
-		res = __d_alloc(root_inode->i_sb, NULL);
+		res = d_alloc_anon(root_inode->i_sb);
 		if (res)
 			d_instantiate(res, root_inode);
 		else
···
 }
 EXPORT_SYMBOL(d_find_any_alias);
 
-static struct dentry *__d_obtain_alias(struct inode *inode, int disconnected)
+static struct dentry *__d_instantiate_anon(struct dentry *dentry,
+					   struct inode *inode,
+					   bool disconnected)
+{
+	struct dentry *res;
+	unsigned add_flags;
+
+	security_d_instantiate(dentry, inode);
+	spin_lock(&inode->i_lock);
+	res = __d_find_any_alias(inode);
+	if (res) {
+		spin_unlock(&inode->i_lock);
+		dput(dentry);
+		goto out_iput;
+	}
+
+	/* attach a disconnected dentry */
+	add_flags = d_flags_for_inode(inode);
+
+	if (disconnected)
+		add_flags |= DCACHE_DISCONNECTED;
+
+	spin_lock(&dentry->d_lock);
+	__d_set_inode_and_type(dentry, inode, add_flags);
+	hlist_add_head(&dentry->d_u.d_alias, &inode->i_dentry);
+	if (!disconnected) {
+		hlist_bl_lock(&dentry->d_sb->s_roots);
+		hlist_bl_add_head(&dentry->d_hash, &dentry->d_sb->s_roots);
+		hlist_bl_unlock(&dentry->d_sb->s_roots);
+	}
+	spin_unlock(&dentry->d_lock);
+	spin_unlock(&inode->i_lock);
+
+	return dentry;
+
+ out_iput:
+	iput(inode);
+	return res;
+}
+
+struct dentry *d_instantiate_anon(struct dentry *dentry, struct inode *inode)
+{
+	return __d_instantiate_anon(dentry, inode, true);
+}
+EXPORT_SYMBOL(d_instantiate_anon);
+
+static struct dentry *__d_obtain_alias(struct inode *inode, bool disconnected)
 {
 	struct dentry *tmp;
 	struct dentry *res;
-	unsigned add_flags;
 
 	if (!inode)
 		return ERR_PTR(-ESTALE);
···
 	if (res)
 		goto out_iput;
 
-	tmp = __d_alloc(inode->i_sb, NULL);
+	tmp = d_alloc_anon(inode->i_sb);
 	if (!tmp) {
 		res = ERR_PTR(-ENOMEM);
 		goto out_iput;
 	}
 
-	security_d_instantiate(tmp, inode);
-	spin_lock(&inode->i_lock);
-	res = __d_find_any_alias(inode);
-	if (res) {
-		spin_unlock(&inode->i_lock);
-		dput(tmp);
-		goto out_iput;
-	}
+	return __d_instantiate_anon(tmp, inode, disconnected);
 
-	/* attach a disconnected dentry */
-	add_flags = d_flags_for_inode(inode);
-
-	if (disconnected)
-		add_flags |= DCACHE_DISCONNECTED;
-
-	spin_lock(&tmp->d_lock);
-	__d_set_inode_and_type(tmp, inode, add_flags);
-	hlist_add_head(&tmp->d_u.d_alias, &inode->i_dentry);
-	if (!disconnected) {
-		hlist_bl_lock(&tmp->d_sb->s_roots);
-		hlist_bl_add_head(&tmp->d_hash, &tmp->d_sb->s_roots);
-		hlist_bl_unlock(&tmp->d_sb->s_roots);
-	}
-	spin_unlock(&tmp->d_lock);
-	spin_unlock(&inode->i_lock);
-
-	return tmp;
-
- out_iput:
+out_iput:
 	iput(inode);
 	return res;
 }
···
  */
 struct dentry *d_obtain_alias(struct inode *inode)
 {
-	return __d_obtain_alias(inode, 1);
+	return __d_obtain_alias(inode, true);
 }
 EXPORT_SYMBOL(d_obtain_alias);
 
···
  */
 struct dentry *d_obtain_root(struct inode *inode)
 {
-	return __d_obtain_alias(inode, 0);
+	return __d_obtain_alias(inode, false);
 }
 EXPORT_SYMBOL(d_obtain_root);
 
···
 
 	return result;
 }
+EXPORT_SYMBOL(is_subdir);
 
 static enum d_walk_ret d_genocide_kill(void *data, struct dentry *dentry)
 {
+25 -6
fs/overlayfs/Kconfig
···
 	  The inodes index feature prevents breaking of lower hardlinks on copy
 	  up.
 
-	  Note, that the inodes index feature is read-only backward compatible.
-	  That is, mounting an overlay which has an index dir on a kernel that
-	  doesn't support this feature read-only, will not have any negative
-	  outcomes. However, mounting the same overlay with an old kernel
-	  read-write and then mounting it again with a new kernel, will have
-	  unexpected results.
+	  Note, that the inodes index feature is not backward compatible.
+	  That is, mounting an overlay which has an inodes index on a kernel
+	  that doesn't support this feature will have unexpected results.
+
+config OVERLAY_FS_NFS_EXPORT
+	bool "Overlayfs: turn on NFS export feature by default"
+	depends on OVERLAY_FS
+	depends on OVERLAY_FS_INDEX
+	help
+	  If this config option is enabled then overlay filesystems will use
+	  the inodes index dir to decode overlay NFS file handles by default.
+	  In this case, it is still possible to turn off NFS export support
+	  globally with the "nfs_export=off" module option or on a filesystem
+	  instance basis with the "nfs_export=off" mount option.
+
+	  The NFS export feature creates an index on copy up of every file and
+	  directory. This full index is used to detect overlay filesystems
+	  inconsistencies on lookup, like redirect from multiple upper dirs to
+	  the same lower dir. The full index may incur some overhead on mount
+	  time, especially when verifying that directory file handles are not
+	  stale.
+
+	  Note, that the NFS export feature is not backward compatible.
+	  That is, mounting an overlay which has a full index on a kernel
+	  that doesn't support this feature will have unexpected results.
+2 -1
fs/overlayfs/Makefile
···
 
 obj-$(CONFIG_OVERLAY_FS) += overlay.o
 
-overlay-objs := super.o namei.o util.o inode.o dir.o readdir.o copy_up.o
+overlay-objs := super.o namei.o util.o inode.o dir.o readdir.o copy_up.o \
+		export.o
+160 -28
fs/overlayfs/copy_up.c
···
 	return err;
 }
 
-struct ovl_fh *ovl_encode_fh(struct dentry *lower, bool is_upper)
+struct ovl_fh *ovl_encode_fh(struct dentry *real, bool is_upper)
 {
 	struct ovl_fh *fh;
 	int fh_type, fh_len, dwords;
 	void *buf;
 	int buflen = MAX_HANDLE_SZ;
-	uuid_t *uuid = &lower->d_sb->s_uuid;
+	uuid_t *uuid = &real->d_sb->s_uuid;
 
 	buf = kmalloc(buflen, GFP_KERNEL);
 	if (!buf)
···
 	 * the price or reconnecting the dentry.
 	 */
 	dwords = buflen >> 2;
-	fh_type = exportfs_encode_fh(lower, buf, &dwords, 0);
+	fh_type = exportfs_encode_fh(real, buf, &dwords, 0);
 	buflen = (dwords << 2);
 
 	fh = ERR_PTR(-EIO);
···
 	return fh;
 }
 
-static int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
-			  struct dentry *upper)
+int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
+		   struct dentry *upper)
 {
 	const struct ovl_fh *fh = NULL;
 	int err;
···
 	return err;
 }
 
+/* Store file handle of @upper dir in @index dir entry */
+static int ovl_set_upper_fh(struct dentry *upper, struct dentry *index)
+{
+	const struct ovl_fh *fh;
+	int err;
+
+	fh = ovl_encode_fh(upper, true);
+	if (IS_ERR(fh))
+		return PTR_ERR(fh);
+
+	err = ovl_do_setxattr(index, OVL_XATTR_UPPER, fh, fh->len, 0);
+
+	kfree(fh);
+	return err;
+}
+
+/*
+ * Create and install index entry.
+ *
+ * Caller must hold i_mutex on indexdir.
+ */
+static int ovl_create_index(struct dentry *dentry, struct dentry *origin,
+			    struct dentry *upper)
+{
+	struct dentry *indexdir = ovl_indexdir(dentry->d_sb);
+	struct inode *dir = d_inode(indexdir);
+	struct dentry *index = NULL;
+	struct dentry *temp = NULL;
+	struct qstr name = { };
+	int err;
+
+	/*
+	 * For now this is only used for creating index entry for directories,
+	 * because non-dir are copied up directly to index and then hardlinked
+	 * to upper dir.
+	 *
+	 * TODO: implement create index for non-dir, so we can call it when
+	 * encoding file handle for non-dir in case index does not exist.
+	 */
+	if (WARN_ON(!d_is_dir(dentry)))
+		return -EIO;
+
+	/* Directory not expected to be indexed before copy up */
+	if (WARN_ON(ovl_test_flag(OVL_INDEX, d_inode(dentry))))
+		return -EIO;
+
+	err = ovl_get_index_name(origin, &name);
+	if (err)
+		return err;
+
+	temp = ovl_lookup_temp(indexdir);
+	if (IS_ERR(temp))
+		goto temp_err;
+
+	err = ovl_do_mkdir(dir, temp, S_IFDIR, true);
+	if (err)
+		goto out;
+
+	err = ovl_set_upper_fh(upper, temp);
+	if (err)
+		goto out_cleanup;
+
+	index = lookup_one_len(name.name, indexdir, name.len);
+	if (IS_ERR(index)) {
+		err = PTR_ERR(index);
+	} else {
+		err = ovl_do_rename(dir, temp, dir, index, 0);
+		dput(index);
+	}
+
+	if (err)
+		goto out_cleanup;
+
+out:
+	dput(temp);
+	kfree(name.name);
+	return err;
+
+temp_err:
+	err = PTR_ERR(temp);
+	temp = NULL;
+	goto out;
+
+out_cleanup:
+	ovl_cleanup(dir, temp);
+	goto out;
+}
+
 struct ovl_copy_up_ctx {
 	struct dentry *parent;
 	struct dentry *dentry;
···
 	struct dentry *workdir;
 	bool tmpfile;
 	bool origin;
+	bool indexed;
 };
 
 static int ovl_link_up(struct ovl_copy_up_ctx *c)
···
 		}
 	}
 	inode_unlock(udir);
-	ovl_set_nlink_upper(c->dentry);
+	if (err)
+		return err;
+
+	err = ovl_set_nlink_upper(c->dentry);
 
 	return err;
 }
···
 	if (err)
 		goto out_cleanup;
 
+	if (S_ISDIR(c->stat.mode) && c->indexed) {
+		err = ovl_create_index(c->dentry, c->lowerpath.dentry, temp);
+		if (err)
+			goto out_cleanup;
+	}
+
 	if (c->tmpfile) {
 		inode_lock_nested(udir, I_MUTEX_PARENT);
 		err = ovl_install_temp(c, temp, &newdentry);
···
 {
 	int err;
 	struct ovl_fs *ofs = c->dentry->d_sb->s_fs_info;
-	bool indexed = false;
+	bool to_index = false;
 
-	if (ovl_indexdir(c->dentry->d_sb) && !S_ISDIR(c->stat.mode) &&
-	    c->stat.nlink > 1)
-		indexed = true;
+	/*
+	 * Indexed non-dir is copied up directly to the index entry and then
+	 * hardlinked to upper dir. Indexed dir is copied up to indexdir,
+	 * then index entry is created and then copied up dir installed.
+	 * Copying dir up to indexdir instead of workdir simplifies locking.
+	 */
+	if (ovl_need_index(c->dentry)) {
+		c->indexed = true;
+		if (S_ISDIR(c->stat.mode))
+			c->workdir = ovl_indexdir(c->dentry->d_sb);
+		else
+			to_index = true;
+	}
 
-	if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || indexed)
+	if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || to_index)
 		c->origin = true;
 
-	if (indexed) {
+	if (to_index) {
 		c->destdir = ovl_indexdir(c->dentry->d_sb);
 		err = ovl_get_index_name(c->lowerpath.dentry, &c->destname);
 		if (err)
 			return err;
+	} else if (WARN_ON(!c->parent)) {
+		/* Disconnected dentry must be copied up to index dir */
+		return -EIO;
 	} else {
 		/*
 		 * Mark parent "impure" because it may now contain non-pure
···
 		}
 	}
 
-	if (indexed) {
-		if (!err)
-			ovl_set_flag(OVL_INDEX, d_inode(c->dentry));
-		kfree(c->destname.name);
-	} else if (!err) {
+
+	if (err)
+		goto out;
+
+	if (c->indexed)
+		ovl_set_flag(OVL_INDEX, d_inode(c->dentry));
+
+	if (to_index) {
+		/* Initialize nlink for copy up of disconnected dentry */
+		err = ovl_set_nlink_upper(c->dentry);
+	} else {
 		struct inode *udir = d_inode(c->destdir);
 
 		/* Restore timestamps on parent (best effort) */
···
 		ovl_dentry_set_upper_alias(c->dentry);
 	}
 
+out:
+	if (to_index)
+		kfree(c->destname.name);
 	return err;
 }
 
···
 	if (err)
 		return err;
 
-	ovl_path_upper(parent, &parentpath);
-	ctx.destdir = parentpath.dentry;
-	ctx.destname = dentry->d_name;
+	if (parent) {
+		ovl_path_upper(parent, &parentpath);
+		ctx.destdir = parentpath.dentry;
+		ctx.destname = dentry->d_name;
 
-	err = vfs_getattr(&parentpath, &ctx.pstat,
-			  STATX_ATIME | STATX_MTIME, AT_STATX_SYNC_AS_STAT);
-	if (err)
-		return err;
+		err = vfs_getattr(&parentpath, &ctx.pstat,
+				  STATX_ATIME | STATX_MTIME,
+				  AT_STATX_SYNC_AS_STAT);
+		if (err)
+			return err;
+	}
 
 	/* maybe truncate regular file. this has no effect on dirs */
 	if (flags & O_TRUNC)
···
 	} else {
 		if (!ovl_dentry_upper(dentry))
 			err = ovl_do_copy_up(&ctx);
-		if (!err && !ovl_dentry_has_upper_alias(dentry))
+		if (!err && parent && !ovl_dentry_has_upper_alias(dentry))
 			err = ovl_link_up(&ctx);
 		ovl_copy_up_end(dentry);
 	}
···
 {
 	int err = 0;
 	const struct cred *old_cred = ovl_override_creds(dentry->d_sb);
+	bool disconnected = (dentry->d_flags & DCACHE_DISCONNECTED);
+
+	/*
+	 * With NFS export, copy up can get called for a disconnected non-dir.
+	 * In this case, we will copy up lower inode to index dir without
+	 * linking it to upper dir.
+	 */
+	if (WARN_ON(disconnected && d_is_dir(dentry)))
+		return -EIO;
 
 	while (!err) {
 		struct dentry *next;
-		struct dentry *parent;
+		struct dentry *parent = NULL;
 
 		/*
 		 * Check if copy-up has happened as well as for upper alias (in
···
 		 * with rename.
 		 */
 		if (ovl_dentry_upper(dentry) &&
-		    ovl_dentry_has_upper_alias(dentry))
+		    (ovl_dentry_has_upper_alias(dentry) || disconnected))
 			break;
 
 		next = dget(dentry);
 		/* find the topmost dentry not yet copied up */
-		for (;;) {
+		for (; !disconnected;) {
 			parent = dget_parent(next);
 
 			if (ovl_dentry_upper(parent))
+87 -88
fs/overlayfs/dir.c
···
 }
 
 /* caller holds i_mutex on workdir */
-static struct dentry *ovl_whiteout(struct dentry *workdir,
-				   struct dentry *dentry)
+static struct dentry *ovl_whiteout(struct dentry *workdir)
 {
 	int err;
 	struct dentry *whiteout;
···
 	}
 
 	return whiteout;
+}
+
+/* Caller must hold i_mutex on both workdir and dir */
+int ovl_cleanup_and_whiteout(struct dentry *workdir, struct inode *dir,
+			     struct dentry *dentry)
+{
+	struct inode *wdir = workdir->d_inode;
+	struct dentry *whiteout;
+	int err;
+	int flags = 0;
+
+	whiteout = ovl_whiteout(workdir);
+	err = PTR_ERR(whiteout);
+	if (IS_ERR(whiteout))
+		return err;
+
+	if (d_is_dir(dentry))
+		flags = RENAME_EXCHANGE;
+
+	err = ovl_do_rename(wdir, whiteout, dir, dentry, flags);
+	if (err)
+		goto kill_whiteout;
+	if (flags)
+		ovl_cleanup(wdir, dentry);
+
+out:
+	dput(whiteout);
+	return err;
+
+kill_whiteout:
+	ovl_cleanup(wdir, whiteout);
+	goto out;
 }
 
 int ovl_create_real(struct inode *dir, struct dentry *newdentry,
···
 static bool ovl_type_origin(struct dentry *dentry)
 {
 	return OVL_TYPE_ORIGIN(ovl_path_type(dentry));
-}
-
-static bool ovl_may_have_whiteouts(struct dentry *dentry)
-{
-	return ovl_test_flag(OVL_WHITEOUTS, d_inode(dentry));
 }
 
 static int ovl_create_upper(struct dentry *dentry, struct inode *inode,
···
 	unlock_rename(workdir, upperdir);
 out:
 	return ERR_PTR(err);
-}
-
-static struct dentry *ovl_check_empty_and_clear(struct dentry *dentry)
-{
-	int err;
-	struct dentry *ret = NULL;
-	LIST_HEAD(list);
-
-	err = ovl_check_empty_dir(dentry, &list);
-	if (err) {
-		ret = ERR_PTR(err);
-		goto out_free;
-	}
-
-	/*
-	 * When removing an empty opaque directory, then it makes no sense to
-	 * replace it with an exact replica of itself.
-	 *
-	 * If upperdentry has whiteouts, clear them.
-	 *
-	 * Can race with copy-up, since we don't hold the upperdir mutex.
-	 * Doesn't matter, since copy-up can't create a non-empty directory
-	 * from an empty one.
-	 */
-	if (!list_empty(&list))
-		ret = ovl_clear_empty(dentry, &list);
-
-out_free:
-	ovl_cache_free(&list);
-
-	return ret;
 }
 
 static int ovl_set_upper_acl(struct dentry *upperdentry, const char *name,
···
 	return d_inode(ovl_dentry_upper(dentry)) == d_inode(upper);
 }
 
-static int ovl_remove_and_whiteout(struct dentry *dentry, bool is_dir)
+static int ovl_remove_and_whiteout(struct dentry *dentry,
+				   struct list_head *list)
 {
 	struct dentry *workdir = ovl_workdir(dentry);
-	struct inode *wdir = workdir->d_inode;
 	struct dentry *upperdir = ovl_dentry_upper(dentry->d_parent);
-	struct inode *udir = upperdir->d_inode;
-	struct dentry *whiteout;
 	struct dentry *upper;
 	struct dentry *opaquedir = NULL;
 	int err;
-	int flags = 0;
 
 	if (WARN_ON(!workdir))
 		return -EROFS;
 
-	if (is_dir) {
-		opaquedir = ovl_check_empty_and_clear(dentry);
+	if (!list_empty(list)) {
+		opaquedir = ovl_clear_empty(dentry, list);
 		err = PTR_ERR(opaquedir);
 		if (IS_ERR(opaquedir))
 			goto out;
···
 		goto out_dput_upper;
 	}
 
-	whiteout = ovl_whiteout(workdir, dentry);
-	err = PTR_ERR(whiteout);
-	if (IS_ERR(whiteout))
-		goto out_dput_upper;
-
-	if (d_is_dir(upper))
-		flags = RENAME_EXCHANGE;
-
-	err = ovl_do_rename(wdir, whiteout, udir, upper, flags);
+	err = ovl_cleanup_and_whiteout(workdir, d_inode(upperdir), upper);
 	if (err)
-		goto kill_whiteout;
-	if (flags)
-		ovl_cleanup(wdir, upper);
+		goto out_d_drop;
 
 	ovl_dentry_version_inc(dentry->d_parent, true);
 out_d_drop:
 	d_drop(dentry);
-	dput(whiteout);
 out_dput_upper:
 	dput(upper);
 out_unlock:
···
 	dput(opaquedir);
 out:
 	return err;
-
-kill_whiteout:
-	ovl_cleanup(wdir, whiteout);
-	goto out_d_drop;
 }
 
-static int ovl_remove_upper(struct dentry *dentry, bool is_dir)
+static int ovl_remove_upper(struct dentry *dentry, bool is_dir,
+			    struct list_head *list)
 {
 	struct dentry *upperdir = ovl_dentry_upper(dentry->d_parent);
 	struct inode *dir = upperdir->d_inode;
···
 	struct dentry *opaquedir = NULL;
 	int err;
 
-	/* Redirect/origin dir can be !ovl_lower_positive && not clean */
-	if (is_dir && (ovl_dentry_get_redirect(dentry) ||
-		       ovl_may_have_whiteouts(dentry))) {
-		opaquedir = ovl_check_empty_and_clear(dentry);
+	if (!list_empty(list)) {
+		opaquedir = ovl_clear_empty(dentry, list);
 		err = PTR_ERR(opaquedir);
 		if (IS_ERR(opaquedir))
 			goto out;
···
 	return err;
 }
 
+static bool ovl_pure_upper(struct dentry *dentry)
+{
+	return !ovl_dentry_lower(dentry) &&
+	       !ovl_test_flag(OVL_WHITEOUTS, d_inode(dentry));
+}
+
 static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 {
 	int err;
 	bool locked = false;
 	const struct cred *old_cred;
+	bool lower_positive = ovl_lower_positive(dentry);
+	LIST_HEAD(list);
+
+	/* No need to clean pure upper removed by vfs_rmdir() */
+	if (is_dir && (lower_positive || !ovl_pure_upper(dentry))) {
+		err = ovl_check_empty_dir(dentry, &list);
+		if (err)
+			goto out;
+	}
 
 	err = ovl_want_write(dentry);
 	if (err)
···
 		goto out_drop_write;
 
 	old_cred = ovl_override_creds(dentry->d_sb);
-	if (!ovl_lower_positive(dentry))
-		err = ovl_remove_upper(dentry, is_dir);
+	if (!lower_positive)
+		err = ovl_remove_upper(dentry, is_dir, &list);
 	else
-		err = ovl_remove_and_whiteout(dentry, is_dir);
+		err = ovl_remove_and_whiteout(dentry, &list);
 	revert_creds(old_cred);
 	if (!err) {
 		if (is_dir)
···
 out_drop_write:
 	ovl_drop_write(dentry);
 out:
+	ovl_cache_free(&list);
 	return err;
 }
 
···
 	bool samedir = olddir == newdir;
 	struct dentry *opaquedir = NULL;
 	const struct cred *old_cred = NULL;
+	LIST_HEAD(list);
 
 	err = -EINVAL;
 	if (flags & ~(RENAME_EXCHANGE | RENAME_NOREPLACE))
···
 		goto out;
 	if (!overwrite && !ovl_can_move(new))
 		goto out;
+
+	if (overwrite && new_is_dir && !ovl_pure_upper(new)) {
+		err = ovl_check_empty_dir(new, &list);
+		if (err)
+			goto out;
+	}
+
+	if (overwrite) {
+		if (ovl_lower_positive(old)) {
+			if (!ovl_dentry_is_whiteout(new)) {
+				/* Whiteout source */
+				flags |= RENAME_WHITEOUT;
+			} else {
+				/* Switch whiteouts */
+				flags |= RENAME_EXCHANGE;
+			}
+		} else if (is_dir && ovl_dentry_is_whiteout(new)) {
+			flags |= RENAME_EXCHANGE;
+			cleanup_whiteout = true;
+		}
+	}
 
 	err = ovl_want_write(old);
 	if (err)
···
 
 	old_cred = ovl_override_creds(old->d_sb);
 
-	if (overwrite && new_is_dir && (ovl_type_merge_or_lower(new) ||
-					ovl_may_have_whiteouts(new))) {
-		opaquedir = ovl_check_empty_and_clear(new);
+	if (!list_empty(&list)) {
+		opaquedir = ovl_clear_empty(new, &list);
 		err = PTR_ERR(opaquedir);
 		if (IS_ERR(opaquedir)) {
 			opaquedir = NULL;
 			goto out_revert_creds;
-		}
-	}
-
-	if (overwrite) {
-		if (ovl_lower_positive(old)) {
-			if (!ovl_dentry_is_whiteout(new)) {
-				/* Whiteout source */
-				flags |= RENAME_WHITEOUT;
-			} else {
-				/* Switch whiteouts */
-				flags |= RENAME_EXCHANGE;
-			}
-		} else if (is_dir && ovl_dentry_is_whiteout(new)) {
-			flags |= RENAME_EXCHANGE;
-			cleanup_whiteout = true;
 		}
 	}
 
···
 	ovl_drop_write(old);
 out:
 	dput(opaquedir);
+	ovl_cache_free(&list);
 	return err;
 }
 
+715
fs/overlayfs/export.c
··· 1 + /* 2 + * Overlayfs NFS export support. 3 + * 4 + * Amir Goldstein <amir73il@gmail.com> 5 + * 6 + * Copyright (C) 2017-2018 CTERA Networks. All Rights Reserved. 7 + * 8 + * This program is free software; you can redistribute it and/or modify it 9 + * under the terms of the GNU General Public License version 2 as published by 10 + * the Free Software Foundation. 11 + */ 12 + 13 + #include <linux/fs.h> 14 + #include <linux/cred.h> 15 + #include <linux/mount.h> 16 + #include <linux/namei.h> 17 + #include <linux/xattr.h> 18 + #include <linux/exportfs.h> 19 + #include <linux/ratelimit.h> 20 + #include "overlayfs.h" 21 + 22 + /* 23 + * We only need to encode origin if there is a chance that the same object was 24 + * encoded pre copy up and then we need to stay consistent with the same 25 + * encoding also after copy up. If non-pure upper is not indexed, then it was 26 + * copied up before NFS export was enabled. In that case we don't need to worry 27 + * about staying consistent with pre copy up encoding and we encode an upper 28 + * file handle. Overlay root dentry is a private case of non-indexed upper. 29 + * 30 + * The following table summarizes the different file handle encodings used for 31 + * different overlay object types: 32 + * 33 + * Object type | Encoding 34 + * -------------------------------- 35 + * Pure upper | U 36 + * Non-indexed upper | U 37 + * Indexed upper | L (*) 38 + * Non-upper | L (*) 39 + * 40 + * U = upper file handle 41 + * L = lower file handle 42 + * 43 + * (*) Connecting an overlay dir from real lower dentry is not always 44 + * possible when there are redirects in lower layers. To mitigate this case, 45 + * we copy up the lower dir first and then encode an upper dir file handle. 
46 + */ 47 + static bool ovl_should_encode_origin(struct dentry *dentry) 48 + { 49 + struct ovl_fs *ofs = dentry->d_sb->s_fs_info; 50 + 51 + if (!ovl_dentry_lower(dentry)) 52 + return false; 53 + 54 + /* 55 + * Decoding a merge dir, whose origin's parent is under a redirected 56 + * lower dir is not always possible. As a simple approximation, we do 57 + * not encode lower dir file handles when overlay has multiple lower 58 + * layers and origin is below the topmost lower layer. 59 + * 60 + * TODO: copy up only the parent that is under redirected lower. 61 + */ 62 + if (d_is_dir(dentry) && ofs->upper_mnt && 63 + OVL_E(dentry)->lowerstack[0].layer->idx > 1) 64 + return false; 65 + 66 + /* Decoding a non-indexed upper from origin is not implemented */ 67 + if (ovl_dentry_upper(dentry) && 68 + !ovl_test_flag(OVL_INDEX, d_inode(dentry))) 69 + return false; 70 + 71 + return true; 72 + } 73 + 74 + static int ovl_encode_maybe_copy_up(struct dentry *dentry) 75 + { 76 + int err; 77 + 78 + if (ovl_dentry_upper(dentry)) 79 + return 0; 80 + 81 + err = ovl_want_write(dentry); 82 + if (err) 83 + return err; 84 + 85 + err = ovl_copy_up(dentry); 86 + 87 + ovl_drop_write(dentry); 88 + return err; 89 + } 90 + 91 + static int ovl_d_to_fh(struct dentry *dentry, char *buf, int buflen) 92 + { 93 + struct dentry *origin = ovl_dentry_lower(dentry); 94 + struct ovl_fh *fh = NULL; 95 + int err; 96 + 97 + /* 98 + * If we should not encode a lower dir file handle, copy up and encode 99 + * an upper dir file handle.
100 + */ 101 + if (!ovl_should_encode_origin(dentry)) { 102 + err = ovl_encode_maybe_copy_up(dentry); 103 + if (err) 104 + goto fail; 105 + 106 + origin = NULL; 107 + } 108 + 109 + /* Encode an upper or origin file handle */ 110 + fh = ovl_encode_fh(origin ?: ovl_dentry_upper(dentry), !origin); 111 + err = PTR_ERR(fh); 112 + if (IS_ERR(fh)) 113 + goto fail; 114 + 115 + err = -EOVERFLOW; 116 + if (fh->len > buflen) 117 + goto fail; 118 + 119 + memcpy(buf, (char *)fh, fh->len); 120 + err = fh->len; 121 + 122 + out: 123 + kfree(fh); 124 + return err; 125 + 126 + fail: 127 + pr_warn_ratelimited("overlayfs: failed to encode file handle (%pd2, err=%i, buflen=%d, len=%d, type=%d)\n", 128 + dentry, err, buflen, fh ? (int)fh->len : 0, 129 + fh ? fh->type : 0); 130 + goto out; 131 + } 132 + 133 + static int ovl_dentry_to_fh(struct dentry *dentry, u32 *fid, int *max_len) 134 + { 135 + int res, len = *max_len << 2; 136 + 137 + res = ovl_d_to_fh(dentry, (char *)fid, len); 138 + if (res <= 0) 139 + return FILEID_INVALID; 140 + 141 + len = res; 142 + 143 + /* Round up to dwords */ 144 + *max_len = (len + 3) >> 2; 145 + return OVL_FILEID; 146 + } 147 + 148 + static int ovl_encode_inode_fh(struct inode *inode, u32 *fid, int *max_len, 149 + struct inode *parent) 150 + { 151 + struct dentry *dentry; 152 + int type; 153 + 154 + /* TODO: encode connectable file handles */ 155 + if (parent) 156 + return FILEID_INVALID; 157 + 158 + dentry = d_find_any_alias(inode); 159 + if (WARN_ON(!dentry)) 160 + return FILEID_INVALID; 161 + 162 + type = ovl_dentry_to_fh(dentry, fid, max_len); 163 + 164 + dput(dentry); 165 + return type; 166 + } 167 + 168 + /* 169 + * Find or instantiate an overlay dentry from real dentries and index. 170 + */ 171 + static struct dentry *ovl_obtain_alias(struct super_block *sb, 172 + struct dentry *upper_alias, 173 + struct ovl_path *lowerpath, 174 + struct dentry *index) 175 + { 176 + struct dentry *lower = lowerpath ? 
lowerpath->dentry : NULL; 177 + struct dentry *upper = upper_alias ?: index; 178 + struct dentry *dentry; 179 + struct inode *inode; 180 + struct ovl_entry *oe; 181 + 182 + /* We get overlay directory dentries with ovl_lookup_real() */ 183 + if (d_is_dir(upper ?: lower)) 184 + return ERR_PTR(-EIO); 185 + 186 + inode = ovl_get_inode(sb, dget(upper), lower, index, !!lower); 187 + if (IS_ERR(inode)) { 188 + dput(upper); 189 + return ERR_CAST(inode); 190 + } 191 + 192 + if (index) 193 + ovl_set_flag(OVL_INDEX, inode); 194 + 195 + dentry = d_find_any_alias(inode); 196 + if (!dentry) { 197 + dentry = d_alloc_anon(inode->i_sb); 198 + if (!dentry) 199 + goto nomem; 200 + oe = ovl_alloc_entry(lower ? 1 : 0); 201 + if (!oe) 202 + goto nomem; 203 + 204 + if (lower) { 205 + oe->lowerstack->dentry = dget(lower); 206 + oe->lowerstack->layer = lowerpath->layer; 207 + } 208 + dentry->d_fsdata = oe; 209 + if (upper_alias) 210 + ovl_dentry_set_upper_alias(dentry); 211 + } 212 + 213 + return d_instantiate_anon(dentry, inode); 214 + 215 + nomem: 216 + iput(inode); 217 + dput(dentry); 218 + return ERR_PTR(-ENOMEM); 219 + } 220 + 221 + /* Get the upper or lower dentry in the stack that is on layer @idx */ 222 + static struct dentry *ovl_dentry_real_at(struct dentry *dentry, int idx) 223 + { 224 + struct ovl_entry *oe = dentry->d_fsdata; 225 + int i; 226 + 227 + if (!idx) 228 + return ovl_dentry_upper(dentry); 229 + 230 + for (i = 0; i < oe->numlower; i++) { 231 + if (oe->lowerstack[i].layer->idx == idx) 232 + return oe->lowerstack[i].dentry; 233 + } 234 + 235 + return NULL; 236 + } 237 + 238 + /* 239 + * Lookup a child overlay dentry to get a connected overlay dentry whose real 240 + * dentry is @real. If @real is on upper layer, we lookup a child overlay 241 + * dentry with the same name as the real dentry. Otherwise, we need to consult 242 + * index for lookup.
243 + */ 244 + static struct dentry *ovl_lookup_real_one(struct dentry *connected, 245 + struct dentry *real, 246 + struct ovl_layer *layer) 247 + { 248 + struct inode *dir = d_inode(connected); 249 + struct dentry *this, *parent = NULL; 250 + struct name_snapshot name; 251 + int err; 252 + 253 + /* 254 + * Lookup child overlay dentry by real name. The dir mutex protects us 255 + * from racing with overlay rename. If the overlay dentry that is above 256 + * real has already been moved to a parent that is not under the 257 + * connected overlay dir, we return -ECHILD and restart the lookup of 258 + * connected real path from the top. 259 + */ 260 + inode_lock_nested(dir, I_MUTEX_PARENT); 261 + err = -ECHILD; 262 + parent = dget_parent(real); 263 + if (ovl_dentry_real_at(connected, layer->idx) != parent) 264 + goto fail; 265 + 266 + /* 267 + * We also need to take a snapshot of real dentry name to protect us 268 + * from racing with underlying layer rename. In this case, we don't 269 + * care about returning ESTALE, only from dereferencing a free name 270 + * pointer because we hold no lock on the real dentry. 
271 + */ 272 + take_dentry_name_snapshot(&name, real); 273 + this = lookup_one_len(name.name, connected, strlen(name.name)); 274 + err = PTR_ERR(this); 275 + if (IS_ERR(this)) { 276 + goto fail; 277 + } else if (!this || !this->d_inode) { 278 + dput(this); 279 + err = -ENOENT; 280 + goto fail; 281 + } else if (ovl_dentry_real_at(this, layer->idx) != real) { 282 + dput(this); 283 + err = -ESTALE; 284 + goto fail; 285 + } 286 + 287 + out: 288 + release_dentry_name_snapshot(&name); 289 + dput(parent); 290 + inode_unlock(dir); 291 + return this; 292 + 293 + fail: 294 + pr_warn_ratelimited("overlayfs: failed to lookup one by real (%pd2, layer=%d, connected=%pd2, err=%i)\n", 295 + real, layer->idx, connected, err); 296 + this = ERR_PTR(err); 297 + goto out; 298 + } 299 + 300 + static struct dentry *ovl_lookup_real(struct super_block *sb, 301 + struct dentry *real, 302 + struct ovl_layer *layer); 303 + 304 + /* 305 + * Lookup an indexed or hashed overlay dentry by real inode. 306 + */ 307 + static struct dentry *ovl_lookup_real_inode(struct super_block *sb, 308 + struct dentry *real, 309 + struct ovl_layer *layer) 310 + { 311 + struct ovl_fs *ofs = sb->s_fs_info; 312 + struct ovl_layer upper_layer = { .mnt = ofs->upper_mnt }; 313 + struct dentry *index = NULL; 314 + struct dentry *this = NULL; 315 + struct inode *inode; 316 + 317 + /* 318 + * Decoding upper dir from index is expensive, so first try to lookup 319 + * overlay dentry in inode/dcache. 320 + */ 321 + inode = ovl_lookup_inode(sb, real, !layer->idx); 322 + if (IS_ERR(inode)) 323 + return ERR_CAST(inode); 324 + if (inode) { 325 + this = d_find_any_alias(inode); 326 + iput(inode); 327 + } 328 + 329 + /* 330 + * For decoded lower dir file handle, lookup index by origin to check 331 + * if lower dir was copied up and/or removed.
332 + */ 333 + if (!this && layer->idx && ofs->indexdir && !WARN_ON(!d_is_dir(real))) { 334 + index = ovl_lookup_index(ofs, NULL, real, false); 335 + if (IS_ERR(index)) 336 + return index; 337 + } 338 + 339 + /* Get connected upper overlay dir from index */ 340 + if (index) { 341 + struct dentry *upper = ovl_index_upper(ofs, index); 342 + 343 + dput(index); 344 + if (IS_ERR_OR_NULL(upper)) 345 + return upper; 346 + 347 + /* 348 + * ovl_lookup_real() in lower layer may call recursively once to 349 + * ovl_lookup_real() in upper layer. The first level call walks 350 + * back lower parents to the topmost indexed parent. The second 351 + * recursive call walks back from indexed upper to the topmost 352 + * connected/hashed upper parent (or up to root). 353 + */ 354 + this = ovl_lookup_real(sb, upper, &upper_layer); 355 + dput(upper); 356 + } 357 + 358 + if (!this) 359 + return NULL; 360 + 361 + if (WARN_ON(ovl_dentry_real_at(this, layer->idx) != real)) { 362 + dput(this); 363 + this = ERR_PTR(-EIO); 364 + } 365 + 366 + return this; 367 + } 368 + 369 + /* 370 + * Lookup an indexed or hashed overlay dentry, whose real dentry is an 371 + * ancestor of @real. 372 + */ 373 + static struct dentry *ovl_lookup_real_ancestor(struct super_block *sb, 374 + struct dentry *real, 375 + struct ovl_layer *layer) 376 + { 377 + struct dentry *next, *parent = NULL; 378 + struct dentry *ancestor = ERR_PTR(-EIO); 379 + 380 + if (real == layer->mnt->mnt_root) 381 + return dget(sb->s_root); 382 + 383 + /* Find the topmost indexed or hashed ancestor */ 384 + next = dget(real); 385 + for (;;) { 386 + parent = dget_parent(next); 387 + 388 + /* 389 + * Lookup a matching overlay dentry in inode/dentry 390 + * cache or in index by real inode. 
391 + */ 392 + ancestor = ovl_lookup_real_inode(sb, next, layer); 393 + if (ancestor) 394 + break; 395 + 396 + if (parent == layer->mnt->mnt_root) { 397 + ancestor = dget(sb->s_root); 398 + break; 399 + } 400 + 401 + /* 402 + * If @real has been moved out of the layer root directory, 403 + * we will eventually hit the real fs root. This cannot happen 404 + * by legit overlay rename, so we return error in that case. 405 + */ 406 + if (parent == next) { 407 + ancestor = ERR_PTR(-EXDEV); 408 + break; 409 + } 410 + 411 + dput(next); 412 + next = parent; 413 + } 414 + 415 + dput(parent); 416 + dput(next); 417 + 418 + return ancestor; 419 + } 420 + 421 + /* 422 + * Lookup a connected overlay dentry whose real dentry is @real. 423 + * If @real is on upper layer, we lookup a child overlay dentry with the same 424 + * path as the real dentry. Otherwise, we need to consult index for lookup. 425 + */ 426 + static struct dentry *ovl_lookup_real(struct super_block *sb, 427 + struct dentry *real, 428 + struct ovl_layer *layer) 429 + { 430 + struct dentry *connected; 431 + int err = 0; 432 + 433 + connected = ovl_lookup_real_ancestor(sb, real, layer); 434 + if (IS_ERR(connected)) 435 + return connected; 436 + 437 + while (!err) { 438 + struct dentry *next, *this; 439 + struct dentry *parent = NULL; 440 + struct dentry *real_connected = ovl_dentry_real_at(connected, 441 + layer->idx); 442 + 443 + if (real_connected == real) 444 + break; 445 + 446 + /* Find the topmost dentry not yet connected */ 447 + next = dget(real); 448 + for (;;) { 449 + parent = dget_parent(next); 450 + 451 + if (parent == real_connected) 452 + break; 453 + 454 + /* 455 + * If real has been moved out of 'real_connected', 456 + * we will not find 'real_connected' and hit the layer 457 + * root. In that case, we need to restart connecting. 458 + * This game can go on forever in the worst case. We 459 + * may want to consider taking s_vfs_rename_mutex if 460 + * this happens more than once.
461 + */ 462 + if (parent == layer->mnt->mnt_root) { 463 + dput(connected); 464 + connected = dget(sb->s_root); 465 + break; 466 + } 467 + 468 + /* 469 + * If real file has been moved out of the layer root 470 + * directory, we will eventually hit the real fs root. 471 + * This cannot happen by legit overlay rename, so we 472 + * return error in that case. 473 + */ 474 + if (parent == next) { 475 + err = -EXDEV; 476 + break; 477 + } 478 + 479 + dput(next); 480 + next = parent; 481 + } 482 + 483 + if (!err) { 484 + this = ovl_lookup_real_one(connected, next, layer); 485 + if (IS_ERR(this)) 486 + err = PTR_ERR(this); 487 + 488 + /* 489 + * Lookup of child in overlay can fail when racing with 490 + * overlay rename of child away from 'connected' parent. 491 + * In this case, we need to restart the lookup from the 492 + * top, because we cannot trust that 'real_connected' is 493 + * still an ancestor of 'real'. There is a good chance 494 + * that the renamed overlay ancestor is now in cache, so 495 + * ovl_lookup_real_ancestor() will find it and we can 496 + * continue to connect exactly from where lookup failed. 497 + */ 498 + if (err == -ECHILD) { 499 + this = ovl_lookup_real_ancestor(sb, real, 500 + layer); 501 + err = IS_ERR(this) ? PTR_ERR(this) : 0; 502 + } 503 + if (!err) { 504 + dput(connected); 505 + connected = this; 506 + } 507 + } 508 + 509 + dput(parent); 510 + dput(next); 511 + } 512 + 513 + if (err) 514 + goto fail; 515 + 516 + return connected; 517 + 518 + fail: 519 + pr_warn_ratelimited("overlayfs: failed to lookup by real (%pd2, layer=%d, connected=%pd2, err=%i)\n", 520 + real, layer->idx, connected, err); 521 + dput(connected); 522 + return ERR_PTR(err); 523 + } 524 + 525 + /* 526 + * Get an overlay dentry from upper/lower real dentries and index.
527 + */ 528 + static struct dentry *ovl_get_dentry(struct super_block *sb, 529 + struct dentry *upper, 530 + struct ovl_path *lowerpath, 531 + struct dentry *index) 532 + { 533 + struct ovl_fs *ofs = sb->s_fs_info; 534 + struct ovl_layer upper_layer = { .mnt = ofs->upper_mnt }; 535 + struct ovl_layer *layer = upper ? &upper_layer : lowerpath->layer; 536 + struct dentry *real = upper ?: (index ?: lowerpath->dentry); 537 + 538 + /* 539 + * Obtain a disconnected overlay dentry from a non-dir real dentry 540 + * and index. 541 + */ 542 + if (!d_is_dir(real)) 543 + return ovl_obtain_alias(sb, upper, lowerpath, index); 544 + 545 + /* Removed empty directory? */ 546 + if ((real->d_flags & DCACHE_DISCONNECTED) || d_unhashed(real)) 547 + return ERR_PTR(-ENOENT); 548 + 549 + /* 550 + * If real dentry is connected and hashed, get a connected overlay 551 + * dentry whose real dentry is @real. 552 + */ 553 + return ovl_lookup_real(sb, real, layer); 554 + } 555 + 556 + static struct dentry *ovl_upper_fh_to_d(struct super_block *sb, 557 + struct ovl_fh *fh) 558 + { 559 + struct ovl_fs *ofs = sb->s_fs_info; 560 + struct dentry *dentry; 561 + struct dentry *upper; 562 + 563 + if (!ofs->upper_mnt) 564 + return ERR_PTR(-EACCES); 565 + 566 + upper = ovl_decode_fh(fh, ofs->upper_mnt); 567 + if (IS_ERR_OR_NULL(upper)) 568 + return upper; 569 + 570 + dentry = ovl_get_dentry(sb, upper, NULL, NULL); 571 + dput(upper); 572 + 573 + return dentry; 574 + } 575 + 576 + static struct dentry *ovl_lower_fh_to_d(struct super_block *sb, 577 + struct ovl_fh *fh) 578 + { 579 + struct ovl_fs *ofs = sb->s_fs_info; 580 + struct ovl_path origin = { }; 581 + struct ovl_path *stack = &origin; 582 + struct dentry *dentry = NULL; 583 + struct dentry *index = NULL; 584 + struct inode *inode = NULL; 585 + bool is_deleted = false; 586 + int err; 587 + 588 + /* First lookup indexed upper by fh */ 589 + if (ofs->indexdir) { 590 + index = ovl_get_index_fh(ofs, fh); 591 + err = PTR_ERR(index); 592 + if 
(IS_ERR(index)) { 593 + if (err != -ESTALE) 594 + return ERR_PTR(err); 595 + 596 + /* Found a whiteout index - treat as deleted inode */ 597 + is_deleted = true; 598 + index = NULL; 599 + } 600 + } 601 + 602 + /* Then try to get upper dir by index */ 603 + if (index && d_is_dir(index)) { 604 + struct dentry *upper = ovl_index_upper(ofs, index); 605 + 606 + err = PTR_ERR(upper); 607 + if (IS_ERR_OR_NULL(upper)) 608 + goto out_err; 609 + 610 + dentry = ovl_get_dentry(sb, upper, NULL, NULL); 611 + dput(upper); 612 + goto out; 613 + } 614 + 615 + /* Then lookup origin by fh */ 616 + err = ovl_check_origin_fh(ofs, fh, NULL, &stack); 617 + if (err) { 618 + goto out_err; 619 + } else if (index) { 620 + err = ovl_verify_origin(index, origin.dentry, false); 621 + if (err) 622 + goto out_err; 623 + } else if (is_deleted) { 624 + /* Lookup deleted non-dir by origin inode */ 625 + if (!d_is_dir(origin.dentry)) 626 + inode = ovl_lookup_inode(sb, origin.dentry, false); 627 + err = -ESTALE; 628 + if (!inode || atomic_read(&inode->i_count) == 1) 629 + goto out_err; 630 + 631 + /* Deleted but still open? */ 632 + index = dget(ovl_i_dentry_upper(inode)); 633 + } 634 + 635 + dentry = ovl_get_dentry(sb, NULL, &origin, index); 636 + 637 + out: 638 + dput(origin.dentry); 639 + dput(index); 640 + iput(inode); 641 + return dentry; 642 + 643 + out_err: 644 + dentry = ERR_PTR(err); 645 + goto out; 646 + } 647 + 648 + static struct dentry *ovl_fh_to_dentry(struct super_block *sb, struct fid *fid, 649 + int fh_len, int fh_type) 650 + { 651 + struct dentry *dentry = NULL; 652 + struct ovl_fh *fh = (struct ovl_fh *) fid; 653 + int len = fh_len << 2; 654 + unsigned int flags = 0; 655 + int err; 656 + 657 + err = -EINVAL; 658 + if (fh_type != OVL_FILEID) 659 + goto out_err; 660 + 661 + err = ovl_check_fh_len(fh, len); 662 + if (err) 663 + goto out_err; 664 + 665 + flags = fh->flags; 666 + dentry = (flags & OVL_FH_FLAG_PATH_UPPER) ? 
667 + ovl_upper_fh_to_d(sb, fh) : 668 + ovl_lower_fh_to_d(sb, fh); 669 + err = PTR_ERR(dentry); 670 + if (IS_ERR(dentry) && err != -ESTALE) 671 + goto out_err; 672 + 673 + return dentry; 674 + 675 + out_err: 676 + pr_warn_ratelimited("overlayfs: failed to decode file handle (len=%d, type=%d, flags=%x, err=%i)\n", 677 + len, fh_type, flags, err); 678 + return ERR_PTR(err); 679 + } 680 + 681 + static struct dentry *ovl_fh_to_parent(struct super_block *sb, struct fid *fid, 682 + int fh_len, int fh_type) 683 + { 684 + pr_warn_ratelimited("overlayfs: connectable file handles not supported; use 'no_subtree_check' exportfs option.\n"); 685 + return ERR_PTR(-EACCES); 686 + } 687 + 688 + static int ovl_get_name(struct dentry *parent, char *name, 689 + struct dentry *child) 690 + { 691 + /* 692 + * ovl_fh_to_dentry() returns connected dir overlay dentries and 693 + * ovl_fh_to_parent() is not implemented, so we should not get here. 694 + */ 695 + WARN_ON_ONCE(1); 696 + return -EIO; 697 + } 698 + 699 + static struct dentry *ovl_get_parent(struct dentry *dentry) 700 + { 701 + /* 702 + * ovl_fh_to_dentry() returns connected dir overlay dentries, so we 703 + * should not get here. 704 + */ 705 + WARN_ON_ONCE(1); 706 + return ERR_PTR(-EIO); 707 + } 708 + 709 + const struct export_operations ovl_export_operations = { 710 + .encode_fh = ovl_encode_inode_fh, 711 + .fh_to_dentry = ovl_fh_to_dentry, 712 + .fh_to_parent = ovl_fh_to_parent, 713 + .get_name = ovl_get_name, 714 + .get_parent = ovl_get_parent, 715 + };
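The encoding table at the top of export.c and the dword rounding in ovl_dentry_to_fh() can be modelled in user space. The following is a minimal sketch, not kernel code: struct toy_obj, toy_encoding(), toy_dwords_to_bytes() and toy_bytes_to_dwords() are hypothetical names introduced only for illustration, and the booleans stand in for the checks that ovl_should_encode_origin() performs on real dentries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of an overlay object; not a kernel structure. */
struct toy_obj {
	bool has_lower;     /* object has a lower (origin) dentry */
	bool has_upper;     /* object has an upper dentry */
	bool indexed;       /* stands in for the OVL_INDEX inode flag */
	bool is_dir;
	bool fs_has_upper;  /* overlay was mounted with an upper layer */
	int top_lower_idx;  /* layer index of lowerstack[0]; 1 is the topmost lower */
};

/* Return 'L' to encode a lower (origin) handle, 'U' for an upper handle. */
static char toy_encoding(const struct toy_obj *o)
{
	if (!o->has_lower)
		return 'U';	/* pure upper */
	if (o->is_dir && o->fs_has_upper && o->top_lower_idx > 1)
		return 'U';	/* dir is copied up first, then encoded as upper */
	if (o->has_upper && !o->indexed)
		return 'U';	/* non-indexed upper */
	return 'L';		/* indexed upper or non-upper */
}

/* Buffer size in bytes for a max_len expressed in 32-bit words. */
static int toy_dwords_to_bytes(int max_len)
{
	return max_len << 2;
}

/* Handle length in bytes, rounded up to whole 32-bit words. */
static int toy_bytes_to_dwords(int len)
{
	assert(len >= 0);
	return (len + 3) >> 2;
}
```

In this model a lower directory whose topmost lower layer index is greater than 1 yields 'U', which corresponds to the copy-up-then-encode path taken by ovl_d_to_fh() when ovl_should_encode_origin() returns false.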
+79 -27
fs/overlayfs/inode.c
··· 105 105 * Lower hardlinks may be broken on copy up to different 106 106 * upper files, so we cannot use the lower origin st_ino 107 107 * for those different files, even for the same fs case. 108 + * 109 + * Similarly, several redirected dirs can point to the 110 + * same dir on a lower layer. With the "verify_lower" 111 + * feature, we do not use the lower origin st_ino, if 112 + * we haven't verified that this redirect is unique. 113 + * 108 114 * With inodes index enabled, it is safe to use st_ino 109 - * of an indexed hardlinked origin. The index validates 110 - * that the upper hardlink is not broken. 115 + * of an indexed origin. The index validates that the 116 + * upper hardlink is not broken and that a redirected 117 + * dir is the only redirect to that origin. 111 118 */ 112 - if (is_dir || lowerstat.nlink == 1 || 113 - ovl_test_flag(OVL_INDEX, d_inode(dentry))) 119 + if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) || 120 + (!ovl_verify_lower(dentry->d_sb) && 121 + (is_dir || lowerstat.nlink == 1))) 114 122 stat->ino = lowerstat.ino; 115 123 116 124 if (samefs) ··· 351 343 352 344 static bool ovl_open_need_copy_up(struct dentry *dentry, int flags) 353 345 { 346 + /* Copy up of disconnected dentry does not set upper alias */ 354 347 if (ovl_dentry_upper(dentry) && 355 - ovl_dentry_has_upper_alias(dentry)) 348 + (ovl_dentry_has_upper_alias(dentry) || 349 + (dentry->d_flags & DCACHE_DISCONNECTED))) 356 350 return false; 357 351 358 352 if (special_file(d_inode(dentry)->i_mode)) ··· 614 604 } 615 605 616 606 static bool ovl_verify_inode(struct inode *inode, struct dentry *lowerdentry, 617 - struct dentry *upperdentry) 607 + struct dentry *upperdentry, bool strict) 618 608 { 609 + /* 610 + * For directories, @strict verify from lookup path performs consistency 611 + * checks, so NULL lower/upper in dentry must match NULL lower/upper in 612 + * inode. Non @strict verify from NFS handle decode path passes NULL for 613 + * 'unknown' lower/upper. 
614 + */ 615 + if (S_ISDIR(inode->i_mode) && strict) { 616 + /* Real lower dir moved to upper layer under us? */ 617 + if (!lowerdentry && ovl_inode_lower(inode)) 618 + return false; 619 + 620 + /* Lookup of an uncovered redirect origin? */ 621 + if (!upperdentry && ovl_inode_upper(inode)) 622 + return false; 623 + } 624 + 619 625 /* 620 626 * Allow non-NULL lower inode in ovl_inode even if lowerdentry is NULL. 621 627 * This happens when finding a copied up overlay inode for a renamed ··· 651 625 return true; 652 626 } 653 627 654 - struct inode *ovl_get_inode(struct dentry *dentry, struct dentry *upperdentry, 655 - struct dentry *index) 628 + struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real, 629 + bool is_upper) 656 630 { 657 - struct dentry *lowerdentry = ovl_dentry_lower(dentry); 631 + struct inode *inode, *key = d_inode(real); 632 + 633 + inode = ilookup5(sb, (unsigned long) key, ovl_inode_test, key); 634 + if (!inode) 635 + return NULL; 636 + 637 + if (!ovl_verify_inode(inode, is_upper ? NULL : real, 638 + is_upper ? real : NULL, false)) { 639 + iput(inode); 640 + return ERR_PTR(-ESTALE); 641 + } 642 + 643 + return inode; 644 + } 645 + 646 + struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry, 647 + struct dentry *lowerdentry, struct dentry *index, 648 + unsigned int numlower) 649 + { 650 + struct ovl_fs *ofs = sb->s_fs_info; 658 651 struct inode *realinode = upperdentry ? d_inode(upperdentry) : NULL; 659 652 struct inode *inode; 660 653 /* Already indexed or could be indexed on copy up? */ 661 - bool indexed = (index || (ovl_indexdir(dentry->d_sb) && !upperdentry)); 654 + bool indexed = (index || (ovl_indexdir(sb) && !upperdentry)); 655 + struct dentry *origin = indexed ? 
lowerdentry : NULL; 656 + bool is_dir; 662 657 663 658 if (WARN_ON(upperdentry && indexed && !lowerdentry)) 664 659 return ERR_PTR(-EIO); ··· 688 641 realinode = d_inode(lowerdentry); 689 642 690 643 /* 691 - * Copy up origin (lower) may exist for non-indexed upper, but we must 692 - * not use lower as hash key in that case. 693 - * Hash inodes that are or could be indexed by origin inode and 694 - * non-indexed upper inodes that could be hard linked by upper inode. 644 + * Copy up origin (lower) may exist for non-indexed non-dir upper, but 645 + * we must not use lower as hash key in that case. 646 + * Hash non-dir that is or could be indexed by origin inode. 647 + * Hash dir that is or could be merged by origin inode. 648 + * Hash pure upper and non-indexed non-dir by upper inode. 649 + * Hash non-indexed dir by upper inode for NFS export. 695 650 */ 696 - if (!S_ISDIR(realinode->i_mode) && (upperdentry || indexed)) { 697 - struct inode *key = d_inode(indexed ? lowerdentry : 698 - upperdentry); 699 - unsigned int nlink; 651 + is_dir = S_ISDIR(realinode->i_mode); 652 + if (is_dir && (indexed || !sb->s_export_op || !ofs->upper_mnt)) 653 + origin = lowerdentry; 700 654 701 - inode = iget5_locked(dentry->d_sb, (unsigned long) key, 655 + if (upperdentry || origin) { 656 + struct inode *key = d_inode(origin ?: upperdentry); 657 + unsigned int nlink = is_dir ? 1 : realinode->i_nlink; 658 + 659 + inode = iget5_locked(sb, (unsigned long) key, 702 660 ovl_inode_test, ovl_inode_set, key); 703 661 if (!inode) 704 662 goto out_nomem; ··· 712 660 * Verify that the underlying files stored in the inode 713 661 * match those in the dentry. 
714 662 */ 715 - if (!ovl_verify_inode(inode, lowerdentry, upperdentry)) { 663 + if (!ovl_verify_inode(inode, lowerdentry, upperdentry, 664 + true)) { 716 665 iput(inode); 717 666 inode = ERR_PTR(-ESTALE); 718 667 goto out; ··· 723 670 goto out; 724 671 } 725 672 726 - nlink = ovl_get_nlink(lowerdentry, upperdentry, 727 - realinode->i_nlink); 673 + /* Recalculate nlink for non-dir due to indexing */ 674 + if (!is_dir) 675 + nlink = ovl_get_nlink(lowerdentry, upperdentry, nlink); 728 676 set_nlink(inode, nlink); 729 677 } else { 730 - inode = new_inode(dentry->d_sb); 678 + inode = new_inode(sb); 731 679 if (!inode) 732 680 goto out_nomem; 733 681 } ··· 739 685 ovl_set_flag(OVL_IMPURE, inode); 740 686 741 687 /* Check for non-merge dir that may have whiteouts */ 742 - if (S_ISDIR(realinode->i_mode)) { 743 - struct ovl_entry *oe = dentry->d_fsdata; 744 - 745 - if (((upperdentry && lowerdentry) || oe->numlower > 1) || 688 + if (is_dir) { 689 + if (((upperdentry && lowerdentry) || numlower > 1) || 746 690 ovl_check_origin_xattr(upperdentry ?: lowerdentry)) { 747 691 ovl_set_flag(OVL_WHITEOUTS, inode); 748 692 }
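The hash key choice that the reworked ovl_get_inode() makes can be summarised with a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: struct toy_lookup, enum toy_key and toy_hash_key() are hypothetical names, fs_nfs_export stands in for a non-NULL sb->s_export_op, and fs_indexdir stands in for ovl_indexdir(sb).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which real inode, if any, serves as the overlay inode hash key. */
enum toy_key { TOY_KEY_NONE, TOY_KEY_LOWER, TOY_KEY_UPPER };

/* Hypothetical inputs mirroring the arguments and mount state consulted. */
struct toy_lookup {
	bool has_upper;      /* upperdentry != NULL */
	bool has_lower;      /* lowerdentry != NULL */
	bool has_index;      /* index != NULL */
	bool is_dir;
	bool fs_indexdir;    /* assumption: models ovl_indexdir(sb) */
	bool fs_nfs_export;  /* assumption: models sb->s_export_op being set */
	bool fs_has_upper;   /* models ofs->upper_mnt being set */
};

static enum toy_key toy_hash_key(const struct toy_lookup *l)
{
	assert(l != NULL);

	/* Already indexed, or could be indexed on copy up. */
	bool indexed = l->has_index || (l->fs_indexdir && !l->has_upper);
	bool use_origin = indexed && l->has_lower;

	/* Dirs hash by origin unless they are non-indexed under NFS export. */
	if (l->is_dir && l->has_lower &&
	    (indexed || !l->fs_nfs_export || !l->fs_has_upper))
		use_origin = true;

	if (use_origin)
		return TOY_KEY_LOWER;
	if (l->has_upper)
		return TOY_KEY_UPPER;
	return TOY_KEY_NONE;	/* unhashed inode, as with new_inode() */
}
```

The case that changes with NFS export is a merged directory with no index entry: it is keyed by the upper inode when export is enabled on an overlay with an upper layer, and by the lower inode otherwise.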
+390 -145
fs/overlayfs/namei.c
··· 9 9 10 10 #include <linux/fs.h> 11 11 #include <linux/cred.h> 12 + #include <linux/ctype.h> 12 13 #include <linux/namei.h> 13 14 #include <linux/xattr.h> 14 15 #include <linux/ratelimit.h> ··· 85 84 86 85 static int ovl_acceptable(void *ctx, struct dentry *dentry) 87 86 { 88 - return 1; 87 + /* 88 + * A non-dir origin may be disconnected, which is fine, because 89 + * we only need it for its unique inode number. 90 + */ 91 + if (!d_is_dir(dentry)) 92 + return 1; 93 + 94 + /* Don't decode a deleted empty directory */ 95 + if (d_unhashed(dentry)) 96 + return 0; 97 + 98 + /* Check if directory belongs to the layer we are decoding from */ 99 + return is_subdir(dentry, ((struct vfsmount *)ctx)->mnt_root); 89 100 } 90 101 91 - static struct ovl_fh *ovl_get_origin_fh(struct dentry *dentry) 102 + /* 103 + * Check validity of an overlay file handle buffer. 104 + * 105 + * Return 0 for a valid file handle. 106 + * Return -ENODATA for "origin unknown". 107 + * Return <0 for an invalid file handle. 
108 + */ 109 + int ovl_check_fh_len(struct ovl_fh *fh, int fh_len) 92 110 { 93 - int res; 111 + if (fh_len < sizeof(struct ovl_fh) || fh_len < fh->len) 112 + return -EINVAL; 113 + 114 + if (fh->magic != OVL_FH_MAGIC) 115 + return -EINVAL; 116 + 117 + /* Treat larger version and unknown flags as "origin unknown" */ 118 + if (fh->version > OVL_FH_VERSION || fh->flags & ~OVL_FH_FLAG_ALL) 119 + return -ENODATA; 120 + 121 + /* Treat endianness mismatch as "origin unknown" */ 122 + if (!(fh->flags & OVL_FH_FLAG_ANY_ENDIAN) && 123 + (fh->flags & OVL_FH_FLAG_BIG_ENDIAN) != OVL_FH_FLAG_CPU_ENDIAN) 124 + return -ENODATA; 125 + 126 + return 0; 127 + } 128 + 129 + static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name) 130 + { 131 + int res, err; 94 132 struct ovl_fh *fh = NULL; 95 133 96 - res = vfs_getxattr(dentry, OVL_XATTR_ORIGIN, NULL, 0); 134 + res = vfs_getxattr(dentry, name, NULL, 0); 97 135 if (res < 0) { 98 136 if (res == -ENODATA || res == -EOPNOTSUPP) 99 137 return NULL; ··· 142 102 if (res == 0) 143 103 return NULL; 144 104 145 - fh = kzalloc(res, GFP_KERNEL); 105 + fh = kzalloc(res, GFP_KERNEL); 146 106 if (!fh) 147 107 return ERR_PTR(-ENOMEM); 148 108 149 - res = vfs_getxattr(dentry, OVL_XATTR_ORIGIN, fh, res); 109 + res = vfs_getxattr(dentry, name, fh, res); 150 110 if (res < 0) 151 111 goto fail; 152 112 153 - if (res < sizeof(struct ovl_fh) || res < fh->len) 113 + err = ovl_check_fh_len(fh, res); 114 + if (err < 0) { 115 + if (err == -ENODATA) 116 + goto out; 154 117 goto invalid; 155 - 156 - if (fh->magic != OVL_FH_MAGIC) 157 - goto invalid; 158 - 159 - /* Treat larger version and unknown flags as "origin unknown" */ 160 - if (fh->version > OVL_FH_VERSION || fh->flags & ~OVL_FH_FLAG_ALL) 161 - goto out; 162 - 163 - /* Treat endianness mismatch as "origin unknown" */ 164 - if (!(fh->flags & OVL_FH_FLAG_ANY_ENDIAN) && 165 - (fh->flags & OVL_FH_FLAG_BIG_ENDIAN) != OVL_FH_FLAG_CPU_ENDIAN) 166 - goto out; 118 + } 167 119 168 120 return fh; 169 
121 ··· 171 139 goto out; 172 140 } 173 141 174 - static struct dentry *ovl_get_origin(struct dentry *dentry, 175 - struct vfsmount *mnt) 142 + struct dentry *ovl_decode_fh(struct ovl_fh *fh, struct vfsmount *mnt) 176 143 { 177 - struct dentry *origin = NULL; 178 - struct ovl_fh *fh = ovl_get_origin_fh(dentry); 144 + struct dentry *real; 179 145 int bytes; 180 - 181 - if (IS_ERR_OR_NULL(fh)) 182 - return (struct dentry *)fh; 183 146 184 147 /* 185 148 * Make sure that the stored uuid matches the uuid of the lower 186 149 * layer where file handle will be decoded. 187 150 */ 188 151 if (!uuid_equal(&fh->uuid, &mnt->mnt_sb->s_uuid)) 189 - goto out; 152 + return NULL; 190 153 191 154 bytes = (fh->len - offsetof(struct ovl_fh, fid)); 192 - origin = exportfs_decode_fh(mnt, (struct fid *)fh->fid, 193 - bytes >> 2, (int)fh->type, 194 - ovl_acceptable, NULL); 195 - if (IS_ERR(origin)) { 196 - /* Treat stale file handle as "origin unknown" */ 197 - if (origin == ERR_PTR(-ESTALE)) 198 - origin = NULL; 199 - goto out; 155 + real = exportfs_decode_fh(mnt, (struct fid *)fh->fid, 156 + bytes >> 2, (int)fh->type, 157 + ovl_acceptable, mnt); 158 + if (IS_ERR(real)) { 159 + /* 160 + * Treat stale file handle to lower file as "origin unknown". 161 + * upper file handle could become stale when upper file is 162 + * unlinked and this information is needed to handle stale 163 + * index entries correctly. 
164 + */ 165 + if (real == ERR_PTR(-ESTALE) && 166 + !(fh->flags & OVL_FH_FLAG_PATH_UPPER)) 167 + real = NULL; 168 + return real; 200 169 } 201 170 202 - if (ovl_dentry_weird(origin) || 203 - ((d_inode(origin)->i_mode ^ d_inode(dentry)->i_mode) & S_IFMT)) 204 - goto invalid; 171 + if (ovl_dentry_weird(real)) { 172 + dput(real); 173 + return NULL; 174 + } 205 175 206 - out: 207 - kfree(fh); 208 - return origin; 209 - 210 - invalid: 211 - pr_warn_ratelimited("overlayfs: invalid origin (%pd2)\n", origin); 212 - dput(origin); 213 - origin = NULL; 214 - goto out; 176 + return real; 215 177 } 216 178 217 179 static bool ovl_is_opaquedir(struct dentry *dentry) ··· 310 284 } 311 285 312 286 313 - static int ovl_check_origin(struct dentry *upperdentry, 314 - struct ovl_path *lower, unsigned int numlower, 315 - struct ovl_path **stackp, unsigned int *ctrp) 287 + int ovl_check_origin_fh(struct ovl_fs *ofs, struct ovl_fh *fh, 288 + struct dentry *upperdentry, struct ovl_path **stackp) 316 289 { 317 - struct vfsmount *mnt; 318 290 struct dentry *origin = NULL; 319 291 int i; 320 292 321 - for (i = 0; i < numlower; i++) { 322 - mnt = lower[i].layer->mnt; 323 - origin = ovl_get_origin(upperdentry, mnt); 324 - if (IS_ERR(origin)) 325 - return PTR_ERR(origin); 326 - 293 + for (i = 0; i < ofs->numlower; i++) { 294 + origin = ovl_decode_fh(fh, ofs->lower_layers[i].mnt); 327 295 if (origin) 328 296 break; 329 297 } 330 298 331 299 if (!origin) 332 - return 0; 300 + return -ESTALE; 301 + else if (IS_ERR(origin)) 302 + return PTR_ERR(origin); 333 303 334 - BUG_ON(*ctrp); 304 + if (upperdentry && !ovl_is_whiteout(upperdentry) && 305 + ((d_inode(origin)->i_mode ^ d_inode(upperdentry)->i_mode) & S_IFMT)) 306 + goto invalid; 307 + 335 308 if (!*stackp) 336 309 *stackp = kmalloc(sizeof(struct ovl_path), GFP_KERNEL); 337 310 if (!*stackp) { 338 311 dput(origin); 339 312 return -ENOMEM; 340 313 } 341 - **stackp = (struct ovl_path){.dentry = origin, .layer = lower[i].layer}; 342 - *ctrp = 1; 
314 + **stackp = (struct ovl_path){ 315 + .dentry = origin, 316 + .layer = &ofs->lower_layers[i] 317 + }; 343 318 319 + return 0; 320 + 321 + invalid: 322 + pr_warn_ratelimited("overlayfs: invalid origin (%pd2, ftype=%x, origin ftype=%x).\n", 323 + upperdentry, d_inode(upperdentry)->i_mode & S_IFMT, 324 + d_inode(origin)->i_mode & S_IFMT); 325 + dput(origin); 326 + return -EIO; 327 + } 328 + 329 + static int ovl_check_origin(struct ovl_fs *ofs, struct dentry *upperdentry, 330 + struct ovl_path **stackp, unsigned int *ctrp) 331 + { 332 + struct ovl_fh *fh = ovl_get_fh(upperdentry, OVL_XATTR_ORIGIN); 333 + int err; 334 + 335 + if (IS_ERR_OR_NULL(fh)) 336 + return PTR_ERR(fh); 337 + 338 + err = ovl_check_origin_fh(ofs, fh, upperdentry, stackp); 339 + kfree(fh); 340 + 341 + if (err) { 342 + if (err == -ESTALE) 343 + return 0; 344 + return err; 345 + } 346 + 347 + if (WARN_ON(*ctrp)) 348 + return -EIO; 349 + 350 + *ctrp = 1; 344 351 return 0; 345 352 } 346 353 347 354 /* 348 - * Verify that @fh matches the origin file handle stored in OVL_XATTR_ORIGIN. 355 + * Verify that @fh matches the file handle stored in xattr @name. 349 356 * Return 0 on match, -ESTALE on mismatch, < 0 on error. 350 357 */ 351 - static int ovl_verify_origin_fh(struct dentry *dentry, const struct ovl_fh *fh) 358 + static int ovl_verify_fh(struct dentry *dentry, const char *name, 359 + const struct ovl_fh *fh) 352 360 { 353 - struct ovl_fh *ofh = ovl_get_origin_fh(dentry); 361 + struct ovl_fh *ofh = ovl_get_fh(dentry, name); 354 362 int err = 0; 355 363 356 364 if (!ofh) ··· 401 341 } 402 342 403 343 /* 404 - * Verify that an inode matches the origin file handle stored in upper inode. 344 + * Verify that @real dentry matches the file handle stored in xattr @name. 405 345 * 406 - * If @set is true and there is no stored file handle, encode and store origin 407 - * file handle in OVL_XATTR_ORIGIN. 
346 + * If @set is true and there is no stored file handle, encode @real and store 347 + * file handle in xattr @name. 408 348 * 409 - * Return 0 on match, -ESTALE on mismatch, < 0 on error. 349 + * Return 0 on match, -ESTALE on mismatch, -ENODATA on no xattr, < 0 on error. 410 350 */ 411 - int ovl_verify_origin(struct dentry *dentry, struct dentry *origin, 412 - bool is_upper, bool set) 351 + int ovl_verify_set_fh(struct dentry *dentry, const char *name, 352 + struct dentry *real, bool is_upper, bool set) 413 353 { 414 354 struct inode *inode; 415 355 struct ovl_fh *fh; 416 356 int err; 417 357 418 - fh = ovl_encode_fh(origin, is_upper); 358 + fh = ovl_encode_fh(real, is_upper); 419 359 err = PTR_ERR(fh); 420 360 if (IS_ERR(fh)) 421 361 goto fail; 422 362 423 - err = ovl_verify_origin_fh(dentry, fh); 363 + err = ovl_verify_fh(dentry, name, fh); 424 364 if (set && err == -ENODATA) 425 - err = ovl_do_setxattr(dentry, OVL_XATTR_ORIGIN, fh, fh->len, 0); 365 + err = ovl_do_setxattr(dentry, name, fh, fh->len, 0); 426 366 if (err) 427 367 goto fail; 428 368 ··· 431 371 return err; 432 372 433 373 fail: 434 - inode = d_inode(origin); 435 - pr_warn_ratelimited("overlayfs: failed to verify origin (%pd2, ino=%lu, err=%i)\n", 436 - origin, inode ? inode->i_ino : 0, err); 374 + inode = d_inode(real); 375 + pr_warn_ratelimited("overlayfs: failed to verify %s (%pd2, ino=%lu, err=%i)\n", 376 + is_upper ? "upper" : "origin", real, 377 + inode ? 
inode->i_ino : 0, err); 437 378 goto out; 379 + } 380 + 381 + /* Get upper dentry from index */ 382 + struct dentry *ovl_index_upper(struct ovl_fs *ofs, struct dentry *index) 383 + { 384 + struct ovl_fh *fh; 385 + struct dentry *upper; 386 + 387 + if (!d_is_dir(index)) 388 + return dget(index); 389 + 390 + fh = ovl_get_fh(index, OVL_XATTR_UPPER); 391 + if (IS_ERR_OR_NULL(fh)) 392 + return ERR_CAST(fh); 393 + 394 + upper = ovl_decode_fh(fh, ofs->upper_mnt); 395 + kfree(fh); 396 + 397 + if (IS_ERR_OR_NULL(upper)) 398 + return upper ?: ERR_PTR(-ESTALE); 399 + 400 + if (!d_is_dir(upper)) { 401 + pr_warn_ratelimited("overlayfs: invalid index upper (%pd2, upper=%pd2).\n", 402 + index, upper); 403 + dput(upper); 404 + return ERR_PTR(-EIO); 405 + } 406 + 407 + return upper; 408 + } 409 + 410 + /* Is this a leftover from create/whiteout of directory index entry? */ 411 + static bool ovl_is_temp_index(struct dentry *index) 412 + { 413 + return index->d_name.name[0] == '#'; 438 414 } 439 415 440 416 /* ··· 478 382 * OVL_XATTR_ORIGIN and that origin file handle can be decoded to lower path. 479 383 * Return 0 on match, -ESTALE on mismatch or stale origin, < 0 on error. 480 384 */ 481 - int ovl_verify_index(struct dentry *index, struct ovl_path *lower, 482 - unsigned int numlower) 385 + int ovl_verify_index(struct ovl_fs *ofs, struct dentry *index) 483 386 { 484 387 struct ovl_fh *fh = NULL; 485 388 size_t len; 486 389 struct ovl_path origin = { }; 487 390 struct ovl_path *stack = &origin; 488 - unsigned int ctr = 0; 391 + struct dentry *upper = NULL; 489 392 int err; 490 393 491 394 if (!d_inode(index)) 492 395 return 0; 493 396 494 - /* 495 - * Directory index entries are going to be used for looking up 496 - * redirected upper dirs by lower dir fh when decoding an overlay 497 - * file handle of a merge dir. Whiteout index entries are going to be 498 - * used as an indication that an exported overlay file handle should 499 - * be treated as stale (i.e. 
after unlink of the overlay inode). 500 - * We don't know the verification rules for directory and whiteout 501 - * index entries, because they have not been implemented yet, so return 502 - * EINVAL if those entries are found to abort the mount to avoid 503 - * corrupting an index that was created by a newer kernel. 504 - */ 505 - err = -EINVAL; 506 - if (d_is_dir(index) || ovl_is_whiteout(index)) 397 + /* Cleanup leftover from index create/cleanup attempt */ 398 + err = -ESTALE; 399 + if (ovl_is_temp_index(index)) 507 400 goto fail; 508 401 402 + err = -EINVAL; 509 403 if (index->d_name.len < sizeof(struct ovl_fh)*2) 510 404 goto fail; 511 405 ··· 506 420 goto fail; 507 421 508 422 err = -EINVAL; 509 - if (hex2bin((u8 *)fh, index->d_name.name, len) || len != fh->len) 423 + if (hex2bin((u8 *)fh, index->d_name.name, len)) 510 424 goto fail; 511 425 512 - err = ovl_verify_origin_fh(index, fh); 426 + err = ovl_check_fh_len(fh, len); 513 427 if (err) 514 428 goto fail; 515 429 516 - err = ovl_check_origin(index, lower, numlower, &stack, &ctr); 517 - if (!err && !ctr) 518 - err = -ESTALE; 430 + /* 431 + * Whiteout index entries are used as an indication that an exported 432 + * overlay file handle should be treated as stale (i.e. after unlink 433 + * of the overlay inode). These entries contain no origin xattr. 434 + */ 435 + if (ovl_is_whiteout(index)) 436 + goto out; 437 + 438 + /* 439 + * Verifying directory index entries are not stale is expensive, so 440 + * only verify stale dir index if NFS export is enabled. 441 + */ 442 + if (d_is_dir(index) && !ofs->config.nfs_export) 443 + goto out; 444 + 445 + /* 446 + * Directory index entries should have 'upper' xattr pointing to the 447 + * real upper dir. Non-dir index entries are hardlinks to the upper 448 + * real inode. For non-dir index, we can read the copy up origin xattr 449 + * directly from the index dentry, but for dir index we first need to 450 + * decode the upper directory. 
451 + */ 452 + upper = ovl_index_upper(ofs, index); 453 + if (IS_ERR_OR_NULL(upper)) { 454 + err = PTR_ERR(upper); 455 + /* 456 + * Directory index entries with no 'upper' xattr need to be 457 + * removed. When dir index entry has a stale 'upper' xattr, 458 + * we assume that upper dir was removed and we treat the dir 459 + * index as orphan entry that needs to be whited out. 460 + */ 461 + if (err == -ESTALE) 462 + goto orphan; 463 + else if (!err) 464 + err = -ESTALE; 465 + goto fail; 466 + } 467 + 468 + err = ovl_verify_fh(upper, OVL_XATTR_ORIGIN, fh); 469 + dput(upper); 519 470 if (err) 520 471 goto fail; 521 472 522 - /* Check if index is orphan and don't warn before cleaning it */ 523 - if (d_inode(index)->i_nlink == 1 && 524 - ovl_get_nlink(origin.dentry, index, 0) == 0) 525 - err = -ENOENT; 473 + /* Check if non-dir index is orphan and don't warn before cleaning it */ 474 + if (!d_is_dir(index) && d_inode(index)->i_nlink == 1) { 475 + err = ovl_check_origin_fh(ofs, fh, index, &stack); 476 + if (err) 477 + goto fail; 526 478 527 - dput(origin.dentry); 479 + if (ovl_get_nlink(origin.dentry, index, 0) == 0) 480 + goto orphan; 481 + } 482 + 528 483 out: 484 + dput(origin.dentry); 529 485 kfree(fh); 530 486 return err; 531 487 ··· 575 447 pr_warn_ratelimited("overlayfs: failed to verify index (%pd2, ftype=%x, err=%i)\n", 576 448 index, d_inode(index)->i_mode & S_IFMT, err); 577 449 goto out; 450 + 451 + orphan: 452 + pr_warn_ratelimited("overlayfs: orphan index entry (%pd2, ftype=%x, nlink=%u)\n", 453 + index, d_inode(index)->i_mode & S_IFMT, 454 + d_inode(index)->i_nlink); 455 + err = -ENOENT; 456 + goto out; 457 + } 458 + 459 + static int ovl_get_index_name_fh(struct ovl_fh *fh, struct qstr *name) 460 + { 461 + char *n, *s; 462 + 463 + n = kzalloc(fh->len * 2, GFP_KERNEL); 464 + if (!n) 465 + return -ENOMEM; 466 + 467 + s = bin2hex(n, fh, fh->len); 468 + *name = (struct qstr) QSTR_INIT(n, s - n); 469 + 470 + return 0; 471 + 578 472 } 579 473 580 474 /* ··· 616 
466 */ 617 467 int ovl_get_index_name(struct dentry *origin, struct qstr *name) 618 468 { 619 - int err; 620 469 struct ovl_fh *fh; 621 - char *n, *s; 470 + int err; 622 471 623 472 fh = ovl_encode_fh(origin, false); 624 473 if (IS_ERR(fh)) 625 474 return PTR_ERR(fh); 626 475 627 - err = -ENOMEM; 628 - n = kzalloc(fh->len * 2, GFP_KERNEL); 629 - if (n) { 630 - s = bin2hex(n, fh, fh->len); 631 - *name = (struct qstr) QSTR_INIT(n, s - n); 632 - err = 0; 633 - } 476 + err = ovl_get_index_name_fh(fh, name); 477 + 634 478 kfree(fh); 635 - 636 479 return err; 637 - 638 480 } 639 481 640 - static struct dentry *ovl_lookup_index(struct dentry *dentry, 641 - struct dentry *upper, 642 - struct dentry *origin) 482 + /* Lookup index by file handle for NFS export */ 483 + struct dentry *ovl_get_index_fh(struct ovl_fs *ofs, struct ovl_fh *fh) 643 484 { 644 - struct ovl_fs *ofs = dentry->d_sb->s_fs_info; 485 + struct dentry *index; 486 + struct qstr name; 487 + int err; 488 + 489 + err = ovl_get_index_name_fh(fh, &name); 490 + if (err) 491 + return ERR_PTR(err); 492 + 493 + index = lookup_one_len_unlocked(name.name, ofs->indexdir, name.len); 494 + kfree(name.name); 495 + if (IS_ERR(index)) { 496 + if (PTR_ERR(index) == -ENOENT) 497 + index = NULL; 498 + return index; 499 + } 500 + 501 + if (d_is_negative(index)) 502 + err = 0; 503 + else if (ovl_is_whiteout(index)) 504 + err = -ESTALE; 505 + else if (ovl_dentry_weird(index)) 506 + err = -EIO; 507 + else 508 + return index; 509 + 510 + dput(index); 511 + return ERR_PTR(err); 512 + } 513 + 514 + struct dentry *ovl_lookup_index(struct ovl_fs *ofs, struct dentry *upper, 515 + struct dentry *origin, bool verify) 516 + { 645 517 struct dentry *index; 646 518 struct inode *inode; 647 519 struct qstr name; 520 + bool is_dir = d_is_dir(origin); 648 521 int err; 649 522 650 523 err = ovl_get_index_name(origin, &name); ··· 691 518 inode = d_inode(index); 692 519 if (d_is_negative(index)) { 693 520 goto out_dput; 694 - } else if (upper && 
d_inode(upper) != inode) { 695 - goto out_dput; 521 + } else if (ovl_is_whiteout(index) && !verify) { 522 + /* 523 + * When index lookup is called with !verify for decoding an 524 + * overlay file handle, a whiteout index implies that decode 525 + * should treat file handle as stale and no need to print a 526 + * warning about it. 527 + */ 528 + dput(index); 529 + index = ERR_PTR(-ESTALE); 530 + goto out; 696 531 } else if (ovl_dentry_weird(index) || ovl_is_whiteout(index) || 697 532 ((inode->i_mode ^ d_inode(origin)->i_mode) & S_IFMT)) { 698 533 /* ··· 714 533 index, d_inode(index)->i_mode & S_IFMT, 715 534 d_inode(origin)->i_mode & S_IFMT); 716 535 goto fail; 717 - } 536 + } else if (is_dir && verify) { 537 + if (!upper) { 538 + pr_warn_ratelimited("overlayfs: suspected uncovered redirected dir found (origin=%pd2, index=%pd2).\n", 539 + origin, index); 540 + goto fail; 541 + } 718 542 543 + /* Verify that dir index 'upper' xattr points to upper dir */ 544 + err = ovl_verify_upper(index, upper, false); 545 + if (err) { 546 + if (err == -ESTALE) { 547 + pr_warn_ratelimited("overlayfs: suspected multiply redirected dir found (upper=%pd2, origin=%pd2, index=%pd2).\n", 548 + upper, origin, index); 549 + } 550 + goto fail; 551 + } 552 + } else if (upper && d_inode(upper) != inode) { 553 + goto out_dput; 554 + } 719 555 out: 720 556 kfree(name.name); 721 557 return index; ··· 770 572 return (idx < oe->numlower) ? 
idx + 1 : -1; 771 573 } 772 574 773 - static int ovl_find_layer(struct ovl_fs *ofs, struct ovl_path *path) 575 + /* Fix missing 'origin' xattr */ 576 + static int ovl_fix_origin(struct dentry *dentry, struct dentry *lower, 577 + struct dentry *upper) 774 578 { 775 - int i; 579 + int err; 776 580 777 - for (i = 0; i < ofs->numlower; i++) { 778 - if (ofs->lower_layers[i].mnt == path->layer->mnt) 779 - break; 780 - } 581 + if (ovl_check_origin_xattr(upper)) 582 + return 0; 781 583 782 - return i; 584 + err = ovl_want_write(dentry); 585 + if (err) 586 + return err; 587 + 588 + err = ovl_set_origin(dentry, lower, upper); 589 + if (!err) 590 + err = ovl_set_impure(dentry->d_parent, upper->d_parent); 591 + 592 + ovl_drop_write(dentry); 593 + return err; 783 594 } 784 595 785 596 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, ··· 801 594 struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata; 802 595 struct ovl_path *stack = NULL; 803 596 struct dentry *upperdir, *upperdentry = NULL; 597 + struct dentry *origin = NULL; 804 598 struct dentry *index = NULL; 805 599 unsigned int ctr = 0; 806 600 struct inode *inode = NULL; ··· 846 638 * number - it's the same as if we held a reference 847 639 * to a dentry in lower layer that was moved under us. 848 640 */ 849 - err = ovl_check_origin(upperdentry, roe->lowerstack, 850 - roe->numlower, &stack, &ctr); 641 + err = ovl_check_origin(ofs, upperdentry, &stack, &ctr); 851 642 if (err) 852 643 goto out_put_upper; 853 644 } ··· 881 674 if (!this) 882 675 continue; 883 676 677 + /* 678 + * If no origin fh is stored in upper of a merge dir, store fh 679 + * of lower dir and set upper parent "impure". 
680 + */ 681 + if (upperdentry && !ctr && !ofs->noxattr) { 682 + err = ovl_fix_origin(dentry, this, upperdentry); 683 + if (err) { 684 + dput(this); 685 + goto out_put; 686 + } 687 + } 688 + 689 + /* 690 + * When "verify_lower" feature is enabled, do not merge with a 691 + * lower dir that does not match a stored origin xattr. In any 692 + * case, only verified origin is used for index lookup. 693 + */ 694 + if (upperdentry && !ctr && ovl_verify_lower(dentry->d_sb)) { 695 + err = ovl_verify_origin(upperdentry, this, false); 696 + if (err) { 697 + dput(this); 698 + break; 699 + } 700 + 701 + /* Bless lower dir as verified origin */ 702 + origin = this; 703 + } 704 + 884 705 stack[ctr].dentry = this; 885 706 stack[ctr].layer = lower.layer; 886 707 ctr++; ··· 928 693 */ 929 694 err = -EPERM; 930 695 if (d.redirect && !ofs->config.redirect_follow) { 931 - pr_warn_ratelimited("overlay: refusing to follow redirect for (%pd2)\n", dentry); 696 + pr_warn_ratelimited("overlayfs: refusing to follow redirect for (%pd2)\n", 697 + dentry); 932 698 goto out_put; 933 699 } 934 700 935 701 if (d.redirect && d.redirect[0] == '/' && poe != roe) { 936 702 poe = roe; 937 - 938 703 /* Find the current layer on the root dentry */ 939 - i = ovl_find_layer(ofs, &lower); 940 - if (WARN_ON(i == ofs->numlower)) 941 - break; 704 + i = lower.layer->idx - 1; 942 705 } 943 706 } 944 707 945 - /* Lookup index by lower inode and verify it matches upper inode */ 946 - if (ctr && !d.is_dir && ovl_indexdir(dentry->d_sb)) { 947 - struct dentry *origin = stack[0].dentry; 708 + /* 709 + * Lookup index by lower inode and verify it matches upper inode. 710 + * We only trust dir index if we verified that lower dir matches 711 + * origin, otherwise dir index entries may be inconsistent and we 712 + * ignore them. Always lookup index of non-dir and non-upper. 
713 + */ 714 + if (ctr && (!upperdentry || !d.is_dir)) 715 + origin = stack[0].dentry; 948 716 949 - index = ovl_lookup_index(dentry, upperdentry, origin); 717 + if (origin && ovl_indexdir(dentry->d_sb) && 718 + (!d.is_dir || ovl_index_all(dentry->d_sb))) { 719 + index = ovl_lookup_index(ofs, upperdentry, origin, true); 950 720 if (IS_ERR(index)) { 951 721 err = PTR_ERR(index); 952 722 index = NULL; ··· 964 724 if (!oe) 965 725 goto out_put; 966 726 967 - oe->opaque = upperopaque; 968 727 memcpy(oe->lowerstack, stack, sizeof(struct ovl_path) * ctr); 969 728 dentry->d_fsdata = oe; 729 + 730 + if (upperopaque) 731 + ovl_dentry_set_opaque(dentry); 970 732 971 733 if (upperdentry) 972 734 ovl_dentry_set_upper_alias(dentry); ··· 976 734 upperdentry = dget(index); 977 735 978 736 if (upperdentry || ctr) { 979 - inode = ovl_get_inode(dentry, upperdentry, index); 737 + if (ctr) 738 + origin = stack[0].dentry; 739 + inode = ovl_get_inode(dentry->d_sb, upperdentry, origin, index, 740 + ctr); 980 741 err = PTR_ERR(inode); 981 742 if (IS_ERR(inode)) 982 743 goto out_free_oe; ··· 993 748 dput(index); 994 749 kfree(stack); 995 750 kfree(d.redirect); 996 - d_add(dentry, inode); 997 - 998 - return NULL; 751 + return d_splice_alias(inode, dentry); 999 752 1000 753 out_free_oe: 1001 754 dentry->d_fsdata = NULL; ··· 1014 771 1015 772 bool ovl_lower_positive(struct dentry *dentry) 1016 773 { 1017 - struct ovl_entry *oe = dentry->d_fsdata; 1018 774 struct ovl_entry *poe = dentry->d_parent->d_fsdata; 1019 775 const struct qstr *name = &dentry->d_name; 776 + const struct cred *old_cred; 1020 777 unsigned int i; 1021 778 bool positive = false; 1022 779 bool done = false; ··· 1026 783 * whiteout. 
1027 784 */ 1028 785 if (!dentry->d_inode) 1029 - return oe->opaque; 786 + return ovl_dentry_is_opaque(dentry); 1030 787 1031 788 /* Negative upper -> positive lower */ 1032 789 if (!ovl_dentry_upper(dentry)) 1033 790 return true; 1034 791 792 + old_cred = ovl_override_creds(dentry->d_sb); 1035 793 /* Positive upper -> have to look up lower to see whether it exists */ 1036 794 for (i = 0; !done && !positive && i < poe->numlower; i++) { 1037 795 struct dentry *this; ··· 1062 818 dput(this); 1063 819 } 1064 820 } 821 + revert_creds(old_cred); 1065 822 1066 823 return positive; 1067 824 }
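Review note: ovl_get_index_name_fh() above derives the index entry name by hex-encoding the whole file handle, and ovl_verify_index() reverses that with hex2bin() before checking the length. The following is a minimal userspace sketch of that round trip, not the kernel code. The demo_* helpers are hypothetical stand-ins for bin2hex()/hex2bin(), the handle is treated as an opaque byte buffer, and the extra NUL terminator is an assumption added so the name can be used as a C string (the kernel stores the length in a struct qstr instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for bin2hex(): two lowercase hex digits per byte. */
static char *demo_bin2hex(char *dst, const unsigned char *src, size_t count)
{
	static const char digits[] = "0123456789abcdef";

	while (count--) {
		*dst++ = digits[*src >> 4];
		*dst++ = digits[*src & 0x0f];
		src++;
	}
	return dst;	/* points just past the last digit written */
}

static int demo_hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Hypothetical stand-in for hex2bin(): 0 on success, -1 on a bad digit. */
static int demo_hex2bin(unsigned char *dst, const char *src, size_t count)
{
	while (count--) {
		int hi = demo_hex_val(*src++);
		int lo = demo_hex_val(*src++);

		if (hi < 0 || lo < 0)
			return -1;
		*dst++ = (unsigned char)((hi << 4) | lo);
	}
	return 0;
}

/*
 * Encode @fh_len handle bytes into a newly allocated index name.
 * The caller frees the result. Returns NULL on allocation failure.
 */
static char *demo_index_name(const unsigned char *fh, size_t fh_len,
			     size_t *name_len)
{
	char *n = calloc(fh_len * 2 + 1, 1);
	char *end;

	if (!n)
		return NULL;

	end = demo_bin2hex(n, fh, fh_len);
	*name_len = (size_t)(end - n);
	/* The name is always exactly twice as long as the handle. */
	assert(*name_len == fh_len * 2);
	return n;
}
```

Because the name is exactly twice the handle length, a decoder can recover the handle size from the name alone, which is why ovl_verify_index() can reject names shorter than sizeof(struct ovl_fh)*2 before decoding anything.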
+55 -11
fs/overlayfs/overlayfs.h
··· 27 27 #define OVL_XATTR_ORIGIN OVL_XATTR_PREFIX "origin" 28 28 #define OVL_XATTR_IMPURE OVL_XATTR_PREFIX "impure" 29 29 #define OVL_XATTR_NLINK OVL_XATTR_PREFIX "nlink" 30 + #define OVL_XATTR_UPPER OVL_XATTR_PREFIX "upper" 30 31 31 - enum ovl_flag { 32 + enum ovl_inode_flag { 32 33 /* Pure upper dir that may contain non pure upper entries */ 33 34 OVL_IMPURE, 34 35 /* Non-merge dir that may contain whiteout entries */ 35 36 OVL_WHITEOUTS, 36 37 OVL_INDEX, 38 + }; 39 + 40 + enum ovl_entry_flag { 41 + OVL_E_UPPER_ALIAS, 42 + OVL_E_OPAQUE, 37 43 }; 38 44 39 45 /* ··· 67 61 #else 68 62 #error Endianness not defined 69 63 #endif 64 + 65 + /* The type returned by overlay exportfs ops when encoding an ovl_fh handle */ 66 + #define OVL_FILEID 0xfb 70 67 71 68 /* On-disk and in-memeory format for redirect by file handle */ 72 69 struct ovl_fh { ··· 203 194 struct super_block *ovl_same_sb(struct super_block *sb); 204 195 bool ovl_can_decode_fh(struct super_block *sb); 205 196 struct dentry *ovl_indexdir(struct super_block *sb); 197 + bool ovl_index_all(struct super_block *sb); 198 + bool ovl_verify_lower(struct super_block *sb); 206 199 struct ovl_entry *ovl_alloc_entry(unsigned int numlower); 207 200 bool ovl_dentry_remote(struct dentry *dentry); 208 201 bool ovl_dentry_weird(struct dentry *dentry); ··· 221 210 struct inode *ovl_inode_real(struct inode *inode); 222 211 struct ovl_dir_cache *ovl_dir_cache(struct inode *inode); 223 212 void ovl_set_dir_cache(struct inode *inode, struct ovl_dir_cache *cache); 213 + void ovl_dentry_set_flag(unsigned long flag, struct dentry *dentry); 214 + void ovl_dentry_clear_flag(unsigned long flag, struct dentry *dentry); 215 + bool ovl_dentry_test_flag(unsigned long flag, struct dentry *dentry); 224 216 bool ovl_dentry_is_opaque(struct dentry *dentry); 225 217 bool ovl_dentry_is_whiteout(struct dentry *dentry); 226 218 void ovl_dentry_set_opaque(struct dentry *dentry); ··· 252 238 bool ovl_test_flag(unsigned long flag, struct inode 
*inode); 253 239 bool ovl_inuse_trylock(struct dentry *dentry); 254 240 void ovl_inuse_unlock(struct dentry *dentry); 241 + bool ovl_need_index(struct dentry *dentry); 255 242 int ovl_nlink_start(struct dentry *dentry, bool *locked); 256 243 void ovl_nlink_end(struct dentry *dentry, bool locked); 257 244 int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir); ··· 264 249 265 250 266 251 /* namei.c */ 267 - int ovl_verify_origin(struct dentry *dentry, struct dentry *origin, 268 - bool is_upper, bool set); 269 - int ovl_verify_index(struct dentry *index, struct ovl_path *lower, 270 - unsigned int numlower); 252 + int ovl_check_fh_len(struct ovl_fh *fh, int fh_len); 253 + struct dentry *ovl_decode_fh(struct ovl_fh *fh, struct vfsmount *mnt); 254 + int ovl_check_origin_fh(struct ovl_fs *ofs, struct ovl_fh *fh, 255 + struct dentry *upperdentry, struct ovl_path **stackp); 256 + int ovl_verify_set_fh(struct dentry *dentry, const char *name, 257 + struct dentry *real, bool is_upper, bool set); 258 + struct dentry *ovl_index_upper(struct ovl_fs *ofs, struct dentry *index); 259 + int ovl_verify_index(struct ovl_fs *ofs, struct dentry *index); 271 260 int ovl_get_index_name(struct dentry *origin, struct qstr *name); 261 + struct dentry *ovl_get_index_fh(struct ovl_fs *ofs, struct ovl_fh *fh); 262 + struct dentry *ovl_lookup_index(struct ovl_fs *ofs, struct dentry *upper, 263 + struct dentry *origin, bool verify); 272 264 int ovl_path_next(int idx, struct dentry *dentry, struct path *path); 273 - struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags); 265 + struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, 266 + unsigned int flags); 274 267 bool ovl_lower_positive(struct dentry *dentry); 268 + 269 + static inline int ovl_verify_origin(struct dentry *upper, 270 + struct dentry *origin, bool set) 271 + { 272 + return ovl_verify_set_fh(upper, OVL_XATTR_ORIGIN, origin, false, set); 273 + } 274 + 275 + static 
inline int ovl_verify_upper(struct dentry *index, 276 + struct dentry *upper, bool set) 277 + { 278 + return ovl_verify_set_fh(index, OVL_XATTR_UPPER, upper, true, set); 279 + } 275 280 276 281 /* readdir.c */ 277 282 extern const struct file_operations ovl_dir_operations; ··· 302 267 int ovl_check_d_type_supported(struct path *realpath); 303 268 void ovl_workdir_cleanup(struct inode *dir, struct vfsmount *mnt, 304 269 struct dentry *dentry, int level); 305 - int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt, 306 - struct ovl_path *lower, unsigned int numlower); 270 + int ovl_indexdir_cleanup(struct ovl_fs *ofs); 307 271 308 272 /* inode.c */ 309 273 int ovl_set_nlink_upper(struct dentry *dentry); ··· 325 291 bool ovl_is_private_xattr(const char *name); 326 292 327 293 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev); 328 - struct inode *ovl_get_inode(struct dentry *dentry, struct dentry *upperdentry, 329 - struct dentry *index); 294 + struct inode *ovl_lookup_inode(struct super_block *sb, struct dentry *real, 295 + bool is_upper); 296 + struct inode *ovl_get_inode(struct super_block *sb, struct dentry *upperdentry, 297 + struct dentry *lowerdentry, struct dentry *index, 298 + unsigned int numlower); 330 299 static inline void ovl_copyattr(struct inode *from, struct inode *to) 331 300 { 332 301 to->i_uid = from->i_uid; ··· 343 306 /* dir.c */ 344 307 extern const struct inode_operations ovl_dir_inode_operations; 345 308 struct dentry *ovl_lookup_temp(struct dentry *workdir); 309 + int ovl_cleanup_and_whiteout(struct dentry *workdir, struct inode *dir, 310 + struct dentry *dentry); 346 311 struct cattr { 347 312 dev_t rdev; 348 313 umode_t mode; ··· 360 321 int ovl_copy_up_flags(struct dentry *dentry, int flags); 361 322 int ovl_copy_xattr(struct dentry *old, struct dentry *new); 362 323 int ovl_set_attr(struct dentry *upper, struct kstat *stat); 363 - struct ovl_fh *ovl_encode_fh(struct dentry *lower, bool is_upper); 
324 + struct ovl_fh *ovl_encode_fh(struct dentry *real, bool is_upper); 325 + int ovl_set_origin(struct dentry *dentry, struct dentry *lower, 326 + struct dentry *upper); 327 + 328 + /* export.c */ 329 + extern const struct export_operations ovl_export_operations;
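Review note: ovl_verify_origin() and ovl_verify_upper() are now thin wrappers around ovl_verify_set_fh(), which compares an encoded file handle with the one stored in a named xattr and, when @set is true and no xattr exists, stores it. The sketch below models only that return-value contract (0 on match, -ESTALE on mismatch, -ENODATA when nothing is stored) in userspace. struct demo_xattr and the demo_* functions are hypothetical; the real code reads and writes xattrs on a dentry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one xattr slot on a dentry. */
struct demo_xattr {
	unsigned char val[32];
	size_t len;
	int present;
};

/* 0 on match, -ESTALE on mismatch, -ENODATA if nothing is stored. */
static int demo_verify_fh(const struct demo_xattr *x,
			  const unsigned char *fh, size_t len)
{
	if (!x->present)
		return -ENODATA;
	if (x->len != len || memcmp(x->val, fh, len) != 0)
		return -ESTALE;
	return 0;
}

/* Verify the stored handle; store @fh when @set is true and none exists. */
static int demo_verify_set_fh(struct demo_xattr *x,
			      const unsigned char *fh, size_t len, int set)
{
	int err;

	assert(x && fh);

	err = demo_verify_fh(x, fh, len);
	if (set && err == -ENODATA) {
		if (len > sizeof(x->val))
			return -EINVAL;	/* sketch-only limit of the demo buffer */
		memcpy(x->val, fh, len);
		x->len = len;
		x->present = 1;
		err = 0;
	}
	return err;
}
```

Note that an existing, mismatching value is never overwritten, even with set enabled. That is what lets ovl_get_indexdir() use the "upper" xattr to detect an index directory that belongs to a different upper root.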
+9 -2
fs/overlayfs/ovl_entry.h
···
 	bool redirect_follow;
 	const char *redirect_mode;
 	bool index;
+	bool nfs_export;
 };
 
 struct ovl_layer {
 	struct vfsmount *mnt;
 	dev_t pseudo_dev;
+	/* Index of this layer in fs root (upper == 0) */
+	int idx;
 };
 
 struct ovl_path {
···
 struct ovl_entry {
 	union {
 		struct {
-			unsigned long has_upper;
-			bool opaque;
+			unsigned long flags;
 		};
 		struct rcu_head rcu;
 	};
···
 };
 
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
+
+static inline struct ovl_entry *OVL_E(struct dentry *dentry)
+{
+	return (struct ovl_entry *) dentry->d_fsdata;
+}
 
 struct ovl_inode {
 	struct ovl_dir_cache *cache;
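Review note: struct ovl_entry now keeps 'has_upper' and 'opaque' as bits in a single flags word, indexed by enum ovl_entry_flag from overlayfs.h. The bodies of ovl_dentry_set_flag(), ovl_dentry_clear_flag() and ovl_dentry_test_flag() are not part of this diff, so the sketch below is only an assumption about the general pattern. It uses plain, non-atomic bit operations on a hypothetical struct demo_entry, whereas kernel code would normally use atomic bitops.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Mirrors the shape of enum ovl_entry_flag; the names are hypothetical. */
enum demo_entry_flag {
	DEMO_E_UPPER_ALIAS,
	DEMO_E_OPAQUE,
};

struct demo_entry {
	unsigned long flags;
};

static void demo_set_flag(unsigned long flag, struct demo_entry *e)
{
	assert(flag < sizeof(e->flags) * CHAR_BIT);
	e->flags |= 1UL << flag;
}

static void demo_clear_flag(unsigned long flag, struct demo_entry *e)
{
	assert(flag < sizeof(e->flags) * CHAR_BIT);
	e->flags &= ~(1UL << flag);
}

static bool demo_test_flag(unsigned long flag, const struct demo_entry *e)
{
	assert(flag < sizeof(e->flags) * CHAR_BIT);
	return (e->flags >> flag) & 1UL;
}
```

Packing both booleans into one word keeps the anonymous struct the same size as before while leaving room for further per-entry flags.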
+46 -11
fs/overlayfs/readdir.c
··· 593 593 return ERR_PTR(res); 594 594 } 595 595 if (list_empty(&cache->entries)) { 596 - /* Good oportunity to get rid of an unnecessary "impure" flag */ 597 - ovl_do_removexattr(ovl_dentry_upper(dentry), OVL_XATTR_IMPURE); 596 + /* 597 + * A good opportunity to get rid of an unneeded "impure" flag. 598 + * Removing the "impure" xattr is best effort. 599 + */ 600 + if (!ovl_want_write(dentry)) { 601 + ovl_do_removexattr(ovl_dentry_upper(dentry), 602 + OVL_XATTR_IMPURE); 603 + ovl_drop_write(dentry); 604 + } 598 605 ovl_clear_flag(OVL_IMPURE, d_inode(dentry)); 599 606 kfree(cache); 600 607 return NULL; ··· 776 769 struct dentry *dentry = file->f_path.dentry; 777 770 struct file *realfile = od->realfile; 778 771 772 + /* Nothing to sync for lower */ 773 + if (!OVL_TYPE_UPPER(ovl_path_type(dentry))) 774 + return 0; 775 + 779 776 /* 780 777 * Need to check if we started out being a lower dir, but got copied up 781 778 */ 782 - if (!od->is_upper && OVL_TYPE_UPPER(ovl_path_type(dentry))) { 779 + if (!od->is_upper) { 783 780 struct inode *inode = file_inode(file); 784 781 785 782 realfile = READ_ONCE(od->upperfile); ··· 869 858 int err; 870 859 struct ovl_cache_entry *p, *n; 871 860 struct rb_root root = RB_ROOT; 861 + const struct cred *old_cred; 872 862 863 + old_cred = ovl_override_creds(dentry->d_sb); 873 864 err = ovl_dir_read_merged(dentry, list, &root); 865 + revert_creds(old_cred); 874 866 if (err) 875 867 return err; 876 868 ··· 1030 1016 } 1031 1017 } 1032 1018 1033 - int ovl_indexdir_cleanup(struct dentry *dentry, struct vfsmount *mnt, 1034 - struct ovl_path *lower, unsigned int numlower) 1019 + int ovl_indexdir_cleanup(struct ovl_fs *ofs) 1035 1020 { 1036 1021 int err; 1022 + struct dentry *indexdir = ofs->indexdir; 1037 1023 struct dentry *index = NULL; 1038 - struct inode *dir = dentry->d_inode; 1039 - struct path path = { .mnt = mnt, .dentry = dentry }; 1024 + struct inode *dir = indexdir->d_inode; 1025 + struct path path = { .mnt = ofs->upper_mnt, 
.dentry = indexdir }; 1040 1026 LIST_HEAD(list); 1041 1027 struct rb_root root = RB_ROOT; 1042 1028 struct ovl_cache_entry *p; ··· 1060 1046 if (p->len == 2 && p->name[1] == '.') 1061 1047 continue; 1062 1048 } 1063 - index = lookup_one_len(p->name, dentry, p->len); 1049 + index = lookup_one_len(p->name, indexdir, p->len); 1064 1050 if (IS_ERR(index)) { 1065 1051 err = PTR_ERR(index); 1066 1052 index = NULL; 1067 1053 break; 1068 1054 } 1069 - err = ovl_verify_index(index, lower, numlower); 1070 - /* Cleanup stale and orphan index entries */ 1071 - if (err && (err == -ESTALE || err == -ENOENT)) 1055 + err = ovl_verify_index(ofs, index); 1056 + if (!err) { 1057 + goto next; 1058 + } else if (err == -ESTALE) { 1059 + /* Cleanup stale index entries */ 1072 1060 err = ovl_cleanup(dir, index); 1061 + } else if (err != -ENOENT) { 1062 + /* 1063 + * Abort mount to avoid corrupting the index if 1064 + * an incompatible index entry was found or on out 1065 + * of memory. 1066 + */ 1067 + break; 1068 + } else if (ofs->config.nfs_export) { 1069 + /* 1070 + * Whiteout orphan index to block future open by 1071 + * handle after overlay nlink dropped to zero. 1072 + */ 1073 + err = ovl_cleanup_and_whiteout(indexdir, dir, index); 1074 + } else { 1075 + /* Cleanup orphan index entries */ 1076 + err = ovl_cleanup(dir, index); 1077 + } 1078 + 1073 1079 if (err) 1074 1080 break; 1075 1081 1082 + next: 1076 1083 dput(index); 1077 1084 index = NULL; 1078 1085 }
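Review note: the reworked loop in ovl_indexdir_cleanup() is effectively a small decision table keyed on the result of ovl_verify_index() and on the nfs_export option. The sketch below restates that table as a pure function so the cases are easier to check. The enum and function names are hypothetical, and the function only classifies the result without touching any filesystem.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* What the cleanup loop does with one index entry (hypothetical names). */
enum demo_index_action {
	DEMO_KEEP,	/* entry verified, leave it in place */
	DEMO_REMOVE,	/* ovl_cleanup(): stale entry, or orphan without export */
	DEMO_WHITEOUT,	/* ovl_cleanup_and_whiteout(): orphan with NFS export */
	DEMO_ABORT,	/* any other error aborts the mount */
};

static enum demo_index_action demo_classify_index(int err, bool nfs_export)
{
	/* ovl_verify_index() returns 0 or a negative errno. */
	assert(err <= 0);

	if (!err)
		return DEMO_KEEP;
	if (err == -ESTALE)
		return DEMO_REMOVE;
	if (err != -ENOENT)
		return DEMO_ABORT;

	/* -ENOENT marks an orphan index entry. */
	return nfs_export ? DEMO_WHITEOUT : DEMO_REMOVE;
}
```

The only case that depends on nfs_export is the orphan (-ENOENT) one: with export enabled the entry is replaced by a whiteout so that a later decode of the same file handle reports it as stale.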
+97 -28
fs/overlayfs/super.c
@@ -45,6 +45,11 @@
 MODULE_PARM_DESC(ovl_index_def,
 		 "Default to on or off for the inodes index feature");
 
+static bool ovl_nfs_export_def = IS_ENABLED(CONFIG_OVERLAY_FS_NFS_EXPORT);
+module_param_named(nfs_export, ovl_nfs_export_def, bool, 0644);
+MODULE_PARM_DESC(ovl_nfs_export_def,
+		 "Default to on or off for the NFS export feature");
+
 static void ovl_entry_stack_free(struct ovl_entry *oe)
 {
 	unsigned int i;
@@ -211,6 +216,7 @@
 	struct ovl_inode *oi = OVL_I(inode);
 
 	dput(oi->__upperdentry);
+	iput(oi->lower);
 	kfree(oi->redirect);
 	ovl_dir_cache_free(inode);
 	mutex_destroy(&oi->lock);
@@ -341,6 +347,9 @@
 		seq_printf(m, ",redirect_dir=%s", ofs->config.redirect_mode);
 	if (ofs->config.index != ovl_index_def)
 		seq_printf(m, ",index=%s", ofs->config.index ? "on" : "off");
+	if (ofs->config.nfs_export != ovl_nfs_export_def)
+		seq_printf(m, ",nfs_export=%s", ofs->config.nfs_export ?
+						"on" : "off");
 	return 0;
 }
 
@@ -373,6 +382,8 @@
 	OPT_REDIRECT_DIR,
 	OPT_INDEX_ON,
 	OPT_INDEX_OFF,
+	OPT_NFS_EXPORT_ON,
+	OPT_NFS_EXPORT_OFF,
 	OPT_ERR,
 };
 
@@ -384,6 +395,8 @@
 	{OPT_REDIRECT_DIR, "redirect_dir=%s"},
 	{OPT_INDEX_ON, "index=on"},
 	{OPT_INDEX_OFF, "index=off"},
+	{OPT_NFS_EXPORT_ON, "nfs_export=on"},
+	{OPT_NFS_EXPORT_OFF, "nfs_export=off"},
 	{OPT_ERR, NULL}
 };
 
@@ -490,6 +503,14 @@
 			config->index = false;
 			break;
 
+		case OPT_NFS_EXPORT_ON:
+			config->nfs_export = true;
+			break;
+
+		case OPT_NFS_EXPORT_OFF:
+			config->nfs_export = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -519,10 +540,6 @@
 	int err;
 	bool retried = false;
 	bool locked = false;
-
-	err = mnt_want_write(mnt);
-	if (err)
-		goto out_err;
 
 	inode_lock_nested(dir, I_MUTEX_PARENT);
 	locked = true;
@@ -588,7 +605,6 @@
 		goto out_err;
 	}
 out_unlock:
-	mnt_drop_write(mnt);
 	if (locked)
 		inode_unlock(dir);
 
@@ -700,12 +716,16 @@
 		*remote = true;
 
 	/*
-	 * The inodes index feature needs to encode and decode file
-	 * handles, so it requires that all layers support them.
+	 * The inodes index feature and NFS export need to encode and decode
+	 * file handles, so they require that all layers support them.
 	 */
-	if (ofs->config.index && !ovl_can_decode_fh(path->dentry->d_sb)) {
+	if ((ofs->config.nfs_export ||
+	     (ofs->config.index && ofs->config.upperdir)) &&
+	    !ovl_can_decode_fh(path->dentry->d_sb)) {
 		ofs->config.index = false;
-		pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off.\n", name);
+		ofs->config.nfs_export = false;
+		pr_warn("overlayfs: fs on '%s' does not support file handles, falling back to index=off,nfs_export=off.\n",
+			name);
 	}
 
 	return 0;
@@ -929,12 +949,17 @@
 
 static int ovl_make_workdir(struct ovl_fs *ofs, struct path *workpath)
 {
+	struct vfsmount *mnt = ofs->upper_mnt;
 	struct dentry *temp;
 	int err;
 
+	err = mnt_want_write(mnt);
+	if (err)
+		return err;
+
 	ofs->workdir = ovl_workdir_create(ofs, OVL_WORKDIR_NAME, false);
 	if (!ofs->workdir)
-		return 0;
+		goto out;
 
 	/*
 	 * Upper should support d_type, else whiteouts are visible. Given
@@ -944,7 +969,7 @@
 	 */
 	err = ovl_check_d_type_supported(workpath);
 	if (err < 0)
-		return err;
+		goto out;
 
 	/*
 	 * We allowed this configuration and don't want to break users over
@@ -967,7 +992,9 @@
 	err = ovl_do_setxattr(ofs->workdir, OVL_XATTR_OPAQUE, "0", 1, 0);
 	if (err) {
 		ofs->noxattr = true;
-		pr_warn("overlayfs: upper fs does not support xattr.\n");
+		ofs->config.index = false;
+		pr_warn("overlayfs: upper fs does not support xattr, falling back to index=off.\n");
+		err = 0;
 	} else {
 		vfs_removexattr(ofs->workdir, OVL_XATTR_OPAQUE);
 	}
@@ -979,7 +1006,15 @@
 		pr_warn("overlayfs: upper fs does not support file handles, falling back to index=off.\n");
 	}
 
-	return 0;
+	/* NFS export of r/w mount depends on index */
+	if (ofs->config.nfs_export && !ofs->config.index) {
+		pr_warn("overlayfs: NFS export requires \"index=on\", falling back to nfs_export=off.\n");
+		ofs->config.nfs_export = false;
+	}
+
+out:
+	mnt_drop_write(mnt);
+	return err;
 }
 
 static int ovl_get_workdir(struct ovl_fs *ofs, struct path *upperpath)
@@ -1026,11 +1061,16 @@
 static int ovl_get_indexdir(struct ovl_fs *ofs, struct ovl_entry *oe,
 			    struct path *upperpath)
 {
+	struct vfsmount *mnt = ofs->upper_mnt;
 	int err;
+
+	err = mnt_want_write(mnt);
+	if (err)
+		return err;
 
 	/* Verify lower root is upper root origin */
 	err = ovl_verify_origin(upperpath->dentry, oe->lowerstack[0].dentry,
-				false, true);
+				true);
 	if (err) {
 		pr_err("overlayfs: failed to verify upper root origin\n");
 		goto out;
@@ -1038,23 +1078,33 @@
 
 	ofs->indexdir = ovl_workdir_create(ofs, OVL_INDEXDIR_NAME, true);
 	if (ofs->indexdir) {
-		/* Verify upper root is index dir origin */
-		err = ovl_verify_origin(ofs->indexdir, upperpath->dentry,
-					true, true);
+		/*
+		 * Verify upper root is exclusively associated with index dir.
+		 * Older kernels stored upper fh in "trusted.overlay.origin"
+		 * xattr. If that xattr exists, verify that it is a match to
+		 * upper dir file handle. In any case, verify or set xattr
+		 * "trusted.overlay.upper" to indicate that index may have
+		 * directory entries.
+		 */
+		if (ovl_check_origin_xattr(ofs->indexdir)) {
+			err = ovl_verify_set_fh(ofs->indexdir, OVL_XATTR_ORIGIN,
+						upperpath->dentry, true, false);
+			if (err)
+				pr_err("overlayfs: failed to verify index dir 'origin' xattr\n");
+		}
+		err = ovl_verify_upper(ofs->indexdir, upperpath->dentry, true);
 		if (err)
-			pr_err("overlayfs: failed to verify index dir origin\n");
+			pr_err("overlayfs: failed to verify index dir 'upper' xattr\n");
 
 		/* Cleanup bad/stale/orphan index entries */
 		if (!err)
-			err = ovl_indexdir_cleanup(ofs->indexdir,
-						   ofs->upper_mnt,
-						   oe->lowerstack,
-						   oe->numlower);
+			err = ovl_indexdir_cleanup(ofs);
 	}
 	if (err || !ofs->indexdir)
 		pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
 
 out:
+	mnt_drop_write(mnt);
 	return err;
 }
 
@@ -1094,6 +1144,7 @@
 
 		ofs->lower_layers[ofs->numlower].mnt = mnt;
 		ofs->lower_layers[ofs->numlower].pseudo_dev = dev;
+		ofs->lower_layers[ofs->numlower].idx = i + 1;
 		ofs->numlower++;
 
 		/* Check if all lower layers are on same sb */
@@ -1131,6 +1182,10 @@
 	} else if (!ofs->config.upperdir && stacklen == 1) {
 		pr_err("overlayfs: at least 2 lowerdir are needed while upperdir nonexistent\n");
 		goto out_err;
+	} else if (!ofs->config.upperdir && ofs->config.nfs_export &&
+		   ofs->config.redirect_follow) {
+		pr_warn("overlayfs: NFS export requires \"redirect_dir=nofollow\" on non-upper mount, falling back to nfs_export=off.\n");
+		ofs->config.nfs_export = false;
 	}
 
 	err = -ENOMEM;
@@ -1207,6 +1262,7 @@
 		goto out_err;
 
 	ofs->config.index = ovl_index_def;
+	ofs->config.nfs_export = ovl_nfs_export_def;
 	err = ovl_parse_opt((char *) data, &ofs->config);
 	if (err)
 		goto out_err;
@@ -1257,13 +1313,26 @@
 		if (err)
 			goto out_free_oe;
 
-		if (!ofs->indexdir)
+		/* Force r/o mount with no index dir */
+		if (!ofs->indexdir) {
+			dput(ofs->workdir);
+			ofs->workdir = NULL;
 			sb->s_flags |= SB_RDONLY;
+		}
+
 	}
 
-	/* Show index=off/on in /proc/mounts for any of the reasons above */
-	if (!ofs->indexdir)
+	/* Show index=off in /proc/mounts for forced r/o mount */
+	if (!ofs->indexdir) {
 		ofs->config.index = false;
+		if (ofs->upper_mnt && ofs->config.nfs_export) {
+			pr_warn("overlayfs: NFS export requires an index dir, falling back to nfs_export=off.\n");
+			ofs->config.nfs_export = false;
+		}
+	}
+
+	if (ofs->config.nfs_export)
+		sb->s_export_op = &ovl_export_operations;
 
 	/* Never override disk quota limits or use reserved space */
 	cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
@@ -1279,14 +1348,14 @@
 	if (!root_dentry)
 		goto out_free_oe;
 
+	root_dentry->d_fsdata = oe;
+
 	mntput(upperpath.mnt);
 	if (upperpath.dentry) {
-		oe->has_upper = true;
+		ovl_dentry_set_upper_alias(root_dentry);
 		if (ovl_is_impuredir(upperpath.dentry))
 			ovl_set_flag(OVL_IMPURE, d_inode(root_dentry));
 	}
-
-	root_dentry->d_fsdata = oe;
 
 	/* Root is always merge -> can have whiteouts */
 	ovl_set_flag(OVL_WHITEOUTS, d_inode(root_dentry));
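The fallbacks that the super.c changes apply to index and nfs_export can be summarised in a small userspace model. This is a sketch under stated assumptions, not the kernel code: struct cfg_model, struct env_model and model_resolve() are hypothetical names, the environment is reduced to three booleans, and the checks are grouped by rule rather than in the exact order ovl_fill_super() reaches them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the mount options the hunks touch. */
struct cfg_model {
	bool has_upperdir;	/* upperdir= was given (r/w capable mount) */
	bool index;		/* index=on */
	bool nfs_export;	/* nfs_export=on */
	bool redirect_follow;	/* lower redirects are followed */
};

/* Hypothetical summary of what the underlying filesystems support. */
struct env_model {
	bool layers_decode_fh;	/* all layers can decode file handles */
	bool upper_xattr;	/* upper fs accepts the overlay xattrs */
	bool have_indexdir;	/* index dir was created and verified */
};

void model_resolve(struct cfg_model *c, const struct env_model *e)
{
	assert(c && e);

	/* Upper without xattr support: no index; r/w export needs index. */
	if (c->has_upperdir) {
		if (!e->upper_xattr)
			c->index = false;
		if (c->nfs_export && !c->index)
			c->nfs_export = false;
	}

	/* Both features encode and decode file handles on every layer. */
	if ((c->nfs_export || (c->index && c->has_upperdir)) &&
	    !e->layers_decode_fh) {
		c->index = false;
		c->nfs_export = false;
	}

	/* Lower-only mount: redirects cannot be verified with an index. */
	if (!c->has_upperdir && c->nfs_export && c->redirect_follow)
		c->nfs_export = false;

	/* No index dir: index is off, and export of an upper mount too. */
	if (!e->have_indexdir) {
		c->index = false;
		if (c->has_upperdir && c->nfs_export)
			c->nfs_export = false;
	}
}
```

In this model a lower-only mount keeps nfs_export=on as long as redirects are not followed, even though it never has an index dir, which matches the ofs->upper_mnt test in the final hunk.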
+79 -29
fs/overlayfs/util.c
@@ -63,6 +63,22 @@
 	return ofs->indexdir;
 }
 
+/* Index all files on copy up. For now only enabled for NFS export */
+bool ovl_index_all(struct super_block *sb)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	return ofs->config.nfs_export && ofs->config.index;
+}
+
+/* Verify lower origin on lookup. For now only enabled for NFS export */
+bool ovl_verify_lower(struct super_block *sb)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	return ofs->config.nfs_export && ofs->config.index;
+}
+
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower)
 {
 	size_t size = offsetof(struct ovl_entry, lowerstack[numlower]);
@@ -194,10 +210,24 @@
 	OVL_I(inode)->cache = cache;
 }
 
+void ovl_dentry_set_flag(unsigned long flag, struct dentry *dentry)
+{
+	set_bit(flag, &OVL_E(dentry)->flags);
+}
+
+void ovl_dentry_clear_flag(unsigned long flag, struct dentry *dentry)
+{
+	clear_bit(flag, &OVL_E(dentry)->flags);
+}
+
+bool ovl_dentry_test_flag(unsigned long flag, struct dentry *dentry)
+{
+	return test_bit(flag, &OVL_E(dentry)->flags);
+}
+
 bool ovl_dentry_is_opaque(struct dentry *dentry)
 {
-	struct ovl_entry *oe = dentry->d_fsdata;
-	return oe->opaque;
+	return ovl_dentry_test_flag(OVL_E_OPAQUE, dentry);
 }
 
 bool ovl_dentry_is_whiteout(struct dentry *dentry)
@@ -207,28 +237,23 @@
 
 void ovl_dentry_set_opaque(struct dentry *dentry)
 {
-	struct ovl_entry *oe = dentry->d_fsdata;
-
-	oe->opaque = true;
+	ovl_dentry_set_flag(OVL_E_OPAQUE, dentry);
 }
 
 /*
- * For hard links it's possible for ovl_dentry_upper() to return positive, while
- * there's no actual upper alias for the inode. Copy up code needs to know
- * about the existence of the upper alias, so it can't use ovl_dentry_upper().
+ * For hard links and decoded file handles, it's possible for ovl_dentry_upper()
+ * to return positive, while there's no actual upper alias for the inode.
+ * Copy up code needs to know about the existence of the upper alias, so it
+ * can't use ovl_dentry_upper().
  */
 bool ovl_dentry_has_upper_alias(struct dentry *dentry)
 {
-	struct ovl_entry *oe = dentry->d_fsdata;
-
-	return oe->has_upper;
+	return ovl_dentry_test_flag(OVL_E_UPPER_ALIAS, dentry);
 }
 
 void ovl_dentry_set_upper_alias(struct dentry *dentry)
 {
-	struct ovl_entry *oe = dentry->d_fsdata;
-
-	oe->has_upper = true;
+	ovl_dentry_set_flag(OVL_E_UPPER_ALIAS, dentry);
 }
 
 bool ovl_redirect_dir(struct super_block *sb)
@@ -257,7 +282,7 @@
 	if (upperdentry)
 		OVL_I(inode)->__upperdentry = upperdentry;
 	if (lowerdentry)
-		OVL_I(inode)->lower = d_inode(lowerdentry);
+		OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
 
 	ovl_copyattr(d_inode(upperdentry ?: lowerdentry), inode);
 }
@@ -273,7 +298,7 @@
 	 */
 	smp_wmb();
 	OVL_I(inode)->__upperdentry = upperdentry;
-	if (!S_ISDIR(upperinode->i_mode) && inode_unhashed(inode)) {
+	if (inode_unhashed(inode)) {
 		inode->i_private = upperinode;
 		__insert_inode_hash(inode, (unsigned long) upperinode);
 	}
@@ -447,10 +472,32 @@
 	}
 }
 
+/*
+ * Does this overlay dentry need to be indexed on copy up?
+ */
+bool ovl_need_index(struct dentry *dentry)
+{
+	struct dentry *lower = ovl_dentry_lower(dentry);
+
+	if (!lower || !ovl_indexdir(dentry->d_sb))
+		return false;
+
+	/* Index all files for NFS export and consistency verification */
+	if (ovl_index_all(dentry->d_sb))
+		return true;
+
+	/* Index only lower hardlinks on copy up */
+	if (!d_is_dir(lower) && d_inode(lower)->i_nlink > 1)
+		return true;
+
+	return false;
+}
+
 /* Caller must hold OVL_I(inode)->lock */
 static void ovl_cleanup_index(struct dentry *dentry)
 {
-	struct inode *dir = ovl_indexdir(dentry->d_sb)->d_inode;
+	struct dentry *indexdir = ovl_indexdir(dentry->d_sb);
+	struct inode *dir = indexdir->d_inode;
 	struct dentry *lowerdentry = ovl_dentry_lower(dentry);
 	struct dentry *upperdentry = ovl_dentry_upper(dentry);
 	struct dentry *index = NULL;
@@ -463,7 +510,7 @@
 		goto fail;
 
 	inode = d_inode(upperdentry);
-	if (inode->i_nlink != 1) {
+	if (!S_ISDIR(inode->i_mode) && inode->i_nlink != 1) {
 		pr_warn_ratelimited("overlayfs: cleanup linked index (%pd2, ino=%lu, nlink=%u)\n",
 				    upperdentry, inode->i_ino, inode->i_nlink);
 		/*
@@ -481,13 +528,17 @@
 	}
 
 	inode_lock_nested(dir, I_MUTEX_PARENT);
-	/* TODO: whiteout instead of cleanup to block future open by handle */
-	index = lookup_one_len(name.name, ovl_indexdir(dentry->d_sb), name.len);
+	index = lookup_one_len(name.name, indexdir, name.len);
 	err = PTR_ERR(index);
-	if (!IS_ERR(index))
-		err = ovl_cleanup(dir, index);
-	else
+	if (IS_ERR(index)) {
 		index = NULL;
+	} else if (ovl_index_all(dentry->d_sb)) {
+		/* Whiteout orphan index to block future open by handle */
+		err = ovl_cleanup_and_whiteout(indexdir, dir, index);
+	} else {
+		/* Cleanup orphan index entries */
+		err = ovl_cleanup(dir, index);
+	}
 
 	inode_unlock(dir);
 	if (err)
@@ -512,16 +563,16 @@
 	const struct cred *old_cred;
 	int err;
 
-	if (!d_inode(dentry) || d_is_dir(dentry))
+	if (!d_inode(dentry))
 		return 0;
 
 	/*
 	 * With inodes index is enabled, we store the union overlay nlink
-	 * in an xattr on the index inode. When whiting out lower hardlinks
+	 * in an xattr on the index inode. When whiting out an indexed lower,
 	 * we need to decrement the overlay persistent nlink, but before the
 	 * first copy up, we have no upper index inode to store the xattr.
 	 *
-	 * As a workaround, before whiteout/rename over of a lower hardlink,
+	 * As a workaround, before whiteout/rename over an indexed lower,
 	 * copy up to create the upper index. Creating the upper index will
 	 * initialize the overlay nlink, so it could be dropped if unlink
 	 * or rename succeeds.
@@ -529,8 +580,7 @@
 	 * TODO: implement metadata only index copy up when called with
 	 * ovl_copy_up_flags(dentry, O_PATH).
 	 */
-	if (ovl_indexdir(dentry->d_sb) && !ovl_dentry_has_upper_alias(dentry) &&
-	    d_inode(ovl_dentry_lower(dentry))->i_nlink > 1) {
+	if (ovl_need_index(dentry) && !ovl_dentry_has_upper_alias(dentry)) {
 		err = ovl_copy_up(dentry);
 		if (err)
 			return err;
@@ -540,7 +590,7 @@
 	if (err)
 		return err;
 
-	if (!ovl_test_flag(OVL_INDEX, d_inode(dentry)))
+	if (d_is_dir(dentry) || !ovl_test_flag(OVL_INDEX, d_inode(dentry)))
 		goto out;
 
 	old_cred = ovl_override_creds(dentry->d_sb);
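The indexing decision that ovl_need_index() introduces can be modelled outside the kernel. The sketch below is a userspace approximation only: struct sb_model, struct dentry_model, model_index_all() and model_need_index() are hypothetical names that stand in for the superblock and dentry state the real helpers read, and they are not part of overlayfs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mount state the helpers consult. */
struct sb_model {
	bool have_indexdir;	/* ovl_indexdir(sb) != NULL */
	bool index;		/* config.index */
	bool nfs_export;	/* config.nfs_export */
};

/* Hypothetical stand-in for what is known about the lower dentry. */
struct dentry_model {
	bool has_lower;			/* ovl_dentry_lower() != NULL */
	bool lower_is_dir;		/* d_is_dir(lower) */
	unsigned int lower_nlink;	/* d_inode(lower)->i_nlink */
};

/* Mirrors ovl_index_all(): index everything only for NFS export. */
bool model_index_all(const struct sb_model *sb)
{
	assert(sb);
	return sb->nfs_export && sb->index;
}

/* Mirrors the checks in ovl_need_index(), in the same order. */
bool model_need_index(const struct sb_model *sb, const struct dentry_model *d)
{
	assert(sb && d);

	/* Nothing to index without a lower object or an index dir. */
	if (!d->has_lower || !sb->have_indexdir)
		return false;

	/* NFS export: every copied up object is indexed, dirs included. */
	if (model_index_all(sb))
		return true;

	/* Otherwise only lower hardlinks (non-dir, nlink > 1). */
	return !d->lower_is_dir && d->lower_nlink > 1;
}
```

With index=on alone, only a lower non-directory with more than one link is indexed; adding nfs_export=on makes directories and singly linked files qualify as well, which is why ovl_nlink_start() no longer bails out early for directories.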
+2
include/linux/dcache.h
@@ -227,6 +227,7 @@
  */
 extern void d_instantiate(struct dentry *, struct inode *);
 extern struct dentry * d_instantiate_unique(struct dentry *, struct inode *);
+extern struct dentry * d_instantiate_anon(struct dentry *, struct inode *);
 extern int d_instantiate_no_diralias(struct dentry *, struct inode *);
 extern void __d_drop(struct dentry *dentry);
 extern void d_drop(struct dentry *dentry);
@@ -235,6 +236,7 @@
 
 /* allocate/de-allocate */
 extern struct dentry * d_alloc(struct dentry *, const struct qstr *);
+extern struct dentry * d_alloc_anon(struct super_block *);
 extern struct dentry * d_alloc_pseudo(struct super_block *, const struct qstr *);
 extern struct dentry * d_alloc_parallel(struct dentry *, const struct qstr *,
 					wait_queue_head_t *);