Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

fs: properly document __lookup_mnt()

The comment on top of __lookup_mnt() states that it finds the first
mount, implying that there could be multiple mounts mounted at the same
dentry with the same parent.

On older kernels "shadow mounts" could be created during mount
propagation. So if a mount @m in the destination propagation tree
already had a child mount @p mounted at @mp then any mount @n we
propagated to @m at the same @mp would be appended after the preexisting
mount @p in @mount_hashtable. This was a completely direct way of
creating shadow mounts.
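
The append behaviour described above can be sketched with a toy model.
This is only an illustration under stated assumptions: struct toy_mount,
struct toy_chain and the toy_*() helpers are hypothetical names, a string
stands in for the mountpoint dentry, and a single list stands in for one
bucket of @mount_hashtable. None of this is the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; these are NOT the kernel's struct mount. */
struct toy_mount {
	const char *name;               /* label used only for illustration */
	const struct toy_mount *parent; /* parent mount */
	const char *mountpoint;         /* stands in for the mountpoint dentry */
	struct toy_mount *next;         /* next entry in the same chain */
};

/* A single chain standing in for one bucket of the mount hash table. */
struct toy_chain {
	struct toy_mount *head;
	struct toy_mount *tail;
};

/* Append @m at the tail, so it lands after any preexisting entry. */
void toy_chain_append(struct toy_chain *chain, struct toy_mount *m)
{
	assert(chain && m);
	m->next = NULL;
	if (chain->tail)
		chain->tail->next = m;
	else
		chain->head = m;
	chain->tail = m;
}

/* Return the first entry matching (parent, mountpoint), or NULL. */
struct toy_mount *toy_lookup_first(const struct toy_chain *chain,
				   const struct toy_mount *parent,
				   const char *mountpoint)
{
	for (struct toy_mount *m = chain->head; m; m = m->next)
		if (m->parent == parent &&
		    strcmp(m->mountpoint, mountpoint) == 0)
			return m;
	return NULL;
}

/* Count entries matching (parent, mountpoint); more than one means
 * that shadow mounts exist in this toy model. */
size_t toy_count_at(const struct toy_chain *chain,
		    const struct toy_mount *parent,
		    const char *mountpoint)
{
	size_t count = 0;

	for (const struct toy_mount *m = chain->head; m; m = m->next)
		if (m->parent == parent &&
		    strcmp(m->mountpoint, mountpoint) == 0)
			count++;
	return count;
}
```

In this model, appending @p and then @n with the same parent @m and the
same mountpoint leaves two matching entries in the chain, and a
first-match lookup keeps returning @p while @n sits behind it.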

That direct way is gone, but there are still subtle ways to create shadow
mounts, for example when attaching a source mount @mnt to a shared mount.
The root of the source mount @mnt might be overmounted by a mount @o after
we finished path lookup but before we acquired the namespace semaphore
to copy the source mount tree @mnt.

After we acquire the namespace lock, @mnt is copied including @o
covering it. After we attach @mnt to a shared mount @dest_mnt we end up
propagating it to all its peers and slaves @d. If @d already has a mount
@n mounted on top of it, we tuck @mnt beneath @n. This means we mount
@mnt at @d and mount @n on @mnt. Now we have both @o and @n mounted on
the same mountpoint at @mnt.
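
The tucking step can be sketched as follows. This is a minimal model
under stated assumptions, not the kernel's propagation code: struct
tuck_mount, TUCK_ROOT and the tuck_*() helpers are hypothetical, and a
string stands in for the mountpoint dentry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy mount: a parent plus a string standing in for the
 * mountpoint dentry. NOT the kernel's struct mount. */
struct tuck_mount {
	const char *name;
	struct tuck_mount *parent;
	const char *mountpoint;
};

/* Label for "the root dentry of the mount we are mounted on". */
#define TUCK_ROOT "<root>"

/* Record that @child is mounted on @parent at @mountpoint. */
void tuck_attach(struct tuck_mount *child, struct tuck_mount *parent,
		 const char *mountpoint)
{
	child->parent = parent;
	child->mountpoint = mountpoint;
}

/* Tuck @mnt beneath @n: @mnt takes over @n's place on @n's old parent
 * and @n is remounted on the root of @mnt. */
void tuck_beneath(struct tuck_mount *mnt, struct tuck_mount *n)
{
	assert(mnt != n);
	tuck_attach(mnt, n->parent, n->mountpoint);
	tuck_attach(n, mnt, TUCK_ROOT);
}

/* True if @a and @b are distinct mounts with the same parent and the
 * same mountpoint, i.e. one shadows the other in this toy model. */
bool tuck_share_mountpoint(const struct tuck_mount *a,
			   const struct tuck_mount *b)
{
	return a != b && a->parent != NULL && a->parent == b->parent &&
	       strcmp(a->mountpoint, b->mountpoint) == 0;
}
```

Starting from @o mounted on the root of @mnt and @n mounted on the root
of @d, calling tuck_beneath() on @mnt and @n leaves both @o and @n with
parent @mnt and the same mountpoint, which is the shadow mount situation
described above.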

Explain this in the documentation as this is pretty subtle.

Reviewed-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
Message-Id: <20230202-fs-move-mount-replace-v4-2-98f3d80d7eaa@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>

+19 -3
fs/namespace.c
@@ -658,9 +658,25 @@
 	return false;
 }
 
-/*
- * find the first mount at @dentry on vfsmount @mnt.
- * call under rcu_read_lock()
+/**
+ * __lookup_mnt - find first child mount
+ * @mnt:	parent mount
+ * @dentry:	mountpoint
+ *
+ * If @mnt has a child mount @c mounted @dentry find and return it.
+ *
+ * Note that the child mount @c need not be unique. There are cases
+ * where shadow mounts are created. For example, during mount
+ * propagation when a source mount @mnt whose root got overmounted by a
+ * mount @o after path lookup but before @namespace_sem could be
+ * acquired gets copied and propagated. So @mnt gets copied including
+ * @o. When @mnt is propagated to a destination mount @d that already
+ * has another mount @n mounted at the same mountpoint then the source
+ * mount @mnt will be tucked beneath @n, i.e., @n will be mounted on
+ * @mnt and @mnt mounted on @d. Now both @n and @o are mounted at @mnt
+ * on @dentry.
+ *
+ * Return: The first child of @mnt mounted @dentry or NULL.
  */
 struct mount *__lookup_mnt(struct vfsmount *mnt, struct dentry *dentry)
 {