fix proc_sys_compare() handling of in-lookup dentries

There's one case where ->d_compare() can be called for an in-lookup
dentry; usually that's nothing special from ->d_compare() point of
view, but... proc_sys_compare() is weird.

The thing is, /proc/sys subdirectories can look different for
different processes. Up to and including having the same name
resolve to different dentries - all of them hashed.

The way it's done is ->d_compare() refusing to admit a match unless
this dentry is supposed to be visible to this caller. The information
needed to discriminate between them is stored in the inode; it is set
during proc_sys_lookup(), and until that has called d_splice_alias() we
really can't tell who that dentry should be visible to.

Normally there are no negative dentries in /proc/sys; we can run into
a dying dentry in RCU dcache lookup, but those can be safely rejected.

However, ->d_compare() is also called for in-lookup dentries, before
they become positive - or hashed, for that matter. In case of a match
we will wait until the dentry leaves in-lookup state and repeat ->d_compare()
afterwards. In other words, the right behaviour is to treat the
name match as sufficient for in-lookup dentries; if dentry is not
for us, we'll see that when we recheck once proc_sys_lookup() is
done with it.

While we are at it, fix the misspelled READ_ONCE and WRITE_ONCE there
(they were written as rcu_dereference() and RCU_INIT_POINTER()).

Fixes: d9171b934526 ("parallel lookups machinery, part 4 (and last)")
Reported-by: NeilBrown <neilb@brown.name>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro b969f961 d0b3b7b2

Changed files
+12 -8
fs/proc/inode.c (+1 -1)

--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -42,7 +42,7 @@
 
 	head = ei->sysctl;
 	if (head) {
-		RCU_INIT_POINTER(ei->sysctl, NULL);
+		WRITE_ONCE(ei->sysctl, NULL);
 		proc_sys_evict_inode(inode, head);
 	}
 }

fs/proc/proc_sysctl.c (+11 -7)

--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -918,17 +918,21 @@
 	struct ctl_table_header *head;
 	struct inode *inode;
 
-	/* Although proc doesn't have negative dentries, rcu-walk means
-	 * that inode here can be NULL */
-	/* AV: can it, indeed? */
-	inode = d_inode_rcu(dentry);
-	if (!inode)
-		return 1;
 	if (name->len != len)
 		return 1;
 	if (memcmp(name->name, str, len))
 		return 1;
-	head = rcu_dereference(PROC_I(inode)->sysctl);
+
+	// false positive is fine here - we'll recheck anyway
+	if (d_in_lookup(dentry))
+		return 0;
+
+	inode = d_inode_rcu(dentry);
+	// we just might have run into dentry in the middle of __dentry_kill()
+	if (!inode)
+		return 1;
+
+	head = READ_ONCE(PROC_I(inode)->sysctl);
 	return !head || !sysctl_is_seen(head);
 }
 
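
For illustration only, the order of checks in the patched proc_sys_compare() can be modelled in plain userspace C. This is a sketch under stated assumptions, not kernel code: struct model_head, struct model_dentry and model_compare() are hypothetical stand-ins, with the `seen` flag playing the role of sysctl_is_seen() for the current caller and `has_inode` modelling whether d_inode_rcu() returned non-NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the ctl_table_header reached via the inode. */
struct model_head {
	bool seen;	/* models sysctl_is_seen(head) for the current caller */
};

/* Hypothetical stand-in for a dentry; not the kernel's struct dentry. */
struct model_dentry {
	const char *name;
	size_t len;
	bool in_lookup;			/* models d_in_lookup(dentry) */
	bool has_inode;			/* false models d_inode_rcu() == NULL */
	const struct model_head *head;	/* models PROC_I(inode)->sysctl */
};

/* Same convention as ->d_compare(): 0 means "match", 1 means "no match". */
static int model_compare(const struct model_dentry *d,
			 const char *str, size_t len)
{
	/* The name checks come first and need no inode at all. */
	if (d->len != len)
		return 1;
	if (memcmp(d->name, str, len))
		return 1;

	/* In-lookup: a name match is enough, the caller rechecks later. */
	if (d->in_lookup)
		return 0;

	/* No inode: models a dentry caught in the middle of being killed. */
	if (!d->has_inode)
		return 1;

	/* Otherwise the dentry matches only if it is visible to the caller. */
	return !d->head || !d->head->seen;
}
```

In this model an in-lookup entry with a matching name always compares equal, even though it has no inode yet; once `in_lookup` is cleared, the same entry matches only if it has an inode whose head is marked as seen.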