Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

vfs: fix subtle use-after-free of pipe_inode_info

The pipe code was trying (and failing) to be very careful about freeing
the pipe info only after the last access, with a pattern like:

spin_lock(&inode->i_lock);
if (!--pipe->files) {
inode->i_pipe = NULL;
kill = 1;
}
spin_unlock(&inode->i_lock);
__pipe_unlock(pipe);
if (kill)
free_pipe_info(pipe);

where the final freeing is done last.

HOWEVER. The above is actually broken, because while the freeing is
done at the end, if we have two racing processes releasing the pipe
inode info, the one that *doesn't* free it will decrement the ->files
count, and unlock the inode i_lock, but then still use the
"pipe_inode_info" afterwards when it does the "__pipe_unlock(pipe)".

This is *very* hard to trigger in practice, since the race window is
very small, and adding debug options seems to just hide it by slowing
things down.

Simon originally reported this way back in July as an Oops in
kmem_cache_allocate due to a single bit corruption (due to the final
"spin_unlock(pipe->mutex.wait_lock)" incrementing a field in a different
allocation that had re-used the free'd pipe-info), it's taken this long
to figure out.

Since the 'pipe->files' accesses aren't even protected by the pipe lock
(we very much use the inode lock for that), the simple solution is to
just drop the pipe lock early. And since there were two users of this
pattern, create a helper function for it.

Introduced commit ba5bb147330a ("pipe: take allocation and freeing of
pipe_inode_info out of ->i_mutex").

Reported-by: Simon Kirby <sim@hostway.ca>
Reported-by: Ian Applegate <ia@cloudflare.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org # v3.10+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

+19 -20
+19 -20
fs/pipe.c
··· 726 726 return mask; 727 727 } 728 728 729 + static void put_pipe_info(struct inode *inode, struct pipe_inode_info *pipe) 730 + { 731 + int kill = 0; 732 + 733 + spin_lock(&inode->i_lock); 734 + if (!--pipe->files) { 735 + inode->i_pipe = NULL; 736 + kill = 1; 737 + } 738 + spin_unlock(&inode->i_lock); 739 + 740 + if (kill) 741 + free_pipe_info(pipe); 742 + } 743 + 729 744 static int 730 745 pipe_release(struct inode *inode, struct file *file) 731 746 { 732 - struct pipe_inode_info *pipe = inode->i_pipe; 733 - int kill = 0; 747 + struct pipe_inode_info *pipe = file->private_data; 734 748 735 749 __pipe_lock(pipe); 736 750 if (file->f_mode & FMODE_READ) ··· 757 743 kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN); 758 744 kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT); 759 745 } 760 - spin_lock(&inode->i_lock); 761 - if (!--pipe->files) { 762 - inode->i_pipe = NULL; 763 - kill = 1; 764 - } 765 - spin_unlock(&inode->i_lock); 766 746 __pipe_unlock(pipe); 767 747 768 - if (kill) 769 - free_pipe_info(pipe); 770 - 748 + put_pipe_info(inode, pipe); 771 749 return 0; 772 750 } 773 751 ··· 1020 1014 { 1021 1015 struct pipe_inode_info *pipe; 1022 1016 bool is_pipe = inode->i_sb->s_magic == PIPEFS_MAGIC; 1023 - int kill = 0; 1024 1017 int ret; 1025 1018 1026 1019 filp->f_version = 0; ··· 1135 1130 goto err; 1136 1131 1137 1132 err: 1138 - spin_lock(&inode->i_lock); 1139 - if (!--pipe->files) { 1140 - inode->i_pipe = NULL; 1141 - kill = 1; 1142 - } 1143 - spin_unlock(&inode->i_lock); 1144 1133 __pipe_unlock(pipe); 1145 - if (kill) 1146 - free_pipe_info(pipe); 1134 + 1135 + put_pipe_info(inode, pipe); 1147 1136 return ret; 1148 1137 } 1149 1138