Revert "ext4: make __ext4_get_inode_loc plug"

This reverts commit b03755ad6f33b7b8cd7312a3596a2dbf496de6e7.

This is sad, and done for all the wrong reasons. Because that commit is
good, and does exactly what it says: avoids a lot of small disk requests
for the inode table read-ahead.

However, it turns out that it causes an entirely unrelated problem: the
getrandom() system call was introduced back in 2014 by commit
c6e9d6f38894 ("random: introduce getrandom(2) system call"), and people
use it as a convenient source of good random numbers.

But part of the current semantics for getrandom() is that it waits for
the entropy pool to fill at least partially (unlike /dev/urandom). And
at least ArchLinux apparently has a systemd that uses getrandom() at
boot time, and the improvements in IO patterns mean that existing
installations suddenly start hanging, waiting for entropy that will
never happen.
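
To make the blocking behavior concrete, here is a minimal user-space
sketch (not part of this patch) that probes whether getrandom() would
block. It assumes glibc 2.25 or later for <sys/random.h>; the helper
name try_getrandom_nonblock() is made up for illustration. With
GRND_NONBLOCK, getrandom() fails with EAGAIN instead of sleeping when
the pool has not been initialized yet:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns  1 if the entropy pool is initialized and buf was filled,
 *          0 if a plain getrandom() call would have blocked (EAGAIN),
 *         -1 on any other error (for example ENOSYS on old kernels).
 */
static int try_getrandom_nonblock(void *buf, size_t len)
{
	ssize_t n;

	/* Requests of up to 256 bytes are not returned short once seeded. */
	assert(buf != NULL && len > 0 && len <= 256);

	n = getrandom(buf, len, GRND_NONBLOCK);
	if (n == (ssize_t)len)
		return 1;
	if (n < 0 && errno == EAGAIN)
		return 0;
	return -1;
}
```

A caller that gets 0 back at early boot is in exactly the situation
described above: the same call without GRND_NONBLOCK would sleep until
enough entropy has been collected.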

It seems to be an unlucky combination of not _quite_ enough entropy,
together with a particular systemd version and configuration. Lennart
says that the systemd-random-seed process (which is what does this early
access) is supposed to not block any other boot activity, but sadly that
doesn't actually seem to be the case (possibly due to bogus dependencies on
cryptsetup for encrypted swapspace).

The correct fix is to fix getrandom() to not block when it's not
appropriate, but that fix is going to take a lot more discussion. Do we
just make it act like /dev/urandom by default, and add a new flag for
"wait for entropy"? Do we add a boot-time option? Or do we just limit
the amount of time it will wait for entropy?
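
As a rough illustration of the last option, this is what "limit the
wait" could look like if done by a caller in user space. It is only a
sketch under stated assumptions, not the kernel-side fix being
discussed: bounded_random(), the 10ms poll interval and the max_tries
parameter are all hypothetical, and the fallback simply reads
/dev/urandom, which does not wait for the pool to be initialized:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Poll getrandom(GRND_NONBLOCK) up to max_tries times, sleeping 10ms
 * after each EAGAIN, then fall back to reading /dev/urandom.
 * Returns 0 if buf was filled, -1 on failure.
 */
static int bounded_random(void *buf, size_t len, int max_tries)
{
	const struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
	ssize_t n;
	int fd, i;

	assert(buf != NULL && len > 0 && len <= 256);

	for (i = 0; i < max_tries; i++) {
		n = getrandom(buf, len, GRND_NONBLOCK);
		if (n == (ssize_t)len)
			return 0;
		if (n < 0 && errno != EAGAIN)
			break;	/* e.g. ENOSYS: go straight to the fallback */
		nanosleep(&delay, NULL);
	}

	/* Bounded wait expired (or getrandom() is unavailable). */
	fd = open("/dev/urandom", O_RDONLY);
	if (fd < 0)
		return -1;
	n = read(fd, buf, len);
	close(fd);
	return n == (ssize_t)len ? 0 : -1;
}
```

The obvious cost is that the fallback path may hand out bytes from a
pool that is not yet properly seeded, which is the trade-off any of
the options above has to weigh.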

So in the meantime, we do the revert to give us time to discuss the
eventual fix for the fundamental problem, at which point we can re-apply
the ext4 inode table access optimization.

Reported-by: Ahmed S. Darwish <darwish.07@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Alexander E. Patrakov <patrakov@gmail.com>
Cc: Lennart Poettering <mzxreary@0pointer.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

 fs/ext4/inode.c | 3 ---
 1 file changed, 3 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ ... @@
 	struct buffer_head *bh;
 	struct super_block *sb = inode->i_sb;
 	ext4_fsblk_t block;
-	struct blk_plug plug;
 	int inodes_per_block, inode_offset;
 
 	iloc->bh = NULL;
@@ ... @@
 		 * If we need to do any I/O, try to pre-readahead extra
 		 * blocks from the inode table.
 		 */
-		blk_start_plug(&plug);
 		if (EXT4_SB(sb)->s_inode_readahead_blks) {
 			ext4_fsblk_t b, end, table;
 			unsigned num;
@@ ... @@
 		get_bh(bh);
 		bh->b_end_io = end_buffer_read_sync;
 		submit_bh(REQ_OP_READ, REQ_META | REQ_PRIO, bh);
-		blk_finish_plug(&plug);
 		wait_on_buffer(bh);
 		if (!buffer_uptodate(bh)) {
 			EXT4_ERROR_INODE_BLOCK(inode, block,