Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

blockdev: Avoid two active bdev inodes for one device

When blkdev_open() races with device removal and creation, an unhashed
bdev inode can get associated with the newly created gendisk, like this:

CPU0                                    CPU1
blkdev_open()
  bdev = bd_acquire()
                                        del_gendisk()
                                          bdev_unhash_inode(bdev);
                                        remove device
                                        create new device with the same number
  __blkdev_get()
    disk = get_gendisk()
      - gets reference to gendisk of the new device

Now another blkdev_open() will not find the original 'bdev' because it got
unhashed, so it creates a new one and associates it with the same 'disk'.
At that point problems start, as we have two independent page caches for
one device.
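The interleaving above can be replayed deterministically in userspace. The following is a minimal single-threaded sketch, not kernel code: struct model_disk, struct model_bdev, model_bd_acquire(), model_attach() and model_replay_race() are hypothetical stand-ins invented for illustration, and the device number 42 is arbitrary. It only models "lookup skips unhashed inodes" and "association is unchecked", which is enough to show two inodes ending up on one disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct model_disk {
	int dev;		/* device number */
	int bdev_users;		/* number of bdev inodes attached to this disk */
};

struct model_bdev {
	int dev;
	bool hashed;		/* still visible to lookups? */
	struct model_disk *disk;
};

#define MAX_BDEVS 8
static struct model_bdev bdev_table[MAX_BDEVS];
static int bdev_count;

/* Modelled on bd_acquire(): return the visible inode for dev or create one. */
static struct model_bdev *model_bd_acquire(int dev)
{
	for (int i = 0; i < bdev_count; i++)
		if (bdev_table[i].hashed && bdev_table[i].dev == dev)
			return &bdev_table[i];
	if (bdev_count == MAX_BDEVS)
		return NULL;
	bdev_table[bdev_count] = (struct model_bdev){ .dev = dev, .hashed = true };
	return &bdev_table[bdev_count++];
}

/* Unchecked association, as in the code before this patch. */
static void model_attach(struct model_bdev *bdev, struct model_disk *disk)
{
	bdev->disk = disk;
	disk->bdev_users++;
}

/*
 * Replays the race step by step and returns how many bdev inodes end up
 * attached to the newly created disk.
 */
static int model_replay_race(void)
{
	struct model_disk new_disk = { .dev = 42 };

	bdev_count = 0;
	struct model_bdev *first = model_bd_acquire(42);  /* CPU0: bd_acquire() */
	assert(first != NULL);
	first->hashed = false;                            /* CPU1: del_gendisk() unhashes it */
	/* CPU1: device removed, new device (new_disk) created with the same number */
	model_attach(first, &new_disk);                   /* CPU0: __blkdev_get() */

	struct model_bdev *second = model_bd_acquire(42); /* another blkdev_open() */
	assert(second != NULL && second != first);        /* lookup missed the stale inode */
	model_attach(second, &new_disk);

	return new_disk.bdev_users;
}
```

In this model, model_replay_race() reports two inodes attached to the one disk, which corresponds to the two independent page caches described above.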

Fix the problem by verifying that the bdev inode didn't get unhashed
before we acquired gendisk reference. That way we make sure gendisk can
get associated only with visible bdev inodes.
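The check-after-acquire pattern described here can be sketched in userspace as follows. This is an illustrative model under stated assumptions, not the kernel implementation: struct model_disk, struct model_bdev, model_get_disk(), model_put_disk() and model_bdev_get_disk() are hypothetical names, and a plain integer stands in for the gendisk reference count. The point it shows is the ordering: take the disk reference first, then test whether the inode went stale, and drop the reference again on failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct model_disk {
	int refcount;
};

struct model_bdev {
	bool unhashed;		/* set when the inode was removed from the hash */
};

static struct model_disk *model_get_disk(struct model_disk *disk)
{
	disk->refcount++;
	return disk;
}

static void model_put_disk(struct model_disk *disk)
{
	assert(disk->refcount > 0);
	disk->refcount--;
}

/*
 * Acquire a reference to the disk currently registered for the device
 * (current_disk, NULL if none), then verify that bdev is still visible.
 * A stale bdev must not be associated with the disk, so the reference is
 * dropped and the lookup fails.
 */
static struct model_disk *model_bdev_get_disk(struct model_bdev *bdev,
					      struct model_disk *current_disk)
{
	struct model_disk *disk;

	if (!current_disk)
		return NULL;
	disk = model_get_disk(current_disk);

	if (bdev->unhashed) {
		model_put_disk(disk);
		return NULL;
	}
	return disk;
}
```

A caller in this model treats a NULL return the same way for both failure cases (no disk, or stale inode), which mirrors how the patch lets existing callers keep their single `if (!disk)` error path.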

Tested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Authored by Jan Kara and committed by Jens Axboe
560e7cb2 56c0908c

+23 -2
fs/block_dev.c
@@ -1058,6 +1058,27 @@
 	return 0;
 }
 
+static struct gendisk *bdev_get_gendisk(struct block_device *bdev, int *partno)
+{
+	struct gendisk *disk = get_gendisk(bdev->bd_dev, partno);
+
+	if (!disk)
+		return NULL;
+	/*
+	 * Now that we hold gendisk reference we make sure bdev we looked up is
+	 * not stale. If it is, it means device got removed and created before
+	 * we looked up gendisk and we fail open in such case. Associating
+	 * unhashed bdev with newly created gendisk could lead to two bdevs
+	 * (and thus two independent caches) being associated with one device
+	 * which is bad.
+	 */
+	if (inode_unhashed(bdev->bd_inode)) {
+		put_disk_and_module(disk);
+		return NULL;
+	}
+	return disk;
+}
+
 /**
  * bd_start_claiming - start claiming a block device
  * @bdev: block device of interest
@@ -1094,7 +1115,7 @@
 	 * @bdev might not have been initialized properly yet, look up
 	 * and grab the outer block device the hard way.
 	 */
-	disk = get_gendisk(bdev->bd_dev, &partno);
+	disk = bdev_get_gendisk(bdev, &partno);
 	if (!disk)
 		return ERR_PTR(-ENXIO);
 
@@ -1429,7 +1450,7 @@
  restart:
 
 	ret = -ENXIO;
-	disk = get_gendisk(bdev->bd_dev, &partno);
+	disk = bdev_get_gendisk(bdev, &partno);
 	if (!disk)
 		goto out;
 