Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

loop: fix the the direct I/O support check when used on top of block devices

__loop_update_dio only checks the alignment requirement for block backed
file systems, but misses them for the case where the loop device is
created directly on top of another block device. Due to this creating
a loop device with default option plus the direct I/O flag on a > 512 byte
sector size file system will lead to incorrect I/O being submitted to the
lower block device and a lot of error from the lock layer. This can
be seen with xfstests generic/563.

Fix the code in __loop_update_dio by factoring the alignment check into
a helper, and calling that also for the struct block_device of a block
device inode.

Also remove the TODO comment talking about dynamically switching between
buffered and direct I/O, which is a would be a recipe for horrible
performance and occasional data loss.

Fixes: 2e5ab5f379f9 ("block: loop: prepare for supporing direct IO")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20240117175901.871796-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

authored by

Christoph Hellwig and committed by
Jens Axboe
baa7d536 49e60333

+25 -27
+25 -27
drivers/block/loop.c
··· 165 165 return get_size(lo->lo_offset, lo->lo_sizelimit, file); 166 166 } 167 167 168 + /* 169 + * We support direct I/O only if lo_offset is aligned with the logical I/O size 170 + * of backing device, and the logical block size of loop is bigger than that of 171 + * the backing device. 172 + */ 173 + static bool lo_bdev_can_use_dio(struct loop_device *lo, 174 + struct block_device *backing_bdev) 175 + { 176 + unsigned short sb_bsize = bdev_logical_block_size(backing_bdev); 177 + 178 + if (queue_logical_block_size(lo->lo_queue) < sb_bsize) 179 + return false; 180 + if (lo->lo_offset & (sb_bsize - 1)) 181 + return false; 182 + return true; 183 + } 184 + 168 185 static void __loop_update_dio(struct loop_device *lo, bool dio) 169 186 { 170 187 struct file *file = lo->lo_backing_file; 171 - struct address_space *mapping = file->f_mapping; 172 - struct inode *inode = mapping->host; 173 - unsigned short sb_bsize = 0; 174 - unsigned dio_align = 0; 188 + struct inode *inode = file->f_mapping->host; 189 + struct block_device *backing_bdev = NULL; 175 190 bool use_dio; 176 191 177 - if (inode->i_sb->s_bdev) { 178 - sb_bsize = bdev_logical_block_size(inode->i_sb->s_bdev); 179 - dio_align = sb_bsize - 1; 180 - } 192 + if (S_ISBLK(inode->i_mode)) 193 + backing_bdev = I_BDEV(inode); 194 + else if (inode->i_sb->s_bdev) 195 + backing_bdev = inode->i_sb->s_bdev; 181 196 182 - /* 183 - * We support direct I/O only if lo_offset is aligned with the 184 - * logical I/O size of backing device, and the logical block 185 - * size of loop is bigger than the backing device's. 186 - * 187 - * TODO: the above condition may be loosed in the future, and 188 - * direct I/O may be switched runtime at that time because most 189 - * of requests in sane applications should be PAGE_SIZE aligned 190 - */ 191 - if (dio) { 192 - if (queue_logical_block_size(lo->lo_queue) >= sb_bsize && 193 - !(lo->lo_offset & dio_align) && 194 - (file->f_mode & FMODE_CAN_ODIRECT)) 195 - use_dio = true; 196 - else 197 - use_dio = false; 198 - } else { 199 - use_dio = false; 200 - } 197 + use_dio = dio && (file->f_mode & FMODE_CAN_ODIRECT) && 198 + (!backing_bdev || lo_bdev_can_use_dio(lo, backing_bdev)); 201 199 202 200 if (lo->use_dio == use_dio) 203 201 return;