Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

block: loop: don't hold lo_ctl_mutex in lo_open

The lo_ctl_mutex is held for running all ioctl handlers, and
in some ioctl handlers, ioctl_by_bdev(BLKRRPART) is called for
rereading partitions, which requires bd_mutex.

So it is easy to cause failure because trylock(bd_mutex) may
fail inside blkdev_reread_part(), and follows the lock context:

blkid or other application:
->open()
->mutex_lock(bd_mutex)
->lo_open()
->mutex_lock(lo_ctl_mutex)

losetup(set fd ioctl):
->mutex_lock(lo_ctl_mutex)
->ioctl_by_bdev(BLKRRPART)
->trylock(bd_mutex)

This patch trys to eliminate the ABBA lock dependency by removing
lo_ctl_mutext in lo_open() with the following approach:

1) make lo_refcnt as atomic_t and avoid acquiring lo_ctl_mutex in lo_open():
- for open vs. add/del loop, no any problem because of loop_index_mutex
- freeze request queue during clr_fd, so I/O can't come until
clearing fd is completed, like the effect of holding lo_ctl_mutex
in lo_open
- both open() and release() have been serialized by bd_mutex already

2) don't hold lo_ctl_mutex for decreasing/checking lo_refcnt in
lo_release(), then lo_ctl_mutex is only required for the last release.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

authored by

Ming Lei and committed by
Jens Axboe
f8933667 b816d45f

+13 -10
+12 -9
drivers/block/loop.c
··· 831 831 * <dev>/do something like mkfs/losetup -d <dev> causing the losetup -d 832 832 * command to fail with EBUSY. 833 833 */ 834 - if (lo->lo_refcnt > 1) { 834 + if (atomic_read(&lo->lo_refcnt) > 1) { 835 835 lo->lo_flags |= LO_FLAGS_AUTOCLEAR; 836 836 mutex_unlock(&lo->lo_ctl_mutex); 837 837 return 0; ··· 839 839 840 840 if (filp == NULL) 841 841 return -EINVAL; 842 + 843 + /* freeze request queue during the transition */ 844 + blk_mq_freeze_queue(lo->lo_queue); 842 845 843 846 spin_lock_irq(&lo->lo_lock); 844 847 lo->lo_state = Lo_rundown; ··· 874 871 lo->lo_state = Lo_unbound; 875 872 /* This is safe: open() is still holding a reference. */ 876 873 module_put(THIS_MODULE); 874 + blk_mq_unfreeze_queue(lo->lo_queue); 875 + 877 876 if (lo->lo_flags & LO_FLAGS_PARTSCAN && bdev) 878 877 ioctl_by_bdev(bdev, BLKRRPART, 0); 879 878 lo->lo_flags = 0; ··· 1335 1330 goto out; 1336 1331 } 1337 1332 1338 - mutex_lock(&lo->lo_ctl_mutex); 1339 - lo->lo_refcnt++; 1340 - mutex_unlock(&lo->lo_ctl_mutex); 1333 + atomic_inc(&lo->lo_refcnt); 1341 1334 out: 1342 1335 mutex_unlock(&loop_index_mutex); 1343 1336 return err; ··· 1346 1343 struct loop_device *lo = disk->private_data; 1347 1344 int err; 1348 1345 1346 + if (atomic_dec_return(&lo->lo_refcnt)) 1347 + return; 1348 + 1349 1349 mutex_lock(&lo->lo_ctl_mutex); 1350 - 1351 - if (--lo->lo_refcnt) 1352 - goto out; 1353 - 1354 1350 if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) { 1355 1351 /* 1356 1352 * In autoclear mode, stop the loop thread ··· 1603 1601 disk->flags |= GENHD_FL_NO_PART_SCAN; 1604 1602 disk->flags |= GENHD_FL_EXT_DEVT; 1605 1603 mutex_init(&lo->lo_ctl_mutex); 1604 + atomic_set(&lo->lo_refcnt, 0); 1606 1605 lo->lo_number = i; 1607 1606 spin_lock_init(&lo->lo_lock); 1608 1607 disk->major = LOOP_MAJOR; ··· 1721 1718 mutex_unlock(&lo->lo_ctl_mutex); 1722 1719 break; 1723 1720 } 1724 - if (lo->lo_refcnt > 0) { 1721 + if (atomic_read(&lo->lo_refcnt) > 0) { 1725 1722 ret = -EBUSY; 1726 1723 mutex_unlock(&lo->lo_ctl_mutex); 1727 1724 break;
+1 -1
drivers/block/loop.h
··· 28 28 29 29 struct loop_device { 30 30 int lo_number; 31 - int lo_refcnt; 31 + atomic_t lo_refcnt; 32 32 loff_t lo_offset; 33 33 loff_t lo_sizelimit; 34 34 int lo_flags;