Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

dm rq: do not update rq partially in each ending bio

We don't need to update the original dm request partially when ending
each cloned bio: just update the original dm request once, when the
whole cloned request is finished. Partial completion is still fully
supported, because a new 'completed' counter accounts for incremental
progress as the clone bios complete.

A partial request update can be a bit expensive, so we should avoid it,
especially because it runs in softirq context.

Avoiding all the partial request updates fixes both the hard and soft
lockups that were easily reproduced while running Laurence's test[1]
on IB/SRP.

BTW, after d4acf3650c7c ("block: Make blk_mq_delay_kick_requeue_list()
rerun the queue at a quiet time"), the test needs to be made more
aggressive to reproduce the lockup:

1) run hammer_write.sh 32 or 64 concurrently.
2) write 8M each time

[1] https://marc.info/?l=linux-block&m=150220185510245&w=2

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

authored by Ming Lei and committed by Mike Snitzer
dc6364b5 d5c27f3f

+8 -11 total
drivers/md/dm-rq.c (+7 -11)
···
	struct dm_rq_clone_bio_info *info =
		container_of(clone, struct dm_rq_clone_bio_info, clone);
	struct dm_rq_target_io *tio = info->tio;
-	struct bio *bio = info->orig;
	unsigned int nr_bytes = info->orig->bi_iter.bi_size;
	blk_status_t error = clone->bi_status;
+	bool is_last = !clone->bi_next;

	bio_put(clone);
···
		 * when the request is completed.
		 */
		tio->error = error;
-		return;
+		goto exit;
	}

	/*
	 * I/O for the bio successfully completed.
	 * Notice the data completion to the upper layer.
	 */
-
-	/*
-	 * bios are processed from the head of the list.
-	 * So the completing bio should always be rq->bio.
-	 * If it's not, something wrong is happening.
-	 */
-	if (tio->orig->bio != bio)
-		DMERR("bio completion is going in the middle of the request");
+	tio->completed += nr_bytes;

	/*
	 * Update the original request.
	 * Do not use blk_end_request() here, because it may complete
	 * the original request before the clone, and break the ordering.
	 */
-	blk_update_request(tio->orig, BLK_STS_OK, nr_bytes);
+	if (is_last)
+ exit:
+		blk_update_request(tio->orig, BLK_STS_OK, tio->completed);
}

static struct dm_rq_target_io *tio_from_request(struct request *rq)
···
	tio->clone = NULL;
	tio->orig = rq;
	tio->error = 0;
+	tio->completed = 0;
	/*
	 * Avoid initializing info for blk-mq; it passes
	 * target-specific data through info.ptr
drivers/md/dm-rq.h (+1)
···
	struct dm_stats_aux stats_aux;
	unsigned long duration_jiffies;
	unsigned n_sectors;
+	unsigned completed;
};

/*