Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

dm snapshot: flush merged data before committing metadata

If the origin device has a volatile write-back cache and the following
events occur:

1: After the merge of one set of exceptions finishes,
merge_callback() is invoked.
2: The metadata in the COW device is updated to record the merge
completion, and this update to the COW device is flushed cleanly.
3: The system crashes before the origin device's cache, which holds
the recently merged data, has been flushed.

During the next cycle, when we read the metadata from the COW device,
we will skip the exceptions whose merge was completed in step (1),
even though their data never reached stable storage on the origin
device. This leads to data loss/corruption.
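
The failure sequence above can be sketched with a toy user-space model.
This is not kernel code: struct toy_dev, the toy_* helpers and
merge_then_crash_loses_data() are hypothetical names invented for
illustration, and a single int stands in for a merged chunk. The model
assumes that a crash discards whatever sits only in the volatile cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (assumption, not kernel code): a device with a volatile
 * write-back cache holding a single value. */
struct toy_dev {
	int durable;	/* contents that survive a crash */
	int cached;	/* most recently written value */
	bool dirty;	/* true while cached has not reached durable */
};

/* A write lands in the volatile cache only. */
static void toy_write(struct toy_dev *d, int v)
{
	d->cached = v;
	d->dirty = true;
}

/* A flush makes the cached value durable. */
static void toy_flush(struct toy_dev *d)
{
	if (d->dirty) {
		d->durable = d->cached;
		d->dirty = false;
	}
}

/* A crash discards anything that was never flushed. */
static void toy_crash(struct toy_dev *d)
{
	d->cached = d->durable;
	d->dirty = false;
}

/*
 * Replay steps 1-3 for one chunk. Returns true when the chunk is lost:
 * the durable metadata says "merged" but the origin never got the data.
 */
static bool merge_then_crash_loses_data(bool flush_origin_first)
{
	struct toy_dev origin = { .durable = 0, .cached = 0, .dirty = false };
	const int snapshot_data = 42;
	bool merged_recorded;

	toy_write(&origin, snapshot_data);	/* step 1: merge I/O completes */
	if (flush_origin_first)
		toy_flush(&origin);		/* the ordering this commit adds */
	merged_recorded = true;			/* step 2: metadata flushed cleanly */
	toy_crash(&origin);			/* step 3: crash */

	/* On the next cycle the exception is skipped because it is
	 * recorded as merged, so the origin's durable copy is all we have. */
	return merged_recorded && origin.durable != snapshot_data;
}
```

In this model, merge_then_crash_loses_data(false) reports a lost chunk,
while merge_then_crash_loses_data(true) does not.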

To address this, flush the origin device after the merge I/O completes
and before updating the metadata.
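
The same ordering rule has a familiar user-space analogue, sketched
below with POSIX file descriptors. This is only an analogy under stated
assumptions: the kernel change issues a REQ_PREFLUSH bio rather than
calling fsync(), and write_all_at() and merge_then_commit() are
hypothetical helpers, with two ordinary files standing in for the
origin and COW devices.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer starting at offset 0.
 * Returns 0 on success, -1 on error. */
static int write_all_at(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	off_t off = 0;

	while (len > 0) {
		ssize_t n = pwrite(fd, p, len, off);

		if (n <= 0)
			return -1;
		p += n;
		off += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical analogue of the fixed merge path: make the merged data
 * durable on the "origin" before the "COW" metadata that declares the
 * merge complete is written and made durable.
 */
static int merge_then_commit(int origin_fd, int cow_fd,
			     const void *data, size_t data_len,
			     const void *meta, size_t meta_len)
{
	if (write_all_at(origin_fd, data, data_len) < 0)
		return -1;
	if (fsync(origin_fd) < 0)	/* flush merged data first */
		return -1;
	if (write_all_at(cow_fd, meta, meta_len) < 0)
		return -1;
	return fsync(cow_fd);		/* then commit the metadata */
}
```

If the first fsync() fails, the metadata is never written, which
mirrors the patch shutting down the merge when the flush fails.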

Cc: stable@vger.kernel.org
Signed-off-by: Akilesh Kailash <akailash@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

Authored by Akilesh Kailash and committed by Mike Snitzer
fcc42338 d68b2958

+24
drivers/md/dm-snap.c
···
 	 * for them to be committed.
 	 */
 	struct bio_list bios_queued_during_merge;
+
+	/*
+	 * Flush data after merge.
+	 */
+	struct bio flush_bio;
 };

 /*
···

 static void error_bios(struct bio *bio);

+static int flush_data(struct dm_snapshot *s)
+{
+	struct bio *flush_bio = &s->flush_bio;
+
+	bio_reset(flush_bio);
+	bio_set_dev(flush_bio, s->origin->bdev);
+	flush_bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
+
+	return submit_bio_wait(flush_bio);
+}
+
 static void merge_callback(int read_err, unsigned long write_err, void *context)
 {
 	struct dm_snapshot *s = context;
···
 			DMERR("Read error: shutting down merge.");
 		else
 			DMERR("Write error: shutting down merge.");
+		goto shut;
+	}
+
+	if (flush_data(s) < 0) {
+		DMERR("Flush after merge failed: shutting down merge");
 		goto shut;
 	}

···
 	s->first_merging_chunk = 0;
 	s->num_merging_chunks = 0;
 	bio_list_init(&s->bios_queued_during_merge);
+	bio_init(&s->flush_bio, NULL, 0);

 	/* Allocate hash table for COW data */
 	if (init_hash_tables(s)) {
···
 	mempool_exit(&s->pending_pool);

 	dm_exception_store_destroy(s->store);
+
+	bio_uninit(&s->flush_bio);

 	dm_put_device(ti, s->cow);
