Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

md: avoid races when stopping resync.

There has been a race in raid10 and raid1 for a long time
which has only recently started showing up due to a scheduler change.

When a sync_read request finishes, as soon as reschedule_retry
is called, another thread can mark the resync request as having
completed, so md_do_sync can finish, ->stop can be called, and
->conf can be freed. So using conf after reschedule_retry is not
safe.
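
The ordering problem above can be sketched in userspace C. This is only
an illustration under stated assumptions: struct sim_conf and every
sim_* function are hypothetical stand-ins, not md code, and the "other
thread" is modelled deterministically by having the hand-off mark conf
as freed at once (the worst case of the race).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for conf; not a real md type. */
struct sim_conf {
	int  nr_pending;   /* plays the role of the rdev pending count */
	bool freed;        /* set once the "other thread" tears conf down */
	int  late_access;  /* counts accesses made after teardown */
};

/* Models rdev_dec_pending(): the only place conf is touched. */
void sim_dec_pending(struct sim_conf *conf)
{
	if (conf->freed)
		conf->late_access++;   /* in the kernel: a use-after-free */
	else
		conf->nr_pending--;
}

/* Models reschedule_retry(): once it returns, conf may be gone. */
void sim_reschedule_retry(struct sim_conf *conf)
{
	assert(!conf->freed);      /* the hand-off happens exactly once */
	conf->freed = true;        /* worst case: teardown wins the race */
}

/* Old ordering: hand off first, touch conf afterwards. */
int end_sync_read_old(struct sim_conf *conf)
{
	sim_reschedule_retry(conf);
	sim_dec_pending(conf);
	return conf->late_access;
}

/* New ordering: finish with conf, then hand off as the final step. */
int end_sync_read_new(struct sim_conf *conf)
{
	sim_dec_pending(conf);
	sim_reschedule_retry(conf);
	return conf->late_access;
}
```

With a fresh sim_conf, the old ordering reports one late access and the
new ordering reports none, which is the property the raid10.c change
relies on.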

Similarly, when finishing a sync_write, calling md_done_sync must be
the last thing we do, as it allows a chain of events which will free
conf and other data structures.
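
A minimal userspace sketch of the "notify last" pattern follows. The
names sim_bio, sim_put_buf, sim_done_sync and finish_sync_write are
hypothetical and only mirror the shape of the change: copy the value
that the notification needs into a local, release the buffer, and make
the completion call the final statement.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned long long sim_sector_t;   /* stand-in for sector_t */

/* Hypothetical stand-in for r1_bio / r10_bio. */
struct sim_bio {
	sim_sector_t sectors;
};

static sim_sector_t reported;   /* total sectors reported as synced */

/* Models md_done_sync(): after this returns, nothing shared is safe. */
void sim_done_sync(sim_sector_t s)
{
	reported += s;
}

/* Models put_buf(): the buffer is gone once this returns. */
void sim_put_buf(struct sim_bio *bio)
{
	free(bio);
}

void finish_sync_write(struct sim_bio *bio)
{
	sim_sector_t s = bio->sectors;  /* copy before the buffer is released */

	sim_put_buf(bio);               /* bio must not be dereferenced again */
	sim_done_sync(s);               /* completion notification comes last */
}
```

The local copy is what lets put_buf move ahead of the notification
without reading from a released buffer.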

The first of these requires action in raid10.c.
The second requires action in raid1.c and raid10.c.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

NeilBrown 73d5c38a 78200d45

2 files changed, 6 insertions(+), 4 deletions(-)

drivers/md/raid1.c (+2 -1)
···
 	update_head_pos(mirror, r1_bio);
 
 	if (atomic_dec_and_test(&r1_bio->remaining)) {
-		md_done_sync(mddev, r1_bio->sectors, uptodate);
+		sector_t s = r1_bio->sectors;
 		put_buf(r1_bio);
+		md_done_sync(mddev, s, uptodate);
 	}
 }
 
drivers/md/raid10.c (+4 -3)
···
 	/* for reconstruct, we always reschedule after a read.
 	 * for resync, only after all reads
 	 */
+	rdev_dec_pending(conf->mirrors[d].rdev, conf->mddev);
 	if (test_bit(R10BIO_IsRecover, &r10_bio->state) ||
 	    atomic_dec_and_test(&r10_bio->remaining)) {
 		/* we have read all the blocks,
···
 		 */
 		reschedule_retry(r10_bio);
 	}
-	rdev_dec_pending(conf->mirrors[d].rdev, conf->mddev);
 }
 
 static void end_sync_write(struct bio *bio, int error)
···
 
 	update_head_pos(i, r10_bio);
 
+	rdev_dec_pending(conf->mirrors[d].rdev, mddev);
 	while (atomic_dec_and_test(&r10_bio->remaining)) {
 		if (r10_bio->master_bio == NULL) {
 			/* the primary of several recovery bios */
-			md_done_sync(mddev, r10_bio->sectors, 1);
+			sector_t s = r10_bio->sectors;
 			put_buf(r10_bio);
+			md_done_sync(mddev, s, 1);
 			break;
 		} else {
 			r10bio_t *r10_bio2 = (r10bio_t *)r10_bio->master_bio;
···
 			r10_bio = r10_bio2;
 		}
 	}
-	rdev_dec_pending(conf->mirrors[d].rdev, mddev);
 }
 
 /*