Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

md: fix for divide error in status_resync

Stopping an external-metadata array during resync/recovery causes
retries, a loop of interrupting and restarting reconstruction, until the
stop request arrives at a moment when the array can be stopped
completely. During these retries curr_mark_cnt can be small, especially
on HDDs, so the result of the subtraction can be less than 0. However,
it is cast to an unsigned integer without checking. As a result, the
progress bar in /proc/mdstat behaves strangely while the array is
stopping: it jumps between 0% and 99%.
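
The wraparound described above can be reproduced outside the kernel. The following is a minimal user-space sketch, not the kernel code: blocks_done_unchecked() is a hypothetical name, and uint64_t stands in for unsigned long on the assumption of a 64-bit build.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the shape of the pre-fix expression:
 *   db = (curr_mark_cnt - recovery_active) - resync_mark_cnt;
 * Everything is evaluated in unsigned arithmetic, so a result that is
 * mathematically negative wraps around to a huge positive value.
 */
static uint64_t blocks_done_unchecked(uint64_t curr_mark_cnt,
				      int recovery_active,
				      uint64_t resync_mark_cnt)
{
	return (curr_mark_cnt - (uint64_t)recovery_active) - resync_mark_cnt;
}
```

With curr_mark_cnt = 8, recovery_active = 0 and resync_mark_cnt = 16, the mathematical result is -8, but the function returns 0xFFFFFFFFFFFFFFF8, which would explain a speed and progress estimate that swings wildly.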

The real problem appears after commit 72deb455b5ec ("block: remove
CONFIG_LBDAF"). The sector_div() macro was changed so that the divisor
is now cast to uint32. For db = -8 the divisor (db/32+1) becomes 0.
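
The arithmetic behind the zero divisor can be checked by hand. This is a small user-space sketch under the assumption that db is 64 bits wide; truncated_divisor() is a hypothetical name that only models the 32-bit cast, it is not the sector_div() macro itself.

```c
#include <stdint.h>

/*
 * Models the divisor after the cast to 32 bits: db/32+1 is computed in
 * 64 bits, then only the low 32 bits are kept.
 *
 * For db = (uint64_t)-8 = 0xFFFFFFFFFFFFFFF8:
 *   db / 32     = 0x07FFFFFFFFFFFFFF
 *   db / 32 + 1 = 0x0800000000000000
 *   low 32 bits = 0x00000000  -> division by zero
 */
static uint32_t truncated_divisor(uint64_t db)
{
	return (uint32_t)(db / 32 + 1);
}
```

For an ordinary value such as db = 64 the divisor is 3, so the '+1' protects against db = 0 but not against a wrapped-around db.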

Check that the db value can actually be computed, and replace the
macro with the div64_u64() inline function.
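
The two parts of the fix can be sketched in user space as follows. This is an illustration under stated assumptions, not the patched kernel function: div_u64_by_u64() is a hypothetical stand-in for the kernel's div64_u64(), the other function names are invented for the example, and recovery_active is assumed to be non-negative.

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for div64_u64(): full 64-bit divisor. */
static uint64_t div_u64_by_u64(uint64_t dividend, uint64_t divisor)
{
	return dividend / divisor;
}

/* Only subtract when the result cannot go below zero; otherwise db is 0. */
static uint64_t blocks_done_checked(uint64_t curr_mark_cnt,
				    int recovery_active,
				    uint64_t resync_mark_cnt)
{
	uint64_t pending = (uint64_t)recovery_active + resync_mark_cnt;

	if (curr_mark_cnt >= pending)
		return curr_mark_cnt - pending;
	return 0;
}

/* Remaining-time estimate: divide by db/32+1, multiply by dt, shift by 5. */
static uint64_t remaining_time(uint64_t remaining_sectors, uint64_t db,
			       uint64_t dt)
{
	uint64_t rt = div_u64_by_u64(remaining_sectors, db / 32 + 1);

	rt *= dt;
	rt >>= 5;
	return rt;
}
```

With the guard in place, the earlier problem case (curr_mark_cnt = 8, resync_mark_cnt = 16) yields db = 0, so the divisor is 1 and the division is well defined.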

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>

Authored by Mariusz Tkaczyk and committed by Song Liu
9642fa73 45691804

+22 -14
drivers/md/md.c
@@ -7607,9 +7607,9 @@
 static int status_resync(struct seq_file *seq, struct mddev *mddev)
 {
 	sector_t max_sectors, resync, res;
-	unsigned long dt, db;
-	sector_t rt;
-	int scale;
+	unsigned long dt, db = 0;
+	sector_t rt, curr_mark_cnt, resync_mark_cnt;
+	int scale, recovery_active;
 	unsigned int per_milli;
 
 	if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ||
@@ -7698,22 +7698,30 @@
 	 * db: blocks written from mark until now
 	 * rt: remaining time
 	 *
-	 * rt is a sector_t, so could be 32bit or 64bit.
-	 * So we divide before multiply in case it is 32bit and close
-	 * to the limit.
-	 * We scale the divisor (db) by 32 to avoid losing precision
-	 * near the end of resync when the number of remaining sectors
-	 * is close to 'db'.
-	 * We then divide rt by 32 after multiplying by db to compensate.
-	 * The '+1' avoids division by zero if db is very small.
+	 * rt is a sector_t, which is always 64bit now. We are keeping
+	 * the original algorithm, but it is not really necessary.
+	 *
+	 * Original algorithm:
+	 *   So we divide before multiply in case it is 32bit and close
+	 *   to the limit.
+	 *   We scale the divisor (db) by 32 to avoid losing precision
+	 *   near the end of resync when the number of remaining sectors
+	 *   is close to 'db'.
+	 *   We then divide rt by 32 after multiplying by db to compensate.
+	 *   The '+1' avoids division by zero if db is very small.
 	 */
 	dt = ((jiffies - mddev->resync_mark) / HZ);
 	if (!dt) dt++;
-	db = (mddev->curr_mark_cnt - atomic_read(&mddev->recovery_active))
-		- mddev->resync_mark_cnt;
+
+	curr_mark_cnt = mddev->curr_mark_cnt;
+	recovery_active = atomic_read(&mddev->recovery_active);
+	resync_mark_cnt = mddev->resync_mark_cnt;
+
+	if (curr_mark_cnt >= (recovery_active + resync_mark_cnt))
+		db = curr_mark_cnt - (recovery_active + resync_mark_cnt);
 
 	rt = max_sectors - resync;    /* number of remaining sectors */
-	sector_div(rt, db/32+1);
+	rt = div64_u64(rt, db/32+1);
 	rt *= dt;
 	rt >>= 5;
 