Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

block: relax when to modify the timeout timer

Since we are now, by default, applying timer slack to expiry times,
the logic for when to modify a timer in the block code is suboptimal.
The block layer keeps a forward rolling timer per queue for all
requests, and modifies this timer if a request has a shorter timeout
than what the current expiry time is. However, this breaks down
once slack is applied to our rounded timer values. Each new
request then ends up modifying the timer, since its expiry is
still slightly earlier than the timer's expiry plus slack.

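Fix this by allowing a tolerance of HZ / 2 (half a second): only
modify the timer if the new expiry is at least that much earlier
than the current one. The timeout handling doesn't need to be very
precise, and this drastically cuts down the number of timer
modifications we have to make.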

Signed-off-by: Jens Axboe <axboe@fb.com>

+13 -2
block/blk-timeout.c
@@ -199,8 +199,19 @@
 	expiry = round_jiffies_up(req->deadline);
 
 	if (!timer_pending(&q->timeout) ||
-	    time_before(expiry, q->timeout.expires))
-		mod_timer(&q->timeout, expiry);
+	    time_before(expiry, q->timeout.expires)) {
+		unsigned long diff = q->timeout.expires - expiry;
+
+		/*
+		 * Due to added timer slack to group timers, the timer
+		 * will often be a little in front of what we asked for.
+		 * So apply some tolerance here too, otherwise we keep
+		 * modifying the timer because expires for value X
+		 * will be X + something.
+		 */
+		if (diff >= HZ / 2)
+			mod_timer(&q->timeout, expiry);
+	}
 
 }
 
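
The effect of the tolerance can be illustrated with a small user-space
sketch. This is not kernel code: `struct sim_timer`, `before()`,
`should_mod_old()`, `should_mod_new()` and `count_mods()` are hypothetical
names made up for illustration, `HZ` is assumed to be 1000, and slack is
modeled as a fixed number of jiffies added on every modification, which is
a simplification of what the real timer code does.

```c
#include <stdbool.h>

/* Assumed tick rate for this sketch; the kernel's HZ is a build-time option. */
#define HZ 1000UL

/* Hypothetical stand-in for the per-queue timeout timer. */
struct sim_timer {
	bool pending;
	unsigned long expires;
};

/* Wrap-safe "a is earlier than b", in the spirit of time_before(). */
bool before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Rule before the patch: modify whenever the new expiry is earlier. */
bool should_mod_old(const struct sim_timer *t, unsigned long expiry)
{
	return !t->pending || before(expiry, t->expires);
}

/* Rule after the patch: also require a gap of at least HZ / 2. */
bool should_mod_new(const struct sim_timer *t, unsigned long expiry)
{
	if (!t->pending || before(expiry, t->expires)) {
		/* Unsigned subtraction, so a stale expires wraps to a large gap. */
		unsigned long diff = t->expires - expiry;

		return diff >= HZ / 2;
	}
	return false;
}

/*
 * Submit nr_requests requests that all round to the same expiry and count
 * how often the timer is modified. Each modification lands "slack" jiffies
 * later than requested (the simplified slack model assumed above).
 */
unsigned int count_mods(bool tolerant, unsigned long expiry,
			unsigned long slack, unsigned int nr_requests)
{
	struct sim_timer t = { .pending = false, .expires = 0 };
	unsigned int mods = 0;
	unsigned int i;

	for (i = 0; i < nr_requests; i++) {
		bool mod = tolerant ? should_mod_new(&t, expiry)
				    : should_mod_old(&t, expiry);

		if (mod) {
			t.pending = true;
			t.expires = expiry + slack;
			mods++;
		}
	}
	return mods;
}
```

Under these assumptions, `count_mods(false, 30000, 10, 100)` modifies the
timer for every one of the 100 requests, while `count_mods(true, 30000, 10, 100)`
modifies it only once, when the timer is first armed. A request whose expiry
is at least `HZ / 2` earlier than the armed timer still triggers a
modification under the new rule.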