Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drbd: fix drbd epoch write count for ahead/behind mode

The sanity check performed on receiving a P_BARRIER_ACK expects all
write requests with a given req->epoch to have been either all
replicated or all not replicated.

Because req->epoch was assigned before calling maybe_pull_ahead(),
which may stop sending full data updates and start sending "dirty
bits" only, this expectation was not always met. The result was an
off-by-one in the sanity check, followed by a "Protocol Error".

Fix: move the call to maybe_pull_ahead() a few lines up,
and assign req->epoch only after that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
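
The ordering problem described in the message above can be illustrated with a small toy model. This is only a sketch and not DRBD code: the toy_* names, the in-flight counter used as the congestion signal, and the assumption that switching modes closes the current epoch are all hypothetical simplifications. It shows that assigning the epoch before the congestion check can leave one non-replicated write in an otherwise replicated epoch, while checking first keeps every epoch uniform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, none of it is DRBD code. */
struct toy_conn {
	unsigned int epoch_nr;	/* number of the currently open epoch */
	unsigned int in_flight;	/* replicated writes counted so far */
	unsigned int limit;	/* congestion threshold */
	bool ahead;		/* true once full replication has stopped */
};

struct toy_req {
	unsigned int epoch;	/* epoch the request was accounted to */
	bool replicated;	/* was full data sent to the peer? */
};

/* Stand-in for maybe_pull_ahead(). Assumption of this model: on
 * congestion it stops full replication and closes the current epoch. */
static void toy_maybe_pull_ahead(struct toy_conn *c)
{
	if (!c->ahead && c->in_flight >= c->limit) {
		c->ahead = true;
		c->epoch_nr++;
	}
}

/* Submit one write. epoch_first selects the old ordering (assign the
 * epoch, then check congestion) or the fixed one (check, then assign). */
static struct toy_req toy_submit(struct toy_conn *c, bool epoch_first)
{
	struct toy_req r;

	if (epoch_first) {
		r.epoch = c->epoch_nr;
		toy_maybe_pull_ahead(c);
	} else {
		toy_maybe_pull_ahead(c);
		r.epoch = c->epoch_nr;
	}
	r.replicated = !c->ahead;
	if (r.replicated)
		c->in_flight++;
	return r;
}

/* True if the epoch holds both replicated and non-replicated writes,
 * the situation the sanity check does not expect. */
static bool toy_epoch_is_mixed(const struct toy_req *reqs, size_t n,
			       unsigned int epoch)
{
	bool seen_replicated = false, seen_local_only = false;

	for (size_t i = 0; i < n; i++) {
		if (reqs[i].epoch != epoch)
			continue;
		if (reqs[i].replicated)
			seen_replicated = true;
		else
			seen_local_only = true;
	}
	return seen_replicated && seen_local_only;
}

/* Submit four writes with a congestion limit of two and report
 * whether epoch 1 ended up mixed. */
bool toy_run(bool epoch_first)
{
	struct toy_conn c = { .epoch_nr = 1, .in_flight = 0, .limit = 2, .ahead = false };
	struct toy_req reqs[4];

	for (size_t i = 0; i < 4; i++)
		reqs[i] = toy_submit(&c, epoch_first);
	return toy_epoch_is_mixed(reqs, 4, 1);
}

/* Usage: the old ordering yields a mixed epoch, the fixed one does not. */
void toy_demo(void)
{
	assert(toy_run(true));
	assert(!toy_run(false));
}
```

In the old ordering the third write is stamped with epoch 1 and only then hits the congestion check, so it is not replicated even though it belongs to an epoch of replicated writes. With the check moved first, the same write is stamped with the new epoch instead.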

Authored by Lars Ellenberg, committed by Jens Axboe
607f25e5 ef57f9e6

+7 -7
drivers/block/drbd/drbd_req.c
@@ -865,8 +865,10 @@
 	bool congested = false;
 	enum drbd_on_congestion on_congestion;
 
+	rcu_read_lock();
 	nc = rcu_dereference(tconn->net_conf);
 	on_congestion = nc ? nc->on_congestion : OC_BLOCK;
+	rcu_read_unlock();
 	if (on_congestion == OC_BLOCK ||
 	    tconn->agreed_pro_version < 96)
 		return;
@@ -962,14 +960,8 @@
 	struct drbd_conf *mdev = req->w.mdev;
 	int remote, send_oos;
 
-	rcu_read_lock();
 	remote = drbd_should_do_remote(mdev->state);
-	if (remote) {
-		maybe_pull_ahead(mdev);
-		remote = drbd_should_do_remote(mdev->state);
-	}
 	send_oos = drbd_should_send_out_of_sync(mdev->state);
-	rcu_read_unlock();
 
 	/* Need to replicate writes. Unless it is an empty flush,
 	 * which is better mapped to a DRBD P_BARRIER packet,
@@ -1083,9 +1087,13 @@
 		 * but will re-aquire it before it returns here.
 		 * Needs to be before the check on drbd_suspended() */
 		complete_conflicting_writes(req);
+		/* no more giving up req_lock from now on! */
+
+		/* check for congestion, and potentially stop sending
+		 * full data updates, but start sending "dirty bits" only. */
+		maybe_pull_ahead(mdev);
 	}
 
-	/* no more giving up req_lock from now on! */
 
 	if (drbd_suspended(mdev)) {
 		/* push back and retry: */