Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

rds: rds_cong_queue_updates needs to defer the congestion update transmission

When the RDS transport is TCP, we cannot inline the call to rds_send_xmit()
from rds_cong_queue_updates() because
(a) we are already holding the sock_lock in the recv path, and will
    deadlock when tcp_setsockopt/tcp_sendmsg try to take the sock_lock;
(b) rds_cong_queue_updates() does an irqsave on the rds_cong_lock, so
    interrupts are masked, and any attempt to take the sock_lock in
    that state will trigger warnings (for a good reason).

This patch reverts the change introduced by
2fa57129d ("RDS: Bypass workqueue when queueing cong updates").
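
To illustrate reason (a), here is a minimal single-threaded userspace sketch of the difference between transmitting inline and deferring to a worker. It is not the kernel code: every `sketch_*` name is hypothetical, the socket lock is modeled as a plain flag, and the workqueue is reduced to a "work pending" bit that a caller runs later. A real non-recursive lock would hang where this sketch returns -1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct sketch_conn {
	bool sock_locked;   /* models the socket lock held by the receive path */
	bool map_queued;    /* models bit 0 of c_map_queued */
	bool work_pending;  /* models c_send_w sitting on the workqueue */
	int  sent;          /* congestion updates actually transmitted */
	int  lock_errors;   /* attempts to take the lock while already held */
};

/* Models the transmit path, which must take the socket lock itself. */
int sketch_xmit(struct sketch_conn *c)
{
	if (c->sock_locked) {
		/* A real non-recursive lock would deadlock here. */
		c->lock_errors++;
		return -1;
	}
	c->sock_locked = true;
	if (c->map_queued) {
		c->sent++;
		c->map_queued = false;
	}
	c->sock_locked = false;
	return 0;
}

/* Inline variant: transmit directly from the caller's context. */
int sketch_queue_update_inline(struct sketch_conn *c)
{
	if (c->map_queued)
		return 0;
	c->map_queued = true;
	return sketch_xmit(c);
}

/* Deferred variant: only mark work as pending; a worker transmits later. */
int sketch_queue_update_deferred(struct sketch_conn *c)
{
	if (c->map_queued)
		return 0;
	c->map_queued = true;
	c->work_pending = true;
	return 0;
}

/* Models the worker running later, outside the receive path. */
int sketch_run_worker(struct sketch_conn *c)
{
	if (!c->work_pending)
		return 0;
	c->work_pending = false;
	return sketch_xmit(c);
}

/* Models the receive path: the lock is held while the update is queued. */
int sketch_recv_path(struct sketch_conn *c, bool deferred)
{
	int rc;

	c->sock_locked = true;
	rc = deferred ? sketch_queue_update_deferred(c)
		      : sketch_queue_update_inline(c);
	c->sock_locked = false;
	return rc;
}
```

Calling `sketch_recv_path(&c, false)` on a zero-initialized `struct sketch_conn` reports a lock error, while `sketch_recv_path(&c, true)` followed by `sketch_run_worker(&c)` transmits the update once the lock has been released.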

The patch has been verified for both RDS/TCP and RDS/RDMA to ensure
that there are no regressions for either transport:
- for verification of RDS/TCP, a client-server unit test was used,
  with the server blocked in gdb and thus unable to drain its rcvbuf,
  eventually triggering an RDS congestion update.
- for RDS/RDMA, the standard IB regression tests were used.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

authored by Sowmini Varadhan and committed by David S. Miller
80ad0d4a bf250a1f

+15 -1
net/rds/cong.c
@@ -221,7 +221,21 @@
 	list_for_each_entry(conn, &map->m_conn_list, c_map_item) {
 		if (!test_and_set_bit(0, &conn->c_map_queued)) {
 			rds_stats_inc(s_cong_update_queued);
-			rds_send_xmit(conn);
+			/* We cannot inline the call to rds_send_xmit() here
+			 * for two reasons (both pertaining to a TCP transport):
+			 * 1. When we get here from the receive path, we
+			 *    are already holding the sock_lock (held by
+			 *    tcp_v4_rcv()). So inlining calls to
+			 *    tcp_setsockopt and/or tcp_sendmsg will deadlock
+			 *    when it tries to get the sock_lock())
+			 * 2. Interrupts are masked so that we can mark the
+			 *    the port congested from both send and recv paths.
+			 *    (See comment around declaration of rdc_cong_lock).
+			 *    An attempt to get the sock_lock() here will
+			 *    therefore trigger warnings.
+			 * Defer the xmit to rds_send_worker() instead.
+			 */
+			queue_delayed_work(rds_wq, &conn->c_send_w, 0);
 		}
 	}
 