Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tipc: fix potential hanging after b/rcast changing

In commit c55c8edafa91 ("tipc: smooth change between replicast and
broadcast"), we allow instant switching between replicast and broadcast
by sending a dummy 'SYN' packet on the last used link to synchronize
packets on the links. The 'SYN' message is also subject to link
congestion, so if that happens, a 'SOCK_WAKEUP' will be scheduled to be
sent back to the socket...
However, in that commit, we simply use the same socket 'cong_link_cnt'
counter for both the 'SYN' & normal payload message sending. Therefore,
if both the replicast & broadcast links are congested, the counter will
not be updated correctly but overwritten by the latter congestion.
Later on, when the 'SOCK_WAKEUP' messages are processed, the counter is
decremented one by one and eventually wraps around below zero.
Consequently, further activities on the socket will only wait for the
false congestion signal to disappear, which never happens.
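
The failure sequence described above can be modelled in user space. The
following is only a minimal sketch, not kernel code: 'struct toy_sock'
and the 'toy_*' helpers are hypothetical names, and the counter is
assumed to be a 16-bit unsigned value that each send path stores into
rather than adds to.

```c
#include <stdint.h>

/* Toy model of a socket's congested-link counter. The names are
 * illustrative and are not the kernel's own structures. */
struct toy_sock {
	uint16_t cong_link_cnt;
};

/* A send path reports how many links it found congested by storing
 * (not adding) the value into the shared counter. */
static void toy_report_congestion(struct toy_sock *sk, uint16_t congested)
{
	sk->cong_link_cnt = congested;
}

/* Each wakeup message decrements the counter by one. */
static void toy_sock_wakeup(struct toy_sock *sk)
{
	sk->cong_link_cnt--;
}

/* Replay the faulty sequence and return the final counter value. */
static uint16_t toy_replay_bug(void)
{
	struct toy_sock sk = { 0 };

	toy_report_congestion(&sk, 1);	/* 'SYN' meets a congested link */
	toy_report_congestion(&sk, 1);	/* payload overwrites the count */
	toy_sock_wakeup(&sk);		/* first wakeup: 1 -> 0 */
	toy_sock_wakeup(&sk);		/* second wakeup: 0 wraps to 65535 */
	return sk.cong_link_cnt;
}
```

Two congestion events leave the counter at 1, yet two wakeups arrive, so
in this model the socket ends up believing that 65535 links are still
congested.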

Because sending the 'SYN' message is vital for the mechanism, it should
be done anyway. This commit fixes the issue by marking the message with
an error code, e.g. 'TIPC_ERR_NO_PORT', so that its sending is not
subject to link congestion and there is no need to touch the socket
'cong_link_cnt' either. In addition, in the event of any error (e.g.
-ENOBUFS), we will purge the entire payload message queue and return
immediately.
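
The intended behaviour can be sketched with a small user-space model.
This is an illustration under stated assumptions, not the kernel
implementation: 'toy_link_xmit', 'toy_mcast_xmit', the backlog limit and
the 'TOY_LINK_CONGESTED' result are all hypothetical, and a non-zero
'errcode' stands in for a message marked with 'TIPC_ERR_NO_PORT'.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_BACKLOG_LIMIT 2	/* hypothetical backlog limit per link */
#define TOY_LINK_CONGESTED 1	/* stand-in for a "link congested" result */

struct toy_msg {
	int errcode;		/* non-zero models a marked 'SYN' message */
};

struct toy_link {
	size_t backlog_len;		/* messages currently queued */
	unsigned int wakeups_pending;	/* wakeups scheduled for the socket */
};

/* Queue one message. On a full link, an unmarked message schedules a
 * wakeup and reports congestion; a marked message fails with -ENOBUFS
 * and schedules nothing. */
static int toy_link_xmit(struct toy_link *l, const struct toy_msg *m)
{
	if (l->backlog_len >= TOY_BACKLOG_LIMIT) {
		if (m->errcode)
			return -ENOBUFS;
		l->wakeups_pending++;
		return TOY_LINK_CONGESTED;
	}
	l->backlog_len++;
	return 0;
}

/* Send a marked 'SYN' and then '*npkts' payload messages. If the 'SYN'
 * fails, purge the payload queue (modelled by zeroing '*npkts') and
 * return immediately; otherwise stop at the first payload failure. */
static int toy_mcast_xmit(struct toy_link *l, size_t *npkts)
{
	const struct toy_msg syn = { .errcode = 1 };
	const struct toy_msg payload = { .errcode = 0 };
	int rc = toy_link_xmit(l, &syn);

	if (rc) {
		*npkts = 0;
		return rc;
	}
	while (*npkts > 0) {
		rc = toy_link_xmit(l, &payload);
		if (rc)
			return rc;
		(*npkts)--;
	}
	return 0;
}
```

In this model a marked message never adds to the pending wakeups, so the
only counter updates left are those caused by payload messages.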

Fixes: c55c8edafa91 ("tipc: smooth change between replicast and broadcast")
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Tuong Lien and committed by David S. Miller
dca4a17d d5162f34

+15 -9
net/tipc/bcast.c
···
  * @skb: socket buffer to copy
  * @method: send method to be used
  * @dests: destination nodes for message.
- * @cong_link_cnt: returns number of encountered congested destination links
  * Returns 0 if success, otherwise errno
  */
 static int tipc_mcast_send_sync(struct net *net, struct sk_buff *skb,
 				struct tipc_mc_method *method,
-				struct tipc_nlist *dests,
-				u16 *cong_link_cnt)
+				struct tipc_nlist *dests)
 {
 	struct tipc_msg *hdr, *_hdr;
 	struct sk_buff_head tmpq;
 	struct sk_buff *_skb;
+	u16 cong_link_cnt;
+	int rc = 0;
 
 	/* Is a cluster supporting with new capabilities ? */
 	if (!(tipc_net(net)->capabilities & TIPC_MCAST_RBCTL))
···
 	_hdr = buf_msg(_skb);
 	msg_set_size(_hdr, MCAST_H_SIZE);
 	msg_set_is_rcast(_hdr, !msg_is_rcast(hdr));
+	msg_set_errcode(_hdr, TIPC_ERR_NO_PORT);
 
 	__skb_queue_head_init(&tmpq);
 	__skb_queue_tail(&tmpq, _skb);
 	if (method->rcast)
-		tipc_bcast_xmit(net, &tmpq, cong_link_cnt);
+		rc = tipc_bcast_xmit(net, &tmpq, &cong_link_cnt);
 	else
-		tipc_rcast_xmit(net, &tmpq, dests, cong_link_cnt);
+		rc = tipc_rcast_xmit(net, &tmpq, dests, &cong_link_cnt);
 
 	/* This queue should normally be empty by now */
 	__skb_queue_purge(&tmpq);
 
-	return 0;
+	return rc;
 }
 
 /* tipc_mcast_xmit - deliver message to indicated destination nodes
···
 	msg_set_is_rcast(hdr, method->rcast);
 
 	/* Switch method ? */
-	if (rcast != method->rcast)
-		tipc_mcast_send_sync(net, skb, method,
-				     dests, cong_link_cnt);
+	if (rcast != method->rcast) {
+		rc = tipc_mcast_send_sync(net, skb, method, dests);
+		if (unlikely(rc)) {
+			pr_err("Unable to send SYN: method %d, rc %d\n",
+			       rcast, rc);
+			goto exit;
+		}
+	}
 
 	if (method->rcast)
 		rc = tipc_rcast_xmit(net, pkts, dests, cong_link_cnt);