Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

tipc: fix crash during node removal

When the TIPC module is unloaded, we have identified a race condition
that allows a node reference counter to go to zero and the node instance
being freed before the node timer is finished with accessing it. This
leads to occasional crashes, especially in multi-namespace environments.

The scenario goes as follows:

CPU0:(node_stop) CPU1:(node_timeout) // ref == 2

1: if(!mod_timer())
2: if (del_timer())
3: tipc_node_put() // ref -> 1
4: tipc_node_put() // ref -> 0
5: kfree_rcu(node);
6: tipc_node_get(node)
7: // BOOM!

We now clean up this functionality as follows:

1) We remove the node pointer from the node lookup table before we
attempt deactivating the timer. This way, we reduce the risk that
tipc_node_find() may obtain a valid pointer to an instance marked
for deletion; a harmless but undesirable situation.

2) We use del_timer_sync() instead of del_timer() to safely deactivate
the node timer without any risk that it might be reactivated by the
timeout handler. There is no risk of deadlock here, since the two
functions never touch the same spinlocks.

3: We remove a pointless tipc_node_get() + tipc_node_put() from the
timeout handler.

Reported-by: Zhijiang Hu <huzhijiang@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

authored by

Jon Paul Maloy and committed by
David S. Miller
d25a0125 b170997a

+11 -13
+11 -13
net/tipc/node.c
··· 225 225 226 226 static void tipc_node_kref_release(struct kref *kref) 227 227 { 228 - struct tipc_node *node = container_of(kref, struct tipc_node, kref); 228 + struct tipc_node *n = container_of(kref, struct tipc_node, kref); 229 229 230 - tipc_node_delete(node); 230 + kfree(n->bc_entry.link); 231 + kfree_rcu(n, rcu); 231 232 } 232 233 233 234 static void tipc_node_put(struct tipc_node *node) ··· 396 395 { 397 396 list_del_rcu(&node->list); 398 397 hlist_del_rcu(&node->hash); 399 - kfree(node->bc_entry.link); 400 - kfree_rcu(node, rcu); 398 + tipc_node_put(node); 399 + 400 + del_timer_sync(&node->timer); 401 + tipc_node_put(node); 401 402 } 402 403 403 404 void tipc_node_stop(struct net *net) 404 405 { 405 - struct tipc_net *tn = net_generic(net, tipc_net_id); 406 + struct tipc_net *tn = tipc_net(net); 406 407 struct tipc_node *node, *t_node; 407 408 408 409 spin_lock_bh(&tn->node_list_lock); 409 - list_for_each_entry_safe(node, t_node, &tn->node_list, list) { 410 - if (del_timer(&node->timer)) 411 - tipc_node_put(node); 412 - tipc_node_put(node); 413 - } 410 + list_for_each_entry_safe(node, t_node, &tn->node_list, list) 411 + tipc_node_delete(node); 414 412 spin_unlock_bh(&tn->node_list_lock); 415 413 } 416 414 ··· 530 530 if (rc & TIPC_LINK_DOWN_EVT) 531 531 tipc_node_link_down(n, bearer_id, false); 532 532 } 533 - if (!mod_timer(&n->timer, jiffies + n->keepalive_intv)) 534 - tipc_node_get(n); 535 - tipc_node_put(n); 533 + mod_timer(&n->timer, jiffies + n->keepalive_intv); 536 534 } 537 535 538 536 /**