Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: drop dst before queueing fragments

Commit 4a94445c9a5c (net: Use ip_route_input_noref() in input path)
introduced a bug in IP defragmentation handling, as a non-refcounted
dst could escape an RCU protected section.

Commit 64f3b9e203bd068 (net: ip_expire() must revalidate route) fixed
the case of timeouts, but not the general problem.

Tom Parkin noticed crashes in the UDP stack and provided a patch,
but further analysis allowed us to pinpoint the root cause.

Before queueing a packet into a frag list, we must drop its dst,
as this dst has a limited lifetime (RCU protected).

When/if a packet is finally reassembled, we use the dst of the very
last skb, still protected by RCU and valid, as the dst of the
reassembled packet.

Use the same logic in IPv6, as there is no need to hold dst references.
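
The queueing rule described above can be illustrated with a small
userspace sketch. Everything named toy_* here is a hypothetical stand-in
invented for illustration, not the kernel's struct sk_buff or dst API;
the sketch assumes a queue that only counts fragments and a route field
that stores a pointer plus a "no reference held" flag in its low bit,
loosely modelled on _skb_refdst.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types (illustration only). */
struct toy_dst { int refcnt; };

struct toy_skb {
	struct toy_skb *next;
	unsigned long refdst;	/* dst pointer | TOY_DST_NOREF flag */
};

#define TOY_DST_NOREF 1UL

struct toy_queue {
	struct toy_skb *head, *tail;
	int queued, expected;
};

/* Attach a route without taking a reference (only valid for a short time). */
static void toy_dst_set_noref(struct toy_skb *skb, struct toy_dst *dst)
{
	skb->refdst = (unsigned long)dst | TOY_DST_NOREF;
}

static struct toy_dst *toy_dst(const struct toy_skb *skb)
{
	return (struct toy_dst *)(skb->refdst & ~TOY_DST_NOREF);
}

/* Forget the route; release a counted reference if one was held. */
static void toy_dst_drop(struct toy_skb *skb)
{
	if (skb->refdst && !(skb->refdst & TOY_DST_NOREF))
		toy_dst(skb)->refcnt--;
	skb->refdst = 0UL;
}

/* Toy "reassembly": returns how many queued fragments still carry a route. */
static int toy_reasm(const struct toy_queue *q)
{
	int with_dst = 0;
	const struct toy_skb *p;

	for (p = q->head; p; p = p->next)
		if (p->refdst)
			with_dst++;
	return with_dst;
}

/* Returns 0 when the packet is complete, -1 while fragments are missing. */
static int toy_frag_queue(struct toy_queue *q, struct toy_skb *skb)
{
	skb->next = NULL;
	if (q->tail)
		q->tail->next = skb;
	else
		q->head = skb;
	q->tail = skb;
	q->queued++;

	if (q->queued == q->expected) {
		/* Hide the last fragment's route during reassembly, then restore it. */
		unsigned long orefdst = skb->refdst;
		int stale;

		skb->refdst = 0UL;
		stale = toy_reasm(q);
		skb->refdst = orefdst;
		assert(stale == 0);	/* no queued fragment kept a route */
		return 0;
	}

	/* Not complete yet: the fragment must not keep its short-lived route. */
	toy_dst_drop(skb);
	return -1;
}
```

In this sketch, a fragment that has to wait in the queue loses its route
immediately, while the fragment that completes the packet keeps the route
it arrived with, which is the only one still known to be valid.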

Reported-by: Tom Parkin <tparkin@katalix.com>
Tested-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Eric Dumazet and committed by David S. Miller
97599dc7 8d7ed0f0

+20 -6
+10 -4
net/ipv4/ip_fragment.c
···
 		if (!head->dev)
 			goto out_rcu_unlock;
 
-		/* skb dst is stale, drop it, and perform route lookup again */
-		skb_dst_drop(head);
+		/* skb has no dst, perform route lookup again */
 		iph = ip_hdr(head);
 		err = ip_route_input_noref(head, iph->daddr, iph->saddr,
 					   iph->tos, head->dev);
···
 		qp->q.max_size = skb->len + ihl;
 
 	if (qp->q.last_in == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
-	    qp->q.meat == qp->q.len)
-		return ip_frag_reasm(qp, prev, dev);
+	    qp->q.meat == qp->q.len) {
+		unsigned long orefdst = skb->_skb_refdst;
 
+		skb->_skb_refdst = 0UL;
+		err = ip_frag_reasm(qp, prev, dev);
+		skb->_skb_refdst = orefdst;
+		return err;
+	}
+
+	skb_dst_drop(skb);
 	inet_frag_lru_move(&qp->q);
 	return -EINPROGRESS;
 
+10 -2
net/ipv6/reassembly.c
···
 	}
 
 	if (fq->q.last_in == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
-	    fq->q.meat == fq->q.len)
-		return ip6_frag_reasm(fq, prev, dev);
+	    fq->q.meat == fq->q.len) {
+		int res;
+		unsigned long orefdst = skb->_skb_refdst;
 
+		skb->_skb_refdst = 0UL;
+		res = ip6_frag_reasm(fq, prev, dev);
+		skb->_skb_refdst = orefdst;
+		return res;
+	}
+
+	skb_dst_drop(skb);
 	inet_frag_lru_move(&fq->q);
 	return -1;
 