Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'frag-udp-tunneled-skbs'

Shmulik Ladkani says:

====================
net: Consider fragmentation of udp tunneled skbs in 'ip_finish_output_gso'

Currently IP fragmentation of GSO segments that exceed dst mtu is
considered only in the ipv4 forwarding case.

There are cases where GSO skbs that are bridged and then udp-tunneled
may have gso_size exceeding the egress device mtu.
It makes sense to fragment them, as in the non-GSO code path.

The exact cases where this behavior is needed are described and addressed
in the 2nd patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

+16 -4
+1
include/net/ip.h
···
 #define IPSKB_REROUTED		BIT(4)
 #define IPSKB_DOREDIRECT	BIT(5)
 #define IPSKB_FRAG_PMTU		BIT(6)
+#define IPSKB_FRAG_SEGS		BIT(7)

 	u16			frag_max_size;
 };
+1 -1
net/ipv4/ip_forward.c
···
 	if (opt->is_strictroute && rt->rt_uses_gateway)
 		goto sr_failed;

-	IPCB(skb)->flags |= IPSKB_FORWARDED;
+	IPCB(skb)->flags |= IPSKB_FORWARDED | IPSKB_FRAG_SEGS;
 	mtu = ip_dst_mtu_maybe_forward(&rt->dst, true);
 	if (ip_exceeds_mtu(skb, mtu)) {
 		IP_INC_STATS(net, IPSTATS_MIB_FRAGFAILS);
+4 -2
net/ipv4/ip_output.c
···
 	struct sk_buff *segs;
 	int ret = 0;

-	/* common case: locally created skb or seglen is <= mtu */
-	if (((IPCB(skb)->flags & IPSKB_FORWARDED) == 0) ||
+	/* common case: fragmentation of segments is not allowed,
+	 * or seglen is <= mtu
+	 */
+	if (((IPCB(skb)->flags & IPSKB_FRAG_SEGS) == 0) ||
 	    skb_gso_validate_mtu(skb, mtu))
 		return ip_finish_output2(net, sk, skb);

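The new fast-path test in ip_finish_output_gso can be modelled in user space. The sketch below is only an illustration, not kernel code: struct model_skb, enum gso_path and model_gso_path() are hypothetical names, and the seglen field stands in for the per-segment length that skb_gso_validate_mtu() compares against the mtu. Only the IPSKB_FRAG_SEGS value is taken from the diff.

```c
#include <stdint.h>

#define BIT(n)           (1u << (n))
#define IPSKB_FRAG_SEGS  BIT(7)   /* value as added to include/net/ip.h */

/* Hypothetical stand-in for the two inputs the check reads;
 * this is not the kernel's struct sk_buff.
 */
struct model_skb {
	uint16_t     ipcb_flags;  /* models IPCB(skb)->flags */
	unsigned int seglen;      /* models the length of one GSO segment */
};

enum gso_path {
	GSO_PATH_FAST,     /* pass the GSO skb on unchanged */
	GSO_PATH_SEGMENT,  /* segment in software, then fragment oversized segments */
};

/* Mirrors the condition in the diff: take the fast path when
 * fragmentation of segments is not allowed, or when each segment
 * already fits within the mtu.
 */
static enum gso_path model_gso_path(const struct model_skb *skb,
				    unsigned int mtu)
{
	if ((skb->ipcb_flags & IPSKB_FRAG_SEGS) == 0 || skb->seglen <= mtu)
		return GSO_PATH_FAST;
	return GSO_PATH_SEGMENT;
}
```

With an mtu of 1500, a 1600-byte segment takes the slow path only when IPSKB_FRAG_SEGS is set. Without the flag the skb goes out unchanged, which is what the old IPSKB_FORWARDED test did for every skb that had not been forwarded.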
+9
net/ipv4/ip_tunnel_core.c
···
 	int pkt_len = skb->len - skb_inner_network_offset(skb);
 	struct net *net = dev_net(rt->dst.dev);
 	struct net_device *dev = skb->dev;
+	int skb_iif = skb->skb_iif;
 	struct iphdr *iph;
 	int err;

···
 	skb_clear_hash(skb);
 	skb_dst_set(skb, &rt->dst);
 	memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
+
+	if (skb_iif && proto == IPPROTO_UDP) {
+		/* Arrived from an ingress interface and got udp encapuslated.
+		 * The encapsulated network segment length may exceed dst mtu.
+		 * Allow IP Fragmentation of segments.
+		 */
+		IPCB(skb)->flags |= IPSKB_FRAG_SEGS;
+	}

 	/* Push down and install the IP header. */
 	skb_push(skb, sizeof(struct iphdr));
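The flag decision added to the tunnel transmit path can be sketched the same way. This is a user-space illustration under stated assumptions: model_tunnel_ipcb_flags() and MODEL_IPPROTO_UDP are hypothetical names, and the function models only the control-block flags, starting from zero because the memset() in the diff clears IPCB(skb) just before the new check.

```c
#include <stdint.h>

#define BIT(n)             (1u << (n))
#define IPSKB_FRAG_SEGS    BIT(7)  /* value as added to include/net/ip.h */
#define MODEL_IPPROTO_UDP  17      /* IANA protocol number for UDP */

/* Returns the modelled IPCB flags after the new check.
 * skb_iif: saved ingress interface index, 0 when the skb did not
 *          arrive from an ingress interface.
 * proto:   IP protocol number of the encapsulation.
 */
static uint16_t model_tunnel_ipcb_flags(int skb_iif, uint8_t proto)
{
	uint16_t flags = 0;  /* control block was just zeroed */

	if (skb_iif && proto == MODEL_IPPROTO_UDP)
		flags |= IPSKB_FRAG_SEGS;  /* allow fragmentation of segments */

	return flags;
}
```

Only skbs that arrived from an ingress interface and are being UDP-encapsulated get the flag. Locally generated traffic (skb_iif of 0) and non-UDP encapsulations keep the previous behavior.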
+1 -1
net/ipv4/ipmr.c
···
 		vif->dev->stats.tx_bytes += skb->len;
 	}

-	IPCB(skb)->flags |= IPSKB_FORWARDED;
+	IPCB(skb)->flags |= IPSKB_FORWARDED | IPSKB_FRAG_SEGS;

 	/* RFC1584 teaches, that DVMRP/PIM router must deliver packets locally
 	 * not only before forwarding, but after forwarding on all output