Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

netkit: Fix pkt_type override upon netkit pass verdict

When running the Cilium connectivity test suite with netkit in L2 mode, we
found that, compared to tcx, a few tests which push traffic into an L7 proxy
sitting in the host namespace were failing. The problem in particular is
around the invocation of eth_type_trans() in netkit.

In case of tcx, this is run before the tcx ingress is triggered inside
host namespace and thus if the BPF program uses the bpf_skb_change_type()
helper the newly set type is retained. However, in case of netkit, the
late eth_type_trans() invocation overrides the earlier decision from the
BPF program which eventually leads to the test failure.

Instead of eth_type_trans(), split out the relevant parts, meaning, reset
of mac header and call to eth_skb_pkt_type() before the BPF program is run
in order to have the same behavior as with tcx, and refactor a small helper
called eth_skb_pull_mac() which is run when the skb is passed up the stack,
where the mac header must be pulled. With this all connectivity tests pass.

Fixes: 35dfaad7188c ("netkit, bpf: Add bpf programmable net device")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240524163619.26001-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
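
The ordering issue described in the commit message can be illustrated with a small userspace model. This is only a sketch and not kernel code: `model_skb`, `model_dev`, `model_classify()`, `model_bpf_prog()` and the two `deliver_*()` functions are hypothetical stand-ins introduced here, and the classification is reduced to the unicast case (a destination MAC that does not match the device is marked as other-host).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Same numeric values as PACKET_HOST and PACKET_OTHERHOST in <linux/if_packet.h>. */
enum pkt_type { PKT_HOST = 0, PKT_OTHERHOST = 3 };

/* Hypothetical stand-ins for the few sk_buff / net_device fields that matter here. */
struct model_skb {
	uint8_t dst_mac[MAC_LEN];
	enum pkt_type pkt_type;
};

struct model_dev {
	uint8_t mac[MAC_LEN];
};

/* Simplified, unicast-only model of the check done by eth_skb_pkt_type():
 * a frame not addressed to the device is marked as other-host. */
static void model_classify(struct model_skb *skb, const struct model_dev *dev)
{
	assert(skb && dev);
	if (memcmp(skb->dst_mac, dev->mac, MAC_LEN) != 0)
		skb->pkt_type = PKT_OTHERHOST;
}

/* Stand-in for a BPF program calling bpf_skb_change_type(skb, PACKET_HOST). */
static void model_bpf_prog(struct model_skb *skb)
{
	skb->pkt_type = PKT_HOST;
}

/* Pre-fix netkit ordering: classification runs after the program,
 * so the program's decision is overwritten. */
static enum pkt_type deliver_classify_late(struct model_skb skb,
					   const struct model_dev *peer)
{
	model_bpf_prog(&skb);
	model_classify(&skb, peer);
	return skb.pkt_type;
}

/* tcx-like (and post-fix netkit) ordering: classification runs first,
 * so the program's decision is retained. */
static enum pkt_type deliver_classify_early(struct model_skb skb,
					    const struct model_dev *peer)
{
	model_classify(&skb, peer);
	model_bpf_prog(&skb);
	return skb.pkt_type;
}
```

For a frame whose destination MAC differs from the peer device's address, the late-classification path in this model ends with `PKT_OTHERHOST` while the early-classification path keeps `PKT_HOST`, which mirrors the difference between the old netkit behavior and tcx described above.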

Authored by Daniel Borkmann and committed by Alexei Starovoitov
3998d184 d6fe532b

+12 -4
+3 -1
drivers/net/netkit.c
···
 	skb_scrub_packet(skb, xnet);
 	skb->priority = 0;
 	nf_skip_egress(skb, true);
+	skb_reset_mac_header(skb);
 }
 
 static struct netkit *netkit_priv(const struct net_device *dev)
···
 		     skb_orphan_frags(skb, GFP_ATOMIC)))
 		goto drop;
 	netkit_prep_forward(skb, !net_eq(dev_net(dev), dev_net(peer)));
+	eth_skb_pkt_type(skb, peer);
 	skb->dev = peer;
 	entry = rcu_dereference(nk->active);
 	if (entry)
···
 	switch (ret) {
 	case NETKIT_NEXT:
 	case NETKIT_PASS:
-		skb->protocol = eth_type_trans(skb, skb->dev);
+		eth_skb_pull_mac(skb);
 		skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
 		if (likely(__netif_rx(skb) == NET_RX_SUCCESS)) {
 			dev_sw_netstats_tx_add(dev, 1, len);
+8
include/linux/etherdevice.h
···
 	}
 }
 
+static inline struct ethhdr *eth_skb_pull_mac(struct sk_buff *skb)
+{
+	struct ethhdr *eth = (struct ethhdr *)skb->data;
+
+	skb_pull_inline(skb, ETH_HLEN);
+	return eth;
+}
+
 /**
  * eth_skb_pad - Pad buffer to mininum number of octets for Ethernet frame
  * @skb: Buffer to pad
+1 -3
net/ethernet/eth.c
···
 	skb->dev = dev;
 	skb_reset_mac_header(skb);
 
-	eth = (struct ethhdr *)skb->data;
-	skb_pull_inline(skb, ETH_HLEN);
-
+	eth = eth_skb_pull_mac(skb);
 	eth_skb_pkt_type(skb, dev);
 
 	/*
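
The pointer handling in eth_skb_pull_mac() can be pictured with a userspace sketch. `model_buf` and `model_pull_mac()` are hypothetical names used only for illustration. One deliberate difference from the kernel: this model asserts that the buffer holds at least ETH_HLEN bytes, whereas the helper in the diff does not check the result of skb_pull_inline().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN 14	/* length of an Ethernet header in bytes */

/* Hypothetical stand-in for the data pointer and length of an sk_buff. */
struct model_buf {
	uint8_t *data;	/* current start of the packet data */
	size_t len;	/* bytes available from data onwards */
};

/* Remember where the Ethernet header starts, then advance the data
 * pointer past it so that data points at the L3 payload. */
static uint8_t *model_pull_mac(struct model_buf *buf)
{
	uint8_t *hdr = buf->data;

	assert(buf->len >= ETH_HLEN);	/* model-only sanity check */
	buf->data += ETH_HLEN;
	buf->len -= ETH_HLEN;
	return hdr;
}
```

The caller gets back the header start for inspection, as eth_type_trans() does with the returned `eth`, while the buffer itself now begins at the payload, which is the state expected once the packet is passed up the stack.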