Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: add and use skb_get_hash_net

Years ago, the flow dissector gained the ability to delegate flow
dissection to a BPF program, scoped per netns.

Unfortunately, skb_get_hash() only gets an sk_buff argument instead
of both net+skb. This means the flow dissector needs to obtain the
netns pointer from somewhere else.

The netns is derived from skb->dev, and if that is not available, from
skb->sk. If neither is set, we hit a (benign) WARN_ON_ONCE().
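
The lookup order described above can be sketched in user-space C. This is only an illustration: the struct layouts are simplified stand-ins rather than the kernel definitions, and `resolve_net()` and `warn_count` are hypothetical names (the counter stands in for WARN_ON_ONCE()).

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the fields needed here. */
struct net { int id; };
struct net_device { struct net *nd_net; };
struct sock { struct net *sk_net; };
struct sk_buff {
	struct net_device *dev;
	struct sock *sk;
};

/* Counts lookups that found no netns (stands in for WARN_ON_ONCE()). */
static int warn_count;

/* Hypothetical helper: derive the netns from skb->dev, then from skb->sk. */
static const struct net *resolve_net(const struct sk_buff *skb)
{
	if (skb->dev)
		return skb->dev->nd_net;
	if (skb->sk)
		return skb->sk->sk_net;

	warn_count++;	/* neither pointer is set */
	return NULL;
}
```

An skb with both pointers NULL, as in the reset case reported below, falls through both checks and reaches the warning path.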

Trying both dev and sk covers most cases, but not all, as recently
reported by Christoph Paasch.

In the case of an nf-generated TCP reset, both sk and dev are NULL:

WARNING: .. net/core/flow_dissector.c:1104
skb_flow_dissect_flow_keys include/linux/skbuff.h:1536 [inline]
skb_get_hash include/linux/skbuff.h:1578 [inline]
nft_trace_init+0x7d/0x120 net/netfilter/nf_tables_trace.c:320
nft_do_chain+0xb26/0xb90 net/netfilter/nf_tables_core.c:268
nft_do_chain_ipv4+0x7a/0xa0 net/netfilter/nft_chain_filter.c:23
nf_hook_slow+0x57/0x160 net/netfilter/core.c:626
__ip_local_out+0x21d/0x260 net/ipv4/ip_output.c:118
ip_local_out+0x26/0x1e0 net/ipv4/ip_output.c:127
nf_send_reset+0x58c/0x700 net/ipv4/netfilter/nf_reject_ipv4.c:308
nft_reject_ipv4_eval+0x53/0x90 net/ipv4/netfilter/nft_reject_ipv4.c:30
[..]

syzkaller did something like this:
table inet filter {
	chain input {
		type filter hook input priority filter; policy accept;
		meta nftrace set 1
		tcp dport 42 reject with tcp reset
	}
	chain output {
		type filter hook output priority filter; policy accept;
		# empty chain is enough
	}
}

... then sends a tcp packet to port 42.

An initial attempt to simply set skb->dev from nf_reject_ipv4 doesn't cover
all cases: skbs generated via ipv4 igmp_send_report trigger a similar splat.

Moreover, Pablo Neira found that nft_hash.c uses __skb_get_hash_symmetric(),
which would trigger the same WARN splat for such skbs.

Let's allow callers to pass the current netns explicitly.
The nf_trace infrastructure is adjusted to use the new helper.

__skb_get_hash_symmetric is handled in the next patch.
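
A minimal user-space sketch of the "explicit netns, else derive from the skb" convention follows. It assumes simplified stand-in structs, and `pick_net()` is a hypothetical name; in the actual patch the net pointer is simply handed on to __skb_flow_dissect().

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types. */
struct net { int id; };
struct net_device { struct net *nd_net; };
struct sock { struct net *sk_net; };
struct sk_buff {
	struct net_device *dev;
	struct sock *sk;
};

/* Hypothetical helper: an explicit netns wins; NULL means "derive from skb". */
static const struct net *pick_net(const struct net *net,
				  const struct sk_buff *skb)
{
	if (net)
		return net;
	if (skb->dev)
		return skb->dev->nd_net;
	if (skb->sk)
		return skb->sk->sk_net;
	return NULL;
}
```

A caller that already knows its netns, such as a netfilter hook, gets a valid result even for an skb with neither dev nor sk set, while callers passing NULL keep the old behaviour.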

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/494
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240608221057.16070-2-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Florian Westphal and committed by Jakub Kicinski
b975d3ee 91579c93

+22 -7
+10 -2
include/linux/skbuff.h
···
 	__skb_set_hash(skb, hash, true, is_l4);
 }
 
-void __skb_get_hash(struct sk_buff *skb);
+void __skb_get_hash_net(const struct net *net, struct sk_buff *skb);
 u32 __skb_get_hash_symmetric(const struct sk_buff *skb);
 u32 skb_get_poff(const struct sk_buff *skb);
 u32 __skb_get_poff(const struct sk_buff *skb, const void *data,
···
 			struct flow_dissector *flow_dissector,
 			void *target_container);
 
+static inline __u32 skb_get_hash_net(const struct net *net, struct sk_buff *skb)
+{
+	if (!skb->l4_hash && !skb->sw_hash)
+		__skb_get_hash_net(net, skb);
+
+	return skb->hash;
+}
+
 static inline __u32 skb_get_hash(struct sk_buff *skb)
 {
 	if (!skb->l4_hash && !skb->sw_hash)
-		__skb_get_hash(skb);
+		__skb_get_hash_net(NULL, skb);
 
 	return skb->hash;
 }
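
The inline wrappers in this hunk compute the hash lazily: the out-of-line function runs only when neither l4_hash nor sw_hash is already set. A rough user-space sketch of that pattern is shown below. The `toy_` functions, the `compute_calls` counter, and the toy hash value are all hypothetical; the real helper dissects the packet and hashes the resulting flow keys.

```c
#include <stdint.h>

/* Simplified stand-ins; only the fields the wrapper looks at. */
struct net { int id; };
struct sk_buff {
	uint32_t hash;
	unsigned int l4_hash:1;
	unsigned int sw_hash:1;
};

/* Counts how often the slow path runs. */
static int compute_calls;

/* Stand-in for __skb_get_hash_net(): stores a toy value, not a flow hash. */
static void toy_get_hash_net(const struct net *net, struct sk_buff *skb)
{
	compute_calls++;
	skb->hash = 0x1000u + (net ? (uint32_t)net->id : 0u);
	skb->sw_hash = 1;	/* mark the hash as software-computed */
}

/* Mirrors the wrapper shape: compute only if no hash is recorded yet. */
static uint32_t toy_skb_get_hash_net(const struct net *net, struct sk_buff *skb)
{
	if (!skb->l4_hash && !skb->sw_hash)
		toy_get_hash_net(net, skb);

	return skb->hash;
}
```

Calling the wrapper twice on the same skb runs the slow path once, and an skb that already has l4_hash set keeps its existing hash untouched.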
+11 -4
net/core/flow_dissector.c
···
 EXPORT_SYMBOL_GPL(__skb_get_hash_symmetric);
 
 /**
- * __skb_get_hash: calculate a flow hash
+ * __skb_get_hash_net: calculate a flow hash
+ * @net: associated network namespace, derived from @skb if NULL
  * @skb: sk_buff to calculate flow hash from
  *
  * This function calculates a flow hash based on src/dst addresses
···
  * on success, zero indicates no valid hash.  Also, sets l4_hash in skb
  * if hash is a canonical 4-tuple hash over transport ports.
  */
-void __skb_get_hash(struct sk_buff *skb)
+void __skb_get_hash_net(const struct net *net, struct sk_buff *skb)
 {
 	struct flow_keys keys;
 	u32 hash;
 
+	memset(&keys, 0, sizeof(keys));
+
+	__skb_flow_dissect(net, skb, &flow_keys_dissector,
+			   &keys, NULL, 0, 0, 0,
+			   FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
+
 	__flow_hash_secret_init();
 
-	hash = ___skb_get_hash(skb, &keys, &hashrnd);
+	hash = __flow_hash_from_keys(&keys, &hashrnd);
 
 	__skb_set_sw_hash(skb, hash, flow_keys_have_l4(&keys));
 }
-EXPORT_SYMBOL(__skb_get_hash);
+EXPORT_SYMBOL(__skb_get_hash_net);
 
 __u32 skb_get_hash_perturb(const struct sk_buff *skb,
 			   const siphash_key_t *perturb)
+1 -1
net/netfilter/nf_tables_trace.c
···
 	net_get_random_once(&trace_key, sizeof(trace_key));
 
 	info->skbid = (u32)siphash_3u32(hash32_ptr(skb),
-					skb_get_hash(skb),
+					skb_get_hash_net(nft_net(pkt), skb),
 					skb->skb_iif,
 					&trace_key);
 }