Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tcp: Fix bind() regression for v6-only wildcard and v4(-mapped-v6) non-wildcard addresses.

Jianguo Wu reported another bind() regression introduced by bhash2.

When bind() is called for the following 3 addresses on the same port,
the 3rd call should fail but now succeeds.

1. 0.0.0.0 or ::ffff:0.0.0.0
2. [::] w/ IPV6_V6ONLY
3. IPv4 non-wildcard address or v4-mapped-v6 non-wildcard address
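
The sequence above can be exercised from userspace with a small program.
The following is a minimal sketch, not part of the patch: the helper names
bind_v4(), bind_v6() and third_bind_is_rejected() are hypothetical, the
port is chosen by the caller, and 127.0.0.1 is assumed as the IPv4
non-wildcard address.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a TCP socket to an IPv4 address.
 * Returns the fd on success, or -errno on failure.
 */
int bind_v4(const char *addr, unsigned short port)
{
	struct sockaddr_in sin;
	int fd, err;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, addr, &sin.sin_addr) != 1)
		return -EINVAL;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -errno;

	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		err = errno;
		close(fd);
		return -err;
	}
	return fd;
}

/* Hypothetical helper: bind a TCP socket to an IPv6 address with
 * IPV6_V6ONLY set to v6only. Returns the fd, or -errno on failure.
 */
int bind_v6(const char *addr, unsigned short port, int v6only)
{
	struct sockaddr_in6 sin6;
	int fd, err;

	memset(&sin6, 0, sizeof(sin6));
	sin6.sin6_family = AF_INET6;
	sin6.sin6_port = htons(port);
	if (inet_pton(AF_INET6, addr, &sin6.sin6_addr) != 1)
		return -EINVAL;

	fd = socket(AF_INET6, SOCK_STREAM, 0);
	if (fd < 0)
		return -errno;

	if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &v6only, sizeof(v6only)) < 0 ||
	    bind(fd, (struct sockaddr *)&sin6, sizeof(sin6)) < 0) {
		err = errno;
		close(fd);
		return -err;
	}
	return fd;
}

/* Run the three bind() calls on one port.
 * Returns 1 if the 3rd bind() failed with EADDRINUSE (expected),
 * 0 if it succeeded (the regression), or -errno if setup failed.
 */
int third_bind_is_rejected(unsigned short port)
{
	int fd1, fd2, fd3, ret;

	fd1 = bind_v4("0.0.0.0", port);		/* 1. IPv4 wildcard */
	if (fd1 < 0)
		return fd1;

	fd2 = bind_v6("::", port, 1);		/* 2. [::] w/ IPV6_V6ONLY */
	if (fd2 < 0) {
		close(fd1);
		return fd2;
	}

	fd3 = bind_v4("127.0.0.1", port);	/* 3. IPv4 non-wildcard */
	if (fd3 >= 0) {
		ret = 0;
		close(fd3);
	} else if (fd3 == -EADDRINUSE) {
		ret = 1;
	} else {
		ret = fd3;
	}

	close(fd2);
	close(fd1);
	return ret;
}
```

Calling third_bind_is_rejected() with a free port should return 1 on a
kernel with this fix and 0 on an affected kernel. Passing
"::ffff:127.0.0.1" to bind_v6() with v6only set to 0 in step 3 would
cover the v4-mapped-v6 variant.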

The first two bind() calls create tb2 buckets like this:

bhash2 -> tb2(:: w/ IPV6_V6ONLY) -> tb2(0.0.0.0)

The 3rd bind() matches the IPv6-only wildcard address bucket in
inet_bind2_bucket_match_addr_any(), but no conflicting socket exists
in that bucket. So inet_bhash2_conflict() returns false, and
consequently inet_bhash2_addr_any_conflict() returns false as well.

As a result, the 3rd bind() bypasses the conflict check, which should
be done against the IPv4 wildcard address bucket.

So, in inet_bhash2_addr_any_conflict(), we must iterate over all buckets.
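
The difference between stopping at the first matching bucket and walking
all of them can be sketched with a toy model. This is an illustration
under stated assumptions, not kernel code: struct toy_bucket and both
functions are hypothetical, and the two booleans stand in for the results
of inet_bind2_bucket_match_addr_any() and inet_bhash2_conflict().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inet_bind2_bucket. */
struct toy_bucket {
	bool matches_addr_any;	/* result of the addr-any match for this bucket */
	bool has_conflict;	/* result of the per-bucket conflict check */
};

/* Old behaviour: stop at the first matching bucket and check only that one. */
bool addr_any_conflict_old(const struct toy_bucket *chain, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (chain[i].matches_addr_any)
			break;

	return i < n && chain[i].has_conflict;
}

/* New behaviour: keep walking until a matching bucket reports a conflict. */
bool addr_any_conflict_new(const struct toy_bucket *chain, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!chain[i].matches_addr_any)
			continue;

		if (!chain[i].has_conflict)
			continue;

		return true;
	}

	return false;
}
```

With a chain of { true, false } for the [::] w/ IPV6_V6ONLY bucket
followed by { true, true } for the 0.0.0.0 bucket, the old walk reports
no conflict while the new walk reports one.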

Note that we cannot add an ipv6_only flag to inet_bind2_bucket, as it
would mishandle the following pattern:

1. [::] w/ SO_REUSE{ADDR,PORT} and IPV6_V6ONLY
2. [::] w/ SO_REUSE{ADDR,PORT}
3. IPv4 non-wildcard address or v4-mapped-v6 non-wildcard address

The first bind() would create a bucket with the ipv6_only flag set to
true, the second bind() would add the [::] socket to the same bucket,
and the third bind() could succeed based on the wrong assumption that
an ipv6_only bucket would not conflict with a v4(-mapped-v6) address.
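
The pitfall of a per-bucket flag can be sketched the same way. Everything
below is hypothetical and only models the rejected design: struct
flagged_bucket, its ipv6_only member and both functions do not exist in
the kernel, and the model assumes the third socket sets no SO_REUSE*
option, so any dual-stack [::] socket in the bucket conflicts with it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical socket model: only the IPV6_V6ONLY setting matters here. */
struct toy_sock {
	bool v6only;
};

/* Hypothetical [::] bucket carrying a flag fixed when the bucket is created. */
struct flagged_bucket {
	bool ipv6_only;			/* copied from the first socket only */
	const struct toy_sock *socks;	/* sockets sharing the bucket */
	size_t nr_socks;
};

/* Per-socket check: a v4 bind conflicts with any dual-stack [::] socket. */
bool v4_conflict_per_socket(const struct flagged_bucket *tb)
{
	size_t i;

	for (i = 0; i < tb->nr_socks; i++)
		if (!tb->socks[i].v6only)
			return true;

	return false;
}

/* Flag-based shortcut: trusts the bucket flag and skips the sockets. */
bool v4_conflict_using_flag(const struct flagged_bucket *tb)
{
	if (tb->ipv6_only)
		return false;

	return v4_conflict_per_socket(tb);
}
```

For a bucket created by a v6-only socket and later joined by a dual-stack
[::] socket, the flag-based shortcut reports no conflict, while the
per-socket check does.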

Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Diagnosed-by: Jianguo Wu <wujianguo106@163.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240326204251.51301-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Kuniyuki Iwashima and committed by Jakub Kicinski
d91ef1e1 ea111449

+11 -8
net/ipv4/inet_connection_sock.c
···
 	struct sock_reuseport *reuseport_cb;
 	struct inet_bind_hashbucket *head2;
 	struct inet_bind2_bucket *tb2;
+	bool conflict = false;
 	bool reuseport_cb_ok;
 
 	rcu_read_lock();
···
 
 	spin_lock(&head2->lock);
 
-	inet_bind_bucket_for_each(tb2, &head2->chain)
-		if (inet_bind2_bucket_match_addr_any(tb2, net, port, l3mdev, sk))
-			break;
+	inet_bind_bucket_for_each(tb2, &head2->chain) {
+		if (!inet_bind2_bucket_match_addr_any(tb2, net, port, l3mdev, sk))
+			continue;
 
-	if (tb2 && inet_bhash2_conflict(sk, tb2, uid, relax, reuseport_cb_ok,
-					reuseport_ok)) {
-		spin_unlock(&head2->lock);
-		return true;
+		if (!inet_bhash2_conflict(sk, tb2, uid, relax, reuseport_cb_ok, reuseport_ok))
+			continue;
+
+		conflict = true;
+		break;
 	}
 
 	spin_unlock(&head2->lock);
-	return false;
+
+	return conflict;
 }
 
 /*