Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: Introduce net.ipv4.tcp_migrate_req.

This commit adds a new sysctl option: net.ipv4.tcp_migrate_req. If this
option is enabled or an eBPF program is attached, child sockets can be
migrated from one listener to another in the same reuseport group after
a close() or shutdown() syscall.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210612123224.12525-2-kuniyu@amazon.co.jp

Authored by Kuniyuki Iwashima and committed by Daniel Borkmann
f9ac779f bbf29d3a

+35
+25
Documentation/networking/ip-sysctl.rst
@@ -761,6 +761,31 @@
 	network connections you can set this knob to 2 to enable
 	unconditionally generation of syncookies.
 
+tcp_migrate_req - BOOLEAN
+	The incoming connection is tied to a specific listening socket when
+	the initial SYN packet is received during the three-way handshake.
+	When a listener is closed, in-flight request sockets during the
+	handshake and established sockets in the accept queue are aborted.
+
+	If the listener has SO_REUSEPORT enabled, other listeners on the
+	same port should have been able to accept such connections. This
+	option makes it possible to migrate such child sockets to another
+	listener after close() or shutdown().
+
+	The BPF_SK_REUSEPORT_SELECT_OR_MIGRATE type of eBPF program should
+	usually be used to define the policy to pick an alive listener.
+	Otherwise, the kernel will randomly pick an alive listener only if
+	this option is enabled.
+
+	Note that migration between listeners with different settings may
+	crash applications. Let's say migration happens from listener A to
+	B, and only B has TCP_SAVE_SYN enabled. B cannot read SYN data from
+	the requests migrated from A. To avoid such a situation, cancel
+	migration by returning SK_DROP in the type of eBPF program, or
+	disable this option.
+
+	Default: 0
+
 tcp_fastopen - INTEGER
 	Enable TCP Fast Open (RFC7413) to send and accept data in the opening
 	SYN packet.
+1
include/net/netns/ipv4.h
@@ -126,6 +126,7 @@
 	u8 sysctl_tcp_syn_retries;
 	u8 sysctl_tcp_synack_retries;
 	u8 sysctl_tcp_syncookies;
+	u8 sysctl_tcp_migrate_req;
 	int sysctl_tcp_reordering;
 	u8 sysctl_tcp_retries1;
 	u8 sysctl_tcp_retries2;
+9
net/ipv4/sysctl_net_ipv4.c
@@ -961,6 +961,15 @@
 	},
 #endif
 	{
+		.procname	= "tcp_migrate_req",
+		.data		= &init_net.ipv4.sysctl_tcp_migrate_req,
+		.maxlen		= sizeof(u8),
+		.mode		= 0644,
+		.proc_handler	= proc_dou8vec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE
+	},
+	{
 		.procname	= "tcp_reordering",
 		.data		= &init_net.ipv4.sysctl_tcp_reordering,
 		.maxlen		= sizeof(int),
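The table entry above stores the value in a u8 and bounds it with extra1 = SYSCTL_ZERO and extra2 = SYSCTL_ONE, so only 0 and 1 should be accepted. The sketch below is a userspace model of that range check, not the kernel's proc_dou8vec_minmax implementation: the function name `model_write_u8_minmax` is hypothetical, and returning -EINVAL for rejected input is an assumption about how an out-of-range write surfaces.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace model (NOT kernel code) of a min/max-bounded u8 sysctl write.
 * Parses a decimal string and stores it in *dst only when it lies in
 * [min, max]. Returns 0 on success and -EINVAL otherwise (assumed code). */
int model_write_u8_minmax(const char *input, uint8_t min, uint8_t max,
			  uint8_t *dst)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(input, &end, 10);

	/* Reject empty or non-numeric input and overflow. */
	if (errno || end == input)
		return -EINVAL;

	/* Allow one trailing newline, as written by `echo 1 > file`. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* The bounds play the role of extra1 and extra2 in the table entry. */
	if (val < min || val > max)
		return -EINVAL;

	*dst = (uint8_t)val;
	return 0;
}
```

With bounds 0 and 1, the inputs "0" and "1\n" are stored, while "2" or "abc" leave the destination untouched. This mirrors the BOOLEAN type given in the documentation hunk.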