Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

selftests/bpf: De-flake test_tcpbpf

It looks like the BPF program that handles the BPF_SOCK_OPS_STATE_CB
state can race with the userspace bpf_map_lookup_elem("global_map")
call; I sometimes see failures in this test, and re-running helps.

Since we know that we expect the callback to be called 3 times (one
time for listener socket, two times for both ends of the connection),
let's export this number and add simple retry logic around that.
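
The retry idea described above can be sketched in isolation. This is only an illustration, not the code from the patch: lookup_close_events() is a hypothetical stand-in for the real bpf_map_lookup_elem() call on global_map, and it simulates callbacks that are still completing by reporting one more event on each call. Only EXPECTED_CLOSE_EVENTS, the usleep(100) delay, and the bounded retry count mirror the commit.

```c
#define _DEFAULT_SOURCE /* for usleep() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* 3 comes from one listening socket + both ends of the connection */
#define EXPECTED_CLOSE_EVENTS 3

/* Hypothetical state standing in for the counter kept in the BPF map. */
static unsigned int simulated_events;

/*
 * Hypothetical stand-in for the map lookup: every call reports one more
 * close event until the expected total is reached.
 */
static int lookup_close_events(unsigned int *num_close_events)
{
	if (simulated_events < EXPECTED_CLOSE_EVENTS)
		simulated_events++;
	*num_close_events = simulated_events;
	return 0;
}

/*
 * Poll the counter until it matches EXPECTED_CLOSE_EVENTS, retrying at
 * most `retries` times after the first lookup.
 * Returns 0 on success and -1 on lookup failure or exhausted retries.
 */
static int wait_for_close_events(int retries)
{
	unsigned int seen = 0;

	assert(retries >= 0);

	for (;;) {
		if (lookup_close_events(&seen) != 0)
			return -1;
		if (seen == EXPECTED_CLOSE_EVENTS)
			return 0;
		if (retries-- <= 0)
			return -1;
		printf("Unexpected number of close events (%u), retrying!\n",
		       seen);
		usleep(100);
	}
}
```

Calling wait_for_close_events(10) corresponds to the retry budget of 10 used in the patch; bounding the loop keeps a genuinely broken test from hanging forever.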

Also, let's make EXPECT_EQ() not return on failure, but continue
evaluating all conditions; that should make potential debugging
easier.
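
A minimal sketch of the "record the failure and keep going" pattern follows. It is an illustration under stated assumptions: struct sample_result, verify_sample(), and the message format are hypothetical, while the macro shape (a ret-- on mismatch instead of a goto) follows the change in this commit. The macro assumes the calling function declares a local int ret.

```c
#include <stdio.h>

/*
 * On mismatch, print a message and decrement the caller's local `ret`
 * instead of jumping out, so later checks still run.
 */
#define EXPECT_EQ(expected, actual, fmt)				\
	do {								\
		if ((expected) != (actual)) {				\
			printf("FAILED: " #actual ": actual %" fmt	\
			       ", expected %" fmt "\n",			\
			       (actual), (expected));			\
			ret--;						\
		}							\
	} while (0)

/* Hypothetical result structure, used only for this illustration. */
struct sample_result {
	int num_listen;
	int num_close_events;
};

/* Returns 0 if every check passes, or minus the number of failed checks. */
static int verify_sample(const struct sample_result *r)
{
	int ret = 0;

	EXPECT_EQ(1, r->num_listen, "d");
	EXPECT_EQ(3, r->num_close_events, "d");

	return ret;
}
```

Because callers only test the return value for non-zero, a negative count of failures works the same way as the old -1 while letting every mismatch be reported in a single run.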

With this fix in place I don't observe the flakiness anymore.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Lawrence Brakmo <brakmo@fb.com>
Link: https://lore.kernel.org/bpf/20191204190955.170934-1-sdf@google.com

Authored by Stanislav Fomichev and committed by Alexei Starovoitov
ef8c84ef 6bf6affe

+20 -7
+1
tools/testing/selftests/bpf/progs/test_tcpbpf_kern.c
···
 			g.bytes_received = skops->bytes_received;
 			g.bytes_acked = skops->bytes_acked;
 		}
+		g.num_close_events++;
 		bpf_map_update_elem(&global_map, &key, &g,
 				    BPF_ANY);
 	}
+1
tools/testing/selftests/bpf/test_tcpbpf.h
···
 	__u64 bytes_received;
 	__u64 bytes_acked;
 	__u32 num_listen;
+	__u32 num_close_events;
 };
 #endif
+18 -7
tools/testing/selftests/bpf/test_tcpbpf_user.c
···
 
 #include "test_tcpbpf.h"
 
+/* 3 comes from one listening socket + both ends of the connection */
+#define EXPECTED_CLOSE_EVENTS 3
+
 #define EXPECT_EQ(expected, actual, fmt) \
 	do { \
 		if ((expected) != (actual)) { \
···
 			       " Actual: %" fmt "\n" \
 			       " Expected: %" fmt "\n", \
 			       (actual), (expected)); \
-			goto err; \
+			ret--; \
 		} \
 	} while (0)
 
 int verify_result(const struct tcpbpf_globals *result)
 {
 	__u32 expected_events;
+	int ret = 0;
 
 	expected_events = ((1 << BPF_SOCK_OPS_TIMEOUT_INIT) |
 			   (1 << BPF_SOCK_OPS_RWND_INIT) |
···
 	EXPECT_EQ(0x80, result->bad_cb_test_rv, PRIu32);
 	EXPECT_EQ(0, result->good_cb_test_rv, PRIu32);
 	EXPECT_EQ(1, result->num_listen, PRIu32);
+	EXPECT_EQ(EXPECTED_CLOSE_EVENTS, result->num_close_events, PRIu32);
 
-	return 0;
-err:
-	return -1;
+	return ret;
 }
 
 int verify_sockopt_result(int sock_map_fd)
 {
 	__u32 key = 0;
+	int ret = 0;
 	int res;
 	int rv;
 
···
 	rv = bpf_map_lookup_elem(sock_map_fd, &key, &res);
 	EXPECT_EQ(0, rv, "d");
 	EXPECT_EQ(1, res, "d");
-	return 0;
-err:
-	return -1;
+	return ret;
 }
 
 static int bpf_find_map(const char *test, struct bpf_object *obj,
···
 	int error = EXIT_FAILURE;
 	struct bpf_object *obj;
 	int cg_fd = -1;
+	int retry = 10;
 	__u32 key = 0;
 	int rv;
 
···
 	if (sock_map_fd < 0)
 		goto err;
 
+retry_lookup:
 	rv = bpf_map_lookup_elem(map_fd, &key, &g);
 	if (rv != 0) {
 		printf("FAILED: bpf_map_lookup_elem returns %d\n", rv);
 		goto err;
+	}
+
+	if (g.num_close_events != EXPECTED_CLOSE_EVENTS && retry--) {
+		printf("Unexpected number of close events (%d), retrying!\n",
+		       g.num_close_events);
+		usleep(100);
+		goto retry_lookup;
 	}
 
 	if (verify_result(&g)) {