Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

selftests: openvswitch: retry instead of sleep

There are a couple of places where the test script "sleep"s to wait for
some external condition to be met.

This is error-prone, especially on slow systems (identified in CI by
"KSFT_MACHINE_SLOW=yes").

To fix this, add an "ovs_wait" function that tries to execute a command
a few times until it succeeds. The timeout used is set to 5s for
"normal" systems and doubled if a slow CI machine is detected.

This should make the following work:

$ vng --build \
--config tools/testing/selftests/net/config \
--config kernel/configs/debug.config

$ vng --run . --user root -- "make -C tools/testing/selftests/ \
KSFT_MACHINE_SLOW=yes TARGETS=net/openvswitch run_tests"

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/20240710090500.1655212-1-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Adrian Moreno and committed by Jakub Kicinski
5e724cb6 13cabc47

+38 -8
+37 -8
tools/testing/selftests/net/openvswitch/openvswitch.sh
···
 PAUSE_ON_FAIL=no
 VERBOSE=0
 TRACING=0
+WAIT_TIMEOUT=5
+
+if test "X$KSFT_MACHINE_SLOW" == "Xyes"; then
+	WAIT_TIMEOUT=10
+fi
 
 tests="
 	arp_ping eth-arp: Basic arp ping between two NS
···
 	[ "${ovs_dir}" != "" ] &&
 		echo "`date +"[%m-%d %H:%M:%S]"` $*" >> ${ovs_dir}/debug.log
 	[ $VERBOSE = 0 ] || echo $*
+}
+
+ovs_wait() {
+	info "waiting $WAIT_TIMEOUT s for: $@"
+
+	if "$@" ; then
+		info "wait succeeded immediately"
+		return 0
+	fi
+
+	# A quick re-check helps speed up small races in fast systems.
+	# However, fractional sleeps might not necessarily work.
+	local start=0
+	sleep 0.1 || { sleep 1; start=1; }
+
+	for (( i=start; i<WAIT_TIMEOUT; i++ )); do
+		if "$@" ; then
+			info "wait succeeded after $i seconds"
+			return 0
+		fi
+		sleep 1
+	done
+	info "wait failed after $i seconds"
+	return 1
 }
 
 ovs_base=`pwd`
···
 
 	# Record psample data.
 	ovs_spawn_daemon "test_psample" python3 $ovs_base/ovs-dpctl.py psample-events
+	ovs_wait grep -q "listening for psample events" ${ovs_dir}/stdout
 
 	# Send a single ping.
-	sleep 1
 	ovs_sbx "test_psample" ip netns exec client ping -I c1 172.31.110.20 -c 1 || return 1
-	sleep 1
 
 	# We should have received one userspace action upcall and 2 psample packets.
-	grep -E "userspace action command" $ovs_dir/s0.out >/dev/null 2>&1 || return 1
+	ovs_wait grep -q "userspace action command" $ovs_dir/s0.out || return 1
 
 	# client -> server samples should only contain the first 14 bytes of the packet.
-	grep -E "rate:4294967295,group:1,cookie:c0ffee data:[0-9a-f]{28}$" \
-		$ovs_dir/stdout >/dev/null 2>&1 || return 1
-	grep -E "rate:4294967295,group:2,cookie:eeff0c" \
-		$ovs_dir/stdout >/dev/null 2>&1 || return 1
+	ovs_wait grep -qE "rate:4294967295,group:1,cookie:c0ffee data:[0-9a-f]{28}$" \
+		$ovs_dir/stdout || return 1
+
+	ovs_wait grep -q "rate:4294967295,group:2,cookie:eeff0c" $ovs_dir/stdout || return 1
 
 	return 0
 }
···
 	ovs_add_netns_and_veths "test_upcall_interfaces" ui0 upc left0 l0 \
 		172.31.110.1/24 -u || return 1
 
-	sleep 1
+	ovs_wait grep -q "listening on upcall packet handler" ${ovs_dir}/left0.out
+
 	info "sending arping"
 	ip netns exec upc arping -I l0 172.31.110.20 -c 1 \
 		>$ovs_dir/arping.stdout 2>$ovs_dir/arping.stderr
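The retry pattern behind "ovs_wait" can be tried outside the selftest. The following is a minimal sketch and not the code from this commit: "wait_for" is a hypothetical name, the loop is written in portable sh rather than bash, it omits the quick 0.1s re-check, and the marker file and background writer exist only for illustration.

```shell
#!/bin/sh
# Hypothetical stand-alone variant of the retry helper (not the kernel's ovs_wait).
WAIT_TIMEOUT=5
if [ "${KSFT_MACHINE_SLOW:-no}" = "yes" ]; then
	WAIT_TIMEOUT=10
fi

# wait_for CMD [ARGS...]: run CMD once per second until it succeeds,
# giving up after WAIT_TIMEOUT attempts.
wait_for() {
	wf_elapsed=0
	while [ "$wf_elapsed" -lt "$WAIT_TIMEOUT" ]; do
		"$@" && return 0
		sleep 1
		wf_elapsed=$((wf_elapsed + 1))
	done
	return 1
}

marker_file=$(mktemp)

# Stand-in for a daemon that becomes ready after roughly one second.
( sleep 1; echo "listening for events" >> "$marker_file" ) &

if wait_for grep -q "listening for events" "$marker_file"; then
	wait_result="ready"
else
	wait_result="timeout"
fi
echo "daemon state: $wait_result"

# A condition that never becomes true exhausts the retries and fails.
WAIT_TIMEOUT=1
if wait_for grep -q "no such line" "$marker_file"; then
	miss_result="ready"
else
	miss_result="timeout"
fi
echo "missing marker: $miss_result"

wait
rm -f "$marker_file"
```

Compared with a fixed "sleep 1", the caller returns as soon as the condition holds and gets an explicit failure when it never does, which is why the call sites in the diff can keep their "|| return 1".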
+1
tools/testing/selftests/net/openvswitch/ovs-dpctl.py
···
     marshal_class = psample_msg
 
     def read_samples(self):
+        print("listening for psample events", flush=True)
         while True:
             try:
                 for msg in self.get():
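The "flush=True" argument matters because the test greps a file for this marker, and Python block-buffers stdout when it is not attached to a terminal. The sketch below is an illustration under stated assumptions, not part of the commit: it assumes python3 is on PATH, and the temporary files and one-line programs are made up for the demonstration.

```shell
#!/bin/sh
# Illustrative only: compares a flushed and an unflushed marker line.
# Assumes python3 is on PATH; the file names are invented for this sketch.
demo_skipped=no
flushed_seen=no
unflushed_seen=no

if command -v python3 >/dev/null 2>&1; then
	plain_out=$(mktemp)
	flushed_out=$(mktemp)

	# An empty PYTHONUNBUFFERED keeps Python's default buffering in effect.
	PYTHONUNBUFFERED= python3 -c \
		'import time; print("listening for psample events"); time.sleep(5)' \
		> "$plain_out" &
	plain_pid=$!

	PYTHONUNBUFFERED= python3 -c \
		'import time; print("listening for psample events", flush=True); time.sleep(5)' \
		> "$flushed_out" &
	flushed_pid=$!

	# Poll for the flushed marker for up to about four seconds.
	tries=0
	while [ "$tries" -lt 4 ]; do
		if grep -q "listening for psample events" "$flushed_out"; then
			flushed_seen=yes
			break
		fi
		sleep 1
		tries=$((tries + 1))
	done

	# Without flush=True the line has not reached the file yet.
	if grep -q "listening for psample events" "$plain_out"; then
		unflushed_seen=yes
	fi

	kill "$plain_pid" "$flushed_pid" 2>/dev/null || true
	wait
	rm -f "$plain_out" "$flushed_out"
else
	demo_skipped=yes
	echo "python3 not found, skipping"
fi

echo "flushed marker visible:   $flushed_seen"
echo "unflushed marker visible: $unflushed_seen"
```

Without the flush, a waiter that greps the redirected output could time out even though the daemon is already listening.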