Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

locking/ww_mutex/test: Make sure we bail out instead of livelock

I've seen what appear to be livelocks in the stress_inorder_work()
function, and looking at the code it is clear we can have a case
where we continually retry acquiring the locks and never check
whether we have passed the specified timeout.

This patch reworks that function so we always check the timeout
before iterating through the loop again.

I believe others may have hit this previously here:

https://lore.kernel.org/lkml/895ef450-4fb3-5d29-a6ad-790657106a5a@intel.com/

Reported-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230922043616.19282-4-jstultz@google.com

Authored by John Stultz and committed by Ingo Molnar
cfa92b6d bccdd808

+5 -4
kernel/locking/test-ww_mutex.c
@@ -465,17 +465,18 @@
 			ww_mutex_unlock(&locks[order[n]]);
 
 		if (err == -EDEADLK) {
-			ww_mutex_lock_slow(&locks[order[contended]], &ctx);
-			goto retry;
+			if (!time_after(jiffies, stress->timeout)) {
+				ww_mutex_lock_slow(&locks[order[contended]], &ctx);
+				goto retry;
+			}
 		}
 
+		ww_acquire_fini(&ctx);
 		if (err) {
 			pr_err_once("stress (%s) failed with %d\n",
 				    __func__, err);
 			break;
 		}
-
-		ww_acquire_fini(&ctx);
 	} while (!time_after(jiffies, stress->timeout));
 
 	kfree(order);
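
The control-flow change can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: `struct sim`, `sim_lock_all()`, `sim_timed_out()` and `stress_loop()` are hypothetical names invented for this example. A simulated tick counter stands in for `jiffies` and `stress->timeout`, and the lock attempt is assumed to always report contention, which is the worst case for the old code. Unlike `time_after()`, the simple comparison here does not handle counter wraparound.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for jiffies and stress->timeout. */
struct sim {
	unsigned long now;      /* simulated tick counter */
	unsigned long timeout;  /* deadline, in ticks */
	unsigned long attempts; /* number of acquisition attempts made */
};

/*
 * Simulated lock acquisition: always reports contention (-EDEADLK).
 * Each attempt advances the simulated clock by one tick.
 */
static int sim_lock_all(struct sim *s)
{
	s->attempts++;
	s->now++;
	return -EDEADLK;
}

/* Simplified deadline check; assumes the counter never wraps. */
static bool sim_timed_out(const struct sim *s)
{
	return s->now > s->timeout;
}

/* Retry loop that checks the deadline before every retry. */
static int stress_loop(struct sim *s)
{
	int err;

	do {
retry:
		err = sim_lock_all(s);
		if (err == -EDEADLK) {
			/* Retry only while the deadline has not passed. */
			if (!sim_timed_out(s))
				goto retry;
		}
		/* Past the deadline the error falls through and ends the loop. */
		if (err)
			break;
	} while (!sim_timed_out(s));

	return err;
}
```

With an unconditional `goto retry` in the `-EDEADLK` branch, a lock attempt that always reports contention would never reach the `while` condition, so the deadline would never be consulted. With the check in place, the loop exits with the error once the simulated clock passes the deadline.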