nilfs2: fix unexpected freezing of nilfs_segctor_sync()

A reproducible race has been identified in which nilfs_segctor_sync()
stays blocked even after the log writer thread has written a checkpoint,
until an interrupt or some other trigger resumes log writing.

The cause is that, depending on its execution timing, the log writer
thread running in parallel may skip responding to nilfs_segctor_sync(),
so the task sleeping in schedule() within nilfs_segctor_sync() while
waiting for completion loses its opportunity to be woken up.

The wake-up of the task waiting in nilfs_segctor_sync() may be skipped
because two operations are not performed atomically: updating the request
generation, which is issued from a shared sequence counter, and adding a
wait queue entry to the log writer's request wait queue. Log writing and
the request completion notification by nilfs_segctor_wakeup() may occur
between the two operations. In that case, the wait queue entry is not yet
visible to nilfs_segctor_wakeup(), and the wake-up of nilfs_segctor_sync()
is deferred until the next request occurs.

Fix this issue by performing these two operations together within the
sc_state_lock critical section. Also, following the memory barrier
guidelines for event waiting loops, move the set_current_state() call in
the same function into the event waiting loop, so that a memory barrier is
inserted just before the event condition is evaluated.

Link: https://lkml.kernel.org/r/20240520132621.4054-3-konishi.ryusuke@gmail.com
Fixes: 9ff05123e3bf ("nilfs2: segment constructor")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: "Bai, Shuangpeng" <sjb7183@psu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

authored by Ryusuke Konishi and committed by Andrew Morton 936184ea f5d4e046

+13 -4
fs/nilfs2/segment.c
@@ -2168,19 +2168,28 @@
 	struct nilfs_segctor_wait_request wait_req;
 	int err = 0;
 
-	spin_lock(&sci->sc_state_lock);
 	init_wait(&wait_req.wq);
 	wait_req.err = 0;
 	atomic_set(&wait_req.done, 0);
+	init_waitqueue_entry(&wait_req.wq, current);
+
+	/*
+	 * To prevent a race issue where completion notifications from the
+	 * log writer thread are missed, increment the request sequence count
+	 * "sc_seq_request" and insert a wait queue entry using the current
+	 * sequence number into the "sc_wait_request" queue at the same time
+	 * within the lock section of "sc_state_lock".
+	 */
+	spin_lock(&sci->sc_state_lock);
 	wait_req.seq = ++sci->sc_seq_request;
+	add_wait_queue(&sci->sc_wait_request, &wait_req.wq);
 	spin_unlock(&sci->sc_state_lock);
 
-	init_waitqueue_entry(&wait_req.wq, current);
-	add_wait_queue(&sci->sc_wait_request, &wait_req.wq);
-	set_current_state(TASK_INTERRUPTIBLE);
 	wake_up(&sci->sc_wait_daemon);
 
 	for (;;) {
+		set_current_state(TASK_INTERRUPTIBLE);
+
 		if (atomic_read(&wait_req.done)) {
 			err = wait_req.err;
 			break;
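
The ordering rule behind this fix can be illustrated outside the kernel. The following is a minimal userspace sketch using POSIX threads. It is an analogy under stated assumptions, not the nilfs2 implementation: struct waiter, struct writer_ctx, and every function name here are hypothetical, a pthread mutex stands in for sc_state_lock, and a condition variable stands in for the kernel wait queue. The point it shows is that the waiter bumps the request generation and links itself into the waiter list in one critical section, so any writer that observes the new generation is guaranteed to find the waiter as well.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these names exist in nilfs2. */
struct waiter {
	uint32_t seq;		/* request generation this waiter needs written */
	bool done;		/* set by the writer once that generation is covered */
	struct waiter *next;
};

struct writer_ctx {
	pthread_mutex_t lock;	/* plays the role of sc_state_lock */
	pthread_cond_t done_cv;	/* signalled when requests complete */
	uint32_t seq_request;	/* plays the role of sc_seq_request */
	struct waiter *waiters;	/* plays the role of sc_wait_request */
};

#define WRITER_CTX_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, NULL }

/* True if generation a is at or ahead of b, tolerating 32-bit wraparound. */
static bool seq_ge(uint32_t a, uint32_t b)
{
	return (uint32_t)(a - b) < 0x80000000u;
}

/* Waiter side: take a new generation AND enqueue in one critical section. */
static void request_begin(struct writer_ctx *c, struct waiter *w)
{
	w->done = false;
	pthread_mutex_lock(&c->lock);
	w->seq = ++c->seq_request;
	w->next = c->waiters;
	c->waiters = w;
	pthread_mutex_unlock(&c->lock);
}

/* Writer side: snapshot the generation to cover before starting to write. */
static uint32_t writer_accept(struct writer_ctx *c)
{
	uint32_t accepted;

	pthread_mutex_lock(&c->lock);
	accepted = c->seq_request;
	pthread_mutex_unlock(&c->lock);
	return accepted;
}

/* Writer side: complete every queued waiter covered by 'accepted'. */
static unsigned int writer_complete(struct writer_ctx *c, uint32_t accepted)
{
	unsigned int completed = 0;
	struct waiter **pp;

	pthread_mutex_lock(&c->lock);
	pp = &c->waiters;
	while (*pp) {
		struct waiter *w = *pp;

		if (seq_ge(accepted, w->seq)) {
			*pp = w->next;	/* unlink, then mark as done */
			w->done = true;
			completed++;
		} else {
			pp = &w->next;
		}
	}
	pthread_cond_broadcast(&c->done_cv);
	pthread_mutex_unlock(&c->lock);
	return completed;
}

/* Waiter side: re-check the condition under the lock before each sleep. */
static void request_wait(struct writer_ctx *c, struct waiter *w)
{
	pthread_mutex_lock(&c->lock);
	while (!w->done)
		pthread_cond_wait(&c->done_cv, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Single-threaded walk through one request; returns 0 on success. */
static int demo(void)
{
	struct writer_ctx c = WRITER_CTX_INIT;
	struct waiter w;
	uint32_t accepted;

	request_begin(&c, &w);
	accepted = writer_accept(&c);
	assert(seq_ge(accepted, w.seq));
	if (writer_complete(&c, accepted) != 1)
		return -1;
	request_wait(&c, &w);	/* does not block: w.done is already set */
	return c.waiters == NULL ? 0 : -1;
}
```

If request_begin() instead released the lock between incrementing seq_request and linking the waiter, a writer could run writer_accept() and writer_complete() in that gap, find an empty list, and leave the waiter asleep until some later request. That is the userspace counterpart of the window this patch closes.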