sched/deadline: Fix 'stuck' dl_server

Andrea reported the dl_server getting stuck for him. He tracked it
down to a state where dl_server_start() sees dl_defer_running==1
although the dl_server's job is no longer valid by that time.

In the state diagram this corresponds to [4] D->A (or dl_server_stop()
due to no more runnable tasks) followed by [1], which in case of a
lapsed deadline must then be A->B.

Now our A has dl_defer_running==1, while B demands
dl_defer_running==0, therefore it must get cleared when the CBS wakeup
rules demand a replenish.
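
As a rough user-space illustration of the above (not kernel code), the
sketch below models only the flags involved. All model_* names are
hypothetical, the dl_entity_overflow() half of the wakeup test is left
out, and the arming condition in the replenish helper is an assumption
about how replenish_dl_new_period() treats a deferrable server.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's sched_dl_entity. */
struct model_dl_se {
	uint64_t deadline;	/* absolute deadline of the current job */
	uint64_t dl_period;	/* server period */
	bool dl_defer;		/* server is deferrable */
	bool dl_defer_running;
	bool dl_defer_armed;
	bool dl_throttled;
};

/*
 * Assumed behaviour of replenish_dl_new_period() for a deferrable
 * server: start a new period and arm the deferral, unless
 * dl_defer_running is still set.
 */
static void model_replenish_new_period(struct model_dl_se *se, uint64_t now)
{
	se->deadline = now + se->dl_period;
	if (se->dl_defer && !se->dl_defer_running) {
		se->dl_throttled = true;
		se->dl_defer_armed = true;
	}
}

/* Model of the WAKEUP path; only the lapsed-deadline test is modelled. */
static void model_update_on_wakeup(struct model_dl_se *se, uint64_t now,
				   bool apply_fix)
{
	if (se->deadline < now) {
		if (apply_fix)
			se->dl_defer_running = false;	/* the one-line fix */
		model_replenish_new_period(se, now);
	} else if (se->dl_defer && !se->dl_defer_running) {
		se->dl_defer_armed = true;
		se->dl_throttled = true;
	}
}

/* Returns true when a restart from A with a stale job lands in B. */
static bool model_restart_reaches_b(bool apply_fix)
{
	struct model_dl_se se = {
		.deadline = 100,	/* lapsed: earlier than 'now' below */
		.dl_period = 1000,
		.dl_defer = true,
		.dl_defer_running = true,	/* left over from D->A */
	};

	model_update_on_wakeup(&se, 500, apply_fix);
	return se.dl_throttled && se.dl_defer_armed && !se.dl_defer_running;
}
```

In this model, model_restart_reaches_b(false) returns false because the
stale dl_defer_running keeps the replenish from arming the deferral,
while model_restart_reaches_b(true) returns true.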

Fixes: a110a81c52a9 ("sched/deadline: Deferrable dl server")
Reported-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Tested-by: Andrea Righi <arighi@nvidia.com>
Link: https://lkml.kernel.org/r/20260123161645.2181752-1-arighi@nvidia.com
Link: https://patch.msgid.link/20260130124100.GC1079264@noisy.programming.kicks-ass.net

 kernel/sched/deadline.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1034,6 +1034,12 @@
 			return;
 		}
 
+		/*
+		 * When [4] D->A is followed by [1] A->B, dl_defer_running
+		 * needs to be cleared, otherwise it will fail to properly
+		 * start the zero-laxity timer.
+		 */
+		dl_se->dl_defer_running = 0;
 		replenish_dl_new_period(dl_se, rq);
 	} else if (dl_server(dl_se) && dl_se->dl_defer) {
 		/*
@@ -1655,6 +1661,12 @@
  *   dl_server_active = 1;
  *   enqueue_dl_entity()
  *     update_dl_entity(WAKEUP)
+ *       if (dl_time_before() || dl_entity_overflow())
+ *         dl_defer_running = 0;
+ *         replenish_dl_new_period();
+ *           // fwd period
+ *           dl_throttled = 1;
+ *           dl_defer_armed = 1;
  *       if (!dl_defer_running)
  *         dl_defer_armed = 1;
  *         dl_throttled = 1;