Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

sched_ext: Rename scx_bpf_consume() to scx_bpf_dsq_move_to_local()

In the sched_ext API, a repeatedly reported pain point is the overuse of the
verb "dispatch" and the confusion around "consume":

- ops.dispatch()
- scx_bpf_dispatch[_vtime]()
- scx_bpf_consume()
- scx_bpf_dispatch[_vtime]_from_dsq*()

This overloading of the term is historical. Originally, there were only
built-in DSQs and moving a task into a DSQ always dispatched it for
execution. Using the verb "dispatch" for the kfuncs to move tasks into these
DSQs made sense.

Later, user DSQs were added and scx_bpf_dispatch[_vtime]() was updated to be
able to insert tasks into any DSQ. The only allowed DSQ-to-DSQ transfer was
from a non-local DSQ to a local DSQ, and this operation was named "consume".
This was already confusing, as a task could be dispatched to a user DSQ from
ops.enqueue() and then the DSQ would have to be consumed in ops.dispatch().
The later addition of scx_bpf_dispatch_from_dsq*() made the confusion even
worse, as "dispatch" in this context meant moving a task to an arbitrary DSQ
from a user DSQ.

Clean up the API with the following renames:

1. scx_bpf_dispatch[_vtime]() -> scx_bpf_dsq_insert[_vtime]()
2. scx_bpf_consume() -> scx_bpf_dsq_move_to_local()
3. scx_bpf_dispatch[_vtime]_from_dsq*() -> scx_bpf_dsq_move[_vtime]*()

This patch performs the second rename. Compatibility is maintained by:

- The previous kfunc names are still provided by the kernel so that old
binaries can run. The kernel generates a warning when the old names are used.

- compat.bpf.h provides wrappers for the new names which automatically fall
back to the old names when running on older kernels. They also trigger a
build error if the old names are used in new builds.

The compat features will be dropped after v6.15.

v2: Comment and documentation updates.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Changwoo Min <changwoo@igalia.com>
Acked-by: Johannes Bechberger <me@mostlynerdless.de>
Acked-by: Giovanni Gherdovich <ggherdovich@suse.com>
Cc: Dan Schatzberg <dschatzberg@meta.com>
Cc: Ming Yang <yougmark94@gmail.com>

+58 -37
+10 -11
Documentation/scheduler/sched-ext.rst
···
203 203   ``scx_bpf_destroy_dsq()``.
204 204
205 205   A CPU always executes a task from its local DSQ. A task is "inserted" into a
206     - DSQ. A non-local DSQ is "consumed" to transfer a task to the consuming CPU's
207     - local DSQ.
    206 + DSQ. A task in a non-local DSQ is "move"d into the target CPU's local DSQ.
208 207
209 208   When a CPU is looking for the next task to run, if the local DSQ is not
210     - empty, the first task is picked. Otherwise, the CPU tries to consume the
211     - global DSQ. If that doesn't yield a runnable task either, ``ops.dispatch()``
212     - is invoked.
    209 + empty, the first task is picked. Otherwise, the CPU tries to move a task
    210 + from the global DSQ. If that doesn't yield a runnable task either,
    211 + ``ops.dispatch()`` is invoked.
213 212
214 213   Scheduling Cycle
215 214   ----------------
···
264 265      rather than performing them immediately. There can be up to
265 266      ``ops.dispatch_max_batch`` pending tasks.
266 267
267     -  * ``scx_bpf_consume()`` tranfers a task from the specified non-local DSQ
268     -    to the dispatching DSQ. This function cannot be called with any BPF
269     -    locks held. ``scx_bpf_consume()`` flushes the pending dispatched tasks
270     -    before trying to consume the specified DSQ.
    268 +  * ``scx_bpf_dsq_move_to_local()`` moves a task from the specified non-local
    269 +    DSQ to the dispatching DSQ. This function cannot be called with any BPF
    270 +    locks held. ``scx_bpf_dsq_move_to_local()`` flushes the pending inserted
    271 +    tasks before trying to move from the specified DSQ.
271 272
272 273   4. After ``ops.dispatch()`` returns, if there are tasks in the local DSQ,
273 274      the CPU runs the first one. If empty, the following steps are taken:
274 275
275     -  * Try to consume the global DSQ. If successful, run the task.
    276 +  * Try to move from the global DSQ. If successful, run the task.
276 277
277 278   * If ``ops.dispatch()`` has dispatched any tasks, retry #3.
278 279
···
285 286   in ``ops.enqueue()`` as illustrated in the above simple example. If only the
286 287   built-in DSQs are used, there is no need to implement ``ops.dispatch()`` as
287 288   a task is never queued on the BPF scheduler and both the local and global
288     - DSQs are consumed automatically.
    289 + DSQs are executed automatically.
289 290
290 291   ``scx_bpf_dsq_insert()`` inserts the task on the FIFO of the target DSQ. Use
291 292   ``scx_bpf_dsq_insert_vtime()`` for the priority queue. Internal DSQs such as
+28 -17
kernel/sched/ext.c
···
264 264 	void (*dequeue)(struct task_struct *p, u64 deq_flags);
265 265
266 266 	/**
267     - 	 * dispatch - Dispatch tasks from the BPF scheduler and/or consume DSQs
    267 + 	 * dispatch - Dispatch tasks from the BPF scheduler and/or user DSQs
268 268 	 * @cpu: CPU to dispatch tasks for
269 269 	 * @prev: previous task being switched out
270 270 	 *
271 271 	 * Called when a CPU's local dsq is empty. The operation should dispatch
272 272 	 * one or more tasks from the BPF scheduler into the DSQs using
273     - 	 * scx_bpf_dsq_insert() and/or consume user DSQs into the local DSQ
274     - 	 * using scx_bpf_consume().
    273 + 	 * scx_bpf_dsq_insert() and/or move from user DSQs into the local DSQ
    274 + 	 * using scx_bpf_dsq_move_to_local().
275 275 	 *
276 276 	 * The maximum number of times scx_bpf_dsq_insert() can be called
277     - 	 * without an intervening scx_bpf_consume() is specified by
    277 + 	 * without an intervening scx_bpf_dsq_move_to_local() is specified by
278 278 	 * ops.dispatch_max_batch. See the comments on top of the two functions
279 279 	 * for more details.
280 280 	 *
···
282 282 	 * @prev is still runnable as indicated by set %SCX_TASK_QUEUED in
283 283 	 * @prev->scx.flags, it is not enqueued yet and will be enqueued after
284 284 	 * ops.dispatch() returns. To keep executing @prev, return without
285     - 	 * dispatching or consuming any tasks. Also see %SCX_OPS_ENQ_LAST.
    285 + 	 * dispatching or moving any tasks. Also see %SCX_OPS_ENQ_LAST.
286 286 	 */
287 287 	void (*dispatch)(s32 cpu, struct task_struct *prev);
288 288
···
6372 6372  * @enq_flags: SCX_ENQ_*
6373 6373  *
6374 6374  * Insert @p into the vtime priority queue of the DSQ identified by @dsq_id.
6375      - * Tasks queued into the priority queue are ordered by @vtime and always
6376      - * consumed after the tasks in the FIFO queue. All other aspects are identical
6377      - * to scx_bpf_dsq_insert().
     6375 + * Tasks queued into the priority queue are ordered by @vtime. All other aspects
     6376 + * are identical to scx_bpf_dsq_insert().
6378 6377  *
6379 6378  * @vtime ordering is according to time_before64() which considers wrapping. A
6380 6379  * numerically larger vtime may indicate an earlier position in the ordering and
6381 6380  * vice-versa.
     6381 + *
     6382 + * A DSQ can only be used as a FIFO or priority queue at any given time and this
     6383 + * function must not be called on a DSQ which already has one or more FIFO tasks
     6384 + * queued and vice-versa. Also, the built-in DSQs (SCX_DSQ_LOCAL and
     6385 + * SCX_DSQ_GLOBAL) cannot be used as priority queues.
6382 6386  */
6383 6387 __bpf_kfunc void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id,
6384 6388 					  u64 slice, u64 vtime, u64 enq_flags)
···
6543 6539 }
6544 6540
6545 6541 /**
6546      - * scx_bpf_consume - Transfer a task from a DSQ to the current CPU's local DSQ
6547      - * @dsq_id: DSQ to consume
     6542 + * scx_bpf_dsq_move_to_local - move a task from a DSQ to the current CPU's local DSQ
     6543 + * @dsq_id: DSQ to move task from
6548 6544  *
6549      - * Consume a task from the non-local DSQ identified by @dsq_id and transfer it
6550      - * to the current CPU's local DSQ for execution. Can only be called from
6551      - * ops.dispatch().
     6545 + * Move a task from the non-local DSQ identified by @dsq_id to the current CPU's
     6546 + * local DSQ for execution. Can only be called from ops.dispatch().
6552 6547  *
6553 6548  * This function flushes the in-flight dispatches from scx_bpf_dsq_insert()
6554      - * before trying to consume the specified DSQ. It may also grab rq locks and
     6549 + * before trying to move from the specified DSQ. It may also grab rq locks and
6555 6550  * thus can't be called under any BPF locks.
6556 6551  *
6557      - * Returns %true if a task has been consumed, %false if there isn't any task to
6558      - * consume.
     6552 + * Returns %true if a task has been moved, %false if there isn't any task to
     6553 + * move.
6559 6554  */
6560      - __bpf_kfunc bool scx_bpf_consume(u64 dsq_id)
     6555 + __bpf_kfunc bool scx_bpf_dsq_move_to_local(u64 dsq_id)
6561 6556 {
6562 6557 	struct scx_dsp_ctx *dspc = this_cpu_ptr(scx_dsp_ctx);
6563 6558 	struct scx_dispatch_q *dsq;
···
6584 6581 	} else {
6585 6582 		return false;
6586 6583 	}
     6584 + }
     6585 +
     6586 + /* for backward compatibility, will be removed in v6.15 */
     6587 + __bpf_kfunc bool scx_bpf_consume(u64 dsq_id)
     6588 + {
     6589 + 	printk_deferred_once(KERN_WARNING "sched_ext: scx_bpf_consume() renamed to scx_bpf_dsq_move_to_local()");
     6590 + 	return scx_bpf_dsq_move_to_local(dsq_id);
6587 6591 }
6588 6592
6589 6593 /**
···
6694 6684 BTF_KFUNCS_START(scx_kfunc_ids_dispatch)
6695 6685 BTF_ID_FLAGS(func, scx_bpf_dispatch_nr_slots)
6696 6686 BTF_ID_FLAGS(func, scx_bpf_dispatch_cancel)
     6687 + BTF_ID_FLAGS(func, scx_bpf_dsq_move_to_local)
6697 6688 BTF_ID_FLAGS(func, scx_bpf_consume)
6698 6689 BTF_ID_FLAGS(func, scx_bpf_dispatch_from_dsq_set_slice)
6699 6690 BTF_ID_FLAGS(func, scx_bpf_dispatch_from_dsq_set_vtime)
+1 -1
tools/sched_ext/include/scx/common.bpf.h
···
40 40 void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __ksym __weak;
41 41 u32 scx_bpf_dispatch_nr_slots(void) __ksym;
42 42 void scx_bpf_dispatch_cancel(void) __ksym;
43    - bool scx_bpf_consume(u64 dsq_id) __ksym;
   43 + bool scx_bpf_dsq_move_to_local(u64 dsq_id) __ksym;
44 44 void scx_bpf_dispatch_from_dsq_set_slice(struct bpf_iter_scx_dsq *it__iter, u64 slice) __ksym __weak;
45 45 void scx_bpf_dispatch_from_dsq_set_vtime(struct bpf_iter_scx_dsq *it__iter, u64 vtime) __ksym __weak;
46 46 bool scx_bpf_dispatch_from_dsq(struct bpf_iter_scx_dsq *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __ksym __weak;
+11
tools/sched_ext/include/scx/compat.bpf.h
···
43 43  */
44 44 void scx_bpf_dispatch___compat(struct task_struct *p, u64 dsq_id, u64 slice, u64 enq_flags) __ksym __weak;
45 45 void scx_bpf_dispatch_vtime___compat(struct task_struct *p, u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags) __ksym __weak;
   46 + bool scx_bpf_consume___compat(u64 dsq_id) __ksym __weak;
46 47
47 48 #define scx_bpf_dsq_insert(p, dsq_id, slice, enq_flags) \
48 49 	(bpf_ksym_exists(scx_bpf_dsq_insert) ? \
···
55 54 	 scx_bpf_dsq_insert_vtime((p), (dsq_id), (slice), (vtime), (enq_flags)) : \
56 55 	 scx_bpf_dispatch_vtime___compat((p), (dsq_id), (slice), (vtime), (enq_flags)))
57 56
   57 + #define scx_bpf_dsq_move_to_local(dsq_id) \
   58 + 	(bpf_ksym_exists(scx_bpf_dsq_move_to_local) ? \
   59 + 	 scx_bpf_dsq_move_to_local((dsq_id)) : \
   60 + 	 scx_bpf_consume___compat((dsq_id)))
   61 +
58 62 #define scx_bpf_dispatch(p, dsq_id, slice, enq_flags) \
59 63 	_Static_assert(false, "scx_bpf_dispatch() renamed to scx_bpf_dsq_insert()")
60 64
61 65 #define scx_bpf_dispatch_vtime(p, dsq_id, slice, vtime, enq_flags) \
62 66 	_Static_assert(false, "scx_bpf_dispatch_vtime() renamed to scx_bpf_dsq_insert_vtime()")
   67 +
   68 + #define scx_bpf_consume(dsq_id) ({ \
   69 + 	_Static_assert(false, "scx_bpf_consume() renamed to scx_bpf_dsq_move_to_local()"); \
   70 + 	false; \
   71 + })
63 72
64 73 /*
65 74  * Define sched_ext_ops. This may be expanded to define multiple variants for
+2 -2
tools/sched_ext/scx_central.bpf.c
···
219 219 		}
220 220
221 221 		/* look for a task to run on the central CPU */
222     - 		if (scx_bpf_consume(FALLBACK_DSQ_ID))
    222 + 		if (scx_bpf_dsq_move_to_local(FALLBACK_DSQ_ID))
223 223 			return;
224 224 		dispatch_to_cpu(central_cpu);
225 225 	} else {
226 226 		bool *gimme;
227 227
228     - 		if (scx_bpf_consume(FALLBACK_DSQ_ID))
    228 + 		if (scx_bpf_dsq_move_to_local(FALLBACK_DSQ_ID))
229 229 			return;
230 230
231 231 		gimme = ARRAY_ELEM_PTR(cpu_gimme_task, cpu, nr_cpu_ids);
+3 -3
tools/sched_ext/scx_flatcg.bpf.c
···
665 665 		goto out_free;
666 666 	}
667 667
668     - 	if (!scx_bpf_consume(cgid)) {
    668 + 	if (!scx_bpf_dsq_move_to_local(cgid)) {
669 669 		bpf_cgroup_release(cgrp);
670 670 		stat_inc(FCG_STAT_PNC_EMPTY);
671 671 		goto out_stash;
···
745 745 		goto pick_next_cgroup;
746 746
747 747 	if (vtime_before(now, cpuc->cur_at + cgrp_slice_ns)) {
748     - 		if (scx_bpf_consume(cpuc->cur_cgid)) {
    748 + 		if (scx_bpf_dsq_move_to_local(cpuc->cur_cgid)) {
749 749 			stat_inc(FCG_STAT_CNS_KEEP);
750 750 			return;
751 751 		}
···
785 785 pick_next_cgroup:
786 786 	cpuc->cur_at = now;
787 787
788     - 	if (scx_bpf_consume(FALLBACK_DSQ)) {
    788 + 	if (scx_bpf_dsq_move_to_local(FALLBACK_DSQ)) {
789 789 		cpuc->cur_cgid = 0;
790 790 		return;
791 791 	}
+2 -2
tools/sched_ext/scx_qmap.bpf.c
···
374 374 	if (dispatch_highpri(false))
375 375 		return;
376 376
377     - 	if (!nr_highpri_queued && scx_bpf_consume(SHARED_DSQ))
    377 + 	if (!nr_highpri_queued && scx_bpf_dsq_move_to_local(SHARED_DSQ))
378 378 		return;
379 379
380 380 	if (dsp_inf_loop_after && nr_dispatched > dsp_inf_loop_after) {
···
439 439 	if (!batch || !scx_bpf_dispatch_nr_slots()) {
440 440 		if (dispatch_highpri(false))
441 441 			return;
442     - 		scx_bpf_consume(SHARED_DSQ);
    442 + 		scx_bpf_dsq_move_to_local(SHARED_DSQ);
443 443 		return;
444 444 	}
445 445 	if (!cpuc->dsp_cnt)
+1 -1
tools/sched_ext/scx_simple.bpf.c
···
94 94
95 95 void BPF_STRUCT_OPS(simple_dispatch, s32 cpu, struct task_struct *prev)
96 96 {
97    - 	scx_bpf_consume(SHARED_DSQ);
    97 + 	scx_bpf_dsq_move_to_local(SHARED_DSQ);
98 98 }
99 99
100 100 void BPF_STRUCT_OPS(simple_running, struct task_struct *p)