Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

sched/mmcid: Use cpumask_weighted_or()

Use cpumask_weighted_or() instead of cpumask_or() followed by
cpumask_weight() on the result, which walks the same bitmap twice. The
fused variant results in 10-20% fewer cycles, which reduces the runqueue
lock hold time.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Link: https://patch.msgid.link/20251119172549.511736272@linutronix.de

Authored by Thomas Gleixner, committed by Peter Zijlstra
79c11fb3 437cb3de

+3 -2
kernel/sched/core.c
@@ -10377,6 +10377,7 @@
 static inline void mm_update_cpus_allowed(struct mm_struct *mm, const struct cpumask *affmsk)
 {
 	struct cpumask *mm_allowed;
+	unsigned int weight;
 
 	if (!mm)
 		return;
@@ -10388,8 +10389,8 @@
 	 */
 	guard(raw_spinlock)(&mm->mm_cid.lock);
 	mm_allowed = mm_cpus_allowed(mm);
-	cpumask_or(mm_allowed, mm_allowed, affmsk);
-	WRITE_ONCE(mm->mm_cid.nr_cpus_allowed, cpumask_weight(mm_allowed));
+	weight = cpumask_weighted_or(mm_allowed, mm_allowed, affmsk);
+	WRITE_ONCE(mm->mm_cid.nr_cpus_allowed, weight);
 }
 
 void sched_mm_cid_exit_signals(struct task_struct *t)
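The idea behind the change can be illustrated with a minimal userspace sketch. This is not the kernel implementation: bitmap_weighted_or() and its word-at-a-time loop below are illustrative stand-ins for the cpumask helpers, showing how fusing the OR with the population count walks each bitmap word once instead of twice.

#include <assert.h>
#include <stdio.h>

#define BITS_PER_LONG (8u * (unsigned int)sizeof(unsigned long))

/*
 * Illustrative userspace stand-in for cpumask_weighted_or(): compute
 * dst = src1 | src2 and count the set bits in the same pass over the
 * bitmap words, rather than OR-ing first and re-walking the result
 * with a separate weight (popcount) pass.
 */
static unsigned int bitmap_weighted_or(unsigned long *dst,
				       const unsigned long *src1,
				       const unsigned long *src2,
				       unsigned int nbits)
{
	unsigned int nwords = (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;
	unsigned int weight = 0;

	for (unsigned int i = 0; i < nwords; i++) {
		unsigned long w = src1[i] | src2[i];

		dst[i] = w;
		weight += (unsigned int)__builtin_popcountl(w);
	}
	return weight;
}

int main(void)
{
	unsigned long a[1] = { 0xaUL };	/* bits 1 and 3 */
	unsigned long b[1] = { 0x6UL };	/* bits 1 and 2 */
	unsigned long d[1];

	unsigned int w = bitmap_weighted_or(d, a, b, BITS_PER_LONG);

	printf("weight=%u dst=%#lx\n", w, d[0]);
	assert(w == 3 && d[0] == 0xeUL);
	return 0;
}

Since the caller holds mm->mm_cid.lock across the update, collapsing the two bitmap walks into one directly shortens the critical section, which is what the 10-20% cycle reduction in the commit message refers to.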