Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

cpumask: fix incorrect cpumask scanning result checks

It turns out that commit 596ff4a09b89 ("cpumask: re-introduce
constant-sized cpumask optimizations") exposed a number of cases of
drivers not checking the result of "cpumask_next()" and friends
correctly.

The documented correct check for "no more cpus in the cpumask" is to
check for the result being equal or larger than the number of possible
CPU ids, exactly _because_ we've always done those constant-sized
cpumask scans using a widened type before. So the return value of a
cpumask scan should be checked with

if (cpu >= nr_cpu_ids)
...

because the cpumask scan did not necessarily stop exactly *at* that
maximum CPU id.

But a few cases ended up instead using checks like

if (cpu == nr_cpumask_bits)
...

which used that internal "widened" number of bits. And that used to
work pretty much by accident (ok, in this case "by accident" is simply
because it matched the historical internal implementation of the cpumask
scanning, so it was more of a "intentionally using implementation
details rather than an accident").

But the extended constant-sized optimizations then did that internal
implementation differently, and now that code that did things wrong but
matched the old implementation no longer worked at all.

Which then causes subsequent odd problems due to using what ends up
being an invalid CPU ID.

Most of these cases require either unusual hardware or special uses to
hit, but the random.c one triggers quite easily.

All you really need is to have a sufficiently small CONFIG_NR_CPUS value
for the bit scanning optimization to be triggered, but not enough CPUs
to then actually fill that widened cpumask. At that point, the cpumask
scanning will return the NR_CPUS constant, which is _not_ the same as
nr_cpumask_bits.

This just does the mindless fix with

sed -i 's/== nr_cpumask_bits/>= nr_cpu_ids/'

to fix the incorrect uses.

The ones in the SCSI lpfc driver in particular could probably be fixed
more cleanly by just removing that repeated pattern entirely, but I am
not emptionally invested enough in that driver to care.

Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/lkml/481b19b5-83a0-4793-b4fd-194ad7b978c3@roeck-us.net/
Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/lkml/CAMuHMdUKo_Sf7TjKzcNDa8Ve+6QrK+P8nSQrSQ=6LTRmcBKNww@mail.gmail.com/
Reported-by: Vernon Yang <vernon2gm@gmail.com>
Link: https://lore.kernel.org/lkml/20230306160651.2016767-1-vernon2gm@gmail.com/
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

+10 -10
+1 -1
arch/powerpc/xmon/xmon.c
··· 1275 1275 while (!cpumask_empty(&xmon_batch_cpus)) { 1276 1276 cpu = cpumask_next_wrap(smp_processor_id(), &xmon_batch_cpus, 1277 1277 xmon_batch_start_cpu, true); 1278 - if (cpu == nr_cpumask_bits) 1278 + if (cpu >= nr_cpu_ids) 1279 1279 break; 1280 1280 if (xmon_batch_start_cpu == -1) 1281 1281 xmon_batch_start_cpu = cpu;
+1 -1
drivers/char/random.c
··· 1311 1311 /* Basic CPU round-robin, which avoids the current CPU. */ 1312 1312 do { 1313 1313 cpu = cpumask_next(cpu, &timer_cpus); 1314 - if (cpu == nr_cpumask_bits) 1314 + if (cpu >= nr_cpu_ids) 1315 1315 cpu = cpumask_first(&timer_cpus); 1316 1316 } while (cpu == smp_processor_id() && num_cpus > 1); 1317 1317
+1 -1
drivers/net/wireguard/queueing.h
··· 106 106 { 107 107 unsigned int cpu = *stored_cpu, cpu_index, i; 108 108 109 - if (unlikely(cpu == nr_cpumask_bits || 109 + if (unlikely(cpu >= nr_cpu_ids || 110 110 !cpumask_test_cpu(cpu, cpu_online_mask))) { 111 111 cpu_index = id % cpumask_weight(cpu_online_mask); 112 112 cpu = cpumask_first(cpu_online_mask);
+7 -7
drivers/scsi/lpfc/lpfc_init.c
··· 12563 12563 goto found_same; 12564 12564 new_cpu = cpumask_next( 12565 12565 new_cpu, cpu_present_mask); 12566 - if (new_cpu == nr_cpumask_bits) 12566 + if (new_cpu >= nr_cpu_ids) 12567 12567 new_cpu = first_cpu; 12568 12568 } 12569 12569 /* At this point, we leave the CPU as unassigned */ ··· 12577 12577 * selecting the same IRQ. 12578 12578 */ 12579 12579 start_cpu = cpumask_next(new_cpu, cpu_present_mask); 12580 - if (start_cpu == nr_cpumask_bits) 12580 + if (start_cpu >= nr_cpu_ids) 12581 12581 start_cpu = first_cpu; 12582 12582 12583 12583 lpfc_printf_log(phba, KERN_INFO, LOG_INIT, ··· 12613 12613 goto found_any; 12614 12614 new_cpu = cpumask_next( 12615 12615 new_cpu, cpu_present_mask); 12616 - if (new_cpu == nr_cpumask_bits) 12616 + if (new_cpu >= nr_cpu_ids) 12617 12617 new_cpu = first_cpu; 12618 12618 } 12619 12619 /* We should never leave an entry unassigned */ ··· 12631 12631 * selecting the same IRQ. 12632 12632 */ 12633 12633 start_cpu = cpumask_next(new_cpu, cpu_present_mask); 12634 - if (start_cpu == nr_cpumask_bits) 12634 + if (start_cpu >= nr_cpu_ids) 12635 12635 start_cpu = first_cpu; 12636 12636 12637 12637 lpfc_printf_log(phba, KERN_INFO, LOG_INIT, ··· 12704 12704 goto found_hdwq; 12705 12705 } 12706 12706 new_cpu = cpumask_next(new_cpu, cpu_present_mask); 12707 - if (new_cpu == nr_cpumask_bits) 12707 + if (new_cpu >= nr_cpu_ids) 12708 12708 new_cpu = first_cpu; 12709 12709 } 12710 12710 ··· 12719 12719 goto found_hdwq; 12720 12720 12721 12721 new_cpu = cpumask_next(new_cpu, cpu_present_mask); 12722 - if (new_cpu == nr_cpumask_bits) 12722 + if (new_cpu >= nr_cpu_ids) 12723 12723 new_cpu = first_cpu; 12724 12724 } 12725 12725 ··· 12730 12730 found_hdwq: 12731 12731 /* We found an available entry, copy the IRQ info */ 12732 12732 start_cpu = cpumask_next(new_cpu, cpu_present_mask); 12733 - if (start_cpu == nr_cpumask_bits) 12733 + if (start_cpu >= nr_cpu_ids) 12734 12734 start_cpu = first_cpu; 12735 12735 cpup->hdwq = new_cpup->hdwq; 12736 12736 logit: