Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

powerpc: Default arch idle could cede processor on pseries

When adding cpuidle support to pSeries, we introduced two
regressions:

- The new cpuidle backend driver only works under hypervisors
supporting the "SPLPAR" option, which isn't the case for the
old POWER4 hypervisor and the HV "light" used on js2x blades.

- The cpuidle driver registers fairly late, meaning that for
a significant portion of the boot process, we end up having
all threads spinning. This slows down the boot process and
increases the overall resource usage if the hypervisor has
shared processors.

This fixes both by implementing a "default" idle that will cede
to the hypervisor when possible, in a very simple way without
all the bells and whistles of cpuidle.

Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>

Authored by Vaidyanathan Srinivasan and committed by Benjamin Herrenschmidt
363edbe2 88c2d0b6

+21 -10
arch/powerpc/platforms/pseries/setup.c
@@ -354,7 +354,7 @@
 }
 early_initcall(alloc_dispatch_log_kmem_cache);
 
-static void pSeries_idle(void)
+static void pseries_lpar_idle(void)
 {
 	/* This would call on the cpuidle framework, and the back-end pseries
 	 * driver to go to idle states
@@ -362,10 +362,22 @@
 	if (cpuidle_idle_call()) {
 		/* On error, execute default handler
 		 * to go into low thread priority and possibly
-		 * low power mode.
+		 * low power mode by cedeing processor to hypervisor
 		 */
-		HMT_low();
-		HMT_very_low();
+
+		/* Indicate to hypervisor that we are idle. */
+		get_lppaca()->idle = 1;
+
+		/*
+		 * Yield the processor to the hypervisor. We return if
+		 * an external interrupt occurs (which are driven prior
+		 * to returning here) or if a prod occurs from another
+		 * processor. When returning here, external interrupts
+		 * are enabled.
+		 */
+		cede_processor();
+
+		get_lppaca()->idle = 0;
 	}
 }
 
@@ -456,15 +468,14 @@
 
 	pSeries_nvram_init();
 
-	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
+	if (firmware_has_feature(FW_FEATURE_LPAR)) {
 		vpa_init(boot_cpuid);
-		ppc_md.power_save = pSeries_idle;
-	}
-
-	if (firmware_has_feature(FW_FEATURE_LPAR))
+		ppc_md.power_save = pseries_lpar_idle;
 		ppc_md.enable_pmcs = pseries_lpar_enable_pmcs;
-	else
+	} else {
+		/* No special idle routine */
 		ppc_md.enable_pmcs = power4_enable_pmcs;
+	}
 
 	ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
 
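
The control flow of the patch can be exercised outside the kernel with a small user-space mock. This is only a sketch under stated assumptions: every `mock_*` and `MOCK_*` name below is hypothetical and stands in for the kernel symbol it is named after (`cpuidle_idle_call()`, `cede_processor()`, `get_lppaca()`, `FW_FEATURE_LPAR`), and the feature bit values are made up. It illustrates two things: the idle handler is now selected on plain LPAR rather than SPLPAR, and the handler cedes whenever cpuidle reports an error, for example before the cpuidle driver has registered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits; the kernel's FW_FEATURE_* values differ. */
#define MOCK_FW_LPAR   (1u << 0)
#define MOCK_FW_SPLPAR (1u << 1)

/* Stand-in for the per-CPU area shared with the hypervisor. */
struct mock_lppaca { int idle; };

static struct mock_lppaca mock_paca;
static int mock_cpuidle_ready;      /* 0 until a cpuidle driver "registers" */
static int mock_cede_count;         /* how many times the CPU was ceded */
static int mock_idle_flag_at_cede;  /* idle flag value seen while ceding */

/* Mimics cpuidle_idle_call(): nonzero means cpuidle did not handle idle. */
static int mock_cpuidle_idle_call(void)
{
	return mock_cpuidle_ready ? 0 : -1;
}

/* Mimics cede_processor(): records the call instead of making an hcall. */
static void mock_cede_processor(void)
{
	mock_cede_count++;
	mock_idle_flag_at_cede = mock_paca.idle;
}

/* Same shape as pseries_lpar_idle() in the patch. */
static void mock_lpar_idle(void)
{
	assert(mock_paca.idle == 0); /* flag must be clear on entry */

	if (mock_cpuidle_idle_call()) {
		mock_paca.idle = 1;      /* tell the hypervisor we are idle */
		mock_cede_processor();   /* give the processor back */
		mock_paca.idle = 0;
	}
}

typedef void (*mock_idle_fn)(void);

/* Mirrors the setup hunk: any LPAR gets the ceding idle, otherwise none. */
static mock_idle_fn mock_select_idle(unsigned int fw_features)
{
	if (fw_features & MOCK_FW_LPAR)
		return mock_lpar_idle;
	return NULL; /* no special idle routine */
}
```

Calling the selected handler while `mock_cpuidle_ready` is 0 models the early-boot window described in the commit message, where the old code left threads spinning; once the flag is set, the handler returns without ceding because cpuidle has taken over.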