[PATCH] powerpc/8xx: Use 8MB D-TLB's for kernel static mapping faults

This patch adds support for instantiating 8MB D-TLB entries for the
kernel direct virtual mapping on 8xx, reducing the TLB space consumed
by the kernel.

Test used: writing 40MB from /dev/zero to a file on an ext2 filesystem
backed by a RAM disk.

$ time dd if=/dev/zero of=file bs=4k count=10000

        VANILLA                 8MB kernel data pages

real    0m11.485s               real    0m11.267s
user    0m0.218s                user    0m0.250s
sys     0m8.939s                sys     0m9.108s

real    0m11.518s               real    0m10.978s
user    0m0.203s                user    0m0.222s
sys     0m9.585s                sys     0m9.138s

real    0m11.554s               real    0m10.967s
user    0m0.228s                user    0m0.222s
sys     0m9.497s                sys     0m9.127s

real    0m11.633s               real    0m11.286s
user    0m0.214s                user    0m0.196s
sys     0m9.529s                sys     0m9.134s

and averages for both:

real    11.54750                real    11.12450

This is roughly a 3.7% reduction in execution time. A larger improvement
is expected for loads with a larger kernel data footprint (real workloads).
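
For reference, the averages and the relative gain can be re-derived from the four "real" samples in each column. The snippet below is only a checking aid, not part of the patch; the variable names are made up for illustration, and the gain is computed relative to the vanilla average, which is an assumption about how the quoted percentage was obtained.

```python
from statistics import mean

# "real" wall-clock times in seconds, copied from the two columns above
vanilla = [11.485, 11.518, 11.554, 11.633]
large_tlb = [11.267, 10.978, 10.967, 11.286]

avg_vanilla = mean(vanilla)
avg_large = mean(large_tlb)

# Relative reduction in wall-clock time, measured against the vanilla average
improvement = (avg_vanilla - avg_large) / avg_vanilla * 100

print(f"vanilla average:   {avg_vanilla:.5f} s")
print(f"8MB pages average: {avg_large:.5f} s")
print(f"improvement:       {improvement:.2f} %")
```

With only four runs per configuration the difference is small relative to run-to-run spread, so the percentage is best read as indicative.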

Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

authored by Marcelo Tosatti and committed by Paul Mackerras 8f069b1a 7d13d21a

+77
arch/ppc/kernel/head_8xx.S
···
 	lis	r11, swapper_pg_dir@h
 	ori	r11, r11, swapper_pg_dir@l
 	rlwimi	r10, r11, 0, 2, 19
+	stw	r12, 16(r0)
+	b LoadLargeDTLB
 3:
 	lwz	r11, 0(r10)	/* Get the level 1 entry */
 	rlwinm.	r10, r11,0,0,19	/* Extract page descriptor page address */
···
 	. = 0x1300
 InstructionTLBError:
 	b	InstructionAccess
+
+LoadLargeDTLB:
+	li	r12, 0
+	lwz	r11, 0(r10)	/* Get the level 1 entry */
+	rlwinm.	r10, r11,0,0,19	/* Extract page descriptor page address */
+	beq	3f		/* If zero, don't try to find a pte */
+
+	/* We have a pte table, so load fetch the pte from the table.
+	 */
+	ori	r11, r11, 1	/* Set valid bit in physical L2 page */
+	DO_8xx_CPU6(0x3b80, r3)
+	mtspr	SPRN_MD_TWC, r11	/* Load pte table base address */
+	mfspr	r10, SPRN_MD_TWC	/* ....and get the pte address */
+	lwz	r10, 0(r10)	/* Get the pte */
+
+	/* Insert the Guarded flag into the TWC from the Linux PTE.
+	 * It is bit 27 of both the Linux PTE and the TWC (at least
+	 * I got that right :-).  It will be better when we can put
+	 * this into the Linux pgd/pmd and load it in the operation
+	 * above.
+	 */
+	rlwimi	r11, r10, 0, 27, 27
+
+	rlwimi	r12, r10, 0, 0, 9	/* extract phys. addr */
+	mfspr	r3, SPRN_MD_EPN
+	rlwinm	r3, r3, 0, 0, 9		/* extract virtual address */
+	tophys(r3, r3)
+	cmpw	r3, r12			/* only use 8M page if it is a direct
+					   kernel mapping */
+	bne	1f
+	ori	r11, r11, MD_PS8MEG
+	li	r12, 1
+	b	2f
+1:
+	li	r12, 0		/* can't use 8MB TLB, so zero r12. */
+2:
+	DO_8xx_CPU6(0x3b80, r3)
+	mtspr	SPRN_MD_TWC, r11
+
+	/* The Linux PTE won't go exactly into the MMU TLB.
+	 * Software indicator bits 21, 22 and 28 must be clear.
+	 * Software indicator bits 24, 25, 26, and 27 must be
+	 * set.  All other Linux PTE bits control the behavior
+	 * of the MMU.
+	 */
+3:	li	r11, 0x00f0
+	rlwimi	r10, r11, 0, 24, 28	/* Set 24-27, clear 28 */
+	cmpwi	r12, 1
+	bne	4f
+	ori	r10, r10, 0x8
+
+	mfspr	r12, SPRN_MD_EPN
+	lis	r3, 0xff80	/* 10-19 must be clear for 8MB TLB */
+	ori	r3, r3, 0x0fff
+	and	r12, r3, r12
+	DO_8xx_CPU6(0x3780, r3)
+	mtspr	SPRN_MD_EPN, r12
+
+	lis	r3, 0xff80	/* 10-19 must be clear for 8MB TLB */
+	ori	r3, r3, 0x0fff
+	and	r10, r3, r10
+4:
+	DO_8xx_CPU6(0x3d80, r3)
+	mtspr	SPRN_MD_RPN, r10	/* Update TLB entry */
+
+	mfspr	r10, SPRN_M_TW	/* Restore registers */
+	lwz	r11, 0(r0)
+	mtcr	r11
+	lwz	r11, 4(r0)
+
+	lwz	r12, 16(r0)
+#ifdef CONFIG_8xx_CPU6
+	lwz	r3, 8(r0)
+#endif
+	rfi

 /* This is the data TLB error on the MPC8xx.  This could be due to
  * many reasons, including a dirty update to a pte.  We can catch that
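
The decision the new handler makes (use an 8MB entry only when the faulting address belongs to the kernel direct mapping) can be modelled outside the assembly. The sketch below is an illustration only, not the kernel's code. It assumes `KERNELBASE` is 0xC0000000, that `tophys()` amounts to subtracting `KERNELBASE`, and that the values are plain 32-bit integers; the Python function names are hypothetical. The masks are derived from the `rlwinm`/`rlwimi` operands (IBM bits 0-9 are the top ten bits of the word) and from the `lis`/`ori` pair in the diff.

```python
KERNELBASE = 0xC0000000    # assumption: default kernel virtual base on arch/ppc
MASK_BITS_0_9 = 0xFFC00000  # IBM bits 0-9, as selected by "rlwinm rX, rX, 0, 0, 9"
MASK_8MB = 0xFF800FFF       # value built by "lis r3, 0xff80; ori r3, r3, 0x0fff"


def tophys(vaddr: int) -> int:
    """Model of tophys(): virtual to physical for the direct mapping (assumed)."""
    return (vaddr - KERNELBASE) & 0xFFFFFFFF


def is_direct_kernel_mapping(md_epn: int, pte: int) -> bool:
    """Mirror the cmpw in LoadLargeDTLB: compare the top ten address bits of
    the translated faulting address with those of the physical page in the PTE."""
    return tophys(md_epn & MASK_BITS_0_9) == (pte & MASK_BITS_0_9)


def mask_for_8mb(value: int) -> int:
    """Apply the 0xff800fff mask used on MD_EPN and the RPN value: keep the
    top nine address bits and the low twelve bits, clear everything between."""
    return value & MASK_8MB


# Direct-mapped kernel address: virtual 0xC0801000 backed by physical 0x00801000
print(is_direct_kernel_mapping(0xC0801000, 0x00801000))
# Address outside the direct mapping (virtual and physical do not line up)
print(is_direct_kernel_mapping(0xC9000000, 0x00345000))
# Aligning an effective page number for an 8MB entry
print(hex(mask_for_8mb(0xC0C01234)))
```

Note that 0xff800fff keeps the top nine bits, which is what 8MB alignment of a 32-bit address requires, while the comparison itself is done on the top ten bits.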