powerpc: fix e500 SPE float rounding inexactness detection

Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

The e500 SPE floating-point emulation code for the round-toward-
positive-infinity and round-toward-negative-infinity modes (which may
not be implemented in hardware) tries to avoid emulating rounding if
the result was exact. However, it tests inexactness using the sticky
bit, which holds the cumulative result of previous operations, rather
than the non-sticky bits relating to the operation that generated the
interrupt. As a result, exact results are rounded incorrectly in these
modes when the sticky bit is set from a previous operation.
Furthermore, when a vector operation generates the interrupt, it's
possible that only one of the low and high parts is inexact, and so
only that part should have rounding emulated.
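
A minimal user-space sketch of the difference between the two tests follows. It is not the kernel code: the DEMO_* masks are hypothetical stand-ins for the cumulative inexact flag and the per-operation low-element bits of SPEFSCR, and their values are assumptions chosen for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit masks, for illustration only. */
#define DEMO_INEXACT_STICKY 0x00200000u /* cumulative: set by any earlier inexact op */
#define DEMO_FG             0x00002000u /* per-operation bit, low element */
#define DEMO_FX             0x00001000u /* per-operation bit, low element */

/* Test on the cumulative flag: also fires for stale inexactness. */
static bool demo_inexact_cumulative(uint32_t status)
{
	return (status & DEMO_INEXACT_STICKY) != 0;
}

/* Test on the bits reported for the most recent operation only. */
static bool demo_inexact_last_op(uint32_t status)
{
	return (status & (DEMO_FG | DEMO_FX)) != 0;
}
```

With a status word in which only DEMO_INEXACT_STICKY is set (an earlier operation was inexact, the current one exact), the first test reports inexactness and the second does not, which is the case the commit message describes.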

(I'm not sure why the rounding interrupts are generated at all when
the result is exact, but empirically the hardware does generate them.)

This patch checks for inexactness using the correct bits of SPEFSCR,
and ensures that rounding only occurs when the relevant part of the
result was actually inexact.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

authored by Joseph Myers and committed by Scott Wood 12 years ago 28414a6d 640e9225

+16 -7

1 changed file

arch/powerpc/math-emu/math_efp.c

@@ -680,7 +680,8 @@
 {
 	union dw_union fgpr;
 	int s_lo, s_hi;
-	unsigned long speinsn, type, fc;
+	int lo_inexact, hi_inexact;
+	unsigned long speinsn, type, fc, fptype;
 
 	if (get_user(speinsn, (unsigned int __user *) regs->nip))
 		return -EFAULT;
@@ -693,8 +694,12 @@
 	__FPU_FPSCR = mfspr(SPRN_SPEFSCR);
 	pr_debug("speinsn:%08lx spefscr:%08lx\n", speinsn, __FPU_FPSCR);
 
+	fptype = (speinsn >> 5) & 0x7;
+
 	/* No need to round if the result is exact */
-	if (!(__FPU_FPSCR & FP_EX_INEXACT))
+	lo_inexact = __FPU_FPSCR & (SPEFSCR_FG | SPEFSCR_FX);
+	hi_inexact = __FPU_FPSCR & (SPEFSCR_FGH | SPEFSCR_FXH);
+	if (!(lo_inexact || (hi_inexact && fptype == VCT)))
 		return 0;
 
 	fc = (speinsn >> 21) & 0x1f;
@@ -705,7 +710,7 @@
 
 	pr_debug("round fgpr: %08x %08x\n", fgpr.wp[0], fgpr.wp[1]);
 
-	switch ((speinsn >> 5) & 0x7) {
+	switch (fptype) {
 	/* Since SPE instructions on E500 core can handle round to nearest
 	 * and round toward zero with IEEE-754 complied, we just need
 	 * to handle round toward +Inf and round toward -Inf by software.
@@ -728,11 +733,15 @@
 
 	case VCT:
 		if (FP_ROUNDMODE == FP_RND_PINF) {
-			if (!s_lo) fgpr.wp[1]++; /* Z_low > 0, choose Z1 */
-			if (!s_hi) fgpr.wp[0]++; /* Z_high word > 0, choose Z1 */
+			if (lo_inexact && !s_lo)
+				fgpr.wp[1]++; /* Z_low > 0, choose Z1 */
+			if (hi_inexact && !s_hi)
+				fgpr.wp[0]++; /* Z_high word > 0, choose Z1 */
 		} else { /* round to -Inf */
-			if (s_lo) fgpr.wp[1]++; /* Z_low < 0, choose Z2 */
-			if (s_hi) fgpr.wp[0]++; /* Z_high < 0, choose Z2 */
+			if (lo_inexact && s_lo)
+				fgpr.wp[1]++; /* Z_low < 0, choose Z2 */
+			if (hi_inexact && s_hi)
+				fgpr.wp[0]++; /* Z_high < 0, choose Z2 */
 		}
 		break;
 
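
The per-element behaviour of the VCT case can be sketched in isolation as follows. This is an illustrative user-space approximation, not the kernel's implementation: the `demo_*` names and types are hypothetical, the two elements are treated as raw single-precision bit patterns, and it assumes (as the increments in the diff suggest) that the value handed to the rounding code is the candidate nearer to zero, so that adding one to the bit pattern steps to the next value away from zero. Special cases such as NaN inputs are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for a two-element vector result. */
struct demo_vec {
	uint32_t hi; /* high element, raw IEEE-754 single bits */
	uint32_t lo; /* low element, raw IEEE-754 single bits */
};

enum demo_round { DEMO_RND_PINF, DEMO_RND_MINF };

/* Adjust one element only if that element was actually inexact. */
static uint32_t demo_round_elem(uint32_t bits, bool inexact,
				enum demo_round mode)
{
	bool negative = (bits >> 31) != 0;

	if (!inexact)
		return bits; /* exact element: leave untouched */
	if (mode == DEMO_RND_PINF && !negative)
		return bits + 1; /* positive value, round up toward +Inf */
	if (mode == DEMO_RND_MINF && negative)
		return bits + 1; /* negative value, round down toward -Inf */
	return bits;
}

/* Apply directed rounding to each half independently. */
static struct demo_vec demo_round_vec(struct demo_vec v, bool lo_inexact,
				      bool hi_inexact, enum demo_round mode)
{
	v.lo = demo_round_elem(v.lo, lo_inexact, mode);
	v.hi = demo_round_elem(v.hi, hi_inexact, mode);
	return v;
}
```

For example, rounding a vector of two 1.0f bit patterns (0x3f800000) toward +Inf with only the low element marked inexact changes the low element to 0x3f800001 and leaves the high element as it was.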