Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

powerpc: fix exception clearing in e500 SPE float emulation

The e500 SPE floating-point emulation code clears existing exceptions
(__FPU_FPSCR &= ~FP_EX_MASK;) before ORing in the exceptions from the
emulated operation. However, these exception bits are the "sticky",
cumulative exception bits, and should only be cleared by the user
program setting SPEFSCR, not implicitly by any floating-point
instruction (whether executed purely by the hardware or emulated).
The spurious clearing of these bits shows up as missing exceptions in
glibc testing.

Fixing this, however, is not as simple as just not clearing the bits,
because while the bits may be from previous floating-point operations
(in which case they should not be cleared), the processor can also set
the sticky bits itself before the interrupt for an exception occurs,
and this can happen in cases when IEEE 754 semantics are that the
sticky bit should not be set. Specifically, the "invalid" sticky bit
is set in various cases with non-finite operands, where IEEE 754
semantics do not involve raising such an exception, and the
"underflow" sticky bit is set in cases of exact underflow, whereas
IEEE 754 semantics are that this flag is set only for inexact
underflow. Thus, for correct emulation the kernel needs to know the
setting of these two sticky bits before the instruction being
emulated.
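
The masking rule described above can be sketched in plain C. This is only an illustration: the DEMO_* bit values and the demo_update_* helpers are hypothetical names, not the real SPEFSCR layout or kernel functions. `hw` stands for the register as the processor left it, `last` for the value the kernel last recorded, and `cur` for the exceptions raised by the emulated operation.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_EX_INEXACT   0x20u
#define DEMO_EX_INVALID   0x10u
#define DEMO_EX_DIVZERO   0x08u
#define DEMO_EX_UNDERFLOW 0x04u
#define DEMO_EX_OVERFLOW  0x02u
#define DEMO_EX_MASK (DEMO_EX_INEXACT | DEMO_EX_INVALID | DEMO_EX_DIVZERO | \
                      DEMO_EX_UNDERFLOW | DEMO_EX_OVERFLOW)

/* Old behaviour: drop every sticky bit, then OR in the emulated result. */
uint32_t demo_update_old(uint32_t hw, uint32_t cur)
{
    hw &= ~DEMO_EX_MASK;
    return hw | (cur & DEMO_EX_MASK);
}

/*
 * New behaviour: only "invalid" and "underflow" are candidates for
 * clearing, and each is kept if it was already set in `last`.
 * All other sticky bits are left alone.
 */
uint32_t demo_update_new(uint32_t hw, uint32_t last, uint32_t cur)
{
    hw &= ~(DEMO_EX_INVALID | DEMO_EX_UNDERFLOW) | last;
    return hw | (cur & DEMO_EX_MASK);
}
```

With an earlier divide-by-zero flag recorded in `last` and a spurious "invalid" bit set by the processor, demo_update_old loses the divide-by-zero flag, while demo_update_new keeps it and drops only the spurious bit.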

When a floating-point operation raises an exception, the kernel can
note the state of the sticky bits immediately afterwards. Some
<fenv.h> functions that affect the state of these bits, such as
fesetenv and feholdexcept, need to use prctl with PR_GET_FPEXC and
PR_SET_FPEXC anyway, and so it is natural to record the state of those
bits during that call into the kernel and so avoid any need for a
separate call into the kernel to inform it of a change to those bits.
Thus, the interface I chose to use (in this patch and the glibc port)
is that one of those prctl calls must be made after any userspace
change to those sticky bits, other than through a floating-point
operation that traps into the kernel anyway. feclearexcept and
fesetexceptflag duly make those calls, which would not be required
were it not for this issue.
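
As a rough userspace sketch of that interface, a library routine that has just written the sticky bits directly could follow up with a PR_GET_FPEXC call so the kernel records the new state. The helper name demo_sync_sticky_bits is hypothetical, and the write to SPEFSCR itself (which needs PowerPC-specific code) is left out. On kernels or architectures without this support the call simply fails.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: call after userspace has changed the sticky
 * exception bits directly. Returns 0 if the kernel accepted the
 * PR_GET_FPEXC call, or a negative errno value otherwise (for
 * example on a non-PowerPC machine).
 */
int demo_sync_sticky_bits(void)
{
    unsigned int mode = 0;

    /* The kernel writes the current exception mode into `mode`. */
    if (prctl(PR_GET_FPEXC, (unsigned long)&mode, 0, 0, 0) != 0)
        return -errno;
    return 0;
}
```

PR_GET_FPEXC is used here because it does not alter the exception mode; per the kernel comment in this patch, it only has the recording effect when PR_FP_EXC_SW_ENABLE is already part of the existing prctl settings.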

The previous EGLIBC port, and the uClibc code copied from it, is
fundamentally broken as regards any use of prctl for floating-point
exceptions because it didn't use the PR_FP_EXC_SW_ENABLE bit in its
prctl calls (and did various worse things, such as passing a pointer
when prctl expected an integer). Even if you avoid anything that uses
prctl, the clearing of sticky bits still means that code will never give
anything approximating correct exception semantics with existing
kernels. I don't believe the patch makes things any worse for
existing code that doesn't try to inform the kernel of changes to
sticky bits - such code may get incorrect exceptions in some cases,
but it would have done so anyway in other cases.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>

Authored by Joseph Myers and committed by Scott Wood
640e9225 228b1a47

+52 -4
+5 -1
arch/powerpc/include/asm/processor.h
···
 	unsigned long	evr[32];	/* upper 32-bits of SPE regs */
 	u64		acc;		/* Accumulator */
 	unsigned long	spefscr;	/* SPE & eFP status */
+	unsigned long	spefscr_last;	/* SPEFSCR value on last prctl
+					   call or trap return */
 	int		used_spe;	/* set if process has used spe */
 #endif /* CONFIG_SPE */
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
···
 	(_ALIGN_UP(sizeof(init_thread_info), 16) + (unsigned long) &init_stack)
 
 #ifdef CONFIG_SPE
-#define SPEFSCR_INIT .spefscr = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | SPEFSCR_FOVFE,
+#define SPEFSCR_INIT \
+	.spefscr = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | SPEFSCR_FOVFE, \
+	.spefscr_last = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | SPEFSCR_FOVFE,
 #else
 #define SPEFSCR_INIT
 #endif
+28 -2
arch/powerpc/kernel/process.c
···
 	if (val & PR_FP_EXC_SW_ENABLE) {
 #ifdef CONFIG_SPE
 		if (cpu_has_feature(CPU_FTR_SPE)) {
+			/*
+			 * When the sticky exception bits are set
+			 * directly by userspace, it must call prctl
+			 * with PR_GET_FPEXC (with PR_FP_EXC_SW_ENABLE
+			 * in the existing prctl settings) or
+			 * PR_SET_FPEXC (with PR_FP_EXC_SW_ENABLE in
+			 * the bits being set). <fenv.h> functions
+			 * saving and restoring the whole
+			 * floating-point environment need to do so
+			 * anyway to restore the prctl settings from
+			 * the saved environment.
+			 */
+			tsk->thread.spefscr_last = mfspr(SPRN_SPEFSCR);
 			tsk->thread.fpexc_mode = val &
 				(PR_FP_EXC_SW_ENABLE | PR_FP_ALL_EXCEPT);
 			return 0;
···
 
 	if (tsk->thread.fpexc_mode & PR_FP_EXC_SW_ENABLE)
 #ifdef CONFIG_SPE
-		if (cpu_has_feature(CPU_FTR_SPE))
+		if (cpu_has_feature(CPU_FTR_SPE)) {
+			/*
+			 * When the sticky exception bits are set
+			 * directly by userspace, it must call prctl
+			 * with PR_GET_FPEXC (with PR_FP_EXC_SW_ENABLE
+			 * in the existing prctl settings) or
+			 * PR_SET_FPEXC (with PR_FP_EXC_SW_ENABLE in
+			 * the bits being set). <fenv.h> functions
+			 * saving and restoring the whole
+			 * floating-point environment need to do so
+			 * anyway to restore the prctl settings from
+			 * the saved environment.
+			 */
+			tsk->thread.spefscr_last = mfspr(SPRN_SPEFSCR);
 			val = tsk->thread.fpexc_mode;
-		else
+		} else
 			return -EINVAL;
 #else
 		return -EINVAL;
+19 -1
arch/powerpc/math-emu/math_efp.c
···
 		regs->ccr |= (IR << ((7 - ((speinsn >> 23) & 0x7)) << 2));
 
 update_regs:
-	__FPU_FPSCR &= ~FP_EX_MASK;
+	/*
+	 * If the "invalid" exception sticky bit was set by the
+	 * processor for non-finite input, but was not set before the
+	 * instruction being emulated, clear it. Likewise for the
+	 * "underflow" bit, which may have been set by the processor
+	 * for exact underflow, not just inexact underflow when the
+	 * flag should be set for IEEE 754 semantics. Other sticky
+	 * exceptions will only be set by the processor when they are
+	 * correct according to IEEE 754 semantics, and we must not
+	 * clear sticky bits that were already set before the emulated
+	 * instruction as they represent the user-visible sticky
+	 * exception status. "inexact" traps to kernel are not
+	 * required for IEEE semantics and are not enabled by default,
+	 * so the "inexact" sticky bit may have been set by a previous
+	 * instruction without the kernel being aware of it.
+	 */
+	__FPU_FPSCR
+	  &= ~(FP_EX_INVALID | FP_EX_UNDERFLOW) | current->thread.spefscr_last;
 	__FPU_FPSCR |= (FP_CUR_EXCEPTIONS & FP_EX_MASK);
 	mtspr(SPRN_SPEFSCR, __FPU_FPSCR);
+	current->thread.spefscr_last = __FPU_FPSCR;
 
 	current->thread.evr[fc] = vc.wp[0];
 	regs->gpr[fc] = vc.wp[1];