Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

powerpc: Reimplement __get_SP() as a function not a define

Li Zhong points out an issue with our current __get_SP()
implementation. If ftrace function tracing is enabled (i.e. -pg
profiling using _mcount), we always spill a stack frame on 64-bit.

If a function calls __get_SP() and later calls a function that is
tail call optimised, we will pop the stack frame and the value
returned by __get_SP() is no longer valid. An example from Li can
be found in save_stack_trace -> save_context_stack:

c0000000000432c0 <.save_stack_trace>:
c0000000000432c0: mflr r0
c0000000000432c4: std r0,16(r1)
c0000000000432c8: stdu r1,-128(r1) <-- stack frame for _mcount
c0000000000432cc: std r3,112(r1)
c0000000000432d0: bl <._mcount>
c0000000000432d4: nop

c0000000000432d8: mr r4,r1 <-- __get_SP()

c0000000000432dc: ld r5,632(r13)
c0000000000432e0: ld r3,112(r1)
c0000000000432e4: li r6,1

c0000000000432e8: addi r1,r1,128 <-- pop stack frame

c0000000000432ec: ld r0,16(r1)
c0000000000432f0: mtlr r0
c0000000000432f4: b <.save_context_stack> <-- tail call optimized

save_context_stack ends up with a stack pointer below the current
one, and the memory it points to is likely to be scribbled over.

Fix this by making __get_SP() a function which returns the
caller's stack frame. Also replace the inline assembly which grabs
the stack pointer in save_stack_trace and show_stack with
__get_SP().
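
The failure and the fix can be modelled in user-space C. The sketch
below is a toy model, not kernel code: struct toy_cpu, push_frame(),
pop_frame(), old_get_sp(), new_get_sp() and sp_survives_tail_call()
are hypothetical names, the stack is a plain array indexed by r1, and
word 0 of each frame is assumed to hold the back chain. It replays the
save_stack_trace sequence from the listing above.

```c
#include <assert.h>

#define STACK_WORDS 64
#define FRAME_WORDS 16	/* 128 bytes / 8, the frame size in the listing */

/* Toy model: a downward-growing stack of words; r1 is a word index. */
struct toy_cpu {
	unsigned long stack[STACK_WORDS];
	unsigned long r1;
};

/* stdu r1,-128(r1): store the old r1 at the new frame base, update r1. */
static void push_frame(struct toy_cpu *cpu)
{
	unsigned long old = cpu->r1;

	assert(old >= FRAME_WORDS);
	cpu->r1 = old - FRAME_WORDS;
	cpu->stack[cpu->r1] = old;	/* back chain */
}

/* addi r1,r1,128 */
static void pop_frame(struct toy_cpu *cpu)
{
	cpu->r1 += FRAME_WORDS;
}

/* Old define: "mr %0,1" copies r1 itself. */
static unsigned long old_get_sp(const struct toy_cpu *cpu)
{
	return cpu->r1;
}

/* New function: "PPC_LL r3,0(r1)" loads the back chain word at r1. */
static unsigned long new_get_sp(const struct toy_cpu *cpu)
{
	return cpu->stack[cpu->r1];
}

/*
 * Replay save_stack_trace and report whether the sp it hands to the
 * tail-called function is still at or above r1 once the frame is popped.
 */
static int sp_survives_tail_call(int use_new)
{
	struct toy_cpu cpu = { .r1 = STACK_WORDS - FRAME_WORDS };
	unsigned long sp;

	push_frame(&cpu);		/* stack frame for _mcount */
	sp = use_new ? new_get_sp(&cpu) : old_get_sp(&cpu);
	pop_frame(&cpu);		/* pop stack frame */

	/* b save_context_stack: everything below r1 is now free to reuse. */
	return sp >= cpu.r1;
}
```

In this model sp_survives_tail_call(0) returns 0, because the value
captured by the old define lies below r1 after the pop, while
sp_survives_tail_call(1) returns 1, because the back chain value equals
the stack pointer the tail-called function starts with.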

This also fixes an issue with perf_arch_fetch_caller_regs().
It currently unwinds the stack once, which will skip a
valid stack frame on a leaf function. With the __get_SP() fixes
in this patch, we never need to unwind the stack frame to get
to the first interesting frame.

We have to export __get_SP() because perf_arch_fetch_caller_regs()
(which is used in modules) calls it from a header file.

Reported-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Authored by Anton Blanchard and committed by Michael Ellerman
bfe9a2cf 2d73bae1

+10 -5
+1 -1
arch/powerpc/include/asm/perf_event.h
@@ -34,7 +34,7 @@
 	do { \
 		(regs)->result = 0; \
 		(regs)->nip = __ip; \
-		(regs)->gpr[1] = *(unsigned long *)__get_SP(); \
+		(regs)->gpr[1] = __get_SP(); \
 		asm volatile("mfmsr %0" : "=r" ((regs)->msr)); \
 	} while (0)
 #endif
+1 -2
arch/powerpc/include/asm/reg.h
@@ -1265,8 +1265,7 @@
 
 #define proc_trap() asm volatile("trap")
 
-#define __get_SP() ({unsigned long sp; \
-		asm volatile("mr %0,1": "=r" (sp)); sp;})
+extern unsigned long __get_SP(void);
 
 extern unsigned long scom970_read(unsigned int address);
 extern void scom970_write(unsigned int address, unsigned long value);
+4
arch/powerpc/kernel/misc.S
@@ -114,3 +114,7 @@
 	mtlr	r0
 	mr	r3,r4
 	blr
+
+_GLOBAL(__get_SP)
+	PPC_LL	r3,0(r1)
+	blr
+2
arch/powerpc/kernel/ppc_ksyms.c
@@ -41,3 +41,5 @@
 #ifdef CONFIG_EPAPR_PARAVIRT
 EXPORT_SYMBOL(epapr_hypercall_start);
 #endif
+
+EXPORT_SYMBOL(__get_SP);
+1 -1
arch/powerpc/kernel/process.c
@@ -1545,7 +1545,7 @@
 		tsk = current;
 	if (sp == 0) {
 		if (tsk == current)
-			asm("mr %0,1" : "=r" (sp));
+			sp = __get_SP();
 		else
 			sp = tsk->thread.ksp;
 	}
+1 -1
arch/powerpc/kernel/stacktrace.c
@@ -50,7 +50,7 @@
 {
 	unsigned long sp;
 
-	asm("mr %0,1" : "=r" (sp));
+	sp = __get_SP();
 
 	save_context_stack(trace, sp, current, 1);
 }