Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge phase #5 (misc) of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

Merges oprofile, timers/hpet, x86/traps, x86/time, and x86/core misc items.

* 'x86-core-v4-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (132 commits)
x86: change early_ioremap to use slots instead of nesting
x86: adjust dependencies for CONFIG_X86_CMOV
dumpstack: x86: various small unification steps, fix
x86: remove additional_cpus
x86: remove additional_cpus configurability
x86: improve UP kernel when CPU-hotplug and SMP is enabled
dumpstack: x86: various small unification steps
dumpstack: i386: make kstack= an early boot-param and add oops=panic
dumpstack: x86: use log_lvl and unify trace formatting
dumptrace: x86: consistently include loglevel, print stack switch
dumpstack: x86: add "end" parameter to valid_stack_ptr and print_context_stack
dumpstack: x86: make printk_address equal
dumpstack: x86: move die_nmi to dumpstack_32.c
traps: x86: finalize unification of traps.c
traps: x86: make traps_32.c and traps_64.c equal
traps: x86: various noop-changes preparing for unification of traps_xx.c
traps: x86_64: use task_pid_nr(tsk) instead of tsk->pid in do_general_protection
traps: i386: expand clear_mem_error and remove from mach_traps.h
traps: x86_64: make io_check_error equal to the one on i386
traps: i386: use preempt_conditional_sti/cli in do_int3
...

+2865 -2584
-2
Documentation/00-INDEX
··· 159 159 - info on using the Hayes ESP serial driver. 160 160 highuid.txt 161 161 - notes on the change from 16 bit to 32 bit user/group IDs. 162 - hpet.txt 163 - - High Precision Event Timer Driver for Linux. 164 162 timers/ 165 163 - info on the timer related topics 166 164 hw_random.txt
+21 -22
Documentation/hpet.txt Documentation/timers/hpet.txt
··· 1 1 High Precision Event Timer Driver for Linux 2 2 3 - The High Precision Event Timer (HPET) hardware is the future replacement 4 - for the 8254 and Real Time Clock (RTC) periodic timer functionality. 5 - Each HPET can have up to 32 timers. It is possible to configure the 6 - first two timers as legacy replacements for 8254 and RTC periodic timers. 7 - A specification done by Intel and Microsoft can be found at 8 - <http://www.intel.com/technology/architecture/hpetspec.htm>. 3 + The High Precision Event Timer (HPET) hardware follows a specification 4 + by Intel and Microsoft which can be found at 5 + 6 + http://www.intel.com/technology/architecture/hpetspec.htm 7 + 8 + Each HPET has one fixed-rate counter (at 10+ MHz, hence "High Precision") 9 + and up to 32 comparators. Normally three or more comparators are provided, 10 + each of which can generate oneshot interupts and at least one of which has 11 + additional hardware to support periodic interrupts. The comparators are 12 + also called "timers", which can be misleading since usually timers are 13 + independent of each other ... these share a counter, complicating resets. 14 + 15 + HPET devices can support two interrupt routing modes. In one mode, the 16 + comparators are additional interrupt sources with no particular system 17 + role. Many x86 BIOS writers don't route HPET interrupts at all, which 18 + prevents use of that mode. They support the other "legacy replacement" 19 + mode where the first two comparators block interrupts from 8254 timers 20 + and from the RTC. 9 21 10 22 The driver supports detection of HPET driver allocation and initialization 11 23 of the HPET before the driver module_init routine is called. This enables 12 24 platform code which uses timer 0 or 1 as the main timer to intercept HPET 13 25 initialization. An example of this initialization can be found in 14 - arch/i386/kernel/time_hpet.c. 26 + arch/x86/kernel/hpet.c. 
15 27 16 - The driver provides two APIs which are very similar to the API found in 17 - the rtc.c driver. There is a user space API and a kernel space API. 18 - An example user space program is provided below. 28 + The driver provides a userspace API which resembles the API found in the 29 + RTC driver framework. An example user space program is provided below. 19 30 20 31 #include <stdio.h> 21 32 #include <stdlib.h> ··· 297 286 298 287 return; 299 288 } 300 - 301 - The kernel API has three interfaces exported from the driver: 302 - 303 - hpet_register(struct hpet_task *tp, int periodic) 304 - hpet_unregister(struct hpet_task *tp) 305 - hpet_control(struct hpet_task *tp, unsigned int cmd, unsigned long arg) 306 - 307 - The kernel module using this interface fills in the ht_func and ht_data 308 - members of the hpet_task structure before calling hpet_register. 309 - hpet_control simply vectors to the hpet_ioctl routine and has the same 310 - commands and respective arguments as the user API. hpet_unregister 311 - is used to terminate usage of the HPET timer reserved by hpet_register.
+10
Documentation/timers/00-INDEX
··· 1 + 00-INDEX 2 + - this file 3 + highres.txt 4 + - High resolution timers and dynamic ticks design notes 5 + hpet.txt 6 + - High Precision Event Timer Driver for Linux 7 + hrtimers.txt 8 + - subsystem for high-resolution kernel timers 9 + timer_stats.txt 10 + - timer usage statistics
+14
arch/Kconfig
··· 13 13 14 14 If unsure, say N. 15 15 16 + config OPROFILE_IBS 17 + bool "OProfile AMD IBS support (EXPERIMENTAL)" 18 + default n 19 + depends on OPROFILE && SMP && X86 20 + help 21 + Instruction-Based Sampling (IBS) is a new profiling 22 + technique that provides rich, precise program performance 23 + information. IBS is introduced by AMD Family10h processors 24 + (AMD Opteron Quad-Core processor “Barcelona”) to overcome 25 + the limitations of conventional performance counter 26 + sampling. 27 + 28 + If unsure, say N. 29 + 16 30 config HAVE_OPROFILE 17 31 def_bool n 18 32
+10 -19
arch/x86/Kconfig.cpu
··· 38 38 - "Crusoe" for the Transmeta Crusoe series. 39 39 - "Efficeon" for the Transmeta Efficeon series. 40 40 - "Winchip-C6" for original IDT Winchip. 41 - - "Winchip-2" for IDT Winchip 2. 42 - - "Winchip-2A" for IDT Winchips with 3dNow! capabilities. 41 + - "Winchip-2" for IDT Winchips with 3dNow! capabilities. 43 42 - "GeodeGX1" for Geode GX1 (Cyrix MediaGX). 44 43 - "Geode GX/LX" For AMD Geode GX and LX processors. 45 44 - "CyrixIII/VIA C3" for VIA Cyrix III or VIA C3. ··· 193 194 treat this chip as a 586TSC with some extended instructions 194 195 and alignment requirements. 195 196 196 - config MWINCHIP2 197 - bool "Winchip-2" 198 - depends on X86_32 199 - help 200 - Select this for an IDT Winchip-2. Linux and GCC 201 - treat this chip as a 586TSC with some extended instructions 202 - and alignment requirements. 203 - 204 197 config MWINCHIP3D 205 - bool "Winchip-2A/Winchip-3" 198 + bool "Winchip-2/Winchip-2A/Winchip-3" 206 199 depends on X86_32 207 200 help 208 - Select this for an IDT Winchip-2A or 3. Linux and GCC 201 + Select this for an IDT Winchip-2, 2A or 3. Linux and GCC 209 202 treat this chip as a 586TSC with some extended instructions 210 203 and alignment requirements. 
Also enable out of order memory 211 204 stores for this CPU, which can increase performance of some ··· 309 318 int 310 319 default "7" if MPENTIUM4 || X86_GENERIC || GENERIC_CPU || MPSC 311 320 default "4" if X86_ELAN || M486 || M386 || MGEODEGX1 312 - default "5" if MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX 321 + default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX 313 322 default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MVIAC7 314 323 315 324 config X86_XADD ··· 351 360 352 361 config X86_ALIGNMENT_16 353 362 def_bool y 354 - depends on MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCYRIXIII || X86_ELAN || MK6 || M586MMX || M586TSC || M586 || M486 || MVIAC3_2 || MGEODEGX1 363 + depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || X86_ELAN || MK6 || M586MMX || M586TSC || M586 || M486 || MVIAC3_2 || MGEODEGX1 355 364 356 365 config X86_INTEL_USERCOPY 357 366 def_bool y ··· 359 368 360 369 config X86_USE_PPRO_CHECKSUM 361 370 def_bool y 362 - depends on MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MEFFICEON || MGEODE_LX || MCORE2 371 + depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MEFFICEON || MGEODE_LX || MCORE2 363 372 364 373 config X86_USE_3DNOW 365 374 def_bool y ··· 367 376 368 377 config X86_OOSTORE 369 378 def_bool y 370 - depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR 379 + depends on (MWINCHIP3D || MWINCHIPC6) && MTRR 371 380 372 381 # 373 382 # P6_NOPs are a relatively minor optimization that require a family >= ··· 387 396 388 397 config X86_TSC 389 398 def_bool y 390 - depends on 
((MWINCHIP3D || MWINCHIP2 || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2) && !X86_NUMAQ) || X86_64 399 + depends on ((MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2) && !X86_NUMAQ) || X86_64 391 400 392 401 config X86_CMPXCHG64 393 402 def_bool y ··· 397 406 # generates cmov. 398 407 config X86_CMOV 399 408 def_bool y 400 - depends on (MK7 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || X86_64) 409 + depends on (MK8 || MK7 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64) 401 410 402 411 config X86_MINIMUM_CPU_FAMILY 403 412 int ··· 408 417 409 418 config X86_DEBUGCTLMSR 410 419 def_bool y 411 - depends on !(MK6 || MWINCHIPC6 || MWINCHIP2 || MWINCHIP3D || MCYRIXIII || M586MMX || M586TSC || M586 || M486 || M386) 420 + depends on !(MK6 || MWINCHIPC6 || MWINCHIP3D || MCYRIXIII || M586MMX || M586TSC || M586 || M486 || M386) 412 421 413 422 menuconfig PROCESSOR_SELECT 414 423 bool "Supported processor vendors" if EMBEDDED
-1
arch/x86/Makefile_32.cpu
··· 28 28 cflags-$(CONFIG_MCRUSOE) += -march=i686 $(align)-functions=0 $(align)-jumps=0 $(align)-loops=0 29 29 cflags-$(CONFIG_MEFFICEON) += -march=i686 $(call tune,pentium3) $(align)-functions=0 $(align)-jumps=0 $(align)-loops=0 30 30 cflags-$(CONFIG_MWINCHIPC6) += $(call cc-option,-march=winchip-c6,-march=i586) 31 - cflags-$(CONFIG_MWINCHIP2) += $(call cc-option,-march=winchip2,-march=i586) 32 31 cflags-$(CONFIG_MWINCHIP3D) += $(call cc-option,-march=winchip2,-march=i586) 33 32 cflags-$(CONFIG_MCYRIXIII) += $(call cc-option,-march=c3,-march=i486) $(align)-functions=0 $(align)-jumps=0 $(align)-loops=0 34 33 cflags-$(CONFIG_MVIAC3_2) += $(call cc-option,-march=c3-2,-march=i686)
-1
arch/x86/configs/i386_defconfig
··· 213 213 # CONFIG_MCRUSOE is not set 214 214 # CONFIG_MEFFICEON is not set 215 215 # CONFIG_MWINCHIPC6 is not set 216 - # CONFIG_MWINCHIP2 is not set 217 216 # CONFIG_MWINCHIP3D is not set 218 217 # CONFIG_MGEODEGX1 is not set 219 218 # CONFIG_MGEODE_LX is not set
-1
arch/x86/configs/x86_64_defconfig
··· 210 210 # CONFIG_MCRUSOE is not set 211 211 # CONFIG_MEFFICEON is not set 212 212 # CONFIG_MWINCHIPC6 is not set 213 - # CONFIG_MWINCHIP2 is not set 214 213 # CONFIG_MWINCHIP3D is not set 215 214 # CONFIG_MGEODEGX1 is not set 216 215 # CONFIG_MGEODE_LX is not set
+10 -16
arch/x86/ia32/ia32entry.S
··· 39 39 .endm 40 40 41 41 /* clobbers %eax */ 42 - .macro CLEAR_RREGS 42 + .macro CLEAR_RREGS _r9=rax 43 43 xorl %eax,%eax 44 44 movq %rax,R11(%rsp) 45 45 movq %rax,R10(%rsp) 46 - movq %rax,R9(%rsp) 46 + movq %\_r9,R9(%rsp) 47 47 movq %rax,R8(%rsp) 48 48 .endm 49 49 ··· 52 52 * We don't reload %eax because syscall_trace_enter() returned 53 53 * the value it wants us to use in the table lookup. 54 54 */ 55 - .macro LOAD_ARGS32 offset 56 - movl \offset(%rsp),%r11d 57 - movl \offset+8(%rsp),%r10d 55 + .macro LOAD_ARGS32 offset, _r9=0 56 + .if \_r9 58 57 movl \offset+16(%rsp),%r9d 59 - movl \offset+24(%rsp),%r8d 58 + .endif 60 59 movl \offset+40(%rsp),%ecx 61 60 movl \offset+48(%rsp),%edx 62 61 movl \offset+56(%rsp),%esi ··· 144 145 SAVE_ARGS 0,0,1 145 146 /* no need to do an access_ok check here because rbp has been 146 147 32bit zero extended */ 147 - 1: movl (%rbp),%r9d 148 + 1: movl (%rbp),%ebp 148 149 .section __ex_table,"a" 149 150 .quad 1b,ia32_badarg 150 151 .previous ··· 156 157 cmpl $(IA32_NR_syscalls-1),%eax 157 158 ja ia32_badsys 158 159 sysenter_do_call: 159 - IA32_ARG_FIXUP 1 160 + IA32_ARG_FIXUP 160 161 sysenter_dispatch: 161 162 call *ia32_sys_call_table(,%rax,8) 162 163 movq %rax,RAX-ARGOFFSET(%rsp) ··· 233 234 #endif 234 235 235 236 sysenter_tracesys: 236 - xchgl %r9d,%ebp 237 237 #ifdef CONFIG_AUDITSYSCALL 238 238 testl $(_TIF_WORK_SYSCALL_ENTRY & ~_TIF_SYSCALL_AUDIT),TI_flags(%r10) 239 239 jz sysenter_auditsys 240 240 #endif 241 241 SAVE_REST 242 242 CLEAR_RREGS 243 - movq %r9,R9(%rsp) 244 243 movq $-ENOSYS,RAX(%rsp)/* ptrace can change this for a bad syscall */ 245 244 movq %rsp,%rdi /* &pt_regs -> arg1 */ 246 245 call syscall_trace_enter 247 246 LOAD_ARGS32 ARGOFFSET /* reload args from stack in case ptrace changed it */ 248 247 RESTORE_REST 249 - xchgl %ebp,%r9d 250 248 cmpl $(IA32_NR_syscalls-1),%eax 251 249 ja int_ret_from_sys_call /* sysenter_tracesys has set RAX(%rsp) */ 252 250 jmp sysenter_do_call ··· 310 314 testl 
$_TIF_WORK_SYSCALL_ENTRY,TI_flags(%r10) 311 315 CFI_REMEMBER_STATE 312 316 jnz cstar_tracesys 313 - cstar_do_call: 314 317 cmpl $IA32_NR_syscalls-1,%eax 315 318 ja ia32_badsys 319 + cstar_do_call: 316 320 IA32_ARG_FIXUP 1 317 321 cstar_dispatch: 318 322 call *ia32_sys_call_table(,%rax,8) ··· 353 357 #endif 354 358 xchgl %r9d,%ebp 355 359 SAVE_REST 356 - CLEAR_RREGS 357 - movq %r9,R9(%rsp) 360 + CLEAR_RREGS r9 358 361 movq $-ENOSYS,RAX(%rsp) /* ptrace can change this for a bad syscall */ 359 362 movq %rsp,%rdi /* &pt_regs -> arg1 */ 360 363 call syscall_trace_enter 361 - LOAD_ARGS32 ARGOFFSET /* reload args from stack in case ptrace changed it */ 364 + LOAD_ARGS32 ARGOFFSET, 1 /* reload args from stack in case ptrace changed it */ 362 365 RESTORE_REST 363 366 xchgl %ebp,%r9d 364 - movl RSP-ARGOFFSET(%rsp), %r8d 365 367 cmpl $(IA32_NR_syscalls-1),%eax 366 368 ja int_ret_from_sys_call /* cstar_tracesys has set RAX(%rsp) */ 367 369 jmp cstar_do_call
+1 -1
arch/x86/kernel/Makefile
··· 23 23 CFLAGS_tsc.o := $(nostackp) 24 24 25 25 obj-y := process_$(BITS).o signal_$(BITS).o entry_$(BITS).o 26 - obj-y += traps_$(BITS).o irq_$(BITS).o 26 + obj-y += traps.o irq_$(BITS).o dumpstack_$(BITS).o 27 27 obj-y += time_$(BITS).o ioport.o ldt.o 28 28 obj-y += setup.o i8259.o irqinit_$(BITS).o setup_percpu.o 29 29 obj-$(CONFIG_X86_VISWS) += visws_quirks.o
+1 -1
arch/x86/kernel/alternative.c
··· 444 444 _text, _etext); 445 445 446 446 /* Only switch to UP mode if we don't immediately boot others */ 447 - if (num_possible_cpus() == 1 || setup_max_cpus <= 1) 447 + if (num_present_cpus() == 1 || setup_max_cpus <= 1) 448 448 alternatives_smp_switch(0); 449 449 } 450 450 #endif
+4
arch/x86/kernel/apic_32.c
··· 295 295 * 296 296 * Vector mappings are hard coded. On K8 only offset 0 (APIC500) and 297 297 * MCE interrupts are supported. Thus MCE offset must be set to 0. 298 + * 299 + * If mask=1, the LVT entry does not generate interrupts while mask=0 300 + * enables the vector. See also the BKDGs. 298 301 */ 299 302 300 303 #define APIC_EILVT_LVTOFF_MCE 0 ··· 322 319 setup_APIC_eilvt(APIC_EILVT_LVTOFF_IBS, vector, msg_type, mask); 323 320 return APIC_EILVT_LVTOFF_IBS; 324 321 } 322 + EXPORT_SYMBOL_GPL(setup_APIC_eilvt_ibs); 325 323 326 324 /* 327 325 * Program the next event, relative to now
+4
arch/x86/kernel/apic_64.c
··· 307 307 * 308 308 * Vector mappings are hard coded. On K8 only offset 0 (APIC500) and 309 309 * MCE interrupts are supported. Thus MCE offset must be set to 0. 310 + * 311 + * If mask=1, the LVT entry does not generate interrupts while mask=0 312 + * enables the vector. See also the BKDGs. 310 313 */ 311 314 312 315 #define APIC_EILVT_LVTOFF_MCE 0 ··· 334 331 setup_APIC_eilvt(APIC_EILVT_LVTOFF_IBS, vector, msg_type, mask); 335 332 return APIC_EILVT_LVTOFF_IBS; 336 333 } 334 + EXPORT_SYMBOL_GPL(setup_APIC_eilvt_ibs); 337 335 338 336 /* 339 337 * Program the next event, relative to now
+32 -13
arch/x86/kernel/cpu/common.c
··· 124 124 { 125 125 u32 f1, f2; 126 126 127 - asm("pushfl\n\t" 128 - "pushfl\n\t" 129 - "popl %0\n\t" 130 - "movl %0,%1\n\t" 131 - "xorl %2,%0\n\t" 132 - "pushl %0\n\t" 133 - "popfl\n\t" 134 - "pushfl\n\t" 135 - "popl %0\n\t" 136 - "popfl\n\t" 137 - : "=&r" (f1), "=&r" (f2) 138 - : "ir" (flag)); 127 + /* 128 + * Cyrix and IDT cpus allow disabling of CPUID 129 + * so the code below may return different results 130 + * when it is executed before and after enabling 131 + * the CPUID. Add "volatile" to not allow gcc to 132 + * optimize the subsequent calls to this function. 133 + */ 134 + asm volatile ("pushfl\n\t" 135 + "pushfl\n\t" 136 + "popl %0\n\t" 137 + "movl %0,%1\n\t" 138 + "xorl %2,%0\n\t" 139 + "pushl %0\n\t" 140 + "popfl\n\t" 141 + "pushfl\n\t" 142 + "popl %0\n\t" 143 + "popfl\n\t" 144 + : "=&r" (f1), "=&r" (f2) 145 + : "ir" (flag)); 139 146 140 147 return ((f1^f2) & flag) != 0; 141 148 } ··· 726 719 #endif 727 720 } 728 721 722 + #ifdef CONFIG_X86_64 723 + static void vgetcpu_set_mode(void) 724 + { 725 + if (cpu_has(&boot_cpu_data, X86_FEATURE_RDTSCP)) 726 + vgetcpu_mode = VGETCPU_RDTSCP; 727 + else 728 + vgetcpu_mode = VGETCPU_LSL; 729 + } 730 + #endif 731 + 729 732 void __init identify_boot_cpu(void) 730 733 { 731 734 identify_cpu(&boot_cpu_data); 732 735 #ifdef CONFIG_X86_32 733 736 sysenter_setup(); 734 737 enable_sep_cpu(); 738 + #else 739 + vgetcpu_set_mode(); 735 740 #endif 736 741 } 737 742 ··· 816 797 else if (c->cpuid_level >= 0) 817 798 vendor = c->x86_vendor_id; 818 799 819 - if (vendor && strncmp(c->x86_model_id, vendor, strlen(vendor))) 800 + if (vendor && !strstr(c->x86_model_id, vendor)) 820 801 printk(KERN_CONT "%s ", vendor); 821 802 822 803 if (c->x86_model_id[0])
+1 -1
arch/x86/kernel/doublefault_32.c
··· 66 66 .ds = __USER_DS, 67 67 .fs = __KERNEL_PERCPU, 68 68 69 - .__cr3 = __phys_addr_const((unsigned long)swapper_pg_dir) 69 + .__cr3 = __pa_nodebug(swapper_pg_dir), 70 70 } 71 71 };
+447
arch/x86/kernel/dumpstack_32.c
··· 1 + /* 2 + * Copyright (C) 1991, 1992 Linus Torvalds 3 + * Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs 4 + */ 5 + #include <linux/kallsyms.h> 6 + #include <linux/kprobes.h> 7 + #include <linux/uaccess.h> 8 + #include <linux/utsname.h> 9 + #include <linux/hardirq.h> 10 + #include <linux/kdebug.h> 11 + #include <linux/module.h> 12 + #include <linux/ptrace.h> 13 + #include <linux/kexec.h> 14 + #include <linux/bug.h> 15 + #include <linux/nmi.h> 16 + 17 + #include <asm/stacktrace.h> 18 + 19 + #define STACKSLOTS_PER_LINE 8 20 + #define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :) 21 + 22 + int panic_on_unrecovered_nmi; 23 + int kstack_depth_to_print = 3 * STACKSLOTS_PER_LINE; 24 + static unsigned int code_bytes = 64; 25 + static int die_counter; 26 + 27 + void printk_address(unsigned long address, int reliable) 28 + { 29 + printk(" [<%p>] %s%pS\n", (void *) address, 30 + reliable ? "" : "? ", (void *) address); 31 + } 32 + 33 + static inline int valid_stack_ptr(struct thread_info *tinfo, 34 + void *p, unsigned int size, void *end) 35 + { 36 + void *t = tinfo; 37 + if (end) { 38 + if (p < end && p >= (end-THREAD_SIZE)) 39 + return 1; 40 + else 41 + return 0; 42 + } 43 + return p > t && p < t + THREAD_SIZE - size; 44 + } 45 + 46 + /* The form of the top of the frame on the stack */ 47 + struct stack_frame { 48 + struct stack_frame *next_frame; 49 + unsigned long return_address; 50 + }; 51 + 52 + static inline unsigned long 53 + print_context_stack(struct thread_info *tinfo, 54 + unsigned long *stack, unsigned long bp, 55 + const struct stacktrace_ops *ops, void *data, 56 + unsigned long *end) 57 + { 58 + struct stack_frame *frame = (struct stack_frame *)bp; 59 + 60 + while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) { 61 + unsigned long addr; 62 + 63 + addr = *stack; 64 + if (__kernel_text_address(addr)) { 65 + if ((unsigned long) stack == bp + sizeof(long)) { 66 + ops->address(data, addr, 1); 67 + frame = frame->next_frame; 68 + bp = (unsigned 
long) frame; 69 + } else { 70 + ops->address(data, addr, bp == 0); 71 + } 72 + } 73 + stack++; 74 + } 75 + return bp; 76 + } 77 + 78 + void dump_trace(struct task_struct *task, struct pt_regs *regs, 79 + unsigned long *stack, unsigned long bp, 80 + const struct stacktrace_ops *ops, void *data) 81 + { 82 + if (!task) 83 + task = current; 84 + 85 + if (!stack) { 86 + unsigned long dummy; 87 + stack = &dummy; 88 + if (task && task != current) 89 + stack = (unsigned long *)task->thread.sp; 90 + } 91 + 92 + #ifdef CONFIG_FRAME_POINTER 93 + if (!bp) { 94 + if (task == current) { 95 + /* Grab bp right from our regs */ 96 + get_bp(bp); 97 + } else { 98 + /* bp is the last reg pushed by switch_to */ 99 + bp = *(unsigned long *) task->thread.sp; 100 + } 101 + } 102 + #endif 103 + 104 + for (;;) { 105 + struct thread_info *context; 106 + 107 + context = (struct thread_info *) 108 + ((unsigned long)stack & (~(THREAD_SIZE - 1))); 109 + bp = print_context_stack(context, stack, bp, ops, data, NULL); 110 + 111 + stack = (unsigned long *)context->previous_esp; 112 + if (!stack) 113 + break; 114 + if (ops->stack(data, "IRQ") < 0) 115 + break; 116 + touch_nmi_watchdog(); 117 + } 118 + } 119 + EXPORT_SYMBOL(dump_trace); 120 + 121 + static void 122 + print_trace_warning_symbol(void *data, char *msg, unsigned long symbol) 123 + { 124 + printk(data); 125 + print_symbol(msg, symbol); 126 + printk("\n"); 127 + } 128 + 129 + static void print_trace_warning(void *data, char *msg) 130 + { 131 + printk("%s%s\n", (char *)data, msg); 132 + } 133 + 134 + static int print_trace_stack(void *data, char *name) 135 + { 136 + printk("%s <%s> ", (char *)data, name); 137 + return 0; 138 + } 139 + 140 + /* 141 + * Print one address/symbol entries per line. 
142 + */ 143 + static void print_trace_address(void *data, unsigned long addr, int reliable) 144 + { 145 + touch_nmi_watchdog(); 146 + printk(data); 147 + printk_address(addr, reliable); 148 + } 149 + 150 + static const struct stacktrace_ops print_trace_ops = { 151 + .warning = print_trace_warning, 152 + .warning_symbol = print_trace_warning_symbol, 153 + .stack = print_trace_stack, 154 + .address = print_trace_address, 155 + }; 156 + 157 + static void 158 + show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 159 + unsigned long *stack, unsigned long bp, char *log_lvl) 160 + { 161 + printk("%sCall Trace:\n", log_lvl); 162 + dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl); 163 + } 164 + 165 + void show_trace(struct task_struct *task, struct pt_regs *regs, 166 + unsigned long *stack, unsigned long bp) 167 + { 168 + show_trace_log_lvl(task, regs, stack, bp, ""); 169 + } 170 + 171 + static void 172 + show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, 173 + unsigned long *sp, unsigned long bp, char *log_lvl) 174 + { 175 + unsigned long *stack; 176 + int i; 177 + 178 + if (sp == NULL) { 179 + if (task) 180 + sp = (unsigned long *)task->thread.sp; 181 + else 182 + sp = (unsigned long *)&sp; 183 + } 184 + 185 + stack = sp; 186 + for (i = 0; i < kstack_depth_to_print; i++) { 187 + if (kstack_end(stack)) 188 + break; 189 + if (i && ((i % STACKSLOTS_PER_LINE) == 0)) 190 + printk("\n%s", log_lvl); 191 + printk(" %08lx", *stack++); 192 + touch_nmi_watchdog(); 193 + } 194 + printk("\n"); 195 + show_trace_log_lvl(task, regs, sp, bp, log_lvl); 196 + } 197 + 198 + void show_stack(struct task_struct *task, unsigned long *sp) 199 + { 200 + show_stack_log_lvl(task, NULL, sp, 0, ""); 201 + } 202 + 203 + /* 204 + * The architecture-independent dump_stack generator 205 + */ 206 + void dump_stack(void) 207 + { 208 + unsigned long bp = 0; 209 + unsigned long stack; 210 + 211 + #ifdef CONFIG_FRAME_POINTER 212 + if (!bp) 213 + get_bp(bp); 214 + 
#endif 215 + 216 + printk("Pid: %d, comm: %.20s %s %s %.*s\n", 217 + current->pid, current->comm, print_tainted(), 218 + init_utsname()->release, 219 + (int)strcspn(init_utsname()->version, " "), 220 + init_utsname()->version); 221 + show_trace(NULL, NULL, &stack, bp); 222 + } 223 + 224 + EXPORT_SYMBOL(dump_stack); 225 + 226 + void show_registers(struct pt_regs *regs) 227 + { 228 + int i; 229 + 230 + print_modules(); 231 + __show_regs(regs, 0); 232 + 233 + printk(KERN_EMERG "Process %.*s (pid: %d, ti=%p task=%p task.ti=%p)\n", 234 + TASK_COMM_LEN, current->comm, task_pid_nr(current), 235 + current_thread_info(), current, task_thread_info(current)); 236 + /* 237 + * When in-kernel, we also print out the stack and code at the 238 + * time of the fault.. 239 + */ 240 + if (!user_mode_vm(regs)) { 241 + unsigned int code_prologue = code_bytes * 43 / 64; 242 + unsigned int code_len = code_bytes; 243 + unsigned char c; 244 + u8 *ip; 245 + 246 + printk(KERN_EMERG "Stack:\n"); 247 + show_stack_log_lvl(NULL, regs, &regs->sp, 248 + 0, KERN_EMERG); 249 + 250 + printk(KERN_EMERG "Code: "); 251 + 252 + ip = (u8 *)regs->ip - code_prologue; 253 + if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) { 254 + /* try starting at IP */ 255 + ip = (u8 *)regs->ip; 256 + code_len = code_len - code_prologue + 1; 257 + } 258 + for (i = 0; i < code_len; i++, ip++) { 259 + if (ip < (u8 *)PAGE_OFFSET || 260 + probe_kernel_address(ip, c)) { 261 + printk(" Bad EIP value."); 262 + break; 263 + } 264 + if (ip == (u8 *)regs->ip) 265 + printk("<%02x> ", c); 266 + else 267 + printk("%02x ", c); 268 + } 269 + } 270 + printk("\n"); 271 + } 272 + 273 + int is_valid_bugaddr(unsigned long ip) 274 + { 275 + unsigned short ud2; 276 + 277 + if (ip < PAGE_OFFSET) 278 + return 0; 279 + if (probe_kernel_address((unsigned short *)ip, ud2)) 280 + return 0; 281 + 282 + return ud2 == 0x0b0f; 283 + } 284 + 285 + static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED; 286 + static int die_owner = -1; 287 + 
static unsigned int die_nest_count; 288 + 289 + unsigned __kprobes long oops_begin(void) 290 + { 291 + unsigned long flags; 292 + 293 + oops_enter(); 294 + 295 + if (die_owner != raw_smp_processor_id()) { 296 + console_verbose(); 297 + raw_local_irq_save(flags); 298 + __raw_spin_lock(&die_lock); 299 + die_owner = smp_processor_id(); 300 + die_nest_count = 0; 301 + bust_spinlocks(1); 302 + } else { 303 + raw_local_irq_save(flags); 304 + } 305 + die_nest_count++; 306 + return flags; 307 + } 308 + 309 + void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr) 310 + { 311 + bust_spinlocks(0); 312 + die_owner = -1; 313 + add_taint(TAINT_DIE); 314 + __raw_spin_unlock(&die_lock); 315 + raw_local_irq_restore(flags); 316 + 317 + if (!regs) 318 + return; 319 + 320 + if (kexec_should_crash(current)) 321 + crash_kexec(regs); 322 + if (in_interrupt()) 323 + panic("Fatal exception in interrupt"); 324 + if (panic_on_oops) 325 + panic("Fatal exception"); 326 + oops_exit(); 327 + do_exit(signr); 328 + } 329 + 330 + int __kprobes __die(const char *str, struct pt_regs *regs, long err) 331 + { 332 + unsigned short ss; 333 + unsigned long sp; 334 + 335 + printk(KERN_EMERG "%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter); 336 + #ifdef CONFIG_PREEMPT 337 + printk("PREEMPT "); 338 + #endif 339 + #ifdef CONFIG_SMP 340 + printk("SMP "); 341 + #endif 342 + #ifdef CONFIG_DEBUG_PAGEALLOC 343 + printk("DEBUG_PAGEALLOC"); 344 + #endif 345 + printk("\n"); 346 + if (notify_die(DIE_OOPS, str, regs, err, 347 + current->thread.trap_no, SIGSEGV) == NOTIFY_STOP) 348 + return 1; 349 + 350 + show_registers(regs); 351 + /* Executive summary in case the oops scrolled away */ 352 + sp = (unsigned long) (&regs->sp); 353 + savesegment(ss, ss); 354 + if (user_mode(regs)) { 355 + sp = regs->sp; 356 + ss = regs->ss & 0xffff; 357 + } 358 + printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip); 359 + print_symbol("%s", regs->ip); 360 + printk(" SS:ESP %04x:%08lx\n", ss, sp); 361 + return 0; 362 
+ } 363 + 364 + /* 365 + * This is gone through when something in the kernel has done something bad 366 + * and is about to be terminated: 367 + */ 368 + void die(const char *str, struct pt_regs *regs, long err) 369 + { 370 + unsigned long flags = oops_begin(); 371 + 372 + if (die_nest_count < 3) { 373 + report_bug(regs->ip, regs); 374 + 375 + if (__die(str, regs, err)) 376 + regs = NULL; 377 + } else { 378 + printk(KERN_EMERG "Recursive die() failure, output suppressed\n"); 379 + } 380 + 381 + oops_end(flags, regs, SIGSEGV); 382 + } 383 + 384 + static DEFINE_SPINLOCK(nmi_print_lock); 385 + 386 + void notrace __kprobes 387 + die_nmi(char *str, struct pt_regs *regs, int do_panic) 388 + { 389 + if (notify_die(DIE_NMIWATCHDOG, str, regs, 0, 2, SIGINT) == NOTIFY_STOP) 390 + return; 391 + 392 + spin_lock(&nmi_print_lock); 393 + /* 394 + * We are in trouble anyway, lets at least try 395 + * to get a message out: 396 + */ 397 + bust_spinlocks(1); 398 + printk(KERN_EMERG "%s", str); 399 + printk(" on CPU%d, ip %08lx, registers:\n", 400 + smp_processor_id(), regs->ip); 401 + show_registers(regs); 402 + if (do_panic) 403 + panic("Non maskable interrupt"); 404 + console_silent(); 405 + spin_unlock(&nmi_print_lock); 406 + bust_spinlocks(0); 407 + 408 + /* 409 + * If we are in kernel we are probably nested up pretty bad 410 + * and might aswell get out now while we still can: 411 + */ 412 + if (!user_mode_vm(regs)) { 413 + current->thread.trap_no = 2; 414 + crash_kexec(regs); 415 + } 416 + 417 + do_exit(SIGSEGV); 418 + } 419 + 420 + static int __init oops_setup(char *s) 421 + { 422 + if (!s) 423 + return -EINVAL; 424 + if (!strcmp(s, "panic")) 425 + panic_on_oops = 1; 426 + return 0; 427 + } 428 + early_param("oops", oops_setup); 429 + 430 + static int __init kstack_setup(char *s) 431 + { 432 + if (!s) 433 + return -EINVAL; 434 + kstack_depth_to_print = simple_strtoul(s, NULL, 0); 435 + return 0; 436 + } 437 + early_param("kstack", kstack_setup); 438 + 439 + static int __init 
code_bytes_setup(char *s) 440 + { 441 + code_bytes = simple_strtoul(s, NULL, 0); 442 + if (code_bytes > 8192) 443 + code_bytes = 8192; 444 + 445 + return 1; 446 + } 447 + __setup("code_bytes=", code_bytes_setup);
+573
arch/x86/kernel/dumpstack_64.c
··· 1 + /* 2 + * Copyright (C) 1991, 1992 Linus Torvalds 3 + * Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs 4 + */ 5 + #include <linux/kallsyms.h> 6 + #include <linux/kprobes.h> 7 + #include <linux/uaccess.h> 8 + #include <linux/utsname.h> 9 + #include <linux/hardirq.h> 10 + #include <linux/kdebug.h> 11 + #include <linux/module.h> 12 + #include <linux/ptrace.h> 13 + #include <linux/kexec.h> 14 + #include <linux/bug.h> 15 + #include <linux/nmi.h> 16 + 17 + #include <asm/stacktrace.h> 18 + 19 + #define STACKSLOTS_PER_LINE 4 20 + #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :) 21 + 22 + int panic_on_unrecovered_nmi; 23 + int kstack_depth_to_print = 3 * STACKSLOTS_PER_LINE; 24 + static unsigned int code_bytes = 64; 25 + static int die_counter; 26 + 27 + void printk_address(unsigned long address, int reliable) 28 + { 29 + printk(" [<%p>] %s%pS\n", (void *) address, 30 + reliable ? "" : "? ", (void *) address); 31 + } 32 + 33 + static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack, 34 + unsigned *usedp, char **idp) 35 + { 36 + static char ids[][8] = { 37 + [DEBUG_STACK - 1] = "#DB", 38 + [NMI_STACK - 1] = "NMI", 39 + [DOUBLEFAULT_STACK - 1] = "#DF", 40 + [STACKFAULT_STACK - 1] = "#SS", 41 + [MCE_STACK - 1] = "#MC", 42 + #if DEBUG_STKSZ > EXCEPTION_STKSZ 43 + [N_EXCEPTION_STACKS ... 44 + N_EXCEPTION_STACKS + DEBUG_STKSZ / EXCEPTION_STKSZ - 2] = "#DB[?]" 45 + #endif 46 + }; 47 + unsigned k; 48 + 49 + /* 50 + * Iterate over all exception stacks, and figure out whether 51 + * 'stack' is in one of them: 52 + */ 53 + for (k = 0; k < N_EXCEPTION_STACKS; k++) { 54 + unsigned long end = per_cpu(orig_ist, cpu).ist[k]; 55 + /* 56 + * Is 'stack' above this exception frame's end? 57 + * If yes then skip to the next frame. 58 + */ 59 + if (stack >= end) 60 + continue; 61 + /* 62 + * Is 'stack' above this exception frame's start address? 63 + * If yes then we found the right frame. 
64 + */ 65 + if (stack >= end - EXCEPTION_STKSZ) { 66 + /* 67 + * Make sure we only iterate through an exception 68 + * stack once. If it comes up for the second time 69 + * then there's something wrong going on - just 70 + * break out and return NULL: 71 + */ 72 + if (*usedp & (1U << k)) 73 + break; 74 + *usedp |= 1U << k; 75 + *idp = ids[k]; 76 + return (unsigned long *)end; 77 + } 78 + /* 79 + * If this is a debug stack, and if it has a larger size than 80 + * the usual exception stacks, then 'stack' might still 81 + * be within the lower portion of the debug stack: 82 + */ 83 + #if DEBUG_STKSZ > EXCEPTION_STKSZ 84 + if (k == DEBUG_STACK - 1 && stack >= end - DEBUG_STKSZ) { 85 + unsigned j = N_EXCEPTION_STACKS - 1; 86 + 87 + /* 88 + * Black magic. A large debug stack is composed of 89 + * multiple exception stack entries, which we 90 + * iterate through now. Dont look: 91 + */ 92 + do { 93 + ++j; 94 + end -= EXCEPTION_STKSZ; 95 + ids[j][4] = '1' + (j - N_EXCEPTION_STACKS); 96 + } while (stack < end - EXCEPTION_STKSZ); 97 + if (*usedp & (1U << j)) 98 + break; 99 + *usedp |= 1U << j; 100 + *idp = ids[j]; 101 + return (unsigned long *)end; 102 + } 103 + #endif 104 + } 105 + return NULL; 106 + } 107 + 108 + /* 109 + * x86-64 can have up to three kernel stacks: 110 + * process stack 111 + * interrupt stack 112 + * severe exception (double fault, nmi, stack fault, debug, mce) hardware stack 113 + */ 114 + 115 + static inline int valid_stack_ptr(struct thread_info *tinfo, 116 + void *p, unsigned int size, void *end) 117 + { 118 + void *t = tinfo; 119 + if (end) { 120 + if (p < end && p >= (end-THREAD_SIZE)) 121 + return 1; 122 + else 123 + return 0; 124 + } 125 + return p > t && p < t + THREAD_SIZE - size; 126 + } 127 + 128 + /* The form of the top of the frame on the stack */ 129 + struct stack_frame { 130 + struct stack_frame *next_frame; 131 + unsigned long return_address; 132 + }; 133 + 134 + static inline unsigned long 135 + print_context_stack(struct thread_info 
*tinfo, 136 + unsigned long *stack, unsigned long bp, 137 + const struct stacktrace_ops *ops, void *data, 138 + unsigned long *end) 139 + { 140 + struct stack_frame *frame = (struct stack_frame *)bp; 141 + 142 + while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) { 143 + unsigned long addr; 144 + 145 + addr = *stack; 146 + if (__kernel_text_address(addr)) { 147 + if ((unsigned long) stack == bp + sizeof(long)) { 148 + ops->address(data, addr, 1); 149 + frame = frame->next_frame; 150 + bp = (unsigned long) frame; 151 + } else { 152 + ops->address(data, addr, bp == 0); 153 + } 154 + } 155 + stack++; 156 + } 157 + return bp; 158 + } 159 + 160 + void dump_trace(struct task_struct *task, struct pt_regs *regs, 161 + unsigned long *stack, unsigned long bp, 162 + const struct stacktrace_ops *ops, void *data) 163 + { 164 + const unsigned cpu = get_cpu(); 165 + unsigned long *irqstack_end = (unsigned long *)cpu_pda(cpu)->irqstackptr; 166 + unsigned used = 0; 167 + struct thread_info *tinfo; 168 + 169 + if (!task) 170 + task = current; 171 + 172 + if (!stack) { 173 + unsigned long dummy; 174 + stack = &dummy; 175 + if (task && task != current) 176 + stack = (unsigned long *)task->thread.sp; 177 + } 178 + 179 + #ifdef CONFIG_FRAME_POINTER 180 + if (!bp) { 181 + if (task == current) { 182 + /* Grab bp right from our regs */ 183 + get_bp(bp); 184 + } else { 185 + /* bp is the last reg pushed by switch_to */ 186 + bp = *(unsigned long *) task->thread.sp; 187 + } 188 + } 189 + #endif 190 + 191 + /* 192 + * Print function call entries in all stacks, starting at the 193 + * current stack address. 
If the stacks consist of nested 194 + * exceptions 195 + */ 196 + tinfo = task_thread_info(task); 197 + for (;;) { 198 + char *id; 199 + unsigned long *estack_end; 200 + estack_end = in_exception_stack(cpu, (unsigned long)stack, 201 + &used, &id); 202 + 203 + if (estack_end) { 204 + if (ops->stack(data, id) < 0) 205 + break; 206 + 207 + bp = print_context_stack(tinfo, stack, bp, ops, 208 + data, estack_end); 209 + ops->stack(data, "<EOE>"); 210 + /* 211 + * We link to the next stack via the 212 + * second-to-last pointer (index -2 to end) in the 213 + * exception stack: 214 + */ 215 + stack = (unsigned long *) estack_end[-2]; 216 + continue; 217 + } 218 + if (irqstack_end) { 219 + unsigned long *irqstack; 220 + irqstack = irqstack_end - 221 + (IRQSTACKSIZE - 64) / sizeof(*irqstack); 222 + 223 + if (stack >= irqstack && stack < irqstack_end) { 224 + if (ops->stack(data, "IRQ") < 0) 225 + break; 226 + bp = print_context_stack(tinfo, stack, bp, 227 + ops, data, irqstack_end); 228 + /* 229 + * We link to the next stack (which would be 230 + * the process stack normally) the last 231 + * pointer (index -1 to end) in the IRQ stack: 232 + */ 233 + stack = (unsigned long *) (irqstack_end[-1]); 234 + irqstack_end = NULL; 235 + ops->stack(data, "EOI"); 236 + continue; 237 + } 238 + } 239 + break; 240 + } 241 + 242 + /* 243 + * This handles the process stack: 244 + */ 245 + bp = print_context_stack(tinfo, stack, bp, ops, data, NULL); 246 + put_cpu(); 247 + } 248 + EXPORT_SYMBOL(dump_trace); 249 + 250 + static void 251 + print_trace_warning_symbol(void *data, char *msg, unsigned long symbol) 252 + { 253 + printk(data); 254 + print_symbol(msg, symbol); 255 + printk("\n"); 256 + } 257 + 258 + static void print_trace_warning(void *data, char *msg) 259 + { 260 + printk("%s%s\n", (char *)data, msg); 261 + } 262 + 263 + static int print_trace_stack(void *data, char *name) 264 + { 265 + printk("%s <%s> ", (char *)data, name); 266 + return 0; 267 + } 268 + 269 + /* 270 + * Print one 
address/symbol entries per line. 271 + */ 272 + static void print_trace_address(void *data, unsigned long addr, int reliable) 273 + { 274 + touch_nmi_watchdog(); 275 + printk(data); 276 + printk_address(addr, reliable); 277 + } 278 + 279 + static const struct stacktrace_ops print_trace_ops = { 280 + .warning = print_trace_warning, 281 + .warning_symbol = print_trace_warning_symbol, 282 + .stack = print_trace_stack, 283 + .address = print_trace_address, 284 + }; 285 + 286 + static void 287 + show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 288 + unsigned long *stack, unsigned long bp, char *log_lvl) 289 + { 290 + printk("%sCall Trace:\n", log_lvl); 291 + dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl); 292 + } 293 + 294 + void show_trace(struct task_struct *task, struct pt_regs *regs, 295 + unsigned long *stack, unsigned long bp) 296 + { 297 + show_trace_log_lvl(task, regs, stack, bp, ""); 298 + } 299 + 300 + static void 301 + show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, 302 + unsigned long *sp, unsigned long bp, char *log_lvl) 303 + { 304 + unsigned long *stack; 305 + int i; 306 + const int cpu = smp_processor_id(); 307 + unsigned long *irqstack_end = 308 + (unsigned long *) (cpu_pda(cpu)->irqstackptr); 309 + unsigned long *irqstack = 310 + (unsigned long *) (cpu_pda(cpu)->irqstackptr - IRQSTACKSIZE); 311 + 312 + /* 313 + * debugging aid: "show_stack(NULL, NULL);" prints the 314 + * back trace for this cpu. 
315 + */ 316 + 317 + if (sp == NULL) { 318 + if (task) 319 + sp = (unsigned long *)task->thread.sp; 320 + else 321 + sp = (unsigned long *)&sp; 322 + } 323 + 324 + stack = sp; 325 + for (i = 0; i < kstack_depth_to_print; i++) { 326 + if (stack >= irqstack && stack <= irqstack_end) { 327 + if (stack == irqstack_end) { 328 + stack = (unsigned long *) (irqstack_end[-1]); 329 + printk(" <EOI> "); 330 + } 331 + } else { 332 + if (((long) stack & (THREAD_SIZE-1)) == 0) 333 + break; 334 + } 335 + if (i && ((i % STACKSLOTS_PER_LINE) == 0)) 336 + printk("\n%s", log_lvl); 337 + printk(" %016lx", *stack++); 338 + touch_nmi_watchdog(); 339 + } 340 + printk("\n"); 341 + show_trace_log_lvl(task, regs, sp, bp, log_lvl); 342 + } 343 + 344 + void show_stack(struct task_struct *task, unsigned long *sp) 345 + { 346 + show_stack_log_lvl(task, NULL, sp, 0, ""); 347 + } 348 + 349 + /* 350 + * The architecture-independent dump_stack generator 351 + */ 352 + void dump_stack(void) 353 + { 354 + unsigned long bp = 0; 355 + unsigned long stack; 356 + 357 + #ifdef CONFIG_FRAME_POINTER 358 + if (!bp) 359 + get_bp(bp); 360 + #endif 361 + 362 + printk("Pid: %d, comm: %.20s %s %s %.*s\n", 363 + current->pid, current->comm, print_tainted(), 364 + init_utsname()->release, 365 + (int)strcspn(init_utsname()->version, " "), 366 + init_utsname()->version); 367 + show_trace(NULL, NULL, &stack, bp); 368 + } 369 + EXPORT_SYMBOL(dump_stack); 370 + 371 + void show_registers(struct pt_regs *regs) 372 + { 373 + int i; 374 + unsigned long sp; 375 + const int cpu = smp_processor_id(); 376 + struct task_struct *cur = cpu_pda(cpu)->pcurrent; 377 + 378 + sp = regs->sp; 379 + printk("CPU %d ", cpu); 380 + __show_regs(regs, 1); 381 + printk("Process %s (pid: %d, threadinfo %p, task %p)\n", 382 + cur->comm, cur->pid, task_thread_info(cur), cur); 383 + 384 + /* 385 + * When in-kernel, we also print out the stack and code at the 386 + * time of the fault.. 
387 + */ 388 + if (!user_mode(regs)) { 389 + unsigned int code_prologue = code_bytes * 43 / 64; 390 + unsigned int code_len = code_bytes; 391 + unsigned char c; 392 + u8 *ip; 393 + 394 + printk(KERN_EMERG "Stack:\n"); 395 + show_stack_log_lvl(NULL, regs, (unsigned long *)sp, 396 + regs->bp, KERN_EMERG); 397 + 398 + printk(KERN_EMERG "Code: "); 399 + 400 + ip = (u8 *)regs->ip - code_prologue; 401 + if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) { 402 + /* try starting at IP */ 403 + ip = (u8 *)regs->ip; 404 + code_len = code_len - code_prologue + 1; 405 + } 406 + for (i = 0; i < code_len; i++, ip++) { 407 + if (ip < (u8 *)PAGE_OFFSET || 408 + probe_kernel_address(ip, c)) { 409 + printk(" Bad RIP value."); 410 + break; 411 + } 412 + if (ip == (u8 *)regs->ip) 413 + printk("<%02x> ", c); 414 + else 415 + printk("%02x ", c); 416 + } 417 + } 418 + printk("\n"); 419 + } 420 + 421 + int is_valid_bugaddr(unsigned long ip) 422 + { 423 + unsigned short ud2; 424 + 425 + if (__copy_from_user(&ud2, (const void __user *) ip, sizeof(ud2))) 426 + return 0; 427 + 428 + return ud2 == 0x0b0f; 429 + } 430 + 431 + static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED; 432 + static int die_owner = -1; 433 + static unsigned int die_nest_count; 434 + 435 + unsigned __kprobes long oops_begin(void) 436 + { 437 + int cpu; 438 + unsigned long flags; 439 + 440 + oops_enter(); 441 + 442 + /* racy, but better than risking deadlock. */ 443 + raw_local_irq_save(flags); 444 + cpu = smp_processor_id(); 445 + if (!__raw_spin_trylock(&die_lock)) { 446 + if (cpu == die_owner) 447 + /* nested oops. 
should stop eventually */; 448 + else 449 + __raw_spin_lock(&die_lock); 450 + } 451 + die_nest_count++; 452 + die_owner = cpu; 453 + console_verbose(); 454 + bust_spinlocks(1); 455 + return flags; 456 + } 457 + 458 + void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr) 459 + { 460 + die_owner = -1; 461 + bust_spinlocks(0); 462 + die_nest_count--; 463 + if (!die_nest_count) 464 + /* Nest count reaches zero, release the lock. */ 465 + __raw_spin_unlock(&die_lock); 466 + raw_local_irq_restore(flags); 467 + if (!regs) { 468 + oops_exit(); 469 + return; 470 + } 471 + if (in_interrupt()) 472 + panic("Fatal exception in interrupt"); 473 + if (panic_on_oops) 474 + panic("Fatal exception"); 475 + oops_exit(); 476 + do_exit(signr); 477 + } 478 + 479 + int __kprobes __die(const char *str, struct pt_regs *regs, long err) 480 + { 481 + printk(KERN_EMERG "%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter); 482 + #ifdef CONFIG_PREEMPT 483 + printk("PREEMPT "); 484 + #endif 485 + #ifdef CONFIG_SMP 486 + printk("SMP "); 487 + #endif 488 + #ifdef CONFIG_DEBUG_PAGEALLOC 489 + printk("DEBUG_PAGEALLOC"); 490 + #endif 491 + printk("\n"); 492 + if (notify_die(DIE_OOPS, str, regs, err, 493 + current->thread.trap_no, SIGSEGV) == NOTIFY_STOP) 494 + return 1; 495 + 496 + show_registers(regs); 497 + add_taint(TAINT_DIE); 498 + /* Executive summary in case the oops scrolled away */ 499 + printk(KERN_ALERT "RIP "); 500 + printk_address(regs->ip, 1); 501 + printk(" RSP <%016lx>\n", regs->sp); 502 + if (kexec_should_crash(current)) 503 + crash_kexec(regs); 504 + return 0; 505 + } 506 + 507 + void die(const char *str, struct pt_regs *regs, long err) 508 + { 509 + unsigned long flags = oops_begin(); 510 + 511 + if (!user_mode(regs)) 512 + report_bug(regs->ip, regs); 513 + 514 + if (__die(str, regs, err)) 515 + regs = NULL; 516 + oops_end(flags, regs, SIGSEGV); 517 + } 518 + 519 + notrace __kprobes void 520 + die_nmi(char *str, struct pt_regs *regs, int do_panic) 521 + { 
522 + unsigned long flags; 523 + 524 + if (notify_die(DIE_NMIWATCHDOG, str, regs, 0, 2, SIGINT) == NOTIFY_STOP) 525 + return; 526 + 527 + flags = oops_begin(); 528 + /* 529 + * We are in trouble anyway, lets at least try 530 + * to get a message out. 531 + */ 532 + printk(KERN_EMERG "%s", str); 533 + printk(" on CPU%d, ip %08lx, registers:\n", 534 + smp_processor_id(), regs->ip); 535 + show_registers(regs); 536 + if (kexec_should_crash(current)) 537 + crash_kexec(regs); 538 + if (do_panic || panic_on_oops) 539 + panic("Non maskable interrupt"); 540 + oops_end(flags, NULL, SIGBUS); 541 + nmi_exit(); 542 + local_irq_enable(); 543 + do_exit(SIGBUS); 544 + } 545 + 546 + static int __init oops_setup(char *s) 547 + { 548 + if (!s) 549 + return -EINVAL; 550 + if (!strcmp(s, "panic")) 551 + panic_on_oops = 1; 552 + return 0; 553 + } 554 + early_param("oops", oops_setup); 555 + 556 + static int __init kstack_setup(char *s) 557 + { 558 + if (!s) 559 + return -EINVAL; 560 + kstack_depth_to_print = simple_strtoul(s, NULL, 0); 561 + return 0; 562 + } 563 + early_param("kstack", kstack_setup); 564 + 565 + static int __init code_bytes_setup(char *s) 566 + { 567 + code_bytes = simple_strtoul(s, NULL, 0); 568 + if (code_bytes > 8192) 569 + code_bytes = 8192; 570 + 571 + return 1; 572 + } 573 + __setup("code_bytes=", code_bytes_setup);
+8 -14
arch/x86/kernel/entry_32.S
··· 730 730 movl $(__USER_DS), %ecx 731 731 movl %ecx, %ds 732 732 movl %ecx, %es 733 + TRACE_IRQS_OFF 733 734 movl %esp,%eax # pt_regs pointer 734 735 call *%edi 735 736 jmp ret_from_exception ··· 761 760 RING0_INT_FRAME 762 761 pushl $-1 # mark this as an int 763 762 CFI_ADJUST_CFA_OFFSET 4 764 - SAVE_ALL 765 - GET_CR0_INTO_EAX 766 - testl $0x4, %eax # EM (math emulation bit) 767 - jne device_not_available_emulate 768 - preempt_stop(CLBR_ANY) 769 - call math_state_restore 770 - jmp ret_from_exception 771 - device_not_available_emulate: 772 - pushl $0 # temporary storage for ORIG_EIP 763 + pushl $do_device_not_available 773 764 CFI_ADJUST_CFA_OFFSET 4 774 - call math_emulate 775 - addl $4, %esp 776 - CFI_ADJUST_CFA_OFFSET -4 777 - jmp ret_from_exception 765 + jmp error_code 778 766 CFI_ENDPROC 779 767 END(device_not_available) 780 768 ··· 804 814 pushl $-1 # mark this as an int 805 815 CFI_ADJUST_CFA_OFFSET 4 806 816 SAVE_ALL 817 + TRACE_IRQS_OFF 807 818 xorl %edx,%edx # error code 0 808 819 movl %esp,%eax # pt_regs pointer 809 820 call do_debug ··· 849 858 pushl %eax 850 859 CFI_ADJUST_CFA_OFFSET 4 851 860 SAVE_ALL 861 + TRACE_IRQS_OFF 852 862 xorl %edx,%edx # zero error code 853 863 movl %esp,%eax # pt_regs pointer 854 864 call do_nmi ··· 890 898 pushl %eax 891 899 CFI_ADJUST_CFA_OFFSET 4 892 900 SAVE_ALL 901 + TRACE_IRQS_OFF 893 902 FIXUP_ESPFIX_STACK # %eax == %esp 894 903 xorl %edx,%edx # zero error code 895 904 call do_nmi ··· 921 928 pushl $-1 # mark this as an int 922 929 CFI_ADJUST_CFA_OFFSET 4 923 930 SAVE_ALL 931 + TRACE_IRQS_OFF 924 932 xorl %edx,%edx # zero error code 925 933 movl %esp,%eax # pt_regs pointer 926 934 call do_int3 ··· 1024 1030 RING0_INT_FRAME 1025 1031 pushl $0 1026 1032 CFI_ADJUST_CFA_OFFSET 4 1027 - pushl machine_check_vector 1033 + pushl $do_machine_check 1028 1034 CFI_ADJUST_CFA_OFFSET 4 1029 1035 jmp error_code 1030 1036 CFI_ENDPROC
+13 -2
arch/x86/kernel/entry_64.S
··· 667 667 SAVE_ARGS 668 668 leaq -ARGOFFSET(%rsp),%rdi # arg1 for handler 669 669 pushq %rbp 670 + /* 671 + * Save rbp twice: One is for marking the stack frame, as usual, and the 672 + * other, to fill pt_regs properly. This is because bx comes right 673 + * before the last saved register in that structure, and not bp. If the 674 + * base pointer were in the place bx is today, this would not be needed. 675 + */ 676 + movq %rbp, -8(%rsp) 670 677 CFI_ADJUST_CFA_OFFSET 8 671 678 CFI_REL_OFFSET rbp, 0 672 679 movq %rsp,%rbp ··· 939 932 .if \ist 940 933 movq %gs:pda_data_offset, %rbp 941 934 .endif 935 + .if \irqtrace 936 + TRACE_IRQS_OFF 937 + .endif 942 938 movq %rsp,%rdi 943 939 movq ORIG_RAX(%rsp),%rsi 944 940 movq $-1,ORIG_RAX(%rsp) ··· 1068 1058 je error_kernelspace 1069 1059 error_swapgs: 1070 1060 SWAPGS 1071 - error_sti: 1061 + error_sti: 1062 + TRACE_IRQS_OFF 1072 1063 movq %rdi,RDI(%rsp) 1073 1064 CFI_REL_OFFSET rdi,RDI 1074 1065 movq %rsp,%rdi ··· 1243 1232 END(simd_coprocessor_error) 1244 1233 1245 1234 ENTRY(device_not_available) 1246 - zeroentry math_state_restore 1235 + zeroentry do_device_not_available 1247 1236 END(device_not_available) 1248 1237 1249 1238 /* runs on exception stack */
+23 -5
arch/x86/kernel/es7000_32.c
··· 109 109 }; 110 110 111 111 extern int find_unisys_acpi_oem_table(unsigned long *oem_addr); 112 + extern void unmap_unisys_acpi_oem_table(unsigned long oem_addr); 112 113 #endif 113 114 114 115 struct mip_reg { ··· 244 243 } 245 244 246 245 #ifdef CONFIG_ACPI 247 - int __init 248 - find_unisys_acpi_oem_table(unsigned long *oem_addr) 246 + static unsigned long oem_addrX; 247 + static unsigned long oem_size; 248 + int __init find_unisys_acpi_oem_table(unsigned long *oem_addr) 249 249 { 250 250 struct acpi_table_header *header = NULL; 251 251 int i = 0; 252 - while (ACPI_SUCCESS(acpi_get_table("OEM1", i++, &header))) { 252 + acpi_size tbl_size; 253 + 254 + while (ACPI_SUCCESS(acpi_get_table_with_size("OEM1", i++, &header, &tbl_size))) { 253 255 if (!memcmp((char *) &header->oem_id, "UNISYS", 6)) { 254 256 struct oem_table *t = (struct oem_table *)header; 255 - *oem_addr = (unsigned long)__acpi_map_table(t->OEMTableAddr, 256 - t->OEMTableSize); 257 + 258 + oem_addrX = t->OEMTableAddr; 259 + oem_size = t->OEMTableSize; 260 + early_acpi_os_unmap_memory(header, tbl_size); 261 + 262 + *oem_addr = (unsigned long)__acpi_map_table(oem_addrX, 263 + oem_size); 257 264 return 0; 258 265 } 266 + early_acpi_os_unmap_memory(header, tbl_size); 259 267 } 260 268 return -1; 269 + } 270 + 271 + void __init unmap_unisys_acpi_oem_table(unsigned long oem_addr) 272 + { 273 + if (!oem_addr) 274 + return; 275 + 276 + __acpi_unmap_table((char *)oem_addr, oem_size); 261 277 } 262 278 #endif 263 279
+11 -12
arch/x86/kernel/genx2apic_uv_x.c
··· 114 114 unsigned long val, apicid, lapicid; 115 115 int pnode; 116 116 117 - apicid = per_cpu(x86_cpu_to_apicid, cpu); /* ZZZ - cache node-local ? */ 117 + apicid = per_cpu(x86_cpu_to_apicid, cpu); 118 118 lapicid = apicid & 0x3f; /* ZZZ macro needed */ 119 119 pnode = uv_apicid_to_pnode(apicid); 120 120 val = ··· 202 202 return uv_read_apic_id() >> index_msb; 203 203 } 204 204 205 - #ifdef ZZZ /* Needs x2apic patch */ 206 205 static void uv_send_IPI_self(int vector) 207 206 { 208 207 apic_write(APIC_SELF_IPI, vector); 209 208 } 210 - #endif 211 209 212 210 struct genapic apic_x2apic_uv_x = { 213 211 .name = "UV large system", ··· 213 215 .int_delivery_mode = dest_Fixed, 214 216 .int_dest_mode = (APIC_DEST_PHYSICAL != 0), 215 217 .target_cpus = uv_target_cpus, 216 - .vector_allocation_domain = uv_vector_allocation_domain,/* Fixme ZZZ */ 218 + .vector_allocation_domain = uv_vector_allocation_domain, 217 219 .apic_id_registered = uv_apic_id_registered, 218 220 .init_apic_ldr = uv_init_apic_ldr, 219 221 .send_IPI_all = uv_send_IPI_all, 220 222 .send_IPI_allbutself = uv_send_IPI_allbutself, 221 223 .send_IPI_mask = uv_send_IPI_mask, 222 - /* ZZZ.send_IPI_self = uv_send_IPI_self, */ 224 + .send_IPI_self = uv_send_IPI_self, 223 225 .cpu_mask_to_apicid = uv_cpu_mask_to_apicid, 224 - .phys_pkg_id = phys_pkg_id, /* Fixme ZZZ */ 226 + .phys_pkg_id = phys_pkg_id, 225 227 .get_apic_id = get_apic_id, 226 228 .set_apic_id = set_apic_id, 227 229 .apic_id_mask = (0xFFFFFFFFu), ··· 284 286 285 287 enum map_type {map_wb, map_uc}; 286 288 287 - static __init void map_high(char *id, unsigned long base, int shift, enum map_type map_type) 289 + static __init void map_high(char *id, unsigned long base, int shift, 290 + int max_pnode, enum map_type map_type) 288 291 { 289 292 unsigned long bytes, paddr; 290 293 291 294 paddr = base << shift; 292 - bytes = (1UL << shift); 295 + bytes = (1UL << shift) * (max_pnode + 1); 293 296 printk(KERN_INFO "UV: Map %s_HI 0x%lx - 0x%lx\n", id, 
paddr, 294 297 paddr + bytes); 295 298 if (map_type == map_uc) ··· 306 307 307 308 gru.v = uv_read_local_mmr(UVH_RH_GAM_GRU_OVERLAY_CONFIG_MMR); 308 309 if (gru.s.enable) 309 - map_high("GRU", gru.s.base, shift, map_wb); 310 + map_high("GRU", gru.s.base, shift, max_pnode, map_wb); 310 311 } 311 312 312 313 static __init void map_config_high(int max_pnode) ··· 316 317 317 318 cfg.v = uv_read_local_mmr(UVH_RH_GAM_CFG_OVERLAY_CONFIG_MMR); 318 319 if (cfg.s.enable) 319 - map_high("CONFIG", cfg.s.base, shift, map_uc); 320 + map_high("CONFIG", cfg.s.base, shift, max_pnode, map_uc); 320 321 } 321 322 322 323 static __init void map_mmr_high(int max_pnode) ··· 326 327 327 328 mmr.v = uv_read_local_mmr(UVH_RH_GAM_MMR_OVERLAY_CONFIG_MMR); 328 329 if (mmr.s.enable) 329 - map_high("MMR", mmr.s.base, shift, map_uc); 330 + map_high("MMR", mmr.s.base, shift, max_pnode, map_uc); 330 331 } 331 332 332 333 static __init void map_mmioh_high(int max_pnode) ··· 336 337 337 338 mmioh.v = uv_read_local_mmr(UVH_RH_GAM_MMIOH_OVERLAY_CONFIG_MMR); 338 339 if (mmioh.s.enable) 339 - map_high("MMIOH", mmioh.s.base, shift, map_uc); 340 + map_high("MMIOH", mmioh.s.base, shift, max_pnode, map_uc); 340 341 } 341 342 342 343 static __init void uv_rtc_init(void)
+1
arch/x86/kernel/head.c
··· 35 35 36 36 /* start of EBDA area */ 37 37 ebda_addr = get_bios_ebda(); 38 + printk(KERN_INFO "BIOS EBDA/lowmem at: %08x/%08x\n", ebda_addr, lowmem); 38 39 39 40 /* Fixup: bios puts an EBDA in the top 64K segment */ 40 41 /* of conventional memory, but does not adjust lowmem. */
+5 -1
arch/x86/kernel/hpet.c
··· 115 115 hd.hd_phys_address = hpet_address; 116 116 hd.hd_address = hpet; 117 117 hd.hd_nirqs = nrtimers; 118 - hd.hd_flags = HPET_DATA_PLATFORM; 119 118 hpet_reserve_timer(&hd, 0); 120 119 121 120 #ifdef CONFIG_HPET_EMULATE_RTC 122 121 hpet_reserve_timer(&hd, 1); 123 122 #endif 124 123 124 + /* 125 + * NOTE that hd_irq[] reflects IOAPIC input pins (LEGACY_8254 126 + * is wrong for i8259!) not the output IRQ. Many BIOS writers 127 + * don't bother configuring *any* comparator interrupts. 128 + */ 125 129 hd.hd_irq[0] = HPET_LEGACY_8254; 126 130 hd.hd_irq[1] = HPET_LEGACY_RTC; 127 131
+27 -16
arch/x86/kernel/irqinit_64.c
··· 135 135 [IRQ15_VECTOR + 1 ... NR_VECTORS - 1] = -1 136 136 }; 137 137 138 - static void __init init_ISA_irqs (void) 138 + void __init init_ISA_irqs(void) 139 139 { 140 140 int i; 141 141 ··· 164 164 165 165 void init_IRQ(void) __attribute__((weak, alias("native_init_IRQ"))); 166 166 167 - void __init native_init_IRQ(void) 167 + static void __init smp_intr_init(void) 168 168 { 169 - int i; 170 - 171 - init_ISA_irqs(); 172 - /* 173 - * Cover the whole vector space, no vector can escape 174 - * us. (some of these will be overridden and become 175 - * 'special' SMP interrupts) 176 - */ 177 - for (i = 0; i < (NR_VECTORS - FIRST_EXTERNAL_VECTOR); i++) { 178 - int vector = FIRST_EXTERNAL_VECTOR + i; 179 - if (vector != IA32_SYSCALL_VECTOR) 180 - set_intr_gate(vector, interrupt[i]); 181 - } 182 - 183 169 #ifdef CONFIG_SMP 184 170 /* 185 171 * The reschedule interrupt is a CPU-to-CPU reschedule-helper ··· 193 207 /* Low priority IPI to cleanup after moving an irq */ 194 208 set_intr_gate(IRQ_MOVE_CLEANUP_VECTOR, irq_move_cleanup_interrupt); 195 209 #endif 210 + } 211 + 212 + static void __init apic_intr_init(void) 213 + { 214 + smp_intr_init(); 215 + 196 216 alloc_intr_gate(THERMAL_APIC_VECTOR, thermal_interrupt); 197 217 alloc_intr_gate(THRESHOLD_APIC_VECTOR, threshold_interrupt); 198 218 ··· 208 216 /* IPI vectors for APIC spurious and error interrupts */ 209 217 alloc_intr_gate(SPURIOUS_APIC_VECTOR, spurious_interrupt); 210 218 alloc_intr_gate(ERROR_APIC_VECTOR, error_interrupt); 219 + } 220 + 221 + void __init native_init_IRQ(void) 222 + { 223 + int i; 224 + 225 + init_ISA_irqs(); 226 + /* 227 + * Cover the whole vector space, no vector can escape 228 + * us. 
(some of these will be overridden and become 229 + * 'special' SMP interrupts) 230 + */ 231 + for (i = 0; i < (NR_VECTORS - FIRST_EXTERNAL_VECTOR); i++) { 232 + int vector = FIRST_EXTERNAL_VECTOR + i; 233 + if (vector != IA32_SYSCALL_VECTOR) 234 + set_intr_gate(vector, interrupt[i]); 235 + } 236 + 237 + apic_intr_init(); 211 238 212 239 if (!acpi_ioapic) 213 240 setup_irq(2, &irq2);
+2 -2
arch/x86/kernel/process_32.c
··· 123 123 } 124 124 } 125 125 126 - void __show_registers(struct pt_regs *regs, int all) 126 + void __show_regs(struct pt_regs *regs, int all) 127 127 { 128 128 unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L; 129 129 unsigned long d0, d1, d2, d3, d6, d7; ··· 189 189 190 190 void show_regs(struct pt_regs *regs) 191 191 { 192 - __show_registers(regs, 1); 192 + __show_regs(regs, 1); 193 193 show_trace(NULL, regs, &regs->sp, regs->bp); 194 194 } 195 195
+5 -2
arch/x86/kernel/process_64.c
··· 136 136 } 137 137 138 138 /* Prints also some state that isn't saved in the pt_regs */ 139 - void __show_regs(struct pt_regs *regs) 139 + void __show_regs(struct pt_regs *regs, int all) 140 140 { 141 141 unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L, fs, gs, shadowgs; 142 142 unsigned long d0, d1, d2, d3, d6, d7; ··· 175 175 rdmsrl(MSR_GS_BASE, gs); 176 176 rdmsrl(MSR_KERNEL_GS_BASE, shadowgs); 177 177 178 + if (!all) 179 + return; 180 + 178 181 cr0 = read_cr0(); 179 182 cr2 = read_cr2(); 180 183 cr3 = read_cr3(); ··· 203 200 void show_regs(struct pt_regs *regs) 204 201 { 205 202 printk(KERN_INFO "CPU %d:", smp_processor_id()); 206 - __show_regs(regs); 203 + __show_regs(regs, 1); 207 204 show_trace(NULL, regs, (void *)(regs + 1), regs->bp); 208 205 } 209 206
+39 -2
arch/x86/kernel/quirks.c
··· 354 354 printk(KERN_DEBUG "Force enabled HPET at resume\n"); 355 355 } 356 356 357 + static u32 ati_ixp4x0_rev(struct pci_dev *dev) 358 + { 359 + u32 d; 360 + u8 b; 361 + 362 + pci_read_config_byte(dev, 0xac, &b); 363 + b &= ~(1<<5); 364 + pci_write_config_byte(dev, 0xac, b); 365 + pci_read_config_dword(dev, 0x70, &d); 366 + d |= 1<<8; 367 + pci_write_config_dword(dev, 0x70, d); 368 + pci_read_config_dword(dev, 0x8, &d); 369 + d &= 0xff; 370 + dev_printk(KERN_DEBUG, &dev->dev, "SB4X0 revision 0x%x\n", d); 371 + return d; 372 + } 373 + 357 374 static void ati_force_enable_hpet(struct pci_dev *dev) 358 375 { 359 - u32 uninitialized_var(val); 376 + u32 d, val; 377 + u8 b; 360 378 361 379 if (hpet_address || force_hpet_address) 362 380 return; ··· 384 366 return; 385 367 } 386 368 369 + d = ati_ixp4x0_rev(dev); 370 + if (d < 0x82) 371 + return; 372 + 373 + /* base address */ 387 374 pci_write_config_dword(dev, 0x14, 0xfed00000); 388 375 pci_read_config_dword(dev, 0x14, &val); 376 + 377 + /* enable interrupt */ 378 + outb(0x72, 0xcd6); b = inb(0xcd7); 379 + b |= 0x1; 380 + outb(0x72, 0xcd6); outb(b, 0xcd7); 381 + outb(0x72, 0xcd6); b = inb(0xcd7); 382 + if (!(b & 0x1)) 383 + return; 384 + pci_read_config_dword(dev, 0x64, &d); 385 + d |= (1<<10); 386 + pci_write_config_dword(dev, 0x64, d); 387 + pci_read_config_dword(dev, 0x64, &d); 388 + if (!(d & (1<<10))) 389 + return; 390 + 389 391 force_hpet_address = val; 390 392 force_hpet_resume_type = ATI_FORCE_HPET_RESUME; 391 393 dev_printk(KERN_DEBUG, &dev->dev, "Force enabled HPET at 0x%lx\n", 392 394 force_hpet_address); 393 395 cached_dev = dev; 394 - return; 395 396 } 396 397 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP400_SMBUS, 397 398 ati_force_enable_hpet);
+6 -4
arch/x86/kernel/setup.c
··· 302 302 if (clen > MAX_MAP_CHUNK-slop) 303 303 clen = MAX_MAP_CHUNK-slop; 304 304 mapaddr = ramdisk_image & PAGE_MASK; 305 - p = early_ioremap(mapaddr, clen+slop); 305 + p = early_memremap(mapaddr, clen+slop); 306 306 memcpy(q, p+slop, clen); 307 307 early_iounmap(p, clen+slop); 308 308 q += clen; ··· 379 379 return; 380 380 pa_data = boot_params.hdr.setup_data; 381 381 while (pa_data) { 382 - data = early_ioremap(pa_data, PAGE_SIZE); 382 + data = early_memremap(pa_data, PAGE_SIZE); 383 383 switch (data->type) { 384 384 case SETUP_E820_EXT: 385 385 parse_e820_ext(data, pa_data); ··· 402 402 return; 403 403 pa_data = boot_params.hdr.setup_data; 404 404 while (pa_data) { 405 - data = early_ioremap(pa_data, sizeof(*data)); 405 + data = early_memremap(pa_data, sizeof(*data)); 406 406 e820_update_range(pa_data, sizeof(*data)+data->len, 407 407 E820_RAM, E820_RESERVED_KERN); 408 408 found = 1; ··· 428 428 return; 429 429 pa_data = boot_params.hdr.setup_data; 430 430 while (pa_data) { 431 - data = early_ioremap(pa_data, sizeof(*data)); 431 + data = early_memremap(pa_data, sizeof(*data)); 432 432 sprintf(buf, "setup data %x", data->type); 433 433 reserve_early(pa_data, pa_data+sizeof(*data)+data->len, buf); 434 434 pa_data = data->next; ··· 997 997 * Parse the ACPI tables for possible boot-time SMP configuration. 998 998 */ 999 999 acpi_boot_table_init(); 1000 + 1001 + early_acpi_boot_init(); 1000 1002 1001 1003 #ifdef CONFIG_ACPI_NUMA 1002 1004 /*
+49 -60
arch/x86/kernel/smpboot.c
··· 334 334 * does not change while we are assigning vectors to cpus. Holding 335 335 * this lock ensures we don't half assign or remove an irq from a cpu. 336 336 */ 337 - ipi_call_lock_irq(); 337 + ipi_call_lock(); 338 338 lock_vector_lock(); 339 339 __setup_vector_irq(smp_processor_id()); 340 340 cpu_set(smp_processor_id(), cpu_online_map); 341 341 unlock_vector_lock(); 342 - ipi_call_unlock_irq(); 342 + ipi_call_unlock(); 343 343 per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE; 344 + 345 + /* enable local interrupts */ 346 + local_irq_enable(); 344 347 345 348 setup_secondary_clock(); 346 349 ··· 599 596 * Give the other CPU some time to accept the IPI. 600 597 */ 601 598 udelay(200); 602 - maxlvt = lapic_get_maxlvt(); 603 - if (maxlvt > 3) /* Due to the Pentium erratum 3AP. */ 604 - apic_write(APIC_ESR, 0); 605 - accept_status = (apic_read(APIC_ESR) & 0xEF); 599 + if (APIC_INTEGRATED(apic_version[phys_apicid])) { 600 + maxlvt = lapic_get_maxlvt(); 601 + if (maxlvt > 3) /* Due to the Pentium erratum 3AP. */ 602 + apic_write(APIC_ESR, 0); 603 + accept_status = (apic_read(APIC_ESR) & 0xEF); 604 + } 606 605 pr_debug("NMI sent.\n"); 607 606 608 607 if (send_status) ··· 1261 1256 check_nmi_watchdog(); 1262 1257 } 1263 1258 1259 + /* 1260 + * cpu_possible_map should be static, it cannot change as cpu's 1261 + * are onlined, or offlined. The reason is per-cpu data-structures 1262 + * are allocated by some modules at init time, and dont expect to 1263 + * do this dynamically on cpu arrival/departure. 1264 + * cpu_present_map on the other hand can change dynamically. 1265 + * In case when cpu_hotplug is not compiled, then we resort to current 1266 + * behaviour, which is cpu_possible == cpu_present. 1267 + * - Ashok Raj 1268 + * 1269 + * Three ways to find out the number of additional hotplug CPUs: 1270 + * - If the BIOS specified disabled CPUs in ACPI/mptables use that. 
1271 + * - The user can overwrite it with additional_cpus=NUM 1272 + * - Otherwise don't reserve additional CPUs. 1273 + * We do this because additional CPUs waste a lot of memory. 1274 + * -AK 1275 + */ 1276 + __init void prefill_possible_map(void) 1277 + { 1278 + int i, possible; 1279 + 1280 + /* no processor from mptable or madt */ 1281 + if (!num_processors) 1282 + num_processors = 1; 1283 + 1284 + possible = num_processors + disabled_cpus; 1285 + if (possible > NR_CPUS) 1286 + possible = NR_CPUS; 1287 + 1288 + printk(KERN_INFO "SMP: Allowing %d CPUs, %d hotplug CPUs\n", 1289 + possible, max_t(int, possible - num_processors, 0)); 1290 + 1291 + for (i = 0; i < possible; i++) 1292 + cpu_set(i, cpu_possible_map); 1293 + 1294 + nr_cpu_ids = possible; 1295 + } 1296 + 1264 1297 #ifdef CONFIG_HOTPLUG_CPU 1265 1298 1266 1299 static void remove_siblinginfo(int cpu) ··· 1322 1279 c->phys_proc_id = 0; 1323 1280 c->cpu_core_id = 0; 1324 1281 cpu_clear(cpu, cpu_sibling_setup_map); 1325 - } 1326 - 1327 - static int additional_cpus __initdata = -1; 1328 - 1329 - static __init int setup_additional_cpus(char *s) 1330 - { 1331 - return s && get_option(&s, &additional_cpus) ? 0 : -EINVAL; 1332 - } 1333 - early_param("additional_cpus", setup_additional_cpus); 1334 - 1335 - /* 1336 - * cpu_possible_map should be static, it cannot change as cpu's 1337 - * are onlined, or offlined. The reason is per-cpu data-structures 1338 - * are allocated by some modules at init time, and dont expect to 1339 - * do this dynamically on cpu arrival/departure. 1340 - * cpu_present_map on the other hand can change dynamically. 1341 - * In case when cpu_hotplug is not compiled, then we resort to current 1342 - * behaviour, which is cpu_possible == cpu_present. 1343 - * - Ashok Raj 1344 - * 1345 - * Three ways to find out the number of additional hotplug CPUs: 1346 - * - If the BIOS specified disabled CPUs in ACPI/mptables use that. 
1347 - * - The user can overwrite it with additional_cpus=NUM 1348 - * - Otherwise don't reserve additional CPUs. 1349 - * We do this because additional CPUs waste a lot of memory. 1350 - * -AK 1351 - */ 1352 - __init void prefill_possible_map(void) 1353 - { 1354 - int i; 1355 - int possible; 1356 - 1357 - /* no processor from mptable or madt */ 1358 - if (!num_processors) 1359 - num_processors = 1; 1360 - 1361 - if (additional_cpus == -1) { 1362 - if (disabled_cpus > 0) 1363 - additional_cpus = disabled_cpus; 1364 - else 1365 - additional_cpus = 0; 1366 - } 1367 - 1368 - possible = num_processors + additional_cpus; 1369 - if (possible > NR_CPUS) 1370 - possible = NR_CPUS; 1371 - 1372 - printk(KERN_INFO "SMP: Allowing %d CPUs, %d hotplug CPUs\n", 1373 - possible, max_t(int, possible - num_processors, 0)); 1374 - 1375 - for (i = 0; i < possible; i++) 1376 - cpu_set(i, cpu_possible_map); 1377 - 1378 - nr_cpu_ids = possible; 1379 1282 } 1380 1283 1381 1284 static void __ref remove_cpu_from_maps(int cpu)
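The replacement `prefill_possible_map()` above drops the `additional_cpus=` parameter and sizes `cpu_possible_map` directly from the firmware-reported processor count plus BIOS-disabled (hotpluggable) CPUs, clamped to `NR_CPUS`. A minimal userspace sketch of that sizing rule — `count_possible_cpus` and `NR_CPUS_DEMO` are our stand-ins, not kernel symbols:

```c
#include <assert.h>

/* Demo stand-in for the kernel's compile-time NR_CPUS cap (assumption). */
#define NR_CPUS_DEMO 8

/* Mirrors the sizing rule in the new prefill_possible_map():
 * possible = reported processors + BIOS-disabled (hotplug) CPUs,
 * with at least one CPU and never more than the compile-time cap. */
static int count_possible_cpus(int num_processors, int disabled_cpus)
{
	int possible;

	if (num_processors <= 0)	/* no processor from MP table or MADT */
		num_processors = 1;

	possible = num_processors + disabled_cpus;
	if (possible > NR_CPUS_DEMO)
		possible = NR_CPUS_DEMO;

	return possible;
}
```

The clamp matters because, as the comment notes, every possible CPU costs per-cpu memory at boot whether or not it ever comes online.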
+4 -3
arch/x86/kernel/time_32.c
··· 47 47 unsigned long pc = instruction_pointer(regs); 48 48 49 49 #ifdef CONFIG_SMP 50 - if (!v8086_mode(regs) && SEGMENT_IS_KERNEL_CODE(regs->cs) && 51 - in_lock_functions(pc)) { 50 + if (!user_mode_vm(regs) && in_lock_functions(pc)) { 52 51 #ifdef CONFIG_FRAME_POINTER 53 - return *(unsigned long *)(regs->bp + 4); 52 + return *(unsigned long *)(regs->bp + sizeof(long)); 54 53 #else 55 54 unsigned long *sp = (unsigned long *)&regs->sp; 56 55 ··· 94 95 95 96 do_timer_interrupt_hook(); 96 97 98 + #ifdef CONFIG_MCA 97 99 if (MCA_bus) { 98 100 /* The PS/2 uses level-triggered interrupts. You can't 99 101 turn them off, nor would you want to (any attempt to ··· 108 108 u8 irq_v = inb_p( 0x61 ); /* read the current state */ 109 109 outb_p( irq_v|0x80, 0x61 ); /* reset the IRQ */ 110 110 } 111 + #endif 111 112 112 113 return IRQ_HANDLED; 113 114 }
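The `profile_pc()` change above replaces the hard-coded `regs->bp + 4` with `regs->bp + sizeof(long)`, since with frame pointers the caller's return address sits one word above the saved frame pointer on both 32- and 64-bit stacks. A toy model of that frame layout — `demo_frame` and `caller_pc` are our illustrative names, not kernel code:

```c
#include <assert.h>

/* Toy model of an x86 stack frame with frame pointers enabled:
 * the address in %ebp/%rbp holds the caller's saved frame pointer,
 * and the return address lives one word above it -- which is why the
 * unified profile_pc() reads *(bp + sizeof(long)) rather than bp + 4. */
struct demo_frame {
	unsigned long saved_bp;		/* what *(bp) points at        */
	unsigned long return_address;	/* *(bp + sizeof(long))        */
};

static unsigned long caller_pc(const unsigned long *bp)
{
	return *(const unsigned long *)((const char *)bp + sizeof(long));
}
```

Using `sizeof(long)` lets the same expression serve both the i386 and x86-64 variants during the unification.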
+16 -7
arch/x86/kernel/time_64.c
··· 16 16 #include <linux/interrupt.h> 17 17 #include <linux/module.h> 18 18 #include <linux/time.h> 19 + #include <linux/mca.h> 19 20 20 21 #include <asm/i8253.h> 21 22 #include <asm/hpet.h> ··· 34 33 /* Assume the lock function has either no stack frame or a copy 35 34 of flags from PUSHF 36 35 Eflags always has bits 22 and up cleared unlike kernel addresses. */ 37 - if (!user_mode(regs) && in_lock_functions(pc)) { 36 + if (!user_mode_vm(regs) && in_lock_functions(pc)) { 37 + #ifdef CONFIG_FRAME_POINTER 38 + return *(unsigned long *)(regs->bp + sizeof(long)); 39 + #else 38 40 unsigned long *sp = (unsigned long *)regs->sp; 39 41 if (sp[0] >> 22) 40 42 return sp[0]; 41 43 if (sp[1] >> 22) 42 44 return sp[1]; 45 + #endif 43 46 } 44 47 return pc; 45 48 } 46 49 EXPORT_SYMBOL(profile_pc); 47 50 48 - static irqreturn_t timer_event_interrupt(int irq, void *dev_id) 51 + irqreturn_t timer_interrupt(int irq, void *dev_id) 49 52 { 50 53 add_pda(irq0_irqs, 1); 51 54 52 55 global_clock_event->event_handler(global_clock_event); 56 + 57 + #ifdef CONFIG_MCA 58 + if (MCA_bus) { 59 + u8 irq_v = inb_p(0x61); /* read the current state */ 60 + outb_p(irq_v|0x80, 0x61); /* reset the IRQ */ 61 + } 62 + #endif 53 63 54 64 return IRQ_HANDLED; 55 65 } ··· 112 100 } 113 101 114 102 static struct irqaction irq0 = { 115 - .handler = timer_event_interrupt, 103 + .handler = timer_interrupt, 116 104 .flags = IRQF_DISABLED | IRQF_IRQPOLL | IRQF_NOBALANCING, 117 105 .mask = CPU_MASK_NONE, 118 106 .name = "timer" ··· 123 111 if (!hpet_enable()) 124 112 setup_pit_timer(); 125 113 114 + irq0.mask = cpumask_of_cpu(0); 126 115 setup_irq(0, &irq0); 127 116 } 128 117 129 118 void __init time_init(void) 130 119 { 131 120 tsc_init(); 132 - if (cpu_has(&boot_cpu_data, X86_FEATURE_RDTSCP)) 133 - vgetcpu_mode = VGETCPU_RDTSCP; 134 - else 135 - vgetcpu_mode = VGETCPU_LSL; 136 121 137 122 late_time_init = choose_time_init(); 138 123 }
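The non-frame-pointer branch of the 64-bit `profile_pc()` above keeps the heuristic described in its comment: a lock function may have pushed a copy of EFLAGS ahead of the return address, and EFLAGS always has bits 22 and up clear while kernel text addresses do not. A sketch of that scan — `pick_return_address` and the fallback argument are our naming:

```c
#include <assert.h>

/* Sketch of the no-frame-pointer fallback in the x86-64 profile_pc():
 * of the first two stack words, take the first whose value survives a
 * >> 22 shift (i.e. looks like a kernel address, not a pushed EFLAGS
 * copy); otherwise fall back to the interrupted instruction pointer. */
static unsigned long pick_return_address(const unsigned long sp[2],
					 unsigned long fallback_pc)
{
	if (sp[0] >> 22)
		return sp[0];
	if (sp[1] >> 22)
		return sp[1];
	return fallback_pc;	/* neither word looks like a kernel address */
}
```

A typical EFLAGS value such as 0x246 fails the shift test, so a pushed flags word is skipped and the genuine return address in the next slot wins.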
+393 -616
arch/x86/kernel/traps_32.c arch/x86/kernel/traps.c
··· 7 7 */ 8 8 9 9 /* 10 - * 'Traps.c' handles hardware traps and faults after we have saved some 11 - * state in 'asm.s'. 10 + * Handle hardware traps and faults. 12 11 */ 13 12 #include <linux/interrupt.h> 14 13 #include <linux/kallsyms.h> 15 14 #include <linux/spinlock.h> 16 - #include <linux/highmem.h> 17 15 #include <linux/kprobes.h> 18 16 #include <linux/uaccess.h> 19 17 #include <linux/utsname.h> ··· 30 32 #include <linux/bug.h> 31 33 #include <linux/nmi.h> 32 34 #include <linux/mm.h> 35 + #include <linux/smp.h> 36 + #include <linux/io.h> 33 37 34 38 #ifdef CONFIG_EISA 35 39 #include <linux/ioport.h> ··· 46 46 #include <linux/edac.h> 47 47 #endif 48 48 49 - #include <asm/arch_hooks.h> 50 49 #include <asm/stacktrace.h> 51 50 #include <asm/processor.h> 52 51 #include <asm/debugreg.h> 53 52 #include <asm/atomic.h> 54 53 #include <asm/system.h> 55 54 #include <asm/unwind.h> 55 + #include <asm/traps.h> 56 56 #include <asm/desc.h> 57 57 #include <asm/i387.h> 58 + 59 + #include <mach_traps.h> 60 + 61 + #ifdef CONFIG_X86_64 62 + #include <asm/pgalloc.h> 63 + #include <asm/proto.h> 64 + #include <asm/pda.h> 65 + #else 66 + #include <asm/processor-flags.h> 67 + #include <asm/arch_hooks.h> 58 68 #include <asm/nmi.h> 59 69 #include <asm/smp.h> 60 70 #include <asm/io.h> 61 71 #include <asm/traps.h> 62 72 63 - #include "mach_traps.h" 73 + #include "cpu/mcheck/mce.h" 64 74 65 75 DECLARE_BITMAP(used_vectors, NR_VECTORS); 66 76 EXPORT_SYMBOL_GPL(used_vectors); ··· 87 77 */ 88 78 gate_desc idt_table[256] 89 79 __attribute__((__section__(".data.idt"))) = { { { { 0, 0 } } }, }; 80 + #endif 90 81 91 - int panic_on_unrecovered_nmi; 92 - int kstack_depth_to_print = 24; 93 - static unsigned int code_bytes = 64; 94 82 static int ignore_nmis; 95 - static int die_counter; 96 83 97 - void printk_address(unsigned long address, int reliable) 84 + static inline void conditional_sti(struct pt_regs *regs) 98 85 { 99 - #ifdef CONFIG_KALLSYMS 100 - unsigned long offset = 0; 101 - unsigned 
long symsize; 102 - const char *symname; 103 - char *modname; 104 - char *delim = ":"; 105 - char namebuf[KSYM_NAME_LEN]; 106 - char reliab[4] = ""; 107 - 108 - symname = kallsyms_lookup(address, &symsize, &offset, 109 - &modname, namebuf); 110 - if (!symname) { 111 - printk(" [<%08lx>]\n", address); 112 - return; 113 - } 114 - if (!reliable) 115 - strcpy(reliab, "? "); 116 - 117 - if (!modname) 118 - modname = delim = ""; 119 - printk(" [<%08lx>] %s%s%s%s%s+0x%lx/0x%lx\n", 120 - address, reliab, delim, modname, delim, symname, offset, symsize); 121 - #else 122 - printk(" [<%08lx>]\n", address); 123 - #endif 86 + if (regs->flags & X86_EFLAGS_IF) 87 + local_irq_enable(); 124 88 } 125 89 126 - static inline int valid_stack_ptr(struct thread_info *tinfo, 127 - void *p, unsigned int size) 90 + static inline void preempt_conditional_sti(struct pt_regs *regs) 128 91 { 129 - void *t = tinfo; 130 - return p > t && p <= t + THREAD_SIZE - size; 92 + inc_preempt_count(); 93 + if (regs->flags & X86_EFLAGS_IF) 94 + local_irq_enable(); 131 95 } 132 96 133 - /* The form of the top of the frame on the stack */ 134 - struct stack_frame { 135 - struct stack_frame *next_frame; 136 - unsigned long return_address; 137 - }; 138 - 139 - static inline unsigned long 140 - print_context_stack(struct thread_info *tinfo, 141 - unsigned long *stack, unsigned long bp, 142 - const struct stacktrace_ops *ops, void *data) 97 + static inline void preempt_conditional_cli(struct pt_regs *regs) 143 98 { 144 - struct stack_frame *frame = (struct stack_frame *)bp; 145 - 146 - while (valid_stack_ptr(tinfo, stack, sizeof(*stack))) { 147 - unsigned long addr; 148 - 149 - addr = *stack; 150 - if (__kernel_text_address(addr)) { 151 - if ((unsigned long) stack == bp + 4) { 152 - ops->address(data, addr, 1); 153 - frame = frame->next_frame; 154 - bp = (unsigned long) frame; 155 - } else { 156 - ops->address(data, addr, bp == 0); 157 - } 158 - } 159 - stack++; 160 - } 161 - return bp; 99 + if (regs->flags & 
X86_EFLAGS_IF) 100 + local_irq_disable(); 101 + dec_preempt_count(); 162 102 } 163 103 164 - void dump_trace(struct task_struct *task, struct pt_regs *regs, 165 - unsigned long *stack, unsigned long bp, 166 - const struct stacktrace_ops *ops, void *data) 167 - { 168 - if (!task) 169 - task = current; 170 - 171 - if (!stack) { 172 - unsigned long dummy; 173 - stack = &dummy; 174 - if (task != current) 175 - stack = (unsigned long *)task->thread.sp; 176 - } 177 - 178 - #ifdef CONFIG_FRAME_POINTER 179 - if (!bp) { 180 - if (task == current) { 181 - /* Grab bp right from our regs */ 182 - asm("movl %%ebp, %0" : "=r" (bp) :); 183 - } else { 184 - /* bp is the last reg pushed by switch_to */ 185 - bp = *(unsigned long *) task->thread.sp; 186 - } 187 - } 188 - #endif 189 - 190 - for (;;) { 191 - struct thread_info *context; 192 - 193 - context = (struct thread_info *) 194 - ((unsigned long)stack & (~(THREAD_SIZE - 1))); 195 - bp = print_context_stack(context, stack, bp, ops, data); 196 - /* 197 - * Should be after the line below, but somewhere 198 - * in early boot context comes out corrupted and we 199 - * can't reference it: 200 - */ 201 - if (ops->stack(data, "IRQ") < 0) 202 - break; 203 - stack = (unsigned long *)context->previous_esp; 204 - if (!stack) 205 - break; 206 - touch_nmi_watchdog(); 207 - } 208 - } 209 - EXPORT_SYMBOL(dump_trace); 210 - 211 - static void 212 - print_trace_warning_symbol(void *data, char *msg, unsigned long symbol) 213 - { 214 - printk(data); 215 - print_symbol(msg, symbol); 216 - printk("\n"); 217 - } 218 - 219 - static void print_trace_warning(void *data, char *msg) 220 - { 221 - printk("%s%s\n", (char *)data, msg); 222 - } 223 - 224 - static int print_trace_stack(void *data, char *name) 225 - { 226 - return 0; 227 - } 228 - 229 - /* 230 - * Print one address/symbol entries per line. 
231 - */ 232 - static void print_trace_address(void *data, unsigned long addr, int reliable) 233 - { 234 - printk("%s [<%08lx>] ", (char *)data, addr); 235 - if (!reliable) 236 - printk("? "); 237 - print_symbol("%s\n", addr); 238 - touch_nmi_watchdog(); 239 - } 240 - 241 - static const struct stacktrace_ops print_trace_ops = { 242 - .warning = print_trace_warning, 243 - .warning_symbol = print_trace_warning_symbol, 244 - .stack = print_trace_stack, 245 - .address = print_trace_address, 246 - }; 247 - 248 - static void 249 - show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 250 - unsigned long *stack, unsigned long bp, char *log_lvl) 251 - { 252 - dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl); 253 - printk("%s =======================\n", log_lvl); 254 - } 255 - 256 - void show_trace(struct task_struct *task, struct pt_regs *regs, 257 - unsigned long *stack, unsigned long bp) 258 - { 259 - show_trace_log_lvl(task, regs, stack, bp, ""); 260 - } 261 - 262 - static void 263 - show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, 264 - unsigned long *sp, unsigned long bp, char *log_lvl) 265 - { 266 - unsigned long *stack; 267 - int i; 268 - 269 - if (sp == NULL) { 270 - if (task) 271 - sp = (unsigned long *)task->thread.sp; 272 - else 273 - sp = (unsigned long *)&sp; 274 - } 275 - 276 - stack = sp; 277 - for (i = 0; i < kstack_depth_to_print; i++) { 278 - if (kstack_end(stack)) 279 - break; 280 - if (i && ((i % 8) == 0)) 281 - printk("\n%s ", log_lvl); 282 - printk("%08lx ", *stack++); 283 - } 284 - printk("\n%sCall Trace:\n", log_lvl); 285 - 286 - show_trace_log_lvl(task, regs, sp, bp, log_lvl); 287 - } 288 - 289 - void show_stack(struct task_struct *task, unsigned long *sp) 290 - { 291 - printk(" "); 292 - show_stack_log_lvl(task, NULL, sp, 0, ""); 293 - } 294 - 295 - /* 296 - * The architecture-independent dump_stack generator 297 - */ 298 - void dump_stack(void) 299 - { 300 - unsigned long bp = 0; 301 - unsigned long 
stack; 302 - 303 - #ifdef CONFIG_FRAME_POINTER 304 - if (!bp) 305 - asm("movl %%ebp, %0" : "=r" (bp):); 306 - #endif 307 - 308 - printk("Pid: %d, comm: %.20s %s %s %.*s\n", 309 - current->pid, current->comm, print_tainted(), 310 - init_utsname()->release, 311 - (int)strcspn(init_utsname()->version, " "), 312 - init_utsname()->version); 313 - 314 - show_trace(current, NULL, &stack, bp); 315 - } 316 - 317 - EXPORT_SYMBOL(dump_stack); 318 - 319 - void show_registers(struct pt_regs *regs) 320 - { 321 - int i; 322 - 323 - print_modules(); 324 - __show_registers(regs, 0); 325 - 326 - printk(KERN_EMERG "Process %.*s (pid: %d, ti=%p task=%p task.ti=%p)", 327 - TASK_COMM_LEN, current->comm, task_pid_nr(current), 328 - current_thread_info(), current, task_thread_info(current)); 329 - /* 330 - * When in-kernel, we also print out the stack and code at the 331 - * time of the fault.. 332 - */ 333 - if (!user_mode_vm(regs)) { 334 - unsigned int code_prologue = code_bytes * 43 / 64; 335 - unsigned int code_len = code_bytes; 336 - unsigned char c; 337 - u8 *ip; 338 - 339 - printk("\n" KERN_EMERG "Stack: "); 340 - show_stack_log_lvl(NULL, regs, &regs->sp, 0, KERN_EMERG); 341 - 342 - printk(KERN_EMERG "Code: "); 343 - 344 - ip = (u8 *)regs->ip - code_prologue; 345 - if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) { 346 - /* try starting at EIP */ 347 - ip = (u8 *)regs->ip; 348 - code_len = code_len - code_prologue + 1; 349 - } 350 - for (i = 0; i < code_len; i++, ip++) { 351 - if (ip < (u8 *)PAGE_OFFSET || 352 - probe_kernel_address(ip, c)) { 353 - printk(" Bad EIP value."); 354 - break; 355 - } 356 - if (ip == (u8 *)regs->ip) 357 - printk("<%02x> ", c); 358 - else 359 - printk("%02x ", c); 360 - } 361 - } 362 - printk("\n"); 363 - } 364 - 365 - int is_valid_bugaddr(unsigned long ip) 366 - { 367 - unsigned short ud2; 368 - 369 - if (ip < PAGE_OFFSET) 370 - return 0; 371 - if (probe_kernel_address((unsigned short *)ip, ud2)) 372 - return 0; 373 - 374 - return ud2 == 
0x0b0f; 375 - } 376 - 377 - static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED; 378 - static int die_owner = -1; 379 - static unsigned int die_nest_count; 380 - 381 - unsigned __kprobes long oops_begin(void) 382 - { 383 - unsigned long flags; 384 - 385 - oops_enter(); 386 - 387 - if (die_owner != raw_smp_processor_id()) { 388 - console_verbose(); 389 - raw_local_irq_save(flags); 390 - __raw_spin_lock(&die_lock); 391 - die_owner = smp_processor_id(); 392 - die_nest_count = 0; 393 - bust_spinlocks(1); 394 - } else { 395 - raw_local_irq_save(flags); 396 - } 397 - die_nest_count++; 398 - return flags; 399 - } 400 - 401 - void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr) 402 - { 403 - bust_spinlocks(0); 404 - die_owner = -1; 405 - add_taint(TAINT_DIE); 406 - __raw_spin_unlock(&die_lock); 407 - raw_local_irq_restore(flags); 408 - 409 - if (!regs) 410 - return; 411 - 412 - if (kexec_should_crash(current)) 413 - crash_kexec(regs); 414 - 415 - if (in_interrupt()) 416 - panic("Fatal exception in interrupt"); 417 - 418 - if (panic_on_oops) 419 - panic("Fatal exception"); 420 - 421 - oops_exit(); 422 - do_exit(signr); 423 - } 424 - 425 - int __kprobes __die(const char *str, struct pt_regs *regs, long err) 426 - { 427 - unsigned short ss; 428 - unsigned long sp; 429 - 430 - printk(KERN_EMERG "%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter); 431 - #ifdef CONFIG_PREEMPT 432 - printk("PREEMPT "); 433 - #endif 434 - #ifdef CONFIG_SMP 435 - printk("SMP "); 436 - #endif 437 - #ifdef CONFIG_DEBUG_PAGEALLOC 438 - printk("DEBUG_PAGEALLOC"); 439 - #endif 440 - printk("\n"); 441 - if (notify_die(DIE_OOPS, str, regs, err, 442 - current->thread.trap_no, SIGSEGV) == NOTIFY_STOP) 443 - return 1; 444 - 445 - show_registers(regs); 446 - /* Executive summary in case the oops scrolled away */ 447 - sp = (unsigned long) (&regs->sp); 448 - savesegment(ss, ss); 449 - if (user_mode(regs)) { 450 - sp = regs->sp; 451 - ss = regs->ss & 0xffff; 452 - } 453 - 
printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip); 454 - print_symbol("%s", regs->ip); 455 - printk(" SS:ESP %04x:%08lx\n", ss, sp); 456 - return 0; 457 - } 458 - 459 - /* 460 - * This is gone through when something in the kernel has done something bad 461 - * and is about to be terminated: 462 - */ 463 - void die(const char *str, struct pt_regs *regs, long err) 464 - { 465 - unsigned long flags = oops_begin(); 466 - 467 - if (die_nest_count < 3) { 468 - report_bug(regs->ip, regs); 469 - 470 - if (__die(str, regs, err)) 471 - regs = NULL; 472 - } else { 473 - printk(KERN_EMERG "Recursive die() failure, output suppressed\n"); 474 - } 475 - 476 - oops_end(flags, regs, SIGSEGV); 477 - } 478 - 104 + #ifdef CONFIG_X86_32 479 105 static inline void 480 106 die_if_kernel(const char *str, struct pt_regs *regs, long err) 481 107 { ··· 119 473 die(str, regs, err); 120 474 } 121 475 122 - static void __kprobes 123 - do_trap(int trapnr, int signr, char *str, int vm86, struct pt_regs *regs, 124 - long error_code, siginfo_t *info) 476 + /* 477 + * Perform the lazy TSS's I/O bitmap copy. If the TSS has an 478 + * invalid offset set (the LAZY one) and the faulting thread has 479 + * a valid I/O bitmap pointer, we copy the I/O bitmap in the TSS, 480 + * we set the offset field correctly and return 1. 481 + */ 482 + static int lazy_iobitmap_copy(void) 125 483 { 126 - struct task_struct *tsk = current; 127 - 128 - if (regs->flags & X86_VM_MASK) { 129 - if (vm86) 130 - goto vm86_trap; 131 - goto trap_signal; 132 - } 133 - 134 - if (!user_mode(regs)) 135 - goto kernel_trap; 136 - 137 - trap_signal: 138 - /* 139 - * We want error_code and trap_no set for userspace faults and 140 - * kernelspace faults which result in die(), but not 141 - * kernelspace faults which are fixed up. 
die() gives the 142 - * process no chance to handle the signal and notice the 143 - * kernel fault information, so that won't result in polluting 144 - * the information about previously queued, but not yet 145 - * delivered, faults. See also do_general_protection below. 146 - */ 147 - tsk->thread.error_code = error_code; 148 - tsk->thread.trap_no = trapnr; 149 - 150 - if (info) 151 - force_sig_info(signr, info, tsk); 152 - else 153 - force_sig(signr, tsk); 154 - return; 155 - 156 - kernel_trap: 157 - if (!fixup_exception(regs)) { 158 - tsk->thread.error_code = error_code; 159 - tsk->thread.trap_no = trapnr; 160 - die(str, regs, error_code); 161 - } 162 - return; 163 - 164 - vm86_trap: 165 - if (handle_vm86_trap((struct kernel_vm86_regs *) regs, 166 - error_code, trapnr)) 167 - goto trap_signal; 168 - return; 169 - } 170 - 171 - #define DO_ERROR(trapnr, signr, str, name) \ 172 - void do_##name(struct pt_regs *regs, long error_code) \ 173 - { \ 174 - trace_hardirqs_fixup(); \ 175 - if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \ 176 - == NOTIFY_STOP) \ 177 - return; \ 178 - do_trap(trapnr, signr, str, 0, regs, error_code, NULL); \ 179 - } 180 - 181 - #define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr, irq) \ 182 - void do_##name(struct pt_regs *regs, long error_code) \ 183 - { \ 184 - siginfo_t info; \ 185 - if (irq) \ 186 - local_irq_enable(); \ 187 - info.si_signo = signr; \ 188 - info.si_errno = 0; \ 189 - info.si_code = sicode; \ 190 - info.si_addr = (void __user *)siaddr; \ 191 - if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \ 192 - == NOTIFY_STOP) \ 193 - return; \ 194 - do_trap(trapnr, signr, str, 0, regs, error_code, &info); \ 195 - } 196 - 197 - #define DO_VM86_ERROR(trapnr, signr, str, name) \ 198 - void do_##name(struct pt_regs *regs, long error_code) \ 199 - { \ 200 - if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \ 201 - == NOTIFY_STOP) \ 202 - return; \ 203 - do_trap(trapnr, signr, str, 1, 
regs, error_code, NULL); \ 204 - } 205 - 206 - #define DO_VM86_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr) \ 207 - void do_##name(struct pt_regs *regs, long error_code) \ 208 - { \ 209 - siginfo_t info; \ 210 - info.si_signo = signr; \ 211 - info.si_errno = 0; \ 212 - info.si_code = sicode; \ 213 - info.si_addr = (void __user *)siaddr; \ 214 - trace_hardirqs_fixup(); \ 215 - if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \ 216 - == NOTIFY_STOP) \ 217 - return; \ 218 - do_trap(trapnr, signr, str, 1, regs, error_code, &info); \ 219 - } 220 - 221 - DO_VM86_ERROR_INFO(0, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->ip) 222 - #ifndef CONFIG_KPROBES 223 - DO_VM86_ERROR(3, SIGTRAP, "int3", int3) 224 - #endif 225 - DO_VM86_ERROR(4, SIGSEGV, "overflow", overflow) 226 - DO_VM86_ERROR(5, SIGSEGV, "bounds", bounds) 227 - DO_ERROR_INFO(6, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip, 0) 228 - DO_ERROR(9, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun) 229 - DO_ERROR(10, SIGSEGV, "invalid TSS", invalid_TSS) 230 - DO_ERROR(11, SIGBUS, "segment not present", segment_not_present) 231 - DO_ERROR(12, SIGBUS, "stack segment", stack_segment) 232 - DO_ERROR_INFO(17, SIGBUS, "alignment check", alignment_check, BUS_ADRALN, 0, 0) 233 - DO_ERROR_INFO(32, SIGILL, "iret exception", iret_error, ILL_BADSTK, 0, 1) 234 - 235 - void __kprobes 236 - do_general_protection(struct pt_regs *regs, long error_code) 237 - { 238 - struct task_struct *tsk; 239 484 struct thread_struct *thread; 240 485 struct tss_struct *tss; 241 486 int cpu; ··· 135 598 tss = &per_cpu(init_tss, cpu); 136 599 thread = &current->thread; 137 600 138 - /* 139 - * Perform the lazy TSS's I/O bitmap copy. If the TSS has an 140 - * invalid offset set (the LAZY one) and the faulting thread has 141 - * a valid I/O bitmap pointer, we copy the I/O bitmap in the TSS 142 - * and we set the offset field correctly. 
Then we let the CPU to 143 - * restart the faulting instruction. 144 - */ 145 601 if (tss->x86_tss.io_bitmap_base == INVALID_IO_BITMAP_OFFSET_LAZY && 146 602 thread->io_bitmap_ptr) { 147 603 memcpy(tss->io_bitmap, thread->io_bitmap_ptr, ··· 153 623 tss->io_bitmap_owner = thread; 154 624 put_cpu(); 155 625 156 - return; 626 + return 1; 157 627 } 158 628 put_cpu(); 159 629 630 + return 0; 631 + } 632 + #endif 633 + 634 + static void __kprobes 635 + do_trap(int trapnr, int signr, char *str, struct pt_regs *regs, 636 + long error_code, siginfo_t *info) 637 + { 638 + struct task_struct *tsk = current; 639 + 640 + #ifdef CONFIG_X86_32 641 + if (regs->flags & X86_VM_MASK) { 642 + /* 643 + * traps 0, 1, 3, 4, and 5 should be forwarded to vm86. 644 + * On nmi (interrupt 2), do_trap should not be called. 645 + */ 646 + if (trapnr < 6) 647 + goto vm86_trap; 648 + goto trap_signal; 649 + } 650 + #endif 651 + 652 + if (!user_mode(regs)) 653 + goto kernel_trap; 654 + 655 + #ifdef CONFIG_X86_32 656 + trap_signal: 657 + #endif 658 + /* 659 + * We want error_code and trap_no set for userspace faults and 660 + * kernelspace faults which result in die(), but not 661 + * kernelspace faults which are fixed up. die() gives the 662 + * process no chance to handle the signal and notice the 663 + * kernel fault information, so that won't result in polluting 664 + * the information about previously queued, but not yet 665 + * delivered, faults. See also do_general_protection below. 
666 + */ 667 + tsk->thread.error_code = error_code; 668 + tsk->thread.trap_no = trapnr; 669 + 670 + #ifdef CONFIG_X86_64 671 + if (show_unhandled_signals && unhandled_signal(tsk, signr) && 672 + printk_ratelimit()) { 673 + printk(KERN_INFO 674 + "%s[%d] trap %s ip:%lx sp:%lx error:%lx", 675 + tsk->comm, tsk->pid, str, 676 + regs->ip, regs->sp, error_code); 677 + print_vma_addr(" in ", regs->ip); 678 + printk("\n"); 679 + } 680 + #endif 681 + 682 + if (info) 683 + force_sig_info(signr, info, tsk); 684 + else 685 + force_sig(signr, tsk); 686 + return; 687 + 688 + kernel_trap: 689 + if (!fixup_exception(regs)) { 690 + tsk->thread.error_code = error_code; 691 + tsk->thread.trap_no = trapnr; 692 + die(str, regs, error_code); 693 + } 694 + return; 695 + 696 + #ifdef CONFIG_X86_32 697 + vm86_trap: 698 + if (handle_vm86_trap((struct kernel_vm86_regs *) regs, 699 + error_code, trapnr)) 700 + goto trap_signal; 701 + return; 702 + #endif 703 + } 704 + 705 + #define DO_ERROR(trapnr, signr, str, name) \ 706 + dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ 707 + { \ 708 + if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \ 709 + == NOTIFY_STOP) \ 710 + return; \ 711 + conditional_sti(regs); \ 712 + do_trap(trapnr, signr, str, regs, error_code, NULL); \ 713 + } 714 + 715 + #define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr) \ 716 + dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \ 717 + { \ 718 + siginfo_t info; \ 719 + info.si_signo = signr; \ 720 + info.si_errno = 0; \ 721 + info.si_code = sicode; \ 722 + info.si_addr = (void __user *)siaddr; \ 723 + if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \ 724 + == NOTIFY_STOP) \ 725 + return; \ 726 + conditional_sti(regs); \ 727 + do_trap(trapnr, signr, str, regs, error_code, &info); \ 728 + } 729 + 730 + DO_ERROR_INFO(0, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->ip) 731 + DO_ERROR(4, SIGSEGV, "overflow", overflow) 732 + DO_ERROR(5, 
SIGSEGV, "bounds", bounds) 733 + DO_ERROR_INFO(6, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip) 734 + DO_ERROR(9, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun) 735 + DO_ERROR(10, SIGSEGV, "invalid TSS", invalid_TSS) 736 + DO_ERROR(11, SIGBUS, "segment not present", segment_not_present) 737 + #ifdef CONFIG_X86_32 738 + DO_ERROR(12, SIGBUS, "stack segment", stack_segment) 739 + #endif 740 + DO_ERROR_INFO(17, SIGBUS, "alignment check", alignment_check, BUS_ADRALN, 0) 741 + 742 + #ifdef CONFIG_X86_64 743 + /* Runs on IST stack */ 744 + dotraplinkage void do_stack_segment(struct pt_regs *regs, long error_code) 745 + { 746 + if (notify_die(DIE_TRAP, "stack segment", regs, error_code, 747 + 12, SIGBUS) == NOTIFY_STOP) 748 + return; 749 + preempt_conditional_sti(regs); 750 + do_trap(12, SIGBUS, "stack segment", regs, error_code, NULL); 751 + preempt_conditional_cli(regs); 752 + } 753 + 754 + dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) 755 + { 756 + static const char str[] = "double fault"; 757 + struct task_struct *tsk = current; 758 + 759 + /* Return not checked because double check cannot be ignored */ 760 + notify_die(DIE_TRAP, str, regs, error_code, 8, SIGSEGV); 761 + 762 + tsk->thread.error_code = error_code; 763 + tsk->thread.trap_no = 8; 764 + 765 + /* This is always a kernel trap and never fixable (and thus must 766 + never return). 
*/ 767 + for (;;) 768 + die(str, regs, error_code); 769 + } 770 + #endif 771 + 772 + dotraplinkage void __kprobes 773 + do_general_protection(struct pt_regs *regs, long error_code) 774 + { 775 + struct task_struct *tsk; 776 + 777 + conditional_sti(regs); 778 + 779 + #ifdef CONFIG_X86_32 780 + if (lazy_iobitmap_copy()) { 781 + /* restart the faulting instruction */ 782 + return; 783 + } 784 + 160 785 if (regs->flags & X86_VM_MASK) 161 786 goto gp_in_vm86; 787 + #endif 162 788 163 789 tsk = current; 164 790 if (!user_mode(regs)) ··· 336 650 force_sig(SIGSEGV, tsk); 337 651 return; 338 652 653 + #ifdef CONFIG_X86_32 339 654 gp_in_vm86: 340 655 local_irq_enable(); 341 656 handle_vm86_fault((struct kernel_vm86_regs *) regs, error_code); 342 657 return; 658 + #endif 343 659 344 660 gp_in_kernel: 345 661 if (fixup_exception(regs)) ··· 378 690 printk(KERN_EMERG "Dazed and confused, but trying to continue\n"); 379 691 380 692 /* Clear and disable the memory parity error line. */ 381 - clear_mem_error(reason); 693 + reason = (reason & 0xf) | 4; 694 + outb(reason, 0x61); 382 695 } 383 696 384 697 static notrace __kprobes void ··· 405 716 static notrace __kprobes void 406 717 unknown_nmi_error(unsigned char reason, struct pt_regs *regs) 407 718 { 408 - if (notify_die(DIE_NMIUNKNOWN, "nmi", regs, reason, 2, SIGINT) == NOTIFY_STOP) 719 + if (notify_die(DIE_NMIUNKNOWN, "nmi", regs, reason, 2, SIGINT) == 720 + NOTIFY_STOP) 409 721 return; 410 722 #ifdef CONFIG_MCA 411 723 /* ··· 427 737 panic("NMI: Not continuing"); 428 738 429 739 printk(KERN_EMERG "Dazed and confused, but trying to continue\n"); 430 - } 431 - 432 - static DEFINE_SPINLOCK(nmi_print_lock); 433 - 434 - void notrace __kprobes die_nmi(char *str, struct pt_regs *regs, int do_panic) 435 - { 436 - if (notify_die(DIE_NMIWATCHDOG, str, regs, 0, 2, SIGINT) == NOTIFY_STOP) 437 - return; 438 - 439 - spin_lock(&nmi_print_lock); 440 - /* 441 - * We are in trouble anyway, lets at least try 442 - * to get a message out: 443 - */ 
444 - bust_spinlocks(1); 445 - printk(KERN_EMERG "%s", str); 446 - printk(" on CPU%d, ip %08lx, registers:\n", 447 - smp_processor_id(), regs->ip); 448 - show_registers(regs); 449 - if (do_panic) 450 - panic("Non maskable interrupt"); 451 - console_silent(); 452 - spin_unlock(&nmi_print_lock); 453 - bust_spinlocks(0); 454 - 455 - /* 456 - * If we are in kernel we are probably nested up pretty bad 457 - * and might aswell get out now while we still can: 458 - */ 459 - if (!user_mode_vm(regs)) { 460 - current->thread.trap_no = 2; 461 - crash_kexec(regs); 462 - } 463 - 464 - do_exit(SIGSEGV); 465 740 } 466 741 467 742 static notrace __kprobes void default_do_nmi(struct pt_regs *regs) ··· 467 812 mem_parity_error(reason, regs); 468 813 if (reason & 0x40) 469 814 io_check_error(reason, regs); 815 + #ifdef CONFIG_X86_32 470 816 /* 471 817 * Reassert NMI in case it became active meanwhile 472 818 * as it's edge-triggered: 473 819 */ 474 820 reassert_nmi(); 821 + #endif 475 822 } 476 823 477 - notrace __kprobes void do_nmi(struct pt_regs *regs, long error_code) 824 + dotraplinkage notrace __kprobes void 825 + do_nmi(struct pt_regs *regs, long error_code) 478 826 { 479 - int cpu; 480 - 481 827 nmi_enter(); 482 828 483 - cpu = smp_processor_id(); 484 - 485 - ++nmi_count(cpu); 829 + #ifdef CONFIG_X86_32 830 + { int cpu; cpu = smp_processor_id(); ++nmi_count(cpu); } 831 + #else 832 + add_pda(__nmi_count, 1); 833 + #endif 486 834 487 835 if (!ignore_nmis) 488 836 default_do_nmi(regs); ··· 505 847 acpi_nmi_enable(); 506 848 } 507 849 508 - #ifdef CONFIG_KPROBES 509 - void __kprobes do_int3(struct pt_regs *regs, long error_code) 850 + /* May run on IST stack. 
*/ 851 + dotraplinkage void __kprobes do_int3(struct pt_regs *regs, long error_code) 510 852 { 511 - trace_hardirqs_fixup(); 512 - 853 + #ifdef CONFIG_KPROBES 513 854 if (notify_die(DIE_INT3, "int3", regs, error_code, 3, SIGTRAP) 514 855 == NOTIFY_STOP) 515 856 return; 516 - /* 517 - * This is an interrupt gate, because kprobes wants interrupts 518 - * disabled. Normal trap handlers don't. 519 - */ 520 - restore_interrupts(regs); 857 + #else 858 + if (notify_die(DIE_TRAP, "int3", regs, error_code, 3, SIGTRAP) 859 + == NOTIFY_STOP) 860 + return; 861 + #endif 521 862 522 - do_trap(3, SIGTRAP, "int3", 1, regs, error_code, NULL); 863 + preempt_conditional_sti(regs); 864 + do_trap(3, SIGTRAP, "int3", regs, error_code, NULL); 865 + preempt_conditional_cli(regs); 866 + } 867 + 868 + #ifdef CONFIG_X86_64 869 + /* Help handler running on IST stack to switch back to user stack 870 + for scheduling or signal handling. The actual stack switch is done in 871 + entry.S */ 872 + asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs) 873 + { 874 + struct pt_regs *regs = eregs; 875 + /* Did already sync */ 876 + if (eregs == (struct pt_regs *)eregs->sp) 877 + ; 878 + /* Exception from user space */ 879 + else if (user_mode(eregs)) 880 + regs = task_pt_regs(current); 881 + /* Exception from kernel and interrupts are enabled. Move to 882 + kernel process stack. */ 883 + else if (eregs->flags & X86_EFLAGS_IF) 884 + regs = (struct pt_regs *)(eregs->sp -= sizeof(struct pt_regs)); 885 + if (eregs != regs) 886 + *regs = *eregs; 887 + return regs; 523 888 } 524 889 #endif 525 890 ··· 567 886 * about restoring all the debug state, and ptrace doesn't have to 568 887 * find every occurrence of the TF bit that could be saved away even 569 888 * by user code) 889 + * 890 + * May run on IST stack. 
570 891 */ 571 - void __kprobes do_debug(struct pt_regs *regs, long error_code) 892 + dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code) 572 893 { 573 894 struct task_struct *tsk = current; 574 - unsigned int condition; 895 + unsigned long condition; 575 896 int si_code; 576 - 577 - trace_hardirqs_fixup(); 578 897 579 898 get_debugreg(condition, 6); 580 899 ··· 587 906 if (notify_die(DIE_DEBUG, "debug", regs, condition, error_code, 588 907 SIGTRAP) == NOTIFY_STOP) 589 908 return; 909 + 590 910 /* It's safe to allow irq's after DR6 has been saved */ 591 - if (regs->flags & X86_EFLAGS_IF) 592 - local_irq_enable(); 911 + preempt_conditional_sti(regs); 593 912 594 913 /* Mask out spurious debug traps due to lazy DR7 setting */ 595 914 if (condition & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)) { ··· 597 916 goto clear_dr7; 598 917 } 599 918 919 + #ifdef CONFIG_X86_32 600 920 if (regs->flags & X86_VM_MASK) 601 921 goto debug_vm86; 922 + #endif 602 923 603 924 /* Save debug status register where ptrace can see it */ 604 925 tsk->thread.debugreg6 = condition; ··· 610 927 * kernel space (but re-enable TF when returning to user mode). 611 928 */ 612 929 if (condition & DR_STEP) { 613 - /* 614 - * We already checked v86 mode above, so we can 615 - * check for kernel mode by just checking the CPL 616 - * of CS. 
617 - */ 618 930 if (!user_mode(regs)) 619 931 goto clear_TF_reenable; 620 932 } 621 933 622 - si_code = get_si_code((unsigned long)condition); 934 + si_code = get_si_code(condition); 623 935 /* Ok, finally something we can handle */ 624 936 send_sigtrap(tsk, regs, error_code, si_code); 625 937 ··· 624 946 */ 625 947 clear_dr7: 626 948 set_debugreg(0, 7); 949 + preempt_conditional_cli(regs); 627 950 return; 628 951 952 + #ifdef CONFIG_X86_32 629 953 debug_vm86: 630 954 handle_vm86_trap((struct kernel_vm86_regs *) regs, error_code, 1); 955 + preempt_conditional_cli(regs); 631 956 return; 957 + #endif 632 958 633 959 clear_TF_reenable: 634 960 set_tsk_thread_flag(tsk, TIF_SINGLESTEP); 635 961 regs->flags &= ~X86_EFLAGS_TF; 962 + preempt_conditional_cli(regs); 636 963 return; 637 964 } 965 + 966 + #ifdef CONFIG_X86_64 967 + static int kernel_math_error(struct pt_regs *regs, const char *str, int trapnr) 968 + { 969 + if (fixup_exception(regs)) 970 + return 1; 971 + 972 + notify_die(DIE_GPF, str, regs, 0, trapnr, SIGFPE); 973 + /* Illegal floating point operation in the kernel */ 974 + current->thread.trap_no = trapnr; 975 + die(str, regs, 0); 976 + return 0; 977 + } 978 + #endif 638 979 639 980 /* 640 981 * Note that we play around with the 'TS' bit in an attempt to get ··· 691 994 swd = get_fpu_swd(task); 692 995 switch (swd & ~cwd & 0x3f) { 693 996 case 0x000: /* No unmasked exception */ 997 + #ifdef CONFIG_X86_32 694 998 return; 999 + #endif 695 1000 default: /* Multiple exceptions */ 696 1001 break; 697 1002 case 0x001: /* Invalid Op */ ··· 721 1022 force_sig_info(SIGFPE, &info, task); 722 1023 } 723 1024 724 - void do_coprocessor_error(struct pt_regs *regs, long error_code) 1025 + dotraplinkage void do_coprocessor_error(struct pt_regs *regs, long error_code) 725 1026 { 1027 + conditional_sti(regs); 1028 + 1029 + #ifdef CONFIG_X86_32 726 1030 ignore_fpu_irq = 1; 1031 + #else 1032 + if (!user_mode(regs) && 1033 + kernel_math_error(regs, "kernel x87 math error", 16)) 
1034 + return; 1035 + #endif 1036 + 727 1037 math_error((void __user *)regs->ip); 728 1038 } 729 1039 ··· 784 1076 force_sig_info(SIGFPE, &info, task); 785 1077 } 786 1078 787 - void do_simd_coprocessor_error(struct pt_regs *regs, long error_code) 1079 + dotraplinkage void 1080 + do_simd_coprocessor_error(struct pt_regs *regs, long error_code) 788 1081 { 1082 + conditional_sti(regs); 1083 + 1084 + #ifdef CONFIG_X86_32 789 1085 if (cpu_has_xmm) { 790 1086 /* Handle SIMD FPU exceptions on PIII+ processors. */ 791 1087 ignore_fpu_irq = 1; ··· 808 1096 current->thread.error_code = error_code; 809 1097 die_if_kernel("cache flush denied", regs, error_code); 810 1098 force_sig(SIGSEGV, current); 1099 + #else 1100 + if (!user_mode(regs) && 1101 + kernel_math_error(regs, "kernel simd math error", 19)) 1102 + return; 1103 + simd_math_error((void __user *)regs->ip); 1104 + #endif 811 1105 } 812 1106 813 - void do_spurious_interrupt_bug(struct pt_regs *regs, long error_code) 1107 + dotraplinkage void 1108 + do_spurious_interrupt_bug(struct pt_regs *regs, long error_code) 814 1109 { 1110 + conditional_sti(regs); 815 1111 #if 0 816 1112 /* No need to warn about this any longer. 
*/ 817 1113 printk(KERN_INFO "Ignoring P6 Local APIC Spurious Interrupt Bug...\n"); 818 1114 #endif 819 1115 } 820 1116 1117 + #ifdef CONFIG_X86_32 821 1118 unsigned long patch_espfix_desc(unsigned long uesp, unsigned long kesp) 822 1119 { 823 1120 struct desc_struct *gdt = get_cpu_gdt_table(smp_processor_id()); ··· 845 1124 846 1125 return new_kesp; 847 1126 } 1127 + #else 1128 + asmlinkage void __attribute__((weak)) smp_thermal_interrupt(void) 1129 + { 1130 + } 1131 + 1132 + asmlinkage void __attribute__((weak)) mce_threshold_interrupt(void) 1133 + { 1134 + } 1135 + #endif 848 1136 849 1137 /* 850 1138 * 'math_state_restore()' saves the current math information in the ··· 886 1156 } 887 1157 888 1158 clts(); /* Allow maths ops (or we recurse) */ 1159 + #ifdef CONFIG_X86_32 889 1160 restore_fpu(tsk); 1161 + #else 1162 + /* 1163 + * Paranoid restore. send a SIGSEGV if we fail to restore the state. 1164 + */ 1165 + if (unlikely(restore_fpu_checking(tsk))) { 1166 + stts(); 1167 + force_sig(SIGSEGV, tsk); 1168 + return; 1169 + } 1170 + #endif 890 1171 thread->status |= TS_USEDFPU; /* So we fnsave on switch_to() */ 891 1172 tsk->fpu_counter++; 892 1173 } 893 1174 EXPORT_SYMBOL_GPL(math_state_restore); 894 1175 895 1176 #ifndef CONFIG_MATH_EMULATION 896 - 897 1177 asmlinkage void math_emulate(long arg) 898 1178 { 899 1179 printk(KERN_EMERG ··· 912 1172 force_sig(SIGFPE, current); 913 1173 schedule(); 914 1174 } 915 - 916 1175 #endif /* CONFIG_MATH_EMULATION */ 1176 + 1177 + dotraplinkage void __kprobes 1178 + do_device_not_available(struct pt_regs *regs, long error) 1179 + { 1180 + #ifdef CONFIG_X86_32 1181 + if (read_cr0() & X86_CR0_EM) { 1182 + conditional_sti(regs); 1183 + math_emulate(0); 1184 + } else { 1185 + math_state_restore(); /* interrupts still off */ 1186 + conditional_sti(regs); 1187 + } 1188 + #else 1189 + math_state_restore(); 1190 + #endif 1191 + } 1192 + 1193 + #ifdef CONFIG_X86_32 1194 + #ifdef CONFIG_X86_MCE 1195 + dotraplinkage void __kprobes 
do_machine_check(struct pt_regs *regs, long error) 1196 + { 1197 + conditional_sti(regs); 1198 + machine_check_vector(regs, error); 1199 + } 1200 + #endif 1201 + 1202 + dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code) 1203 + { 1204 + siginfo_t info; 1205 + local_irq_enable(); 1206 + 1207 + info.si_signo = SIGILL; 1208 + info.si_errno = 0; 1209 + info.si_code = ILL_BADSTK; 1210 + info.si_addr = 0; 1211 + if (notify_die(DIE_TRAP, "iret exception", 1212 + regs, error_code, 32, SIGILL) == NOTIFY_STOP) 1213 + return; 1214 + do_trap(32, SIGILL, "iret exception", regs, error_code, &info); 1215 + } 1216 + #endif 917 1217 918 1218 void __init trap_init(void) 919 1219 { 1220 + #ifdef CONFIG_X86_32 920 1221 int i; 1222 + #endif 921 1223 922 1224 #ifdef CONFIG_EISA 923 1225 void __iomem *p = early_ioremap(0x0FFFD9, 4); ··· 969 1187 early_iounmap(p, 4); 970 1188 #endif 971 1189 972 - set_trap_gate(0, &divide_error); 973 - set_intr_gate(1, &debug); 974 - set_intr_gate(2, &nmi); 975 - set_system_intr_gate(3, &int3); /* int3 can be called from all */ 976 - set_system_gate(4, &overflow); /* int4 can be called from all */ 977 - set_trap_gate(5, &bounds); 978 - set_trap_gate(6, &invalid_op); 979 - set_trap_gate(7, &device_not_available); 1190 + set_intr_gate(0, &divide_error); 1191 + set_intr_gate_ist(1, &debug, DEBUG_STACK); 1192 + set_intr_gate_ist(2, &nmi, NMI_STACK); 1193 + /* int3 can be called from all */ 1194 + set_system_intr_gate_ist(3, &int3, DEBUG_STACK); 1195 + /* int4 can be called from all */ 1196 + set_system_intr_gate(4, &overflow); 1197 + set_intr_gate(5, &bounds); 1198 + set_intr_gate(6, &invalid_op); 1199 + set_intr_gate(7, &device_not_available); 1200 + #ifdef CONFIG_X86_32 980 1201 set_task_gate(8, GDT_ENTRY_DOUBLEFAULT_TSS); 981 - set_trap_gate(9, &coprocessor_segment_overrun); 982 - set_trap_gate(10, &invalid_TSS); 983 - set_trap_gate(11, &segment_not_present); 984 - set_trap_gate(12, &stack_segment); 985 - set_trap_gate(13, 
&general_protection); 986 - set_intr_gate(14, &page_fault); 987 - set_trap_gate(15, &spurious_interrupt_bug); 988 - set_trap_gate(16, &coprocessor_error); 989 - set_trap_gate(17, &alignment_check); 990 - #ifdef CONFIG_X86_MCE 991 - set_trap_gate(18, &machine_check); 1202 + #else 1203 + set_intr_gate_ist(8, &double_fault, DOUBLEFAULT_STACK); 992 1204 #endif 993 - set_trap_gate(19, &simd_coprocessor_error); 1205 + set_intr_gate(9, &coprocessor_segment_overrun); 1206 + set_intr_gate(10, &invalid_TSS); 1207 + set_intr_gate(11, &segment_not_present); 1208 + set_intr_gate_ist(12, &stack_segment, STACKFAULT_STACK); 1209 + set_intr_gate(13, &general_protection); 1210 + set_intr_gate(14, &page_fault); 1211 + set_intr_gate(15, &spurious_interrupt_bug); 1212 + set_intr_gate(16, &coprocessor_error); 1213 + set_intr_gate(17, &alignment_check); 1214 + #ifdef CONFIG_X86_MCE 1215 + set_intr_gate_ist(18, &machine_check, MCE_STACK); 1216 + #endif 1217 + set_intr_gate(19, &simd_coprocessor_error); 994 1218 1219 + #ifdef CONFIG_IA32_EMULATION 1220 + set_system_intr_gate(IA32_SYSCALL_VECTOR, ia32_syscall); 1221 + #endif 1222 + 1223 + #ifdef CONFIG_X86_32 995 1224 if (cpu_has_fxsr) { 996 1225 printk(KERN_INFO "Enabling fast FPU save and restore... 
"); 997 1226 set_in_cr4(X86_CR4_OSFXSR); ··· 1015 1222 printk("done.\n"); 1016 1223 } 1017 1224 1018 - set_system_gate(SYSCALL_VECTOR, &system_call); 1225 + set_system_trap_gate(SYSCALL_VECTOR, &system_call); 1019 1226 1020 1227 /* Reserve all the builtin and the syscall vector: */ 1021 1228 for (i = 0; i < FIRST_EXTERNAL_VECTOR; i++) 1022 1229 set_bit(i, used_vectors); 1023 1230 1024 1231 set_bit(SYSCALL_VECTOR, used_vectors); 1025 - 1232 + #endif 1026 1233 /* 1027 1234 * Should be a barrier for any external CPU state: 1028 1235 */ 1029 1236 cpu_init(); 1030 1237 1238 + #ifdef CONFIG_X86_32 1031 1239 trap_init_hook(); 1240 + #endif 1032 1241 } 1033 - 1034 - static int __init kstack_setup(char *s) 1035 - { 1036 - kstack_depth_to_print = simple_strtoul(s, NULL, 0); 1037 - 1038 - return 1; 1039 - } 1040 - __setup("kstack=", kstack_setup); 1041 - 1042 - static int __init code_bytes_setup(char *s) 1043 - { 1044 - code_bytes = simple_strtoul(s, NULL, 0); 1045 - if (code_bytes > 8192) 1046 - code_bytes = 8192; 1047 - 1048 - return 1; 1049 - } 1050 - __setup("code_bytes=", code_bytes_setup);
-1214
arch/x86/kernel/traps_64.c
··· 1 - /* 2 - * Copyright (C) 1991, 1992 Linus Torvalds 3 - * Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs 4 - * 5 - * Pentium III FXSR, SSE support 6 - * Gareth Hughes <gareth@valinux.com>, May 2000 7 - */ 8 - 9 - /* 10 - * 'Traps.c' handles hardware traps and faults after we have saved some 11 - * state in 'entry.S'. 12 - */ 13 - #include <linux/moduleparam.h> 14 - #include <linux/interrupt.h> 15 - #include <linux/kallsyms.h> 16 - #include <linux/spinlock.h> 17 - #include <linux/kprobes.h> 18 - #include <linux/uaccess.h> 19 - #include <linux/utsname.h> 20 - #include <linux/kdebug.h> 21 - #include <linux/kernel.h> 22 - #include <linux/module.h> 23 - #include <linux/ptrace.h> 24 - #include <linux/string.h> 25 - #include <linux/unwind.h> 26 - #include <linux/delay.h> 27 - #include <linux/errno.h> 28 - #include <linux/kexec.h> 29 - #include <linux/sched.h> 30 - #include <linux/timer.h> 31 - #include <linux/init.h> 32 - #include <linux/bug.h> 33 - #include <linux/nmi.h> 34 - #include <linux/mm.h> 35 - #include <linux/smp.h> 36 - #include <linux/io.h> 37 - 38 - #if defined(CONFIG_EDAC) 39 - #include <linux/edac.h> 40 - #endif 41 - 42 - #include <asm/stacktrace.h> 43 - #include <asm/processor.h> 44 - #include <asm/debugreg.h> 45 - #include <asm/atomic.h> 46 - #include <asm/system.h> 47 - #include <asm/unwind.h> 48 - #include <asm/desc.h> 49 - #include <asm/i387.h> 50 - #include <asm/pgalloc.h> 51 - #include <asm/proto.h> 52 - #include <asm/pda.h> 53 - #include <asm/traps.h> 54 - 55 - #include <mach_traps.h> 56 - 57 - int panic_on_unrecovered_nmi; 58 - int kstack_depth_to_print = 12; 59 - static unsigned int code_bytes = 64; 60 - static int ignore_nmis; 61 - static int die_counter; 62 - 63 - static inline void conditional_sti(struct pt_regs *regs) 64 - { 65 - if (regs->flags & X86_EFLAGS_IF) 66 - local_irq_enable(); 67 - } 68 - 69 - static inline void preempt_conditional_sti(struct pt_regs *regs) 70 - { 71 - inc_preempt_count(); 72 - if (regs->flags & 
X86_EFLAGS_IF) 73 - local_irq_enable(); 74 - } 75 - 76 - static inline void preempt_conditional_cli(struct pt_regs *regs) 77 - { 78 - if (regs->flags & X86_EFLAGS_IF) 79 - local_irq_disable(); 80 - /* Make sure to not schedule here because we could be running 81 - on an exception stack. */ 82 - dec_preempt_count(); 83 - } 84 - 85 - void printk_address(unsigned long address, int reliable) 86 - { 87 - printk(" [<%016lx>] %s%pS\n", 88 - address, reliable ? "" : "? ", (void *) address); 89 - } 90 - 91 - static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack, 92 - unsigned *usedp, char **idp) 93 - { 94 - static char ids[][8] = { 95 - [DEBUG_STACK - 1] = "#DB", 96 - [NMI_STACK - 1] = "NMI", 97 - [DOUBLEFAULT_STACK - 1] = "#DF", 98 - [STACKFAULT_STACK - 1] = "#SS", 99 - [MCE_STACK - 1] = "#MC", 100 - #if DEBUG_STKSZ > EXCEPTION_STKSZ 101 - [N_EXCEPTION_STACKS ... 102 - N_EXCEPTION_STACKS + DEBUG_STKSZ / EXCEPTION_STKSZ - 2] = "#DB[?]" 103 - #endif 104 - }; 105 - unsigned k; 106 - 107 - /* 108 - * Iterate over all exception stacks, and figure out whether 109 - * 'stack' is in one of them: 110 - */ 111 - for (k = 0; k < N_EXCEPTION_STACKS; k++) { 112 - unsigned long end = per_cpu(orig_ist, cpu).ist[k]; 113 - /* 114 - * Is 'stack' above this exception frame's end? 115 - * If yes then skip to the next frame. 116 - */ 117 - if (stack >= end) 118 - continue; 119 - /* 120 - * Is 'stack' above this exception frame's start address? 121 - * If yes then we found the right frame. 122 - */ 123 - if (stack >= end - EXCEPTION_STKSZ) { 124 - /* 125 - * Make sure we only iterate through an exception 126 - * stack once. 
If it comes up for the second time 127 - * then there's something wrong going on - just 128 - * break out and return NULL: 129 - */ 130 - if (*usedp & (1U << k)) 131 - break; 132 - *usedp |= 1U << k; 133 - *idp = ids[k]; 134 - return (unsigned long *)end; 135 - } 136 - /* 137 - * If this is a debug stack, and if it has a larger size than 138 - * the usual exception stacks, then 'stack' might still 139 - * be within the lower portion of the debug stack: 140 - */ 141 - #if DEBUG_STKSZ > EXCEPTION_STKSZ 142 - if (k == DEBUG_STACK - 1 && stack >= end - DEBUG_STKSZ) { 143 - unsigned j = N_EXCEPTION_STACKS - 1; 144 - 145 - /* 146 - * Black magic. A large debug stack is composed of 147 - * multiple exception stack entries, which we 148 - * iterate through now. Dont look: 149 - */ 150 - do { 151 - ++j; 152 - end -= EXCEPTION_STKSZ; 153 - ids[j][4] = '1' + (j - N_EXCEPTION_STACKS); 154 - } while (stack < end - EXCEPTION_STKSZ); 155 - if (*usedp & (1U << j)) 156 - break; 157 - *usedp |= 1U << j; 158 - *idp = ids[j]; 159 - return (unsigned long *)end; 160 - } 161 - #endif 162 - } 163 - return NULL; 164 - } 165 - 166 - /* 167 - * x86-64 can have up to three kernel stacks: 168 - * process stack 169 - * interrupt stack 170 - * severe exception (double fault, nmi, stack fault, debug, mce) hardware stack 171 - */ 172 - 173 - static inline int valid_stack_ptr(struct thread_info *tinfo, 174 - void *p, unsigned int size, void *end) 175 - { 176 - void *t = tinfo; 177 - if (end) { 178 - if (p < end && p >= (end-THREAD_SIZE)) 179 - return 1; 180 - else 181 - return 0; 182 - } 183 - return p > t && p < t + THREAD_SIZE - size; 184 - } 185 - 186 - /* The form of the top of the frame on the stack */ 187 - struct stack_frame { 188 - struct stack_frame *next_frame; 189 - unsigned long return_address; 190 - }; 191 - 192 - static inline unsigned long 193 - print_context_stack(struct thread_info *tinfo, 194 - unsigned long *stack, unsigned long bp, 195 - const struct stacktrace_ops *ops, void 
*data, 196 - unsigned long *end) 197 - { 198 - struct stack_frame *frame = (struct stack_frame *)bp; 199 - 200 - while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) { 201 - unsigned long addr; 202 - 203 - addr = *stack; 204 - if (__kernel_text_address(addr)) { 205 - if ((unsigned long) stack == bp + 8) { 206 - ops->address(data, addr, 1); 207 - frame = frame->next_frame; 208 - bp = (unsigned long) frame; 209 - } else { 210 - ops->address(data, addr, bp == 0); 211 - } 212 - } 213 - stack++; 214 - } 215 - return bp; 216 - } 217 - 218 - void dump_trace(struct task_struct *task, struct pt_regs *regs, 219 - unsigned long *stack, unsigned long bp, 220 - const struct stacktrace_ops *ops, void *data) 221 - { 222 - const unsigned cpu = get_cpu(); 223 - unsigned long *irqstack_end = (unsigned long *)cpu_pda(cpu)->irqstackptr; 224 - unsigned used = 0; 225 - struct thread_info *tinfo; 226 - 227 - if (!task) 228 - task = current; 229 - 230 - if (!stack) { 231 - unsigned long dummy; 232 - stack = &dummy; 233 - if (task && task != current) 234 - stack = (unsigned long *)task->thread.sp; 235 - } 236 - 237 - #ifdef CONFIG_FRAME_POINTER 238 - if (!bp) { 239 - if (task == current) { 240 - /* Grab bp right from our regs */ 241 - asm("movq %%rbp, %0" : "=r" (bp) : ); 242 - } else { 243 - /* bp is the last reg pushed by switch_to */ 244 - bp = *(unsigned long *) task->thread.sp; 245 - } 246 - } 247 - #endif 248 - 249 - /* 250 - * Print function call entries in all stacks, starting at the 251 - * current stack address. 
If the stacks consist of nested 252 - * exceptions 253 - */ 254 - tinfo = task_thread_info(task); 255 - for (;;) { 256 - char *id; 257 - unsigned long *estack_end; 258 - estack_end = in_exception_stack(cpu, (unsigned long)stack, 259 - &used, &id); 260 - 261 - if (estack_end) { 262 - if (ops->stack(data, id) < 0) 263 - break; 264 - 265 - bp = print_context_stack(tinfo, stack, bp, ops, 266 - data, estack_end); 267 - ops->stack(data, "<EOE>"); 268 - /* 269 - * We link to the next stack via the 270 - * second-to-last pointer (index -2 to end) in the 271 - * exception stack: 272 - */ 273 - stack = (unsigned long *) estack_end[-2]; 274 - continue; 275 - } 276 - if (irqstack_end) { 277 - unsigned long *irqstack; 278 - irqstack = irqstack_end - 279 - (IRQSTACKSIZE - 64) / sizeof(*irqstack); 280 - 281 - if (stack >= irqstack && stack < irqstack_end) { 282 - if (ops->stack(data, "IRQ") < 0) 283 - break; 284 - bp = print_context_stack(tinfo, stack, bp, 285 - ops, data, irqstack_end); 286 - /* 287 - * We link to the next stack (which would be 288 - * the process stack normally) the last 289 - * pointer (index -1 to end) in the IRQ stack: 290 - */ 291 - stack = (unsigned long *) (irqstack_end[-1]); 292 - irqstack_end = NULL; 293 - ops->stack(data, "EOI"); 294 - continue; 295 - } 296 - } 297 - break; 298 - } 299 - 300 - /* 301 - * This handles the process stack: 302 - */ 303 - bp = print_context_stack(tinfo, stack, bp, ops, data, NULL); 304 - put_cpu(); 305 - } 306 - EXPORT_SYMBOL(dump_trace); 307 - 308 - static void 309 - print_trace_warning_symbol(void *data, char *msg, unsigned long symbol) 310 - { 311 - print_symbol(msg, symbol); 312 - printk("\n"); 313 - } 314 - 315 - static void print_trace_warning(void *data, char *msg) 316 - { 317 - printk("%s\n", msg); 318 - } 319 - 320 - static int print_trace_stack(void *data, char *name) 321 - { 322 - printk(" <%s> ", name); 323 - return 0; 324 - } 325 - 326 - static void print_trace_address(void *data, unsigned long addr, int 
reliable) 327 - { 328 - touch_nmi_watchdog(); 329 - printk_address(addr, reliable); 330 - } 331 - 332 - static const struct stacktrace_ops print_trace_ops = { 333 - .warning = print_trace_warning, 334 - .warning_symbol = print_trace_warning_symbol, 335 - .stack = print_trace_stack, 336 - .address = print_trace_address, 337 - }; 338 - 339 - static void 340 - show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 341 - unsigned long *stack, unsigned long bp, char *log_lvl) 342 - { 343 - printk("Call Trace:\n"); 344 - dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl); 345 - } 346 - 347 - void show_trace(struct task_struct *task, struct pt_regs *regs, 348 - unsigned long *stack, unsigned long bp) 349 - { 350 - show_trace_log_lvl(task, regs, stack, bp, ""); 351 - } 352 - 353 - static void 354 - show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, 355 - unsigned long *sp, unsigned long bp, char *log_lvl) 356 - { 357 - unsigned long *stack; 358 - int i; 359 - const int cpu = smp_processor_id(); 360 - unsigned long *irqstack_end = 361 - (unsigned long *) (cpu_pda(cpu)->irqstackptr); 362 - unsigned long *irqstack = 363 - (unsigned long *) (cpu_pda(cpu)->irqstackptr - IRQSTACKSIZE); 364 - 365 - /* 366 - * debugging aid: "show_stack(NULL, NULL);" prints the 367 - * back trace for this cpu. 
368 - */ 369 - 370 - if (sp == NULL) { 371 - if (task) 372 - sp = (unsigned long *)task->thread.sp; 373 - else 374 - sp = (unsigned long *)&sp; 375 - } 376 - 377 - stack = sp; 378 - for (i = 0; i < kstack_depth_to_print; i++) { 379 - if (stack >= irqstack && stack <= irqstack_end) { 380 - if (stack == irqstack_end) { 381 - stack = (unsigned long *) (irqstack_end[-1]); 382 - printk(" <EOI> "); 383 - } 384 - } else { 385 - if (((long) stack & (THREAD_SIZE-1)) == 0) 386 - break; 387 - } 388 - if (i && ((i % 4) == 0)) 389 - printk("\n"); 390 - printk(" %016lx", *stack++); 391 - touch_nmi_watchdog(); 392 - } 393 - printk("\n"); 394 - show_trace_log_lvl(task, regs, sp, bp, log_lvl); 395 - } 396 - 397 - void show_stack(struct task_struct *task, unsigned long *sp) 398 - { 399 - show_stack_log_lvl(task, NULL, sp, 0, ""); 400 - } 401 - 402 - /* 403 - * The architecture-independent dump_stack generator 404 - */ 405 - void dump_stack(void) 406 - { 407 - unsigned long bp = 0; 408 - unsigned long stack; 409 - 410 - #ifdef CONFIG_FRAME_POINTER 411 - if (!bp) 412 - asm("movq %%rbp, %0" : "=r" (bp) : ); 413 - #endif 414 - 415 - printk("Pid: %d, comm: %.20s %s %s %.*s\n", 416 - current->pid, current->comm, print_tainted(), 417 - init_utsname()->release, 418 - (int)strcspn(init_utsname()->version, " "), 419 - init_utsname()->version); 420 - show_trace(NULL, NULL, &stack, bp); 421 - } 422 - EXPORT_SYMBOL(dump_stack); 423 - 424 - void show_registers(struct pt_regs *regs) 425 - { 426 - int i; 427 - unsigned long sp; 428 - const int cpu = smp_processor_id(); 429 - struct task_struct *cur = cpu_pda(cpu)->pcurrent; 430 - 431 - sp = regs->sp; 432 - printk("CPU %d ", cpu); 433 - __show_regs(regs); 434 - printk("Process %s (pid: %d, threadinfo %p, task %p)\n", 435 - cur->comm, cur->pid, task_thread_info(cur), cur); 436 - 437 - /* 438 - * When in-kernel, we also print out the stack and code at the 439 - * time of the fault.. 
440 - */ 441 - if (!user_mode(regs)) { 442 - unsigned int code_prologue = code_bytes * 43 / 64; 443 - unsigned int code_len = code_bytes; 444 - unsigned char c; 445 - u8 *ip; 446 - 447 - printk("Stack: "); 448 - show_stack_log_lvl(NULL, regs, (unsigned long *)sp, 449 - regs->bp, ""); 450 - 451 - printk(KERN_EMERG "Code: "); 452 - 453 - ip = (u8 *)regs->ip - code_prologue; 454 - if (ip < (u8 *)PAGE_OFFSET || probe_kernel_address(ip, c)) { 455 - /* try starting at RIP */ 456 - ip = (u8 *)regs->ip; 457 - code_len = code_len - code_prologue + 1; 458 - } 459 - for (i = 0; i < code_len; i++, ip++) { 460 - if (ip < (u8 *)PAGE_OFFSET || 461 - probe_kernel_address(ip, c)) { 462 - printk(" Bad RIP value."); 463 - break; 464 - } 465 - if (ip == (u8 *)regs->ip) 466 - printk("<%02x> ", c); 467 - else 468 - printk("%02x ", c); 469 - } 470 - } 471 - printk("\n"); 472 - } 473 - 474 - int is_valid_bugaddr(unsigned long ip) 475 - { 476 - unsigned short ud2; 477 - 478 - if (__copy_from_user(&ud2, (const void __user *) ip, sizeof(ud2))) 479 - return 0; 480 - 481 - return ud2 == 0x0b0f; 482 - } 483 - 484 - static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED; 485 - static int die_owner = -1; 486 - static unsigned int die_nest_count; 487 - 488 - unsigned __kprobes long oops_begin(void) 489 - { 490 - int cpu; 491 - unsigned long flags; 492 - 493 - oops_enter(); 494 - 495 - /* racy, but better than risking deadlock. */ 496 - raw_local_irq_save(flags); 497 - cpu = smp_processor_id(); 498 - if (!__raw_spin_trylock(&die_lock)) { 499 - if (cpu == die_owner) 500 - /* nested oops. 
should stop eventually */; 501 - else 502 - __raw_spin_lock(&die_lock); 503 - } 504 - die_nest_count++; 505 - die_owner = cpu; 506 - console_verbose(); 507 - bust_spinlocks(1); 508 - return flags; 509 - } 510 - 511 - void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr) 512 - { 513 - die_owner = -1; 514 - bust_spinlocks(0); 515 - die_nest_count--; 516 - if (!die_nest_count) 517 - /* Nest count reaches zero, release the lock. */ 518 - __raw_spin_unlock(&die_lock); 519 - raw_local_irq_restore(flags); 520 - if (!regs) { 521 - oops_exit(); 522 - return; 523 - } 524 - if (panic_on_oops) 525 - panic("Fatal exception"); 526 - oops_exit(); 527 - do_exit(signr); 528 - } 529 - 530 - int __kprobes __die(const char *str, struct pt_regs *regs, long err) 531 - { 532 - printk(KERN_EMERG "%s: %04lx [%u] ", str, err & 0xffff, ++die_counter); 533 - #ifdef CONFIG_PREEMPT 534 - printk("PREEMPT "); 535 - #endif 536 - #ifdef CONFIG_SMP 537 - printk("SMP "); 538 - #endif 539 - #ifdef CONFIG_DEBUG_PAGEALLOC 540 - printk("DEBUG_PAGEALLOC"); 541 - #endif 542 - printk("\n"); 543 - if (notify_die(DIE_OOPS, str, regs, err, 544 - current->thread.trap_no, SIGSEGV) == NOTIFY_STOP) 545 - return 1; 546 - 547 - show_registers(regs); 548 - add_taint(TAINT_DIE); 549 - /* Executive summary in case the oops scrolled away */ 550 - printk(KERN_ALERT "RIP "); 551 - printk_address(regs->ip, 1); 552 - printk(" RSP <%016lx>\n", regs->sp); 553 - if (kexec_should_crash(current)) 554 - crash_kexec(regs); 555 - return 0; 556 - } 557 - 558 - void die(const char *str, struct pt_regs *regs, long err) 559 - { 560 - unsigned long flags = oops_begin(); 561 - 562 - if (!user_mode(regs)) 563 - report_bug(regs->ip, regs); 564 - 565 - if (__die(str, regs, err)) 566 - regs = NULL; 567 - oops_end(flags, regs, SIGSEGV); 568 - } 569 - 570 - notrace __kprobes void 571 - die_nmi(char *str, struct pt_regs *regs, int do_panic) 572 - { 573 - unsigned long flags; 574 - 575 - if (notify_die(DIE_NMIWATCHDOG, 
str, regs, 0, 2, SIGINT) == NOTIFY_STOP) 576 - return; 577 - 578 - flags = oops_begin(); 579 - /* 580 - * We are in trouble anyway, lets at least try 581 - * to get a message out. 582 - */ 583 - printk(KERN_EMERG "%s", str); 584 - printk(" on CPU%d, ip %08lx, registers:\n", 585 - smp_processor_id(), regs->ip); 586 - show_registers(regs); 587 - if (kexec_should_crash(current)) 588 - crash_kexec(regs); 589 - if (do_panic || panic_on_oops) 590 - panic("Non maskable interrupt"); 591 - oops_end(flags, NULL, SIGBUS); 592 - nmi_exit(); 593 - local_irq_enable(); 594 - do_exit(SIGBUS); 595 - } 596 - 597 - static void __kprobes 598 - do_trap(int trapnr, int signr, char *str, struct pt_regs *regs, 599 - long error_code, siginfo_t *info) 600 - { 601 - struct task_struct *tsk = current; 602 - 603 - if (!user_mode(regs)) 604 - goto kernel_trap; 605 - 606 - /* 607 - * We want error_code and trap_no set for userspace faults and 608 - * kernelspace faults which result in die(), but not 609 - * kernelspace faults which are fixed up. die() gives the 610 - * process no chance to handle the signal and notice the 611 - * kernel fault information, so that won't result in polluting 612 - * the information about previously queued, but not yet 613 - * delivered, faults. See also do_general_protection below. 
614 - */ 615 - tsk->thread.error_code = error_code; 616 - tsk->thread.trap_no = trapnr; 617 - 618 - if (show_unhandled_signals && unhandled_signal(tsk, signr) && 619 - printk_ratelimit()) { 620 - printk(KERN_INFO 621 - "%s[%d] trap %s ip:%lx sp:%lx error:%lx", 622 - tsk->comm, tsk->pid, str, 623 - regs->ip, regs->sp, error_code); 624 - print_vma_addr(" in ", regs->ip); 625 - printk("\n"); 626 - } 627 - 628 - if (info) 629 - force_sig_info(signr, info, tsk); 630 - else 631 - force_sig(signr, tsk); 632 - return; 633 - 634 - kernel_trap: 635 - if (!fixup_exception(regs)) { 636 - tsk->thread.error_code = error_code; 637 - tsk->thread.trap_no = trapnr; 638 - die(str, regs, error_code); 639 - } 640 - return; 641 - } 642 - 643 - #define DO_ERROR(trapnr, signr, str, name) \ 644 - asmlinkage void do_##name(struct pt_regs *regs, long error_code) \ 645 - { \ 646 - if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \ 647 - == NOTIFY_STOP) \ 648 - return; \ 649 - conditional_sti(regs); \ 650 - do_trap(trapnr, signr, str, regs, error_code, NULL); \ 651 - } 652 - 653 - #define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr) \ 654 - asmlinkage void do_##name(struct pt_regs *regs, long error_code) \ 655 - { \ 656 - siginfo_t info; \ 657 - info.si_signo = signr; \ 658 - info.si_errno = 0; \ 659 - info.si_code = sicode; \ 660 - info.si_addr = (void __user *)siaddr; \ 661 - trace_hardirqs_fixup(); \ 662 - if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) \ 663 - == NOTIFY_STOP) \ 664 - return; \ 665 - conditional_sti(regs); \ 666 - do_trap(trapnr, signr, str, regs, error_code, &info); \ 667 - } 668 - 669 - DO_ERROR_INFO(0, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->ip) 670 - DO_ERROR(4, SIGSEGV, "overflow", overflow) 671 - DO_ERROR(5, SIGSEGV, "bounds", bounds) 672 - DO_ERROR_INFO(6, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip) 673 - DO_ERROR(9, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun) 674 - 
DO_ERROR(10, SIGSEGV, "invalid TSS", invalid_TSS) 675 - DO_ERROR(11, SIGBUS, "segment not present", segment_not_present) 676 - DO_ERROR_INFO(17, SIGBUS, "alignment check", alignment_check, BUS_ADRALN, 0) 677 - 678 - /* Runs on IST stack */ 679 - asmlinkage void do_stack_segment(struct pt_regs *regs, long error_code) 680 - { 681 - if (notify_die(DIE_TRAP, "stack segment", regs, error_code, 682 - 12, SIGBUS) == NOTIFY_STOP) 683 - return; 684 - preempt_conditional_sti(regs); 685 - do_trap(12, SIGBUS, "stack segment", regs, error_code, NULL); 686 - preempt_conditional_cli(regs); 687 - } 688 - 689 - asmlinkage void do_double_fault(struct pt_regs *regs, long error_code) 690 - { 691 - static const char str[] = "double fault"; 692 - struct task_struct *tsk = current; 693 - 694 - /* Return not checked because double check cannot be ignored */ 695 - notify_die(DIE_TRAP, str, regs, error_code, 8, SIGSEGV); 696 - 697 - tsk->thread.error_code = error_code; 698 - tsk->thread.trap_no = 8; 699 - 700 - /* This is always a kernel trap and never fixable (and thus must 701 - never return). 
*/ 702 - for (;;) 703 - die(str, regs, error_code); 704 - } 705 - 706 - asmlinkage void __kprobes 707 - do_general_protection(struct pt_regs *regs, long error_code) 708 - { 709 - struct task_struct *tsk; 710 - 711 - conditional_sti(regs); 712 - 713 - tsk = current; 714 - if (!user_mode(regs)) 715 - goto gp_in_kernel; 716 - 717 - tsk->thread.error_code = error_code; 718 - tsk->thread.trap_no = 13; 719 - 720 - if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) && 721 - printk_ratelimit()) { 722 - printk(KERN_INFO 723 - "%s[%d] general protection ip:%lx sp:%lx error:%lx", 724 - tsk->comm, tsk->pid, 725 - regs->ip, regs->sp, error_code); 726 - print_vma_addr(" in ", regs->ip); 727 - printk("\n"); 728 - } 729 - 730 - force_sig(SIGSEGV, tsk); 731 - return; 732 - 733 - gp_in_kernel: 734 - if (fixup_exception(regs)) 735 - return; 736 - 737 - tsk->thread.error_code = error_code; 738 - tsk->thread.trap_no = 13; 739 - if (notify_die(DIE_GPF, "general protection fault", regs, 740 - error_code, 13, SIGSEGV) == NOTIFY_STOP) 741 - return; 742 - die("general protection fault", regs, error_code); 743 - } 744 - 745 - static notrace __kprobes void 746 - mem_parity_error(unsigned char reason, struct pt_regs *regs) 747 - { 748 - printk(KERN_EMERG "Uhhuh. NMI received for unknown reason %02x.\n", 749 - reason); 750 - printk(KERN_EMERG "You have some hardware problem, likely on the PCI bus.\n"); 751 - 752 - #if defined(CONFIG_EDAC) 753 - if (edac_handler_set()) { 754 - edac_atomic_assert_error(); 755 - return; 756 - } 757 - #endif 758 - 759 - if (panic_on_unrecovered_nmi) 760 - panic("NMI: Not continuing"); 761 - 762 - printk(KERN_EMERG "Dazed and confused, but trying to continue\n"); 763 - 764 - /* Clear and disable the memory parity error line. 
*/ 765 - reason = (reason & 0xf) | 4; 766 - outb(reason, 0x61); 767 - } 768 - 769 - static notrace __kprobes void 770 - io_check_error(unsigned char reason, struct pt_regs *regs) 771 - { 772 - printk("NMI: IOCK error (debug interrupt?)\n"); 773 - show_registers(regs); 774 - 775 - /* Re-enable the IOCK line, wait for a few seconds */ 776 - reason = (reason & 0xf) | 8; 777 - outb(reason, 0x61); 778 - mdelay(2000); 779 - reason &= ~8; 780 - outb(reason, 0x61); 781 - } 782 - 783 - static notrace __kprobes void 784 - unknown_nmi_error(unsigned char reason, struct pt_regs *regs) 785 - { 786 - if (notify_die(DIE_NMIUNKNOWN, "nmi", regs, reason, 2, SIGINT) == 787 - NOTIFY_STOP) 788 - return; 789 - printk(KERN_EMERG "Uhhuh. NMI received for unknown reason %02x.\n", 790 - reason); 791 - printk(KERN_EMERG "Do you have a strange power saving mode enabled?\n"); 792 - 793 - if (panic_on_unrecovered_nmi) 794 - panic("NMI: Not continuing"); 795 - 796 - printk(KERN_EMERG "Dazed and confused, but trying to continue\n"); 797 - } 798 - 799 - /* Runs on IST stack. This code must keep interrupts off all the time. 800 - Nested NMIs are prevented by the CPU. */ 801 - asmlinkage notrace __kprobes void default_do_nmi(struct pt_regs *regs) 802 - { 803 - unsigned char reason = 0; 804 - int cpu; 805 - 806 - cpu = smp_processor_id(); 807 - 808 - /* Only the BSP gets external NMIs from the system. */ 809 - if (!cpu) 810 - reason = get_nmi_reason(); 811 - 812 - if (!(reason & 0xc0)) { 813 - if (notify_die(DIE_NMI_IPI, "nmi_ipi", regs, reason, 2, SIGINT) 814 - == NOTIFY_STOP) 815 - return; 816 - /* 817 - * Ok, so this is none of the documented NMI sources, 818 - * so it must be the NMI watchdog. 
819 - */ 820 - if (nmi_watchdog_tick(regs, reason)) 821 - return; 822 - if (!do_nmi_callback(regs, cpu)) 823 - unknown_nmi_error(reason, regs); 824 - 825 - return; 826 - } 827 - if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT) == NOTIFY_STOP) 828 - return; 829 - 830 - /* AK: following checks seem to be broken on modern chipsets. FIXME */ 831 - if (reason & 0x80) 832 - mem_parity_error(reason, regs); 833 - if (reason & 0x40) 834 - io_check_error(reason, regs); 835 - } 836 - 837 - asmlinkage notrace __kprobes void 838 - do_nmi(struct pt_regs *regs, long error_code) 839 - { 840 - nmi_enter(); 841 - 842 - add_pda(__nmi_count, 1); 843 - 844 - if (!ignore_nmis) 845 - default_do_nmi(regs); 846 - 847 - nmi_exit(); 848 - } 849 - 850 - void stop_nmi(void) 851 - { 852 - acpi_nmi_disable(); 853 - ignore_nmis++; 854 - } 855 - 856 - void restart_nmi(void) 857 - { 858 - ignore_nmis--; 859 - acpi_nmi_enable(); 860 - } 861 - 862 - /* runs on IST stack. */ 863 - asmlinkage void __kprobes do_int3(struct pt_regs *regs, long error_code) 864 - { 865 - trace_hardirqs_fixup(); 866 - 867 - if (notify_die(DIE_INT3, "int3", regs, error_code, 3, SIGTRAP) 868 - == NOTIFY_STOP) 869 - return; 870 - 871 - preempt_conditional_sti(regs); 872 - do_trap(3, SIGTRAP, "int3", regs, error_code, NULL); 873 - preempt_conditional_cli(regs); 874 - } 875 - 876 - /* Help handler running on IST stack to switch back to user stack 877 - for scheduling or signal handling. The actual stack switch is done in 878 - entry.S */ 879 - asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs) 880 - { 881 - struct pt_regs *regs = eregs; 882 - /* Did already sync */ 883 - if (eregs == (struct pt_regs *)eregs->sp) 884 - ; 885 - /* Exception from user space */ 886 - else if (user_mode(eregs)) 887 - regs = task_pt_regs(current); 888 - /* Exception from kernel and interrupts are enabled. Move to 889 - kernel process stack. 
*/ 890 - else if (eregs->flags & X86_EFLAGS_IF) 891 - regs = (struct pt_regs *)(eregs->sp -= sizeof(struct pt_regs)); 892 - if (eregs != regs) 893 - *regs = *eregs; 894 - return regs; 895 - } 896 - 897 - /* runs on IST stack. */ 898 - asmlinkage void __kprobes do_debug(struct pt_regs *regs, 899 - unsigned long error_code) 900 - { 901 - struct task_struct *tsk = current; 902 - unsigned long condition; 903 - siginfo_t info; 904 - 905 - trace_hardirqs_fixup(); 906 - 907 - get_debugreg(condition, 6); 908 - 909 - /* 910 - * The processor cleared BTF, so don't mark that we need it set. 911 - */ 912 - clear_tsk_thread_flag(tsk, TIF_DEBUGCTLMSR); 913 - tsk->thread.debugctlmsr = 0; 914 - 915 - if (notify_die(DIE_DEBUG, "debug", regs, condition, error_code, 916 - SIGTRAP) == NOTIFY_STOP) 917 - return; 918 - 919 - preempt_conditional_sti(regs); 920 - 921 - /* Mask out spurious debug traps due to lazy DR7 setting */ 922 - if (condition & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)) { 923 - if (!tsk->thread.debugreg7) 924 - goto clear_dr7; 925 - } 926 - 927 - tsk->thread.debugreg6 = condition; 928 - 929 - /* 930 - * Single-stepping through TF: make sure we ignore any events in 931 - * kernel space (but re-enable TF when returning to user mode). 932 - */ 933 - if (condition & DR_STEP) { 934 - if (!user_mode(regs)) 935 - goto clear_TF_reenable; 936 - } 937 - 938 - /* Ok, finally something we can handle */ 939 - tsk->thread.trap_no = 1; 940 - tsk->thread.error_code = error_code; 941 - info.si_signo = SIGTRAP; 942 - info.si_errno = 0; 943 - info.si_code = get_si_code(condition); 944 - info.si_addr = user_mode(regs) ? 
(void __user *)regs->ip : NULL; 945 - force_sig_info(SIGTRAP, &info, tsk); 946 - 947 - clear_dr7: 948 - set_debugreg(0, 7); 949 - preempt_conditional_cli(regs); 950 - return; 951 - 952 - clear_TF_reenable: 953 - set_tsk_thread_flag(tsk, TIF_SINGLESTEP); 954 - regs->flags &= ~X86_EFLAGS_TF; 955 - preempt_conditional_cli(regs); 956 - return; 957 - } 958 - 959 - static int kernel_math_error(struct pt_regs *regs, const char *str, int trapnr) 960 - { 961 - if (fixup_exception(regs)) 962 - return 1; 963 - 964 - notify_die(DIE_GPF, str, regs, 0, trapnr, SIGFPE); 965 - /* Illegal floating point operation in the kernel */ 966 - current->thread.trap_no = trapnr; 967 - die(str, regs, 0); 968 - return 0; 969 - } 970 - 971 - /* 972 - * Note that we play around with the 'TS' bit in an attempt to get 973 - * the correct behaviour even in the presence of the asynchronous 974 - * IRQ13 behaviour 975 - */ 976 - asmlinkage void do_coprocessor_error(struct pt_regs *regs) 977 - { 978 - void __user *ip = (void __user *)(regs->ip); 979 - struct task_struct *task; 980 - siginfo_t info; 981 - unsigned short cwd, swd; 982 - 983 - conditional_sti(regs); 984 - if (!user_mode(regs) && 985 - kernel_math_error(regs, "kernel x87 math error", 16)) 986 - return; 987 - 988 - /* 989 - * Save the info for the exception handler and clear the error. 990 - */ 991 - task = current; 992 - save_init_fpu(task); 993 - task->thread.trap_no = 16; 994 - task->thread.error_code = 0; 995 - info.si_signo = SIGFPE; 996 - info.si_errno = 0; 997 - info.si_code = __SI_FAULT; 998 - info.si_addr = ip; 999 - /* 1000 - * (~cwd & swd) will mask out exceptions that are not set to unmasked 1001 - * status. 0x3f is the exception bits in these regs, 0x200 is the 1002 - * C1 reg you need in case of a stack fault, 0x040 is the stack 1003 - * fault bit. 
We should only be taking one exception at a time, 1004 - * so if this combination doesn't produce any single exception, 1005 - * then we have a bad program that isn't synchronizing its FPU usage 1006 - * and it will suffer the consequences since we won't be able to 1007 - * fully reproduce the context of the exception 1008 - */ 1009 - cwd = get_fpu_cwd(task); 1010 - swd = get_fpu_swd(task); 1011 - switch (swd & ~cwd & 0x3f) { 1012 - case 0x000: /* No unmasked exception */ 1013 - default: /* Multiple exceptions */ 1014 - break; 1015 - case 0x001: /* Invalid Op */ 1016 - /* 1017 - * swd & 0x240 == 0x040: Stack Underflow 1018 - * swd & 0x240 == 0x240: Stack Overflow 1019 - * User must clear the SF bit (0x40) if set 1020 - */ 1021 - info.si_code = FPE_FLTINV; 1022 - break; 1023 - case 0x002: /* Denormalize */ 1024 - case 0x010: /* Underflow */ 1025 - info.si_code = FPE_FLTUND; 1026 - break; 1027 - case 0x004: /* Zero Divide */ 1028 - info.si_code = FPE_FLTDIV; 1029 - break; 1030 - case 0x008: /* Overflow */ 1031 - info.si_code = FPE_FLTOVF; 1032 - break; 1033 - case 0x020: /* Precision */ 1034 - info.si_code = FPE_FLTRES; 1035 - break; 1036 - } 1037 - force_sig_info(SIGFPE, &info, task); 1038 - } 1039 - 1040 - asmlinkage void bad_intr(void) 1041 - { 1042 - printk("bad interrupt"); 1043 - } 1044 - 1045 - asmlinkage void do_simd_coprocessor_error(struct pt_regs *regs) 1046 - { 1047 - void __user *ip = (void __user *)(regs->ip); 1048 - struct task_struct *task; 1049 - siginfo_t info; 1050 - unsigned short mxcsr; 1051 - 1052 - conditional_sti(regs); 1053 - if (!user_mode(regs) && 1054 - kernel_math_error(regs, "kernel simd math error", 19)) 1055 - return; 1056 - 1057 - /* 1058 - * Save the info for the exception handler and clear the error. 
1059 - */ 1060 - task = current; 1061 - save_init_fpu(task); 1062 - task->thread.trap_no = 19; 1063 - task->thread.error_code = 0; 1064 - info.si_signo = SIGFPE; 1065 - info.si_errno = 0; 1066 - info.si_code = __SI_FAULT; 1067 - info.si_addr = ip; 1068 - /* 1069 - * The SIMD FPU exceptions are handled a little differently, as there 1070 - * is only a single status/control register. Thus, to determine which 1071 - * unmasked exception was caught we must mask the exception mask bits 1072 - * at 0x1f80, and then use these to mask the exception bits at 0x3f. 1073 - */ 1074 - mxcsr = get_fpu_mxcsr(task); 1075 - switch (~((mxcsr & 0x1f80) >> 7) & (mxcsr & 0x3f)) { 1076 - case 0x000: 1077 - default: 1078 - break; 1079 - case 0x001: /* Invalid Op */ 1080 - info.si_code = FPE_FLTINV; 1081 - break; 1082 - case 0x002: /* Denormalize */ 1083 - case 0x010: /* Underflow */ 1084 - info.si_code = FPE_FLTUND; 1085 - break; 1086 - case 0x004: /* Zero Divide */ 1087 - info.si_code = FPE_FLTDIV; 1088 - break; 1089 - case 0x008: /* Overflow */ 1090 - info.si_code = FPE_FLTOVF; 1091 - break; 1092 - case 0x020: /* Precision */ 1093 - info.si_code = FPE_FLTRES; 1094 - break; 1095 - } 1096 - force_sig_info(SIGFPE, &info, task); 1097 - } 1098 - 1099 - asmlinkage void do_spurious_interrupt_bug(struct pt_regs *regs) 1100 - { 1101 - } 1102 - 1103 - asmlinkage void __attribute__((weak)) smp_thermal_interrupt(void) 1104 - { 1105 - } 1106 - 1107 - asmlinkage void __attribute__((weak)) mce_threshold_interrupt(void) 1108 - { 1109 - } 1110 - 1111 - /* 1112 - * 'math_state_restore()' saves the current math information in the 1113 - * old math state array, and gets the new ones from the current task 1114 - * 1115 - * Careful.. There are problems with IBM-designed IRQ13 behaviour. 1116 - * Don't touch unless you *really* know how it works. 
1117 - */ 1118 - asmlinkage void math_state_restore(void) 1119 - { 1120 - struct task_struct *me = current; 1121 - 1122 - if (!used_math()) { 1123 - local_irq_enable(); 1124 - /* 1125 - * does a slab alloc which can sleep 1126 - */ 1127 - if (init_fpu(me)) { 1128 - /* 1129 - * ran out of memory! 1130 - */ 1131 - do_group_exit(SIGKILL); 1132 - return; 1133 - } 1134 - local_irq_disable(); 1135 - } 1136 - 1137 - clts(); /* Allow maths ops (or we recurse) */ 1138 - /* 1139 - * Paranoid restore. send a SIGSEGV if we fail to restore the state. 1140 - */ 1141 - if (unlikely(restore_fpu_checking(me))) { 1142 - stts(); 1143 - force_sig(SIGSEGV, me); 1144 - return; 1145 - } 1146 - task_thread_info(me)->status |= TS_USEDFPU; 1147 - me->fpu_counter++; 1148 - } 1149 - EXPORT_SYMBOL_GPL(math_state_restore); 1150 - 1151 - void __init trap_init(void) 1152 - { 1153 - set_intr_gate(0, &divide_error); 1154 - set_intr_gate_ist(1, &debug, DEBUG_STACK); 1155 - set_intr_gate_ist(2, &nmi, NMI_STACK); 1156 - /* int3 can be called from all */ 1157 - set_system_gate_ist(3, &int3, DEBUG_STACK); 1158 - /* int4 can be called from all */ 1159 - set_system_gate(4, &overflow); 1160 - set_intr_gate(5, &bounds); 1161 - set_intr_gate(6, &invalid_op); 1162 - set_intr_gate(7, &device_not_available); 1163 - set_intr_gate_ist(8, &double_fault, DOUBLEFAULT_STACK); 1164 - set_intr_gate(9, &coprocessor_segment_overrun); 1165 - set_intr_gate(10, &invalid_TSS); 1166 - set_intr_gate(11, &segment_not_present); 1167 - set_intr_gate_ist(12, &stack_segment, STACKFAULT_STACK); 1168 - set_intr_gate(13, &general_protection); 1169 - set_intr_gate(14, &page_fault); 1170 - set_intr_gate(15, &spurious_interrupt_bug); 1171 - set_intr_gate(16, &coprocessor_error); 1172 - set_intr_gate(17, &alignment_check); 1173 - #ifdef CONFIG_X86_MCE 1174 - set_intr_gate_ist(18, &machine_check, MCE_STACK); 1175 - #endif 1176 - set_intr_gate(19, &simd_coprocessor_error); 1177 - 1178 - #ifdef CONFIG_IA32_EMULATION 1179 - 
set_system_gate(IA32_SYSCALL_VECTOR, ia32_syscall); 1180 - #endif 1181 - /* 1182 - * Should be a barrier for any external CPU state: 1183 - */ 1184 - cpu_init(); 1185 - } 1186 - 1187 - static int __init oops_setup(char *s) 1188 - { 1189 - if (!s) 1190 - return -EINVAL; 1191 - if (!strcmp(s, "panic")) 1192 - panic_on_oops = 1; 1193 - return 0; 1194 - } 1195 - early_param("oops", oops_setup); 1196 - 1197 - static int __init kstack_setup(char *s) 1198 - { 1199 - if (!s) 1200 - return -EINVAL; 1201 - kstack_depth_to_print = simple_strtoul(s, NULL, 0); 1202 - return 0; 1203 - } 1204 - early_param("kstack", kstack_setup); 1205 - 1206 - static int __init code_bytes_setup(char *s) 1207 - { 1208 - code_bytes = simple_strtoul(s, NULL, 0); 1209 - if (code_bytes > 8192) 1210 - code_bytes = 8192; 1211 - 1212 - return 1; 1213 - } 1214 - __setup("code_bytes=", code_bytes_setup);
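The do_coprocessor_error() handler removed above (it now lives in the unified traps.c) decides which unmasked x87 exception fired by intersecting the status word with the complement of the control word. A standalone sketch of that decode, using the POSIX FPE_* codes from <signal.h>; the 0x3f exception-bit layout is the x87 one described in the comment, but this is an illustration, not the kernel function:

```c
/* Sketch (not kernel code): map the x87 status word (swd), masked by
 * the control word (cwd), to a POSIX FPE si_code, mirroring the switch
 * in do_coprocessor_error(). Returns 0 when no single unmasked
 * exception can be identified (the kernel keeps __SI_FAULT then). */
#define _GNU_SOURCE
#include <signal.h>

static int fpe_si_code(unsigned short swd, unsigned short cwd)
{
	switch (swd & ~cwd & 0x3f) {
	case 0x001: return FPE_FLTINV;	/* Invalid Op */
	case 0x002:			/* Denormalize */
	case 0x010: return FPE_FLTUND;	/* Underflow */
	case 0x004: return FPE_FLTDIV;	/* Zero Divide */
	case 0x008: return FPE_FLTOVF;	/* Overflow */
	case 0x020: return FPE_FLTRES;	/* Precision */
	default:    return 0;		/* none, or multiple at once */
	}
}
```

As the removed comment notes, if the intersection names no single exception the program has not synchronized its FPU usage, and the handler cannot fully reproduce the context of the fault.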
+15 -5
arch/x86/mach-generic/es7000.c
··· 47 47 /* Hook from generic ACPI tables.c */ 48 48 static int __init acpi_madt_oem_check(char *oem_id, char *oem_table_id) 49 49 { 50 - unsigned long oem_addr; 50 + unsigned long oem_addr = 0; 51 + int check_dsdt; 52 + int ret = 0; 53 + 54 + /* check dsdt at first to avoid clear fix_map for oem_addr */ 55 + check_dsdt = es7000_check_dsdt(); 56 + 51 57 if (!find_unisys_acpi_oem_table(&oem_addr)) { 52 - if (es7000_check_dsdt()) 53 - return parse_unisys_oem((char *)oem_addr); 58 + if (check_dsdt) 59 + ret = parse_unisys_oem((char *)oem_addr); 54 60 else { 55 61 setup_unisys(); 56 - return 1; 62 + ret = 1; 57 63 } 64 + /* 65 + * we need to unmap it 66 + */ 67 + unmap_unisys_acpi_oem_table(oem_addr); 58 68 } 59 - return 0; 69 + return ret; 60 70 } 61 71 #else 62 72 static int __init acpi_madt_oem_check(char *oem_id, char *oem_table_id)
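The es7000 fix above is a small instance of a common cleanup discipline: run checks that may disturb the same scarce resource (es7000_check_dsdt() also touches the early fixmap) before taking the mapping, then route every outcome through a single unmap so the mapping can never leak. A minimal sketch with a fake mapping counter; none of these names are the real ES7000 functions:

```c
/* Sketch (illustrative): acquire once, branch on a precomputed check,
 * release on every path through one exit. */
#include <assert.h>

static int fixmap_in_use;	/* stands in for the early fixmap slot */

static void *map_table(void)     { fixmap_in_use++; return &fixmap_in_use; }
static void unmap_table(void *p) { (void)p; fixmap_in_use--; }

static int parse_table(int have_dsdt)
{
	void *addr;
	int ret;

	addr = map_table();
	if (have_dsdt)
		ret = 1;	/* parse_unisys_oem() path */
	else
		ret = 2;	/* setup_unisys() path */
	unmap_table(addr);	/* runs on both paths */
	return ret;
}
```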
+1 -5
arch/x86/mm/Makefile
··· 13 13 mmiotrace-y := pf_in.o mmio-mod.o 14 14 obj-$(CONFIG_MMIOTRACE_TEST) += testmmiotrace.o 15 15 16 - ifeq ($(CONFIG_X86_32),y) 17 - obj-$(CONFIG_NUMA) += discontig_32.o 18 - else 19 - obj-$(CONFIG_NUMA) += numa_64.o 16 + obj-$(CONFIG_NUMA) += numa_$(BITS).o 20 17 obj-$(CONFIG_K8_NUMA) += k8topology_64.o 21 - endif 22 18 obj-$(CONFIG_ACPI_NUMA) += srat_$(BITS).o 23 19 24 20 obj-$(CONFIG_MEMTEST) += memtest.o
arch/x86/mm/discontig_32.c arch/x86/mm/numa_32.c
-5
arch/x86/mm/fault.c
··· 592 592 unsigned long flags; 593 593 #endif 594 594 595 - /* 596 - * We can fault from pretty much anywhere, with unknown IRQ state. 597 - */ 598 - trace_hardirqs_fixup(); 599 - 600 595 tsk = current; 601 596 mm = tsk->mm; 602 597 prefetchw(&mm->mmap_sem);
+5 -5
arch/x86/mm/gup.c
··· 82 82 pte_t pte = gup_get_pte(ptep); 83 83 struct page *page; 84 84 85 - if ((pte_val(pte) & (mask | _PAGE_SPECIAL)) != mask) { 85 + if ((pte_flags(pte) & (mask | _PAGE_SPECIAL)) != mask) { 86 86 pte_unmap(ptep); 87 87 return 0; 88 88 } ··· 116 116 mask = _PAGE_PRESENT|_PAGE_USER; 117 117 if (write) 118 118 mask |= _PAGE_RW; 119 - if ((pte_val(pte) & mask) != mask) 119 + if ((pte_flags(pte) & mask) != mask) 120 120 return 0; 121 121 /* hugepages are never "special" */ 122 - VM_BUG_ON(pte_val(pte) & _PAGE_SPECIAL); 122 + VM_BUG_ON(pte_flags(pte) & _PAGE_SPECIAL); 123 123 VM_BUG_ON(!pfn_valid(pte_pfn(pte))); 124 124 125 125 refs = 0; ··· 173 173 mask = _PAGE_PRESENT|_PAGE_USER; 174 174 if (write) 175 175 mask |= _PAGE_RW; 176 - if ((pte_val(pte) & mask) != mask) 176 + if ((pte_flags(pte) & mask) != mask) 177 177 return 0; 178 178 /* hugepages are never "special" */ 179 - VM_BUG_ON(pte_val(pte) & _PAGE_SPECIAL); 179 + VM_BUG_ON(pte_flags(pte) & _PAGE_SPECIAL); 180 180 VM_BUG_ON(!pfn_valid(pte_pfn(pte))); 181 181 182 182 refs = 0;
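The gup.c hunks switch attribute tests from pte_val() to pte_flags(). The comparison result is the same here, since the permission bits do not overlap the frame number, but under paravirt reading the full PTE value can be more expensive (the embedded frame number may need translating), while the flag bits can be read directly. A sketch with simplified, made-up masks; the real definitions live in the x86 pgtable headers and the real flags mask also covers high bits such as NX:

```c
/* Sketch (illustrative constants): a permission check should compare
 * only the attribute bits, because the raw PTE value also carries the
 * page frame number. */
#include <stdint.h>

typedef uint64_t pteval_t;

#define PTE_FLAGS_MASK	0xfffULL	/* low attribute bits (simplified) */
#define _PAGE_PRESENT	0x001ULL
#define _PAGE_RW	0x002ULL
#define _PAGE_USER	0x004ULL

static pteval_t pte_flags(pteval_t pte)
{
	return pte & PTE_FLAGS_MASK;
}

/* nonzero if the pte permits a user get_user_pages access */
static int pte_allows_gup(pteval_t pte, int write)
{
	pteval_t mask = _PAGE_PRESENT | _PAGE_USER;

	if (write)
		mask |= _PAGE_RW;
	return (pte_flags(pte) & mask) == mask;
}
```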
+1 -1
arch/x86/mm/init_32.c
··· 558 558 559 559 int nx_enabled; 560 560 561 - pteval_t __supported_pte_mask __read_mostly = ~(_PAGE_NX | _PAGE_GLOBAL); 561 + pteval_t __supported_pte_mask __read_mostly = ~(_PAGE_NX | _PAGE_GLOBAL | _PAGE_IOMAP); 562 562 EXPORT_SYMBOL_GPL(__supported_pte_mask); 563 563 564 564 #ifdef CONFIG_X86_PAE
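Here (and in init_64.c below) _PAGE_IOMAP is dropped from __supported_pte_mask, so the new IO-mapping flag is filtered out of ordinary mappings unless one of the PAGE_KERNEL_IO_* protections asks for it explicitly. A sketch of that filtering with illustrative bit values (the mask semantics are the point, not the constants):

```c
/* Sketch (made-up bit positions): __supported_pte_mask strips PTE bits
 * the current configuration should not set by default. */
#include <stdint.h>

typedef uint64_t pteval_t;

#define _PAGE_PRESENT	0x001ULL
#define _PAGE_NX	(1ULL << 63)
#define _PAGE_IOMAP	(1ULL << 10)

/* bits usable for ordinary mappings on this (pretend) configuration */
static pteval_t supported_pte_mask = ~(_PAGE_NX | _PAGE_IOMAP);

static pteval_t mk_pte_val(pteval_t raw)
{
	return raw & supported_pte_mask;	/* drop unsupported bits */
}
```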
+3 -6
arch/x86/mm/init_64.c
··· 89 89 90 90 int after_bootmem; 91 91 92 - unsigned long __supported_pte_mask __read_mostly = ~0UL; 92 + pteval_t __supported_pte_mask __read_mostly = ~_PAGE_IOMAP; 93 93 EXPORT_SYMBOL_GPL(__supported_pte_mask); 94 94 95 95 static int do_not_nx __cpuinitdata; ··· 196 196 } 197 197 198 198 pte = pte_offset_kernel(pmd, vaddr); 199 - if (!pte_none(*pte) && pte_val(new_pte) && 200 - pte_val(*pte) != (pte_val(new_pte) & __supported_pte_mask)) 201 - pte_ERROR(*pte); 202 199 set_pte(pte, new_pte); 203 200 204 201 /* ··· 310 313 if (pfn >= table_top) 311 314 panic("alloc_low_page: ran out of memory"); 312 315 313 - adr = early_ioremap(pfn * PAGE_SIZE, PAGE_SIZE); 316 + adr = early_memremap(pfn * PAGE_SIZE, PAGE_SIZE); 314 317 memset(adr, 0, PAGE_SIZE); 315 318 *phys = pfn * PAGE_SIZE; 316 319 return adr; ··· 746 749 old_start = mr[i].start; 747 750 memmove(&mr[i], &mr[i+1], 748 751 (nr_range - 1 - i) * sizeof (struct map_range)); 749 - mr[i].start = old_start; 752 + mr[i--].start = old_start; 750 753 nr_range--; 751 754 } 752 755
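The last init_64.c hunk fixes the range-merge loop: `mr[i--].start = old_start` re-examines the slot that just absorbed its successor, where the old code advanced past it and could leave further adjacent mergeable ranges unmerged. A standalone sketch of the corrected loop with a simplified struct (not the kernel's exact code):

```c
/* Sketch: collapse adjacent map_range entries with the same attributes,
 * re-checking the merged slot each time as the fix above does. */
#include <string.h>

struct map_range { unsigned long start, end, page_size_mask; };

static int merge_ranges(struct map_range *mr, int nr_range)
{
	int i;

	for (i = 0; nr_range > 1 && i < nr_range - 1; i++) {
		unsigned long old_start;

		if (mr[i].end != mr[i + 1].start ||
		    mr[i].page_size_mask != mr[i + 1].page_size_mask)
			continue;
		/* shift everything down one slot, keep the earlier start */
		old_start = mr[i].start;
		memmove(&mr[i], &mr[i + 1],
			(nr_range - 1 - i) * sizeof(struct map_range));
		mr[i--].start = old_start;	/* re-check this index */
		nr_range--;
	}
	return nr_range;
}
```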
+112 -31
arch/x86/mm/ioremap.c
··· 45 45 } 46 46 EXPORT_SYMBOL(__phys_addr); 47 47 48 + bool __virt_addr_valid(unsigned long x) 49 + { 50 + if (x >= __START_KERNEL_map) { 51 + x -= __START_KERNEL_map; 52 + if (x >= KERNEL_IMAGE_SIZE) 53 + return false; 54 + x += phys_base; 55 + } else { 56 + if (x < PAGE_OFFSET) 57 + return false; 58 + x -= PAGE_OFFSET; 59 + if (system_state == SYSTEM_BOOTING ? 60 + x > MAXMEM : !phys_addr_valid(x)) { 61 + return false; 62 + } 63 + } 64 + 65 + return pfn_valid(x >> PAGE_SHIFT); 66 + } 67 + EXPORT_SYMBOL(__virt_addr_valid); 68 + 48 69 #else 49 70 50 71 static inline int phys_addr_valid(unsigned long addr) ··· 77 56 unsigned long __phys_addr(unsigned long x) 78 57 { 79 58 /* VMALLOC_* aren't constants; not available at the boot time */ 80 - VIRTUAL_BUG_ON(x < PAGE_OFFSET || (system_state != SYSTEM_BOOTING && 81 - is_vmalloc_addr((void *)x))); 59 + VIRTUAL_BUG_ON(x < PAGE_OFFSET); 60 + VIRTUAL_BUG_ON(system_state != SYSTEM_BOOTING && 61 + is_vmalloc_addr((void *) x)); 82 62 return x - PAGE_OFFSET; 83 63 } 84 64 EXPORT_SYMBOL(__phys_addr); 85 65 #endif 66 + 67 + bool __virt_addr_valid(unsigned long x) 68 + { 69 + if (x < PAGE_OFFSET) 70 + return false; 71 + if (system_state != SYSTEM_BOOTING && is_vmalloc_addr((void *) x)) 72 + return false; 73 + return pfn_valid((x - PAGE_OFFSET) >> PAGE_SHIFT); 74 + } 75 + EXPORT_SYMBOL(__virt_addr_valid); 86 76 87 77 #endif 88 78 ··· 274 242 switch (prot_val) { 275 243 case _PAGE_CACHE_UC: 276 244 default: 277 - prot = PAGE_KERNEL_NOCACHE; 245 + prot = PAGE_KERNEL_IO_NOCACHE; 278 246 break; 279 247 case _PAGE_CACHE_UC_MINUS: 280 - prot = PAGE_KERNEL_UC_MINUS; 248 + prot = PAGE_KERNEL_IO_UC_MINUS; 281 249 break; 282 250 case _PAGE_CACHE_WC: 283 - prot = PAGE_KERNEL_WC; 251 + prot = PAGE_KERNEL_IO_WC; 284 252 break; 285 253 case _PAGE_CACHE_WB: 286 - prot = PAGE_KERNEL; 254 + prot = PAGE_KERNEL_IO; 287 255 break; 288 256 } 289 257 ··· 600 568 } 601 569 602 570 static inline void __init early_set_fixmap(enum fixed_addresses idx, 603 
- unsigned long phys) 571 + unsigned long phys, pgprot_t prot) 604 572 { 605 573 if (after_paging_init) 606 - set_fixmap(idx, phys); 574 + __set_fixmap(idx, phys, prot); 607 575 else 608 - __early_set_fixmap(idx, phys, PAGE_KERNEL); 576 + __early_set_fixmap(idx, phys, prot); 609 577 } 610 578 611 579 static inline void __init early_clear_fixmap(enum fixed_addresses idx) ··· 616 584 __early_set_fixmap(idx, 0, __pgprot(0)); 617 585 } 618 586 619 - 620 - static int __initdata early_ioremap_nested; 621 - 587 + static void *prev_map[FIX_BTMAPS_SLOTS] __initdata; 588 + static unsigned long prev_size[FIX_BTMAPS_SLOTS] __initdata; 622 589 static int __init check_early_ioremap_leak(void) 623 590 { 624 - if (!early_ioremap_nested) 591 + int count = 0; 592 + int i; 593 + 594 + for (i = 0; i < FIX_BTMAPS_SLOTS; i++) 595 + if (prev_map[i]) 596 + count++; 597 + 598 + if (!count) 625 599 return 0; 626 600 WARN(1, KERN_WARNING 627 601 "Debug warning: early ioremap leak of %d areas detected.\n", 628 - early_ioremap_nested); 602 + count); 629 603 printk(KERN_WARNING 630 604 "please boot with early_ioremap_debug and report the dmesg.\n"); 631 605 ··· 639 601 } 640 602 late_initcall(check_early_ioremap_leak); 641 603 642 - void __init *early_ioremap(unsigned long phys_addr, unsigned long size) 604 + static void __init *__early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot) 643 605 { 644 606 unsigned long offset, last_addr; 645 - unsigned int nrpages, nesting; 607 + unsigned int nrpages; 646 608 enum fixed_addresses idx0, idx; 609 + int i, slot; 647 610 648 611 WARN_ON(system_state != SYSTEM_BOOTING); 649 612 650 - nesting = early_ioremap_nested; 613 + slot = -1; 614 + for (i = 0; i < FIX_BTMAPS_SLOTS; i++) { 615 + if (!prev_map[i]) { 616 + slot = i; 617 + break; 618 + } 619 + } 620 + 621 + if (slot < 0) { 622 + printk(KERN_INFO "early_iomap(%08lx, %08lx) not found slot\n", 623 + phys_addr, size); 624 + WARN_ON(1); 625 + return NULL; 626 + } 627 + 651 628 if 
(early_ioremap_debug) { 652 629 printk(KERN_INFO "early_ioremap(%08lx, %08lx) [%d] => ", 653 - phys_addr, size, nesting); 630 + phys_addr, size, slot); 654 631 dump_stack(); 655 632 } 656 633 ··· 676 623 return NULL; 677 624 } 678 625 679 - if (nesting >= FIX_BTMAPS_NESTING) { 680 - WARN_ON(1); 681 - return NULL; 682 - } 683 - early_ioremap_nested++; 626 + prev_size[slot] = size; 684 627 /* 685 628 * Mappings have to be page-aligned 686 629 */ ··· 696 647 /* 697 648 * Ok, go for it.. 698 649 */ 699 - idx0 = FIX_BTMAP_BEGIN - NR_FIX_BTMAPS*nesting; 650 + idx0 = FIX_BTMAP_BEGIN - NR_FIX_BTMAPS*slot; 700 651 idx = idx0; 701 652 while (nrpages > 0) { 702 - early_set_fixmap(idx, phys_addr); 653 + early_set_fixmap(idx, phys_addr, prot); 703 654 phys_addr += PAGE_SIZE; 704 655 --idx; 705 656 --nrpages; ··· 707 658 if (early_ioremap_debug) 708 659 printk(KERN_CONT "%08lx + %08lx\n", offset, fix_to_virt(idx0)); 709 660 710 - return (void *) (offset + fix_to_virt(idx0)); 661 + prev_map[slot] = (void *) (offset + fix_to_virt(idx0)); 662 + return prev_map[slot]; 663 + } 664 + 665 + /* Remap an IO device */ 666 + void __init *early_ioremap(unsigned long phys_addr, unsigned long size) 667 + { 668 + return __early_ioremap(phys_addr, size, PAGE_KERNEL_IO); 669 + } 670 + 671 + /* Remap memory */ 672 + void __init *early_memremap(unsigned long phys_addr, unsigned long size) 673 + { 674 + return __early_ioremap(phys_addr, size, PAGE_KERNEL); 711 675 } 712 676 713 677 void __init early_iounmap(void *addr, unsigned long size) ··· 729 667 unsigned long offset; 730 668 unsigned int nrpages; 731 669 enum fixed_addresses idx; 732 - int nesting; 670 + int i, slot; 733 671 734 - nesting = --early_ioremap_nested; 735 - if (WARN_ON(nesting < 0)) 672 + slot = -1; 673 + for (i = 0; i < FIX_BTMAPS_SLOTS; i++) { 674 + if (prev_map[i] == addr) { 675 + slot = i; 676 + break; 677 + } 678 + } 679 + 680 + if (slot < 0) { 681 + printk(KERN_INFO "early_iounmap(%p, %08lx) not found slot\n", 682 + addr, 
size); 683 + WARN_ON(1); 736 684 return; 685 + } 686 + 687 + if (prev_size[slot] != size) { 688 + printk(KERN_INFO "early_iounmap(%p, %08lx) [%d] size not consistent %08lx\n", 689 + addr, size, slot, prev_size[slot]); 690 + WARN_ON(1); 691 + return; 692 + } 737 693 738 694 if (early_ioremap_debug) { 739 695 printk(KERN_INFO "early_iounmap(%p, %08lx) [%d]\n", addr, 740 - size, nesting); 696 + size, slot); 741 697 dump_stack(); 742 698 } 743 699 ··· 767 687 offset = virt_addr & ~PAGE_MASK; 768 688 nrpages = PAGE_ALIGN(offset + size - 1) >> PAGE_SHIFT; 769 689 770 - idx = FIX_BTMAP_BEGIN - NR_FIX_BTMAPS*nesting; 690 + idx = FIX_BTMAP_BEGIN - NR_FIX_BTMAPS*slot; 771 691 while (nrpages > 0) { 772 692 early_clear_fixmap(idx); 773 693 --idx; 774 694 --nrpages; 775 695 } 696 + prev_map[slot] = 0; 776 697 } 777 698 778 699 void __this_fixmap_does_not_exist(void)
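The slot scheme that replaces the nesting counter in ioremap.c reduces to a small allocator: each early mapping claims a free table entry and is released by pointer lookup, so mappings no longer have to be torn down in strict LIFO order, which the old `early_ioremap_nested` counter silently required. An illustrative sketch (not the fixmap code itself):

```c
/* Sketch: slot table keyed by the returned pointer, as in the
 * prev_map[] array added above. */
#include <stddef.h>

#define FIX_BTMAPS_SLOTS 4

static void *prev_map[FIX_BTMAPS_SLOTS];

static void *slot_map(void *cookie)
{
	int i;

	for (i = 0; i < FIX_BTMAPS_SLOTS; i++) {
		if (!prev_map[i]) {
			prev_map[i] = cookie;
			return cookie;
		}
	}
	return NULL;			/* no free slot */
}

static int slot_unmap(void *cookie)
{
	int i;

	for (i = 0; i < FIX_BTMAPS_SLOTS; i++) {
		if (prev_map[i] == cookie) {
			prev_map[i] = NULL;
			return 0;
		}
	}
	return -1;			/* not an active mapping */
}
```

The lookup also lets early_iounmap() diagnose a bogus or double unmap (the "not found slot" warning above), which a bare counter could not do.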
+1 -1
arch/x86/mm/srat_64.c
··· 138 138 return; 139 139 } 140 140 141 - if (is_uv_system()) 141 + if (get_uv_system_type() >= UV_X2APIC) 142 142 apic_id = (pa->apic_id << 8) | pa->local_sapic_eid; 143 143 else 144 144 apic_id = pa->apic_id;
+1 -1
arch/x86/oprofile/Makefile
··· 7 7 timer_int.o ) 8 8 9 9 oprofile-y := $(DRIVER_OBJS) init.o backtrace.o 10 - oprofile-$(CONFIG_X86_LOCAL_APIC) += nmi_int.o op_model_athlon.o \ 10 + oprofile-$(CONFIG_X86_LOCAL_APIC) += nmi_int.o op_model_amd.o \ 11 11 op_model_ppro.o op_model_p4.o 12 12 oprofile-$(CONFIG_X86_IO_APIC) += nmi_timer_int.o
+21 -6
arch/x86/oprofile/nmi_int.c
··· 1 1 /** 2 2 * @file nmi_int.c 3 3 * 4 - * @remark Copyright 2002 OProfile authors 4 + * @remark Copyright 2002-2008 OProfile authors 5 5 * @remark Read the file COPYING 6 6 * 7 7 * @author John Levon <levon@movementarian.org> 8 + * @author Robert Richter <robert.richter@amd.com> 8 9 */ 9 10 10 11 #include <linux/init.h> ··· 440 439 __u8 vendor = boot_cpu_data.x86_vendor; 441 440 __u8 family = boot_cpu_data.x86; 442 441 char *cpu_type; 442 + int ret = 0; 443 443 444 444 if (!cpu_has_apic) 445 445 return -ENODEV; ··· 453 451 default: 454 452 return -ENODEV; 455 453 case 6: 456 - model = &op_athlon_spec; 454 + model = &op_amd_spec; 457 455 cpu_type = "i386/athlon"; 458 456 break; 459 457 case 0xf: 460 - model = &op_athlon_spec; 458 + model = &op_amd_spec; 461 459 /* Actually it could be i386/hammer too, but give 462 460 user space an consistent name. */ 463 461 cpu_type = "x86-64/hammer"; 464 462 break; 465 463 case 0x10: 466 - model = &op_athlon_spec; 464 + model = &op_amd_spec; 467 465 cpu_type = "x86-64/family10"; 466 + break; 467 + case 0x11: 468 + model = &op_amd_spec; 469 + cpu_type = "x86-64/family11h"; 468 470 break; 469 471 } 470 472 break; ··· 496 490 return -ENODEV; 497 491 } 498 492 499 - init_sysfs(); 500 493 #ifdef CONFIG_SMP 501 494 register_cpu_notifier(&oprofile_cpu_nb); 502 495 #endif 503 - using_nmi = 1; 496 + /* default values, can be overwritten by model */ 504 497 ops->create_files = nmi_create_files; 505 498 ops->setup = nmi_setup; 506 499 ops->shutdown = nmi_shutdown; 507 500 ops->start = nmi_start; 508 501 ops->stop = nmi_stop; 509 502 ops->cpu_type = cpu_type; 503 + 504 + if (model->init) 505 + ret = model->init(ops); 506 + if (ret) 507 + return ret; 508 + 509 + init_sysfs(); 510 + using_nmi = 1; 510 511 printk(KERN_INFO "oprofile: using NMI interrupt.\n"); 511 512 return 0; 512 513 } ··· 526 513 unregister_cpu_notifier(&oprofile_cpu_nb); 527 514 #endif 528 515 } 516 + if (model->exit) 517 + model->exit(); 529 518 }
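nmi_int.c now installs default operations first, then gives the CPU model's init hook a chance to override them or veto initialization, with a matching exit hook on teardown. A sketch of that flow with stand-in types (the real oprofile structures carry more hooks than shown here):

```c
/* Sketch: defaults first, then an optional model-specific override
 * that may also fail the whole setup. */
#include <stddef.h>

struct oprofile_operations {
	int (*start)(void);
};

struct op_x86_model_spec {
	int (*init)(struct oprofile_operations *ops);
	void (*exit)(void);
};

static int default_start(void) { return 0; }
static int model_start(void)   { return 1; }

static int model_init(struct oprofile_operations *ops)
{
	ops->start = model_start;	/* model-specific override */
	return 0;
}

static int nmi_init(struct op_x86_model_spec *model,
		    struct oprofile_operations *ops)
{
	int ret = 0;

	ops->start = default_start;	/* default values first */
	if (model->init)
		ret = model->init(ops);	/* may override, may fail */
	return ret;
}
```

Running the model hook before init_sysfs() and before setting `using_nmi`, as the hunk above does, means a failed model init leaves no half-registered state behind.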
+543
arch/x86/oprofile/op_model_amd.c
··· 1 + /* 2 + * @file op_model_amd.c 3 + * athlon / K7 / K8 / Family 10h model-specific MSR operations 4 + * 5 + * @remark Copyright 2002-2008 OProfile authors 6 + * @remark Read the file COPYING 7 + * 8 + * @author John Levon 9 + * @author Philippe Elie 10 + * @author Graydon Hoare 11 + * @author Robert Richter <robert.richter@amd.com> 12 + * @author Barry Kasindorf 13 + */ 14 + 15 + #include <linux/oprofile.h> 16 + #include <linux/device.h> 17 + #include <linux/pci.h> 18 + 19 + #include <asm/ptrace.h> 20 + #include <asm/msr.h> 21 + #include <asm/nmi.h> 22 + 23 + #include "op_x86_model.h" 24 + #include "op_counter.h" 25 + 26 + #define NUM_COUNTERS 4 27 + #define NUM_CONTROLS 4 28 + 29 + #define CTR_IS_RESERVED(msrs, c) (msrs->counters[(c)].addr ? 1 : 0) 30 + #define CTR_READ(l, h, msrs, c) do {rdmsr(msrs->counters[(c)].addr, (l), (h)); } while (0) 31 + #define CTR_WRITE(l, msrs, c) do {wrmsr(msrs->counters[(c)].addr, -(unsigned int)(l), -1); } while (0) 32 + #define CTR_OVERFLOWED(n) (!((n) & (1U<<31))) 33 + 34 + #define CTRL_IS_RESERVED(msrs, c) (msrs->controls[(c)].addr ? 
1 : 0) 35 + #define CTRL_READ(l, h, msrs, c) do {rdmsr(msrs->controls[(c)].addr, (l), (h)); } while (0) 36 + #define CTRL_WRITE(l, h, msrs, c) do {wrmsr(msrs->controls[(c)].addr, (l), (h)); } while (0) 37 + #define CTRL_SET_ACTIVE(n) (n |= (1<<22)) 38 + #define CTRL_SET_INACTIVE(n) (n &= ~(1<<22)) 39 + #define CTRL_CLEAR_LO(x) (x &= (1<<21)) 40 + #define CTRL_CLEAR_HI(x) (x &= 0xfffffcf0) 41 + #define CTRL_SET_ENABLE(val) (val |= 1<<20) 42 + #define CTRL_SET_USR(val, u) (val |= ((u & 1) << 16)) 43 + #define CTRL_SET_KERN(val, k) (val |= ((k & 1) << 17)) 44 + #define CTRL_SET_UM(val, m) (val |= (m << 8)) 45 + #define CTRL_SET_EVENT_LOW(val, e) (val |= (e & 0xff)) 46 + #define CTRL_SET_EVENT_HIGH(val, e) (val |= ((e >> 8) & 0xf)) 47 + #define CTRL_SET_HOST_ONLY(val, h) (val |= ((h & 1) << 9)) 48 + #define CTRL_SET_GUEST_ONLY(val, h) (val |= ((h & 1) << 8)) 49 + 50 + static unsigned long reset_value[NUM_COUNTERS]; 51 + 52 + #ifdef CONFIG_OPROFILE_IBS 53 + 54 + /* IbsFetchCtl bits/masks */ 55 + #define IBS_FETCH_HIGH_VALID_BIT (1UL << 17) /* bit 49 */ 56 + #define IBS_FETCH_HIGH_ENABLE (1UL << 16) /* bit 48 */ 57 + #define IBS_FETCH_LOW_MAX_CNT_MASK 0x0000FFFFUL /* MaxCnt mask */ 58 + 59 + /*IbsOpCtl bits */ 60 + #define IBS_OP_LOW_VALID_BIT (1ULL<<18) /* bit 18 */ 61 + #define IBS_OP_LOW_ENABLE (1ULL<<17) /* bit 17 */ 62 + 63 + /* Codes used in cpu_buffer.c */ 64 + /* This produces duplicate code, need to be fixed */ 65 + #define IBS_FETCH_BEGIN 3 66 + #define IBS_OP_BEGIN 4 67 + 68 + /* The function interface needs to be fixed, something like add 69 + data. Should then be added to linux/oprofile.h. 
*/ 70 + extern void oprofile_add_ibs_sample(struct pt_regs *const regs, 71 + unsigned int * const ibs_sample, u8 code); 72 + 73 + struct ibs_fetch_sample { 74 + /* MSRC001_1031 IBS Fetch Linear Address Register */ 75 + unsigned int ibs_fetch_lin_addr_low; 76 + unsigned int ibs_fetch_lin_addr_high; 77 + /* MSRC001_1030 IBS Fetch Control Register */ 78 + unsigned int ibs_fetch_ctl_low; 79 + unsigned int ibs_fetch_ctl_high; 80 + /* MSRC001_1032 IBS Fetch Physical Address Register */ 81 + unsigned int ibs_fetch_phys_addr_low; 82 + unsigned int ibs_fetch_phys_addr_high; 83 + }; 84 + 85 + struct ibs_op_sample { 86 + /* MSRC001_1034 IBS Op Logical Address Register (IbsRIP) */ 87 + unsigned int ibs_op_rip_low; 88 + unsigned int ibs_op_rip_high; 89 + /* MSRC001_1035 IBS Op Data Register */ 90 + unsigned int ibs_op_data1_low; 91 + unsigned int ibs_op_data1_high; 92 + /* MSRC001_1036 IBS Op Data 2 Register */ 93 + unsigned int ibs_op_data2_low; 94 + unsigned int ibs_op_data2_high; 95 + /* MSRC001_1037 IBS Op Data 3 Register */ 96 + unsigned int ibs_op_data3_low; 97 + unsigned int ibs_op_data3_high; 98 + /* MSRC001_1038 IBS DC Linear Address Register (IbsDcLinAd) */ 99 + unsigned int ibs_dc_linear_low; 100 + unsigned int ibs_dc_linear_high; 101 + /* MSRC001_1039 IBS DC Physical Address Register (IbsDcPhysAd) */ 102 + unsigned int ibs_dc_phys_low; 103 + unsigned int ibs_dc_phys_high; 104 + }; 105 + 106 + /* 107 + * unitialize the APIC for the IBS interrupts if needed on AMD Family10h+ 108 + */ 109 + static void clear_ibs_nmi(void); 110 + 111 + static int ibs_allowed; /* AMD Family10h and later */ 112 + 113 + struct op_ibs_config { 114 + unsigned long op_enabled; 115 + unsigned long fetch_enabled; 116 + unsigned long max_cnt_fetch; 117 + unsigned long max_cnt_op; 118 + unsigned long rand_en; 119 + unsigned long dispatched_ops; 120 + }; 121 + 122 + static struct op_ibs_config ibs_config; 123 + 124 + #endif 125 + 126 + /* functions for op_amd_spec */ 127 + 128 + static void 
op_amd_fill_in_addresses(struct op_msrs * const msrs) 129 + { 130 + int i; 131 + 132 + for (i = 0; i < NUM_COUNTERS; i++) { 133 + if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i)) 134 + msrs->counters[i].addr = MSR_K7_PERFCTR0 + i; 135 + else 136 + msrs->counters[i].addr = 0; 137 + } 138 + 139 + for (i = 0; i < NUM_CONTROLS; i++) { 140 + if (reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i)) 141 + msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i; 142 + else 143 + msrs->controls[i].addr = 0; 144 + } 145 + } 146 + 147 + 148 + static void op_amd_setup_ctrs(struct op_msrs const * const msrs) 149 + { 150 + unsigned int low, high; 151 + int i; 152 + 153 + /* clear all counters */ 154 + for (i = 0 ; i < NUM_CONTROLS; ++i) { 155 + if (unlikely(!CTRL_IS_RESERVED(msrs, i))) 156 + continue; 157 + CTRL_READ(low, high, msrs, i); 158 + CTRL_CLEAR_LO(low); 159 + CTRL_CLEAR_HI(high); 160 + CTRL_WRITE(low, high, msrs, i); 161 + } 162 + 163 + /* avoid a false detection of ctr overflows in NMI handler */ 164 + for (i = 0; i < NUM_COUNTERS; ++i) { 165 + if (unlikely(!CTR_IS_RESERVED(msrs, i))) 166 + continue; 167 + CTR_WRITE(1, msrs, i); 168 + } 169 + 170 + /* enable active counters */ 171 + for (i = 0; i < NUM_COUNTERS; ++i) { 172 + if ((counter_config[i].enabled) && (CTR_IS_RESERVED(msrs, i))) { 173 + reset_value[i] = counter_config[i].count; 174 + 175 + CTR_WRITE(counter_config[i].count, msrs, i); 176 + 177 + CTRL_READ(low, high, msrs, i); 178 + CTRL_CLEAR_LO(low); 179 + CTRL_CLEAR_HI(high); 180 + CTRL_SET_ENABLE(low); 181 + CTRL_SET_USR(low, counter_config[i].user); 182 + CTRL_SET_KERN(low, counter_config[i].kernel); 183 + CTRL_SET_UM(low, counter_config[i].unit_mask); 184 + CTRL_SET_EVENT_LOW(low, counter_config[i].event); 185 + CTRL_SET_EVENT_HIGH(high, counter_config[i].event); 186 + CTRL_SET_HOST_ONLY(high, 0); 187 + CTRL_SET_GUEST_ONLY(high, 0); 188 + 189 + CTRL_WRITE(low, high, msrs, i); 190 + } else { 191 + reset_value[i] = 0; 192 + } 193 + } 194 + } 195 + 196 + #ifdef CONFIG_OPROFILE_IBS 
197 + 198 + static inline int 199 + op_amd_handle_ibs(struct pt_regs * const regs, 200 + struct op_msrs const * const msrs) 201 + { 202 + unsigned int low, high; 203 + struct ibs_fetch_sample ibs_fetch; 204 + struct ibs_op_sample ibs_op; 205 + 206 + if (!ibs_allowed) 207 + return 1; 208 + 209 + if (ibs_config.fetch_enabled) { 210 + rdmsr(MSR_AMD64_IBSFETCHCTL, low, high); 211 + if (high & IBS_FETCH_HIGH_VALID_BIT) { 212 + ibs_fetch.ibs_fetch_ctl_high = high; 213 + ibs_fetch.ibs_fetch_ctl_low = low; 214 + rdmsr(MSR_AMD64_IBSFETCHLINAD, low, high); 215 + ibs_fetch.ibs_fetch_lin_addr_high = high; 216 + ibs_fetch.ibs_fetch_lin_addr_low = low; 217 + rdmsr(MSR_AMD64_IBSFETCHPHYSAD, low, high); 218 + ibs_fetch.ibs_fetch_phys_addr_high = high; 219 + ibs_fetch.ibs_fetch_phys_addr_low = low; 220 + 221 + oprofile_add_ibs_sample(regs, 222 + (unsigned int *)&ibs_fetch, 223 + IBS_FETCH_BEGIN); 224 + 225 + /*reenable the IRQ */ 226 + rdmsr(MSR_AMD64_IBSFETCHCTL, low, high); 227 + high &= ~IBS_FETCH_HIGH_VALID_BIT; 228 + high |= IBS_FETCH_HIGH_ENABLE; 229 + low &= IBS_FETCH_LOW_MAX_CNT_MASK; 230 + wrmsr(MSR_AMD64_IBSFETCHCTL, low, high); 231 + } 232 + } 233 + 234 + if (ibs_config.op_enabled) { 235 + rdmsr(MSR_AMD64_IBSOPCTL, low, high); 236 + if (low & IBS_OP_LOW_VALID_BIT) { 237 + rdmsr(MSR_AMD64_IBSOPRIP, low, high); 238 + ibs_op.ibs_op_rip_low = low; 239 + ibs_op.ibs_op_rip_high = high; 240 + rdmsr(MSR_AMD64_IBSOPDATA, low, high); 241 + ibs_op.ibs_op_data1_low = low; 242 + ibs_op.ibs_op_data1_high = high; 243 + rdmsr(MSR_AMD64_IBSOPDATA2, low, high); 244 + ibs_op.ibs_op_data2_low = low; 245 + ibs_op.ibs_op_data2_high = high; 246 + rdmsr(MSR_AMD64_IBSOPDATA3, low, high); 247 + ibs_op.ibs_op_data3_low = low; 248 + ibs_op.ibs_op_data3_high = high; 249 + rdmsr(MSR_AMD64_IBSDCLINAD, low, high); 250 + ibs_op.ibs_dc_linear_low = low; 251 + ibs_op.ibs_dc_linear_high = high; 252 + rdmsr(MSR_AMD64_IBSDCPHYSAD, low, high); 253 + ibs_op.ibs_dc_phys_low = low; 254 + ibs_op.ibs_dc_phys_high 
= high; 255 + 256 + /* reenable the IRQ */ 257 + oprofile_add_ibs_sample(regs, 258 + (unsigned int *)&ibs_op, 259 + IBS_OP_BEGIN); 260 + rdmsr(MSR_AMD64_IBSOPCTL, low, high); 261 + high = 0; 262 + low &= ~IBS_OP_LOW_VALID_BIT; 263 + low |= IBS_OP_LOW_ENABLE; 264 + wrmsr(MSR_AMD64_IBSOPCTL, low, high); 265 + } 266 + } 267 + 268 + return 1; 269 + } 270 + 271 + #endif 272 + 273 + static int op_amd_check_ctrs(struct pt_regs * const regs, 274 + struct op_msrs const * const msrs) 275 + { 276 + unsigned int low, high; 277 + int i; 278 + 279 + for (i = 0 ; i < NUM_COUNTERS; ++i) { 280 + if (!reset_value[i]) 281 + continue; 282 + CTR_READ(low, high, msrs, i); 283 + if (CTR_OVERFLOWED(low)) { 284 + oprofile_add_sample(regs, i); 285 + CTR_WRITE(reset_value[i], msrs, i); 286 + } 287 + } 288 + 289 + #ifdef CONFIG_OPROFILE_IBS 290 + op_amd_handle_ibs(regs, msrs); 291 + #endif 292 + 293 + /* See op_model_ppro.c */ 294 + return 1; 295 + } 296 + 297 + static void op_amd_start(struct op_msrs const * const msrs) 298 + { 299 + unsigned int low, high; 300 + int i; 301 + for (i = 0 ; i < NUM_COUNTERS ; ++i) { 302 + if (reset_value[i]) { 303 + CTRL_READ(low, high, msrs, i); 304 + CTRL_SET_ACTIVE(low); 305 + CTRL_WRITE(low, high, msrs, i); 306 + } 307 + } 308 + 309 + #ifdef CONFIG_OPROFILE_IBS 310 + if (ibs_allowed && ibs_config.fetch_enabled) { 311 + low = (ibs_config.max_cnt_fetch >> 4) & 0xFFFF; 312 + high = IBS_FETCH_HIGH_ENABLE; 313 + wrmsr(MSR_AMD64_IBSFETCHCTL, low, high); 314 + } 315 + 316 + if (ibs_allowed && ibs_config.op_enabled) { 317 + low = ((ibs_config.max_cnt_op >> 4) & 0xFFFF) + IBS_OP_LOW_ENABLE; 318 + high = 0; 319 + wrmsr(MSR_AMD64_IBSOPCTL, low, high); 320 + } 321 + #endif 322 + } 323 + 324 + 325 + static void op_amd_stop(struct op_msrs const * const msrs) 326 + { 327 + unsigned int low, high; 328 + int i; 329 + 330 + /* Subtle: stop on all counters to avoid race with 331 + * setting our pm callback */ 332 + for (i = 0 ; i < NUM_COUNTERS ; ++i) { 333 + if 
(!reset_value[i]) 334 + continue; 335 + CTRL_READ(low, high, msrs, i); 336 + CTRL_SET_INACTIVE(low); 337 + CTRL_WRITE(low, high, msrs, i); 338 + } 339 + 340 + #ifdef CONFIG_OPROFILE_IBS 341 + if (ibs_allowed && ibs_config.fetch_enabled) { 342 + low = 0; /* clear max count and enable */ 343 + high = 0; 344 + wrmsr(MSR_AMD64_IBSFETCHCTL, low, high); 345 + } 346 + 347 + if (ibs_allowed && ibs_config.op_enabled) { 348 + low = 0; /* clear max count and enable */ 349 + high = 0; 350 + wrmsr(MSR_AMD64_IBSOPCTL, low, high); 351 + } 352 + #endif 353 + } 354 + 355 + static void op_amd_shutdown(struct op_msrs const * const msrs) 356 + { 357 + int i; 358 + 359 + for (i = 0 ; i < NUM_COUNTERS ; ++i) { 360 + if (CTR_IS_RESERVED(msrs, i)) 361 + release_perfctr_nmi(MSR_K7_PERFCTR0 + i); 362 + } 363 + for (i = 0 ; i < NUM_CONTROLS ; ++i) { 364 + if (CTRL_IS_RESERVED(msrs, i)) 365 + release_evntsel_nmi(MSR_K7_EVNTSEL0 + i); 366 + } 367 + } 368 + 369 + #ifndef CONFIG_OPROFILE_IBS 370 + 371 + /* no IBS support */ 372 + 373 + static int op_amd_init(struct oprofile_operations *ops) 374 + { 375 + return 0; 376 + } 377 + 378 + static void op_amd_exit(void) {} 379 + 380 + #else 381 + 382 + static u8 ibs_eilvt_off; 383 + 384 + static inline void apic_init_ibs_nmi_per_cpu(void *arg) 385 + { 386 + ibs_eilvt_off = setup_APIC_eilvt_ibs(0, APIC_EILVT_MSG_NMI, 0); 387 + } 388 + 389 + static inline void apic_clear_ibs_nmi_per_cpu(void *arg) 390 + { 391 + setup_APIC_eilvt_ibs(0, APIC_EILVT_MSG_FIX, 1); 392 + } 393 + 394 + static int pfm_amd64_setup_eilvt(void) 395 + { 396 + #define IBSCTL_LVTOFFSETVAL (1 << 8) 397 + #define IBSCTL 0x1cc 398 + struct pci_dev *cpu_cfg; 399 + int nodes; 400 + u32 value = 0; 401 + 402 + /* per CPU setup */ 403 + on_each_cpu(apic_init_ibs_nmi_per_cpu, NULL, 1); 404 + 405 + nodes = 0; 406 + cpu_cfg = NULL; 407 + do { 408 + cpu_cfg = pci_get_device(PCI_VENDOR_ID_AMD, 409 + PCI_DEVICE_ID_AMD_10H_NB_MISC, 410 + cpu_cfg); 411 + if (!cpu_cfg) 412 + break; 413 + ++nodes; 414 + 
pci_write_config_dword(cpu_cfg, IBSCTL, ibs_eilvt_off 415 + | IBSCTL_LVTOFFSETVAL); 416 + pci_read_config_dword(cpu_cfg, IBSCTL, &value); 417 + if (value != (ibs_eilvt_off | IBSCTL_LVTOFFSETVAL)) { 418 + printk(KERN_DEBUG "Failed to setup IBS LVT offset, " 419 + "IBSCTL = 0x%08x", value); 420 + return 1; 421 + } 422 + } while (1); 423 + 424 + if (!nodes) { 425 + printk(KERN_DEBUG "No CPU node configured for IBS"); 426 + return 1; 427 + } 428 + 429 + #ifdef CONFIG_NUMA 430 + /* Sanity check */ 431 + /* Works only for 64bit with proper numa implementation. */ 432 + if (nodes != num_possible_nodes()) { 433 + printk(KERN_DEBUG "Failed to setup CPU node(s) for IBS, " 434 + "found: %d, expected %d", 435 + nodes, num_possible_nodes()); 436 + return 1; 437 + } 438 + #endif 439 + return 0; 440 + } 441 + 442 + /* 443 + * initialize the APIC for the IBS interrupts 444 + * if available (AMD Family10h rev B0 and later) 445 + */ 446 + static void setup_ibs(void) 447 + { 448 + ibs_allowed = boot_cpu_has(X86_FEATURE_IBS); 449 + 450 + if (!ibs_allowed) 451 + return; 452 + 453 + if (pfm_amd64_setup_eilvt()) { 454 + ibs_allowed = 0; 455 + return; 456 + } 457 + 458 + printk(KERN_INFO "oprofile: AMD IBS detected\n"); 459 + } 460 + 461 + 462 + /* 463 + * uninitialize the APIC for the IBS interrupts if needed on AMD Family10h 464 + * rev B0 and later */ 465 + static void clear_ibs_nmi(void) 466 + { 467 + if (ibs_allowed) 468 + on_each_cpu(apic_clear_ibs_nmi_per_cpu, NULL, 1); 469 + } 470 + 471 + static int (*create_arch_files)(struct super_block * sb, struct dentry * root); 472 + 473 + static int setup_ibs_files(struct super_block * sb, struct dentry * root) 474 + { 475 + char buf[12]; 476 + struct dentry *dir; 477 + int ret = 0; 478 + 479 + /* architecture specific files */ 480 + if (create_arch_files) 481 + ret = create_arch_files(sb, root); 482 + 483 + if (ret) 484 + return ret; 485 + 486 + if (!ibs_allowed) 487 + return ret; 488 + 489 + /* model specific files */ 490 + 491 + /* setup
some reasonable defaults */ 492 + ibs_config.max_cnt_fetch = 250000; 493 + ibs_config.fetch_enabled = 0; 494 + ibs_config.max_cnt_op = 250000; 495 + ibs_config.op_enabled = 0; 496 + ibs_config.dispatched_ops = 1; 497 + snprintf(buf, sizeof(buf), "ibs_fetch"); 498 + dir = oprofilefs_mkdir(sb, root, buf); 499 + oprofilefs_create_ulong(sb, dir, "rand_enable", 500 + &ibs_config.rand_en); 501 + oprofilefs_create_ulong(sb, dir, "enable", 502 + &ibs_config.fetch_enabled); 503 + oprofilefs_create_ulong(sb, dir, "max_count", 504 + &ibs_config.max_cnt_fetch); 505 + snprintf(buf, sizeof(buf), "ibs_uops"); 506 + dir = oprofilefs_mkdir(sb, root, buf); 507 + oprofilefs_create_ulong(sb, dir, "enable", 508 + &ibs_config.op_enabled); 509 + oprofilefs_create_ulong(sb, dir, "max_count", 510 + &ibs_config.max_cnt_op); 511 + oprofilefs_create_ulong(sb, dir, "dispatched_ops", 512 + &ibs_config.dispatched_ops); 513 + 514 + return 0; 515 + } 516 + 517 + static int op_amd_init(struct oprofile_operations *ops) 518 + { 519 + setup_ibs(); 520 + create_arch_files = ops->create_files; 521 + ops->create_files = setup_ibs_files; 522 + return 0; 523 + } 524 + 525 + static void op_amd_exit(void) 526 + { 527 + clear_ibs_nmi(); 528 + } 529 + 530 + #endif 531 + 532 + struct op_x86_model_spec const op_amd_spec = { 533 + .init = op_amd_init, 534 + .exit = op_amd_exit, 535 + .num_counters = NUM_COUNTERS, 536 + .num_controls = NUM_CONTROLS, 537 + .fill_in_addresses = &op_amd_fill_in_addresses, 538 + .setup_ctrs = &op_amd_setup_ctrs, 539 + .check_ctrs = &op_amd_check_ctrs, 540 + .start = &op_amd_start, 541 + .stop = &op_amd_stop, 542 + .shutdown = &op_amd_shutdown 543 + };
-190
arch/x86/oprofile/op_model_athlon.c
··· 1 - /* 2 - * @file op_model_athlon.h 3 - * athlon / K7 / K8 / Family 10h model-specific MSR operations 4 - * 5 - * @remark Copyright 2002 OProfile authors 6 - * @remark Read the file COPYING 7 - * 8 - * @author John Levon 9 - * @author Philippe Elie 10 - * @author Graydon Hoare 11 - */ 12 - 13 - #include <linux/oprofile.h> 14 - #include <asm/ptrace.h> 15 - #include <asm/msr.h> 16 - #include <asm/nmi.h> 17 - 18 - #include "op_x86_model.h" 19 - #include "op_counter.h" 20 - 21 - #define NUM_COUNTERS 4 22 - #define NUM_CONTROLS 4 23 - 24 - #define CTR_IS_RESERVED(msrs, c) (msrs->counters[(c)].addr ? 1 : 0) 25 - #define CTR_READ(l, h, msrs, c) do {rdmsr(msrs->counters[(c)].addr, (l), (h)); } while (0) 26 - #define CTR_WRITE(l, msrs, c) do {wrmsr(msrs->counters[(c)].addr, -(unsigned int)(l), -1); } while (0) 27 - #define CTR_OVERFLOWED(n) (!((n) & (1U<<31))) 28 - 29 - #define CTRL_IS_RESERVED(msrs, c) (msrs->controls[(c)].addr ? 1 : 0) 30 - #define CTRL_READ(l, h, msrs, c) do {rdmsr(msrs->controls[(c)].addr, (l), (h)); } while (0) 31 - #define CTRL_WRITE(l, h, msrs, c) do {wrmsr(msrs->controls[(c)].addr, (l), (h)); } while (0) 32 - #define CTRL_SET_ACTIVE(n) (n |= (1<<22)) 33 - #define CTRL_SET_INACTIVE(n) (n &= ~(1<<22)) 34 - #define CTRL_CLEAR_LO(x) (x &= (1<<21)) 35 - #define CTRL_CLEAR_HI(x) (x &= 0xfffffcf0) 36 - #define CTRL_SET_ENABLE(val) (val |= 1<<20) 37 - #define CTRL_SET_USR(val, u) (val |= ((u & 1) << 16)) 38 - #define CTRL_SET_KERN(val, k) (val |= ((k & 1) << 17)) 39 - #define CTRL_SET_UM(val, m) (val |= (m << 8)) 40 - #define CTRL_SET_EVENT_LOW(val, e) (val |= (e & 0xff)) 41 - #define CTRL_SET_EVENT_HIGH(val, e) (val |= ((e >> 8) & 0xf)) 42 - #define CTRL_SET_HOST_ONLY(val, h) (val |= ((h & 1) << 9)) 43 - #define CTRL_SET_GUEST_ONLY(val, h) (val |= ((h & 1) << 8)) 44 - 45 - static unsigned long reset_value[NUM_COUNTERS]; 46 - 47 - static void athlon_fill_in_addresses(struct op_msrs * const msrs) 48 - { 49 - int i; 50 - 51 - for (i = 0; i < 
NUM_COUNTERS; i++) { 52 - if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i)) 53 - msrs->counters[i].addr = MSR_K7_PERFCTR0 + i; 54 - else 55 - msrs->counters[i].addr = 0; 56 - } 57 - 58 - for (i = 0; i < NUM_CONTROLS; i++) { 59 - if (reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i)) 60 - msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i; 61 - else 62 - msrs->controls[i].addr = 0; 63 - } 64 - } 65 - 66 - 67 - static void athlon_setup_ctrs(struct op_msrs const * const msrs) 68 - { 69 - unsigned int low, high; 70 - int i; 71 - 72 - /* clear all counters */ 73 - for (i = 0 ; i < NUM_CONTROLS; ++i) { 74 - if (unlikely(!CTRL_IS_RESERVED(msrs, i))) 75 - continue; 76 - CTRL_READ(low, high, msrs, i); 77 - CTRL_CLEAR_LO(low); 78 - CTRL_CLEAR_HI(high); 79 - CTRL_WRITE(low, high, msrs, i); 80 - } 81 - 82 - /* avoid a false detection of ctr overflows in NMI handler */ 83 - for (i = 0; i < NUM_COUNTERS; ++i) { 84 - if (unlikely(!CTR_IS_RESERVED(msrs, i))) 85 - continue; 86 - CTR_WRITE(1, msrs, i); 87 - } 88 - 89 - /* enable active counters */ 90 - for (i = 0; i < NUM_COUNTERS; ++i) { 91 - if ((counter_config[i].enabled) && (CTR_IS_RESERVED(msrs, i))) { 92 - reset_value[i] = counter_config[i].count; 93 - 94 - CTR_WRITE(counter_config[i].count, msrs, i); 95 - 96 - CTRL_READ(low, high, msrs, i); 97 - CTRL_CLEAR_LO(low); 98 - CTRL_CLEAR_HI(high); 99 - CTRL_SET_ENABLE(low); 100 - CTRL_SET_USR(low, counter_config[i].user); 101 - CTRL_SET_KERN(low, counter_config[i].kernel); 102 - CTRL_SET_UM(low, counter_config[i].unit_mask); 103 - CTRL_SET_EVENT_LOW(low, counter_config[i].event); 104 - CTRL_SET_EVENT_HIGH(high, counter_config[i].event); 105 - CTRL_SET_HOST_ONLY(high, 0); 106 - CTRL_SET_GUEST_ONLY(high, 0); 107 - 108 - CTRL_WRITE(low, high, msrs, i); 109 - } else { 110 - reset_value[i] = 0; 111 - } 112 - } 113 - } 114 - 115 - 116 - static int athlon_check_ctrs(struct pt_regs * const regs, 117 - struct op_msrs const * const msrs) 118 - { 119 - unsigned int low, high; 120 - int i; 121 - 122 - for (i = 0 
; i < NUM_COUNTERS; ++i) { 123 - if (!reset_value[i]) 124 - continue; 125 - CTR_READ(low, high, msrs, i); 126 - if (CTR_OVERFLOWED(low)) { 127 - oprofile_add_sample(regs, i); 128 - CTR_WRITE(reset_value[i], msrs, i); 129 - } 130 - } 131 - 132 - /* See op_model_ppro.c */ 133 - return 1; 134 - } 135 - 136 - 137 - static void athlon_start(struct op_msrs const * const msrs) 138 - { 139 - unsigned int low, high; 140 - int i; 141 - for (i = 0 ; i < NUM_COUNTERS ; ++i) { 142 - if (reset_value[i]) { 143 - CTRL_READ(low, high, msrs, i); 144 - CTRL_SET_ACTIVE(low); 145 - CTRL_WRITE(low, high, msrs, i); 146 - } 147 - } 148 - } 149 - 150 - 151 - static void athlon_stop(struct op_msrs const * const msrs) 152 - { 153 - unsigned int low, high; 154 - int i; 155 - 156 - /* Subtle: stop on all counters to avoid race with 157 - * setting our pm callback */ 158 - for (i = 0 ; i < NUM_COUNTERS ; ++i) { 159 - if (!reset_value[i]) 160 - continue; 161 - CTRL_READ(low, high, msrs, i); 162 - CTRL_SET_INACTIVE(low); 163 - CTRL_WRITE(low, high, msrs, i); 164 - } 165 - } 166 - 167 - static void athlon_shutdown(struct op_msrs const * const msrs) 168 - { 169 - int i; 170 - 171 - for (i = 0 ; i < NUM_COUNTERS ; ++i) { 172 - if (CTR_IS_RESERVED(msrs, i)) 173 - release_perfctr_nmi(MSR_K7_PERFCTR0 + i); 174 - } 175 - for (i = 0 ; i < NUM_CONTROLS ; ++i) { 176 - if (CTRL_IS_RESERVED(msrs, i)) 177 - release_evntsel_nmi(MSR_K7_EVNTSEL0 + i); 178 - } 179 - } 180 - 181 - struct op_x86_model_spec const op_athlon_spec = { 182 - .num_counters = NUM_COUNTERS, 183 - .num_controls = NUM_CONTROLS, 184 - .fill_in_addresses = &athlon_fill_in_addresses, 185 - .setup_ctrs = &athlon_setup_ctrs, 186 - .check_ctrs = &athlon_check_ctrs, 187 - .start = &athlon_start, 188 - .stop = &athlon_stop, 189 - .shutdown = &athlon_shutdown 190 - };
+3 -1
arch/x86/oprofile/op_x86_model.h
··· 32 32 * various x86 CPU models' perfctr support. 33 33 */ 34 34 struct op_x86_model_spec { 35 + int (*init)(struct oprofile_operations *ops); 36 + void (*exit)(void); 35 37 unsigned int const num_counters; 36 38 unsigned int const num_controls; 37 39 void (*fill_in_addresses)(struct op_msrs * const msrs); ··· 48 46 extern struct op_x86_model_spec const op_ppro_spec; 49 47 extern struct op_x86_model_spec const op_p4_spec; 50 48 extern struct op_x86_model_spec const op_p4_ht2_spec; 51 - extern struct op_x86_model_spec const op_athlon_spec; 49 + extern struct op_x86_model_spec const op_amd_spec; 52 50 53 51 #endif /* OP_X86_MODEL_H */
+28
arch/x86/pci/fixup.c
··· 511 511 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, 0x1202, fam10h_pci_cfg_space_size); 512 512 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, 0x1203, fam10h_pci_cfg_space_size); 513 513 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, 0x1204, fam10h_pci_cfg_space_size); 514 + 515 + /* 516 + * SB600: Disable BAR1 on device 14.0 to avoid HPET resources from 517 + * confusing the PCI engine: 518 + */ 519 + static void sb600_disable_hpet_bar(struct pci_dev *dev) 520 + { 521 + u8 val; 522 + 523 + /* 524 + * The SB600 and SB700 both share the same device 525 + * ID, but the PM register 0x55 does something different 526 + * for the SB700, so make sure we are dealing with the 527 + * SB600 before touching the bit: 528 + */ 529 + 530 + pci_read_config_byte(dev, 0x08, &val); 531 + 532 + if (val < 0x2F) { 533 + outb(0x55, 0xCD6); 534 + val = inb(0xCD7); 535 + 536 + /* Set bit 7 in PM register 0x55 */ 537 + outb(0x55, 0xCD6); 538 + outb(val | 0x80, 0xCD7); 539 + } 540 + } 541 + DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_ATI, 0x4385, sb600_disable_hpet_bar);
+89 -70
drivers/char/hpet.c
··· 53 53 54 54 #define HPET_RANGE_SIZE 1024 /* from HPET spec */ 55 55 56 + 57 + /* WARNING -- don't get confused. These macros are never used 58 + * to write the (single) counter, and rarely to read it. 59 + * They're badly named; to fix, someday. 60 + */ 56 61 #if BITS_PER_LONG == 64 57 62 #define write_counter(V, MC) writeq(V, MC) 58 63 #define read_counter(MC) readq(MC) ··· 82 77 .rating = 250, 83 78 .read = read_hpet, 84 79 .mask = CLOCKSOURCE_MASK(64), 85 - .mult = 0, /*to be caluclated*/ 80 + .mult = 0, /* to be calculated */ 86 81 .shift = 10, 87 82 .flags = CLOCK_SOURCE_IS_CONTINUOUS, 88 83 }; ··· 91 86 92 87 /* A lock for concurrent access by app and isr hpet activity. */ 93 88 static DEFINE_SPINLOCK(hpet_lock); 94 - /* A lock for concurrent intermodule access to hpet and isr hpet activity. */ 95 - static DEFINE_SPINLOCK(hpet_task_lock); 96 89 97 90 #define HPET_DEV_NAME (7) 98 91 ··· 102 99 unsigned long hd_irqdata; 103 100 wait_queue_head_t hd_waitqueue; 104 101 struct fasync_struct *hd_async_queue; 105 - struct hpet_task *hd_task; 106 102 unsigned int hd_flags; 107 103 unsigned int hd_irq; 108 104 unsigned int hd_hdwirq; ··· 175 173 writel(isr, &devp->hd_hpet->hpet_isr); 176 174 spin_unlock(&hpet_lock); 177 175 178 - spin_lock(&hpet_task_lock); 179 - if (devp->hd_task) 180 - devp->hd_task->ht_func(devp->hd_task->ht_data); 181 - spin_unlock(&hpet_task_lock); 182 - 183 176 wake_up_interruptible(&devp->hd_waitqueue); 184 177 185 178 kill_fasync(&devp->hd_async_queue, SIGIO, POLL_IN); 186 179 187 180 return IRQ_HANDLED; 181 + } 182 + 183 + static void hpet_timer_set_irq(struct hpet_dev *devp) 184 + { 185 + unsigned long v; 186 + int irq, gsi; 187 + struct hpet_timer __iomem *timer; 188 + 189 + spin_lock_irq(&hpet_lock); 190 + if (devp->hd_hdwirq) { 191 + spin_unlock_irq(&hpet_lock); 192 + return; 193 + } 194 + 195 + timer = devp->hd_timer; 196 + 197 + /* we prefer level triggered mode */ 198 + v = readl(&timer->hpet_config); 199 + if (!(v & 
Tn_INT_TYPE_CNF_MASK)) { 200 + v |= Tn_INT_TYPE_CNF_MASK; 201 + writel(v, &timer->hpet_config); 202 + } 203 + spin_unlock_irq(&hpet_lock); 204 + 205 + v = (readq(&timer->hpet_config) & Tn_INT_ROUTE_CAP_MASK) >> 206 + Tn_INT_ROUTE_CAP_SHIFT; 207 + 208 + /* 209 + * In PIC mode, skip IRQ0-4, IRQ6-9, IRQ12-15, which are always used by 210 + * legacy devices. In IO APIC mode, we skip all the legacy IRQS. 211 + */ 212 + if (acpi_irq_model == ACPI_IRQ_MODEL_PIC) 213 + v &= ~0xf3df; 214 + else 215 + v &= ~0xffff; 216 + 217 + for (irq = find_first_bit(&v, HPET_MAX_IRQ); irq < HPET_MAX_IRQ; 218 + irq = find_next_bit(&v, HPET_MAX_IRQ, 1 + irq)) { 219 + 220 + if (irq >= NR_IRQS) { 221 + irq = HPET_MAX_IRQ; 222 + break; 223 + } 224 + 225 + gsi = acpi_register_gsi(irq, ACPI_LEVEL_SENSITIVE, 226 + ACPI_ACTIVE_LOW); 227 + if (gsi > 0) 228 + break; 229 + 230 + /* FIXME: Setup interrupt source table */ 231 + } 232 + 233 + if (irq < HPET_MAX_IRQ) { 234 + spin_lock_irq(&hpet_lock); 235 + v = readl(&timer->hpet_config); 236 + v |= irq << Tn_INT_ROUTE_CNF_SHIFT; 237 + writel(v, &timer->hpet_config); 238 + devp->hd_hdwirq = gsi; 239 + spin_unlock_irq(&hpet_lock); 240 + } 241 + return; 188 242 } 189 243 190 244 static int hpet_open(struct inode *inode, struct file *file) ··· 257 199 258 200 for (devp = NULL, hpetp = hpets; hpetp && !devp; hpetp = hpetp->hp_next) 259 201 for (i = 0; i < hpetp->hp_ntimer; i++) 260 - if (hpetp->hp_dev[i].hd_flags & HPET_OPEN 261 - || hpetp->hp_dev[i].hd_task) 202 + if (hpetp->hp_dev[i].hd_flags & HPET_OPEN) 262 203 continue; 263 204 else { 264 205 devp = &hpetp->hp_dev[i]; ··· 275 218 devp->hd_flags |= HPET_OPEN; 276 219 spin_unlock_irq(&hpet_lock); 277 220 unlock_kernel(); 221 + 222 + hpet_timer_set_irq(devp); 278 223 279 224 return 0; 280 225 } ··· 500 441 devp->hd_irq = irq; 501 442 t = devp->hd_ireqfreq; 502 443 v = readq(&timer->hpet_config); 503 - g = v | Tn_INT_ENB_CNF_MASK; 444 + 445 + /* 64-bit comparators are not yet supported through the ioctls, 446
+ * so force this into 32-bit mode if it supports both modes 447 + */ 448 + g = v | Tn_32MODE_CNF_MASK | Tn_INT_ENB_CNF_MASK; 504 449 505 450 if (devp->hd_flags & HPET_PERIODIC) { 506 451 write_counter(t, &timer->hpet_compare); ··· 514 451 v |= Tn_VAL_SET_CNF_MASK; 515 452 writeq(v, &timer->hpet_config); 516 453 local_irq_save(flags); 454 + 455 + /* NOTE: what we modify here is a hidden accumulator 456 + * register supported by periodic-capable comparators. 457 + * We never want to modify the (single) counter; that 458 + * would affect all the comparators. 459 + */ 517 460 m = read_counter(&hpet->hpet_mc); 518 461 write_counter(t + m + hpetp->hp_delta, &timer->hpet_compare); 519 462 } else { ··· 673 604 return 0; 674 605 } 675 606 676 - static inline int hpet_tpcheck(struct hpet_task *tp) 677 - { 678 - struct hpet_dev *devp; 679 - struct hpets *hpetp; 680 - 681 - devp = tp->ht_opaque; 682 - 683 - if (!devp) 684 - return -ENXIO; 685 - 686 - for (hpetp = hpets; hpetp; hpetp = hpetp->hp_next) 687 - if (devp >= hpetp->hp_dev 688 - && devp < (hpetp->hp_dev + hpetp->hp_ntimer) 689 - && devp->hd_hpet == hpetp->hp_hpet) 690 - return 0; 691 - 692 - return -ENXIO; 693 - } 694 - 695 - #if 0 696 - int hpet_unregister(struct hpet_task *tp) 697 - { 698 - struct hpet_dev *devp; 699 - struct hpet_timer __iomem *timer; 700 - int err; 701 - 702 - if ((err = hpet_tpcheck(tp))) 703 - return err; 704 - 705 - spin_lock_irq(&hpet_task_lock); 706 - spin_lock(&hpet_lock); 707 - 708 - devp = tp->ht_opaque; 709 - if (devp->hd_task != tp) { 710 - spin_unlock(&hpet_lock); 711 - spin_unlock_irq(&hpet_task_lock); 712 - return -ENXIO; 713 - } 714 - 715 - timer = devp->hd_timer; 716 - writeq((readq(&timer->hpet_config) & ~Tn_INT_ENB_CNF_MASK), 717 - &timer->hpet_config); 718 - devp->hd_flags &= ~(HPET_IE | HPET_PERIODIC); 719 - devp->hd_task = NULL; 720 - spin_unlock(&hpet_lock); 721 - spin_unlock_irq(&hpet_task_lock); 722 - 723 - return 0; 724 - } 725 - #endif /* 0 */ 726 - 727 607 static 
ctl_table hpet_table[] = { 728 608 { 729 609 .ctl_name = CTL_UNNUMBERED, ··· 764 746 static struct hpets *last = NULL; 765 747 unsigned long period; 766 748 unsigned long long temp; 749 + u32 remainder; 767 750 768 751 /* 769 752 * hpet_alloc can be called by platform dependent code. ··· 828 809 printk("%s %d", i > 0 ? "," : "", hdp->hd_irq[i]); 829 810 printk("\n"); 830 811 831 - printk(KERN_INFO "hpet%u: %u %d-bit timers, %Lu Hz\n", 832 - hpetp->hp_which, hpetp->hp_ntimer, 833 - cap & HPET_COUNTER_SIZE_MASK ? 64 : 32, hpetp->hp_tick_freq); 812 + temp = hpetp->hp_tick_freq; 813 + remainder = do_div(temp, 1000000); 814 + printk(KERN_INFO 815 + "hpet%u: %u comparators, %d-bit %u.%06u MHz counter\n", 816 + hpetp->hp_which, hpetp->hp_ntimer, 817 + cap & HPET_COUNTER_SIZE_MASK ? 64 : 32, 818 + (unsigned) temp, remainder); 834 819 835 820 mcfg = readq(&hpet->hpet_config); 836 821 if ((mcfg & HPET_ENABLE_CNF_MASK) == 0) { ··· 897 874 hdp->hd_address = ioremap(addr.minimum, addr.address_length); 898 875 899 876 if (hpet_is_known(hdp)) { 900 - printk(KERN_DEBUG "%s: 0x%lx is busy\n", 901 - __func__, hdp->hd_phys_address); 902 877 iounmap(hdp->hd_address); 903 878 return AE_ALREADY_EXISTS; 904 879 } ··· 912 891 HPET_RANGE_SIZE); 913 892 914 893 if (hpet_is_known(hdp)) { 915 - printk(KERN_DEBUG "%s: 0x%lx is busy\n", 916 - __func__, hdp->hd_phys_address); 917 894 iounmap(hdp->hd_address); 918 895 return AE_ALREADY_EXISTS; 919 896 }
+143 -66
drivers/oprofile/buffer_sync.c
··· 5 5 * @remark Read the file COPYING 6 6 * 7 7 * @author John Levon <levon@movementarian.org> 8 + * @author Barry Kasindorf 8 9 * 9 10 * This is the core of the buffer management. Each 10 11 * CPU buffer is processed and entered into the ··· 34 33 #include "event_buffer.h" 35 34 #include "cpu_buffer.h" 36 35 #include "buffer_sync.h" 37 - 36 + 38 37 static LIST_HEAD(dying_tasks); 39 38 static LIST_HEAD(dead_tasks); 40 39 static cpumask_t marked_cpus = CPU_MASK_NONE; ··· 49 48 * Can be invoked from softirq via RCU callback due to 50 49 * call_rcu() of the task struct, hence the _irqsave. 51 50 */ 52 - static int task_free_notify(struct notifier_block * self, unsigned long val, void * data) 51 + static int 52 + task_free_notify(struct notifier_block *self, unsigned long val, void *data) 53 53 { 54 54 unsigned long flags; 55 - struct task_struct * task = data; 55 + struct task_struct *task = data; 56 56 spin_lock_irqsave(&task_mortuary, flags); 57 57 list_add(&task->tasks, &dying_tasks); 58 58 spin_unlock_irqrestore(&task_mortuary, flags); ··· 64 62 /* The task is on its way out. A sync of the buffer means we can catch 65 63 * any remaining samples for this task. 66 64 */ 67 - static int task_exit_notify(struct notifier_block * self, unsigned long val, void * data) 65 + static int 66 + task_exit_notify(struct notifier_block *self, unsigned long val, void *data) 68 67 { 69 68 /* To avoid latency problems, we only process the current CPU, 70 69 * hoping that most samples for the task are on this CPU 71 70 */ 72 71 sync_buffer(raw_smp_processor_id()); 73 - return 0; 72 + return 0; 74 73 } 75 74 76 75 ··· 80 77 * we don't lose any. This does not have to be exact, it's a QoI issue 81 78 * only. 
82 79 */ 83 - static int munmap_notify(struct notifier_block * self, unsigned long val, void * data) 80 + static int 81 + munmap_notify(struct notifier_block *self, unsigned long val, void *data) 84 82 { 85 83 unsigned long addr = (unsigned long)data; 86 - struct mm_struct * mm = current->mm; 87 - struct vm_area_struct * mpnt; 84 + struct mm_struct *mm = current->mm; 85 + struct vm_area_struct *mpnt; 88 86 89 87 down_read(&mm->mmap_sem); 90 88 ··· 103 99 return 0; 104 100 } 105 101 106 - 102 + 107 103 /* We need to be told about new modules so we don't attribute to a previously 108 104 * loaded module, or drop the samples on the floor. 109 105 */ 110 - static int module_load_notify(struct notifier_block * self, unsigned long val, void * data) 106 + static int 107 + module_load_notify(struct notifier_block *self, unsigned long val, void *data) 111 108 { 112 109 #ifdef CONFIG_MODULES 113 110 if (val != MODULE_STATE_COMING) ··· 123 118 return 0; 124 119 } 125 120 126 - 121 + 127 122 static struct notifier_block task_free_nb = { 128 123 .notifier_call = task_free_notify, 129 124 }; ··· 140 135 .notifier_call = module_load_notify, 141 136 }; 142 137 143 - 138 + 144 139 static void end_sync(void) 145 140 { 146 141 end_cpu_work(); ··· 213 208 * not strictly necessary but allows oprofile to associate 214 209 * shared-library samples with particular applications 215 210 */ 216 - static unsigned long get_exec_dcookie(struct mm_struct * mm) 211 + static unsigned long get_exec_dcookie(struct mm_struct *mm) 217 212 { 218 213 unsigned long cookie = NO_COOKIE; 219 - struct vm_area_struct * vma; 220 - 214 + struct vm_area_struct *vma; 215 + 221 216 if (!mm) 222 217 goto out; 223 - 218 + 224 219 for (vma = mm->mmap; vma; vma = vma->vm_next) { 225 220 if (!vma->vm_file) 226 221 continue; ··· 240 235 * sure to do this lookup before a mm->mmap modification happens so 241 236 * we don't lose track. 
242 237 */ 243 - static unsigned long lookup_dcookie(struct mm_struct * mm, unsigned long addr, off_t * offset) 238 + static unsigned long 239 + lookup_dcookie(struct mm_struct *mm, unsigned long addr, off_t *offset) 244 240 { 245 241 unsigned long cookie = NO_COOKIE; 246 242 struct vm_area_struct *vma; 247 243 248 244 for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) { 249 - 245 + 250 246 if (addr < vma->vm_start || addr >= vma->vm_end) 251 247 continue; 252 248 ··· 269 263 return cookie; 270 264 } 271 265 266 + static void increment_tail(struct oprofile_cpu_buffer *b) 267 + { 268 + unsigned long new_tail = b->tail_pos + 1; 269 + 270 + rmb(); /* be sure fifo pointers are synchronized */ 271 + 272 + if (new_tail < b->buffer_size) 273 + b->tail_pos = new_tail; 274 + else 275 + b->tail_pos = 0; 276 + } 272 277 273 278 static unsigned long last_cookie = INVALID_COOKIE; 274 - 279 + 275 280 static void add_cpu_switch(int i) 276 281 { 277 282 add_event_entry(ESCAPE_CODE); ··· 295 278 { 296 279 add_event_entry(ESCAPE_CODE); 297 280 if (in_kernel) 298 - add_event_entry(KERNEL_ENTER_SWITCH_CODE); 281 + add_event_entry(KERNEL_ENTER_SWITCH_CODE); 299 282 else 300 - add_event_entry(KERNEL_EXIT_SWITCH_CODE); 283 + add_event_entry(KERNEL_EXIT_SWITCH_CODE); 301 284 } 302 - 285 + 303 286 static void 304 - add_user_ctx_switch(struct task_struct const * task, unsigned long cookie) 287 + add_user_ctx_switch(struct task_struct const *task, unsigned long cookie) 305 288 { 306 289 add_event_entry(ESCAPE_CODE); 307 - add_event_entry(CTX_SWITCH_CODE); 290 + add_event_entry(CTX_SWITCH_CODE); 308 291 add_event_entry(task->pid); 309 292 add_event_entry(cookie); 310 293 /* Another code for daemon back-compat */ ··· 313 296 add_event_entry(task->tgid); 314 297 } 315 298 316 - 299 + 317 300 static void add_cookie_switch(unsigned long cookie) 318 301 { 319 302 add_event_entry(ESCAPE_CODE); ··· 321 304 add_event_entry(cookie); 322 305 } 323 306 324 - 307 +
325 308 static void add_trace_begin(void) 326 309 { 327 310 add_event_entry(ESCAPE_CODE); 328 311 add_event_entry(TRACE_BEGIN_CODE); 329 312 } 330 313 314 + #ifdef CONFIG_OPROFILE_IBS 315 + 316 + #define IBS_FETCH_CODE_SIZE 2 317 + #define IBS_OP_CODE_SIZE 5 318 + #define IBS_EIP(offset) \ 319 + (((struct op_sample *)&cpu_buf->buffer[(offset)])->eip) 320 + #define IBS_EVENT(offset) \ 321 + (((struct op_sample *)&cpu_buf->buffer[(offset)])->event) 322 + 323 + /* 324 + * Add IBS fetch and op entries to event buffer 325 + */ 326 + static void add_ibs_begin(struct oprofile_cpu_buffer *cpu_buf, int code, 327 + int in_kernel, struct mm_struct *mm) 328 + { 329 + unsigned long rip; 330 + int i, count; 331 + unsigned long ibs_cookie = 0; 332 + off_t offset; 333 + 334 + increment_tail(cpu_buf); /* move to RIP entry */ 335 + 336 + rip = IBS_EIP(cpu_buf->tail_pos); 337 + 338 + #ifdef __LP64__ 339 + rip += IBS_EVENT(cpu_buf->tail_pos) << 32; 340 + #endif 341 + 342 + if (mm) { 343 + ibs_cookie = lookup_dcookie(mm, rip, &offset); 344 + 345 + if (ibs_cookie == NO_COOKIE) 346 + offset = rip; 347 + if (ibs_cookie == INVALID_COOKIE) { 348 + atomic_inc(&oprofile_stats.sample_lost_no_mapping); 349 + offset = rip; 350 + } 351 + if (ibs_cookie != last_cookie) { 352 + add_cookie_switch(ibs_cookie); 353 + last_cookie = ibs_cookie; 354 + } 355 + } else 356 + offset = rip; 357 + 358 + add_event_entry(ESCAPE_CODE); 359 + add_event_entry(code); 360 + add_event_entry(offset); /* Offset from Dcookie */ 361 + 362 + /* we send the Dcookie offset, but send the raw Linear Add also*/ 363 + add_event_entry(IBS_EIP(cpu_buf->tail_pos)); 364 + add_event_entry(IBS_EVENT(cpu_buf->tail_pos)); 365 + 366 + if (code == IBS_FETCH_CODE) 367 + count = IBS_FETCH_CODE_SIZE; /*IBS FETCH is 2 int64s*/ 368 + else 369 + count = IBS_OP_CODE_SIZE; /*IBS OP is 5 int64s*/ 370 + 371 + for (i = 0; i < count; i++) { 372 + increment_tail(cpu_buf); 373 + add_event_entry(IBS_EIP(cpu_buf->tail_pos)); 374 + 
add_event_entry(IBS_EVENT(cpu_buf->tail_pos)); 375 + } 376 + } 377 + 378 + #endif 331 379 332 380 static void add_sample_entry(unsigned long offset, unsigned long event) 333 381 { ··· 401 319 } 402 320 403 321 404 - static int add_us_sample(struct mm_struct * mm, struct op_sample * s) 322 + static int add_us_sample(struct mm_struct *mm, struct op_sample *s) 405 323 { 406 324 unsigned long cookie; 407 325 off_t offset; 408 - 409 - cookie = lookup_dcookie(mm, s->eip, &offset); 410 - 326 + 327 + cookie = lookup_dcookie(mm, s->eip, &offset); 328 + 411 329 if (cookie == INVALID_COOKIE) { 412 330 atomic_inc(&oprofile_stats.sample_lost_no_mapping); 413 331 return 0; ··· 423 341 return 1; 424 342 } 425 343 426 - 344 + 427 345 /* Add a sample to the global event buffer. If possible the 428 346 * sample is converted into a persistent dentry/offset pair 429 347 * for later lookup from userspace. 430 348 */ 431 349 static int 432 - add_sample(struct mm_struct * mm, struct op_sample * s, int in_kernel) 350 + add_sample(struct mm_struct *mm, struct op_sample *s, int in_kernel) 433 351 { 434 352 if (in_kernel) { 435 353 add_sample_entry(s->eip, s->event); ··· 441 359 } 442 360 return 0; 443 361 } 444 - 445 362 446 - static void release_mm(struct mm_struct * mm) 363 + 364 + static void release_mm(struct mm_struct *mm) 447 365 { 448 366 if (!mm) 449 367 return; ··· 452 370 } 453 371 454 372 455 - static struct mm_struct * take_tasks_mm(struct task_struct * task) 373 + static struct mm_struct *take_tasks_mm(struct task_struct *task) 456 374 { 457 - struct mm_struct * mm = get_task_mm(task); 375 + struct mm_struct *mm = get_task_mm(task); 458 376 if (mm) 459 377 down_read(&mm->mmap_sem); 460 378 return mm; ··· 465 383 { 466 384 return val == ESCAPE_CODE; 467 385 } 468 - 386 + 469 387 470 388 /* "acquire" as many cpu buffer slots as we can */ 471 - static unsigned long get_slots(struct oprofile_cpu_buffer * b) 389 + static unsigned long get_slots(struct oprofile_cpu_buffer *b) 472 390 
{ 473 391 unsigned long head = b->head_pos; 474 392 unsigned long tail = b->tail_pos; ··· 494 412 } 495 413 496 414 497 - static void increment_tail(struct oprofile_cpu_buffer * b) 498 - { 499 - unsigned long new_tail = b->tail_pos + 1; 500 - 501 - rmb(); 502 - 503 - if (new_tail < b->buffer_size) 504 - b->tail_pos = new_tail; 505 - else 506 - b->tail_pos = 0; 507 - } 508 - 509 - 510 415 /* Move tasks along towards death. Any tasks on dead_tasks 511 416 * will definitely have no remaining references in any 512 417 * CPU buffers at this point, because we use two lists, ··· 504 435 { 505 436 unsigned long flags; 506 437 LIST_HEAD(local_dead_tasks); 507 - struct task_struct * task; 508 - struct task_struct * ttask; 438 + struct task_struct *task; 439 + struct task_struct *ttask; 509 440 510 441 spin_lock_irqsave(&task_mortuary, flags); 511 442 ··· 562 493 { 563 494 struct oprofile_cpu_buffer *cpu_buf = &per_cpu(cpu_buffer, cpu); 564 495 struct mm_struct *mm = NULL; 565 - struct task_struct * new; 496 + struct task_struct *new; 566 497 unsigned long cookie = 0; 567 498 int in_kernel = 1; 568 499 unsigned int i; ··· 570 501 unsigned long available; 571 502 572 503 mutex_lock(&buffer_mutex); 573 - 504 + 574 505 add_cpu_switch(cpu); 575 506 576 507 /* Remember, only we can modify tail_pos */ ··· 578 509 available = get_slots(cpu_buf); 579 510 580 511 for (i = 0; i < available; ++i) { 581 - struct op_sample * s = &cpu_buf->buffer[cpu_buf->tail_pos]; 582 - 512 + struct op_sample *s = &cpu_buf->buffer[cpu_buf->tail_pos]; 513 + 583 514 if (is_code(s->eip)) { 584 515 if (s->event <= CPU_IS_KERNEL) { 585 516 /* kernel/userspace switch */ ··· 590 521 } else if (s->event == CPU_TRACE_BEGIN) { 591 522 state = sb_bt_start; 592 523 add_trace_begin(); 524 + #ifdef CONFIG_OPROFILE_IBS 525 + } else if (s->event == IBS_FETCH_BEGIN) { 526 + state = sb_bt_start; 527 + add_ibs_begin(cpu_buf, 528 + IBS_FETCH_CODE, in_kernel, mm); 529 + } else if (s->event == IBS_OP_BEGIN) { 530 + state = 
sb_bt_start; 531 + add_ibs_begin(cpu_buf, 532 + IBS_OP_CODE, in_kernel, mm); 533 + #endif 593 534 } else { 594 - struct mm_struct * oldmm = mm; 535 + struct mm_struct *oldmm = mm; 595 536 596 537 /* userspace context switch */ 597 538 new = (struct task_struct *)s->event; ··· 612 533 cookie = get_exec_dcookie(mm); 613 534 add_user_ctx_switch(new, cookie); 614 535 } 615 - } else { 616 - if (state >= sb_bt_start && 617 - !add_sample(mm, s, in_kernel)) { 618 - if (state == sb_bt_start) { 619 - state = sb_bt_ignore; 620 - atomic_inc(&oprofile_stats.bt_lost_no_mapping); 621 - } 536 + } else if (state >= sb_bt_start && 537 + !add_sample(mm, s, in_kernel)) { 538 + if (state == sb_bt_start) { 539 + state = sb_bt_ignore; 540 + atomic_inc(&oprofile_stats.bt_lost_no_mapping); 622 541 } 623 542 } 624 543
+72 -2
drivers/oprofile/cpu_buffer.c
··· 5 5 * @remark Read the file COPYING 6 6 * 7 7 * @author John Levon <levon@movementarian.org> 8 + * @author Barry Kasindorf <barry.kasindorf@amd.com> 8 9 * 9 10 * Each CPU has a local buffer that stores PC value/event 10 11 * pairs. We also log context switches when we notice them. ··· 210 209 return 1; 211 210 } 212 211 213 - static int oprofile_begin_trace(struct oprofile_cpu_buffer * cpu_buf) 212 + static int oprofile_begin_trace(struct oprofile_cpu_buffer *cpu_buf) 214 213 { 215 214 if (nr_available_slots(cpu_buf) < 4) { 216 215 cpu_buf->sample_lost_overflow++; ··· 255 254 oprofile_add_ext_sample(pc, regs, event, is_kernel); 256 255 } 257 256 257 + #ifdef CONFIG_OPROFILE_IBS 258 + 259 + #define MAX_IBS_SAMPLE_SIZE 14 260 + static int log_ibs_sample(struct oprofile_cpu_buffer *cpu_buf, 261 + unsigned long pc, int is_kernel, unsigned int *ibs, int ibs_code) 262 + { 263 + struct task_struct *task; 264 + 265 + cpu_buf->sample_received++; 266 + 267 + if (nr_available_slots(cpu_buf) < MAX_IBS_SAMPLE_SIZE) { 268 + cpu_buf->sample_lost_overflow++; 269 + return 0; 270 + } 271 + 272 + is_kernel = !!is_kernel; 273 + 274 + /* notice a switch from user->kernel or vice versa */ 275 + if (cpu_buf->last_is_kernel != is_kernel) { 276 + cpu_buf->last_is_kernel = is_kernel; 277 + add_code(cpu_buf, is_kernel); 278 + } 279 + 280 + /* notice a task switch */ 281 + if (!is_kernel) { 282 + task = current; 283 + 284 + if (cpu_buf->last_task != task) { 285 + cpu_buf->last_task = task; 286 + add_code(cpu_buf, (unsigned long)task); 287 + } 288 + } 289 + 290 + add_code(cpu_buf, ibs_code); 291 + add_sample(cpu_buf, ibs[0], ibs[1]); 292 + add_sample(cpu_buf, ibs[2], ibs[3]); 293 + add_sample(cpu_buf, ibs[4], ibs[5]); 294 + 295 + if (ibs_code == IBS_OP_BEGIN) { 296 + add_sample(cpu_buf, ibs[6], ibs[7]); 297 + add_sample(cpu_buf, ibs[8], ibs[9]); 298 + add_sample(cpu_buf, ibs[10], ibs[11]); 299 + } 300 + 301 + return 1; 302 + } 303 + 304 + void oprofile_add_ibs_sample(struct pt_regs *const 
regs, 305 + unsigned int * const ibs_sample, u8 code) 306 + { 307 + int is_kernel = !user_mode(regs); 308 + unsigned long pc = profile_pc(regs); 309 + 310 + struct oprofile_cpu_buffer *cpu_buf = 311 + &per_cpu(cpu_buffer, smp_processor_id()); 312 + 313 + if (!backtrace_depth) { 314 + log_ibs_sample(cpu_buf, pc, is_kernel, ibs_sample, code); 315 + return; 316 + } 317 + 318 + /* if log_sample() fails we can't backtrace since we lost the source 319 + * of this event */ 320 + if (log_ibs_sample(cpu_buf, pc, is_kernel, ibs_sample, code)) 321 + oprofile_ops.backtrace(regs, backtrace_depth); 322 + } 323 + 324 + #endif 325 + 258 326 void oprofile_add_pc(unsigned long pc, int is_kernel, unsigned long event) 259 327 { 260 328 struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(cpu_buffer); ··· 366 296 struct oprofile_cpu_buffer * b = 367 297 container_of(work, struct oprofile_cpu_buffer, work.work); 368 298 if (b->cpu != smp_processor_id()) { 369 - printk("WQ on CPU%d, prefer CPU%d\n", 299 + printk(KERN_DEBUG "WQ on CPU%d, prefer CPU%d\n", 370 300 smp_processor_id(), b->cpu); 371 301 } 372 302 sync_buffer(b->cpu);
+2
drivers/oprofile/cpu_buffer.h
··· 55 55 /* transient events for the CPU buffer -> event buffer */ 56 56 #define CPU_IS_KERNEL 1 57 57 #define CPU_TRACE_BEGIN 2 58 + #define IBS_FETCH_BEGIN 3 59 + #define IBS_OP_BEGIN 4 58 60 59 61 #endif /* OPROFILE_CPU_BUFFER_H */
+7 -11
include/asm-x86/desc.h
··· 351 351 _set_gate(n, GATE_INTERRUPT, addr, 0x3, 0, __KERNEL_CS); 352 352 } 353 353 354 + static inline void set_system_trap_gate(unsigned int n, void *addr) 355 + { 356 + BUG_ON((unsigned)n > 0xFF); 357 + _set_gate(n, GATE_TRAP, addr, 0x3, 0, __KERNEL_CS); 358 + } 359 + 354 360 static inline void set_trap_gate(unsigned int n, void *addr) 355 361 { 356 362 BUG_ON((unsigned)n > 0xFF); 357 363 _set_gate(n, GATE_TRAP, addr, 0, 0, __KERNEL_CS); 358 - } 359 - 360 - static inline void set_system_gate(unsigned int n, void *addr) 361 - { 362 - BUG_ON((unsigned)n > 0xFF); 363 - #ifdef CONFIG_X86_32 364 - _set_gate(n, GATE_TRAP, addr, 0x3, 0, __KERNEL_CS); 365 - #else 366 - _set_gate(n, GATE_INTERRUPT, addr, 0x3, 0, __KERNEL_CS); 367 - #endif 368 364 } 369 365 370 366 static inline void set_task_gate(unsigned int n, unsigned int gdt_entry) ··· 375 379 _set_gate(n, GATE_INTERRUPT, addr, 0, ist, __KERNEL_CS); 376 380 } 377 381 378 - static inline void set_system_gate_ist(int n, void *addr, unsigned ist) 382 + static inline void set_system_intr_gate_ist(int n, void *addr, unsigned ist) 379 383 { 380 384 BUG_ON((unsigned)n > 0xFF); 381 385 _set_gate(n, GATE_INTERRUPT, addr, 0x3, ist, __KERNEL_CS);
+1
include/asm-x86/es7000/mpparse.h
··· 5 5 6 6 extern int parse_unisys_oem (char *oemptr); 7 7 extern int find_unisys_acpi_oem_table(unsigned long *oem_addr); 8 + extern void unmap_unisys_acpi_oem_table(unsigned long oem_addr); 8 9 extern void setup_unisys(void); 9 10 10 11 #ifndef CONFIG_X86_GENERICARCH
+2 -2
include/asm-x86/fixmap_32.h
··· 94 94 * can have a single pgd entry and a single pte table: 95 95 */ 96 96 #define NR_FIX_BTMAPS 64 97 - #define FIX_BTMAPS_NESTING 4 97 + #define FIX_BTMAPS_SLOTS 4 98 98 FIX_BTMAP_END = __end_of_permanent_fixed_addresses + 256 - 99 99 (__end_of_permanent_fixed_addresses & 255), 100 - FIX_BTMAP_BEGIN = FIX_BTMAP_END + NR_FIX_BTMAPS*FIX_BTMAPS_NESTING - 1, 100 + FIX_BTMAP_BEGIN = FIX_BTMAP_END + NR_FIX_BTMAPS*FIX_BTMAPS_SLOTS - 1, 101 101 FIX_WP_TEST, 102 102 #ifdef CONFIG_ACPI 103 103 FIX_ACPI_BEGIN,
+6 -6
include/asm-x86/fixmap_64.h
··· 49 49 #ifdef CONFIG_PARAVIRT 50 50 FIX_PARAVIRT_BOOTMAP, 51 51 #endif 52 + __end_of_permanent_fixed_addresses, 52 53 #ifdef CONFIG_ACPI 53 54 FIX_ACPI_BEGIN, 54 55 FIX_ACPI_END = FIX_ACPI_BEGIN + FIX_ACPI_PAGES - 1, ··· 57 56 #ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT 58 57 FIX_OHCI1394_BASE, 59 58 #endif 60 - __end_of_permanent_fixed_addresses, 61 59 /* 62 60 * 256 temporary boot-time mappings, used by early_ioremap(), 63 61 * before ioremap() is functional. 64 62 * 65 - * We round it up to the next 512 pages boundary so that we 63 + * We round it up to the next 256 pages boundary so that we 66 64 * can have a single pgd entry and a single pte table: 67 65 */ 68 66 #define NR_FIX_BTMAPS 64 69 - #define FIX_BTMAPS_NESTING 4 70 - FIX_BTMAP_END = __end_of_permanent_fixed_addresses + 512 - 71 - (__end_of_permanent_fixed_addresses & 511), 72 - FIX_BTMAP_BEGIN = FIX_BTMAP_END + NR_FIX_BTMAPS*FIX_BTMAPS_NESTING - 1, 67 + #define FIX_BTMAPS_SLOTS 4 68 + FIX_BTMAP_END = __end_of_permanent_fixed_addresses + 256 - 69 + (__end_of_permanent_fixed_addresses & 255), 70 + FIX_BTMAP_BEGIN = FIX_BTMAP_END + NR_FIX_BTMAPS*FIX_BTMAPS_SLOTS - 1, 73 71 __end_of_fixed_addresses 74 72 }; 75 73
+1 -14
include/asm-x86/io.h
··· 5 5 6 6 #include <linux/compiler.h> 7 7 8 - /* 9 - * early_ioremap() and early_iounmap() are for temporary early boot-time 10 - * mappings, before the real ioremap() is functional. 11 - * A boot-time mapping is currently limited to at most 16 pages. 12 - */ 13 - #ifndef __ASSEMBLY__ 14 - extern void early_ioremap_init(void); 15 - extern void early_ioremap_clear(void); 16 - extern void early_ioremap_reset(void); 17 - extern void *early_ioremap(unsigned long offset, unsigned long size); 18 - extern void early_iounmap(void *addr, unsigned long size); 19 - extern void __iomem *fix_ioremap(unsigned idx, unsigned long phys); 20 - #endif 21 - 22 8 #define build_mmio_read(name, size, type, reg, barrier) \ 23 9 static inline type name(const volatile void __iomem *addr) \ 24 10 { type ret; asm volatile("mov" size " %1,%0":reg (ret) \ ··· 83 97 extern void early_ioremap_clear(void); 84 98 extern void early_ioremap_reset(void); 85 99 extern void *early_ioremap(unsigned long offset, unsigned long size); 100 + extern void *early_memremap(unsigned long offset, unsigned long size); 86 101 extern void early_iounmap(void *addr, unsigned long size); 87 102 extern void __iomem *fix_ioremap(unsigned idx, unsigned long phys); 88 103
-3
include/asm-x86/io_64.h
··· 165 165 166 166 #include <asm-generic/iomap.h> 167 167 168 - extern void *early_ioremap(unsigned long addr, unsigned long size); 169 - extern void early_iounmap(void *addr, unsigned long size); 170 - 171 168 /* 172 169 * This one maps high address device memory and turns off caching for that area. 173 170 * it's useful if some control registers are in such an area and write combining
-21
include/asm-x86/irqflags.h
··· 166 166 return raw_irqs_disabled_flags(flags); 167 167 } 168 168 169 - /* 170 - * makes the traced hardirq state match with the machine state 171 - * 172 - * should be a rarely used function, only in places where its 173 - * otherwise impossible to know the irq state, like in traps. 174 - */ 175 - static inline void trace_hardirqs_fixup_flags(unsigned long flags) 176 - { 177 - if (raw_irqs_disabled_flags(flags)) 178 - trace_hardirqs_off(); 179 - else 180 - trace_hardirqs_on(); 181 - } 182 - 183 - static inline void trace_hardirqs_fixup(void) 184 - { 185 - unsigned long flags = __raw_local_save_flags(); 186 - 187 - trace_hardirqs_fixup_flags(flags); 188 - } 189 - 190 169 #else 191 170 192 171 #ifdef CONFIG_X86_64
+1 -2
include/asm-x86/kdebug.h
··· 27 27 extern void die(const char *, struct pt_regs *,long); 28 28 extern int __must_check __die(const char *, struct pt_regs *, long); 29 29 extern void show_registers(struct pt_regs *regs); 30 - extern void __show_registers(struct pt_regs *, int all); 31 30 extern void show_trace(struct task_struct *t, struct pt_regs *regs, 32 31 unsigned long *sp, unsigned long bp); 33 - extern void __show_regs(struct pt_regs *regs); 32 + extern void __show_regs(struct pt_regs *regs, int all); 34 33 extern void show_regs(struct pt_regs *regs); 35 34 extern unsigned long oops_begin(void); 36 35 extern void oops_end(unsigned long, struct pt_regs *, int signr);
-9
include/asm-x86/kprobes.h
··· 82 82 struct prev_kprobe prev_kprobe; 83 83 }; 84 84 85 - /* trap3/1 are intr gates for kprobes. So, restore the status of IF, 86 - * if necessary, before executing the original int3/1 (trap) handler. 87 - */ 88 - static inline void restore_interrupts(struct pt_regs *regs) 89 - { 90 - if (regs->flags & X86_EFLAGS_IF) 91 - local_irq_enable(); 92 - } 93 - 94 85 extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr); 95 86 extern int kprobe_exceptions_notify(struct notifier_block *self, 96 87 unsigned long val, void *data);
-6
include/asm-x86/mach-default/mach_traps.h
··· 7 7 8 8 #include <asm/mc146818rtc.h> 9 9 10 - static inline void clear_mem_error(unsigned char reason) 11 - { 12 - reason = (reason & 0xf) | 4; 13 - outb(reason, 0x61); 14 - } 15 - 16 10 static inline unsigned char get_nmi_reason(void) 17 11 { 18 12 return inb(0x61);
-2
include/asm-x86/module.h
··· 52 52 #define MODULE_PROC_FAMILY "EFFICEON " 53 53 #elif defined CONFIG_MWINCHIPC6 54 54 #define MODULE_PROC_FAMILY "WINCHIPC6 " 55 - #elif defined CONFIG_MWINCHIP2 56 - #define MODULE_PROC_FAMILY "WINCHIP2 " 57 55 #elif defined CONFIG_MWINCHIP3D 58 56 #define MODULE_PROC_FAMILY "WINCHIP3D " 59 57 #elif defined CONFIG_MCYRIXIII
-4
include/asm-x86/nmi.h
··· 15 15 */ 16 16 int do_nmi_callback(struct pt_regs *regs, int cpu); 17 17 18 - #ifdef CONFIG_X86_64 19 - extern void default_do_nmi(struct pt_regs *); 20 - #endif 21 - 22 18 extern void die_nmi(char *str, struct pt_regs *regs, int do_panic); 23 19 extern int check_nmi_watchdog(void); 24 20 extern int nmi_watchdog_enabled;
+7 -1
include/asm-x86/page.h
··· 179 179 #endif /* CONFIG_PARAVIRT */ 180 180 181 181 #define __pa(x) __phys_addr((unsigned long)(x)) 182 + #define __pa_nodebug(x) __phys_addr_nodebug((unsigned long)(x)) 182 183 /* __pa_symbol should be used for C visible symbols. 183 184 This seems to be the official gcc blessed way to do such arithmetic. */ 184 185 #define __pa_symbol(x) __pa(__phys_reloc_hide((unsigned long)(x))) ··· 189 188 #define __boot_va(x) __va(x) 190 189 #define __boot_pa(x) __pa(x) 191 190 191 + /* 192 + * virt_to_page(kaddr) returns a valid pointer if and only if 193 + * virt_addr_valid(kaddr) returns true. 194 + */ 192 195 #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT) 193 196 #define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT) 194 - #define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT) 197 + extern bool __virt_addr_valid(unsigned long kaddr); 198 + #define virt_addr_valid(kaddr) __virt_addr_valid((unsigned long) (kaddr)) 195 199 196 200 #endif /* __ASSEMBLY__ */ 197 201
+8 -2
include/asm-x86/page_32.h
··· 20 20 #endif 21 21 #define THREAD_SIZE (PAGE_SIZE << THREAD_ORDER) 22 22 23 + #define STACKFAULT_STACK 0 24 + #define DOUBLEFAULT_STACK 1 25 + #define NMI_STACK 0 26 + #define DEBUG_STACK 0 27 + #define MCE_STACK 0 28 + #define N_EXCEPTION_STACKS 1 23 29 24 30 #ifdef CONFIG_X86_PAE 25 31 /* 44=32+12, the limit we can fit into an unsigned long pfn */ ··· 79 73 #endif 80 74 81 75 #ifndef __ASSEMBLY__ 82 - #define __phys_addr_const(x) ((x) - PAGE_OFFSET) 76 + #define __phys_addr_nodebug(x) ((x) - PAGE_OFFSET) 83 77 #ifdef CONFIG_DEBUG_VIRTUAL 84 78 extern unsigned long __phys_addr(unsigned long); 85 79 #else 86 - #define __phys_addr(x) ((x) - PAGE_OFFSET) 80 + #define __phys_addr(x) __phys_addr_nodebug(x) 87 81 #endif 88 82 #define __phys_reloc_hide(x) RELOC_HIDE((x), 0) 89 83
+13 -3
include/asm-x86/pgtable.h
··· 15 15 #define _PAGE_BIT_PAT 7 /* on 4KB pages */ 16 16 #define _PAGE_BIT_GLOBAL 8 /* Global TLB entry PPro+ */ 17 17 #define _PAGE_BIT_UNUSED1 9 /* available for programmer */ 18 - #define _PAGE_BIT_UNUSED2 10 18 + #define _PAGE_BIT_IOMAP 10 /* flag used to indicate IO mapping */ 19 19 #define _PAGE_BIT_UNUSED3 11 20 20 #define _PAGE_BIT_PAT_LARGE 12 /* On 2MB or 1GB pages */ 21 21 #define _PAGE_BIT_SPECIAL _PAGE_BIT_UNUSED1 ··· 32 32 #define _PAGE_PSE (_AT(pteval_t, 1) << _PAGE_BIT_PSE) 33 33 #define _PAGE_GLOBAL (_AT(pteval_t, 1) << _PAGE_BIT_GLOBAL) 34 34 #define _PAGE_UNUSED1 (_AT(pteval_t, 1) << _PAGE_BIT_UNUSED1) 35 - #define _PAGE_UNUSED2 (_AT(pteval_t, 1) << _PAGE_BIT_UNUSED2) 35 + #define _PAGE_IOMAP (_AT(pteval_t, 1) << _PAGE_BIT_IOMAP) 36 36 #define _PAGE_UNUSED3 (_AT(pteval_t, 1) << _PAGE_BIT_UNUSED3) 37 37 #define _PAGE_PAT (_AT(pteval_t, 1) << _PAGE_BIT_PAT) 38 38 #define _PAGE_PAT_LARGE (_AT(pteval_t, 1) << _PAGE_BIT_PAT_LARGE) ··· 99 99 #define __PAGE_KERNEL_LARGE_NOCACHE (__PAGE_KERNEL | _PAGE_CACHE_UC | _PAGE_PSE) 100 100 #define __PAGE_KERNEL_LARGE_EXEC (__PAGE_KERNEL_EXEC | _PAGE_PSE) 101 101 102 + #define __PAGE_KERNEL_IO (__PAGE_KERNEL | _PAGE_IOMAP) 103 + #define __PAGE_KERNEL_IO_NOCACHE (__PAGE_KERNEL_NOCACHE | _PAGE_IOMAP) 104 + #define __PAGE_KERNEL_IO_UC_MINUS (__PAGE_KERNEL_UC_MINUS | _PAGE_IOMAP) 105 + #define __PAGE_KERNEL_IO_WC (__PAGE_KERNEL_WC | _PAGE_IOMAP) 106 + 102 107 #define PAGE_KERNEL __pgprot(__PAGE_KERNEL) 103 108 #define PAGE_KERNEL_RO __pgprot(__PAGE_KERNEL_RO) 104 109 #define PAGE_KERNEL_EXEC __pgprot(__PAGE_KERNEL_EXEC) ··· 117 112 #define PAGE_KERNEL_LARGE_EXEC __pgprot(__PAGE_KERNEL_LARGE_EXEC) 118 113 #define PAGE_KERNEL_VSYSCALL __pgprot(__PAGE_KERNEL_VSYSCALL) 119 114 #define PAGE_KERNEL_VSYSCALL_NOCACHE __pgprot(__PAGE_KERNEL_VSYSCALL_NOCACHE) 115 + 116 + #define PAGE_KERNEL_IO __pgprot(__PAGE_KERNEL_IO) 117 + #define PAGE_KERNEL_IO_NOCACHE __pgprot(__PAGE_KERNEL_IO_NOCACHE) 118 + #define 
PAGE_KERNEL_IO_UC_MINUS __pgprot(__PAGE_KERNEL_IO_UC_MINUS) 119 + #define PAGE_KERNEL_IO_WC __pgprot(__PAGE_KERNEL_IO_WC) 120 120 121 121 /* xwr */ 122 122 #define __P000 PAGE_NONE ··· 206 196 207 197 static inline int pte_special(pte_t pte) 208 198 { 209 - return pte_val(pte) & _PAGE_SPECIAL; 199 + return pte_flags(pte) & _PAGE_SPECIAL; 210 200 } 211 201 212 202 static inline unsigned long pte_pfn(pte_t pte)
-4
include/asm-x86/ptrace.h
··· 174 174 175 175 extern unsigned long 176 176 convert_ip_to_linear(struct task_struct *child, struct pt_regs *regs); 177 - 178 - #ifdef CONFIG_X86_32 179 177 extern void send_sigtrap(struct task_struct *tsk, struct pt_regs *regs, 180 178 int error_code, int si_code); 181 - #endif 182 - 183 179 void signal_fault(struct pt_regs *regs, void __user *frame, char *where); 184 180 185 181 extern long syscall_trace_enter(struct pt_regs *);
-6
include/asm-x86/segment.h
··· 131 131 * Matching rules for certain types of segments. 132 132 */ 133 133 134 - /* Matches only __KERNEL_CS, ignoring PnP / USER / APM segments */ 135 - #define SEGMENT_IS_KERNEL_CODE(x) (((x) & 0xfc) == GDT_ENTRY_KERNEL_CS * 8) 136 - 137 - /* Matches __KERNEL_CS and __USER_CS (they must be 2 entries apart) */ 138 - #define SEGMENT_IS_FLAT_CODE(x) (((x) & 0xec) == GDT_ENTRY_KERNEL_CS * 8) 139 - 140 134 /* Matches PNP_CS32 and PNP_CS16 (they must be consecutive) */ 141 135 #define SEGMENT_IS_PNP_CODE(x) (((x) & 0xf4) == GDT_ENTRY_PNPBIOS_BASE * 8) 142 136
+3 -5
include/asm-x86/smp.h
··· 141 141 void native_send_call_func_ipi(cpumask_t mask); 142 142 void native_send_call_func_single_ipi(int cpu); 143 143 144 + extern void prefill_possible_map(void); 145 + 144 146 void smp_store_cpu_info(int id); 145 147 #define cpu_physical_id(cpu) per_cpu(x86_cpu_to_apicid, cpu) 146 148 ··· 151 149 { 152 150 return cpus_weight(cpu_callout_map); 153 151 } 154 - #endif /* CONFIG_SMP */ 155 - 156 - #if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_CPU) 157 - extern void prefill_possible_map(void); 158 152 #else 159 153 static inline void prefill_possible_map(void) 160 154 { 161 155 } 162 - #endif 156 + #endif /* CONFIG_SMP */ 163 157 164 158 extern unsigned disabled_cpus __cpuinitdata; 165 159
+4 -1
include/asm-x86/system.h
··· 64 64 \ 65 65 /* regparm parameters for __switch_to(): */ \ 66 66 [prev] "a" (prev), \ 67 - [next] "d" (next)); \ 67 + [next] "d" (next) \ 68 + \ 69 + : /* reloaded segment registers */ \ 70 + "memory"); \ 68 71 } while (0) 69 72 70 73 /*
+37 -38
include/asm-x86/traps.h
··· 3 3 4 4 #include <asm/debugreg.h> 5 5 6 - /* Common in X86_32 and X86_64 */ 6 + #ifdef CONFIG_X86_32 7 + #define dotraplinkage 8 + #else 9 + #define dotraplinkage asmlinkage 10 + #endif 11 + 7 12 asmlinkage void divide_error(void); 8 13 asmlinkage void debug(void); 9 14 asmlinkage void nmi(void); ··· 17 12 asmlinkage void bounds(void); 18 13 asmlinkage void invalid_op(void); 19 14 asmlinkage void device_not_available(void); 15 + #ifdef CONFIG_X86_64 16 + asmlinkage void double_fault(void); 17 + #endif 20 18 asmlinkage void coprocessor_segment_overrun(void); 21 19 asmlinkage void invalid_TSS(void); 22 20 asmlinkage void segment_not_present(void); 23 21 asmlinkage void stack_segment(void); 24 22 asmlinkage void general_protection(void); 25 23 asmlinkage void page_fault(void); 26 - asmlinkage void coprocessor_error(void); 27 - asmlinkage void simd_coprocessor_error(void); 28 - asmlinkage void alignment_check(void); 29 24 asmlinkage void spurious_interrupt_bug(void); 25 + asmlinkage void coprocessor_error(void); 26 + asmlinkage void alignment_check(void); 30 27 #ifdef CONFIG_X86_MCE 31 28 asmlinkage void machine_check(void); 32 29 #endif /* CONFIG_X86_MCE */ 30 + asmlinkage void simd_coprocessor_error(void); 33 31 34 - void do_divide_error(struct pt_regs *, long); 35 - void do_overflow(struct pt_regs *, long); 36 - void do_bounds(struct pt_regs *, long); 37 - void do_coprocessor_segment_overrun(struct pt_regs *, long); 38 - void do_invalid_TSS(struct pt_regs *, long); 39 - void do_segment_not_present(struct pt_regs *, long); 40 - void do_stack_segment(struct pt_regs *, long); 41 - void do_alignment_check(struct pt_regs *, long); 42 - void do_invalid_op(struct pt_regs *, long); 43 - void do_general_protection(struct pt_regs *, long); 44 - void do_nmi(struct pt_regs *, long); 32 + dotraplinkage void do_divide_error(struct pt_regs *, long); 33 + dotraplinkage void do_debug(struct pt_regs *, long); 34 + dotraplinkage void do_nmi(struct pt_regs *, long); 35 + 
dotraplinkage void do_int3(struct pt_regs *, long); 36 + dotraplinkage void do_overflow(struct pt_regs *, long); 37 + dotraplinkage void do_bounds(struct pt_regs *, long); 38 + dotraplinkage void do_invalid_op(struct pt_regs *, long); 39 + dotraplinkage void do_device_not_available(struct pt_regs *, long); 40 + dotraplinkage void do_coprocessor_segment_overrun(struct pt_regs *, long); 41 + dotraplinkage void do_invalid_TSS(struct pt_regs *, long); 42 + dotraplinkage void do_segment_not_present(struct pt_regs *, long); 43 + dotraplinkage void do_stack_segment(struct pt_regs *, long); 44 + dotraplinkage void do_general_protection(struct pt_regs *, long); 45 + dotraplinkage void do_page_fault(struct pt_regs *, unsigned long); 46 + dotraplinkage void do_spurious_interrupt_bug(struct pt_regs *, long); 47 + dotraplinkage void do_coprocessor_error(struct pt_regs *, long); 48 + dotraplinkage void do_alignment_check(struct pt_regs *, long); 49 + #ifdef CONFIG_X86_MCE 50 + dotraplinkage void do_machine_check(struct pt_regs *, long); 51 + #endif 52 + dotraplinkage void do_simd_coprocessor_error(struct pt_regs *, long); 53 + #ifdef CONFIG_X86_32 54 + dotraplinkage void do_iret_error(struct pt_regs *, long); 55 + #endif 45 56 46 57 static inline int get_si_code(unsigned long condition) 47 58 { ··· 73 52 extern int kstack_depth_to_print; 74 53 75 54 #ifdef CONFIG_X86_32 76 - 77 - void do_iret_error(struct pt_regs *, long); 78 - void do_int3(struct pt_regs *, long); 79 - void do_debug(struct pt_regs *, long); 80 55 void math_error(void __user *); 81 - void do_coprocessor_error(struct pt_regs *, long); 82 - void do_simd_coprocessor_error(struct pt_regs *, long); 83 - void do_spurious_interrupt_bug(struct pt_regs *, long); 84 56 unsigned long patch_espfix_desc(unsigned long, unsigned long); 85 57 asmlinkage void math_emulate(long); 58 + #endif 86 59 87 - void do_page_fault(struct pt_regs *regs, unsigned long error_code); 88 - 89 - #else /* CONFIG_X86_32 */ 90 - 91 - asmlinkage void 
double_fault(void); 92 - 93 - asmlinkage void do_int3(struct pt_regs *, long); 94 - asmlinkage void do_stack_segment(struct pt_regs *, long); 95 - asmlinkage void do_debug(struct pt_regs *, unsigned long); 96 - asmlinkage void do_coprocessor_error(struct pt_regs *); 97 - asmlinkage void do_simd_coprocessor_error(struct pt_regs *); 98 - asmlinkage void do_spurious_interrupt_bug(struct pt_regs *); 99 - 100 - asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code); 101 - 102 - #endif /* CONFIG_X86_32 */ 103 60 #endif /* ASM_X86__TRAPS_H */
+3 -11
include/linux/hpet.h
··· 37 37 #define hpet_compare _u1._hpet_compare 38 38 39 39 #define HPET_MAX_TIMERS (32) 40 + #define HPET_MAX_IRQ (32) 40 41 41 42 /* 42 43 * HPET general capabilities register ··· 65 64 */ 66 65 67 66 #define Tn_INT_ROUTE_CAP_MASK (0xffffffff00000000ULL) 68 - #define Tn_INI_ROUTE_CAP_SHIFT (32UL) 67 + #define Tn_INT_ROUTE_CAP_SHIFT (32UL) 69 68 #define Tn_FSB_INT_DELCAP_MASK (0x8000UL) 70 69 #define Tn_FSB_INT_DELCAP_SHIFT (15) 71 70 #define Tn_FSB_EN_CNF_MASK (0x4000UL) ··· 92 91 * exported interfaces 93 92 */ 94 93 95 - struct hpet_task { 96 - void (*ht_func) (void *); 97 - void *ht_data; 98 - void *ht_opaque; 99 - }; 100 - 101 94 struct hpet_data { 102 95 unsigned long hd_phys_address; 103 96 void __iomem *hd_address; 104 97 unsigned short hd_nirqs; 105 - unsigned short hd_flags; 106 98 unsigned int hd_state; /* timer allocated */ 107 99 unsigned int hd_irq[HPET_MAX_TIMERS]; 108 100 }; 109 - 110 - #define HPET_DATA_PLATFORM 0x0001 /* platform call to hpet_alloc */ 111 101 112 102 static inline void hpet_reserve_timer(struct hpet_data *hd, int timer) 113 103 { ··· 117 125 unsigned short hi_timer; 118 126 }; 119 127 120 - #define HPET_INFO_PERIODIC 0x0001 /* timer is periodic */ 128 + #define HPET_INFO_PERIODIC 0x0010 /* periodic-capable comparator */ 121 129 122 130 #define HPET_IE_ON _IO('h', 0x01) /* interrupt on */ 123 131 #define HPET_IE_OFF _IO('h', 0x02) /* interrupt off */
+2
include/linux/oprofile.h
··· 36 36 #define XEN_ENTER_SWITCH_CODE 10 37 37 #define SPU_PROFILING_CODE 11 38 38 #define SPU_CTX_SWITCH_CODE 12 39 + #define IBS_FETCH_CODE 13 40 + #define IBS_OP_CODE 14 39 41 40 42 struct super_block; 41 43 struct dentry;