Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branches 'x86/apic', 'x86/cleanups', 'x86/cpufeature', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/detect-hyper', 'x86/doc', 'x86/dumpstack', 'x86/early-printk', 'x86/fpu', 'x86/idle', 'x86/io', 'x86/memory-corruption-check', 'x86/microcode', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/pat2', 'x86/pci-ioapic-boot-irq-quirks', 'x86/ptrace', 'x86/quirks', 'x86/reboot', 'x86/setup-memory', 'x86/signal', 'x86/sparse-fixes', 'x86/time', 'x86/uv' and 'x86/xen' into x86/core

+4121 -3195
-12
Documentation/feature-removal-schedule.txt
··· 244 244 245 245 --------------------------- 246 246 247 - What: init_mm export 248 - When: 2.6.26 249 - Why: Not used in-tree. The current out-of-tree users used it to 250 - work around problems in the CPA code which should be resolved 251 - by now. One usecase was described to provide verification code 252 - of the CPA operation. That's a good idea in general, but such 253 - code / infrastructure should be in the kernel and not in some 254 - out-of-tree driver. 255 - Who: Thomas Gleixner <tglx@linutronix.de> 256 - 257 - ---------------------------- 258 - 259 247 What: usedac i386 kernel parameter 260 248 When: 2.6.27 261 249 Why: replaced by allowdac and no dac combination
+6 -3
Documentation/filesystems/proc.txt
··· 1339 1339 1340 1340 Enables/Disables the NMI watchdog on x86 systems. When the value is non-zero 1341 1341 the NMI watchdog is enabled and will continuously test all online cpus to 1342 - determine whether or not they are still functioning properly. 1342 + determine whether or not they are still functioning properly. Currently, 1343 + passing "nmi_watchdog=" parameter at boot time is required for this function 1344 + to work. 1343 1345 1344 - Because the NMI watchdog shares registers with oprofile, by disabling the NMI 1345 - watchdog, oprofile may have more registers to utilize. 1346 + If LAPIC NMI watchdog method is in use (nmi_watchdog=2 kernel parameter), the 1347 + NMI watchdog shares registers with oprofile. By disabling the NMI watchdog, 1348 + oprofile may have more registers to utilize. 1346 1349 1347 1350 msgmni 1348 1351 ------
+32 -1
Documentation/kernel-parameters.txt
··· 1393 1393 when a NMI is triggered. 1394 1394 Format: [state][,regs][,debounce][,die] 1395 1395 1396 - nmi_watchdog= [KNL,BUGS=X86-32] Debugging features for SMP kernels 1396 + nmi_watchdog= [KNL,BUGS=X86-32,X86-64] Debugging features for SMP kernels 1397 + Format: [panic,][num] 1398 + Valid num: 0,1,2 1399 + 0 - turn nmi_watchdog off 1400 + 1 - use the IO-APIC timer for the NMI watchdog 1401 + 2 - use the local APIC for the NMI watchdog using 1402 + a performance counter. Note: This will use one performance 1403 + counter and the local APIC's performance vector. 1404 + When panic is specified panic when an NMI watchdog timeout occurs. 1405 + This is useful when you use a panic=... timeout and need the box 1406 + quickly up again. 1407 + Instead of 1 and 2 it is possible to use the following 1408 + symbolic names: lapic and ioapic 1409 + Example: nmi_watchdog=2 or nmi_watchdog=panic,lapic 1397 1410 1398 1411 no387 [BUGS=X86-32] Tells the kernel to use the 387 maths 1399 1412 emulation library even if a 387 maths coprocessor ··· 1639 1626 nomsi [MSI] If the PCI_MSI kernel config parameter is 1640 1627 enabled, this kernel boot option can be used to 1641 1628 disable the use of MSI interrupts system-wide. 1629 + noioapicquirk [APIC] Disable all boot interrupt quirks. 1630 + Safety option to keep boot IRQs enabled. This 1631 + should never be necessary. 1632 + ioapicreroute [APIC] Enable rerouting of boot IRQs to the 1633 + primary IO-APIC for bridges that cannot disable 1634 + boot IRQs. This fixes a source of spurious IRQs 1635 + when the system masks IRQs. 1636 + noioapicreroute [APIC] Disable workaround that uses the 1637 + boot IRQ equivalent of an IRQ that connects to 1638 + a chipset where boot IRQs cannot be disabled. 1639 + The opposite of ioapicreroute. 1642 1640 biosirq [X86-32] Use PCI BIOS calls to get the interrupt 1643 1641 routing table. 
These calls are known to be buggy 1644 1642 on several machines and they hang the machine ··· 2278 2254 trix= [HW,OSS] MediaTrix AudioTrix Pro 2279 2255 Format: 2280 2256 <io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq> 2257 + 2258 + tsc= Disable clocksource-must-verify flag for TSC. 2259 + Format: <string> 2260 + [x86] reliable: mark tsc clocksource as reliable, this 2261 + disables clocksource verification at runtime. 2262 + Used to enable high-resolution timer mode on older 2263 + hardware, and in virtualized environment. 2281 2264 2282 2265 turbografx.map[2|3]= [HW,JOY] 2283 2266 TurboGraFX parallel port interface
+5
Documentation/nmi_watchdog.txt
··· 69 69 On x86 nmi_watchdog is disabled by default so you have to enable it with 70 70 a boot time parameter. 71 71 72 + It's possible to disable the NMI watchdog at run time by writing "0" to 73 + /proc/sys/kernel/nmi_watchdog. Writing "1" to the same file will re-enable 74 + the NMI watchdog. Notice that you still need to pass the "nmi_watchdog=" 75 + parameter at boot time. 76 + 72 77 NOTE: In kernels prior to 2.4.2-ac18 the NMI-oopser is enabled unconditionally 73 78 on x86 SMP boxes. 74 79
+3 -3
Documentation/x86/boot.txt
··· 349 349 3 SYSLINUX 350 350 4 EtherBoot 351 351 5 ELILO 352 - 7 GRuB 352 + 7 GRUB 353 353 8 U-BOOT 354 354 9 Xen 355 355 A Gujin ··· 537 537 Offset/size: 0x248/4 538 538 Protocol: 2.08+ 539 539 540 - If non-zero then this field contains the offset from the end of the 541 - real-mode code to the payload. 540 + If non-zero then this field contains the offset from the beginning 541 + of the protected-mode code to the payload. 542 542 543 543 The payload may be compressed. The format of both the compressed and 544 544 uncompressed data should be determined using the standard magic
+24
Documentation/x86/pat.txt
··· 80 80 | | | | 81 81 ------------------------------------------------------------------- 82 82 83 + Advanced APIs for drivers 84 + ------------------------- 85 + A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range, 86 + vm_insert_pfn 87 + 88 + Drivers wanting to export some pages to userspace do it by using mmap 89 + interface and a combination of 90 + 1) pgprot_noncached() 91 + 2) io_remap_pfn_range() or remap_pfn_range() or vm_insert_pfn() 92 + 93 + With PAT support, a new API pgprot_writecombine is being added. So, drivers can 94 + continue to use the above sequence, with either pgprot_noncached() or 95 + pgprot_writecombine() in step 1, followed by step 2. 96 + 97 + In addition, step 2 internally tracks the region as UC or WC in memtype 98 + list in order to ensure no conflicting mapping. 99 + 100 + Note that this set of APIs only works with IO (non RAM) regions. If driver 101 + wants to export a RAM region, it has to do set_memory_uc() or set_memory_wc() 102 + as step 0 above and also track the usage of those pages and use set_memory_wb() 103 + before the page is freed to free pool. 104 + 105 + 106 + 83 107 Notes: 84 108 85 109 -- in the above table mean "Not suggested usage for the API". Some of the --'s
-11
Documentation/x86/x86_64/boot-options.txt
··· 79 79 Report when timer interrupts are lost because some code turned off 80 80 interrupts for too long. 81 81 82 - nmi_watchdog=NUMBER[,panic] 83 - NUMBER can be: 84 - 0 don't use an NMI watchdog 85 - 1 use the IO-APIC timer for the NMI watchdog 86 - 2 use the local APIC for the NMI watchdog using a performance counter. Note 87 - This will use one performance counter and the local APIC's performance 88 - vector. 89 - When panic is specified panic when an NMI watchdog timeout occurs. 90 - This is useful when you use a panic=... timeout and need the box 91 - quickly up again. 92 - 93 82 nohpet 94 83 Don't use the HPET timer. 95 84
+1 -1
Documentation/x86/x86_64/mm.txt
··· 6 6 0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm 7 7 hole caused by [48:63] sign extension 8 8 ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole 9 - ffff810000000000 - ffffc0ffffffffff (=46 bits) direct mapping of all phys. memory 9 + ffff880000000000 - ffffc0ffffffffff (=57 TB) direct mapping of all phys. memory 10 10 ffffc10000000000 - ffffc1ffffffffff (=40 bits) hole 11 11 ffffc20000000000 - ffffe1ffffffffff (=45 bits) vmalloc/ioremap space 12 12 ffffe20000000000 - ffffe2ffffffffff (=40 bits) virtual memory map (1TB)
+55 -27
arch/x86/Kconfig
··· 19 19 config X86 20 20 def_bool y 21 21 select HAVE_AOUT if X86_32 22 + select HAVE_READQ 23 + select HAVE_WRITEQ 22 24 select HAVE_UNSTABLE_SCHED_CLOCK 23 25 select HAVE_IDE 24 26 select HAVE_OPROFILE ··· 89 87 config GENERIC_BUG 90 88 def_bool y 91 89 depends on BUG 90 + select GENERIC_BUG_RELATIVE_POINTERS if X86_64 91 + 92 + config GENERIC_BUG_RELATIVE_POINTERS 93 + bool 92 94 93 95 config GENERIC_HWEIGHT 94 96 def_bool y ··· 248 242 def_bool y 249 243 depends on X86_MPPARSE || X86_VOYAGER 250 244 251 - if ACPI 252 245 config X86_MPPARSE 253 - def_bool y 254 - bool "Enable MPS table" 246 + bool "Enable MPS table" if ACPI 247 + default y 255 248 depends on X86_LOCAL_APIC 256 249 help 257 250 For old smp systems that do not have proper acpi support. Newer systems 258 251 (esp with 64bit cpus) with acpi support, MADT and DSDT will override it 259 - endif 260 - 261 - if !ACPI 262 - config X86_MPPARSE 263 - def_bool y 264 - depends on X86_LOCAL_APIC 265 - endif 266 252 267 253 choice 268 254 prompt "Subarchitecture Type" ··· 463 465 def_bool y 464 466 depends on X86_GENERICARCH 465 467 466 - config ES7000_CLUSTERED_APIC 467 - def_bool y 468 - depends on SMP && X86_ES7000 && MPENTIUMIII 469 - 470 468 source "arch/x86/Kconfig.cpu" 471 469 472 470 config HPET_TIMER ··· 653 659 config X86_VISWS_APIC 654 660 def_bool y 655 661 depends on X86_32 && X86_VISWS 662 + 663 + config X86_REROUTE_FOR_BROKEN_BOOT_IRQS 664 + bool "Reroute for broken boot IRQs" 665 + default n 666 + depends on X86_IO_APIC 667 + help 668 + This option enables a workaround that fixes a source of 669 + spurious interrupts. This is recommended when threaded 670 + interrupt handling is used on systems where the generation of 671 + superfluous "boot interrupts" cannot be disabled. 672 + 673 + Some chipsets generate a legacy INTx "boot IRQ" when the IRQ 674 + entry in the chipset's IO-APIC is masked (as, e.g. the RT 675 + kernel does during interrupt handling). 
On chipsets where this 676 + boot IRQ generation cannot be disabled, this workaround keeps 677 + the original IRQ line masked so that only the equivalent "boot 678 + IRQ" is delivered to the CPUs. The workaround also tells the 679 + kernel to set up the IRQ handler on the boot IRQ line. In this 680 + way only one interrupt is delivered to the kernel. Otherwise 681 + the spurious second interrupt may cause the kernel to bring 682 + down (vital) interrupt lines. 683 + 684 + Only affects "broken" chipsets. Interrupt sharing may be 685 + increased on these systems. 656 686 657 687 config X86_MCE 658 688 bool "Machine Check Exception" ··· 974 956 config ARCH_PHYS_ADDR_T_64BIT 975 957 def_bool X86_64 || X86_PAE 976 958 959 + config DIRECT_GBPAGES 960 + bool "Enable 1GB pages for kernel pagetables" if EMBEDDED 961 + default y 962 + depends on X86_64 963 + help 964 + Allow the kernel linear mapping to use 1GB pages on CPUs that 965 + support it. This can improve the kernel's performance a tiny bit by 966 + reducing TLB pressure. If in doubt, say "Y". 967 + 977 968 # Common NUMA Features 978 969 config NUMA 979 - bool "Numa Memory Allocation and Scheduler Support (EXPERIMENTAL)" 970 + bool "Numa Memory Allocation and Scheduler Support" 980 971 depends on SMP 981 972 depends on X86_64 || (X86_32 && HIGHMEM64G && (X86_NUMAQ || X86_BIGSMP || X86_SUMMIT && ACPI) && EXPERIMENTAL) 982 973 default n if X86_PC 983 974 default y if (X86_NUMAQ || X86_SUMMIT || X86_BIGSMP) 984 975 help 985 976 Enable NUMA (Non Uniform Memory Access) support. 977 + 986 978 The kernel will try to allocate memory used by a CPU on the 987 979 local memory controller of the CPU and add some more 988 980 NUMA awareness to the kernel. 989 981 990 - For 32-bit this is currently highly experimental and should be only 991 - used for kernel development. It might also cause boot failures. 992 - For 64-bit this is recommended on all multiprocessor Opteron systems. 
993 - If the system is EM64T, you should say N unless your system is 994 - EM64T NUMA. 982 + For 64-bit this is recommended if the system is Intel Core i7 983 + (or later), AMD Opteron, or EM64T NUMA. 984 + 985 + For 32-bit this is only needed on (rare) 32-bit-only platforms 986 + that support NUMA topologies, such as NUMAQ / Summit, or if you 987 + boot a 32-bit kernel on a 64-bit NUMA platform. 988 + 989 + Otherwise, you should say N. 995 990 996 991 comment "NUMA (Summit) requires SMP, 64GB highmem support, ACPI" 997 992 depends on X86_32 && X86_SUMMIT && (!HIGHMEM64G || !ACPI) ··· 1524 1493 def_bool y 1525 1494 depends on X86_64 || (X86_32 && HIGHMEM) 1526 1495 1496 + config ARCH_ENABLE_MEMORY_HOTREMOVE 1497 + def_bool y 1498 + depends on MEMORY_HOTPLUG 1499 + 1527 1500 config HAVE_ARCH_EARLY_PFN_TO_NID 1528 1501 def_bool X86_64 1529 1502 depends on NUMA ··· 1666 1631 needs to. Unfortunately, some BIOSes do not -- especially those in 1667 1632 many of the newer IBM Thinkpads. If you experience hangs when you 1668 1633 suspend, try setting this to Y. Otherwise, say N. 1669 - 1670 - config APM_REAL_MODE_POWER_OFF 1671 - bool "Use real mode APM BIOS call to power off" 1672 - help 1673 - Use real mode APM BIOS calls to switch off the computer. This is 1674 - a work-around for a number of buggy BIOSes. Switch this option on if 1675 - your computer crashes instead of powering off properly. 1676 1634 1677 1635 endif # APM 1678 1636
+4 -16
arch/x86/Kconfig.debug
··· 114 114 data. This is recommended so that we can catch kernel bugs sooner. 115 115 If in doubt, say "Y". 116 116 117 - config DIRECT_GBPAGES 118 - bool "Enable gbpages-mapped kernel pagetables" 119 - depends on DEBUG_KERNEL && EXPERIMENTAL && X86_64 120 - help 121 - Enable gigabyte pages support (if the CPU supports it). This can 122 - improve the kernel's performance a tiny bit by reducing TLB 123 - pressure. 124 - 125 - This is experimental code. 126 - 127 - If in doubt, say "N". 128 - 129 117 config DEBUG_RODATA_TEST 130 118 bool "Testcase for the DEBUG_RODATA feature" 131 119 depends on DEBUG_RODATA ··· 295 307 developers have marked 'inline'. Doing so takes away freedom from gcc to 296 308 do what it thinks is best, which is desirable for the gcc 3.x series of 297 309 compilers. The gcc 4.x series have a rewritten inlining algorithm and 298 - disabling this option will generate a smaller kernel there. Hopefully 299 - this algorithm is so good that allowing gcc4 to make the decision can 300 - become the default in the future, until then this option is there to 301 - test gcc for this. 310 + enabling this option will generate a smaller kernel there. Hopefully 311 + this algorithm is so good that allowing gcc 4.x and above to make the 312 + decision will become the default in the future. Until then this option 313 + is there to test gcc for this. 302 314 303 315 If unsure, say N. 304 316
+2 -2
arch/x86/boot/video-vga.c
··· 34 34 { VIDEO_80x25, 80, 25, 0 }, 35 35 }; 36 36 37 - __videocard video_vga; 37 + static __videocard video_vga; 38 38 39 39 /* Set basic 80x25 mode */ 40 40 static u8 vga_set_basic_mode(void) ··· 259 259 return mode_count[adapter]; 260 260 } 261 261 262 - __videocard video_vga = { 262 + static __videocard video_vga = { 263 263 .card_name = "VGA", 264 264 .probe = vga_probe, 265 265 .set_mode = vga_set_mode,
+1 -1
arch/x86/boot/video.c
··· 226 226 227 227 #ifdef CONFIG_VIDEO_RETAIN 228 228 /* Save screen content to the heap */ 229 - struct saved_screen { 229 + static struct saved_screen { 230 230 int x, y; 231 231 int curx, cury; 232 232 u16 *data;
+2 -2
arch/x86/configs/i386_defconfig
··· 77 77 CONFIG_AUDITSYSCALL=y 78 78 CONFIG_AUDIT_TREE=y 79 79 # CONFIG_IKCONFIG is not set 80 - CONFIG_LOG_BUF_SHIFT=17 80 + CONFIG_LOG_BUF_SHIFT=18 81 81 CONFIG_CGROUPS=y 82 82 # CONFIG_CGROUP_DEBUG is not set 83 83 CONFIG_CGROUP_NS=y ··· 298 298 CONFIG_CRASH_DUMP=y 299 299 # CONFIG_KEXEC_JUMP is not set 300 300 CONFIG_PHYSICAL_START=0x1000000 301 - CONFIG_RELOCATABLE=y 301 + # CONFIG_RELOCATABLE is not set 302 302 CONFIG_PHYSICAL_ALIGN=0x200000 303 303 CONFIG_HOTPLUG_CPU=y 304 304 # CONFIG_COMPAT_VDSO is not set
+2 -2
arch/x86/configs/x86_64_defconfig
··· 77 77 CONFIG_AUDITSYSCALL=y 78 78 CONFIG_AUDIT_TREE=y 79 79 # CONFIG_IKCONFIG is not set 80 - CONFIG_LOG_BUF_SHIFT=17 80 + CONFIG_LOG_BUF_SHIFT=18 81 81 CONFIG_CGROUPS=y 82 82 # CONFIG_CGROUP_DEBUG is not set 83 83 CONFIG_CGROUP_NS=y ··· 298 298 CONFIG_KEXEC=y 299 299 CONFIG_CRASH_DUMP=y 300 300 CONFIG_PHYSICAL_START=0x1000000 301 - CONFIG_RELOCATABLE=y 301 + # CONFIG_RELOCATABLE is not set 302 302 CONFIG_PHYSICAL_ALIGN=0x200000 303 303 CONFIG_HOTPLUG_CPU=y 304 304 # CONFIG_COMPAT_VDSO is not set
+39 -72
arch/x86/ia32/ia32_signal.c
··· 32 32 #include <asm/proto.h> 33 33 #include <asm/vdso.h> 34 34 35 + #include <asm/sigframe.h> 36 + 35 37 #define DEBUG_SIG 0 36 38 37 39 #define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP))) ··· 43 41 X86_EFLAGS_ZF | X86_EFLAGS_AF | X86_EFLAGS_PF | \ 44 42 X86_EFLAGS_CF) 45 43 46 - asmlinkage int do_signal(struct pt_regs *regs, sigset_t *oldset); 47 44 void signal_fault(struct pt_regs *regs, void __user *frame, char *where); 48 45 49 46 int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from) ··· 174 173 /* 175 174 * Do a signal return; undo the signal stack. 176 175 */ 177 - 178 - struct sigframe 179 - { 180 - u32 pretcode; 181 - int sig; 182 - struct sigcontext_ia32 sc; 183 - struct _fpstate_ia32 fpstate_unused; /* look at kernel/sigframe.h */ 184 - unsigned int extramask[_COMPAT_NSIG_WORDS-1]; 185 - char retcode[8]; 186 - /* fp state follows here */ 187 - }; 188 - 189 - struct rt_sigframe 190 - { 191 - u32 pretcode; 192 - int sig; 193 - u32 pinfo; 194 - u32 puc; 195 - compat_siginfo_t info; 196 - struct ucontext_ia32 uc; 197 - char retcode[8]; 198 - /* fp state follows here */ 199 - }; 200 - 201 - #define COPY(x) { \ 202 - unsigned int reg; \ 203 - err |= __get_user(reg, &sc->x); \ 204 - regs->x = reg; \ 176 + #define COPY(x) { \ 177 + err |= __get_user(regs->x, &sc->x); \ 205 178 } 206 179 207 - #define RELOAD_SEG(seg,mask) \ 208 - { unsigned int cur; \ 209 - unsigned short pre; \ 210 - err |= __get_user(pre, &sc->seg); \ 211 - savesegment(seg, cur); \ 212 - pre |= mask; \ 213 - if (pre != cur) loadsegment(seg, pre); } 180 + #define COPY_SEG_CPL3(seg) { \ 181 + unsigned short tmp; \ 182 + err |= __get_user(tmp, &sc->seg); \ 183 + regs->seg = tmp | 3; \ 184 + } 185 + 186 + #define RELOAD_SEG(seg) { \ 187 + unsigned int cur, pre; \ 188 + err |= __get_user(pre, &sc->seg); \ 189 + savesegment(seg, cur); \ 190 + pre |= 3; \ 191 + if (pre != cur) \ 192 + loadsegment(seg, pre); \ 193 + } 214 194 215 195 static int 
ia32_restore_sigcontext(struct pt_regs *regs, 216 196 struct sigcontext_ia32 __user *sc, 217 - unsigned int *peax) 197 + unsigned int *pax) 218 198 { 219 199 unsigned int tmpflags, gs, oldgs, err = 0; 220 200 void __user *buf; ··· 222 240 if (gs != oldgs) 223 241 load_gs_index(gs); 224 242 225 - RELOAD_SEG(fs, 3); 226 - RELOAD_SEG(ds, 3); 227 - RELOAD_SEG(es, 3); 243 + RELOAD_SEG(fs); 244 + RELOAD_SEG(ds); 245 + RELOAD_SEG(es); 228 246 229 247 COPY(di); COPY(si); COPY(bp); COPY(sp); COPY(bx); 230 248 COPY(dx); COPY(cx); COPY(ip); 231 249 /* Don't touch extended registers */ 232 250 233 - err |= __get_user(regs->cs, &sc->cs); 234 - regs->cs |= 3; 235 - err |= __get_user(regs->ss, &sc->ss); 236 - regs->ss |= 3; 251 + COPY_SEG_CPL3(cs); 252 + COPY_SEG_CPL3(ss); 237 253 238 254 err |= __get_user(tmpflags, &sc->flags); 239 255 regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS); ··· 242 262 buf = compat_ptr(tmp); 243 263 err |= restore_i387_xstate_ia32(buf); 244 264 245 - err |= __get_user(tmp, &sc->ax); 246 - *peax = tmp; 247 - 265 + err |= __get_user(*pax, &sc->ax); 248 266 return err; 249 267 } 250 268 251 269 asmlinkage long sys32_sigreturn(struct pt_regs *regs) 252 270 { 253 - struct sigframe __user *frame = (struct sigframe __user *)(regs->sp-8); 271 + struct sigframe_ia32 __user *frame = (struct sigframe_ia32 __user *)(regs->sp-8); 254 272 sigset_t set; 255 273 unsigned int ax; 256 274 ··· 278 300 279 301 asmlinkage long sys32_rt_sigreturn(struct pt_regs *regs) 280 302 { 281 - struct rt_sigframe __user *frame; 303 + struct rt_sigframe_ia32 __user *frame; 282 304 sigset_t set; 283 305 unsigned int ax; 284 306 struct pt_regs tregs; 285 307 286 - frame = (struct rt_sigframe __user *)(regs->sp - 4); 308 + frame = (struct rt_sigframe_ia32 __user *)(regs->sp - 4); 287 309 288 310 if (!access_ok(VERIFY_READ, frame, sizeof(*frame))) 289 311 goto badframe; ··· 337 359 err |= __put_user(regs->dx, &sc->dx); 338 360 err |= __put_user(regs->cx, &sc->cx); 339 
361 err |= __put_user(regs->ax, &sc->ax); 340 - err |= __put_user(regs->cs, &sc->cs); 341 - err |= __put_user(regs->ss, &sc->ss); 342 362 err |= __put_user(current->thread.trap_no, &sc->trapno); 343 363 err |= __put_user(current->thread.error_code, &sc->err); 344 364 err |= __put_user(regs->ip, &sc->ip); 365 + err |= __put_user(regs->cs, (unsigned int __user *)&sc->cs); 345 366 err |= __put_user(regs->flags, &sc->flags); 346 367 err |= __put_user(regs->sp, &sc->sp_at_signal); 368 + err |= __put_user(regs->ss, (unsigned int __user *)&sc->ss); 347 369 348 - tmp = save_i387_xstate_ia32(fpstate); 349 - if (tmp < 0) 350 - err = -EFAULT; 351 - else 352 - err |= __put_user(ptr_to_compat(tmp ? fpstate : NULL), 353 - &sc->fpstate); 370 + err |= __put_user(ptr_to_compat(fpstate), &sc->fpstate); 354 371 355 372 /* non-iBCS2 extensions.. */ 356 373 err |= __put_user(mask, &sc->oldmask); ··· 373 400 } 374 401 375 402 /* This is the legacy signal stack switching. */ 376 - else if ((regs->ss & 0xffff) != __USER_DS && 403 + else if ((regs->ss & 0xffff) != __USER32_DS && 377 404 !(ka->sa.sa_flags & SA_RESTORER) && 378 405 ka->sa.sa_restorer) 379 406 sp = (unsigned long) ka->sa.sa_restorer; ··· 381 408 if (used_math()) { 382 409 sp = sp - sig_xstate_ia32_size; 383 410 *fpstate = (struct _fpstate_ia32 *) sp; 411 + if (save_i387_xstate_ia32(*fpstate) < 0) 412 + return (void __user *) -1L; 384 413 } 385 414 386 415 sp -= frame_size; ··· 395 420 int ia32_setup_frame(int sig, struct k_sigaction *ka, 396 421 compat_sigset_t *set, struct pt_regs *regs) 397 422 { 398 - struct sigframe __user *frame; 423 + struct sigframe_ia32 __user *frame; 399 424 void __user *restorer; 400 425 int err = 0; 401 426 void __user *fpstate = NULL; ··· 405 430 u16 poplmovl; 406 431 u32 val; 407 432 u16 int80; 408 - u16 pad; 409 433 } __attribute__((packed)) code = { 410 434 0xb858, /* popl %eax ; movl $...,%eax */ 411 435 __NR_ia32_sigreturn, 412 436 0x80cd, /* int $0x80 */ 413 - 0, 414 437 }; 415 438 416 439 
frame = get_sigframe(ka, regs, sizeof(*frame), &fpstate); ··· 444 471 * These are actually not used anymore, but left because some 445 472 * gdb versions depend on them as a marker. 446 473 */ 447 - err |= __copy_to_user(frame->retcode, &code, 8); 474 + err |= __put_user(*((u64 *)&code), (u64 *)frame->retcode); 448 475 if (err) 449 476 return -EFAULT; 450 477 ··· 474 501 int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info, 475 502 compat_sigset_t *set, struct pt_regs *regs) 476 503 { 477 - struct rt_sigframe __user *frame; 504 + struct rt_sigframe_ia32 __user *frame; 478 505 void __user *restorer; 479 506 int err = 0; 480 507 void __user *fpstate = NULL; ··· 484 511 u8 movl; 485 512 u32 val; 486 513 u16 int80; 487 - u16 pad; 488 - u8 pad2; 514 + u8 pad; 489 515 } __attribute__((packed)) code = { 490 516 0xb8, 491 517 __NR_ia32_rt_sigreturn, ··· 531 559 * Not actually used anymore, but left because some gdb 532 560 * versions need it. 533 561 */ 534 - err |= __copy_to_user(frame->retcode, &code, 8); 562 + err |= __put_user(*((u64 *)&code), (u64 *)frame->retcode); 535 563 if (err) 536 564 return -EFAULT; 537 565 538 566 /* Set up registers for signal handler */ 539 567 regs->sp = (unsigned long) frame; 540 568 regs->ip = (unsigned long) ka->sa.sa_handler; 541 - 542 - /* Make -mregparm=3 work */ 543 - regs->ax = sig; 544 - regs->dx = (unsigned long) &frame->info; 545 - regs->cx = (unsigned long) &frame->uc; 546 569 547 570 /* Make -mregparm=3 work */ 548 571 regs->ax = sig;
+1
arch/x86/include/asm/apic.h
··· 193 193 static inline void lapic_shutdown(void) { } 194 194 #define local_apic_timer_c2_ok 1 195 195 static inline void init_apic_mappings(void) { } 196 + static inline void disable_local_APIC(void) { } 196 197 197 198 #endif /* !CONFIG_X86_LOCAL_APIC */ 198 199
-2
arch/x86/include/asm/bigsmp/apic.h
··· 24 24 #define INT_DELIVERY_MODE (dest_Fixed) 25 25 #define INT_DEST_MODE (0) /* phys delivery to target proc */ 26 26 #define NO_BALANCE_IRQ (0) 27 - #define WAKE_SECONDARY_VIA_INIT 28 - 29 27 30 28 static inline unsigned long check_apicid_used(physid_mask_t bitmap, int apicid) 31 29 {
+9 -1
arch/x86/include/asm/bitops.h
··· 168 168 */ 169 169 static inline void change_bit(int nr, volatile unsigned long *addr) 170 170 { 171 - asm volatile(LOCK_PREFIX "btc %1,%0" : ADDR : "Ir" (nr)); 171 + if (IS_IMMEDIATE(nr)) { 172 + asm volatile(LOCK_PREFIX "xorb %1,%0" 173 + : CONST_MASK_ADDR(nr, addr) 174 + : "iq" ((u8)CONST_MASK(nr))); 175 + } else { 176 + asm volatile(LOCK_PREFIX "btc %1,%0" 177 + : BITOP_ADDR(addr) 178 + : "Ir" (nr)); 179 + } 172 180 } 173 181 174 182 /**
+1 -1
arch/x86/include/asm/bug.h
··· 9 9 #ifdef CONFIG_X86_32 10 10 # define __BUG_C0 "2:\t.long 1b, %c0\n" 11 11 #else 12 - # define __BUG_C0 "2:\t.quad 1b, %c0\n" 12 + # define __BUG_C0 "2:\t.long 1b - 2b, %c0 - 2b\n" 13 13 #endif 14 14 15 15 #define BUG() \
+31 -47
arch/x86/include/asm/byteorder.h
··· 4 4 #include <asm/types.h> 5 5 #include <linux/compiler.h> 6 6 7 - #ifdef __GNUC__ 7 + #define __LITTLE_ENDIAN 8 8 9 - #ifdef __i386__ 10 - 11 - static inline __attribute_const__ __u32 ___arch__swab32(__u32 x) 9 + static inline __attribute_const__ __u32 __arch_swab32(__u32 val) 12 10 { 13 - #ifdef CONFIG_X86_BSWAP 14 - asm("bswap %0" : "=r" (x) : "0" (x)); 15 - #else 11 + #ifdef __i386__ 12 + # ifdef CONFIG_X86_BSWAP 13 + asm("bswap %0" : "=r" (val) : "0" (val)); 14 + # else 16 15 asm("xchgb %b0,%h0\n\t" /* swap lower bytes */ 17 16 "rorl $16,%0\n\t" /* swap words */ 18 17 "xchgb %b0,%h0" /* swap higher bytes */ 19 - : "=q" (x) 20 - : "0" (x)); 21 - #endif 22 - return x; 23 - } 18 + : "=q" (val) 19 + : "0" (val)); 20 + # endif 24 21 25 - static inline __attribute_const__ __u64 ___arch__swab64(__u64 val) 22 + #else /* __i386__ */ 23 + asm("bswapl %0" 24 + : "=r" (val) 25 + : "0" (val)); 26 + #endif 27 + return val; 28 + } 29 + #define __arch_swab32 __arch_swab32 30 + 31 + static inline __attribute_const__ __u64 __arch_swab64(__u64 val) 26 32 { 33 + #ifdef __i386__ 27 34 union { 28 35 struct { 29 36 __u32 a; ··· 39 32 __u64 u; 40 33 } v; 41 34 v.u = val; 42 - #ifdef CONFIG_X86_BSWAP 35 + # ifdef CONFIG_X86_BSWAP 43 36 asm("bswapl %0 ; bswapl %1 ; xchgl %0,%1" 44 37 : "=r" (v.s.a), "=r" (v.s.b) 45 38 : "0" (v.s.a), "1" (v.s.b)); 46 - #else 47 - v.s.a = ___arch__swab32(v.s.a); 48 - v.s.b = ___arch__swab32(v.s.b); 39 + # else 40 + v.s.a = __arch_swab32(v.s.a); 41 + v.s.b = __arch_swab32(v.s.b); 49 42 asm("xchgl %0,%1" 50 43 : "=r" (v.s.a), "=r" (v.s.b) 51 44 : "0" (v.s.a), "1" (v.s.b)); 52 - #endif 45 + # endif 53 46 return v.u; 54 - } 55 - 56 47 #else /* __i386__ */ 57 - 58 - static inline __attribute_const__ __u64 ___arch__swab64(__u64 x) 59 - { 60 48 asm("bswapq %0" 61 - : "=r" (x) 62 - : "0" (x)); 63 - return x; 64 - } 65 - 66 - static inline __attribute_const__ __u32 ___arch__swab32(__u32 x) 67 - { 68 - asm("bswapl %0" 69 - : "=r" (x) 70 - : "0" (x)); 71 - 
return x; 72 - } 73 - 49 + : "=r" (val) 50 + : "0" (val)); 51 + return val; 74 52 #endif 53 + } 54 + #define __arch_swab64 __arch_swab64 75 55 76 - /* Do not define swab16. Gcc is smart enough to recognize "C" version and 77 - convert it into rotation or exhange. */ 78 - 79 - #define __arch__swab64(x) ___arch__swab64(x) 80 - #define __arch__swab32(x) ___arch__swab32(x) 81 - 82 - #define __BYTEORDER_HAS_U64__ 83 - 84 - #endif /* __GNUC__ */ 85 - 86 - #include <linux/byteorder/little_endian.h> 56 + #include <linux/byteorder.h> 87 57 88 58 #endif /* _ASM_X86_BYTEORDER_H */
+4 -1
arch/x86/include/asm/cpufeature.h
··· 80 80 #define X86_FEATURE_UP (3*32+ 9) /* smp kernel running on up */ 81 81 #define X86_FEATURE_FXSAVE_LEAK (3*32+10) /* "" FXSAVE leaks FOP/FIP/FOP */ 82 82 #define X86_FEATURE_ARCH_PERFMON (3*32+11) /* Intel Architectural PerfMon */ 83 - #define X86_FEATURE_NOPL (3*32+20) /* The NOPL (0F 1F) instructions */ 84 83 #define X86_FEATURE_PEBS (3*32+12) /* Precise-Event Based Sampling */ 85 84 #define X86_FEATURE_BTS (3*32+13) /* Branch Trace Store */ 86 85 #define X86_FEATURE_SYSCALL32 (3*32+14) /* "" syscall in ia32 userspace */ ··· 91 92 #define X86_FEATURE_NOPL (3*32+20) /* The NOPL (0F 1F) instructions */ 92 93 #define X86_FEATURE_AMDC1E (3*32+21) /* AMD C1E detected */ 93 94 #define X86_FEATURE_XTOPOLOGY (3*32+22) /* cpu topology enum extensions */ 95 + #define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */ 96 + #define X86_FEATURE_NONSTOP_TSC (3*32+24) /* TSC does not stop in C states */ 94 97 95 98 /* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */ 96 99 #define X86_FEATURE_XMM3 (4*32+ 0) /* "pni" SSE-3 */ ··· 118 117 #define X86_FEATURE_XSAVE (4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */ 119 118 #define X86_FEATURE_OSXSAVE (4*32+27) /* "" XSAVE enabled in the OS */ 120 119 #define X86_FEATURE_AVX (4*32+28) /* Advanced Vector Extensions */ 120 + #define X86_FEATURE_HYPERVISOR (4*32+31) /* Running on a hypervisor */ 121 121 122 122 /* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */ 123 123 #define X86_FEATURE_XSTORE (5*32+ 2) /* "rng" RNG present (xstore) */ ··· 239 237 #define cpu_has_xmm4_2 boot_cpu_has(X86_FEATURE_XMM4_2) 240 238 #define cpu_has_x2apic boot_cpu_has(X86_FEATURE_X2APIC) 241 239 #define cpu_has_xsave boot_cpu_has(X86_FEATURE_XSAVE) 240 + #define cpu_has_hypervisor boot_cpu_has(X86_FEATURE_HYPERVISOR) 242 241 243 242 #if defined(CONFIG_X86_INVLPG) || defined(CONFIG_X86_64) 244 243 # define cpu_has_invlpg 1
+3 -1
arch/x86/include/asm/emergency-restart.h
··· 8 8 BOOT_BIOS = 'b', 9 9 #endif 10 10 BOOT_ACPI = 'a', 11 - BOOT_EFI = 'e' 11 + BOOT_EFI = 'e', 12 + BOOT_CF9 = 'p', 13 + BOOT_CF9_COND = 'q', 12 14 }; 13 15 14 16 extern enum reboot_type reboot_type;
+54 -27
arch/x86/include/asm/es7000/apic.h
··· 9 9 return (1); 10 10 } 11 11 12 - static inline cpumask_t target_cpus(void) 12 + static inline cpumask_t target_cpus_cluster(void) 13 13 { 14 - #if defined CONFIG_ES7000_CLUSTERED_APIC 15 14 return CPU_MASK_ALL; 16 - #else 17 - return cpumask_of_cpu(smp_processor_id()); 18 - #endif 19 15 } 20 16 21 - #if defined CONFIG_ES7000_CLUSTERED_APIC 22 - #define APIC_DFR_VALUE (APIC_DFR_CLUSTER) 23 - #define INT_DELIVERY_MODE (dest_LowestPrio) 24 - #define INT_DEST_MODE (1) /* logical delivery broadcast to all procs */ 25 - #define NO_BALANCE_IRQ (1) 26 - #undef WAKE_SECONDARY_VIA_INIT 27 - #define WAKE_SECONDARY_VIA_MIP 28 - #else 17 + static inline cpumask_t target_cpus(void) 18 + { 19 + return cpumask_of_cpu(smp_processor_id()); 20 + } 21 + 22 + #define APIC_DFR_VALUE_CLUSTER (APIC_DFR_CLUSTER) 23 + #define INT_DELIVERY_MODE_CLUSTER (dest_LowestPrio) 24 + #define INT_DEST_MODE_CLUSTER (1) /* logical delivery broadcast to all procs */ 25 + #define NO_BALANCE_IRQ_CLUSTER (1) 26 + 29 27 #define APIC_DFR_VALUE (APIC_DFR_FLAT) 30 28 #define INT_DELIVERY_MODE (dest_Fixed) 31 29 #define INT_DEST_MODE (0) /* phys delivery to target procs */ 32 30 #define NO_BALANCE_IRQ (0) 33 31 #undef APIC_DEST_LOGICAL 34 32 #define APIC_DEST_LOGICAL 0x0 35 - #define WAKE_SECONDARY_VIA_INIT 36 - #endif 37 33 38 34 static inline unsigned long check_apicid_used(physid_mask_t bitmap, int apicid) 39 35 { ··· 56 60 * an APIC. See e.g. "AP-388 82489DX User's Manual" (Intel 57 61 * document number 292116). So here it goes... 
58 62 */ 63 + static inline void init_apic_ldr_cluster(void) 64 + { 65 + unsigned long val; 66 + int cpu = smp_processor_id(); 67 + 68 + apic_write(APIC_DFR, APIC_DFR_VALUE_CLUSTER); 69 + val = calculate_ldr(cpu); 70 + apic_write(APIC_LDR, val); 71 + } 72 + 59 73 static inline void init_apic_ldr(void) 60 74 { 61 75 unsigned long val; ··· 75 69 val = calculate_ldr(cpu); 76 70 apic_write(APIC_LDR, val); 77 71 } 78 - 79 - #ifndef CONFIG_X86_GENERICARCH 80 - extern void enable_apic_mode(void); 81 - #endif 82 72 83 73 extern int apic_version [MAX_APICS]; 84 74 static inline void setup_apic_routing(void) ··· 146 144 return (1); 147 145 } 148 146 149 - static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask) 147 + static inline unsigned int cpu_mask_to_apicid_cluster(cpumask_t cpumask) 150 148 { 151 149 int num_bits_set; 152 150 int cpus_found = 0; ··· 156 154 num_bits_set = cpus_weight(cpumask); 157 155 /* Return id to all */ 158 156 if (num_bits_set == NR_CPUS) 159 - #if defined CONFIG_ES7000_CLUSTERED_APIC 160 157 return 0xFF; 161 - #else 162 - return cpu_to_logical_apicid(0); 163 - #endif 164 158 /* 165 159 * The cpus in the mask must all be on the apic cluster. If are not 166 160 * on the same apicid cluster return default value of TARGET_CPUS. ··· 169 171 if (apicid_cluster(apicid) != 170 172 apicid_cluster(new_apicid)){ 171 173 printk ("%s: Not a valid mask!\n", __func__); 172 - #if defined CONFIG_ES7000_CLUSTERED_APIC 173 174 return 0xFF; 174 - #else 175 + } 176 + apicid = new_apicid; 177 + cpus_found++; 178 + } 179 + cpu++; 180 + } 181 + return apicid; 182 + } 183 + 184 + static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask) 185 + { 186 + int num_bits_set; 187 + int cpus_found = 0; 188 + int cpu; 189 + int apicid; 190 + 191 + num_bits_set = cpus_weight(cpumask); 192 + /* Return id to all */ 193 + if (num_bits_set == NR_CPUS) 194 + return cpu_to_logical_apicid(0); 195 + /* 196 + * The cpus in the mask must all be on the apic cluster. If are not 197 + * on the same apicid cluster return default value of TARGET_CPUS. 198 + */ 199 + cpu = first_cpu(cpumask); 200 + apicid = cpu_to_logical_apicid(cpu); 201 + while (cpus_found < num_bits_set) { 202 + if (cpu_isset(cpu, cpumask)) { 203 + int new_apicid = cpu_to_logical_apicid(cpu); 204 + if (apicid_cluster(apicid) != 205 + apicid_cluster(new_apicid)){ 206 + printk ("%s: Not a valid mask!\n", __func__); 175 207 return cpu_to_logical_apicid(0); 176 - #endif 177 208 } 178 209 apicid = new_apicid; 179 210 cpus_found++;
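The split of cpu_mask_to_apicid() into cluster and flat variants above hinges on one invariant: every logical APIC ID in the mask must sit in the same APIC cluster, otherwise the function falls back to a default ID. A minimal user-space sketch of that check, assuming apicid_cluster() masks off the low nibble; the helper names and the mask are illustrative, not the kernel's:

```c
#include <stddef.h>

/*
 * Sketch of the cluster-consistency check in cpu_mask_to_apicid():
 * all logical APIC IDs in a mask must land in the same cluster.
 * apicid_cluster() masking the low nibble is an assumption, matching
 * the es7000 convention; this is not the kernel's actual helper.
 */
#define apicid_cluster(apicid)	((apicid) & 0xF0)

static int mask_is_one_cluster(const unsigned char *apicids, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (apicid_cluster(apicids[i]) != apicid_cluster(apicids[0]))
			return 0;	/* the "Not a valid mask" case */
	return 1;
}
```

When the check fails, the cluster variant returns the broadcast ID 0xFF while the flat variant falls back to cpu_to_logical_apicid(0), mirroring the two code paths above.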
+10 -31
arch/x86/include/asm/es7000/wakecpu.h
··· 1 1 #ifndef __ASM_ES7000_WAKECPU_H 2 2 #define __ASM_ES7000_WAKECPU_H 3 3 4 - /* 5 - * This file copes with machines that wakeup secondary CPUs by the 6 - * INIT, INIT, STARTUP sequence. 7 - */ 8 - 9 - #ifdef CONFIG_ES7000_CLUSTERED_APIC 10 - #define WAKE_SECONDARY_VIA_MIP 11 - #else 12 - #define WAKE_SECONDARY_VIA_INIT 13 - #endif 14 - 15 - #ifdef WAKE_SECONDARY_VIA_MIP 16 - extern int es7000_start_cpu(int cpu, unsigned long eip); 17 - static inline int 18 - wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip) 19 - { 20 - int boot_error = 0; 21 - boot_error = es7000_start_cpu(phys_apicid, start_eip); 22 - return boot_error; 23 - } 24 - #endif 25 - 26 - #define TRAMPOLINE_LOW phys_to_virt(0x467) 27 - #define TRAMPOLINE_HIGH phys_to_virt(0x469) 28 - 29 - #define boot_cpu_apicid boot_cpu_physical_apicid 4 + #define TRAMPOLINE_PHYS_LOW 0x467 5 + #define TRAMPOLINE_PHYS_HIGH 0x469 30 6 31 7 static inline void wait_for_init_deassert(atomic_t *deassert) 32 8 { 33 - #ifdef WAKE_SECONDARY_VIA_INIT 9 + #ifndef CONFIG_ES7000_CLUSTERED_APIC 34 10 while (!atomic_read(deassert)) 35 11 cpu_relax(); 36 12 #endif ··· 26 50 { 27 51 } 28 52 29 - #define inquire_remote_apic(apicid) do { \ 30 - if (apic_verbosity >= APIC_DEBUG) \ 31 - __inquire_remote_apic(apicid); \ 32 - } while (0) 53 + extern void __inquire_remote_apic(int apicid); 54 + 55 + static inline void inquire_remote_apic(int apicid) 56 + { 57 + if (apic_verbosity >= APIC_DEBUG) 58 + __inquire_remote_apic(apicid); 59 + } 33 60 34 61 #endif /* __ASM_MACH_WAKECPU_H */
+18 -1
arch/x86/include/asm/genapic_32.h
··· 2 2 #define _ASM_X86_GENAPIC_32_H 3 3 4 4 #include <asm/mpspec.h> 5 + #include <asm/atomic.h> 5 6 6 7 /* 7 8 * Generic APIC driver interface. ··· 66 65 void (*send_IPI_allbutself)(int vector); 67 66 void (*send_IPI_all)(int vector); 68 67 #endif 68 + int (*wakeup_cpu)(int apicid, unsigned long start_eip); 69 + int trampoline_phys_low; 70 + int trampoline_phys_high; 71 + void (*wait_for_init_deassert)(atomic_t *deassert); 72 + void (*smp_callin_clear_local_apic)(void); 73 + void (*store_NMI_vector)(unsigned short *high, unsigned short *low); 74 + void (*restore_NMI_vector)(unsigned short *high, unsigned short *low); 75 + void (*inquire_remote_apic)(int apicid); 69 76 }; 70 77 71 78 #define APICFUNC(x) .x = x, ··· 114 105 APICFUNC(get_apic_id) \ 115 106 .apic_id_mask = APIC_ID_MASK, \ 116 107 APICFUNC(cpu_mask_to_apicid) \ 117 - APICFUNC(vector_allocation_domain) \ 108 + APICFUNC(vector_allocation_domain) \ 118 109 APICFUNC(acpi_madt_oem_check) \ 119 110 IPIFUNC(send_IPI_mask) \ 120 111 IPIFUNC(send_IPI_allbutself) \ 121 112 IPIFUNC(send_IPI_all) \ 122 113 APICFUNC(enable_apic_mode) \ 123 114 APICFUNC(phys_pkg_id) \ 115 + .trampoline_phys_low = TRAMPOLINE_PHYS_LOW, \ 116 + .trampoline_phys_high = TRAMPOLINE_PHYS_HIGH, \ 117 + APICFUNC(wait_for_init_deassert) \ 118 + APICFUNC(smp_callin_clear_local_apic) \ 119 + APICFUNC(store_NMI_vector) \ 120 + APICFUNC(restore_NMI_vector) \ 121 + APICFUNC(inquire_remote_apic) \ 124 122 } 125 123 126 124 extern struct genapic *genapic; 125 + extern void es7000_update_genapic_to_cluster(void); 127 126 128 127 enum uv_system_type {UV_NONE, UV_LEGACY_APIC, UV_X2APIC, UV_NON_UNIQUE_APIC}; 129 128 #define get_uv_system_type() UV_NONE
+2
arch/x86/include/asm/genapic_64.h
··· 32 32 unsigned int (*get_apic_id)(unsigned long x); 33 33 unsigned long (*set_apic_id)(unsigned int id); 34 34 unsigned long apic_id_mask; 35 + /* wakeup_secondary_cpu */ 36 + int (*wakeup_cpu)(int apicid, unsigned long start_eip); 35 37 }; 36 38 37 39 extern struct genapic *genapic;
+26
arch/x86/include/asm/hypervisor.h
··· 1 + /* 2 + * Copyright (C) 2008, VMware, Inc. 3 + * 4 + * This program is free software; you can redistribute it and/or modify 5 + * it under the terms of the GNU General Public License as published by 6 + * the Free Software Foundation; either version 2 of the License, or 7 + * (at your option) any later version. 8 + * 9 + * This program is distributed in the hope that it will be useful, but 10 + * WITHOUT ANY WARRANTY; without even the implied warranty of 11 + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or 12 + * NON INFRINGEMENT. See the GNU General Public License for more 13 + * details. 14 + * 15 + * You should have received a copy of the GNU General Public License 16 + * along with this program; if not, write to the Free Software 17 + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 18 + * 19 + */ 20 + #ifndef ASM_X86__HYPERVISOR_H 21 + #define ASM_X86__HYPERVISOR_H 22 + 23 + extern unsigned long get_hypervisor_tsc_freq(void); 24 + extern void init_hypervisor(struct cpuinfo_x86 *c); 25 + 26 + #endif
-18
arch/x86/include/asm/ia32.h
··· 129 129 } _sifields; 130 130 } compat_siginfo_t; 131 131 132 - struct sigframe32 { 133 - u32 pretcode; 134 - int sig; 135 - struct sigcontext_ia32 sc; 136 - struct _fpstate_ia32 fpstate; 137 - unsigned int extramask[_COMPAT_NSIG_WORDS-1]; 138 - }; 139 - 140 - struct rt_sigframe32 { 141 - u32 pretcode; 142 - int sig; 143 - u32 pinfo; 144 - u32 puc; 145 - compat_siginfo_t info; 146 - struct ucontext_ia32 uc; 147 - struct _fpstate_ia32 fpstate; 148 - }; 149 - 150 132 struct ustat32 { 151 133 __u32 f_tfree; 152 134 compat_ino_t f_tinode;
+5
arch/x86/include/asm/idle.h
··· 8 8 void idle_notifier_register(struct notifier_block *n); 9 9 void idle_notifier_unregister(struct notifier_block *n); 10 10 11 + #ifdef CONFIG_X86_64 11 12 void enter_idle(void); 12 13 void exit_idle(void); 14 + #else /* !CONFIG_X86_64 */ 15 + static inline void enter_idle(void) { } 16 + static inline void exit_idle(void) { } 17 + #endif /* CONFIG_X86_64 */ 13 18 14 19 void c1e_remove_cpu(int cpu); 15 20
+29 -8
arch/x86/include/asm/io.h
··· 4 4 #define ARCH_HAS_IOREMAP_WC 5 5 6 6 #include <linux/compiler.h> 7 + #include <asm-generic/int-ll64.h> 7 8 8 9 #define build_mmio_read(name, size, type, reg, barrier) \ 9 10 static inline type name(const volatile void __iomem *addr) \ ··· 46 45 #define mmiowb() barrier() 47 46 48 47 #ifdef CONFIG_X86_64 48 + 49 49 build_mmio_read(readq, "q", unsigned long, "=r", :"memory") 50 - build_mmio_read(__readq, "q", unsigned long, "=r", ) 51 50 build_mmio_write(writeq, "q", unsigned long, "r", :"memory") 52 - build_mmio_write(__writeq, "q", unsigned long, "r", ) 53 51 54 - #define readq_relaxed(a) __readq(a) 55 - #define __raw_readq __readq 56 - #define __raw_writeq writeq 52 + #else 57 53 58 - /* Let people know we have them */ 59 - #define readq readq 60 - #define writeq writeq 54 + static inline __u64 readq(const volatile void __iomem *addr) 55 + { 56 + const volatile u32 __iomem *p = addr; 57 + u32 low, high; 58 + 59 + low = readl(p); 60 + high = readl(p + 1); 61 + 62 + return low + ((u64)high << 32); 63 + } 64 + 65 + static inline void writeq(__u64 val, volatile void __iomem *addr) 66 + { 67 + writel(val, addr); 68 + writel(val >> 32, addr+4); 69 + } 70 + 61 71 #endif 72 + 73 + #define readq_relaxed(a) readq(a) 74 + 75 + #define __raw_readq(a) readq(a) 76 + #define __raw_writeq(val, addr) writeq(val, addr) 77 + 78 + /* Let people know that we have them */ 79 + #define readq readq 80 + #define writeq writeq 62 81 63 82 extern int iommu_bio_merge; 64 83
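The io.h hunk above gives 32-bit kernels a readq/writeq built from two 32-bit accesses, low word first. A user-space sketch of the same split-and-combine arithmetic, with plain memory standing in for the MMIO window; real MMIO needs volatile accessors, and the split is not atomic, which is exactly why this is only a fallback:

```c
#include <stdint.h>

/*
 * Sketch of the 32-bit readq/writeq fallback added above: one 64-bit
 * access is performed as two 32-bit halves, low word first. Ordinary
 * memory stands in for the MMIO region here (an assumption); the
 * ordering and atomicity caveats of real device memory are ignored.
 */
static uint32_t sim_readl(const volatile uint32_t *p)
{
	return *p;
}

static void sim_writel(uint32_t val, volatile uint32_t *p)
{
	*p = val;
}

static uint64_t sim_readq(const volatile uint32_t *p)
{
	uint32_t low = sim_readl(p);
	uint32_t high = sim_readl(p + 1);

	return low + ((uint64_t)high << 32);	/* same combine as the patch */
}

static void sim_writeq(uint64_t val, volatile uint32_t *p)
{
	sim_writel((uint32_t)val, p);		/* low word first */
	sim_writel((uint32_t)(val >> 32), p + 1);
}
```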
+10
arch/x86/include/asm/io_apic.h
··· 156 156 /* 1 if "noapic" boot option passed */ 157 157 extern int skip_ioapic_setup; 158 158 159 + /* 1 if "noapic" boot option passed */ 160 + extern int noioapicquirk; 161 + 162 + /* -1 if "noapic" boot option passed */ 163 + extern int noioapicreroute; 164 + 159 165 /* 1 if the timer IRQ uses the '8259A Virtual Wire' mode */ 160 166 extern int timer_through_8259; 161 167 162 168 static inline void disable_ioapic_setup(void) 163 169 { 170 + #ifdef CONFIG_PCI 171 + noioapicquirk = 1; 172 + noioapicreroute = -1; 173 + #endif 164 174 skip_ioapic_setup = 1; 165 175 } 166 176
-4
arch/x86/include/asm/irq.h
··· 31 31 # endif 32 32 #endif 33 33 34 - #ifdef CONFIG_IRQBALANCE 35 - extern int irqbalance_disable(char *str); 36 - #endif 37 - 38 34 #ifdef CONFIG_HOTPLUG_CPU 39 35 #include <linux/cpumask.h> 40 36 extern void fixup_irqs(cpumask_t map);
+2
arch/x86/include/asm/irq_regs_32.h
··· 9 9 10 10 #include <asm/percpu.h> 11 11 12 + #define ARCH_HAS_OWN_IRQ_REGS 13 + 12 14 DECLARE_PER_CPU(struct pt_regs *, irq_regs); 13 15 14 16 static inline struct pt_regs *get_irq_regs(void)
+16 -15
arch/x86/include/asm/kexec.h
··· 5 5 # define PA_CONTROL_PAGE 0 6 6 # define VA_CONTROL_PAGE 1 7 7 # define PA_PGD 2 8 - # define VA_PGD 3 9 - # define PA_PTE_0 4 10 - # define VA_PTE_0 5 11 - # define PA_PTE_1 6 12 - # define VA_PTE_1 7 13 - # define PA_SWAP_PAGE 8 14 - # ifdef CONFIG_X86_PAE 15 - # define PA_PMD_0 9 16 - # define VA_PMD_0 10 17 - # define PA_PMD_1 11 18 - # define VA_PMD_1 12 19 - # define PAGES_NR 13 20 - # else 21 - # define PAGES_NR 9 22 - # endif 8 + # define PA_SWAP_PAGE 3 9 + # define PAGES_NR 4 23 10 #else 24 11 # define PA_CONTROL_PAGE 0 25 12 # define VA_CONTROL_PAGE 1 ··· 155 168 relocate_kernel(unsigned long indirection_page, 156 169 unsigned long page_list, 157 170 unsigned long start_address) ATTRIB_NORET; 171 + #endif 172 + 173 + #ifdef CONFIG_X86_32 174 + #define ARCH_HAS_KIMAGE_ARCH 175 + 176 + struct kimage_arch { 177 + pgd_t *pgd; 178 + #ifdef CONFIG_X86_PAE 179 + pmd_t *pmd0; 180 + pmd_t *pmd1; 181 + #endif 182 + pte_t *pte0; 183 + pte_t *pte1; 184 + }; 158 185 #endif 159 186 160 187 #endif /* __ASSEMBLY__ */
+2
arch/x86/include/asm/mach-default/mach_apic.h
··· 32 32 #define vector_allocation_domain (genapic->vector_allocation_domain) 33 33 #define read_apic_id() (GET_APIC_ID(apic_read(APIC_ID))) 34 34 #define send_IPI_self (genapic->send_IPI_self) 35 + #define wakeup_secondary_cpu (genapic->wakeup_cpu) 35 36 extern void setup_apic_routing(void); 36 37 #else 37 38 #define INT_DELIVERY_MODE dest_LowestPrio 38 39 #define INT_DEST_MODE 1 /* logical delivery broadcast to all procs */ 39 40 #define TARGET_CPUS (target_cpus()) 41 + #define wakeup_secondary_cpu wakeup_secondary_cpu_via_init 40 42 /* 41 43 * Set up the logical destination ID. 42 44 *
+9 -15
arch/x86/include/asm/mach-default/mach_wakecpu.h
··· 1 1 #ifndef _ASM_X86_MACH_DEFAULT_MACH_WAKECPU_H 2 2 #define _ASM_X86_MACH_DEFAULT_MACH_WAKECPU_H 3 3 4 - /* 5 - * This file copes with machines that wakeup secondary CPUs by the 6 - * INIT, INIT, STARTUP sequence. 7 - */ 8 - 9 - #define WAKE_SECONDARY_VIA_INIT 10 - 11 - #define TRAMPOLINE_LOW phys_to_virt(0x467) 12 - #define TRAMPOLINE_HIGH phys_to_virt(0x469) 13 - 14 - #define boot_cpu_apicid boot_cpu_physical_apicid 4 + #define TRAMPOLINE_PHYS_LOW (0x467) 5 + #define TRAMPOLINE_PHYS_HIGH (0x469) 15 6 16 7 static inline void wait_for_init_deassert(atomic_t *deassert) 17 8 { ··· 24 33 { 25 34 } 26 35 27 - #define inquire_remote_apic(apicid) do { \ 28 - if (apic_verbosity >= APIC_DEBUG) \ 29 - __inquire_remote_apic(apicid); \ 30 - } while (0) 36 + extern void __inquire_remote_apic(int apicid); 37 + 38 + static inline void inquire_remote_apic(int apicid) 39 + { 40 + if (apic_verbosity >= APIC_DEBUG) 41 + __inquire_remote_apic(apicid); 42 + } 31 43 32 44 #endif /* _ASM_X86_MACH_DEFAULT_MACH_WAKECPU_H */
+5 -3
arch/x86/include/asm/mach-default/smpboot_hooks.h
··· 13 13 CMOS_WRITE(0xa, 0xf); 14 14 local_flush_tlb(); 15 15 pr_debug("1.\n"); 16 - *((volatile unsigned short *) TRAMPOLINE_HIGH) = start_eip >> 4; 16 + *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH)) = 17 + start_eip >> 4; 17 18 pr_debug("2.\n"); 18 - *((volatile unsigned short *) TRAMPOLINE_LOW) = start_eip & 0xf; 19 + *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) = 20 + start_eip & 0xf; 19 21 pr_debug("3.\n"); 20 22 } 21 23 ··· 34 32 */ 35 33 CMOS_WRITE(0, 0xf); 36 34 37 - *((volatile long *) phys_to_virt(0x467)) = 0; 35 + *((volatile long *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) = 0; 38 36 } 39 37 40 38 static inline void __init smpboot_setup_io_apic(void)
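The smpboot_hooks.h change above writes the trampoline entry point through the BIOS warm-reset vector: start_eip >> 4 becomes the real-mode segment word at physical 0x469 (TRAMPOLINE_PHYS_HIGH) and start_eip & 0xf the offset word at 0x467 (TRAMPOLINE_PHYS_LOW), which the waking CPU recombines as segment * 16 + offset. A sketch of that encoding; the struct is an illustrative stand-in for the two physical words, not a kernel type:

```c
#include <stdint.h>

/*
 * Sketch of the warm-reset vector encoding used by
 * smpboot_setup_warm_reset_vector() above. The struct models the two
 * 16-bit words at physical 0x467/0x469 (an illustrative stand-in).
 */
struct warm_reset_vector {
	uint16_t offset;	/* word at phys 0x467, TRAMPOLINE_PHYS_LOW  */
	uint16_t segment;	/* word at phys 0x469, TRAMPOLINE_PHYS_HIGH */
};

static void set_warm_reset_vector(struct warm_reset_vector *v,
				  uint32_t start_eip)
{
	v->segment = (uint16_t)(start_eip >> 4);	/* high word */
	v->offset  = (uint16_t)(start_eip & 0xf);	/* low word  */
}

/* Real-mode CPUs resolve segment:offset as segment * 16 + offset. */
static uint32_t real_mode_target(const struct warm_reset_vector *v)
{
	return (uint32_t)v->segment * 16 + v->offset;
}
```

Any entry point below 1 MiB round-trips exactly, since the low nibble lands in the offset and the remaining bits in the segment.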
+1
arch/x86/include/asm/mach-generic/mach_apic.h
··· 27 27 #define vector_allocation_domain (genapic->vector_allocation_domain) 28 28 #define enable_apic_mode (genapic->enable_apic_mode) 29 29 #define phys_pkg_id (genapic->phys_pkg_id) 30 + #define wakeup_secondary_cpu (genapic->wakeup_cpu) 30 31 31 32 extern void generic_bigsmp_probe(void); 32 33
+12
arch/x86/include/asm/mach-generic/mach_wakecpu.h
··· 1 + #ifndef _ASM_X86_MACH_GENERIC_MACH_WAKECPU_H 2 + #define _ASM_X86_MACH_GENERIC_MACH_WAKECPU_H 3 + 4 + #define TRAMPOLINE_PHYS_LOW (genapic->trampoline_phys_low) 5 + #define TRAMPOLINE_PHYS_HIGH (genapic->trampoline_phys_high) 6 + #define wait_for_init_deassert (genapic->wait_for_init_deassert) 7 + #define smp_callin_clear_local_apic (genapic->smp_callin_clear_local_apic) 8 + #define store_NMI_vector (genapic->store_NMI_vector) 9 + #define restore_NMI_vector (genapic->restore_NMI_vector) 10 + #define inquire_remote_apic (genapic->inquire_remote_apic) 11 + 12 + #endif /* _ASM_X86_MACH_GENERIC_MACH_APIC_H */
+6 -7
arch/x86/include/asm/mmu_context_32.h
··· 4 4 static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk) 5 5 { 6 6 #ifdef CONFIG_SMP 7 - unsigned cpu = smp_processor_id(); 8 - if (per_cpu(cpu_tlbstate, cpu).state == TLBSTATE_OK) 9 - per_cpu(cpu_tlbstate, cpu).state = TLBSTATE_LAZY; 7 + if (x86_read_percpu(cpu_tlbstate.state) == TLBSTATE_OK) 8 + x86_write_percpu(cpu_tlbstate.state, TLBSTATE_LAZY); 10 9 #endif 11 10 } 12 11 ··· 19 20 /* stop flush ipis for the previous mm */ 20 21 cpu_clear(cpu, prev->cpu_vm_mask); 21 22 #ifdef CONFIG_SMP 22 - per_cpu(cpu_tlbstate, cpu).state = TLBSTATE_OK; 23 - per_cpu(cpu_tlbstate, cpu).active_mm = next; 23 + x86_write_percpu(cpu_tlbstate.state, TLBSTATE_OK); 24 + x86_write_percpu(cpu_tlbstate.active_mm, next); 24 25 #endif 25 26 cpu_set(cpu, next->cpu_vm_mask); 26 27 ··· 35 36 } 36 37 #ifdef CONFIG_SMP 37 38 else { 38 - per_cpu(cpu_tlbstate, cpu).state = TLBSTATE_OK; 39 - BUG_ON(per_cpu(cpu_tlbstate, cpu).active_mm != next); 39 + x86_write_percpu(cpu_tlbstate.state, TLBSTATE_OK); 40 + BUG_ON(x86_read_percpu(cpu_tlbstate.active_mm) != next); 40 41 41 42 if (!cpu_test_and_set(cpu, next->cpu_vm_mask)) { 42 43 /* We were in lazy tlb mode and leave_mm disabled
+2
arch/x86/include/asm/msr-index.h
··· 85 85 /* AMD64 MSRs. Not complete. See the architecture manual for a more 86 86 complete list. */ 87 87 88 + #define MSR_AMD64_PATCH_LEVEL 0x0000008b 88 89 #define MSR_AMD64_NB_CFG 0xc001001f 90 + #define MSR_AMD64_PATCH_LOADER 0xc0010020 89 91 #define MSR_AMD64_IBSFETCHCTL 0xc0011030 90 92 #define MSR_AMD64_IBSFETCHLINAD 0xc0011031 91 93 #define MSR_AMD64_IBSFETCHPHYSAD 0xc0011032
+6 -6
arch/x86/include/asm/msr.h
··· 22 22 } 23 23 24 24 /* 25 - * i386 calling convention returns 64-bit value in edx:eax, while 26 - * x86_64 returns at rax. Also, the "A" constraint does not really 27 - * mean rdx:rax in x86_64, so we need specialized behaviour for each 28 - * architecture 25 + * both i386 and x86_64 returns 64-bit value in edx:eax, but gcc's "A" 26 + * constraint has different meanings. For i386, "A" means exactly 27 + * edx:eax, while for x86_64 it doesn't mean rdx:rax or edx:eax. Instead, 28 + * it means rax *or* rdx. 29 29 */ 30 30 #ifdef CONFIG_X86_64 31 31 #define DECLARE_ARGS(val, low, high) unsigned low, high ··· 181 181 } 182 182 183 183 #define rdtscl(low) \ 184 - ((low) = (u32)native_read_tsc()) 184 + ((low) = (u32)__native_read_tsc()) 185 185 186 186 #define rdtscll(val) \ 187 - ((val) = native_read_tsc()) 187 + ((val) = __native_read_tsc()) 188 188 189 189 #define rdpmc(counter, low, high) \ 190 190 do { \
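The corrected msr.h comment above concerns the EDX:EAX return convention: RDMSR and RDTSC place the low 32 bits in EAX and the high 32 bits in EDX on both i386 and x86-64, and the kernel glues them together with an EAX_EDX_VAL()-style combine. A sketch of that combine; the function name is illustrative and no real MSR is read:

```c
#include <stdint.h>

/*
 * Sketch of the EDX:EAX combine described by the revised comment:
 * low 32 bits from EAX, high 32 bits from EDX, merged into one
 * 64-bit value. Names are illustrative stand-ins for the kernel's
 * DECLARE_ARGS()/EAX_EDX_VAL() macros.
 */
static uint64_t eax_edx_val(uint32_t eax, uint32_t edx)
{
	return (uint64_t)eax | ((uint64_t)edx << 32);
}
```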
+13 -11
arch/x86/include/asm/numaq/wakecpu.h
··· 3 3 4 4 /* This file copes with machines that wakeup secondary CPUs by NMIs */ 5 5 6 - #define WAKE_SECONDARY_VIA_NMI 7 - 8 - #define TRAMPOLINE_LOW phys_to_virt(0x8) 9 - #define TRAMPOLINE_HIGH phys_to_virt(0xa) 10 - 11 - #define boot_cpu_apicid boot_cpu_logical_apicid 6 + #define TRAMPOLINE_PHYS_LOW (0x8) 7 + #define TRAMPOLINE_PHYS_HIGH (0xa) 12 8 13 9 /* We don't do anything here because we use NMI's to boot instead */ 14 10 static inline void wait_for_init_deassert(atomic_t *deassert) ··· 23 27 static inline void store_NMI_vector(unsigned short *high, unsigned short *low) 24 28 { 25 29 printk("Storing NMI vector\n"); 26 - *high = *((volatile unsigned short *) TRAMPOLINE_HIGH); 27 - *low = *((volatile unsigned short *) TRAMPOLINE_LOW); 30 + *high = 31 + *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH)); 32 + *low = 33 + *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW)); 28 34 } 29 35 30 36 static inline void restore_NMI_vector(unsigned short *high, unsigned short *low) 31 37 { 32 38 printk("Restoring NMI vector\n"); 33 - *((volatile unsigned short *) TRAMPOLINE_HIGH) = *high; 34 - *((volatile unsigned short *) TRAMPOLINE_LOW) = *low; 39 + *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH)) = 40 + *high; 41 + *((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) = 42 + *low; 35 43 } 36 44 37 - #define inquire_remote_apic(apicid) {} 45 + static inline void inquire_remote_apic(int apicid) 46 + { 47 + } 38 48 39 49 #endif /* __ASM_NUMAQ_WAKECPU_H */
+2
arch/x86/include/asm/pci.h
··· 19 19 }; 20 20 21 21 extern int pci_routeirq; 22 + extern int noioapicquirk; 23 + extern int noioapicreroute; 22 24 23 25 /* scan a bus after allocating a pci_sysdata for it */ 24 26 extern struct pci_bus *pci_scan_bus_on_node(int busno, struct pci_ops *ops,
+41 -9
arch/x86/include/asm/pgtable-2level.h
··· 56 56 #define pte_none(x) (!(x).pte_low) 57 57 58 58 /* 59 - * Bits 0, 6 and 7 are taken, split up the 29 bits of offset 60 - * into this range: 59 + * Bits _PAGE_BIT_PRESENT, _PAGE_BIT_FILE and _PAGE_BIT_PROTNONE are taken, 60 + * split up the 29 bits of offset into this range: 61 61 */ 62 62 #define PTE_FILE_MAX_BITS 29 63 + #define PTE_FILE_SHIFT1 (_PAGE_BIT_PRESENT + 1) 64 + #if _PAGE_BIT_FILE < _PAGE_BIT_PROTNONE 65 + #define PTE_FILE_SHIFT2 (_PAGE_BIT_FILE + 1) 66 + #define PTE_FILE_SHIFT3 (_PAGE_BIT_PROTNONE + 1) 67 + #else 68 + #define PTE_FILE_SHIFT2 (_PAGE_BIT_PROTNONE + 1) 69 + #define PTE_FILE_SHIFT3 (_PAGE_BIT_FILE + 1) 70 + #endif 71 + #define PTE_FILE_BITS1 (PTE_FILE_SHIFT2 - PTE_FILE_SHIFT1 - 1) 72 + #define PTE_FILE_BITS2 (PTE_FILE_SHIFT3 - PTE_FILE_SHIFT2 - 1) 63 73 64 74 #define pte_to_pgoff(pte) \ 65 - ((((pte).pte_low >> 1) & 0x1f) + (((pte).pte_low >> 8) << 5)) 75 + ((((pte).pte_low >> PTE_FILE_SHIFT1) \ 76 + & ((1U << PTE_FILE_BITS1) - 1)) \ 77 + + ((((pte).pte_low >> PTE_FILE_SHIFT2) \ 78 + & ((1U << PTE_FILE_BITS2) - 1)) << PTE_FILE_BITS1) \ 79 + + (((pte).pte_low >> PTE_FILE_SHIFT3) \ 80 + << (PTE_FILE_BITS1 + PTE_FILE_BITS2))) 66 82 67 83 #define pgoff_to_pte(off) \ 68 - ((pte_t) { .pte_low = (((off) & 0x1f) << 1) + \ 69 - (((off) >> 5) << 8) + _PAGE_FILE }) 83 + ((pte_t) { .pte_low = \ 84 + (((off) & ((1U << PTE_FILE_BITS1) - 1)) << PTE_FILE_SHIFT1) \ 85 + + ((((off) >> PTE_FILE_BITS1) & ((1U << PTE_FILE_BITS2) - 1)) \ 86 + << PTE_FILE_SHIFT2) \ 87 + + (((off) >> (PTE_FILE_BITS1 + PTE_FILE_BITS2)) \ 88 + << PTE_FILE_SHIFT3) \ 89 + + _PAGE_FILE }) 70 90 71 91 /* Encode and de-code a swap entry */ 72 - #define __swp_type(x) (((x).val >> 1) & 0x1f) 73 - #define __swp_offset(x) ((x).val >> 8) 74 - #define __swp_entry(type, offset) \ 75 - ((swp_entry_t) { ((type) << 1) | ((offset) << 8) }) 92 + #if _PAGE_BIT_FILE < _PAGE_BIT_PROTNONE 93 + #define SWP_TYPE_BITS (_PAGE_BIT_FILE - _PAGE_BIT_PRESENT - 1) 94 + #define SWP_OFFSET_SHIFT (_PAGE_BIT_PROTNONE + 1) 95 + #else 96 + #define SWP_TYPE_BITS (_PAGE_BIT_PROTNONE - _PAGE_BIT_PRESENT - 1) 97 + #define SWP_OFFSET_SHIFT (_PAGE_BIT_FILE + 1) 98 + #endif 99 + 100 + #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS) 101 + 102 + #define __swp_type(x) (((x).val >> (_PAGE_BIT_PRESENT + 1)) \ 103 + & ((1U << SWP_TYPE_BITS) - 1)) 104 + #define __swp_offset(x) ((x).val >> SWP_OFFSET_SHIFT) 105 + #define __swp_entry(type, offset) ((swp_entry_t) { \ 106 + ((type) << (_PAGE_BIT_PRESENT + 1)) \ 107 + | ((offset) << SWP_OFFSET_SHIFT) }) 76 108 #define __pte_to_swp_entry(pte) ((swp_entry_t) { (pte).pte_low }) 77 109 #define __swp_entry_to_pte(x) ((pte_t) { .pte = (x).val }) 78 110
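The reworked swap-entry encoding above derives the type and offset fields from the PTE bit positions instead of hard-coding shifts, so a swap entry can never collide with the present, file, or protnone bits. A user-space sketch with the 2-level values from this series (present at bit 0, file aliasing dirty at bit 6, protnone aliasing global at bit 8); all names are simplified stand-ins for the kernel macros:

```c
#include <stdint.h>

/*
 * Sketch of the 2-level swap-entry layout: the type field sits just
 * above the present bit, the offset above both reserved bits, so a
 * swap PTE never looks present. Bit positions mirror this series;
 * the names are stand-ins, not the kernel's.
 */
#define PAGE_BIT_PRESENT	0
#define PAGE_BIT_FILE		6	/* aliases the dirty bit  */
#define PAGE_BIT_PROTNONE	8	/* aliases the global bit */

#define SWP_TYPE_BITS		(PAGE_BIT_FILE - PAGE_BIT_PRESENT - 1)	/* 5 */
#define SWP_OFFSET_SHIFT	(PAGE_BIT_PROTNONE + 1)			/* 9 */

static uint32_t swp_entry(uint32_t type, uint32_t offset)
{
	return (type << (PAGE_BIT_PRESENT + 1)) | (offset << SWP_OFFSET_SHIFT);
}

static uint32_t swp_type(uint32_t val)
{
	return (val >> (PAGE_BIT_PRESENT + 1)) & ((1U << SWP_TYPE_BITS) - 1);
}

static uint32_t swp_offset(uint32_t val)
{
	return val >> SWP_OFFSET_SHIFT;
}
```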
+1
arch/x86/include/asm/pgtable-3level.h
··· 166 166 #define PTE_FILE_MAX_BITS 32 167 167 168 168 /* Encode and de-code a swap entry */ 169 + #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5) 169 170 #define __swp_type(x) (((x).val) & 0x1f) 170 171 #define __swp_offset(x) ((x).val >> 5) 171 172 #define __swp_entry(type, offset) ((swp_entry_t){(type) | (offset) << 5})
+22 -6
arch/x86/include/asm/pgtable.h
··· 10 10 #define _PAGE_BIT_PCD 4 /* page cache disabled */ 11 11 #define _PAGE_BIT_ACCESSED 5 /* was accessed (raised by CPU) */ 12 12 #define _PAGE_BIT_DIRTY 6 /* was written to (raised by CPU) */ 13 - #define _PAGE_BIT_FILE 6 14 13 #define _PAGE_BIT_PSE 7 /* 4 MB (or 2MB) page */ 15 14 #define _PAGE_BIT_PAT 7 /* on 4KB pages */ 16 15 #define _PAGE_BIT_GLOBAL 8 /* Global TLB entry PPro+ */ ··· 20 21 #define _PAGE_BIT_SPECIAL _PAGE_BIT_UNUSED1 21 22 #define _PAGE_BIT_CPA_TEST _PAGE_BIT_UNUSED1 22 23 #define _PAGE_BIT_NX 63 /* No execute: only valid after cpuid check */ 24 + 25 + /* If _PAGE_BIT_PRESENT is clear, we use these: */ 26 + /* - if the user mapped it with PROT_NONE; pte_present gives true */ 27 + #define _PAGE_BIT_PROTNONE _PAGE_BIT_GLOBAL 28 + /* - set: nonlinear file mapping, saved PTE; unset:swap */ 29 + #define _PAGE_BIT_FILE _PAGE_BIT_DIRTY 23 30 24 31 #define _PAGE_PRESENT (_AT(pteval_t, 1) << _PAGE_BIT_PRESENT) 25 32 #define _PAGE_RW (_AT(pteval_t, 1) << _PAGE_BIT_RW) ··· 51 46 #define _PAGE_NX (_AT(pteval_t, 0)) 52 47 #endif 53 48 54 - /* If _PAGE_PRESENT is clear, we use these: */ 55 - #define _PAGE_FILE _PAGE_DIRTY /* nonlinear file mapping, 56 - * saved PTE; unset:swap */ 57 - #define _PAGE_PROTNONE _PAGE_PSE /* if the user mapped it with PROT_NONE; 58 - pte_present gives true */ 49 + #define _PAGE_FILE (_AT(pteval_t, 1) << _PAGE_BIT_FILE) 50 + #define _PAGE_PROTNONE (_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE) 59 51 60 52 #define _PAGE_TABLE (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | \ 61 53 _PAGE_ACCESSED | _PAGE_DIRTY) ··· 160 158 #define PGD_IDENT_ATTR 0x001 /* PRESENT (no other attributes) */ 161 159 #endif 162 160 161 + /* 162 + * Macro to mark a page protection value as UC- 163 + */ 164 + #define pgprot_noncached(prot) \ 165 + ((boot_cpu_data.x86 > 3) \ 166 + ? (__pgprot(pgprot_val(prot) | _PAGE_CACHE_UC_MINUS)) \ 167 + : (prot)) 168 + 163 169 #ifndef __ASSEMBLY__ 170 + 171 + #define pgprot_writecombine pgprot_writecombine 172 + extern pgprot_t pgprot_writecombine(pgprot_t prot); 164 173 165 174 /* 166 175 * ZERO_PAGE is a global shared page that is always zero: used ··· 342 329 #define canon_pgprot(p) __pgprot(pgprot_val(p) & __supported_pte_mask) 343 330 344 331 #ifndef __ASSEMBLY__ 332 + /* Indicate that x86 has its own track and untrack pfn vma functions */ 333 + #define __HAVE_PFNMAP_TRACKING 334 + 345 335 #define __HAVE_PHYS_MEM_ACCESS_PROT 346 336 struct file; 347 337 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
-9
arch/x86/include/asm/pgtable_32.h
··· 101 101 #endif 102 102 103 103 /* 104 - * Macro to mark a page protection value as "uncacheable". 105 - * On processors which do not support it, this is a no-op. 106 - */ 107 - #define pgprot_noncached(prot) \ 108 - ((boot_cpu_data.x86 > 3) \ 109 - ? (__pgprot(pgprot_val(prot) | _PAGE_PCD | _PAGE_PWT)) \ 110 - : (prot)) 111 - 112 - /* 113 104 * Conversion functions: convert a page and protection to a page entry, 114 105 * and a page entry and page directory to the page they refer to. 115 106 */
+17 -11
arch/x86/include/asm/pgtable_64.h
··· 146 146 #define PGDIR_MASK (~(PGDIR_SIZE - 1)) 147 147 148 148 149 - #define MAXMEM _AC(0x00003fffffffffff, UL) 149 + #define MAXMEM _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL) 150 150 #define VMALLOC_START _AC(0xffffc20000000000, UL) 151 151 #define VMALLOC_END _AC(0xffffe1ffffffffff, UL) 152 152 #define VMEMMAP_START _AC(0xffffe20000000000, UL) ··· 175 175 #define pte_present(x) (pte_val((x)) & (_PAGE_PRESENT | _PAGE_PROTNONE)) 176 176 177 177 #define pages_to_mb(x) ((x) >> (20 - PAGE_SHIFT)) /* FIXME: is this right? */ 178 - 179 - /* 180 - * Macro to mark a page protection value as "uncacheable". 181 - */ 182 - #define pgprot_noncached(prot) \ 183 - (__pgprot(pgprot_val((prot)) | _PAGE_PCD | _PAGE_PWT)) 184 178 185 179 /* 186 180 * Conversion functions: convert a page and protection to a page entry, ··· 244 250 extern int direct_gbpages; 245 251 246 252 /* Encode and de-code a swap entry */ 247 - #define __swp_type(x) (((x).val >> 1) & 0x3f) 248 - #define __swp_offset(x) ((x).val >> 8) 249 - #define __swp_entry(type, offset) ((swp_entry_t) { ((type) << 1) | \ 250 - ((offset) << 8) }) 253 + #if _PAGE_BIT_FILE < _PAGE_BIT_PROTNONE 254 + #define SWP_TYPE_BITS (_PAGE_BIT_FILE - _PAGE_BIT_PRESENT - 1) 255 + #define SWP_OFFSET_SHIFT (_PAGE_BIT_PROTNONE + 1) 256 + #else 257 + #define SWP_TYPE_BITS (_PAGE_BIT_PROTNONE - _PAGE_BIT_PRESENT - 1) 258 + #define SWP_OFFSET_SHIFT (_PAGE_BIT_FILE + 1) 259 + #endif 260 + 261 + #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS) 262 + 263 + #define __swp_type(x) (((x).val >> (_PAGE_BIT_PRESENT + 1)) \ 264 + & ((1U << SWP_TYPE_BITS) - 1)) 265 + #define __swp_offset(x) ((x).val >> SWP_OFFSET_SHIFT) 266 + #define __swp_entry(type, offset) ((swp_entry_t) { \ 267 + ((type) << (_PAGE_BIT_PRESENT + 1)) \ 268 + | ((offset) << SWP_OFFSET_SHIFT) }) 251 269 #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val((pte)) }) 252 270 #define __swp_entry_to_pte(x) ((pte_t) { .pte = (x).val }) 253 271
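MAX_SWAPFILES_CHECK() above is a compile-time guard that the derived SWP_TYPE_BITS still covers MAX_SWAPFILES_SHIFT. A sketch of the same guard using a plain C negative-array-size trick in place of BUILD_BUG_ON; the constants mirror this series' bit layout, and MAX_SWAPFILES_SHIFT = 5 is assumed from <linux/swap.h> of this kernel generation:

```c
/*
 * Compile-time guard in the spirit of MAX_SWAPFILES_CHECK(): if the
 * swap-type field ever shrinks below MAX_SWAPFILES_SHIFT, the typedef
 * below gets a negative array size and compilation fails. Bit
 * positions mirror this series; names are illustrative stand-ins.
 */
#define PAGE_BIT_PRESENT	0
#define PAGE_BIT_FILE		6	/* aliases the dirty bit  */
#define PAGE_BIT_PROTNONE	8	/* aliases the global bit */

#define SWP_TYPE_BITS		(PAGE_BIT_FILE - PAGE_BIT_PRESENT - 1)
#define MAX_SWAPFILES_SHIFT	5	/* assumed from <linux/swap.h> */

/* BUILD_BUG_ON stand-in: negative array size on failure */
typedef char swp_type_bits_cover_max_swapfiles[
	(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS) ? -1 : 1];
```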
+3
arch/x86/include/asm/prctl.h
··· 6 6 #define ARCH_GET_FS 0x1003 7 7 #define ARCH_GET_GS 0x1004 8 8 9 + #ifdef CONFIG_X86_64 10 + extern long sys_arch_prctl(int, unsigned long); 11 + #endif /* CONFIG_X86_64 */ 9 12 10 13 #endif /* _ASM_X86_PRCTL_H */
+4
arch/x86/include/asm/processor.h
··· 110 110 /* Index into per_cpu list: */ 111 111 u16 cpu_index; 112 112 #endif 113 + unsigned int x86_hyper_vendor; 113 114 } __attribute__((__aligned__(SMP_CACHE_BYTES))); 114 115 115 116 #define X86_VENDOR_INTEL 0 ··· 123 122 #define X86_VENDOR_NUM 9 124 123 125 124 #define X86_VENDOR_UNKNOWN 0xff 125 + 126 + #define X86_HYPER_VENDOR_NONE 0 127 + #define X86_HYPER_VENDOR_VMWARE 1 126 128 127 129 /* 128 130 * capabilities of CPUs
+5
arch/x86/include/asm/reboot.h
··· 1 1 #ifndef _ASM_X86_REBOOT_H 2 2 #define _ASM_X86_REBOOT_H 3 3 4 + #include <linux/kdebug.h> 5 + 4 6 struct pt_regs; 5 7 6 8 struct machine_ops { ··· 19 17 void native_machine_crash_shutdown(struct pt_regs *regs); 20 18 void native_machine_shutdown(void); 21 19 void machine_real_restart(const unsigned char *code, int length); 20 + 21 + typedef void (*nmi_shootdown_cb)(int, struct die_args*); 22 + void nmi_shootdown_cpus(nmi_shootdown_cb callback); 22 23 23 24 #endif /* _ASM_X86_REBOOT_H */
+7
arch/x86/include/asm/setup.h
··· 8 8 /* Interrupt control for vSMPowered x86_64 systems */ 9 9 void vsmp_init(void); 10 10 11 + 12 + void setup_bios_corruption_check(void); 13 + 14 + 11 15 #ifdef CONFIG_X86_VISWS 12 16 extern void visws_early_detect(void); 13 17 extern int is_visws_box(void); ··· 20 16 static inline int is_visws_box(void) { return 0; } 21 17 #endif 22 18 19 + extern int wakeup_secondary_cpu_via_nmi(int apicid, unsigned long start_eip); 20 + extern int wakeup_secondary_cpu_via_init(int apicid, unsigned long start_eip); 23 21 /* 24 22 * Any setup quirks to be performed? 25 23 */ ··· 45 39 void (*smp_read_mpc_oem)(struct mp_config_oemtable *oemtable, 46 40 unsigned short oemsize); 47 41 int (*setup_ioapic_ids)(void); 42 + int (*update_genapic)(void); 48 43 }; 49 44 50 45 extern struct x86_quirks *x86_quirks;
+70
arch/x86/include/asm/sigframe.h
··· 1 + #ifndef _ASM_X86_SIGFRAME_H 2 + #define _ASM_X86_SIGFRAME_H 3 + 4 + #include <asm/sigcontext.h> 5 + #include <asm/siginfo.h> 6 + #include <asm/ucontext.h> 7 + 8 + #ifdef CONFIG_X86_32 9 + #define sigframe_ia32 sigframe 10 + #define rt_sigframe_ia32 rt_sigframe 11 + #define sigcontext_ia32 sigcontext 12 + #define _fpstate_ia32 _fpstate 13 + #define ucontext_ia32 ucontext 14 + #else /* !CONFIG_X86_32 */ 15 + 16 + #ifdef CONFIG_IA32_EMULATION 17 + #include <asm/ia32.h> 18 + #endif /* CONFIG_IA32_EMULATION */ 19 + 20 + #endif /* CONFIG_X86_32 */ 21 + 22 + #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION) 23 + struct sigframe_ia32 { 24 + u32 pretcode; 25 + int sig; 26 + struct sigcontext_ia32 sc; 27 + /* 28 + * fpstate is unused. fpstate is moved/allocated after 29 + * retcode[] below. This movement allows to have the FP state and the 30 + * future state extensions (xsave) stay together. 31 + * And at the same time retaining the unused fpstate, prevents changing 32 + * the offset of extramask[] in the sigframe and thus prevent any 33 + * legacy application accessing/modifying it.
34 + */ 35 + struct _fpstate_ia32 fpstate_unused; 36 + #ifdef CONFIG_IA32_EMULATION 37 + unsigned int extramask[_COMPAT_NSIG_WORDS-1]; 38 + #else /* !CONFIG_IA32_EMULATION */ 39 + unsigned long extramask[_NSIG_WORDS-1]; 40 + #endif /* CONFIG_IA32_EMULATION */ 41 + char retcode[8]; 42 + /* fp state follows here */ 43 + }; 44 + 45 + struct rt_sigframe_ia32 { 46 + u32 pretcode; 47 + int sig; 48 + u32 pinfo; 49 + u32 puc; 50 + #ifdef CONFIG_IA32_EMULATION 51 + compat_siginfo_t info; 52 + #else /* !CONFIG_IA32_EMULATION */ 53 + struct siginfo info; 54 + #endif /* CONFIG_IA32_EMULATION */ 55 + struct ucontext_ia32 uc; 56 + char retcode[8]; 57 + /* fp state follows here */ 58 + }; 59 + #endif /* defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION) */ 60 + 61 + #ifdef CONFIG_X86_64 62 + struct rt_sigframe { 63 + char __user *pretcode; 64 + struct ucontext uc; 65 + struct siginfo info; 66 + /* fp state follows here */ 67 + }; 68 + #endif /* CONFIG_X86_64 */ 69 + 70 + #endif /* _ASM_X86_SIGFRAME_H */
+4 -2
arch/x86/include/asm/signal.h
··· 121 121 122 122 #ifndef __ASSEMBLY__ 123 123 124 + # ifdef __KERNEL__ 125 + extern void do_notify_resume(struct pt_regs *, void *, __u32); 126 + # endif /* __KERNEL__ */ 127 + 124 128 #ifdef __i386__ 125 129 # ifdef __KERNEL__ 126 130 struct old_sigaction { ··· 144 140 struct k_sigaction { 145 141 struct sigaction sa; 146 142 }; 147 - 148 - extern void do_notify_resume(struct pt_regs *, void *, __u32); 149 143 150 144 # else /* __KERNEL__ */ 151 145 /* Here we must cater to libcs that poke about in kernel headers. */
+1 -1
arch/x86/include/asm/sparsemem.h
··· 27 27 #else /* CONFIG_X86_32 */ 28 28 # define SECTION_SIZE_BITS 27 /* matt - 128 is convenient right now */ 29 29 # define MAX_PHYSADDR_BITS 44 30 - # define MAX_PHYSMEM_BITS 44 30 + # define MAX_PHYSMEM_BITS 44 /* Can be max 45 bits */ 31 31 #endif 32 32 33 33 #endif /* CONFIG_SPARSEMEM */
+8 -8
arch/x86/include/asm/syscalls.h
··· 19 19 /* kernel/ioport.c */ 20 20 asmlinkage long sys_ioperm(unsigned long, unsigned long, int); 21 21 22 + /* kernel/ldt.c */ 23 + asmlinkage int sys_modify_ldt(int, void __user *, unsigned long); 24 + 25 + /* kernel/tls.c */ 26 + asmlinkage int sys_set_thread_area(struct user_desc __user *); 27 + asmlinkage int sys_get_thread_area(struct user_desc __user *); 28 + 22 29 /* X86_32 only */ 23 30 #ifdef CONFIG_X86_32 24 31 /* kernel/process_32.c */ ··· 40 33 struct old_sigaction __user *); 41 34 asmlinkage int sys_sigaltstack(unsigned long); 42 35 asmlinkage unsigned long sys_sigreturn(unsigned long); 43 - asmlinkage int sys_rt_sigreturn(unsigned long); 36 + asmlinkage int sys_rt_sigreturn(struct pt_regs); 44 37 45 38 /* kernel/ioport.c */ 46 39 asmlinkage long sys_iopl(unsigned long); 47 - 48 - /* kernel/ldt.c */ 49 - asmlinkage int sys_modify_ldt(int, void __user *, unsigned long); 50 40 51 41 /* kernel/sys_i386_32.c */ 52 42 asmlinkage long sys_mmap2(unsigned long, unsigned long, unsigned long, ··· 57 53 asmlinkage int sys_uname(struct old_utsname __user *); 58 54 struct oldold_utsname; 59 55 asmlinkage int sys_olduname(struct oldold_utsname __user *); 60 - 61 - /* kernel/tls.c */ 62 - asmlinkage int sys_set_thread_area(struct user_desc __user *); 63 - asmlinkage int sys_get_thread_area(struct user_desc __user *); 64 56 65 57 /* kernel/vm86_32.c */ 66 58 asmlinkage int sys_vm86old(struct pt_regs);
+4 -2
arch/x86/include/asm/system.h
··· 17 17 # define AT_VECTOR_SIZE_ARCH 1 18 18 #endif 19 19 20 - #ifdef CONFIG_X86_32 21 - 22 20 struct task_struct; /* one of the stranger aspects of C forward declarations */ 23 21 struct task_struct *__switch_to(struct task_struct *prev, 24 22 struct task_struct *next); 23 + 24 + #ifdef CONFIG_X86_32 25 25 26 26 /* 27 27 * Saving eflags is important. It switches not only IOPL between tasks, ··· 313 313 extern void free_init_pages(char *what, unsigned long begin, unsigned long end); 314 314 315 315 void default_idle(void); 316 + 317 + void stop_this_cpu(void *dummy); 316 318 317 319 /* 318 320 * Force strict CPU ordering.
+1 -1
arch/x86/include/asm/thread_info.h
··· 24 24 struct thread_info { 25 25 struct task_struct *task; /* main task structure */ 26 26 struct exec_domain *exec_domain; /* execution domain */ 27 - unsigned long flags; /* low level flags */ 27 + __u32 flags; /* low level flags */ 28 28 __u32 status; /* thread synchronous flags */ 29 29 __u32 cpu; /* current CPU */ 30 30 int preempt_count; /* 0 => preemptable,
+7
arch/x86/include/asm/trampoline.h
··· 3 3 4 4 #ifndef __ASSEMBLY__ 5 5 6 + #ifdef CONFIG_X86_TRAMPOLINE 6 7 /* 7 8 * Trampoline 80x86 program as an array. 8 9 */ ··· 14 13 extern unsigned long init_rsp; 15 14 extern unsigned long initial_code; 16 15 16 + #define TRAMPOLINE_SIZE roundup(trampoline_end - trampoline_data, PAGE_SIZE) 17 17 #define TRAMPOLINE_BASE 0x6000 18 + 18 19 extern unsigned long setup_trampoline(void); 20 + extern void __init reserve_trampoline_memory(void); 21 + #else 22 + static inline void reserve_trampoline_memory(void) {}; 23 + #endif /* CONFIG_X86_TRAMPOLINE */ 19 24 20 25 #endif /* __ASSEMBLY__ */ 21 26
+9 -2
arch/x86/include/asm/traps.h
··· 46 46 dotraplinkage void do_invalid_TSS(struct pt_regs *, long); 47 47 dotraplinkage void do_segment_not_present(struct pt_regs *, long); 48 48 dotraplinkage void do_stack_segment(struct pt_regs *, long); 49 + #ifdef CONFIG_X86_64 50 + dotraplinkage void do_double_fault(struct pt_regs *, long); 51 + asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *); 52 + #endif 49 53 dotraplinkage void do_general_protection(struct pt_regs *, long); 50 54 dotraplinkage void do_page_fault(struct pt_regs *, unsigned long); 51 55 dotraplinkage void do_spurious_interrupt_bug(struct pt_regs *, long); ··· 76 72 extern int panic_on_unrecovered_nmi; 77 73 extern int kstack_depth_to_print; 78 74 79 - #ifdef CONFIG_X86_32 80 75 void math_error(void __user *); 81 - unsigned long patch_espfix_desc(unsigned long, unsigned long); 82 76 asmlinkage void math_emulate(long); 77 + #ifdef CONFIG_X86_32 78 + unsigned long patch_espfix_desc(unsigned long, unsigned long); 79 + #else 80 + asmlinkage void smp_thermal_interrupt(void); 81 + asmlinkage void mce_threshold_interrupt(void); 83 82 #endif 84 83 85 84 #endif /* _ASM_X86_TRAPS_H */
+1 -7
arch/x86/include/asm/tsc.h
··· 34 34 35 35 static __always_inline cycles_t vget_cycles(void) 36 36 { 37 - cycles_t cycles; 38 - 39 37 /* 40 38 * We only do VDSOs on TSC capable CPUs, so this shouldn't 41 39 * access boot_cpu_data (which is not VDSO-safe): ··· 42 44 if (!cpu_has_tsc) 43 45 return 0; 44 46 #endif 45 - rdtsc_barrier(); 46 - cycles = (cycles_t)__native_read_tsc(); 47 - rdtsc_barrier(); 48 - 49 - return cycles; 47 + return (cycles_t)__native_read_tsc(); 50 48 51 49 extern void tsc_init(void);
+2 -2
arch/x86/include/asm/uaccess.h
··· 350 350 351 351 #define __put_user_nocheck(x, ptr, size) \ 352 352 ({ \ 353 - long __pu_err; \ 353 + int __pu_err; \ 354 354 __put_user_size((x), (ptr), (size), __pu_err, -EFAULT); \ 355 355 __pu_err; \ 356 356 }) 357 357 358 358 #define __get_user_nocheck(x, ptr, size) \ 359 359 ({ \ 360 - long __gu_err; \ 360 + int __gu_err; \ 361 361 unsigned long __gu_val; \ 362 362 __get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT); \ 363 363 (x) = (__force __typeof__(*(ptr)))__gu_val; \
+30 -4
arch/x86/include/asm/uv/bios.h
··· 32 32 enum uv_bios_cmd { 33 33 UV_BIOS_COMMON, 34 34 UV_BIOS_GET_SN_INFO, 35 - UV_BIOS_FREQ_BASE 35 + UV_BIOS_FREQ_BASE, 36 + UV_BIOS_WATCHLIST_ALLOC, 37 + UV_BIOS_WATCHLIST_FREE, 38 + UV_BIOS_MEMPROTECT, 39 + UV_BIOS_GET_PARTITION_ADDR 36 40 }; 37 41 38 42 /* 39 43 * Status values returned from a BIOS call. 40 44 */ 41 45 enum { 46 + BIOS_STATUS_MORE_PASSES = 1, 42 47 BIOS_STATUS_SUCCESS = 0, 43 48 BIOS_STATUS_UNIMPLEMENTED = -ENOSYS, 44 49 BIOS_STATUS_EINVAL = -EINVAL, ··· 76 71 }; 77 72 }; 78 73 74 + union uv_watchlist_u { 75 + u64 val; 76 + struct { 77 + u64 blade : 16, 78 + size : 32, 79 + filler : 16; 80 + }; 81 + }; 82 + 83 + enum uv_memprotect { 84 + UV_MEMPROT_RESTRICT_ACCESS, 85 + UV_MEMPROT_ALLOW_AMO, 86 + UV_MEMPROT_ALLOW_RW 87 + }; 88 + 79 89 /* 80 90 * bios calls have 6 parameters 81 91 */ ··· 100 80 101 81 extern s64 uv_bios_get_sn_info(int, int *, long *, long *, long *); 102 82 extern s64 uv_bios_freq_base(u64, u64 *); 83 + extern int uv_bios_mq_watchlist_alloc(int, unsigned long, unsigned int, 84 + unsigned long *); 85 + extern int uv_bios_mq_watchlist_free(int, int); 86 + extern s64 uv_bios_change_memprotect(u64, u64, enum uv_memprotect); 87 + extern s64 uv_bios_reserved_page_pa(u64, u64 *, u64 *, u64 *); 103 88 104 89 extern void uv_bios_init(void); 105 90 91 + extern unsigned long sn_rtc_cycles_per_second; 106 92 extern int uv_type; 107 93 extern long sn_partition_id; 108 - extern long uv_coherency_id; 109 - extern long uv_region_size; 110 - #define partition_coherence_id() (uv_coherency_id) 94 + extern long sn_coherency_id; 95 + extern long sn_region_size; 96 + #define partition_coherence_id() (sn_coherency_id) 111 97 112 98 extern struct kobject *sgi_uv_kobj; /* /sys/firmware/sgi_uv */ 113 99
+76 -27
arch/x86/include/asm/uv/uv_hub.h
··· 113 113 */ 114 114 #define UV_MAX_NASID_VALUE (UV_MAX_NUMALINK_NODES * 2) 115 115 116 + struct uv_scir_s { 117 + struct timer_list timer; 118 + unsigned long offset; 119 + unsigned long last; 120 + unsigned long idle_on; 121 + unsigned long idle_off; 122 + unsigned char state; 123 + unsigned char enabled; 124 + }; 125 + 116 126 /* 117 127 * The following defines attributes of the HUB chip. These attributes are 118 128 * frequently referenced and are kept in the per-cpu data areas of each cpu. 119 129 * They are kept together in a struct to minimize cache misses. 120 130 */ 121 131 struct uv_hub_info_s { 122 - unsigned long global_mmr_base; 123 - unsigned long gpa_mask; 124 - unsigned long gnode_upper; 125 - unsigned long lowmem_remap_top; 126 - unsigned long lowmem_remap_base; 127 - unsigned short pnode; 128 - unsigned short pnode_mask; 129 - unsigned short coherency_domain_number; 130 - unsigned short numa_blade_id; 131 - unsigned char blade_processor_id; 132 - unsigned char m_val; 133 - unsigned char n_val; 132 + unsigned long global_mmr_base; 133 + unsigned long gpa_mask; 134 + unsigned long gnode_upper; 135 + unsigned long lowmem_remap_top; 136 + unsigned long lowmem_remap_base; 137 + unsigned short pnode; 138 + unsigned short pnode_mask; 139 + unsigned short coherency_domain_number; 140 + unsigned short numa_blade_id; 141 + unsigned char blade_processor_id; 142 + unsigned char m_val; 143 + unsigned char n_val; 144 + struct uv_scir_s scir; 134 145 }; 146 + 135 147 DECLARE_PER_CPU(struct uv_hub_info_s, __uv_hub_info); 136 148 #define uv_hub_info (&__get_cpu_var(__uv_hub_info)) 137 149 #define uv_cpu_hub_info(cpu) (&per_cpu(__uv_hub_info, cpu)) ··· 175 163 176 164 #define UV_APIC_PNODE_SHIFT 6 177 165 166 + /* Local Bus from cpu's perspective */ 167 + #define LOCAL_BUS_BASE 0x1c00000 168 + #define LOCAL_BUS_SIZE (4 * 1024 * 1024) 169 + 170 + /* 171 + * System Controller Interface Reg 172 + * 173 + * Note there are NO leds on a UV system. 
This register is only 174 + * used by the system controller to monitor system-wide operation. 175 + * There are 64 regs per node. With Nahelem cpus (2 cores per node, 176 + * 8 cpus per core, 2 threads per cpu) there are 32 cpu threads on 177 + * a node. 178 + * 179 + * The window is located at top of ACPI MMR space 180 + */ 181 + #define SCIR_WINDOW_COUNT 64 182 + #define SCIR_LOCAL_MMR_BASE (LOCAL_BUS_BASE + \ 183 + LOCAL_BUS_SIZE - \ 184 + SCIR_WINDOW_COUNT) 185 + 186 + #define SCIR_CPU_HEARTBEAT 0x01 /* timer interrupt */ 187 + #define SCIR_CPU_ACTIVITY 0x02 /* not idle */ 188 + #define SCIR_CPU_HB_INTERVAL (HZ) /* once per second */ 189 + 178 190 /* 179 191 * Macros for converting between kernel virtual addresses, socket local physical 180 192 * addresses, and UV global physical addresses. ··· 210 174 static inline unsigned long uv_soc_phys_ram_to_gpa(unsigned long paddr) 211 175 { 212 176 if (paddr < uv_hub_info->lowmem_remap_top) 213 - paddr += uv_hub_info->lowmem_remap_base; 177 + paddr |= uv_hub_info->lowmem_remap_base; 214 178 return paddr | uv_hub_info->gnode_upper; 215 179 } 216 180 ··· 218 182 /* socket virtual --> UV global physical address */ 219 183 static inline unsigned long uv_gpa(void *v) 220 184 { 221 - return __pa(v) | uv_hub_info->gnode_upper; 222 - } 223 - 224 - /* socket virtual --> UV global physical address */ 225 - static inline void *uv_vgpa(void *v) 226 - { 227 - return (void *)uv_gpa(v); 228 - } 229 - 230 - /* UV global physical address --> socket virtual */ 231 - static inline void *uv_va(unsigned long gpa) 232 - { 233 - return __va(gpa & uv_hub_info->gpa_mask); 185 + return uv_soc_phys_ram_to_gpa(__pa(v)); 234 186 } 235 187 236 188 /* pnode, offset --> socket virtual */ ··· 301 277 *uv_local_mmr_address(offset) = val; 302 278 } 303 279 280 + static inline unsigned char uv_read_local_mmr8(unsigned long offset) 281 + { 282 + return *((unsigned char *)uv_local_mmr_address(offset)); 283 + } 284 + 285 + static inline void 
uv_write_local_mmr8(unsigned long offset, unsigned char val) 286 + { 287 + *((unsigned char *)uv_local_mmr_address(offset)) = val; 288 + } 289 + 304 290 /* 305 291 * Structures and definitions for converting between cpu, node, pnode, and blade 306 292 * numbers. ··· 385 351 return uv_possible_blades; 386 352 } 387 353 388 - #endif /* _ASM_X86_UV_UV_HUB_H */ 354 + /* Update SCIR state */ 355 + static inline void uv_set_scir_bits(unsigned char value) 356 + { 357 + if (uv_hub_info->scir.state != value) { 358 + uv_hub_info->scir.state = value; 359 + uv_write_local_mmr8(uv_hub_info->scir.offset, value); 360 + } 361 + } 362 + static inline void uv_set_cpu_scir_bits(int cpu, unsigned char value) 363 + { 364 + if (uv_cpu_hub_info(cpu)->scir.state != value) { 365 + uv_cpu_hub_info(cpu)->scir.state = value; 366 + uv_write_local_mmr8(uv_cpu_hub_info(cpu)->scir.offset, value); 367 + } 368 + } 389 369 370 + #endif /* _ASM_X86_UV_UV_HUB_H */
+27
arch/x86/include/asm/vmware.h
··· 1 + /* 2 + * Copyright (C) 2008, VMware, Inc. 3 + * 4 + * This program is free software; you can redistribute it and/or modify 5 + * it under the terms of the GNU General Public License as published by 6 + * the Free Software Foundation; either version 2 of the License, or 7 + * (at your option) any later version. 8 + * 9 + * This program is distributed in the hope that it will be useful, but 10 + * WITHOUT ANY WARRANTY; without even the implied warranty of 11 + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or 12 + * NON INFRINGEMENT. See the GNU General Public License for more 13 + * details. 14 + * 15 + * You should have received a copy of the GNU General Public License 16 + * along with this program; if not, write to the Free Software 17 + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 18 + * 19 + */ 20 + #ifndef ASM_X86__VMWARE_H 21 + #define ASM_X86__VMWARE_H 22 + 23 + extern unsigned long vmware_get_tsc_khz(void); 24 + extern int vmware_platform(void); 25 + extern void vmware_set_feature_bits(struct cpuinfo_x86 *c); 26 + 27 + #endif
+6
arch/x86/include/asm/xen/hypercall.h
··· 33 33 #ifndef _ASM_X86_XEN_HYPERCALL_H 34 34 #define _ASM_X86_XEN_HYPERCALL_H 35 35 36 + #include <linux/kernel.h> 37 + #include <linux/spinlock.h> 36 38 #include <linux/errno.h> 37 39 #include <linux/string.h> 40 + #include <linux/types.h> 41 + 42 + #include <asm/page.h> 43 + #include <asm/pgtable.h> 38 44 39 45 #include <xen/interface/xen.h> 40 46 #include <xen/interface/sched.h>
+8 -31
arch/x86/include/asm/xen/hypervisor.h
··· 33 33 #ifndef _ASM_X86_XEN_HYPERVISOR_H 34 34 #define _ASM_X86_XEN_HYPERVISOR_H 35 35 36 - #include <linux/types.h> 37 - #include <linux/kernel.h> 38 - 39 - #include <xen/interface/xen.h> 40 - #include <xen/interface/version.h> 41 - 42 - #include <asm/ptrace.h> 43 - #include <asm/page.h> 44 - #include <asm/desc.h> 45 - #if defined(__i386__) 46 - # ifdef CONFIG_X86_PAE 47 - # include <asm-generic/pgtable-nopud.h> 48 - # else 49 - # include <asm-generic/pgtable-nopmd.h> 50 - # endif 51 - #endif 52 - #include <asm/xen/hypercall.h> 53 - 54 36 /* arch/i386/kernel/setup.c */ 55 37 extern struct shared_info *HYPERVISOR_shared_info; 56 38 extern struct start_info *xen_start_info; 57 - 58 - /* arch/i386/mach-xen/evtchn.c */ 59 - /* Force a proper event-channel callback from Xen. */ 60 - extern void force_evtchn_callback(void); 61 - 62 - /* Turn jiffies into Xen system time. */ 63 - u64 jiffies_to_st(unsigned long jiffies); 64 - 65 - 66 - #define MULTI_UVMFLAGS_INDEX 3 67 - #define MULTI_UVMDOMID_INDEX 4 68 39 69 40 enum xen_domain_type { 70 41 XEN_NATIVE, ··· 45 74 46 75 extern enum xen_domain_type xen_domain_type; 47 76 77 + #ifdef CONFIG_XEN 48 78 #define xen_domain() (xen_domain_type != XEN_NATIVE) 49 - #define xen_pv_domain() (xen_domain_type == XEN_PV_DOMAIN) 79 + #else 80 + #define xen_domain() (0) 81 + #endif 82 + 83 + #define xen_pv_domain() (xen_domain() && xen_domain_type == XEN_PV_DOMAIN) 84 + #define xen_hvm_domain() (xen_domain() && xen_domain_type == XEN_HVM_DOMAIN) 85 + 50 86 #define xen_initial_domain() (xen_pv_domain() && xen_start_info->flags & SIF_INITDOMAIN) 51 - #define xen_hvm_domain() (xen_domain_type == XEN_HVM_DOMAIN) 52 87 53 88 #endif /* _ASM_X86_XEN_HYPERVISOR_H */
+5
arch/x86/include/asm/xen/page.h
··· 1 1 #ifndef _ASM_X86_XEN_PAGE_H 2 2 #define _ASM_X86_XEN_PAGE_H 3 3 4 + #include <linux/kernel.h> 5 + #include <linux/types.h> 6 + #include <linux/spinlock.h> 4 7 #include <linux/pfn.h> 5 8 6 9 #include <asm/uaccess.h> 10 + #include <asm/page.h> 7 11 #include <asm/pgtable.h> 8 12 13 + #include <xen/interface/xen.h> 9 14 #include <xen/features.h> 10 15 11 16 /* Xen machine address */
+5 -2
arch/x86/kernel/Makefile
··· 12 12 CFLAGS_REMOVE_rtc.o = -pg 13 13 CFLAGS_REMOVE_paravirt-spinlocks.o = -pg 14 14 CFLAGS_REMOVE_ftrace.o = -pg 15 + CFLAGS_REMOVE_early_printk.o = -pg 15 16 endif 16 17 17 18 # ··· 24 23 CFLAGS_hpet.o := $(nostackp) 25 24 CFLAGS_tsc.o := $(nostackp) 26 25 27 - obj-y := process_$(BITS).o signal_$(BITS).o entry_$(BITS).o 26 + obj-y := process_$(BITS).o signal.o entry_$(BITS).o 28 27 obj-y += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o 29 - obj-y += time_$(BITS).o ioport.o ldt.o 28 + obj-y += time_$(BITS).o ioport.o ldt.o dumpstack.o 30 29 obj-y += setup.o i8259.o irqinit_$(BITS).o setup_percpu.o 31 30 obj-$(CONFIG_X86_VISWS) += visws_quirks.o 32 31 obj-$(CONFIG_X86_32) += probe_roms_32.o ··· 105 104 microcode-$(CONFIG_MICROCODE_INTEL) += microcode_intel.o 106 105 microcode-$(CONFIG_MICROCODE_AMD) += microcode_amd.o 107 106 obj-$(CONFIG_MICROCODE) += microcode.o 107 + 108 + obj-$(CONFIG_X86_CHECK_BIOS_CORRUPTION) += check.o 108 109 109 110 ### 110 111 # 64 bit specific files
+11
arch/x86/kernel/acpi/boot.c
··· 1360 1360 disable_acpi(); 1361 1361 } 1362 1362 } 1363 + 1364 + /* 1365 + * ACPI supports both logical (e.g. Hyper-Threading) and physical 1366 + * processors, where MPS only supports physical. 1367 + */ 1368 + if (acpi_lapic && acpi_ioapic) 1369 + printk(KERN_INFO "Using ACPI (MADT) for SMP configuration " 1370 + "information\n"); 1371 + else if (acpi_lapic) 1372 + printk(KERN_INFO "Using ACPI for processor (LAPIC) " 1373 + "configuration information\n"); 1363 1374 #endif 1364 1375 return; 1365 1376 }
+54 -73
arch/x86/kernel/apic.c
··· 441 441 v = apic_read(APIC_LVTT); 442 442 v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR); 443 443 apic_write(APIC_LVTT, v); 444 + apic_write(APIC_TMICT, 0xffffffff); 444 445 break; 445 446 case CLOCK_EVT_MODE_RESUME: 446 447 /* Nothing to do here */ ··· 560 559 } else { 561 560 res = (((u64)deltapm) * mult) >> 22; 562 561 do_div(res, 1000000); 563 - printk(KERN_WARNING "APIC calibration not consistent " 562 + pr_warning("APIC calibration not consistent " 564 563 "with PM Timer: %ldms instead of 100ms\n", 565 564 (long)res); 566 565 /* Correct the lapic counter value */ 567 566 res = (((u64)(*delta)) * pm_100ms); 568 567 do_div(res, deltapm); 569 - printk(KERN_INFO "APIC delta adjusted to PM-Timer: " 568 + pr_info("APIC delta adjusted to PM-Timer: " 570 569 "%lu (%ld)\n", (unsigned long)res, *delta); 571 570 *delta = (long)res; 572 571 } ··· 646 645 */ 647 646 if (calibration_result < (1000000 / HZ)) { 648 647 local_irq_enable(); 649 - printk(KERN_WARNING 650 - "APIC frequency too slow, disabling apic timer\n"); 648 + pr_warning("APIC frequency too slow, disabling apic timer\n"); 651 649 return -1; 652 650 } 653 651 ··· 672 672 while (lapic_cal_loops <= LAPIC_CAL_LOOPS) 673 673 cpu_relax(); 674 674 675 - local_irq_disable(); 676 - 677 675 /* Stop the lapic timer */ 678 676 lapic_timer_setup(CLOCK_EVT_MODE_SHUTDOWN, levt); 679 - 680 - local_irq_enable(); 681 677 682 678 /* Jiffies delta */ 683 679 deltaj = lapic_cal_j2 - lapic_cal_j1; ··· 688 692 local_irq_enable(); 689 693 690 694 if (levt->features & CLOCK_EVT_FEAT_DUMMY) { 691 - printk(KERN_WARNING 692 - "APIC timer disabled due to verification failure.\n"); 695 + pr_warning("APIC timer disabled due to verification failure.\n"); 693 696 return -1; 694 697 } 695 698 ··· 709 714 * broadcast mechanism is used. On UP systems simply ignore it. 710 715 */ 711 716 if (disable_apic_timer) { 712 - printk(KERN_INFO "Disabling APIC timer\n"); 717 + pr_info("Disabling APIC timer\n"); 713 718 /* No broadcast on UP ! 
*/ 714 719 if (num_possible_cpus() > 1) { 715 720 lapic_clockevent.mult = 1; ··· 736 741 if (nmi_watchdog != NMI_IO_APIC) 737 742 lapic_clockevent.features &= ~CLOCK_EVT_FEAT_DUMMY; 738 743 else 739 - printk(KERN_WARNING "APIC timer registered as dummy," 744 + pr_warning("APIC timer registered as dummy," 740 745 " due to nmi_watchdog=%d!\n", nmi_watchdog); 741 746 742 747 /* Setup the lapic or request the broadcast */ ··· 768 773 * spurious. 769 774 */ 770 775 if (!evt->event_handler) { 771 - printk(KERN_WARNING 772 - "Spurious LAPIC timer interrupt on cpu %d\n", cpu); 776 + pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu); 773 777 /* Switch it off */ 774 778 lapic_timer_setup(CLOCK_EVT_MODE_SHUTDOWN, evt); 775 779 return; ··· 808 814 * Besides, if we don't timer interrupts ignore the global 809 815 * interrupt lock, which is the WrongThing (tm) to do. 810 816 */ 811 - #ifdef CONFIG_X86_64 812 817 exit_idle(); 813 - #endif 814 818 irq_enter(); 815 819 local_apic_timer_interrupt(); 816 820 irq_exit(); ··· 1085 1093 unsigned int oldvalue, value, maxlvt; 1086 1094 1087 1095 if (!lapic_is_integrated()) { 1088 - printk(KERN_INFO "No ESR for 82489DX.\n"); 1096 + pr_info("No ESR for 82489DX.\n"); 1089 1097 return; 1090 1098 } 1091 1099 ··· 1096 1104 * ESR disabled - we can't do anything useful with the 1097 1105 * errors anyway - mbligh 1098 1106 */ 1099 - printk(KERN_INFO "Leaving ESR disabled.\n"); 1107 + pr_info("Leaving ESR disabled.\n"); 1100 1108 return; 1101 1109 } 1102 1110 ··· 1290 1298 rdmsr(MSR_IA32_APICBASE, msr, msr2); 1291 1299 1292 1300 if (msr & X2APIC_ENABLE) { 1293 - printk("x2apic enabled by BIOS, switching to x2apic ops\n"); 1301 + pr_info("x2apic enabled by BIOS, switching to x2apic ops\n"); 1294 1302 x2apic_preenabled = x2apic = 1; 1295 1303 apic_ops = &x2apic_ops; 1296 1304 } ··· 1302 1310 1303 1311 rdmsr(MSR_IA32_APICBASE, msr, msr2); 1304 1312 if (!(msr & X2APIC_ENABLE)) { 1305 - printk("Enabling x2apic\n"); 1313 + pr_info("Enabling 
x2apic\n"); 1306 1314 wrmsr(MSR_IA32_APICBASE, msr | X2APIC_ENABLE, 0); 1307 1315 } 1308 1316 } ··· 1317 1325 return; 1318 1326 1319 1327 if (!x2apic_preenabled && disable_x2apic) { 1320 - printk(KERN_INFO 1321 - "Skipped enabling x2apic and Interrupt-remapping " 1322 - "because of nox2apic\n"); 1328 + pr_info("Skipped enabling x2apic and Interrupt-remapping " 1329 + "because of nox2apic\n"); 1323 1330 return; 1324 1331 } 1325 1332 ··· 1326 1335 panic("Bios already enabled x2apic, can't enforce nox2apic"); 1327 1336 1328 1337 if (!x2apic_preenabled && skip_ioapic_setup) { 1329 - printk(KERN_INFO 1330 - "Skipped enabling x2apic and Interrupt-remapping " 1331 - "because of skipping io-apic setup\n"); 1338 + pr_info("Skipped enabling x2apic and Interrupt-remapping " 1339 + "because of skipping io-apic setup\n"); 1332 1340 return; 1333 1341 } 1334 1342 1335 1343 ret = dmar_table_init(); 1336 1344 if (ret) { 1337 - printk(KERN_INFO 1338 - "dmar_table_init() failed with %d:\n", ret); 1345 + pr_info("dmar_table_init() failed with %d:\n", ret); 1339 1346 1340 1347 if (x2apic_preenabled) 1341 1348 panic("x2apic enabled by bios. 
But IR enabling failed"); 1342 1349 else 1343 - printk(KERN_INFO 1344 - "Not enabling x2apic,Intr-remapping\n"); 1350 + pr_info("Not enabling x2apic,Intr-remapping\n"); 1345 1351 return; 1346 1352 } 1347 1353 ··· 1347 1359 1348 1360 ret = save_mask_IO_APIC_setup(); 1349 1361 if (ret) { 1350 - printk(KERN_INFO "Saving IO-APIC state failed: %d\n", ret); 1362 + pr_info("Saving IO-APIC state failed: %d\n", ret); 1351 1363 goto end; 1352 1364 } 1353 1365 ··· 1382 1394 1383 1395 if (!ret) { 1384 1396 if (!x2apic_preenabled) 1385 - printk(KERN_INFO 1386 - "Enabled x2apic and interrupt-remapping\n"); 1397 + pr_info("Enabled x2apic and interrupt-remapping\n"); 1387 1398 else 1388 - printk(KERN_INFO 1389 - "Enabled Interrupt-remapping\n"); 1399 + pr_info("Enabled Interrupt-remapping\n"); 1390 1400 } else 1391 - printk(KERN_ERR 1392 - "Failed to enable Interrupt-remapping and x2apic\n"); 1401 + pr_err("Failed to enable Interrupt-remapping and x2apic\n"); 1393 1402 #else 1394 1403 if (!cpu_has_x2apic) 1395 1404 return; ··· 1395 1410 panic("x2apic enabled prior OS handover," 1396 1411 " enable CONFIG_INTR_REMAP"); 1397 1412 1398 - printk(KERN_INFO "Enable CONFIG_INTR_REMAP for enabling intr-remapping " 1399 - " and x2apic\n"); 1413 + pr_info("Enable CONFIG_INTR_REMAP for enabling intr-remapping " 1414 + " and x2apic\n"); 1400 1415 #endif 1401 1416 1402 1417 return; ··· 1413 1428 static int __init detect_init_APIC(void) 1414 1429 { 1415 1430 if (!cpu_has_apic) { 1416 - printk(KERN_INFO "No local APIC present\n"); 1431 + pr_info("No local APIC present\n"); 1417 1432 return -1; 1418 1433 } 1419 1434 ··· 1454 1469 * "lapic" specified. 
1455 1470 */ 1456 1471 if (!force_enable_local_apic) { 1457 - printk(KERN_INFO "Local APIC disabled by BIOS -- " 1458 - "you can enable it with \"lapic\"\n"); 1472 + pr_info("Local APIC disabled by BIOS -- " 1473 + "you can enable it with \"lapic\"\n"); 1459 1474 return -1; 1460 1475 } 1461 1476 /* ··· 1465 1480 */ 1466 1481 rdmsr(MSR_IA32_APICBASE, l, h); 1467 1482 if (!(l & MSR_IA32_APICBASE_ENABLE)) { 1468 - printk(KERN_INFO 1469 - "Local APIC disabled by BIOS -- reenabling.\n"); 1483 + pr_info("Local APIC disabled by BIOS -- reenabling.\n"); 1470 1484 l &= ~MSR_IA32_APICBASE_BASE; 1471 1485 l |= MSR_IA32_APICBASE_ENABLE | APIC_DEFAULT_PHYS_BASE; 1472 1486 wrmsr(MSR_IA32_APICBASE, l, h); ··· 1478 1494 */ 1479 1495 features = cpuid_edx(1); 1480 1496 if (!(features & (1 << X86_FEATURE_APIC))) { 1481 - printk(KERN_WARNING "Could not enable APIC!\n"); 1497 + pr_warning("Could not enable APIC!\n"); 1482 1498 return -1; 1483 1499 } 1484 1500 set_cpu_cap(&boot_cpu_data, X86_FEATURE_APIC); ··· 1489 1505 if (l & MSR_IA32_APICBASE_ENABLE) 1490 1506 mp_lapic_addr = l & MSR_IA32_APICBASE_BASE; 1491 1507 1492 - printk(KERN_INFO "Found and enabled local APIC!\n"); 1508 + pr_info("Found and enabled local APIC!\n"); 1493 1509 1494 1510 apic_pm_activate(); 1495 1511 1496 1512 return 0; 1497 1513 1498 1514 no_apic: 1499 - printk(KERN_INFO "No local APIC present or hardware disabled\n"); 1515 + pr_info("No local APIC present or hardware disabled\n"); 1500 1516 return -1; 1501 1517 } 1502 1518 #endif ··· 1572 1588 { 1573 1589 #ifdef CONFIG_X86_64 1574 1590 if (disable_apic) { 1575 - printk(KERN_INFO "Apic disabled\n"); 1591 + pr_info("Apic disabled\n"); 1576 1592 return -1; 1577 1593 } 1578 1594 if (!cpu_has_apic) { 1579 1595 disable_apic = 1; 1580 - printk(KERN_INFO "Apic disabled by BIOS\n"); 1596 + pr_info("Apic disabled by BIOS\n"); 1581 1597 return -1; 1582 1598 } 1583 1599 #else ··· 1589 1605 */ 1590 1606 if (!cpu_has_apic && 1591 1607 
APIC_INTEGRATED(apic_version[boot_cpu_physical_apicid])) { 1592 - printk(KERN_ERR "BIOS bug, local APIC 0x%x not detected!...\n", 1593 - boot_cpu_physical_apicid); 1608 + pr_err("BIOS bug, local APIC 0x%x not detected!...\n", 1609 + boot_cpu_physical_apicid); 1594 1610 clear_cpu_cap(&boot_cpu_data, X86_FEATURE_APIC); 1595 1611 return -1; 1596 1612 } ··· 1666 1682 { 1667 1683 u32 v; 1668 1684 1669 - #ifdef CONFIG_X86_64 1670 1685 exit_idle(); 1671 - #endif 1672 1686 irq_enter(); 1673 1687 /* 1674 1688 * Check if this really is a spurious interrupt and ACK it ··· 1681 1699 add_pda(irq_spurious_count, 1); 1682 1700 #else 1683 1701 /* see sw-dev-man vol 3, chapter 7.4.13.5 */ 1684 - printk(KERN_INFO "spurious APIC interrupt on CPU#%d, " 1685 - "should never happen.\n", smp_processor_id()); 1702 + pr_info("spurious APIC interrupt on CPU#%d, " 1703 + "should never happen.\n", smp_processor_id()); 1686 1704 __get_cpu_var(irq_stat).irq_spurious_count++; 1687 1705 #endif 1688 1706 irq_exit(); ··· 1695 1713 { 1696 1714 u32 v, v1; 1697 1715 1698 - #ifdef CONFIG_X86_64 1699 1716 exit_idle(); 1700 - #endif 1701 1717 irq_enter(); 1702 1718 /* First tickle the hardware, only then report what went on. 
-- REW */ 1703 1719 v = apic_read(APIC_ESR); ··· 1704 1724 ack_APIC_irq(); 1705 1725 atomic_inc(&irq_err_count); 1706 1726 1707 - /* Here is what the APIC error bits mean: 1708 - 0: Send CS error 1709 - 1: Receive CS error 1710 - 2: Send accept error 1711 - 3: Receive accept error 1712 - 4: Reserved 1713 - 5: Send illegal vector 1714 - 6: Received illegal vector 1715 - 7: Illegal register address 1716 - */ 1717 - printk(KERN_DEBUG "APIC error on CPU%d: %02x(%02x)\n", 1727 + /* 1728 + * Here is what the APIC error bits mean: 1729 + * 0: Send CS error 1730 + * 1: Receive CS error 1731 + * 2: Send accept error 1732 + * 3: Receive accept error 1733 + * 4: Reserved 1734 + * 5: Send illegal vector 1735 + * 6: Received illegal vector 1736 + * 7: Illegal register address 1737 + */ 1738 + pr_debug("APIC error on CPU%d: %02x(%02x)\n", 1718 1739 smp_processor_id(), v , v1); 1719 1740 irq_exit(); 1720 1741 } ··· 1819 1838 * Validate version 1820 1839 */ 1821 1840 if (version == 0x0) { 1822 - printk(KERN_WARNING "BIOS bug, APIC version is 0 for CPU#%d! " 1823 - "fixing up to 0x10. (tell your hw vendor)\n", 1824 - version); 1841 + pr_warning("BIOS bug, APIC version is 0 for CPU#%d! " 1842 + "fixing up to 0x10. (tell your hw vendor)\n", 1843 + version); 1825 1844 version = 0x10; 1826 1845 } 1827 1846 apic_version[apicid] = version; 1828 1847 1829 1848 if (num_processors >= NR_CPUS) { 1830 - printk(KERN_WARNING "WARNING: NR_CPUS limit of %i reached." 1849 + pr_warning("WARNING: NR_CPUS limit of %i reached." 1831 1850 " Processor ignored.\n", NR_CPUS); 1832 1851 return; 1833 1852 } ··· 2190 2209 else if (strcmp("verbose", arg) == 0) 2191 2210 apic_verbosity = APIC_VERBOSE; 2192 2211 else { 2193 - printk(KERN_WARNING "APIC Verbosity level %s not recognised" 2212 + pr_warning("APIC Verbosity level %s not recognised" 2194 2213 " use apic=verbose or apic=debug\n", arg); 2195 2214 return -EINVAL; 2196 2215 }
-4
arch/x86/kernel/apm_32.c
··· 391 391 #else 392 392 static int power_off = 1; 393 393 #endif 394 - #ifdef CONFIG_APM_REAL_MODE_POWER_OFF 395 - static int realmode_power_off = 1; 396 - #else 397 394 static int realmode_power_off; 398 - #endif 399 395 #ifdef CONFIG_APM_ALLOW_INTS 400 396 static int allow_ints = 1; 401 397 #else
+1 -1
arch/x86/kernel/asm-offsets_32.c
··· 11 11 #include <linux/suspend.h> 12 12 #include <linux/kbuild.h> 13 13 #include <asm/ucontext.h> 14 - #include "sigframe.h" 14 + #include <asm/sigframe.h> 15 15 #include <asm/pgtable.h> 16 16 #include <asm/fixmap.h> 17 17 #include <asm/processor.h>
+3 -1
arch/x86/kernel/asm-offsets_64.c
··· 20 20 21 21 #include <xen/interface/xen.h> 22 22 23 + #include <asm/sigframe.h> 24 + 23 25 #define __NO_STUBS 1 24 26 #undef __SYSCALL 25 27 #undef _ASM_X86_UNISTD_64_H ··· 89 87 BLANK(); 90 88 #undef ENTRY 91 89 DEFINE(IA32_RT_SIGFRAME_sigcontext, 92 - offsetof (struct rt_sigframe32, uc.uc_mcontext)); 90 + offsetof (struct rt_sigframe_ia32, uc.uc_mcontext)); 93 91 BLANK(); 94 92 #endif 95 93 DEFINE(pbe_address, offsetof(struct pbe, address));
+54 -4
arch/x86/kernel/bios_uv.c
··· 69 69 70 70 long sn_partition_id; 71 71 EXPORT_SYMBOL_GPL(sn_partition_id); 72 - long uv_coherency_id; 73 - EXPORT_SYMBOL_GPL(uv_coherency_id); 74 - long uv_region_size; 75 - EXPORT_SYMBOL_GPL(uv_region_size); 72 + long sn_coherency_id; 73 + EXPORT_SYMBOL_GPL(sn_coherency_id); 74 + long sn_region_size; 75 + EXPORT_SYMBOL_GPL(sn_region_size); 76 76 int uv_type; 77 77 78 78 ··· 100 100 return ret; 101 101 } 102 102 103 + int 104 + uv_bios_mq_watchlist_alloc(int blade, unsigned long addr, unsigned int mq_size, 105 + unsigned long *intr_mmr_offset) 106 + { 107 + union uv_watchlist_u size_blade; 108 + u64 watchlist; 109 + s64 ret; 110 + 111 + size_blade.size = mq_size; 112 + size_blade.blade = blade; 113 + 114 + /* 115 + * bios returns watchlist number or negative error number. 116 + */ 117 + ret = (int)uv_bios_call_irqsave(UV_BIOS_WATCHLIST_ALLOC, addr, 118 + size_blade.val, (u64)intr_mmr_offset, 119 + (u64)&watchlist, 0); 120 + if (ret < BIOS_STATUS_SUCCESS) 121 + return ret; 122 + 123 + return watchlist; 124 + } 125 + EXPORT_SYMBOL_GPL(uv_bios_mq_watchlist_alloc); 126 + 127 + int 128 + uv_bios_mq_watchlist_free(int blade, int watchlist_num) 129 + { 130 + return (int)uv_bios_call_irqsave(UV_BIOS_WATCHLIST_FREE, 131 + blade, watchlist_num, 0, 0, 0); 132 + } 133 + EXPORT_SYMBOL_GPL(uv_bios_mq_watchlist_free); 134 + 135 + s64 136 + uv_bios_change_memprotect(u64 paddr, u64 len, enum uv_memprotect perms) 137 + { 138 + return uv_bios_call_irqsave(UV_BIOS_MEMPROTECT, paddr, len, 139 + perms, 0, 0); 140 + } 141 + EXPORT_SYMBOL_GPL(uv_bios_change_memprotect); 142 + 143 + s64 144 + uv_bios_reserved_page_pa(u64 buf, u64 *cookie, u64 *addr, u64 *len) 145 + { 146 + s64 ret; 147 + 148 + ret = uv_bios_call_irqsave(UV_BIOS_GET_PARTITION_ADDR, (u64)cookie, 149 + (u64)addr, buf, (u64)len, 0); 150 + return ret; 151 + } 152 + EXPORT_SYMBOL_GPL(uv_bios_reserved_page_pa); 103 153 104 154 s64 uv_bios_freq_base(u64 clock_type, u64 *ticks_per_second) 105 155 {
+161
arch/x86/kernel/check.c
··· 1 + #include <linux/module.h> 2 + #include <linux/sched.h> 3 + #include <linux/kthread.h> 4 + #include <linux/workqueue.h> 5 + #include <asm/e820.h> 6 + #include <asm/proto.h> 7 + 8 + /* 9 + * Some BIOSes seem to corrupt the low 64k of memory during events 10 + * like suspend/resume and unplugging an HDMI cable. Reserve all 11 + * remaining free memory in that area and fill it with a distinct 12 + * pattern. 13 + */ 14 + #define MAX_SCAN_AREAS 8 15 + 16 + static int __read_mostly memory_corruption_check = -1; 17 + 18 + static unsigned __read_mostly corruption_check_size = 64*1024; 19 + static unsigned __read_mostly corruption_check_period = 60; /* seconds */ 20 + 21 + static struct e820entry scan_areas[MAX_SCAN_AREAS]; 22 + static int num_scan_areas; 23 + 24 + 25 + static __init int set_corruption_check(char *arg) 26 + { 27 + char *end; 28 + 29 + memory_corruption_check = simple_strtol(arg, &end, 10); 30 + 31 + return (*end == 0) ? 0 : -EINVAL; 32 + } 33 + early_param("memory_corruption_check", set_corruption_check); 34 + 35 + static __init int set_corruption_check_period(char *arg) 36 + { 37 + char *end; 38 + 39 + corruption_check_period = simple_strtoul(arg, &end, 10); 40 + 41 + return (*end == 0) ? 0 : -EINVAL; 42 + } 43 + early_param("memory_corruption_check_period", set_corruption_check_period); 44 + 45 + static __init int set_corruption_check_size(char *arg) 46 + { 47 + char *end; 48 + unsigned size; 49 + 50 + size = memparse(arg, &end); 51 + 52 + if (*end == '\0') 53 + corruption_check_size = size; 54 + 55 + return (size == corruption_check_size) ? 
0 : -EINVAL; 56 + } 57 + early_param("memory_corruption_check_size", set_corruption_check_size); 58 + 59 + 60 + void __init setup_bios_corruption_check(void) 61 + { 62 + u64 addr = PAGE_SIZE; /* assume first page is reserved anyway */ 63 + 64 + if (memory_corruption_check == -1) { 65 + memory_corruption_check = 66 + #ifdef CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK 67 + 1 68 + #else 69 + 0 70 + #endif 71 + ; 72 + } 73 + 74 + if (corruption_check_size == 0) 75 + memory_corruption_check = 0; 76 + 77 + if (!memory_corruption_check) 78 + return; 79 + 80 + corruption_check_size = round_up(corruption_check_size, PAGE_SIZE); 81 + 82 + while (addr < corruption_check_size && num_scan_areas < MAX_SCAN_AREAS) { 83 + u64 size; 84 + addr = find_e820_area_size(addr, &size, PAGE_SIZE); 85 + 86 + if (addr == 0) 87 + break; 88 + 89 + if ((addr + size) > corruption_check_size) 90 + size = corruption_check_size - addr; 91 + 92 + if (size == 0) 93 + break; 94 + 95 + e820_update_range(addr, size, E820_RAM, E820_RESERVED); 96 + scan_areas[num_scan_areas].addr = addr; 97 + scan_areas[num_scan_areas].size = size; 98 + num_scan_areas++; 99 + 100 + /* Assume we've already mapped this early memory */ 101 + memset(__va(addr), 0, size); 102 + 103 + addr += size; 104 + } 105 + 106 + printk(KERN_INFO "Scanning %d areas for low memory corruption\n", 107 + num_scan_areas); 108 + update_e820(); 109 + } 110 + 111 + 112 + void check_for_bios_corruption(void) 113 + { 114 + int i; 115 + int corruption = 0; 116 + 117 + if (!memory_corruption_check) 118 + return; 119 + 120 + for (i = 0; i < num_scan_areas; i++) { 121 + unsigned long *addr = __va(scan_areas[i].addr); 122 + unsigned long size = scan_areas[i].size; 123 + 124 + for (; size; addr++, size -= sizeof(unsigned long)) { 125 + if (!*addr) 126 + continue; 127 + printk(KERN_ERR "Corrupted low memory at %p (%lx phys) = %08lx\n", 128 + addr, __pa(addr), *addr); 129 + corruption = 1; 130 + *addr = 0; 131 + } 132 + } 133 + 134 + WARN_ONCE(corruption, 
KERN_ERR "Memory corruption detected in low memory\n"); 135 + } 136 + 137 + static void check_corruption(struct work_struct *dummy); 138 + static DECLARE_DELAYED_WORK(bios_check_work, check_corruption); 139 + 140 + static void check_corruption(struct work_struct *dummy) 141 + { 142 + check_for_bios_corruption(); 143 + schedule_delayed_work(&bios_check_work, 144 + round_jiffies_relative(corruption_check_period*HZ)); 145 + } 146 + 147 + static int start_periodic_check_for_corruption(void) 148 + { 149 + if (!memory_corruption_check || corruption_check_period == 0) 150 + return 0; 151 + 152 + printk(KERN_INFO "Scanning for low memory corruption every %d seconds\n", 153 + corruption_check_period); 154 + 155 + /* First time we run the checks right away */ 156 + schedule_delayed_work(&bios_check_work, 0); 157 + return 0; 158 + } 159 + 160 + module_init(start_periodic_check_for_corruption); 161 +
+1
arch/x86/kernel/cpu/Makefile
··· 4 4 5 5 obj-y := intel_cacheinfo.o addon_cpuid_features.o 6 6 obj-y += proc.o capflags.o powerflags.o common.o 7 + obj-y += vmware.o hypervisor.o 7 8 8 9 obj-$(CONFIG_X86_32) += bugs.o cmpxchg.o 9 10 obj-$(CONFIG_X86_64) += bugs_64.o
+5 -3
arch/x86/kernel/cpu/common.c
··· 36 36 #include <asm/proto.h> 37 37 #include <asm/sections.h> 38 38 #include <asm/setup.h> 39 + #include <asm/hypervisor.h> 39 40 40 41 #include "cpu.h" 41 42 ··· 704 703 detect_ht(c); 705 704 #endif 706 705 706 + init_hypervisor(c); 707 707 /* 708 708 * On SMP, boot_cpu_data holds the common feature set between 709 709 * all CPUs; so make sure that we indicate which features are ··· 864 862 865 863 struct desc_ptr idt_descr = { 256 * 16 - 1, (unsigned long) idt_table }; 866 864 867 - char boot_cpu_stack[IRQSTACKSIZE] __page_aligned_bss; 865 + static char boot_cpu_stack[IRQSTACKSIZE] __page_aligned_bss; 868 866 869 867 void __cpuinit pda_init(int cpu) 870 868 { ··· 905 903 } 906 904 } 907 905 908 - char boot_exception_stacks[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + 909 - DEBUG_STKSZ] __page_aligned_bss; 906 + static char boot_exception_stacks[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + 907 + DEBUG_STKSZ] __page_aligned_bss; 910 908 911 909 extern asmlinkage void ignore_sysret(void); 912 910
+58
arch/x86/kernel/cpu/hypervisor.c
··· 1 + /* 2 + * Common hypervisor code 3 + * 4 + * Copyright (C) 2008, VMware, Inc. 5 + * Author : Alok N Kataria <akataria@vmware.com> 6 + * 7 + * This program is free software; you can redistribute it and/or modify 8 + * it under the terms of the GNU General Public License as published by 9 + * the Free Software Foundation; either version 2 of the License, or 10 + * (at your option) any later version. 11 + * 12 + * This program is distributed in the hope that it will be useful, but 13 + * WITHOUT ANY WARRANTY; without even the implied warranty of 14 + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or 15 + * NON INFRINGEMENT. See the GNU General Public License for more 16 + * details. 17 + * 18 + * You should have received a copy of the GNU General Public License 19 + * along with this program; if not, write to the Free Software 20 + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 21 + * 22 + */ 23 + 24 + #include <asm/processor.h> 25 + #include <asm/vmware.h> 26 + #include <asm/hypervisor.h> 27 + 28 + static inline void __cpuinit 29 + detect_hypervisor_vendor(struct cpuinfo_x86 *c) 30 + { 31 + if (vmware_platform()) { 32 + c->x86_hyper_vendor = X86_HYPER_VENDOR_VMWARE; 33 + } else { 34 + c->x86_hyper_vendor = X86_HYPER_VENDOR_NONE; 35 + } 36 + } 37 + 38 + unsigned long get_hypervisor_tsc_freq(void) 39 + { 40 + if (boot_cpu_data.x86_hyper_vendor == X86_HYPER_VENDOR_VMWARE) 41 + return vmware_get_tsc_khz(); 42 + return 0; 43 + } 44 + 45 + static inline void __cpuinit 46 + hypervisor_set_feature_bits(struct cpuinfo_x86 *c) 47 + { 48 + if (boot_cpu_data.x86_hyper_vendor == X86_HYPER_VENDOR_VMWARE) { 49 + vmware_set_feature_bits(c); 50 + return; 51 + } 52 + } 53 + 54 + void __cpuinit init_hypervisor(struct cpuinfo_x86 *c) 55 + { 56 + detect_hypervisor_vendor(c); 57 + hypervisor_set_feature_bits(c); 58 + }
+1 -2
arch/x86/kernel/cpu/intel.c
··· 307 307 set_cpu_cap(c, X86_FEATURE_P4); 308 308 if (c->x86 == 6) 309 309 set_cpu_cap(c, X86_FEATURE_P3); 310 + #endif 310 311 311 312 if (cpu_has_bts) 312 313 ptrace_bts_init_intel(c); 313 - 314 - #endif 315 314 316 315 detect_extended_topology(c); 317 316 if (!cpu_has(c, X86_FEATURE_XTOPOLOGY)) {
+7 -10
arch/x86/kernel/cpu/intel_cacheinfo.c
··· 644 644 return show_shared_cpu_map_func(leaf, 1, buf); 645 645 } 646 646 647 - static ssize_t show_type(struct _cpuid4_info *this_leaf, char *buf) { 648 - switch(this_leaf->eax.split.type) { 649 - case CACHE_TYPE_DATA: 647 + static ssize_t show_type(struct _cpuid4_info *this_leaf, char *buf) 648 + { 649 + switch (this_leaf->eax.split.type) { 650 + case CACHE_TYPE_DATA: 650 651 return sprintf(buf, "Data\n"); 651 - break; 652 - case CACHE_TYPE_INST: 652 + case CACHE_TYPE_INST: 653 653 return sprintf(buf, "Instruction\n"); 654 - break; 655 - case CACHE_TYPE_UNIFIED: 654 + case CACHE_TYPE_UNIFIED: 656 655 return sprintf(buf, "Unified\n"); 657 - break; 658 - default: 656 + default: 659 657 return sprintf(buf, "Unknown\n"); 660 - break; 661 658 } 662 659 } 663 660
+174 -178
arch/x86/kernel/cpu/mtrr/main.c
··· 803 803 } 804 804 805 805 static struct res_range __initdata range[RANGE_NUM]; 806 + static int __initdata nr_range; 806 807 807 808 #ifdef CONFIG_MTRR_SANITIZER 808 809 ··· 1207 1206 #define PSHIFT (PAGE_SHIFT - 10) 1208 1207 1209 1208 static struct mtrr_cleanup_result __initdata result[NUM_RESULT]; 1210 - static struct res_range __initdata range_new[RANGE_NUM]; 1211 1209 static unsigned long __initdata min_loss_pfn[RANGE_NUM]; 1212 1210 1213 - static int __init mtrr_cleanup(unsigned address_bits) 1211 + static void __init print_out_mtrr_range_state(void) 1214 1212 { 1215 - unsigned long extra_remove_base, extra_remove_size; 1216 - unsigned long base, size, def, dummy; 1217 - mtrr_type type; 1218 - int nr_range, nr_range_new; 1219 - u64 chunk_size, gran_size; 1220 - unsigned long range_sums, range_sums_new; 1221 - int index_good; 1222 - int num_reg_good; 1223 1213 int i; 1214 + char start_factor = 'K', size_factor = 'K'; 1215 + unsigned long start_base, size_base; 1216 + mtrr_type type; 1224 1217 1218 + for (i = 0; i < num_var_ranges; i++) { 1219 + 1220 + size_base = range_state[i].size_pfn << (PAGE_SHIFT - 10); 1221 + if (!size_base) 1222 + continue; 1223 + 1224 + size_base = to_size_factor(size_base, &size_factor), 1225 + start_base = range_state[i].base_pfn << (PAGE_SHIFT - 10); 1226 + start_base = to_size_factor(start_base, &start_factor), 1227 + type = range_state[i].type; 1228 + 1229 + printk(KERN_DEBUG "reg %d, base: %ld%cB, range: %ld%cB, type %s\n", 1230 + i, start_base, start_factor, 1231 + size_base, size_factor, 1232 + (type == MTRR_TYPE_UNCACHABLE) ? "UC" : 1233 + ((type == MTRR_TYPE_WRPROT) ? "WP" : 1234 + ((type == MTRR_TYPE_WRBACK) ? 
"WB" : "Other")) 1235 + ); 1236 + } 1237 + } 1238 + 1239 + static int __init mtrr_need_cleanup(void) 1240 + { 1241 + int i; 1242 + mtrr_type type; 1243 + unsigned long size; 1225 1244 /* extra one for all 0 */ 1226 1245 int num[MTRR_NUM_TYPES + 1]; 1227 - 1228 - if (!is_cpu(INTEL) || enable_mtrr_cleanup < 1) 1229 - return 0; 1230 - rdmsr(MTRRdefType_MSR, def, dummy); 1231 - def &= 0xff; 1232 - if (def != MTRR_TYPE_UNCACHABLE) 1233 - return 0; 1234 - 1235 - /* get it and store it aside */ 1236 - memset(range_state, 0, sizeof(range_state)); 1237 - for (i = 0; i < num_var_ranges; i++) { 1238 - mtrr_if->get(i, &base, &size, &type); 1239 - range_state[i].base_pfn = base; 1240 - range_state[i].size_pfn = size; 1241 - range_state[i].type = type; 1242 - } 1243 1246 1244 1247 /* check entries number */ 1245 1248 memset(num, 0, sizeof(num)); ··· 1268 1263 num_var_ranges - num[MTRR_NUM_TYPES]) 1269 1264 return 0; 1270 1265 1266 + return 1; 1267 + } 1268 + 1269 + static unsigned long __initdata range_sums; 1270 + static void __init mtrr_calc_range_state(u64 chunk_size, u64 gran_size, 1271 + unsigned long extra_remove_base, 1272 + unsigned long extra_remove_size, 1273 + int i) 1274 + { 1275 + int num_reg; 1276 + static struct res_range range_new[RANGE_NUM]; 1277 + static int nr_range_new; 1278 + unsigned long range_sums_new; 1279 + 1280 + /* convert ranges to var ranges state */ 1281 + num_reg = x86_setup_var_mtrrs(range, nr_range, 1282 + chunk_size, gran_size); 1283 + 1284 + /* we got new setting in range_state, check it */ 1285 + memset(range_new, 0, sizeof(range_new)); 1286 + nr_range_new = x86_get_mtrr_mem_range(range_new, 0, 1287 + extra_remove_base, extra_remove_size); 1288 + range_sums_new = sum_ranges(range_new, nr_range_new); 1289 + 1290 + result[i].chunk_sizek = chunk_size >> 10; 1291 + result[i].gran_sizek = gran_size >> 10; 1292 + result[i].num_reg = num_reg; 1293 + if (range_sums < range_sums_new) { 1294 + result[i].lose_cover_sizek = 1295 + (range_sums_new - 
range_sums) << PSHIFT; 1296 + result[i].bad = 1; 1297 + } else 1298 + result[i].lose_cover_sizek = 1299 + (range_sums - range_sums_new) << PSHIFT; 1300 + 1301 + /* double check it */ 1302 + if (!result[i].bad && !result[i].lose_cover_sizek) { 1303 + if (nr_range_new != nr_range || 1304 + memcmp(range, range_new, sizeof(range))) 1305 + result[i].bad = 1; 1306 + } 1307 + 1308 + if (!result[i].bad && (range_sums - range_sums_new < 1309 + min_loss_pfn[num_reg])) { 1310 + min_loss_pfn[num_reg] = 1311 + range_sums - range_sums_new; 1312 + } 1313 + } 1314 + 1315 + static void __init mtrr_print_out_one_result(int i) 1316 + { 1317 + char gran_factor, chunk_factor, lose_factor; 1318 + unsigned long gran_base, chunk_base, lose_base; 1319 + 1320 + gran_base = to_size_factor(result[i].gran_sizek, &gran_factor), 1321 + chunk_base = to_size_factor(result[i].chunk_sizek, &chunk_factor), 1322 + lose_base = to_size_factor(result[i].lose_cover_sizek, &lose_factor), 1323 + printk(KERN_INFO "%sgran_size: %ld%c \tchunk_size: %ld%c \t", 1324 + result[i].bad ? "*BAD*" : " ", 1325 + gran_base, gran_factor, chunk_base, chunk_factor); 1326 + printk(KERN_CONT "num_reg: %d \tlose cover RAM: %s%ld%c\n", 1327 + result[i].num_reg, result[i].bad ? 
"-" : "", 1328 + lose_base, lose_factor); 1329 + } 1330 + 1331 + static int __init mtrr_search_optimal_index(void) 1332 + { 1333 + int i; 1334 + int num_reg_good; 1335 + int index_good; 1336 + 1337 + if (nr_mtrr_spare_reg >= num_var_ranges) 1338 + nr_mtrr_spare_reg = num_var_ranges - 1; 1339 + num_reg_good = -1; 1340 + for (i = num_var_ranges - nr_mtrr_spare_reg; i > 0; i--) { 1341 + if (!min_loss_pfn[i]) 1342 + num_reg_good = i; 1343 + } 1344 + 1345 + index_good = -1; 1346 + if (num_reg_good != -1) { 1347 + for (i = 0; i < NUM_RESULT; i++) { 1348 + if (!result[i].bad && 1349 + result[i].num_reg == num_reg_good && 1350 + !result[i].lose_cover_sizek) { 1351 + index_good = i; 1352 + break; 1353 + } 1354 + } 1355 + } 1356 + 1357 + return index_good; 1358 + } 1359 + 1360 + 1361 + static int __init mtrr_cleanup(unsigned address_bits) 1362 + { 1363 + unsigned long extra_remove_base, extra_remove_size; 1364 + unsigned long base, size, def, dummy; 1365 + mtrr_type type; 1366 + u64 chunk_size, gran_size; 1367 + int index_good; 1368 + int i; 1369 + 1370 + if (!is_cpu(INTEL) || enable_mtrr_cleanup < 1) 1371 + return 0; 1372 + rdmsr(MTRRdefType_MSR, def, dummy); 1373 + def &= 0xff; 1374 + if (def != MTRR_TYPE_UNCACHABLE) 1375 + return 0; 1376 + 1377 + /* get it and store it aside */ 1378 + memset(range_state, 0, sizeof(range_state)); 1379 + for (i = 0; i < num_var_ranges; i++) { 1380 + mtrr_if->get(i, &base, &size, &type); 1381 + range_state[i].base_pfn = base; 1382 + range_state[i].size_pfn = size; 1383 + range_state[i].type = type; 1384 + } 1385 + 1386 + /* check if we need handle it and can handle it */ 1387 + if (!mtrr_need_cleanup()) 1388 + return 0; 1389 + 1271 1390 /* print original var MTRRs at first, for debugging: */ 1272 1391 printk(KERN_DEBUG "original variable MTRRs\n"); 1273 - for (i = 0; i < num_var_ranges; i++) { 1274 - char start_factor = 'K', size_factor = 'K'; 1275 - unsigned long start_base, size_base; 1276 - 1277 - size_base = range_state[i].size_pfn << 
(PAGE_SHIFT - 10); 1278 - if (!size_base) 1279 - continue; 1280 - 1281 - size_base = to_size_factor(size_base, &size_factor), 1282 - start_base = range_state[i].base_pfn << (PAGE_SHIFT - 10); 1283 - start_base = to_size_factor(start_base, &start_factor), 1284 - type = range_state[i].type; 1285 - 1286 - printk(KERN_DEBUG "reg %d, base: %ld%cB, range: %ld%cB, type %s\n", 1287 - i, start_base, start_factor, 1288 - size_base, size_factor, 1289 - (type == MTRR_TYPE_UNCACHABLE) ? "UC" : 1290 - ((type == MTRR_TYPE_WRPROT) ? "WP" : 1291 - ((type == MTRR_TYPE_WRBACK) ? "WB" : "Other")) 1292 - ); 1293 - } 1392 + print_out_mtrr_range_state(); 1294 1393 1295 1394 memset(range, 0, sizeof(range)); 1296 1395 extra_remove_size = 0; ··· 1418 1309 range_sums >> (20 - PAGE_SHIFT)); 1419 1310 1420 1311 if (mtrr_chunk_size && mtrr_gran_size) { 1421 - int num_reg; 1422 - char gran_factor, chunk_factor, lose_factor; 1423 - unsigned long gran_base, chunk_base, lose_base; 1424 - 1425 - debug_print++; 1426 - /* convert ranges to var ranges state */ 1427 - num_reg = x86_setup_var_mtrrs(range, nr_range, mtrr_chunk_size, 1428 - mtrr_gran_size); 1429 - 1430 - /* we got new setting in range_state, check it */ 1431 - memset(range_new, 0, sizeof(range_new)); 1432 - nr_range_new = x86_get_mtrr_mem_range(range_new, 0, 1433 - extra_remove_base, 1434 - extra_remove_size); 1435 - range_sums_new = sum_ranges(range_new, nr_range_new); 1436 - 1437 1312 i = 0; 1438 - result[i].chunk_sizek = mtrr_chunk_size >> 10; 1439 - result[i].gran_sizek = mtrr_gran_size >> 10; 1440 - result[i].num_reg = num_reg; 1441 - if (range_sums < range_sums_new) { 1442 - result[i].lose_cover_sizek = 1443 - (range_sums_new - range_sums) << PSHIFT; 1444 - result[i].bad = 1; 1445 - } else 1446 - result[i].lose_cover_sizek = 1447 - (range_sums - range_sums_new) << PSHIFT; 1313 + mtrr_calc_range_state(mtrr_chunk_size, mtrr_gran_size, 1314 + extra_remove_base, extra_remove_size, i); 1448 1315 1449 - gran_base = 
to_size_factor(result[i].gran_sizek, &gran_factor), 1450 - chunk_base = to_size_factor(result[i].chunk_sizek, &chunk_factor), 1451 - lose_base = to_size_factor(result[i].lose_cover_sizek, &lose_factor), 1452 - printk(KERN_INFO "%sgran_size: %ld%c \tchunk_size: %ld%c \t", 1453 - result[i].bad?"*BAD*":" ", 1454 - gran_base, gran_factor, chunk_base, chunk_factor); 1455 - printk(KERN_CONT "num_reg: %d \tlose cover RAM: %s%ld%c\n", 1456 - result[i].num_reg, result[i].bad?"-":"", 1457 - lose_base, lose_factor); 1316 + mtrr_print_out_one_result(i); 1317 + 1458 1318 if (!result[i].bad) { 1459 1319 set_var_mtrr_all(address_bits); 1460 1320 return 1; 1461 1321 } 1462 1322 printk(KERN_INFO "invalid mtrr_gran_size or mtrr_chunk_size, " 1463 1323 "will find optimal one\n"); 1464 - debug_print--; 1465 - memset(result, 0, sizeof(result[0])); 1466 1324 } 1467 1325 1468 1326 i = 0; 1469 1327 memset(min_loss_pfn, 0xff, sizeof(min_loss_pfn)); 1470 1328 memset(result, 0, sizeof(result)); 1471 1329 for (gran_size = (1ULL<<16); gran_size < (1ULL<<32); gran_size <<= 1) { 1472 - char gran_factor; 1473 - unsigned long gran_base; 1474 - 1475 - if (debug_print) 1476 - gran_base = to_size_factor(gran_size >> 10, &gran_factor); 1477 1330 1478 1331 for (chunk_size = gran_size; chunk_size < (1ULL<<32); 1479 1332 chunk_size <<= 1) { 1480 - int num_reg; 1481 1333 1482 - if (debug_print) { 1483 - char chunk_factor; 1484 - unsigned long chunk_base; 1485 - 1486 - chunk_base = to_size_factor(chunk_size>>10, &chunk_factor), 1487 - printk(KERN_INFO "\n"); 1488 - printk(KERN_INFO "gran_size: %ld%c chunk_size: %ld%c \n", 1489 - gran_base, gran_factor, chunk_base, chunk_factor); 1490 - } 1491 1334 if (i >= NUM_RESULT) 1492 1335 continue; 1493 1336 1494 - /* convert ranges to var ranges state */ 1495 - num_reg = x86_setup_var_mtrrs(range, nr_range, 1496 - chunk_size, gran_size); 1497 - 1498 - /* we got new setting in range_state, check it */ 1499 - memset(range_new, 0, sizeof(range_new)); 1500 - 
nr_range_new = x86_get_mtrr_mem_range(range_new, 0, 1501 - extra_remove_base, extra_remove_size); 1502 - range_sums_new = sum_ranges(range_new, nr_range_new); 1503 - 1504 - result[i].chunk_sizek = chunk_size >> 10; 1505 - result[i].gran_sizek = gran_size >> 10; 1506 - result[i].num_reg = num_reg; 1507 - if (range_sums < range_sums_new) { 1508 - result[i].lose_cover_sizek = 1509 - (range_sums_new - range_sums) << PSHIFT; 1510 - result[i].bad = 1; 1511 - } else 1512 - result[i].lose_cover_sizek = 1513 - (range_sums - range_sums_new) << PSHIFT; 1514 - 1515 - /* double check it */ 1516 - if (!result[i].bad && !result[i].lose_cover_sizek) { 1517 - if (nr_range_new != nr_range || 1518 - memcmp(range, range_new, sizeof(range))) 1519 - result[i].bad = 1; 1337 + mtrr_calc_range_state(chunk_size, gran_size, 1338 + extra_remove_base, extra_remove_size, i); 1339 + if (debug_print) { 1340 + mtrr_print_out_one_result(i); 1341 + printk(KERN_INFO "\n"); 1520 1342 } 1521 1343 1522 - if (!result[i].bad && (range_sums - range_sums_new < 1523 - min_loss_pfn[num_reg])) { 1524 - min_loss_pfn[num_reg] = 1525 - range_sums - range_sums_new; 1526 - } 1527 1344 i++; 1528 1345 } 1529 1346 } 1530 1347 1531 - /* print out all */ 1532 - for (i = 0; i < NUM_RESULT; i++) { 1533 - char gran_factor, chunk_factor, lose_factor; 1534 - unsigned long gran_base, chunk_base, lose_base; 1535 - 1536 - gran_base = to_size_factor(result[i].gran_sizek, &gran_factor), 1537 - chunk_base = to_size_factor(result[i].chunk_sizek, &chunk_factor), 1538 - lose_base = to_size_factor(result[i].lose_cover_sizek, &lose_factor), 1539 - printk(KERN_INFO "%sgran_size: %ld%c \tchunk_size: %ld%c \t", 1540 - result[i].bad?"*BAD*":" ", 1541 - gran_base, gran_factor, chunk_base, chunk_factor); 1542 - printk(KERN_CONT "num_reg: %d \tlose cover RAM: %s%ld%c\n", 1543 - result[i].num_reg, result[i].bad?"-":"", 1544 - lose_base, lose_factor); 1545 - } 1546 - 1547 1348 /* try to find the optimal index */ 1548 - if (nr_mtrr_spare_reg >= 
num_var_ranges) 1549 - nr_mtrr_spare_reg = num_var_ranges - 1; 1550 - num_reg_good = -1; 1551 - for (i = num_var_ranges - nr_mtrr_spare_reg; i > 0; i--) { 1552 - if (!min_loss_pfn[i]) 1553 - num_reg_good = i; 1554 - } 1555 - 1556 - index_good = -1; 1557 - if (num_reg_good != -1) { 1558 - for (i = 0; i < NUM_RESULT; i++) { 1559 - if (!result[i].bad && 1560 - result[i].num_reg == num_reg_good && 1561 - !result[i].lose_cover_sizek) { 1562 - index_good = i; 1563 - break; 1564 - } 1565 - } 1566 - } 1349 + index_good = mtrr_search_optimal_index(); 1567 1350 1568 1351 if (index_good != -1) { 1569 - char gran_factor, chunk_factor, lose_factor; 1570 - unsigned long gran_base, chunk_base, lose_base; 1571 - 1572 1352 printk(KERN_INFO "Found optimal setting for mtrr clean up\n"); 1573 1353 i = index_good; 1574 - gran_base = to_size_factor(result[i].gran_sizek, &gran_factor), 1575 - chunk_base = to_size_factor(result[i].chunk_sizek, &chunk_factor), 1576 - lose_base = to_size_factor(result[i].lose_cover_sizek, &lose_factor), 1577 - printk(KERN_INFO "gran_size: %ld%c \tchunk_size: %ld%c \t", 1578 - gran_base, gran_factor, chunk_base, chunk_factor); 1579 - printk(KERN_CONT "num_reg: %d \tlose RAM: %ld%c\n", 1580 - result[i].num_reg, lose_base, lose_factor); 1354 + mtrr_print_out_one_result(i); 1355 + 1581 1356 /* convert ranges to var ranges state */ 1582 1357 chunk_size = result[i].chunk_sizek; 1583 1358 chunk_size <<= 10; 1584 1359 gran_size = result[i].gran_sizek; 1585 1360 gran_size <<= 10; 1586 - debug_print++; 1587 1361 x86_setup_var_mtrrs(range, nr_range, chunk_size, gran_size); 1588 - debug_print--; 1589 1362 set_var_mtrr_all(address_bits); 1363 + printk(KERN_DEBUG "New variable MTRRs\n"); 1364 + print_out_mtrr_range_state(); 1590 1365 return 1; 1366 + } else { 1367 + /* print out all */ 1368 + for (i = 0; i < NUM_RESULT; i++) 1369 + mtrr_print_out_one_result(i); 1591 1370 } 1592 1371 1593 1372 printk(KERN_INFO "mtrr_cleanup: can not find optimal value\n"); ··· 1559 1562 { 
1560 1563 unsigned long i, base, size, highest_pfn = 0, def, dummy; 1561 1564 mtrr_type type; 1562 - int nr_range; 1563 1565 u64 total_trim_size; 1564 1566 1565 1567 /* extra one for all 0 */
+112
arch/x86/kernel/cpu/vmware.c
··· 1 + /* 2 + * VMware Detection code. 3 + * 4 + * Copyright (C) 2008, VMware, Inc. 5 + * Author : Alok N Kataria <akataria@vmware.com> 6 + * 7 + * This program is free software; you can redistribute it and/or modify 8 + * it under the terms of the GNU General Public License as published by 9 + * the Free Software Foundation; either version 2 of the License, or 10 + * (at your option) any later version. 11 + * 12 + * This program is distributed in the hope that it will be useful, but 13 + * WITHOUT ANY WARRANTY; without even the implied warranty of 14 + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or 15 + * NON INFRINGEMENT. See the GNU General Public License for more 16 + * details. 17 + * 18 + * You should have received a copy of the GNU General Public License 19 + * along with this program; if not, write to the Free Software 20 + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 21 + * 22 + */ 23 + 24 + #include <linux/dmi.h> 25 + #include <asm/div64.h> 26 + #include <asm/vmware.h> 27 + 28 + #define CPUID_VMWARE_INFO_LEAF 0x40000000 29 + #define VMWARE_HYPERVISOR_MAGIC 0x564D5868 30 + #define VMWARE_HYPERVISOR_PORT 0x5658 31 + 32 + #define VMWARE_PORT_CMD_GETVERSION 10 33 + #define VMWARE_PORT_CMD_GETHZ 45 34 + 35 + #define VMWARE_PORT(cmd, eax, ebx, ecx, edx) \ 36 + __asm__("inl (%%dx)" : \ 37 + "=a"(eax), "=c"(ecx), "=d"(edx), "=b"(ebx) : \ 38 + "0"(VMWARE_HYPERVISOR_MAGIC), \ 39 + "1"(VMWARE_PORT_CMD_##cmd), \ 40 + "2"(VMWARE_HYPERVISOR_PORT), "3"(UINT_MAX) : \ 41 + "memory"); 42 + 43 + static inline int __vmware_platform(void) 44 + { 45 + uint32_t eax, ebx, ecx, edx; 46 + VMWARE_PORT(GETVERSION, eax, ebx, ecx, edx); 47 + return eax != (uint32_t)-1 && ebx == VMWARE_HYPERVISOR_MAGIC; 48 + } 49 + 50 + static unsigned long __vmware_get_tsc_khz(void) 51 + { 52 + uint64_t tsc_hz; 53 + uint32_t eax, ebx, ecx, edx; 54 + 55 + VMWARE_PORT(GETHZ, eax, ebx, ecx, edx); 56 + 57 + if (ebx == UINT_MAX) 58 + return 0; 59 + 
tsc_hz = eax | (((uint64_t)ebx) << 32); 60 + do_div(tsc_hz, 1000); 61 + BUG_ON(tsc_hz >> 32); 62 + return tsc_hz; 63 + } 64 + 65 + /* 66 + * While checking the dmi string information, just checking the product 67 + * serial key should be enough, as this will always have a VMware 68 + * specific string when running under VMware hypervisor. 69 + */ 70 + int vmware_platform(void) 71 + { 72 + if (cpu_has_hypervisor) { 73 + unsigned int eax, ebx, ecx, edx; 74 + char hyper_vendor_id[13]; 75 + 76 + cpuid(CPUID_VMWARE_INFO_LEAF, &eax, &ebx, &ecx, &edx); 77 + memcpy(hyper_vendor_id + 0, &ebx, 4); 78 + memcpy(hyper_vendor_id + 4, &ecx, 4); 79 + memcpy(hyper_vendor_id + 8, &edx, 4); 80 + hyper_vendor_id[12] = '\0'; 81 + if (!strcmp(hyper_vendor_id, "VMwareVMware")) 82 + return 1; 83 + } else if (dmi_available && dmi_name_in_serial("VMware") && 84 + __vmware_platform()) 85 + return 1; 86 + 87 + return 0; 88 + } 89 + 90 + unsigned long vmware_get_tsc_khz(void) 91 + { 92 + BUG_ON(!vmware_platform()); 93 + return __vmware_get_tsc_khz(); 94 + } 95 + 96 + /* 97 + * VMware hypervisor takes care of exporting a reliable TSC to the guest. 98 + * Still, due to timing differences when running on virtual cpus, the TSC can 99 + * be marked as unstable in some cases. For example, the TSC sync check at 100 + * bootup can fail due to a marginal offset between vcpus' TSCs (though the 101 + * TSCs do not drift from each other). Also, the ACPI PM timer clocksource 102 + * is not suitable as a watchdog when running on a hypervisor because the 103 + * kernel may miss a wrap of the counter if the vcpu is descheduled for a 104 + * long time. To skip these checks at runtime we set these capability bits, 105 + * so that the kernel could just trust the hypervisor with providing a 106 + * reliable virtual TSC that is suitable for timekeeping. 
107 + */ 108 + void __cpuinit vmware_set_feature_bits(struct cpuinfo_x86 *c) 109 + { 110 + set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC); 111 + set_cpu_cap(c, X86_FEATURE_TSC_RELIABLE); 112 + }
+12 -66
arch/x86/kernel/crash.c
··· 29 29 30 30 #include <mach_ipi.h> 31 31 32 - /* This keeps a track of which one is crashing cpu. */ 33 - static int crashing_cpu; 34 32 35 33 #if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC) 36 - static atomic_t waiting_for_crash_ipi; 37 34 38 - static int crash_nmi_callback(struct notifier_block *self, 39 - unsigned long val, void *data) 35 + static void kdump_nmi_callback(int cpu, struct die_args *args) 40 36 { 41 37 struct pt_regs *regs; 42 38 #ifdef CONFIG_X86_32 43 39 struct pt_regs fixed_regs; 44 40 #endif 45 - int cpu; 46 41 47 - if (val != DIE_NMI_IPI) 48 - return NOTIFY_OK; 49 - 50 - regs = ((struct die_args *)data)->regs; 51 - cpu = raw_smp_processor_id(); 52 - 53 - /* Don't do anything if this handler is invoked on crashing cpu. 54 - * Otherwise, system will completely hang. Crashing cpu can get 55 - * an NMI if system was initially booted with nmi_watchdog parameter. 56 - */ 57 - if (cpu == crashing_cpu) 58 - return NOTIFY_STOP; 59 - local_irq_disable(); 42 + regs = args->regs; 60 43 61 44 #ifdef CONFIG_X86_32 62 45 if (!user_mode_vm(regs)) { ··· 48 65 } 49 66 #endif 50 67 crash_save_cpu(regs, cpu); 51 - disable_local_APIC(); 52 - atomic_dec(&waiting_for_crash_ipi); 53 - /* Assume hlt works */ 54 - halt(); 55 - for (;;) 56 - cpu_relax(); 57 68 58 - return 1; 59 - } 60 - 61 - static void smp_send_nmi_allbutself(void) 62 - { 63 - cpumask_t mask = cpu_online_map; 64 - cpu_clear(safe_smp_processor_id(), mask); 65 - if (!cpus_empty(mask)) 66 - send_IPI_mask(mask, NMI_VECTOR); 67 - } 68 - 69 - static struct notifier_block crash_nmi_nb = { 70 - .notifier_call = crash_nmi_callback, 71 - }; 72 - 73 - static void nmi_shootdown_cpus(void) 74 - { 75 - unsigned long msecs; 76 - 77 - atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); 78 - /* Would it be better to replace the trap vector here? */ 79 - if (register_die_notifier(&crash_nmi_nb)) 80 - return; /* return what? 
*/ 81 - /* Ensure the new callback function is set before sending 82 - * out the NMI 83 - */ 84 - wmb(); 85 - 86 - smp_send_nmi_allbutself(); 87 - 88 - msecs = 1000; /* Wait at most a second for the other cpus to stop */ 89 - while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) { 90 - mdelay(1); 91 - msecs--; 92 - } 93 - 94 - /* Leave the nmi callback set */ 95 69 disable_local_APIC(); 96 70 } 71 + 72 + static void kdump_nmi_shootdown_cpus(void) 73 + { 74 + nmi_shootdown_cpus(kdump_nmi_callback); 75 + 76 + disable_local_APIC(); 77 + } 78 + 97 79 #else 98 - static void nmi_shootdown_cpus(void) 80 + static void kdump_nmi_shootdown_cpus(void) 99 81 { 100 82 /* There are no cpus to shootdown */ 101 83 } ··· 79 131 /* The kernel is broken so disable interrupts */ 80 132 local_irq_disable(); 81 133 82 - /* Make a note of crashing cpu. Will be used in NMI callback.*/ 83 - crashing_cpu = safe_smp_processor_id(); 84 - nmi_shootdown_cpus(); 134 + kdump_nmi_shootdown_cpus(); 85 135 lapic_shutdown(); 86 136 #if defined(CONFIG_X86_IO_APIC) 87 137 disable_IO_APIC();
+4 -5
arch/x86/kernel/ds.c
··· 847 847 switch (c->x86) { 848 848 case 0x6: 849 849 switch (c->x86_model) { 850 + case 0 ... 0xC: 851 + /* sorry, don't know about them */ 852 + break; 850 853 case 0xD: 851 854 case 0xE: /* Pentium M */ 852 855 ds_configure(&ds_cfg_var); 853 856 break; 854 - case 0xF: /* Core2 */ 855 - case 0x1C: /* Atom */ 857 + default: /* Core2, Atom, ... */ 856 858 ds_configure(&ds_cfg_64); 857 - break; 858 - default: 859 - /* sorry, don't know about them */ 860 859 break; 861 860 } 862 861 break;
+319
arch/x86/kernel/dumpstack.c
··· 1 + /* 2 + * Copyright (C) 1991, 1992 Linus Torvalds 3 + * Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs 4 + */ 5 + #include <linux/kallsyms.h> 6 + #include <linux/kprobes.h> 7 + #include <linux/uaccess.h> 8 + #include <linux/utsname.h> 9 + #include <linux/hardirq.h> 10 + #include <linux/kdebug.h> 11 + #include <linux/module.h> 12 + #include <linux/ptrace.h> 13 + #include <linux/kexec.h> 14 + #include <linux/bug.h> 15 + #include <linux/nmi.h> 16 + #include <linux/sysfs.h> 17 + 18 + #include <asm/stacktrace.h> 19 + 20 + #include "dumpstack.h" 21 + 22 + int panic_on_unrecovered_nmi; 23 + unsigned int code_bytes = 64; 24 + int kstack_depth_to_print = 3 * STACKSLOTS_PER_LINE; 25 + static int die_counter; 26 + 27 + void printk_address(unsigned long address, int reliable) 28 + { 29 + printk(" [<%p>] %s%pS\n", (void *) address, 30 + reliable ? "" : "? ", (void *) address); 31 + } 32 + 33 + /* 34 + * x86-64 can have up to three kernel stacks: 35 + * process stack 36 + * interrupt stack 37 + * severe exception (double fault, nmi, stack fault, debug, mce) hardware stack 38 + */ 39 + 40 + static inline int valid_stack_ptr(struct thread_info *tinfo, 41 + void *p, unsigned int size, void *end) 42 + { 43 + void *t = tinfo; 44 + if (end) { 45 + if (p < end && p >= (end-THREAD_SIZE)) 46 + return 1; 47 + else 48 + return 0; 49 + } 50 + return p > t && p < t + THREAD_SIZE - size; 51 + } 52 + 53 + unsigned long 54 + print_context_stack(struct thread_info *tinfo, 55 + unsigned long *stack, unsigned long bp, 56 + const struct stacktrace_ops *ops, void *data, 57 + unsigned long *end) 58 + { 59 + struct stack_frame *frame = (struct stack_frame *)bp; 60 + 61 + while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) { 62 + unsigned long addr; 63 + 64 + addr = *stack; 65 + if (__kernel_text_address(addr)) { 66 + if ((unsigned long) stack == bp + sizeof(long)) { 67 + ops->address(data, addr, 1); 68 + frame = frame->next_frame; 69 + bp = (unsigned long) frame; 70 + } else { 71 + 
ops->address(data, addr, bp == 0); 72 + } 73 + } 74 + stack++; 75 + } 76 + return bp; 77 + } 78 + 79 + 80 + static void 81 + print_trace_warning_symbol(void *data, char *msg, unsigned long symbol) 82 + { 83 + printk(data); 84 + print_symbol(msg, symbol); 85 + printk("\n"); 86 + } 87 + 88 + static void print_trace_warning(void *data, char *msg) 89 + { 90 + printk("%s%s\n", (char *)data, msg); 91 + } 92 + 93 + static int print_trace_stack(void *data, char *name) 94 + { 95 + printk("%s <%s> ", (char *)data, name); 96 + return 0; 97 + } 98 + 99 + /* 100 + * Print one address/symbol entries per line. 101 + */ 102 + static void print_trace_address(void *data, unsigned long addr, int reliable) 103 + { 104 + touch_nmi_watchdog(); 105 + printk(data); 106 + printk_address(addr, reliable); 107 + } 108 + 109 + static const struct stacktrace_ops print_trace_ops = { 110 + .warning = print_trace_warning, 111 + .warning_symbol = print_trace_warning_symbol, 112 + .stack = print_trace_stack, 113 + .address = print_trace_address, 114 + }; 115 + 116 + void 117 + show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 118 + unsigned long *stack, unsigned long bp, char *log_lvl) 119 + { 120 + printk("%sCall Trace:\n", log_lvl); 121 + dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl); 122 + } 123 + 124 + void show_trace(struct task_struct *task, struct pt_regs *regs, 125 + unsigned long *stack, unsigned long bp) 126 + { 127 + show_trace_log_lvl(task, regs, stack, bp, ""); 128 + } 129 + 130 + void show_stack(struct task_struct *task, unsigned long *sp) 131 + { 132 + show_stack_log_lvl(task, NULL, sp, 0, ""); 133 + } 134 + 135 + /* 136 + * The architecture-independent dump_stack generator 137 + */ 138 + void dump_stack(void) 139 + { 140 + unsigned long bp = 0; 141 + unsigned long stack; 142 + 143 + #ifdef CONFIG_FRAME_POINTER 144 + if (!bp) 145 + get_bp(bp); 146 + #endif 147 + 148 + printk("Pid: %d, comm: %.20s %s %s %.*s\n", 149 + current->pid, current->comm, 
print_tainted(), 150 + init_utsname()->release, 151 + (int)strcspn(init_utsname()->version, " "), 152 + init_utsname()->version); 153 + show_trace(NULL, NULL, &stack, bp); 154 + } 155 + EXPORT_SYMBOL(dump_stack); 156 + 157 + static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED; 158 + static int die_owner = -1; 159 + static unsigned int die_nest_count; 160 + 161 + unsigned __kprobes long oops_begin(void) 162 + { 163 + int cpu; 164 + unsigned long flags; 165 + 166 + oops_enter(); 167 + 168 + /* racy, but better than risking deadlock. */ 169 + raw_local_irq_save(flags); 170 + cpu = smp_processor_id(); 171 + if (!__raw_spin_trylock(&die_lock)) { 172 + if (cpu == die_owner) 173 + /* nested oops. should stop eventually */; 174 + else 175 + __raw_spin_lock(&die_lock); 176 + } 177 + die_nest_count++; 178 + die_owner = cpu; 179 + console_verbose(); 180 + bust_spinlocks(1); 181 + return flags; 182 + } 183 + 184 + void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr) 185 + { 186 + if (regs && kexec_should_crash(current)) 187 + crash_kexec(regs); 188 + 189 + bust_spinlocks(0); 190 + die_owner = -1; 191 + add_taint(TAINT_DIE); 192 + die_nest_count--; 193 + if (!die_nest_count) 194 + /* Nest count reaches zero, release the lock. 
*/ 195 + __raw_spin_unlock(&die_lock); 196 + raw_local_irq_restore(flags); 197 + oops_exit(); 198 + 199 + if (!signr) 200 + return; 201 + if (in_interrupt()) 202 + panic("Fatal exception in interrupt"); 203 + if (panic_on_oops) 204 + panic("Fatal exception"); 205 + do_exit(signr); 206 + } 207 + 208 + int __kprobes __die(const char *str, struct pt_regs *regs, long err) 209 + { 210 + #ifdef CONFIG_X86_32 211 + unsigned short ss; 212 + unsigned long sp; 213 + #endif 214 + printk(KERN_EMERG "%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter); 215 + #ifdef CONFIG_PREEMPT 216 + printk("PREEMPT "); 217 + #endif 218 + #ifdef CONFIG_SMP 219 + printk("SMP "); 220 + #endif 221 + #ifdef CONFIG_DEBUG_PAGEALLOC 222 + printk("DEBUG_PAGEALLOC"); 223 + #endif 224 + printk("\n"); 225 + sysfs_printk_last_file(); 226 + if (notify_die(DIE_OOPS, str, regs, err, 227 + current->thread.trap_no, SIGSEGV) == NOTIFY_STOP) 228 + return 1; 229 + 230 + show_registers(regs); 231 + #ifdef CONFIG_X86_32 232 + sp = (unsigned long) (&regs->sp); 233 + savesegment(ss, ss); 234 + if (user_mode(regs)) { 235 + sp = regs->sp; 236 + ss = regs->ss & 0xffff; 237 + } 238 + printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip); 239 + print_symbol("%s", regs->ip); 240 + printk(" SS:ESP %04x:%08lx\n", ss, sp); 241 + #else 242 + /* Executive summary in case the oops scrolled away */ 243 + printk(KERN_ALERT "RIP "); 244 + printk_address(regs->ip, 1); 245 + printk(" RSP <%016lx>\n", regs->sp); 246 + #endif 247 + return 0; 248 + } 249 + 250 + /* 251 + * This is gone through when something in the kernel has done something bad 252 + * and is about to be terminated: 253 + */ 254 + void die(const char *str, struct pt_regs *regs, long err) 255 + { 256 + unsigned long flags = oops_begin(); 257 + int sig = SIGSEGV; 258 + 259 + if (!user_mode_vm(regs)) 260 + report_bug(regs->ip, regs); 261 + 262 + if (__die(str, regs, err)) 263 + sig = 0; 264 + oops_end(flags, regs, sig); 265 + } 266 + 267 + void notrace __kprobes 268 + 
die_nmi(char *str, struct pt_regs *regs, int do_panic) 269 + { 270 + unsigned long flags; 271 + 272 + if (notify_die(DIE_NMIWATCHDOG, str, regs, 0, 2, SIGINT) == NOTIFY_STOP) 273 + return; 274 + 275 + /* 276 + * We are in trouble anyway, lets at least try 277 + * to get a message out. 278 + */ 279 + flags = oops_begin(); 280 + printk(KERN_EMERG "%s", str); 281 + printk(" on CPU%d, ip %08lx, registers:\n", 282 + smp_processor_id(), regs->ip); 283 + show_registers(regs); 284 + oops_end(flags, regs, 0); 285 + if (do_panic || panic_on_oops) 286 + panic("Non maskable interrupt"); 287 + nmi_exit(); 288 + local_irq_enable(); 289 + do_exit(SIGBUS); 290 + } 291 + 292 + static int __init oops_setup(char *s) 293 + { 294 + if (!s) 295 + return -EINVAL; 296 + if (!strcmp(s, "panic")) 297 + panic_on_oops = 1; 298 + return 0; 299 + } 300 + early_param("oops", oops_setup); 301 + 302 + static int __init kstack_setup(char *s) 303 + { 304 + if (!s) 305 + return -EINVAL; 306 + kstack_depth_to_print = simple_strtoul(s, NULL, 0); 307 + return 0; 308 + } 309 + early_param("kstack", kstack_setup); 310 + 311 + static int __init code_bytes_setup(char *s) 312 + { 313 + code_bytes = simple_strtoul(s, NULL, 0); 314 + if (code_bytes > 8192) 315 + code_bytes = 8192; 316 + 317 + return 1; 318 + } 319 + __setup("code_bytes=", code_bytes_setup);
+39
arch/x86/kernel/dumpstack.h
··· 1 + /* 2 + * Copyright (C) 1991, 1992 Linus Torvalds 3 + * Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs 4 + */ 5 + 6 + #ifndef DUMPSTACK_H 7 + #define DUMPSTACK_H 8 + 9 + #ifdef CONFIG_X86_32 10 + #define STACKSLOTS_PER_LINE 8 11 + #define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :) 12 + #else 13 + #define STACKSLOTS_PER_LINE 4 14 + #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :) 15 + #endif 16 + 17 + extern unsigned long 18 + print_context_stack(struct thread_info *tinfo, 19 + unsigned long *stack, unsigned long bp, 20 + const struct stacktrace_ops *ops, void *data, 21 + unsigned long *end); 22 + 23 + extern void 24 + show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 25 + unsigned long *stack, unsigned long bp, char *log_lvl); 26 + 27 + extern void 28 + show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, 29 + unsigned long *sp, unsigned long bp, char *log_lvl); 30 + 31 + extern unsigned int code_bytes; 32 + extern int kstack_depth_to_print; 33 + 34 + /* The form of the top of the frame on the stack */ 35 + struct stack_frame { 36 + struct stack_frame *next_frame; 37 + unsigned long return_address; 38 + }; 39 + #endif
+2 -300
arch/x86/kernel/dumpstack_32.c
··· 17 17 18 18 #include <asm/stacktrace.h> 19 19 20 - #define STACKSLOTS_PER_LINE 8 21 - #define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :) 22 - 23 - int panic_on_unrecovered_nmi; 24 - int kstack_depth_to_print = 3 * STACKSLOTS_PER_LINE; 25 - static unsigned int code_bytes = 64; 26 - static int die_counter; 27 - 28 - void printk_address(unsigned long address, int reliable) 29 - { 30 - printk(" [<%p>] %s%pS\n", (void *) address, 31 - reliable ? "" : "? ", (void *) address); 32 - } 33 - 34 - static inline int valid_stack_ptr(struct thread_info *tinfo, 35 - void *p, unsigned int size, void *end) 36 - { 37 - void *t = tinfo; 38 - if (end) { 39 - if (p < end && p >= (end-THREAD_SIZE)) 40 - return 1; 41 - else 42 - return 0; 43 - } 44 - return p > t && p < t + THREAD_SIZE - size; 45 - } 46 - 47 - /* The form of the top of the frame on the stack */ 48 - struct stack_frame { 49 - struct stack_frame *next_frame; 50 - unsigned long return_address; 51 - }; 52 - 53 - static inline unsigned long 54 - print_context_stack(struct thread_info *tinfo, 55 - unsigned long *stack, unsigned long bp, 56 - const struct stacktrace_ops *ops, void *data, 57 - unsigned long *end) 58 - { 59 - struct stack_frame *frame = (struct stack_frame *)bp; 60 - 61 - while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) { 62 - unsigned long addr; 63 - 64 - addr = *stack; 65 - if (__kernel_text_address(addr)) { 66 - if ((unsigned long) stack == bp + sizeof(long)) { 67 - ops->address(data, addr, 1); 68 - frame = frame->next_frame; 69 - bp = (unsigned long) frame; 70 - } else { 71 - ops->address(data, addr, bp == 0); 72 - } 73 - } 74 - stack++; 75 - } 76 - return bp; 77 - } 20 + #include "dumpstack.h" 78 21 79 22 void dump_trace(struct task_struct *task, struct pt_regs *regs, 80 23 unsigned long *stack, unsigned long bp, ··· 62 119 } 63 120 EXPORT_SYMBOL(dump_trace); 64 121 65 - static void 66 - print_trace_warning_symbol(void *data, char *msg, unsigned long symbol) 67 - { 68 - printk(data); 69 - 
print_symbol(msg, symbol); 70 - printk("\n"); 71 - } 72 - 73 - static void print_trace_warning(void *data, char *msg) 74 - { 75 - printk("%s%s\n", (char *)data, msg); 76 - } 77 - 78 - static int print_trace_stack(void *data, char *name) 79 - { 80 - printk("%s <%s> ", (char *)data, name); 81 - return 0; 82 - } 83 - 84 - /* 85 - * Print one address/symbol entries per line. 86 - */ 87 - static void print_trace_address(void *data, unsigned long addr, int reliable) 88 - { 89 - touch_nmi_watchdog(); 90 - printk(data); 91 - printk_address(addr, reliable); 92 - } 93 - 94 - static const struct stacktrace_ops print_trace_ops = { 95 - .warning = print_trace_warning, 96 - .warning_symbol = print_trace_warning_symbol, 97 - .stack = print_trace_stack, 98 - .address = print_trace_address, 99 - }; 100 - 101 - static void 102 - show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 103 - unsigned long *stack, unsigned long bp, char *log_lvl) 104 - { 105 - printk("%sCall Trace:\n", log_lvl); 106 - dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl); 107 - } 108 - 109 - void show_trace(struct task_struct *task, struct pt_regs *regs, 110 - unsigned long *stack, unsigned long bp) 111 - { 112 - show_trace_log_lvl(task, regs, stack, bp, ""); 113 - } 114 - 115 - static void 122 + void 116 123 show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, 117 124 unsigned long *sp, unsigned long bp, char *log_lvl) 118 125 { ··· 89 196 show_trace_log_lvl(task, regs, sp, bp, log_lvl); 90 197 } 91 198 92 - void show_stack(struct task_struct *task, unsigned long *sp) 93 - { 94 - show_stack_log_lvl(task, NULL, sp, 0, ""); 95 - } 96 - 97 - /* 98 - * The architecture-independent dump_stack generator 99 - */ 100 - void dump_stack(void) 101 - { 102 - unsigned long bp = 0; 103 - unsigned long stack; 104 - 105 - #ifdef CONFIG_FRAME_POINTER 106 - if (!bp) 107 - get_bp(bp); 108 - #endif 109 - 110 - printk("Pid: %d, comm: %.20s %s %s %.*s\n", 111 - current->pid, current->comm, 
print_tainted(), 112 - init_utsname()->release, 113 - (int)strcspn(init_utsname()->version, " "), 114 - init_utsname()->version); 115 - show_trace(NULL, NULL, &stack, bp); 116 - } 117 - 118 - EXPORT_SYMBOL(dump_stack); 119 199 120 200 void show_registers(struct pt_regs *regs) 121 201 { ··· 149 283 return ud2 == 0x0b0f; 150 284 } 151 285 152 - static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED; 153 - static int die_owner = -1; 154 - static unsigned int die_nest_count; 155 - 156 - unsigned __kprobes long oops_begin(void) 157 - { 158 - unsigned long flags; 159 - 160 - oops_enter(); 161 - 162 - if (die_owner != raw_smp_processor_id()) { 163 - console_verbose(); 164 - raw_local_irq_save(flags); 165 - __raw_spin_lock(&die_lock); 166 - die_owner = smp_processor_id(); 167 - die_nest_count = 0; 168 - bust_spinlocks(1); 169 - } else { 170 - raw_local_irq_save(flags); 171 - } 172 - die_nest_count++; 173 - return flags; 174 - } 175 - 176 - void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr) 177 - { 178 - bust_spinlocks(0); 179 - die_owner = -1; 180 - add_taint(TAINT_DIE); 181 - __raw_spin_unlock(&die_lock); 182 - raw_local_irq_restore(flags); 183 - 184 - if (!regs) 185 - return; 186 - 187 - if (kexec_should_crash(current)) 188 - crash_kexec(regs); 189 - if (in_interrupt()) 190 - panic("Fatal exception in interrupt"); 191 - if (panic_on_oops) 192 - panic("Fatal exception"); 193 - oops_exit(); 194 - do_exit(signr); 195 - } 196 - 197 - int __kprobes __die(const char *str, struct pt_regs *regs, long err) 198 - { 199 - unsigned short ss; 200 - unsigned long sp; 201 - 202 - printk(KERN_EMERG "%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter); 203 - #ifdef CONFIG_PREEMPT 204 - printk("PREEMPT "); 205 - #endif 206 - #ifdef CONFIG_SMP 207 - printk("SMP "); 208 - #endif 209 - #ifdef CONFIG_DEBUG_PAGEALLOC 210 - printk("DEBUG_PAGEALLOC"); 211 - #endif 212 - printk("\n"); 213 - sysfs_printk_last_file(); 214 - if (notify_die(DIE_OOPS, str, regs, err, 215 
- current->thread.trap_no, SIGSEGV) == NOTIFY_STOP) 216 - return 1; 217 - 218 - show_registers(regs); 219 - /* Executive summary in case the oops scrolled away */ 220 - sp = (unsigned long) (&regs->sp); 221 - savesegment(ss, ss); 222 - if (user_mode(regs)) { 223 - sp = regs->sp; 224 - ss = regs->ss & 0xffff; 225 - } 226 - printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip); 227 - print_symbol("%s", regs->ip); 228 - printk(" SS:ESP %04x:%08lx\n", ss, sp); 229 - return 0; 230 - } 231 - 232 - /* 233 - * This is gone through when something in the kernel has done something bad 234 - * and is about to be terminated: 235 - */ 236 - void die(const char *str, struct pt_regs *regs, long err) 237 - { 238 - unsigned long flags = oops_begin(); 239 - 240 - if (die_nest_count < 3) { 241 - report_bug(regs->ip, regs); 242 - 243 - if (__die(str, regs, err)) 244 - regs = NULL; 245 - } else { 246 - printk(KERN_EMERG "Recursive die() failure, output suppressed\n"); 247 - } 248 - 249 - oops_end(flags, regs, SIGSEGV); 250 - } 251 - 252 - static DEFINE_SPINLOCK(nmi_print_lock); 253 - 254 - void notrace __kprobes 255 - die_nmi(char *str, struct pt_regs *regs, int do_panic) 256 - { 257 - if (notify_die(DIE_NMIWATCHDOG, str, regs, 0, 2, SIGINT) == NOTIFY_STOP) 258 - return; 259 - 260 - spin_lock(&nmi_print_lock); 261 - /* 262 - * We are in trouble anyway, lets at least try 263 - * to get a message out: 264 - */ 265 - bust_spinlocks(1); 266 - printk(KERN_EMERG "%s", str); 267 - printk(" on CPU%d, ip %08lx, registers:\n", 268 - smp_processor_id(), regs->ip); 269 - show_registers(regs); 270 - if (do_panic) 271 - panic("Non maskable interrupt"); 272 - console_silent(); 273 - spin_unlock(&nmi_print_lock); 274 - 275 - /* 276 - * If we are in kernel we are probably nested up pretty bad 277 - * and might aswell get out now while we still can: 278 - */ 279 - if (!user_mode_vm(regs)) { 280 - current->thread.trap_no = 2; 281 - crash_kexec(regs); 282 - } 283 - 284 - bust_spinlocks(0); 285 - do_exit(SIGSEGV); 
286 - } 287 - 288 - static int __init oops_setup(char *s) 289 - { 290 - if (!s) 291 - return -EINVAL; 292 - if (!strcmp(s, "panic")) 293 - panic_on_oops = 1; 294 - return 0; 295 - } 296 - early_param("oops", oops_setup); 297 - 298 - static int __init kstack_setup(char *s) 299 - { 300 - if (!s) 301 - return -EINVAL; 302 - kstack_depth_to_print = simple_strtoul(s, NULL, 0); 303 - return 0; 304 - } 305 - early_param("kstack", kstack_setup); 306 - 307 - static int __init code_bytes_setup(char *s) 308 - { 309 - code_bytes = simple_strtoul(s, NULL, 0); 310 - if (code_bytes > 8192) 311 - code_bytes = 8192; 312 - 313 - return 1; 314 - } 315 - __setup("code_bytes=", code_bytes_setup);
+2 -280
arch/x86/kernel/dumpstack_64.c
··· 17 17 18 18 #include <asm/stacktrace.h> 19 19 20 - #define STACKSLOTS_PER_LINE 4 21 - #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :) 22 - 23 - int panic_on_unrecovered_nmi; 24 - int kstack_depth_to_print = 3 * STACKSLOTS_PER_LINE; 25 - static unsigned int code_bytes = 64; 26 - static int die_counter; 27 - 28 - void printk_address(unsigned long address, int reliable) 29 - { 30 - printk(" [<%p>] %s%pS\n", (void *) address, 31 - reliable ? "" : "? ", (void *) address); 32 - } 20 + #include "dumpstack.h" 33 21 34 22 static unsigned long *in_exception_stack(unsigned cpu, unsigned long stack, 35 23 unsigned *usedp, char **idp) ··· 100 112 * interrupt stack 101 113 * severe exception (double fault, nmi, stack fault, debug, mce) hardware stack 102 114 */ 103 - 104 - static inline int valid_stack_ptr(struct thread_info *tinfo, 105 - void *p, unsigned int size, void *end) 106 - { 107 - void *t = tinfo; 108 - if (end) { 109 - if (p < end && p >= (end-THREAD_SIZE)) 110 - return 1; 111 - else 112 - return 0; 113 - } 114 - return p > t && p < t + THREAD_SIZE - size; 115 - } 116 - 117 - /* The form of the top of the frame on the stack */ 118 - struct stack_frame { 119 - struct stack_frame *next_frame; 120 - unsigned long return_address; 121 - }; 122 - 123 - static inline unsigned long 124 - print_context_stack(struct thread_info *tinfo, 125 - unsigned long *stack, unsigned long bp, 126 - const struct stacktrace_ops *ops, void *data, 127 - unsigned long *end) 128 - { 129 - struct stack_frame *frame = (struct stack_frame *)bp; 130 - 131 - while (valid_stack_ptr(tinfo, stack, sizeof(*stack), end)) { 132 - unsigned long addr; 133 - 134 - addr = *stack; 135 - if (__kernel_text_address(addr)) { 136 - if ((unsigned long) stack == bp + sizeof(long)) { 137 - ops->address(data, addr, 1); 138 - frame = frame->next_frame; 139 - bp = (unsigned long) frame; 140 - } else { 141 - ops->address(data, addr, bp == 0); 142 - } 143 - } 144 - stack++; 145 - } 146 - return bp; 147 - } 148 
115 149 116 void dump_trace(struct task_struct *task, struct pt_regs *regs, 150 117 unsigned long *stack, unsigned long bp, ··· 191 248 } 192 249 EXPORT_SYMBOL(dump_trace); 193 250 194 - static void 195 - print_trace_warning_symbol(void *data, char *msg, unsigned long symbol) 196 - { 197 - printk(data); 198 - print_symbol(msg, symbol); 199 - printk("\n"); 200 - } 201 - 202 - static void print_trace_warning(void *data, char *msg) 203 - { 204 - printk("%s%s\n", (char *)data, msg); 205 - } 206 - 207 - static int print_trace_stack(void *data, char *name) 208 - { 209 - printk("%s <%s> ", (char *)data, name); 210 - return 0; 211 - } 212 - 213 - /* 214 - * Print one address/symbol entries per line. 215 - */ 216 - static void print_trace_address(void *data, unsigned long addr, int reliable) 217 - { 218 - touch_nmi_watchdog(); 219 - printk(data); 220 - printk_address(addr, reliable); 221 - } 222 - 223 - static const struct stacktrace_ops print_trace_ops = { 224 - .warning = print_trace_warning, 225 - .warning_symbol = print_trace_warning_symbol, 226 - .stack = print_trace_stack, 227 - .address = print_trace_address, 228 - }; 229 - 230 - static void 231 - show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 232 - unsigned long *stack, unsigned long bp, char *log_lvl) 233 - { 234 - printk("%sCall Trace:\n", log_lvl); 235 - dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl); 236 - } 237 - 238 - void show_trace(struct task_struct *task, struct pt_regs *regs, 239 - unsigned long *stack, unsigned long bp) 240 - { 241 - show_trace_log_lvl(task, regs, stack, bp, ""); 242 - } 243 - 244 - static void 251 + void 245 252 show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, 246 253 unsigned long *sp, unsigned long bp, char *log_lvl) 247 254 { ··· 234 341 printk("\n"); 235 342 show_trace_log_lvl(task, regs, sp, bp, log_lvl); 236 343 } 237 - 238 - void show_stack(struct task_struct *task, unsigned long *sp) 239 - { 240 - show_stack_log_lvl(task, NULL, 
sp, 0, ""); 241 - } 242 - 243 - /* 244 - * The architecture-independent dump_stack generator 245 - */ 246 - void dump_stack(void) 247 - { 248 - unsigned long bp = 0; 249 - unsigned long stack; 250 - 251 - #ifdef CONFIG_FRAME_POINTER 252 - if (!bp) 253 - get_bp(bp); 254 - #endif 255 - 256 - printk("Pid: %d, comm: %.20s %s %s %.*s\n", 257 - current->pid, current->comm, print_tainted(), 258 - init_utsname()->release, 259 - (int)strcspn(init_utsname()->version, " "), 260 - init_utsname()->version); 261 - show_trace(NULL, NULL, &stack, bp); 262 - } 263 - EXPORT_SYMBOL(dump_stack); 264 344 265 345 void show_registers(struct pt_regs *regs) 266 346 { ··· 295 429 return ud2 == 0x0b0f; 296 430 } 297 431 298 - static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED; 299 - static int die_owner = -1; 300 - static unsigned int die_nest_count; 301 - 302 - unsigned __kprobes long oops_begin(void) 303 - { 304 - int cpu; 305 - unsigned long flags; 306 - 307 - oops_enter(); 308 - 309 - /* racy, but better than risking deadlock. */ 310 - raw_local_irq_save(flags); 311 - cpu = smp_processor_id(); 312 - if (!__raw_spin_trylock(&die_lock)) { 313 - if (cpu == die_owner) 314 - /* nested oops. should stop eventually */; 315 - else 316 - __raw_spin_lock(&die_lock); 317 - } 318 - die_nest_count++; 319 - die_owner = cpu; 320 - console_verbose(); 321 - bust_spinlocks(1); 322 - return flags; 323 - } 324 - 325 - void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr) 326 - { 327 - die_owner = -1; 328 - bust_spinlocks(0); 329 - die_nest_count--; 330 - if (!die_nest_count) 331 - /* Nest count reaches zero, release the lock. 
*/ 332 - __raw_spin_unlock(&die_lock); 333 - raw_local_irq_restore(flags); 334 - if (!regs) { 335 - oops_exit(); 336 - return; 337 - } 338 - if (in_interrupt()) 339 - panic("Fatal exception in interrupt"); 340 - if (panic_on_oops) 341 - panic("Fatal exception"); 342 - oops_exit(); 343 - do_exit(signr); 344 - } 345 - 346 - int __kprobes __die(const char *str, struct pt_regs *regs, long err) 347 - { 348 - printk(KERN_EMERG "%s: %04lx [#%d] ", str, err & 0xffff, ++die_counter); 349 - #ifdef CONFIG_PREEMPT 350 - printk("PREEMPT "); 351 - #endif 352 - #ifdef CONFIG_SMP 353 - printk("SMP "); 354 - #endif 355 - #ifdef CONFIG_DEBUG_PAGEALLOC 356 - printk("DEBUG_PAGEALLOC"); 357 - #endif 358 - printk("\n"); 359 - sysfs_printk_last_file(); 360 - if (notify_die(DIE_OOPS, str, regs, err, 361 - current->thread.trap_no, SIGSEGV) == NOTIFY_STOP) 362 - return 1; 363 - 364 - show_registers(regs); 365 - add_taint(TAINT_DIE); 366 - /* Executive summary in case the oops scrolled away */ 367 - printk(KERN_ALERT "RIP "); 368 - printk_address(regs->ip, 1); 369 - printk(" RSP <%016lx>\n", regs->sp); 370 - if (kexec_should_crash(current)) 371 - crash_kexec(regs); 372 - return 0; 373 - } 374 - 375 - void die(const char *str, struct pt_regs *regs, long err) 376 - { 377 - unsigned long flags = oops_begin(); 378 - 379 - if (!user_mode(regs)) 380 - report_bug(regs->ip, regs); 381 - 382 - if (__die(str, regs, err)) 383 - regs = NULL; 384 - oops_end(flags, regs, SIGSEGV); 385 - } 386 - 387 - notrace __kprobes void 388 - die_nmi(char *str, struct pt_regs *regs, int do_panic) 389 - { 390 - unsigned long flags; 391 - 392 - if (notify_die(DIE_NMIWATCHDOG, str, regs, 0, 2, SIGINT) == NOTIFY_STOP) 393 - return; 394 - 395 - flags = oops_begin(); 396 - /* 397 - * We are in trouble anyway, lets at least try 398 - * to get a message out. 
399 - */ 400 - printk(KERN_EMERG "%s", str); 401 - printk(" on CPU%d, ip %08lx, registers:\n", 402 - smp_processor_id(), regs->ip); 403 - show_registers(regs); 404 - if (kexec_should_crash(current)) 405 - crash_kexec(regs); 406 - if (do_panic || panic_on_oops) 407 - panic("Non maskable interrupt"); 408 - oops_end(flags, NULL, SIGBUS); 409 - nmi_exit(); 410 - local_irq_enable(); 411 - do_exit(SIGBUS); 412 - } 413 - 414 - static int __init oops_setup(char *s) 415 - { 416 - if (!s) 417 - return -EINVAL; 418 - if (!strcmp(s, "panic")) 419 - panic_on_oops = 1; 420 - return 0; 421 - } 422 - early_param("oops", oops_setup); 423 - 424 - static int __init kstack_setup(char *s) 425 - { 426 - if (!s) 427 - return -EINVAL; 428 - kstack_depth_to_print = simple_strtoul(s, NULL, 0); 429 - return 0; 430 - } 431 - early_param("kstack", kstack_setup); 432 - 433 - static int __init code_bytes_setup(char *s) 434 - { 435 - code_bytes = simple_strtoul(s, NULL, 0); 436 - if (code_bytes > 8192) 437 - code_bytes = 8192; 438 - 439 - return 1; 440 - } 441 - __setup("code_bytes=", code_bytes_setup);
-16
arch/x86/kernel/e820.c
··· 677 677 }; 678 678 static struct early_res early_res[MAX_EARLY_RES] __initdata = { 679 679 { 0, PAGE_SIZE, "BIOS data page" }, /* BIOS data page */ 680 - #if defined(CONFIG_X86_64) && defined(CONFIG_X86_TRAMPOLINE) 681 - { TRAMPOLINE_BASE, TRAMPOLINE_BASE + 2 * PAGE_SIZE, "TRAMPOLINE" }, 682 - #endif 683 - #if defined(CONFIG_X86_32) && defined(CONFIG_SMP) 684 - /* 685 - * But first pinch a few for the stack/trampoline stuff 686 - * FIXME: Don't need the extra page at 4K, but need to fix 687 - * trampoline before removing it. (see the GDT stuff) 688 - */ 689 - { PAGE_SIZE, PAGE_SIZE + PAGE_SIZE, "EX TRAMPOLINE" }, 690 - /* 691 - * Has to be in very low memory so we can execute 692 - * real-mode AP code. 693 - */ 694 - { TRAMPOLINE_BASE, TRAMPOLINE_BASE + PAGE_SIZE, "TRAMPOLINE" }, 695 - #endif 696 680 {} 697 681 }; 698 682
-47
arch/x86/kernel/early_printk.c
··· 875 875 }; 876 876 #endif 877 877 878 - /* Console interface to a host file on AMD's SimNow! */ 879 - 880 - static int simnow_fd; 881 - 882 - enum { 883 - MAGIC1 = 0xBACCD00A, 884 - MAGIC2 = 0xCA110000, 885 - XOPEN = 5, 886 - XWRITE = 4, 887 - }; 888 - 889 - static noinline long simnow(long cmd, long a, long b, long c) 890 - { 891 - long ret; 892 - 893 - asm volatile("cpuid" : 894 - "=a" (ret) : 895 - "b" (a), "c" (b), "d" (c), "0" (MAGIC1), "D" (cmd + MAGIC2)); 896 - return ret; 897 - } 898 - 899 - static void __init simnow_init(char *str) 900 - { 901 - char *fn = "klog"; 902 - 903 - if (*str == '=') 904 - fn = ++str; 905 - /* error ignored */ 906 - simnow_fd = simnow(XOPEN, (unsigned long)fn, O_WRONLY|O_APPEND|O_CREAT, 0644); 907 - } 908 - 909 - static void simnow_write(struct console *con, const char *s, unsigned n) 910 - { 911 - simnow(XWRITE, simnow_fd, (unsigned long)s, n); 912 - } 913 - 914 - static struct console simnow_console = { 915 - .name = "simnow", 916 - .write = simnow_write, 917 - .flags = CON_PRINTBUFFER, 918 - .index = -1, 919 - }; 920 - 921 878 /* Direct interface for emergencies */ 922 879 static struct console *early_console = &early_vga_console; 923 880 static int __initdata early_console_initialized; ··· 917 960 max_ypos = boot_params.screen_info.orig_video_lines; 918 961 current_ypos = boot_params.screen_info.orig_y; 919 962 early_console = &early_vga_console; 920 - } else if (!strncmp(buf, "simnow", 6)) { 921 - simnow_init(buf + 6); 922 - early_console = &simnow_console; 923 - keep_early = 1; 924 963 #ifdef CONFIG_EARLY_PRINTK_DBGP 925 964 } else if (!strncmp(buf, "dbgp", 4)) { 926 965 if (early_dbgp_init(buf+4) < 0)
+1
arch/x86/kernel/entry_32.S
··· 1051 1051 push %eax 1052 1052 CFI_ADJUST_CFA_OFFSET 4 1053 1053 call do_exit 1054 + ud2 # padding for call trace 1054 1055 CFI_ENDPROC 1055 1056 ENDPROC(kernel_thread_helper) 1056 1057
+98 -95
arch/x86/kernel/entry_64.S
··· 11 11 * 12 12 * NOTE: This code handles signal-recognition, which happens every time 13 13 * after an interrupt and after each system call. 14 - * 15 - * Normal syscalls and interrupts don't save a full stack frame, this is 14 + * 15 + * Normal syscalls and interrupts don't save a full stack frame, this is 16 16 * only done for syscall tracing, signals or fork/exec et.al. 17 - * 18 - * A note on terminology: 19 - * - top of stack: Architecture defined interrupt frame from SS to RIP 20 - * at the top of the kernel process stack. 17 + * 18 + * A note on terminology: 19 + * - top of stack: Architecture defined interrupt frame from SS to RIP 20 + * at the top of the kernel process stack. 21 21 * - partial stack frame: partially saved registers upto R11. 22 - * - full stack frame: Like partial stack frame, but all register saved. 22 + * - full stack frame: Like partial stack frame, but all register saved. 23 23 * 24 24 * Some macro usage: 25 25 * - CFI macros are used to generate dwarf2 unwind information for better ··· 142 142 143 143 #ifndef CONFIG_PREEMPT 144 144 #define retint_kernel retint_restore_args 145 - #endif 145 + #endif 146 146 147 147 #ifdef CONFIG_PARAVIRT 148 148 ENTRY(native_usergs_sysret64) ··· 161 161 .endm 162 162 163 163 /* 164 - * C code is not supposed to know about undefined top of stack. Every time 165 - * a C function with an pt_regs argument is called from the SYSCALL based 164 + * C code is not supposed to know about undefined top of stack. Every time 165 + * a C function with an pt_regs argument is called from the SYSCALL based 166 166 * fast path FIXUP_TOP_OF_STACK is needed. 167 167 * RESTORE_TOP_OF_STACK syncs the syscall state after any possible ptregs 168 168 * manipulation. 
169 - */ 170 - 171 - /* %rsp:at FRAMEEND */ 169 + */ 170 + 171 + /* %rsp:at FRAMEEND */ 172 172 .macro FIXUP_TOP_OF_STACK tmp 173 173 movq %gs:pda_oldrsp,\tmp 174 174 movq \tmp,RSP(%rsp) ··· 244 244 .endm 245 245 /* 246 246 * A newly forked process directly context switches into this. 247 - */ 248 - /* rdi: prev */ 247 + */ 248 + /* rdi: prev */ 249 249 ENTRY(ret_from_fork) 250 250 CFI_DEFAULT_STACK 251 251 push kernel_eflags(%rip) ··· 255 255 call schedule_tail 256 256 GET_THREAD_INFO(%rcx) 257 257 testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT),TI_flags(%rcx) 258 + CFI_REMEMBER_STATE 258 259 jnz rff_trace 259 - rff_action: 260 + rff_action: 260 261 RESTORE_REST 261 262 testl $3,CS-ARGOFFSET(%rsp) # from kernel_thread? 262 263 je int_ret_from_sys_call ··· 265 264 jnz int_ret_from_sys_call 266 265 RESTORE_TOP_OF_STACK %rdi,ARGOFFSET 267 266 jmp ret_from_sys_call 267 + CFI_RESTORE_STATE 268 268 rff_trace: 269 269 movq %rsp,%rdi 270 270 call syscall_trace_leave 271 - GET_THREAD_INFO(%rcx) 271 + GET_THREAD_INFO(%rcx) 272 272 jmp rff_action 273 273 CFI_ENDPROC 274 274 END(ret_from_fork) ··· 280 278 * SYSCALL does not save anything on the stack and does not change the 281 279 * stack pointer. 282 280 */ 283 - 281 + 284 282 /* 285 - * Register setup: 283 + * Register setup: 286 284 * rax system call number 287 285 * rdi arg0 288 - * rcx return address for syscall/sysret, C arg3 286 + * rcx return address for syscall/sysret, C arg3 289 287 * rsi arg1 290 - * rdx arg2 288 + * rdx arg2 291 289 * r10 arg3 (--> moved to rcx for C) 292 290 * r8 arg4 293 291 * r9 arg5 294 292 * r11 eflags for syscall/sysret, temporary for C 295 - * r12-r15,rbp,rbx saved by C code, not touched. 296 - * 293 + * r12-r15,rbp,rbx saved by C code, not touched. 294 + * 297 295 * Interrupts are off on entry. 298 296 * Only called from user space. 299 297 * ··· 303 301 * When user can change the frames always force IRET. That is because 304 302 * it deals with uncanonical addresses better. 
SYSRET has trouble 305 303 * with them due to bugs in both AMD and Intel CPUs. 306 - */ 304 + */ 307 305 308 306 ENTRY(system_call) 309 307 CFI_STARTPROC simple ··· 319 317 */ 320 318 ENTRY(system_call_after_swapgs) 321 319 322 - movq %rsp,%gs:pda_oldrsp 320 + movq %rsp,%gs:pda_oldrsp 323 321 movq %gs:pda_kernelstack,%rsp 324 322 /* 325 323 * No need to follow this irqs off/on section - it's straight ··· 327 325 */ 328 326 ENABLE_INTERRUPTS(CLBR_NONE) 329 327 SAVE_ARGS 8,1 330 - movq %rax,ORIG_RAX-ARGOFFSET(%rsp) 328 + movq %rax,ORIG_RAX-ARGOFFSET(%rsp) 331 329 movq %rcx,RIP-ARGOFFSET(%rsp) 332 330 CFI_REL_OFFSET rip,RIP-ARGOFFSET 333 331 GET_THREAD_INFO(%rcx) ··· 341 339 movq %rax,RAX-ARGOFFSET(%rsp) 342 340 /* 343 341 * Syscall return path ending with SYSRET (fast path) 344 - * Has incomplete stack frame and undefined top of stack. 345 - */ 342 + * Has incomplete stack frame and undefined top of stack. 343 + */ 346 344 ret_from_sys_call: 347 345 movl $_TIF_ALLWORK_MASK,%edi 348 346 /* edi: flagmask */ 349 - sysret_check: 347 + sysret_check: 350 348 LOCKDEP_SYS_EXIT 351 349 GET_THREAD_INFO(%rcx) 352 350 DISABLE_INTERRUPTS(CLBR_NONE) 353 351 TRACE_IRQS_OFF 354 352 movl TI_flags(%rcx),%edx 355 353 andl %edi,%edx 356 - jnz sysret_careful 354 + jnz sysret_careful 357 355 CFI_REMEMBER_STATE 358 356 /* 359 357 * sysretq will re-enable interrupts: ··· 368 366 369 367 CFI_RESTORE_STATE 370 368 /* Handle reschedules */ 371 - /* edx: work, edi: workmask */ 369 + /* edx: work, edi: workmask */ 372 370 sysret_careful: 373 371 bt $TIF_NEED_RESCHED,%edx 374 372 jnc sysret_signal ··· 381 379 CFI_ADJUST_CFA_OFFSET -8 382 380 jmp sysret_check 383 381 384 - /* Handle a signal */ 382 + /* Handle a signal */ 385 383 sysret_signal: 386 384 TRACE_IRQS_ON 387 385 ENABLE_INTERRUPTS(CLBR_NONE) ··· 400 398 DISABLE_INTERRUPTS(CLBR_NONE) 401 399 TRACE_IRQS_OFF 402 400 jmp int_with_check 403 - 401 + 404 402 badsys: 405 403 movq $-ENOSYS,RAX-ARGOFFSET(%rsp) 406 404 jmp ret_from_sys_call ··· 
439 437 #endif /* CONFIG_AUDITSYSCALL */ 440 438 441 439 /* Do syscall tracing */ 442 - tracesys: 440 + tracesys: 443 441 #ifdef CONFIG_AUDITSYSCALL 444 442 testl $(_TIF_WORK_SYSCALL_ENTRY & ~_TIF_SYSCALL_AUDIT),TI_flags(%rcx) 445 443 jz auditsys ··· 462 460 call *sys_call_table(,%rax,8) 463 461 movq %rax,RAX-ARGOFFSET(%rsp) 464 462 /* Use IRET because user could have changed frame */ 465 - 466 - /* 463 + 464 + /* 467 465 * Syscall return path ending with IRET. 468 466 * Has correct top of stack, but partial stack frame. 469 467 */ ··· 507 505 TRACE_IRQS_ON 508 506 ENABLE_INTERRUPTS(CLBR_NONE) 509 507 SAVE_REST 510 - /* Check for syscall exit trace */ 508 + /* Check for syscall exit trace */ 511 509 testl $_TIF_WORK_SYSCALL_EXIT,%edx 512 510 jz int_signal 513 511 pushq %rdi 514 512 CFI_ADJUST_CFA_OFFSET 8 515 - leaq 8(%rsp),%rdi # &ptregs -> arg1 513 + leaq 8(%rsp),%rdi # &ptregs -> arg1 516 514 call syscall_trace_leave 517 515 popq %rdi 518 516 CFI_ADJUST_CFA_OFFSET -8 519 517 andl $~(_TIF_WORK_SYSCALL_EXIT|_TIF_SYSCALL_EMU),%edi 520 518 jmp int_restore_rest 521 - 519 + 522 520 int_signal: 523 521 testl $_TIF_DO_NOTIFY_MASK,%edx 524 522 jz 1f ··· 533 531 jmp int_with_check 534 532 CFI_ENDPROC 535 533 END(system_call) 536 - 537 - /* 534 + 535 + /* 538 536 * Certain special system calls that need to save a complete full stack frame. 539 - */ 540 - 537 + */ 538 + 541 539 .macro PTREGSCALL label,func,arg 542 540 .globl \label 543 541 \label: ··· 574 572 ret 575 573 CFI_ENDPROC 576 574 END(ptregscall_common) 577 - 575 + 578 576 ENTRY(stub_execve) 579 577 CFI_STARTPROC 580 578 popq %r11 ··· 590 588 jmp int_ret_from_sys_call 591 589 CFI_ENDPROC 592 590 END(stub_execve) 593 - 591 + 594 592 /* 595 593 * sigreturn is special because it needs to restore all registers on return. 596 594 * This cannot be done with SYSRET, so use the IRET return path instead. 
597 - */ 595 + */ 598 596 ENTRY(stub_rt_sigreturn) 599 597 CFI_STARTPROC 600 598 addq $8, %rsp ··· 687 685 GET_THREAD_INFO(%rcx) 688 686 testl $3,CS-ARGOFFSET(%rsp) 689 687 je retint_kernel 690 - 688 + 691 689 /* Interrupt came from user space */ 692 690 /* 693 691 * Has a correct top of stack, but a partial stack frame 694 692 * %rcx: thread info. Interrupts off. 695 - */ 693 + */ 696 694 retint_with_reschedule: 697 695 movl $_TIF_WORK_MASK,%edi 698 696 retint_check: ··· 765 763 pushq %rdi 766 764 CFI_ADJUST_CFA_OFFSET 8 767 765 call schedule 768 - popq %rdi 766 + popq %rdi 769 767 CFI_ADJUST_CFA_OFFSET -8 770 768 GET_THREAD_INFO(%rcx) 771 769 DISABLE_INTERRUPTS(CLBR_NONE) 772 770 TRACE_IRQS_OFF 773 771 jmp retint_check 774 - 772 + 775 773 retint_signal: 776 774 testl $_TIF_DO_NOTIFY_MASK,%edx 777 775 jz retint_swapgs 778 776 TRACE_IRQS_ON 779 777 ENABLE_INTERRUPTS(CLBR_NONE) 780 778 SAVE_REST 781 - movq $-1,ORIG_RAX(%rsp) 779 + movq $-1,ORIG_RAX(%rsp) 782 780 xorl %esi,%esi # oldset 783 781 movq %rsp,%rdi # &pt_regs 784 782 call do_notify_resume ··· 800 798 jnc retint_restore_args 801 799 call preempt_schedule_irq 802 800 jmp exit_intr 803 - #endif 801 + #endif 804 802 805 803 CFI_ENDPROC 806 804 END(common_interrupt) 807 - 805 + 808 806 /* 809 807 * APIC interrupts. 
810 - */ 808 + */ 811 809 .macro apicinterrupt num,func 812 810 INTR_FRAME 813 811 pushq $~(\num) ··· 825 823 apicinterrupt THRESHOLD_APIC_VECTOR,mce_threshold_interrupt 826 824 END(threshold_interrupt) 827 825 828 - #ifdef CONFIG_SMP 826 + #ifdef CONFIG_SMP 829 827 ENTRY(reschedule_interrupt) 830 828 apicinterrupt RESCHEDULE_VECTOR,smp_reschedule_interrupt 831 829 END(reschedule_interrupt) 832 830 833 831 .macro INVALIDATE_ENTRY num 834 832 ENTRY(invalidate_interrupt\num) 835 - apicinterrupt INVALIDATE_TLB_VECTOR_START+\num,smp_invalidate_interrupt 833 + apicinterrupt INVALIDATE_TLB_VECTOR_START+\num,smp_invalidate_interrupt 836 834 END(invalidate_interrupt\num) 837 835 .endm 838 836 ··· 871 869 ENTRY(spurious_interrupt) 872 870 apicinterrupt SPURIOUS_APIC_VECTOR,smp_spurious_interrupt 873 871 END(spurious_interrupt) 874 - 872 + 875 873 /* 876 874 * Exception entry points. 877 - */ 875 + */ 878 876 .macro zeroentry sym 879 877 INTR_FRAME 880 878 PARAVIRT_ADJUST_EXCEPTION_FRAME 881 - pushq $0 /* push error code/oldrax */ 879 + pushq $0 /* push error code/oldrax */ 882 880 CFI_ADJUST_CFA_OFFSET 8 883 - pushq %rax /* push real oldrax to the rdi slot */ 881 + pushq %rax /* push real oldrax to the rdi slot */ 884 882 CFI_ADJUST_CFA_OFFSET 8 885 883 CFI_REL_OFFSET rax,0 886 884 leaq \sym(%rip),%rax 887 885 jmp error_entry 888 886 CFI_ENDPROC 889 - .endm 887 + .endm 890 888 891 889 .macro errorentry sym 892 890 XCPT_FRAME ··· 1000 998 1001 999 /* 1002 1000 * Exception entry point. This expects an error code/orig_rax on the stack 1003 - * and the exception handler in %rax. 1004 - */ 1001 + * and the exception handler in %rax. 
1002 + */ 1005 1003 KPROBE_ENTRY(error_entry) 1006 1004 _frame RDI 1007 1005 CFI_REL_OFFSET rax,0 1008 1006 /* rdi slot contains rax, oldrax contains error code */ 1009 - cld 1007 + cld 1010 1008 subq $14*8,%rsp 1011 1009 CFI_ADJUST_CFA_OFFSET (14*8) 1012 1010 movq %rsi,13*8(%rsp) ··· 1017 1015 CFI_REL_OFFSET rdx,RDX 1018 1016 movq %rcx,11*8(%rsp) 1019 1017 CFI_REL_OFFSET rcx,RCX 1020 - movq %rsi,10*8(%rsp) /* store rax */ 1018 + movq %rsi,10*8(%rsp) /* store rax */ 1021 1019 CFI_REL_OFFSET rax,RAX 1022 1020 movq %r8, 9*8(%rsp) 1023 1021 CFI_REL_OFFSET r8,R8 ··· 1027 1025 CFI_REL_OFFSET r10,R10 1028 1026 movq %r11,6*8(%rsp) 1029 1027 CFI_REL_OFFSET r11,R11 1030 - movq %rbx,5*8(%rsp) 1028 + movq %rbx,5*8(%rsp) 1031 1029 CFI_REL_OFFSET rbx,RBX 1032 - movq %rbp,4*8(%rsp) 1030 + movq %rbp,4*8(%rsp) 1033 1031 CFI_REL_OFFSET rbp,RBP 1034 - movq %r12,3*8(%rsp) 1032 + movq %r12,3*8(%rsp) 1035 1033 CFI_REL_OFFSET r12,R12 1036 - movq %r13,2*8(%rsp) 1034 + movq %r13,2*8(%rsp) 1037 1035 CFI_REL_OFFSET r13,R13 1038 - movq %r14,1*8(%rsp) 1036 + movq %r14,1*8(%rsp) 1039 1037 CFI_REL_OFFSET r14,R14 1040 - movq %r15,(%rsp) 1038 + movq %r15,(%rsp) 1041 1039 CFI_REL_OFFSET r15,R15 1042 - xorl %ebx,%ebx 1040 + xorl %ebx,%ebx 1043 1041 testl $3,CS(%rsp) 1044 1042 je error_kernelspace 1045 - error_swapgs: 1043 + error_swapgs: 1046 1044 SWAPGS 1047 1045 error_sti: 1048 1046 TRACE_IRQS_OFF 1049 - movq %rdi,RDI(%rsp) 1047 + movq %rdi,RDI(%rsp) 1050 1048 CFI_REL_OFFSET rdi,RDI 1051 1049 movq %rsp,%rdi 1052 - movq ORIG_RAX(%rsp),%rsi /* get error code */ 1050 + movq ORIG_RAX(%rsp),%rsi /* get error code */ 1053 1051 movq $-1,ORIG_RAX(%rsp) 1054 1052 call *%rax 1055 1053 /* ebx: no swapgs flag (1: don't need swapgs, 0: need it) */ ··· 1058 1056 RESTORE_REST 1059 1057 DISABLE_INTERRUPTS(CLBR_NONE) 1060 1058 TRACE_IRQS_OFF 1061 - GET_THREAD_INFO(%rcx) 1059 + GET_THREAD_INFO(%rcx) 1062 1060 testl %eax,%eax 1063 1061 jne retint_kernel 1064 1062 LOCKDEP_SYS_EXIT_IRQ ··· 1074 1072 /* There are two 
places in the kernel that can potentially fault with 1075 1073 usergs. Handle them here. The exception handlers after 1076 1074 iret run with kernel gs again, so don't set the user space flag. 1077 - B stepping K8s sometimes report an truncated RIP for IRET 1075 + B stepping K8s sometimes report an truncated RIP for IRET 1078 1076 exceptions returning to compat mode. Check for these here too. */ 1079 1077 leaq irq_return(%rip),%rcx 1080 1078 cmpq %rcx,RIP(%rsp) ··· 1086 1084 je error_swapgs 1087 1085 jmp error_sti 1088 1086 KPROBE_END(error_entry) 1089 - 1087 + 1090 1088 /* Reload gs selector with exception handling */ 1091 - /* edi: new selector */ 1089 + /* edi: new selector */ 1092 1090 ENTRY(native_load_gs_index) 1093 1091 CFI_STARTPROC 1094 1092 pushf 1095 1093 CFI_ADJUST_CFA_OFFSET 8 1096 1094 DISABLE_INTERRUPTS(CLBR_ANY | ~(CLBR_RDI)) 1097 1095 SWAPGS 1098 - gs_change: 1099 - movl %edi,%gs 1096 + gs_change: 1097 + movl %edi,%gs 1100 1098 2: mfence /* workaround */ 1101 1099 SWAPGS 1102 1100 popf ··· 1104 1102 ret 1105 1103 CFI_ENDPROC 1106 1104 ENDPROC(native_load_gs_index) 1107 - 1105 + 1108 1106 .section __ex_table,"a" 1109 1107 .align 8 1110 1108 .quad gs_change,bad_gs 1111 1109 .previous 1112 1110 .section .fixup,"ax" 1113 1111 /* running with kernelgs */ 1114 - bad_gs: 1112 + bad_gs: 1115 1113 SWAPGS /* switch back to user gs */ 1116 1114 xorl %eax,%eax 1117 1115 movl %eax,%gs 1118 1116 jmp 2b 1119 - .previous 1120 - 1117 + .previous 1118 + 1121 1119 /* 1122 1120 * Create a kernel thread. 1123 1121 * ··· 1140 1138 1141 1139 xorl %r8d,%r8d 1142 1140 xorl %r9d,%r9d 1143 - 1141 + 1144 1142 # clone now 1145 1143 call do_fork 1146 1144 movq %rax,RAX(%rsp) ··· 1151 1149 * so internally to the x86_64 port you can rely on kernel_thread() 1152 1150 * not to reschedule the child before returning, this avoids the need 1153 1151 * of hacks for example to fork off the per-CPU idle tasks. 
1154 - * [Hopefully no generic code relies on the reschedule -AK] 1152 + * [Hopefully no generic code relies on the reschedule -AK] 1155 1153 */ 1156 1154 RESTORE_ALL 1157 1155 UNFAKE_STACK_FRAME 1158 1156 ret 1159 1157 CFI_ENDPROC 1160 1158 ENDPROC(kernel_thread) 1161 - 1159 + 1162 1160 child_rip: 1163 1161 pushq $0 # fake return address 1164 1162 CFI_STARTPROC ··· 1172 1170 # exit 1173 1171 mov %eax, %edi 1174 1172 call do_exit 1173 + ud2 # padding for call trace 1175 1174 CFI_ENDPROC 1176 1175 ENDPROC(child_rip) 1177 1176 ··· 1194 1191 ENTRY(kernel_execve) 1195 1192 CFI_STARTPROC 1196 1193 FAKE_STACK_FRAME $0 1197 - SAVE_ALL 1194 + SAVE_ALL 1198 1195 movq %rsp,%rcx 1199 1196 call sys_execve 1200 - movq %rax, RAX(%rsp) 1197 + movq %rax, RAX(%rsp) 1201 1198 RESTORE_REST 1202 1199 testq %rax,%rax 1203 1200 je int_ret_from_sys_call ··· 1216 1213 END(coprocessor_error) 1217 1214 1218 1215 ENTRY(simd_coprocessor_error) 1219 - zeroentry do_simd_coprocessor_error 1216 + zeroentry do_simd_coprocessor_error 1220 1217 END(simd_coprocessor_error) 1221 1218 1222 1219 ENTRY(device_not_available) ··· 1228 1225 INTR_FRAME 1229 1226 PARAVIRT_ADJUST_EXCEPTION_FRAME 1230 1227 pushq $0 1231 - CFI_ADJUST_CFA_OFFSET 8 1228 + CFI_ADJUST_CFA_OFFSET 8 1232 1229 paranoidentry do_debug, DEBUG_STACK 1233 1230 paranoidexit 1234 1231 KPROBE_END(debug) 1235 1232 1236 - /* runs on exception stack */ 1233 + /* runs on exception stack */ 1237 1234 KPROBE_ENTRY(nmi) 1238 1235 INTR_FRAME 1239 1236 PARAVIRT_ADJUST_EXCEPTION_FRAME ··· 1267 1264 END(bounds) 1268 1265 1269 1266 ENTRY(invalid_op) 1270 - zeroentry do_invalid_op 1267 + zeroentry do_invalid_op 1271 1268 END(invalid_op) 1272 1269 1273 1270 ENTRY(coprocessor_segment_overrun) ··· 1322 1319 INTR_FRAME 1323 1320 PARAVIRT_ADJUST_EXCEPTION_FRAME 1324 1321 pushq $0 1325 - CFI_ADJUST_CFA_OFFSET 8 1322 + CFI_ADJUST_CFA_OFFSET 8 1326 1323 paranoidentry do_machine_check 1327 1324 jmp paranoid_exit1 1328 1325 CFI_ENDPROC
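A hedged userspace illustration of the syscall register convention described in the entry_64.S comment block above (a sketch only; it assumes an x86-64 Linux host and relies on the glibc `syscall(2)` wrapper to perform the register setup):

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* The libc syscall(2) wrapper follows exactly the convention the
 * comment above documents: on x86-64 it loads the system call number
 * into rax and the arguments into rdi, rsi, rdx, r10, r8 and r9,
 * then executes SYSCALL.  rcx and r11 come back clobbered because
 * the hardware uses them for the return rip and rflags -- which is
 * why the entry path must save them into the stack frame. */
static long pid_via_raw_syscall(void)
{
	return syscall(SYS_getpid);	/* rax = __NR_getpid, no arguments */
}
```

Calling this should agree with the ordinary `getpid()` wrapper, since both end up in the same kernel entry path.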
+42 -20
arch/x86/kernel/es7000_32.c
··· 38 38 #include <asm/io.h> 39 39 #include <asm/nmi.h> 40 40 #include <asm/smp.h> 41 + #include <asm/atomic.h> 41 42 #include <asm/apicdef.h> 42 43 #include <mach_mpparse.h> 44 + #include <asm/genapic.h> 45 + #include <asm/setup.h> 43 46 44 47 /* 45 48 * ES7000 chipsets ··· 164 161 return gsi; 165 162 } 166 163 164 + static int wakeup_secondary_cpu_via_mip(int cpu, unsigned long eip) 165 + { 166 + unsigned long vect = 0, psaival = 0; 167 + 168 + if (psai == NULL) 169 + return -1; 170 + 171 + vect = ((unsigned long)__pa(eip)/0x1000) << 16; 172 + psaival = (0x1000000 | vect | cpu); 173 + 174 + while (*psai & 0x1000000) 175 + ; 176 + 177 + *psai = psaival; 178 + 179 + return 0; 180 + } 181 + 182 + static void noop_wait_for_deassert(atomic_t *deassert_not_used) 183 + { 184 + } 185 + 186 + static int __init es7000_update_genapic(void) 187 + { 188 + genapic->wakeup_cpu = wakeup_secondary_cpu_via_mip; 189 + 190 + /* MPENTIUMIII */ 191 + if (boot_cpu_data.x86 == 6 && 192 + (boot_cpu_data.x86_model >= 7 || boot_cpu_data.x86_model <= 11)) { 193 + es7000_update_genapic_to_cluster(); 194 + genapic->wait_for_init_deassert = noop_wait_for_deassert; 195 + genapic->wakeup_cpu = wakeup_secondary_cpu_via_mip; 196 + } 197 + 198 + return 0; 199 + } 200 + 167 201 void __init 168 202 setup_unisys(void) 169 203 { ··· 216 176 else 217 177 es7000_plat = ES7000_CLASSIC; 218 178 ioapic_renumber_irq = es7000_rename_gsi; 179 + 180 + x86_quirks->update_genapic = es7000_update_genapic; 219 181 } 220 182 221 183 /* ··· 357 315 mip_reg->off_38 = ((unsigned long long)mip_reg->off_38 & 358 316 (unsigned long long)~MIP_VALID); 359 317 return status; 360 - } 361 - 362 - int 363 - es7000_start_cpu(int cpu, unsigned long eip) 364 - { 365 - unsigned long vect = 0, psaival = 0; 366 - 367 - if (psai == NULL) 368 - return -1; 369 - 370 - vect = ((unsigned long)__pa(eip)/0x1000) << 16; 371 - psaival = (0x1000000 | vect | cpu); 372 - 373 - while (*psai & 0x1000000) 374 - ; 375 - 376 - *psai = psaival; 377 - 
378 - return 0; 379 - 380 318 } 381 319 382 320 void __init
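The `wakeup_secondary_cpu_via_mip()` hunk above packs a command word for the ES7000 PSAI mailbox: bit 24 is the busy/valid bit the loop polls, bits 16 and up carry the 4 KiB page number of the trampoline `eip`, and the low bits carry the target cpu. A minimal sketch of that packing (field layout inferred from the code itself, not from ES7000 documentation):

```c
#include <stdint.h>

/* Build the PSAI mailbox word the way wakeup_secondary_cpu_via_mip()
 * does: the trampoline's physical address is reduced to a 4K page
 * number and shifted into bits 16+, the target cpu occupies the low
 * bits, and bit 24 marks the command as pending until the firmware
 * clears it. */
static uint32_t psai_word(uint32_t eip_phys, uint32_t cpu)
{
	uint32_t vect = (eip_phys / 0x1000) << 16;

	return 0x1000000u | vect | cpu;
}
```

For example, a trampoline at physical 0x9000 targeted at cpu 3 packs page number 9 into bits 16+ with the pending bit set.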
+4
arch/x86/kernel/genapic_64.c
··· 21 21 #include <asm/smp.h> 22 22 #include <asm/ipi.h> 23 23 #include <asm/genapic.h> 24 + #include <asm/setup.h> 24 25 25 26 extern struct genapic apic_flat; 26 27 extern struct genapic apic_physflat; ··· 54 53 genapic = &apic_physflat; 55 54 printk(KERN_INFO "Setting APIC routing to %s\n", genapic->name); 56 55 } 56 + 57 + if (x86_quirks->update_genapic) 58 + x86_quirks->update_genapic(); 57 59 } 58 60 59 61 /* Same for both flat and physical. */
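The `genapic_64.c` hunk above introduces a quirk-hook pattern: generic setup picks a default genapic, then gives the platform one chance to override it through an optional `x86_quirks->update_genapic()` callback (which the es7000 code registers). A simplified sketch of that shape, with illustrative stand-in names and values rather than the kernel's:

```c
#include <stddef.h>

/* Stand-in driver identifiers for the sketch. */
enum apic_mode { APIC_PHYSFLAT, APIC_ES7000_MIP };

struct quirks_sketch {
	int (*update_genapic)(void);	/* optional platform override */
};

static struct quirks_sketch *quirks;	/* NULL when no platform quirk */
static enum apic_mode mode;

static int es7000_update(void)
{
	mode = APIC_ES7000_MIP;		/* platform-specific wakeup method */
	return 0;
}

static struct quirks_sketch es7000_quirks = {
	.update_genapic = es7000_update,
};

/* Same shape as setup_apic_routing() after this patch: choose the
 * generic default first, then let the registered quirk override it. */
static enum apic_mode setup_apic_routing_sketch(void)
{
	mode = APIC_PHYSFLAT;		/* generic default */

	if (quirks && quirks->update_genapic)
		quirks->update_genapic();
	return mode;
}
```

The hook keeps platform knowledge out of the generic path: `genapic_64.c` never needs to know that ES7000 exists, only that a quirk may have been registered.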
+107 -4
arch/x86/kernel/genx2apic_uv_x.c
··· 10 10 11 11 #include <linux/kernel.h> 12 12 #include <linux/threads.h> 13 + #include <linux/cpu.h> 13 14 #include <linux/cpumask.h> 14 15 #include <linux/string.h> 15 16 #include <linux/ctype.h> ··· 18 17 #include <linux/sched.h> 19 18 #include <linux/module.h> 20 19 #include <linux/hardirq.h> 20 + #include <linux/timer.h> 21 + #include <linux/proc_fs.h> 22 + #include <asm/current.h> 21 23 #include <asm/smp.h> 22 24 #include <asm/ipi.h> 23 25 #include <asm/genapic.h> ··· 360 356 } 361 357 362 358 /* 359 + * percpu heartbeat timer 360 + */ 361 + static void uv_heartbeat(unsigned long ignored) 362 + { 363 + struct timer_list *timer = &uv_hub_info->scir.timer; 364 + unsigned char bits = uv_hub_info->scir.state; 365 + 366 + /* flip heartbeat bit */ 367 + bits ^= SCIR_CPU_HEARTBEAT; 368 + 369 + /* is this cpu idle? */ 370 + if (idle_cpu(raw_smp_processor_id())) 371 + bits &= ~SCIR_CPU_ACTIVITY; 372 + else 373 + bits |= SCIR_CPU_ACTIVITY; 374 + 375 + /* update system controller interface reg */ 376 + uv_set_scir_bits(bits); 377 + 378 + /* enable next timer period */ 379 + mod_timer(timer, jiffies + SCIR_CPU_HB_INTERVAL); 380 + } 381 + 382 + static void __cpuinit uv_heartbeat_enable(int cpu) 383 + { 384 + if (!uv_cpu_hub_info(cpu)->scir.enabled) { 385 + struct timer_list *timer = &uv_cpu_hub_info(cpu)->scir.timer; 386 + 387 + uv_set_cpu_scir_bits(cpu, SCIR_CPU_HEARTBEAT|SCIR_CPU_ACTIVITY); 388 + setup_timer(timer, uv_heartbeat, cpu); 389 + timer->expires = jiffies + SCIR_CPU_HB_INTERVAL; 390 + add_timer_on(timer, cpu); 391 + uv_cpu_hub_info(cpu)->scir.enabled = 1; 392 + } 393 + 394 + /* check boot cpu */ 395 + if (!uv_cpu_hub_info(0)->scir.enabled) 396 + uv_heartbeat_enable(0); 397 + } 398 + 399 + #ifdef CONFIG_HOTPLUG_CPU 400 + static void __cpuinit uv_heartbeat_disable(int cpu) 401 + { 402 + if (uv_cpu_hub_info(cpu)->scir.enabled) { 403 + uv_cpu_hub_info(cpu)->scir.enabled = 0; 404 + del_timer(&uv_cpu_hub_info(cpu)->scir.timer); 405 + } 406 + 
uv_set_cpu_scir_bits(cpu, 0xff); 407 + } 408 + 409 + /* 410 + * cpu hotplug notifier 411 + */ 412 + static __cpuinit int uv_scir_cpu_notify(struct notifier_block *self, 413 + unsigned long action, void *hcpu) 414 + { 415 + long cpu = (long)hcpu; 416 + 417 + switch (action) { 418 + case CPU_ONLINE: 419 + uv_heartbeat_enable(cpu); 420 + break; 421 + case CPU_DOWN_PREPARE: 422 + uv_heartbeat_disable(cpu); 423 + break; 424 + default: 425 + break; 426 + } 427 + return NOTIFY_OK; 428 + } 429 + 430 + static __init void uv_scir_register_cpu_notifier(void) 431 + { 432 + hotcpu_notifier(uv_scir_cpu_notify, 0); 433 + } 434 + 435 + #else /* !CONFIG_HOTPLUG_CPU */ 436 + 437 + static __init void uv_scir_register_cpu_notifier(void) 438 + { 439 + } 440 + 441 + static __init int uv_init_heartbeat(void) 442 + { 443 + int cpu; 444 + 445 + if (is_uv_system()) 446 + for_each_online_cpu(cpu) 447 + uv_heartbeat_enable(cpu); 448 + return 0; 449 + } 450 + 451 + late_initcall(uv_init_heartbeat); 452 + 453 + #endif /* !CONFIG_HOTPLUG_CPU */ 454 + 455 + /* 363 456 * Called on each cpu to initialize the per_cpu UV data area. 
364 457 * ZZZ hotplug not supported yet 365 458 */ ··· 529 428 530 429 uv_bios_init(); 531 430 uv_bios_get_sn_info(0, &uv_type, &sn_partition_id, 532 - &uv_coherency_id, &uv_region_size); 431 + &sn_coherency_id, &sn_region_size); 533 432 uv_rtc_init(); 534 433 535 434 for_each_present_cpu(cpu) { ··· 540 439 uv_blade_info[blade].nr_possible_cpus++; 541 440 542 441 uv_cpu_hub_info(cpu)->lowmem_remap_base = lowmem_redir_base; 543 - uv_cpu_hub_info(cpu)->lowmem_remap_top = 544 - lowmem_redir_base + lowmem_redir_size; 442 + uv_cpu_hub_info(cpu)->lowmem_remap_top = lowmem_redir_size; 545 443 uv_cpu_hub_info(cpu)->m_val = m_val; 546 444 uv_cpu_hub_info(cpu)->n_val = m_val; 547 445 uv_cpu_hub_info(cpu)->numa_blade_id = blade; ··· 550 450 uv_cpu_hub_info(cpu)->gpa_mask = (1 << (m_val + n_val)) - 1; 551 451 uv_cpu_hub_info(cpu)->gnode_upper = gnode_upper; 552 452 uv_cpu_hub_info(cpu)->global_mmr_base = mmr_base; 553 - uv_cpu_hub_info(cpu)->coherency_domain_number = uv_coherency_id; 453 + uv_cpu_hub_info(cpu)->coherency_domain_number = sn_coherency_id; 454 + uv_cpu_hub_info(cpu)->scir.offset = SCIR_LOCAL_MMR_BASE + lcpu; 554 455 uv_node_to_blade[nid] = blade; 555 456 uv_cpu_to_blade[cpu] = blade; 556 457 max_pnode = max(pnode, max_pnode); ··· 568 467 map_mmioh_high(max_pnode); 569 468 570 469 uv_cpu_init(); 470 + uv_scir_register_cpu_notifier(); 471 + proc_mkdir("sgi_uv", NULL); 571 472 }
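Each tick of the `uv_heartbeat()` timer added above performs a small pure update on the SCIR state byte: flip the heartbeat bit so the system controller can see the cpu is alive, then mirror the cpu's idle state into the activity bit before re-arming the timer. A sketch of that update (the bit values here are illustrative, not the UV hardware's):

```c
#include <stdint.h>

#define SCIR_CPU_HEARTBEAT	0x01	/* illustrative bit assignments */
#define SCIR_CPU_ACTIVITY	0x02

/* The per-period state transition uv_heartbeat() applies before
 * writing the byte back to the system controller interface reg. */
static uint8_t next_scir_bits(uint8_t bits, int cpu_is_idle)
{
	bits ^= SCIR_CPU_HEARTBEAT;		/* flip heartbeat */

	if (cpu_is_idle)
		bits &= ~SCIR_CPU_ACTIVITY;	/* idle: clear activity */
	else
		bits |= SCIR_CPU_ACTIVITY;	/* busy: set activity */
	return bits;
}
```

Because the heartbeat bit toggles every period regardless of load, a stuck value tells the controller the cpu has stopped servicing its timer, while the activity bit distinguishes idle from busy.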
-1
arch/x86/kernel/head.c
··· 35 35 36 36 /* start of EBDA area */ 37 37 ebda_addr = get_bios_ebda(); 38 - printk(KERN_INFO "BIOS EBDA/lowmem at: %08x/%08x\n", ebda_addr, lowmem); 39 38 40 39 /* Fixup: bios puts an EBDA in the top 64K segment */ 41 40 /* of conventional memory, but does not adjust lowmem. */
+3
arch/x86/kernel/head32.c
··· 12 12 #include <asm/sections.h> 13 13 #include <asm/e820.h> 14 14 #include <asm/bios_ebda.h> 15 + #include <asm/trampoline.h> 15 16 16 17 void __init i386_start_kernel(void) 17 18 { 19 + reserve_trampoline_memory(); 20 + 18 21 reserve_early(__pa_symbol(&_text), __pa_symbol(&_end), "TEXT DATA BSS"); 19 22 20 23 #ifdef CONFIG_BLK_DEV_INITRD
+3
arch/x86/kernel/head64.c
··· 24 24 #include <asm/kdebug.h> 25 25 #include <asm/e820.h> 26 26 #include <asm/bios_ebda.h> 27 + #include <asm/trampoline.h> 27 28 28 29 /* boot cpu pda */ 29 30 static struct x8664_pda _boot_cpu_pda __read_mostly; ··· 120 119 void __init x86_64_start_reservations(char *real_mode_data) 121 120 { 122 121 copy_bootdata(__va(real_mode_data)); 122 + 123 + reserve_trampoline_memory(); 123 124 124 125 reserve_early(__pa_symbol(&_text), __pa_symbol(&_end), "TEXT DATA BSS"); 125 126
+3 -1
arch/x86/kernel/hpet.c
··· 33 33 * HPET address is set in acpi/boot.c, when an ACPI entry exists 34 34 */ 35 35 unsigned long hpet_address; 36 - unsigned long hpet_num_timers; 36 + #ifdef CONFIG_PCI_MSI 37 + static unsigned long hpet_num_timers; 38 + #endif 37 39 static void __iomem *hpet_virt_address; 38 40 39 41 struct hpet_dev {
-1
arch/x86/kernel/init_task.c
··· 14 14 static struct signal_struct init_signals = INIT_SIGNALS(init_signals); 15 15 static struct sighand_struct init_sighand = INIT_SIGHAND(init_sighand); 16 16 struct mm_struct init_mm = INIT_MM(init_mm); 17 - EXPORT_UNUSED_SYMBOL(init_mm); /* will be removed in 2.6.26 */ 18 17 19 18 /* 20 19 * Initial thread structure.
+1 -2
arch/x86/kernel/io_apic.c
··· 2216 2216 asmlinkage void smp_irq_move_cleanup_interrupt(void) 2217 2217 { 2218 2218 unsigned vector, me; 2219 + 2219 2220 ack_APIC_irq(); 2220 - #ifdef CONFIG_X86_64 2221 2221 exit_idle(); 2222 - #endif 2223 2222 irq_enter(); 2224 2223 2225 2224 me = smp_processor_id();
+9 -13
arch/x86/kernel/irq_64.c
··· 18 18 #include <asm/idle.h> 19 19 #include <asm/smp.h> 20 20 21 - #ifdef CONFIG_DEBUG_STACKOVERFLOW 22 21 /* 23 22 * Probabilistic stack overflow check: 24 23 * ··· 27 28 */ 28 29 static inline void stack_overflow_check(struct pt_regs *regs) 29 30 { 31 + #ifdef CONFIG_DEBUG_STACKOVERFLOW 30 32 u64 curbase = (u64)task_stack_page(current); 31 - static unsigned long warned = -60*HZ; 32 33 33 - if (regs->sp >= curbase && regs->sp <= curbase + THREAD_SIZE && 34 - regs->sp < curbase + sizeof(struct thread_info) + 128 && 35 - time_after(jiffies, warned + 60*HZ)) { 36 - printk("do_IRQ: %s near stack overflow (cur:%Lx,sp:%lx)\n", 37 - current->comm, curbase, regs->sp); 38 - show_stack(NULL,NULL); 39 - warned = jiffies; 40 - } 41 - } 34 + WARN_ONCE(regs->sp >= curbase && 35 + regs->sp <= curbase + THREAD_SIZE && 36 + regs->sp < curbase + sizeof(struct thread_info) + 37 + sizeof(struct pt_regs) + 128, 38 + 39 + "do_IRQ: %s near stack overflow (cur:%Lx,sp:%lx)\n", 40 + current->comm, curbase, regs->sp); 42 41 #endif 42 + } 43 43 44 44 /* 45 45 * do_IRQ handles all normal device IRQ's (the special ··· 58 60 irq_enter(); 59 61 irq = __get_cpu_var(vector_irq)[vector]; 60 62 61 - #ifdef CONFIG_DEBUG_STACKOVERFLOW 62 63 stack_overflow_check(regs); 63 - #endif 64 64 65 65 desc = irq_to_desc(irq); 66 66 if (likely(desc))
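The reworked `stack_overflow_check()` above warns when the saved stack pointer lands in the danger zone just above `thread_info` at the base of the task stack (now also accounting for the `pt_regs` frame), and the switch to `WARN_ONCE` replaces the hand-rolled 60-second ratelimit with warn-at-most-once semantics. A userspace sketch of the predicate itself, with stand-in sizes rather than the kernel's:

```c
#include <stdint.h>

/* Stand-in sizes for the sketch; the kernel uses THREAD_SIZE,
 * sizeof(struct thread_info), sizeof(struct pt_regs) and a 128-byte
 * slack margin. */
#define THREAD_SIZE_SK		8192
#define THREAD_INFO_SIZE_SK	64
#define PT_REGS_SIZE_SK		168
#define GUARD_SK		128

/* True when sp points into this task's stack but within the small
 * region just above thread_info -- i.e. the stack is nearly full. */
static int near_overflow(uint64_t curbase, uint64_t sp)
{
	return sp >= curbase &&
	       sp <= curbase + THREAD_SIZE_SK &&
	       sp <  curbase + THREAD_INFO_SIZE_SK +
		     PT_REGS_SIZE_SK + GUARD_SK;
}
```

The first two comparisons confirm `sp` belongs to this stack at all (an interrupt may arrive with `sp` on a different stack); only then does the proximity test mean anything.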
+82 -22
arch/x86/kernel/machine_kexec_32.c
··· 13 13 #include <linux/numa.h> 14 14 #include <linux/ftrace.h> 15 15 #include <linux/suspend.h> 16 + #include <linux/gfp.h> 16 17 17 18 #include <asm/pgtable.h> 18 19 #include <asm/pgalloc.h> ··· 25 24 #include <asm/desc.h> 26 25 #include <asm/system.h> 27 26 #include <asm/cacheflush.h> 28 - 29 - #define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE))) 30 - static u32 kexec_pgd[1024] PAGE_ALIGNED; 31 - #ifdef CONFIG_X86_PAE 32 - static u32 kexec_pmd0[1024] PAGE_ALIGNED; 33 - static u32 kexec_pmd1[1024] PAGE_ALIGNED; 34 - #endif 35 - static u32 kexec_pte0[1024] PAGE_ALIGNED; 36 - static u32 kexec_pte1[1024] PAGE_ALIGNED; 37 27 38 28 static void set_idt(void *newidt, __u16 limit) 39 29 { ··· 68 76 #undef __STR 69 77 } 70 78 79 + static void machine_kexec_free_page_tables(struct kimage *image) 80 + { 81 + free_page((unsigned long)image->arch.pgd); 82 + #ifdef CONFIG_X86_PAE 83 + free_page((unsigned long)image->arch.pmd0); 84 + free_page((unsigned long)image->arch.pmd1); 85 + #endif 86 + free_page((unsigned long)image->arch.pte0); 87 + free_page((unsigned long)image->arch.pte1); 88 + } 89 + 90 + static int machine_kexec_alloc_page_tables(struct kimage *image) 91 + { 92 + image->arch.pgd = (pgd_t *)get_zeroed_page(GFP_KERNEL); 93 + #ifdef CONFIG_X86_PAE 94 + image->arch.pmd0 = (pmd_t *)get_zeroed_page(GFP_KERNEL); 95 + image->arch.pmd1 = (pmd_t *)get_zeroed_page(GFP_KERNEL); 96 + #endif 97 + image->arch.pte0 = (pte_t *)get_zeroed_page(GFP_KERNEL); 98 + image->arch.pte1 = (pte_t *)get_zeroed_page(GFP_KERNEL); 99 + if (!image->arch.pgd || 100 + #ifdef CONFIG_X86_PAE 101 + !image->arch.pmd0 || !image->arch.pmd1 || 102 + #endif 103 + !image->arch.pte0 || !image->arch.pte1) { 104 + machine_kexec_free_page_tables(image); 105 + return -ENOMEM; 106 + } 107 + return 0; 108 + } 109 + 110 + static void machine_kexec_page_table_set_one( 111 + pgd_t *pgd, pmd_t *pmd, pte_t *pte, 112 + unsigned long vaddr, unsigned long paddr) 113 + { 114 + pud_t *pud; 115 + 116 + pgd += 
pgd_index(vaddr); 117 + #ifdef CONFIG_X86_PAE 118 + if (!(pgd_val(*pgd) & _PAGE_PRESENT)) 119 + set_pgd(pgd, __pgd(__pa(pmd) | _PAGE_PRESENT)); 120 + #endif 121 + pud = pud_offset(pgd, vaddr); 122 + pmd = pmd_offset(pud, vaddr); 123 + if (!(pmd_val(*pmd) & _PAGE_PRESENT)) 124 + set_pmd(pmd, __pmd(__pa(pte) | _PAGE_TABLE)); 125 + pte = pte_offset_kernel(pmd, vaddr); 126 + set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC)); 127 + } 128 + 129 + static void machine_kexec_prepare_page_tables(struct kimage *image) 130 + { 131 + void *control_page; 132 + pmd_t *pmd = 0; 133 + 134 + control_page = page_address(image->control_code_page); 135 + #ifdef CONFIG_X86_PAE 136 + pmd = image->arch.pmd0; 137 + #endif 138 + machine_kexec_page_table_set_one( 139 + image->arch.pgd, pmd, image->arch.pte0, 140 + (unsigned long)control_page, __pa(control_page)); 141 + #ifdef CONFIG_X86_PAE 142 + pmd = image->arch.pmd1; 143 + #endif 144 + machine_kexec_page_table_set_one( 145 + image->arch.pgd, pmd, image->arch.pte1, 146 + __pa(control_page), __pa(control_page)); 147 + } 148 + 71 149 /* 72 150 * A architecture hook called to validate the 73 151 * proposed image and prepare the control pages ··· 149 87 * reboot code buffer to allow us to avoid allocations 150 88 * later. 151 89 * 152 - * Make control page executable. 90 + * - Make control page executable. 
91 + * - Allocate page tables 92 + * - Setup page tables 153 93 */ 154 94 int machine_kexec_prepare(struct kimage *image) 155 95 { 96 + int error; 97 + 156 98 if (nx_enabled) 157 99 set_pages_x(image->control_code_page, 1); 100 + error = machine_kexec_alloc_page_tables(image); 101 + if (error) 102 + return error; 103 + machine_kexec_prepare_page_tables(image); 158 104 return 0; 159 105 } 160 106 ··· 174 104 { 175 105 if (nx_enabled) 176 106 set_pages_nx(image->control_code_page, 1); 107 + machine_kexec_free_page_tables(image); 177 108 } 178 109 179 110 /* ··· 221 150 relocate_kernel_ptr = control_page; 222 151 page_list[PA_CONTROL_PAGE] = __pa(control_page); 223 152 page_list[VA_CONTROL_PAGE] = (unsigned long)control_page; 224 - page_list[PA_PGD] = __pa(kexec_pgd); 225 - page_list[VA_PGD] = (unsigned long)kexec_pgd; 226 - #ifdef CONFIG_X86_PAE 227 - page_list[PA_PMD_0] = __pa(kexec_pmd0); 228 - page_list[VA_PMD_0] = (unsigned long)kexec_pmd0; 229 - page_list[PA_PMD_1] = __pa(kexec_pmd1); 230 - page_list[VA_PMD_1] = (unsigned long)kexec_pmd1; 231 - #endif 232 - page_list[PA_PTE_0] = __pa(kexec_pte0); 233 - page_list[VA_PTE_0] = (unsigned long)kexec_pte0; 234 - page_list[PA_PTE_1] = __pa(kexec_pte1); 235 - page_list[VA_PTE_1] = (unsigned long)kexec_pte1; 153 + page_list[PA_PGD] = __pa(image->arch.pgd); 236 154 237 155 if (image->type == KEXEC_TYPE_DEFAULT) 238 156 page_list[PA_SWAP_PAGE] = (page_to_pfn(image->swap_page)
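The `machine_kexec_32.c` changes above move the kexec page tables from static `PAGE_ALIGNED` arrays into per-image allocations hung off `struct kimage`, with an all-or-nothing error path: allocate every table, and on any failure free whatever did succeed so the caller sees a clean `-ENOMEM`. A simplified userspace sketch of that allocation pattern (types and names are stand-ins, and the PAE-only pmd tables are omitted):

```c
#include <stdlib.h>

/* Stand-in for the page-table pointers this patch adds to the
 * image's arch-specific state. */
struct kimage_arch_sk {
	void *pgd, *pte0, *pte1;
};

/* Safe against partial allocation: free() accepts NULL, so this can
 * run whether or not every table was obtained. */
static void free_tables(struct kimage_arch_sk *a)
{
	free(a->pgd);
	free(a->pte0);
	free(a->pte1);
}

/* Mirror of machine_kexec_alloc_page_tables(): grab a zeroed page
 * for each table, and unwind completely if any allocation fails. */
static int alloc_tables(struct kimage_arch_sk *a)
{
	a->pgd  = calloc(1, 4096);
	a->pte0 = calloc(1, 4096);
	a->pte1 = calloc(1, 4096);
	if (!a->pgd || !a->pte0 || !a->pte1) {
		free_tables(a);
		return -1;	/* stands in for -ENOMEM */
	}
	return 0;
}
```

Pairing the allocator with a single idempotent free routine keeps `machine_kexec_cleanup()` and the error path identical, which is the same structure the patch uses.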
+86 -146
arch/x86/kernel/microcode_amd.c
··· 10 10 * This driver allows to upgrade microcode on AMD 11 11 * family 0x10 and 0x11 processors. 12 12 * 13 - * Licensed unter the terms of the GNU General Public 13 + * Licensed under the terms of the GNU General Public 14 14 * License version 2. See file COPYING for details. 15 15 */ 16 16 ··· 32 32 #include <linux/platform_device.h> 33 33 #include <linux/pci.h> 34 34 #include <linux/pci_ids.h> 35 + #include <linux/uaccess.h> 35 36 36 37 #include <asm/msr.h> 37 - #include <asm/uaccess.h> 38 38 #include <asm/processor.h> 39 39 #include <asm/microcode.h> 40 40 ··· 47 47 #define UCODE_UCODE_TYPE 0x00000001 48 48 49 49 struct equiv_cpu_entry { 50 - unsigned int installed_cpu; 51 - unsigned int fixed_errata_mask; 52 - unsigned int fixed_errata_compare; 53 - unsigned int equiv_cpu; 54 - }; 50 + u32 installed_cpu; 51 + u32 fixed_errata_mask; 52 + u32 fixed_errata_compare; 53 + u16 equiv_cpu; 54 + u16 res; 55 + } __attribute__((packed)); 55 56 56 57 struct microcode_header_amd { 57 - unsigned int data_code; 58 - unsigned int patch_id; 59 - unsigned char mc_patch_data_id[2]; 60 - unsigned char mc_patch_data_len; 61 - unsigned char init_flag; 62 - unsigned int mc_patch_data_checksum; 63 - unsigned int nb_dev_id; 64 - unsigned int sb_dev_id; 65 - unsigned char processor_rev_id[2]; 66 - unsigned char nb_rev_id; 67 - unsigned char sb_rev_id; 68 - unsigned char bios_api_rev; 69 - unsigned char reserved1[3]; 70 - unsigned int match_reg[8]; 71 - }; 58 + u32 data_code; 59 + u32 patch_id; 60 + u16 mc_patch_data_id; 61 + u8 mc_patch_data_len; 62 + u8 init_flag; 63 + u32 mc_patch_data_checksum; 64 + u32 nb_dev_id; 65 + u32 sb_dev_id; 66 + u16 processor_rev_id; 67 + u8 nb_rev_id; 68 + u8 sb_rev_id; 69 + u8 bios_api_rev; 70 + u8 reserved1[3]; 71 + u32 match_reg[8]; 72 + } __attribute__((packed)); 72 73 73 74 struct microcode_amd { 74 75 struct microcode_header_amd hdr; 75 76 unsigned int mpb[0]; 76 77 }; 77 78 78 - #define UCODE_MAX_SIZE (2048) 79 - #define DEFAULT_UCODE_DATASIZE 
(896) 80 - #define MC_HEADER_SIZE (sizeof(struct microcode_header_amd)) 81 - #define DEFAULT_UCODE_TOTALSIZE (DEFAULT_UCODE_DATASIZE + MC_HEADER_SIZE) 82 - #define DWSIZE (sizeof(u32)) 83 - /* For now we support a fixed ucode total size only */ 84 - #define get_totalsize(mc) \ 85 - ((((struct microcode_amd *)mc)->hdr.mc_patch_data_len * 28) \ 86 - + MC_HEADER_SIZE) 79 + #define UCODE_MAX_SIZE 2048 80 + #define UCODE_CONTAINER_SECTION_HDR 8 81 + #define UCODE_CONTAINER_HEADER_SIZE 12 87 82 88 83 /* serialize access to the physical write */ 89 84 static DEFINE_SPINLOCK(microcode_update_lock); ··· 88 93 static int collect_cpu_info_amd(int cpu, struct cpu_signature *csig) 89 94 { 90 95 struct cpuinfo_x86 *c = &cpu_data(cpu); 96 + u32 dummy; 91 97 92 98 memset(csig, 0, sizeof(*csig)); 93 - 94 99 if (c->x86_vendor != X86_VENDOR_AMD || c->x86 < 0x10) { 95 - printk(KERN_ERR "microcode: CPU%d not a capable AMD processor\n", 96 - cpu); 100 + printk(KERN_WARNING "microcode: CPU%d: AMD CPU family 0x%x not " 101 + "supported\n", cpu, c->x86); 97 102 return -1; 98 103 } 99 - 100 - asm volatile("movl %1, %%ecx; rdmsr" 101 - : "=a" (csig->rev) 102 - : "i" (0x0000008B) : "ecx"); 103 - 104 - printk(KERN_INFO "microcode: collect_cpu_info_amd : patch_id=0x%x\n", 105 - csig->rev); 106 - 104 + rdmsr(MSR_AMD64_PATCH_LEVEL, csig->rev, dummy); 105 + printk(KERN_INFO "microcode: CPU%d: patch_level=0x%x\n", cpu, csig->rev); 107 106 return 0; 108 107 } 109 108 110 109 static int get_matching_microcode(int cpu, void *mc, int rev) 111 110 { 112 111 struct microcode_header_amd *mc_header = mc; 113 - struct pci_dev *nb_pci_dev, *sb_pci_dev; 114 112 unsigned int current_cpu_id; 115 - unsigned int equiv_cpu_id = 0x00; 113 + u16 equiv_cpu_id = 0; 116 114 unsigned int i = 0; 117 115 118 116 BUG_ON(equiv_cpu_table == NULL); ··· 120 132 } 121 133 122 134 if (!equiv_cpu_id) { 123 - printk(KERN_ERR "microcode: CPU%d cpu_id " 124 - "not found in equivalent cpu table \n", cpu); 135 + printk(KERN_WARNING 
"microcode: CPU%d: cpu revision " 136 + "not listed in equivalent cpu table\n", cpu); 125 137 return 0; 126 138 } 127 139 128 - if ((mc_header->processor_rev_id[0]) != (equiv_cpu_id & 0xff)) { 129 - printk(KERN_ERR 130 - "microcode: CPU%d patch does not match " 131 - "(patch is %x, cpu extended is %x) \n", 132 - cpu, mc_header->processor_rev_id[0], 133 - (equiv_cpu_id & 0xff)); 140 + if (mc_header->processor_rev_id != equiv_cpu_id) { 141 + printk(KERN_ERR "microcode: CPU%d: patch mismatch " 142 + "(processor_rev_id: %x, equiv_cpu_id: %x)\n", 143 + cpu, mc_header->processor_rev_id, equiv_cpu_id); 134 144 return 0; 135 145 } 136 146 137 - if ((mc_header->processor_rev_id[1]) != ((equiv_cpu_id >> 16) & 0xff)) { 138 - printk(KERN_ERR "microcode: CPU%d patch does not match " 139 - "(patch is %x, cpu base id is %x) \n", 140 - cpu, mc_header->processor_rev_id[1], 141 - ((equiv_cpu_id >> 16) & 0xff)); 142 - 147 + /* ucode might be chipset specific -- currently we don't support this */ 148 + if (mc_header->nb_dev_id || mc_header->sb_dev_id) { 149 + printk(KERN_ERR "microcode: CPU%d: loading of chipset " 150 + "specific code not yet supported\n", cpu); 143 151 return 0; 144 - } 145 - 146 - /* ucode may be northbridge specific */ 147 - if (mc_header->nb_dev_id) { 148 - nb_pci_dev = pci_get_device(PCI_VENDOR_ID_AMD, 149 - (mc_header->nb_dev_id & 0xff), 150 - NULL); 151 - if ((!nb_pci_dev) || 152 - (mc_header->nb_rev_id != nb_pci_dev->revision)) { 153 - printk(KERN_ERR "microcode: CPU%d NB mismatch \n", cpu); 154 - pci_dev_put(nb_pci_dev); 155 - return 0; 156 - } 157 - pci_dev_put(nb_pci_dev); 158 - } 159 - 160 - /* ucode may be southbridge specific */ 161 - if (mc_header->sb_dev_id) { 162 - sb_pci_dev = pci_get_device(PCI_VENDOR_ID_AMD, 163 - (mc_header->sb_dev_id & 0xff), 164 - NULL); 165 - if ((!sb_pci_dev) || 166 - (mc_header->sb_rev_id != sb_pci_dev->revision)) { 167 - printk(KERN_ERR "microcode: CPU%d SB mismatch \n", cpu); 168 - pci_dev_put(sb_pci_dev); 169 - return 0; 
170 - } 171 - pci_dev_put(sb_pci_dev); 172 152 } 173 153 174 154 if (mc_header->patch_id <= rev) ··· 148 192 static void apply_microcode_amd(int cpu) 149 193 { 150 194 unsigned long flags; 151 - unsigned int eax, edx; 152 - unsigned int rev; 195 + u32 rev, dummy; 153 196 int cpu_num = raw_smp_processor_id(); 154 197 struct ucode_cpu_info *uci = ucode_cpu_info + cpu_num; 155 198 struct microcode_amd *mc_amd = uci->mc; 156 - unsigned long addr; 157 199 158 200 /* We should bind the task to the CPU */ 159 201 BUG_ON(cpu_num != cpu); ··· 160 206 return; 161 207 162 208 spin_lock_irqsave(&microcode_update_lock, flags); 163 - 164 - addr = (unsigned long)&mc_amd->hdr.data_code; 165 - edx = (unsigned int)(((unsigned long)upper_32_bits(addr))); 166 - eax = (unsigned int)(((unsigned long)lower_32_bits(addr))); 167 - 168 - asm volatile("movl %0, %%ecx; wrmsr" : 169 - : "i" (0xc0010020), "a" (eax), "d" (edx) : "ecx"); 170 - 209 + wrmsrl(MSR_AMD64_PATCH_LOADER, (u64)(long)&mc_amd->hdr.data_code); 171 210 /* get patch id after patching */ 172 - asm volatile("movl %1, %%ecx; rdmsr" 173 - : "=a" (rev) 174 - : "i" (0x0000008B) : "ecx"); 175 - 211 + rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy); 176 212 spin_unlock_irqrestore(&microcode_update_lock, flags); 177 213 178 214 /* check current patch id and patch's id for match */ 179 215 if (rev != mc_amd->hdr.patch_id) { 180 - printk(KERN_ERR "microcode: CPU%d update from revision " 181 - "0x%x to 0x%x failed\n", cpu_num, 182 - mc_amd->hdr.patch_id, rev); 216 + printk(KERN_ERR "microcode: CPU%d: update failed " 217 + "(for patch_level=0x%x)\n", cpu, mc_amd->hdr.patch_id); 183 218 return; 184 219 } 185 220 186 - printk(KERN_INFO "microcode: CPU%d updated from revision " 187 - "0x%x to 0x%x \n", 188 - cpu_num, uci->cpu_sig.rev, mc_amd->hdr.patch_id); 221 + printk(KERN_INFO "microcode: CPU%d: updated (new patch_level=0x%x)\n", 222 + cpu, rev); 189 223 190 224 uci->cpu_sig.rev = rev; 191 225 } 192 226 193 - static void * get_next_ucode(u8 *buf, 
unsigned int size, 194 - int (*get_ucode_data)(void *, const void *, size_t), 195 - unsigned int *mc_size) 227 + static int get_ucode_data(void *to, const u8 *from, size_t n) 228 + { 229 + memcpy(to, from, n); 230 + return 0; 231 + } 232 + 233 + static void *get_next_ucode(const u8 *buf, unsigned int size, 234 + unsigned int *mc_size) 196 235 { 197 236 unsigned int total_size; 198 - #define UCODE_CONTAINER_SECTION_HDR 8 199 237 u8 section_hdr[UCODE_CONTAINER_SECTION_HDR]; 200 238 void *mc; 201 239 ··· 195 249 return NULL; 196 250 197 251 if (section_hdr[0] != UCODE_UCODE_TYPE) { 198 - printk(KERN_ERR "microcode: error! " 199 - "Wrong microcode payload type field\n"); 252 + printk(KERN_ERR "microcode: error: invalid type field in " 253 + "container file section header\n"); 200 254 return NULL; 201 255 } 202 256 203 257 total_size = (unsigned long) (section_hdr[4] + (section_hdr[5] << 8)); 204 258 205 - printk(KERN_INFO "microcode: size %u, total_size %u\n", 206 - size, total_size); 259 + printk(KERN_DEBUG "microcode: size %u, total_size %u\n", 260 + size, total_size); 207 261 208 262 if (total_size > size || total_size > UCODE_MAX_SIZE) { 209 - printk(KERN_ERR "microcode: error! 
Bad data in microcode data file\n"); 263 + printk(KERN_ERR "microcode: error: size mismatch\n"); 210 264 return NULL; 211 265 } 212 266 213 267 mc = vmalloc(UCODE_MAX_SIZE); 214 268 if (mc) { 215 269 memset(mc, 0, UCODE_MAX_SIZE); 216 - if (get_ucode_data(mc, buf + UCODE_CONTAINER_SECTION_HDR, total_size)) { 270 + if (get_ucode_data(mc, buf + UCODE_CONTAINER_SECTION_HDR, 271 + total_size)) { 217 272 vfree(mc); 218 273 mc = NULL; 219 274 } else 220 275 *mc_size = total_size + UCODE_CONTAINER_SECTION_HDR; 221 276 } 222 - #undef UCODE_CONTAINER_SECTION_HDR 223 277 return mc; 224 278 } 225 279 226 280 227 - static int install_equiv_cpu_table(u8 *buf, 228 - int (*get_ucode_data)(void *, const void *, size_t)) 281 + static int install_equiv_cpu_table(const u8 *buf) 229 282 { 230 - #define UCODE_CONTAINER_HEADER_SIZE 12 231 283 u8 *container_hdr[UCODE_CONTAINER_HEADER_SIZE]; 232 284 unsigned int *buf_pos = (unsigned int *)container_hdr; 233 285 unsigned long size; ··· 236 292 size = buf_pos[2]; 237 293 238 294 if (buf_pos[1] != UCODE_EQUIV_CPU_TABLE_TYPE || !size) { 239 - printk(KERN_ERR "microcode: error! 
" 240 - "Wrong microcode equivalnet cpu table\n"); 295 + printk(KERN_ERR "microcode: error: invalid type field in " 296 + "container file section header\n"); 241 297 return 0; 242 298 } 243 299 244 300 equiv_cpu_table = (struct equiv_cpu_entry *) vmalloc(size); 245 301 if (!equiv_cpu_table) { 246 - printk(KERN_ERR "microcode: error, can't allocate memory for equiv CPU table\n"); 302 + printk(KERN_ERR "microcode: failed to allocate " 303 + "equivalent CPU table\n"); 247 304 return 0; 248 305 } 249 306 ··· 255 310 } 256 311 257 312 return size + UCODE_CONTAINER_HEADER_SIZE; /* add header length */ 258 - #undef UCODE_CONTAINER_HEADER_SIZE 259 313 } 260 314 261 315 static void free_equiv_cpu_table(void) ··· 265 321 } 266 322 } 267 323 268 - static int generic_load_microcode(int cpu, void *data, size_t size, 269 - int (*get_ucode_data)(void *, const void *, size_t)) 324 + static int generic_load_microcode(int cpu, const u8 *data, size_t size) 270 325 { 271 326 struct ucode_cpu_info *uci = ucode_cpu_info + cpu; 272 - u8 *ucode_ptr = data, *new_mc = NULL, *mc; 327 + const u8 *ucode_ptr = data; 328 + void *new_mc = NULL; 329 + void *mc; 273 330 int new_rev = uci->cpu_sig.rev; 274 331 unsigned int leftover; 275 332 unsigned long offset; 276 333 277 - offset = install_equiv_cpu_table(ucode_ptr, get_ucode_data); 334 + offset = install_equiv_cpu_table(ucode_ptr); 278 335 if (!offset) { 279 - printk(KERN_ERR "microcode: installing equivalent cpu table failed\n"); 336 + printk(KERN_ERR "microcode: failed to create " 337 + "equivalent cpu table\n"); 280 338 return -EINVAL; 281 339 } 282 340 ··· 289 343 unsigned int uninitialized_var(mc_size); 290 344 struct microcode_header_amd *mc_header; 291 345 292 - mc = get_next_ucode(ucode_ptr, leftover, get_ucode_data, &mc_size); 346 + mc = get_next_ucode(ucode_ptr, leftover, &mc_size); 293 347 if (!mc) 294 348 break; 295 349 ··· 299 353 vfree(new_mc); 300 354 new_rev = mc_header->patch_id; 301 355 new_mc = mc; 302 - } else 356 + } else 
303 357 vfree(mc); 304 358 305 359 ucode_ptr += mc_size; ··· 311 365 if (uci->mc) 312 366 vfree(uci->mc); 313 367 uci->mc = new_mc; 314 - pr_debug("microcode: CPU%d found a matching microcode update with" 315 - " version 0x%x (current=0x%x)\n", 316 - cpu, new_rev, uci->cpu_sig.rev); 368 + pr_debug("microcode: CPU%d found a matching microcode " 369 + "update with version 0x%x (current=0x%x)\n", 370 + cpu, new_rev, uci->cpu_sig.rev); 317 371 } else 318 372 vfree(new_mc); 319 373 } ··· 321 375 free_equiv_cpu_table(); 322 376 323 377 return (int)leftover; 324 - } 325 - 326 - static int get_ucode_fw(void *to, const void *from, size_t n) 327 - { 328 - memcpy(to, from, n); 329 - return 0; 330 378 } 331 379 332 380 static int request_microcode_fw(int cpu, struct device *device) ··· 334 394 335 395 ret = request_firmware(&firmware, fw_name, device); 336 396 if (ret) { 337 - printk(KERN_ERR "microcode: ucode data file %s load failed\n", fw_name); 397 + printk(KERN_ERR "microcode: failed to load file %s\n", fw_name); 338 398 return ret; 339 399 } 340 400 341 - ret = generic_load_microcode(cpu, (void*)firmware->data, firmware->size, 342 - &get_ucode_fw); 401 + ret = generic_load_microcode(cpu, firmware->data, firmware->size); 343 402 344 403 release_firmware(firmware); 345 404 ··· 347 408 348 409 static int request_microcode_user(int cpu, const void __user *buf, size_t size) 349 410 { 350 - printk(KERN_WARNING "microcode: AMD microcode update via /dev/cpu/microcode" 351 - "is not supported\n"); 411 + printk(KERN_INFO "microcode: AMD microcode update via " 412 + "/dev/cpu/microcode not supported\n"); 352 413 return -1; 353 414 } 354 415 ··· 372 433 { 373 434 return &microcode_amd_ops; 374 435 } 436 +
+3 -3
arch/x86/kernel/microcode_core.c
··· 99 99 100 100 #define MICROCODE_VERSION "2.00" 101 101 102 - struct microcode_ops *microcode_ops; 102 + static struct microcode_ops *microcode_ops; 103 103 104 104 /* no concurrent ->write()s are allowed on /dev/cpu/microcode */ 105 105 static DEFINE_MUTEX(microcode_mutex); ··· 203 203 #endif 204 204 205 205 /* fake device for request_firmware */ 206 - struct platform_device *microcode_pdev; 206 + static struct platform_device *microcode_pdev; 207 207 208 208 static ssize_t reload_store(struct sys_device *dev, 209 209 struct sysdev_attribute *attr, ··· 328 328 return 0; 329 329 } 330 330 331 - void microcode_update_cpu(int cpu) 331 + static void microcode_update_cpu(int cpu) 332 332 { 333 333 struct ucode_cpu_info *uci = ucode_cpu_info + cpu; 334 334 int err = 0;
+1 -1
arch/x86/kernel/microcode_intel.c
··· 471 471 uci->mc = NULL; 472 472 } 473 473 474 - struct microcode_ops microcode_intel_ops = { 474 + static struct microcode_ops microcode_intel_ops = { 475 475 .request_microcode_user = request_microcode_user, 476 476 .request_microcode_fw = request_microcode_fw, 477 477 .collect_cpu_info = collect_cpu_info,
+13 -16
arch/x86/kernel/mpparse.c
··· 586 586 { 587 587 struct intel_mp_floating *mpf = mpf_found; 588 588 589 + if (!mpf) 590 + return; 591 + 592 + if (acpi_lapic && early) 593 + return; 594 + 595 + /* 596 + * MPS doesn't support hyperthreading, aka only have 597 + * thread 0 apic id in MPS table 598 + */ 599 + if (acpi_lapic && acpi_ioapic) 600 + return; 601 + 589 602 if (x86_quirks->mach_get_smp_config) { 590 603 if (x86_quirks->mach_get_smp_config(early)) 591 604 return; 592 605 } 593 - if (acpi_lapic && early) 594 - return; 595 - /* 596 - * ACPI supports both logical (e.g. Hyper-Threading) and physical 597 - * processors, where MPS only supports physical. 598 - */ 599 - if (acpi_lapic && acpi_ioapic) { 600 - printk(KERN_INFO "Using ACPI (MADT) for SMP configuration " 601 - "information\n"); 602 - return; 603 - } else if (acpi_lapic) 604 - printk(KERN_INFO "Using ACPI for processor (LAPIC) " 605 - "configuration information\n"); 606 - 607 - if (!mpf) 608 - return; 609 606 610 607 printk(KERN_INFO "Intel MultiProcessor Specification v1.%d\n", 611 608 mpf->mpf_specification);
+46 -12
arch/x86/kernel/nmi.c
··· 131 131 atomic_dec(&nmi_active); 132 132 } 133 133 134 + static void __acpi_nmi_disable(void *__unused) 135 + { 136 + apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED); 137 + } 138 + 134 139 int __init check_nmi_watchdog(void) 135 140 { 136 141 unsigned int *prev_nmi_count; ··· 184 179 kfree(prev_nmi_count); 185 180 return 0; 186 181 error: 187 - if (nmi_watchdog == NMI_IO_APIC && !timer_through_8259) 188 - disable_8259A_irq(0); 182 + if (nmi_watchdog == NMI_IO_APIC) { 183 + if (!timer_through_8259) 184 + disable_8259A_irq(0); 185 + on_each_cpu(__acpi_nmi_disable, NULL, 1); 186 + } 187 + 189 188 #ifdef CONFIG_X86_32 190 189 timer_ack = 0; 191 190 #endif ··· 208 199 ++str; 209 200 } 210 201 211 - get_option(&str, &nmi); 202 + if (!strncmp(str, "lapic", 5)) 203 + nmi_watchdog = NMI_LOCAL_APIC; 204 + else if (!strncmp(str, "ioapic", 6)) 205 + nmi_watchdog = NMI_IO_APIC; 206 + else { 207 + get_option(&str, &nmi); 208 + if (nmi >= NMI_INVALID) 209 + return 0; 210 + nmi_watchdog = nmi; 211 + } 212 212 213 - if (nmi >= NMI_INVALID) 214 - return 0; 215 - 216 - nmi_watchdog = nmi; 217 213 return 1; 218 214 } 219 215 __setup("nmi_watchdog=", setup_nmi_watchdog); ··· 299 285 on_each_cpu(__acpi_nmi_enable, NULL, 1); 300 286 } 301 287 302 - static void __acpi_nmi_disable(void *__unused) 303 - { 304 - apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED); 305 - } 306 - 307 288 /* 308 289 * Disable timer based NMIs on all CPUs: 309 290 */ ··· 349 340 return; 350 341 if (nmi_watchdog == NMI_LOCAL_APIC) 351 342 lapic_watchdog_stop(); 343 + else 344 + __acpi_nmi_disable(NULL); 352 345 __get_cpu_var(wd_enabled) = 0; 353 346 atomic_dec(&nmi_active); 354 347 } ··· 476 465 477 466 #ifdef CONFIG_SYSCTL 478 467 468 + static void enable_ioapic_nmi_watchdog_single(void *unused) 469 + { 470 + __get_cpu_var(wd_enabled) = 1; 471 + atomic_inc(&nmi_active); 472 + __acpi_nmi_enable(NULL); 473 + } 474 + 475 + static void enable_ioapic_nmi_watchdog(void) 476 + { 477 + 
on_each_cpu(enable_ioapic_nmi_watchdog_single, NULL, 1); 478 + touch_nmi_watchdog(); 479 + } 480 + 481 + static void disable_ioapic_nmi_watchdog(void) 482 + { 483 + on_each_cpu(stop_apic_nmi_watchdog, NULL, 1); 484 + } 485 + 479 486 static int __init setup_unknown_nmi_panic(char *str) 480 487 { 481 488 unknown_nmi_panic = 1; ··· 536 507 enable_lapic_nmi_watchdog(); 537 508 else 538 509 disable_lapic_nmi_watchdog(); 510 + } else if (nmi_watchdog == NMI_IO_APIC) { 511 + if (nmi_watchdog_enabled) 512 + enable_ioapic_nmi_watchdog(); 513 + else 514 + disable_ioapic_nmi_watchdog(); 539 515 } else { 540 516 printk(KERN_WARNING 541 517 "NMI watchdog doesn't know what hardware to touch\n");
+9 -1
arch/x86/kernel/numaq_32.c
··· 31 31 #include <asm/numaq.h> 32 32 #include <asm/topology.h> 33 33 #include <asm/processor.h> 34 - #include <asm/mpspec.h> 34 + #include <asm/genapic.h> 35 35 #include <asm/e820.h> 36 36 #include <asm/setup.h> 37 37 ··· 235 235 return 1; 236 236 } 237 237 238 + static int __init numaq_update_genapic(void) 239 + { 240 + genapic->wakeup_cpu = wakeup_secondary_cpu_via_nmi; 241 + 242 + return 0; 243 + } 244 + 238 245 static struct x86_quirks numaq_x86_quirks __initdata = { 239 246 .arch_pre_time_init = numaq_pre_time_init, 240 247 .arch_time_init = NULL, ··· 257 250 .mpc_oem_pci_bus = mpc_oem_pci_bus, 258 251 .smp_read_mpc_oem = smp_read_mpc_oem, 259 252 .setup_ioapic_ids = numaq_setup_ioapic_ids, 253 + .update_genapic = numaq_update_genapic, 260 254 }; 261 255 262 256 void numaq_mps_oem_check(struct mp_config_table *mpc, char *oem,
+2 -2
arch/x86/kernel/pci-dma.c
··· 300 300 static __devinit void via_no_dac(struct pci_dev *dev) 301 301 { 302 302 if ((dev->class >> 8) == PCI_CLASS_BRIDGE_PCI && forbid_dac == 0) { 303 - printk(KERN_INFO "PCI: VIA PCI bridge detected." 304 - "Disabling DAC.\n"); 303 + printk(KERN_INFO 304 + "PCI: VIA PCI bridge detected. Disabling DAC.\n"); 305 305 forbid_dac = 1; 306 306 } 307 307 }
+17
arch/x86/kernel/process.c
··· 1 1 #include <linux/errno.h> 2 2 #include <linux/kernel.h> 3 3 #include <linux/mm.h> 4 + #include <asm/idle.h> 4 5 #include <linux/smp.h> 5 6 #include <linux/slab.h> 6 7 #include <linux/sched.h> ··· 9 8 #include <linux/pm.h> 10 9 #include <linux/clockchips.h> 11 10 #include <asm/system.h> 11 + #include <asm/apic.h> 12 12 13 13 unsigned long idle_halt; 14 14 EXPORT_SYMBOL(idle_halt); ··· 123 121 #ifdef CONFIG_APM_MODULE 124 122 EXPORT_SYMBOL(default_idle); 125 123 #endif 124 + 125 + void stop_this_cpu(void *dummy) 126 + { 127 + local_irq_disable(); 128 + /* 129 + * Remove this CPU: 130 + */ 131 + cpu_clear(smp_processor_id(), cpu_online_map); 132 + disable_local_APIC(); 133 + 134 + for (;;) { 135 + if (hlt_works(smp_processor_id())) 136 + halt(); 137 + } 138 + } 126 139 127 140 static void do_nothing(void *unused) 128 141 {
+4 -5
arch/x86/kernel/ptrace.c
··· 929 929 switch (c->x86) { 930 930 case 0x6: 931 931 switch (c->x86_model) { 932 + case 0 ... 0xC: 933 + /* sorry, don't know about them */ 934 + break; 932 935 case 0xD: 933 936 case 0xE: /* Pentium M */ 934 937 bts_configure(&bts_cfg_pentium_m); 935 938 break; 936 - case 0xF: /* Core2 */ 937 - case 0x1C: /* Atom */ 939 + default: /* Core2, Atom, ... */ 938 940 bts_configure(&bts_cfg_core2); 939 - break; 940 - default: 941 - /* sorry, don't know about them */ 942 941 break; 943 942 } 944 943 break;
+123 -3
arch/x86/kernel/reboot.c
··· 21 21 # include <asm/iommu.h> 22 22 #endif 23 23 24 + #include <mach_ipi.h> 25 + 26 + 24 27 /* 25 28 * Power off function, if any 26 29 */ ··· 39 36 static int reboot_cpu = -1; 40 37 #endif 41 38 42 - /* reboot=b[ios] | s[mp] | t[riple] | k[bd] | e[fi] [, [w]arm | [c]old] 39 + /* This is set by the PCI code if either type 1 or type 2 PCI is detected */ 40 + bool port_cf9_safe = false; 41 + 42 + /* reboot=b[ios] | s[mp] | t[riple] | k[bd] | e[fi] [, [w]arm | [c]old] | p[ci] 43 43 warm Don't set the cold reboot flag 44 44 cold Set the cold reboot flag 45 45 bios Reboot by jumping through the BIOS (only for X86_32) ··· 51 45 kbd Use the keyboard controller. cold reset (default) 52 46 acpi Use the RESET_REG in the FADT 53 47 efi Use efi reset_system runtime service 48 + pci Use the so-called "PCI reset register", CF9 54 49 force Avoid anything that could hang. 55 50 */ 56 51 static int __init reboot_setup(char *str) ··· 86 79 case 'k': 87 80 case 't': 88 81 case 'e': 82 + case 'p': 89 83 reboot_type = *str; 90 84 break; 91 85 ··· 412 404 reboot_type = BOOT_KBD; 413 405 break; 414 406 415 - 416 407 case BOOT_EFI: 417 408 if (efi_enabled) 418 - efi.reset_system(reboot_mode ? EFI_RESET_WARM : EFI_RESET_COLD, 409 + efi.reset_system(reboot_mode ? 
410 + EFI_RESET_WARM : 411 + EFI_RESET_COLD, 419 412 EFI_SUCCESS, 0, NULL); 413 + reboot_type = BOOT_KBD; 414 + break; 420 415 416 + case BOOT_CF9: 417 + port_cf9_safe = true; 418 + /* fall through */ 419 + 420 + case BOOT_CF9_COND: 421 + if (port_cf9_safe) { 422 + u8 cf9 = inb(0xcf9) & ~6; 423 + outb(cf9|2, 0xcf9); /* Request hard reset */ 424 + udelay(50); 425 + outb(cf9|6, 0xcf9); /* Actually do the reset */ 426 + udelay(50); 427 + } 421 428 reboot_type = BOOT_KBD; 422 429 break; 423 430 } ··· 493 470 494 471 static void native_machine_halt(void) 495 472 { 473 + /* stop other cpus and apics */ 474 + machine_shutdown(); 475 + 476 + /* stop this cpu */ 477 + stop_this_cpu(NULL); 496 478 } 497 479 498 480 static void native_machine_power_off(void) ··· 549 521 void machine_crash_shutdown(struct pt_regs *regs) 550 522 { 551 523 machine_ops.crash_shutdown(regs); 524 + } 525 + #endif 526 + 527 + 528 + #if defined(CONFIG_SMP) 529 + 530 + /* This keeps a track of which one is crashing cpu. */ 531 + static int crashing_cpu; 532 + static nmi_shootdown_cb shootdown_callback; 533 + 534 + static atomic_t waiting_for_crash_ipi; 535 + 536 + static int crash_nmi_callback(struct notifier_block *self, 537 + unsigned long val, void *data) 538 + { 539 + int cpu; 540 + 541 + if (val != DIE_NMI_IPI) 542 + return NOTIFY_OK; 543 + 544 + cpu = raw_smp_processor_id(); 545 + 546 + /* Don't do anything if this handler is invoked on crashing cpu. 547 + * Otherwise, system will completely hang. Crashing cpu can get 548 + * an NMI if system was initially booted with nmi_watchdog parameter. 
549 + */ 550 + if (cpu == crashing_cpu) 551 + return NOTIFY_STOP; 552 + local_irq_disable(); 553 + 554 + shootdown_callback(cpu, (struct die_args *)data); 555 + 556 + atomic_dec(&waiting_for_crash_ipi); 557 + /* Assume hlt works */ 558 + halt(); 559 + for (;;) 560 + cpu_relax(); 561 + 562 + return 1; 563 + } 564 + 565 + static void smp_send_nmi_allbutself(void) 566 + { 567 + cpumask_t mask = cpu_online_map; 568 + cpu_clear(safe_smp_processor_id(), mask); 569 + if (!cpus_empty(mask)) 570 + send_IPI_mask(mask, NMI_VECTOR); 571 + } 572 + 573 + static struct notifier_block crash_nmi_nb = { 574 + .notifier_call = crash_nmi_callback, 575 + }; 576 + 577 + /* Halt all other CPUs, calling the specified function on each of them 578 + * 579 + * This function can be used to halt all other CPUs on crash 580 + * or emergency reboot time. The function passed as parameter 581 + * will be called inside a NMI handler on all CPUs. 582 + */ 583 + void nmi_shootdown_cpus(nmi_shootdown_cb callback) 584 + { 585 + unsigned long msecs; 586 + local_irq_disable(); 587 + 588 + /* Make a note of crashing cpu. Will be used in NMI callback.*/ 589 + crashing_cpu = safe_smp_processor_id(); 590 + 591 + shootdown_callback = callback; 592 + 593 + atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); 594 + /* Would it be better to replace the trap vector here? */ 595 + if (register_die_notifier(&crash_nmi_nb)) 596 + return; /* return what? */ 597 + /* Ensure the new callback function is set before sending 598 + * out the NMI 599 + */ 600 + wmb(); 601 + 602 + smp_send_nmi_allbutself(); 603 + 604 + msecs = 1000; /* Wait at most a second for the other cpus to stop */ 605 + while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) { 606 + mdelay(1); 607 + msecs--; 608 + } 609 + 610 + /* Leave the nmi callback set */ 611 + } 612 + #else /* !CONFIG_SMP */ 613 + void nmi_shootdown_cpus(nmi_shootdown_cb callback) 614 + { 615 + /* No other CPUs to shoot down */ 552 616 } 553 617 #endif
-115
arch/x86/kernel/relocate_kernel_32.S
··· 10 10 #include <asm/page.h> 11 11 #include <asm/kexec.h> 12 12 #include <asm/processor-flags.h> 13 - #include <asm/pgtable.h> 14 13 15 14 /* 16 15 * Must be relocatable PIC code callable as a C function 17 16 */ 18 17 19 18 #define PTR(x) (x << 2) 20 - #define PAGE_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY) 21 - #define PAE_PGD_ATTR (_PAGE_PRESENT) 22 19 23 20 /* control_page + KEXEC_CONTROL_CODE_MAX_SIZE 24 21 * ~ control_page + PAGE_SIZE are used as data storage and stack for ··· 36 39 #define CP_PA_BACKUP_PAGES_MAP DATA(0x1c) 37 40 38 41 .text 39 - .align PAGE_SIZE 40 42 .globl relocate_kernel 41 43 relocate_kernel: 42 44 /* Save the CPU context, used for jumping back */ ··· 56 60 movl %cr4, %eax 57 61 movl %eax, CR4(%edi) 58 62 59 - #ifdef CONFIG_X86_PAE 60 - /* map the control page at its virtual address */ 61 - 62 - movl PTR(VA_PGD)(%ebp), %edi 63 - movl PTR(VA_CONTROL_PAGE)(%ebp), %eax 64 - andl $0xc0000000, %eax 65 - shrl $27, %eax 66 - addl %edi, %eax 67 - 68 - movl PTR(PA_PMD_0)(%ebp), %edx 69 - orl $PAE_PGD_ATTR, %edx 70 - movl %edx, (%eax) 71 - 72 - movl PTR(VA_PMD_0)(%ebp), %edi 73 - movl PTR(VA_CONTROL_PAGE)(%ebp), %eax 74 - andl $0x3fe00000, %eax 75 - shrl $18, %eax 76 - addl %edi, %eax 77 - 78 - movl PTR(PA_PTE_0)(%ebp), %edx 79 - orl $PAGE_ATTR, %edx 80 - movl %edx, (%eax) 81 - 82 - movl PTR(VA_PTE_0)(%ebp), %edi 83 - movl PTR(VA_CONTROL_PAGE)(%ebp), %eax 84 - andl $0x001ff000, %eax 85 - shrl $9, %eax 86 - addl %edi, %eax 87 - 88 - movl PTR(PA_CONTROL_PAGE)(%ebp), %edx 89 - orl $PAGE_ATTR, %edx 90 - movl %edx, (%eax) 91 - 92 - /* identity map the control page at its physical address */ 93 - 94 - movl PTR(VA_PGD)(%ebp), %edi 95 - movl PTR(PA_CONTROL_PAGE)(%ebp), %eax 96 - andl $0xc0000000, %eax 97 - shrl $27, %eax 98 - addl %edi, %eax 99 - 100 - movl PTR(PA_PMD_1)(%ebp), %edx 101 - orl $PAE_PGD_ATTR, %edx 102 - movl %edx, (%eax) 103 - 104 - movl PTR(VA_PMD_1)(%ebp), %edi 105 - movl PTR(PA_CONTROL_PAGE)(%ebp), %eax 106 - andl 
$0x3fe00000, %eax 107 - shrl $18, %eax 108 - addl %edi, %eax 109 - 110 - movl PTR(PA_PTE_1)(%ebp), %edx 111 - orl $PAGE_ATTR, %edx 112 - movl %edx, (%eax) 113 - 114 - movl PTR(VA_PTE_1)(%ebp), %edi 115 - movl PTR(PA_CONTROL_PAGE)(%ebp), %eax 116 - andl $0x001ff000, %eax 117 - shrl $9, %eax 118 - addl %edi, %eax 119 - 120 - movl PTR(PA_CONTROL_PAGE)(%ebp), %edx 121 - orl $PAGE_ATTR, %edx 122 - movl %edx, (%eax) 123 - #else 124 - /* map the control page at its virtual address */ 125 - 126 - movl PTR(VA_PGD)(%ebp), %edi 127 - movl PTR(VA_CONTROL_PAGE)(%ebp), %eax 128 - andl $0xffc00000, %eax 129 - shrl $20, %eax 130 - addl %edi, %eax 131 - 132 - movl PTR(PA_PTE_0)(%ebp), %edx 133 - orl $PAGE_ATTR, %edx 134 - movl %edx, (%eax) 135 - 136 - movl PTR(VA_PTE_0)(%ebp), %edi 137 - movl PTR(VA_CONTROL_PAGE)(%ebp), %eax 138 - andl $0x003ff000, %eax 139 - shrl $10, %eax 140 - addl %edi, %eax 141 - 142 - movl PTR(PA_CONTROL_PAGE)(%ebp), %edx 143 - orl $PAGE_ATTR, %edx 144 - movl %edx, (%eax) 145 - 146 - /* identity map the control page at its physical address */ 147 - 148 - movl PTR(VA_PGD)(%ebp), %edi 149 - movl PTR(PA_CONTROL_PAGE)(%ebp), %eax 150 - andl $0xffc00000, %eax 151 - shrl $20, %eax 152 - addl %edi, %eax 153 - 154 - movl PTR(PA_PTE_1)(%ebp), %edx 155 - orl $PAGE_ATTR, %edx 156 - movl %edx, (%eax) 157 - 158 - movl PTR(VA_PTE_1)(%ebp), %edi 159 - movl PTR(PA_CONTROL_PAGE)(%ebp), %eax 160 - andl $0x003ff000, %eax 161 - shrl $10, %eax 162 - addl %edi, %eax 163 - 164 - movl PTR(PA_CONTROL_PAGE)(%ebp), %edx 165 - orl $PAGE_ATTR, %edx 166 - movl %edx, (%eax) 167 - #endif 168 - 169 - relocate_new_kernel: 170 63 /* read the arguments and say goodbye to the stack */ 171 64 movl 20+4(%esp), %ebx /* page_list */ 172 65 movl 20+8(%esp), %ebp /* list of pages */
+24 -152
arch/x86/kernel/setup.c
··· 98 98 99 99 #include <mach_apic.h> 100 100 #include <asm/paravirt.h> 101 + #include <asm/hypervisor.h> 101 102 102 103 #include <asm/percpu.h> 103 104 #include <asm/topology.h> ··· 449 448 * @size: Size of the crashkernel memory to reserve. 450 449 * Returns the base address on success, and -1ULL on failure. 451 450 */ 451 + static 452 452 unsigned long long __init find_and_reserve_crashkernel(unsigned long long size) 453 453 { 454 454 const unsigned long long alignment = 16<<20; /* 16M */ ··· 585 583 early_param("elfcorehdr", setup_elfcorehdr); 586 584 #endif 587 585 588 - static struct x86_quirks default_x86_quirks __initdata; 586 + static int __init default_update_genapic(void) 587 + { 588 + #ifdef CONFIG_X86_SMP 589 + # if defined(CONFIG_X86_GENERICARCH) || defined(CONFIG_X86_64) 590 + genapic->wakeup_cpu = wakeup_secondary_cpu_via_init; 591 + # endif 592 + #endif 593 + 594 + return 0; 595 + } 596 + 597 + static struct x86_quirks default_x86_quirks __initdata = { 598 + .update_genapic = default_update_genapic, 599 + }; 589 600 590 601 struct x86_quirks *x86_quirks __initdata = &default_x86_quirks; 591 602 592 - /* 593 - * Some BIOSes seem to corrupt the low 64k of memory during events 594 - * like suspend/resume and unplugging an HDMI cable. Reserve all 595 - * remaining free memory in that area and fill it with a distinct 596 - * pattern. 597 - */ 598 - #ifdef CONFIG_X86_CHECK_BIOS_CORRUPTION 599 - #define MAX_SCAN_AREAS 8 600 - 601 - static int __read_mostly memory_corruption_check = -1; 602 - 603 - static unsigned __read_mostly corruption_check_size = 64*1024; 604 - static unsigned __read_mostly corruption_check_period = 60; /* seconds */ 605 - 606 - static struct e820entry scan_areas[MAX_SCAN_AREAS]; 607 - static int num_scan_areas; 608 - 609 - 610 - static int set_corruption_check(char *arg) 611 - { 612 - char *end; 613 - 614 - memory_corruption_check = simple_strtol(arg, &end, 10); 615 - 616 - return (*end == 0) ? 
0 : -EINVAL; 617 - } 618 - early_param("memory_corruption_check", set_corruption_check); 619 - 620 - static int set_corruption_check_period(char *arg) 621 - { 622 - char *end; 623 - 624 - corruption_check_period = simple_strtoul(arg, &end, 10); 625 - 626 - return (*end == 0) ? 0 : -EINVAL; 627 - } 628 - early_param("memory_corruption_check_period", set_corruption_check_period); 629 - 630 - static int set_corruption_check_size(char *arg) 631 - { 632 - char *end; 633 - unsigned size; 634 - 635 - size = memparse(arg, &end); 636 - 637 - if (*end == '\0') 638 - corruption_check_size = size; 639 - 640 - return (size == corruption_check_size) ? 0 : -EINVAL; 641 - } 642 - early_param("memory_corruption_check_size", set_corruption_check_size); 643 - 644 - 645 - static void __init setup_bios_corruption_check(void) 646 - { 647 - u64 addr = PAGE_SIZE; /* assume first page is reserved anyway */ 648 - 649 - if (memory_corruption_check == -1) { 650 - memory_corruption_check = 651 - #ifdef CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK 652 - 1 653 - #else 654 - 0 655 - #endif 656 - ; 657 - } 658 - 659 - if (corruption_check_size == 0) 660 - memory_corruption_check = 0; 661 - 662 - if (!memory_corruption_check) 663 - return; 664 - 665 - corruption_check_size = round_up(corruption_check_size, PAGE_SIZE); 666 - 667 - while(addr < corruption_check_size && num_scan_areas < MAX_SCAN_AREAS) { 668 - u64 size; 669 - addr = find_e820_area_size(addr, &size, PAGE_SIZE); 670 - 671 - if (addr == 0) 672 - break; 673 - 674 - if ((addr + size) > corruption_check_size) 675 - size = corruption_check_size - addr; 676 - 677 - if (size == 0) 678 - break; 679 - 680 - e820_update_range(addr, size, E820_RAM, E820_RESERVED); 681 - scan_areas[num_scan_areas].addr = addr; 682 - scan_areas[num_scan_areas].size = size; 683 - num_scan_areas++; 684 - 685 - /* Assume we've already mapped this early memory */ 686 - memset(__va(addr), 0, size); 687 - 688 - addr += size; 689 - } 690 - 691 - printk(KERN_INFO "Scanning 
%d areas for low memory corruption\n", 692 - num_scan_areas); 693 - update_e820(); 694 - } 695 - 696 - static struct timer_list periodic_check_timer; 697 - 698 - void check_for_bios_corruption(void) 699 - { 700 - int i; 701 - int corruption = 0; 702 - 703 - if (!memory_corruption_check) 704 - return; 705 - 706 - for(i = 0; i < num_scan_areas; i++) { 707 - unsigned long *addr = __va(scan_areas[i].addr); 708 - unsigned long size = scan_areas[i].size; 709 - 710 - for(; size; addr++, size -= sizeof(unsigned long)) { 711 - if (!*addr) 712 - continue; 713 - printk(KERN_ERR "Corrupted low memory at %p (%lx phys) = %08lx\n", 714 - addr, __pa(addr), *addr); 715 - corruption = 1; 716 - *addr = 0; 717 - } 718 - } 719 - 720 - WARN(corruption, KERN_ERR "Memory corruption detected in low memory\n"); 721 - } 722 - 723 - static void periodic_check_for_corruption(unsigned long data) 724 - { 725 - check_for_bios_corruption(); 726 - mod_timer(&periodic_check_timer, round_jiffies(jiffies + corruption_check_period*HZ)); 727 - } 728 - 729 - void start_periodic_check_for_corruption(void) 730 - { 731 - if (!memory_corruption_check || corruption_check_period == 0) 732 - return; 733 - 734 - printk(KERN_INFO "Scanning for low memory corruption every %d seconds\n", 735 - corruption_check_period); 736 - 737 - init_timer(&periodic_check_timer); 738 - periodic_check_timer.function = &periodic_check_for_corruption; 739 - periodic_check_for_corruption(0); 740 - } 741 - #endif 742 - 603 + #ifdef CONFIG_X86_RESERVE_LOW_64K 743 604 static int __init dmi_low_memory_corruption(const struct dmi_system_id *d) 744 605 { 745 606 printk(KERN_NOTICE ··· 614 749 615 750 return 0; 616 751 } 752 + #endif 617 753 618 754 /* List of systems that have known low memory corruption BIOS problems */ 619 755 static struct dmi_system_id __initdata bad_bios_dmi_table[] = { ··· 772 906 dmi_scan_machine(); 773 907 774 908 dmi_check_system(bad_bios_dmi_table); 909 + 910 + /* 911 + * VMware detection requires dmi to be 
available, so this 912 + * needs to be done after dmi_scan_machine, for the BP. 913 + */ 914 + init_hypervisor(&boot_cpu_data); 775 915 776 916 #ifdef CONFIG_X86_32 777 917 probe_roms();
-42
arch/x86/kernel/sigframe.h
··· 1 - #ifdef CONFIG_X86_32 2 - struct sigframe { 3 - char __user *pretcode; 4 - int sig; 5 - struct sigcontext sc; 6 - /* 7 - * fpstate is unused. fpstate is moved/allocated after 8 - * retcode[] below. This movement allows to have the FP state and the 9 - * future state extensions (xsave) stay together. 10 - * And at the same time retaining the unused fpstate, prevents changing 11 - * the offset of extramask[] in the sigframe and thus prevent any 12 - * legacy application accessing/modifying it. 13 - */ 14 - struct _fpstate fpstate_unused; 15 - unsigned long extramask[_NSIG_WORDS-1]; 16 - char retcode[8]; 17 - /* fp state follows here */ 18 - }; 19 - 20 - struct rt_sigframe { 21 - char __user *pretcode; 22 - int sig; 23 - struct siginfo __user *pinfo; 24 - void __user *puc; 25 - struct siginfo info; 26 - struct ucontext uc; 27 - char retcode[8]; 28 - /* fp state follows here */ 29 - }; 30 - #else 31 - struct rt_sigframe { 32 - char __user *pretcode; 33 - struct ucontext uc; 34 - struct siginfo info; 35 - /* fp state follows here */ 36 - }; 37 - 38 - int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info, 39 - sigset_t *set, struct pt_regs *regs); 40 - int ia32_setup_frame(int sig, struct k_sigaction *ka, 41 - sigset_t *set, struct pt_regs *regs); 42 - #endif
+383 -190
arch/x86/kernel/signal_32.c → arch/x86/kernel/signal.c
··· 1 1 /* 2 2 * Copyright (C) 1991, 1992 Linus Torvalds 3 + * Copyright (C) 2000, 2001, 2002 Andi Kleen SuSE Labs 3 4 * 4 5 * 1997-11-28 Modified for POSIX.1b signals by Richard Henderson 5 6 * 2000-06-20 Pentium III FXSR, SSE support by Gareth Hughes 7 + * 2000-2002 x86-64 support by Andi Kleen 6 8 */ 7 - #include <linux/list.h> 8 9 9 - #include <linux/personality.h> 10 - #include <linux/binfmts.h> 11 - #include <linux/suspend.h> 12 - #include <linux/kernel.h> 13 - #include <linux/ptrace.h> 14 - #include <linux/signal.h> 15 - #include <linux/stddef.h> 16 - #include <linux/unistd.h> 17 - #include <linux/errno.h> 18 10 #include <linux/sched.h> 19 - #include <linux/wait.h> 20 - #include <linux/tracehook.h> 21 - #include <linux/elf.h> 22 - #include <linux/smp.h> 23 11 #include <linux/mm.h> 12 + #include <linux/smp.h> 13 + #include <linux/kernel.h> 14 + #include <linux/signal.h> 15 + #include <linux/errno.h> 16 + #include <linux/wait.h> 17 + #include <linux/ptrace.h> 18 + #include <linux/tracehook.h> 19 + #include <linux/unistd.h> 20 + #include <linux/stddef.h> 21 + #include <linux/personality.h> 22 + #include <linux/uaccess.h> 24 23 25 24 #include <asm/processor.h> 26 25 #include <asm/ucontext.h> 27 - #include <asm/uaccess.h> 28 26 #include <asm/i387.h> 29 27 #include <asm/vdso.h> 28 + 29 + #ifdef CONFIG_X86_64 30 + #include <asm/proto.h> 31 + #include <asm/ia32_unistd.h> 32 + #include <asm/mce.h> 33 + #endif /* CONFIG_X86_64 */ 34 + 30 35 #include <asm/syscall.h> 31 36 #include <asm/syscalls.h> 32 37 33 - #include "sigframe.h" 38 + #include <asm/sigframe.h> 34 39 35 40 #define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP))) 36 41 ··· 50 45 # define FIX_EFLAGS __FIX_EFLAGS 51 46 #endif 52 47 53 - /* 54 - * Atomically swap in the new signal mask, and wait for a signal. 
55 - */ 56 - asmlinkage int 57 - sys_sigsuspend(int history0, int history1, old_sigset_t mask) 58 - { 59 - mask &= _BLOCKABLE; 60 - spin_lock_irq(&current->sighand->siglock); 61 - current->saved_sigmask = current->blocked; 62 - siginitset(&current->blocked, mask); 63 - recalc_sigpending(); 64 - spin_unlock_irq(&current->sighand->siglock); 65 - 66 - current->state = TASK_INTERRUPTIBLE; 67 - schedule(); 68 - set_restore_sigmask(); 69 - 70 - return -ERESTARTNOHAND; 71 - } 72 - 73 - asmlinkage int 74 - sys_sigaction(int sig, const struct old_sigaction __user *act, 75 - struct old_sigaction __user *oact) 76 - { 77 - struct k_sigaction new_ka, old_ka; 78 - int ret; 79 - 80 - if (act) { 81 - old_sigset_t mask; 82 - 83 - if (!access_ok(VERIFY_READ, act, sizeof(*act)) || 84 - __get_user(new_ka.sa.sa_handler, &act->sa_handler) || 85 - __get_user(new_ka.sa.sa_restorer, &act->sa_restorer)) 86 - return -EFAULT; 87 - 88 - __get_user(new_ka.sa.sa_flags, &act->sa_flags); 89 - __get_user(mask, &act->sa_mask); 90 - siginitset(&new_ka.sa.sa_mask, mask); 91 - } 92 - 93 - ret = do_sigaction(sig, act ? &new_ka : NULL, oact ? 
&old_ka : NULL); 94 - 95 - if (!ret && oact) { 96 - if (!access_ok(VERIFY_WRITE, oact, sizeof(*oact)) || 97 - __put_user(old_ka.sa.sa_handler, &oact->sa_handler) || 98 - __put_user(old_ka.sa.sa_restorer, &oact->sa_restorer)) 99 - return -EFAULT; 100 - 101 - __put_user(old_ka.sa.sa_flags, &oact->sa_flags); 102 - __put_user(old_ka.sa.sa_mask.sig[0], &oact->sa_mask); 103 - } 104 - 105 - return ret; 106 - } 107 - 108 - asmlinkage int sys_sigaltstack(unsigned long bx) 109 - { 110 - /* 111 - * This is needed to make gcc realize it doesn't own the 112 - * "struct pt_regs" 113 - */ 114 - struct pt_regs *regs = (struct pt_regs *)&bx; 115 - const stack_t __user *uss = (const stack_t __user *)bx; 116 - stack_t __user *uoss = (stack_t __user *)regs->cx; 117 - 118 - return do_sigaltstack(uss, uoss, regs->sp); 119 - } 120 - 121 48 #define COPY(x) { \ 122 49 err |= __get_user(regs->x, &sc->x); \ 123 50 } ··· 60 123 regs->seg = tmp; \ 61 124 } 62 125 63 - #define COPY_SEG_STRICT(seg) { \ 126 + #define COPY_SEG_CPL3(seg) { \ 64 127 unsigned short tmp; \ 65 128 err |= __get_user(tmp, &sc->seg); \ 66 129 regs->seg = tmp | 3; \ ··· 72 135 loadsegment(seg, tmp); \ 73 136 } 74 137 75 - /* 76 - * Do a signal return; undo the signal stack. 
77 - */ 78 138 static int 79 139 restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc, 80 140 unsigned long *pax) ··· 83 149 /* Always make any pending restarted system calls return -EINTR */ 84 150 current_thread_info()->restart_block.fn = do_no_restart_syscall; 85 151 152 + #ifdef CONFIG_X86_32 86 153 GET_SEG(gs); 87 154 COPY_SEG(fs); 88 155 COPY_SEG(es); 89 156 COPY_SEG(ds); 157 + #endif /* CONFIG_X86_32 */ 158 + 90 159 COPY(di); COPY(si); COPY(bp); COPY(sp); COPY(bx); 91 160 COPY(dx); COPY(cx); COPY(ip); 92 - COPY_SEG_STRICT(cs); 93 - COPY_SEG_STRICT(ss); 161 + 162 + #ifdef CONFIG_X86_64 163 + COPY(r8); 164 + COPY(r9); 165 + COPY(r10); 166 + COPY(r11); 167 + COPY(r12); 168 + COPY(r13); 169 + COPY(r14); 170 + COPY(r15); 171 + #endif /* CONFIG_X86_64 */ 172 + 173 + #ifdef CONFIG_X86_32 174 + COPY_SEG_CPL3(cs); 175 + COPY_SEG_CPL3(ss); 176 + #else /* !CONFIG_X86_32 */ 177 + /* Kernel saves and restores only the CS segment register on signals, 178 + * which is the bare minimum needed to allow mixed 32/64-bit code. 179 + * App's signal handler can save/restore other segments if needed. 
*/ 180 + COPY_SEG_CPL3(cs); 181 + #endif /* CONFIG_X86_32 */ 94 182 95 183 err |= __get_user(tmpflags, &sc->flags); 96 184 regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS); ··· 125 169 return err; 126 170 } 127 171 128 - asmlinkage unsigned long sys_sigreturn(unsigned long __unused) 129 - { 130 - struct sigframe __user *frame; 131 - struct pt_regs *regs; 132 - unsigned long ax; 133 - sigset_t set; 134 - 135 - regs = (struct pt_regs *) &__unused; 136 - frame = (struct sigframe __user *)(regs->sp - 8); 137 - 138 - if (!access_ok(VERIFY_READ, frame, sizeof(*frame))) 139 - goto badframe; 140 - if (__get_user(set.sig[0], &frame->sc.oldmask) || (_NSIG_WORDS > 1 141 - && __copy_from_user(&set.sig[1], &frame->extramask, 142 - sizeof(frame->extramask)))) 143 - goto badframe; 144 - 145 - sigdelsetmask(&set, ~_BLOCKABLE); 146 - spin_lock_irq(&current->sighand->siglock); 147 - current->blocked = set; 148 - recalc_sigpending(); 149 - spin_unlock_irq(&current->sighand->siglock); 150 - 151 - if (restore_sigcontext(regs, &frame->sc, &ax)) 152 - goto badframe; 153 - return ax; 154 - 155 - badframe: 156 - if (show_unhandled_signals && printk_ratelimit()) { 157 - printk("%s%s[%d] bad frame in sigreturn frame:" 158 - "%p ip:%lx sp:%lx oeax:%lx", 159 - task_pid_nr(current) > 1 ? 
KERN_INFO : KERN_EMERG, 160 - current->comm, task_pid_nr(current), frame, regs->ip, 161 - regs->sp, regs->orig_ax); 162 - print_vma_addr(" in ", regs->ip); 163 - printk(KERN_CONT "\n"); 164 - } 165 - 166 - force_sig(SIGSEGV, current); 167 - 168 - return 0; 169 - } 170 - 171 - static long do_rt_sigreturn(struct pt_regs *regs) 172 - { 173 - struct rt_sigframe __user *frame; 174 - unsigned long ax; 175 - sigset_t set; 176 - 177 - frame = (struct rt_sigframe __user *)(regs->sp - sizeof(long)); 178 - if (!access_ok(VERIFY_READ, frame, sizeof(*frame))) 179 - goto badframe; 180 - if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set))) 181 - goto badframe; 182 - 183 - sigdelsetmask(&set, ~_BLOCKABLE); 184 - spin_lock_irq(&current->sighand->siglock); 185 - current->blocked = set; 186 - recalc_sigpending(); 187 - spin_unlock_irq(&current->sighand->siglock); 188 - 189 - if (restore_sigcontext(regs, &frame->uc.uc_mcontext, &ax)) 190 - goto badframe; 191 - 192 - if (do_sigaltstack(&frame->uc.uc_stack, NULL, regs->sp) == -EFAULT) 193 - goto badframe; 194 - 195 - return ax; 196 - 197 - badframe: 198 - signal_fault(regs, frame, "rt_sigreturn"); 199 - return 0; 200 - } 201 - 202 - asmlinkage int sys_rt_sigreturn(unsigned long __unused) 203 - { 204 - struct pt_regs *regs = (struct pt_regs *)&__unused; 205 - 206 - return do_rt_sigreturn(regs); 207 - } 208 - 209 - /* 210 - * Set up a signal frame. 
211 - */ 212 172 static int 213 173 setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate, 214 174 struct pt_regs *regs, unsigned long mask) 215 175 { 216 - int tmp, err = 0; 176 + int err = 0; 217 177 178 + #ifdef CONFIG_X86_32 179 + { 180 + unsigned int tmp; 181 + 182 + savesegment(gs, tmp); 183 + err |= __put_user(tmp, (unsigned int __user *)&sc->gs); 184 + } 218 185 err |= __put_user(regs->fs, (unsigned int __user *)&sc->fs); 219 - savesegment(gs, tmp); 220 - err |= __put_user(tmp, (unsigned int __user *)&sc->gs); 221 - 222 186 err |= __put_user(regs->es, (unsigned int __user *)&sc->es); 223 187 err |= __put_user(regs->ds, (unsigned int __user *)&sc->ds); 188 + #endif /* CONFIG_X86_32 */ 189 + 224 190 err |= __put_user(regs->di, &sc->di); 225 191 err |= __put_user(regs->si, &sc->si); 226 192 err |= __put_user(regs->bp, &sc->bp); ··· 151 273 err |= __put_user(regs->dx, &sc->dx); 152 274 err |= __put_user(regs->cx, &sc->cx); 153 275 err |= __put_user(regs->ax, &sc->ax); 276 + #ifdef CONFIG_X86_64 277 + err |= __put_user(regs->r8, &sc->r8); 278 + err |= __put_user(regs->r9, &sc->r9); 279 + err |= __put_user(regs->r10, &sc->r10); 280 + err |= __put_user(regs->r11, &sc->r11); 281 + err |= __put_user(regs->r12, &sc->r12); 282 + err |= __put_user(regs->r13, &sc->r13); 283 + err |= __put_user(regs->r14, &sc->r14); 284 + err |= __put_user(regs->r15, &sc->r15); 285 + #endif /* CONFIG_X86_64 */ 286 + 154 287 err |= __put_user(current->thread.trap_no, &sc->trapno); 155 288 err |= __put_user(current->thread.error_code, &sc->err); 156 289 err |= __put_user(regs->ip, &sc->ip); 290 + #ifdef CONFIG_X86_32 157 291 err |= __put_user(regs->cs, (unsigned int __user *)&sc->cs); 158 292 err |= __put_user(regs->flags, &sc->flags); 159 293 err |= __put_user(regs->sp, &sc->sp_at_signal); 160 294 err |= __put_user(regs->ss, (unsigned int __user *)&sc->ss); 295 + #else /* !CONFIG_X86_32 */ 296 + err |= __put_user(regs->flags, &sc->flags); 297 + err |= __put_user(regs->cs, 
&sc->cs); 298 + err |= __put_user(0, &sc->gs); 299 + err |= __put_user(0, &sc->fs); 300 + #endif /* CONFIG_X86_32 */ 161 301 162 - tmp = save_i387_xstate(fpstate); 163 - if (tmp < 0) 164 - err = 1; 165 - else 166 - err |= __put_user(tmp ? fpstate : NULL, &sc->fpstate); 302 + err |= __put_user(fpstate, &sc->fpstate); 167 303 168 304 /* non-iBCS2 extensions.. */ 169 305 err |= __put_user(mask, &sc->oldmask); ··· 185 293 186 294 return err; 187 295 } 296 + 297 + /* 298 + * Set up a signal frame. 299 + */ 300 + #ifdef CONFIG_X86_32 301 + static const struct { 302 + u16 poplmovl; 303 + u32 val; 304 + u16 int80; 305 + } __attribute__((packed)) retcode = { 306 + 0xb858, /* popl %eax; movl $..., %eax */ 307 + __NR_sigreturn, 308 + 0x80cd, /* int $0x80 */ 309 + }; 310 + 311 + static const struct { 312 + u8 movl; 313 + u32 val; 314 + u16 int80; 315 + u8 pad; 316 + } __attribute__((packed)) rt_retcode = { 317 + 0xb8, /* movl $..., %eax */ 318 + __NR_rt_sigreturn, 319 + 0x80cd, /* int $0x80 */ 320 + 0 321 + }; 188 322 189 323 /* 190 324 * Determine which stack to use.. ··· 246 328 if (used_math()) { 247 329 sp = sp - sig_xstate_size; 248 330 *fpstate = (struct _fpstate *) sp; 331 + if (save_i387_xstate(*fpstate) < 0) 332 + return (void __user *)-1L; 249 333 } 250 334 251 335 sp -= frame_size; ··· 303 383 * reasons and because gdb uses it as a signature to notice 304 384 * signal handler stack frames. 305 385 */ 306 - err |= __put_user(0xb858, (short __user *)(frame->retcode+0)); 307 - err |= __put_user(__NR_sigreturn, (int __user *)(frame->retcode+2)); 308 - err |= __put_user(0x80cd, (short __user *)(frame->retcode+6)); 386 + err |= __put_user(*((u64 *)&retcode), (u64 *)frame->retcode); 309 387 310 388 if (err) 311 389 return -EFAULT; ··· 372 454 * reasons and because gdb uses it as a signature to notice 373 455 * signal handler stack frames. 
374 456 */ 375 - err |= __put_user(0xb8, (char __user *)(frame->retcode+0)); 376 - err |= __put_user(__NR_rt_sigreturn, (int __user *)(frame->retcode+1)); 377 - err |= __put_user(0x80cd, (short __user *)(frame->retcode+5)); 457 + err |= __put_user(*((u64 *)&rt_retcode), (u64 *)frame->retcode); 378 458 379 459 if (err) 380 460 return -EFAULT; ··· 391 475 392 476 return 0; 393 477 } 478 + #else /* !CONFIG_X86_32 */ 479 + /* 480 + * Determine which stack to use.. 481 + */ 482 + static void __user * 483 + get_stack(struct k_sigaction *ka, unsigned long sp, unsigned long size) 484 + { 485 + /* Default to using normal stack - redzone*/ 486 + sp -= 128; 487 + 488 + /* This is the X/Open sanctioned signal stack switching. */ 489 + if (ka->sa.sa_flags & SA_ONSTACK) { 490 + if (sas_ss_flags(sp) == 0) 491 + sp = current->sas_ss_sp + current->sas_ss_size; 492 + } 493 + 494 + return (void __user *)round_down(sp - size, 64); 495 + } 496 + 497 + static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info, 498 + sigset_t *set, struct pt_regs *regs) 499 + { 500 + struct rt_sigframe __user *frame; 501 + void __user *fp = NULL; 502 + int err = 0; 503 + struct task_struct *me = current; 504 + 505 + if (used_math()) { 506 + fp = get_stack(ka, regs->sp, sig_xstate_size); 507 + frame = (void __user *)round_down( 508 + (unsigned long)fp - sizeof(struct rt_sigframe), 16) - 8; 509 + 510 + if (save_i387_xstate(fp) < 0) 511 + return -EFAULT; 512 + } else 513 + frame = get_stack(ka, regs->sp, sizeof(struct rt_sigframe)) - 8; 514 + 515 + if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame))) 516 + return -EFAULT; 517 + 518 + if (ka->sa.sa_flags & SA_SIGINFO) { 519 + if (copy_siginfo_to_user(&frame->info, info)) 520 + return -EFAULT; 521 + } 522 + 523 + /* Create the ucontext. 
*/ 524 + if (cpu_has_xsave) 525 + err |= __put_user(UC_FP_XSTATE, &frame->uc.uc_flags); 526 + else 527 + err |= __put_user(0, &frame->uc.uc_flags); 528 + err |= __put_user(0, &frame->uc.uc_link); 529 + err |= __put_user(me->sas_ss_sp, &frame->uc.uc_stack.ss_sp); 530 + err |= __put_user(sas_ss_flags(regs->sp), 531 + &frame->uc.uc_stack.ss_flags); 532 + err |= __put_user(me->sas_ss_size, &frame->uc.uc_stack.ss_size); 533 + err |= setup_sigcontext(&frame->uc.uc_mcontext, fp, regs, set->sig[0]); 534 + err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set)); 535 + 536 + /* Set up to return from userspace. If provided, use a stub 537 + already in userspace. */ 538 + /* x86-64 should always use SA_RESTORER. */ 539 + if (ka->sa.sa_flags & SA_RESTORER) { 540 + err |= __put_user(ka->sa.sa_restorer, &frame->pretcode); 541 + } else { 542 + /* could use a vstub here */ 543 + return -EFAULT; 544 + } 545 + 546 + if (err) 547 + return -EFAULT; 548 + 549 + /* Set up registers for signal handler */ 550 + regs->di = sig; 551 + /* In case the signal handler was declared without prototypes */ 552 + regs->ax = 0; 553 + 554 + /* This also works for non SA_SIGINFO handlers because they expect the 555 + next argument after the signal number on the stack. */ 556 + regs->si = (unsigned long)&frame->info; 557 + regs->dx = (unsigned long)&frame->uc; 558 + regs->ip = (unsigned long) ka->sa.sa_handler; 559 + 560 + regs->sp = (unsigned long)frame; 561 + 562 + /* Set up the CS register to run signal handlers in 64-bit mode, 563 + even if the handler happens to be interrupting 32-bit code. */ 564 + regs->cs = __USER_CS; 565 + 566 + return 0; 567 + } 568 + #endif /* CONFIG_X86_32 */ 569 + 570 + #ifdef CONFIG_X86_32 571 + /* 572 + * Atomically swap in the new signal mask, and wait for a signal. 
573 + */ 574 + asmlinkage int 575 + sys_sigsuspend(int history0, int history1, old_sigset_t mask) 576 + { 577 + mask &= _BLOCKABLE; 578 + spin_lock_irq(&current->sighand->siglock); 579 + current->saved_sigmask = current->blocked; 580 + siginitset(&current->blocked, mask); 581 + recalc_sigpending(); 582 + spin_unlock_irq(&current->sighand->siglock); 583 + 584 + current->state = TASK_INTERRUPTIBLE; 585 + schedule(); 586 + set_restore_sigmask(); 587 + 588 + return -ERESTARTNOHAND; 589 + } 590 + 591 + asmlinkage int 592 + sys_sigaction(int sig, const struct old_sigaction __user *act, 593 + struct old_sigaction __user *oact) 594 + { 595 + struct k_sigaction new_ka, old_ka; 596 + int ret; 597 + 598 + if (act) { 599 + old_sigset_t mask; 600 + 601 + if (!access_ok(VERIFY_READ, act, sizeof(*act)) || 602 + __get_user(new_ka.sa.sa_handler, &act->sa_handler) || 603 + __get_user(new_ka.sa.sa_restorer, &act->sa_restorer)) 604 + return -EFAULT; 605 + 606 + __get_user(new_ka.sa.sa_flags, &act->sa_flags); 607 + __get_user(mask, &act->sa_mask); 608 + siginitset(&new_ka.sa.sa_mask, mask); 609 + } 610 + 611 + ret = do_sigaction(sig, act ? &new_ka : NULL, oact ? 
&old_ka : NULL); 612 + 613 + if (!ret && oact) { 614 + if (!access_ok(VERIFY_WRITE, oact, sizeof(*oact)) || 615 + __put_user(old_ka.sa.sa_handler, &oact->sa_handler) || 616 + __put_user(old_ka.sa.sa_restorer, &oact->sa_restorer)) 617 + return -EFAULT; 618 + 619 + __put_user(old_ka.sa.sa_flags, &oact->sa_flags); 620 + __put_user(old_ka.sa.sa_mask.sig[0], &oact->sa_mask); 621 + } 622 + 623 + return ret; 624 + } 625 + #endif /* CONFIG_X86_32 */ 626 + 627 + #ifdef CONFIG_X86_32 628 + asmlinkage int sys_sigaltstack(unsigned long bx) 629 + { 630 + /* 631 + * This is needed to make gcc realize it doesn't own the 632 + * "struct pt_regs" 633 + */ 634 + struct pt_regs *regs = (struct pt_regs *)&bx; 635 + const stack_t __user *uss = (const stack_t __user *)bx; 636 + stack_t __user *uoss = (stack_t __user *)regs->cx; 637 + 638 + return do_sigaltstack(uss, uoss, regs->sp); 639 + } 640 + #else /* !CONFIG_X86_32 */ 641 + asmlinkage long 642 + sys_sigaltstack(const stack_t __user *uss, stack_t __user *uoss, 643 + struct pt_regs *regs) 644 + { 645 + return do_sigaltstack(uss, uoss, regs->sp); 646 + } 647 + #endif /* CONFIG_X86_32 */ 648 + 649 + /* 650 + * Do a signal return; undo the signal stack. 
651 + */ 652 + #ifdef CONFIG_X86_32 653 + asmlinkage unsigned long sys_sigreturn(unsigned long __unused) 654 + { 655 + struct sigframe __user *frame; 656 + struct pt_regs *regs; 657 + unsigned long ax; 658 + sigset_t set; 659 + 660 + regs = (struct pt_regs *) &__unused; 661 + frame = (struct sigframe __user *)(regs->sp - 8); 662 + 663 + if (!access_ok(VERIFY_READ, frame, sizeof(*frame))) 664 + goto badframe; 665 + if (__get_user(set.sig[0], &frame->sc.oldmask) || (_NSIG_WORDS > 1 666 + && __copy_from_user(&set.sig[1], &frame->extramask, 667 + sizeof(frame->extramask)))) 668 + goto badframe; 669 + 670 + sigdelsetmask(&set, ~_BLOCKABLE); 671 + spin_lock_irq(&current->sighand->siglock); 672 + current->blocked = set; 673 + recalc_sigpending(); 674 + spin_unlock_irq(&current->sighand->siglock); 675 + 676 + if (restore_sigcontext(regs, &frame->sc, &ax)) 677 + goto badframe; 678 + return ax; 679 + 680 + badframe: 681 + signal_fault(regs, frame, "sigreturn"); 682 + 683 + return 0; 684 + } 685 + #endif /* CONFIG_X86_32 */ 686 + 687 + static long do_rt_sigreturn(struct pt_regs *regs) 688 + { 689 + struct rt_sigframe __user *frame; 690 + unsigned long ax; 691 + sigset_t set; 692 + 693 + frame = (struct rt_sigframe __user *)(regs->sp - sizeof(long)); 694 + if (!access_ok(VERIFY_READ, frame, sizeof(*frame))) 695 + goto badframe; 696 + if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set))) 697 + goto badframe; 698 + 699 + sigdelsetmask(&set, ~_BLOCKABLE); 700 + spin_lock_irq(&current->sighand->siglock); 701 + current->blocked = set; 702 + recalc_sigpending(); 703 + spin_unlock_irq(&current->sighand->siglock); 704 + 705 + if (restore_sigcontext(regs, &frame->uc.uc_mcontext, &ax)) 706 + goto badframe; 707 + 708 + if (do_sigaltstack(&frame->uc.uc_stack, NULL, regs->sp) == -EFAULT) 709 + goto badframe; 710 + 711 + return ax; 712 + 713 + badframe: 714 + signal_fault(regs, frame, "rt_sigreturn"); 715 + return 0; 716 + } 717 + 718 + #ifdef CONFIG_X86_32 719 + asmlinkage int 
sys_rt_sigreturn(struct pt_regs regs) 720 + { 721 + return do_rt_sigreturn(&regs); 722 + } 723 + #else /* !CONFIG_X86_32 */ 724 + asmlinkage long sys_rt_sigreturn(struct pt_regs *regs) 725 + { 726 + return do_rt_sigreturn(regs); 727 + } 728 + #endif /* CONFIG_X86_32 */ 394 729 395 730 /* 396 731 * OK, we're invoking a handler: 397 732 */ 398 733 static int signr_convert(int sig) 399 734 { 735 + #ifdef CONFIG_X86_32 400 736 struct thread_info *info = current_thread_info(); 401 737 402 738 if (info->exec_domain && info->exec_domain->signal_invmap && sig < 32) 403 739 return info->exec_domain->signal_invmap[sig]; 740 + #endif /* CONFIG_X86_32 */ 404 741 return sig; 405 742 } 743 + 744 + #ifdef CONFIG_X86_32 406 745 407 746 #define is_ia32 1 408 747 #define ia32_setup_frame __setup_frame 409 748 #define ia32_setup_rt_frame __setup_rt_frame 749 + 750 + #else /* !CONFIG_X86_32 */ 751 + 752 + #ifdef CONFIG_IA32_EMULATION 753 + #define is_ia32 test_thread_flag(TIF_IA32) 754 + #else /* !CONFIG_IA32_EMULATION */ 755 + #define is_ia32 0 756 + #endif /* CONFIG_IA32_EMULATION */ 757 + 758 + int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info, 759 + sigset_t *set, struct pt_regs *regs); 760 + int ia32_setup_frame(int sig, struct k_sigaction *ka, 761 + sigset_t *set, struct pt_regs *regs); 762 + 763 + #endif /* CONFIG_X86_32 */ 410 764 411 765 static int 412 766 setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info, ··· 778 592 return 0; 779 593 } 780 594 595 + #ifdef CONFIG_X86_32 781 596 #define NR_restart_syscall __NR_restart_syscall 597 + #else /* !CONFIG_X86_32 */ 598 + #define NR_restart_syscall \ 599 + test_thread_flag(TIF_IA32) ? __NR_ia32_restart_syscall : __NR_restart_syscall 600 + #endif /* CONFIG_X86_32 */ 601 + 782 602 /* 783 603 * Note that 'init' is a special process: it doesn't get signals it doesn't 784 604 * want to handle. 
Thus you cannot kill init even with a SIGKILL even by ··· 896 704 struct task_struct *me = current; 897 705 898 706 if (show_unhandled_signals && printk_ratelimit()) { 899 - printk(KERN_INFO 707 + printk("%s" 900 708 "%s[%d] bad frame in %s frame:%p ip:%lx sp:%lx orax:%lx", 709 + task_pid_nr(current) > 1 ? KERN_INFO : KERN_EMERG, 901 710 me->comm, me->pid, where, frame, 902 711 regs->ip, regs->sp, regs->orig_ax); 903 712 print_vma_addr(" in ", regs->ip);
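The retcode hunks above replace three separate `__put_user()` stores with a single 8-byte copy of a packed struct. A standalone userspace sketch (assuming a little-endian build and the i386 syscall number 119 for `sigreturn`) showing that the packed-struct encoding produces the same byte sequence the old code wrote:

```c
#include <stdint.h>
#include <string.h>

#define NR_sigreturn 119	/* i386 __NR_sigreturn */

/* Same shape as the kernel's packed retcode in the diff above */
struct __attribute__((packed)) retcode {
	uint16_t poplmovl;	/* popl %eax; movl $imm32, %eax */
	uint32_t val;		/* immediate: the syscall number */
	uint16_t int80;		/* int $0x80 */
};

/* Returns 0 when the struct's bytes match the instruction stream the
 * old code emitted with three __put_user() calls. */
int check_retcode_encoding(void)
{
	const struct retcode rc = { 0xb858, NR_sigreturn, 0x80cd };
	/* 58 = popl %eax, b8 imm32 = movl $imm32, %eax, cd 80 = int $0x80 */
	const uint8_t expect[8] = { 0x58, 0xb8, 119, 0, 0, 0, 0xcd, 0x80 };

	if (sizeof(rc) != 8)
		return -1;
	return memcmp(&rc, expect, sizeof(expect));
}
```

The single `__put_user(*((u64 *)&retcode), (u64 *)frame->retcode)` in the diff relies on exactly this: with `packed`, the struct is 8 contiguous bytes in instruction order, so one 64-bit store writes the whole trampoline.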
-516
arch/x86/kernel/signal_64.c
··· 1 - /* 2 - * Copyright (C) 1991, 1992 Linus Torvalds 3 - * Copyright (C) 2000, 2001, 2002 Andi Kleen SuSE Labs 4 - * 5 - * 1997-11-28 Modified for POSIX.1b signals by Richard Henderson 6 - * 2000-06-20 Pentium III FXSR, SSE support by Gareth Hughes 7 - * 2000-2002 x86-64 support by Andi Kleen 8 - */ 9 - 10 - #include <linux/sched.h> 11 - #include <linux/mm.h> 12 - #include <linux/smp.h> 13 - #include <linux/kernel.h> 14 - #include <linux/signal.h> 15 - #include <linux/errno.h> 16 - #include <linux/wait.h> 17 - #include <linux/ptrace.h> 18 - #include <linux/tracehook.h> 19 - #include <linux/unistd.h> 20 - #include <linux/stddef.h> 21 - #include <linux/personality.h> 22 - #include <linux/compiler.h> 23 - #include <linux/uaccess.h> 24 - 25 - #include <asm/processor.h> 26 - #include <asm/ucontext.h> 27 - #include <asm/i387.h> 28 - #include <asm/proto.h> 29 - #include <asm/ia32_unistd.h> 30 - #include <asm/mce.h> 31 - #include <asm/syscall.h> 32 - #include <asm/syscalls.h> 33 - #include "sigframe.h" 34 - 35 - #define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP))) 36 - 37 - #define __FIX_EFLAGS (X86_EFLAGS_AC | X86_EFLAGS_OF | \ 38 - X86_EFLAGS_DF | X86_EFLAGS_TF | X86_EFLAGS_SF | \ 39 - X86_EFLAGS_ZF | X86_EFLAGS_AF | X86_EFLAGS_PF | \ 40 - X86_EFLAGS_CF) 41 - 42 - #ifdef CONFIG_X86_32 43 - # define FIX_EFLAGS (__FIX_EFLAGS | X86_EFLAGS_RF) 44 - #else 45 - # define FIX_EFLAGS __FIX_EFLAGS 46 - #endif 47 - 48 - asmlinkage long 49 - sys_sigaltstack(const stack_t __user *uss, stack_t __user *uoss, 50 - struct pt_regs *regs) 51 - { 52 - return do_sigaltstack(uss, uoss, regs->sp); 53 - } 54 - 55 - #define COPY(x) { \ 56 - err |= __get_user(regs->x, &sc->x); \ 57 - } 58 - 59 - #define COPY_SEG_STRICT(seg) { \ 60 - unsigned short tmp; \ 61 - err |= __get_user(tmp, &sc->seg); \ 62 - regs->seg = tmp | 3; \ 63 - } 64 - 65 - /* 66 - * Do a signal return; undo the signal stack. 
67 - */ 68 - static int 69 - restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc, 70 - unsigned long *pax) 71 - { 72 - void __user *buf; 73 - unsigned int tmpflags; 74 - unsigned int err = 0; 75 - 76 - /* Always make any pending restarted system calls return -EINTR */ 77 - current_thread_info()->restart_block.fn = do_no_restart_syscall; 78 - 79 - COPY(di); COPY(si); COPY(bp); COPY(sp); COPY(bx); 80 - COPY(dx); COPY(cx); COPY(ip); 81 - COPY(r8); 82 - COPY(r9); 83 - COPY(r10); 84 - COPY(r11); 85 - COPY(r12); 86 - COPY(r13); 87 - COPY(r14); 88 - COPY(r15); 89 - 90 - /* Kernel saves and restores only the CS segment register on signals, 91 - * which is the bare minimum needed to allow mixed 32/64-bit code. 92 - * App's signal handler can save/restore other segments if needed. */ 93 - COPY_SEG_STRICT(cs); 94 - 95 - err |= __get_user(tmpflags, &sc->flags); 96 - regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS); 97 - regs->orig_ax = -1; /* disable syscall checks */ 98 - 99 - err |= __get_user(buf, &sc->fpstate); 100 - err |= restore_i387_xstate(buf); 101 - 102 - err |= __get_user(*pax, &sc->ax); 103 - return err; 104 - } 105 - 106 - static long do_rt_sigreturn(struct pt_regs *regs) 107 - { 108 - struct rt_sigframe __user *frame; 109 - unsigned long ax; 110 - sigset_t set; 111 - 112 - frame = (struct rt_sigframe __user *)(regs->sp - sizeof(long)); 113 - if (!access_ok(VERIFY_READ, frame, sizeof(*frame))) 114 - goto badframe; 115 - if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set))) 116 - goto badframe; 117 - 118 - sigdelsetmask(&set, ~_BLOCKABLE); 119 - spin_lock_irq(&current->sighand->siglock); 120 - current->blocked = set; 121 - recalc_sigpending(); 122 - spin_unlock_irq(&current->sighand->siglock); 123 - 124 - if (restore_sigcontext(regs, &frame->uc.uc_mcontext, &ax)) 125 - goto badframe; 126 - 127 - if (do_sigaltstack(&frame->uc.uc_stack, NULL, regs->sp) == -EFAULT) 128 - goto badframe; 129 - 130 - return ax; 131 - 132 - 
badframe: 133 - signal_fault(regs, frame, "rt_sigreturn"); 134 - return 0; 135 - } 136 - 137 - asmlinkage long sys_rt_sigreturn(struct pt_regs *regs) 138 - { 139 - return do_rt_sigreturn(regs); 140 - } 141 - 142 - /* 143 - * Set up a signal frame. 144 - */ 145 - 146 - static inline int 147 - setup_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs, 148 - unsigned long mask, struct task_struct *me) 149 - { 150 - int err = 0; 151 - 152 - err |= __put_user(regs->cs, &sc->cs); 153 - err |= __put_user(0, &sc->gs); 154 - err |= __put_user(0, &sc->fs); 155 - 156 - err |= __put_user(regs->di, &sc->di); 157 - err |= __put_user(regs->si, &sc->si); 158 - err |= __put_user(regs->bp, &sc->bp); 159 - err |= __put_user(regs->sp, &sc->sp); 160 - err |= __put_user(regs->bx, &sc->bx); 161 - err |= __put_user(regs->dx, &sc->dx); 162 - err |= __put_user(regs->cx, &sc->cx); 163 - err |= __put_user(regs->ax, &sc->ax); 164 - err |= __put_user(regs->r8, &sc->r8); 165 - err |= __put_user(regs->r9, &sc->r9); 166 - err |= __put_user(regs->r10, &sc->r10); 167 - err |= __put_user(regs->r11, &sc->r11); 168 - err |= __put_user(regs->r12, &sc->r12); 169 - err |= __put_user(regs->r13, &sc->r13); 170 - err |= __put_user(regs->r14, &sc->r14); 171 - err |= __put_user(regs->r15, &sc->r15); 172 - err |= __put_user(me->thread.trap_no, &sc->trapno); 173 - err |= __put_user(me->thread.error_code, &sc->err); 174 - err |= __put_user(regs->ip, &sc->ip); 175 - err |= __put_user(regs->flags, &sc->flags); 176 - err |= __put_user(mask, &sc->oldmask); 177 - err |= __put_user(me->thread.cr2, &sc->cr2); 178 - 179 - return err; 180 - } 181 - 182 - /* 183 - * Determine which stack to use.. 184 - */ 185 - 186 - static void __user * 187 - get_stack(struct k_sigaction *ka, struct pt_regs *regs, unsigned long size) 188 - { 189 - unsigned long sp; 190 - 191 - /* Default to using normal stack - redzone*/ 192 - sp = regs->sp - 128; 193 - 194 - /* This is the X/Open sanctioned signal stack switching. 
*/ 195 - if (ka->sa.sa_flags & SA_ONSTACK) { 196 - if (sas_ss_flags(sp) == 0) 197 - sp = current->sas_ss_sp + current->sas_ss_size; 198 - } 199 - 200 - return (void __user *)round_down(sp - size, 64); 201 - } 202 - 203 - static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info, 204 - sigset_t *set, struct pt_regs *regs) 205 - { 206 - struct rt_sigframe __user *frame; 207 - void __user *fp = NULL; 208 - int err = 0; 209 - struct task_struct *me = current; 210 - 211 - if (used_math()) { 212 - fp = get_stack(ka, regs, sig_xstate_size); 213 - frame = (void __user *)round_down( 214 - (unsigned long)fp - sizeof(struct rt_sigframe), 16) - 8; 215 - 216 - if (save_i387_xstate(fp) < 0) 217 - return -EFAULT; 218 - } else 219 - frame = get_stack(ka, regs, sizeof(struct rt_sigframe)) - 8; 220 - 221 - if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame))) 222 - return -EFAULT; 223 - 224 - if (ka->sa.sa_flags & SA_SIGINFO) { 225 - if (copy_siginfo_to_user(&frame->info, info)) 226 - return -EFAULT; 227 - } 228 - 229 - /* Create the ucontext. */ 230 - if (cpu_has_xsave) 231 - err |= __put_user(UC_FP_XSTATE, &frame->uc.uc_flags); 232 - else 233 - err |= __put_user(0, &frame->uc.uc_flags); 234 - err |= __put_user(0, &frame->uc.uc_link); 235 - err |= __put_user(me->sas_ss_sp, &frame->uc.uc_stack.ss_sp); 236 - err |= __put_user(sas_ss_flags(regs->sp), 237 - &frame->uc.uc_stack.ss_flags); 238 - err |= __put_user(me->sas_ss_size, &frame->uc.uc_stack.ss_size); 239 - err |= setup_sigcontext(&frame->uc.uc_mcontext, regs, set->sig[0], me); 240 - err |= __put_user(fp, &frame->uc.uc_mcontext.fpstate); 241 - if (sizeof(*set) == 16) { 242 - __put_user(set->sig[0], &frame->uc.uc_sigmask.sig[0]); 243 - __put_user(set->sig[1], &frame->uc.uc_sigmask.sig[1]); 244 - } else 245 - err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set)); 246 - 247 - /* Set up to return from userspace. If provided, use a stub 248 - already in userspace. 
*/ 249 - /* x86-64 should always use SA_RESTORER. */ 250 - if (ka->sa.sa_flags & SA_RESTORER) { 251 - err |= __put_user(ka->sa.sa_restorer, &frame->pretcode); 252 - } else { 253 - /* could use a vstub here */ 254 - return -EFAULT; 255 - } 256 - 257 - if (err) 258 - return -EFAULT; 259 - 260 - /* Set up registers for signal handler */ 261 - regs->di = sig; 262 - /* In case the signal handler was declared without prototypes */ 263 - regs->ax = 0; 264 - 265 - /* This also works for non SA_SIGINFO handlers because they expect the 266 - next argument after the signal number on the stack. */ 267 - regs->si = (unsigned long)&frame->info; 268 - regs->dx = (unsigned long)&frame->uc; 269 - regs->ip = (unsigned long) ka->sa.sa_handler; 270 - 271 - regs->sp = (unsigned long)frame; 272 - 273 - /* Set up the CS register to run signal handlers in 64-bit mode, 274 - even if the handler happens to be interrupting 32-bit code. */ 275 - regs->cs = __USER_CS; 276 - 277 - return 0; 278 - } 279 - 280 - /* 281 - * OK, we're invoking a handler 282 - */ 283 - static int signr_convert(int sig) 284 - { 285 - return sig; 286 - } 287 - 288 - #ifdef CONFIG_IA32_EMULATION 289 - #define is_ia32 test_thread_flag(TIF_IA32) 290 - #else 291 - #define is_ia32 0 292 - #endif 293 - 294 - static int 295 - setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info, 296 - sigset_t *set, struct pt_regs *regs) 297 - { 298 - int usig = signr_convert(sig); 299 - int ret; 300 - 301 - /* Set up the stack frame */ 302 - if (is_ia32) { 303 - if (ka->sa.sa_flags & SA_SIGINFO) 304 - ret = ia32_setup_rt_frame(usig, ka, info, set, regs); 305 - else 306 - ret = ia32_setup_frame(usig, ka, set, regs); 307 - } else 308 - ret = __setup_rt_frame(sig, ka, info, set, regs); 309 - 310 - if (ret) { 311 - force_sigsegv(sig, current); 312 - return -EFAULT; 313 - } 314 - 315 - return ret; 316 - } 317 - 318 - static int 319 - handle_signal(unsigned long sig, siginfo_t *info, struct k_sigaction *ka, 320 - sigset_t *oldset, 
struct pt_regs *regs) 321 - { 322 - int ret; 323 - 324 - /* Are we from a system call? */ 325 - if (syscall_get_nr(current, regs) >= 0) { 326 - /* If so, check system call restarting.. */ 327 - switch (syscall_get_error(current, regs)) { 328 - case -ERESTART_RESTARTBLOCK: 329 - case -ERESTARTNOHAND: 330 - regs->ax = -EINTR; 331 - break; 332 - 333 - case -ERESTARTSYS: 334 - if (!(ka->sa.sa_flags & SA_RESTART)) { 335 - regs->ax = -EINTR; 336 - break; 337 - } 338 - /* fallthrough */ 339 - case -ERESTARTNOINTR: 340 - regs->ax = regs->orig_ax; 341 - regs->ip -= 2; 342 - break; 343 - } 344 - } 345 - 346 - /* 347 - * If TF is set due to a debugger (TIF_FORCED_TF), clear the TF 348 - * flag so that register information in the sigcontext is correct. 349 - */ 350 - if (unlikely(regs->flags & X86_EFLAGS_TF) && 351 - likely(test_and_clear_thread_flag(TIF_FORCED_TF))) 352 - regs->flags &= ~X86_EFLAGS_TF; 353 - 354 - ret = setup_rt_frame(sig, ka, info, oldset, regs); 355 - 356 - if (ret) 357 - return ret; 358 - 359 - #ifdef CONFIG_X86_64 360 - /* 361 - * This has nothing to do with segment registers, 362 - * despite the name. This magic affects uaccess.h 363 - * macros' behavior. Reset it to the normal setting. 364 - */ 365 - set_fs(USER_DS); 366 - #endif 367 - 368 - /* 369 - * Clear the direction flag as per the ABI for function entry. 370 - */ 371 - regs->flags &= ~X86_EFLAGS_DF; 372 - 373 - /* 374 - * Clear TF when entering the signal handler, but 375 - * notify any tracer that was single-stepping it. 376 - * The tracer may want to single-step inside the 377 - * handler too. 
378 - */ 379 - regs->flags &= ~X86_EFLAGS_TF; 380 - 381 - spin_lock_irq(&current->sighand->siglock); 382 - sigorsets(&current->blocked, &current->blocked, &ka->sa.sa_mask); 383 - if (!(ka->sa.sa_flags & SA_NODEFER)) 384 - sigaddset(&current->blocked, sig); 385 - recalc_sigpending(); 386 - spin_unlock_irq(&current->sighand->siglock); 387 - 388 - tracehook_signal_handler(sig, info, ka, regs, 389 - test_thread_flag(TIF_SINGLESTEP)); 390 - 391 - return 0; 392 - } 393 - 394 - #define NR_restart_syscall \ 395 - test_thread_flag(TIF_IA32) ? __NR_ia32_restart_syscall : __NR_restart_syscall 396 - /* 397 - * Note that 'init' is a special process: it doesn't get signals it doesn't 398 - * want to handle. Thus you cannot kill init even with a SIGKILL even by 399 - * mistake. 400 - */ 401 - static void do_signal(struct pt_regs *regs) 402 - { 403 - struct k_sigaction ka; 404 - siginfo_t info; 405 - int signr; 406 - sigset_t *oldset; 407 - 408 - /* 409 - * We want the common case to go fast, which is why we may in certain 410 - * cases get here from kernel mode. Just return without doing anything 411 - * if so. 412 - * X86_32: vm86 regs switched out by assembly code before reaching 413 - * here, so testing against kernel CS suffices. 414 - */ 415 - if (!user_mode(regs)) 416 - return; 417 - 418 - if (current_thread_info()->status & TS_RESTORE_SIGMASK) 419 - oldset = &current->saved_sigmask; 420 - else 421 - oldset = &current->blocked; 422 - 423 - signr = get_signal_to_deliver(&info, &ka, regs, NULL); 424 - if (signr > 0) { 425 - /* 426 - * Re-enable any watchpoints before delivering the 427 - * signal to user space. The processor register will 428 - * have been cleared if the watchpoint triggered 429 - * inside the kernel. 430 - */ 431 - if (current->thread.debugreg7) 432 - set_debugreg(current->thread.debugreg7, 7); 433 - 434 - /* Whee! Actually deliver the signal. 
*/ 435 - if (handle_signal(signr, &info, &ka, oldset, regs) == 0) { 436 - /* 437 - * A signal was successfully delivered; the saved 438 - * sigmask will have been stored in the signal frame, 439 - * and will be restored by sigreturn, so we can simply 440 - * clear the TS_RESTORE_SIGMASK flag. 441 - */ 442 - current_thread_info()->status &= ~TS_RESTORE_SIGMASK; 443 - } 444 - return; 445 - } 446 - 447 - /* Did we come from a system call? */ 448 - if (syscall_get_nr(current, regs) >= 0) { 449 - /* Restart the system call - no handlers present */ 450 - switch (syscall_get_error(current, regs)) { 451 - case -ERESTARTNOHAND: 452 - case -ERESTARTSYS: 453 - case -ERESTARTNOINTR: 454 - regs->ax = regs->orig_ax; 455 - regs->ip -= 2; 456 - break; 457 - 458 - case -ERESTART_RESTARTBLOCK: 459 - regs->ax = NR_restart_syscall; 460 - regs->ip -= 2; 461 - break; 462 - } 463 - } 464 - 465 - /* 466 - * If there's no signal to deliver, we just put the saved sigmask 467 - * back. 468 - */ 469 - if (current_thread_info()->status & TS_RESTORE_SIGMASK) { 470 - current_thread_info()->status &= ~TS_RESTORE_SIGMASK; 471 - sigprocmask(SIG_SETMASK, &current->saved_sigmask, NULL); 472 - } 473 - } 474 - 475 - /* 476 - * notification of userspace execution resumption 477 - * - triggered by the TIF_WORK_MASK flags 478 - */ 479 - void 480 - do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags) 481 - { 482 - #if defined(CONFIG_X86_64) && defined(CONFIG_X86_MCE) 483 - /* notify userspace of pending MCEs */ 484 - if (thread_info_flags & _TIF_MCE_NOTIFY) 485 - mce_notify_user(); 486 - #endif /* CONFIG_X86_64 && CONFIG_X86_MCE */ 487 - 488 - /* deal with pending signal delivery */ 489 - if (thread_info_flags & _TIF_SIGPENDING) 490 - do_signal(regs); 491 - 492 - if (thread_info_flags & _TIF_NOTIFY_RESUME) { 493 - clear_thread_flag(TIF_NOTIFY_RESUME); 494 - tracehook_notify_resume(regs); 495 - } 496 - 497 - #ifdef CONFIG_X86_32 498 - clear_thread_flag(TIF_IRET); 499 - #endif /* 
CONFIG_X86_32 */ 500 - } 501 - 502 - void signal_fault(struct pt_regs *regs, void __user *frame, char *where) 503 - { 504 - struct task_struct *me = current; 505 - 506 - if (show_unhandled_signals && printk_ratelimit()) { 507 - printk(KERN_INFO 508 - "%s[%d] bad frame in %s frame:%p ip:%lx sp:%lx orax:%lx", 509 - me->comm, me->pid, where, frame, 510 - regs->ip, regs->sp, regs->orig_ax); 511 - print_vma_addr(" in ", regs->ip); 512 - printk(KERN_CONT "\n"); 513 - } 514 - 515 - force_sig(SIGSEGV, me); 516 - }
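The restart logic being removed above (handle_signal() deciding between -EINTR and re-executing the syscall) follows a small, fixed decision table. A standalone sketch of that decision, outside the kernel; the ERESTART* values mirror the kernel-internal errno codes of this era, and the function name is invented for illustration:

```c
#include <assert.h>

#define ERESTARTSYS           512
#define ERESTARTNOINTR        513
#define ERESTARTNOHAND        514
#define ERESTART_RESTARTBLOCK 516

/*
 * Illustrative model of the handle_signal() restart decision: returns 1
 * when the syscall should be restarted (ax = orig_ax, ip -= 2), and 0
 * when it should fail with -EINTR.  sa_restart models SA_RESTART.
 */
static int restart_on_signal(long err, int sa_restart)
{
	switch (err) {
	case -ERESTART_RESTARTBLOCK:
	case -ERESTARTNOHAND:
		return 0;		/* always -EINTR once a handler runs */
	case -ERESTARTSYS:
		return sa_restart;	/* restart only with SA_RESTART */
	case -ERESTARTNOINTR:
		return 1;		/* always restarted */
	default:
		return 0;		/* not a restart code at all */
	}
}
```

Note how -ERESTARTSYS is the only code whose outcome depends on the handler's SA_RESTART flag, which is exactly the fallthrough structure in the switch above.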
-13
arch/x86/kernel/smp.c
··· 140 140 send_IPI_mask(mask, CALL_FUNCTION_VECTOR); 141 141 } 142 142 143 - static void stop_this_cpu(void *dummy) 144 - { 145 - local_irq_disable(); 146 - /* 147 - * Remove this CPU: 148 - */ 149 - cpu_clear(smp_processor_id(), cpu_online_map); 150 - disable_local_APIC(); 151 - if (hlt_works(smp_processor_id())) 152 - for (;;) halt(); 153 - for (;;); 154 - } 155 - 156 143 /* 157 144 * this function calls the 'stop' function on all other CPUs in the system. 158 145 */
+11 -12
arch/x86/kernel/smpboot.c
··· 62 62 #include <asm/mtrr.h> 63 63 #include <asm/vmi.h> 64 64 #include <asm/genapic.h> 65 + #include <asm/setup.h> 65 66 #include <linux/mc146818rtc.h> 66 67 67 68 #include <mach_apic.h> ··· 535 534 pr_debug("Before bogocount - setting activated=1.\n"); 536 535 } 537 536 538 - static inline void __inquire_remote_apic(int apicid) 537 + void __inquire_remote_apic(int apicid) 539 538 { 540 539 unsigned i, regs[] = { APIC_ID >> 4, APIC_LVR >> 4, APIC_SPIV >> 4 }; 541 540 char *names[] = { "ID", "VERSION", "SPIV" }; ··· 574 573 } 575 574 } 576 575 577 - #ifdef WAKE_SECONDARY_VIA_NMI 578 576 /* 579 577 * Poke the other CPU in the eye via NMI to wake it up. Remember that the normal 580 578 * INIT, INIT, STARTUP sequence will reset the chip hard for us, and this 581 579 * won't ... remember to clear down the APIC, etc later. 582 580 */ 583 - static int __devinit 584 - wakeup_secondary_cpu(int logical_apicid, unsigned long start_eip) 581 + int __devinit 582 + wakeup_secondary_cpu_via_nmi(int logical_apicid, unsigned long start_eip) 585 583 { 586 584 unsigned long send_status, accept_status = 0; 587 585 int maxlvt; ··· 597 597 * Give the other CPU some time to accept the IPI. 598 598 */ 599 599 udelay(200); 600 - if (APIC_INTEGRATED(apic_version[phys_apicid])) { 600 + if (APIC_INTEGRATED(apic_version[boot_cpu_physical_apicid])) { 601 601 maxlvt = lapic_get_maxlvt(); 602 602 if (maxlvt > 3) /* Due to the Pentium erratum 3AP. 
*/ 603 603 apic_write(APIC_ESR, 0); ··· 612 612 613 613 return (send_status | accept_status); 614 614 } 615 - #endif /* WAKE_SECONDARY_VIA_NMI */ 616 615 617 - #ifdef WAKE_SECONDARY_VIA_INIT 618 - static int __devinit 619 - wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip) 616 + int __devinit 617 + wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip) 620 618 { 621 619 unsigned long send_status, accept_status = 0; 622 620 int maxlvt, num_starts, j; ··· 733 735 734 736 return (send_status | accept_status); 735 737 } 736 - #endif /* WAKE_SECONDARY_VIA_INIT */ 737 738 738 739 struct create_idle { 739 740 struct work_struct work; ··· 1081 1084 #endif 1082 1085 1083 1086 if (!physid_isset(hard_smp_processor_id(), phys_cpu_present_map)) { 1084 - printk(KERN_WARNING "weird, boot CPU (#%d) not listed" 1085 - "by the BIOS.\n", hard_smp_processor_id()); 1087 + printk(KERN_WARNING 1088 + "weird, boot CPU (#%d) not listed by the BIOS.\n", 1089 + hard_smp_processor_id()); 1090 + 1086 1091 physid_set(hard_smp_processor_id(), phys_cpu_present_map); 1087 1092 } 1088 1093
+3 -1
arch/x86/kernel/time_64.c
··· 49 49 } 50 50 EXPORT_SYMBOL(profile_pc); 51 51 52 - irqreturn_t timer_interrupt(int irq, void *dev_id) 52 + static irqreturn_t timer_interrupt(int irq, void *dev_id) 53 53 { 54 54 add_pda(irq0_irqs, 1); 55 55 ··· 80 80 break; 81 81 no_ctr_free = (i == 4); 82 82 if (no_ctr_free) { 83 + WARN(1, KERN_WARNING "Warning: AMD perfctrs busy ... " 84 + "cpu_khz value may be incorrect.\n"); 83 85 i = 3; 84 86 rdmsrl(MSR_K7_EVNTSEL3, evntsel3); 85 87 wrmsrl(MSR_K7_EVNTSEL3, 0);
+5 -6
arch/x86/kernel/tlb_32.c
··· 34 34 */ 35 35 void leave_mm(int cpu) 36 36 { 37 - if (per_cpu(cpu_tlbstate, cpu).state == TLBSTATE_OK) 38 - BUG(); 39 - cpu_clear(cpu, per_cpu(cpu_tlbstate, cpu).active_mm->cpu_vm_mask); 37 + BUG_ON(x86_read_percpu(cpu_tlbstate.state) == TLBSTATE_OK); 38 + cpu_clear(cpu, x86_read_percpu(cpu_tlbstate.active_mm)->cpu_vm_mask); 40 39 load_cr3(swapper_pg_dir); 41 40 } 42 41 EXPORT_SYMBOL_GPL(leave_mm); ··· 103 104 * BUG(); 104 105 */ 105 106 106 - if (flush_mm == per_cpu(cpu_tlbstate, cpu).active_mm) { 107 - if (per_cpu(cpu_tlbstate, cpu).state == TLBSTATE_OK) { 107 + if (flush_mm == x86_read_percpu(cpu_tlbstate.active_mm)) { 108 + if (x86_read_percpu(cpu_tlbstate.state) == TLBSTATE_OK) { 108 109 if (flush_va == TLB_FLUSH_ALL) 109 110 local_flush_tlb(); 110 111 else ··· 237 238 unsigned long cpu = smp_processor_id(); 238 239 239 240 __flush_tlb_all(); 240 - if (per_cpu(cpu_tlbstate, cpu).state == TLBSTATE_LAZY) 241 + if (x86_read_percpu(cpu_tlbstate.state) == TLBSTATE_LAZY) 241 242 leave_mm(cpu); 242 243 } 243 244
-4
arch/x86/kernel/tlb_uv.c
··· 566 566 if (!is_uv_system()) 567 567 return 0; 568 568 569 - if (!proc_mkdir("sgi_uv", NULL)) 570 - return -EINVAL; 571 - 572 569 proc_uv_ptc = create_proc_entry(UV_PTC_BASENAME, 0444, NULL); 573 570 if (!proc_uv_ptc) { 574 571 printk(KERN_ERR "unable to create %s proc entry\n", 575 572 UV_PTC_BASENAME); 576 - remove_proc_entry("sgi_uv", NULL); 577 573 return -EINVAL; 578 574 } 579 575 proc_uv_ptc->proc_fops = &proc_uv_ptc_operations;
+17 -2
arch/x86/kernel/trampoline.c
··· 1 1 #include <linux/io.h> 2 2 3 3 #include <asm/trampoline.h> 4 + #include <asm/e820.h> 4 5 5 6 /* ready for x86_64 and x86 */ 6 7 unsigned char *trampoline_base = __va(TRAMPOLINE_BASE); 8 + 9 + void __init reserve_trampoline_memory(void) 10 + { 11 + #ifdef CONFIG_X86_32 12 + /* 13 + * But first pinch a few for the stack/trampoline stuff 14 + * FIXME: Don't need the extra page at 4K, but need to fix 15 + * trampoline before removing it. (see the GDT stuff) 16 + */ 17 + reserve_early(PAGE_SIZE, PAGE_SIZE + PAGE_SIZE, "EX TRAMPOLINE"); 18 + #endif 19 + /* Has to be in very low memory so we can execute real-mode AP code. */ 20 + reserve_early(TRAMPOLINE_BASE, TRAMPOLINE_BASE + TRAMPOLINE_SIZE, 21 + "TRAMPOLINE"); 22 + } 7 23 8 24 /* 9 25 * Currently trivial. Write the real->protected mode ··· 28 12 */ 29 13 unsigned long setup_trampoline(void) 30 14 { 31 - memcpy(trampoline_base, trampoline_data, 32 - trampoline_end - trampoline_data); 15 + memcpy(trampoline_base, trampoline_data, TRAMPOLINE_SIZE); 33 16 return virt_to_phys(trampoline_base); 34 17 }
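The new reserve_trampoline_memory() relies on the early reservation table behind reserve_early(): half-open [start, end) physical ranges recorded before the bootmem allocator exists. A toy model of such a table with overlap rejection; the struct layout, limit, and function name here are invented for the sketch, not the kernel's early_res implementation:

```c
#include <assert.h>

#define MAX_EARLY 8

/* Toy early-reservation table: ranges are half-open [start, end). */
static struct { unsigned long start, end; } early_res[MAX_EARLY];
static int nr_early;

/* Returns 0 on success, -1 if the range overlaps an existing entry
 * or the table is full. */
static int reserve_early_sketch(unsigned long start, unsigned long end)
{
	int i;

	for (i = 0; i < nr_early; i++)
		if (start < early_res[i].end && end > early_res[i].start)
			return -1;	/* overlaps an existing reservation */
	if (nr_early == MAX_EARLY)
		return -1;
	early_res[nr_early].start = start;
	early_res[nr_early].end = end;
	nr_early++;
	return 0;
}
```

With half-open ranges, a reservation starting exactly at a previous entry's end is not an overlap, which is why TRAMPOLINE_BASE .. TRAMPOLINE_BASE + TRAMPOLINE_SIZE can sit flush against a neighboring reservation.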
+15 -19
arch/x86/kernel/traps.c
··· 664 664 { 665 665 struct task_struct *task; 666 666 siginfo_t info; 667 - unsigned short cwd, swd; 667 + unsigned short cwd, swd, err; 668 668 669 669 /* 670 670 * Save the info for the exception handler and clear the error. ··· 675 675 task->thread.error_code = 0; 676 676 info.si_signo = SIGFPE; 677 677 info.si_errno = 0; 678 - info.si_code = __SI_FAULT; 679 678 info.si_addr = ip; 680 679 /* 681 680 * (~cwd & swd) will mask out exceptions that are not set to unmasked ··· 688 689 */ 689 690 cwd = get_fpu_cwd(task); 690 691 swd = get_fpu_swd(task); 691 - switch (swd & ~cwd & 0x3f) { 692 - case 0x000: /* No unmasked exception */ 693 - #ifdef CONFIG_X86_32 692 + 693 + err = swd & ~cwd & 0x3f; 694 + 695 + #if CONFIG_X86_32 696 + if (!err) 694 697 return; 695 698 #endif 696 - default: /* Multiple exceptions */ 697 - break; 698 - case 0x001: /* Invalid Op */ 699 + 700 + if (err & 0x001) { /* Invalid op */ 699 701 /* 700 702 * swd & 0x240 == 0x040: Stack Underflow 701 703 * swd & 0x240 == 0x240: Stack Overflow 702 704 * User must clear the SF bit (0x40) if set 703 705 */ 704 706 info.si_code = FPE_FLTINV; 705 - break; 706 - case 0x002: /* Denormalize */ 707 - case 0x010: /* Underflow */ 708 - info.si_code = FPE_FLTUND; 709 - break; 710 - case 0x004: /* Zero Divide */ 707 + } else if (err & 0x004) { /* Divide by Zero */ 711 708 info.si_code = FPE_FLTDIV; 712 - break; 713 - case 0x008: /* Overflow */ 709 + } else if (err & 0x008) { /* Overflow */ 714 710 info.si_code = FPE_FLTOVF; 715 - break; 716 - case 0x020: /* Precision */ 711 + } else if (err & 0x012) { /* Denormal, Underflow */ 712 + info.si_code = FPE_FLTUND; 713 + } else if (err & 0x020) { /* Precision */ 717 714 info.si_code = FPE_FLTRES; 718 - break; 715 + } else { 716 + info.si_code = __SI_FAULT|SI_KERNEL; /* WTF? */ 719 717 } 720 718 force_sig_info(SIGFPE, &info, task); 721 719 }
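The rewritten do_simd/math_error path above masks the x87 status word with the inverted control word (`swd & ~cwd & 0x3f`) so only unmasked, raised exceptions remain, then picks an si_code in priority order. A standalone sketch of that mapping; the FPE_* names are kept but given illustrative enum values rather than the kernel's:

```c
#include <assert.h>

/* Illustrative stand-ins for the FPE_FLT* si_code constants. */
enum fpe_code { FPE_NONE, FPE_FLTINV, FPE_FLTDIV, FPE_FLTOVF,
		FPE_FLTUND, FPE_FLTRES };

/* cwd bits that are SET mean "exception masked", so ~cwd keeps only
 * the unmasked ones; the low 6 swd bits are the raised exceptions. */
static enum fpe_code decode_fpu_error(unsigned short cwd, unsigned short swd)
{
	unsigned short err = swd & ~cwd & 0x3f;

	if (err & 0x001)		/* invalid op */
		return FPE_FLTINV;
	if (err & 0x004)		/* divide by zero */
		return FPE_FLTDIV;
	if (err & 0x008)		/* overflow */
		return FPE_FLTOVF;
	if (err & 0x012)		/* denormal or underflow */
		return FPE_FLTUND;
	if (err & 0x020)		/* precision */
		return FPE_FLTRES;
	return FPE_NONE;
}
```

The ordering matters: when several bits are set at once (the old "multiple exceptions" case), invalid-op wins, matching the if/else chain the patch introduces.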
+30 -14
arch/x86/kernel/tsc.c
··· 15 15 #include <asm/vgtod.h> 16 16 #include <asm/time.h> 17 17 #include <asm/delay.h> 18 + #include <asm/hypervisor.h> 18 19 19 20 unsigned int cpu_khz; /* TSC clocks / usec, not used here */ 20 21 EXPORT_SYMBOL(cpu_khz); ··· 32 31 erroneous rdtsc usage on !cpu_has_tsc processors */ 33 32 static int tsc_disabled = -1; 34 33 34 + static int tsc_clocksource_reliable; 35 35 /* 36 36 * Scheduler clock - returns current time in nanosec units. 37 37 */ ··· 99 97 #endif 100 98 101 99 __setup("notsc", notsc_setup); 100 + 101 + static int __init tsc_setup(char *str) 102 + { 103 + if (!strcmp(str, "reliable")) 104 + tsc_clocksource_reliable = 1; 105 + return 1; 106 + } 107 + 108 + __setup("tsc=", tsc_setup); 102 109 103 110 #define MAX_RETRIES 5 104 111 #define SMI_TRESHOLD 50000 ··· 363 352 { 364 353 u64 tsc1, tsc2, delta, ref1, ref2; 365 354 unsigned long tsc_pit_min = ULONG_MAX, tsc_ref_min = ULONG_MAX; 366 - unsigned long flags, latch, ms, fast_calibrate; 355 + unsigned long flags, latch, ms, fast_calibrate, tsc_khz; 367 356 int hpet = is_hpet_enabled(), i, loopmin; 357 + 358 + tsc_khz = get_hypervisor_tsc_freq(); 359 + if (tsc_khz) { 360 + printk(KERN_INFO "TSC: Frequency read from the hypervisor\n"); 361 + return tsc_khz; 362 + } 368 363 369 364 local_irq_save(flags); 370 365 fast_calibrate = quick_pit_calibrate(); ··· 748 731 {} 749 732 }; 750 733 751 - /* 752 - * Geode_LX - the OLPC CPU has a possibly a very reliable TSC 753 - */ 754 - #ifdef CONFIG_MGEODE_LX 755 - /* RTSC counts during suspend */ 756 - #define RTSC_SUSP 0x100 757 - 758 - static void __init check_geode_tsc_reliable(void) 734 + static void __init check_system_tsc_reliable(void) 759 735 { 736 + #ifdef CONFIG_MGEODE_LX 737 + /* RTSC counts during suspend */ 738 + #define RTSC_SUSP 0x100 760 739 unsigned long res_low, res_high; 761 740 762 741 rdmsr_safe(MSR_GEODE_BUSCONT_CONF0, &res_low, &res_high); 742 + /* Geode_LX - the OLPC CPU has a possibly a very reliable TSC */ 763 743 if (res_low & 
RTSC_SUSP) 764 - clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY; 765 - } 766 - #else 767 - static inline void check_geode_tsc_reliable(void) { } 744 + tsc_clocksource_reliable = 1; 768 745 #endif 746 + if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE)) 747 + tsc_clocksource_reliable = 1; 748 + } 769 749 770 750 /* 771 751 * Make an educated guess if the TSC is trustworthy and synchronized ··· 797 783 { 798 784 clocksource_tsc.mult = clocksource_khz2mult(tsc_khz, 799 785 clocksource_tsc.shift); 786 + if (tsc_clocksource_reliable) 787 + clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY; 800 788 /* lower the rating if we already know its unstable: */ 801 789 if (check_tsc_unstable()) { 802 790 clocksource_tsc.rating = 0; ··· 859 843 if (unsynchronized_tsc()) 860 844 mark_tsc_unstable("TSCs unsynchronized"); 861 845 862 - check_geode_tsc_reliable(); 846 + check_system_tsc_reliable(); 863 847 init_tsc_clocksource(); 864 848 } 865 849
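Whether tsc_khz comes from calibration or, as newly added above, from the hypervisor, init_tsc_clocksource() still converts it to a clocksource multiplier. A hedged reimplementation of that khz-to-mult conversion, assuming the usual fixed-point scheme where ns = cycles * mult >> shift (this is a sketch, not the kernel's clocksource_khz2mult helper):

```c
#include <assert.h>
#include <stdint.h>

/* Convert a counter frequency in kHz to a multiplier such that
 * nanoseconds = cycles * mult >> shift, rounded to nearest. */
static uint32_t khz2mult(uint32_t khz, uint32_t shift)
{
	uint64_t tmp = ((uint64_t)1000000) << shift;

	tmp += khz / 2;			/* round to nearest */
	return (uint32_t)(tmp / khz);
}
```

For a 1 GHz TSC (khz = 1000000) with shift = 22, the multiplier comes out to exactly 1 << 22, i.e. one nanosecond per cycle after the shift.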
+7 -1
arch/x86/kernel/tsc_sync.c
··· 112 112 if (unsynchronized_tsc()) 113 113 return; 114 114 115 + if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE)) { 116 + printk(KERN_INFO 117 + "Skipping synchronization checks as TSC is reliable.\n"); 118 + return; 119 + } 120 + 115 121 printk(KERN_INFO "checking TSC synchronization [CPU#%d -> CPU#%d]:", 116 122 smp_processor_id(), cpu); 117 123 ··· 171 165 { 172 166 int cpus = 2; 173 167 174 - if (unsynchronized_tsc()) 168 + if (unsynchronized_tsc() || boot_cpu_has(X86_FEATURE_TSC_RELIABLE)) 175 169 return; 176 170 177 171 /*
-119
arch/x86/kernel/vmi_32.c
··· 266 266 { 267 267 } 268 268 269 - #ifdef CONFIG_DEBUG_PAGE_TYPE 270 - 271 - #ifdef CONFIG_X86_PAE 272 - #define MAX_BOOT_PTS (2048+4+1) 273 - #else 274 - #define MAX_BOOT_PTS (1024+1) 275 - #endif 276 - 277 - /* 278 - * During boot, mem_map is not yet available in paging_init, so stash 279 - * all the boot page allocations here. 280 - */ 281 - static struct { 282 - u32 pfn; 283 - int type; 284 - } boot_page_allocations[MAX_BOOT_PTS]; 285 - static int num_boot_page_allocations; 286 - static int boot_allocations_applied; 287 - 288 - void vmi_apply_boot_page_allocations(void) 289 - { 290 - int i; 291 - BUG_ON(!mem_map); 292 - for (i = 0; i < num_boot_page_allocations; i++) { 293 - struct page *page = pfn_to_page(boot_page_allocations[i].pfn); 294 - page->type = boot_page_allocations[i].type; 295 - page->type = boot_page_allocations[i].type & 296 - ~(VMI_PAGE_ZEROED | VMI_PAGE_CLONE); 297 - } 298 - boot_allocations_applied = 1; 299 - } 300 - 301 - static void record_page_type(u32 pfn, int type) 302 - { 303 - BUG_ON(num_boot_page_allocations >= MAX_BOOT_PTS); 304 - boot_page_allocations[num_boot_page_allocations].pfn = pfn; 305 - boot_page_allocations[num_boot_page_allocations].type = type; 306 - num_boot_page_allocations++; 307 - } 308 - 309 - static void check_zeroed_page(u32 pfn, int type, struct page *page) 310 - { 311 - u32 *ptr; 312 - int i; 313 - int limit = PAGE_SIZE / sizeof(int); 314 - 315 - if (page_address(page)) 316 - ptr = (u32 *)page_address(page); 317 - else 318 - ptr = (u32 *)__va(pfn << PAGE_SHIFT); 319 - /* 320 - * When cloning the root in non-PAE mode, only the userspace 321 - * pdes need to be zeroed. 322 - */ 323 - if (type & VMI_PAGE_CLONE) 324 - limit = KERNEL_PGD_BOUNDARY; 325 - for (i = 0; i < limit; i++) 326 - BUG_ON(ptr[i]); 327 - } 328 - 329 - /* 330 - * We stash the page type into struct page so we can verify the page 331 - * types are used properly. 
332 - */ 333 - static void vmi_set_page_type(u32 pfn, int type) 334 - { 335 - /* PAE can have multiple roots per page - don't track */ 336 - if (PTRS_PER_PMD > 1 && (type & VMI_PAGE_PDP)) 337 - return; 338 - 339 - if (boot_allocations_applied) { 340 - struct page *page = pfn_to_page(pfn); 341 - if (type != VMI_PAGE_NORMAL) 342 - BUG_ON(page->type); 343 - else 344 - BUG_ON(page->type == VMI_PAGE_NORMAL); 345 - page->type = type & ~(VMI_PAGE_ZEROED | VMI_PAGE_CLONE); 346 - if (type & VMI_PAGE_ZEROED) 347 - check_zeroed_page(pfn, type, page); 348 - } else { 349 - record_page_type(pfn, type); 350 - } 351 - } 352 - 353 - static void vmi_check_page_type(u32 pfn, int type) 354 - { 355 - /* PAE can have multiple roots per page - skip checks */ 356 - if (PTRS_PER_PMD > 1 && (type & VMI_PAGE_PDP)) 357 - return; 358 - 359 - type &= ~(VMI_PAGE_ZEROED | VMI_PAGE_CLONE); 360 - if (boot_allocations_applied) { 361 - struct page *page = pfn_to_page(pfn); 362 - BUG_ON((page->type ^ type) & VMI_PAGE_PAE); 363 - BUG_ON(type == VMI_PAGE_NORMAL && page->type); 364 - BUG_ON((type & page->type) == 0); 365 - } 366 - } 367 - #else 368 - #define vmi_set_page_type(p,t) do { } while (0) 369 - #define vmi_check_page_type(p,t) do { } while (0) 370 - #endif 371 - 372 269 #ifdef CONFIG_HIGHPTE 373 270 static void *vmi_kmap_atomic_pte(struct page *page, enum km_type type) 374 271 { ··· 292 395 293 396 static void vmi_allocate_pte(struct mm_struct *mm, unsigned long pfn) 294 397 { 295 - vmi_set_page_type(pfn, VMI_PAGE_L1); 296 398 vmi_ops.allocate_page(pfn, VMI_PAGE_L1, 0, 0, 0); 297 399 } 298 400 ··· 302 406 * It is called only for swapper_pg_dir, which already has 303 407 * data on it. 
304 408 */ 305 - vmi_set_page_type(pfn, VMI_PAGE_L2); 306 409 vmi_ops.allocate_page(pfn, VMI_PAGE_L2, 0, 0, 0); 307 410 } 308 411 309 412 static void vmi_allocate_pmd_clone(unsigned long pfn, unsigned long clonepfn, unsigned long start, unsigned long count) 310 413 { 311 - vmi_set_page_type(pfn, VMI_PAGE_L2 | VMI_PAGE_CLONE); 312 - vmi_check_page_type(clonepfn, VMI_PAGE_L2); 313 414 vmi_ops.allocate_page(pfn, VMI_PAGE_L2 | VMI_PAGE_CLONE, clonepfn, start, count); 314 415 } 315 416 316 417 static void vmi_release_pte(unsigned long pfn) 317 418 { 318 419 vmi_ops.release_page(pfn, VMI_PAGE_L1); 319 - vmi_set_page_type(pfn, VMI_PAGE_NORMAL); 320 420 } 321 421 322 422 static void vmi_release_pmd(unsigned long pfn) 323 423 { 324 424 vmi_ops.release_page(pfn, VMI_PAGE_L2); 325 - vmi_set_page_type(pfn, VMI_PAGE_NORMAL); 326 425 } 327 426 328 427 /* ··· 341 450 342 451 static void vmi_update_pte(struct mm_struct *mm, unsigned long addr, pte_t *ptep) 343 452 { 344 - vmi_check_page_type(__pa(ptep) >> PAGE_SHIFT, VMI_PAGE_PTE); 345 453 vmi_ops.update_pte(ptep, vmi_flags_addr(mm, addr, VMI_PAGE_PT, 0)); 346 454 } 347 455 348 456 static void vmi_update_pte_defer(struct mm_struct *mm, unsigned long addr, pte_t *ptep) 349 457 { 350 - vmi_check_page_type(__pa(ptep) >> PAGE_SHIFT, VMI_PAGE_PTE); 351 458 vmi_ops.update_pte(ptep, vmi_flags_addr_defer(mm, addr, VMI_PAGE_PT, 0)); 352 459 } 353 460 354 461 static void vmi_set_pte(pte_t *ptep, pte_t pte) 355 462 { 356 463 /* XXX because of set_pmd_pte, this can be called on PT or PD layers */ 357 - vmi_check_page_type(__pa(ptep) >> PAGE_SHIFT, VMI_PAGE_PTE | VMI_PAGE_PD); 358 464 vmi_ops.set_pte(pte, ptep, VMI_PAGE_PT); 359 465 } 360 466 361 467 static void vmi_set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) 362 468 { 363 - vmi_check_page_type(__pa(ptep) >> PAGE_SHIFT, VMI_PAGE_PTE); 364 469 vmi_ops.set_pte(pte, ptep, vmi_flags_addr(mm, addr, VMI_PAGE_PT, 0)); 365 470 } 366 471 ··· 364 477 { 365 478 #ifdef 
CONFIG_X86_PAE 366 479 const pte_t pte = { .pte = pmdval.pmd }; 367 - vmi_check_page_type(__pa(pmdp) >> PAGE_SHIFT, VMI_PAGE_PMD); 368 480 #else 369 481 const pte_t pte = { pmdval.pud.pgd.pgd }; 370 - vmi_check_page_type(__pa(pmdp) >> PAGE_SHIFT, VMI_PAGE_PGD); 371 482 #endif 372 483 vmi_ops.set_pte(pte, (pte_t *)pmdp, VMI_PAGE_PD); 373 484 } ··· 387 502 388 503 static void vmi_set_pte_present(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) 389 504 { 390 - vmi_check_page_type(__pa(ptep) >> PAGE_SHIFT, VMI_PAGE_PTE); 391 505 vmi_ops.set_pte(pte, ptep, vmi_flags_addr_defer(mm, addr, VMI_PAGE_PT, 1)); 392 506 } 393 507 ··· 394 510 { 395 511 /* Um, eww */ 396 512 const pte_t pte = { .pte = pudval.pgd.pgd }; 397 - vmi_check_page_type(__pa(pudp) >> PAGE_SHIFT, VMI_PAGE_PGD); 398 513 vmi_ops.set_pte(pte, (pte_t *)pudp, VMI_PAGE_PDP); 399 514 } 400 515 401 516 static void vmi_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) 402 517 { 403 518 const pte_t pte = { .pte = 0 }; 404 - vmi_check_page_type(__pa(ptep) >> PAGE_SHIFT, VMI_PAGE_PTE); 405 519 vmi_ops.set_pte(pte, ptep, vmi_flags_addr(mm, addr, VMI_PAGE_PT, 0)); 406 520 } 407 521 408 522 static void vmi_pmd_clear(pmd_t *pmd) 409 523 { 410 524 const pte_t pte = { .pte = 0 }; 411 - vmi_check_page_type(__pa(pmd) >> PAGE_SHIFT, VMI_PAGE_PMD); 412 525 vmi_ops.set_pte(pte, (pte_t *)pmd, VMI_PAGE_PD); 413 526 } 414 527 #endif
+9
arch/x86/kernel/vsyscall_64.c
··· 128 128 gettimeofday(tv,NULL); 129 129 return; 130 130 } 131 + 132 + /* 133 + * Surround the RDTSC by barriers, to make sure it's not 134 + * speculated to outside the seqlock critical section and 135 + * does not cause time warps: 136 + */ 137 + rdtsc_barrier(); 131 138 now = vread(); 139 + rdtsc_barrier(); 140 + 132 141 base = __vsyscall_gtod_data.clock.cycle_last; 133 142 mask = __vsyscall_gtod_data.clock.mask; 134 143 mult = __vsyscall_gtod_data.clock.mult;
+1
arch/x86/mach-generic/bigsmp.c
··· 17 17 #include <asm/bigsmp/apic.h> 18 18 #include <asm/bigsmp/ipi.h> 19 19 #include <asm/mach-default/mach_mpparse.h> 20 + #include <asm/mach-default/mach_wakecpu.h> 20 21 21 22 static int dmi_bigsmp; /* can be set by dmi scanners */ 22 23
+1
arch/x86/mach-generic/default.c
··· 16 16 #include <asm/mach-default/mach_apic.h> 17 17 #include <asm/mach-default/mach_ipi.h> 18 18 #include <asm/mach-default/mach_mpparse.h> 19 + #include <asm/mach-default/mach_wakecpu.h> 19 20 20 21 /* should be called last. */ 21 22 static int probe_default(void)
+13 -1
arch/x86/mach-generic/es7000.c
··· 16 16 #include <asm/es7000/apic.h> 17 17 #include <asm/es7000/ipi.h> 18 18 #include <asm/es7000/mpparse.h> 19 - #include <asm/es7000/wakecpu.h> 19 + #include <asm/mach-default/mach_wakecpu.h> 20 + 21 + void __init es7000_update_genapic_to_cluster(void) 22 + { 23 + genapic->target_cpus = target_cpus_cluster; 24 + genapic->int_delivery_mode = INT_DELIVERY_MODE_CLUSTER; 25 + genapic->int_dest_mode = INT_DEST_MODE_CLUSTER; 26 + genapic->no_balance_irq = NO_BALANCE_IRQ_CLUSTER; 27 + 28 + genapic->init_apic_ldr = init_apic_ldr_cluster; 29 + 30 + genapic->cpu_mask_to_apicid = cpu_mask_to_apicid_cluster; 31 + } 20 32 21 33 static int probe_es7000(void) 22 34 {
+15 -1
arch/x86/mach-generic/probe.c
··· 15 15 #include <asm/mpspec.h> 16 16 #include <asm/apicdef.h> 17 17 #include <asm/genapic.h> 18 + #include <asm/setup.h> 18 19 19 20 extern struct genapic apic_numaq; 20 21 extern struct genapic apic_summit; ··· 58 57 } 59 58 } 60 59 60 + if (x86_quirks->update_genapic) 61 + x86_quirks->update_genapic(); 62 + 61 63 /* Parsed again by __setup for debug/verbose */ 62 64 return 0; 63 65 } ··· 76 72 * - we find more than 8 CPUs in acpi LAPIC listing with xAPIC support 77 73 */ 78 74 79 - if (!cmdline_apic && genapic == &apic_default) 75 + if (!cmdline_apic && genapic == &apic_default) { 80 76 if (apic_bigsmp.probe()) { 81 77 genapic = &apic_bigsmp; 78 + if (x86_quirks->update_genapic) 79 + x86_quirks->update_genapic(); 82 80 printk(KERN_INFO "Overriding APIC driver with %s\n", 83 81 genapic->name); 84 82 } 83 + } 85 84 #endif 86 85 } 87 86 ··· 101 94 /* Not visible without early console */ 102 95 if (!apic_probe[i]) 103 96 panic("Didn't find an APIC driver"); 97 + 98 + if (x86_quirks->update_genapic) 99 + x86_quirks->update_genapic(); 104 100 } 105 101 printk(KERN_INFO "Using APIC driver %s\n", genapic->name); 106 102 } ··· 118 108 if (apic_probe[i]->mps_oem_check(mpc, oem, productid)) { 119 109 if (!cmdline_apic) { 120 110 genapic = apic_probe[i]; 111 + if (x86_quirks->update_genapic) 112 + x86_quirks->update_genapic(); 121 113 printk(KERN_INFO "Switched to APIC driver `%s'.\n", 122 114 genapic->name); 123 115 } ··· 136 124 if (apic_probe[i]->acpi_madt_oem_check(oem_id, oem_table_id)) { 137 125 if (!cmdline_apic) { 138 126 genapic = apic_probe[i]; 127 + if (x86_quirks->update_genapic) 128 + x86_quirks->update_genapic(); 139 129 printk(KERN_INFO "Switched to APIC driver `%s'.\n", 140 130 genapic->name); 141 131 }
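Every probe path above gains the same guarded call, `if (x86_quirks->update_genapic) x86_quirks->update_genapic();` — the optional-callback quirk pattern: platforms install a hook only when they need to patch the chosen APIC driver (as es7000_update_genapic_to_cluster() does). A minimal model of that pattern; the struct and names are illustrative, not the kernel's x86_quirks layout:

```c
#include <assert.h>
#include <stddef.h>

/* Toy quirks table: an optional callback, NULL when unused. */
struct quirks {
	int (*update_genapic)(void);
};

static int calls;

static int count_call(void)
{
	return ++calls;
}

/* Models the probe paths: invoke the hook only if one is installed. */
static void probe(struct quirks *q)
{
	if (q->update_genapic)
		q->update_genapic();
}

static int demo(void)
{
	struct quirks q = { NULL };

	probe(&q);			/* no hook installed: nothing runs */
	q.update_genapic = count_call;
	probe(&q);			/* hook runs exactly once */
	return calls;
}
```

The NULL check at every call site is what lets most platforms leave the field unset while ES7000-class machines override the genapic after the generic probe has run.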
+1
arch/x86/mach-generic/summit.c
··· 16 16 #include <asm/summit/apic.h> 17 17 #include <asm/summit/ipi.h> 18 18 #include <asm/summit/mpparse.h> 19 + #include <asm/mach-default/mach_wakecpu.h> 19 20 20 21 static int probe_summit(void) 21 22 {
+7 -4
arch/x86/mm/fault.c
··· 413 413 unsigned long error_code) 414 414 { 415 415 unsigned long flags = oops_begin(); 416 + int sig = SIGKILL; 416 417 struct task_struct *tsk; 417 418 418 419 printk(KERN_ALERT "%s: Corrupted page table at address %lx\n", ··· 424 423 tsk->thread.trap_no = 14; 425 424 tsk->thread.error_code = error_code; 426 425 if (__die("Bad pagetable", regs, error_code)) 427 - regs = NULL; 428 - oops_end(flags, regs, SIGKILL); 426 + sig = 0; 427 + oops_end(flags, regs, sig); 429 428 } 430 429 #endif 431 430 ··· 591 590 int fault; 592 591 #ifdef CONFIG_X86_64 593 592 unsigned long flags; 593 + int sig; 594 594 #endif 595 595 596 596 tsk = current; ··· 851 849 bust_spinlocks(0); 852 850 do_exit(SIGKILL); 853 851 #else 852 + sig = SIGKILL; 854 853 if (__die("Oops", regs, error_code)) 855 - regs = NULL; 854 + sig = 0; 856 855 /* Executive summary in case the body of the oops scrolled away */ 857 856 printk(KERN_EMERG "CR2: %016lx\n", address); 858 - oops_end(flags, regs, SIGKILL); 857 + oops_end(flags, regs, sig); 859 858 #endif 860 859 861 860 /*
+21 -11
arch/x86/mm/init_32.c
··· 67 67 68 68 static int __initdata after_init_bootmem; 69 69 70 - static __init void *alloc_low_page(unsigned long *phys) 70 + static __init void *alloc_low_page(void) 71 71 { 72 72 unsigned long pfn = table_end++; 73 73 void *adr; ··· 77 77 78 78 adr = __va(pfn * PAGE_SIZE); 79 79 memset(adr, 0, PAGE_SIZE); 80 - *phys = pfn * PAGE_SIZE; 81 80 return adr; 82 81 } 83 82 ··· 91 92 pmd_t *pmd_table; 92 93 93 94 #ifdef CONFIG_X86_PAE 94 - unsigned long phys; 95 95 if (!(pgd_val(*pgd) & _PAGE_PRESENT)) { 96 96 if (after_init_bootmem) 97 97 pmd_table = (pmd_t *)alloc_bootmem_low_pages(PAGE_SIZE); 98 98 else 99 - pmd_table = (pmd_t *)alloc_low_page(&phys); 99 + pmd_table = (pmd_t *)alloc_low_page(); 100 100 paravirt_alloc_pmd(&init_mm, __pa(pmd_table) >> PAGE_SHIFT); 101 101 set_pgd(pgd, __pgd(__pa(pmd_table) | _PAGE_PRESENT)); 102 102 pud = pud_offset(pgd, 0); 103 103 BUG_ON(pmd_table != pmd_offset(pud, 0)); 104 + 105 + return pmd_table; 104 106 } 105 107 #endif 106 108 pud = pud_offset(pgd, 0); ··· 126 126 if (!page_table) 127 127 page_table = 128 128 (pte_t *)alloc_bootmem_low_pages(PAGE_SIZE); 129 - } else { 130 - unsigned long phys; 131 - page_table = (pte_t *)alloc_low_page(&phys); 132 - } 129 + } else 130 + page_table = (pte_t *)alloc_low_page(); 133 131 134 132 paravirt_alloc_pte(&init_mm, __pa(page_table) >> PAGE_SHIFT); 135 133 set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE)); ··· 967 969 int codesize, reservedpages, datasize, initsize; 968 970 int tmp; 969 971 970 - start_periodic_check_for_corruption(); 971 - 972 972 #ifdef CONFIG_FLATMEM 973 973 BUG_ON(!mem_map); 974 974 #endif ··· 1036 1040 (unsigned long)&_text, (unsigned long)&_etext, 1037 1041 ((unsigned long)&_etext - (unsigned long)&_text) >> 10); 1038 1042 1043 + /* 1044 + * Check boundaries twice: Some fundamental inconsistencies can 1045 + * be detected at build time already. 
1046 + */ 1047 + #define __FIXADDR_TOP (-PAGE_SIZE) 1048 + #ifdef CONFIG_HIGHMEM 1049 + BUILD_BUG_ON(PKMAP_BASE + LAST_PKMAP*PAGE_SIZE > FIXADDR_START); 1050 + BUILD_BUG_ON(VMALLOC_END > PKMAP_BASE); 1051 + #endif 1052 + #define high_memory (-128UL << 20) 1053 + BUILD_BUG_ON(VMALLOC_START >= VMALLOC_END); 1054 + #undef high_memory 1055 + #undef __FIXADDR_TOP 1056 + 1039 1057 #ifdef CONFIG_HIGHMEM 1040 1058 BUG_ON(PKMAP_BASE + LAST_PKMAP*PAGE_SIZE > FIXADDR_START); 1041 1059 BUG_ON(VMALLOC_END > PKMAP_BASE); 1042 1060 #endif 1043 - BUG_ON(VMALLOC_START > VMALLOC_END); 1061 + BUG_ON(VMALLOC_START >= VMALLOC_END); 1044 1062 BUG_ON((unsigned long)high_memory > VMALLOC_START); 1045 1063 1046 1064 if (boot_cpu_data.wp_works_ok < 0)
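The added boundary checks use BUILD_BUG_ON(), which in kernels of this era expanded to the negative-array-size trick: a false condition yields `char[1]` and compiles, a true one yields `char[-1]` and the build fails. A self-contained sketch of that mechanism:

```c
#include <assert.h>

/* Compile-time assertion: rejects the build when cond is true. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

static int layout_ok(void)
{
	/* Condition is false on any sane ABI, so this compiles. */
	BUILD_BUG_ON(sizeof(int) > sizeof(long));
	return 1;
}
```

This is why the patch can check the VMALLOC/PKMAP/FIXADDR layout "twice": the BUILD_BUG_ON copies catch inconsistencies at build time with pinned-down macro values, while the runtime BUG_ON lines keep checking the values the booted kernel actually computed.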
-2
arch/x86/mm/init_64.c
··· 902 902 long codesize, reservedpages, datasize, initsize; 903 903 unsigned long absent_pages; 904 904 905 - start_periodic_check_for_corruption(); 906 - 907 905 pci_iommu_alloc(); 908 906 909 907 /* clear_bss() already clear the empty_zero_page */
+2 -1
arch/x86/mm/ioremap.c
··· 223 223 * Check if the request spans more than any BAR in the iomem resource 224 224 * tree. 225 225 */ 226 - WARN_ON(iomem_map_sanity_check(phys_addr, size)); 226 + WARN_ONCE(iomem_map_sanity_check(phys_addr, size), 227 + KERN_INFO "Info: mapping multiple BARs. Your kernel is fine."); 227 228 228 229 /* 229 230 * Don't allow anybody to remap normal RAM that we're using..
+236
arch/x86/mm/pat.c
··· 596 596 free_memtype(addr, addr + size); 597 597 } 598 598 599 + /* 600 + * Internal interface to reserve a range of physical memory with prot. 601 + * Reserves non-RAM regions only, and after successful reserve_memtype, 602 + * this func also keeps identity mapping (if any) in sync with this new prot. 603 + */ 604 + static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t vma_prot) 605 + { 606 + int is_ram = 0; 607 + int id_sz, ret; 608 + unsigned long flags; 609 + unsigned long want_flags = (pgprot_val(vma_prot) & _PAGE_CACHE_MASK); 610 + 611 + is_ram = pagerange_is_ram(paddr, paddr + size); 612 + 613 + if (is_ram != 0) { 614 + /* 615 + * For mapping RAM pages, drivers need to call 616 + * set_memory_[uc|wc|wb] directly, for reserve and free, before 617 + * setting up the PTE. 618 + */ 619 + WARN_ON_ONCE(1); 620 + return 0; 621 + } 622 + 623 + ret = reserve_memtype(paddr, paddr + size, want_flags, &flags); 624 + if (ret) 625 + return ret; 626 + 627 + if (flags != want_flags) { 628 + free_memtype(paddr, paddr + size); 629 + printk(KERN_ERR 630 + "%s:%d map pfn expected mapping type %s for %Lx-%Lx, got %s\n", 631 + current->comm, current->pid, 632 + cattr_name(want_flags), 633 + (unsigned long long)paddr, 634 + (unsigned long long)(paddr + size), 635 + cattr_name(flags)); 636 + return -EINVAL; 637 + } 638 + 639 + /* Need to keep identity mapping in sync */ 640 + if (paddr >= __pa(high_memory)) 641 + return 0; 642 + 643 + id_sz = (__pa(high_memory) < paddr + size) ? 
644 + __pa(high_memory) - paddr : 645 + size; 646 + 647 + if (ioremap_change_attr((unsigned long)__va(paddr), id_sz, flags) < 0) { 648 + free_memtype(paddr, paddr + size); 649 + printk(KERN_ERR 650 + "%s:%d reserve_pfn_range ioremap_change_attr failed %s " 651 + "for %Lx-%Lx\n", 652 + current->comm, current->pid, 653 + cattr_name(flags), 654 + (unsigned long long)paddr, 655 + (unsigned long long)(paddr + size)); 656 + return -EINVAL; 657 + } 658 + return 0; 659 + } 660 + 661 + /* 662 + * Internal interface to free a range of physical memory. 663 + * Frees non-RAM regions only. 664 + */ 665 + static void free_pfn_range(u64 paddr, unsigned long size) 666 + { 667 + int is_ram; 668 + 669 + is_ram = pagerange_is_ram(paddr, paddr + size); 670 + if (is_ram == 0) 671 + free_memtype(paddr, paddr + size); 672 + } 673 + 674 + /* 675 + * track_pfn_vma_copy is called when vma that is covering the pfnmap gets 676 + * copied through copy_page_range(). 677 + * 678 + * If the vma has a linear pfn mapping for the entire range, we get the prot 679 + * from pte and reserve the entire vma range with a single reserve_pfn_range call. 680 + * Otherwise, we reserve the entire vma range, by going through the PTEs page 681 + * by page to get physical address and protection. 682 + */ 683 + int track_pfn_vma_copy(struct vm_area_struct *vma) 684 + { 685 + int retval = 0; 686 + unsigned long i, j; 687 + u64 paddr; 688 + unsigned long prot; 689 + unsigned long vma_start = vma->vm_start; 690 + unsigned long vma_end = vma->vm_end; 691 + unsigned long vma_size = vma_end - vma_start; 692 + 693 + if (!pat_enabled) 694 + return 0; 695 + 696 + if (is_linear_pfn_mapping(vma)) { 697 + /* 698 + * reserve the whole chunk covered by vma. We need the 699 + * starting address and protection from pte. 
700 + */ 701 + if (follow_phys(vma, vma_start, 0, &prot, &paddr)) { 702 + WARN_ON_ONCE(1); 703 + return -EINVAL; 704 + } 705 + return reserve_pfn_range(paddr, vma_size, __pgprot(prot)); 706 + } 707 + 708 + /* reserve entire vma page by page, using pfn and prot from pte */ 709 + for (i = 0; i < vma_size; i += PAGE_SIZE) { 710 + if (follow_phys(vma, vma_start + i, 0, &prot, &paddr)) 711 + continue; 712 + 713 + retval = reserve_pfn_range(paddr, PAGE_SIZE, __pgprot(prot)); 714 + if (retval) 715 + goto cleanup_ret; 716 + } 717 + return 0; 718 + 719 + cleanup_ret: 720 + /* Reserve error: Cleanup partial reservation and return error */ 721 + for (j = 0; j < i; j += PAGE_SIZE) { 722 + if (follow_phys(vma, vma_start + j, 0, &prot, &paddr)) 723 + continue; 724 + 725 + free_pfn_range(paddr, PAGE_SIZE); 726 + } 727 + 728 + return retval; 729 + } 730 + 731 + /* 732 + * track_pfn_vma_new is called when a _new_ pfn mapping is being established 733 + * for physical range indicated by pfn and size. 734 + * 735 + * prot is passed in as a parameter for the new mapping. If the vma has a 736 + * linear pfn mapping for the entire range reserve the entire vma range with 737 + * single reserve_pfn_range call. 738 + * Otherwise, we look at the pfn and size and reserve only the specified range 739 + * page by page. 740 + * 741 + * Note that this function can be called with caller trying to map only a 742 + * subrange/page inside the vma. 
743 + */ 744 + int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t prot, 745 + unsigned long pfn, unsigned long size) 746 + { 747 + int retval = 0; 748 + unsigned long i, j; 749 + u64 base_paddr; 750 + u64 paddr; 751 + unsigned long vma_start = vma->vm_start; 752 + unsigned long vma_end = vma->vm_end; 753 + unsigned long vma_size = vma_end - vma_start; 754 + 755 + if (!pat_enabled) 756 + return 0; 757 + 758 + if (is_linear_pfn_mapping(vma)) { 759 + /* reserve the whole chunk starting from vm_pgoff */ 760 + paddr = (u64)vma->vm_pgoff << PAGE_SHIFT; 761 + return reserve_pfn_range(paddr, vma_size, prot); 762 + } 763 + 764 + /* reserve page by page using pfn and size */ 765 + base_paddr = (u64)pfn << PAGE_SHIFT; 766 + for (i = 0; i < size; i += PAGE_SIZE) { 767 + paddr = base_paddr + i; 768 + retval = reserve_pfn_range(paddr, PAGE_SIZE, prot); 769 + if (retval) 770 + goto cleanup_ret; 771 + } 772 + return 0; 773 + 774 + cleanup_ret: 775 + /* Reserve error: Cleanup partial reservation and return error */ 776 + for (j = 0; j < i; j += PAGE_SIZE) { 777 + paddr = base_paddr + j; 778 + free_pfn_range(paddr, PAGE_SIZE); 779 + } 780 + 781 + return retval; 782 + } 783 + 784 + /* 785 + * untrack_pfn_vma is called while unmapping a pfnmap for a region. 786 + * untrack can be called for a specific region indicated by pfn and size or 787 + * can be for the entire vma (in which case size can be zero). 
788 + */ 789 + void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn, 790 + unsigned long size) 791 + { 792 + unsigned long i; 793 + u64 paddr; 794 + unsigned long prot; 795 + unsigned long vma_start = vma->vm_start; 796 + unsigned long vma_end = vma->vm_end; 797 + unsigned long vma_size = vma_end - vma_start; 798 + 799 + if (!pat_enabled) 800 + return; 801 + 802 + if (is_linear_pfn_mapping(vma)) { 803 + /* free the whole chunk starting from vm_pgoff */ 804 + paddr = (u64)vma->vm_pgoff << PAGE_SHIFT; 805 + free_pfn_range(paddr, vma_size); 806 + return; 807 + } 808 + 809 + if (size != 0 && size != vma_size) { 810 + /* free page by page, using pfn and size */ 811 + paddr = (u64)pfn << PAGE_SHIFT; 812 + for (i = 0; i < size; i += PAGE_SIZE) { 813 + paddr = paddr + i; 814 + free_pfn_range(paddr, PAGE_SIZE); 815 + } 816 + } else { 817 + /* free entire vma, page by page, using the pfn from pte */ 818 + for (i = 0; i < vma_size; i += PAGE_SIZE) { 819 + if (follow_phys(vma, vma_start + i, 0, &prot, &paddr)) 820 + continue; 821 + 822 + free_pfn_range(paddr, PAGE_SIZE); 823 + } 824 + } 825 + } 826 + 827 + pgprot_t pgprot_writecombine(pgprot_t prot) 828 + { 829 + if (pat_enabled) 830 + return __pgprot(pgprot_val(prot) | _PAGE_CACHE_WC); 831 + else 832 + return pgprot_noncached(prot); 833 + } 834 + 599 835 #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_X86_PAT) 600 836 601 837 /* get Nth element of the linked list */
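The page-by-page path in track_pfn_vma_new() above reserves one page at a time and, on failure, unwinds only the pages already reserved before returning the error. A minimal userspace sketch of that reserve-then-rollback pattern (all names here — `reserve_range`, `reserve_page` — are toy stand-ins, not kernel API):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL
#define NPAGES 8

/* toy reservation table standing in for the kernel's memtype tracking */
static int reserved[NPAGES];

static int reserve_page(uint64_t paddr, uint64_t fail_at)
{
	if (paddr == fail_at)
		return -1;		/* simulate a reserve_memtype() failure */
	reserved[paddr / PAGE_SIZE] = 1;
	return 0;
}

static void free_page_range(uint64_t paddr)
{
	reserved[paddr / PAGE_SIZE] = 0;
}

/* mirror of the track_pfn_vma_new() loop: reserve page by page,
 * and on error free only the pages in [0, i) before returning */
static int reserve_range(uint64_t base, unsigned long size, uint64_t fail_at)
{
	unsigned long i, j;

	for (i = 0; i < size; i += PAGE_SIZE) {
		if (reserve_page(base + i, fail_at) != 0)
			goto cleanup_ret;
	}
	return 0;

cleanup_ret:
	for (j = 0; j < i; j += PAGE_SIZE)
		free_page_range(base + j);
	return -1;
}
```

The same shape appears twice in the hunk (track_pfn_vma_copy and track_pfn_vma_new); only the source of the physical address differs.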
+17
arch/x86/pci/common.c
··· 23 23 unsigned int pci_early_dump_regs; 24 24 static int pci_bf_sort; 25 25 int pci_routeirq; 26 + int noioapicquirk; 27 + #ifdef CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS 28 + int noioapicreroute = 0; 29 + #else 30 + int noioapicreroute = 1; 31 + #endif 26 32 int pcibios_last_bus = -1; 27 33 unsigned long pirq_table_addr; 28 34 struct pci_bus *pci_root_bus; ··· 524 518 return NULL; 525 519 } else if (!strcmp(str, "skip_isa_align")) { 526 520 pci_probe |= PCI_CAN_SKIP_ISA_ALIGN; 521 + return NULL; 522 + } else if (!strcmp(str, "noioapicquirk")) { 523 + noioapicquirk = 1; 524 + return NULL; 525 + } else if (!strcmp(str, "ioapicreroute")) { 526 + if (noioapicreroute != -1) 527 + noioapicreroute = 0; 528 + return NULL; 529 + } else if (!strcmp(str, "noioapicreroute")) { 530 + if (noioapicreroute != -1) 531 + noioapicreroute = 1; 527 532 return NULL; 528 533 } 529 534 return str;
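The new `pci=` options above follow the existing convention in this file: the parser returns NULL once an option is consumed, or hands the string back otherwise. A hedged userspace sketch of that shape (the flag names and the `-1` "locked" sentinel are illustrative only, not the kernel's exact semantics):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* toy flags mirroring noioapicquirk / noioapicreroute */
static int quirk_off;
static int reroute_off = -1;	/* -1 stands in for "setting locked" here */

/* returns NULL when the option was consumed, like pcibios_setup() */
static const char *parse_pci_option(const char *str)
{
	if (!strcmp(str, "noioapicquirk")) {
		quirk_off = 1;
		return NULL;
	} else if (!strcmp(str, "ioapicreroute")) {
		if (reroute_off != -1)	/* guard, as in the real hunk */
			reroute_off = 0;
		return NULL;
	} else if (!strcmp(str, "noioapicreroute")) {
		if (reroute_off != -1)
			reroute_off = 1;
		return NULL;
	}
	return str;	/* unrecognized: hand back for further parsing */
}
```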
+3 -1
arch/x86/pci/direct.c
··· 173 173 174 174 #undef PCI_CONF2_ADDRESS 175 175 176 - static struct pci_raw_ops pci_direct_conf2 = { 176 + struct pci_raw_ops pci_direct_conf2 = { 177 177 .read = pci_conf2_read, 178 178 .write = pci_conf2_write, 179 179 }; ··· 289 289 290 290 if (pci_check_type1()) { 291 291 raw_pci_ops = &pci_direct_conf1; 292 + port_cf9_safe = true; 292 293 return 1; 293 294 } 294 295 release_resource(region); ··· 306 305 307 306 if (pci_check_type2()) { 308 307 raw_pci_ops = &pci_direct_conf2; 308 + port_cf9_safe = true; 309 309 return 2; 310 310 } 311 311
+1
arch/x86/pci/pci.h
··· 96 96 extern struct pci_raw_ops *raw_pci_ext_ops; 97 97 98 98 extern struct pci_raw_ops pci_direct_conf1; 99 + extern bool port_cf9_safe; 99 100 100 101 /* arch_initcall level */ 101 102 extern int pci_direct_probe(void);
+10 -7
arch/x86/xen/enlighten.c
··· 28 28 #include <linux/console.h> 29 29 30 30 #include <xen/interface/xen.h> 31 + #include <xen/interface/version.h> 31 32 #include <xen/interface/physdev.h> 32 33 #include <xen/interface/vcpu.h> 33 34 #include <xen/features.h> ··· 794 793 795 794 ret = 0; 796 795 797 - switch(msr) { 796 + switch (msr) { 798 797 #ifdef CONFIG_X86_64 799 798 unsigned which; 800 799 u64 base; ··· 1454 1453 1455 1454 ident_pte = 0; 1456 1455 pfn = 0; 1457 - for(pmdidx = 0; pmdidx < PTRS_PER_PMD && pfn < max_pfn; pmdidx++) { 1456 + for (pmdidx = 0; pmdidx < PTRS_PER_PMD && pfn < max_pfn; pmdidx++) { 1458 1457 pte_t *pte_page; 1459 1458 1460 1459 /* Reuse or allocate a page of ptes */ ··· 1472 1471 } 1473 1472 1474 1473 /* Install mappings */ 1475 - for(pteidx = 0; pteidx < PTRS_PER_PTE; pteidx++, pfn++) { 1474 + for (pteidx = 0; pteidx < PTRS_PER_PTE; pteidx++, pfn++) { 1476 1475 pte_t pte; 1477 1476 1478 1477 if (pfn > max_pfn_mapped) ··· 1486 1485 } 1487 1486 } 1488 1487 1489 - for(pteidx = 0; pteidx < ident_pte; pteidx += PTRS_PER_PTE) 1488 + for (pteidx = 0; pteidx < ident_pte; pteidx += PTRS_PER_PTE) 1490 1489 set_page_prot(&level1_ident_pgt[pteidx], PAGE_KERNEL_RO); 1491 1490 1492 1491 set_page_prot(pmd, PAGE_KERNEL_RO); ··· 1500 1499 1501 1500 /* All levels are converted the same way, so just treat them 1502 1501 as ptes. */ 1503 - for(i = 0; i < PTRS_PER_PTE; i++) 1502 + for (i = 0; i < PTRS_PER_PTE; i++) 1504 1503 pte[i] = xen_make_pte(pte[i].pte); 1505 1504 } 1506 1505 ··· 1515 1514 * of the physical mapping once some sort of allocator has been set 1516 1515 * up. 
1517 1516 */ 1518 - static __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn) 1517 + static __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd, 1518 + unsigned long max_pfn) 1519 1519 { 1520 1520 pud_t *l3; 1521 1521 pmd_t *l2; ··· 1579 1577 #else /* !CONFIG_X86_64 */ 1580 1578 static pmd_t level2_kernel_pgt[PTRS_PER_PMD] __page_aligned_bss; 1581 1579 1582 - static __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn) 1580 + static __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd, 1581 + unsigned long max_pfn) 1583 1582 { 1584 1583 pmd_t *kernel_pmd; 1585 1584
+10 -7
arch/x86/xen/mmu.c
··· 154 154 { 155 155 unsigned pfn, idx; 156 156 157 - for(pfn = 0; pfn < MAX_DOMAIN_PAGES; pfn += P2M_ENTRIES_PER_PAGE) { 157 + for (pfn = 0; pfn < MAX_DOMAIN_PAGES; pfn += P2M_ENTRIES_PER_PAGE) { 158 158 unsigned topidx = p2m_top_index(pfn); 159 159 160 160 p2m_top_mfn[topidx] = virt_to_mfn(p2m_top[topidx]); 161 161 } 162 162 163 - for(idx = 0; idx < ARRAY_SIZE(p2m_top_mfn_list); idx++) { 163 + for (idx = 0; idx < ARRAY_SIZE(p2m_top_mfn_list); idx++) { 164 164 unsigned topidx = idx * P2M_ENTRIES_PER_PAGE; 165 165 p2m_top_mfn_list[idx] = virt_to_mfn(&p2m_top_mfn[topidx]); 166 166 } ··· 179 179 unsigned long max_pfn = min(MAX_DOMAIN_PAGES, xen_start_info->nr_pages); 180 180 unsigned pfn; 181 181 182 - for(pfn = 0; pfn < max_pfn; pfn += P2M_ENTRIES_PER_PAGE) { 182 + for (pfn = 0; pfn < max_pfn; pfn += P2M_ENTRIES_PER_PAGE) { 183 183 unsigned topidx = p2m_top_index(pfn); 184 184 185 185 p2m_top[topidx] = &mfn_list[pfn]; ··· 207 207 p = (void *)__get_free_page(GFP_KERNEL | __GFP_NOFAIL); 208 208 BUG_ON(p == NULL); 209 209 210 - for(i = 0; i < P2M_ENTRIES_PER_PAGE; i++) 210 + for (i = 0; i < P2M_ENTRIES_PER_PAGE; i++) 211 211 p[i] = INVALID_P2M_ENTRY; 212 212 213 213 if (cmpxchg(pp, p2m_missing, p) != p2m_missing) ··· 407 407 preempt_enable(); 408 408 } 409 409 410 - pte_t xen_ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr, pte_t *ptep) 410 + pte_t xen_ptep_modify_prot_start(struct mm_struct *mm, 411 + unsigned long addr, pte_t *ptep) 411 412 { 412 413 /* Just return the pte as-is. 
We preserve the bits on commit */ 413 414 return *ptep; ··· 879 878 880 879 if (user_pgd) { 881 880 xen_pin_page(mm, virt_to_page(user_pgd), PT_PGD); 882 - xen_do_pin(MMUEXT_PIN_L4_TABLE, PFN_DOWN(__pa(user_pgd))); 881 + xen_do_pin(MMUEXT_PIN_L4_TABLE, 882 + PFN_DOWN(__pa(user_pgd))); 883 883 } 884 884 } 885 885 #else /* CONFIG_X86_32 */ ··· 995 993 pgd_t *user_pgd = xen_get_user_pgd(pgd); 996 994 997 995 if (user_pgd) { 998 - xen_do_pin(MMUEXT_UNPIN_TABLE, PFN_DOWN(__pa(user_pgd))); 996 + xen_do_pin(MMUEXT_UNPIN_TABLE, 997 + PFN_DOWN(__pa(user_pgd))); 999 998 xen_unpin_page(mm, virt_to_page(user_pgd), PT_PGD); 1000 999 } 1001 1000 }
+1 -1
arch/x86/xen/multicalls.c
··· 154 154 ret, smp_processor_id()); 155 155 dump_stack(); 156 156 for (i = 0; i < b->mcidx; i++) { 157 - printk(" call %2d/%d: op=%lu arg=[%lx] result=%ld\n", 157 + printk(KERN_DEBUG " call %2d/%d: op=%lu arg=[%lx] result=%ld\n", 158 158 i+1, b->mcidx, 159 159 b->debug[i].op, 160 160 b->debug[i].args[0],
+5 -4
arch/x86/xen/setup.c
··· 28 28 /* These are code, but not functions. Defined in entry.S */ 29 29 extern const char xen_hypervisor_callback[]; 30 30 extern const char xen_failsafe_callback[]; 31 + extern void xen_sysenter_target(void); 32 + extern void xen_syscall_target(void); 33 + extern void xen_syscall32_target(void); 31 34 32 35 33 36 /** ··· 113 110 114 111 void __cpuinit xen_enable_sysenter(void) 115 112 { 116 - extern void xen_sysenter_target(void); 117 113 int ret; 118 114 unsigned sysenter_feature; 119 115 ··· 134 132 { 135 133 #ifdef CONFIG_X86_64 136 134 int ret; 137 - extern void xen_syscall_target(void); 138 - extern void xen_syscall32_target(void); 139 135 140 136 ret = register_callback(CALLBACKTYPE_syscall, xen_syscall_target); 141 137 if (ret != 0) { ··· 160 160 HYPERVISOR_vm_assist(VMASST_CMD_enable, VMASST_TYPE_writable_pagetables); 161 161 162 162 if (!xen_feature(XENFEAT_auto_translated_physmap)) 163 - HYPERVISOR_vm_assist(VMASST_CMD_enable, VMASST_TYPE_pae_extended_cr3); 163 + HYPERVISOR_vm_assist(VMASST_CMD_enable, 164 + VMASST_TYPE_pae_extended_cr3); 164 165 165 166 if (register_callback(CALLBACKTYPE_event, xen_hypervisor_callback) || 166 167 register_callback(CALLBACKTYPE_failsafe, xen_failsafe_callback))
+56
drivers/acpi/pci_irq.c
··· 384 384 return irq; 385 385 } 386 386 387 + #ifdef CONFIG_X86_IO_APIC 388 + extern int noioapicquirk; 389 + 390 + static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) 391 + { 392 + struct pci_bus *bus_it; 393 + 394 + for (bus_it = bus ; bus_it ; bus_it = bus_it->parent) { 395 + if (!bus_it->self) 396 + return 0; 397 + 398 + printk(KERN_INFO "vendor=%04x device=%04x\n", bus_it->self->vendor, 399 + bus_it->self->device); 400 + 401 + if (bus_it->self->irq_reroute_variant) 402 + return bus_it->self->irq_reroute_variant; 403 + } 404 + return 0; 405 + } 406 + #endif /* CONFIG_X86_IO_APIC */ 407 + 387 408 /* 388 409 * acpi_pci_irq_lookup 389 410 * success: return IRQ >= 0 ··· 434 413 } 435 414 436 415 ret = func(entry, triggering, polarity, link); 416 + 417 + #ifdef CONFIG_X86_IO_APIC 418 + /* 419 + * Some chipsets (e.g. intel 6700PXH) generate a legacy INTx when the 420 + * IRQ entry in the chipset's IO-APIC is masked (as, e.g. the RT kernel 421 + * does during interrupt handling). When this INTx generation cannot be 422 + * disabled, we reroute these interrupts to their legacy equivalent to 423 + * get rid of spurious interrupts. 424 + */ 425 + if (!noioapicquirk) { 426 + switch (bridge_has_boot_interrupt_variant(bus)) { 427 + case 0: 428 + /* no rerouting necessary */ 429 + break; 430 + 431 + case INTEL_IRQ_REROUTE_VARIANT: 432 + /* 433 + * Remap according to INTx routing table in 6700PXH 434 + * specs, intel order number 302628-002, section 435 + * 2.15.2. Other chipsets (80332, ...) have the same 436 + * mapping and are handled here as well. 437 + */ 438 + printk(KERN_INFO "pci irq %d -> rerouted to legacy " 439 + "irq %d\n", ret, (ret % 4) + 16); 440 + ret = (ret % 4) + 16; 441 + break; 442 + 443 + default: 444 + printk(KERN_INFO "not rerouting irq %d to legacy irq: " 445 + "unknown mapping\n", ret); 446 + break; 447 + } 448 + } 449 + #endif /* CONFIG_X86_IO_APIC */ 450 + 437 451 return ret; 438 452 } 439 453
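The reroute in the hunk above maps an IO-APIC IRQ back to its legacy equivalent with `(ret % 4) + 16`: the INTx lines repeat modulo 4 and the remapped block starts at IRQ 16, per the cited 6700PXH routing table. A tiny sketch of just that arithmetic:

```c
#include <assert.h>

/* remap an IO-APIC irq to its legacy boot interrupt, as in the
 * 6700PXH quirk: INTx lines repeat modulo 4, legacy block starts at 16 */
static int reroute_to_boot_irq(int irq)
{
	return (irq % 4) + 16;
}
```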
+11
drivers/firmware/dmi_scan.c
··· 467 467 } 468 468 EXPORT_SYMBOL(dmi_get_system_info); 469 469 470 + /** 471 + * dmi_name_in_serial - Check if string is in the DMI product serial 472 + * information. 473 + */ 474 + int dmi_name_in_serial(const char *str) 475 + { 476 + int f = DMI_PRODUCT_SERIAL; 477 + if (dmi_ident[f] && strstr(dmi_ident[f], str)) 478 + return 1; 479 + return 0; 480 + } 470 481 471 482 /** 472 483 * dmi_name_in_vendors - Check if string is anywhere in the DMI vendor information.
-1
drivers/misc/sgi-gru/gruprocfs.c
··· 317 317 { 318 318 struct proc_entry *p; 319 319 320 - proc_mkdir("sgi_uv", NULL); 321 320 proc_gru = proc_mkdir("sgi_uv/gru", NULL); 322 321 323 322 for (p = proc_files; p->name; p++)
+5 -2
drivers/misc/sgi-xp/xp.h
··· 194 194 xpGruSendMqError, /* 59: gru send message queue related error */ 195 195 196 196 xpBadChannelNumber, /* 60: invalid channel number */ 197 - xpBadMsgType, /* 60: invalid message type */ 197 + xpBadMsgType, /* 61: invalid message type */ 198 + xpBiosError, /* 62: BIOS error */ 198 199 199 - xpUnknownReason /* 61: unknown reason - must be last in enum */ 200 + xpUnknownReason /* 63: unknown reason - must be last in enum */ 200 201 }; 201 202 202 203 /* ··· 346 345 extern enum xp_retval (*xp_remote_memcpy) (unsigned long, const unsigned long, 347 346 size_t); 348 347 extern int (*xp_cpu_to_nasid) (int); 348 + extern enum xp_retval (*xp_expand_memprotect) (unsigned long, unsigned long); 349 + extern enum xp_retval (*xp_restrict_memprotect) (unsigned long, unsigned long); 349 350 350 351 extern u64 xp_nofault_PIOR_target; 351 352 extern int xp_nofault_PIOR(void *);
+7
drivers/misc/sgi-xp/xp_main.c
··· 51 51 int (*xp_cpu_to_nasid) (int cpuid); 52 52 EXPORT_SYMBOL_GPL(xp_cpu_to_nasid); 53 53 54 + enum xp_retval (*xp_expand_memprotect) (unsigned long phys_addr, 55 + unsigned long size); 56 + EXPORT_SYMBOL_GPL(xp_expand_memprotect); 57 + enum xp_retval (*xp_restrict_memprotect) (unsigned long phys_addr, 58 + unsigned long size); 59 + EXPORT_SYMBOL_GPL(xp_restrict_memprotect); 60 + 54 61 /* 55 62 * xpc_registrations[] keeps track of xpc_connect()'s done by the kernel-level 56 63 * users of XPC.
+34
drivers/misc/sgi-xp/xp_sn2.c
··· 120 120 return cpuid_to_nasid(cpuid); 121 121 } 122 122 123 + static enum xp_retval 124 + xp_expand_memprotect_sn2(unsigned long phys_addr, unsigned long size) 125 + { 126 + u64 nasid_array = 0; 127 + int ret; 128 + 129 + ret = sn_change_memprotect(phys_addr, size, SN_MEMPROT_ACCESS_CLASS_1, 130 + &nasid_array); 131 + if (ret != 0) { 132 + dev_err(xp, "sn_change_memprotect(,, " 133 + "SN_MEMPROT_ACCESS_CLASS_1,) failed ret=%d\n", ret); 134 + return xpSalError; 135 + } 136 + return xpSuccess; 137 + } 138 + 139 + static enum xp_retval 140 + xp_restrict_memprotect_sn2(unsigned long phys_addr, unsigned long size) 141 + { 142 + u64 nasid_array = 0; 143 + int ret; 144 + 145 + ret = sn_change_memprotect(phys_addr, size, SN_MEMPROT_ACCESS_CLASS_0, 146 + &nasid_array); 147 + if (ret != 0) { 148 + dev_err(xp, "sn_change_memprotect(,, " 149 + "SN_MEMPROT_ACCESS_CLASS_0,) failed ret=%d\n", ret); 150 + return xpSalError; 151 + } 152 + return xpSuccess; 153 + } 154 + 123 155 enum xp_retval 124 156 xp_init_sn2(void) 125 157 { ··· 164 132 xp_pa = xp_pa_sn2; 165 133 xp_remote_memcpy = xp_remote_memcpy_sn2; 166 134 xp_cpu_to_nasid = xp_cpu_to_nasid_sn2; 135 + xp_expand_memprotect = xp_expand_memprotect_sn2; 136 + xp_restrict_memprotect = xp_restrict_memprotect_sn2; 167 137 168 138 return xp_register_nofault_code_sn2(); 169 139 }
+68 -2
drivers/misc/sgi-xp/xp_uv.c
··· 15 15 16 16 #include <linux/device.h> 17 17 #include <asm/uv/uv_hub.h> 18 + #if defined CONFIG_X86_64 19 + #include <asm/uv/bios.h> 20 + #elif defined CONFIG_IA64_GENERIC || defined CONFIG_IA64_SGI_UV 21 + #include <asm/sn/sn_sal.h> 22 + #endif 18 23 #include "../sgi-gru/grukservices.h" 19 24 #include "xp.h" 20 25 ··· 54 49 return UV_PNODE_TO_NASID(uv_cpu_to_pnode(cpuid)); 55 50 } 56 51 52 + static enum xp_retval 53 + xp_expand_memprotect_uv(unsigned long phys_addr, unsigned long size) 54 + { 55 + int ret; 56 + 57 + #if defined CONFIG_X86_64 58 + ret = uv_bios_change_memprotect(phys_addr, size, UV_MEMPROT_ALLOW_RW); 59 + if (ret != BIOS_STATUS_SUCCESS) { 60 + dev_err(xp, "uv_bios_change_memprotect(,, " 61 + "UV_MEMPROT_ALLOW_RW) failed, ret=%d\n", ret); 62 + return xpBiosError; 63 + } 64 + 65 + #elif defined CONFIG_IA64_GENERIC || defined CONFIG_IA64_SGI_UV 66 + u64 nasid_array; 67 + 68 + ret = sn_change_memprotect(phys_addr, size, SN_MEMPROT_ACCESS_CLASS_1, 69 + &nasid_array); 70 + if (ret != 0) { 71 + dev_err(xp, "sn_change_memprotect(,, " 72 + "SN_MEMPROT_ACCESS_CLASS_1,) failed ret=%d\n", ret); 73 + return xpSalError; 74 + } 75 + #else 76 + #error not a supported configuration 77 + #endif 78 + return xpSuccess; 79 + } 80 + 81 + static enum xp_retval 82 + xp_restrict_memprotect_uv(unsigned long phys_addr, unsigned long size) 83 + { 84 + int ret; 85 + 86 + #if defined CONFIG_X86_64 87 + ret = uv_bios_change_memprotect(phys_addr, size, 88 + UV_MEMPROT_RESTRICT_ACCESS); 89 + if (ret != BIOS_STATUS_SUCCESS) { 90 + dev_err(xp, "uv_bios_change_memprotect(,, " 91 + "UV_MEMPROT_RESTRICT_ACCESS) failed, ret=%d\n", ret); 92 + return xpBiosError; 93 + } 94 + 95 + #elif defined CONFIG_IA64_GENERIC || defined CONFIG_IA64_SGI_UV 96 + u64 nasid_array; 97 + 98 + ret = sn_change_memprotect(phys_addr, size, SN_MEMPROT_ACCESS_CLASS_0, 99 + &nasid_array); 100 + if (ret != 0) { 101 + dev_err(xp, "sn_change_memprotect(,, " 102 + "SN_MEMPROT_ACCESS_CLASS_0,) failed ret=%d\n", 
ret); 103 + return xpSalError; 104 + } 105 + #else 106 + #error not a supported configuration 107 + #endif 108 + return xpSuccess; 109 + } 110 + 57 111 enum xp_retval 58 112 xp_init_uv(void) 59 113 { 60 114 BUG_ON(!is_uv()); 61 115 62 116 xp_max_npartitions = XP_MAX_NPARTITIONS_UV; 63 - xp_partition_id = 0; /* !!! not correct value */ 64 - xp_region_size = 0; /* !!! not correct value */ 117 + xp_partition_id = sn_partition_id; 118 + xp_region_size = sn_region_size; 65 119 66 120 xp_pa = xp_pa_uv; 67 121 xp_remote_memcpy = xp_remote_memcpy_uv; 68 122 xp_cpu_to_nasid = xp_cpu_to_nasid_uv; 123 + xp_expand_memprotect = xp_expand_memprotect_uv; 124 + xp_restrict_memprotect = xp_restrict_memprotect_uv; 69 125 70 126 return xpSuccess; 71 127 }
+12
drivers/misc/sgi-xp/xpc.h
··· 181 181 xpc_nasid_mask_nlongs)) 182 182 183 183 /* 184 + * Info pertinent to a GRU message queue using a watch list for irq generation. 185 + */ 186 + struct xpc_gru_mq_uv { 187 + void *address; /* address of GRU message queue */ 188 + unsigned int order; /* size of GRU message queue as a power of 2 */ 189 + int irq; /* irq raised when message is received in mq */ 190 + int mmr_blade; /* blade where watchlist was allocated from */ 191 + unsigned long mmr_offset; /* offset of irq mmr located on mmr_blade */ 192 + int watchlist_num; /* number of watchlist allocated by BIOS */ 193 + }; 194 + 195 + /* 184 196 * The activate_mq is used to send/receive GRU messages that affect XPC's 185 197 * heartbeat, partition active state, and channel state. This is UV only. 186 198 */
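The `order` field above stores the total queue size as a power of 2; later in this series xpc_create_gru_mq_uv() computes it as the page order plus PAGE_SHIFT and rounds `mq_size` up to `1UL << order`. A userspace sketch of that rounding, with `get_order_sketch` standing in for the kernel's get_order():

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* smallest n such that (PAGE_SIZE << n) >= size, like get_order() */
static int get_order_sketch(unsigned long size)
{
	int order = 0;
	unsigned long span = PAGE_SIZE;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* mirror of the xpc_create_gru_mq_uv() computation:
 * mq->order = pg_order + PAGE_SHIFT; mq_size = 1UL << mq->order */
static unsigned long round_mq_size(unsigned long mq_size, int *order_out)
{
	int pg_order = get_order_sketch(mq_size);

	*order_out = pg_order + PAGE_SHIFT;
	return 1UL << *order_out;
}
```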
+5 -10
drivers/misc/sgi-xp/xpc_sn2.c
··· 553 553 static enum xp_retval 554 554 xpc_allow_amo_ops_sn2(struct amo *amos_page) 555 555 { 556 - u64 nasid_array = 0; 557 - int ret; 556 + enum xp_retval ret = xpSuccess; 558 557 559 558 /* 560 559 * On SHUB 1.1, we cannot call sn_change_memprotect() since the BIST 561 560 * collides with memory operations. On those systems we call 562 561 * xpc_allow_amo_ops_shub_wars_1_1_sn2() instead. 563 562 */ 564 - if (!enable_shub_wars_1_1()) { 565 - ret = sn_change_memprotect(ia64_tpa((u64)amos_page), PAGE_SIZE, 566 - SN_MEMPROT_ACCESS_CLASS_1, 567 - &nasid_array); 568 - if (ret != 0) 569 - return xpSalError; 570 - } 571 - return xpSuccess; 563 + if (!enable_shub_wars_1_1()) 564 + ret = xp_expand_memprotect(ia64_tpa((u64)amos_page), PAGE_SIZE); 565 + 566 + return ret; 572 567 } 573 568 574 569 /*
+251 -73
drivers/misc/sgi-xp/xpc_uv.c
··· 18 18 #include <linux/interrupt.h> 19 19 #include <linux/delay.h> 20 20 #include <linux/device.h> 21 + #include <linux/err.h> 21 22 #include <asm/uv/uv_hub.h> 23 + #if defined CONFIG_X86_64 24 + #include <asm/uv/bios.h> 25 + #include <asm/uv/uv_irq.h> 26 + #elif defined CONFIG_IA64_GENERIC || defined CONFIG_IA64_SGI_UV 27 + #include <asm/sn/intr.h> 28 + #include <asm/sn/sn_sal.h> 29 + #endif 22 30 #include "../sgi-gru/gru.h" 23 31 #include "../sgi-gru/grukservices.h" 24 32 #include "xpc.h" ··· 35 27 static DECLARE_BITMAP(xpc_heartbeating_to_mask_uv, XP_MAX_NPARTITIONS_UV); 36 28 37 29 #define XPC_ACTIVATE_MSG_SIZE_UV (1 * GRU_CACHE_LINE_BYTES) 30 + #define XPC_ACTIVATE_MQ_SIZE_UV (4 * XP_MAX_NPARTITIONS_UV * \ 31 + XPC_ACTIVATE_MSG_SIZE_UV) 32 + #define XPC_ACTIVATE_IRQ_NAME "xpc_activate" 33 + 38 34 #define XPC_NOTIFY_MSG_SIZE_UV (2 * GRU_CACHE_LINE_BYTES) 35 + #define XPC_NOTIFY_MQ_SIZE_UV (4 * XP_MAX_NPARTITIONS_UV * \ 36 + XPC_NOTIFY_MSG_SIZE_UV) 37 + #define XPC_NOTIFY_IRQ_NAME "xpc_notify" 39 38 40 - #define XPC_ACTIVATE_MQ_SIZE_UV (4 * XP_MAX_NPARTITIONS_UV * \ 41 - XPC_ACTIVATE_MSG_SIZE_UV) 42 - #define XPC_NOTIFY_MQ_SIZE_UV (4 * XP_MAX_NPARTITIONS_UV * \ 43 - XPC_NOTIFY_MSG_SIZE_UV) 44 - 45 - static void *xpc_activate_mq_uv; 46 - static void *xpc_notify_mq_uv; 39 + static struct xpc_gru_mq_uv *xpc_activate_mq_uv; 40 + static struct xpc_gru_mq_uv *xpc_notify_mq_uv; 47 41 48 42 static int 49 43 xpc_setup_partitions_sn_uv(void) ··· 62 52 return 0; 63 53 } 64 54 65 - static void * 66 - xpc_create_gru_mq_uv(unsigned int mq_size, int cpuid, unsigned int irq, 67 - irq_handler_t irq_handler) 55 + static int 56 + xpc_get_gru_mq_irq_uv(struct xpc_gru_mq_uv *mq, int cpu, char *irq_name) 68 57 { 69 - int ret; 70 - int nid; 71 - int mq_order; 72 - struct page *page; 73 - void *mq; 74 - 75 - nid = cpu_to_node(cpuid); 76 - mq_order = get_order(mq_size); 77 - page = alloc_pages_node(nid, GFP_KERNEL | __GFP_ZERO | GFP_THISNODE, 78 - mq_order); 79 - if (page == NULL) { 
80 - dev_err(xpc_part, "xpc_create_gru_mq_uv() failed to alloc %d " 81 - "bytes of memory on nid=%d for GRU mq\n", mq_size, nid); 82 - return NULL; 58 + #if defined CONFIG_X86_64 59 + mq->irq = uv_setup_irq(irq_name, cpu, mq->mmr_blade, mq->mmr_offset); 60 + if (mq->irq < 0) { 61 + dev_err(xpc_part, "uv_setup_irq() returned error=%d\n", 62 + mq->irq); 83 63 } 84 64 85 - mq = page_address(page); 86 - ret = gru_create_message_queue(mq, mq_size); 87 - if (ret != 0) { 88 - dev_err(xpc_part, "gru_create_message_queue() returned " 89 - "error=%d\n", ret); 90 - free_pages((unsigned long)mq, mq_order); 91 - return NULL; 92 - } 65 + #elif defined CONFIG_IA64_GENERIC || defined CONFIG_IA64_SGI_UV 66 + int mmr_pnode; 67 + unsigned long mmr_value; 93 68 94 - /* !!! Need to do some other things to set up IRQ */ 69 + if (strcmp(irq_name, XPC_ACTIVATE_IRQ_NAME) == 0) 70 + mq->irq = SGI_XPC_ACTIVATE; 71 + else if (strcmp(irq_name, XPC_NOTIFY_IRQ_NAME) == 0) 72 + mq->irq = SGI_XPC_NOTIFY; 73 + else 74 + return -EINVAL; 95 75 96 - ret = request_irq(irq, irq_handler, 0, "xpc", NULL); 97 - if (ret != 0) { 98 - dev_err(xpc_part, "request_irq(irq=%d) returned error=%d\n", 99 - irq, ret); 100 - free_pages((unsigned long)mq, mq_order); 101 - return NULL; 102 - } 76 + mmr_pnode = uv_blade_to_pnode(mq->mmr_blade); 77 + mmr_value = (unsigned long)cpu_physical_id(cpu) << 32 | mq->irq; 103 78 104 - /* !!! enable generation of irq when GRU mq op occurs to this mq */ 79 + uv_write_global_mmr64(mmr_pnode, mq->mmr_offset, mmr_value); 80 + #else 81 + #error not a supported configuration 82 + #endif 105 83 106 - /* ??? allow other partitions to access GRU mq? */ 107 - 108 - return mq; 84 + return 0; 109 85 } 110 86 111 87 static void 112 - xpc_destroy_gru_mq_uv(void *mq, unsigned int mq_size, unsigned int irq) 88 + xpc_release_gru_mq_irq_uv(struct xpc_gru_mq_uv *mq) 113 89 { 114 - /* ??? disallow other partitions to access GRU mq? 
*/ 90 + #if defined CONFIG_X86_64 91 + uv_teardown_irq(mq->irq, mq->mmr_blade, mq->mmr_offset); 115 92 116 - /* !!! disable generation of irq when GRU mq op occurs to this mq */ 93 + #elif defined CONFIG_IA64_GENERIC || defined CONFIG_IA64_SGI_UV 94 + int mmr_pnode; 95 + unsigned long mmr_value; 117 96 118 - free_irq(irq, NULL); 97 + mmr_pnode = uv_blade_to_pnode(mq->mmr_blade); 98 + mmr_value = 1UL << 16; 119 99 120 - free_pages((unsigned long)mq, get_order(mq_size)); 100 + uv_write_global_mmr64(mmr_pnode, mq->mmr_offset, mmr_value); 101 + #else 102 + #error not a supported configuration 103 + #endif 104 + } 105 + 106 + static int 107 + xpc_gru_mq_watchlist_alloc_uv(struct xpc_gru_mq_uv *mq) 108 + { 109 + int ret; 110 + 111 + #if defined CONFIG_X86_64 112 + ret = uv_bios_mq_watchlist_alloc(mq->mmr_blade, uv_gpa(mq->address), 113 + mq->order, &mq->mmr_offset); 114 + if (ret < 0) { 115 + dev_err(xpc_part, "uv_bios_mq_watchlist_alloc() failed, " 116 + "ret=%d\n", ret); 117 + return ret; 118 + } 119 + #elif defined CONFIG_IA64_GENERIC || defined CONFIG_IA64_SGI_UV 120 + ret = sn_mq_watchlist_alloc(mq->mmr_blade, uv_gpa(mq->address), 121 + mq->order, &mq->mmr_offset); 122 + if (ret < 0) { 123 + dev_err(xpc_part, "sn_mq_watchlist_alloc() failed, ret=%d\n", 124 + ret); 125 + return -EBUSY; 126 + } 127 + #else 128 + #error not a supported configuration 129 + #endif 130 + 131 + mq->watchlist_num = ret; 132 + return 0; 133 + } 134 + 135 + static void 136 + xpc_gru_mq_watchlist_free_uv(struct xpc_gru_mq_uv *mq) 137 + { 138 + int ret; 139 + 140 + #if defined CONFIG_X86_64 141 + ret = uv_bios_mq_watchlist_free(mq->mmr_blade, mq->watchlist_num); 142 + BUG_ON(ret != BIOS_STATUS_SUCCESS); 143 + #elif defined CONFIG_IA64_GENERIC || defined CONFIG_IA64_SGI_UV 144 + ret = sn_mq_watchlist_free(mq->mmr_blade, mq->watchlist_num); 145 + BUG_ON(ret != SALRET_OK); 146 + #else 147 + #error not a supported configuration 148 + #endif 149 + } 150 + 151 + static struct xpc_gru_mq_uv * 152 + 
xpc_create_gru_mq_uv(unsigned int mq_size, int cpu, char *irq_name, 153 + irq_handler_t irq_handler) 154 + { 155 + enum xp_retval xp_ret; 156 + int ret; 157 + int nid; 158 + int pg_order; 159 + struct page *page; 160 + struct xpc_gru_mq_uv *mq; 161 + 162 + mq = kmalloc(sizeof(struct xpc_gru_mq_uv), GFP_KERNEL); 163 + if (mq == NULL) { 164 + dev_err(xpc_part, "xpc_create_gru_mq_uv() failed to kmalloc() " 165 + "a xpc_gru_mq_uv structure\n"); 166 + ret = -ENOMEM; 167 + goto out_1; 168 + } 169 + 170 + pg_order = get_order(mq_size); 171 + mq->order = pg_order + PAGE_SHIFT; 172 + mq_size = 1UL << mq->order; 173 + 174 + mq->mmr_blade = uv_cpu_to_blade_id(cpu); 175 + 176 + nid = cpu_to_node(cpu); 177 + page = alloc_pages_node(nid, GFP_KERNEL | __GFP_ZERO | GFP_THISNODE, 178 + pg_order); 179 + if (page == NULL) { 180 + dev_err(xpc_part, "xpc_create_gru_mq_uv() failed to alloc %d " 181 + "bytes of memory on nid=%d for GRU mq\n", mq_size, nid); 182 + ret = -ENOMEM; 183 + goto out_2; 184 + } 185 + mq->address = page_address(page); 186 + 187 + ret = gru_create_message_queue(mq->address, mq_size); 188 + if (ret != 0) { 189 + dev_err(xpc_part, "gru_create_message_queue() returned " 190 + "error=%d\n", ret); 191 + ret = -EINVAL; 192 + goto out_3; 193 + } 194 + 195 + /* enable generation of irq when GRU mq operation occurs to this mq */ 196 + ret = xpc_gru_mq_watchlist_alloc_uv(mq); 197 + if (ret != 0) 198 + goto out_3; 199 + 200 + ret = xpc_get_gru_mq_irq_uv(mq, cpu, irq_name); 201 + if (ret != 0) 202 + goto out_4; 203 + 204 + ret = request_irq(mq->irq, irq_handler, 0, irq_name, NULL); 205 + if (ret != 0) { 206 + dev_err(xpc_part, "request_irq(irq=%d) returned error=%d\n", 207 + mq->irq, ret); 208 + goto out_5; 209 + } 210 + 211 + /* allow other partitions to access this GRU mq */ 212 + xp_ret = xp_expand_memprotect(xp_pa(mq->address), mq_size); 213 + if (xp_ret != xpSuccess) { 214 + ret = -EACCES; 215 + goto out_6; 216 + } 217 + 218 + return mq; 219 + 220 + /* something went 
wrong */ 221 + out_6: 222 + free_irq(mq->irq, NULL); 223 + out_5: 224 + xpc_release_gru_mq_irq_uv(mq); 225 + out_4: 226 + xpc_gru_mq_watchlist_free_uv(mq); 227 + out_3: 228 + free_pages((unsigned long)mq->address, pg_order); 229 + out_2: 230 + kfree(mq); 231 + out_1: 232 + return ERR_PTR(ret); 233 + } 234 + 235 + static void 236 + xpc_destroy_gru_mq_uv(struct xpc_gru_mq_uv *mq) 237 + { 238 + unsigned int mq_size; 239 + int pg_order; 240 + int ret; 241 + 242 + /* disallow other partitions to access GRU mq */ 243 + mq_size = 1UL << mq->order; 244 + ret = xp_restrict_memprotect(xp_pa(mq->address), mq_size); 245 + BUG_ON(ret != xpSuccess); 246 + 247 + /* unregister irq handler and release mq irq/vector mapping */ 248 + free_irq(mq->irq, NULL); 249 + xpc_release_gru_mq_irq_uv(mq); 250 + 251 + /* disable generation of irq when GRU mq op occurs to this mq */ 252 + xpc_gru_mq_watchlist_free_uv(mq); 253 + 254 + pg_order = mq->order - PAGE_SHIFT; 255 + free_pages((unsigned long)mq->address, pg_order); 256 + 257 + kfree(mq); 121 258 } 122 259 123 260 static enum xp_retval ··· 559 402 struct xpc_partition *part; 560 403 int wakeup_hb_checker = 0; 561 404 562 - while ((msg_hdr = gru_get_next_message(xpc_activate_mq_uv)) != NULL) { 405 + while (1) { 406 + msg_hdr = gru_get_next_message(xpc_activate_mq_uv->address); 407 + if (msg_hdr == NULL) 408 + break; 563 409 564 410 partid = msg_hdr->partid; 565 411 if (partid < 0 || partid >= XP_MAX_NPARTITIONS_UV) { ··· 578 418 } 579 419 } 580 420 581 - gru_free_message(xpc_activate_mq_uv, msg_hdr); 421 + gru_free_message(xpc_activate_mq_uv->address, msg_hdr); 582 422 } 583 423 584 424 if (wakeup_hb_checker) ··· 642 482 struct xpc_partition_uv *part_uv = &part->sn.uv; 643 483 644 484 /* 645 - * !!! Make our side think that the remote parition sent an activate 485 + * !!! Make our side think that the remote partition sent an activate 646 486 * !!! message our way by doing what the activate IRQ handler would 647 487 * !!! 
do had one really been sent. 648 488 */ ··· 660 500 xpc_get_partition_rsvd_page_pa_uv(void *buf, u64 *cookie, unsigned long *rp_pa, 661 501 size_t *len) 662 502 { 663 - /* !!! call the UV version of sn_partition_reserved_page_pa() */ 664 - return xpUnsupported; 503 + s64 status; 504 + enum xp_retval ret; 505 + 506 + #if defined CONFIG_X86_64 507 + status = uv_bios_reserved_page_pa((u64)buf, cookie, (u64 *)rp_pa, 508 + (u64 *)len); 509 + if (status == BIOS_STATUS_SUCCESS) 510 + ret = xpSuccess; 511 + else if (status == BIOS_STATUS_MORE_PASSES) 512 + ret = xpNeedMoreInfo; 513 + else 514 + ret = xpBiosError; 515 + 516 + #elif defined CONFIG_IA64_GENERIC || defined CONFIG_IA64_SGI_UV 517 + status = sn_partition_reserved_page_pa((u64)buf, cookie, rp_pa, len); 518 + if (status == SALRET_OK) 519 + ret = xpSuccess; 520 + else if (status == SALRET_MORE_PASSES) 521 + ret = xpNeedMoreInfo; 522 + else 523 + ret = xpSalError; 524 + 525 + #else 526 + #error not a supported configuration 527 + #endif 528 + 529 + return ret; 665 530 } 666 531 667 532 static int 668 533 xpc_setup_rsvd_page_sn_uv(struct xpc_rsvd_page *rp) 669 534 { 670 - rp->sn.activate_mq_gpa = uv_gpa(xpc_activate_mq_uv); 535 + rp->sn.activate_mq_gpa = uv_gpa(xpc_activate_mq_uv->address); 671 536 return 0; 672 537 } 673 538 ··· 1596 1411 return -E2BIG; 1597 1412 } 1598 1413 1599 - /* ??? The cpuid argument's value is 0, is that what we want? */ 1600 - /* !!! The irq argument's value isn't correct. */ 1601 - xpc_activate_mq_uv = xpc_create_gru_mq_uv(XPC_ACTIVATE_MQ_SIZE_UV, 0, 0, 1414 + xpc_activate_mq_uv = xpc_create_gru_mq_uv(XPC_ACTIVATE_MQ_SIZE_UV, 0, 1415 + XPC_ACTIVATE_IRQ_NAME, 1602 1416 xpc_handle_activate_IRQ_uv); 1603 - if (xpc_activate_mq_uv == NULL) 1604 - return -ENOMEM; 1417 + if (IS_ERR(xpc_activate_mq_uv)) 1418 + return PTR_ERR(xpc_activate_mq_uv); 1605 1419 1606 - /* ??? The cpuid argument's value is 0, is that what we want? */ 1607 - /* !!! The irq argument's value isn't correct. 
*/ 1608 - xpc_notify_mq_uv = xpc_create_gru_mq_uv(XPC_NOTIFY_MQ_SIZE_UV, 0, 0, 1420 + xpc_notify_mq_uv = xpc_create_gru_mq_uv(XPC_NOTIFY_MQ_SIZE_UV, 0, 1421 + XPC_NOTIFY_IRQ_NAME, 1609 1422 xpc_handle_notify_IRQ_uv); 1610 - if (xpc_notify_mq_uv == NULL) { 1611 - /* !!! The irq argument's value isn't correct. */ 1612 - xpc_destroy_gru_mq_uv(xpc_activate_mq_uv, 1613 - XPC_ACTIVATE_MQ_SIZE_UV, 0); 1614 - return -ENOMEM; 1423 + if (IS_ERR(xpc_notify_mq_uv)) { 1424 + xpc_destroy_gru_mq_uv(xpc_activate_mq_uv); 1425 + return PTR_ERR(xpc_notify_mq_uv); 1615 1426 } 1616 1427 1617 1428 return 0; ··· 1616 1435 void 1617 1436 xpc_exit_uv(void) 1618 1437 { 1619 - /* !!! The irq argument's value isn't correct. */ 1620 - xpc_destroy_gru_mq_uv(xpc_notify_mq_uv, XPC_NOTIFY_MQ_SIZE_UV, 0); 1621 - 1622 - /* !!! The irq argument's value isn't correct. */ 1623 - xpc_destroy_gru_mq_uv(xpc_activate_mq_uv, XPC_ACTIVATE_MQ_SIZE_UV, 0); 1438 + xpc_destroy_gru_mq_uv(xpc_notify_mq_uv); 1439 + xpc_destroy_gru_mq_uv(xpc_activate_mq_uv); 1624 1440 }
+149 -21
drivers/pci/quirks.c
··· 606 606 sis_apic_bug = 1; 607 607 } 608 608 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SI, PCI_ANY_ID, quirk_ioapic_rmw); 609 - 610 - #define AMD8131_revA0 0x01 611 - #define AMD8131_revB0 0x11 612 - #define AMD8131_MISC 0x40 613 - #define AMD8131_NIOAMODE_BIT 0 614 - static void quirk_amd_8131_ioapic(struct pci_dev *dev) 615 - { 616 - unsigned char tmp; 617 - 618 - if (nr_ioapics == 0) 619 - return; 620 - 621 - if (dev->revision == AMD8131_revA0 || dev->revision == AMD8131_revB0) { 622 - dev_info(&dev->dev, "Fixing up AMD8131 IOAPIC mode\n"); 623 - pci_read_config_byte( dev, AMD8131_MISC, &tmp); 624 - tmp &= ~(1 << AMD8131_NIOAMODE_BIT); 625 - pci_write_config_byte( dev, AMD8131_MISC, tmp); 626 - } 627 - } 628 - DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_BRIDGE, quirk_amd_8131_ioapic); 629 - DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_BRIDGE, quirk_amd_8131_ioapic); 630 609 #endif /* CONFIG_X86_IO_APIC */ 631 610 632 611 /* ··· 1401 1422 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2609, quirk_intel_pcie_pm); 1402 1423 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x260a, quirk_intel_pcie_pm); 1403 1424 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x260b, quirk_intel_pcie_pm); 1425 + 1426 + #ifdef CONFIG_X86_IO_APIC 1427 + /* 1428 + * Boot interrupts on some chipsets cannot be turned off. For these chipsets, 1429 + * remap the original interrupt in the linux kernel to the boot interrupt, so 1430 + * that a PCI device's interrupt handler is installed on the boot interrupt 1431 + * line instead. 
1432 + */ 1433 + static void quirk_reroute_to_boot_interrupts_intel(struct pci_dev *dev) 1434 + { 1435 + if (noioapicquirk || noioapicreroute) 1436 + return; 1437 + 1438 + dev->irq_reroute_variant = INTEL_IRQ_REROUTE_VARIANT; 1439 + 1440 + printk(KERN_INFO "PCI quirk: reroute interrupts for 0x%04x:0x%04x\n", 1441 + dev->vendor, dev->device); 1442 + return; 1443 + } 1444 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80333_0, quirk_reroute_to_boot_interrupts_intel); 1445 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80333_1, quirk_reroute_to_boot_interrupts_intel); 1446 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB2_0, quirk_reroute_to_boot_interrupts_intel); 1447 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_0, quirk_reroute_to_boot_interrupts_intel); 1448 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_1, quirk_reroute_to_boot_interrupts_intel); 1449 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXHV, quirk_reroute_to_boot_interrupts_intel); 1450 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80332_0, quirk_reroute_to_boot_interrupts_intel); 1451 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80332_1, quirk_reroute_to_boot_interrupts_intel); 1452 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80333_0, quirk_reroute_to_boot_interrupts_intel); 1453 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80333_1, quirk_reroute_to_boot_interrupts_intel); 1454 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB2_0, quirk_reroute_to_boot_interrupts_intel); 1455 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_0, quirk_reroute_to_boot_interrupts_intel); 1456 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PXH_1, quirk_reroute_to_boot_interrupts_intel); 1457 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 
PCI_DEVICE_ID_INTEL_PXHV, quirk_reroute_to_boot_interrupts_intel); 1458 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80332_0, quirk_reroute_to_boot_interrupts_intel); 1459 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_80332_1, quirk_reroute_to_boot_interrupts_intel); 1460 + 1461 + /* 1462 + * On some chipsets we can disable the generation of legacy INTx boot 1463 + * interrupts. 1464 + */ 1465 + 1466 + /* 1467 + * IO-APIC1 on 6300ESB generates boot interrupts, see intel order no 1468 + * 300641-004US, section 5.7.3. 1469 + */ 1470 + #define INTEL_6300_IOAPIC_ABAR 0x40 1471 + #define INTEL_6300_DISABLE_BOOT_IRQ (1<<14) 1472 + 1473 + static void quirk_disable_intel_boot_interrupt(struct pci_dev *dev) 1474 + { 1475 + u16 pci_config_word; 1476 + 1477 + if (noioapicquirk) 1478 + return; 1479 + 1480 + pci_read_config_word(dev, INTEL_6300_IOAPIC_ABAR, &pci_config_word); 1481 + pci_config_word |= INTEL_6300_DISABLE_BOOT_IRQ; 1482 + pci_write_config_word(dev, INTEL_6300_IOAPIC_ABAR, pci_config_word); 1483 + 1484 + printk(KERN_INFO "disabled boot interrupt on device 0x%04x:0x%04x\n", 1485 + dev->vendor, dev->device); 1486 + } 1487 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB_10, quirk_disable_intel_boot_interrupt); 1488 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ESB_10, quirk_disable_intel_boot_interrupt); 1489 + 1490 + /* 1491 + * disable boot interrupts on HT-1000 1492 + */ 1493 + #define BC_HT1000_FEATURE_REG 0x64 1494 + #define BC_HT1000_PIC_REGS_ENABLE (1<<0) 1495 + #define BC_HT1000_MAP_IDX 0xC00 1496 + #define BC_HT1000_MAP_DATA 0xC01 1497 + 1498 + static void quirk_disable_broadcom_boot_interrupt(struct pci_dev *dev) 1499 + { 1500 + u32 pci_config_dword; 1501 + u8 irq; 1502 + 1503 + if (noioapicquirk) 1504 + return; 1505 + 1506 + pci_read_config_dword(dev, BC_HT1000_FEATURE_REG, &pci_config_dword); 1507 + pci_write_config_dword(dev, BC_HT1000_FEATURE_REG, pci_config_dword 
| 1508 + BC_HT1000_PIC_REGS_ENABLE); 1509 + 1510 + for (irq = 0x10; irq < 0x10 + 32; irq++) { 1511 + outb(irq, BC_HT1000_MAP_IDX); 1512 + outb(0x00, BC_HT1000_MAP_DATA); 1513 + } 1514 + 1515 + pci_write_config_dword(dev, BC_HT1000_FEATURE_REG, pci_config_dword); 1516 + 1517 + printk(KERN_INFO "disabled boot interrupts on PCI device" 1518 + "0x%04x:0x%04x\n", dev->vendor, dev->device); 1519 + } 1520 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_HT1000SB, quirk_disable_broadcom_boot_interrupt); 1521 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_HT1000SB, quirk_disable_broadcom_boot_interrupt); 1522 + 1523 + /* 1524 + * disable boot interrupts on AMD and ATI chipsets 1525 + */ 1526 + /* 1527 + * NOIOAMODE needs to be disabled to disable "boot interrupts". For AMD 8131 1528 + * rev. A0 and B0, NOIOAMODE needs to be disabled anyway to fix IO-APIC mode 1529 + * (due to an erratum). 1530 + */ 1531 + #define AMD_813X_MISC 0x40 1532 + #define AMD_813X_NOIOAMODE (1<<0) 1533 + 1534 + static void quirk_disable_amd_813x_boot_interrupt(struct pci_dev *dev) 1535 + { 1536 + u32 pci_config_dword; 1537 + 1538 + if (noioapicquirk) 1539 + return; 1540 + 1541 + pci_read_config_dword(dev, AMD_813X_MISC, &pci_config_dword); 1542 + pci_config_dword &= ~AMD_813X_NOIOAMODE; 1543 + pci_write_config_dword(dev, AMD_813X_MISC, pci_config_dword); 1544 + 1545 + printk(KERN_INFO "disabled boot interrupts on PCI device " 1546 + "0x%04x:0x%04x\n", dev->vendor, dev->device); 1547 + } 1548 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8131_BRIDGE, quirk_disable_amd_813x_boot_interrupt); 1549 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8132_BRIDGE, quirk_disable_amd_813x_boot_interrupt); 1550 + 1551 + #define AMD_8111_PCI_IRQ_ROUTING 0x56 1552 + 1553 + static void quirk_disable_amd_8111_boot_interrupt(struct pci_dev *dev) 1554 + { 1555 + u16 pci_config_word; 1556 + 1557 + if (noioapicquirk) 1558 + 
return; 1559 + 1560 + pci_read_config_word(dev, AMD_8111_PCI_IRQ_ROUTING, &pci_config_word); 1561 + if (!pci_config_word) { 1562 + printk(KERN_INFO "boot interrupts on PCI device 0x%04x:0x%04x " 1563 + "already disabled\n", 1564 + dev->vendor, dev->device); 1565 + return; 1566 + } 1567 + pci_write_config_word(dev, AMD_8111_PCI_IRQ_ROUTING, 0); 1568 + printk(KERN_INFO "disabled boot interrupts on PCI device " 1569 + "0x%04x:0x%04x\n", dev->vendor, dev->device); 1570 + } 1571 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8111_SMBUS, quirk_disable_amd_8111_boot_interrupt); 1572 + DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8111_SMBUS, quirk_disable_amd_8111_boot_interrupt); 1573 + #endif /* CONFIG_X86_IO_APIC */ 1404 1574 1405 1575 /* 1406 1576 * Toshiba TC86C001 IDE controller reports the standard 8-byte BAR0 size
+3 -1
drivers/xen/balloon.c
··· 44 44 #include <linux/list.h> 45 45 #include <linux/sysdev.h> 46 46 47 - #include <asm/xen/hypervisor.h> 48 47 #include <asm/page.h> 49 48 #include <asm/pgalloc.h> 50 49 #include <asm/pgtable.h> 51 50 #include <asm/uaccess.h> 52 51 #include <asm/tlb.h> 53 52 53 + #include <asm/xen/hypervisor.h> 54 + #include <asm/xen/hypercall.h> 55 + #include <xen/interface/xen.h> 54 56 #include <xen/interface/memory.h> 55 57 #include <xen/xenbus.h> 56 58 #include <xen/features.h>
+5 -1
drivers/xen/features.c
··· 8 8 #include <linux/types.h> 9 9 #include <linux/cache.h> 10 10 #include <linux/module.h> 11 - #include <asm/xen/hypervisor.h> 11 + 12 + #include <asm/xen/hypercall.h> 13 + 14 + #include <xen/interface/xen.h> 15 + #include <xen/interface/version.h> 12 16 #include <xen/features.h> 13 17 14 18 u8 xen_features[XENFEAT_NR_SUBMAPS * 32] __read_mostly;
+1
drivers/xen/grant-table.c
··· 40 40 #include <xen/interface/xen.h> 41 41 #include <xen/page.h> 42 42 #include <xen/grant_table.h> 43 + #include <asm/xen/hypercall.h> 43 44 44 45 #include <asm/pgtable.h> 45 46 #include <asm/sync_bitops.h>
+8
include/asm-generic/bug.h
··· 8 8 #ifdef CONFIG_GENERIC_BUG 9 9 #ifndef __ASSEMBLY__ 10 10 struct bug_entry { 11 + #ifndef CONFIG_GENERIC_BUG_RELATIVE_POINTERS 11 12 unsigned long bug_addr; 13 + #else 14 + signed int bug_addr_disp; 15 + #endif 12 16 #ifdef CONFIG_DEBUG_BUGVERBOSE 17 + #ifndef CONFIG_GENERIC_BUG_RELATIVE_POINTERS 13 18 const char *file; 19 + #else 20 + signed int file_disp; 21 + #endif 14 22 unsigned short line; 15 23 #endif 16 24 unsigned short flags;
+50
include/asm-generic/pgtable.h
··· 129 129 #define move_pte(pte, prot, old_addr, new_addr) (pte) 130 130 #endif 131 131 132 + #ifndef pgprot_writecombine 133 + #define pgprot_writecombine pgprot_noncached 134 + #endif 135 + 132 136 /* 133 137 * When walking page tables, get the address of the next boundary, 134 138 * or the end address of the range if that comes earlier. Although no ··· 291 287 #define arch_enter_lazy_cpu_mode() do {} while (0) 292 288 #define arch_leave_lazy_cpu_mode() do {} while (0) 293 289 #define arch_flush_lazy_cpu_mode() do {} while (0) 290 + #endif 291 + 292 + #ifndef __HAVE_PFNMAP_TRACKING 293 + /* 294 + * Interface that can be used by architecture code to keep track of 295 + * memory type of pfn mappings (remap_pfn_range, vm_insert_pfn) 296 + * 297 + * track_pfn_vma_new is called when a _new_ pfn mapping is being established 298 + * for physical range indicated by pfn and size. 299 + */ 300 + static inline int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t prot, 301 + unsigned long pfn, unsigned long size) 302 + { 303 + return 0; 304 + } 305 + 306 + /* 307 + * Interface that can be used by architecture code to keep track of 308 + * memory type of pfn mappings (remap_pfn_range, vm_insert_pfn) 309 + * 310 + * track_pfn_vma_copy is called when vma that is covering the pfnmap gets 311 + * copied through copy_page_range(). 312 + */ 313 + static inline int track_pfn_vma_copy(struct vm_area_struct *vma) 314 + { 315 + return 0; 316 + } 317 + 318 + /* 319 + * Interface that can be used by architecture code to keep track of 320 + * memory type of pfn mappings (remap_pfn_range, vm_insert_pfn) 321 + * 322 + * untrack_pfn_vma is called while unmapping a pfnmap for a region. 323 + * untrack can be called for a specific region indicated by pfn and size or 324 + * can be for the entire vma (in which case size can be zero). 
325 + */ 326 + static inline void untrack_pfn_vma(struct vm_area_struct *vma, 327 + unsigned long pfn, unsigned long size) 328 + { 329 + } 330 + #else 331 + extern int track_pfn_vma_new(struct vm_area_struct *vma, pgprot_t prot, 332 + unsigned long pfn, unsigned long size); 333 + extern int track_pfn_vma_copy(struct vm_area_struct *vma); 334 + extern void untrack_pfn_vma(struct vm_area_struct *vma, unsigned long pfn, 335 + unsigned long size); 294 336 #endif 295 337 296 338 #endif /* !__ASSEMBLY__ */
+2
include/linux/dmi.h
··· 44 44 extern void dmi_scan_machine(void); 45 45 extern int dmi_get_year(int field); 46 46 extern int dmi_name_in_vendors(const char *str); 47 + extern int dmi_name_in_serial(const char *str); 47 48 extern int dmi_available; 48 49 extern int dmi_walk(void (*decode)(const struct dmi_header *)); 49 50 ··· 57 56 static inline void dmi_scan_machine(void) { return; } 58 57 static inline int dmi_get_year(int year) { return 0; } 59 58 static inline int dmi_name_in_vendors(const char *s) { return 0; } 59 + static inline int dmi_name_in_serial(const char *s) { return 0; } 60 60 #define dmi_available 0 61 61 static inline int dmi_walk(void (*decode)(const struct dmi_header *)) 62 62 { return -1; }
+4
include/linux/kexec.h
··· 100 100 #define KEXEC_TYPE_DEFAULT 0 101 101 #define KEXEC_TYPE_CRASH 1 102 102 unsigned int preserve_context : 1; 103 + 104 + #ifdef ARCH_HAS_KIMAGE_ARCH 105 + struct kimage_arch arch; 106 + #endif 103 107 }; 104 108 105 109
+19
include/linux/mm.h
··· 145 145 #define FAULT_FLAG_WRITE 0x01 /* Fault was a write access */ 146 146 #define FAULT_FLAG_NONLINEAR 0x02 /* Fault was via a nonlinear mapping */ 147 147 148 + /* 149 + * This interface is used by x86 PAT code to identify a pfn mapping that is 150 + * linear over entire vma. This is to optimize PAT code that deals with 151 + * marking the physical region with a particular prot. This is not for generic 152 + * mm use. Note also that this check will not work if the pfn mapping is 153 + * linear for a vma starting at physical address 0. In which case PAT code 154 + * falls back to slow path of reserving physical range page by page. 155 + */ 156 + static inline int is_linear_pfn_mapping(struct vm_area_struct *vma) 157 + { 158 + return ((vma->vm_flags & VM_PFNMAP) && vma->vm_pgoff); 159 + } 160 + 161 + static inline int is_pfn_mapping(struct vm_area_struct *vma) 162 + { 163 + return (vma->vm_flags & VM_PFNMAP); 164 + } 148 165 149 166 /* 150 167 * vm_fault is filled by the the pagefault handler and passed to the vma's ··· 798 781 struct vm_area_struct *vma); 799 782 void unmap_mapping_range(struct address_space *mapping, 800 783 loff_t const holebegin, loff_t const holelen, int even_cows); 784 + int follow_phys(struct vm_area_struct *vma, unsigned long address, 785 + unsigned int flags, unsigned long *prot, resource_size_t *phys); 801 786 int generic_access_phys(struct vm_area_struct *vma, unsigned long addr, 802 787 void *buf, int len, int write); 803 788
+6
include/linux/pci.h
··· 134 134 PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) 2, 135 135 }; 136 136 137 + enum pci_irq_reroute_variant { 138 + INTEL_IRQ_REROUTE_VARIANT = 1, 139 + MAX_IRQ_REROUTE_VARIANTS = 3 140 + }; 141 + 137 142 typedef unsigned short __bitwise pci_bus_flags_t; 138 143 enum pci_bus_flags { 139 144 PCI_BUS_FLAGS_NO_MSI = (__force pci_bus_flags_t) 1, ··· 223 218 unsigned int no_msi:1; /* device may not use msi */ 224 219 unsigned int block_ucfg_access:1; /* userspace config space access is blocked */ 225 220 unsigned int broken_parity_status:1; /* Device generates false positive parity */ 221 + unsigned int irq_reroute_variant:2; /* device needs IRQ rerouting variant */ 226 222 unsigned int msi_enabled:1; 227 223 unsigned int msix_enabled:1; 228 224 unsigned int ari_enabled:1; /* ARI forwarding */
+5
include/linux/pci_ids.h
··· 2304 2304 #define PCI_DEVICE_ID_INTEL_PXH_0 0x0329 2305 2305 #define PCI_DEVICE_ID_INTEL_PXH_1 0x032A 2306 2306 #define PCI_DEVICE_ID_INTEL_PXHV 0x032C 2307 + #define PCI_DEVICE_ID_INTEL_80332_0 0x0330 2308 + #define PCI_DEVICE_ID_INTEL_80332_1 0x0332 2309 + #define PCI_DEVICE_ID_INTEL_80333_0 0x0370 2310 + #define PCI_DEVICE_ID_INTEL_80333_1 0x0372 2307 2311 #define PCI_DEVICE_ID_INTEL_82375 0x0482 2308 2312 #define PCI_DEVICE_ID_INTEL_82424 0x0483 2309 2313 #define PCI_DEVICE_ID_INTEL_82378 0x0484 ··· 2380 2376 #define PCI_DEVICE_ID_INTEL_ESB_4 0x25a4 2381 2377 #define PCI_DEVICE_ID_INTEL_ESB_5 0x25a6 2382 2378 #define PCI_DEVICE_ID_INTEL_ESB_9 0x25ab 2379 + #define PCI_DEVICE_ID_INTEL_ESB_10 0x25ac 2383 2380 #define PCI_DEVICE_ID_INTEL_82820_HB 0x2500 2384 2381 #define PCI_DEVICE_ID_INTEL_82820_UP_HB 0x2501 2385 2382 #define PCI_DEVICE_ID_INTEL_82850_HB 0x2530
+2
include/xen/interface/event_channel.h
··· 9 9 #ifndef __XEN_PUBLIC_EVENT_CHANNEL_H__ 10 10 #define __XEN_PUBLIC_EVENT_CHANNEL_H__ 11 11 12 + #include <xen/interface/xen.h> 13 + 12 14 typedef uint32_t evtchn_port_t; 13 15 DEFINE_GUEST_HANDLE(evtchn_port_t); 14 16
+17 -2
lib/bug.c
··· 5 5 6 6 CONFIG_BUG - emit BUG traps. Nothing happens without this. 7 7 CONFIG_GENERIC_BUG - enable this code. 8 + CONFIG_GENERIC_BUG_RELATIVE_POINTERS - use 32-bit pointers relative to 9 + the containing struct bug_entry for bug_addr and file. 8 10 CONFIG_DEBUG_BUGVERBOSE - emit full file+line information for each BUG 9 11 10 12 CONFIG_BUG and CONFIG_DEBUG_BUGVERBOSE are potentially user-settable ··· 45 43 46 44 extern const struct bug_entry __start___bug_table[], __stop___bug_table[]; 47 45 46 + static inline unsigned long bug_addr(const struct bug_entry *bug) 47 + { 48 + #ifndef CONFIG_GENERIC_BUG_RELATIVE_POINTERS 49 + return bug->bug_addr; 50 + #else 51 + return (unsigned long)bug + bug->bug_addr_disp; 52 + #endif 53 + } 54 + 48 55 #ifdef CONFIG_MODULES 49 56 static LIST_HEAD(module_bug_list); 50 57 ··· 66 55 unsigned i; 67 56 68 57 for (i = 0; i < mod->num_bugs; ++i, ++bug) 69 - if (bugaddr == bug->bug_addr) 58 + if (bugaddr == bug_addr(bug)) 70 59 return bug; 71 60 } 72 61 return NULL; ··· 119 108 const struct bug_entry *bug; 120 109 121 110 for (bug = __start___bug_table; bug < __stop___bug_table; ++bug) 122 - if (bugaddr == bug->bug_addr) 111 + if (bugaddr == bug_addr(bug)) 123 112 return bug; 124 113 125 114 return module_find_bug(bugaddr); ··· 144 133 145 134 if (bug) { 146 135 #ifdef CONFIG_DEBUG_BUGVERBOSE 136 + #ifndef CONFIG_GENERIC_BUG_RELATIVE_POINTERS 147 137 file = bug->file; 138 + #else 139 + file = (const char *)bug + bug->file_disp; 140 + #endif 148 141 line = bug->line; 149 142 #endif 150 143 warning = (bug->flags & BUGFLAG_WARNING) != 0;
+48 -22
mm/memory.c
··· 669 669 if (is_vm_hugetlb_page(vma)) 670 670 return copy_hugetlb_page_range(dst_mm, src_mm, vma); 671 671 672 + if (unlikely(is_pfn_mapping(vma))) { 673 + /* 674 + * We do not free on error cases below as remove_vma 675 + * gets called on error from higher level routine 676 + */ 677 + ret = track_pfn_vma_copy(vma); 678 + if (ret) 679 + return ret; 680 + } 681 + 672 682 /* 673 683 * We need to invalidate the secondary MMU mappings only when 674 684 * there could be a permission downgrade on the ptes of the ··· 924 914 925 915 if (vma->vm_flags & VM_ACCOUNT) 926 916 *nr_accounted += (end - start) >> PAGE_SHIFT; 917 + 918 + if (unlikely(is_pfn_mapping(vma))) 919 + untrack_pfn_vma(vma, 0, 0); 927 920 928 921 while (start != end) { 929 922 if (!tlb_start_valid) { ··· 1443 1430 int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, 1444 1431 unsigned long pfn) 1445 1432 { 1433 + int ret; 1446 1434 /* 1447 1435 * Technically, architectures with pte_special can avoid all these 1448 1436 * restrictions (same for remap_pfn_range). However we would like ··· 1458 1444 1459 1445 if (addr < vma->vm_start || addr >= vma->vm_end) 1460 1446 return -EFAULT; 1461 - return insert_pfn(vma, addr, pfn, vma->vm_page_prot); 1447 + if (track_pfn_vma_new(vma, vma->vm_page_prot, pfn, PAGE_SIZE)) 1448 + return -EINVAL; 1449 + 1450 + ret = insert_pfn(vma, addr, pfn, vma->vm_page_prot); 1451 + 1452 + if (ret) 1453 + untrack_pfn_vma(vma, pfn, PAGE_SIZE); 1454 + 1455 + return ret; 1462 1456 } 1463 1457 EXPORT_SYMBOL(vm_insert_pfn); 1464 1458 ··· 1597 1575 * behaviour that some programs depend on. We mark the "original" 1598 1576 * un-COW'ed pages by matching them up with "vma->vm_pgoff". 
1599 1577 */ 1600 - if (is_cow_mapping(vma->vm_flags)) { 1601 - if (addr != vma->vm_start || end != vma->vm_end) 1602 - return -EINVAL; 1578 + if (addr == vma->vm_start && end == vma->vm_end) 1603 1579 vma->vm_pgoff = pfn; 1604 - } 1580 + else if (is_cow_mapping(vma->vm_flags)) 1581 + return -EINVAL; 1605 1582 1606 1583 vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP; 1584 + 1585 + err = track_pfn_vma_new(vma, prot, pfn, PAGE_ALIGN(size)); 1586 + if (err) 1587 + return -EINVAL; 1607 1588 1608 1589 BUG_ON(addr >= end); 1609 1590 pfn -= addr >> PAGE_SHIFT; ··· 1619 1594 if (err) 1620 1595 break; 1621 1596 } while (pgd++, addr = next, addr != end); 1597 + 1598 + if (err) 1599 + untrack_pfn_vma(vma, pfn, PAGE_ALIGN(size)); 1600 + 1622 1601 return err; 1623 1602 } 1624 1603 EXPORT_SYMBOL(remap_pfn_range); ··· 2894 2865 #endif /* __HAVE_ARCH_GATE_AREA */ 2895 2866 2896 2867 #ifdef CONFIG_HAVE_IOREMAP_PROT 2897 - static resource_size_t follow_phys(struct vm_area_struct *vma, 2898 - unsigned long address, unsigned int flags, 2899 - unsigned long *prot) 2868 + int follow_phys(struct vm_area_struct *vma, 2869 + unsigned long address, unsigned int flags, 2870 + unsigned long *prot, resource_size_t *phys) 2900 2871 { 2901 2872 pgd_t *pgd; 2902 2873 pud_t *pud; ··· 2905 2876 spinlock_t *ptl; 2906 2877 resource_size_t phys_addr = 0; 2907 2878 struct mm_struct *mm = vma->vm_mm; 2879 + int ret = -EINVAL; 2908 2880 2909 - VM_BUG_ON(!(vma->vm_flags & (VM_IO | VM_PFNMAP))); 2881 + if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) 2882 + goto out; 2910 2883 2911 2884 pgd = pgd_offset(mm, address); 2912 2885 if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd))) 2913 - goto no_page_table; 2886 + goto out; 2914 2887 2915 2888 pud = pud_offset(pgd, address); 2916 2889 if (pud_none(*pud) || unlikely(pud_bad(*pud))) 2917 - goto no_page_table; 2890 + goto out; 2918 2891 2919 2892 pmd = pmd_offset(pud, address); 2920 2893 if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd))) 2921 - goto no_page_table; 2894 + 
goto out; 2922 2895 2923 2896 /* We cannot handle huge page PFN maps. Luckily they don't exist. */ 2924 2897 if (pmd_huge(*pmd)) 2925 - goto no_page_table; 2898 + goto out; 2926 2899 2927 2900 ptep = pte_offset_map_lock(mm, pmd, address, &ptl); 2928 2901 if (!ptep) ··· 2939 2908 phys_addr <<= PAGE_SHIFT; /* Shift here to avoid overflow on PAE */ 2940 2909 2941 2910 *prot = pgprot_val(pte_pgprot(pte)); 2911 + *phys = phys_addr; 2912 + ret = 0; 2942 2913 2943 2914 unlock: 2944 2915 pte_unmap_unlock(ptep, ptl); 2945 2916 out: 2946 - return phys_addr; 2947 - no_page_table: 2948 - return 0; 2917 + return ret; 2949 2918 } 2950 2919 2951 2920 int generic_access_phys(struct vm_area_struct *vma, unsigned long addr, ··· 2956 2925 void *maddr; 2957 2926 int offset = addr & (PAGE_SIZE-1); 2958 2927 2959 - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) 2960 - return -EINVAL; 2961 - 2962 - phys_addr = follow_phys(vma, addr, write, &prot); 2963 - 2964 - if (!phys_addr) 2928 + if (follow_phys(vma, addr, write, &prot, &phys_addr)) 2965 2929 return -EINVAL; 2966 2930 2967 2931 maddr = ioremap_prot(phys_addr, PAGE_SIZE, prot);
+9
mm/swapfile.c
··· 1462 1462 __initcall(procswaps_init); 1463 1463 #endif /* CONFIG_PROC_FS */ 1464 1464 1465 + #ifdef MAX_SWAPFILES_CHECK 1466 + static int __init max_swapfiles_check(void) 1467 + { 1468 + MAX_SWAPFILES_CHECK(); 1469 + return 0; 1470 + } 1471 + late_initcall(max_swapfiles_check); 1472 + #endif 1473 + 1465 1474 /* 1466 1475 * Written 01/25/92 by Simmule Turner, heavily changed by Linus. 1467 1476 *