x86_64: Early segment setup for VT

VT is very picky about when it can enter execution.
Set up all segments and get LDT and TR into a valid state to allow
VT execution under VMware and KVM (untested).

This makes the boot decompression run under VT, which makes it several
orders of magnitude faster on 64-bit Intel hardware.

Before this change, I was seeing times of a minute or more to decompress a
1.3MB kernel on a very fast box.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

authored by Zachary Amsden and committed by Linus Torvalds 08da5a2c ab144f5e

+7
arch/x86_64/boot/compressed/head.S
···
 	movl	%eax, %ds
 	movl	%eax, %es
 	movl	%eax, %ss
+	movl	%eax, %fs
+	movl	%eax, %gs
+	lldt	%ax
+	movl	$0x20, %eax
+	ltr	%ax
 
 	/* Compute the decompressed kernel start address.  It is where
 	 * we were loaded at aligned to a 2M boundary. %rbp contains the
···
 	.quad	0x0000000000000000	/* NULL descriptor */
 	.quad	0x00af9a000000ffff	/* __KERNEL_CS */
 	.quad	0x00cf92000000ffff	/* __KERNEL_DS */
+	.quad	0x0080890000000000	/* TS descriptor */
+	.quad	0x0000000000000000	/* TS continued */
 gdt_end:
 	.bss
 /* Stack for uncompression */
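
To make the new GDT entry easier to read, here is a minimal sketch that decodes the descriptor quads from the diff using the standard x86 segment-descriptor bit layout. The helper names decode_descriptor and decode_selector are hypothetical and are not part of the patch. The sketch only looks at the first 8 bytes of each entry; for a 64-bit TSS the second quad ("TS continued") carries the upper base bits, which are zero here.

```python
def decode_descriptor(raw):
    """Split the low 8 bytes of an x86 segment descriptor into fields."""
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    return {
        "base": base,
        "limit": limit,
        "type": (raw >> 40) & 0xF,             # 4-bit type field
        "system": ((raw >> 44) & 1) == 0,      # S bit clear means system descriptor
        "dpl": (raw >> 45) & 3,
        "present": bool((raw >> 47) & 1),
        "long": bool((raw >> 53) & 1),         # L bit (64-bit code segment)
        "granular": bool((raw >> 55) & 1),     # G bit (limit counted in 4K units)
    }


def decode_selector(sel):
    """Split a segment selector into table index, table indicator, and RPL."""
    return {"index": sel >> 3, "ti": (sel >> 2) & 1, "rpl": sel & 3}


# Descriptor values copied from the diff above.
ENTRIES = [
    ("__KERNEL_CS", 0x00AF9A000000FFFF),
    ("__KERNEL_DS", 0x00CF92000000FFFF),
    ("TS descriptor", 0x0080890000000000),
]

for name, raw in ENTRIES:
    d = decode_descriptor(raw)
    kind = "system" if d["system"] else "code/data"
    print(f"{name}: {kind}, type={d['type']:#x}, present={d['present']}, "
          f"base={d['base']:#x}, limit={d['limit']:#x}")

print("selector 0x20:", decode_selector(0x20))
```

The TS descriptor decodes as a present system descriptor of type 0x9, which in long mode is an available 64-bit TSS, with base 0 and limit field 0. Selector 0x20 decodes to GDT index 4 (byte offset 0x20) with RPL 0. The excerpt does not show what precedes the NULL descriptor line, so how offset 0x20 lines up with the TS descriptor cannot be confirmed from this diff alone.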