Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

s390/kvm: Fix address space mixup

I was chasing down a bug of random validity intercepts on s390
(guest prefix page not mapped in the host virtual address space). It
turns out that the problem was a wrong address space control element
(ASCE). The cause was quite complex:

During paging activity, a DAT protection fault during SIE caused a
program interrupt. Normally, the SIE retry loop catches all interrupts
that occur during and shortly before SIE and reruns the setup. The
problem is that protection causes a suppressing program interrupt, so
the PSW points to the instruction AFTER SIE in case of DAT protection.
As a result the retry-loop logic did not trigger; instead we jumped
directly back to SIE after returning from the program interrupt (the
protection fault handler itself rewinds the PSW). This usually works
quite well, but:

If the protection fault handler has to wait, another program might be
scheduled in. Later on, the SIE process will be scheduled in again. In
that case the content of CR1 (primary address space) will be wrong,
because switch_to will put the user space ASCE into CR1 and not the
guest ASCE.
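
The scheduling problem can be sketched with a toy model in C. This is only an illustration under stated assumptions: the struct, the function names and the two ASCE values are hypothetical, and the model assumes (as the text above implies) that rerunning the SIE setup is what loads the guest ASCE into CR1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placeholder values; real ASCEs are 64-bit control words. */
enum { USER_ASCE = 1, GUEST_ASCE = 2 };

/* Toy CPU state: only CR1 (primary address space) is modeled. */
struct cpu_model {
	uint64_t cr1;
};

/* Models the setup that is (re)run when execution enters sie_loop. */
void sie_setup_model(struct cpu_model *cpu)
{
	cpu->cr1 = GUEST_ASCE;
}

/* Models switch_to scheduling the task back in: user space ASCE in CR1. */
void switch_to_model(struct cpu_model *cpu)
{
	cpu->cr1 = USER_ASCE;
}

/*
 * Returns the CR1 content seen when SIE is re-entered after the fault
 * handler had to wait. rerun_setup == 0 models the old behavior
 * (jump straight back to SIE), rerun_setup == 1 the intended behavior.
 */
uint64_t cr1_at_sie_reentry(int rerun_setup)
{
	struct cpu_model cpu;

	sie_setup_model(&cpu);		/* first entry into SIE */
	switch_to_model(&cpu);		/* handler slept, task switched out and in */
	if (rerun_setup)
		sie_setup_model(&cpu);
	return cpu.cr1;
}

/* Small self-check of the two cases. */
void cr1_at_sie_reentry_examples(void)
{
	assert(cr1_at_sie_reentry(0) == USER_ASCE);	/* wrong ASCE for the guest */
	assert(cr1_at_sie_reentry(1) == GUEST_ASCE);	/* setup rerun, CR1 correct */
}
```

In this model the bug is visible as SIE being re-entered with the user space ASCE whenever the setup is skipped.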

In addition, the program parameter is also wrong for every protection
fault of a guest, since we don't issue the SPP instruction.

So let's also check for PSW == instruction after SIE in the program
check handler. Instead of expensively checking all program
interruption codes that might be suppressing, we assume that a program
interrupt pointing after SIE was always a program interrupt in SIE.
(Otherwise we have a kernel bug anyway.)
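
The address test described above can be modeled in C roughly as follows. This is a sketch, not the kernel code: the function name is hypothetical, the check for supervisor state is left out, and sie_length is assumed to be the distance from sie_loop to the instruction following SIE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical C model of the address test in HANDLE_SIE_INTERCEPT.
 * sie_loop is the start of the critical section; sie_length is assumed
 * to be the distance from sie_loop to the instruction after SIE.
 */
bool psw_in_sie_section(uint64_t psw_addr, uint64_t sie_loop,
			uint64_t sie_length, bool pgmcheck)
{
	/* Unsigned subtraction: addresses below sie_loop wrap to huge values. */
	uint64_t offset = psw_addr - sie_loop;

	/* Program checks also accept the address right after SIE (jh vs. jhe). */
	return pgmcheck ? offset <= sie_length : offset < sie_length;
}

/* Illustrative addresses only; they are not taken from a real kernel image. */
void psw_in_sie_section_examples(void)
{
	const uint64_t loop = 0x1004, len = 0x40;

	assert(psw_in_sie_section(loop + len, loop, len, true));	/* after SIE, program check */
	assert(!psw_in_sie_section(loop + len, loop, len, false));	/* after SIE, other interrupt */
	assert(psw_in_sie_section(loop, loop, len, false));		/* inside the loop */
	assert(!psw_in_sie_section(loop - 4, loop, len, true));	/* before the loop */
}
```

The only difference between the two variants is whether the upper bound is inclusive, which mirrors the jh/jhe distinction in the macro.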

We also have to compensate for the rewinding, since the C-level
handlers will do that. Therefore we need to add a nop with the same
length as SIE before the sie_loop.
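
To see why the pad has to be exactly as long as SIE, here is a small C model of the two steps (redirect in the entry code, rewind in the C handler). The function name and the example addresses are hypothetical, and the model covers only the SIE case, where the rewind is 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define SIE_ILEN 4u	/* SIE and the nop pad are both 4 bytes long */

/*
 * Hypothetical model: where does execution resume after a suppressing
 * program check whose old PSW points right behind SIE?
 */
uint64_t resume_after_pgm_check(uint64_t old_psw_addr, uint64_t sie_loop,
				uint64_t sie_length)
{
	uint64_t addr = old_psw_addr;

	/* Entry code (program check variant): redirect the PSW to sie_loop. */
	if (addr - sie_loop <= sie_length)
		addr = sie_loop;

	/* The C-level handler then rewinds by the instruction length. */
	return addr - SIE_ILEN;
}

/* Illustrative layout: the pad sits directly in front of sie_loop. */
void resume_after_pgm_check_example(void)
{
	const uint64_t rewind_pad = 0x1000;
	const uint64_t sie_loop = rewind_pad + SIE_ILEN;
	const uint64_t sie_length = 0x40;

	/* Redirect to sie_loop, rewind by 4: we land on the nop pad. */
	assert(resume_after_pgm_check(sie_loop + sie_length, sie_loop,
				      sie_length) == rewind_pad);
}
```

Execution resumes on the nop and falls through into sie_loop, so the setup is rerun before SIE is issued again.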

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
CC: stable@vger.kernel.org
CC: Heiko Carstens <heiko.carstens@de.ibm.com>

Authored by Christian Borntraeger and committed by Martin Schwidefsky
ce6a04ac 39efd4ec

+20 -5
arch/s390/kernel/entry64.S
@@ -80,14 +80,21 @@
 #endif
 	.endm
 
-	.macro	HANDLE_SIE_INTERCEPT scratch
+	.macro	HANDLE_SIE_INTERCEPT scratch,pgmcheck
 #if defined(CONFIG_KVM) || defined(CONFIG_KVM_MODULE)
 	tmhh	%r8,0x0001		# interrupting from user ?
 	jnz	.+42
 	lgr	\scratch,%r9
 	slg	\scratch,BASED(.Lsie_loop)
 	clg	\scratch,BASED(.Lsie_length)
+	.if	\pgmcheck
+	# Some program interrupts are suppressing (e.g. protection).
+	# We must also check the instruction after SIE in that case.
+	# do_protection_exception will rewind to rewind_pad
+	jh	.+22
+	.else
 	jhe	.+22
+	.endif
 	lg	%r9,BASED(.Lsie_loop)
 	SPP	BASED(.Lhost_id)		# set host id
 #endif
@@ -390,7 +397,7 @@
 	lg	%r12,__LC_THREAD_INFO
 	larl	%r13,system_call
 	lmg	%r8,%r9,__LC_PGM_OLD_PSW
-	HANDLE_SIE_INTERCEPT %r14
+	HANDLE_SIE_INTERCEPT %r14,1
 	tmhh	%r8,0x0001		# test problem state bit
 	jnz	1f			# -> fault in user space
 	tmhh	%r8,0x4000		# PER bit set in old PSW ?
@@ -466,7 +473,7 @@
 	lg	%r12,__LC_THREAD_INFO
 	larl	%r13,system_call
 	lmg	%r8,%r9,__LC_IO_OLD_PSW
-	HANDLE_SIE_INTERCEPT %r14
+	HANDLE_SIE_INTERCEPT %r14,0
 	SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_STACK,STACK_SHIFT
 	tmhh	%r8,0x0001		# interrupting from user?
 	jz	io_skip
@@ -612,7 +619,7 @@
 	lg	%r12,__LC_THREAD_INFO
 	larl	%r13,system_call
 	lmg	%r8,%r9,__LC_EXT_OLD_PSW
-	HANDLE_SIE_INTERCEPT %r14
+	HANDLE_SIE_INTERCEPT %r14,0
 	SWITCH_ASYNC __LC_SAVE_AREA_ASYNC,__LC_ASYNC_STACK,STACK_SHIFT
 	tmhh	%r8,0x0001		# interrupting from user ?
 	jz	ext_skip
@@ -660,7 +667,7 @@
 	lg	%r12,__LC_THREAD_INFO
 	larl	%r13,system_call
 	lmg	%r8,%r9,__LC_MCK_OLD_PSW
-	HANDLE_SIE_INTERCEPT %r14
+	HANDLE_SIE_INTERCEPT %r14,0
 	tm	__LC_MCCK_CODE,0x80	# system damage?
 	jo	mcck_panic		# yes -> rest of mcck code invalid
 	lghi	%r14,__LC_CPU_TIMER_SAVE_AREA
@@ -959,6 +966,13 @@
 	stg	%r3,__SF_EMPTY+8(%r15)	# save guest register save area
 	xc	__SF_EMPTY+16(8,%r15),__SF_EMPTY+16(%r15) # host id == 0
 	lmg	%r0,%r13,0(%r3)		# load guest gprs 0-13
+# some program checks are suppressing. C code (e.g. do_protection_exception)
+# will rewind the PSW by the ILC, which is 4 bytes in case of SIE. Other
+# instructions in the sie_loop should not cause program interrupts. So
+# lets use a nop (47 00 00 00) as a landing pad.
+# See also HANDLE_SIE_INTERCEPT
+rewind_pad:
+	nop	0
 sie_loop:
 	lg	%r14,__LC_THREAD_INFO	# pointer thread_info struct
 	tm	__TI_flags+7(%r14),_TIF_EXIT_SIE
@@ -998,6 +1012,7 @@
 .Lhost_id:
 	.quad	0
 
+	EX_TABLE(rewind_pad,sie_fault)
 	EX_TABLE(sie_loop,sie_fault)
 #endif
 