                  Debugging on Linux for s/390 & z/Architecture
                                       by
         Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com)
   Copyright (C) 2000-2001 IBM Deutschland Entwicklung GmbH, IBM Corporation
                       Best viewed with fixed width fonts

Overview of Document:
=====================
This document is intended to give a good overview of how to debug
Linux for s/390 & z/Architecture. It isn't intended as a complete reference & not a
tutorial on the fundamentals of C & assembly. It doesn't go into
390 IO in any detail. It is intended to complement the documents in the
reference section below & any other worthwhile references you get.

It is intended, like the Enterprise Systems Architecture/390 Reference Summary,
to be printed out & used as a quick cheat sheet self help style reference when
problems occur.

Contents
========
Register Set
Address Spaces on Intel Linux
Address Spaces on Linux for s/390 & z/Architecture
The Linux for s/390 & z/Architecture Kernel Task Structure
Register Usage & Stackframes on Linux for s/390 & z/Architecture
A sample program with comments
Compiling programs for debugging on Linux for s/390 & z/Architecture
Debugging under VM
s/390 & z/Architecture IO Overview
Debugging IO on s/390 & z/Architecture under VM
GDB on s/390 & z/Architecture
Stack chaining in gdb by hand
Examining core dumps
ldd
Debugging modules
The proc file system
Starting points for debugging scripting languages etc.
SysRq
References
Special Thanks

Register Set
============
The current architectures have the following registers.

16 General purpose registers, 32 bit on s/390 64 bit on z/Architecture, r0-r15 or gpr0-gpr15 used for arithmetic & addressing.

16 Control registers, 32 bit on s/390 64 bit on z/Architecture, ( cr0-cr15 kernel usage only ) used for memory management,
interrupt control, debugging control etc.

16 Access registers ( ar0-ar15 ) 32 bit on s/390 & z/Architecture
not used by normal programs but potentially could
be used as temporary storage. Their main purpose is their 1 to 1
association with general purpose registers & they are used in
the kernel for copying data between kernel & user address spaces.
Access register 0 ( & access register 1 on z/Architecture ( needs 64 bit
pointer ) ) is currently used by the pthread library as a pointer to
the current running thread's private area.

16 64 bit floating point registers (fp0-fp15 ) IEEE & HFP floating
point format compliant on G5 upwards & a Floating point control reg (FPC)
4 64 bit registers (fp0,fp2,fp4 & fp6) HFP only on older machines.
Note:
Linux (currently) always uses IEEE & emulates G5 IEEE format on older machines,
( provided the kernel is configured for this ).


The PSW is the most important register on the machine. It
is 64 bit on s/390 & 128 bit on z/Architecture & serves the roles of
a program counter (pc), condition code register & memory space designator.
In IBM standard notation I am counting bit 0 as the MSB.
It has several advantages over a normal program counter
in that you can change address translation & program counter
in a single instruction. To change address translation,
e.g. switching address translation off, requires that you
have a logical=physical mapping for the address you are
currently running at.

      Bit           Value
s/390  z/Architecture
0       0     Reserved ( must be 0 ) otherwise specification exception occurs.

1       1     Program Event Recording 1 PER enabled,
              PER is used to facilitate debugging e.g. single stepping.

2-4    2-4    Reserved ( must be 0 ).

5       5     Dynamic address translation 1=DAT on.

6       6     Input/Output interrupt Mask

7       7     External interrupt Mask used primarily for interprocessor signalling &
              clock interrupts.

8-11   8-11   PSW Key used for complex memory protection mechanism not used under linux

12      12    1 on s/390 0 on z/Architecture

13      13    Machine Check Mask 1=enable machine check interrupts

14      14    Wait State set this to 1 to stop the processor except for interrupts & give
              time to other LPARS used in CPU idle in the kernel to increase overall
              usage of processor resources.

15      15    Problem state ( if set to 1 certain instructions are disabled )
              all linux user programs run with this bit 1
              ( useful info for debugging under VM ).

16-17  16-17  Address Space Control

              00 Primary Space Mode:
              The register CR1 contains the primary address-space control
              element (PASCE), which points to the primary space region/segment
              table origin.

              01 Access register mode

              10 Secondary Space Mode:
              The register CR7 contains the secondary address-space control
              element (SASCE), which points to the secondary space region or
              segment table origin.

              11 Home Space Mode:
              The register CR13 contains the home space address-space control
              element (HASCE), which points to the home space region/segment
              table origin.

              See "Address Spaces on Linux for s/390 & z/Architecture" below
              for more information about address space usage in Linux.

18-19  18-19  Condition codes (CC)

20      20    Fixed point overflow mask if 1=FPU exceptions for this event
              occur ( normally 0 )

21      21    Decimal overflow mask if 1=FPU exceptions for this event occur
              ( normally 0 )

22      22    Exponent underflow mask if 1=FPU exceptions for this event occur
              ( normally 0 )

23      23    Significance Mask if 1=FPU exceptions for this event occur
              ( normally 0 )

24-31  24-30  Reserved Must be 0.

        31    Extended Addressing Mode
        32    Basic Addressing Mode
              Used to set addressing mode
              PSW 31   PSW 32
                0         0        24 bit
                0         1        31 bit
                1         1        64 bit

32            1=31 bit addressing mode 0=24 bit addressing mode (for backward
              compatibility), linux always runs with this bit set to 1

33-63         Instruction address.
       33-63  Reserved must be 0
       64-127 Address
              In 24 bits mode bits 64-103=0 bits 104-127 Address
              In 31 bits mode bits 64-96=0 bits 97-127 Address
              Note: unlike 31 bit mode on s/390 bit 96 must be zero
              when loading the address with LPSWE otherwise a
              specification exception occurs, LPSW is fully backward
              compatible.


Prefix Page(s)
--------------
This per cpu memory area is too intimately tied to the processor not to mention.
It exists between the real addresses 0-4095 on s/390 & 0-8191 on z/Architecture & is exchanged
with 1 page on s/390 or 2 pages on z/Architecture in absolute storage by the set
prefix instruction in linux's startup.
This page is mapped to a different prefix for each processor in an SMP configuration
( assuming the os designer is sane of course :-) ).
Bytes 0-512 ( 200 hex ) on s/390 & 0-512,4096-4544,4604-5119 currently on z/Architecture
are used by the processor itself for holding such information as exception indications &
entry points for exceptions.
Bytes after 0xc00 hex are used by linux for per processor globals on s/390 & z/Architecture
( there is a gap on z/Architecture too currently between 0xc00 & 1000 which linux uses ).
The closest thing to this on traditional architectures is the interrupt
vector table. This is a good thing & does simplify some of the kernel coding,
however it means that we now cannot catch stray NULL pointers in the
kernel without hard coded checks.
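
The exchange of the low real addresses with the per cpu prefix area described
above can be sketched in C as follows. This is only an illustration of the
address swap, not kernel code: the function name real_to_absolute & its
parameters are made up for this example, & the caller is assumed to pass the
block size from the text ( 4096 on s/390, 8192 on z/Architecture ).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative sketch of prefixing (hypothetical helper, not kernel API).
 * real       - real address as seen by this cpu
 * prefix     - this cpu's prefix value, aligned to block_size
 * block_size - 4096 on s/390, 8192 on z/Architecture
 */
static uint64_t real_to_absolute(uint64_t real, uint64_t prefix,
                                 uint64_t block_size)
{
    assert(block_size == 4096 || block_size == 8192);
    assert(prefix % block_size == 0);

    /* The low block is redirected to this cpu's prefix area. */
    if (real < block_size)
        return real + prefix;

    /* The prefix area itself is redirected back to absolute 0. */
    if (real >= prefix && real < prefix + block_size)
        return real - prefix;

    /* Everything else is left untouched. */
    return real;
}
```

With a prefix of 0x5000 on s/390, real address 0x10 lands at absolute 0x5010 &
real address 0x5010 lands at absolute 0x10, which is why each cpu in an SMP
configuration can have its own copy of the low storage area.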



Address Spaces on Intel Linux
=============================

The traditional Intel Linux is approximately mapped as follows, forgive
the ascii art.
0xFFFFFFFF 4GB Himem                        *****************
                                            *               *
                                            * Kernel Space  *
                                            *               *
                                            *****************          ****************
User Space Himem (typically 0xC0000000 3GB )* User Stack    *          *              *
                                            *****************          *              *
                                            * Shared Libs   *          * Next Process *
                                            *****************          *     to       *
                                            *               *    <==   *     Run      *  <==
                                            * User Program  *          *              *
                                            * Data BSS      *          *              *
                                            *    Text       *          *              *
                                            * Sections      *          *              *
0x00000000                                  *****************          ****************

Now it is easy to see that on Intel it is quite easy to recognise a kernel address
as being one greater than user space himem ( in this case 0xC0000000 )
& addresses of less than this are the ones in the current running program on this
processor ( if an smp box ).
If using the virtual machine ( VM ) as a debugger it is quite difficult to
know which user process is running as the address space you are looking at
could be from any process in the run queue.

The limitation of Intel's addressing technique is that the linux
kernel uses a very simple real address to virtual addressing technique
of Real Address=Virtual Address-User Space Himem.
This means that on Intel the linux kernel can typically only address
Himem=0xFFFFFFFF-0xC0000000=1GB & this is all the RAM these machines
can typically use.
They can lower User Himem to 2GB or lower & thus be
able to use 2GB of RAM, however this shrinks the maximum size
of User Space from 3GB to 2GB. They have a no win limit of 4GB unless
they go to 64 Bit.


On 390 our limitations & strengths make us slightly different.
For backward compatibility we are only allowed to use 31 bits (2GB)
of our 32 bit addresses, however, we use entirely separate address
spaces for the user & kernel.

This means we can support 2GB of non Extended RAM on s/390, & more
with the Extended memory management swap device &
currently 4TB of physical memory on z/Architecture.


Address Spaces on Linux for s/390 & z/Architecture
==================================================

Our addressing scheme is basically as follows:

                                   Primary Space               Home Space
Himem 0x7fffffff 2GB on s/390    *****************          ****************
currently 0x3ffffffffff (2^42)-1 * User Stack    *          *              *
on z/Architecture.               *****************          *              *
                                 * Shared Libs   *          *              *
                                 *****************          *              *
                                 *               *          *    Kernel    *
                                 * User Program  *          *              *
                                 * Data BSS      *          *              *
                                 *    Text       *          *              *
                                 * Sections      *          *              *
0x00000000                       *****************          ****************

This also means that we need to look at the PSW problem state bit and the
addressing mode to decide whether we are looking at user or kernel space.

User space runs in primary address mode (or access register mode within
the vdso code).

The kernel usually also runs in home space mode, however when accessing
user space the kernel switches to primary or secondary address mode if
the mvcos instruction is not available or if a compare-and-swap (futex)
instruction on a user space address is performed.

When also looking at the ASCE control registers, this means:

User space:
- runs in primary or access register mode
- cr1 contains the user asce
- cr7 contains the user asce
- cr13 contains the kernel asce

Kernel space:
- runs in home space mode
- cr1 contains the user or kernel asce
  -> the kernel asce is loaded when a uaccess requires primary or
     secondary address mode
- cr7 contains the user or kernel asce, (changed with set_fs())
- cr13 contains the kernel asce

In case of uaccess the kernel changes to:
- primary space mode in case of a uaccess (copy_to_user) and uses
  e.g. the mvcp instruction to access user space.
  However the kernel will stay in home space mode if the mvcos
  instruction is available
- secondary space mode in case of futex atomic operations, so that the
  instructions come from primary address space and data from secondary
  space

In case of KVM, the kernel runs in home space mode, but cr1 gets switched
to contain the gmap asce before the SIE instruction gets executed. When
the SIE instruction is finished, cr1 will be switched back to contain the
user asce.


Virtual Addresses on s/390 & z/Architecture
===========================================

A virtual address on s/390 is made up of 3 parts
The SX ( segment index, roughly corresponding to the PGD & PMD in linux terminology )
being bits 1-11.
The PX ( page index, corresponding to the page table entry (pte) in linux terminology )
being bits 12-19.
The remaining bits BX ( the byte index ) are the offset in the page
i.e. bits 20 to 31.

On z/Architecture in linux we currently make up an address from 4 parts.
The region index bits (RX) 0-32 we currently use bits 22-32
The segment index (SX) being bits 33-43
The page index (PX) being bits 44-51
The byte index (BX) being bits 52-63

Notes:
1) s/390 has no PMD so the PMD is really the PGD also.
A lot of this stuff is defined in pgtable.h.

2) Also seeing as s/390's page indexes are only 1k in size
( bits 12-19 x 4 bytes per pte ) we use 1 page ( 4k )
to make the best use of memory by updating 4 segment index
entries each time we mess with a PMD & use offsets
0,1024,2048 & 3072 in this page for our segment indexes.
On z/Architecture our page indexes are now 2k in size
( bits 12-19 x 8 bytes per pte ) we do a similar trick
but only mess with 2 segment indices each time we mess with
a PMD.

3) Although z/Architecture supports up to a massive 5-level page table lookup we
can only use 3 currently on Linux ( as this is all the generic kernel
currently supports ), however this may change in future.
This allows us to access ( according to my sums )
4TB of virtual storage per process i.e.
4096*512(PTES)*1024(PMDS)*2048(PGD) = 4398046511104 bytes,
enough for another 2 or 3 years I think :-).
To do this we use a region-third-table designation type in
our address space control registers.


The Linux for s/390 & z/Architecture Kernel Task Structure
==========================================================
Each process/thread under Linux for S390 has its own kernel task_struct
defined in linux/include/linux/sched.h
The S390 on initialisation & resuming of a process on a cpu sets
the __LC_KERNEL_STACK variable in the spare prefix area for this cpu
(which we use for per-processor globals).

The kernel stack pointer is intimately tied with the task structure for
each processor as follows.

                      s/390
            ************************
            *  1 page kernel stack *
            *        ( 4K )        *
            ************************
            *   1 page task_struct *
            *        ( 4K )        *
8K aligned  ************************

                 z/Architecture
            ************************
            *  2 page kernel stack *
            *        ( 8K )        *
            ************************
            *  2 page task_struct  *
            *        ( 8K )        *
16K aligned ************************

What this means is that we don't need to dedicate any register or global variable
to point to the current running process & can retrieve it with the following
very simple construct for s/390 & one very similar for z/Architecture.

static inline struct task_struct * get_current(void)
{
        struct task_struct *current;
        __asm__("lhi   %0,-8192\n\t"
                "nr    %0,15"
                : "=r" (current) );
        return current;
}

i.e. just anding the current kernel stack pointer with the mask -8192.
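
The same masking can be tried out in portable C when working from a register
dump rather than on the machine itself. The sketch below is only an
illustration: task_from_sp is a made-up helper name, & the 16K unit for
z/Architecture is an assumption taken from the "16K aligned" diagram above
rather than from the kernel source.

```c
#include <stdint.h>

/* Size of the aligned task_struct + kernel stack block, from the diagrams. */
#define S390_STACK_UNIT   8192u   /* s/390: 4K task_struct + 4K stack        */
#define S390X_STACK_UNIT 16384u   /* z/Architecture: assumed from diagram    */

/*
 * Hypothetical helper: round a kernel stack pointer value down to the
 * start of its aligned block, which is where the task_struct lives.
 * unit must be a power of two; sp & ~(unit - 1) is the same operation
 * as anding with -unit, as the lhi/nr pair does with -8192.
 */
static uint64_t task_from_sp(uint64_t sp, uint64_t unit)
{
    return sp & ~(unit - 1);
}
```

For example a kernel stack pointer of 0x0fa9be78 taken from a dump gives a
task_struct address of 0x0fa9a000 with the 8K unit.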
Thankfully because Linux doesn't have support for nested IO interrupts
& our devices have large buffers which can survive interrupts being shut off for
short amounts of time we don't need a separate stack for interrupts.




Register Usage & Stackframes on Linux for s/390 & z/Architecture
=================================================================
Overview:
---------
This is the code that gcc produces at the top & the bottom of
each function. It usually is fairly consistent & similar from
function to function & if you know its layout you can probably
make some headway in finding the ultimate cause of a problem
after a crash without a source level debugger.

Note: To follow stackframes requires a knowledge of C or Pascal &
limited knowledge of one assembly language.

It should be noted that there are some differences between the
s/390 & z/Architecture stack layouts as the z/Architecture stack layout didn't have
to maintain compatibility with older linkage formats.

Glossary:
---------
alloca:
This is a built in compiler function for runtime allocation
of extra space on the caller's stack which is obviously freed
up on function exit ( e.g. the caller may choose to allocate nothing
or a buffer of 4k if required for temporary purposes ), it generates
very efficient code ( a few cycles ) when compared to alternatives
like malloc.

automatics: These are local variables on the stack,
i.e. they aren't in registers & they aren't static.

back-chain:
This is a pointer to the stack pointer before entering a
framed function's ( see frameless function ) prologue, got by
dereferencing the address of the current stack pointer,
i.e. got by accessing the 32 bit value at the stack pointer's
current location.

base-pointer:
This is a pointer to the back of the literal pool which
is an area just behind each procedure used to store constants
in each function.

call-clobbered: The caller probably needs to save these registers if there
is something of value in them, on the stack or elsewhere before making a
call to another procedure so that it can restore it later.

epilogue:
The code generated by the compiler to return to the caller.

frameless-function
A frameless function in Linux for s390 & z/Architecture is one which doesn't
need more than the register save area ( 96 bytes on s/390, 160 on z/Architecture )
given to it by the caller.
A frameless function never:
1) Sets up a back chain.
2) Calls alloca.
3) Calls other normal functions
4) Has automatics.

GOT-pointer:
This is a pointer to the global-offset-table in ELF
( Executable and Linkable Format, Linux's most common executable format ),
all globals & shared library objects are found using this pointer.

lazy-binding
ELF shared libraries are typically only loaded when routines in the shared
library are actually first called at runtime. This is lazy binding.

procedure-linkage-table
This is a table found from the GOT which contains pointers to routines
in other shared libraries which can't be called by easier means.

prologue:
The code generated by the compiler to set up the stack frame.

outgoing-args:
This is extra area allocated on the stack of the calling function if the
parameters for the callees cannot all be put in registers, the same
area can be reused by each function the caller calls.

routine-descriptor:
A COFF executable format based concept of a procedure reference
actually being 8 bytes or more as opposed to a simple pointer to the routine.
This is typically defined as follows
Routine Descriptor offset 0=Pointer to Function
Routine Descriptor offset 4=Pointer to Table of Contents
The table of contents/TOC is roughly equivalent to a GOT pointer
& it means that shared libraries etc. can be shared between several
environments each with their own TOC.


static-chain: This is used in nested functions, a concept adopted from pascal
by gcc not used in ansi C or C++ ( although quite useful ), basically it
is a pointer used to reference local variables of enclosing functions.
You might come across this stuff once or twice in your lifetime.

e.g.
The function below should return 11 though gcc may get upset & toss warnings
about unused variables.
int FunctionA(int a)
{
        int b;
        void FunctionC(int c)
        {
                b=c+1;
        }
        FunctionC(10);
        return(b);
}


s/390 & z/Architecture Register usage
=====================================
r0       used by syscalls/assembly                  call-clobbered
r1       used by syscalls/assembly                  call-clobbered
r2       argument 0 / return value 0                call-clobbered
r3       argument 1 / return value 1 (if long long) call-clobbered
r4       argument 2                                 call-clobbered
r5       argument 3                                 call-clobbered
r6       argument 4                                 saved
r7       pointer-to arguments 5 to ...              saved
r8       this & that                                saved
r9       this & that                                saved
r10      static-chain ( if nested function )        saved
r11      frame-pointer ( if function used alloca )  saved
r12      got-pointer                                saved
r13      base-pointer                               saved
r14      return-address                             saved
r15      stack-pointer                              saved

f0       argument 0 / return value ( float/double ) call-clobbered
f2       argument 1                                 call-clobbered
f4       z/Architecture argument 2                  saved
f6       z/Architecture argument 3                  saved
The remaining floating points
f1,f3,f5 f7-f15 are call-clobbered.

Notes:
------
1) The only requirement is that registers which are used
by the callee are saved, e.g.
the compiler is perfectly
capable of using r11 for purposes other than a
frame pointer if a frame pointer is not needed.
2) In functions with variable arguments e.g. printf the calling procedure
is identical to one without variable arguments & the same number of
parameters. However, the prologue of this function is somewhat more
hairy owing to it having to move these parameters to the stack to
get va_start, va_arg & va_end to work.
3) Access registers are currently unused by gcc but are used in
the kernel. Possibilities exist to use them at the moment for
temporary storage but it isn't recommended.
4) Only 4 of the floating point registers are used for
parameter passing as older machines such as G3 only have 4
& it keeps the stack frame compatible with other compilers.
However with IEEE floating point emulation under linux on the
older machines you are free to use the other 12.
5) A long long or double parameter cannot have the
first 4 bytes in a register & the second four bytes in the
outgoing args area. It must be purely in the outgoing args
area if crossing this boundary.
6) Floating point parameters are mixed with outgoing args
on the outgoing args area in the order they are passed in as parameters.
7) Floating point arguments 2 & 3 are saved in the outgoing args area for
z/Architecture


Stack Frame Layout
------------------
s/390     z/Architecture
0         0             back chain ( a 0 here signifies end of back chain )
4         8             eos ( end of stack, not used on Linux for S390 used in other linkage formats )
8         16            glue used in other s/390 linkage formats for saved routine descriptors etc.
12        24            glue used in other s/390 linkage formats for saved routine descriptors etc.
16        32            scratch area
20        40            scratch area
24        48            saved r6 of caller function
28        56            saved r7 of caller function
32        64            saved r8 of caller function
36        72            saved r9 of caller function
40        80            saved r10 of caller function
44        88            saved r11 of caller function
48        96            saved r12 of caller function
52        104           saved r13 of caller function
56        112           saved r14 of caller function
60        120           saved r15 of caller function
64        128           saved f4 of caller function
72        132           saved f6 of caller function
80                      undefined
96        160           outgoing args passed from caller to callee
96+x      160+x         possible stack alignment ( 8 bytes desirable )
96+x+y    160+x+y       alloca space of caller ( if used )
96+x+y+z  160+x+y+z     automatics of caller ( if used )
0                       back-chain

A sample program with comments.
===============================

Comments on the function test
-----------------------------
1) It didn't need to set up a pointer to the constant pool gpr13 as it isn't used
( :-( ).
2) This is a frameless function & no stack is bought.
3) The compiler was clever enough to recognise that it could return the
value in r2 as well as use it for the passed in parameter ( :-) ).
4) The basr ( branch & save register ) trick works as follows: the instruction
has a special case where r0 ( as with some other instruction operands ) is understood as
the literal value 0 ( some risc architectures also do this ).
So now
we are branching to the next address & the new program counter address is
in r13, so now we subtract the size of the function prologue we have executed
+ the size of the literal pool to get to the top of the literal pool
0040037c int test(int b)
{                                                          # Function prologue below
  40037c:       90 de f0 34     stm     %r13,%r14,52(%r15) # Save registers r13 & r14
  400380:       0d d0           basr    %r13,%r0           # Set up pointer to constant pool using
  400382:       a7 da ff fa     ahi     %r13,-6            # basr trick
        return(5+b);
                                                           # Huge main program
  400386:       a7 2a 00 05     ahi     %r2,5              # add 5 to r2

                                                           # Function epilogue below
  40038a:       98 de f0 34     lm      %r13,%r14,52(%r15) # restore registers r13 & 14
  40038e:       07 fe           br      %r14               # return
}

Comments on the function main
-----------------------------
1) The compiler did this function optimally ( 8-) )

Literal pool for main.
400390: ff ff ff ec     .long 0xffffffec
main(int argc,char *argv[])
{                                                          # Function prologue below
  400394:       90 bf f0 2c     stm     %r11,%r15,44(%r15) # Save necessary registers
  400398:       18 0f           lr      %r0,%r15           # copy stack pointer to r0
  40039a:       a7 fa ff a0     ahi     %r15,-96           # Make area for callee saving
  40039e:       0d d0           basr    %r13,%r0           # Set up r13 to point to
  4003a0:       a7 da ff f0     ahi     %r13,-16           # literal pool
  4003a4:       50 00 f0 00     st      %r0,0(%r15)        # Save backchain

        return(test(5));                                   # Main Program Below
  4003a8:       58 e0 d0 00     l       %r14,0(%r13)       # load relative address of test from
                                                           # literal pool
  4003ac:       a7 28 00 05     lhi     %r2,5              # Set first parameter to 5
  4003b0:       4d ee d0 00     bas     %r14,0(%r14,%r13)  # jump to test setting r14 as return
                                                           # address using branch & save instruction.

                                                           # Function Epilogue below
  4003b4:       98 bf f0 8c     lm      %r11,%r15,140(%r15)# Restore necessary registers.
  4003b8:       07 fe           br      %r14               # return to do program exit
}


Compiler updates
----------------

main(int argc,char *argv[])
{
  4004fc:       90 7f f0 1c     stm     %r7,%r15,28(%r15)
  400500:       a7 d5 00 04     bras    %r13,400508 <main+0xc>
  400504:       00 40 04 f4     .long   0x004004f4
  # compiler now puts constant pool in code too so it saves an instruction
  400508:       18 0f           lr      %r0,%r15
  40050a:       a7 fa ff a0     ahi     %r15,-96
  40050e:       50 00 f0 00     st      %r0,0(%r15)
        return(test(5));
  400512:       58 10 d0 00     l       %r1,0(%r13)
  400516:       a7 28 00 05     lhi     %r2,5
  40051a:       0d e1           basr    %r14,%r1
  # compiler adds 1 extra instruction to epilogue this is done to
  # avoid processor pipeline stalls owing to data dependencies on g5 &
  # above as register 14 in the old code was needed directly after being loaded
  # by the lm   %r11,%r15,140(%r15) for the br %r14.
  40051c:       58 40 f0 98     l       %r4,152(%r15)
  400520:       98 7f f0 7c     lm      %r7,%r15,124(%r15)
  400524:       07 f4           br      %r4
}


Hartmut ( our compiler developer ) also has been threatening to take out the
stack backchain in optimised code as this also causes pipeline stalls, you
have been warned.

64 bit z/Architecture code disassembly
--------------------------------------

If you understand the stuff above you'll understand the stuff
below too so I'll avoid repeating myself & just say that
some of the instructions have g's on the end of them to indicate
they are 64 bit & the stack offsets are bigger,
the only other difference you'll find between 32 & 64 bit is that
we now use f4 & f6 for floating point arguments on 64 bit.
00000000800005b0 <test>:
int test(int b)
{
        return(5+b);
    800005b0:   a7 2a 00 05             ahi     %r2,5
    800005b4:   b9 14 00 22             lgfr    %r2,%r2 # downcast to integer
    800005b8:   07 fe                   br      %r14
    800005ba:   07 07                   bcr     0,%r7


}

00000000800005bc <main>:
main(int argc,char *argv[])
{
    800005bc:   eb bf f0 58 00 24       stmg    %r11,%r15,88(%r15)
    800005c2:   b9 04 00 1f             lgr     %r1,%r15
    800005c6:   a7 fb ff 60             aghi    %r15,-160
    800005ca:   e3 10 f0 00 00 24       stg     %r1,0(%r15)
        return(test(5));
    800005d0:   a7 29 00 05             lghi    %r2,5
    # brasl allows jumps > 64k & is overkill here bras would do fine
    800005d4:   c0 e5 ff ff ff ee       brasl   %r14,800005b0 <test>
    800005da:   e3 40 f1 10 00 04       lg      %r4,272(%r15)
    800005e0:   eb bf f0 f8 00 04       lmg     %r11,%r15,248(%r15)
    800005e6:   07 f4                   br      %r4
}



Compiling programs for debugging on Linux for s/390 & z/Architecture
====================================================================
-gdwarf-2 now works & it should be considered the default debugging
format for s/390 & z/Architecture as it is more reliable for debugging
shared libraries, normal -g debugging works much better now
thanks to the IBM java compiler developers' bug reports.

This is typically done by adding/appending the flags -g or -gdwarf-2 to the
CFLAGS & LDFLAGS variables in the Makefile of the program concerned.

If using gdb & you would like accurate displays of registers &
stack traces, compile without optimisation i.e. make sure
that there is no -O2 or similar on the CFLAGS line of the Makefile &
the emitted gcc commands, obviously this will produce worse code
( not advisable for shipment ) but it is an aid to the debugging process.

This aids debugging because the compiler will copy parameters passed in
in registers onto the stack so backtracing & looking at passed in
parameters will work, however some larger programs which use inline functions
will not compile without optimisation.

Debugging with optimisation has since much improved after fixing
some bugs, please make sure you are using gdb-5.0 or later developed
after Nov'2000.



Debugging under VM
==================

Notes
-----
Addresses & values in the VM debugger are always hex never decimal
Address ranges are of the format <HexValue1>-<HexValue2> or <HexValue1>.<HexValue2>
e.g. The address range 0x2000 to 0x3000 can be described as 2000-3000 or 2000.1000

The VM Debugger is case insensitive.

VM's strengths are usually other debuggers' weaknesses, you can get at any resource
no matter how sensitive e.g. memory management resources, change address translation
in the PSW. For kernel hacking you will reap dividends if you get good at it.

The VM Debugger displays operators but not operands, probably because some
of it was written when memory was expensive & the programmer was probably proud that
it fitted into 2k of memory & the programmers didn't want to shock hardcore VM'ers by
changing the interface :-), also the debugger displays useful information on the same line &
the author of the code probably felt that it was a good idea not to go over
the 80 columns on the screen.

As some of you are probably in a panic now, this isn't as unintuitive as it may seem
as the 390 instructions are easy to decode mentally & you can make a good guess at a lot
of them as all the operands are nibble ( half byte ) aligned & if you have an objdump listing
also it is quite easy to follow, if you don't have an objdump listing keep a copy of
the s/390 Reference Summary & look at between pages 2 & 7 or alternatively the
s/390 principles of operation.
e.g.
even I can guess that
0001AFF8' LR    180F        CC 0
is a ( load register ) lr r0,r15

Also it is very easy to tell the length of a 390 instruction from the 2 most significant
bits in the instruction ( not that this info is really useful except if you are trying to
make sense of a hexdump of code ).
Here is a table
Bits                    Instruction Length
------------------------------------------
00                          2 Bytes
01                          4 Bytes
10                          4 Bytes
11                          6 Bytes




The debugger also displays other useful info on the same line such as the
addresses being operated on, destination addresses of branches & condition codes.
e.g.
00019736' AHI   A7DAFF0E    CC 1
000198BA' BRC   A7840004 -> 000198C2'   CC 0
000198CE' STM   900EF068 >> 0FA95E78    CC 2



Useful VM debugger commands
---------------------------

I suppose I'd better mention this before I start:
to list the current active traces do
Q TR
there can be a maximum of 255 of these per set
( more about trace sets later ).
To stop traces issue a
TR END.
To delete a particular breakpoint issue
TR DEL <breakpoint number>

The PA1 key drops to CP mode so you can issue debugger commands,
Doing alt c ( on my 3270 console at least ) clears the screen.
Hitting b <enter> comes back to the running operating system
from cp mode ( in our case linux ).
It is typically useful to add shortcuts to your profile.exec file
if you have one ( this is roughly equivalent to autoexec.bat in DOS ).
Here are a few from mine.
/* this gives me command history on issuing f12 */
set pf12 retrieve
/* this continues */
set pf8 imm b
/* goes to trace set a */
set pf1 imm tr goto a
/* goes to trace set b */
set pf2 imm tr goto b
/* goes to trace set c */
set pf3 imm tr goto c



Instruction Tracing
-------------------
Setting a simple breakpoint
TR I PSWA <address>
To debug a particular function try
TR I R <function address range>
TR I on its own will single step.
TR I DATA <MNEMONIC> <OPTIONAL RANGE> will trace for particular mnemonics
e.g.
TR I DATA 4D R 0197BC.4000
will trace for BAS'es ( opcode 4D ) in the range 0197BC.4000
if you were inclined you could add traces for all branch instructions &
suffix them with RUN so you would have a backtrace on screen
when a program crashes.
TR BR <INTO OR FROM> will trace branches into or out of an address.
e.g.
TR BR INTO 0 is often quite useful if a program is getting awkward & deciding
to branch to 0 & crashing, as this will stop at the address before it jumps to 0.
TR I R <address range> RUN cmd d g
single steps a range of addresses but stays running &
displays the gprs on each step.



Displaying & modifying Registers
--------------------------------
D G will display all the gprs
Adding an extra G to all the commands is necessary to access the full 64 bit
content in VM on z/Architecture, obviously this isn't required for access registers
as these are still 32 bit.
e.g. DGG instead of DG
D X will display all the control registers
D AR will display all the access registers
D AR4-7 will display access registers 4 to 7
CPU ALL D G will display the GPRs of all CPUs in the configuration
D PSW will display the current PSW
st PSW 2000 will put the value 2000 into the PSW &
crash your machine.
D PREFIX displays the prefix offset


Displaying Memory
-----------------
To display memory mapped using the current PSW's mapping try
D <range>
To make VM display a message each time it hits a particular address & continue,
see the TR I PSWA <address> cmd msg trick in the Hints section below.
D I<range> will disassemble/display a range of instructions.
ST <addr> <32 bit word> will store a 32 bit word at a 32 bit aligned address
D T<range> will display the EBCDIC in an address ( if you are that way inclined )
D R<range> will display real addresses ( without DAT ) but with prefixing.
There are other complex options to display if you need to get at say home space
but are in primary space, the easiest thing to do is to temporarily
modify the PSW to the other addressing mode, display the stuff & then
restore it.



Hints
-----
If you want to issue a debugger command without halting your virtual machine with the
PA1 key try prefixing the command with #CP e.g.
#cp tr i pswa 2000
also suffixing most debugger commands with RUN will cause them not
to stop, just display the mnemonic at the current instruction on the console.
If you have several breakpoints you want to put into your program &
you get fed up of cross referencing with System.map
you can do the following trick for several symbols.
grep do_signal System.map
which emits the following among other things
0001f4e0 T do_signal
now you can do

TR I PSWA 0001f4e0 cmd msg * do_signal
This sends a message to your own console each time do_signal is entered.
( As an aside I wrote a perl script once which automatically generated a REXX
script with breakpoints on every kernel procedure, this isn't a good idea
because there are thousands of these routines & VM can only set 255 breakpoints
at a time so you nearly had to spend as long pruning the file down as you would
entering the msg's by hand ), however, the trick might be useful for a single object file.
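The System.map trick above is easy to mechanise for a small, already pruned list of
symbols. What follows is only a sketch in C ( not the perl script mentioned above ):
the function names map_line_to_trace & emit_traces are made up for this illustration,
it assumes the usual "<hex address> <type> <symbol>" System.map line layout, it only
picks text symbols ( type T or t ) & it stops at the 255 trace limit quoted above.

```c
#include <stdio.h>
#include <string.h>

#define VM_MAX_TRACES 255	/* per trace set limit quoted in the text */

/*
 * Turn one System.map line ( "<hex address> <type> <symbol>" ) into a
 * VM trace command. Returns 1 & fills out[] for text symbols ( type T
 * or t ), returns 0 for anything else.
 * ( hypothetical helper, not part of any existing tool )
 */
int map_line_to_trace(const char *line, char *out, size_t outlen)
{
	char addr[32], sym[128], type;

	if (sscanf(line, "%31s %c %127s", addr, &type, sym) != 3)
		return 0;
	if (type != 'T' && type != 't')
		return 0;
	snprintf(out, outlen, "TR I PSWA %s cmd msg * %s", addr, sym);
	return 1;
}

/*
 * Filter a whole System.map style stream, stopping once the trace
 * limit is reached. Returns the number of commands written.
 */
int emit_traces(FILE *in, FILE *out)
{
	char line[256], cmd[256];
	int n = 0;

	while (n < VM_MAX_TRACES && fgets(line, sizeof line, in))
		if (map_line_to_trace(line, cmd, sizeof cmd)) {
			fprintf(out, "%s\n", cmd);
			n++;
		}
	return n;
}
```

Calling emit_traces(stdin, stdout) from a small main() & feeding it the output of
grep on System.map gives a list of commands you can paste into the console or a REXX script.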
On linux'es 3270 emulator x3270 there is a very useful option under the file menu,
Save Screens In File, this is very good for keeping a copy of traces.

From CMS help <command name> will give you online help on a particular command.
e.g.
HELP DISPLAY

Also there is a file called profile.exec which automatically gets called
on startup of CMS ( like autoexec.bat ), keeping with the DOS analogy
CP has a feature similar to doskey, it may be useful for you to
use profile.exec to define some keystrokes.
e.g.
SET PF9 IMM B
This continues ( a single step if TR I is active ) on pressing F9.
SET PF10 ^
This sets up the ^ key
which can be used for ^c (ctrl-c),^z (ctrl-z) which can't be typed directly into some 3270 consoles.
SET PF11 ^-
This types the starting keystrokes for a sysrq, see SysRq below.
SET PF12 RETRIEVE
This retrieves command history on pressing F12.


Sometimes in VM the display is set up to scroll automatically, this
can be very annoying if there are messages you wish to look at,
to stop this do
TERM MORE 255 255
This will nearly stop automatic screen updates, however it will
cause a denial of service if lots of messages go to the 3270 console,
so it would be foolish to use this as the default on a production machine.


Tracing particular processes
----------------------------
The kernel's text segment is intentionally at an address in memory that it will
very seldom collide with text segments of user programs ( thanks Martin ),
this simplifies debugging the kernel.
However it is quite common for user processes to have addresses which collide,
this can make debugging a particular process under VM painful under normal
circumstances as the process may change when doing a
TR I R <address range>.
Thankfully after reading VM's online help I figured out how to debug
a particular process.

Your first problem is to find the STD ( segment table designation )
of the program you wish to debug.
There are several ways you can do this, here are a few
1) objdump --syms <program to be debugged> | grep main
To get the address of main in the program.
tr i pswa <address of main>
Start the program, if VM drops to CP on what looks like the entry
point of the main function this is most likely the process you wish to debug.
Now do a D X13 or D XG13 on z/Architecture.
On 31 bit the STD is bits 1-19 ( the STO segment table origin )
& 25-31 ( the STL segment table length ) of CR13.
now type
TR I R STD <CR13's value> 0.7fffffff
e.g.
TR I R STD 8F32E1FF 0.7fffffff
Another very useful variation is
TR STORE INTO STD <CR13's value> <address range>
for finding out when a particular variable changes.

An alternative way of finding the STD of a currently running process
is to do the following, ( this method is more complex but
could be quite convenient if you aren't updating the kernel much &
so your kernel structures will stay constant for a reasonable period of
time ).

grep task /proc/<pid>/status
from this you should see something like
task: 0f160000 ksp: 0f161de8 pt_regs: 0f161f68
This now gives you a pointer to the task structure.
Now make CC:="s390-gcc -g" kernel/sched.s
To get the task_struct stabinfo.
( task_struct is defined in include/linux/sched.h ).
Now we want to look at
task->active_mm->pgd
on my machine the active_mm in the task structure stab is
active_mm:(4,12),672,32
its offset is 672/8=84=0x54
the pgd member in the mm_struct stab is
pgd:(4,6)=*(29,5),96,32
so its offset is 96/8=12=0xc

so we'll
hexdump -s 0xf160054 /dev/mem | more
i.e. task_struct+active_mm offset
to look at the active_mm member
f160054 0fee cc60 0019 e334 0000 0000 0000 0011
hexdump -s 0x0feecc6c /dev/mem | more
i.e.
active_mm+pgd offset
we get something like
feecc6c 0f2c 0000 0000 0001 0000 0001 0000 0010
now do
TR I R STD <pgd|0x7f> 0.7fffffff
i.e. the 0x7f is added because the pgd only
gives the page table origin & we need to set the low bits
to the maximum possible segment table length.
TR I R STD 0f2c007f 0.7fffffff
on z/Architecture you'll probably need to do
TR I R STD <pgd|0x7> 0.ffffffffffffffff
to set the TableType to 0x1 & the Table length to 3.



Tracing Program Exceptions
--------------------------
If you get a crash which says something like
illegal operation or specification exception followed by a register dump
you can restart linux & trace these using the tr prog <range or value> trace option.


The most common ones you will normally be tracing for are
1=operation exception
2=privileged operation exception
4=protection exception
5=addressing exception
6=specification exception
10=segment translation exception
11=page translation exception

The full list of these is on page 22 of the current s/390 Reference Summary.
e.g.
tr prog 10 will trace segment translation exceptions.
tr prog on its own will trace all program interruption codes.

Trace Sets
----------
On starting VM you are initially in the INITIAL trace set.
You can do a Q TR to verify this.
Trace sets help if you have a complex tracing situation where you wish to wait for instance
till a driver is open before you start tracing IO, but know in your
heart that you are going to have to make several runs through the code till you
have a clue what's going on.

What you can do is
TR I PSWA <Driver open address>
hit b to continue till breakpoint
reach the breakpoint
now do your
TR GOTO B
TR IO 7c08-7c09 inst int run
or whatever the IO channels you wish to trace are & hit b

To go back to the initial trace set do
TR GOTO INITIAL
& the TR I PSWA <Driver open address> will be the only active breakpoint again.


Tracing linux syscalls under VM
-------------------------------
Syscalls are implemented on Linux for S390 by the Supervisor call instruction (SVC), there are 256
possibilities of these as the instruction is made up of a 0x0A opcode & the second byte being
the syscall number. They are traced using the simple command
TR SVC <Optional value or range>
the syscalls are defined in linux/arch/s390/include/asm/unistd.h
e.g. to trace all file opens just do
TR SVC 5 ( as this is the syscall number of open )


SMP Specific commands
---------------------
To find out how many cpus you have
Q CPUS displays all the CPU's available to your virtual machine
To find the cpu that VM debugger commands are currently being directed at do
Q CPU
to change the cpu that VM debugger commands are being directed at do
CPU <desired cpu no>

On an SMP guest, to issue a command to all CPUs try prefixing the command with cpu all.
To issue a command to a particular cpu try cpu <cpu number> e.g.
CPU 01 TR I R 2000.3000
If you are running on a guest with several cpus & you have an IO related problem
& cannot follow the flow of code but you know it isn't smp related,
from the bash prompt issue
shutdown -h now or halt.
do a Q CPUS to find out how many cpus you have
detach each one of them from cp except cpu 0
by issuing a
DETACH CPU 01-(number of cpus in configuration)
& boot linux again.
TR SIGP will trace inter processor signal processor instructions.
DEFINE CPU 01-(number in configuration)
will get your guests cpus back.


Help for displaying ascii textstrings
-------------------------------------
On the very latest VM Nucleus'es VM can now display ascii
( thanks Neale for the hint ) by doing
D TX<lowaddr>.<len>
e.g.
D TX0.100

Alternatively
=============
Under older VM debuggers ( I love EBCDIC too ) you can use this little program I wrote which
will convert a command line of hex digits to ascii text, it can be compiled under linux &
you can copy the hex digits from your x3270 terminal to your xterm if you are debugging
from a linuxbox.

This is quite useful when looking at a parameter passed in as a text string
under VM ( unless you are good at decoding ASCII in your head ).

e.g. consider tracing an open syscall
TR SVC 5
We have stopped at a breakpoint
000151B0' SVC   0A05     -> 0001909A'   CC 0

D 20.8 to check the SVC old psw in the prefix area & see whether it was from userspace
( for the layout of the prefix area consult P18 of the s/390 Reference Summary
if you have it available ).
V00000020  070C2000 800151B2
The problem state bit wasn't set & it's also too early in the boot sequence
for it to be a userspace SVC, if it was we would have to temporarily switch the
psw to user space addressing so we could get at the first parameter of the open in
gpr2.
Next do a
D G2
GPR  2 =  00014CB4
Now display what gpr2 is pointing to
D 00014CB4.20
V00014CB4  2F646576 2F636F6E 736F6C65 00001BF5
V00014CC4  FC00014C B4001001 E0001000 B8070707
Now copy the text till the first 00 hex ( which is the end of the string )
to an xterm & do hex2ascii on it.
hex2ascii 2F646576 2F636F6E 736F6C65 00
outputs
Decoded Hex:=/ d e v / c o n s o l e 0x00
We were opening the console device,

You can compile the code below yourself for practice :-),
/*
 *    hex2ascii.c
 *    a useful little tool for converting a hexadecimal command line to ascii
 *
 *    Author(s): Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com)
 *    (C) 2000 IBM Deutschland Entwicklung GmbH, IBM Corporation.
 */
#include <stdio.h>
#include <string.h>   /* strcmp & strlen */

int main(int argc,char *argv[])
{
  int cnt1,cnt2,len,toggle=0;
  int startcnt=1;
  unsigned char c,hex=0;

  /* -a selects compact output: no spaces, '.' for unprintable bytes */
  if(argc>1&&(strcmp(argv[1],"-a")==0))
     startcnt=2;
  printf("Decoded Hex:=");
  for(cnt1=startcnt;cnt1<argc;cnt1++)
  {
    len=strlen(argv[cnt1]);
    for(cnt2=0;cnt2<len;cnt2++)
    {
       /* convert one hex digit to its value */
       c=argv[cnt1][cnt2];
       if(c>='0'&&c<='9')
          c=c-'0';
       else if(c>='A'&&c<='F')
          c=c-'A'+10;
       else if(c>='a'&&c<='f')
          c=c-'a'+10;
       switch(toggle)
       {
          case 0:
             /* high nibble */
             hex=c<<4;
             toggle=1;
          break;
          case 1:
             /* low nibble completes the byte */
             hex+=c;
             if(hex<32||hex>126)
             {
                if(startcnt==1)
                   printf("0x%02X ",(int)hex);
                else
                   printf(".");
             }
             else
             {
               printf("%c",hex);
               if(startcnt==1)
                  printf(" ");
             }
             toggle=0;
          break;
       }
    }
  }
  printf("\n");
  return 0;
}




Stack tracing under VM
----------------------
A basic backtrace
-----------------

Here are the tricks I use, 9 out of 10 times it works pretty well,

When your backchain reaches a dead end
--------------------------------------
This can happen when an exception happens in the kernel & the kernel is entered twice,
if you reach the NULL pointer at the end of the back chain you should be
able to sniff further back if you follow the following tricks.
1) A kernel address should be easy to recognise since it is in
primary space & the problem state bit isn't set & also
the Hi bit of the address is set.
2) Another backchain should also be easy to recognise since it is an
address pointing to another address approximately 100 bytes ( 0x60 to 0x70 hex )
away from the current stackpointer.


Here is some practice.
boot the kernel & hit PA1 at some random time
d g to display the gprs, this should display something like
GPR  0 =  00000001 00156018 0014359C 00000000
GPR  4 =  00000001 001B8888 000003E0 00000000
GPR  8 =  00100080 00100084 00000000 000FE000
GPR 12 =  00010400 8001B2DC 8001B36A 000FFED8
Note that GPR14 is a return address but as we are real men we are going to
trace the stack.
display 0x40 bytes after the stack pointer ( GPR15 ).
d 000FFED8.40

V000FFED8  000FFF38 8001B838 80014C8E 000FFF38
V000FFEE8  00000000 00000000 000003E0 00000000
V000FFEF8  00100080 00100084 00000000 000FE000
V000FFF08  00010400 8001B2DC 8001B36A 000FFED8


Ah now look at what's in sp+56 (sp+0x38), this is 8001B36A our saved r14 if
you look above at our stackframe & it also agrees with GPR14.

now backchain
d 000FFF38.40
we now are taking the contents of SP to get our first backchain.

V000FFF38  000FFFA0 00000000 00014995 00147094
V000FFF48  00147090 001470A0 000003E0 00000000
V000FFF58  00100080 00100084 00000000 001BF1D0
V000FFF68  00010400 800149BA 80014CA6 000FFF38

This displays a 2nd return address of 80014CA6

now do d 000FFFA0.40 for our 3rd backchain

V000FFFA0  04B52002 0001107F 00000000 00000000
V000FFFB0  00000000 00000000 FF000000 0001107F
V000FFFC0  00000000 00000000 00000000 00000000
V000FFFD0  00010400 80010802 8001085A 000FFFA0


our 3rd return address is 8001085A

as the 04B52002 looks suspiciously like rubbish it is fair to assume that the kernel entry routines
for the sake of optimisation don't set up a backchain.

now look at System.map to see if the addresses make any sense.

grep -i 0001b3 System.map
outputs among other things
0001b304 T cpu_idle
so 8001B36A
is cpu_idle+0x66 ( quiet, the cpu is asleep, don't wake it )


grep -i 00014 System.map
produces among other things
00014a78 T start_kernel
so 80014CA6 is start_kernel+some hex number I can't add in my head.

grep -i 00108 System.map
this produces
00010800 T _stext
so 8001085A is _stext+0x5a

Congrats you've done your first backchain.



s/390 & z/Architecture IO Overview
==================================

I am not going to give a course in 390 IO architecture as this would take me quite a
while & I'm no expert. Instead I'll give a 390 IO architecture summary for Dummies, if you have
the s/390 principles of operation available read this instead. If nothing else you may find a few
useful keywords in here & be able to use them on a web search engine like altavista to find
more useful information.

Unlike other bus architectures modern 390 systems do their IO using mostly
fibre optics & devices such as tapes & disks can be shared between several mainframes,
also S390 can support up to 65536 devices while a high end PC based system might be choking
with around 64. Here is some of the common IO terminology

Subchannel:
This is the logical number most IO commands use to talk to an IO device, there can be up to
0x10000 (65536) of these in a configuration, typically there are a few hundred. Under VM
for simplicity they are allocated contiguously, however on the native hardware they are not,
they typically stay consistent between boots provided no new hardware is inserted or removed.
Under Linux for 390 we use these as IRQ's & also when issuing an IO command (CLEAR SUBCHANNEL,
HALT SUBCHANNEL,MODIFY SUBCHANNEL,RESUME SUBCHANNEL,START SUBCHANNEL,STORE SUBCHANNEL &
TEST SUBCHANNEL ) we use this as the ID of the device we wish to talk to, the most
important of these instructions are START SUBCHANNEL ( to start IO ), TEST SUBCHANNEL ( to check
whether the IO completed successfully ), & HALT SUBCHANNEL ( to kill IO ), a subchannel
can have up to 8 channel paths to a device, this offers redundancy if one is not available.


Device Number:
This number remains static & is closely tied to the hardware, there are 65536 of these,
also they are made up of a CHPID ( Channel Path ID, the most significant 8 bits )
& another lsb 8 bits. These remain static even if more devices are inserted or removed
from the hardware, there is a 1 to 1 mapping between Subchannels & Device Numbers provided
devices aren't inserted or removed.

Channel Control Words:
CCWS are linked lists of instructions initially pointed to by an operation request block (ORB),
which is initially given to the Start Subchannel (SSCH) command along with the subchannel number
for the IO subsystem to process while the CPU continues executing normal code.
These come in two flavours, Format 0 ( 24 bit for backward
compatibility ) & Format 1 ( 31 bit ). These are typically used to issue read & write
( & many other instructions ), they consist of a length field & an absolute address field.
For each IO you typically get 1 or 2 interrupts, one for channel end ( primary status ) when the
channel is idle & the second for device end ( secondary status ), sometimes you get both
concurrently, you check how the IO went by issuing a TEST SUBCHANNEL at each interrupt,
from which you receive an Interruption response block (IRB). If you get channel & device end
status in the IRB without channel checks etc. your IO probably went okay. If you didn't you
probably need a doctor to examine the IRB & extended status word etc.
If an error occurs, more sophisticated control units have a facility known as
concurrent sense, this means that if an error occurs Extended sense information will
be presented in the Extended status word in the IRB, if not you have to issue a
subsequent SENSE CCW command after the test subchannel.


TPI ( Test pending interrupt ) can also be used for polled IO but in multitasking multiprocessor
systems it isn't recommended except for checking special cases ( i.e. non looping checks for
pending IO etc. ).

Store Subchannel & Modify Subchannel can be used to examine & modify operating characteristics
of a subchannel ( e.g. channel paths ).
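
To make the Format 1 CCW described under Channel Control Words above a bit more concrete,
here is a small sketch in C which splits the two 32 bit words of a CCW into its fields.
The struct & function names are made up for this illustration & the layout ( command
code in byte 0, flags in byte 1, count in bytes 2-3, 31 bit data address in bytes 4-7 )
is the Format 1 layout as I understand it from the principles of operation, check it
there before relying on it.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of a Format 1 CCW, unpacked from its two 32 bit words. */
struct ccw1_fields {
	uint8_t  cmd;	/* command code, byte 0 */
	uint8_t  flags;	/* flag byte, byte 1 */
	uint16_t count;	/* byte count, bytes 2-3 */
	uint32_t addr;	/* 31 bit data address, bytes 4-7 */
};

/* hypothetical helper: split the two words of a Format 1 CCW */
struct ccw1_fields decode_ccw1(uint32_t word0, uint32_t word1)
{
	struct ccw1_fields f;

	f.cmd   = (uint8_t)(word0 >> 24);
	f.flags = (uint8_t)(word0 >> 16);
	f.count = (uint16_t)(word0 & 0xffff);
	f.addr  = word1 & 0x7fffffff;	/* top bit is not part of the address */
	return f;
}

/* print the fields in hex, the way the VM debugger likes its numbers */
void print_ccw1(uint32_t word0, uint32_t word1)
{
	struct ccw1_fields f = decode_ccw1(word0, word1);

	printf("cmd %02X flags %02X count %04X addr %08lX\n",
	       (unsigned)f.cmd, (unsigned)f.flags, (unsigned)f.count,
	       (unsigned long)f.addr);
}
```

For example the words E4200100 00487FE8 would decode ( assuming they are a Format 1 CCW )
to command code E4, flags 20, a count of 0100 bytes & a data address of 00487FE8.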

Other IO related Terms:
Sysplex: S390's Clustering Technology
QDIO: S390's new high speed IO architecture to support devices such as gigabit ethernet,
this architecture is also designed to be forward compatible with up & coming 64 bit machines.


General Concepts

Input Output Processors (IOP's) are responsible for communicating between
the mainframe CPU's & the channel & relieve the mainframe CPU's from the
burden of communicating with IO devices directly, this allows the CPU's to
concentrate on data processing.

IOP's can use one or more links ( known as channel paths ) to talk to each
IO device. The IOP first checks for path availability & chooses an available one,
then starts ( & sometimes terminates ) IO.
There are two types of channel path: ESCON & the Parallel IO interface.

IO devices are attached to control units, control units provide the
logic to interface the channel paths & channel path IO protocols to
the IO devices, they can be integrated with the devices or housed separately
& often talk to several similar devices ( typical examples would be raid
controllers or a control unit which connects to 1000 3270 terminals ).
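
The "check for path availability & choose an available one" step above can be pictured
with a few lines of C. This is only an illustration of the idea, not the channel
subsystem's real selection algorithm: the function name is made up, & it assumes the up
to 8 channel paths of a subchannel are described by an 8 bit path mask whose most
significant bit stands for the first path.

```c
#include <stdint.h>

#define MAX_CHANNEL_PATHS 8	/* a subchannel has up to 8 channel paths */

/*
 * Return the index ( 0-7 ) of the first path marked available in an
 * 8 bit path mask, counting from the most significant bit, or -1 if
 * no path is available at all.
 * ( hypothetical helper for illustration only )
 */
int first_available_path(uint8_t path_mask)
{
	int path;

	for (path = 0; path < MAX_CHANNEL_PATHS; path++)
		if (path_mask & (0x80 >> path))
			return path;
	return -1;
}
```

With this convention a mask of 80 means only the first path is usable, while a mask
of 00 means the device cannot be reached at all.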


+---------------------------------------------------------------+
| +-----+ +-----+ +-----+ +-----+ +----------+ +----------+     |
| | CPU | | CPU | | CPU | | CPU | |  Main    | | Expanded |     |
| |     | |     | |     | |     | |  Memory  | | Storage  |     |
| +-----+ +-----+ +-----+ +-----+ +----------+ +----------+     |
|---------------------------------------------------------------|
|         IOP         |         IOP         |        IOP        |
|---------------------------------------------------------------|
| C | C | C | C | C | C | C | C | C | C | C | C | C | C | C | C |
+---------------------------------------------------------------+
     ||                                              ||
     ||  Bus & Tag Channel Path                      || ESCON
     ||  ======================                      || Channel
     ||  ||                  ||                      || Path
+----------+            +----------+            +----------+
|          |            |          |            |          |
|    CU    |            |    CU    |            |    CU    |
|          |            |          |            |          |
+----------+            +----------+            +----------+
   |      |                   |                  |       |
+----------+ +----------+ +----------+ +----------+ +----------+
|I/O Device| |I/O Device| |I/O Device| |I/O Device| |I/O Device|
+----------+ +----------+ +----------+ +----------+ +----------+
  CPU = Central Processing Unit
  C = Channel
  IOP = IO Processor
  CU = Control Unit

The 390 IO systems come in 2 flavours, the current 390 machines support both.

The Older 360 & 370 Interface, sometimes called the Parallel I/O interface,
sometimes called Bus-and-Tag & sometimes Original Equipment Manufacturers
Interface (OEMI).

This byte wide Parallel channel path/bus has parity & data on the "Bus" cable
& control lines on the "Tag" cable. These can operate in byte multiplex mode for
sharing between several slow devices or burst mode & monopolize the channel for the
whole burst. Up to 256 devices can be addressed on one of these cables. These cables are
about one inch in diameter.
The maximum unextended length supported by these cables is
125 Meters but this can be extended up to 2km with a fibre optic channel extender
such as a 3044. The maximum burst speed supported is 4.5 megabytes per second however
some really old processors support only transfer rates of 3.0, 2.0 & 1.0 MB/sec.
One of these paths can be daisy chained to up to 8 control units.


ESCON ( fibre optic, its Fibre Channel based successor is called FICON )
Was introduced by IBM in 1990. Has 2 fibre optic cables & uses either leds or lasers
for communication at a signaling rate of up to 200 megabits/sec. As 10 bits are transferred
for every 8 bits of info this drops to 160 megabits/sec & to 18.6 Megabytes/sec once
control info & CRC are added. ESCON only operates in burst mode.

ESCON's typical max cable length is 3km for the led version & 20km for the laser version
known as XDF ( extended distance facility ). This can be further extended by using an
ESCON director which triples the above mentioned ranges. Unlike Bus & Tag, as ESCON is
serial it uses a packet switching architecture, the standard Bus & Tag control protocol
is however present within the packets. Up to 256 devices can be attached to each control
unit that uses one of these interfaces.

Common 390 Devices include:
Network adapters, typically OSA2,3172's,2116's & OSA-E gigabit ethernet adapters,
Consoles 3270 & 3215 ( a teletype emulated under linux for a line mode console ).
DASD's direct access storage devices ( otherwise known as hard disks ).
Tape Drives.
CTC ( Channel to Channel Adapters ),
ESCON or Parallel Cables used as a very high speed serial link
between 2 machines. We use 2 cables under linux to do a bi-directional serial link.


Debugging IO on s/390 & z/Architecture under VM
===============================================

Now we are ready to go on with IO tracing commands under VM

A few self explanatory queries:
Q OSA
Q CTC
Q DISK ( This command is CMS specific )
Q DASD

Q OSA on my machine returns
OSA  7C08 ON OSA   7C08 SUBCHANNEL = 0000
OSA  7C09 ON OSA   7C09 SUBCHANNEL = 0001
OSA  7C14 ON OSA   7C14 SUBCHANNEL = 0002
OSA  7C15 ON OSA   7C15 SUBCHANNEL = 0003

If you have a guest with certain privileges you may be able to see devices
which don't belong to you. To avoid this, add the option V.
e.g.
Q V OSA

Now using the device numbers returned by this command we will
trace the io starting up on the first device 7c08 & 7c09
In our simplest case we can trace the
start subchannels
like TR SSCH 7C08-7C09
or the halt subchannels
or TR HSCH 7C08-7C09
MSCH's ,STSCH's I think you can guess the rest

Ingo's favourite trick is tracing all the IO's & CCWS & spooling them into the reader of another
VM guest so he can ftp the logfile back to his own machine. I'll do a small bit of this & give you
a look at the output.

1) Spool stdout to VM reader
SP PRT TO (another vm guest ) or * for the local vm guest
2) Fill the reader with the trace
TR IO 7c08-7c09 INST INT CCW PRT RUN
3) Start up linux
i 00c
4) Finish the trace
TR END
5) close the reader
C PRT
6) list reader contents
RDRLIST
7) copy it to linux4's minidisk
RECEIVE / LOG TXT A1 ( replace
8)
filel & press F11 to look at it
You should see something like:

00020942' SSCH  B2334000    0048813C    CC 0    SCH 0000    DEV 7C08
          CPA 000FFDF0   PARM 00E2C9C4    KEY 0  FPI C0  LPM 80
          CCW    000FFDF0  E4200100 00487FE8   0000  E4240100 ........
          IDAL    43D8AFE8
          IDAL    0FB76000
00020B0A'   I/O DEV 7C08 -> 000197BC'   SCH 0000   PARM 00E2C9C4
00021628' TSCH  B2354000 >> 00488164    CC 0    SCH 0000    DEV 7C08
          CCWA 000FFDF8   DEV STS 0C  SCH STS 00  CNT 00EC
           KEY 0   FPI C0  CC 0   CTLS 4007
00022238' STSCH B2344000 >> 00488108    CC 0    SCH 0000    DEV 7C08

If you don't like messing up your reader ( because you possibly booted from it )
you can alternatively spool it to another guest's reader.


Other common VM device related commands
---------------------------------------
These commands are listed only because they have
been of use to me in the past & may be of use to
you too. For more complete info on each of the commands
type HELP <command> from CMS.
detaching devices
DET <devno range>
attaching a device to a guest ( * for your own guest )
ATT <devno range> <guest>
READY <devno> causes VM to issue a fake interrupt.

The VARY command is normally only available to VM administrators.
VARY ON PATH <path> TO <devno range>
VARY OFF PATH <PATH> FROM <devno range>
This is used to switch on or off channel paths to devices.

Q CHPID <channel path ID>
This displays state of devices using this channel path
D SCHIB <subchannel>
This displays the subchannel information SCHIB block for the device.
this I believe is also only available to administrators.
DEFINE CTC <devno>
defines a virtual CTC channel to channel connection
2 need to be defined on each guest for the CTC driver to use.
COUPLE devno userid remote devno
Joins a local virtual device to a remote virtual device
( commonly used for the CTC driver ).

Building a VM ramdisk under CMS which linux can use
def vfb-<blocksize> <subchannel> <number blocks>
blocksize is commonly 4096 for linux.
Formatting it
format <subchannel> <drive letter e.g.
x> (blksize <blocksize>

Sharing a disk between multiple guests
LINK userid devno1 devno2 mode password



GDB on S390
===========
N.B. if compiling for debugging gdb works better without optimisation
( see Compiling programs for debugging )

invocation
----------
gdb <victim program> <optional corefile>

Online help
-----------
help: gives help on commands
e.g.
help
help display
Note gdb's online help is very good, use it.


Assembly
--------
info registers: displays registers other than floating point.
info all-registers: displays floating points as well.
disassemble: disassembles
e.g.
disassemble without parameters will disassemble the current function
disassemble $pc $pc+10
( newer versions of gdb want a comma here i.e. disassemble $pc,$pc+10 )

Viewing & modifying variables
-----------------------------
print or p: displays variable or register
e.g. p/x $sp will display the stack pointer

display: prints variable or register each time program stops
e.g.
display/x $pc will display the program counter
display argc

undisplay : undoes displays

info breakpoints: shows all current breakpoints

info stack: shows stack back trace ( if this doesn't work too well, I'll show you the
stacktrace by hand below ).

info locals: displays local variables.

info args: display current procedure arguments.

set args: will set argc & argv each time the victim program is invoked.

set <variable>=value
set argc=100
set $pc=0



Modifying execution
-------------------
step: steps n lines of sourcecode
step steps 1 line.
step 100 steps 100 lines of code.

next: like step except this will not step into subroutines

stepi: steps a single machine code instruction.
e.g. stepi 100

nexti: steps a single machine code instruction but will not step into subroutines.

finish: will run until exit of the current routine

run: (re)starts a program

cont: continues a program

quit: exits gdb.


breakpoints
------------

break
sets a breakpoint
e.g.

break main

break *$pc

break *0x400618

Here's a really useful one for large programs
rbr
Set a breakpoint for all functions matching REGEXP
e.g.
rbr 390
will set a breakpoint on all functions with 390 in their name.

info breakpoints
lists all breakpoints

delete: delete breakpoint by number or delete them all
e.g.
delete 1 will delete breakpoint number 1
delete will delete them all

watch: This will set a watchpoint ( usually hardware assisted ),
This will watch a variable till it changes
e.g.
watch cnt, will watch the variable cnt till it changes.
As an aside, unfortunately gdb's architecture independent watchpoint code
is inconsistent & not very good, watchpoints usually work but not always.

info watchpoints: Display currently active watchpoints

condition: ( another useful one )
Specify breakpoint number N to break only if COND is true.
Usage is `condition N COND', where N is an integer and COND is an
expression to be evaluated whenever breakpoint N is reached.



User defined functions/macros
-----------------------------
define: ( Note this is very very useful, simple & powerful )
usage define <name> <list of commands> end

examples which you should consider putting into .gdbinit in your home directory
( with a newer gdb write disassemble $pc,$pc+10 )
define d
stepi
disassemble $pc $pc+10
end

define e
nexti
disassemble $pc $pc+10
end


Other hard to classify stuff
----------------------------
signal n:
sends the victim program a signal.
e.g. signal 3 will send a SIGQUIT.

info signals:
what gdb does when the victim receives certain signals.

list:
e.g.
list lists current function source
list 1,10 lists the first 10 lines of the current file.
list test.c:1,10


directory:
Adds directories to be searched for source if gdb cannot find the source.
(note it is a bit sensitive about slashes)
e.g. To add the root of the filesystem to the searchpath do
directory //


call <function>
This calls a function in the victim program, this is pretty powerful
e.g.
(gdb) call printf("hello world")
outputs:
$1 = 11

You might now be thinking that the line above didn't work. It did, the text
is just sitting in the victim's stdio buffer & something extra has to be done
to flush it.
(gdb) call fflush(stdout)
hello world$2 = 0
As an aside the debugger also calls malloc & free under the hood
to make space for the "hello world" string.



hints
-----
1) command completion works just like bash
( if you are a bad typist like me this really helps )
e.g. hit br <TAB> & cursor up & down :-).

2) if you have a debugging problem that takes a few steps to recreate
put the steps into a file called .gdbinit in your current working directory.
If you have defined a few extra useful user defined commands put these in
a .gdbinit in your home directory & they will be read each time gdb is launched.

A typical .gdbinit file might be.
break main
run
break runtime_exception
cont


stack chaining in gdb by hand
-----------------------------
This is done using the same trick described for VM
p/x (*($sp+56))&0x7fffffff get the first backchain.

For z/Architecture
Replace 56 with 112 & ignore the &0x7fffffff
in the macros below & do nasty casts to longs like the following
as gdb unfortunately deals with printed arguments as ints which
messes up everything.
i.e.
here is a 3rd backchain dereference
p/x *(long *)(***(long ***)$sp+112)


Back on s/390, the first p/x command above ( p/x (*($sp+56))&0x7fffffff ) outputs
$5 = 0x528f18
on my machine.
Now you can use
info symbol (*($sp+56))&0x7fffffff
you might see something like.
rl_getc + 36 in section .text
telling you what is located at address 0x528f18
Now do.
p/x (*(*$sp+56))&0x7fffffff
This outputs
$6 = 0x528ed0
Now do.
info symbol (*(*$sp+56))&0x7fffffff
rl_read_key + 180 in section .text
now do
p/x (*(**$sp+56))&0x7fffffff
& so on.

Disassembling instructions without debug info
---------------------------------------------
gdb typically complains if there is a lack of debugging
symbols in the disassemble command with
"No function contains specified address." To get around
this do
x/<number lines to disassemble>xi <address>
e.g.
x/20xi 0x400730



Note: Remember gdb has history just like bash, you don't need to retype the
whole line, just use the up & down arrows.



For more info
-------------
From your linuxbox do
man gdb or info gdb.

core dumps
----------
What is a core dump ?
A core dump is a file generated by the kernel ( if allowed ) which contains the registers,
& all active pages of the program which has crashed.
From this file gdb will allow you to look at the registers & stack trace & memory of the
program as if it just crashed on your system, it is usually called core & created in the
current working directory.
This is very useful in that a customer can mail a core dump to a technical support department
& the technical support department can reconstruct what happened,
provided they have an identical copy of this program with debugging symbols compiled in &
the source base of this build is available.
In short it is far more useful than something like a crash log could ever hope to be.
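
As a rough sketch of the above: the small shell function below asks gdb for the
registers & stack trace of a core file without going interactive. The name
core_report is made up for illustration, it is not part of gdb or any standard
tool. It also assumes a gdb new enough to understand the -batch & -ex options,
which the gdb 4.18 used for the sample output in this document may well not be.

```shell
#!/bin/sh
# Sketch only: core_report is a hypothetical helper name.
# Assumes a gdb that accepts -batch and -ex.
# usage: core_report <victim program> [corefile]
core_report() {
    prog=$1
    core=${2:-core}             # the kernel usually names the dump "core"
    if [ ! -r "$prog" ] || [ ! -r "$core" ]; then
        echo "usage: core_report <victim program> [corefile]" >&2
        return 2
    fi
    # -batch makes gdb exit after running the -ex commands in order
    gdb -batch -ex "info registers" -ex "info stack" "$prog" "$core"
}
```

e.g. core_report ./myprog core > report.txt gives you something small to mail
around when the core file itself is too big, though it is no substitute for
the core file.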

In theory all that is missing to restart a core dumped program is a kernel patch which
will do the following.
1) Make a new kernel task structure
2) Reload all the dumped pages back into the kernel's memory management structures.
3) Do the required clock fixups
4) Get all files & network connections for the process back into an identical state ( really difficult ).
5) A few more difficult things I haven't thought of.



Why have I never seen one ?
Probably because you haven't used the command
ulimit -c unlimited in bash
to allow core dumps, now do
ulimit -a
to verify that the limit was accepted.

A sample core dump
To create this I'm going to do
ulimit -c unlimited
gdb
to launch gdb ( my victim app. ) now be bad & do the following from another
telnet/xterm session to the same machine
ps -aux | grep gdb
kill -SIGSEGV <gdb's pid>
or alternatively use killall -SIGSEGV gdb if you have the killall command.
Now look at the core dump ( victim program first, corefile second ).
./gdb ./gdb core
Displays the following
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "s390-ibm-linux"...
Core was generated by `./gdb'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libncurses.so.4...done.
Reading symbols from /lib/libm.so.6...done.
Reading symbols from /lib/libc.so.6...done.
Reading symbols from /lib/ld-linux.so.2...done.
#0  0x40126d1a in read () from /lib/libc.so.6
Setting up the environment for debugging gdb.
Breakpoint 1 at 0x4dc6f8: file utils.c, line 471.
Breakpoint 2 at 0x4d87a4: file top.c, line 2609.
(top-gdb) info stack
#0  0x40126d1a in read () from /lib/libc.so.6
#1  0x528f26 in rl_getc (stream=0x7ffffde8) at input.c:402
#2  0x528ed0 in rl_read_key () at input.c:381
#3  0x5167e6 in readline_internal_char () at readline.c:454
#4  0x5168ee in readline_internal_charloop () at readline.c:507
#5  0x51692c in readline_internal () at readline.c:521
#6  0x5164fe in readline (prompt=0x7ffff810 "\177ÿøx\177ÿ÷Ø\177ÿøxÀ")
    at readline.c:349
#7  0x4d7a8a in command_line_input (prompt=0x564420 "(gdb) ", repeat=1,
    annotation_suffix=0x4d6b44 "prompt") at top.c:2091
#8  0x4d6cf0 in command_loop () at top.c:1345
#9  0x4e25bc in main (argc=1, argv=0x7ffffdf4) at main.c:635


LDD
===
This is a program which lists the shared libraries which a program or library needs.
Note you also get the load addresses of the shared libraries which
help when using objdump --source.
e.g.
 ldd ./gdb
outputs
libncurses.so.4 => /usr/lib/libncurses.so.4 (0x40018000)
libm.so.6 => /lib/libm.so.6 (0x4005e000)
libc.so.6 => /lib/libc.so.6 (0x40084000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)


Debugging shared libraries
==========================
Most programs use shared libraries, however it can be very painful
when you single step instruction into a function like printf for the
first time & you end up in functions like _dl_runtime_resolve. This is
ld.so doing lazy binding. Lazy binding is a concept in ELF where
calls to shared library functions are not resolved to their final addresses
until the function is first called, great for startup time but a pain to debug.
To get around this either relink the program -static, or exit gdb, type
export LD_BIND_NOW=true ( this stops lazy binding ) & restart gdb on
the program in question.
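
If you would rather not leave LD_BIND_NOW exported in your shell, a minimal
sketch of a wrapper is shown here. The name with_bind_now is made up for
illustration; the only real interface used is the LD_BIND_NOW environment
variable itself, which ld.so honours when set to any non-empty value.

```shell
#!/bin/sh
# Sketch only: with_bind_now is a hypothetical helper name.
# Runs one command with lazy binding switched off, leaving the
# environment of the calling shell untouched.
# usage: with_bind_now <command> [args...]
with_bind_now() {
    env LD_BIND_NOW=1 "$@"
}

# e.g. with_bind_now gdb ./myprog
# gdb passes its environment on to the victim program, so all symbols are
# resolved at startup and stepi into printf no longer lands in
# _dl_runtime_resolve.
```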



Debugging modules
=================
As modules are dynamically loaded into the kernel their address can be
anywhere, to get around this use the -m option with insmod to emit a load
map which can be piped into a file if required.

The proc file system
====================
What is it ?
It is a filesystem created by the kernel with files which are created on demand
by the kernel if read, or can be used to modify kernel parameters,
it is a powerful concept.

e.g.

cat /proc/sys/net/ipv4/ip_forward
On my machine outputs
0
telling me ip_forwarding is not on, to switch it on I can do
echo 1 > /proc/sys/net/ipv4/ip_forward
cat it again
cat /proc/sys/net/ipv4/ip_forward
On my machine now outputs
1
IP forwarding is on.
There is a lot of useful info in here best found by going in & having a look around,
so I'll take you through some entries I consider important.

All the processes running on the machine have their own entry defined by
/proc/<pid>
So let's have a look at the init process
cd /proc/1

cat cmdline
emits
init [2]

cd /proc/1/fd
This contains numerical entries of all the open files,
some of these you can cat e.g.
stdout (1)

cat /proc/29/maps
on my machine emits

00400000-00478000 r-xp 00000000 5f:00 4103       /bin/bash
00478000-0047e000 rw-p 00077000 5f:00 4103       /bin/bash
0047e000-00492000 rwxp 00000000 00:00 0
40000000-40015000 r-xp 00000000 5f:00 14382      /lib/ld-2.1.2.so
40015000-40016000 rw-p 00014000 5f:00 14382      /lib/ld-2.1.2.so
40016000-40017000 rwxp 00000000 00:00 0
40017000-40018000 rw-p 00000000 00:00 0
40018000-4001b000 r-xp 00000000 5f:00 14435      /lib/libtermcap.so.2.0.8
4001b000-4001c000 rw-p 00002000 5f:00 14435      /lib/libtermcap.so.2.0.8
4001c000-4010d000 r-xp 00000000 5f:00 14387      /lib/libc-2.1.2.so
4010d000-40111000 rw-p 000f0000 5f:00 14387      /lib/libc-2.1.2.so
40111000-40114000 rw-p 00000000 00:00 0
40114000-4011e000 r-xp 00000000 5f:00 14408      /lib/libnss_files-2.1.2.so
4011e000-4011f000 rw-p 00009000 5f:00 14408      /lib/libnss_files-2.1.2.so
7fffd000-80000000 rwxp ffffe000 00:00 0


Showing us the shared libraries the process ( here bash, pid 29 ) uses, where they are in memory
& memory access permissions for each virtual memory area.

/proc/1/cwd is a softlink to the current working directory.
/proc/1/root is the root of the filesystem for this process.

/proc/1/mem is the memory of the process which you
can read & write to like a file.
strace uses this sometimes as it is a bit faster than the
rather inefficient ptrace interface for peeking at DATA.
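
A maps listing like the one above is handy for working out where a stray
address ( e.g. a PSW address or a backchain entry ) lives. The following is a
minimal sketch of doing that lookup in shell; map_for_addr is a made-up name,
it reads a maps listing on stdin & prints the first area containing the
address. It leans on the shell's own arithmetic, so addresses too big for a
signed shell integer ( possible on 64 bit ) are not handled.

```shell
#!/bin/sh
# Sketch only: map_for_addr is a hypothetical helper name.
# usage: map_for_addr <hex address> < /proc/<pid>/maps
map_for_addr() {
    addr=$(( 0x${1#0x} ))               # accept 400618 or 0x400618
    while read -r range perms offset dev inode path; do
        start=$(( 0x${range%-*} ))      # text before the '-'
        end=$(( 0x${range#*-} ))        # text after the '-'
        if [ "$addr" -ge "$start" ] && [ "$addr" -lt "$end" ]; then
            echo "$range $perms ${path:-[anonymous]}"
            return 0
        fi
    done
    return 1                            # address is not mapped
}
```

e.g. map_for_addr 0x400618 < /proc/29/maps would report the first /bin/bash
area in the listing above.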


cat status

Name:   init
State:  S (sleeping)
Pid:    1
PPid:   0
Uid:    0       0       0       0
Gid:    0       0       0       0
Groups:
VmSize:      408 kB
VmLck:         0 kB
VmRSS:       208 kB
VmData:       24 kB
VmStk:         8 kB
VmExe:       368 kB
VmLib:         0 kB
SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 7fffffffd7f0d8fc
SigCgt: 00000000280b2603
CapInh: 00000000fffffeff
CapPrm: 00000000ffffffff
CapEff: 00000000fffffeff

User PSW:    070de000 80414146
task: 004b6000 tss: 004b62d8 ksp: 004b7ca8 pt_regs: 004b7f68
User GPRS:
00000400  00000000  0000000b  7ffffa90
00000000  00000000  00000000  0045d9f4
0045cafc  7ffffa90  7fffff18  0045cb08
00010400  804039e8  80403af8  7ffff8b0
User ACRS:
00000000  00000000  00000000  00000000
00000001  00000000  00000000  00000000
00000000  00000000  00000000  00000000
00000000  00000000  00000000  00000000
Kernel BackChain  CallChain    BackChain  CallChain
       004b7ca8   8002bd0c     004b7d18   8002b92c
       004b7db8   8005cd50     004b7e38   8005d12a
       004b7f08   80019114
Showing among other things memory usage & status of some signals &
the process's registers from the kernel task structure
as well as a backchain which may be useful if a process crashes
in the kernel for some unknown reason.

Some driver debugging techniques
================================
debug feature
-------------
Some of our drivers now support a "debug feature" in
/proc/s390dbf, see s390dbf.txt in the linux/Documentation directory
for more info.
e.g.
to switch on the lcs "debug feature"
echo 5 > /proc/s390dbf/lcs/level
& then after the error has occurred.
cat /proc/s390dbf/lcs/sprintf >/logfile
the logfile now contains some information which may help
tech support resolve a problem in the field.
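
The two steps above can be wrapped up as a pair of shell functions, sketched
below. The function names & the optional directory argument are made up for
illustration ( the directory argument only exists so the functions can be
tried against a scratch directory ); the level & sprintf files are the ones
named above, & nothing else about the s390dbf layout is assumed.

```shell
#!/bin/sh
# Sketch only: dbf_set_level and dbf_save_log are hypothetical helper names.

# usage: dbf_set_level <driver> <level> [debug feature directory]
dbf_set_level() {
    echo "$2" > "${3:-/proc/s390dbf}/$1/level"
}

# usage: dbf_save_log <driver> <logfile> [debug feature directory]
dbf_save_log() {
    cat "${3:-/proc/s390dbf}/$1/sprintf" > "$2"
}

# e.g.
#   dbf_set_level lcs 5
#   ... reproduce the error ...
#   dbf_save_log lcs /logfile
```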



high level debugging network drivers
------------------------------------
ifconfig is quite a useful command,
it gives the current state of network drivers.

If you suspect your network device driver is dead
one way to check is to type
ifconfig <network device>
e.g. tr0
You should see something like
tr0       Link encap:16/4 Mbps Token Ring (New)  HWaddr 00:04:AC:20:8E:48
          inet addr:9.164.185.132  Bcast:9.164.191.255  Mask:255.255.224.0
          UP BROADCAST RUNNING MULTICAST  MTU:2000  Metric:1
          RX packets:246134 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100

if the device doesn't say UP
try
/etc/rc.d/init.d/network start
( this starts the network stack & hopefully calls ifconfig tr0 up ).
ifconfig looks at the output of /proc/net/dev & presents it in a more presentable form.
Now ping the device from a machine in the same subnet.
If the RX packets count & TX packets count don't increment you probably
have problems.
Next
cat /proc/net/arp
Do you see any hardware addresses in the cache ? If not you may have problems.
Next try
ping -c 5 <broadcast_addr> i.e. the Bcast field above in the output of
ifconfig. Do you see any replies from machines other than the local machine ?
If not you may have problems. Also if the TX packets count in ifconfig
hasn't incremented either, you have serious problems in your driver
( e.g. the txbusy field of the network device being stuck on )
or you may have multiple network devices connected.


chandev
-------
There is a new device layer for channel devices, some
drivers e.g. lcs are registered with this layer.
If the device uses the channel device layer you'll be
able to find what interrupts it uses & the current state
of the device.
See the manpage chandev.8 & type cat /proc/chandev for more info.
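
Going back to the packet counts used in the network driver checks above: as
ifconfig only reformats /proc/net/dev, the counts can be pulled out directly &
compared before & after a ping. The following is a minimal sketch; net_counts
is a made-up name, & the field positions assume the usual /proc/net/dev layout
( receive bytes & packets first, transmit bytes & packets in the 9th & 10th
columns after the device name ), so check the header lines of the file on
your kernel.

```shell
#!/bin/sh
# Sketch only: net_counts is a hypothetical helper name.
# usage: net_counts <network device> [file, default /proc/net/dev]
net_counts() {
    awk -v dev="$1" '
        { sub(/:/, ": ") }      # the name can be glued to the first count
        $1 == (dev ":") {
            print "RX packets:" $3, "TX packets:" $11
            found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/net/dev}"
}

# e.g. run  net_counts tr0  before & after pinging the device;
# if neither number moves the driver is probably not passing traffic.
```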



Starting points for debugging scripting languages etc.
======================================================

bash/sh

bash -x <scriptname>
e.g. bash -x /usr/bin/bashbug
displays the following lines as it executes them.
+ MACHINE=i586
+ OS=linux-gnu
+ CC=gcc
+ CFLAGS= -DPROGRAM='bash' -DHOSTTYPE='i586' -DOSTYPE='linux-gnu' -DMACHTYPE='i586-pc-linux-gnu' -DSHELL -DHAVE_CONFIG_H   -I. -I. -I./lib -O2 -pipe
+ RELEASE=2.01
+ PATCHLEVEL=1
+ RELSTATUS=release
+ MACHTYPE=i586-pc-linux-gnu

perl -d <scriptname> runs the perlscript in a fully interactive debugger
( like gdb ).
Type 'h' in the debugger for help.

for debugging java type
jdb <filename> another fully interactive gdb style debugger.
& type ? in the debugger for help.



SysRq
=====
This is now supported by linux for s/390 & z/Architecture.
To enable it compile the kernel with
Kernel Hacking -> Magic SysRq Key Enabled
then type
echo "1" > /proc/sys/kernel/sysrq
also type
echo "8" >/proc/sys/kernel/printk
To make printk output go to console.
On 390 all commands are prefixed with
^-
e.g.
^-t will show tasks.
^-? or some unknown command will display help.
The sysrq key reading is very picky ( I have to type the keys in an
xterm session & paste them into the x3270 console )
& it may be wise to predefine the keys as described in the VM hints above.

This is particularly useful for syncing disks, unmounting & rebooting
if the machine gets partially hung.

Read Documentation/sysrq.txt for more info

References:
===========
Enterprise Systems Architecture Reference Summary
Enterprise Systems Architecture Principles of Operation
Hartmut Penner's s390 stack frame sheet.
IBM Mainframe Channel Attachment a technology brief from a CISCO webpage
Various bits of man & info pages of Linux.
Linux & GDB source.
Various info & man pages.
CMS Help on tracing commands.
Linux for s/390 Elf Application Binary Interface
Linux for z/Series Elf Application Binary Interface ( Both Highly Recommended )
z/Architecture Principles of Operation SA22-7832-00
Enterprise Systems Architecture/390 Reference Summary SA22-7209-01 & the
Enterprise Systems Architecture/390 Principles of Operation SA22-7201-05

Special Thanks
==============
Special thanks to Neale Ferguson who maintains a much
prettier HTML version of this page at
http://linuxvm.org/penguinvm/
& to Bob Grainger, Stefan Bader & others for reporting bugs.