This file contains brief information about the SCSI tape driver.
The driver is currently maintained by Kai Mäkisara (email
Kai.Makisara@kolumbus.fi)

Last modified: Mon Mar 7 21:14:44 2005 by kai.makisara


BASICS

The driver is generic, i.e., it does not contain any code tailored
to any specific tape drive. The tape parameters can be specified with
one of the following three methods:

1. Each user can specify the tape parameters he/she wants to use
directly with ioctls. This is administratively a very simple and
flexible method and applicable to single-user workstations. However,
in a multiuser environment the next user finds the tape parameters in
the state the previous user left them.

2. The system manager (root) can define default values for some tape
parameters, like block size and density, using the MTSETDRVBUFFER ioctl.
These parameters can be programmed to come into effect either when a
new tape is loaded into the drive or if writing begins at the
beginning of the tape. The second alternative (defaults take effect
when writing begins at the beginning of the tape) is applicable if the
tape drive performs auto-detection of the tape format well (like some
QIC-drives). The result is that any tape can be read, writing can be
continued using the existing format, and the default format is used if
the tape is rewritten from the beginning (or a new tape is written
for the first time). The first alternative (defaults take effect when
a new tape is loaded) is applicable if the drive does not perform
auto-detection well enough and there is a single "sensible" mode for
the device. An example is a DAT drive that is used only in variable
block mode (I don't know if this is sensible or not :-).

The user can override the parameters defined by the system
manager. The changes persist until the defaults again come into
effect.

3. By default, up to four modes can be defined and selected using the minor
number (bits 5 and 6). The number of modes can be changed by changing
ST_NBR_MODE_BITS in st.h.
Mode 0 corresponds to the defaults discussed
above. Additional modes are dormant until they are defined by the
system manager (root). When specification of a new mode is started,
the configuration of mode 0 is used to provide a starting point for
definition of the new mode.

Using the modes allows the system manager to give the users choices
over some of the buffering parameters not directly accessible to the
users (buffered and asynchronous writes). The modes also allow choices
between formats in multi-tape operations (the explicitly overridden
parameters are reset when a new tape is loaded).

If more than one mode is used, all modes should contain definitions
for the same set of parameters.

Many Unices contain internal tables that associate different modes to
supported devices. The Linux SCSI tape driver does not contain such
tables (and will not do that in the future). Instead, a utility
program can be made that fetches the inquiry data sent by the device,
scans its database, and sets up the modes using the ioctls. Another
alternative is to make a small script that uses mt to set the defaults
tailored to the system.

The driver supports fixed and variable block size (within buffer
limits). Both the auto-rewind (minor equals device number) and
non-rewind devices (minor is 128 + device number) are implemented.

In variable block mode, the byte count in write() determines the size
of the physical block on tape. When reading, the drive reads the next
tape block and returns to the user the data if the read() byte count
is at least the block size. Otherwise, error ENOMEM is returned.

In fixed block mode, the data transfer between the drive and the
driver is in multiples of the block size. The write() byte count must
be a multiple of the block size. This is not required when reading but
may be advisable for portability.
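
The byte-count rules for the two block modes can be modelled in a few
lines. The following is an illustrative Python sketch, not driver code:
the helper names variable_mode_read, fixed_mode_write_ok and
pad_to_block are hypothetical, and zero-padding a short final write is
only one possible choice made by the application, not something the
driver does.

```python
import errno


def variable_mode_read(request_bytes, tape_block_bytes):
    """Model read() in variable block mode.

    Returns the number of bytes delivered, or raises OSError(ENOMEM)
    when the caller's byte count is smaller than the next tape block.
    """
    if request_bytes < tape_block_bytes:
        raise OSError(errno.ENOMEM, "read() byte count smaller than tape block")
    return tape_block_bytes


def fixed_mode_write_ok(write_bytes, block_size):
    """Return True if write_bytes is a multiple of the fixed block size."""
    return write_bytes % block_size == 0


def pad_to_block(data, block_size):
    """Pad data with zero bytes up to the next multiple of block_size.

    Assumption: the application accepts trailing zero bytes in the
    last block. This is an application-level decision.
    """
    remainder = len(data) % block_size
    if remainder == 0:
        return data
    return data + bytes(block_size - remainder)
```

For instance, a 1000-byte buffer is not a valid write() size with a
512-byte fixed block; after pad_to_block() it is 1024 bytes and passes
the fixed_mode_write_ok() check.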

Support is provided for changing the tape partition and partitioning
of the tape with one or two partitions. By default support for
partitioned tape is disabled for each driver and it can be enabled
with the ioctl MTSETDRVBUFFER.

By default the driver writes one filemark when the device is closed after
writing and the last operation has been a write. Two filemarks can be
optionally written. In both cases end of data is signified by
returning zero bytes for two consecutive reads.

If rewind, offline, bsf, or seek is done and the previous tape operation
was a write, a filemark is written before moving the tape.

The compile options are defined in the file linux/drivers/scsi/st_options.h.

4. If the open option O_NONBLOCK is used, open succeeds even if the
drive is not ready. If O_NONBLOCK is not used, the driver waits for
the drive to become ready. If this does not happen in ST_BLOCK_SECONDS
seconds, open fails with the errno value EIO. With O_NONBLOCK the
device can be opened for writing even if there is a write protected
tape in the drive (commands trying to write something return error if
attempted).


MINOR NUMBERS

The tape driver currently supports 128 drives by default. This number
can be increased by editing st.h and recompiling the driver if
necessary. The upper limit is 2^17 drives if 4 modes for each drive
are used.

The minor numbers consist of the following bit fields:

    dev_upper    non-rew     mode    dev-lower
    20 -  8         7        6 5      4      0

The non-rewind bit is always bit 7 (the uppermost bit in the lowermost
byte). The bits defining the mode are below the non-rewind bit. The
remaining bits define the tape device number. This numbering is
backward compatible with the numbering used when the minor number was
only 8 bits wide.
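
The bit layout above can be made concrete with a small sketch. This is
illustrative Python, not code from the driver; the function names
encode_minor and decode_minor are hypothetical, and the value of two
mode bits is inferred from the default of four modes described above.

```python
MODE_BITS = 2                    # assumed default ST_NBR_MODE_BITS (four modes)
MODE_SHIFT = 7 - MODE_BITS       # mode field sits just below the non-rewind bit
NON_REWIND_BIT = 1 << 7          # bit 7 selects the non-rewind device


def encode_minor(dev, mode=0, non_rewind=False):
    """Build a minor number from tape device number, mode and rewind flag."""
    dev_lower = dev & ((1 << MODE_SHIFT) - 1)   # bits 0..4
    dev_upper = dev >> MODE_SHIFT               # goes to bit 8 and up
    minor = (dev_upper << 8) | (mode << MODE_SHIFT) | dev_lower
    if non_rewind:
        minor |= NON_REWIND_BIT
    return minor


def decode_minor(minor):
    """Split a minor number into (device number, mode, non_rewind)."""
    dev_lower = minor & ((1 << MODE_SHIFT) - 1)
    mode = (minor >> MODE_SHIFT) & ((1 << MODE_BITS) - 1)
    non_rewind = bool(minor & NON_REWIND_BIT)
    dev = ((minor >> 8) << MODE_SHIFT) | dev_lower
    return dev, mode, non_rewind
```

With these definitions, the non-rewind device of drive 0 in mode 0 is
minor 128, which matches the old 8-bit numbering, and drive 32 is the
first one that needs the upper device bits (minor 256).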


SYSFS SUPPORT

The driver creates the directory /sys/class/scsi_tape and populates it with
directories corresponding to the existing tape devices. There are autorewind
and non-rewind entries for each mode. The names are stxy and nstxy, where x
is the tape number and y a character corresponding to the mode (none, l, m,
a). For example, the directories for the first tape device are (assuming four
modes): st0  nst0  st0l  nst0l  st0m  nst0m  st0a  nst0a.

Each directory contains the entries: default_blksize  default_compression
default_density  defined  dev  device  driver. The file 'defined' contains 1
if the mode is defined and zero if not defined. The files 'default_*' contain
the defaults set by the user. The value -1 means the default is not set. The
file 'dev' contains the device numbers corresponding to this device. The links
'device' and 'driver' point to the SCSI device and driver entries.

A link named 'tape' is made from the SCSI device directory to the class
directory corresponding to the mode 0 auto-rewind device (e.g., st0).


BSD AND SYS V SEMANTICS

The user can choose between these two behaviours of the tape driver by
defining the value of the symbol ST_SYSV. The semantics differ when a
file being read is closed. The BSD semantics leaves the tape where it
currently is whereas the SYS V semantics moves the tape past the next
filemark unless the filemark has just been crossed.

The default is BSD semantics.


BUFFERING

The driver tries to do transfers directly to/from user space. If this
is not possible, a driver buffer allocated at run-time is used. If
direct i/o is not possible for the whole transfer, the driver buffer
is used (i.e., bounce buffers for individual pages are not
used).
Direct i/o can be impossible for several reasons, e.g.:
- one or more pages are at addresses not reachable by the HBA
- the number of pages in the transfer exceeds the number of
  scatter/gather segments permitted by the HBA
- one or more pages can't be locked into memory (should not happen in
  any reasonable situation)

The size of the driver buffers is always at least one tape block. In fixed
block mode, the minimum buffer size is defined (in 1024 byte units) by
ST_FIXED_BUFFER_BLOCKS. With small block size this allows buffering of
several blocks and using one SCSI read or write to transfer all of the
blocks. Buffering of data across write calls in fixed block mode is
allowed if ST_BUFFER_WRITES is non-zero and direct i/o is not used.
Buffer allocation uses chunks of memory having sizes 2^n * (page
size). Because of this the actual buffer size may be larger than the
minimum allowable buffer size.

NOTE that if direct i/o is used, the small writes are not buffered. This may
cause a surprise when moving from 2.4. There, small writes (e.g., tar without
-b option) may have had good throughput but this is not true any more with
2.6. Direct i/o can be turned off to solve this problem but a better solution
is to use bigger write() byte counts (e.g., tar -b 64).

Asynchronous writing. Writing the buffer contents to the tape is
started and the write call returns immediately. The status is checked
at the next tape operation. Asynchronous writes are not done with
direct i/o and not in fixed block mode.

Buffered writes and asynchronous writes may in some rare cases cause
problems in multivolume operations if there is not enough space on the
tape after the early-warning mark to flush the driver buffer.

Read ahead for fixed block mode (ST_READ_AHEAD). Filling the buffer is
attempted even if the user does not want to get all of the data at
this read command.
Should be disabled for those drives that don't like
a filemark to truncate a read request or that don't like backspacing.

Scatter/gather buffers (buffers that consist of chunks non-contiguous
in the physical memory) are used if contiguous buffers can't be
allocated. To support all SCSI adapters (including those not
supporting scatter/gather), buffer allocation uses the following
three kinds of chunks:
1. The initial segment that is used for all SCSI adapters including
those not supporting scatter/gather. The size of this buffer will be
(PAGE_SIZE << ST_FIRST_ORDER) bytes if the system can give a chunk of
this size (and it is not larger than the buffer size specified by
ST_BUFFER_BLOCKS). If this size is not available, the driver halves
the size and tries again until the size of one page. The default
settings in st_options.h make the driver try to allocate all of the
buffer as one chunk.
2. The scatter/gather segments to fill the specified buffer size are
allocated so that as many segments as possible are used but the number
of segments does not exceed ST_FIRST_SG.
3. The remaining segments between ST_MAX_SG (or the module parameter
max_sg_segs) and the number of segments used in phases 1 and 2
are used to extend the buffer at run-time if this is necessary. The
number of scatter/gather segments allowed for the SCSI adapter is not
exceeded if it is smaller than the maximum number of scatter/gather
segments specified. If the maximum number allowed for the SCSI adapter
is smaller than the number of segments used in phases 1 and 2,
extending the buffer will always fail.


EOM BEHAVIOUR WHEN WRITING

When the end of medium early warning is encountered, the current write
is finished and the number of bytes is returned. The next write
returns -1 and errno is set to ENOSPC.
To enable writing a trailer,
the next write is allowed to proceed and, if successful, the number of
bytes is returned. After this, -1 and the number of bytes are
alternately returned until the physical end of medium (or some other
error) is encountered.


MODULE PARAMETERS

The buffer size, write threshold, and the maximum number of allocated buffers
are configurable when the driver is loaded as a module. The keywords are:

buffer_kbs=xxx             the buffer size for fixed block mode is set
                           to xxx kilobytes
write_threshold_kbs=xxx    the write threshold in kilobytes set to xxx
max_sg_segs=xxx            the maximum number of scatter/gather
                           segments
try_direct_io=x            try direct transfer between user buffer and
                           tape drive if this is non-zero

Note that if the buffer size is changed but the write threshold is not
set, the write threshold is set to the new buffer size - 2 kB.


BOOT TIME CONFIGURATION

If the driver is compiled into the kernel, the same parameters can be
also set using, e.g., the LILO command line. The preferred syntax is
to use the same keyword used when loading as module but prepended
with 'st.'. For instance, to set the maximum number of scatter/gather
segments, the parameter 'st.max_sg_segs=xx' should be used (xx is the
number of scatter/gather segments).

For compatibility, the old syntax from early 2.5 and 2.4 kernel
versions is supported. The same keywords can be used as when loading
the driver as module. If several parameters are set, the keyword-value
pairs are separated with a comma (no spaces allowed). A colon can be
used instead of the equal mark. The definition is prepended by the
string st=.
Here is an example:

        st=buffer_kbs:64,write_threshold_kbs:60

The following syntax used by the old kernel versions is also supported:

        st=aa[,bb[,dd]]

where
  aa is the buffer size for fixed block mode in 1024 byte units
  bb is the write threshold in 1024 byte units
  dd is the maximum number of scatter/gather segments


IOCTLS

The tape is positioned and the drive parameters are set with ioctls
defined in mtio.h. The tape control program 'mt' uses these ioctls. Try
to find an mt that supports all of the Linux SCSI tape ioctls and
opens the device for writing if the tape contents will be modified
(look for a package mt-st* from the Linux ftp sites; the GNU mt does
not open for writing for, e.g., erase).

The supported ioctls are:

The following use the structure mtop:

MTFSF   Space forward over count filemarks. Tape positioned after filemark.
MTFSFM  As above but tape positioned before filemark.
MTBSF   Space backward over count filemarks. Tape positioned before
        filemark.
MTBSFM  As above but tape positioned after filemark.
MTFSR   Space forward over count records.
MTBSR   Space backward over count records.
MTFSS   Space forward over count setmarks.
MTBSS   Space backward over count setmarks.
MTWEOF  Write count filemarks.
MTWSM   Write count setmarks.
MTREW   Rewind tape.
MTOFFL  Set device off line (often rewind plus eject).
MTNOP   Do nothing except flush the buffers.
MTRETEN Re-tension tape.
MTEOM   Space to end of recorded data.
MTERASE Erase tape. If the argument is zero, the short erase command
        is used. The long erase command is used with all other values
        of the argument.
MTSEEK  Seek to tape block count. Uses Tandberg-compatible seek (QFA)
        for SCSI-1 drives and SCSI-2 seek for SCSI-2 drives. The file and
        block numbers in the status are not valid after a seek.
MTSETBLK Set the drive block size.
        Setting to zero sets the drive into
        variable block mode (if applicable).
MTSETDENSITY Sets the drive density code to arg. See drive
        documentation for available codes.
MTLOCK and MTUNLOCK Explicitly lock/unlock the tape drive door.
MTLOAD and MTUNLOAD Explicitly load and unload the tape. If the
        command argument x is between MT_ST_HPLOADER_OFFSET + 1 and
        MT_ST_HPLOADER_OFFSET + 6, the number x is sent to the
        drive with the command and it selects the tape slot to use in
        the HP C1553A changer.
MTCOMPRESSION Sets compressing or uncompressing drive mode using the
        SCSI mode page 15. Note that some drives use other methods for
        control of compression. Some drives (like the Exabytes) use
        density codes for compression control. Some drives use another
        mode page but this page has not been implemented in the
        driver. Some drives without compression capability will accept
        any compression mode without error.
MTSETPART Moves the tape to the partition given by the argument at the
        next tape operation. The block at which the tape is positioned
        is the block where the tape was previously positioned in the
        new active partition unless the next tape operation is
        MTSEEK. In this case the tape is moved directly to the block
        specified by MTSEEK. MTSETPART is inactive unless
        MT_ST_CAN_PARTITIONS is set.
MTMKPART Formats the tape with one partition (argument zero) or two
        partitions (the argument gives in megabytes the size of
        partition 1 that is physically the first partition of the
        tape). The drive has to support partitions with size specified
        by the initiator. Inactive unless MT_ST_CAN_PARTITIONS is set.
MTSETDRVBUFFER
        Is used for several purposes. The command is obtained from count
        with mask MT_ST_OPTIONS, the low order bits are used as argument.
        This command is only allowed for the superuser (root). The
        subcommands are:
        0
           The drive buffer option is set to the argument.
           Zero means
           no buffering.
        MT_ST_BOOLEANS
           Sets the buffering options. The bits are the new states
           (enabled/disabled) of the following options (the
           parentheses specify whether the option is global or
           can be specified differently for each mode):
             MT_ST_BUFFER_WRITES write buffering (mode)
             MT_ST_ASYNC_WRITES asynchronous writes (mode)
             MT_ST_READ_AHEAD  read ahead (mode)
             MT_ST_TWO_FM writing of two filemarks (global)
             MT_ST_FAST_EOM using the SCSI spacing to EOD (global)
             MT_ST_AUTO_LOCK automatic locking of the drive door (global)
             MT_ST_DEF_WRITES the defaults are meant only for writes (mode)
             MT_ST_CAN_BSR backspacing over more than one record can
                be used for repositioning the tape (global)
             MT_ST_NO_BLKLIMS the driver does not ask the block limits
                from the drive (block size can be changed only to
                variable) (global)
             MT_ST_CAN_PARTITIONS enables support for partitioned
                tapes (global)
             MT_ST_SCSI2LOGICAL the logical block number is used in
                the MTSEEK and MTIOCPOS for SCSI-2 drives instead of
                the device dependent address. It is recommended to set
                this flag unless there are tapes using the device
                dependent address (from the old times) (global)
             MT_ST_SYSV sets the SYSV semantics (mode)
             MT_ST_NOWAIT enables immediate mode (i.e., don't wait for
                the command to finish) for some commands (e.g., rewind)
             MT_ST_DEBUGGING debugging (global; debugging must be
                compiled into the driver)
        MT_ST_SETBOOLEANS
        MT_ST_CLEARBOOLEANS
           Sets or clears the option bits.
        MT_ST_WRITE_THRESHOLD
           Sets the write threshold for this device to kilobytes
           specified by the lowest bits.
        MT_ST_DEF_BLKSIZE
           Defines the default block size set automatically. Value
           0xffffff means that the default is not used any more.
        MT_ST_DEF_DENSITY
        MT_ST_DEF_DRVBUFFER
           Used to set or clear the density (8 bits), and drive buffer
           state (3 bits).
           If the value is MT_ST_CLEAR_DEFAULT
           (0xfffff) the default will not be used any more. Otherwise
           the lowermost bits of the value contain the new value of
           the parameter.
        MT_ST_DEF_COMPRESSION
           The compression default will not be used if the value of
           the lowermost byte is 0xff. Otherwise the lowermost bit
           contains the new default. If the bits 8-15 are set to a
           non-zero number, and this number is not 0xff, the number is
           used as the compression algorithm. The value
           MT_ST_CLEAR_DEFAULT can be used to clear the compression
           default.
        MT_ST_SET_TIMEOUT
           Set the normal timeout in seconds for this device. The
           default is 900 seconds (15 minutes). The timeout should be
           long enough for the retries done by the device while
           reading/writing.
        MT_ST_SET_LONG_TIMEOUT
           Set the long timeout that is used for operations that are
           known to take a long time. The default is 14000 seconds
           (3.9 hours). For erase this value is further multiplied by
           eight.
        MT_ST_SET_CLN
           Set the cleaning request interpretation parameters using
           the lowest 24 bits of the argument. The driver can set the
           generic status bit GMT_CLN if a cleaning request bit pattern
           is found from the extended sense data. Many drives set one or
           more bits in the extended sense data when the drive needs
           cleaning. The bits are device-dependent. The driver is
           given the number of the sense data byte (the lowest eight
           bits of the argument; must be >= 18 (values 1 - 17
           reserved) and <= the maximum requested sense data size),
           a mask to select the relevant bits (the bits 9-16), and the
           bit pattern (bits 17-23). If the bit pattern is zero, one
           or more bits under the mask indicate cleaning request. If
           the pattern is non-zero, the pattern must match the masked
           sense data byte.

           (The cleaning bit is set if the additional sense code and
           qualifier 00h 17h are seen regardless of the setting of
           MT_ST_SET_CLN.)

The following ioctl uses the structure mtpos:
MTIOCPOS Reads the current position from the drive. Uses
        Tandberg-compatible QFA for SCSI-1 drives and the SCSI-2
        command for the SCSI-2 drives.

The following ioctl uses the structure mtget to return the status:
MTIOCGET Returns some status information.
        The file number and block number within file are returned. The
        block is -1 when it can't be determined (e.g., after MTBSF).
        The drive type is either MTISSCSI1 or MTISSCSI2.
        The number of recovered errors since the previous status call
        is stored in the lower word of the field mt_erreg.
        The current block size and the density code are stored in the field
        mt_dsreg (shifts for the subfields are MT_ST_BLKSIZE_SHIFT and
        MT_ST_DENSITY_SHIFT).
        The GMT_xxx status bits reflect the drive status. GMT_DR_OPEN
        is set if there is no tape in the drive. GMT_EOD means either
        end of recorded data or end of tape. GMT_EOT means end of tape.


MISCELLANEOUS COMPILE OPTIONS

The recovered write errors are considered fatal if ST_RECOVERED_WRITE_FATAL
is defined.

The maximum number of tape devices is determined by the define
ST_MAX_TAPES. If more tapes are detected at driver initialization, the
maximum is adjusted accordingly.

Immediate return from tape positioning SCSI commands can be enabled by
defining ST_NOWAIT. If this is defined, the user should take care that
the next tape operation is not started before the previous one has
finished. The drives and SCSI adapters should handle this condition
gracefully, but some drive/adapter combinations are known to hang the
SCSI bus in this case.

The MTEOM command is by default implemented as spacing over 32767
filemarks.
With this method the file number in the status is
correct. The user can request using direct spacing to EOD by setting
ST_FAST_EOM 1 (or using the MT_ST_OPTIONS ioctl). In this case the file
number will be invalid.

When using read ahead or buffered writes the position within the file
may not be correct after the file is closed (correct position may
require backspacing over more than one record). The correct position
within file can be obtained if ST_IN_FILE_POS is defined at compile
time or the MT_ST_CAN_BSR bit is set for the drive with an ioctl.
(The driver always backs over a filemark crossed by read ahead if the
user does not request data that far.)


DEBUGGING HINTS

To enable debugging messages, edit st.c and #define DEBUG 1. As seen
above, debugging can be switched off with an ioctl if debugging is
compiled into the driver. The debugging output is not voluminous.

If the tape seems to hang, I would be very interested to hear where
the driver is waiting. With the command 'ps -l' you can see the state
of the process using the tape. If the state is D, the process is
waiting for something. The field WCHAN tells where the driver is
waiting. If you have the current System.map in the correct place (in
/boot for the procps I use) or have updated /etc/psdatabase (for kmem
ps), ps writes the function name in the WCHAN field. If not, you have
to look up the function from System.map.

Note also that the timeouts are very long compared to most other
drivers. This means that the Linux driver may appear hung although the
real reason is that the tape firmware has got confused.
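
To put "very long" into numbers, the default timeouts given in the
IOCTLS section (MT_ST_SET_TIMEOUT and MT_ST_SET_LONG_TIMEOUT) can be
turned into worst-case waits. This is only an arithmetic sketch in
Python; the function names and the "normal"/"long"/"erase" labels are
hypothetical, and which commands use the long timeout is not listed
here, so the caller supplies the class.

```python
NORMAL_TIMEOUT_S = 900      # default normal timeout (from this document)
LONG_TIMEOUT_S = 14000      # default long timeout (from this document)
ERASE_MULTIPLIER = 8        # erase uses the long timeout times eight


def default_timeout(operation_class):
    """Return the default timeout in seconds for a class of operation.

    operation_class is "normal", "long" or "erase"; these labels are
    illustrative and not names used by the driver.
    """
    if operation_class == "erase":
        return LONG_TIMEOUT_S * ERASE_MULTIPLIER
    if operation_class == "long":
        return LONG_TIMEOUT_S
    return NORMAL_TIMEOUT_S


def as_hours(seconds):
    """Convert seconds to hours, rounded to one decimal place."""
    return round(seconds / 3600, 1)
```

With the defaults, an erase may legitimately block for 112000 seconds,
i.e. about 31 hours, before the driver gives up, so a process sitting
in state D for a long time does not by itself prove that the driver is
stuck.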