Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

st: implement tape statistics

This patch implements tape statistics in the st module via
sysfs. Currently no statistics are available for tape I/O, and there
is no easy way to reuse the block layer statistics for tape,
as tape is a character device and does not perform I/O in
sector-sized chunks (the size of the data written to tape
can change). For tapes we also need extra stats related to
things like tape movement (via other I/O).

There have been multiple end users requesting statistics
including AT&T (and some HP customers who have not given
permission to be named). It is impossible for them
to investigate any issues related to tape performance
in a non-invasive way.

[jejb: eliminate PRId64]
Signed-off-by: Shane Seymour <shane.seymour@hp.com>
Tested-by: Shane Seymour <shane.seymour@hp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>

Authored by Seymour, Shane M and committed by James Bottomley
05545c92 ba929992

+461 -1
+109
Documentation/ABI/testing/sysfs-class-scsi_tape
+What:		/sys/class/scsi_tape/*/stats/in_flight
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		Show the number of I/Os currently in-flight between the st
+		module and the SCSI mid-layer.
+Users:
+
+
+What:		/sys/class/scsi_tape/*/stats/io_ns
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		Shows the total amount of time spent waiting for all I/O
+		to and from the tape drive to complete. This includes all
+		reads, writes, and other SCSI commands issued to the tape
+		drive. An example of other SCSI commands would be tape
+		movement such as a rewind when a rewind tape device is
+		closed. This item is measured in nanoseconds.
+
+		To determine the amount of time spent waiting for other I/O
+		to complete subtract read_ns and write_ns from this value.
+Users:
+
+
+What:		/sys/class/scsi_tape/*/stats/other_cnt
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		The number of I/O requests issued to the tape drive other
+		than SCSI read/write requests.
+Users:
+
+
+What:		/sys/class/scsi_tape/*/stats/read_byte_cnt
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		Shows the total number of bytes requested from the tape drive.
+		This value is presented in bytes because tape drives support
+		variable length block sizes.
+Users:
+
+
+What:		/sys/class/scsi_tape/*/stats/read_cnt
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		Shows the total number of read requests issued to the tape
+		drive.
+Users:
+
+
+What:		/sys/class/scsi_tape/*/stats/read_ns
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		Shows the total amount of time in nanoseconds waiting for
+		read I/O requests to complete.
+Users:
+
+
+What:		/sys/class/scsi_tape/*/stats/write_byte_cnt
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		Shows the total number of bytes written to the tape drive.
+		This value is presented in bytes because tape drives support
+		variable length block sizes.
+Users:
+
+
+What:		/sys/class/scsi_tape/*/stats/write_cnt
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		Shows the total number of write requests issued to the tape
+		drive.
+Users:
+
+
+What:		/sys/class/scsi_tape/*/stats/write_ms
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		Shows the total amount of time in nanoseconds waiting for
+		write I/O requests to complete.
+Users:
+
+
+What:		/sys/class/scsi_tape/*/stats/resid_cnt
+Date:		Apr 2015
+KernelVersion:	4.2
+Contact:	Shane Seymour <shane.seymour@hp.com>
+Description:
+		Shows the number of times we found that a residual >0
+		was found when the SCSI midlayer indicated that there was
+		an error. For reads this may be a case of someone issuing
+		reads greater than the block size.
+Users:
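The io_ns entry above says that the time spent on other commands is obtained by subtracting read_ns and write_ns from io_ns. A minimal user-space sketch of that arithmetic follows. The function name st_other_ns is hypothetical and not part of the patch, and the three inputs are assumed to have been read from the corresponding sysfs files at roughly the same time.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): nanoseconds spent waiting
 * on commands other than reads and writes, computed as
 * io_ns - read_ns - write_ns.
 *
 * The three counters are not sampled as one atomic snapshot, so a sample
 * taken while an I/O completes could be inconsistent; a negative result
 * is clamped to 0 here as an assumption of this sketch.
 */
int64_t st_other_ns(int64_t io_ns, int64_t read_ns, int64_t write_ns)
{
	int64_t other = io_ns - read_ns - write_ns;

	return other < 0 ? 0 : other;
}
```

Clamping is only one possible policy; a monitoring tool might prefer to discard the sample and read the files again.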
+59
Documentation/scsi/st.txt
···
 directory corresponding to the mode 0 auto-rewind device (e.g., st0).


+SYSFS AND STATISTICS FOR TAPE DEVICES
+
+The st driver maintains statistics for tape drives inside the sysfs filesystem.
+The following method can be used to locate the statistics that are
+available (assuming that sysfs is mounted at /sys):
+
+1. Use opendir(3) on the directory /sys/class/scsi_tape
+2. Use readdir(3) to read the directory contents
+3. Use regcomp(3)/regexec(3) to match directory entries to the extended
+        regular expression "^st[0-9]+$"
+4. Access the statistics from the /sys/class/scsi_tape/<match>/stats
+        directory (where <match> is a directory entry from /sys/class/scsi_tape
+        that matched the extended regular expression)
+
+The reason for using this approach is that all the character devices
+pointing to the same tape drive use the same statistics. That means
+that st0 would have the same statistics as nst0.
+
+The directory contains the following statistics files:
+
+1. in_flight - The number of I/Os currently outstanding to this device.
+2. io_ns - The amount of time spent waiting (in nanoseconds) for all I/O
+        to complete (including read and write). This includes tape movement
+        commands such as seeking between file or set marks and implicit tape
+        movement such as when rewind on close tape devices are used.
+3. other_cnt - The number of I/Os issued to the tape drive other than read or
+        write commands. The time taken to complete these commands uses the
+        following calculation io_ms-read_ms-write_ms.
+4. read_byte_cnt - The number of bytes read from the tape drive.
+5. read_cnt - The number of read requests issued to the tape drive.
+6. read_ns - The amount of time (in nanoseconds) spent waiting for read
+        requests to complete.
+7. write_byte_cnt - The number of bytes written to the tape drive.
+8. write_cnt - The number of write requests issued to the tape drive.
+9. write_ns - The amount of time (in nanoseconds) spent waiting for write
+        requests to complete.
+10. resid_cnt - The number of times during a read or write we found
+        the residual amount to be non-zero. This should mean that a program
+        is issuing a read larger thean the block size on tape. For write
+        not all data made it to tape.
+
+Note: The in_flight value is incremented when an I/O starts the I/O
+itself is not added to the statistics until it completes.
+
+The total of read_cnt, write_cnt, and other_cnt may not total to the same
+value as iodone_cnt at the device level. The tape statistics only count
+I/O issued via the st module.
+
+When read the statistics may not be temporally consistent while I/O is in
+progress. The individual values are read and written to atomically however
+when reading them back via sysfs they may be in the process of being
+updated when starting an I/O or when it is completed.
+
+The value shown in in_flight is incremented before any statstics are
+updated and decremented when an I/O completes after updating statistics.
+The value of in_flight is 0 when there are no I/Os outstanding that are
+issued by the st driver. Tape statistics do not take into account any
+I/O performed via the sg device.
+
 BSD AND SYS V SEMANTICS

 The user can choose between these two behaviours of the tape driver by
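The four discovery steps documented in st.txt above (opendir, readdir, match against "^st[0-9]+$", then use the stats subdirectory) can be sketched in user space roughly as follows. The function names st_is_stats_entry and st_list_stats_dirs are hypothetical, chosen for this illustration only; the regular expression and the directory layout are taken from the documentation text.

```c
#include <dirent.h>
#include <regex.h>
#include <stdio.h>

/* Returns 1 if name matches the extended regex ^st[0-9]+$, otherwise 0. */
int st_is_stats_entry(const char *name)
{
	regex_t re;
	int match;

	if (regcomp(&re, "^st[0-9]+$", REG_EXTENDED | REG_NOSUB) != 0)
		return 0;
	match = (regexec(&re, name, 0, NULL, 0) == 0);
	regfree(&re);
	return match;
}

/*
 * Prints the stats directory of every matching entry under class_dir
 * (normally /sys/class/scsi_tape). Returns the number of matches, or -1
 * if the directory cannot be opened.
 */
int st_list_stats_dirs(const char *class_dir)
{
	DIR *dir = opendir(class_dir);
	struct dirent *ent;
	int count = 0;

	if (dir == NULL)
		return -1;
	while ((ent = readdir(dir)) != NULL) {
		if (st_is_stats_entry(ent->d_name)) {
			printf("%s/%s/stats\n", class_dir, ent->d_name);
			count++;
		}
	}
	closedir(dir);
	return count;
}
```

Calling st_list_stats_dirs("/sys/class/scsi_tape") on a system with one tape drive would be expected to print a single stats path, since entries such as nst0 are rejected by the pattern and share the statistics of st0.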
+271 -1
drivers/scsi/st.c
···
 	kfree(streq);
 }

+static void st_do_stats(struct scsi_tape *STp, struct request *req)
+{
+	ktime_t now;
+
+	now = ktime_get();
+	if (req->cmd[0] == WRITE_6) {
+		now = ktime_sub(now, STp->stats->write_time);
+		atomic64_add(ktime_to_ns(now), &STp->stats->tot_write_time);
+		atomic64_add(ktime_to_ns(now), &STp->stats->tot_io_time);
+		atomic64_inc(&STp->stats->write_cnt);
+		if (req->errors) {
+			atomic64_add(atomic_read(&STp->stats->last_write_size)
+				- STp->buffer->cmdstat.residual,
+				&STp->stats->write_byte_cnt);
+			if (STp->buffer->cmdstat.residual > 0)
+				atomic64_inc(&STp->stats->resid_cnt);
+		} else
+			atomic64_add(atomic_read(&STp->stats->last_write_size),
+				&STp->stats->write_byte_cnt);
+	} else if (req->cmd[0] == READ_6) {
+		now = ktime_sub(now, STp->stats->read_time);
+		atomic64_add(ktime_to_ns(now), &STp->stats->tot_read_time);
+		atomic64_add(ktime_to_ns(now), &STp->stats->tot_io_time);
+		atomic64_inc(&STp->stats->read_cnt);
+		if (req->errors) {
+			atomic64_add(atomic_read(&STp->stats->last_read_size)
+				- STp->buffer->cmdstat.residual,
+				&STp->stats->read_byte_cnt);
+			if (STp->buffer->cmdstat.residual > 0)
+				atomic64_inc(&STp->stats->resid_cnt);
+		} else
+			atomic64_add(atomic_read(&STp->stats->last_read_size),
+				&STp->stats->read_byte_cnt);
+	} else {
+		now = ktime_sub(now, STp->stats->other_time);
+		atomic64_add(ktime_to_ns(now), &STp->stats->tot_io_time);
+		atomic64_inc(&STp->stats->other_cnt);
+	}
+	atomic64_dec(&STp->stats->in_flight);
+}
+
 static void st_scsi_execute_end(struct request *req, int uptodate)
 {
 	struct st_request *SRpnt = req->end_io_data;
···

 	STp->buffer->cmdstat.midlevel_result = SRpnt->result = req->errors;
 	STp->buffer->cmdstat.residual = req->resid_len;
+
+	st_do_stats(STp, req);

 	tmp = SRpnt->bio;
 	if (SRpnt->waiting)
···
 	struct rq_map_data *mdata = &SRpnt->stp->buffer->map_data;
 	int err = 0;
 	int write = (data_direction == DMA_TO_DEVICE);
+	struct scsi_tape *STp = SRpnt->stp;

 	req = blk_get_request(SRpnt->stp->device->request_queue, write,
 			      GFP_KERNEL);
···
 			blk_put_request(req);
 			return DRIVER_ERROR << 24;
 		}
+	}
+
+	atomic64_inc(&STp->stats->in_flight);
+	if (cmd[0] == WRITE_6) {
+		atomic_set(&STp->stats->last_write_size, bufflen);
+		STp->stats->write_time = ktime_get();
+	} else if (cmd[0] == READ_6) {
+		atomic_set(&STp->stats->last_read_size, bufflen);
+		STp->stats->read_time = ktime_get();
+	} else {
+		STp->stats->other_time = ktime_get();
 	}

 	SRpnt->bio = req->bio;
···
 	}
 	tpnt->index = error;
 	sprintf(disk->disk_name, "st%d", tpnt->index);
+	tpnt->stats = kzalloc(sizeof(struct scsi_tape_stats), GFP_KERNEL);
+	if (tpnt->stats == NULL) {
+		sdev_printk(KERN_ERR, SDp,
+			    "st: Can't allocate statistics.\n");
+		goto out_idr_remove;
+	}

 	dev_set_drvdata(dev, tpnt);

···

 out_remove_devs:
 	remove_cdevs(tpnt);
+	kfree(tpnt->stats);
+out_idr_remove:
 	spin_lock(&st_index_lock);
 	idr_remove(&st_index_idr, tpnt->index);
 	spin_unlock(&st_index_lock);
···

 	disk->private_data = NULL;
 	put_disk(disk);
+	kfree(tpnt->stats);
 	kfree(tpnt);
 	return;
 }
···
 }
 static DEVICE_ATTR_RO(options);

+/* Support for tape stats */
+
+/**
+ * read_cnt_show - return read count - count of reads made from tape drive
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t read_cnt_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->read_cnt));
+}
+static DEVICE_ATTR_RO(read_cnt);
+
+/**
+ * read_byte_cnt_show - return read byte count - tape drives
+ * may use blocks less than 512 bytes this gives the raw byte count of
+ * of data read from the tape drive.
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t read_byte_cnt_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->read_byte_cnt));
+}
+static DEVICE_ATTR_RO(read_byte_cnt);
+
+/**
+ * read_us_show - return read us - overall time spent waiting on reads in ns.
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t read_ns_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->tot_read_time));
+}
+static DEVICE_ATTR_RO(read_ns);
+
+/**
+ * write_cnt_show - write count - number of user calls
+ * to write(2) that have written data to tape.
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t write_cnt_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->write_cnt));
+}
+static DEVICE_ATTR_RO(write_cnt);
+
+/**
+ * write_byte_cnt_show - write byte count - raw count of
+ * bytes written to tape.
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t write_byte_cnt_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->write_byte_cnt));
+}
+static DEVICE_ATTR_RO(write_byte_cnt);
+
+/**
+ * write_ns_show - write ns - number of nanoseconds waiting on write
+ * requests to complete.
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t write_ns_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->tot_write_time));
+}
+static DEVICE_ATTR_RO(write_ns);
+
+/**
+ * in_flight_show - number of I/Os currently in flight -
+ * in most cases this will be either 0 or 1. It may be higher if someone
+ * has also issued other SCSI commands such as via an ioctl.
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t in_flight_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->in_flight));
+}
+static DEVICE_ATTR_RO(in_flight);
+
+/**
+ * io_ns_show - io wait ns - this is the number of ns spent
+ * waiting on all I/O to complete. This includes tape movement commands
+ * such as rewinding, seeking to end of file or tape, it also includes
+ * read and write. To determine the time spent on tape movement
+ * subtract the read and write ns from this value.
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t io_ns_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->tot_io_time));
+}
+static DEVICE_ATTR_RO(io_ns);
+
+/**
+ * other_cnt_show - other io count - this is the number of
+ * I/O requests other than read and write requests.
+ * Typically these are tape movement requests but will include driver
+ * tape movement. This includes only requests issued by the st driver.
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t other_cnt_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->other_cnt));
+}
+static DEVICE_ATTR_RO(other_cnt);
+
+/**
+ * resid_cnt_show - A count of the number of times we get a residual
+ * count - this should indicate someone issuing reads larger than the
+ * block size on tape.
+ * @dev: struct device
+ * @attr: attribute structure
+ * @buf: buffer to return formatted data in
+ */
+static ssize_t resid_cnt_show(struct device *dev,
+	struct device_attribute *attr, char *buf)
+{
+	struct st_modedef *STm = dev_get_drvdata(dev);
+
+	return sprintf(buf, "%lld",
+		       (long long)atomic64_read(&STm->tape->stats->resid_cnt));
+}
+static DEVICE_ATTR_RO(resid_cnt);
+
 static struct attribute *st_dev_attrs[] = {
 	&dev_attr_defined.attr,
 	&dev_attr_default_blksize.attr,
···
 	&dev_attr_options.attr,
 	NULL,
 };
-ATTRIBUTE_GROUPS(st_dev);
+
+static struct attribute *st_stats_attrs[] = {
+	&dev_attr_read_cnt.attr,
+	&dev_attr_read_byte_cnt.attr,
+	&dev_attr_read_ns.attr,
+	&dev_attr_write_cnt.attr,
+	&dev_attr_write_byte_cnt.attr,
+	&dev_attr_write_ns.attr,
+	&dev_attr_in_flight.attr,
+	&dev_attr_io_ns.attr,
+	&dev_attr_other_cnt.attr,
+	&dev_attr_resid_cnt.attr,
+	NULL,
+};
+
+static struct attribute_group stats_group = {
+	.name = "stats",
+	.attrs = st_stats_attrs,
+};
+
+static struct attribute_group st_group = {
+	.attrs = st_dev_attrs,
+};
+
+static const struct attribute_group *st_dev_groups[] = {
+	&st_group,
+	&stats_group,
+	NULL,
+};

 /* The following functions may be useful for a larger audience. */
 static int sgl_map_user_pages(struct st_buffer *STbp,
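Each show function in this diff formats its counter with "%lld" and no trailing newline, so a user-space reader should not rely on a newline terminating the value. A minimal sketch of reading one statistics file under that observation follows. The names st_parse_stat and st_read_stat are hypothetical and not part of the patch, and accepting an optional newline is an assumption made so the parser also tolerates output that does end in one.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parses a decimal counter; returns 0 on success, -1 on malformed input. */
int st_parse_stat(const char *text, long long *value)
{
	char *end;

	errno = 0;
	*value = strtoll(text, &end, 10);
	if (errno != 0 || end == text)
		return -1;
	/* Accept an optional trailing newline, reject anything else. */
	if (*end == '\n')
		end++;
	return *end == '\0' ? 0 : -1;
}

/* Reads one statistics file, for example <stats directory>/read_cnt. */
int st_read_stat(const char *path, long long *value)
{
	char buf[32];
	size_t len;
	FILE *fp = fopen(path, "r");

	if (fp == NULL)
		return -1;
	len = fread(buf, 1, sizeof(buf) - 1, fp);
	fclose(fp);
	buf[len] = '\0';
	return st_parse_stat(buf, value);
}
```

Because the counters are updated independently, values read from several files in sequence may not describe the same instant while I/O is in progress, as the st.txt text in this commit notes.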
+22
drivers/scsi/st.h
···
 	int drv_file;
 };

+/* Tape statistics */
+struct scsi_tape_stats {
+	atomic64_t read_byte_cnt; /* bytes read */
+	atomic64_t write_byte_cnt; /* bytes written */
+	atomic64_t in_flight; /* Number of I/Os in flight */
+	atomic64_t read_cnt; /* Count of read requests */
+	atomic64_t write_cnt; /* Count of write requests */
+	atomic64_t other_cnt; /* Count of other requests either
+			       * implicit or from user space
+			       * ioctl. */
+	atomic64_t resid_cnt; /* Count of resid_len > 0 */
+	atomic64_t tot_read_time; /* ktime spent completing reads */
+	atomic64_t tot_write_time; /* ktime spent completing writes */
+	atomic64_t tot_io_time; /* ktime spent doing any I/O */
+	ktime_t read_time; /* holds ktime request was queued */
+	ktime_t write_time; /* holds ktime request was queued */
+	ktime_t other_time; /* holds ktime request was queued */
+	atomic_t last_read_size; /* Number of bytes issued for last read */
+	atomic_t last_write_size; /* Number of bytes issued for last write */
+};
+
 #define ST_NBR_PARTITIONS 4

 /* The tape drive descriptor */
···
 #endif
 	struct gendisk *disk;
 	struct kref kref;
+	struct scsi_tape_stats *stats;
 };

 /* Bit masks for use_pf */