Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tools api io: Move filling the io buffer to its own function

In general a read fills 4kB, so filling the buffer is a 1 in 4096
operation. Move it out of the io__get_char function to avoid some
checking overhead and to better hint that the function is good to inline.
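
The split described above can be sketched outside the kernel tree. This is a minimal illustration under stated assumptions, not the code from tools/lib/api/io.h: `struct reader`, `reader_fill`, `reader_get_char` and `reader_demo` are hypothetical names, the 4096-byte buffer is taken from the "4kb" figure in the message, and the `fills` counter exists only to make the ratio between the two paths visible.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical buffered reader, used only for this illustration. */
struct reader {
	int fd;
	char buf[4096];
	char *data;		/* next unread byte */
	char *end;		/* one past the last valid byte */
	bool eof;
	unsigned long fills;	/* how often the slow path ran */
};

static void reader_init(struct reader *r, int fd)
{
	r->fd = fd;
	r->data = r->end = r->buf;
	r->eof = false;
	r->fills = 0;
}

/* Slow path: runs about once per buffer's worth of bytes. */
static int reader_fill(struct reader *r)
{
	ssize_t n;

	if (r->eof)
		return -1;
	r->fills++;
	n = read(r->fd, r->buf, sizeof(r->buf));
	if (n <= 0) {
		r->eof = true;
		return -1;
	}
	r->data = r->buf;
	r->end = r->buf + n;
	return 0;
}

/* Fast path: one pointer comparison, then a load and an increment. */
static inline int reader_get_char(struct reader *r)
{
	if (r->data == r->end && reader_fill(r))
		return -1;
	/* The cast is a choice of this sketch so that no byte looks like -1. */
	return (unsigned char)*r->data++;
}

/*
 * Push len bytes through a pipe and count how often each path runs.
 * len is capped at 8192, assumed to be below the pipe capacity so the
 * writes do not block.
 */
int reader_demo(size_t len, unsigned long *chars, unsigned long *fills)
{
	char payload[8192];
	struct reader r;
	size_t off = 0;
	int fds[2];

	if (len > sizeof(payload) || pipe(fds))
		return -1;
	memset(payload, 'x', len);
	while (off < len) {
		ssize_t n = write(fds[1], payload + off, len - off);

		if (n <= 0) {
			close(fds[0]);
			close(fds[1]);
			return -1;
		}
		off += (size_t)n;
	}
	close(fds[1]);	/* lets the reader see end of file */

	reader_init(&r, fds[0]);
	*chars = 0;
	while (reader_get_char(&r) >= 0)
		(*chars)++;
	*fills = r.fills;
	close(fds[0]);
	return 0;
}
```

Calling `reader_demo(8000, &chars, &fills)` should report 8000 characters against only a handful of fills, which is the imbalance the message relies on when it keeps the per-character function small.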

For perf's IO intensive internal (non-rigorous) benchmarks there's a
small improvement to kallsyms-parsing with a default build.

Before:
```
$ perf bench internals all
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
Average synthesis took: 146.322 usec (+- 0.305 usec)
Average num. events: 61.000 (+- 0.000)
Average time per event 2.399 usec
Average data synthesis took: 145.056 usec (+- 0.155 usec)
Average num. events: 329.000 (+- 0.000)
Average time per event 0.441 usec

Average kallsyms__parse took: 162.313 ms (+- 0.599 ms)
...
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 53.720 usec (+- 7.823 usec)
Average PMU scanning took: 375.145 usec (+- 23.974 usec)
```
After:
```
$ perf bench internals all
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
Average synthesis took: 127.829 usec (+- 0.079 usec)
Average num. events: 61.000 (+- 0.000)
Average time per event 2.096 usec
Average data synthesis took: 133.652 usec (+- 0.101 usec)
Average num. events: 327.000 (+- 0.000)
Average time per event 0.409 usec

Average kallsyms__parse took: 150.415 ms (+- 0.313 ms)
...
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 47.790 usec (+- 1.178 usec)
Average PMU scanning took: 376.945 usec (+- 23.683 usec)
```
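
To put the two runs side by side, the relative change can be computed from the averages printed above. The helper below is a small sketch for that arithmetic only; `pct_reduction` and `print_reductions` are hypothetical names and not part of perf.

```c
#include <stdio.h>

/* Percentage by which `after` is lower than `before` (negative if higher). */
double pct_reduction(double before, double after)
{
	return (before - after) / before * 100.0;
}

/* Averages copied from the Before/After logs above. */
void print_reductions(void)
{
	printf("synthesis:       %.2f%%\n", pct_reduction(146.322, 127.829));
	printf("data synthesis:  %.2f%%\n", pct_reduction(145.056, 133.652));
	printf("kallsyms__parse: %.2f%%\n", pct_reduction(162.313, 150.415));
	printf("core PMU scan:   %.2f%%\n", pct_reduction(53.720, 47.790));
	printf("PMU scan:        %.2f%%\n", pct_reduction(375.145, 376.945));
}
```

For kallsyms__parse this works out to roughly 7.3%. The full PMU scan moves by less than its reported error (+- 23 usec), so that line is best read as unchanged.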

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240519181716.4088459-1-irogers@google.com

Authored by Ian Rogers and committed by Namhyung Kim
d163d602 f975c13d

+37 -30
tools/lib/api/io.h
···
 	io->eof = false;
 }

-/* Reads one character from the "io" file with similar semantics to fgetc. */
-static inline int io__get_char(struct io *io)
+/* Read from fd filling the buffer. Called when io->data == io->end. */
+static inline int io__fill_buffer(struct io *io)
 {
-	char *ptr = io->data;
+	ssize_t n;

 	if (io->eof)
 		return -1;

-	if (ptr == io->end) {
-		ssize_t n;
+	if (io->timeout_ms != 0) {
+		struct pollfd pfds[] = {
+			{
+				.fd = io->fd,
+				.events = POLLIN,
+			},
+		};

-		if (io->timeout_ms != 0) {
-			struct pollfd pfds[] = {
-				{
-					.fd = io->fd,
-					.events = POLLIN,
-				},
-			};
-
-			n = poll(pfds, 1, io->timeout_ms);
-			if (n == 0)
-				errno = ETIMEDOUT;
-			if (n > 0 && !(pfds[0].revents & POLLIN)) {
-				errno = EIO;
-				n = -1;
-			}
-			if (n <= 0) {
-				io->eof = true;
-				return -1;
-			}
+		n = poll(pfds, 1, io->timeout_ms);
+		if (n == 0)
+			errno = ETIMEDOUT;
+		if (n > 0 && !(pfds[0].revents & POLLIN)) {
+			errno = EIO;
+			n = -1;
 		}
-		n = read(io->fd, io->buf, io->buf_len);
-
 		if (n <= 0) {
 			io->eof = true;
 			return -1;
 		}
-		ptr = &io->buf[0];
-		io->end = &io->buf[n];
 	}
-	io->data = ptr + 1;
-	return *ptr;
+	n = read(io->fd, io->buf, io->buf_len);
+
+	if (n <= 0) {
+		io->eof = true;
+		return -1;
+	}
+	io->data = &io->buf[0];
+	io->end = &io->buf[n];
+	return 0;
+}
+
+/* Reads one character from the "io" file with similar semantics to fgetc. */
+static inline int io__get_char(struct io *io)
+{
+	if (io->data == io->end) {
+		int ret = io__fill_buffer(io);
+
+		if (ret)
+			return ret;
+	}
+	return *io->data++;
 }

 /* Read a hexadecimal value with no 0x prefix into the out argument hex. If the
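
The timeout handling that now sits at the top of io__fill_buffer can be read in isolation. The following is a standalone sketch of the same poll() pattern, assuming a plain file descriptor; `wait_readable` is a hypothetical helper written for illustration and does not exist in tools/lib/api.

```c
#include <errno.h>
#include <poll.h>

/*
 * Wait until fd is readable or timeout_ms elapses.
 * Returns 0 when data can be read, -1 otherwise with errno set.
 * Hypothetical helper, mirroring the error mapping in the diff above.
 */
int wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n = poll(&pfd, 1, timeout_ms);

	if (n == 0) {
		errno = ETIMEDOUT;	/* nothing arrived in time */
		return -1;
	}
	if (n < 0)
		return -1;		/* poll() already set errno */
	if (!(pfd.revents & POLLIN)) {
		errno = EIO;		/* an event was reported, but not readable data */
		return -1;
	}
	return 0;
}
```

On an empty pipe, `wait_readable(fd, 10)` should fail with ETIMEDOUT, and succeed once a byte has been written to the other end. In the commit the same checks only run when `io->timeout_ms` is non-zero, and any failure also marks the stream as at end of file.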