Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tools api fs: Make xxx__mountpoint() more scalable

The xxx__mountpoint() interface provided by fs.c finds mount points for
common pseudo filesystems. The first time xxx__mountpoint() is invoked,
it scans the mount table (/proc/mounts) looking for a match. If a match
is found, it is cached, so the cost of scanning /proc/mounts is paid
only once.

When the mount point is not found, nothing is cached, and every
subsequent call to xxx__mountpoint() scans /proc/mounts again.
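
The pre-patch behavior described above can be modeled in a few lines of
standalone C. This is only a sketch, not the fs.c code: struct fs_model,
scan_mount_table() and mountpoint_uncached_miss() are hypothetical names,
and the scan of /proc/mounts is simulated by a counter and a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's per-filesystem state. */
struct fs_model {
	const char *path;	/* where the filesystem would be mounted */
	bool mounted;		/* simulated content of the mount table */
	bool found;		/* a successful lookup is remembered here */
	int scans;		/* number of simulated /proc/mounts scans */
};

/* Simulates one full pass over /proc/mounts. */
static bool scan_mount_table(struct fs_model *fs)
{
	fs->scans++;
	fs->found = fs->mounted;
	return fs->found;
}

/* Pre-patch lookup: only a hit is cached, a miss always rescans. */
static const char *mountpoint_uncached_miss(struct fs_model *fs)
{
	assert(fs != NULL);

	if (fs->found)
		return fs->path;

	return scan_mount_table(fs) ? fs->path : NULL;
}
```

In this model, N calls on an unmounted filesystem perform N scans, while
any number of calls on a mounted one perform a single scan.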

This causes a scaling issue in perf record with hugetlbfs__mountpoint().
The function is called for each process found in
synthesize__mmap_events(). If the machine has thousands of processes
and /proc/mounts has many entries, this can cause major overhead in
perf record. We have observed multi-second slowdowns on some
configurations.

As an example on a laptop:

Before:

$ sudo umount /dev/hugepages
$ strace -e trace=openat -o /tmp/tt perf record -a ls
$ fgrep -c mounts /tmp/tt
285

After:

$ sudo umount /dev/hugepages
$ strace -e trace=openat -o /tmp/tt perf record -a ls
$ fgrep -c mounts /tmp/tt
1

One could argue that not caching a failed lookup is intentional: that
way, subsequent calls may discover the mount point if the sysadmin
mounts the filesystem later. But the same argument could be made against
caching a mount point that was found, since it could be unmounted,
causing errors. It all depends on the intent of the interface. This
patch assumes the interface is expected to scan /proc/mounts once, and
it documents the caching behavior in the fs.h header file.
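
The scan-once behavior and the trade-off discussed above can be sketched
as follows. This is a simplified model under stated assumptions, not the
actual fs.c code: struct fs_model, scan_mount_table() and
mountpoint_scan_once() are hypothetical names, and the mount table is
simulated by a flag. Only the found/checked pair mirrors the fields the
patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's per-filesystem state. */
struct fs_model {
	const char *path;	/* where the filesystem would be mounted */
	bool mounted;		/* simulated content of the mount table */
	bool found;		/* a successful lookup is remembered here */
	bool checked;		/* a scan already happened, hit or miss */
	int scans;		/* number of simulated /proc/mounts scans */
};

/* Simulates one full pass over /proc/mounts and records that it ran. */
static bool scan_mount_table(struct fs_model *fs)
{
	fs->scans++;
	fs->checked = true;
	fs->found = fs->mounted;
	return fs->found;
}

/* Post-patch lookup: both a hit and a miss are cached. */
static const char *mountpoint_scan_once(struct fs_model *fs)
{
	assert(fs != NULL);

	if (fs->found)
		return fs->path;

	/* Already scanned and nothing was there: do not scan again. */
	if (fs->checked)
		return NULL;

	return scan_mount_table(fs) ? fs->path : NULL;
}
```

In this model, any number of calls costs one scan whether or not the
filesystem is mounted. The price is the one named above: if mounted
becomes true after the first miss, later calls still return NULL.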

An alternative would be to fix only perf record. That would solve the
problem with hugetlbfs__mountpoint(), but similar issues could arise
(possibly down the line) with other xxx__mountpoint() calls in perf or
other tools.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrey Zhizhikin <andrey.z@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200402154357.107873-3-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Authored by Stephane Eranian and committed by Arnaldo Carvalho de Melo
c6fddb28 2a4b5166

+29
+17
tools/lib/api/fs/fs.c
···
 	const char * const *mounts;
 	char path[PATH_MAX];
 	bool found;
+	bool checked;
 	long magic;
 };

···
 		.name = "sysfs",
 		.mounts = sysfs__fs_known_mountpoints,
 		.magic = SYSFS_MAGIC,
+		.checked = false,
 	},
 	[FS__PROCFS] = {
 		.name = "proc",
 		.mounts = procfs__known_mountpoints,
 		.magic = PROC_SUPER_MAGIC,
+		.checked = false,
 	},
 	[FS__DEBUGFS] = {
 		.name = "debugfs",
 		.mounts = debugfs__known_mountpoints,
 		.magic = DEBUGFS_MAGIC,
+		.checked = false,
 	},
 	[FS__TRACEFS] = {
 		.name = "tracefs",
 		.mounts = tracefs__known_mountpoints,
 		.magic = TRACEFS_MAGIC,
+		.checked = false,
 	},
 	[FS__HUGETLBFS] = {
 		.name = "hugetlbfs",
 		.mounts = hugetlbfs__known_mountpoints,
 		.magic = HUGETLBFS_MAGIC,
+		.checked = false,
 	},
 	[FS__BPF_FS] = {
 		.name = "bpf",
 		.mounts = bpf_fs__known_mountpoints,
 		.magic = BPF_FS_MAGIC,
+		.checked = false,
 	},
 };

···
 	}

 	fclose(fp);
+	fs->checked = true;
 	return fs->found = found;
 }

···
 		return false;

 	fs->found = true;
+	fs->checked = true;
 	strncpy(fs->path, override_path, sizeof(fs->path) - 1);
 	fs->path[sizeof(fs->path) - 1] = '\0';
 	return true;
···

 	if (fs->found)
 		return (const char *)fs->path;
+
+	/* the mount point was already checked for the mount point
+	 * but and did not exist, so return NULL to avoid scanning again.
+	 * This makes the found and not found paths cost equivalent
+	 * in case of multiple calls.
+	 */
+	if (fs->checked)
+		return NULL;

 	return fs__get_mountpoint(fs);
 }
+12
tools/lib/api/fs/fs.h
···
 	const char *name##__mount(void); \
 	bool name##__configured(void); \

+/*
+ * The xxxx__mountpoint() entry points find the first match mount point for each
+ * filesystems listed below, where xxxx is the filesystem type.
+ *
+ * The interface is as follows:
+ *
+ * - If a mount point is found on first call, it is cached and used for all
+ *   subsequent calls.
+ *
+ * - If a mount point is not found, NULL is returned on first call and all
+ *   subsequent calls.
+ */
 FS(sysfs)
 FS(procfs)
 FS(debugfs)