Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

erofs: support to readahead dirent blocks in erofs_readdir()

This patch adds readahead of more blocks in erofs_readdir(), which
improves readdir performance in large directories.

Readdir test in a large directory that contains 12000 sub-files:

files_per_second
Before: 926385.54
After: 2380435.562
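
The commit message does not say which tool produced the files_per_second figures. As a rough illustration only, a measurement of that kind could be sketched in userspace as below. The helper names `count_dirents()` and `files_per_second()` are hypothetical, and the count includes the "." and ".." entries.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <time.h>

/*
 * Count the entries returned by readdir() for `path`, including "." and
 * "..". Returns -1 if the directory cannot be opened.
 */
static long count_dirents(const char *path)
{
	DIR *d = opendir(path);
	long n = 0;

	if (!d)
		return -1;
	while (readdir(d))
		n++;
	closedir(d);
	return n;
}

/*
 * Entries per second for one full pass over `path`, using wall-clock time.
 * Returns -1.0 on error and 0.0 if the pass was too fast to time.
 */
static double files_per_second(const char *path)
{
	struct timespec t0, t1;
	double secs;
	long n;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	n = count_dirents(path);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (n < 0)
		return -1.0;
	secs = (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	return secs > 0.0 ? (double)n / secs : 0.0;
}
```

For a cold-cache comparison like the one above, the page cache would presumably need to be dropped (or the image remounted) between runs; that step is not shown here.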

Meanwhile, let's introduce a new sysfs entry to control the number of
readahead bytes, providing a more flexible readahead policy for readdir().
- location: /sys/fs/erofs/<disk>/dir_ra_bytes
- default value: 16384
- disable readahead: set the value to 0
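
A minimal userspace sketch of how the new entry might be read and written. The helper names `set_dir_ra_bytes()` and `get_dir_ra_bytes()` are hypothetical, the caller supplies the full path with `<disk>` substituted, and the value is assumed to be exchanged as a decimal string, as is typical for sysfs integer attributes.

```c
#include <stdio.h>

/* Write `bytes` to the attribute file at `path`; 0 on success, -1 on failure. */
static int set_dir_ra_bytes(const char *path, unsigned int bytes)
{
	FILE *fp = fopen(path, "w");
	int ok;

	if (!fp)
		return -1;
	ok = fprintf(fp, "%u\n", bytes) > 0;
	/* fclose() flushes the buffer, so a rejected write is reported here */
	if (fclose(fp) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the current value into `*bytes`; 0 on success, -1 on failure. */
static int get_dir_ra_bytes(const char *path, unsigned int *bytes)
{
	FILE *fp = fopen(path, "r");
	int ok;

	if (!fp)
		return -1;
	ok = fscanf(fp, "%u", bytes) == 1;
	fclose(fp);
	return ok ? 0 : -1;
}
```

Under these assumptions, calling `set_dir_ra_bytes()` with the entry's path and a value of 0 would disable directory readahead, and 16384 would restore the default.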

Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250721021352.2495371-1-chao@kernel.org
[ Gao Xiang: minor styling adjustment. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Authored by Chao Yu and committed by Gao Xiang
df0ce6ce 41409132

+29
+8
Documentation/ABI/testing/sysfs-fs-erofs
···
 		and multiple accelerators are separated by '\n'.
 		Supported accelerator(s): qat_deflate.
 		Disable all accelerators with an empty string (echo > accel).
+
+What:		/sys/fs/erofs/<disk>/dir_ra_bytes
+Date:		July 2025
+Contact:	"Chao Yu" <chao@kernel.org>
+Description:	Used to set or show readahead bytes during readdir(), by
+		default the value is 16384.
+
+		- 0: disable readahead.
+14
fs/erofs/dir.c
···
 	struct inode *dir = file_inode(f);
 	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
 	struct super_block *sb = dir->i_sb;
+	struct file_ra_state *ra = &f->f_ra;
 	unsigned long bsz = sb->s_blocksize;
 	unsigned int ofs = erofs_blkoff(sb, ctx->pos);
+	pgoff_t ra_pages = DIV_ROUND_UP_POW2(
+			EROFS_I_SB(dir)->dir_ra_bytes, PAGE_SIZE);
+	pgoff_t nr_pages = DIV_ROUND_UP_POW2(dir->i_size, PAGE_SIZE);
 	int err = 0;
 	bool initial = true;
 
···
 		if (fatal_signal_pending(current)) {
 			err = -ERESTARTSYS;
 			break;
+		}
+
+		/* readahead blocks to enhance performance for large directories */
+		if (ra_pages) {
+			pgoff_t idx = DIV_ROUND_UP_POW2(ctx->pos, PAGE_SIZE);
+			pgoff_t pages = min(nr_pages - idx, ra_pages);
+
+			if (pages > 1 && !ra_has_index(ra, idx))
+				page_cache_sync_readahead(dir->i_mapping, ra,
+							  f, idx, pages);
 		}
 
 		de = erofs_bread(&buf, dbstart, true);
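
The window arithmetic in the hunk above can be tried out in userspace. The following is a sketch, not kernel code: `SKETCH_PAGE_SIZE` and `dir_ra_window()` are hypothetical names, a 4 KiB page size is assumed, the `ra_has_index()` check is left out, and an extra bounds guard is added that the patch does not need because its loop keeps `ctx->pos` below `i_size`.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Round-up division by a power-of-two divisor, mirroring DIV_ROUND_UP_POW2() */
static uint64_t div_round_up_pow2(uint64_t n, uint64_t d)
{
	return n / d + !!(n & (d - 1));
}

/*
 * Number of pages that would be passed to readahead for a directory of
 * `dir_size` bytes at byte offset `pos`, or 0 if no readahead is issued.
 */
static uint64_t dir_ra_window(uint64_t dir_size, uint64_t pos,
			      uint64_t dir_ra_bytes)
{
	uint64_t ra_pages = div_round_up_pow2(dir_ra_bytes, SKETCH_PAGE_SIZE);
	uint64_t nr_pages = div_round_up_pow2(dir_size, SKETCH_PAGE_SIZE);
	uint64_t idx, pages;

	if (!ra_pages)			/* dir_ra_bytes == 0 disables readahead */
		return 0;
	idx = div_round_up_pow2(pos, SKETCH_PAGE_SIZE);
	if (idx >= nr_pages)		/* guard added for this sketch only */
		return 0;
	/* clamp the window to the pages left in the directory */
	pages = nr_pages - idx < ra_pages ? nr_pages - idx : ra_pages;
	/* the patch only issues readahead for windows of more than one page */
	return pages > 1 ? pages : 0;
}
```

With the default of 16384 bytes and 4 KiB pages, `ra_pages` comes out as 4, so a sufficiently large directory gets a four-page window, while a directory that fits in a single page gets none.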
+4
fs/erofs/internal.h
···
 	/* sysfs support */
 	struct kobject s_kobj;		/* /sys/fs/erofs/<devname> */
 	struct completion s_kobj_unregister;
+	erofs_off_t dir_ra_bytes;
 
 	/* fscache support */
 	struct fscache_volume *volume;
···
 /* bitlock definitions (arranged in reverse order) */
 #define EROFS_I_BL_XATTR_BIT	(BITS_PER_LONG - 1)
 #define EROFS_I_BL_Z_BIT	(BITS_PER_LONG - 2)
+
+/* default readahead size of directories */
+#define EROFS_DIR_RA_BYTES	16384
 
 struct erofs_inode {
 	erofs_nid_t nid;
+1
fs/erofs/super.c
···
 	if (err)
 		return err;
 
+	sbi->dir_ra_bytes = EROFS_DIR_RA_BYTES;
 	erofs_info(sb, "mounted with root inode @ nid %llu.", sbi->root_nid);
 	return 0;
 }
+2
fs/erofs/sysfs.c
···
 #ifdef CONFIG_EROFS_FS_ZIP_ACCEL
 EROFS_ATTR_FUNC(accel, 0644);
 #endif
+EROFS_ATTR_RW_UI(dir_ra_bytes, erofs_sb_info);
 
 static struct attribute *erofs_sb_attrs[] = {
 #ifdef CONFIG_EROFS_FS_ZIP
 	ATTR_LIST(sync_decompress),
 	ATTR_LIST(drop_caches),
 #endif
+	ATTR_LIST(dir_ra_bytes),
 	NULL,
 };
 ATTRIBUTE_GROUPS(erofs_sb);