Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

ext4: update ext4 documentation

Add documentation for mount options and ioctls to
Documentation/filesystems/ext4.txt, which has not been updated for some
time. Also add documentation for ext4 sysfs tunables to the
Documentation/ABI/testing/sysfs-fs-ext4 file, and fix a few
typographical errors in that file.

https://bugzilla.kernel.org/show_bug.cgi?id=9423

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

Authored by Lukas Czerner and committed by Theodore Ts'o
6f9524e9 3abb17e8

+216 -4
+10 -3
Documentation/ABI/testing/sysfs-fs-ext4
···
                 will have its blocks allocated out of its own unique
                 preallocation pool.
 
-What:           /sys/fs/ext4/<disk>/inode_readahead
+What:           /sys/fs/ext4/<disk>/inode_readahead_blks
 Date:           March 2008
 Contact:        "Theodore Ts'o" <tytso@mit.edu>
 Description:
···
 Contact:        "Theodore Ts'o" <tytso@mit.edu>
 Description:
                 Tuning parameter which (if non-zero) controls the goal
-                inode used by the inode allocator in p0reference to
-                all other allocation hueristics. This is intended for
+                inode used by the inode allocator in preference to
+                all other allocation heuristics. This is intended for
                 debugging use only, and should be 0 on production
                 systems.
+
+What:           /sys/fs/ext4/<disk>/max_writeback_mb_bump
+Date:           September 2009
+Contact:        "Theodore Ts'o" <tytso@mit.edu>
+Description:
+                The maximum number of megabytes the writeback code will
+                try to write out before move on to another inode.
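The sysfs entries documented in this file are small text files under /sys/fs/ext4/<disk>. The following is a minimal sketch of reading and writing one of them from Python. It assumes each tunable holds a single decimal integer followed by a newline; the helper names and the sysfs_root parameter are hypothetical and exist only so the code can be exercised against a scratch directory. Writing to the real files normally requires root.

```python
from pathlib import Path


def read_ext4_tunable(devname, name, sysfs_root="/sys/fs/ext4"):
    """Return the integer stored in <sysfs_root>/<devname>/<name>.

    Assumption: the file holds one decimal integer, e.g. "32\n".
    """
    path = Path(sysfs_root) / devname / name
    return int(path.read_text().strip())


def write_ext4_tunable(devname, name, value, sysfs_root="/sys/fs/ext4"):
    """Write an integer value to <sysfs_root>/<devname>/<name>."""
    path = Path(sysfs_root) / devname / name
    path.write_text(f"{int(value)}\n")
```

For example, read_ext4_tunable("dm-0", "inode_readahead_blks") would return the current readahead limit for a filesystem mounted from dm-0, provided that device directory exists on the running system.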
+206 -1
Documentation/filesystems/ext4.txt
···
                         minimizes the impact on the systme performance
                         while file system's inode table is being initialized.
 
-discard         Controls whether ext4 should issue discard/TRIM
+discard                 Controls whether ext4 should issue discard/TRIM
 nodiscard(*)            commands to the underlying block device when
                         blocks are freed. This is useful for SSD devices
                         and sparse/thinly-provisioned LUNs, but it is off
                         by default until sufficient testing has been done.
+
+nouid32                 Disables 32-bit UIDs and GIDs. This is for
+                        interoperability with older kernels which only
+                        store and expect 16-bit values.
+
+resize                  Allows to resize filesystem to the end of the last
+                        existing block group, further resize has to be done
+                        with resize2fs either online, or offline. It can be
+                        used only with conjunction with remount.
+
+block_validity          This options allows to enables/disables the in-kernel
+noblock_validity        facility for tracking filesystem metadata blocks
+                        within internal data structures. This allows multi-
+                        block allocator and other routines to quickly locate
+                        extents which might overlap with filesystem metadata
+                        blocks. This option is intended for debugging
+                        purposes and since it negatively affects the
+                        performance, it is off by default.
+
+dioread_lock            Controls whether or not ext4 should use the DIO read
+dioread_nolock          locking. If the dioread_nolock option is specified
+                        ext4 will allocate uninitialized extent before buffer
+                        write and convert the extent to initialized after IO
+                        completes. This approach allows ext4 code to avoid
+                        using inode mutex, which improves scalability on high
+                        speed storages. However this does not work with nobh
+                        option and the mount will fail. Nor does it work with
+                        data journaling and dioread_nolock option will be
+                        ignored with kernel warning. Note that dioread_nolock
+                        code path is only used for extent-based files.
+                        Because of the restrictions this options comprises
+                        it is off by default (e.g. dioread_lock).
+
+i_version               Enable 64-bit inode version support. This option is
+                        off by default.
 
 Data Mode
 =========
···
 needs to be read from and written to disk at the same time where it
 outperforms all others modes. Currently ext4 does not have delayed
 allocation support if this data journalling mode is selected.
+
+/proc entries
+=============
+
+Information about mounted ext4 file systems can be found in
+/proc/fs/ext4. Each mounted filesystem will have a directory in
+/proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or
+/proc/fs/ext4/dm-0). The files in each per-device directory are shown
+in table below.
+
+Files in /proc/fs/ext4/<devname>
+..............................................................................
+ File            Content
+ mb_groups       details of multiblock allocator buddy cache of free blocks
+..............................................................................
+
+/sys entries
+============
+
+Information about mounted ext4 file systems can be found in
+/sys/fs/ext4. Each mounted filesystem will have a directory in
+/sys/fs/ext4 based on its device name (i.e., /sys/fs/ext4/hdc or
+/sys/fs/ext4/dm-0). The files in each per-device directory are shown
+in table below.
+
+Files in /sys/fs/ext4/<devname>
+(see also Documentation/ABI/testing/sysfs-fs-ext4)
+..............................................................................
+ File                         Content
+
+ delayed_allocation_blocks    This file is read-only and shows the number of
+                              blocks that are dirty in the page cache, but
+                              which do not have their location in the
+                              filesystem allocated yet.
+
+ inode_goal                   Tuning parameter which (if non-zero) controls
+                              the goal inode used by the inode allocator in
+                              preference to all other allocation heuristics.
+                              This is intended for debugging use only, and
+                              should be 0 on production systems.
+
+ inode_readahead_blks         Tuning parameter which controls the maximum
+                              number of inode table blocks that ext4's inode
+                              table readahead algorithm will pre-read into
+                              the buffer cache
+
+ lifetime_write_kbytes        This file is read-only and shows the number of
+                              kilobytes of data that have been written to this
+                              filesystem since it was created.
+
+ max_writeback_mb_bump        The maximum number of megabytes the writeback
+                              code will try to write out before move on to
+                              another inode.
+
+ mb_group_prealloc            The multiblock allocator will round up allocation
+                              requests to a multiple of this tuning parameter if
+                              the stripe size is not set in the ext4 superblock
+
+ mb_max_to_scan               The maximum number of extents the multiblock
+                              allocator will search to find the best extent
+
+ mb_min_to_scan               The minimum number of extents the multiblock
+                              allocator will search to find the best extent
+
+ mb_order2_req                Tuning parameter which controls the minimum size
+                              for requests (as a power of 2) where the buddy
+                              cache is used
+
+ mb_stats                     Controls whether the multiblock allocator should
+                              collect statistics, which are shown during the
+                              unmount. 1 means to collect statistics, 0 means
+                              not to collect statistics
+
+ mb_stream_req                Files which have fewer blocks than this tunable
+                              parameter will have their blocks allocated out
+                              of a block group specific preallocation pool, so
+                              that small files are packed closely together.
+                              Each large file will have its blocks allocated
+                              out of its own unique preallocation pool.
+
+ session_write_kbytes         This file is read-only and shows the number of
+                              kilobytes of data that have been written to this
+                              filesystem since it was mounted.
+..............................................................................
+
+Ioctls
+======
+
+There is some Ext4 specific functionality which can be accessed by applications
+through the system call interfaces. The list of all Ext4 specific ioctls are
+shown in the table below.
+
+Table of Ext4 specific ioctls
+..............................................................................
+ Ioctl                        Description
+ EXT4_IOC_GETFLAGS            Get additional attributes associated with inode.
+                              The ioctl argument is an integer bitfield, with
+                              bit values described in ext4.h. This ioctl is an
+                              alias for FS_IOC_GETFLAGS.
+
+ EXT4_IOC_SETFLAGS            Set additional attributes associated with inode.
+                              The ioctl argument is an integer bitfield, with
+                              bit values described in ext4.h. This ioctl is an
+                              alias for FS_IOC_SETFLAGS.
+
+ EXT4_IOC_GETVERSION
+ EXT4_IOC_GETVERSION_OLD
+                              Get the inode i_generation number stored for
+                              each inode. The i_generation number is normally
+                              changed only when new inode is created and it is
+                              particularly useful for network filesystems. The
+                              '_OLD' version of this ioctl is an alias for
+                              FS_IOC_GETVERSION.
+
+ EXT4_IOC_SETVERSION
+ EXT4_IOC_SETVERSION_OLD
+                              Set the inode i_generation number stored for
+                              each inode. The '_OLD' version of this ioctl
+                              is an alias for FS_IOC_SETVERSION.
+
+ EXT4_IOC_GROUP_EXTEND        This ioctl has the same purpose as the resize
+                              mount option. It allows to resize filesystem
+                              to the end of the last existing block group,
+                              further resize has to be done with resize2fs,
+                              either online, or offline. The argument points
+                              to the unsigned logn number representing the
+                              filesystem new block count.
+
+ EXT4_IOC_MOVE_EXT            Move the block extents from orig_fd (the one
+                              this ioctl is pointing to) to the donor_fd (the
+                              one specified in move_extent structure passed
+                              as an argument to this ioctl). Then, exchange
+                              inode metadata between orig_fd and donor_fd.
+                              This is especially useful for online
+                              defragmentation, because the allocator has the
+                              opportunity to allocate moved blocks better,
+                              ideally into one contiguous extent.
+
+ EXT4_IOC_GROUP_ADD           Add a new group descriptor to an existing or
+                              new group descriptor block. The new group
+                              descriptor is described by ext4_new_group_input
+                              structure, which is passed as an argument to
+                              this ioctl. This is especially useful in
+                              conjunction with EXT4_IOC_GROUP_EXTEND,
+                              which allows online resize of the filesystem
+                              to the end of the last existing block group.
+                              Those two ioctls combined is used in userspace
+                              online resize tool (e.g. resize2fs).
+
+ EXT4_IOC_MIGRATE             This ioctl operates on the filesystem itself.
+                              It converts (migrates) ext3 indirect block mapped
+                              inode to ext4 extent mapped inode by walking
+                              through indirect block mapping of the original
+                              inode and converting contiguous block ranges
+                              into ext4 extents of the temporary inode. Then,
+                              inodes are swapped. This ioctl might help, when
+                              migrating from ext3 to ext4 filesystem, however
+                              suggestion is to create fresh ext4 filesystem
+                              and copy data from the backup. Note, that
+                              filesystem has to support extents for this ioctl
+                              to work.
+
+ EXT4_IOC_ALLOC_DA_BLKS       Force all of the delay allocated blocks to be
+                              allocated to preserve application-expected ext3
+                              behaviour. Note that this will also start
+                              triggering a write of the data blocks, but this
+                              behaviour may change in the future as it is
+                              not necessary and has been done this way only
+                              for sake of simplicity.
+..............................................................................
 
 References
 ==========
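The ioctl table added by this patch describes EXT4_IOC_GETFLAGS as an alias for FS_IOC_GETFLAGS whose argument is an integer bitfield. The sketch below shows one way a userspace program might issue that ioctl from Python. It is illustrative only: the request number is computed with the asm-generic _IOR encoding (an assumption that holds on x86 and ARM but not on every architecture), get_inode_flags is a hypothetical helper name, and the EXT4_EXTENTS_FL value is taken from ext4.h.

```python
import fcntl
import os
import struct

_IOC_READ = 2  # direction bits in the asm-generic ioctl encoding (assumption)


def _ior(type_char, nr, size):
    """Build an _IOR() request number using the asm-generic layout."""
    return (_IOC_READ << 30) | (size << 16) | (ord(type_char) << 8) | nr


# FS_IOC_GETFLAGS is declared as _IOR('f', 1, long) in the kernel headers.
FS_IOC_GETFLAGS = _ior("f", 1, struct.calcsize("l"))

EXT4_EXTENTS_FL = 0x00080000  # inode uses extents (value from ext4.h)


def get_inode_flags(path):
    """Return the inode flag bitfield for path; raises OSError on failure."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray(struct.calcsize("l"))
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)  # the kernel fills buf in place
        # The kernel stores the flags as a 32-bit int at the start of buf.
        return struct.unpack_from("I", buf)[0]
    finally:
        os.close(fd)
```

A caller could test bool(get_inode_flags(path) & EXT4_EXTENTS_FL) to see whether a file is extent-mapped. Filesystems that do not implement the ioctl raise OSError, so callers should be prepared to handle that.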