Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

btrfs: Do super block verification before writing it to disk

There are already 2 reports about strangely corrupted super blocks,
where the csum still matches but extra garbage has slipped into the
super block.

The corruption looks like:
------
superblock: bytenr=65536, device=/dev/sdc1
---------------------------------------------------------
csum_type 41700 (INVALID)
csum 0x3b252d3a [match]
bytenr 65536
flags 0x1
( WRITTEN )
magic _BHRfS_M [match]
...
incompat_flags 0x5b22400000000169
( MIXED_BACKREF |
COMPRESS_LZO |
BIG_METADATA |
EXTENDED_IREF |
SKINNY_METADATA |
unknown flag: 0x5b22400000000000 )
...
------
Or
------
superblock: bytenr=65536, device=/dev/mapper/x
---------------------------------------------------------
csum_type 35355 (INVALID)
csum_size 32
csum 0xf0dbeddd [match]
bytenr 65536
flags 0x1
( WRITTEN )
magic _BHRfS_M [match]
...
incompat_flags 0x176d200000000169
( MIXED_BACKREF |
COMPRESS_LZO |
BIG_METADATA |
EXTENDED_IREF |
SKINNY_METADATA |
unknown flag: 0x176d200000000000 )
------

Obviously, csum_type and incompat_flags contain garbage, but the csum
still matches, which means the kernel calculated the csum from super
block memory that was already corrupted.
After manually fixing these values, the filesystem is completely
healthy, with no problems reported by btrfs check.
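Why a matching csum does not rule out corruption can be shown with a small userspace sketch. It is not the kernel code: the helper names, the 4096-byte block size and the 32-byte checksum area are assumptions made for illustration, and the CRC-32C routine is a plain bitwise version. The garbage bytes are taken from the first report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

#define SB_SIZE   4096	/* assumed size of the super block copy */
#define CSUM_SIZE 32	/* assumed leading bytes reserved for the csum */

/* Checksum everything after the csum area and store the result in front. */
static void sb_store_csum(uint8_t *sb)
{
	uint32_t csum = crc32c(sb + CSUM_SIZE, SB_SIZE - CSUM_SIZE);

	memset(sb, 0, CSUM_SIZE);
	memcpy(sb, &csum, sizeof(csum));
}

/* Return 1 when the stored csum matches the block contents. */
static int sb_csum_matches(const uint8_t *sb)
{
	uint32_t stored;

	memcpy(&stored, sb, sizeof(stored));
	return stored == crc32c(sb + CSUM_SIZE, SB_SIZE - CSUM_SIZE);
}

/* Corrupt the in-memory copy first, then checksum it, as in the reports. */
static int corrupted_block_still_matches(void)
{
	uint8_t sb[SB_SIZE] = { 0 };
	const uint8_t garbage[4] = { 0x00, 0x40, 0x22, 0x5b };

	memcpy(sb + 192, garbage, sizeof(garbage));	/* stray write */
	sb_store_csum(sb);				/* csum computed afterwards */
	return sb_csum_matches(sb);
}
```

Because the csum is computed after the stray write, it covers the garbage and matches. A mismatch would only appear if the block were modified after the csum was stored.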

Although the cause is still unknown, at least detect it and prevent further
corruption.

Both reports have the same symptoms: a 4-byte overwrite at offset 192 of
the superblock. The superblock structure is not allocated or freed and
stays in memory for the whole filesystem lifetime, so it's not a
use-after-free kind of error on someone else's leaked page.

A vague pointer to the probable cause is the mention of other system
freezes related to graphics card drivers.

Reported-by: Ken Swenson <flat@imo.uto.moe>
Reported-by: Ben Parsons <9parsonsb@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add brief analysis of the reports ]
Signed-off-by: David Sterba <dsterba@suse.com>

authored by Qu Wenruo and committed by David Sterba
75cb857d 069ec957

+43
fs/btrfs/disk-io.c
···
 	return validate_super(fs_info, fs_info->super_copy, 0);
 }
 
+/*
+ * Validation of super block at write time.
+ * Some checks like bytenr check will be skipped as their values will be
+ * overwritten soon.
+ * Extra checks like csum type and incompat flags will be done here.
+ */
+static int btrfs_validate_write_super(struct btrfs_fs_info *fs_info,
+				      struct btrfs_super_block *sb)
+{
+	int ret;
+
+	ret = validate_super(fs_info, sb, -1);
+	if (ret < 0)
+		goto out;
+	if (btrfs_super_csum_type(sb) != BTRFS_CSUM_TYPE_CRC32) {
+		ret = -EUCLEAN;
+		btrfs_err(fs_info, "invalid csum type, has %u want %u",
+			  btrfs_super_csum_type(sb), BTRFS_CSUM_TYPE_CRC32);
+		goto out;
+	}
+	if (btrfs_super_incompat_flags(sb) & ~BTRFS_FEATURE_INCOMPAT_SUPP) {
+		ret = -EUCLEAN;
+		btrfs_err(fs_info,
+		"invalid incompat flags, has 0x%llx valid mask 0x%llx",
+			  btrfs_super_incompat_flags(sb),
+			  (unsigned long long)BTRFS_FEATURE_INCOMPAT_SUPP);
+		goto out;
+	}
+out:
+	if (ret < 0)
+		btrfs_err(fs_info,
+		"super block corruption detected before writing it to disk");
+	return ret;
+}
+
 int open_ctree(struct super_block *sb,
 	       struct btrfs_fs_devices *fs_devices,
 	       char *options)
···
 
 		flags = btrfs_super_flags(sb);
 		btrfs_set_super_flags(sb, flags | BTRFS_HEADER_FLAG_WRITTEN);
+
+		ret = btrfs_validate_write_super(fs_info, sb);
+		if (ret < 0) {
+			mutex_unlock(&fs_info->fs_devices->device_list_mutex);
+			btrfs_handle_fs_error(fs_info, -EUCLEAN,
+				"unexpected superblock corruption detected");
+			return -EUCLEAN;
+		}
 
 		ret = write_dev_supers(dev, sb, max_mirrors);
 		if (ret)
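
The two extra checks in the patch can be tried against the reported values with a userspace sketch. It only mirrors the logic of the patch and is not the kernel function: validate_write_super(), CSUM_TYPE_CRC32, ERR_UCLEAN and INCOMPAT_SUPP are stand-ins defined here, and the mask of the low ten feature bits is an assumption rather than the kernel's BTRFS_FEATURE_INCOMPAT_SUPP.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CSUM_TYPE_CRC32 0u	/* stand-in for BTRFS_CSUM_TYPE_CRC32 */
#define ERR_UCLEAN 117		/* stand-in for the kernel's EUCLEAN */

/* Hypothetical supported mask: the low ten feature bits, illustration only. */
#define INCOMPAT_SUPP 0x3ffull

/* Return 0 if both fields look sane, -ERR_UCLEAN otherwise. */
static int validate_write_super(uint16_t csum_type, uint64_t incompat_flags)
{
	if (csum_type != CSUM_TYPE_CRC32) {
		fprintf(stderr, "invalid csum type, has %u want %u\n",
			(unsigned)csum_type, CSUM_TYPE_CRC32);
		return -ERR_UCLEAN;
	}
	/* Any bit outside the supported mask is treated as corruption. */
	if (incompat_flags & ~INCOMPAT_SUPP) {
		fprintf(stderr,
			"invalid incompat flags, has 0x%llx valid mask 0x%llx\n",
			(unsigned long long)incompat_flags,
			(unsigned long long)INCOMPAT_SUPP);
		return -ERR_UCLEAN;
	}
	return 0;
}
```

With these stand-ins, validate_write_super(41700, 0x5b22400000000169ULL) and validate_write_super(35355, 0x176d200000000169ULL) both fail, while the manually repaired values (0, 0x169ULL) pass.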