Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

btrfs: qgroup: Finish rescan when hit the last leaf of extent tree

In the following case, qgroup rescan can double-account CoWed tree
blocks.

Here the extent tree has only one tree block:

 -
 | transid=5 last committed=4
 | btrfs_qgroup_rescan_worker()
 | |- btrfs_start_transaction()
 | |  transid = 5
 | |- qgroup_rescan_leaf()
 |    |- btrfs_search_slot_for_read() on extent tree
 |       Get the only extent tree block from commit root (transid = 4).
 |       Scan it, set qgroup_rescan_progress to the last
 |       EXTENT/META_ITEM + 1
 |       now qgroup_rescan_progress = A + 1.
 |
 | fs tree gets CoWed, new tree block is at A + 16K
 | transid 5 gets committed
 -
 | transid=6 last committed=5
 | btrfs_qgroup_rescan_worker()
 | |- btrfs_start_transaction()
 | |  transid = 6
 | |- qgroup_rescan_leaf()
 |    |- btrfs_search_slot_for_read() on extent tree
 |       Get the only extent tree block from commit root (transid = 5).
 |       Scan it using qgroup_rescan_progress (A + 1).
 |       Found new tree block beyond A, and it's an fs tree block,
 |       account it to increase qgroup numbers.
 -

In the above case, tree block A and tree block A + 16K get accounted
twice, whereas qgroup rescan should stop once it has reached the last
leaf rather than continuing from its qgroup_rescan_progress.
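
The timeline can be replayed with a toy model. This is only a sketch, not kernel code: every `toy_` name is hypothetical, the extent tree is reduced to one array-backed leaf, and it assumes the CoWed block is accounted exactly once by the regular commit-time accounting. It reproduces only the double count of the block at A + 16K.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ITEMS 8

/* Hypothetical one-leaf extent tree: sorted start addresses of tree blocks. */
struct toy_leaf {
	unsigned long long bytenr[TOY_MAX_ITEMS];
	int times_accounted[TOY_MAX_ITEMS];
	size_t nritems;
};

/*
 * One rescan pass over the only leaf. Accounts every item at or beyond
 * *progress and moves *progress past it. Returns 1 when the rescan is
 * done, 0 when the caller should run another pass.
 */
static int toy_rescan_leaf(struct toy_leaf *leaf, unsigned long long *progress,
			   bool stop_at_last_leaf)
{
	bool found = false;
	size_t i;

	for (i = 0; i < leaf->nritems; i++) {
		if (leaf->bytenr[i] < *progress)
			continue;
		leaf->times_accounted[i]++;
		*progress = leaf->bytenr[i] + 1;
		found = true;
	}
	if (!found)
		return 1;	/* nothing at or beyond progress: done */
	/* With a single leaf, the leaf just scanned is always the last one. */
	return stop_at_last_leaf ? 1 : 0;
}

/* Assumption: a CoWed block is accounted once when its transaction commits. */
static void toy_cow_new_block(struct toy_leaf *leaf, unsigned long long bytenr)
{
	leaf->bytenr[leaf->nritems] = bytenr;
	leaf->times_accounted[leaf->nritems] = 1;
	leaf->nritems++;
}

/* Replays the timeline; returns how often the block at A + 16K was accounted. */
static int toy_run_scenario(bool stop_at_last_leaf)
{
	const unsigned long long A = 1ULL << 20;	/* arbitrary address */
	struct toy_leaf leaf = { .bytenr = { A }, .nritems = 1 };
	unsigned long long progress = 0;
	int done;

	/* transid 5: first pass over the leaf */
	done = toy_rescan_leaf(&leaf, &progress, stop_at_last_leaf);

	/* fs tree is CoWed, transid 5 is committed */
	toy_cow_new_block(&leaf, A + 16384);

	/* transid 6: second pass, only if the first did not finish */
	if (!done)
		toy_rescan_leaf(&leaf, &progress, stop_at_last_leaf);

	return leaf.times_accounted[1];
}
```

In this model, `toy_run_scenario(false)` counts the new block twice, while `toy_run_scenario(true)` counts it once because the first pass already reports completion.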

This can be reproduced by looping btrfs/017; with some probability a
run hits the double qgroup accounting problem.

Fix it by checking the path to determine whether qgroup rescan should
finish, rather than relying on the next loop to exit.
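
The path check can be tried outside the kernel with mock types. The following is a minimal sketch: `toy_node`, `toy_path` and `TOY_MAX_LEVEL` are hypothetical stand-ins for the extent buffer, `struct btrfs_path` and `BTRFS_MAX_LEVEL`, and only the fields the check reads are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 8	/* stand-in for BTRFS_MAX_LEVEL */

/* Hypothetical stand-in for an extent buffer: only the item count matters. */
struct toy_node {
	int nritems;
};

/* Hypothetical stand-in for struct btrfs_path. Level 0 is the leaf. */
struct toy_path {
	struct toy_node *nodes[TOY_MAX_LEVEL];
	int slots[TOY_MAX_LEVEL];
};

/*
 * The leaf at level 0 is the last leaf if, at every level above it,
 * the path points at the last slot of the node.
 */
static bool toy_is_last_leaf(const struct toy_path *path)
{
	int i;

	for (i = 1; i < TOY_MAX_LEVEL && path->nodes[i]; i++) {
		/* An interior node on a valid path holds at least one pointer. */
		assert(path->nodes[i]->nritems > 0);
		if (path->slots[i] != path->nodes[i]->nritems - 1)
			return false;
	}
	return true;
}
```

A path with no node above level 0, which is the single-block extent tree from the scenario, never enters the loop and is reported as the last leaf.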

Reported-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Authored by Qu Wenruo and committed by David Sterba
ff3d27a0 b6debf15

fs/btrfs/qgroup.c (+19)
···
 }

 /*
+ * Check if the leaf is the last leaf. Which means all node pointers
+ * are at their last position.
+ */
+static bool is_last_leaf(struct btrfs_path *path)
+{
+	int i;
+
+	for (i = 1; i < BTRFS_MAX_LEVEL && path->nodes[i]; i++) {
+		if (path->slots[i] != btrfs_header_nritems(path->nodes[i]) - 1)
+			return false;
+	}
+	return true;
+}
+
+/*
  * returns < 0 on error, 0 when more leafs are to be scanned.
  * returns 1 when done.
  */
···
 	struct extent_buffer *scratch_leaf = NULL;
 	struct ulist *roots = NULL;
 	u64 num_bytes;
+	bool done;
 	int slot;
 	int ret;

···
 		mutex_unlock(&fs_info->qgroup_rescan_lock);
 		return ret;
 	}
+	done = is_last_leaf(path);

 	btrfs_item_key_to_cpu(path->nodes[0], &found,
 			      btrfs_header_nritems(path->nodes[0]) - 1);
···
 		free_extent_buffer(scratch_leaf);
 	}

+	if (done && !ret)
+		ret = 1;
 	return ret;
 }