Merge tag 'bcachefs-2025-06-04' of git://evilpiepirate.org/bcachefs

Pull more bcachefs updates from Kent Overstreet:
"More bcachefs updates:

- More stack usage improvements (~600 bytes)

- Define CLASS()es for some commonly used types, and convert most
rcu_read_lock() uses to the new lock guards (the conversion pattern is
sketched just after this list)

- New introspection:
  - Superblock error counters are now available in sysfs: previously,
    they were only visible with 'show-super', which doesn't provide a
    live view
  - New tracepoint, error_throw(), which is called any time we return
    an error and start to unwind (see the sketch after this list)

- Repair:
  - check_fix_ptrs() can now repair btree node roots
  - We can now repair when we've somehow ended up with the journal
    using a superblock bucket

- Revert some leftovers from the aborted directory i_size feature,
and add repair code: some userspace programs (e.g. sshfs) were
getting confused
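
The lock-guard conversion mentioned above follows this pattern: open-coded
rcu_read_lock()/rcu_read_unlock() pairs become guard(rcu)() when the critical
section runs to the end of the scope, or scoped_guard(rcu) when it only covers
a statement or block. A minimal sketch of the shape; the function names below
are illustrative, not taken from the tree (the helpers they call, such as
bch2_dev_rcu_noerror() and bch2_dev_rcu(), are real):

    /* Before: manual lock/unlock; every early return must remember to unlock */
    static bool dev_is_rw_old(struct bch_fs *c, unsigned dev)
    {
            rcu_read_lock();
            struct bch_dev *ca = bch2_dev_rcu_noerror(c, dev);
            bool ret = ca && ca->mi.state == BCH_MEMBER_STATE_rw;
            rcu_read_unlock();
            return ret;
    }

    /* After: guard(rcu)() drops the RCU read lock automatically at scope exit */
    static bool dev_is_rw_new(struct bch_fs *c, unsigned dev)
    {
            guard(rcu)();
            struct bch_dev *ca = bch2_dev_rcu_noerror(c, dev);
            return ca && ca->mi.state == BCH_MEMBER_STATE_rw;
    }

    /* scoped_guard() when only part of the function needs the lock held */
    static void partial_bucket_inc(struct bch_fs *c, struct open_bucket *ob)
    {
            scoped_guard(rcu)
                    bch2_dev_rcu(c, ob->dev)->nr_partial_buckets++;
    }

One caveat the series itself notes: scoped_guard() expands to a loop, so a
'continue' inside the guarded block does not continue the enclosing loop;
those spots keep an explicit block with guard(rcu)() instead.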
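
The error_throw() tracepoint is driven by a small helper added to bcachefs.h
in this series: call sites stop returning -BCH_ERR_* constants directly and
go through bch_err_throw(), which fires the tracepoint with the error code and
the throw site before returning, so enabling the tracepoint shows exactly where
an error first started to unwind:

    static inline int __bch2_err_trace(struct bch_fs *c, int err)
    {
            /* record the error code and the address it was thrown from */
            trace_error_throw(c, err, _THIS_IP_);
            return err;
    }

    #define bch_err_throw(_c, _err) __bch2_err_trace(_c, -BCH_ERR_##_err)

    /* call sites convert from this ... */
    return -BCH_ERR_ENOMEM_btree_node_mem_alloc;
    /* ... to this: */
    return bch_err_throw(c, ENOMEM_btree_node_mem_alloc);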

It seems in 6.15 there's a bug where i_nlink on the vfs inode has been
getting incorrectly set to 0, with some unfortunate results;
list_journal analysis showed bch2_inode_rm() being called (by
bch2_evict_inode()) when it clearly should not have been.

- bch2_inode_rm() now runs "should we be deleting this inode?" checks
that were previously only run when deleting unlinked inodes in
recovery

- check_subvol() was treating a dangling subvol (pointing to a
missing root inode) like a dangling dirent, and deleting it. This
was the really unfortunate one: check_subvol() will now recreate
the root inode if necessary

This took longer to debug than it should have, and we lost several
filesystems unnecessarily, because users have been ignoring the
release notes and blindly running 'fsck -y'. Debugging required
reconstructing what happened through analyzing the journal, when
ideally someone would have noticed 'hey, fsck is asking me if I want
to repair this: it usually doesn't, maybe I should run this in dry run
mode and check what's going on?'

As a reminder, fsck errors are being marked as autofix once we've
verified, in real world usage, that they're working correctly; blindly
running 'fsck -y' on an experimental filesystem is playing with fire

Up to this incident we've had an excellent track record of not losing
data, so let's try to learn from this one

This is a community effort; I wouldn't be able to get this done
without the help of all the people QAing and providing excellent bug
reports and feedback based on real world usage. But please don't
ignore advice and expect me to pick up the pieces

If an error isn't marked as autofix, and it /is/ happening in the
wild, that's also something I need to know about so we can check it
out and add it to the autofix list if repair looks good. I haven't
been getting those reports, and I should be; since we don't have any
sort of telemetry yet I am absolutely dependent on user reports

Now I'll be spending the weekend working on new repair code to see if
I can get a filesystem back for a user who didn't have backups"

* tag 'bcachefs-2025-06-04' of git://evilpiepirate.org/bcachefs: (69 commits)
bcachefs: add cond_resched() to handle_overwrites()
bcachefs: Make journal read log message a bit quieter
bcachefs: Fix subvol to missing root repair
bcachefs: Run may_delete_deleted_inode() checks in bch2_inode_rm()
bcachefs: delete dead code from may_delete_deleted_inode()
bcachefs: Add flags to subvolume_to_text()
bcachefs: Fix oops in btree_node_seq_matches()
bcachefs: Fix dirent_casefold_mismatch repair
bcachefs: Fix bch2_fsck_rename_dirent() for casefold
bcachefs: Redo bch2_dirent_init_name()
bcachefs: Fix -Wc23-extensions in bch2_check_dirents()
bcachefs: Run check_dirents second time if required
bcachefs: Run snapshot deletion out of system_long_wq
bcachefs: Make check_key_has_snapshot safer
bcachefs: BCH_RECOVERY_PASS_NO_RATELIMIT
bcachefs: bch2_require_recovery_pass()
bcachefs: bch_err_throw()
bcachefs: Repair code for directory i_size
bcachefs: Kill un-reverted directory i_size code
bcachefs: Delete redundant fsck_err()
...
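
For reference, here is the CLASS() treatment of btree_trans from this series
(fs/bcachefs/btree_iter.h), lightly annotated: bch2_trans_run() now relies on
scope-based cleanup instead of an explicit bch2_trans_put() on every exit path:

    /*
     * We don't use DEFINE_CLASS() because using a function for the constructor
     * breaks bch2_trans_get()'s use of __func__
     */
    typedef struct btree_trans * class_btree_trans_t;
    static inline void class_btree_trans_destructor(struct btree_trans **p)
    {
            struct btree_trans *trans = *p;
            bch2_trans_put(trans);
    }

    #define class_btree_trans_constructor(_c) bch2_trans_get(_c)

    #define bch2_trans_run(_c, _do)                                     \
    ({                                                                  \
            /* trans is constructed here, put when the scope exits */   \
            CLASS(btree_trans, trans)(_c);                              \
            (_do);                                                      \
    })

    #define bch2_trans_do(_c, _do) bch2_trans_run(_c, lockrestart_do(trans, _do))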

+2334 -1768 total; per-file diffstat for the files shown:

 fs/bcachefs/alloc_background.c        +40 -39
 fs/bcachefs/alloc_background.h         +4  -5
 fs/bcachefs/alloc_foreground.c        +51 -57
 fs/bcachefs/alloc_foreground.h         +5  -3
 fs/bcachefs/backpointers.c            +37 -37
 fs/bcachefs/backpointers.h             +2  -3
 fs/bcachefs/bcachefs.h                +41 -31
 fs/bcachefs/btree_cache.c             +12 -12
 fs/bcachefs/btree_gc.c                +28 -29
 fs/bcachefs/btree_io.c                +21 -22
 fs/bcachefs/btree_iter.c              +38 -40
 fs/bcachefs/btree_iter.h              +21 -10
 fs/bcachefs/btree_journal_iter.c       +7 -12
 fs/bcachefs/btree_key_cache.c         +12 -16
 fs/bcachefs/btree_locking.c           +29 -27
 fs/bcachefs/btree_node_scan.c          +2
 fs/bcachefs/btree_trans_commit.c      +25 -11
 fs/bcachefs/btree_types.h              +2
 fs/bcachefs/btree_update.c            +19 -40
 fs/bcachefs/btree_update.h            +12  -2
 fs/bcachefs/btree_update_interior.c   +61 -43
 ...
bch2_btree_and_journal_iter_exit(&iter); 143 138 bch2_bkey_buf_exit(&prev, c); 144 139 printbuf_exit(&buf); 145 140 return ret; 141 + err: 142 + bch2_btree_id_level_to_text(&buf, b->c.btree_id, b->c.level); 143 + prt_char(&buf, ' '); 144 + bch2_bkey_val_to_text(&buf, c, bkey_i_to_s_c(&b->key)); 145 + prt_newline(&buf); 146 + 147 + ret = __bch2_topology_error(c, &buf); 148 + bch2_print_str(c, KERN_ERR, buf.buf); 149 + BUG_ON(!ret); 150 + goto out; 146 151 } 147 152 148 153 /* Calculate ideal packed bkey format for new btree nodes: */ ··· 684 685 685 686 /* 686 687 * Wait for any in flight writes to finish before we free the old nodes 687 - * on disk: 688 + * on disk. But we haven't pinned those old nodes in the btree cache, 689 + * they might have already been evicted. 690 + * 691 + * The update we're completing deleted references to those nodes from the 692 + * btree, so we know if they've been evicted they can't be pulled back in. 693 + * We just have to check if the nodes we have pointers to are still those 694 + * old nodes, and haven't been reused. 695 + * 696 + * This can't be done locklessly because the data buffer might have been 697 + * vmalloc allocated, and they're not RCU freed. We also need the 698 + * __no_kmsan_checks annotation because even with the btree node read 699 + * lock, nothing tells us that the data buffer has been initialized (if 700 + * the btree node has been reused for a different node, and the data 701 + * buffer swapped for a new data buffer). 688 702 */ 689 703 for (i = 0; i < as->nr_old_nodes; i++) { 690 704 b = as->old_nodes[i]; 691 705 692 - if (btree_node_seq_matches(b, as->old_nodes_seq[i])) 706 + bch2_trans_begin(trans); 707 + btree_node_lock_nopath_nofail(trans, &b->c, SIX_LOCK_read); 708 + bool seq_matches = btree_node_seq_matches(b, as->old_nodes_seq[i]); 709 + six_unlock_read(&b->c.lock); 710 + bch2_trans_unlock_long(trans); 711 + 712 + if (seq_matches) 693 713 wait_on_bit_io(&b->flags, BTREE_NODE_write_in_flight_inner, 694 714 TASK_UNINTERRUPTIBLE); 695 715 } ··· 1263 1245 if (bch2_err_matches(ret, ENOSPC) && 1264 1246 (flags & BCH_TRANS_COMMIT_journal_reclaim) && 1265 1247 watermark < BCH_WATERMARK_reclaim) { 1266 - ret = -BCH_ERR_journal_reclaim_would_deadlock; 1248 + ret = bch_err_throw(c, journal_reclaim_would_deadlock); 1267 1249 goto err; 1268 1250 } 1269 1251 ··· 2196 2178 if (btree_iter_path(trans, iter)->l[b->c.level].b != b) { 2197 2179 /* node has been freed: */ 2198 2180 BUG_ON(!btree_node_dying(b)); 2199 - ret = -BCH_ERR_btree_node_dying; 2181 + ret = bch_err_throw(trans->c, btree_node_dying); 2200 2182 goto err; 2201 2183 } 2202 2184 ··· 2810 2792 c->btree_interior_update_worker = 2811 2793 alloc_workqueue("btree_update", WQ_UNBOUND|WQ_MEM_RECLAIM, 8); 2812 2794 if (!c->btree_interior_update_worker) 2813 - return -BCH_ERR_ENOMEM_btree_interior_update_worker_init; 2795 + return bch_err_throw(c, ENOMEM_btree_interior_update_worker_init); 2814 2796 2815 2797 c->btree_node_rewrite_worker = 2816 2798 alloc_ordered_workqueue("btree_node_rewrite", WQ_UNBOUND); 2817 2799 if (!c->btree_node_rewrite_worker) 2818 - return -BCH_ERR_ENOMEM_btree_interior_update_worker_init; 2800 + return bch_err_throw(c, ENOMEM_btree_interior_update_worker_init); 2819 2801 2820 2802 if (mempool_init_kmalloc_pool(&c->btree_interior_update_pool, 1, 2821 2803 sizeof(struct btree_update))) 2822 - return -BCH_ERR_ENOMEM_btree_interior_update_pool_init; 2804 + return bch_err_throw(c, ENOMEM_btree_interior_update_pool_init); 2823 2805 2824 2806 return 0; 2825 2807 }
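The comment added above the old-nodes loop spells out the approach: take the node's read lock just long enough to compare a remembered sequence number, drop it, and only wait on the write if the node has not been reused. A small stand-alone sketch of that check, with toy names (struct node, node_still_ours) standing in for the btree node and btree_node_seq_matches():

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* toy stand-in for a btree node: a lock plus a sequence number bumped on reuse */
struct node {
	pthread_mutex_t	lock;
	unsigned long	seq;
};

static bool node_still_ours(struct node *n, unsigned long expected_seq)
{
	/* hold the lock only long enough to read a stable sequence number */
	pthread_mutex_lock(&n->lock);
	bool match = n->seq == expected_seq;
	pthread_mutex_unlock(&n->lock);

	return match;
}

int main(void)
{
	struct node n = { PTHREAD_MUTEX_INITIALIZER, 42 };
	unsigned long remembered = 42;

	if (node_still_ours(&n, remembered))
		printf("node unchanged: safe to wait for its write to finish\n");

	n.seq++;	/* pretend the node was evicted and reused */
	if (!node_still_ours(&n, remembered))
		printf("node was reused: skip the wait\n");
	return 0;
}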
+3 -3
fs/bcachefs/btree_write_buffer.c
··· 394 394 bool accounting_accumulated = false; 395 395 do { 396 396 if (race_fault()) { 397 - ret = -BCH_ERR_journal_reclaim_would_deadlock; 397 + ret = bch_err_throw(c, journal_reclaim_would_deadlock); 398 398 break; 399 399 } 400 400 ··· 633 633 struct bch_fs *c = trans->c; 634 634 635 635 if (!enumerated_ref_tryget(&c->writes, BCH_WRITE_REF_btree_write_buffer)) 636 - return -BCH_ERR_erofs_no_writes; 636 + return bch_err_throw(c, erofs_no_writes); 637 637 638 638 int ret = bch2_btree_write_buffer_flush_nocheck_rw(trans); 639 639 enumerated_ref_put(&c->writes, BCH_WRITE_REF_btree_write_buffer); ··· 676 676 goto err; 677 677 678 678 bch2_bkey_buf_copy(last_flushed, c, tmp.k); 679 - ret = -BCH_ERR_transaction_restart_write_buffer_flush; 679 + ret = bch_err_throw(c, transaction_restart_write_buffer_flush); 680 680 } 681 681 err: 682 682 bch2_bkey_buf_exit(&tmp, c);
+99 -64
fs/bcachefs/buckets.c
··· 221 221 bch2_bkey_val_to_text(&buf, c, k), buf.buf))) { 222 222 if (!p.ptr.cached && 223 223 data_type == BCH_DATA_btree) { 224 + switch (g->data_type) { 225 + case BCH_DATA_sb: 226 + bch_err(c, "btree and superblock in the same bucket - cannot repair"); 227 + ret = bch_err_throw(c, fsck_repair_unimplemented); 228 + goto out; 229 + case BCH_DATA_journal: 230 + ret = bch2_dev_journal_bucket_delete(ca, PTR_BUCKET_NR(ca, &p.ptr)); 231 + bch_err_msg(c, ret, "error deleting journal bucket %zu", 232 + PTR_BUCKET_NR(ca, &p.ptr)); 233 + if (ret) 234 + goto out; 235 + break; 236 + } 237 + 224 238 g->data_type = data_type; 225 239 g->stripe_sectors = 0; 226 240 g->dirty_sectors = 0; ··· 284 270 struct printbuf buf = PRINTBUF; 285 271 int ret = 0; 286 272 273 + /* We don't yet do btree key updates correctly for when we're RW */ 274 + BUG_ON(test_bit(BCH_FS_rw, &c->flags)); 275 + 287 276 bkey_for_each_ptr_decode(k.k, ptrs_c, p, entry_c) { 288 277 ret = bch2_check_fix_ptr(trans, k, p, entry_c, &do_update); 289 278 if (ret) ··· 294 277 } 295 278 296 279 if (do_update) { 297 - if (flags & BTREE_TRIGGER_is_root) { 298 - bch_err(c, "cannot update btree roots yet"); 299 - ret = -EINVAL; 300 - goto err; 301 - } 302 - 303 280 struct bkey_i *new = bch2_bkey_make_mut_noupdate(trans, k); 304 281 ret = PTR_ERR_OR_ZERO(new); 305 282 if (ret) 306 283 goto err; 307 284 308 - rcu_read_lock(); 309 - bch2_bkey_drop_ptrs(bkey_i_to_s(new), ptr, !bch2_dev_exists(c, ptr->dev)); 310 - rcu_read_unlock(); 285 + scoped_guard(rcu) 286 + bch2_bkey_drop_ptrs(bkey_i_to_s(new), ptr, !bch2_dev_exists(c, ptr->dev)); 311 287 312 288 if (level) { 313 289 /* ··· 309 299 * sort it out: 310 300 */ 311 301 struct bkey_ptrs ptrs = bch2_bkey_ptrs(bkey_i_to_s(new)); 312 - rcu_read_lock(); 313 - bkey_for_each_ptr(ptrs, ptr) { 314 - struct bch_dev *ca = bch2_dev_rcu(c, ptr->dev); 315 - struct bucket *g = PTR_GC_BUCKET(ca, ptr); 316 - 317 - ptr->gen = g->gen; 318 - } 319 - rcu_read_unlock(); 302 + scoped_guard(rcu) 303 + bkey_for_each_ptr(ptrs, ptr) { 304 + struct bch_dev *ca = bch2_dev_rcu(c, ptr->dev); 305 + ptr->gen = PTR_GC_BUCKET(ca, ptr)->gen; 306 + } 320 307 } else { 321 308 struct bkey_ptrs ptrs; 322 309 union bch_extent_entry *entry; ··· 377 370 bch_info(c, "new key %s", buf.buf); 378 371 } 379 372 380 - struct btree_iter iter; 381 - bch2_trans_node_iter_init(trans, &iter, btree, new->k.p, 0, level, 382 - BTREE_ITER_intent|BTREE_ITER_all_snapshots); 383 - ret = bch2_btree_iter_traverse(trans, &iter) ?: 384 - bch2_trans_update(trans, &iter, new, 385 - BTREE_UPDATE_internal_snapshot_node| 386 - BTREE_TRIGGER_norun); 387 - bch2_trans_iter_exit(trans, &iter); 388 - if (ret) 389 - goto err; 373 + if (!(flags & BTREE_TRIGGER_is_root)) { 374 + struct btree_iter iter; 375 + bch2_trans_node_iter_init(trans, &iter, btree, new->k.p, 0, level, 376 + BTREE_ITER_intent|BTREE_ITER_all_snapshots); 377 + ret = bch2_btree_iter_traverse(trans, &iter) ?: 378 + bch2_trans_update(trans, &iter, new, 379 + BTREE_UPDATE_internal_snapshot_node| 380 + BTREE_TRIGGER_norun); 381 + bch2_trans_iter_exit(trans, &iter); 382 + if (ret) 383 + goto err; 390 384 391 - if (level) 392 - bch2_btree_node_update_key_early(trans, btree, level - 1, k, new); 385 + if (level) 386 + bch2_btree_node_update_key_early(trans, btree, level - 1, k, new); 387 + } else { 388 + struct jset_entry *e = bch2_trans_jset_entry_alloc(trans, 389 + jset_u64s(new->k.u64s)); 390 + ret = PTR_ERR_OR_ZERO(e); 391 + if (ret) 392 + goto err; 393 + 394 + journal_entry_set(e, 395 + 
BCH_JSET_ENTRY_btree_root, 396 + btree, level - 1, 397 + new, new->k.u64s); 398 + 399 + /* 400 + * no locking, we're single threaded and not rw yet, see 401 + * the big assertino above that we repeat here: 402 + */ 403 + BUG_ON(test_bit(BCH_FS_rw, &c->flags)); 404 + 405 + struct btree *b = bch2_btree_id_root(c, btree)->b; 406 + bkey_copy(&b->key, new); 407 + } 393 408 } 394 409 err: 395 410 printbuf_exit(&buf); ··· 435 406 if (insert) { 436 407 bch2_trans_updates_to_text(buf, trans); 437 408 __bch2_inconsistent_error(c, buf); 438 - ret = -BCH_ERR_bucket_ref_update; 409 + /* 410 + * If we're in recovery, run_explicit_recovery_pass might give 411 + * us an error code for rewinding recovery 412 + */ 413 + if (!ret) 414 + ret = bch_err_throw(c, bucket_ref_update); 415 + } else { 416 + /* Always ignore overwrite errors, so that deletion works */ 417 + ret = 0; 439 418 } 440 419 441 420 if (print || insert) ··· 632 595 struct bch_dev *ca = bch2_dev_tryget(c, p.ptr.dev); 633 596 if (unlikely(!ca)) { 634 597 if (insert && p.ptr.dev != BCH_SB_MEMBER_INVALID) 635 - ret = -BCH_ERR_trigger_pointer; 598 + ret = bch_err_throw(c, trigger_pointer); 636 599 goto err; 637 600 } 638 601 ··· 640 603 if (!bucket_valid(ca, bucket.offset)) { 641 604 if (insert) { 642 605 bch2_dev_bucket_missing(ca, bucket.offset); 643 - ret = -BCH_ERR_trigger_pointer; 606 + ret = bch_err_throw(c, trigger_pointer); 644 607 } 645 608 goto err; 646 609 } ··· 662 625 if (bch2_fs_inconsistent_on(!g, c, "reference to invalid bucket on device %u\n %s", 663 626 p.ptr.dev, 664 627 (bch2_bkey_val_to_text(&buf, c, k), buf.buf))) { 665 - ret = -BCH_ERR_trigger_pointer; 628 + ret = bch_err_throw(c, trigger_pointer); 666 629 goto err; 667 630 } 668 631 ··· 688 651 s64 sectors, 689 652 enum btree_iter_update_trigger_flags flags) 690 653 { 654 + struct bch_fs *c = trans->c; 655 + 691 656 if (flags & BTREE_TRIGGER_transactional) { 692 657 struct btree_iter iter; 693 658 struct bkey_i_stripe *s = bch2_bkey_get_mut_typed(trans, &iter, ··· 707 668 bch2_trans_inconsistent(trans, 708 669 "stripe pointer doesn't match stripe %llu", 709 670 (u64) p.ec.idx); 710 - ret = -BCH_ERR_trigger_stripe_pointer; 671 + ret = bch_err_throw(c, trigger_stripe_pointer); 711 672 goto err; 712 673 } 713 674 ··· 727 688 } 728 689 729 690 if (flags & BTREE_TRIGGER_gc) { 730 - struct bch_fs *c = trans->c; 731 - 732 691 struct gc_stripe *m = genradix_ptr_alloc(&c->gc_stripes, p.ec.idx, GFP_KERNEL); 733 692 if (!m) { 734 693 bch_err(c, "error allocating memory for gc_stripes, idx %llu", 735 694 (u64) p.ec.idx); 736 - return -BCH_ERR_ENOMEM_mark_stripe_ptr; 695 + return bch_err_throw(c, ENOMEM_mark_stripe_ptr); 737 696 } 738 697 739 698 gc_stripe_lock(m); ··· 746 709 __bch2_inconsistent_error(c, &buf); 747 710 bch2_print_str(c, KERN_ERR, buf.buf); 748 711 printbuf_exit(&buf); 749 - return -BCH_ERR_trigger_stripe_pointer; 712 + return bch_err_throw(c, trigger_stripe_pointer); 750 713 } 751 714 752 715 m->block_sectors[p.ec.block] += sectors; ··· 769 732 static int __trigger_extent(struct btree_trans *trans, 770 733 enum btree_id btree_id, unsigned level, 771 734 struct bkey_s_c k, 772 - enum btree_iter_update_trigger_flags flags, 773 - s64 *replicas_sectors) 735 + enum btree_iter_update_trigger_flags flags) 774 736 { 775 737 bool gc = flags & BTREE_TRIGGER_gc; 776 738 struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(k); ··· 779 743 ? 
BCH_DATA_btree 780 744 : BCH_DATA_user; 781 745 int ret = 0; 746 + 747 + s64 replicas_sectors = 0; 782 748 783 749 struct disk_accounting_pos acc_replicas_key; 784 750 memset(&acc_replicas_key, 0, sizeof(acc_replicas_key)); ··· 808 770 if (ret) 809 771 return ret; 810 772 } else if (!p.has_ec) { 811 - *replicas_sectors += disk_sectors; 773 + replicas_sectors += disk_sectors; 812 774 replicas_entry_add_dev(&acc_replicas_key.replicas, p.ptr.dev); 813 775 } else { 814 776 ret = bch2_trigger_stripe_ptr(trans, k, p, data_type, disk_sectors, flags); ··· 846 808 } 847 809 848 810 if (acc_replicas_key.replicas.nr_devs) { 849 - ret = bch2_disk_accounting_mod(trans, &acc_replicas_key, replicas_sectors, 1, gc); 811 + ret = bch2_disk_accounting_mod(trans, &acc_replicas_key, &replicas_sectors, 1, gc); 850 812 if (ret) 851 813 return ret; 852 814 } 853 815 854 816 if (acc_replicas_key.replicas.nr_devs && !level && k.k->p.snapshot) { 855 - ret = bch2_disk_accounting_mod2_nr(trans, gc, replicas_sectors, 1, snapshot, k.k->p.snapshot); 817 + ret = bch2_disk_accounting_mod2_nr(trans, gc, &replicas_sectors, 1, snapshot, k.k->p.snapshot); 856 818 if (ret) 857 819 return ret; 858 820 } ··· 868 830 } 869 831 870 832 if (level) { 871 - ret = bch2_disk_accounting_mod2_nr(trans, gc, replicas_sectors, 1, btree, btree_id); 833 + ret = bch2_disk_accounting_mod2_nr(trans, gc, &replicas_sectors, 1, btree, btree_id); 872 834 if (ret) 873 835 return ret; 874 836 } else { ··· 877 839 s64 v[3] = { 878 840 insert ? 1 : -1, 879 841 insert ? k.k->size : -((s64) k.k->size), 880 - *replicas_sectors, 842 + replicas_sectors, 881 843 }; 882 844 ret = bch2_disk_accounting_mod2(trans, gc, v, inum, k.k->p.inode); 883 845 if (ret) ··· 909 871 return 0; 910 872 911 873 if (flags & (BTREE_TRIGGER_transactional|BTREE_TRIGGER_gc)) { 912 - s64 old_replicas_sectors = 0, new_replicas_sectors = 0; 913 - 914 874 if (old.k->type) { 915 875 int ret = __trigger_extent(trans, btree, level, old, 916 - flags & ~BTREE_TRIGGER_insert, 917 - &old_replicas_sectors); 876 + flags & ~BTREE_TRIGGER_insert); 918 877 if (ret) 919 878 return ret; 920 879 } 921 880 922 881 if (new.k->type) { 923 882 int ret = __trigger_extent(trans, btree, level, new.s_c, 924 - flags & ~BTREE_TRIGGER_overwrite, 925 - &new_replicas_sectors); 883 + flags & ~BTREE_TRIGGER_overwrite); 926 884 if (ret) 927 885 return ret; 928 886 } ··· 1005 971 bch2_data_type_str(type), 1006 972 bch2_data_type_str(type)); 1007 973 1008 - bool print = bch2_count_fsck_err(c, bucket_metadata_type_mismatch, &buf); 974 + bch2_count_fsck_err(c, bucket_metadata_type_mismatch, &buf); 1009 975 1010 - bch2_run_explicit_recovery_pass(c, &buf, 976 + ret = bch2_run_explicit_recovery_pass(c, &buf, 1011 977 BCH_RECOVERY_PASS_check_allocations, 0); 1012 978 1013 - if (print) 1014 - bch2_print_str(c, KERN_ERR, buf.buf); 979 + /* Always print, this is always fatal */ 980 + bch2_print_str(c, KERN_ERR, buf.buf); 1015 981 printbuf_exit(&buf); 1016 - ret = -BCH_ERR_metadata_bucket_inconsistency; 982 + if (!ret) 983 + ret = bch_err_throw(c, metadata_bucket_inconsistency); 1017 984 goto err; 1018 985 } 1019 986 ··· 1067 1032 err_unlock: 1068 1033 bucket_unlock(g); 1069 1034 err: 1070 - return -BCH_ERR_metadata_bucket_inconsistency; 1035 + return bch_err_throw(c, metadata_bucket_inconsistency); 1071 1036 } 1072 1037 1073 1038 int bch2_trans_mark_metadata_bucket(struct btree_trans *trans, ··· 1282 1247 ret = 0; 1283 1248 } else { 1284 1249 atomic64_set(&c->sectors_available, sectors_available); 1285 - ret = 
-BCH_ERR_ENOSPC_disk_reservation; 1250 + ret = bch_err_throw(c, ENOSPC_disk_reservation); 1286 1251 } 1287 1252 1288 1253 mutex_unlock(&c->sectors_available_lock); ··· 1311 1276 GFP_KERNEL|__GFP_ZERO); 1312 1277 if (!ca->buckets_nouse) { 1313 1278 bch2_dev_put(ca); 1314 - return -BCH_ERR_ENOMEM_buckets_nouse; 1279 + return bch_err_throw(c, ENOMEM_buckets_nouse); 1315 1280 } 1316 1281 } 1317 1282 ··· 1336 1301 lockdep_assert_held(&c->state_lock); 1337 1302 1338 1303 if (resize && ca->buckets_nouse) 1339 - return -BCH_ERR_no_resize_with_buckets_nouse; 1304 + return bch_err_throw(c, no_resize_with_buckets_nouse); 1340 1305 1341 1306 bucket_gens = bch2_kvmalloc(struct_size(bucket_gens, b, nbuckets), 1342 1307 GFP_KERNEL|__GFP_ZERO); 1343 1308 if (!bucket_gens) { 1344 - ret = -BCH_ERR_ENOMEM_bucket_gens; 1309 + ret = bch_err_throw(c, ENOMEM_bucket_gens); 1345 1310 goto err; 1346 1311 } 1347 1312 ··· 1360 1325 sizeof(bucket_gens->b[0]) * copy); 1361 1326 } 1362 1327 1363 - ret = bch2_bucket_bitmap_resize(&ca->bucket_backpointer_mismatch, 1328 + ret = bch2_bucket_bitmap_resize(ca, &ca->bucket_backpointer_mismatch, 1364 1329 ca->mi.nbuckets, nbuckets) ?: 1365 - bch2_bucket_bitmap_resize(&ca->bucket_backpointer_empty, 1330 + bch2_bucket_bitmap_resize(ca, &ca->bucket_backpointer_empty, 1366 1331 ca->mi.nbuckets, nbuckets); 1367 1332 1368 1333 rcu_assign_pointer(ca->bucket_gens, bucket_gens); ··· 1389 1354 { 1390 1355 ca->usage = alloc_percpu(struct bch_dev_usage_full); 1391 1356 if (!ca->usage) 1392 - return -BCH_ERR_ENOMEM_usage_init; 1357 + return bch_err_throw(c, ENOMEM_usage_init); 1393 1358 1394 1359 return bch2_dev_buckets_resize(c, ca, ca->mi.nbuckets); 1395 1360 }
+4 -8
fs/bcachefs/buckets.h
··· 84 84 85 85 static inline int bucket_gen_get(struct bch_dev *ca, size_t b) 86 86 { 87 - rcu_read_lock(); 88 - int ret = bucket_gen_get_rcu(ca, b); 89 - rcu_read_unlock(); 90 - return ret; 87 + guard(rcu)(); 88 + return bucket_gen_get_rcu(ca, b); 91 89 } 92 90 93 91 static inline size_t PTR_BUCKET_NR(const struct bch_dev *ca, ··· 154 156 */ 155 157 static inline int dev_ptr_stale(struct bch_dev *ca, const struct bch_extent_ptr *ptr) 156 158 { 157 - rcu_read_lock(); 158 - int ret = dev_ptr_stale_rcu(ca, ptr); 159 - rcu_read_unlock(); 160 - return ret; 159 + guard(rcu)(); 160 + return dev_ptr_stale_rcu(ca, ptr); 161 161 } 162 162 163 163 /* Device usage: */
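These helpers replace open-coded rcu_read_lock()/rcu_read_unlock() pairs with guard(rcu)(), so the read-side critical section ends automatically when the function returns. A stand-alone sketch of the same scope-based idea, using the compiler's cleanup attribute and a made-up mutex_guard() macro over a pthread mutex rather than the kernel's guard() machinery:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 42;

static void unlock_on_scope_exit(pthread_mutex_t **l)
{
	pthread_mutex_unlock(*l);
}

/*
 * Made-up scope guard (one per scope in this toy): lock immediately, and let
 * the compiler run the cleanup handler, which unlocks, when the guard variable
 * goes out of scope, including on early return.
 */
#define mutex_guard(l)							\
	pthread_mutex_t *__guard					\
		__attribute__((cleanup(unlock_on_scope_exit))) =	\
		(pthread_mutex_lock(l), (l))

static int read_counter(void)
{
	mutex_guard(&lock);
	return counter;		/* no explicit unlock needed */
}

int main(void)
{
	printf("counter = %d\n", read_counter());
	return 0;
}

The win is the same as in the hunks above: early returns no longer need a matching unlock call.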
+2 -1
fs/bcachefs/buckets_waiting_for_journal.c
··· 108 108 realloc: 109 109 n = kvmalloc(sizeof(*n) + (sizeof(n->d[0]) << new_bits), GFP_KERNEL); 110 110 if (!n) { 111 - ret = -BCH_ERR_ENOMEM_buckets_waiting_for_journal_set; 111 + struct bch_fs *c = container_of(b, struct bch_fs, buckets_waiting_for_journal); 112 + ret = bch_err_throw(c, ENOMEM_buckets_waiting_for_journal_set); 112 113 goto out; 113 114 } 114 115
+3 -6
fs/bcachefs/chardev.c
··· 613 613 if (!dev) 614 614 return -EINVAL; 615 615 616 - rcu_read_lock(); 616 + guard(rcu)(); 617 617 for_each_online_member_rcu(c, ca) 618 - if (ca->dev == dev) { 619 - rcu_read_unlock(); 618 + if (ca->dev == dev) 620 619 return ca->dev_idx; 621 - } 622 - rcu_read_unlock(); 623 620 624 - return -BCH_ERR_ENOENT_dev_idx_not_found; 621 + return bch_err_throw(c, ENOENT_dev_idx_not_found); 625 622 } 626 623 627 624 static long bch2_ioctl_disk_resize(struct bch_fs *c,
+4 -4
fs/bcachefs/checksum.c
··· 173 173 174 174 if (bch2_fs_inconsistent_on(!c->chacha20_key_set, 175 175 c, "attempting to encrypt without encryption key")) 176 - return -BCH_ERR_no_encryption_key; 176 + return bch_err_throw(c, no_encryption_key); 177 177 178 178 bch2_chacha20(&c->chacha20_key, nonce, data, len); 179 179 return 0; ··· 262 262 263 263 if (bch2_fs_inconsistent_on(!c->chacha20_key_set, 264 264 c, "attempting to encrypt without encryption key")) 265 - return -BCH_ERR_no_encryption_key; 265 + return bch_err_throw(c, no_encryption_key); 266 266 267 267 bch2_chacha20_init(&chacha_state, &c->chacha20_key, nonce); 268 268 ··· 375 375 prt_str(&buf, ")"); 376 376 WARN_RATELIMIT(1, "%s", buf.buf); 377 377 printbuf_exit(&buf); 378 - return -BCH_ERR_recompute_checksum; 378 + return bch_err_throw(c, recompute_checksum); 379 379 } 380 380 381 381 for (i = splits; i < splits + ARRAY_SIZE(splits); i++) { ··· 659 659 crypt = bch2_sb_field_resize(&c->disk_sb, crypt, 660 660 sizeof(*crypt) / sizeof(u64)); 661 661 if (!crypt) { 662 - ret = -BCH_ERR_ENOSPC_sb_crypt; 662 + ret = bch_err_throw(c, ENOSPC_sb_crypt); 663 663 goto err; 664 664 } 665 665
+18 -29
fs/bcachefs/clock.c
··· 53 53 54 54 struct io_clock_wait { 55 55 struct io_timer io_timer; 56 - struct timer_list cpu_timer; 57 56 struct task_struct *task; 58 57 int expired; 59 58 }; ··· 61 62 { 62 63 struct io_clock_wait *wait = container_of(timer, 63 64 struct io_clock_wait, io_timer); 64 - 65 - wait->expired = 1; 66 - wake_up_process(wait->task); 67 - } 68 - 69 - static void io_clock_cpu_timeout(struct timer_list *timer) 70 - { 71 - struct io_clock_wait *wait = container_of(timer, 72 - struct io_clock_wait, cpu_timer); 73 65 74 66 wait->expired = 1; 75 67 wake_up_process(wait->task); ··· 80 90 bch2_io_timer_del(clock, &wait.io_timer); 81 91 } 82 92 83 - void bch2_kthread_io_clock_wait(struct io_clock *clock, 84 - u64 io_until, unsigned long cpu_timeout) 93 + unsigned long bch2_kthread_io_clock_wait_once(struct io_clock *clock, 94 + u64 io_until, unsigned long cpu_timeout) 85 95 { 86 96 bool kthread = (current->flags & PF_KTHREAD) != 0; 87 97 struct io_clock_wait wait = { ··· 93 103 94 104 bch2_io_timer_add(clock, &wait.io_timer); 95 105 96 - timer_setup_on_stack(&wait.cpu_timer, io_clock_cpu_timeout, 0); 97 - 98 - if (cpu_timeout != MAX_SCHEDULE_TIMEOUT) 99 - mod_timer(&wait.cpu_timer, cpu_timeout + jiffies); 100 - 101 - do { 102 - set_current_state(TASK_INTERRUPTIBLE); 103 - if (kthread && kthread_should_stop()) 104 - break; 105 - 106 - if (wait.expired) 107 - break; 108 - 109 - schedule(); 106 + set_current_state(TASK_INTERRUPTIBLE); 107 + if (!(kthread && kthread_should_stop())) { 108 + cpu_timeout = schedule_timeout(cpu_timeout); 110 109 try_to_freeze(); 111 - } while (0); 110 + } 112 111 113 112 __set_current_state(TASK_RUNNING); 114 - timer_delete_sync(&wait.cpu_timer); 115 - timer_destroy_on_stack(&wait.cpu_timer); 116 113 bch2_io_timer_del(clock, &wait.io_timer); 114 + return cpu_timeout; 115 + } 116 + 117 + void bch2_kthread_io_clock_wait(struct io_clock *clock, 118 + u64 io_until, unsigned long cpu_timeout) 119 + { 120 + bool kthread = (current->flags & PF_KTHREAD) != 0; 121 + 122 + while (!(kthread && kthread_should_stop()) && 123 + cpu_timeout && 124 + atomic64_read(&clock->now) < io_until) 125 + cpu_timeout = bch2_kthread_io_clock_wait_once(clock, io_until, cpu_timeout); 117 126 } 118 127 119 128 static struct io_timer *get_expired_timer(struct io_clock *clock, u64 now)
+1
fs/bcachefs/clock.h
··· 4 4 5 5 void bch2_io_timer_add(struct io_clock *, struct io_timer *); 6 6 void bch2_io_timer_del(struct io_clock *, struct io_timer *); 7 + unsigned long bch2_kthread_io_clock_wait_once(struct io_clock *, u64, unsigned long); 7 8 void bch2_kthread_io_clock_wait(struct io_clock *, u64, unsigned long); 8 9 9 10 void __bch2_increment_clock(struct io_clock *, u64);
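bch2_kthread_io_clock_wait() is reworked above: the on-stack CPU timer is dropped, a single-shot helper (bch2_kthread_io_clock_wait_once) sleeps once and reports the remaining timeout, and the caller loops until the IO clock target, the CPU timeout, or kthread_should_stop() ends the wait. A toy sketch of that wait-once-and-loop shape, with stand-in functions (wait_once, io_now) that only simulate the sleep:

#include <stdio.h>

static unsigned long io_now;	/* toy "IO clock", advanced as if IO completed */

/* stand-in for the single-shot wait: sleep once, report the unused CPU timeout */
static unsigned long wait_once(unsigned long cpu_timeout)
{
	io_now += 30;					/* pretend IO progressed */
	return cpu_timeout > 10 ? cpu_timeout - 10 : 0;	/* pretend 10 ticks elapsed */
}

static void wait_for_io_clock(unsigned long io_until, unsigned long cpu_timeout)
{
	/* loop until either the IO clock target or the CPU timeout is reached */
	while (cpu_timeout && io_now < io_until)
		cpu_timeout = wait_once(cpu_timeout);
}

int main(void)
{
	wait_for_io_clock(100, 50);
	printf("io clock reached %lu\n", io_now);
	return 0;
}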
+10 -10
fs/bcachefs/compress.c
··· 187 187 __bch2_compression_types[crc.compression_type])) 188 188 ret = bch2_check_set_has_compressed_data(c, opt); 189 189 else 190 - ret = -BCH_ERR_compression_workspace_not_initialized; 190 + ret = bch_err_throw(c, compression_workspace_not_initialized); 191 191 if (ret) 192 192 goto err; 193 193 } ··· 200 200 ret2 = LZ4_decompress_safe_partial(src_data.b, dst_data, 201 201 src_len, dst_len, dst_len); 202 202 if (ret2 != dst_len) 203 - ret = -BCH_ERR_decompress_lz4; 203 + ret = bch_err_throw(c, decompress_lz4); 204 204 break; 205 205 case BCH_COMPRESSION_TYPE_gzip: { 206 206 z_stream strm = { ··· 219 219 mempool_free(workspace, workspace_pool); 220 220 221 221 if (ret2 != Z_STREAM_END) 222 - ret = -BCH_ERR_decompress_gzip; 222 + ret = bch_err_throw(c, decompress_gzip); 223 223 break; 224 224 } 225 225 case BCH_COMPRESSION_TYPE_zstd: { ··· 227 227 size_t real_src_len = le32_to_cpup(src_data.b); 228 228 229 229 if (real_src_len > src_len - 4) { 230 - ret = -BCH_ERR_decompress_zstd_src_len_bad; 230 + ret = bch_err_throw(c, decompress_zstd_src_len_bad); 231 231 goto err; 232 232 } 233 233 ··· 241 241 mempool_free(workspace, workspace_pool); 242 242 243 243 if (ret2 != dst_len) 244 - ret = -BCH_ERR_decompress_zstd; 244 + ret = bch_err_throw(c, decompress_zstd); 245 245 break; 246 246 } 247 247 default: ··· 270 270 bch2_write_op_error(op, op->pos.offset, 271 271 "extent too big to decompress (%u > %u)", 272 272 crc->uncompressed_size << 9, c->opts.encoded_extent_max); 273 - return -BCH_ERR_decompress_exceeded_max_encoded_extent; 273 + return bch_err_throw(c, decompress_exceeded_max_encoded_extent); 274 274 } 275 275 276 276 data = __bounce_alloc(c, dst_len, WRITE); ··· 314 314 315 315 if (crc.uncompressed_size << 9 > c->opts.encoded_extent_max || 316 316 crc.compressed_size << 9 > c->opts.encoded_extent_max) 317 - return -BCH_ERR_decompress_exceeded_max_encoded_extent; 317 + return bch_err_throw(c, decompress_exceeded_max_encoded_extent); 318 318 319 319 dst_data = dst_len == dst_iter.bi_size 320 320 ? __bio_map_or_bounce(c, dst, dst_iter, WRITE) ··· 656 656 if (!mempool_initialized(&c->compression_bounce[READ]) && 657 657 mempool_init_kvmalloc_pool(&c->compression_bounce[READ], 658 658 1, c->opts.encoded_extent_max)) 659 - return -BCH_ERR_ENOMEM_compression_bounce_read_init; 659 + return bch_err_throw(c, ENOMEM_compression_bounce_read_init); 660 660 661 661 if (!mempool_initialized(&c->compression_bounce[WRITE]) && 662 662 mempool_init_kvmalloc_pool(&c->compression_bounce[WRITE], 663 663 1, c->opts.encoded_extent_max)) 664 - return -BCH_ERR_ENOMEM_compression_bounce_write_init; 664 + return bch_err_throw(c, ENOMEM_compression_bounce_write_init); 665 665 666 666 for (i = compression_types; 667 667 i < compression_types + ARRAY_SIZE(compression_types); ··· 675 675 if (mempool_init_kvmalloc_pool( 676 676 &c->compress_workspace[i->type], 677 677 1, i->compress_workspace)) 678 - return -BCH_ERR_ENOMEM_compression_workspace_init; 678 + return bch_err_throw(c, ENOMEM_compression_workspace_init); 679 679 } 680 680 681 681 return 0;
+45 -1
fs/bcachefs/darray.h
··· 8 8 * Inspired by CCAN's darray 9 9 */ 10 10 11 + #include <linux/cleanup.h> 11 12 #include <linux/slab.h> 12 13 13 14 #define DARRAY_PREALLOCATED(_type, _nr) \ ··· 88 87 #define darray_remove_item(_d, _pos) \ 89 88 array_remove_item((_d)->data, (_d)->nr, (_pos) - (_d)->data) 90 89 91 - #define __darray_for_each(_d, _i) \ 90 + #define darray_find_p(_d, _i, cond) \ 91 + ({ \ 92 + typeof((_d).data) _ret = NULL; \ 93 + \ 94 + darray_for_each(_d, _i) \ 95 + if (cond) { \ 96 + _ret = _i; \ 97 + break; \ 98 + } \ 99 + _ret; \ 100 + }) 101 + 102 + #define darray_find(_d, _item) darray_find_p(_d, _i, *_i == _item) 103 + 104 + /* Iteration: */ 105 + 106 + #define __darray_for_each(_d, _i) \ 92 107 for ((_i) = (_d).data; _i < (_d).data + (_d).nr; _i++) 93 108 94 109 #define darray_for_each(_d, _i) \ ··· 112 95 113 96 #define darray_for_each_reverse(_d, _i) \ 114 97 for (typeof(&(_d).data[0]) _i = (_d).data + (_d).nr - 1; _i >= (_d).data && (_d).nr; --_i) 98 + 99 + /* Init/exit */ 115 100 116 101 #define darray_init(_d) \ 117 102 do { \ ··· 129 110 kvfree((_d)->data); \ 130 111 darray_init(_d); \ 131 112 } while (0) 113 + 114 + #define DEFINE_DARRAY_CLASS(_type) \ 115 + DEFINE_CLASS(_type, _type, darray_exit(&(_T)), (_type) {}, void) 116 + 117 + #define DEFINE_DARRAY(_type) \ 118 + typedef DARRAY(_type) darray_##_type; \ 119 + DEFINE_DARRAY_CLASS(darray_##_type) 120 + 121 + #define DEFINE_DARRAY_NAMED(_name, _type) \ 122 + typedef DARRAY(_type) _name; \ 123 + DEFINE_DARRAY_CLASS(_name) 124 + 125 + DEFINE_DARRAY_CLASS(darray_char); 126 + DEFINE_DARRAY_CLASS(darray_str) 127 + DEFINE_DARRAY_CLASS(darray_const_str) 128 + 129 + DEFINE_DARRAY_CLASS(darray_u8) 130 + DEFINE_DARRAY_CLASS(darray_u16) 131 + DEFINE_DARRAY_CLASS(darray_u32) 132 + DEFINE_DARRAY_CLASS(darray_u64) 133 + 134 + DEFINE_DARRAY_CLASS(darray_s8) 135 + DEFINE_DARRAY_CLASS(darray_s16) 136 + DEFINE_DARRAY_CLASS(darray_s32) 137 + DEFINE_DARRAY_CLASS(darray_s64) 132 138 133 139 #endif /* _BCACHEFS_DARRAY_H */
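The new DEFINE_DARRAY_CLASS()/DEFINE_DARRAY() macros above wrap a darray type so that a declared instance is freed automatically at end of scope, and darray_find_p()/darray_find() add simple linear search. A stand-alone approximation of the auto-freeing part, using the cleanup attribute on a local toy type (darray_int here is invented for the sketch, not the kernel type):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* local toy: a growable array of ints */
typedef struct {
	int	*data;
	size_t	nr, size;
} darray_int;

static int darray_int_push(darray_int *d, int v)
{
	if (d->nr == d->size) {
		size_t new_size = d->size ? d->size * 2 : 8;
		int *p = realloc(d->data, new_size * sizeof(*p));

		if (!p)
			return -1;
		d->data = p;
		d->size = new_size;
	}
	d->data[d->nr++] = v;
	return 0;
}

static void darray_int_exit(darray_int *d)
{
	free(d->data);
	memset(d, 0, sizeof(*d));
}

/* the "class" part: declare with this and the exit hook runs at end of scope */
#define CLASS_darray_int \
	__attribute__((cleanup(darray_int_exit))) darray_int

int main(void)
{
	CLASS_darray_int d = { 0 };

	for (int i = 0; i < 5; i++)
		darray_int_push(&d, i * i);

	for (size_t i = 0; i < d.nr; i++)
		printf("%d ", d.data[i]);
	printf("\n");
	return 0;	/* darray_int_exit(&d) runs here automatically */
}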
+103 -73
fs/bcachefs/data_update.c
··· 66 66 } 67 67 } 68 68 69 - static bool bkey_nocow_lock(struct bch_fs *c, struct moving_context *ctxt, struct bkey_s_c k) 69 + static noinline_for_stack 70 + bool __bkey_nocow_lock(struct bch_fs *c, struct moving_context *ctxt, struct bkey_ptrs_c ptrs, 71 + const struct bch_extent_ptr *start) 70 72 { 71 - struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(k); 73 + if (!ctxt) { 74 + bkey_for_each_ptr(ptrs, ptr) { 75 + if (ptr == start) 76 + break; 72 77 78 + struct bch_dev *ca = bch2_dev_have_ref(c, ptr->dev); 79 + struct bpos bucket = PTR_BUCKET_POS(ca, ptr); 80 + bch2_bucket_nocow_unlock(&c->nocow_locks, bucket, 0); 81 + } 82 + return false; 83 + } 84 + 85 + __bkey_for_each_ptr(start, ptrs.end, ptr) { 86 + struct bch_dev *ca = bch2_dev_have_ref(c, ptr->dev); 87 + struct bpos bucket = PTR_BUCKET_POS(ca, ptr); 88 + 89 + bool locked; 90 + move_ctxt_wait_event(ctxt, 91 + (locked = bch2_bucket_nocow_trylock(&c->nocow_locks, bucket, 0)) || 92 + list_empty(&ctxt->ios)); 93 + if (!locked) 94 + bch2_bucket_nocow_lock(&c->nocow_locks, bucket, 0); 95 + } 96 + return true; 97 + } 98 + 99 + static bool bkey_nocow_lock(struct bch_fs *c, struct moving_context *ctxt, struct bkey_ptrs_c ptrs) 100 + { 73 101 bkey_for_each_ptr(ptrs, ptr) { 74 102 struct bch_dev *ca = bch2_dev_have_ref(c, ptr->dev); 75 103 struct bpos bucket = PTR_BUCKET_POS(ca, ptr); 76 104 77 - if (ctxt) { 78 - bool locked; 79 - 80 - move_ctxt_wait_event(ctxt, 81 - (locked = bch2_bucket_nocow_trylock(&c->nocow_locks, bucket, 0)) || 82 - list_empty(&ctxt->ios)); 83 - 84 - if (!locked) 85 - bch2_bucket_nocow_lock(&c->nocow_locks, bucket, 0); 86 - } else { 87 - if (!bch2_bucket_nocow_trylock(&c->nocow_locks, bucket, 0)) { 88 - bkey_for_each_ptr(ptrs, ptr2) { 89 - if (ptr2 == ptr) 90 - break; 91 - 92 - ca = bch2_dev_have_ref(c, ptr2->dev); 93 - bucket = PTR_BUCKET_POS(ca, ptr2); 94 - bch2_bucket_nocow_unlock(&c->nocow_locks, bucket, 0); 95 - } 96 - return false; 97 - } 98 - } 105 + if (!bch2_bucket_nocow_trylock(&c->nocow_locks, bucket, 0)) 106 + return __bkey_nocow_lock(c, ctxt, ptrs, ptr); 99 107 } 108 + 100 109 return true; 101 110 } 102 111 ··· 255 246 bch2_print_str(c, KERN_ERR, buf.buf); 256 247 printbuf_exit(&buf); 257 248 258 - return -BCH_ERR_invalid_bkey; 249 + return bch_err_throw(c, invalid_bkey); 259 250 } 260 251 261 252 static int __bch2_data_update_index_update(struct btree_trans *trans, ··· 376 367 bch2_bkey_durability(c, bkey_i_to_s_c(&new->k_i)); 377 368 378 369 /* Now, drop excess replicas: */ 379 - rcu_read_lock(); 370 + scoped_guard(rcu) { 380 371 restart_drop_extra_replicas: 381 - bkey_for_each_ptr_decode(old.k, bch2_bkey_ptrs(bkey_i_to_s(insert)), p, entry) { 382 - unsigned ptr_durability = bch2_extent_ptr_durability(c, &p); 372 + bkey_for_each_ptr_decode(old.k, bch2_bkey_ptrs(bkey_i_to_s(insert)), p, entry) { 373 + unsigned ptr_durability = bch2_extent_ptr_durability(c, &p); 383 374 384 - if (!p.ptr.cached && 385 - durability - ptr_durability >= m->op.opts.data_replicas) { 386 - durability -= ptr_durability; 375 + if (!p.ptr.cached && 376 + durability - ptr_durability >= m->op.opts.data_replicas) { 377 + durability -= ptr_durability; 387 378 388 - bch2_extent_ptr_set_cached(c, &m->op.opts, 389 - bkey_i_to_s(insert), &entry->ptr); 390 - goto restart_drop_extra_replicas; 379 + bch2_extent_ptr_set_cached(c, &m->op.opts, 380 + bkey_i_to_s(insert), &entry->ptr); 381 + goto restart_drop_extra_replicas; 382 + } 391 383 } 392 384 } 393 - rcu_read_unlock(); 394 385 395 386 /* Finally, add the pointers we just wrote: */ 396 387 
extent_for_each_ptr_decode(extent_i_to_s(new), p, entry) ··· 532 523 bch2_bkey_buf_exit(&update->k, c); 533 524 } 534 525 535 - static int bch2_update_unwritten_extent(struct btree_trans *trans, 536 - struct data_update *update) 526 + static noinline_for_stack 527 + int bch2_update_unwritten_extent(struct btree_trans *trans, 528 + struct data_update *update) 537 529 { 538 530 struct bch_fs *c = update->op.c; 539 531 struct bkey_i_extent *e; ··· 726 716 bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc); 727 717 } 728 718 729 - int bch2_data_update_bios_init(struct data_update *m, struct bch_fs *c, 730 - struct bch_io_opts *io_opts) 719 + static int __bch2_data_update_bios_init(struct data_update *m, struct bch_fs *c, 720 + struct bch_io_opts *io_opts, 721 + unsigned buf_bytes) 731 722 { 732 - struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(bkey_i_to_s_c(m->k.k)); 733 - const union bch_extent_entry *entry; 734 - struct extent_ptr_decoded p; 735 - 736 - /* write path might have to decompress data: */ 737 - unsigned buf_bytes = 0; 738 - bkey_for_each_ptr_decode(&m->k.k->k, ptrs, p, entry) 739 - buf_bytes = max_t(unsigned, buf_bytes, p.crc.uncompressed_size << 9); 740 - 741 723 unsigned nr_vecs = DIV_ROUND_UP(buf_bytes, PAGE_SIZE); 742 724 743 725 m->bvecs = kmalloc_array(nr_vecs, sizeof*(m->bvecs), GFP_KERNEL); ··· 753 751 return 0; 754 752 } 755 753 754 + int bch2_data_update_bios_init(struct data_update *m, struct bch_fs *c, 755 + struct bch_io_opts *io_opts) 756 + { 757 + struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(bkey_i_to_s_c(m->k.k)); 758 + const union bch_extent_entry *entry; 759 + struct extent_ptr_decoded p; 760 + 761 + /* write path might have to decompress data: */ 762 + unsigned buf_bytes = 0; 763 + bkey_for_each_ptr_decode(&m->k.k->k, ptrs, p, entry) 764 + buf_bytes = max_t(unsigned, buf_bytes, p.crc.uncompressed_size << 9); 765 + 766 + return __bch2_data_update_bios_init(m, c, io_opts, buf_bytes); 767 + } 768 + 756 769 static int can_write_extent(struct bch_fs *c, struct data_update *m) 757 770 { 758 771 if ((m->op.flags & BCH_WRITE_alloc_nowait) && 759 772 unlikely(c->open_buckets_nr_free <= bch2_open_buckets_reserved(m->op.watermark))) 760 - return -BCH_ERR_data_update_done_would_block; 773 + return bch_err_throw(c, data_update_done_would_block); 761 774 762 775 unsigned target = m->op.flags & BCH_WRITE_only_specified_devs 763 776 ? 
m->op.target ··· 782 765 darray_for_each(m->op.devs_have, i) 783 766 __clear_bit(*i, devs.d); 784 767 785 - rcu_read_lock(); 768 + guard(rcu)(); 769 + 786 770 unsigned nr_replicas = 0, i; 787 771 for_each_set_bit(i, devs.d, BCH_SB_MEMBERS_MAX) { 788 772 struct bch_dev *ca = bch2_dev_rcu_noerror(c, i); ··· 800 782 if (nr_replicas >= m->op.nr_replicas) 801 783 break; 802 784 } 803 - rcu_read_unlock(); 804 785 805 786 if (!nr_replicas) 806 - return -BCH_ERR_data_update_done_no_rw_devs; 787 + return bch_err_throw(c, data_update_done_no_rw_devs); 807 788 if (nr_replicas < m->op.nr_replicas) 808 - return -BCH_ERR_insufficient_devices; 789 + return bch_err_throw(c, insufficient_devices); 809 790 return 0; 810 791 } 811 792 ··· 819 802 struct bkey_s_c k) 820 803 { 821 804 struct bch_fs *c = trans->c; 822 - struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(k); 823 - const union bch_extent_entry *entry; 824 - struct extent_ptr_decoded p; 825 - unsigned reserve_sectors = k.k->size * data_opts.extra_replicas; 826 805 int ret = 0; 827 806 828 - /* 829 - * fs is corrupt we have a key for a snapshot node that doesn't exist, 830 - * and we have to check for this because we go rw before repairing the 831 - * snapshots table - just skip it, we can move it later. 832 - */ 833 - if (unlikely(k.k->p.snapshot && !bch2_snapshot_exists(c, k.k->p.snapshot))) 834 - return -BCH_ERR_data_update_done_no_snapshot; 807 + if (k.k->p.snapshot) { 808 + ret = bch2_check_key_has_snapshot(trans, iter, k); 809 + if (bch2_err_matches(ret, BCH_ERR_recovery_will_run)) { 810 + /* Can't repair yet, waiting on other recovery passes */ 811 + return bch_err_throw(c, data_update_done_no_snapshot); 812 + } 813 + if (ret < 0) 814 + return ret; 815 + if (ret) /* key was deleted */ 816 + return bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc) ?: 817 + bch_err_throw(c, data_update_done_no_snapshot); 818 + ret = 0; 819 + } 835 820 836 821 bch2_bkey_buf_init(&m->k); 837 822 bch2_bkey_buf_reassemble(&m->k, c, k); ··· 861 842 862 843 unsigned durability_have = 0, durability_removing = 0; 863 844 845 + struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(bkey_i_to_s_c(m->k.k)); 846 + const union bch_extent_entry *entry; 847 + struct extent_ptr_decoded p; 848 + unsigned reserve_sectors = k.k->size * data_opts.extra_replicas; 849 + unsigned buf_bytes = 0; 850 + bool unwritten = false; 851 + 864 852 unsigned ptr_bit = 1; 865 853 bkey_for_each_ptr_decode(k.k, ptrs, p, entry) { 866 854 if (!p.ptr.cached) { 867 - rcu_read_lock(); 855 + guard(rcu)(); 868 856 if (ptr_bit & m->data_opts.rewrite_ptrs) { 869 857 if (crc_is_compressed(p.crc)) 870 858 reserve_sectors += k.k->size; ··· 882 856 bch2_dev_list_add_dev(&m->op.devs_have, p.ptr.dev); 883 857 durability_have += bch2_extent_ptr_durability(c, &p); 884 858 } 885 - rcu_read_unlock(); 886 859 } 887 860 888 861 /* ··· 896 871 897 872 if (p.crc.compression_type == BCH_COMPRESSION_TYPE_incompressible) 898 873 m->op.incompressible = true; 874 + 875 + buf_bytes = max_t(unsigned, buf_bytes, p.crc.uncompressed_size << 9); 876 + unwritten |= p.ptr.unwritten; 899 877 900 878 ptr_bit <<= 1; 901 879 } ··· 938 910 if (iter) 939 911 ret = bch2_extent_drop_ptrs(trans, iter, k, io_opts, &m->data_opts); 940 912 if (!ret) 941 - ret = -BCH_ERR_data_update_done_no_writes_needed; 913 + ret = bch_err_throw(c, data_update_done_no_writes_needed); 942 914 goto out_bkey_buf_exit; 943 915 } 944 916 ··· 969 941 } 970 942 971 943 if (!bkey_get_dev_refs(c, k)) { 972 - ret = -BCH_ERR_data_update_done_no_dev_refs; 944 + ret = 
bch_err_throw(c, data_update_done_no_dev_refs); 973 945 goto out_put_disk_res; 974 946 } 975 947 976 948 if (c->opts.nocow_enabled && 977 - !bkey_nocow_lock(c, ctxt, k)) { 978 - ret = -BCH_ERR_nocow_lock_blocked; 949 + !bkey_nocow_lock(c, ctxt, ptrs)) { 950 + ret = bch_err_throw(c, nocow_lock_blocked); 979 951 goto out_put_dev_refs; 980 952 } 981 953 982 - if (bkey_extent_is_unwritten(k)) { 954 + if (unwritten) { 983 955 ret = bch2_update_unwritten_extent(trans, m) ?: 984 - -BCH_ERR_data_update_done_unwritten; 956 + bch_err_throw(c, data_update_done_unwritten); 985 957 goto out_nocow_unlock; 986 958 } 987 959 988 - ret = bch2_data_update_bios_init(m, c, io_opts); 960 + bch2_trans_unlock(trans); 961 + 962 + ret = __bch2_data_update_bios_init(m, c, io_opts, buf_bytes); 989 963 if (ret) 990 964 goto out_nocow_unlock; 991 965
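bkey_nocow_lock() above is restructured so the common case is a plain trylock loop, with the failure handling (unlock what was already taken, or block via the moving context) moved out of line into __bkey_nocow_lock(). A minimal sketch of the trylock-all-or-roll-back half of that pattern, on an array of toy pthread mutexes:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_LOCKS 3

static pthread_mutex_t locks[NR_LOCKS] = {
	PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER,
};

/* slow path, out of line: undo the locks taken before the one that failed */
static bool lock_all_slowpath(int failed)
{
	for (int i = 0; i < failed; i++)
		pthread_mutex_unlock(&locks[i]);
	return false;
}

static bool lock_all(void)
{
	for (int i = 0; i < NR_LOCKS; i++)
		if (pthread_mutex_trylock(&locks[i]))
			return lock_all_slowpath(i);
	return true;
}

int main(void)
{
	if (lock_all()) {
		printf("took all %d locks\n", NR_LOCKS);
		for (int i = 0; i < NR_LOCKS; i++)
			pthread_mutex_unlock(&locks[i]);
	}
	return 0;
}

Keeping the rollback out of line keeps the fast path small, which matches the noinline_for_stack split in the hunk above.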
+16 -14
fs/bcachefs/debug.c
··· 492 492 prt_printf(out, "journal pin %px:\t%llu\n", 493 493 &b->writes[1].journal, b->writes[1].journal.seq); 494 494 495 + prt_printf(out, "ob:\t%u\n", b->ob.nr); 496 + 495 497 printbuf_indent_sub(out, 2); 496 498 } 497 499 ··· 510 508 i->ret = 0; 511 509 512 510 do { 513 - struct bucket_table *tbl; 514 - struct rhash_head *pos; 515 - struct btree *b; 516 - 517 511 ret = bch2_debugfs_flush_buf(i); 518 512 if (ret) 519 513 return ret; 520 514 521 - rcu_read_lock(); 522 515 i->buf.atomic++; 523 - tbl = rht_dereference_rcu(c->btree_cache.table.tbl, 524 - &c->btree_cache.table); 525 - if (i->iter < tbl->size) { 526 - rht_for_each_entry_rcu(b, pos, tbl, i->iter, hash) 527 - bch2_cached_btree_node_to_text(&i->buf, c, b); 528 - i->iter++; 529 - } else { 530 - done = true; 516 + scoped_guard(rcu) { 517 + struct bucket_table *tbl = 518 + rht_dereference_rcu(c->btree_cache.table.tbl, 519 + &c->btree_cache.table); 520 + if (i->iter < tbl->size) { 521 + struct rhash_head *pos; 522 + struct btree *b; 523 + 524 + rht_for_each_entry_rcu(b, pos, tbl, i->iter, hash) 525 + bch2_cached_btree_node_to_text(&i->buf, c, b); 526 + i->iter++; 527 + } else { 528 + done = true; 529 + } 531 530 } 532 531 --i->buf.atomic; 533 - rcu_read_unlock(); 534 532 } while (!done); 535 533 536 534 if (i->buf.allocation_failure)
+84 -95
fs/bcachefs/dirent.c
··· 231 231 prt_printf(out, " type %s", bch2_d_type_str(d.v->d_type)); 232 232 } 233 233 234 - static struct bkey_i_dirent *dirent_alloc_key(struct btree_trans *trans, 234 + int bch2_dirent_init_name(struct bkey_i_dirent *dirent, 235 + const struct bch_hash_info *hash_info, 236 + const struct qstr *name, 237 + const struct qstr *cf_name) 238 + { 239 + EBUG_ON(hash_info->cf_encoding == NULL && cf_name); 240 + int cf_len = 0; 241 + 242 + if (name->len > BCH_NAME_MAX) 243 + return -ENAMETOOLONG; 244 + 245 + dirent->v.d_casefold = hash_info->cf_encoding != NULL; 246 + 247 + if (!dirent->v.d_casefold) { 248 + memcpy(&dirent->v.d_name[0], name->name, name->len); 249 + memset(&dirent->v.d_name[name->len], 0, 250 + bkey_val_bytes(&dirent->k) - 251 + offsetof(struct bch_dirent, d_name) - 252 + name->len); 253 + } else { 254 + #ifdef CONFIG_UNICODE 255 + memcpy(&dirent->v.d_cf_name_block.d_names[0], name->name, name->len); 256 + 257 + char *cf_out = &dirent->v.d_cf_name_block.d_names[name->len]; 258 + 259 + if (cf_name) { 260 + cf_len = cf_name->len; 261 + 262 + memcpy(cf_out, cf_name->name, cf_name->len); 263 + } else { 264 + cf_len = utf8_casefold(hash_info->cf_encoding, name, 265 + cf_out, 266 + bkey_val_end(bkey_i_to_s(&dirent->k_i)) - (void *) cf_out); 267 + if (cf_len <= 0) 268 + return cf_len; 269 + } 270 + 271 + memset(&dirent->v.d_cf_name_block.d_names[name->len + cf_len], 0, 272 + bkey_val_bytes(&dirent->k) - 273 + offsetof(struct bch_dirent, d_cf_name_block.d_names) - 274 + name->len + cf_len); 275 + 276 + dirent->v.d_cf_name_block.d_name_len = cpu_to_le16(name->len); 277 + dirent->v.d_cf_name_block.d_cf_name_len = cpu_to_le16(cf_len); 278 + 279 + EBUG_ON(bch2_dirent_get_casefold_name(dirent_i_to_s_c(dirent)).len != cf_len); 280 + #else 281 + return -EOPNOTSUPP; 282 + #endif 283 + } 284 + 285 + unsigned u64s = dirent_val_u64s(name->len, cf_len); 286 + BUG_ON(u64s > bkey_val_u64s(&dirent->k)); 287 + set_bkey_val_u64s(&dirent->k, u64s); 288 + return 0; 289 + } 290 + 291 + struct bkey_i_dirent *bch2_dirent_create_key(struct btree_trans *trans, 292 + const struct bch_hash_info *hash_info, 235 293 subvol_inum dir, 236 294 u8 type, 237 - int name_len, int cf_name_len, 295 + const struct qstr *name, 296 + const struct qstr *cf_name, 238 297 u64 dst) 239 298 { 240 - struct bkey_i_dirent *dirent; 241 - unsigned u64s = BKEY_U64s + dirent_val_u64s(name_len, cf_name_len); 242 - 243 - BUG_ON(u64s > U8_MAX); 244 - 245 - dirent = bch2_trans_kmalloc(trans, u64s * sizeof(u64)); 299 + struct bkey_i_dirent *dirent = bch2_trans_kmalloc(trans, BKEY_U64s_MAX * sizeof(u64)); 246 300 if (IS_ERR(dirent)) 247 301 return dirent; 248 302 249 303 bkey_dirent_init(&dirent->k_i); 250 - dirent->k.u64s = u64s; 304 + dirent->k.u64s = BKEY_U64s_MAX; 251 305 252 306 if (type != DT_SUBVOL) { 253 307 dirent->v.d_inum = cpu_to_le64(dst); ··· 312 258 313 259 dirent->v.d_type = type; 314 260 dirent->v.d_unused = 0; 315 - dirent->v.d_casefold = cf_name_len ? 
1 : 0; 316 261 317 - return dirent; 318 - } 319 - 320 - static void dirent_init_regular_name(struct bkey_i_dirent *dirent, 321 - const struct qstr *name) 322 - { 323 - EBUG_ON(dirent->v.d_casefold); 324 - 325 - memcpy(&dirent->v.d_name[0], name->name, name->len); 326 - memset(&dirent->v.d_name[name->len], 0, 327 - bkey_val_bytes(&dirent->k) - 328 - offsetof(struct bch_dirent, d_name) - 329 - name->len); 330 - } 331 - 332 - static void dirent_init_casefolded_name(struct bkey_i_dirent *dirent, 333 - const struct qstr *name, 334 - const struct qstr *cf_name) 335 - { 336 - EBUG_ON(!dirent->v.d_casefold); 337 - EBUG_ON(!cf_name->len); 338 - 339 - dirent->v.d_cf_name_block.d_name_len = cpu_to_le16(name->len); 340 - dirent->v.d_cf_name_block.d_cf_name_len = cpu_to_le16(cf_name->len); 341 - memcpy(&dirent->v.d_cf_name_block.d_names[0], name->name, name->len); 342 - memcpy(&dirent->v.d_cf_name_block.d_names[name->len], cf_name->name, cf_name->len); 343 - memset(&dirent->v.d_cf_name_block.d_names[name->len + cf_name->len], 0, 344 - bkey_val_bytes(&dirent->k) - 345 - offsetof(struct bch_dirent, d_cf_name_block.d_names) - 346 - name->len + cf_name->len); 347 - 348 - EBUG_ON(bch2_dirent_get_casefold_name(dirent_i_to_s_c(dirent)).len != cf_name->len); 349 - } 350 - 351 - static struct bkey_i_dirent *dirent_create_key(struct btree_trans *trans, 352 - const struct bch_hash_info *hash_info, 353 - subvol_inum dir, 354 - u8 type, 355 - const struct qstr *name, 356 - const struct qstr *cf_name, 357 - u64 dst) 358 - { 359 - struct bkey_i_dirent *dirent; 360 - struct qstr _cf_name; 361 - 362 - if (name->len > BCH_NAME_MAX) 363 - return ERR_PTR(-ENAMETOOLONG); 364 - 365 - if (hash_info->cf_encoding && !cf_name) { 366 - int ret = bch2_casefold(trans, hash_info, name, &_cf_name); 367 - if (ret) 368 - return ERR_PTR(ret); 369 - 370 - cf_name = &_cf_name; 371 - } 372 - 373 - dirent = dirent_alloc_key(trans, dir, type, name->len, cf_name ? 
cf_name->len : 0, dst); 374 - if (IS_ERR(dirent)) 375 - return dirent; 376 - 377 - if (cf_name) 378 - dirent_init_casefolded_name(dirent, name, cf_name); 379 - else 380 - dirent_init_regular_name(dirent, name); 262 + int ret = bch2_dirent_init_name(dirent, hash_info, name, cf_name); 263 + if (ret) 264 + return ERR_PTR(ret); 381 265 382 266 EBUG_ON(bch2_dirent_get_name(dirent_i_to_s_c(dirent)).len != name->len); 383 - 384 267 return dirent; 385 268 } 386 269 ··· 332 341 struct bkey_i_dirent *dirent; 333 342 int ret; 334 343 335 - dirent = dirent_create_key(trans, hash_info, dir_inum, type, name, NULL, dst_inum); 344 + dirent = bch2_dirent_create_key(trans, hash_info, dir_inum, type, name, NULL, dst_inum); 336 345 ret = PTR_ERR_OR_ZERO(dirent); 337 346 if (ret) 338 347 return ret; ··· 356 365 struct bkey_i_dirent *dirent; 357 366 int ret; 358 367 359 - dirent = dirent_create_key(trans, hash_info, dir, type, name, NULL, dst_inum); 368 + dirent = bch2_dirent_create_key(trans, hash_info, dir, type, name, NULL, dst_inum); 360 369 ret = PTR_ERR_OR_ZERO(dirent); 361 370 if (ret) 362 371 return ret; ··· 393 402 } 394 403 395 404 int bch2_dirent_rename(struct btree_trans *trans, 396 - subvol_inum src_dir, struct bch_hash_info *src_hash, u64 *src_dir_i_size, 397 - subvol_inum dst_dir, struct bch_hash_info *dst_hash, u64 *dst_dir_i_size, 405 + subvol_inum src_dir, struct bch_hash_info *src_hash, 406 + subvol_inum dst_dir, struct bch_hash_info *dst_hash, 398 407 const struct qstr *src_name, subvol_inum *src_inum, u64 *src_offset, 399 408 const struct qstr *dst_name, subvol_inum *dst_inum, u64 *dst_offset, 400 409 enum bch_rename_mode mode) ··· 461 470 *src_offset = dst_iter.pos.offset; 462 471 463 472 /* Create new dst key: */ 464 - new_dst = dirent_create_key(trans, dst_hash, dst_dir, 0, dst_name, 465 - dst_hash->cf_encoding ? &dst_name_lookup : NULL, 0); 473 + new_dst = bch2_dirent_create_key(trans, dst_hash, dst_dir, 0, dst_name, 474 + dst_hash->cf_encoding ? &dst_name_lookup : NULL, 0); 466 475 ret = PTR_ERR_OR_ZERO(new_dst); 467 476 if (ret) 468 477 goto out; ··· 472 481 473 482 /* Create new src key: */ 474 483 if (mode == BCH_RENAME_EXCHANGE) { 475 - new_src = dirent_create_key(trans, src_hash, src_dir, 0, src_name, 476 - src_hash->cf_encoding ? &src_name_lookup : NULL, 0); 484 + new_src = bch2_dirent_create_key(trans, src_hash, src_dir, 0, src_name, 485 + src_hash->cf_encoding ? 
&src_name_lookup : NULL, 0); 477 486 ret = PTR_ERR_OR_ZERO(new_src); 478 487 if (ret) 479 488 goto out; ··· 532 541 if ((mode == BCH_RENAME_EXCHANGE) && 533 542 new_src->v.d_type == DT_SUBVOL) 534 543 new_src->v.d_parent_subvol = cpu_to_le32(src_dir.subvol); 535 - 536 - if (old_dst.k) 537 - *dst_dir_i_size -= bkey_bytes(old_dst.k); 538 - *src_dir_i_size -= bkey_bytes(old_src.k); 539 - 540 - if (mode == BCH_RENAME_EXCHANGE) 541 - *src_dir_i_size += bkey_bytes(&new_src->k); 542 - *dst_dir_i_size += bkey_bytes(&new_dst->k); 543 544 544 545 ret = bch2_trans_update(trans, &dst_iter, &new_dst->k_i, 0); 545 546 if (ret) ··· 639 656 struct bkey_s_c_dirent d = bkey_s_c_to_dirent(k); 640 657 if (d.v->d_type == DT_SUBVOL && le32_to_cpu(d.v->d_parent_subvol) != subvol) 641 658 continue; 642 - ret = -BCH_ERR_ENOTEMPTY_dir_not_empty; 659 + ret = bch_err_throw(trans->c, ENOTEMPTY_dir_not_empty); 643 660 break; 644 661 } 645 662 bch2_trans_iter_exit(trans, &iter); ··· 675 692 return !ret; 676 693 } 677 694 678 - int bch2_readdir(struct bch_fs *c, subvol_inum inum, struct dir_context *ctx) 695 + int bch2_readdir(struct bch_fs *c, subvol_inum inum, 696 + struct bch_hash_info *hash_info, 697 + struct dir_context *ctx) 679 698 { 680 699 struct bkey_buf sk; 681 700 bch2_bkey_buf_init(&sk); ··· 695 710 struct bkey_s_c_dirent dirent = bkey_i_to_s_c_dirent(sk.k); 696 711 697 712 subvol_inum target; 698 - int ret2 = bch2_dirent_read_target(trans, inum, dirent, &target); 713 + 714 + bool need_second_pass = false; 715 + int ret2 = bch2_str_hash_check_key(trans, NULL, &bch2_dirent_hash_desc, 716 + hash_info, &iter, k, &need_second_pass) ?: 717 + bch2_dirent_read_target(trans, inum, dirent, &target); 699 718 if (ret2 > 0) 700 719 continue; 701 720 ··· 729 740 ret = bch2_inode_unpack(k, inode); 730 741 goto found; 731 742 } 732 - ret = -BCH_ERR_ENOENT_inode; 743 + ret = bch_err_throw(trans->c, ENOENT_inode); 733 744 found: 734 745 bch_err_msg(trans->c, ret, "fetching inode %llu", inode_nr); 735 746 bch2_trans_iter_exit(trans, &iter);
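bch2_dirent_create_key() above now allocates the key at BKEY_U64s_MAX, and bch2_dirent_init_name() trims it with set_bkey_val_u64s() once the name, and any casefolded form, has been copied in. A rough stand-alone sketch of that build-at-maximum-then-record-the-real-size idea, with a toy fixed-capacity key (toy_key and toy_key_init_name are invented names):

#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 255

/* toy "key": a fixed maximum-size buffer plus the size actually used */
struct toy_key {
	unsigned short	used;
	char		buf[2 * (NAME_MAX_LEN + 1)];
};

/* copy in the name and an optional folded form, then record the real size */
static int toy_key_init_name(struct toy_key *k, const char *name, const char *folded)
{
	size_t len = strlen(name);
	size_t flen = folded ? strlen(folded) : 0;

	if (len > NAME_MAX_LEN || flen > NAME_MAX_LEN)
		return -1;	/* roughly ENAMETOOLONG */

	memcpy(k->buf, name, len + 1);
	if (folded)
		memcpy(k->buf + len + 1, folded, flen + 1);

	/* shrink to what was actually used, analogous to set_bkey_val_u64s() */
	k->used = len + 1 + (folded ? flen + 1 : 0);
	return 0;
}

int main(void)
{
	struct toy_key k = { 0 };

	if (!toy_key_init_name(&k, "README", "readme"))
		printf("used %u of %zu bytes\n", k.used, sizeof(k.buf));
	return 0;
}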
+12 -4
fs/bcachefs/dirent.h
··· 38 38 } 39 39 } 40 40 41 - struct qstr bch2_dirent_get_name(struct bkey_s_c_dirent d); 41 + struct qstr bch2_dirent_get_name(struct bkey_s_c_dirent); 42 42 43 43 static inline unsigned dirent_val_u64s(unsigned len, unsigned cf_len) 44 44 { ··· 58 58 dst->v.d_inum = src.v->d_inum; 59 59 dst->v.d_type = src.v->d_type; 60 60 } 61 + 62 + int bch2_dirent_init_name(struct bkey_i_dirent *, 63 + const struct bch_hash_info *, 64 + const struct qstr *, 65 + const struct qstr *); 66 + struct bkey_i_dirent *bch2_dirent_create_key(struct btree_trans *, 67 + const struct bch_hash_info *, subvol_inum, u8, 68 + const struct qstr *, const struct qstr *, u64); 61 69 62 70 int bch2_dirent_create_snapshot(struct btree_trans *, u32, u64, u32, 63 71 const struct bch_hash_info *, u8, ··· 88 80 }; 89 81 90 82 int bch2_dirent_rename(struct btree_trans *, 91 - subvol_inum, struct bch_hash_info *, u64 *, 92 - subvol_inum, struct bch_hash_info *, u64 *, 83 + subvol_inum, struct bch_hash_info *, 84 + subvol_inum, struct bch_hash_info *, 93 85 const struct qstr *, subvol_inum *, u64 *, 94 86 const struct qstr *, subvol_inum *, u64 *, 95 87 enum bch_rename_mode); ··· 103 95 104 96 int bch2_empty_dir_snapshot(struct btree_trans *, u64, u32, u32); 105 97 int bch2_empty_dir_trans(struct btree_trans *, subvol_inum); 106 - int bch2_readdir(struct bch_fs *, subvol_inum, struct dir_context *); 98 + int bch2_readdir(struct bch_fs *, subvol_inum, struct bch_hash_info *, struct dir_context *); 107 99 108 100 int bch2_fsck_remove_dirent(struct btree_trans *, struct bpos); 109 101
+18 -20
fs/bcachefs/disk_accounting.c
··· 390 390 err: 391 391 free_percpu(n.v[1]); 392 392 free_percpu(n.v[0]); 393 - return -BCH_ERR_ENOMEM_disk_accounting; 393 + return bch_err_throw(c, ENOMEM_disk_accounting); 394 394 } 395 395 396 396 int bch2_accounting_mem_insert(struct bch_fs *c, struct bkey_s_c_accounting a, ··· 401 401 if (mode != BCH_ACCOUNTING_read && 402 402 accounting_to_replicas(&r.e, a.k->p) && 403 403 !bch2_replicas_marked_locked(c, &r.e)) 404 - return -BCH_ERR_btree_insert_need_mark_replicas; 404 + return bch_err_throw(c, btree_insert_need_mark_replicas); 405 405 406 406 percpu_up_read(&c->mark_lock); 407 407 percpu_down_write(&c->mark_lock); ··· 419 419 if (mode != BCH_ACCOUNTING_read && 420 420 accounting_to_replicas(&r.e, a.k->p) && 421 421 !bch2_replicas_marked_locked(c, &r.e)) 422 - return -BCH_ERR_btree_insert_need_mark_replicas; 422 + return bch_err_throw(c, btree_insert_need_mark_replicas); 423 423 424 424 return __bch2_accounting_mem_insert(c, a); 425 425 } ··· 559 559 sizeof(u64), GFP_KERNEL); 560 560 if (!e->v[1]) { 561 561 bch2_accounting_free_counters(acc, true); 562 - ret = -BCH_ERR_ENOMEM_disk_accounting; 562 + ret = bch_err_throw(c, ENOMEM_disk_accounting); 563 563 break; 564 564 } 565 565 } ··· 737 737 bch2_disk_accounting_mod(trans, acc, v, nr, false)) ?: 738 738 -BCH_ERR_remove_disk_accounting_entry; 739 739 } else { 740 - ret = -BCH_ERR_remove_disk_accounting_entry; 740 + ret = bch_err_throw(c, remove_disk_accounting_entry); 741 741 } 742 742 goto fsck_err; 743 743 } ··· 897 897 case BCH_DISK_ACCOUNTING_replicas: 898 898 fs_usage_data_type_to_base(usage, k.replicas.data_type, v[0]); 899 899 break; 900 - case BCH_DISK_ACCOUNTING_dev_data_type: 901 - rcu_read_lock(); 900 + case BCH_DISK_ACCOUNTING_dev_data_type: { 901 + guard(rcu)(); 902 902 struct bch_dev *ca = bch2_dev_rcu_noerror(c, k.dev_data_type.dev); 903 903 if (ca) { 904 904 struct bch_dev_usage_type __percpu *d = &ca->usage->d[k.dev_data_type.data_type]; ··· 910 910 k.dev_data_type.data_type == BCH_DATA_journal) 911 911 usage->hidden += v[0] * ca->mi.bucket_size; 912 912 } 913 - rcu_read_unlock(); 914 913 break; 914 + } 915 915 } 916 916 } 917 917 preempt_enable(); ··· 1006 1006 case BCH_DISK_ACCOUNTING_replicas: 1007 1007 fs_usage_data_type_to_base(&base, acc_k.replicas.data_type, a.v->d[0]); 1008 1008 break; 1009 - case BCH_DISK_ACCOUNTING_dev_data_type: { 1010 - rcu_read_lock(); 1011 - struct bch_dev *ca = bch2_dev_rcu_noerror(c, acc_k.dev_data_type.dev); 1012 - if (!ca) { 1013 - rcu_read_unlock(); 1014 - continue; 1015 - } 1009 + case BCH_DISK_ACCOUNTING_dev_data_type: 1010 + { 1011 + guard(rcu)(); /* scoped guard is a loop, and doesn't play nicely with continue */ 1012 + struct bch_dev *ca = bch2_dev_rcu_noerror(c, acc_k.dev_data_type.dev); 1013 + if (!ca) 1014 + continue; 1016 1015 1017 - v[0] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].buckets); 1018 - v[1] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].sectors); 1019 - v[2] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].fragmented); 1020 - rcu_read_unlock(); 1016 + v[0] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].buckets); 1017 + v[1] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].sectors); 1018 + v[2] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].fragmented); 1019 + } 1021 1020 1022 1021 if (memcmp(a.v->d, v, 3 * sizeof(u64))) { 1023 1022 struct printbuf buf = PRINTBUF; ··· 1030 1031 printbuf_exit(&buf); 1031 1032 mismatch = true; 1032 1033 } 1033 - } 1034 1034 } 1035 1035 1036 
1036 0;
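The disk_accounting.c hunk above shows the first instances of a conversion that repeats through the rest of this diff: bare -BCH_ERR_* returns become bch_err_throw(c, ...), which takes the filesystem pointer, presumably so the error can be counted or traced at the site where it is first generated. The macro itself is not part of these hunks; a minimal sketch of what such a wrapper could look like, with a hypothetical hook name, is:

    /*
     * Sketch only: the real macro lives in errcode.h and its helper may
     * differ.  The value returned is still just the negative errcode; the
     * extra argument gives the fs instance a chance to account for the
     * error at the throw site.
     */
    #define bch_err_throw(_c, _err) ({                                      \
            struct bch_fs *_c2 = (_c);                                      \
            int _ret = -BCH_ERR_##_err;                                     \
            bch2_account_error_thrown(_c2, _ret);   /* hypothetical hook */ \
            _ret;                                                           \
    })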
+3 -3
fs/bcachefs/disk_accounting.h
··· 174 174 case BCH_DISK_ACCOUNTING_replicas: 175 175 fs_usage_data_type_to_base(&trans->fs_usage_delta, acc_k.replicas.data_type, a.v->d[0]); 176 176 break; 177 - case BCH_DISK_ACCOUNTING_dev_data_type: 178 - rcu_read_lock(); 177 + case BCH_DISK_ACCOUNTING_dev_data_type: { 178 + guard(rcu)(); 179 179 struct bch_dev *ca = bch2_dev_rcu_noerror(c, acc_k.dev_data_type.dev); 180 180 if (ca) { 181 181 this_cpu_add(ca->usage->d[acc_k.dev_data_type.data_type].buckets, a.v->d[0]); 182 182 this_cpu_add(ca->usage->d[acc_k.dev_data_type.data_type].sectors, a.v->d[1]); 183 183 this_cpu_add(ca->usage->d[acc_k.dev_data_type.data_type].fragmented, a.v->d[2]); 184 184 } 185 - rcu_read_unlock(); 186 185 break; 186 + } 187 187 } 188 188 } 189 189
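Both disk_accounting hunks also replace paired rcu_read_lock()/rcu_read_unlock() calls with guard(rcu)(), the scope-based guard built on include/linux/cleanup.h (the RCU guard itself is defined via DEFINE_LOCK_GUARD_0() in rcupdate.h). The unlock now happens automatically when the enclosing block is left, on every exit path, which is why the converted switch cases grow braces. A simplified illustration of the mechanism, not the kernel's actual macro expansion:

    /* Simplified illustration of what a scope-based RCU guard reduces to:
     * a dummy local whose cleanup handler drops the read-side lock when
     * the variable goes out of scope.
     */
    struct my_rcu_guard { int unused; };

    static inline void my_rcu_guard_release(struct my_rcu_guard *g)
    {
            rcu_read_unlock();
    }

    #define my_guard_rcu()                                                  \
            struct my_rcu_guard __my_guard                                  \
                    __attribute__((cleanup(my_rcu_guard_release))) =        \
                    ({ rcu_read_lock(); (struct my_rcu_guard){ 0 }; })

    /* usage: my_guard_rcu(); ... early returns are fine, the unlock
     * always runs when the scope is left */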
+12 -25
fs/bcachefs/disk_groups.c
··· 130 130 131 131 cpu_g = kzalloc(struct_size(cpu_g, entries, nr_groups), GFP_KERNEL); 132 132 if (!cpu_g) 133 - return -BCH_ERR_ENOMEM_disk_groups_to_cpu; 133 + return bch_err_throw(c, ENOMEM_disk_groups_to_cpu); 134 134 135 135 cpu_g->nr = nr_groups; 136 136 ··· 170 170 const struct bch_devs_mask *bch2_target_to_mask(struct bch_fs *c, unsigned target) 171 171 { 172 172 struct target t = target_decode(target); 173 - struct bch_devs_mask *devs; 174 173 175 - rcu_read_lock(); 174 + guard(rcu)(); 176 175 177 176 switch (t.type) { 178 177 case TARGET_NULL: 179 - devs = NULL; 180 - break; 178 + return NULL; 181 179 case TARGET_DEV: { 182 180 struct bch_dev *ca = t.dev < c->sb.nr_devices 183 181 ? rcu_dereference(c->devs[t.dev]) 184 182 : NULL; 185 - devs = ca ? &ca->self : NULL; 186 - break; 183 + return ca ? &ca->self : NULL; 187 184 } 188 185 case TARGET_GROUP: { 189 186 struct bch_disk_groups_cpu *g = rcu_dereference(c->disk_groups); 190 187 191 - devs = g && t.group < g->nr && !g->entries[t.group].deleted 188 + return g && t.group < g->nr && !g->entries[t.group].deleted 192 189 ? &g->entries[t.group].devs 193 190 : NULL; 194 - break; 195 191 } 196 192 default: 197 193 BUG(); 198 194 } 199 - 200 - rcu_read_unlock(); 201 - 202 - return devs; 203 195 } 204 196 205 197 bool bch2_dev_in_target(struct bch_fs *c, unsigned dev, unsigned target) ··· 376 384 bch2_printbuf_make_room(out, 4096); 377 385 378 386 out->atomic++; 379 - rcu_read_lock(); 387 + guard(rcu)(); 380 388 struct bch_disk_groups_cpu *g = rcu_dereference(c->disk_groups); 381 389 382 390 for (unsigned i = 0; i < (g ? g->nr : 0); i++) { ··· 397 405 prt_newline(out); 398 406 } 399 407 400 - rcu_read_unlock(); 401 408 out->atomic--; 402 409 } 403 410 404 411 void bch2_disk_path_to_text(struct printbuf *out, struct bch_fs *c, unsigned v) 405 412 { 406 413 out->atomic++; 407 - rcu_read_lock(); 414 + guard(rcu)(); 408 415 __bch2_disk_path_to_text(out, rcu_dereference(c->disk_groups), v), 409 - rcu_read_unlock(); 410 416 --out->atomic; 411 417 } 412 418 ··· 525 535 switch (t.type) { 526 536 case TARGET_NULL: 527 537 prt_printf(out, "none"); 528 - break; 538 + return; 529 539 case TARGET_DEV: { 530 - struct bch_dev *ca; 531 - 532 540 out->atomic++; 533 - rcu_read_lock(); 534 - ca = t.dev < c->sb.nr_devices 541 + guard(rcu)(); 542 + struct bch_dev *ca = t.dev < c->sb.nr_devices 535 543 ? rcu_dereference(c->devs[t.dev]) 536 544 : NULL; 537 545 ··· 540 552 else 541 553 prt_printf(out, "invalid device %u", t.dev); 542 554 543 - rcu_read_unlock(); 544 555 out->atomic--; 545 - break; 556 + return; 546 557 } 547 558 case TARGET_GROUP: 548 559 bch2_disk_path_to_text(out, c, t.group); 549 - break; 560 + return; 550 561 default: 551 562 BUG(); 552 563 }
+55 -53
fs/bcachefs/ec.c
··· 213 213 a->dirty_sectors, 214 214 a->stripe, s.k->p.offset, 215 215 (bch2_bkey_val_to_text(&buf, c, s.s_c), buf.buf))) { 216 - ret = -BCH_ERR_mark_stripe; 216 + ret = bch_err_throw(c, mark_stripe); 217 217 goto err; 218 218 } 219 219 ··· 224 224 a->dirty_sectors, 225 225 a->cached_sectors, 226 226 (bch2_bkey_val_to_text(&buf, c, s.s_c), buf.buf))) { 227 - ret = -BCH_ERR_mark_stripe; 227 + ret = bch_err_throw(c, mark_stripe); 228 228 goto err; 229 229 } 230 230 } else { ··· 234 234 bucket.inode, bucket.offset, a->gen, 235 235 a->stripe, 236 236 (bch2_bkey_val_to_text(&buf, c, s.s_c), buf.buf))) { 237 - ret = -BCH_ERR_mark_stripe; 237 + ret = bch_err_throw(c, mark_stripe); 238 238 goto err; 239 239 } 240 240 ··· 244 244 bch2_data_type_str(a->data_type), 245 245 bch2_data_type_str(data_type), 246 246 (bch2_bkey_val_to_text(&buf, c, s.s_c), buf.buf))) { 247 - ret = -BCH_ERR_mark_stripe; 247 + ret = bch_err_throw(c, mark_stripe); 248 248 goto err; 249 249 } 250 250 ··· 256 256 a->dirty_sectors, 257 257 a->cached_sectors, 258 258 (bch2_bkey_val_to_text(&buf, c, s.s_c), buf.buf))) { 259 - ret = -BCH_ERR_mark_stripe; 259 + ret = bch_err_throw(c, mark_stripe); 260 260 goto err; 261 261 } 262 262 } ··· 295 295 struct bch_dev *ca = bch2_dev_tryget(c, ptr->dev); 296 296 if (unlikely(!ca)) { 297 297 if (ptr->dev != BCH_SB_MEMBER_INVALID && !(flags & BTREE_TRIGGER_overwrite)) 298 - ret = -BCH_ERR_mark_stripe; 298 + ret = bch_err_throw(c, mark_stripe); 299 299 goto err; 300 300 } 301 301 ··· 325 325 if (bch2_fs_inconsistent_on(!g, c, "reference to invalid bucket on device %u\n%s", 326 326 ptr->dev, 327 327 (bch2_bkey_val_to_text(&buf, c, s.s_c), buf.buf))) { 328 - ret = -BCH_ERR_mark_stripe; 328 + ret = bch_err_throw(c, mark_stripe); 329 329 goto err; 330 330 } 331 331 ··· 428 428 gc = genradix_ptr_alloc(&c->gc_stripes, idx, GFP_KERNEL); 429 429 if (!gc) { 430 430 bch_err(c, "error allocating memory for gc_stripes, idx %llu", idx); 431 - return -BCH_ERR_ENOMEM_mark_stripe; 431 + return bch_err_throw(c, ENOMEM_mark_stripe); 432 432 } 433 433 434 434 /* ··· 536 536 } 537 537 538 538 /* XXX: this is a non-mempoolified memory allocation: */ 539 - static int ec_stripe_buf_init(struct ec_stripe_buf *buf, 539 + static int ec_stripe_buf_init(struct bch_fs *c, 540 + struct ec_stripe_buf *buf, 540 541 unsigned offset, unsigned size) 541 542 { 542 543 struct bch_stripe *v = &bkey_i_to_stripe(&buf->key)->v; ··· 565 564 return 0; 566 565 err: 567 566 ec_stripe_buf_exit(buf); 568 - return -BCH_ERR_ENOMEM_stripe_buf; 567 + return bch_err_throw(c, ENOMEM_stripe_buf); 569 568 } 570 569 571 570 /* Checksumming: */ ··· 841 840 842 841 buf = kzalloc(sizeof(*buf), GFP_NOFS); 843 842 if (!buf) 844 - return -BCH_ERR_ENOMEM_ec_read_extent; 843 + return bch_err_throw(c, ENOMEM_ec_read_extent); 845 844 846 845 ret = lockrestart_do(trans, get_stripe_key_trans(trans, rbio->pick.ec.idx, buf)); 847 846 if (ret) { ··· 862 861 goto err; 863 862 } 864 863 865 - ret = ec_stripe_buf_init(buf, offset, bio_sectors(&rbio->bio)); 864 + ret = ec_stripe_buf_init(c, buf, offset, bio_sectors(&rbio->bio)); 866 865 if (ret) { 867 866 msg = "-ENOMEM"; 868 867 goto err; ··· 895 894 bch_err_ratelimited(c, 896 895 "error doing reconstruct read: %s\n %s", msg, msgbuf.buf); 897 896 printbuf_exit(&msgbuf); 898 - ret = -BCH_ERR_stripe_reconstruct; 897 + ret = bch_err_throw(c, stripe_reconstruct); 899 898 goto out; 900 899 } 901 900 ··· 905 904 { 906 905 if (c->gc_pos.phase != GC_PHASE_not_running && 907 906 !genradix_ptr_alloc(&c->gc_stripes, idx, 
gfp)) 908 - return -BCH_ERR_ENOMEM_ec_stripe_mem_alloc; 907 + return bch_err_throw(c, ENOMEM_ec_stripe_mem_alloc); 909 908 910 909 return 0; 911 910 } ··· 1130 1129 1131 1130 bch2_fs_inconsistent(c, "%s", buf.buf); 1132 1131 printbuf_exit(&buf); 1133 - return -BCH_ERR_erasure_coding_found_btree_node; 1132 + return bch_err_throw(c, erasure_coding_found_btree_node); 1134 1133 } 1135 1134 1136 1135 k = bch2_backpointer_get_key(trans, bp, &iter, BTREE_ITER_intent, last_flushed); ··· 1196 1195 1197 1196 struct bch_dev *ca = bch2_dev_tryget(c, ptr.dev); 1198 1197 if (!ca) 1199 - return -BCH_ERR_ENOENT_dev_not_found; 1198 + return bch_err_throw(c, ENOENT_dev_not_found); 1200 1199 1201 1200 struct bpos bucket_pos = PTR_BUCKET_POS(ca, &ptr); 1202 1201 ··· 1257 1256 struct bch_dev *ca = bch2_dev_get_ioref(c, ob->dev, WRITE, 1258 1257 BCH_DEV_WRITE_REF_ec_bucket_zero); 1259 1258 if (!ca) { 1260 - s->err = -BCH_ERR_erofs_no_writes; 1259 + s->err = bch_err_throw(c, erofs_no_writes); 1261 1260 return; 1262 1261 } 1263 1262 ··· 1321 1320 1322 1321 if (ec_do_recov(c, &s->existing_stripe)) { 1323 1322 bch_err(c, "error creating stripe: error reading existing stripe"); 1324 - ret = -BCH_ERR_ec_block_read; 1323 + ret = bch_err_throw(c, ec_block_read); 1325 1324 goto err; 1326 1325 } 1327 1326 ··· 1347 1346 1348 1347 if (ec_nr_failed(&s->new_stripe)) { 1349 1348 bch_err(c, "error creating stripe: error writing redundancy buckets"); 1350 - ret = -BCH_ERR_ec_block_write; 1349 + ret = bch_err_throw(c, ec_block_write); 1351 1350 goto err; 1352 1351 } 1353 1352 ··· 1579 1578 static void ec_stripe_head_devs_update(struct bch_fs *c, struct ec_stripe_head *h) 1580 1579 { 1581 1580 struct bch_devs_mask devs = h->devs; 1581 + unsigned nr_devs, nr_devs_with_durability; 1582 1582 1583 - rcu_read_lock(); 1584 - h->devs = target_rw_devs(c, BCH_DATA_user, h->disk_label 1585 - ? group_to_target(h->disk_label - 1) 1586 - : 0); 1587 - unsigned nr_devs = dev_mask_nr(&h->devs); 1583 + scoped_guard(rcu) { 1584 + h->devs = target_rw_devs(c, BCH_DATA_user, h->disk_label 1585 + ? 
group_to_target(h->disk_label - 1) 1586 + : 0); 1587 + nr_devs = dev_mask_nr(&h->devs); 1588 1588 1589 - for_each_member_device_rcu(c, ca, &h->devs) 1590 - if (!ca->mi.durability) 1591 - __clear_bit(ca->dev_idx, h->devs.d); 1592 - unsigned nr_devs_with_durability = dev_mask_nr(&h->devs); 1589 + for_each_member_device_rcu(c, ca, &h->devs) 1590 + if (!ca->mi.durability) 1591 + __clear_bit(ca->dev_idx, h->devs.d); 1592 + nr_devs_with_durability = dev_mask_nr(&h->devs); 1593 1593 1594 - h->blocksize = pick_blocksize(c, &h->devs); 1594 + h->blocksize = pick_blocksize(c, &h->devs); 1595 1595 1596 - h->nr_active_devs = 0; 1597 - for_each_member_device_rcu(c, ca, &h->devs) 1598 - if (ca->mi.bucket_size == h->blocksize) 1599 - h->nr_active_devs++; 1600 - 1601 - rcu_read_unlock(); 1596 + h->nr_active_devs = 0; 1597 + for_each_member_device_rcu(c, ca, &h->devs) 1598 + if (ca->mi.bucket_size == h->blocksize) 1599 + h->nr_active_devs++; 1600 + } 1602 1601 1603 1602 /* 1604 1603 * If we only have redundancy + 1 devices, we're better off with just ··· 1866 1865 s->nr_data = existing_v->nr_blocks - 1867 1866 existing_v->nr_redundant; 1868 1867 1869 - int ret = ec_stripe_buf_init(&s->existing_stripe, 0, le16_to_cpu(existing_v->sectors)); 1868 + int ret = ec_stripe_buf_init(c, &s->existing_stripe, 0, le16_to_cpu(existing_v->sectors)); 1870 1869 if (ret) { 1871 1870 bch2_stripe_close(c, s); 1872 1871 return ret; ··· 1926 1925 } 1927 1926 bch2_trans_iter_exit(trans, &lru_iter); 1928 1927 if (!ret) 1929 - ret = -BCH_ERR_stripe_alloc_blocked; 1928 + ret = bch_err_throw(c, stripe_alloc_blocked); 1930 1929 if (ret == 1) 1931 1930 ret = 0; 1932 1931 if (ret) ··· 1967 1966 continue; 1968 1967 } 1969 1968 1970 - ret = -BCH_ERR_ENOSPC_stripe_create; 1969 + ret = bch_err_throw(c, ENOSPC_stripe_create); 1971 1970 break; 1972 1971 } 1973 1972 ··· 2025 2024 if (!h->s) { 2026 2025 h->s = ec_new_stripe_alloc(c, h); 2027 2026 if (!h->s) { 2028 - ret = -BCH_ERR_ENOMEM_ec_new_stripe_alloc; 2027 + ret = bch_err_throw(c, ENOMEM_ec_new_stripe_alloc); 2029 2028 bch_err(c, "failed to allocate new stripe"); 2030 2029 goto err; 2031 2030 } ··· 2090 2089 goto err; 2091 2090 2092 2091 allocate_buf: 2093 - ret = ec_stripe_buf_init(&s->new_stripe, 0, h->blocksize); 2092 + ret = ec_stripe_buf_init(c, &s->new_stripe, 0, h->blocksize); 2094 2093 if (ret) 2095 2094 goto err; 2096 2095 ··· 2116 2115 if (k.k->type != KEY_TYPE_stripe) 2117 2116 return 0; 2118 2117 2118 + struct bch_fs *c = trans->c; 2119 2119 struct bkey_i_stripe *s = 2120 2120 bch2_bkey_make_mut_typed(trans, iter, &k, 0, stripe); 2121 2121 int ret = PTR_ERR_OR_ZERO(s); ··· 2143 2141 2144 2142 unsigned nr_good = 0; 2145 2143 2146 - rcu_read_lock(); 2147 - bkey_for_each_ptr(ptrs, ptr) { 2148 - if (ptr->dev == dev_idx) 2149 - ptr->dev = BCH_SB_MEMBER_INVALID; 2144 + scoped_guard(rcu) 2145 + bkey_for_each_ptr(ptrs, ptr) { 2146 + if (ptr->dev == dev_idx) 2147 + ptr->dev = BCH_SB_MEMBER_INVALID; 2150 2148 2151 - struct bch_dev *ca = bch2_dev_rcu(trans->c, ptr->dev); 2152 - nr_good += ca && ca->mi.state != BCH_MEMBER_STATE_failed; 2153 - } 2154 - rcu_read_unlock(); 2149 + struct bch_dev *ca = bch2_dev_rcu(c, ptr->dev); 2150 + nr_good += ca && ca->mi.state != BCH_MEMBER_STATE_failed; 2151 + } 2155 2152 2156 2153 if (nr_good < s->v.nr_blocks && !(flags & BCH_FORCE_IF_DATA_DEGRADED)) 2157 - return -BCH_ERR_remove_would_lose_data; 2154 + return bch_err_throw(c, remove_would_lose_data); 2158 2155 2159 2156 unsigned nr_data = s->v.nr_blocks - s->v.nr_redundant; 2160 2157 2161 2158 if 
(nr_good < nr_data && !(flags & BCH_FORCE_IF_DATA_LOST)) 2162 - return -BCH_ERR_remove_would_lose_data; 2159 + return bch_err_throw(c, remove_would_lose_data); 2163 2160 2164 2161 sectors = -sectors; 2165 2162 ··· 2179 2178 return 0; 2180 2179 2181 2180 if (a->stripe_sectors) { 2182 - bch_err(trans->c, "trying to invalidate device in stripe when bucket has stripe data"); 2183 - return -BCH_ERR_invalidate_stripe_to_dev; 2181 + struct bch_fs *c = trans->c; 2182 + bch_err(c, "trying to invalidate device in stripe when bucket has stripe data"); 2183 + return bch_err_throw(c, invalidate_stripe_to_dev); 2184 2184 } 2185 2185 2186 2186 struct btree_iter iter; 2187 2187 struct bkey_s_c_stripe s = 2188 2188 bch2_bkey_get_iter_typed(trans, &iter, BTREE_ID_stripes, POS(0, a->stripe), 2189 - BTREE_ITER_slots, stripe); 2189 + BTREE_ITER_slots, stripe); 2190 2190 int ret = bkey_err(s); 2191 2191 if (ret) 2192 2192 return ret;
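ec.c uses the block form, scoped_guard(rcu) { ... }, where the critical section is an explicit block instead of the remainder of the enclosing scope, and ec_stripe_buf_init() grows a struct bch_fs * argument purely so its allocation failure can be reported through bch_err_throw(). One subtlety, called out by the comment in the disk_accounting.c hunk earlier ("scoped guard is a loop"): scoped_guard() expands to a one-iteration for-loop, so a continue inside it only leaves the guarded block, it does not advance an outer loop. Where that matters, the conversion uses a plain braced block with guard(rcu)() instead. A sketch with hypothetical loop and helper names:

    /* Hypothetical names; only the control flow is the point here. */
    for_each_item(i) {
            scoped_guard(rcu) {
                    if (should_skip(i))
                            continue;       /* leaves the guarded block only */
                    use_under_rcu(i);
            }
            tally(i);                       /* still runs after that continue */
    }

    for_each_item(i) {
            {
                    guard(rcu)();           /* plain block instead */
                    if (should_skip(i))
                            continue;       /* targets the outer loop; the guard
                                             * is still released on the way out */
                    use_under_rcu(i);
            }
            tally(i);                       /* skipped when should_skip(i) */
    }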
+3 -1
fs/bcachefs/errcode.c
··· 13 13 NULL 14 14 }; 15 15 16 - static unsigned bch2_errcode_parents[] = { 16 + static const unsigned bch2_errcode_parents[] = { 17 17 #define x(class, err) [BCH_ERR_##err - BCH_ERR_START] = class, 18 18 BCH_ERRCODES() 19 19 #undef x 20 20 }; 21 21 22 + __attribute__((const)) 22 23 const char *bch2_err_str(int err) 23 24 { 24 25 const char *errstr; ··· 37 36 return errstr ?: "(Invalid error)"; 38 37 } 39 38 39 + __attribute__((const)) 40 40 bool __bch2_err_matches(int err, int class) 41 41 { 42 42 err = abs(err);
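errcode.c makes the parent table const and marks bch2_err_str() and __bch2_err_matches() with __attribute__((const)), a promise that their result depends only on their arguments, so the compiler may fold repeated calls; that holds in practice because the tables they consult are static and never written after initialization. The payoff shows up in patterns like the fsck_err_wrap() change further down, where the same value is classified more than once:

    /* With the const attribute the compiler is allowed to evaluate a
     * repeated call once and reuse the result:
     */
    if (!bch2_err_matches(ret, BCH_ERR_fsck_fix) &&
        !bch2_err_matches(ret, BCH_ERR_fsck_ignore))
            goto fsck_err;

    return bch2_err_matches(ret, BCH_ERR_fsck_fix);     /* may be CSE'd */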
+11 -4
fs/bcachefs/errcode.h
··· 182 182 x(BCH_ERR_fsck, fsck_errors_not_fixed) \ 183 183 x(BCH_ERR_fsck, fsck_repair_unimplemented) \ 184 184 x(BCH_ERR_fsck, fsck_repair_impossible) \ 185 - x(EINVAL, restart_recovery) \ 186 - x(EINVAL, cannot_rewind_recovery) \ 185 + x(EINVAL, recovery_will_run) \ 186 + x(BCH_ERR_recovery_will_run, restart_recovery) \ 187 + x(BCH_ERR_recovery_will_run, cannot_rewind_recovery) \ 188 + x(BCH_ERR_recovery_will_run, recovery_pass_will_run) \ 187 189 x(0, data_update_done) \ 190 + x(0, bkey_was_deleted) \ 188 191 x(BCH_ERR_data_update_done, data_update_done_would_block) \ 189 192 x(BCH_ERR_data_update_done, data_update_done_unwritten) \ 190 193 x(BCH_ERR_data_update_done, data_update_done_no_writes_needed) \ ··· 214 211 x(EINVAL, remove_would_lose_data) \ 215 212 x(EINVAL, no_resize_with_buckets_nouse) \ 216 213 x(EINVAL, inode_unpack_error) \ 214 + x(EINVAL, inode_not_unlinked) \ 215 + x(EINVAL, inode_has_child_snapshot) \ 217 216 x(EINVAL, varint_decode_error) \ 218 217 x(EINVAL, erasure_coding_found_btree_node) \ 219 218 x(EINVAL, option_negative) \ ··· 362 357 BCH_ERR_MAX 363 358 }; 364 359 365 - const char *bch2_err_str(int); 366 - bool __bch2_err_matches(int, int); 360 + __attribute__((const)) const char *bch2_err_str(int); 367 361 362 + __attribute__((const)) bool __bch2_err_matches(int, int); 363 + 364 + __attribute__((const)) 368 365 static inline bool _bch2_err_matches(int err, int class) 369 366 { 370 367 return err < 0 && __bch2_err_matches(err, class);
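The errcode.h hunk extends the error hierarchy: recovery_will_run becomes a parent class for restart_recovery, cannot_rewind_recovery and recovery_pass_will_run, and new leaf codes (bkey_was_deleted, inode_not_unlinked, inode_has_child_snapshot) are introduced for use later in this diff. The whole table is an x() macro list, so one list generates the enum, the string names and the parent array shown in errcode.c above. Matching by class then amounts to walking that array; roughly (this is a sketch, the exact function body is not in these hunks):

    /* Sketch of class matching on top of bch2_errcode_parents[]; private
     * codes start at BCH_ERR_START, anything below that is a plain errno
     * and can only match itself.
     */
    bool __bch2_err_matches(int err, int class)
    {
            err     = abs(err);
            class   = abs(class);

            while (err >= BCH_ERR_START && err != class)
                    err = bch2_errcode_parents[err - BCH_ERR_START];

            return err == class;
    }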
+50 -43
fs/bcachefs/error.c
··· 100 100 set_bit(BCH_FS_topology_error, &c->flags); 101 101 if (!test_bit(BCH_FS_in_recovery, &c->flags)) { 102 102 __bch2_inconsistent_error(c, out); 103 - return -BCH_ERR_btree_need_topology_repair; 103 + return bch_err_throw(c, btree_need_topology_repair); 104 104 } else { 105 105 return bch2_run_explicit_recovery_pass(c, out, BCH_RECOVERY_PASS_check_topology, 0) ?: 106 - -BCH_ERR_btree_node_read_validate_error; 106 + bch_err_throw(c, btree_node_read_validate_error); 107 107 } 108 108 } 109 109 ··· 403 403 404 404 if (test_bit(BCH_FS_in_fsck, &c->flags)) { 405 405 if (!(flags & (FSCK_CAN_FIX|FSCK_CAN_IGNORE))) 406 - return -BCH_ERR_fsck_repair_unimplemented; 406 + return bch_err_throw(c, fsck_repair_unimplemented); 407 407 408 408 switch (c->opts.fix_errors) { 409 409 case FSCK_FIX_exit: 410 - return -BCH_ERR_fsck_errors_not_fixed; 410 + return bch_err_throw(c, fsck_errors_not_fixed); 411 411 case FSCK_FIX_yes: 412 412 if (flags & FSCK_CAN_FIX) 413 - return -BCH_ERR_fsck_fix; 413 + return bch_err_throw(c, fsck_fix); 414 414 fallthrough; 415 415 case FSCK_FIX_no: 416 416 if (flags & FSCK_CAN_IGNORE) 417 - return -BCH_ERR_fsck_ignore; 418 - return -BCH_ERR_fsck_errors_not_fixed; 417 + return bch_err_throw(c, fsck_ignore); 418 + return bch_err_throw(c, fsck_errors_not_fixed); 419 419 case FSCK_FIX_ask: 420 420 if (flags & FSCK_AUTOFIX) 421 - return -BCH_ERR_fsck_fix; 422 - return -BCH_ERR_fsck_ask; 421 + return bch_err_throw(c, fsck_fix); 422 + return bch_err_throw(c, fsck_ask); 423 423 default: 424 424 BUG(); 425 425 } ··· 427 427 if ((flags & FSCK_AUTOFIX) && 428 428 (c->opts.errors == BCH_ON_ERROR_continue || 429 429 c->opts.errors == BCH_ON_ERROR_fix_safe)) 430 - return -BCH_ERR_fsck_fix; 430 + return bch_err_throw(c, fsck_fix); 431 431 432 432 if (c->opts.errors == BCH_ON_ERROR_continue && 433 433 (flags & FSCK_CAN_IGNORE)) 434 - return -BCH_ERR_fsck_ignore; 435 - return -BCH_ERR_fsck_errors_not_fixed; 434 + return bch_err_throw(c, fsck_ignore); 435 + return bch_err_throw(c, fsck_errors_not_fixed); 436 436 } 437 437 } 438 438 ··· 444 444 { 445 445 va_list args; 446 446 struct printbuf buf = PRINTBUF, *out = &buf; 447 - int ret = -BCH_ERR_fsck_ignore; 447 + int ret = 0; 448 448 const char *action_orig = "fix?", *action = action_orig; 449 449 450 450 might_sleep(); ··· 474 474 475 475 if (test_bit(err, c->sb.errors_silent)) 476 476 return flags & FSCK_CAN_FIX 477 - ? -BCH_ERR_fsck_fix 478 - : -BCH_ERR_fsck_ignore; 477 + ? 
bch_err_throw(c, fsck_fix) 478 + : bch_err_throw(c, fsck_ignore); 479 479 480 480 printbuf_indent_add_nextline(out, 2); 481 481 ··· 517 517 prt_str(out, ", "); 518 518 if (flags & FSCK_CAN_FIX) { 519 519 prt_actioning(out, action); 520 - ret = -BCH_ERR_fsck_fix; 520 + ret = bch_err_throw(c, fsck_fix); 521 521 } else { 522 522 prt_str(out, ", continuing"); 523 - ret = -BCH_ERR_fsck_ignore; 523 + ret = bch_err_throw(c, fsck_ignore); 524 524 } 525 525 526 526 goto print; ··· 532 532 "run fsck, and forward to devs so error can be marked for self-healing"); 533 533 inconsistent = true; 534 534 print = true; 535 - ret = -BCH_ERR_fsck_errors_not_fixed; 535 + ret = bch_err_throw(c, fsck_errors_not_fixed); 536 536 } else if (flags & FSCK_CAN_FIX) { 537 537 prt_str(out, ", "); 538 538 prt_actioning(out, action); 539 - ret = -BCH_ERR_fsck_fix; 539 + ret = bch_err_throw(c, fsck_fix); 540 540 } else { 541 541 prt_str(out, ", continuing"); 542 - ret = -BCH_ERR_fsck_ignore; 542 + ret = bch_err_throw(c, fsck_ignore); 543 543 } 544 544 } else if (c->opts.fix_errors == FSCK_FIX_exit) { 545 545 prt_str(out, ", exiting"); 546 - ret = -BCH_ERR_fsck_errors_not_fixed; 546 + ret = bch_err_throw(c, fsck_errors_not_fixed); 547 547 } else if (flags & FSCK_CAN_FIX) { 548 548 int fix = s && s->fix 549 549 ? s->fix ··· 562 562 : FSCK_FIX_yes; 563 563 564 564 ret = ret & 1 565 - ? -BCH_ERR_fsck_fix 566 - : -BCH_ERR_fsck_ignore; 565 + ? bch_err_throw(c, fsck_fix) 566 + : bch_err_throw(c, fsck_ignore); 567 567 } else if (fix == FSCK_FIX_yes || 568 568 (c->opts.nochanges && 569 569 !(flags & FSCK_CAN_IGNORE))) { 570 570 prt_str(out, ", "); 571 571 prt_actioning(out, action); 572 - ret = -BCH_ERR_fsck_fix; 572 + ret = bch_err_throw(c, fsck_fix); 573 573 } else { 574 574 prt_str(out, ", not "); 575 575 prt_actioning(out, action); 576 + ret = bch_err_throw(c, fsck_ignore); 576 577 } 577 - } else if (!(flags & FSCK_CAN_IGNORE)) { 578 - prt_str(out, " (repair unimplemented)"); 578 + } else { 579 + if (flags & FSCK_CAN_IGNORE) { 580 + prt_str(out, ", continuing"); 581 + ret = bch_err_throw(c, fsck_ignore); 582 + } else { 583 + prt_str(out, " (repair unimplemented)"); 584 + ret = bch_err_throw(c, fsck_repair_unimplemented); 585 + } 579 586 } 580 587 581 - if (ret == -BCH_ERR_fsck_ignore && 588 + if (bch2_err_matches(ret, BCH_ERR_fsck_ignore) && 582 589 (c->opts.fix_errors == FSCK_FIX_exit || 583 590 !(flags & FSCK_CAN_IGNORE))) 584 - ret = -BCH_ERR_fsck_errors_not_fixed; 591 + ret = bch_err_throw(c, fsck_errors_not_fixed); 585 592 586 593 if (test_bit(BCH_FS_in_fsck, &c->flags) && 587 - (ret != -BCH_ERR_fsck_fix && 588 - ret != -BCH_ERR_fsck_ignore)) { 594 + (!bch2_err_matches(ret, BCH_ERR_fsck_fix) && 595 + !bch2_err_matches(ret, BCH_ERR_fsck_ignore))) { 589 596 exiting = true; 590 597 print = true; 591 598 } ··· 620 613 621 614 if (s) 622 615 s->ret = ret; 623 - 616 + err_unlock: 617 + mutex_unlock(&c->fsck_error_msgs_lock); 618 + err: 624 619 /* 625 620 * We don't yet track whether the filesystem currently has errors, for 626 621 * log_fsck_err()s: that would require us to track for every error type 627 622 * which recovery pass corrects it, to get the fsck exit status correct: 628 623 */ 629 - if (flags & FSCK_CAN_FIX) { 630 - if (ret == -BCH_ERR_fsck_fix) { 631 - set_bit(BCH_FS_errors_fixed, &c->flags); 632 - } else { 633 - set_bit(BCH_FS_errors_not_fixed, &c->flags); 634 - set_bit(BCH_FS_error, &c->flags); 635 - } 624 + if (bch2_err_matches(ret, BCH_ERR_fsck_fix)) { 625 + set_bit(BCH_FS_errors_fixed, &c->flags); 626 + } 
else { 627 + set_bit(BCH_FS_errors_not_fixed, &c->flags); 628 + set_bit(BCH_FS_error, &c->flags); 636 629 } 637 - err_unlock: 638 - mutex_unlock(&c->fsck_error_msgs_lock); 639 - err: 630 + 640 631 if (action != action_orig) 641 632 kfree(action); 642 633 printbuf_exit(&buf); 634 + 635 + BUG_ON(!ret); 643 636 return ret; 644 637 } 645 638 ··· 657 650 const char *fmt, ...) 658 651 { 659 652 if (from.flags & BCH_VALIDATE_silent) 660 - return -BCH_ERR_fsck_delete_bkey; 653 + return bch_err_throw(c, fsck_delete_bkey); 661 654 662 655 unsigned fsck_flags = 0; 663 656 if (!(from.flags & (BCH_VALIDATE_write|BCH_VALIDATE_commit))) { 664 657 if (test_bit(err, c->sb.errors_silent)) 665 - return -BCH_ERR_fsck_delete_bkey; 658 + return bch_err_throw(c, fsck_delete_bkey); 666 659 667 660 fsck_flags |= FSCK_AUTOFIX|FSCK_CAN_FIX; 668 661 }
+6 -6
fs/bcachefs/error.h
··· 105 105 #define fsck_err_wrap(_do) \ 106 106 ({ \ 107 107 int _ret = _do; \ 108 - if (_ret != -BCH_ERR_fsck_fix && \ 109 - _ret != -BCH_ERR_fsck_ignore) { \ 108 + if (!bch2_err_matches(_ret, BCH_ERR_fsck_fix) && \ 109 + !bch2_err_matches(_ret, BCH_ERR_fsck_ignore)) { \ 110 110 ret = _ret; \ 111 111 goto fsck_err; \ 112 112 } \ 113 113 \ 114 - _ret == -BCH_ERR_fsck_fix; \ 114 + bch2_err_matches(_ret, BCH_ERR_fsck_fix); \ 115 115 }) 116 116 117 117 #define __fsck_err(...) fsck_err_wrap(bch2_fsck_err(__VA_ARGS__)) ··· 170 170 int _ret = __bch2_bkey_fsck_err(c, k, from, \ 171 171 BCH_FSCK_ERR_##_err_type, \ 172 172 _err_msg, ##__VA_ARGS__); \ 173 - if (_ret != -BCH_ERR_fsck_fix && \ 174 - _ret != -BCH_ERR_fsck_ignore) \ 173 + if (!bch2_err_matches(_ret, BCH_ERR_fsck_fix) && \ 174 + !bch2_err_matches(_ret, BCH_ERR_fsck_ignore)) \ 175 175 ret = _ret; \ 176 - ret = -BCH_ERR_fsck_delete_bkey; \ 176 + ret = bch_err_throw(c, fsck_delete_bkey); \ 177 177 goto fsck_err; \ 178 178 } while (0) 179 179
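error.h follows the error.c change above: fsck_err_wrap() and bkey_fsck_err() classify the return of bch2_fsck_err() with bch2_err_matches() rather than comparing against -BCH_ERR_fsck_fix and -BCH_ERR_fsck_ignore exactly, and bkey_fsck_err() now reports its deletion decision through bch_err_throw(). For repair sites themselves nothing changes; the usual shape (adapted from the fsck.c hunk further down, condition and message shortened) is still:

    /* fsck_err_on() evaluates to true only when bch2_fsck_err() decided
     * the error should be fixed; otherwise it either continues or jumps
     * to the function's fsck_err label with an error set.
     */
    if (fsck_err_on(S_ISDIR(u.bi_mode) && u.bi_size,
                    trans, inode_dir_has_nonzero_i_size,
                    "directory %llu:%u with nonzero i_size %lli",
                    u.bi_inum, u.bi_snapshot, u.bi_size)) {
            u.bi_size = 0;
            do_update = true;
    }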
+25 -38
fs/bcachefs/extents.c
··· 65 65 continue; 66 66 67 67 bch2_printbuf_make_room(out, 1024); 68 - rcu_read_lock(); 69 68 out->atomic++; 70 - struct bch_dev *ca = bch2_dev_rcu_noerror(c, f->dev); 71 - if (ca) 72 - prt_str(out, ca->name); 73 - else 74 - prt_printf(out, "(invalid device %u)", f->dev); 69 + scoped_guard(rcu) { 70 + struct bch_dev *ca = bch2_dev_rcu_noerror(c, f->dev); 71 + if (ca) 72 + prt_str(out, ca->name); 73 + else 74 + prt_printf(out, "(invalid device %u)", f->dev); 75 + } 75 76 --out->atomic; 76 - rcu_read_unlock(); 77 77 78 78 prt_char(out, ' '); 79 79 ··· 193 193 bool have_dirty_ptrs = false, have_pick = false; 194 194 195 195 if (k.k->type == KEY_TYPE_error) 196 - return -BCH_ERR_key_type_error; 196 + return bch_err_throw(c, key_type_error); 197 197 198 198 rcu_read_lock(); 199 199 struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(k); ··· 286 286 if (!have_dirty_ptrs) 287 287 return 0; 288 288 if (have_missing_devs) 289 - return -BCH_ERR_no_device_to_read_from; 289 + return bch_err_throw(c, no_device_to_read_from); 290 290 if (have_csum_errors) 291 - return -BCH_ERR_data_read_csum_err; 291 + return bch_err_throw(c, data_read_csum_err); 292 292 if (have_io_errors) 293 - return -BCH_ERR_data_read_io_err; 293 + return bch_err_throw(c, data_read_io_err); 294 294 295 295 /* 296 296 * If we get here, we have pointers (bkey_ptrs_validate() ensures that), 297 297 * but they don't point to valid devices: 298 298 */ 299 - return -BCH_ERR_no_devices_valid; 299 + return bch_err_throw(c, no_devices_valid); 300 300 } 301 301 302 302 /* KEY_TYPE_btree_ptr: */ ··· 407 407 lp.crc = bch2_extent_crc_unpack(l.k, NULL); 408 408 rp.crc = bch2_extent_crc_unpack(r.k, NULL); 409 409 410 + guard(rcu)(); 411 + 410 412 while (__bkey_ptr_next_decode(l.k, l_ptrs.end, lp, en_l) && 411 413 __bkey_ptr_next_decode(r.k, r_ptrs.end, rp, en_r)) { 412 414 if (lp.ptr.offset + lp.crc.offset + lp.crc.live_size != ··· 420 418 return false; 421 419 422 420 /* Extents may not straddle buckets: */ 423 - rcu_read_lock(); 424 421 struct bch_dev *ca = bch2_dev_rcu(c, lp.ptr.dev); 425 422 bool same_bucket = ca && PTR_BUCKET_NR(ca, &lp.ptr) == PTR_BUCKET_NR(ca, &rp.ptr); 426 - rcu_read_unlock(); 427 423 428 424 if (!same_bucket) 429 425 return false; ··· 838 838 struct extent_ptr_decoded p; 839 839 unsigned durability = 0; 840 840 841 - rcu_read_lock(); 841 + guard(rcu)(); 842 842 bkey_for_each_ptr_decode(k.k, ptrs, p, entry) 843 843 durability += bch2_extent_ptr_durability(c, &p); 844 - rcu_read_unlock(); 845 - 846 844 return durability; 847 845 } 848 846 ··· 851 853 struct extent_ptr_decoded p; 852 854 unsigned durability = 0; 853 855 854 - rcu_read_lock(); 856 + guard(rcu)(); 855 857 bkey_for_each_ptr_decode(k.k, ptrs, p, entry) 856 858 if (p.ptr.dev < c->sb.nr_devices && c->devs[p.ptr.dev]) 857 859 durability += bch2_extent_ptr_durability(c, &p); 858 - rcu_read_unlock(); 859 - 860 860 return durability; 861 861 } 862 862 ··· 1011 1015 { 1012 1016 struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(k); 1013 1017 struct bch_dev *ca; 1014 - bool ret = false; 1015 1018 1016 - rcu_read_lock(); 1019 + guard(rcu)(); 1017 1020 bkey_for_each_ptr(ptrs, ptr) 1018 1021 if (bch2_dev_in_target(c, ptr->dev, target) && 1019 1022 (ca = bch2_dev_rcu(c, ptr->dev)) && 1020 1023 (!ptr->cached || 1021 - !dev_ptr_stale_rcu(ca, ptr))) { 1022 - ret = true; 1023 - break; 1024 - } 1025 - rcu_read_unlock(); 1024 + !dev_ptr_stale_rcu(ca, ptr))) 1025 + return true; 1026 1026 1027 - return ret; 1027 + return false; 1028 1028 } 1029 1029 1030 1030 bool bch2_bkey_matches_ptr(struct 
bch_fs *c, struct bkey_s_c k, ··· 1134 1142 bool have_cached_ptr; 1135 1143 unsigned drop_dev = ptr->dev; 1136 1144 1137 - rcu_read_lock(); 1145 + guard(rcu)(); 1138 1146 restart_drop_ptrs: 1139 1147 ptrs = bch2_bkey_ptrs(k); 1140 1148 have_cached_ptr = false; ··· 1167 1175 goto drop; 1168 1176 1169 1177 ptr->cached = true; 1170 - rcu_read_unlock(); 1171 1178 return; 1172 1179 drop: 1173 - rcu_read_unlock(); 1174 1180 bch2_bkey_drop_ptr_noerror(k, ptr); 1175 1181 } 1176 1182 ··· 1184 1194 { 1185 1195 struct bch_dev *ca; 1186 1196 1187 - rcu_read_lock(); 1197 + guard(rcu)(); 1188 1198 bch2_bkey_drop_ptrs(k, ptr, 1189 1199 ptr->cached && 1190 1200 (!(ca = bch2_dev_rcu(c, ptr->dev)) || 1191 1201 dev_ptr_stale_rcu(ca, ptr) > 0)); 1192 - rcu_read_unlock(); 1193 1202 1194 1203 return bkey_deleted(k.k); 1195 1204 } ··· 1206 1217 struct bkey_ptrs ptrs; 1207 1218 bool have_cached_ptr; 1208 1219 1209 - rcu_read_lock(); 1220 + guard(rcu)(); 1210 1221 restart_drop_ptrs: 1211 1222 ptrs = bch2_bkey_ptrs(k); 1212 1223 have_cached_ptr = false; ··· 1219 1230 } 1220 1231 have_cached_ptr = true; 1221 1232 } 1222 - rcu_read_unlock(); 1223 1233 1224 1234 return bkey_deleted(k.k); 1225 1235 } ··· 1226 1238 void bch2_extent_ptr_to_text(struct printbuf *out, struct bch_fs *c, const struct bch_extent_ptr *ptr) 1227 1239 { 1228 1240 out->atomic++; 1229 - rcu_read_lock(); 1241 + guard(rcu)(); 1230 1242 struct bch_dev *ca = bch2_dev_rcu_noerror(c, ptr->dev); 1231 1243 if (!ca) { 1232 1244 prt_printf(out, "ptr: %u:%llu gen %u%s", ptr->dev, ··· 1250 1262 else if (stale) 1251 1263 prt_printf(out, " invalid"); 1252 1264 } 1253 - rcu_read_unlock(); 1254 1265 --out->atomic; 1255 1266 } 1256 1267 ··· 1515 1528 struct bch_compression_opt opt = __bch2_compression_decode(r->compression); 1516 1529 prt_printf(err, "invalid compression opt %u:%u", 1517 1530 opt.type, opt.level); 1518 - return -BCH_ERR_invalid_bkey; 1531 + return bch_err_throw(c, invalid_bkey); 1519 1532 } 1520 1533 #endif 1521 1534 break;
+11 -19
fs/bcachefs/fs-io-buffered.c
··· 394 394 struct bch_io_opts opts; 395 395 struct bch_folio_sector *tmp; 396 396 unsigned tmp_sectors; 397 + struct blk_plug plug; 397 398 }; 398 - 399 - static inline struct bch_writepage_state bch_writepage_state_init(struct bch_fs *c, 400 - struct bch_inode_info *inode) 401 - { 402 - struct bch_writepage_state ret = { 0 }; 403 - 404 - bch2_inode_opts_get(&ret.opts, c, &inode->ei_inode); 405 - return ret; 406 - } 407 399 408 400 /* 409 401 * Determine when a writepage io is full. We have to limit writepage bios to a ··· 658 666 int bch2_writepages(struct address_space *mapping, struct writeback_control *wbc) 659 667 { 660 668 struct bch_fs *c = mapping->host->i_sb->s_fs_info; 661 - struct bch_writepage_state w = 662 - bch_writepage_state_init(c, to_bch_ei(mapping->host)); 663 - struct blk_plug plug; 664 - int ret; 669 + struct bch_writepage_state *w = kzalloc(sizeof(*w), GFP_NOFS|__GFP_NOFAIL); 665 670 666 - blk_start_plug(&plug); 667 - ret = write_cache_pages(mapping, wbc, __bch2_writepage, &w); 668 - if (w.io) 669 - bch2_writepage_do_io(&w); 670 - blk_finish_plug(&plug); 671 - kfree(w.tmp); 671 + bch2_inode_opts_get(&w->opts, c, &to_bch_ei(mapping->host)->ei_inode); 672 + 673 + blk_start_plug(&w->plug); 674 + int ret = write_cache_pages(mapping, wbc, __bch2_writepage, w); 675 + if (w->io) 676 + bch2_writepage_do_io(w); 677 + blk_finish_plug(&w->plug); 678 + kfree(w->tmp); 679 + kfree(w); 672 680 return bch2_err_class(ret); 673 681 } 674 682
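fs-io-buffered.c is one of the stack-usage reductions: bch_writepage_state, which now also embeds the blk_plug, is no longer a local in bch2_writepages() but a short-lived kzalloc() allocation, and the small init helper is folded into the caller. The general shape of these conversions, with hypothetical names:

    /* Hypothetical example of the pattern: a large per-call context moves
     * from the stack to the heap for the duration of the call.  GFP_NOFS
     * avoids recursing into reclaim from the writeback path; __GFP_NOFAIL
     * keeps the calling convention unchanged, since the old stack variable
     * could not fail to exist either.
     */
    struct big_writeback_ctx *ctx = kzalloc(sizeof(*ctx), GFP_NOFS|__GFP_NOFAIL);

    setup_ctx(ctx, mapping);
    int ret = do_writeback(mapping, wbc, ctx);
    kfree(ctx);
    return ret;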
+1 -1
fs/bcachefs/fs-io-pagecache.c
··· 447 447 448 448 if (!reserved) { 449 449 bch2_disk_reservation_put(c, &disk_res); 450 - return -BCH_ERR_ENOSPC_disk_reservation; 450 + return bch_err_throw(c, ENOSPC_disk_reservation); 451 451 } 452 452 break; 453 453 }
+6 -6
fs/bcachefs/fs-io.c
··· 71 71 memset(&inode->ei_devs_need_flush, 0, sizeof(inode->ei_devs_need_flush)); 72 72 73 73 for_each_set_bit(dev, devs.d, BCH_SB_MEMBERS_MAX) { 74 - rcu_read_lock(); 75 - ca = rcu_dereference(c->devs[dev]); 76 - if (ca && !enumerated_ref_tryget(&ca->io_ref[WRITE], 77 - BCH_DEV_WRITE_REF_nocow_flush)) 78 - ca = NULL; 79 - rcu_read_unlock(); 74 + scoped_guard(rcu) { 75 + ca = rcu_dereference(c->devs[dev]); 76 + if (ca && !enumerated_ref_tryget(&ca->io_ref[WRITE], 77 + BCH_DEV_WRITE_REF_nocow_flush)) 78 + ca = NULL; 79 + } 80 80 81 81 if (!ca) 82 82 continue;
+2 -2
fs/bcachefs/fs-ioctl.c
··· 268 268 } 269 269 270 270 if (dst_dentry->d_inode) { 271 - error = -BCH_ERR_EEXIST_subvolume_create; 271 + error = bch_err_throw(c, EEXIST_subvolume_create); 272 272 goto err3; 273 273 } 274 274 275 275 dir = dst_path.dentry->d_inode; 276 276 if (IS_DEADDIR(dir)) { 277 - error = -BCH_ERR_ENOENT_directory_dead; 277 + error = bch_err_throw(c, ENOENT_directory_dead); 278 278 goto err3; 279 279 } 280 280
+25 -15
fs/bcachefs/fs.c
··· 124 124 goto err; 125 125 126 126 struct bch_extent_rebalance new_r = bch2_inode_rebalance_opts_get(c, &inode_u); 127 + bool rebalance_changed = memcmp(&old_r, &new_r, sizeof(new_r)); 127 128 128 - if (memcmp(&old_r, &new_r, sizeof(new_r))) { 129 + if (rebalance_changed) { 129 130 ret = bch2_set_rebalance_needs_scan_trans(trans, inode_u.bi_inum); 130 131 if (ret) 131 132 goto err; ··· 146 145 147 146 if (bch2_err_matches(ret, BCH_ERR_transaction_restart)) 148 147 goto retry; 148 + 149 + if (rebalance_changed) 150 + bch2_rebalance_wakeup(c); 149 151 150 152 bch2_fs_fatal_err_on(bch2_err_matches(ret, ENOENT), c, 151 153 "%s: inode %llu:%llu not found when updating", ··· 1573 1569 { 1574 1570 struct bch_inode_info *inode = file_bch_inode(file); 1575 1571 struct bch_fs *c = inode->v.i_sb->s_fs_info; 1572 + struct bch_hash_info hash = bch2_hash_info_init(c, &inode->ei_inode); 1576 1573 1577 1574 if (!dir_emit_dots(file, ctx)) 1578 1575 return 0; 1579 1576 1580 - int ret = bch2_readdir(c, inode_inum(inode), ctx); 1577 + int ret = bch2_readdir(c, inode_inum(inode), &hash, ctx); 1581 1578 1582 1579 bch_err_fn(c, ret); 1583 1580 return bch2_err_class(ret); ··· 2007 2002 goto err; 2008 2003 2009 2004 if (k.k->type != KEY_TYPE_dirent) { 2010 - ret = -BCH_ERR_ENOENT_dirent_doesnt_match_inode; 2005 + ret = bch_err_throw(c, ENOENT_dirent_doesnt_match_inode); 2011 2006 goto err; 2012 2007 } 2013 2008 2014 2009 d = bkey_s_c_to_dirent(k); 2015 2010 ret = bch2_dirent_read_target(trans, inode_inum(dir), d, &target); 2016 2011 if (ret > 0) 2017 - ret = -BCH_ERR_ENOENT_dirent_doesnt_match_inode; 2012 + ret = bch_err_throw(c, ENOENT_dirent_doesnt_match_inode); 2018 2013 if (ret) 2019 2014 goto err; 2020 2015 ··· 2180 2175 KEY_TYPE_QUOTA_WARN); 2181 2176 bch2_quota_acct(c, inode->ei_qid, Q_INO, -1, 2182 2177 KEY_TYPE_QUOTA_WARN); 2183 - bch2_inode_rm(c, inode_inum(inode)); 2178 + int ret = bch2_inode_rm(c, inode_inum(inode)); 2179 + if (ret && !bch2_err_matches(ret, EROFS)) { 2180 + bch_err_msg(c, ret, "VFS incorrectly tried to delete inode %llu:%llu", 2181 + inode->ei_inum.subvol, 2182 + inode->ei_inum.inum); 2183 + bch2_sb_error_count(c, BCH_FSCK_ERR_vfs_bad_inode_rm); 2184 + } 2184 2185 2185 2186 /* 2186 2187 * If we are deleting, we need it present in the vfs hash table ··· 2333 2322 struct bch_fs *c = root->d_sb->s_fs_info; 2334 2323 bool first = true; 2335 2324 2336 - rcu_read_lock(); 2325 + guard(rcu)(); 2337 2326 for_each_online_member_rcu(c, ca) { 2338 2327 if (!first) 2339 2328 seq_putc(seq, ':'); 2340 2329 first = false; 2341 2330 seq_puts(seq, ca->disk_sb.sb_name); 2342 2331 } 2343 - rcu_read_unlock(); 2344 2332 2345 2333 return 0; 2346 2334 } ··· 2536 2526 2537 2527 sb->s_bdi->ra_pages = VM_READAHEAD_PAGES; 2538 2528 2539 - rcu_read_lock(); 2540 - for_each_online_member_rcu(c, ca) { 2541 - struct block_device *bdev = ca->disk_sb.bdev; 2529 + scoped_guard(rcu) { 2530 + for_each_online_member_rcu(c, ca) { 2531 + struct block_device *bdev = ca->disk_sb.bdev; 2542 2532 2543 - /* XXX: create an anonymous device for multi device filesystems */ 2544 - sb->s_bdev = bdev; 2545 - sb->s_dev = bdev->bd_dev; 2546 - break; 2533 + /* XXX: create an anonymous device for multi device filesystems */ 2534 + sb->s_bdev = bdev; 2535 + sb->s_dev = bdev->bd_dev; 2536 + break; 2537 + } 2547 2538 } 2548 - rcu_read_unlock(); 2549 2539 2550 2540 c->dev = sb->s_dev; 2551 2541
+86 -63
fs/bcachefs/fsck.c
··· 23 23 #include <linux/bsearch.h> 24 24 #include <linux/dcache.h> /* struct qstr */ 25 25 26 - static int dirent_points_to_inode_nowarn(struct bkey_s_c_dirent d, 26 + static int dirent_points_to_inode_nowarn(struct bch_fs *c, 27 + struct bkey_s_c_dirent d, 27 28 struct bch_inode_unpacked *inode) 28 29 { 29 30 if (d.v->d_type == DT_SUBVOL 30 31 ? le32_to_cpu(d.v->d_child_subvol) == inode->bi_subvol 31 32 : le64_to_cpu(d.v->d_inum) == inode->bi_inum) 32 33 return 0; 33 - return -BCH_ERR_ENOENT_dirent_doesnt_match_inode; 34 + return bch_err_throw(c, ENOENT_dirent_doesnt_match_inode); 34 35 } 35 36 36 37 static void dirent_inode_mismatch_msg(struct printbuf *out, ··· 50 49 struct bkey_s_c_dirent dirent, 51 50 struct bch_inode_unpacked *inode) 52 51 { 53 - int ret = dirent_points_to_inode_nowarn(dirent, inode); 52 + int ret = dirent_points_to_inode_nowarn(c, dirent, inode); 54 53 if (ret) { 55 54 struct printbuf buf = PRINTBUF; 56 55 dirent_inode_mismatch_msg(&buf, c, dirent, inode); ··· 153 152 goto found; 154 153 } 155 154 } 156 - ret = -BCH_ERR_ENOENT_no_snapshot_tree_subvol; 155 + ret = bch_err_throw(trans->c, ENOENT_no_snapshot_tree_subvol); 157 156 found: 158 157 bch2_trans_iter_exit(trans, &iter); 159 158 return ret; ··· 230 229 231 230 if (d_type != DT_DIR) { 232 231 bch_err(c, "error looking up lost+found: not a directory"); 233 - return -BCH_ERR_ENOENT_not_directory; 232 + return bch_err_throw(c, ENOENT_not_directory); 234 233 } 235 234 236 235 /* ··· 532 531 533 532 if (!bch2_snapshot_is_leaf(c, snapshotid)) { 534 533 bch_err(c, "need to reconstruct subvol, but have interior node snapshot"); 535 - return -BCH_ERR_fsck_repair_unimplemented; 534 + return bch_err_throw(c, fsck_repair_unimplemented); 536 535 } 537 536 538 537 /* ··· 643 642 644 643 return __bch2_fsck_write_inode(trans, &new_inode); 645 644 } 646 - 647 - struct snapshots_seen { 648 - struct bpos pos; 649 - snapshot_id_list ids; 650 - }; 651 645 652 646 static inline void snapshots_seen_exit(struct snapshots_seen *s) 653 647 { ··· 886 890 { 887 891 struct bch_fs *c = trans->c; 888 892 889 - struct inode_walker_entry *i; 890 - __darray_for_each(w->inodes, i) 891 - if (bch2_snapshot_is_ancestor(c, k.k->p.snapshot, i->inode.bi_snapshot)) 892 - goto found; 893 + struct inode_walker_entry *i = darray_find_p(w->inodes, i, 894 + bch2_snapshot_is_ancestor(c, k.k->p.snapshot, i->inode.bi_snapshot)); 893 895 894 - return NULL; 895 - found: 896 - BUG_ON(k.k->p.snapshot > i->inode.bi_snapshot); 896 + if (!i) 897 + return NULL; 897 898 898 899 struct printbuf buf = PRINTBUF; 899 900 int ret = 0; ··· 940 947 if (ret) 941 948 goto fsck_err; 942 949 943 - ret = -BCH_ERR_transaction_restart_nested; 950 + ret = bch_err_throw(c, transaction_restart_nested); 944 951 goto fsck_err; 945 952 } 946 953 ··· 985 992 int ret = 0; 986 993 987 994 if (d->v.d_type == DT_SUBVOL) { 988 - BUG(); 995 + bch_err(trans->c, "%s does not support DT_SUBVOL", __func__); 996 + ret = -BCH_ERR_fsck_repair_unimplemented; 989 997 } else { 990 998 ret = get_visible_inodes(trans, &target, s, le64_to_cpu(d->v.d_inum)); 991 999 if (ret) ··· 1042 1048 if (ret && !bch2_err_matches(ret, ENOENT)) 1043 1049 return ret; 1044 1050 1045 - if ((ret || dirent_points_to_inode_nowarn(d, inode)) && 1051 + if ((ret || dirent_points_to_inode_nowarn(c, d, inode)) && 1046 1052 inode->bi_subvol && 1047 1053 (inode->bi_flags & BCH_INODE_has_child_snapshot)) { 1048 1054 /* Older version of a renamed subvolume root: we won't have a ··· 1063 1069 trans, inode_points_to_missing_dirent, 1064 
1070 "inode points to missing dirent\n%s", 1065 1071 (bch2_inode_unpacked_to_text(&buf, inode), buf.buf)) || 1066 - fsck_err_on(!ret && dirent_points_to_inode_nowarn(d, inode), 1072 + fsck_err_on(!ret && dirent_points_to_inode_nowarn(c, d, inode), 1067 1073 trans, inode_points_to_wrong_dirent, 1068 1074 "%s", 1069 1075 (printbuf_reset(&buf), ··· 1166 1172 u.bi_flags &= ~BCH_INODE_unlinked; 1167 1173 do_update = true; 1168 1174 ret = 0; 1175 + } 1176 + 1177 + if (fsck_err_on(S_ISDIR(u.bi_mode) && u.bi_size, 1178 + trans, inode_dir_has_nonzero_i_size, 1179 + "directory %llu:%u with nonzero i_size %lli", 1180 + u.bi_inum, u.bi_snapshot, u.bi_size)) { 1181 + u.bi_size = 0; 1182 + do_update = true; 1169 1183 } 1170 1184 1171 1185 ret = bch2_inode_has_child_snapshots(trans, k.k->p); ··· 1454 1452 goto err; 1455 1453 1456 1454 inode->last_pos.inode--; 1457 - ret = -BCH_ERR_transaction_restart_nested; 1455 + ret = bch_err_throw(c, transaction_restart_nested); 1458 1456 goto err; 1459 1457 } 1460 1458 ··· 1571 1569 sizeof(seen->ids.data[0]) * seen->ids.size, 1572 1570 GFP_KERNEL); 1573 1571 if (!n.seen.ids.data) 1574 - return -BCH_ERR_ENOMEM_fsck_extent_ends_at; 1572 + return bch_err_throw(c, ENOMEM_fsck_extent_ends_at); 1575 1573 1576 1574 __darray_for_each(extent_ends->e, i) { 1577 1575 if (i->snapshot == k.k->p.snapshot) { ··· 1621 1619 1622 1620 bch_err(c, "%s: error finding first overlapping extent when repairing, got%s", 1623 1621 __func__, buf.buf); 1624 - ret = -BCH_ERR_internal_fsck_err; 1622 + ret = bch_err_throw(c, internal_fsck_err); 1625 1623 goto err; 1626 1624 } 1627 1625 ··· 1646 1644 pos2.size != k2.k->size) { 1647 1645 bch_err(c, "%s: error finding seconding overlapping extent when repairing%s", 1648 1646 __func__, buf.buf); 1649 - ret = -BCH_ERR_internal_fsck_err; 1647 + ret = bch_err_throw(c, internal_fsck_err); 1650 1648 goto err; 1651 1649 } 1652 1650 ··· 1694 1692 * We overwrote the second extent - restart 1695 1693 * check_extent() from the top: 1696 1694 */ 1697 - ret = -BCH_ERR_transaction_restart_nested; 1695 + ret = bch_err_throw(c, transaction_restart_nested); 1698 1696 } 1699 1697 } 1700 1698 fsck_err: ··· 2047 2045 (bch2_bkey_val_to_text(&buf, c, d.s_c), buf.buf))) { 2048 2046 if (!new_parent_subvol) { 2049 2047 bch_err(c, "could not find a subvol for snapshot %u", d.k->p.snapshot); 2050 - return -BCH_ERR_fsck_repair_unimplemented; 2048 + return bch_err_throw(c, fsck_repair_unimplemented); 2051 2049 } 2052 2050 2053 2051 struct bkey_i_dirent *new_dirent = bch2_bkey_make_mut_typed(trans, iter, &d.s_c, 0, dirent); ··· 2109 2107 2110 2108 if (ret) { 2111 2109 bch_err(c, "subvol %u points to missing inode root %llu", target_subvol, target_inum); 2112 - ret = -BCH_ERR_fsck_repair_unimplemented; 2110 + ret = bch_err_throw(c, fsck_repair_unimplemented); 2113 2111 goto err; 2114 2112 } 2115 2113 ··· 2141 2139 struct bch_hash_info *hash_info, 2142 2140 struct inode_walker *dir, 2143 2141 struct inode_walker *target, 2144 - struct snapshots_seen *s) 2142 + struct snapshots_seen *s, 2143 + bool *need_second_pass) 2145 2144 { 2146 2145 struct bch_fs *c = trans->c; 2147 2146 struct inode_walker_entry *i; ··· 2184 2181 *hash_info = bch2_hash_info_init(c, &i->inode); 2185 2182 dir->first_this_inode = false; 2186 2183 2187 - ret = bch2_str_hash_check_key(trans, s, &bch2_dirent_hash_desc, hash_info, iter, k); 2184 + #ifdef CONFIG_UNICODE 2185 + hash_info->cf_encoding = bch2_inode_casefold(c, &i->inode) ? 
c->cf_encoding : NULL; 2186 + #endif 2187 + 2188 + ret = bch2_str_hash_check_key(trans, s, &bch2_dirent_hash_desc, hash_info, 2189 + iter, k, need_second_pass); 2188 2190 if (ret < 0) 2189 2191 goto err; 2190 2192 if (ret) { ··· 2210 2202 (printbuf_reset(&buf), 2211 2203 bch2_bkey_val_to_text(&buf, c, k), 2212 2204 buf.buf))) { 2213 - struct qstr name = bch2_dirent_get_name(d); 2214 - u32 subvol = d.v->d_type == DT_SUBVOL 2215 - ? le32_to_cpu(d.v->d_parent_subvol) 2216 - : 0; 2205 + subvol_inum dir_inum = { .subvol = d.v->d_type == DT_SUBVOL 2206 + ? le32_to_cpu(d.v->d_parent_subvol) 2207 + : 0, 2208 + }; 2217 2209 u64 target = d.v->d_type == DT_SUBVOL 2218 2210 ? le32_to_cpu(d.v->d_child_subvol) 2219 2211 : le64_to_cpu(d.v->d_inum); 2220 - u64 dir_offset; 2212 + struct qstr name = bch2_dirent_get_name(d); 2221 2213 2222 - ret = bch2_hash_delete_at(trans, 2214 + struct bkey_i_dirent *new_d = 2215 + bch2_dirent_create_key(trans, hash_info, dir_inum, 2216 + d.v->d_type, &name, NULL, target); 2217 + ret = PTR_ERR_OR_ZERO(new_d); 2218 + if (ret) 2219 + goto out; 2220 + 2221 + new_d->k.p.inode = d.k->p.inode; 2222 + new_d->k.p.snapshot = d.k->p.snapshot; 2223 + 2224 + struct btree_iter dup_iter = {}; 2225 + ret = bch2_hash_delete_at(trans, 2223 2226 bch2_dirent_hash_desc, hash_info, iter, 2224 2227 BTREE_UPDATE_internal_snapshot_node) ?: 2225 - bch2_dirent_create_snapshot(trans, subvol, 2226 - d.k->p.inode, d.k->p.snapshot, 2227 - hash_info, 2228 - d.v->d_type, 2229 - &name, 2230 - target, 2231 - &dir_offset, 2232 - BTREE_ITER_with_updates| 2233 - BTREE_UPDATE_internal_snapshot_node| 2234 - STR_HASH_must_create) ?: 2235 - bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc); 2236 - 2237 - /* might need another check_dirents pass */ 2228 + bch2_str_hash_repair_key(trans, s, 2229 + &bch2_dirent_hash_desc, hash_info, 2230 + iter, bkey_i_to_s_c(&new_d->k_i), 2231 + &dup_iter, bkey_s_c_null, 2232 + need_second_pass); 2238 2233 goto out; 2239 2234 } 2240 2235 ··· 2305 2294 err: 2306 2295 fsck_err: 2307 2296 printbuf_exit(&buf); 2308 - bch_err_fn(c, ret); 2309 2297 return ret; 2310 2298 } 2311 2299 ··· 2318 2308 struct inode_walker target = inode_walker_init(); 2319 2309 struct snapshots_seen s; 2320 2310 struct bch_hash_info hash_info; 2311 + bool need_second_pass = false, did_second_pass = false; 2312 + int ret; 2321 2313 2322 2314 snapshots_seen_init(&s); 2323 - 2324 - int ret = bch2_trans_run(c, 2325 - for_each_btree_key(trans, iter, BTREE_ID_dirents, 2315 + again: 2316 + ret = bch2_trans_run(c, 2317 + for_each_btree_key_commit(trans, iter, BTREE_ID_dirents, 2326 2318 POS(BCACHEFS_ROOT_INO, 0), 2327 2319 BTREE_ITER_prefetch|BTREE_ITER_all_snapshots, k, 2328 - check_dirent(trans, &iter, k, &hash_info, &dir, &target, &s)) ?: 2320 + NULL, NULL, BCH_TRANS_COMMIT_no_enospc, 2321 + check_dirent(trans, &iter, k, &hash_info, &dir, &target, &s, 2322 + &need_second_pass)) ?: 2329 2323 check_subdir_count_notnested(trans, &dir)); 2324 + 2325 + if (!ret && need_second_pass && !did_second_pass) { 2326 + bch_info(c, "check_dirents requires second pass"); 2327 + swap(did_second_pass, need_second_pass); 2328 + goto again; 2329 + } 2330 + 2331 + if (!ret && need_second_pass) { 2332 + bch_err(c, "dirents not repairing"); 2333 + ret = -EINVAL; 2334 + } 2330 2335 2331 2336 snapshots_seen_exit(&s); 2332 2337 inode_walker_exit(&dir); ··· 2356 2331 struct inode_walker *inode) 2357 2332 { 2358 2333 struct bch_fs *c = trans->c; 2359 - struct inode_walker_entry *i; 2360 - int ret; 2361 2334 2362 - ret = 
bch2_check_key_has_snapshot(trans, iter, k); 2335 + int ret = bch2_check_key_has_snapshot(trans, iter, k); 2363 2336 if (ret < 0) 2364 2337 return ret; 2365 2338 if (ret) 2366 2339 return 0; 2367 2340 2368 - i = walk_inode(trans, inode, k); 2341 + struct inode_walker_entry *i = walk_inode(trans, inode, k); 2369 2342 ret = PTR_ERR_OR_ZERO(i); 2370 2343 if (ret) 2371 2344 return ret; ··· 2379 2356 *hash_info = bch2_hash_info_init(c, &i->inode); 2380 2357 inode->first_this_inode = false; 2381 2358 2382 - ret = bch2_str_hash_check_key(trans, NULL, &bch2_xattr_hash_desc, hash_info, iter, k); 2383 - bch_err_fn(c, ret); 2384 - return ret; 2359 + bool need_second_pass = false; 2360 + return bch2_str_hash_check_key(trans, NULL, &bch2_xattr_hash_desc, hash_info, 2361 + iter, k, &need_second_pass); 2385 2362 } 2386 2363 2387 2364 /* ··· 2770 2747 if (!d) { 2771 2748 bch_err(c, "fsck: error allocating memory for nlink_table, size %zu", 2772 2749 new_size); 2773 - return -BCH_ERR_ENOMEM_fsck_add_nlink; 2750 + return bch_err_throw(c, ENOMEM_fsck_add_nlink); 2774 2751 } 2775 2752 2776 2753 if (t->d)
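The largest functional change in fsck.c is the check_dirents() driver: check_dirent() can now signal, through the new need_second_pass flag, that a hash or casefold repair rewrote dirents in a way that requires re-walking the btree, and bch2_check_dirents() reruns the scan at most once before reporting that the repairs are not converging. Stripped of the btree iteration details, the control flow is (walk_all_dirents() is a hypothetical stand-in for the for_each_btree_key_commit() loop):

    bool need_second_pass = false, did_second_pass = false;
    int ret;
    again:
    ret = walk_all_dirents(c, &need_second_pass);

    if (!ret && need_second_pass && !did_second_pass) {
            did_second_pass = true;
            need_second_pass = false;
            goto again;                     /* one retry only */
    }

    if (!ret && need_second_pass)
            ret = -EINVAL;                  /* second pass still wants repairs */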
+6
fs/bcachefs/fsck.h
··· 4 4 5 5 #include "str_hash.h" 6 6 7 + /* records snapshot IDs of overwrites at @pos */ 8 + struct snapshots_seen { 9 + struct bpos pos; 10 + snapshot_id_list ids; 11 + }; 12 + 7 13 int bch2_fsck_update_backpointers(struct btree_trans *, 8 14 struct snapshots_seen *, 9 15 const struct bch_hash_desc,
+56 -30
fs/bcachefs/inode.c
··· 38 38 #undef x 39 39 40 40 static int delete_ancestor_snapshot_inodes(struct btree_trans *, struct bpos); 41 + static int may_delete_deleted_inum(struct btree_trans *, subvol_inum); 41 42 42 43 static const u8 byte_table[8] = { 1, 2, 3, 4, 6, 8, 10, 13 }; 43 44 ··· 1042 1041 goto found_slot; 1043 1042 1044 1043 if (!ret && start == min) 1045 - ret = -BCH_ERR_ENOSPC_inode_create; 1044 + ret = bch_err_throw(trans->c, ENOSPC_inode_create); 1046 1045 1047 1046 if (ret) { 1048 1047 bch2_trans_iter_exit(trans, iter); ··· 1131 1130 u32 snapshot; 1132 1131 int ret; 1133 1132 1133 + ret = lockrestart_do(trans, may_delete_deleted_inum(trans, inum)); 1134 + if (ret) 1135 + goto err2; 1136 + 1134 1137 /* 1135 1138 * If this was a directory, there shouldn't be any real dirents left - 1136 1139 * but there could be whiteouts (from hash collisions) that we should 1137 1140 * delete: 1138 1141 * 1139 - * XXX: the dirent could ideally would delete whiteouts when they're no 1142 + * XXX: the dirent code ideally would delete whiteouts when they're no 1140 1143 * longer needed 1141 1144 */ 1142 1145 ret = bch2_inode_delete_keys(trans, inum, BTREE_ID_extents) ?: 1143 1146 bch2_inode_delete_keys(trans, inum, BTREE_ID_xattrs) ?: 1144 1147 bch2_inode_delete_keys(trans, inum, BTREE_ID_dirents); 1145 1148 if (ret) 1146 - goto err; 1149 + goto err2; 1147 1150 retry: 1148 1151 bch2_trans_begin(trans); 1149 1152 ··· 1166 1161 bch2_fs_inconsistent(c, 1167 1162 "inode %llu:%u not found when deleting", 1168 1163 inum.inum, snapshot); 1169 - ret = -BCH_ERR_ENOENT_inode; 1164 + ret = bch_err_throw(c, ENOENT_inode); 1170 1165 goto err; 1171 1166 } 1172 1167 ··· 1333 1328 bch2_fs_inconsistent(c, 1334 1329 "inode %llu:%u not found when deleting", 1335 1330 inum, snapshot); 1336 - ret = -BCH_ERR_ENOENT_inode; 1331 + ret = bch_err_throw(c, ENOENT_inode); 1337 1332 goto err; 1338 1333 } 1339 1334 ··· 1397 1392 delete_ancestor_snapshot_inodes(trans, SPOS(0, inum, snapshot)); 1398 1393 } 1399 1394 1400 - static int may_delete_deleted_inode(struct btree_trans *trans, 1401 - struct btree_iter *iter, 1402 - struct bpos pos, 1403 - bool *need_another_pass) 1395 + static int may_delete_deleted_inode(struct btree_trans *trans, struct bpos pos, 1396 + bool from_deleted_inodes) 1404 1397 { 1405 1398 struct bch_fs *c = trans->c; 1406 1399 struct btree_iter inode_iter; ··· 1412 1409 if (ret) 1413 1410 return ret; 1414 1411 1415 - ret = bkey_is_inode(k.k) ? 0 : -BCH_ERR_ENOENT_inode; 1416 - if (fsck_err_on(!bkey_is_inode(k.k), 1412 + ret = bkey_is_inode(k.k) ? 0 : bch_err_throw(c, ENOENT_inode); 1413 + if (fsck_err_on(from_deleted_inodes && ret, 1417 1414 trans, deleted_inode_missing, 1418 1415 "nonexistent inode %llu:%u in deleted_inodes btree", 1419 1416 pos.offset, pos.snapshot)) 1420 1417 goto delete; 1418 + if (ret) 1419 + goto out; 1421 1420 1422 1421 ret = bch2_inode_unpack(k, &inode); 1423 1422 if (ret) ··· 1427 1422 1428 1423 if (S_ISDIR(inode.bi_mode)) { 1429 1424 ret = bch2_empty_dir_snapshot(trans, pos.offset, 0, pos.snapshot); 1430 - if (fsck_err_on(bch2_err_matches(ret, ENOTEMPTY), 1425 + if (fsck_err_on(from_deleted_inodes && 1426 + bch2_err_matches(ret, ENOTEMPTY), 1431 1427 trans, deleted_inode_is_dir, 1432 1428 "non empty directory %llu:%u in deleted_inodes btree", 1433 1429 pos.offset, pos.snapshot)) ··· 1437 1431 goto out; 1438 1432 } 1439 1433 1440 - if (fsck_err_on(!(inode.bi_flags & BCH_INODE_unlinked), 1434 + ret = inode.bi_flags & BCH_INODE_unlinked ? 
0 : bch_err_throw(c, inode_not_unlinked); 1435 + if (fsck_err_on(from_deleted_inodes && ret, 1441 1436 trans, deleted_inode_not_unlinked, 1442 1437 "non-deleted inode %llu:%u in deleted_inodes btree", 1443 1438 pos.offset, pos.snapshot)) 1444 1439 goto delete; 1440 + if (ret) 1441 + goto out; 1445 1442 1446 - if (fsck_err_on(inode.bi_flags & BCH_INODE_has_child_snapshot, 1443 + ret = !(inode.bi_flags & BCH_INODE_has_child_snapshot) 1444 + ? 0 : bch_err_throw(c, inode_has_child_snapshot); 1445 + 1446 + if (fsck_err_on(from_deleted_inodes && ret, 1447 1447 trans, deleted_inode_has_child_snapshots, 1448 1448 "inode with child snapshots %llu:%u in deleted_inodes btree", 1449 1449 pos.offset, pos.snapshot)) 1450 1450 goto delete; 1451 + if (ret) 1452 + goto out; 1451 1453 1452 1454 ret = bch2_inode_has_child_snapshots(trans, k.k->p); 1453 1455 if (ret < 0) ··· 1472 1458 if (ret) 1473 1459 goto out; 1474 1460 } 1461 + 1462 + if (!from_deleted_inodes) { 1463 + ret = bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc) ?: 1464 + bch_err_throw(c, inode_has_child_snapshot); 1465 + goto out; 1466 + } 1467 + 1475 1468 goto delete; 1476 1469 1477 1470 } 1478 1471 1479 - if (test_bit(BCH_FS_clean_recovery, &c->flags) && 1480 - !fsck_err(trans, deleted_inode_but_clean, 1481 - "filesystem marked as clean but have deleted inode %llu:%u", 1482 - pos.offset, pos.snapshot)) { 1483 - ret = 0; 1484 - goto out; 1485 - } 1472 + if (from_deleted_inodes) { 1473 + if (test_bit(BCH_FS_clean_recovery, &c->flags) && 1474 + !fsck_err(trans, deleted_inode_but_clean, 1475 + "filesystem marked as clean but have deleted inode %llu:%u", 1476 + pos.offset, pos.snapshot)) { 1477 + ret = 0; 1478 + goto out; 1479 + } 1486 1480 1487 - ret = 1; 1481 + ret = 1; 1482 + } 1488 1483 out: 1489 1484 fsck_err: 1490 1485 bch2_trans_iter_exit(trans, &inode_iter); ··· 1504 1481 goto out; 1505 1482 } 1506 1483 1484 + static int may_delete_deleted_inum(struct btree_trans *trans, subvol_inum inum) 1485 + { 1486 + u32 snapshot; 1487 + 1488 + return bch2_subvolume_get_snapshot(trans, inum.subvol, &snapshot) ?: 1489 + may_delete_deleted_inode(trans, SPOS(0, inum.inum, snapshot), false); 1490 + } 1491 + 1507 1492 int bch2_delete_dead_inodes(struct bch_fs *c) 1508 1493 { 1509 1494 struct btree_trans *trans = bch2_trans_get(c); 1510 - bool need_another_pass; 1511 1495 int ret; 1512 - again: 1496 + 1513 1497 /* 1514 1498 * if we ran check_inodes() unlinked inodes will have already been 1515 1499 * cleaned up but the write buffer will be out of sync; therefore we ··· 1525 1495 ret = bch2_btree_write_buffer_flush_sync(trans); 1526 1496 if (ret) 1527 1497 goto err; 1528 - 1529 - need_another_pass = false; 1530 1498 1531 1499 /* 1532 1500 * Weird transaction restart handling here because on successful delete, ··· 1535 1507 ret = for_each_btree_key_commit(trans, iter, BTREE_ID_deleted_inodes, POS_MIN, 1536 1508 BTREE_ITER_prefetch|BTREE_ITER_all_snapshots, k, 1537 1509 NULL, NULL, BCH_TRANS_COMMIT_no_enospc, ({ 1538 - ret = may_delete_deleted_inode(trans, &iter, k.k->p, &need_another_pass); 1510 + ret = may_delete_deleted_inode(trans, k.k->p, true); 1539 1511 if (ret > 0) { 1540 1512 bch_verbose_ratelimited(c, "deleting unlinked inode %llu:%u", 1541 1513 k.k->p.offset, k.k->p.snapshot); ··· 1556 1528 1557 1529 ret; 1558 1530 })); 1559 - 1560 - if (!ret && need_another_pass) 1561 - goto again; 1562 1531 err: 1563 1532 bch2_trans_put(trans); 1533 + bch_err_fn(c, ret); 1564 1534 return ret; 1565 1535 }
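inode.c reworks may_delete_deleted_inode() so the same checks serve two callers: the deleted_inodes scan (from_deleted_inodes == true, where a failed check is an fsck error and the bogus entry is dropped) and, through the new may_delete_deleted_inum() wrapper, bch2_inode_rm() itself, which now refuses to delete an inode that is still linked or has child snapshots instead of trusting its caller. The wrapper runs under lockrestart_do(); roughly, that helper just retries the expression while it fails with a transaction restart (sketch below, the real macro lives in the btree iterator headers and may differ in detail):

    #define lockrestart_do(_trans, _do)                                     \
    ({                                                                      \
            int _ret;                                                       \
            do {                                                            \
                    bch2_trans_begin(_trans);                               \
                    _ret = (_do);                                           \
            } while (bch2_err_matches(_ret, BCH_ERR_transaction_restart));  \
            _ret;                                                           \
    })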
-9
fs/bcachefs/inode.h
··· 283 283 int bch2_inode_nlink_inc(struct bch_inode_unpacked *); 284 284 void bch2_inode_nlink_dec(struct btree_trans *, struct bch_inode_unpacked *); 285 285 286 - static inline bool bch2_inode_should_have_single_bp(struct bch_inode_unpacked *inode) 287 - { 288 - bool inode_has_bp = inode->bi_dir || inode->bi_dir_offset; 289 - 290 - return S_ISDIR(inode->bi_mode) || 291 - inode->bi_subvol || 292 - (!inode->bi_nlink && inode_has_bp); 293 - } 294 - 295 286 struct bch_opts bch2_inode_opts_to_opts(struct bch_inode_unpacked *); 296 287 void bch2_inode_opts_get(struct bch_io_opts *, struct bch_fs *, 297 288 struct bch_inode_unpacked *);
+1 -1
fs/bcachefs/io_misc.c
··· 91 91 opts.data_replicas, 92 92 BCH_WATERMARK_normal, 0, &cl, &wp); 93 93 if (bch2_err_matches(ret, BCH_ERR_operation_blocked)) 94 - ret = -BCH_ERR_transaction_restart_nested; 94 + ret = bch_err_throw(c, transaction_restart_nested); 95 95 if (ret) 96 96 goto err; 97 97
+17 -18
fs/bcachefs/io_read.c
··· 56 56 if (!target) 57 57 return false; 58 58 59 - rcu_read_lock(); 59 + guard(rcu)(); 60 60 devs = bch2_target_to_mask(c, target) ?: 61 61 &c->rw_devs[BCH_DATA_user]; 62 62 ··· 73 73 total += max(congested, 0LL); 74 74 nr++; 75 75 } 76 - rcu_read_unlock(); 77 76 78 77 return get_random_u32_below(nr * CONGESTED_MAX) < total; 79 78 } ··· 137 138 BUG_ON(!opts.promote_target); 138 139 139 140 if (!(flags & BCH_READ_may_promote)) 140 - return -BCH_ERR_nopromote_may_not; 141 + return bch_err_throw(c, nopromote_may_not); 141 142 142 143 if (bch2_bkey_has_target(c, k, opts.promote_target)) 143 - return -BCH_ERR_nopromote_already_promoted; 144 + return bch_err_throw(c, nopromote_already_promoted); 144 145 145 146 if (bkey_extent_is_unwritten(k)) 146 - return -BCH_ERR_nopromote_unwritten; 147 + return bch_err_throw(c, nopromote_unwritten); 147 148 148 149 if (bch2_target_congested(c, opts.promote_target)) 149 - return -BCH_ERR_nopromote_congested; 150 + return bch_err_throw(c, nopromote_congested); 150 151 } 151 152 152 153 if (rhashtable_lookup_fast(&c->promote_table, &pos, 153 154 bch_promote_params)) 154 - return -BCH_ERR_nopromote_in_flight; 155 + return bch_err_throw(c, nopromote_in_flight); 155 156 156 157 return 0; 157 158 } ··· 239 240 240 241 struct promote_op *op = kzalloc(sizeof(*op), GFP_KERNEL); 241 242 if (!op) { 242 - ret = -BCH_ERR_nopromote_enomem; 243 + ret = bch_err_throw(c, nopromote_enomem); 243 244 goto err_put; 244 245 } 245 246 ··· 248 249 249 250 if (rhashtable_lookup_insert_fast(&c->promote_table, &op->hash, 250 251 bch_promote_params)) { 251 - ret = -BCH_ERR_nopromote_in_flight; 252 + ret = bch_err_throw(c, nopromote_in_flight); 252 253 goto err; 253 254 } 254 255 ··· 544 545 545 546 if (!bkey_and_val_eq(k, bkey_i_to_s_c(u->k.k))) { 546 547 /* extent we wanted to read no longer exists: */ 547 - rbio->ret = -BCH_ERR_data_read_key_overwritten; 548 + rbio->ret = bch_err_throw(trans->c, data_read_key_overwritten); 548 549 goto err; 549 550 } 550 551 ··· 1035 1036 1036 1037 if ((bch2_bkey_extent_flags(k) & BIT_ULL(BCH_EXTENT_FLAG_poisoned)) && 1037 1038 !orig->data_update) 1038 - return -BCH_ERR_extent_poisoned; 1039 + return bch_err_throw(c, extent_poisoned); 1039 1040 retry_pick: 1040 1041 ret = bch2_bkey_pick_read_device(c, k, failed, &pick, dev); 1041 1042 ··· 1073 1074 1074 1075 bch_err_ratelimited(c, "%s", buf.buf); 1075 1076 printbuf_exit(&buf); 1076 - ret = -BCH_ERR_data_read_no_encryption_key; 1077 + ret = bch_err_throw(c, data_read_no_encryption_key); 1077 1078 goto err; 1078 1079 } 1079 1080 ··· 1127 1128 if (ca) 1128 1129 enumerated_ref_put(&ca->io_ref[READ], 1129 1130 BCH_DEV_READ_REF_io_read); 1130 - rbio->ret = -BCH_ERR_data_read_buffer_too_small; 1131 + rbio->ret = bch_err_throw(c, data_read_buffer_too_small); 1131 1132 goto out_read_done; 1132 1133 } 1133 1134 ··· 1332 1333 * have to signal that: 1333 1334 */ 1334 1335 if (u) 1335 - orig->ret = -BCH_ERR_data_read_key_overwritten; 1336 + orig->ret = bch_err_throw(c, data_read_key_overwritten); 1336 1337 1337 1338 zero_fill_bio_iter(&orig->bio, iter); 1338 1339 out_read_done: ··· 1509 1510 c->opts.btree_node_size, 1510 1511 c->opts.encoded_extent_max) / 1511 1512 PAGE_SIZE, 0)) 1512 - return -BCH_ERR_ENOMEM_bio_bounce_pages_init; 1513 + return bch_err_throw(c, ENOMEM_bio_bounce_pages_init); 1513 1514 1514 1515 if (bioset_init(&c->bio_read, 1, offsetof(struct bch_read_bio, bio), 1515 1516 BIOSET_NEED_BVECS)) 1516 - return -BCH_ERR_ENOMEM_bio_read_init; 1517 + return bch_err_throw(c, ENOMEM_bio_read_init); 1517 
1518 1518 1519 if (bioset_init(&c->bio_read_split, 1, offsetof(struct bch_read_bio, bio), 1519 1520 BIOSET_NEED_BVECS)) 1520 - return -BCH_ERR_ENOMEM_bio_read_split_init; 1521 + return bch_err_throw(c, ENOMEM_bio_read_split_init); 1521 1522 1522 1523 if (rhashtable_init(&c->promote_table, &bch_promote_params)) 1523 - return -BCH_ERR_ENOMEM_promote_table_init; 1524 + return bch_err_throw(c, ENOMEM_promote_table_init); 1524 1525 1525 1526 return 0; 1526 1527 }
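The conversion pattern used for RCU throughout this series is worth spelling out once: open-coded rcu_read_lock()/rcu_read_unlock() pairs become guard(rcu)(), the scope-based lock guard from the kernel's <linux/cleanup.h>/<linux/rcupdate.h> machinery. The read lock is taken where the guard is declared and dropped automatically on every exit from the enclosing scope, which is why the explicit unlock before the final return disappears in the hunk above. A minimal sketch; the function and its early return are invented for illustration, with field names borrowed from the surrounding code:

static bool dev_is_rw(struct bch_fs *c, unsigned dev)	/* illustrative only */
{
	guard(rcu)();		/* rcu_read_lock(); the unlock runs at scope exit */

	struct bch_dev *ca = rcu_dereference(c->devs[dev]);
	if (!ca)
		return false;	/* early return: no manual rcu_read_unlock() needed */

	return ca->mi.state == BCH_MEMBER_STATE_rw;
}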
+4 -2
fs/bcachefs/io_read.h
··· 91 91 return 0; 92 92 93 93 *data_btree = BTREE_ID_reflink; 94 + 95 + struct bch_fs *c = trans->c; 94 96 struct btree_iter iter; 95 97 struct bkey_s_c k = bch2_lookup_indirect_extent(trans, &iter, 96 98 offset_into_extent, ··· 104 102 105 103 if (bkey_deleted(k.k)) { 106 104 bch2_trans_iter_exit(trans, &iter); 107 - return -BCH_ERR_missing_indirect_extent; 105 + return bch_err_throw(c, missing_indirect_extent); 108 106 } 109 107 110 - bch2_bkey_buf_reassemble(extent, trans->c, k); 108 + bch2_bkey_buf_reassemble(extent, c, k); 111 109 bch2_trans_iter_exit(trans, &iter); 112 110 return 0; 113 111 }
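The other conversion visible here (and in most files below) is `return -BCH_ERR_missing_indirect_extent;` becoming `return bch_err_throw(c, missing_indirect_extent);`, with a struct bch_fs pointer pulled out where one wasn't previously needed. The macro's definition is not part of this diff; the sketch below is only a guess at its general shape -- it still evaluates to the same negative private error code, but takes the filesystem so the error can be accounted for or traced at the point where it is first thrown. The bch2_note_error_thrown() hook is hypothetical:

/* Rough sketch only; the real bch_err_throw() will differ in detail */
#define bch_err_throw(_c, _err)						\
({									\
	struct bch_fs *_fs = (_c);					\
	bch2_note_error_thrown(_fs, -BCH_ERR_##_err);	/* hypothetical hook */	\
	-BCH_ERR_##_err;						\
})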
+12 -14
fs/bcachefs/io_write.c
··· 558 558 559 559 static noinline int bch2_write_drop_io_error_ptrs(struct bch_write_op *op) 560 560 { 561 + struct bch_fs *c = op->c; 561 562 struct keylist *keys = &op->insert_keys; 562 563 struct bkey_i *src, *dst = keys->keys, *n; 563 564 ··· 570 569 test_bit(ptr->dev, op->failed.d)); 571 570 572 571 if (!bch2_bkey_nr_ptrs(bkey_i_to_s_c(src))) 573 - return -BCH_ERR_data_write_io; 572 + return bch_err_throw(c, data_write_io); 574 573 } 575 574 576 575 if (dst != src) ··· 977 976 op->crc.csum_type < BCH_CSUM_NR 978 977 ? __bch2_csum_types[op->crc.csum_type] 979 978 : "(unknown)"); 980 - return -BCH_ERR_data_write_csum; 979 + return bch_err_throw(c, data_write_csum); 981 980 } 982 981 983 982 static int bch2_write_extent(struct bch_write_op *op, struct write_point *wp, ··· 1209 1208 1210 1209 e = bkey_s_c_to_extent(k); 1211 1210 1212 - rcu_read_lock(); 1211 + guard(rcu)(); 1213 1212 extent_for_each_ptr_decode(e, p, entry) { 1214 - if (crc_is_encoded(p.crc) || p.has_ec) { 1215 - rcu_read_unlock(); 1213 + if (crc_is_encoded(p.crc) || p.has_ec) 1216 1214 return false; 1217 - } 1218 1215 1219 1216 replicas += bch2_extent_ptr_durability(c, &p); 1220 1217 } 1221 - rcu_read_unlock(); 1222 1218 1223 1219 return replicas >= op->opts.data_replicas; 1224 1220 } ··· 1288 1290 static void __bch2_nocow_write_done(struct bch_write_op *op) 1289 1291 { 1290 1292 if (unlikely(op->flags & BCH_WRITE_io_error)) { 1291 - op->error = -BCH_ERR_data_write_io; 1293 + op->error = bch_err_throw(op->c, data_write_io); 1292 1294 } else if (unlikely(op->flags & BCH_WRITE_convert_unwritten)) 1293 1295 bch2_nocow_write_convert_unwritten(op); 1294 1296 } ··· 1481 1483 "pointer to invalid bucket in nocow path on device %llu\n %s", 1482 1484 stale_at->b.inode, 1483 1485 (bch2_bkey_val_to_text(&buf, c, k), buf.buf))) { 1484 - ret = -BCH_ERR_data_write_invalid_ptr; 1486 + ret = bch_err_throw(c, data_write_invalid_ptr); 1485 1487 } else { 1486 1488 /* We can retry this: */ 1487 - ret = -BCH_ERR_transaction_restart; 1489 + ret = bch_err_throw(c, transaction_restart); 1488 1490 } 1489 1491 printbuf_exit(&buf); 1490 1492 ··· 1691 1693 1692 1694 if (unlikely(bio->bi_iter.bi_size & (c->opts.block_size - 1))) { 1693 1695 bch2_write_op_error(op, op->pos.offset, "misaligned write"); 1694 - op->error = -BCH_ERR_data_write_misaligned; 1696 + op->error = bch_err_throw(c, data_write_misaligned); 1695 1697 goto err; 1696 1698 } 1697 1699 1698 1700 if (c->opts.nochanges) { 1699 - op->error = -BCH_ERR_erofs_no_writes; 1701 + op->error = bch_err_throw(c, erofs_no_writes); 1700 1702 goto err; 1701 1703 } 1702 1704 1703 1705 if (!(op->flags & BCH_WRITE_move) && 1704 1706 !enumerated_ref_tryget(&c->writes, BCH_WRITE_REF_write)) { 1705 - op->error = -BCH_ERR_erofs_no_writes; 1707 + op->error = bch_err_throw(c, erofs_no_writes); 1706 1708 goto err; 1707 1709 } 1708 1710 ··· 1774 1776 { 1775 1777 if (bioset_init(&c->bio_write, 1, offsetof(struct bch_write_bio, bio), BIOSET_NEED_BVECS) || 1776 1778 bioset_init(&c->replica_set, 4, offsetof(struct bch_write_bio, bio), 0)) 1777 - return -BCH_ERR_ENOMEM_bio_write_init; 1779 + return bch_err_throw(c, ENOMEM_bio_write_init); 1778 1780 1779 1781 return 0; 1780 1782 }
+89 -28
fs/bcachefs/journal.c
··· 397 397 BUG_ON(BCH_SB_CLEAN(c->disk_sb.sb)); 398 398 399 399 if (j->blocked) 400 - return -BCH_ERR_journal_blocked; 400 + return bch_err_throw(c, journal_blocked); 401 401 402 402 if (j->cur_entry_error) 403 403 return j->cur_entry_error; ··· 407 407 return ret; 408 408 409 409 if (!fifo_free(&j->pin)) 410 - return -BCH_ERR_journal_pin_full; 410 + return bch_err_throw(c, journal_pin_full); 411 411 412 412 if (nr_unwritten_journal_entries(j) == ARRAY_SIZE(j->buf)) 413 - return -BCH_ERR_journal_max_in_flight; 413 + return bch_err_throw(c, journal_max_in_flight); 414 414 415 415 if (atomic64_read(&j->seq) - j->seq_write_started == JOURNAL_STATE_BUF_NR) 416 - return -BCH_ERR_journal_max_open; 416 + return bch_err_throw(c, journal_max_open); 417 417 418 418 if (unlikely(journal_cur_seq(j) >= JOURNAL_SEQ_MAX)) { 419 419 bch_err(c, "cannot start: journal seq overflow"); 420 420 if (bch2_fs_emergency_read_only_locked(c)) 421 421 bch_err(c, "fatal error - emergency read only"); 422 - return -BCH_ERR_journal_shutdown; 422 + return bch_err_throw(c, journal_shutdown); 423 423 } 424 424 425 425 if (!j->free_buf && !buf->data) 426 - return -BCH_ERR_journal_buf_enomem; /* will retry after write completion frees up a buf */ 426 + return bch_err_throw(c, journal_buf_enomem); /* will retry after write completion frees up a buf */ 427 427 428 428 BUG_ON(!j->cur_entry_sectors); 429 429 ··· 447 447 u64s = clamp_t(int, u64s, 0, JOURNAL_ENTRY_CLOSED_VAL - 1); 448 448 449 449 if (u64s <= (ssize_t) j->early_journal_entries.nr) 450 - return -BCH_ERR_journal_full; 450 + return bch_err_throw(c, journal_full); 451 451 452 452 if (fifo_empty(&j->pin) && j->reclaim_thread) 453 453 wake_up_process(j->reclaim_thread); ··· 464 464 journal_cur_seq(j)); 465 465 if (bch2_fs_emergency_read_only_locked(c)) 466 466 bch_err(c, "fatal error - emergency read only"); 467 - return -BCH_ERR_journal_shutdown; 467 + return bch_err_throw(c, journal_shutdown); 468 468 } 469 469 470 470 BUG_ON(j->pin.back - 1 != atomic64_read(&j->seq)); ··· 597 597 return ret; 598 598 599 599 if (j->blocked) 600 - return -BCH_ERR_journal_blocked; 600 + return bch_err_throw(c, journal_blocked); 601 601 602 602 if ((flags & BCH_WATERMARK_MASK) < j->watermark) { 603 - ret = -BCH_ERR_journal_full; 603 + ret = bch_err_throw(c, journal_full); 604 604 can_discard = j->can_discard; 605 605 goto out; 606 606 } 607 607 608 608 if (nr_unwritten_journal_entries(j) == ARRAY_SIZE(j->buf) && !journal_entry_is_open(j)) { 609 - ret = -BCH_ERR_journal_max_in_flight; 609 + ret = bch_err_throw(c, journal_max_in_flight); 610 610 goto out; 611 611 } 612 612 ··· 647 647 goto retry; 648 648 649 649 if (journal_error_check_stuck(j, ret, flags)) 650 - ret = -BCH_ERR_journal_stuck; 650 + ret = bch_err_throw(c, journal_stuck); 651 651 652 652 if (ret == -BCH_ERR_journal_max_in_flight && 653 653 track_event_change(&c->times[BCH_TIME_blocked_journal_max_in_flight], true) && ··· 708 708 { 709 709 u64 nsecs = 0; 710 710 711 - rcu_read_lock(); 711 + guard(rcu)(); 712 712 for_each_rw_member_rcu(c, ca) 713 713 nsecs = max(nsecs, ca->io_latency[WRITE].stats.max_duration); 714 - rcu_read_unlock(); 715 714 716 715 return nsecs_to_jiffies(nsecs); 717 716 } ··· 812 813 int bch2_journal_flush_seq_async(struct journal *j, u64 seq, 813 814 struct closure *parent) 814 815 { 816 + struct bch_fs *c = container_of(j, struct bch_fs, journal); 815 817 struct journal_buf *buf; 816 818 int ret = 0; 817 819 ··· 828 828 829 829 /* Recheck under lock: */ 830 830 if (j->err_seq && seq >= j->err_seq) { 831 
- ret = -BCH_ERR_journal_flush_err; 831 + ret = bch_err_throw(c, journal_flush_err); 832 832 goto out; 833 833 } 834 834 ··· 999 999 struct bch_fs *c = container_of(j, struct bch_fs, journal); 1000 1000 1001 1001 if (!enumerated_ref_tryget(&c->writes, BCH_WRITE_REF_journal)) 1002 - return -BCH_ERR_erofs_no_writes; 1002 + return bch_err_throw(c, erofs_no_writes); 1003 1003 1004 1004 int ret = __bch2_journal_meta(j); 1005 1005 enumerated_ref_put(&c->writes, BCH_WRITE_REF_journal); ··· 1132 1132 new_buckets = kcalloc(nr, sizeof(u64), GFP_KERNEL); 1133 1133 new_bucket_seq = kcalloc(nr, sizeof(u64), GFP_KERNEL); 1134 1134 if (!bu || !ob || !new_buckets || !new_bucket_seq) { 1135 - ret = -BCH_ERR_ENOMEM_set_nr_journal_buckets; 1135 + ret = bch_err_throw(c, ENOMEM_set_nr_journal_buckets); 1136 1136 goto err_free; 1137 1137 } 1138 1138 ··· 1304 1304 return ret; 1305 1305 } 1306 1306 1307 + int bch2_dev_journal_bucket_delete(struct bch_dev *ca, u64 b) 1308 + { 1309 + struct bch_fs *c = ca->fs; 1310 + struct journal *j = &c->journal; 1311 + struct journal_device *ja = &ca->journal; 1312 + 1313 + guard(mutex)(&c->sb_lock); 1314 + unsigned pos; 1315 + for (pos = 0; pos < ja->nr; pos++) 1316 + if (ja->buckets[pos] == b) 1317 + break; 1318 + 1319 + if (pos == ja->nr) { 1320 + bch_err(ca, "journal bucket %llu not found when deleting", b); 1321 + return -EINVAL; 1322 + } 1323 + 1324 + u64 *new_buckets = kcalloc(ja->nr, sizeof(u64), GFP_KERNEL);; 1325 + if (!new_buckets) 1326 + return bch_err_throw(c, ENOMEM_set_nr_journal_buckets); 1327 + 1328 + memcpy(new_buckets, ja->buckets, ja->nr * sizeof(u64)); 1329 + memmove(&new_buckets[pos], 1330 + &new_buckets[pos + 1], 1331 + (ja->nr - 1 - pos) * sizeof(new_buckets[0])); 1332 + 1333 + int ret = bch2_journal_buckets_to_sb(c, ca, ja->buckets, ja->nr - 1) ?: 1334 + bch2_write_super(c); 1335 + if (ret) { 1336 + kfree(new_buckets); 1337 + return ret; 1338 + } 1339 + 1340 + scoped_guard(spinlock, &j->lock) { 1341 + if (pos < ja->discard_idx) 1342 + --ja->discard_idx; 1343 + if (pos < ja->dirty_idx_ondisk) 1344 + --ja->dirty_idx_ondisk; 1345 + if (pos < ja->dirty_idx) 1346 + --ja->dirty_idx; 1347 + if (pos < ja->cur_idx) 1348 + --ja->cur_idx; 1349 + 1350 + ja->nr--; 1351 + 1352 + memmove(&ja->buckets[pos], 1353 + &ja->buckets[pos + 1], 1354 + (ja->nr - pos) * sizeof(ja->buckets[0])); 1355 + 1356 + memmove(&ja->bucket_seq[pos], 1357 + &ja->bucket_seq[pos + 1], 1358 + (ja->nr - pos) * sizeof(ja->bucket_seq[0])); 1359 + 1360 + bch2_journal_space_available(j); 1361 + } 1362 + 1363 + kfree(new_buckets); 1364 + return 0; 1365 + } 1366 + 1307 1367 int bch2_dev_journal_alloc(struct bch_dev *ca, bool new_fs) 1308 1368 { 1309 1369 struct bch_fs *c = ca->fs; ··· 1373 1313 1374 1314 if (c->sb.features & BIT_ULL(BCH_FEATURE_small_image)) { 1375 1315 bch_err(c, "cannot allocate journal, filesystem is an unresized image file"); 1376 - return -BCH_ERR_erofs_filesystem_full; 1316 + return bch_err_throw(c, erofs_filesystem_full); 1377 1317 } 1378 1318 1379 1319 unsigned nr; 1380 1320 int ret; 1381 1321 1382 1322 if (dynamic_fault("bcachefs:add:journal_alloc")) { 1383 - ret = -BCH_ERR_ENOMEM_set_nr_journal_buckets; 1323 + ret = bch_err_throw(c, ENOMEM_set_nr_journal_buckets); 1384 1324 goto err; 1385 1325 } 1386 1326 ··· 1519 1459 init_fifo(&j->pin, roundup_pow_of_two(nr), GFP_KERNEL); 1520 1460 if (!j->pin.data) { 1521 1461 bch_err(c, "error reallocating journal fifo (%llu open entries)", nr); 1522 - return -BCH_ERR_ENOMEM_journal_pin_fifo; 1462 + return bch_err_throw(c, 
ENOMEM_journal_pin_fifo); 1523 1463 } 1524 1464 1525 1465 j->replay_journal_seq = last_seq; ··· 1607 1547 1608 1548 int bch2_dev_journal_init(struct bch_dev *ca, struct bch_sb *sb) 1609 1549 { 1550 + struct bch_fs *c = ca->fs; 1610 1551 struct journal_device *ja = &ca->journal; 1611 1552 struct bch_sb_field_journal *journal_buckets = 1612 1553 bch2_sb_field_get(sb, journal); ··· 1627 1566 1628 1567 ja->bucket_seq = kcalloc(ja->nr, sizeof(u64), GFP_KERNEL); 1629 1568 if (!ja->bucket_seq) 1630 - return -BCH_ERR_ENOMEM_dev_journal_init; 1569 + return bch_err_throw(c, ENOMEM_dev_journal_init); 1631 1570 1632 1571 unsigned nr_bvecs = DIV_ROUND_UP(JOURNAL_ENTRY_SIZE_MAX, PAGE_SIZE); 1633 1572 ··· 1635 1574 ja->bio[i] = kzalloc(struct_size(ja->bio[i], bio.bi_inline_vecs, 1636 1575 nr_bvecs), GFP_KERNEL); 1637 1576 if (!ja->bio[i]) 1638 - return -BCH_ERR_ENOMEM_dev_journal_init; 1577 + return bch_err_throw(c, ENOMEM_dev_journal_init); 1639 1578 1640 1579 ja->bio[i]->ca = ca; 1641 1580 ja->bio[i]->buf_idx = i; ··· 1644 1583 1645 1584 ja->buckets = kcalloc(ja->nr, sizeof(u64), GFP_KERNEL); 1646 1585 if (!ja->buckets) 1647 - return -BCH_ERR_ENOMEM_dev_journal_init; 1586 + return bch_err_throw(c, ENOMEM_dev_journal_init); 1648 1587 1649 1588 if (journal_buckets_v2) { 1650 1589 unsigned nr = bch2_sb_field_journal_v2_nr_entries(journal_buckets_v2); ··· 1698 1637 1699 1638 int bch2_fs_journal_init(struct journal *j) 1700 1639 { 1640 + struct bch_fs *c = container_of(j, struct bch_fs, journal); 1641 + 1701 1642 j->free_buf_size = j->buf_size_want = JOURNAL_ENTRY_SIZE_MIN; 1702 1643 j->free_buf = kvmalloc(j->free_buf_size, GFP_KERNEL); 1703 1644 if (!j->free_buf) 1704 - return -BCH_ERR_ENOMEM_journal_buf; 1645 + return bch_err_throw(c, ENOMEM_journal_buf); 1705 1646 1706 1647 for (unsigned i = 0; i < ARRAY_SIZE(j->buf); i++) 1707 1648 j->buf[i].idx = i; ··· 1711 1648 j->wq = alloc_workqueue("bcachefs_journal", 1712 1649 WQ_HIGHPRI|WQ_FREEZABLE|WQ_UNBOUND|WQ_MEM_RECLAIM, 512); 1713 1650 if (!j->wq) 1714 - return -BCH_ERR_ENOMEM_fs_other_alloc; 1651 + return bch_err_throw(c, ENOMEM_fs_other_alloc); 1715 1652 return 0; 1716 1653 } 1717 1654 ··· 1735 1672 printbuf_tabstop_push(out, 28); 1736 1673 out->atomic++; 1737 1674 1738 - rcu_read_lock(); 1675 + guard(rcu)(); 1739 1676 s = READ_ONCE(j->reservations); 1740 1677 1741 1678 prt_printf(out, "flags:\t"); ··· 1825 1762 } 1826 1763 1827 1764 prt_printf(out, "replicas want %u need %u\n", c->opts.metadata_replicas, c->opts.metadata_replicas_required); 1828 - 1829 - rcu_read_unlock(); 1830 1765 1831 1766 --out->atomic; 1832 1767 }
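The new bch2_dev_journal_bucket_delete() also shows the other two guard flavours this series leans on: guard(mutex)(&c->sb_lock) holds the lock until the function returns, while scoped_guard(spinlock, &j->lock) { ... } confines the critical section to the braced block. A minimal sketch of how those behave in practice; the struct and both counters are invented for the example:

struct counters {				/* illustrative only */
	struct mutex	lock;
	spinlock_t	fast_lock;
	u64		slow, fast;
};

static void counters_bump(struct counters *s)
{
	guard(mutex)(&s->lock);			/* mutex_lock(); unlock when we return */
	s->slow++;

	scoped_guard(spinlock, &s->fast_lock) {
		s->fast++;			/* spinlock held only inside this block */
	}

	/* fast_lock already released here; s->lock released on return */
}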
+3 -2
fs/bcachefs/journal.h
··· 444 444 void __bch2_journal_debug_to_text(struct printbuf *, struct journal *); 445 445 void bch2_journal_debug_to_text(struct printbuf *, struct journal *); 446 446 447 - int bch2_set_nr_journal_buckets(struct bch_fs *, struct bch_dev *, 448 - unsigned nr); 447 + int bch2_set_nr_journal_buckets(struct bch_fs *, struct bch_dev *, unsigned); 448 + int bch2_dev_journal_bucket_delete(struct bch_dev *, u64); 449 + 449 450 int bch2_dev_journal_alloc(struct bch_dev *, bool); 450 451 int bch2_fs_journal_alloc(struct bch_fs *); 451 452
+173 -110
fs/bcachefs/journal_io.c
··· 49 49 mutex_unlock(&c->sb_lock); 50 50 } 51 51 52 - void bch2_journal_ptrs_to_text(struct printbuf *out, struct bch_fs *c, 53 - struct journal_replay *j) 52 + static void bch2_journal_ptr_to_text(struct printbuf *out, struct bch_fs *c, struct journal_ptr *p) 53 + { 54 + struct bch_dev *ca = bch2_dev_tryget_noerror(c, p->dev); 55 + prt_printf(out, "%s %u:%u:%u (sector %llu)", 56 + ca ? ca->name : "(invalid dev)", 57 + p->dev, p->bucket, p->bucket_offset, p->sector); 58 + bch2_dev_put(ca); 59 + } 60 + 61 + void bch2_journal_ptrs_to_text(struct printbuf *out, struct bch_fs *c, struct journal_replay *j) 54 62 { 55 63 darray_for_each(j->ptrs, i) { 56 64 if (i != j->ptrs.data) 57 65 prt_printf(out, " "); 58 - prt_printf(out, "%u:%u:%u (sector %llu)", 59 - i->dev, i->bucket, i->bucket_offset, i->sector); 66 + bch2_journal_ptr_to_text(out, c, i); 67 + } 68 + } 69 + 70 + static void bch2_journal_datetime_to_text(struct printbuf *out, struct jset *j) 71 + { 72 + for_each_jset_entry_type(entry, j, BCH_JSET_ENTRY_datetime) { 73 + struct jset_entry_datetime *datetime = 74 + container_of(entry, struct jset_entry_datetime, entry); 75 + bch2_prt_datetime(out, le64_to_cpu(datetime->seconds)); 76 + break; 60 77 } 61 78 } 62 79 ··· 81 64 struct journal_replay *j) 82 65 { 83 66 prt_printf(out, "seq %llu ", le64_to_cpu(j->j.seq)); 84 - 67 + bch2_journal_datetime_to_text(out, &j->j); 68 + prt_char(out, ' '); 85 69 bch2_journal_ptrs_to_text(out, c, j); 86 - 87 - for_each_jset_entry_type(entry, &j->j, BCH_JSET_ENTRY_datetime) { 88 - struct jset_entry_datetime *datetime = 89 - container_of(entry, struct jset_entry_datetime, entry); 90 - bch2_prt_datetime(out, le64_to_cpu(datetime->seconds)); 91 - break; 92 - } 93 70 } 94 71 95 72 static struct nonce journal_nonce(const struct jset *jset) ··· 199 188 journal_entry_radix_idx(c, le64_to_cpu(j->seq)), 200 189 GFP_KERNEL); 201 190 if (!_i) 202 - return -BCH_ERR_ENOMEM_journal_entry_add; 191 + return bch_err_throw(c, ENOMEM_journal_entry_add); 203 192 204 193 /* 205 194 * Duplicate journal entries? 
If so we want the one that didn't have a ··· 242 231 replace: 243 232 i = kvmalloc(offsetof(struct journal_replay, j) + bytes, GFP_KERNEL); 244 233 if (!i) 245 - return -BCH_ERR_ENOMEM_journal_entry_add; 234 + return bch_err_throw(c, ENOMEM_journal_entry_add); 246 235 247 236 darray_init(&i->ptrs); 248 237 i->csum_good = entry_ptr.csum_good; ··· 322 311 bch2_sb_error_count(c, BCH_FSCK_ERR_##_err); \ 323 312 if (bch2_fs_inconsistent(c, \ 324 313 "corrupt metadata before write: %s\n", _buf.buf)) {\ 325 - ret = -BCH_ERR_fsck_errors_not_fixed; \ 314 + ret = bch_err_throw(c, fsck_errors_not_fixed); \ 326 315 goto fsck_err; \ 327 316 } \ 328 317 break; \ ··· 429 418 bool first = true; 430 419 431 420 jset_entry_for_each_key(entry, k) { 421 + /* We may be called on entries that haven't been validated: */ 422 + if (!k->k.u64s) 423 + break; 424 + 432 425 if (!first) { 433 426 prt_newline(out); 434 427 bch2_prt_jset_entry_type(out, entry->type); ··· 1020 1005 size_t size; 1021 1006 }; 1022 1007 1023 - static int journal_read_buf_realloc(struct journal_read_buf *b, 1008 + static int journal_read_buf_realloc(struct bch_fs *c, struct journal_read_buf *b, 1024 1009 size_t new_size) 1025 1010 { 1026 1011 void *n; 1027 1012 1028 1013 /* the bios are sized for this many pages, max: */ 1029 1014 if (new_size > JOURNAL_ENTRY_SIZE_MAX) 1030 - return -BCH_ERR_ENOMEM_journal_read_buf_realloc; 1015 + return bch_err_throw(c, ENOMEM_journal_read_buf_realloc); 1031 1016 1032 1017 new_size = roundup_pow_of_two(new_size); 1033 1018 n = kvmalloc(new_size, GFP_KERNEL); 1034 1019 if (!n) 1035 - return -BCH_ERR_ENOMEM_journal_read_buf_realloc; 1020 + return bch_err_throw(c, ENOMEM_journal_read_buf_realloc); 1036 1021 1037 1022 kvfree(b->data); 1038 1023 b->data = n; ··· 1052 1037 u64 offset = bucket_to_sector(ca, ja->buckets[bucket]), 1053 1038 end = offset + ca->mi.bucket_size; 1054 1039 bool saw_bad = false, csum_good; 1055 - struct printbuf err = PRINTBUF; 1056 1040 int ret = 0; 1057 1041 1058 1042 pr_debug("reading %u", bucket); ··· 1067 1053 1068 1054 bio = bio_kmalloc(nr_bvecs, GFP_KERNEL); 1069 1055 if (!bio) 1070 - return -BCH_ERR_ENOMEM_journal_read_bucket; 1056 + return bch_err_throw(c, ENOMEM_journal_read_bucket); 1071 1057 bio_init(bio, ca->disk_sb.bdev, bio->bi_inline_vecs, nr_bvecs, REQ_OP_READ); 1072 1058 1073 1059 bio->bi_iter.bi_sector = offset; ··· 1078 1064 kfree(bio); 1079 1065 1080 1066 if (!ret && bch2_meta_read_fault("journal")) 1081 - ret = -BCH_ERR_EIO_fault_injected; 1067 + ret = bch_err_throw(c, EIO_fault_injected); 1082 1068 1083 1069 bch2_account_io_completion(ca, BCH_MEMBER_ERROR_read, 1084 1070 submit_time, !ret); ··· 1092 1078 * found on a different device, and missing or 1093 1079 * no journal entries will be handled later 1094 1080 */ 1095 - goto out; 1081 + return 0; 1096 1082 } 1097 1083 1098 1084 j = buf->data; ··· 1106 1092 break; 1107 1093 case JOURNAL_ENTRY_REREAD: 1108 1094 if (vstruct_bytes(j) > buf->size) { 1109 - ret = journal_read_buf_realloc(buf, 1095 + ret = journal_read_buf_realloc(c, buf, 1110 1096 vstruct_bytes(j)); 1111 1097 if (ret) 1112 - goto err; 1098 + return ret; 1113 1099 } 1114 1100 goto reread; 1115 1101 case JOURNAL_ENTRY_NONE: 1116 1102 if (!saw_bad) 1117 - goto out; 1103 + return 0; 1118 1104 /* 1119 1105 * On checksum error we don't really trust the size 1120 1106 * field of the journal entry we read, so try reading ··· 1123 1109 sectors = block_sectors(c); 1124 1110 goto next_block; 1125 1111 default: 1126 - goto err; 1112 + return ret; 1127 1113 } 1128 
1114 1129 1115 if (le64_to_cpu(j->seq) > ja->highest_seq_found) { ··· 1140 1126 * bucket: 1141 1127 */ 1142 1128 if (le64_to_cpu(j->seq) < ja->bucket_seq[bucket]) 1143 - goto out; 1129 + return 0; 1144 1130 1145 1131 ja->bucket_seq[bucket] = le64_to_cpu(j->seq); 1146 1132 1147 - enum bch_csum_type csum_type = JSET_CSUM_TYPE(j); 1148 1133 struct bch_csum csum; 1149 1134 csum_good = jset_csum_good(c, j, &csum); 1150 1135 1151 1136 bch2_account_io_completion(ca, BCH_MEMBER_ERROR_checksum, 0, csum_good); 1152 1137 1153 1138 if (!csum_good) { 1154 - bch_err_dev_ratelimited(ca, "%s", 1155 - (printbuf_reset(&err), 1156 - prt_str(&err, "journal "), 1157 - bch2_csum_err_msg(&err, csum_type, j->csum, csum), 1158 - err.buf)); 1139 + /* 1140 + * Don't print an error here, we'll print the error 1141 + * later if we need this journal entry 1142 + */ 1159 1143 saw_bad = true; 1160 1144 } 1161 1145 ··· 1165 1153 mutex_lock(&jlist->lock); 1166 1154 ret = journal_entry_add(c, ca, (struct journal_ptr) { 1167 1155 .csum_good = csum_good, 1156 + .csum = csum, 1168 1157 .dev = ca->dev_idx, 1169 1158 .bucket = bucket, 1170 1159 .bucket_offset = offset - ··· 1180 1167 case JOURNAL_ENTRY_ADD_OUT_OF_RANGE: 1181 1168 break; 1182 1169 default: 1183 - goto err; 1170 + return ret; 1184 1171 } 1185 1172 next_block: 1186 1173 pr_debug("next"); ··· 1189 1176 j = ((void *) j) + (sectors << 9); 1190 1177 } 1191 1178 1192 - out: 1193 - ret = 0; 1194 - err: 1195 - printbuf_exit(&err); 1196 - return ret; 1179 + return 0; 1197 1180 } 1198 1181 1199 1182 static CLOSURE_CALLBACK(bch2_journal_read_device) ··· 1206 1197 if (!ja->nr) 1207 1198 goto out; 1208 1199 1209 - ret = journal_read_buf_realloc(&buf, PAGE_SIZE); 1200 + ret = journal_read_buf_realloc(c, &buf, PAGE_SIZE); 1210 1201 if (ret) 1211 1202 goto err; 1212 1203 ··· 1238 1229 goto out; 1239 1230 } 1240 1231 1232 + noinline_for_stack 1233 + static void bch2_journal_print_checksum_error(struct bch_fs *c, struct journal_replay *j) 1234 + { 1235 + struct printbuf buf = PRINTBUF; 1236 + enum bch_csum_type csum_type = JSET_CSUM_TYPE(&j->j); 1237 + bool have_good = false; 1238 + 1239 + prt_printf(&buf, "invalid journal checksum(s) at seq %llu ", le64_to_cpu(j->j.seq)); 1240 + bch2_journal_datetime_to_text(&buf, &j->j); 1241 + prt_newline(&buf); 1242 + 1243 + darray_for_each(j->ptrs, ptr) 1244 + if (!ptr->csum_good) { 1245 + bch2_journal_ptr_to_text(&buf, c, ptr); 1246 + prt_char(&buf, ' '); 1247 + bch2_csum_to_text(&buf, csum_type, ptr->csum); 1248 + prt_newline(&buf); 1249 + } else { 1250 + have_good = true; 1251 + } 1252 + 1253 + prt_printf(&buf, "should be "); 1254 + bch2_csum_to_text(&buf, csum_type, j->j.csum); 1255 + 1256 + if (have_good) 1257 + prt_printf(&buf, "\n(had good copy on another device)"); 1258 + 1259 + bch2_print_str(c, KERN_ERR, buf.buf); 1260 + printbuf_exit(&buf); 1261 + } 1262 + 1263 + noinline_for_stack 1264 + static int bch2_journal_check_for_missing(struct bch_fs *c, u64 start_seq, u64 end_seq) 1265 + { 1266 + struct printbuf buf = PRINTBUF; 1267 + int ret = 0; 1268 + 1269 + struct genradix_iter radix_iter; 1270 + struct journal_replay *i, **_i, *prev = NULL; 1271 + u64 seq = start_seq; 1272 + 1273 + genradix_for_each(&c->journal_entries, radix_iter, _i) { 1274 + i = *_i; 1275 + 1276 + if (journal_replay_ignore(i)) 1277 + continue; 1278 + 1279 + BUG_ON(seq > le64_to_cpu(i->j.seq)); 1280 + 1281 + while (seq < le64_to_cpu(i->j.seq)) { 1282 + while (seq < le64_to_cpu(i->j.seq) && 1283 + bch2_journal_seq_is_blacklisted(c, seq, false)) 1284 + seq++; 1285 + 
1286 + if (seq == le64_to_cpu(i->j.seq)) 1287 + break; 1288 + 1289 + u64 missing_start = seq; 1290 + 1291 + while (seq < le64_to_cpu(i->j.seq) && 1292 + !bch2_journal_seq_is_blacklisted(c, seq, false)) 1293 + seq++; 1294 + 1295 + u64 missing_end = seq - 1; 1296 + 1297 + printbuf_reset(&buf); 1298 + prt_printf(&buf, "journal entries %llu-%llu missing! (replaying %llu-%llu)", 1299 + missing_start, missing_end, 1300 + start_seq, end_seq); 1301 + 1302 + prt_printf(&buf, "\nprev at "); 1303 + if (prev) { 1304 + bch2_journal_ptrs_to_text(&buf, c, prev); 1305 + prt_printf(&buf, " size %zu", vstruct_sectors(&prev->j, c->block_bits)); 1306 + } else 1307 + prt_printf(&buf, "(none)"); 1308 + 1309 + prt_printf(&buf, "\nnext at "); 1310 + bch2_journal_ptrs_to_text(&buf, c, i); 1311 + prt_printf(&buf, ", continue?"); 1312 + 1313 + fsck_err(c, journal_entries_missing, "%s", buf.buf); 1314 + } 1315 + 1316 + prev = i; 1317 + seq++; 1318 + } 1319 + fsck_err: 1320 + printbuf_exit(&buf); 1321 + return ret; 1322 + } 1323 + 1241 1324 int bch2_journal_read(struct bch_fs *c, 1242 1325 u64 *last_seq, 1243 1326 u64 *blacklist_seq, 1244 1327 u64 *start_seq) 1245 1328 { 1246 1329 struct journal_list jlist; 1247 - struct journal_replay *i, **_i, *prev = NULL; 1330 + struct journal_replay *i, **_i; 1248 1331 struct genradix_iter radix_iter; 1249 1332 struct printbuf buf = PRINTBUF; 1250 1333 bool degraded = false, last_write_torn = false; ··· 1427 1326 return 0; 1428 1327 } 1429 1328 1430 - bch_info(c, "journal read done, replaying entries %llu-%llu", 1431 - *last_seq, *blacklist_seq - 1); 1432 - 1329 + printbuf_reset(&buf); 1330 + prt_printf(&buf, "journal read done, replaying entries %llu-%llu", 1331 + *last_seq, *blacklist_seq - 1); 1433 1332 if (*start_seq != *blacklist_seq) 1434 - bch_info(c, "dropped unflushed entries %llu-%llu", 1435 - *blacklist_seq, *start_seq - 1); 1333 + prt_printf(&buf, " (unflushed %llu-%llu)", *blacklist_seq, *start_seq - 1); 1334 + bch_info(c, "%s", buf.buf); 1436 1335 1437 1336 /* Drop blacklisted entries and entries older than last_seq: */ 1438 1337 genradix_for_each(&c->journal_entries, radix_iter, _i) { ··· 1455 1354 } 1456 1355 } 1457 1356 1458 - /* Check for missing entries: */ 1459 - seq = *last_seq; 1460 - genradix_for_each(&c->journal_entries, radix_iter, _i) { 1461 - i = *_i; 1462 - 1463 - if (journal_replay_ignore(i)) 1464 - continue; 1465 - 1466 - BUG_ON(seq > le64_to_cpu(i->j.seq)); 1467 - 1468 - while (seq < le64_to_cpu(i->j.seq)) { 1469 - u64 missing_start, missing_end; 1470 - struct printbuf buf1 = PRINTBUF, buf2 = PRINTBUF; 1471 - 1472 - while (seq < le64_to_cpu(i->j.seq) && 1473 - bch2_journal_seq_is_blacklisted(c, seq, false)) 1474 - seq++; 1475 - 1476 - if (seq == le64_to_cpu(i->j.seq)) 1477 - break; 1478 - 1479 - missing_start = seq; 1480 - 1481 - while (seq < le64_to_cpu(i->j.seq) && 1482 - !bch2_journal_seq_is_blacklisted(c, seq, false)) 1483 - seq++; 1484 - 1485 - if (prev) { 1486 - bch2_journal_ptrs_to_text(&buf1, c, prev); 1487 - prt_printf(&buf1, " size %zu", vstruct_sectors(&prev->j, c->block_bits)); 1488 - } else 1489 - prt_printf(&buf1, "(none)"); 1490 - bch2_journal_ptrs_to_text(&buf2, c, i); 1491 - 1492 - missing_end = seq - 1; 1493 - fsck_err(c, journal_entries_missing, 1494 - "journal entries %llu-%llu missing! 
(replaying %llu-%llu)\n" 1495 - "prev at %s\n" 1496 - "next at %s, continue?", 1497 - missing_start, missing_end, 1498 - *last_seq, *blacklist_seq - 1, 1499 - buf1.buf, buf2.buf); 1500 - 1501 - printbuf_exit(&buf1); 1502 - printbuf_exit(&buf2); 1503 - } 1504 - 1505 - prev = i; 1506 - seq++; 1507 - } 1357 + ret = bch2_journal_check_for_missing(c, *last_seq, *blacklist_seq - 1); 1358 + if (ret) 1359 + goto err; 1508 1360 1509 1361 genradix_for_each(&c->journal_entries, radix_iter, _i) { 1510 1362 union bch_replicas_padded replicas = { ··· 1470 1416 if (journal_replay_ignore(i)) 1471 1417 continue; 1472 1418 1473 - darray_for_each(i->ptrs, ptr) { 1474 - struct bch_dev *ca = bch2_dev_have_ref(c, ptr->dev); 1475 - 1476 - if (!ptr->csum_good) 1477 - bch_err_dev_offset(ca, ptr->sector, 1478 - "invalid journal checksum, seq %llu%s", 1479 - le64_to_cpu(i->j.seq), 1480 - i->csum_good ? " (had good copy on another device)" : ""); 1481 - } 1419 + /* 1420 + * Don't print checksum errors until we know we're going to use 1421 + * a given journal entry: 1422 + */ 1423 + darray_for_each(i->ptrs, ptr) 1424 + if (!ptr->csum_good) { 1425 + bch2_journal_print_checksum_error(c, i); 1426 + break; 1427 + } 1482 1428 1483 1429 ret = jset_validate(c, 1484 1430 bch2_dev_have_ref(c, i->ptrs.data[0].dev), ··· 1521 1467 { 1522 1468 struct bch_fs *c = container_of(j, struct bch_fs, journal); 1523 1469 1524 - rcu_read_lock(); 1470 + guard(rcu)(); 1525 1471 darray_for_each(*devs, i) { 1526 1472 struct bch_dev *ca = rcu_dereference(c->devs[*i]); 1527 1473 if (!ca) ··· 1543 1489 ja->bucket_seq[ja->cur_idx] = le64_to_cpu(seq); 1544 1490 } 1545 1491 } 1546 - rcu_read_unlock(); 1547 1492 } 1548 1493 1549 1494 static void __journal_write_alloc(struct journal *j, ··· 1612 1559 1613 1560 retry_target: 1614 1561 devs = target_rw_devs(c, BCH_DATA_journal, target); 1615 - devs_sorted = bch2_dev_alloc_list(c, &j->wp.stripe, &devs); 1562 + bch2_dev_alloc_list(c, &j->wp.stripe, &devs, &devs_sorted); 1616 1563 retry_alloc: 1617 1564 __journal_write_alloc(j, w, &devs_sorted, sectors, replicas, replicas_want); 1618 1565 ··· 1633 1580 } 1634 1581 done: 1635 1582 BUG_ON(bkey_val_u64s(&w->key.k) > BCH_REPLICAS_MAX); 1583 + 1584 + #if 0 1585 + /* 1586 + * XXX: we need a way to alert the user when we go degraded for any 1587 + * reason 1588 + */ 1589 + if (*replicas < min(replicas_want, 1590 + dev_mask_nr(&c->rw_devs[BCH_DATA_free]))) { 1591 + } 1592 + #endif 1636 1593 1637 1594 return *replicas >= replicas_need ? 0 : -BCH_ERR_insufficient_journal_devices; 1638 1595 } ··· 1691 1628 : j->noflush_write_time, j->write_start_time); 1692 1629 1693 1630 if (!w->devs_written.nr) { 1694 - err = -BCH_ERR_journal_write_err; 1631 + err = bch_err_throw(c, journal_write_err); 1695 1632 } else { 1696 1633 bch2_devlist_to_replicas(&replicas.e, BCH_DATA_journal, 1697 1634 w->devs_written); ··· 2121 2058 struct journal *j = container_of(w, struct journal, buf[w->idx]); 2122 2059 struct bch_fs *c = container_of(j, struct bch_fs, journal); 2123 2060 union bch_replicas_padded replicas; 2124 - unsigned nr_rw_members = dev_mask_nr(&c->rw_devs[BCH_DATA_journal]); 2061 + unsigned nr_rw_members = dev_mask_nr(&c->rw_devs[BCH_DATA_free]); 2125 2062 int ret; 2126 2063 2127 2064 BUG_ON(BCH_SB_CLEAN(c->disk_sb.sb));
+1
fs/bcachefs/journal_io.h
··· 9 9 10 10 struct journal_ptr { 11 11 bool csum_good; 12 + struct bch_csum csum; 12 13 u8 dev; 13 14 u32 bucket; 14 15 u32 bucket_offset;
+18 -26
fs/bcachefs/journal_reclaim.c
··· 83 83 journal_dev_space_available(struct journal *j, struct bch_dev *ca, 84 84 enum journal_space_from from) 85 85 { 86 + struct bch_fs *c = container_of(j, struct bch_fs, journal); 86 87 struct journal_device *ja = &ca->journal; 87 88 unsigned sectors, buckets, unwritten; 89 + unsigned bucket_size_aligned = round_down(ca->mi.bucket_size, block_sectors(c)); 88 90 u64 seq; 89 91 90 92 if (from == journal_space_total) 91 93 return (struct journal_space) { 92 - .next_entry = ca->mi.bucket_size, 93 - .total = ca->mi.bucket_size * ja->nr, 94 + .next_entry = bucket_size_aligned, 95 + .total = bucket_size_aligned * ja->nr, 94 96 }; 95 97 96 98 buckets = bch2_journal_dev_buckets_available(j, ja, from); 97 - sectors = ja->sectors_free; 99 + sectors = round_down(ja->sectors_free, block_sectors(c)); 98 100 99 101 /* 100 102 * We that we don't allocate the space for a journal entry ··· 111 109 continue; 112 110 113 111 /* entry won't fit on this device, skip: */ 114 - if (unwritten > ca->mi.bucket_size) 112 + if (unwritten > bucket_size_aligned) 115 113 continue; 116 114 117 115 if (unwritten >= sectors) { ··· 121 119 } 122 120 123 121 buckets--; 124 - sectors = ca->mi.bucket_size; 122 + sectors = bucket_size_aligned; 125 123 } 126 124 127 125 sectors -= unwritten; ··· 129 127 130 128 if (sectors < ca->mi.bucket_size && buckets) { 131 129 buckets--; 132 - sectors = ca->mi.bucket_size; 130 + sectors = bucket_size_aligned; 133 131 } 134 132 135 133 return (struct journal_space) { 136 134 .next_entry = sectors, 137 - .total = sectors + buckets * ca->mi.bucket_size, 135 + .total = sectors + buckets * bucket_size_aligned, 138 136 }; 139 137 } 140 138 ··· 148 146 149 147 BUG_ON(nr_devs_want > ARRAY_SIZE(dev_space)); 150 148 151 - rcu_read_lock(); 152 149 for_each_member_device_rcu(c, ca, &c->rw_devs[BCH_DATA_journal]) { 153 150 if (!ca->journal.nr || 154 151 !ca->mi.durability) ··· 165 164 166 165 array_insert_item(dev_space, nr_devs, pos, space); 167 166 } 168 - rcu_read_unlock(); 169 167 170 168 if (nr_devs < nr_devs_want) 171 169 return (struct journal_space) { 0, 0 }; ··· 189 189 int ret = 0; 190 190 191 191 lockdep_assert_held(&j->lock); 192 + guard(rcu)(); 192 193 193 - rcu_read_lock(); 194 194 for_each_member_device_rcu(c, ca, &c->rw_devs[BCH_DATA_journal]) { 195 195 struct journal_device *ja = &ca->journal; 196 196 ··· 210 210 max_entry_size = min_t(unsigned, max_entry_size, ca->mi.bucket_size); 211 211 nr_online++; 212 212 } 213 - rcu_read_unlock(); 214 213 215 214 j->can_discard = can_discard; 216 215 ··· 220 221 prt_printf(&buf, "insufficient writeable journal devices available: have %u, need %u\n" 221 222 "rw journal devs:", nr_online, metadata_replicas_required(c)); 222 223 223 - rcu_read_lock(); 224 224 for_each_member_device_rcu(c, ca, &c->rw_devs[BCH_DATA_journal]) 225 225 prt_printf(&buf, " %s", ca->name); 226 - rcu_read_unlock(); 227 226 228 227 bch_err(c, "%s", buf.buf); 229 228 printbuf_exit(&buf); 230 229 } 231 - ret = -BCH_ERR_insufficient_journal_devices; 230 + ret = bch_err_throw(c, insufficient_journal_devices); 232 231 goto out; 233 232 } 234 233 ··· 240 243 total = j->space[journal_space_total].total; 241 244 242 245 if (!j->space[journal_space_discarded].next_entry) 243 - ret = -BCH_ERR_journal_full; 246 + ret = bch_err_throw(c, journal_full); 244 247 245 248 if ((j->space[journal_space_clean_ondisk].next_entry < 246 249 j->space[journal_space_clean_ondisk].total) && ··· 253 256 bch2_journal_set_watermark(j); 254 257 out: 255 258 j->cur_entry_sectors = !ret 256 - ? 
round_down(j->space[journal_space_discarded].next_entry, 257 - block_sectors(c)) 259 + ? j->space[journal_space_discarded].next_entry 258 260 : 0; 259 261 j->cur_entry_error = ret; 260 262 ··· 621 625 struct bch_fs *c = container_of(j, struct bch_fs, journal); 622 626 u64 seq_to_flush = 0; 623 627 624 - spin_lock(&j->lock); 628 + guard(spinlock)(&j->lock); 629 + guard(rcu)(); 625 630 626 - rcu_read_lock(); 627 631 for_each_rw_member_rcu(c, ca) { 628 632 struct journal_device *ja = &ca->journal; 629 633 unsigned nr_buckets, bucket_to_flush; ··· 638 642 seq_to_flush = max(seq_to_flush, 639 643 ja->bucket_seq[bucket_to_flush]); 640 644 } 641 - rcu_read_unlock(); 642 645 643 646 /* Also flush if the pin fifo is more than half full */ 644 - seq_to_flush = max_t(s64, seq_to_flush, 645 - (s64) journal_cur_seq(j) - 646 - (j->pin.size >> 1)); 647 - spin_unlock(&j->lock); 648 - 649 - return seq_to_flush; 647 + return max_t(s64, seq_to_flush, 648 + (s64) journal_cur_seq(j) - 649 + (j->pin.size >> 1)); 650 650 } 651 651 652 652 /**
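A worked example of the rounding that moves into journal_dev_space_available(), with made-up numbers: with 512-byte sectors and a 4 KiB block size, block_sectors(c) is 8, so a hypothetical 1030-sector bucket now contributes bucket_size_aligned = round_down(1030, 8) = 1024 sectors, and a sectors_free of, say, 261 is trimmed to round_down(261, 8) = 256 before next_entry and total are summed. Previously the per-device numbers were taken at full bucket granularity and only j->cur_entry_sectors was rounded down to the block size at the very end (the hunk removed above), so the unaligned tail of each bucket could be counted as space a journal entry could never actually use.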
+1 -1
fs/bcachefs/journal_sb.c
··· 210 210 j = bch2_sb_field_resize(&ca->disk_sb, journal_v2, 211 211 (sizeof(*j) + sizeof(j->d[0]) * nr_compacted) / sizeof(u64)); 212 212 if (!j) 213 - return -BCH_ERR_ENOSPC_sb_journal; 213 + return bch_err_throw(c, ENOSPC_sb_journal); 214 214 215 215 bch2_sb_field_delete(&ca->disk_sb, BCH_SB_FIELD_journal); 216 216
+2 -2
fs/bcachefs/journal_seq_blacklist.c
··· 78 78 bl = bch2_sb_field_resize(&c->disk_sb, journal_seq_blacklist, 79 79 sb_blacklist_u64s(nr + 1)); 80 80 if (!bl) { 81 - ret = -BCH_ERR_ENOSPC_sb_journal_seq_blacklist; 81 + ret = bch_err_throw(c, ENOSPC_sb_journal_seq_blacklist); 82 82 goto out; 83 83 } 84 84 ··· 152 152 153 153 t = kzalloc(struct_size(t, entries, nr), GFP_KERNEL); 154 154 if (!t) 155 - return -BCH_ERR_ENOMEM_blacklist_table_init; 155 + return bch_err_throw(c, ENOMEM_blacklist_table_init); 156 156 157 157 t->nr = nr; 158 158
+2 -4
fs/bcachefs/lru.c
··· 145 145 case BCH_LRU_fragmentation: { 146 146 a = bch2_alloc_to_v4(k, &a_convert); 147 147 148 - rcu_read_lock(); 148 + guard(rcu)(); 149 149 struct bch_dev *ca = bch2_dev_rcu_noerror(c, k.k->p.inode); 150 - u64 idx = ca 150 + return ca 151 151 ? alloc_lru_idx_fragmentation(*a, ca) 152 152 : 0; 153 - rcu_read_unlock(); 154 - return idx; 155 153 } 156 154 case BCH_LRU_stripes: 157 155 return k.k->type == KEY_TYPE_stripe
+2 -2
fs/bcachefs/migrate.c
··· 35 35 nr_good = bch2_bkey_durability(c, k.s_c); 36 36 if ((!nr_good && !(flags & lost)) || 37 37 (nr_good < replicas && !(flags & degraded))) 38 - return -BCH_ERR_remove_would_lose_data; 38 + return bch_err_throw(c, remove_would_lose_data); 39 39 40 40 return 0; 41 41 } ··· 156 156 157 157 /* don't handle this yet: */ 158 158 if (flags & BCH_FORCE_IF_METADATA_LOST) 159 - return -BCH_ERR_remove_with_metadata_missing_unimplemented; 159 + return bch_err_throw(c, remove_with_metadata_missing_unimplemented); 160 160 161 161 trans = bch2_trans_get(c); 162 162 bch2_bkey_buf_init(&k);
+92 -42
fs/bcachefs/move.c
··· 38 38 NULL 39 39 }; 40 40 41 - static void trace_io_move2(struct bch_fs *c, struct bkey_s_c k, 42 - struct bch_io_opts *io_opts, 43 - struct data_update_opts *data_opts) 44 - { 45 - if (trace_io_move_enabled()) { 46 - struct printbuf buf = PRINTBUF; 41 + struct evacuate_bucket_arg { 42 + struct bpos bucket; 43 + int gen; 44 + struct data_update_opts data_opts; 45 + }; 47 46 48 - bch2_bkey_val_to_text(&buf, c, k); 49 - prt_newline(&buf); 50 - bch2_data_update_opts_to_text(&buf, c, io_opts, data_opts); 51 - trace_io_move(c, buf.buf); 52 - printbuf_exit(&buf); 53 - } 47 + static bool evacuate_bucket_pred(struct bch_fs *, void *, 48 + enum btree_id, struct bkey_s_c, 49 + struct bch_io_opts *, 50 + struct data_update_opts *); 51 + 52 + static noinline void 53 + trace_io_move2(struct bch_fs *c, struct bkey_s_c k, 54 + struct bch_io_opts *io_opts, 55 + struct data_update_opts *data_opts) 56 + { 57 + struct printbuf buf = PRINTBUF; 58 + 59 + bch2_bkey_val_to_text(&buf, c, k); 60 + prt_newline(&buf); 61 + bch2_data_update_opts_to_text(&buf, c, io_opts, data_opts); 62 + trace_io_move(c, buf.buf); 63 + printbuf_exit(&buf); 54 64 } 55 65 56 - static void trace_io_move_read2(struct bch_fs *c, struct bkey_s_c k) 66 + static noinline void trace_io_move_read2(struct bch_fs *c, struct bkey_s_c k) 57 67 { 58 - if (trace_io_move_read_enabled()) { 59 - struct printbuf buf = PRINTBUF; 68 + struct printbuf buf = PRINTBUF; 60 69 61 - bch2_bkey_val_to_text(&buf, c, k); 62 - trace_io_move_read(c, buf.buf); 63 - printbuf_exit(&buf); 70 + bch2_bkey_val_to_text(&buf, c, k); 71 + trace_io_move_read(c, buf.buf); 72 + printbuf_exit(&buf); 73 + } 74 + 75 + static noinline void 76 + trace_io_move_pred2(struct bch_fs *c, struct bkey_s_c k, 77 + struct bch_io_opts *io_opts, 78 + struct data_update_opts *data_opts, 79 + move_pred_fn pred, void *_arg, bool p) 80 + { 81 + struct printbuf buf = PRINTBUF; 82 + 83 + prt_printf(&buf, "%ps: %u", pred, p); 84 + 85 + if (pred == evacuate_bucket_pred) { 86 + struct evacuate_bucket_arg *arg = _arg; 87 + prt_printf(&buf, " gen=%u", arg->gen); 64 88 } 89 + 90 + prt_newline(&buf); 91 + bch2_bkey_val_to_text(&buf, c, k); 92 + prt_newline(&buf); 93 + bch2_data_update_opts_to_text(&buf, c, io_opts, data_opts); 94 + trace_io_move_pred(c, buf.buf); 95 + printbuf_exit(&buf); 96 + } 97 + 98 + static noinline void 99 + trace_io_move_evacuate_bucket2(struct bch_fs *c, struct bpos bucket, int gen) 100 + { 101 + struct printbuf buf = PRINTBUF; 102 + 103 + prt_printf(&buf, "bucket: "); 104 + bch2_bpos_to_text(&buf, bucket); 105 + prt_printf(&buf, " gen: %i\n", gen); 106 + 107 + trace_io_move_evacuate_bucket(c, buf.buf); 108 + printbuf_exit(&buf); 65 109 } 66 110 67 111 struct moving_io { ··· 342 298 struct bch_fs *c = trans->c; 343 299 int ret = -ENOMEM; 344 300 345 - trace_io_move2(c, k, &io_opts, &data_opts); 301 + if (trace_io_move_enabled()) 302 + trace_io_move2(c, k, &io_opts, &data_opts); 346 303 this_cpu_add(c->counters[BCH_COUNTER_io_move], k.k->size); 347 304 348 305 if (ctxt->stats) ··· 359 314 return 0; 360 315 } 361 316 362 - /* 363 - * Before memory allocations & taking nocow locks in 364 - * bch2_data_update_init(): 365 - */ 366 - bch2_trans_unlock(trans); 367 - 368 - struct moving_io *io = kzalloc(sizeof(struct moving_io), GFP_KERNEL); 317 + struct moving_io *io = allocate_dropping_locks(trans, ret, 318 + kzalloc(sizeof(struct moving_io), _gfp)); 369 319 if (!io) 370 320 goto err; 321 + 322 + if (ret) 323 + goto err_free; 371 324 372 325 INIT_LIST_HEAD(&io->io_list); 373 326 
io->write.ctxt = ctxt; ··· 385 342 386 343 io->write.op.c = c; 387 344 io->write.data_opts = data_opts; 345 + 346 + bch2_trans_unlock(trans); 388 347 389 348 ret = bch2_data_update_bios_init(&io->write, c, &io_opts); 390 349 if (ret) ··· 409 364 atomic_inc(&io->b->count); 410 365 } 411 366 412 - trace_io_move_read2(c, k); 367 + if (trace_io_move_read_enabled()) 368 + trace_io_move_read2(c, k); 413 369 414 370 mutex_lock(&ctxt->lock); 415 371 atomic_add(io->read_sectors, &ctxt->read_sectors); ··· 436 390 err_free: 437 391 kfree(io); 438 392 err: 439 - if (bch2_err_matches(ret, BCH_ERR_data_update_done)) 440 - return 0; 441 - 442 393 if (bch2_err_matches(ret, EROFS) || 443 394 bch2_err_matches(ret, BCH_ERR_transaction_restart)) 444 395 return ret; ··· 451 408 trace_io_move_start_fail(c, buf.buf); 452 409 printbuf_exit(&buf); 453 410 } 411 + 412 + if (bch2_err_matches(ret, BCH_ERR_data_update_done)) 413 + return 0; 454 414 return ret; 455 415 } 456 416 ··· 542 496 bch2_inode_opts_get(io_opts, c, &inode); 543 497 } 544 498 bch2_trans_iter_exit(trans, &inode_iter); 499 + /* seem to be spinning here? */ 545 500 out: 546 501 return bch2_get_update_rebalance_opts(trans, io_opts, extent_iter, extent_k); 547 502 } ··· 957 910 } 958 911 959 912 struct data_update_opts data_opts = {}; 960 - if (!pred(c, arg, bp.v->btree_id, k, &io_opts, &data_opts)) { 913 + bool p = pred(c, arg, bp.v->btree_id, k, &io_opts, &data_opts); 914 + 915 + if (trace_io_move_pred_enabled()) 916 + trace_io_move_pred2(c, k, &io_opts, &data_opts, 917 + pred, arg, p); 918 + 919 + if (!p) { 961 920 bch2_trans_iter_exit(trans, &iter); 962 921 goto next; 963 922 } ··· 971 918 if (data_opts.scrub && 972 919 !bch2_dev_idx_is_online(c, data_opts.read_dev)) { 973 920 bch2_trans_iter_exit(trans, &iter); 974 - ret = -BCH_ERR_device_offline; 921 + ret = bch_err_throw(c, device_offline); 975 922 break; 976 923 } 977 924 ··· 1046 993 return ret; 1047 994 } 1048 995 1049 - struct evacuate_bucket_arg { 1050 - struct bpos bucket; 1051 - int gen; 1052 - struct data_update_opts data_opts; 1053 - }; 1054 - 1055 996 static bool evacuate_bucket_pred(struct bch_fs *c, void *_arg, 1056 997 enum btree_id btree, struct bkey_s_c k, 1057 998 struct bch_io_opts *io_opts, ··· 1072 1025 struct bpos bucket, int gen, 1073 1026 struct data_update_opts data_opts) 1074 1027 { 1028 + struct bch_fs *c = ctxt->trans->c; 1075 1029 struct evacuate_bucket_arg arg = { bucket, gen, data_opts, }; 1030 + 1031 + count_event(c, io_move_evacuate_bucket); 1032 + if (trace_io_move_evacuate_bucket_enabled()) 1033 + trace_io_move_evacuate_bucket2(c, bucket, gen); 1076 1034 1077 1035 return __bch2_move_data_phys(ctxt, bucket_in_flight, 1078 1036 bucket.inode, ··· 1176 1124 ? 
c->opts.metadata_replicas 1177 1125 : io_opts->data_replicas; 1178 1126 1179 - rcu_read_lock(); 1127 + guard(rcu)(); 1180 1128 struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(k); 1181 1129 unsigned i = 0; 1182 1130 bkey_for_each_ptr(ptrs, ptr) { ··· 1186 1134 data_opts->kill_ptrs |= BIT(i); 1187 1135 i++; 1188 1136 } 1189 - rcu_read_unlock(); 1190 1137 1191 1138 if (!data_opts->kill_ptrs && 1192 1139 (!nr_good || nr_good >= replicas)) ··· 1293 1242 struct extent_ptr_decoded p; 1294 1243 unsigned i = 0; 1295 1244 1296 - rcu_read_lock(); 1245 + guard(rcu)(); 1297 1246 bkey_for_each_ptr_decode(k.k, bch2_bkey_ptrs_c(k), p, entry) { 1298 1247 unsigned d = bch2_extent_ptr_durability(c, &p); 1299 1248 ··· 1304 1253 1305 1254 i++; 1306 1255 } 1307 - rcu_read_unlock(); 1308 1256 1309 1257 return data_opts->kill_ptrs != 0; 1310 1258 }
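One pattern from the move.c hunk worth calling out: the trace_*_enabled() checks move to the call sites and the printbuf formatting is pushed into separate noinline helpers (trace_io_move2(), trace_io_move_read2(), and the new trace_io_move_pred2()/trace_io_move_evacuate_bucket2()), presumably so the printbuf and its formatting stay out of the caller's stack frame unless the tracepoint is actually enabled. The shape of that pattern, using a hypothetical event name:

/* 'io_frobnicate' is a made-up event; only the structure matters here */
static noinline void trace_io_frobnicate2(struct bch_fs *c, struct bkey_s_c k)
{
	struct printbuf buf = PRINTBUF;		/* lives only in this frame */

	bch2_bkey_val_to_text(&buf, c, k);
	trace_io_frobnicate(c, buf.buf);
	printbuf_exit(&buf);
}

static int frobnicate_one(struct bch_fs *c, struct bkey_s_c k)
{
	if (trace_io_frobnicate_enabled())	/* cheap static-key check */
		trace_io_frobnicate2(c, k);

	/* ... the actual work ... */
	return 0;
}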
+12 -14
fs/bcachefs/movinggc.c
··· 293 293 { 294 294 u64 wait = U64_MAX; 295 295 296 - rcu_read_lock(); 296 + guard(rcu)(); 297 297 for_each_rw_member_rcu(c, ca) 298 298 wait = min(wait, bch2_copygc_dev_wait_amount(ca)); 299 - rcu_read_unlock(); 300 - 301 299 return wait; 302 300 } 303 301 ··· 319 321 320 322 bch2_printbuf_make_room(out, 4096); 321 323 322 - rcu_read_lock(); 324 + struct task_struct *t; 323 325 out->atomic++; 326 + scoped_guard(rcu) { 327 + prt_printf(out, "Currently calculated wait:\n"); 328 + for_each_rw_member_rcu(c, ca) { 329 + prt_printf(out, " %s:\t", ca->name); 330 + prt_human_readable_u64(out, bch2_copygc_dev_wait_amount(ca)); 331 + prt_newline(out); 332 + } 324 333 325 - prt_printf(out, "Currently calculated wait:\n"); 326 - for_each_rw_member_rcu(c, ca) { 327 - prt_printf(out, " %s:\t", ca->name); 328 - prt_human_readable_u64(out, bch2_copygc_dev_wait_amount(ca)); 329 - prt_newline(out); 334 + t = rcu_dereference(c->copygc_thread); 335 + if (t) 336 + get_task_struct(t); 330 337 } 331 - 332 - struct task_struct *t = rcu_dereference(c->copygc_thread); 333 - if (t) 334 - get_task_struct(t); 335 338 --out->atomic; 336 - rcu_read_unlock(); 337 339 338 340 if (t) { 339 341 bch2_prt_task_backtrace(out, t, 0, GFP_KERNEL);
+1 -2
fs/bcachefs/movinggc.h
··· 7 7 8 8 static inline void bch2_copygc_wakeup(struct bch_fs *c) 9 9 { 10 - rcu_read_lock(); 10 + guard(rcu)(); 11 11 struct task_struct *p = rcu_dereference(c->copygc_thread); 12 12 if (p) 13 13 wake_up_process(p); 14 - rcu_read_unlock(); 15 14 } 16 15 17 16 void bch2_copygc_stop(struct bch_fs *);
+7 -14
fs/bcachefs/namei.c
··· 287 287 } 288 288 289 289 if (deleting_subvol && !inode_u->bi_subvol) { 290 - ret = -BCH_ERR_ENOENT_not_subvol; 290 + ret = bch_err_throw(c, ENOENT_not_subvol); 291 291 goto err; 292 292 } 293 293 ··· 425 425 } 426 426 427 427 ret = bch2_dirent_rename(trans, 428 - src_dir, &src_hash, &src_dir_u->bi_size, 429 - dst_dir, &dst_hash, &dst_dir_u->bi_size, 428 + src_dir, &src_hash, 429 + dst_dir, &dst_hash, 430 430 src_name, &src_inum, &src_offset, 431 431 dst_name, &dst_inum, &dst_offset, 432 432 mode); ··· 633 633 break; 634 634 635 635 if (!inode.bi_dir && !inode.bi_dir_offset) { 636 - ret = -BCH_ERR_ENOENT_inode_no_backpointer; 636 + ret = bch_err_throw(trans->c, ENOENT_inode_no_backpointer); 637 637 goto disconnected; 638 638 } 639 639 ··· 733 733 return __bch2_fsck_write_inode(trans, target); 734 734 } 735 735 736 - if (bch2_inode_should_have_single_bp(target) && 737 - !fsck_err(trans, inode_wrong_backpointer, 738 - "dirent points to inode that does not point back:\n%s", 739 - (bch2_bkey_val_to_text(&buf, c, d.s_c), 740 - prt_newline(&buf), 741 - bch2_inode_unpacked_to_text(&buf, target), 742 - buf.buf))) 743 - goto err; 744 - 745 736 struct bkey_s_c_dirent bp_dirent = 746 737 bch2_bkey_get_iter_typed(trans, &bp_iter, BTREE_ID_dirents, 747 738 SPOS(target->bi_dir, target->bi_dir_offset, target->bi_snapshot), ··· 759 768 ret = __bch2_fsck_write_inode(trans, target); 760 769 } 761 770 } else { 771 + printbuf_reset(&buf); 762 772 bch2_bkey_val_to_text(&buf, c, d.s_c); 763 773 prt_newline(&buf); 764 774 bch2_bkey_val_to_text(&buf, c, bp_dirent.s_c); ··· 849 857 n->v.d_inum = cpu_to_le64(target->bi_inum); 850 858 } 851 859 852 - ret = bch2_trans_update(trans, dirent_iter, &n->k_i, 0); 860 + ret = bch2_trans_update(trans, dirent_iter, &n->k_i, 861 + BTREE_UPDATE_internal_snapshot_node); 853 862 if (ret) 854 863 goto err; 855 864 }
+8
fs/bcachefs/printbuf.h
··· 140 140 .size = _size, \ 141 141 }) 142 142 143 + static inline struct printbuf bch2_printbuf_init(void) 144 + { 145 + return PRINTBUF; 146 + } 147 + 148 + DEFINE_CLASS(printbuf, struct printbuf, 149 + bch2_printbuf_exit(&_T), bch2_printbuf_init(), void) 150 + 143 151 /* 144 152 * Returns size remaining of output buffer: 145 153 */
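The DEFINE_CLASS() above hooks struct printbuf into the kernel's cleanup-class machinery: a printbuf declared via CLASS() starts out as an empty PRINTBUF (through bch2_printbuf_init()) and has bch2_printbuf_exit() run automatically when it goes out of scope. A hypothetical caller, just to show the intended usage; log_some_key() is not a function from this series:

static void log_some_key(struct bch_fs *c, struct bkey_s_c k)	/* illustrative only */
{
	CLASS(printbuf, buf)();		/* buf = bch2_printbuf_init() */

	bch2_bkey_val_to_text(&buf, c, k);
	bch_info(c, "%s", buf.buf);

	/* no explicit cleanup: bch2_printbuf_exit(&buf) runs at scope exit */
}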
+3 -3
fs/bcachefs/quota.c
··· 527 527 struct bch_sb_field_quota *sb_quota = bch2_sb_get_or_create_quota(&c->disk_sb); 528 528 if (!sb_quota) { 529 529 mutex_unlock(&c->sb_lock); 530 - return -BCH_ERR_ENOSPC_sb_quota; 530 + return bch_err_throw(c, ENOSPC_sb_quota); 531 531 } 532 532 533 533 bch2_sb_quota_read(c); ··· 572 572 mutex_lock(&c->sb_lock); 573 573 sb_quota = bch2_sb_get_or_create_quota(&c->disk_sb); 574 574 if (!sb_quota) { 575 - ret = -BCH_ERR_ENOSPC_sb_quota; 575 + ret = bch_err_throw(c, ENOSPC_sb_quota); 576 576 goto unlock; 577 577 } 578 578 ··· 726 726 mutex_lock(&c->sb_lock); 727 727 sb_quota = bch2_sb_get_or_create_quota(&c->disk_sb); 728 728 if (!sb_quota) { 729 - ret = -BCH_ERR_ENOSPC_sb_quota; 729 + ret = bch_err_throw(c, ENOSPC_sb_quota); 730 730 goto unlock; 731 731 } 732 732
+14 -13
fs/bcachefs/rebalance.c
··· 80 80 unsigned ptr_bit = 1; 81 81 unsigned rewrite_ptrs = 0; 82 82 83 - rcu_read_lock(); 83 + guard(rcu)(); 84 84 bkey_for_each_ptr(ptrs, ptr) { 85 85 if (!ptr->cached && !bch2_dev_in_target(c, ptr->dev, opts->background_target)) 86 86 rewrite_ptrs |= ptr_bit; 87 87 ptr_bit <<= 1; 88 88 } 89 - rcu_read_unlock(); 90 89 91 90 return rewrite_ptrs; 92 91 } ··· 134 135 } 135 136 incompressible: 136 137 if (opts->background_target) { 137 - rcu_read_lock(); 138 + guard(rcu)(); 138 139 bkey_for_each_ptr_decode(k.k, ptrs, p, entry) 139 140 if (!p.ptr.cached && 140 141 !bch2_dev_in_target(c, p.ptr.dev, opts->background_target)) 141 142 sectors += p.crc.compressed_size; 142 - rcu_read_unlock(); 143 143 } 144 144 145 145 return sectors; ··· 443 445 if (bch2_err_matches(ret, ENOMEM)) { 444 446 /* memory allocation failure, wait for some IO to finish */ 445 447 bch2_move_ctxt_wait_for_io(ctxt); 446 - ret = -BCH_ERR_transaction_restart_nested; 448 + ret = bch_err_throw(c, transaction_restart_nested); 447 449 } 448 450 449 451 if (bch2_err_matches(ret, BCH_ERR_transaction_restart)) ··· 525 527 r->state = BCH_REBALANCE_waiting; 526 528 } 527 529 528 - bch2_kthread_io_clock_wait(clock, r->wait_iotime_end, MAX_SCHEDULE_TIMEOUT); 530 + bch2_kthread_io_clock_wait_once(clock, r->wait_iotime_end, MAX_SCHEDULE_TIMEOUT); 529 531 } 530 532 531 533 static bool bch2_rebalance_enabled(struct bch_fs *c) ··· 542 544 struct bch_fs_rebalance *r = &c->rebalance; 543 545 struct btree_iter rebalance_work_iter, extent_iter = {}; 544 546 struct bkey_s_c k; 547 + u32 kick = r->kick; 545 548 int ret = 0; 546 549 547 550 bch2_trans_begin(trans); ··· 592 593 if (!ret && 593 594 !kthread_should_stop() && 594 595 !atomic64_read(&r->work_stats.sectors_seen) && 595 - !atomic64_read(&r->scan_stats.sectors_seen)) { 596 + !atomic64_read(&r->scan_stats.sectors_seen) && 597 + kick == r->kick) { 596 598 bch2_moving_ctxt_flush_all(ctxt); 597 599 bch2_trans_unlock_long(trans); 598 600 rebalance_wait(c); ··· 677 677 } 678 678 prt_newline(out); 679 679 680 - rcu_read_lock(); 681 - struct task_struct *t = rcu_dereference(c->rebalance.thread); 682 - if (t) 683 - get_task_struct(t); 684 - rcu_read_unlock(); 680 + struct task_struct *t; 681 + scoped_guard(rcu) { 682 + t = rcu_dereference(c->rebalance.thread); 683 + if (t) 684 + get_task_struct(t); 685 + } 685 686 686 687 if (t) { 687 688 bch2_prt_task_backtrace(out, t, 0, GFP_KERNEL); ··· 795 794 BTREE_ID_extents, POS_MIN, 796 795 BTREE_ITER_prefetch| 797 796 BTREE_ITER_all_snapshots); 798 - return -BCH_ERR_transaction_restart_nested; 797 + return bch_err_throw(c, transaction_restart_nested); 799 798 } 800 799 801 800 if (!extent_k.k && !rebalance_k.k)
+3 -5
fs/bcachefs/rebalance.h
··· 39 39 40 40 static inline void bch2_rebalance_wakeup(struct bch_fs *c) 41 41 { 42 - struct task_struct *p; 43 - 44 - rcu_read_lock(); 45 - p = rcu_dereference(c->rebalance.thread); 42 + c->rebalance.kick++; 43 + guard(rcu)(); 44 + struct task_struct *p = rcu_dereference(c->rebalance.thread); 46 45 if (p) 47 46 wake_up_process(p); 48 - rcu_read_unlock(); 49 47 } 50 48 51 49 void bch2_rebalance_status_to_text(struct printbuf *, struct bch_fs *);
+1
fs/bcachefs/rebalance_types.h
··· 18 18 19 19 struct bch_fs_rebalance { 20 20 struct task_struct __rcu *thread; 21 + u32 kick; 21 22 struct bch_pd_controller pd; 22 23 23 24 enum bch_rebalance_states state;
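Taken together, the three rebalance hunks close a lost-wakeup window: bch2_rebalance_wakeup() now bumps the new kick counter before waking the thread, and do_rebalance() samples the counter at the start of a pass and only goes back to sleep if it is still unchanged after finding no work, so a wakeup that arrives while the scan is already running forces another pass instead of being absorbed. A generic sketch of the idiom; the worker, waker and scan_for_work() are invented for the example:

struct worker {					/* illustrative only */
	struct task_struct __rcu *thread;
	u32			  kick;
};

static bool scan_for_work(void)			/* hypothetical stub */
{
	return false;
}

static void worker_wakeup(struct worker *w)
{
	w->kick++;				/* bump before the wakeup */
	guard(rcu)();
	struct task_struct *p = rcu_dereference(w->thread);
	if (p)
		wake_up_process(p);
}

static void worker_run_once(struct worker *w)
{
	u32 kick = w->kick;			/* sample before scanning */

	bool found_work = scan_for_work();

	if (!found_work) {
		set_current_state(TASK_INTERRUPTIBLE);
		if (kick == w->kick)		/* nothing arrived mid-scan */
			schedule();
		__set_current_state(TASK_RUNNING);
	}
}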
+1 -5
fs/bcachefs/recovery.c
··· 879 879 use_clean: 880 880 if (!clean) { 881 881 bch_err(c, "no superblock clean section found"); 882 - ret = -BCH_ERR_fsck_repair_impossible; 882 + ret = bch_err_throw(c, fsck_repair_impossible); 883 883 goto err; 884 884 885 885 } ··· 1093 1093 out: 1094 1094 bch2_flush_fsck_errs(c); 1095 1095 1096 - if (!c->opts.retain_recovery_info) { 1097 - bch2_journal_keys_put_initial(c); 1098 - bch2_find_btree_nodes_exit(&c->found_btree_nodes); 1099 - } 1100 1096 if (!IS_ERR(clean)) 1101 1097 kfree(clean); 1102 1098
+77 -15
fs/bcachefs/recovery_passes.c
··· 103 103 prt_tab(out); 104 104 105 105 bch2_pr_time_units(out, le32_to_cpu(i->last_runtime) * NSEC_PER_SEC); 106 + 107 + if (BCH_RECOVERY_PASS_NO_RATELIMIT(i)) 108 + prt_str(out, " (no ratelimit)"); 109 + 106 110 prt_newline(out); 107 111 } 108 112 } 109 113 110 - static void bch2_sb_recovery_pass_complete(struct bch_fs *c, 111 - enum bch_recovery_pass pass, 112 - s64 start_time) 114 + static struct recovery_pass_entry *bch2_sb_recovery_pass_entry(struct bch_fs *c, 115 + enum bch_recovery_pass pass) 113 116 { 114 117 enum bch_recovery_pass_stable stable = bch2_recovery_pass_to_stable(pass); 115 - s64 end_time = ktime_get_real_seconds(); 116 118 117 - mutex_lock(&c->sb_lock); 118 - struct bch_sb_field_ext *ext = bch2_sb_field_get(c->disk_sb.sb, ext); 119 - __clear_bit_le64(stable, ext->recovery_passes_required); 119 + lockdep_assert_held(&c->sb_lock); 120 120 121 121 struct bch_sb_field_recovery_passes *r = 122 122 bch2_sb_field_get(c->disk_sb.sb, recovery_passes); ··· 127 127 r = bch2_sb_field_resize(&c->disk_sb, recovery_passes, u64s); 128 128 if (!r) { 129 129 bch_err(c, "error creating recovery_passes sb section"); 130 - goto out; 130 + return NULL; 131 131 } 132 132 } 133 133 134 - r->start[stable].last_run = cpu_to_le64(end_time); 135 - r->start[stable].last_runtime = cpu_to_le32(max(0, end_time - start_time)); 136 - out: 134 + return r->start + stable; 135 + } 136 + 137 + static void bch2_sb_recovery_pass_complete(struct bch_fs *c, 138 + enum bch_recovery_pass pass, 139 + s64 start_time) 140 + { 141 + guard(mutex)(&c->sb_lock); 142 + struct bch_sb_field_ext *ext = bch2_sb_field_get(c->disk_sb.sb, ext); 143 + __clear_bit_le64(bch2_recovery_pass_to_stable(pass), 144 + ext->recovery_passes_required); 145 + 146 + struct recovery_pass_entry *e = bch2_sb_recovery_pass_entry(c, pass); 147 + if (e) { 148 + s64 end_time = ktime_get_real_seconds(); 149 + e->last_run = cpu_to_le64(end_time); 150 + e->last_runtime = cpu_to_le32(max(0, end_time - start_time)); 151 + SET_BCH_RECOVERY_PASS_NO_RATELIMIT(e, false); 152 + } 153 + 137 154 bch2_write_super(c); 138 - mutex_unlock(&c->sb_lock); 155 + } 156 + 157 + void bch2_recovery_pass_set_no_ratelimit(struct bch_fs *c, 158 + enum bch_recovery_pass pass) 159 + { 160 + guard(mutex)(&c->sb_lock); 161 + 162 + struct recovery_pass_entry *e = bch2_sb_recovery_pass_entry(c, pass); 163 + if (e && !BCH_RECOVERY_PASS_NO_RATELIMIT(e)) { 164 + SET_BCH_RECOVERY_PASS_NO_RATELIMIT(e, false); 165 + bch2_write_super(c); 166 + } 139 167 } 140 168 141 169 static bool bch2_recovery_pass_want_ratelimit(struct bch_fs *c, enum bch_recovery_pass pass) ··· 185 157 */ 186 158 ret = (u64) le32_to_cpu(i->last_runtime) * 100 > 187 159 ktime_get_real_seconds() - le64_to_cpu(i->last_run); 160 + 161 + if (BCH_RECOVERY_PASS_NO_RATELIMIT(i)) 162 + ret = false; 188 163 } 189 164 190 165 return ret; ··· 346 315 goto out; 347 316 348 317 bool in_recovery = test_bit(BCH_FS_in_recovery, &c->flags); 349 - bool rewind = in_recovery && r->curr_pass > pass; 318 + bool rewind = in_recovery && 319 + r->curr_pass > pass && 320 + !(r->passes_complete & BIT_ULL(pass)); 350 321 bool ratelimit = flags & RUN_RECOVERY_PASS_ratelimit; 351 322 352 323 if (!(in_recovery && (flags & RUN_RECOVERY_PASS_nopersistent))) { ··· 360 327 (!in_recovery || r->curr_pass >= BCH_RECOVERY_PASS_set_may_go_rw)) { 361 328 prt_printf(out, "need recovery pass %s (%u), but already rw\n", 362 329 bch2_recovery_passes[pass], pass); 363 - ret = -BCH_ERR_cannot_rewind_recovery; 330 + ret = bch_err_throw(c, 
cannot_rewind_recovery); 364 331 goto out; 365 332 } 366 333 ··· 380 347 if (rewind) { 381 348 r->next_pass = pass; 382 349 r->passes_complete &= (1ULL << pass) >> 1; 383 - ret = -BCH_ERR_restart_recovery; 350 + ret = bch_err_throw(c, restart_recovery); 384 351 } 385 352 } else { 386 353 prt_printf(out, "scheduling recovery pass %s (%u)%s\n", ··· 413 380 } 414 381 415 382 return ret; 383 + } 384 + 385 + /* 386 + * Returns 0 if @pass has run recently, otherwise one of 387 + * -BCH_ERR_restart_recovery 388 + * -BCH_ERR_recovery_pass_will_run 389 + */ 390 + int bch2_require_recovery_pass(struct bch_fs *c, 391 + struct printbuf *out, 392 + enum bch_recovery_pass pass) 393 + { 394 + if (test_bit(BCH_FS_in_recovery, &c->flags) && 395 + c->recovery.passes_complete & BIT_ULL(pass)) 396 + return 0; 397 + 398 + guard(mutex)(&c->sb_lock); 399 + 400 + if (bch2_recovery_pass_want_ratelimit(c, pass)) 401 + return 0; 402 + 403 + enum bch_run_recovery_pass_flags flags = 0; 404 + int ret = 0; 405 + 406 + if (recovery_pass_needs_set(c, pass, &flags)) { 407 + ret = __bch2_run_explicit_recovery_pass(c, out, pass, flags); 408 + bch2_write_super(c); 409 + } 410 + 411 + return ret ?: bch_err_throw(c, recovery_pass_will_run); 416 412 } 417 413 418 414 int bch2_run_print_explicit_recovery_pass(struct bch_fs *c, enum bch_recovery_pass pass)
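Besides the ratelimit opt-out, the interesting addition here is bch2_require_recovery_pass(): per its comment it returns 0 when the pass has run recently enough, and otherwise schedules it and returns -BCH_ERR_recovery_pass_will_run (or -BCH_ERR_restart_recovery), so the caller can unwind rather than act on state the pass has not yet verified. A hypothetical call site, only to show how the return value is meant to be consumed; the function is invented and the pass chosen is arbitrary:

static int repair_something(struct bch_fs *c)	/* illustrative only */
{
	CLASS(printbuf, buf)();

	int ret = bch2_require_recovery_pass(c, &buf,
				BCH_RECOVERY_PASS_check_extents_to_backpointers);
	if (ret) {
		/* pass was scheduled (or recovery must restart): report and unwind */
		bch_err(c, "%s", buf.buf);
		return ret;
	}

	/* pass ran recently: safe to rely on whatever it verifies */
	return 0;
}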
fs/bcachefs/recovery_passes.h | +5
···
10 10
11 11 u64 bch2_fsck_recovery_passes(void);
12 12
13 + void bch2_recovery_pass_set_no_ratelimit(struct bch_fs *, enum bch_recovery_pass);
14 +
13 15 enum bch_run_recovery_pass_flags {
14 16 	RUN_RECOVERY_PASS_nopersistent = BIT(0),
15 17 	RUN_RECOVERY_PASS_ratelimit = BIT(1),
···
25 23 int bch2_run_explicit_recovery_pass(struct bch_fs *, struct printbuf *,
26 24 		enum bch_recovery_pass,
27 25 		enum bch_run_recovery_pass_flags);
26 +
27 + int bch2_require_recovery_pass(struct bch_fs *, struct printbuf *,
28 + 		enum bch_recovery_pass);
28 29
29 30 int bch2_run_online_recovery_passes(struct bch_fs *, u64);
30 31 int bch2_run_recovery_passes(struct bch_fs *, enum bch_recovery_pass);
fs/bcachefs/recovery_passes_format.h | +2
···
87 87 	__le32 flags;
88 88 };
89 89
90 + LE32_BITMASK(BCH_RECOVERY_PASS_NO_RATELIMIT, struct recovery_pass_entry, flags, 0, 1)
91 +
90 92 struct bch_sb_field_recovery_passes {
91 93 	struct bch_sb_field field;
92 94 	struct recovery_pass_entry start[];
fs/bcachefs/reflink.c | +5 -4
···
312 312
313 313 	if (!bkey_refcount_c(k)) {
314 314 		if (!(flags & BTREE_TRIGGER_overwrite))
315 - 			ret = -BCH_ERR_missing_indirect_extent;
315 + 			ret = bch_err_throw(c, missing_indirect_extent);
316 316 		goto next;
317 317 	}
318 318
···
612 612 	int ret = 0, ret2 = 0;
613 613
614 614 	if (!enumerated_ref_tryget(&c->writes, BCH_WRITE_REF_reflink))
615 - 		return -BCH_ERR_erofs_no_writes;
615 + 		return bch_err_throw(c, erofs_no_writes);
616 616
617 617 	bch2_check_set_feature(c, BCH_FEATURE_reflink);
618 618
···
711 711 		SET_REFLINK_P_IDX(&dst_p->v, offset);
712 712
713 713 		if (reflink_p_may_update_opts_field &&
714 - 		    may_change_src_io_path_opts)
714 + 		    may_change_src_io_path_opts &&
715 + 		    REFLINK_P_MAY_UPDATE_OPTIONS(src_p.v))
715 716 			SET_REFLINK_P_MAY_UPDATE_OPTIONS(&dst_p->v, true);
716 717 	} else {
717 718 		BUG();
···
848 847 		struct reflink_gc *r = genradix_ptr_alloc(&c->reflink_gc_table,
849 848 				c->reflink_gc_nr++, GFP_KERNEL);
850 849 		if (!r) {
851 - 			ret = -BCH_ERR_ENOMEM_gc_reflink_start;
850 + 			ret = bch_err_throw(c, ENOMEM_gc_reflink_start);
852 851 			break;
853 852 		}
854 853
fs/bcachefs/replicas.c | +18 -19
··· 119 119 return 0; 120 120 bad: 121 121 bch2_replicas_entry_to_text(err, r); 122 - return -BCH_ERR_invalid_replicas_entry; 122 + return bch_err_throw(c, invalid_replicas_entry); 123 123 } 124 124 125 125 void bch2_cpu_replicas_to_text(struct printbuf *out, ··· 311 311 !__replicas_has_entry(&c->replicas_gc, new_entry)) { 312 312 new_gc = cpu_replicas_add_entry(c, &c->replicas_gc, new_entry); 313 313 if (!new_gc.entries) { 314 - ret = -BCH_ERR_ENOMEM_cpu_replicas; 314 + ret = bch_err_throw(c, ENOMEM_cpu_replicas); 315 315 goto err; 316 316 } 317 317 } ··· 319 319 if (!__replicas_has_entry(&c->replicas, new_entry)) { 320 320 new_r = cpu_replicas_add_entry(c, &c->replicas, new_entry); 321 321 if (!new_r.entries) { 322 - ret = -BCH_ERR_ENOMEM_cpu_replicas; 322 + ret = bch_err_throw(c, ENOMEM_cpu_replicas); 323 323 goto err; 324 324 } 325 325 ··· 422 422 if (!c->replicas_gc.entries) { 423 423 mutex_unlock(&c->sb_lock); 424 424 bch_err(c, "error allocating c->replicas_gc"); 425 - return -BCH_ERR_ENOMEM_replicas_gc; 425 + return bch_err_throw(c, ENOMEM_replicas_gc); 426 426 } 427 427 428 428 for_each_cpu_replicas_entry(&c->replicas, e) ··· 458 458 new.entries = kcalloc(nr, new.entry_size, GFP_KERNEL); 459 459 if (!new.entries) { 460 460 bch_err(c, "error allocating c->replicas_gc"); 461 - return -BCH_ERR_ENOMEM_replicas_gc; 461 + return bch_err_throw(c, ENOMEM_replicas_gc); 462 462 } 463 463 464 464 mutex_lock(&c->sb_lock); ··· 622 622 sb_r = bch2_sb_field_resize(&c->disk_sb, replicas_v0, 623 623 DIV_ROUND_UP(bytes, sizeof(u64))); 624 624 if (!sb_r) 625 - return -BCH_ERR_ENOSPC_sb_replicas; 625 + return bch_err_throw(c, ENOSPC_sb_replicas); 626 626 627 627 bch2_sb_field_delete(&c->disk_sb, BCH_SB_FIELD_replicas); 628 628 sb_r = bch2_sb_field_get(c->disk_sb.sb, replicas_v0); ··· 667 667 sb_r = bch2_sb_field_resize(&c->disk_sb, replicas, 668 668 DIV_ROUND_UP(bytes, sizeof(u64))); 669 669 if (!sb_r) 670 - return -BCH_ERR_ENOSPC_sb_replicas; 670 + return bch_err_throw(c, ENOSPC_sb_replicas); 671 671 672 672 bch2_sb_field_delete(&c->disk_sb, BCH_SB_FIELD_replicas_v0); 673 673 sb_r = bch2_sb_field_get(c->disk_sb.sb, replicas); ··· 819 819 if (e->data_type == BCH_DATA_cached) 820 820 continue; 821 821 822 - rcu_read_lock(); 823 - for (unsigned i = 0; i < e->nr_devs; i++) { 824 - if (e->devs[i] == BCH_SB_MEMBER_INVALID) { 825 - nr_failed++; 826 - continue; 822 + scoped_guard(rcu) 823 + for (unsigned i = 0; i < e->nr_devs; i++) { 824 + if (e->devs[i] == BCH_SB_MEMBER_INVALID) { 825 + nr_failed++; 826 + continue; 827 + } 828 + 829 + nr_online += test_bit(e->devs[i], devs.d); 830 + 831 + struct bch_dev *ca = bch2_dev_rcu_noerror(c, e->devs[i]); 832 + nr_failed += !ca || ca->mi.state == BCH_MEMBER_STATE_failed; 827 833 } 828 - 829 - nr_online += test_bit(e->devs[i], devs.d); 830 - 831 - struct bch_dev *ca = bch2_dev_rcu_noerror(c, e->devs[i]); 832 - nr_failed += !ca || ca->mi.state == BCH_MEMBER_STATE_failed; 833 - } 834 - rcu_read_unlock(); 835 834 836 835 if (nr_online + nr_failed == e->nr_devs) 837 836 continue;
fs/bcachefs/sb-counters_format.h | +1
···
26 26 	x(io_move_write_fail, 82, TYPE_COUNTER) \
27 27 	x(io_move_start_fail, 39, TYPE_COUNTER) \
28 28 	x(io_move_created_rebalance, 83, TYPE_COUNTER) \
29 + 	x(io_move_evacuate_bucket, 84, TYPE_COUNTER) \
29 30 	x(bucket_invalidate, 3, TYPE_COUNTER) \
30 31 	x(bucket_discard, 4, TYPE_COUNTER) \
31 32 	x(bucket_discard_fast, 79, TYPE_COUNTER) \
fs/bcachefs/sb-downgrade.c | +1 -1
···
417 417
418 418 	d = bch2_sb_field_resize(&c->disk_sb, downgrade, sb_u64s);
419 419 	if (!d) {
420 - 		ret = -BCH_ERR_ENOSPC_sb_downgrade;
420 + 		ret = bch_err_throw(c, ENOSPC_sb_downgrade);
421 421 		goto out;
422 422 	}
423 423
fs/bcachefs/sb-errors.c | +22
···
78 78 	.to_text = bch2_sb_errors_to_text,
79 79 };
80 80
81 + void bch2_fs_errors_to_text(struct printbuf *out, struct bch_fs *c)
82 + {
83 + 	if (out->nr_tabstops < 1)
84 + 		printbuf_tabstop_push(out, 48);
85 + 	if (out->nr_tabstops < 2)
86 + 		printbuf_tabstop_push(out, 8);
87 + 	if (out->nr_tabstops < 3)
88 + 		printbuf_tabstop_push(out, 16);
89 +
90 + 	guard(mutex)(&c->fsck_error_counts_lock);
91 +
92 + 	bch_sb_errors_cpu *e = &c->fsck_error_counts;
93 + 	darray_for_each(*e, i) {
94 + 		bch2_sb_error_id_to_text(out, i->id);
95 + 		prt_tab(out);
96 + 		prt_u64(out, i->nr);
97 + 		prt_tab(out);
98 + 		bch2_prt_datetime(out, i->last_error_time);
99 + 		prt_newline(out);
100 + 	}
101 + }
102 +
81 103 void bch2_sb_error_count(struct bch_fs *c, enum bch_sb_error_id err)
82 104 {
83 105 	bch_sb_errors_cpu *e = &c->fsck_error_counts;
fs/bcachefs/sb-errors.h | +1
···
7 7 extern const char * const bch2_sb_error_strs[];
8 8
9 9 void bch2_sb_error_id_to_text(struct printbuf *, enum bch_sb_error_id);
10 + void bch2_fs_errors_to_text(struct printbuf *, struct bch_fs *);
10 11
11 12 extern const struct bch_sb_field_ops bch_sb_field_ops_errors;
12 13
fs/bcachefs/sb-errors_format.h | +3 -1
···
232 232 	x(inode_dir_multiple_links, 206, FSCK_AUTOFIX) \
233 233 	x(inode_dir_missing_backpointer, 284, FSCK_AUTOFIX) \
234 234 	x(inode_dir_unlinked_but_not_empty, 286, FSCK_AUTOFIX) \
235 + 	x(inode_dir_has_nonzero_i_size, 319, FSCK_AUTOFIX) \
235 236 	x(inode_multiple_links_but_nlink_0, 207, FSCK_AUTOFIX) \
236 237 	x(inode_wrong_backpointer, 208, FSCK_AUTOFIX) \
237 238 	x(inode_wrong_nlink, 209, FSCK_AUTOFIX) \
···
244 243 	x(inode_parent_has_case_insensitive_not_set, 317, FSCK_AUTOFIX) \
245 244 	x(vfs_inode_i_blocks_underflow, 311, FSCK_AUTOFIX) \
246 245 	x(vfs_inode_i_blocks_not_zero_at_truncate, 313, FSCK_AUTOFIX) \
246 + 	x(vfs_bad_inode_rm, 320, 0) \
247 247 	x(deleted_inode_but_clean, 211, FSCK_AUTOFIX) \
248 248 	x(deleted_inode_missing, 212, FSCK_AUTOFIX) \
249 249 	x(deleted_inode_is_dir, 213, FSCK_AUTOFIX) \
···
330 328 	x(dirent_stray_data_after_cf_name, 305, 0) \
331 329 	x(rebalance_work_incorrectly_set, 309, FSCK_AUTOFIX) \
332 330 	x(rebalance_work_incorrectly_unset, 310, FSCK_AUTOFIX) \
333 - 	x(MAX, 319, 0)
331 + 	x(MAX, 321, 0)
334 332
335 333 enum bch_sb_error_id {
336 334 #define x(t, n, ...) BCH_FSCK_ERR_##t = n,
fs/bcachefs/sb-members.c | +7 -14
··· 101 101 102 102 mi = bch2_sb_field_resize(&c->disk_sb, members_v2, u64s); 103 103 if (!mi) 104 - return -BCH_ERR_ENOSPC_sb_members_v2; 104 + return bch_err_throw(c, ENOSPC_sb_members_v2); 105 105 106 106 for (int i = c->disk_sb.sb->nr_devices - 1; i >= 0; --i) { 107 107 void *dst = (void *) mi->_members + (i * sizeof(struct bch_member)); ··· 378 378 { 379 379 struct bch_sb_field_members_v2 *mi = bch2_sb_field_get(c->disk_sb.sb, members_v2); 380 380 381 - rcu_read_lock(); 381 + guard(rcu)(); 382 382 for_each_member_device_rcu(c, ca, NULL) { 383 383 struct bch_member *m = __bch2_members_v2_get_mut(mi, ca->dev_idx); 384 384 385 385 for (unsigned e = 0; e < BCH_MEMBER_ERROR_NR; e++) 386 386 m->errors[e] = cpu_to_le64(atomic64_read(&ca->errors[e])); 387 387 } 388 - rcu_read_unlock(); 389 388 } 390 389 391 390 void bch2_dev_io_errors_to_text(struct printbuf *out, struct bch_dev *ca) ··· 442 443 443 444 bool bch2_dev_btree_bitmap_marked(struct bch_fs *c, struct bkey_s_c k) 444 445 { 445 - bool ret = true; 446 - rcu_read_lock(); 446 + guard(rcu)(); 447 447 bkey_for_each_ptr(bch2_bkey_ptrs_c(k), ptr) { 448 448 struct bch_dev *ca = bch2_dev_rcu(c, ptr->dev); 449 - if (!ca) 450 - continue; 451 - 452 - if (!bch2_dev_btree_bitmap_marked_sectors(ca, ptr->offset, btree_sectors(c))) { 453 - ret = false; 454 - break; 455 - } 449 + if (ca && 450 + !bch2_dev_btree_bitmap_marked_sectors(ca, ptr->offset, btree_sectors(c))) 451 + return false; 456 452 } 457 - rcu_read_unlock(); 458 - return ret; 453 + return true; 459 454 } 460 455 461 456 static void __bch2_dev_btree_bitmap_mark(struct bch_sb_field_members_v2 *mi, unsigned dev,
fs/bcachefs/sb-members.h | +11 -21
··· 28 28 29 29 static inline bool bch2_dev_idx_is_online(struct bch_fs *c, unsigned dev) 30 30 { 31 - rcu_read_lock(); 31 + guard(rcu)(); 32 32 struct bch_dev *ca = bch2_dev_rcu(c, dev); 33 - bool ret = ca && bch2_dev_is_online(ca); 34 - rcu_read_unlock(); 35 - 36 - return ret; 33 + return ca && bch2_dev_is_online(ca); 37 34 } 38 35 39 36 static inline bool bch2_dev_is_healthy(struct bch_dev *ca) ··· 139 142 140 143 static inline struct bch_dev *bch2_get_next_dev(struct bch_fs *c, struct bch_dev *ca) 141 144 { 142 - rcu_read_lock(); 145 + guard(rcu)(); 143 146 bch2_dev_put(ca); 144 147 if ((ca = __bch2_next_dev(c, ca, NULL))) 145 148 bch2_dev_get(ca); 146 - rcu_read_unlock(); 147 - 148 149 return ca; 149 150 } 150 151 ··· 161 166 unsigned state_mask, 162 167 int rw, unsigned ref_idx) 163 168 { 164 - rcu_read_lock(); 169 + guard(rcu)(); 165 170 if (ca) 166 171 enumerated_ref_put(&ca->io_ref[rw], ref_idx); 167 172 ··· 169 174 (!((1 << ca->mi.state) & state_mask) || 170 175 !enumerated_ref_tryget(&ca->io_ref[rw], ref_idx))) 171 176 ; 172 - rcu_read_unlock(); 173 177 174 178 return ca; 175 179 } ··· 233 239 234 240 static inline struct bch_dev *bch2_dev_tryget_noerror(struct bch_fs *c, unsigned dev) 235 241 { 236 - rcu_read_lock(); 242 + guard(rcu)(); 237 243 struct bch_dev *ca = bch2_dev_rcu_noerror(c, dev); 238 244 if (ca) 239 245 bch2_dev_get(ca); 240 - rcu_read_unlock(); 241 246 return ca; 242 247 } 243 248 ··· 292 299 { 293 300 might_sleep(); 294 301 295 - rcu_read_lock(); 302 + guard(rcu)(); 296 303 struct bch_dev *ca = bch2_dev_rcu(c, dev); 297 - if (ca && !enumerated_ref_tryget(&ca->io_ref[rw], ref_idx)) 298 - ca = NULL; 299 - rcu_read_unlock(); 304 + if (!ca || !enumerated_ref_tryget(&ca->io_ref[rw], ref_idx)) 305 + return NULL; 300 306 301 - if (ca && 302 - (ca->mi.state == BCH_MEMBER_STATE_rw || 303 - (ca->mi.state == BCH_MEMBER_STATE_ro && rw == READ))) 307 + if (ca->mi.state == BCH_MEMBER_STATE_rw || 308 + (ca->mi.state == BCH_MEMBER_STATE_ro && rw == READ)) 304 309 return ca; 305 310 306 - if (ca) 307 - enumerated_ref_put(&ca->io_ref[rw], ref_idx); 311 + enumerated_ref_put(&ca->io_ref[rw], ref_idx); 308 312 return NULL; 309 313 } 310 314
fs/bcachefs/six.c | +2 -5
···
339 339 	 * acquiring the lock and setting the owner field. If we're an RT task
340 340 	 * that will live-lock because we won't let the owner complete.
341 341 	 */
342 - 	rcu_read_lock();
342 + 	guard(rcu)();
343 343 	struct task_struct *owner = READ_ONCE(lock->owner);
344 - 	bool ret = owner ? owner_on_cpu(owner) : !rt_or_dl_task(current);
345 - 	rcu_read_unlock();
346 -
347 - 	return ret;
344 + 	return owner ? owner_on_cpu(owner) : !rt_or_dl_task(current);
348 345 }
349 346
350 347 static inline bool six_optimistic_spin(struct six_lock *lock,
fs/bcachefs/snapshot.c | +89 -59
··· 54 54 BTREE_ITER_with_updates, snapshot_tree, s); 55 55 56 56 if (bch2_err_matches(ret, ENOENT)) 57 - ret = -BCH_ERR_ENOENT_snapshot_tree; 57 + ret = bch_err_throw(trans->c, ENOENT_snapshot_tree); 58 58 return ret; 59 59 } 60 60 ··· 67 67 struct bkey_i_snapshot_tree *s_t; 68 68 69 69 if (ret == -BCH_ERR_ENOSPC_btree_slot) 70 - ret = -BCH_ERR_ENOSPC_snapshot_tree; 70 + ret = bch_err_throw(trans->c, ENOSPC_snapshot_tree); 71 71 if (ret) 72 72 return ERR_PTR(ret); 73 73 ··· 105 105 106 106 static bool bch2_snapshot_is_ancestor_early(struct bch_fs *c, u32 id, u32 ancestor) 107 107 { 108 - rcu_read_lock(); 109 - bool ret = __bch2_snapshot_is_ancestor_early(rcu_dereference(c->snapshots), id, ancestor); 110 - rcu_read_unlock(); 111 - 112 - return ret; 108 + guard(rcu)(); 109 + return __bch2_snapshot_is_ancestor_early(rcu_dereference(c->snapshots), id, ancestor); 113 110 } 114 111 115 112 static inline u32 get_ancestor_below(struct snapshot_table *t, u32 id, u32 ancestor) ··· 137 140 { 138 141 bool ret; 139 142 140 - rcu_read_lock(); 143 + guard(rcu)(); 141 144 struct snapshot_table *t = rcu_dereference(c->snapshots); 142 145 143 - if (unlikely(c->recovery.pass_done < BCH_RECOVERY_PASS_check_snapshots)) { 144 - ret = __bch2_snapshot_is_ancestor_early(t, id, ancestor); 145 - goto out; 146 - } 146 + if (unlikely(c->recovery.pass_done < BCH_RECOVERY_PASS_check_snapshots)) 147 + return __bch2_snapshot_is_ancestor_early(t, id, ancestor); 147 148 148 149 if (likely(ancestor >= IS_ANCESTOR_BITMAP)) 149 150 while (id && id < ancestor - IS_ANCESTOR_BITMAP) ··· 152 157 : id == ancestor; 153 158 154 159 EBUG_ON(ret != __bch2_snapshot_is_ancestor_early(t, id, ancestor)); 155 - out: 156 - rcu_read_unlock(); 157 - 158 160 return ret; 159 161 } 160 162 ··· 285 293 mutex_lock(&c->snapshot_table_lock); 286 294 int ret = snapshot_t_mut(c, id) 287 295 ? 0 288 - : -BCH_ERR_ENOMEM_mark_snapshot; 296 + : bch_err_throw(c, ENOMEM_mark_snapshot); 289 297 mutex_unlock(&c->snapshot_table_lock); 290 298 return ret; 291 299 } ··· 304 312 305 313 t = snapshot_t_mut(c, id); 306 314 if (!t) { 307 - ret = -BCH_ERR_ENOMEM_mark_snapshot; 315 + ret = bch_err_throw(c, ENOMEM_mark_snapshot); 308 316 goto err; 309 317 } 310 318 ··· 404 412 u32 bch2_snapshot_oldest_subvol(struct bch_fs *c, u32 snapshot_root, 405 413 snapshot_id_list *skip) 406 414 { 415 + guard(rcu)(); 407 416 u32 id, subvol = 0, s; 408 417 retry: 409 418 id = snapshot_root; 410 - rcu_read_lock(); 411 419 while (id && bch2_snapshot_exists(c, id)) { 412 420 if (!(skip && snapshot_list_has_id(skip, id))) { 413 421 s = snapshot_t(c, id)->subvol; ··· 419 427 if (id == snapshot_root) 420 428 break; 421 429 } 422 - rcu_read_unlock(); 423 430 424 431 if (!subvol && skip) { 425 432 skip = NULL; ··· 608 617 609 618 u32 bch2_snapshot_skiplist_get(struct bch_fs *c, u32 id) 610 619 { 611 - const struct snapshot_t *s; 612 - 613 620 if (!id) 614 621 return 0; 615 622 616 - rcu_read_lock(); 617 - s = snapshot_t(c, id); 618 - if (s->parent) 619 - id = bch2_snapshot_nth_parent(c, id, get_random_u32_below(s->depth)); 620 - rcu_read_unlock(); 621 - 622 - return id; 623 + guard(rcu)(); 624 + const struct snapshot_t *s = snapshot_t(c, id); 625 + return s->parent 626 + ? 
bch2_snapshot_nth_parent(c, id, get_random_u32_below(s->depth)) 627 + : id; 623 628 } 624 629 625 630 static int snapshot_skiplist_good(struct btree_trans *trans, u32 id, struct bch_snapshot s) ··· 934 947 935 948 static inline bool snapshot_id_lists_have_common(snapshot_id_list *l, snapshot_id_list *r) 936 949 { 937 - darray_for_each(*l, i) 938 - if (snapshot_list_has_id(r, *i)) 939 - return true; 940 - return false; 950 + return darray_find_p(*l, i, snapshot_list_has_id(r, *i)) != NULL; 941 951 } 942 952 943 953 static void snapshot_id_list_to_text(struct printbuf *out, snapshot_id_list *s) ··· 1006 1022 "snapshot node %u from tree %s missing, recreate?", *id, buf.buf)) { 1007 1023 if (t->nr > 1) { 1008 1024 bch_err(c, "cannot reconstruct snapshot trees with multiple nodes"); 1009 - ret = -BCH_ERR_fsck_repair_unimplemented; 1025 + ret = bch_err_throw(c, fsck_repair_unimplemented); 1010 1026 goto err; 1011 1027 } 1012 1028 ··· 1045 1061 ret = bch2_btree_delete_at(trans, iter, 1046 1062 BTREE_UPDATE_internal_snapshot_node) ?: 1; 1047 1063 1048 - /* 1049 - * Snapshot missing: we should have caught this with btree_lost_data and 1050 - * kicked off reconstruct_snapshots, so if we end up here we have no 1051 - * idea what happened: 1052 - */ 1053 - if (fsck_err_on(state == SNAPSHOT_ID_empty, 1054 - trans, bkey_in_missing_snapshot, 1055 - "key in missing snapshot %s, delete?", 1056 - (bch2_btree_id_to_text(&buf, iter->btree_id), 1057 - prt_char(&buf, ' '), 1058 - bch2_bkey_val_to_text(&buf, c, k), buf.buf))) 1059 - ret = bch2_btree_delete_at(trans, iter, 1060 - BTREE_UPDATE_internal_snapshot_node) ?: 1; 1064 + if (state == SNAPSHOT_ID_empty) { 1065 + /* 1066 + * Snapshot missing: we should have caught this with btree_lost_data and 1067 + * kicked off reconstruct_snapshots, so if we end up here we have no 1068 + * idea what happened. 1069 + * 1070 + * Do not delete unless we know that subvolumes and snapshots 1071 + * are consistent: 1072 + * 1073 + * XXX: 1074 + * 1075 + * We could be smarter here, and instead of using the generic 1076 + * recovery pass ratelimiting, track if there have been any 1077 + * changes to the snapshots or inodes btrees since those passes 1078 + * last ran. 1079 + */ 1080 + ret = bch2_require_recovery_pass(c, &buf, BCH_RECOVERY_PASS_check_snapshots) ?: ret; 1081 + ret = bch2_require_recovery_pass(c, &buf, BCH_RECOVERY_PASS_check_subvols) ?: ret; 1082 + 1083 + if (c->sb.btrees_lost_data & BIT_ULL(BTREE_ID_snapshots)) 1084 + ret = bch2_require_recovery_pass(c, &buf, BCH_RECOVERY_PASS_reconstruct_snapshots) ?: ret; 1085 + 1086 + unsigned repair_flags = FSCK_CAN_IGNORE | (!ret ? 
FSCK_CAN_FIX : 0); 1087 + 1088 + if (__fsck_err(trans, repair_flags, bkey_in_missing_snapshot, 1089 + "key in missing snapshot %s, delete?", 1090 + (bch2_btree_id_to_text(&buf, iter->btree_id), 1091 + prt_char(&buf, ' '), 1092 + bch2_bkey_val_to_text(&buf, c, k), buf.buf))) { 1093 + ret = bch2_btree_delete_at(trans, iter, 1094 + BTREE_UPDATE_internal_snapshot_node) ?: 1; 1095 + } 1096 + } 1061 1097 fsck_err: 1062 1098 printbuf_exit(&buf); 1099 + return ret; 1100 + } 1101 + 1102 + int __bch2_get_snapshot_overwrites(struct btree_trans *trans, 1103 + enum btree_id btree, struct bpos pos, 1104 + snapshot_id_list *s) 1105 + { 1106 + struct bch_fs *c = trans->c; 1107 + struct btree_iter iter; 1108 + struct bkey_s_c k; 1109 + int ret = 0; 1110 + 1111 + for_each_btree_key_reverse_norestart(trans, iter, btree, bpos_predecessor(pos), 1112 + BTREE_ITER_all_snapshots, k, ret) { 1113 + if (!bkey_eq(k.k->p, pos)) 1114 + break; 1115 + 1116 + if (!bch2_snapshot_is_ancestor(c, k.k->p.snapshot, pos.snapshot) || 1117 + snapshot_list_has_ancestor(c, s, k.k->p.snapshot)) 1118 + continue; 1119 + 1120 + ret = snapshot_list_add(c, s, k.k->p.snapshot); 1121 + if (ret) 1122 + break; 1123 + } 1124 + bch2_trans_iter_exit(trans, &iter); 1125 + if (ret) 1126 + darray_exit(s); 1127 + 1063 1128 return ret; 1064 1129 } 1065 1130 ··· 1296 1263 goto err; 1297 1264 1298 1265 if (!k.k || !k.k->p.offset) { 1299 - ret = -BCH_ERR_ENOSPC_snapshot_create; 1266 + ret = bch_err_throw(c, ENOSPC_snapshot_create); 1300 1267 goto err; 1301 1268 } 1302 1269 ··· 1432 1399 1433 1400 static inline u32 interior_delete_has_id(interior_delete_list *l, u32 id) 1434 1401 { 1435 - darray_for_each(*l, i) 1436 - if (i->id == id) 1437 - return i->live_child; 1438 - return 0; 1402 + struct snapshot_interior_delete *i = darray_find_p(*l, i, i->id == id); 1403 + return i ? i->live_child : 0; 1439 1404 } 1440 1405 1441 1406 static unsigned __live_child(struct snapshot_table *t, u32 id, ··· 1465 1434 { 1466 1435 struct snapshot_delete *d = &c->snapshot_delete; 1467 1436 1468 - rcu_read_lock(); 1469 - u32 ret = __live_child(rcu_dereference(c->snapshots), id, 1470 - &d->delete_leaves, &d->delete_interior); 1471 - rcu_read_unlock(); 1472 - return ret; 1437 + guard(rcu)(); 1438 + return __live_child(rcu_dereference(c->snapshots), id, 1439 + &d->delete_leaves, &d->delete_interior); 1473 1440 } 1474 1441 1475 1442 static bool snapshot_id_dying(struct snapshot_delete *d, unsigned id) ··· 1724 1695 static inline u32 bch2_snapshot_nth_parent_skip(struct bch_fs *c, u32 id, u32 n, 1725 1696 interior_delete_list *skip) 1726 1697 { 1727 - rcu_read_lock(); 1698 + guard(rcu)(); 1728 1699 while (interior_delete_has_id(skip, id)) 1729 1700 id = __bch2_snapshot_parent(c, id); 1730 1701 ··· 1733 1704 id = __bch2_snapshot_parent(c, id); 1734 1705 } while (interior_delete_has_id(skip, id)); 1735 1706 } 1736 - rcu_read_unlock(); 1737 1707 1738 1708 return id; 1739 1709 } ··· 1898 1870 d->running = false; 1899 1871 mutex_unlock(&d->progress_lock); 1900 1872 bch2_trans_put(trans); 1873 + 1874 + bch2_recovery_pass_set_no_ratelimit(c, BCH_RECOVERY_PASS_check_snapshots); 1901 1875 out_unlock: 1902 1876 mutex_unlock(&d->lock); 1903 1877 if (!bch2_err_matches(ret, EROFS)) ··· 1935 1905 1936 1906 BUG_ON(!test_bit(BCH_FS_may_go_rw, &c->flags)); 1937 1907 1938 - if (!queue_work(c->write_ref_wq, &c->snapshot_delete.work)) 1908 + if (!queue_work(system_long_wq, &c->snapshot_delete.work)) 1939 1909 enumerated_ref_put(&c->writes, BCH_WRITE_REF_delete_dead_snapshots); 1940 1910 } 1941 1911
fs/bcachefs/snapshot.h | +37 -48
··· 46 46 47 47 static inline u32 bch2_snapshot_tree(struct bch_fs *c, u32 id) 48 48 { 49 - rcu_read_lock(); 49 + guard(rcu)(); 50 50 const struct snapshot_t *s = snapshot_t(c, id); 51 - id = s ? s->tree : 0; 52 - rcu_read_unlock(); 53 - 54 - return id; 51 + return s ? s->tree : 0; 55 52 } 56 53 57 54 static inline u32 __bch2_snapshot_parent_early(struct bch_fs *c, u32 id) ··· 59 62 60 63 static inline u32 bch2_snapshot_parent_early(struct bch_fs *c, u32 id) 61 64 { 62 - rcu_read_lock(); 63 - id = __bch2_snapshot_parent_early(c, id); 64 - rcu_read_unlock(); 65 - 66 - return id; 65 + guard(rcu)(); 66 + return __bch2_snapshot_parent_early(c, id); 67 67 } 68 68 69 69 static inline u32 __bch2_snapshot_parent(struct bch_fs *c, u32 id) ··· 82 88 83 89 static inline u32 bch2_snapshot_parent(struct bch_fs *c, u32 id) 84 90 { 85 - rcu_read_lock(); 86 - id = __bch2_snapshot_parent(c, id); 87 - rcu_read_unlock(); 88 - 89 - return id; 91 + guard(rcu)(); 92 + return __bch2_snapshot_parent(c, id); 90 93 } 91 94 92 95 static inline u32 bch2_snapshot_nth_parent(struct bch_fs *c, u32 id, u32 n) 93 96 { 94 - rcu_read_lock(); 97 + guard(rcu)(); 95 98 while (n--) 96 99 id = __bch2_snapshot_parent(c, id); 97 - rcu_read_unlock(); 98 - 99 100 return id; 100 101 } 101 102 ··· 99 110 100 111 static inline u32 bch2_snapshot_root(struct bch_fs *c, u32 id) 101 112 { 102 - u32 parent; 113 + guard(rcu)(); 103 114 104 - rcu_read_lock(); 115 + u32 parent; 105 116 while ((parent = __bch2_snapshot_parent(c, id))) 106 117 id = parent; 107 - rcu_read_unlock(); 108 - 109 118 return id; 110 119 } 111 120 ··· 115 128 116 129 static inline enum snapshot_id_state bch2_snapshot_id_state(struct bch_fs *c, u32 id) 117 130 { 118 - rcu_read_lock(); 119 - enum snapshot_id_state ret = __bch2_snapshot_id_state(c, id); 120 - rcu_read_unlock(); 121 - 122 - return ret; 131 + guard(rcu)(); 132 + return __bch2_snapshot_id_state(c, id); 123 133 } 124 134 125 135 static inline bool bch2_snapshot_exists(struct bch_fs *c, u32 id) ··· 126 142 127 143 static inline int bch2_snapshot_is_internal_node(struct bch_fs *c, u32 id) 128 144 { 129 - rcu_read_lock(); 145 + guard(rcu)(); 130 146 const struct snapshot_t *s = snapshot_t(c, id); 131 - int ret = s ? s->children[0] : -BCH_ERR_invalid_snapshot_node; 132 - rcu_read_unlock(); 133 - 134 - return ret; 147 + return s ? s->children[0] : -BCH_ERR_invalid_snapshot_node; 135 148 } 136 149 137 150 static inline int bch2_snapshot_is_leaf(struct bch_fs *c, u32 id) ··· 141 160 142 161 static inline u32 bch2_snapshot_depth(struct bch_fs *c, u32 parent) 143 162 { 144 - u32 depth; 145 - 146 - rcu_read_lock(); 147 - depth = parent ? snapshot_t(c, parent)->depth + 1 : 0; 148 - rcu_read_unlock(); 149 - 150 - return depth; 163 + guard(rcu)(); 164 + return parent ? 
snapshot_t(c, parent)->depth + 1 : 0; 151 165 } 152 166 153 167 bool __bch2_snapshot_is_ancestor(struct bch_fs *, u32, u32); ··· 156 180 157 181 static inline bool bch2_snapshot_has_children(struct bch_fs *c, u32 id) 158 182 { 159 - rcu_read_lock(); 183 + guard(rcu)(); 160 184 const struct snapshot_t *t = snapshot_t(c, id); 161 - bool ret = t && (t->children[0]|t->children[1]) != 0; 162 - rcu_read_unlock(); 163 - 164 - return ret; 185 + return t && (t->children[0]|t->children[1]) != 0; 165 186 } 166 187 167 188 static inline bool snapshot_list_has_id(snapshot_id_list *s, u32 id) 168 189 { 169 - darray_for_each(*s, i) 170 - if (*i == id) 171 - return true; 172 - return false; 190 + return darray_find(*s, id) != NULL; 173 191 } 174 192 175 193 static inline bool snapshot_list_has_ancestor(struct bch_fs *c, snapshot_id_list *s, u32 id) ··· 226 256 return likely(bch2_snapshot_exists(trans->c, k.k->p.snapshot)) 227 257 ? 0 228 258 : __bch2_check_key_has_snapshot(trans, iter, k); 259 + } 260 + 261 + int __bch2_get_snapshot_overwrites(struct btree_trans *, 262 + enum btree_id, struct bpos, 263 + snapshot_id_list *); 264 + 265 + /* 266 + * Get a list of snapshot IDs that have overwritten a given key: 267 + */ 268 + static inline int bch2_get_snapshot_overwrites(struct btree_trans *trans, 269 + enum btree_id btree, struct bpos pos, 270 + snapshot_id_list *s) 271 + { 272 + darray_init(s); 273 + 274 + return bch2_snapshot_has_children(trans->c, pos.snapshot) 275 + ? __bch2_get_snapshot_overwrites(trans, btree, pos, s) 276 + : 0; 277 + 229 278 } 230 279 231 280 int bch2_snapshot_node_set_deleted(struct btree_trans *, u32);
fs/bcachefs/str_hash.c | +158 -87
··· 31 31 } 32 32 } 33 33 34 - static noinline int fsck_rename_dirent(struct btree_trans *trans, 35 - struct snapshots_seen *s, 36 - const struct bch_hash_desc desc, 37 - struct bch_hash_info *hash_info, 38 - struct bkey_s_c_dirent old) 34 + static int bch2_fsck_rename_dirent(struct btree_trans *trans, 35 + struct snapshots_seen *s, 36 + const struct bch_hash_desc desc, 37 + struct bch_hash_info *hash_info, 38 + struct bkey_s_c_dirent old, 39 + bool *updated_before_k_pos) 39 40 { 40 41 struct qstr old_name = bch2_dirent_get_name(old); 41 - struct bkey_i_dirent *new = bch2_trans_kmalloc(trans, bkey_bytes(old.k) + 32); 42 + struct bkey_i_dirent *new = bch2_trans_kmalloc(trans, BKEY_U64s_MAX * sizeof(u64)); 42 43 int ret = PTR_ERR_OR_ZERO(new); 43 44 if (ret) 44 45 return ret; ··· 48 47 dirent_copy_target(new, old); 49 48 new->k.p = old.k->p; 50 49 50 + char *renamed_buf = bch2_trans_kmalloc(trans, old_name.len + 20); 51 + ret = PTR_ERR_OR_ZERO(renamed_buf); 52 + if (ret) 53 + return ret; 54 + 51 55 for (unsigned i = 0; i < 1000; i++) { 52 - unsigned len = sprintf(new->v.d_name, "%.*s.fsck_renamed-%u", 53 - old_name.len, old_name.name, i); 54 - unsigned u64s = BKEY_U64s + dirent_val_u64s(len, 0); 56 + new->k.u64s = BKEY_U64s_MAX; 55 57 56 - if (u64s > U8_MAX) 57 - return -EINVAL; 58 + struct qstr renamed_name = (struct qstr) QSTR_INIT(renamed_buf, 59 + sprintf(renamed_buf, "%.*s.fsck_renamed-%u", 60 + old_name.len, old_name.name, i)); 58 61 59 - new->k.u64s = u64s; 62 + ret = bch2_dirent_init_name(new, hash_info, &renamed_name, NULL); 63 + if (ret) 64 + return ret; 60 65 61 66 ret = bch2_hash_set_in_snapshot(trans, bch2_dirent_hash_desc, hash_info, 62 67 (subvol_inum) { 0, old.k->p.inode }, 63 68 old.k->p.snapshot, &new->k_i, 64 - BTREE_UPDATE_internal_snapshot_node); 65 - if (!bch2_err_matches(ret, EEXIST)) 69 + BTREE_UPDATE_internal_snapshot_node| 70 + STR_HASH_must_create); 71 + if (ret && !bch2_err_matches(ret, EEXIST)) 66 72 break; 73 + if (!ret) { 74 + if (bpos_lt(new->k.p, old.k->p)) 75 + *updated_before_k_pos = true; 76 + break; 77 + } 67 78 } 68 79 69 - if (ret) 70 - return ret; 71 - 72 - return bch2_fsck_update_backpointers(trans, s, desc, hash_info, &new->k_i); 80 + ret = ret ?: bch2_fsck_update_backpointers(trans, s, desc, hash_info, &new->k_i); 81 + bch_err_fn(trans->c, ret); 82 + return ret; 73 83 } 74 84 75 85 static noinline int hash_pick_winner(struct btree_trans *trans, ··· 198 186 #endif 199 187 bch2_print_str(c, KERN_ERR, buf.buf); 200 188 printbuf_exit(&buf); 201 - ret = -BCH_ERR_fsck_repair_unimplemented; 189 + ret = bch_err_throw(c, fsck_repair_unimplemented); 202 190 goto err; 203 191 } 204 192 ··· 233 221 return ret; 234 222 } 235 223 224 + /* Put a str_hash key in its proper location, checking for duplicates */ 225 + int bch2_str_hash_repair_key(struct btree_trans *trans, 226 + struct snapshots_seen *s, 227 + const struct bch_hash_desc *desc, 228 + struct bch_hash_info *hash_info, 229 + struct btree_iter *k_iter, struct bkey_s_c k, 230 + struct btree_iter *dup_iter, struct bkey_s_c dup_k, 231 + bool *updated_before_k_pos) 232 + { 233 + struct bch_fs *c = trans->c; 234 + struct printbuf buf = PRINTBUF; 235 + bool free_snapshots_seen = false; 236 + int ret = 0; 237 + 238 + if (!s) { 239 + s = bch2_trans_kmalloc(trans, sizeof(*s)); 240 + ret = PTR_ERR_OR_ZERO(s); 241 + if (ret) 242 + goto out; 243 + 244 + s->pos = k_iter->pos; 245 + darray_init(&s->ids); 246 + 247 + ret = bch2_get_snapshot_overwrites(trans, desc->btree_id, k_iter->pos, &s->ids); 248 + if (ret) 249 + 
goto out; 250 + 251 + free_snapshots_seen = true; 252 + } 253 + 254 + if (!dup_k.k) { 255 + struct bkey_i *new = bch2_bkey_make_mut_noupdate(trans, k); 256 + ret = PTR_ERR_OR_ZERO(new); 257 + if (ret) 258 + goto out; 259 + 260 + dup_k = bch2_hash_set_or_get_in_snapshot(trans, dup_iter, *desc, hash_info, 261 + (subvol_inum) { 0, new->k.p.inode }, 262 + new->k.p.snapshot, new, 263 + STR_HASH_must_create| 264 + BTREE_ITER_with_updates| 265 + BTREE_UPDATE_internal_snapshot_node); 266 + ret = bkey_err(dup_k); 267 + if (ret) 268 + goto out; 269 + if (dup_k.k) 270 + goto duplicate_entries; 271 + 272 + if (bpos_lt(new->k.p, k.k->p)) 273 + *updated_before_k_pos = true; 274 + 275 + ret = bch2_insert_snapshot_whiteouts(trans, desc->btree_id, 276 + k_iter->pos, new->k.p) ?: 277 + bch2_hash_delete_at(trans, *desc, hash_info, k_iter, 278 + BTREE_ITER_with_updates| 279 + BTREE_UPDATE_internal_snapshot_node) ?: 280 + bch2_fsck_update_backpointers(trans, s, *desc, hash_info, new) ?: 281 + bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc) ?: 282 + -BCH_ERR_transaction_restart_commit; 283 + } else { 284 + duplicate_entries: 285 + ret = hash_pick_winner(trans, *desc, hash_info, k, dup_k); 286 + if (ret < 0) 287 + goto out; 288 + 289 + if (!fsck_err(trans, hash_table_key_duplicate, 290 + "duplicate hash table keys%s:\n%s", 291 + ret != 2 ? "" : ", both point to valid inodes", 292 + (printbuf_reset(&buf), 293 + bch2_bkey_val_to_text(&buf, c, k), 294 + prt_newline(&buf), 295 + bch2_bkey_val_to_text(&buf, c, dup_k), 296 + buf.buf))) 297 + goto out; 298 + 299 + switch (ret) { 300 + case 0: 301 + ret = bch2_hash_delete_at(trans, *desc, hash_info, k_iter, 0); 302 + break; 303 + case 1: 304 + ret = bch2_hash_delete_at(trans, *desc, hash_info, dup_iter, 0); 305 + break; 306 + case 2: 307 + ret = bch2_fsck_rename_dirent(trans, s, *desc, hash_info, 308 + bkey_s_c_to_dirent(k), 309 + updated_before_k_pos) ?: 310 + bch2_hash_delete_at(trans, *desc, hash_info, k_iter, 311 + BTREE_ITER_with_updates); 312 + goto out; 313 + } 314 + 315 + ret = bch2_trans_commit(trans, NULL, NULL, 0) ?: 316 + -BCH_ERR_transaction_restart_commit; 317 + } 318 + out: 319 + fsck_err: 320 + bch2_trans_iter_exit(trans, dup_iter); 321 + printbuf_exit(&buf); 322 + if (free_snapshots_seen) 323 + darray_exit(&s->ids); 324 + return ret; 325 + } 326 + 236 327 int __bch2_str_hash_check_key(struct btree_trans *trans, 237 328 struct snapshots_seen *s, 238 329 const struct bch_hash_desc *desc, 239 330 struct bch_hash_info *hash_info, 240 - struct btree_iter *k_iter, struct bkey_s_c hash_k) 331 + struct btree_iter *k_iter, struct bkey_s_c hash_k, 332 + bool *updated_before_k_pos) 241 333 { 242 334 struct bch_fs *c = trans->c; 243 335 struct btree_iter iter = {}; ··· 355 239 356 240 for_each_btree_key_norestart(trans, iter, desc->btree_id, 357 241 SPOS(hash_k.k->p.inode, hash, hash_k.k->p.snapshot), 358 - BTREE_ITER_slots, k, ret) { 242 + BTREE_ITER_slots| 243 + BTREE_ITER_with_updates, k, ret) { 359 244 if (bkey_eq(k.k->p, hash_k.k->p)) 360 245 break; 361 246 362 247 if (k.k->type == desc->key_type && 363 - !desc->cmp_bkey(k, hash_k)) 364 - goto duplicate_entries; 365 - 366 - if (bkey_deleted(k.k)) { 367 - bch2_trans_iter_exit(trans, &iter); 368 - goto bad_hash; 248 + !desc->cmp_bkey(k, hash_k)) { 249 + ret = check_inode_hash_info_matches_root(trans, hash_k.k->p.inode, 250 + hash_info) ?: 251 + bch2_str_hash_repair_key(trans, s, desc, hash_info, 252 + k_iter, hash_k, 253 + &iter, k, updated_before_k_pos); 254 + break; 369 255 } 256 + 257 + if 
(bkey_deleted(k.k)) 258 + goto bad_hash; 370 259 } 371 - out: 372 260 bch2_trans_iter_exit(trans, &iter); 261 + out: 262 + fsck_err: 373 263 printbuf_exit(&buf); 374 264 return ret; 375 265 bad_hash: 266 + bch2_trans_iter_exit(trans, &iter); 376 267 /* 377 268 * Before doing any repair, check hash_info itself: 378 269 */ ··· 388 265 goto out; 389 266 390 267 if (fsck_err(trans, hash_table_key_wrong_offset, 391 - "hash table key at wrong offset: btree %s inode %llu offset %llu, hashed to %llu\n%s", 392 - bch2_btree_id_str(desc->btree_id), hash_k.k->p.inode, hash_k.k->p.offset, hash, 393 - (printbuf_reset(&buf), 394 - bch2_bkey_val_to_text(&buf, c, hash_k), buf.buf))) { 395 - struct bkey_i *new = bch2_bkey_make_mut_noupdate(trans, hash_k); 396 - if (IS_ERR(new)) 397 - return PTR_ERR(new); 398 - 399 - k = bch2_hash_set_or_get_in_snapshot(trans, &iter, *desc, hash_info, 400 - (subvol_inum) { 0, hash_k.k->p.inode }, 401 - hash_k.k->p.snapshot, new, 402 - STR_HASH_must_create| 403 - BTREE_ITER_with_updates| 404 - BTREE_UPDATE_internal_snapshot_node); 405 - ret = bkey_err(k); 406 - if (ret) 407 - goto out; 408 - if (k.k) 409 - goto duplicate_entries; 410 - 411 - ret = bch2_hash_delete_at(trans, *desc, hash_info, k_iter, 412 - BTREE_UPDATE_internal_snapshot_node) ?: 413 - bch2_fsck_update_backpointers(trans, s, *desc, hash_info, new) ?: 414 - bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc) ?: 415 - -BCH_ERR_transaction_restart_nested; 416 - goto out; 417 - } 418 - fsck_err: 419 - goto out; 420 - duplicate_entries: 421 - ret = hash_pick_winner(trans, *desc, hash_info, hash_k, k); 422 - if (ret < 0) 423 - goto out; 424 - 425 - if (!fsck_err(trans, hash_table_key_duplicate, 426 - "duplicate hash table keys%s:\n%s", 427 - ret != 2 ? "" : ", both point to valid inodes", 428 - (printbuf_reset(&buf), 429 - bch2_bkey_val_to_text(&buf, c, hash_k), 430 - prt_newline(&buf), 431 - bch2_bkey_val_to_text(&buf, c, k), 432 - buf.buf))) 433 - goto out; 434 - 435 - switch (ret) { 436 - case 0: 437 - ret = bch2_hash_delete_at(trans, *desc, hash_info, k_iter, 0); 438 - break; 439 - case 1: 440 - ret = bch2_hash_delete_at(trans, *desc, hash_info, &iter, 0); 441 - break; 442 - case 2: 443 - ret = fsck_rename_dirent(trans, s, *desc, hash_info, bkey_s_c_to_dirent(hash_k)) ?: 444 - bch2_hash_delete_at(trans, *desc, hash_info, k_iter, 0); 445 - goto out; 446 - } 447 - 448 - ret = bch2_trans_commit(trans, NULL, NULL, 0) ?: 449 - -BCH_ERR_transaction_restart_nested; 268 + "hash table key at wrong offset: should be at %llu\n%s", 269 + hash, 270 + (bch2_bkey_val_to_text(&buf, c, hash_k), buf.buf))) 271 + ret = bch2_str_hash_repair_key(trans, s, desc, hash_info, 272 + k_iter, hash_k, 273 + &iter, bkey_s_c_null, 274 + updated_before_k_pos); 450 275 goto out; 451 276 }
fs/bcachefs/str_hash.h | +18 -6
···
261 261 			 struct bkey_i *insert,
262 262 			 enum btree_iter_update_trigger_flags flags)
263 263 {
264 + 	struct bch_fs *c = trans->c;
264 265 	struct btree_iter slot = {};
265 266 	struct bkey_s_c k;
266 267 	bool found = false;
···
289 288 	}
290 289
291 290 	if (!ret)
292 - 		ret = -BCH_ERR_ENOSPC_str_hash_create;
291 + 		ret = bch_err_throw(c, ENOSPC_str_hash_create);
293 292 out:
294 293 	bch2_trans_iter_exit(trans, &slot);
295 294 	bch2_trans_iter_exit(trans, iter);
···
301 300 		bch2_trans_iter_exit(trans, &slot);
302 301 		return k;
303 302 	} else if (!found && (flags & STR_HASH_must_replace)) {
304 - 		ret = -BCH_ERR_ENOENT_str_hash_set_must_replace;
303 + 		ret = bch_err_throw(c, ENOENT_str_hash_set_must_replace);
305 304 	} else {
306 305 		if (!found && slot.path)
307 306 			swap(*iter, slot);
···
329 328 		return ret;
330 329 	if (k.k) {
331 330 		bch2_trans_iter_exit(trans, &iter);
332 - 		return -BCH_ERR_EEXIST_str_hash_set;
331 + 		return bch_err_throw(trans->c, EEXIST_str_hash_set);
333 332 	}
334 333
335 334 	return 0;
···
398 397 int bch2_repair_inode_hash_info(struct btree_trans *, struct bch_inode_unpacked *);
399 398
400 399 struct snapshots_seen;
400 + int bch2_str_hash_repair_key(struct btree_trans *,
401 + 		struct snapshots_seen *,
402 + 		const struct bch_hash_desc *,
403 + 		struct bch_hash_info *,
404 + 		struct btree_iter *, struct bkey_s_c,
405 + 		struct btree_iter *, struct bkey_s_c,
406 + 		bool *);
407 +
401 408 int __bch2_str_hash_check_key(struct btree_trans *,
402 409 		struct snapshots_seen *,
403 410 		const struct bch_hash_desc *,
404 411 		struct bch_hash_info *,
405 - 		struct btree_iter *, struct bkey_s_c);
412 + 		struct btree_iter *, struct bkey_s_c,
413 + 		bool *);
406 414
407 415 static inline int bch2_str_hash_check_key(struct btree_trans *trans,
408 416 		struct snapshots_seen *s,
409 417 		const struct bch_hash_desc *desc,
410 418 		struct bch_hash_info *hash_info,
411 419 		struct btree_iter *k_iter, struct bkey_s_c hash_k)
412 -
420 + 		bool *updated_before_k_pos)
413 421 {
414 422 	if (hash_k.k->type != desc->key_type)
415 423 		return 0;
···
426 415 	if (likely(desc->hash_bkey(hash_info, hash_k) == hash_k.k->p.offset))
427 416 		return 0;
428 417
429 - 	return __bch2_str_hash_check_key(trans, s, desc, hash_info, k_iter, hash_k);
418 + 	return __bch2_str_hash_check_key(trans, s, desc, hash_info, k_iter, hash_k,
419 + 			updated_before_k_pos);
430 420 }
431 421
432 422 #endif /* _BCACHEFS_STR_HASH_H */
fs/bcachefs/subvolume.c | +31 -14
··· 130 130 "subvolume %llu points to missing subvolume root %llu:%u", 131 131 k.k->p.offset, le64_to_cpu(subvol.v->inode), 132 132 le32_to_cpu(subvol.v->snapshot))) { 133 - ret = bch2_subvolume_delete(trans, iter->pos.offset); 134 - bch_err_msg(c, ret, "deleting subvolume %llu", iter->pos.offset); 135 - ret = ret ?: -BCH_ERR_transaction_restart_nested; 136 - goto err; 133 + /* 134 + * Recreate - any contents that are still disconnected 135 + * will then get reattached under lost+found 136 + */ 137 + bch2_inode_init_early(c, &inode); 138 + bch2_inode_init_late(c, &inode, bch2_current_time(c), 139 + 0, 0, S_IFDIR|0700, 0, NULL); 140 + inode.bi_inum = le64_to_cpu(subvol.v->inode); 141 + inode.bi_snapshot = le32_to_cpu(subvol.v->snapshot); 142 + inode.bi_subvol = k.k->p.offset; 143 + inode.bi_parent_subvol = le32_to_cpu(subvol.v->fs_path_parent); 144 + ret = __bch2_fsck_write_inode(trans, &inode); 145 + if (ret) 146 + goto err; 137 147 } 138 148 } else { 139 149 goto err; ··· 151 141 152 142 if (!BCH_SUBVOLUME_SNAP(subvol.v)) { 153 143 u32 snapshot_root = bch2_snapshot_root(c, le32_to_cpu(subvol.v->snapshot)); 154 - u32 snapshot_tree; 144 + u32 snapshot_tree = bch2_snapshot_tree(c, snapshot_root); 145 + 155 146 struct bch_snapshot_tree st; 156 - 157 - rcu_read_lock(); 158 - snapshot_tree = snapshot_t(c, snapshot_root)->tree; 159 - rcu_read_unlock(); 160 - 161 147 ret = bch2_snapshot_tree_lookup(trans, snapshot_tree, &st); 162 148 163 149 bch2_fs_inconsistent_on(bch2_err_matches(ret, ENOENT), c, ··· 265 259 prt_printf(out, " creation_parent %u", le32_to_cpu(s.v->creation_parent)); 266 260 prt_printf(out, " fs_parent %u", le32_to_cpu(s.v->fs_path_parent)); 267 261 } 262 + 263 + if (BCH_SUBVOLUME_RO(s.v)) 264 + prt_printf(out, " ro"); 265 + if (BCH_SUBVOLUME_SNAP(s.v)) 266 + prt_printf(out, " snapshot"); 267 + if (BCH_SUBVOLUME_UNLINKED(s.v)) 268 + prt_printf(out, " unlinked"); 268 269 } 269 270 270 271 static int subvolume_children_mod(struct btree_trans *trans, struct bpos pos, bool set) ··· 499 486 500 487 static int bch2_subvolume_delete(struct btree_trans *trans, u32 subvolid) 501 488 { 502 - return bch2_subvolumes_reparent(trans, subvolid) ?: 489 + int ret = bch2_subvolumes_reparent(trans, subvolid) ?: 503 490 commit_do(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc, 504 491 __bch2_subvolume_delete(trans, subvolid)); 492 + 493 + bch2_recovery_pass_set_no_ratelimit(trans->c, BCH_RECOVERY_PASS_check_subvols); 494 + return ret; 505 495 } 506 496 507 497 static void bch2_subvolume_wait_for_pagecache_and_delete(struct work_struct *work) ··· 613 597 ret = bch2_bkey_get_empty_slot(trans, &dst_iter, 614 598 BTREE_ID_subvolumes, POS(0, U32_MAX)); 615 599 if (ret == -BCH_ERR_ENOSPC_btree_slot) 616 - ret = -BCH_ERR_ENOSPC_subvolume_create; 600 + ret = bch_err_throw(c, ENOSPC_subvolume_create); 617 601 if (ret) 618 602 return ret; 619 603 ··· 719 703 return ret; 720 704 721 705 if (!bkey_is_inode(k.k)) { 722 - bch_err(trans->c, "root inode not found"); 723 - ret = -BCH_ERR_ENOENT_inode; 706 + struct bch_fs *c = trans->c; 707 + bch_err(c, "root inode not found"); 708 + ret = bch_err_throw(c, ENOENT_inode); 724 709 goto err; 725 710 } 726 711
fs/bcachefs/super-io.c | +4 -4
···
1112 1112 		prt_str(&buf, ")");
1113 1113 		bch2_fs_fatal_error(c, ": %s", buf.buf);
1114 1114 		printbuf_exit(&buf);
1115 - 		ret = -BCH_ERR_sb_not_downgraded;
1115 + 		ret = bch_err_throw(c, sb_not_downgraded);
1116 1116 		goto out;
1117 1117 	}
1118 1118
···
1142 1142
1143 1143 		if (c->opts.errors != BCH_ON_ERROR_continue &&
1144 1144 		    c->opts.errors != BCH_ON_ERROR_fix_safe) {
1145 - 			ret = -BCH_ERR_erofs_sb_err;
1145 + 			ret = bch_err_throw(c, erofs_sb_err);
1146 1146 			bch2_fs_fatal_error(c, "%s", buf.buf);
1147 1147 		} else {
1148 1148 			bch_err(c, "%s", buf.buf);
···
1161 1161 				ca->disk_sb.seq);
1162 1162 			bch2_fs_fatal_error(c, "%s", buf.buf);
1163 1163 			printbuf_exit(&buf);
1164 - 			ret = -BCH_ERR_erofs_sb_err;
1164 + 			ret = bch_err_throw(c, erofs_sb_err);
1165 1165 		}
1166 1166 	}
1167 1167
···
1215 1215 			!can_mount_with_written), c,
1216 1216 		": Unable to write superblock to sufficient devices (from %ps)",
1217 1217 		(void *) _RET_IP_))
1218 - 		ret = -BCH_ERR_erofs_sb_err;
1218 + 		ret = bch_err_throw(c, erofs_sb_err);
1219 1219 out:
1220 1220 	/* Make new options visible after they're persistent: */
1221 1221 	bch2_sb_update(c);
fs/bcachefs/super.c | +50 -56
··· 219 219 220 220 struct bch_fs *bch2_dev_to_fs(dev_t dev) 221 221 { 222 + guard(mutex)(&bch_fs_list_lock); 223 + guard(rcu)(); 224 + 222 225 struct bch_fs *c; 223 - 224 - mutex_lock(&bch_fs_list_lock); 225 - rcu_read_lock(); 226 - 227 226 list_for_each_entry(c, &bch_fs_list, list) 228 227 for_each_member_device_rcu(c, ca, NULL) 229 228 if (ca->disk_sb.bdev && ca->disk_sb.bdev->bd_dev == dev) { 230 229 closure_get(&c->cl); 231 - goto found; 230 + return c; 232 231 } 233 - c = NULL; 234 - found: 235 - rcu_read_unlock(); 236 - mutex_unlock(&bch_fs_list_lock); 237 - 238 - return c; 232 + return NULL; 239 233 } 240 234 241 235 static struct bch_fs *__bch2_uuid_to_fs(__uuid_t uuid) ··· 474 480 BUG_ON(!test_bit(BCH_FS_may_go_rw, &c->flags)); 475 481 476 482 if (WARN_ON(c->sb.features & BIT_ULL(BCH_FEATURE_no_alloc_info))) 477 - return -BCH_ERR_erofs_no_alloc_info; 483 + return bch_err_throw(c, erofs_no_alloc_info); 478 484 479 485 if (test_bit(BCH_FS_initial_gc_unfixed, &c->flags)) { 480 486 bch_err(c, "cannot go rw, unfixed btree errors"); 481 - return -BCH_ERR_erofs_unfixed_errors; 487 + return bch_err_throw(c, erofs_unfixed_errors); 482 488 } 483 489 484 490 if (c->sb.features & BIT_ULL(BCH_FEATURE_small_image)) { 485 491 bch_err(c, "cannot go rw, filesystem is an unresized image file"); 486 - return -BCH_ERR_erofs_filesystem_full; 492 + return bch_err_throw(c, erofs_filesystem_full); 487 493 } 488 494 489 495 if (test_bit(BCH_FS_rw, &c->flags)) ··· 501 507 502 508 clear_bit(BCH_FS_clean_shutdown, &c->flags); 503 509 504 - rcu_read_lock(); 505 - for_each_online_member_rcu(c, ca) 506 - if (ca->mi.state == BCH_MEMBER_STATE_rw) { 507 - bch2_dev_allocator_add(c, ca); 508 - enumerated_ref_start(&ca->io_ref[WRITE]); 509 - } 510 - rcu_read_unlock(); 510 + scoped_guard(rcu) 511 + for_each_online_member_rcu(c, ca) 512 + if (ca->mi.state == BCH_MEMBER_STATE_rw) { 513 + bch2_dev_allocator_add(c, ca); 514 + enumerated_ref_start(&ca->io_ref[WRITE]); 515 + } 511 516 512 517 bch2_recalc_capacity(c); 513 518 ··· 564 571 { 565 572 if (c->opts.recovery_pass_last && 566 573 c->opts.recovery_pass_last < BCH_RECOVERY_PASS_journal_replay) 567 - return -BCH_ERR_erofs_norecovery; 574 + return bch_err_throw(c, erofs_norecovery); 568 575 569 576 if (c->opts.nochanges) 570 - return -BCH_ERR_erofs_nochanges; 577 + return bch_err_throw(c, erofs_nochanges); 571 578 572 579 if (c->sb.features & BIT_ULL(BCH_FEATURE_no_alloc_info)) 573 - return -BCH_ERR_erofs_no_alloc_info; 580 + return bch_err_throw(c, erofs_no_alloc_info); 574 581 575 582 return __bch2_fs_read_write(c, false); 576 583 } ··· 755 762 if (c->sb.multi_device && 756 763 __bch2_uuid_to_fs(c->sb.uuid)) { 757 764 bch_err(c, "filesystem UUID already open"); 758 - return -BCH_ERR_filesystem_uuid_already_open; 765 + return bch_err_throw(c, filesystem_uuid_already_open); 759 766 } 760 767 761 768 ret = bch2_fs_chardev_init(c); ··· 814 821 WQ_HIGHPRI|WQ_FREEZABLE|WQ_MEM_RECLAIM, 1)) || 815 822 !(c->write_ref_wq = alloc_workqueue("bcachefs_write_ref", 816 823 WQ_FREEZABLE, 0))) 817 - return -BCH_ERR_ENOMEM_fs_other_alloc; 824 + return bch_err_throw(c, ENOMEM_fs_other_alloc); 818 825 819 826 int ret = bch2_fs_btree_interior_update_init(c) ?: 820 827 bch2_fs_btree_write_buffer_init(c) ?: ··· 995 1002 mempool_init_kvmalloc_pool(&c->btree_bounce_pool, 1, 996 1003 c->opts.btree_node_size) || 997 1004 mempool_init_kmalloc_pool(&c->large_bkey_pool, 1, 2048)) { 998 - ret = -BCH_ERR_ENOMEM_fs_other_alloc; 1005 + ret = bch_err_throw(c, ENOMEM_fs_other_alloc); 999 1006 goto err; 
1000 1007 } 1001 1008 ··· 1031 1038 ret = -EINVAL; 1032 1039 goto err; 1033 1040 } 1034 - bch_info(c, "Using encoding defined by superblock: utf8-%u.%u.%u", 1035 - unicode_major(BCH_FS_DEFAULT_UTF8_ENCODING), 1036 - unicode_minor(BCH_FS_DEFAULT_UTF8_ENCODING), 1037 - unicode_rev(BCH_FS_DEFAULT_UTF8_ENCODING)); 1038 1041 #else 1039 1042 if (c->sb.features & BIT_ULL(BCH_FEATURE_casefolding)) { 1040 1043 printk(KERN_ERR "Cannot mount a filesystem with casefolding on a kernel without CONFIG_UNICODE\n"); ··· 1148 1159 1149 1160 print_mount_opts(c); 1150 1161 1162 + #ifdef CONFIG_UNICODE 1163 + bch_info(c, "Using encoding defined by superblock: utf8-%u.%u.%u", 1164 + unicode_major(BCH_FS_DEFAULT_UTF8_ENCODING), 1165 + unicode_minor(BCH_FS_DEFAULT_UTF8_ENCODING), 1166 + unicode_rev(BCH_FS_DEFAULT_UTF8_ENCODING)); 1167 + #endif 1168 + 1151 1169 if (!bch2_fs_may_start(c)) 1152 - return -BCH_ERR_insufficient_devices_to_start; 1170 + return bch_err_throw(c, insufficient_devices_to_start); 1153 1171 1154 1172 down_write(&c->state_lock); 1155 1173 mutex_lock(&c->sb_lock); ··· 1167 1171 sizeof(struct bch_sb_field_ext) / sizeof(u64))) { 1168 1172 mutex_unlock(&c->sb_lock); 1169 1173 up_write(&c->state_lock); 1170 - ret = -BCH_ERR_ENOSPC_sb; 1174 + ret = bch_err_throw(c, ENOSPC_sb); 1171 1175 goto err; 1172 1176 } 1173 1177 ··· 1178 1182 goto err; 1179 1183 } 1180 1184 1181 - rcu_read_lock(); 1182 - for_each_online_member_rcu(c, ca) 1183 - bch2_members_v2_get_mut(c->disk_sb.sb, ca->dev_idx)->last_mount = 1184 - cpu_to_le64(now); 1185 - rcu_read_unlock(); 1185 + scoped_guard(rcu) 1186 + for_each_online_member_rcu(c, ca) 1187 + bch2_members_v2_get_mut(c->disk_sb.sb, ca->dev_idx)->last_mount = 1188 + cpu_to_le64(now); 1186 1189 1187 1190 /* 1188 1191 * Dno't write superblock yet: recovery might have to downgrade 1189 1192 */ 1190 1193 mutex_unlock(&c->sb_lock); 1191 1194 1192 - rcu_read_lock(); 1193 - for_each_online_member_rcu(c, ca) 1194 - if (ca->mi.state == BCH_MEMBER_STATE_rw) 1195 - bch2_dev_allocator_add(c, ca); 1196 - rcu_read_unlock(); 1195 + scoped_guard(rcu) 1196 + for_each_online_member_rcu(c, ca) 1197 + if (ca->mi.state == BCH_MEMBER_STATE_rw) 1198 + bch2_dev_allocator_add(c, ca); 1197 1199 bch2_recalc_capacity(c); 1198 1200 up_write(&c->state_lock); 1199 1201 ··· 1209 1215 goto err; 1210 1216 1211 1217 if (bch2_fs_init_fault("fs_start")) { 1212 - ret = -BCH_ERR_injected_fs_start; 1218 + ret = bch_err_throw(c, injected_fs_start); 1213 1219 goto err; 1214 1220 } 1215 1221 ··· 1236 1242 struct bch_member m = bch2_sb_member_get(sb, sb->dev_idx); 1237 1243 1238 1244 if (le16_to_cpu(sb->block_size) != block_sectors(c)) 1239 - return -BCH_ERR_mismatched_block_size; 1245 + return bch_err_throw(c, mismatched_block_size); 1240 1246 1241 1247 if (le16_to_cpu(m.bucket_size) < 1242 1248 BCH_SB_BTREE_NODE_SIZE(c->disk_sb.sb)) 1243 - return -BCH_ERR_bucket_size_too_small; 1249 + return bch_err_throw(c, bucket_size_too_small); 1244 1250 1245 1251 return 0; 1246 1252 } ··· 1551 1557 bch2_dev_attach(c, ca, dev_idx); 1552 1558 return 0; 1553 1559 err: 1554 - return -BCH_ERR_ENOMEM_dev_alloc; 1560 + return bch_err_throw(c, ENOMEM_dev_alloc); 1555 1561 } 1556 1562 1557 1563 static int __bch2_dev_attach_bdev(struct bch_dev *ca, struct bch_sb_handle *sb) ··· 1561 1567 if (bch2_dev_is_online(ca)) { 1562 1568 bch_err(ca, "already have device online in slot %u", 1563 1569 sb->sb->dev_idx); 1564 - return -BCH_ERR_device_already_online; 1570 + return bch_err_throw(ca->fs, device_already_online); 1565 1571 } 1566 1572 1567 
1573 if (get_capacity(sb->bdev->bd_disk) < 1568 1574 ca->mi.bucket_size * ca->mi.nbuckets) { 1569 1575 bch_err(ca, "cannot online: device too small"); 1570 - return -BCH_ERR_device_size_too_small; 1576 + return bch_err_throw(ca->fs, device_size_too_small); 1571 1577 } 1572 1578 1573 1579 BUG_ON(!enumerated_ref_is_zero(&ca->io_ref[READ])); ··· 1719 1725 return 0; 1720 1726 1721 1727 if (!bch2_dev_state_allowed(c, ca, new_state, flags)) 1722 - return -BCH_ERR_device_state_not_allowed; 1728 + return bch_err_throw(c, device_state_not_allowed); 1723 1729 1724 1730 if (new_state != BCH_MEMBER_STATE_rw) 1725 1731 __bch2_dev_read_only(c, ca); ··· 1772 1778 1773 1779 if (!bch2_dev_state_allowed(c, ca, BCH_MEMBER_STATE_failed, flags)) { 1774 1780 bch_err(ca, "Cannot remove without losing data"); 1775 - ret = -BCH_ERR_device_state_not_allowed; 1781 + ret = bch_err_throw(c, device_state_not_allowed); 1776 1782 goto err; 1777 1783 } 1778 1784 ··· 1908 1914 if (list_empty(&c->list)) { 1909 1915 mutex_lock(&bch_fs_list_lock); 1910 1916 if (__bch2_uuid_to_fs(c->sb.uuid)) 1911 - ret = -BCH_ERR_filesystem_uuid_already_open; 1917 + ret = bch_err_throw(c, filesystem_uuid_already_open); 1912 1918 else 1913 1919 list_add(&c->list, &bch_fs_list); 1914 1920 mutex_unlock(&bch_fs_list_lock); ··· 2095 2101 if (!bch2_dev_state_allowed(c, ca, BCH_MEMBER_STATE_failed, flags)) { 2096 2102 bch_err(ca, "Cannot offline required disk"); 2097 2103 up_write(&c->state_lock); 2098 - return -BCH_ERR_device_state_not_allowed; 2104 + return bch_err_throw(c, device_state_not_allowed); 2099 2105 } 2100 2106 2101 2107 __bch2_dev_offline(c, ca); ··· 2134 2140 if (nbuckets > BCH_MEMBER_NBUCKETS_MAX) { 2135 2141 bch_err(ca, "New device size too big (%llu greater than max %u)", 2136 2142 nbuckets, BCH_MEMBER_NBUCKETS_MAX); 2137 - ret = -BCH_ERR_device_size_too_big; 2143 + ret = bch_err_throw(c, device_size_too_big); 2138 2144 goto err; 2139 2145 } 2140 2146 ··· 2142 2148 get_capacity(ca->disk_sb.bdev->bd_disk) < 2143 2149 ca->mi.bucket_size * nbuckets) { 2144 2150 bch_err(ca, "New size larger than device"); 2145 - ret = -BCH_ERR_device_size_too_small; 2151 + ret = bch_err_throw(c, device_size_too_small); 2146 2152 goto err; 2147 2153 } 2148 2154 ··· 2377 2383 } 2378 2384 2379 2385 if (opts->nochanges && !opts->read_only) { 2380 - ret = -BCH_ERR_erofs_nochanges; 2386 + ret = bch_err_throw(c, erofs_nochanges); 2381 2387 goto err_print; 2382 2388 } 2383 2389
fs/bcachefs/sysfs.c | +24
··· 26 26 #include "disk_groups.h" 27 27 #include "ec.h" 28 28 #include "enumerated_ref.h" 29 + #include "error.h" 29 30 #include "inode.h" 30 31 #include "journal.h" 31 32 #include "journal_reclaim.h" ··· 38 37 #include "rebalance.h" 39 38 #include "recovery_passes.h" 40 39 #include "replicas.h" 40 + #include "sb-errors.h" 41 41 #include "super-io.h" 42 42 #include "tests.h" 43 43 ··· 145 143 write_attribute(trigger_gc); 146 144 write_attribute(trigger_discards); 147 145 write_attribute(trigger_invalidates); 146 + write_attribute(trigger_journal_commit); 148 147 write_attribute(trigger_journal_flush); 149 148 write_attribute(trigger_journal_writes); 150 149 write_attribute(trigger_btree_cache_shrink); ··· 154 151 write_attribute(trigger_freelist_wakeup); 155 152 write_attribute(trigger_recalc_capacity); 156 153 write_attribute(trigger_delete_dead_snapshots); 154 + write_attribute(trigger_emergency_read_only); 157 155 read_attribute(gc_gens_pos); 158 156 159 157 read_attribute(uuid); ··· 176 172 177 173 read_attribute(btree_cache_size); 178 174 read_attribute(compression_stats); 175 + read_attribute(errors); 179 176 read_attribute(journal_debug); 180 177 read_attribute(btree_cache); 181 178 read_attribute(btree_key_cache); ··· 358 353 if (attr == &sysfs_compression_stats) 359 354 bch2_compression_stats_to_text(out, c); 360 355 356 + if (attr == &sysfs_errors) 357 + bch2_fs_errors_to_text(out, c); 358 + 361 359 if (attr == &sysfs_new_stripes) 362 360 bch2_new_stripes_to_text(out, c); 363 361 ··· 436 428 if (attr == &sysfs_trigger_invalidates) 437 429 bch2_do_invalidates(c); 438 430 431 + if (attr == &sysfs_trigger_journal_commit) 432 + bch2_journal_flush(&c->journal); 433 + 439 434 if (attr == &sysfs_trigger_journal_flush) { 440 435 bch2_journal_flush_all_pins(&c->journal); 441 436 bch2_journal_meta(&c->journal); ··· 458 447 459 448 if (attr == &sysfs_trigger_delete_dead_snapshots) 460 449 __bch2_delete_dead_snapshots(c); 450 + 451 + if (attr == &sysfs_trigger_emergency_read_only) { 452 + struct printbuf buf = PRINTBUF; 453 + bch2_log_msg_start(c, &buf); 454 + 455 + prt_printf(&buf, "shutdown by sysfs\n"); 456 + bch2_fs_emergency_read_only2(c, &buf); 457 + bch2_print_str(c, KERN_ERR, buf.buf); 458 + printbuf_exit(&buf); 459 + } 461 460 462 461 #ifdef CONFIG_BCACHEFS_TESTS 463 462 if (attr == &sysfs_perf_test) { ··· 504 483 &sysfs_recovery_status, 505 484 506 485 &sysfs_compression_stats, 486 + &sysfs_errors, 507 487 508 488 #ifdef CONFIG_BCACHEFS_TESTS 509 489 &sysfs_perf_test, ··· 593 571 &sysfs_trigger_gc, 594 572 &sysfs_trigger_discards, 595 573 &sysfs_trigger_invalidates, 574 + &sysfs_trigger_journal_commit, 596 575 &sysfs_trigger_journal_flush, 597 576 &sysfs_trigger_journal_writes, 598 577 &sysfs_trigger_btree_cache_shrink, ··· 602 579 &sysfs_trigger_freelist_wakeup, 603 580 &sysfs_trigger_recalc_capacity, 604 581 &sysfs_trigger_delete_dead_snapshots, 582 + &sysfs_trigger_emergency_read_only, 605 583 606 584 &sysfs_gc_gens_pos, 607 585
fs/bcachefs/trace.h | +52 -17
··· 199 199 (unsigned long long)__entry->sector, __entry->nr_sector) 200 200 ); 201 201 202 + /* errors */ 203 + 204 + TRACE_EVENT(error_throw, 205 + TP_PROTO(struct bch_fs *c, int bch_err, unsigned long ip), 206 + TP_ARGS(c, bch_err, ip), 207 + 208 + TP_STRUCT__entry( 209 + __field(dev_t, dev ) 210 + __field(int, err ) 211 + __array(char, err_str, 32 ) 212 + __array(char, ip, 32 ) 213 + ), 214 + 215 + TP_fast_assign( 216 + __entry->dev = c->dev; 217 + __entry->err = bch_err; 218 + strscpy(__entry->err_str, bch2_err_str(bch_err), sizeof(__entry->err_str)); 219 + snprintf(__entry->ip, sizeof(__entry->ip), "%ps", (void *) ip); 220 + ), 221 + 222 + TP_printk("%d,%d %s ret %s", MAJOR(__entry->dev), MINOR(__entry->dev), 223 + __entry->ip, __entry->err_str) 224 + ); 225 + 226 + TRACE_EVENT(error_downcast, 227 + TP_PROTO(int bch_err, int std_err, unsigned long ip), 228 + TP_ARGS(bch_err, std_err, ip), 229 + 230 + TP_STRUCT__entry( 231 + __array(char, bch_err, 32 ) 232 + __array(char, std_err, 32 ) 233 + __array(char, ip, 32 ) 234 + ), 235 + 236 + TP_fast_assign( 237 + strscpy(__entry->bch_err, bch2_err_str(bch_err), sizeof(__entry->bch_err)); 238 + strscpy(__entry->std_err, bch2_err_str(std_err), sizeof(__entry->std_err)); 239 + snprintf(__entry->ip, sizeof(__entry->ip), "%ps", (void *) ip); 240 + ), 241 + 242 + TP_printk("%s ret %s -> %s %s", __entry->ip, 243 + __entry->bch_err, __entry->std_err, __entry->ip) 244 + ); 245 + 202 246 /* disk_accounting.c */ 203 247 204 248 TRACE_EVENT(accounting_mem_insert, ··· 1475 1431 TP_ARGS(c, str) 1476 1432 ); 1477 1433 1434 + DEFINE_EVENT(fs_str, io_move_pred, 1435 + TP_PROTO(struct bch_fs *c, const char *str), 1436 + TP_ARGS(c, str) 1437 + ); 1438 + 1478 1439 DEFINE_EVENT(fs_str, io_move_created_rebalance, 1479 1440 TP_PROTO(struct bch_fs *c, const char *str), 1480 1441 TP_ARGS(c, str) 1481 1442 ); 1482 1443 1483 - TRACE_EVENT(error_downcast, 1484 - TP_PROTO(int bch_err, int std_err, unsigned long ip), 1485 - TP_ARGS(bch_err, std_err, ip), 1486 - 1487 - TP_STRUCT__entry( 1488 - __array(char, bch_err, 32 ) 1489 - __array(char, std_err, 32 ) 1490 - __array(char, ip, 32 ) 1491 - ), 1492 - 1493 - TP_fast_assign( 1494 - strscpy(__entry->bch_err, bch2_err_str(bch_err), sizeof(__entry->bch_err)); 1495 - strscpy(__entry->std_err, bch2_err_str(std_err), sizeof(__entry->std_err)); 1496 - snprintf(__entry->ip, sizeof(__entry->ip), "%ps", (void *) ip); 1497 - ), 1498 - 1499 - TP_printk("%s -> %s %s", __entry->bch_err, __entry->std_err, __entry->ip) 1444 + DEFINE_EVENT(fs_str, io_move_evacuate_bucket, 1445 + TP_PROTO(struct bch_fs *c, const char *str), 1446 + TP_ARGS(c, str) 1500 1447 ); 1501 1448 1502 1449 #ifdef CONFIG_BCACHEFS_PATH_TRACEPOINTS