Merge tag 'for-6.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

+73 -7

Documentation/admin-guide/device-mapper/dm-raid.rst

··· 20 20 raid0 RAID0 striping (no resilience) 21 21 raid1 RAID1 mirroring 22 22 raid4 RAID4 with dedicated last parity disk 23 - raid5_n RAID5 with dedicated last parity disk supporting takeover 23 + raid5_n RAID5 with dedicated last parity disk supporting takeover from/to raid1 24 24 Same as raid4 25 25 26 - - Transitory layout 26 + - Transitory layout for takeover from/to raid1 27 27 raid5_la RAID5 left asymmetric 28 28 29 29 - rotating parity 0 with data continuation ··· 48 48 raid6_n_6 RAID6 with dedicate parity disks 49 49 50 50 - parity and Q-syndrome on the last 2 disks; 51 - layout for takeover from/to raid4/raid5_n 52 - raid6_la_6 Same as "raid_la" plus dedicated last Q-syndrome disk 51 + layout for takeover from/to raid0/raid4/raid5_n 52 + raid6_la_6 Same as "raid_la" plus dedicated last Q-syndrome disk supporting takeover from/to raid5 53 53 54 54 - layout for takeover from raid5_la from/to raid6 55 55 raid6_ra_6 Same as "raid5_ra" dedicated last Q-syndrome disk ··· 173 173 The delta_disks option value (-251 < N < +251) triggers 174 174 device removal (negative value) or device addition (positive 175 175 value) to any reshape supporting raid levels 4/5/6 and 10. 176 - RAID levels 4/5/6 allow for addition of devices (metadata 177 - and data device tuple), raid10_near and raid10_offset only 178 - allow for device addition. raid10_far does not support any 176 + RAID levels 4/5/6 allow for addition and removal of devices 177 + (metadata and data device tuple), raid10_near and raid10_offset 178 + only allow for device addition. raid10_far does not support any 179 179 reshaping at all. 180 180 A minimum of devices have to be kept to enforce resilience, 181 181 which is 3 devices for raid4/5 and 4 devices for raid6. 
··· 370 370 to safely enable discard support for RAID 4/5/6: 371 371 372 372 'devices_handle_discards_safely' 373 + 374 + 375 + Takeover/Reshape Support 376 + ------------------------ 377 + The target natively supports these two types of MDRAID conversions: 378 + 379 + o Takeover: Converts an array from one RAID level to another 380 + 381 + o Reshape: Changes the internal layout while maintaining the current RAID level 382 + 383 + Each operation is only valid under specific constraints imposed by the existing array's layout and configuration. 384 + 385 + 386 + Takeover: 387 + linear -> raid1 with N >= 2 mirrors 388 + raid0 -> raid4 (add dedicated parity device) 389 + raid0 -> raid5 (add dedicated parity device) 390 + raid0 -> raid10 with near layout and N >= 2 mirror groups (raid0 stripes have to become first member within mirror groups) 391 + raid1 -> linear 392 + raid1 -> raid5 with 2 mirrors 393 + raid4 -> raid5 w/ rotating parity 394 + raid5 with dedicated parity device -> raid4 395 + raid5 -> raid6 (with dedicated Q-syndrome) 396 + raid6 (with dedicated Q-syndrome) -> raid5 397 + raid10 with near layout and even number of disks -> raid0 (select any in-sync device from each mirror group) 398 + 399 + Reshape: 400 + linear: not possible 401 + raid0: not possible 402 + raid1: change number of mirrors 403 + raid4: add and remove stripes (minimum 3), change stripesize 404 + raid5: add and remove stripes (minimum 3, special case 2 for raid1 takeover), change rotating parity algorithms, change stripesize 405 + raid6: add and remove stripes (minimum 4), change rotating syndrome algorithms, change stripesize 406 + raid10 near: add stripes (minimum 4), change stripesize, no stripe removal possible, change to offset layout 407 + raid10 offset: add stripes, change stripesize, no stripe removal possible, change to near layout 408 + raid10 far: not possible 409 + 410 + Table line examples: 411 + 412 + ### raid1 -> raid5 413 + # 414 + # 2 devices limitation in raid1. 
415 + # raid5 personality is able to just map 2 like raid1. 416 + # Reshape after takeover to change to full raid5 layout 417 + 418 + 0 1960886272 raid raid1 3 0 region_size 2048 2 /dev/dm-0 /dev/dm-1 /dev/dm-2 /dev/dm-3 419 + 420 + # dm-0 and dm-2 are e.g. 4MiB large metadata devices, dm-1 and dm-3 have to be at least 1960886272 big. 421 + # 422 + # Table line to takeover to raid5 423 + 424 + 0 1960886272 raid raid5 3 0 region_size 2048 2 /dev/dm-0 /dev/dm-1 /dev/dm-2 /dev/dm-3 425 + 426 + # Add required out-of-place reshape space to the beginniong of the given 2 data devices, 427 + # allocate another metadata/data device tuple with the same sizes for the parity space 428 + # and zero the first 4K of the metadata device. 429 + # 430 + # Example table of the out-of-place reshape space addition for one data device, e.g. dm-1 431 + 432 + 0 8192 linear 8:0 0 1960903888 # <- must be free space segment 433 + 8192 1960886272 linear 8:0 0 2048 # previous data segment 434 + 435 + # Mapping table for e.g. raid5_rs reshape causing the size of the raid device to double-fold once the reshape finishes. 436 + # Check the status output (e.g. "dmsetup status $RaidDev") for progess. 437 + 438 + 0 $((2 * 1960886272)) raid raid5 7 0 region_size 2048 data_offset 8192 delta_disk 1 2 /dev/dm-0 /dev/dm-1 /dev/dm-2 /dev/dm-3 373 439 374 440 375 441 Version History
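Reviewer note: the "Takeover:" list added to dm-raid.rst reads naturally as a lookup table. The sketch below is a userspace illustration only, assuming the level pairs exactly as listed in the new documentation; the names `takeovers` and `takeover_listed` are hypothetical, and the per-pair constraints (mirror counts, dedicated parity or Q-syndrome layouts, even disk counts) are deliberately left out.

```c
#include <stdbool.h>
#include <string.h>

struct takeover {
	const char *from;
	const char *to;
};

/* Level pairs copied from the "Takeover:" list; constraints are omitted. */
static const struct takeover takeovers[] = {
	{ "linear", "raid1" },
	{ "raid0", "raid4" },
	{ "raid0", "raid5" },
	{ "raid0", "raid10" },
	{ "raid1", "linear" },
	{ "raid1", "raid5" },
	{ "raid4", "raid5" },
	{ "raid5", "raid4" },
	{ "raid5", "raid6" },
	{ "raid6", "raid5" },
	{ "raid10", "raid0" },
};

/* Returns true if the (from, to) pair appears in the list above. */
static bool takeover_listed(const char *from, const char *to)
{
	size_t i;

	for (i = 0; i < sizeof(takeovers) / sizeof(takeovers[0]); i++)
		if (!strcmp(from, takeovers[i].from) &&
		    !strcmp(to, takeovers[i].to))
			return true;
	return false;
}
```

A pair missing from this table only means it is not in the documented list; it says nothing about what the target code itself accepts.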

+4 -2

Documentation/admin-guide/device-mapper/verity.rst

@@ -236,8 +236,10 @@
 
 Status
 ======
-V (for Valid) is returned if every check performed so far was valid.
-If any check failed, C (for Corruption) is returned.
+1. V (for Valid) is returned if every check performed so far was valid.
+   If any check failed, C (for Corruption) is returned.
+2. Number of corrected blocks by Forward Error Correction.
+   '-' if Forward Error Correction is not enabled.
 
 Example
 =======
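Reviewer note: with the second status field, consumers of the verity status need to handle both a number and '-'. The sketch below is a userspace illustration that assumes the two fields arrive space-separated (for example "V 12" or "C -"); the struct and function names are hypothetical and nothing here is taken from the kernel or from dmsetup.

```c
#include <stdbool.h>
#include <stdio.h>

struct verity_status {
	bool valid;			/* 'V' (valid) vs 'C' (corruption) */
	bool fec_enabled;		/* false when the second field is "-" */
	unsigned long long corrected;	/* meaningful only if fec_enabled */
};

/* Parse the two target-specific fields. Returns 0 on success, -1 on error. */
static int parse_verity_status(const char *s, struct verity_status *out)
{
	char state;
	char field[32];

	if (sscanf(s, " %c %31s", &state, field) != 2)
		return -1;
	if (state != 'V' && state != 'C')
		return -1;

	out->valid = (state == 'V');
	out->fec_enabled = !(field[0] == '-' && field[1] == '\0');
	out->corrected = 0;
	if (out->fec_enabled && sscanf(field, "%llu", &out->corrected) != 1)
		return -1;
	return 0;
}
```

Keeping "FEC disabled" separate from "zero blocks corrected" mirrors the distinction the documentation draws between '-' and a count.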

+1

MAINTAINERS

@@ -7225,6 +7225,7 @@
 M:	Alasdair Kergon <agk@redhat.com>
 M:	Mike Snitzer <snitzer@kernel.org>
 M:	Mikulas Patocka <mpatocka@redhat.com>
+M:	Benjamin Marzinski <bmarzins@redhat.com>
 L:	dm-devel@lists.linux.dev
 S:	Maintained
 Q:	http://patchwork.kernel.org/project/dm-devel/list/

+2

drivers/md/Kconfig

@@ -299,6 +299,7 @@
 	select CRYPTO
 	select CRYPTO_CBC
 	select CRYPTO_ESSIV
+	select CRYPTO_LIB_MD5 # needed by lmk IV mode
 	help
 	  This device-mapper target allows you to create a device that
 	  transparently encrypts the data on it. You'll need to activate
@@ -547,6 +546,7 @@
 	depends on BLK_DEV_DM
 	select CRYPTO
 	select CRYPTO_HASH
+	select CRYPTO_LIB_SHA256
 	select DM_BUFIO
 	help
 	  This device-mapper target creates a read-only device that

+6 -4

drivers/md/dm-bufio.c

@@ -1374,7 +1374,7 @@
 {
 	unsigned int n_sectors;
 	sector_t sector;
-	unsigned int offset, end;
+	unsigned int offset, end, align;
 
 	b->end_io = end_io;
 
@@ -1388,9 +1388,11 @@
 			b->c->write_callback(b);
 		offset = b->write_start;
 		end = b->write_end;
-		offset &= -DM_BUFIO_WRITE_ALIGN;
-		end += DM_BUFIO_WRITE_ALIGN - 1;
-		end &= -DM_BUFIO_WRITE_ALIGN;
+		align = max(DM_BUFIO_WRITE_ALIGN,
+			    bdev_physical_block_size(b->c->bdev));
+		offset &= -align;
+		end += align - 1;
+		end &= -align;
 		if (unlikely(end > b->c->block_size))
 			end = b->c->block_size;
 
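Reviewer note: the dm-bufio hunk widens a partial write to the larger of the fixed write alignment and the device's physical block size. The userspace sketch below mirrors that arithmetic. `WRITE_ALIGN` is an assumed value (the hunk shows only the macro name), and `widen_write_range` and `struct byte_range` are hypothetical names. Both alignments must be powers of two for the `-align` mask to round correctly.

```c
/* Assumed value for illustration; the hunk only shows the macro name. */
#define WRITE_ALIGN 4096u

struct byte_range {
	unsigned int start;
	unsigned int end;
};

/*
 * Widen [start, end) to multiples of max(WRITE_ALIGN, phys_block_size),
 * then clamp the end to the buffer size.
 */
static struct byte_range widen_write_range(unsigned int start,
					   unsigned int end,
					   unsigned int phys_block_size,
					   unsigned int buffer_size)
{
	unsigned int align = phys_block_size > WRITE_ALIGN ?
			     phys_block_size : WRITE_ALIGN;
	struct byte_range r;

	r.start = start & -align;		/* round down to a multiple of align */
	r.end = (end + align - 1) & -align;	/* round up to a multiple of align */
	if (r.end > buffer_size)
		r.end = buffer_size;
	return r;
}
```

With a 512-byte physical block the range snaps to 4096-byte boundaries as before; with an 8192-byte physical block the same dirty range grows to 8192-byte boundaries, so the device is never asked to write a fraction of its physical block.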

-1

drivers/md/dm-core.h

@@ -139,7 +139,6 @@
 	struct srcu_struct io_barrier;
 
 #ifdef CONFIG_BLK_DEV_ZONED
-	unsigned int nr_zones;
 	void *zone_revalidate_map;
 	struct task_struct *revalidate_map_task;
 #endif

+46 -71

drivers/md/dm-crypt.c

··· 21 21 #include <linux/mempool.h> 22 22 #include <linux/slab.h> 23 23 #include <linux/crypto.h> 24 + #include <linux/fips.h> 24 25 #include <linux/workqueue.h> 25 26 #include <linux/kthread.h> 26 27 #include <linux/backing-dev.h> ··· 121 120 122 121 #define LMK_SEED_SIZE 64 /* hash + 0 */ 123 122 struct iv_lmk_private { 124 - struct crypto_shash *hash_tfm; 125 123 u8 *seed; 126 124 }; 127 125 ··· 254 254 module_param(max_write_size, uint, 0644); 255 255 MODULE_PARM_DESC(max_write_size, "Maximum size of a write request"); 256 256 257 - static unsigned get_max_request_sectors(struct dm_target *ti, struct bio *bio) 257 + static unsigned get_max_request_sectors(struct dm_target *ti, struct bio *bio, bool no_split) 258 258 { 259 259 struct crypt_config *cc = ti->private; 260 260 unsigned val, sector_align; 261 261 bool wrt = op_is_write(bio_op(bio)); 262 262 263 - if (wrt) { 264 - /* 265 - * For zoned devices, splitting write operations creates the 266 - * risk of deadlocking queue freeze operations with zone write 267 - * plugging BIO work when the reminder of a split BIO is 268 - * issued. So always allow the entire BIO to proceed. 
269 - */ 270 - if (ti->emulate_zone_append) 271 - return bio_sectors(bio); 272 - 263 + if (no_split) { 264 + val = -1; 265 + } else if (wrt) { 273 266 val = min_not_zero(READ_ONCE(max_write_size), 274 267 DM_CRYPT_DEFAULT_MAX_WRITE_SIZE); 275 268 } else { ··· 458 465 { 459 466 struct iv_lmk_private *lmk = &cc->iv_gen_private.lmk; 460 467 461 - if (lmk->hash_tfm && !IS_ERR(lmk->hash_tfm)) 462 - crypto_free_shash(lmk->hash_tfm); 463 - lmk->hash_tfm = NULL; 464 - 465 468 kfree_sensitive(lmk->seed); 466 469 lmk->seed = NULL; 467 470 } ··· 472 483 return -EINVAL; 473 484 } 474 485 475 - lmk->hash_tfm = crypto_alloc_shash("md5", 0, 476 - CRYPTO_ALG_ALLOCATES_MEMORY); 477 - if (IS_ERR(lmk->hash_tfm)) { 478 - ti->error = "Error initializing LMK hash"; 479 - return PTR_ERR(lmk->hash_tfm); 486 + if (fips_enabled) { 487 + ti->error = "LMK support is disabled due to FIPS"; 488 + /* ... because it uses MD5. */ 489 + return -EINVAL; 480 490 } 481 491 482 492 /* No seed in LMK version 2 */ ··· 486 498 487 499 lmk->seed = kzalloc(LMK_SEED_SIZE, GFP_KERNEL); 488 500 if (!lmk->seed) { 489 - crypt_iv_lmk_dtr(cc); 490 501 ti->error = "Error kmallocing seed storage in LMK"; 491 502 return -ENOMEM; 492 503 } ··· 501 514 /* LMK seed is on the position of LMK_KEYS + 1 key */ 502 515 if (lmk->seed) 503 516 memcpy(lmk->seed, cc->key + (cc->tfms_count * subkey_size), 504 - crypto_shash_digestsize(lmk->hash_tfm)); 517 + MD5_DIGEST_SIZE); 505 518 506 519 return 0; 507 520 } ··· 516 529 return 0; 517 530 } 518 531 519 - static int crypt_iv_lmk_one(struct crypt_config *cc, u8 *iv, 520 - struct dm_crypt_request *dmreq, 521 - u8 *data) 532 + static void crypt_iv_lmk_one(struct crypt_config *cc, u8 *iv, 533 + struct dm_crypt_request *dmreq, u8 *data) 522 534 { 523 535 struct iv_lmk_private *lmk = &cc->iv_gen_private.lmk; 524 - SHASH_DESC_ON_STACK(desc, lmk->hash_tfm); 525 - union { 526 - struct md5_state md5state; 527 - u8 state[CRYPTO_MD5_STATESIZE]; 528 - } u; 536 + struct md5_ctx ctx; 529 537 
__le32 buf[4]; 530 - int i, r; 531 538 532 - desc->tfm = lmk->hash_tfm; 539 + md5_init(&ctx); 533 540 534 - r = crypto_shash_init(desc); 535 - if (r) 536 - return r; 537 - 538 - if (lmk->seed) { 539 - r = crypto_shash_update(desc, lmk->seed, LMK_SEED_SIZE); 540 - if (r) 541 - return r; 542 - } 541 + if (lmk->seed) 542 + md5_update(&ctx, lmk->seed, LMK_SEED_SIZE); 543 543 544 544 /* Sector is always 512B, block size 16, add data of blocks 1-31 */ 545 - r = crypto_shash_update(desc, data + 16, 16 * 31); 546 - if (r) 547 - return r; 545 + md5_update(&ctx, data + 16, 16 * 31); 548 546 549 547 /* Sector is cropped to 56 bits here */ 550 548 buf[0] = cpu_to_le32(dmreq->iv_sector & 0xFFFFFFFF); 551 549 buf[1] = cpu_to_le32((((u64)dmreq->iv_sector >> 32) & 0x00FFFFFF) | 0x80000000); 552 550 buf[2] = cpu_to_le32(4024); 553 551 buf[3] = 0; 554 - r = crypto_shash_update(desc, (u8 *)buf, sizeof(buf)); 555 - if (r) 556 - return r; 552 + md5_update(&ctx, (u8 *)buf, sizeof(buf)); 557 553 558 554 /* No MD5 padding here */ 559 - r = crypto_shash_export(desc, &u.md5state); 560 - if (r) 561 - return r; 562 - 563 - for (i = 0; i < MD5_HASH_WORDS; i++) 564 - __cpu_to_le32s(&u.md5state.hash[i]); 565 - memcpy(iv, &u.md5state.hash, cc->iv_size); 566 - 567 - return 0; 555 + cpu_to_le32_array(ctx.state.h, ARRAY_SIZE(ctx.state.h)); 556 + memcpy(iv, ctx.state.h, cc->iv_size); 568 557 } 569 558 570 559 static int crypt_iv_lmk_gen(struct crypt_config *cc, u8 *iv, ··· 548 585 { 549 586 struct scatterlist *sg; 550 587 u8 *src; 551 - int r = 0; 552 588 553 589 if (bio_data_dir(dmreq->ctx->bio_in) == WRITE) { 554 590 sg = crypt_get_sg_data(cc, dmreq->sg_in); 555 591 src = kmap_local_page(sg_page(sg)); 556 - r = crypt_iv_lmk_one(cc, iv, dmreq, src + sg->offset); 592 + crypt_iv_lmk_one(cc, iv, dmreq, src + sg->offset); 557 593 kunmap_local(src); 558 594 } else 559 595 memset(iv, 0, cc->iv_size); 560 - 561 - return r; 596 + return 0; 562 597 } 563 598 564 599 static int crypt_iv_lmk_post(struct 
crypt_config *cc, u8 *iv, ··· 564 603 { 565 604 struct scatterlist *sg; 566 605 u8 *dst; 567 - int r; 568 606 569 607 if (bio_data_dir(dmreq->ctx->bio_in) == WRITE) 570 608 return 0; 571 609 572 610 sg = crypt_get_sg_data(cc, dmreq->sg_out); 573 611 dst = kmap_local_page(sg_page(sg)); 574 - r = crypt_iv_lmk_one(cc, iv, dmreq, dst + sg->offset); 612 + crypt_iv_lmk_one(cc, iv, dmreq, dst + sg->offset); 575 613 576 614 /* Tweak the first block of plaintext sector */ 577 - if (!r) 578 - crypto_xor(dst + sg->offset, iv, cc->iv_size); 615 + crypto_xor(dst + sg->offset, iv, cc->iv_size); 579 616 580 617 kunmap_local(dst); 581 - return r; 618 + return 0; 582 619 } 583 620 584 621 static void crypt_iv_tcw_dtr(struct crypt_config *cc) ··· 1740 1781 bio_for_each_folio_all(fi, clone) { 1741 1782 if (folio_test_large(fi.folio)) { 1742 1783 percpu_counter_sub(&cc->n_allocated_pages, 1743 - 1 << folio_order(fi.folio)); 1784 + folio_nr_pages(fi.folio)); 1744 1785 folio_put(fi.folio); 1745 1786 } else { 1746 1787 mempool_free(&fi.folio->page, &cc->page_pool); ··· 3455 3496 struct dm_crypt_io *io; 3456 3497 struct crypt_config *cc = ti->private; 3457 3498 unsigned max_sectors; 3499 + bool no_split; 3458 3500 3459 3501 /* 3460 3502 * If bio is REQ_PREFLUSH or REQ_OP_DISCARD, just bypass crypt queues. ··· 3473 3513 3474 3514 /* 3475 3515 * Check if bio is too large, split as needed. 3516 + * 3517 + * For zoned devices, splitting write operations creates the 3518 + * risk of deadlocking queue freeze operations with zone write 3519 + * plugging BIO work when the reminder of a split BIO is 3520 + * issued. So always allow the entire BIO to proceed. 
3476 3521 */ 3477 - max_sectors = get_max_request_sectors(ti, bio); 3478 - if (unlikely(bio_sectors(bio) > max_sectors)) 3522 + no_split = (ti->emulate_zone_append && op_is_write(bio_op(bio))) || 3523 + (bio->bi_opf & REQ_ATOMIC); 3524 + max_sectors = get_max_request_sectors(ti, bio, no_split); 3525 + if (unlikely(bio_sectors(bio) > max_sectors)) { 3526 + if (unlikely(no_split)) 3527 + return DM_MAPIO_KILL; 3479 3528 dm_accept_partial_bio(bio, max_sectors); 3529 + } 3480 3530 3481 3531 /* 3482 3532 * Ensure that bio is a multiple of internal sector encryption size ··· 3732 3762 if (ti->emulate_zone_append) 3733 3763 limits->max_hw_sectors = min(limits->max_hw_sectors, 3734 3764 BIO_MAX_VECS << PAGE_SECTORS_SHIFT); 3765 + 3766 + limits->atomic_write_hw_unit_max = min(limits->atomic_write_hw_unit_max, 3767 + BIO_MAX_VECS << PAGE_SHIFT); 3768 + limits->atomic_write_hw_max = min(limits->atomic_write_hw_max, 3769 + BIO_MAX_VECS << PAGE_SHIFT); 3735 3770 } 3736 3771 3737 3772 static struct target_type crypt_target = { 3738 3773 .name = "crypt", 3739 - .version = {1, 28, 0}, 3774 + .version = {1, 29, 0}, 3740 3775 .module = THIS_MODULE, 3741 3776 .ctr = crypt_ctr, 3742 3777 .dtr = crypt_dtr, 3743 - .features = DM_TARGET_ZONED_HM, 3778 + .features = DM_TARGET_ZONED_HM | DM_TARGET_ATOMIC_WRITES, 3744 3779 .report_zones = crypt_report_zones, 3745 3780 .map = crypt_map, 3746 3781 .status = crypt_status,
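Reviewer note: the `crypt_map()` change boils down to a three-way decision: pass the bio whole, accept a partial bio (split), or kill it when it may not be split. The sketch below is a simplified userspace model of that decision. `hard_cap` is a hypothetical stand-in for whatever later clamping `get_max_request_sectors()` applies outside the shown hunk, and the enum and function names are illustrative only.

```c
#include <limits.h>
#include <stdbool.h>

enum map_action { MAP_WHOLE, MAP_SPLIT, MAP_KILL };

static enum map_action classify_bio(unsigned int bio_sectors,
				    unsigned int split_limit,
				    unsigned int hard_cap,
				    bool no_split)
{
	/* "val = -1" on an unsigned value means "no soft split limit". */
	unsigned int max_sectors = no_split ? UINT_MAX : split_limit;

	/* Hypothetical later clamp that applies in either case. */
	if (max_sectors > hard_cap)
		max_sectors = hard_cap;

	if (bio_sectors <= max_sectors)
		return MAP_WHOLE;

	/* Too large: splittable bios are split, the others are failed. */
	return no_split ? MAP_KILL : MAP_SPLIT;
}
```

In the patch, `no_split` is true for writes on targets that emulate zone append and for `REQ_ATOMIC` bios. For these an oversized request is rejected with `DM_MAPIO_KILL` rather than split.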

+1 -1

drivers/md/dm-ebs-target.c

@@ -103,7 +103,7 @@
 		} else {
 			flush_dcache_page(bv->bv_page);
 			memcpy(ba, pa, cur_len);
-			dm_bufio_mark_partial_buffer_dirty(b, buf_off, buf_off + cur_len);
+			dm_bufio_mark_buffer_dirty(b);
 		}
 
 		dm_bufio_release(b);

+1 -1

drivers/md/dm-exception-store.h

@@ -29,7 +29,7 @@
  * chunk within the device.
  */
 struct dm_exception {
-	struct hlist_bl_node hash_list;
+	struct hlist_node hash_list;
 
 	chunk_t old_chunk;
 	chunk_t new_chunk;

+1

drivers/md/dm-log-writes.c

@@ -432,6 +432,7 @@
 	struct log_writes_c *lc = arg;
 	sector_t sector = 0;
 
+	set_freezable();
 	while (!kthread_should_stop()) {
 		bool super = false;
 		bool logging_enabled;

+24 -39

drivers/md/dm-mpath.c

··· 131 131 #define MPATHF_QUEUE_IO 0 /* Must we queue all I/O? */ 132 132 #define MPATHF_QUEUE_IF_NO_PATH 1 /* Queue I/O if last path fails? */ 133 133 #define MPATHF_SAVED_QUEUE_IF_NO_PATH 2 /* Saved state during suspension */ 134 - #define MPATHF_RETAIN_ATTACHED_HW_HANDLER 3 /* If there's already a hw_handler present, don't change it. */ 134 + /* MPATHF_RETAIN_ATTACHED_HW_HANDLER no longer has any effect */ 135 135 #define MPATHF_PG_INIT_DISABLED 4 /* pg_init is not currently allowed */ 136 136 #define MPATHF_PG_INIT_REQUIRED 5 /* pg_init needs calling? */ 137 137 #define MPATHF_PG_INIT_DELAY_RETRY 6 /* Delay pg_init retry? */ ··· 237 237 238 238 static int alloc_multipath_stage2(struct dm_target *ti, struct multipath *m) 239 239 { 240 - if (m->queue_mode == DM_TYPE_NONE) { 240 + if (m->queue_mode == DM_TYPE_NONE) 241 241 m->queue_mode = DM_TYPE_REQUEST_BASED; 242 - } else if (m->queue_mode == DM_TYPE_BIO_BASED) { 242 + else if (m->queue_mode == DM_TYPE_BIO_BASED) 243 243 INIT_WORK(&m->process_queued_bios, process_queued_bios); 244 - /* 245 - * bio-based doesn't support any direct scsi_dh management; 246 - * it just discovers if a scsi_dh is attached. 247 - */ 248 - set_bit(MPATHF_RETAIN_ATTACHED_HW_HANDLER, &m->flags); 249 - } 250 244 251 245 dm_table_set_type(ti->table, m->queue_mode); 252 246 ··· 881 887 struct request_queue *q = bdev_get_queue(bdev); 882 888 int r; 883 889 884 - if (mpath_double_check_test_bit(MPATHF_RETAIN_ATTACHED_HW_HANDLER, m)) { 885 - retain: 886 - if (*attached_handler_name) { 887 - /* 888 - * Clear any hw_handler_params associated with a 889 - * handler that isn't already attached. 890 - */ 891 - if (m->hw_handler_name && strcmp(*attached_handler_name, m->hw_handler_name)) { 892 - kfree(m->hw_handler_params); 893 - m->hw_handler_params = NULL; 894 - } 895 - 896 - /* 897 - * Reset hw_handler_name to match the attached handler 898 - * 899 - * NB. 
This modifies the table line to show the actual 900 - * handler instead of the original table passed in. 901 - */ 902 - kfree(m->hw_handler_name); 903 - m->hw_handler_name = *attached_handler_name; 904 - *attached_handler_name = NULL; 890 + if (*attached_handler_name) { 891 + /* 892 + * Clear any hw_handler_params associated with a 893 + * handler that isn't already attached. 894 + */ 895 + if (m->hw_handler_name && strcmp(*attached_handler_name, 896 + m->hw_handler_name)) { 897 + kfree(m->hw_handler_params); 898 + m->hw_handler_params = NULL; 905 899 } 900 + 901 + /* 902 + * Reset hw_handler_name to match the attached handler 903 + * 904 + * NB. This modifies the table line to show the actual 905 + * handler instead of the original table passed in. 906 + */ 907 + kfree(m->hw_handler_name); 908 + m->hw_handler_name = *attached_handler_name; 909 + *attached_handler_name = NULL; 906 910 } 907 911 908 912 if (m->hw_handler_name) { 909 913 r = scsi_dh_attach(q, m->hw_handler_name); 910 - if (r == -EBUSY) { 911 - DMINFO("retaining handler on device %pg", bdev); 912 - goto retain; 913 - } 914 914 if (r < 0) { 915 915 *error = "error attaching hardware handler"; 916 916 return r; ··· 1126 1138 } 1127 1139 1128 1140 if (!strcasecmp(arg_name, "retain_attached_hw_handler")) { 1129 - set_bit(MPATHF_RETAIN_ATTACHED_HW_HANDLER, &m->flags); 1141 + /* no longer has any effect */ 1130 1142 continue; 1131 1143 } 1132 1144 ··· 1811 1823 DMEMIT("%u ", test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) + 1812 1824 (m->pg_init_retries > 0) * 2 + 1813 1825 (m->pg_init_delay_msecs != DM_PG_INIT_DELAY_DEFAULT) * 2 + 1814 - test_bit(MPATHF_RETAIN_ATTACHED_HW_HANDLER, &m->flags) + 1815 1826 (m->queue_mode != DM_TYPE_REQUEST_BASED) * 2); 1816 1827 1817 1828 if (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags)) ··· 1819 1832 DMEMIT("pg_init_retries %u ", m->pg_init_retries); 1820 1833 if (m->pg_init_delay_msecs != DM_PG_INIT_DELAY_DEFAULT) 1821 1834 DMEMIT("pg_init_delay_msecs %u ", 
m->pg_init_delay_msecs); 1822 - if (test_bit(MPATHF_RETAIN_ATTACHED_HW_HANDLER, &m->flags)) 1823 - DMEMIT("retain_attached_hw_handler "); 1824 1835 if (m->queue_mode != DM_TYPE_REQUEST_BASED) { 1825 1836 switch (m->queue_mode) { 1826 1837 case DM_TYPE_BIO_BASED: ··· 2292 2307 .name = "multipath", 2293 2308 .version = {1, 15, 0}, 2294 2309 .features = DM_TARGET_SINGLETON | DM_TARGET_IMMUTABLE | 2295 - DM_TARGET_PASSES_INTEGRITY, 2310 + DM_TARGET_PASSES_INTEGRITY | DM_TARGET_ATOMIC_WRITES, 2296 2311 .module = THIS_MODULE, 2297 2312 .ctr = multipath_ctr, 2298 2313 .dtr = multipath_dtr,

+8 -5

drivers/md/dm-pcache/cache.c

@@ -10,7 +10,8 @@
 
 static inline struct pcache_cache_info *get_cache_info_addr(struct pcache_cache *cache)
 {
-	return cache->cache_info_addr + cache->info_index;
+	return (struct pcache_cache_info *)((char *)cache->cache_info_addr +
+						(size_t)cache->info_index * PCACHE_CACHE_INFO_SIZE);
 }
 
 static void cache_info_write(struct pcache_cache *cache)
@@ -22,10 +21,10 @@
 	cache_info->header.crc = pcache_meta_crc(&cache_info->header,
 						 sizeof(struct pcache_cache_info));
 
+	cache->info_index = (cache->info_index + 1) % PCACHE_META_INDEX_MAX;
 	memcpy_flushcache(get_cache_info_addr(cache), cache_info,
 			sizeof(struct pcache_cache_info));
-
-	cache->info_index = (cache->info_index + 1) % PCACHE_META_INDEX_MAX;
+	pmem_wmb();
 }
 
 static void cache_info_init_default(struct pcache_cache *cache);
@@ -49,6 +48,8 @@
 			      cache->cache_info.flags & PCACHE_CACHE_FLAGS_DATA_CRC ? "true" : "false");
 		return -EINVAL;
 	}
+
+	cache->info_index = ((char *)cache_info_addr - (char *)cache->cache_info_addr) / PCACHE_CACHE_INFO_SIZE;
 
 	return 0;
 }
@@ -96,10 +93,10 @@
 	pos_onmedia.header.seq = seq;
 	pos_onmedia.header.crc = cache_pos_onmedia_crc(&pos_onmedia);
 
+	*index = (*index + 1) % PCACHE_META_INDEX_MAX;
+
 	memcpy_flushcache(pos_onmedia_addr, &pos_onmedia, sizeof(struct pcache_cache_pos_onmedia));
 	pmem_wmb();
-
-	*index = (*index + 1) % PCACHE_META_INDEX_MAX;
 }
 
 int cache_pos_decode(struct pcache_cache *cache,
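Reviewer note: the pcache hunks hinge on the difference between typed pointer arithmetic, which steps by `sizeof` of the pointed-to struct, and stepping by a fixed on-media slot size. The userspace sketch below shows both directions of the mapping plus the "advance the index, then write" ordering. `SLOT_SIZE`, `SLOT_COUNT`, `struct info` and the function names are all hypothetical, and it presumes (as the fix suggests) that a slot is larger than the struct stored in it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes: each slot is larger than the struct stored in it. */
#define SLOT_SIZE 4096u
#define SLOT_COUNT 2u

struct info {
	uint32_t crc;
	uint64_t seq;
};

/* Address of slot `index`, stepping by SLOT_SIZE, not sizeof(struct info). */
static struct info *slot_addr(void *base, unsigned int index)
{
	return (struct info *)((char *)base + (size_t)index * SLOT_SIZE);
}

/* Inverse mapping: recover the slot index from a slot address. */
static unsigned int slot_index(const void *base, const void *addr)
{
	return (unsigned int)(((const char *)addr - (const char *)base) /
			      SLOT_SIZE);
}

/* Advance first, then write, so the index always names the newest slot. */
static struct info *next_write_slot(void *base, unsigned int *index)
{
	*index = (*index + 1) % SLOT_COUNT;
	return slot_addr(base, *index);
}
```

Advancing the index before the copy keeps the in-memory index pointing at the slot that was written last, which matches what the decode path now reconstructs from the slot address.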

+8 -5

drivers/md/dm-pcache/cache_segment.c

@@ -26,11 +26,11 @@
 	seg_info->header.seq++;
 	seg_info->header.crc = pcache_meta_crc(&seg_info->header, sizeof(struct pcache_segment_info));
 
+	cache_seg->info_index = (cache_seg->info_index + 1) % PCACHE_META_INDEX_MAX;
+
 	seg_info_addr = get_seg_info_addr(cache_seg);
 	memcpy_flushcache(seg_info_addr, seg_info, sizeof(struct pcache_segment_info));
 	pmem_wmb();
-
-	cache_seg->info_index = (cache_seg->info_index + 1) % PCACHE_META_INDEX_MAX;
 	mutex_unlock(&cache_seg->info_lock);
 }
 
@@ -56,7 +56,10 @@
 		ret = -EIO;
 		goto out;
 	}
-	cache_seg->info_index = cache_seg_info_addr - cache_seg_info_addr_base;
+
+	cache_seg->info_index =
+		((char *)cache_seg_info_addr - (char *)cache_seg_info_addr_base) /
+		PCACHE_SEG_INFO_SIZE;
 out:
 	mutex_unlock(&cache_seg->info_lock);
 
@@ -132,10 +129,10 @@
 	cache_seg_gen.header.crc = pcache_meta_crc(&cache_seg_gen.header,
 						   sizeof(struct pcache_cache_seg_gen));
 
+	cache_seg->gen_index = (cache_seg->gen_index + 1) % PCACHE_META_INDEX_MAX;
+
 	memcpy_flushcache(get_cache_seg_gen_addr(cache_seg), &cache_seg_gen, sizeof(struct pcache_cache_seg_gen));
 	pmem_wmb();
-
-	cache_seg->gen_index = (cache_seg->gen_index + 1) % PCACHE_META_INDEX_MAX;
 }
 
 static void cache_seg_ctrl_init(struct pcache_cache_segment *cache_seg)

+2

drivers/md/dm-raid.c

@@ -2287,6 +2287,8 @@
 
 				mddev->reshape_position = le64_to_cpu(sb->reshape_position);
 				rs->raid_type = get_raid_type_by_ll(mddev->level, mddev->layout);
+				if (!rs->raid_type)
+					return -EINVAL;
 			}
 
 		} else {

+34 -39

drivers/md/dm-snap.c

··· 40 40 #define DM_TRACKED_CHUNK_HASH(x) ((unsigned long)(x) & \ 41 41 (DM_TRACKED_CHUNK_HASH_SIZE - 1)) 42 42 43 + struct dm_hlist_head { 44 + struct hlist_head head; 45 + spinlock_t lock; 46 + }; 47 + 43 48 struct dm_exception_table { 44 49 uint32_t hash_mask; 45 50 unsigned int hash_shift; 46 - struct hlist_bl_head *table; 51 + struct dm_hlist_head *table; 47 52 }; 48 53 49 54 struct dm_snapshot { ··· 633 628 634 629 /* Lock to protect access to the completed and pending exception hash tables. */ 635 630 struct dm_exception_table_lock { 636 - struct hlist_bl_head *complete_slot; 637 - struct hlist_bl_head *pending_slot; 631 + spinlock_t *complete_slot; 632 + spinlock_t *pending_slot; 638 633 }; 639 634 640 635 static void dm_exception_table_lock_init(struct dm_snapshot *s, chunk_t chunk, ··· 643 638 struct dm_exception_table *complete = &s->complete; 644 639 struct dm_exception_table *pending = &s->pending; 645 640 646 - lock->complete_slot = &complete->table[exception_hash(complete, chunk)]; 647 - lock->pending_slot = &pending->table[exception_hash(pending, chunk)]; 641 + lock->complete_slot = &complete->table[exception_hash(complete, chunk)].lock; 642 + lock->pending_slot = &pending->table[exception_hash(pending, chunk)].lock; 648 643 } 649 644 650 645 static void dm_exception_table_lock(struct dm_exception_table_lock *lock) 651 646 { 652 - hlist_bl_lock(lock->complete_slot); 653 - hlist_bl_lock(lock->pending_slot); 647 + spin_lock_nested(lock->complete_slot, 1); 648 + spin_lock_nested(lock->pending_slot, 2); 654 649 } 655 650 656 651 static void dm_exception_table_unlock(struct dm_exception_table_lock *lock) 657 652 { 658 - hlist_bl_unlock(lock->pending_slot); 659 - hlist_bl_unlock(lock->complete_slot); 653 + spin_unlock(lock->pending_slot); 654 + spin_unlock(lock->complete_slot); 660 655 } 661 656 662 657 static int dm_exception_table_init(struct dm_exception_table *et, ··· 666 661 667 662 et->hash_shift = hash_shift; 668 663 et->hash_mask = size - 1; 669 
- et->table = kvmalloc_array(size, sizeof(struct hlist_bl_head), 664 + et->table = kvmalloc_array(size, sizeof(struct dm_hlist_head), 670 665 GFP_KERNEL); 671 666 if (!et->table) 672 667 return -ENOMEM; 673 668 674 - for (i = 0; i < size; i++) 675 - INIT_HLIST_BL_HEAD(et->table + i); 669 + for (i = 0; i < size; i++) { 670 + INIT_HLIST_HEAD(&et->table[i].head); 671 + spin_lock_init(&et->table[i].lock); 672 + } 676 673 677 674 return 0; 678 675 } ··· 682 675 static void dm_exception_table_exit(struct dm_exception_table *et, 683 676 struct kmem_cache *mem) 684 677 { 685 - struct hlist_bl_head *slot; 678 + struct dm_hlist_head *slot; 686 679 struct dm_exception *ex; 687 - struct hlist_bl_node *pos, *n; 680 + struct hlist_node *pos; 688 681 int i, size; 689 682 690 683 size = et->hash_mask + 1; 691 684 for (i = 0; i < size; i++) { 692 685 slot = et->table + i; 693 686 694 - hlist_bl_for_each_entry_safe(ex, pos, n, slot, hash_list) { 687 + hlist_for_each_entry_safe(ex, pos, &slot->head, hash_list) { 688 + hlist_del(&ex->hash_list); 695 689 kmem_cache_free(mem, ex); 696 690 cond_resched(); 697 691 } ··· 708 700 709 701 static void dm_remove_exception(struct dm_exception *e) 710 702 { 711 - hlist_bl_del(&e->hash_list); 703 + hlist_del(&e->hash_list); 712 704 } 713 705 714 706 /* ··· 718 710 static struct dm_exception *dm_lookup_exception(struct dm_exception_table *et, 719 711 chunk_t chunk) 720 712 { 721 - struct hlist_bl_head *slot; 722 - struct hlist_bl_node *pos; 713 + struct hlist_head *slot; 723 714 struct dm_exception *e; 724 715 725 - slot = &et->table[exception_hash(et, chunk)]; 726 - hlist_bl_for_each_entry(e, pos, slot, hash_list) 716 + slot = &et->table[exception_hash(et, chunk)].head; 717 + hlist_for_each_entry(e, slot, hash_list) 727 718 if (chunk >= e->old_chunk && 728 719 chunk <= e->old_chunk + dm_consecutive_chunk_count(e)) 729 720 return e; ··· 769 762 static void dm_insert_exception(struct dm_exception_table *eh, 770 763 struct dm_exception *new_e) 771 
764 { 772 - struct hlist_bl_head *l; 773 - struct hlist_bl_node *pos; 765 + struct hlist_head *l; 774 766 struct dm_exception *e = NULL; 775 767 776 - l = &eh->table[exception_hash(eh, new_e->old_chunk)]; 768 + l = &eh->table[exception_hash(eh, new_e->old_chunk)].head; 777 769 778 770 /* Add immediately if this table doesn't support consecutive chunks */ 779 771 if (!eh->hash_shift) 780 772 goto out; 781 773 782 774 /* List is ordered by old_chunk */ 783 - hlist_bl_for_each_entry(e, pos, l, hash_list) { 775 + hlist_for_each_entry(e, l, hash_list) { 784 776 /* Insert after an existing chunk? */ 785 777 if (new_e->old_chunk == (e->old_chunk + 786 778 dm_consecutive_chunk_count(e) + 1) && ··· 810 804 * Either the table doesn't support consecutive chunks or slot 811 805 * l is empty. 812 806 */ 813 - hlist_bl_add_head(&new_e->hash_list, l); 807 + hlist_add_head(&new_e->hash_list, l); 814 808 } else if (new_e->old_chunk < e->old_chunk) { 815 809 /* Add before an existing exception */ 816 - hlist_bl_add_before(&new_e->hash_list, &e->hash_list); 810 + hlist_add_before(&new_e->hash_list, &e->hash_list); 817 811 } else { 818 812 /* Add to l's tail: e is the last exception in this slot */ 819 - hlist_bl_add_behind(&new_e->hash_list, &e->hash_list); 813 + hlist_add_behind(&new_e->hash_list, &e->hash_list); 820 814 } 821 815 } 822 816 ··· 826 820 */ 827 821 static int dm_add_exception(void *context, chunk_t old, chunk_t new) 828 822 { 829 - struct dm_exception_table_lock lock; 830 823 struct dm_snapshot *s = context; 831 824 struct dm_exception *e; 832 825 ··· 838 833 /* Consecutive_count is implicitly initialised to zero */ 839 834 e->new_chunk = new; 840 835 841 - /* 842 - * Although there is no need to lock access to the exception tables 843 - * here, if we don't then hlist_bl_add_head(), called by 844 - * dm_insert_exception(), will complain about accessing the 845 - * corresponding list without locking it first. 
846 - */ 847 - dm_exception_table_lock_init(s, old, &lock); 848 - 849 - dm_exception_table_lock(&lock); 850 836 dm_insert_exception(&s->complete, e); 851 - dm_exception_table_unlock(&lock); 852 837 853 838 return 0; 854 839 } ··· 868 873 /* use a fixed size of 2MB */ 869 874 unsigned long mem = 2 * 1024 * 1024; 870 875 871 - mem /= sizeof(struct hlist_bl_head); 876 + mem /= sizeof(struct dm_hlist_head); 872 877 873 878 return mem; 874 879 }

+2 -6

drivers/md/dm-sysfs.c

@@ -86,17 +86,13 @@
 
 static ssize_t dm_attr_suspended_show(struct mapped_device *md, char *buf)
 {
-	sprintf(buf, "%d\n", dm_suspended_md(md));
-
-	return strlen(buf);
+	return sysfs_emit(buf, "%d\n", dm_suspended_md(md));
 }
 
 static ssize_t dm_attr_use_blk_mq_show(struct mapped_device *md, char *buf)
 {
 	/* Purely for userspace compatibility */
-	sprintf(buf, "%d\n", true);
-
-	return strlen(buf);
+	return sysfs_emit(buf, "%d\n", true);
 }
 
 static DM_ATTR_RO(name);
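Reviewer note: the dm-sysfs change replaces the unbounded `sprintf()` + `strlen()` pair with a single bounded call that returns the length. The sketch below is a userspace analogue of that pattern, not the kernel implementation; `emit_page` and `PAGE_BUF_SIZE` are hypothetical names, and the 4096-byte page size is an assumption.

```c
#include <stdarg.h>
#include <stdio.h>

#define PAGE_BUF_SIZE 4096	/* assumed page size for this sketch */

/*
 * Format into a page-sized buffer and return the number of characters
 * actually stored, excluding the terminating NUL.
 */
static int emit_page(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(buf, PAGE_BUF_SIZE, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;			/* formatting error */
	if (len >= PAGE_BUF_SIZE)
		len = PAGE_BUF_SIZE - 1;	/* output was truncated */
	return len;
}
```

The bounded form cannot overrun the buffer and avoids a second pass over the string just to compute the return value.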

+4

drivers/md/dm-table.c

@@ -2043,6 +2043,10 @@
 	return true;
 }
 
+/*
+ * This function will be skipped by noflush reloads of immutable request
+ * based devices (dm-mpath).
+ */
 int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 			      struct queue_limits *limits)
 {

+7 -12

drivers/md/dm-thin.c

@@ -395,13 +395,13 @@
 	op->bio = NULL;
 }
 
-static int issue_discard(struct discard_op *op, dm_block_t data_b, dm_block_t data_e)
+static void issue_discard(struct discard_op *op, dm_block_t data_b, dm_block_t data_e)
 {
 	struct thin_c *tc = op->tc;
 	sector_t s = block_to_sectors(tc->pool, data_b);
 	sector_t len = block_to_sectors(tc->pool, data_e - data_b);
 
-	return __blkdev_issue_discard(tc->pool_dev->bdev, s, len, GFP_NOIO, &op->bio);
+	__blkdev_issue_discard(tc->pool_dev->bdev, s, len, GFP_NOIO, &op->bio);
 }
 
 static void end_discard(struct discard_op *op, int r)
@@ -1113,9 +1113,7 @@
 				break;
 		}
 
-		r = issue_discard(&op, b, e);
-		if (r)
-			goto out;
+		issue_discard(&op, b, e);
 
 		b = e;
 	}
@@ -1186,8 +1188,8 @@
 			struct discard_op op;
 
 			begin_discard(&op, tc, discard_parent);
-			r = issue_discard(&op, m->data_block, data_end);
-			end_discard(&op, r);
+			issue_discard(&op, m->data_block, data_end);
+			end_discard(&op, 0);
 		}
 	}
 
@@ -4381,11 +4383,8 @@
 {
 	struct thin_c *tc = ti->private;
 
-	/*
-	 * The dm_noflush_suspending flag has been cleared by now, so
-	 * unfortunately we must always run this.
-	 */
-	noflush_work(tc, do_noflush_stop);
+	if (dm_noflush_suspending(ti))
+		noflush_work(tc, do_noflush_stop);
 }
 
 static int thin_preresume(struct dm_target *ti)

+1 -1

drivers/md/dm-vdo/action-manager.c

··· 43 43 * @actions: The two action slots. 44 44 * @current_action: The current action slot. 45 45 * @zones: The number of zones in which an action is to be applied. 46 - * @Scheduler: A function to schedule a default next action. 46 + * @scheduler: A function to schedule a default next action. 47 47 * @get_zone_thread_id: A function to get the id of the thread on which to apply an action to a 48 48 * zone. 49 49 * @initiator_thread_id: The ID of the thread on which actions may be initiated.

+50 -25

drivers/md/dm-vdo/admin-state.c

··· 149 149 /** 150 150 * get_next_state() - Determine the state which should be set after a given operation completes 151 151 * based on the operation and the current state. 152 - * @operation The operation to be started. 152 + * @state: The current admin state. 153 + * @operation: The operation to be started. 153 154 * 154 155 * Return: The state to set when the operation completes or NULL if the operation can not be 155 156 * started in the current state. ··· 188 187 189 188 /** 190 189 * vdo_finish_operation() - Finish the current operation. 190 + * @state: The current admin state. 191 + * @result: The result of the operation. 191 192 * 192 193 * Will notify the operation waiter if there is one. This method should be used for operations 193 194 * started with vdo_start_operation(). For operations which were started with vdo_start_draining(), ··· 217 214 218 215 /** 219 216 * begin_operation() - Begin an operation if it may be started given the current state. 220 - * @waiter A completion to notify when the operation is complete; may be NULL. 221 - * @initiator The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 217 + * @state: The current admin state. 218 + * @operation: The operation to be started. 219 + * @waiter: A completion to notify when the operation is complete; may be NULL. 220 + * @initiator: The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 222 221 * 223 222 * Return: VDO_SUCCESS or an error. 224 223 */ ··· 264 259 265 260 /** 266 261 * start_operation() - Start an operation if it may be started given the current state. 267 - * @waiter A completion to notify when the operation is complete. 268 - * @initiator The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 262 + * @state: The current admin state. 263 + * @operation: The operation to be started. 264 + * @waiter: A completion to notify when the operation is complete; may be NULL. 
265 + * @initiator: The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 269 266 * 270 267 * Return: true if the operation was started. 271 268 */ ··· 281 274 282 275 /** 283 276 * check_code() - Check the result of a state validation. 284 - * @valid true if the code is of an appropriate type. 285 - * @code The code which failed to be of the correct type. 286 - * @what What the code failed to be, for logging. 287 - * @waiter The completion to notify of the error; may be NULL. 277 + * @valid: True if the code is of an appropriate type. 278 + * @code: The code which failed to be of the correct type. 279 + * @what: What the code failed to be, for logging. 280 + * @waiter: The completion to notify of the error; may be NULL. 288 281 * 289 282 * If the result failed, log an invalid state error and, if there is a waiter, notify it. 290 283 * ··· 308 301 309 302 /** 310 303 * assert_vdo_drain_operation() - Check that an operation is a drain. 311 - * @waiter The completion to finish with an error if the operation is not a drain. 304 + * @operation: The operation to check. 305 + * @waiter: The completion to finish with an error if the operation is not a drain. 312 306 * 313 307 * Return: true if the specified operation is a drain. 314 308 */ ··· 321 313 322 314 /** 323 315 * vdo_start_draining() - Initiate a drain operation if the current state permits it. 324 - * @operation The type of drain to initiate. 325 - * @waiter The completion to notify when the drain is complete. 326 - * @initiator The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 316 + * @state: The current admin state. 317 + * @operation: The type of drain to initiate. 318 + * @waiter: The completion to notify when the drain is complete. 319 + * @initiator: The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 327 320 * 328 321 * Return: true if the drain was initiated, if not the waiter will be notified. 
329 322 */ ··· 354 345 355 346 /** 356 347 * vdo_finish_draining() - Finish a drain operation if one was in progress. 348 + * @state: The current admin state. 357 349 * 358 350 * Return: true if the state was draining; will notify the waiter if so. 359 351 */ ··· 365 355 366 356 /** 367 357 * vdo_finish_draining_with_result() - Finish a drain operation with a status code. 358 + * @state: The current admin state. 359 + * @result: The result of the drain operation. 368 360 * 369 361 * Return: true if the state was draining; will notify the waiter if so. 370 362 */ ··· 377 365 378 366 /** 379 367 * vdo_assert_load_operation() - Check that an operation is a load. 380 - * @waiter The completion to finish with an error if the operation is not a load. 368 + * @operation: The operation to check. 369 + * @waiter: The completion to finish with an error if the operation is not a load. 381 370 * 382 371 * Return: true if the specified operation is a load. 383 372 */ ··· 390 377 391 378 /** 392 379 * vdo_start_loading() - Initiate a load operation if the current state permits it. 393 - * @operation The type of load to initiate. 394 - * @waiter The completion to notify when the load is complete (may be NULL). 395 - * @initiator The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 380 + * @state: The current admin state. 381 + * @operation: The type of load to initiate. 382 + * @waiter: The completion to notify when the load is complete; may be NULL. 383 + * @initiator: The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 396 384 * 397 385 * Return: true if the load was initiated, if not the waiter will be notified. 398 386 */ ··· 407 393 408 394 /** 409 395 * vdo_finish_loading() - Finish a load operation if one was in progress. 396 + * @state: The current admin state. 410 397 * 411 398 * Return: true if the state was loading; will notify the waiter if so. 
412 399 */ ··· 418 403 419 404 /** 420 405 * vdo_finish_loading_with_result() - Finish a load operation with a status code. 421 - * @result The result of the load operation. 406 + * @state: The current admin state. 407 + * @result: The result of the load operation. 422 408 * 423 409 * Return: true if the state was loading; will notify the waiter if so. 424 410 */ ··· 430 414 431 415 /** 432 416 * assert_vdo_resume_operation() - Check whether an admin_state_code is a resume operation. 433 - * @waiter The completion to notify if the operation is not a resume operation; may be NULL. 417 + * @operation: The operation to check. 418 + * @waiter: The completion to notify if the operation is not a resume operation; may be NULL. 434 419 * 435 420 * Return: true if the code is a resume operation. 436 421 */ ··· 444 427 445 428 /** 446 429 * vdo_start_resuming() - Initiate a resume operation if the current state permits it. 447 - * @operation The type of resume to start. 448 - * @waiter The completion to notify when the resume is complete (may be NULL). 449 - * @initiator The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 430 + * @state: The current admin state. 431 + * @operation: The type of resume to start. 432 + * @waiter: The completion to notify when the resume is complete; may be NULL. 433 + * @initiator: The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 450 434 * 451 435 * Return: true if the resume was initiated, if not the waiter will be notified. 452 436 */ ··· 461 443 462 444 /** 463 445 * vdo_finish_resuming() - Finish a resume operation if one was in progress. 446 + * @state: The current admin state. 464 447 * 465 448 * Return: true if the state was resuming; will notify the waiter if so. 466 449 */ ··· 472 453 473 454 /** 474 455 * vdo_finish_resuming_with_result() - Finish a resume operation with a status code. 475 - * @result The result of the resume operation. 456 + * @state: The current admin state. 
457 + * @result: The result of the resume operation. 476 458 * 477 459 * Return: true if the state was resuming; will notify the waiter if so. 478 460 */ ··· 485 465 /** 486 466 * vdo_resume_if_quiescent() - Change the state to normal operation if the current state is 487 467 * quiescent. 468 + * @state: The current admin state. 488 469 * 489 470 * Return: VDO_SUCCESS if the state resumed, VDO_INVALID_ADMIN_STATE otherwise. 490 471 */ ··· 500 479 501 480 /** 502 481 * vdo_start_operation() - Attempt to start an operation. 482 + * @state: The current admin state. 483 + * @operation: The operation to attempt to start. 503 484 * 504 485 * Return: VDO_SUCCESS if the operation was started, VDO_INVALID_ADMIN_STATE if not 505 486 */ ··· 513 490 514 491 /** 515 492 * vdo_start_operation_with_waiter() - Attempt to start an operation. 516 - * @waiter the completion to notify when the operation completes or fails to start; may be NULL. 517 - * @initiator The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 493 + * @state: The current admin state. 494 + * @operation: The operation to attempt to start. 495 + * @waiter: The completion to notify when the operation completes or fails to start; may be NULL. 496 + * @initiator: The vdo_admin_initiator_fn to call if the operation may begin; may be NULL. 518 497 * 519 498 * Return: VDO_SUCCESS if the operation was started, VDO_INVALID_ADMIN_STATE if not 520 499 */
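The admin-state.c hunks above all apply the same kernel-doc fix: each parameter gets its own `@name:` line (with the colon), and every parameter of the function is listed. A minimal sketch of a comment in that layout, using a hypothetical `struct demo_state` and function rather than the real `struct admin_state` API:

```c
#include <stdbool.h>

/* Hypothetical state type, used only to illustrate the comment layout. */
struct demo_state {
	bool draining;
	int result;
};

/**
 * demo_finish_draining_with_result() - Finish a drain operation with a status code.
 * @state: The current demo state.
 * @result: The result of the drain operation.
 *
 * Return: true if the state was draining.
 */
static bool demo_finish_draining_with_result(struct demo_state *state, int result)
{
	if (!state->draining)
		return false;

	state->draining = false;
	state->result = result;
	return true;
}
```

The names after `@` have to match the parameter names in the prototype exactly, including case, which is also what the `@Scheduler` to `@scheduler` change in action-manager.c addresses.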

+43 -8

drivers/md/dm-vdo/block-map.c

··· 174 174 175 175 /** 176 176 * initialize_info() - Initialize all page info structures and put them on the free list. 177 + * @cache: The page cache. 177 178 * 178 179 * Return: VDO_SUCCESS or an error. 179 180 */ ··· 210 209 /** 211 210 * allocate_cache_components() - Allocate components of the cache which require their own 212 211 * allocation. 212 + * @cache: The page cache. 213 213 * 214 214 * The caller is responsible for all clean up on errors. 215 215 * ··· 240 238 /** 241 239 * assert_on_cache_thread() - Assert that a function has been called on the VDO page cache's 242 240 * thread. 241 + * @cache: The page cache. 242 + * @function_name: The funtion name to report if the assertion fails. 243 243 */ 244 244 static inline void assert_on_cache_thread(struct vdo_page_cache *cache, 245 245 const char *function_name) ··· 275 271 276 272 /** 277 273 * get_page_state_name() - Return the name of a page state. 274 + * @state: The page state to describe. 278 275 * 279 276 * If the page state is invalid a static string is returned and the invalid state is logged. 280 277 * ··· 347 342 /** 348 343 * set_info_state() - Set the state of a page_info and put it on the right list, adjusting 349 344 * counters. 345 + * @info: The page info to update. 346 + * @new_state: The new state to set. 350 347 */ 351 348 static void set_info_state(struct page_info *info, enum vdo_page_buffer_state new_state) 352 349 { ··· 423 416 424 417 /** 425 418 * find_free_page() - Find a free page. 419 + * @cache: The page cache. 426 420 * 427 421 * Return: A pointer to the page info structure (if found), NULL otherwise. 428 422 */ ··· 441 433 442 434 /** 443 435 * find_page() - Find the page info (if any) associated with a given pbn. 436 + * @cache: The page cache. 444 437 * @pbn: The absolute physical block number of the page. 445 438 * 446 439 * Return: The page info for the page if available, or NULL if not. 
··· 458 449 459 450 /** 460 451 * select_lru_page() - Determine which page is least recently used. 452 + * @cache: The page cache. 461 453 * 462 454 * Picks the least recently used from among the non-busy entries at the front of each of the lru 463 455 * list. Since whenever we mark a page busy we also put it to the end of the list it is unlikely ··· 533 523 534 524 /** 535 525 * distribute_page_over_waitq() - Complete a waitq of VDO page completions with a page result. 526 + * @info: The loaded page info. 527 + * @waitq: The list of waiting data_vios. 536 528 * 537 529 * Upon completion the waitq will be empty. 538 530 * ··· 560 548 561 549 /** 562 550 * set_persistent_error() - Set a persistent error which all requests will receive in the future. 551 + * @cache: The page cache. 563 552 * @context: A string describing what triggered the error. 553 + * @result: The error result to set on the cache. 564 554 * 565 555 * Once triggered, all enqueued completions will get this error. Any future requests will result in 566 556 * this error as well. ··· 595 581 /** 596 582 * validate_completed_page() - Check that a page completion which is being freed to the cache 597 583 * referred to a valid page and is in a valid state. 584 + * @completion: The page completion to check. 598 585 * @writable: Whether a writable page is required. 599 586 * 600 587 * Return: VDO_SUCCESS if the page was valid, otherwise as error ··· 773 758 774 759 /** 775 760 * launch_page_load() - Begin the process of loading a page. 761 + * @info: The page info to launch. 762 + * @pbn: The absolute physical block number of the page to load. 776 763 * 777 764 * Return: VDO_SUCCESS or an error code. 778 765 */ ··· 853 836 854 837 /** 855 838 * schedule_page_save() - Add a page to the outgoing list of pages waiting to be saved. 839 + * @info: The page info to save. 856 840 * 857 841 * Once in the list, a page may not be used until it has been written out. 
858 842 */ ··· 872 854 /** 873 855 * launch_page_save() - Add a page to outgoing pages waiting to be saved, and then start saving 874 856 * pages if another save is not in progress. 857 + * @info: The page info to save. 875 858 */ 876 859 static void launch_page_save(struct page_info *info) 877 860 { ··· 883 864 /** 884 865 * completion_needs_page() - Determine whether a given vdo_page_completion (as a waiter) is 885 866 * requesting a given page number. 867 + * @waiter: The page completion waiter to check. 886 868 * @context: A pointer to the pbn of the desired page. 887 869 * 888 870 * Implements waiter_match_fn. ··· 900 880 /** 901 881 * allocate_free_page() - Allocate a free page to the first completion in the waiting queue, and 902 882 * any other completions that match it in page number. 883 + * @info: The page info to allocate a page for. 903 884 */ 904 885 static void allocate_free_page(struct page_info *info) 905 886 { ··· 946 925 947 926 /** 948 927 * discard_a_page() - Begin the process of discarding a page. 928 + * @cache: The page cache. 949 929 * 950 930 * If no page is discardable, increments a count of deferred frees so that the next release of a 951 931 * page which is no longer busy will kick off another discard cycle. This is an indication that the ··· 977 955 launch_page_save(info); 978 956 } 979 957 980 - /** 981 - * discard_page_for_completion() - Helper used to trigger a discard so that the completion can get 982 - * a different page. 983 - */ 984 958 static void discard_page_for_completion(struct vdo_page_completion *vdo_page_comp) 985 959 { 986 960 struct vdo_page_cache *cache = vdo_page_comp->cache; ··· 1150 1132 1151 1133 /** 1152 1134 * vdo_release_page_completion() - Release a VDO Page Completion. 1135 + * @completion: The page completion to release. 1153 1136 * 1154 1137 * The page referenced by this completion (if any) will no longer be held busy by this completion. 
1155 1138 * If a page becomes discardable and there are completions awaiting free pages then a new round of ··· 1191 1172 } 1192 1173 } 1193 1174 1194 - /** 1195 - * load_page_for_completion() - Helper function to load a page as described by a VDO Page 1196 - * Completion. 1197 - */ 1198 1175 static void load_page_for_completion(struct page_info *info, 1199 1176 struct vdo_page_completion *vdo_page_comp) 1200 1177 { ··· 1334 1319 1335 1320 /** 1336 1321 * vdo_invalidate_page_cache() - Invalidate all entries in the VDO page cache. 1322 + * @cache: The page cache. 1337 1323 * 1338 1324 * There must not be any dirty pages in the cache. 1339 1325 * ··· 1361 1345 1362 1346 /** 1363 1347 * get_tree_page_by_index() - Get the tree page for a given height and page index. 1348 + * @forest: The block map forest. 1349 + * @root_index: The root index of the tree to search. 1350 + * @height: The height in the tree. 1351 + * @page_index: The page index. 1364 1352 * 1365 1353 * Return: The requested page. 1366 1354 */ ··· 2231 2211 /** 2232 2212 * vdo_find_block_map_slot() - Find the block map slot in which the block map entry for a data_vio 2233 2213 * resides and cache that result in the data_vio. 2214 + * @data_vio: The data vio. 2234 2215 * 2235 2216 * All ancestors in the tree will be allocated or loaded, as needed. 2236 2217 */ ··· 2456 2435 /** 2457 2436 * make_forest() - Make a collection of trees for a block_map, expanding the existing forest if 2458 2437 * there is one. 2438 + * @map: The block map. 2459 2439 * @entries: The number of entries the block map will hold. 2460 2440 * 2461 2441 * Return: VDO_SUCCESS or an error. ··· 2498 2476 2499 2477 /** 2500 2478 * replace_forest() - Replace a block_map's forest with the already-prepared larger forest. 2479 + * @map: The block map. 2501 2480 */ 2502 2481 static void replace_forest(struct block_map *map) 2503 2482 { ··· 2515 2492 /** 2516 2493 * finish_cursor() - Finish the traversal of a single tree. 
If it was the last cursor, finish the 2517 2494 * traversal. 2495 + * @cursor: The cursor to complete. 2518 2496 */ 2519 2497 static void finish_cursor(struct cursor *cursor) 2520 2498 { ··· 2573 2549 2574 2550 /** 2575 2551 * traverse() - Traverse a single block map tree. 2552 + * @cursor: A cursor tracking traversal progress. 2576 2553 * 2577 2554 * This is the recursive heart of the traversal process. 2578 2555 */ ··· 2644 2619 /** 2645 2620 * launch_cursor() - Start traversing a single block map tree now that the cursor has a VIO with 2646 2621 * which to load pages. 2622 + * @waiter: The parent of the cursor to launch. 2647 2623 * @context: The pooled_vio just acquired. 2648 2624 * 2649 2625 * Implements waiter_callback_fn. ··· 2662 2636 2663 2637 /** 2664 2638 * compute_boundary() - Compute the number of pages used at each level of the given root's tree. 2639 + * @map: The block map. 2640 + * @root_index: The tree root index. 2665 2641 * 2666 2642 * Return: The list of page counts as a boundary structure. 2667 2643 */ ··· 2696 2668 2697 2669 /** 2698 2670 * vdo_traverse_forest() - Walk the entire forest of a block map. 2671 + * @map: The block map. 2699 2672 * @callback: A function to call with the pbn of each allocated node in the forest. 2700 2673 * @completion: The completion to notify on each traversed PBN, and when traversal completes. 2701 2674 */ ··· 2736 2707 2737 2708 /** 2738 2709 * initialize_block_map_zone() - Initialize the per-zone portions of the block map. 2710 + * @map: The block map. 2711 + * @zone_number: The zone to initialize. 2712 + * @cache_size: The total block map cache size. 2739 2713 * @maximum_age: The number of journal blocks before a dirtied page is considered old and must be 2740 2714 * written out. 2741 2715 */ ··· 3123 3091 3124 3092 /** 3125 3093 * clear_mapped_location() - Clear a data_vio's mapped block location, setting it to be unmapped. 3094 + * @data_vio: The data vio. 
3126 3095 * 3127 3096 * This indicates the block map entry for the logical block is either unmapped or corrupted. 3128 3097 */ ··· 3137 3104 /** 3138 3105 * set_mapped_location() - Decode and validate a block map entry, and set the mapped location of a 3139 3106 * data_vio. 3107 + * @data_vio: The data vio. 3108 + * @entry: The new mapped entry to set. 3140 3109 * 3141 3110 * Return: VDO_SUCCESS or VDO_BAD_MAPPING if the map entry is invalid or an error code for any 3142 3111 * other failure

+5

drivers/md/dm-vdo/completion.c

··· 65 65 66 66 /** 67 67 * vdo_set_completion_result() - Set the result of a completion. 68 + * @completion: The completion to update. 69 + * @result: The result to set. 68 70 * 69 71 * Older errors will not be masked. 70 72 */ ··· 79 77 80 78 /** 81 79 * vdo_launch_completion_with_priority() - Run or enqueue a completion. 80 + * @completion: The completion to launch. 82 81 * @priority: The priority at which to enqueue the completion. 83 82 * 84 83 * If called on the correct thread (i.e. the one specified in the completion's callback_thread_id ··· 128 125 129 126 /** 130 127 * vdo_requeue_completion_if_needed() - Requeue a completion if not called on the specified thread. 128 + * @completion: The completion to requeue. 129 + * @callback_thread_id: The thread on which to requeue the completion. 131 130 * 132 131 * Return: True if the completion was requeued; callers may not access the completion in this case. 133 132 */

+32 -2

drivers/md/dm-vdo/data-vio.c

··· 227 227 /** 228 228 * check_for_drain_complete_locked() - Check whether a data_vio_pool has no outstanding data_vios 229 229 * or waiters while holding the pool's lock. 230 + * @pool: The data_vio pool. 230 231 */ 231 232 static bool check_for_drain_complete_locked(struct data_vio_pool *pool) 232 233 { ··· 388 387 389 388 /** 390 389 * cancel_data_vio_compression() - Prevent this data_vio from being compressed or packed. 390 + * @data_vio: The data_vio. 391 391 * 392 392 * Return: true if the data_vio is in the packer and the caller was the first caller to cancel it. 393 393 */ ··· 485 483 /** 486 484 * launch_data_vio() - (Re)initialize a data_vio to have a new logical block number, keeping the 487 485 * same parent and other state and send it on its way. 486 + * @data_vio: The data_vio to launch. 487 + * @lbn: The logical block number. 488 488 */ 489 489 static void launch_data_vio(struct data_vio *data_vio, logical_block_number_t lbn) 490 490 { ··· 645 641 646 642 /** 647 643 * schedule_releases() - Ensure that release processing is scheduled. 644 + * @pool: The data_vio pool. 648 645 * 649 646 * If this call switches the state to processing, enqueue. Otherwise, some other thread has already 650 647 * done so. ··· 773 768 774 769 /** 775 770 * initialize_data_vio() - Allocate the components of a data_vio. 771 + * @data_vio: The data_vio to initialize. 772 + * @vdo: The vdo containing the data_vio. 776 773 * 777 774 * The caller is responsible for cleaning up the data_vio on error. 778 775 * ··· 887 880 888 881 /** 889 882 * free_data_vio_pool() - Free a data_vio_pool and the data_vios in it. 883 + * @pool: The data_vio pool to free. 890 884 * 891 885 * All data_vios must be returned to the pool before calling this function. 892 886 */ ··· 952 944 953 945 /** 954 946 * vdo_launch_bio() - Acquire a data_vio from the pool, assign the bio to it, and launch it. 947 + * @pool: The data_vio pool. 948 + * @bio: The bio to launch. 
955 949 * 956 950 * This will block if data_vios or discard permits are not available. 957 951 */ ··· 1004 994 1005 995 /** 1006 996 * drain_data_vio_pool() - Wait asynchronously for all data_vios to be returned to the pool. 997 + * @pool: The data_vio pool. 1007 998 * @completion: The completion to notify when the pool has drained. 1008 999 */ 1009 1000 void drain_data_vio_pool(struct data_vio_pool *pool, struct vdo_completion *completion) ··· 1016 1005 1017 1006 /** 1018 1007 * resume_data_vio_pool() - Resume a data_vio pool. 1008 + * @pool: The data_vio pool. 1019 1009 * @completion: The completion to notify when the pool has resumed. 1020 1010 */ 1021 1011 void resume_data_vio_pool(struct data_vio_pool *pool, struct vdo_completion *completion) ··· 1036 1024 1037 1025 /** 1038 1026 * dump_data_vio_pool() - Dump a data_vio pool to the log. 1027 + * @pool: The data_vio pool. 1039 1028 * @dump_vios: Whether to dump the details of each busy data_vio as well. 1040 1029 */ 1041 1030 void dump_data_vio_pool(struct data_vio_pool *pool, bool dump_vios) ··· 1127 1114 /** 1128 1115 * release_allocated_lock() - Release the PBN lock and/or the reference on the allocated block at 1129 1116 * the end of processing a data_vio. 1117 + * @completion: The data_vio holding the lock. 1130 1118 */ 1131 1119 static void release_allocated_lock(struct vdo_completion *completion) 1132 1120 { ··· 1208 1194 /** 1209 1195 * release_logical_lock() - Release the logical block lock and flush generation lock at the end of 1210 1196 * processing a data_vio. 1197 + * @completion: The data_vio holding the lock. 1211 1198 */ 1212 1199 static void release_logical_lock(struct vdo_completion *completion) 1213 1200 { ··· 1243 1228 1244 1229 /** 1245 1230 * finish_cleanup() - Make some assertions about a data_vio which has finished cleaning up. 1231 + * @data_vio: The data_vio. 
1246 1232 * 1247 1233 * If it is part of a multi-block discard, starts on the next block, otherwise, returns it to the 1248 1234 * pool. ··· 1358 1342 /** 1359 1343 * get_data_vio_operation_name() - Get the name of the last asynchronous operation performed on a 1360 1344 * data_vio. 1345 + * @data_vio: The data_vio. 1361 1346 */ 1362 1347 const char *get_data_vio_operation_name(struct data_vio *data_vio) 1363 1348 { ··· 1372 1355 1373 1356 /** 1374 1357 * data_vio_allocate_data_block() - Allocate a data block. 1375 - * 1358 + * @data_vio: The data_vio. 1376 1359 * @write_lock_type: The type of write lock to obtain on the block. 1377 1360 * @callback: The callback which will attempt an allocation in the current zone and continue if it 1378 1361 * succeeds. ··· 1396 1379 1397 1380 /** 1398 1381 * release_data_vio_allocation_lock() - Release the PBN lock on a data_vio's allocated block. 1382 + * @data_vio: The data_vio. 1399 1383 * @reset: If true, the allocation will be reset (i.e. any allocated pbn will be forgotten). 1400 1384 * 1401 1385 * If the reference to the locked block is still provisional, it will be released as well. ··· 1417 1399 1418 1400 /** 1419 1401 * uncompress_data_vio() - Uncompress the data a data_vio has just read. 1402 + * @data_vio: The data_vio. 1420 1403 * @mapping_state: The mapping state indicating which fragment to decompress. 1421 1404 * @buffer: The buffer to receive the uncompressed data. 1422 1405 */ ··· 1538 1519 1539 1520 /** 1540 1521 * read_block() - Read a block asynchronously. 1522 + * @completion: The data_vio doing the read. 1541 1523 * 1542 1524 * This is the callback registered in read_block_mapping(). 1543 1525 */ ··· 1695 1675 1696 1676 /** 1697 1677 * read_old_block_mapping() - Get the previous PBN/LBN mapping of an in-progress write. 1678 + * @completion: The data_vio doing the read. 
1698 1679 * 1699 1680 * Gets the previous PBN mapped to this LBN from the block map, so as to make an appropriate 1700 1681 * journal entry referencing the removal of this LBN->PBN mapping. ··· 1725 1704 1726 1705 /** 1727 1706 * pack_compressed_data() - Attempt to pack the compressed data_vio into a block. 1707 + * @completion: The data_vio. 1728 1708 * 1729 1709 * This is the callback registered in launch_compress_data_vio(). 1730 1710 */ ··· 1747 1725 1748 1726 /** 1749 1727 * compress_data_vio() - Do the actual work of compressing the data on a CPU queue. 1728 + * @completion: The data_vio. 1750 1729 * 1751 1730 * This callback is registered in launch_compress_data_vio(). 1752 1731 */ ··· 1777 1754 1778 1755 /** 1779 1756 * launch_compress_data_vio() - Continue a write by attempting to compress the data. 1757 + * @data_vio: The data_vio. 1780 1758 * 1781 1759 * This is a re-entry point to vio_write used by hash locks. 1782 1760 */ ··· 1820 1796 /** 1821 1797 * hash_data_vio() - Hash the data in a data_vio and set the hash zone (which also flags the record 1822 1798 * name as set). 1823 - 1799 + * @completion: The data_vio. 1800 + * 1824 1801 * This callback is registered in prepare_for_dedupe(). 1825 1802 */ 1826 1803 static void hash_data_vio(struct vdo_completion *completion) ··· 1857 1832 /** 1858 1833 * write_bio_finished() - This is the bio_end_io function registered in write_block() to be called 1859 1834 * when a data_vio's write to the underlying storage has completed. 1835 + * @bio: The bio to update. 1860 1836 */ 1861 1837 static void write_bio_finished(struct bio *bio) 1862 1838 { ··· 1910 1884 1911 1885 /** 1912 1886 * acknowledge_write_callback() - Acknowledge a write to the requestor. 1887 + * @completion: The data_vio. 1913 1888 * 1914 1889 * This callback is registered in allocate_block() and continue_write_with_block_map_slot(). 
1915 1890 */ ··· 1936 1909 1937 1910 /** 1938 1911 * allocate_block() - Attempt to allocate a block in the current allocation zone. 1912 + * @completion: The data_vio. 1939 1913 * 1940 1914 * This callback is registered in continue_write_with_block_map_slot(). 1941 1915 */ ··· 1969 1941 1970 1942 /** 1971 1943 * handle_allocation_error() - Handle an error attempting to allocate a block. 1944 + * @completion: The data_vio. 1972 1945 * 1973 1946 * This error handler is registered in continue_write_with_block_map_slot(). 1974 1947 */ ··· 1999 1970 2000 1971 /** 2001 1972 * continue_data_vio_with_block_map_slot() - Read the data_vio's mapping from the block map. 1973 + * @completion: The data_vio to continue. 2002 1974 * 2003 1975 * This callback is registered in launch_read_data_vio(). 2004 1976 */

+18 -24

drivers/md/dm-vdo/dedupe.c

··· 917 917 918 918 /** 919 919 * enter_forked_lock() - Bind the data_vio to a new hash lock. 920 + * @waiter: The data_vio's waiter link. 921 + * @context: The new hash lock. 920 922 * 921 923 * Implements waiter_callback_fn. Binds the data_vio that was waiting to a new hash lock and waits 922 924 * on that lock. ··· 973 971 * path. 974 972 * @lock: The hash lock. 975 973 * @data_vio: The data_vio to deduplicate using the hash lock. 976 - * @has_claim: true if the data_vio already has claimed an increment from the duplicate lock. 974 + * @has_claim: True if the data_vio already has claimed an increment from the duplicate lock. 977 975 * 978 976 * If no increments are available, this will roll over to a new hash lock and launch the data_vio 979 977 * as the writing agent for that lock. ··· 998 996 * true copy of their data on disk. 999 997 * @lock: The hash lock. 1000 998 * @agent: The data_vio acting as the agent for the lock. 1001 - * @agent_is_done: true only if the agent has already written or deduplicated against its data. 999 + * @agent_is_done: True only if the agent has already written or deduplicated against its data. 1002 1000 * 1003 1001 * If the agent itself needs to deduplicate, an increment for it must already have been claimed 1004 1002 * from the duplicate lock, ensuring the hash lock will still have a data_vio holding it. ··· 2148 2146 /** 2149 2147 * report_dedupe_timeouts() - Record and eventually report that some dedupe requests reached their 2150 2148 * expiration time without getting answers, so we timed them out. 2151 - * @zones: the hash zones. 2152 - * @timeouts: the number of newly timed out requests. 2149 + * @zones: The hash zones. 2150 + * @timeouts: The number of newly timed out requests. 2153 2151 */ 2154 2152 static void report_dedupe_timeouts(struct hash_zones *zones, unsigned int timeouts) 2155 2153 { ··· 2511 2509 2512 2510 /** 2513 2511 * suspend_index() - Suspend the UDS index prior to draining hash zones. 
2512 + * @context: Not used. 2513 + * @completion: The completion for the suspend operation. 2514 2514 * 2515 2515 * Implements vdo_action_preamble_fn 2516 2516 */ ··· 2525 2521 initiate_suspend_index); 2526 2522 } 2527 2523 2528 - /** 2529 - * initiate_drain() - Initiate a drain. 2530 - * 2531 - * Implements vdo_admin_initiator_fn. 2532 - */ 2524 + /** Implements vdo_admin_initiator_fn. */ 2533 2525 static void initiate_drain(struct admin_state *state) 2534 2526 { 2535 2527 check_for_drain_complete(container_of(state, struct hash_zone, state)); 2536 2528 } 2537 2529 2538 - /** 2539 - * drain_hash_zone() - Drain a hash zone. 2540 - * 2541 - * Implements vdo_zone_action_fn. 2542 - */ 2530 + /** Implements vdo_zone_action_fn. */ 2543 2531 static void drain_hash_zone(void *context, zone_count_t zone_number, 2544 2532 struct vdo_completion *parent) 2545 2533 { ··· 2568 2572 2569 2573 /** 2570 2574 * resume_index() - Resume the UDS index prior to resuming hash zones. 2575 + * @context: Not used. 2576 + * @parent: The completion for the resume operation. 2571 2577 * 2572 2578 * Implements vdo_action_preamble_fn 2573 2579 */ ··· 2600 2602 vdo_finish_completion(parent); 2601 2603 } 2602 2604 2603 - /** 2604 - * resume_hash_zone() - Resume a hash zone. 2605 - * 2606 - * Implements vdo_zone_action_fn. 2607 - */ 2605 + /** Implements vdo_zone_action_fn. */ 2608 2606 static void resume_hash_zone(void *context, zone_count_t zone_number, 2609 2607 struct vdo_completion *parent) 2610 2608 { ··· 2628 2634 /** 2629 2635 * get_hash_zone_statistics() - Add the statistics for this hash zone to the tally for all zones. 2630 2636 * @zone: The hash zone to query. 2631 - * @tally: The tally 2637 + * @tally: The tally. 2632 2638 */ 2633 2639 static void get_hash_zone_statistics(const struct hash_zone *zone, 2634 2640 struct hash_lock_statistics *tally) ··· 2674 2680 2675 2681 /** 2676 2682 * vdo_get_dedupe_statistics() - Tally the statistics from all the hash zones and the UDS index. 
2677 - * @zones: The hash zones to query 2678 - * @stats: A structure to store the statistics 2683 + * @zones: The hash zones to query. 2684 + * @stats: A structure to store the statistics. 2679 2685 * 2680 2686 * Return: The sum of the hash lock statistics from all hash zones plus the statistics from the UDS 2681 2687 * index ··· 2850 2856 2851 2857 /** 2852 2858 * acquire_context() - Acquire a dedupe context from a hash_zone if any are available. 2853 - * @zone: the hash zone 2859 + * @zone: The hash zone. 2854 2860 * 2855 - * Return: A dedupe_context or NULL if none are available 2861 + * Return: A dedupe_context or NULL if none are available. 2856 2862 */ 2857 2863 static struct dedupe_context * __must_check acquire_context(struct hash_zone *zone) 2858 2864 {

+3 -2

drivers/md/dm-vdo/dm-vdo-target.c

··· 1144 1144 /** 1145 1145 * get_thread_id_for_phase() - Get the thread id for the current phase of the admin operation in 1146 1146 * progress. 1147 + * @vdo: The vdo. 1147 1148 */ 1148 1149 static thread_id_t __must_check get_thread_id_for_phase(struct vdo *vdo) 1149 1150 { ··· 1189 1188 /** 1190 1189 * advance_phase() - Increment the phase of the current admin operation and prepare the admin 1191 1190 * completion to run on the thread for the next phase. 1192 - * @vdo: The on which an admin operation is being performed 1191 + * @vdo: The vdo on which an admin operation is being performed. 1193 1192 * 1194 - * Return: The current phase 1193 + * Return: The current phase. 1195 1194 */ 1196 1195 static u32 advance_phase(struct vdo *vdo) 1197 1196 {

+24 -2

drivers/md/dm-vdo/encodings.c

··· 432 432 /** 433 433 * vdo_compute_new_forest_pages() - Compute the number of pages which must be allocated at each 434 434 * level in order to grow the forest to a new number of entries. 435 + * @root_count: The number of block map roots. 436 + * @old_sizes: The sizes of the old tree segments. 435 437 * @entries: The new number of entries the block map must address. 438 + * @new_sizes: The sizes of the new tree segments. 436 439 * 437 440 * Return: The total number of non-leaf pages required. 438 441 */ ··· 465 462 466 463 /** 467 464 * encode_recovery_journal_state_7_0() - Encode the state of a recovery journal. 465 + * @buffer: A buffer to store the encoding. 466 + * @offset: The offset in the buffer at which to encode. 467 + * @state: The recovery journal state to encode. 468 468 * 469 469 * Return: VDO_SUCCESS or an error code. 470 470 */ ··· 490 484 /** 491 485 * decode_recovery_journal_state_7_0() - Decode the state of a recovery journal saved in a buffer. 492 486 * @buffer: The buffer containing the saved state. 487 + * @offset: The offset to start decoding from. 493 488 * @state: A pointer to a recovery journal state to hold the result of a successful decode. 494 489 * 495 490 * Return: VDO_SUCCESS or an error code. ··· 551 544 552 545 /** 553 546 * encode_slab_depot_state_2_0() - Encode the state of a slab depot into a buffer. 547 + * @buffer: A buffer to store the encoding. 548 + * @offset: The offset in the buffer at which to encode. 549 + * @state: The slab depot state to encode. 554 550 */ 555 551 static void encode_slab_depot_state_2_0(u8 *buffer, size_t *offset, 556 552 struct slab_depot_state_2_0 state) ··· 580 570 581 571 /** 582 572 * decode_slab_depot_state_2_0() - Decode slab depot component state version 2.0 from a buffer. 573 + * @buffer: The buffer being decoded. 574 + * @offset: The offset to start decoding from. 575 + * @state: A pointer to a slab depot state to hold the decoded result. 
583 576 * 584 577 * Return: VDO_SUCCESS or an error code. 585 578 */ ··· 1169 1156 1170 1157 /** 1171 1158 * decode_vdo_component() - Decode the component data for the vdo itself out of the super block. 1159 + * @buffer: The buffer being decoded. 1160 + * @offset: The offset to start decoding from. 1161 + * @component: The vdo component structure to decode into. 1172 1162 * 1173 1163 * Return: VDO_SUCCESS or an error. 1174 1164 */ ··· 1306 1290 * understand. 1307 1291 * @buffer: The buffer being decoded. 1308 1292 * @offset: The offset to start decoding from. 1309 - * @geometry: The vdo geometry 1293 + * @geometry: The vdo geometry. 1310 1294 * @states: An object to hold the successfully decoded state. 1311 1295 * 1312 1296 * Return: VDO_SUCCESS or an error. ··· 1345 1329 /** 1346 1330 * vdo_decode_component_states() - Decode the payload of a super block. 1347 1331 * @buffer: The buffer containing the encoded super block contents. 1348 - * @geometry: The vdo geometry 1332 + * @geometry: The vdo geometry. 1349 1333 * @states: A pointer to hold the decoded states. 1350 1334 * 1351 1335 * Return: VDO_SUCCESS or an error. ··· 1399 1383 1400 1384 /** 1401 1385 * vdo_encode_component_states() - Encode the state of all vdo components in the super block. 1386 + * @buffer: A buffer to store the encoding. 1387 + * @offset: The offset into the buffer to start the encoding. 1388 + * @states: The component states to encode. 1402 1389 */ 1403 1390 static void vdo_encode_component_states(u8 *buffer, size_t *offset, 1404 1391 const struct vdo_component_states *states) ··· 1421 1402 1422 1403 /** 1423 1404 * vdo_encode_super_block() - Encode a super block into its on-disk representation. 1405 + * @buffer: A buffer to store the encoding. 1406 + * @states: The component states to encode. 
1424 1407 */ 1425 1408 void vdo_encode_super_block(u8 *buffer, struct vdo_component_states *states) 1426 1409 { ··· 1447 1426 1448 1427 /** 1449 1428 * vdo_decode_super_block() - Decode a super block from its on-disk representation. 1429 + * @buffer: The buffer to decode from. 1450 1430 */ 1451 1431 int vdo_decode_super_block(u8 *buffer) 1452 1432 {

+1 -5

drivers/md/dm-vdo/flush.c

··· 522 522 vdo_enqueue_completion(completion, BIO_Q_FLUSH_PRIORITY); 523 523 } 524 524 525 - /** 526 - * initiate_drain() - Initiate a drain. 527 - * 528 - * Implements vdo_admin_initiator_fn. 529 - */ 525 + /** Implements vdo_admin_initiator_fn. */ 530 526 static void initiate_drain(struct admin_state *state) 531 527 { 532 528 check_for_drain_complete(container_of(state, struct flusher, state));

+7

drivers/md/dm-vdo/funnel-workqueue.c

··· 372 372 /** 373 373 * vdo_make_work_queue() - Create a work queue; if multiple threads are requested, completions will 374 374 * be distributed to them in round-robin fashion. 375 + * @thread_name_prefix: A prefix for the thread names to identify them as a vdo thread. 376 + * @name: A base name to identify this queue. 377 + * @owner: The vdo_thread structure to manage this queue. 378 + * @type: The type of queue to create. 379 + * @thread_count: The number of actual threads handling this queue. 380 + * @thread_privates: An array of private contexts, one for each thread; may be NULL. 381 + * @queue_ptr: A pointer to return the new work queue. 375 382 * 376 383 * Each queue is associated with a struct vdo_thread which has a single vdo thread id. Regardless 377 384 * of the actual number of queues and threads allocated here, code outside of the queue

+14 -12

drivers/md/dm-vdo/io-submitter.c

··· 118 118 /** 119 119 * vdo_submit_vio() - Submits a vio's bio to the underlying block device. May block if the device 120 120 * is busy. This callback should be used by vios which did not attempt to merge. 121 + * @completion: The vio to submit. 121 122 */ 122 123 void vdo_submit_vio(struct vdo_completion *completion) 123 124 { ··· 134 133 * The list will always contain at least one entry (the bio for the vio on which it is called), but 135 134 * other bios may have been merged with it as well. 136 135 * 137 - * Return: bio The head of the bio list to submit. 136 + * Return: The head of the bio list to submit. 138 137 */ 139 138 static struct bio *get_bio_list(struct vio *vio) 140 139 { ··· 159 158 /** 160 159 * submit_data_vio() - Submit a data_vio's bio to the storage below along with 161 160 * any bios that have been merged with it. 161 + * @completion: The vio to submit. 162 162 * 163 163 * Context: This call may block and so should only be called from a bio thread. 164 164 */ ··· 186 184 * There are two types of merging possible, forward and backward, which are distinguished by a flag 187 185 * that uses kernel elevator terminology. 188 186 * 189 - * Return: the vio to merge to, NULL if no merging is possible. 187 + * Return: The vio to merge to, NULL if no merging is possible. 190 188 */ 191 189 static struct vio *get_mergeable_locked(struct int_map *map, struct vio *vio, 192 190 bool back_merge) ··· 264 262 * 265 263 * Currently this is only used for data_vios, but is broken out for future use with metadata vios. 266 264 * 267 - * Return: whether or not the vio was merged. 265 + * Return: Whether or not the vio was merged. 268 266 */ 269 267 static bool try_bio_map_merge(struct vio *vio) 270 268 { ··· 308 306 309 307 /** 310 308 * vdo_submit_data_vio() - Submit I/O for a data_vio. 311 - * @data_vio: the data_vio for which to issue I/O. 309 + * @data_vio: The data_vio for which to issue I/O. 
312 310 * 313 311 * If possible, this I/O will be merged other pending I/Os. Otherwise, the data_vio will be sent to 314 312 * the appropriate bio zone directly. ··· 323 321 324 322 /** 325 323 * __submit_metadata_vio() - Submit I/O for a metadata vio. 326 - * @vio: the vio for which to issue I/O 327 - * @physical: the physical block number to read or write 328 - * @callback: the bio endio function which will be called after the I/O completes 329 - * @error_handler: the handler for submission or I/O errors (may be NULL) 330 - * @operation: the type of I/O to perform 331 - * @data: the buffer to read or write (may be NULL) 332 - * @size: the I/O amount in bytes 324 + * @vio: The vio for which to issue I/O. 325 + * @physical: The physical block number to read or write. 326 + * @callback: The bio endio function which will be called after the I/O completes. 327 + * @error_handler: The handler for submission or I/O errors; may be NULL. 328 + * @operation: The type of I/O to perform. 329 + * @data: The buffer to read or write; may be NULL. 330 + * @size: The I/O amount in bytes. 333 331 * 334 332 * The vio is enqueued on a vdo bio queue so that bio submission (which may block) does not block 335 333 * other vdo threads. ··· 443 441 444 442 /** 445 443 * vdo_cleanup_io_submitter() - Tear down the io_submitter fields as needed for a physical layer. 446 - * @io_submitter: The I/O submitter data to tear down (may be NULL). 444 + * @io_submitter: The I/O submitter data to tear down; may be NULL. 447 445 */ 448 446 void vdo_cleanup_io_submitter(struct io_submitter *io_submitter) 449 447 {

+4 -16

drivers/md/dm-vdo/logical-zone.c

··· 159 159 vdo_finish_draining(&zone->state); 160 160 } 161 161 162 - /** 163 - * initiate_drain() - Initiate a drain. 164 - * 165 - * Implements vdo_admin_initiator_fn. 166 - */ 162 + /** Implements vdo_admin_initiator_fn. */ 167 163 static void initiate_drain(struct admin_state *state) 168 164 { 169 165 check_for_drain_complete(container_of(state, struct logical_zone, state)); 170 166 } 171 167 172 - /** 173 - * drain_logical_zone() - Drain a logical zone. 174 - * 175 - * Implements vdo_zone_action_fn. 176 - */ 168 + /** Implements vdo_zone_action_fn. */ 177 169 static void drain_logical_zone(void *context, zone_count_t zone_number, 178 170 struct vdo_completion *parent) 179 171 { ··· 184 192 parent); 185 193 } 186 194 187 - /** 188 - * resume_logical_zone() - Resume a logical zone. 189 - * 190 - * Implements vdo_zone_action_fn. 191 - */ 195 + /** Implements vdo_zone_action_fn. */ 192 196 static void resume_logical_zone(void *context, zone_count_t zone_number, 193 197 struct vdo_completion *parent) 194 198 { ··· 344 356 345 357 /** 346 358 * vdo_dump_logical_zone() - Dump information about a logical zone to the log for debugging. 347 - * @zone: The zone to dump 359 + * @zone: The zone to dump. 348 360 * 349 361 * Context: the information is dumped in a thread-unsafe fashion. 350 362 *

+6 -9

drivers/md/dm-vdo/packer.c

··· 35 35 /** 36 36 * vdo_get_compressed_block_fragment() - Get a reference to a compressed fragment from a compressed 37 37 * block. 38 - * @mapping_state [in] The mapping state for the look up. 39 - * @compressed_block [in] The compressed block that was read from disk. 40 - * @fragment_offset [out] The offset of the fragment within a compressed block. 41 - * @fragment_size [out] The size of the fragment. 38 + * @mapping_state: The mapping state describing the fragment. 39 + * @block: The compressed block that was read from disk. 40 + * @fragment_offset: The offset of the fragment within the compressed block. 41 + * @fragment_size: The size of the fragment. 42 42 * 43 43 * Return: If a valid compressed fragment is found, VDO_SUCCESS; otherwise, VDO_INVALID_FRAGMENT if 44 44 * the fragment is invalid. ··· 382 382 * @compression: The agent's compression_state to pack in to. 383 383 * @data_vio: The data_vio to pack. 384 384 * @offset: The offset into the compressed block at which to pack the fragment. 385 + * @slot: The slot number in the compressed block. 385 386 * @block: The compressed block which will be written out when batch is fully packed. 386 387 * 387 388 * Return: The new amount of space used. ··· 706 705 vdo_flush_packer(packer); 707 706 } 708 707 709 - /** 710 - * initiate_drain() - Initiate a drain. 711 - * 712 - * Implements vdo_admin_initiator_fn. 713 - */ 708 + /** Implements vdo_admin_initiator_fn. */ 714 709 static void initiate_drain(struct admin_state *state) 715 710 { 716 711 struct packer *packer = container_of(state, struct packer, state);

+3 -2

drivers/md/dm-vdo/physical-zone.c

··· 60 60 * vdo_is_pbn_read_lock() - Check whether a pbn_lock is a read lock. 61 61 * @lock: The lock to check. 62 62 * 63 - * Return: true if the lock is a read lock. 63 + * Return: True if the lock is a read lock. 64 64 */ 65 65 bool vdo_is_pbn_read_lock(const struct pbn_lock *lock) 66 66 { ··· 75 75 /** 76 76 * vdo_downgrade_pbn_write_lock() - Downgrade a PBN write lock to a PBN read lock. 77 77 * @lock: The PBN write lock to downgrade. 78 + * @compressed_write: True if the written block was a compressed block. 78 79 * 79 80 * The lock holder count is cleared and the caller is responsible for setting the new count. 80 81 */ ··· 583 582 * that fails try the next if possible. 584 583 * @data_vio: The data_vio needing an allocation. 585 584 * 586 - * Return: true if a block was allocated, if not the data_vio will have been dispatched so the 585 + * Return: True if a block was allocated, if not the data_vio will have been dispatched so the 587 586 * caller must not touch it. 588 587 */ 589 588 bool vdo_allocate_block_in_zone(struct data_vio *data_vio)

+17 -13

drivers/md/dm-vdo/recovery-journal.c

··· 109 109 * @journal: The recovery journal. 110 110 * @lock_number: The lock to check. 111 111 * 112 - * Return: true if the journal zone is locked. 112 + * Return: True if the journal zone is locked. 113 113 */ 114 114 static bool is_journal_zone_locked(struct recovery_journal *journal, 115 115 block_count_t lock_number) ··· 217 217 * Indicates it has any uncommitted entries, which includes both entries not written and entries 218 218 * written but not yet acknowledged. 219 219 * 220 - * Return: true if the block has any uncommitted entries. 220 + * Return: True if the block has any uncommitted entries. 221 221 */ 222 222 static inline bool __must_check is_block_dirty(const struct recovery_journal_block *block) 223 223 { ··· 228 228 * is_block_empty() - Check whether a journal block is empty. 229 229 * @block: The block to check. 230 230 * 231 - * Return: true if the block has no entries. 231 + * Return: True if the block has no entries. 232 232 */ 233 233 static inline bool __must_check is_block_empty(const struct recovery_journal_block *block) 234 234 { ··· 239 239 * is_block_full() - Check whether a journal block is full. 240 240 * @block: The block to check. 241 241 * 242 - * Return: true if the block is full. 242 + * Return: True if the block is full. 243 243 */ 244 244 static inline bool __must_check is_block_full(const struct recovery_journal_block *block) 245 245 { ··· 260 260 261 261 /** 262 262 * continue_waiter() - Release a data_vio from the journal. 263 + * @waiter: The data_vio waiting on journal activity. 264 + * @context: The result of the journal operation. 263 265 * 264 266 * Invoked whenever a data_vio is to be released from the journal, either because its entry was 265 267 * committed to disk, or because there was an error. Implements waiter_callback_fn. ··· 275 273 * has_block_waiters() - Check whether the journal has any waiters on any blocks. 276 274 * @journal: The journal in question. 
277 275 * 278 - * Return: true if any block has a waiter. 276 + * Return: True if any block has a waiter. 279 277 */ 280 278 static inline bool has_block_waiters(struct recovery_journal *journal) 281 279 { ··· 298 296 * suspend_lock_counter() - Prevent the lock counter from notifying. 299 297 * @counter: The counter. 300 298 * 301 - * Return: true if the lock counter was not notifying and hence the suspend was efficacious. 299 + * Return: True if the lock counter was not notifying and hence the suspend was efficacious. 302 300 */ 303 301 static bool suspend_lock_counter(struct lock_counter *counter) 304 302 { ··· 418 416 * 419 417 * The head is the lowest sequence number of the block map head and the slab journal head. 420 418 * 421 - * Return: the head of the journal. 419 + * Return: The head of the journal. 422 420 */ 423 421 static inline sequence_number_t get_recovery_journal_head(const struct recovery_journal *journal) 424 422 { ··· 537 535 * vdo_get_recovery_journal_length() - Get the number of usable recovery journal blocks. 538 536 * @journal_size: The size of the recovery journal in blocks. 539 537 * 540 - * Return: the number of recovery journal blocks usable for entries. 538 + * Return: The number of recovery journal blocks usable for entries. 541 539 */ 542 540 block_count_t vdo_get_recovery_journal_length(block_count_t journal_size) 543 541 { ··· 1080 1078 1081 1079 /** 1082 1080 * assign_entry() - Assign an entry waiter to the active block. 1081 + * @waiter: The data_vio. 1082 + * @context: The recovery journal block. 1083 1083 * 1084 1084 * Implements waiter_callback_fn. 1085 1085 */ ··· 1169 1165 /** 1170 1166 * continue_committed_waiter() - invoked whenever a VIO is to be released from the journal because 1171 1167 * its entry was committed to disk. 1168 + * @waiter: The data_vio waiting on a journal write. 1169 + * @context: A pointer to the recovery journal. 1172 1170 * 1173 1171 * Implements waiter_callback_fn. 
1174 1172 */ ··· 1368 1362 1369 1363 /** 1370 1364 * write_block() - Issue a block for writing. 1365 + * @waiter: The recovery journal block to write. 1366 + * @context: Not used. 1371 1367 * 1372 1368 * Implements waiter_callback_fn. 1373 1369 */ ··· 1619 1611 smp_mb__after_atomic(); 1620 1612 } 1621 1613 1622 - /** 1623 - * initiate_drain() - Initiate a drain. 1624 - * 1625 - * Implements vdo_admin_initiator_fn. 1626 - */ 1614 + /** Implements vdo_admin_initiator_fn. */ 1627 1615 static void initiate_drain(struct admin_state *state) 1628 1616 { 1629 1617 check_for_drain_complete(container_of(state, struct recovery_journal, state));

+55 -41

drivers/md/dm-vdo/slab-depot.c

··· 40 40 41 41 /** 42 42 * get_lock() - Get the lock object for a slab journal block by sequence number. 43 - * @journal: vdo_slab journal to retrieve from. 43 + * @journal: The vdo_slab journal to retrieve from. 44 44 * @sequence_number: Sequence number of the block. 45 45 * 46 46 * Return: The lock object for the given sequence number. ··· 110 110 * block_is_full() - Check whether a journal block is full. 111 111 * @journal: The slab journal for the block. 112 112 * 113 - * Return: true if the tail block is full. 113 + * Return: True if the tail block is full. 114 114 */ 115 115 static bool __must_check block_is_full(struct slab_journal *journal) 116 116 { ··· 127 127 128 128 /** 129 129 * is_slab_journal_blank() - Check whether a slab's journal is blank. 130 + * @slab: The slab to check. 130 131 * 131 132 * A slab journal is blank if it has never had any entries recorded in it. 132 133 * 133 - * Return: true if the slab's journal has never been modified. 134 + * Return: True if the slab's journal has never been modified. 134 135 */ 135 136 static bool is_slab_journal_blank(const struct vdo_slab *slab) 136 137 { ··· 228 227 229 228 /** 230 229 * check_summary_drain_complete() - Check whether an allocators summary has finished draining. 230 + * @allocator: The allocator to check. 231 231 */ 232 232 static void check_summary_drain_complete(struct block_allocator *allocator) 233 233 { ··· 351 349 352 350 /** 353 351 * update_slab_summary_entry() - Update the entry for a slab. 354 - * @slab: The slab whose entry is to be updated 352 + * @slab: The slab whose entry is to be updated. 355 353 * @waiter: The waiter that is updating the summary. 356 354 * @tail_block_offset: The offset of the slab journal's tail block. 357 355 * @load_ref_counts: Whether the reference counts must be loaded from disk on the vdo load. ··· 656 654 657 655 /** 658 656 * reopen_slab_journal() - Reopen a slab's journal by emptying it and then adding pending entries. 
657 + * @slab: The slab to reopen. 659 658 */ 660 659 static void reopen_slab_journal(struct vdo_slab *slab) 661 660 { ··· 842 839 * @sbn: The slab block number of the entry to encode. 843 840 * @operation: The type of the entry. 844 841 * @increment: True if this is an increment. 845 - * 846 - * Exposed for unit tests. 847 842 */ 848 843 static void encode_slab_journal_entry(struct slab_journal_block_header *tail_header, 849 844 slab_journal_payload *payload, ··· 952 951 * @parent: The completion to notify when there is space to add the entry if the entry could not be 953 952 * added immediately. 954 953 * 955 - * Return: true if the entry was added immediately. 954 + * Return: True if the entry was added immediately. 956 955 */ 957 956 bool vdo_attempt_replay_into_slab(struct vdo_slab *slab, physical_block_number_t pbn, 958 957 enum journal_operation operation, bool increment, ··· 1004 1003 * requires_reaping() - Check whether the journal must be reaped before adding new entries. 1005 1004 * @journal: The journal to check. 1006 1005 * 1007 - * Return: true if the journal must be reaped. 1006 + * Return: True if the journal must be reaped. 1008 1007 */ 1009 1008 static bool requires_reaping(const struct slab_journal *journal) 1010 1009 { ··· 1276 1275 1277 1276 /** 1278 1277 * get_reference_block() - Get the reference block that covers the given block index. 1278 + * @slab: The slab containing the references. 1279 + * @index: The index of the physical block. 1279 1280 */ 1280 1281 static struct reference_block * __must_check get_reference_block(struct vdo_slab *slab, 1281 1282 slab_block_number index) ··· 1382 1379 1383 1380 /** 1384 1381 * adjust_free_block_count() - Adjust the free block count and (if needed) reprioritize the slab. 1385 - * @incremented: true if the free block count went up. 1382 + * @slab: The slab. 1383 + * @incremented: True if the free block count went up. 
1386 1384 */ 1387 1385 static void adjust_free_block_count(struct vdo_slab *slab, bool incremented) 1388 1386 { ··· 1889 1885 /** 1890 1886 * reset_search_cursor() - Reset the free block search back to the first reference counter in the 1891 1887 * first reference block of a slab. 1888 + * @slab: The slab. 1892 1889 */ 1893 1890 static void reset_search_cursor(struct vdo_slab *slab) 1894 1891 { ··· 1897 1892 1898 1893 cursor->block = cursor->first_block; 1899 1894 cursor->index = 0; 1900 - /* Unit tests have slabs with only one reference block (and it's a runt). */ 1901 1895 cursor->end_index = min_t(u32, COUNTS_PER_BLOCK, slab->block_count); 1902 1896 } 1903 1897 1904 1898 /** 1905 1899 * advance_search_cursor() - Advance the search cursor to the start of the next reference block in 1906 - * a slab, 1900 + * a slab. 1901 + * @slab: The slab. 1907 1902 * 1908 1903 * Wraps around to the first reference block if the current block is the last reference block. 1909 1904 * 1910 - * Return: true unless the cursor was at the last reference block. 1905 + * Return: True unless the cursor was at the last reference block. 1911 1906 */ 1912 1907 static bool advance_search_cursor(struct vdo_slab *slab) 1913 1908 { ··· 1938 1933 1939 1934 /** 1940 1935 * vdo_adjust_reference_count_for_rebuild() - Adjust the reference count of a block during rebuild. 1936 + * @depot: The slab depot. 1937 + * @pbn: The physical block number to adjust. 1938 + * @operation: The type opf operation. 1941 1939 * 1942 1940 * Return: VDO_SUCCESS or an error. 1943 1941 */ ··· 2046 2038 * @slab: The slab counters to scan. 2047 2039 * @index_ptr: A pointer to hold the array index of the free block. 2048 2040 * 2049 - * Exposed for unit testing. 2050 - * 2051 - * Return: true if a free block was found in the specified range. 2041 + * Return: True if a free block was found in the specified range. 
2052 2042 */ 2053 2043 static bool find_free_block(const struct vdo_slab *slab, slab_block_number *index_ptr) 2054 2044 { ··· 2103 2097 * @slab: The slab to search. 2104 2098 * @free_index_ptr: A pointer to receive the array index of the zero reference count. 2105 2099 * 2106 - * Return: true if an unreferenced counter was found. 2100 + * Return: True if an unreferenced counter was found. 2107 2101 */ 2108 2102 static bool search_current_reference_block(const struct vdo_slab *slab, 2109 2103 slab_block_number *free_index_ptr) ··· 2122 2116 * counter index saved in the search cursor and searching up to the end of the last reference 2123 2117 * block. The search does not wrap. 2124 2118 * 2125 - * Return: true if an unreferenced counter was found. 2119 + * Return: True if an unreferenced counter was found. 2126 2120 */ 2127 2121 static bool search_reference_blocks(struct vdo_slab *slab, 2128 2122 slab_block_number *free_index_ptr) ··· 2142 2136 2143 2137 /** 2144 2138 * make_provisional_reference() - Do the bookkeeping for making a provisional reference. 2139 + * @slab: The slab. 2140 + * @block_number: The index for the physical block to reference. 2145 2141 */ 2146 2142 static void make_provisional_reference(struct vdo_slab *slab, 2147 2143 slab_block_number block_number) ··· 2163 2155 2164 2156 /** 2165 2157 * dirty_all_reference_blocks() - Mark all reference count blocks in a slab as dirty. 2158 + * @slab: The slab. 2166 2159 */ 2167 2160 static void dirty_all_reference_blocks(struct vdo_slab *slab) 2168 2161 { ··· 2182 2173 2183 2174 /** 2184 2175 * match_bytes() - Check an 8-byte word for bytes matching the value specified 2185 - * @input: A word to examine the bytes of 2186 - * @match: The byte value sought 2176 + * @input: A word to examine the bytes of. 2177 + * @match: The byte value sought. 
2187 2178 * 2188 - * Return: 1 in each byte when the corresponding input byte matched, 0 otherwise 2179 + * Return: 1 in each byte when the corresponding input byte matched, 0 otherwise. 2189 2180 */ 2190 2181 static inline u64 match_bytes(u64 input, u8 match) 2191 2182 { ··· 2200 2191 2201 2192 /** 2202 2193 * count_valid_references() - Process a newly loaded refcount array 2203 - * @counters: the array of counters from a metadata block 2194 + * @counters: The array of counters from a metadata block. 2204 2195 * 2205 - * Scan a 8-byte-aligned array of counters, fixing up any "provisional" values that weren't 2206 - * cleaned up at shutdown, changing them internally to "empty". 2196 + * Scan an 8-byte-aligned array of counters, fixing up any provisional values that 2197 + * weren't cleaned up at shutdown, changing them internally to zero. 2207 2198 * 2208 - * Return: the number of blocks that are referenced (counters not "empty") 2199 + * Return: The number of blocks with a non-zero reference count. 2209 2200 */ 2210 2201 static unsigned int count_valid_references(vdo_refcount_t *counters) 2211 2202 { ··· 2360 2351 /** 2361 2352 * load_reference_blocks() - Load a slab's reference blocks from the underlying storage into a 2362 2353 * pre-allocated reference counter. 2354 + * @slab: The slab. 2363 2355 */ 2364 2356 static void load_reference_blocks(struct vdo_slab *slab) 2365 2357 { ··· 2385 2375 2386 2376 /** 2387 2377 * drain_slab() - Drain all reference count I/O. 2378 + * @slab: The slab. 2388 2379 * 2389 2380 * Depending upon the type of drain being performed (as recorded in the ref_count's vdo_slab), the 2390 2381 * reference blocks may be loaded from disk or dirty reference blocks may be written out. ··· 2575 2564 2576 2565 /** 2577 2566 * load_slab_journal() - Load a slab's journal by reading the journal's tail. 2567 + * @slab: The slab. 
2578 2568 */ 2579 2569 static void load_slab_journal(struct vdo_slab *slab) 2580 2570 { ··· 2675 2663 prioritize_slab(slab); 2676 2664 } 2677 2665 2678 - /** 2679 - * initiate_slab_action() - Initiate a slab action. 2680 - * 2681 - * Implements vdo_admin_initiator_fn. 2682 - */ 2666 + /** Implements vdo_admin_initiator_fn. */ 2683 2667 static void initiate_slab_action(struct admin_state *state) 2684 2668 { 2685 2669 struct vdo_slab *slab = container_of(state, struct vdo_slab, state); ··· 2728 2720 * has_slabs_to_scrub() - Check whether a scrubber has slabs to scrub. 2729 2721 * @scrubber: The scrubber to check. 2730 2722 * 2731 - * Return: true if the scrubber has slabs to scrub. 2723 + * Return: True if the scrubber has slabs to scrub. 2732 2724 */ 2733 2725 static inline bool __must_check has_slabs_to_scrub(struct slab_scrubber *scrubber) 2734 2726 { ··· 2749 2741 * finish_scrubbing() - Stop scrubbing, either because there are no more slabs to scrub or because 2750 2742 * there's been an error. 2751 2743 * @scrubber: The scrubber. 2744 + * @result: The result of the scrubbing operation. 2752 2745 */ 2753 2746 static void finish_scrubbing(struct slab_scrubber *scrubber, int result) 2754 2747 { ··· 3141 3132 3142 3133 /** 3143 3134 * abort_waiter() - Abort vios waiting to make journal entries when read-only. 3135 + * @waiter: A waiting data_vio. 3136 + * @context: Not used. 3144 3137 * 3145 3138 * This callback is invoked on all vios waiting to make slab journal entries after the VDO has gone 3146 3139 * into read-only mode. Implements waiter_callback_fn. 
3147 3140 */ 3148 - static void abort_waiter(struct vdo_waiter *waiter, void *context __always_unused) 3141 + static void abort_waiter(struct vdo_waiter *waiter, void __always_unused *context) 3149 3142 { 3150 3143 struct reference_updater *updater = 3151 3144 container_of(waiter, struct reference_updater, waiter); ··· 3547 3536 /** 3548 3537 * vdo_notify_slab_journals_are_recovered() - Inform a block allocator that its slab journals have 3549 3538 * been recovered from the recovery journal. 3550 - * @completion The allocator completion 3539 + * @completion: The allocator completion. 3551 3540 */ 3552 3541 void vdo_notify_slab_journals_are_recovered(struct vdo_completion *completion) 3553 3542 { ··· 3786 3775 * in the slab. 3787 3776 * @allocator: The block allocator to which the slab belongs. 3788 3777 * @slab_number: The slab number of the slab. 3789 - * @is_new: true if this slab is being allocated as part of a resize. 3778 + * @is_new: True if this slab is being allocated as part of a resize. 3790 3779 * @slab_ptr: A pointer to receive the new slab. 3791 3780 * 3792 3781 * Return: VDO_SUCCESS or an error code. ··· 3905 3894 vdo_free(vdo_forget(depot->new_slabs)); 3906 3895 } 3907 3896 3908 - /** 3909 - * get_allocator_thread_id() - Get the ID of the thread on which a given allocator operates. 3910 - * 3911 - * Implements vdo_zone_thread_getter_fn. 3912 - */ 3897 + /** Implements vdo_zone_thread_getter_fn. */ 3913 3898 static thread_id_t get_allocator_thread_id(void *context, zone_count_t zone_number) 3914 3899 { 3915 3900 return ((struct slab_depot *) context)->allocators[zone_number].thread_id; ··· 3918 3911 * @recovery_lock: The sequence number of the recovery journal block whose locks should be 3919 3912 * released. 3920 3913 * 3921 - * Return: true if the journal does hold a lock on the specified block (which it will release). 3914 + * Return: True if the journal released a lock on the specified block. 
3922 3915 */ 3923 3916 static bool __must_check release_recovery_journal_lock(struct slab_journal *journal, 3924 3917 sequence_number_t recovery_lock) ··· 3962 3955 3963 3956 /** 3964 3957 * prepare_for_tail_block_commit() - Prepare to commit oldest tail blocks. 3958 + * @context: The slab depot. 3959 + * @parent: The parent operation. 3965 3960 * 3966 3961 * Implements vdo_action_preamble_fn. 3967 3962 */ ··· 3977 3968 3978 3969 /** 3979 3970 * schedule_tail_block_commit() - Schedule a tail block commit if necessary. 3971 + * @context: The slab depot. 3980 3972 * 3981 3973 * This method should not be called directly. Rather, call vdo_schedule_default_action() on the 3982 3974 * depot's action manager. ··· 4371 4361 4372 4362 /** 4373 4363 * vdo_allocate_reference_counters() - Allocate the reference counters for all slabs in the depot. 4364 + * @depot: The slab depot. 4374 4365 * 4375 4366 * Context: This method may be called only before entering normal operation from the load thread. 4376 4367 * ··· 4626 4615 } 4627 4616 4628 4617 /** 4629 - * load_slab_summary() - The preamble of a load operation. 4618 + * load_slab_summary() - Load the slab summary before the slab data. 4619 + * @context: The slab depot. 4620 + * @parent: The load operation. 4630 4621 * 4631 4622 * Implements vdo_action_preamble_fn. 4632 4623 */ ··· 4744 4731 * vdo_prepare_to_grow_slab_depot() - Allocate new memory needed for a resize of a slab depot to 4745 4732 * the given size. 4746 4733 * @depot: The depot to prepare to resize. 4747 - * @partition: The new depot partition 4734 + * @partition: The new depot partition. 4748 4735 * 4749 4736 * Return: VDO_SUCCESS or an error. 4750 4737 */ ··· 4794 4781 /** 4795 4782 * finish_registration() - Finish registering new slabs now that all of the allocators have 4796 4783 * received their new slabs. 4784 + * @context: The slab depot. 4797 4785 * 4798 4786 * Implements vdo_action_conclusion_fn. 4799 4787 */

+6 -3

drivers/md/dm-vdo/vdo.c

···
 /**
  * initialize_thread_config() - Initialize the thread mapping
+ * @counts: The number and types of threads to create.
+ * @config: The thread_config to initialize.
  *
  * If the logical, physical, and hash zone counts are all 0, a single thread will be shared by all
  * three plus the packer and recovery journal. Otherwise, there must be at least one of each type,
···
 /**
  * record_vdo() - Record the state of the VDO for encoding in the super block.
+ * @vdo: The vdo.
  */
 static void record_vdo(struct vdo *vdo)
 {
···
  * vdo_is_read_only() - Check whether the VDO is read-only.
  * @vdo: The vdo.
  *
- * Return: true if the vdo is read-only.
+ * Return: True if the vdo is read-only.
  *
  * This method may be called from any thread, as opposed to examining the VDO's state field which
  * is only safe to check from the admin thread.
···
  * vdo_in_read_only_mode() - Check whether a vdo is in read-only mode.
  * @vdo: The vdo to query.
  *
- * Return: true if the vdo is in read-only mode.
+ * Return: True if the vdo is in read-only mode.
  */
 bool vdo_in_read_only_mode(const struct vdo *vdo)
 {
···
  * vdo_in_recovery_mode() - Check whether the vdo is in recovery mode.
  * @vdo: The vdo to query.
  *
- * Return: true if the vdo is in recovery mode.
+ * Return: True if the vdo is in recovery mode.
  */
 bool vdo_in_recovery_mode(const struct vdo *vdo)
 {

+3 -1

drivers/md/dm-vdo/vdo.h

···
 /**
  * typedef vdo_filter_fn - Method type for vdo matching methods.
+ * @vdo: The vdo to match.
+ * @context: A parameter for the filter to use.
  *
- * A filter function returns false if the vdo doesn't match.
+ * Return: True if the vdo matches the filter criteria, false if it doesn't.
  */
 typedef bool (*vdo_filter_fn)(struct vdo *vdo, const void *context);

+2 -1

drivers/md/dm-vdo/vio.c

···
 /**
  * is_vio_pool_busy() - Check whether an vio pool has outstanding entries.
+ * @pool: The vio pool.
  *
- * Return: true if the pool is busy.
+ * Return: True if the pool is busy.
  */
 bool is_vio_pool_busy(struct vio_pool *pool)
 {

+4 -2

drivers/md/dm-vdo/vio.h

···
 /**
  * continue_vio() - Enqueue a vio to run its next callback.
  * @vio: The vio to continue.
- *
- * Return: The result of the current operation.
+ * @result: The result of the current operation.
  */
 static inline void continue_vio(struct vio *vio, int result)
 {
···
 /**
  * continue_vio_after_io() - Continue a vio now that its I/O has returned.
+ * @vio: The vio to continue.
+ * @callback: The next operation for this vio.
+ * @thread: Which thread to run the next operation on.
  */
 static inline void continue_vio_after_io(struct vio *vio, vdo_action_fn callback,
 					 thread_id_t thread)

+17 -24

drivers/md/dm-verity-fec.c

··· 177 177 if (r < 0 && neras) 178 178 DMERR_LIMIT("%s: FEC %llu: failed to correct: %d", 179 179 v->data_dev->name, (unsigned long long)rsb, r); 180 - else if (r > 0) 180 + else if (r > 0) { 181 181 DMWARN_LIMIT("%s: FEC %llu: corrected %d errors", 182 182 v->data_dev->name, (unsigned long long)rsb, r); 183 + atomic64_inc(&v->fec->corrected); 184 + } 183 185 184 186 return r; 185 187 } ··· 190 188 * Locate data block erasures using verity hashes. 191 189 */ 192 190 static int fec_is_erasure(struct dm_verity *v, struct dm_verity_io *io, 193 - u8 *want_digest, u8 *data) 191 + const u8 *want_digest, const u8 *data) 194 192 { 195 193 if (unlikely(verity_hash(v, io, data, 1 << v->data_dev_block_bits, 196 - verity_io_real_digest(v, io)))) 194 + io->tmp_digest))) 197 195 return 0; 198 196 199 - return memcmp(verity_io_real_digest(v, io), want_digest, 200 - v->digest_size) != 0; 197 + return memcmp(io->tmp_digest, want_digest, v->digest_size) != 0; 201 198 } 202 199 203 200 /* ··· 329 328 if (fio->bufs[n]) 330 329 continue; 331 330 332 - fio->bufs[n] = mempool_alloc(&v->fec->extra_pool, GFP_NOWAIT); 331 + fio->bufs[n] = kmem_cache_alloc(v->fec->cache, GFP_NOWAIT); 333 332 /* we can manage with even one buffer if necessary */ 334 333 if (unlikely(!fio->bufs[n])) 335 334 break; ··· 363 362 */ 364 363 static int fec_decode_rsb(struct dm_verity *v, struct dm_verity_io *io, 365 364 struct dm_verity_fec_io *fio, u64 rsb, u64 offset, 366 - bool use_erasures) 365 + const u8 *want_digest, bool use_erasures) 367 366 { 368 367 int r, neras = 0; 369 368 unsigned int pos; ··· 389 388 390 389 /* Always re-validate the corrected block against the expected hash */ 391 390 r = verity_hash(v, io, fio->output, 1 << v->data_dev_block_bits, 392 - verity_io_real_digest(v, io)); 391 + io->tmp_digest); 393 392 if (unlikely(r < 0)) 394 393 return r; 395 394 396 - if (memcmp(verity_io_real_digest(v, io), verity_io_want_digest(v, io), 397 - v->digest_size)) { 395 + if (memcmp(io->tmp_digest, 
want_digest, v->digest_size)) { 398 396 DMERR_LIMIT("%s: FEC %llu: failed to correct (%d erasures)", 399 397 v->data_dev->name, (unsigned long long)rsb, neras); 400 398 return -EILSEQ; ··· 404 404 405 405 /* Correct errors in a block. Copies corrected block to dest. */ 406 406 int verity_fec_decode(struct dm_verity *v, struct dm_verity_io *io, 407 - enum verity_block_type type, sector_t block, u8 *dest) 407 + enum verity_block_type type, const u8 *want_digest, 408 + sector_t block, u8 *dest) 408 409 { 409 410 int r; 410 411 struct dm_verity_fec_io *fio = fec_io(io); ··· 414 413 if (!verity_fec_is_enabled(v)) 415 414 return -EOPNOTSUPP; 416 415 417 - if (fio->level >= DM_VERITY_FEC_MAX_RECURSION) { 418 - DMWARN_LIMIT("%s: FEC: recursion too deep", v->data_dev->name); 416 + if (fio->level) 419 417 return -EIO; 420 - } 421 418 422 419 fio->level++; 423 420 ··· 446 447 * them first. Do a second attempt with erasures if the corruption is 447 448 * bad enough. 448 449 */ 449 - r = fec_decode_rsb(v, io, fio, rsb, offset, false); 450 + r = fec_decode_rsb(v, io, fio, rsb, offset, want_digest, false); 450 451 if (r < 0) { 451 - r = fec_decode_rsb(v, io, fio, rsb, offset, true); 452 + r = fec_decode_rsb(v, io, fio, rsb, offset, want_digest, true); 452 453 if (r < 0) 453 454 goto done; 454 455 } ··· 478 479 mempool_free(fio->bufs[n], &f->prealloc_pool); 479 480 480 481 fec_for_each_extra_buffer(fio, n) 481 - mempool_free(fio->bufs[n], &f->extra_pool); 482 + if (fio->bufs[n]) 483 + kmem_cache_free(f->cache, fio->bufs[n]); 482 484 483 485 mempool_free(fio->output, &f->output_pool); 484 486 } ··· 531 531 532 532 mempool_exit(&f->rs_pool); 533 533 mempool_exit(&f->prealloc_pool); 534 - mempool_exit(&f->extra_pool); 535 534 mempool_exit(&f->output_pool); 536 535 kmem_cache_destroy(f->cache); 537 536 ··· 780 781 f->cache); 781 782 if (ret) { 782 783 ti->error = "Cannot allocate FEC buffer prealloc pool"; 783 - return ret; 784 - } 785 - 786 - ret = 
mempool_init_slab_pool(&f->extra_pool, 0, f->cache); 787 - if (ret) { 788 - ti->error = "Cannot allocate FEC buffer extra pool"; 789 784 return ret; 790 785 } 791 786
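The control flow that verity_fec_decode() ends up with after this change (refuse re-entry, try a plain decode first, retry with erasures, count corrections) can be sketched in user space. This is only an illustration under stated assumptions: `struct fec_state`, `decode_fn`, `fec_decode_sketch` and `needs_erasures` are hypothetical names, the counter is a plain `long` rather than an `atomic64_t`, and the sketch bumps the counter once per successful decode instead of inside the per-buffer decode loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-I/O and per-target FEC state. */
struct fec_state {
	int level;	/* non-zero while a decode is already in progress */
	long corrected;	/* number of decodes that corrected at least one error */
};

/*
 * Hypothetical decoder callback: returns the number of corrected errors
 * (>= 0) or a negative errno value.
 */
typedef int (*decode_fn)(void *ctx, bool use_erasures);

static int fec_decode_sketch(struct fec_state *st, decode_fn decode, void *ctx)
{
	int r;

	assert(st != NULL && decode != NULL);

	/* The patch replaces the depth limit of 4 with "no recursion at all". */
	if (st->level)
		return -EIO;
	st->level++;

	/* First attempt without erasures, second attempt with them. */
	r = decode(ctx, false);
	if (r < 0)
		r = decode(ctx, true);

	if (r > 0)
		st->corrected++;

	st->level--;
	return r < 0 ? r : 0;
}

/* Example decoder: only succeeds when erasure locations are supplied. */
static int needs_erasures(void *ctx, bool use_erasures)
{
	int *calls = ctx;

	(*calls)++;
	return use_erasures ? 3 : -EBADMSG;
}
```

Passing `needs_erasures` shows the fallback path: the first call fails, the second succeeds, and the correction counter goes up by one.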

+4 -6

drivers/md/dm-verity-fec.h

···
 #define DM_VERITY_FEC_BUF_MAX \
 	(1 << (PAGE_SHIFT - DM_VERITY_FEC_BUF_RS_BITS))

-/* maximum recursion level for verity_fec_decode */
-#define DM_VERITY_FEC_MAX_RECURSION 4
-
 #define DM_VERITY_OPT_FEC_DEV "use_fec_from_device"
 #define DM_VERITY_OPT_FEC_BLOCKS "fec_blocks"
 #define DM_VERITY_OPT_FEC_START "fec_start"
···
 	unsigned char rsn; /* N of RS(M, N) */
 	mempool_t rs_pool; /* mempool for fio->rs */
 	mempool_t prealloc_pool; /* mempool for preallocated buffers */
-	mempool_t extra_pool; /* mempool for extra buffers */
 	mempool_t output_pool; /* mempool for output */
 	struct kmem_cache *cache; /* cache for buffers */
+	atomic64_t corrected; /* corrected errors */
 };

 /* per-bio data */
···
 extern bool verity_fec_is_enabled(struct dm_verity *v);

 extern int verity_fec_decode(struct dm_verity *v, struct dm_verity_io *io,
-			     enum verity_block_type type, sector_t block,
-			     u8 *dest);
+			     enum verity_block_type type, const u8 *want_digest,
+			     sector_t block, u8 *dest);

 extern unsigned int verity_fec_status_table(struct dm_verity *v, unsigned int sz,
 					char *result, unsigned int maxlen);
···
 static inline int verity_fec_decode(struct dm_verity *v,
 				    struct dm_verity_io *io,
 				    enum verity_block_type type,
+				    const u8 *want_digest,
 				    sector_t block, u8 *dest)
 {
 	return -EOPNOTSUPP;

+154 -55

drivers/md/dm-verity-target.c

··· 117 117 int verity_hash(struct dm_verity *v, struct dm_verity_io *io, 118 118 const u8 *data, size_t len, u8 *digest) 119 119 { 120 - struct shash_desc *desc = &io->hash_desc; 120 + struct shash_desc *desc; 121 121 int r; 122 122 123 + if (likely(v->use_sha256_lib)) { 124 + struct sha256_ctx *ctx = &io->hash_ctx.sha256; 125 + 126 + /* 127 + * Fast path using SHA-256 library. This is enabled only for 128 + * verity version 1, where the salt is at the beginning. 129 + */ 130 + *ctx = *v->initial_hashstate.sha256; 131 + sha256_update(ctx, data, len); 132 + sha256_final(ctx, digest); 133 + return 0; 134 + } 135 + 136 + desc = &io->hash_ctx.shash; 123 137 desc->tfm = v->shash_tfm; 124 - if (unlikely(v->initial_hashstate == NULL)) { 138 + if (unlikely(v->initial_hashstate.shash == NULL)) { 125 139 /* Version 0: salt at end */ 126 140 r = crypto_shash_init(desc) ?: 127 141 crypto_shash_update(desc, data, len) ?: ··· 143 129 crypto_shash_final(desc, digest); 144 130 } else { 145 131 /* Version 1: salt at beginning */ 146 - r = crypto_shash_import(desc, v->initial_hashstate) ?: 132 + r = crypto_shash_import(desc, v->initial_hashstate.shash) ?: 147 133 crypto_shash_finup(desc, data, len, digest); 148 134 } 149 135 if (unlikely(r)) ··· 229 215 * Verify hash of a metadata block pertaining to the specified data block 230 216 * ("block" argument) at a specified level ("level" argument). 231 217 * 232 - * On successful return, verity_io_want_digest(v, io) contains the hash value 233 - * for a lower tree level or for the data block (if we're at the lowest level). 218 + * On successful return, want_digest contains the hash value for a lower tree 219 + * level or for the data block (if we're at the lowest level). 234 220 * 235 221 * If "skip_unverified" is true, unverified buffer is skipped and 1 is returned. 236 222 * If "skip_unverified" is false, unverified buffer is hashed and verified 237 - * against current value of verity_io_want_digest(v, io). 
223 + * against current value of want_digest. 238 224 */ 239 225 static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io, 240 226 sector_t block, int level, bool skip_unverified, ··· 273 259 if (IS_ERR(data)) 274 260 return r; 275 261 if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_METADATA, 276 - hash_block, data) == 0) { 262 + want_digest, hash_block, data) == 0) { 277 263 aux = dm_bufio_get_aux_data(buf); 278 264 aux->hash_verified = 1; 279 265 goto release_ok; ··· 293 279 } 294 280 295 281 r = verity_hash(v, io, data, 1 << v->hash_dev_block_bits, 296 - verity_io_real_digest(v, io)); 282 + io->tmp_digest); 297 283 if (unlikely(r < 0)) 298 284 goto release_ret_r; 299 285 300 - if (likely(memcmp(verity_io_real_digest(v, io), want_digest, 286 + if (likely(memcmp(io->tmp_digest, want_digest, 301 287 v->digest_size) == 0)) 302 288 aux->hash_verified = 1; 303 289 else if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) { ··· 308 294 r = -EAGAIN; 309 295 goto release_ret_r; 310 296 } else if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_METADATA, 311 - hash_block, data) == 0) 297 + want_digest, hash_block, data) == 0) 312 298 aux->hash_verified = 1; 313 299 else if (verity_handle_err(v, 314 300 DM_VERITY_BLOCK_TYPE_METADATA, ··· 372 358 } 373 359 374 360 static noinline int verity_recheck(struct dm_verity *v, struct dm_verity_io *io, 375 - sector_t cur_block, u8 *dest) 361 + const u8 *want_digest, sector_t cur_block, 362 + u8 *dest) 376 363 { 377 364 struct page *page; 378 365 void *buffer; ··· 397 382 goto free_ret; 398 383 399 384 r = verity_hash(v, io, buffer, 1 << v->data_dev_block_bits, 400 - verity_io_real_digest(v, io)); 385 + io->tmp_digest); 401 386 if (unlikely(r)) 402 387 goto free_ret; 403 388 404 - if (memcmp(verity_io_real_digest(v, io), 405 - verity_io_want_digest(v, io), v->digest_size)) { 389 + if (memcmp(io->tmp_digest, want_digest, v->digest_size)) { 406 390 r = -EIO; 407 391 goto free_ret; 408 392 } ··· 416 402 417 
403 static int verity_handle_data_hash_mismatch(struct dm_verity *v, 418 404 struct dm_verity_io *io, 419 - struct bio *bio, sector_t blkno, 420 - u8 *data) 405 + struct bio *bio, 406 + struct pending_block *block) 421 407 { 408 + const u8 *want_digest = block->want_digest; 409 + sector_t blkno = block->blkno; 410 + u8 *data = block->data; 411 + 422 412 if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) { 423 413 /* 424 414 * Error handling code (FEC included) cannot be run in the ··· 430 412 */ 431 413 return -EAGAIN; 432 414 } 433 - if (verity_recheck(v, io, blkno, data) == 0) { 415 + if (verity_recheck(v, io, want_digest, blkno, data) == 0) { 434 416 if (v->validated_blocks) 435 417 set_bit(blkno, v->validated_blocks); 436 418 return 0; 437 419 } 438 420 #if defined(CONFIG_DM_VERITY_FEC) 439 - if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_DATA, blkno, 440 - data) == 0) 421 + if (verity_fec_decode(v, io, DM_VERITY_BLOCK_TYPE_DATA, want_digest, 422 + blkno, data) == 0) 441 423 return 0; 442 424 #endif 443 425 if (bio->bi_status) ··· 451 433 return 0; 452 434 } 453 435 436 + static void verity_clear_pending_blocks(struct dm_verity_io *io) 437 + { 438 + int i; 439 + 440 + for (i = io->num_pending - 1; i >= 0; i--) { 441 + kunmap_local(io->pending_blocks[i].data); 442 + io->pending_blocks[i].data = NULL; 443 + } 444 + io->num_pending = 0; 445 + } 446 + 447 + static int verity_verify_pending_blocks(struct dm_verity *v, 448 + struct dm_verity_io *io, 449 + struct bio *bio) 450 + { 451 + const unsigned int block_size = 1 << v->data_dev_block_bits; 452 + int i, r; 453 + 454 + if (io->num_pending == 2) { 455 + /* num_pending == 2 implies that the algorithm is SHA-256 */ 456 + sha256_finup_2x(v->initial_hashstate.sha256, 457 + io->pending_blocks[0].data, 458 + io->pending_blocks[1].data, block_size, 459 + io->pending_blocks[0].real_digest, 460 + io->pending_blocks[1].real_digest); 461 + } else { 462 + for (i = 0; i < io->num_pending; i++) { 463 + r = 
verity_hash(v, io, io->pending_blocks[i].data, 464 + block_size, 465 + io->pending_blocks[i].real_digest); 466 + if (unlikely(r)) 467 + return r; 468 + } 469 + } 470 + 471 + for (i = 0; i < io->num_pending; i++) { 472 + struct pending_block *block = &io->pending_blocks[i]; 473 + 474 + if (likely(memcmp(block->real_digest, block->want_digest, 475 + v->digest_size) == 0)) { 476 + if (v->validated_blocks) 477 + set_bit(block->blkno, v->validated_blocks); 478 + } else { 479 + r = verity_handle_data_hash_mismatch(v, io, bio, block); 480 + if (unlikely(r)) 481 + return r; 482 + } 483 + } 484 + verity_clear_pending_blocks(io); 485 + return 0; 486 + } 487 + 454 488 /* 455 489 * Verify one "dm_verity_io" structure. 456 490 */ ··· 510 440 { 511 441 struct dm_verity *v = io->v; 512 442 const unsigned int block_size = 1 << v->data_dev_block_bits; 443 + const int max_pending = v->use_sha256_finup_2x ? 2 : 1; 513 444 struct bvec_iter iter_copy; 514 445 struct bvec_iter *iter; 515 446 struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size); 516 447 unsigned int b; 448 + int r; 449 + 450 + io->num_pending = 0; 517 451 518 452 if (static_branch_unlikely(&use_bh_wq_enabled) && io->in_bh) { 519 453 /* ··· 531 457 532 458 for (b = 0; b < io->n_blocks; 533 459 b++, bio_advance_iter(bio, iter, block_size)) { 534 - int r; 535 - sector_t cur_block = io->block + b; 460 + sector_t blkno = io->block + b; 461 + struct pending_block *block; 536 462 bool is_zero; 537 463 struct bio_vec bv; 538 464 void *data; 539 465 540 466 if (v->validated_blocks && bio->bi_status == BLK_STS_OK && 541 - likely(test_bit(cur_block, v->validated_blocks))) 467 + likely(test_bit(blkno, v->validated_blocks))) 542 468 continue; 543 469 544 - r = verity_hash_for_block(v, io, cur_block, 545 - verity_io_want_digest(v, io), 470 + block = &io->pending_blocks[io->num_pending]; 471 + 472 + r = verity_hash_for_block(v, io, blkno, block->want_digest, 546 473 &is_zero); 547 474 if (unlikely(r < 0)) 548 - 
return r; 475 + goto error; 549 476 550 477 bv = bio_iter_iovec(bio, *iter); 551 478 if (unlikely(bv.bv_len < block_size)) { ··· 557 482 * data block size to be greater than PAGE_SIZE. 558 483 */ 559 484 DMERR_LIMIT("unaligned io (data block spans pages)"); 560 - return -EIO; 485 + r = -EIO; 486 + goto error; 561 487 } 562 488 563 489 data = bvec_kmap_local(&bv); ··· 572 496 kunmap_local(data); 573 497 continue; 574 498 } 575 - 576 - r = verity_hash(v, io, data, block_size, 577 - verity_io_real_digest(v, io)); 578 - if (unlikely(r < 0)) { 579 - kunmap_local(data); 580 - return r; 499 + block->data = data; 500 + block->blkno = blkno; 501 + if (++io->num_pending == max_pending) { 502 + r = verity_verify_pending_blocks(v, io, bio); 503 + if (unlikely(r)) 504 + goto error; 581 505 } 506 + } 582 507 583 - if (likely(memcmp(verity_io_real_digest(v, io), 584 - verity_io_want_digest(v, io), v->digest_size) == 0)) { 585 - if (v->validated_blocks) 586 - set_bit(cur_block, v->validated_blocks); 587 - kunmap_local(data); 588 - continue; 589 - } 590 - r = verity_handle_data_hash_mismatch(v, io, bio, cur_block, 591 - data); 592 - kunmap_local(data); 508 + if (io->num_pending) { 509 + r = verity_verify_pending_blocks(v, io, bio); 593 510 if (unlikely(r)) 594 - return r; 511 + goto error; 595 512 } 596 513 597 514 return 0; 515 + 516 + error: 517 + verity_clear_pending_blocks(io); 518 + return r; 598 519 } 599 520 600 521 /* ··· 848 775 switch (type) { 849 776 case STATUSTYPE_INFO: 850 777 DMEMIT("%c", v->hash_failed ? 
'C' : 'V'); 778 + if (verity_fec_is_enabled(v)) 779 + DMEMIT(" %lld", atomic64_read(&v->fec->corrected)); 780 + else 781 + DMEMIT(" -"); 851 782 break; 852 783 case STATUSTYPE_TABLE: 853 784 DMEMIT("%u %s %s %u %u %llu %llu %s ", ··· 1081 1004 1082 1005 kvfree(v->validated_blocks); 1083 1006 kfree(v->salt); 1084 - kfree(v->initial_hashstate); 1007 + kfree(v->initial_hashstate.shash); 1085 1008 kfree(v->root_digest); 1086 1009 kfree(v->zero_digest); 1087 1010 verity_free_sig(v); ··· 1146 1069 if (!v->zero_digest) 1147 1070 return r; 1148 1071 1149 - io = kmalloc(sizeof(*io) + crypto_shash_descsize(v->shash_tfm), 1150 - GFP_KERNEL); 1072 + io = kmalloc(v->ti->per_io_data_size, GFP_KERNEL); 1151 1073 1152 1074 if (!io) 1153 1075 return r; /* verity_dtr will free zero_digest */ ··· 1328 1252 } 1329 1253 v->shash_tfm = shash; 1330 1254 v->digest_size = crypto_shash_digestsize(shash); 1331 - DMINFO("%s using \"%s\"", alg_name, crypto_shash_driver_name(shash)); 1332 1255 if ((1 << v->hash_dev_block_bits) < v->digest_size * 2) { 1333 1256 ti->error = "Digest size too big"; 1334 1257 return -EINVAL; 1258 + } 1259 + if (likely(v->version && strcmp(alg_name, "sha256") == 0)) { 1260 + /* 1261 + * Fast path: use the library API for reduced overhead and 1262 + * interleaved hashing support. 1263 + */ 1264 + v->use_sha256_lib = true; 1265 + if (sha256_finup_2x_is_optimized()) 1266 + v->use_sha256_finup_2x = true; 1267 + ti->per_io_data_size = 1268 + offsetofend(struct dm_verity_io, hash_ctx.sha256); 1269 + } else { 1270 + /* Fallback case: use the generic crypto API. 
*/ 1271 + ti->per_io_data_size = 1272 + offsetofend(struct dm_verity_io, hash_ctx.shash) + 1273 + crypto_shash_descsize(shash); 1335 1274 } 1336 1275 return 0; 1337 1276 } ··· 1368 1277 return -EINVAL; 1369 1278 } 1370 1279 } 1371 - if (v->version) { /* Version 1: salt at beginning */ 1280 + if (likely(v->use_sha256_lib)) { 1281 + /* Implies version 1: salt at beginning */ 1282 + v->initial_hashstate.sha256 = 1283 + kmalloc(sizeof(struct sha256_ctx), GFP_KERNEL); 1284 + if (!v->initial_hashstate.sha256) { 1285 + ti->error = "Cannot allocate initial hash state"; 1286 + return -ENOMEM; 1287 + } 1288 + sha256_init(v->initial_hashstate.sha256); 1289 + sha256_update(v->initial_hashstate.sha256, 1290 + v->salt, v->salt_size); 1291 + } else if (v->version) { /* Version 1: salt at beginning */ 1372 1292 SHASH_DESC_ON_STACK(desc, v->shash_tfm); 1373 1293 int r; 1374 1294 ··· 1387 1285 * Compute the pre-salted hash state that can be passed to 1388 1286 * crypto_shash_import() for each block later. 
1389 1287 */ 1390 - v->initial_hashstate = kmalloc( 1288 + v->initial_hashstate.shash = kmalloc( 1391 1289 crypto_shash_statesize(v->shash_tfm), GFP_KERNEL); 1392 - if (!v->initial_hashstate) { 1290 + if (!v->initial_hashstate.shash) { 1393 1291 ti->error = "Cannot allocate initial hash state"; 1394 1292 return -ENOMEM; 1395 1293 } 1396 1294 desc->tfm = v->shash_tfm; 1397 1295 r = crypto_shash_init(desc) ?: 1398 1296 crypto_shash_update(desc, v->salt, v->salt_size) ?: 1399 - crypto_shash_export(desc, v->initial_hashstate); 1297 + crypto_shash_export(desc, v->initial_hashstate.shash); 1400 1298 if (r) { 1401 1299 ti->error = "Cannot set up initial hash state"; 1402 1300 return r; ··· 1658 1556 goto bad; 1659 1557 } 1660 1558 1661 - ti->per_io_data_size = sizeof(struct dm_verity_io) + 1662 - crypto_shash_descsize(v->shash_tfm); 1663 - 1664 1559 r = verity_fec_ctr(v); 1665 1560 if (r) 1666 1561 goto bad; ··· 1789 1690 .name = "verity", 1790 1691 /* Note: the LSMs depend on the singleton and immutable features */ 1791 1692 .features = DM_TARGET_SINGLETON | DM_TARGET_IMMUTABLE, 1792 - .version = {1, 12, 0}, 1693 + .version = {1, 13, 0}, 1793 1694 .module = THIS_MODULE, 1794 1695 .ctr = verity_ctr, 1795 1696 .dtr = verity_dtr,
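The queueing that this diff adds to verity_verify_io() (collect up to `max_pending` blocks, verify them together, then drain whatever is left at the end) can be sketched outside the kernel. This is a minimal illustration, not the kernel code: `toy_hash`, `struct pending`, `struct verifier`, `flush_pending` and `verify_all` are hypothetical names, and the toy FNV-1a checksum merely stands in for SHA-256 and for `sha256_finup_2x()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 2	/* mirrors pending_blocks[2] in the patch */

/* Hypothetical stand-in for a real digest (FNV-1a); NOT SHA-256. */
static uint32_t toy_hash(const uint8_t *data, size_t len)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= data[i];
		h *= 16777619u;
	}
	return h;
}

struct pending {
	const uint8_t *data;
	uint32_t want;	/* expected digest of this block */
};

struct verifier {
	struct pending queue[MAX_PENDING];
	int num_pending;
	int max_pending;	/* 2 if interleaved hashing is available, else 1 */
	int flushes;		/* how many times the queue was drained */
};

/* Hash every queued block, compare with the expected digest, empty the queue. */
static int flush_pending(struct verifier *v, size_t block_size)
{
	int i, r = 0;

	for (i = 0; i < v->num_pending; i++) {
		if (toy_hash(v->queue[i].data, block_size) != v->queue[i].want)
			r = -1;
	}
	v->num_pending = 0;
	v->flushes++;
	return r;
}

/* Returns 0 if every block matches its expected digest, -1 otherwise. */
static int verify_all(struct verifier *v, const uint8_t *blocks,
		      const uint32_t *want, size_t n_blocks, size_t block_size)
{
	size_t b;

	assert(v->max_pending >= 1 && v->max_pending <= MAX_PENDING);
	v->num_pending = 0;
	v->flushes = 0;

	for (b = 0; b < n_blocks; b++) {
		struct pending *p = &v->queue[v->num_pending];

		p->data = blocks + b * block_size;
		p->want = want[b];
		/* Verify as soon as the queue is full. */
		if (++v->num_pending == v->max_pending) {
			if (flush_pending(v, block_size))
				return -1;
		}
	}

	/* Drain a partially filled queue, e.g. after an odd number of blocks. */
	if (v->num_pending && flush_pending(v, block_size))
		return -1;
	return 0;
}
```

With `max_pending` set to 2, three blocks are verified in two batches; with it set to 1 the loop degenerates to the old one-block-at-a-time behaviour.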

+33 -19

drivers/md/dm-verity.h

···
 #include <linux/device-mapper.h>
 #include <linux/interrupt.h>
 #include <crypto/hash.h>
+#include <crypto/sha2.h>

 #define DM_VERITY_MAX_LEVELS 63

···
 	struct crypto_shash *shash_tfm;
 	u8 *root_digest; /* digest of the root block */
 	u8 *salt; /* salt: its size is salt_size */
-	u8 *initial_hashstate; /* salted initial state, if version >= 1 */
+	union {
+		struct sha256_ctx *sha256; /* for use_sha256_lib=1 */
+		u8 *shash; /* for use_sha256_lib=0 */
+	} initial_hashstate; /* salted initial state, if version >= 1 */
 	u8 *zero_digest; /* digest for a zero block */
 #ifdef CONFIG_SECURITY
 	u8 *root_digest_sig; /* signature of the root digest */
···
 	unsigned char version;
 	bool hash_failed:1; /* set if hash of any block failed */
 	bool use_bh_wq:1; /* try to verify in BH wq before normal work-queue */
+	bool use_sha256_lib:1; /* use SHA-256 library instead of generic crypto API */
+	bool use_sha256_finup_2x:1; /* use interleaved hashing optimization */
 	unsigned int digest_size; /* digest size for the current hash algorithm */
 	enum verity_mode mode; /* mode for handling verification errors */
 	enum verity_mode error_mode;/* mode for handling I/O errors */
···
 	mempool_t recheck_pool;
 };

+struct pending_block {
+	void *data;
+	sector_t blkno;
+	u8 want_digest[HASH_MAX_DIGESTSIZE];
+	u8 real_digest[HASH_MAX_DIGESTSIZE];
+};
+
 struct dm_verity_io {
 	struct dm_verity *v;

···
 	struct work_struct work;
 	struct work_struct bh_work;

-	u8 real_digest[HASH_MAX_DIGESTSIZE];
-	u8 want_digest[HASH_MAX_DIGESTSIZE];
+	u8 tmp_digest[HASH_MAX_DIGESTSIZE];

 	/*
-	 * Temporary space for hashing. This is variable-length and must be at
-	 * the end of the struct. struct shash_desc is just the fixed part;
-	 * it's followed by a context of size crypto_shash_descsize(shash_tfm).
+	 * This is the queue of data blocks that are pending verification. When
+	 * the crypto layer supports interleaved hashing, we allow multiple
+	 * blocks to be queued up in order to utilize it. This can improve
+	 * performance significantly vs. sequential hashing of each block.
 	 */
-	struct shash_desc hash_desc;
+	int num_pending;
+	struct pending_block pending_blocks[2];
+
+	/*
+	 * Temporary space for hashing. Either sha256 or shash is used,
+	 * depending on the value of use_sha256_lib. If shash is used,
+	 * then this field is variable-length, with total size
+	 * sizeof(struct shash_desc) + crypto_shash_descsize(shash_tfm).
+	 * For this reason, this field must be the end of the struct.
+	 */
+	union {
+		struct sha256_ctx sha256;
+		struct shash_desc shash;
+	} hash_ctx;
 };
-
-static inline u8 *verity_io_real_digest(struct dm_verity *v,
-					struct dm_verity_io *io)
-{
-	return io->real_digest;
-}
-
-static inline u8 *verity_io_want_digest(struct dm_verity *v,
-					struct dm_verity_io *io)
-{
-	return io->want_digest;
-}

 extern int verity_hash(struct dm_verity *v, struct dm_verity_io *io,
 		       const u8 *data, size_t len, u8 *digest);
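Because `hash_ctx` is a trailing union whose shash arm carries a variable-length tail, the per-I/O allocation size is computed from the end offset of whichever arm is in use. A rough user-space sketch of that sizing follows; `struct fixed_ctx`, `struct var_desc`, `struct io_sketch` and `per_io_size` are hypothetical stand-ins, and `offsetofend` is redefined locally to mimic the kernel macro of the same name.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as the kernel's offsetofend(): offset of the first byte past MEMBER. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical stand-in for struct sha256_ctx (fixed size). */
struct fixed_ctx {
	unsigned int state[8];
	unsigned long long count;
	unsigned char buf[64];
};

/* Hypothetical stand-in for struct shash_desc (followed by a variable tail). */
struct var_desc {
	void *tfm;
};

struct io_sketch {
	int num_pending;
	union {
		struct fixed_ctx fixed;
		struct var_desc var;
	} hash_ctx;	/* must stay the last member */
};

/* Bytes to allocate per I/O for the chosen hashing path. */
static size_t per_io_size(int use_fixed, size_t var_tail_size)
{
	/* Sanity check that nothing follows the union in the struct. */
	assert(offsetofend(struct io_sketch, hash_ctx) == sizeof(struct io_sketch));

	if (use_fixed)
		return offsetofend(struct io_sketch, hash_ctx.fixed);
	return offsetofend(struct io_sketch, hash_ctx.var) + var_tail_size;
}
```

The fixed path never allocates more than `sizeof(struct io_sketch)`, while the variable path adds the driver-specific context size on top of the small descriptor header.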

-3

drivers/md/dm-zone.c

···
 		return ret;
 	}

-	md->nr_zones = disk->nr_zones;
-
 	return 0;
 }

···
 		set_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
 	} else {
 		clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
-		md->nr_zones = 0;
 		md->disk->nr_zones = 0;
 	}
 }

+31 -15

drivers/md/dm.c

··· 272 272 int r, i; 273 273 274 274 #if (IS_ENABLED(CONFIG_IMA) && !IS_ENABLED(CONFIG_IMA_DISABLE_HTABLE)) 275 - DMWARN("CONFIG_IMA_DISABLE_HTABLE is disabled." 275 + DMINFO("CONFIG_IMA_DISABLE_HTABLE is disabled." 276 276 " Duplicate IMA measurements will not be recorded in the IMA log."); 277 277 #endif 278 278 ··· 1321 1321 BUG_ON(dm_tio_flagged(tio, DM_TIO_IS_DUPLICATE_BIO)); 1322 1322 BUG_ON(bio_sectors > *tio->len_ptr); 1323 1323 BUG_ON(n_sectors > bio_sectors); 1324 + BUG_ON(bio->bi_opf & REQ_ATOMIC); 1324 1325 1325 1326 if (static_branch_unlikely(&zoned_enabled) && 1326 1327 unlikely(bdev_is_zoned(bio->bi_bdev))) { ··· 1736 1735 ci->submit_as_polled = !!(ci->bio->bi_opf & REQ_POLLED); 1737 1736 1738 1737 len = min_t(sector_t, max_io_len(ti, ci->sector), ci->sector_count); 1739 - if (ci->bio->bi_opf & REQ_ATOMIC && len != ci->sector_count) 1740 - return BLK_STS_IOERR; 1738 + if (ci->bio->bi_opf & REQ_ATOMIC) { 1739 + if (unlikely(!dm_target_supports_atomic_writes(ti->type))) 1740 + return BLK_STS_IOERR; 1741 + if (unlikely(len != ci->sector_count)) 1742 + return BLK_STS_IOERR; 1743 + } 1741 1744 1742 1745 setup_split_accounting(ci, len); 1743 1746 ··· 2444 2439 { 2445 2440 struct dm_table *old_map; 2446 2441 sector_t size, old_size; 2447 - int ret; 2448 2442 2449 2443 lockdep_assert_held(&md->suspend_lock); 2450 2444 ··· 2458 2454 2459 2455 set_capacity(md->disk, size); 2460 2456 2461 - ret = dm_table_set_restrictions(t, md->queue, limits); 2462 - if (ret) { 2463 - set_capacity(md->disk, old_size); 2464 - old_map = ERR_PTR(ret); 2465 - goto out; 2457 + if (limits) { 2458 + int ret = dm_table_set_restrictions(t, md->queue, limits); 2459 + if (ret) { 2460 + set_capacity(md->disk, old_size); 2461 + old_map = ERR_PTR(ret); 2462 + goto out; 2463 + } 2466 2464 } 2467 2465 2468 2466 /* ··· 2842 2836 2843 2837 static void dm_queue_flush(struct mapped_device *md) 2844 2838 { 2839 + clear_bit(DMF_NOFLUSH_SUSPENDING, &md->flags); 2845 2840 
clear_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags); 2846 2841 smp_mb__after_atomic(); 2847 2842 queue_work(md->wq, &md->work); ··· 2855 2848 { 2856 2849 struct dm_table *live_map = NULL, *map = ERR_PTR(-EINVAL); 2857 2850 struct queue_limits limits; 2851 + bool update_limits = true; 2858 2852 int r; 2859 2853 2860 2854 mutex_lock(&md->suspend_lock); ··· 2865 2857 goto out; 2866 2858 2867 2859 /* 2860 + * To avoid a potential deadlock locking the queue limits, disallow 2861 + * updating the queue limits during a table swap, when updating an 2862 + * immutable request-based dm device (dm-multipath) during a noflush 2863 + * suspend. It is userspace's responsibility to make sure that the new 2864 + * table uses the same limits as the existing table, if it asks for a 2865 + * noflush suspend. 2866 + */ 2867 + if (dm_request_based(md) && md->immutable_target && 2868 + __noflush_suspending(md)) 2869 + update_limits = false; 2870 + /* 2868 2871 * If the new table has no data devices, retain the existing limits. 2869 2872 * This helps multipath with queue_if_no_path if all paths disappear, 2870 2873 * then new I/O is queued based on these limits, and then some paths 2871 2874 * reappear. 2872 2875 */ 2873 - if (dm_table_has_no_data_devices(table)) { 2876 + else if (dm_table_has_no_data_devices(table)) { 2874 2877 live_map = dm_get_live_table_fast(md); 2875 2878 if (live_map) 2876 2879 limits = md->queue->limits; 2877 2880 dm_put_live_table_fast(md); 2878 2881 } 2879 2882 2880 - if (!live_map) { 2883 + if (update_limits && !live_map) { 2881 2884 r = dm_calculate_queue_limits(table, &limits); 2882 2885 if (r) { 2883 2886 map = ERR_PTR(r); ··· 2896 2877 } 2897 2878 } 2898 2879 2899 - map = __bind(md, table, &limits); 2880 + map = __bind(md, table, update_limits ? &limits : NULL); 2900 2881 dm_issue_global_event(); 2901 2882 2902 2883 out: ··· 2949 2930 2950 2931 /* 2951 2932 * DMF_NOFLUSH_SUSPENDING must be set before presuspend. 
2952 - * This flag is cleared before dm_suspend returns. 2953 2933 */ 2954 2934 if (noflush) 2955 2935 set_bit(DMF_NOFLUSH_SUSPENDING, &md->flags); ··· 3011 2993 if (!r) 3012 2994 set_bit(dmf_suspended_flag, &md->flags); 3013 2995 3014 - if (noflush) 3015 - clear_bit(DMF_NOFLUSH_SUSPENDING, &md->flags); 3016 2996 if (map) 3017 2997 synchronize_srcu(&md->io_barrier); 3018 2998
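The two-step REQ_ATOMIC check in the dm.c hunk above (reject when the target type does not support atomic writes, then reject when the bio would have to be split) reduces to a small predicate. The sketch below is an assumption-laden illustration: `enum status`, `struct target_sketch` and `check_atomic_write` are hypothetical names, and `max_io_len` stands in for the value the kernel obtains from `max_io_len(ti, ci->sector)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum status { STS_OK = 0, STS_IOERR = 1 };

/* Hypothetical stand-in for the properties of a dm target that matter here. */
struct target_sketch {
	bool supports_atomic_writes;
	uint64_t max_io_len;	/* sectors the target accepts in one piece */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Decide whether a write of sector_count sectors may be issued to ti. */
static enum status check_atomic_write(const struct target_sketch *ti,
				      bool is_atomic, uint64_t sector_count)
{
	uint64_t len;

	assert(ti != NULL);
	len = min_u64(ti->max_io_len, sector_count);

	if (is_atomic) {
		/* The target type must opt in to atomic writes. */
		if (!ti->supports_atomic_writes)
			return STS_IOERR;
		/* An atomic write must never be split. */
		if (len != sector_count)
			return STS_IOERR;
	}
	return STS_OK;
}
```

Non-atomic writes pass through unchanged, which matches the diff: only bios carrying REQ_ATOMIC are subject to either check.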