
Merge tag 'nfs-for-4.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
"Highlights include:

Stable bugfixes:
- Fix use after free in write error path
- Use GFP_NOIO for two allocations in writeback
- Fix a hang in OPEN related to server reboot
- Check the result of nfs4_pnfs_ds_connect
- Fix an rcu lock leak

Features:
- Removal of the unmaintained and unused OSD pNFS layout
- Cleanup and removal of lots of unnecessary dprintk()s
- Cleanup and removal of some memory failure paths now that GFP_NOFS
is guaranteed to never fail.
- Remove the v3-only data server limitation on pNFS/flexfiles

Bugfixes:
- RPC/RDMA connection handling bugfixes
- Copy offload: fixes to ensure the copied data is COMMITed to disk.
- Readdir: switch back to using the ->iterate VFS interface
- File locking fixes from Ben Coddington
- Various use-after-free and deadlock issues in pNFS
- Write path bugfixes"

* tag 'nfs-for-4.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (89 commits)
pNFS/flexfiles: Always attempt to call layoutstats when flexfiles is enabled
NFSv4.1: Work around a Linux server bug...
NFS append COMMIT after synchronous COPY
NFSv4: Fix exclusive create attributes encoding
NFSv4: Fix an rcu lock leak
nfs: use kmap/kunmap directly
NFS: always treat the invocation of nfs_getattr as cache hit when noac is on
Fix nfs_client refcounting if kmalloc fails in nfs4_proc_exchange_id and nfs4_proc_async_renew
NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION
pNFS: Fix NULL dereference in pnfs_generic_alloc_ds_commits
pNFS: Fix a typo in pnfs_generic_alloc_ds_commits
pNFS: Fix a deadlock when coalescing writes and returning the layout
pNFS: Don't clear the layout return info if there are segments to return
pNFS: Ensure we commit the layout if it has been invalidated
pNFS: Don't send COMMITs to the DSes if the server invalidated our layout
pNFS/flexfiles: Fix up the ff_layout_write_pagelist failure path
pNFS: Ensure we check layout validity before marking it for return
NFS4.1 handle interrupted slot reuse from ERR_DELAY
NFSv4: check return value of xdr_inline_decode
nfs/filelayout: fix NULL pointer dereference in fl_pnfs_update_layout()
...

+952 -2963
-6
Documentation/admin-guide/kernel-parameters.txt
··· 2434 2434 and gids from such clients. This is intended to ease 2435 2435 migration from NFSv2/v3. 2436 2436 2437 - objlayoutdriver.osd_login_prog= 2438 - [NFS] [OBJLAYOUT] sets the pathname to the program which 2439 - is used to automatically discover and login into new 2440 - osd-targets. Please see: 2441 - Documentation/filesystems/pnfs.txt for more explanations 2442 - 2443 2437 nmi_debug= [KNL,SH] Specify one or more actions to take 2444 2438 when a NMI is triggered. 2445 2439 Format: [state][,regs][,debounce][,die]
-37
Documentation/filesystems/nfs/pnfs.txt
··· 64 64 different layout types. 65 65 66 66 Files-layout-driver code is in: fs/nfs/filelayout/.. directory 67 - Objects-layout-driver code is in: fs/nfs/objlayout/.. directory 68 67 Blocks-layout-driver code is in: fs/nfs/blocklayout/.. directory 69 68 Flexfiles-layout-driver code is in: fs/nfs/flexfilelayout/.. directory 70 - 71 - objects-layout setup 72 - -------------------- 73 - 74 - As part of the full STD implementation the objlayoutdriver.ko needs, at times, 75 - to automatically login to yet undiscovered iscsi/osd devices. For this the 76 - driver makes up-calles to a user-mode script called *osd_login* 77 - 78 - The path_name of the script to use is by default: 79 - /sbin/osd_login. 80 - This name can be overridden by the Kernel module parameter: 81 - objlayoutdriver.osd_login_prog 82 - 83 - If Kernel does not find the osd_login_prog path it will zero it out 84 - and will not attempt farther logins. An admin can then write new value 85 - to the objlayoutdriver.osd_login_prog Kernel parameter to re-enable it. 86 - 87 - The /sbin/osd_login is part of the nfs-utils package, and should usually 88 - be installed on distributions that support this Kernel version. 89 - 90 - The API to the login script is as follows: 91 - Usage: $0 -u <URI> -o <OSDNAME> -s <SYSTEMID> 92 - Options: 93 - -u target uri e.g. iscsi://<ip>:<port> 94 - (always exists) 95 - (More protocols can be defined in the future. 96 - The client does not interpret this string it is 97 - passed unchanged as received from the Server) 98 - -o osdname of the requested target OSD 99 - (Might be empty) 100 - (A string which denotes the OSD name, there is a 101 - limit of 64 chars on this string) 102 - -s systemid of the requested target OSD 103 - (Might be empty) 104 - (This string, if not empty is always an hex 105 - representation of the 20 bytes osd_system_id) 106 69 107 70 blocks-layout setup 108 71 -------------------
+1 -1
fs/fuse/file.c
··· 2177 2177 } 2178 2178 2179 2179 /* Unlock on close is handled by the flush method */ 2180 - if (fl->fl_flags & FL_CLOSE) 2180 + if ((fl->fl_flags & FL_CLOSE_POSIX) == FL_CLOSE_POSIX) 2181 2181 return 0; 2182 2182 2183 2183 if (pid && pid_nr == 0)
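The fuse hunk above tightens a flag test: `FL_CLOSE_POSIX` is a two-bit mask (`FL_POSIX | FL_CLOSE`), so membership must be checked with `(flags & mask) == mask` rather than a bare `flags & mask`, which is true when *either* bit is set. A minimal userspace sketch of the difference (the flag values here are illustrative stand-ins, not the kernel's real ones from include/linux/fs.h):

```c
/* Illustrative flag bits; the real values live in include/linux/fs.h */
#define FL_POSIX 0x01
#define FL_FLOCK 0x02
#define FL_CLOSE 0x04
#define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE)

/* Loose test: true when ANY bit of the mask is set.  Once FL_CLOSE is
 * also set on flock unlocks (see the fs/locks.c hunk), this would
 * wrongly match flock requests in the fuse path. */
static int any_bit(unsigned int flags, unsigned int mask)
{
	return (flags & mask) != 0;
}

/* Strict test, as in the fuse fix: true only when ALL mask bits are
 * set, i.e. a posix lock being released at close time. */
static int all_bits(unsigned int flags, unsigned int mask)
{
	return (flags & mask) == mask;
}
```

With the strict form, a flock unlock carrying `FL_CLOSE` no longer takes the posix-only early return.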
+1
fs/lockd/clntlock.c
··· 69 69 if (host->h_rpcclnt == NULL && nlm_bind_host(host) == NULL) 70 70 goto out_nobind; 71 71 72 + host->h_nlmclnt_ops = nlm_init->nlmclnt_ops; 72 73 return host; 73 74 out_nobind: 74 75 nlmclnt_release_host(host);
+25 -1
fs/lockd/clntproc.c
··· 150 150 * @host: address of a valid nlm_host context representing the NLM server 151 151 * @cmd: fcntl-style file lock operation to perform 152 152 * @fl: address of arguments for the lock operation 153 + * @data: address of data to be sent to callback operations 153 154 * 154 155 */ 155 - int nlmclnt_proc(struct nlm_host *host, int cmd, struct file_lock *fl) 156 + int nlmclnt_proc(struct nlm_host *host, int cmd, struct file_lock *fl, void *data) 156 157 { 157 158 struct nlm_rqst *call; 158 159 int status; 160 + const struct nlmclnt_operations *nlmclnt_ops = host->h_nlmclnt_ops; 159 161 160 162 call = nlm_alloc_call(host); 161 163 if (call == NULL) 162 164 return -ENOMEM; 165 + 166 + if (nlmclnt_ops && nlmclnt_ops->nlmclnt_alloc_call) 167 + nlmclnt_ops->nlmclnt_alloc_call(data); 163 168 164 169 nlmclnt_locks_init_private(fl, host); 165 170 if (!fl->fl_u.nfs_fl.owner) { ··· 174 169 } 175 170 /* Set up the argument struct */ 176 171 nlmclnt_setlockargs(call, fl); 172 + call->a_callback_data = data; 177 173 178 174 if (IS_SETLK(cmd) || IS_SETLKW(cmd)) { 179 175 if (fl->fl_type != F_UNLCK) { ··· 220 214 221 215 void nlmclnt_release_call(struct nlm_rqst *call) 222 216 { 217 + const struct nlmclnt_operations *nlmclnt_ops = call->a_host->h_nlmclnt_ops; 218 + 223 219 if (!atomic_dec_and_test(&call->a_count)) 224 220 return; 221 + if (nlmclnt_ops && nlmclnt_ops->nlmclnt_release_call) 222 + nlmclnt_ops->nlmclnt_release_call(call->a_callback_data); 225 223 nlmclnt_release_host(call->a_host); 226 224 nlmclnt_release_lockargs(call); 227 225 kfree(call); ··· 697 687 return status; 698 688 } 699 689 690 + static void nlmclnt_unlock_prepare(struct rpc_task *task, void *data) 691 + { 692 + struct nlm_rqst *req = data; 693 + const struct nlmclnt_operations *nlmclnt_ops = req->a_host->h_nlmclnt_ops; 694 + bool defer_call = false; 695 + 696 + if (nlmclnt_ops && nlmclnt_ops->nlmclnt_unlock_prepare) 697 + defer_call = nlmclnt_ops->nlmclnt_unlock_prepare(task, req->a_callback_data); 
698 + 699 + if (!defer_call) 700 + rpc_call_start(task); 701 + } 702 + 700 703 static void nlmclnt_unlock_callback(struct rpc_task *task, void *data) 701 704 { 702 705 struct nlm_rqst *req = data; ··· 743 720 } 744 721 745 722 static const struct rpc_call_ops nlmclnt_unlock_ops = { 723 + .rpc_call_prepare = nlmclnt_unlock_prepare, 746 724 .rpc_call_done = nlmclnt_unlock_callback, 747 725 .rpc_release = nlmclnt_rpc_release, 748 726 };
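The lockd hunks thread an optional `nlmclnt_operations` table plus an opaque `data` pointer through the NLM client, guarding every hook with `if (ops && ops->hook)` so callers that pass no table keep the old behaviour. A plain-C sketch of that guard pattern for the unlock-prepare hook (the struct shape mirrors the diff; the names `nlm_ops`, `unlock_should_start`, and `defer_hook` are illustrative, not kernel API):

```c
/* Optional callback table, shaped like the nlmclnt_operations the
 * diff introduces: each slot may be NULL. */
struct nlm_ops {
	void (*alloc_call)(void *data);
	int  (*unlock_prepare)(void *data);	/* nonzero: defer the RPC */
	void (*release_call)(void *data);
};

/* Mirrors nlmclnt_unlock_prepare(): returns 1 if the RPC should start
 * now, 0 if a hook asked to defer it.  A NULL table or a NULL hook
 * slot means "no opinion", so the call starts immediately. */
static int unlock_should_start(const struct nlm_ops *ops, void *data)
{
	int defer = 0;

	if (ops && ops->unlock_prepare)
		defer = ops->unlock_prepare(data);
	return !defer;
}

/* Sample hook that always defers, standing in for the NFS layer's
 * wait-for-I/O logic. */
static int defer_hook(void *data) { (void)data; return 1; }
static const struct nlm_ops deferring = { .unlock_prepare = defer_hook };
```

In the real diff the non-deferred path calls `rpc_call_start(task)` directly from `.rpc_call_prepare`.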
+1 -1
fs/locks.c
··· 2504 2504 .fl_owner = filp, 2505 2505 .fl_pid = current->tgid, 2506 2506 .fl_file = filp, 2507 - .fl_flags = FL_FLOCK, 2507 + .fl_flags = FL_FLOCK | FL_CLOSE, 2508 2508 .fl_type = F_UNLCK, 2509 2509 .fl_end = OFFSET_MAX, 2510 2510 };
-5
fs/nfs/Kconfig
··· 123 123 depends on NFS_V4_1 && BLK_DEV_DM 124 124 default NFS_V4 125 125 126 - config PNFS_OBJLAYOUT 127 - tristate 128 - depends on NFS_V4_1 && SCSI_OSD_ULD 129 - default NFS_V4 130 - 131 126 config PNFS_FLEXFILE_LAYOUT 132 127 tristate 133 128 depends on NFS_V4_1 && NFS_V3
-1
fs/nfs/Makefile
··· 31 31 nfsv4-$(CONFIG_NFS_V4_2) += nfs42proc.o 32 32 33 33 obj-$(CONFIG_PNFS_FILE_LAYOUT) += filelayout/ 34 - obj-$(CONFIG_PNFS_OBJLAYOUT) += objlayout/ 35 34 obj-$(CONFIG_PNFS_BLOCK) += blocklayout/ 36 35 obj-$(CONFIG_PNFS_FLEXFILE_LAYOUT) += flexfilelayout/
+7 -40
fs/nfs/callback_proc.c
··· 131 131 if (!inode) 132 132 continue; 133 133 if (!nfs_sb_active(inode->i_sb)) { 134 - rcu_read_lock(); 134 + rcu_read_unlock(); 135 135 spin_unlock(&clp->cl_lock); 136 136 iput(inode); 137 137 spin_lock(&clp->cl_lock); 138 + rcu_read_lock(); 138 139 goto restart; 139 140 } 140 141 return inode; ··· 171 170 if (!inode) 172 171 continue; 173 172 if (!nfs_sb_active(inode->i_sb)) { 174 - rcu_read_lock(); 173 + rcu_read_unlock(); 175 174 spin_unlock(&clp->cl_lock); 176 175 iput(inode); 177 176 spin_lock(&clp->cl_lock); 177 + rcu_read_lock(); 178 178 goto restart; 179 179 } 180 180 return inode; ··· 319 317 static u32 do_callback_layoutrecall(struct nfs_client *clp, 320 318 struct cb_layoutrecallargs *args) 321 319 { 322 - u32 res; 323 - 324 - dprintk("%s enter, type=%i\n", __func__, args->cbl_recall_type); 325 320 if (args->cbl_recall_type == RETURN_FILE) 326 - res = initiate_file_draining(clp, args); 327 - else 328 - res = initiate_bulk_draining(clp, args); 329 - dprintk("%s returning %i\n", __func__, res); 330 - return res; 331 - 321 + return initiate_file_draining(clp, args); 322 + return initiate_bulk_draining(clp, args); 332 323 } 333 324 334 325 __be32 nfs4_callback_layoutrecall(struct cb_layoutrecallargs *args, 335 326 void *dummy, struct cb_process_state *cps) 336 327 { 337 - u32 res; 338 - 339 - dprintk("%s: -->\n", __func__); 328 + u32 res = NFS4ERR_OP_NOT_IN_SESSION; 340 329 341 330 if (cps->clp) 342 331 res = do_callback_layoutrecall(cps->clp, args); 343 - else 344 - res = NFS4ERR_OP_NOT_IN_SESSION; 345 - 346 - dprintk("%s: exit with status = %d\n", __func__, res); 347 332 return cpu_to_be32(res); 348 333 } 349 334 ··· 353 364 struct nfs_client *clp = cps->clp; 354 365 struct nfs_server *server = NULL; 355 366 356 - dprintk("%s: -->\n", __func__); 357 - 358 367 if (!clp) { 359 368 res = cpu_to_be32(NFS4ERR_OP_NOT_IN_SESSION); 360 369 goto out; ··· 371 384 goto found; 372 385 } 373 386 rcu_read_unlock(); 374 - dprintk("%s: layout type %u not found\n", 
375 - __func__, dev->cbd_layout_type); 376 387 continue; 377 388 } 378 389 ··· 380 395 381 396 out: 382 397 kfree(args->devs); 383 - dprintk("%s: exit with status = %u\n", 384 - __func__, be32_to_cpu(res)); 385 398 return res; 386 399 } 387 400 ··· 400 417 validate_seqid(const struct nfs4_slot_table *tbl, const struct nfs4_slot *slot, 401 418 const struct cb_sequenceargs * args) 402 419 { 403 - dprintk("%s enter. slotid %u seqid %u, slot table seqid: %u\n", 404 - __func__, args->csa_slotid, args->csa_sequenceid, slot->seq_nr); 405 - 406 420 if (args->csa_slotid > tbl->server_highest_slotid) 407 421 return htonl(NFS4ERR_BADSLOT); 408 422 409 423 /* Replay */ 410 424 if (args->csa_sequenceid == slot->seq_nr) { 411 - dprintk("%s seqid %u is a replay\n", 412 - __func__, args->csa_sequenceid); 413 425 if (nfs4_test_locked_slot(tbl, slot->slot_nr)) 414 426 return htonl(NFS4ERR_DELAY); 415 427 /* Signal process_op to set this error on next op */ ··· 458 480 459 481 for (j = 0; j < rclist->rcl_nrefcalls; j++) { 460 482 ref = &rclist->rcl_refcalls[j]; 461 - 462 - dprintk("%s: sessionid %x:%x:%x:%x sequenceid %u " 463 - "slotid %u\n", __func__, 464 - ((u32 *)&rclist->rcl_sessionid.data)[0], 465 - ((u32 *)&rclist->rcl_sessionid.data)[1], 466 - ((u32 *)&rclist->rcl_sessionid.data)[2], 467 - ((u32 *)&rclist->rcl_sessionid.data)[3], 468 - ref->rc_sequenceid, ref->rc_slotid); 469 - 470 483 status = nfs4_slot_wait_on_seqid(tbl, ref->rc_slotid, 471 484 ref->rc_sequenceid, HZ >> 1) < 0; 472 485 if (status) ··· 562 593 res->csr_status = status; 563 594 564 595 trace_nfs4_cb_sequence(args, res, status); 565 - dprintk("%s: exit with status = %d res->csr_status %d\n", __func__, 566 - ntohl(status), ntohl(res->csr_status)); 567 596 return status; 568 597 } 569 598
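The rcu fix in callback_proc.c is an ordering bug: before sleeping in `iput()`, the code must release locks in the reverse order they were taken (`rcu_read_unlock()` then `spin_unlock()`), and the buggy version called `rcu_read_lock()` where it meant `rcu_read_unlock()`, growing the RCU read-side nesting on every restart. A toy counter model of the two paths (depths stand in for RCU nesting; nothing here is kernel API):

```c
static int rcu_depth;

static void rcu_lock(void)   { rcu_depth++; }
static void rcu_unlock(void) { rcu_depth--; }

/* Simulate one pass through the "drop locks around iput()" dance.
 * Entered under rcu_read_lock() (depth 1); a correct pass must also
 * leave at depth 1.  'buggy' reproduces the leak the diff fixes:
 * rcu_read_lock() called where rcu_read_unlock() was meant. */
static int simulate_restart(int buggy)
{
	rcu_depth = 1;		/* caller holds the RCU read lock */

	if (buggy)
		rcu_lock();	/* the leak: lock instead of unlock */
	else
		rcu_unlock();
	/* spin_unlock(&clp->cl_lock); iput(inode); spin_lock(...); */
	rcu_lock();		/* retake before "goto restart" */

	return rcu_depth;
}
```

The fixed path returns to depth 1, while the buggy path ends at depth 3 after a single restart and keeps climbing on each iteration.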
+27 -82
fs/nfs/callback_xdr.c
··· 171 171 return htonl(NFS4ERR_MINOR_VERS_MISMATCH); 172 172 } 173 173 hdr->nops = ntohl(*p); 174 - dprintk("%s: minorversion %d nops %d\n", __func__, 175 - hdr->minorversion, hdr->nops); 176 174 return 0; 177 175 } 178 176 ··· 190 192 191 193 status = decode_fh(xdr, &args->fh); 192 194 if (unlikely(status != 0)) 193 - goto out; 194 - status = decode_bitmap(xdr, args->bitmap); 195 - out: 196 - dprintk("%s: exit with status = %d\n", __func__, ntohl(status)); 197 - return status; 195 + return status; 196 + return decode_bitmap(xdr, args->bitmap); 198 197 } 199 198 200 199 static __be32 decode_recall_args(struct svc_rqst *rqstp, struct xdr_stream *xdr, struct cb_recallargs *args) ··· 201 206 202 207 status = decode_delegation_stateid(xdr, &args->stateid); 203 208 if (unlikely(status != 0)) 204 - goto out; 209 + return status; 205 210 p = read_buf(xdr, 4); 206 - if (unlikely(p == NULL)) { 207 - status = htonl(NFS4ERR_RESOURCE); 208 - goto out; 209 - } 211 + if (unlikely(p == NULL)) 212 + return htonl(NFS4ERR_RESOURCE); 210 213 args->truncate = ntohl(*p); 211 - status = decode_fh(xdr, &args->fh); 212 - out: 213 - dprintk("%s: exit with status = %d\n", __func__, ntohl(status)); 214 - return status; 214 + return decode_fh(xdr, &args->fh); 215 215 } 216 216 217 217 #if defined(CONFIG_NFS_V4_1) ··· 225 235 uint32_t iomode; 226 236 227 237 p = read_buf(xdr, 4 * sizeof(uint32_t)); 228 - if (unlikely(p == NULL)) { 229 - status = htonl(NFS4ERR_BADXDR); 230 - goto out; 231 - } 238 + if (unlikely(p == NULL)) 239 + return htonl(NFS4ERR_BADXDR); 232 240 233 241 args->cbl_layout_type = ntohl(*p++); 234 242 /* Depite the spec's xdr, iomode really belongs in the FILE switch, ··· 240 252 args->cbl_range.iomode = iomode; 241 253 status = decode_fh(xdr, &args->cbl_fh); 242 254 if (unlikely(status != 0)) 243 - goto out; 255 + return status; 244 256 245 257 p = read_buf(xdr, 2 * sizeof(uint64_t)); 246 - if (unlikely(p == NULL)) { 247 - status = htonl(NFS4ERR_BADXDR); 248 - goto out; 249 
- } 258 + if (unlikely(p == NULL)) 259 + return htonl(NFS4ERR_BADXDR); 250 260 p = xdr_decode_hyper(p, &args->cbl_range.offset); 251 261 p = xdr_decode_hyper(p, &args->cbl_range.length); 252 - status = decode_layout_stateid(xdr, &args->cbl_stateid); 253 - if (unlikely(status != 0)) 254 - goto out; 262 + return decode_layout_stateid(xdr, &args->cbl_stateid); 255 263 } else if (args->cbl_recall_type == RETURN_FSID) { 256 264 p = read_buf(xdr, 2 * sizeof(uint64_t)); 257 - if (unlikely(p == NULL)) { 258 - status = htonl(NFS4ERR_BADXDR); 259 - goto out; 260 - } 265 + if (unlikely(p == NULL)) 266 + return htonl(NFS4ERR_BADXDR); 261 267 p = xdr_decode_hyper(p, &args->cbl_fsid.major); 262 268 p = xdr_decode_hyper(p, &args->cbl_fsid.minor); 263 - } else if (args->cbl_recall_type != RETURN_ALL) { 264 - status = htonl(NFS4ERR_BADXDR); 265 - goto out; 266 - } 267 - dprintk("%s: ltype 0x%x iomode %d changed %d recall_type %d\n", 268 - __func__, 269 - args->cbl_layout_type, iomode, 270 - args->cbl_layoutchanged, args->cbl_recall_type); 271 - out: 272 - dprintk("%s: exit with status = %d\n", __func__, ntohl(status)); 273 - return status; 269 + } else if (args->cbl_recall_type != RETURN_ALL) 270 + return htonl(NFS4ERR_BADXDR); 271 + return 0; 274 272 } 275 273 276 274 static ··· 411 437 412 438 status = decode_sessionid(xdr, &args->csa_sessionid); 413 439 if (status) 414 - goto out; 440 + return status; 415 441 416 - status = htonl(NFS4ERR_RESOURCE); 417 442 p = read_buf(xdr, 5 * sizeof(uint32_t)); 418 443 if (unlikely(p == NULL)) 419 - goto out; 444 + return htonl(NFS4ERR_RESOURCE); 420 445 421 446 args->csa_addr = svc_addr(rqstp); 422 447 args->csa_sequenceid = ntohl(*p++); ··· 429 456 sizeof(*args->csa_rclists), 430 457 GFP_KERNEL); 431 458 if (unlikely(args->csa_rclists == NULL)) 432 - goto out; 459 + return htonl(NFS4ERR_RESOURCE); 433 460 434 461 for (i = 0; i < args->csa_nrclists; i++) { 435 462 status = decode_rc_list(xdr, &args->csa_rclists[i]); ··· 439 466 } 440 467 } 
441 468 } 442 - status = 0; 443 - 444 - dprintk("%s: sessionid %x:%x:%x:%x sequenceid %u slotid %u " 445 - "highestslotid %u cachethis %d nrclists %u\n", 446 - __func__, 447 - ((u32 *)&args->csa_sessionid)[0], 448 - ((u32 *)&args->csa_sessionid)[1], 449 - ((u32 *)&args->csa_sessionid)[2], 450 - ((u32 *)&args->csa_sessionid)[3], 451 - args->csa_sequenceid, args->csa_slotid, 452 - args->csa_highestslotid, args->csa_cachethis, 453 - args->csa_nrclists); 454 - out: 455 - dprintk("%s: exit with status = %d\n", __func__, ntohl(status)); 456 - return status; 469 + return 0; 457 470 458 471 out_free: 459 472 for (i = 0; i < args->csa_nrclists; i++) 460 473 kfree(args->csa_rclists[i].rcl_refcalls); 461 474 kfree(args->csa_rclists); 462 - goto out; 475 + return status; 463 476 } 464 477 465 478 static __be32 decode_recallany_args(struct svc_rqst *rqstp, ··· 516 557 517 558 status = decode_fh(xdr, &args->cbnl_fh); 518 559 if (unlikely(status != 0)) 519 - goto out; 520 - status = decode_lockowner(xdr, args); 521 - out: 522 - dprintk("%s: exit with status = %d\n", __func__, ntohl(status)); 523 - return status; 560 + return status; 561 + return decode_lockowner(xdr, args); 524 562 } 525 563 526 564 #endif /* CONFIG_NFS_V4_1 */ ··· 663 707 status = encode_attr_mtime(xdr, res->bitmap, &res->mtime); 664 708 *savep = htonl((unsigned int)((char *)xdr->p - (char *)(savep+1))); 665 709 out: 666 - dprintk("%s: exit with status = %d\n", __func__, ntohl(status)); 667 710 return status; 668 711 } 669 712 ··· 689 734 __be32 status = res->csr_status; 690 735 691 736 if (unlikely(status != 0)) 692 - goto out; 737 + return status; 693 738 694 739 status = encode_sessionid(xdr, &res->csr_sessionid); 695 740 if (status) 696 - goto out; 741 + return status; 697 742 698 743 p = xdr_reserve_space(xdr, 4 * sizeof(uint32_t)); 699 744 if (unlikely(p == NULL)) ··· 703 748 *p++ = htonl(res->csr_slotid); 704 749 *p++ = htonl(res->csr_highestslotid); 705 750 *p++ = htonl(res->csr_target_highestslotid); 
706 - out: 707 - dprintk("%s: exit with status = %d\n", __func__, ntohl(status)); 708 - return status; 751 + return 0; 709 752 } 710 753 711 754 static __be32 ··· 824 871 long maxlen; 825 872 __be32 res; 826 873 827 - dprintk("%s: start\n", __func__); 828 874 status = decode_op_hdr(xdr_in, &op_nr); 829 875 if (unlikely(status)) 830 876 return status; 831 - 832 - dprintk("%s: minorversion=%d nop=%d op_nr=%u\n", 833 - __func__, cps->minorversion, nop, op_nr); 834 877 835 878 switch (cps->minorversion) { 836 879 case 0: ··· 866 917 return res; 867 918 if (op->encode_res != NULL && status == 0) 868 919 status = op->encode_res(rqstp, xdr_out, resp); 869 - dprintk("%s: done, status = %d\n", __func__, ntohl(status)); 870 920 return status; 871 921 } 872 922 ··· 884 936 .net = SVC_NET(rqstp), 885 937 }; 886 938 unsigned int nops = 0; 887 - 888 - dprintk("%s: start\n", __func__); 889 939 890 940 xdr_init_decode(&xdr_in, &rqstp->rq_arg, rqstp->rq_arg.head[0].iov_base); 891 941 ··· 923 977 *hdr_res.nops = htonl(nops); 924 978 nfs4_cb_free_slot(&cps); 925 979 nfs_put_client(cps.clp); 926 - dprintk("%s: done, status = %u\n", __func__, ntohl(status)); 927 980 return rpc_success; 928 981 929 982 out_invalidcred:
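Beyond deleting dprintk()s, the callback_xdr refactor replaces `goto out` with plain early returns wherever there is nothing to clean up, keeping a label (like `out_free` in `decode_cb_sequence_args`) only where partially built state must be unwound on error. A self-contained sketch of that split, under illustrative names (`parse_lists`, `freed_count` are not from the diff):

```c
#include <stdlib.h>

static int freed_count;	/* how many entries the error path unwound */

static int parse_lists(int nlists, int fail_at)
{
	int **lists;
	int i;

	freed_count = 0;
	lists = calloc(nlists, sizeof(*lists));
	if (!lists)
		return -1;	/* plain early return: nothing to unwind */

	for (i = 0; i < nlists; i++) {
		if (i == fail_at)
			goto out_free;	/* partial success: must unwind */
		lists[i] = malloc(sizeof(int));
	}
	for (i = 0; i < nlists; i++)
		free(lists[i]);
	free(lists);
	return 0;

out_free:
	while (i-- > 0) {	/* free only what was allocated */
		free(lists[i]);
		freed_count++;
	}
	free(lists);
	return -1;
}
```

The rule of thumb the diff applies: a goto label earns its keep only when more than one failure site shares its cleanup.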
+12 -55
fs/nfs/client.c
··· 218 218 static void pnfs_init_server(struct nfs_server *server) 219 219 { 220 220 rpc_init_wait_queue(&server->roc_rpcwaitq, "pNFS ROC"); 221 + rpc_init_wait_queue(&server->uoc_rpcwaitq, "NFS UOC"); 221 222 } 222 223 223 224 #else ··· 241 240 */ 242 241 void nfs_free_client(struct nfs_client *clp) 243 242 { 244 - dprintk("--> nfs_free_client(%u)\n", clp->rpc_ops->version); 245 - 246 243 nfs_fscache_release_client_cookie(clp); 247 244 248 245 /* -EIO all pending I/O */ ··· 255 256 kfree(clp->cl_hostname); 256 257 kfree(clp->cl_acceptor); 257 258 kfree(clp); 258 - 259 - dprintk("<-- nfs_free_client()\n"); 260 259 } 261 260 EXPORT_SYMBOL_GPL(nfs_free_client); 262 261 ··· 268 271 if (!clp) 269 272 return; 270 273 271 - dprintk("--> nfs_put_client({%d})\n", atomic_read(&clp->cl_count)); 272 274 nn = net_generic(clp->cl_net, nfs_net_id); 273 275 274 276 if (atomic_dec_and_lock(&clp->cl_count, &nn->nfs_client_lock)) { ··· 378 382 } 379 383 380 384 smp_rmb(); 381 - 382 - dprintk("<-- %s found nfs_client %p for %s\n", 383 - __func__, clp, cl_init->hostname ?: ""); 384 385 return clp; 385 386 } 386 387 ··· 395 402 WARN_ON(1); 396 403 return NULL; 397 404 } 398 - 399 - dprintk("--> nfs_get_client(%s,v%u)\n", 400 - cl_init->hostname, rpc_ops->version); 401 405 402 406 /* see if the client already exists */ 403 407 do { ··· 420 430 new = rpc_ops->alloc_client(cl_init); 421 431 } while (!IS_ERR(new)); 422 432 423 - dprintk("<-- nfs_get_client() Failed to find %s (%ld)\n", 424 - cl_init->hostname, PTR_ERR(new)); 425 433 return new; 426 434 } 427 435 EXPORT_SYMBOL_GPL(nfs_get_client); ··· 546 558 .noresvport = server->flags & NFS_MOUNT_NORESVPORT ? 
547 559 1 : 0, 548 560 .net = clp->cl_net, 561 + .nlmclnt_ops = clp->cl_nfs_mod->rpc_ops->nlmclnt_ops, 549 562 }; 550 563 551 564 if (nlm_init.nfs_version > 3) ··· 613 624 { 614 625 int error; 615 626 616 - if (clp->cl_cons_state == NFS_CS_READY) { 617 - /* the client is already initialised */ 618 - dprintk("<-- nfs_init_client() = 0 [already %p]\n", clp); 627 + /* the client is already initialised */ 628 + if (clp->cl_cons_state == NFS_CS_READY) 619 629 return clp; 620 - } 621 630 622 631 /* 623 632 * Create a client RPC handle for doing FSSTAT with UNIX auth only 624 633 * - RFC 2623, sec 2.3.2 625 634 */ 626 635 error = nfs_create_rpc_client(clp, cl_init, RPC_AUTH_UNIX); 627 - if (error < 0) 628 - goto error; 629 - nfs_mark_client_ready(clp, NFS_CS_READY); 636 + nfs_mark_client_ready(clp, error == 0 ? NFS_CS_READY : error); 637 + if (error < 0) { 638 + nfs_put_client(clp); 639 + clp = ERR_PTR(error); 640 + } 630 641 return clp; 631 - 632 - error: 633 - nfs_mark_client_ready(clp, error); 634 - nfs_put_client(clp); 635 - dprintk("<-- nfs_init_client() = xerror %d\n", error); 636 - return ERR_PTR(error); 637 642 } 638 643 EXPORT_SYMBOL_GPL(nfs_init_client); 639 644 ··· 651 668 struct nfs_client *clp; 652 669 int error; 653 670 654 - dprintk("--> nfs_init_server()\n"); 655 - 656 671 nfs_init_timeout_values(&timeparms, data->nfs_server.protocol, 657 672 data->timeo, data->retrans); 658 673 if (data->flags & NFS_MOUNT_NORESVPORT) ··· 658 677 659 678 /* Allocate or find a client reference we can use */ 660 679 clp = nfs_get_client(&cl_init); 661 - if (IS_ERR(clp)) { 662 - dprintk("<-- nfs_init_server() = error %ld\n", PTR_ERR(clp)); 680 + if (IS_ERR(clp)) 663 681 return PTR_ERR(clp); 664 - } 665 682 666 683 server->nfs_client = clp; 667 684 ··· 704 725 server->mountd_protocol = data->mount_server.protocol; 705 726 706 727 server->namelen = data->namlen; 707 - dprintk("<-- nfs_init_server() = 0 [new %p]\n", clp); 708 728 return 0; 709 729 710 730 error: 711 731 
server->nfs_client = NULL; 712 732 nfs_put_client(clp); 713 - dprintk("<-- nfs_init_server() = xerror %d\n", error); 714 733 return error; 715 734 } 716 735 ··· 775 798 struct nfs_client *clp = server->nfs_client; 776 799 int error; 777 800 778 - dprintk("--> nfs_probe_fsinfo()\n"); 779 - 780 801 if (clp->rpc_ops->set_capabilities != NULL) { 781 802 error = clp->rpc_ops->set_capabilities(server, mntfh); 782 803 if (error < 0) 783 - goto out_error; 804 + return error; 784 805 } 785 806 786 807 fsinfo.fattr = fattr; ··· 786 811 memset(fsinfo.layouttype, 0, sizeof(fsinfo.layouttype)); 787 812 error = clp->rpc_ops->fsinfo(server, mntfh, &fsinfo); 788 813 if (error < 0) 789 - goto out_error; 814 + return error; 790 815 791 816 nfs_server_set_fsinfo(server, &fsinfo); 792 817 ··· 801 826 server->namelen = pathinfo.max_namelen; 802 827 } 803 828 804 - dprintk("<-- nfs_probe_fsinfo() = 0\n"); 805 829 return 0; 806 - 807 - out_error: 808 - dprintk("nfs_probe_fsinfo: error = %d\n", -error); 809 - return error; 810 830 } 811 831 EXPORT_SYMBOL_GPL(nfs_probe_fsinfo); 812 832 ··· 897 927 */ 898 928 void nfs_free_server(struct nfs_server *server) 899 929 { 900 - dprintk("--> nfs_free_server()\n"); 901 - 902 930 nfs_server_remove_lists(server); 903 931 904 932 if (server->destroy != NULL) ··· 914 946 nfs_free_iostats(server->io_stats); 915 947 kfree(server); 916 948 nfs_release_automount_timer(); 917 - dprintk("<-- nfs_free_server()\n"); 918 949 } 919 950 EXPORT_SYMBOL_GPL(nfs_free_server); 920 951 ··· 993 1026 struct nfs_fattr *fattr_fsinfo; 994 1027 int error; 995 1028 996 - dprintk("--> nfs_clone_server(,%llx:%llx,)\n", 997 - (unsigned long long) fattr->fsid.major, 998 - (unsigned long long) fattr->fsid.minor); 999 - 1000 1029 server = nfs_alloc_server(); 1001 1030 if (!server) 1002 1031 return ERR_PTR(-ENOMEM); ··· 1024 1061 if (server->namelen == 0 || server->namelen > NFS4_MAXNAMLEN) 1025 1062 server->namelen = NFS4_MAXNAMLEN; 1026 1063 1027 - dprintk("Cloned FSID: 
%llx:%llx\n", 1028 - (unsigned long long) server->fsid.major, 1029 - (unsigned long long) server->fsid.minor); 1030 - 1031 1064 error = nfs_start_lockd(server); 1032 1065 if (error < 0) 1033 1066 goto out_free_server; ··· 1032 1073 server->mount_time = jiffies; 1033 1074 1034 1075 nfs_free_fattr(fattr_fsinfo); 1035 - dprintk("<-- nfs_clone_server() = %p\n", server); 1036 1076 return server; 1037 1077 1038 1078 out_free_server: 1039 1079 nfs_free_fattr(fattr_fsinfo); 1040 1080 nfs_free_server(server); 1041 - dprintk("<-- nfs_clone_server() = error %d\n", error); 1042 1081 return ERR_PTR(error); 1043 1082 } 1044 1083 EXPORT_SYMBOL_GPL(nfs_clone_server);
+24 -80
fs/nfs/dir.c
··· 57 57 const struct file_operations nfs_dir_operations = { 58 58 .llseek = nfs_llseek_dir, 59 59 .read = generic_read_dir, 60 - .iterate_shared = nfs_readdir, 60 + .iterate = nfs_readdir, 61 61 .open = nfs_opendir, 62 62 .release = nfs_closedir, 63 63 .fsync = nfs_fsync_dir, ··· 145 145 }; 146 146 147 147 struct nfs_cache_array { 148 - atomic_t refcount; 149 148 int size; 150 149 int eof_index; 151 150 u64 last_cookie; ··· 170 171 } nfs_readdir_descriptor_t; 171 172 172 173 /* 173 - * The caller is responsible for calling nfs_readdir_release_array(page) 174 - */ 175 - static 176 - struct nfs_cache_array *nfs_readdir_get_array(struct page *page) 177 - { 178 - void *ptr; 179 - if (page == NULL) 180 - return ERR_PTR(-EIO); 181 - ptr = kmap(page); 182 - if (ptr == NULL) 183 - return ERR_PTR(-ENOMEM); 184 - return ptr; 185 - } 186 - 187 - static 188 - void nfs_readdir_release_array(struct page *page) 189 - { 190 - kunmap(page); 191 - } 192 - 193 - /* 194 174 * we are freeing strings created by nfs_add_to_readdir_array() 195 175 */ 196 176 static ··· 179 201 int i; 180 202 181 203 array = kmap_atomic(page); 182 - if (atomic_dec_and_test(&array->refcount)) 183 - for (i = 0; i < array->size; i++) 184 - kfree(array->array[i].string.name); 204 + for (i = 0; i < array->size; i++) 205 + kfree(array->array[i].string.name); 185 206 kunmap_atomic(array); 186 - } 187 - 188 - static bool grab_page(struct page *page) 189 - { 190 - struct nfs_cache_array *array = kmap_atomic(page); 191 - bool res = atomic_inc_not_zero(&array->refcount); 192 - kunmap_atomic(array); 193 - return res; 194 207 } 195 208 196 209 /* ··· 208 239 static 209 240 int nfs_readdir_add_to_array(struct nfs_entry *entry, struct page *page) 210 241 { 211 - struct nfs_cache_array *array = nfs_readdir_get_array(page); 242 + struct nfs_cache_array *array = kmap(page); 212 243 struct nfs_cache_array_entry *cache_entry; 213 244 int ret; 214 - 215 - if (IS_ERR(array)) 216 - return PTR_ERR(array); 217 245 218 246 
    cache_entry = &array->array[array->size];
···
        if (entry->eof != 0)
            array->eof_index = array->size;
out:
-   nfs_readdir_release_array(page);
+   kunmap(page);
    return ret;
}
···
    struct nfs_cache_array *array;
    int status;

-   array = nfs_readdir_get_array(desc->page);
-   if (IS_ERR(array)) {
-       status = PTR_ERR(array);
-       goto out;
-   }
+   array = kmap(desc->page);

    if (*desc->dir_cookie == 0)
        status = nfs_readdir_search_for_pos(array, desc);
···
        desc->current_index += array->size;
        desc->page_index++;
    }
-   nfs_readdir_release_array(desc->page);
-out:
+   kunmap(desc->page);
    return status;
}
···

out_nopages:
    if (count == 0 || (status == -EBADCOOKIE && entry->eof != 0)) {
-       array = nfs_readdir_get_array(page);
-       if (!IS_ERR(array)) {
-           array->eof_index = array->size;
-           status = 0;
-           nfs_readdir_release_array(page);
-       } else
-           status = PTR_ERR(array);
+       array = kmap(page);
+       array->eof_index = array->size;
+       status = 0;
+       kunmap(page);
    }

    put_page(scratch);
···
        goto out;
    }

-   array = nfs_readdir_get_array(page);
-   if (IS_ERR(array)) {
-       status = PTR_ERR(array);
-       goto out_label_free;
-   }
+   array = kmap(page);
    memset(array, 0, sizeof(struct nfs_cache_array));
-   atomic_set(&array->refcount, 1);
    array->eof_index = -1;

    status = nfs_readdir_alloc_pages(pages, array_size);
···

    nfs_readdir_free_pages(pages, array_size);
out_release_array:
-   nfs_readdir_release_array(page);
-out_label_free:
+   kunmap(page);
    nfs4_label_free(entry.label);
out:
    nfs_free_fattr(entry.fattr);
···
static
void cache_page_release(nfs_readdir_descriptor_t *desc)
{
-   nfs_readdir_clear_array(desc->page);
+   if (!desc->page->mapping)
+       nfs_readdir_clear_array(desc->page);
    put_page(desc->page);
    desc->page = NULL;
}
···
static
struct page *get_cache_page(nfs_readdir_descriptor_t *desc)
{
-   struct page *page;
-
-   for (;;) {
-       page = read_cache_page(desc->file->f_mapping,
+   return read_cache_page(desc->file->f_mapping,
            desc->page_index, (filler_t *)nfs_readdir_filler, desc);
-       if (IS_ERR(page) || grab_page(page))
-           break;
-       put_page(page);
-   }
-   return page;
}

/*
···
    struct nfs_cache_array *array = NULL;
    struct nfs_open_dir_context *ctx = file->private_data;

-   array = nfs_readdir_get_array(desc->page);
-   if (IS_ERR(array)) {
-       res = PTR_ERR(array);
-       goto out;
-   }
-
+   array = kmap(desc->page);
    for (i = desc->cache_entry_index; i < array->size; i++) {
        struct nfs_cache_array_entry *ent;
···
    if (array->eof_index >= 0)
        desc->eof = 1;

-   nfs_readdir_release_array(desc->page);
-out:
+   kunmap(desc->page);
    cache_page_release(desc);
    dfprintk(DIRCACHE, "NFS: nfs_do_filldir() filling ended @ cookie %Lu; returning = %d\n",
            (unsigned long long)*desc->dir_cookie, res);
···

static loff_t nfs_llseek_dir(struct file *filp, loff_t offset, int whence)
{
+   struct inode *inode = file_inode(filp);
    struct nfs_open_dir_context *dir_ctx = filp->private_data;

    dfprintk(FILE, "NFS: llseek dir(%pD2, %lld, %d)\n",
            filp, offset, whence);

+   inode_lock(inode);
    switch (whence) {
    case 1:
        offset += filp->f_pos;
···
        if (offset >= 0)
            break;
    default:
-       return -EINVAL;
+       offset = -EINVAL;
+       goto out;
    }
    if (offset != filp->f_pos) {
        filp->f_pos = offset;
        dir_ctx->dir_cookie = 0;
        dir_ctx->duped = 0;
    }
+out:
+   inode_unlock(inode);
    return offset;
}
+2 -19
fs/nfs/direct.c
···
    nfs_direct_req_release(dreq);
}

-static void nfs_direct_readpage_release(struct nfs_page *req)
-{
-   dprintk("NFS: direct read done (%s/%llu %d@%lld)\n",
-       req->wb_context->dentry->d_sb->s_id,
-       (unsigned long long)NFS_FILEID(d_inode(req->wb_context->dentry)),
-       req->wb_bytes,
-       (long long)req_offset(req));
-   nfs_release_request(req);
-}
-
static void nfs_direct_read_completion(struct nfs_pgio_header *hdr)
{
    unsigned long bytes = 0;
···
            set_page_dirty(page);
        bytes += req->wb_bytes;
        nfs_list_remove_request(req);
-       nfs_direct_readpage_release(req);
+       nfs_release_request(req);
    }
out_put:
    if (put_dreq(dreq))
···
    int status = data->task.tk_status;

    nfs_init_cinfo_from_dreq(&cinfo, dreq);
-   if (status < 0) {
-       dprintk("NFS: %5u commit failed with error %d.\n",
-           data->task.tk_pid, status);
+   if (status < 0 || nfs_direct_cmp_commit_data_verf(dreq, data))
        dreq->flags = NFS_ODIRECT_RESCHED_WRITES;
-   } else if (nfs_direct_cmp_commit_data_verf(dreq, data)) {
-       dprintk("NFS: %5u commit verify failed\n", data->task.tk_pid);
-       dreq->flags = NFS_ODIRECT_RESCHED_WRITES;
-   }

-   dprintk("NFS: %5u commit returned %d\n", data->task.tk_pid, status);
    while (!list_empty(&data->pages)) {
        req = nfs_list_entry(data->pages.next);
        nfs_list_remove_request(req);
+22 -8
fs/nfs/file.c
···
         inode->i_ino, (long long)page_offset(page));

    nfs_fscache_wait_on_page_write(nfsi, page);
-   return nfs_wb_launder_page(inode, page);
+   return nfs_wb_page(inode, page);
}

static int nfs_swap_activate(struct swap_info_struct *sis, struct file *file,
···
    if (!IS_ERR(l_ctx)) {
        status = nfs_iocounter_wait(l_ctx);
        nfs_put_lock_context(l_ctx);
-       if (status < 0)
+       /* NOTE: special case
+        *  If we're signalled while cleaning up locks on process exit, we
+        *  still need to complete the unlock.
+        */
+       if (status < 0 && !(fl->fl_flags & FL_CLOSE))
            return status;
    }

-   /* NOTE: special case
-    *  If we're signalled while cleaning up locks on process exit, we
-    *  still need to complete the unlock.
-    */
    /*
     * Use local locking if mounted with "-onolock" or with appropriate
     * "-olocal_lock="
···
    if (NFS_SERVER(inode)->flags & NFS_MOUNT_LOCAL_FLOCK)
        is_local = 1;

-   /* We're simulating flock() locks using posix locks on the server */
-   if (fl->fl_type == F_UNLCK)
+   /*
+    * VFS doesn't require the open mode to match a flock() lock's type.
+    * NFS, however, may simulate flock() locking with posix locking which
+    * requires the open mode to match the lock type.
+    */
+   switch (fl->fl_type) {
+   case F_UNLCK:
        return do_unlk(filp, cmd, fl, is_local);
+   case F_RDLCK:
+       if (!(filp->f_mode & FMODE_READ))
+           return -EBADF;
+       break;
+   case F_WRLCK:
+       if (!(filp->f_mode & FMODE_WRITE))
+           return -EBADF;
+   }
+
    return do_setlk(filp, cmd, fl, is_local);
}
EXPORT_SYMBOL_GPL(nfs_flock);
+6 -4
fs/nfs/filelayout/filelayout.c
···
    fl = FILELAYOUT_LSEG(lseg);

    status = filelayout_check_deviceid(lo, fl, gfp_flags);
-   if (status)
-       lseg = ERR_PTR(status);
-out:
-   if (IS_ERR(lseg))
+   if (status) {
        pnfs_put_lseg(lseg);
+       lseg = ERR_PTR(status);
+   }
+out:
    return lseg;
}

···
filelayout_pg_init_read(struct nfs_pageio_descriptor *pgio,
        struct nfs_page *req)
{
+   pnfs_generic_pg_check_layout(pgio);
    if (!pgio->pg_lseg) {
        pgio->pg_lseg = fl_pnfs_update_layout(pgio->pg_inode,
                    req->wb_context,
···
    struct nfs_commit_info cinfo;
    int status;

+   pnfs_generic_pg_check_layout(pgio);
    if (!pgio->pg_lseg) {
        pgio->pg_lseg = fl_pnfs_update_layout(pgio->pg_inode,
                    req->wb_context,
+21 -3
fs/nfs/flexfilelayout/flexfilelayout.c
···
    int ds_idx;

retry:
+   pnfs_generic_pg_check_layout(pgio);
    /* Use full layout for now */
    if (!pgio->pg_lseg)
        ff_layout_pg_get_read(pgio, req, false);
···
    int status;

retry:
+   pnfs_generic_pg_check_layout(pgio);
    if (!pgio->pg_lseg) {
        pgio->pg_lseg = pnfs_update_layout(pgio->pg_inode,
                    req->wb_context,
···

    ds = nfs4_ff_layout_prepare_ds(lseg, idx, true);
    if (!ds)
-       return PNFS_NOT_ATTEMPTED;
+       goto out_failed;

    ds_clnt = nfs4_ff_find_or_create_ds_client(lseg, idx, ds->ds_clp,
                        hdr->inode);
    if (IS_ERR(ds_clnt))
-       return PNFS_NOT_ATTEMPTED;
+       goto out_failed;

    ds_cred = ff_layout_get_ds_cred(lseg, idx, hdr->cred);
    if (!ds_cred)
-       return PNFS_NOT_ATTEMPTED;
+       goto out_failed;

    vers = nfs4_ff_layout_ds_version(lseg, idx);
···
              sync, RPC_TASK_SOFTCONN);
    put_rpccred(ds_cred);
    return PNFS_ATTEMPTED;
+
+out_failed:
+   if (ff_layout_avoid_mds_available_ds(lseg))
+       return PNFS_TRY_AGAIN;
+   return PNFS_NOT_ATTEMPTED;
}

static u32 calc_ds_index_from_commit(struct pnfs_layout_segment *lseg, u32 i)
···
    return 0;
}

+static int
+ff_layout_set_layoutdriver(struct nfs_server *server,
+       const struct nfs_fh *dummy)
+{
+#if IS_ENABLED(CONFIG_NFS_V4_2)
+   server->caps |= NFS_CAP_LAYOUTSTATS;
+#endif
+   return 0;
+}
+
static struct pnfs_layoutdriver_type flexfilelayout_type = {
    .id = LAYOUT_FLEX_FILES,
    .name = "LAYOUT_FLEX_FILES",
    .owner = THIS_MODULE,
+   .set_layoutdriver = ff_layout_set_layoutdriver,
    .alloc_layout_hdr = ff_layout_alloc_layout_hdr,
    .free_layout_hdr = ff_layout_free_layout_hdr,
    .alloc_lseg = ff_layout_alloc_lseg,
+8 -2
fs/nfs/flexfilelayout/flexfilelayoutdev.c
···
            if (ds_versions[i].wsize > NFS_MAX_FILE_IO_SIZE)
                ds_versions[i].wsize = NFS_MAX_FILE_IO_SIZE;

-           if (ds_versions[i].version != 3 || ds_versions[i].minor_version != 0) {
+           /*
+            * check for valid major/minor combination.
+            * currently we support dataserver which talk:
+            *   v3, v4.0, v4.1, v4.2
+            */
+           if (!((ds_versions[i].version == 3 && ds_versions[i].minor_version == 0) ||
+                 (ds_versions[i].version == 4 && ds_versions[i].minor_version < 3))) {
                dprintk("%s: [%d] unsupported ds version %d-%d\n", __func__,
                    i, ds_versions[i].version,
                    ds_versions[i].minor_version);
···
            mirror->mirror_ds->ds_versions[0].minor_version);

    /* connect success, check rsize/wsize limit */
-   if (ds->ds_clp) {
+   if (!status) {
        max_payload =
            nfs_block_size(rpc_max_payload(ds->ds_clp->cl_rpcclient),
                       NULL);
+4 -1
fs/nfs/inode.c
···
    if (need_atime || nfs_need_revalidate_inode(inode)) {
        struct nfs_server *server = NFS_SERVER(inode);

-       nfs_readdirplus_parent_cache_miss(path->dentry);
+       if (!(server->flags & NFS_MOUNT_NOAC))
+           nfs_readdirplus_parent_cache_miss(path->dentry);
+       else
+           nfs_readdirplus_parent_cache_hit(path->dentry);
        err = __nfs_revalidate_inode(server, inode);
    } else
        nfs_readdirplus_parent_cache_hit(path->dentry);
+4 -1
fs/nfs/internal.h
···
            u32 ds_commit_idx);
int nfs_write_need_commit(struct nfs_pgio_header *);
void nfs_writeback_update_inode(struct nfs_pgio_header *hdr);
-int nfs_commit_file(struct file *file, struct nfs_write_verifier *verf);
int nfs_generic_commit_list(struct inode *inode, struct list_head *head,
            int how, struct nfs_commit_info *cinfo);
void nfs_retry_commit(struct list_head *page_list,
···
{
    switch (err) {
    case -ERESTARTSYS:
+   case -EACCES:
+   case -EDQUOT:
+   case -EFBIG:
    case -EIO:
    case -ENOSPC:
    case -EROFS:
+   case -ESTALE:
    case -E2BIG:
        return true;
    default:
+10 -28
fs/nfs/namespace.c
···
    struct nfs_fh *fh = NULL;
    struct nfs_fattr *fattr = NULL;

-   dprintk("--> nfs_d_automount()\n");
-
-   mnt = ERR_PTR(-ESTALE);
    if (IS_ROOT(path->dentry))
-       goto out_nofree;
+       return ERR_PTR(-ESTALE);

    mnt = ERR_PTR(-ENOMEM);
    fh = nfs_alloc_fhandle();
···
    if (fh == NULL || fattr == NULL)
        goto out;

-   dprintk("%s: enter\n", __func__);
-
    mnt = server->nfs_client->rpc_ops->submount(server, path->dentry, fh, fattr);
    if (IS_ERR(mnt))
        goto out;

-   dprintk("%s: done, success\n", __func__);
    mntget(mnt); /* prevent immediate expiration */
    mnt_set_expiry(mnt, &nfs_automount_list);
    schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout);
···
out:
    nfs_free_fattr(fattr);
    nfs_free_fhandle(fh);
-out_nofree:
-   if (IS_ERR(mnt))
-       dprintk("<-- %s(): error %ld\n", __func__, PTR_ERR(mnt));
-   else
-       dprintk("<-- %s() = %p\n", __func__, mnt);
    return mnt;
}

···
        .fattr = fattr,
        .authflavor = authflavor,
    };
-   struct vfsmount *mnt = ERR_PTR(-ENOMEM);
+   struct vfsmount *mnt;
    char *page = (char *) __get_free_page(GFP_USER);
    char *devname;

-   dprintk("--> nfs_do_submount()\n");
-
-   dprintk("%s: submounting on %pd2\n", __func__,
-           dentry);
    if (page == NULL)
-       goto out;
-   devname = nfs_devname(dentry, page, PAGE_SIZE);
-   mnt = (struct vfsmount *)devname;
-   if (IS_ERR(devname))
-       goto free_page;
-   mnt = nfs_do_clone_mount(NFS_SB(dentry->d_sb), devname, &mountdata);
-free_page:
-   free_page((unsigned long)page);
-out:
-   dprintk("%s: done\n", __func__);
+       return ERR_PTR(-ENOMEM);

-   dprintk("<-- nfs_do_submount() = %p\n", mnt);
+   devname = nfs_devname(dentry, page, PAGE_SIZE);
+   if (IS_ERR(devname))
+       mnt = (struct vfsmount *)devname;
+   else
+       mnt = nfs_do_clone_mount(NFS_SB(dentry->d_sb), devname, &mountdata);
+
+   free_page((unsigned long)page);
    return mnt;
}
EXPORT_SYMBOL_GPL(nfs_do_submount);
+53 -1
fs/nfs/nfs3proc.c
···
    msg->rpc_proc = &nfs3_procedures[NFS3PROC_COMMIT];
}

+static void nfs3_nlm_alloc_call(void *data)
+{
+   struct nfs_lock_context *l_ctx = data;
+   if (l_ctx && test_bit(NFS_CONTEXT_UNLOCK, &l_ctx->open_context->flags)) {
+       get_nfs_open_context(l_ctx->open_context);
+       nfs_get_lock_context(l_ctx->open_context);
+   }
+}
+
+static bool nfs3_nlm_unlock_prepare(struct rpc_task *task, void *data)
+{
+   struct nfs_lock_context *l_ctx = data;
+   if (l_ctx && test_bit(NFS_CONTEXT_UNLOCK, &l_ctx->open_context->flags))
+       return nfs_async_iocounter_wait(task, l_ctx);
+   return false;
+}
+
+static void nfs3_nlm_release_call(void *data)
+{
+   struct nfs_lock_context *l_ctx = data;
+   struct nfs_open_context *ctx;
+   if (l_ctx && test_bit(NFS_CONTEXT_UNLOCK, &l_ctx->open_context->flags)) {
+       ctx = l_ctx->open_context;
+       nfs_put_lock_context(l_ctx);
+       put_nfs_open_context(ctx);
+   }
+}
+
+const struct nlmclnt_operations nlmclnt_fl_close_lock_ops = {
+   .nlmclnt_alloc_call = nfs3_nlm_alloc_call,
+   .nlmclnt_unlock_prepare = nfs3_nlm_unlock_prepare,
+   .nlmclnt_release_call = nfs3_nlm_release_call,
+};
+
static int
nfs3_proc_lock(struct file *filp, int cmd, struct file_lock *fl)
{
    struct inode *inode = file_inode(filp);
+   struct nfs_lock_context *l_ctx = NULL;
+   struct nfs_open_context *ctx = nfs_file_open_context(filp);
+   int status;

-   return nlmclnt_proc(NFS_SERVER(inode)->nlm_host, cmd, fl);
+   if (fl->fl_flags & FL_CLOSE) {
+       l_ctx = nfs_get_lock_context(ctx);
+       if (IS_ERR(l_ctx))
+           l_ctx = NULL;
+       else
+           set_bit(NFS_CONTEXT_UNLOCK, &ctx->flags);
+   }
+
+   status = nlmclnt_proc(NFS_SERVER(inode)->nlm_host, cmd, fl, l_ctx);
+
+   if (l_ctx)
+       nfs_put_lock_context(l_ctx);
+
+   return status;
}

static int nfs3_have_delegation(struct inode *inode, fmode_t flags)
···
    .dir_inode_ops = &nfs3_dir_inode_operations,
    .file_inode_ops = &nfs3_file_inode_operations,
    .file_ops = &nfs_file_operations,
+   .nlmclnt_ops = &nlmclnt_fl_close_lock_ops,
    .getroot = nfs3_proc_get_root,
    .submount = nfs_submount,
    .try_mount = nfs_try_mount,
+16 -8
fs/nfs/nfs42proc.c
···
    if (status)
        return status;

+   res->commit_res.verf = kzalloc(sizeof(struct nfs_writeverf), GFP_NOFS);
+   if (!res->commit_res.verf)
+       return -ENOMEM;
    status = nfs4_call_sync(server->client, server, &msg,
                &args->seq_args, &res->seq_res, 0);
    if (status == -ENOTSUPP)
        server->caps &= ~NFS_CAP_COPY;
    if (status)
-       return status;
+       goto out;

-   if (res->write_res.verifier.committed != NFS_FILE_SYNC) {
-       status = nfs_commit_file(dst, &res->write_res.verifier.verifier);
-       if (status)
-           return status;
+   if (nfs_write_verifier_cmp(&res->write_res.verifier.verifier,
+                  &res->commit_res.verf->verifier)) {
+       status = -EAGAIN;
+       goto out;
    }

    truncate_pagecache_range(dst_inode, pos_dst,
                 pos_dst + res->write_res.count);

-   return res->write_res.count;
+   status = res->write_res.count;
+out:
+   kfree(res->commit_res.verf);
+   return status;
}

ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
···
        if (err == -ENOTSUPP) {
            err = -EOPNOTSUPP;
            break;
+       } else if (err == -EAGAIN) {
+           dst_exception.retry = 1;
+           continue;
        }

        err2 = nfs4_handle_exception(server, err, &src_exception);
···
        pnfs_mark_layout_stateid_invalid(lo, &head);
        spin_unlock(&inode->i_lock);
        pnfs_free_lseg_list(&head);
+       nfs_commit_inode(inode, 0);
    } else
        spin_unlock(&inode->i_lock);
    break;
···
    case -EOPNOTSUPP:
        NFS_SERVER(inode)->caps &= ~NFS_CAP_LAYOUTSTATS;
    }
-
-   dprintk("%s server returns %d\n", __func__, task->tk_status);
}

static void
+20 -2
fs/nfs/nfs42xdr.c
···
                encode_putfh_maxsz + \
                encode_savefh_maxsz + \
                encode_putfh_maxsz + \
-               encode_copy_maxsz)
+               encode_copy_maxsz + \
+               encode_commit_maxsz)
#define NFS4_dec_copy_sz    (compound_decode_hdr_maxsz + \
                decode_putfh_maxsz + \
                decode_savefh_maxsz + \
                decode_putfh_maxsz + \
-               decode_copy_maxsz)
+               decode_copy_maxsz + \
+               decode_commit_maxsz)
#define NFS4_enc_deallocate_sz  (compound_encode_hdr_maxsz + \
                encode_putfh_maxsz + \
                encode_deallocate_maxsz + \
···
    encode_nops(&hdr);
}

+static void encode_copy_commit(struct xdr_stream *xdr,
+       struct nfs42_copy_args *args,
+       struct compound_hdr *hdr)
+{
+   __be32 *p;
+
+   encode_op_hdr(xdr, OP_COMMIT, decode_commit_maxsz, hdr);
+   p = reserve_space(xdr, 12);
+   p = xdr_encode_hyper(p, args->dst_pos);
+   *p = cpu_to_be32(args->count);
+}
+
/*
 * Encode COPY request
 */
···
    encode_savefh(xdr, &hdr);
    encode_putfh(xdr, args->dst_fh, &hdr);
    encode_copy(xdr, args, &hdr);
+   encode_copy_commit(xdr, args, &hdr);
    encode_nops(&hdr);
}

···
    if (status)
        goto out;
    status = decode_copy(xdr, res);
+   if (status)
+       goto out;
+   status = decode_commit(xdr, &res->commit_res);
out:
    return status;
}
+80 -203
fs/nfs/nfs4client.c
···
    struct nfs_client *old;
    int error;

-   if (clp->cl_cons_state == NFS_CS_READY) {
+   if (clp->cl_cons_state == NFS_CS_READY)
        /* the client is initialised already */
-       dprintk("<-- nfs4_init_client() = 0 [already %p]\n", clp);
        return clp;
-   }

    /* Check NFS protocol revision and initialize RPC op vector */
    clp->rpc_ops = &nfs_v4_clientops;
···
error:
    nfs_mark_client_ready(clp, error);
    nfs_put_client(clp);
-   dprintk("<-- nfs4_init_client() = xerror %d\n", error);
    return ERR_PTR(error);
}

···
    return memcmp(v1->data, v2->data, sizeof(v1->data)) == 0;
}

+static int nfs4_match_client(struct nfs_client *pos, struct nfs_client *new,
+       struct nfs_client **prev, struct nfs_net *nn)
+{
+   int status;
+
+   if (pos->rpc_ops != new->rpc_ops)
+       return 1;
+
+   if (pos->cl_minorversion != new->cl_minorversion)
+       return 1;
+
+   /* If "pos" isn't marked ready, we can't trust the
+    * remaining fields in "pos", especially the client
+    * ID and serverowner fields. Wait for CREATE_SESSION
+    * to finish. */
+   if (pos->cl_cons_state > NFS_CS_READY) {
+       atomic_inc(&pos->cl_count);
+       spin_unlock(&nn->nfs_client_lock);
+
+       nfs_put_client(*prev);
+       *prev = pos;
+
+       status = nfs_wait_client_init_complete(pos);
+       spin_lock(&nn->nfs_client_lock);
+
+       if (status < 0)
+           return status;
+   }
+
+   if (pos->cl_cons_state != NFS_CS_READY)
+       return 1;
+
+   if (pos->cl_clientid != new->cl_clientid)
+       return 1;
+
+   /* NFSv4.1 always uses the uniform string, however someone
+    * might switch the uniquifier string on us.
+    */
+   if (!nfs4_match_client_owner_id(pos, new))
+       return 1;
+
+   return 0;
+}
+
/**
 * nfs40_walk_client_list - Find server that recognizes a client ID
 *
···
    spin_lock(&nn->nfs_client_lock);
    list_for_each_entry(pos, &nn->nfs_client_list, cl_share_link) {

-       if (pos->rpc_ops != new->rpc_ops)
-           continue;
-
-       if (pos->cl_minorversion != new->cl_minorversion)
-           continue;
-
-       /* If "pos" isn't marked ready, we can't trust the
-        * remaining fields in "pos" */
-       if (pos->cl_cons_state > NFS_CS_READY) {
-           atomic_inc(&pos->cl_count);
-           spin_unlock(&nn->nfs_client_lock);
-
-           nfs_put_client(prev);
-           prev = pos;
-
-           status = nfs_wait_client_init_complete(pos);
-           if (status < 0)
-               goto out;
-           status = -NFS4ERR_STALE_CLIENTID;
-           spin_lock(&nn->nfs_client_lock);
-       }
-       if (pos->cl_cons_state != NFS_CS_READY)
-           continue;
-
-       if (pos->cl_clientid != new->cl_clientid)
-           continue;
-
-       if (!nfs4_match_client_owner_id(pos, new))
+       status = nfs4_match_client(pos, new, &prev, nn);
+       if (status < 0)
+           goto out_unlock;
+       if (status != 0)
            continue;
        /*
         * We just sent a new SETCLIENTID, which should have
···

            prev = NULL;
            *result = pos;
-           dprintk("NFS: <-- %s using nfs_client = %p ({%d})\n",
-               __func__, pos, atomic_read(&pos->cl_count));
            goto out;
        case -ERESTARTSYS:
        case -ETIMEDOUT:
···
             */
            nfs4_schedule_path_down_recovery(pos);
        default:
+           spin_lock(&nn->nfs_client_lock);
            goto out;
        }

        spin_lock(&nn->nfs_client_lock);
    }
+out_unlock:
    spin_unlock(&nn->nfs_client_lock);

    /* No match found. The server lost our clientid */
out:
    nfs_put_client(prev);
-   dprintk("NFS: <-- %s status = %d\n", __func__, status);
    return status;
}

#ifdef CONFIG_NFS_V4_1
-/*
- * Returns true if the client IDs match
- */
-static bool nfs4_match_clientids(u64 a, u64 b)
-{
-   if (a != b) {
-       dprintk("NFS: --> %s client ID %llx does not match %llx\n",
-           __func__, a, b);
-       return false;
-   }
-   dprintk("NFS: --> %s client ID %llx matches %llx\n",
-       __func__, a, b);
-   return true;
-}
-
/*
 * Returns true if the server major ids match
 */
···
        struct nfs41_server_owner *o2)
{
    if (o1->major_id_sz != o2->major_id_sz)
-       goto out_major_mismatch;
-   if (memcmp(o1->major_id, o2->major_id, o1->major_id_sz) != 0)
-       goto out_major_mismatch;
-
-   dprintk("NFS: --> %s server owner major IDs match\n", __func__);
-   return true;
-
-out_major_mismatch:
-   dprintk("NFS: --> %s server owner major IDs do not match\n",
-       __func__);
-   return false;
-}
-
-/*
- * Returns true if server minor ids match
- */
-static bool
-nfs4_check_serverowner_minor_id(struct nfs41_server_owner *o1,
-       struct nfs41_server_owner *o2)
-{
-   /* Check eir_server_owner so_minor_id */
-   if (o1->minor_id != o2->minor_id)
-       goto out_minor_mismatch;
-
-   dprintk("NFS: --> %s server owner minor IDs match\n", __func__);
-   return true;
-
-out_minor_mismatch:
-   dprintk("NFS: --> %s server owner minor IDs do not match\n", __func__);
-   return false;
+       return false;
+   return memcmp(o1->major_id, o2->major_id, o1->major_id_sz) == 0;
}

/*
···
        struct nfs41_server_scope *s2)
{
    if (s1->server_scope_sz != s2->server_scope_sz)
-       goto out_scope_mismatch;
-   if (memcmp(s1->server_scope, s2->server_scope,
-          s1->server_scope_sz) != 0)
-       goto out_scope_mismatch;
-
-   dprintk("NFS: --> %s server scopes match\n", __func__);
-   return true;
-
-out_scope_mismatch:
-   dprintk("NFS: --> %s server scopes do not match\n",
-       __func__);
-   return false;
+       return false;
+   return memcmp(s1->server_scope, s2->server_scope,
+             s1->server_scope_sz) == 0;
}

/**
···
        struct rpc_xprt *xprt)
{
    /* Check eir_clientid */
-   if (!nfs4_match_clientids(clp->cl_clientid, res->clientid))
+   if (clp->cl_clientid != res->clientid)
        goto out_err;

    /* Check eir_server_owner so_major_id */
···
        goto out_err;

    /* Check eir_server_owner so_minor_id */
-   if (!nfs4_check_serverowner_minor_id(clp->cl_serverowner,
-                        res->server_owner))
+   if (clp->cl_serverowner->minor_id != res->server_owner->minor_id)
        goto out_err;

    /* Check eir_server_scope */
···
        if (pos == new)
            goto found;

-       if (pos->rpc_ops != new->rpc_ops)
-           continue;
-
-       if (pos->cl_minorversion != new->cl_minorversion)
-           continue;
-
-       /* If "pos" isn't marked ready, we can't trust the
-        * remaining fields in "pos", especially the client
-        * ID and serverowner fields. Wait for CREATE_SESSION
-        * to finish. */
-       if (pos->cl_cons_state > NFS_CS_READY) {
-           atomic_inc(&pos->cl_count);
-           spin_unlock(&nn->nfs_client_lock);
-
-           nfs_put_client(prev);
-           prev = pos;
-
-           status = nfs_wait_client_init_complete(pos);
-           spin_lock(&nn->nfs_client_lock);
-           if (status < 0)
-               break;
-           status = -NFS4ERR_STALE_CLIENTID;
-       }
-       if (pos->cl_cons_state != NFS_CS_READY)
-           continue;
-
-       if (!nfs4_match_clientids(pos->cl_clientid, new->cl_clientid))
+       status = nfs4_match_client(pos, new, &prev, nn);
+       if (status < 0)
+           goto out;
+       if (status != 0)
            continue;

        /*
···
                     new->cl_serverowner))
            continue;

-       /* Unlike NFSv4.0, we know that NFSv4.1 always uses the
-        * uniform string, however someone might switch the
-        * uniquifier string on us.
-        */
-       if (!nfs4_match_client_owner_id(pos, new))
-           continue;
found:
        atomic_inc(&pos->cl_count);
        *result = pos;
        status = 0;
-       dprintk("NFS: <-- %s using nfs_client = %p ({%d})\n",
-           __func__, pos, atomic_read(&pos->cl_count));
        break;
    }

+out:
    spin_unlock(&nn->nfs_client_lock);
-   dprintk("NFS: <-- %s status = %d\n", __func__, status);
    nfs_put_client(prev);
    return status;
}
···
        .timeparms = timeparms,
    };
    struct nfs_client *clp;
-   int error;
-
-   dprintk("--> nfs4_set_client()\n");

    if (server->flags & NFS_MOUNT_NORESVPORT)
        set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags);
···

    /* Allocate or find a client reference we can use */
    clp = nfs_get_client(&cl_init);
-   if (IS_ERR(clp)) {
-       error = PTR_ERR(clp);
-       goto error;
-   }
+   if (IS_ERR(clp))
+       return PTR_ERR(clp);

-   if (server->nfs_client == clp) {
-       error = -ELOOP;
-       goto error;
-   }
+   if (server->nfs_client == clp)
+       return -ELOOP;

    /*
     * Query for the lease time on clientid setup or renewal
···
    set_bit(NFS_CS_CHECK_LEASE_TIME, &clp->cl_res_state);

    server->nfs_client = clp;
-   dprintk("<-- nfs4_set_client() = 0 [new %p]\n", clp);
    return 0;
-error:
-   dprintk("<-- nfs4_set_client() = xerror %d\n", error);
-   return error;
}

/*
···
        .net = mds_clp->cl_net,
        .timeparms = &ds_timeout,
    };
-   struct nfs_client *clp;
    char buf[INET6_ADDRSTRLEN + 1];

    if (rpc_ntop(ds_addr, buf, sizeof(buf)) <= 0)
···
     * (section 13.1 RFC 5661).
     */
    nfs_init_timeout_values(&ds_timeout, ds_proto, ds_timeo, ds_retrans);
-   clp = nfs_get_client(&cl_init);
-
-   dprintk("<-- %s %p\n", __func__, clp);
-   return clp;
+   return nfs_get_client(&cl_init);
}
EXPORT_SYMBOL_GPL(nfs4_set_ds_client);

···
    struct rpc_timeout timeparms;
    int error;

-   dprintk("--> nfs4_init_server()\n");
-
    nfs_init_timeout_values(&timeparms, data->nfs_server.protocol,
            data->timeo, data->retrans);

···
            data->minorversion,
            data->net);
    if (error < 0)
-       goto error;
+       return error;

    if (data->rsize)
        server->rsize = nfs_block_size(data->rsize, NULL);
···
    server->acregmax = data->acregmax * HZ;
    server->acdirmin = data->acdirmin * HZ;
    server->acdirmax = data->acdirmax * HZ;
+   server->port = data->nfs_server.port;

-   server->port = data->nfs_server.port;
-
-   error = nfs_init_server_rpcclient(server, &timeparms,
-           data->selected_flavor);
-
-error:
-   /* Done */
-   dprintk("<-- nfs4_init_server() = %d\n", error);
-   return error;
+   return nfs_init_server_rpcclient(server, &timeparms,
+                    data->selected_flavor);
}

/*
···
    struct nfs_server *server;
    bool auth_probe;
    int error;

-   dprintk("--> nfs4_create_server()\n");

    server = nfs_alloc_server();
    if (!server)
···
    if (error < 0)
        goto error;

-   dprintk("<-- nfs4_create_server() = %p\n", server);
    return server;

error:
    nfs_free_server(server);
-   dprintk("<-- nfs4_create_server() = error %d\n", error);
    return ERR_PTR(error);
}

···
    struct nfs_server *server, *parent_server;
    bool auth_probe;
    int error;

-   dprintk("--> nfs4_create_referral_server()\n");

    server = nfs_alloc_server();
    if (!server)
···
    if (error < 0)
        goto error;

-   dprintk("<-- nfs_create_referral_server() = %p\n", server);
    return server;

error:
    nfs_free_server(server);
-   dprintk("<-- nfs4_create_referral_server() = error %d\n", error);
    return ERR_PTR(error);
}

···
    struct sockaddr *localaddr = (struct sockaddr *)&address;
    int error;

-   dprintk("--> %s: move FSID %llx:%llx to \"%s\")\n", __func__,
-           (unsigned long long)server->fsid.major,
-           (unsigned long long)server->fsid.minor,
-           hostname);
-
    error = rpc_switch_client_transport(clnt, &xargs, clnt->cl_timeout);
-   if (error != 0) {
-       dprintk("<-- %s(): rpc_switch_client_transport returned %d\n",
-           __func__, error);
-       goto out;
-   }
+   if (error != 0)
+       return error;

    error = rpc_localaddr(clnt, localaddr, sizeof(address));
-   if (error != 0) {
-       dprintk("<-- %s(): rpc_localaddr returned %d\n",
-           __func__, error);
-       goto out;
-   }
+   if (error != 0)
+       return error;

-   error = -EAFNOSUPPORT;
-   if (rpc_ntop(localaddr, buf, sizeof(buf)) == 0) {
-       dprintk("<-- %s(): rpc_ntop returned %d\n",
-           __func__, error);
-       goto out;
-   }
+   if (rpc_ntop(localaddr, buf, sizeof(buf)) == 0)
+       return -EAFNOSUPPORT;

    nfs_server_remove_lists(server);
    error = nfs4_set_client(server, hostname, sap, salen, buf,
···
    nfs_put_client(clp);
    if (error != 0) {
        nfs_server_insert_lists(server);
-       dprintk("<-- %s(): nfs4_set_client returned %d\n",
-           __func__, error);
-       goto out;
+       return error;
    }

    if (server->nfs_client->cl_hostname == NULL)
        server->nfs_client->cl_hostname = kstrdup(hostname, GFP_KERNEL);
    nfs_server_insert_lists(server);

-   error = nfs_probe_destination(server);
-   if (error < 0)
-       goto out;
-
-   dprintk("<-- %s() succeeded\n", __func__);
-
-out:
-   return error;
+   return nfs_probe_destination(server);
}
-3
fs/nfs/nfs4getroot.c
···
    struct nfs_fsinfo fsinfo;
    int ret = -ENOMEM;

-   dprintk("--> nfs4_get_rootfh()\n");
-
    fsinfo.fattr = nfs_alloc_fattr();
    if (fsinfo.fattr == NULL)
        goto out;
···
    memcpy(&server->fsid, &fsinfo.fattr->fsid, sizeof(server->fsid));
out:
    nfs_free_fattr(fsinfo.fattr);
-   dprintk("<-- nfs4_get_rootfh() = %d\n", ret);
    return ret;
}
+1 -6
fs/nfs/nfs4namespace.c
···
out:
    free_page((unsigned long) page);
    free_page((unsigned long) page2);
-   dprintk("%s: done\n", __func__);
    return mnt;
}

···
    int err;

    /* BUG_ON(IS_ROOT(dentry)); */
-   dprintk("%s: enter\n", __func__);
-
    page = alloc_page(GFP_KERNEL);
    if (page == NULL)
-       goto out;
+       return mnt;

    fs_locations = kmalloc(sizeof(struct nfs4_fs_locations), GFP_KERNEL);
    if (fs_locations == NULL)
···
out_free:
    __free_page(page);
    kfree(fs_locations);
-out:
-   dprintk("%s: done\n", __func__);
    return mnt;
}

+44 -55
fs/nfs/nfs4proc.c
···
    session = slot->table->session;

    if (slot->interrupted) {
-       slot->interrupted = 0;
+       if (res->sr_status != -NFS4ERR_DELAY)
+           slot->interrupted = 0;
        interrupted = true;
    }

···
        if (status != 0)
            return status;
    }
-   if (!(o_res->f_attr->valid & NFS_ATTR_FATTR))
+   if (!(o_res->f_attr->valid & NFS_ATTR_FATTR)) {
+       nfs4_sequence_free_slot(&o_res->seq_res);
        nfs4_proc_getattr(server, &o_res->fh, o_res->f_attr, o_res->f_label);
+   }
    return 0;
}

···
        .rpc_resp = &res,
    };
    int status;
+   int i;

    bitmask[0] = FATTR4_WORD0_SUPPORTED_ATTRS |
             FATTR4_WORD0_FH_EXPIRE_TYPE |
···
        server->cache_consistency_bitmask[0] &= FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE;
        server->cache_consistency_bitmask[1] &= FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
        server->cache_consistency_bitmask[2] = 0;
+
+       /* Avoid a regression due to buggy server */
+       for (i = 0; i < ARRAY_SIZE(res.exclcreat_bitmask); i++)
+           res.exclcreat_bitmask[i] &= res.attr_bitmask[i];
        memcpy(server->exclcreat_bitmask, res.exclcreat_bitmask,
               sizeof(server->exclcreat_bitmask));
+
        server->acl_bitmask = res.acl_bitmask;
        server->fh_expire_type = res.fh_expire_type;
    }
···
        return 0;
    if (nfs4_set_rw_stateid(&hdr->args.stateid, hdr->args.context,
                hdr->args.lock_context,
-               hdr->rw_ops->rw_mode) == -EIO)
+               hdr->rw_mode) == -EIO)
        return -EIO;
    if (unlikely(test_bit(NFS_CONTEXT_BAD, &hdr->args.context->flags)))
        return -EIO;
···
    if (!atomic_inc_not_zero(&clp->cl_count))
        return -EIO;
    data = kmalloc(sizeof(*data), GFP_NOFS);
-   if (data == NULL)
+   if (data == NULL) {
+       nfs_put_client(clp);
        return -ENOMEM;
+   }
    data->client = clp;
    data->timestamp = jiffies;
    return rpc_call_async(clp->cl_rpcclient, &msg, RPC_TASK_TIMEOUT,
···
    struct nfs_locku_res res;
    struct nfs4_lock_state *lsp;
    struct nfs_open_context *ctx;
+   struct nfs_lock_context *l_ctx;
    struct file_lock fl;
    struct nfs_server *server;
    unsigned long timestamp;
···
    atomic_inc(&lsp->ls_count);
    /* Ensure we don't close file until we're done freeing locks! */
    p->ctx = get_nfs_open_context(ctx);
+   p->l_ctx = nfs_get_lock_context(ctx);
    memcpy(&p->fl, fl, sizeof(p->fl));
    p->server = NFS_SERVER(inode);
    return p;
···
    struct nfs4_unlockdata *calldata = data;
    nfs_free_seqid(calldata->arg.seqid);
    nfs4_put_lock_state(calldata->lsp);
+   nfs_put_lock_context(calldata->l_ctx);
    put_nfs_open_context(calldata->ctx);
    kfree(calldata);
}
···
static void nfs4_locku_prepare(struct rpc_task *task, void *data)
{
    struct nfs4_unlockdata *calldata = data;

+   if (test_bit(NFS_CONTEXT_UNLOCK, &calldata->l_ctx->open_context->flags) &&
+       nfs_async_iocounter_wait(task, calldata->l_ctx))
+       return;
+
    if (nfs_wait_on_sequence(calldata->arg.seqid, task) != 0)
        goto out_wait;
···
     * canceled lock is passed in, and it won't be an unlock.
     */
    fl->fl_type = F_UNLCK;
+   if (fl->fl_flags & FL_CLOSE)
+       set_bit(NFS_CONTEXT_UNLOCK, &ctx->flags);

    data = nfs4_alloc_unlockdata(fl, ctx, lsp, seqid);
    if (data == NULL) {
···
    ctx = nfs_file_open_context(filp);
    state = ctx->state;

-   if (request->fl_start < 0 || request->fl_end < 0)
-       return -EINVAL;
-
    if (IS_GETLK(cmd)) {
        if (state != NULL)
            return nfs4_proc_getlk(state, F_GETLK, request);
···
    if ((request->fl_flags & FL_POSIX) &&
        !test_bit(NFS_STATE_POSIX_LOCKS, &state->flags))
        return -ENOLCK;

-   /*
-    * Don't rely on the VFS having checked the file open mode,
-    * since it won't do this for flock() locks.
-    */
-   switch (request->fl_type) {
-   case F_RDLCK:
-       if (!(filp->f_mode & FMODE_READ))
-           return -EBADF;
-       break;
-   case F_WRLCK:
-       if (!(filp->f_mode & FMODE_WRITE))
-           return -EBADF;
-   }

    status = nfs4_set_lock_state(state, request);
    if (status != 0)
···
    };
    struct rpc_task *task;

-   dprintk("--> %s\n", __func__);
-
    nfs4_copy_sessionid(&args.sessionid, &clp->cl_session->sess_id);
    if (!(clp->cl_session->flags & SESSION4_BACK_CHAN))
        args.dir = NFS4_CDFC4_FORE;
···
    if (memcmp(res.sessionid.data,
           clp->cl_session->sess_id.data, NFS4_MAX_SESSIONID_LEN)) {
        dprintk("NFS: %s: Session ID mismatch\n", __func__);
-       status = -EIO;
-       goto out;
+       return -EIO;
    }
    if ((res.dir & args.dir) != res.dir || res.dir == 0) {
        dprintk("NFS: %s: Unexpected direction from server\n",
            __func__);
-       status = -EIO;
-       goto out;
+       return -EIO;
    }
    if (res.use_conn_in_rdma_mode != args.use_conn_in_rdma_mode) {
        dprintk("NFS: %s: Server returned RDMA
mode = true\n", 7191 7188 __func__); 7192 - status = -EIO; 7193 - goto out; 7189 + return -EIO; 7194 7190 } 7195 7191 } 7196 - out: 7197 - dprintk("<-- %s status= %d\n", __func__, status); 7192 + 7198 7193 return status; 7199 7194 } 7200 7195 ··· 7456 7459 }; 7457 7460 struct nfs41_exchange_id_data *calldata; 7458 7461 struct rpc_task *task; 7459 - int status = -EIO; 7462 + int status; 7460 7463 7461 7464 if (!atomic_inc_not_zero(&clp->cl_count)) 7462 - goto out; 7465 + return -EIO; 7463 7466 7464 - status = -ENOMEM; 7465 7467 calldata = kzalloc(sizeof(*calldata), GFP_NOFS); 7466 - if (!calldata) 7467 - goto out; 7468 + if (!calldata) { 7469 + nfs_put_client(clp); 7470 + return -ENOMEM; 7471 + } 7468 7472 7469 7473 if (!xprt) 7470 7474 nfs4_init_boot_verifier(clp, &verifier); ··· 7473 7475 status = nfs4_init_uniform_client_string(clp); 7474 7476 if (status) 7475 7477 goto out_calldata; 7476 - 7477 - dprintk("NFS call exchange_id auth=%s, '%s'\n", 7478 - clp->cl_rpcclient->cl_auth->au_ops->au_name, 7479 - clp->cl_owner_id); 7480 7478 7481 7479 calldata->res.server_owner = kzalloc(sizeof(struct nfs41_server_owner), 7482 7480 GFP_NOFS); ··· 7539 7545 7540 7546 rpc_put_task(task); 7541 7547 out: 7542 - if (clp->cl_implid != NULL) 7543 - dprintk("NFS reply exchange_id: Server Implementation ID: " 7544 - "domain: %s, name: %s, date: %llu,%u\n", 7545 - clp->cl_implid->domain, clp->cl_implid->name, 7546 - clp->cl_implid->date.seconds, 7547 - clp->cl_implid->date.nseconds); 7548 - dprintk("NFS reply exchange_id: %d\n", status); 7549 7548 return status; 7550 7549 7551 7550 out_impl_id: ··· 7756 7769 7757 7770 nfs4_init_sequence(&args.la_seq_args, &res.lr_seq_res, 0); 7758 7771 nfs4_set_sequence_privileged(&args.la_seq_args); 7759 - dprintk("--> %s\n", __func__); 7760 7772 task = rpc_run_task(&task_setup); 7761 7773 7762 7774 if (IS_ERR(task)) 7763 - status = PTR_ERR(task); 7764 - else { 7765 - status = task->tk_status; 7766 - rpc_put_task(task); 7767 - } 7768 - dprintk("<-- 
%s return %d\n", __func__, status); 7775 + return PTR_ERR(task); 7769 7776 7777 + status = task->tk_status; 7778 + rpc_put_task(task); 7770 7779 return status; 7771 7780 } 7772 7781 ··· 8163 8180 /* fall through */ 8164 8181 case -NFS4ERR_RETRY_UNCACHED_REP: 8165 8182 return -EAGAIN; 8183 + case -NFS4ERR_BADSESSION: 8184 + case -NFS4ERR_DEADSESSION: 8185 + case -NFS4ERR_CONN_NOT_BOUND_TO_SESSION: 8186 + nfs4_schedule_session_recovery(clp->cl_session, 8187 + task->tk_status); 8188 + break; 8166 8189 default: 8167 8190 nfs4_schedule_lease_recovery(clp); 8168 8191 } ··· 8247 8258 if (status == 0) 8248 8259 status = task->tk_status; 8249 8260 rpc_put_task(task); 8250 - return 0; 8251 8261 out: 8252 8262 dprintk("<-- %s status=%d\n", __func__, status); 8253 8263 return status; ··· 8345 8357 */ 8346 8358 pnfs_mark_layout_stateid_invalid(lo, &head); 8347 8359 spin_unlock(&inode->i_lock); 8360 + nfs_commit_inode(inode, 0); 8348 8361 pnfs_free_lseg_list(&head); 8349 8362 status = -EAGAIN; 8350 8363 goto out;
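A recurring fix in the nfs4proc.c hunks above (seen in both the RENEW and EXCHANGE_ID paths) is balancing the nfs_client reference count when an allocation taken after `atomic_inc_not_zero()` fails: without the added `nfs_put_client()` call, the error path leaks a reference and the client structure can never be freed. A minimal userspace sketch of the pattern, using toy stand-ins rather than the real kernel API:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel's nfs_client refcounting; the names and
 * error values are illustrative, not the kernel's. */
struct clnt { int cl_count; };

static int  get_client(struct clnt *c) { if (c->cl_count == 0) return 0; c->cl_count++; return 1; }
static void put_client(struct clnt *c) { c->cl_count--; }

/* Mirrors the fixed async-renew logic: once the reference is taken, every
 * exit path, including allocation failure, must drop it again. */
static int start_renew(struct clnt *c, int alloc_fails)
{
	void *data;

	if (!get_client(c))
		return -5;		/* -EIO: client already being torn down */
	data = alloc_fails ? NULL : malloc(16);
	if (data == NULL) {
		put_client(c);		/* the fix: balance the reference on error */
		return -12;		/* -ENOMEM */
	}
	free(data);
	put_client(c);			/* normal completion path */
	return 0;
}
```

The same shape applies to nfs4_proc_exchange_id: the reference is taken first because the async RPC would otherwise race with client teardown, so the failure path, not the caller, owns the drop.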
+7 -3
fs/nfs/nfs4state.c
··· 1649 1649 nfs4_state_mark_reclaim_helper(clp, nfs4_state_mark_reclaim_reboot); 1650 1650 } 1651 1651 1652 - static void nfs4_reclaim_complete(struct nfs_client *clp, 1652 + static int nfs4_reclaim_complete(struct nfs_client *clp, 1653 1653 const struct nfs4_state_recovery_ops *ops, 1654 1654 struct rpc_cred *cred) 1655 1655 { 1656 1656 /* Notify the server we're done reclaiming our state */ 1657 1657 if (ops->reclaim_complete) 1658 - (void)ops->reclaim_complete(clp, cred); 1658 + return ops->reclaim_complete(clp, cred); 1659 + return 0; 1659 1660 } 1660 1661 1661 1662 static void nfs4_clear_reclaim_server(struct nfs_server *server) ··· 1703 1702 { 1704 1703 const struct nfs4_state_recovery_ops *ops; 1705 1704 struct rpc_cred *cred; 1705 + int err; 1706 1706 1707 1707 if (!nfs4_state_clear_reclaim_reboot(clp)) 1708 1708 return; 1709 1709 ops = clp->cl_mvops->reboot_recovery_ops; 1710 1710 cred = nfs4_get_clid_cred(clp); 1711 - nfs4_reclaim_complete(clp, ops, cred); 1711 + err = nfs4_reclaim_complete(clp, ops, cred); 1712 1712 put_rpccred(cred); 1713 + if (err == -NFS4ERR_CONN_NOT_BOUND_TO_SESSION) 1714 + set_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state); 1713 1715 } 1714 1716 1715 1717 static void nfs4_state_start_reclaim_nograce(struct nfs_client *clp)
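The nfs4state.c change converts `nfs4_reclaim_complete()` from a fire-and-forget void call into one whose status is inspected, so that `NFS4ERR_CONN_NOT_BOUND_TO_SESSION` re-arms the reboot-reclaim state bit and the operation is retried after the session is re-bound. A toy model of that control flow (the flag bit and struct are illustrative; the error number is the real RFC 5661 value):

```c
#include <assert.h>

#define NFS4ERR_CONN_NOT_BOUND_TO_SESSION 10055	/* RFC 5661 error code */
#define NFS4CLNT_RECLAIM_REBOOT (1u << 0)	/* toy stand-in for the state bit */

struct toy_clp { unsigned long state; };

/* Mirrors nfs4_state_end_reclaim_reboot after the fix: if RECLAIM_COMPLETE
 * failed because the connection is not bound to the session, leave reclaim
 * pending so the state manager retries instead of silently dropping it. */
static void end_reclaim_reboot(struct toy_clp *clp, int reclaim_status)
{
	if (reclaim_status == -NFS4ERR_CONN_NOT_BOUND_TO_SESSION)
		clp->state |= NFS4CLNT_RECLAIM_REBOOT;
}
```

This pairs with the nfs4proc.c hunk above that makes the sequence-done handler schedule session recovery for BADSESSION/DEADSESSION/CONN_NOT_BOUND_TO_SESSION rather than lease recovery.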
+41 -53
fs/nfs/nfs4xdr.c
··· 1000 1000 1001 1001 static void encode_attrs(struct xdr_stream *xdr, const struct iattr *iap, 1002 1002 const struct nfs4_label *label, 1003 + const umode_t *umask, 1003 1004 const struct nfs_server *server, 1004 - bool excl_check, const umode_t *umask) 1005 + const uint32_t attrmask[]) 1005 1006 { 1006 1007 char owner_name[IDMAP_NAMESZ]; 1007 1008 char owner_group[IDMAP_NAMESZ]; ··· 1017 1016 /* 1018 1017 * We reserve enough space to write the entire attribute buffer at once. 1019 1018 */ 1020 - if (iap->ia_valid & ATTR_SIZE) { 1019 + if ((iap->ia_valid & ATTR_SIZE) && (attrmask[0] & FATTR4_WORD0_SIZE)) { 1021 1020 bmval[0] |= FATTR4_WORD0_SIZE; 1022 1021 len += 8; 1023 1022 } 1024 - if (!(server->attr_bitmask[2] & FATTR4_WORD2_MODE_UMASK)) 1025 - umask = NULL; 1026 1023 if (iap->ia_valid & ATTR_MODE) { 1027 - if (umask) { 1024 + if (umask && (attrmask[2] & FATTR4_WORD2_MODE_UMASK)) { 1028 1025 bmval[2] |= FATTR4_WORD2_MODE_UMASK; 1029 1026 len += 8; 1030 - } else { 1027 + } else if (attrmask[1] & FATTR4_WORD1_MODE) { 1031 1028 bmval[1] |= FATTR4_WORD1_MODE; 1032 1029 len += 4; 1033 1030 } 1034 1031 } 1035 - if (iap->ia_valid & ATTR_UID) { 1032 + if ((iap->ia_valid & ATTR_UID) && (attrmask[1] & FATTR4_WORD1_OWNER)) { 1036 1033 owner_namelen = nfs_map_uid_to_name(server, iap->ia_uid, owner_name, IDMAP_NAMESZ); 1037 1034 if (owner_namelen < 0) { 1038 1035 dprintk("nfs: couldn't resolve uid %d to string\n", ··· 1043 1044 bmval[1] |= FATTR4_WORD1_OWNER; 1044 1045 len += 4 + (XDR_QUADLEN(owner_namelen) << 2); 1045 1046 } 1046 - if (iap->ia_valid & ATTR_GID) { 1047 + if ((iap->ia_valid & ATTR_GID) && 1048 + (attrmask[1] & FATTR4_WORD1_OWNER_GROUP)) { 1047 1049 owner_grouplen = nfs_map_gid_to_group(server, iap->ia_gid, owner_group, IDMAP_NAMESZ); 1048 1050 if (owner_grouplen < 0) { 1049 1051 dprintk("nfs: couldn't resolve gid %d to string\n", ··· 1056 1056 bmval[1] |= FATTR4_WORD1_OWNER_GROUP; 1057 1057 len += 4 + (XDR_QUADLEN(owner_grouplen) << 2); 1058 1058 } 1059 
- if (iap->ia_valid & ATTR_ATIME_SET) { 1060 - bmval[1] |= FATTR4_WORD1_TIME_ACCESS_SET; 1061 - len += 16; 1062 - } else if (iap->ia_valid & ATTR_ATIME) { 1063 - bmval[1] |= FATTR4_WORD1_TIME_ACCESS_SET; 1064 - len += 4; 1059 + if (attrmask[1] & FATTR4_WORD1_TIME_ACCESS_SET) { 1060 + if (iap->ia_valid & ATTR_ATIME_SET) { 1061 + bmval[1] |= FATTR4_WORD1_TIME_ACCESS_SET; 1062 + len += 16; 1063 + } else if (iap->ia_valid & ATTR_ATIME) { 1064 + bmval[1] |= FATTR4_WORD1_TIME_ACCESS_SET; 1065 + len += 4; 1066 + } 1065 1067 } 1066 - if (iap->ia_valid & ATTR_MTIME_SET) { 1067 - bmval[1] |= FATTR4_WORD1_TIME_MODIFY_SET; 1068 - len += 16; 1069 - } else if (iap->ia_valid & ATTR_MTIME) { 1070 - bmval[1] |= FATTR4_WORD1_TIME_MODIFY_SET; 1071 - len += 4; 1072 - } 1073 - 1074 - if (excl_check) { 1075 - const u32 *excl_bmval = server->exclcreat_bitmask; 1076 - bmval[0] &= excl_bmval[0]; 1077 - bmval[1] &= excl_bmval[1]; 1078 - bmval[2] &= excl_bmval[2]; 1079 - 1080 - if (!(excl_bmval[2] & FATTR4_WORD2_SECURITY_LABEL)) 1081 - label = NULL; 1068 + if (attrmask[1] & FATTR4_WORD1_TIME_MODIFY_SET) { 1069 + if (iap->ia_valid & ATTR_MTIME_SET) { 1070 + bmval[1] |= FATTR4_WORD1_TIME_MODIFY_SET; 1071 + len += 16; 1072 + } else if (iap->ia_valid & ATTR_MTIME) { 1073 + bmval[1] |= FATTR4_WORD1_TIME_MODIFY_SET; 1074 + len += 4; 1075 + } 1082 1076 } 1083 1077 1084 - if (label) { 1078 + if (label && (attrmask[2] & FATTR4_WORD2_SECURITY_LABEL)) { 1085 1079 len += 4 + 4 + 4 + (XDR_QUADLEN(label->len) << 2); 1086 1080 bmval[2] |= FATTR4_WORD2_SECURITY_LABEL; 1087 1081 } ··· 1182 1188 } 1183 1189 1184 1190 encode_string(xdr, create->name->len, create->name->name); 1185 - encode_attrs(xdr, create->attrs, create->label, create->server, false, 1186 - &create->umask); 1191 + encode_attrs(xdr, create->attrs, create->label, &create->umask, 1192 + create->server, create->server->attr_bitmask); 1187 1193 } 1188 1194 1189 1195 static void encode_getattr_one(struct xdr_stream *xdr, uint32_t bitmap, struct 
compound_hdr *hdr) ··· 1403 1409 switch(arg->createmode) { 1404 1410 case NFS4_CREATE_UNCHECKED: 1405 1411 *p = cpu_to_be32(NFS4_CREATE_UNCHECKED); 1406 - encode_attrs(xdr, arg->u.attrs, arg->label, arg->server, false, 1407 - &arg->umask); 1412 + encode_attrs(xdr, arg->u.attrs, arg->label, &arg->umask, 1413 + arg->server, arg->server->attr_bitmask); 1408 1414 break; 1409 1415 case NFS4_CREATE_GUARDED: 1410 1416 *p = cpu_to_be32(NFS4_CREATE_GUARDED); 1411 - encode_attrs(xdr, arg->u.attrs, arg->label, arg->server, false, 1412 - &arg->umask); 1417 + encode_attrs(xdr, arg->u.attrs, arg->label, &arg->umask, 1418 + arg->server, arg->server->attr_bitmask); 1413 1419 break; 1414 1420 case NFS4_CREATE_EXCLUSIVE: 1415 1421 *p = cpu_to_be32(NFS4_CREATE_EXCLUSIVE); ··· 1418 1424 case NFS4_CREATE_EXCLUSIVE4_1: 1419 1425 *p = cpu_to_be32(NFS4_CREATE_EXCLUSIVE4_1); 1420 1426 encode_nfs4_verifier(xdr, &arg->u.verifier); 1421 - encode_attrs(xdr, arg->u.attrs, arg->label, arg->server, true, 1422 - &arg->umask); 1427 + encode_attrs(xdr, arg->u.attrs, arg->label, &arg->umask, 1428 + arg->server, arg->server->exclcreat_bitmask); 1423 1429 } 1424 1430 } 1425 1431 ··· 1675 1681 { 1676 1682 encode_op_hdr(xdr, OP_SETATTR, decode_setattr_maxsz, hdr); 1677 1683 encode_nfs4_stateid(xdr, &arg->stateid); 1678 - encode_attrs(xdr, arg->iap, arg->label, server, false, NULL); 1684 + encode_attrs(xdr, arg->iap, arg->label, NULL, server, 1685 + server->attr_bitmask); 1679 1686 } 1680 1687 1681 1688 static void encode_setclientid(struct xdr_stream *xdr, const struct nfs4_setclientid *setclientid, struct compound_hdr *hdr) ··· 2000 2005 *p++ = cpu_to_be32(0); /* Never send time_modify_changed */ 2001 2006 *p++ = cpu_to_be32(NFS_SERVER(args->inode)->pnfs_curr_ld->id);/* type */ 2002 2007 2003 - if (NFS_SERVER(inode)->pnfs_curr_ld->encode_layoutcommit) { 2004 - NFS_SERVER(inode)->pnfs_curr_ld->encode_layoutcommit( 2005 - NFS_I(inode)->layout, xdr, args); 2006 - } else { 2007 - encode_uint32(xdr, 
args->layoutupdate_len); 2008 - if (args->layoutupdate_pages) { 2009 - xdr_write_pages(xdr, args->layoutupdate_pages, 0, 2010 - args->layoutupdate_len); 2011 - } 2012 - } 2008 + encode_uint32(xdr, args->layoutupdate_len); 2009 + if (args->layoutupdate_pages) 2010 + xdr_write_pages(xdr, args->layoutupdate_pages, 0, 2011 + args->layoutupdate_len); 2013 2012 2014 2013 return 0; 2015 2014 } ··· 2013 2024 const struct nfs4_layoutreturn_args *args, 2014 2025 struct compound_hdr *hdr) 2015 2026 { 2016 - const struct pnfs_layoutdriver_type *lr_ops = NFS_SERVER(args->inode)->pnfs_curr_ld; 2017 2027 __be32 *p; 2018 2028 2019 2029 encode_op_hdr(xdr, OP_LAYOUTRETURN, decode_layoutreturn_maxsz, hdr); ··· 2029 2041 spin_unlock(&args->inode->i_lock); 2030 2042 if (args->ld_private->ops && args->ld_private->ops->encode) 2031 2043 args->ld_private->ops->encode(xdr, args, args->ld_private); 2032 - else if (lr_ops->encode_layoutreturn) 2033 - lr_ops->encode_layoutreturn(xdr, args); 2034 2044 else 2035 2045 encode_uint32(xdr, 0); 2036 2046 } ··· 5565 5579 unsigned int i; 5566 5580 5567 5581 p = xdr_inline_decode(xdr, 4); 5582 + if (!p) 5583 + return -EIO; 5568 5584 bitmap_words = be32_to_cpup(p++); 5569 5585 if (bitmap_words > NFS4_OP_MAP_NUM_WORDS) 5570 5586 return -EIO;
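The `encode_attrs()` rework above replaces the ad-hoc `excl_check` flag with an explicit `attrmask[]` parameter: callers pass either `server->attr_bitmask` or `server->exclcreat_bitmask`, and each candidate attribute is gated on the corresponding bitmap word. The "avoid a regression due to buggy server" hunk in nfs4proc.c applies the matching clamp at mount time. A standalone sketch of that clamp (the bit values are illustrative, not the real FATTR4 constants):

```c
#include <assert.h>
#include <stdint.h>

#define NWORDS 3	/* NFSv4 attribute bitmaps are three 32-bit words */

/* Mirrors the workaround loop: never keep an exclusive-create attribute
 * that the server did not also advertise in its general supported-attrs
 * bitmask, so a buggy exclcreat bitmask cannot widen what we encode. */
static void clamp_exclcreat(uint32_t exclcreat[NWORDS],
			    const uint32_t supported[NWORDS])
{
	int i;

	for (i = 0; i < NWORDS; i++)
		exclcreat[i] &= supported[i];
}
```

With the clamp done once per server, `encode_attrs()` can trust whichever mask it is handed, which is what lets the EXCLUSIVE4_1 create path simply pass `exclcreat_bitmask` instead of carrying the special-case flag.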
-5
fs/nfs/objlayout/Kbuild
··· 1 - # 2 - # Makefile for the pNFS Objects Layout Driver kernel module 3 - # 4 - objlayoutdriver-y := objio_osd.o pnfs_osd_xdr_cli.o objlayout.o 5 - obj-$(CONFIG_PNFS_OBJLAYOUT) += objlayoutdriver.o
-675
fs/nfs/objlayout/objio_osd.c
··· 1 - /* 2 - * pNFS Objects layout implementation over open-osd initiator library 3 - * 4 - * Copyright (C) 2009 Panasas Inc. [year of first publication] 5 - * All rights reserved. 6 - * 7 - * Benny Halevy <bhalevy@panasas.com> 8 - * Boaz Harrosh <ooo@electrozaur.com> 9 - * 10 - * This program is free software; you can redistribute it and/or modify 11 - * it under the terms of the GNU General Public License version 2 12 - * See the file COPYING included with this distribution for more details. 13 - * 14 - * Redistribution and use in source and binary forms, with or without 15 - * modification, are permitted provided that the following conditions 16 - * are met: 17 - * 18 - * 1. Redistributions of source code must retain the above copyright 19 - * notice, this list of conditions and the following disclaimer. 20 - * 2. Redistributions in binary form must reproduce the above copyright 21 - * notice, this list of conditions and the following disclaimer in the 22 - * documentation and/or other materials provided with the distribution. 23 - * 3. Neither the name of the Panasas company nor the names of its 24 - * contributors may be used to endorse or promote products derived 25 - * from this software without specific prior written permission. 26 - * 27 - * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED 28 - * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 29 - * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 30 - * DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE 31 - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 32 - * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 33 - * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR 34 - * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF 35 - * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 36 - * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 37 - * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 38 - */ 39 - 40 - #include <linux/module.h> 41 - #include <scsi/osd_ore.h> 42 - 43 - #include "objlayout.h" 44 - #include "../internal.h" 45 - 46 - #define NFSDBG_FACILITY NFSDBG_PNFS_LD 47 - 48 - struct objio_dev_ent { 49 - struct nfs4_deviceid_node id_node; 50 - struct ore_dev od; 51 - }; 52 - 53 - static void 54 - objio_free_deviceid_node(struct nfs4_deviceid_node *d) 55 - { 56 - struct objio_dev_ent *de = container_of(d, struct objio_dev_ent, id_node); 57 - 58 - dprintk("%s: free od=%p\n", __func__, de->od.od); 59 - osduld_put_device(de->od.od); 60 - kfree_rcu(d, rcu); 61 - } 62 - 63 - struct objio_segment { 64 - struct pnfs_layout_segment lseg; 65 - 66 - struct ore_layout layout; 67 - struct ore_components oc; 68 - }; 69 - 70 - static inline struct objio_segment * 71 - OBJIO_LSEG(struct pnfs_layout_segment *lseg) 72 - { 73 - return container_of(lseg, struct objio_segment, lseg); 74 - } 75 - 76 - struct objio_state { 77 - /* Generic layer */ 78 - struct objlayout_io_res oir; 79 - 80 - bool sync; 81 - /*FIXME: Support for extra_bytes at ore_get_rw_state() */ 82 - struct ore_io_state *ios; 83 - }; 84 - 85 - /* Send and wait for a get_device_info of devices in the layout, 86 - then look them up with the osd_initiator library */ 87 - struct nfs4_deviceid_node * 88 - objio_alloc_deviceid_node(struct nfs_server *server, struct pnfs_device *pdev, 89 - gfp_t gfp_flags) 90 - { 91 - struct pnfs_osd_deviceaddr *deviceaddr; 92 - 
struct objio_dev_ent *ode = NULL; 93 - struct osd_dev *od; 94 - struct osd_dev_info odi; 95 - bool retry_flag = true; 96 - __be32 *p; 97 - int err; 98 - 99 - deviceaddr = kzalloc(sizeof(*deviceaddr), gfp_flags); 100 - if (!deviceaddr) 101 - return NULL; 102 - 103 - p = page_address(pdev->pages[0]); 104 - pnfs_osd_xdr_decode_deviceaddr(deviceaddr, p); 105 - 106 - odi.systemid_len = deviceaddr->oda_systemid.len; 107 - if (odi.systemid_len > sizeof(odi.systemid)) { 108 - dprintk("%s: odi.systemid_len > sizeof(systemid=%zd)\n", 109 - __func__, sizeof(odi.systemid)); 110 - err = -EINVAL; 111 - goto out; 112 - } else if (odi.systemid_len) 113 - memcpy(odi.systemid, deviceaddr->oda_systemid.data, 114 - odi.systemid_len); 115 - odi.osdname_len = deviceaddr->oda_osdname.len; 116 - odi.osdname = (u8 *)deviceaddr->oda_osdname.data; 117 - 118 - if (!odi.osdname_len && !odi.systemid_len) { 119 - dprintk("%s: !odi.osdname_len && !odi.systemid_len\n", 120 - __func__); 121 - err = -ENODEV; 122 - goto out; 123 - } 124 - 125 - retry_lookup: 126 - od = osduld_info_lookup(&odi); 127 - if (IS_ERR(od)) { 128 - err = PTR_ERR(od); 129 - dprintk("%s: osduld_info_lookup => %d\n", __func__, err); 130 - if (err == -ENODEV && retry_flag) { 131 - err = objlayout_autologin(deviceaddr); 132 - if (likely(!err)) { 133 - retry_flag = false; 134 - goto retry_lookup; 135 - } 136 - } 137 - goto out; 138 - } 139 - 140 - dprintk("Adding new dev_id(%llx:%llx)\n", 141 - _DEVID_LO(&pdev->dev_id), _DEVID_HI(&pdev->dev_id)); 142 - 143 - ode = kzalloc(sizeof(*ode), gfp_flags); 144 - if (!ode) { 145 - dprintk("%s: -ENOMEM od=%p\n", __func__, od); 146 - goto out; 147 - } 148 - 149 - nfs4_init_deviceid_node(&ode->id_node, server, &pdev->dev_id); 150 - kfree(deviceaddr); 151 - 152 - ode->od.od = od; 153 - return &ode->id_node; 154 - 155 - out: 156 - kfree(deviceaddr); 157 - return NULL; 158 - } 159 - 160 - static void copy_single_comp(struct ore_components *oc, unsigned c, 161 - struct pnfs_osd_object_cred 
*src_comp) 162 - { 163 - struct ore_comp *ocomp = &oc->comps[c]; 164 - 165 - WARN_ON(src_comp->oc_cap_key.cred_len > 0); /* libosd is NO_SEC only */ 166 - WARN_ON(src_comp->oc_cap.cred_len > sizeof(ocomp->cred)); 167 - 168 - ocomp->obj.partition = src_comp->oc_object_id.oid_partition_id; 169 - ocomp->obj.id = src_comp->oc_object_id.oid_object_id; 170 - 171 - memcpy(ocomp->cred, src_comp->oc_cap.cred, sizeof(ocomp->cred)); 172 - } 173 - 174 - static int __alloc_objio_seg(unsigned numdevs, gfp_t gfp_flags, 175 - struct objio_segment **pseg) 176 - { 177 - /* This is the in memory structure of the objio_segment 178 - * 179 - * struct __alloc_objio_segment { 180 - * struct objio_segment olseg; 181 - * struct ore_dev *ods[numdevs]; 182 - * struct ore_comp comps[numdevs]; 183 - * } *aolseg; 184 - * NOTE: The code as above compiles and runs perfectly. It is elegant, 185 - * type safe and compact. At some Past time Linus has decided he does not 186 - * like variable length arrays, For the sake of this principal we uglify 187 - * the code as below. 
188 - */ 189 - struct objio_segment *lseg; 190 - size_t lseg_size = sizeof(*lseg) + 191 - numdevs * sizeof(lseg->oc.ods[0]) + 192 - numdevs * sizeof(*lseg->oc.comps); 193 - 194 - lseg = kzalloc(lseg_size, gfp_flags); 195 - if (unlikely(!lseg)) { 196 - dprintk("%s: Failed allocation numdevs=%d size=%zd\n", __func__, 197 - numdevs, lseg_size); 198 - return -ENOMEM; 199 - } 200 - 201 - lseg->oc.numdevs = numdevs; 202 - lseg->oc.single_comp = EC_MULTPLE_COMPS; 203 - lseg->oc.ods = (void *)(lseg + 1); 204 - lseg->oc.comps = (void *)(lseg->oc.ods + numdevs); 205 - 206 - *pseg = lseg; 207 - return 0; 208 - } 209 - 210 - int objio_alloc_lseg(struct pnfs_layout_segment **outp, 211 - struct pnfs_layout_hdr *pnfslay, 212 - struct pnfs_layout_range *range, 213 - struct xdr_stream *xdr, 214 - gfp_t gfp_flags) 215 - { 216 - struct nfs_server *server = NFS_SERVER(pnfslay->plh_inode); 217 - struct objio_segment *objio_seg; 218 - struct pnfs_osd_xdr_decode_layout_iter iter; 219 - struct pnfs_osd_layout layout; 220 - struct pnfs_osd_object_cred src_comp; 221 - unsigned cur_comp; 222 - int err; 223 - 224 - err = pnfs_osd_xdr_decode_layout_map(&layout, &iter, xdr); 225 - if (unlikely(err)) 226 - return err; 227 - 228 - err = __alloc_objio_seg(layout.olo_num_comps, gfp_flags, &objio_seg); 229 - if (unlikely(err)) 230 - return err; 231 - 232 - objio_seg->layout.stripe_unit = layout.olo_map.odm_stripe_unit; 233 - objio_seg->layout.group_width = layout.olo_map.odm_group_width; 234 - objio_seg->layout.group_depth = layout.olo_map.odm_group_depth; 235 - objio_seg->layout.mirrors_p1 = layout.olo_map.odm_mirror_cnt + 1; 236 - objio_seg->layout.raid_algorithm = layout.olo_map.odm_raid_algorithm; 237 - 238 - err = ore_verify_layout(layout.olo_map.odm_num_comps, 239 - &objio_seg->layout); 240 - if (unlikely(err)) 241 - goto err; 242 - 243 - objio_seg->oc.first_dev = layout.olo_comps_index; 244 - cur_comp = 0; 245 - while (pnfs_osd_xdr_decode_layout_comp(&src_comp, &iter, xdr, &err)) { 246 - 
struct nfs4_deviceid_node *d; 247 - struct objio_dev_ent *ode; 248 - 249 - copy_single_comp(&objio_seg->oc, cur_comp, &src_comp); 250 - 251 - d = nfs4_find_get_deviceid(server, 252 - &src_comp.oc_object_id.oid_device_id, 253 - pnfslay->plh_lc_cred, gfp_flags); 254 - if (!d) { 255 - err = -ENXIO; 256 - goto err; 257 - } 258 - 259 - ode = container_of(d, struct objio_dev_ent, id_node); 260 - objio_seg->oc.ods[cur_comp++] = &ode->od; 261 - } 262 - /* pnfs_osd_xdr_decode_layout_comp returns false on error */ 263 - if (unlikely(err)) 264 - goto err; 265 - 266 - *outp = &objio_seg->lseg; 267 - return 0; 268 - 269 - err: 270 - kfree(objio_seg); 271 - dprintk("%s: Error: return %d\n", __func__, err); 272 - *outp = NULL; 273 - return err; 274 - } 275 - 276 - void objio_free_lseg(struct pnfs_layout_segment *lseg) 277 - { 278 - int i; 279 - struct objio_segment *objio_seg = OBJIO_LSEG(lseg); 280 - 281 - for (i = 0; i < objio_seg->oc.numdevs; i++) { 282 - struct ore_dev *od = objio_seg->oc.ods[i]; 283 - struct objio_dev_ent *ode; 284 - 285 - if (!od) 286 - break; 287 - ode = container_of(od, typeof(*ode), od); 288 - nfs4_put_deviceid_node(&ode->id_node); 289 - } 290 - kfree(objio_seg); 291 - } 292 - 293 - static int 294 - objio_alloc_io_state(struct pnfs_layout_hdr *pnfs_layout_type, bool is_reading, 295 - struct pnfs_layout_segment *lseg, struct page **pages, unsigned pgbase, 296 - loff_t offset, size_t count, void *rpcdata, gfp_t gfp_flags, 297 - struct objio_state **outp) 298 - { 299 - struct objio_segment *objio_seg = OBJIO_LSEG(lseg); 300 - struct ore_io_state *ios; 301 - int ret; 302 - struct __alloc_objio_state { 303 - struct objio_state objios; 304 - struct pnfs_osd_ioerr ioerrs[objio_seg->oc.numdevs]; 305 - } *aos; 306 - 307 - aos = kzalloc(sizeof(*aos), gfp_flags); 308 - if (unlikely(!aos)) 309 - return -ENOMEM; 310 - 311 - objlayout_init_ioerrs(&aos->objios.oir, objio_seg->oc.numdevs, 312 - aos->ioerrs, rpcdata, pnfs_layout_type); 313 - 314 - ret = 
ore_get_rw_state(&objio_seg->layout, &objio_seg->oc, is_reading, 315 - offset, count, &ios); 316 - if (unlikely(ret)) { 317 - kfree(aos); 318 - return ret; 319 - } 320 - 321 - ios->pages = pages; 322 - ios->pgbase = pgbase; 323 - ios->private = aos; 324 - BUG_ON(ios->nr_pages > (pgbase + count + PAGE_SIZE - 1) >> PAGE_SHIFT); 325 - 326 - aos->objios.sync = 0; 327 - aos->objios.ios = ios; 328 - *outp = &aos->objios; 329 - return 0; 330 - } 331 - 332 - void objio_free_result(struct objlayout_io_res *oir) 333 - { 334 - struct objio_state *objios = container_of(oir, struct objio_state, oir); 335 - 336 - ore_put_io_state(objios->ios); 337 - kfree(objios); 338 - } 339 - 340 - static enum pnfs_osd_errno osd_pri_2_pnfs_err(enum osd_err_priority oep) 341 - { 342 - switch (oep) { 343 - case OSD_ERR_PRI_NO_ERROR: 344 - return (enum pnfs_osd_errno)0; 345 - 346 - case OSD_ERR_PRI_CLEAR_PAGES: 347 - BUG_ON(1); 348 - return 0; 349 - 350 - case OSD_ERR_PRI_RESOURCE: 351 - return PNFS_OSD_ERR_RESOURCE; 352 - case OSD_ERR_PRI_BAD_CRED: 353 - return PNFS_OSD_ERR_BAD_CRED; 354 - case OSD_ERR_PRI_NO_ACCESS: 355 - return PNFS_OSD_ERR_NO_ACCESS; 356 - case OSD_ERR_PRI_UNREACHABLE: 357 - return PNFS_OSD_ERR_UNREACHABLE; 358 - case OSD_ERR_PRI_NOT_FOUND: 359 - return PNFS_OSD_ERR_NOT_FOUND; 360 - case OSD_ERR_PRI_NO_SPACE: 361 - return PNFS_OSD_ERR_NO_SPACE; 362 - default: 363 - WARN_ON(1); 364 - /* fallthrough */ 365 - case OSD_ERR_PRI_EIO: 366 - return PNFS_OSD_ERR_EIO; 367 - } 368 - } 369 - 370 - static void __on_dev_error(struct ore_io_state *ios, 371 - struct ore_dev *od, unsigned dev_index, enum osd_err_priority oep, 372 - u64 dev_offset, u64 dev_len) 373 - { 374 - struct objio_state *objios = ios->private; 375 - struct pnfs_osd_objid pooid; 376 - struct objio_dev_ent *ode = container_of(od, typeof(*ode), od); 377 - /* FIXME: what to do with more-then-one-group layouts. 
We need to 378 - * translate from ore_io_state index to oc->comps index 379 - */ 380 - unsigned comp = dev_index; 381 - 382 - pooid.oid_device_id = ode->id_node.deviceid; 383 - pooid.oid_partition_id = ios->oc->comps[comp].obj.partition; 384 - pooid.oid_object_id = ios->oc->comps[comp].obj.id; 385 - 386 - objlayout_io_set_result(&objios->oir, comp, 387 - &pooid, osd_pri_2_pnfs_err(oep), 388 - dev_offset, dev_len, !ios->reading); 389 - } 390 - 391 - /* 392 - * read 393 - */ 394 - static void _read_done(struct ore_io_state *ios, void *private) 395 - { 396 - struct objio_state *objios = private; 397 - ssize_t status; 398 - int ret = ore_check_io(ios, &__on_dev_error); 399 - 400 - /* FIXME: _io_free(ios) can we dealocate the libosd resources; */ 401 - 402 - if (likely(!ret)) 403 - status = ios->length; 404 - else 405 - status = ret; 406 - 407 - objlayout_read_done(&objios->oir, status, objios->sync); 408 - } 409 - 410 - int objio_read_pagelist(struct nfs_pgio_header *hdr) 411 - { 412 - struct objio_state *objios; 413 - int ret; 414 - 415 - ret = objio_alloc_io_state(NFS_I(hdr->inode)->layout, true, 416 - hdr->lseg, hdr->args.pages, hdr->args.pgbase, 417 - hdr->args.offset, hdr->args.count, hdr, 418 - GFP_KERNEL, &objios); 419 - if (unlikely(ret)) 420 - return ret; 421 - 422 - objios->ios->done = _read_done; 423 - dprintk("%s: offset=0x%llx length=0x%x\n", __func__, 424 - hdr->args.offset, hdr->args.count); 425 - ret = ore_read(objios->ios); 426 - if (unlikely(ret)) 427 - objio_free_result(&objios->oir); 428 - return ret; 429 - } 430 - 431 - /* 432 - * write 433 - */ 434 - static void _write_done(struct ore_io_state *ios, void *private) 435 - { 436 - struct objio_state *objios = private; 437 - ssize_t status; 438 - int ret = ore_check_io(ios, &__on_dev_error); 439 - 440 - /* FIXME: _io_free(ios) can we dealocate the libosd resources; */ 441 - 442 - if (likely(!ret)) { 443 - /* FIXME: should be based on the OSD's persistence model 444 - * See OSD2r05 Section 4.13 Data 
		   persistence model */
		objios->oir.committed = NFS_FILE_SYNC;
		status = ios->length;
	} else {
		status = ret;
	}

	objlayout_write_done(&objios->oir, status, objios->sync);
}

static struct page *__r4w_get_page(void *priv, u64 offset, bool *uptodate)
{
	struct objio_state *objios = priv;
	struct nfs_pgio_header *hdr = objios->oir.rpcdata;
	struct address_space *mapping = hdr->inode->i_mapping;
	pgoff_t index = offset / PAGE_SIZE;
	struct page *page;
	loff_t i_size = i_size_read(hdr->inode);

	if (offset >= i_size) {
		*uptodate = true;
		dprintk("%s: g_zero_page index=0x%lx\n", __func__, index);
		return ZERO_PAGE(0);
	}

	page = find_get_page(mapping, index);
	if (!page) {
		page = find_or_create_page(mapping, index, GFP_NOFS);
		if (unlikely(!page)) {
			dprintk("%s: grab_cache_page Failed index=0x%lx\n",
				__func__, index);
			return NULL;
		}
		unlock_page(page);
	}
	*uptodate = PageUptodate(page);
	dprintk("%s: index=0x%lx uptodate=%d\n", __func__, index, *uptodate);
	return page;
}

static void __r4w_put_page(void *priv, struct page *page)
{
	dprintk("%s: index=0x%lx\n", __func__,
		(page == ZERO_PAGE(0)) ? -1UL : page->index);
	if (ZERO_PAGE(0) != page)
		put_page(page);
	return;
}

static const struct _ore_r4w_op _r4w_op = {
	.get_page = &__r4w_get_page,
	.put_page = &__r4w_put_page,
};

int objio_write_pagelist(struct nfs_pgio_header *hdr, int how)
{
	struct objio_state *objios;
	int ret;

	ret = objio_alloc_io_state(NFS_I(hdr->inode)->layout, false,
			hdr->lseg, hdr->args.pages, hdr->args.pgbase,
			hdr->args.offset, hdr->args.count, hdr, GFP_NOFS,
			&objios);
	if (unlikely(ret))
		return ret;

	objios->sync = 0 != (how & FLUSH_SYNC);
	objios->ios->r4w = &_r4w_op;

	if (!objios->sync)
		objios->ios->done = _write_done;

	dprintk("%s: offset=0x%llx length=0x%x\n", __func__,
		hdr->args.offset, hdr->args.count);
	ret = ore_write(objios->ios);
	if (unlikely(ret)) {
		objio_free_result(&objios->oir);
		return ret;
	}

	if (objios->sync)
		_write_done(objios->ios, objios);

	return 0;
}

/*
 * Return 0 if @req cannot be coalesced into @pgio, otherwise return the number
 * of bytes (maximum @req->wb_bytes) that can be coalesced.
 */
static size_t objio_pg_test(struct nfs_pageio_descriptor *pgio,
			struct nfs_page *prev, struct nfs_page *req)
{
	struct nfs_pgio_mirror *mirror = nfs_pgio_current_mirror(pgio);
	unsigned int size;

	size = pnfs_generic_pg_test(pgio, prev, req);

	if (!size || mirror->pg_count + req->wb_bytes >
	    (unsigned long)pgio->pg_layout_private)
		return 0;

	return min(size, req->wb_bytes);
}

static void objio_init_read(struct nfs_pageio_descriptor *pgio, struct nfs_page *req)
{
	pnfs_generic_pg_init_read(pgio, req);
	if (unlikely(pgio->pg_lseg == NULL))
		return; /* Not pNFS */

	pgio->pg_layout_private = (void *)
				OBJIO_LSEG(pgio->pg_lseg)->layout.max_io_length;
}

static bool aligned_on_raid_stripe(u64 offset, struct ore_layout *layout,
				   unsigned long *stripe_end)
{
	u32 stripe_off;
	unsigned stripe_size;

	if (layout->raid_algorithm == PNFS_OSD_RAID_0)
		return true;

	stripe_size = layout->stripe_unit *
				(layout->group_width - layout->parity);

	div_u64_rem(offset, stripe_size, &stripe_off);
	if (!stripe_off)
		return true;

	*stripe_end = stripe_size - stripe_off;
	return false;
}

static void objio_init_write(struct nfs_pageio_descriptor *pgio, struct nfs_page *req)
{
	unsigned long stripe_end = 0;
	u64 wb_size;

	if (pgio->pg_dreq == NULL)
		wb_size = i_size_read(pgio->pg_inode) - req_offset(req);
	else
		wb_size = nfs_dreq_bytes_left(pgio->pg_dreq);

	pnfs_generic_pg_init_write(pgio, req, wb_size);
	if (unlikely(pgio->pg_lseg == NULL))
		return; /* Not pNFS */

	if (req->wb_offset ||
	    !aligned_on_raid_stripe(req->wb_index * PAGE_SIZE,
			       &OBJIO_LSEG(pgio->pg_lseg)->layout,
			       &stripe_end)) {
		pgio->pg_layout_private = (void *)stripe_end;
	} else {
		pgio->pg_layout_private = (void *)
				OBJIO_LSEG(pgio->pg_lseg)->layout.max_io_length;
	}
}

static const struct nfs_pageio_ops objio_pg_read_ops = {
	.pg_init = objio_init_read,
	.pg_test = objio_pg_test,
	.pg_doio = pnfs_generic_pg_readpages,
	.pg_cleanup = pnfs_generic_pg_cleanup,
};

static const struct nfs_pageio_ops objio_pg_write_ops = {
	.pg_init = objio_init_write,
	.pg_test = objio_pg_test,
	.pg_doio = pnfs_generic_pg_writepages,
	.pg_cleanup = pnfs_generic_pg_cleanup,
};

static struct pnfs_layoutdriver_type objlayout_type = {
	.id = LAYOUT_OSD2_OBJECTS,
	.name = "LAYOUT_OSD2_OBJECTS",
	.flags                   = PNFS_LAYOUTRET_ON_SETATTR |
				   PNFS_LAYOUTRET_ON_ERROR,

	.max_deviceinfo_size	 = PAGE_SIZE,
	.owner		       	 = THIS_MODULE,
	.alloc_layout_hdr        = objlayout_alloc_layout_hdr,
	.free_layout_hdr         = objlayout_free_layout_hdr,

	.alloc_lseg              = objlayout_alloc_lseg,
	.free_lseg               = objlayout_free_lseg,

	.read_pagelist           = objlayout_read_pagelist,
	.write_pagelist          = objlayout_write_pagelist,
	.pg_read_ops             = &objio_pg_read_ops,
	.pg_write_ops            = &objio_pg_write_ops,

	.sync			 = pnfs_generic_sync,

	.free_deviceid_node	 = objio_free_deviceid_node,

	.encode_layoutcommit	 = objlayout_encode_layoutcommit,
	.encode_layoutreturn	 = objlayout_encode_layoutreturn,
};

MODULE_DESCRIPTION("pNFS Layout Driver for OSD2 objects");
MODULE_AUTHOR("Benny Halevy <bhalevy@panasas.com>");
MODULE_LICENSE("GPL");

static int __init
objlayout_init(void)
{
	int ret = pnfs_register_layoutdriver(&objlayout_type);

	if (ret)
		printk(KERN_INFO
			"NFS: %s: Registering OSD pNFS Layout Driver failed: error=%d\n",
			__func__, ret);
	else
		printk(KERN_INFO "NFS: %s: Registered OSD pNFS Layout Driver\n",
			__func__);
	return ret;
}

static void __exit
objlayout_exit(void)
{
	pnfs_unregister_layoutdriver(&objlayout_type);
	printk(KERN_INFO "NFS: %s: Unregistered OSD pNFS Layout Driver\n",
	       __func__);
}

MODULE_ALIAS("nfs-layouttype4-2");

module_init(objlayout_init);
module_exit(objlayout_exit);
fs/nfs/objlayout/objlayout.c (706 lines deleted)

/*
 *  pNFS Objects layout driver high level definitions
 *
 *  Copyright (C) 2007 Panasas Inc. [year of first publication]
 *  All rights reserved.
 *
 *  Benny Halevy <bhalevy@panasas.com>
 *  Boaz Harrosh <ooo@electrozaur.com>
 *
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License version 2
 *  See the file COPYING included with this distribution for more details.
 *
 *  Redistribution and use in source and binary forms, with or without
 *  modification, are permitted provided that the following conditions
 *  are met:
 *
 *  1. Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *  2. Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *  3. Neither the name of the Panasas company nor the names of its
 *     contributors may be used to endorse or promote products derived
 *     from this software without specific prior written permission.
 *
 *  THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
 *  WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
 *  MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 *  DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 *  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 *  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 *  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
 *  BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
 *  LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
 *  NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 *  SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

#include <linux/kmod.h>
#include <linux/moduleparam.h>
#include <linux/ratelimit.h>
#include <scsi/osd_initiator.h>
#include "objlayout.h"

#define NFSDBG_FACILITY         NFSDBG_PNFS_LD
/*
 * Create a objlayout layout structure for the given inode and return it.
 */
struct pnfs_layout_hdr *
objlayout_alloc_layout_hdr(struct inode *inode, gfp_t gfp_flags)
{
	struct objlayout *objlay;

	objlay = kzalloc(sizeof(struct objlayout), gfp_flags);
	if (!objlay)
		return NULL;
	spin_lock_init(&objlay->lock);
	INIT_LIST_HEAD(&objlay->err_list);
	dprintk("%s: Return %p\n", __func__, objlay);
	return &objlay->pnfs_layout;
}

/*
 * Free an objlayout layout structure
 */
void
objlayout_free_layout_hdr(struct pnfs_layout_hdr *lo)
{
	struct objlayout *objlay = OBJLAYOUT(lo);

	dprintk("%s: objlay %p\n", __func__, objlay);

	WARN_ON(!list_empty(&objlay->err_list));
	kfree(objlay);
}

/*
 * Unmarshall layout and store it in pnfslay.
 */
struct pnfs_layout_segment *
objlayout_alloc_lseg(struct pnfs_layout_hdr *pnfslay,
		     struct nfs4_layoutget_res *lgr,
		     gfp_t gfp_flags)
{
	int status = -ENOMEM;
	struct xdr_stream stream;
	struct xdr_buf buf = {
		.pages = lgr->layoutp->pages,
		.page_len = lgr->layoutp->len,
		.buflen = lgr->layoutp->len,
		.len = lgr->layoutp->len,
	};
	struct page *scratch;
	struct pnfs_layout_segment *lseg;

	dprintk("%s: Begin pnfslay %p\n", __func__, pnfslay);

	scratch = alloc_page(gfp_flags);
	if (!scratch)
		goto err_nofree;

	xdr_init_decode(&stream, &buf, NULL);
	xdr_set_scratch_buffer(&stream, page_address(scratch), PAGE_SIZE);

	status = objio_alloc_lseg(&lseg, pnfslay, &lgr->range, &stream, gfp_flags);
	if (unlikely(status)) {
		dprintk("%s: objio_alloc_lseg Return err %d\n", __func__,
			status);
		goto err;
	}

	__free_page(scratch);

	dprintk("%s: Return %p\n", __func__, lseg);
	return lseg;

err:
	__free_page(scratch);
err_nofree:
	dprintk("%s: Err Return=>%d\n", __func__, status);
	return ERR_PTR(status);
}

/*
 * Free a layout segement
 */
void
objlayout_free_lseg(struct pnfs_layout_segment *lseg)
{
	dprintk("%s: freeing layout segment %p\n", __func__, lseg);

	if (unlikely(!lseg))
		return;

	objio_free_lseg(lseg);
}

/*
 * I/O Operations
 */
static inline u64
end_offset(u64 start, u64 len)
{
	u64 end;

	end = start + len;
	return end >= start ? end : NFS4_MAX_UINT64;
}

static void _fix_verify_io_params(struct pnfs_layout_segment *lseg,
			   struct page ***p_pages, unsigned *p_pgbase,
			   u64 offset, unsigned long count)
{
	u64 lseg_end_offset;

	BUG_ON(offset < lseg->pls_range.offset);
	lseg_end_offset = end_offset(lseg->pls_range.offset,
				     lseg->pls_range.length);
	BUG_ON(offset >= lseg_end_offset);
	WARN_ON(offset + count > lseg_end_offset);

	if (*p_pgbase > PAGE_SIZE) {
		dprintk("%s: pgbase(0x%x) > PAGE_SIZE\n", __func__, *p_pgbase);
		*p_pages += *p_pgbase >> PAGE_SHIFT;
		*p_pgbase &= ~PAGE_MASK;
	}
}

/*
 * I/O done common code
 */
static void
objlayout_iodone(struct objlayout_io_res *oir)
{
	if (likely(oir->status >= 0)) {
		objio_free_result(oir);
	} else {
		struct objlayout *objlay = oir->objlay;

		spin_lock(&objlay->lock);
		objlay->delta_space_valid = OBJ_DSU_INVALID;
		list_add(&objlay->err_list, &oir->err_list);
		spin_unlock(&objlay->lock);
	}
}

/*
 * objlayout_io_set_result - Set an osd_error code on a specific osd comp.
 *
 * The @index component IO failed (error returned from target). Register
 * the error for later reporting at layout-return.
 */
void
objlayout_io_set_result(struct objlayout_io_res *oir, unsigned index,
			struct pnfs_osd_objid *pooid, int osd_error,
			u64 offset, u64 length, bool is_write)
{
	struct pnfs_osd_ioerr *ioerr = &oir->ioerrs[index];

	BUG_ON(index >= oir->num_comps);
	if (osd_error) {
		ioerr->oer_component = *pooid;
		ioerr->oer_comp_offset = offset;
		ioerr->oer_comp_length = length;
		ioerr->oer_iswrite = is_write;
		ioerr->oer_errno = osd_error;

		dprintk("%s: err[%d]: errno=%d is_write=%d dev(%llx:%llx) "
			"par=0x%llx obj=0x%llx offset=0x%llx length=0x%llx\n",
			__func__, index, ioerr->oer_errno,
			ioerr->oer_iswrite,
			_DEVID_LO(&ioerr->oer_component.oid_device_id),
			_DEVID_HI(&ioerr->oer_component.oid_device_id),
			ioerr->oer_component.oid_partition_id,
			ioerr->oer_component.oid_object_id,
			ioerr->oer_comp_offset,
			ioerr->oer_comp_length);
	} else {
		/* User need not call if no error is reported */
		ioerr->oer_errno = 0;
	}
}

/* Function scheduled on rpc workqueue to call ->nfs_readlist_complete().
 * This is because the osd completion is called with ints-off from
 * the block layer
 */
static void _rpc_read_complete(struct work_struct *work)
{
	struct rpc_task *task;
	struct nfs_pgio_header *hdr;

	dprintk("%s enter\n", __func__);
	task = container_of(work, struct rpc_task, u.tk_work);
	hdr = container_of(task, struct nfs_pgio_header, task);

	pnfs_ld_read_done(hdr);
}

void
objlayout_read_done(struct objlayout_io_res *oir, ssize_t status, bool sync)
{
	struct nfs_pgio_header *hdr = oir->rpcdata;

	oir->status = hdr->task.tk_status = status;
	if (status >= 0)
		hdr->res.count = status;
	else
		hdr->pnfs_error = status;
	objlayout_iodone(oir);
	/* must not use oir after this point */

	dprintk("%s: Return status=%zd eof=%d sync=%d\n", __func__,
		status, hdr->res.eof, sync);

	if (sync)
		pnfs_ld_read_done(hdr);
	else {
		INIT_WORK(&hdr->task.u.tk_work, _rpc_read_complete);
		schedule_work(&hdr->task.u.tk_work);
	}
}

/*
 * Perform sync or async reads.
 */
enum pnfs_try_status
objlayout_read_pagelist(struct nfs_pgio_header *hdr)
{
	struct inode *inode = hdr->inode;
	loff_t offset = hdr->args.offset;
	size_t count = hdr->args.count;
	int err;
	loff_t eof;

	eof = i_size_read(inode);
	if (unlikely(offset + count > eof)) {
		if (offset >= eof) {
			err = 0;
			hdr->res.count = 0;
			hdr->res.eof = 1;
			/*FIXME: do we need to call pnfs_ld_read_done() */
			goto out;
		}
		count = eof - offset;
	}

	hdr->res.eof = (offset + count) >= eof;
	_fix_verify_io_params(hdr->lseg, &hdr->args.pages,
			      &hdr->args.pgbase,
			      hdr->args.offset, hdr->args.count);

	dprintk("%s: inode(%lx) offset 0x%llx count 0x%zx eof=%d\n",
		__func__, inode->i_ino, offset, count, hdr->res.eof);

	err = objio_read_pagelist(hdr);
 out:
	if (unlikely(err)) {
		hdr->pnfs_error = err;
		dprintk("%s: Returned Error %d\n", __func__, err);
		return PNFS_NOT_ATTEMPTED;
	}
	return PNFS_ATTEMPTED;
}

/* Function scheduled on rpc workqueue to call ->nfs_writelist_complete().
 * This is because the osd completion is called with ints-off from
 * the block layer
 */
static void _rpc_write_complete(struct work_struct *work)
{
	struct rpc_task *task;
	struct nfs_pgio_header *hdr;

	dprintk("%s enter\n", __func__);
	task = container_of(work, struct rpc_task, u.tk_work);
	hdr = container_of(task, struct nfs_pgio_header, task);

	pnfs_ld_write_done(hdr);
}

void
objlayout_write_done(struct objlayout_io_res *oir, ssize_t status, bool sync)
{
	struct nfs_pgio_header *hdr = oir->rpcdata;

	oir->status = hdr->task.tk_status = status;
	if (status >= 0) {
		hdr->res.count = status;
		hdr->verf.committed = oir->committed;
	} else {
		hdr->pnfs_error = status;
	}
	objlayout_iodone(oir);
	/* must not use oir after this point */

	dprintk("%s: Return status %zd committed %d sync=%d\n", __func__,
		status, hdr->verf.committed, sync);

	if (sync)
		pnfs_ld_write_done(hdr);
	else {
		INIT_WORK(&hdr->task.u.tk_work, _rpc_write_complete);
		schedule_work(&hdr->task.u.tk_work);
	}
}

/*
 * Perform sync or async writes.
 */
enum pnfs_try_status
objlayout_write_pagelist(struct nfs_pgio_header *hdr, int how)
{
	int err;

	_fix_verify_io_params(hdr->lseg, &hdr->args.pages,
			      &hdr->args.pgbase,
			      hdr->args.offset, hdr->args.count);

	err = objio_write_pagelist(hdr, how);
	if (unlikely(err)) {
		hdr->pnfs_error = err;
		dprintk("%s: Returned Error %d\n", __func__, err);
		return PNFS_NOT_ATTEMPTED;
	}
	return PNFS_ATTEMPTED;
}

void
objlayout_encode_layoutcommit(struct pnfs_layout_hdr *pnfslay,
			      struct xdr_stream *xdr,
			      const struct nfs4_layoutcommit_args *args)
{
	struct objlayout *objlay = OBJLAYOUT(pnfslay);
	struct pnfs_osd_layoutupdate lou;
	__be32 *start;

	dprintk("%s: Begin\n", __func__);

	spin_lock(&objlay->lock);
	lou.dsu_valid = (objlay->delta_space_valid == OBJ_DSU_VALID);
	lou.dsu_delta = objlay->delta_space_used;
	objlay->delta_space_used = 0;
	objlay->delta_space_valid = OBJ_DSU_INIT;
	lou.olu_ioerr_flag = !list_empty(&objlay->err_list);
	spin_unlock(&objlay->lock);

	start = xdr_reserve_space(xdr, 4);

	BUG_ON(pnfs_osd_xdr_encode_layoutupdate(xdr, &lou));

	*start = cpu_to_be32((xdr->p - start - 1) * 4);

	dprintk("%s: Return delta_space_used %lld err %d\n", __func__,
		lou.dsu_delta, lou.olu_ioerr_flag);
}

static int
err_prio(u32 oer_errno)
{
	switch (oer_errno) {
	case 0:
		return 0;

	case PNFS_OSD_ERR_RESOURCE:
		return OSD_ERR_PRI_RESOURCE;
	case PNFS_OSD_ERR_BAD_CRED:
		return OSD_ERR_PRI_BAD_CRED;
	case PNFS_OSD_ERR_NO_ACCESS:
		return OSD_ERR_PRI_NO_ACCESS;
	case PNFS_OSD_ERR_UNREACHABLE:
		return OSD_ERR_PRI_UNREACHABLE;
	case PNFS_OSD_ERR_NOT_FOUND:
		return OSD_ERR_PRI_NOT_FOUND;
	case PNFS_OSD_ERR_NO_SPACE:
		return OSD_ERR_PRI_NO_SPACE;
	default:
		WARN_ON(1);
		/* fallthrough */
	case PNFS_OSD_ERR_EIO:
		return OSD_ERR_PRI_EIO;
	}
}

static void
merge_ioerr(struct pnfs_osd_ioerr *dest_err,
	    const struct pnfs_osd_ioerr *src_err)
{
	u64 dest_end, src_end;

	if (!dest_err->oer_errno) {
		*dest_err = *src_err;
		/* accumulated device must be blank */
		memset(&dest_err->oer_component.oid_device_id, 0,
		       sizeof(dest_err->oer_component.oid_device_id));

		return;
	}

	if (dest_err->oer_component.oid_partition_id !=
	    src_err->oer_component.oid_partition_id)
		dest_err->oer_component.oid_partition_id = 0;

	if (dest_err->oer_component.oid_object_id !=
	    src_err->oer_component.oid_object_id)
		dest_err->oer_component.oid_object_id = 0;

	if (dest_err->oer_comp_offset > src_err->oer_comp_offset)
		dest_err->oer_comp_offset = src_err->oer_comp_offset;

	dest_end = end_offset(dest_err->oer_comp_offset,
			      dest_err->oer_comp_length);
	src_end = end_offset(src_err->oer_comp_offset,
			     src_err->oer_comp_length);
	if (dest_end < src_end)
		dest_end = src_end;

	dest_err->oer_comp_length = dest_end - dest_err->oer_comp_offset;

	if ((src_err->oer_iswrite == dest_err->oer_iswrite) &&
	    (err_prio(src_err->oer_errno) > err_prio(dest_err->oer_errno))) {
		dest_err->oer_errno = src_err->oer_errno;
	} else if (src_err->oer_iswrite) {
		dest_err->oer_iswrite = true;
		dest_err->oer_errno = src_err->oer_errno;
	}
}

static void
encode_accumulated_error(struct objlayout *objlay, __be32 *p)
{
	struct objlayout_io_res *oir, *tmp;
	struct pnfs_osd_ioerr accumulated_err = {.oer_errno = 0};

	list_for_each_entry_safe(oir, tmp, &objlay->err_list, err_list) {
		unsigned i;

		for (i = 0; i < oir->num_comps; i++) {
			struct pnfs_osd_ioerr *ioerr = &oir->ioerrs[i];

			if (!ioerr->oer_errno)
				continue;

			printk(KERN_ERR "NFS: %s: err[%d]: errno=%d "
				"is_write=%d dev(%llx:%llx) par=0x%llx "
				"obj=0x%llx offset=0x%llx length=0x%llx\n",
				__func__, i, ioerr->oer_errno,
				ioerr->oer_iswrite,
				_DEVID_LO(&ioerr->oer_component.oid_device_id),
				_DEVID_HI(&ioerr->oer_component.oid_device_id),
				ioerr->oer_component.oid_partition_id,
				ioerr->oer_component.oid_object_id,
				ioerr->oer_comp_offset,
				ioerr->oer_comp_length);

			merge_ioerr(&accumulated_err, ioerr);
		}
		list_del(&oir->err_list);
		objio_free_result(oir);
	}

	pnfs_osd_xdr_encode_ioerr(p, &accumulated_err);
}

void
objlayout_encode_layoutreturn(struct xdr_stream *xdr,
			      const struct nfs4_layoutreturn_args *args)
{
	struct pnfs_layout_hdr *pnfslay = args->layout;
	struct objlayout *objlay = OBJLAYOUT(pnfslay);
	struct objlayout_io_res *oir, *tmp;
	__be32 *start;

	dprintk("%s: Begin\n", __func__);
	start = xdr_reserve_space(xdr, 4);
	BUG_ON(!start);

	spin_lock(&objlay->lock);

	list_for_each_entry_safe(oir, tmp, &objlay->err_list, err_list) {
		__be32 *last_xdr = NULL, *p;
		unsigned i;
		int res = 0;

		for (i = 0; i < oir->num_comps; i++) {
			struct pnfs_osd_ioerr *ioerr = &oir->ioerrs[i];

			if (!ioerr->oer_errno)
				continue;

			dprintk("%s: err[%d]: errno=%d is_write=%d "
				"dev(%llx:%llx) par=0x%llx obj=0x%llx "
				"offset=0x%llx length=0x%llx\n",
				__func__, i, ioerr->oer_errno,
				ioerr->oer_iswrite,
				_DEVID_LO(&ioerr->oer_component.oid_device_id),
				_DEVID_HI(&ioerr->oer_component.oid_device_id),
				ioerr->oer_component.oid_partition_id,
				ioerr->oer_component.oid_object_id,
				ioerr->oer_comp_offset,
				ioerr->oer_comp_length);

			p = pnfs_osd_xdr_ioerr_reserve_space(xdr);
			if (unlikely(!p)) {
				res = -E2BIG;
				break; /* accumulated_error */
			}

			last_xdr = p;
			pnfs_osd_xdr_encode_ioerr(p, &oir->ioerrs[i]);
		}

		/* TODO: use xdr_write_pages */
		if (unlikely(res)) {
			/* no space for even one error descriptor */
			BUG_ON(!last_xdr);

			/* we've encountered a situation with lots and lots of
			 * errors and no space to encode them all. Use the last
			 * available slot to report the union of all the
			 * remaining errors.
			 */
			encode_accumulated_error(objlay, last_xdr);
			goto loop_done;
		}
		list_del(&oir->err_list);
		objio_free_result(oir);
	}
loop_done:
	spin_unlock(&objlay->lock);

	*start = cpu_to_be32((xdr->p - start - 1) * 4);
	dprintk("%s: Return\n", __func__);
}

enum {
	OBJLAYOUT_MAX_URI_LEN = 256, OBJLAYOUT_MAX_OSDNAME_LEN = 64,
	OBJLAYOUT_MAX_SYSID_HEX_LEN = OSD_SYSTEMID_LEN * 2 + 1,
	OSD_LOGIN_UPCALL_PATHLEN = 256
};

static char osd_login_prog[OSD_LOGIN_UPCALL_PATHLEN] = "/sbin/osd_login";

module_param_string(osd_login_prog, osd_login_prog, sizeof(osd_login_prog),
		    0600);
MODULE_PARM_DESC(osd_login_prog, "Path to the osd_login upcall program");

struct __auto_login {
	char uri[OBJLAYOUT_MAX_URI_LEN];
	char osdname[OBJLAYOUT_MAX_OSDNAME_LEN];
	char systemid_hex[OBJLAYOUT_MAX_SYSID_HEX_LEN];
};

static int __objlayout_upcall(struct __auto_login *login)
{
	static char *envp[] = { "HOME=/",
		"TERM=linux",
		"PATH=/sbin:/usr/sbin:/bin:/usr/bin",
		NULL
	};
	char *argv[8];
	int ret;

	if (unlikely(!osd_login_prog[0])) {
		dprintk("%s: osd_login_prog is disabled\n", __func__);
		return -EACCES;
	}

	dprintk("%s uri: %s\n", __func__, login->uri);
	dprintk("%s osdname %s\n", __func__, login->osdname);
	dprintk("%s systemid_hex %s\n", __func__, login->systemid_hex);

	argv[0] = (char *)osd_login_prog;
	argv[1] = "-u";
	argv[2] = login->uri;
	argv[3] = "-o";
	argv[4] = login->osdname;
	argv[5] = "-s";
	argv[6] = login->systemid_hex;
	argv[7] = NULL;

	ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
	/*
	 * Disable the upcall mechanism if we're getting an ENOENT or
	 * EACCES error. The admin can re-enable it on the fly by using
	 * sysfs to set the objlayoutdriver.osd_login_prog module parameter once
	 * the problem has been fixed.
	 */
	if (ret == -ENOENT || ret == -EACCES) {
		printk(KERN_ERR "PNFS-OBJ: %s was not found please set "
			"objlayoutdriver.osd_login_prog kernel parameter!\n",
			osd_login_prog);
		osd_login_prog[0] = '\0';
	}
	dprintk("%s %s return value: %d\n", __func__, osd_login_prog, ret);

	return ret;
}

/* Assume dest is all zeros */
static void __copy_nfsS_and_zero_terminate(struct nfs4_string s,
					   char *dest, int max_len,
					   const char *var_name)
{
	if (!s.len)
		return;

	if (s.len >= max_len) {
		pr_warn_ratelimited(
			"objlayout_autologin: %s: s.len(%d) >= max_len(%d)",
			var_name, s.len, max_len);
		s.len = max_len - 1; /* space for null terminator */
	}

	memcpy(dest, s.data, s.len);
}

/* Assume sysid is all zeros */
static void _sysid_2_hex(struct nfs4_string s,
			 char sysid[OBJLAYOUT_MAX_SYSID_HEX_LEN])
{
	int i;
	char *cur;

	if (!s.len)
		return;

	if (s.len != OSD_SYSTEMID_LEN) {
		pr_warn_ratelimited(
		    "objlayout_autologin: systemid_len(%d) != OSD_SYSTEMID_LEN",
		    s.len);
		if (s.len > OSD_SYSTEMID_LEN)
			s.len = OSD_SYSTEMID_LEN;
	}

	cur = sysid;
	for (i = 0; i < s.len; i++)
		cur = hex_byte_pack(cur, s.data[i]);
}

int objlayout_autologin(struct pnfs_osd_deviceaddr *deviceaddr)
{
	int rc;
	struct __auto_login login;

	if (!deviceaddr->oda_targetaddr.ota_netaddr.r_addr.len)
		return -ENODEV;

	memset(&login, 0, sizeof(login));
	__copy_nfsS_and_zero_terminate(
		deviceaddr->oda_targetaddr.ota_netaddr.r_addr,
		login.uri, sizeof(login.uri), "URI");

	__copy_nfsS_and_zero_terminate(
		deviceaddr->oda_osdname,
		login.osdname, sizeof(login.osdname), "OSDNAME");

	_sysid_2_hex(deviceaddr->oda_systemid, login.systemid_hex);

	rc = __objlayout_upcall(&login);
	if (rc > 0) /* script returns positive values */
		rc = -ENODEV;

	return rc;
}
fs/nfs/objlayout/objlayout.h (183 lines deleted)

/*
 *  Data types and function declerations for interfacing with the
 *  pNFS standard object layout driver.
 *
 *  Copyright (C) 2007 Panasas Inc. [year of first publication]
 *  All rights reserved.
 *
 *  Benny Halevy <bhalevy@panasas.com>
 *  Boaz Harrosh <ooo@electrozaur.com>
 *
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License version 2
 *  See the file COPYING included with this distribution for more details.
 *
 *  Redistribution and use in source and binary forms, with or without
 *  modification, are permitted provided that the following conditions
 *  are met:
 *
 *  1. Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *  2. Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *  3. Neither the name of the Panasas company nor the names of its
 *     contributors may be used to endorse or promote products derived
 *     from this software without specific prior written permission.
 *
 *  THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
 *  WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
 *  MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 *  DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 *  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 *  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 *  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
 *  BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
 *  LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
 *  NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 *  SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

#ifndef _OBJLAYOUT_H
#define _OBJLAYOUT_H

#include <linux/nfs_fs.h>
#include <linux/pnfs_osd_xdr.h>
#include "../pnfs.h"

/*
 * per-inode layout
 */
struct objlayout {
	struct pnfs_layout_hdr pnfs_layout;

	/* for layout_commit */
	enum osd_delta_space_valid_enum {
		OBJ_DSU_INIT = 0,
		OBJ_DSU_VALID,
		OBJ_DSU_INVALID,
	} delta_space_valid;
	s64 delta_space_used;	/* consumed by write ops */

	/* for layout_return */
	spinlock_t lock;
	struct list_head err_list;
};

static inline struct objlayout *
OBJLAYOUT(struct pnfs_layout_hdr *lo)
{
	return container_of(lo, struct objlayout, pnfs_layout);
}

/*
 * per-I/O operation state
 * embedded in objects provider io_state data structure
 */
struct objlayout_io_res {
	struct objlayout *objlay;

	void *rpcdata;
	int status;		/* res */
	int committed;		/* res */

	/* Error reporting (layout_return) */
	struct list_head err_list;
	unsigned num_comps;
	/* Pointer to array of error descriptors of size num_comps.
	 * It should contain as many entries as devices in the osd_layout
	 * that participate in the I/O. It is up to the io_engine to allocate
	 * needed space and set num_comps.
	 */
	struct pnfs_osd_ioerr *ioerrs;
};

static inline
void objlayout_init_ioerrs(struct objlayout_io_res *oir, unsigned num_comps,
			struct pnfs_osd_ioerr *ioerrs, void *rpcdata,
			struct pnfs_layout_hdr *pnfs_layout_type)
{
	oir->objlay = OBJLAYOUT(pnfs_layout_type);
	oir->rpcdata = rpcdata;
	INIT_LIST_HEAD(&oir->err_list);
	oir->num_comps = num_comps;
	oir->ioerrs = ioerrs;
}

/*
 * Raid engine I/O API
 */
extern int objio_alloc_lseg(struct pnfs_layout_segment **outp,
	struct pnfs_layout_hdr *pnfslay,
	struct pnfs_layout_range *range,
	struct xdr_stream *xdr,
	gfp_t gfp_flags);
extern void objio_free_lseg(struct pnfs_layout_segment *lseg);

/* objio_free_result will free these @oir structs received from
 * objlayout_{read,write}_done
 */
extern void objio_free_result(struct objlayout_io_res *oir);

extern int objio_read_pagelist(struct nfs_pgio_header *rdata);
extern int objio_write_pagelist(struct nfs_pgio_header *wdata, int how);

/*
 * callback API
 */
extern void objlayout_io_set_result(struct objlayout_io_res *oir,
			unsigned index, struct pnfs_osd_objid *pooid,
			int osd_error, u64 offset, u64 length, bool is_write);

static inline void
objlayout_add_delta_space_used(struct objlayout *objlay, s64 space_used)
{
	/* If one of the I/Os errored out and the delta_space_used was
	 * invalid we render the complete report as invalid. Protocol mandate
	 * the DSU be accurate or not reported.
	 */
	spin_lock(&objlay->lock);
	if (objlay->delta_space_valid != OBJ_DSU_INVALID) {
		objlay->delta_space_valid = OBJ_DSU_VALID;
		objlay->delta_space_used += space_used;
	}
	spin_unlock(&objlay->lock);
}

extern void objlayout_read_done(struct objlayout_io_res *oir,
				ssize_t status, bool sync);
extern void objlayout_write_done(struct objlayout_io_res *oir,
				 ssize_t status, bool sync);

/*
 * exported generic objects function vectors
 */

extern struct pnfs_layout_hdr *objlayout_alloc_layout_hdr(struct inode *, gfp_t gfp_flags);
extern void objlayout_free_layout_hdr(struct pnfs_layout_hdr *);

extern struct pnfs_layout_segment *objlayout_alloc_lseg(
	struct pnfs_layout_hdr *,
	struct nfs4_layoutget_res *,
	gfp_t gfp_flags);
extern void objlayout_free_lseg(struct pnfs_layout_segment *);

extern enum pnfs_try_status objlayout_read_pagelist(
	struct nfs_pgio_header *);

extern enum pnfs_try_status objlayout_write_pagelist(
	struct nfs_pgio_header *,
	int how);

extern void objlayout_encode_layoutcommit(
	struct pnfs_layout_hdr *,
	struct xdr_stream *,
	const struct nfs4_layoutcommit_args *);

extern void objlayout_encode_layoutreturn(
	struct xdr_stream *,
	const struct nfs4_layoutreturn_args *);

extern int objlayout_autologin(struct pnfs_osd_deviceaddr *deviceaddr);

#endif /* _OBJLAYOUT_H */
-415
fs/nfs/objlayout/pnfs_osd_xdr_cli.c
··· 1 - /* 2 - * Object-Based pNFS Layout XDR layer 3 - * 4 - * Copyright (C) 2007 Panasas Inc. [year of first publication] 5 - * All rights reserved. 6 - * 7 - * Benny Halevy <bhalevy@panasas.com> 8 - * Boaz Harrosh <ooo@electrozaur.com> 9 - * 10 - * This program is free software; you can redistribute it and/or modify 11 - * it under the terms of the GNU General Public License version 2 12 - * See the file COPYING included with this distribution for more details. 13 - * 14 - * Redistribution and use in source and binary forms, with or without 15 - * modification, are permitted provided that the following conditions 16 - * are met: 17 - * 18 - * 1. Redistributions of source code must retain the above copyright 19 - * notice, this list of conditions and the following disclaimer. 20 - * 2. Redistributions in binary form must reproduce the above copyright 21 - * notice, this list of conditions and the following disclaimer in the 22 - * documentation and/or other materials provided with the distribution. 23 - * 3. Neither the name of the Panasas company nor the names of its 24 - * contributors may be used to endorse or promote products derived 25 - * from this software without specific prior written permission. 26 - * 27 - * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED 28 - * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 29 - * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 30 - * DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE 31 - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 32 - * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 33 - * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR 34 - * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF 35 - * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 36 - * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 37 - * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 38 - */ 39 - 40 - #include <linux/pnfs_osd_xdr.h> 41 - 42 - #define NFSDBG_FACILITY NFSDBG_PNFS_LD 43 - 44 - /* 45 - * The following implementation is based on RFC5664 46 - */ 47 - 48 - /* 49 - * struct pnfs_osd_objid { 50 - * struct nfs4_deviceid oid_device_id; 51 - * u64 oid_partition_id; 52 - * u64 oid_object_id; 53 - * }; // xdr size 32 bytes 54 - */ 55 - static __be32 * 56 - _osd_xdr_decode_objid(__be32 *p, struct pnfs_osd_objid *objid) 57 - { 58 - p = xdr_decode_opaque_fixed(p, objid->oid_device_id.data, 59 - sizeof(objid->oid_device_id.data)); 60 - 61 - p = xdr_decode_hyper(p, &objid->oid_partition_id); 62 - p = xdr_decode_hyper(p, &objid->oid_object_id); 63 - return p; 64 - } 65 - /* 66 - * struct pnfs_osd_opaque_cred { 67 - * u32 cred_len; 68 - * void *cred; 69 - * }; // xdr size [variable] 70 - * The return pointers are from the xdr buffer 71 - */ 72 - static int 73 - _osd_xdr_decode_opaque_cred(struct pnfs_osd_opaque_cred *opaque_cred, 74 - struct xdr_stream *xdr) 75 - { 76 - __be32 *p = xdr_inline_decode(xdr, 1); 77 - 78 - if (!p) 79 - return -EINVAL; 80 - 81 - opaque_cred->cred_len = be32_to_cpu(*p++); 82 - 83 - p = xdr_inline_decode(xdr, opaque_cred->cred_len); 84 - if (!p) 85 - return -EINVAL; 86 - 87 - opaque_cred->cred = p; 88 - return 0; 89 - } 90 - 91 - /* 92 - * struct pnfs_osd_object_cred { 93 - * struct pnfs_osd_objid oc_object_id; 94 - * u32 oc_osd_version; 95 - * u32 oc_cap_key_sec; 96 - * struct 
pnfs_osd_opaque_cred oc_cap_key 97 - * struct pnfs_osd_opaque_cred oc_cap; 98 - * }; // xdr size 32 + 4 + 4 + [variable] + [variable] 99 - */ 100 - static int 101 - _osd_xdr_decode_object_cred(struct pnfs_osd_object_cred *comp, 102 - struct xdr_stream *xdr) 103 - { 104 - __be32 *p = xdr_inline_decode(xdr, 32 + 4 + 4); 105 - int ret; 106 - 107 - if (!p) 108 - return -EIO; 109 - 110 - p = _osd_xdr_decode_objid(p, &comp->oc_object_id); 111 - comp->oc_osd_version = be32_to_cpup(p++); 112 - comp->oc_cap_key_sec = be32_to_cpup(p); 113 - 114 - ret = _osd_xdr_decode_opaque_cred(&comp->oc_cap_key, xdr); 115 - if (unlikely(ret)) 116 - return ret; 117 - 118 - ret = _osd_xdr_decode_opaque_cred(&comp->oc_cap, xdr); 119 - return ret; 120 - } 121 - 122 - /* 123 - * struct pnfs_osd_data_map { 124 - * u32 odm_num_comps; 125 - * u64 odm_stripe_unit; 126 - * u32 odm_group_width; 127 - * u32 odm_group_depth; 128 - * u32 odm_mirror_cnt; 129 - * u32 odm_raid_algorithm; 130 - * }; // xdr size 4 + 8 + 4 + 4 + 4 + 4 131 - */ 132 - static inline int 133 - _osd_data_map_xdr_sz(void) 134 - { 135 - return 4 + 8 + 4 + 4 + 4 + 4; 136 - } 137 - 138 - static __be32 * 139 - _osd_xdr_decode_data_map(__be32 *p, struct pnfs_osd_data_map *data_map) 140 - { 141 - data_map->odm_num_comps = be32_to_cpup(p++); 142 - p = xdr_decode_hyper(p, &data_map->odm_stripe_unit); 143 - data_map->odm_group_width = be32_to_cpup(p++); 144 - data_map->odm_group_depth = be32_to_cpup(p++); 145 - data_map->odm_mirror_cnt = be32_to_cpup(p++); 146 - data_map->odm_raid_algorithm = be32_to_cpup(p++); 147 - dprintk("%s: odm_num_comps=%u odm_stripe_unit=%llu odm_group_width=%u " 148 - "odm_group_depth=%u odm_mirror_cnt=%u odm_raid_algorithm=%u\n", 149 - __func__, 150 - data_map->odm_num_comps, 151 - (unsigned long long)data_map->odm_stripe_unit, 152 - data_map->odm_group_width, 153 - data_map->odm_group_depth, 154 - data_map->odm_mirror_cnt, 155 - data_map->odm_raid_algorithm); 156 - return p; 157 - } 158 - 159 - int 
pnfs_osd_xdr_decode_layout_map(struct pnfs_osd_layout *layout, 160 - struct pnfs_osd_xdr_decode_layout_iter *iter, struct xdr_stream *xdr) 161 - { 162 - __be32 *p; 163 - 164 - memset(iter, 0, sizeof(*iter)); 165 - 166 - p = xdr_inline_decode(xdr, _osd_data_map_xdr_sz() + 4 + 4); 167 - if (unlikely(!p)) 168 - return -EINVAL; 169 - 170 - p = _osd_xdr_decode_data_map(p, &layout->olo_map); 171 - layout->olo_comps_index = be32_to_cpup(p++); 172 - layout->olo_num_comps = be32_to_cpup(p++); 173 - dprintk("%s: olo_comps_index=%d olo_num_comps=%d\n", __func__, 174 - layout->olo_comps_index, layout->olo_num_comps); 175 - 176 - iter->total_comps = layout->olo_num_comps; 177 - return 0; 178 - } 179 - 180 - bool pnfs_osd_xdr_decode_layout_comp(struct pnfs_osd_object_cred *comp, 181 - struct pnfs_osd_xdr_decode_layout_iter *iter, struct xdr_stream *xdr, 182 - int *err) 183 - { 184 - BUG_ON(iter->decoded_comps > iter->total_comps); 185 - if (iter->decoded_comps == iter->total_comps) 186 - return false; 187 - 188 - *err = _osd_xdr_decode_object_cred(comp, xdr); 189 - if (unlikely(*err)) { 190 - dprintk("%s: _osd_xdr_decode_object_cred=>%d decoded_comps=%d " 191 - "total_comps=%d\n", __func__, *err, 192 - iter->decoded_comps, iter->total_comps); 193 - return false; /* stop the loop */ 194 - } 195 - dprintk("%s: dev(%llx:%llx) par=0x%llx obj=0x%llx " 196 - "key_len=%u cap_len=%u\n", 197 - __func__, 198 - _DEVID_LO(&comp->oc_object_id.oid_device_id), 199 - _DEVID_HI(&comp->oc_object_id.oid_device_id), 200 - comp->oc_object_id.oid_partition_id, 201 - comp->oc_object_id.oid_object_id, 202 - comp->oc_cap_key.cred_len, comp->oc_cap.cred_len); 203 - 204 - iter->decoded_comps++; 205 - return true; 206 - } 207 - 208 - /* 209 - * Get Device Information Decoding 210 - * 211 - * Note: since Device Information is currently done synchronously, all 212 - * variable strings fields are left inside the rpc buffer and are only 213 - * pointed to by the pnfs_osd_deviceaddr members. 
So the read buffer 214 - * should not be freed while the returned information is in use. 215 - */ 216 - /* 217 - *struct nfs4_string { 218 - * unsigned int len; 219 - * char *data; 220 - *}; // size [variable] 221 - * NOTE: Returned string points to inside the XDR buffer 222 - */ 223 - static __be32 * 224 - __read_u8_opaque(__be32 *p, struct nfs4_string *str) 225 - { 226 - str->len = be32_to_cpup(p++); 227 - str->data = (char *)p; 228 - 229 - p += XDR_QUADLEN(str->len); 230 - return p; 231 - } 232 - 233 - /* 234 - * struct pnfs_osd_targetid { 235 - * u32 oti_type; 236 - * struct nfs4_string oti_scsi_device_id; 237 - * };// size 4 + [variable] 238 - */ 239 - static __be32 * 240 - __read_targetid(__be32 *p, struct pnfs_osd_targetid* targetid) 241 - { 242 - u32 oti_type; 243 - 244 - oti_type = be32_to_cpup(p++); 245 - targetid->oti_type = oti_type; 246 - 247 - switch (oti_type) { 248 - case OBJ_TARGET_SCSI_NAME: 249 - case OBJ_TARGET_SCSI_DEVICE_ID: 250 - p = __read_u8_opaque(p, &targetid->oti_scsi_device_id); 251 - } 252 - 253 - return p; 254 - } 255 - 256 - /* 257 - * struct pnfs_osd_net_addr { 258 - * struct nfs4_string r_netid; 259 - * struct nfs4_string r_addr; 260 - * }; 261 - */ 262 - static __be32 * 263 - __read_net_addr(__be32 *p, struct pnfs_osd_net_addr* netaddr) 264 - { 265 - p = __read_u8_opaque(p, &netaddr->r_netid); 266 - p = __read_u8_opaque(p, &netaddr->r_addr); 267 - 268 - return p; 269 - } 270 - 271 - /* 272 - * struct pnfs_osd_targetaddr { 273 - * u32 ota_available; 274 - * struct pnfs_osd_net_addr ota_netaddr; 275 - * }; 276 - */ 277 - static __be32 * 278 - __read_targetaddr(__be32 *p, struct pnfs_osd_targetaddr *targetaddr) 279 - { 280 - u32 ota_available; 281 - 282 - ota_available = be32_to_cpup(p++); 283 - targetaddr->ota_available = ota_available; 284 - 285 - if (ota_available) 286 - p = __read_net_addr(p, &targetaddr->ota_netaddr); 287 - 288 - 289 - return p; 290 - } 291 - 292 - /* 293 - * struct pnfs_osd_deviceaddr { 294 - * struct 
pnfs_osd_targetid oda_targetid; 295 - * struct pnfs_osd_targetaddr oda_targetaddr; 296 - * u8 oda_lun[8]; 297 - * struct nfs4_string oda_systemid; 298 - * struct pnfs_osd_object_cred oda_root_obj_cred; 299 - * struct nfs4_string oda_osdname; 300 - * }; 301 - */ 302 - 303 - /* We need this version for the pnfs_osd_xdr_decode_deviceaddr which does 304 - * not have an xdr_stream 305 - */ 306 - static __be32 * 307 - __read_opaque_cred(__be32 *p, 308 - struct pnfs_osd_opaque_cred *opaque_cred) 309 - { 310 - opaque_cred->cred_len = be32_to_cpu(*p++); 311 - opaque_cred->cred = p; 312 - return p + XDR_QUADLEN(opaque_cred->cred_len); 313 - } 314 - 315 - static __be32 * 316 - __read_object_cred(__be32 *p, struct pnfs_osd_object_cred *comp) 317 - { 318 - p = _osd_xdr_decode_objid(p, &comp->oc_object_id); 319 - comp->oc_osd_version = be32_to_cpup(p++); 320 - comp->oc_cap_key_sec = be32_to_cpup(p++); 321 - 322 - p = __read_opaque_cred(p, &comp->oc_cap_key); 323 - p = __read_opaque_cred(p, &comp->oc_cap); 324 - return p; 325 - } 326 - 327 - void pnfs_osd_xdr_decode_deviceaddr( 328 - struct pnfs_osd_deviceaddr *deviceaddr, __be32 *p) 329 - { 330 - p = __read_targetid(p, &deviceaddr->oda_targetid); 331 - 332 - p = __read_targetaddr(p, &deviceaddr->oda_targetaddr); 333 - 334 - p = xdr_decode_opaque_fixed(p, deviceaddr->oda_lun, 335 - sizeof(deviceaddr->oda_lun)); 336 - 337 - p = __read_u8_opaque(p, &deviceaddr->oda_systemid); 338 - 339 - p = __read_object_cred(p, &deviceaddr->oda_root_obj_cred); 340 - 341 - p = __read_u8_opaque(p, &deviceaddr->oda_osdname); 342 - 343 - /* libosd likes this terminated in dbg. 
It's last, so no problems */ 344 - deviceaddr->oda_osdname.data[deviceaddr->oda_osdname.len] = 0; 345 - } 346 - 347 - /* 348 - * struct pnfs_osd_layoutupdate { 349 - * u32 dsu_valid; 350 - * s64 dsu_delta; 351 - * u32 olu_ioerr_flag; 352 - * }; xdr size 4 + 8 + 4 353 - */ 354 - int 355 - pnfs_osd_xdr_encode_layoutupdate(struct xdr_stream *xdr, 356 - struct pnfs_osd_layoutupdate *lou) 357 - { 358 - __be32 *p = xdr_reserve_space(xdr, 4 + 8 + 4); 359 - 360 - if (!p) 361 - return -E2BIG; 362 - 363 - *p++ = cpu_to_be32(lou->dsu_valid); 364 - if (lou->dsu_valid) 365 - p = xdr_encode_hyper(p, lou->dsu_delta); 366 - *p++ = cpu_to_be32(lou->olu_ioerr_flag); 367 - return 0; 368 - } 369 - 370 - /* 371 - * struct pnfs_osd_objid { 372 - * struct nfs4_deviceid oid_device_id; 373 - * u64 oid_partition_id; 374 - * u64 oid_object_id; 375 - * }; // xdr size 32 bytes 376 - */ 377 - static inline __be32 * 378 - pnfs_osd_xdr_encode_objid(__be32 *p, struct pnfs_osd_objid *object_id) 379 - { 380 - p = xdr_encode_opaque_fixed(p, &object_id->oid_device_id.data, 381 - sizeof(object_id->oid_device_id.data)); 382 - p = xdr_encode_hyper(p, object_id->oid_partition_id); 383 - p = xdr_encode_hyper(p, object_id->oid_object_id); 384 - 385 - return p; 386 - } 387 - 388 - /* 389 - * struct pnfs_osd_ioerr { 390 - * struct pnfs_osd_objid oer_component; 391 - * u64 oer_comp_offset; 392 - * u64 oer_comp_length; 393 - * u32 oer_iswrite; 394 - * u32 oer_errno; 395 - * }; // xdr size 32 + 24 bytes 396 - */ 397 - void pnfs_osd_xdr_encode_ioerr(__be32 *p, struct pnfs_osd_ioerr *ioerr) 398 - { 399 - p = pnfs_osd_xdr_encode_objid(p, &ioerr->oer_component); 400 - p = xdr_encode_hyper(p, ioerr->oer_comp_offset); 401 - p = xdr_encode_hyper(p, ioerr->oer_comp_length); 402 - *p++ = cpu_to_be32(ioerr->oer_iswrite); 403 - *p = cpu_to_be32(ioerr->oer_errno); 404 - } 405 - 406 - __be32 *pnfs_osd_xdr_ioerr_reserve_space(struct xdr_stream *xdr) 407 - { 408 - __be32 *p; 409 - 410 - p = xdr_reserve_space(xdr, 32 + 24); 411 
- if (unlikely(!p)) 412 - dprintk("%s: out of xdr space\n", __func__); 413 - 414 - return p; 415 - }
+55 -22
fs/nfs/pagelist.c
··· 29 29 static struct kmem_cache *nfs_page_cachep; 30 30 static const struct rpc_call_ops nfs_pgio_common_ops; 31 31 32 - static bool nfs_pgarray_set(struct nfs_page_array *p, unsigned int pagecount) 33 - { 34 - p->npages = pagecount; 35 - if (pagecount <= ARRAY_SIZE(p->page_array)) 36 - p->pagevec = p->page_array; 37 - else { 38 - p->pagevec = kcalloc(pagecount, sizeof(struct page *), GFP_KERNEL); 39 - if (!p->pagevec) 40 - p->npages = 0; 41 - } 42 - return p->pagevec != NULL; 43 - } 44 - 45 32 struct nfs_pgio_mirror * 46 33 nfs_pgio_current_mirror(struct nfs_pageio_descriptor *desc) 47 34 { ··· 101 114 return wait_on_atomic_t(&l_ctx->io_count, nfs_wait_atomic_killable, 102 115 TASK_KILLABLE); 103 116 } 117 + 118 + /** 119 + * nfs_async_iocounter_wait - wait on a rpc_waitqueue for I/O 120 + * to complete 121 + * @task: the rpc_task that should wait 122 + * @l_ctx: nfs_lock_context with io_counter to check 123 + * 124 + * Returns true if there is outstanding I/O to wait on and the 125 + * task has been put to sleep. 
126 + */ 127 + bool 128 + nfs_async_iocounter_wait(struct rpc_task *task, struct nfs_lock_context *l_ctx) 129 + { 130 + struct inode *inode = d_inode(l_ctx->open_context->dentry); 131 + bool ret = false; 132 + 133 + if (atomic_read(&l_ctx->io_count) > 0) { 134 + rpc_sleep_on(&NFS_SERVER(inode)->uoc_rpcwaitq, task, NULL); 135 + ret = true; 136 + } 137 + 138 + if (atomic_read(&l_ctx->io_count) == 0) { 139 + rpc_wake_up_queued_task(&NFS_SERVER(inode)->uoc_rpcwaitq, task); 140 + ret = false; 141 + } 142 + 143 + return ret; 144 + } 145 + EXPORT_SYMBOL_GPL(nfs_async_iocounter_wait); 104 146 105 147 /* 106 148 * nfs_page_group_lock - lock the head of the page group ··· 414 398 req->wb_page = NULL; 415 399 } 416 400 if (l_ctx != NULL) { 417 - if (atomic_dec_and_test(&l_ctx->io_count)) 401 + if (atomic_dec_and_test(&l_ctx->io_count)) { 418 402 wake_up_atomic_t(&l_ctx->io_count); 403 + if (test_bit(NFS_CONTEXT_UNLOCK, &ctx->flags)) 404 + rpc_wake_up(&NFS_SERVER(d_inode(ctx->dentry))->uoc_rpcwaitq); 405 + } 419 406 nfs_put_lock_context(l_ctx); 420 407 req->wb_lock_context = NULL; 421 408 } ··· 696 677 const struct nfs_pgio_completion_ops *compl_ops, 697 678 const struct nfs_rw_ops *rw_ops, 698 679 size_t bsize, 699 - int io_flags) 680 + int io_flags, 681 + gfp_t gfp_flags) 700 682 { 701 683 struct nfs_pgio_mirror *new; 702 684 int i; ··· 721 701 /* until we have a request, we don't have an lseg and no 722 702 * idea how many mirrors there will be */ 723 703 new = kcalloc(NFS_PAGEIO_DESCRIPTOR_MIRROR_MAX, 724 - sizeof(struct nfs_pgio_mirror), GFP_KERNEL); 704 + sizeof(struct nfs_pgio_mirror), gfp_flags); 725 705 desc->pg_mirrors_dynamic = new; 726 706 desc->pg_mirrors = new; 727 707 ··· 774 754 *last_page; 775 755 struct list_head *head = &mirror->pg_list; 776 756 struct nfs_commit_info cinfo; 757 + struct nfs_page_array *pg_array = &hdr->page_array; 777 758 unsigned int pagecount, pageused; 759 + gfp_t gfp_flags = GFP_KERNEL; 778 760 779 761 pagecount = 
nfs_page_array_len(mirror->pg_base, mirror->pg_count); 780 - if (!nfs_pgarray_set(&hdr->page_array, pagecount)) { 781 - nfs_pgio_error(hdr); 782 - desc->pg_error = -ENOMEM; 783 - return desc->pg_error; 762 + 763 + if (pagecount <= ARRAY_SIZE(pg_array->page_array)) 764 + pg_array->pagevec = pg_array->page_array; 765 + else { 766 + if (hdr->rw_mode == FMODE_WRITE) 767 + gfp_flags = GFP_NOIO; 768 + pg_array->pagevec = kcalloc(pagecount, sizeof(struct page *), gfp_flags); 769 + if (!pg_array->pagevec) { 770 + pg_array->npages = 0; 771 + nfs_pgio_error(hdr); 772 + desc->pg_error = -ENOMEM; 773 + return desc->pg_error; 774 + } 784 775 } 785 776 786 777 nfs_init_cinfo(&cinfo, desc->pg_inode, desc->pg_dreq); ··· 1287 1256 mirror = &desc->pg_mirrors[midx]; 1288 1257 if (!list_empty(&mirror->pg_list)) { 1289 1258 prev = nfs_list_entry(mirror->pg_list.prev); 1290 - if (index != prev->wb_index + 1) 1291 - nfs_pageio_complete_mirror(desc, midx); 1259 + if (index != prev->wb_index + 1) { 1260 + nfs_pageio_complete(desc); 1261 + break; 1262 + } 1292 1263 } 1293 1264 } 1294 1265 }
+53 -9
fs/nfs/pnfs.c
··· 322 322 static void 323 323 pnfs_clear_layoutreturn_info(struct pnfs_layout_hdr *lo) 324 324 { 325 + struct pnfs_layout_segment *lseg; 325 326 lo->plh_return_iomode = 0; 326 327 lo->plh_return_seq = 0; 327 328 clear_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags); 329 + list_for_each_entry(lseg, &lo->plh_segs, pls_list) { 330 + if (!test_bit(NFS_LSEG_LAYOUTRETURN, &lseg->pls_flags)) 331 + continue; 332 + pnfs_set_plh_return_info(lo, lseg->pls_range.iomode, 0); 333 + } 328 334 } 329 335 330 336 static void pnfs_clear_layoutreturn_waitbit(struct pnfs_layout_hdr *lo) ··· 373 367 struct pnfs_layout_segment *lseg, *next; 374 368 375 369 set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags); 376 - pnfs_clear_layoutreturn_info(lo); 377 370 list_for_each_entry_safe(lseg, next, &lo->plh_segs, pls_list) 378 371 pnfs_clear_lseg_state(lseg, lseg_list); 372 + pnfs_clear_layoutreturn_info(lo); 379 373 pnfs_free_returned_lsegs(lo, lseg_list, &range, 0); 380 374 if (test_bit(NFS_LAYOUT_RETURN, &lo->plh_flags) && 381 375 !test_and_set_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags)) ··· 569 563 } 570 564 } 571 565 } 572 - EXPORT_SYMBOL_GPL(pnfs_put_lseg_locked); 573 566 574 567 /* 575 568 * is l2 fully contained in l1? 
··· 733 728 pnfs_layout_clear_fail_bit(lo, NFS_LAYOUT_RW_FAILED); 734 729 spin_unlock(&nfsi->vfs_inode.i_lock); 735 730 pnfs_free_lseg_list(&tmp_list); 731 + nfs_commit_inode(&nfsi->vfs_inode, 0); 736 732 pnfs_put_layout_hdr(lo); 737 733 } else 738 734 spin_unlock(&nfsi->vfs_inode.i_lock); ··· 1215 1209 dprintk("<-- %s status: %d\n", __func__, status); 1216 1210 return status; 1217 1211 } 1218 - EXPORT_SYMBOL_GPL(_pnfs_return_layout); 1219 1212 1220 1213 int 1221 1214 pnfs_commit_and_return_layout(struct inode *inode) ··· 1996 1991 spin_unlock(&ino->i_lock); 1997 1992 lseg->pls_layout = lo; 1998 1993 NFS_SERVER(ino)->pnfs_curr_ld->free_lseg(lseg); 1994 + if (!pnfs_layout_is_valid(lo)) 1995 + nfs_commit_inode(ino, 0); 1999 1996 return ERR_PTR(-EAGAIN); 2000 1997 } 2001 1998 ··· 2058 2051 bool return_now = false; 2059 2052 2060 2053 spin_lock(&inode->i_lock); 2054 + if (!pnfs_layout_is_valid(lo)) { 2055 + spin_unlock(&inode->i_lock); 2056 + return; 2057 + } 2061 2058 pnfs_set_plh_return_info(lo, range.iomode, 0); 2062 - /* Block LAYOUTGET */ 2063 - set_bit(NFS_LAYOUT_RETURN, &lo->plh_flags); 2064 2059 /* 2065 2060 * mark all matching lsegs so that we are sure to have no live 2066 2061 * segments at hand when sending layoutreturn. 
See pnfs_put_lseg() ··· 2084 2075 EXPORT_SYMBOL_GPL(pnfs_error_mark_layout_for_return); 2085 2076 2086 2077 void 2078 + pnfs_generic_pg_check_layout(struct nfs_pageio_descriptor *pgio) 2079 + { 2080 + if (pgio->pg_lseg == NULL || 2081 + test_bit(NFS_LSEG_VALID, &pgio->pg_lseg->pls_flags)) 2082 + return; 2083 + pnfs_put_lseg(pgio->pg_lseg); 2084 + pgio->pg_lseg = NULL; 2085 + } 2086 + EXPORT_SYMBOL_GPL(pnfs_generic_pg_check_layout); 2087 + 2088 + void 2087 2089 pnfs_generic_pg_init_read(struct nfs_pageio_descriptor *pgio, struct nfs_page *req) 2088 2090 { 2089 2091 u64 rd_size = req->wb_bytes; 2090 2092 2093 + pnfs_generic_pg_check_layout(pgio); 2091 2094 if (pgio->pg_lseg == NULL) { 2092 2095 if (pgio->pg_dreq == NULL) 2093 2096 rd_size = i_size_read(pgio->pg_inode) - req_offset(req); ··· 2130 2109 pnfs_generic_pg_init_write(struct nfs_pageio_descriptor *pgio, 2131 2110 struct nfs_page *req, u64 wb_size) 2132 2111 { 2112 + pnfs_generic_pg_check_layout(pgio); 2133 2113 if (pgio->pg_lseg == NULL) { 2134 2114 pgio->pg_lseg = pnfs_update_layout(pgio->pg_inode, 2135 2115 req->wb_context, ··· 2299 2277 enum pnfs_try_status trypnfs; 2300 2278 2301 2279 trypnfs = pnfs_try_to_write_data(hdr, call_ops, lseg, how); 2302 - if (trypnfs == PNFS_NOT_ATTEMPTED) 2280 + switch (trypnfs) { 2281 + case PNFS_NOT_ATTEMPTED: 2303 2282 pnfs_write_through_mds(desc, hdr); 2283 + case PNFS_ATTEMPTED: 2284 + break; 2285 + case PNFS_TRY_AGAIN: 2286 + /* cleanup hdr and prepare to redo pnfs */ 2287 + if (!test_and_set_bit(NFS_IOHDR_REDO, &hdr->flags)) { 2288 + struct nfs_pgio_mirror *mirror = nfs_pgio_current_mirror(desc); 2289 + list_splice_init(&hdr->pages, &mirror->pg_list); 2290 + mirror->pg_recoalesce = 1; 2291 + } 2292 + hdr->mds_ops->rpc_release(hdr); 2293 + } 2304 2294 } 2305 2295 2306 2296 static void pnfs_writehdr_free(struct nfs_pgio_header *hdr) ··· 2442 2408 enum pnfs_try_status trypnfs; 2443 2409 2444 2410 trypnfs = pnfs_try_to_read_data(hdr, call_ops, lseg); 2445 - if (trypnfs == 
PNFS_TRY_AGAIN) 2446 - pnfs_read_resend_pnfs(hdr); 2447 - if (trypnfs == PNFS_NOT_ATTEMPTED || hdr->task.tk_status) 2411 + switch (trypnfs) { 2412 + case PNFS_NOT_ATTEMPTED: 2448 2413 pnfs_read_through_mds(desc, hdr); 2414 + case PNFS_ATTEMPTED: 2415 + break; 2416 + case PNFS_TRY_AGAIN: 2417 + /* cleanup hdr and prepare to redo pnfs */ 2418 + if (!test_and_set_bit(NFS_IOHDR_REDO, &hdr->flags)) { 2419 + struct nfs_pgio_mirror *mirror = nfs_pgio_current_mirror(desc); 2420 + list_splice_init(&hdr->pages, &mirror->pg_list); 2421 + mirror->pg_recoalesce = 1; 2422 + } 2423 + hdr->mds_ops->rpc_release(hdr); 2424 + } 2449 2425 } 2450 2426 2451 2427 static void pnfs_readhdr_free(struct nfs_pgio_header *hdr)
+1 -5
fs/nfs/pnfs.h
··· 173 173 gfp_t gfp_flags); 174 174 175 175 int (*prepare_layoutreturn) (struct nfs4_layoutreturn_args *); 176 - void (*encode_layoutreturn) (struct xdr_stream *xdr, 177 - const struct nfs4_layoutreturn_args *args); 178 176 179 177 void (*cleanup_layoutcommit) (struct nfs4_layoutcommit_data *data); 180 178 int (*prepare_layoutcommit) (struct nfs4_layoutcommit_args *args); 181 - void (*encode_layoutcommit) (struct pnfs_layout_hdr *lo, 182 - struct xdr_stream *xdr, 183 - const struct nfs4_layoutcommit_args *args); 184 179 int (*prepare_layoutstats) (struct nfs42_layoutstat_args *args); 185 180 }; 186 181 ··· 234 239 235 240 void set_pnfs_layoutdriver(struct nfs_server *, const struct nfs_fh *, struct nfs_fsinfo *); 236 241 void unset_pnfs_layoutdriver(struct nfs_server *); 242 + void pnfs_generic_pg_check_layout(struct nfs_pageio_descriptor *pgio); 237 243 void pnfs_generic_pg_init_read(struct nfs_pageio_descriptor *, struct nfs_page *); 238 244 int pnfs_generic_pg_readpages(struct nfs_pageio_descriptor *desc); 239 245 void pnfs_generic_pg_init_write(struct nfs_pageio_descriptor *pgio,
+12 -12
fs/nfs/pnfs_nfs.c
··· 217 217 for (i = 0; i < fl_cinfo->nbuckets; i++, bucket++) { 218 218 if (list_empty(&bucket->committing)) 219 219 continue; 220 - data = nfs_commitdata_alloc(); 220 + /* 221 + * If the layout segment is invalid, then let 222 + * pnfs_generic_retry_commit() clean up the bucket. 223 + */ 224 + if (bucket->clseg && !pnfs_is_valid_lseg(bucket->clseg) && 225 + !test_bit(NFS_LSEG_LAYOUTRETURN, &bucket->clseg->pls_flags)) 226 + break; 227 + data = nfs_commitdata_alloc(false); 221 228 if (!data) 222 229 break; 223 230 data->ds_commit_index = i; ··· 290 283 unsigned int nreq = 0; 291 284 292 285 if (!list_empty(mds_pages)) { 293 - data = nfs_commitdata_alloc(); 294 - if (data != NULL) { 295 - data->ds_commit_index = -1; 296 - list_add(&data->pages, &list); 297 - nreq++; 298 - } else { 299 - nfs_retry_commit(mds_pages, NULL, cinfo, 0); 300 - pnfs_generic_retry_commit(cinfo, 0); 301 - return -ENOMEM; 302 - } 286 + data = nfs_commitdata_alloc(true); 287 + data->ds_commit_index = -1; 288 + list_add(&data->pages, &list); 289 + nreq++; 303 290 } 304 291 305 292 nreq += pnfs_generic_alloc_ds_commits(cinfo, &list); ··· 620 619 get_v3_ds_connect = NULL; 621 620 } 622 621 } 623 - EXPORT_SYMBOL_GPL(nfs4_pnfs_v3_ds_connect_unload); 624 622 625 623 static int _nfs4_pnfs_v3_ds_connect(struct nfs_server *mds_srv, 626 624 struct nfs4_pnfs_ds *ds,
+1 -1
fs/nfs/proc.c
··· 638 638 { 639 639 struct inode *inode = file_inode(filp); 640 640 641 - return nlmclnt_proc(NFS_SERVER(inode)->nlm_host, cmd, fl); 641 + return nlmclnt_proc(NFS_SERVER(inode)->nlm_host, cmd, fl, NULL); 642 642 } 643 643 644 644 /* Helper functions for NFS lock bounds checking */
+6 -3
fs/nfs/read.c
··· 35 35 36 36 static struct nfs_pgio_header *nfs_readhdr_alloc(void) 37 37 { 38 - return kmem_cache_zalloc(nfs_rdata_cachep, GFP_KERNEL); 38 + struct nfs_pgio_header *p = kmem_cache_zalloc(nfs_rdata_cachep, GFP_KERNEL); 39 + 40 + if (p) 41 + p->rw_mode = FMODE_READ; 42 + return p; 39 43 } 40 44 41 45 static void nfs_readhdr_free(struct nfs_pgio_header *rhdr) ··· 68 64 pg_ops = server->pnfs_curr_ld->pg_read_ops; 69 65 #endif 70 66 nfs_pageio_init(pgio, inode, pg_ops, compl_ops, &nfs_rw_read_ops, 71 - server->rsize, 0); 67 + server->rsize, 0, GFP_KERNEL); 72 68 } 73 69 EXPORT_SYMBOL_GPL(nfs_pageio_init_read); 74 70 ··· 455 451 } 456 452 457 453 static const struct nfs_rw_ops nfs_rw_read_ops = { 458 - .rw_mode = FMODE_READ, 459 454 .rw_alloc_header = nfs_readhdr_alloc, 460 455 .rw_free_header = nfs_readhdr_free, 461 456 .rw_done = nfs_readpage_done,
+56 -65
fs/nfs/write.c
··· 60 60 static struct kmem_cache *nfs_cdata_cachep; 61 61 static mempool_t *nfs_commit_mempool; 62 62 63 - struct nfs_commit_data *nfs_commitdata_alloc(void) 63 + struct nfs_commit_data *nfs_commitdata_alloc(bool never_fail) 64 64 { 65 - struct nfs_commit_data *p = mempool_alloc(nfs_commit_mempool, GFP_NOIO); 65 + struct nfs_commit_data *p; 66 66 67 - if (p) { 68 - memset(p, 0, sizeof(*p)); 69 - INIT_LIST_HEAD(&p->pages); 67 + if (never_fail) 68 + p = mempool_alloc(nfs_commit_mempool, GFP_NOIO); 69 + else { 70 + /* It is OK to do some reclaim, not no safe to wait 71 + * for anything to be returned to the pool. 72 + * mempool_alloc() cannot handle that particular combination, 73 + * so we need two separate attempts. 74 + */ 75 + p = mempool_alloc(nfs_commit_mempool, GFP_NOWAIT); 76 + if (!p) 77 + p = kmem_cache_alloc(nfs_cdata_cachep, GFP_NOIO | 78 + __GFP_NOWARN | __GFP_NORETRY); 79 + if (!p) 80 + return NULL; 70 81 } 82 + 83 + memset(p, 0, sizeof(*p)); 84 + INIT_LIST_HEAD(&p->pages); 71 85 return p; 72 86 } 73 87 EXPORT_SYMBOL_GPL(nfs_commitdata_alloc); ··· 96 82 { 97 83 struct nfs_pgio_header *p = mempool_alloc(nfs_wdata_mempool, GFP_NOIO); 98 84 99 - if (p) 85 + if (p) { 100 86 memset(p, 0, sizeof(*p)); 87 + p->rw_mode = FMODE_WRITE; 88 + } 101 89 return p; 102 90 } 103 91 ··· 563 547 { 564 548 nfs_unlock_request(req); 565 549 nfs_end_page_writeback(req); 566 - nfs_release_request(req); 567 550 generic_error_remove_page(page_file_mapping(req->wb_page), 568 551 req->wb_page); 552 + nfs_release_request(req); 553 + } 554 + 555 + static bool 556 + nfs_error_is_fatal_on_server(int err) 557 + { 558 + switch (err) { 559 + case 0: 560 + case -ERESTARTSYS: 561 + case -EINTR: 562 + return false; 563 + } 564 + return nfs_error_is_fatal(err); 569 565 } 570 566 571 567 /* ··· 585 557 * May return an error if the user signalled nfs_wait_on_request(). 
586 558 */ 587 559 static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio, 588 - struct page *page, bool nonblock, 589 - bool launder) 560 + struct page *page, bool nonblock) 590 561 { 591 562 struct nfs_page *req; 592 563 int ret = 0; ··· 601 574 WARN_ON_ONCE(test_bit(PG_CLEAN, &req->wb_flags)); 602 575 603 576 ret = 0; 577 + /* If there is a fatal error that covers this write, just exit */ 578 + if (nfs_error_is_fatal_on_server(req->wb_context->error)) 579 + goto out_launder; 580 + 604 581 if (!nfs_pageio_add_request(pgio, req)) { 605 582 ret = pgio->pg_error; 606 583 /* 607 - * Remove the problematic req upon fatal errors 608 - * in launder case, while other dirty pages can 609 - * still be around until they get flushed. 584 + * Remove the problematic req upon fatal errors on the server 610 585 */ 611 586 if (nfs_error_is_fatal(ret)) { 612 587 nfs_context_set_write_error(req->wb_context, ret); 613 - if (launder) { 614 - nfs_write_error_remove_page(req); 615 - goto out; 616 - } 588 + if (nfs_error_is_fatal_on_server(ret)) 589 + goto out_launder; 617 590 } 618 591 nfs_redirty_request(req); 619 592 ret = -EAGAIN; ··· 622 595 NFSIOS_WRITEPAGES, 1); 623 596 out: 624 597 return ret; 598 + out_launder: 599 + nfs_write_error_remove_page(req); 600 + return ret; 625 601 } 626 602 627 603 static int nfs_do_writepage(struct page *page, struct writeback_control *wbc, 628 - struct nfs_pageio_descriptor *pgio, bool launder) 604 + struct nfs_pageio_descriptor *pgio) 629 605 { 630 606 int ret; 631 607 632 608 nfs_pageio_cond_complete(pgio, page_index(page)); 633 - ret = nfs_page_async_flush(pgio, page, wbc->sync_mode == WB_SYNC_NONE, 634 - launder); 609 + ret = nfs_page_async_flush(pgio, page, wbc->sync_mode == WB_SYNC_NONE); 635 610 if (ret == -EAGAIN) { 636 611 redirty_page_for_writepage(wbc, page); 637 612 ret = 0; ··· 645 616 * Write an mmapped page to the server. 
646 617 */ 647 618 static int nfs_writepage_locked(struct page *page, 648 - struct writeback_control *wbc, 649 - bool launder) 619 + struct writeback_control *wbc) 650 620 { 651 621 struct nfs_pageio_descriptor pgio; 652 622 struct inode *inode = page_file_mapping(page)->host; ··· 654 626 nfs_inc_stats(inode, NFSIOS_VFSWRITEPAGE); 655 627 nfs_pageio_init_write(&pgio, inode, 0, 656 628 false, &nfs_async_write_completion_ops); 657 - err = nfs_do_writepage(page, wbc, &pgio, launder); 629 + err = nfs_do_writepage(page, wbc, &pgio); 658 630 nfs_pageio_complete(&pgio); 659 631 if (err < 0) 660 632 return err; ··· 667 639 { 668 640 int ret; 669 641 670 - ret = nfs_writepage_locked(page, wbc, false); 642 + ret = nfs_writepage_locked(page, wbc); 671 643 unlock_page(page); 672 644 return ret; 673 645 } ··· 676 648 { 677 649 int ret; 678 650 679 - ret = nfs_do_writepage(page, wbc, data, false); 651 + ret = nfs_do_writepage(page, wbc, data); 680 652 unlock_page(page); 681 653 return ret; 682 654 } ··· 1395 1367 pg_ops = server->pnfs_curr_ld->pg_write_ops; 1396 1368 #endif 1397 1369 nfs_pageio_init(pgio, inode, pg_ops, compl_ops, &nfs_rw_write_ops, 1398 - server->wsize, ioflags); 1370 + server->wsize, ioflags, GFP_NOIO); 1399 1371 } 1400 1372 EXPORT_SYMBOL_GPL(nfs_pageio_init_write); 1401 1373 ··· 1732 1704 if (list_empty(head)) 1733 1705 return 0; 1734 1706 1735 - data = nfs_commitdata_alloc(); 1736 - 1737 - if (!data) 1738 - goto out_bad; 1707 + data = nfs_commitdata_alloc(true); 1739 1708 1740 1709 /* Set up the argument struct */ 1741 1710 nfs_init_commit(data, head, NULL, cinfo); 1742 1711 atomic_inc(&cinfo->mds->rpcs_out); 1743 1712 return nfs_initiate_commit(NFS_CLIENT(inode), data, NFS_PROTO(inode), 1744 1713 data->mds_ops, how, 0); 1745 - out_bad: 1746 - nfs_retry_commit(head, NULL, cinfo, 0); 1747 - return -ENOMEM; 1748 1714 } 1749 - 1750 - int nfs_commit_file(struct file *file, struct nfs_write_verifier *verf) 1751 - { 1752 - struct inode *inode = file_inode(file); 
1753 - struct nfs_open_context *open; 1754 - struct nfs_commit_info cinfo; 1755 - struct nfs_page *req; 1756 - int ret; 1757 - 1758 - open = get_nfs_open_context(nfs_file_open_context(file)); 1759 - req = nfs_create_request(open, NULL, NULL, 0, i_size_read(inode)); 1760 - if (IS_ERR(req)) { 1761 - ret = PTR_ERR(req); 1762 - goto out_put; 1763 - } 1764 - 1765 - nfs_init_cinfo_from_inode(&cinfo, inode); 1766 - 1767 - memcpy(&req->wb_verf, verf, sizeof(struct nfs_write_verifier)); 1768 - nfs_request_add_commit_list(req, &cinfo); 1769 - ret = nfs_commit_inode(inode, FLUSH_SYNC); 1770 - if (ret > 0) 1771 - ret = 0; 1772 - 1773 - nfs_free_request(req); 1774 - out_put: 1775 - put_nfs_open_context(open); 1776 - return ret; 1777 - } 1778 - EXPORT_SYMBOL_GPL(nfs_commit_file); 1779 1715 1780 1716 /* 1781 1717 * COMMIT call returned ··· 1977 1985 /* 1978 1986 * Write back all requests on one page - we do this before reading it. 1979 1987 */ 1980 - int nfs_wb_single_page(struct inode *inode, struct page *page, bool launder) 1988 + int nfs_wb_page(struct inode *inode, struct page *page) 1981 1989 { 1982 1990 loff_t range_start = page_file_offset(page); 1983 1991 loff_t range_end = range_start + (loff_t)(PAGE_SIZE - 1); ··· 1994 2002 for (;;) { 1995 2003 wait_on_page_writeback(page); 1996 2004 if (clear_page_dirty_for_io(page)) { 1997 - ret = nfs_writepage_locked(page, &wbc, launder); 2005 + ret = nfs_writepage_locked(page, &wbc); 1998 2006 if (ret < 0) 1999 2007 goto out_error; 2000 2008 continue; ··· 2099 2107 } 2100 2108 2101 2109 static const struct nfs_rw_ops nfs_rw_write_ops = { 2102 - .rw_mode = FMODE_WRITE, 2103 2110 .rw_alloc_header = nfs_writehdr_alloc, 2104 2111 .rw_free_header = nfs_writehdr_free, 2105 2112 .rw_done = nfs_writeback_done,
+2
include/linux/fs.h
··· 909 909 #define FL_OFDLCK 1024 /* lock is "owned" by struct file */ 910 910 #define FL_LAYOUT 2048 /* outstanding pNFS layout */ 911 911 912 + #define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE) 913 + 912 914 /* 913 915 * Special return value from posix_lock_file() and vfs_lock_file() for 914 916 * asynchronous locking.
+22 -2
include/linux/lockd/bind.h
··· 18 18 19 19 /* Dummy declarations */ 20 20 struct svc_rqst; 21 + struct rpc_task; 21 22 22 23 /* 23 24 * This is the set of functions for lockd->nfsd communication ··· 44 43 u32 nfs_version; 45 44 int noresvport; 46 45 struct net *net; 46 + const struct nlmclnt_operations *nlmclnt_ops; 47 47 }; 48 48 49 49 /* ··· 54 52 extern struct nlm_host *nlmclnt_init(const struct nlmclnt_initdata *nlm_init); 55 53 extern void nlmclnt_done(struct nlm_host *host); 56 54 57 - extern int nlmclnt_proc(struct nlm_host *host, int cmd, 58 - struct file_lock *fl); 55 + /* 56 + * NLM client operations provide a means to modify RPC processing of NLM 57 + * requests. Callbacks receive a pointer to data passed into the call to 58 + * nlmclnt_proc(). 59 + */ 60 + struct nlmclnt_operations { 61 + /* Called on successful allocation of nlm_rqst, use for allocation or 62 + * reference counting. */ 63 + void (*nlmclnt_alloc_call)(void *); 64 + 65 + /* Called in rpc_task_prepare for unlock. A return value of true 66 + * indicates the callback has put the task to sleep on a waitqueue 67 + * and NLM should not call rpc_call_start(). */ 68 + bool (*nlmclnt_unlock_prepare)(struct rpc_task*, void *); 69 + 70 + /* Called when the nlm_rqst is freed, callbacks should clean up here */ 71 + void (*nlmclnt_release_call)(void *); 72 + }; 73 + 74 + extern int nlmclnt_proc(struct nlm_host *host, int cmd, struct file_lock *fl, void *data); 59 75 extern int lockd_up(struct net *net); 60 76 extern void lockd_down(struct net *net); 61 77
+2
include/linux/lockd/lockd.h
··· 69 69 char *h_addrbuf; /* address eyecatcher */ 70 70 struct net *net; /* host net */ 71 71 char nodename[UNX_MAXNODENAME + 1]; 72 + const struct nlmclnt_operations *h_nlmclnt_ops; /* Callback ops for NLM users */ 72 73 }; 73 74 74 75 /* ··· 143 142 struct nlm_block * a_block; 144 143 unsigned int a_retries; /* Retry count */ 145 144 u8 a_owner[NLMCLNT_OHSIZE]; 145 + void * a_callback_data; /* sent to nlmclnt_operations callbacks */ 146 146 }; 147 147 148 148 /*
+3 -14
include/linux/nfs_fs.h
··· 76 76 #define NFS_CONTEXT_ERROR_WRITE (0) 77 77 #define NFS_CONTEXT_RESEND_WRITES (1) 78 78 #define NFS_CONTEXT_BAD (2) 79 + #define NFS_CONTEXT_UNLOCK (3) 79 80 int error; 80 81 81 82 struct list_head list; ··· 500 499 */ 501 500 extern int nfs_sync_inode(struct inode *inode); 502 501 extern int nfs_wb_all(struct inode *inode); 503 - extern int nfs_wb_single_page(struct inode *inode, struct page *page, bool launder); 502 + extern int nfs_wb_page(struct inode *inode, struct page *page); 504 503 extern int nfs_wb_page_cancel(struct inode *inode, struct page* page); 505 504 extern int nfs_commit_inode(struct inode *, int); 506 - extern struct nfs_commit_data *nfs_commitdata_alloc(void); 505 + extern struct nfs_commit_data *nfs_commitdata_alloc(bool never_fail); 507 506 extern void nfs_commit_free(struct nfs_commit_data *data); 508 - 509 - static inline int 510 - nfs_wb_launder_page(struct inode *inode, struct page *page) 511 - { 512 - return nfs_wb_single_page(inode, page, true); 513 - } 514 - 515 - static inline int 516 - nfs_wb_page(struct inode *inode, struct page *page) 517 - { 518 - return nfs_wb_single_page(inode, page, false); 519 - } 520 507 521 508 static inline int 522 509 nfs_have_writebacks(struct inode *inode)
+1
include/linux/nfs_fs_sb.h
··· 221 221 u32 mountd_version; 222 222 unsigned short mountd_port; 223 223 unsigned short mountd_protocol; 224 + struct rpc_wait_queue uoc_rpcwaitq; 224 225 }; 225 226 226 227 /* Server capabilities */
+3 -2
include/linux/nfs_page.h
··· 64 64 }; 65 65 66 66 struct nfs_rw_ops { 67 - const fmode_t rw_mode; 68 67 struct nfs_pgio_header *(*rw_alloc_header)(void); 69 68 void (*rw_free_header)(struct nfs_pgio_header *); 70 69 int (*rw_done)(struct rpc_task *, struct nfs_pgio_header *, ··· 123 124 const struct nfs_pgio_completion_ops *compl_ops, 124 125 const struct nfs_rw_ops *rw_ops, 125 126 size_t bsize, 126 - int how); 127 + int how, 128 + gfp_t gfp_flags); 127 129 extern int nfs_pageio_add_request(struct nfs_pageio_descriptor *, 128 130 struct nfs_page *); 129 131 extern int nfs_pageio_resend(struct nfs_pageio_descriptor *, ··· 141 141 extern void nfs_page_group_lock_wait(struct nfs_page *); 142 142 extern void nfs_page_group_unlock(struct nfs_page *); 143 143 extern bool nfs_page_group_sync_on_bit(struct nfs_page *, unsigned int); 144 + extern bool nfs_async_iocounter_wait(struct rpc_task *, struct nfs_lock_context *); 144 145 145 146 /* 146 147 * Lock the page of an asynchronous request
+3
include/linux/nfs_xdr.h
··· 1383 1383 struct nfs42_write_res write_res; 1384 1384 bool consecutive; 1385 1385 bool synchronous; 1386 + struct nfs_commitres commit_res; 1386 1387 }; 1387 1388 1388 1389 struct nfs42_seek_args { ··· 1428 1427 struct list_head pages; 1429 1428 struct nfs_page *req; 1430 1429 struct nfs_writeverf verf; /* Used for writes */ 1430 + fmode_t rw_mode; 1431 1431 struct pnfs_layout_segment *lseg; 1432 1432 loff_t io_start; 1433 1433 const struct rpc_call_ops *mds_ops; ··· 1552 1550 const struct inode_operations *dir_inode_ops; 1553 1551 const struct inode_operations *file_inode_ops; 1554 1552 const struct file_operations *file_ops; 1553 + const struct nlmclnt_operations *nlmclnt_ops; 1555 1554 1556 1555 int (*getroot) (struct nfs_server *, struct nfs_fh *, 1557 1556 struct nfs_fsinfo *);
-8
net/sunrpc/clnt.c
··· 1042 1042 struct rpc_task *task; 1043 1043 1044 1044 task = rpc_new_task(task_setup_data); 1045 - if (IS_ERR(task)) 1046 - goto out; 1047 1045 1048 1046 rpc_task_set_client(task, task_setup_data->rpc_client); 1049 1047 rpc_task_set_rpc_message(task, task_setup_data->rpc_message); ··· 1051 1053 1052 1054 atomic_inc(&task->tk_count); 1053 1055 rpc_execute(task); 1054 - out: 1055 1056 return task; 1056 1057 } 1057 1058 EXPORT_SYMBOL_GPL(rpc_run_task); ··· 1137 1140 * Create an rpc_task to send the data 1138 1141 */ 1139 1142 task = rpc_new_task(&task_setup_data); 1140 - if (IS_ERR(task)) { 1141 - xprt_free_bc_request(req); 1142 - goto out; 1143 - } 1144 1143 task->tk_rqstp = req; 1145 1144 1146 1145 /* ··· 1151 1158 WARN_ON_ONCE(atomic_read(&task->tk_count) != 2); 1152 1159 rpc_execute(task); 1153 1160 1154 - out: 1155 1161 dprintk("RPC: rpc_run_bc_task: task= %p\n", task); 1156 1162 return task; 1157 1163 }
-5
net/sunrpc/sched.c
··· 965 965 966 966 if (task == NULL) { 967 967 task = rpc_alloc_task(); 968 - if (task == NULL) { 969 - rpc_release_calldata(setup_data->callback_ops, 970 - setup_data->callback_data); 971 - return ERR_PTR(-ENOMEM); 972 - } 973 968 flags = RPC_TASK_DYNAMIC; 974 969 } 975 970
+1 -1
net/sunrpc/xdr.c
··· 807 807 EXPORT_SYMBOL_GPL(xdr_init_decode); 808 808 809 809 /** 810 - * xdr_init_decode - Initialize an xdr_stream for decoding data. 810 + * xdr_init_decode_pages - Initialize an xdr_stream for decoding into pages 811 811 * @xdr: pointer to xdr_stream struct 812 812 * @buf: pointer to XDR buffer from which to decode data 813 813 * @pages: list of pages to decode into
+1
net/sunrpc/xprt.c
··· 651 651 xprt_wake_pending_tasks(xprt, -EAGAIN); 652 652 spin_unlock_bh(&xprt->transport_lock); 653 653 } 654 + EXPORT_SYMBOL_GPL(xprt_force_disconnect); 654 655 655 656 /** 656 657 * xprt_conditional_disconnect - force a transport to disconnect
+7 -5
net/sunrpc/xprtrdma/rpc_rdma.c
··· 494 494 } 495 495 sge->length = len; 496 496 497 - ib_dma_sync_single_for_device(ia->ri_device, sge->addr, 497 + ib_dma_sync_single_for_device(rdmab_device(rb), sge->addr, 498 498 sge->length, DMA_TO_DEVICE); 499 499 req->rl_send_wr.num_sge++; 500 500 return true; ··· 523 523 sge[sge_no].addr = rdmab_addr(rb); 524 524 sge[sge_no].length = xdr->head[0].iov_len; 525 525 sge[sge_no].lkey = rdmab_lkey(rb); 526 - ib_dma_sync_single_for_device(device, sge[sge_no].addr, 526 + ib_dma_sync_single_for_device(rdmab_device(rb), sge[sge_no].addr, 527 527 sge[sge_no].length, DMA_TO_DEVICE); 528 528 529 529 /* If there is a Read chunk, the page list is being handled ··· 781 781 return 0; 782 782 783 783 out_err: 784 - pr_err("rpcrdma: rpcrdma_marshal_req failed, status %ld\n", 785 - PTR_ERR(iptr)); 786 - r_xprt->rx_stats.failed_marshal_count++; 784 + if (PTR_ERR(iptr) != -ENOBUFS) { 785 + pr_err("rpcrdma: rpcrdma_marshal_req failed, status %ld\n", 786 + PTR_ERR(iptr)); 787 + r_xprt->rx_stats.failed_marshal_count++; 788 + } 787 789 return PTR_ERR(iptr); 788 790 } 789 791
+49 -8
net/sunrpc/xprtrdma/transport.c
··· 66 66 unsigned int xprt_rdma_max_inline_read = RPCRDMA_DEF_INLINE; 67 67 static unsigned int xprt_rdma_max_inline_write = RPCRDMA_DEF_INLINE; 68 68 static unsigned int xprt_rdma_inline_write_padding; 69 - static unsigned int xprt_rdma_memreg_strategy = RPCRDMA_FRMR; 70 - int xprt_rdma_pad_optimize = 0; 69 + unsigned int xprt_rdma_memreg_strategy = RPCRDMA_FRMR; 70 + int xprt_rdma_pad_optimize; 71 71 72 72 #if IS_ENABLED(CONFIG_SUNRPC_DEBUG) 73 73 ··· 396 396 397 397 new_xprt = rpcx_to_rdmax(xprt); 398 398 399 - rc = rpcrdma_ia_open(new_xprt, sap, xprt_rdma_memreg_strategy); 399 + rc = rpcrdma_ia_open(new_xprt, sap); 400 400 if (rc) 401 401 goto out1; 402 402 ··· 457 457 return ERR_PTR(rc); 458 458 } 459 459 460 - /* 461 - * Close a connection, during shutdown or timeout/reconnect 460 + /** 461 + * xprt_rdma_close - Close down RDMA connection 462 + * @xprt: generic transport to be closed 463 + * 464 + * Called during transport shutdown reconnect, or device 465 + * removal. Caller holds the transport's write lock. 
462 466 */ 463 467 static void 464 468 xprt_rdma_close(struct rpc_xprt *xprt) 465 469 { 466 470 struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt); 471 + struct rpcrdma_ep *ep = &r_xprt->rx_ep; 472 + struct rpcrdma_ia *ia = &r_xprt->rx_ia; 467 473 468 - dprintk("RPC: %s: closing\n", __func__); 469 - if (r_xprt->rx_ep.rep_connected > 0) 474 + dprintk("RPC: %s: closing xprt %p\n", __func__, xprt); 475 + 476 + if (test_and_clear_bit(RPCRDMA_IAF_REMOVING, &ia->ri_flags)) { 477 + xprt_clear_connected(xprt); 478 + rpcrdma_ia_remove(ia); 479 + return; 480 + } 481 + if (ep->rep_connected == -ENODEV) 482 + return; 483 + if (ep->rep_connected > 0) 470 484 xprt->reestablish_timeout = 0; 471 485 xprt_disconnect_done(xprt); 472 - rpcrdma_ep_disconnect(&r_xprt->rx_ep, &r_xprt->rx_ia); 486 + rpcrdma_ep_disconnect(ep, ia); 473 487 } 474 488 475 489 static void ··· 496 482 sap = (struct sockaddr_in *)&rpcx_to_rdmad(xprt).addr; 497 483 sap->sin_port = htons(port); 498 484 dprintk("RPC: %s: %u\n", __func__, port); 485 + } 486 + 487 + /** 488 + * xprt_rdma_timer - invoked when an RPC times out 489 + * @xprt: controlling RPC transport 490 + * @task: RPC task that timed out 491 + * 492 + * Invoked when the transport is still connected, but an RPC 493 + * retransmit timeout occurs. 494 + * 495 + * Since RDMA connections don't have a keep-alive, forcibly 496 + * disconnect and retry to connect. This drives full 497 + * detection of the network path, and retransmissions of 498 + * all pending RPCs. 499 + */ 500 + static void 501 + xprt_rdma_timer(struct rpc_xprt *xprt, struct rpc_task *task) 502 + { 503 + dprintk("RPC: %5u %s: xprt = %p\n", task->tk_pid, __func__, xprt); 504 + 505 + xprt_force_disconnect(xprt); 499 506 } 500 507 501 508 static void ··· 694 659 * xprt_rdma_send_request - marshal and send an RPC request 695 660 * @task: RPC task with an RPC message in rq_snd_buf 696 661 * 662 + * Caller holds the transport's write lock. 
663 + * 697 664 * Return values: 698 665 * 0: The request has been sent 699 666 * ENOTCONN: Caller needs to invoke connect logic then call again ··· 721 684 struct rpcrdma_req *req = rpcr_to_rdmar(rqst); 722 685 struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt); 723 686 int rc = 0; 687 + 688 + if (!xprt_connected(xprt)) 689 + goto drop_connection; 724 690 725 691 /* On retransmit, remove any previously registered chunks */ 726 692 if (unlikely(!list_empty(&req->rl_registered))) ··· 816 776 .alloc_slot = xprt_alloc_slot, 817 777 .release_request = xprt_release_rqst_cong, /* ditto */ 818 778 .set_retrans_timeout = xprt_set_retrans_timeout_def, /* ditto */ 779 + .timer = xprt_rdma_timer, 819 780 .rpcbind = rpcb_getport_async, /* sunrpc/rpcb_clnt.c */ 820 781 .set_port = xprt_rdma_set_port, 821 782 .connect = xprt_rdma_connect,
+218 -105
net/sunrpc/xprtrdma/verbs.c
··· 53 53 #include <linux/sunrpc/addr.h> 54 54 #include <linux/sunrpc/svc_rdma.h> 55 55 #include <asm/bitops.h> 56 - #include <linux/module.h> /* try_module_get()/module_put() */ 56 + 57 57 #include <rdma/ib_cm.h> 58 58 59 59 #include "xprt_rdma.h" ··· 69 69 /* 70 70 * internal functions 71 71 */ 72 + static void rpcrdma_create_mrs(struct rpcrdma_xprt *r_xprt); 73 + static void rpcrdma_destroy_mrs(struct rpcrdma_buffer *buf); 74 + static void rpcrdma_dma_unmap_regbuf(struct rpcrdma_regbuf *rb); 72 75 73 - static struct workqueue_struct *rpcrdma_receive_wq; 76 + static struct workqueue_struct *rpcrdma_receive_wq __read_mostly; 74 77 75 78 int 76 79 rpcrdma_alloc_wq(void) ··· 183 180 rep->rr_wc_flags = wc->wc_flags; 184 181 rep->rr_inv_rkey = wc->ex.invalidate_rkey; 185 182 186 - ib_dma_sync_single_for_cpu(rep->rr_device, 183 + ib_dma_sync_single_for_cpu(rdmab_device(rep->rr_rdmabuf), 187 184 rdmab_addr(rep->rr_rdmabuf), 188 185 rep->rr_len, DMA_FROM_DEVICE); 189 186 ··· 265 262 __func__, ep); 266 263 complete(&ia->ri_done); 267 264 break; 265 + case RDMA_CM_EVENT_DEVICE_REMOVAL: 266 + #if IS_ENABLED(CONFIG_SUNRPC_DEBUG) 267 + pr_info("rpcrdma: removing device for %pIS:%u\n", 268 + sap, rpc_get_port(sap)); 269 + #endif 270 + set_bit(RPCRDMA_IAF_REMOVING, &ia->ri_flags); 271 + ep->rep_connected = -ENODEV; 272 + xprt_force_disconnect(&xprt->rx_xprt); 273 + wait_for_completion(&ia->ri_remove_done); 274 + 275 + ia->ri_id = NULL; 276 + ia->ri_pd = NULL; 277 + ia->ri_device = NULL; 278 + /* Return 1 to ensure the core destroys the id. */ 279 + return 1; 268 280 case RDMA_CM_EVENT_ESTABLISHED: 269 281 connstate = 1; 270 282 ib_query_qp(ia->ri_id->qp, attr, ··· 309 291 goto connected; 310 292 case RDMA_CM_EVENT_DISCONNECTED: 311 293 connstate = -ECONNABORTED; 312 - goto connected; 313 - case RDMA_CM_EVENT_DEVICE_REMOVAL: 314 - connstate = -ENODEV; 315 294 connected: 316 295 dprintk("RPC: %s: %sconnected\n", 317 296 __func__, connstate > 0 ? 
"" : "dis"); ··· 344 329 return 0; 345 330 } 346 331 347 - static void rpcrdma_destroy_id(struct rdma_cm_id *id) 348 - { 349 - if (id) { 350 - module_put(id->device->owner); 351 - rdma_destroy_id(id); 352 - } 353 - } 354 - 355 332 static struct rdma_cm_id * 356 333 rpcrdma_create_id(struct rpcrdma_xprt *xprt, 357 334 struct rpcrdma_ia *ia, struct sockaddr *addr) ··· 353 346 int rc; 354 347 355 348 init_completion(&ia->ri_done); 349 + init_completion(&ia->ri_remove_done); 356 350 357 351 id = rdma_create_id(&init_net, rpcrdma_conn_upcall, xprt, RDMA_PS_TCP, 358 352 IB_QPT_RC); ··· 378 370 goto out; 379 371 } 380 372 381 - /* FIXME: 382 - * Until xprtrdma supports DEVICE_REMOVAL, the provider must 383 - * be pinned while there are active NFS/RDMA mounts to prevent 384 - * hangs and crashes at umount time. 385 - */ 386 - if (!ia->ri_async_rc && !try_module_get(id->device->owner)) { 387 - dprintk("RPC: %s: Failed to get device module\n", 388 - __func__); 389 - ia->ri_async_rc = -ENODEV; 390 - } 391 373 rc = ia->ri_async_rc; 392 374 if (rc) 393 375 goto out; ··· 387 389 if (rc) { 388 390 dprintk("RPC: %s: rdma_resolve_route() failed %i\n", 389 391 __func__, rc); 390 - goto put; 392 + goto out; 391 393 } 392 394 rc = wait_for_completion_interruptible_timeout(&ia->ri_done, wtimeout); 393 395 if (rc < 0) { 394 396 dprintk("RPC: %s: wait() exited: %i\n", 395 397 __func__, rc); 396 - goto put; 398 + goto out; 397 399 } 398 400 rc = ia->ri_async_rc; 399 401 if (rc) 400 - goto put; 402 + goto out; 401 403 402 404 return id; 403 - put: 404 - module_put(id->device->owner); 405 + 405 406 out: 406 407 rdma_destroy_id(id); 407 408 return ERR_PTR(rc); ··· 410 413 * Exported functions. 411 414 */ 412 415 413 - /* 414 - * Open and initialize an Interface Adapter. 415 - * o initializes fields of struct rpcrdma_ia, including 416 - * interface and provider attributes and protection zone. 416 + /** 417 + * rpcrdma_ia_open - Open and initialize an Interface Adapter. 
418 + * @xprt: controlling transport 419 + * @addr: IP address of remote peer 420 + * 421 + * Returns 0 on success, negative errno if an appropriate 422 + * Interface Adapter could not be found and opened. 417 423 */ 418 424 int 419 - rpcrdma_ia_open(struct rpcrdma_xprt *xprt, struct sockaddr *addr, int memreg) 425 + rpcrdma_ia_open(struct rpcrdma_xprt *xprt, struct sockaddr *addr) 420 426 { 421 427 struct rpcrdma_ia *ia = &xprt->rx_ia; 422 428 int rc; ··· 427 427 ia->ri_id = rpcrdma_create_id(xprt, ia, addr); 428 428 if (IS_ERR(ia->ri_id)) { 429 429 rc = PTR_ERR(ia->ri_id); 430 - goto out1; 430 + goto out_err; 431 431 } 432 432 ia->ri_device = ia->ri_id->device; 433 433 ··· 435 435 if (IS_ERR(ia->ri_pd)) { 436 436 rc = PTR_ERR(ia->ri_pd); 437 437 pr_err("rpcrdma: ib_alloc_pd() returned %d\n", rc); 438 - goto out2; 438 + goto out_err; 439 439 } 440 440 441 - switch (memreg) { 441 + switch (xprt_rdma_memreg_strategy) { 442 442 case RPCRDMA_FRMR: 443 443 if (frwr_is_supported(ia)) { 444 444 ia->ri_ops = &rpcrdma_frwr_memreg_ops; ··· 452 452 } 453 453 /*FALLTHROUGH*/ 454 454 default: 455 - pr_err("rpcrdma: Unsupported memory registration mode: %d\n", 456 - memreg); 455 + pr_err("rpcrdma: Device %s does not support memreg mode %d\n", 456 + ia->ri_device->name, xprt_rdma_memreg_strategy); 457 457 rc = -EINVAL; 458 - goto out3; 458 + goto out_err; 459 459 } 460 460 461 461 return 0; 462 462 463 - out3: 464 - ib_dealloc_pd(ia->ri_pd); 465 - ia->ri_pd = NULL; 466 - out2: 467 - rpcrdma_destroy_id(ia->ri_id); 468 - ia->ri_id = NULL; 469 - out1: 463 + out_err: 464 + rpcrdma_ia_close(ia); 470 465 return rc; 471 466 } 472 467 473 - /* 474 - * Clean up/close an IA. 475 - * o if event handles and PD have been initialized, free them. 
476 - * o close the IA 468 + /** 469 + * rpcrdma_ia_remove - Handle device driver unload 470 + * @ia: interface adapter being removed 471 + * 472 + * Divest transport H/W resources associated with this adapter, 473 + * but allow it to be restored later. 474 + */ 475 + void 476 + rpcrdma_ia_remove(struct rpcrdma_ia *ia) 477 + { 478 + struct rpcrdma_xprt *r_xprt = container_of(ia, struct rpcrdma_xprt, 479 + rx_ia); 480 + struct rpcrdma_ep *ep = &r_xprt->rx_ep; 481 + struct rpcrdma_buffer *buf = &r_xprt->rx_buf; 482 + struct rpcrdma_req *req; 483 + struct rpcrdma_rep *rep; 484 + 485 + cancel_delayed_work_sync(&buf->rb_refresh_worker); 486 + 487 + /* This is similar to rpcrdma_ep_destroy, but: 488 + * - Don't cancel the connect worker. 489 + * - Don't call rpcrdma_ep_disconnect, which waits 490 + * for another conn upcall, which will deadlock. 491 + * - rdma_disconnect is unneeded, the underlying 492 + * connection is already gone. 493 + */ 494 + if (ia->ri_id->qp) { 495 + ib_drain_qp(ia->ri_id->qp); 496 + rdma_destroy_qp(ia->ri_id); 497 + ia->ri_id->qp = NULL; 498 + } 499 + ib_free_cq(ep->rep_attr.recv_cq); 500 + ib_free_cq(ep->rep_attr.send_cq); 501 + 502 + /* The ULP is responsible for ensuring all DMA 503 + * mappings and MRs are gone. 504 + */ 505 + list_for_each_entry(rep, &buf->rb_recv_bufs, rr_list) 506 + rpcrdma_dma_unmap_regbuf(rep->rr_rdmabuf); 507 + list_for_each_entry(req, &buf->rb_allreqs, rl_all) { 508 + rpcrdma_dma_unmap_regbuf(req->rl_rdmabuf); 509 + rpcrdma_dma_unmap_regbuf(req->rl_sendbuf); 510 + rpcrdma_dma_unmap_regbuf(req->rl_recvbuf); 511 + } 512 + rpcrdma_destroy_mrs(buf); 513 + 514 + /* Allow waiters to continue */ 515 + complete(&ia->ri_remove_done); 516 + } 517 + 518 + /** 519 + * rpcrdma_ia_close - Clean up/close an IA. 
520 + * @ia: interface adapter to close 521 + * 477 522 */ 478 523 void 479 524 rpcrdma_ia_close(struct rpcrdma_ia *ia) ··· 527 482 if (ia->ri_id != NULL && !IS_ERR(ia->ri_id)) { 528 483 if (ia->ri_id->qp) 529 484 rdma_destroy_qp(ia->ri_id); 530 - rpcrdma_destroy_id(ia->ri_id); 531 - ia->ri_id = NULL; 485 + rdma_destroy_id(ia->ri_id); 532 486 } 487 + ia->ri_id = NULL; 488 + ia->ri_device = NULL; 533 489 534 490 /* If the pd is still busy, xprtrdma missed freeing a resource */ 535 491 if (ia->ri_pd && !IS_ERR(ia->ri_pd)) 536 492 ib_dealloc_pd(ia->ri_pd); 493 + ia->ri_pd = NULL; 537 494 } 538 495 539 496 /* ··· 693 646 ib_free_cq(ep->rep_attr.send_cq); 694 647 } 695 648 649 + /* Re-establish a connection after a device removal event. 650 + * Unlike a normal reconnection, a fresh PD and a new set 651 + * of MRs and buffers is needed. 652 + */ 653 + static int 654 + rpcrdma_ep_recreate_xprt(struct rpcrdma_xprt *r_xprt, 655 + struct rpcrdma_ep *ep, struct rpcrdma_ia *ia) 656 + { 657 + struct sockaddr *sap = (struct sockaddr *)&r_xprt->rx_data.addr; 658 + int rc, err; 659 + 660 + pr_info("%s: r_xprt = %p\n", __func__, r_xprt); 661 + 662 + rc = -EHOSTUNREACH; 663 + if (rpcrdma_ia_open(r_xprt, sap)) 664 + goto out1; 665 + 666 + rc = -ENOMEM; 667 + err = rpcrdma_ep_create(ep, ia, &r_xprt->rx_data); 668 + if (err) { 669 + pr_err("rpcrdma: rpcrdma_ep_create returned %d\n", err); 670 + goto out2; 671 + } 672 + 673 + rc = -ENETUNREACH; 674 + err = rdma_create_qp(ia->ri_id, ia->ri_pd, &ep->rep_attr); 675 + if (err) { 676 + pr_err("rpcrdma: rdma_create_qp returned %d\n", err); 677 + goto out3; 678 + } 679 + 680 + rpcrdma_create_mrs(r_xprt); 681 + return 0; 682 + 683 + out3: 684 + rpcrdma_ep_destroy(ep, ia); 685 + out2: 686 + rpcrdma_ia_close(ia); 687 + out1: 688 + return rc; 689 + } 690 + 691 + static int 692 + rpcrdma_ep_reconnect(struct rpcrdma_xprt *r_xprt, struct rpcrdma_ep *ep, 693 + struct rpcrdma_ia *ia) 694 + { 695 + struct sockaddr *sap = (struct sockaddr 
*)&r_xprt->rx_data.addr; 696 + struct rdma_cm_id *id, *old; 697 + int err, rc; 698 + 699 + dprintk("RPC: %s: reconnecting...\n", __func__); 700 + 701 + rpcrdma_ep_disconnect(ep, ia); 702 + 703 + rc = -EHOSTUNREACH; 704 + id = rpcrdma_create_id(r_xprt, ia, sap); 705 + if (IS_ERR(id)) 706 + goto out; 707 + 708 + /* As long as the new ID points to the same device as the 709 + * old ID, we can reuse the transport's existing PD and all 710 + * previously allocated MRs. Also, the same device means 711 + * the transport's previous DMA mappings are still valid. 712 + * 713 + * This is a sanity check only. There should be no way these 714 + * point to two different devices here. 715 + */ 716 + old = id; 717 + rc = -ENETUNREACH; 718 + if (ia->ri_device != id->device) { 719 + pr_err("rpcrdma: can't reconnect on different device!\n"); 720 + goto out_destroy; 721 + } 722 + 723 + err = rdma_create_qp(id, ia->ri_pd, &ep->rep_attr); 724 + if (err) { 725 + dprintk("RPC: %s: rdma_create_qp returned %d\n", 726 + __func__, err); 727 + goto out_destroy; 728 + } 729 + 730 + /* Atomically replace the transport's ID and QP. */ 731 + rc = 0; 732 + old = ia->ri_id; 733 + ia->ri_id = id; 734 + rdma_destroy_qp(old); 735 + 736 + out_destroy: 737 + rdma_destroy_id(old); 738 + out: 739 + return rc; 740 + } 741 + 696 742 /* 697 743 * Connect unconnected endpoint. 
698 744 */ ··· 794 654 { 795 655 struct rpcrdma_xprt *r_xprt = container_of(ia, struct rpcrdma_xprt, 796 656 rx_ia); 797 - struct rdma_cm_id *id, *old; 798 - struct sockaddr *sap; 799 657 unsigned int extras; 800 - int rc = 0; 658 + int rc; 801 659 802 - if (ep->rep_connected != 0) { 803 660 retry: 804 - dprintk("RPC: %s: reconnecting...\n", __func__); 805 - 806 - rpcrdma_ep_disconnect(ep, ia); 807 - 808 - sap = (struct sockaddr *)&r_xprt->rx_data.addr; 809 - id = rpcrdma_create_id(r_xprt, ia, sap); 810 - if (IS_ERR(id)) { 811 - rc = -EHOSTUNREACH; 812 - goto out; 813 - } 814 - /* TEMP TEMP TEMP - fail if new device: 815 - * Deregister/remarshal *all* requests! 816 - * Close and recreate adapter, pd, etc! 817 - * Re-determine all attributes still sane! 818 - * More stuff I haven't thought of! 819 - * Rrrgh! 820 - */ 821 - if (ia->ri_device != id->device) { 822 - printk("RPC: %s: can't reconnect on " 823 - "different device!\n", __func__); 824 - rpcrdma_destroy_id(id); 825 - rc = -ENETUNREACH; 826 - goto out; 827 - } 828 - /* END TEMP */ 829 - rc = rdma_create_qp(id, ia->ri_pd, &ep->rep_attr); 830 - if (rc) { 831 - dprintk("RPC: %s: rdma_create_qp failed %i\n", 832 - __func__, rc); 833 - rpcrdma_destroy_id(id); 834 - rc = -ENETUNREACH; 835 - goto out; 836 - } 837 - 838 - old = ia->ri_id; 839 - ia->ri_id = id; 840 - 841 - rdma_destroy_qp(old); 842 - rpcrdma_destroy_id(old); 843 - } else { 661 + switch (ep->rep_connected) { 662 + case 0: 844 663 dprintk("RPC: %s: connecting...\n", __func__); 845 664 rc = rdma_create_qp(ia->ri_id, ia->ri_pd, &ep->rep_attr); 846 665 if (rc) { 847 666 dprintk("RPC: %s: rdma_create_qp failed %i\n", 848 667 __func__, rc); 849 - /* do not update ep->rep_connected */ 850 - return -ENETUNREACH; 668 + rc = -ENETUNREACH; 669 + goto out_noupdate; 851 670 } 671 + break; 672 + case -ENODEV: 673 + rc = rpcrdma_ep_recreate_xprt(r_xprt, ep, ia); 674 + if (rc) 675 + goto out_noupdate; 676 + break; 677 + default: 678 + rc = rpcrdma_ep_reconnect(r_xprt, 
ep, ia); 679 + if (rc) 680 + goto out; 852 681 } 853 682 854 683 ep->rep_connected = 0; ··· 845 736 out: 846 737 if (rc) 847 738 ep->rep_connected = rc; 739 + 740 + out_noupdate: 848 741 return rc; 849 742 } 850 743 ··· 989 878 rpcrdma_create_rep(struct rpcrdma_xprt *r_xprt) 990 879 { 991 880 struct rpcrdma_create_data_internal *cdata = &r_xprt->rx_data; 992 - struct rpcrdma_ia *ia = &r_xprt->rx_ia; 993 881 struct rpcrdma_rep *rep; 994 882 int rc; 995 883 ··· 1004 894 goto out_free; 1005 895 } 1006 896 1007 - rep->rr_device = ia->ri_device; 1008 897 rep->rr_cqe.done = rpcrdma_wc_receive; 1009 898 rep->rr_rxprt = r_xprt; 1010 899 INIT_WORK(&rep->rr_work, rpcrdma_reply_handler); ··· 1146 1037 rpcrdma_buffer_destroy(struct rpcrdma_buffer *buf) 1147 1038 { 1148 1039 cancel_delayed_work_sync(&buf->rb_recovery_worker); 1040 + cancel_delayed_work_sync(&buf->rb_refresh_worker); 1149 1041 1150 1042 while (!list_empty(&buf->rb_recv_bufs)) { 1151 1043 struct rpcrdma_rep *rep; ··· 1191 1081 1192 1082 out_nomws: 1193 1083 dprintk("RPC: %s: no MWs available\n", __func__); 1194 - schedule_delayed_work(&buf->rb_refresh_worker, 0); 1084 + if (r_xprt->rx_ep.rep_connected != -ENODEV) 1085 + schedule_delayed_work(&buf->rb_refresh_worker, 0); 1195 1086 1196 1087 /* Allow the reply handler and refresh worker to run */ 1197 1088 cond_resched(); ··· 1342 1231 bool 1343 1232 __rpcrdma_dma_map_regbuf(struct rpcrdma_ia *ia, struct rpcrdma_regbuf *rb) 1344 1233 { 1234 + struct ib_device *device = ia->ri_device; 1235 + 1345 1236 if (rb->rg_direction == DMA_NONE) 1346 1237 return false; 1347 1238 1348 - rb->rg_iov.addr = ib_dma_map_single(ia->ri_device, 1239 + rb->rg_iov.addr = ib_dma_map_single(device, 1349 1240 (void *)rb->rg_base, 1350 1241 rdmab_length(rb), 1351 1242 rb->rg_direction); 1352 - if (ib_dma_mapping_error(ia->ri_device, rdmab_addr(rb))) 1243 + if (ib_dma_mapping_error(device, rdmab_addr(rb))) 1353 1244 return false; 1354 1245 1355 - rb->rg_device = ia->ri_device; 1246 + 
rb->rg_device = device; 1356 1247 rb->rg_iov.lkey = ia->ri_pd->local_dma_lkey; 1357 1248 return true; 1358 1249 }
+19 -3
net/sunrpc/xprtrdma/xprt_rdma.h
··· 69 69 struct rdma_cm_id *ri_id; 70 70 struct ib_pd *ri_pd; 71 71 struct completion ri_done; 72 + struct completion ri_remove_done; 72 73 int ri_async_rc; 73 74 unsigned int ri_max_segs; 74 75 unsigned int ri_max_frmr_depth; ··· 79 78 bool ri_reminv_expected; 80 79 bool ri_implicit_roundup; 81 80 enum ib_mr_type ri_mrtype; 81 + unsigned long ri_flags; 82 82 struct ib_qp_attr ri_qp_attr; 83 83 struct ib_qp_init_attr ri_qp_init_attr; 84 + }; 85 + 86 + enum { 87 + RPCRDMA_IAF_REMOVING = 0, 84 88 }; 85 89 86 90 /* ··· 170 164 return (struct rpcrdma_msg *)rb->rg_base; 171 165 } 172 166 167 + static inline struct ib_device * 168 + rdmab_device(struct rpcrdma_regbuf *rb) 169 + { 170 + return rb->rg_device; 171 + } 172 + 173 173 #define RPCRDMA_DEF_GFP (GFP_NOIO | __GFP_NOWARN) 174 174 175 175 /* To ensure a transport can always make forward progress, ··· 221 209 unsigned int rr_len; 222 210 int rr_wc_flags; 223 211 u32 rr_inv_rkey; 224 - struct ib_device *rr_device; 225 212 struct rpcrdma_xprt *rr_rxprt; 226 213 struct work_struct rr_work; 227 214 struct list_head rr_list; ··· 391 380 spinlock_t rb_mwlock; /* protect rb_mws list */ 392 381 struct list_head rb_mws; 393 382 struct list_head rb_all; 394 - char *rb_pool; 395 383 396 384 spinlock_t rb_lock; /* protect buf lists */ 397 385 int rb_send_count, rb_recv_count; ··· 507 497 * Default is 0, see sysctl entry and rpc_rdma.c rpcrdma_convert_iovs() */ 508 498 extern int xprt_rdma_pad_optimize; 509 499 500 + /* This setting controls the hunt for a supported memory 501 + * registration strategy. 
502 + */ 503 + extern unsigned int xprt_rdma_memreg_strategy; 504 + 510 505 /* 511 506 * Interface Adapter calls - xprtrdma/verbs.c 512 507 */ 513 - int rpcrdma_ia_open(struct rpcrdma_xprt *, struct sockaddr *, int); 508 + int rpcrdma_ia_open(struct rpcrdma_xprt *xprt, struct sockaddr *addr); 509 + void rpcrdma_ia_remove(struct rpcrdma_ia *ia); 514 510 void rpcrdma_ia_close(struct rpcrdma_ia *); 515 511 bool frwr_is_supported(struct rpcrdma_ia *); 516 512 bool fmr_is_supported(struct rpcrdma_ia *);