
Merge tag 'nfsd-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
"I'm thrilled to announce that the Linux in-kernel NFS server now
offers NFSv4 write delegations. A write delegation enables a client to
cache data and metadata for a single file more aggressively, reducing
network round trips and server workload. Many thanks to Dai Ngo for
contributing this facility, and to Jeff Layton and Neil Brown for
reviewing and testing it.

This release also sees the removal of all support for DES- and
triple-DES-based Kerberos encryption types in the kernel's SunRPC
implementation. These encryption types have been deprecated by the
Internet community for years and are considered insecure. This change
affects both the in-kernel NFS client and server.

The server's UDP and TCP socket transports have now fully adopted
David Howells' new bio_vec iterator so that no more than one sendmsg()
call is needed to transmit each RPC message. In particular, this helps
kTLS optimize record boundaries when sending RPC-with-TLS replies, and
it takes the server a baby step closer to handling file I/O via
folios.

We've begun work on overhauling the SunRPC thread scheduler to remove
a costly linked-list walk when looking for an idle RPC service thread
to wake. The prerequisites are included in this release. Thanks to
Neil Brown for his ongoing work on this improvement"

* tag 'nfsd-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (56 commits)
Documentation: Add missing documentation for EXPORT_OP flags
SUNRPC: Remove unused declaration rpc_modcount()
SUNRPC: Remove unused declarations
NFSD: da_addr_body field missing in some GETDEVICEINFO replies
SUNRPC: Remove return value of svc_pool_wake_idle_thread()
SUNRPC: make rqst_should_sleep() idempotent
SUNRPC: Clean up svc_set_num_threads
SUNRPC: Count ingress RPC messages per svc_pool
SUNRPC: Deduplicate thread wake-up code
SUNRPC: Move trace_svc_xprt_enqueue
SUNRPC: Add enum svc_auth_status
SUNRPC: change svc_xprt::xpt_flags bits to enum
SUNRPC: change svc_rqst::rq_flags bits to enum
SUNRPC: change svc_pool::sp_flags bits to enum
SUNRPC: change cache_head.flags bits to enum
SUNRPC: remove timeout arg from svc_recv()
SUNRPC: change svc_recv() to return void.
SUNRPC: call svc_process() from svc_recv().
nfsd: separate nfsd_last_thread() from nfsd_put()
nfsd: Simplify code around svc_exit_thread() call in nfsd()
...

+968 -1798
+26
Documentation/filesystems/nfs/exporting.rst
···
   This flag causes nfsd to close any open files for this inode _before_
   calling into the vfs to do an unlink or a rename that would replace
   an existing file.
+
+EXPORT_OP_REMOTE_FS - Backing storage for this filesystem is remote
+  PF_LOCAL_THROTTLE exists for loopback NFSD, where a thread needs to
+  write to one bdi (the final bdi) in order to free up writes queued
+  to another bdi (the client bdi). Such threads get a private balance
+  of dirty pages so that dirty pages for the client bdi do not impact
+  the daemon writing to the final bdi. For filesystems whose durable
+  storage is not local (such as exported NFS filesystems), this
+  constraint has negative consequences. EXPORT_OP_REMOTE_FS enables
+  an export to disable writeback throttling.
+
+EXPORT_OP_NOATOMIC_ATTR - Filesystem does not update attributes atomically
+  EXPORT_OP_NOATOMIC_ATTR indicates that the exported filesystem
+  cannot provide the semantics required by the "atomic" boolean in
+  NFSv4's change_info4. This boolean indicates to a client whether the
+  returned before and after change attributes were obtained atomically
+  with respect to the requested metadata operation (UNLINK,
+  OPEN/CREATE, MKDIR, etc).
+
+EXPORT_OP_FLUSH_ON_CLOSE - Filesystem flushes file data on close(2)
+  On most filesystems, inodes can remain under writeback after the
+  file is closed. NFSD relies on client activity or local flusher
+  threads to handle writeback. Certain filesystems, such as NFS, flush
+  all of an inode's dirty data on last close. Exports that behave this
+  way should set EXPORT_OP_FLUSH_ON_CLOSE so that NFSD knows to skip
+  waiting for writeback when closing such files.
+1
fs/exportfs/expfs.c
···
  * @inode: the object to encode
  * @fid: where to store the file handle fragment
  * @max_len: maximum length to store there
+ * @parent: parent directory inode, if wanted
  * @flags: properties of the requested file handle
  *
  * Returns an enum fid_type or a negative errno.
+3
fs/lockd/mon.c
···
 {
 	struct nsm_handle *new;
 
+	if (!hostname)
+		return NULL;
+
 	new = kzalloc(sizeof(*new) + hostname_len + 1, GFP_KERNEL);
 	if (unlikely(new == NULL))
 		return NULL;
+10 -42
fs/lockd/svc.c
···
 
 #define NLMDBG_FACILITY		NLMDBG_SVC
 #define LOCKD_BUFSIZE		(1024 + NLMSVC_XDRSIZE)
-#define ALLOWED_SIGS		(sigmask(SIGKILL))
 
 static struct svc_program	nlmsvc_program;
 
···
 static unsigned int		nlmsvc_users;
 static struct svc_serv		*nlmsvc_serv;
 unsigned long			nlmsvc_timeout;
+
+static void nlmsvc_request_retry(struct timer_list *tl)
+{
+	svc_wake_up(nlmsvc_serv);
+}
+DEFINE_TIMER(nlmsvc_retry, nlmsvc_request_retry);
 
 unsigned int lockd_net_id;
 
···
 	schedule_delayed_work(&ln->grace_period_end, grace_period);
 }
 
-static void restart_grace(void)
-{
-	if (nlmsvc_ops) {
-		struct net *net = &init_net;
-		struct lockd_net *ln = net_generic(net, lockd_net_id);
-
-		cancel_delayed_work_sync(&ln->grace_period_end);
-		locks_end_grace(&ln->lockd_manager);
-		nlmsvc_invalidate_all();
-		set_grace_period(net);
-	}
-}
-
 /*
  * This is the lockd kernel thread
  */
 static int
 lockd(void *vrqstp)
 {
-	int err = 0;
 	struct svc_rqst *rqstp = vrqstp;
 	struct net *net = &init_net;
 	struct lockd_net *ln = net_generic(net, lockd_net_id);
 
 	/* try_to_freeze() is called from svc_recv() */
 	set_freezable();
-
-	/* Allow SIGKILL to tell lockd to drop all of its locks */
-	allow_signal(SIGKILL);
 
 	dprintk("NFS locking service started (ver " LOCKD_VERSION ").\n");
 
···
 	 * NFS mount or NFS daemon has gone away.
 	 */
 	while (!kthread_should_stop()) {
-		long timeout = MAX_SCHEDULE_TIMEOUT;
-		RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
-
 		/* update sv_maxconn if it has changed */
 		rqstp->rq_server->sv_maxconn = nlm_max_connections;
 
-		if (signalled()) {
-			flush_signals(current);
-			restart_grace();
-			continue;
-		}
-
-		timeout = nlmsvc_retry_blocked();
-
-		/*
-		 * Find a socket with data available and call its
-		 * recvfrom routine.
-		 */
-		err = svc_recv(rqstp, timeout);
-		if (err == -EAGAIN || err == -EINTR)
-			continue;
-		dprintk("lockd: request from %s\n",
-			svc_print_addr(rqstp, buf, sizeof(buf)));
-
-		svc_process(rqstp);
+		nlmsvc_retry_blocked();
+		svc_recv(rqstp);
 	}
-	flush_signals(current);
 	if (nlmsvc_ops)
 		nlmsvc_invalidate_all();
 	nlm_shutdown_hosts();
···
 #endif
 
 	svc_set_num_threads(nlmsvc_serv, NULL, 0);
+	timer_delete_sync(&nlmsvc_retry);
 	nlmsvc_serv = NULL;
 	dprintk("lockd_down: service destroyed\n");
 }
···
 }
 
 
-static int lockd_authenticate(struct svc_rqst *rqstp)
+static enum svc_auth_status lockd_authenticate(struct svc_rqst *rqstp)
 {
 	rqstp->rq_client = NULL;
 	switch (rqstp->rq_authop->flavour) {
+15 -3
fs/lockd/svclock.c
···
 static inline void
 nlmsvc_remove_block(struct nlm_block *block)
 {
+	spin_lock(&nlm_blocked_lock);
 	if (!list_empty(&block->b_list)) {
-		spin_lock(&nlm_blocked_lock);
 		list_del_init(&block->b_list);
 		spin_unlock(&nlm_blocked_lock);
 		nlmsvc_release_block(block);
+		return;
 	}
+	spin_unlock(&nlm_blocked_lock);
 }
 
 /*
···
 		file, lock->fl.fl_pid,
 		(long long)lock->fl.fl_start,
 		(long long)lock->fl.fl_end, lock->fl.fl_type);
+	spin_lock(&nlm_blocked_lock);
 	list_for_each_entry(block, &nlm_blocked, b_list) {
 		fl = &block->b_call->a_args.lock.fl;
 		dprintk("lockd: check f=%p pd=%d %Ld-%Ld ty=%d cookie=%s\n",
···
 			nlmdbg_cookie2a(&block->b_call->a_args.cookie));
 		if (block->b_file == file && nlm_compare_locks(fl, &lock->fl)) {
 			kref_get(&block->b_count);
+			spin_unlock(&nlm_blocked_lock);
 			return block;
 		}
 	}
+	spin_unlock(&nlm_blocked_lock);
 
 	return NULL;
 }
···
 {
 	struct nlm_block *block;
 
+	spin_lock(&nlm_blocked_lock);
 	list_for_each_entry(block, &nlm_blocked, b_list) {
 		if (nlm_cookie_match(&block->b_call->a_args.cookie,cookie))
 			goto found;
 	}
+	spin_unlock(&nlm_blocked_lock);
 
 	return NULL;
 
 found:
 	dprintk("nlmsvc_find_block(%s): block=%p\n", nlmdbg_cookie2a(cookie), block);
 	kref_get(&block->b_count);
+	spin_unlock(&nlm_blocked_lock);
 	return block;
 }
···
 
 restart:
 	mutex_lock(&file->f_mutex);
+	spin_lock(&nlm_blocked_lock);
 	list_for_each_entry_safe(block, next, &file->f_blocks, b_flist) {
 		if (!match(block->b_host, host))
 			continue;
···
 		if (list_empty(&block->b_list))
 			continue;
 		kref_get(&block->b_count);
+		spin_unlock(&nlm_blocked_lock);
 		mutex_unlock(&file->f_mutex);
 		nlmsvc_unlink_block(block);
 		nlmsvc_release_block(block);
 		goto restart;
 	}
+	spin_unlock(&nlm_blocked_lock);
 	mutex_unlock(&file->f_mutex);
 }
···
  * picks up locks that can be granted, or grant notifications that must
  * be retransmitted.
  */
-unsigned long
+void
 nlmsvc_retry_blocked(void)
 {
 	unsigned long timeout = MAX_SCHEDULE_TIMEOUT;
···
 	}
 	spin_unlock(&nlm_blocked_lock);
 
-	return timeout;
+	if (timeout < MAX_SCHEDULE_TIMEOUT)
+		mod_timer(&nlmsvc_retry, jiffies + timeout);
 }
-7
fs/locks.c
···
 	if (is_deleg && !inode_trylock(inode))
 		return -EAGAIN;
 
-	if (is_deleg && arg == F_WRLCK) {
-		/* Write delegations are not currently supported: */
-		inode_unlock(inode);
-		WARN_ON_ONCE(1);
-		return -EINVAL;
-	}
-
 	percpu_down_read(&file_rwsem);
 	spin_lock(&ctx->flc_lock);
 	time_out_leases(inode, &dispose);
+4 -19
fs/nfs/callback.c
···
 static int
 nfs4_callback_svc(void *vrqstp)
 {
-	int err;
 	struct svc_rqst *rqstp = vrqstp;
 
 	set_freezable();
 
-	while (!kthread_freezable_should_stop(NULL)) {
-
-		if (signal_pending(current))
-			flush_signals(current);
-		/*
-		 * Listen for a request on the socket
-		 */
-		err = svc_recv(rqstp, MAX_SCHEDULE_TIMEOUT);
-		if (err == -EAGAIN || err == -EINTR)
-			continue;
-		svc_process(rqstp);
-	}
+	while (!kthread_freezable_should_stop(NULL))
+		svc_recv(rqstp);
 
 	svc_exit_thread(rqstp);
 	return 0;
···
 	set_freezable();
 
 	while (!kthread_freezable_should_stop(NULL)) {
-
-		if (signal_pending(current))
-			flush_signals(current);
-
-		prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_INTERRUPTIBLE);
+		prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_IDLE);
 		spin_lock_bh(&serv->sv_cb_lock);
 		if (!list_empty(&serv->sv_cb_list)) {
 			req = list_first_entry(&serv->sv_cb_list,
···
  * All other checking done after NFS decoding where the nfs_client can be
  * found in nfs4_callback_compound
  */
-static int nfs_callback_authenticate(struct svc_rqst *rqstp)
+static enum svc_auth_status nfs_callback_authenticate(struct svc_rqst *rqstp)
 {
 	rqstp->rq_auth_stat = rpc_autherr_badcred;
 
+9
fs/nfsd/blocklayoutxdr.c
···
 	int len = sizeof(__be32), ret, i;
 	__be32 *p;
 
+	/*
+	 * See paragraph 5 of RFC 8881 S18.40.3.
+	 */
+	if (!gdp->gd_maxcount) {
+		if (xdr_stream_encode_u32(xdr, 0) != XDR_UNIT)
+			return nfserr_resource;
+		return nfs_ok;
+	}
+
 	p = xdr_reserve_space(xdr, len + sizeof(__be32));
 	if (!p)
 		return nfserr_resource;
+5 -3
fs/nfsd/cache.h
···
  * typical sockaddr_storage. This is for space reasons, since sockaddr_storage
  * is much larger than a sockaddr_in6.
  */
-struct svc_cacherep {
+struct nfsd_cacherep {
 	struct {
 		/* Keep often-read xid, csum in the same cache line: */
 		__be32			k_xid;
···
 void	nfsd_net_reply_cache_destroy(struct nfsd_net *nn);
 int	nfsd_reply_cache_init(struct nfsd_net *);
 void	nfsd_reply_cache_shutdown(struct nfsd_net *);
-int	nfsd_cache_lookup(struct svc_rqst *);
-void	nfsd_cache_update(struct svc_rqst *, int, __be32 *);
+int	nfsd_cache_lookup(struct svc_rqst *rqstp,
+			  struct nfsd_cacherep **cacherep);
+void	nfsd_cache_update(struct svc_rqst *rqstp, struct nfsd_cacherep *rp,
+			  int cachetype, __be32 *statp);
 int	nfsd_reply_cache_stats_show(struct seq_file *m, void *v);
 
 #endif /* NFSCACHE_H */
+9
fs/nfsd/flexfilelayoutxdr.c
···
 	int addr_len;
 	__be32 *p;
 
+	/*
+	 * See paragraph 5 of RFC 8881 S18.40.3.
+	 */
+	if (!gdp->gd_maxcount) {
+		if (xdr_stream_encode_u32(xdr, 0) != XDR_UNIT)
+			return nfserr_resource;
+		return nfs_ok;
+	}
+
 	/* len + padding for two strings */
 	addr_len = 16 + da->netaddr.netid_len + da->netaddr.addr_len;
 	ver_len = 20;
+3 -1
fs/nfsd/nfs3proc.c
···
 	if (!IS_POSIXACL(inode))
 		iap->ia_mode &= ~current_umask();
 
-	fh_fill_pre_attrs(fhp);
+	status = fh_fill_pre_attrs(fhp);
+	if (status != nfs_ok)
+		goto out;
 	host_err = vfs_create(&nop_mnt_idmap, inode, child, iap->ia_mode, true);
 	if (host_err < 0) {
 		status = nfserrno(host_err);
+29 -5
fs/nfsd/nfs4acl.c
···
  * calculated so far: */
 
 struct posix_acl_state {
-	int empty;
+	unsigned char valid;
 	struct posix_ace_state owner;
 	struct posix_ace_state group;
 	struct posix_ace_state other;
···
 	int alloc;
 
 	memset(state, 0, sizeof(struct posix_acl_state));
-	state->empty = 1;
 	/*
 	 * In the worst case, each individual acl could be for a distinct
 	 * named user or group, but we don't know which, so we allocate
···
 	 * and effective cases: when there are no inheritable ACEs,
 	 * calls ->set_acl with a NULL ACL structure.
 	 */
-	if (state->empty && (flags & NFS4_ACL_TYPE_DEFAULT))
+	if (!state->valid && (flags & NFS4_ACL_TYPE_DEFAULT))
 		return NULL;
 
 	/*
···
 		struct nfs4_ace *ace)
 {
 	u32 mask = ace->access_mask;
+	short type = ace2type(ace);
 	int i;
 
-	state->empty = 0;
+	state->valid |= type;
 
-	switch (ace2type(ace)) {
+	switch (type) {
 	case ACL_USER_OBJ:
 		if (ace->type == NFS4_ACE_ACCESS_ALLOWED_ACE_TYPE) {
 			allow_bits(&state->owner, mask);
···
 		if (!(ace->flag & NFS4_ACE_INHERIT_ONLY_ACE))
 			process_one_v4_ace(&effective_acl_state, ace);
 	}
+
+	/*
+	 * At this point, the default ACL may have zeroed-out entries for owner,
+	 * group and other. That usually results in a non-sensical resulting ACL
+	 * that denies all access except to any ACE that was explicitly added.
+	 *
+	 * The setfacl command solves a similar problem with this logic:
+	 *
+	 *	"If a Default ACL entry is created, and the Default ACL contains
+	 *	 no owner, owning group, or others entry, a copy of the ACL
+	 *	 owner, owning group, or others entry is added to the Default ACL."
+	 *
+	 * Copy any missing ACEs from the effective set, if any ACEs were
+	 * explicitly set.
+	 */
+	if (default_acl_state.valid) {
+		if (!(default_acl_state.valid & ACL_USER_OBJ))
+			default_acl_state.owner = effective_acl_state.owner;
+		if (!(default_acl_state.valid & ACL_GROUP_OBJ))
+			default_acl_state.group = effective_acl_state.group;
+		if (!(default_acl_state.valid & ACL_OTHER))
+			default_acl_state.other = effective_acl_state.other;
+	}
+
 	*pacl = posix_state_to_acl(&effective_acl_state, flags);
 	if (IS_ERR(*pacl)) {
 		ret = PTR_ERR(*pacl);
+42 -9
fs/nfsd/nfs4proc.c
···
 	}
 
 	if (d_really_is_positive(child)) {
-		status = nfs_ok;
-
 		/* NFSv4 protocol requires change attributes even though
 		 * no change happened.
 		 */
-		fh_fill_both_attrs(fhp);
+		status = fh_fill_both_attrs(fhp);
+		if (status != nfs_ok)
+			goto out;
 
 		switch (open->op_createmode) {
 		case NFS4_CREATE_UNCHECKED:
···
 	if (!IS_POSIXACL(inode))
 		iap->ia_mode &= ~current_umask();
 
-	fh_fill_pre_attrs(fhp);
+	status = fh_fill_pre_attrs(fhp);
+	if (status != nfs_ok)
+		goto out;
 	status = nfsd4_vfs_create(fhp, child, open);
 	if (status != nfs_ok)
 		goto out;
···
 	dput(child);
 	fh_drop_write(fhp);
 	return status;
+}
+
+/**
+ * set_change_info - set up the change_info4 for a reply
+ * @cinfo: pointer to nfsd4_change_info to be populated
+ * @fhp: pointer to svc_fh to use as source
+ *
+ * Many operations in NFSv4 require change_info4 in the reply. This function
+ * populates that from the info that we (should!) have already collected. In
+ * the event that we didn't get any pre-attrs, just zero out both.
+ */
+static void
+set_change_info(struct nfsd4_change_info *cinfo, struct svc_fh *fhp)
+{
+	cinfo->atomic = (u32)(fhp->fh_pre_saved && fhp->fh_post_saved && !fhp->fh_no_atomic_attr);
+	cinfo->before_change = fhp->fh_pre_change;
+	cinfo->after_change = fhp->fh_post_change;
+
+	/*
+	 * If fetching the pre-change attributes failed, then we should
+	 * have already failed the whole operation. We could have still
+	 * failed to fetch post-change attributes however.
+	 *
+	 * If we didn't get post-op attrs, just zero-out the after
+	 * field since we don't know what it should be. If the pre_saved
+	 * field isn't set for some reason, throw a warning and just copy
+	 * whatever is in the after field.
+	 */
+	if (WARN_ON_ONCE(!fhp->fh_pre_saved))
+		cinfo->before_change = 0;
+	if (!fhp->fh_post_saved)
+		cinfo->after_change = cinfo->before_change + 1;
 }
···
 	} else {
 		status = nfsd_lookup(rqstp, current_fh,
 				     open->op_fname, open->op_fnamelen, *resfh);
-		if (!status)
+		if (status == nfs_ok)
 			/* NFSv4 protocol requires change attributes even though
 			 * no change happened.
 			 */
-			fh_fill_both_attrs(current_fh);
+			status = fh_fill_both_attrs(current_fh);
 	}
 	if (status)
 		goto out;
···
 		/* found a match */
 		if (ni->nsui_busy) {
 			/* wait - and try again */
-			prepare_to_wait(&nn->nfsd_ssc_waitq, &wait,
-					TASK_INTERRUPTIBLE);
+			prepare_to_wait(&nn->nfsd_ssc_waitq, &wait, TASK_IDLE);
 			spin_unlock(&nn->nfsd_ssc_lock);
 
 			/* allow 20secs for mount/unmount for now - revisit */
-			if (signal_pending(current) ||
+			if (kthread_should_stop() ||
 			    (schedule_timeout(20*HZ) == 0)) {
 				finish_wait(&nn->nfsd_ssc_waitq, &wait);
 				kfree(work);
+142 -20
fs/nfsd/nfs4state.c
···
 	return ret;
 }
 
+static struct nfsd_file *
+find_rw_file(struct nfs4_file *f)
+{
+	struct nfsd_file *ret;
+
+	spin_lock(&f->fi_lock);
+	ret = nfsd_file_get(f->fi_fds[O_RDWR]);
+	spin_unlock(&f->fi_lock);
+
+	return ret;
+}
+
 struct nfsd_file *
 find_any_file(struct nfs4_file *f)
 {
···
 
 static struct nfs4_delegation *
 alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
-		 struct nfs4_clnt_odstate *odstate)
+		 struct nfs4_clnt_odstate *odstate, u32 dl_type)
 {
 	struct nfs4_delegation *dp;
 	long n;
···
 	INIT_LIST_HEAD(&dp->dl_recall_lru);
 	dp->dl_clnt_odstate = odstate;
 	get_clnt_odstate(odstate);
-	dp->dl_type = NFS4_OPEN_DELEGATE_READ;
+	dp->dl_type = dl_type;
 	dp->dl_retries = 1;
 	dp->dl_recalled = false;
 	nfsd4_init_cb(&dp->dl_recall, dp->dl_stid.sc_client,
···
 	struct nfs4_file *fp = stp->st_stid.sc_file;
 	struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate;
 	struct nfs4_delegation *dp;
-	struct nfsd_file *nf;
+	struct nfsd_file *nf = NULL;
 	struct file_lock *fl;
+	u32 dl_type;
 
 	/*
 	 * The fi_had_conflict and nfs_get_existing_delegation checks
···
 	if (fp->fi_had_conflict)
 		return ERR_PTR(-EAGAIN);
 
-	nf = find_readable_file(fp);
-	if (!nf) {
-		/*
-		 * We probably could attempt another open and get a read
-		 * delegation, but for now, don't bother until the
-		 * client actually sends us one.
-		 */
-		return ERR_PTR(-EAGAIN);
+	/*
+	 * Try for a write delegation first. RFC8881 section 10.4 says:
+	 *
+	 *  "An OPEN_DELEGATE_WRITE delegation allows the client to handle,
+	 *   on its own, all opens."
+	 *
+	 * Furthermore, the client can use a write delegation for most READ
+	 * operations as well, so we require a O_RDWR file here.
+	 *
+	 * Offer a write delegation in the case of a BOTH open, and ensure
+	 * we get the O_RDWR descriptor.
+	 */
+	if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) {
+		nf = find_rw_file(fp);
+		dl_type = NFS4_OPEN_DELEGATE_WRITE;
 	}
+
+	/*
+	 * If the file is being opened O_RDONLY or we couldn't get a O_RDWR
+	 * file for some reason, then try for a read delegation instead.
+	 */
+	if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) {
+		nf = find_readable_file(fp);
+		dl_type = NFS4_OPEN_DELEGATE_READ;
+	}
+
+	if (!nf)
+		return ERR_PTR(-EAGAIN);
+
 	spin_lock(&state_lock);
 	spin_lock(&fp->fi_lock);
 	if (nfs4_delegation_exists(clp, fp))
···
 		return ERR_PTR(status);
 
 	status = -ENOMEM;
-	dp = alloc_init_deleg(clp, fp, odstate);
+	dp = alloc_init_deleg(clp, fp, odstate, dl_type);
 	if (!dp)
 		goto out_delegees;
 
-	fl = nfs4_alloc_init_lease(dp, NFS4_OPEN_DELEGATE_READ);
+	fl = nfs4_alloc_init_lease(dp, dl_type);
 	if (!fl)
 		goto out_clnt_odstate;
 
···
 }
 
 /*
- * Attempt to hand out a delegation.
+ * The Linux NFS server does not offer write delegations to NFSv4.0
+ * clients in order to avoid conflicts between write delegations and
+ * GETATTRs requesting CHANGE or SIZE attributes.
  *
- * Note we don't support write delegations, and won't until the vfs has
- * proper support for them.
+ * With NFSv4.1 and later minorversions, the SEQUENCE operation that
+ * begins each COMPOUND contains a client ID. Delegation recall can
+ * be avoided when the server recognizes that the client sending a
+ * GETATTR also holds the write delegation it conflicts with.
+ *
+ * However, the NFSv4.0 protocol does not enable a server to
+ * determine that a GETATTR originated from the client holding the
+ * conflicting delegation versus coming from some other client. Per
+ * RFC 7530 Section 16.7.5, the server must recall or send a
+ * CB_GETATTR even when the GETATTR originates from the client that
+ * holds the conflicting delegation.
+ *
+ * An NFSv4.0 client can trigger a pathological situation if it
+ * always sends a DELEGRETURN preceded by a conflicting GETATTR in
+ * the same COMPOUND. COMPOUND execution will always stop at the
+ * GETATTR and the DELEGRETURN will never get executed. The server
+ * eventually revokes the delegation, which can result in loss of
+ * open or lock state.
  */
 static void
 nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
···
 	case NFS4_OPEN_CLAIM_PREVIOUS:
 		if (!cb_up)
 			open->op_recall = 1;
-		if (open->op_delegate_type != NFS4_OPEN_DELEGATE_READ)
-			goto out_no_deleg;
 		break;
 	case NFS4_OPEN_CLAIM_NULL:
 		parent = currentfh;
···
 			goto out_no_deleg;
 		if (!cb_up || !(oo->oo_flags & NFS4_OO_CONFIRMED))
 			goto out_no_deleg;
+		if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE &&
+		    !clp->cl_minorversion)
+			goto out_no_deleg;
 		break;
 	default:
 		goto out_no_deleg;
···
 
 	memcpy(&open->op_delegate_stateid, &dp->dl_stid.sc_stateid, sizeof(dp->dl_stid.sc_stateid));
 
-	trace_nfsd_deleg_read(&dp->dl_stid.sc_stateid);
-	open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
+	if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
+		open->op_delegate_type = NFS4_OPEN_DELEGATE_WRITE;
+		trace_nfsd_deleg_write(&dp->dl_stid.sc_stateid);
+	} else {
+		open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
+		trace_nfsd_deleg_read(&dp->dl_stid.sc_stateid);
+	}
 	nfs4_put_stid(&dp->dl_stid);
 	return;
 out_no_deleg:
···
 		    union nfsd4_op_u *u)
 {
 	get_stateid(cstate, &u->write.wr_stateid);
+}
+
+/**
+ * nfsd4_deleg_getattr_conflict - Recall if GETATTR causes conflict
+ * @rqstp: RPC transaction context
+ * @inode: file to be checked for a conflict
+ *
+ * This function is called when there is a conflict between a write
+ * delegation and a change/size GETATTR from another client. The server
+ * must either use the CB_GETATTR to get the current values of the
+ * attributes from the client that holds the delegation or recall the
+ * delegation before replying to the GETATTR. See RFC 8881 section
+ * 18.7.4.
+ *
+ * The current implementation does not support CB_GETATTR yet. However,
+ * support for it, which can avoid recalling the delegation, could be
+ * added in follow-up work.
+ *
+ * Returns 0 if there is no conflict; otherwise an nfs_stat
+ * code is returned.
+ */
+__be32
+nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode)
+{
+	__be32 status;
+	struct file_lock_context *ctx;
+	struct file_lock *fl;
+	struct nfs4_delegation *dp;
+
+	ctx = locks_inode_context(inode);
+	if (!ctx)
+		return 0;
+	spin_lock(&ctx->flc_lock);
+	list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
+		if (fl->fl_flags == FL_LAYOUT)
+			continue;
+		if (fl->fl_lmops != &nfsd_lease_mng_ops) {
+			/*
+			 * non-nfs lease, if it's a lease with F_RDLCK then
+			 * we are done; there isn't any write delegation
+			 * on this inode
+			 */
+			if (fl->fl_type == F_RDLCK)
+				break;
+			goto break_lease;
+		}
+		if (fl->fl_type == F_WRLCK) {
+			dp = fl->fl_owner;
+			if (dp->dl_recall.cb_clp == *(rqstp->rq_lease_breaker)) {
+				spin_unlock(&ctx->flc_lock);
+				return 0;
+			}
+break_lease:
+			spin_unlock(&ctx->flc_lock);
+			nfsd_stats_wdeleg_getattr_inc();
+			status = nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
+			if (status != nfserr_jukebox ||
+			    !nfsd_wait_for_delegreturn(rqstp, inode))
+				return status;
+			return 0;
+		}
+		break;
+	}
+	spin_unlock(&ctx->flc_lock);
+	return 0;
 }
+22 -17
fs/nfsd/nfs4xdr.c
···
 		if (status)
 			goto out;
 	}
+	if (bmval0 & (FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE)) {
+		status = nfsd4_deleg_getattr_conflict(rqstp, d_inode(dentry));
+		if (status)
+			goto out;
+	}
 
 	err = vfs_getattr(&path, &stat,
 			  STATX_BASIC_STATS | STATX_BTIME | STATX_CHANGE_COOKIE,
···
 	nfserr = nfsd4_encode_stateid(xdr, &open->op_delegate_stateid);
 	if (nfserr)
 		return nfserr;
-	p = xdr_reserve_space(xdr, 32);
+
+	p = xdr_reserve_space(xdr, XDR_UNIT * 8);
 	if (!p)
 		return nfserr_resource;
 	*p++ = cpu_to_be32(open->op_recall);
 
 	/*
+	 * Always flush on close
+	 *
 	 * TODO: space_limit's in delegations
 	 */
 	*p++ = cpu_to_be32(NFS4_LIMIT_SIZE);
-	*p++ = cpu_to_be32(~(u32)0);
-	*p++ = cpu_to_be32(~(u32)0);
+	*p++ = xdr_zero;
+	*p++ = xdr_zero;
 
 	/*
 	 * TODO: ACE's in delegations
···
 
 	*p++ = cpu_to_be32(gdev->gd_layout_type);
 
-	/* If maxcount is 0 then just update notifications */
-	if (gdev->gd_maxcount != 0) {
-		ops = nfsd4_layout_ops[gdev->gd_layout_type];
-		nfserr = ops->encode_getdeviceinfo(xdr, gdev);
-		if (nfserr) {
-			/*
-			 * We don't bother to burden the layout drivers with
-			 * enforcing gd_maxcount, just tell the client to
-			 * come back with a bigger buffer if it's not enough.
-			 */
-			if (xdr->buf->len + 4 > gdev->gd_maxcount)
-				goto toosmall;
-			return nfserr;
-		}
+	ops = nfsd4_layout_ops[gdev->gd_layout_type];
+	nfserr = ops->encode_getdeviceinfo(xdr, gdev);
+	if (nfserr) {
+		/*
+		 * We don't bother to burden the layout drivers with
+		 * enforcing gd_maxcount, just tell the client to
+		 * come back with a bigger buffer if it's not enough.
+		 */
+		if (xdr->buf->len + 4 > gdev->gd_maxcount)
+			goto toosmall;
+		return nfserr;
 	}
 
 	if (gdev->gd_notify_types) {
+132 -74
fs/nfsd/nfscache.c
···
 	return roundup_pow_of_two(limit / TARGET_BUCKET_SIZE);
 }
 
-static struct svc_cacherep *
-nfsd_reply_cache_alloc(struct svc_rqst *rqstp, __wsum csum,
-			struct nfsd_net *nn)
+static struct nfsd_cacherep *
+nfsd_cacherep_alloc(struct svc_rqst *rqstp, __wsum csum,
+		    struct nfsd_net *nn)
 {
-	struct svc_cacherep *rp;
+	struct nfsd_cacherep *rp;
 
 	rp = kmem_cache_alloc(drc_slab, GFP_KERNEL);
 	if (rp) {
···
 	return rp;
 }
 
-static void
-nfsd_reply_cache_free_locked(struct nfsd_drc_bucket *b, struct svc_cacherep *rp,
-				struct nfsd_net *nn)
+static void nfsd_cacherep_free(struct nfsd_cacherep *rp)
 {
-	if (rp->c_type == RC_REPLBUFF && rp->c_replvec.iov_base) {
-		nfsd_stats_drc_mem_usage_sub(nn, rp->c_replvec.iov_len);
+	if (rp->c_type == RC_REPLBUFF)
 		kfree(rp->c_replvec.iov_base);
+	kmem_cache_free(drc_slab, rp);
+}
+
+static unsigned long
+nfsd_cacherep_dispose(struct list_head *dispose)
+{
+	struct nfsd_cacherep *rp;
+	unsigned long freed = 0;
+
+	while (!list_empty(dispose)) {
+		rp = list_first_entry(dispose, struct nfsd_cacherep, c_lru);
+		list_del(&rp->c_lru);
+		nfsd_cacherep_free(rp);
+		freed++;
 	}
+	return freed;
+}
+
+static void
+nfsd_cacherep_unlink_locked(struct nfsd_net *nn, struct nfsd_drc_bucket *b,
+			    struct nfsd_cacherep *rp)
+{
+	if (rp->c_type == RC_REPLBUFF && rp->c_replvec.iov_base)
+		nfsd_stats_drc_mem_usage_sub(nn, rp->c_replvec.iov_len);
 	if (rp->c_state != RC_UNUSED) {
 		rb_erase(&rp->c_node, &b->rb_head);
 		list_del(&rp->c_lru);
 		atomic_dec(&nn->num_drc_entries);
 		nfsd_stats_drc_mem_usage_sub(nn, sizeof(*rp));
 	}
-	kmem_cache_free(drc_slab, rp);
 }
 
 static void
-nfsd_reply_cache_free(struct nfsd_drc_bucket *b, struct svc_cacherep *rp,
+nfsd_reply_cache_free_locked(struct nfsd_drc_bucket *b, struct nfsd_cacherep *rp,
+			     struct nfsd_net *nn)
+{
+	nfsd_cacherep_unlink_locked(nn, b, rp);
+	nfsd_cacherep_free(rp);
+}
+
+static void
+nfsd_reply_cache_free(struct nfsd_drc_bucket *b, struct nfsd_cacherep *rp,
 			struct nfsd_net *nn)
 {
 	spin_lock(&b->cache_lock);
-	nfsd_reply_cache_free_locked(b, rp, nn);
+	nfsd_cacherep_unlink_locked(nn, b, rp);
 	spin_unlock(&b->cache_lock);
+	nfsd_cacherep_free(rp);
 }
 
 int nfsd_drc_slab_create(void)
 {
 	drc_slab = kmem_cache_create("nfsd_drc",
-			sizeof(struct svc_cacherep), 0, 0, NULL);
+			sizeof(struct nfsd_cacherep), 0, 0, NULL);
 	return drc_slab ? 0: -ENOMEM;
 }
···
 
 void nfsd_reply_cache_shutdown(struct nfsd_net *nn)
 {
-	struct svc_cacherep	*rp;
+	struct nfsd_cacherep	*rp;
 	unsigned int i;
 
 	unregister_shrinker(&nn->nfsd_reply_cache_shrinker);
···
 	for (i = 0; i < nn->drc_hashsize; i++) {
 		struct list_head *head = &nn->drc_hashtbl[i].lru_head;
 		while (!list_empty(head)) {
-			rp = list_first_entry(head, struct svc_cacherep, c_lru);
+			rp = list_first_entry(head, struct nfsd_cacherep, c_lru);
 			nfsd_reply_cache_free_locked(&nn->drc_hashtbl[i],
 					rp, nn);
 		}
···
  * not already scheduled.
  */
 static void
-lru_put_end(struct nfsd_drc_bucket *b, struct svc_cacherep *rp)
+lru_put_end(struct nfsd_drc_bucket *b, struct nfsd_cacherep *rp)
 {
 	rp->c_timestamp = jiffies;
 	list_move_tail(&rp->c_lru, &b->lru_head);
···
 	return &nn->drc_hashtbl[hash];
 }
 
-static long prune_bucket(struct nfsd_drc_bucket *b, struct nfsd_net *nn,
-			 unsigned int max)
+/*
+ * Remove and return no more than @max expired entries in bucket @b.
+ * If @max is zero, do not limit the number of removed entries.
+ */
+static void
+nfsd_prune_bucket_locked(struct nfsd_net *nn, struct nfsd_drc_bucket *b,
+			 unsigned int max, struct list_head *dispose)
 {
-	struct svc_cacherep *rp, *tmp;
-	long freed = 0;
+	unsigned long expiry = jiffies - RC_EXPIRE;
+	struct nfsd_cacherep *rp, *tmp;
+	unsigned int freed = 0;
 
+	lockdep_assert_held(&b->cache_lock);
+
+	/* The bucket LRU is ordered oldest-first. */
 	list_for_each_entry_safe(rp, tmp, &b->lru_head, c_lru) {
 		/*
 		 * Don't free entries attached to calls that are still
···
 		 */
 		if (rp->c_state == RC_INPROG)
 			continue;
+
 		if (atomic_read(&nn->num_drc_entries) <= nn->max_drc_entries &&
-		    time_before(jiffies, rp->c_timestamp + RC_EXPIRE))
+		    time_before(expiry, rp->c_timestamp))
 			break;
-		nfsd_reply_cache_free_locked(b, rp, nn);
-		if (max && freed++ > max)
+
+		nfsd_cacherep_unlink_locked(nn, b, rp);
+		list_add(&rp->c_lru, dispose);
+
+		if (max && ++freed > max)
 			break;
 	}
-	return freed;
 }
 
-static long nfsd_prune_bucket(struct nfsd_drc_bucket *b, struct nfsd_net *nn)
-{
-	return prune_bucket(b, nn, 3);
-}
-
-/*
- * Walk the LRU list and prune off entries that are older than RC_EXPIRE.
- * Also prune the oldest ones when the total exceeds the max number of entries.
+/**
+ * nfsd_reply_cache_count - count_objects method for the DRC shrinker
+ * @shrink: our registered shrinker context
+ * @sc: garbage collection parameters
+ *
+ * Returns the total number of entries in the duplicate reply cache. To
+ * keep things simple and quick, this is not the number of expired entries
+ * in the cache (ie, the number that would be removed by a call to
+ * nfsd_reply_cache_scan).
318 285 */ 319 - static long 320 - prune_cache_entries(struct nfsd_net *nn) 321 - { 322 - unsigned int i; 323 - long freed = 0; 324 - 325 - for (i = 0; i < nn->drc_hashsize; i++) { 326 - struct nfsd_drc_bucket *b = &nn->drc_hashtbl[i]; 327 - 328 - if (list_empty(&b->lru_head)) 329 - continue; 330 - spin_lock(&b->cache_lock); 331 - freed += prune_bucket(b, nn, 0); 332 - spin_unlock(&b->cache_lock); 333 - } 334 - return freed; 335 - } 336 - 337 286 static unsigned long 338 287 nfsd_reply_cache_count(struct shrinker *shrink, struct shrink_control *sc) 339 288 { ··· 329 306 return atomic_read(&nn->num_drc_entries); 330 307 } 331 308 309 + /** 310 + * nfsd_reply_cache_scan - scan_objects method for the DRC shrinker 311 + * @shrink: our registered shrinker context 312 + * @sc: garbage collection parameters 313 + * 314 + * Free expired entries on each bucket's LRU list until we've released 315 + * nr_to_scan freed objects. Nothing will be released if the cache 316 + * has not exceeded its max_drc_entries limit. 317 + * 318 + * Returns the number of entries released by this call. 
319 + */ 332 320 static unsigned long 333 321 nfsd_reply_cache_scan(struct shrinker *shrink, struct shrink_control *sc) 334 322 { 335 323 struct nfsd_net *nn = container_of(shrink, 336 324 struct nfsd_net, nfsd_reply_cache_shrinker); 325 + unsigned long freed = 0; 326 + LIST_HEAD(dispose); 327 + unsigned int i; 337 328 338 - return prune_cache_entries(nn); 329 + for (i = 0; i < nn->drc_hashsize; i++) { 330 + struct nfsd_drc_bucket *b = &nn->drc_hashtbl[i]; 331 + 332 + if (list_empty(&b->lru_head)) 333 + continue; 334 + 335 + spin_lock(&b->cache_lock); 336 + nfsd_prune_bucket_locked(nn, b, 0, &dispose); 337 + spin_unlock(&b->cache_lock); 338 + 339 + freed += nfsd_cacherep_dispose(&dispose); 340 + if (freed > sc->nr_to_scan) 341 + break; 342 + } 343 + 344 + trace_nfsd_drc_gc(nn, freed); 345 + return freed; 339 346 } 347 + 340 348 /* 341 349 * Walk an xdr_buf and get a CRC for at most the first RC_CSUMLEN bytes 342 350 */ ··· 402 348 } 403 349 404 350 static int 405 - nfsd_cache_key_cmp(const struct svc_cacherep *key, 406 - const struct svc_cacherep *rp, struct nfsd_net *nn) 351 + nfsd_cache_key_cmp(const struct nfsd_cacherep *key, 352 + const struct nfsd_cacherep *rp, struct nfsd_net *nn) 407 353 { 408 354 if (key->c_key.k_xid == rp->c_key.k_xid && 409 355 key->c_key.k_csum != rp->c_key.k_csum) { ··· 419 365 * Must be called with cache_lock held. Returns the found entry or 420 366 * inserts an empty key on failure. 
421 367 */ 422 - static struct svc_cacherep * 423 - nfsd_cache_insert(struct nfsd_drc_bucket *b, struct svc_cacherep *key, 368 + static struct nfsd_cacherep * 369 + nfsd_cache_insert(struct nfsd_drc_bucket *b, struct nfsd_cacherep *key, 424 370 struct nfsd_net *nn) 425 371 { 426 - struct svc_cacherep *rp, *ret = key; 372 + struct nfsd_cacherep *rp, *ret = key; 427 373 struct rb_node **p = &b->rb_head.rb_node, 428 374 *parent = NULL; 429 375 unsigned int entries = 0; ··· 432 378 while (*p != NULL) { 433 379 ++entries; 434 380 parent = *p; 435 - rp = rb_entry(parent, struct svc_cacherep, c_node); 381 + rp = rb_entry(parent, struct nfsd_cacherep, c_node); 436 382 437 383 cmp = nfsd_cache_key_cmp(key, rp, nn); 438 384 if (cmp < 0) ··· 465 411 /** 466 412 * nfsd_cache_lookup - Find an entry in the duplicate reply cache 467 413 * @rqstp: Incoming Call to find 414 + * @cacherep: OUT: DRC entry for this request 468 415 * 469 416 * Try to find an entry matching the current call in the cache. When none 470 417 * is found, we try to grab the oldest expired entry off the LRU list. If ··· 478 423 * %RC_REPLY: Reply from cache 479 424 * %RC_DROPIT: Do not process the request further 480 425 */ 481 - int nfsd_cache_lookup(struct svc_rqst *rqstp) 426 + int nfsd_cache_lookup(struct svc_rqst *rqstp, struct nfsd_cacherep **cacherep) 482 427 { 483 428 struct nfsd_net *nn; 484 - struct svc_cacherep *rp, *found; 429 + struct nfsd_cacherep *rp, *found; 485 430 __wsum csum; 486 431 struct nfsd_drc_bucket *b; 487 432 int type = rqstp->rq_cachetype; 433 + unsigned long freed; 434 + LIST_HEAD(dispose); 488 435 int rtn = RC_DOIT; 489 436 490 - rqstp->rq_cacherep = NULL; 491 437 if (type == RC_NOCACHE) { 492 438 nfsd_stats_rc_nocache_inc(); 493 439 goto out; ··· 501 445 * preallocate an entry. 
502 446 */ 503 447 nn = net_generic(SVC_NET(rqstp), nfsd_net_id); 504 - rp = nfsd_reply_cache_alloc(rqstp, csum, nn); 448 + rp = nfsd_cacherep_alloc(rqstp, csum, nn); 505 449 if (!rp) 506 450 goto out; 507 451 ··· 510 454 found = nfsd_cache_insert(b, rp, nn); 511 455 if (found != rp) 512 456 goto found_entry; 457 + *cacherep = rp; 458 + rp->c_state = RC_INPROG; 459 + nfsd_prune_bucket_locked(nn, b, 3, &dispose); 460 + spin_unlock(&b->cache_lock); 461 + 462 + freed = nfsd_cacherep_dispose(&dispose); 463 + trace_nfsd_drc_gc(nn, freed); 513 464 514 465 nfsd_stats_rc_misses_inc(); 515 - rqstp->rq_cacherep = rp; 516 - rp->c_state = RC_INPROG; 517 - 518 466 atomic_inc(&nn->num_drc_entries); 519 467 nfsd_stats_drc_mem_usage_add(nn, sizeof(*rp)); 520 - 521 - nfsd_prune_bucket(b, nn); 522 - 523 - out_unlock: 524 - spin_unlock(&b->cache_lock); 525 - out: 526 - return rtn; 468 + goto out; 527 469 528 470 found_entry: 529 471 /* We found a matching entry which is either in progress or done. */ ··· 559 505 560 506 out_trace: 561 507 trace_nfsd_drc_found(nn, rqstp, rtn); 562 - goto out_unlock; 508 + out_unlock: 509 + spin_unlock(&b->cache_lock); 510 + out: 511 + return rtn; 563 512 } 564 513 565 514 /** 566 515 * nfsd_cache_update - Update an entry in the duplicate reply cache. 567 516 * @rqstp: svc_rqst with a finished Reply 517 + * @rp: IN: DRC entry for this request 568 518 * @cachetype: which cache to update 569 519 * @statp: pointer to Reply's NFS status code, or NULL 570 520 * ··· 586 528 * nfsd failed to encode a reply that otherwise would have been cached. 587 529 * In this case, nfsd_cache_update is called with statp == NULL. 
588 530 */ 589 - void nfsd_cache_update(struct svc_rqst *rqstp, int cachetype, __be32 *statp) 531 + void nfsd_cache_update(struct svc_rqst *rqstp, struct nfsd_cacherep *rp, 532 + int cachetype, __be32 *statp) 590 533 { 591 534 struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id); 592 - struct svc_cacherep *rp = rqstp->rq_cacherep; 593 535 struct kvec *resv = &rqstp->rq_res.head[0], *cachv; 594 536 struct nfsd_drc_bucket *b; 595 537 int len;
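The nfscache.c hunk above restructures reply-cache pruning around a two-phase pattern: under `b->cache_lock`, expired entries are only *unlinked* onto a private dispose list; the actual `kfree()` work happens after the lock is dropped, shrinking the critical section. A minimal userspace sketch of that shape (all names here are illustrative, using a pthread mutex in place of the bucket spinlock):

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical entry type standing in for struct nfsd_cacherep. */
struct entry {
	struct entry *next;
};

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static struct entry *cache_head;

/* Phase 1: under the lock, only unlink entries onto a private list.
 * As in nfsd_prune_bucket_locked(), max == 0 means "no limit". */
static struct entry *prune_locked(unsigned int max)
{
	struct entry *dispose = NULL;
	unsigned int taken = 0;

	pthread_mutex_lock(&cache_lock);
	while (cache_head && (!max || taken < max)) {
		struct entry *e = cache_head;

		cache_head = e->next;	/* unlink from the shared list */
		e->next = dispose;	/* collect on the private list */
		dispose = e;
		taken++;
	}
	pthread_mutex_unlock(&cache_lock);
	return dispose;
}

/* Phase 2: free the unlinked entries with no lock held, as
 * nfsd_cacherep_dispose() does. Returns the count freed. */
static unsigned long dispose_list(struct entry *dispose)
{
	unsigned long freed = 0;

	while (dispose) {
		struct entry *e = dispose;

		dispose = e->next;
		free(e);
		freed++;
	}
	return freed;
}

static void populate(unsigned int n)
{
	while (n--) {
		struct entry *e = malloc(sizeof(*e));

		e->next = cache_head;
		cache_head = e;
	}
}
```

The payoff is the same as in the hunk: contended bucket locks are held only for pointer manipulation, never across memory-free work.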
+1
fs/nfsd/nfsctl.c
··· 1627 1627 } 1628 1628 1629 1629 MODULE_AUTHOR("Olaf Kirch <okir@monad.swb.de>"); 1630 + MODULE_DESCRIPTION("In-kernel NFS server"); 1630 1631 MODULE_LICENSE("GPL"); 1631 1632 module_init(init_nfsd) 1632 1633 module_exit(exit_nfsd)
+6 -1
fs/nfsd/nfsd.h
··· 96 96 int nfsd_pool_stats_release(struct inode *, struct file *); 97 97 void nfsd_shutdown_threads(struct net *net); 98 98 99 - void nfsd_put(struct net *net); 99 + static inline void nfsd_put(struct net *net) 100 + { 101 + struct nfsd_net *nn = net_generic(net, nfsd_net_id); 102 + 103 + svc_put(nn->nfsd_serv); 104 + } 100 105 101 106 bool i_am_nfsd(void); 102 107
+16 -10
fs/nfsd/nfsfh.c
··· 614 614 * @fhp: file handle to be updated 615 615 * 616 616 */ 617 - void fh_fill_pre_attrs(struct svc_fh *fhp) 617 + __be32 __must_check fh_fill_pre_attrs(struct svc_fh *fhp) 618 618 { 619 619 bool v4 = (fhp->fh_maxsize == NFS4_FHSIZE); 620 620 struct inode *inode; ··· 622 622 __be32 err; 623 623 624 624 if (fhp->fh_no_wcc || fhp->fh_pre_saved) 625 - return; 625 + return nfs_ok; 626 626 627 627 inode = d_inode(fhp->fh_dentry); 628 628 err = fh_getattr(fhp, &stat); 629 629 if (err) 630 - return; 630 + return err; 631 631 632 632 if (v4) 633 633 fhp->fh_pre_change = nfsd4_change_attribute(&stat, inode); ··· 636 636 fhp->fh_pre_ctime = stat.ctime; 637 637 fhp->fh_pre_size = stat.size; 638 638 fhp->fh_pre_saved = true; 639 + return nfs_ok; 639 640 } 640 641 641 642 /** ··· 644 643 * @fhp: file handle to be updated 645 644 * 646 645 */ 647 - void fh_fill_post_attrs(struct svc_fh *fhp) 646 + __be32 fh_fill_post_attrs(struct svc_fh *fhp) 648 647 { 649 648 bool v4 = (fhp->fh_maxsize == NFS4_FHSIZE); 650 649 struct inode *inode = d_inode(fhp->fh_dentry); 651 650 __be32 err; 652 651 653 652 if (fhp->fh_no_wcc) 654 - return; 653 + return nfs_ok; 655 654 656 655 if (fhp->fh_post_saved) 657 656 printk("nfsd: inode locked twice during operation.\n"); 658 657 659 658 err = fh_getattr(fhp, &fhp->fh_post_attr); 660 659 if (err) 661 - return; 660 + return err; 662 661 663 662 fhp->fh_post_saved = true; 664 663 if (v4) 665 664 fhp->fh_post_change = 666 665 nfsd4_change_attribute(&fhp->fh_post_attr, inode); 666 + return nfs_ok; 667 667 } 668 668 669 669 /** ··· 674 672 * This is used when the directory wasn't changed, but wcc attributes 675 673 * are needed anyway. 
676 674 */ 677 - void fh_fill_both_attrs(struct svc_fh *fhp) 675 + __be32 __must_check fh_fill_both_attrs(struct svc_fh *fhp) 678 676 { 679 - fh_fill_post_attrs(fhp); 680 - if (!fhp->fh_post_saved) 681 - return; 677 + __be32 err; 678 + 679 + err = fh_fill_post_attrs(fhp); 680 + if (err) 681 + return err; 682 + 682 683 fhp->fh_pre_change = fhp->fh_post_change; 683 684 fhp->fh_pre_mtime = fhp->fh_post_attr.mtime; 684 685 fhp->fh_pre_ctime = fhp->fh_post_attr.ctime; 685 686 fhp->fh_pre_size = fhp->fh_post_attr.size; 686 687 fhp->fh_pre_saved = true; 688 + return nfs_ok; 687 689 } 688 690 689 691 /*
+3 -3
fs/nfsd/nfsfh.h
··· 294 294 } 295 295 296 296 u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode); 297 - extern void fh_fill_pre_attrs(struct svc_fh *fhp); 298 - extern void fh_fill_post_attrs(struct svc_fh *fhp); 299 - extern void fh_fill_both_attrs(struct svc_fh *fhp); 297 + __be32 __must_check fh_fill_pre_attrs(struct svc_fh *fhp); 298 + __be32 fh_fill_post_attrs(struct svc_fh *fhp); 299 + __be32 __must_check fh_fill_both_attrs(struct svc_fh *fhp); 300 300 #endif /* _LINUX_NFSD_NFSFH_H */
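The nfsfh.c/nfsfh.h hunks convert `fh_fill_pre_attrs()` and friends from `void` helpers (which silently swallowed a failed `fh_getattr()`) into `__be32`-returning functions, with `__must_check` forcing callers to bail out on error. A small sketch of that conversion pattern, using GCC's `warn_unused_result` (the attribute behind the kernel's `__must_check`) and made-up status names:

```c
/* Hypothetical status codes standing in for nfs_ok / nfserr values. */
typedef int status_t;
#define STATUS_OK   0
#define STATUS_FAIL 1

static int getattr_result;	/* simulated outcome of fh_getattr() */
static int pre_saved;		/* simulated fhp->fh_pre_saved */

/* Before the conversion this helper returned void and a getattr
 * failure went unnoticed; now it reports the error, and the
 * attribute makes ignoring the return value a compiler warning. */
__attribute__((warn_unused_result))
static status_t fill_pre_attrs(void)
{
	if (getattr_result != 0)
		return STATUS_FAIL;
	pre_saved = 1;
	return STATUS_OK;
}

/* Caller pattern from the diff: check the status and unwind early
 * instead of proceeding with unsaved pre-op attributes. */
static status_t do_operation(void)
{
	status_t err = fill_pre_attrs();

	if (err != STATUS_OK)
		return err;
	/* ... modify the directory, then fill post-op attributes ... */
	return STATUS_OK;
}
```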
+27 -84
fs/nfsd/nfssvc.c
··· 542 542 /* Only used under nfsd_mutex, so this atomic may be overkill: */ 543 543 static atomic_t nfsd_notifier_refcount = ATOMIC_INIT(0); 544 544 545 - static void nfsd_last_thread(struct svc_serv *serv, struct net *net) 545 + static void nfsd_last_thread(struct net *net) 546 546 { 547 547 struct nfsd_net *nn = net_generic(net, nfsd_net_id); 548 + struct svc_serv *serv = nn->nfsd_serv; 549 + 550 + spin_lock(&nfsd_notifier_lock); 551 + nn->nfsd_serv = NULL; 552 + spin_unlock(&nfsd_notifier_lock); 548 553 549 554 /* check if the notifier still has clients */ 550 555 if (atomic_dec_return(&nfsd_notifier_refcount) == 0) { ··· 558 553 unregister_inet6addr_notifier(&nfsd_inet6addr_notifier); 559 554 #endif 560 555 } 556 + 557 + svc_xprt_destroy_all(serv, net); 561 558 562 559 /* 563 560 * write_ports can create the server without actually starting ··· 651 644 svc_get(serv); 652 645 /* Kill outstanding nfsd threads */ 653 646 svc_set_num_threads(serv, NULL, 0); 654 - nfsd_put(net); 647 + nfsd_last_thread(net); 648 + svc_put(serv); 655 649 mutex_unlock(&nfsd_mutex); 656 650 } 657 651 ··· 682 674 serv->sv_maxconn = nn->max_connections; 683 675 error = svc_bind(serv, net); 684 676 if (error < 0) { 685 - /* NOT nfsd_put() as notifiers (see below) haven't 686 - * been set up yet. 687 - */ 688 677 svc_put(serv); 689 678 return error; 690 679 } ··· 722 717 } 723 718 724 719 return 0; 725 - } 726 - 727 - /* This is the callback for kref_put() below. 728 - * There is no code here as the first thing to be done is 729 - * call svc_shutdown_net(), but we cannot get the 'net' from 730 - * the kref. So do all the work when kref_put returns true. 
731 - */ 732 - static void nfsd_noop(struct kref *ref) 733 - { 734 - } 735 - 736 - void nfsd_put(struct net *net) 737 - { 738 - struct nfsd_net *nn = net_generic(net, nfsd_net_id); 739 - 740 - if (kref_put(&nn->nfsd_serv->sv_refcnt, nfsd_noop)) { 741 - svc_xprt_destroy_all(nn->nfsd_serv, net); 742 - nfsd_last_thread(nn->nfsd_serv, net); 743 - svc_destroy(&nn->nfsd_serv->sv_refcnt); 744 - spin_lock(&nfsd_notifier_lock); 745 - nn->nfsd_serv = NULL; 746 - spin_unlock(&nfsd_notifier_lock); 747 - } 748 720 } 749 721 750 722 int nfsd_set_nrthreads(int n, int *nthreads, struct net *net) ··· 774 792 if (err) 775 793 break; 776 794 } 777 - nfsd_put(net); 795 + svc_put(nn->nfsd_serv); 778 796 return err; 779 797 } 780 798 ··· 789 807 int error; 790 808 bool nfsd_up_before; 791 809 struct nfsd_net *nn = net_generic(net, nfsd_net_id); 810 + struct svc_serv *serv; 792 811 793 812 mutex_lock(&nfsd_mutex); 794 813 dprintk("nfsd: creating service\n"); ··· 809 826 goto out; 810 827 811 828 nfsd_up_before = nn->nfsd_net_up; 829 + serv = nn->nfsd_serv; 812 830 813 831 error = nfsd_startup_net(net, cred); 814 832 if (error) 815 833 goto out_put; 816 - error = svc_set_num_threads(nn->nfsd_serv, NULL, nrservs); 834 + error = svc_set_num_threads(serv, NULL, nrservs); 817 835 if (error) 818 836 goto out_shutdown; 819 - error = nn->nfsd_serv->sv_nrthreads; 837 + error = serv->sv_nrthreads; 838 + if (error == 0) 839 + nfsd_last_thread(net); 820 840 out_shutdown: 821 841 if (error < 0 && !nfsd_up_before) 822 842 nfsd_shutdown_net(net); 823 843 out_put: 824 844 /* Threads now hold service active */ 825 845 if (xchg(&nn->keep_active, 0)) 826 - nfsd_put(net); 827 - nfsd_put(net); 846 + svc_put(serv); 847 + svc_put(serv); 828 848 out: 829 849 mutex_unlock(&nfsd_mutex); 830 850 return error; ··· 939 953 struct svc_xprt *perm_sock = list_entry(rqstp->rq_server->sv_permsocks.next, typeof(struct svc_xprt), xpt_list); 940 954 struct net *net = perm_sock->xpt_net; 941 955 struct nfsd_net *nn = 
net_generic(net, nfsd_net_id); 942 - int err; 943 956 944 957 /* At this point, the thread shares current->fs 945 958 * with the init process. We need to create files with the ··· 950 965 951 966 current->fs->umask = 0; 952 967 953 - /* 954 - * thread is spawned with all signals set to SIG_IGN, re-enable 955 - * the ones that will bring down the thread 956 - */ 957 - allow_signal(SIGKILL); 958 - allow_signal(SIGHUP); 959 - allow_signal(SIGINT); 960 - allow_signal(SIGQUIT); 961 - 962 968 atomic_inc(&nfsdstats.th_cnt); 963 969 964 970 set_freezable(); ··· 957 981 /* 958 982 * The main request loop 959 983 */ 960 - for (;;) { 984 + while (!kthread_should_stop()) { 961 985 /* Update sv_maxconn if it has changed */ 962 986 rqstp->rq_server->sv_maxconn = nn->max_connections; 963 987 964 - /* 965 - * Find a socket with data available and call its 966 - * recvfrom routine. 967 - */ 968 - while ((err = svc_recv(rqstp, 60*60*HZ)) == -EAGAIN) 969 - ; 970 - if (err == -EINTR) 971 - break; 972 - validate_process_creds(); 973 - svc_process(rqstp); 988 + svc_recv(rqstp); 974 989 validate_process_creds(); 975 990 } 976 - 977 - /* Clear signals before calling svc_exit_thread() */ 978 - flush_signals(current); 979 991 980 992 atomic_dec(&nfsdstats.th_cnt); 981 993 982 994 out: 983 - /* Take an extra ref so that the svc_put in svc_exit_thread() 984 - * doesn't call svc_destroy() 985 - */ 986 - svc_get(nn->nfsd_serv); 987 - 988 995 /* Release the thread */ 989 996 svc_exit_thread(rqstp); 990 - 991 - /* We need to drop a ref, but may not drop the last reference 992 - * without holding nfsd_mutex, and we cannot wait for nfsd_mutex as that 993 - * could deadlock with nfsd_shutdown_threads() waiting for us. 
994 - * So three options are: 995 - * - drop a non-final reference, 996 - * - get the mutex without waiting 997 - * - sleep briefly andd try the above again 998 - */ 999 - while (!svc_put_not_last(nn->nfsd_serv)) { 1000 - if (mutex_trylock(&nfsd_mutex)) { 1001 - nfsd_put(net); 1002 - mutex_unlock(&nfsd_mutex); 1003 - break; 1004 - } 1005 - msleep(20); 1006 - } 1007 - 1008 997 return 0; 1009 998 } 1010 999 ··· 987 1046 { 988 1047 const struct svc_procedure *proc = rqstp->rq_procinfo; 989 1048 __be32 *statp = rqstp->rq_accept_statp; 1049 + struct nfsd_cacherep *rp; 990 1050 991 1051 /* 992 1052 * Give the xdr decoder a chance to change this if it wants ··· 998 1056 if (!proc->pc_decode(rqstp, &rqstp->rq_arg_stream)) 999 1057 goto out_decode_err; 1000 1058 1001 - switch (nfsd_cache_lookup(rqstp)) { 1059 + rp = NULL; 1060 + switch (nfsd_cache_lookup(rqstp, &rp)) { 1002 1061 case RC_DOIT: 1003 1062 break; 1004 1063 case RC_REPLY: ··· 1015 1072 if (!proc->pc_encode(rqstp, &rqstp->rq_res_stream)) 1016 1073 goto out_encode_err; 1017 1074 1018 - nfsd_cache_update(rqstp, rqstp->rq_cachetype, statp + 1); 1075 + nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, statp + 1); 1019 1076 out_cached_reply: 1020 1077 return 1; 1021 1078 ··· 1025 1082 return 1; 1026 1083 1027 1084 out_update_drop: 1028 - nfsd_cache_update(rqstp, RC_NOCACHE, NULL); 1085 + nfsd_cache_update(rqstp, rp, RC_NOCACHE, NULL); 1029 1086 out_dropit: 1030 1087 return 0; 1031 1088 1032 1089 out_encode_err: 1033 1090 trace_nfsd_cant_encode_err(rqstp); 1034 - nfsd_cache_update(rqstp, RC_NOCACHE, NULL); 1091 + nfsd_cache_update(rqstp, rp, RC_NOCACHE, NULL); 1035 1092 *statp = rpc_system_err; 1036 1093 return 1; 1037 1094 }
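The nfssvc.c hunk replaces the old signal-driven `nfsd()` main loop (re-enabling SIGKILL/SIGINT, spinning on `-EAGAIN`, breaking on `-EINTR`) with a plain `while (!kthread_should_stop())` loop around `svc_recv()`. The resulting shape, sketched in userspace with C11 atomics; `should_stop()` and `serve_one()` are illustrative stand-ins for `kthread_should_stop()` and `svc_recv()`, not real kernel APIs:

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool stop_requested;
static int requests_handled;

/* Stand-in for kthread_should_stop(): a stop flag set elsewhere. */
static bool should_stop(void)
{
	return atomic_load(&stop_requested);
}

/* Stand-in for svc_recv(): handle one receive-and-dispatch pass.
 * After three passes it raises the stop flag to simulate shutdown. */
static void serve_one(void)
{
	requests_handled++;
	if (requests_handled >= 3)
		atomic_store(&stop_requested, true);
}

/* The simplified thread body: no signal plumbing, no -EAGAIN retry
 * loop, just "serve until asked to stop". */
static int worker(void)
{
	while (!should_stop())
		serve_one();
	return requests_handled;
}
```

The design point mirrors the diff: once receive, dispatch, and termination are owned by the loop's single predicate, the per-thread teardown (the old `svc_put_not_last()` dance) can go away too.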
+3
fs/nfsd/state.h
··· 732 732 cmpxchg(&clp->cl_state, NFSD4_COURTESY, NFSD4_EXPIRABLE); 733 733 return clp->cl_state == NFSD4_EXPIRABLE; 734 734 } 735 + 736 + extern __be32 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, 737 + struct inode *inode); 735 738 #endif /* NFSD4_STATE_H */
+2
fs/nfsd/stats.c
··· 65 65 seq_printf(seq, " %lld", 66 66 percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_NFS4_OP(i)])); 67 67 } 68 + seq_printf(seq, "\nwdeleg_getattr %lld", 69 + percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_WDELEG_GETATTR])); 68 70 69 71 seq_putc(seq, '\n'); 70 72 #endif
+7
fs/nfsd/stats.h
··· 22 22 NFSD_STATS_FIRST_NFS4_OP, /* count of individual nfsv4 operations */ 23 23 NFSD_STATS_LAST_NFS4_OP = NFSD_STATS_FIRST_NFS4_OP + LAST_NFS4_OP, 24 24 #define NFSD_STATS_NFS4_OP(op) (NFSD_STATS_FIRST_NFS4_OP + (op)) 25 + NFSD_STATS_WDELEG_GETATTR, /* count of getattr conflict with wdeleg */ 25 26 #endif 26 27 NFSD_STATS_COUNTERS_NUM 27 28 }; ··· 94 93 percpu_counter_sub(&nn->counter[NFSD_NET_DRC_MEM_USAGE], amount); 95 94 } 96 95 96 + #ifdef CONFIG_NFSD_V4 97 + static inline void nfsd_stats_wdeleg_getattr_inc(void) 98 + { 99 + percpu_counter_inc(&nfsdstats.counter[NFSD_STATS_WDELEG_GETATTR]); 100 + } 101 + #endif 97 102 #endif /* _NFSD_STATS_H */
+25 -2
fs/nfsd/trace.h
··· 607 607 608 608 DEFINE_STATEID_EVENT(open); 609 609 DEFINE_STATEID_EVENT(deleg_read); 610 + DEFINE_STATEID_EVENT(deleg_write); 610 611 DEFINE_STATEID_EVENT(deleg_return); 611 612 DEFINE_STATEID_EVENT(deleg_recall); 612 613 ··· 1241 1240 TRACE_EVENT(nfsd_drc_mismatch, 1242 1241 TP_PROTO( 1243 1242 const struct nfsd_net *nn, 1244 - const struct svc_cacherep *key, 1245 - const struct svc_cacherep *rp 1243 + const struct nfsd_cacherep *key, 1244 + const struct nfsd_cacherep *rp 1246 1245 ), 1247 1246 TP_ARGS(nn, key, rp), 1248 1247 TP_STRUCT__entry( ··· 1260 1259 TP_printk("boot_time=%16llx xid=0x%08x cached-csum=0x%08x ingress-csum=0x%08x", 1261 1260 __entry->boot_time, __entry->xid, __entry->cached, 1262 1261 __entry->ingress) 1262 + ); 1263 + 1264 + TRACE_EVENT_CONDITION(nfsd_drc_gc, 1265 + TP_PROTO( 1266 + const struct nfsd_net *nn, 1267 + unsigned long freed 1268 + ), 1269 + TP_ARGS(nn, freed), 1270 + TP_CONDITION(freed > 0), 1271 + TP_STRUCT__entry( 1272 + __field(unsigned long long, boot_time) 1273 + __field(unsigned long, freed) 1274 + __field(int, total) 1275 + ), 1276 + TP_fast_assign( 1277 + __entry->boot_time = nn->boot_time; 1278 + __entry->freed = freed; 1279 + __entry->total = atomic_read(&nn->num_drc_entries); 1280 + ), 1281 + TP_printk("boot_time=%16llx total=%d freed=%lu", 1282 + __entry->boot_time, __entry->total, __entry->freed 1283 + ) 1263 1284 ); 1264 1285 1265 1286 TRACE_EVENT(nfsd_cb_args,
+35 -17
fs/nfsd/vfs.c
··· 1540 1540 dput(dchild); 1541 1541 if (err) 1542 1542 goto out_unlock; 1543 - fh_fill_pre_attrs(fhp); 1543 + err = fh_fill_pre_attrs(fhp); 1544 + if (err != nfs_ok) 1545 + goto out_unlock; 1544 1546 err = nfsd_create_locked(rqstp, fhp, attrs, type, rdev, resfhp); 1545 1547 fh_fill_post_attrs(fhp); 1546 1548 out_unlock: ··· 1637 1635 inode_unlock(dentry->d_inode); 1638 1636 goto out_drop_write; 1639 1637 } 1640 - fh_fill_pre_attrs(fhp); 1638 + err = fh_fill_pre_attrs(fhp); 1639 + if (err != nfs_ok) 1640 + goto out_unlock; 1641 1641 host_err = vfs_symlink(&nop_mnt_idmap, d_inode(dentry), dnew, path); 1642 1642 err = nfserrno(host_err); 1643 1643 cerr = fh_compose(resfhp, fhp->fh_export, dnew, fhp); 1644 1644 if (!err) 1645 1645 nfsd_create_setattr(rqstp, fhp, resfhp, attrs); 1646 1646 fh_fill_post_attrs(fhp); 1647 + out_unlock: 1647 1648 inode_unlock(dentry->d_inode); 1648 1649 if (!err) 1649 1650 err = nfserrno(commit_metadata(fhp)); ··· 1708 1703 err = nfserr_noent; 1709 1704 if (d_really_is_negative(dold)) 1710 1705 goto out_dput; 1711 - fh_fill_pre_attrs(ffhp); 1706 + err = fh_fill_pre_attrs(ffhp); 1707 + if (err != nfs_ok) 1708 + goto out_dput; 1712 1709 host_err = vfs_link(dold, &nop_mnt_idmap, dirp, dnew, NULL); 1713 1710 fh_fill_post_attrs(ffhp); 1714 1711 inode_unlock(dirp); ··· 1796 1789 } 1797 1790 1798 1791 trap = lock_rename(tdentry, fdentry); 1799 - fh_fill_pre_attrs(ffhp); 1800 - fh_fill_pre_attrs(tfhp); 1792 + err = fh_fill_pre_attrs(ffhp); 1793 + if (err != nfs_ok) 1794 + goto out_unlock; 1795 + err = fh_fill_pre_attrs(tfhp); 1796 + if (err != nfs_ok) 1797 + goto out_unlock; 1801 1798 1802 1799 odentry = lookup_one_len(fname, fdentry, flen); 1803 1800 host_err = PTR_ERR(odentry); ··· 1868 1857 fh_fill_post_attrs(ffhp); 1869 1858 fh_fill_post_attrs(tfhp); 1870 1859 } 1860 + out_unlock: 1871 1861 unlock_rename(tdentry, fdentry); 1872 1862 fh_drop_write(ffhp); 1873 1863 ··· 1928 1916 goto out_unlock; 1929 1917 } 1930 1918 rinode = d_inode(rdentry); 
1931 - ihold(rinode); 1919 + err = fh_fill_pre_attrs(fhp); 1920 + if (err != nfs_ok) 1921 + goto out_unlock; 1932 1922 1923 + ihold(rinode); 1933 1924 if (!type) 1934 1925 type = d_inode(rdentry)->i_mode & S_IFMT; 1935 1926 1936 - fh_fill_pre_attrs(fhp); 1937 1927 if (type != S_IFDIR) { 1938 1928 int retries; 1939 1929 ··· 2355 2341 return nfserrno(ret); 2356 2342 2357 2343 inode_lock(fhp->fh_dentry->d_inode); 2358 - fh_fill_pre_attrs(fhp); 2359 - 2344 + err = fh_fill_pre_attrs(fhp); 2345 + if (err != nfs_ok) 2346 + goto out_unlock; 2360 2347 ret = __vfs_removexattr_locked(&nop_mnt_idmap, fhp->fh_dentry, 2361 2348 name, NULL); 2362 - 2349 + err = nfsd_xattr_errno(ret); 2363 2350 fh_fill_post_attrs(fhp); 2351 + out_unlock: 2364 2352 inode_unlock(fhp->fh_dentry->d_inode); 2365 2353 fh_drop_write(fhp); 2366 2354 2367 - return nfsd_xattr_errno(ret); 2355 + return err; 2368 2356 } 2369 2357 2370 2358 __be32 ··· 2384 2368 if (ret) 2385 2369 return nfserrno(ret); 2386 2370 inode_lock(fhp->fh_dentry->d_inode); 2387 - fh_fill_pre_attrs(fhp); 2388 - 2389 - ret = __vfs_setxattr_locked(&nop_mnt_idmap, fhp->fh_dentry, name, buf, 2390 - len, flags, NULL); 2371 + err = fh_fill_pre_attrs(fhp); 2372 + if (err != nfs_ok) 2373 + goto out_unlock; 2374 + ret = __vfs_setxattr_locked(&nop_mnt_idmap, fhp->fh_dentry, 2375 + name, buf, len, flags, NULL); 2391 2376 fh_fill_post_attrs(fhp); 2377 + err = nfsd_xattr_errno(ret); 2378 + out_unlock: 2392 2379 inode_unlock(fhp->fh_dentry->d_inode); 2393 2380 fh_drop_write(fhp); 2394 - 2395 - return nfsd_xattr_errno(ret); 2381 + return err; 2396 2382 } 2397 2383 #endif 2398 2384
-11
fs/nfsd/xdr4.h
··· 774 774 775 775 #define NFS4_SVC_XDRSIZE sizeof(struct nfsd4_compoundargs) 776 776 777 - static inline void 778 - set_change_info(struct nfsd4_change_info *cinfo, struct svc_fh *fhp) 779 - { 780 - BUG_ON(!fhp->fh_pre_saved); 781 - cinfo->atomic = (u32)(fhp->fh_post_saved && !fhp->fh_no_atomic_attr); 782 - 783 - cinfo->before_change = fhp->fh_pre_change; 784 - cinfo->after_change = fhp->fh_post_change; 785 - } 786 - 787 - 788 777 bool nfsd4_mach_creds_match(struct nfs4_client *cl, struct svc_rqst *rqstp); 789 778 bool nfs4svc_decode_compoundargs(struct svc_rqst *rqstp, struct xdr_stream *xdr); 790 779 bool nfs4svc_encode_compoundres(struct svc_rqst *rqstp, struct xdr_stream *xdr);
+3 -1
include/linux/lockd/lockd.h
··· 204 204 extern bool nsm_use_hostnames; 205 205 extern u32 nsm_local_state; 206 206 207 + extern struct timer_list nlmsvc_retry; 208 + 207 209 /* 208 210 * Lockd client functions 209 211 */ ··· 282 280 struct nlm_host *, struct nlm_lock *, 283 281 struct nlm_lock *, struct nlm_cookie *); 284 282 __be32 nlmsvc_cancel_blocked(struct net *net, struct nlm_file *, struct nlm_lock *); 285 - unsigned long nlmsvc_retry_blocked(void); 283 + void nlmsvc_retry_blocked(void); 286 284 void nlmsvc_traverse_blocks(struct nlm_host *, struct nlm_file *, 287 285 nlm_host_match_fn_t match); 288 286 void nlmsvc_grant_reply(struct nlm_cookie *, __be32);
+8 -4
include/linux/sunrpc/cache.h
··· 56 56 struct kref ref; 57 57 unsigned long flags; 58 58 }; 59 - #define CACHE_VALID 0 /* Entry contains valid data */ 60 - #define CACHE_NEGATIVE 1 /* Negative entry - there is no match for the key */ 61 - #define CACHE_PENDING 2 /* An upcall has been sent but no reply received yet*/ 62 - #define CACHE_CLEANED 3 /* Entry has been cleaned from cache */ 59 + 60 + /* cache_head.flags */ 61 + enum { 62 + CACHE_VALID, /* Entry contains valid data */ 63 + CACHE_NEGATIVE, /* Negative entry - there is no match for the key */ 64 + CACHE_PENDING, /* An upcall has been sent but no reply received yet*/ 65 + CACHE_CLEANED, /* Entry has been cleaned from cache */ 66 + }; 63 67 64 68 #define CACHE_NEW_EXPIRY 120 /* keep new things pending confirmation for 120 seconds */ 65 69
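The sunrpc/cache.h hunk (like the later svc.h and svc_xprt.h ones) converts a run of `#define` flag *bit numbers* into an anonymous enum, which gives the constants a debugger-visible type and guards against duplicate values. A compact sketch of the same idiom with illustrative names and minimal stand-ins for `set_bit()`/`test_bit()` on an `unsigned long` flags word:

```c
/* Bit numbers, not masks: each name is an index into the flags word,
 * exactly as with CACHE_VALID..CACHE_CLEANED in the hunk above. */
enum {
	FLAG_VALID,	/* entry contains valid data */
	FLAG_NEGATIVE,	/* negative entry */
	FLAG_PENDING,	/* upcall sent, no reply yet */
	FLAG_CLEANED,	/* removed from cache */
};

/* Toy, non-atomic versions of the kernel's set_bit()/test_bit(). */
static void flag_set(unsigned long *flags, int bit)
{
	*flags |= 1UL << bit;
}

static int flag_test(const unsigned long *flags, int bit)
{
	return (*flags >> bit) & 1UL;
}
```

Because the enum auto-numbers from zero, inserting a new flag mid-list renumbers the rest automatically, which is why the conversion is safe only for in-kernel flags that are never exposed to userspace ABI.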
+7 -16
include/linux/sunrpc/stats.h
··· 43 43 #ifdef CONFIG_PROC_FS 44 44 int rpc_proc_init(struct net *); 45 45 void rpc_proc_exit(struct net *); 46 - #else 47 - static inline int rpc_proc_init(struct net *net) 48 - { 49 - return 0; 50 - } 51 - 52 - static inline void rpc_proc_exit(struct net *net) 53 - { 54 - } 55 - #endif 56 - 57 - #ifdef MODULE 58 - void rpc_modcount(struct inode *, int); 59 - #endif 60 - 61 - #ifdef CONFIG_PROC_FS 62 46 struct proc_dir_entry * rpc_proc_register(struct net *,struct rpc_stat *); 63 47 void rpc_proc_unregister(struct net *,const char *); 64 48 void rpc_proc_zero(const struct rpc_program *); ··· 53 69 void svc_seq_show(struct seq_file *, 54 70 const struct svc_stat *); 55 71 #else 72 + static inline int rpc_proc_init(struct net *net) 73 + { 74 + return 0; 75 + } 56 76 77 + static inline void rpc_proc_exit(struct net *net) 78 + { 79 + } 57 80 static inline struct proc_dir_entry *rpc_proc_register(struct net *net, struct rpc_stat *s) { return NULL; } 58 81 static inline void rpc_proc_unregister(struct net *net, const char *p) {} 59 82 static inline void rpc_proc_zero(const struct rpc_program *p) {}
+23 -29
include/linux/sunrpc/svc.h
··· 39 39 struct list_head sp_all_threads; /* all server threads */ 40 40 41 41 /* statistics on pool operation */ 42 + struct percpu_counter sp_messages_arrived; 42 43 struct percpu_counter sp_sockets_queued; 43 44 struct percpu_counter sp_threads_woken; 44 - struct percpu_counter sp_threads_timedout; 45 45 46 - #define SP_TASK_PENDING (0) /* still work to do even if no 47 - * xprt is queued. */ 48 - #define SP_CONGESTED (1) 49 46 unsigned long sp_flags; 50 47 } ____cacheline_aligned_in_smp; 48 + 49 + /* bits for sp_flags */ 50 + enum { 51 + SP_TASK_PENDING, /* still work to do even if no xprt is queued */ 52 + SP_CONGESTED, /* all threads are busy, none idle */ 53 + }; 54 + 51 55 52 56 /* 53 57 * RPC service. ··· 122 118 static inline void svc_put(struct svc_serv *serv) 123 119 { 124 120 kref_put(&serv->sv_refcnt, svc_destroy); 125 - } 126 - 127 - /** 128 - * svc_put_not_last - decrement non-final reference count on SUNRPC serv 129 - * @serv: the svc_serv to have count decremented 130 - * 131 - * Returns: %true is refcount was decremented. 132 - * 133 - * If the refcount is 1, it is not decremented and instead failure is reported. 
134 - */ 135 - static inline bool svc_put_not_last(struct svc_serv *serv) 136 - { 137 - return refcount_dec_not_one(&serv->sv_refcnt.refcount); 138 121 } 139 122 140 123 /* ··· 223 232 u32 rq_proc; /* procedure number */ 224 233 u32 rq_prot; /* IP protocol */ 225 234 int rq_cachetype; /* catering to nfsd */ 226 - #define RQ_SECURE (0) /* secure port */ 227 - #define RQ_LOCAL (1) /* local request */ 228 - #define RQ_USEDEFERRAL (2) /* use deferral */ 229 - #define RQ_DROPME (3) /* drop current reply */ 230 - #define RQ_SPLICE_OK (4) /* turned off in gss privacy 231 - * to prevent encrypting page 232 - * cache pages */ 233 - #define RQ_VICTIM (5) /* about to be shut down */ 234 - #define RQ_BUSY (6) /* request is busy */ 235 - #define RQ_DATA (7) /* request has data */ 236 235 unsigned long rq_flags; /* flags field */ 237 236 ktime_t rq_qtime; /* enqueue time */ 238 237 ··· 246 265 /* Catering to nfsd */ 247 266 struct auth_domain * rq_client; /* RPC peer info */ 248 267 struct auth_domain * rq_gssclient; /* "gss/"-style peer info */ 249 - struct svc_cacherep * rq_cacherep; /* cache info */ 250 268 struct task_struct *rq_task; /* service thread */ 251 269 struct net *rq_bc_net; /* pointer to backchannel's 252 270 * net namespace 253 271 */ 254 272 void ** rq_lease_breaker; /* The v4 client breaking a lease */ 273 + }; 274 + 275 + /* bits for rq_flags */ 276 + enum { 277 + RQ_SECURE, /* secure port */ 278 + RQ_LOCAL, /* local request */ 279 + RQ_USEDEFERRAL, /* use deferral */ 280 + RQ_DROPME, /* drop current reply */ 281 + RQ_SPLICE_OK, /* turned off in gss privacy to prevent 282 + * encrypting page cache pages */ 283 + RQ_VICTIM, /* about to be shut down */ 284 + RQ_BUSY, /* request is busy */ 285 + RQ_DATA, /* request has data */ 255 286 }; 256 287 257 288 #define SVC_NET(rqst) (rqst->rq_xprt ? 
rqst->rq_xprt->xpt_net : rqst->rq_bc_net) ··· 337 344 char * pg_name; /* service name */ 338 345 char * pg_class; /* class name: services sharing authentication */ 339 346 struct svc_stat * pg_stats; /* rpc statistics */ 340 - int (*pg_authenticate)(struct svc_rqst *); 347 + enum svc_auth_status (*pg_authenticate)(struct svc_rqst *rqstp); 341 348 __be32 (*pg_init_request)(struct svc_rqst *, 342 349 const struct svc_program *, 343 350 struct svc_process_info *); ··· 420 427 421 428 void svc_wake_up(struct svc_serv *); 422 429 void svc_reserve(struct svc_rqst *rqstp, int space); 430 + void svc_pool_wake_idle_thread(struct svc_pool *pool); 423 431 struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv); 424 432 char * svc_print_addr(struct svc_rqst *, char *, size_t); 425 433 const char * svc_proc_name(const struct svc_rqst *rqstp);
+21 -17
include/linux/sunrpc/svc_xprt.h
··· 56 56 struct list_head xpt_list; 57 57 struct list_head xpt_ready; 58 58 unsigned long xpt_flags; 59 - #define XPT_BUSY 0 /* enqueued/receiving */ 60 - #define XPT_CONN 1 /* conn pending */ 61 - #define XPT_CLOSE 2 /* dead or dying */ 62 - #define XPT_DATA 3 /* data pending */ 63 - #define XPT_TEMP 4 /* connected transport */ 64 - #define XPT_DEAD 6 /* transport closed */ 65 - #define XPT_CHNGBUF 7 /* need to change snd/rcv buf sizes */ 66 - #define XPT_DEFERRED 8 /* deferred request pending */ 67 - #define XPT_OLD 9 /* used for xprt aging mark+sweep */ 68 - #define XPT_LISTENER 10 /* listening endpoint */ 69 - #define XPT_CACHE_AUTH 11 /* cache auth info */ 70 - #define XPT_LOCAL 12 /* connection from loopback interface */ 71 - #define XPT_KILL_TEMP 13 /* call xpo_kill_temp_xprt before closing */ 72 - #define XPT_CONG_CTRL 14 /* has congestion control */ 73 - #define XPT_HANDSHAKE 15 /* xprt requests a handshake */ 74 - #define XPT_TLS_SESSION 16 /* transport-layer security established */ 75 - #define XPT_PEER_AUTH 17 /* peer has been authenticated */ 76 59 77 60 struct svc_serv *xpt_server; /* service for transport */ 78 61 atomic_t xpt_reserved; /* space on outq that is rsvd */ ··· 78 95 const struct cred *xpt_cred; 79 96 struct rpc_xprt *xpt_bc_xprt; /* NFSv4.1 backchannel */ 80 97 struct rpc_xprt_switch *xpt_bc_xps; /* NFSv4.1 backchannel */ 98 + }; 99 + 100 + /* flag bits for xpt_flags */ 101 + enum { 102 + XPT_BUSY, /* enqueued/receiving */ 103 + XPT_CONN, /* conn pending */ 104 + XPT_CLOSE, /* dead or dying */ 105 + XPT_DATA, /* data pending */ 106 + XPT_TEMP, /* connected transport */ 107 + XPT_DEAD, /* transport closed */ 108 + XPT_CHNGBUF, /* need to change snd/rcv buf sizes */ 109 + XPT_DEFERRED, /* deferred request pending */ 110 + XPT_OLD, /* used for xprt aging mark+sweep */ 111 + XPT_LISTENER, /* listening endpoint */ 112 + XPT_CACHE_AUTH, /* cache auth info */ 113 + XPT_LOCAL, /* connection from loopback interface */ 114 + XPT_KILL_TEMP, /* 
call xpo_kill_temp_xprt before closing */ 115 + XPT_CONG_CTRL, /* has congestion control */ 116 + XPT_HANDSHAKE, /* xprt requests a handshake */ 117 + XPT_TLS_SESSION, /* transport-layer security established */ 118 + XPT_PEER_AUTH, /* peer has been authenticated */ 81 119 }; 82 120 83 121 static inline void unregister_xpt_user(struct svc_xprt *xpt, struct svc_xpt_user *u)
+24 -29
include/linux/sunrpc/svcauth.h
··· 83 83 struct rcu_head rcu_head; 84 84 }; 85 85 86 + enum svc_auth_status { 87 + SVC_GARBAGE = 1, 88 + SVC_SYSERR, 89 + SVC_VALID, 90 + SVC_NEGATIVE, 91 + SVC_OK, 92 + SVC_DROP, 93 + SVC_CLOSE, 94 + SVC_DENIED, 95 + SVC_PENDING, 96 + SVC_COMPLETE, 97 + }; 98 + 86 99 /* 87 100 * Each authentication flavour registers an auth_ops 88 101 * structure. ··· 111 98 * is (probably) already in place. Certainly space is 112 99 * reserved for it. 113 100 * DROP - simply drop the request. It may have been deferred 101 + * CLOSE - like SVC_DROP, but request is definitely lost. 102 + * If there is a tcp connection, it should be closed. 114 103 * GARBAGE - rpc garbage_args error 115 104 * SYSERR - rpc system_err error 116 105 * DENIED - authp holds reason for denial. ··· 126 111 * 127 112 * release() is given a request after the procedure has been run. 128 113 * It should sign/encrypt the results if needed 129 - * It should return: 130 - * OK - the resbuf is ready to be sent 131 - * DROP - the reply should be quitely dropped 132 - * DENIED - authp holds a reason for MSG_DENIED 133 - * SYSERR - rpc system_err 134 114 * 135 115 * domain_release() 136 116 * This call releases a domain. 117 + * 137 118 * set_client() 138 119 * Givens a pending request (struct svc_rqst), finds and assigns 139 120 * an appropriate 'auth_domain' as the client. 
··· 138 127 char * name; 139 128 struct module *owner; 140 129 int flavour; 141 - int (*accept)(struct svc_rqst *rq); 142 - int (*release)(struct svc_rqst *rq); 143 - void (*domain_release)(struct auth_domain *); 144 - int (*set_client)(struct svc_rqst *rq); 145 - }; 146 130 147 - #define SVC_GARBAGE 1 148 - #define SVC_SYSERR 2 149 - #define SVC_VALID 3 150 - #define SVC_NEGATIVE 4 151 - #define SVC_OK 5 152 - #define SVC_DROP 6 153 - #define SVC_CLOSE 7 /* Like SVC_DROP, but request is definitely 154 - * lost so if there is a tcp connection, it 155 - * should be closed 156 - */ 157 - #define SVC_DENIED 8 158 - #define SVC_PENDING 9 159 - #define SVC_COMPLETE 10 131 + enum svc_auth_status (*accept)(struct svc_rqst *rqstp); 132 + int (*release)(struct svc_rqst *rqstp); 133 + void (*domain_release)(struct auth_domain *dom); 134 + enum svc_auth_status (*set_client)(struct svc_rqst *rqstp); 135 + }; 160 136 161 137 struct svc_xprt; 162 138 163 - extern int svc_authenticate(struct svc_rqst *rqstp); 139 + extern enum svc_auth_status svc_authenticate(struct svc_rqst *rqstp); 164 140 extern int svc_authorise(struct svc_rqst *rqstp); 165 - extern int svc_set_client(struct svc_rqst *rqstp); 141 + extern enum svc_auth_status svc_set_client(struct svc_rqst *rqstp); 166 142 extern int svc_auth_register(rpc_authflavor_t flavor, struct auth_ops *aops); 167 143 extern void svc_auth_unregister(rpc_authflavor_t flavor); 168 144 169 145 extern struct auth_domain *unix_domain_find(char *name); 170 146 extern void auth_domain_put(struct auth_domain *item); 171 - extern int auth_unix_add_addr(struct net *net, struct in6_addr *addr, struct auth_domain *dom); 172 147 extern struct auth_domain *auth_domain_lookup(char *name, struct auth_domain *new); 173 148 extern struct auth_domain *auth_domain_find(char *name); 174 - extern struct auth_domain *auth_unix_lookup(struct net *net, struct in6_addr *addr); 175 - extern int auth_unix_forget_old(struct auth_domain *dom); 176 149 extern void 
svcauth_unix_purge(struct net *net); 177 150 extern void svcauth_unix_info_release(struct svc_xprt *xpt); 178 - extern int svcauth_unix_set_client(struct svc_rqst *rqstp); 151 + extern enum svc_auth_status svcauth_unix_set_client(struct svc_rqst *rqstp); 179 152 180 153 extern int unix_gid_cache_create(struct net *net); 181 154 extern void unix_gid_cache_destroy(struct net *net);
+3 -6
include/linux/sunrpc/svcsock.h
··· 35 35 /* Total length of the data (not including fragment headers) 36 36 * received so far in the fragments making up this rpc: */ 37 37 u32 sk_datalen; 38 - /* Number of queued send requests */ 39 - atomic_t sk_sendqlen; 38 + 39 + struct page_frag_cache sk_frag_cache; 40 40 41 41 struct completion sk_handshake_done; 42 42 ··· 56 56 /* 57 57 * Function prototypes. 58 58 */ 59 - void svc_close_net(struct svc_serv *, struct net *); 60 - int svc_recv(struct svc_rqst *, long); 59 + void svc_recv(struct svc_rqst *rqstp); 61 60 void svc_send(struct svc_rqst *rqstp); 62 61 void svc_drop(struct svc_rqst *); 63 62 void svc_sock_update_bufs(struct svc_serv *serv); ··· 65 66 const struct cred *cred); 66 67 void svc_init_xprt_sock(void); 67 68 void svc_cleanup_xprt_sock(void); 68 - struct svc_xprt *svc_sock_create(struct svc_serv *serv, int prot); 69 - void svc_sock_destroy(struct svc_xprt *); 70 69 71 70 /* 72 71 * svc_makesock socket characteristics
+2
include/linux/sunrpc/xdr.h
··· 139 139 size_t xdr_buf_pagecount(const struct xdr_buf *buf); 140 140 int xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp); 141 141 void xdr_free_bvec(struct xdr_buf *buf); 142 + unsigned int xdr_buf_to_bvec(struct bio_vec *bvec, unsigned int bvec_size, 143 + const struct xdr_buf *xdr); 142 144 143 145 static inline __be32 *xdr_encode_array(__be32 *p, const void *s, unsigned int len) 144 146 {
+50 -30
include/trace/events/sunrpc.h
··· 1706 1706 TRACE_DEFINE_ENUM(SVC_PENDING); 1707 1707 TRACE_DEFINE_ENUM(SVC_COMPLETE); 1708 1708 1709 - #define svc_show_status(status) \ 1709 + #define show_svc_auth_status(status) \ 1710 1710 __print_symbolic(status, \ 1711 1711 { SVC_GARBAGE, "SVC_GARBAGE" }, \ 1712 1712 { SVC_SYSERR, "SVC_SYSERR" }, \ ··· 1743 1743 __entry->xid, __get_sockaddr(server), __get_sockaddr(client) 1744 1744 1745 1745 TRACE_EVENT_CONDITION(svc_authenticate, 1746 - TP_PROTO(const struct svc_rqst *rqst, int auth_res), 1746 + TP_PROTO( 1747 + const struct svc_rqst *rqst, 1748 + enum svc_auth_status auth_res 1749 + ), 1747 1750 1748 1751 TP_ARGS(rqst, auth_res), 1749 1752 ··· 1769 1766 TP_printk(SVC_RQST_ENDPOINT_FORMAT 1770 1767 " auth_res=%s auth_stat=%s", 1771 1768 SVC_RQST_ENDPOINT_VARARGS, 1772 - svc_show_status(__entry->svc_status), 1769 + show_svc_auth_status(__entry->svc_status), 1773 1770 rpc_show_auth_stat(__entry->auth_stat)) 1774 1771 ); 1775 1772 ··· 1921 1918 __get_str(procedure), __entry->execute) 1922 1919 ); 1923 1920 1921 + /* 1922 + * from include/linux/sunrpc/svc_xprt.h 1923 + */ 1924 + #define SVC_XPRT_FLAG_LIST \ 1925 + svc_xprt_flag(BUSY) \ 1926 + svc_xprt_flag(CONN) \ 1927 + svc_xprt_flag(CLOSE) \ 1928 + svc_xprt_flag(DATA) \ 1929 + svc_xprt_flag(TEMP) \ 1930 + svc_xprt_flag(DEAD) \ 1931 + svc_xprt_flag(CHNGBUF) \ 1932 + svc_xprt_flag(DEFERRED) \ 1933 + svc_xprt_flag(OLD) \ 1934 + svc_xprt_flag(LISTENER) \ 1935 + svc_xprt_flag(CACHE_AUTH) \ 1936 + svc_xprt_flag(LOCAL) \ 1937 + svc_xprt_flag(KILL_TEMP) \ 1938 + svc_xprt_flag(CONG_CTRL) \ 1939 + svc_xprt_flag(HANDSHAKE) \ 1940 + svc_xprt_flag(TLS_SESSION) \ 1941 + svc_xprt_flag_end(PEER_AUTH) 1942 + 1943 + #undef svc_xprt_flag 1944 + #undef svc_xprt_flag_end 1945 + #define svc_xprt_flag(x) TRACE_DEFINE_ENUM(XPT_##x); 1946 + #define svc_xprt_flag_end(x) TRACE_DEFINE_ENUM(XPT_##x); 1947 + 1948 + SVC_XPRT_FLAG_LIST 1949 + 1950 + #undef svc_xprt_flag 1951 + #undef svc_xprt_flag_end 1952 + #define svc_xprt_flag(x) { 
BIT(XPT_##x), #x }, 1953 + #define svc_xprt_flag_end(x) { BIT(XPT_##x), #x } 1954 + 1924 1955 #define show_svc_xprt_flags(flags) \ 1925 - __print_flags(flags, "|", \ 1926 - { BIT(XPT_BUSY), "BUSY" }, \ 1927 - { BIT(XPT_CONN), "CONN" }, \ 1928 - { BIT(XPT_CLOSE), "CLOSE" }, \ 1929 - { BIT(XPT_DATA), "DATA" }, \ 1930 - { BIT(XPT_TEMP), "TEMP" }, \ 1931 - { BIT(XPT_DEAD), "DEAD" }, \ 1932 - { BIT(XPT_CHNGBUF), "CHNGBUF" }, \ 1933 - { BIT(XPT_DEFERRED), "DEFERRED" }, \ 1934 - { BIT(XPT_OLD), "OLD" }, \ 1935 - { BIT(XPT_LISTENER), "LISTENER" }, \ 1936 - { BIT(XPT_CACHE_AUTH), "CACHE_AUTH" }, \ 1937 - { BIT(XPT_LOCAL), "LOCAL" }, \ 1938 - { BIT(XPT_KILL_TEMP), "KILL_TEMP" }, \ 1939 - { BIT(XPT_CONG_CTRL), "CONG_CTRL" }, \ 1940 - { BIT(XPT_HANDSHAKE), "HANDSHAKE" }, \ 1941 - { BIT(XPT_TLS_SESSION), "TLS_SESSION" }, \ 1942 - { BIT(XPT_PEER_AUTH), "PEER_AUTH" }) 1956 + __print_flags(flags, "|", SVC_XPRT_FLAG_LIST) 1943 1957 1944 1958 TRACE_EVENT(svc_xprt_create_err, 1945 1959 TP_PROTO( ··· 2014 1994 TRACE_EVENT(svc_xprt_enqueue, 2015 1995 TP_PROTO( 2016 1996 const struct svc_xprt *xprt, 2017 - const struct svc_rqst *rqst 1997 + unsigned long flags 2018 1998 ), 2019 1999 2020 - TP_ARGS(xprt, rqst), 2000 + TP_ARGS(xprt, flags), 2021 2001 2022 2002 TP_STRUCT__entry( 2023 2003 SVC_XPRT_ENDPOINT_FIELDS(xprt) 2024 - 2025 - __field(int, pid) 2026 2004 ), 2027 2005 2028 2006 TP_fast_assign( 2029 - SVC_XPRT_ENDPOINT_ASSIGNMENTS(xprt); 2030 - 2031 - __entry->pid = rqst? rqst->rq_task->pid : 0; 2007 + __assign_sockaddr(server, &xprt->xpt_local, 2008 + xprt->xpt_locallen); 2009 + __assign_sockaddr(client, &xprt->xpt_remote, 2010 + xprt->xpt_remotelen); 2011 + __entry->flags = flags; 2012 + __entry->netns_ino = xprt->xpt_net->ns.inum; 2032 2013 ), 2033 2014 2034 - TP_printk(SVC_XPRT_ENDPOINT_FORMAT " pid=%d", 2035 - SVC_XPRT_ENDPOINT_VARARGS, __entry->pid) 2015 + TP_printk(SVC_XPRT_ENDPOINT_FORMAT, SVC_XPRT_ENDPOINT_VARARGS) 2036 2016 ); 2037 2017 2038 2018 TRACE_EVENT(svc_xprt_dequeue,
-1
net/sunrpc/.kunitconfig
··· 23 23 CONFIG_SUNRPC=y 24 24 CONFIG_SUNRPC_GSS=y 25 25 CONFIG_RPCSEC_GSS_KRB5=y 26 - CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_DES=y 27 26 CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1=y 28 27 CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_CAMELLIA=y 29 28 CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA2=y
-35
net/sunrpc/Kconfig
··· 34 34 35 35 If unsure, say Y. 36 36 37 - config RPCSEC_GSS_KRB5_SIMPLIFIED 38 - bool 39 - depends on RPCSEC_GSS_KRB5 40 - 41 - config RPCSEC_GSS_KRB5_CRYPTOSYSTEM 42 - bool 43 - depends on RPCSEC_GSS_KRB5 44 - 45 - config RPCSEC_GSS_KRB5_ENCTYPES_DES 46 - bool "Enable Kerberos enctypes based on DES (deprecated)" 47 - depends on RPCSEC_GSS_KRB5 48 - depends on CRYPTO_CBC && CRYPTO_CTS && CRYPTO_ECB 49 - depends on CRYPTO_HMAC && CRYPTO_MD5 && CRYPTO_SHA1 50 - depends on CRYPTO_DES 51 - default n 52 - select RPCSEC_GSS_KRB5_SIMPLIFIED 53 - help 54 - Choose Y to enable the use of deprecated Kerberos 5 55 - encryption types that utilize Data Encryption Standard 56 - (DES) based ciphers. These include des-cbc-md5, 57 - des-cbc-crc, and des-cbc-md4, which were deprecated by 58 - RFC 6649, and des3-cbc-sha1, which was deprecated by RFC 59 - 8429. 60 - 61 - These encryption types are known to be insecure, therefore 62 - the default setting of this option is N. Support for these 63 - encryption types is available only for compatibility with 64 - legacy NFS client and server implementations. 65 - 66 - Removal of support is planned for a subsequent kernel 67 - release. 
68 - 69 37 config RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1 70 38 bool "Enable Kerberos enctypes based on AES and SHA-1" 71 39 depends on RPCSEC_GSS_KRB5 ··· 41 73 depends on CRYPTO_HMAC && CRYPTO_SHA1 42 74 depends on CRYPTO_AES 43 75 default y 44 - select RPCSEC_GSS_KRB5_CRYPTOSYSTEM 45 76 help 46 77 Choose Y to enable the use of Kerberos 5 encryption types 47 78 that utilize Advanced Encryption Standard (AES) ciphers and ··· 53 86 depends on CRYPTO_CBC && CRYPTO_CTS && CRYPTO_CAMELLIA 54 87 depends on CRYPTO_CMAC 55 88 default n 56 - select RPCSEC_GSS_KRB5_CRYPTOSYSTEM 57 89 help 58 90 Choose Y to enable the use of Kerberos 5 encryption types 59 91 that utilize Camellia ciphers (RFC 3713) and CMAC digests ··· 66 100 depends on CRYPTO_HMAC && CRYPTO_SHA256 && CRYPTO_SHA512 67 101 depends on CRYPTO_AES 68 102 default n 69 - select RPCSEC_GSS_KRB5_CRYPTOSYSTEM 70 103 help 71 104 Choose Y to enable the use of Kerberos 5 encryption types 72 105 that utilize Advanced Encryption Standard (AES) ciphers and
+1 -1
net/sunrpc/auth_gss/Makefile
··· 12 12 obj-$(CONFIG_RPCSEC_GSS_KRB5) += rpcsec_gss_krb5.o 13 13 14 14 rpcsec_gss_krb5-y := gss_krb5_mech.o gss_krb5_seal.o gss_krb5_unseal.o \ 15 - gss_krb5_seqnum.o gss_krb5_wrap.o gss_krb5_crypto.o gss_krb5_keys.o 15 + gss_krb5_wrap.o gss_krb5_crypto.o gss_krb5_keys.o 16 16 17 17 obj-$(CONFIG_RPCSEC_GSS_KRB5_KUNIT_TEST) += gss_krb5_test.o
-23
net/sunrpc/auth_gss/gss_krb5_internal.h
··· 33 33 const u32 Ke_length; /* encryption subkey length, in octets */ 34 34 const u32 Ki_length; /* integrity subkey length, in octets */ 35 35 36 - int (*import_ctx)(struct krb5_ctx *ctx, gfp_t gfp_mask); 37 36 int (*derive_key)(const struct gss_krb5_enctype *gk5e, 38 37 const struct xdr_netobj *in, 39 38 struct xdr_netobj *out, ··· 84 85 * GSS Kerberos 5 mechanism Per-Message calls. 85 86 */ 86 87 87 - u32 gss_krb5_get_mic_v1(struct krb5_ctx *ctx, struct xdr_buf *text, 88 - struct xdr_netobj *token); 89 88 u32 gss_krb5_get_mic_v2(struct krb5_ctx *ctx, struct xdr_buf *text, 90 89 struct xdr_netobj *token); 91 90 92 - u32 gss_krb5_verify_mic_v1(struct krb5_ctx *ctx, struct xdr_buf *message_buffer, 93 - struct xdr_netobj *read_token); 94 91 u32 gss_krb5_verify_mic_v2(struct krb5_ctx *ctx, struct xdr_buf *message_buffer, 95 92 struct xdr_netobj *read_token); 96 93 97 - u32 gss_krb5_wrap_v1(struct krb5_ctx *kctx, int offset, 98 - struct xdr_buf *buf, struct page **pages); 99 94 u32 gss_krb5_wrap_v2(struct krb5_ctx *kctx, int offset, 100 95 struct xdr_buf *buf, struct page **pages); 101 96 102 - u32 gss_krb5_unwrap_v1(struct krb5_ctx *kctx, int offset, int len, 103 - struct xdr_buf *buf, unsigned int *slack, 104 - unsigned int *align); 105 97 u32 gss_krb5_unwrap_v2(struct krb5_ctx *kctx, int offset, int len, 106 98 struct xdr_buf *buf, unsigned int *slack, 107 99 unsigned int *align); ··· 102 112 */ 103 113 104 114 /* Key Derivation Functions */ 105 - 106 - int krb5_derive_key_v1(const struct gss_krb5_enctype *gk5e, 107 - const struct xdr_netobj *inkey, 108 - struct xdr_netobj *outkey, 109 - const struct xdr_netobj *label, 110 - gfp_t gfp_mask); 111 115 112 116 int krb5_derive_key_v2(const struct gss_krb5_enctype *gk5e, 113 117 const struct xdr_netobj *inkey, ··· 152 168 label_data[4] = seed; 153 169 return gk5e->derive_key(gk5e, inkey, outkey, &label, gfp_mask); 154 170 } 155 - 156 - s32 krb5_make_seq_num(struct krb5_ctx *kctx, struct crypto_sync_skcipher *key, 157 
- int direction, u32 seqnum, unsigned char *cksum, 158 - unsigned char *buf); 159 - 160 - s32 krb5_get_seq_num(struct krb5_ctx *kctx, unsigned char *cksum, 161 - unsigned char *buf, int *direction, u32 *seqnum); 162 171 163 172 void krb5_make_confounder(u8 *p, int conflen); 164 173
-84
net/sunrpc/auth_gss/gss_krb5_keys.c
··· 222 222 return ret; 223 223 } 224 224 225 - #define smask(step) ((1<<step)-1) 226 - #define pstep(x, step) (((x)&smask(step))^(((x)>>step)&smask(step))) 227 - #define parity_char(x) pstep(pstep(pstep((x), 4), 2), 1) 228 - 229 - static void mit_des_fixup_key_parity(u8 key[8]) 230 - { 231 - int i; 232 - for (i = 0; i < 8; i++) { 233 - key[i] &= 0xfe; 234 - key[i] |= 1^parity_char(key[i]); 235 - } 236 - } 237 - 238 - static int krb5_random_to_key_v1(const struct gss_krb5_enctype *gk5e, 239 - struct xdr_netobj *randombits, 240 - struct xdr_netobj *key) 241 - { 242 - int i, ret = -EINVAL; 243 - 244 - if (key->len != 24) { 245 - dprintk("%s: key->len is %d\n", __func__, key->len); 246 - goto err_out; 247 - } 248 - if (randombits->len != 21) { 249 - dprintk("%s: randombits->len is %d\n", 250 - __func__, randombits->len); 251 - goto err_out; 252 - } 253 - 254 - /* take the seven bytes, move them around into the top 7 bits of the 255 - 8 key bytes, then compute the parity bits. Do this three times. */ 256 - 257 - for (i = 0; i < 3; i++) { 258 - memcpy(key->data + i*8, randombits->data + i*7, 7); 259 - key->data[i*8+7] = (((key->data[i*8]&1)<<1) | 260 - ((key->data[i*8+1]&1)<<2) | 261 - ((key->data[i*8+2]&1)<<3) | 262 - ((key->data[i*8+3]&1)<<4) | 263 - ((key->data[i*8+4]&1)<<5) | 264 - ((key->data[i*8+5]&1)<<6) | 265 - ((key->data[i*8+6]&1)<<7)); 266 - 267 - mit_des_fixup_key_parity(key->data + i*8); 268 - } 269 - ret = 0; 270 - err_out: 271 - return ret; 272 - } 273 - 274 - /** 275 - * krb5_derive_key_v1 - Derive a subkey for an RFC 3961 enctype 276 - * @gk5e: Kerberos 5 enctype profile 277 - * @inkey: base protocol key 278 - * @outkey: OUT: derived key 279 - * @label: subkey usage label 280 - * @gfp_mask: memory allocation control flags 281 - * 282 - * Caller sets @outkey->len to the desired length of the derived key. 283 - * 284 - * On success, returns 0 and fills in @outkey. A negative errno value 285 - * is returned on failure. 
286 - */ 287 - int krb5_derive_key_v1(const struct gss_krb5_enctype *gk5e, 288 - const struct xdr_netobj *inkey, 289 - struct xdr_netobj *outkey, 290 - const struct xdr_netobj *label, 291 - gfp_t gfp_mask) 292 - { 293 - struct xdr_netobj inblock; 294 - int ret; 295 - 296 - inblock.len = gk5e->keybytes; 297 - inblock.data = kmalloc(inblock.len, gfp_mask); 298 - if (!inblock.data) 299 - return -ENOMEM; 300 - 301 - ret = krb5_DK(gk5e, inkey, inblock.data, label, gfp_mask); 302 - if (!ret) 303 - ret = krb5_random_to_key_v1(gk5e, &inblock, outkey); 304 - 305 - kfree_sensitive(inblock.data); 306 - return ret; 307 - } 308 - 309 225 /* 310 226 * This is the identity function, with some sanity checking. 311 227 */
+2 -255
net/sunrpc/auth_gss/gss_krb5_mech.c
··· 30 30 31 31 static struct gss_api_mech gss_kerberos_mech; 32 32 33 - #if defined(CONFIG_RPCSEC_GSS_KRB5_SIMPLIFIED) 34 - static int gss_krb5_import_ctx_des(struct krb5_ctx *ctx, gfp_t gfp_mask); 35 - static int gss_krb5_import_ctx_v1(struct krb5_ctx *ctx, gfp_t gfp_mask); 36 - #endif 37 - #if defined(CONFIG_RPCSEC_GSS_KRB5_CRYPTOSYSTEM) 38 - static int gss_krb5_import_ctx_v2(struct krb5_ctx *ctx, gfp_t gfp_mask); 39 - #endif 40 - 41 33 static const struct gss_krb5_enctype supported_gss_krb5_enctypes[] = { 42 - #if defined(CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_DES) 43 - /* 44 - * DES (All DES enctypes are mapped to the same gss functionality) 45 - */ 46 - { 47 - .etype = ENCTYPE_DES_CBC_RAW, 48 - .ctype = CKSUMTYPE_RSA_MD5, 49 - .name = "des-cbc-crc", 50 - .encrypt_name = "cbc(des)", 51 - .cksum_name = "md5", 52 - .import_ctx = gss_krb5_import_ctx_des, 53 - .get_mic = gss_krb5_get_mic_v1, 54 - .verify_mic = gss_krb5_verify_mic_v1, 55 - .wrap = gss_krb5_wrap_v1, 56 - .unwrap = gss_krb5_unwrap_v1, 57 - .signalg = SGN_ALG_DES_MAC_MD5, 58 - .sealalg = SEAL_ALG_DES, 59 - .keybytes = 7, 60 - .keylength = 8, 61 - .cksumlength = 8, 62 - .keyed_cksum = 0, 63 - }, 64 - /* 65 - * 3DES 66 - */ 67 - { 68 - .etype = ENCTYPE_DES3_CBC_RAW, 69 - .ctype = CKSUMTYPE_HMAC_SHA1_DES3, 70 - .name = "des3-hmac-sha1", 71 - .encrypt_name = "cbc(des3_ede)", 72 - .cksum_name = "hmac(sha1)", 73 - .import_ctx = gss_krb5_import_ctx_v1, 74 - .derive_key = krb5_derive_key_v1, 75 - .get_mic = gss_krb5_get_mic_v1, 76 - .verify_mic = gss_krb5_verify_mic_v1, 77 - .wrap = gss_krb5_wrap_v1, 78 - .unwrap = gss_krb5_unwrap_v1, 79 - .signalg = SGN_ALG_HMAC_SHA1_DES3_KD, 80 - .sealalg = SEAL_ALG_DES3KD, 81 - .keybytes = 21, 82 - .keylength = 24, 83 - .cksumlength = 20, 84 - .keyed_cksum = 1, 85 - }, 86 - #endif 87 - 88 34 #if defined(CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1) 89 35 /* 90 36 * AES-128 with SHA-1 (RFC 3962) ··· 42 96 .encrypt_name = "cts(cbc(aes))", 43 97 .aux_cipher = "cbc(aes)", 44 98 
.cksum_name = "hmac(sha1)", 45 - .import_ctx = gss_krb5_import_ctx_v2, 46 99 .derive_key = krb5_derive_key_v2, 47 100 .encrypt = gss_krb5_aes_encrypt, 48 101 .decrypt = gss_krb5_aes_decrypt, ··· 71 126 .encrypt_name = "cts(cbc(aes))", 72 127 .aux_cipher = "cbc(aes)", 73 128 .cksum_name = "hmac(sha1)", 74 - .import_ctx = gss_krb5_import_ctx_v2, 75 129 .derive_key = krb5_derive_key_v2, 76 130 .encrypt = gss_krb5_aes_encrypt, 77 131 .decrypt = gss_krb5_aes_decrypt, ··· 110 166 .Ke_length = BITS2OCTETS(128), 111 167 .Ki_length = BITS2OCTETS(128), 112 168 113 - .import_ctx = gss_krb5_import_ctx_v2, 114 169 .derive_key = krb5_kdf_feedback_cmac, 115 170 .encrypt = gss_krb5_aes_encrypt, 116 171 .decrypt = gss_krb5_aes_decrypt, ··· 136 193 .Ke_length = BITS2OCTETS(256), 137 194 .Ki_length = BITS2OCTETS(256), 138 195 139 - .import_ctx = gss_krb5_import_ctx_v2, 140 196 .derive_key = krb5_kdf_feedback_cmac, 141 197 .encrypt = gss_krb5_aes_encrypt, 142 198 .decrypt = gss_krb5_aes_decrypt, ··· 165 223 .Ke_length = BITS2OCTETS(128), 166 224 .Ki_length = BITS2OCTETS(128), 167 225 168 - .import_ctx = gss_krb5_import_ctx_v2, 169 226 .derive_key = krb5_kdf_hmac_sha2, 170 227 .encrypt = krb5_etm_encrypt, 171 228 .decrypt = krb5_etm_decrypt, ··· 191 250 .Ke_length = BITS2OCTETS(256), 192 251 .Ki_length = BITS2OCTETS(192), 193 252 194 - .import_ctx = gss_krb5_import_ctx_v2, 195 253 .derive_key = krb5_kdf_hmac_sha2, 196 254 .encrypt = krb5_etm_encrypt, 197 255 .decrypt = krb5_etm_decrypt, ··· 223 283 #if defined(CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1) 224 284 ENCTYPE_AES256_CTS_HMAC_SHA1_96, 225 285 ENCTYPE_AES128_CTS_HMAC_SHA1_96, 226 - #endif 227 - #if defined(CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_DES) 228 - ENCTYPE_DES3_CBC_SHA1, 229 - ENCTYPE_DES_CBC_MD5, 230 - ENCTYPE_DES_CBC_CRC, 231 - ENCTYPE_DES_CBC_MD4, 232 286 #endif 233 287 }; 234 288 size_t total, i; ··· 262 328 return NULL; 263 329 } 264 330 EXPORT_SYMBOL_IF_KUNIT(gss_krb5_lookup_enctype); 265 - 266 - static struct 
crypto_sync_skcipher * 267 - gss_krb5_alloc_cipher_v1(struct krb5_ctx *ctx, struct xdr_netobj *key) 268 - { 269 - struct crypto_sync_skcipher *tfm; 270 - 271 - tfm = crypto_alloc_sync_skcipher(ctx->gk5e->encrypt_name, 0, 0); 272 - if (IS_ERR(tfm)) 273 - return NULL; 274 - if (crypto_sync_skcipher_setkey(tfm, key->data, key->len)) { 275 - crypto_free_sync_skcipher(tfm); 276 - return NULL; 277 - } 278 - return tfm; 279 - } 280 - 281 - static inline const void * 282 - get_key(const void *p, const void *end, 283 - struct krb5_ctx *ctx, struct crypto_sync_skcipher **res) 284 - { 285 - struct crypto_sync_skcipher *tfm; 286 - struct xdr_netobj key; 287 - int alg; 288 - 289 - p = simple_get_bytes(p, end, &alg, sizeof(alg)); 290 - if (IS_ERR(p)) 291 - goto out_err; 292 - switch (alg) { 293 - case ENCTYPE_DES_CBC_CRC: 294 - case ENCTYPE_DES_CBC_MD4: 295 - case ENCTYPE_DES_CBC_MD5: 296 - /* Map all these key types to ENCTYPE_DES_CBC_RAW */ 297 - alg = ENCTYPE_DES_CBC_RAW; 298 - break; 299 - } 300 - if (!gss_krb5_lookup_enctype(alg)) { 301 - pr_warn("gss_krb5: unsupported enctype: %d\n", alg); 302 - goto out_err_inval; 303 - } 304 - 305 - p = simple_get_netobj(p, end, &key); 306 - if (IS_ERR(p)) 307 - goto out_err; 308 - tfm = gss_krb5_alloc_cipher_v1(ctx, &key); 309 - kfree(key.data); 310 - if (!tfm) { 311 - pr_warn("gss_krb5: failed to initialize cipher '%s'\n", 312 - ctx->gk5e->encrypt_name); 313 - goto out_err_inval; 314 - } 315 - *res = tfm; 316 - 317 - return p; 318 - 319 - out_err_inval: 320 - p = ERR_PTR(-EINVAL); 321 - out_err: 322 - return p; 323 - } 324 - 325 - static int 326 - gss_import_v1_context(const void *p, const void *end, struct krb5_ctx *ctx) 327 - { 328 - u32 seq_send; 329 - int tmp; 330 - u32 time32; 331 - 332 - p = simple_get_bytes(p, end, &ctx->initiate, sizeof(ctx->initiate)); 333 - if (IS_ERR(p)) 334 - goto out_err; 335 - 336 - /* Old format supports only DES! 
Any other enctype uses new format */ 337 - ctx->enctype = ENCTYPE_DES_CBC_RAW; 338 - 339 - ctx->gk5e = gss_krb5_lookup_enctype(ctx->enctype); 340 - if (ctx->gk5e == NULL) { 341 - p = ERR_PTR(-EINVAL); 342 - goto out_err; 343 - } 344 - 345 - /* The downcall format was designed before we completely understood 346 - * the uses of the context fields; so it includes some stuff we 347 - * just give some minimal sanity-checking, and some we ignore 348 - * completely (like the next twenty bytes): */ 349 - if (unlikely(p + 20 > end || p + 20 < p)) { 350 - p = ERR_PTR(-EFAULT); 351 - goto out_err; 352 - } 353 - p += 20; 354 - p = simple_get_bytes(p, end, &tmp, sizeof(tmp)); 355 - if (IS_ERR(p)) 356 - goto out_err; 357 - if (tmp != SGN_ALG_DES_MAC_MD5) { 358 - p = ERR_PTR(-ENOSYS); 359 - goto out_err; 360 - } 361 - p = simple_get_bytes(p, end, &tmp, sizeof(tmp)); 362 - if (IS_ERR(p)) 363 - goto out_err; 364 - if (tmp != SEAL_ALG_DES) { 365 - p = ERR_PTR(-ENOSYS); 366 - goto out_err; 367 - } 368 - p = simple_get_bytes(p, end, &time32, sizeof(time32)); 369 - if (IS_ERR(p)) 370 - goto out_err; 371 - /* unsigned 32-bit time overflows in year 2106 */ 372 - ctx->endtime = (time64_t)time32; 373 - p = simple_get_bytes(p, end, &seq_send, sizeof(seq_send)); 374 - if (IS_ERR(p)) 375 - goto out_err; 376 - atomic_set(&ctx->seq_send, seq_send); 377 - p = simple_get_netobj(p, end, &ctx->mech_used); 378 - if (IS_ERR(p)) 379 - goto out_err; 380 - p = get_key(p, end, ctx, &ctx->enc); 381 - if (IS_ERR(p)) 382 - goto out_err_free_mech; 383 - p = get_key(p, end, ctx, &ctx->seq); 384 - if (IS_ERR(p)) 385 - goto out_err_free_key1; 386 - if (p != end) { 387 - p = ERR_PTR(-EFAULT); 388 - goto out_err_free_key2; 389 - } 390 - 391 - return 0; 392 - 393 - out_err_free_key2: 394 - crypto_free_sync_skcipher(ctx->seq); 395 - out_err_free_key1: 396 - crypto_free_sync_skcipher(ctx->enc); 397 - out_err_free_mech: 398 - kfree(ctx->mech_used.data); 399 - out_err: 400 - return PTR_ERR(p); 401 - } 402 - 403 - #if 
defined(CONFIG_RPCSEC_GSS_KRB5_SIMPLIFIED) 404 - static int 405 - gss_krb5_import_ctx_des(struct krb5_ctx *ctx, gfp_t gfp_mask) 406 - { 407 - return -EINVAL; 408 - } 409 - 410 - static int 411 - gss_krb5_import_ctx_v1(struct krb5_ctx *ctx, gfp_t gfp_mask) 412 - { 413 - struct xdr_netobj keyin, keyout; 414 - 415 - keyin.data = ctx->Ksess; 416 - keyin.len = ctx->gk5e->keylength; 417 - 418 - ctx->seq = gss_krb5_alloc_cipher_v1(ctx, &keyin); 419 - if (ctx->seq == NULL) 420 - goto out_err; 421 - ctx->enc = gss_krb5_alloc_cipher_v1(ctx, &keyin); 422 - if (ctx->enc == NULL) 423 - goto out_free_seq; 424 - 425 - /* derive cksum */ 426 - keyout.data = ctx->cksum; 427 - keyout.len = ctx->gk5e->keylength; 428 - if (krb5_derive_key(ctx, &keyin, &keyout, KG_USAGE_SIGN, 429 - KEY_USAGE_SEED_CHECKSUM, gfp_mask)) 430 - goto out_free_enc; 431 - 432 - return 0; 433 - 434 - out_free_enc: 435 - crypto_free_sync_skcipher(ctx->enc); 436 - out_free_seq: 437 - crypto_free_sync_skcipher(ctx->seq); 438 - out_err: 439 - return -EINVAL; 440 - } 441 - #endif 442 - 443 - #if defined(CONFIG_RPCSEC_GSS_KRB5_CRYPTOSYSTEM) 444 331 445 332 static struct crypto_sync_skcipher * 446 333 gss_krb5_alloc_cipher_v2(const char *cname, const struct xdr_netobj *key) ··· 391 636 goto out; 392 637 } 393 638 394 - #endif 395 - 396 639 static int 397 640 gss_import_v2_context(const void *p, const void *end, struct krb5_ctx *ctx, 398 641 gfp_t gfp_mask) ··· 424 671 p = simple_get_bytes(p, end, &ctx->enctype, sizeof(ctx->enctype)); 425 672 if (IS_ERR(p)) 426 673 goto out_err; 427 - /* Map ENCTYPE_DES3_CBC_SHA1 to ENCTYPE_DES3_CBC_RAW */ 428 - if (ctx->enctype == ENCTYPE_DES3_CBC_SHA1) 429 - ctx->enctype = ENCTYPE_DES3_CBC_RAW; 430 674 ctx->gk5e = gss_krb5_lookup_enctype(ctx->enctype); 431 675 if (ctx->gk5e == NULL) { 432 676 dprintk("gss_kerberos_mech: unsupported krb5 enctype %u\n", ··· 450 700 } 451 701 ctx->mech_used.len = gss_kerberos_mech.gm_oid.len; 452 702 453 - return ctx->gk5e->import_ctx(ctx, gfp_mask); 
703 + return gss_krb5_import_ctx_v2(ctx, gfp_mask); 454 704 455 705 out_err: 456 706 return PTR_ERR(p); ··· 468 718 if (ctx == NULL) 469 719 return -ENOMEM; 470 720 471 - if (len == 85) 472 - ret = gss_import_v1_context(p, end, ctx); 473 - else 474 - ret = gss_import_v2_context(p, end, ctx, gfp_mask); 721 + ret = gss_import_v2_context(p, end, ctx, gfp_mask); 475 722 memzero_explicit(&ctx->Ksess, sizeof(ctx->Ksess)); 476 723 if (ret) { 477 724 kfree(ctx);
-69
net/sunrpc/auth_gss/gss_krb5_seal.c
··· 71 71 # define RPCDBG_FACILITY RPCDBG_AUTH 72 72 #endif 73 73 74 - #if defined(CONFIG_RPCSEC_GSS_KRB5_SIMPLIFIED) 75 - 76 - static void * 77 - setup_token(struct krb5_ctx *ctx, struct xdr_netobj *token) 78 - { 79 - u16 *ptr; 80 - void *krb5_hdr; 81 - int body_size = GSS_KRB5_TOK_HDR_LEN + ctx->gk5e->cksumlength; 82 - 83 - token->len = g_token_size(&ctx->mech_used, body_size); 84 - 85 - ptr = (u16 *)token->data; 86 - g_make_token_header(&ctx->mech_used, body_size, (unsigned char **)&ptr); 87 - 88 - /* ptr now at start of header described in rfc 1964, section 1.2.1: */ 89 - krb5_hdr = ptr; 90 - *ptr++ = KG_TOK_MIC_MSG; 91 - /* 92 - * signalg is stored as if it were converted from LE to host endian, even 93 - * though it's an opaque pair of bytes according to the RFC. 94 - */ 95 - *ptr++ = (__force u16)cpu_to_le16(ctx->gk5e->signalg); 96 - *ptr++ = SEAL_ALG_NONE; 97 - *ptr = 0xffff; 98 - 99 - return krb5_hdr; 100 - } 101 - 102 - u32 103 - gss_krb5_get_mic_v1(struct krb5_ctx *ctx, struct xdr_buf *text, 104 - struct xdr_netobj *token) 105 - { 106 - char cksumdata[GSS_KRB5_MAX_CKSUM_LEN]; 107 - struct xdr_netobj md5cksum = {.len = sizeof(cksumdata), 108 - .data = cksumdata}; 109 - void *ptr; 110 - time64_t now; 111 - u32 seq_send; 112 - u8 *cksumkey; 113 - 114 - dprintk("RPC: %s\n", __func__); 115 - BUG_ON(ctx == NULL); 116 - 117 - now = ktime_get_real_seconds(); 118 - 119 - ptr = setup_token(ctx, token); 120 - 121 - if (ctx->gk5e->keyed_cksum) 122 - cksumkey = ctx->cksum; 123 - else 124 - cksumkey = NULL; 125 - 126 - if (make_checksum(ctx, ptr, 8, text, 0, cksumkey, 127 - KG_USAGE_SIGN, &md5cksum)) 128 - return GSS_S_FAILURE; 129 - 130 - memcpy(ptr + GSS_KRB5_TOK_HDR_LEN, md5cksum.data, md5cksum.len); 131 - 132 - seq_send = atomic_fetch_inc(&ctx->seq_send); 133 - 134 - if (krb5_make_seq_num(ctx, ctx->seq, ctx->initiate ? 0 : 0xff, 135 - seq_send, ptr + GSS_KRB5_TOK_HDR_LEN, ptr + 8)) 136 - return GSS_S_FAILURE; 137 - 138 - return (ctx->endtime < now) ? 
GSS_S_CONTEXT_EXPIRED : GSS_S_COMPLETE; 139 - } 140 - 141 - #endif 142 - 143 74 static void * 144 75 setup_token_v2(struct krb5_ctx *ctx, struct xdr_netobj *token) 145 76 {
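The removed setup_token() built the fixed 8-byte Kerberos v1 MIC token header described in RFC 1964, section 1.2.1: a big-endian TOK_ID, a little-endian SGN_ALG, SEAL_ALG_NONE (a MIC does not seal), and a 0xffff filler. A minimal user-space sketch of that byte layout (the constants are reproduced here for illustration and are not taken from kernel headers; this is not the kernel implementation):

```c
#include <assert.h>
#include <stdint.h>

#define KG_TOK_MIC_MSG  0x0101  /* RFC 1964 MIC token TOK_ID */
#define SEAL_ALG_NONE   0xffff

/* Write the fixed 8-byte v1 MIC token header: TOK_ID (big-endian),
 * SGN_ALG (little-endian on the wire), SEAL_ALG, then filler bytes. */
static void write_mic_hdr_v1(uint8_t *p, uint16_t signalg)
{
	p[0] = (KG_TOK_MIC_MSG >> 8) & 0xff;
	p[1] = KG_TOK_MIC_MSG & 0xff;
	p[2] = signalg & 0xff;
	p[3] = (signalg >> 8) & 0xff;
	p[4] = SEAL_ALG_NONE & 0xff;
	p[5] = (SEAL_ALG_NONE >> 8) & 0xff;
	p[6] = 0xff;
	p[7] = 0xff;
}
```

The checksum over the message is then copied in directly after this header, followed by the encrypted sequence number, which is why both sides index the token with GSS_KRB5_TOK_HDR_LEN.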
-106
net/sunrpc/auth_gss/gss_krb5_seqnum.c
··· 1 - /* 2 - * linux/net/sunrpc/gss_krb5_seqnum.c 3 - * 4 - * Adapted from MIT Kerberos 5-1.2.1 lib/gssapi/krb5/util_seqnum.c 5 - * 6 - * Copyright (c) 2000 The Regents of the University of Michigan. 7 - * All rights reserved. 8 - * 9 - * Andy Adamson <andros@umich.edu> 10 - */ 11 - 12 - /* 13 - * Copyright 1993 by OpenVision Technologies, Inc. 14 - * 15 - * Permission to use, copy, modify, distribute, and sell this software 16 - * and its documentation for any purpose is hereby granted without fee, 17 - * provided that the above copyright notice appears in all copies and 18 - * that both that copyright notice and this permission notice appear in 19 - * supporting documentation, and that the name of OpenVision not be used 20 - * in advertising or publicity pertaining to distribution of the software 21 - * without specific, written prior permission. OpenVision makes no 22 - * representations about the suitability of this software for any 23 - * purpose. It is provided "as is" without express or implied warranty. 24 - * 25 - * OPENVISION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, 26 - * INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO 27 - * EVENT SHALL OPENVISION BE LIABLE FOR ANY SPECIAL, INDIRECT OR 28 - * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF 29 - * USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR 30 - * OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR 31 - * PERFORMANCE OF THIS SOFTWARE. 
32 - */ 33 - 34 - #include <crypto/skcipher.h> 35 - #include <linux/types.h> 36 - #include <linux/sunrpc/gss_krb5.h> 37 - 38 - #include "gss_krb5_internal.h" 39 - 40 - #if IS_ENABLED(CONFIG_SUNRPC_DEBUG) 41 - # define RPCDBG_FACILITY RPCDBG_AUTH 42 - #endif 43 - 44 - s32 45 - krb5_make_seq_num(struct krb5_ctx *kctx, 46 - struct crypto_sync_skcipher *key, 47 - int direction, 48 - u32 seqnum, 49 - unsigned char *cksum, unsigned char *buf) 50 - { 51 - unsigned char *plain; 52 - s32 code; 53 - 54 - plain = kmalloc(8, GFP_KERNEL); 55 - if (!plain) 56 - return -ENOMEM; 57 - 58 - plain[0] = (unsigned char) (seqnum & 0xff); 59 - plain[1] = (unsigned char) ((seqnum >> 8) & 0xff); 60 - plain[2] = (unsigned char) ((seqnum >> 16) & 0xff); 61 - plain[3] = (unsigned char) ((seqnum >> 24) & 0xff); 62 - 63 - plain[4] = direction; 64 - plain[5] = direction; 65 - plain[6] = direction; 66 - plain[7] = direction; 67 - 68 - code = krb5_encrypt(key, cksum, plain, buf, 8); 69 - kfree(plain); 70 - return code; 71 - } 72 - 73 - s32 74 - krb5_get_seq_num(struct krb5_ctx *kctx, 75 - unsigned char *cksum, 76 - unsigned char *buf, 77 - int *direction, u32 *seqnum) 78 - { 79 - s32 code; 80 - unsigned char *plain; 81 - struct crypto_sync_skcipher *key = kctx->seq; 82 - 83 - dprintk("RPC: krb5_get_seq_num:\n"); 84 - 85 - plain = kmalloc(8, GFP_KERNEL); 86 - if (!plain) 87 - return -ENOMEM; 88 - 89 - if ((code = krb5_decrypt(key, cksum, buf, plain, 8))) 90 - goto out; 91 - 92 - if ((plain[4] != plain[5]) || (plain[4] != plain[6]) || 93 - (plain[4] != plain[7])) { 94 - code = (s32)KG_BAD_SEQ; 95 - goto out; 96 - } 97 - 98 - *direction = plain[4]; 99 - 100 - *seqnum = ((plain[0]) | 101 - (plain[1] << 8) | (plain[2] << 16) | (plain[3] << 24)); 102 - 103 - out: 104 - kfree(plain); 105 - return code; 106 - }
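The deleted gss_krb5_seqnum.c encrypted an 8-byte plaintext: the 32-bit sequence number in little-endian order, followed by the direction byte (0x00 for the initiator, 0xff for the acceptor) repeated four times, which the receiver verifies after decryption. A standalone sketch of just that encoding and the sanity check, with the encryption step omitted:

```c
#include <assert.h>
#include <stdint.h>

/* Build the 8-byte plaintext that krb5_make_seq_num() encrypted:
 * little-endian sequence number, then the direction byte x4. */
static void seqnum_plaintext(uint8_t *plain, uint32_t seqnum, uint8_t direction)
{
	plain[0] = seqnum & 0xff;
	plain[1] = (seqnum >> 8) & 0xff;
	plain[2] = (seqnum >> 16) & 0xff;
	plain[3] = (seqnum >> 24) & 0xff;
	plain[4] = plain[5] = plain[6] = plain[7] = direction;
}

/* Mirror of the krb5_get_seq_num() check on the decrypted block. */
static int seqnum_parse(const uint8_t *plain, uint8_t *direction, uint32_t *seqnum)
{
	if (plain[4] != plain[5] || plain[4] != plain[6] || plain[4] != plain[7])
		return -1;	/* KG_BAD_SEQ in the removed code */
	*direction = plain[4];
	*seqnum = (uint32_t)plain[0] | ((uint32_t)plain[1] << 8) |
		  ((uint32_t)plain[2] << 16) | ((uint32_t)plain[3] << 24);
	return 0;
}
```

Repeating the direction byte gives the receiver a cheap integrity check: a decryption with the wrong key or a corrupted ciphertext is very unlikely to produce four identical trailing bytes.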
-196
net/sunrpc/auth_gss/gss_krb5_test.c
··· 320 320 "result mismatch"); 321 321 } 322 322 323 - /* 324 - * RFC 3961 Appendix A.3. DES3 DR and DK 325 - * 326 - * These tests show the derived-random and derived-key values for the 327 - * des3-hmac-sha1-kd encryption scheme, using the DR and DK functions 328 - * defined in section 6.3.1. The input keys were randomly generated; 329 - * the usage values are from this specification. 330 - * 331 - * This test material is copyright (C) The Internet Society (2005). 332 - */ 333 - 334 - DEFINE_HEX_XDR_NETOBJ(des3_dk_usage_155, 335 - 0x00, 0x00, 0x00, 0x01, 0x55 336 - ); 337 - 338 - DEFINE_HEX_XDR_NETOBJ(des3_dk_usage_1aa, 339 - 0x00, 0x00, 0x00, 0x01, 0xaa 340 - ); 341 - 342 - DEFINE_HEX_XDR_NETOBJ(des3_dk_usage_kerberos, 343 - 0x6b, 0x65, 0x72, 0x62, 0x65, 0x72, 0x6f, 0x73 344 - ); 345 - 346 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test1_base_key, 347 - 0xdc, 0xe0, 0x6b, 0x1f, 0x64, 0xc8, 0x57, 0xa1, 348 - 0x1c, 0x3d, 0xb5, 0x7c, 0x51, 0x89, 0x9b, 0x2c, 349 - 0xc1, 0x79, 0x10, 0x08, 0xce, 0x97, 0x3b, 0x92 350 - ); 351 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test1_derived_key, 352 - 0x92, 0x51, 0x79, 0xd0, 0x45, 0x91, 0xa7, 0x9b, 353 - 0x5d, 0x31, 0x92, 0xc4, 0xa7, 0xe9, 0xc2, 0x89, 354 - 0xb0, 0x49, 0xc7, 0x1f, 0x6e, 0xe6, 0x04, 0xcd 355 - ); 356 - 357 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test2_base_key, 358 - 0x5e, 0x13, 0xd3, 0x1c, 0x70, 0xef, 0x76, 0x57, 359 - 0x46, 0x57, 0x85, 0x31, 0xcb, 0x51, 0xc1, 0x5b, 360 - 0xf1, 0x1c, 0xa8, 0x2c, 0x97, 0xce, 0xe9, 0xf2 361 - ); 362 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test2_derived_key, 363 - 0x9e, 0x58, 0xe5, 0xa1, 0x46, 0xd9, 0x94, 0x2a, 364 - 0x10, 0x1c, 0x46, 0x98, 0x45, 0xd6, 0x7a, 0x20, 365 - 0xe3, 0xc4, 0x25, 0x9e, 0xd9, 0x13, 0xf2, 0x07 366 - ); 367 - 368 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test3_base_key, 369 - 0x98, 0xe6, 0xfd, 0x8a, 0x04, 0xa4, 0xb6, 0x85, 370 - 0x9b, 0x75, 0xa1, 0x76, 0x54, 0x0b, 0x97, 0x52, 371 - 0xba, 0xd3, 0xec, 0xd6, 0x10, 0xa2, 0x52, 0xbc 372 - ); 373 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test3_derived_key, 374 - 0x13, 
0xfe, 0xf8, 0x0d, 0x76, 0x3e, 0x94, 0xec, 375 - 0x6d, 0x13, 0xfd, 0x2c, 0xa1, 0xd0, 0x85, 0x07, 376 - 0x02, 0x49, 0xda, 0xd3, 0x98, 0x08, 0xea, 0xbf 377 - ); 378 - 379 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test4_base_key, 380 - 0x62, 0x2a, 0xec, 0x25, 0xa2, 0xfe, 0x2c, 0xad, 381 - 0x70, 0x94, 0x68, 0x0b, 0x7c, 0x64, 0x94, 0x02, 382 - 0x80, 0x08, 0x4c, 0x1a, 0x7c, 0xec, 0x92, 0xb5 383 - ); 384 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test4_derived_key, 385 - 0xf8, 0xdf, 0xbf, 0x04, 0xb0, 0x97, 0xe6, 0xd9, 386 - 0xdc, 0x07, 0x02, 0x68, 0x6b, 0xcb, 0x34, 0x89, 387 - 0xd9, 0x1f, 0xd9, 0xa4, 0x51, 0x6b, 0x70, 0x3e 388 - ); 389 - 390 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test5_base_key, 391 - 0xd3, 0xf8, 0x29, 0x8c, 0xcb, 0x16, 0x64, 0x38, 392 - 0xdc, 0xb9, 0xb9, 0x3e, 0xe5, 0xa7, 0x62, 0x92, 393 - 0x86, 0xa4, 0x91, 0xf8, 0x38, 0xf8, 0x02, 0xfb 394 - ); 395 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test5_derived_key, 396 - 0x23, 0x70, 0xda, 0x57, 0x5d, 0x2a, 0x3d, 0xa8, 397 - 0x64, 0xce, 0xbf, 0xdc, 0x52, 0x04, 0xd5, 0x6d, 398 - 0xf7, 0x79, 0xa7, 0xdf, 0x43, 0xd9, 0xda, 0x43 399 - ); 400 - 401 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test6_base_key, 402 - 0xc1, 0x08, 0x16, 0x49, 0xad, 0xa7, 0x43, 0x62, 403 - 0xe6, 0xa1, 0x45, 0x9d, 0x01, 0xdf, 0xd3, 0x0d, 404 - 0x67, 0xc2, 0x23, 0x4c, 0x94, 0x07, 0x04, 0xda 405 - ); 406 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test6_derived_key, 407 - 0x34, 0x80, 0x57, 0xec, 0x98, 0xfd, 0xc4, 0x80, 408 - 0x16, 0x16, 0x1c, 0x2a, 0x4c, 0x7a, 0x94, 0x3e, 409 - 0x92, 0xae, 0x49, 0x2c, 0x98, 0x91, 0x75, 0xf7 410 - ); 411 - 412 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test7_base_key, 413 - 0x5d, 0x15, 0x4a, 0xf2, 0x38, 0xf4, 0x67, 0x13, 414 - 0x15, 0x57, 0x19, 0xd5, 0x5e, 0x2f, 0x1f, 0x79, 415 - 0x0d, 0xd6, 0x61, 0xf2, 0x79, 0xa7, 0x91, 0x7c 416 - ); 417 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test7_derived_key, 418 - 0xa8, 0x80, 0x8a, 0xc2, 0x67, 0xda, 0xda, 0x3d, 419 - 0xcb, 0xe9, 0xa7, 0xc8, 0x46, 0x26, 0xfb, 0xc7, 420 - 0x61, 0xc2, 0x94, 0xb0, 0x13, 0x15, 0xe5, 0xc1 421 - ); 422 - 423 - 
DEFINE_HEX_XDR_NETOBJ(des3_dk_test8_base_key, 424 - 0x79, 0x85, 0x62, 0xe0, 0x49, 0x85, 0x2f, 0x57, 425 - 0xdc, 0x8c, 0x34, 0x3b, 0xa1, 0x7f, 0x2c, 0xa1, 426 - 0xd9, 0x73, 0x94, 0xef, 0xc8, 0xad, 0xc4, 0x43 427 - ); 428 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test8_derived_key, 429 - 0xc8, 0x13, 0xf8, 0x8a, 0x3b, 0xe3, 0xb3, 0x34, 430 - 0xf7, 0x54, 0x25, 0xce, 0x91, 0x75, 0xfb, 0xe3, 431 - 0xc8, 0x49, 0x3b, 0x89, 0xc8, 0x70, 0x3b, 0x49 432 - ); 433 - 434 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test9_base_key, 435 - 0x26, 0xdc, 0xe3, 0x34, 0xb5, 0x45, 0x29, 0x2f, 436 - 0x2f, 0xea, 0xb9, 0xa8, 0x70, 0x1a, 0x89, 0xa4, 437 - 0xb9, 0x9e, 0xb9, 0x94, 0x2c, 0xec, 0xd0, 0x16 438 - ); 439 - DEFINE_HEX_XDR_NETOBJ(des3_dk_test9_derived_key, 440 - 0xf4, 0x8f, 0xfd, 0x6e, 0x83, 0xf8, 0x3e, 0x73, 441 - 0x54, 0xe6, 0x94, 0xfd, 0x25, 0x2c, 0xf8, 0x3b, 442 - 0xfe, 0x58, 0xf7, 0xd5, 0xba, 0x37, 0xec, 0x5d 443 - ); 444 - 445 - static const struct gss_krb5_test_param rfc3961_kdf_test_params[] = { 446 - { 447 - .desc = "des3-hmac-sha1 key derivation case 1", 448 - .enctype = ENCTYPE_DES3_CBC_RAW, 449 - .base_key = &des3_dk_test1_base_key, 450 - .usage = &des3_dk_usage_155, 451 - .expected_result = &des3_dk_test1_derived_key, 452 - }, 453 - { 454 - .desc = "des3-hmac-sha1 key derivation case 2", 455 - .enctype = ENCTYPE_DES3_CBC_RAW, 456 - .base_key = &des3_dk_test2_base_key, 457 - .usage = &des3_dk_usage_1aa, 458 - .expected_result = &des3_dk_test2_derived_key, 459 - }, 460 - { 461 - .desc = "des3-hmac-sha1 key derivation case 3", 462 - .enctype = ENCTYPE_DES3_CBC_RAW, 463 - .base_key = &des3_dk_test3_base_key, 464 - .usage = &des3_dk_usage_155, 465 - .expected_result = &des3_dk_test3_derived_key, 466 - }, 467 - { 468 - .desc = "des3-hmac-sha1 key derivation case 4", 469 - .enctype = ENCTYPE_DES3_CBC_RAW, 470 - .base_key = &des3_dk_test4_base_key, 471 - .usage = &des3_dk_usage_1aa, 472 - .expected_result = &des3_dk_test4_derived_key, 473 - }, 474 - { 475 - .desc = "des3-hmac-sha1 key derivation case 
5", 476 - .enctype = ENCTYPE_DES3_CBC_RAW, 477 - .base_key = &des3_dk_test5_base_key, 478 - .usage = &des3_dk_usage_kerberos, 479 - .expected_result = &des3_dk_test5_derived_key, 480 - }, 481 - { 482 - .desc = "des3-hmac-sha1 key derivation case 6", 483 - .enctype = ENCTYPE_DES3_CBC_RAW, 484 - .base_key = &des3_dk_test6_base_key, 485 - .usage = &des3_dk_usage_155, 486 - .expected_result = &des3_dk_test6_derived_key, 487 - }, 488 - { 489 - .desc = "des3-hmac-sha1 key derivation case 7", 490 - .enctype = ENCTYPE_DES3_CBC_RAW, 491 - .base_key = &des3_dk_test7_base_key, 492 - .usage = &des3_dk_usage_1aa, 493 - .expected_result = &des3_dk_test7_derived_key, 494 - }, 495 - { 496 - .desc = "des3-hmac-sha1 key derivation case 8", 497 - .enctype = ENCTYPE_DES3_CBC_RAW, 498 - .base_key = &des3_dk_test8_base_key, 499 - .usage = &des3_dk_usage_155, 500 - .expected_result = &des3_dk_test8_derived_key, 501 - }, 502 - { 503 - .desc = "des3-hmac-sha1 key derivation case 9", 504 - .enctype = ENCTYPE_DES3_CBC_RAW, 505 - .base_key = &des3_dk_test9_base_key, 506 - .usage = &des3_dk_usage_1aa, 507 - .expected_result = &des3_dk_test9_derived_key, 508 - }, 509 - }; 510 - 511 - /* Creates the function rfc3961_kdf_gen_params */ 512 - KUNIT_ARRAY_PARAM(rfc3961_kdf, rfc3961_kdf_test_params, gss_krb5_get_desc); 513 - 514 323 static struct kunit_case rfc3961_test_cases[] = { 515 324 { 516 325 .name = "RFC 3961 n-fold", 517 326 .run_case = rfc3961_nfold_case, 518 327 .generate_params = rfc3961_nfold_gen_params, 519 - }, 520 - { 521 - .name = "RFC 3961 key derivation", 522 - .run_case = kdf_case, 523 - .generate_params = rfc3961_kdf_gen_params, 524 328 }, 525 329 {} 526 330 };
-77
net/sunrpc/auth_gss/gss_krb5_unseal.c
··· 69 69 # define RPCDBG_FACILITY RPCDBG_AUTH 70 70 #endif 71 71 72 - 73 - #if defined(CONFIG_RPCSEC_GSS_KRB5_SIMPLIFIED) 74 - /* read_token is a mic token, and message_buffer is the data that the mic was 75 - * supposedly taken over. */ 76 - u32 77 - gss_krb5_verify_mic_v1(struct krb5_ctx *ctx, struct xdr_buf *message_buffer, 78 - struct xdr_netobj *read_token) 79 - { 80 - int signalg; 81 - int sealalg; 82 - char cksumdata[GSS_KRB5_MAX_CKSUM_LEN]; 83 - struct xdr_netobj md5cksum = {.len = sizeof(cksumdata), 84 - .data = cksumdata}; 85 - s32 now; 86 - int direction; 87 - u32 seqnum; 88 - unsigned char *ptr = (unsigned char *)read_token->data; 89 - int bodysize; 90 - u8 *cksumkey; 91 - 92 - dprintk("RPC: krb5_read_token\n"); 93 - 94 - if (g_verify_token_header(&ctx->mech_used, &bodysize, &ptr, 95 - read_token->len)) 96 - return GSS_S_DEFECTIVE_TOKEN; 97 - 98 - if ((ptr[0] != ((KG_TOK_MIC_MSG >> 8) & 0xff)) || 99 - (ptr[1] != (KG_TOK_MIC_MSG & 0xff))) 100 - return GSS_S_DEFECTIVE_TOKEN; 101 - 102 - /* XXX sanity-check bodysize?? */ 103 - 104 - signalg = ptr[2] + (ptr[3] << 8); 105 - if (signalg != ctx->gk5e->signalg) 106 - return GSS_S_DEFECTIVE_TOKEN; 107 - 108 - sealalg = ptr[4] + (ptr[5] << 8); 109 - if (sealalg != SEAL_ALG_NONE) 110 - return GSS_S_DEFECTIVE_TOKEN; 111 - 112 - if ((ptr[6] != 0xff) || (ptr[7] != 0xff)) 113 - return GSS_S_DEFECTIVE_TOKEN; 114 - 115 - if (ctx->gk5e->keyed_cksum) 116 - cksumkey = ctx->cksum; 117 - else 118 - cksumkey = NULL; 119 - 120 - if (make_checksum(ctx, ptr, 8, message_buffer, 0, 121 - cksumkey, KG_USAGE_SIGN, &md5cksum)) 122 - return GSS_S_FAILURE; 123 - 124 - if (memcmp(md5cksum.data, ptr + GSS_KRB5_TOK_HDR_LEN, 125 - ctx->gk5e->cksumlength)) 126 - return GSS_S_BAD_SIG; 127 - 128 - /* it got through unscathed. 
Make sure the context is unexpired */ 129 - 130 - now = ktime_get_real_seconds(); 131 - 132 - if (now > ctx->endtime) 133 - return GSS_S_CONTEXT_EXPIRED; 134 - 135 - /* do sequencing checks */ 136 - 137 - if (krb5_get_seq_num(ctx, ptr + GSS_KRB5_TOK_HDR_LEN, ptr + 8, 138 - &direction, &seqnum)) 139 - return GSS_S_FAILURE; 140 - 141 - if ((ctx->initiate && direction != 0xff) || 142 - (!ctx->initiate && direction != 0)) 143 - return GSS_S_BAD_SIG; 144 - 145 - return GSS_S_COMPLETE; 146 - } 147 - #endif 148 - 149 72 u32 150 73 gss_krb5_verify_mic_v2(struct krb5_ctx *ctx, struct xdr_buf *message_buffer, 151 74 struct xdr_netobj *read_token)
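On the verify side, the removed gss_krb5_verify_mic_v1() parsed the same v1 header before checking the checksum: SGN_ALG and SEAL_ALG are read as little-endian 16-bit values, the filler must be 0xffff, and the direction byte recovered from the sequence number must match the peer's role. A pared-down sketch of those two checks (constants and return conventions simplified for illustration; error paths map to GSS_S_DEFECTIVE_TOKEN and GSS_S_BAD_SIG in the original):

```c
#include <assert.h>
#include <stdint.h>

/* Validate the algorithm and filler fields of a v1 MIC token header. */
static int check_mic_hdr_v1(const uint8_t *p, uint16_t expect_signalg)
{
	uint16_t signalg = p[2] | (p[3] << 8);
	uint16_t sealalg = p[4] | (p[5] << 8);

	if (signalg != expect_signalg)
		return -1;
	if (sealalg != 0xffff)		/* SEAL_ALG_NONE for a MIC */
		return -1;
	if (p[6] != 0xff || p[7] != 0xff)
		return -1;
	return 0;
}

/* Incoming tokens on a context we initiated must carry the acceptor's
 * direction byte (0xff); on an accepted context, the initiator's (0). */
static int check_direction(int initiate, uint8_t direction)
{
	if ((initiate && direction != 0xff) || (!initiate && direction != 0))
		return -1;
	return 0;
}
```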
-287
net/sunrpc/auth_gss/gss_krb5_wrap.c
··· 40 40 # define RPCDBG_FACILITY RPCDBG_AUTH 41 41 #endif 42 42 43 - #if defined(CONFIG_RPCSEC_GSS_KRB5_SIMPLIFIED) 44 - 45 - static inline int 46 - gss_krb5_padding(int blocksize, int length) 47 - { 48 - return blocksize - (length % blocksize); 49 - } 50 - 51 - static inline void 52 - gss_krb5_add_padding(struct xdr_buf *buf, int offset, int blocksize) 53 - { 54 - int padding = gss_krb5_padding(blocksize, buf->len - offset); 55 - char *p; 56 - struct kvec *iov; 57 - 58 - if (buf->page_len || buf->tail[0].iov_len) 59 - iov = &buf->tail[0]; 60 - else 61 - iov = &buf->head[0]; 62 - p = iov->iov_base + iov->iov_len; 63 - iov->iov_len += padding; 64 - buf->len += padding; 65 - memset(p, padding, padding); 66 - } 67 - 68 - static inline int 69 - gss_krb5_remove_padding(struct xdr_buf *buf, int blocksize) 70 - { 71 - u8 *ptr; 72 - u8 pad; 73 - size_t len = buf->len; 74 - 75 - if (len <= buf->head[0].iov_len) { 76 - pad = *(u8 *)(buf->head[0].iov_base + len - 1); 77 - if (pad > buf->head[0].iov_len) 78 - return -EINVAL; 79 - buf->head[0].iov_len -= pad; 80 - goto out; 81 - } else 82 - len -= buf->head[0].iov_len; 83 - if (len <= buf->page_len) { 84 - unsigned int last = (buf->page_base + len - 1) 85 - >>PAGE_SHIFT; 86 - unsigned int offset = (buf->page_base + len - 1) 87 - & (PAGE_SIZE - 1); 88 - ptr = kmap_atomic(buf->pages[last]); 89 - pad = *(ptr + offset); 90 - kunmap_atomic(ptr); 91 - goto out; 92 - } else 93 - len -= buf->page_len; 94 - BUG_ON(len > buf->tail[0].iov_len); 95 - pad = *(u8 *)(buf->tail[0].iov_base + len - 1); 96 - out: 97 - /* XXX: NOTE: we do not adjust the page lengths--they represent 98 - * a range of data in the real filesystem page cache, and we need 99 - * to know that range so the xdr code can properly place read data. 100 - * However adjusting the head length, as we do above, is harmless. 
101 - * In the case of a request that fits into a single page, the server 102 - * also uses length and head length together to determine the original 103 - * start of the request to copy the request for deferal; so it's 104 - * easier on the server if we adjust head and tail length in tandem. 105 - * It's not really a problem that we don't fool with the page and 106 - * tail lengths, though--at worst badly formed xdr might lead the 107 - * server to attempt to parse the padding. 108 - * XXX: Document all these weird requirements for gss mechanism 109 - * wrap/unwrap functions. */ 110 - if (pad > blocksize) 111 - return -EINVAL; 112 - if (buf->len > pad) 113 - buf->len -= pad; 114 - else 115 - return -EINVAL; 116 - return 0; 117 - } 118 - 119 - /* Assumptions: the head and tail of inbuf are ours to play with. 120 - * The pages, however, may be real pages in the page cache and we replace 121 - * them with scratch pages from **pages before writing to them. */ 122 - /* XXX: obviously the above should be documentation of wrap interface, 123 - * and shouldn't be in this kerberos-specific file. */ 124 - 125 - /* XXX factor out common code with seal/unseal. 
*/ 126 - 127 - u32 128 - gss_krb5_wrap_v1(struct krb5_ctx *kctx, int offset, 129 - struct xdr_buf *buf, struct page **pages) 130 - { 131 - char cksumdata[GSS_KRB5_MAX_CKSUM_LEN]; 132 - struct xdr_netobj md5cksum = {.len = sizeof(cksumdata), 133 - .data = cksumdata}; 134 - int blocksize = 0, plainlen; 135 - unsigned char *ptr, *msg_start; 136 - time64_t now; 137 - int headlen; 138 - struct page **tmp_pages; 139 - u32 seq_send; 140 - u8 *cksumkey; 141 - u32 conflen = crypto_sync_skcipher_blocksize(kctx->enc); 142 - 143 - dprintk("RPC: %s\n", __func__); 144 - 145 - now = ktime_get_real_seconds(); 146 - 147 - blocksize = crypto_sync_skcipher_blocksize(kctx->enc); 148 - gss_krb5_add_padding(buf, offset, blocksize); 149 - BUG_ON((buf->len - offset) % blocksize); 150 - plainlen = conflen + buf->len - offset; 151 - 152 - headlen = g_token_size(&kctx->mech_used, 153 - GSS_KRB5_TOK_HDR_LEN + kctx->gk5e->cksumlength + plainlen) - 154 - (buf->len - offset); 155 - 156 - ptr = buf->head[0].iov_base + offset; 157 - /* shift data to make room for header. */ 158 - xdr_extend_head(buf, offset, headlen); 159 - 160 - /* XXX Would be cleverer to encrypt while copying. */ 161 - BUG_ON((buf->len - offset - headlen) % blocksize); 162 - 163 - g_make_token_header(&kctx->mech_used, 164 - GSS_KRB5_TOK_HDR_LEN + 165 - kctx->gk5e->cksumlength + plainlen, &ptr); 166 - 167 - 168 - /* ptr now at header described in rfc 1964, section 1.2.1: */ 169 - ptr[0] = (unsigned char) ((KG_TOK_WRAP_MSG >> 8) & 0xff); 170 - ptr[1] = (unsigned char) (KG_TOK_WRAP_MSG & 0xff); 171 - 172 - msg_start = ptr + GSS_KRB5_TOK_HDR_LEN + kctx->gk5e->cksumlength; 173 - 174 - /* 175 - * signalg and sealalg are stored as if they were converted from LE 176 - * to host endian, even though they're opaque pairs of bytes according 177 - * to the RFC. 
178 - */ 179 - *(__le16 *)(ptr + 2) = cpu_to_le16(kctx->gk5e->signalg); 180 - *(__le16 *)(ptr + 4) = cpu_to_le16(kctx->gk5e->sealalg); 181 - ptr[6] = 0xff; 182 - ptr[7] = 0xff; 183 - 184 - krb5_make_confounder(msg_start, conflen); 185 - 186 - if (kctx->gk5e->keyed_cksum) 187 - cksumkey = kctx->cksum; 188 - else 189 - cksumkey = NULL; 190 - 191 - /* XXXJBF: UGH!: */ 192 - tmp_pages = buf->pages; 193 - buf->pages = pages; 194 - if (make_checksum(kctx, ptr, 8, buf, offset + headlen - conflen, 195 - cksumkey, KG_USAGE_SEAL, &md5cksum)) 196 - return GSS_S_FAILURE; 197 - buf->pages = tmp_pages; 198 - 199 - memcpy(ptr + GSS_KRB5_TOK_HDR_LEN, md5cksum.data, md5cksum.len); 200 - 201 - seq_send = atomic_fetch_inc(&kctx->seq_send); 202 - 203 - /* XXX would probably be more efficient to compute checksum 204 - * and encrypt at the same time: */ 205 - if ((krb5_make_seq_num(kctx, kctx->seq, kctx->initiate ? 0 : 0xff, 206 - seq_send, ptr + GSS_KRB5_TOK_HDR_LEN, ptr + 8))) 207 - return GSS_S_FAILURE; 208 - 209 - if (gss_encrypt_xdr_buf(kctx->enc, buf, 210 - offset + headlen - conflen, pages)) 211 - return GSS_S_FAILURE; 212 - 213 - return (kctx->endtime < now) ? 
GSS_S_CONTEXT_EXPIRED : GSS_S_COMPLETE; 214 - } 215 - 216 - u32 217 - gss_krb5_unwrap_v1(struct krb5_ctx *kctx, int offset, int len, 218 - struct xdr_buf *buf, unsigned int *slack, 219 - unsigned int *align) 220 - { 221 - int signalg; 222 - int sealalg; 223 - char cksumdata[GSS_KRB5_MAX_CKSUM_LEN]; 224 - struct xdr_netobj md5cksum = {.len = sizeof(cksumdata), 225 - .data = cksumdata}; 226 - time64_t now; 227 - int direction; 228 - s32 seqnum; 229 - unsigned char *ptr; 230 - int bodysize; 231 - void *data_start, *orig_start; 232 - int data_len; 233 - int blocksize; 234 - u32 conflen = crypto_sync_skcipher_blocksize(kctx->enc); 235 - int crypt_offset; 236 - u8 *cksumkey; 237 - unsigned int saved_len = buf->len; 238 - 239 - dprintk("RPC: gss_unwrap_kerberos\n"); 240 - 241 - ptr = (u8 *)buf->head[0].iov_base + offset; 242 - if (g_verify_token_header(&kctx->mech_used, &bodysize, &ptr, 243 - len - offset)) 244 - return GSS_S_DEFECTIVE_TOKEN; 245 - 246 - if ((ptr[0] != ((KG_TOK_WRAP_MSG >> 8) & 0xff)) || 247 - (ptr[1] != (KG_TOK_WRAP_MSG & 0xff))) 248 - return GSS_S_DEFECTIVE_TOKEN; 249 - 250 - /* XXX sanity-check bodysize?? */ 251 - 252 - /* get the sign and seal algorithms */ 253 - 254 - signalg = ptr[2] + (ptr[3] << 8); 255 - if (signalg != kctx->gk5e->signalg) 256 - return GSS_S_DEFECTIVE_TOKEN; 257 - 258 - sealalg = ptr[4] + (ptr[5] << 8); 259 - if (sealalg != kctx->gk5e->sealalg) 260 - return GSS_S_DEFECTIVE_TOKEN; 261 - 262 - if ((ptr[6] != 0xff) || (ptr[7] != 0xff)) 263 - return GSS_S_DEFECTIVE_TOKEN; 264 - 265 - /* 266 - * Data starts after token header and checksum. 
ptr points 267 - * to the beginning of the token header 268 - */ 269 - crypt_offset = ptr + (GSS_KRB5_TOK_HDR_LEN + kctx->gk5e->cksumlength) - 270 - (unsigned char *)buf->head[0].iov_base; 271 - 272 - buf->len = len; 273 - if (gss_decrypt_xdr_buf(kctx->enc, buf, crypt_offset)) 274 - return GSS_S_DEFECTIVE_TOKEN; 275 - 276 - if (kctx->gk5e->keyed_cksum) 277 - cksumkey = kctx->cksum; 278 - else 279 - cksumkey = NULL; 280 - 281 - if (make_checksum(kctx, ptr, 8, buf, crypt_offset, 282 - cksumkey, KG_USAGE_SEAL, &md5cksum)) 283 - return GSS_S_FAILURE; 284 - 285 - if (memcmp(md5cksum.data, ptr + GSS_KRB5_TOK_HDR_LEN, 286 - kctx->gk5e->cksumlength)) 287 - return GSS_S_BAD_SIG; 288 - 289 - /* it got through unscathed. Make sure the context is unexpired */ 290 - 291 - now = ktime_get_real_seconds(); 292 - 293 - if (now > kctx->endtime) 294 - return GSS_S_CONTEXT_EXPIRED; 295 - 296 - /* do sequencing checks */ 297 - 298 - if (krb5_get_seq_num(kctx, ptr + GSS_KRB5_TOK_HDR_LEN, 299 - ptr + 8, &direction, &seqnum)) 300 - return GSS_S_BAD_SIG; 301 - 302 - if ((kctx->initiate && direction != 0xff) || 303 - (!kctx->initiate && direction != 0)) 304 - return GSS_S_BAD_SIG; 305 - 306 - /* Copy the data back to the right position. XXX: Would probably be 307 - * better to copy and encrypt at the same time. 
*/ 308 - 309 - blocksize = crypto_sync_skcipher_blocksize(kctx->enc); 310 - data_start = ptr + (GSS_KRB5_TOK_HDR_LEN + kctx->gk5e->cksumlength) + 311 - conflen; 312 - orig_start = buf->head[0].iov_base + offset; 313 - data_len = (buf->head[0].iov_base + buf->head[0].iov_len) - data_start; 314 - memmove(orig_start, data_start, data_len); 315 - buf->head[0].iov_len -= (data_start - orig_start); 316 - buf->len = len - (data_start - orig_start); 317 - 318 - if (gss_krb5_remove_padding(buf, blocksize)) 319 - return GSS_S_DEFECTIVE_TOKEN; 320 - 321 - /* slack must include room for krb5 padding */ 322 - *slack = XDR_QUADLEN(saved_len - buf->len); 323 - /* The GSS blob always precedes the RPC message payload */ 324 - *align = *slack; 325 - return GSS_S_COMPLETE; 326 - } 327 - 328 - #endif 329 - 330 43 /* 331 44 * We can shift data by up to LOCAL_BUF_LEN bytes in a pass. If we need 332 45 * to do more than that, we shift repeatedly. Kevin Coffman reports
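The v1 wrap path padded the message up to the cipher blocksize, with every pad byte holding the pad length, so an already-aligned message still gains a full block of padding; unwrap reads the final byte to decide how much to strip. A flat-buffer sketch of that scheme (the kernel version operates on a multi-segment xdr_buf, which is where the head/tail bookkeeping in the removed comments comes from):

```c
#include <assert.h>
#include <string.h>

/* Pad length is always 1..blocksize, never zero. */
static int krb5_pad_len(int blocksize, int msglen)
{
	return blocksize - (msglen % blocksize);
}

/* Append pad bytes, each holding the pad length; return the new length.
 * The caller must have reserved blocksize bytes of tail room. */
static int krb5_add_padding(unsigned char *buf, int len, int blocksize)
{
	int pad = krb5_pad_len(blocksize, len);

	memset(buf + len, pad, pad);
	return len + pad;
}

/* Read the final byte to find the padding; mirror the removed code's
 * -EINVAL sanity checks before trusting it. */
static int krb5_strip_padding(const unsigned char *buf, int len, int blocksize)
{
	unsigned char pad = buf[len - 1];

	if (pad > blocksize || pad > len)
		return -1;
	return len - pad;
}
```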
+2 -5
net/sunrpc/auth_gss/svcauth_gss.c
··· 986 986 return -EINVAL; 987 987 } 988 988 989 - static int 989 + static enum svc_auth_status 990 990 svcauth_gss_set_client(struct svc_rqst *rqstp) 991 991 { 992 992 struct gss_svc_data *svcdata = rqstp->rq_auth_data; ··· 1634 1634 * 1635 1635 * The rqstp->rq_auth_stat field is also set (see RFCs 2203 and 5531). 1636 1636 */ 1637 - static int 1637 + static enum svc_auth_status 1638 1638 svcauth_gss_accept(struct svc_rqst *rqstp) 1639 1639 { 1640 1640 struct gss_svc_data *svcdata = rqstp->rq_auth_data; ··· 1945 1945 * %0: the Reply is ready to be sent 1946 1946 * %-ENOMEM: failed to allocate memory 1947 1947 * %-EINVAL: encoding error 1948 - * 1949 - * XXX: These return values do not match the return values documented 1950 - * for the auth_ops ->release method in linux/sunrpc/svcauth.h. 1951 1948 */ 1952 1949 static int 1953 1950 svcauth_gss_release(struct svc_rqst *rqstp)
+60 -37
net/sunrpc/svc.c
··· 513 513 INIT_LIST_HEAD(&pool->sp_all_threads); 514 514 spin_lock_init(&pool->sp_lock); 515 515 516 + percpu_counter_init(&pool->sp_messages_arrived, 0, GFP_KERNEL); 516 517 percpu_counter_init(&pool->sp_sockets_queued, 0, GFP_KERNEL); 517 518 percpu_counter_init(&pool->sp_threads_woken, 0, GFP_KERNEL); 518 - percpu_counter_init(&pool->sp_threads_timedout, 0, GFP_KERNEL); 519 519 } 520 520 521 521 return serv; ··· 588 588 for (i = 0; i < serv->sv_nrpools; i++) { 589 589 struct svc_pool *pool = &serv->sv_pools[i]; 590 590 591 + percpu_counter_destroy(&pool->sp_messages_arrived); 591 592 percpu_counter_destroy(&pool->sp_sockets_queued); 592 593 percpu_counter_destroy(&pool->sp_threads_woken); 593 - percpu_counter_destroy(&pool->sp_threads_timedout); 594 594 } 595 595 kfree(serv->sv_pools); 596 596 kfree(serv); ··· 689 689 return rqstp; 690 690 } 691 691 692 - /* 693 - * Choose a pool in which to create a new thread, for svc_set_num_threads 692 + /** 693 + * svc_pool_wake_idle_thread - Awaken an idle thread in @pool 694 + * @pool: service thread pool 695 + * 696 + * Can be called from soft IRQ or process context. Finding an idle 697 + * service thread and marking it BUSY is atomic with respect to 698 + * other calls to svc_pool_wake_idle_thread(). 
699 + * 694 700 */ 695 - static inline struct svc_pool * 696 - choose_pool(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state) 701 + void svc_pool_wake_idle_thread(struct svc_pool *pool) 697 702 { 698 - if (pool != NULL) 699 - return pool; 703 + struct svc_rqst *rqstp; 700 704 701 - return &serv->sv_pools[(*state)++ % serv->sv_nrpools]; 705 + rcu_read_lock(); 706 + list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) { 707 + if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags)) 708 + continue; 709 + 710 + WRITE_ONCE(rqstp->rq_qtime, ktime_get()); 711 + wake_up_process(rqstp->rq_task); 712 + rcu_read_unlock(); 713 + percpu_counter_inc(&pool->sp_threads_woken); 714 + trace_svc_wake_up(rqstp->rq_task->pid); 715 + return; 716 + } 717 + rcu_read_unlock(); 718 + 719 + set_bit(SP_CONGESTED, &pool->sp_flags); 702 720 } 703 721 704 - /* 705 - * Choose a thread to kill, for svc_set_num_threads 706 - */ 707 - static inline struct task_struct * 708 - choose_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state) 722 + static struct svc_pool * 723 + svc_pool_next(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state) 724 + { 725 + return pool ? pool : &serv->sv_pools[(*state)++ % serv->sv_nrpools]; 726 + } 727 + 728 + static struct task_struct * 729 + svc_pool_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state) 709 730 { 710 731 unsigned int i; 711 732 struct task_struct *task = NULL; ··· 734 713 if (pool != NULL) { 735 714 spin_lock_bh(&pool->sp_lock); 736 715 } else { 737 - /* choose a pool in round-robin fashion */ 738 716 for (i = 0; i < serv->sv_nrpools; i++) { 739 717 pool = &serv->sv_pools[--(*state) % serv->sv_nrpools]; 740 718 spin_lock_bh(&pool->sp_lock); ··· 748 728 if (!list_empty(&pool->sp_all_threads)) { 749 729 struct svc_rqst *rqstp; 750 730 751 - /* 752 - * Remove from the pool->sp_all_threads list 753 - * so we don't try to kill it again. 
754 - */ 755 731 rqstp = list_entry(pool->sp_all_threads.next, struct svc_rqst, rq_all); 756 732 set_bit(RQ_VICTIM, &rqstp->rq_flags); 757 733 list_del_rcu(&rqstp->rq_all); 758 734 task = rqstp->rq_task; 759 735 } 760 736 spin_unlock_bh(&pool->sp_lock); 761 - 762 737 return task; 763 738 } 764 739 765 - /* create new threads */ 766 740 static int 767 741 svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs) 768 742 { ··· 768 754 769 755 do { 770 756 nrservs--; 771 - chosen_pool = choose_pool(serv, pool, &state); 772 - 757 + chosen_pool = svc_pool_next(serv, pool, &state); 773 758 node = svc_pool_map_get_node(chosen_pool->sp_id); 759 + 774 760 rqstp = svc_prepare_thread(serv, chosen_pool, node); 775 761 if (IS_ERR(rqstp)) 776 762 return PTR_ERR(rqstp); 777 - 778 763 task = kthread_create_on_node(serv->sv_threadfn, rqstp, 779 764 node, "%s", serv->sv_name); 780 765 if (IS_ERR(task)) { ··· 792 779 return 0; 793 780 } 794 781 795 - /* 796 - * Create or destroy enough new threads to make the number 797 - * of threads the given number. If `pool' is non-NULL, applies 798 - * only to threads in that pool, otherwise round-robins between 799 - * all pools. Caller must ensure that mutual exclusion between this and 800 - * server startup or shutdown. 
801 - */ 802 - 803 - /* destroy old threads */ 804 782 static int 805 783 svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs) 806 784 { ··· 799 795 struct task_struct *task; 800 796 unsigned int state = serv->sv_nrthreads-1; 801 797 802 - /* destroy old threads */ 803 798 do { 804 - task = choose_victim(serv, pool, &state); 799 + task = svc_pool_victim(serv, pool, &state); 805 800 if (task == NULL) 806 801 break; 807 802 rqstp = kthread_data(task); ··· 812 809 return 0; 813 810 } 814 811 812 + /** 813 + * svc_set_num_threads - adjust number of threads per RPC service 814 + * @serv: RPC service to adjust 815 + * @pool: Specific pool from which to choose threads, or NULL 816 + * @nrservs: New number of threads for @serv (0 or less means kill all threads) 817 + * 818 + * Create or destroy threads to make the number of threads for @serv the 819 + * given number. If @pool is non-NULL, change only threads in that pool; 820 + * otherwise, round-robin between all pools for @serv. @serv's 821 + * sv_nrthreads is adjusted for each thread created or destroyed. 822 + * 823 + * Caller must ensure mutual exclusion between this and server startup or 824 + * shutdown. 825 + * 826 + * Returns zero on success or a negative errno if an error occurred while 827 + * starting a thread. 
828 + */ 815 829 int 816 830 svc_set_num_threads(struct svc_serv *serv, struct svc_pool *pool, int nrservs) 817 831 { ··· 1297 1277 const struct svc_procedure *procp = NULL; 1298 1278 struct svc_serv *serv = rqstp->rq_server; 1299 1279 struct svc_process_info process; 1300 - int auth_res, rc; 1280 + enum svc_auth_status auth_res; 1301 1281 unsigned int aoffset; 1282 + int rc; 1302 1283 __be32 *p; 1303 1284 1304 1285 /* Will be turned off by GSS integrity and privacy services */ ··· 1354 1333 goto dropit; 1355 1334 case SVC_COMPLETE: 1356 1335 goto sendit; 1336 + default: 1337 + pr_warn_once("Unexpected svc_auth_status (%d)\n", auth_res); 1338 + goto err_system_err; 1357 1339 } 1358 1340 1359 1341 if (progp == NULL) ··· 1540 1516 out_drop: 1541 1517 svc_drop(rqstp); 1542 1518 } 1543 - EXPORT_SYMBOL_GPL(svc_process); 1544 1519 1545 1520 #if defined(CONFIG_SUNRPC_BACKCHANNEL) 1546 1521 /*
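The renamed svc_pool_next() helper keeps the old choose_pool() contract: a pinned pool is returned unchanged, otherwise pools are handed out round-robin via a cursor the caller owns. A user-space sketch with the structures pared down to the two fields involved (not the kernel definitions):

```c
#include <assert.h>
#include <stddef.h>

struct svc_pool {
	unsigned int sp_id;
};

struct svc_serv {
	unsigned int sv_nrpools;
	struct svc_pool *sv_pools;
};

/* Same shape as the helper in the diff: honor a pinned pool, else
 * advance the caller's round-robin cursor. */
static struct svc_pool *
svc_pool_next(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
{
	return pool ? pool : &serv->sv_pools[(*state)++ % serv->sv_nrpools];
}
```

Because the cursor lives in the caller (svc_start_kthreads), successive thread creations spread across pools without any shared scheduling state.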
+45 -81
net/sunrpc/svc_xprt.c
···
        smp_rmb();
        xpt_flags = READ_ONCE(xprt->xpt_flags);

+       trace_svc_xprt_enqueue(xprt, xpt_flags);
        if (xpt_flags & BIT(XPT_BUSY))
                return false;
        if (xpt_flags & (BIT(XPT_CONN) | BIT(XPT_CLOSE) | BIT(XPT_HANDSHAKE)))
···
 void svc_xprt_enqueue(struct svc_xprt *xprt)
 {
        struct svc_pool *pool;
-       struct svc_rqst *rqstp = NULL;

        if (!svc_xprt_ready(xprt))
                return;
···
        list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
        spin_unlock_bh(&pool->sp_lock);

-       /* find a thread for this xprt */
-       rcu_read_lock();
-       list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
-               if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
-                       continue;
-               percpu_counter_inc(&pool->sp_threads_woken);
-               rqstp->rq_qtime = ktime_get();
-               wake_up_process(rqstp->rq_task);
-               goto out_unlock;
-       }
-       set_bit(SP_CONGESTED, &pool->sp_flags);
-       rqstp = NULL;
-out_unlock:
-       rcu_read_unlock();
-       trace_svc_xprt_enqueue(xprt, rqstp);
+       svc_pool_wake_idle_thread(pool);
 }
 EXPORT_SYMBOL_GPL(svc_xprt_enqueue);
···
        svc_xprt_put(xprt);
 }

-/*
+/**
+ * svc_wake_up - Wake up a service thread for non-transport work
+ * @serv: RPC service
+ *
  * Some svc_serv's will have occasional work to do, even when a xprt is not
  * waiting to be serviced. This function is there to "kick" a task in one of
  * those services so that it can wake up and do that work. Note that we only
···
  */
 void svc_wake_up(struct svc_serv *serv)
 {
-       struct svc_rqst *rqstp;
-       struct svc_pool *pool;
+       struct svc_pool *pool = &serv->sv_pools[0];

-       pool = &serv->sv_pools[0];
-
-       rcu_read_lock();
-       list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
-               /* skip any that aren't queued */
-               if (test_bit(RQ_BUSY, &rqstp->rq_flags))
-                       continue;
-               rcu_read_unlock();
-               wake_up_process(rqstp->rq_task);
-               trace_svc_wake_up(rqstp->rq_task->pid);
-               return;
-       }
-       rcu_read_unlock();
-
-       /* No free entries available */
        set_bit(SP_TASK_PENDING, &pool->sp_flags);
-       smp_wmb();
-       trace_svc_wake_up(0);
+       svc_pool_wake_idle_thread(pool);
 }
 EXPORT_SYMBOL_GPL(svc_wake_up);
···
        }
 }

-static int svc_alloc_arg(struct svc_rqst *rqstp)
+static bool svc_alloc_arg(struct svc_rqst *rqstp)
 {
        struct svc_serv *serv = rqstp->rq_server;
        struct xdr_buf *arg = &rqstp->rq_arg;
···
                        /* Made progress, don't sleep yet */
                        continue;

-               set_current_state(TASK_INTERRUPTIBLE);
-               if (signalled() || kthread_should_stop()) {
+               set_current_state(TASK_IDLE);
+               if (kthread_should_stop()) {
                        set_current_state(TASK_RUNNING);
-                       return -EINTR;
+                       return false;
                }
                trace_svc_alloc_arg_err(pages, ret);
                memalloc_retry_wait(GFP_KERNEL);
···
        arg->tail[0].iov_len = 0;

        rqstp->rq_xid = xdr_zero;
-       return 0;
+       return true;
 }

 static bool
···
        struct svc_pool *pool = rqstp->rq_pool;

        /* did someone call svc_wake_up? */
-       if (test_and_clear_bit(SP_TASK_PENDING, &pool->sp_flags))
+       if (test_bit(SP_TASK_PENDING, &pool->sp_flags))
                return false;

        /* was a socket queued? */
···
                return false;

        /* are we shutting down? */
-       if (signalled() || kthread_should_stop())
+       if (kthread_should_stop())
                return false;

        /* are we freezing? */
···
        return true;
 }

-static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp, long timeout)
+static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp)
 {
        struct svc_pool *pool = rqstp->rq_pool;
-       long time_left = 0;

        /* rq_xprt should be clear on entry */
        WARN_ON_ONCE(rqstp->rq_xprt);
···
        if (rqstp->rq_xprt)
                goto out_found;

-       /*
-        * We have to be able to interrupt this wait
-        * to bring down the daemons ...
-        */
-       set_current_state(TASK_INTERRUPTIBLE);
+       set_current_state(TASK_IDLE);
        smp_mb__before_atomic();
        clear_bit(SP_CONGESTED, &pool->sp_flags);
        clear_bit(RQ_BUSY, &rqstp->rq_flags);
        smp_mb__after_atomic();

        if (likely(rqst_should_sleep(rqstp)))
-               time_left = schedule_timeout(timeout);
+               schedule();
        else
                __set_current_state(TASK_RUNNING);
···
        set_bit(RQ_BUSY, &rqstp->rq_flags);
        smp_mb__after_atomic();
+       clear_bit(SP_TASK_PENDING, &pool->sp_flags);
        rqstp->rq_xprt = svc_xprt_dequeue(pool);
        if (rqstp->rq_xprt)
                goto out_found;

-       if (!time_left)
-               percpu_counter_inc(&pool->sp_threads_timedout);
-
-       if (signalled() || kthread_should_stop())
-               return ERR_PTR(-EINTR);
-       return ERR_PTR(-EAGAIN);
+       if (kthread_should_stop())
+               return NULL;
+       return NULL;
 out_found:
+       clear_bit(SP_TASK_PENDING, &pool->sp_flags);
        /* Normally we will wait up to 5 seconds for any required
         * cache information to be provided.
         */
···
        return len;
 }

-/*
- * Receive the next request on any transport. This code is carefully
- * organised not to touch any cachelines in the shared svc_serv
- * structure, only cachelines in the local svc_pool.
+/**
+ * svc_recv - Receive and process the next request on any transport
+ * @rqstp: an idle RPC service thread
+ *
+ * This code is carefully organised not to touch any cachelines in
+ * the shared svc_serv structure, only cachelines in the local
+ * svc_pool.
  */
-int svc_recv(struct svc_rqst *rqstp, long timeout)
+void svc_recv(struct svc_rqst *rqstp)
 {
        struct svc_xprt *xprt = NULL;
        struct svc_serv *serv = rqstp->rq_server;
-       int len, err;
+       int len;

-       err = svc_alloc_arg(rqstp);
-       if (err)
+       if (!svc_alloc_arg(rqstp))
                goto out;

        try_to_freeze();
        cond_resched();
-       err = -EINTR;
-       if (signalled() || kthread_should_stop())
+       if (kthread_should_stop())
                goto out;

-       xprt = svc_get_next_xprt(rqstp, timeout);
-       if (IS_ERR(xprt)) {
-               err = PTR_ERR(xprt);
+       xprt = svc_get_next_xprt(rqstp);
+       if (!xprt)
                goto out;
-       }

        len = svc_handle_xprt(rqstp, xprt);

        /* No data, incomplete (TCP) read, or accept() */
-       err = -EAGAIN;
        if (len <= 0)
                goto out_release;
···
        if (serv->sv_stats)
                serv->sv_stats->netcnt++;
+       percpu_counter_inc(&rqstp->rq_pool->sp_messages_arrived);
        rqstp->rq_stime = ktime_get();
-       return len;
+       svc_process(rqstp);
+out:
+       return;
 out_release:
        rqstp->rq_res.len = 0;
        svc_xprt_release(rqstp);
-out:
-       return err;
 }
 EXPORT_SYMBOL_GPL(svc_recv);
···
                return 0;
        }

-       seq_printf(m, "%u %llu %llu %llu %llu\n",
-                  pool->sp_id,
-                  percpu_counter_sum_positive(&pool->sp_sockets_queued),
-                  percpu_counter_sum_positive(&pool->sp_sockets_queued),
-                  percpu_counter_sum_positive(&pool->sp_threads_woken),
-                  percpu_counter_sum_positive(&pool->sp_threads_timedout));
+       seq_printf(m, "%u %llu %llu %llu 0\n",
+                  pool->sp_id,
+                  percpu_counter_sum_positive(&pool->sp_messages_arrived),
+                  percpu_counter_sum_positive(&pool->sp_sockets_queued),
+                  percpu_counter_sum_positive(&pool->sp_threads_woken));

        return 0;
 }
net/sunrpc/svcauth.c (+29 -6)
···
        module_put(aops->owner);
 }

-int
-svc_authenticate(struct svc_rqst *rqstp)
+/**
+ * svc_authenticate - Initialize an outgoing credential
+ * @rqstp: RPC execution context
+ *
+ * Return values:
+ *   %SVC_OK: XDR encoding of the result can begin
+ *   %SVC_DENIED: Credential or verifier is not valid
+ *   %SVC_GARBAGE: Failed to decode credential or verifier
+ *   %SVC_COMPLETE: GSS context lifetime event; no further action
+ *   %SVC_DROP: Drop this request; no further action
+ *   %SVC_CLOSE: Like drop, but also close transport connection
+ */
+enum svc_auth_status svc_authenticate(struct svc_rqst *rqstp)
 {
        struct auth_ops *aops;
        u32 flavor;
···
 }
 EXPORT_SYMBOL_GPL(svc_authenticate);

-int svc_set_client(struct svc_rqst *rqstp)
+/**
+ * svc_set_client - Assign an appropriate 'auth_domain' as the client
+ * @rqstp: RPC execution context
+ *
+ * Return values:
+ *   %SVC_OK: Client was found and assigned
+ *   %SVC_DENY: Client was explicitly denied
+ *   %SVC_DROP: Ignore this request
+ *   %SVC_CLOSE: Ignore this request and close the connection
+ */
+enum svc_auth_status svc_set_client(struct svc_rqst *rqstp)
 {
        rqstp->rq_client = NULL;
        return rqstp->rq_authop->set_client(rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_set_client);

-/* A request, which was authenticated, has now executed.
- * Time to finalise the credentials and verifier
- * and release and resources
+/**
+ * svc_authorise - Finalize credentials/verifier and release resources
+ * @rqstp: RPC execution context
+ *
+ * Returns zero on success, or a negative errno.
  */
 int svc_authorise(struct svc_rqst *rqstp)
 {
net/sunrpc/svcauth_unix.c (+4 -5)
···
        }
 }

-int
+enum svc_auth_status
 svcauth_unix_set_client(struct svc_rqst *rqstp)
 {
        struct sockaddr_in *sin;
···
        rqstp->rq_auth_stat = rpc_auth_ok;
        return SVC_OK;
 }
-
 EXPORT_SYMBOL_GPL(svcauth_unix_set_client);

 /**
···
  *
  * rqstp->rq_auth_stat is set as mandated by RFC 5531.
  */
-static int
+static enum svc_auth_status
 svcauth_null_accept(struct svc_rqst *rqstp)
 {
        struct xdr_stream *xdr = &rqstp->rq_arg_stream;
···
  *
  * rqstp->rq_auth_stat is set as mandated by RFC 5531.
  */
-static int
+static enum svc_auth_status
 svcauth_tls_accept(struct svc_rqst *rqstp)
 {
        struct xdr_stream *xdr = &rqstp->rq_arg_stream;
···
  *
  * rqstp->rq_auth_stat is set as mandated by RFC 5531.
  */
-static int
+static enum svc_auth_status
 svcauth_unix_accept(struct svc_rqst *rqstp)
 {
        struct xdr_stream *xdr = &rqstp->rq_arg_stream;
net/sunrpc/svcsock.c (+57 -70)
···
 #include <linux/skbuff.h>
 #include <linux/file.h>
 #include <linux/freezer.h>
+#include <linux/bvec.h>
+
 #include <net/sock.h>
 #include <net/checksum.h>
 #include <net/ip.h>
···
                .msg_name = &rqstp->rq_addr,
                .msg_namelen = rqstp->rq_addrlen,
                .msg_control = cmh,
+               .msg_flags = MSG_SPLICE_PAGES,
                .msg_controllen = sizeof(buffer),
        };
-       unsigned int sent;
+       unsigned int count;
        int err;

        svc_udp_release_ctxt(xprt, rqstp->rq_xprt_ctxt);
···
        if (svc_xprt_is_dead(xprt))
                goto out_notconn;

-       err = xdr_alloc_bvec(xdr, GFP_KERNEL);
-       if (err < 0)
-               goto out_unlock;
+       count = xdr_buf_to_bvec(rqstp->rq_bvec,
+                               ARRAY_SIZE(rqstp->rq_bvec), xdr);

-       err = xprt_sock_sendmsg(svsk->sk_sock, &msg, xdr, 0, 0, &sent);
+       iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
+                     count, 0);
+       err = sock_sendmsg(svsk->sk_sock, &msg);
        if (err == -ECONNREFUSED) {
                /* ICMP error on earlier request. */
-               err = xprt_sock_sendmsg(svsk->sk_sock, &msg, xdr, 0, 0, &sent);
+               iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
+                             count, 0);
+               err = sock_sendmsg(svsk->sk_sock, &msg);
        }
-       xdr_free_bvec(xdr);
+
        trace_svcsock_udp_send(xprt, err);
-out_unlock:
+
        mutex_unlock(&xprt->xpt_mutex);
-       if (err < 0)
-               return err;
-       return sent;
+       return err;

 out_notconn:
        mutex_unlock(&xprt->xpt_mutex);
···
        /* If we have more data, signal svc_xprt_enqueue() to try again */
        svsk->sk_tcplen = 0;
        svsk->sk_marker = xdr_zero;
+
+       smp_wmb();
+       tcp_set_rcvlowat(svsk->sk_sk, 1);
 }

 /**
···
                goto err_delete;
        if (len == want)
                svc_tcp_fragment_received(svsk);
-       else
+       else {
+               /* Avoid more ->sk_data_ready() calls until the rest
+                * of the message has arrived. This reduces service
+                * thread wake-ups on large incoming messages. */
+               tcp_set_rcvlowat(svsk->sk_sk,
+                                svc_sock_reclen(svsk) - svsk->sk_tcplen);
+
                trace_svcsock_tcp_recv_short(&svsk->sk_xprt,
                                             svc_sock_reclen(svsk),
                                             svsk->sk_tcplen - sizeof(rpc_fraghdr));
+       }
        goto err_noclose;
 error:
        if (len != -EAGAIN)
···
        return 0;       /* record not complete */
 }

-static int svc_tcp_send_kvec(struct socket *sock, const struct kvec *vec,
-                            int flags)
-{
-       struct msghdr msg = { .msg_flags = MSG_SPLICE_PAGES | flags, };
-
-       iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, vec, 1, vec->iov_len);
-       return sock_sendmsg(sock, &msg);
-}
-
 /*
  * MSG_SPLICE_PAGES is used exclusively to reduce the number of
  * copy operations in this path. Therefore the caller must ensure
  * that the pages backing @xdr are unchanging.
  *
- * In addition, the logic assumes that .bv_len is never larger
- * than PAGE_SIZE.
+ * Note that the send is non-blocking. The caller has incremented
+ * the reference count on each page backing the RPC message, and
+ * the network layer will "put" these pages when transmission is
+ * complete.
+ *
+ * This is safe for our RPC services because the memory backing
+ * the head and tail components is never kmalloc'd. These always
+ * come from pages in the svc_rqst::rq_pages array.
  */
-static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr,
+static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp,
                           rpc_fraghdr marker, unsigned int *sentp)
 {
-       const struct kvec *head = xdr->head;
-       const struct kvec *tail = xdr->tail;
-       struct kvec rm = {
-               .iov_base       = &marker,
-               .iov_len        = sizeof(marker),
-       };
        struct msghdr msg = {
-               .msg_flags      = 0,
+               .msg_flags      = MSG_SPLICE_PAGES,
        };
+       unsigned int count;
+       void *buf;
        int ret;

        *sentp = 0;
-       ret = xdr_alloc_bvec(xdr, GFP_KERNEL);
-       if (ret < 0)
-               return ret;

-       ret = kernel_sendmsg(sock, &msg, &rm, 1, rm.iov_len);
-       if (ret < 0)
-               return ret;
-       *sentp += ret;
-       if (ret != rm.iov_len)
-               return -EAGAIN;
+       /* The stream record marker is copied into a temporary page
+        * fragment buffer so that it can be included in rq_bvec.
+        */
+       buf = page_frag_alloc(&svsk->sk_frag_cache, sizeof(marker),
+                             GFP_KERNEL);
+       if (!buf)
+               return -ENOMEM;
+       memcpy(buf, &marker, sizeof(marker));
+       bvec_set_virt(rqstp->rq_bvec, buf, sizeof(marker));

-       ret = svc_tcp_send_kvec(sock, head, 0);
-       if (ret < 0)
-               return ret;
-       *sentp += ret;
-       if (ret != head->iov_len)
-               goto out;
+       count = xdr_buf_to_bvec(rqstp->rq_bvec + 1,
+                               ARRAY_SIZE(rqstp->rq_bvec) - 1, &rqstp->rq_res);

-       if (xdr_buf_pagecount(xdr))
-               xdr->bvec[0].bv_offset = offset_in_page(xdr->page_base);
-
-       msg.msg_flags = MSG_SPLICE_PAGES;
-       iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, xdr->bvec,
-                     xdr_buf_pagecount(xdr), xdr->page_len);
-       ret = sock_sendmsg(sock, &msg);
+       iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
+                     1 + count, sizeof(marker) + rqstp->rq_res.len);
+       ret = sock_sendmsg(svsk->sk_sock, &msg);
        if (ret < 0)
                return ret;
        *sentp += ret;
-
-       if (tail->iov_len) {
-               ret = svc_tcp_send_kvec(sock, tail, 0);
-               if (ret < 0)
-                       return ret;
-               *sentp += ret;
-       }
-
-out:
        return 0;
 }
···
        svc_tcp_release_ctxt(xprt, rqstp->rq_xprt_ctxt);
        rqstp->rq_xprt_ctxt = NULL;

-       atomic_inc(&svsk->sk_sendqlen);
        mutex_lock(&xprt->xpt_mutex);
        if (svc_xprt_is_dead(xprt))
                goto out_notconn;
-       tcp_sock_set_cork(svsk->sk_sk, true);
-       err = svc_tcp_sendmsg(svsk->sk_sock, xdr, marker, &sent);
-       xdr_free_bvec(xdr);
+       err = svc_tcp_sendmsg(svsk, rqstp, marker, &sent);
        trace_svcsock_tcp_send(xprt, err < 0 ? (long)err : sent);
        if (err < 0 || sent != (xdr->len + sizeof(marker)))
                goto out_close;
-       if (atomic_dec_and_test(&svsk->sk_sendqlen))
-               tcp_sock_set_cork(svsk->sk_sk, false);
        mutex_unlock(&xprt->xpt_mutex);
        return sent;

 out_notconn:
-       atomic_dec(&svsk->sk_sendqlen);
        mutex_unlock(&xprt->xpt_mutex);
        return -ENOTCONN;
 out_close:
···
                  (err < 0) ? "got error" : "sent",
                  (err < 0) ? err : sent, xdr->len);
        svc_xprt_deferred_close(xprt);
-       atomic_dec(&svsk->sk_sendqlen);
        mutex_unlock(&xprt->xpt_mutex);
        return -EAGAIN;
 }
···
 static void svc_sock_free(struct svc_xprt *xprt)
 {
        struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
+       struct page_frag_cache *pfc = &svsk->sk_frag_cache;
        struct socket *sock = svsk->sk_sock;

        trace_svcsock_free(svsk, sock);
···
                sockfd_put(sock);
        else
                sock_release(sock);
+       if (pfc->va)
+               __page_frag_cache_drain(virt_to_head_page(pfc->va),
+                                       pfc->pagecnt_bias);
        kfree(svsk);
 }
net/sunrpc/xdr.c (+50)
···
 }

 /**
+ * xdr_buf_to_bvec - Copy components of an xdr_buf into a bio_vec array
+ * @bvec: bio_vec array to populate
+ * @bvec_size: element count of @bvec
+ * @xdr: xdr_buf to be copied
+ *
+ * Returns the number of entries consumed in @bvec.
+ */
+unsigned int xdr_buf_to_bvec(struct bio_vec *bvec, unsigned int bvec_size,
+                            const struct xdr_buf *xdr)
+{
+       const struct kvec *head = xdr->head;
+       const struct kvec *tail = xdr->tail;
+       unsigned int count = 0;
+
+       if (head->iov_len) {
+               bvec_set_virt(bvec++, head->iov_base, head->iov_len);
+               ++count;
+       }
+
+       if (xdr->page_len) {
+               unsigned int offset, len, remaining;
+               struct page **pages = xdr->pages;
+
+               offset = offset_in_page(xdr->page_base);
+               remaining = xdr->page_len;
+               while (remaining > 0) {
+                       len = min_t(unsigned int, remaining,
+                                   PAGE_SIZE - offset);
+                       bvec_set_page(bvec++, *pages++, len, offset);
+                       remaining -= len;
+                       offset = 0;
+                       if (unlikely(++count > bvec_size))
+                               goto bvec_overflow;
+               }
+       }
+
+       if (tail->iov_len) {
+               bvec_set_virt(bvec, tail->iov_base, tail->iov_len);
+               if (unlikely(++count > bvec_size))
+                       goto bvec_overflow;
+       }
+
+       return count;
+
+bvec_overflow:
+       pr_warn_once("%s: bio_vec array overflow\n", __func__);
+       return count - 1;
+}
+
+/**
  * xdr_inline_pages - Prepare receive buffer for a large reply
  * @xdr: xdr_buf into which reply will be placed
  * @offset: expected offset where data payload will start, in bytes