Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

rxrpc: Fix recvmsg() unconditional requeue

If rxrpc_recvmsg() fails because MSG_DONTWAIT was specified but the call at
the front of the recvmsg queue already has its mutex locked, it requeues
the call - whether or not the call is already queued. The call may be on
the queue because MSG_PEEK was also passed and so the call was not dequeued
or because the I/O thread requeued it.

The unconditional requeue may then corrupt the recvmsg queue, leading to
things like UAFs or refcount underruns.

Fix this by only requeuing the call if it isn't already on the queue - and
moving it to the front if it is already queued. If we don't queue it, we
have to put the ref we obtained by dequeuing it.

Also, MSG_PEEK doesn't dequeue the call so shouldn't call
rxrpc_notify_socket() for the call if we didn't use up all the data on the
queue, so fix that also.

Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
Reported-by: Faith <faith@zellic.io>
Reported-by: Pumpkin Chang <pumpkin@devco.re>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Marc Dionne <marc.dionne@auristor.com>
cc: Nir Ohfeld <niro@wiz.io>
cc: Willy Tarreau <w@1wt.eu>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org
Link: https://patch.msgid.link/95163.1768428203@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

authored by

David Howells and committed by
Jakub Kicinski
2c28769a 7b8e1a80

+19 -4
+4
include/trace/events/rxrpc.h
··· 322 322 EM(rxrpc_call_put_kernel, "PUT kernel ") \ 323 323 EM(rxrpc_call_put_poke, "PUT poke ") \ 324 324 EM(rxrpc_call_put_recvmsg, "PUT recvmsg ") \ 325 + EM(rxrpc_call_put_recvmsg_peek_nowait, "PUT peek-nwt") \ 325 326 EM(rxrpc_call_put_release_recvmsg_q, "PUT rls-rcmq") \ 326 327 EM(rxrpc_call_put_release_sock, "PUT rls-sock") \ 327 328 EM(rxrpc_call_put_release_sock_tba, "PUT rls-sk-a") \ ··· 341 340 EM(rxrpc_call_see_input, "SEE input ") \ 342 341 EM(rxrpc_call_see_notify_released, "SEE nfy-rlsd") \ 343 342 EM(rxrpc_call_see_recvmsg, "SEE recvmsg ") \ 343 + EM(rxrpc_call_see_recvmsg_requeue, "SEE recv-rqu") \ 344 + EM(rxrpc_call_see_recvmsg_requeue_first, "SEE recv-rqF") \ 345 + EM(rxrpc_call_see_recvmsg_requeue_move, "SEE recv-rqM") \ 344 346 EM(rxrpc_call_see_release, "SEE release ") \ 345 347 EM(rxrpc_call_see_userid_exists, "SEE u-exists") \ 346 348 EM(rxrpc_call_see_waiting_call, "SEE q-conn ") \
+15 -4
net/rxrpc/recvmsg.c
··· 518 518 if (rxrpc_call_has_failed(call)) 519 519 goto call_failed; 520 520 521 - if (!skb_queue_empty(&call->recvmsg_queue)) 521 + if (!(flags & MSG_PEEK) && 522 + !skb_queue_empty(&call->recvmsg_queue)) 522 523 rxrpc_notify_socket(call); 523 524 goto not_yet_complete; 524 525 ··· 550 549 error_requeue_call: 551 550 if (!(flags & MSG_PEEK)) { 552 551 spin_lock_irq(&rx->recvmsg_lock); 553 - list_add(&call->recvmsg_link, &rx->recvmsg_q); 554 - spin_unlock_irq(&rx->recvmsg_lock); 552 + if (list_empty(&call->recvmsg_link)) { 553 + list_add(&call->recvmsg_link, &rx->recvmsg_q); 554 + rxrpc_see_call(call, rxrpc_call_see_recvmsg_requeue); 555 + spin_unlock_irq(&rx->recvmsg_lock); 556 + } else if (list_is_first(&call->recvmsg_link, &rx->recvmsg_q)) { 557 + spin_unlock_irq(&rx->recvmsg_lock); 558 + rxrpc_put_call(call, rxrpc_call_see_recvmsg_requeue_first); 559 + } else { 560 + list_move(&call->recvmsg_link, &rx->recvmsg_q); 561 + spin_unlock_irq(&rx->recvmsg_lock); 562 + rxrpc_put_call(call, rxrpc_call_see_recvmsg_requeue_move); 563 + } 555 564 trace_rxrpc_recvmsg(call_debug_id, rxrpc_recvmsg_requeue, 0); 556 565 } else { 557 - rxrpc_put_call(call, rxrpc_call_put_recvmsg); 566 + rxrpc_put_call(call, rxrpc_call_put_recvmsg_peek_nowait); 558 567 } 559 568 error_no_call: 560 569 release_sock(&rx->sk);