Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: Use a local struct to do the initial vfs_poll() on an irqfd

Use a function-local struct for the poll_table passed to vfs_poll(), as
nothing in the vfs_poll() callchain grabs a long-term reference to the
structure, i.e. its lifetime doesn't need to be tied to the irqfd. Using
a local structure will also allow propagating failures out of the polling
callback without further polluting kvm_kernel_irqfd.
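
The pattern described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: every name ending in _sketch, the poll_once() helper, and the ret field are hypothetical stand-ins (for poll_table, kvm_kernel_irqfd, struct kvm_irqfd_pt and vfs_poll()), and container_of() is re-derived from offsetof(). It shows a stack-allocated wrapper carrying a back-pointer to the long-lived object, plus a slot the callback could use to report failure.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature of poll_table: it only carries the callback. */
struct poll_table_sketch;
typedef void (*queue_proc_t)(struct poll_table_sketch *pt);

struct poll_table_sketch {
	queue_proc_t qproc;
};

/* Hypothetical long-lived object, standing in for kvm_kernel_irqfd. */
struct irqfd_sketch {
	int registered;
};

/* Short-lived wrapper, mirroring the shape of struct kvm_irqfd_pt. */
struct irqfd_pt_sketch {
	struct irqfd_sketch *irqfd;	/* back-pointer to the real object */
	struct poll_table_sketch pt;	/* embedded table handed to the poller */
	int ret;			/* assumed slot for reporting failure */
};

/* Callback: recover the wrapper from the embedded table, then the object. */
static void register_cb(struct poll_table_sketch *pt)
{
	struct irqfd_pt_sketch *p = container_of(pt, struct irqfd_pt_sketch, pt);

	if (p->irqfd->registered) {
		p->ret = -EBUSY;
		return;
	}
	p->irqfd->registered = 1;
	p->ret = 0;
}

/* Stand-in for vfs_poll(): runs the callback synchronously, keeps no reference. */
static void poll_once(struct poll_table_sketch *pt)
{
	if (pt->qproc)
		pt->qproc(pt);
}

/* The wrapper lives on the stack only for the duration of the poll. */
static int assign_sketch(struct irqfd_sketch *irqfd)
{
	struct irqfd_pt_sketch irqfd_pt;

	irqfd_pt.irqfd = irqfd;
	irqfd_pt.pt.qproc = register_cb;
	irqfd_pt.ret = 0;

	poll_once(&irqfd_pt.pt);
	return irqfd_pt.ret;
}
```

Calling assign_sketch() twice on the same object returns 0 and then -EBUSY in this sketch. Because the poller never stores the table pointer, nothing outlives the stack frame, which is the property the commit message relies on.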

Opportunistically rename irqfd_ptable_queue_proc() to kvm_irqfd_register()
to capture what it actually does.

Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250522235223.3178519-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

+17 -10
-1
include/linux/kvm_irqfd.h
···
 	/* Used for setup/shutdown */
 	struct eventfd_ctx *eventfd;
 	struct list_head list;
-	poll_table pt;
 	struct work_struct shutdown;
 	struct irq_bypass_consumer consumer;
 	struct irq_bypass_producer *producer;
+17 -9
virt/kvm/eventfd.c
···
 	return ret;
 }
 
-static void
-irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh,
-			poll_table *pt)
+struct kvm_irqfd_pt {
+	struct kvm_kernel_irqfd *irqfd;
+	poll_table pt;
+};
+
+static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh,
+			       poll_table *pt)
 {
-	struct kvm_kernel_irqfd *irqfd =
-		container_of(pt, struct kvm_kernel_irqfd, pt);
+	struct kvm_irqfd_pt *p = container_of(pt, struct kvm_irqfd_pt, pt);
+	struct kvm_kernel_irqfd *irqfd = p->irqfd;
+
 	add_wait_queue_priority(wqh, &irqfd->wait);
 }
 
···
 {
 	struct kvm_kernel_irqfd *irqfd, *tmp;
 	struct eventfd_ctx *eventfd = NULL, *resamplefd = NULL;
+	struct kvm_irqfd_pt irqfd_pt;
 	int ret;
 	__poll_t events;
 	int idx;
···
 	 * a callback whenever someone signals the underlying eventfd
 	 */
 	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
-	init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
 
 	spin_lock_irq(&kvm->irqfds.lock);
 
···
 	spin_unlock_irq(&kvm->irqfds.lock);
 
 	/*
-	 * Check if there was an event already pending on the eventfd
-	 * before we registered, and trigger it as if we didn't miss it.
+	 * Register the irqfd with the eventfd by polling on the eventfd. If
+	 * there was en event pending on the eventfd prior to registering,
+	 * manually trigger IRQ injection.
 	 */
-	events = vfs_poll(fd_file(f), &irqfd->pt);
+	irqfd_pt.irqfd = irqfd;
+	init_poll_funcptr(&irqfd_pt.pt, kvm_irqfd_register);
 
+	events = vfs_poll(fd_file(f), &irqfd_pt.pt);
 	if (events & EPOLLIN)
 		schedule_work(&irqfd->inject);
 
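
The last hunk checks the poll result for EPOLLIN because an eventfd may already have been signalled before the irqfd was registered. A rough userspace analogue, assuming Linux with eventfd(2) and poll(2), is sketched below. The function name pending_before_poll() is hypothetical, and the sketch only demonstrates the "event already pending at first poll" behaviour, not the kernel's wait-queue registration.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Signal an eventfd first, then poll it for the first time.
 * Returns 1 if the first poll already reports POLLIN, 0 if it does not,
 * and -1 on any setup failure.
 */
static int pending_before_poll(void)
{
	uint64_t one = 1;
	struct pollfd pfd;
	int ready = -1;
	int fd;

	fd = eventfd(0, EFD_NONBLOCK);
	if (fd < 0)
		return -1;

	/* The event is raised before anyone has polled the descriptor. */
	if (write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one)) {
		pfd.fd = fd;
		pfd.events = POLLIN;
		pfd.revents = 0;

		/* Zero timeout: just sample the current readiness state. */
		if (poll(&pfd, 1, 0) >= 0)
			ready = (pfd.revents & POLLIN) ? 1 : 0;
	}

	close(fd);
	return ready;
}
```

In the kernel path the equivalent observation leads to schedule_work(&irqfd->inject), so a signal that arrived before registration is not lost.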