Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/vc4: Make pageflip completion handling more robust.

Protect both the setup of the pageflip event and the
latching of the new requested displaylist head pointer
by the event lock. Otherwise the following race is
possible: vc4_atomic_flush latches the new display list
via HVS_WRITE and is immediately preempted before queueing
the pageflip event. The page flip then completes in hw,
and vc4_crtc_handle_page_flip() runs and no-ops due to the
lack of a pending pageflip event. Only then does
vc4_atomic_flush continue and queue the pageflip event,
after the page flip handling already no-oped. This would
cause flip completion handling only at the next vblank,
one frame too late.
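
The critical section described above can be modelled in user space. This is only a sketch under stated assumptions, not the driver code: `struct model_crtc`, `model_flush()` and `model_handle_page_flip()` are hypothetical names, a C11 `atomic_flag` spinlock stands in for `dev->event_lock`, and a plain field stands in for the SCALER_DISPLISTX register.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the driver state. */
struct model_crtc {
	atomic_flag event_lock;	/* plays the role of dev->event_lock */
	const char *event;	/* pending pageflip event, NULL if none */
	uint32_t displist;	/* plays the role of SCALER_DISPLISTX */
	unsigned int completed;	/* completions delivered so far */
};

static void model_lock(struct model_crtc *c)
{
	while (atomic_flag_test_and_set_explicit(&c->event_lock,
						 memory_order_acquire))
		; /* spin until the lock is free */
}

static void model_unlock(struct model_crtc *c)
{
	atomic_flag_clear_explicit(&c->event_lock, memory_order_release);
}

/* Flush path: queue the event and latch the list head in one critical
 * section, so the handler can never see one without the other.
 */
static void model_flush(struct model_crtc *c, const char *event,
			uint32_t dlist_start)
{
	assert(event != NULL);
	model_lock(c);
	c->event = event;
	c->displist = dlist_start;
	model_unlock(c);
}

/* Vblank path: returns 1 if a completion was delivered, 0 if it no-oped. */
static int model_handle_page_flip(struct model_crtc *c)
{
	int sent = 0;

	model_lock(c);
	if (c->event) {
		c->event = NULL;
		c->completed++;
		sent = 1;
	}
	model_unlock(c);
	return sent;
}
```

Because both writes happen while the lock is held, a handler running concurrently sees either the old state (no event, old list head) or the new state (event queued, new list head), never the mixed state that caused the late completion.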

In vc4_crtc_handle_page_flip() check the actual DL head
pointer in SCALER_DISPLACTX against the requested pointer
for page flip to make sure that the flip really completed
in the current vblank and doesn't get deferred to the next
one because the DL head pointer was written a bit too late
into SCALER_DISPLISTX, after start of vblank, and missed
the boat. This avoids handling a pageflip completion one
frame too early.
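
The completion condition can be illustrated with a small stand-alone model. This is a sketch, not the driver's code: `struct flip_model` and `flip_model_handle_vblank()` are hypothetical names, and plain fields replace the register read and the event pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the state the vblank handler looks at. */
struct flip_model {
	uint32_t requested_start;	/* stands in for vc4_state->mm.start */
	uint32_t displactx;		/* stands in for the SCALER_DISPLACTX read */
	int event_pending;		/* stands in for vc4_crtc->event != NULL */
};

/* Returns 1 and consumes the event only if the hardware is already
 * scanning out the requested display list. Otherwise the event stays
 * pending and is reconsidered at the next vblank.
 */
static int flip_model_handle_vblank(struct flip_model *m)
{
	if (m->event_pending && m->requested_start == m->displactx) {
		m->event_pending = 0;
		return 1;
	}
	return 0;
}
```

With `requested_start` at 0x200 and `displactx` still at 0x100, the first call returns 0 and leaves the event pending. Once `displactx` reads 0x200, the next call delivers the completion.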

According to Eric, DL head pointer updates which were
written into the HVS DISPLISTX reg get committed to hardware
at the last pixel of active scanout. Our vblank interrupt
handler, as triggered by PV_INT_VFP_START irq, gets to run
earliest at the first pixel of HBLANK at the end of the
last scanline of active scanout, i.e. vblank irq handling
runs at least 1 pixel duration after a potential pageflip
completion happened in hardware.

This ordering of events in the hardware, together with the
lock protection and SCALER_DISPLACTX sampling of this patch,
guarantees that pageflip completion handling only runs at
exactly the vblank irq of actual pageflip completion in all
cases.

Background info from Eric about the relative timing of
HVS, PV's and trigger points for interrupts, DL updates:

https://lists.freedesktop.org/archives/dri-devel/2016-May/107510.html

Tested on RPi 2B with hardware timing measurement equipment
and shown to no longer complete flips too early or too late.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

Authored by Mario Kleiner and committed by Eric Anholt
56d1fe09 b10c22e5

+22 -10
+18 -10
drivers/gpu/drm/vc4/vc4_crtc.c
···

 	WARN_ON_ONCE(dlist_next - dlist_start != vc4_state->mm.size);

-	HVS_WRITE(SCALER_DISPLISTX(vc4_crtc->channel),
-		  vc4_state->mm.start);
-
-	if (debug_dump_regs) {
-		DRM_INFO("CRTC %d HVS after:\n", drm_crtc_index(crtc));
-		vc4_hvs_dump_state(dev);
-	}
-
 	if (crtc->state->event) {
 		unsigned long flags;

···

 		spin_lock_irqsave(&dev->event_lock, flags);
 		vc4_crtc->event = crtc->state->event;
-		spin_unlock_irqrestore(&dev->event_lock, flags);
 		crtc->state->event = NULL;
+
+		HVS_WRITE(SCALER_DISPLISTX(vc4_crtc->channel),
+			  vc4_state->mm.start);
+
+		spin_unlock_irqrestore(&dev->event_lock, flags);
+	} else {
+		HVS_WRITE(SCALER_DISPLISTX(vc4_crtc->channel),
+			  vc4_state->mm.start);
+	}
+
+	if (debug_dump_regs) {
+		DRM_INFO("CRTC %d HVS after:\n", drm_crtc_index(crtc));
+		vc4_hvs_dump_state(dev);
 	}
 }

···
 {
 	struct drm_crtc *crtc = &vc4_crtc->base;
 	struct drm_device *dev = crtc->dev;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(crtc->state);
+	u32 chan = vc4_crtc->channel;
 	unsigned long flags;

 	spin_lock_irqsave(&dev->event_lock, flags);
-	if (vc4_crtc->event) {
+	if (vc4_crtc->event &&
+	    (vc4_state->mm.start == HVS_READ(SCALER_DISPLACTX(chan)))) {
 		drm_crtc_send_vblank_event(crtc, vc4_crtc->event);
 		vc4_crtc->event = NULL;
 		drm_crtc_vblank_put(crtc);
+4
drivers/gpu/drm/vc4/vc4_regs.h
···
 #define SCALER_DISPLACT0                        0x00000030
 #define SCALER_DISPLACT1                        0x00000034
 #define SCALER_DISPLACT2                        0x00000038
+#define SCALER_DISPLACTX(x)			(SCALER_DISPLACT0 +	\
+						 (x) * (SCALER_DISPLACT1 - \
+							SCALER_DISPLACT0))
+
 #define SCALER_DISPCTRL0                        0x00000040
 # define SCALER_DISPCTRLX_ENABLE		BIT(31)
 # define SCALER_DISPCTRLX_RESET			BIT(30)
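
The arithmetic behind the new SCALER_DISPLACTX(x) macro can be checked in isolation. In this sketch the register defines are copied from the hunk above, while `displact_offset()` and its three-channel bound are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SCALER_DISPLACT0	0x00000030
#define SCALER_DISPLACT1	0x00000034
#define SCALER_DISPLACT2	0x00000038
#define SCALER_DISPLACTX(x)	(SCALER_DISPLACT0 +	\
				 (x) * (SCALER_DISPLACT1 - \
					SCALER_DISPLACT0))

/* Hypothetical helper: register offset for HVS channel 0, 1 or 2.
 * The stride is DISPLACT1 - DISPLACT0 = 4 bytes per channel.
 */
static uint32_t displact_offset(uint32_t chan)
{
	assert(chan < 3);	/* only three DISPLACT registers are defined */
	return SCALER_DISPLACTX(chan);
}
```

Channels 0, 1 and 2 map to offsets 0x30, 0x34 and 0x38, matching the three fixed defines.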