Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/i915: fix wait_for_pending_flips vs gpu hang deadlock

My g33 here seems to be shockingly good at hitting them all. This time
around kms_flip/flip-vs-panning-vs-hang blows up:

intel_crtc_wait_for_pending_flips correctly checks for gpu hangs and,
if a gpu hang is pending, aborts the wait for outstanding flips so that
the setcrtc call will succeed and release the crtc mutex. And the gpu
hang handler needs that lock in intel_display_handle_reset to be able
to complete outstanding flips.

The problem is that we can race in two ways:
- Waiters on the dev_priv->pending_flip_queue aren't woken up after
we've marked the reset as pending, but before we actually start the
reset work. This means that the waiter doesn't notice the pending
reset and hence will keep on hogging the locks.

As with dev->struct_mutex and the ring->irq_queue wait queues, we
therefore need to wake up everyone that potentially holds a lock
which the reset handler needs.

- intel_display_handle_reset was called _after_ we've already
signalled the completion of the reset work, which means a waiter
could sneak in, grab the lock and never release it (since the
pageflips won't ever get completed).

Similar to resetting the gem state, all the reset work must complete
before we update the reset counter. Contrary to the gem reset we
don't need to have a second explicit wake up call since that will
have happened already when completing the pageflips. We also don't
have any issues that the completion happens while the reset state is
still pending - wait_for_pending_flips is only there to ensure we
display the right frame. After a gpu hang & reset event such
guarantees are out the window anyway. This is in contrast to the gem
code where too-early wake-up would result in unnecessary restarting
of ioctls.

Also, since we've gotten these various deadlocks and ordering
constraints wrong so often, throw copious amounts of comments at the
code.

This deadlock regression has been introduced in the commit which added
the pageflip reset logic to the gpu hang work:

commit 96a02917a0131e52efefde49c2784c0421d6c439
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Mon Feb 18 19:08:49 2013 +0200

drm/i915: Finish page flips and update primary planes after a GPU reset

v2:
- Add comments to explain how the wake_up calls serve as memory
barriers for the atomic_t reset counter.
- Improve the comments a bit as suggested by Chris Wilson.
- Extract the wake_up calls before/after the reset into a little
i915_error_wake_up and unconditionally wake up the
pending_flip_queue waiters, again as suggested by Chris Wilson.

v3: Throw copious amounts of comments at i915_error_wake_up as
suggested by Chris Wilson.

Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

+54 -14
drivers/gpu/drm/i915/i915_irq.c
···
 	return ret;
 }

+static void i915_error_wake_up(struct drm_i915_private *dev_priv,
+			       bool reset_completed)
+{
+	struct intel_ring_buffer *ring;
+	int i;
+
+	/*
+	 * Notify all waiters for GPU completion events that reset state has
+	 * been changed, and that they need to restart their wait after
+	 * checking for potential errors (and bail out to drop locks if there is
+	 * a gpu reset pending so that i915_error_work_func can acquire them).
+	 */
+
+	/* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
+	for_each_ring(ring, dev_priv, i)
+		wake_up_all(&ring->irq_queue);
+
+	/* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
+	wake_up_all(&dev_priv->pending_flip_queue);
+
+	/*
+	 * Signal tasks blocked in i915_gem_wait_for_error that the pending
+	 * reset state is cleared.
+	 */
+	if (reset_completed)
+		wake_up_all(&dev_priv->gpu_error.reset_queue);
+}
+
 /**
  * i915_error_work_func - do process context error handling work
  * @work: work struct
···
 	drm_i915_private_t *dev_priv = container_of(error, drm_i915_private_t,
 						    gpu_error);
 	struct drm_device *dev = dev_priv->dev;
-	struct intel_ring_buffer *ring;
 	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
 	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
 	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
-	int i, ret;
+	int ret;

 	kobject_uevent_env(&dev->primary->kdev.kobj, KOBJ_CHANGE, error_event);
···
 		kobject_uevent_env(&dev->primary->kdev.kobj, KOBJ_CHANGE,
 				   reset_event);

+		/*
+		 * All state reset _must_ be completed before we update the
+		 * reset counter, for otherwise waiters might miss the reset
+		 * pending state and not properly drop locks, resulting in
+		 * deadlocks with the reset work.
+		 */
 		ret = i915_reset(dev);
+
+		intel_display_handle_reset(dev);

 		if (ret == 0) {
 			/*
···
 			atomic_set(&error->reset_counter, I915_WEDGED);
 		}

-		for_each_ring(ring, dev_priv, i)
-			wake_up_all(&ring->irq_queue);
-
-		intel_display_handle_reset(dev);
-
-		wake_up_all(&dev_priv->gpu_error.reset_queue);
+		/*
+		 * Note: The wake_up also serves as a memory barrier so that
+		 * waiters see the update value of the reset counter atomic_t.
+		 */
+		i915_error_wake_up(dev_priv, true);
 	}
 }
···
 void i915_handle_error(struct drm_device *dev, bool wedged)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring;
-	int i;

 	i915_capture_error_state(dev);
 	i915_report_and_clear_eir(dev);
···
 				&dev_priv->gpu_error.reset_counter);

 		/*
-		 * Wakeup waiting processes so that the reset work item
-		 * doesn't deadlock trying to grab various locks.
+		 * Wakeup waiting processes so that the reset work function
+		 * i915_error_work_func doesn't deadlock trying to grab various
+		 * locks. By bumping the reset counter first, the woken
+		 * processes will see a reset in progress and back off,
+		 * releasing their locks and then wait for the reset completion.
+		 * We must do this for _all_ gpu waiters that might hold locks
+		 * that the reset work needs to acquire.
+		 *
+		 * Note: The wake_up serves as the required memory barrier to
+		 * ensure that the waiters see the updated value of the reset
+		 * counter atomic_t.
 		 */
-		for_each_ring(ring, dev_priv, i)
-			wake_up_all(&ring->irq_queue);
+		i915_error_wake_up(dev_priv, false);
 	}

 /*