hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()

hrtimer_force_reprogram() and hrtimer_interrupt() invoke
__hrtimer_get_next_event() to find the earliest expiry time of hrtimer
bases. __hrtimer_get_next_event() does not update
cpu_base::[softirq_]expires_next to preserve reprogramming logic. That
needs to be done at the callsites.

hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when
the first expiring timer is a softirq timer and the soft interrupt is not
activated. That's wrong because cpu_base::softirq_expires_next is left
stale when the first expiring timer of all bases is a timer which expires
in hard interrupt context. hrtimer_interrupt() never updates
cpu_base::softirq_expires_next, which is wrong too.

That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and
the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting
CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that
timer before the stale cpu_base::softirq_expires_next.
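The effect of the clock change can be illustrated with a minimal userspace
sketch. It assumes that the clock MONOTONIC based expiry of a CLOCK_REALTIME
timer is obtained by subtracting the realtime offset from the absolute
expiry. The type ktime_ns and the helper realtime_expiry_to_mono() are
hypothetical names used for illustration only, not kernel API.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ktime_t, in nanoseconds. */
typedef int64_t ktime_ns;

/*
 * Convert an absolute CLOCK_REALTIME expiry into the CLOCK_MONOTONIC
 * time base. offs_real models the realtime-minus-monotonic offset.
 * Setting CLOCK_REALTIME forward increases offs_real, so the returned
 * value becomes smaller, i.e. the timer expires earlier in monotonic time.
 */
static ktime_ns realtime_expiry_to_mono(ktime_ns expires_real, ktime_ns offs_real)
{
	return expires_real - offs_real;
}
```

With an expiry of 5000 and an offset of 1000 the monotonic expiry is 4000.
After setting the clock forward by 3000 the same timer expires at monotonic
time 1000, which lies before a cached value of 4000.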

cpu_base::softirq_expires_next is cached to make the check for raising the
soft interrupt fast. In the above case the soft interrupt won't be raised
until clock monotonic reaches the stale cpu_base::softirq_expires_next
value. That's incorrect, but what's worse is that if the softirq timer
becomes the first expiring timer of all clock bases after the hard expiry
timer has been handled, the reprogramming of the clockevent from
hrtimer_interrupt() will result in an interrupt storm. That happens because
the reprogramming does not use cpu_base::softirq_expires_next, it uses
__hrtimer_get_next_event() which returns the actual expiry time. Once clock
MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is
raised and the storm subsides.
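The mismatch between the cached check and the reprogramming value can be
modelled as follows. This is a simplified sketch, not the kernel code: the
names softirq_due() and interrupt_storm() are hypothetical, and the model
assumes that programming the clockevent with an already expired time makes
the interrupt fire again immediately.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ktime_t, in nanoseconds. */
typedef int64_t ktime_ns;

/* Fast path check: the softirq is raised once 'now' reaches the cached value. */
static bool softirq_due(ktime_ns now, ktime_ns cached_soft_expiry)
{
	return now >= cached_soft_expiry;
}

/*
 * The clockevent is programmed with the actual earliest expiry. If that
 * expiry is already in the past while the cached check still reports
 * "not due", each interrupt reprograms for an expired time and the soft
 * timer is never run.
 */
static bool interrupt_storm(ktime_ns now, ktime_ns cached_soft_expiry,
			    ktime_ns actual_next_expiry)
{
	return actual_next_expiry <= now &&
	       !softirq_due(now, cached_soft_expiry);
}
```

For now = 2000, a stale cached value of 4000 and an actual expiry of 1000,
interrupt_storm() is true. Once now reaches 4000 the softirq becomes due and
the condition no longer holds.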

Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard
bases separately, update softirq_expires_next and handle the case when a
soft expiring timer is the first of all bases by comparing the expiry times
and updating the required cpu base fields. Split this functionality into a
separate function to be able to use it in hrtimer_interrupt() as well
without copy paste.
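The new evaluation order can be sketched as a userspace model. The struct
cpu_base_model and its fields earliest_soft, earliest_hard and
next_timer_is_soft are hypothetical simplifications: they replace the scan
over the clock bases and the next_timer pointer of the real code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ktime_t, in nanoseconds. */
typedef int64_t ktime_ns;
#define KTIME_NS_MAX INT64_MAX

struct cpu_base_model {
	bool softirq_activated;        /* softirq already raised */
	ktime_ns softirq_expires_next; /* cached soft expiry */
	ktime_ns earliest_soft;        /* result of scanning the soft bases */
	ktime_ns earliest_hard;        /* result of scanning the hard bases */
	bool next_timer_is_soft;       /* models cpu_base->next_timer */
};

static ktime_ns update_next_event_model(struct cpu_base_model *b)
{
	ktime_ns expires_next, soft = KTIME_NS_MAX;

	/* Soft bases are ignored while the softirq is already activated. */
	if (!b->softirq_activated) {
		soft = b->earliest_soft;
		/* Always refresh the cache, even if a hard timer is first. */
		b->softirq_expires_next = soft;
	}

	expires_next = b->earliest_hard;
	b->next_timer_is_soft = false;

	/* A soft timer expires first: program the hardware with its expiry. */
	if (expires_next > soft) {
		b->next_timer_is_soft = true;
		expires_next = soft;
	}

	return expires_next;
}
```

In this model the cached soft expiry is refreshed on every call unless the
softirq is already activated, so it cannot go stale when a hard timer
happens to expire first.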

Fixes: 5da70160462e ("hrtimer: Implement support for softirq based hrtimers")
Reported-by: Mikael Beckius <mikael.beckius@windriver.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mikael Beckius <mikael.beckius@windriver.com>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-maria@linutronix.de

authored by Anna-Maria Behnsen and committed by Ingo Molnar 46eb1701 a38fd874

Changed files
kernel/time/hrtimer.c | +39 -21
···
 }
 
 /*
- * Recomputes cpu_base::*next_timer and returns the earliest expires_next but
- * does not set cpu_base::*expires_next, that is done by hrtimer_reprogram.
+ * Recomputes cpu_base::*next_timer and returns the earliest expires_next
+ * but does not set cpu_base::*expires_next, that is done by
+ * hrtimer[_force]_reprogram and hrtimer_interrupt only. When updating
+ * cpu_base::*expires_next right away, reprogramming logic would no longer
+ * work.
  *
  * When a softirq is pending, we can ignore the HRTIMER_ACTIVE_SOFT bases,
  * those timers will get run whenever the softirq gets handled, at the end of
···
 		cpu_base->next_timer = next_timer;
 		expires_next = __hrtimer_next_event_base(cpu_base, NULL, active,
 							 expires_next);
+	}
+
+	return expires_next;
+}
+
+static ktime_t hrtimer_update_next_event(struct hrtimer_cpu_base *cpu_base)
+{
+	ktime_t expires_next, soft = KTIME_MAX;
+
+	/*
+	 * If the soft interrupt has already been activated, ignore the
+	 * soft bases. They will be handled in the already raised soft
+	 * interrupt.
+	 */
+	if (!cpu_base->softirq_activated) {
+		soft = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_SOFT);
+		/*
+		 * Update the soft expiry time. clock_settime() might have
+		 * affected it.
+		 */
+		cpu_base->softirq_expires_next = soft;
+	}
+
+	expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_HARD);
+	/*
+	 * If a softirq timer is expiring first, update cpu_base->next_timer
+	 * and program the hardware with the soft expiry time.
+	 */
+	if (expires_next > soft) {
+		cpu_base->next_timer = cpu_base->softirq_next_timer;
+		expires_next = soft;
 	}
 
 	return expires_next;
···
 {
 	ktime_t expires_next;
 
-	/*
-	 * Find the current next expiration time.
-	 */
-	expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL);
-
-	if (cpu_base->next_timer && cpu_base->next_timer->is_soft) {
-		/*
-		 * When the softirq is activated, hrtimer has to be
-		 * programmed with the first hard hrtimer because soft
-		 * timer interrupt could occur too late.
-		 */
-		if (cpu_base->softirq_activated)
-			expires_next = __hrtimer_get_next_event(cpu_base,
-								HRTIMER_ACTIVE_HARD);
-		else
-			cpu_base->softirq_expires_next = expires_next;
-	}
+	expires_next = hrtimer_update_next_event(cpu_base);
 
 	if (skip_equal && expires_next == cpu_base->expires_next)
 		return;
···
 
 	__hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_HARD);
 
-	/* Reevaluate the clock bases for the next expiry */
-	expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL);
+	/* Reevaluate the clock bases for the [soft] next expiry */
+	expires_next = hrtimer_update_next_event(cpu_base);
 	/*
 	 * Store the new expiry value so the migration code can verify
 	 * against it.