[Xen-devel] [PATCH 0/4] x86: EOI timer corrections / improvements

Jan Beulich posted 4 patches 4 years, 11 months ago
[Xen-devel] [PATCH 0/4] x86: EOI timer corrections / improvements
Posted by Jan Beulich 4 years, 11 months ago
The first patch was sent on its own before; this is a plain resend. The
others have been added to address at least the majority of the
questions raised in
https://lists.xenproject.org/archives/html/xen-devel/2019-04/msg00883.html

1: don't keep EOI timer running without need
2: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
3: relax locking in irq_guest_eoi_timer_fn()
4: ACKTYPE_NONE cannot make it into irq_guest_eoi_timer_fn()

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH 1/4] x86/IRQ: don't keep EOI timer running without need
Posted by Jan Beulich 4 years, 11 months ago
The timer needs to remain active only until all pending IRQ instances
have seen EOIs from their respective domains. Stop it when the in-flight
count has reached zero in desc_guest_eoi(). Note that this is race free
(with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
that point.

Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
of stopping it immediately before re-setting, stop it as soon as we've
made it past any early returns from the function (and hence we're sure
it'll get set again).

Finally bail from the actual timer handler in case we find the timer
already active again by the time we've managed to acquire the IRQ
descriptor lock. Without this we may forcibly EOI an IRQ immediately
after it got sent to a guest. For this, timer_is_active() gets split out
of active_timer(), deliberately moving just one of the two ASSERT()s (to
allow the function to be used also on a never initialized timer).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1115,6 +1115,9 @@ static void irq_guest_eoi_timer_fn(void
 
     action = (irq_guest_action_t *)desc->action;
 
+    if ( timer_is_active(&action->eoi_timer) )
+        goto out;
+
     if ( action->ack_type != ACKTYPE_NONE )
     {
         unsigned int i;
@@ -1167,6 +1170,9 @@ static void __do_IRQ_guest(int irq)
         return;
     }
 
+    if ( action->ack_type != ACKTYPE_NONE )
+        stop_timer(&action->eoi_timer);
+
     if ( action->ack_type == ACKTYPE_EOI )
     {
         sp = pending_eoi_sp(peoi);
@@ -1194,7 +1200,6 @@ static void __do_IRQ_guest(int irq)
 
     if ( action->ack_type != ACKTYPE_NONE )
     {
-        stop_timer(&action->eoi_timer);
         migrate_timer(&action->eoi_timer, smp_processor_id());
         set_timer(&action->eoi_timer, NOW() + MILLISECS(1));
     }
@@ -1457,6 +1462,8 @@ void desc_guest_eoi(struct irq_desc *des
         return;
     }
 
+    stop_timer(&action->eoi_timer);
+
     if ( action->ack_type == ACKTYPE_UNMASK )
     {
         ASSERT(cpumask_empty(action->cpu_eoi_map));
--- a/xen/common/timer.c
+++ b/xen/common/timer.c
@@ -282,11 +282,10 @@ static inline void timer_unlock(struct t
 })
 
 
-static bool_t active_timer(struct timer *timer)
+static bool active_timer(const struct timer *timer)
 {
     ASSERT(timer->status >= TIMER_STATUS_inactive);
-    ASSERT(timer->status <= TIMER_STATUS_in_list);
-    return (timer->status >= TIMER_STATUS_in_heap);
+    return timer_is_active(timer);
 }
 
 
--- a/xen/include/xen/timer.h
+++ b/xen/include/xen/timer.h
@@ -75,6 +75,19 @@ bool timer_expires_before(struct timer *
 
 #define timer_is_expired(t) timer_expires_before(t, NOW())
 
+/*
+ * True if a timer is active.
+ *
+ * Unlike for timer_expires_before(), it is the caller's responsibility to
+ * use suitable locking such that the returned value isn't stale by the time
+ * it gets acted upon.
+ */
+static inline bool timer_is_active(const struct timer *timer)
+{
+    ASSERT(timer->status <= TIMER_STATUS_in_list);
+    return timer->status >= TIMER_STATUS_in_heap;
+}
+
 /* Migrate a timer to a different CPU. The timer may be currently active. */
 void migrate_timer(struct timer *timer, unsigned int new_cpu);
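
To illustrate the split described in the commit message, here is a minimal standalone sketch of the two checks. It is a model rather than the Xen source: the model_ names and the numeric status values are assumptions chosen for illustration, and plain assert() stands in for ASSERT().

```c
#include <assert.h>
#include <stdbool.h>

/* Status values modelled on the timer states; the numbers are assumed. */
enum model_timer_status {
    MODEL_STATUS_invalid  = 0, /* zero-filled, never initialized */
    MODEL_STATUS_inactive = 1,
    MODEL_STATUS_killed   = 2,
    MODEL_STATUS_in_heap  = 3,
    MODEL_STATUS_in_list  = 4,
};

struct model_timer {
    int status;
};

/*
 * Header-level check: only the upper bound is asserted, so a timer
 * that was never initialized (status 0) is accepted and reported
 * as not active.
 */
static inline bool model_timer_is_active(const struct model_timer *t)
{
    assert(t->status <= MODEL_STATUS_in_list);
    return t->status >= MODEL_STATUS_in_heap;
}

/* Internal check: additionally insists on an initialized timer. */
static bool model_active_timer(const struct model_timer *t)
{
    assert(t->status >= MODEL_STATUS_inactive);
    return model_timer_is_active(t);
}
```

In this model the looser check can be called on a zero-filled structure, while the stricter one would trip its lower-bound assertion on the same input.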
 





Re: [Xen-devel] [PATCH 1/4] x86/IRQ: don't keep EOI timer running without need
Posted by Roger Pau Monné 4 years, 11 months ago
On Wed, May 08, 2019 at 06:46:25AM -0600, Jan Beulich wrote:
> The timer needs to remain active only until all pending IRQ instances
> have seen EOIs from their respective domains. Stop it when the in-flight
> count has reached zero in desc_guest_eoi(). Note that this is race free
> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
> that point.
> 
> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
> of stopping it immediately before re-setting, stop it as soon as we've
> made it past any early returns from the function (and hence we're sure
> it'll get set again).
> 
> Finally bail from the actual timer handler in case we find the timer
> already active again by the time we've managed to acquire the IRQ
> descriptor lock. Without this we may forcibly EOI an IRQ immediately
> after it got sent to a guest. For this, timer_is_active() gets split out
> of active_timer(), deliberately moving just one of the two ASSERT()s (to
> allow the function to be used also on a never initialized timer).

AFAICT timer_is_active is exclusively used in irq_guest_eoi_timer_fn,
which must have initialized the timer in order for
irq_guest_eoi_timer_fn to be called, and hence I'm not sure why you
need to be able to call timer_is_active with an uninitialized timer.

Is this maybe used by other patches?

Thanks, Roger.

Re: [Xen-devel] [PATCH 1/4] x86/IRQ: don't keep EOI timer running without need
Posted by Jan Beulich 4 years, 11 months ago
>>> On 16.05.19 at 12:32, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 06:46:25AM -0600, Jan Beulich wrote:
>> The timer needs to remain active only until all pending IRQ instances
>> have seen EOIs from their respective domains. Stop it when the in-flight
>> count has reached zero in desc_guest_eoi(). Note that this is race free
>> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
>> that point.
>> 
>> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
>> of stopping it immediately before re-setting, stop it as soon as we've
>> made it past any early returns from the function (and hence we're sure
>> it'll get set again).
>> 
>> Finally bail from the actual timer handler in case we find the timer
>> already active again by the time we've managed to acquire the IRQ
>> descriptor lock. Without this we may forcibly EOI an IRQ immediately
>> after it got sent to a guest. For this, timer_is_active() gets split out
>> of active_timer(), deliberately moving just one of the two ASSERT()s (to
>> allow the function to be used also on a never initialized timer).
> 
> AFAICT timer_is_active is exclusively used in irq_guest_eoi_timer_fn,
> which must have initialized the timer in order for
> irq_guest_eoi_timer_fn to be called, and hence I'm not sure why you
> need to be able to call timer_is_active with an uninitialized timer.

It's not needed here, but I consider this useful behavior when used
outside of the specific timer's handler.

> Is this maybe used by other patches?

None that I would have in the works.

Jan



Re: [Xen-devel] [PATCH 1/4] x86/IRQ: don't keep EOI timer running without need
Posted by Roger Pau Monné 4 years, 11 months ago
On Thu, May 16, 2019 at 04:50:22AM -0600, Jan Beulich wrote:
> >>> On 16.05.19 at 12:32, <roger.pau@citrix.com> wrote:
> > On Wed, May 08, 2019 at 06:46:25AM -0600, Jan Beulich wrote:
> >> The timer needs to remain active only until all pending IRQ instances
> >> have seen EOIs from their respective domains. Stop it when the in-flight
> >> count has reached zero in desc_guest_eoi(). Note that this is race free
> >> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
> >> that point.
> >> 
> >> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
> >> of stopping it immediately before re-setting, stop it as soon as we've
> >> made it past any early returns from the function (and hence we're sure
> >> it'll get set again).
> >> 
> >> Finally bail from the actual timer handler in case we find the timer
> >> already active again by the time we've managed to acquire the IRQ
> >> descriptor lock. Without this we may forcibly EOI an IRQ immediately
> >> after it got sent to a guest. For this, timer_is_active() gets split out
> >> of active_timer(), deliberately moving just one of the two ASSERT()s (to
> >> allow the function to be used also on a never initialized timer).
> > 
> > AFAICT timer_is_active is exclusively used in irq_guest_eoi_timer_fn,
> > which must have initialized the timer in order for
> > irq_guest_eoi_timer_fn to be called, and hence I'm not sure why you
> > need to be able to call timer_is_active with an uninitialized timer.
> 
> It's not needed here, but I consider this useful behavior when used
> outside of the specific timer's handler.
> 
> > Is this maybe used by other patches?
> 
> None that I would have in the works.

Then IMO I would rather make timer_is_active a replacement for
active_timer (or just move active_timer to the header) if there's no
user that can call timer_is_active with an uninitialized timer. Ie: I
would keep the asserts as restrictive as possible unless there's a
user that requires less restrictive assertions.

Anyway, the change is an improvement, so with or without that changed:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

Re: [Xen-devel] [PATCH 1/4] x86/IRQ: don't keep EOI timer running without need
Posted by Andrew Cooper 4 years, 10 months ago
On 08/05/2019 13:46, Jan Beulich wrote:
> The timer needs to remain active only until all pending IRQ instances
> have seen EOIs from their respective domains. Stop it when the in-flight
> count has reached zero in desc_guest_eoi(). Note that this is race free
> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
> that point.
>
> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
> of stopping it immediately before re-setting, stop it as soon as we've
> made it past any early returns from the function (and hence we're sure
> it'll get set again).

Why is this a good thing?

>
> Finally bail from the actual timer handler in case we find the timer
> already active again by the time we've managed to acquire the IRQ
> descriptor lock. Without this we may forcibly EOI an IRQ immediately
> after it got sent to a guest. For this, timer_is_active() gets split out
> of active_timer(), deliberately moving just one of the two ASSERT()s (to
> allow the function to be used also on a never initialized timer).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1115,6 +1115,9 @@ static void irq_guest_eoi_timer_fn(void
>  
>      action = (irq_guest_action_t *)desc->action;
>  

/* Another instance of this timer already running? Skip everything to
avoid forcing an EOI early. */

~Andrew

> +    if ( timer_is_active(&action->eoi_timer) )
> +        goto out;
> +
>      if ( action->ack_type != ACKTYPE_NONE )
>      {
>          unsigned int i;
>

Re: [Xen-devel] [PATCH 1/4] x86/IRQ: don't keep EOI timer running without need
Posted by Jan Beulich 4 years, 10 months ago
>>> On 05.06.19 at 19:04, <andrew.cooper3@citrix.com> wrote:
> On 08/05/2019 13:46, Jan Beulich wrote:
>> The timer needs to remain active only until all pending IRQ instances
>> have seen EOIs from their respective domains. Stop it when the in-flight
>> count has reached zero in desc_guest_eoi(). Note that this is race free
>> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
>> that point.
>>
>> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
>> of stopping it immediately before re-setting, stop it as soon as we've
>> made it past any early returns from the function (and hence we're sure
>> it'll get set again).
> 
> Why is this a good thing?

For it to not fire when it doesn't need to. If we're about to set
a new timeout, we clearly don't want the previous one to have
any effect anymore.

>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -1115,6 +1115,9 @@ static void irq_guest_eoi_timer_fn(void
>>  
>>      action = (irq_guest_action_t *)desc->action;
>>  
> 
> /* Another instance of this timer already running? Skip everything to
> avoid forcing an EOI early. */

Fine with me, added.

Jan
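
The race that the early bail in the timer handler guards against can be sketched with a small single-threaded model. This is only an illustration under stated assumptions: the model_ names are hypothetical, a boolean stands in for the timer state, and the IRQ descriptor lock is represented simply by the order in which the functions are called.

```c
#include <assert.h>
#include <stdbool.h>

struct model_action {
    bool timer_active;        /* stands in for timer_is_active() */
    unsigned int in_flight;   /* interrupts handed out, not yet EOI'd */
    unsigned int forced_eois; /* how often the handler forced an EOI */
};

/* A new interrupt is sent to the guest: stop first, then re-arm. */
static void model_do_irq_guest(struct model_action *a)
{
    a->timer_active = false;  /* the old deadline must not fire */
    a->in_flight++;
    a->timer_active = true;   /* fresh deadline for this instance */
}

/* The guest signals EOI; the timer is stopped once nothing is pending. */
static void model_guest_eoi(struct model_action *a)
{
    if ( a->in_flight && --a->in_flight == 0 )
        a->timer_active = false;
}

/* The deadline passes; the handler still has to wait for the lock. */
static void model_timer_expires(struct model_action *a)
{
    a->timer_active = false;
}

/*
 * Handler body once the lock is held. Returns true if it forced an EOI.
 * If the timer was re-armed while the handler waited, skip everything
 * to avoid forcing an EOI early.
 */
static bool model_eoi_timer_fn(struct model_action *a)
{
    if ( a->timer_active )
        return false;

    a->in_flight = 0;
    a->forced_eois++;
    return true;
}
```

Calling model_timer_expires(), then model_guest_eoi() and model_do_irq_guest(), and only then model_eoi_timer_fn() reproduces the ordering in question: the handler finds the timer active again and leaves the freshly delivered interrupt alone.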



Re: [Xen-devel] [PATCH 1/4] x86/IRQ: don't keep EOI timer running without need
Posted by Andrew Cooper 4 years, 10 months ago
On 06/06/2019 09:08, Jan Beulich wrote:
>>>> On 05.06.19 at 19:04, <andrew.cooper3@citrix.com> wrote:
>> On 08/05/2019 13:46, Jan Beulich wrote:
>>> The timer needs to remain active only until all pending IRQ instances
>>> have seen EOIs from their respective domains. Stop it when the in-flight
>>> count has reached zero in desc_guest_eoi(). Note that this is race free
>>> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
>>> that point.
>>>
>>> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
>>> of stopping it immediately before re-setting, stop it as soon as we've
>>> made it past any early returns from the function (and hence we're sure
>>> it'll get set again).
>> Why is this a good thing?
> For it to not fire when it doesn't need to. If we're about to set
> a new timeout, we clearly don't want the previous one to have
> any effect anymore.

Sounds like an excellent addition to the code, now that there is an
order-of-returns dependency.

With a suitable comment, Reviewed-by: Andrew Cooper
<andrew.cooper3@citrix.com>

[Xen-devel] [PATCH 2/4] x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Posted by Jan Beulich 4 years, 11 months ago
There's no point entering the loop in the function in this case. Instead
there still being something in flight _after_ the loop would be an
actual problem: No timer would be running anymore for issuing the EOI
eventually, and hence this IRQ (and possibly lower priority ones) would
be blocked, perhaps indefinitely.

Issue a warning instead and prefer breaking some (presumably
misbehaving) guest over stalling perhaps the entire system.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void
 
     action = (irq_guest_action_t *)desc->action;
 
-    if ( timer_is_active(&action->eoi_timer) )
+    if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
         goto out;
 
     if ( action->ack_type != ACKTYPE_NONE )
@@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
         }
     }
 
-    if ( action->in_flight != 0 )
-        goto out;
+    if ( action->in_flight )
+        printk(XENLOG_G_WARNING
+               "IRQ%d: %d handlers still in flight at forced EOI\n",
+               desc->irq, action->in_flight);
 
     switch ( action->ack_type )
     {
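
The control flow that results from this patch can be sketched as a standalone model. Everything here is an assumption made for illustration: the model_ names are hypothetical, a per-guest boolean stands in for the masked state of each guest's pirq, and fprintf() replaces printk().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_MAX_GUESTS 4

enum model_outcome {
    MODEL_SKIPPED,         /* nothing in flight, or timer re-armed */
    MODEL_FORCED_EOI,      /* loop cleared everything in flight */
    MODEL_FORCED_EOI_WARN, /* count still non-zero: warn, EOI anyway */
};

struct model_action {
    unsigned int irq;
    unsigned int nr_guests;
    unsigned int in_flight;
    bool timer_active;
    bool masked[MODEL_MAX_GUESTS];
};

static enum model_outcome model_eoi_timer_fn(struct model_action *a)
{
    /* Bail before the loop when there is nothing to clean up. */
    if ( !a->in_flight || a->timer_active )
        return MODEL_SKIPPED;

    /* Unmask every guest that still has the interrupt pending. */
    for ( unsigned int i = 0; i < a->nr_guests; i++ )
        if ( a->masked[i] )
        {
            a->masked[i] = false;
            a->in_flight--;
        }

    /* Still in flight after the loop: warn, but do not skip the EOI. */
    if ( a->in_flight )
    {
        fprintf(stderr, "IRQ%u: %u handlers still in flight at forced EOI\n",
                a->irq, a->in_flight);
        return MODEL_FORCED_EOI_WARN;
    }

    return MODEL_FORCED_EOI;
}
```

The third outcome is the one the commit message is about: instead of returning without an EOI (and with no timer left to issue one later), the model logs the inconsistency and carries on.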





Re: [Xen-devel] [PATCH 2/4] x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Posted by Roger Pau Monné 4 years, 11 months ago
On Wed, May 08, 2019 at 06:46:51AM -0600, Jan Beulich wrote:
> There's no point entering the loop in the function in this case. Instead
> there still being something in flight _after_ the loop would be an
> actual problem: No timer would be running anymore for issuing the EOI
> eventually, and hence this IRQ (and possibly lower priority ones) would
> be blocked, perhaps indefinitely.
> 
> Issue a warning instead and prefer breaking some (presumably
> misbehaving) guest over stalling perhaps the entire system.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void
>  
>      action = (irq_guest_action_t *)desc->action;
>  
> -    if ( timer_is_active(&action->eoi_timer) )
> +    if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>          goto out;
>  
>      if ( action->ack_type != ACKTYPE_NONE )
> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
>          }
>      }
>  
> -    if ( action->in_flight != 0 )
> -        goto out;
> +    if ( action->in_flight )
> +        printk(XENLOG_G_WARNING
> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
> +               desc->irq, action->in_flight);

AFAICT action->in_flight should contain the number of guest pirqs
that have the pirq masked (pirq->masked == true), because in_flight is
only increased by __do_IRQ_guest when the pirq is not already masked.
At guest EOI (desc_guest_eoi) the in_flight count is also only
decreased if the pirq is unmasked.

Hence I think this condition could be turned into an ASSERT, but I'm
likely missing something.

Thanks, Roger.
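
The accounting argument above can be checked against a small model. This is a sketch under assumptions, not the Xen code: the model_ names are hypothetical and a boolean array stands in for each guest's masked flag. The invariant being illustrated is that in_flight always equals the number of masked entries.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_GUESTS 4

struct model_action {
    unsigned int nr_guests;
    unsigned int in_flight;
    bool masked[MODEL_MAX_GUESTS]; /* stands in for each pirq's masked flag */
};

/* Interrupt arrives: count only guests whose pirq was not yet masked. */
static void model_deliver(struct model_action *a)
{
    for ( unsigned int i = 0; i < a->nr_guests; i++ )
        if ( !a->masked[i] )
        {
            a->masked[i] = true;
            a->in_flight++;
        }
}

/* Guest i signals EOI: uncount only if its pirq was masked. */
static void model_guest_eoi(struct model_action *a, unsigned int i)
{
    if ( a->masked[i] )
    {
        a->masked[i] = false;
        a->in_flight--;
    }
}

/* Number of guests that currently have the pirq masked. */
static unsigned int model_count_masked(const struct model_action *a)
{
    unsigned int n = 0;

    for ( unsigned int i = 0; i < a->nr_guests; i++ )
        n += a->masked[i];

    return n;
}
```

If the real code keeps the same pairing of increments and decrements, a loop that unmasks every masked guest leaves the count at zero, which is why the post-loop condition looks like a candidate for an assertion.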

Re: [Xen-devel] [PATCH 2/4] x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Posted by Jan Beulich 4 years, 11 months ago
>>> On 16.05.19 at 13:37, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 06:46:51AM -0600, Jan Beulich wrote:
>> There's no point entering the loop in the function in this case. Instead
>> there still being something in flight _after_ the loop would be an
>> actual problem: No timer would be running anymore for issuing the EOI
>> eventually, and hence this IRQ (and possibly lower priority ones) would
>> be blocked, perhaps indefinitely.
>> 
>> Issue a warning instead and prefer breaking some (presumably
>> misbehaving) guest over stalling perhaps the entire system.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> 
>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void
>>  
>>      action = (irq_guest_action_t *)desc->action;
>>  
>> -    if ( timer_is_active(&action->eoi_timer) )
>> +    if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>>          goto out;
>>  
>>      if ( action->ack_type != ACKTYPE_NONE )
>> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
>>          }
>>      }
>>  
>> -    if ( action->in_flight != 0 )
>> -        goto out;
>> +    if ( action->in_flight )
>> +        printk(XENLOG_G_WARNING
>> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
>> +               desc->irq, action->in_flight);
> 
> AFAICT action->in_flight should contain the number of guest pirqs
> that have the pirq masked (pirq->masked == true), because in_flight is
> only increased by __do_IRQ_guest when the pirq is not already masked.
> At guest EOI (desc_guest_eoi) the in_flight count is also only
> decreased if the pirq is unmasked.
> 
> Hence I think this condition could be turned into an ASSERT, but I'm
> likely missing something.

I don't think you are. Going from if() straight to ASSERT() simply
seemed too harsh to me, all the more so in a subsystem where I could
easily have overlooked some corner case, due to how convoluted
some of the implementation is.

Jan



Re: [Xen-devel] [PATCH 2/4] x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Posted by Roger Pau Monné 4 years, 11 months ago
On Thu, May 16, 2019 at 06:02:15AM -0600, Jan Beulich wrote:
> >>> On 16.05.19 at 13:37, <roger.pau@citrix.com> wrote:
> > On Wed, May 08, 2019 at 06:46:51AM -0600, Jan Beulich wrote:
> >> There's no point entering the loop in the function in this case. Instead
> >> there still being something in flight _after_ the loop would be an
> >> actual problem: No timer would be running anymore for issuing the EOI
> >> eventually, and hence this IRQ (and possibly lower priority ones) would
> >> be blocked, perhaps indefinitely.
> >> 
> >> Issue a warning instead and prefer breaking some (presumably
> >> misbehaving) guest over stalling perhaps the entire system.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> 
> >> --- a/xen/arch/x86/irq.c
> >> +++ b/xen/arch/x86/irq.c
> >> @@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void
> >>  
> >>      action = (irq_guest_action_t *)desc->action;
> >>  
> >> -    if ( timer_is_active(&action->eoi_timer) )
> >> +    if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
> >>          goto out;
> >>  
> >>      if ( action->ack_type != ACKTYPE_NONE )
> >> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
> >>          }
> >>      }
> >>  
> >> -    if ( action->in_flight != 0 )
> >> -        goto out;
> >> +    if ( action->in_flight )
> >> +        printk(XENLOG_G_WARNING
> >> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
> >> +               desc->irq, action->in_flight);
> > 
> > AFAICT action->in_flight should contain the number of guest pirqs
> > that have the pirq masked (pirq->masked == true), because in_flight is
> > only increased by __do_IRQ_guest when the pirq is not already masked.
> > At guest EOI (desc_guest_eoi) the in_flight count is also only
> > decreased if the pirq is unmasked.
> > 
> > Hence I think this condition could be turned into an ASSERT, but I'm
> > likely missing something.
> 
> I don't think you are. Going from if() straight to ASSERT() simply
> seemed too harsh to me, all the more so in a subsystem where I could
> easily have overlooked some corner case, due to how convoluted
> some of the implementation is.

I agree it's quite convoluted. I think it would be helpful to add an
ASSERT_UNREACHABLE together with the warning message. With that:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.

Re: [Xen-devel] [PATCH 2/4] x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Posted by Andrew Cooper 4 years, 10 months ago
On 08/05/2019 13:46, Jan Beulich wrote:
> There's no point entering the loop in the function in this case. Instead
> there still being something in flight _after_ the loop would be an
> actual problem: No timer would be running anymore for issuing the EOI
> eventually, and hence this IRQ (and possibly lower priority ones) would
> be blocked, perhaps indefinitely.
>
> Issue a warning instead and prefer breaking some (presumably
> misbehaving) guest over stalling perhaps the entire system.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void
>  
>      action = (irq_guest_action_t *)desc->action;
>  
> -    if ( timer_is_active(&action->eoi_timer) )
> +    if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>          goto out;
>  
>      if ( action->ack_type != ACKTYPE_NONE )
> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
>          }
>      }
>  
> -    if ( action->in_flight != 0 )
> -        goto out;
> +    if ( action->in_flight )
> +        printk(XENLOG_G_WARNING
> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
> +               desc->irq, action->in_flight);

AFAICT, this condition can be triggered by a buggy/malicious guest, by
it simply ignoring or masking the line interrupt at the vIO-APIC.

The message would be far more useful if it identified the domain in
question, which looks like it can be obtained from the middle of the loop.

~Andrew

Re: [Xen-devel] [PATCH 2/4] x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Posted by Jan Beulich 4 years, 10 months ago
>>> On 05.06.19 at 19:15, <andrew.cooper3@citrix.com> wrote:
> On 08/05/2019 13:46, Jan Beulich wrote:
>> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
>>          }
>>      }
>>  
>> -    if ( action->in_flight != 0 )
>> -        goto out;
>> +    if ( action->in_flight )
>> +        printk(XENLOG_G_WARNING
>> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
>> +               desc->irq, action->in_flight);
> 
> AFAICT, this condition can be triggered by a buggy/malicious guest, by
> it simply ignoring or masking the line interrupt at the vIO-APIC.

I don't think it can, no. Or else the ASSERT_UNREACHABLE() below
here would be invalid to add.

> The message would be far more useful if it identified the domain in
> question, which looks like it can be obtained from the middle of the loop.

That very loop has just taken care of decrementing ->in_flight for
all such guests.

Also note that there could be more than one offending domain, for
shared IRQs. Plus the loop you're referring to can specifically _not_
be used for identifying the domain(s), because for the ones
processed there we _did_ decrement ->in_flight. If this message
gets logged, we simply have no idea why ->in_flight is _still_ non-
zero. This could be a BUG_ON(), but it seems more in line with our
general idea of how we would like to deal with such cases to try
and keep the system running here in release builds.

Jan
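
The trade-off described above (log and keep running in release builds, stop in debug builds) can be sketched as follows. This is a hypothetical stand-in, not Xen's macro: MODEL_DEBUG_BUILD and the model_ names are assumptions for illustration, and fprintf() replaces printk().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a debug-only "unreachable" assertion:
 * fatal when MODEL_DEBUG_BUILD is defined, a no-op otherwise.
 */
#ifdef MODEL_DEBUG_BUILD
#define MODEL_ASSERT_UNREACHABLE() assert(!"unreachable")
#else
#define MODEL_ASSERT_UNREACHABLE() ((void)0)
#endif

/*
 * Returns true if the inconsistency was seen. In a release-style
 * build the caller carries on and issues the EOI regardless.
 */
static bool model_check_in_flight(unsigned int irq, int in_flight,
                                  int nr_guests)
{
    if ( in_flight )
    {
        fprintf(stderr,
                "IRQ%u: %d/%d handler(s) still in flight at forced EOI\n",
                irq, in_flight, nr_guests);
        MODEL_ASSERT_UNREACHABLE();
        return true;
    }

    return false;
}
```

Built without MODEL_DEBUG_BUILD, the function only logs, which matches the stated preference for keeping the system running over a hard stop of the BUG_ON() kind.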



Re: [Xen-devel] [PATCH 2/4] x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Posted by Andrew Cooper 4 years, 10 months ago
On 06/06/2019 09:17, Jan Beulich wrote:
>>>> On 05.06.19 at 19:15, <andrew.cooper3@citrix.com> wrote:
>> On 08/05/2019 13:46, Jan Beulich wrote:
>>> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
>>>          }
>>>      }
>>>  
>>> -    if ( action->in_flight != 0 )
>>> -        goto out;
>>> +    if ( action->in_flight )
>>> +        printk(XENLOG_G_WARNING
>>> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
>>> +               desc->irq, action->in_flight);
>> AFAICT, this condition can be triggered by a buggy/malicious guest, by
>> it simply ignoring or masking the line interrupt at the vIO-APIC.
> I don't think it can, no. Or else the ASSERT_UNREACHABLE() below
> here would be invalid to add.

Which ASSERT_UNREACHABLE() ?  I know Roger asked for one, but I don't
see it anywhere in the code.

>
>> The message would be far more useful if it identified the domain in
>> question, which looks like it can be obtained from the middle of the loop.
> That very loop has just taken care of decrementing ->in_flight for
> all such guests.
>
> Also note that there could be more than one offending domain, for
> shared IRQs. Plus the loop you're referring to can specifically _not_
> be used for identifying the domain(s), because for the ones
> processed there we _did_ decrement ->in_flight. If this message
> gets logged, we simply have no idea why ->in_flight is _still_ non-
> zero. This could be a BUG_ON(), but it seems more in line with our
> general idea of how we would like to deal with such cases to try
> and keep the system running here in release builds.

Ok - let's go with this for now.  It is a net improvement, and we can
evaluate the guest-triggerability at a later point.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Re: [Xen-devel] [PATCH 2/4] x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Posted by Jan Beulich 4 years, 10 months ago
>>> On 06.06.19 at 13:34, <andrew.cooper3@citrix.com> wrote:
> On 06/06/2019 09:17, Jan Beulich wrote:
>>>>> On 05.06.19 at 19:15, <andrew.cooper3@citrix.com> wrote:
>>> On 08/05/2019 13:46, Jan Beulich wrote:
>>>> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
>>>>          }
>>>>      }
>>>>  
>>>> -    if ( action->in_flight != 0 )
>>>> -        goto out;
>>>> +    if ( action->in_flight )
>>>> +        printk(XENLOG_G_WARNING
>>>> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
>>>> +               desc->irq, action->in_flight);
>>> AFAICT, this condition can be triggered by a buggy/malicious guest, by
>>> it simply ignoring or masking the line interrupt at the vIO-APIC.
>> I don't think it can, no. Or else the ASSERT_UNREACHABLE() below
>> here would be invalid to add.
> 
> Which ASSERT_UNREACHABLE() ?  I know Roger asked for one, but I don't
> see it anywhere in the code.

Because so far there was no real reason to re-post. It's right here,
as Roger did ask for, and as I did (hesitantly) agree:

    if ( action->in_flight )
    {
        printk(XENLOG_G_WARNING
               "IRQ%u: %d/%d handler(s) still in flight at forced EOI\n",
               irq, action->in_flight, action->nr_guests);
        ASSERT_UNREACHABLE();
    }

>>> The message would be far more useful if it identified the domain in
>>> question, which looks like it can be obtained from the middle of the loop.
>> That very loop has just taken care of decrementing ->in_flight for
>> all such guests.
>>
>> Also note that there could be more than one offending domain, for
>> shared IRQs. Plus the loop you're referring to can specifically _not_
>> be used for identifying the domain(s), because for the ones
>> processed there we _did_ decrement ->in_flight. If this message
>> gets logged, we simply have no idea why ->in_flight is _still_ non-
>> zero. This could be a BUG_ON(), but it seems more in line with our
>> general idea of how we would like to deal with such cases to try
>> and keep the system running here in release builds.
> 
> Ok - let's go with this for now.  It is a net improvement, and we can
> evaluate the guest-triggerability at a later point.
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks much. I'll assume this holds also for the adjustments
requested by Roger.

Jan



Re: [Xen-devel] [PATCH 2/4] x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Posted by Andrew Cooper 4 years, 10 months ago
On 06/06/2019 12:43, Jan Beulich wrote:
>>>> On 06.06.19 at 13:34, <andrew.cooper3@citrix.com> wrote:
>> On 06/06/2019 09:17, Jan Beulich wrote:
>>>>>> On 05.06.19 at 19:15, <andrew.cooper3@citrix.com> wrote:
>>>> On 08/05/2019 13:46, Jan Beulich wrote:
>>>>> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
>>>>>          }
>>>>>      }
>>>>>  
>>>>> -    if ( action->in_flight != 0 )
>>>>> -        goto out;
>>>>> +    if ( action->in_flight )
>>>>> +        printk(XENLOG_G_WARNING
>>>>> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
>>>>> +               desc->irq, action->in_flight);
>>>> AFAICT, this condition can be triggered by a buggy/malicious guest, by
>>>> it simply ignoring or masking the line interrupt at the vIO-APIC.
>>> I don't think it can, no. Or else the ASSERT_UNREACHABLE() below
>>> here would be invalid to add.
>> Which ASSERT_UNREACHABLE() ?  I know Roger asked for one, but I don't
>> see it anywhere in the code.
> Because so far there was no real reason to re-post. It's right here,
> as Roger did ask for, and as I did (hesitantly) agree:
>
>     if ( action->in_flight )
>     {
>         printk(XENLOG_G_WARNING
>                "IRQ%u: %d/%d handler(s) still in flight at forced EOI\n",
>                irq, action->in_flight, action->nr_guests);
>         ASSERT_UNREACHABLE();
>     }
>
>>>> The message would be far more useful if it identified the domain in
>>>> question, which looks like it can be obtained from the middle of the loop.
>>> That very loop has just taken care of decrementing ->in_flight for
>>> all such guests.
>>>
>>> Also note that there could be more than one offending domain, for
>>> shared IRQs. Plus the loop you're referring to can specifically _not_
>>> be used for identifying the domain(s), because for the ones
>>> processed there we _did_ decrement ->in_flight. If this message
>>> gets logged, we simply have no idea why ->in_flight is _still_ non-
>>> zero. This could be a BUG_ON(), but it seems more in line with our
>>> general idea of how we would like to deal with such cases to try
>>> and keep the system running here in release builds.
>> Ok - let's go with this for now.  It is a net improvement, and we can
>> evaluate the guest-triggerability at a later point.
>>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Thanks much. I'll assume this holds also for the adjustments
> requested by Roger.

Fine.  At least that should make things obvious in a debug build.

~Andrew
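
The remark about debug builds refers to the usual behaviour of assertions of this kind: fatal when debugging is enabled, compiled out otherwise, so that a release build logs the warning and carries on. A minimal sketch of that pattern follows. It is not Xen's actual macro definition: `SKETCH_DEBUG`, `SKETCH_ASSERT_UNREACHABLE()` and `check_in_flight()` are hypothetical names, and `fprintf()` stands in for `printk()`.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical macro: fatal only when built with -DSKETCH_DEBUG. */
#ifdef SKETCH_DEBUG
# define SKETCH_ASSERT_UNREACHABLE() abort()
#else
# define SKETCH_ASSERT_UNREACHABLE() do { } while ( 0 )
#endif

/*
 * Returns the number of warnings emitted (0 or 1).  In a build without
 * SKETCH_DEBUG the function returns normally even when the supposedly
 * impossible state is observed; with SKETCH_DEBUG it aborts instead.
 */
int check_in_flight(unsigned int irq, unsigned int in_flight,
                    unsigned int nr_guests)
{
    if ( !in_flight )
        return 0;

    fprintf(stderr,
            "IRQ%u: %u/%u handler(s) still in flight at forced EOI\n",
            irq, in_flight, nr_guests);
    SKETCH_ASSERT_UNREACHABLE();

    return 1;
}
```

Built without `-DSKETCH_DEBUG`, a call such as `check_in_flight(9, 1, 2)` prints the warning and returns 1; built with it, the same call terminates the program at the assertion.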

[Xen-devel] [PATCH 3/4] x86/IRQ: relax locking in irq_guest_eoi_timer_fn()
Posted by Jan Beulich 4 years, 11 months ago
This is a timer handler, so it gets entered with IRQs enabled. Therefore
there's no need to save/restore the IRQ masking flag.

Additionally the final switch()'s ACKTYPE_EOI case re-acquires the lock
just for it to be dropped again right away. Do away with this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1106,9 +1106,8 @@ static void irq_guest_eoi_timer_fn(void
     unsigned int irq = desc - irq_desc;
     irq_guest_action_t *action;
     cpumask_t cpu_eoi_map;
-    unsigned long flags;
 
-    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock_irq(&desc->lock);
     
     if ( !(desc->status & IRQ_GUEST) )
         goto out;
@@ -1145,12 +1144,11 @@ static void irq_guest_eoi_timer_fn(void
         cpumask_copy(&cpu_eoi_map, action->cpu_eoi_map);
         spin_unlock_irq(&desc->lock);
         on_selected_cpus(&cpu_eoi_map, set_eoi_ready, desc, 0);
-        spin_lock_irq(&desc->lock);
-        break;
+        return;
     }
 
  out:
-    spin_unlock_irqrestore(&desc->lock, flags);
+    spin_unlock_irq(&desc->lock);
 }
 
 static void __do_IRQ_guest(int irq)
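
The reasoning behind dropping the save/restore can be illustrated with a toy model. This is a sketch under loud assumptions: a single `bool` stands in for the CPU's interrupt-enable flag, no actual locking is modelled, and all `model_*` names are hypothetical. It shows that the two lock flavours leave the flag in the same state when the caller enters with IRQs enabled (as the patch description says a timer handler does), and differ only when the caller enters with IRQs disabled.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's interrupt-enable flag. */
static bool irqs_enabled = true;

void model_set_irqs(bool on) { irqs_enabled = on; }
bool model_irqs_enabled(void) { return irqs_enabled; }

/* irqsave/irqrestore flavour: remembers and restores the caller's state. */
void model_lock_irqsave(bool *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
}

void model_unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;
}

/* Plain irq flavour: disables on lock, unconditionally enables on unlock. */
void model_lock_irq(void)
{
    irqs_enabled = false;
}

void model_unlock_irq(void)
{
    irqs_enabled = true;
}
```

Starting from `model_set_irqs(true)`, either lock/unlock pair ends with the flag set again. Starting from `model_set_irqs(false)`, only the irqsave pair preserves the disabled state, which is why the plain flavour is suitable only where the entry state is known.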





Re: [Xen-devel] [PATCH 3/4] x86/IRQ: relax locking in irq_guest_eoi_timer_fn()
Posted by Andrew Cooper 4 years, 10 months ago
On 08/05/2019 13:47, Jan Beulich wrote:
> This is a timer handler, so it gets entered with IRQs enabled. Therefore
> there's no need to save/restore the IRQ masking flag.
>
> Additionally the final switch()'s ACKTYPE_EOI case re-acquires the lock
> just for it to be dropped again right away. Do away with this.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Yes - that is rather silly.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Re: [Xen-devel] [PATCH 3/4] x86/IRQ: relax locking in irq_guest_eoi_timer_fn()
Posted by Roger Pau Monné 4 years, 11 months ago
On Wed, May 08, 2019 at 06:47:34AM -0600, Jan Beulich wrote:
> This is a timer handler, so it gets entered with IRQs enabled. Therefore
> there's no need to save/restore the IRQ masking flag.
> 
> Additionally the final switch()'s ACKTYPE_EOI case re-acquires the lock
> just for it to be dropped again right away. Do away with this.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

[Xen-devel] [PATCH 4/4] x86/IRQ: ACKTYPE_NONE cannot make it into irq_guest_eoi_timer_fn()
Posted by Jan Beulich 4 years, 11 months ago
action->ack_type is set once before the timer even gets initialized, and
is never changed later. The timer gets activated only for EOI and UNMASK
types. Hence there's no need to have a respective if() in there. Replace
it by an ASSERT().

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1103,7 +1103,7 @@ static void set_eoi_ready(void *data);
 static void irq_guest_eoi_timer_fn(void *data)
 {
     struct irq_desc *desc = data;
-    unsigned int irq = desc - irq_desc;
+    unsigned int i, irq = desc - irq_desc;
     irq_guest_action_t *action;
     cpumask_t cpu_eoi_map;
 
@@ -1114,19 +1114,18 @@ static void irq_guest_eoi_timer_fn(void
 
     action = (irq_guest_action_t *)desc->action;
 
+    ASSERT(action->ack_type != ACKTYPE_NONE);
+
     if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
         goto out;
 
-    if ( action->ack_type != ACKTYPE_NONE )
+    for ( i = 0; i < action->nr_guests; i++ )
     {
-        unsigned int i;
-        for ( i = 0; i < action->nr_guests; i++ )
-        {
-            struct domain *d = action->guest[i];
-            unsigned int pirq = domain_irq_to_pirq(d, irq);
-            if ( test_and_clear_bool(pirq_info(d, pirq)->masked) )
-                action->in_flight--;
-        }
+        struct domain *d = action->guest[i];
+        unsigned int pirq = domain_irq_to_pirq(d, irq);
+
+        if ( test_and_clear_bool(pirq_info(d, pirq)->masked) )
+            action->in_flight--;
     }
 
     if ( action->in_flight )
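
The invariant that lets the if() become an ASSERT() can be sketched in isolation. The code below is a simplified model, not the Xen implementation: the `sketch_*` names and the enum are hypothetical, and a `bool` replaces the real timer. The point is that when the only place arming the timer already excludes the "none" type, and the type never changes afterwards, the handler can never observe that type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the ACKTYPE_* values. */
enum sketch_ack_type { SKETCH_ACK_NONE, SKETCH_ACK_UNMASK, SKETCH_ACK_EOI };

struct sketch_action {
    enum sketch_ack_type ack_type;  /* set once, never changed afterwards */
    bool timer_armed;
    unsigned int handler_runs;
};

/* The sole place where the timer gets armed: never for SKETCH_ACK_NONE. */
void sketch_deliver(struct sketch_action *a)
{
    if ( a->ack_type == SKETCH_ACK_NONE )
        return;

    a->timer_armed = true;
}

/* Timer handler: the former run-time check is now just an assertion. */
void sketch_timer_fn(struct sketch_action *a)
{
    assert(a->ack_type != SKETCH_ACK_NONE);

    a->timer_armed = false;
    a->handler_runs++;
}

/* Models timer expiry: the handler fires only if the timer was armed. */
bool sketch_expire(struct sketch_action *a)
{
    if ( !a->timer_armed )
        return false;

    sketch_timer_fn(a);
    return true;
}
```

For an action of type `SKETCH_ACK_NONE`, `sketch_deliver()` followed by `sketch_expire()` never reaches the handler, so the assertion cannot trip on that path.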





Re: [Xen-devel] [PATCH 4/4] x86/IRQ: ACKTYPE_NONE cannot make it into irq_guest_eoi_timer_fn()
Posted by Roger Pau Monné 4 years, 11 months ago
On Wed, May 08, 2019 at 06:48:16AM -0600, Jan Beulich wrote:
> action->ack_type is set once before the timer even gets initialized, and
> is never changed later. The timer gets activated only for EOI and UNMASK
> types. Hence there's no need to have a respective if() in there. Replace
> it by an ASSERT().
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Just one comment below which I'm not overly fussed about.

> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1103,7 +1103,7 @@ static void set_eoi_ready(void *data);
>  static void irq_guest_eoi_timer_fn(void *data)
>  {
>      struct irq_desc *desc = data;
> -    unsigned int irq = desc - irq_desc;
> +    unsigned int i, irq = desc - irq_desc;
>      irq_guest_action_t *action;
>      cpumask_t cpu_eoi_map;
>  
> @@ -1114,19 +1114,18 @@ static void irq_guest_eoi_timer_fn(void
>  
>      action = (irq_guest_action_t *)desc->action;
>  
> +    ASSERT(action->ack_type != ACKTYPE_NONE);
> +
>      if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>          goto out;
>  
> -    if ( action->ack_type != ACKTYPE_NONE )
> +    for ( i = 0; i < action->nr_guests; i++ )
>      {
> -        unsigned int i;
> -        for ( i = 0; i < action->nr_guests; i++ )
> -        {
> -            struct domain *d = action->guest[i];

I think you could constify d here.

Thanks.

Re: [Xen-devel] [PATCH 4/4] x86/IRQ: ACKTYPE_NONE cannot make it into irq_guest_eoi_timer_fn()
Posted by Jan Beulich 4 years, 11 months ago
>>> On 16.05.19 at 15:52, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 06:48:16AM -0600, Jan Beulich wrote:
>> action->ack_type is set once before the timer even gets initialized, and
>> is never changed later. The timer gets activated only for EOI and UNMASK
>> types. Hence there's no need to have a respective if() in there. Replace
>> it by an ASSERT().
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> @@ -1114,19 +1114,18 @@ static void irq_guest_eoi_timer_fn(void
>>  
>>      action = (irq_guest_action_t *)desc->action;
>>  
>> +    ASSERT(action->ack_type != ACKTYPE_NONE);
>> +
>>      if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>>          goto out;
>>  
>> -    if ( action->ack_type != ACKTYPE_NONE )
>> +    for ( i = 0; i < action->nr_guests; i++ )
>>      {
>> -        unsigned int i;
>> -        for ( i = 0; i < action->nr_guests; i++ )
>> -        {
>> -            struct domain *d = action->guest[i];
> 
> I think you could constify d here.

Ah yes, this should work.

Jan


Re: [Xen-devel] [PATCH 4/4] x86/IRQ: ACKTYPE_NONE cannot make it into irq_guest_eoi_timer_fn()
Posted by Jan Beulich 4 years, 11 months ago
>>> On 16.05.19 at 15:52, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 06:48:16AM -0600, Jan Beulich wrote:
>> @@ -1114,19 +1114,18 @@ static void irq_guest_eoi_timer_fn(void
>>  
>>      action = (irq_guest_action_t *)desc->action;
>>  
>> +    ASSERT(action->ack_type != ACKTYPE_NONE);
>> +
>>      if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>>          goto out;
>>  
>> -    if ( action->ack_type != ACKTYPE_NONE )
>> +    for ( i = 0; i < action->nr_guests; i++ )
>>      {
>> -        unsigned int i;
>> -        for ( i = 0; i < action->nr_guests; i++ )
>> -        {
>> -            struct domain *d = action->guest[i];
> 
> I think you could constify d here.

Now that I've tried I recall that I did so already when originally
putting together the patch. It doesn't work, because
radix_tree_lookup() requires a non-const pointer.

Jan
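
Why a const-qualified domain pointer gets in the way can be shown with a small standalone example. Everything here is a hypothetical stand-in (`sketch_root`, `sketch_domain`, `sketch_lookup()`, `sketch_pirq_info()`), not Xen's real radix tree or `struct domain`; the only property carried over from the thread is that the lookup function takes a non-const pointer to the tree root.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not Xen's real data structures. */
struct sketch_root {
    void *slots[4];
};

struct sketch_domain {
    struct sketch_root pirq_tree;
};

/* Like the function named in the thread, this takes a non-const root. */
void *sketch_lookup(struct sketch_root *root, unsigned long index)
{
    return index < 4 ? root->slots[index] : NULL;
}

void *sketch_pirq_info(struct sketch_domain *d, unsigned long pirq)
{
    /*
     * Were 'd' declared 'const struct sketch_domain *', the expression
     * &d->pirq_tree would have type 'const struct sketch_root *'.
     * Passing that to sketch_lookup() would discard the qualifier, which
     * compilers diagnose and which fails a build that treats warnings
     * as errors.
     */
    return sketch_lookup(&d->pirq_tree, pirq);
}
```

Making the lookup itself accept a const root would be the other way out, but that is a change to the tree API rather than to the caller.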



Re: [Xen-devel] [PATCH 4/4] x86/IRQ: ACKTYPE_NONE cannot make it into irq_guest_eoi_timer_fn()
Posted by Andrew Cooper 4 years, 10 months ago
On 08/05/2019 13:48, Jan Beulich wrote:
> action->ack_type is set once before the timer even gets initialized, and
> is never changed later. The timer gets activated only for EOI and UNMASK
> types. Hence there's no need to have a respective if() in there. Replace
> it by an ASSERT().
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
