The first patch was sent on its own before; this is a plain resend. The
others have been added to address at least the majority of the questions
raised in
https://lists.xenproject.org/archives/html/xen-devel/2019-04/msg00883.html

1: don't keep EOI timer running without need
2: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
3: relax locking in irq_guest_eoi_timer_fn()
4: ACKTYPE_NONE cannot make it into irq_guest_eoi_timer_fn()

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
The timer needs to remain active only until all pending IRQ instances
have seen EOIs from their respective domains. Stop it when the in-flight
count has reached zero in desc_guest_eoi(). Note that this is race free
(with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
that point.

Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
of stopping it immediately before re-setting, stop it as soon as we've
made it past any early returns from the function (and hence we're sure
it'll get set again).

Finally bail from the actual timer handler in case we find the timer
already active again by the time we've managed to acquire the IRQ
descriptor lock. Without this we may forcibly EOI an IRQ immediately
after it got sent to a guest. For this, timer_is_active() gets split out
of active_timer(), deliberately moving just one of the two ASSERT()s (to
allow the function to be used also on a never initialized timer).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1115,6 +1115,9 @@ static void irq_guest_eoi_timer_fn(void
 
     action = (irq_guest_action_t *)desc->action;
 
+    if ( timer_is_active(&action->eoi_timer) )
+        goto out;
+
     if ( action->ack_type != ACKTYPE_NONE )
     {
         unsigned int i;
@@ -1167,6 +1170,9 @@ static void __do_IRQ_guest(int irq)
         return;
     }
 
+    if ( action->ack_type != ACKTYPE_NONE )
+        stop_timer(&action->eoi_timer);
+
     if ( action->ack_type == ACKTYPE_EOI )
     {
         sp = pending_eoi_sp(peoi);
@@ -1194,7 +1200,6 @@ static void __do_IRQ_guest(int irq)
 
     if ( action->ack_type != ACKTYPE_NONE )
     {
-        stop_timer(&action->eoi_timer);
         migrate_timer(&action->eoi_timer, smp_processor_id());
         set_timer(&action->eoi_timer, NOW() + MILLISECS(1));
     }
@@ -1457,6 +1462,8 @@ void desc_guest_eoi(struct irq_desc *des
         return;
     }
 
+    stop_timer(&action->eoi_timer);
+
     if ( action->ack_type == ACKTYPE_UNMASK )
     {
         ASSERT(cpumask_empty(action->cpu_eoi_map));
--- a/xen/common/timer.c
+++ b/xen/common/timer.c
@@ -282,11 +282,10 @@ static inline void timer_unlock(struct t
 })
 
 
-static bool_t active_timer(struct timer *timer)
+static bool active_timer(const struct timer *timer)
 {
     ASSERT(timer->status >= TIMER_STATUS_inactive);
-    ASSERT(timer->status <= TIMER_STATUS_in_list);
-    return (timer->status >= TIMER_STATUS_in_heap);
+    return timer_is_active(timer);
 }
 
 
--- a/xen/include/xen/timer.h
+++ b/xen/include/xen/timer.h
@@ -75,6 +75,19 @@ bool timer_expires_before(struct timer *
 
 #define timer_is_expired(t) timer_expires_before(t, NOW())
 
+/*
+ * True if a timer is active.
+ *
+ * Unlike for timer_expires_before(), it is the caller's responsibility to
+ * use suitable locking such that the returned value isn't stale by the time
+ * it gets acted upon.
+ */
+static inline bool timer_is_active(const struct timer *timer)
+{
+    ASSERT(timer->status <= TIMER_STATUS_in_list);
+    return timer->status >= TIMER_STATUS_in_heap;
+}
+
 /* Migrate a timer to a different CPU. The timer may be currently active. */
 void migrate_timer(struct timer *timer, unsigned int new_cpu);
 
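The split between timer_is_active() and active_timer() described in this patch can be modelled outside of Xen. What follows is a minimal sketch, not the hypervisor code: every model_* name is hypothetical, and the numeric values of the states are an assumption. Only their relative order (inactive below in_heap, in_list at the top) is taken from the ASSERT()s in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TIMER_STATUS_* constants; values assumed. */
enum model_status {
    MODEL_INVALID,   /* never initialized */
    MODEL_INACTIVE,
    MODEL_KILLED,
    MODEL_IN_HEAP,
    MODEL_IN_LIST
};

struct model_timer {
    enum model_status status;
};

/* Keeps only the upper-bound check, so a never initialized timer is fine. */
static bool model_timer_is_active(const struct model_timer *t)
{
    assert(t->status <= MODEL_IN_LIST);
    return t->status >= MODEL_IN_HEAP;
}

/* Keeps the lower-bound check and defers the rest to the helper above. */
static bool model_active_timer(const struct model_timer *t)
{
    assert(t->status >= MODEL_INACTIVE);
    return model_timer_is_active(t);
}

/*
 * Model of the new early exit in the timer handler: returns false when the
 * handler bails because the timer was re-armed, true when it would go on
 * to force the EOI.
 */
static bool model_eoi_timer_fn(const struct model_timer *t)
{
    if ( model_timer_is_active(t) )
        return false;

    return true;
}
```

In this model a timer in the "invalid" state simply reports inactive through model_timer_is_active(), whereas model_active_timer() would trip its assertion. That is the behavioural difference the commit message calls deliberate.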
On Wed, May 08, 2019 at 06:46:25AM -0600, Jan Beulich wrote: > The timer needs to remain active only until all pending IRQ instances > have seen EOIs from their respective domains. Stop it when the in-flight > count has reached zero in desc_guest_eoi(). Note that this is race free > (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at > that point. > > Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead > of stopping it immediately before re-setting, stop it as soon as we've > made it past any early returns from the function (and hence we're sure > it'll get set again). > > Finally bail from the actual timer handler in case we find the timer > already active again by the time we've managed to acquire the IRQ > descriptor lock. Without this we may forcibly EOI an IRQ immediately > after it got sent to a guest. For this, timer_is_active() gets split out > of active_timer(), deliberately moving just one of the two ASSERT()s (to > allow the function to be used also on a never initialized timer). AFAICT timer_is_active is exclusively used in irq_guest_eoi_timer_fn, which must have initialized the timer in order for irq_guest_eoi_timer_fn to be called, and hence I'm not sure why you need to be able to call timer_is_active with an uninitialized timer. Is this maybe used by other patches? Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
>>> On 16.05.19 at 12:32, <roger.pau@citrix.com> wrote: > On Wed, May 08, 2019 at 06:46:25AM -0600, Jan Beulich wrote: >> The timer needs to remain active only until all pending IRQ instances >> have seen EOIs from their respective domains. Stop it when the in-flight >> count has reached zero in desc_guest_eoi(). Note that this is race free >> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at >> that point. >> >> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead >> of stopping it immediately before re-setting, stop it as soon as we've >> made it past any early returns from the function (and hence we're sure >> it'll get set again). >> >> Finally bail from the actual timer handler in case we find the timer >> already active again by the time we've managed to acquire the IRQ >> descriptor lock. Without this we may forcibly EOI an IRQ immediately >> after it got sent to a guest. For this, timer_is_active() gets split out >> of active_timer(), deliberately moving just one of the two ASSERT()s (to >> allow the function to be used also on a never initialized timer). > > AFAICT timer_is_active is exclusively used in irq_guest_eoi_timer_fn, > which must have initialized the timer in order for > irq_guest_eoi_timer_fn to be called, and hence I'm not sure why you > need to be able to call timer_is_active with an uninitialized timer. It's not needed here, but I consider this useful behavior when used outside of the specific timer's handler. > Is this maybe used by other patches? None that I would have in the works. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On Thu, May 16, 2019 at 04:50:22AM -0600, Jan Beulich wrote: > >>> On 16.05.19 at 12:32, <roger.pau@citrix.com> wrote: > > On Wed, May 08, 2019 at 06:46:25AM -0600, Jan Beulich wrote: > >> The timer needs to remain active only until all pending IRQ instances > >> have seen EOIs from their respective domains. Stop it when the in-flight > >> count has reached zero in desc_guest_eoi(). Note that this is race free > >> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at > >> that point. > >> > >> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead > >> of stopping it immediately before re-setting, stop it as soon as we've > >> made it past any early returns from the function (and hence we're sure > >> it'll get set again). > >> > >> Finally bail from the actual timer handler in case we find the timer > >> already active again by the time we've managed to acquire the IRQ > >> descriptor lock. Without this we may forcibly EOI an IRQ immediately > >> after it got sent to a guest. For this, timer_is_active() gets split out > >> of active_timer(), deliberately moving just one of the two ASSERT()s (to > >> allow the function to be used also on a never initialized timer). > > > > AFAICT timer_is_active is exclusively used in irq_guest_eoi_timer_fn, > > which must have initialized the timer in order for > > irq_guest_eoi_timer_fn to be called, and hence I'm not sure why you > > need to be able to call timer_is_active with an uninitialized timer. > > It's not needed here, but I consider this useful behavior when used > outside of the specific timer's handler. > > > Is this maybe used by other patches? > > None that I would have in the works. Then IMO I would rather make timer_is_active a replacement for active_timer (or just move active_timer to the header) if there's no user that can call timer_is_active with an uninitialized timer. 
I.e. I would keep the asserts as restrictive as possible unless there's a
user that requires less restrictive assertions.

Anyway, the change is an improvement, so with or without that changed:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.
On 08/05/2019 13:46, Jan Beulich wrote:
> The timer needs to remain active only until all pending IRQ instances
> have seen EOIs from their respective domains. Stop it when the in-flight
> count has reached zero in desc_guest_eoi(). Note that this is race free
> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
> that point.
>
> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
> of stopping it immediately before re-setting, stop it as soon as we've
> made it past any early returns from the function (and hence we're sure
> it'll get set again).

Why is this a good thing?

>
> Finally bail from the actual timer handler in case we find the timer
> already active again by the time we've managed to acquire the IRQ
> descriptor lock. Without this we may forcibly EOI an IRQ immediately
> after it got sent to a guest. For this, timer_is_active() gets split out
> of active_timer(), deliberately moving just one of the two ASSERT()s (to
> allow the function to be used also on a never initialized timer).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1115,6 +1115,9 @@ static void irq_guest_eoi_timer_fn(void
>  
>      action = (irq_guest_action_t *)desc->action;
>  

/* Another instance of this timer already running?  Skip everything to
avoid forcing an EOI early. */

~Andrew

> +    if ( timer_is_active(&action->eoi_timer) )
> +        goto out;
> +
>      if ( action->ack_type != ACKTYPE_NONE )
>      {
>          unsigned int i;
>
>>> On 05.06.19 at 19:04, <andrew.cooper3@citrix.com> wrote: > On 08/05/2019 13:46, Jan Beulich wrote: >> The timer needs to remain active only until all pending IRQ instances >> have seen EOIs from their respective domains. Stop it when the in-flight >> count has reached zero in desc_guest_eoi(). Note that this is race free >> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at >> that point. >> >> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead >> of stopping it immediately before re-setting, stop it as soon as we've >> made it past any early returns from the function (and hence we're sure >> it'll get set again). > > Why this this a good thing? For it to not fire when it doesn't need to. If we're about to set a new timeout, we clearly don't want the previous one to have any effect anymore. >> --- a/xen/arch/x86/irq.c >> +++ b/xen/arch/x86/irq.c >> @@ -1115,6 +1115,9 @@ static void irq_guest_eoi_timer_fn(void >> >> action = (irq_guest_action_t *)desc->action; >> > > /* Another instance of this timer already running? Skip everything to > avoid forcing an EOI early. */ Fine with me, added. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 06/06/2019 09:08, Jan Beulich wrote:
>>>> On 05.06.19 at 19:04, <andrew.cooper3@citrix.com> wrote:
>> On 08/05/2019 13:46, Jan Beulich wrote:
>>> The timer needs to remain active only until all pending IRQ instances
>>> have seen EOIs from their respective domains. Stop it when the in-flight
>>> count has reached zero in desc_guest_eoi(). Note that this is race free
>>> (with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
>>> that point.
>>>
>>> Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
>>> of stopping it immediately before re-setting, stop it as soon as we've
>>> made it past any early returns from the function (and hence we're sure
>>> it'll get set again).
>> Why is this a good thing?
> For it to not fire when it doesn't need to. If we're about to set
> a new timeout, we clearly don't want the previous one to have
> any effect anymore.

Sounds like an excellent addition to the code, now that there is an
order-of-returns dependency.

With a suitable comment, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
There's no point entering the loop in the function in this case. Instead
there still being something in flight _after_ the loop would be an
actual problem: No timer would be running anymore for issuing the EOI
eventually, and hence this IRQ (and possibly lower priority ones) would
be blocked, perhaps indefinitely.

Issue a warning instead and prefer breaking some (presumably
misbehaving) guest over stalling perhaps the entire system.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void
 
     action = (irq_guest_action_t *)desc->action;
 
-    if ( timer_is_active(&action->eoi_timer) )
+    if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
         goto out;
 
     if ( action->ack_type != ACKTYPE_NONE )
@@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
         }
     }
 
-    if ( action->in_flight != 0 )
-        goto out;
+    if ( action->in_flight )
+        printk(XENLOG_G_WARNING
+               "IRQ%d: %d handlers still in flight at forced EOI\n",
+               desc->irq, action->in_flight);
 
     switch ( action->ack_type )
     {
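The control flow this patch gives the timer handler can be sketched as a small stand-alone model. This is an illustration under assumptions, not the Xen function: the model_* types, the fixed guest limit and the result enum are all hypothetical, and the real handler goes on to perform the EOI where this sketch merely returns a code.

```c
#include <stdbool.h>
#include <stdio.h>

#define MODEL_MAX_GUESTS 7 /* hypothetical bound, for the sketch only */

struct model_action {
    unsigned int in_flight;        /* instances awaiting a guest EOI */
    unsigned int nr_guests;
    bool masked[MODEL_MAX_GUESTS]; /* per-guest "pirq masked" flag */
    bool timer_active;             /* timer got re-armed meanwhile */
};

enum model_result {
    MODEL_BAILED,                  /* nothing to do, or timer re-armed */
    MODEL_FORCED_EOI,              /* loop brought in_flight to zero */
    MODEL_FORCED_EOI_WITH_WARNING  /* in_flight still non-zero afterwards */
};

static enum model_result model_eoi_timer_fn(struct model_action *a,
                                            unsigned int irq)
{
    unsigned int i;

    /* New early exit: skip the loop when nothing is in flight. */
    if ( !a->in_flight || a->timer_active )
        return MODEL_BAILED;

    /* Clear every masked guest instance and account for it. */
    for ( i = 0; i < a->nr_guests; i++ )
        if ( a->masked[i] )
        {
            a->masked[i] = false;
            a->in_flight--;
        }

    /* Previously a silent bail-out; now warn and force the EOI anyway. */
    if ( a->in_flight )
    {
        fprintf(stderr, "IRQ%u: %u handlers still in flight at forced EOI\n",
                irq, a->in_flight);
        return MODEL_FORCED_EOI_WITH_WARNING;
    }

    return MODEL_FORCED_EOI;
}
```

The third outcome is the one the commit message is concerned with: no timer would be left to issue the EOI later, so the model proceeds (with a warning) instead of returning early.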
On Wed, May 08, 2019 at 06:46:51AM -0600, Jan Beulich wrote: > There's no point entering the loop in the function in this case. Instead > there still being something in flight _after_ the loop would be an > actual problem: No timer would be running anymore for issuing the EOI > eventually, and hence this IRQ (and possibly lower priority ones) would > be blocked, perhaps indefinitely. > > Issue a warning instead and prefer breaking some (presumably > misbehaving) guest over stalling perhaps the entire system. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- a/xen/arch/x86/irq.c > +++ b/xen/arch/x86/irq.c > @@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void > > action = (irq_guest_action_t *)desc->action; > > - if ( timer_is_active(&action->eoi_timer) ) > + if ( !action->in_flight || timer_is_active(&action->eoi_timer) ) > goto out; > > if ( action->ack_type != ACKTYPE_NONE ) > @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void > } > } > > - if ( action->in_flight != 0 ) > - goto out; > + if ( action->in_flight ) > + printk(XENLOG_G_WARNING > + "IRQ%d: %d handlers still in flight at forced EOI\n", > + desc->irq, action->in_flight); AFAICT action->in_flight should contain the number of guests pirqs that have the pirq masked (pirq->masked == true), because in_flight is only increased by __do_IRQ_guest when the pirq is not already masked. At guest EOI (desc_guest_eoi) the in_flight count is also only decreased if the pirq is unmasked. Hence I think this condition could be turned into an ASSERT, but I'm likely missing something. Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
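The invariant argued for in the review above (in_flight equals the number of guest pirqs currently masked) can be checked on a toy model. This is a sketch under the reviewer's stated reading of the code, with hypothetical model_* names. It is not a claim about every path in the real implementation.

```c
#include <stdbool.h>

struct model_pirq {
    bool masked;
};

struct model_action {
    unsigned int in_flight;
};

/* Delivery path: count an instance only if the pirq was not yet masked. */
static void model_deliver(struct model_action *a, struct model_pirq *p)
{
    if ( !p->masked )
    {
        p->masked = true;
        a->in_flight++;
    }
}

/* Guest EOI path: uncount only if this EOI actually clears the mask. */
static void model_guest_eoi(struct model_action *a, struct model_pirq *p)
{
    if ( p->masked )
    {
        p->masked = false;
        a->in_flight--;
    }
}

/* Helper used to compare against in_flight. */
static unsigned int model_count_masked(const struct model_pirq *p,
                                       unsigned int n)
{
    unsigned int i, count = 0;

    for ( i = 0; i < n; i++ )
        if ( p[i].masked )
            count++;

    return count;
}
```

As long as these two paths are the only ones that touch the counter, in_flight and the masked count stay equal, which is why a non-zero value after the handler's loop would indicate a bug rather than guest behaviour.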
>>> On 16.05.19 at 13:37, <roger.pau@citrix.com> wrote: > On Wed, May 08, 2019 at 06:46:51AM -0600, Jan Beulich wrote: >> There's no point entering the loop in the function in this case. Instead >> there still being something in flight _after_ the loop would be an >> actual problem: No timer would be running anymore for issuing the EOI >> eventually, and hence this IRQ (and possibly lower priority ones) would >> be blocked, perhaps indefinitely. >> >> Issue a warning instead and prefer breaking some (presumably >> misbehaving) guest over stalling perhaps the entire system. >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> >> --- a/xen/arch/x86/irq.c >> +++ b/xen/arch/x86/irq.c >> @@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void >> >> action = (irq_guest_action_t *)desc->action; >> >> - if ( timer_is_active(&action->eoi_timer) ) >> + if ( !action->in_flight || timer_is_active(&action->eoi_timer) ) >> goto out; >> >> if ( action->ack_type != ACKTYPE_NONE ) >> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void >> } >> } >> >> - if ( action->in_flight != 0 ) >> - goto out; >> + if ( action->in_flight ) >> + printk(XENLOG_G_WARNING >> + "IRQ%d: %d handlers still in flight at forced EOI\n", >> + desc->irq, action->in_flight); > > AFAICT action->in_flight should contain the number of guests pirqs > that have the pirq masked (pirq->masked == true), because in_flight is > only increased by __do_IRQ_guest when the pirq is not already masked. > At guest EOI (desc_guest_eoi) the in_flight count is also only > decreased if the pirq is unmasked. > > Hence I think this condition could be turned into an ASSERT, but I'm > likely missing something. I don't think you are. Going from if() straight to ASSERT() simply seemed too harsh to me, the more in a subsystem where I could easily have overlooked some corner case, due to how convoluted some of the implementation is. 
Jan
On Thu, May 16, 2019 at 06:02:15AM -0600, Jan Beulich wrote: > >>> On 16.05.19 at 13:37, <roger.pau@citrix.com> wrote: > > On Wed, May 08, 2019 at 06:46:51AM -0600, Jan Beulich wrote: > >> There's no point entering the loop in the function in this case. Instead > >> there still being something in flight _after_ the loop would be an > >> actual problem: No timer would be running anymore for issuing the EOI > >> eventually, and hence this IRQ (and possibly lower priority ones) would > >> be blocked, perhaps indefinitely. > >> > >> Issue a warning instead and prefer breaking some (presumably > >> misbehaving) guest over stalling perhaps the entire system. > >> > >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > >> > >> --- a/xen/arch/x86/irq.c > >> +++ b/xen/arch/x86/irq.c > >> @@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void > >> > >> action = (irq_guest_action_t *)desc->action; > >> > >> - if ( timer_is_active(&action->eoi_timer) ) > >> + if ( !action->in_flight || timer_is_active(&action->eoi_timer) ) > >> goto out; > >> > >> if ( action->ack_type != ACKTYPE_NONE ) > >> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void > >> } > >> } > >> > >> - if ( action->in_flight != 0 ) > >> - goto out; > >> + if ( action->in_flight ) > >> + printk(XENLOG_G_WARNING > >> + "IRQ%d: %d handlers still in flight at forced EOI\n", > >> + desc->irq, action->in_flight); > > > > AFAICT action->in_flight should contain the number of guests pirqs > > that have the pirq masked (pirq->masked == true), because in_flight is > > only increased by __do_IRQ_guest when the pirq is not already masked. > > At guest EOI (desc_guest_eoi) the in_flight count is also only > > decreased if the pirq is unmasked. > > > > Hence I think this condition could be turned into an ASSERT, but I'm > > likely missing something. > > I don't think you are. 
Going from if() straight to ASSERT() simply
> seemed too harsh to me, the more in a subsystem where I could
> easily have overlooked some corner case, due to how convoluted
> some of the implementation is.

I agree it's quite convoluted. I think it would be helpful to add an
ASSERT_UNREACHABLE together with the warning message.

With that:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
On 08/05/2019 13:46, Jan Beulich wrote:
> There's no point entering the loop in the function in this case. Instead
> there still being something in flight _after_ the loop would be an
> actual problem: No timer would be running anymore for issuing the EOI
> eventually, and hence this IRQ (and possibly lower priority ones) would
> be blocked, perhaps indefinitely.
>
> Issue a warning instead and prefer breaking some (presumably
> misbehaving) guest over stalling perhaps the entire system.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1115,7 +1115,7 @@ static void irq_guest_eoi_timer_fn(void
>  
>      action = (irq_guest_action_t *)desc->action;
>  
> -    if ( timer_is_active(&action->eoi_timer) )
> +    if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>          goto out;
>  
>      if ( action->ack_type != ACKTYPE_NONE )
> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
>          }
>      }
>  
> -    if ( action->in_flight != 0 )
> -        goto out;
> +    if ( action->in_flight )
> +        printk(XENLOG_G_WARNING
> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
> +               desc->irq, action->in_flight);

AFAICT, this condition can be triggered by a buggy/malicious guest, by
it simply ignoring or masking the line interrupt at the vIO-APIC.

The message would be far more useful if it identified the domain in
question, which looks like it can be obtained from the middle of the loop.

~Andrew
>>> On 05.06.19 at 19:15, <andrew.cooper3@citrix.com> wrote: > On 08/05/2019 13:46, Jan Beulich wrote: >> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void >> } >> } >> >> - if ( action->in_flight != 0 ) >> - goto out; >> + if ( action->in_flight ) >> + printk(XENLOG_G_WARNING >> + "IRQ%d: %d handlers still in flight at forced EOI\n", >> + desc->irq, action->in_flight); > > AFACIT, this condition can be triggered by a buggy/malicious guest, by > it simply ignoring or masking the line interrupt at the vIO-APIC. I don't think it can, no. Or else the ASSERT_UNREACHABLE() below here would be invalid to add. > The message would be far more useful if it identified the domain in > question, which looks like it can be obtained from the middle of the loop. That very loop has just taken care of decrementing ->in_flight for all such guests. Also note that there could be more than one offending domain, for shared IRQs. Plus the loop you're referring to can specifically _not_ be used for identifying the domain(s), because for the ones processed there we _did_ decrement ->in_flight. If this message gets logged, we simply have no idea why ->in_flight is _still_ non- zero. This could be a BUG_ON(), but it seems more in line with our general idea of how we would like to deal with such cases to try and keep the system running here in release builds. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 06/06/2019 09:17, Jan Beulich wrote:
>>>> On 05.06.19 at 19:15, <andrew.cooper3@citrix.com> wrote:
>> On 08/05/2019 13:46, Jan Beulich wrote:
>>> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void
>>>          }
>>>      }
>>>  
>>> -    if ( action->in_flight != 0 )
>>> -        goto out;
>>> +    if ( action->in_flight )
>>> +        printk(XENLOG_G_WARNING
>>> +               "IRQ%d: %d handlers still in flight at forced EOI\n",
>>> +               desc->irq, action->in_flight);
>> AFAICT, this condition can be triggered by a buggy/malicious guest, by
>> it simply ignoring or masking the line interrupt at the vIO-APIC.
> I don't think it can, no. Or else the ASSERT_UNREACHABLE() below
> here would be invalid to add.

Which ASSERT_UNREACHABLE()?  I know Roger asked for one, but I don't
see it anywhere in the code.

>
>> The message would be far more useful if it identified the domain in
>> question, which looks like it can be obtained from the middle of the loop.
> That very loop has just taken care of decrementing ->in_flight for
> all such guests.
>
> Also note that there could be more than one offending domain, for
> shared IRQs. Plus the loop you're referring to can specifically _not_
> be used for identifying the domain(s), because for the ones
> processed there we _did_ decrement ->in_flight. If this message
> gets logged, we simply have no idea why ->in_flight is _still_
> non-zero. This could be a BUG_ON(), but it seems more in line with our
> general idea of how we would like to deal with such cases to try
> and keep the system running here in release builds.

Ok - let's go with this for now. It is a net improvement, and we can
evaluate the guest-triggerability at a later point.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> On 06.06.19 at 13:34, <andrew.cooper3@citrix.com> wrote: > On 06/06/2019 09:17, Jan Beulich wrote: >>>>> On 05.06.19 at 19:15, <andrew.cooper3@citrix.com> wrote: >>> On 08/05/2019 13:46, Jan Beulich wrote: >>>> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void >>>> } >>>> } >>>> >>>> - if ( action->in_flight != 0 ) >>>> - goto out; >>>> + if ( action->in_flight ) >>>> + printk(XENLOG_G_WARNING >>>> + "IRQ%d: %d handlers still in flight at forced EOI\n", >>>> + desc->irq, action->in_flight); >>> AFACIT, this condition can be triggered by a buggy/malicious guest, by >>> it simply ignoring or masking the line interrupt at the vIO-APIC. >> I don't think it can, no. Or else the ASSERT_UNREACHABLE() below >> here would be invalid to add. > > Which ASSERT_UNREACHABLE() ? I know Roger asked for one, but I don't > see it anywhere in the code. Because so far there was no real reason to re-post. It's right here, as Roger did ask for, and as I did (hesitantly) agree: if ( action->in_flight ) { printk(XENLOG_G_WARNING "IRQ%u: %d/%d handler(s) still in flight at forced EOI\n", irq, action->in_flight, action->nr_guests); ASSERT_UNREACHABLE(); } >>> The message would be far more useful if it identified the domain in >>> question, which looks like it can be obtained from the middle of the loop. >> That very loop has just taken care of decrementing ->in_flight for >> all such guests. >> >> Also note that there could be more than one offending domain, for >> shared IRQs. Plus the loop you're referring to can specifically _not_ >> be used for identifying the domain(s), because for the ones >> processed there we _did_ decrement ->in_flight. If this message >> gets logged, we simply have no idea why ->in_flight is _still_ non- >> zero. This could be a BUG_ON(), but it seems more in line with our >> general idea of how we would like to deal with such cases to try >> and keep the system running here in release builds. > > Ok - lets go with this for now. 
It is a net improvement, and we can
> evaluate the guest-triggerability at a later point.
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks much. I'll assume this holds also for the adjustments
requested by Roger.

Jan
On 06/06/2019 12:43, Jan Beulich wrote: >>>> On 06.06.19 at 13:34, <andrew.cooper3@citrix.com> wrote: >> On 06/06/2019 09:17, Jan Beulich wrote: >>>>>> On 05.06.19 at 19:15, <andrew.cooper3@citrix.com> wrote: >>>> On 08/05/2019 13:46, Jan Beulich wrote: >>>>> @@ -1130,8 +1130,10 @@ static void irq_guest_eoi_timer_fn(void >>>>> } >>>>> } >>>>> >>>>> - if ( action->in_flight != 0 ) >>>>> - goto out; >>>>> + if ( action->in_flight ) >>>>> + printk(XENLOG_G_WARNING >>>>> + "IRQ%d: %d handlers still in flight at forced EOI\n", >>>>> + desc->irq, action->in_flight); >>>> AFACIT, this condition can be triggered by a buggy/malicious guest, by >>>> it simply ignoring or masking the line interrupt at the vIO-APIC. >>> I don't think it can, no. Or else the ASSERT_UNREACHABLE() below >>> here would be invalid to add. >> Which ASSERT_UNREACHABLE() ? I know Roger asked for one, but I don't >> see it anywhere in the code. > Because so far there was no real reason to re-post. It's right here, > as Roger did ask for, and as I did (hesitantly) agree: > > if ( action->in_flight ) > { > printk(XENLOG_G_WARNING > "IRQ%u: %d/%d handler(s) still in flight at forced EOI\n", > irq, action->in_flight, action->nr_guests); > ASSERT_UNREACHABLE(); > } > >>>> The message would be far more useful if it identified the domain in >>>> question, which looks like it can be obtained from the middle of the loop. >>> That very loop has just taken care of decrementing ->in_flight for >>> all such guests. >>> >>> Also note that there could be more than one offending domain, for >>> shared IRQs. Plus the loop you're referring to can specifically _not_ >>> be used for identifying the domain(s), because for the ones >>> processed there we _did_ decrement ->in_flight. If this message >>> gets logged, we simply have no idea why ->in_flight is _still_ non- >>> zero. 
This could be a BUG_ON(), but it seems more in line with our
>>> general idea of how we would like to deal with such cases to try
>>> and keep the system running here in release builds.
>> Ok - let's go with this for now. It is a net improvement, and we can
>> evaluate the guest-triggerability at a later point.
>>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Thanks much. I'll assume this holds also for the adjustments
> requested by Roger.

Fine. At least that should make things obvious in a debug build.

~Andrew
This is a timer handler, so it gets entered with IRQs enabled. Therefore
there's no need to save/restore the IRQ masking flag.

Additionally the final switch()'s ACKTYPE_EOI case re-acquires the lock
just for it to be dropped again right away. Do away with this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1106,9 +1106,8 @@ static void irq_guest_eoi_timer_fn(void
     unsigned int irq = desc - irq_desc;
     irq_guest_action_t *action;
     cpumask_t cpu_eoi_map;
-    unsigned long flags;
 
-    spin_lock_irqsave(&desc->lock, flags);
+    spin_lock_irq(&desc->lock);
 
     if ( !(desc->status & IRQ_GUEST) )
         goto out;
@@ -1145,12 +1144,11 @@ static void irq_guest_eoi_timer_fn(void
         cpumask_copy(&cpu_eoi_map, action->cpu_eoi_map);
         spin_unlock_irq(&desc->lock);
         on_selected_cpus(&cpu_eoi_map, set_eoi_ready, desc, 0);
-        spin_lock_irq(&desc->lock);
-        break;
+        return;
     }
 
  out:
-    spin_unlock_irqrestore(&desc->lock, flags);
+    spin_unlock_irq(&desc->lock);
 }
 
 static void __do_IRQ_guest(int irq)
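Why the save/restore variant is redundant here can be shown with a toy model of the interrupt-enable flag. This is a sketch only: the model_* functions are hypothetical, take no lock at all, and stand in for the IRQ side effects of the spin_lock_irqsave()/spin_lock_irq() families.

```c
#include <stdbool.h>

/* Toy stand-in for the CPU's interrupt-enable flag. */
static bool model_irqs_enabled = true;

/* irqsave flavour: remember the previous state, then disable. */
static unsigned long model_lock_irqsave(void)
{
    unsigned long flags = model_irqs_enabled;

    model_irqs_enabled = false;
    return flags;
}

/* irqrestore flavour: put back whatever state was saved. */
static void model_unlock_irqrestore(unsigned long flags)
{
    model_irqs_enabled = (flags != 0);
}

/* Plain irq flavour: disable on lock, unconditionally enable on unlock. */
static void model_lock_irq(void)
{
    model_irqs_enabled = false;
}

static void model_unlock_irq(void)
{
    model_irqs_enabled = true;
}

/* Each returns the flag state seen after a lock/unlock pair. */
static bool model_handler_irqsave(void)
{
    unsigned long flags = model_lock_irqsave();

    model_unlock_irqrestore(flags);
    return model_irqs_enabled;
}

static bool model_handler_irq(void)
{
    model_lock_irq();
    model_unlock_irq();
    return model_irqs_enabled;
}
```

When the function is entered with interrupts enabled, as the commit message states for timer handlers, both variants leave the flag enabled and are interchangeable. They differ only for a caller that enters with interrupts disabled, where the plain variant would wrongly re-enable them.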
On 08/05/2019 13:47, Jan Beulich wrote:
> This is a timer handler, so it gets entered with IRQs enabled. Therefore
> there's no need to save/restore the IRQ masking flag.
>
> Additionally the final switch()'es ACKTYPE_EOI case re-acquires the lock
> just for it to be dropped again right away. Do away with this.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Yes - that is rather silly.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

On Wed, May 08, 2019 at 06:47:34AM -0600, Jan Beulich wrote:
> This is a timer handler, so it gets entered with IRQs enabled. Therefore
> there's no need to save/restore the IRQ masking flag.
>
> Additionally the final switch()'es ACKTYPE_EOI case re-acquires the lock
> just for it to be dropped again right away. Do away with this.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

action->ack_type is set once before the timer even gets initialized, and
is never changed later. The timer gets activated only for EOI and UNMASK
types. Hence there's no need to have a respective if() in there. Replace
it by an ASSERT().

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1103,7 +1103,7 @@ static void set_eoi_ready(void *data);
 static void irq_guest_eoi_timer_fn(void *data)
 {
     struct irq_desc *desc = data;
-    unsigned int irq = desc - irq_desc;
+    unsigned int i, irq = desc - irq_desc;
     irq_guest_action_t *action;
     cpumask_t cpu_eoi_map;
 
@@ -1114,19 +1114,18 @@ static void irq_guest_eoi_timer_fn(void
 
     action = (irq_guest_action_t *)desc->action;
 
+    ASSERT(action->ack_type != ACKTYPE_NONE);
+
     if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
         goto out;
 
-    if ( action->ack_type != ACKTYPE_NONE )
+    for ( i = 0; i < action->nr_guests; i++ )
     {
-        unsigned int i;
-        for ( i = 0; i < action->nr_guests; i++ )
-        {
-            struct domain *d = action->guest[i];
-            unsigned int pirq = domain_irq_to_pirq(d, irq);
-            if ( test_and_clear_bool(pirq_info(d, pirq)->masked) )
-                action->in_flight--;
-        }
+        struct domain *d = action->guest[i];
+        unsigned int pirq = domain_irq_to_pirq(d, irq);
+
+        if ( test_and_clear_bool(pirq_info(d, pirq)->masked) )
+            action->in_flight--;
     }
 
     if ( action->in_flight )

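The invariant behind swapping the if() for an ASSERT() can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the hypervisor code: the `toy_*` names, the fixed guest array and the boolean `timer_armed` flag are all hypothetical. The ack type is written once at setup, the delivery path is the only place that arms the timer and never does so for the NONE type, so the handler may assert rather than test.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_GUESTS 4

enum toy_ack { TOY_ACK_NONE, TOY_ACK_UNMASK, TOY_ACK_EOI };

struct toy_action {
    enum toy_ack ack_type;          /* written once in toy_setup() */
    bool timer_armed;
    unsigned int nr_guests;
    unsigned int in_flight;
    bool masked[TOY_MAX_GUESTS];    /* per-guest "EOI still pending" */
};

static void toy_setup(struct toy_action *a, enum toy_ack type,
                      unsigned int nr_guests)
{
    unsigned int i;

    assert(nr_guests <= TOY_MAX_GUESTS);
    a->ack_type = type;
    a->timer_armed = false;
    a->nr_guests = nr_guests;
    a->in_flight = 0;
    for ( i = 0; i < TOY_MAX_GUESTS; i++ )
        a->masked[i] = false;
}

/* Delivery path: the only place that arms the timer. */
static void toy_deliver(struct toy_action *a)
{
    unsigned int i;

    if ( a->ack_type == TOY_ACK_NONE )
        return;                     /* nothing to track, timer stays off */

    for ( i = 0; i < a->nr_guests; i++ )
        if ( !a->masked[i] )
        {
            a->masked[i] = true;
            a->in_flight++;
        }

    a->timer_armed = true;
}

/* Timer handler: returns the in-flight count left after the forced EOI. */
static unsigned int toy_timer_fn(struct toy_action *a)
{
    unsigned int i;

    /* Holds by construction: toy_deliver() never arms for TOY_ACK_NONE. */
    assert(a->ack_type != TOY_ACK_NONE);

    a->timer_armed = false;

    if ( !a->in_flight )
        return 0;

    for ( i = 0; i < a->nr_guests; i++ )
        if ( a->masked[i] )
        {
            a->masked[i] = false;
            a->in_flight--;
        }

    return a->in_flight;
}
```

A non-zero return from `toy_timer_fn()` would correspond to the "still in flight at forced EOI" situation discussed earlier in the thread.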
On Wed, May 08, 2019 at 06:48:16AM -0600, Jan Beulich wrote:
> action->ack_type is set once before the timer even gets initialized, and
> is never changed later. The timer gets activated only for EOI and UNMASK
> types. Hence there's no need to have a respective if() in there. Replace
> it by an ASSERT().
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Just one comment below which I'm not overly fussed about.

>
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1103,7 +1103,7 @@ static void set_eoi_ready(void *data);
>  static void irq_guest_eoi_timer_fn(void *data)
>  {
>      struct irq_desc *desc = data;
> -    unsigned int irq = desc - irq_desc;
> +    unsigned int i, irq = desc - irq_desc;
>      irq_guest_action_t *action;
>      cpumask_t cpu_eoi_map;
>
> @@ -1114,19 +1114,18 @@ static void irq_guest_eoi_timer_fn(void
>
>      action = (irq_guest_action_t *)desc->action;
>
> +    ASSERT(action->ack_type != ACKTYPE_NONE);
> +
>      if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>          goto out;
>
> -    if ( action->ack_type != ACKTYPE_NONE )
> +    for ( i = 0; i < action->nr_guests; i++ )
>      {
> -        unsigned int i;
> -        for ( i = 0; i < action->nr_guests; i++ )
> -        {
> -            struct domain *d = action->guest[i];

I think you could constify d here.

Thanks.

>>> On 16.05.19 at 15:52, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 06:48:16AM -0600, Jan Beulich wrote:
>> action->ack_type is set once before the timer even gets initialized, and
>> is never changed later. The timer gets activated only for EOI and UNMASK
>> types. Hence there's no need to have a respective if() in there. Replace
>> it by an ASSERT().
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> @@ -1114,19 +1114,18 @@ static void irq_guest_eoi_timer_fn(void
>>
>>      action = (irq_guest_action_t *)desc->action;
>>
>> +    ASSERT(action->ack_type != ACKTYPE_NONE);
>> +
>>      if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>>          goto out;
>>
>> -    if ( action->ack_type != ACKTYPE_NONE )
>> +    for ( i = 0; i < action->nr_guests; i++ )
>>      {
>> -        unsigned int i;
>> -        for ( i = 0; i < action->nr_guests; i++ )
>> -        {
>> -            struct domain *d = action->guest[i];
>
> I think you could constify d here.

Ah yes, this should work.

Jan

>>> On 16.05.19 at 15:52, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 06:48:16AM -0600, Jan Beulich wrote:
>> @@ -1114,19 +1114,18 @@ static void irq_guest_eoi_timer_fn(void
>>
>>      action = (irq_guest_action_t *)desc->action;
>>
>> +    ASSERT(action->ack_type != ACKTYPE_NONE);
>> +
>>      if ( !action->in_flight || timer_is_active(&action->eoi_timer) )
>>          goto out;
>>
>> -    if ( action->ack_type != ACKTYPE_NONE )
>> +    for ( i = 0; i < action->nr_guests; i++ )
>>      {
>> -        unsigned int i;
>> -        for ( i = 0; i < action->nr_guests; i++ )
>> -        {
>> -            struct domain *d = action->guest[i];
>
> I think you could constify d here.

Now that I've tried I recall that I did so already when originally
putting together the patch. It doesn't work, because
radix_tree_lookup() requires a non-const pointer.

Jan

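The const-ness problem mentioned above is a general C property and can be shown with toy types. This is a sketch only: `toy_tree`, `toy_domain`, `toy_tree_lookup()` and `toy_pirq_info()` are hypothetical stand-ins, and the assumption that the real accessor hands the address of a tree embedded in the domain to a lookup taking a non-const root is taken from the message, not checked against the Xen sources.

```c
#include <assert.h>
#include <stddef.h>

struct toy_tree { int slots[4]; };
struct toy_domain { struct toy_tree pirq_tree; };

/* Lookup taking a non-const root, as the message says of radix_tree_lookup(). */
static int *toy_tree_lookup(struct toy_tree *root, unsigned int idx)
{
    return idx < 4 ? &root->slots[idx] : NULL;
}

/* Hypothetical stand-in for a pirq_info()-style accessor macro. */
#define toy_pirq_info(d, idx) toy_tree_lookup(&(d)->pirq_tree, idx)

/* Works because d is a pointer to a non-const domain. */
static int toy_read_slot(struct toy_domain *d, unsigned int idx)
{
    int *slot = toy_pirq_info(d, idx);

    return slot ? *slot : -1;
}

/*
 * With "const struct toy_domain *d" instead, &d->pirq_tree would have type
 * "const struct toy_tree *". Passing that to toy_tree_lookup() discards the
 * qualifier, which compilers diagnose and which breaks any build that
 * treats warnings as errors.
 */
```

Making the lookup itself take a pointer to const would be the other way out, but that is a change to the tree API rather than to the caller.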
On 08/05/2019 13:48, Jan Beulich wrote:
> action->ack_type is set once before the timer even gets initialized, and
> is never changed later. The timer gets activated only for EOI and UNMASK
> types. Hence there's no need to have a respective if() in there. Replace
> it by an ASSERT().
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>