First and foremost this series is trying to deal with CPU offlining
issues, which have become more prominent with the recently added SMT
enable/disable operation in xen-hptool. Later patches in the series
then carry out more or less unrelated changes (hopefully improvements)
noticed while looking at various pieces of involved code.

01: deal with move-in-progress state in fixup_irqs()
02: deal with move cleanup count state in fixup_irqs()
03: avoid UB (or worse) in trace_irq_mask()
04: improve dump_irqs()
05: desc->affinity should strictly represent the requested value
06: consolidate use of ->arch.cpu_mask
07: fix locking around vector management
08: correct/tighten vector check in _clear_irq_vector()
09: make fixup_irqs() skip unconnected internally used interrupts
10: reduce unused space in struct arch_irq_desc
11: drop redundant cpumask_empty() from move_masked_irq()
12: simplify and rename pirq_acktype()

In principle patches 1-3, 5-7, and maybe 9 are backporting candidates.
Their intrusive nature makes wanting to do so questionable, though.

I'm omitting the final v1 "x86/IO-APIC: drop an unused variable from
setup_IO_APIC_irqs()" here, as it was acked already and is entirely
independent of this series.

For other v2 specific information please see the individual patches.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
The flag being set may prevent affinity changes, as these often imply assignment of a new vector. When there's no possible destination left for the IRQ, the clearing of the flag needs to happen right from fixup_irqs(). Additionally _assign_irq_vector() needs to avoid setting the flag when there's no online CPU left in what gets put into ->arch.old_cpu_mask. The old vector can be released right away in this case. Also extend the log message about broken affinity to include the new affinity as well, allowing to notice issues with affinity changes not actually having taken place. Swap the if/else-if order there at the same time to reduce the amount of conditions checked. At the same time replace two open coded instances of the new helper function. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v2: Add/use valid_irq_vector(). v1b: Also update vector_irq[] in the code added to fixup_irqs(). --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -99,6 +99,11 @@ void unlock_vector_lock(void) spin_unlock(&vector_lock); } +static inline bool valid_irq_vector(unsigned int vector) +{ + return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_HIPRIORITY_VECTOR; +} + static void trace_irq_mask(u32 event, int irq, int vector, cpumask_t *mask) { struct { @@ -242,6 +247,22 @@ void destroy_irq(unsigned int irq) xfree(action); } +static void release_old_vec(struct irq_desc *desc) +{ + unsigned int vector = desc->arch.old_vector; + + desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED; + cpumask_clear(desc->arch.old_cpu_mask); + + if ( !valid_irq_vector(vector) ) + ASSERT_UNREACHABLE(); + else if ( desc->arch.used_vectors ) + { + ASSERT(test_bit(vector, desc->arch.used_vectors)); + clear_bit(vector, desc->arch.used_vectors); + } +} + static void __clear_irq_vector(int irq) { int cpu, vector, old_vector; @@ -285,14 +306,7 @@ static void __clear_irq_vector(int irq) per_cpu(vector_irq, cpu)[old_vector] = ~irq; } - desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED; - 
cpumask_clear(desc->arch.old_cpu_mask); - - if ( desc->arch.used_vectors ) - { - ASSERT(test_bit(old_vector, desc->arch.used_vectors)); - clear_bit(old_vector, desc->arch.used_vectors); - } + release_old_vec(desc); desc->arch.move_in_progress = 0; } @@ -517,12 +531,21 @@ next: /* Found one! */ current_vector = vector; current_offset = offset; - if (old_vector > 0) { - desc->arch.move_in_progress = 1; - cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask); + + if ( old_vector > 0 ) + { + cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask, + &cpu_online_map); desc->arch.old_vector = desc->arch.vector; + if ( !cpumask_empty(desc->arch.old_cpu_mask) ) + desc->arch.move_in_progress = 1; + else + /* This can happen while offlining a CPU. */ + release_old_vec(desc); } + trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask); + for_each_cpu(new_cpu, &tmp_mask) per_cpu(vector_irq, new_cpu)[vector] = irq; desc->arch.vector = vector; @@ -691,14 +714,8 @@ void irq_move_cleanup_interrupt(struct c if ( desc->arch.move_cleanup_count == 0 ) { - desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED; - cpumask_clear(desc->arch.old_cpu_mask); - - if ( desc->arch.used_vectors ) - { - ASSERT(test_bit(vector, desc->arch.used_vectors)); - clear_bit(vector, desc->arch.used_vectors); - } + ASSERT(vector == desc->arch.old_vector); + release_old_vec(desc); } unlock: spin_unlock(&desc->lock); @@ -2391,6 +2408,33 @@ void fixup_irqs(const cpumask_t *mask, b continue; } + /* + * In order for the affinity adjustment below to be successful, we + * need __assign_irq_vector() to succeed. This in particular means + * clearing desc->arch.move_in_progress if this would otherwise + * prevent the function from succeeding. Since there's no way for the + * flag to get cleared anymore when there's no possible destination + * left (the only possibility then would be the IRQs enabled window + * after this loop), there's then also no race with us doing it here. 
+ * + * Therefore the logic here and there need to remain in sync. + */ + if ( desc->arch.move_in_progress && + !cpumask_intersects(mask, desc->arch.cpu_mask) ) + { + unsigned int cpu; + + cpumask_and(&affinity, desc->arch.old_cpu_mask, &cpu_online_map); + + spin_lock(&vector_lock); + for_each_cpu(cpu, &affinity) + per_cpu(vector_irq, cpu)[desc->arch.old_vector] = ~irq; + spin_unlock(&vector_lock); + + release_old_vec(desc); + desc->arch.move_in_progress = 0; + } + cpumask_and(&affinity, &affinity, mask); if ( cpumask_empty(&affinity) ) { @@ -2409,15 +2453,18 @@ void fixup_irqs(const cpumask_t *mask, b if ( desc->handler->enable ) desc->handler->enable(desc); + cpumask_copy(&affinity, desc->affinity); + spin_unlock(&desc->lock); if ( !verbose ) continue; - if ( break_affinity && set_affinity ) - printk("Broke affinity for irq %i\n", irq); - else if ( !set_affinity ) - printk("Cannot set affinity for irq %i\n", irq); + if ( !set_affinity ) + printk("Cannot set affinity for IRQ%u\n", irq); + else if ( break_affinity ) + printk("Broke affinity for IRQ%u, new: %*pb\n", + irq, nr_cpu_ids, &affinity); } /* That doesn't seem sufficient. Give it 1ms. */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
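The decision added to __assign_irq_vector() in this patch can be modelled outside of Xen roughly as below. This is only a sketch: 64-bit integers stand in for cpumask objects, and `struct irq_model`, `start_move()` and `VECTOR_UNASSIGNED` are hypothetical names made up for illustration, not Xen interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VECTOR_UNASSIGNED (-1)   /* stand-in for IRQ_VECTOR_UNASSIGNED */

/* Hypothetical, much reduced model of the relevant irq_desc fields. */
struct irq_model {
    uint64_t cpu_mask, old_cpu_mask;  /* one bit per CPU */
    int vector, old_vector;
    bool move_in_progress;
};

/*
 * Switch the IRQ to a new vector/mask. The move-in-progress flag is only
 * set when at least one CPU of the previous target set is still online;
 * otherwise nothing could ever clear the flag again, so the old vector is
 * released right away.
 */
static void start_move(struct irq_model *d, int new_vector,
                       uint64_t new_mask, uint64_t online)
{
    assert(d != NULL);

    d->old_cpu_mask = d->cpu_mask & online;
    d->old_vector = d->vector;

    if ( d->old_cpu_mask )
        d->move_in_progress = true;
    else
    {
        /* This can happen while offlining a CPU. */
        d->old_vector = VECTOR_UNASSIGNED;
        d->old_cpu_mask = 0;
    }

    d->vector = new_vector;
    d->cpu_mask = new_mask;
}
```

In the model, moving an IRQ away from a CPU that has already left the online mask leaves no pending move state behind, which is what allows a later affinity change to assign a vector again.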
On Wed, May 08, 2019 at 07:03:09AM -0600, Jan Beulich wrote:
> The flag being set may prevent affinity changes, as these often imply
> assignment of a new vector. When there's no possible destination left
> for the IRQ, the clearing of the flag needs to happen right from
> fixup_irqs().
>
> Additionally _assign_irq_vector() needs to avoid setting the flag when
> there's no online CPU left in what gets put into ->arch.old_cpu_mask.
> The old vector can be released right away in this case.
>
> Also extend the log message about broken affinity to include the new
> affinity as well, allowing to notice issues with affinity changes not
> actually having taken place. Swap the if/else-if order there at the
> same time to reduce the amount of conditions checked.
>
> At the same time replace two open coded instances of the new helper
> function.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Thanks,

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

One comment below.

> ---
> v2: Add/use valid_irq_vector().
> v1b: Also update vector_irq[] in the code added to fixup_irqs().
> > --- a/xen/arch/x86/irq.c > +++ b/xen/arch/x86/irq.c > @@ -99,6 +99,11 @@ void unlock_vector_lock(void) > spin_unlock(&vector_lock); > } > > +static inline bool valid_irq_vector(unsigned int vector) > +{ > + return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_HIPRIORITY_VECTOR; > +} > + > static void trace_irq_mask(u32 event, int irq, int vector, cpumask_t *mask) > { > struct { > @@ -242,6 +247,22 @@ void destroy_irq(unsigned int irq) > xfree(action); > } > > +static void release_old_vec(struct irq_desc *desc) > +{ > + unsigned int vector = desc->arch.old_vector; > + > + desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED; > + cpumask_clear(desc->arch.old_cpu_mask); > + > + if ( !valid_irq_vector(vector) ) > + ASSERT_UNREACHABLE(); > + else if ( desc->arch.used_vectors ) > + { > + ASSERT(test_bit(vector, desc->arch.used_vectors)); > + clear_bit(vector, desc->arch.used_vectors); > + } > +} > + > static void __clear_irq_vector(int irq) > { > int cpu, vector, old_vector; > @@ -285,14 +306,7 @@ static void __clear_irq_vector(int irq) > per_cpu(vector_irq, cpu)[old_vector] = ~irq; > } > > - desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED; > - cpumask_clear(desc->arch.old_cpu_mask); > - > - if ( desc->arch.used_vectors ) > - { > - ASSERT(test_bit(old_vector, desc->arch.used_vectors)); > - clear_bit(old_vector, desc->arch.used_vectors); > - } > + release_old_vec(desc); > > desc->arch.move_in_progress = 0; > } > @@ -517,12 +531,21 @@ next: > /* Found one! */ > current_vector = vector; > current_offset = offset; > - if (old_vector > 0) { > - desc->arch.move_in_progress = 1; > - cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask); > + > + if ( old_vector > 0 ) Maybe you could use valid_irq_vector here, or compare against IRQ_VECTOR_UNASSIGNED? The fact that IRQ_VECTOR_UNASSIGNED is a negative value is an implementation detail that shouldn't be exposed directly in the code IMO. Roger. 
>>> On 13.05.19 at 11:04, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 07:03:09AM -0600, Jan Beulich wrote:
>> The flag being set may prevent affinity changes, as these often imply
>> assignment of a new vector. When there's no possible destination left
>> for the IRQ, the clearing of the flag needs to happen right from
>> fixup_irqs().
>>
>> Additionally _assign_irq_vector() needs to avoid setting the flag when
>> there's no online CPU left in what gets put into ->arch.old_cpu_mask.
>> The old vector can be released right away in this case.
>>
>> Also extend the log message about broken affinity to include the new
>> affinity as well, allowing to notice issues with affinity changes not
>> actually having taken place. Swap the if/else-if order there at the
>> same time to reduce the amount of conditions checked.
>>
>> At the same time replace two open coded instances of the new helper
>> function.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Thanks,
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> @@ -517,12 +531,21 @@ next:
>>          /* Found one! */
>>          current_vector = vector;
>>          current_offset = offset;
>> -        if (old_vector > 0) {
>> -            desc->arch.move_in_progress = 1;
>> -            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
>> +
>> +        if ( old_vector > 0 )
>
> Maybe you could use valid_irq_vector here, or compare against
> IRQ_VECTOR_UNASSIGNED?

Not in this patch, but I'd like to widen the use of valid_irq_vector()
subsequently, which would likely also include this case.

Jan
The cleanup IPI may get sent immediately before a CPU gets removed from the online map. In such a case the IPI would get handled on the CPU being offlined no earlier than in the interrupts disabled window after fixup_irqs()' main loop. This is too late, however, because a possible affinity change may incur the need for vector assignment, which will fail when the IRQ's move cleanup count is still non-zero. To fix this - record the set of CPUs the cleanup IPIs gets actually sent to alongside setting their count, - adjust the count in fixup_irqs(), accounting for all CPUs that the cleanup IPI was sent to, but that are no longer online, - bail early from the cleanup IPI handler when the CPU is no longer online, to prevent double accounting. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -665,6 +665,9 @@ void irq_move_cleanup_interrupt(struct c ack_APIC_irq(); me = smp_processor_id(); + if ( !cpu_online(me) ) + return; + for ( vector = FIRST_DYNAMIC_VECTOR; vector <= LAST_HIPRIORITY_VECTOR; vector++) { @@ -724,11 +727,14 @@ unlock: static void send_cleanup_vector(struct irq_desc *desc) { - cpumask_t cleanup_mask; + cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask, + &cpu_online_map); + desc->arch.move_cleanup_count = cpumask_weight(desc->arch.old_cpu_mask); - cpumask_and(&cleanup_mask, desc->arch.old_cpu_mask, &cpu_online_map); - desc->arch.move_cleanup_count = cpumask_weight(&cleanup_mask); - send_IPI_mask(&cleanup_mask, IRQ_MOVE_CLEANUP_VECTOR); + if ( desc->arch.move_cleanup_count ) + send_IPI_mask(desc->arch.old_cpu_mask, IRQ_MOVE_CLEANUP_VECTOR); + else + release_old_vec(desc); desc->arch.move_in_progress = 0; } @@ -2401,6 +2407,16 @@ void fixup_irqs(const cpumask_t *mask, b vector <= LAST_HIPRIORITY_VECTOR ) cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask); + if ( desc->arch.move_cleanup_count ) + { + /* The cleanup IPI may have got sent 
while we were still online. */ + cpumask_andnot(&affinity, desc->arch.old_cpu_mask, + &cpu_online_map); + desc->arch.move_cleanup_count -= cpumask_weight(&affinity); + if ( !desc->arch.move_cleanup_count ) + release_old_vec(desc); + } + cpumask_copy(&affinity, desc->affinity); if ( !desc->action || cpumask_subset(&affinity, mask) ) { _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
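The accounting adjustment made in fixup_irqs() by this patch can be sketched in isolation as follows. This is an illustrative model only: a 64-bit integer stands in for a cpumask, and `mask_weight()` and `adjust_cleanup_count()` are hypothetical helpers rather than Xen functions.

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits; a portable stand-in for cpumask_weight(). */
static unsigned int mask_weight(uint64_t m)
{
    unsigned int n = 0;

    for ( ; m; m &= m - 1 )   /* clear the lowest set bit each round */
        n++;

    return n;
}

/*
 * CPUs which were sent the cleanup IPI (old_mask) but have meanwhile left
 * the online mask will bail early from the handler, i.e. they will never
 * decrement the count themselves. Their share is subtracted here instead.
 */
static unsigned int adjust_cleanup_count(unsigned int count,
                                         uint64_t old_mask,
                                         uint64_t online_mask)
{
    unsigned int gone = mask_weight(old_mask & ~online_mask);

    assert(gone <= count);    /* the count was derived from old_mask */

    return count - gone;
}
```

A result of zero corresponds to the point where the patch calls release_old_vec(), since no further cleanup IPI handler invocation is to be expected.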
Dynamically allocated CPU mask objects may be smaller than cpumask_t, so
copying has to be restricted to the actual allocation size. This is
particularly important since the function doesn't bail early when tracing
is not active, so even production builds would be affected by potential
misbehavior here.

Take the opportunity and also
- use initializers instead of assignment + memset(),
- constify the cpumask_t input pointer,
- u32 -> uint32_t.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
---
TBD: I wonder whether the function shouldn't gain an early tb_init_done
     check, like many other trace_*() have.

George, despite your general request to be copied on entire series
rather than individual patches, I thought it would be better to copy
you on just this one (for its tracing aspect), as the patch here is
independent of the rest of the series, but at least one later patch
depends on the parameter constification done here.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -104,16 +104,19 @@ static inline bool valid_irq_vector(unsi
     return vector >= FIRST_DYNAMIC_VECTOR && vector <= LAST_HIPRIORITY_VECTOR;
 }
 
-static void trace_irq_mask(u32 event, int irq, int vector, cpumask_t *mask)
+static void trace_irq_mask(uint32_t event, int irq, int vector,
+                           const cpumask_t *mask)
 {
     struct {
         unsigned int irq:16, vec:16;
         unsigned int mask[6];
-    } d;
-    d.irq = irq;
-    d.vec = vector;
-    memset(d.mask, 0, sizeof(d.mask));
-    memcpy(d.mask, mask, min(sizeof(d.mask), sizeof(cpumask_t)));
+    } d = {
+        .irq = irq,
+        .vec = vector,
+    };
+
+    memcpy(d.mask, mask,
+           min(sizeof(d.mask), BITS_TO_LONGS(nr_cpu_ids) * sizeof(long)));
     trace_var(event, 1, sizeof(d), &d);
 }
 
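For illustration, the size clamp can be reproduced in a standalone form. The sketch below is not Xen code: `nr_cpu_ids` is a plain variable here, `BITS_TO_LONGS()` is redefined locally, and `copy_mask_bounded()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG       (sizeof(unsigned long) * 8)
#define BITS_TO_LONGS(bits) (((bits) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Stand-in for Xen's nr_cpu_ids; the value is an assumption of the sketch. */
static unsigned int nr_cpu_ids = 4;

/*
 * Copy a dynamically sized CPU mask into a fixed size buffer. The amount
 * copied is limited both by the destination size and by the number of
 * longs actually backing the source mask, so neither buffer is overrun.
 * Returns the number of bytes copied.
 */
static size_t copy_mask_bounded(void *dst, size_t dst_size,
                                const unsigned long *src)
{
    size_t src_size = BITS_TO_LONGS(nr_cpu_ids) * sizeof(unsigned long);
    size_t n = dst_size < src_size ? dst_size : src_size;

    assert(dst != NULL && src != NULL);

    memcpy(dst, src, n);

    return n;
}
```

With a zero-initialized destination, as the initializer in the patch provides, the bytes beyond the copied part stay zero, which is why the separate memset() can go away.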
On Wed, May 08, 2019 at 07:07:21AM -0600, Jan Beulich wrote:
> Dynamically allocated CPU mask objects may be smaller than cpumask_t, so
> copying has to be restricted to the actual allocation size. This is
> particularly important since the function doesn't bail early when tracing
> is not active, so even production builds would be affected by potential
> misbehavior here.
>
> Take the opportunity and also
> - use initializers instead of assignment + memset(),
> - constify the cpumask_t input pointer,
> - u32 -> uint32_t.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.
On 5/8/19 2:07 PM, Jan Beulich wrote:
> Dynamically allocated CPU mask objects may be smaller than cpumask_t, so
> copying has to be restricted to the actual allocation size. This is
> particularly important since the function doesn't bail early when tracing
> is not active, so even production builds would be affected by potential
> misbehavior here.
>
> Take the opportunity and also
> - use initializers instead of assignment + memset(),
> - constify the cpumask_t input pointer,
> - u32 -> uint32_t.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: New.
> ---
> TBD: I wonder whether the function shouldn't gain an early tb_init_done
>      check, like many other trace_*() have.

Yeah, avoiding these memcopies when tracing is not enabled seems like a
good thing.

Either way:

Acked-by: George Dunlap <george.dunlap@citrix.com>

>
> George, despite your general request to be copied on entire series
> rather than individual patches, I thought it would be better to copy
> you on just this one (for its tracing aspect), as the patch here is
> independent of the rest of the series, but at least one later patch
> depends on the parameter constification done here.

Yes, I think in this case this was the easiest thing for me. Thanks. :-)

 -George
>>> On 13.05.19 at 12:42, <george.dunlap@citrix.com> wrote:
> On 5/8/19 2:07 PM, Jan Beulich wrote:
>> TBD: I wonder whether the function shouldn't gain an early tb_init_done
>>      check, like many other trace_*() have.
>
> Yeah, avoiding these memcopies when tracing is not enabled seems like a
> good thing.

I've taken note to submit a respective follow-on patch.

> Either way:
>
> Acked-by: George Dunlap <george.dunlap@citrix.com>

Thanks, Jan
Don't log a stray trailing comma. Shorten a few fields.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2328,7 +2328,7 @@ static void dump_irqs(unsigned char key)
 
         spin_lock_irqsave(&desc->lock, flags);
 
-        printk(" IRQ:%4d affinity:%*pb vec:%02x type=%-15s status=%08x ",
+        printk(" IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ",
                irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector,
                desc->handler->typename, desc->status);
 
@@ -2339,23 +2339,21 @@ static void dump_irqs(unsigned char key)
         {
             action = (irq_guest_action_t *)desc->action;
 
-            printk("in-flight=%d domain-list=", action->in_flight);
+            printk("in-flight=%d%c",
+                   action->in_flight, action->nr_guests ? ' ' : '\n');
 
-            for ( i = 0; i < action->nr_guests; i++ )
+            for ( i = 0; i < action->nr_guests; )
             {
-                d = action->guest[i];
+                d = action->guest[i++];
                 pirq = domain_irq_to_pirq(d, irq);
                 info = pirq_info(d, pirq);
-                printk("%u:%3d(%c%c%c)",
+                printk("d%d:%3d(%c%c%c)%c",
                        d->domain_id, pirq,
                        evtchn_port_is_pending(d, info->evtchn) ? 'P' : '-',
                        evtchn_port_is_masked(d, info->evtchn) ? 'M' : '-',
-                       (info->masked ? 'M' : '-'));
-                if ( i != action->nr_guests )
-                    printk(",");
+                       info->masked ? 'M' : '-',
+                       i < action->nr_guests ? ',' : '\n');
             }
-
-            printk("\n");
         }
         else if ( desc->action )
             printk("%ps()\n", desc->action->handler);
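The separator handling used in the loop above (advance the index while fetching the element, then pick ',' or '\n' depending on whether more elements follow) can be tried out standalone. The sketch below writes into a buffer rather than using printk(); `format_list()` is a hypothetical helper, not anything from the Xen tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format nr domain IDs as "d<id>,d<id>,...\n" into buf. Because the index
 * is incremented before the separator gets chosen, the last element is
 * followed by a newline instead of a stray comma.
 * Returns the string length, or -1 if the buffer is too small.
 */
static int format_list(char *buf, size_t size, const int *ids,
                       unsigned int nr)
{
    int len = 0;
    unsigned int i;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';

    for ( i = 0; i < nr; )
    {
        int id = ids[i++];
        int n = snprintf(buf + len, size - len, "d%d%c",
                         id, i < nr ? ',' : '\n');

        if ( n < 0 || (size_t)n >= size - len )
            return -1;

        len += n;
    }

    return len;
}
```

The original loop compared `i != action->nr_guests` with the pre-increment index, which is always true inside the loop body; this is where the trailing comma came from.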
desc->arch.cpu_mask reflects the actual set of target CPUs. Don't ever fiddle with desc->affinity itself, except to store caller requested values. Note that assign_irq_vector() now takes a NULL incoming CPU mask to mean "all CPUs" now, rather than just "all currently online CPUs". This way no further affinity adjustment is needed after onlining further CPUs. This renders both set_native_irq_info() uses (which weren't using proper locking anyway) redundant - drop the function altogether. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> --- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -1042,7 +1042,6 @@ static void __init setup_IO_APIC_irqs(vo SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS)); spin_lock_irqsave(&ioapic_lock, flags); __ioapic_write_entry(apic, pin, 0, entry); - set_native_irq_info(irq, TARGET_CPUS); spin_unlock_irqrestore(&ioapic_lock, flags); } } @@ -2251,7 +2250,6 @@ int io_apic_set_pci_routing (int ioapic, spin_lock_irqsave(&ioapic_lock, flags); __ioapic_write_entry(ioapic, pin, 0, entry); - set_native_irq_info(irq, TARGET_CPUS); spin_unlock(&ioapic_lock); spin_lock(&desc->lock); --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -582,11 +582,16 @@ int assign_irq_vector(int irq, const cpu spin_lock_irqsave(&vector_lock, flags); ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS); - if (!ret) { + if ( !ret ) + { ret = desc->arch.vector; - cpumask_copy(desc->affinity, desc->arch.cpu_mask); + if ( mask ) + cpumask_copy(desc->affinity, mask); + else + cpumask_setall(desc->affinity); } spin_unlock_irqrestore(&vector_lock, flags); + return ret; } @@ -2328,9 +2333,10 @@ static void dump_irqs(unsigned char key) spin_lock_irqsave(&desc->lock, flags); - printk(" IRQ:%4d aff:%*pb vec:%02x %-15s status=%03x ", - irq, nr_cpu_ids, cpumask_bits(desc->affinity), desc->arch.vector, - desc->handler->typename, desc->status); + printk(" IRQ:%4d aff:%*pb/%*pb vec:%02x %-15s status=%03x ", + 
irq, nr_cpu_ids, cpumask_bits(desc->affinity), + nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask), + desc->arch.vector, desc->handler->typename, desc->status); if ( ssid ) printk("Z=%-25s ", ssid); @@ -2418,8 +2424,7 @@ void fixup_irqs(const cpumask_t *mask, b release_old_vec(desc); } - cpumask_copy(&affinity, desc->affinity); - if ( !desc->action || cpumask_subset(&affinity, mask) ) + if ( !desc->action || cpumask_subset(desc->affinity, mask) ) { spin_unlock(&desc->lock); continue; @@ -2452,12 +2457,13 @@ void fixup_irqs(const cpumask_t *mask, b desc->arch.move_in_progress = 0; } - cpumask_and(&affinity, &affinity, mask); - if ( cpumask_empty(&affinity) ) + if ( !cpumask_intersects(mask, desc->affinity) ) { break_affinity = true; - cpumask_copy(&affinity, mask); + cpumask_setall(&affinity); } + else + cpumask_copy(&affinity, desc->affinity); if ( desc->handler->disable ) desc->handler->disable(desc); --- a/xen/include/xen/irq.h +++ b/xen/include/xen/irq.h @@ -162,11 +162,6 @@ extern irq_desc_t *domain_spin_lock_irq_ extern irq_desc_t *pirq_spin_lock_irq_desc( const struct pirq *, unsigned long *pflags); -static inline void set_native_irq_info(unsigned int irq, const cpumask_t *mask) -{ - cpumask_copy(irq_to_desc(irq)->affinity, mask); -} - unsigned int set_desc_affinity(struct irq_desc *, const cpumask_t *); #ifndef arch_hwdom_irqs _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
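The split between the requested and the effective CPU set introduced here can be summarized by a small model. Again this is only a sketch under simplifying assumptions: 64-bit integers replace cpumask objects, and `struct irq_masks` and `record_assignment()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two masks kept per IRQ. */
struct irq_masks {
    uint64_t affinity;   /* what the caller asked for, stored unmodified */
    uint64_t cpu_mask;   /* where the vector actually got allocated */
};

/*
 * Record the result of a vector assignment. A NULL request means "all
 * CPUs" (not just the currently online ones), so no further adjustment is
 * needed once more CPUs come online.
 */
static void record_assignment(struct irq_masks *m, const uint64_t *requested,
                              uint64_t targets)
{
    assert(m != NULL);

    m->affinity = requested ? *requested : ~UINT64_C(0);
    m->cpu_mask = targets;
}
```

Keeping the two values apart is also what makes the "aff:%*pb/%*pb" output of dump_irqs() meaningful: the first mask shows the request, the second one the CPUs actually targeted.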
Mixed meaning was implied so far by different pieces of code -
disagreement was in particular about whether to expect offline CPUs'
bits to possibly be set. Switch to a mostly consistent meaning
(exception being high priority interrupts, which would perhaps better
be switched to the same model as well in due course). Use the field to
record the vector allocation mask, i.e. potentially including bits of
offline (parked) CPUs. This implies that before passing the mask to
certain functions (most notably cpu_mask_to_apicid()) it needs to be
further reduced to the online subset.

The exception of high priority interrupts is also why for the moment
_bind_irq_vector() is left as is, despite looking wrong: It's used
exclusively for IRQ0, which isn't supposed to move off CPU0 at any time.

The prior lack of restricting to online CPUs in set_desc_affinity()
before calling cpu_mask_to_apicid() in particular allowed (in x2APIC
clustered mode) offlined CPUs to end up enabled in an IRQ's destination
field. (I wonder whether vector_allocation_cpumask_flat() shouldn't
follow a similar model, using cpu_present_map in favor of
cpu_online_map.)

For IO-APIC code it was definitely wrong to potentially store, as a
fallback, TARGET_CPUS (i.e. all online ones) into the field, as that
would have caused problems when determining on which CPUs to release
vectors when they've gone out of use. Disable interrupts instead when
no valid target CPU can be established (which code elsewhere should
guarantee to never happen), and log a message in such an unlikely event.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
--- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -680,7 +680,7 @@ void /*__init*/ setup_ioapic_dest(void) continue; irq = pin_2_irq(irq_entry, ioapic, pin); desc = irq_to_desc(irq); - BUG_ON(cpumask_empty(desc->arch.cpu_mask)); + BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map)); set_ioapic_affinity_irq(desc, desc->arch.cpu_mask); } @@ -2197,7 +2197,6 @@ int io_apic_set_pci_routing (int ioapic, { struct irq_desc *desc = irq_to_desc(irq); struct IO_APIC_route_entry entry; - cpumask_t mask; unsigned long flags; int vector; @@ -2232,11 +2231,17 @@ int io_apic_set_pci_routing (int ioapic, return vector; entry.vector = vector; - cpumask_copy(&mask, TARGET_CPUS); - /* Don't chance ending up with an empty mask. */ - if (cpumask_intersects(&mask, desc->arch.cpu_mask)) - cpumask_and(&mask, &mask, desc->arch.cpu_mask); - SET_DEST(entry, logical, cpu_mask_to_apicid(&mask)); + if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) { + cpumask_t *mask = this_cpu(scratch_cpumask); + + cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS); + SET_DEST(entry, logical, cpu_mask_to_apicid(mask)); + } else { + printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n", + irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask), + nr_cpu_ids, cpumask_bits(TARGET_CPUS)); + desc->status |= IRQ_DISABLED; + } apic_printk(APIC_DEBUG, KERN_DEBUG "IOAPIC[%d]: Set PCI routing entry " "(%d-%d -> %#x -> IRQ %d Mode:%i Active:%i)\n", ioapic, @@ -2422,7 +2427,21 @@ int ioapic_guest_write(unsigned long phy /* Set the vector field to the real vector! 
*/ rte.vector = desc->arch.vector; - SET_DEST(rte, logical, cpu_mask_to_apicid(desc->arch.cpu_mask)); + if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) ) + { + cpumask_t *mask = this_cpu(scratch_cpumask); + + cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS); + SET_DEST(rte, logical, cpu_mask_to_apicid(mask)); + } + else + { + gprintk(XENLOG_ERR, "IRQ%d: no target CPU (%*pb vs %*pb)\n", + irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask), + nr_cpu_ids, cpumask_bits(TARGET_CPUS)); + desc->status |= IRQ_DISABLED; + rte.mask = 1; + } __ioapic_write_entry(apic, pin, 0, rte); --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -471,11 +471,13 @@ static int __assign_irq_vector( */ static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0; int cpu, err, old_vector; - cpumask_t tmp_mask; vmask_t *irq_used_vectors = NULL; old_vector = irq_to_vector(irq); - if (old_vector > 0) { + if ( old_vector > 0 ) + { + cpumask_t tmp_mask; + cpumask_and(&tmp_mask, mask, &cpu_online_map); if (cpumask_intersects(&tmp_mask, desc->arch.cpu_mask)) { desc->arch.vector = old_vector; @@ -498,7 +500,9 @@ static int __assign_irq_vector( else irq_used_vectors = irq_get_used_vector_mask(irq); - for_each_cpu(cpu, mask) { + for_each_cpu(cpu, mask) + { + const cpumask_t *vec_mask; int new_cpu; int vector, offset; @@ -506,8 +510,7 @@ static int __assign_irq_vector( if (!cpu_online(cpu)) continue; - cpumask_and(&tmp_mask, vector_allocation_cpumask(cpu), - &cpu_online_map); + vec_mask = vector_allocation_cpumask(cpu); vector = current_vector; offset = current_offset; @@ -528,7 +531,7 @@ next: && test_bit(vector, irq_used_vectors) ) goto next; - for_each_cpu(new_cpu, &tmp_mask) + for_each_cpu(new_cpu, vec_mask) if (per_cpu(vector_irq, new_cpu)[vector] >= 0) goto next; /* Found one! 
*/ @@ -547,12 +550,12 @@ next: release_old_vec(desc); } - trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask); + trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, vec_mask); - for_each_cpu(new_cpu, &tmp_mask) + for_each_cpu(new_cpu, vec_mask) per_cpu(vector_irq, new_cpu)[vector] = irq; desc->arch.vector = vector; - cpumask_copy(desc->arch.cpu_mask, &tmp_mask); + cpumask_copy(desc->arch.cpu_mask, vec_mask); desc->arch.used = IRQ_USED; ASSERT((desc->arch.used_vectors == NULL) @@ -783,6 +786,7 @@ unsigned int set_desc_affinity(struct ir cpumask_copy(desc->affinity, mask); cpumask_and(&dest_mask, mask, desc->arch.cpu_mask); + cpumask_and(&dest_mask, &dest_mask, &cpu_online_map); return cpu_mask_to_apicid(&dest_mask); } --- a/xen/include/asm-x86/irq.h +++ b/xen/include/asm-x86/irq.h @@ -32,6 +32,12 @@ struct irq_desc; struct arch_irq_desc { s16 vector; /* vector itself is only 8 bits, */ s16 old_vector; /* but we use -1 for unassigned */ + /* + * Except for high priority interrupts @cpu_mask may have bits set for + * offline CPUs. Consumers need to be careful to mask this down to + * online ones as necessary. There is supposed to always be a non- + * empty intersection with cpu_online_map. + */ cpumask_var_t cpu_mask; cpumask_var_t old_cpu_mask; cpumask_var_t pending_mask; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
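The rule this patch establishes for consumers of ->arch.cpu_mask (reduce to the online subset first, and refuse to program a destination if nothing is left) can be modelled as below. The sketch uses 64-bit integers in place of cpumask objects; `pick_dest()` is a hypothetical helper and does not correspond to any function in the Xen tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Derive the destination CPU set for an IRQ from its vector allocation
 * mask, which may include offline (parked) CPUs. Returns false and leaves
 * *dest untouched when no online CPU remains; a caller would then mask or
 * disable the interrupt and log a message, rather than falling back to
 * some unrelated set of CPUs.
 */
static bool pick_dest(uint64_t cpu_mask, uint64_t online_mask, uint64_t *dest)
{
    uint64_t m = cpu_mask & online_mask;

    assert(dest != NULL);

    if ( !m )
        return false;

    *dest = m;

    return true;
}
```

The failure path mirrors the else branches added to io_apic_set_pci_routing() and ioapic_guest_write(), where IRQ_DISABLED gets set instead of writing a destination derived from TARGET_CPUS.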
On Wed, May 08, 2019 at 07:10:29AM -0600, Jan Beulich wrote:
> Mixed meaning was implied so far by different pieces of code -
> disagreement was in particular about whether to expect offline CPUs'
> bits to possibly be set. Switch to a mostly consistent meaning
> (exception being high priority interrupts, which would perhaps better
> be switched to the same model as well in due course). Use the field to
> record the vector allocation mask, i.e. potentially including bits of
> offline (parked) CPUs. This implies that before passing the mask to
> certain functions (most notably cpu_mask_to_apicid()) it needs to be
> further reduced to the online subset.
>
> The exception of high priority interrupts is also why for the moment
> _bind_irq_vector() is left as is, despite looking wrong: It's used
> exclusively for IRQ0, which isn't supposed to move off CPU0 at any time.
>
> The prior lack of restricting to online CPUs in set_desc_affinity()
> before calling cpu_mask_to_apicid() in particular allowed (in x2APIC
> clustered mode) offlined CPUs to end up enabled in an IRQ's destination
> field. (I wonder whether vector_allocation_cpumask_flat() shouldn't
> follow a similar model, using cpu_present_map in favor of
> cpu_online_map.)
>
> For IO-APIC code it was definitely wrong to potentially store, as a
> fallback, TARGET_CPUS (i.e. all online ones) into the field, as that
> would have caused problems when determining on which CPUs to release
> vectors when they've gone out of use. Disable interrupts instead when
> no valid target CPU can be established (which code elsewhere should
> guarantee to never happen), and log a message in such an unlikely event.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Thanks.

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Some comments below.

> ---
> v2: New.
> > --- a/xen/arch/x86/io_apic.c > +++ b/xen/arch/x86/io_apic.c > @@ -680,7 +680,7 @@ void /*__init*/ setup_ioapic_dest(void) > continue; > irq = pin_2_irq(irq_entry, ioapic, pin); > desc = irq_to_desc(irq); > - BUG_ON(cpumask_empty(desc->arch.cpu_mask)); > + BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map)); I wonder if maybe you could instead do: if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) ) set_ioapic_affinity_irq(desc, desc->arch.cpu_mask); else ASSERT_UNREACHABLE(); I guess if the IRQ is in use by Xen itself the failure ought to be fatal. > set_ioapic_affinity_irq(desc, desc->arch.cpu_mask); > } > > @@ -2197,7 +2197,6 @@ int io_apic_set_pci_routing (int ioapic, > { > struct irq_desc *desc = irq_to_desc(irq); > struct IO_APIC_route_entry entry; > - cpumask_t mask; > unsigned long flags; > int vector; > > @@ -2232,11 +2231,17 @@ int io_apic_set_pci_routing (int ioapic, > return vector; > entry.vector = vector; > > - cpumask_copy(&mask, TARGET_CPUS); > - /* Don't chance ending up with an empty mask. */ > - if (cpumask_intersects(&mask, desc->arch.cpu_mask)) > - cpumask_and(&mask, &mask, desc->arch.cpu_mask); > - SET_DEST(entry, logical, cpu_mask_to_apicid(&mask)); > + if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) { > + cpumask_t *mask = this_cpu(scratch_cpumask); > + > + cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS); > + SET_DEST(entry, logical, cpu_mask_to_apicid(mask)); > + } else { > + printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n", > + irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask), > + nr_cpu_ids, cpumask_bits(TARGET_CPUS)); > + desc->status |= IRQ_DISABLED; > + } Hm, part of this file doesn't seem to use Xen coding style, but the chunk you add below does use it. And there are functions (like mask_and_ack_level_ioapic_irq that seem to use a mix of coding styles). I'm not sure what's the policy here, should new chunks follow Xen's coding style? 
>
>      apic_printk(APIC_DEBUG, KERN_DEBUG "IOAPIC[%d]: Set PCI routing entry "
>                  "(%d-%d -> %#x -> IRQ %d Mode:%i Active:%i)\n", ioapic,
> @@ -2422,7 +2427,21 @@ int ioapic_guest_write(unsigned long phy
>          /* Set the vector field to the real vector! */
>          rte.vector = desc->arch.vector;
>
> -        SET_DEST(rte, logical, cpu_mask_to_apicid(desc->arch.cpu_mask));
> +        if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) )
> +        {
> +            cpumask_t *mask = this_cpu(scratch_cpumask);
> +
> +            cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
> +            SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
> +        }
> +        else
> +        {
> +            gprintk(XENLOG_ERR, "IRQ%d: no target CPU (%*pb vs %*pb)\n",
> +                    irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
> +                    nr_cpu_ids, cpumask_bits(TARGET_CPUS));
> +            desc->status |= IRQ_DISABLED;
> +            rte.mask = 1;
> +        }
>
>          __ioapic_write_entry(apic, pin, 0, rte);
>
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -471,11 +471,13 @@ static int __assign_irq_vector(
>       */
>      static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0;
>      int cpu, err, old_vector;
> -    cpumask_t tmp_mask;
>      vmask_t *irq_used_vectors = NULL;
>
>      old_vector = irq_to_vector(irq);
> -    if (old_vector > 0) {
> +    if ( old_vector > 0 )

Another candidate to switch to valid_irq_vector or at least make an
explicit comparison with IRQ_VECTOR_UNASSIGNED.

Seeing your reply to my comment in that direction on a previous patch
this can be done as a follow up.

Roger.
>>> On 13.05.19 at 13:32, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 07:10:29AM -0600, Jan Beulich wrote:
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -680,7 +680,7 @@ void /*__init*/ setup_ioapic_dest(void)
>>              continue;
>>          irq = pin_2_irq(irq_entry, ioapic, pin);
>>          desc = irq_to_desc(irq);
>> -        BUG_ON(cpumask_empty(desc->arch.cpu_mask));
>> +        BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
>
> I wonder if maybe you could instead do:
>
> if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) )
>     set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
> else
>     ASSERT_UNREACHABLE();
>
> I guess if the IRQ is in use by Xen itself the failure ought to be
> fatal.

And imo also when it's another one (used by Dom0). Iirc we get
here only during Dom0 boot (the commented out __init serving as a
hint). Hence I think BUG_ON() is better in this case than any form
of assertion.

>> @@ -2232,11 +2231,17 @@ int io_apic_set_pci_routing (int ioapic,
>>          return vector;
>>      entry.vector = vector;
>>
>> -    cpumask_copy(&mask, TARGET_CPUS);
>> -    /* Don't chance ending up with an empty mask. */
>> -    if (cpumask_intersects(&mask, desc->arch.cpu_mask))
>> -        cpumask_and(&mask, &mask, desc->arch.cpu_mask);
>> -    SET_DEST(entry, logical, cpu_mask_to_apicid(&mask));
>> +    if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
>> +        cpumask_t *mask = this_cpu(scratch_cpumask);
>> +
>> +        cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
>> +        SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
>> +    } else {
>> +        printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
>> +               irq, nr_cpu_ids, cpumask_bits(desc->arch.cpu_mask),
>> +               nr_cpu_ids, cpumask_bits(TARGET_CPUS));
>> +        desc->status |= IRQ_DISABLED;
>> +    }
>
> Hm, part of this file doesn't seem to use Xen coding style, but the
> chunk you add below does use it. And there are functions (like
> mask_and_ack_level_ioapic_irq that seem to use a mix of coding
> styles).
>
> I'm not sure what's the policy here, should new chunks follow Xen's
> coding style?

Well, I've decided to match surrounding code's style, until the file
gets morphed into consistent shape.

Jan
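The rule this patch settles on (the stored mask may include offline CPUs, so consumers reduce it to the online subset and treat an empty result as an error) can be illustrated with a small stand-alone model. This is only a sketch: `toy_cpumask_t`, `toy_pick_dest()` and `TOY_BAD_CPU` are hypothetical names, a 64-bit integer stands in for `cpumask_t`, and picking the lowest set bit is a simplification of what `cpu_mask_to_apicid()` does.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for cpumask_t: bit N set means CPU N is in the mask. */
typedef uint64_t toy_cpumask_t;

/* Hypothetical error value, playing the role of BAD_APICID. */
#define TOY_BAD_CPU (-1)

/*
 * Pick a destination CPU for an interrupt. The allocation mask may still
 * contain offline (parked) CPUs, so it is reduced to the online subset
 * before use. When nothing is left, the caller is expected to disable the
 * interrupt and log a message rather than fall back to "all online CPUs".
 */
static int toy_pick_dest(toy_cpumask_t alloc_mask, toy_cpumask_t online_mask)
{
    toy_cpumask_t usable = alloc_mask & online_mask;
    int cpu;

    if ( !usable )
        return TOY_BAD_CPU;

    /* usable is non-zero here, so the scan stops before bit 64. */
    for ( cpu = 0; !(usable & ((toy_cpumask_t)1 << cpu)); cpu++ )
        ;

    return cpu;
}
```

With CPUs 2 and 3 in the allocation mask and only CPUs 0 and 2 online, the sketch selects CPU 2. With only CPU 3 allocated it reports the error value instead of silently retargeting the interrupt.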
All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc fields, and hence ought to be called with the descriptor lock held in addition to vector_lock. This is currently the case for only set_desc_affinity() (in the common case) and destroy_irq(), which also clarifies what the nesting behavior between the locks has to be. Reflect the new expectation by having these functions all take a descriptor as parameter instead of an interrupt number. Also take care of the two special cases of calls to set_desc_affinity(): set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called directly as well, and in these cases the descriptor locks hadn't got acquired till now. For set_ioapic_affinity_irq() this means acquiring / releasing of the IO-APIC lock can be plain spin_{,un}lock() then. Drop one of the two leading underscores from all three functions at the same time. There's one case left where descriptors get manipulated with just vector_lock held: setup_vector_irq() assumes its caller to acquire vector_lock, and hence can't itself acquire the descriptor locks (wrong lock order). I don't currently see how to address this. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v2: Also adjust set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity(). 
--- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -550,14 +550,14 @@ static void clear_IO_APIC (void) static void set_ioapic_affinity_irq(struct irq_desc *desc, const cpumask_t *mask) { - unsigned long flags; unsigned int dest; int pin, irq; struct irq_pin_list *entry; irq = desc->irq; - spin_lock_irqsave(&ioapic_lock, flags); + spin_lock(&ioapic_lock); + dest = set_desc_affinity(desc, mask); if (dest != BAD_APICID) { if ( !x2apic_enabled ) @@ -580,8 +580,8 @@ set_ioapic_affinity_irq(struct irq_desc entry = irq_2_pin + entry->next; } } - spin_unlock_irqrestore(&ioapic_lock, flags); + spin_unlock(&ioapic_lock); } /* @@ -674,16 +674,19 @@ void /*__init*/ setup_ioapic_dest(void) for (ioapic = 0; ioapic < nr_ioapics; ioapic++) { for (pin = 0; pin < nr_ioapic_entries[ioapic]; pin++) { struct irq_desc *desc; + unsigned long flags; irq_entry = find_irq_entry(ioapic, pin, mp_INT); if (irq_entry == -1) continue; irq = pin_2_irq(irq_entry, ioapic, pin); desc = irq_to_desc(irq); + + spin_lock_irqsave(&desc->lock, flags); BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map)); set_ioapic_affinity_irq(desc, desc->arch.cpu_mask); + spin_unlock_irqrestore(&desc->lock, flags); } - } } --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -27,6 +27,7 @@ #include <public/physdev.h> static int parse_irq_vector_map_param(const char *s); +static void _clear_irq_vector(struct irq_desc *desc); /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. 
*/ bool __read_mostly opt_noirqbalance; @@ -120,13 +121,12 @@ static void trace_irq_mask(uint32_t even trace_var(event, 1, sizeof(d), &d); } -static int __init __bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask) +static int __init _bind_irq_vector(struct irq_desc *desc, int vector, + const cpumask_t *cpu_mask) { cpumask_t online_mask; int cpu; - struct irq_desc *desc = irq_to_desc(irq); - BUG_ON((unsigned)irq >= nr_irqs); BUG_ON((unsigned)vector >= NR_VECTORS); cpumask_and(&online_mask, cpu_mask, &cpu_online_map); @@ -137,9 +137,9 @@ static int __init __bind_irq_vector(int return 0; if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED ) return -EBUSY; - trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, irq, vector, &online_mask); + trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, &online_mask); for_each_cpu(cpu, &online_mask) - per_cpu(vector_irq, cpu)[vector] = irq; + per_cpu(vector_irq, cpu)[vector] = desc->irq; desc->arch.vector = vector; cpumask_copy(desc->arch.cpu_mask, &online_mask); if ( desc->arch.used_vectors ) @@ -153,12 +153,18 @@ static int __init __bind_irq_vector(int int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask) { + struct irq_desc *desc = irq_to_desc(irq); unsigned long flags; int ret; - spin_lock_irqsave(&vector_lock, flags); - ret = __bind_irq_vector(irq, vector, cpu_mask); - spin_unlock_irqrestore(&vector_lock, flags); + BUG_ON((unsigned)irq >= nr_irqs); + + spin_lock_irqsave(&desc->lock, flags); + spin_lock(&vector_lock); + ret = _bind_irq_vector(desc, vector, cpu_mask); + spin_unlock(&vector_lock); + spin_unlock_irqrestore(&desc->lock, flags); + return ret; } @@ -243,7 +249,9 @@ void destroy_irq(unsigned int irq) spin_lock_irqsave(&desc->lock, flags); desc->handler = &no_irq_type; - clear_irq_vector(irq); + spin_lock(&vector_lock); + _clear_irq_vector(desc); + spin_unlock(&vector_lock); desc->arch.used_vectors = NULL; spin_unlock_irqrestore(&desc->lock, flags); @@ -266,11 +274,11 @@ static void 
release_old_vec(struct irq_d } } -static void __clear_irq_vector(int irq) +static void _clear_irq_vector(struct irq_desc *desc) { - int cpu, vector, old_vector; + unsigned int cpu; + int vector, old_vector, irq = desc->irq; cpumask_t tmp_mask; - struct irq_desc *desc = irq_to_desc(irq); BUG_ON(!desc->arch.vector); @@ -316,11 +324,14 @@ static void __clear_irq_vector(int irq) void clear_irq_vector(int irq) { + struct irq_desc *desc = irq_to_desc(irq); unsigned long flags; - spin_lock_irqsave(&vector_lock, flags); - __clear_irq_vector(irq); - spin_unlock_irqrestore(&vector_lock, flags); + spin_lock_irqsave(&desc->lock, flags); + spin_lock(&vector_lock); + _clear_irq_vector(desc); + spin_unlock(&vector_lock); + spin_unlock_irqrestore(&desc->lock, flags); } int irq_to_vector(int irq) @@ -455,8 +466,7 @@ static vmask_t *irq_get_used_vector_mask return ret; } -static int __assign_irq_vector( - int irq, struct irq_desc *desc, const cpumask_t *mask) +static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask) { /* * NOTE! The local APIC isn't very good at handling @@ -470,7 +480,8 @@ static int __assign_irq_vector( * 0x80, because int 0x80 is hm, kind of importantish. 
;) */ static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0; - int cpu, err, old_vector; + unsigned int cpu; + int err, old_vector, irq = desc->irq; vmask_t *irq_used_vectors = NULL; old_vector = irq_to_vector(irq); @@ -583,8 +594,12 @@ int assign_irq_vector(int irq, const cpu BUG_ON(irq >= nr_irqs || irq <0); - spin_lock_irqsave(&vector_lock, flags); - ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS); + spin_lock_irqsave(&desc->lock, flags); + + spin_lock(&vector_lock); + ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS); + spin_unlock(&vector_lock); + if ( !ret ) { ret = desc->arch.vector; @@ -593,7 +608,8 @@ int assign_irq_vector(int irq, const cpu else cpumask_setall(desc->affinity); } - spin_unlock_irqrestore(&vector_lock, flags); + + spin_unlock_irqrestore(&desc->lock, flags); return ret; } @@ -767,7 +783,6 @@ void irq_complete_move(struct irq_desc * unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask) { - unsigned int irq; int ret; unsigned long flags; cpumask_t dest_mask; @@ -775,10 +790,8 @@ unsigned int set_desc_affinity(struct ir if (!cpumask_intersects(mask, &cpu_online_map)) return BAD_APICID; - irq = desc->irq; - spin_lock_irqsave(&vector_lock, flags); - ret = __assign_irq_vector(irq, desc, mask); + ret = _assign_irq_vector(desc, mask); spin_unlock_irqrestore(&vector_lock, flags); if (ret < 0) --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a unsigned int node = rhsa ? 
pxm_to_node(rhsa->proximity_domain) : NUMA_NO_NODE; const cpumask_t *cpumask = &cpu_online_map; + struct irq_desc *desc; if ( node < MAX_NUMNODES && node_online(node) && cpumask_intersects(&node_to_cpumask(node), cpumask) ) cpumask = &node_to_cpumask(node); - dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask); + + desc = irq_to_desc(drhd->iommu->msi.irq); + spin_lock_irq(&desc->lock); + dma_msi_set_affinity(desc, cpumask); + spin_unlock_irq(&desc->lock); } static int adjust_vtd_irq_affinities(void)
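The nesting this patch establishes (descriptor lock taken first, vector_lock nested inside it, release in reverse order) can be sketched with a toy model. Everything here is hypothetical and for illustration only: `struct toy_lock` merely records whether a lock is held, and the `toy_*` functions are not Xen APIs. The point is the shape of the wrapper and the precondition of the inner helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only tracks its held state; no real synchronisation. */
struct toy_lock {
    bool held;
};

/* Toy descriptor with its own lock, mirroring desc->lock. */
struct toy_desc {
    struct toy_lock lock;
    int vector;
};

/* Toy counterpart of the global vector_lock. */
static struct toy_lock toy_vector_lock;

static void toy_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void toy_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Inner helper: takes a descriptor and requires both locks to be held. */
static void toy_clear_vector_locked(struct toy_desc *desc)
{
    assert(desc->lock.held && toy_vector_lock.held);
    desc->vector = -1; /* -1 stands for "unassigned" */
}

/* Wrapper: descriptor lock is the outer lock, vector lock the inner one. */
static void toy_clear_vector(struct toy_desc *desc)
{
    toy_acquire(&desc->lock);
    toy_acquire(&toy_vector_lock);
    toy_clear_vector_locked(desc);
    toy_release(&toy_vector_lock);
    toy_release(&desc->lock);
}
```

A function that is entered with only the vector lock held, as the commit message notes for setup_vector_irq(), cannot take the descriptor lock afterwards without inverting this order.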
On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
> All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> fields, and hence ought to be called with the descriptor lock held in
> addition to vector_lock. This is currently the case for only
> set_desc_affinity() (in the common case) and destroy_irq(), which also
> clarifies what the nesting behavior between the locks has to be.
> Reflect the new expectation by having these functions all take a
> descriptor as parameter instead of an interrupt number.
>
> Also take care of the two special cases of calls to set_desc_affinity():
> set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
> directly as well, and in these cases the descriptor locks hadn't got
> acquired till now. For set_ioapic_affinity_irq() this means acquiring /
> releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
>
> Drop one of the two leading underscores from all three functions at
> the same time.
>
> There's one case left where descriptors get manipulated with just
> vector_lock held: setup_vector_irq() assumes its caller to acquire
> vector_lock, and hence can't itself acquire the descriptor locks (wrong
> lock order). I don't currently see how to address this.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>                               : NUMA_NO_NODE;
>      const cpumask_t *cpumask = &cpu_online_map;
> +    struct irq_desc *desc;
>
>      if ( node < MAX_NUMNODES && node_online(node) &&
>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>          cpumask = &node_to_cpumask(node);
> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
> +
> +    desc = irq_to_desc(drhd->iommu->msi.irq);
> +    spin_lock_irq(&desc->lock);

I would use the irqsave/irqrestore variants here for extra safety.

Thanks, Roger.
>>> On 13.05.19 at 15:48, <roger.pau@citrix.com> wrote:
> On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>>                               : NUMA_NO_NODE;
>>      const cpumask_t *cpumask = &cpu_online_map;
>> +    struct irq_desc *desc;
>>
>>      if ( node < MAX_NUMNODES && node_online(node) &&
>>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>>          cpumask = &node_to_cpumask(node);
>> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
>> +
>> +    desc = irq_to_desc(drhd->iommu->msi.irq);
>> +    spin_lock_irq(&desc->lock);
>
> I would use the irqsave/irqrestore variants here for extra safety.

Hmm, maybe. But I think we're in bigger trouble if IRQs indeed
ended up enabled at any of the two points where this function
gets called.

Jan
On Mon, May 13, 2019 at 08:19:04AM -0600, Jan Beulich wrote:
> >>> On 13.05.19 at 15:48, <roger.pau@citrix.com> wrote:
> > On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
> >> --- a/xen/drivers/passthrough/vtd/iommu.c
> >> +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
> >>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
> >>                               : NUMA_NO_NODE;
> >>      const cpumask_t *cpumask = &cpu_online_map;
> >> +    struct irq_desc *desc;
> >>
> >>      if ( node < MAX_NUMNODES && node_online(node) &&
> >>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
> >>          cpumask = &node_to_cpumask(node);
> >> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
> >> +
> >> +    desc = irq_to_desc(drhd->iommu->msi.irq);
> >> +    spin_lock_irq(&desc->lock);
> >
> > I would use the irqsave/irqrestore variants here for extra safety.
>
> Hmm, maybe. But I think we're in bigger trouble if IRQs indeed
> ended up enabled at any of the two points where this function
> gets called.

I think I'm misreading the above, but if you expect
adjust_irq_affinity to always be called with interrupts disabled, using
spin_unlock_irq is wrong as it unconditionally enables interrupts.

Roger.
>>> On 13.05.19 at 16:45, <roger.pau@citrix.com> wrote:
> On Mon, May 13, 2019 at 08:19:04AM -0600, Jan Beulich wrote:
>> >>> On 13.05.19 at 15:48, <roger.pau@citrix.com> wrote:
>> > On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
>> >> --- a/xen/drivers/passthrough/vtd/iommu.c
>> >> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> >> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>> >>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>> >>                               : NUMA_NO_NODE;
>> >>      const cpumask_t *cpumask = &cpu_online_map;
>> >> +    struct irq_desc *desc;
>> >>
>> >>      if ( node < MAX_NUMNODES && node_online(node) &&
>> >>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>> >>          cpumask = &node_to_cpumask(node);
>> >> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
>> >> +
>> >> +    desc = irq_to_desc(drhd->iommu->msi.irq);
>> >> +    spin_lock_irq(&desc->lock);
>> >
>> > I would use the irqsave/irqrestore variants here for extra safety.
>>
>> Hmm, maybe. But I think we're in bigger trouble if IRQs indeed
>> ended up enabled at any of the two points where this function
>> gets called.
>
> I think I'm misreading the above, but if you expect
> adjust_irq_affinity to always be called with interrupts disabled using
> spin_unlock_irq is wrong as it unconditionally enables interrupts.

Oops - s/enabled/disabled/ in my earlier reply.

Jan
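The difference the two reviewers are discussing can be made concrete with a toy model of the interrupt-enable flag. This is a sketch under stated assumptions: a single boolean stands in for the CPU's interrupt flag, no actual locking is modelled, and all `toy_*` names are hypothetical. It only encodes the behaviour stated in the thread, namely that the plain `_irq` unlock enables interrupts unconditionally while the `irqsave`/`irqrestore` pair puts back whatever state was found on entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the CPU's interrupt-enable flag. */
static bool toy_irqs_enabled = true;

/* Plain _irq variants: disable on lock, enable unconditionally on unlock. */
static void toy_spin_lock_irq(void)
{
    toy_irqs_enabled = false;
}

static void toy_spin_unlock_irq(void)
{
    toy_irqs_enabled = true;
}

/* irqsave/irqrestore variants: remember and restore the caller's state. */
static void toy_spin_lock_irqsave(bool *flags)
{
    *flags = toy_irqs_enabled;
    toy_irqs_enabled = false;
}

static void toy_spin_unlock_irqrestore(bool flags)
{
    toy_irqs_enabled = flags;
}

/* Interrupt state after a lock/unlock pair using the plain _irq variants. */
static bool toy_irq_pair(bool enabled_on_entry)
{
    toy_irqs_enabled = enabled_on_entry;
    toy_spin_lock_irq();
    toy_spin_unlock_irq();
    return toy_irqs_enabled;
}

/* Interrupt state after a lock/unlock pair using irqsave/irqrestore. */
static bool toy_irqsave_pair(bool enabled_on_entry)
{
    bool flags;

    toy_irqs_enabled = enabled_on_entry;
    toy_spin_lock_irqsave(&flags);
    toy_spin_unlock_irqrestore(flags);
    return toy_irqs_enabled;
}
```

In this model a caller that enters with interrupts disabled leaves `toy_irq_pair()` with them enabled, whereas `toy_irqsave_pair()` preserves the disabled state. That is why the plain variants are only appropriate where interrupts are known to be enabled on entry.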
>>> On 08.05.19 at 15:10, <JBeulich@suse.com> wrote: > All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc > fields, and hence ought to be called with the descriptor lock held in > addition to vector_lock. This is currently the case for only > set_desc_affinity() (in the common case) and destroy_irq(), which also > clarifies what the nesting behavior between the locks has to be. > Reflect the new expectation by having these functions all take a > descriptor as parameter instead of an interrupt number. > > Also take care of the two special cases of calls to set_desc_affinity(): > set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called > directly as well, and in these cases the descriptor locks hadn't got > acquired till now. For set_ioapic_affinity_irq() this means acquiring / > releasing of the IO-APIC lock can be plain spin_{,un}lock() then. > > Drop one of the two leading underscores from all three functions at > the same time. > > There's one case left where descriptors get manipulated with just > vector_lock held: setup_vector_irq() assumes its caller to acquire > vector_lock, and hence can't itself acquire the descriptor locks (wrong > lock order). I don't currently see how to address this. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > v2: Also adjust set_ioapic_affinity_irq() and VT-d's > dma_msi_set_affinity(). I'm sorry, Kevin, I should have Cc-ed you on this one. 
Jan > --- a/xen/arch/x86/io_apic.c > +++ b/xen/arch/x86/io_apic.c > @@ -550,14 +550,14 @@ static void clear_IO_APIC (void) > static void > set_ioapic_affinity_irq(struct irq_desc *desc, const cpumask_t *mask) > { > - unsigned long flags; > unsigned int dest; > int pin, irq; > struct irq_pin_list *entry; > > irq = desc->irq; > > - spin_lock_irqsave(&ioapic_lock, flags); > + spin_lock(&ioapic_lock); > + > dest = set_desc_affinity(desc, mask); > if (dest != BAD_APICID) { > if ( !x2apic_enabled ) > @@ -580,8 +580,8 @@ set_ioapic_affinity_irq(struct irq_desc > entry = irq_2_pin + entry->next; > } > } > - spin_unlock_irqrestore(&ioapic_lock, flags); > > + spin_unlock(&ioapic_lock); > } > > /* > @@ -674,16 +674,19 @@ void /*__init*/ setup_ioapic_dest(void) > for (ioapic = 0; ioapic < nr_ioapics; ioapic++) { > for (pin = 0; pin < nr_ioapic_entries[ioapic]; pin++) { > struct irq_desc *desc; > + unsigned long flags; > > irq_entry = find_irq_entry(ioapic, pin, mp_INT); > if (irq_entry == -1) > continue; > irq = pin_2_irq(irq_entry, ioapic, pin); > desc = irq_to_desc(irq); > + > + spin_lock_irqsave(&desc->lock, flags); > BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, > &cpu_online_map)); > set_ioapic_affinity_irq(desc, desc->arch.cpu_mask); > + spin_unlock_irqrestore(&desc->lock, flags); > } > - > } > } > > --- a/xen/arch/x86/irq.c > +++ b/xen/arch/x86/irq.c > @@ -27,6 +27,7 @@ > #include <public/physdev.h> > > static int parse_irq_vector_map_param(const char *s); > +static void _clear_irq_vector(struct irq_desc *desc); > > /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. 
> */ > bool __read_mostly opt_noirqbalance; > @@ -120,13 +121,12 @@ static void trace_irq_mask(uint32_t even > trace_var(event, 1, sizeof(d), &d); > } > > -static int __init __bind_irq_vector(int irq, int vector, const cpumask_t > *cpu_mask) > +static int __init _bind_irq_vector(struct irq_desc *desc, int vector, > + const cpumask_t *cpu_mask) > { > cpumask_t online_mask; > int cpu; > - struct irq_desc *desc = irq_to_desc(irq); > > - BUG_ON((unsigned)irq >= nr_irqs); > BUG_ON((unsigned)vector >= NR_VECTORS); > > cpumask_and(&online_mask, cpu_mask, &cpu_online_map); > @@ -137,9 +137,9 @@ static int __init __bind_irq_vector(int > return 0; > if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED ) > return -EBUSY; > - trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, irq, vector, &online_mask); > + trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, &online_mask); > for_each_cpu(cpu, &online_mask) > - per_cpu(vector_irq, cpu)[vector] = irq; > + per_cpu(vector_irq, cpu)[vector] = desc->irq; > desc->arch.vector = vector; > cpumask_copy(desc->arch.cpu_mask, &online_mask); > if ( desc->arch.used_vectors ) > @@ -153,12 +153,18 @@ static int __init __bind_irq_vector(int > > int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask) > { > + struct irq_desc *desc = irq_to_desc(irq); > unsigned long flags; > int ret; > > - spin_lock_irqsave(&vector_lock, flags); > - ret = __bind_irq_vector(irq, vector, cpu_mask); > - spin_unlock_irqrestore(&vector_lock, flags); > + BUG_ON((unsigned)irq >= nr_irqs); > + > + spin_lock_irqsave(&desc->lock, flags); > + spin_lock(&vector_lock); > + ret = _bind_irq_vector(desc, vector, cpu_mask); > + spin_unlock(&vector_lock); > + spin_unlock_irqrestore(&desc->lock, flags); > + > return ret; > } > > @@ -243,7 +249,9 @@ void destroy_irq(unsigned int irq) > > spin_lock_irqsave(&desc->lock, flags); > desc->handler = &no_irq_type; > - clear_irq_vector(irq); > + spin_lock(&vector_lock); > + _clear_irq_vector(desc); > + spin_unlock(&vector_lock); > 
desc->arch.used_vectors = NULL; > spin_unlock_irqrestore(&desc->lock, flags); > > @@ -266,11 +274,11 @@ static void release_old_vec(struct irq_d > } > } > > -static void __clear_irq_vector(int irq) > +static void _clear_irq_vector(struct irq_desc *desc) > { > - int cpu, vector, old_vector; > + unsigned int cpu; > + int vector, old_vector, irq = desc->irq; > cpumask_t tmp_mask; > - struct irq_desc *desc = irq_to_desc(irq); > > BUG_ON(!desc->arch.vector); > > @@ -316,11 +324,14 @@ static void __clear_irq_vector(int irq) > > void clear_irq_vector(int irq) > { > + struct irq_desc *desc = irq_to_desc(irq); > unsigned long flags; > > - spin_lock_irqsave(&vector_lock, flags); > - __clear_irq_vector(irq); > - spin_unlock_irqrestore(&vector_lock, flags); > + spin_lock_irqsave(&desc->lock, flags); > + spin_lock(&vector_lock); > + _clear_irq_vector(desc); > + spin_unlock(&vector_lock); > + spin_unlock_irqrestore(&desc->lock, flags); > } > > int irq_to_vector(int irq) > @@ -455,8 +466,7 @@ static vmask_t *irq_get_used_vector_mask > return ret; > } > > -static int __assign_irq_vector( > - int irq, struct irq_desc *desc, const cpumask_t *mask) > +static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask) > { > /* > * NOTE! The local APIC isn't very good at handling > @@ -470,7 +480,8 @@ static int __assign_irq_vector( > * 0x80, because int 0x80 is hm, kind of importantish. 
;) > */ > static int current_vector = FIRST_DYNAMIC_VECTOR, current_offset = 0; > - int cpu, err, old_vector; > + unsigned int cpu; > + int err, old_vector, irq = desc->irq; > vmask_t *irq_used_vectors = NULL; > > old_vector = irq_to_vector(irq); > @@ -583,8 +594,12 @@ int assign_irq_vector(int irq, const cpu > > BUG_ON(irq >= nr_irqs || irq <0); > > - spin_lock_irqsave(&vector_lock, flags); > - ret = __assign_irq_vector(irq, desc, mask ?: TARGET_CPUS); > + spin_lock_irqsave(&desc->lock, flags); > + > + spin_lock(&vector_lock); > + ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS); > + spin_unlock(&vector_lock); > + > if ( !ret ) > { > ret = desc->arch.vector; > @@ -593,7 +608,8 @@ int assign_irq_vector(int irq, const cpu > else > cpumask_setall(desc->affinity); > } > - spin_unlock_irqrestore(&vector_lock, flags); > + > + spin_unlock_irqrestore(&desc->lock, flags); > > return ret; > } > @@ -767,7 +783,6 @@ void irq_complete_move(struct irq_desc * > > unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t > *mask) > { > - unsigned int irq; > int ret; > unsigned long flags; > cpumask_t dest_mask; > @@ -775,10 +790,8 @@ unsigned int set_desc_affinity(struct ir > if (!cpumask_intersects(mask, &cpu_online_map)) > return BAD_APICID; > > - irq = desc->irq; > - > spin_lock_irqsave(&vector_lock, flags); > - ret = __assign_irq_vector(irq, desc, mask); > + ret = _assign_irq_vector(desc, mask); > spin_unlock_irqrestore(&vector_lock, flags); > > if (ret < 0) > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a > unsigned int node = rhsa ? 
pxm_to_node(rhsa->proximity_domain) > : NUMA_NO_NODE; > const cpumask_t *cpumask = &cpu_online_map; > + struct irq_desc *desc; > > if ( node < MAX_NUMNODES && node_online(node) && > cpumask_intersects(&node_to_cpumask(node), cpumask) ) > cpumask = &node_to_cpumask(node); > - dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask); > + > + desc = irq_to_desc(drhd->iommu->msi.irq); > + spin_lock_irq(&desc->lock); > + dma_msi_set_affinity(desc, cpumask); > + spin_unlock_irq(&desc->lock); > } > > static int adjust_vtd_irq_affinities(void) >
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Wednesday, May 8, 2019 9:16 PM
>
> >>> On 08.05.19 at 15:10, <JBeulich@suse.com> wrote:
> > All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> > fields, and hence ought to be called with the descriptor lock held in
> > addition to vector_lock. This is currently the case for only
> > set_desc_affinity() (in the common case) and destroy_irq(), which also
> > clarifies what the nesting behavior between the locks has to be.
> > Reflect the new expectation by having these functions all take a
> > descriptor as parameter instead of an interrupt number.
> >
> > Also take care of the two special cases of calls to set_desc_affinity():
> > set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
> > directly as well, and in these cases the descriptor locks hadn't got
> > acquired till now. For set_ioapic_affinity_irq() this means acquiring /
> > releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
> >
> > Drop one of the two leading underscores from all three functions at
> > the same time.
> >
> > There's one case left where descriptors get manipulated with just
> > vector_lock held: setup_vector_irq() assumes its caller to acquire
> > vector_lock, and hence can't itself acquire the descriptor locks (wrong
> > lock order). I don't currently see how to address this.
> >
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > ---
> > v2: Also adjust set_ioapic_affinity_irq() and VT-d's
> > dma_msi_set_affinity().
>
> I'm sorry, Kevin, I should have Cc-ed you on this one.

Reviewed-by: Kevin Tian <kevin.tian@intel.com> for vtd part.
If any particular value was to be checked against, it would need to be
IRQ_VECTOR_UNASSIGNED.

Reported-by: Roger Pau Monné <roger.pau@citrix.com>

Be more strict though and use valid_irq_vector() instead.

Take the opportunity and also convert local variables to unsigned int.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -276,14 +276,13 @@ static void release_old_vec(struct irq_d

 static void _clear_irq_vector(struct irq_desc *desc)
 {
-    unsigned int cpu;
-    int vector, old_vector, irq = desc->irq;
+    unsigned int cpu, old_vector, irq = desc->irq;
+    unsigned int vector = desc->arch.vector;
     cpumask_t tmp_mask;

-    BUG_ON(!desc->arch.vector);
+    BUG_ON(!valid_irq_vector(vector));

     /* Always clear desc->arch.vector */
-    vector = desc->arch.vector;
     cpumask_and(&tmp_mask, desc->arch.cpu_mask, &cpu_online_map);

     for_each_cpu(cpu, &tmp_mask) {
On Wed, May 08, 2019 at 07:11:52AM -0600, Jan Beulich wrote:
> If any particular value was to be checked against, it would need to be
> IRQ_VECTOR_UNASSIGNED.
>
> Reported-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Be more strict though and use valid_irq_vector() instead.
>
> Take the opportunity and also convert local variables to unsigned int.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
Since the "Cannot set affinity ..." warning is a one time one, avoid
triggering it already at boot time when parking secondary threads and
the serial console uses a (still unconnected at that time) PCI IRQ.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2428,8 +2428,20 @@ void fixup_irqs(const cpumask_t *mask, b
         vector = irq_to_vector(irq);
         if ( vector >= FIRST_HIPRIORITY_VECTOR &&
              vector <= LAST_HIPRIORITY_VECTOR )
+        {
             cpumask_and(desc->arch.cpu_mask, desc->arch.cpu_mask, mask);
 
+            /*
+             * This can in particular happen when parking secondary threads
+             * during boot and when the serial console wants to use a PCI IRQ.
+             */
+            if ( desc->handler == &no_irq_type )
+            {
+                spin_unlock(&desc->lock);
+                continue;
+            }
+        }
+
         if ( desc->arch.move_cleanup_count )
         {
             /* The cleanup IPI may have got sent while we were still online. */
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -41,8 +41,8 @@ struct arch_irq_desc {
         cpumask_var_t cpu_mask;
         cpumask_var_t old_cpu_mask;
         cpumask_var_t pending_mask;
-        unsigned move_cleanup_count;
         vmask_t *used_vectors;
+        unsigned move_cleanup_count;
         u8 move_in_progress : 1;
         s8 used;
 };
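
The patch carries no description, so the following standalone sketch may help show why swapping the two members can shrink the structure. It is a simplified model and not the real struct arch_irq_desc: only the members visible in the hunk are included, cpumask_var_t and vmask_t * are assumed to be pointer-sized and modelled as void *, the struct names are hypothetical, and the uint8_t bit-field relies on a common compiler extension (accepted by GCC and Clang).

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the layout before the patch (hypothetical name). */
struct layout_before {
    void *cpu_mask;
    void *old_cpu_mask;
    void *pending_mask;
    unsigned move_cleanup_count;  /* followed by padding if pointers are wider */
    void *used_vectors;
    uint8_t move_in_progress : 1;
    int8_t used;                  /* followed by tail padding */
};

/* Simplified model of the layout after the patch (hypothetical name). */
struct layout_after {
    void *cpu_mask;
    void *old_cpu_mask;
    void *pending_mask;
    void *used_vectors;
    unsigned move_cleanup_count;  /* small members now sit next to each other */
    uint8_t move_in_progress : 1;
    int8_t used;
};

/* Returns how many bytes the reordered model is smaller by (0 if none). */
static inline size_t layout_bytes_saved(void)
{
    size_t before = sizeof(struct layout_before);
    size_t after = sizeof(struct layout_after);

    return before > after ? before - after : 0;
}
```

With 8-byte pointers and a 4-byte unsigned, the "before" model needs padding both after move_cleanup_count and at the tail, while the "after" model only needs tail padding. On ABIs where pointers and unsigned have the same size, no saving is to be expected.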
The subsequent cpumask_intersects() covers the "empty" case quite fine.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -650,9 +650,6 @@ void move_masked_irq(struct irq_desc *de
 
     desc->status &= ~IRQ_MOVE_PENDING;
 
-    if (unlikely(cpumask_empty(pending_mask)))
-        return;
-
     if (!desc->handler->set_affinity)
         return;
 
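
The reasoning in the description can be checked with a toy model: an empty mask has no bits in common with any other mask, so an intersection test already returns false for it. The sketch below is not Xen code. It models a CPU mask as a single 64-bit word, all toy_* and should_move_* names are hypothetical, and it assumes the later check in move_masked_irq() tests the pending mask against the set of online CPUs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for cpumask_t: one bit per CPU, at most 64 CPUs. */
typedef uint64_t toy_mask_t;

static inline bool toy_mask_empty(toy_mask_t m)
{
    return m == 0;
}

static inline bool toy_mask_intersects(toy_mask_t a, toy_mask_t b)
{
    return (a & b) != 0;
}

/* Old shape: explicit early exit on an empty mask, then the intersection test. */
static inline bool should_move_old(toy_mask_t pending, toy_mask_t online)
{
    if (toy_mask_empty(pending))
        return false;

    return toy_mask_intersects(pending, online);
}

/* New shape: the intersection test alone; an empty mask intersects nothing. */
static inline bool should_move_new(toy_mask_t pending, toy_mask_t online)
{
    return toy_mask_intersects(pending, online);
}
```

Both variants give the same answer for every input in this model, which is why the early return can be dropped without changing behavior.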
Its only caller already has the IRQ descriptor in its hands, so there's
no need for the function to re-obtain it. As a result the leading p of
its name is no longer appropriate and hence gets dropped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1550,17 +1550,8 @@ int pirq_guest_unmask(struct domain *d)
     return 0;
 }
 
-static int pirq_acktype(struct domain *d, int pirq)
+static int irq_acktype(const struct irq_desc *desc)
 {
-    struct irq_desc *desc;
-    int irq;
-
-    irq = domain_pirq_to_irq(d, pirq);
-    if ( irq <= 0 )
-        return ACKTYPE_NONE;
-
-    desc = irq_to_desc(irq);
-
     if ( desc->handler == &no_irq_type )
         return ACKTYPE_NONE;
 
@@ -1591,7 +1582,8 @@ static int pirq_acktype(struct domain *d
     if ( !strcmp(desc->handler->typename, "XT-PIC") )
         return ACKTYPE_UNMASK;
 
-    printk("Unknown PIC type '%s' for IRQ %d\n", desc->handler->typename, irq);
+    printk("Unknown PIC type '%s' for IRQ%d\n",
+           desc->handler->typename, desc->irq);
     BUG();
 
     return 0;
@@ -1668,7 +1660,7 @@ int pirq_guest_bind(struct vcpu *v, stru
     action->nr_guests = 0;
     action->in_flight = 0;
     action->shareable = will_share;
-    action->ack_type = pirq_acktype(v->domain, pirq->pirq);
+    action->ack_type = irq_acktype(desc);
     init_timer(&action->eoi_timer, irq_guest_eoi_timer_fn, desc, 0);
 
     desc->status |= IRQ_GUEST;
On Wed, May 08, 2019 at 07:14:06AM -0600, Jan Beulich wrote:
> Its only caller already has the IRQ descriptor in its hands, so there's
> no need for the function to re-obtain it. As a result the leading p of
> its name is no longer appropriate and hence gets dropped.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.