[RFC PATCH v4 2/4] genirq/cpuhotplug: Dynamically isolate CPUs from managed interrupts

Costa Shulyupin posted 4 patches 1 year, 2 months ago
Posted by Costa Shulyupin 1 year, 2 months ago
When housekeeping_cpumask(HK_TYPE_MANAGED_IRQ) is changed at runtime,
managed interrupts continue to run on the isolated CPUs.

Dynamic CPU isolation is a complex task. One approach is:
1. Set the affected CPUs offline and disable the relevant interrupts
2. Change housekeeping_cpumask
3. Set the affected CPUs online and enable the relevant interrupts

irq_restore_affinity_of_irq() restores managed interrupts during the
complex CPU hotplug process of bringing a CPU back online.

Leave disabled those interrupts whose affinity doesn't intersect
with the new housekeeping_cpumask, thereby ensuring isolation
of the CPU from managed interrupts.

Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
---
 kernel/irq/cpuhotplug.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
index ec2cdcd20bee..839d3e879c0d 100644
--- a/kernel/irq/cpuhotplug.c
+++ b/kernel/irq/cpuhotplug.c
@@ -218,6 +218,9 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu)
 	if (desc->istate & IRQS_SUSPENDED)
 		return;
 
+	if (!cpumask_intersects(affinity, housekeeping_cpumask(HK_TYPE_MANAGED_IRQ)))
+		return;
+
 	if (irqd_is_managed_and_shutdown(data))
 		irq_startup(desc, IRQ_RESEND, IRQ_START_COND);
 
-- 
2.47.0
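
For readers following the diff, the added check can be modelled in user space. This is only a sketch and not kernel code: the 64-bit mask type and the helper names (cpumask_model_t, cpu_bit, masks_intersect, would_restart_irq) are hypothetical stand-ins, and the other early returns in irq_restore_affinity_of_irq() are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct cpumask: bit N represents CPU N. */
typedef uint64_t cpumask_model_t;

/* Build a mask containing a single CPU. */
static cpumask_model_t cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (cpumask_model_t)1 << cpu;
}

/* Models cpumask_intersects(): true if the masks share at least one CPU. */
static bool masks_intersect(cpumask_model_t a, cpumask_model_t b)
{
	return (a & b) != 0;
}

/*
 * Models only the condition added by the patch: with the patch applied,
 * an interrupt whose affinity has no CPU in the housekeeping mask is
 * not started again when one of its CPUs comes back online.
 */
static bool would_restart_irq(cpumask_model_t affinity,
			      cpumask_model_t housekeeping)
{
	return masks_intersect(affinity, housekeeping);
}
```

In this model, a queue interrupt with affinity {2,3} and a housekeeping mask of {0,1} is never restarted, even though the queue itself is brought back for those CPUs.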
Re: [RFC PATCH v4 2/4] genirq/cpuhotplug: Dynamically isolate CPUs from managed interrupts
Posted by Thomas Gleixner 1 year, 2 months ago
On Sun, Dec 01 2024 at 14:42, Costa Shulyupin wrote:
> When housekeeping_cpumask(HK_TYPE_MANAGED_IRQ) is changed at runtime,
> managed interrupts continue to run on the isolated CPUs.
>
> Dynamic CPU isolation is a complex task. One approach is:
> 1. Set the affected CPUs offline and disable the relevant interrupts
> 2. Change housekeeping_cpumask
> 3. Set the affected CPUs online and enable the relevant interrupts
>
> irq_restore_affinity_of_irq() restores managed interrupts during the
> complex CPU hotplug process of bringing a CPU back online.
>
> Leave disabled those interrupts whose affinity doesn't intersect
> with the new housekeeping_cpumask, thereby ensuring isolation
> of the CPU from managed interrupts.

And thereby breaking drivers, which will restore the per cpu queue and
expect interrupts to work.

The semantics of HK_TYPE_MANAGED_IRQ are clearly not what you try to
make them. See the description of the "managed_irq" command line
parameter:

        Isolate from being targeted by managed interrupts
        which have an interrupt mask containing isolated
        CPUs. The affinity of managed interrupts is
        handled by the kernel and cannot be changed via
        the /proc/irq/* interfaces.

        This isolation is best effort and only effective
        if the automatically assigned interrupt mask of a
        device queue contains isolated and housekeeping
        CPUs. If housekeeping CPUs are online then such
        interrupts are directed to the housekeeping CPU
        so that IO submitted on the housekeeping CPU
        cannot disturb the isolated CPU.

        If a queue's affinity mask contains only isolated
        CPUs then this parameter has no effect on the
        interrupt routing decision, though interrupts are
        only delivered when tasks running on those
        isolated CPUs submit IO. IO submitted on
        housekeeping CPUs has no influence on those
        queues.

It's pretty clear, no?

Thanks,

        tglx
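
The best-effort routing described in the quoted "managed_irq" text can be illustrated with a small user-space model. This is a sketch built from that description, not the kernel implementation: the mask type and the function name effective_irq_mask are hypothetical, and masks are plain 64-bit values with bit N standing for CPU N.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct cpumask: bit N represents CPU N. */
typedef uint64_t cpumask_model_t;

/*
 * Pick the CPUs a managed interrupt may be directed to:
 * - if the queue's affinity contains online housekeeping CPUs,
 *   restrict the target to the housekeeping CPUs in the affinity;
 * - otherwise (only isolated CPUs, or housekeeping CPUs offline),
 *   the housekeeping mask has no effect and the affinity is used as is.
 */
static cpumask_model_t effective_irq_mask(cpumask_model_t affinity,
					  cpumask_model_t housekeeping,
					  cpumask_model_t online)
{
	cpumask_model_t hk_part = affinity & housekeeping;

	if ((hk_part & online) == 0)
		return affinity;
	return hk_part;
}
```

With affinity {0-3}, housekeeping {0,1} and all CPUs online, the model yields {0,1}. With affinity {2,3} it yields {2,3}: the interrupt stays enabled on the isolated CPUs, and the patch would change that case.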