From nobody Mon Jun 8 08:28:16 2026 Received: from mail-yw1-f175.google.com (mail-yw1-f175.google.com [209.85.128.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DB51632B13C for ; Sat, 30 May 2026 20:26:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780172762; cv=none; b=G0D5BN+B1bBIOQ/x6udYoE8yJTyhVbEvNRDccsa5Y1pI8I4m6d6jTWjodbYhfsW2K7jMfAFNLvUXIAYHK8I3pDYFwlylfhWXs5Bi3pWm6QAK66fSfoCWyXKe9gIUmNWX7vUwG4WbdFsEkvc00SaDffUJrbscAe8KWyp58j3qouE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780172762; c=relaxed/simple; bh=NJMctWkCkvaKSvd5Uce3HvzWFWfIWYwWyFq+kwfh0kA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=IKT19RoE9UUM38UhTk4LR1gSAPcYjNSjFCEp8pRNZV/RjvRJFNSpdfFQSOQyUPmMjLJBWBI0TGWftbf55gTVDhUVmuWWiXTfCOda+/4fCTOdZJgwEHc+7p7YoWtJm2QNHiztTtCmTMJu+rotiazsBTA5Eqa8YRZE2SasBMaBc8A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=AAoQpJRO; arc=none smtp.client-ip=209.85.128.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="AAoQpJRO" Received: by mail-yw1-f175.google.com with SMTP id 00721157ae682-7e10f91425fso4451827b3.0 for ; Sat, 30 May 2026 13:26:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780172760; x=1780777560; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=xzxLFbOxRCDOeLozJMvg6yHAwBzCK2kggsGVQYYIzAk=; b=AAoQpJRO/VrIddgTVa1eD/U7wI7XtiWI+JGMEjpwWHb82hMoZ8qE7oraHuLAT21XLU 7iwosoB29SQRKy8EZ6YnQXXd3F7bz2fKn15ZZokDItMiPTmaQgDW7EPKnI3nMHHCxan9 e6otxA8qbtZOmiKuLbPWDZO8aPYGfQ8pMo1aLFzQBzvKMmeuWDdXiBqLiOxOVIruxGGR jVYE+j2q1Qm7ujjQo1YrJ5qzbLULYmh0u58tLv+MzXakcnkqSw4lZr+kzgZlR1dFovV8 7362LEiQS1VmTLElypvsGE3hdVMjA5xUrQK1WbVB/6gddMIbudIftcDKhAmjYAA+Y4WQ 6w4Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780172760; x=1780777560; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=xzxLFbOxRCDOeLozJMvg6yHAwBzCK2kggsGVQYYIzAk=; b=UuW4eEuDMM6zqOUEa91mjo3Sdz0M+9aam1fQq9KCgmmTz8ugNGC9YFbF1Amw7XrFQg FaUeB/FL92uqp8xl4wv1B/PoobL4ok7/KENGonFjQyf3rtxxoRBZe1iuyj1EitGkHXqO H2iXSH34sS2Doll0w5uTxCOhct38oCSjPIwjtGJAkTtGq+Edvwe68rpV5x8LcIyfgXlA OK5feIjyQU0lgWQLrvg1Hi+2SG+m2K5kmRwCDm0E++6JItq7AnAIiyLQ11+pdEhaeDHN RsjWbOFhoY0F04svhertVIBp7Sj2J21yeiOXXtHnwK2p7Is/nTkk9Paz3CvyrzkOUVfZ eN2A== X-Gm-Message-State: AOJu0Yy6Ym6OjFbdI2y7erzNUSsCKEM5R5ftG9vDGlcP2Pipqd7p1jCZ R8X6Z/mo8Gu12ph+cDz+sNYxiGHMy7GDnT6Cn2B9fDH7rFJdD0MYkfpWFVRrofUE X-Gm-Gg: Acq92OH4FTVQNg5OnjYth+4TOOgTQKnhLGZe/vPhTzGF07dnEg5rFT2C4CyJahko+QD PmIBqWjZeEm2Hz2o8ltuGB9OJK43cEznp3U46zsUWLnzJoswqSkKti1wk3jNzF8NhcM5zPd6qnh amFqwHPts8BcpjdplLsqNFSmz345USlSNNR62fkOumvSOAYKZazquuX9hfgyylCN25qHQbVHAKE yGnffoY2eFF6ikRAVd3nnhP9aeJnlyZ5dOehKULh6gv9e3NvomEKns8OreqvCdyjjdog1bCCY8r qE0+uaZ2vddAEy08lOllZU9O8nNQZeGVfqod8LiXsRVkgUVySGV8iwsQxSrzzDqvJEtj/DDYwQp 64qYWAEh7Xn1upr7PWxzlY5TkPrnbb/RLNYO54GSc/1QjAp05to/OpvoPdISI8mLWUMWpQdqOYD 3x+1L9OU4iJbmtNxDotIoyCUKs+vSFQjCaog== X-Received: by 2002:a05:690c:4711:b0:7dc:7dfa:7266 with SMTP id 00721157ae682-7de2759ce27mr46275927b3.34.1780172759901; Sat, 30 May 2026 13:25:59 -0700 (PDT) Received: from localhost ([2600:1702:7a90:6f9f:8bc4:8aec:108d:7a04]) by smtp.gmail.com with ESMTPSA id 00721157ae682-7e17ae7e240sm12675387b3.32.2026.05.30.13.25.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 30 May 2026 13:25:59 -0700 (PDT) From: Matt Turner To: linux-alpha@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Richard Henderson , Magnus Lindholm , Ivan Kokshaysky , Matt Turner Subject: [PATCH 1/3] alpha: smp: Serialize all synchronous IPI operations to fix SMP deadlock Date: Sat, 30 May 2026 16:25:42 -0400 Message-ID: <20260530202544.59231-2-mattst88@gmail.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260530202544.59231-1-mattst88@gmail.com> References: <20260530202544.59231-1-mattst88@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Two or more CPUs simultaneously calling any function that uses on_each_cpu(wait=3D1) or smp_call_function(wait=3D1) deadlock: each blocks in csd_lock_wait spinning while waiting for the remote CPU to signal CSD completion. While spinning, neither CPU can receive the other's IPI, so neither completion signal arrives =E2=80=94 permanent hang. Affected callers: smp_imb, flush_tlb_all, flush_tlb_mm, flush_tlb_page, flush_icache_user_page (smp.c) and migrate_flush_tlb_page (tlbflush.c). Introduce alpha_smp_ipi_lock (plain spinlock, defined in smp.c, declared in asm/smp.h) and apply it to all six callers. Rather than spin_lock(), use a trylock loop with alpha_drain_ipi(): if the lock is held, the loser actively drains any pending IPI bits on the local CPU before retrying. This is necessary because some callers hold IRQs disabled (e.g. paths that take spin_lock_irqsave), so no RTC interrupt will fire to rescue a lost wripir edge via alpha_poll_ipi_inirq(). alpha_drain_ipi() calls handle_ipi() under local_irq_save/restore, satisfying handle_ipi()'s requirement that IRQs be disabled, without touching lockdep hardirq-context state. This fix is necessary but not sufficient. A separate, independent deadlock path exists: if the target CPU is inside do_entInt at IPL=3D7 when wripir fires, the hardware IPI edge is lost and the sending CPU spins forever even when only one CPU is issuing a wait=3D1 call. That race is fixed independently by alpha_poll_ipi_inirq() (see follow-on commit). Both fixes are required for a complete solution. The deadlock has been observed on EV7/Marvel under workloads generating a high rate of synchronous TLB flush IPIs (e.g. the git test suite). Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Matt Turner --- arch/alpha/include/asm/smp.h | 9 ++++++ arch/alpha/kernel/smp.c | 62 ++++++++++++++++++++++++++++++++++++ arch/alpha/mm/tlbflush.c | 3 ++ 3 files changed, 74 insertions(+) diff --git ./arch/alpha/include/asm/smp.h ./arch/alpha/include/asm/smp.h index 2264ae72673b..8bd529376cf6 100644 --- ./arch/alpha/include/asm/smp.h +++ ./arch/alpha/include/asm/smp.h @@ -48,6 +48,15 @@ extern int smp_num_cpus; extern void arch_send_call_function_single_ipi(int cpu); extern void arch_send_call_function_ipi_mask(const struct cpumask *mask); =20 +/* + * Global spinlock serializing all synchronous (wait=3D1) IPI callers. + * Callers must use the trylock+alpha_drain_ipi() pattern, not spin_lock(), + * because some call sites hold IRQs disabled and cannot rely on the RTC + * interrupt to rescue a lost wripir edge. + */ +extern spinlock_t alpha_smp_ipi_lock; +extern void alpha_drain_ipi(void); + #else /* CONFIG_SMP */ =20 #define hard_smp_processor_id() 0 diff --git ./arch/alpha/kernel/smp.c ./arch/alpha/kernel/smp.c index ed06367ece57..d900da49b0d8 100644 --- ./arch/alpha/kernel/smp.c +++ ./arch/alpha/kernel/smp.c @@ -597,11 +597,61 @@ ipi_imb(void *ignored) imb(); } =20 +/* + * Serialize all synchronous (wait=3D1) IPI operations to prevent cross-CPU + * deadlock on EV7/Marvel. If two CPUs simultaneously call any function t= hat + * uses on_each_cpu(wait=3D1) or smp_call_function(wait=3D1), each blocks = in + * csd_lock_wait spinning for the remote CPU to signal completion. While + * spinning, neither CPU can receive the other's IPI, so neither completion + * signal arrives =E2=80=94 permanent hang. + * + * A plain spinlock (not irqsave) is intentional: the CPU that loses the l= ock + * race spins with IRQs enabled and can service the winner's IPI before + * taking the lock itself. + * + * All callers of synchronous IPIs =E2=80=94 including migrate_flush_tlb_p= age in + * tlbflush.c =E2=80=94 must hold this lock. + */ +DEFINE_SPINLOCK(alpha_smp_ipi_lock); + +/* + * Drain any pending IPIs for this CPU while spinning on alpha_smp_ipi_loc= k. + * + * The lock holder has already sent a wripir but is blocked in csd_lock_wa= it + * waiting for our IPI ACK. We cannot simply spin on the lock: if IRQs are + * disabled (e.g. caller holds a spin_lock_irqsave), no RTC interrupt will + * fire and the lost wripir edge is never rescued by alpha_poll_ipi_inirq. + * + * Call this from the trylock loop so the IPI is processed even with IRQs + * disabled, breaking the circular wait. + * + * handle_ipi() requires IRQs disabled: generic_smp_call_function_interrupt + * asserts lockdep_assert_irqs_disabled(). Use local_irq_save/restore so + * this is safe whether the caller has IRQs enabled (e.g. page fault path) + * or disabled (e.g. spin_lock_irqsave holder). Avoid __irq_enter_raw/ + * __irq_exit_raw: those manipulate lockdep hardirq-context state and trig= ger + * a lockdep WARNING when called while lockdep already tracks hardirq cont= ext. + */ +void alpha_drain_ipi(void) +{ + unsigned long flags; + + if (!READ_ONCE(ipi_data[smp_processor_id()].bits)) + return; + + local_irq_save(flags); + handle_ipi(NULL); /* regs unused in handle_ipi() */ + local_irq_restore(flags); +} + void smp_imb(void) { /* Must wait other processors to flush their icache before continue. */ + while (!spin_trylock(&alpha_smp_ipi_lock)) + alpha_drain_ipi(); on_each_cpu(ipi_imb, NULL, 1); + spin_unlock(&alpha_smp_ipi_lock); } EXPORT_SYMBOL(smp_imb); =20 @@ -616,7 +666,10 @@ flush_tlb_all(void) { /* Although we don't have any data to pass, we do want to synchronize with the other processors. */ + while (!spin_trylock(&alpha_smp_ipi_lock)) + alpha_drain_ipi(); on_each_cpu(ipi_flush_tlb_all, NULL, 1); + spin_unlock(&alpha_smp_ipi_lock); } =20 #define asn_locked() (cpu_data[smp_processor_id()].asn_lock) @@ -651,7 +704,10 @@ flush_tlb_mm(struct mm_struct *mm) } } =20 + while (!spin_trylock(&alpha_smp_ipi_lock)) + alpha_drain_ipi(); smp_call_function(ipi_flush_tlb_mm, mm, 1); + spin_unlock(&alpha_smp_ipi_lock); =20 preempt_enable(); } @@ -702,7 +758,10 @@ flush_tlb_page(struct vm_area_struct *vma, unsigned lo= ng addr) data.mm =3D mm; data.addr =3D addr; =20 + while (!spin_trylock(&alpha_smp_ipi_lock)) + alpha_drain_ipi(); smp_call_function(ipi_flush_tlb_page, &data, 1); + spin_unlock(&alpha_smp_ipi_lock); =20 preempt_enable(); } @@ -752,7 +811,10 @@ flush_icache_user_page(struct vm_area_struct *vma, str= uct page *page, } } =20 + while (!spin_trylock(&alpha_smp_ipi_lock)) + alpha_drain_ipi(); smp_call_function(ipi_flush_icache_page, mm, 1); + spin_unlock(&alpha_smp_ipi_lock); =20 preempt_enable(); } diff --git ./arch/alpha/mm/tlbflush.c ./arch/alpha/mm/tlbflush.c index ccbc317b9a34..37607d08796b 100644 --- ./arch/alpha/mm/tlbflush.c +++ ./arch/alpha/mm/tlbflush.c @@ -89,7 +89,10 @@ void migrate_flush_tlb_page(struct vm_area_struct *vma, = unsigned long addr) * This is the "combined" version of flush_tlb_mm + per-page invalidate. */ preempt_disable(); + while (!spin_trylock(&alpha_smp_ipi_lock)) + alpha_drain_ipi(); on_each_cpu(ipi_flush_mm_and_page, &d, 1); + spin_unlock(&alpha_smp_ipi_lock); =20 /* * mimic flush_tlb_mm()'s mm_users<=3D1 optimization. --=20 2.53.0 From nobody Mon Jun 8 08:28:16 2026 Received: from mail-yx1-f49.google.com (mail-yx1-f49.google.com [74.125.224.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 597CF339B3D for ; Sat, 30 May 2026 20:26:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.224.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780172764; cv=none; b=MYtMecKvaGcFG61MH3a+SYnuJwXRZFcQJCfhu3xaLrMXoLAs8X26LrAJd6tda/7qLBP+hfzmyQ/pY8JYuLUcP/6cSZnIr2VA7iMg6lDbGsxA5kgMlLmWFhSGXSkVyqPOTQoKTfdsbSysEt6urjjokpqzcMgmJr1xLFEN/nKXx3w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780172764; c=relaxed/simple; bh=lSq1XiVdFbegW3dLAoeQ8xE+ZRtn2eqXPy7IPZ9+odk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Iwk63UaCwDSmBlisihqLn/RQGmHzfZfBWqEaAyldJAX7bO2zA0oAIMrgIOGEutFwM+5O5cu30i5AHe/KCPfUFAWmseg5rgao6GbZY7AidqvPUFDjKTjqFSLey2lvSXhBLmcM4um9m2EMSbXijYdhAy9xpg2Egc/ykyLkExIP7bI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=BtMT3GpX; arc=none smtp.client-ip=74.125.224.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="BtMT3GpX" Received: by mail-yx1-f49.google.com with SMTP id 956f58d0204a3-66050c021f0so1771870d50.1 for ; Sat, 30 May 2026 13:26:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780172761; x=1780777561; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=lljDjPO0XM3n9bEJfoxc211+yO6B5dMSriEAJih/sic=; b=BtMT3GpXpJvN0avXcy3oxCM7sXooitTOfw0++EuBYLPQc4bsNiI+4LGyCaRSt8XGJd WPNqZcnuXy/VN/jKA+N8Llzl1rhqbpUR6fJjMVQ120QgUcva95oY3Xl/eL5EbCN/okTD NmQ4m4/72rF44nttQ0ujSnnbIIEdFS59PKOU6kuFd+eEfKBssotY1x192xrFu+muESHY zDLuF7uiqtY1cnHRMjr3onQWsw1LIzv0U0ULW1FvCQPxdONFMe3yuCHVvaGEy4hUniu3 YcyHq0ZN/R24QUqkXU0/95OEkE9eyk4oUa1qpuYxtqrNLszIXlL6qSlbROvqXLiYmo1G 3Tww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780172761; x=1780777561; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=lljDjPO0XM3n9bEJfoxc211+yO6B5dMSriEAJih/sic=; b=nJQCM1JoMr8ofuNvR6KBW53iP7L9i1g6KlY1mEAwKUXdvcDJVMQZgFttdEiiyQUsZV UYeytzbOlRpK6iBHtDXRddkHLNjMbJqkiYg+dGbhQvfTFWo9VR2XGc8f8Er8jUeqqmZ7 n7xPGSNVwmKkpftcWFk6jbGOJm54pqziM7++5Zg5zf5rH/YklkuruKPZ55yMqsSqS2qb jBtjfSeogasAlviPbgaMWA81NcQeiC5i+6RV5sEiHK/mJ98t/RnnlcO+burHtdhvreVF vhI8O8+qjA7gD/ofKSQT5fhv48jwCcW60//FXjpDPDg1bA6sl9p16g0VbThMlbxcs3gI AXsw== X-Gm-Message-State: AOJu0YwD6oLE1GbP/KMvimowpikvUholGUuCk6qvjzsQKixBrEi4QKcR zcWCrGGL5DLHsD45PMLkkBUZVzq71jHtbY9JqfDGeTzaJE32mFADNvSp X-Gm-Gg: Acq92OHfgFPRfUKb7/X93SeomiEUD8qHtJU57e9YZEiOi7cmIdHvznQSAtady0SKMgI rqSgKOhnbAjkkjcTykS1ofDQR25+OycRW/azaCMusc6bALVWCdJChFNXWByCZpPFzu7tmEBrmyz Yp+My1uWf4qI9iXrUt92Cb7R2y4CZpcXw5LDg+r4lHqFtGRL0bfbbmzudO3DZr6SxNBNDMaSRmW Kz9l8OMVGmbEhnys62bcs7HiMK9XVUJbVWtZzgW9hl+PkuXK/9Mum4EOmfc4fi/mmDIcH23oD3S XJbFCsTl8TAj36aFeerlrI46qRkkzTrvhfbZrUflo3R/5XLa58+s9qGm4mAXAB9txF2mnqWkDn9 5LJCsPi96LqyRpK0dDGqDOUJxuuJ49iT+ycAmG/aIiqv/StB+zDFtM0hdaPCVg9XuJgCDsUO5YT ZakMnXH1Ab/Kmhp+Lez85PGFhL+XSXa062Gg== X-Received: by 2002:a05:690e:4812:b0:64c:c616:c349 with SMTP id 956f58d0204a3-6605ef9948bmr2773016d50.31.1780172761303; Sat, 30 May 2026 13:26:01 -0700 (PDT) Received: from localhost ([2600:1702:7a90:6f9f:8bc4:8aec:108d:7a04]) by smtp.gmail.com with ESMTPSA id 956f58d0204a3-6606977ab05sm1178609d50.9.2026.05.30.13.26.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 30 May 2026 13:26:00 -0700 (PDT) From: Matt Turner To: linux-alpha@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Richard Henderson , Magnus Lindholm , Ivan Kokshaysky , Matt Turner Subject: [PATCH 2/3] alpha: Fix SMP IPI loss when target CPU is in interrupt handler Date: Sat, 30 May 2026 16:25:43 -0400 Message-ID: <20260530202544.59231-3-mattst88@gmail.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260530202544.59231-1-mattst88@gmail.com> References: <20260530202544.59231-1-mattst88@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" On EV7/IO7, the wripir PALcall delivers IPIs as edge-triggered hardware signals through the IO7 I/O controller. If the target CPU is already executing at IPL=3D7 inside do_entInt handling another interrupt, the IPI edge is lost: the hardware never re-delivers it when the CPU drops back to IPL=3D0. The software IPI bit in ipi_data[cpu].bits is set before wripir is called, so it remains set after the interrupt handler returns. But because no hardware edge fires, handle_ipi() is never invoked again, and the sending CPU spins forever in csd_lock_wait. This race is the root cause of a 15-year SMP deadlock on EV7/Marvel systems. It is reliably triggered by workloads that generate many synchronous IPIs (TLB flushes via on_each_cpu(wait=3D1)) while the target CPU receives concurrent I/O or RTC interrupts. Fix: add alpha_poll_ipi_inirq(), called from do_entInt within each interrupt handler's irq_enter/irq_exit bracket. It checks ipi_data[smp_processor_id()].bits and drains any pending IPIs that arrived while we were at IPL=3D7, before irq_exit() opens the softirq window where a TLB-flush softirq could itself deadlock on alpha_smp_ipi_lock. The check is a single READ_ONCE so there is no overhead when no IPI was missed. For the RTC interrupt (case 1 in do_entInt), handle_irq() already calls its own irq_enter()/irq_exit() internally. The outer irq_enter/irq_exit pair added here is intentional: it keeps irq_count > 0 while handle_irq() runs, so handle_irq()'s inner irq_exit() sees a non-zero count and skips the softirq window. The softirq window is deferred until the outer irq_exit(), which runs after alpha_poll_ipi_inirq() has already drained any pending IPIs. Without this outer bracket, irq_exit() inside handle_irq() could open the softirq window before any missed IPIs are rescued, risking a deadlock on alpha_smp_ipi_lock. Approximately 98% of rescued IPIs are IPI_CALL_FUNC (the TLB-flush type), confirming that IO7 genuinely drops the hardware edge rather than holding it pending until IPL falls. A lost IPI_CALL_FUNC only deadlocks when the sender is blocking (wait=3D1). wait=3D0 callers do not hang, but silently skip the function on the remote CPU, which may be a correctness issue in its own right. This fix is complementary to the alpha_smp_ipi_lock serialization (previous commit). Both are required: - Serialization prevents two CPUs simultaneously issuing wait=3D1 IPIs from deadlocking each other in csd_lock_wait. - This fix prevents a single wait=3D1 caller from deadlocking due to an IPI edge lost to an IPL=3D7 window on the remote CPU. Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Matt Turner --- arch/alpha/kernel/irq_alpha.c | 29 ++++++++++++++++++++++++++++- arch/alpha/kernel/proto.h | 1 + arch/alpha/kernel/smp.c | 35 +++++++++++++++++++++++++++++++++++ 3 files changed, 64 insertions(+), 1 deletion(-) diff --git ./arch/alpha/kernel/irq_alpha.c ./arch/alpha/kernel/irq_alpha.c index ac941172ae66..0e4234ef7ea0 100644 --- ./arch/alpha/kernel/irq_alpha.c +++ ./arch/alpha/kernel/irq_alpha.c @@ -69,22 +69,49 @@ do_entInt(unsigned long type, unsigned long vector, break; #endif case 1: - /* handle_irq() already does irq_enter()/irq_exit() */ + /* + * Wrap handle_irq() in our own irq_enter/irq_exit so that the + * inner irq_exit() inside handle_irq() does not run softirqs + * (irq_count remains > 0). We poll for lost IPIs before the + * outer irq_exit(), which is where softirqs may run. This + * prevents a TLB flush softirq from deadlocking on + * alpha_smp_ipi_lock while the sending CPU waits for our ACK. + */ + irq_enter(); handle_irq(RTC_IRQ); +#ifdef CONFIG_SMP + alpha_poll_ipi_inirq(regs); +#endif + irq_exit(); break; case 2: irq_enter(); alpha_mv.machine_check(vector, la_ptr); +#ifdef CONFIG_SMP + alpha_poll_ipi_inirq(regs); +#endif irq_exit(); break; case 3: irq_enter(); alpha_mv.device_interrupt(vector); +#ifdef CONFIG_SMP + /* + * Drain any IPIs whose edge was lost while we were at IPL=3D7. + * Must be called before irq_exit() to prevent softirqs (e.g. + * a TLB flush) from deadlocking on alpha_smp_ipi_lock while + * the sending CPU spins in csd_lock_wait. + */ + alpha_poll_ipi_inirq(regs); +#endif irq_exit(); break; case 4: irq_enter(); perf_irq(la_ptr, regs); +#ifdef CONFIG_SMP + alpha_poll_ipi_inirq(regs); +#endif irq_exit(); break; default: diff --git ./arch/alpha/kernel/proto.h ./arch/alpha/kernel/proto.h index f138bd494628..04879e0b2932 100644 --- ./arch/alpha/kernel/proto.h +++ ./arch/alpha/kernel/proto.h @@ -120,6 +120,7 @@ extern void unregister_srm_console(void); /* smp.c */ extern void setup_smp(void); extern void handle_ipi(struct pt_regs *); +extern void alpha_poll_ipi_inirq(struct pt_regs *); extern void __init smp_callin(void); =20 /* bios32.c */ diff --git ./arch/alpha/kernel/smp.c ./arch/alpha/kernel/smp.c index d900da49b0d8..099e1ac6a0d6 100644 --- ./arch/alpha/kernel/smp.c +++ ./arch/alpha/kernel/smp.c @@ -557,6 +557,41 @@ handle_ipi(struct pt_regs *regs) recv_secondary_console_msg(); } =20 +/* + * On EV7/IO7, IPI signals are edge-triggered. If an IPI arrives while this + * CPU is executing at IPL=3D7 (inside another interrupt handler), the har= dware + * edge is lost. The software bit in ipi_data[] remains set but handle_ipi= () + * is never re-invoked, causing the sending CPU to spin forever in csd_loc= k_wait. + * + * Call this from within hardirq context (between irq_enter and irq_exit) = to + * drain any IPIs that arrived while we were running at IPL=3D7, before ir= q_exit() + * opens the softirq window where a TLB flush could deadlock on alpha_smp_= ipi_lock. + */ +void alpha_poll_ipi_inirq(struct pt_regs *regs) +{ + int cpu =3D smp_processor_id(); + unsigned long bits =3D READ_ONCE(ipi_data[cpu].bits); + + if (!bits) + return; + + /* + * Peek at type bits before handle_ipi() clears them via xchg(). + * Bits arriving after this READ_ONCE are drained but not counted; + * the counters are approximate but sufficient for diagnosis. + * Note: handle_ipi() also increments ipi_count, so the "IPI:" row + * in /proc/interrupts includes both normal and rescued deliveries. + */ + if (bits & (1UL << IPI_RESCHEDULE)) + cpu_data[cpu].rescued_reschedule_count++; + if (bits & (1UL << IPI_CALL_FUNC)) + cpu_data[cpu].rescued_call_func_count++; + if (bits & (1UL << IPI_CPU_STOP)) + cpu_data[cpu].rescued_cpu_stop_count++; + + handle_ipi(regs); +} + void arch_smp_send_reschedule(int cpu) { --=20 2.53.0 From nobody Mon Jun 8 08:28:16 2026 Received: from mail-yx1-f44.google.com (mail-yx1-f44.google.com [74.125.224.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C494033508E for ; Sat, 30 May 2026 20:26:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.224.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780172768; cv=none; b=nRCt0CFh0j51Y5EpHd+gfOQT+LCq1BiTlTFrdz73DYxAn7kwy+V7pQYvHJ4YTMIOI3lBVGN8D8Ne/nkQd/RS2nktQQW7delrNPLkPEng8fog/wBMCxhGUucXpqA6BddrBdqbZPU98YLFy9uHUzND6DE3ZhS1+gBgxeAvHKhtTnE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780172768; c=relaxed/simple; bh=o+6b37jGJdtML4VSCNe28AJfx9gXkiAKzQVnvryAsrc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=VsUrvhzKFADO8PKbv6zLNjUShfCfvalnLANMZbl5VOKl48NaIDn+9h8d0a/k2AGEhcvGdG4wAlq2g97A8A4qR3cJvPa26WR0vrvRx7ztlsCJDUayyVDVK20FxIfnDdlwCIqflb9T8Be4fFTH/Jku/elVvmOQXw26q+/0X49lH6o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=AvaqfiBx; arc=none smtp.client-ip=74.125.224.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="AvaqfiBx" Received: by mail-yx1-f44.google.com with SMTP id 956f58d0204a3-66050c021f0so1771882d50.1 for ; Sat, 30 May 2026 13:26:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780172764; x=1780777564; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=N5bfay+UMC0FV3+SdGrQmnKGsJ3WhsW5MN1JRBSHUmc=; b=AvaqfiBxt0eEeX3Gd00bYgHkUStmaLZ2TsilXE/fVG6mPtC540jhzFZrcKd4VpcSHA BS2cso5rNSUXCC5lgI0bs2HtKMxs1qpt2tl2w71q6P7KpxIsh65iDE70cZ0czmJYMbwr D1Bvb6lGAk6TCvlJP07KiR6Y2nf0c1YNOhYf133ICw5SSpoN21NlhUsPJJq3JISKxiQQ 457UCRAz/9xaCgEYCKlUnZl5SNKskzAsxIvy+ugJfdeSgEjL9veofNzUaD4cz6OImEa3 1G98O0vL4qwltrOIPagIilUPfpbIqvVv6iW7z8TYL4a+zQ0SepjApJE9XaiKVNdhVqJg 0Z+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780172764; x=1780777564; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=N5bfay+UMC0FV3+SdGrQmnKGsJ3WhsW5MN1JRBSHUmc=; b=tGuN+NcCWd01VBhy4EnnLJiw/d9P9Ir6CTieQeOTrLjW6TOL1gZCq0Ym+PssYC7oZ3 MnOQj2Sf5gpNDu95NHTjRMQmIktyF7Z6HRx/CQg2Ncw/hCKhwneT6TA62XLnAFMgguaO WcpQKcmk0n+Iwu0/MAvesC5H81aU8t8luDWn49euJoXb5zvb3u7kwvmhEaHu9ula3M6l fPpoOe0lE4OaI2L9U9nciLaScJOG7TPeN00rk8CAtvhEbZcJ/3RQgKoAJLTttJDnMiyV KpsJF6HS9+H0BG3/lQbNZ16tnNHkCmul095yLCaUP98B4TeIcmDAS7ESBbfAWJxB0gOt SgzQ== X-Gm-Message-State: AOJu0YxcXns3D7CAmZlwgrg4lAsyWoAurs2h3gfjsEZ46bqXz/lt1mbU HRA7gnq+cOp579oLgT3qFX9AOKfN4OR61+vZxfgxVLvUnVpZCptXdEyk X-Gm-Gg: Acq92OHLiJKnAYKw4kh3J0ieyHrvic8sSunHixWRao3Ij6d9QimpRXkDcrfOM2uo+Fy h4drfV4R77RV2ctv0OpbJm182mkR70lU15knAwSmFEwwG/eS1SwuzUpcfYfYjU/rrfdOv0bOPWx LtXMXbjDRTn8ayQgwcbQ/7O7C7D0RtWAfgeh3jxaGPiNkt3D20X2hasxA+ajidk3bYKgokZiIRc 1wWxulbh9IigzjLlc+mpTbcLfjU3kwO6STGLIhGPESPMERLYPeVdOa7GFJl4KGF76/tkujtLEzF 60omY9y/xCIyIddwhFNN5Vr8ZfITSyrbIUhvo5FaseoI8mbpUsZuaXzT+j6VXFjOE5ifXHup7oB P9kIxXNO+xeRmn6d8wVCgyzBgoTXhaLlCgE46t/ZFo77fLmpU91YiO7+cfEThHo4g8dw+j/1RiF jTVVo2+bJZwnWjNHCMB2FE4DUrj7MbPIbWBay/elsNZAdb X-Received: by 2002:a53:e013:0:b0:660:5c10:df05 with SMTP id 956f58d0204a3-6605efdbd3bmr3390808d50.42.1780172763803; Sat, 30 May 2026 13:26:03 -0700 (PDT) Received: from localhost ([2600:1702:7a90:6f9f:8bc4:8aec:108d:7a04]) by smtp.gmail.com with ESMTPSA id 956f58d0204a3-66069abb92csm1148658d50.18.2026.05.30.13.26.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 30 May 2026 13:26:02 -0700 (PDT) From: Matt Turner To: linux-alpha@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Richard Henderson , Magnus Lindholm , Ivan Kokshaysky , Matt Turner Subject: [PATCH 3/3] alpha: Break down rescued IPI counter by type in /proc/interrupts Date: Sat, 30 May 2026 16:25:44 -0400 Message-ID: <20260530202544.59231-4-mattst88@gmail.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260530202544.59231-1-mattst88@gmail.com> References: <20260530202544.59231-1-mattst88@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Add per-type rescued IPI counters to cpuinfo_alpha: rescued_reschedule_count (RIP:) rescued_call_func_count (RIF:) rescued_cpu_stop_count (RIS:) alpha_poll_ipi_inirq() peeks at ipi_data[cpu].bits before handle_ipi() clears them via xchg(), then increments the appropriate per-CPU counter. Expose all three as separate rows in /proc/interrupts alongside the existing "IPI:" row. This lets us distinguish the deadlock-causing subset (RIF: IPI_CALL_FUNC, of which wait=3D1 callers are the ones that deadlock) from the harmless majority (RIP: reschedule). A non-zero RIF count confirms the EV7/IO7 edge-triggered IPI loss hypothesis. Note: handle_ipi() also increments ipi_count unconditionally, so the "IPI:" row in /proc/interrupts includes both normal and rescued deliveries. The RIP:/RIF:/RIS: counters sample bits before the xchg(); bits that arrive after the READ_ONCE are drained by handle_ipi() but not reflected in these counters =E2=80=94 they are approximate but sufficient for diagnos= is. Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Matt Turner --- arch/alpha/include/asm/smp.h | 3 +++ arch/alpha/kernel/irq.c | 12 ++++++++++++ 2 files changed, 15 insertions(+) diff --git ./arch/alpha/include/asm/smp.h ./arch/alpha/include/asm/smp.h index 8bd529376cf6..98f522ee367f 100644 --- ./arch/alpha/include/asm/smp.h +++ ./arch/alpha/include/asm/smp.h @@ -31,6 +31,9 @@ struct cpuinfo_alpha { int need_new_asn; int asn_lock; unsigned long ipi_count; + unsigned long rescued_reschedule_count; + unsigned long rescued_call_func_count; + unsigned long rescued_cpu_stop_count; unsigned long prof_multiplier; unsigned long prof_counter; unsigned char mcheck_expected; diff --git ./arch/alpha/kernel/irq.c ./arch/alpha/kernel/irq.c index c67047c5d830..34709e1c42c5 100644 --- ./arch/alpha/kernel/irq.c +++ ./arch/alpha/kernel/irq.c @@ -76,6 +76,18 @@ int arch_show_interrupts(struct seq_file *p, int prec) for_each_online_cpu(j) seq_printf(p, "%10lu ", cpu_data[j].ipi_count); seq_putc(p, '\n'); + seq_puts(p, "RIP: "); + for_each_online_cpu(j) + seq_printf(p, "%10lu ", cpu_data[j].rescued_reschedule_count); + seq_puts(p, " Rescued IPIs: reschedule\n"); + seq_puts(p, "RIF: "); + for_each_online_cpu(j) + seq_printf(p, "%10lu ", cpu_data[j].rescued_call_func_count); + seq_puts(p, " Rescued IPIs: call function\n"); + seq_puts(p, "RIS: "); + for_each_online_cpu(j) + seq_printf(p, "%10lu ", cpu_data[j].rescued_cpu_stop_count); + seq_puts(p, " Rescued IPIs: cpu stop\n"); #endif seq_puts(p, "PMI: "); for_each_online_cpu(j) --=20 2.53.0