From nobody Thu Apr 2 20:00:40 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id AB135C32771
	for ; Wed, 21 Sep 2022 03:35:10 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S231410AbiIUDfI (ORCPT ); Tue, 20 Sep 2022 23:35:08 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43628 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S231730AbiIUDd7 (ORCPT ); Tue, 20 Sep 2022 23:33:59 -0400
Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BFDE780F7C;
	Tue, 20 Sep 2022 20:31:59 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id 6B05DB818CA;
	Wed, 21 Sep 2022 03:31:58 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8C2B5C433B5;
	Wed, 21 Sep 2022 03:31:53 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1663731117;
	bh=xstK3zkOHtc6QOgvpFphVjtbdDCoP5sViwZUf0BdeYk=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=GCIcStlo/ud0blWsY+b5muH6E9HgVU6Q/HPMLkLjstBxUGGP7AT5zICnTh76hNBHz
	 WwGZAOLN+Ngb0fqg1Q//ZFaEeZGfC4qKE8sCVK/TlGvFpnsekN7tbJA3gCjmDiWwkS
	 MOtO1uVLOrgoJ9quvaS5neP8VtVZk0vQtLNpFH/MWBzGJ7v3MmHvAH2n1ZbFJ4821Y
	 XbZbYlKb68m1VsC9EQidtMUohg/j9KkGTn40jEz5/k+g5sG5gW/9PT1fuFnzA8Pw7P
	 c7QisY5HWUvowUOJgjSYNxb6urZHaWGi1uDA3q/NPtJzUjiBYmZKfMtw93qfYFzH/q
	 D/DYk21slivpw==
From: guoren@kernel.org
To: xianting.tian@linux.alibaba.com, palmer@dabbelt.com,
	palmer@rivosinc.com, heiko@sntech.de, liaochang1@huawei.com,
	jszhang@kernel.org, arnd@arndb.de
Cc: linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
	x86@kernel.org, linux-arm-kernel@lists.infradead.org,
	Guo Ren, Guo Ren, Nick Kossifidis
Subject: [PATCH V4 1/3] riscv: kexec: Fixup irq controller broken in kexec crash path
Date: Tue, 20 Sep 2022 23:31:32 -0400
Message-Id: <20220921033134.3133319-2-guoren@kernel.org>
X-Mailer: git-send-email 2.36.1
In-Reply-To: <20220921033134.3133319-1-guoren@kernel.org>
References: <20220921033134.3133319-1-guoren@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Guo Ren

If a crash happens on cpu3 while all interrupts are bound to cpu0, the
stale irq routing leaves the crash kernel unable to receive any irq,
because the crash kernel does not clean up every hart's PLIC enable
bits in the enable registers.

This patch is similar to 9141a003a491 ("ARM: 7316/1: kexec: EOI active
and mask all interrupts in kexec crash path") and 78fd584cdec0 ("arm64:
kdump: implement machine_crash_shutdown()"), and PowerPC also has the
same mechanism.
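As an illustration only (not part of this patch), the approach named in the ARM commit title above, EOI any in-progress interrupt and then mask and disable every interrupt, can be sketched in plain userspace C. Everything named fake_* is a hypothetical stand-in for the kernel's irq descriptors and chip callbacks, and the sketch assumes that clearing the active state always fails so that the EOI fallback is exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one irq descriptor; not a kernel type. */
struct fake_irq {
	bool has_chip;		/* false models a descriptor without an irq chip */
	bool in_progress;	/* a handler was running when the crash hit */
	bool masked;
	bool disabled;
	int eoi_count;		/* how many times the EOI fallback ran */
};

/*
 * Assumption for this sketch: the chip cannot clear the active state,
 * so the call always reports failure (non-zero).
 */
static int fake_clear_active(struct fake_irq *irq)
{
	(void)irq;
	return -1;
}

/* Walk every descriptor: EOI if needed, then mask, then disable. */
static void fake_mask_interrupts(struct fake_irq *irqs, size_t n)
{
	assert(irqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct fake_irq *irq = &irqs[i];

		if (!irq->has_chip)
			continue;

		/* Fall back to EOI only when clearing the active state failed. */
		if (fake_clear_active(irq) && irq->in_progress)
			irq->eoi_count++;

		irq->masked = true;

		if (!irq->disabled)
			irq->disabled = true;
	}
}
```

Calling fake_mask_interrupts() on an array in which only one entry has in_progress set leaves every entry that has a chip masked and disabled, with a single EOI recorded for the in-progress entry; entries without a chip are skipped.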

Fixes: fba8a8674f68 ("RISC-V: Add kexec support")
Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
Reviewed-by: Xianting Tian
Cc: Nick Kossifidis
Cc: Palmer Dabbelt
---
 arch/riscv/kernel/machine_kexec.c | 35 +++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
index ee79e6839b86..db41c676e5a2 100644
--- a/arch/riscv/kernel/machine_kexec.c
+++ b/arch/riscv/kernel/machine_kexec.c
@@ -15,6 +15,8 @@
 #include /* For unreachable() */
 #include /* For cpu_down() */
 #include
+#include
+#include
 
 /*
  * kexec_image_info - Print received image details
@@ -154,6 +156,37 @@ void crash_smp_send_stop(void)
 	cpus_stopped = 1;
 }
 
+static void machine_kexec_mask_interrupts(void)
+{
+	unsigned int i;
+	struct irq_desc *desc;
+
+	for_each_irq_desc(i, desc) {
+		struct irq_chip *chip;
+		int ret;
+
+		chip = irq_desc_get_chip(desc);
+		if (!chip)
+			continue;
+
+		/*
+		 * First try to remove the active state. If this
+		 * fails, try to EOI the interrupt.
+		 */
+		ret = irq_set_irqchip_state(i, IRQCHIP_STATE_ACTIVE, false);
+
+		if (ret && irqd_irq_inprogress(&desc->irq_data) &&
+		    chip->irq_eoi)
+			chip->irq_eoi(&desc->irq_data);
+
+		if (chip->irq_mask)
+			chip->irq_mask(&desc->irq_data);
+
+		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
+			chip->irq_disable(&desc->irq_data);
+	}
+}
+
 /*
  * machine_crash_shutdown - Prepare to kexec after a kernel crash
  *
@@ -169,6 +202,8 @@ machine_crash_shutdown(struct pt_regs *regs)
 	crash_smp_send_stop();
 
 	crash_save_cpu(regs, smp_processor_id());
+	machine_kexec_mask_interrupts();
+
 	pr_info("Starting crashdump kernel...\n");
 }
 
-- 
2.36.1

From nobody Thu Apr 2 20:00:40 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 3E70FC32771
	for ; Wed, 21 Sep 2022 03:35:22 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S230500AbiIUDfS (ORCPT ); Tue, 20 Sep 2022 23:35:18 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43634 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S231742AbiIUDeC (ORCPT ); Tue, 20 Sep 2022 23:34:02 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5B041816AD;
	Tue, 20 Sep 2022 20:32:02 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id C36B662F29;
	Wed, 21 Sep 2022 03:32:01 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id A5B73C4347C;
	Wed, 21 Sep 2022 03:31:57 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1663731121;
	bh=PqtTeTuX/9cuTV4f8ppEtuFHBczU7wArbGTDlrJgeWU=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=nldOaCBpVw+C+u8TmKEqYZhhSaCX2zQBHYnsztEFpbDtw4ZpzaAQyFaXislCWDIcI
	 3xWCJ70+m/yKVADuEPGTE/kyzj1c8IPnawBARucJCBzW9KgSA4Ftqw+hvAn71lsw0p
	 5dRRwAKx207iFs3INCyKoTGCgqWxMtqOJxoPPjJlmuPxJL2wdYxZhEp/JHp3CK970B
	 QM+EXvLmlwy9dSUWyKc7X5pKJ1PtmhbvlHMKbCkIIetUFT4MEzWGCWchzCMwqVEa6L
	 hKbZEtTKGfl/qcDMcSczaLH614kCQJlQuz7UuKqa1k5HaSvoDDrOAYNvmlnpY3+JBJ
	 tkG9h6e9iAG/g==
From: guoren@kernel.org
To: xianting.tian@linux.alibaba.com, palmer@dabbelt.com,
	palmer@rivosinc.com, heiko@sntech.de, liaochang1@huawei.com,
	jszhang@kernel.org, arnd@arndb.de
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-riscv@lists.infradead.org, x86@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Guo Ren, Guo Ren, Nick Kossifidis
Subject: [PATCH V4 2/3] riscv: kexec: Fixup crash_smp_send_stop without multi cores
Date: Tue, 20 Sep 2022 23:31:33 -0400
Message-Id: <20220921033134.3133319-3-guoren@kernel.org>
X-Mailer: git-send-email 2.36.1
In-Reply-To: <20220921033134.3133319-1-guoren@kernel.org>
References: <20220921033134.3133319-1-guoren@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Guo Ren

The current crash_smp_send_stop() is the same as the generic one in
kernel/panic.c and misses the per-CPU crash_save_cpu() call. This patch
is inspired by 78fd584cdec0 ("arm64: kdump: implement
machine_crash_shutdown()") and adds the same mechanism for riscv.
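As an illustration only (not part of this patch), the handshake this patch adds, a counter of CPUs that have not yet acknowledged the crash-stop request, polled by the crashing CPU with a bounded timeout, can be sketched in userspace C11. The fake_* names and the one-acknowledgement-per-poll model are assumptions made for this sketch; the kernel uses IPIs and udelay() rather than this loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Number of other CPUs that have not acknowledged the stop request yet. */
static atomic_int fake_waiting_for_crash_ipi;

/* Hypothetical model of a secondary CPU handling the crash-stop request. */
static void fake_cpu_ack(void)
{
	atomic_fetch_sub(&fake_waiting_for_crash_ipi, 1);
}

/*
 * others:     CPUs that must stop (the crashing CPU is not counted)
 * responders: CPUs that will actually acknowledge, one per poll
 * max_polls:  upper bound on polls before giving up
 */
static void fake_crash_smp_send_stop(int others, int responders,
				     unsigned long max_polls)
{
	unsigned long timeout = max_polls;

	assert(others >= 0 && responders >= 0);

	/* Nobody else to stop: back out without touching the counter. */
	if (others == 0)
		return;

	atomic_store(&fake_waiting_for_crash_ipi, others);

	/* Poll until everyone acknowledged or the budget is used up. */
	while (atomic_load(&fake_waiting_for_crash_ipi) > 0 && timeout--) {
		if (responders > 0) {
			fake_cpu_ack();
			responders--;
		}
	}
}

/* True if at least one CPU never acknowledged the stop request. */
static bool fake_smp_crash_stop_failed(void)
{
	return atomic_load(&fake_waiting_for_crash_ipi) > 0;
}
```

With three other CPUs that all respond within the poll budget, fake_smp_crash_stop_failed() reports false; if one CPU never responds, or the budget is shorter than the number of acknowledgements needed, it reports true, which is the condition the patch turns into a warning before kexec.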

Fixes: ad943893d5f1 ("RISC-V: Fixup schedule out issue in machine_crash_shutdown()")
Reviewed-by: Xianting Tian
Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
Cc: Nick Kossifidis
---
 arch/riscv/include/asm/smp.h      |  3 +
 arch/riscv/kernel/machine_kexec.c | 21 ++-----
 arch/riscv/kernel/smp.c           | 97 ++++++++++++++++++++++++++++++-
 3 files changed, 103 insertions(+), 18 deletions(-)

diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index d3443be7eedc..3831b638ecab 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -50,6 +50,9 @@ void riscv_set_ipi_ops(const struct riscv_ipi_ops *ops);
 /* Clear IPI for current CPU */
 void riscv_clear_ipi(void);
 
+/* Check other CPUs stop or not */
+bool smp_crash_stop_failed(void);
+
 /* Secondary hart entry */
 asmlinkage void smp_callin(void);
 
diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
index db41c676e5a2..2d139b724bc8 100644
--- a/arch/riscv/kernel/machine_kexec.c
+++ b/arch/riscv/kernel/machine_kexec.c
@@ -140,22 +140,6 @@ void machine_shutdown(void)
 #endif
 }
 
-/* Override the weak function in kernel/panic.c */
-void crash_smp_send_stop(void)
-{
-	static int cpus_stopped;
-
-	/*
-	 * This function can be called twice in panic path, but obviously
-	 * we execute this only once.
-	 */
-	if (cpus_stopped)
-		return;
-
-	smp_send_stop();
-	cpus_stopped = 1;
-}
-
 static void machine_kexec_mask_interrupts(void)
 {
 	unsigned int i;
@@ -230,6 +214,11 @@ machine_kexec(struct kimage *image)
 	void *control_code_buffer = page_address(image->control_code_page);
 	riscv_kexec_method kexec_method = NULL;
 
+#ifdef CONFIG_SMP
+	WARN(smp_crash_stop_failed(),
+		"Some CPUs may be stale, kdump will be unreliable.\n");
+#endif
+
 	if (image->type != KEXEC_TYPE_CRASH)
 		kexec_method = control_code_buffer;
 	else
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 760a64518c58..8c3b59f1f9b8 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -22,11 +23,13 @@
 #include
 #include
 #include
+#include
 
 enum ipi_message_type {
 	IPI_RESCHEDULE,
 	IPI_CALL_FUNC,
 	IPI_CPU_STOP,
+	IPI_CPU_CRASH_STOP,
 	IPI_IRQ_WORK,
 	IPI_TIMER,
 	IPI_MAX
@@ -71,6 +74,32 @@ static void ipi_stop(void)
 		wait_for_interrupt();
 }
 
+#ifdef CONFIG_KEXEC_CORE
+static atomic_t waiting_for_crash_ipi = ATOMIC_INIT(0);
+
+static inline void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
+{
+	crash_save_cpu(regs, cpu);
+
+	atomic_dec(&waiting_for_crash_ipi);
+
+	local_irq_disable();
+
+#ifdef CONFIG_HOTPLUG_CPU
+	if (cpu_has_hotplug(cpu))
+		cpu_ops[cpu]->cpu_stop();
+#endif
+
+	for(;;)
+		wait_for_interrupt();
+}
+#else
+static inline void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
+{
+	unreachable();
+}
+#endif
+
 static const struct riscv_ipi_ops *ipi_ops __ro_after_init;
 
 void riscv_set_ipi_ops(const struct riscv_ipi_ops *ops)
@@ -124,8 +153,9 @@ void arch_irq_work_raise(void)
 
 void handle_IPI(struct pt_regs *regs)
 {
-	unsigned long *pending_ipis = &ipi_data[smp_processor_id()].bits;
-	unsigned long *stats = ipi_data[smp_processor_id()].stats;
+	unsigned int cpu = smp_processor_id();
+	unsigned long *pending_ipis = &ipi_data[cpu].bits;
+	unsigned long *stats = ipi_data[cpu].stats;
 
 	riscv_clear_ipi();
 
@@ -154,6 +184,10 @@ void handle_IPI(struct pt_regs *regs)
 			ipi_stop();
 		}
 
+		if (ops & (1 << IPI_CPU_CRASH_STOP)) {
+			ipi_cpu_crash_stop(cpu, get_irq_regs());
+		}
+
 		if (ops & (1 << IPI_IRQ_WORK)) {
 			stats[IPI_IRQ_WORK]++;
 			irq_work_run();
@@ -176,6 +210,7 @@ static const char * const ipi_names[] = {
 	[IPI_RESCHEDULE]	= "Rescheduling interrupts",
 	[IPI_CALL_FUNC]		= "Function call interrupts",
 	[IPI_CPU_STOP]		= "CPU stop interrupts",
+	[IPI_CPU_CRASH_STOP]	= "CPU stop (for crash dump) interrupts",
 	[IPI_IRQ_WORK]		= "IRQ work interrupts",
 	[IPI_TIMER]		= "Timer broadcast interrupts",
 };
@@ -235,6 +270,64 @@ void smp_send_stop(void)
 			   cpumask_pr_args(cpu_online_mask));
 }
 
+#ifdef CONFIG_KEXEC_CORE
+/*
+ * The number of CPUs online, not counting this CPU (which may not be
+ * fully online and so not counted in num_online_cpus()).
+ */
+static inline unsigned int num_other_online_cpus(void)
+{
+	unsigned int this_cpu_online = cpu_online(smp_processor_id());
+
+	return num_online_cpus() - this_cpu_online;
+}
+
+void crash_smp_send_stop(void)
+{
+	static int cpus_stopped;
+	cpumask_t mask;
+	unsigned long timeout;
+
+	/*
+	 * This function can be called twice in panic path, but obviously
+	 * we execute this only once.
+	 */
+	if (cpus_stopped)
+		return;
+
+	cpus_stopped = 1;
+
+	/*
+	 * If this cpu is the only one alive at this point in time, online or
+	 * not, there are no stop messages to be sent around, so just back out.
+	 */
+	if (num_other_online_cpus() == 0)
+		return;
+
+	cpumask_copy(&mask, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &mask);
+
+	atomic_set(&waiting_for_crash_ipi, num_other_online_cpus());
+
+	pr_crit("SMP: stopping secondary CPUs\n");
+	send_ipi_mask(&mask, IPI_CPU_CRASH_STOP);
+
+	/* Wait up to one second for other CPUs to stop */
+	timeout = USEC_PER_SEC;
+	while ((atomic_read(&waiting_for_crash_ipi) > 0) && timeout--)
+		udelay(1);
+
+	if (atomic_read(&waiting_for_crash_ipi) > 0)
+		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
+			cpumask_pr_args(&mask));
+}
+
+bool smp_crash_stop_failed(void)
+{
+	return (atomic_read(&waiting_for_crash_ipi) > 0);
+}
+#endif
+
 void smp_send_reschedule(int cpu)
 {
 	send_ipi_single(cpu, IPI_RESCHEDULE);
-- 
2.36.1

From nobody Thu Apr 2 20:00:40 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 9C57AC32771
	for ; Wed, 21 Sep 2022 03:35:28 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S229811AbiIUDf1 (ORCPT ); Tue, 20 Sep 2022 23:35:27 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43550 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S231757AbiIUDeF (ORCPT ); Tue, 20 Sep 2022 23:34:05 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7ECCD1147F;
	Tue, 20 Sep 2022 20:32:06 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dfw.source.kernel.org (Postfix) with ESMTPS id 195AA62F2B;
	Wed, 21 Sep 2022 03:32:06 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id B9C1FC43470;
	Wed, 21 Sep 2022 03:32:01 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1663731125;
	bh=MxNuze7CJb1Pmp8L25+DA/brD4eP/MM76a60K8U4Mcs=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=FRxeTfpE+CmJ/UHg18U/YEKM1cdw9Nd3/ZrMO+2WFlPQcIVZmG+oLxSeOIhiqj56m
	 30PUkA8R4skrh/squVPaUW2yj1m6I8Fo5mLX3/pyuNyIY0OMQbhrQluBMmcoKAjmu2
	 Zp9XtC37CjSARm5j/qE2CdK2w6c/YsSL57SNs8KPW0ZFXTLDliXZx30woV5Knoc4aO
	 Y4fy+5wVX9wUM83y8byyhaKmtUaLVvmFLCg4E3MWPFgbYzy8ihHsV+AxJ4PVnE7ja0
	 e/nLwc2DtA6WcmfvtE/jt0kHwDO9FPdhe6hUBQ+s/2HzV96A/6IIm2nnsA3rIs9ogo
	 gnPzMokyo8yAA==
From: guoren@kernel.org
To: xianting.tian@linux.alibaba.com, palmer@dabbelt.com,
	palmer@rivosinc.com, heiko@sntech.de, liaochang1@huawei.com,
	jszhang@kernel.org, arnd@arndb.de
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-riscv@lists.infradead.org, x86@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Guo Ren, Guo Ren, Catalin Marinas, Thomas Gleixner
Subject: [PATCH V4 3/3] arch: crash: Remove duplicate declaration in smp.h
Date: Tue, 20 Sep 2022 23:31:34 -0400
Message-Id: <20220921033134.3133319-4-guoren@kernel.org>
X-Mailer: git-send-email 2.36.1
In-Reply-To: <20220921033134.3133319-1-guoren@kernel.org>
References: <20220921033134.3133319-1-guoren@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Guo Ren

Remove the crash_smp_send_stop() declarations from arm64's asm/smp.h
and x86's asm/crash.h; the function is already declared in
include/linux/smp.h.

Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
Acked-by: Catalin Marinas
Cc: Arnd Bergmann
Cc: Thomas Gleixner
Reviewed-by: Heiko Stuebner
---
 arch/arm64/include/asm/smp.h | 1 -
 arch/x86/include/asm/crash.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index fc55f5a57a06..a108ac93fd8f 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -141,7 +141,6 @@ static inline void cpu_panic_kernel(void)
  */
 bool cpus_are_stuck_in_kernel(void);
 
-extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
 extern void panic_smp_self_stop(void);
 
diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
index 8b6bd63530dc..6a9be4907c82 100644
--- a/arch/x86/include/asm/crash.h
+++ b/arch/x86/include/asm/crash.h
@@ -7,6 +7,5 @@ struct kimage;
 int crash_load_segments(struct kimage *image);
 int crash_setup_memmap_entries(struct kimage *image,
 		struct boot_params *params);
-void crash_smp_send_stop(void);
 
 #endif /* _ASM_X86_CRASH_H */
-- 
2.36.1