From nobody Tue Apr 7 14:38:10 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EEA7B399368; Thu, 12 Mar 2026 23:22:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773357754; cv=none; b=d2EiKgooqLQrDr37pFIdTx6xfAfDYPonC5ewszKOXwa3kR1XHeYHiqpxsqie9mRxfy1nrP1Xqh1U77yFT6/4XFNX2Rco8yOYusx/Uk9ExSJrELb/u3EMNYyBK21qswz5OEo7dXkhQ/Jys14tU/DE8pT90icIUUlLdVcwCnC+WjA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773357754; c=relaxed/simple; bh=ZvdhkjPc20Ylp1M7TIY3QZ3BSppLWF2s98ZTCW2gE34=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=hVdxjaBkk/vR5nC3NiUIxNsqkpX7ViLsWG1e4Xfr+69iuEycdjEOL2HVzYnyxyylu80d6DkjjtOXUIcOMKxiyjoXesOnGV7a5ieYc6aSwI8gWi0KcFqRc5ItXldZfyTkkC3zp+1TdMcDW90ZVu9gwNp54r/qeC9qS2PfG2HjZMY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=sHUNbtCZ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="sHUNbtCZ" Received: by smtp.kernel.org (Postfix) with ESMTPS id AB441C2BC86; Thu, 12 Mar 2026 23:22:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1773357753; bh=ZvdhkjPc20Ylp1M7TIY3QZ3BSppLWF2s98ZTCW2gE34=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=sHUNbtCZ9/+RUnbBi8m1Kd4o05tOrLQvFeFZW/7FjQ8FAdd8ueHeoZYBX6puIC7sS ufkH+3EcKi3cFua8PSDqXvv68nDVBuCZBtQhsCJNlBQNtJQNCGrSKf7FCVwvLDcU4Y qwDBgqEhTWYEAv506l7f7abt5edExtzAIQQSdoXFAUErA0/Nnzrw82Hz6Yl6NkjChS 
SgK5wrEtDdP18ucTJZrAZTJO0KeSAtJlZGAsNPlvvd6k9/nvFR+DWDQ/x1X5TgCq8g Ud8kg0HM5dUgcvCFXHDl30BE5nXygl+e97ZCkc5rWiAeCAMTm9Y/SbA2BCAanadqx3 fe8+qBmkP9PWg== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93FA6106ACF9; Thu, 12 Mar 2026 23:22:33 +0000 (UTC) From: Mayank Rungta via B4 Relay Date: Thu, 12 Mar 2026 16:22:02 -0700 Subject: [PATCH v2 1/5] watchdog: Return early in watchdog_hardlockup_check() Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260312-hardlockup-watchdog-fixes-v2-1-45bd8a0cc7ed@google.com> References: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com> In-Reply-To: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com> To: Petr Mladek , Jinchao Wang , Yunhui Cui , Stephane Eranian , Ian Rogers , Li Huafei , Feng Tang , Max Kellermann , Jonathan Corbet , Douglas Anderson , Andrew Morton , Florian Delizy , Shuah Khan Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Mayank Rungta X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1773357752; l=5437; i=mrungta@google.com; s=20260212; h=from:subject:message-id; bh=/gmGbSYfq0uL3jFNpI3ejNjGlNROGDfe6h8BWDUCdcs=; b=HQzF2MlIsRSuZGZCoBBHHUHE0LnOCFLTKwDzHh7aXPJMLq192jyuitITA7uKVL+uLOB7DMUHR xkMy50bkDXSB0PzgpgywZ8SeScifNZ4R3sKJqyvZctb9P4s0AQEspxX X-Developer-Key: i=mrungta@google.com; a=ed25519; pk=2Bjwbv/ibL10QnyvK9G7DoKpffXy7z6+M4NawEYgYDI= X-Endpoint-Received: by B4 Relay for mrungta@google.com/20260212 with auth_id=634 X-Original-From: Mayank Rungta Reply-To: mrungta@google.com From: Mayank Rungta Invert the `is_hardlockup(cpu)` check in `watchdog_hardlockup_check()` to return early when a hardlockup is not detected. 
This flattens the main logic block, reducing the indentation level and
making the code easier to read and maintain.

This refactoring serves as a preparation patch for future hardlockup
changes.

Signed-off-by: Mayank Rungta
Reviewed-by: Douglas Anderson
Reviewed-by: Petr Mladek
---
 kernel/watchdog.c | 117 +++++++++++++++++++++++++++---------------------------
 1 file changed, 59 insertions(+), 58 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 7d675781bc91..4c5b47495745 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -187,6 +187,8 @@ static void watchdog_hardlockup_kick(void)
 void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
 {
 	int hardlockup_all_cpu_backtrace;
+	unsigned int this_cpu;
+	unsigned long flags;
 
 	if (per_cpu(watchdog_hardlockup_touched, cpu)) {
 		per_cpu(watchdog_hardlockup_touched, cpu) = false;
@@ -201,74 +203,73 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
 	 * fired multiple times before we overflow'd. If it hasn't
 	 * then this is a good indication the cpu is stuck
 	 */
-	if (is_hardlockup(cpu)) {
-		unsigned int this_cpu = smp_processor_id();
-		unsigned long flags;
+	if (!is_hardlockup(cpu)) {
+		per_cpu(watchdog_hardlockup_warned, cpu) = false;
+		return;
+	}
 
 #ifdef CONFIG_SYSFS
-		++hardlockup_count;
+	++hardlockup_count;
 #endif
-		/*
-		 * A poorly behaving BPF scheduler can trigger hard lockup by
-		 * e.g. putting numerous affinitized tasks in a single queue and
-		 * directing all CPUs at it. The following call can return true
-		 * only once when sched_ext is enabled and will immediately
-		 * abort the BPF scheduler and print out a warning message.
-		 */
-		if (scx_hardlockup(cpu))
-			return;
+	/*
+	 * A poorly behaving BPF scheduler can trigger hard lockup by
+	 * e.g. putting numerous affinitized tasks in a single queue and
+	 * directing all CPUs at it. The following call can return true
+	 * only once when sched_ext is enabled and will immediately
+	 * abort the BPF scheduler and print out a warning message.
+	 */
+	if (scx_hardlockup(cpu))
+		return;
 
-		/* Only print hardlockups once. */
-		if (per_cpu(watchdog_hardlockup_warned, cpu))
-			return;
+	/* Only print hardlockups once. */
+	if (per_cpu(watchdog_hardlockup_warned, cpu))
+		return;
 
-		/*
-		 * Prevent multiple hard-lockup reports if one cpu is already
-		 * engaged in dumping all cpu back traces.
-		 */
-		if (hardlockup_all_cpu_backtrace) {
-			if (test_and_set_bit_lock(0, &hard_lockup_nmi_warn))
-				return;
-		}
+	/*
+	 * Prevent multiple hard-lockup reports if one cpu is already
+	 * engaged in dumping all cpu back traces.
+	 */
+	if (hardlockup_all_cpu_backtrace) {
+		if (test_and_set_bit_lock(0, &hard_lockup_nmi_warn))
+			return;
+	}
 
-		/*
-		 * NOTE: we call printk_cpu_sync_get_irqsave() after printing
-		 * the lockup message. While it would be nice to serialize
-		 * that printout, we really want to make sure that if some
-		 * other CPU somehow locked up while holding the lock associated
-		 * with printk_cpu_sync_get_irqsave() that we can still at least
-		 * get the message about the lockup out.
-		 */
-		pr_emerg("CPU%u: Watchdog detected hard LOCKUP on cpu %u\n", this_cpu, cpu);
-		printk_cpu_sync_get_irqsave(flags);
+	/*
+	 * NOTE: we call printk_cpu_sync_get_irqsave() after printing
+	 * the lockup message. While it would be nice to serialize
+	 * that printout, we really want to make sure that if some
+	 * other CPU somehow locked up while holding the lock associated
+	 * with printk_cpu_sync_get_irqsave() that we can still at least
+	 * get the message about the lockup out.
+	 */
+	this_cpu = smp_processor_id();
+	pr_emerg("CPU%u: Watchdog detected hard LOCKUP on cpu %u\n", this_cpu, cpu);
+	printk_cpu_sync_get_irqsave(flags);
 
-		print_modules();
-		print_irqtrace_events(current);
-		if (cpu == this_cpu) {
-			if (regs)
-				show_regs(regs);
-			else
-				dump_stack();
-			printk_cpu_sync_put_irqrestore(flags);
-		} else {
-			printk_cpu_sync_put_irqrestore(flags);
-			trigger_single_cpu_backtrace(cpu);
-		}
+	print_modules();
+	print_irqtrace_events(current);
+	if (cpu == this_cpu) {
+		if (regs)
+			show_regs(regs);
+		else
+			dump_stack();
+		printk_cpu_sync_put_irqrestore(flags);
+	} else {
+		printk_cpu_sync_put_irqrestore(flags);
+		trigger_single_cpu_backtrace(cpu);
+	}
 
-		if (hardlockup_all_cpu_backtrace) {
-			trigger_allbutcpu_cpu_backtrace(cpu);
-			if (!hardlockup_panic)
-				clear_bit_unlock(0, &hard_lockup_nmi_warn);
-		}
+	if (hardlockup_all_cpu_backtrace) {
+		trigger_allbutcpu_cpu_backtrace(cpu);
+		if (!hardlockup_panic)
+			clear_bit_unlock(0, &hard_lockup_nmi_warn);
+	}
 
-		sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT);
-		if (hardlockup_panic)
-			nmi_panic(regs, "Hard LOCKUP");
+	sys_info(hardlockup_si_mask & ~SYS_INFO_ALL_BT);
+	if (hardlockup_panic)
+		nmi_panic(regs, "Hard LOCKUP");
 
-		per_cpu(watchdog_hardlockup_warned, cpu) = true;
-	} else {
-		per_cpu(watchdog_hardlockup_warned, cpu) = false;
-	}
+	per_cpu(watchdog_hardlockup_warned, cpu) = true;
 }
 
 #else /* CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER */
-- 
2.53.0.851.ga537e3e6e9-goog

From nobody Tue Apr 7 14:38:10 2026
From: Mayank Rungta via B4 Relay
Date: Thu, 12 Mar 2026 16:22:03 -0700
Subject: [PATCH v2 2/5] watchdog: Update saved interrupts during check
Message-Id: <20260312-hardlockup-watchdog-fixes-v2-2-45bd8a0cc7ed@google.com>
References: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com>
In-Reply-To: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com>
Reply-To: mrungta@google.com

From: Mayank Rungta

Currently, arch_touch_nmi_watchdog() causes an early return that
skips updating hrtimer_interrupts_saved. This leads to stale
comparisons and delayed lockup detection.

I found this issue because in our system the serial console is fairly
chatty. For example, the 8250 console driver frequently calls
touch_nmi_watchdog() via console_write().

If a CPU locks up after a timer interrupt but before the next watchdog
check, we see the following sequence:

  * watchdog_hardlockup_check() saves counter (e.g., 1000)
  * Timer runs and updates the counter (1001)
  * touch_nmi_watchdog() is called
  * CPU locks up
  * 10s pass: check() notices touch, returns early, skips update
  * 10s pass: check() saves counter (1001)
  * 10s pass: check() finally detects lockup

This delays detection to 30 seconds. With this fix, we detect the
lockup in 20 seconds.

Reviewed-by: Douglas Anderson
Signed-off-by: Mayank Rungta
Reviewed-by: Petr Mladek
---
 kernel/watchdog.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 4c5b47495745..431c540bd035 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -159,21 +159,28 @@ void watchdog_hardlockup_touch_cpu(unsigned int cpu)
 	per_cpu(watchdog_hardlockup_touched, cpu) = true;
 }
 
-static bool is_hardlockup(unsigned int cpu)
+static void watchdog_hardlockup_update(unsigned int cpu)
 {
 	int hrint = atomic_read(&per_cpu(hrtimer_interrupts, cpu));
 
-	if (per_cpu(hrtimer_interrupts_saved, cpu) == hrint)
-		return true;
-
 	/*
 	 * NOTE: we don't need any fancy atomic_t or READ_ONCE/WRITE_ONCE
 	 * for hrtimer_interrupts_saved. hrtimer_interrupts_saved is
 	 * written/read by a single CPU.
 	 */
 	per_cpu(hrtimer_interrupts_saved, cpu) = hrint;
+}
+
+static bool is_hardlockup(unsigned int cpu)
+{
+	int hrint = atomic_read(&per_cpu(hrtimer_interrupts, cpu));
+
+	if (per_cpu(hrtimer_interrupts_saved, cpu) != hrint) {
+		watchdog_hardlockup_update(cpu);
+		return false;
+	}
 
-	return false;
+	return true;
 }
 
 static void watchdog_hardlockup_kick(void)
@@ -191,6 +198,7 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
 	unsigned long flags;
 
 	if (per_cpu(watchdog_hardlockup_touched, cpu)) {
+		watchdog_hardlockup_update(cpu);
 		per_cpu(watchdog_hardlockup_touched, cpu) = false;
 		return;
 	}
-- 
2.53.0.851.ga537e3e6e9-goog

From nobody Tue Apr 7 14:38:10 2026
From: Mayank Rungta via B4 Relay
Date: Thu, 12 Mar 2026 16:22:04 -0700
Subject: [PATCH v2 3/5] doc: watchdog: Clarify hardlockup detection timing
Message-Id: <20260312-hardlockup-watchdog-fixes-v2-3-45bd8a0cc7ed@google.com>
References: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com>
In-Reply-To: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com>
Reply-To: mrungta@google.com

From: Mayank Rungta

The current documentation implies that a hardlockup is strictly defined
as looping for "more than 10 seconds." However, the detection mechanism
is periodic (based on `watchdog_thresh`), meaning detection time varies
significantly depending on when the lockup occurs relative to the NMI
perf event.

Update the definition to remove the strict "more than 10 seconds"
constraint in the introduction and defer details to the Implementation
section.

Additionally, add a "Detection Overhead" section illustrating the
Best Case (~6s) and Worst Case (~20s) detection scenarios to provide
administrators with a clearer understanding of the watchdog's latency.

Reviewed-by: Petr Mladek
Reviewed-by: Douglas Anderson
Signed-off-by: Mayank Rungta
---
 Documentation/admin-guide/lockup-watchdogs.rst | 41 +++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/lockup-watchdogs.rst b/Documentation/admin-guide/lockup-watchdogs.rst
index 3e09284a8b9b..1b374053771f 100644
--- a/Documentation/admin-guide/lockup-watchdogs.rst
+++ b/Documentation/admin-guide/lockup-watchdogs.rst
@@ -16,7 +16,7 @@ details), and a compile option, "BOOTPARAM_SOFTLOCKUP_PANIC", are
 provided for this.
 
 A 'hardlockup' is defined as a bug that causes the CPU to loop in
-kernel mode for more than 10 seconds (see "Implementation" below for
+kernel mode for several seconds (see "Implementation" below for
 details), without letting other interrupts have a chance to run.
 Similarly to the softlockup case, the current stack trace is displayed
 upon detection and the system will stay locked up unless the default
@@ -64,6 +64,45 @@ administrators to configure the period of the hrtimer and the perf
 event. The right value for a particular environment is a trade-off
 between fast response to lockups and detection overhead.
 
+Detection Overhead
+------------------
+
+The hardlockup detector checks for lockups using a periodic NMI perf
+event. This means the time to detect a lockup can vary depending on
+when the lockup occurs relative to the NMI check window.
+
+**Best Case:**
+In the best case scenario, the lockup occurs just before the first
+heartbeat is due. The detector will notice the missing hrtimer
+interrupt almost immediately during the next check.
+
+::
+
+  Time 100.0: cpu 1 heartbeat
+  Time 100.1: hardlockup_check, cpu1 stores its state
+  Time 103.9: Hard Lockup on cpu1
+  Time 104.0: cpu 1 heartbeat never comes
+  Time 110.1: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
+
+  Time to detection: ~6 seconds
+
+**Worst Case:**
+In the worst case scenario, the lockup occurs shortly after a valid
+interrupt (heartbeat) which itself happened just after the NMI check.
+The next NMI check sees that the interrupt count has changed (due to
+that one heartbeat), assumes the CPU is healthy, and resets the
+baseline. The lockup is only detected at the subsequent check.
+
+::
+
+  Time 100.0: hardlockup_check, cpu1 stores its state
+  Time 100.1: cpu 1 heartbeat
+  Time 100.2: Hard Lockup on cpu1
+  Time 110.0: hardlockup_check, cpu1 stores its state (misses lockup as state changed)
+  Time 120.0: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
+
+  Time to detection: ~20 seconds
+
 By default, the watchdog runs on all online cores. However, on a
 kernel configured with NO_HZ_FULL, by default the watchdog runs only
 on the housekeeping cores, not the cores specified in the "nohz_full"
-- 
2.53.0.851.ga537e3e6e9-goog

From nobody Tue Apr 7 14:38:10 2026
From: Mayank Rungta via B4 Relay
Date: Thu, 12 Mar 2026 16:22:05 -0700
Subject: [PATCH v2 4/5] watchdog/hardlockup: improve buddy system detection timeliness
Message-Id: <20260312-hardlockup-watchdog-fixes-v2-4-45bd8a0cc7ed@google.com>
References: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com>
In-Reply-To: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com>
Reply-To: mrungta@google.com

From: Mayank Rungta

Currently, the buddy system only performs checks every 3rd sample.
With a 4-second interval, if a check window is missed, the next check
occurs 12 seconds later, potentially delaying hard lockup detection
for up to 24 seconds.

Modify the buddy system to perform checks at every interval (4s).
Introduce a missed-interrupt threshold to maintain the existing grace
period while reducing the detection window to 8-12 seconds.

Best and worst case detection scenarios:

Before (12s check window):
- Best case: Lockup occurs after first check but just before heartbeat
  interval. Detected in ~8s (8s till next check).
- Worst case: Lockup occurs just after a check.
  Detected in ~24s (missed check + 12s till next check + 12s logic).

After (4s check window with threshold of 3):
- Best case: Lockup occurs just before a check.
  Detected in ~8s (0s till 1st check + 4s till 2nd + 4s till 3rd).
- Worst case: Lockup occurs just after a check.
  Detected in ~12s (4s till 1st check + 4s till 2nd + 4s till 3rd).

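The "After" arithmetic can be sketched in plain userspace C. This is an
illustration only, not kernel code: the struct and the two helper
functions are hypothetical names, and the sketch assumes that the saved
baseline already equals the frozen heartbeat count when the lockup
starts. If a heartbeat lands between the previous check and the lockup,
the first check afterwards only refreshes the baseline, which this
sketch does not model.

```c
/* Per-CPU bookkeeping, named after the per-CPU variables in the patch. */
struct watch_state {
	int saved;	/* counterpart of hrtimer_interrupts_saved */
	int missed;	/* counterpart of hrtimer_interrupts_missed */
};

/*
 * One periodic check. Returns 1 when a lockup should be declared.
 * A changed counter refreshes the baseline and clears the miss count;
 * an unchanged counter adds a miss, and every miss_thresh-th miss reports.
 */
static int check_once(struct watch_state *st, int hrint, int miss_thresh)
{
	if (miss_thresh < 1)
		miss_thresh = 1;	/* guard added for this sketch only */

	if (st->saved != hrint) {
		st->saved = hrint;
		st->missed = 0;
		return 0;
	}

	st->missed++;
	return (st->missed % miss_thresh) == 0;
}

/*
 * Hypothetical helper: number of checks needed once the counter is frozen,
 * assuming the baseline already equals the frozen value.
 */
static int checks_until_declared(int frozen_hrint, int miss_thresh)
{
	struct watch_state st = { frozen_hrint, 0 };
	int checks = 0;

	do {
		checks++;
	} while (!check_once(&st, frozen_hrint, miss_thresh));

	return checks;
}

/*
 * Hypothetical helper: seconds from lockup to report, given the delay
 * until the first check after the lockup and the check interval.
 */
static int detection_latency_s(int first_check_delay_s, int interval_s,
			       int miss_thresh)
{
	int checks = checks_until_declared(0, miss_thresh);

	return first_check_delay_s + (checks - 1) * interval_s;
}
```

Under that assumption, detection_latency_s(0, 4, 3) evaluates to 8 and
detection_latency_s(4, 4, 3) evaluates to 12, matching the best and
worst cases listed above. With a threshold of 1, the first check that
sees an unchanged counter reports immediately.
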
Reviewed-by: Douglas Anderson Signed-off-by: Mayank Rungta Reviewed-by: Petr Mladek --- include/linux/nmi.h | 1 + kernel/watchdog.c | 19 ++++++++++++++++--- kernel/watchdog_buddy.c | 9 +-------- 3 files changed, 18 insertions(+), 11 deletions(-) diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 207156f2143c..bc1162895f35 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -21,6 +21,7 @@ void lockup_detector_soft_poweroff(void); extern int watchdog_user_enabled; extern int watchdog_thresh; extern unsigned long watchdog_enabled; +extern int watchdog_hardlockup_miss_thresh; =20 extern struct cpumask watchdog_cpumask; extern unsigned long *watchdog_cpumask_bits; diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 431c540bd035..87dd5e0f6968 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -60,6 +60,13 @@ unsigned long *watchdog_cpumask_bits =3D cpumask_bits(&w= atchdog_cpumask); int __read_mostly sysctl_hardlockup_all_cpu_backtrace; # endif /* CONFIG_SMP */ =20 +/* + * Number of consecutive missed interrupts before declaring a lockup. + * Default to 1 (immediate) for NMI/Perf. Buddy will overwrite this to 3. 
+ */ +int __read_mostly watchdog_hardlockup_miss_thresh =3D 1; +EXPORT_SYMBOL_GPL(watchdog_hardlockup_miss_thresh); + /* * Should we panic when a soft-lockup or hard-lockup occurs: */ @@ -137,6 +144,7 @@ __setup("nmi_watchdog=3D", hardlockup_panic_setup); =20 static DEFINE_PER_CPU(atomic_t, hrtimer_interrupts); static DEFINE_PER_CPU(int, hrtimer_interrupts_saved); +static DEFINE_PER_CPU(int, hrtimer_interrupts_missed); static DEFINE_PER_CPU(bool, watchdog_hardlockup_warned); static DEFINE_PER_CPU(bool, watchdog_hardlockup_touched); static unsigned long hard_lockup_nmi_warn; @@ -159,7 +167,7 @@ void watchdog_hardlockup_touch_cpu(unsigned int cpu) per_cpu(watchdog_hardlockup_touched, cpu) =3D true; } =20 -static void watchdog_hardlockup_update(unsigned int cpu) +static void watchdog_hardlockup_update_reset(unsigned int cpu) { int hrint =3D atomic_read(&per_cpu(hrtimer_interrupts, cpu)); =20 @@ -169,6 +177,7 @@ static void watchdog_hardlockup_update(unsigned int cpu) * written/read by a single CPU. 
*/ per_cpu(hrtimer_interrupts_saved, cpu) =3D hrint; + per_cpu(hrtimer_interrupts_missed, cpu) =3D 0; } =20 static bool is_hardlockup(unsigned int cpu) @@ -176,10 +185,14 @@ static bool is_hardlockup(unsigned int cpu) int hrint =3D atomic_read(&per_cpu(hrtimer_interrupts, cpu)); =20 if (per_cpu(hrtimer_interrupts_saved, cpu) !=3D hrint) { - watchdog_hardlockup_update(cpu); + watchdog_hardlockup_update_reset(cpu); return false; } =20 + per_cpu(hrtimer_interrupts_missed, cpu)++; + if (per_cpu(hrtimer_interrupts_missed, cpu) % watchdog_hardlockup_miss_th= resh) + return false; + return true; } =20 @@ -198,7 +211,7 @@ void watchdog_hardlockup_check(unsigned int cpu, struct= pt_regs *regs) unsigned long flags; =20 if (per_cpu(watchdog_hardlockup_touched, cpu)) { - watchdog_hardlockup_update(cpu); + watchdog_hardlockup_update_reset(cpu); per_cpu(watchdog_hardlockup_touched, cpu) =3D false; return; } diff --git a/kernel/watchdog_buddy.c b/kernel/watchdog_buddy.c index ee754d767c21..3a1e57080c1c 100644 --- a/kernel/watchdog_buddy.c +++ b/kernel/watchdog_buddy.c @@ -21,6 +21,7 @@ static unsigned int watchdog_next_cpu(unsigned int cpu) =20 int __init watchdog_hardlockup_probe(void) { + watchdog_hardlockup_miss_thresh =3D 3; return 0; } =20 @@ -86,14 +87,6 @@ void watchdog_buddy_check_hardlockup(int hrtimer_interru= pts) { unsigned int next_cpu; =20 - /* - * Test for hardlockups every 3 samples. The sample period is - * watchdog_thresh * 2 / 5, so 3 samples gets us back to slightly over - * watchdog_thresh (over by 20%). 
-	 */
-	if (hrtimer_interrupts % 3 != 0)
-		return;
-
 	/* check for a hardlockup on the next CPU */
 	next_cpu = watchdog_next_cpu(smp_processor_id());
 	if (next_cpu >= nr_cpu_ids)
-- 
2.53.0.851.ga537e3e6e9-goog

From nobody Tue Apr 7 14:38:10 2026
From: Mayank Rungta via B4 Relay
Date: Thu, 12 Mar 2026 16:22:06 -0700
Subject: [PATCH v2 5/5] doc: watchdog: Document buddy detector
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260312-hardlockup-watchdog-fixes-v2-5-45bd8a0cc7ed@google.com>
References: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com>
In-Reply-To: <20260312-hardlockup-watchdog-fixes-v2-0-45bd8a0cc7ed@google.com>
To: Petr Mladek, Jinchao Wang, Yunhui Cui, Stephane Eranian, Ian Rogers,
 Li Huafei, Feng Tang, Max Kellermann, Jonathan Corbet, Douglas Anderson,
 Andrew Morton, Florian Delizy, Shuah Khan
Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Mayank Rungta
Reply-To: mrungta@google.com

From: Mayank Rungta

The
current documentation generalizes the hardlockup detector as primarily
NMI-perf-based and lacks details on the SMP "Buddy" detector.

Update the documentation to add a detailed description of the Buddy
detector, and also restructure the "Implementation" section to explicitly
separate "Softlockup Detector", "Hardlockup Detector (NMI/Perf)", and
"Hardlockup Detector (Buddy)".

Clarify that the softlockup hrtimer acts as the heartbeat generator for
both hardlockup mechanisms and centralize the configuration details in a
"Frequency and Heartbeats" section.

Reviewed-by: Douglas Anderson
Signed-off-by: Mayank Rungta
---
 Documentation/admin-guide/lockup-watchdogs.rst | 149 +++++++++++++++++-------
 1 file changed, 101 insertions(+), 48 deletions(-)

diff --git a/Documentation/admin-guide/lockup-watchdogs.rst b/Documentation/admin-guide/lockup-watchdogs.rst
index 1b374053771f..7ae7ce3abd2c 100644
--- a/Documentation/admin-guide/lockup-watchdogs.rst
+++ b/Documentation/admin-guide/lockup-watchdogs.rst
@@ -30,22 +30,23 @@ timeout is set through the confusingly named "kernel.panic" sysctl), to
 cause the system to reboot automatically after a specified amount of
 time.
 
+Configuration
+=============
+
+A kernel knob is provided that allows administrators to configure
+this period. The "watchdog_thresh" parameter (default 10 seconds)
+controls the threshold. The right value for a particular environment
+is a trade-off between fast response to lockups and detection overhead.
+
 Implementation
 ==============
 
-The soft and hard lockup detectors are built on top of the hrtimer and
-perf subsystems, respectively. A direct consequence of this is that,
-in principle, they should work in any architecture where these
-subsystems are present.
+The soft lockup detector is built on top of the hrtimer subsystem.
+The hard lockup detector is built on top of the perf subsystem
+(on architectures that support it) or uses an SMP "buddy" system.
 
-A periodic hrtimer runs to generate interrupts and kick the watchdog
-job. An NMI perf event is generated every "watchdog_thresh"
-(compile-time initialized to 10 and configurable through sysctl of the
-same name) seconds to check for hardlockups. If any CPU in the system
-does not receive any hrtimer interrupt during that time the
-'hardlockup detector' (the handler for the NMI perf event) will
-generate a kernel warning or call panic, depending on the
-configuration.
+Softlockup Detector
+-------------------
 
 The watchdog job runs in a stop scheduling thread that updates a
 timestamp every time it is scheduled. If that timestamp is not updated
@@ -55,53 +56,105 @@ will dump useful debug information to the system log, after which it
 will call panic if it was instructed to do so or resume execution of
 other kernel code.
 
-The period of the hrtimer is 2*watchdog_thresh/5, which means it has
-two or three chances to generate an interrupt before the hardlockup
-detector kicks in.
+Frequency and Heartbeats
+------------------------
+
+The hrtimer used by the softlockup detector serves a dual purpose:
+it detects softlockups, and it also generates the interrupts
+(heartbeats) that the hardlockup detectors use to verify CPU liveness.
+
+The period of this hrtimer is 2*watchdog_thresh/5. This means the
+hrtimer has two or three chances to generate an interrupt before the
+NMI hardlockup detector kicks in.
+
+Hardlockup Detector (NMI/Perf)
+------------------------------
+
+On architectures that support NMI (Non-Maskable Interrupt) perf events,
+a periodic NMI is generated every "watchdog_thresh" seconds.
+
+If any CPU in the system does not receive any hrtimer interrupt
+(heartbeat) during the "watchdog_thresh" window, the 'hardlockup
+detector' (the handler for the NMI perf event) will generate a kernel
+warning or call panic.
+
+**Detection Overhead (NMI):**
+
+The time to detect a lockup can vary depending on when the lockup
+occurs relative to the NMI check window. Examples below assume a watchdog_thresh of 10.
+
+* **Best Case:** The lockup occurs just before the first heartbeat is
+  due. The detector will notice the missing hrtimer interrupt almost
+  immediately during the next check.
+
+  ::
+
+    Time 100.0: cpu 1 heartbeat
+    Time 100.1: hardlockup_check, cpu1 stores its state
+    Time 103.9: Hard Lockup on cpu1
+    Time 104.0: cpu 1 heartbeat never comes
+    Time 110.1: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
+
+    Time to detection: ~6 seconds
+
+* **Worst Case:** The lockup occurs shortly after a valid interrupt
+  (heartbeat) which itself happened just after the NMI check. The next
+  NMI check sees that the interrupt count has changed (due to that one
+  heartbeat), assumes the CPU is healthy, and resets the baseline. The
+  lockup is only detected at the subsequent check.
+
+  ::
+
+    Time 100.0: hardlockup_check, cpu1 stores its state
+    Time 100.1: cpu 1 heartbeat
+    Time 100.2: Hard Lockup on cpu1
+    Time 110.0: hardlockup_check, cpu1 stores its state (misses lockup as state changed)
+    Time 120.0: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
 
-As explained above, a kernel knob is provided that allows
-administrators to configure the period of the hrtimer and the perf
-event. The right value for a particular environment is a trade-off
-between fast response to lockups and detection overhead.
+    Time to detection: ~20 seconds
 
-Detection Overhead
-------------------
+Hardlockup Detector (Buddy)
+---------------------------
 
-The hardlockup detector checks for lockups using a periodic NMI perf
-event. This means the time to detect a lockup can vary depending on
-when the lockup occurs relative to the NMI check window.
+On architectures or configurations where NMI perf events are not
+available (or disabled), the kernel may use the "buddy" hardlockup
+detector. This mechanism requires SMP (Symmetric Multi-Processing).
 
-**Best Case:**
-In the best case scenario, the lockup occurs just before the first
-heartbeat is due. The detector will notice the missing hrtimer
-interrupt almost immediately during the next check.
+In this mode, each CPU is assigned a "buddy" CPU to monitor. The
+monitoring CPU runs its own hrtimer (the same one used for softlockup
+detection) and checks if the buddy CPU's hrtimer interrupt count has
+increased.
 
-::
+To ensure timeliness and avoid false positives, the buddy system performs
+checks at every hrtimer interval (2*watchdog_thresh/5, which is 4 seconds
+by default). It uses a missed-interrupt threshold of 3. If the buddy's
+interrupt count has not changed for 3 consecutive checks, it is assumed
+that the buddy CPU is hardlocked (interrupts disabled). The monitoring
+CPU will then trigger the hardlockup response (warning or panic).
 
-  Time 100.0: cpu 1 heartbeat
-  Time 100.1: hardlockup_check, cpu1 stores its state
-  Time 103.9: Hard Lockup on cpu1
-  Time 104.0: cpu 1 heartbeat never comes
-  Time 110.1: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
+**Detection Overhead (Buddy):**
 
-  Time to detection: ~6 seconds
+With a default check interval of 4 seconds (watchdog_thresh = 10):
 
-**Worst Case:**
-In the worst case scenario, the lockup occurs shortly after a valid
-interrupt (heartbeat) which itself happened just after the NMI check.
-The next NMI check sees that the interrupt count has changed (due to
-that one heartbeat), assumes the CPU is healthy, and resets the
-baseline. The lockup is only detected at the subsequent check.
+* **Best case:** Lockup occurs just before a check.
+  Detected in ~8s (0s till 1st check + 4s till 2nd + 4s till 3rd).
+* **Worst case:** Lockup occurs just after a check.
+  Detected in ~12s (4s till 1st check + 4s till 2nd + 4s till 3rd).
 
-::
+**Limitations of the Buddy Detector:**
 
-  Time 100.0: hardlockup_check, cpu1 stores its state
-  Time 100.1: cpu 1 heartbeat
-  Time 100.2: Hard Lockup on cpu1
-  Time 110.0: hardlockup_check, cpu1 stores its state (misses lockup as state changed)
-  Time 120.0: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
+1. **All-CPU Lockup:** If all CPUs lock up simultaneously, the buddy
+   detector cannot detect the condition because the monitoring CPUs
+   are also frozen.
+2. **Stack Traces:** Unlike the NMI detector, the buddy detector
+   cannot directly interrupt the locked CPU to grab a stack trace.
+   It relies on architecture-specific mechanisms (like NMI backtrace
+   support) to try and retrieve the status of the locked CPU. If
+   such support is missing, the log may only show that a lockup
+   occurred without providing the locked CPU's stack.
 
-  Time to detection: ~20 seconds
+Watchdog Core Exclusion
+=======================
 
 By default, the watchdog runs on all online cores. However, on a
 kernel configured with NO_HZ_FULL, by default the watchdog runs only
-- 
2.53.0.851.ga537e3e6e9-goog
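
For readers following the series, the missed-interrupt counting that the
is_hardlockup() hunk introduces can be sketched in user-space C. This is
an illustration under stated assumptions, not kernel code: struct
hl_state, hl_check() and hl_first_report() are hypothetical names, the
per-CPU variables are replaced by a plain struct, and the counter types
are guesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the per-CPU watchdog state. */
struct hl_state {
	int saved;           /* models hrtimer_interrupts_saved */
	unsigned int missed; /* models hrtimer_interrupts_missed */
};

/*
 * Models one is_hardlockup() call: 'hrint' is the interrupt count
 * sampled at this check, 'miss_thresh' models
 * watchdog_hardlockup_miss_thresh (assumed to be non-zero).
 */
static bool hl_check(struct hl_state *st, int hrint, unsigned int miss_thresh)
{
	assert(miss_thresh > 0);

	if (st->saved != hrint) {
		/* Heartbeat seen: record it and restart the miss count. */
		st->saved = hrint;
		st->missed = 0;
		return false;
	}

	st->missed++;
	/* Report only on every miss_thresh-th consecutive miss. */
	return (st->missed % miss_thresh) == 0;
}

/*
 * Returns the index of the first check that reports a lockup for the
 * given sequence of sampled counts, or -1 if none does.
 */
static int hl_first_report(const int *samples, size_t n,
			   unsigned int miss_thresh)
{
	struct hl_state st = { .saved = 0, .missed = 0 };

	for (size_t i = 0; i < n; i++) {
		if (hl_check(&st, samples[i], miss_thresh))
			return (int)i;
	}
	return -1;
}
```

With the samples {5, 6, 6, 6, 6} and a threshold of 3 (the value the
buddy probe sets), the sketch first reports at index 4, which is the
third consecutive check that sees no new heartbeat. With a threshold of
1 it reports at index 2. Because of the modulo, the sketch would report
again after every further 3 misses.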