From: Feng Tang
To: Andrew Morton, paulmck@kernel.org, Petr Mladek
Cc: linux-kernel@vger.kernel.org, Feng Tang
Subject: [PATCH] lib/nmi_backtrace: print out the CPUs which fail to respond to NMI
Date: Thu, 21 May 2026 11:03:36 +0800
Message-Id: <20260521030336.92172-1-feng.tang@linux.alibaba.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)

When debugging RCU stall cases, usually all CPUs respond to the NMI and
print out their backtraces. But in some nasty or hardware-related cases,
some CPUs may fail to respond within 10 seconds, which is very likely a
sign of severe issues.

Paul E. McKenney has implemented the NMI backtrace stall check for x86.
For other architectures, it would also be helpful to at least print out
the CPUs which failed to respond to the NMI, so that users get an early
heads-up about a possible CPU hard stall.

Signed-off-by: Feng Tang
Reviewed-by: Petr Mladek
---
 lib/nmi_backtrace.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 33c154264bfe..a113d3d669be 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -75,7 +75,13 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
 		mdelay(1);
 		touch_softlockup_watchdog();
 	}
-	nmi_backtrace_stall_check(to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_warn("After 10 seconds, these CPUS still haven't responded to the NMI: %*pbl\n",
+			cpumask_pr_args(to_cpumask(backtrace_mask)));
+
+		nmi_backtrace_stall_check(to_cpumask(backtrace_mask));
+	}
 
 	/*
 	 * Force flush any remote buffers that might be stuck in IRQ context
-- 
2.39.5 (Apple Git-154)
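
For readers who want to try the logic outside the kernel, the following is a minimal userspace sketch of the pattern in this patch: poll a pending mask for a bounded number of iterations, then report whichever CPUs never cleared their bit. It is not the kernel implementation. `pending_mask_t`, `format_cpu_list()`, `wait_for_responses()`, `report_unresponsive()` and the `respond_at` table are hypothetical names invented for this illustration, a plain 64-bit integer stands in for `cpumask_t`, and the list formatting only approximates what `%*pbl` prints.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is still pending. */
typedef uint64_t pending_mask_t;

/*
 * Format the set bits of mask as a comma-separated list with ranges,
 * similar in spirit to the kernel's "%*pbl" output.
 * Returns the number of characters written, excluding the trailing NUL.
 */
static int format_cpu_list(pending_mask_t mask, char *buf, size_t len)
{
	size_t used = 0;
	int cpu = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	while (cpu < 64) {
		int start, n;

		if (!((mask >> cpu) & 1)) {
			cpu++;
			continue;
		}
		/* Extend the run while the next bit is also set. */
		start = cpu;
		while (cpu + 1 < 64 && ((mask >> (cpu + 1)) & 1))
			cpu++;

		if (start == cpu)
			n = snprintf(buf + used, len - used, "%s%d",
				     used ? "," : "", start);
		else
			n = snprintf(buf + used, len - used, "%s%d-%d",
				     used ? "," : "", start, cpu);

		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial entry on truncation */
			break;
		}
		used += (size_t)n;
		cpu++;
	}
	return (int)used;
}

/*
 * Simulated wait loop: respond_at[cpu] is the poll iteration at which that
 * CPU clears its bit, or a negative value if it never responds.
 * ncpus must not exceed 64. Returns the mask of CPUs still pending.
 */
static pending_mask_t wait_for_responses(pending_mask_t pending,
					 const int *respond_at, int ncpus,
					 int max_polls)
{
	for (int i = 0; i < max_polls && pending; i++) {
		for (int cpu = 0; cpu < ncpus; cpu++) {
			if (respond_at[cpu] == i)
				pending &= ~(UINT64_C(1) << cpu);
		}
	}
	return pending;
}

/* Warn only when something is still pending; returns 1 if a warning was printed. */
static int report_unresponsive(pending_mask_t pending, int max_polls)
{
	char list[256];

	if (pending == 0)
		return 0;

	format_cpu_list(pending, list, sizeof(list));
	fprintf(stderr, "After %d polls, these CPUs still haven't responded: %s\n",
		max_polls, list);
	return 1;
}
```

Calling `wait_for_responses(0xff, respond_at, 8, 10)` with `respond_at = {0, 3, -1, 1, -1, -1, 2, 0}` leaves CPUs 2, 4 and 5 pending, which `format_cpu_list()` renders as `2,4-5`. The emptiness check in `report_unresponsive()` plays the same role as the `cpumask_empty()` guard added by the patch: nothing is printed when every CPU has responded.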