From: Breno Leitao
Date: Thu, 29 Jan 2026 05:34:10 -0800
Subject: [PATCH] vmcore_info: expose hardware error recovery statistics via sysfs
Message-Id: <20260129-vmcoreinfo_sysfs-v1-1-164c1fe1fe07@debian.org>
X-Mailing-List: linux-kernel@vger.kernel.org
To: akpm@linux-foundation.org, bhe@redhat.com
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, Breno Leitao,
 kexec@lists.infradead.org, dyoung@redhat.com, tony.luck@intel.com,
 xueshuai@linux.alibaba.com, vgoyal@redhat.com, zhiquan1.li@intel.com,
 olja@meta.com

Add a sysfs file at /sys/kernel/vmcore_stats that exposes the hardware
error recovery statistics the kernel already tracks. This lets userspace
monitoring tools record recovered hardware errors as a time series, even
on hosts that never crash.

Create a generic vmcore_stats sysfs file and add a hardware error
recovery section that shows a count and a timestamp per subsystem:

  - cpu: CPU-related errors (MCE, ARM processor errors)
  - memory: Memory-related errors
  - pci: PCI/PCIe AER non-fatal errors
  - cxl: CXL errors
  - other: Other hardware errors

Example output:

  Recovered hardware errors:
   cpu: 0 (0)
   memory: 2 (1738148257)
   pci: 1 (1738147000)
   cxl: 0 (0)
   other: 0 (0)

The value in parentheses is the timestamp (seconds since epoch) of the
last error of that type, or 0 if no errors have occurred.

These statistics give visibility into the health of the system's
hardware, so system administrators can detect failing components
before they cause system crashes.

Signed-off-by: Breno Leitao
---
To: akpm@linux-foundation.org
Cc: kexec@lists.infradead.org
To: bhe@redhat.com
Cc: linux-kernel@vger.kernel.org
Cc: dyoung@redhat.com
Cc: tony.luck@intel.com
Cc: xueshuai@linux.alibaba.com
Cc: vgoyal@redhat.com
Cc: zhiquan1.li@intel.com
Cc: olja@meta.com
---
 .../ABI/testing/sysfs-kernel-vmcore_stats | 23 ++++++++++++++++
 kernel/vmcore_info.c                      | 31 ++++++++++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-vmcore_stats b/Documentation/ABI/testing/sysfs-kernel-vmcore_stats
new file mode 100644
index 0000000000000..b42f18d24c00b
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-vmcore_stats
@@ -0,0 +1,23 @@
+What:		/sys/kernel/vmcore_stats
+Date:		January 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao
+Description:
+		Shows statistics related to vmcore functionality. Currently
+		includes hardware error recovery statistics.
+
+		Format:
+		  Recovered hardware errors:
+		    metric: count (timestamp)
+
+		Statistics about recoverable hardware errors that the kernel
+		has handled since boot. Each metric shows the count and
+		timestamp (seconds since epoch) of the last error in
+		parentheses (0 if no errors have occurred).
+
+		Metrics:
+		- cpu: CPU-related errors (MCE, ARM processor errors)
+		- memory: Memory-related errors
+		- pci: PCI/PCIe AER non-fatal errors
+		- cxl: CXL (Compute Express Link) errors
+		- other: Other hardware errors
diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
index fe9bf8db1922e..5974b4be08cbc 100644
--- a/kernel/vmcore_info.c
+++ b/kernel/vmcore_info.c
@@ -6,6 +6,8 @@
 
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -135,6 +137,31 @@ void hwerr_log_error_type(enum hwerr_error_type src)
 }
 EXPORT_SYMBOL_GPL(hwerr_log_error_type);
 
+/* sysfs interface for hardware error recovery statistics */
+static ssize_t vmcore_stats_show(struct kobject *kobj,
+				 struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf,
+			  "Recovered hardware errors:\n"
+			  " cpu: %d (%lld)\n"
+			  " memory: %d (%lld)\n"
+			  " pci: %d (%lld)\n"
+			  " cxl: %d (%lld)\n"
+			  " other: %d (%lld)\n",
+			  atomic_read(&hwerr_data[HWERR_RECOV_CPU].count),
+			  (long long)READ_ONCE(hwerr_data[HWERR_RECOV_CPU].timestamp),
+			  atomic_read(&hwerr_data[HWERR_RECOV_MEMORY].count),
+			  (long long)READ_ONCE(hwerr_data[HWERR_RECOV_MEMORY].timestamp),
+			  atomic_read(&hwerr_data[HWERR_RECOV_PCI].count),
+			  (long long)READ_ONCE(hwerr_data[HWERR_RECOV_PCI].timestamp),
+			  atomic_read(&hwerr_data[HWERR_RECOV_CXL].count),
+			  (long long)READ_ONCE(hwerr_data[HWERR_RECOV_CXL].timestamp),
+			  atomic_read(&hwerr_data[HWERR_RECOV_OTHERS].count),
+			  (long long)READ_ONCE(hwerr_data[HWERR_RECOV_OTHERS].timestamp));
+}
+
+static struct kobj_attribute vmcore_stats_attr = __ATTR_RO(vmcore_stats);
+
 static int __init crash_save_vmcoreinfo_init(void)
 {
 	vmcoreinfo_data = (unsigned char *)get_zeroed_page(GFP_KERNEL);
@@ -244,6 +271,10 @@ static int __init crash_save_vmcoreinfo_init(void)
 	arch_crash_save_vmcoreinfo();
 	update_vmcoreinfo_note();
 
+	/* Create /sys/kernel/vmcore_stats */
+	if (sysfs_create_file(kernel_kobj, &vmcore_stats_attr.attr))
+		pr_warn("Failed to create vmcore_stats sysfs file\n");
+
 	return 0;
 }
 
---
base-commit: 8dfce8991b95d8625d0a1d2896e42f93b9d7f68d
change-id: 20260129-vmcoreinfo_sysfs-ff4687979cd5

Best regards,
-- 
Breno Leitao
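
For reference, a userspace consumer of the "metric: count (timestamp)"
format documented in the ABI file might look roughly like the sketch
below. It is not part of the patch. The names struct hwerr_metric,
parse_vmcore_stats(), total_recovered(), MAX_METRICS and sample_output
are hypothetical, and the sample text is modelled on the example in the
commit message, so its exact whitespace is an assumption. The parser
skips any line that does not match the pattern, which is how it ignores
the "Recovered hardware errors:" header.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_METRICS 8	/* hypothetical bound; the patch emits 5 metrics */

/* Hypothetical userspace record for one line of vmcore_stats. */
struct hwerr_metric {
	char name[16];
	int count;
	long long last_ts;	/* seconds since epoch, 0 if never seen */
};

/* Sample text modelled on the commit message (whitespace assumed). */
static const char sample_output[] =
	"Recovered hardware errors:\n"
	" cpu: 0 (0)\n"
	" memory: 2 (1738148257)\n"
	" pci: 1 (1738147000)\n"
	" cxl: 0 (0)\n"
	" other: 0 (0)\n";

/*
 * Parse "name: count (timestamp)" lines from text into out[].
 * Lines that do not match (such as the header) are skipped.
 * Returns the number of metrics stored, at most max.
 */
int parse_vmcore_stats(const char *text, struct hwerr_metric *out, int max)
{
	int n = 0;

	assert(text && out);
	while (*text && n < max) {
		const char *eol = strchr(text, '\n');
		size_t len = eol ? (size_t)(eol - text) : strlen(text);
		char line[128];

		if (len < sizeof(line)) {
			memcpy(line, text, len);
			line[len] = '\0';
			if (sscanf(line, " %15[^:]: %d (%lld)", out[n].name,
				   &out[n].count, &out[n].last_ts) == 3)
				n++;
		}
		text += len + (eol ? 1 : 0);
	}
	return n;
}

/* Sum the per-subsystem counts. */
int total_recovered(const struct hwerr_metric *m, int n)
{
	int total = 0;

	for (int i = 0; i < n; i++)
		total += m[i].count;
	return total;
}
```

A monitoring agent could read /sys/kernel/vmcore_stats into a buffer,
pass it to parse_vmcore_stats(), and compare each count with the
previous sample to detect new recovered errors between polls.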