From nobody Sat Feb 7 09:46:57 2026 Received: from stravinsky.debian.org (stravinsky.debian.org [82.195.75.108]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 796FC4A21; Mon, 2 Feb 2026 14:28:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=82.195.75.108 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1770042495; cv=none; b=Zye8Q5NsZpWgxKC/cpZjEUqdrDVWqXB9k2GvLY8mWPAfB37b9JWAX+/CbfU5utrTPjxncPS0G/XS6hTzqbrZscl9Ox+xra7Kt5SnYANKbLCE7J48EgWVGtvM8qo4nURTzEFfX+Hq1i11oz+gwe3GYAm/4MMFfmuZwR5zLtuvaJg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1770042495; c=relaxed/simple; bh=U8EjTz/5T7IsTrWh7ZVcmmDz3W/0lTlfkat5cxh/2+Q=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Y50Py/gsULTHm3HnfnVWuO17fQpnwnaaynLW3IftlOuRQFBIi7VQTpMf9zrCKyt45eh+Bg0zWpCPdoUTQoYxTAtALimnn/mrqHiAsNT9FihM9lRgPDRyfhHUZtbY40cVEm9h6+Xs0LHxKLgrUJ9K1vsJzYTt9i1r+yKinWo1WW0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org; spf=none smtp.mailfrom=debian.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b=Nhbu0VYP; arc=none smtp.client-ip=82.195.75.108 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=debian.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b="Nhbu0VYP" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debian.org; s=smtpauto.stravinsky; h=X-Debian-User:Cc:To:In-Reply-To:References: Message-Id:Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date: From:Reply-To:Content-ID:Content-Description; 
bh=pCFb4xxEV2N7KzX/8XoJeWKKN9wS7tz47sg5Z5hO/c4=; b=Nhbu0VYP5qTK/s7HyyQREOyam0 Zr2Pj3ksZVl3YspztLX8aZJrLG2MIPsanMki3avg4U09a8bbgGudCOOS/2DfdUluDn4wGg8Ikcf+D engXbwdRtM8ezeXXpTuFlc4VxUxROW3OMSUMa/17Q/o4Urwel3/BjJHaYy3PfXGwXG+KMRYjW9C5b rWrOjcrk/Dee/c2M+MZ6wOTvBeCRT/tCLKOO4ygn62B/oYUqO3aw52IKQkU2ALeTnPTJYsLtznVw/ FWhjpZIAxOzFozgN1FyqQ4rpLe3QJhz6ZfPFaSxowDpOSAz6PwpU/JDUBtQ8CuRYX+P2ZB+tINPyg 9AV7EUuA==; Received: from authenticated user by stravinsky.debian.org with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94.2) (envelope-from ) id 1vmuuO-0046Xd-NY; Mon, 02 Feb 2026 14:28:09 +0000 From: Breno Leitao Date: Mon, 02 Feb 2026 06:27:39 -0800 Subject: [PATCH v2 1/2] vmcoreinfo: expose hardware error recovery statistics via sysfs Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260202-vmcoreinfo_sysfs-v2-1-8f3b5308b894@debian.org> References: <20260202-vmcoreinfo_sysfs-v2-0-8f3b5308b894@debian.org> In-Reply-To: <20260202-vmcoreinfo_sysfs-v2-0-8f3b5308b894@debian.org> To: akpm@linux-foundation.org, bhe@redhat.com Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-acpi@vger.kernel.org, dyoung@redhat.com, tony.luck@intel.com, xueshuai@linux.alibaba.com, vgoyal@redhat.com, zhiquan1.li@intel.com, olja@meta.com, Breno Leitao , kernel-team@meta.com X-Mailer: b4 0.15-dev-f4305 X-Developer-Signature: v=1; a=openpgp-sha256; l=3683; i=leitao@debian.org; h=from:subject:message-id; bh=U8EjTz/5T7IsTrWh7ZVcmmDz3W/0lTlfkat5cxh/2+Q=; b=owEBbQKS/ZANAwAIATWjk5/8eHdtAcsmYgBpgLRtwiUfi29mGTnW4ixNZP7zWlKBlM3vszSJq AEuwxTr9ZeJAjMEAAEIAB0WIQSshTmm6PRnAspKQ5s1o5Of/Hh3bQUCaYC0bQAKCRA1o5Of/Hh3 bVU7D/4/ExbTJHPtttzMSuBhk++xqEFTGnO6/RhPBgJukq8LwzOrL2hTvO+rp4eeK8XuIhLXwwg bqHzyu8rolbjXfhhAaHIUgSvhChUcCPBHq8yzLTAJKn/I2EEVNOz/ZBR1sdHNNQD2D/Zdsn5EZ6 
 fW2qiKLSsVchqc/HvA7aGm19p4Wk5BIQ+cbiaiKgkzZqCKe0V49KH5KP1NiVlLFjR/lPVrO2jp6
 fCp/Z7HPXRA8+GpYjT9O6ScKytWNNGmDlN2qZOwbU/RNqGej1tUlbL/Xtcr9c0tZzh9AoZTUUdJ
 tO+RmLebtx5DBsWl5jOzTE9XcA/eQJFIYM/Hy4+/3BTRscC2Jo3CyJf+XicmaKxF/KG3CdnOjNQ
 4vhgMhqdKw5xeXA/+eJ5ZiVNymjHTM/ohxmqvrcq9LVj1Cd6Ppnm29eQBchEaWY5f8xJCGsnAtK
 Crnwg+xpSxZJT2CXLzbm2f2q7+GgSrw60Prdml0pQyPGTwl4DorPYv83f+5ji1wMkmE3sk0AJxF
 SxxrLb6nVEGwdT3edG53ysA++T6Y0oRN/0ZWzWvM+RIuARAOznbXwibF1eNELSbsQyAf+Vnl6Q+
 YT+QI/GpAb0Km9KE73ggMe3TFKAw1/8N4ZudMe7rXnTtyvg8hlhjuLwSd2YKPXq53KGojpz/U06
 uxWw7bKkJrqqotw==
X-Developer-Key: i=leitao@debian.org; a=openpgp;
 fpr=AC8539A6E8F46702CA4A439B35A3939FFC78776D
X-Debian-User: leitao

Add a sysfs directory at /sys/kernel/hwerr_recovery_stats/ to expose
hardware error recovery statistics that are already tracked by the
kernel. This allows userspace monitoring tools to track recovered
hardware errors without requiring kernel crashes.

This is useful to track recoverable hardware errors in a time series,
even if the host doesn't crash.

The sysfs directory contains one file per error subsystem:

  /sys/kernel/hwerr_recovery_stats/cpu    - CPU-related errors (MCE, ARM errors)
  /sys/kernel/hwerr_recovery_stats/memory - Memory-related errors
  /sys/kernel/hwerr_recovery_stats/pci    - PCI/PCIe AER non-fatal errors
  /sys/kernel/hwerr_recovery_stats/cxl    - CXL errors
  /sys/kernel/hwerr_recovery_stats/others - Other hardware errors

Each file contains a single integer representing the count of recovered
errors for that subsystem.

These statistics provide visibility into the health of the system's
hardware and can be used by system administrators to proactively detect
failing components before they cause system crashes.

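For illustration, a userspace consumer of these files could look roughly
like the sketch below. It assumes only what is stated above: five files
under /sys/kernel/hwerr_recovery_stats/, each holding one decimal integer
followed by a newline. The hwerr_* helper names and HWERR_STATS_DIR are
hypothetical and exist only in this example; they are not part of this
patch or of any existing tool.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Directory and file names as listed in this commit message. */
#define HWERR_STATS_DIR "/sys/kernel/hwerr_recovery_stats"

static const char *const hwerr_files[] = {
	"cpu", "memory", "pci", "cxl", "others",
};

/*
 * Parse the contents of one counter file: a single decimal integer,
 * optionally followed by a newline. Returns 0 on success, -1 otherwise.
 */
static int hwerr_parse_count(const char *text, long *count)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno || end == text)
		return -1;
	if (*end != '\n' && *end != '\0')
		return -1;
	*count = val;
	return 0;
}

/* Read one counter from dir/name. Returns 0 on success, -1 on any error. */
static int hwerr_read_count(const char *dir, const char *name, long *count)
{
	char path[256], buf[32];
	FILE *f;
	int n, ret = -1;

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = hwerr_parse_count(buf, count);
	fclose(f);
	return ret;
}

/*
 * Print one "name count" line per readable counter and return how many
 * counters were read. Unreadable files are skipped, so this also works
 * on kernels that do not provide the directory (it then returns 0).
 */
static int hwerr_dump(const char *dir, FILE *out)
{
	size_t i;
	long count;
	int ok = 0;

	for (i = 0; i < sizeof(hwerr_files) / sizeof(hwerr_files[0]); i++) {
		if (hwerr_read_count(dir, hwerr_files[i], &count))
			continue;
		fprintf(out, "%s %ld\n", hwerr_files[i], count);
		ok++;
	}
	return ok;
}
```

A monitoring agent would call hwerr_dump(HWERR_STATS_DIR, stdout) from its
own main() at a fixed interval and export the difference between
consecutive samples, since the counters only ever describe the time since
boot.
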
Signed-off-by: Breno Leitao
---
 kernel/vmcore_info.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
index e2784038bbed7..b7fcd21be7c59 100644
--- a/kernel/vmcore_info.c
+++ b/kernel/vmcore_info.c
@@ -6,6 +6,8 @@
 
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -139,6 +141,56 @@ void hwerr_log_error_type(enum hwerr_error_type src)
 }
 EXPORT_SYMBOL_GPL(hwerr_log_error_type);
 
+/* sysfs interface for hardware error recovery statistics */
+#define HWERR_ATTR_RO(_name, _type)					\
+static ssize_t _name##_show(struct kobject *kobj,			\
+			    struct kobj_attribute *attr, char *buf)	\
+{									\
+	return sysfs_emit(buf, "%d\n",					\
+			  atomic_read(&hwerr_data[_type].count));	\
+}									\
+static struct kobj_attribute hwerr_##_name##_attr = __ATTR_RO(_name)
+
+HWERR_ATTR_RO(cpu, HWERR_RECOV_CPU);
+HWERR_ATTR_RO(memory, HWERR_RECOV_MEMORY);
+HWERR_ATTR_RO(pci, HWERR_RECOV_PCI);
+HWERR_ATTR_RO(cxl, HWERR_RECOV_CXL);
+HWERR_ATTR_RO(others, HWERR_RECOV_OTHERS);
+
+static struct attribute *hwerr_recovery_stats_attrs[] = {
+	&hwerr_cpu_attr.attr,
+	&hwerr_memory_attr.attr,
+	&hwerr_pci_attr.attr,
+	&hwerr_cxl_attr.attr,
+	&hwerr_others_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group hwerr_recovery_stats_group = {
+	.attrs = hwerr_recovery_stats_attrs,
+};
+
+static struct kobject *hwerr_recovery_stats_kobj;
+
+static int __init hwerr_recovery_stats_init(void)
+{
+	hwerr_recovery_stats_kobj = kobject_create_and_add("hwerr_recovery_stats",
+							    kernel_kobj);
+	if (!hwerr_recovery_stats_kobj) {
+		pr_warn("Failed to create hwerr_recovery_stats kobject\n");
+		return -ENOMEM;
+	}
+
+	if (sysfs_create_group(hwerr_recovery_stats_kobj,
+			       &hwerr_recovery_stats_group)) {
+		kobject_put(hwerr_recovery_stats_kobj);
+		pr_warn("Failed to create hwerr_recovery_stats sysfs group\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
 static int __init crash_save_vmcoreinfo_init(void)
 {
 	vmcoreinfo_data = (unsigned char *)get_zeroed_page(GFP_KERNEL);
@@ -248,6 +300,9 @@ static int __init crash_save_vmcoreinfo_init(void)
 	arch_crash_save_vmcoreinfo();
 	update_vmcoreinfo_note();
 
+	/* Create /sys/kernel/hwerr_recovery_stats/ directory */
+	hwerr_recovery_stats_init();
+
 	return 0;
 }
 
-- 
2.47.3

From nobody Sat Feb 7 09:46:57 2026
Received: from stravinsky.debian.org (stravinsky.debian.org [82.195.75.108])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id D3C4836D4EB;
 Mon, 2 Feb 2026 14:28:15 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=82.195.75.108
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1770042497; cv=none;
 b=mBuqkteg/63ZoZ57ZwFQHMzOAV/M5KnoW5Ostb4P+j+QUN3uoG3Cjt4zWbD74WpTeIyfLNgeWMTZgVncfRrlQWBkK08vuVyFzIiEPTueAwI8VHK7J0cZ1Ps83fMCPed0qc82+f0T70gB9lYok4/M+rO0iFRUxr22hQVAoZrmmRo=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1770042497; c=relaxed/simple;
 bh=18gfPfnS76lTnOnvubbkQRKtlWGXJGHF5diwW/VEeLw=;
 h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc;
 b=DOLvzMB+y1arspPyl9kUb7NlFhZFTT8Cbx0VeclaOZ1A1iol8JvcsUhDcTGzlMfWoJ3gIfHxb1pcZNIdkfXxdpBdLHs9hI/sw81Y3C7OO92eZYRl3FYZAABGv17OJcu6T9MeQzsF1B2ylx5lwbg1c5i29slTCB7zuU6kdfT8uEM=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org; spf=none smtp.mailfrom=debian.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b=aTgBNQQh; arc=none smtp.client-ip=82.195.75.108
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org
Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=debian.org
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org
header.b="aTgBNQQh" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debian.org; s=smtpauto.stravinsky; h=X-Debian-User:Cc:To:In-Reply-To:References: Message-Id:Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date: From:Reply-To:Content-ID:Content-Description; bh=BsYw1jdHI6UNaHJ0xGEA+g0R+OyYdE59KzSX/GWoWYg=; b=aTgBNQQhltAuw96wTOS8uy9t3o u5GM4R+satqAZlav3VkjHQMCbIG+RU7NskjwGA0dSj4hMrzgo3GbAQY6aonNxs5O34lCC8JVWXB3v 20VQ2YCokPSqSK2xreRzSFq2dcdhkY71p3FBlRSJ2FhYaloOS6gMTGmL/sJzYm7H8d2IyQ3Gk3Kwy x8pNfhc+PtmsR1Yq3cwsj/0kUr/8gf54yPRr6qfmF39F41WG2wO3E8fcR4C0B+soL7U3O+sRwqt1w ZVwyNntOpXeIXXsnuzW2YYLxdpZPW2fi2O9huq1p51c0Ip/wltzgBdMk6rNaGHud87OYBeFFeLHmu 1y8f8MTQ==; Received: from authenticated user by stravinsky.debian.org with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94.2) (envelope-from ) id 1vmuuT-0046Xj-1B; Mon, 02 Feb 2026 14:28:13 +0000 From: Breno Leitao Date: Mon, 02 Feb 2026 06:27:40 -0800 Subject: [PATCH v2 2/2] docs: add ABI documentation for /sys/kernel/hwerr_recovery_stats/ Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260202-vmcoreinfo_sysfs-v2-2-8f3b5308b894@debian.org> References: <20260202-vmcoreinfo_sysfs-v2-0-8f3b5308b894@debian.org> In-Reply-To: <20260202-vmcoreinfo_sysfs-v2-0-8f3b5308b894@debian.org> To: akpm@linux-foundation.org, bhe@redhat.com Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-acpi@vger.kernel.org, dyoung@redhat.com, tony.luck@intel.com, xueshuai@linux.alibaba.com, vgoyal@redhat.com, zhiquan1.li@intel.com, olja@meta.com, Breno Leitao , kernel-team@meta.com X-Mailer: b4 0.15-dev-f4305 X-Developer-Signature: v=1; a=openpgp-sha256; l=3053; i=leitao@debian.org; h=from:subject:message-id; bh=18gfPfnS76lTnOnvubbkQRKtlWGXJGHF5diwW/VEeLw=; 
 b=owEBbQKS/ZANAwAIATWjk5/8eHdtAcsmYgBpgLRtcjZ4+WBywa2wxT1191HPWP7OKn4XR6Tn9
 uqH93TzJzmJAjMEAAEIAB0WIQSshTmm6PRnAspKQ5s1o5Of/Hh3bQUCaYC0bQAKCRA1o5Of/Hh3
 bXbeEACoupF1Pf8XMs+rEdztojnWgDL4it51PtcayAHw+CQ6Wdp1MY6lxhkmj0g1+YNf+SwxbRl
 QynpeiRWj8Q15AddCGhpfuh7Uj4EuytJ6LXF18iIygYbwxlqC6GHxcYALJDCe/0RCpvnmrB5t0C
 KFSumDtE69FQYF2XESSku2W12rRODBRbnqvMWXdUkdVUobQyWmVhzGrntc7YDDX8MAg9si7Q6RQ
 WRrOTx28/G/RnNnzV/y4JBPjPJWV6Ibb7xGBrpVoiOJSJzEkGPBbKl6ZE+Fm3a+CmQp50aOLeWp
 yEscW50qSksuh0MyMdyDr1YhG+1vibIu/YUePZscz/cU7lK2dO+lORrafpeeEHHVbS5mnxmeM+R
 XDzkv+QiSPJIz1SAblR4iVeIwJYVuhI8irPOuv5KvdGWG1LE9aCEb/IZQKF3zQ3fOl5ywUK97JR
 SgoWnzj4yei259VzhSdUsgb9UR3RhGX8IX5TN8YJ7Yip9jK2Q4h3GL9loYly8g0c3ptWRDSaV4f
 eL50p/ADmpWoj899YL94PfCLUhZ8LSx+/QrwOdWmq3n6l46BPoEdrrF9LcXVnNDXIjw1tfAlhA8
 SAOGRYTZHatuaHp8sqoHRksVztne8OSrCPc70LI+pubdyQxWbmjuYKvYSVq59R3S/0PBxZm0oAF
 6TW1S81PupfKqIw==
X-Developer-Key: i=leitao@debian.org; a=openpgp;
 fpr=AC8539A6E8F46702CA4A439B35A3939FFC78776D
X-Debian-User: leitao

Document the new hwerr_recovery_stats sysfs directory that exposes
hardware error recovery statistics.

Update hw-recoverable-errors.rst to reference the new sysfs interface
for runtime monitoring.

Signed-off-by: Breno Leitao
---
 .../ABI/testing/sysfs-kernel-hwerr_recovery_stats  | 47 ++++++++++++++++++++++
 Documentation/driver-api/hw-recoverable-errors.rst |  3 +-
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-hwerr_recovery_stats b/Documentation/ABI/testing/sysfs-kernel-hwerr_recovery_stats
new file mode 100644
index 0000000000000..4cb9f5a89fba9
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-hwerr_recovery_stats
@@ -0,0 +1,47 @@
+What:		/sys/kernel/hwerr_recovery_stats/
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao
+Description:
+		Directory containing hardware error recovery statistics.
+		These statistics track recoverable hardware errors that the
+		kernel has handled since boot.
+
+		Each file contains a single integer representing the count
+		of recovered errors for that subsystem.
+
+What:		/sys/kernel/hwerr_recovery_stats/cpu
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao
+Description:
+		Count of CPU-related recovered errors (MCE, ARM processor
+		errors).
+
+What:		/sys/kernel/hwerr_recovery_stats/memory
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao
+Description:
+		Count of memory-related recovered errors.
+
+What:		/sys/kernel/hwerr_recovery_stats/pci
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao
+Description:
+		Count of PCI/PCIe AER non-fatal recovered errors.
+
+What:		/sys/kernel/hwerr_recovery_stats/cxl
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao
+Description:
+		Count of CXL (Compute Express Link) recovered errors.
+
+What:		/sys/kernel/hwerr_recovery_stats/others
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao
+Description:
+		Count of other hardware recovered errors.
diff --git a/Documentation/driver-api/hw-recoverable-errors.rst b/Documentation/driver-api/hw-recoverable-errors.rst
index fc526c3454bd7..4aefcd103be22 100644
--- a/Documentation/driver-api/hw-recoverable-errors.rst
+++ b/Documentation/driver-api/hw-recoverable-errors.rst
@@ -36,7 +36,8 @@ Data Exposure and Consumption
   types like CPU, memory, PCI, CXL, and others.
 - It is exposed via vmcoreinfo crash dump notes and can be read using
   tools like `crash`, `drgn`, or other kernel crash analysis utilities.
-- There is no other way to read these data other than from crash dumps.
+- It is also exposed via sysfs at ``/sys/kernel/hwerr_recovery_stats/`` for runtime
+  monitoring without requiring a crash dump.
 - These errors are divided by area, which includes CPU, Memory, PCI, CXL and
   others.
 
-- 
2.47.3