From nobody Mon Jun 8 04:25:37 2026 Received: from out30-111.freemail.mail.aliyun.com (out30-111.freemail.mail.aliyun.com [115.124.30.111]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 10F8F230264; Tue, 2 Jun 2026 07:15:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.111 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384562; cv=none; b=oKOigF81f7DeQRV/ZDzVAS3YPjgBO0XmL2gimkGdnj/5ONsKAx7HGxb3dsO8OA0mZr1+ww/xY3SLB6t+sC2XJgpUHxTcbUBouSUQOBUas+E9BEZF8IvvTB0ggRsETuNDra2GkaXf8eQCWixYbM4bZlCHPrVY88fU8TidHKrzOgg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384562; c=relaxed/simple; bh=HgyMTH/zffk+7pHCe6NYto1bA5KcvSFsyYjMSj035Jw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ZyzVue7Zhs/fpD7u4C+mgvJPcb8VkqhnABhGI/FRY1cEbwhC7V3pbHuIPYZ+lNKQBKEwO0H5YO9tcBj2U4An8DcXUxbl4G6oBj6RLzKHPMbYRJ/QwXIgHAlrtZQEQuC1xBVCu4GwIVjVue+3i4jIm4hItTQ+ctCco1b17Hoi2ic= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=BzfJ+BSw; arc=none smtp.client-ip=115.124.30.111 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="BzfJ+BSw" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384550; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=itObtE7tOGkEONQAFJ+mBj1BRHYMUlPupZz3p9rp7M0=; 
b=BzfJ+BSw4T2xhIfiroktiAspcUf9L/bkEGdPj9/3DH2KlCyaMszodHhkatnYiYSSr6bW1pnFVPb/tD8v+i5+jXJab88I75z9HkEZYRfi42TYD4G6xynXcHOynHEpYr9PkMyUt4W804KP6d5r9hPucqxl7nrTa1ksbLrdgOL01vU= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R231e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045133197;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2YA_1780384546; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2YA_1780384546 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:15:48 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 01/16] ACPI/AEST: Register arm64_ras platform devices from AEST v2 Date: Tue, 2 Jun 2026 15:15:24 +0800 Message-ID: <20260602071540.3711528-2-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Parse the ARM Error Source Table (AEST) v2 [1] and present each error-source node to the RAS subsystem as a generic platform device. Rather than letting the RAS driver consume AEST-specific structures directly, all per-node metadata (interface type, register bases, group format, record bitmaps, GSIVs, vendor data) is conveyed via fwnode software-node properties. 
This keeps every AEST encoding detail in the ACPI front-end and lets the same back-end driver bind unchanged when a Device Tree front-end is added later. If the interface flags indicate an associated ACPI namespace device (AEST_XFACE_FLAG_ERROR_DEVICE), the companion ACPI device is looked up and attached so that downstream drivers can reach it. [1]: https://developer.arm.com/documentation/den0085/0200/ Signed-off-by: Ruidong Tian --- MAINTAINERS | 8 ++ drivers/acpi/arm64/Kconfig | 10 ++ drivers/acpi/arm64/Makefile | 1 + drivers/acpi/arm64/aest.c | 256 ++++++++++++++++++++++++++++++++++++ include/linux/acpi_aest.h | 19 +++ 5 files changed, 294 insertions(+) create mode 100644 drivers/acpi/arm64/aest.c create mode 100644 include/linux/acpi_aest.h diff --git a/MAINTAINERS b/MAINTAINERS index c3fe46d7c4bc..16c80a7ea72c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -344,6 +344,14 @@ S: Maintained F: drivers/acpi/arm64 F: include/linux/acpi_iort.h =20 +ACPI AEST +M: Ruidong Tian +L: linux-acpi@vger.kernel.org +L: linux-arm-kernel@lists.infradead.org +S: Supported +F: drivers/acpi/arm64/aest.c +F: include/linux/acpi_aest.h + ACPI FOR RISC-V (ACPI/riscv) M: Sunil V L L: linux-acpi@vger.kernel.org diff --git a/drivers/acpi/arm64/Kconfig b/drivers/acpi/arm64/Kconfig index f2fd79f22e7d..49b487bba928 100644 --- a/drivers/acpi/arm64/Kconfig +++ b/drivers/acpi/arm64/Kconfig @@ -24,3 +24,13 @@ config ACPI_APMT =20 config ACPI_MPAM bool + +config ACPI_AEST + bool "ARM Error Source Table Support" + depends on ARM64_RAS_EXTN + help + The Arm Error Source Table (AEST) provides details on ACPI + extensions that enable kernel-first handling of errors in a + system that supports the Armv8 RAS extensions. + + If set, the kernel will report and log hardware errors. 
diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile index 9390b57cb564..bad77fdbf8dd 100644 --- a/drivers/acpi/arm64/Makefile +++ b/drivers/acpi/arm64/Makefile @@ -7,5 +7,6 @@ obj-$(CONFIG_ACPI_IORT) +=3D iort.o obj-$(CONFIG_ACPI_MPAM) +=3D mpam.o obj-$(CONFIG_ACPI_PROCESSOR_IDLE) +=3D cpuidle.o obj-$(CONFIG_ARM_AMBA) +=3D amba.o +obj-$(CONFIG_ACPI_AEST) +=3D aest.o obj-y +=3D dma.o init.o obj-y +=3D thermal_cpufreq.o diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c new file mode 100644 index 000000000000..8cf24467d0c2 --- /dev/null +++ b/drivers/acpi/arm64/aest.c @@ -0,0 +1,256 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ARM Error Source Table Support + * + * Copyright (c) 2025, Alibaba Group. + */ + +#include +#include +#include +#include + +#include "init.h" + +#undef pr_fmt +#define pr_fmt(fmt) "ACPI AEST: " fmt + +/* + * Fill the per-AEST-entry inner properties (node-type / interface-type / + * group-format / record bitmaps / register bases ...). 
+ */ +static int __init +aest_init_node_props(struct acpi_aest_hdr *hdr, struct property_entry *pro= ps, + int *p, struct platform_device *pdev) +{ + struct acpi_aest_node_interface_header *interface; + struct acpi_aest_node_interface_common *common =3D NULL; + u64 *record_implemented =3D NULL; + u64 *status_reporting =3D NULL; + u64 *addressing_mode =3D NULL; + int group_len =3D 0, i; + size_t len; + + interface =3D ACPI_ADD_PTR(struct acpi_aest_node_interface_header, + hdr, hdr->node_interface_offset); + switch (interface->group_format) { + case ACPI_AEST_NODE_GROUP_FORMAT_4K: { + struct acpi_aest_node_interface_4k *itf =3D + (struct acpi_aest_node_interface_4k *)(interface + 1); + + record_implemented =3D &itf->error_record_implemented; + status_reporting =3D &itf->error_status_reporting; + addressing_mode =3D &itf->addressing_mode; + group_len =3D 1; + common =3D &itf->common; + break; + } + case ACPI_AEST_NODE_GROUP_FORMAT_16K: { + struct acpi_aest_node_interface_16k *itf =3D + (struct acpi_aest_node_interface_16k *)(interface + 1); + + record_implemented =3D itf->error_record_implemented; + status_reporting =3D itf->error_status_reporting; + addressing_mode =3D itf->addressing_mode; + group_len =3D 4; + common =3D &itf->common; + break; + } + case ACPI_AEST_NODE_GROUP_FORMAT_64K: { + struct acpi_aest_node_interface_64k *itf =3D + (struct acpi_aest_node_interface_64k *)(interface + 1); + + record_implemented =3D itf->error_record_implemented; + status_reporting =3D itf->error_status_reporting; + addressing_mode =3D itf->addressing_mode; + group_len =3D 14; + common =3D &itf->common; + break; + } + default: + pr_err("invalid group format: %d\n", interface->group_format); + return -EINVAL; + } + + if (interface->flags & AEST_XFACE_FLAG_ERROR_DEVICE) { + struct acpi_device *companion; + char uid[16]; + int n; + + n =3D snprintf(uid, sizeof(uid), "%u", + common->error_node_device); + if (n > 0 && n < sizeof(uid)) { + companion =3D 
acpi_dev_get_first_match_dev("ARMHE000", + uid, -1); + if (companion) { + ACPI_COMPANION_SET(&pdev->dev, companion); + acpi_dev_put(companion); + } else { + pr_debug("MSC.%u: missing namespace entry\n", + common->error_node_device); + } + } + } + + props[(*p)++] =3D PROPERTY_ENTRY_U8("arm,node-type", hdr->type); + props[(*p)++] =3D PROPERTY_ENTRY_U8("arm,group-format", + interface->group_format); + props[(*p)++] =3D PROPERTY_ENTRY_U32("arm,error-records-count", + interface->error_record_count); + props[(*p)++] =3D PROPERTY_ENTRY_U32("arm,error-records-index", + interface->error_record_index); + props[(*p)++] =3D PROPERTY_ENTRY_U32("arm,interface-flags", + interface->flags); + props[(*p)++] =3D PROPERTY_ENTRY_U64_ARRAY_LEN("arm,record-implemented", + record_implemented, + group_len); + props[(*p)++] =3D PROPERTY_ENTRY_U64_ARRAY_LEN("arm,status-reporting", + status_reporting, + group_len); + props[(*p)++] =3D PROPERTY_ENTRY_U64("arm,error-group-base", + common->error_group_register_base); + + len =3D hdr->node_interface_offset - hdr->node_specific_offset; + props[(*p)++] =3D + PROPERTY_ENTRY_U8_ARRAY_LEN("arm,node-specific-data", + ACPI_ADD_PTR(u8, hdr, hdr->node_specific_offset), len); + + return 0; +} + +static int __init +aest_create_node_fwnode(struct acpi_aest_hdr *hdr, struct platform_device = *pdev) +{ + struct property_entry props[10] =3D { }; + int p =3D 0; + int ret; + + ret =3D aest_init_node_props(hdr, props, &p, pdev); + if (ret) + return ret; + + return device_create_managed_software_node(&pdev->dev, props, NULL); +} + +static int aest_node_mem_size(u8 group_format) +{ + switch (group_format) { + case ACPI_AEST_NODE_GROUP_FORMAT_4K: + return SZ_4K; + case ACPI_AEST_NODE_GROUP_FORMAT_16K: + return SZ_16K; + case ACPI_AEST_NODE_GROUP_FORMAT_64K: + return SZ_64K; + default: + return SZ_4K; + } +} + +DEFINE_FREE(res, struct resource *, if (_T) kfree(_T)) + +static struct platform_device *__init +acpi_aest_alloc_pdev(struct acpi_aest_hdr *aest_hdr) +{ + 
struct platform_device *pdev __free(platform_device_put) =3D + platform_device_alloc("arm64_ras", PLATFORM_DEVID_AUTO); + struct resource *res __free(res) =3D NULL; + struct acpi_aest_node_interface_header *interface; + int ret, j =3D 0; + + if (!pdev) + return ERR_PTR(-ENOMEM); + + res =3D kcalloc(1, sizeof(*res), GFP_KERNEL); + if (!res) + return ERR_PTR(-ENOMEM); + + interface =3D ACPI_ADD_PTR(struct acpi_aest_node_interface_header, + aest_hdr, aest_hdr->node_interface_offset); + if (interface->type !=3D ACPI_AEST_NODE_SYSTEM_REGISTER) { + res[j].name =3D AEST_NODE_NAME; + res[j].start =3D interface->address; + res[j].end =3D res[j].start + aest_node_mem_size(interface->group_format= ) - 1; + res[j].flags =3D IORESOURCE_MEM; + j++; + } + + ret =3D platform_device_add_resources(pdev, res, j); + if (ret) + return ERR_PTR(ret); + + return_ptr(pdev); +} + +static int __init acpi_aest_init_node(struct acpi_aest_hdr *aest_hdr) +{ + struct platform_device *pdev __free(platform_device_put) =3D NULL; + int ret; + + pdev =3D acpi_aest_alloc_pdev(aest_hdr); + if (IS_ERR(pdev)) + return PTR_ERR(pdev); + + ret =3D aest_create_node_fwnode(aest_hdr, pdev); + if (ret) + return ret; + + ret =3D platform_device_add(pdev); + if (ret) + return ret; + pr_debug("Platform device added for AEST node: %s.%d\n", + pdev->name, pdev->id); + retain_and_null_ptr(pdev); + + return 0; +} + +static int __init acpi_aest_init_nodes(struct acpi_table_header *aest_tabl= e) +{ + struct acpi_aest_hdr *aest_node, *aest_end; + struct acpi_table_aest *aest; + int rc; + + aest =3D (struct acpi_table_aest *)aest_table; + aest_node =3D ACPI_ADD_PTR(struct acpi_aest_hdr, aest, + sizeof(struct acpi_table_header)); + aest_end =3D ACPI_ADD_PTR(struct acpi_aest_hdr, aest, aest_table->length); + + while (aest_node < aest_end) { + if (((u64)aest_node + aest_node->length) > (u64)aest_end) { + pr_warn(FW_WARN + "AEST node pointer overflow, bad table.\n"); + return -EINVAL; + } + + rc =3D 
acpi_aest_init_node(aest_node); + if (rc) + return rc; + + aest_node =3D ACPI_ADD_PTR(struct acpi_aest_hdr, aest_node, + aest_node->length); + } + + return 0; +} + +static int __init acpi_aest_init(void) +{ + int ret; + + if (acpi_disabled) + return 0; + + struct acpi_table_header *aest_table __free(acpi_put_table) =3D + acpi_get_table_pointer(ACPI_SIG_AEST, 0); + if (IS_ERR(aest_table)) + return 0; + + ret =3D acpi_aest_init_nodes(aest_table); + if (ret) { + pr_err("Failed init aest node %d\n", ret); + return ret; + } + + return 0; +} +subsys_initcall_sync(acpi_aest_init); diff --git a/include/linux/acpi_aest.h b/include/linux/acpi_aest.h new file mode 100644 index 000000000000..e485a6236891 --- /dev/null +++ b/include/linux/acpi_aest.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ACPI_AEST_H__ +#define __ACPI_AEST_H__ + +#include + +/* AEST resource name */ +#define AEST_NODE_NAME "AEST:NODE" + +/* AEST interface */ +#define AEST_XFACE_FLAG_SHARED BIT(0) +#define AEST_XFACE_FLAG_CLEAR_MISC BIT(1) +#define AEST_XFACE_FLAG_ERROR_DEVICE BIT(2) +#define AEST_XFACE_FLAG_AFFINITY BIT(3) +#define AEST_XFACE_FLAG_ERROR_GROUP BIT(4) +#define AEST_XFACE_FLAG_FAULT_INJECT BIT(5) +#define AEST_XFACE_FLAG_INT_CONFIG BIT(6) + +#endif /* __ACPI_AEST_H__ */ --=20 2.51.2.612.gdc70283dfc
From nobody Mon Jun 8 04:25:37 2026 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 02/16] arm64: ras: Add probe/remove for arm64_ras driver Date: Tue, 2 Jun 2026 15:15:25 +0800 Message-ID: <20260602071540.3711528-3-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Introduce the back-end platform driver that binds to the devices created by the AEST front-end. Driver input is taken exclusively from fwnode properties, so the same probe path serves any future front-end (DT, hand-rolled) without conditional code. The probe builds two layers of state: - struct ras_node: one AEST error source - struct ras_record: one error record inside a node This split mirrors the hardware: a node owns the shared MMIO/ERRGSR window and policy, while records are the unit at which errors are reported, masked and polled. Later patches plug interrupts, decoding, storm mitigation and userspace ABI onto these two objects without touching the front-end.
Signed-off-by: Umang Chheda Signed-off-by: Ruidong Tian --- MAINTAINERS | 2 + arch/arm64/include/asm/ras.h | 15 ++ drivers/ras/Kconfig | 1 + drivers/ras/Makefile | 1 + drivers/ras/arm64/Kconfig | 16 +++ drivers/ras/arm64/Makefile | 5 + drivers/ras/arm64/ras-core.c | 266 +++++++++++++++++++++++++++++++++++ drivers/ras/arm64/ras.h | 104 ++++++++++++++ include/linux/acpi_aest.h | 3 + 9 files changed, 413 insertions(+) create mode 100644 arch/arm64/include/asm/ras.h create mode 100644 drivers/ras/arm64/Kconfig create mode 100644 drivers/ras/arm64/Makefile create mode 100644 drivers/ras/arm64/ras-core.c create mode 100644 drivers/ras/arm64/ras.h diff --git a/MAINTAINERS b/MAINTAINERS index 16c80a7ea72c..766d1240b465 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -349,7 +349,9 @@ M: Ruidong Tian L: linux-acpi@vger.kernel.org L: linux-arm-kernel@lists.infradead.org S: Supported +F: arch/arm64/include/asm/ras.h F: drivers/acpi/arm64/aest.c +F: drivers/ras/arm64/ F: include/linux/acpi_aest.h =20 ACPI FOR RISC-V (ACPI/riscv) diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h new file mode 100644 index 000000000000..b6640b9972bf --- /dev/null +++ b/arch/arm64/include/asm/ras.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_RAS_H +#define __ASM_RAS_H + +#include + +struct ras_ext_regs { + u64 err_fr; + u64 err_ctlr; + u64 err_status; + u64 err_addr; + u64 err_misc[4]; +}; + +#endif /* __ASM_RAS_H */ diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig index fc4f4bb94a4c..61e545993609 100644 --- a/drivers/ras/Kconfig +++ b/drivers/ras/Kconfig @@ -33,6 +33,7 @@ if RAS =20 source "arch/x86/ras/Kconfig" source "drivers/ras/amd/atl/Kconfig" +source "drivers/ras/arm64/Kconfig" =20 config RAS_FMPM tristate "FRU Memory Poison Manager" diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile index 11f95d59d397..1b62a3017fa3 100644 --- a/drivers/ras/Makefile +++ b/drivers/ras/Makefile @@ -5,3 +5,4 @@ obj-$(CONFIG_RAS_CEC) +=3D cec.o =20 
obj-$(CONFIG_RAS_FMPM) +=3D amd/fmpm.o obj-y +=3D amd/atl/ +obj-$(CONFIG_ARM64_RAS_DRIVER) +=3D arm64/ diff --git a/drivers/ras/arm64/Kconfig b/drivers/ras/arm64/Kconfig new file mode 100644 index 000000000000..dcdeaa216d67 --- /dev/null +++ b/drivers/ras/arm64/Kconfig @@ -0,0 +1,16 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# ARM Error Source Table Support +# +# Copyright (c) 2025, Alibaba Group. +# + +config ARM64_RAS_DRIVER + tristate "ARM64 RAS Driver" + depends on ARM64 && ACPI_AEST && RAS + help + This is the RAS driver for the arm64 architecture. It depends on + the Arm Error Source Table (AEST) to provide basic register and + interrupt information. + + If set, the kernel will report and process hardware errors. diff --git a/drivers/ras/arm64/Makefile b/drivers/ras/arm64/Makefile new file mode 100644 index 000000000000..c5387f05a067 --- /dev/null +++ b/drivers/ras/arm64/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0-only + +obj-$(CONFIG_ARM64_RAS_DRIVER) +=3D arm64_ras.o + +arm64_ras-y :=3D ras-core.o diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c new file mode 100644 index 000000000000..b5448f4a841f --- /dev/null +++ b/drivers/ras/arm64/ras-core.c @@ -0,0 +1,266 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ARM Error Source Table Support + * + * Copyright (c) 2025, Alibaba Group. 
+ */ + +#include +#include +#include + +#include "ras.h" + +#undef pr_fmt +#define pr_fmt(fmt) "arm64_ras: " fmt + +static const char *const ras_node_name[] =3D { + [ACPI_AEST_PROCESSOR_ERROR_NODE] =3D "processor", + [ACPI_AEST_MEMORY_ERROR_NODE] =3D "memory", + [ACPI_AEST_SMMU_ERROR_NODE] =3D "smmu", + [ACPI_AEST_VENDOR_ERROR_NODE] =3D "vendor", + [ACPI_AEST_GIC_ERROR_NODE] =3D "gic", + [ACPI_AEST_PCIE_ERROR_NODE] =3D "pcie", + [ACPI_AEST_PROXY_ERROR_NODE] =3D "proxy", +}; + +const struct ras_group ras_group_config[] =3D { + [ACPI_AEST_NODE_GROUP_FORMAT_4K] =3D { + .errgsr_num =3D ERXGROUP_4K_ERRGSR_NUM, + .size =3D ERXGROUP_4K_SIZE, + .errgsr_offset =3D ERXGROUP_4K_OFFSET, + }, + [ACPI_AEST_NODE_GROUP_FORMAT_16K] =3D { + .errgsr_num =3D ERXGROUP_16K_ERRGSR_NUM, + .size =3D ERXGROUP_16K_SIZE, + .errgsr_offset =3D ERXGROUP_16K_OFFSET, + }, + [ACPI_AEST_NODE_GROUP_FORMAT_64K] =3D { + .errgsr_num =3D ERXGROUP_64K_ERRGSR_NUM, + .size =3D ERXGROUP_64K_SIZE, + .errgsr_offset =3D ERXGROUP_64K_OFFSET, + }, +}; + +static int ras_init_record(struct ras_record *record, int i, struct ras_no= de *node) +{ + record->name =3D devm_kasprintf(node->dev, GFP_KERNEL, "record%d", i); + if (!record->name) + return -ENOMEM; + + if (node->base) + record->regs_base =3D node->base + sizeof(struct ras_ext_regs) * i; + + record->index =3D i; + record->node =3D node; + + return 0; +} + +static char *alloc_ras_node_name(struct ras_node *node) +{ + char *name; + struct acpi_aest_processor *processor =3D NULL; + + switch (node->type) { + case ACPI_AEST_PROCESSOR_ERROR_NODE: + processor =3D (struct acpi_aest_processor *)node->specific_data; + + /* + * Shared/global processor nodes (e.g. cluster L3 cache, DSU) + * have processor_id=3D0 and use smp_processor_id() at error-log + * time =E2=80=94 using processor_id in the name would produce the same + * "processor.0" string for every shared node and every CPU0 + * per-PE node, making logs ambiguous. 
+ * + * For shared/global nodes, build the name from the resource + * type and the device id so each node gets a unique, meaningful + * name (e.g. "processor.cache.1", "processor.tlb.2"). + * + * For per-PE nodes, keep the original "processor." form. + */ + if (processor->flags & + (ACPI_AEST_PROC_FLAG_SHARED | ACPI_AEST_PROC_FLAG_GLOBAL)) { + static const char *const res_name[] =3D { + [ACPI_AEST_CACHE_RESOURCE] =3D "cache", + [ACPI_AEST_TLB_RESOURCE] =3D "tlb", + [ACPI_AEST_GENERIC_RESOURCE] =3D "generic", + }; + u8 rtype =3D processor->resource_type; + const char *rstr =3D (rtype < ARRAY_SIZE(res_name) && + res_name[rtype]) ? res_name[rtype] : "unknown"; + + name =3D devm_kasprintf(node->dev, GFP_KERNEL, + "%s.%s.%x", + ras_node_name[node->type], + rstr, + *(u32 *)(processor + 1)); + } else { + name =3D devm_kasprintf(node->dev, GFP_KERNEL, + "%s.%d", + ras_node_name[node->type], + processor->processor_id); + } + break; + case ACPI_AEST_MEMORY_ERROR_NODE: + case ACPI_AEST_SMMU_ERROR_NODE: + case ACPI_AEST_VENDOR_ERROR_NODE: + case ACPI_AEST_GIC_ERROR_NODE: + case ACPI_AEST_PCIE_ERROR_NODE: + case ACPI_AEST_PROXY_ERROR_NODE: + name =3D devm_kasprintf(node->dev, GFP_KERNEL, "%s.%llx", + ras_node_name[node->type], node->addr); + break; + default: + dev_warn(node->dev, "unknown AEST node type %u\n", node->type); + return NULL; + } + + return name; +} + +static int ras_node_set_errgsr(struct ras_node *node, phys_addr_t base) +{ + phys_addr_t errgsr_base; + int ret; + + if (!(node->flags & AEST_XFACE_FLAG_ERROR_GROUP)) { + node->errgsr =3D node->base + node->group->errgsr_offset; + return 0; + } + + ret =3D device_property_read_u64(node->dev, "arm,error-group-base", + &errgsr_base); + if (ret || !errgsr_base) + return -EINVAL; + + node->errgsr =3D errgsr_base - base + node->base; + return 0; +} + +static struct ras_node *ras_init_node(struct platform_device *pdev) +{ + int i, ret =3D 0; + struct device *dev =3D &pdev->dev; + struct resource *mem; + struct ras_node 
*node; + + node =3D devm_kzalloc(&pdev->dev, sizeof(*node), GFP_KERNEL); + if (!node) + return ERR_PTR(-ENOMEM); + + node->dev =3D &pdev->dev; + + ret =3D ret ?: device_property_read_u8(dev, "arm,node-type", &node->type); + ret =3D ret ?: device_property_read_u8(dev, "arm,group-format", &node->gr= oup_format); + ret =3D ret ?: device_property_read_u32(dev, "arm,interface-flags", &node= ->flags); + ret =3D ret ?: device_property_read_u32(dev, "arm,error-records-count", &= node->record_count); + ret =3D ret ?: device_property_read_u32(dev, "arm,error-records-index", &= node->record_index); + if (ret) + return ERR_PTR(ret); + node->group =3D &ras_group_config[node->group_format]; + + node->record_implemented =3D devm_bitmap_zalloc(dev, + node->group->errgsr_num * BITS_PER_TYPE(u64), + GFP_KERNEL); + if (!node->record_implemented) + return ERR_PTR(-ENOMEM); + node->status_reporting =3D devm_bitmap_zalloc(dev, + node->group->errgsr_num * BITS_PER_TYPE(u64), + GFP_KERNEL); + if (!node->status_reporting) + return ERR_PTR(-ENOMEM); + + ret =3D device_property_read_u64_array(dev, "arm,record-implemented", + (u64 *)node->record_implemented, + node->group->errgsr_num); + ret =3D ret ?: device_property_read_u64_array(dev, "arm,status-reporting", + (u64 *)node->status_reporting, + node->group->errgsr_num); + if (ret) + return ERR_PTR(ret); + + node->specific_data_size =3D device_property_count_u8(dev, "arm,node-spec= ific-data"); + if (node->specific_data_size > 0) { + node->specific_data =3D devm_kzalloc(dev, node->specific_data_size, GFP_= KERNEL); + if (!node->specific_data) + return ERR_PTR(-ENOMEM); + ret =3D device_property_read_u8_array(dev, "arm,node-specific-data", + node->specific_data, + node->specific_data_size); + if (ret) + return ERR_PTR(ret); + } + + mem =3D platform_get_resource(to_platform_device(dev), IORESOURCE_MEM, 0); + if (mem) { + node->addr =3D mem->start; + node->base =3D devm_ioremap(node->dev, mem->start, resource_size(mem)); + if (!node->base) + 
return ERR_PTR(-ENOMEM); + + ret =3D ras_node_set_errgsr(node, mem->start); + if (ret) + return ERR_PTR(ret); + } + + node->name =3D alloc_ras_node_name(node); + if (!node->name) + return ERR_PTR(-ENOMEM); + + node->records =3D devm_kcalloc(node->dev, node->record_count, + sizeof(struct ras_record), GFP_KERNEL); + if (!node->records) + return ERR_PTR(-ENOMEM); + + for (i =3D 0; i < node->record_count; i++) { + ret =3D ras_init_record(&node->records[i], + i + node->record_index, node); + if (ret) + return ERR_PTR(ret); + } + ras_node_dbg(node, "base: %llx\n", node->addr); + return node; +} + +static int arm64_ras_probe(struct platform_device *pdev) +{ + int ret; + struct ras_node *node; + + node =3D ras_init_node(pdev); + if (IS_ERR(node)) + return PTR_ERR(node); + + ret =3D dev_set_name(&pdev->dev, "%s%d", ras_node_name[node->type], + pdev->id); + if (ret) + return ret; + + platform_set_drvdata(pdev, node); + + return 0; +} + +static struct platform_driver arm64_ras_driver =3D { + .driver =3D { + .name =3D "arm64_ras", + }, + .probe =3D arm64_ras_probe, +}; + +static int __init arm64_ras_init(void) +{ + return platform_driver_register(&arm64_ras_driver); +} +module_init(arm64_ras_init); + +static void __exit arm64_ras_exit(void) +{ + platform_driver_unregister(&arm64_ras_driver); +} +module_exit(arm64_ras_exit); + +MODULE_DESCRIPTION("ARM RAS Driver"); +MODULE_AUTHOR("Ruidong Tian "); +MODULE_LICENSE("GPL"); diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h new file mode 100644 index 000000000000..3d83f8b26da7 --- /dev/null +++ b/drivers/ras/arm64/ras.h @@ -0,0 +1,104 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * ARM Error Source Table Support + * + * Copyright (c) 2025, Alibaba Group. + */ + +#ifndef _DRIVERS_RAS_ARM64_RAS_H_ +#define _DRIVERS_RAS_ARM64_RAS_H_ + +#include +#include + +#define ras_node_err(__node, format, ...) \ + dev_err((__node)->dev, "%s: " format, (__node)->name, \ + ##__VA_ARGS__) +#define ras_node_info(__node, format, ...) 
\ + dev_info((__node)->dev, "%s: " format, (__node)->name, \ + ##__VA_ARGS__) +#define ras_node_dbg(__node, format, ...) \ + dev_dbg((__node)->dev, "%s: " format, (__node)->name, \ + ##__VA_ARGS__) + +#define ras_record_err(__record, format, ...) \ + dev_err((__record)->node->dev, "%s: %s: " format, \ + (__record)->node->name, (__record)->name, ##__VA_ARGS__) +#define ras_record_info(__record, format, ...) \ + dev_info((__record)->node->dev, "%s: %s: " format, \ + (__record)->node->name, (__record)->name, ##__VA_ARGS__) +#define ras_record_dbg(__record, format, ...) \ + dev_dbg((__record)->node->dev, "%s: %s: " format, \ + (__record)->node->name, (__record)->name, ##__VA_ARGS__) + +#define ERXGROUP_4K_OFFSET 0xE00 +#define ERXGROUP_16K_OFFSET 0x3800 +#define ERXGROUP_64K_OFFSET 0xE000 +#define ERXGROUP_4K_SIZE SZ_4K +#define ERXGROUP_16K_SIZE SZ_16K +#define ERXGROUP_64K_SIZE SZ_64K +#define ERXGROUP_4K_ERRGSR_NUM 1 +#define ERXGROUP_16K_ERRGSR_NUM 4 +#define ERXGROUP_64K_ERRGSR_NUM 14 + +struct ras_record { + char *name; + void __iomem *regs_base; + struct ras_node *node; + + int index; +}; + +struct ras_group { + int errgsr_num; + size_t size; + u64 errgsr_offset; +}; + +extern const struct ras_group ras_group_config[]; + +struct ras_node { + char *name; + + struct device *dev; + const struct ras_group *group; + + void __iomem *base; + void __iomem *errgsr; + phys_addr_t addr; + + u8 *specific_data; + /* + * This bitmap indicates which of the error records within this error + * node must be polled for error status. + * Bit[n] of this field pertains to error record corresponding to + * index n in this error group. + * Bit[n] =3D 0b: Error record at index n needs to be polled. + * Bit[n] =3D 1b: Error record at index n does not need to be polled. + */ + unsigned long *record_implemented; + /* + * This bitmap indicates which of the error records within this error + * node support error status reporting using ERRGSR register. 
+ * Bit[n] of this field pertains to error record corresponding to + * index n in this error group. + * Bit[n] =3D 0b: Error record at index n supports error status reporting + * through ERRGSR.S. + * Bit[n] =3D 1b: Error record at index n does not support error reporting + * through the ERRGSR.S bit. If this error record is + * implemented, then it must be polled explicitly for + * error events. + */ + unsigned long *status_reporting; + struct ras_record *records; + + u32 specific_data_size; + u32 record_count; + u32 record_index; + u32 flags; + + u8 type; + u8 group_format; +}; + +#endif /* _DRIVERS_RAS_ARM64_RAS_H_ */ diff --git a/include/linux/acpi_aest.h b/include/linux/acpi_aest.h index e485a6236891..df6369bcc96b 100644 --- a/include/linux/acpi_aest.h +++ b/include/linux/acpi_aest.h @@ -16,4 +16,7 @@ #define AEST_XFACE_FLAG_FAULT_INJECT BIT(5) #define AEST_XFACE_FLAG_INT_CONFIG BIT(6) =20 +#define ACPI_AEST_PROC_FLAG_GLOBAL BIT(0) +#define ACPI_AEST_PROC_FLAG_SHARED BIT(1) + #endif /* __ACPI_AEST_H__ */ --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-100.freemail.mail.aliyun.com (out30-100.freemail.mail.aliyun.com [115.124.30.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ABBEC34752D; Tue, 2 Jun 2026 07:16:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.100 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384577; cv=none; b=N/ObduDIAi3FM+SmPaXu7OsHkIVbwG4PpntsgdMWK+4RBAGwi1T2c5wVlKRjFLNRLbUJnLgoVHp6C/xwmdGtMq5UrEPS2MPWTWmi09NIDZqaBgYHA/2H/0cO33LsXPgEND5NW1SFBTBCLPMZUYQ/lbz68KShHR47cgVCHvi6Z6c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384577; c=relaxed/simple; bh=LutnBlhqyOIvbFqTOmDw9r6dqJ1LBQLMQACDal4qZTg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=j2hkmUMit6/ztyBLbjRjGTJEYV4epNNxD6kjbIq6m54o5Cg9pvmz2jrUJdB6bCy2H0ZSLcp6yViUtxtRMaV2fdp0xyfRFwBpGSo93sFYEecUHT4CZmmUOxfNwoHkQ0ATYAECpDVI88SlA5aNqeeEAKQ3yeioEpiOE2y+sP8IHmk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=AStWR9MM; arc=none smtp.client-ip=115.124.30.100 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="AStWR9MM" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384561; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=Wkg7zePrkjLP7+kbmK0jAUbfScXCvDraNXIRyL6x9wI=; b=AStWR9MMAK5o0oXahF0LjPCdIJyRQqOyckZ6gI1Muc9Xm5xTWxrkNBJxgj5HYc3MieSvdB2PDqCCkwOTG2WQfrgsj8oIhuMuAaFXh9Re3mkiRxFNss1vaLYDB9wW21612YZgb4+oJsut96TCJKRHMPG02CPcv0xvVMvy2JtzLYI= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R391e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037033178;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2aa_1780384553; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2aa_1780384553 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:15:58 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . 
Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 03/16] arm64: ras: Unify the read/write interface for system and MMIO registers Date: Tue, 2 Jun 2026 15:15:26 +0800 Message-ID: <20260602071540.3711528-4-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" ARM RAS error records are reachable either through system registers (PE-affine sources) or through memory-mapped windows (off-core sources), depending on the AEST interface type. Hard-coding the choice at every call site would scatter that knowledge across the driver and double every future change. Introduce a ras_access ops table that hides the transport behind a single read/write contract, so all later error handling, masking and injection code is access-method-agnostic and can be reasoned about as plain register operations. 
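As an illustration of the contract described above, here is a minimal user-space sketch of an ops table that hides the transport behind one read/write pair. It is not the driver code: every demo_* name is hypothetical, the "system registers" are modelled as a plain array, and the MMIO window is ordinary memory, so it only shows the dispatch pattern and not real register semantics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: demo_* names are illustrative, not from the patch. */
enum demo_xface { DEMO_XFACE_SYSREG, DEMO_XFACE_MMIO, DEMO_XFACE_COUNT };

#define DEMO_ERXSTATUS	0x10	/* same offset the patch uses for ERXSTATUS */
#define DEMO_NREGS	8

struct demo_access {
	uint64_t (*read)(void *base, uint32_t offset);
	void (*write)(void *base, uint32_t offset, uint64_t val);
};

/* Stand-in for the ERX* system registers; indexed by offset / 8. */
static uint64_t demo_sysregs[DEMO_NREGS];

static uint64_t demo_sysreg_read(void *base, uint32_t offset)
{
	(void)base;	/* system registers need no base address */
	return offset / 8 < DEMO_NREGS ? demo_sysregs[offset / 8] : 0;
}

static void demo_sysreg_write(void *base, uint32_t offset, uint64_t val)
{
	(void)base;
	if (offset / 8 < DEMO_NREGS)
		demo_sysregs[offset / 8] = val;
}

static uint64_t demo_mmio_read(void *base, uint32_t offset)
{
	return *(volatile uint64_t *)((char *)base + offset);
}

static void demo_mmio_write(void *base, uint32_t offset, uint64_t val)
{
	*(volatile uint64_t *)((char *)base + offset) = val;
}

/* One entry per interface type; callers never look inside. */
static const struct demo_access demo_access[DEMO_XFACE_COUNT] = {
	[DEMO_XFACE_SYSREG] = { .read = demo_sysreg_read, .write = demo_sysreg_write },
	[DEMO_XFACE_MMIO]   = { .read = demo_mmio_read,   .write = demo_mmio_write },
};

struct demo_record {
	void *regs_base;
	const struct demo_access *access;
};

static void demo_record_init(struct demo_record *r, enum demo_xface type,
			     void *base)
{
	assert(type < DEMO_XFACE_COUNT);
	r->regs_base = base;
	r->access = &demo_access[type];
}

/*
 * Transport-agnostic caller: fetch the status word and store zero back.
 * Storing zero resets the value only because the model is plain memory.
 */
static uint64_t demo_take_status(struct demo_record *r)
{
	uint64_t status = r->access->read(r->regs_base, DEMO_ERXSTATUS);

	r->access->write(r->regs_base, DEMO_ERXSTATUS, 0);
	return status;
}
```

The point of the pattern is that demo_take_status() contains no branch on the interface type; the choice is made once, when the record is initialised.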
Signed-off-by: Ruidong Tian --- drivers/acpi/arm64/aest.c | 3 +- drivers/ras/arm64/ras-core.c | 5 +- drivers/ras/arm64/ras.h | 94 ++++++++++++++++++++++++++++++++++++ 3 files changed, 100 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c index 8cf24467d0c2..3a813fe7047c 100644 --- a/drivers/acpi/arm64/aest.c +++ b/drivers/acpi/arm64/aest.c @@ -93,6 +93,7 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct pr= operty_entry *props, } =20 props[(*p)++] =3D PROPERTY_ENTRY_U8("arm,node-type", hdr->type); + props[(*p)++] =3D PROPERTY_ENTRY_U8("arm,interface-type", interface->type= ); props[(*p)++] =3D PROPERTY_ENTRY_U8("arm,group-format", interface->group_format); props[(*p)++] =3D PROPERTY_ENTRY_U32("arm,error-records-count", @@ -121,7 +122,7 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, static int __init aest_create_node_fwnode(struct acpi_aest_hdr *hdr, struct platform_device = *pdev) { - struct property_entry props[10] =3D { }; + struct property_entry props[11] =3D { }; int p =3D 0; int ret; =20 diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index b5448f4a841f..47ab78cc88d7 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -51,6 +51,7 @@ static int ras_init_record(struct ras_record *record, int= i, struct ras_node *no if (node->base) record->regs_base =3D node->base + sizeof(struct ras_ext_regs) * i; =20 + record->access =3D &ras_access[node->access_type]; record->index =3D i; record->node =3D node; =20 @@ -152,6 +153,7 @@ static struct ras_node *ras_init_node(struct platform_d= evice *pdev) node->dev =3D &pdev->dev; =20 ret =3D ret ?: device_property_read_u8(dev, "arm,node-type", &node->type); + ret =3D ret ?: device_property_read_u8(dev, "arm,interface-type", &node->= access_type); ret =3D ret ?: device_property_read_u8(dev, "arm,group-format", &node->gr= oup_format); ret =3D ret ?: device_property_read_u32(dev, 
"arm,interface-flags", &node= ->flags); ret =3D ret ?: device_property_read_u32(dev, "arm,error-records-count", &= node->record_count); @@ -219,7 +221,8 @@ static struct ras_node *ras_init_node(struct platform_d= evice *pdev) if (ret) return ERR_PTR(ret); } - ras_node_dbg(node, "base: %llx\n", node->addr); + ras_node_dbg(node, "base: %llx, access_type: %s\n", + node->addr, node->access_type ? "MMIO" : "Register"); return node; } =20 diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h index 3d83f8b26da7..94ffeb83b251 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -11,6 +11,11 @@ #include #include =20 +#define record_read(record, offset) \ + ((record)->access->read((record)->regs_base, (offset))) +#define record_write(record, offset, val) \ + ((record)->access->write((record)->regs_base, (offset), (val))) + #define ras_node_err(__node, format, ...) \ dev_err((__node)->dev, "%s: " format, (__node)->name, \ ##__VA_ARGS__) @@ -41,10 +46,25 @@ #define ERXGROUP_16K_ERRGSR_NUM 4 #define ERXGROUP_64K_ERRGSR_NUM 14 =20 +#define ERXFR 0x0 +#define ERXCTLR 0x8 +#define ERXSTATUS 0x10 +#define ERXADDR 0x18 +#define ERXMISC0 0x20 +#define ERXMISC1 0x28 +#define ERXMISC2 0x30 +#define ERXMISC3 0x38 + +struct ras_access { + u64 (*read)(void __iomem *base, u32 offset); + void (*write)(void __iomem *base, u32 offset, u64 val); +}; + struct ras_record { char *name; void __iomem *regs_base; struct ras_node *node; + const struct ras_access *access; =20 int index; }; @@ -98,7 +118,81 @@ struct ras_node { u32 flags; =20 u8 type; + u8 access_type; u8 group_format; }; =20 +#define CASE_READ(res, x) \ + case (x): { \ + res =3D read_sysreg_s(SYS_##x##_EL1); \ + break; \ + } + +#define CASE_WRITE(val, x) \ + case (x): { \ + write_sysreg_s((val), SYS_##x##_EL1); \ + break; \ + } + +static inline u64 ras_sysreg_read(void __iomem *base __always_unused, u32 = offset) +{ + u64 res; + + switch (offset) { + CASE_READ(res, ERXFR) + CASE_READ(res, ERXCTLR) + 
CASE_READ(res, ERXSTATUS) + CASE_READ(res, ERXADDR) + CASE_READ(res, ERXMISC0) + CASE_READ(res, ERXMISC1) + CASE_READ(res, ERXMISC2) + CASE_READ(res, ERXMISC3) + default: + res =3D 0; + } + return res; +} + +static inline void ras_sysreg_write(void __iomem *base __always_unused, u3= 2 offset, u64 val) +{ + switch (offset) { + CASE_WRITE(val, ERXFR) + CASE_WRITE(val, ERXCTLR) + CASE_WRITE(val, ERXSTATUS) + CASE_WRITE(val, ERXADDR) + CASE_WRITE(val, ERXMISC0) + CASE_WRITE(val, ERXMISC1) + CASE_WRITE(val, ERXMISC2) + CASE_WRITE(val, ERXMISC3) + default: + return; + } +} + +static inline u64 ras_iomem_read(void __iomem *base, u32 offset) +{ + return readq_relaxed(base + offset); +} + +static inline void ras_iomem_write(void __iomem *base, u32 offset, u64 val) +{ + writeq_relaxed(val, base + offset); +} + +/* access type is decided by AEST interface type. */ +static const struct ras_access ras_access[] =3D { + [ACPI_AEST_NODE_SYSTEM_REGISTER] =3D { + .read =3D ras_sysreg_read, + .write =3D ras_sysreg_write, + }, + [ACPI_AEST_NODE_MEMORY_MAPPED] =3D { + .read =3D ras_iomem_read, + .write =3D ras_iomem_write, + }, + [ACPI_AEST_NODE_SINGLE_RECORD_MEMORY_MAPPED] =3D { + .read =3D ras_iomem_read, + .write =3D ras_iomem_write, + }, +}; + #endif /* _DRIVERS_RAS_ARM64_RAS_H_ */ --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 153D53655E0; Tue, 2 Jun 2026 07:16:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384575; cv=none; 
b=cmXBZ7f+UAQZ1BYh6xFPp5tvbJOhYV54B9l/UG6qAy6N9GN0y9qJKvql8E9YlSOrpYD/+TBqnWEQUJHaifuX2VEnyZJ9sKnStlZ9ucaanJ4oXRj75y8GNjBOuRatQaKTP60GvOHYYRL0ICPY7eEoFrhY2lVfNYZqL4EG5xWbkHI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384575; c=relaxed/simple; bh=0svoaL3r/YWMJ6JF6jBtWapibLAVEBZXPOVm1qZExqk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qj9/rTbA0TWv8BPTDwr4n9C5Ozo2MCfmvtv+MFQT5DMWQXjStrkv9lkGnvKw9xvkHigXYD0529eZmoonj/wf4KzRfenOqEAyqyac5M8n3cne4fXH6+CnwChLiwtLrL5qtB+N09iHVpDj3jPNUnHeOOi8cu2oR9tpC+/2EtASE50= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=i5HvXKEm; arc=none smtp.client-ip=115.124.30.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="i5HvXKEm" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384561; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=hn3Ar2RstWmCt7LTcZYq3bGkhJyXxPWAwFLAA6UOZgU=; b=i5HvXKEm6O8ghaGhzrzsj69yiX5g5YSMfs19SSvk++2YfhMxl/kRjVTf4smXGxqcymhgZErEYQVqPi3cxjl+MHQllu6WXqx7m+BH/Zb9679ZDF7C46T//KZcI3FmV1a26RUHe5xtozzLemn/VOZaH+gRy5VNs97UAx7ujJ3PEAM= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R181e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033032089153;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2c-_1780384558; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2c-_1780384558 cluster:ay36) by smtp.aliyun-inc.com; Tue, 
02 Jun 2026 15:16:00 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 04/16] arm64: ras: Support RAS Common Fault Injection Model Extension Date: Tue, 2 Jun 2026 15:15:27 +0800 Message-ID: <20260602071540.3711528-5-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add the fault-injection registers described in the Common Fault Injection Model Extension.
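For readers following the address arithmetic in ras_node_set_inj_base() in the diff below (inj_base - base + node->base), here is a small user-space sketch of the same translation from a physical register address into an already-mapped window. The demo_translate() helper and its bounds checks are assumptions made for illustration; the patch itself only rejects a zero base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): given a window that maps
 * physical addresses [phys_base, phys_base + mapped_size), return the
 * mapped address that corresponds to phys_target, or NULL when the
 * target is zero or falls outside the window.
 */
static void *demo_translate(void *mapped_base, size_t mapped_size,
			    uint64_t phys_base, uint64_t phys_target)
{
	uint64_t delta;

	assert(mapped_base != NULL);

	if (!phys_target || phys_target < phys_base)
		return NULL;

	delta = phys_target - phys_base;
	if (delta >= mapped_size)
		return NULL;

	/* Same arithmetic as: inj_base - base + node->base */
	return (char *)mapped_base + delta;
}
```

Whether the driver should refuse an injection base outside the node's window is a design choice this sketch makes on its own; the example only shows where such a check would sit.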
Signed-off-by: Ruidong Tian --- drivers/acpi/arm64/aest.c | 4 +++- drivers/ras/arm64/ras-core.c | 27 +++++++++++++++++++++++++-- drivers/ras/arm64/ras.h | 10 ++++++++++ 3 files changed, 38 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c index 3a813fe7047c..868013498abb 100644 --- a/drivers/acpi/arm64/aest.c +++ b/drivers/acpi/arm64/aest.c @@ -110,6 +110,8 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, group_len); props[(*p)++] =3D PROPERTY_ENTRY_U64("arm,error-group-base", common->error_group_register_base); + props[(*p)++] =3D PROPERTY_ENTRY_U64("arm,fault-inject-base", + common->fault_inject_register_base); =20 len =3D hdr->node_interface_offset - hdr->node_specific_offset; props[(*p)++] =3D @@ -122,7 +124,7 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, static int __init aest_create_node_fwnode(struct acpi_aest_hdr *hdr, struct platform_device = *pdev) { - struct property_entry props[11] =3D { }; + struct property_entry props[12] =3D { }; int p =3D 0; int ret; =20 diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 47ab78cc88d7..1dd471376449 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -139,6 +139,23 @@ static int ras_node_set_errgsr(struct ras_node *node, = phys_addr_t base) return 0; } =20 +static int ras_node_set_inj_base(struct ras_node *node, phys_addr_t base) +{ + phys_addr_t inj_base =3D 0; + int ret =3D 0; + + if (!(node->flags & AEST_XFACE_FLAG_FAULT_INJECT)) + return 0; + + ret =3D device_property_read_u64(node->dev, "arm,fault-inject-base", + &inj_base); + if (ret || !inj_base) + return -EINVAL; + + node->inj =3D inj_base - base + node->base; + return 0; +} + static struct ras_node *ras_init_node(struct platform_device *pdev) { int i, ret =3D 0; @@ -204,6 +221,11 @@ static struct ras_node *ras_init_node(struct platform_= device *pdev) ret =3D ras_node_set_errgsr(node, 
mem->start); if (ret) return ERR_PTR(ret); + ret =3D ras_node_set_inj_base(node, mem->start); + if (ret) + return ERR_PTR(ret); + } else if (node->access_type =3D=3D ACPI_AEST_NODE_MEMORY_MAPPED) { + return ERR_PTR(-EINVAL); } =20 node->name =3D alloc_ras_node_name(node); @@ -221,8 +243,9 @@ static struct ras_node *ras_init_node(struct platform_d= evice *pdev) if (ret) return ERR_PTR(ret); } - ras_node_dbg(node, "base: %llx, access_type: %s\n", - node->addr, node->access_type ? "MMIO" : "Register"); + ras_node_dbg(node, "base: %llx, access_type: %s, %s inject\n", + node->addr, node->access_type ? "MMIO" : "Register", + node->flags & AEST_XFACE_FLAG_FAULT_INJECT ? "with" : "without"); return node; } =20 diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h index 94ffeb83b251..da03593e5f7f 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -54,6 +54,9 @@ #define ERXMISC1 0x28 #define ERXMISC2 0x30 #define ERXMISC3 0x38 +#define ERXPFGF 0x800 +#define ERXPFGCTL 0x808 +#define ERXPFGCDN 0x810 =20 struct ras_access { u64 (*read)(void __iomem *base, u32 offset); @@ -85,6 +88,7 @@ struct ras_node { =20 void __iomem *base; void __iomem *errgsr; + void __iomem *inj; phys_addr_t addr; =20 u8 *specific_data; @@ -147,6 +151,9 @@ static inline u64 ras_sysreg_read(void __iomem *base __= always_unused, u32 offset CASE_READ(res, ERXMISC1) CASE_READ(res, ERXMISC2) CASE_READ(res, ERXMISC3) + CASE_READ(res, ERXPFGF) + CASE_READ(res, ERXPFGCTL) + CASE_READ(res, ERXPFGCDN) default: res =3D 0; } @@ -164,6 +171,9 @@ static inline void ras_sysreg_write(void __iomem *base = __always_unused, u32 offs CASE_WRITE(val, ERXMISC1) CASE_WRITE(val, ERXMISC2) CASE_WRITE(val, ERXMISC3) + CASE_WRITE(val, ERXPFGF) + CASE_WRITE(val, ERXPFGCTL) + CASE_WRITE(val, ERXPFGCDN) default: return; } --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-98.freemail.mail.aliyun.com (out30-98.freemail.mail.aliyun.com [115.124.30.98]) (using TLSv1.2 with 
cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD3ECCA6B; Tue, 2 Jun 2026 07:16:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.98 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384570; cv=none; b=bneY9mDswetf4tdUAFJm1YP5M2RAREc4JMOhmyiHRXZT2tSvi7TyIs9jS/e8HUETMBJm/aAQn8ARMp/oLPmU/5u7lWurB0TM6zxyHlbNulAMC3zevO9ajeiTAA6/09fGLt899E363Mrf27qb0X4rB06ofJvKJqt5g+tdWG5ljD0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384570; c=relaxed/simple; bh=An9KDj2M+K9eutYkfQ/1UTH5MdV8TP/xRRi2LTL0edY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=tn8xYYZi39SrYmrElSad9VO7hRXeBBRzkMAqnyOAYR8Mg2fV1EoWM+1xshWEfjDAk1OlcAqrFdvjoBfZ5TRmOZprobezMzzsCD0epECo+QZ4qKlvFdnflJtaRcffLt/x/yXK3s8EqRkB3n0Opu8qN45TjxSYw8sBB+RH6kK5M+Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=vQpV7CDF; arc=none smtp.client-ip=115.124.30.98 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="vQpV7CDF" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384562; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=aiCdx3Q8seVVXlkwNhOkJDJdYZEvP8kG8HhMkkSmX6Q=; b=vQpV7CDF/IRM/YlA4QbknG1LJZEeyiY1EzJO/mBAa7QHoO299282QZi85IS9pYUWo+UX407rCTGwTqB40AaXh/6x2a8zlU7mHkUq2vhhgpZM9volKh3bx2JRZCKTF15cTk4G27earxfXuYD7oN7SRRYD9ce73cbabod/6/DYk3w= X-Alimail-AntiSpam: 
AC=PASS;BC=-1|-1;BR=01201311R181e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045133197;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2cW_1780384560; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2cW_1780384560 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:01 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 05/16] arm64: ras: Plumb AEST interrupts as platform IRQ resources Date: Tue, 2 Jun 2026 15:15:28 +0800 Message-ID: <20260602071540.3711528-6-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" AEST describes one or more interrupts per error source for delivering Fault Handling (FHI) and Error Recovery (ERI) events. This change keeps the existing front-end / back-end split: - The ACPI front-end is the only place that understands AEST interrupt encodings; it registers each GSI and exposes the resulting Linux IRQs as named platform resources ("AEST:FHI" / "AEST:ERI"). - The back-end looks up its IRQs by name and requests them according to routing type. SPIs are device-shared and use a normal shared handler. 
PPIs are inherently per-CPU, so the node is cloned into a percpu ras_node and the handler is installed via request_percpu_irq() with that percpu cookie, ensuring each CPU services its own copy of the source. Signed-off-by: Ruidong Tian --- drivers/acpi/arm64/aest.c | 56 +++++++++++++++++- drivers/ras/arm64/ras-core.c | 109 +++++++++++++++++++++++++++++++++++ drivers/ras/arm64/ras.h | 2 + include/linux/acpi_aest.h | 7 +++ 4 files changed, 172 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c index 868013498abb..af03f4365cfa 100644 --- a/drivers/acpi/arm64/aest.c +++ b/drivers/acpi/arm64/aest.c @@ -15,6 +15,40 @@ #undef pr_fmt #define pr_fmt(fmt) "ACPI AEST: " fmt =20 +static int acpi_aest_parse_irqs(struct platform_device *pdev, + struct acpi_aest_hdr *aest_hdr, + struct resource *res, int *res_idx) +{ + int i; + struct acpi_aest_node_interrupt_v2 *interrupt; + int trigger, irq; + + interrupt =3D ACPI_ADD_PTR(struct acpi_aest_node_interrupt_v2, aest_hdr, + aest_hdr->node_interrupt_offset); + for (i =3D 0; i < aest_hdr->node_interrupt_count; i++, interrupt++) { + trigger =3D (interrupt->flags & AEST_INTERRUPT_MODE) ? + ACPI_LEVEL_SENSITIVE : + ACPI_EDGE_SENSITIVE; + + irq =3D acpi_register_gsi(&pdev->dev, interrupt->gsiv, trigger, + ACPI_ACTIVE_HIGH); + if (irq <=3D 0) { + pr_err("failed to map AEST GSI %d\n", interrupt->gsiv); + return irq ? irq : -EINVAL; + } + + res[*res_idx].start =3D irq; + res[*res_idx].end =3D irq; + res[*res_idx].flags =3D IORESOURCE_IRQ; + res[*res_idx].name =3D interrupt->type ? AEST_ERI_NAME : + AEST_FHI_NAME; + + (*res_idx)++; + } + + return 0; +} + /* * Fill the per-AEST-entry inner properties (node-type / interface-type / * group-format / record bitmaps / register bases ...). 
@@ -25,9 +59,11 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct p= roperty_entry *props, { struct acpi_aest_node_interface_header *interface; struct acpi_aest_node_interface_common *common =3D NULL; + struct acpi_aest_node_interrupt_v2 *interrupt; u64 *record_implemented =3D NULL; u64 *status_reporting =3D NULL; u64 *addressing_mode =3D NULL; + u32 fhi_gsiv =3D 0, eri_gsiv =3D 0; int group_len =3D 0, i; size_t len; =20 @@ -72,6 +108,15 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, return -EINVAL; } =20 + interrupt =3D ACPI_ADD_PTR(struct acpi_aest_node_interrupt_v2, hdr, + hdr->node_interrupt_offset); + for (i =3D 0; i < hdr->node_interrupt_count; i++, interrupt++) { + if (interrupt->type =3D=3D ACPI_AEST_NODE_FAULT_HANDLING) + fhi_gsiv =3D interrupt->gsiv; + else if (interrupt->type =3D=3D ACPI_AEST_NODE_ERROR_RECOVERY) + eri_gsiv =3D interrupt->gsiv; + } + if (interface->flags & AEST_XFACE_FLAG_ERROR_DEVICE) { struct acpi_device *companion; char uid[16]; @@ -112,6 +157,8 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, common->error_group_register_base); props[(*p)++] =3D PROPERTY_ENTRY_U64("arm,fault-inject-base", common->fault_inject_register_base); + props[(*p)++] =3D PROPERTY_ENTRY_U32("arm,fhi-gsiv", fhi_gsiv); + props[(*p)++] =3D PROPERTY_ENTRY_U32("arm,eri-gsiv", eri_gsiv); =20 len =3D hdr->node_interface_offset - hdr->node_specific_offset; props[(*p)++] =3D @@ -124,7 +171,7 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, static int __init aest_create_node_fwnode(struct acpi_aest_hdr *hdr, struct platform_device = *pdev) { - struct property_entry props[12] =3D { }; + struct property_entry props[14] =3D { }; int p =3D 0; int ret; =20 @@ -163,7 +210,8 @@ acpi_aest_alloc_pdev(struct acpi_aest_hdr *aest_hdr) if (!pdev) return ERR_PTR(-ENOMEM); =20 - res =3D kcalloc(1, sizeof(*res), GFP_KERNEL); + res =3D kcalloc(AEST_MAX_INTERRUPT_PER_NODE + 1, 
sizeof(*res), + GFP_KERNEL); if (!res) return ERR_PTR(-ENOMEM); =20 @@ -177,6 +225,10 @@ acpi_aest_alloc_pdev(struct acpi_aest_hdr *aest_hdr) j++; } =20 + ret =3D acpi_aest_parse_irqs(pdev, aest_hdr, res, &j); + if (ret) + return ERR_PTR(ret); + ret =3D platform_device_add_resources(pdev, res, j); if (ret) return ERR_PTR(ret); diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 1dd471376449..9520415df8cb 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -5,6 +5,7 @@ * Copyright (c) 2025, Alibaba Group. */ =20 +#include #include #include #include @@ -14,6 +15,8 @@ #undef pr_fmt #define pr_fmt(fmt) "arm64_ras: " fmt =20 +static DEFINE_PER_CPU(struct ras_node, percpu_ras_node); + static const char *const ras_node_name[] =3D { [ACPI_AEST_PROCESSOR_ERROR_NODE] =3D "processor", [ACPI_AEST_MEMORY_ERROR_NODE] =3D "memory", @@ -42,6 +45,55 @@ const struct ras_group ras_group_config[] =3D { }, }; =20 +static irqreturn_t ras_irq_func(int irq, void *input) +{ + struct ras_node *node =3D input; + + return IRQ_HANDLED; +} + +static int ras_register_irq(struct ras_node *node) +{ + int i, irq, ret; + char *irq_desc; + + irq_desc =3D devm_kasprintf(node->dev, GFP_KERNEL, "%s.%s.", + dev_driver_string(node->dev), + node->name); + if (!irq_desc) + return -ENOMEM; + + for (i =3D 0; i < AEST_MAX_INTERRUPT_PER_NODE; i++) { + irq =3D node->irq[i]; + + if (!irq) + continue; + + if (irq_is_percpu_devid(irq)) { + ret =3D request_percpu_irq(irq, ras_irq_func, irq_desc, + node->oncore_node); + if (ret) + goto free; + } else { + ret =3D devm_request_irq(node->dev, irq, ras_irq_func, IRQF_SHARED, + irq_desc, node); + if (ret) + return ret; + } + } + return 0; + +free: + for (i =3D i - 1; i >=3D 0; i--) { + irq =3D node->irq[i]; + + if (irq_is_percpu_devid(irq)) + free_percpu_irq(irq, node->oncore_node); + } + + return ret; +} + static int ras_init_record(struct ras_record *record, int i, struct ras_no= de *node) { record->name =3D 
devm_kasprintf(node->dev, GFP_KERNEL, "record%d", i); @@ -249,6 +301,53 @@ static struct ras_node *ras_init_node(struct platform_= device *pdev) return node; } =20 + +static int __setup_ppi(struct ras_node *node) +{ + int cpu; + struct ras_node *oncore_node; + size_t size; + + node->oncore_node =3D &percpu_ras_node; + for_each_possible_cpu(cpu) { + oncore_node =3D per_cpu_ptr(&percpu_ras_node, cpu); + memcpy(oncore_node, node, sizeof(struct ras_node)); + + oncore_node->records =3D devm_kcalloc( + node->dev, oncore_node->record_count, + sizeof(struct ras_record), GFP_KERNEL); + if (!oncore_node->records) + return -ENOMEM; + + size =3D oncore_node->record_count * + sizeof(struct ras_record); + memcpy(oncore_node->records, node->records, size); + + ras_node_dbg(node, "Init node on CPU%d.\n", cpu); + } + + return 0; +} + +static int ras_setup_irq(struct platform_device *pdev, struct ras_node *no= de) +{ + int fhi_irq, eri_irq; + + fhi_irq =3D platform_get_irq_byname_optional(pdev, AEST_FHI_NAME); + if (fhi_irq > 0) + node->irq[ACPI_AEST_NODE_FAULT_HANDLING] =3D fhi_irq; + + eri_irq =3D platform_get_irq_byname_optional(pdev, AEST_ERI_NAME); + if (eri_irq > 0) + node->irq[ACPI_AEST_NODE_ERROR_RECOVERY] =3D eri_irq; + + /* Allocate and initialise the percpu device pointer for PPI */ + if (irq_is_percpu(fhi_irq) || irq_is_percpu(eri_irq)) + return __setup_ppi(node); + + return 0; +} + static int arm64_ras_probe(struct platform_device *pdev) { int ret; @@ -263,6 +362,16 @@ static int arm64_ras_probe(struct platform_device *pde= v) if (ret) return ret; =20 + ret =3D ras_setup_irq(pdev, node); + if (ret) + return ret; + + ret =3D ras_register_irq(node); + if (ret) { + ras_node_err(node, "register irq failed\n"); + return ret; + } + platform_set_drvdata(pdev, node); =20 return 0; diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h index da03593e5f7f..b64eae59b6ac 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -85,6 +85,7 @@ struct ras_node { 
=20 struct device *dev; const struct ras_group *group; + struct ras_node __percpu *oncore_node; =20 void __iomem *base; void __iomem *errgsr; @@ -124,6 +125,7 @@ struct ras_node { u8 type; u8 access_type; u8 group_format; + u32 irq[AEST_MAX_INTERRUPT_PER_NODE]; }; =20 #define CASE_READ(res, x) \ diff --git a/include/linux/acpi_aest.h b/include/linux/acpi_aest.h index df6369bcc96b..a462895a7b5a 100644 --- a/include/linux/acpi_aest.h +++ b/include/linux/acpi_aest.h @@ -6,6 +6,13 @@ =20 /* AEST resource name */ #define AEST_NODE_NAME "AEST:NODE" +#define AEST_FHI_NAME "AEST:FHI" +#define AEST_ERI_NAME "AEST:ERI" + +/* AEST interrupt */ +#define AEST_INTERRUPT_MODE BIT(0) + +#define AEST_MAX_INTERRUPT_PER_NODE 2 =20 /* AEST interface */ #define AEST_XFACE_FLAG_SHARED BIT(0) --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-99.freemail.mail.aliyun.com (out30-99.freemail.mail.aliyun.com [115.124.30.99]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ADD14376BC2; Tue, 2 Jun 2026 07:16:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.99 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384574; cv=none; b=VHhNgSn05m3YWzvz1hw79n/yoBlGTJ31aHQJwOxOE1keXlrJQqjJHezpTQTYxxrXhQRQwSeFowSM7B841uJOKT2DKc2jsPCw+XVrFG1onswWZbgLEjQIihtaDbhp/JzTl33sJhNyQi65MCl30cHrqruE12pXFgManzLK9YhVvYk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384574; c=relaxed/simple; bh=CgcjcuPBxEx8U7Jt3cd3b1xIp0omNfWJVhkqvzvq26I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TnLNElUAEN1WETvAWZVPBCkMEijWmcf/b3AlsnoPnOVwvcM7ihOUBQQDYqsag4tWHCdAlOBEWeEeTduweNs3jdD/7ZtL6/UgZEziHpb4ozpcMCTEpcLtUInP8bkqvkWdVQr+y2g2r3gAS5VBD9zGMW+EIfJX4q+QxzZvCyxze6g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none 
dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=YKJ0sDQh; arc=none smtp.client-ip=115.124.30.99 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="YKJ0sDQh" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384564; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=h3iv51PzsH25Ajqxiiu8MZp/L1z6wfeiwmy5WW7gSNI=; b=YKJ0sDQhSZsjSQCybnlrt8mOVnVeehcEqu47bijn4wMpzSsxuqxxqxjWmHW2Ig9MDNoQvxeK2UmdxHZS8UEy5dpq8t+f1UgAxD8DpZBJ5AFpTmB+gnGc7fYbGoye2AFuD3VC795VOni3sFayjtztzn1f5mTujjgAjhGB4SrPJjI= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R181e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033032089153;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2cw_1780384562; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2cw_1780384562 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:03 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . 
Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 06/16] arm64: ras: Enable error reporting Date: Tue, 2 Jun 2026 15:15:29 +0800 Message-ID: <20260602071540.3711528-7-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Until now, AEST nodes were registered but never had interrupt routing programmed, so probe finished without actually arming any error source. Both shared and PE (oncore) nodes need this bring-up; the difference is only that oncore state is percpu and must follow CPU online/offline. 
Signed-off-by: Ruidong Tian --- arch/arm64/include/asm/ras.h | 10 +++ drivers/acpi/arm64/aest.c | 4 +- drivers/ras/arm64/ras-core.c | 159 ++++++++++++++++++++++++++++++++++- drivers/ras/arm64/ras.h | 32 +++++++ include/linux/cpuhotplug.h | 1 + 5 files changed, 204 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h index b6640b9972bf..a992610d7755 100644 --- a/arch/arm64/include/asm/ras.h +++ b/arch/arm64/include/asm/ras.h @@ -2,8 +2,18 @@ #ifndef __ASM_RAS_H #define __ASM_RAS_H =20 +#include #include =20 +/* ERRCTLR */ +#define ERR_CTLR_CFI BIT(8) +#define ERR_CTLR_FI BIT(3) +#define ERR_CTLR_UI BIT(2) + +/* ERRIRQCR */ +#define ERRFHICR0_OFFSET 0x0 +#define ERRERICR0_OFFSET 0x10 + struct ras_ext_regs { u64 err_fr; u64 err_ctlr; diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c index af03f4365cfa..5733c91c8e0d 100644 --- a/drivers/acpi/arm64/aest.c +++ b/drivers/acpi/arm64/aest.c @@ -157,6 +157,8 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, common->error_group_register_base); props[(*p)++] =3D PROPERTY_ENTRY_U64("arm,fault-inject-base", common->fault_inject_register_base); + props[(*p)++] =3D PROPERTY_ENTRY_U64("arm,interrupt-config-base", + common->interrupt_config_register_base); props[(*p)++] =3D PROPERTY_ENTRY_U32("arm,fhi-gsiv", fhi_gsiv); props[(*p)++] =3D PROPERTY_ENTRY_U32("arm,eri-gsiv", eri_gsiv); =20 @@ -171,7 +173,7 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, static int __init aest_create_node_fwnode(struct acpi_aest_hdr *hdr, struct platform_device = *pdev) { - struct property_entry props[14] =3D { }; + struct property_entry props[15] =3D { }; int p =3D 0; int ret; =20 diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 9520415df8cb..98f274b9731d 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -5,6 +5,7 @@ * Copyright (c) 2025, Alibaba Group. 
*/ =20 +#include #include #include #include @@ -45,6 +46,21 @@ const struct ras_group ras_group_config[] =3D { }, }; =20 +static void ras_node_foreach_record(void (*func)(struct ras_record *, void= *), + struct ras_node *node, void *data, + unsigned long *bitmap) +{ + int i; + + for_each_clear_bit(i, bitmap, node->record_count) { + ras_select_record(node, i); + + func(&node->records[i], data); + + ras_sync(node); + } + } + static irqreturn_t ras_irq_func(int irq, void *input) { struct ras_node *node =3D input; @@ -52,6 +68,23 @@ static irqreturn_t ras_irq_func(int irq, void *input) return IRQ_HANDLED; } =20 +static void ras_config_irq(struct ras_node *node) +{ + u32 fhi_gsi, eri_gsi; + + if (!node->irq_config) + return; + + if (!device_property_read_u32(node->dev, "arm,fhi-gsiv", &fhi_gsi)) + writeq_relaxed(fhi_gsi, node->irq_config + ERRFHICR0_OFFSET); + + if (!device_property_read_u32(node->dev, "arm,eri-gsiv", &eri_gsi)) + writeq_relaxed(eri_gsi, node->irq_config + ERRERICR0_OFFSET); + + ras_node_dbg(node, "config irq fhi_gsi %u eri_gsi %u at %pK", + fhi_gsi, eri_gsi, node->irq_config); +} + static int ras_register_irq(struct ras_node *node) { int i, irq, ret; @@ -94,6 +127,21 @@ static int ras_register_irq(struct ras_node *node) return ret; } =20 +static void ras_enable_irq(struct ras_record *record) +{ + struct ras_node *node =3D record->node; + u64 err_ctlr; + + err_ctlr =3D record_read(record, ERXCTLR); + + if (node->irq[0]) + err_ctlr |=3D (ERR_CTLR_FI | ERR_CTLR_CFI); + if (node->irq[1]) + err_ctlr |=3D ERR_CTLR_UI; + + record_write(record, ERXCTLR, err_ctlr); +} + static int ras_init_record(struct ras_record *record, int i, struct ras_no= de *node) { record->name =3D devm_kasprintf(node->dev, GFP_KERNEL, "record%d", i); @@ -110,6 +158,85 @@ static int ras_init_record(struct ras_record *record, = int i, struct ras_node *no return 0; } =20 +static void ras_online_record(struct ras_record *record, void *data) +{ + ras_enable_irq(record); +} + +static void 
ras_online_node(struct ras_node *node) +{ + if (!node->name) + return; + + ras_config_irq(node); + + ras_node_foreach_record(ras_online_record, node, NULL, + node->record_implemented); +} + +static void ras_online_oncore_dev(void *data) +{ + int fhi_irq, eri_irq; + struct ras_node *node =3D this_cpu_ptr(data); + + ras_online_node(node); + + fhi_irq =3D node->irq[ACPI_AEST_NODE_FAULT_HANDLING]; + if (fhi_irq > 0) + enable_percpu_irq(fhi_irq, IRQ_TYPE_NONE); + eri_irq =3D node->irq[ACPI_AEST_NODE_ERROR_RECOVERY]; + if (eri_irq > 0) + enable_percpu_irq(eri_irq, IRQ_TYPE_NONE); +} + +static void ras_offline_oncore_dev(void *data) +{ + int fhi_irq, eri_irq; + struct ras_node *node =3D this_cpu_ptr(data); + + fhi_irq =3D node->irq[ACPI_AEST_NODE_FAULT_HANDLING]; + if (fhi_irq > 0) + disable_percpu_irq(fhi_irq); + eri_irq =3D node->irq[ACPI_AEST_NODE_ERROR_RECOVERY]; + if (eri_irq > 0) + disable_percpu_irq(eri_irq); +} + +static int ras_starting_cpu(unsigned int cpu) +{ + pr_debug("CPU%d starting\n", cpu); + ras_online_oncore_dev(&percpu_ras_node); + + return 0; +} + +static int ras_dying_cpu(unsigned int cpu) +{ + pr_debug("CPU%d dying\n", cpu); + ras_offline_oncore_dev(&percpu_ras_node); + + return 0; +} + +static void arm64_ras_remove(struct platform_device *pdev) +{ + struct ras_node *node =3D platform_get_drvdata(pdev); + int i; + + platform_set_drvdata(pdev, NULL); + + if (node->type !=3D ACPI_AEST_PROCESSOR_ERROR_NODE) + return; + + cpuhp_remove_state(CPUHP_AP_ARM_RAS_STARTING); + on_each_cpu(ras_offline_oncore_dev, node->oncore_node, 1); + + for (i =3D 0; i < AEST_MAX_INTERRUPT_PER_NODE; i++) { + if (node->irq[i]) + free_percpu_irq(node->irq[i], node->oncore_node); + } +} + static char *alloc_ras_node_name(struct ras_node *node) { char *name; @@ -208,6 +335,23 @@ static int ras_node_set_inj_base(struct ras_node *node= , phys_addr_t base) return 0; } =20 +static int ras_node_set_irq_base(struct ras_node *node, phys_addr_t base) +{ + phys_addr_t irq_base; + int ret; 
+ + if (!(node->flags & AEST_XFACE_FLAG_INT_CONFIG)) + return 0; + + ret =3D device_property_read_u64(node->dev, "arm,interrupt-config-base", + &irq_base); + if (ret || !irq_base) + return 0; + + node->irq_config =3D irq_base - base + node->base; + return 0; +} + static struct ras_node *ras_init_node(struct platform_device *pdev) { int i, ret =3D 0; @@ -276,6 +420,9 @@ static struct ras_node *ras_init_node(struct platform_d= evice *pdev) ret =3D ras_node_set_inj_base(node, mem->start); if (ret) return ERR_PTR(ret); + ret =3D ras_node_set_irq_base(node, mem->start); + if (ret) + return ERR_PTR(ret); } else if (node->access_type =3D=3D ACPI_AEST_NODE_MEMORY_MAPPED) { return ERR_PTR(-EINVAL); } @@ -372,6 +519,15 @@ static int arm64_ras_probe(struct platform_device *pde= v) return ret; } =20 + if (ras_node_is_oncore(node)) + ret =3D cpuhp_setup_state(CPUHP_AP_ARM_RAS_STARTING, + "drivers/ras/arm64/ras:starting", + ras_starting_cpu, ras_dying_cpu); + else + ras_online_node(node); + if (ret) + return ret; + platform_set_drvdata(pdev, node); =20 return 0; @@ -381,7 +537,8 @@ static struct platform_driver arm64_ras_driver =3D { .driver =3D { .name =3D "arm64_ras", }, - .probe =3D arm64_ras_probe, + .probe =3D arm64_ras_probe, + .remove =3D arm64_ras_remove, }; =20 static int __init arm64_ras_init(void) diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h index b64eae59b6ac..c26a0aae26c5 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -90,6 +90,7 @@ struct ras_node { void __iomem *base; void __iomem *errgsr; void __iomem *inj; + void __iomem *irq_config; phys_addr_t addr; =20 u8 *specific_data; @@ -126,6 +127,7 @@ struct ras_node { u8 access_type; u8 group_format; u32 irq[AEST_MAX_INTERRUPT_PER_NODE]; + u32 gsi[AEST_MAX_INTERRUPT_PER_NODE]; }; =20 #define CASE_READ(res, x) \ @@ -207,4 +209,34 @@ static const struct ras_access ras_access[] =3D { }, }; =20 +static inline bool ras_node_is_oncore(struct ras_node *node) +{ + /* + * A processor 
node is "on-core" (uses PPI + cpuhp) only when its + * interrupt is a per-CPU PPI. A shared processor node (e.g. cluster + * L3 cache, DSU) uses an SPI and must follow the non-oncore path + * (ras_online_node) so that ras_config_irq and ras_online_node are + * called instead of cpuhp_setup_state. + */ + if (node->type !=3D ACPI_AEST_PROCESSOR_ERROR_NODE) + return false; + return irq_is_percpu(node->irq[ACPI_AEST_NODE_FAULT_HANDLING]) || + irq_is_percpu(node->irq[ACPI_AEST_NODE_ERROR_RECOVERY]); +} + +static inline void ras_select_record(struct ras_node *node, int index) +{ + if (node->type =3D=3D ACPI_AEST_PROCESSOR_ERROR_NODE) { + write_sysreg_s(index, SYS_ERRSELR_EL1); + isb(); + } +} + +/* Ensure all writes have taken effect. */ +static inline void ras_sync(struct ras_node *node) +{ + if (node->type =3D=3D ACPI_AEST_PROCESSOR_ERROR_NODE) + isb(); +} + #endif /* _DRIVERS_RAS_ARM64_RAS_H_ */ diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 62cd7b35a29c..ef55c10f6c71 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -179,6 +179,7 @@ enum cpuhp_state { CPUHP_AP_HYPERV_TIMER_STARTING, /* Must be the last timer callback */ CPUHP_AP_DUMMY_TIMER_STARTING, + CPUHP_AP_ARM_RAS_STARTING, CPUHP_AP_ARM_XEN_STARTING, CPUHP_AP_ARM_XEN_RUNSTATE_STARTING, CPUHP_AP_ARM_CORESIGHT_STARTING, --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-113.freemail.mail.aliyun.com (out30-113.freemail.mail.aliyun.com [115.124.30.113]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9BB3E379EFF; Tue, 2 Jun 2026 07:16:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.113 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384581; cv=none;
b=Kt6yf1Jcd95ceOIyohOgvaPFwz7cTlG29Me+cKtRNPlDHaNG/rPvIcxFA1lBhz1xyKXLRi59onjGVm0GqHD6xfyObWmst+OA7YZ4ANbJP2urJZGtkUan6XmNkUaNc4ZL8iWAYz/NIV8GTUBwZPm2EM0+Is6cL7ID07bG6k0q7bk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384581; c=relaxed/simple; bh=Ol51BBUlDDyYnHEWecP2GzXHfgfm5MvdvkVOhuRSwyA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=N4+xQ/FomPgM+KK/CRzdmmVVINNYvYMNZLu8SDg+kXkDrhBdRk13puc6+Owvlq6PDn01VoMDQMvBPuvUSqcsnlmXOhrVWmKPlI2C6lkfz1vMsOFb9QtpCK7S/2fuy+1BvNDWJJhkqkB+5rDM81tc+CaG29kSp1ZDxmDZ9L2kXjY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=nQms/IoF; arc=none smtp.client-ip=115.124.30.113 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="nQms/IoF" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384565; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=6QFRtDSUIJeN26FkCjHT0Hs3G1HOTOdExXdLDoEsG2Y=; b=nQms/IoFuVUEVluxJ/0Fn53BlovQ3oRCeKfPoUarFAm1bxUnhcPMeT/d7UPk90pgZFS4TyP/Tf4fogqm+OvC2eim4peOM9p9iYEirS7O342qnKlO9upRYukI2az6xjB7rZVW+x3/1dwpeeThDiK7v04bY+yf1lZPyWDTo9RYyw8= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033032089153;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2dH_1780384563; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2dH_1780384563 cluster:ay36) by smtp.aliyun-inc.com; Tue, 
02 Jun 2026 15:16:04 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 07/16] arm64: ras: Add error record processing and interrupt handling Date: Tue, 2 Jun 2026 15:15:30 +0800 Message-ID: <20260602071540.3711528-8-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Wire the IRQs registered by the front-end into a record-processing routine, so that interrupts raised by an error source actually translate into observable error events. 
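The write-back step at the end of record processing is easy to get wrong, so here is a small user-space sketch of how the value written to ERR<n>STATUS can be derived from the value that was read. It assumes the bit layout introduced by this patch; sketch_status_clear() and sketch_status_demo() are hypothetical names, and BIT()/GENMASK() are redefined locally for 64-bit values.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)        (UINT64_C(1) << (n))
#define GENMASK(h, l) ((~UINT64_C(0) >> (63 - (h))) & (~UINT64_C(0) << (l)))

/* ERR<n>STATUS layout as defined by this patch. */
#define ERR_STATUS_AV  BIT(31)
#define ERR_STATUS_V   BIT(30)
#define ERR_STATUS_UE  BIT(29)
#define ERR_STATUS_ER  BIT(28)
#define ERR_STATUS_OF  BIT(27)
#define ERR_STATUS_MV  BIT(26)
#define ERR_STATUS_CE  GENMASK(25, 24)
#define ERR_STATUS_DE  BIT(23)
#define ERR_STATUS_PN  BIT(22)
#define ERR_STATUS_UET GENMASK(21, 20)
#define ERR_STATUS_CI  BIT(19)

#define ERR_STATUS_W1TC \
	(ERR_STATUS_AV | ERR_STATUS_V | ERR_STATUS_UE | ERR_STATUS_ER | \
	 ERR_STATUS_OF | ERR_STATUS_MV | ERR_STATUS_CE | ERR_STATUS_DE | \
	 ERR_STATUS_PN | ERR_STATUS_UET | ERR_STATUS_CI)

/*
 * Hypothetical helper (not part of the patch): compute the value to
 * write back so that only the bits that were observed get cleared.
 */
static uint64_t sketch_status_clear(uint64_t status)
{
	/* Drop everything that is not write-one-to-clear (IERR, SERR, ...). */
	status &= ERR_STATUS_W1TC;

	/* Multi-bit fields are cleared by writing all-ones to the field. */
	if (status & ERR_STATUS_CE)
		status |= ERR_STATUS_CE;
	if (status & ERR_STATUS_UET)
		status |= ERR_STATUS_UET;

	return status;
}

static void sketch_status_demo(void)
{
	/* V set, CE = 0b10, SERR = 0x12: SERR is dropped, CE becomes 0b11. */
	uint64_t ce = ERR_STATUS_V | BIT(25) | 0x12;
	/* V and UE set, UET = 0b01: UET becomes 0b11. */
	uint64_t ue = ERR_STATUS_V | ERR_STATUS_UE | BIT(20);

	assert(sketch_status_clear(ce) == (ERR_STATUS_V | ERR_STATUS_CE));
	assert(sketch_status_clear(ue) ==
	       (ERR_STATUS_V | ERR_STATUS_UE | ERR_STATUS_UET));
}
```

Bits that were not set when the record was read stay zero in the written value, so an error that arrives between the read and the write is not cleared by accident.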
Signed-off-by: Umang Chheda Signed-off-by: Ruidong Tian --- arch/arm64/include/asm/ras.h | 26 +++++ drivers/ras/arm64/ras-core.c | 213 +++++++++++++++++++++++++++++++++++ include/linux/acpi_aest.h | 4 + 3 files changed, 243 insertions(+) diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h index a992610d7755..42900e1e9a19 100644 --- a/arch/arm64/include/asm/ras.h +++ b/arch/arm64/include/asm/ras.h @@ -5,6 +5,32 @@ #include #include =20 +/* ERRSTATUS */ +#define ERR_STATUS_AV BIT(31) +#define ERR_STATUS_V BIT(30) +#define ERR_STATUS_UE BIT(29) +#define ERR_STATUS_ER BIT(28) +#define ERR_STATUS_OF BIT(27) +#define ERR_STATUS_MV BIT(26) +#define ERR_STATUS_CE GENMASK(25, 24) +#define ERR_STATUS_DE BIT(23) +#define ERR_STATUS_PN BIT(22) +#define ERR_STATUS_UET GENMASK(21, 20) +#define ERR_STATUS_CI BIT(19) +#define ERR_STATUS_IERR GENMASK_ULL(15, 8) +#define ERR_STATUS_SERR GENMASK_ULL(7, 0) + +/* These bits are write-one-to-clear */ +#define ERR_STATUS_W1TC \ + (ERR_STATUS_AV | ERR_STATUS_V | ERR_STATUS_UE | ERR_STATUS_ER | \ + ERR_STATUS_OF | ERR_STATUS_MV | ERR_STATUS_CE | ERR_STATUS_DE | \ + ERR_STATUS_PN | ERR_STATUS_UET | ERR_STATUS_CI) + +#define ERR_STATUS_UET_UC 0 +#define ERR_STATUS_UET_UEU 1 +#define ERR_STATUS_UET_UEO 2 +#define ERR_STATUS_UET_UER 3 + /* ERRCTLR */ #define ERR_CTLR_CFI BIT(8) #define ERR_CTLR_FI BIT(3) diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 98f274b9731d..8c6d202882ed 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include =20 @@ -16,6 +17,12 @@ #undef pr_fmt #define pr_fmt(fmt) "arm64_ras: " fmt =20 +static bool panic_on_ue; +module_param(panic_on_ue, bool, 0600); +MODULE_PARM_DESC(panic_on_ue, + "Panic on unrecoverable error: 0=3Doff 1=3Don (default: 0)"); + + static DEFINE_PER_CPU(struct ras_node, percpu_ras_node); =20 static const char *const ras_node_name[] =3D { @@ -46,6 +53,145 @@
const struct ras_group ras_group_config[] =3D { }, }; =20 +#define AEST_LOG_PREFIX_BUFFER 64 + +static void ras_print(struct ras_record *record, struct ras_ext_regs *regs) +{ + static atomic_t seqno =3D { 0 }; + struct ras_node *node =3D record->node; + u8 *data =3D node->specific_data; + unsigned int curr_seqno; + char pfx_seq[AEST_LOG_PREFIX_BUFFER]; + int index =3D record->index; + + curr_seqno =3D atomic_inc_return(&seqno); + snprintf(pfx_seq, sizeof(pfx_seq), "{%u}" HW_ERR, curr_seqno); + pr_info("%sHardware error from AEST %s\n", pfx_seq, node->name); + + switch (node->type) { + case ACPI_AEST_PROCESSOR_ERROR_NODE: { + struct acpi_aest_processor *proc =3D (struct acpi_aest_processor *)data; + + if (proc->flags & + (ACPI_AEST_PROC_FLAG_SHARED | ACPI_AEST_PROC_FLAG_GLOBAL)) + pr_err("%s Error from shared processor resource (interrupt handled on C= PU%d)\n", + pfx_seq, smp_processor_id()); + else + pr_err("%s Error from CPU%d\n", pfx_seq, smp_processor_id()); + break; + } + case ACPI_AEST_MEMORY_ERROR_NODE: + pr_err("%s Error from memory at SRAT proximity domain %#x\n", + pfx_seq, + ((struct acpi_aest_memory *)data)->srat_proximity_domain); + break; + case ACPI_AEST_SMMU_ERROR_NODE: + pr_err("%s Error from SMMU IORT node %#x subcomponent %#x\n", + pfx_seq, + ((struct acpi_aest_smmu *)data)->iort_node_reference, + ((struct acpi_aest_smmu *)data)->subcomponent_reference); + break; + case ACPI_AEST_VENDOR_ERROR_NODE: + pr_err("%s Error from vendor hid %8.8s uid %#x\n", pfx_seq, + ((struct acpi_aest_vendor_v2 *)data)->acpi_hid, + ((struct acpi_aest_vendor_v2 *)data)->acpi_uid); + break; + case ACPI_AEST_GIC_ERROR_NODE: + pr_err("%s Error from GIC type %#x instance %#x\n", pfx_seq, + ((struct acpi_aest_gic *)data)->interface_type, + ((struct acpi_aest_gic *)data)->instance_id); + break; + default: + pr_err("%s Unknown AEST node type\n", pfx_seq); + return; + } + + pr_err("%s ERR%dFR: 0x%llx\n", pfx_seq, index, regs->err_fr); + pr_err("%s ERR%dCTRL: 0x%llx\n", 
pfx_seq, index, regs->err_ctlr); + pr_err("%s ERR%dSTATUS: 0x%llx\n", pfx_seq, index, regs->err_status); + if (regs->err_status & ERR_STATUS_AV) + pr_err("%s ERR%dADDR: 0x%llx\n", pfx_seq, index, + regs->err_addr); + + if (regs->err_status & ERR_STATUS_MV) { + pr_err("%s ERR%dMISC0: 0x%llx\n", pfx_seq, index, + regs->err_misc[0]); + pr_err("%s ERR%dMISC1: 0x%llx\n", pfx_seq, index, + regs->err_misc[1]); + pr_err("%s ERR%dMISC2: 0x%llx\n", pfx_seq, index, + regs->err_misc[2]); + pr_err("%s ERR%dMISC3: 0x%llx\n", pfx_seq, index, + regs->err_misc[3]); + } +} + +static void ras_do_proc(struct ras_record *record, struct ras_ext_regs *re= gs) +{ + ras_print(record, regs); +} + +static void ras_panic(struct ras_record *record, struct ras_ext_regs *regs, + char *msg) +{ + ras_print(record, regs); + + panic(msg); +} + +static void ras_proc_record(struct ras_record *record, void *data) +{ + struct ras_ext_regs regs =3D { 0 }; + int *count =3D data; + u64 ue; + + regs.err_status =3D record_read(record, ERXSTATUS); + if (!(regs.err_status & ERR_STATUS_V)) + return; + + (*count)++; + + if (regs.err_status & ERR_STATUS_AV) + regs.err_addr =3D record_read(record, ERXADDR); + + regs.err_fr =3D record_read(record, ERXFR); + regs.err_ctlr =3D record_read(record, ERXCTLR); + + if (regs.err_status & ERR_STATUS_MV) { + regs.err_misc[0] =3D record_read(record, ERXMISC0); + regs.err_misc[1] =3D record_read(record, ERXMISC1); + if (record->node->flags & AEST_XFACE_FLAG_CLEAR_MISC) { + record_write(record, ERXMISC0, 0); + record_write(record, ERXMISC1, 0); + } + } + + /* panic if unrecoverable and uncontainable error encountered */ + ue =3D FIELD_GET(ERR_STATUS_UET, regs.err_status); + if ((regs.err_status & ERR_STATUS_UE) && + (ue =3D=3D ERR_STATUS_UET_UC || ue =3D=3D ERR_STATUS_UET_UEU)) { + if (!panic_on_ue) + ras_record_err(record, "UE detected, panic suppressed\n"); + else + ras_panic(record, &regs, + "AEST: unrecoverable error encountered"); + } + + ras_do_proc(record, &regs); + + /*
Write-one-to-clear the bits we've seen */ + regs.err_status &=3D ERR_STATUS_W1TC; + + /* Multi-bit fields must be written with all-ones to clear. */ + if (regs.err_status & ERR_STATUS_CE) + regs.err_status |=3D ERR_STATUS_CE; + + /* Multi-bit fields must be written with all-ones to clear. */ + if (regs.err_status & ERR_STATUS_UET) + regs.err_status |=3D ERR_STATUS_UET; + + record_write(record, ERXSTATUS, regs.err_status); +} + static void ras_node_foreach_record(void (*func)(struct ras_record *, void= *), struct ras_node *node, void *data, unsigned long *bitmap) @@ -59,12 +205,72 @@ static void ras_node_foreach_record(void (*func)(struc= t ras_record *, void *), =20 ras_sync(node); } +} + +static void ras_node_foreach_poll_record(void (*func)(struct ras_record *,= void *), + struct ras_node *node, void *data) +{ + int i; + /* + * Per AEST spec: + * - record_implemented: bitmap of records that are actually + * implemented (valid records on this node). + * - status_reporting: bitmap of records whose error status is + * reported through ERRGSR; these will be discovered via the + * ERRGSR scan path below and do not need polling. + * + * The remaining records (implemented but not reported via ERRGSR) + * must be polled one by one to detect errors.
Compute that set as: + * poll_bitmap =3D record_implemented & ~status_reporting + */ + for_each_clear_bit(i, node->record_implemented, node->record_count) { + if (!test_bit(i, node->status_reporting)) + continue; + + ras_select_record(node, i); + + func(&node->records[i], data); + + ras_sync(node); + } +} + +static int ras_proc(struct ras_node *node) +{ + int count =3D 0, i, j, size =3D node->record_count; + u64 err_group =3D 0; + + ras_node_foreach_poll_record(ras_proc_record, node, &count); + + if (!node->errgsr) + return count; + + ras_node_dbg(node, "Report bitmap %*pb\n", size, node->status_reporting); + for (i =3D 0; i < BITS_TO_U64(size); i++) { + err_group =3D readq_relaxed((void *)node->errgsr + i * 8); + ras_node_dbg(node, "errgsr[%d]: 0x%llx\n", i, err_group); + + for_each_set_bit(j, (unsigned long *)&err_group, BITS_PER_LONG) { + /* + * The error group base is only valid for memory-mapped nodes, + * so the driver does not need to write the select register + * or sync. + */ + if (test_bit(i * BITS_PER_LONG + j, node->status_reporting)) + continue; + ras_proc_record(&node->records[i * BITS_PER_LONG + j], &count); + } } =20 + return count; +} + static irqreturn_t ras_irq_func(int irq, void *input) { struct ras_node *node =3D input; =20 + ras_proc(node); + return IRQ_HANDLED; } =20 @@ -165,9 +371,16 @@ static void ras_online_record(struct ras_record *recor= d, void *data) =20 static void ras_online_node(struct ras_node *node) { + int count =3D 0; + if (!node->name) return; =20 + ras_node_foreach_record(ras_proc_record, node, &count, + node->record_implemented); + + ras_node_dbg(node, "%d errors found before enabled\n", count); + ras_config_irq(node); =20 ras_node_foreach_record(ras_online_record, node, NULL, diff --git a/include/linux/acpi_aest.h b/include/linux/acpi_aest.h index a462895a7b5a..9cb0fcb52c39 100644 --- a/include/linux/acpi_aest.h +++ b/include/linux/acpi_aest.h @@ -9,6 +9,10 @@ #define AEST_FHI_NAME "AEST:FHI" #define AEST_ERI_NAME "AEST:ERI" =20 +/* AEST component */ +#define
ACPI_AEST_PROC_FLAG_GLOBAL BIT(0) +#define ACPI_AEST_PROC_FLAG_SHARED BIT(1) + /* AEST interrupt */ #define AEST_INTERRUPT_MODE BIT(0) =20 --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-132.freemail.mail.aliyun.com (out30-132.freemail.mail.aliyun.com [115.124.30.132]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B953D364E89; Tue, 2 Jun 2026 07:16:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.132 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384578; cv=none; b=ZUYXfHS8YKeXMPvxTxMVGZA4Feq5a5WH7ahb1QUm0UsGnSVmxjcXDm80Wao1APmcP5f3ileDDkfOa5vs13t3epkIvOnOFiPeWItHbweR8nciZqVmZmOLldUyP9YrUi0Rcht3TG5TLVpDoPXz1v8hcso1DyiK3rF8jGs1LP7POpM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384578; c=relaxed/simple; bh=8vbY5RHqeoV0fwDx4oOCaSKQX+tWBP8lX18+TLXj984=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=h+vMuQkxDJtB5Ye4X6xo7NEEPWNCOEg1m8hJco1HeXvqpAV+AC5X2XmSVkvp8f6NgPv5RoBLcrfe2wtjUnDNSjMGjrg0kOvmaqJEYjJYyTioeE6BGzk0QJI5gCHnwhFK9d4bWNg6hyPc9euRIMfuaw9NxYShOY3A+NtILOaum2U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=OvM1IUWn; arc=none smtp.client-ip=115.124.30.132 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="OvM1IUWn" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=linux.alibaba.com; s=default; t=1780384568; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=H4LP+ZuzdN+MoIKSVl/cWVVJdadts6yo+cmT5nvYGPQ=; b=OvM1IUWncySTNVCWe1p1OHS/lbPzFzQEgB1j5Ft44S6PxW3XtfjcXijIGjStjklmO1iXaTAZ6nMDxAnAGoKFoqMhW/nOWeZy4q0bTbMFD427JxYRIQuJfhno4OagheqTdYKm24agLfkguP+gIlObAfCWiDRaoRn92W4GogYQox0= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R151e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045098064;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2dj_1780384565; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2dj_1780384565 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:06 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 08/16] arm64: ras: Handle memory failure for uncorrectable errors Date: Tue, 2 Jun 2026 15:15:31 +0800 Message-ID: <20260602071540.3711528-9-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When an uncorrectable error (UE/DE) is detected and the error record reports a System Physical Address (SPA), invoke memory_failure() to offline the affected page. This prevents further consumption of corrupted data. 
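The filtering described above can be sketched in user space as follows. This is only an approximation of the decision made before the memory-failure path is entered: sketch_addr_to_pfn() is a hypothetical helper, SKETCH_PAGE_SHIFT assumes 4 KiB pages, and the address is masked with ERR_ADDR_PADDR here as an assumption (the patch itself masks with PHYS_MASK). The conversion to a page frame number is shown because memory_failure_queue() takes a pfn rather than a physical address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)        (UINT64_C(1) << (n))
#define GENMASK(h, l) ((~UINT64_C(0) >> (63 - (h))) & (~UINT64_C(0) << (l)))

/* Definitions mirroring the ones used by this patch. */
#define ERR_STATUS_CE    GENMASK(25, 24)
#define ERR_ADDR_AI      BIT(61)
#define ERR_ADDR_PADDR   GENMASK(55, 0)
#define AEST_ADDRESS_SPA 0
#define AEST_ADDRESS_LA  1

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper (not part of the patch): return true and fill in
 * *pfn when the record describes an uncorrected error at a usable
 * system physical address, false when the record must be skipped.
 */
static bool sketch_addr_to_pfn(uint64_t status, uint64_t addr,
			       int addressing_mode, uint64_t *pfn)
{
	assert(addressing_mode == AEST_ADDRESS_SPA ||
	       addressing_mode == AEST_ADDRESS_LA);

	/* Corrected errors need no page offlining. */
	if (status & ERR_STATUS_CE)
		return false;

	/* Logical or otherwise invalid addresses cannot be mapped to a page. */
	if (addressing_mode == AEST_ADDRESS_LA || (addr & ERR_ADDR_AI))
		return false;

	*pfn = (addr & ERR_ADDR_PADDR) >> SKETCH_PAGE_SHIFT;
	return true;
}
```

A caller would pass the status and address values read from the record together with the per-record addressing mode taken from the node's addressing-mode bitmap, and only queue the page when the helper returns true.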
Signed-off-by: Ruidong Tian --- arch/arm64/include/asm/ras.h | 4 ++++ drivers/acpi/arm64/aest.c | 5 ++++- drivers/ras/arm64/ras-core.c | 21 +++++++++++++++++++++ drivers/ras/arm64/ras.h | 26 ++++++++++++++++++++++++++ include/linux/acpi_aest.h | 3 +++ 5 files changed, 58 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h index 42900e1e9a19..7bef631a395c 100644 --- a/arch/arm64/include/asm/ras.h +++ b/arch/arm64/include/asm/ras.h @@ -31,6 +31,10 @@ #define ERR_STATUS_UET_UEO 2 #define ERR_STATUS_UET_UER 3 =20 +/* ERRADDR */ +#define ERR_ADDR_AI BIT(61) +#define ERR_ADDR_PADDR GENMASK_ULL(55, 0) + /* ERRCTLR */ #define ERR_CTLR_CFI BIT(8) #define ERR_CTLR_FI BIT(3) diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c index 5733c91c8e0d..1b020ab7eccd 100644 --- a/drivers/acpi/arm64/aest.c +++ b/drivers/acpi/arm64/aest.c @@ -153,6 +153,9 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, props[(*p)++] =3D PROPERTY_ENTRY_U64_ARRAY_LEN("arm,status-reporting", status_reporting, group_len); + props[(*p)++] =3D PROPERTY_ENTRY_U64_ARRAY_LEN("arm,addressing-mode", + addressing_mode, + group_len); props[(*p)++] =3D PROPERTY_ENTRY_U64("arm,error-group-base", common->error_group_register_base); props[(*p)++] =3D PROPERTY_ENTRY_U64("arm,fault-inject-base", @@ -173,7 +176,7 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct = property_entry *props, static int __init aest_create_node_fwnode(struct acpi_aest_hdr *hdr, struct platform_device = *pdev) { - struct property_entry props[15] =3D { }; + struct property_entry props[16] =3D { }; int p =3D 0; int ret; =20 diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 8c6d202882ed..babb390b795f 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -127,7 +127,17 @@ static void ras_print(struct ras_record *record, struc= t ras_ext_regs *regs) =20 static void ras_do_proc(struct 
ras_record *record, struct ras_ext_regs *re= gs) { + u64 status =3D regs->err_status, addr =3D regs->err_addr; + ras_print(record, regs); + + if (status & ERR_STATUS_CE) + return; + + if (record->addressing_mode =3D=3D AEST_ADDRESS_LA || (addr & ERR_ADDR_AI= )) + return; + + memory_failure_queue(PHYS_PFN(addr & PHYS_MASK), 0); } =20 static void ras_panic(struct ras_record *record, struct ras_ext_regs *regs, @@ -360,7 +370,10 @@ static int ras_init_record(struct ras_record *record, = int i, struct ras_node *no record->access =3D &ras_access[node->access_type]; record->index =3D i; record->node =3D node; + record->addressing_mode =3D test_bit(i, node->addressing_mode); =20 + ras_record_dbg(record, "record initialized, addressing mode: %s\n", + record->addressing_mode ? "LA" : "SPA"); return 0; } =20 @@ -598,6 +611,11 @@ static struct ras_node *ras_init_node(struct platform_= device *pdev) GFP_KERNEL); if (!node->status_reporting) return ERR_PTR(-ENOMEM); + node->addressing_mode =3D devm_bitmap_zalloc(dev, + node->group->errgsr_num * BITS_PER_TYPE(u64), + GFP_KERNEL); + if (!node->addressing_mode) + return ERR_PTR(-ENOMEM); =20 ret =3D device_property_read_u64_array(dev, "arm,record-implemented", (u64 *)node->record_implemented, @@ -605,6 +623,9 @@ static struct ras_node *ras_init_node(struct platform_d= evice *pdev) ret =3D ret ?: device_property_read_u64_array(dev, "arm,status-reporting", (u64 *)node->status_reporting, node->group->errgsr_num); + ret =3D ret ?: device_property_read_u64_array(dev, "arm,addressing-mode", + (u64 *)node->addressing_mode, + node->group->errgsr_num); if (ret) return ERR_PTR(ret); =20 diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h index c26a0aae26c5..11c6def1e4bf 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -70,6 +70,16 @@ struct ras_record { const struct ras_access *access; =20 int index; + /* + * This bit specifies the addressing mode to populate the ERR_ADDR + * register: + * 0b: Error record reports
System Physical Addresses (SPA) in + * the ERR_ADDR register. + * 1b: Error record reports error node-specific Logical Addresses (LA) + * in the ERR_ADDR register. OS must use other means to translate + * the reported LA into SPA. + */ + int addressing_mode; }; =20 struct ras_group { @@ -116,6 +126,22 @@ struct ras_node { * error events. */ unsigned long *status_reporting; + /* + * This bitmap specifies the addressing mode used by each + * error record within this error node to populate the + * ERR_ADDR register. + * Bit[n] of this field pertains to error record corresponding + * to index n in the error group. + * Bit[n] =3D 0b: Error record at index n reports System + * Physical Addresses (SPA) in the ERR_ADDR + * register. + * Bit[n] =3D 1b: Error record at index n reports error + * node-specific Logical Addresses (LA) in the + * ERR_ADDR register. + * OS must use other means to translate the reported LA + * into SPA + */ + unsigned long *addressing_mode; struct ras_record *records; =20 u32 specific_data_size; diff --git a/include/linux/acpi_aest.h b/include/linux/acpi_aest.h index 9cb0fcb52c39..9a8aa234d9e5 100644 --- a/include/linux/acpi_aest.h +++ b/include/linux/acpi_aest.h @@ -13,6 +13,9 @@ #define ACPI_AEST_PROC_FLAG_GLOBAL BIT(0) #define ACPI_AEST_PROC_FLAG_SHARED BIT(1) =20 +#define AEST_ADDRESS_SPA 0 +#define AEST_ADDRESS_LA 1 + /* AEST interrupt */ #define AEST_INTERRUPT_MODE BIT(0) =20 --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-101.freemail.mail.aliyun.com (out30-101.freemail.mail.aliyun.com [115.124.30.101]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9920B345CAA; Tue, 2 Jun 2026 07:16:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.101 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384579; cv=none; 
b=LLHat2tWBbQ33IXZ+wHWgTGP7ok3E3GqwDGVvrA1OUtM5Kba8CJZHbBWo5jhAHKvTQUYWQF05Oiaex5Sh898QoUcQMqmkl1O5RP2mxfvdd2v3euq/DKPy7eInGtyn+x9/sqBVP8oIOEbIzFi2P0chcbt2rTmWVvzds0RIrk6o38= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384579; c=relaxed/simple; bh=1FLUAjqkojQoRY09Mk+KMmwl/m0J9QDmG803FDSKLhI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Fjfbwe/gMGKUuKHo3xXOqJXVYHzwbYWQWW3XrVZzHtUBOO6hUooyJlVRPknXZWqsAI98xSobhYQTKVHwWvP97PFI42jw5AG+vHQq22UcQqcnQ6GnqKWVBeV23s/ced/PfiD0IACFwHzhqC62FjSlcRytQozNgqw/WHnxxEmertU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=ny7VHHEV; arc=none smtp.client-ip=115.124.30.101 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="ny7VHHEV" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384568; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=ZozZHcNNhfj5S1VhtyeRB12PjLLzujmaREsy9720OSQ=; b=ny7VHHEVsALIT4DhdBqYklIKfaE1BssdRSwZ9WmYZfMUA9L7l6t5xPxx+ho5hW0O9XADD/PKXH+zxgQBYfgY066ogH+z0Hg1NQpZH5aEn0tHZkEQvcH6ZM6VTpr2Dtox7Y1NtGLVmIS8JRkfZR+FuknVDJnSrOsjoR3BBaDH7tY= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R901e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam011083073210;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2e1_1780384566; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2e1_1780384566 cluster:ay36) by smtp.aliyun-inc.com; Tue, 
02 Jun 2026 15:16:07 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 09/16] arm64: ras: Probe RAS architecture version Date: Tue, 2 Jun 2026 15:15:32 +0800 Message-ID: <20260602071540.3711528-10-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The RAS version of a component can be probed via its ERRDEVARCH register. In cases where a component (e.g., SMMU) does not implement an ERRDEVARCH register, the driver falls back to using the RAS version of the Processing Element (PE). 
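For reference, the field extraction and the fallback described above can be modelled in plain userspace C. This is only a sketch under stated assumptions: the bit range [19:16] matches ERRDEVARCH_REV in this patch, while errdevarch_rev() and ras_node_version() are hypothetical names that do not exist in the driver.

```c
#include <stdint.h>

/* Bits [19:16] of ERRDEVARCH, matching ERRDEVARCH_REV in this patch. */
#define ERRDEVARCH_REV_SHIFT	16
#define ERRDEVARCH_REV_MASK	(UINT32_C(0xF) << ERRDEVARCH_REV_SHIFT)

/* Hypothetical helper: extract the revision field from a raw register value. */
static inline unsigned int errdevarch_rev(uint32_t errdevarch)
{
	return (errdevarch & ERRDEVARCH_REV_MASK) >> ERRDEVARCH_REV_SHIFT;
}

/*
 * Hypothetical helper: use the component's own revision when it implements
 * ERRDEVARCH, otherwise fall back to the version reported by the PE.
 */
static inline unsigned int ras_node_version(int has_errdevarch,
					     uint32_t errdevarch,
					     unsigned int pe_version)
{
	return has_errdevarch ? errdevarch_rev(errdevarch) : pe_version;
}
```

In the driver itself the register is read with readl_relaxed() and the PE value comes from ID_AA64PFR0_EL1; the sketch only illustrates the selection logic.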
Signed-off-by: Ruidong Tian --- arch/arm64/include/asm/ras.h | 3 +++ drivers/ras/arm64/ras-core.c | 25 +++++++++++++++++++++++++ drivers/ras/arm64/ras.h | 3 +++ 3 files changed, 31 insertions(+) diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h index 7bef631a395c..5b938ff03e74 100644 --- a/arch/arm64/include/asm/ras.h +++ b/arch/arm64/include/asm/ras.h @@ -44,6 +44,9 @@ #define ERRFHICR0_OFFSET 0x0 #define ERRERICR0_OFFSET 0x10 =20 +/* ERRDEVARCH */ +#define ERRDEVARCH_REV GENMASK(19, 16) + struct ras_ext_regs { u64 err_fr; u64 err_ctlr; diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index babb390b795f..9fbc98e89f15 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -169,9 +169,18 @@ static void ras_proc_record(struct ras_record *record,= void *data) if (regs.err_status & ERR_STATUS_MV) { regs.err_misc[0] =3D record_read(record, ERXMISC0); regs.err_misc[1] =3D record_read(record, ERXMISC1); + if (record->node->version >=3D ID_AA64PFR0_EL1_RAS_V1P1) { + regs.err_misc[2] =3D record_read(record, ERXMISC2); + regs.err_misc[3] =3D record_read(record, ERXMISC3); + } + if (record->node->flags & AEST_XFACE_FLAG_CLEAR_MISC) { record_write(record, ERXMISC0, 0); record_write(record, ERXMISC1, 0); + if (record->node->version >=3D ID_AA64PFR0_EL1_RAS_V1P1) { + record_write(record, ERXMISC2, 0); + record_write(record, ERXMISC3, 0); + } } } =20 @@ -358,6 +367,21 @@ static void ras_enable_irq(struct ras_record *record) record_write(record, ERXCTLR, err_ctlr); } =20 +static int get_ras_node_ver(struct ras_node *node) +{ + u32 reg; + + if (node->type =3D=3D ACPI_AEST_GIC_ERROR_NODE) { + if (!node->base) + return 0; + + reg =3D readl_relaxed(node->base + GIC_ERRDEVARCH); + return FIELD_GET(ERRDEVARCH_REV, reg); + } + + return FIELD_GET(ID_AA64PFR0_EL1_RAS_MASK, read_cpuid(ID_AA64PFR0_EL1)); +} + static int ras_init_record(struct ras_record *record, int i, struct ras_no= de *node) { record->name =3D 
devm_kasprintf(node->dev, GFP_KERNEL, "record%d", i); @@ -665,6 +689,7 @@ static struct ras_node *ras_init_node(struct platform_d= evice *pdev) if (!node->name) return ERR_PTR(-ENOMEM); =20 + node->version =3D get_ras_node_ver(node); node->records =3D devm_kcalloc(node->dev, node->record_count, sizeof(struct ras_record), GFP_KERNEL); if (!node->records) diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h index 11c6def1e4bf..03d1b498acc4 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -58,6 +58,8 @@ #define ERXPFGCTL 0x808 #define ERXPFGCDN 0x810 =20 +#define GIC_ERRDEVARCH 0xFFBC + struct ras_access { u64 (*read)(void __iomem *base, u32 offset); void (*write)(void __iomem *base, u32 offset, u64 val); @@ -148,6 +150,7 @@ struct ras_node { u32 record_count; u32 record_index; u32 flags; + int version; =20 u8 type; u8 access_type; --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-99.freemail.mail.aliyun.com (out30-99.freemail.mail.aliyun.com [115.124.30.99]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8942C378D72; Tue, 2 Jun 2026 07:16:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.99 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384605; cv=none; b=GS2DPpgAGROZj+hTiK3R1FQjYDsKPVphOi9ogZDqWnuGsb3VB/3Ew4QnMeHEU6B96fCMFGus+KTsr7kDSTD7qYmuecjqOxkDRYDD0WM/YsgT9iPY9cig8YNHS6KeZGRPMyg4U6EkRJnKn1Si8H/im5GmHtaLhlh8eyH5yaS7pTI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384605; c=relaxed/simple; bh=1Z9YVimM0YkRL83C7BqmD94EJEbNUNkrK1UKar8gDKQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=Qfc9kuVU8IabKbWzodtoECZ/4pXwbsb8vVKBUK+oyCO8d/3hhOzj+4xjBExZCx3yEOabe0M/58uQlu4eAWfAoanhOuPKXC2rBXUqZOyNlo14z9MMwpJVNVT+/QiFeQzZDDoGnIr2whxZkkZgYqrMU1wha23/GrZB9JcphbHV56M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=NKjZjc6l; arc=none smtp.client-ip=115.124.30.99 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="NKjZjc6l" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384571; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=WNKFUAbkpoYroboLULokfYwli0lIltgQwkqHon85TEw=; b=NKjZjc6lt05eD4Phx7Y4LsrrKeu+xacNaD7Bnd85HYU6ke8eObaOEpDP4Jemq/YEpuRHr3ygSkHrJ6Xle8UBmGck1/DysY2D5tkyFTzXuPJULdUSpSAGpgvLzEQXnusuxC/naPgiFWGnTv40D6XiU66aYiJsJ91OTnrs7NHnY+8= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R601e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033045098064;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2et_1780384568; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2et_1780384568 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:09 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . 
Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 10/16] arm64: ras: Support CE threshold of error record Date: Tue, 2 Jun 2026 15:15:33 +0800 Message-ID: <20260602071540.3711528-11-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The CE threshold defines the number of Correctable Errors (CE) that must occur in a record before an interrupt is triggered. Error records implement the CE counter in one of several widths: 8-bit, 16-bit, or 32-bit. Detect the counter width supported by each error record and set the default threshold to 1, so that an interrupt is generated for every CE. The 32-bit configuration is detected but not programmed by this patch.
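The threshold arithmetic can be checked in isolation with a small userspace model. It is a sketch, not the driver code: the field positions (counter starting at bit 32, 7-bit and 15-bit count fields) are taken from the ERR_MISC0_* definitions in this patch, ce_preset_misc0() is a hypothetical name, and rejecting a threshold of 0 is an assumption of the sketch rather than something the patch does.

```c
#include <stdint.h>

/* Counter field layout as defined by the ERR_MISC0_* macros in this patch. */
#define CEC_SHIFT	32
#define CEC_8B_MASK	(UINT64_C(0x7F) << CEC_SHIFT)	/* bits [38:32] */
#define CEC_16B_MASK	(UINT64_C(0x7FFF) << CEC_SHIFT)	/* bits [46:32] */

/*
 * Hypothetical helper: compute the ERXMISC0 value that makes the counter
 * overflow after `threshold` further correctable errors.
 * Returns 0 on success, -1 if the threshold does not fit the counter.
 */
static inline int ce_preset_misc0(uint64_t misc0, uint64_t mask,
				  uint64_t threshold, uint64_t *out)
{
	uint64_t max_count = mask >> CEC_SHIFT;
	uint64_t count;

	/* Assumption of this sketch: 0 is rejected, it would spill past the field. */
	if (threshold == 0 || threshold > max_count)
		return -1;

	/* Same formula as the patch: max_count - threshold + 1. */
	count = max_count - threshold + 1;
	*out = (misc0 & ~mask) | (count << CEC_SHIFT);
	return 0;
}
```

With the default threshold of 1 and an 8-bit counter, the count field is preset to 0x7F, so the next CE overflows the counter.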
Signed-off-by: Ruidong Tian --- arch/arm64/include/asm/ras.h | 41 +++++++++++++++++ drivers/ras/arm64/ras-core.c | 85 +++++++++++++++++++++++++++++++++++- drivers/ras/arm64/ras.h | 18 ++++++++ 3 files changed, 143 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/ras.h b/arch/arm64/include/asm/ras.h index 5b938ff03e74..ae67cfcc214e 100644 --- a/arch/arm64/include/asm/ras.h +++ b/arch/arm64/include/asm/ras.h @@ -5,6 +5,39 @@ #include #include =20 +/* ERRFR */ +#define ERR_FR_CE GENMASK_ULL(54, 53) +#define ERR_FR_RP BIT(15) +#define ERR_FR_CEC GENMASK_ULL(14, 12) + +#define ERR_FR_RP_SINGLE_COUNTER 0 +#define ERR_FR_RP_DOUBLE_COUNTER 1 + +#define ERR_FR_CEC_0B_COUNTER 0 +#define ERR_FR_CEC_8B_COUNTER BIT(1) +#define ERR_FR_CEC_16B_COUNTER BIT(2) + +/* ERRMISC0 */ + +/* ERRFR.CEC =3D=3D 0b010, ERRFR.RP =3D=3D 0 */ +#define ERR_MISC0_8B_OF BIT(39) +#define ERR_MISC0_8B_CEC GENMASK_ULL(38, 32) + +/* ERRFR.CEC =3D=3D 0b100, ERRFR.RP =3D=3D 0 */ +#define ERR_MISC0_16B_OF BIT(47) +#define ERR_MISC0_16B_CEC GENMASK_ULL(46, 32) + +#define ERR_MISC0_CEC_SHIFT 32 + +#define ERR_8B_CEC_MAX (ERR_MISC0_8B_CEC >> ERR_MISC0_CEC_SHIFT) +#define ERR_16B_CEC_MAX (ERR_MISC0_16B_CEC >> ERR_MISC0_CEC_SHIFT) + +/* ERRFR.CEC =3D=3D 0b100, ERRFR.RP =3D=3D 1 */ +#define ERR_MISC0_16B_OFO BIT(63) +#define ERR_MISC0_16B_CECO GENMASK_ULL(62, 48) +#define ERR_MISC0_16B_OFR BIT(47) +#define ERR_MISC0_16B_CECR GENMASK_ULL(46, 32) + /* ERRSTATUS */ #define ERR_STATUS_AV BIT(31) #define ERR_STATUS_V BIT(30) @@ -47,6 +80,14 @@ /* ERRDEVARCH */ #define ERRDEVARCH_REV GENMASK(19, 16) =20 +enum ras_ce_threshold { + RAS_CE_THRESHOLD_0B, + RAS_CE_THRESHOLD_8B, + RAS_CE_THRESHOLD_16B, + RAS_CE_THRESHOLD_32B, + RAS_CE_THRESHOLD_UNKNOWN, +}; + struct ras_ext_regs { u64 err_fr; u64 err_ctlr; diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 9fbc98e89f15..94514a5bb973 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -53,6 +53,20 @@ 
const struct ras_group ras_group_config[] =3D { }, }; =20 +static const struct ce_threshold_info ce_info[] =3D { + [RAS_CE_THRESHOLD_0B] =3D { 0 }, + [RAS_CE_THRESHOLD_8B] =3D { + .max_count =3D ERR_8B_CEC_MAX, + .mask =3D ERR_MISC0_8B_CEC, + .shift =3D ERR_MISC0_CEC_SHIFT, + }, + [RAS_CE_THRESHOLD_16B] =3D { + .max_count =3D ERR_16B_CEC_MAX, + .mask =3D ERR_MISC0_16B_CEC, + .shift =3D ERR_MISC0_CEC_SHIFT, + }, +}; + #define AEST_LOG_PREFIX_BUFFER 64 =20 static void ras_print(struct ras_record *record, struct ras_ext_regs *regs) @@ -174,8 +188,8 @@ static void ras_proc_record(struct ras_record *record, = void *data) regs.err_misc[3] =3D record_read(record, ERXMISC3); } =20 + record_write(record, ERXMISC0, record->ce.reg_val); if (record->node->flags & AEST_XFACE_FLAG_CLEAR_MISC) { - record_write(record, ERXMISC0, 0); record_write(record, ERXMISC1, 0); if (record->node->version >=3D ID_AA64PFR0_EL1_RAS_V1P1) { record_write(record, ERXMISC2, 0); @@ -367,6 +381,73 @@ static void ras_enable_irq(struct ras_record *record) record_write(record, ERXCTLR, err_ctlr); } =20 +static int ras_get_ce_threshold(struct ras_record *record) +{ + u64 err_fr, err_fr_cec, err_fr_rp; + + err_fr =3D record_read(record, ERXFR); + err_fr_cec =3D FIELD_GET(ERR_FR_CEC, err_fr); + err_fr_rp =3D FIELD_GET(ERR_FR_RP, err_fr); + + if (err_fr_cec =3D=3D ERR_FR_CEC_0B_COUNTER) + return RAS_CE_THRESHOLD_0B; + else if (err_fr_rp =3D=3D ERR_FR_RP_DOUBLE_COUNTER) + return RAS_CE_THRESHOLD_32B; + else if (err_fr_cec =3D=3D ERR_FR_CEC_8B_COUNTER) + return RAS_CE_THRESHOLD_8B; + else if (err_fr_cec =3D=3D ERR_FR_CEC_16B_COUNTER) + return RAS_CE_THRESHOLD_16B; + + return RAS_CE_THRESHOLD_UNKNOWN; +} + +static void ras_set_ce_threshold(struct ras_record *record) +{ + u64 err_misc0; + struct ce_threshold *ce =3D &record->ce; + const struct ce_threshold_info *info; + + record->threshold_type =3D ras_get_ce_threshold(record); + + switch (record->threshold_type) { + case RAS_CE_THRESHOLD_0B: + 
ras_record_dbg(record, "do not support CE threshold!\n"); + return; + case RAS_CE_THRESHOLD_8B: + ras_record_dbg(record, "support 8 bit CE threshold!\n"); + break; + case RAS_CE_THRESHOLD_16B: + ras_record_dbg(record, "support 16 bit CE threshold!\n"); + break; + case RAS_CE_THRESHOLD_32B: + ras_record_dbg(record, "not support 32 bit CE threshold!\n"); + return; + default: + ras_record_dbg(record, "Unknown misc0 ce threshold!\n"); + return; + } + + err_misc0 =3D record_read(record, ERXMISC0); + info =3D &ce_info[record->threshold_type]; + ce->info =3D info; + + /* Default CE threshold is 1 */ + ce->threshold =3D DEFAULT_CE_THRESHOLD; + /* + * The CEC field in ERXMISC0 is a saturating up-counter; the + * overflow flag (ERXSTATUS.OF) is asserted only when CEC + * saturates at max_count. To make "threshold" mean "trigger OF + * after `threshold` more CEs", preset CEC to max_count - threshold. + */ + ce->count =3D info->max_count - ce->threshold + 1; + ce->reg_val =3D (err_misc0 & ~info->mask) | + (ce->count << info->shift); + + record_write(record, ERXMISC0, ce->reg_val); + ras_record_dbg(record, "CE threshold is %llu, controlled by Kernel", + ce->threshold); +} + static int get_ras_node_ver(struct ras_node *node) { u32 reg; @@ -382,6 +463,7 @@ static int get_ras_node_ver(struct ras_node *node) return FIELD_GET(ID_AA64PFR0_EL1_RAS_MASK, read_cpuid(ID_AA64PFR0_EL1)); } =20 + static int ras_init_record(struct ras_record *record, int i, struct ras_no= de *node) { record->name =3D devm_kasprintf(node->dev, GFP_KERNEL, "record%d", i); @@ -403,6 +485,7 @@ static int ras_init_record(struct ras_record *record, i= nt i, struct ras_node *no =20 static void ras_online_record(struct ras_record *record, void *data) { + ras_set_ce_threshold(record); ras_enable_irq(record); } =20 diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h index 03d1b498acc4..ac3876912495 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -11,6 +11,8 @@ #include #include =20 
+#define DEFAULT_CE_THRESHOLD 1 + #define record_read(record, offset) \ ((record)->access->read((record)->regs_base, (offset))) #define record_write(record, offset, val) \ @@ -65,12 +67,28 @@ struct ras_access { void (*write)(void __iomem *base, u32 offset, u64 val); }; =20 +struct ce_threshold_info { + u64 max_count; + u64 mask; + u64 shift; +}; + +struct ce_threshold { + const struct ce_threshold_info *info; + u64 count; + u64 threshold; + u64 reg_val; +}; + struct ras_record { char *name; void __iomem *regs_base; struct ras_node *node; const struct ras_access *access; =20 + struct ce_threshold ce; + enum ras_ce_threshold threshold_type; + int index; /* * This bit specifies the addressing mode to populate the ERR_ADDR --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-119.freemail.mail.aliyun.com (out30-119.freemail.mail.aliyun.com [115.124.30.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 139ED379EF9; Tue, 2 Jun 2026 07:16:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.119 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384578; cv=none; b=szXftfCW8QPyp3a2lOjOQ1LylroZiMZwB3qXTq2kDTEByOeOQ8++lQRrwnpyPbTg1QDIIUBCS03FDhSe8JUrAcOzgc6j6MV4j5o5QgwcMiDlNDGhhOGDfSewN5+bVWhJmDDsFYBKq0ayMOyzGvqaQlVQAkLsTslvv6Y8VEHn2Dw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384578; c=relaxed/simple; bh=osrAY3oSXcmFdhXfl5J9csnroBfPWQeryAd6I4CtGFA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UCW9pSjHb7DKA5IRKAlOY5Wh+gCAK6ljJK7nS3Q3rA1FpBTb+07Lkm9vgjDyEtYJemLdJhOnI5QHsFnEfoEHssLEBIbXtxjoM+S8fZyepyWu57M7kTSfEpofIP4nf3ebwxnKf/k6JF4VxHUGLUL34/4qWirV4PC/iiC/AYbiT4M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; 
spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=Jcy+GYjs; arc=none smtp.client-ip=115.124.30.119 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="Jcy+GYjs" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384572; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=wKoXcf+SQackXHAfJqahD6MXuhSRg2crwRRaqTj4nF4=; b=Jcy+GYjs10Y8Lj6OjSEB4FzPA09B5dzTjv4QSUfLv72M/nenvRzCUBrE+r+J6t5i1hSgi+zakJRlUGFP7o/DOL4YuFR6EpEX0F4uIdyubdy376ZUGoYmwk92q8Yx7zEGhkzq8iJhbNdojhNfHxg+2e7JzRM59fz2phgIRys7JFY= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R181e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037033178;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2fN_1780384569; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2fN_1780384569 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:10 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . 
Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 11/16] arm64: ras: Add RAS decode notifier chain Date: Tue, 2 Jun 2026 15:15:34 +0800 Message-ID: <20260602071540.3711528-12-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce an atomic notifier chain that allows external modules (e.g., EDAC drivers, vendor-specific decoders) to receive and further decode RAS error events. Each error event is passed to all registered decoders after being logged.
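The dispatch pattern can be illustrated with a minimal userspace model of a notifier chain. Everything prefixed with model_ is hypothetical and exists only for this sketch; the real chain uses struct notifier_block and the kernel notifier helpers, and the driver passes the error record as the data argument with an action of 0.

```c
#include <stddef.h>

#define NOTIFY_DONE		0x0000
#define NOTIFY_OK		0x0001
#define NOTIFY_STOP_MASK	0x8000

/* Userspace stand-in for the kernel's struct notifier_block. */
struct model_notifier_block {
	int (*notifier_call)(struct model_notifier_block *nb,
			     unsigned long action, void *data);
	struct model_notifier_block *next;
	int priority;
};

/* Insert in descending priority order, after entries of equal priority. */
static inline void model_chain_register(struct model_notifier_block **head,
					struct model_notifier_block *nb)
{
	while (*head && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Unlink nb from the chain if it is present. */
static inline void model_chain_unregister(struct model_notifier_block **head,
					  struct model_notifier_block *nb)
{
	for (; *head; head = &(*head)->next) {
		if (*head == nb) {
			*head = nb->next;
			nb->next = NULL;
			return;
		}
	}
}

/* Call every decoder in order until one asks to stop. */
static inline int model_chain_call(struct model_notifier_block *head,
				   unsigned long action, void *data)
{
	struct model_notifier_block *nb, *next;
	int ret = NOTIFY_DONE;

	for (nb = head; nb; nb = next) {
		next = nb->next;
		ret = nb->notifier_call(nb, action, data);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* Example decoder: only counts the events it receives. */
static int model_seen;

static inline int model_decoder(struct model_notifier_block *nb,
				unsigned long action, void *data)
{
	(void)nb;
	(void)action;
	(void)data;
	model_seen++;
	return NOTIFY_OK;
}
```

A real consumer would fill in a struct notifier_block and hand it to ras_register_decode_chain(), then remove it with ras_unregister_decode_chain() on module exit.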
Signed-off-by: Ruidong Tian --- drivers/ras/arm64/ras-core.c | 15 +++++++++++++++ include/linux/ras.h | 8 ++++++++ 2 files changed, 23 insertions(+) diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 94514a5bb973..0b07b69545ad 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -139,11 +139,26 @@ static void ras_print(struct ras_record *record, stru= ct ras_ext_regs *regs) } } =20 +static ATOMIC_NOTIFIER_HEAD(ras_decoder_chain); + +void ras_register_decode_chain(struct notifier_block *nb) +{ + atomic_notifier_chain_register(&ras_decoder_chain, nb); +} +EXPORT_SYMBOL_GPL(ras_register_decode_chain); + +void ras_unregister_decode_chain(struct notifier_block *nb) +{ + atomic_notifier_chain_unregister(&ras_decoder_chain, nb); +} +EXPORT_SYMBOL_GPL(ras_unregister_decode_chain); + static void ras_do_proc(struct ras_record *record, struct ras_ext_regs *re= gs) { u64 status =3D regs->err_status, addr =3D regs->err_addr; =20 ras_print(record, regs); + atomic_notifier_call_chain(&ras_decoder_chain, 0, record); =20 if (status & ERR_STATUS_CE) return; diff --git a/include/linux/ras.h b/include/linux/ras.h index 468941bfe855..11663150612f 100644 --- a/include/linux/ras.h +++ b/include/linux/ras.h @@ -63,4 +63,12 @@ amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err= ) { return -EINVAL; } #define GET_LOGICAL_INDEX(mpidr) -EINVAL #endif /* CONFIG_ARM || CONFIG_ARM64 */ =20 +#if IS_ENABLED(CONFIG_ARM64_RAS_DRIVER) +void ras_register_decode_chain(struct notifier_block *nb); +void ras_unregister_decode_chain(struct notifier_block *nb); +#else +static inline void ras_register_decode_chain(struct notifier_block *nb) {} +static inline void ras_unregister_decode_chain(struct notifier_block *nb) = {} +#endif /* CONFIG_ARM64_RAS_DRIVER */ + #endif /* __RAS_H__ */ --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-118.freemail.mail.aliyun.com (out30-118.freemail.mail.aliyun.com [115.124.30.118]) 
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2D3F336308E; Tue, 2 Jun 2026 07:16:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.118 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384589; cv=none; b=IxoSweUKRGpTZppZqMLeXfySKQ1K815dFDACr3KsXLvpLTqxqxpI08mvn0h23Nntcpi6iXZ4gEp616d7zAks/naoKFWCjC14/C5XMYp/7g7OoMpYins1zBMimsTkFebfjTyt5o9LJvHUFoFhINNRKEokVZYRwjrJzZDQPvT9OvU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384589; c=relaxed/simple; bh=tQvGGQAYsxwcIjdfFqcFVvJ1kUqg3IjP3cYzbconbNI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GPR7zMjpYYmFsX8us7ZTLenHqZ+ctzuQMzd2He5E3iYDyV8+LuMsuPljMENncta3OPSODtWHuF+DB0wVGF02c+hD2dUW08LvVGqXz0b7SfUEEKhKtGHmfUPdWCZ9AcNREQMha5EExO+sn0MhN1L9r5J5bG6D/La3EOc1oiZTbb8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=s2dI6SaE; arc=none smtp.client-ip=115.124.30.118 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="s2dI6SaE" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384573; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=7gkf4QjVEhb8uFI0feUbMQvBshcd3PL9ANlWVkCDcgw=; b=s2dI6SaECnhAJmr5bq+Rlvzd0CJINSglVufGfgCDXPPgM5J1GMcdSUc3NF8c0zqg5CyIkw06w+CcXzVqQDzYZZJ+6lrh3VT++YSSSjEvuE50+dsqCD7bcqqCi3yQ9uDsMVgxmJe0BGP3rOLyJP/0d3LL02vE9FZELqm6r2cZSUk= 
X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R151e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam011083073210;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2fz_1780384571; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2fz_1780384571 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:12 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 12/16] arm64: ras: Expose config abi through debugfs Date: Tue, 2 Jun 2026 15:15:35 +0800 Message-ID: <20260602071540.3711528-13-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Expose per-node and per-record configuration through debugfs so bring-up and validation can inspect and tweak driver state (CE threshold, error counters, ...) at runtime. 
Signed-off-by: Umang Chheda Signed-off-by: Ruidong Tian --- Documentation/ABI/testing/debugfs-arm64-ras | 50 +++++ MAINTAINERS | 1 + drivers/ras/arm64/Makefile | 1 + drivers/ras/arm64/ras-core.c | 31 +++ drivers/ras/arm64/ras-sysfs.c | 211 ++++++++++++++++++++ drivers/ras/arm64/ras.h | 17 ++ drivers/ras/debugfs.c | 3 +- include/linux/ras.h | 2 + 8 files changed, 315 insertions(+), 1 deletion(-) create mode 100644 Documentation/ABI/testing/debugfs-arm64-ras create mode 100644 drivers/ras/arm64/ras-sysfs.c diff --git a/Documentation/ABI/testing/debugfs-arm64-ras b/Documentation/AB= I/testing/debugfs-arm64-ras new file mode 100644 index 000000000000..d86bde83d0b9 --- /dev/null +++ b/Documentation/ABI/testing/debugfs-arm64-ras @@ -0,0 +1,50 @@ +What: /sys/kernel/debug/ras/arm64// +Date: Dec 2025 +KernelVersion: 6.19 +Contact: Ruidong Tian +Description: + Directory representing a RAS node device, means device + type, like: + + - processor + - memory + - smmu + - ... + +What: /sys/kernel/debug/ras/arm64//record/err_* +Date: Dec 2025 +KernelVersion: 6.19 +Contact: Ruidong Tian +Description: + (RW) Read/Write err_* register. + +What: /sys/kernel/debug/ras/arm64//err_count +Date: Dec 2025 +KernelVersion: 6.19 +Contact: Ruidong Tian +Description: + (RO) Outputs error statistics for all error records of this node. + + +What: /sys/kernel/debug/ras/arm64//record/err_count +Date: Dec 2025 +KernelVersion: 6.19 +Contact: Ruidong Tian +Description: + (RO) Outputs error statistics for this record. + +What: /sys/kernel/debug/ras/arm64//ce_threshold +Date: Dec 2025 +KernelVersion: 6.19 +Contact: Ruidong Tian +Description: + (WO) Write the CE threshold to all records of this node. + Returns error if input exceeded the maximum threshold. + +What: /sys/kernel/debug/ras/arm64//record/ce_threshold +Date: Dec 2025 +KernelVersion: 6.19 +Contact: Ruidong Tian +Description: + (RW) Read and write the CE threshold to this record. + Returns error if input exceeded the maximum threshold. 
\ No newline at end of file diff --git a/MAINTAINERS b/MAINTAINERS index 766d1240b465..007a5a69b6d9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -349,6 +349,7 @@ M: Ruidong Tian L: linux-acpi@vger.kernel.org L: linux-arm-kernel@lists.infradead.org S: Supported +F: Documentation/ABI/testing/debugfs-arm64-ras F: arch/arm64/include/asm/ras.h F: drivers/acpi/arm64/aest.c F: drivers/ras/arm64/ diff --git a/drivers/ras/arm64/Makefile b/drivers/ras/arm64/Makefile index c5387f05a067..e13e223107dd 100644 --- a/drivers/ras/arm64/Makefile +++ b/drivers/ras/arm64/Makefile @@ -3,3 +3,4 @@ obj-$(CONFIG_ARM64_RAS_DRIVER) +=3D arm64_ras.o =20 arm64_ras-y :=3D ras-core.o +arm64_ras-y +=3D ras-sysfs.o diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 0b07b69545ad..c427c131a862 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -25,6 +25,8 @@ MODULE_PARM_DESC(aest_panic_on_ue, =20 static DEFINE_PER_CPU(struct ras_node, percpu_ras_node); =20 +struct dentry *arm64_ras_debugfs; + static const char *const ras_node_name[] =3D { [ACPI_AEST_PROCESSOR_ERROR_NODE] =3D "processor", [ACPI_AEST_MEMORY_ERROR_NODE] =3D "memory", @@ -158,6 +160,27 @@ static void ras_do_proc(struct ras_record *record, str= uct ras_ext_regs *regs) u64 status =3D regs->err_status, addr =3D regs->err_addr; =20 ras_print(record, regs); + if (regs->err_status & ERR_STATUS_CE) + record->count.ce++; + if (regs->err_status & ERR_STATUS_DE) + record->count.de++; + if (regs->err_status & ERR_STATUS_UE) { + switch (FIELD_GET(ERR_STATUS_UET, regs->err_status)) { + case ERR_STATUS_UET_UC: + record->count.uc++; + break; + case ERR_STATUS_UET_UEU: + record->count.ueu++; + break; + case ERR_STATUS_UET_UER: + record->count.uer++; + break; + case ERR_STATUS_UET_UEO: + record->count.ueo++; + break; + } + } + atomic_notifier_call_chain(&ras_decoder_chain, 0, record); =20 if (status & ERR_STATUS_CE) @@ -887,6 +910,8 @@ static int arm64_ras_probe(struct platform_device *pdev) =20 
platform_set_drvdata(pdev, node); =20 + ras_node_init_debugfs(node); + return 0; } =20 @@ -900,12 +925,18 @@ static struct platform_driver arm64_ras_driver =3D { =20 static int __init arm64_ras_init(void) { +#ifdef CONFIG_DEBUG_FS + arm64_ras_debugfs =3D debugfs_create_dir("arm64", ras_debugfs_dir); +#endif return platform_driver_register(&arm64_ras_driver); } module_init(arm64_ras_init); =20 static void __exit arm64_ras_exit(void) { +#ifdef CONFIG_DEBUG_FS + debugfs_remove_recursive(arm64_ras_debugfs); +#endif platform_driver_unregister(&arm64_ras_driver); } module_exit(arm64_ras_exit); diff --git a/drivers/ras/arm64/ras-sysfs.c b/drivers/ras/arm64/ras-sysfs.c new file mode 100644 index 000000000000..03cc00b820e2 --- /dev/null +++ b/drivers/ras/arm64/ras-sysfs.c @@ -0,0 +1,211 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ARM Error Source Table Support + * + * Copyright (c) 2025, Alibaba Group. + */ + +#include "ras.h" + +static int ras_store_threshold(struct ras_record *record, u64 threshold) +{ + struct ce_threshold *ce =3D &record->ce; + u64 err_misc0; + + if (!ce->info) + return -EOPNOTSUPP; + + if (threshold > ce->info->max_count) + return -EINVAL; + + ce->threshold =3D threshold; + ce->count =3D ce->info->max_count - threshold + 1; + + err_misc0 =3D record_read(record, ERXMISC0); + ce->reg_val =3D (err_misc0 & ~ce->info->mask) | + (ce->count << ce->info->shift); + + record_write(record, ERXMISC0, ce->reg_val); + return 0; +} + +static void ras_error_count(struct ras_record *record, struct record_count= *count) +{ + count->ce +=3D record->count.ce; + count->de +=3D record->count.de; + count->uc +=3D record->count.uc; + count->ueu +=3D record->count.ueu; + count->uer +=3D record->count.uer; + count->ueo +=3D record->count.ueo; +} + +/* Debugfs for RAS node */ + +static int ras_node_err_count_show(struct seq_file *m, void *data) +{ + struct ras_node *node =3D m->private; + struct record_count count =3D { 0 }; + int i; + + for (i =3D 0; i < node->record_count; 
i++) + if (!test_bit(i, node->record_implemented)) + ras_error_count(&node->records[i], &count); + + seq_printf(m, "CE: %llu\n" + "DE: %llu\n" + "UC: %llu\n" + "UEU: %llu\n" + "UEO: %llu\n" + "UER: %llu\n", + count.ce, count.de, count.uc, count.ueu, + count.ueo, count.uer); + return 0; +} +DEFINE_SHOW_ATTRIBUTE(ras_node_err_count); + +/* Attribute for RAS record */ + +#define DEFINE_RAS_DEBUGFS_ATTR(name, offset) \ +static int name##_get(void *data, u64 *val) \ +{ \ + struct ras_record *record = data; \ + *val = record_read(record, offset); \ + return 0; \ +} \ +static int name##_set(void *data, u64 val) \ +{ \ + struct ras_record *record = data; \ + record_write(record, offset, val); \ + return 0; \ +} \ +DEFINE_DEBUGFS_ATTRIBUTE(name##_ops, name##_get, name##_set, "%#llx\n") + +DEFINE_RAS_DEBUGFS_ATTR(err_fr, ERXFR); +DEFINE_RAS_DEBUGFS_ATTR(err_ctrl, ERXCTLR); + +static int record_ce_threshold_get(void *data, u64 *val) +{ + struct ras_record *record = data; + + *val = record->ce.threshold; + return 0; +} + +static int record_ce_threshold_set(void *data, u64 val) +{ + struct ras_record *record = data; + + return ras_store_threshold(record, val); +} + +DEFINE_DEBUGFS_ATTRIBUTE(record_ce_threshold_ops, record_ce_threshold_get, + record_ce_threshold_set, "%llu\n"); + +/* Node-level ce_threshold: write threshold to all records of this node */ + +static int node_ce_threshold_set(void *data, u64 val) +{ + struct ras_node *node = data; + int i, ret, last_err = -EOPNOTSUPP; + + for (i = 0; i < node->record_count; i++) { + ret = ras_store_threshold(&node->records[i], val); + if (ret == 0) + last_err = 0; + else if (ret == -EINVAL) + return ret; + } + + return last_err; +} + +DEFINE_DEBUGFS_ATTRIBUTE(node_ce_threshold_ops, NULL, + node_ce_threshold_set, "%llu\n"); + +static int ras_record_err_count_show(struct seq_file *m, void *data) +{ + struct ras_record *record = m->private; + struct record_count count = { 0 }; + + 
ras_error_count(record, &count); + + seq_printf(m, "CE: %llu\n" + "DE: %llu\n" + "UC: %llu\n" + "UEU: %llu\n" + "UEO: %llu\n" + "UER: %llu\n", + count.ce, count.de, count.uc, count.ueu, + count.ueo, count.uer); + return 0; +} +DEFINE_SHOW_ATTRIBUTE(ras_record_err_count); + +static void ras_record_init_debugfs(struct ras_record *record) +{ + debugfs_create_file("err_fr", 0600, record->debugfs, + record, &err_fr_ops); + debugfs_create_file("err_ctrl", 0600, record->debugfs, + record, &err_ctrl_ops); + debugfs_create_file("err_count", 0400, record->debugfs, + record, &ras_record_err_count_fops); + debugfs_create_file("ce_threshold", 0600, record->debugfs, + record, &record_ce_threshold_ops); +} + +static void ras_init_records_debugfs(struct ras_node *node) +{ + struct ras_record *record; + int i; + + for (i = 0; i < node->record_count; i++) { + record = &node->records[i]; + if (!record->name || test_bit(i, node->record_implemented)) + continue; + record->debugfs = debugfs_create_dir(record->name, + node->debugfs); + + ras_record_init_debugfs(record); + } +} + +static void ras_oncore_node_init_debugfs(struct ras_node *node) +{ + int cpu; + struct ras_node *percpu_node; + char name[16]; + + for_each_possible_cpu(cpu) { + percpu_node = per_cpu_ptr(node->oncore_node, cpu); + + snprintf(name, sizeof(name), "processor%u", cpu); + percpu_node->debugfs = debugfs_create_dir(name, arm64_ras_debugfs); + + debugfs_create_file("err_count", 0400, percpu_node->debugfs, + percpu_node, &ras_node_err_count_fops); + debugfs_create_file("ce_threshold", 0200, percpu_node->debugfs, + percpu_node, &node_ce_threshold_ops); + ras_init_records_debugfs(percpu_node); + } +} + +void ras_node_init_debugfs(struct ras_node *node) +{ + if (!node->name) + return; + + if (ras_node_is_oncore(node)) { + ras_oncore_node_init_debugfs(node); + return; + } + + node->debugfs = debugfs_create_dir(node->name, arm64_ras_debugfs); + if (IS_ERR_OR_NULL(node->debugfs)) + return; + + 
debugfs_create_file("err_count", 0400, node->debugfs, + node, &ras_node_err_count_fops); + debugfs_create_file("ce_threshold", 0200, node->debugfs, + node, &node_ce_threshold_ops); + ras_init_records_debugfs(node); +} diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h index ac3876912495..92cbb975b4df 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -10,6 +10,7 @@ =20 #include #include +#include =20 #define DEFAULT_CE_THRESHOLD 1 =20 @@ -62,6 +63,8 @@ =20 #define GIC_ERRDEVARCH 0xFFBC =20 +extern struct dentry *arm64_ras_debugfs; + struct ras_access { u64 (*read)(void __iomem *base, u32 offset); void (*write)(void __iomem *base, u32 offset, u64 val); @@ -80,14 +83,25 @@ struct ce_threshold { u64 reg_val; }; =20 +struct record_count { + u64 ce; + u64 de; + u64 uc; + u64 uer; + u64 ueo; + u64 ueu; +}; + struct ras_record { char *name; void __iomem *regs_base; struct ras_node *node; const struct ras_access *access; + struct dentry *debugfs; =20 struct ce_threshold ce; enum ras_ce_threshold threshold_type; + struct record_count count; =20 int index; /* @@ -116,6 +130,7 @@ struct ras_node { struct device *dev; const struct ras_group *group; struct ras_node __percpu *oncore_node; + struct dentry *debugfs; =20 void __iomem *base; void __iomem *errgsr; @@ -286,4 +301,6 @@ static inline void ras_sync(struct ras_node *node) isb(); } =20 +void ras_node_init_debugfs(struct ras_node *node); + #endif /* _DRIVERS_RAS_ARM64_RAS_H_ */ diff --git a/drivers/ras/debugfs.c b/drivers/ras/debugfs.c index 42afd3de68b2..e4d9a5627e5f 100644 --- a/drivers/ras/debugfs.c +++ b/drivers/ras/debugfs.c @@ -3,7 +3,8 @@ #include #include "debugfs.h" =20 -static struct dentry *ras_debugfs_dir; +struct dentry *ras_debugfs_dir; +EXPORT_SYMBOL_GPL(ras_debugfs_dir); =20 static atomic_t trace_count =3D ATOMIC_INIT(0); =20 diff --git a/include/linux/ras.h b/include/linux/ras.h index 11663150612f..976cd102f76c 100644 --- a/include/linux/ras.h +++ b/include/linux/ras.h @@ 
-71,4 +71,6 @@ static inline void ras_register_decode_chain(struct notif= ier_block *nb) {} static inline void ras_unregister_decode_chain(struct notifier_block *nb) = {} #endif /* CONFIG_ARM64_RAS_DRIVER */ =20 +extern struct dentry *ras_debugfs_dir; + #endif /* __RAS_H__ */ --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-132.freemail.mail.aliyun.com (out30-132.freemail.mail.aliyun.com [115.124.30.132]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BE0513AD52B; Tue, 2 Jun 2026 07:16:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.132 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384588; cv=none; b=R9+loJ33BECbzj0hA+wane6fOZOjkRF9sfp4dFJUMQxzkprtmXEGpe5DkkFDgnKsMbvLuHpuDdLEsgIMXQMNBu7uixGYVmQb7+Ozhiu6e7KRdLKZ1z0+A1Vc5xhHneOOkHx8mEc+r/TFX1It/uVx49RXatYnNH6sxmwB53e3Su4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384588; c=relaxed/simple; bh=CGngURJ8uhwlmCur94zG6M+OhpqHYIyJHxZWQrOAxFU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jI+vW6vlg5M5uiROgqgUdc4bgV1NtxHLgHVzBcM8wfVvMzv4I2YPBcqodGbrezmwHd0w9QSRSag5sycGITv6eYes8RA7Gk7DaDzu88aiNmR2zNXFir9BfRFGcKXA3V9/Iskfkk9tBAH9TnRO3n1IFQN0o68deTW2JszjAvtOnbQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=pz5w3hVB; arc=none smtp.client-ip=115.124.30.132 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) 
header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="pz5w3hVB" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384574; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=mzmNojoq2hu2ZwgjtjdxdvvwEj7R5BiH7fs14gYyDQo=; b=pz5w3hVBtmkAI1acYXYyUysZSELWVvp57uYqJ9b3IHNf7aYfnGPjusHpdgpcrJTk+trK3FB9PyyarZug2jWXWn7+HercLDGc3KM1yP45o+e1aJLgASSI+Pw0mtaY3KMfnHhmW7kEF3qRTK+d4jm+x8lEemmpddebQM81sEkabYE= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R121e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037033178;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2gL_1780384573; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2gL_1780384573 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:14 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 13/16] arm64: ras: Introduce ras inject interface Date: Tue, 2 Jun 2026 15:15:36 +0800 Message-ID: <20260602071540.3711528-14-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" End-to-end validation of the RAS driver requires the ability to generate errors on demand. 
The driver provides two complementary paths, which the inject interface deliberately keeps separate: - Soft injection hands the record handler software-held register values (set through the inject/err_* files) instead of hardware state, so it is unconstrained by the spec. It is the tool of choice for state-machine coverage, such as notifier delivery and ABI shape. - Hard injection drives the architected inject registers and obeys the hardware contract, exercising the real interrupt and decode paths. Signed-off-by: Ruidong Tian --- Documentation/ABI/testing/debugfs-arm64-ras | 39 +++++- drivers/ras/arm64/Makefile | 1 + drivers/ras/arm64/ras-core.c | 26 ++-- drivers/ras/arm64/ras-inject.c | 127 ++++++++++++++++++++ drivers/ras/arm64/ras-sysfs.c | 1 + drivers/ras/arm64/ras.h | 2 + 6 files changed, 183 insertions(+), 13 deletions(-) create mode 100644 drivers/ras/arm64/ras-inject.c diff --git a/Documentation/ABI/testing/debugfs-arm64-ras b/Documentation/ABI/testing/debugfs-arm64-ras index d86bde83d0b9..a14f481c0c04 100644 --- a/Documentation/ABI/testing/debugfs-arm64-ras +++ b/Documentation/ABI/testing/debugfs-arm64-ras @@ -47,4 +47,41 @@ KernelVersion: 6.19 Contact: Ruidong Tian Description: (RW) Read and write the CE threshold to this record. - Returns error if input exceeded the maximum threshold. \ No newline at end of file + Returns error if input exceeded the maximum threshold. + +What: /sys/kernel/debug/ras/arm64//record/inject/err_* +Date: Dec 2025 +KernelVersion: 6.19 +Contact: Ruidong Tian +Description: + (RW) These files hold the error register values used to + simulate a soft-injected error. Any value can be written + to them. To trigger the injection, write to soft_inject + as the last step. The validity of the injected error depends on the + value written to err_status. + + Accepts values - any. 
+ +What: /sys/kernel/debug/ras/arm64//record/inject/soft_inject +Date: Dec 2025 +KernelVersion: 6.19 +Contact: Ruidong Tian +Description: + (WO) Write any value to this file to trigger the error + injection. 
Make sure you have specified all necessary error + parameters, i.e. this write should be the last step when + injecting errors. + + Accepts values - any. + +What: /sys/kernel/debug/ras/arm64//record/inject/hard_inject +Date: Dec 2025 +KernelVersion: 6.19 +Contact: Ruidong Tian +Description: + (WO) If the AEST table provides error injection registers, + you can write to them via this interface. For instance, + values can be written to the ERXPFGCTL register. The post-injection + behavior is then determined by the hardware specification. + + Accepts values - any. \ No newline at end of file diff --git a/drivers/ras/arm64/Makefile b/drivers/ras/arm64/Makefile index e13e223107dd..2f3119ac3ec5 100644 --- a/drivers/ras/arm64/Makefile +++ b/drivers/ras/arm64/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_ARM64_RAS_DRIVER) += arm64_ras.o arm64_ras-y := ras-core.o arm64_ras-y += ras-sysfs.o +arm64_ras-y += ras-inject.o diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index c427c131a862..8cde26ab95de 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -200,7 +200,7 @@ static void ras_panic(struct ras_record *record, struct ras_ext_regs *regs, panic(msg); } -static void ras_proc_record(struct ras_record *record, void *data) +void ras_proc_record(struct ras_record *record, void *data, bool fake) { struct ras_ext_regs regs = { 0 }; int *count = data; @@ -240,13 +240,15 @@ static void ras_proc_record(struct ras_record *record, void *data) ue = FIELD_GET(ERR_STATUS_UET, regs.err_status); if ((regs.err_status & ERR_STATUS_UE) && (ue == ERR_STATUS_UET_UC || ue == ERR_STATUS_UET_UEU)) { - if (!panic_on_ue) - ras_record_err(record, "UE detected, panic suppressed\n"); - else + if (fake) + ras_record_info(record, + "Simulated error! 
Skip panic due to fault injection\n"); + else if (panic_on_ue) ras_panic(record, &regs, - "AEST: unrecoverable error encountered"); + "AEST: unrecoverable error encountered"); + else + ras_record_err(record, "UE detected, panic suppressed\n"); } - ras_do_proc(record, &regs); /* Write-one-to-clear the bits we've seen */ @@ -263,7 +265,7 @@ static void ras_proc_record(struct ras_record *record, void *data) record_write(record, ERXSTATUS, regs.err_status); } -static void ras_node_foreach_record(void (*func)(struct ras_record *, void *), +static void ras_node_foreach_record(void (*func)(struct ras_record *, void *, bool), struct ras_node *node, void *data, unsigned long *bitmap) { @@ -272,13 +274,13 @@ static void ras_node_foreach_record(void (*func)(struct ras_record *, void *), for_each_clear_bit(i, bitmap, node->record_count) { ras_select_record(node, i); - func(&node->records[i], data); + func(&node->records[i], data, false); ras_sync(node); } } -static void ras_node_foreach_poll_record(void (*func)(struct ras_record *, void *), +static void ras_node_foreach_poll_record(void (*func)(struct ras_record *, void *, bool), struct ras_node *node, void *data) { int i; @@ -300,7 +302,7 @@ static void ras_node_foreach_poll_record(void (*func)(struct ras_record *, void ras_select_record(node, i); - func(&node->records[i], data); + func(&node->records[i], data, false); ras_sync(node); } @@ -329,7 +331,7 @@ static int ras_proc(struct ras_node *node) */ if (test_bit(i * BITS_PER_LONG + j, node->status_reporting)) continue; - ras_proc_record(&node->records[j], &count); + ras_proc_record(&node->records[j], &count, false); } } @@ -521,7 +523,7 @@ static int ras_init_record(struct ras_record *record, int i, struct ras_node *no return 0; } -static void ras_online_record(struct ras_record *record, void *data) +static void ras_online_record(struct ras_record *record, void *data, bool __unused) { ras_set_ce_threshold(record); 
ras_enable_irq(record); diff --git a/drivers/ras/arm64/ras-inject.c b/drivers/ras/arm64/ras-inject.c new file mode 100644 index 000000000000..7fb522a845e7 --- /dev/null +++ b/drivers/ras/arm64/ras-inject.c @@ -0,0 +1,127 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ARM Error Source Table Support + * + * Copyright (c) 2024, Alibaba Group. + */ + +#include "ras.h" + +static struct ras_ext_regs regs_inj; + +struct inj_attr { + struct attribute attr; + ssize_t (*show)(struct ras_node *n, struct inj_attr *a, char *b); + ssize_t (*store)(struct ras_node *n, struct inj_attr *a, const char *b, + size_t c); +}; + +struct ras_inject { + struct ras_node *node; + struct kobject kobj; +}; + +#define to_inj(k) container_of(k, struct ras_inject, kobj) +#define to_inj_attr(a) container_of(a, struct inj_attr, attr) + +static u64 ras_sysreg_read_inject(void *__unused, u32 offset) +{ + u64 *p = (u64 *)&regs_inj; + + return p[offset/8]; +} + +static void ras_sysreg_write_inject(void *base, u32 offset, u64 val) +{ + u64 *p = (u64 *)&regs_inj; + + p[offset/8] = val; +} + +static u64 ras_iomem_read_inject(void *base, u32 offset) +{ + u64 *p = (u64 *)&regs_inj; + + return p[offset/8]; +} + +static void ras_iomem_write_inject(void *base, u32 offset, u64 val) +{ + u64 *p = (u64 *)&regs_inj; + + p[offset/8] = val; +} + +static struct ras_access ras_access_inject[] = { + [ACPI_AEST_NODE_SYSTEM_REGISTER] = { + .read = ras_sysreg_read_inject, + .write = ras_sysreg_write_inject, + }, + + [ACPI_AEST_NODE_MEMORY_MAPPED] = { + .read = ras_iomem_read_inject, + .write = ras_iomem_write_inject, + }, + [ACPI_AEST_NODE_SINGLE_RECORD_MEMORY_MAPPED] = { + .read = ras_iomem_read_inject, + .write = ras_iomem_write_inject, + }, + { } +}; + +static int soft_inject_store(void *data, u64 val) +{ + int count = 0; + struct ras_record record_inj, *record = data; + struct ras_node *node = record->node; + + memcpy(&record_inj, record, sizeof(*record)); + record_inj.access = 
&ras_access_inject[node->access_type]; + + regs_inj.err_status |= ERR_STATUS_V; + + ras_proc_record(&record_inj, &count, true); + + if (count != 1) + return -EIO; + + return 0; +} +DEFINE_DEBUGFS_ATTRIBUTE(soft_inject_ops, NULL, soft_inject_store, "%llu\n"); + +static int hard_inject_store(void *data, u64 val) +{ + struct ras_record *record = data; + struct ras_node *node = record->node; + + if (node->type != ACPI_AEST_PROCESSOR_ERROR_NODE && !node->inj) + return -EPERM; + + ras_select_record(node, record->index); + record_write(record, ERXPFGCTL, val); + record_write(record, ERXPFGCDN, 0x100); + ras_sync(node); + + return 0; +} +DEFINE_DEBUGFS_ATTRIBUTE(hard_inject_ops, NULL, hard_inject_store, "%llu\n"); + +void ras_inject_init_debugfs(struct ras_record *record) +{ + struct dentry *inj; + + inj = debugfs_create_dir("inject", record->debugfs); + + debugfs_create_u64("err_fr", 0600, inj, &regs_inj.err_fr); + debugfs_create_u64("err_ctrl", 0600, inj, &regs_inj.err_ctlr); + debugfs_create_u64("err_status", 0600, inj, &regs_inj.err_status); + debugfs_create_u64("err_addr", 0600, inj, &regs_inj.err_addr); + debugfs_create_u64("err_misc0", 0600, inj, &regs_inj.err_misc[0]); + debugfs_create_u64("err_misc1", 0600, inj, &regs_inj.err_misc[1]); + debugfs_create_u64("err_misc2", 0600, inj, &regs_inj.err_misc[2]); + debugfs_create_u64("err_misc3", 0600, inj, &regs_inj.err_misc[3]); + debugfs_create_file("soft_inject", 0200, inj, record, &soft_inject_ops); + + if (record->node->type == ACPI_AEST_PROCESSOR_ERROR_NODE || record->node->inj) + debugfs_create_file("hard_inject", 0200, inj, record, &hard_inject_ops); +} diff --git a/drivers/ras/arm64/ras-sysfs.c b/drivers/ras/arm64/ras-sysfs.c index 03cc00b820e2..d8b351ee9aef 100644 --- a/drivers/ras/arm64/ras-sysfs.c +++ b/drivers/ras/arm64/ras-sysfs.c @@ -151,6 +151,7 @@ static void ras_record_init_debugfs(struct ras_record *record) record, &ras_record_err_count_fops); debugfs_create_file("ce_threshold", 0600, 
record->debugfs, 
record, &record_ce_threshold_ops); + ras_inject_init_debugfs(record); } =20 static void ras_init_records_debugfs(struct ras_node *node) diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h index 92cbb975b4df..8a0d2909fe4b 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -302,5 +302,7 @@ static inline void ras_sync(struct ras_node *node) } =20 void ras_node_init_debugfs(struct ras_node *node); +void ras_inject_init_debugfs(struct ras_record *record); +void ras_proc_record(struct ras_record *record, void *data, bool fake); =20 #endif /* _DRIVERS_RAS_ARM64_RAS_H_ */ --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-112.freemail.mail.aliyun.com (out30-112.freemail.mail.aliyun.com [115.124.30.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ACDF83783C4; Tue, 2 Jun 2026 07:16:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.112 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384597; cv=none; b=u1hoc7tNQ6HQ6P6RNGS9R9sCq1pGkpO4UMHpcwzmJUrZYysCLAykF9qqYbVF6R8IfZ1YrrYoCuGBP6P1B5e+jKB1IFk0/h0dF3YEp4OhIIktJdvWhTVh7U0XIOjEXX2f6L7FsiTAbYR+QBPp38h9FJx76AKGrSX3G6IZcjVT9n0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384597; c=relaxed/simple; bh=j3LWQZxs0djlPAZCD7aX7+D4Rxouki06AXik8Oox2E8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=VunMCrWynSeurM5pAXBPTefZYvSeK39t7Nv1KqBwqToeaTts7fTo/YIYlBIiADxb1peNxfnW1YcjsRyxD89oSznxWqlxrAiIkv0XjO62xSP0Oer57mrcaMOHwEVDlelqA6KhgwEwOwvA1IZRW4UL04Qv8v9yYd5HsUAxmWwGc+E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com 
header.b=gY3Wt7fe; arc=none smtp.client-ip=115.124.30.112 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="gY3Wt7fe" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384576; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; bh=h2pWtpjptei4P784Opv68R55ZKackwdCfjKjGSRIiNQ=; b=gY3Wt7feqo9rBB3iP1KHoFap53MUK1R34atzpnNtr3BOoHZAD7VEOowR453a2wITI3Pc73uBXoq0GDjox+4YNd6fpgmOAX9TdLAmescDlhjRWbeBjTGBfnGnyr2lI1wbVUMxpDnQw6qGY5ubNIdVYZKrBQDDOn/e/L0LpcfCvPg= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R121e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam033037026112;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2gc_1780384574; Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2gc_1780384574 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:15 +0800 From: Ruidong Tian To: Catalin Marinas , Will Deacon , Lorenzo Pieralisi , Hanjun Guo , Sudeep Holla , "Rafael J . 
Wysocki" , Len Brown , Tony Luck , Borislav Petkov , Thomas Gleixner , Peter Zijlstra , Robin Murphy , Umang Chheda Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian Subject: [PATCH v7 14/16] arm64: ras: support vendor node CMN700 Date: Tue, 2 Jun 2026 15:15:37 +0800 Message-ID: <20260602071540.3711528-15-tianruidong@linux.alibaba.com> X-Mailer: git-send-email 2.43.7 In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable The CMN (Coherent Mesh Network) architecture incorporates five distinct device types. Each device type is associated with an error group register set. CMN's error records utilize a memory-mapped single error record view [1]. Critically, one error record corresponds to one AEST node, implying that a single CMN instance can generate hundreds of AEST nodes. To manage this scale, this driver introduces a virtual ras node, which represents an entire CMN device, such as an HNI or HNF. This allows an HNF ras node, for instance, to leverage its errgsr register to pinpoint which specific error record has reported an error. 
[1]: https://developer.arm.com/documentation/102308/latest/ Signed-off-by: Ruidong Tian --- drivers/acpi/arm64/aest.c | 111 ++++++++- drivers/ras/arm64/Makefile | 1 + drivers/ras/arm64/ras-cmn.c | 469 +++++++++++++++++++++++++++++++++++ drivers/ras/arm64/ras-core.c | 48 +++- drivers/ras/arm64/ras.h | 29 +++ 5 files changed, 650 insertions(+), 8 deletions(-) create mode 100644 drivers/ras/arm64/ras-cmn.c diff --git a/drivers/acpi/arm64/aest.c b/drivers/acpi/arm64/aest.c index 1b020ab7eccd..1c68c83ccf4a 100644 --- a/drivers/acpi/arm64/aest.c +++ b/drivers/acpi/arm64/aest.c @@ -5,6 +5,7 @@ * Copyright (c) 2025, Alibaba Group. */ =20 +#include #include #include #include @@ -173,6 +174,10 @@ aest_init_node_props(struct acpi_aest_hdr *hdr, struct= property_entry *props, return 0; } =20 +/* + * Non-vendor path: attach all per-entry properties as the platform device= 's + * primary fwnode (single-layer structure). + */ static int __init aest_create_node_fwnode(struct acpi_aest_hdr *hdr, struct platform_device = *pdev) { @@ -187,6 +192,51 @@ aest_create_node_fwnode(struct acpi_aest_hdr *hdr, str= uct platform_device *pdev) return device_create_managed_software_node(&pdev->dev, props, NULL); } =20 +/* + * CMN700 path (double-layer structure): + * - first_time: create the parent fwnode on @pdev carrying the vendor + * identification (HID/UID); + * - always: hang a child swnode under @pdev's primary fwnode for the + * current AEST entry, so that multiple AEST entries sharing the same + * HID/UID accumulate as siblings under one platform device. 
+ */ +static int __init +aest_create_cmn700_fwnode(struct acpi_aest_hdr *hdr, + struct platform_device *pdev, bool first_time) +{ + struct acpi_aest_node_interface_header *interface; + struct property_entry child_props[17] =3D { }; + struct fwnode_handle *child; + int p =3D 0; + int ret; + + ret =3D aest_init_node_props(hdr, child_props, &p, pdev); + if (ret) + return ret; + + if (first_time) { + ret =3D device_create_managed_software_node(&pdev->dev, child_props, NUL= L); + if (ret) + return ret; + } + + interface =3D ACPI_ADD_PTR(struct acpi_aest_node_interface_header, + hdr, hdr->node_interface_offset); + + child_props[p++] =3D PROPERTY_ENTRY_U64("arm,record-base", interface->add= ress); + /* + * Hang the per-entry properties as a child swnode under the platform + * device's primary fwnode. AEST platform devices live for the whole + * system lifetime, so we intentionally do not track child fwnodes for + * removal here. + */ + child =3D fwnode_create_software_node(child_props, dev_fwnode(&pdev->dev)= ); + if (IS_ERR(child)) + return PTR_ERR(child); + + return 0; +} + static int aest_node_mem_size(u8 group_format) { switch (group_format) { @@ -264,9 +314,60 @@ static int __init acpi_aest_init_node(struct acpi_aest= _hdr *aest_hdr) return 0; } =20 +static DEFINE_XARRAY(aest_cmn700_groups); +static int __init acpi_aest_init_cmn700_node(struct acpi_aest_hdr *aest_hd= r) +{ + struct acpi_aest_vendor_v2 *vendor; + struct platform_device *existing; + int ret; + + vendor =3D ACPI_ADD_PTR(struct acpi_aest_vendor_v2, aest_hdr, + aest_hdr->node_specific_offset); + + /* + * If a previous AEST entry already produced a platform device for + * the same vendor HID/UID, just append a child swnode for the + * current entry under that pdev's primary fwnode and return. 
+ */ + existing =3D xa_load(&aest_cmn700_groups, vendor->acpi_uid); + if (existing) + return aest_create_cmn700_fwnode(aest_hdr, existing, false); + + struct platform_device *pdev __free(platform_device_put) =3D + acpi_aest_alloc_pdev(aest_hdr); + if (IS_ERR(pdev)) + return PTR_ERR(pdev); + + ret =3D aest_create_cmn700_fwnode(aest_hdr, pdev, true); + if (ret) + return ret; + + ret =3D platform_device_add(pdev); + if (ret) + return ret; + + /* pdev is now owned by the driver core; release the cleanup-managed put.= */ + struct platform_device *added =3D no_free_ptr(pdev); + + ret =3D xa_err(xa_store(&aest_cmn700_groups, vendor->acpi_uid, added, GFP= _KERNEL)); + if (ret) + return ret; + + pr_debug("Platform device added for AEST vendor node: %s.%d\n", + added->name, added->id); + + return 0; +} + +static int __init is_acpi_aest_cmn_node(struct acpi_aest_vendor_v2 *vendor) +{ + return strncmp(vendor->acpi_hid, "ARMHC701", 8) =3D=3D 0; +} + static int __init acpi_aest_init_nodes(struct acpi_table_header *aest_tabl= e) { struct acpi_aest_hdr *aest_node, *aest_end; + struct acpi_aest_vendor_v2 *vendor; struct acpi_table_aest *aest; int rc; =20 @@ -281,8 +382,14 @@ static int __init acpi_aest_init_nodes(struct acpi_tab= le_header *aest_table) "AEST node pointer overflow, bad table.\n"); return -EINVAL; } - - rc =3D acpi_aest_init_node(aest_node); + vendor =3D ACPI_ADD_PTR(struct acpi_aest_vendor_v2, aest_node, + aest_node->node_specific_offset); + + if (aest_node->type =3D=3D ACPI_AEST_VENDOR_ERROR_NODE && + is_acpi_aest_cmn_node(vendor)) + rc =3D acpi_aest_init_cmn700_node(aest_node); + else + rc =3D acpi_aest_init_node(aest_node); if (rc) return rc; =20 diff --git a/drivers/ras/arm64/Makefile b/drivers/ras/arm64/Makefile index 2f3119ac3ec5..0e4c7421c131 100644 --- a/drivers/ras/arm64/Makefile +++ b/drivers/ras/arm64/Makefile @@ -5,3 +5,4 @@ obj-$(CONFIG_ARM64_RAS_DRIVER) +=3D arm64_ras.o arm64_ras-y :=3D ras-core.o arm64_ras-y +=3D ras-sysfs.o arm64_ras-y +=3D 
ras-inject.o +arm64_ras-y += ras-cmn.o diff --git a/drivers/ras/arm64/ras-cmn.c b/drivers/ras/arm64/ras-cmn.c new file mode 100644 index 000000000000..109cdc46a717 --- /dev/null +++ b/drivers/ras/arm64/ras-cmn.c @@ -0,0 +1,469 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ARM Error Source Table CMN-700 Support + * + * Copyright (c) 2025, Alibaba Inc + * + * CMN-700 exposes 6 RAS-relevant device types (HN-I, HN-F, XP, SBSX, + * RN-D/CXRA, MTSX). Each device type owns an error group register set + * holding a set of error records. + * + * CMN uses the memory-mapped single-record view, so every AEST node + * corresponds to exactly one CMN error record - a single mesh can + * yield hundreds of AEST entries. Per Arm ACPI Spec[1] §2.6.3.4 the + * device type is recovered from the AEST vendor-specific data. This + * driver enumerates every CMN AEST entry, reads the CMN node-info + * register and stitches all entries of the same type into one + * aggregate ras_node carrying many ras_records (one per logic_id). + * + * Each CMN instance owns its own error interrupt. The shared FHI/ERI + * lines are registered per ras_node with IRQF_SHARED, so every + * per-type handler runs and locates the offending record by walking + * the error group registers - mirroring CMN Spec[2] §3.8. + * + * The CMN RAS topology is: + * + * +----+ + * -->|XP | ...... + * | +----+ + * | + * | +----+ ...... + * | |HNI | +----------------+ + * | +----+ ->|record/AEST node| + * | | +----------------+ + * +------------+ | +----+ | . + * |CMN Instance|--| |HNF |---| . + * +------------+ | +----+ | . + * | | +----------------+ + * | +----+ ->|record/AEST node| + * | |SBSX| +----------------+ + * | +----+ ...... + * | + * | +----+ + * -->|RND | ...... (also MTSX) + * +----+ + * + * All addressing needed to reach the CMN RAS register block, the CMN + * node-info register and the CMN ERRGSR is taken from AEST. 
+ * + * PERIPHBASE = ERRFR_addr - ERRFR_offset_in_register_block + * - register_block_offset_within_CMN + * = record_base - 0x3100 - cmn_node_offset + * + * where the CMN-700 record register block places ERRFR at offset 0x3100 + * (CMN-700 TRM[2]). The AEST "arm,node-specific-data" payload carries + * two u64s used by this driver: [0..7] = hnd_offset (locates the + * per-type ERRGSR via cmn_config->errgsr_offset()), [8..15] = + * cmn_node_offset (offset of this node's register block within CMN). + * + * Per CMN-700 erratum #2732981, ERRGSR for HN-I / HN-S / SBSX is + * broken; for those types the per-record status_reporting bit is left + * set so the core polls the records instead of reading ERRGSR. + * + * AEST topology consumed by this driver (see drivers/acpi/arm64/aest.c): + * + * pdev (arm64_ras, dev_name = "cmn700") + * ├── primary fwnode : + * └── child swnode x N: per-AEST-entry properties: + * arm,interface-type + * arm,record-base + * arm,node-specific-data[] (vendor data) + * + * Each child swnode corresponds to one AEST node, i.e. one CMN error + * record identified by (node_type, logic_id). + * + * [1] Arm ACPI for Armv8/Armv9: https://developer.arm.com/documentation/den0093/latest + * [2] CMN-700 TRM (Arm 102308): https://developer.arm.com/documentation/102308/latest + */ + +#include +#include +#include +#include +#include + +#include "ras.h" + + +#define CMN_NODE_INFO 0x0000 +#define CMN_NI_NODE_TYPE GENMASK_ULL(15, 0) +#define CMN_NI_NODE_ID GENMASK_ULL(31, 16) +#define CMN_NI_LOGICAL_ID GENMASK_ULL(47, 32) + +/* Subset of CMN node types relevant to RAS */ +enum cmn_ras_node_type { + CMN_TYPE_HNI = 0x4, + CMN_TYPE_HNF = 0x5, + CMN_TYPE_XP = 0x6, + CMN_TYPE_SBSX = 0x7, + CMN_TYPE_MTSX = 0x10, + CMN_TYPE_CXRA = 0x100, + CMN_TYPE_CXHA = 0x101, + CMN_TYPE_CCHA = 0x104, + CMN_TYPE_HNS = 0x200, +}; + +/* + * Offset of ERRFR within the CMN-700 RAS register block. 
+ * AEST's interface->address points at ERRFR; subtracting this plus the + * cmn_node_offset (vendor-specific-data[8..15]) yields PERIPHBASE. + */ +#define CMN_ERRFR_OFFSET_IN_REGBLK 0x3100 + +#define CMN_RAS_DEV_NUM 6 +#define CMN700_ERRGSR_NUM 8 +#define CMN_ERRGSR_OFFSET 0x3000 + +struct cmn_vendor_data { + struct acpi_aest_vendor_v2 vendor; + int node_type; + int node_id; + int logic_id; +}; + +struct cmn_config { + int errgsr_num; + int dev_num; + const int *node_id_map; + const char *const *node_name; + int (*errgsr_mapping)(int errgsr_bit); + u64 (*errgsr_offset)(u64 hnd_offset, int node_idx); +}; + +static const char *const cmn700_node_name[] =3D { + [CMN_TYPE_HNI] =3D "HNI", + [CMN_TYPE_HNF] =3D "HNF", + [CMN_TYPE_XP] =3D "XP", + [CMN_TYPE_SBSX] =3D "SBSX", + [CMN_TYPE_CXRA] =3D "RND", + [CMN_TYPE_MTSX] =3D "MTSX", +}; + +static const int cmn700_node_id_map[] =3D { + [CMN_TYPE_HNI] =3D 1, + [CMN_TYPE_HNF] =3D 2, + [CMN_TYPE_XP] =3D 0, + [CMN_TYPE_SBSX] =3D 3, + [CMN_TYPE_CXRA] =3D 4, + [CMN_TYPE_MTSX] =3D 5, +}; + +static u64 cmn700_errgsr_offset(u64 hnd_offset, int node_idx) +{ + return hnd_offset + CMN_ERRGSR_OFFSET + + (node_idx * 2) * CMN700_ERRGSR_NUM * 8; +} + +static int cmn700_errgsr_mapping(int errgsr_bit) +{ + return errgsr_bit / 2; +} + +static struct cmn_config cmn700_config =3D { + .errgsr_num =3D CMN700_ERRGSR_NUM, + .dev_num =3D CMN_RAS_DEV_NUM, + .node_name =3D cmn700_node_name, + .node_id_map =3D cmn700_node_id_map, + .errgsr_mapping =3D cmn700_errgsr_mapping, + .errgsr_offset =3D cmn700_errgsr_offset, +}; + +static struct cmn_config *cmn_config; + + +static int cmn_init_vendor_data(struct device *dev, struct cmn_vendor_data= *vendor_data, + u64 *errgsr_addr, u64 record_base) +{ + struct acpi_aest_vendor_v2 vendor; + u64 cmn_node_offset, reg, logic_id, type, node_id; + u64 hnd_offset, periphbase; + void __iomem *cmn_node_base; + struct fwnode_handle *child =3D dev_fwnode(dev); + + fwnode_property_read_u8_array(child, 
"arm,node-specific-data", + (u8 *)&vendor, sizeof(vendor)); + + hnd_offset =3D get_unaligned_le64(&vendor.vendor_specific_data[0]); + cmn_node_offset =3D get_unaligned_le64(&vendor.vendor_specific_data[8]); + + periphbase =3D record_base - CMN_ERRFR_OFFSET_IN_REGBLK - cmn_node_offset; + + cmn_node_base =3D devm_ioremap(dev, periphbase + cmn_node_offset + + CMN_NODE_INFO, SZ_4K); + if (!cmn_node_base) + return -ENOMEM; + + reg =3D readq_relaxed(cmn_node_base); + logic_id =3D FIELD_GET(CMN_NI_LOGICAL_ID, reg); + type =3D FIELD_GET(CMN_NI_NODE_TYPE, reg); + node_id =3D FIELD_GET(CMN_NI_NODE_ID, reg); + + if (type >=3D ARRAY_SIZE(cmn700_node_id_map) || + !cmn_config->node_name[type]) { + dev_dbg(dev, "Skipping unsupported CMN node type %llx\n", type); + return -ENODEV; + } + + *errgsr_addr =3D periphbase + cmn_config->errgsr_offset(hnd_offset, + cmn_config->node_id_map[type]); + + vendor_data->vendor =3D vendor; + vendor_data->node_type =3D type; + vendor_data->node_id =3D node_id; + vendor_data->logic_id =3D logic_id; + + devm_iounmap(dev, cmn_node_base); + + dev_dbg(dev, "periphbase %llx, node_offset %llx, logic_id %llx, type %llx= , node_id %llx\n", + periphbase, cmn_node_offset, logic_id, type, node_id); + + return 0; +} + +/* + * Initialise one ras_node (representing one CMN node *type*, e.g. HN-F). + * Per CMN-700 erratum #2732981, ERRGSR for HN-I / HN-S / SBSX is broken; + * AEST conveys this via the per-record "Error group-based status reporting + * supported" flag (bit0 of arm,status-reporting). When that bit is 0 we + * leave node->errgsr NULL so the core polls instead of reading ERRGSR. 
+ */ +static int cmn_init_node(struct platform_device *pdev, + struct ras_node *cmn_node, u64 type, u64 errgsr_addr) +{ + struct device *dev =3D &pdev->dev; + int ret; + + cmn_node->dev =3D dev; + cmn_node->type =3D ACPI_AEST_VENDOR_ERROR_NODE; + cmn_node->name =3D devm_kasprintf(dev, GFP_KERNEL, "%s.%llx", + cmn_config->node_name[type], errgsr_addr); + if (!cmn_node->name) + return -ENOMEM; + + /* CMN700 just support version 1 */ + cmn_node->version =3D 1; + cmn_node->errgsr =3D devm_ioremap(dev, errgsr_addr, cmn_config->errgsr_nu= m * 8); + if (!cmn_node->errgsr) + return -ENOMEM; + + cmn_node->errgsr_num =3D cmn_config->errgsr_num; + cmn_node->errgsr_mapping =3D cmn_config->errgsr_mapping; + cmn_node->record_count =3D cmn_config->errgsr_num * BITS_PER_LONG / 2; + cmn_node->record_implemented =3D devm_bitmap_zalloc( + dev, cmn_node->record_count, GFP_KERNEL); + if (!cmn_node->record_implemented) + return -ENOMEM; + bitmap_set(cmn_node->record_implemented, 0, cmn_node->record_count); + + cmn_node->status_reporting =3D devm_bitmap_zalloc( + dev, cmn_node->record_count, GFP_KERNEL); + if (!cmn_node->status_reporting) + return -ENOMEM; + bitmap_set(cmn_node->status_reporting, 0, cmn_node->record_count); + /* If !errgsr_supported leave bitmap zero so all records are polled. 
*/ + + cmn_node->records =3D devm_kcalloc(dev, cmn_node->record_count, + sizeof(struct ras_record), GFP_KERNEL); + if (!cmn_node->records) + return -ENOMEM; + + cmn_node->specific_data_size =3D device_property_count_u8(dev, + "arm,node-specific-data"); + if (cmn_node->specific_data_size > 0) { + cmn_node->specific_data =3D devm_kzalloc(dev, cmn_node->specific_data_si= ze, + GFP_KERNEL); + if (!cmn_node->specific_data) + return -ENOMEM; + ret =3D device_property_read_u8_array(dev, "arm,node-specific-data", + cmn_node->specific_data, + cmn_node->specific_data_size); + if (ret) + return ret; + } + + ras_node_dbg(cmn_node, "Init with errgsr %llx\n", errgsr_addr); + return 0; +} + +/* + * Process one AEST record (one child fwnode) and stitch it into the + * appropriate per-type ras_node. The ras_node is initialised lazily on the + * first record observed for that type. + */ +static int cmn_init_record(struct platform_device *pdev, struct ras_node *= nodes, + struct fwnode_handle *child) +{ + struct device *dev =3D &pdev->dev; + u64 errgsr_addr, record_base; + struct cmn_vendor_data *vendor_data; + struct ras_node *cmn_node; + struct ras_record *record; + int ret, node_index; + u8 interface_type; + + + ret =3D fwnode_property_read_u8(child, "arm,interface-type", + &interface_type); + if (ret) + return ret; + if (interface_type !=3D ACPI_AEST_NODE_SINGLE_RECORD_MEMORY_MAPPED) { + dev_err(dev, "CMN only supports single-record memory mapped\n"); + return -ENODEV; + } + + ret =3D fwnode_property_read_u64(child, "arm,record-base", + &record_base); + if (ret) + return ret; + + vendor_data =3D devm_kzalloc(dev, sizeof(*vendor_data), GFP_KERNEL); + if (!vendor_data) + return -ENOMEM; + + ret =3D cmn_init_vendor_data(dev, vendor_data, &errgsr_addr, record_base); + if (ret) + return ret; + + node_index =3D cmn_config->node_id_map[vendor_data->node_type]; + + cmn_node =3D &nodes[node_index]; + if (!cmn_node->name) { + ret =3D cmn_init_node(pdev, cmn_node, vendor_data->node_type, 
errgsr_add= r); + if (ret) + return ret; + } + + if (vendor_data->logic_id >=3D cmn_node->record_count) { + dev_warn(dev, "logic_id %u exceeds record_count %u\n", + vendor_data->logic_id, cmn_node->record_count); + return 0; + } + + /* + * CMN-700 stitches several single-mapping AEST nodes into one + * aggregate ras_node, so the record_implemented / status_reporting + * bitmaps that ACPI normally provides per group are absent here + * and must be populated by the driver: clear the bit at this + * record's logic_id slot to mark it implemented (and reporting). + */ + clear_bit(vendor_data->logic_id, cmn_node->record_implemented); + /* CMN-700 erratum #2732981, ERRGSR for HN-I / HN-S / SBSX is broken */ + if (vendor_data->node_type !=3D CMN_TYPE_HNI && + vendor_data->node_type !=3D CMN_TYPE_HNS && + vendor_data->node_type !=3D CMN_TYPE_SBSX) + clear_bit(vendor_data->logic_id, cmn_node->status_reporting); + + record =3D &cmn_node->records[vendor_data->logic_id]; + record->name =3D devm_kasprintf(dev, GFP_KERNEL, "record%d", vendor_data-= >logic_id); + if (!record->name) + return -ENOMEM; + record->regs_base =3D devm_ioremap(dev, + (resource_size_t)record_base, + sizeof(struct ras_ext_regs)); + if (!record->regs_base) + return -ENOMEM; + record->node =3D cmn_node; + record->index =3D vendor_data->logic_id; + record->access =3D &ras_access[interface_type]; + + record->vendor_data =3D vendor_data; + record->vendor_data_size =3D sizeof(*vendor_data); + + ras_record_dbg(record, "base %llx\n", record_base); + return 0; +} + +/* + * Vendor pdev (CMN) carries one shared fhi/eri pair. Register it on each + * populated ras_node with IRQF_SHARED so all per-type handlers run, and + * enable per-record FI/CFI/UI in ERXCTLR via the shared ras_enable_irq. 
+ */ +static int cmn_register_record_irq(struct platform_device *pdev, + struct ras_node *nodes) +{ + struct device *dev =3D &pdev->dev; + int fhi_irq, eri_irq, i, ret; + + fhi_irq =3D platform_get_irq_byname_optional(pdev, AEST_FHI_NAME); + eri_irq =3D platform_get_irq_byname_optional(pdev, AEST_ERI_NAME); + if (fhi_irq <=3D 0 && eri_irq <=3D 0) + return 0; + + for (i =3D 0; i < cmn_config->dev_num; i++) { + struct ras_node *n =3D &nodes[i]; + char *desc; + + if (!n->name) /* slot not used by this CMN */ + continue; + + desc =3D devm_kasprintf(dev, GFP_KERNEL, "arm64_ras.%s.%s", + dev_name(dev), n->name); + if (!desc) + return -ENOMEM; + + if (fhi_irq > 0) { + ret =3D devm_request_irq(dev, fhi_irq, ras_irq_func, + IRQF_SHARED, desc, n); + if (ret) + return ret; + n->irq[0] =3D fhi_irq; + } + if (eri_irq > 0) { + ret =3D devm_request_irq(dev, eri_irq, ras_irq_func, + IRQF_SHARED, desc, n); + if (ret) + return ret; + n->irq[1] =3D eri_irq; + } + } + return 0; +} + +/* Common entry point: walk every child swnode under @pdev. */ +static int cmn_probe(struct platform_device *pdev) +{ + struct device *dev =3D &pdev->dev; + struct fwnode_handle *child; + struct ras_node *nodes; + int ret; + + nodes =3D devm_kcalloc(dev, cmn_config->dev_num, sizeof(*nodes), + GFP_KERNEL); + if (!nodes) + return -ENOMEM; + + /* + * In CMN-700, each AEST node is a single mapping record, so + * treat every child fwnode as one record rather than a node + * with multiple records underneath. 
+ */ + device_for_each_child_node(dev, child) { + ret =3D cmn_init_record(pdev, nodes, child); + if (ret) { + fwnode_handle_put(child); + return ret; + } + } + + ret =3D cmn_register_record_irq(pdev, nodes); + if (ret) + return ret; + + platform_set_drvdata(pdev, nodes); + + for (int i =3D 0; i < cmn_config->dev_num; i++) { + ras_online_node(&nodes[i]); + ras_node_init_debugfs(&nodes[i]); + } + + return 0; +} + +int ras_cmn700_probe(struct platform_device *pdev) +{ + cmn_config =3D &cmn700_config; + + dev_set_name(&pdev->dev, "cmn700"); + + return cmn_probe(pdev); +} diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 8cde26ab95de..2fb645659694 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -284,6 +284,7 @@ static void ras_node_foreach_poll_record(void (*func)(s= truct ras_record *, void struct ras_node *node, void *data) { int i; + /* * Per AEST spec: * - record_implemented: bitmap of records that are actually @@ -310,7 +311,7 @@ static void ras_node_foreach_poll_record(void (*func)(s= truct ras_record *, void =20 static int ras_proc(struct ras_node *node) { - int count =3D 0, i, j, size =3D node->record_count; + int count =3D 0, i, j, size =3D node->record_count, record_idx; u64 err_group =3D 0; =20 ras_node_foreach_poll_record(ras_proc_record, node, &count); @@ -321,24 +322,27 @@ static int ras_proc(struct ras_node *node) ras_node_dbg(node, "Report bitmap %*pb\n", size, node->status_reporting); for (i =3D 0; i < BITS_TO_U64(size); i++) { err_group =3D readq_relaxed((void *)node->errgsr + i * 8); - ras_node_dbg(node, "errgsr[%d]: 0x%llx\n", i, err_group); =20 for_each_set_bit(j, (unsigned long *)&err_group, BITS_PER_LONG) { + record_idx =3D node->errgsr_mapping(i * BITS_PER_LONG + j); + ras_node_dbg(node, "errgsr[%d]: bit %d occur error\n", + i, record_idx); /* * Error group base is only valid in Memory Map node, * so driver do not need to write select register and * sync. 
*/ - if (test_bit(i * BITS_PER_LONG + j, node->status_reporting)) + if (test_bit(record_idx, node->status_reporting)) continue; - ras_proc_record(&node->records[j], &count, false); + ras_proc_record(&node->records[record_idx], + &count, false); } } =20 return count; } =20 -static irqreturn_t ras_irq_func(int irq, void *input) +irqreturn_t ras_irq_func(int irq, void *input) { struct ras_node *node =3D input; =20 @@ -529,7 +533,7 @@ static void ras_online_record(struct ras_record *record= , void *data, bool __unus ras_enable_irq(record); } =20 -static void ras_online_node(struct ras_node *node) +void ras_online_node(struct ras_node *node) { int count =3D 0; =20 @@ -808,6 +812,7 @@ static struct ras_node *ras_init_node(struct platform_d= evice *pdev) return ERR_PTR(-EINVAL); } =20 + node->errgsr_mapping =3D default_errgsr_mapping; node->name =3D alloc_ras_node_name(node); if (!node->name) return ERR_PTR(-ENOMEM); @@ -877,10 +882,41 @@ static int ras_setup_irq(struct platform_device *pdev= , struct ras_node *node) return 0; } =20 +static struct ras_vendor_match vendor_match[] =3D { + { "ARMHC701", &ras_cmn700_probe }, + { }, +}; + +static int +ras_vendor_probe(struct platform_device *pdev) +{ + int i; + struct acpi_aest_vendor_v2 vendor; + + device_property_read_u8_array(&pdev->dev, "arm,node-specific-data", + (u8 *)&vendor, sizeof(vendor)); + + dev_dbg(&pdev->dev, "Try to probe vendor node %s\n", vendor.acpi_hid); + for (i =3D 0; i < ARRAY_SIZE(vendor_match); i++) { + if (!strncmp(vendor_match[i].hid, vendor.acpi_hid, 8)) + return vendor_match[i].probe(pdev); + } + + return -ENODEV; +} + static int arm64_ras_probe(struct platform_device *pdev) { int ret; struct ras_node *node; + u8 type; + + ret =3D device_property_read_u8(&pdev->dev, "arm,node-type", &type); + if (ret) + return ret; + + if (type =3D=3D ACPI_AEST_VENDOR_ERROR_NODE) + return ras_vendor_probe(pdev); =20 node =3D ras_init_node(pdev); if (IS_ERR(node)) diff --git a/drivers/ras/arm64/ras.h 
b/drivers/ras/arm64/ras.h index 8a0d2909fe4b..75aa4ac83a41 100644 --- a/drivers/ras/arm64/ras.h +++ b/drivers/ras/arm64/ras.h @@ -9,6 +9,7 @@ #define _DRIVERS_RAS_ARM64_RAS_H_ =20 #include +#include #include #include =20 @@ -98,6 +99,8 @@ struct ras_record { struct ras_node *node; const struct ras_access *access; struct dentry *debugfs; + void *vendor_data; + size_t vendor_data_size; =20 struct ce_threshold ce; enum ras_ce_threshold threshold_type; @@ -177,6 +180,18 @@ struct ras_node { * into SPA */ unsigned long *addressing_mode; + /* + * Usually bit[n] in errgsr indicates that the [n]th error record within + * this error node reports an error, but some components follow different + * rules. For example, CMN700 TRM 4.3.5.12 says: + * ``` Error occurs when the index is even and Fault + * occurs when the index is odd. ``` + * Bit[2n]: record[n] reports ERROR. + * Bit[2n + 1]: record[n] reports FAULT. + * The errgsr_mapping function maps an errgsr bit to a record index + * for such components. + */ + int (*errgsr_mapping)(int errgsr_bit); struct ras_record *records; =20 u32 specific_data_size; @@ -184,6 +199,7 @@ struct ras_node { u32 record_index; u32 flags; int version; + int errgsr_num; =20 u8 type; u8 access_type; @@ -301,8 +317,21 @@ static inline void ras_sync(struct ras_node *node) isb(); } =20 +static inline int default_errgsr_mapping(int errgsr_bit) +{ + return errgsr_bit; +} + +struct ras_vendor_match { + char hid[ACPI_ID_LEN]; + int (*probe)(struct platform_device *pdev); +}; + void ras_node_init_debugfs(struct ras_node *node); void ras_inject_init_debugfs(struct ras_record *record); void ras_proc_record(struct ras_record *record, void *data, bool fake); +irqreturn_t ras_irq_func(int irq, void *input); +void ras_online_node(struct ras_node *node); +int ras_cmn700_probe(struct platform_device *pdev); =20 #endif /* _DRIVERS_RAS_ARM64_RAS_H_ */ --=20 2.51.2.612.gdc70283dfc From nobody Mon Jun 8 04:25:37 2026 Received: from out30-100.freemail.mail.aliyun.com
(out30-100.freemail.mail.aliyun.com [115.124.30.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 42CF73AE185; Tue, 2 Jun 2026 07:16:24 +0000 (UTC)
Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2h-_1780384576 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:17 +0800
From: Ruidong Tian
To: Catalin Marinas, Will Deacon, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, "Rafael J . Wysocki", Len Brown, Tony Luck, Borislav Petkov, Thomas Gleixner, Peter Zijlstra, Robin Murphy, Umang Chheda
Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian
Subject: [PATCH v7 15/16] arm64: ras: Introduce ras error storm mitigation
Date: Tue, 2 Jun 2026 15:15:38 +0800
Message-ID: <20260602071540.3711528-16-tianruidong@linux.alibaba.com>
X-Mailer: git-send-email 2.43.7
In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com>
References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Corrected errors can occasionally be reported at a very high rate (known on x86 as a "CMCI storm"), consuming a substantial share of CPU time and disturbing the running workload. The RAS driver needs to throttle this so that error reporting itself does not become a denial of service.

Adopt a hybrid per-record / per-node strategy:

- Each record tracks its own CE history in a sliding window.
  By default a record enters storm mode once it accumulates at least 5 corrected errors within roughly one minute, at which point its FHI/CFI interrupts are masked via ERR_CTLR and the record is added to a node-wide poll set. It leaves storm mode after ~30 s without any further corrected error, at which point the interrupts are unmasked again. The thresholds and the poll interval are exposed via debugfs so operators can retune the algorithm at runtime without rebuilding the kernel.
- The node owns a single timer that drains every stormy record. The first record entering storm starts the timer; the last one leaving stops it.
- The poll interval is adaptive: halved on each tick that still finds errors (down to a 10 ms floor), doubled on each tick that does not, and capped at 300 s. This way the driver tracks the actual storm intensity instead of paying a fixed polling cost.
- ERI / UI (uncorrectable reporting) is intentionally excluded: uncorrected errors must remain synchronous and never be rate-limited.

The result bounds the CPU time spent on RAS under fault floods while preserving full fidelity for correctness-critical paths.
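The per-record window and the adaptive poll interval described above can be modelled in plain C. What follows is a minimal user-space sketch under stated assumptions, not the driver code: the names `storm_unit`, `storm_observe()` and `next_interval_ms()` are hypothetical, time is passed in as whole seconds instead of jiffies, and the constants simply mirror the defaults quoted in this message (threshold of 5, 30 quiet slots, 10 ms floor, 300 s cap).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HISTORY_BITS    64u      /* width of the history bitmask */
#define BEGIN_THRESHOLD 5u       /* CEs in the window that start a storm */
#define END_POLL_BITS   30u      /* quiet newest slots needed to end a storm */
#define MIN_INTERVAL_MS 10u      /* floor of the adaptive poll interval */
#define MAX_INTERVAL_MS 300000u  /* 300 s cap of the adaptive poll interval */

/* Hypothetical stand-in for the per-record storm state. */
struct storm_unit {
	uint64_t history;  /* bit 0 is the most recent observation */
	uint64_t last_sec; /* time of the previous observation, in seconds */
	bool in_storm;
};

/* Count set bits (portable replacement for the kernel's hweight64()). */
static unsigned int popcount64(uint64_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/*
 * Record one observation taken at now_sec and update the storm state.
 * Returns true if the record is in storm mode afterwards.
 */
static bool storm_observe(struct storm_unit *u, uint64_t now_sec, bool ce_seen)
{
	uint64_t shift;

	assert(now_sec >= u->last_sec);
	shift = now_sec - u->last_sec + 1; /* at least one slot per call */

	/* Age the window; a long quiet gap clears it entirely. */
	u->history = shift < HISTORY_BITS ? u->history << shift : 0;
	if (ce_seen)
		u->history |= 1;
	u->last_sec = now_sec;

	if (u->in_storm) {
		/* Leave once the newest END_POLL_BITS slots hold no CE. */
		if (!(u->history & ((UINT64_C(1) << END_POLL_BITS) - 1))) {
			u->in_storm = false;
			u->history = 0;
		}
	} else if (popcount64(u->history) >= BEGIN_THRESHOLD) {
		u->in_storm = true;
	}
	return u->in_storm;
}

/* Halve the interval after a tick that found errors, double it otherwise. */
static uint32_t next_interval_ms(uint32_t cur_ms, bool found_errors)
{
	if (found_errors) {
		uint32_t half = cur_ms / 2;

		return half < MIN_INTERVAL_MS ? MIN_INTERVAL_MS : half;
	}
	return cur_ms > MAX_INTERVAL_MS / 2 ? MAX_INTERVAL_MS : cur_ms * 2;
}
```

In this model, five corrected errors observed one second apart put the record into storm mode on the fifth observation, and a single quiet observation taken more than 30 s after the last error takes it out again. Keeping the storm decision per record while the interval is per node matches the split described in the list above.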
Signed-off-by: Ruidong Tian --- drivers/ras/arm64/Makefile | 1 + drivers/ras/arm64/ras-cmn.c | 10 ++ drivers/ras/arm64/ras-core.c | 45 +++++++- drivers/ras/arm64/ras-inject.c | 3 + drivers/ras/arm64/ras-storm.c | 198 +++++++++++++++++++++++++++++++++ drivers/ras/arm64/ras-sysfs.c | 107 ++++++++++++++++++ drivers/ras/arm64/ras.h | 37 ++++++ 7 files changed, 397 insertions(+), 4 deletions(-) create mode 100644 drivers/ras/arm64/ras-storm.c diff --git a/drivers/ras/arm64/Makefile b/drivers/ras/arm64/Makefile index 0e4c7421c131..6897798f7314 100644 --- a/drivers/ras/arm64/Makefile +++ b/drivers/ras/arm64/Makefile @@ -6,3 +6,4 @@ arm64_ras-y :=3D ras-core.o arm64_ras-y +=3D ras-sysfs.o arm64_ras-y +=3D ras-inject.o arm64_ras-y +=3D ras-cmn.o +arm64_ras-y +=3D ras-storm.o diff --git a/drivers/ras/arm64/ras-cmn.c b/drivers/ras/arm64/ras-cmn.c index 109cdc46a717..489bbe38de5f 100644 --- a/drivers/ras/arm64/ras-cmn.c +++ b/drivers/ras/arm64/ras-cmn.c @@ -452,6 +452,16 @@ static int cmn_probe(struct platform_device *pdev) platform_set_drvdata(pdev, nodes); =20 for (int i =3D 0; i < cmn_config->dev_num; i++) { + + if (!nodes[i].name) + continue; + + ret =3D arm64_ras_storm_init(&nodes[i]); + if (ret) { + ras_node_err(&nodes[i], "init storm mitigation failed\n"); + return ret; + } + ras_online_node(&nodes[i]); ras_node_init_debugfs(&nodes[i]); } diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c index 2fb645659694..82e8bb10870f 100644 --- a/drivers/ras/arm64/ras-core.c +++ b/drivers/ras/arm64/ras-core.c @@ -207,6 +207,9 @@ void ras_proc_record(struct ras_record *record, void *d= ata, bool fake) u64 ue; =20 regs.err_status =3D record_read(record, ERXSTATUS); + + arm64_ras_storm_track_record(record, regs.err_status); + if (!(regs.err_status & ERR_STATUS_V)) return; =20 @@ -265,9 +268,8 @@ void ras_proc_record(struct ras_record *record, void *d= ata, bool fake) record_write(record, ERXSTATUS, regs.err_status); } =20 -static void ras_node_foreach_record(void 
(*func)(struct ras_record *, void= *, bool), - struct ras_node *node, void *data, - unsigned long *bitmap) +void ras_node_foreach_record(void (*func)(struct ras_record *, void *, boo= l), + struct ras_node *node, void *data, unsigned long *bitmap) { int i; =20 @@ -410,7 +412,7 @@ static int ras_register_irq(struct ras_node *node) return ret; } =20 -static void ras_enable_irq(struct ras_record *record) +void ras_enable_irq(struct ras_record *record) { struct ras_node *node =3D record->node; u64 err_ctlr; @@ -425,6 +427,15 @@ static void ras_enable_irq(struct ras_record *record) record_write(record, ERXCTLR, err_ctlr); } =20 +void ras_disable_irq(struct ras_record *record) +{ + u64 err_ctlr; + + err_ctlr =3D record_read(record, ERXCTLR); + err_ctlr &=3D ~(ERR_CTLR_FI | ERR_CTLR_CFI); + record_write(record, ERXCTLR, err_ctlr); +} + static int ras_get_ce_threshold(struct ras_record *record) { u64 err_fr, err_fr_cec, err_fr_rp; @@ -533,6 +544,11 @@ static void ras_online_record(struct ras_record *recor= d, void *data, bool __unus ras_enable_irq(record); } =20 +static void ras_offline_record(struct ras_record *record, void *data, bool= __unused) +{ + ras_disable_irq(record); +} + void ras_online_node(struct ras_node *node) { int count =3D 0; @@ -547,10 +563,23 @@ void ras_online_node(struct ras_node *node) =20 ras_config_irq(node); =20 + arm64_ras_storm_init(node); + ras_node_foreach_record(ras_online_record, node, NULL, node->record_implemented); } =20 +static void ras_offline_node(struct ras_node *node) +{ + if (!node->name) + return; + + arm64_ras_storm_reset_node(node); + + ras_node_foreach_record(ras_offline_record, node, NULL, + node->record_implemented); +} + static void ras_online_oncore_dev(void *data) { int fhi_irq, eri_irq; @@ -571,6 +600,8 @@ static void ras_offline_oncore_dev(void *data) int fhi_irq, eri_irq; struct ras_node *node =3D this_cpu_ptr(data); =20 + ras_offline_node(node); + fhi_irq =3D node->irq[ACPI_AEST_NODE_FAULT_HANDLING]; if (fhi_irq > 0) 
disable_percpu_irq(fhi_irq); @@ -937,6 +968,12 @@ static int arm64_ras_probe(struct platform_device *pde= v) return ret; } =20 + ret =3D arm64_ras_storm_init(node); + if (ret) { + ras_node_err(node, "init storm mitigation failed\n"); + return ret; + } + if (ras_node_is_oncore(node)) ret =3D cpuhp_setup_state(CPUHP_AP_ARM_RAS_STARTING, "drivers/ras/arm64/ras:starting", diff --git a/drivers/ras/arm64/ras-inject.c b/drivers/ras/arm64/ras-inject.c index 7fb522a845e7..5e4b22806756 100644 --- a/drivers/ras/arm64/ras-inject.c +++ b/drivers/ras/arm64/ras-inject.c @@ -82,6 +82,9 @@ static int soft_inject_store(void *data, u64 val) =20 ras_proc_record(&record_inj, &count, true); =20 + memcpy(&record->count, &record_inj.count, sizeof(record->count)); + memcpy(&record->storm, &record_inj.storm, sizeof(record->storm)); + if (count !=3D 1) return -EIO; =20 diff --git a/drivers/ras/arm64/ras-storm.c b/drivers/ras/arm64/ras-storm.c new file mode 100644 index 000000000000..6ec95a34809b --- /dev/null +++ b/drivers/ras/arm64/ras-storm.c @@ -0,0 +1,198 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * arm64 RAS error storm mitigation. + * + * This file plumbs the architecture-independent storm tracker + * (drivers/ras/storm.c) into the arm64 RAS driver. Storm tracking is + * per-record while the mitigation is mixed: + * + * - When a record enters storm mode its FHI (fault handling) and CFI + * (corrected fault) interrupts are masked via ERR_CTLR. + * - The first record entering storm starts the node's poll timer + * which drains all currently stormy records. + * - When a record leaves storm mode its interrupts are re-enabled and + * it is removed from the poll. The timer stops once no + * record remains in storm. + * + * Note that ERI / UI (uncorrected error reporting) is intentionally + * left untouched: uncorrected errors must continue to be delivered + * synchronously and never participate in storm suppression. 
+ */ +#include +#include +#include +#include +#include +#include +#include + +#include "ras.h" + +#define INITIAL_CHECK_INTERVAL (5 * 60) /* 5 minutes */ + +#define NUM_HISTORY_BITS (sizeof(u64) * BITS_PER_BYTE) + +/* How many errors within the history buffer mark the start of a storm. */ +#define STORM_BEGIN_THRESHOLD 5 + +/* + * How many polls of machine check bank without an error before declaring + * the storm is over. Since it is tracked by the bitmasks in the history + * field of struct storm_bank the mask is 30 bits [0 ... 29]. + */ +#define STORM_END_POLL_THRESHOLD 29 + +static void arm64_ras_storm_timer_fn(struct timer_list *t) +{ + struct ras_node *node =3D timer_container_of(node, t, storm_timer); + unsigned long iv =3D msecs_to_jiffies(node->timer_interval); + int count =3D 0; + + ras_node_dbg(node, "Stormy bitmap %*pb\n", node->record_count, node->stor= m_bitmap); + ras_node_foreach_record(ras_proc_record, node, &count, node->storm_bitmap= ); + + if (count) + iv =3D max(iv / 2, (unsigned long) HZ/100); + else + iv =3D min(iv * 2, INITIAL_CHECK_INTERVAL * HZ); + + + node->timer_interval =3D jiffies_to_msecs(iv); + + if (atomic_read(&node->stormy_count)) { + ras_node_dbg(node, "next poll at %d ms\n", node->timer_interval); + mod_timer(&node->storm_timer, jiffies + iv); + } +} + +static void +arm64_ras_storm_reset_record(struct ras_record *record, void *__unused0, b= ool __unused1) +{ + struct ras_storm_unit *unit =3D &record->storm; + + unit->in_storm_mode =3D false; + unit->history =3D 0; + unit->timestamp =3D 0; +} + +void arm64_ras_storm_reset_node(void *data) +{ + struct ras_node *node =3D data; + + node->begin_threshold =3D STORM_BEGIN_THRESHOLD; + node->end_poll_threshold =3D STORM_END_POLL_THRESHOLD; + node->timer_interval =3D INITIAL_CHECK_INTERVAL * MSEC_PER_SEC; + + timer_delete_sync(&node->storm_timer); + bitmap_set(node->storm_bitmap, 0, node->record_count); + + ras_node_foreach_record(arm64_ras_storm_reset_record, node, NULL, node->r= 
ecord_implemented); +} + +static int arm64_ras_storm_do_init(struct ras_node *node) +{ + node->storm_bitmap =3D devm_bitmap_zalloc(node->dev, + node->record_count, GFP_KERNEL); + if (!node->storm_bitmap) + return -ENOMEM; + + timer_setup(&node->storm_timer, arm64_ras_storm_timer_fn, 0); + + return devm_add_action_or_reset(node->dev, + arm64_ras_storm_reset_node, node); +} + +int arm64_ras_storm_init(struct ras_node *node) +{ + int ret =3D 0; + + if (!node->record_count) + return ret; + + /* + * Per-CPU (oncore) nodes re-enter this path on every CPU + * online transition, so the bitmap is allocated only on the + * first call and reused on subsequent re-inits. + */ + if (!node->storm_bitmap) { + ret =3D arm64_ras_storm_do_init(node); + if (ret) + return ret; + } + + arm64_ras_storm_reset_node(node); + return ret; +} + +/** + * The function maintains the unit's history bitmap and decides whether + * the unit should enter or leave storm mode. + */ +static void ras_track_storm(struct ras_storm_unit *unit, bool corrected) +{ + unsigned long now =3D jiffies, delta; + unsigned int shift =3D 1; + u64 history =3D 0; + + /* + * Check how long it has been since this bank was last checked, + * and adjust the amount of "shift" to apply to history. + */ + delta =3D now - unit->timestamp; + shift =3D (delta + HZ) / HZ; + + /* If it has been a long time since the last poll, clear history. */ + if (shift < NUM_HISTORY_BITS) + history =3D unit->history << shift; + + unit->timestamp =3D now; + + /* History keeps track of corrected errors. 
VAL=3D1 && UC=3D0 */ + if (corrected) + history |=3D 1; + + unit->history =3D history; +} + +void arm64_ras_storm_track_record(struct ras_record *record, u64 err_statu= s) +{ + struct ras_storm_unit *u =3D &record->storm; + struct ras_node *node =3D record->node; + + ras_track_storm(u, err_status & ERR_STATUS_CE); + + if (u->in_storm_mode) { + /* + * Storm ends when no corrected error has been seen for + * STORM_END_POLL_THRESHOLD + 1 consecutive polls. + */ + if (u->history & GENMASK_ULL(node->end_poll_threshold, 0)) + return; + + u->in_storm_mode =3D false; + u->history =3D 0; + set_bit(record->index, node->storm_bitmap); + + ras_node_info(node, "%s: exited storm mode\n", record->name); + ras_enable_irq(record); + + if (atomic_dec_and_test(&record->node->stormy_count)) + timer_delete(&node->storm_timer); + } else { + if (hweight64(u->history) < node->begin_threshold) + return; + + u->in_storm_mode =3D true; + clear_bit(record->index, node->storm_bitmap); + + ras_node_info(node, "%s: entered storm mode\n", record->name); + ras_disable_irq(record); + /* + * If this is the first record on this node to enter storm mode + * start polling. 
+ */ + if (atomic_inc_return(&record->node->stormy_count) =3D=3D 1) + mod_timer(&node->storm_timer, + jiffies + msecs_to_jiffies(node->timer_interval)); + } +} diff --git a/drivers/ras/arm64/ras-sysfs.c b/drivers/ras/arm64/ras-sysfs.c index d8b351ee9aef..9bf83d92d420 100644 --- a/drivers/ras/arm64/ras-sysfs.c +++ b/drivers/ras/arm64/ras-sysfs.c @@ -122,6 +122,109 @@ static int node_ce_threshold_set(void *data, u64 val) DEFINE_DEBUGFS_ATTRIBUTE(node_ce_threshold_ops, NULL, node_ce_threshold_set, "%llu\n"); =20 +/* Storm debugfs entries */ + +static int storm_stormy_count_get(void *data, u64 *val) +{ + struct ras_node *node =3D data; + + *val =3D atomic_read(&node->stormy_count); + return 0; +} +DEFINE_DEBUGFS_ATTRIBUTE(storm_stormy_count_ops, storm_stormy_count_get, + NULL, "%llu\n"); + +static int storm_begin_threshold_get(void *data, u64 *val) +{ + struct ras_node *node =3D data; + + *val =3D node->begin_threshold; + return 0; +} + +static int storm_begin_threshold_set(void *data, u64 val) +{ + struct ras_node *node =3D data; + + if (val < 1 || val > BITS_PER_LONG) + return -EINVAL; + + node->begin_threshold =3D val; + return 0; +} +DEFINE_DEBUGFS_ATTRIBUTE(storm_begin_threshold_ops, storm_begin_threshold_= get, + storm_begin_threshold_set, "%llu\n"); + +static int storm_end_poll_threshold_get(void *data, u64 *val) +{ + struct ras_node *node =3D data; + + *val =3D node->end_poll_threshold; + return 0; +} + +static int storm_end_poll_threshold_set(void *data, u64 val) +{ + struct ras_node *node =3D data; + + if (val >=3D BITS_PER_LONG) + return -EINVAL; + + node->end_poll_threshold =3D val; + return 0; +} +DEFINE_DEBUGFS_ATTRIBUTE(storm_end_poll_threshold_ops, + storm_end_poll_threshold_get, + storm_end_poll_threshold_set, "%llu\n"); + +static int storm_interval_ms_get(void *data, u64 *val) +{ + struct ras_node *node =3D data; + + *val =3D node->timer_interval; + return 0; +} + +static int storm_interval_ms_set(void *data, u64 val) +{ + struct ras_node *node =3D 
data;
+
+	node->timer_interval = val;
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(storm_interval_ms_ops,
+			 storm_interval_ms_get,
+			 storm_interval_ms_set, "%llu\n");
+
+static int record_in_storm_get(void *data, u64 *val)
+{
+	struct ras_record *record = data;
+
+	*val = record->storm.in_storm_mode;
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(record_in_storm_ops, record_in_storm_get,
+			 NULL, "%llu\n");
+
+static void ras_storm_init_debugfs(struct ras_node *node)
+{
+	struct dentry *storm_dir;
+
+	if (!node->record_count)
+		return;
+
+	storm_dir = debugfs_create_dir("storm", node->debugfs);
+
+	debugfs_create_file("stormy_count", 0400, storm_dir,
+			    node, &storm_stormy_count_ops);
+	debugfs_create_file("begin_threshold", 0600, storm_dir,
+			    node, &storm_begin_threshold_ops);
+	debugfs_create_file("end_poll_threshold", 0600, storm_dir,
+			    node, &storm_end_poll_threshold_ops);
+	debugfs_create_file("check_interval_ms", 0600, storm_dir,
+			    node, &storm_interval_ms_ops);
+}
+
 static int ras_record_err_count_show(struct seq_file *m, void *data)
 {
 	struct ras_record *record = m->private;
@@ -151,6 +254,8 @@ static void ras_record_init_debugfs(struct ras_record *record)
 			    record, &ras_record_err_count_fops);
 	debugfs_create_file("ce_threshold", 0600, record->debugfs,
 			    record, &record_ce_threshold_ops);
+	debugfs_create_file("in_storm", 0400, record->debugfs,
+			    record, &record_in_storm_ops);
 	ras_inject_init_debugfs(record);
 }
 
@@ -186,6 +291,7 @@ static void ras_oncore_node_init_debugfs(struct ras_node *node)
 				    percpu_node, &ras_node_err_count_fops);
 		debugfs_create_file("ce_threshold", 0200, percpu_node->debugfs,
 				    percpu_node, &node_ce_threshold_ops);
+		ras_storm_init_debugfs(percpu_node);
 		ras_init_records_debugfs(percpu_node);
 	}
 }
@@ -208,5 +314,6 @@ void ras_node_init_debugfs(struct ras_node *node)
 			    node, &ras_node_err_count_fops);
 	debugfs_create_file("ce_threshold", 0200, node->debugfs,
 			    node, &node_ce_threshold_ops);
+	ras_storm_init_debugfs(node);
 	ras_init_records_debugfs(node);
 }
diff --git a/drivers/ras/arm64/ras.h b/drivers/ras/arm64/ras.h
index 75aa4ac83a41..e54082e2c3b9 100644
--- a/drivers/ras/arm64/ras.h
+++ b/drivers/ras/arm64/ras.h
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 
 #define DEFAULT_CE_THRESHOLD 1
 
@@ -93,6 +94,19 @@ struct record_count {
 	u64 ueu;
 };
 
+/*
+ * history: Bitmask tracking error occurrences. Each set bit
+ *          represents an error seen.
+ *
+ * timestamp: Last time (in jiffies) that the record was polled.
+ * in_storm_mode: Is this record in storm mode?
+ */
+struct ras_storm_unit {
+	u64 history;
+	u64 timestamp;
+	bool in_storm_mode;
+};
+
 struct ras_record {
 	char *name;
 	void __iomem *regs_base;
@@ -105,6 +119,7 @@ struct ras_record {
 	struct ce_threshold ce;
 	enum ras_ce_threshold threshold_type;
 	struct record_count count;
+	struct ras_storm_unit storm;
 
 	int index;
 	/*
@@ -193,6 +208,9 @@ struct ras_node {
 	 */
 	int (*errgsr_mapping)(int errgsr_bit);
 	struct ras_record *records;
+	unsigned long *storm_bitmap;
+
+	struct timer_list storm_timer;
 
 	u32 specific_data_size;
 	u32 record_count;
@@ -206,6 +224,15 @@ struct ras_node {
 	u8 group_format;
 	u32 irq[AEST_MAX_INTERRUPT_PER_NODE];
 	u32 gsi[AEST_MAX_INTERRUPT_PER_NODE];
+
+	/*
+	 * @stormy_count: Updated concurrently from hardirq and
+	 * timer softirqs.
+	 */
+	atomic_t stormy_count;
+	unsigned int begin_threshold;
+	unsigned int end_poll_threshold;
+	unsigned int timer_interval;
 };
 
 #define CASE_READ(res, x) \
@@ -334,4 +361,14 @@ irqreturn_t ras_irq_func(int irq, void *input);
 void ras_online_node(struct ras_node *node);
 int ras_cmn700_probe(struct platform_device *pdev);
 
+void ras_disable_irq(struct ras_record *record);
+void ras_enable_irq(struct ras_record *record);
+void ras_node_foreach_record(void (*func)(struct ras_record *, void *, bool),
+			     struct ras_node *node, void *data, unsigned long *bitmap);
+
+/* ras-storm.c */
+int arm64_ras_storm_init(struct ras_node *node);
+void arm64_ras_storm_track_record(struct ras_record *record, u64 err_status);
+void arm64_ras_storm_reset_node(void *data);
+
 #endif /* _DRIVERS_RAS_ARM64_RAS_H_ */
-- 
2.51.2.612.gdc70283dfc

From nobody Mon Jun 8 04:25:37 2026
Received: from out30-119.freemail.mail.aliyun.com (out30-119.freemail.mail.aliyun.com [115.124.30.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2B46C378803; Tue, 2 Jun 2026 07:16:24 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.119
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384591; cv=none; b=Z8l4PxSPNl4fvvPzSCynw2Mm34MKGEQK3gglGioO8twbjOKN5TaFbQmpGN/eSh/mpEmmXBCNrLBAeaAC7xwziAPGJiFZUoLM7FOphydSoVP3RmkHIMtwetfJTAcXWETa70gX0fo8EMa0bj08kChwFN+6WQuCixZzNfy3v8foqtI=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780384591; c=relaxed/simple; bh=dwOtUaaZ1T28ZHMpxfRgsLXLHFh404qL2muulpCIriU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XWH7VU+/7wcIdl++896KyCaiVszF+v4RUPBbsIXfQ7151U0hRSoZ9mrcV5DPzGWKki0ri/QaXnPb2bkqtaBo1+rdTLjwv61Pgwc6zJqOCgQVQg5ykZiQOKfBgbY+EoXKAwi7g1FGCwJwzTh6vZpxf4dzEjByLBfqYISVN8Cq23c=
ARC-Authentication-Results: i=1;
	smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=ejTna3FH; arc=none smtp.client-ip=115.124.30.119
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="ejTna3FH"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1780384580; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=m8PYl6vt/JGLK/5rXbDBL7ylKeOz6sqUbxtL6VST4yk=; b=ejTna3FHsIne9z7Da7wJFICwGYJcS0wZJ053H8Tus8UXsdGhsKU0s0zOgYOKmb1/QmxgKlrtvmDObCRh8ZpPp3uP0M50IUx753hhbzCpVYy/0mZGRACR0kDxErphTgzSwS1IBRSa1ZJDLwsJ5WHl8HjCJals9uzfJT3/NTmiHxw=
X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R101e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=maildocker-contentspam011083073210;MF=tianruidong@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0X43R2hf_1780384577;
Received: from t50a05405.sqa.eu95.tbsite.net(mailfrom:tianruidong@linux.alibaba.com fp:SMTPD_---0X43R2hf_1780384577 cluster:ay36) by smtp.aliyun-inc.com; Tue, 02 Jun 2026 15:16:19 +0800
From: Ruidong Tian
To: Catalin Marinas, Will Deacon, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, "Rafael J .
	Wysocki", Len Brown, Tony Luck, Borislav Petkov, Thomas Gleixner, Peter Zijlstra, Robin Murphy, Umang Chheda
Cc: linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, zhuo.song@linux.alibaba.com, oliver.yang@linux.alibaba.com, Ruidong Tian
Subject: [PATCH v7 16/16] trace, ras: add ARM RAS extension trace event
Date: Tue, 2 Jun 2026 15:15:39 +0800
Message-ID: <20260602071540.3711528-17-tianruidong@linux.alibaba.com>
X-Mailer: git-send-email 2.43.7
In-Reply-To: <20260602071540.3711528-1-tianruidong@linux.alibaba.com>
References: <20260602071540.3711528-1-tianruidong@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add a trace event for hardware errors reported by the ARMv8 RAS extension
registers. Userspace applications can monitor this trace event and decode
the error information.

Signed-off-by: Ruidong Tian
---
 drivers/ras/arm64/ras-core.c |  5 +++
 drivers/ras/ras.c            |  3 ++
 include/ras/ras_event.h      | 79 ++++++++++++++++++++++++++++++++++++
 3 files changed, 87 insertions(+)

diff --git a/drivers/ras/arm64/ras-core.c b/drivers/ras/arm64/ras-core.c
index 82e8bb10870f..3f4e7866bb75 100644
--- a/drivers/ras/arm64/ras-core.c
+++ b/drivers/ras/arm64/ras-core.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 
 #include "ras.h"
 
@@ -181,6 +182,10 @@ static void ras_do_proc(struct ras_record *record, struct ras_ext_regs *regs)
 		}
 	}
 
+	trace_arm_ras_ext_event(record->node->type, record->index, regs,
+				record->node->specific_data, record->node->specific_data_size,
+				record->vendor_data, record->vendor_data_size);
+
 	atomic_notifier_call_chain(&ras_decoder_chain, 0, record);
 
 	if (status & ERR_STATUS_CE)
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index 03df3db62334..c8858b745021 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -115,6 +115,9 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(extlog_mem_event);
 EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
 EXPORT_TRACEPOINT_SYMBOL_GPL(non_standard_event);
 EXPORT_TRACEPOINT_SYMBOL_GPL(arm_event);
+#ifdef CONFIG_ARM64_RAS_EXTN
+EXPORT_TRACEPOINT_SYMBOL_GPL(arm_ras_ext_event);
+#endif
 
 static int __init parse_ras_param(char *str)
 {
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index fdb785fa4613..346c868f3cf7 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -381,6 +381,85 @@ TRACE_EVENT(aer_event,
 		"Not available")
 );
 #endif /* CONFIG_PCIEAER */
+
+/*
+ * ARM RAS Extension Events Report
+ *
+ * This event is generated when an error reported by the ARM RAS extension
+ * hardware is detected.
+ */
+
+#ifdef CONFIG_ARM64_RAS_EXTN
+#include
+TRACE_EVENT(arm_ras_ext_event,
+
+	TP_PROTO(const u8 type,
+		 const u32 index,
+		 struct ras_ext_regs *regs,
+		 const u8 *specific_data,
+		 const u32 specific_data_size,
+		 const u8 *vendor_data,
+		 const u32 vendor_data_size),
+
+	TP_ARGS(type, index, regs, specific_data, specific_data_size,
+		vendor_data, vendor_data_size),
+
+	TP_STRUCT__entry(
+		__field(u8, type)
+		__field(u32, index)
+		__field(u64, err_fr)
+		__field(u64, err_ctlr)
+		__field(u64, err_status)
+		__field(u64, err_addr)
+		__field(u64, err_misc0)
+		__field(u64, err_misc1)
+		__field(u64, err_misc2)
+		__field(u64, err_misc3)
+		__field(u32, specific_data_size)
+		__dynamic_array(u8, specific_data, specific_data_size)
+		__field(u32, vendor_data_size)
+		__dynamic_array(u8, vendor_data, vendor_data_size)
+	),
+
+	TP_fast_assign(
+		__entry->type = type;
+		__entry->index = index;
+		__entry->err_fr = regs->err_fr;
+		__entry->err_ctlr = regs->err_ctlr;
+		__entry->err_status = regs->err_status;
+		__entry->err_addr = regs->err_addr;
+		__entry->err_misc0 = regs->err_misc[0];
+		__entry->err_misc1 = regs->err_misc[1];
+		__entry->err_misc2 = regs->err_misc[2];
+		__entry->err_misc3 = regs->err_misc[3];
+		__entry->specific_data_size = specific_data_size;
+		memcpy(__get_dynamic_array(specific_data), specific_data, specific_data_size);
+		__entry->vendor_data_size = vendor_data_size;
+		memcpy(__get_dynamic_array(vendor_data), vendor_data, vendor_data_size);
+	),
+
+	TP_printk("type: %d; index: %d; "
+		  "ERR_FR: %llx; ERR_CTLR: %llx; ERR_STATUS: %llx; "
+		  "ERR_ADDR: %llx; ERR_MISC0: %llx; ERR_MISC1: %llx; "
+		  "ERR_MISC2: %llx; ERR_MISC3: %llx; "
+		  "specific data len:%d; specific data:%s; "
+		  "vendor data len:%d; vendor data:%s",
+		  __entry->type,
+		  __entry->index,
+		  __entry->err_fr,
+		  __entry->err_ctlr,
+		  __entry->err_status,
+		  __entry->err_addr,
+		  __entry->err_misc0,
+		  __entry->err_misc1,
+		  __entry->err_misc2,
+		  __entry->err_misc3,
+		  __entry->specific_data_size,
+		  __print_hex(__get_dynamic_array(specific_data), __entry->specific_data_size),
+		  __entry->vendor_data_size,
+		  __print_hex(__get_dynamic_array(vendor_data), __entry->vendor_data_size))
+);
+#endif /* CONFIG_ARM64_RAS_EXTN */
 #endif /* _TRACE_HW_EVENT_MC_H */
 
 /* This part must be outside protection */
-- 
2.51.2.612.gdc70283dfc
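
The commit message for patch 16/16 says that userspace can monitor arm_ras_ext_event and decode it. As a rough illustration only (not part of this series), the fixed register fields of one rendered trace line could be extracted with sscanf(). The sketch assumes the text layout produced by the TP_printk() format in the patch; struct ras_ext_sample and parse_ras_ext_line() are hypothetical names, and the variable-length hex dumps that follow ERR_MISC3 are deliberately left unparsed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace mirror of the fixed fields printed by the event. */
struct ras_ext_sample {
	unsigned int type;
	unsigned int index;
	unsigned long long err_fr;
	unsigned long long err_ctlr;
	unsigned long long err_status;
	unsigned long long err_addr;
	unsigned long long err_misc[4];
};

/*
 * Parse one rendered arm_ras_ext_event line, assumed to follow the
 * TP_printk() layout from the patch. Returns 0 on success, -1 otherwise.
 */
static int parse_ras_ext_line(const char *line, struct ras_ext_sample *s)
{
	/* Skip whatever prefix (task, CPU, timestamp) precedes the payload. */
	const char *p = strstr(line, "type: ");
	int n;

	if (!p)
		return -1;

	n = sscanf(p,
		   "type: %u; index: %u; "
		   "ERR_FR: %llx; ERR_CTLR: %llx; ERR_STATUS: %llx; "
		   "ERR_ADDR: %llx; ERR_MISC0: %llx; ERR_MISC1: %llx; "
		   "ERR_MISC2: %llx; ERR_MISC3: %llx;",
		   &s->type, &s->index, &s->err_fr, &s->err_ctlr,
		   &s->err_status, &s->err_addr, &s->err_misc[0],
		   &s->err_misc[1], &s->err_misc[2], &s->err_misc[3]);

	/* All ten fixed fields must convert; the hex dumps are ignored. */
	return n == 10 ? 0 : -1;
}
```

A consumer would typically read lines from the tracefs trace_pipe file and hand each one to such a helper; a production decoder would more likely read the binary ring buffer through a tracing library than scrape the text output.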