Hello,

This is an attempt to revive [v5] series. I have attempted to address comments
and suggestions from Marc Zyngier since [v5]. Additionally, I have extended
support for A72 processors. Testing on a problematic A72 SoC has led to the
detection of Correctable Errors (CEs). I am eager to hear your suggestions and
feedback on this series.

Thanks,
Vijay

[v5] https://lore.kernel.org/all/20210401110615.15326-1-s.hauer@pengutronix.de/#t
[v6] https://lore.kernel.org/all/1744241785-20256-1-git-send-email-vijayb@linux.microsoft.com/

Changes since v6:
- restore the change made in [v5] to clear CPU/L2 syndrome registers
  back to read_errors() (Tyler)
- upon detecting a valid error, clear syndrome registers immediately
  to avoid clobbering between the read and write (Marc)
- NULL return check for of_get_cpu_node() (Tyler)
- of_node_put() to avoid refcount issue (Tyler)
- quotes are dropped in yaml file (Krzysztof)

Changes since v5:
- rebase on v6.15-rc1
- the syndrome registers for CPU/L2 memory errors are cleared only upon
  detecting an error, followed by an isb() for synchronization (Marc)
- "edac-enabled" hunk moved to initial patch to avoid breaking virtual
  environments (Marc)
- to ensure compatibility across all three families, we are not reporting
  "L1 Dirty RAM," documented only in the A53 TRM
- the above prompted changing the default CPU L1 error message from
  "unknown" to "Unspecified"
- capturing CPUID/WAY information in L2 memory error log (Marc)
- module license from "GPL v2" to "GPL" (checkpatch.pl warning)
- extend support for A72

Changes since v4:
- Rebase on v5.12-rc5

Changes since v3:
- Add edac-enabled property to make EDAC support optional

Changes since v2:
- drop usage of virtual dt node (Robh)
- use read_sysreg_s instead of open coded variant (James Morse)
- separate error retrieving from error reporting
- use smp_call_function_single rather than smp_call_function_single_async
- make driver single instance and register all 'cpu' hierarchy up front once

Changes since v1:
- Split dt-binding into separate patch
- Sort local function variables in reverse-xmas tree order
- drop unnecessary comparison and make variable bool

Sascha Hauer (2):
  drivers/edac: Add L1 and L2 error detection for A53, A57 and A72
  dt-bindings: arm: cpus: Add edac-enabled property

 .../devicetree/bindings/arm/cpus.yaml |   6 +
 drivers/edac/Kconfig                  |   9 +
 drivers/edac/Makefile                 |   1 +
 drivers/edac/cortex_arm64_l1_l2.c     | 232 ++++++++++++++++++
 4 files changed, 248 insertions(+)
 create mode 100644 drivers/edac/cortex_arm64_l1_l2.c

base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
--
2.49.0

On Fri, Apr 11, 2025 at 03:08:37PM -0700, Vijay Balakrishna wrote:
> Hello,
>
> This is an attempt to revive [v5] series. I have attempted to address comments
> and suggestions from Marc Zyngier since [v5]. Additionally, I have extended
> support for A72 processors. Testing on a problematic A72 SoC has led to the
> detection of Correctable Errors (CEs). I am eager to hear your suggestions and
> feedback on this series.
Did you not read Marc's note:
https://lore.kernel.org/all/86a58kl51r.wl-maz@kernel.org/
or
https://lore.kernel.org/all/86frigkmtd.wl-maz@kernel.org/
?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette

On 4/13/25 13:39, Borislav Petkov wrote:
> On Fri, Apr 11, 2025 at 03:08:37PM -0700, Vijay Balakrishna wrote:
>> Hello,
>>
>> This is an attempt to revive [v5] series. I have attempted to address comments
>> and suggestions from Marc Zyngier since [v5]. Additionally, I have extended
>> support for A72 processors. Testing on a problematic A72 SoC has led to the
>> detection of Correctable Errors (CEs). I am eager to hear your suggestions and
>> feedback on this series.
>
> Did you not read Marc's note:
>
> https://lore.kernel.org/all/86a58kl51r.wl-maz@kernel.org/
>
> or
>
> https://lore.kernel.org/all/86frigkmtd.wl-maz@kernel.org/
>
> ?
>

Hi Borislav,

I did see the second reply above, but not the first, before posting v7. I
opted to submit v7 after addressing the comments and issues identified in v6
for the benefit of those interested. Sascha's v5 series helped us confirm
that a problematic A72 is indeed suffering from CEs.

Our primary focus is on A72. I can re-submit with modifications solely
related to A72 and exclude A53 and A57. As Tyler mentioned, we have a
significant number of A72-based systems in our fleet, and timely replacements
based on CE monitoring will be instrumental in managing them effectively.

Please share your thoughts.

Thanks,
Vijay