[PATCH v2 0/2] Incorporate DRAM address in EDAC messages

Avadhut Naik posted 2 patches 4 months, 3 weeks ago
There is a newer version of this series
[PATCH v2 0/2] Incorporate DRAM address in EDAC messages
Posted by Avadhut Naik 4 months, 3 weeks ago
Currently, the amd64_edac module reports only the UMC normalized address
and the system physical address when a DRAM ECC error occurs. The DRAM
address is neither logged nor exported through a tracepoint.

Modern AMD SoCs provide a UEFI PRM module that implements various address
translation PRM handlers. These handlers can be leveraged to convert a
UMC normalized address into a DRAM address at runtime when a DRAM ECC
error occurs. The translated DRAM address can then be logged and exported
through tracepoints. This set adds the required support.

The first patch adds support in the Address Translation Library to invoke
the appropriate PRM handler to perform the translation.
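
For readers unfamiliar with the flow, here is a rough userspace sketch of
the idea, not the code from the patch: a translation entry point that
defers to a platform-provided handler when one is registered and fails
cleanly otherwise. Every name here (struct example_dram_addr and its
fields, translate_fn, registered_handler, example_norm_to_dram(),
fake_handler()) is hypothetical, and the bit slicing in fake_handler() is
purely illustrative; the real decoding is done by firmware.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DRAM address layout; field names are assumptions. */
struct example_dram_addr {
	uint8_t  chip_select;
	uint8_t  bank_group;
	uint8_t  bank_addr;
	uint32_t row_addr;
	uint16_t col_addr;
};

/* Stand-in for a firmware-provided translation handler. */
typedef int (*translate_fn)(uint64_t norm_addr,
			    struct example_dram_addr *out);

/* NULL means the platform offers no handler for this translation. */
static translate_fn registered_handler;

/* Translate a normalized address, or report why it cannot be done. */
static int example_norm_to_dram(uint64_t norm_addr,
				struct example_dram_addr *out)
{
	if (!out)
		return -EINVAL;

	if (!registered_handler)
		return -ENODEV;

	return registered_handler(norm_addr, out);
}

/* Illustrative handler only: slices arbitrary bit fields out of the input. */
static int fake_handler(uint64_t norm_addr, struct example_dram_addr *out)
{
	out->col_addr    = norm_addr & 0x3ff;
	out->row_addr    = (norm_addr >> 10) & 0xffff;
	out->bank_addr   = (norm_addr >> 26) & 0x3;
	out->bank_group  = (norm_addr >> 28) & 0x3;
	out->chip_select = (norm_addr >> 30) & 0x3;

	return 0;
}
```

A caller would set registered_handler to fake_handler, call
example_norm_to_dram(), and skip reporting the DRAM address whenever the
return value is non-zero.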

The second patch uses that support to log the DRAM address and export it
through the RAS tracepoint when a DRAM ECC error occurs.

Changes in v2:
 - Modify commit messages per feedback received.
 - Remove unnecessary variables.
 - Rename struct dram_addr to atl_dram_addr.
 - Replace sprintf call in __log_ecc_error() with scnprintf.
 - Pass the DRAM Address to edac_mc_handle_error() through the
   "other_detail" parameter instead of "msg".
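
To illustrate why scnprintf is preferred over sprintf when building such a
detail string, here is a minimal userspace sketch. example_scnprintf() is
a hypothetical stand-in that mimics the kernel helper's contract (never
write past the buffer, return the number of characters actually stored),
and example_format_detail(), its parameters, and the format string are
assumptions for illustration, not the message format used by the patch.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in for the kernel's scnprintf(): bounded like
 * snprintf(), but returns the number of characters actually written
 * (excluding the trailing NUL) instead of the would-be length.
 */
static int example_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0) {
		buf[0] = '\0';
		return 0;
	}

	/* Output was truncated: only size - 1 characters were stored. */
	if ((size_t)n >= size)
		return (int)(size - 1);

	return n;
}

/* Hypothetical detail string; the field set and format are assumptions. */
static int example_format_detail(char *buf, size_t size, unsigned int cs,
				 unsigned int bank_grp, unsigned int bank_addr,
				 unsigned int row, unsigned int col)
{
	return example_scnprintf(buf, size,
				 "cs: %u bank_grp: %u bank_addr: %u row: 0x%x col: 0x%x",
				 cs, bank_grp, bank_addr, row, col);
}
```

Because the return value never exceeds the space available, it can be
used directly as an offset when appending further fields to the same
buffer, which is not safe with the return value of snprintf.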

Avadhut Naik (2):
  RAS/AMD/ATL: Translate UMC normalized address to DRAM address using
    PRM
  EDAC/amd64: Incorporate DRAM Address in EDAC message

 drivers/edac/amd64_edac.c      | 23 +++++++++++++++++++++-
 drivers/edac/amd64_edac.h      |  1 +
 drivers/ras/amd/atl/core.c     |  3 ++-
 drivers/ras/amd/atl/internal.h |  9 +++++++++
 drivers/ras/amd/atl/prm.c      | 36 ++++++++++++++++++++++++++++++----
 drivers/ras/amd/atl/umc.c      |  9 +++++++++
 drivers/ras/ras.c              | 18 +++++++++++++++--
 include/linux/ras.h            | 19 +++++++++++++++++-
 8 files changed, 109 insertions(+), 9 deletions(-)


base-commit: 501973598d05fdb1d1089fbf3cf40b605b836e16
-- 
2.43.0
Re: [PATCH v2 0/2] Incorporate DRAM address in EDAC messages
Posted by Yazen Ghannam 4 months ago
On Mon, Sep 15, 2025 at 09:20:21PM +0000, Avadhut Naik wrote:
> Currently, the amd64_edac module only provides UMC normalized and system
> physical address when a DRAM ECC error occurs. DRAM Address is neither
> logged nor exported through tracepoint.
> 
> Modern AMD SOCs provide UEFI PRM module that implements various address
> translation PRM handlers. These PRM handlers can be leveraged to convert
> UMC normalized address into DRAM address at runtime on occurrence of a
> DRAM ECC error. This translated DRAM address can then be logged and
> exported through tracepoints. This set adds the required support to
> accomplish the aforementioned.
> 
> The first patch adds support in the Address Translation Library to invoke
> the appropriate PRM handler to perform the translation.
> 
> The second patch leverages the support added in the first patch to log
> DRAM Address and export it through the RAS tracepoint on occurrence of a
> DRAM ECC error.
> 
> Changes in v2:
>  - Modify commit messages per feedback received.
>  - Remove unnecessary variables.
>  - Rename struct dram_addr to atl_dram_addr.
>  - Replace sprintf call in __log_ecc_error() with scnprintf.
>  - Pass the DRAM Address to edac_mc_handle_error() through "other_detail"
> parameter instead of "msg".
> 
> Avadhut Naik (2):
>   RAS/AMD/ATL: Translate UMC normalized address to DRAM address using
>     PRM
>   EDAC/amd64: Incorporate DRAM Address in EDAC message
> 
>  drivers/edac/amd64_edac.c      | 23 +++++++++++++++++++++-
>  drivers/edac/amd64_edac.h      |  1 +
>  drivers/ras/amd/atl/core.c     |  3 ++-
>  drivers/ras/amd/atl/internal.h |  9 +++++++++
>  drivers/ras/amd/atl/prm.c      | 36 ++++++++++++++++++++++++++++++----
>  drivers/ras/amd/atl/umc.c      |  9 +++++++++
>  drivers/ras/ras.c              | 18 +++++++++++++++--
>  include/linux/ras.h            | 19 +++++++++++++++++-
>  8 files changed, 109 insertions(+), 9 deletions(-)
> 
> 
> base-commit: 501973598d05fdb1d1089fbf3cf40b605b836e16
> -- 

Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>

Thanks,
Yazen