[PATCH 09/13] rasdaemon: ras-mc-ctl: Fix logging of memory event type in CXL DRAM error table

shiju.jose@huawei.com posted 13 patches 3 days, 1 hour ago
Posted by shiju.jose@huawei.com 3 days, 1 hour ago
From: Shiju Jose <shiju.jose@huawei.com>

CXL spec rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.

Fix decoding of the memory event type in the CXL DRAM error table in the
RAS SQLite database. ras-mc-ctl decoded this field using the General
Media Event Record table rather than the DRAM Event Record table.
For example, a value of 0x1 was logged as Invalid Address
(General Media Event Record - Memory Event Type) instead of Scrub Media
ECC Error (DRAM Event Record - Memory Event Type), and so on.

Fixes: c38c14afc5d7 ("rasdaemon: ras-mc-ctl: Add support for CXL DRAM trace events")
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
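Note: a minimal standalone sketch of the mismatch described in the commit
message, for illustration only. The DRAM table is copied from the helper
added by this patch; the General Media table is an assumption reconstructed
from the commit message and the CXL 3.0 spec, since the diff context does
not show it. decode_mem_event_type() is a hypothetical helper and is not
part of ras-mc-ctl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed General Media Event Record table (not visible in the diff context).
my @gmer_types = ("Media ECC Error",
                  "Invalid Address",
                  "Data Path Error");

# DRAM Event Record table, as added by this patch.
my @der_types = ("Media ECC Error",
                 "Scrub Media ECC Error",
                 "Invalid Address",
                 "Data Path Error");

# Hypothetical helper: look up $value in the given table, with the same
# "unknown-type" fallback the ras-mc-ctl helpers use for out-of-range values.
sub decode_mem_event_type
{
    my ($table, $value) = @_;

    return "unknown-type" if ($value < 0 || $value > $#{$table});
    return $table->[$value];
}

# Show how the same raw value decodes under each table.
for my $value (0 .. 3) {
    printf "0x%x: GMER=%s, DER=%s\n", $value,
        decode_mem_event_type(\@gmer_types, $value),
        decode_mem_event_type(\@der_types, $value);
}
```

With these tables, 0x1 decodes as "Invalid Address" under the General Media
table and as "Scrub Media ECC Error" under the DRAM table, which is the case
the commit message calls out.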
 util/ras-mc-ctl.in | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/util/ras-mc-ctl.in b/util/ras-mc-ctl.in
index c24941f..3f9bad0 100755
--- a/util/ras-mc-ctl.in
+++ b/util/ras-mc-ctl.in
@@ -1339,7 +1339,7 @@ sub get_cxl_descriptor_flags_text
     return join (", ", @out);
 }
 
-sub get_cxl_mem_event_type
+sub get_cxl_gmer_mem_event_type
 {
     my @types;
 
@@ -1354,6 +1354,22 @@ sub get_cxl_mem_event_type
     return $types[$_[0]];
 }
 
+sub get_cxl_der_mem_event_type
+{
+    my @types;
+
+    if ($_[0] < 0 || $_[0] > 3) {
+	return "unknown-type";
+    }
+
+    @types = ("Media ECC Error",
+	      "Scrub Media ECC Error",
+	      "Invalid Address",
+	      "Data Path Error");
+
+    return $types[$_[0]];
+}
+
 sub get_cxl_transaction_type
 {
     my @types;
@@ -1978,7 +1994,7 @@ sub errors
 	    $out .= sprintf "dpa=0x%llx, ", $dpa if (defined $dpa && length $dpa);
 	    $out .= sprintf "dpa_flags: %s, ", get_cxl_dpa_flags_text($dpa_flags) if (defined $dpa_flags && length $dpa_flags);
 	    $out .= sprintf "descriptor_flags: %s, ", get_cxl_descriptor_flags_text($descriptor) if (defined $descriptor && length $descriptor);
-	    $out .= sprintf "memory event type: %s, ", get_cxl_mem_event_type($mem_event_type) if (defined $mem_event_type && length $mem_event_type);
+	    $out .= sprintf "memory event type: %s, ", get_cxl_gmer_mem_event_type($mem_event_type) if (defined $mem_event_type && length $mem_event_type);
 	    $out .= sprintf "transaction_type: %s, ", get_cxl_transaction_type($transaction_type) if (defined $transaction_type && length $transaction_type);
 	    $out .= sprintf "channel=%u, ", $channel if (defined $channel && length $channel);
 	    $out .= sprintf "rank=%u, ", $rank if (defined $rank && length $rank);
@@ -2024,7 +2040,7 @@ sub errors
 	    $out .= sprintf "dpa=0x%llx, ", $dpa if (defined $dpa && length $dpa);
 	    $out .= sprintf "dpa_flags: %s, ", get_cxl_dpa_flags_text($dpa_flags) if (defined $dpa_flags && length $dpa_flags);
 	    $out .= sprintf "descriptor_flags: %s, ", get_cxl_descriptor_flags_text($descriptor) if (defined $descriptor && length $descriptor);
-	    $out .= sprintf "memory event type: %s, ", get_cxl_mem_event_type($type) if (defined $type && length $type);
+	    $out .= sprintf "memory event type: %s, ", get_cxl_der_mem_event_type($type) if (defined $type && length $type);
 	    $out .= sprintf "transaction_type: %s, ", get_cxl_transaction_type($transaction_type) if (defined $transaction_type && length $transaction_type);
 	    $out .= sprintf "channel=%u, ", $channel if (defined $channel && length $channel);
 	    $out .= sprintf "rank=%u, ", $rank if (defined $rank && length $rank);
-- 
2.43.0
Re: [PATCH 09/13] rasdaemon: ras-mc-ctl: Fix logging of memory event type in CXL DRAM error table
Posted by Jonathan Cameron 1 day, 19 hours ago
On Wed, 20 Nov 2024 09:59:19 +0000
<shiju.jose@huawei.com> wrote:

> From: Shiju Jose <shiju.jose@huawei.com>
> 
> CXL spec rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.
> 
> Fix decoding of memory event type in the CXL DRAM error table in RAS
> SQLite database.
> For e.g. if value is 0x1 it will be logged as an Invalid Address
> (General Media Event Record - Memory Event Type) instead of Scrub Media
> ECC Error (DRAM Event Record - Memory Event Type) and so on.
> 
> Fixes: c38c14afc5d7 ("rasdaemon: ras-mc-ctl: Add support for CXL DRAM trace events")
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Though note I don't really understand this code so only
reviewing based on changes looking correct given what was there
before.

J