[PATCH 06/19] coresight: trbe: Refactor syndrome decoding

Leo Yan posted 19 patches 2 months, 1 week ago
[PATCH 06/19] coresight: trbe: Refactor syndrome decoding
Posted by Leo Yan 2 months, 1 week ago
It gives priority to TRBSR_EL1.EA (external abort); an external abort
will immediately bail out and return an error.

Next, the syndrome decoding is refactored based on two levels of
information: the EC (Event Class) bits and the BSC (Trace Buffer Status
Code) bits.

If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to
identify the specific trace buffer event.  Otherwise, any non-zero
TRBSR_EL1.EC is treated as an error.

For error cases, the driver prints an error string and dumps registers
for debugging.

No additional checks are required for wrap mode beyond verifying the
TRBSR_EL1.WRAP bit, even on units with overwrite errata, as this bit
reliably indicates a buffer wrap.

Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 60 +++++++++++++++++++++-------
 drivers/hwtracing/coresight/coresight-trbe.h |  8 ++--
 2 files changed, 51 insertions(+), 17 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 9e565122816949b37b8ff5e4ba04cfbc317c6f25..28e2bfa68074f19ccaa4a737d00af577aea818fe 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -643,29 +643,61 @@ static void trbe_enable_hw(struct trbe_buf *buf)
 static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle,
 						 u64 trbsr)
 {
+	const char *err_str;
 	int ec = get_trbe_ec(trbsr);
 	int bsc = get_trbe_bsc(trbsr);
-	struct trbe_buf *buf = etm_perf_sink_config(handle);
-	struct trbe_cpudata *cpudata = buf->cpudata;
 
 	WARN_ON(is_trbe_running(trbsr));
-	if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
-		return TRBE_FAULT_ACT_FATAL;
 
-	if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
-		return TRBE_FAULT_ACT_FATAL;
+	if (is_trbe_abort(trbsr)) {
+		err_str = "External abort";
+		goto out_fatal;
+	}
 
-	/*
-	 * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
-	 * it might write data after a WRAP event in the fill mode.
-	 * Thus the check TRBPTR == TRBBASER will not be honored.
-	 */
-	if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) &&
-	    (trbe_may_overwrite_in_fill_mode(cpudata) ||
-	     get_trbe_write_pointer() == get_trbe_base_pointer()))
+	switch (ec) {
+	case TRBE_EC_OTHERS:
+		break;
+	case TRBE_EC_BUF_MGMT_IMPL:
+		err_str = "Unexpected implemented management";
+		goto out_fatal;
+	case TRBE_EC_GP_CHECK_FAULT:
+		err_str = "Granule Protection Check fault";
+		goto out_fatal;
+	case TRBE_EC_STAGE1_ABORT:
+		err_str = "Stage 1 data abort";
+		goto out_fatal;
+	case TRBE_EC_STAGE2_ABORT:
+		err_str = "Stage 2 data abort";
+		goto out_fatal;
+	default:
+		err_str = "Unknown error code";
+		goto out_fatal;
+	}
+
+	switch (bsc) {
+	case TRBE_BSC_NOT_STOPPED:
+		break;
+	case TRBE_BSC_FILLED:
+		break;
+	case TRBE_BSC_TRIGGERED:
+		err_str = "Unexpected trigger status";
+		goto out_fatal;
+	default:
+		err_str = "Unexpected buffer status code";
+		goto out_fatal;
+	}
+
+	if (is_trbe_wrap(trbsr))
 		return TRBE_FAULT_ACT_WRAP;
 
 	return TRBE_FAULT_ACT_SPURIOUS;
+
+out_fatal:
+	pr_err_ratelimited("%s on CPU %d [TRBSR=0x%016llx, TRBPTR=0x%016llx, TRBLIMITR=0x%016llx]\n",
+			   err_str, smp_processor_id(), trbsr,
+			   read_sysreg_s(SYS_TRBPTR_EL1),
+			   read_sysreg_s(SYS_TRBLIMITR_EL1));
+	return TRBE_FAULT_ACT_FATAL;
 }
 
 static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h
index 45202c48accec7c86ba56130e2737bc2d1830fae..d7f7cd763c0c7139cf322b7336ee563073e3bea0 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.h
+++ b/drivers/hwtracing/coresight/coresight-trbe.h
@@ -35,9 +35,11 @@ static inline bool is_trbe_enabled(void)
 	return trblimitr & TRBLIMITR_EL1_E;
 }
 
-#define TRBE_EC_OTHERS		0
-#define TRBE_EC_STAGE1_ABORT	36
-#define TRBE_EC_STAGE2_ABORT	37
+#define TRBE_EC_OTHERS			0x0
+#define TRBE_EC_GP_CHECK_FAULT		0X1e
+#define TRBE_EC_BUF_MGMT_IMPL		0x1f
+#define TRBE_EC_STAGE1_ABORT		0x24
+#define TRBE_EC_STAGE2_ABORT		0x25
 
 static inline int get_trbe_ec(u64 trbsr)
 {

-- 
2.34.1
Re: [PATCH 06/19] coresight: trbe: Refactor syndrome decoding
Posted by Anshuman Khandual 2 months ago
On 01/12/25 4:51 PM, Leo Yan wrote:
> It gives priority to TRBSR_EL1.EA (external abort); an external abort
> will immediately bail out and return an error.

This is an abrupt starting for a commit message without giving context.
The rationale for the change needs to be explained rather than details
of implementation.

> 
> Next, the syndrome decoding is refactored based on two levels of
> information: the EC (Event Class) bits and the BSC (Trace Buffer Status
> Code) bits.
> 
> If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to
> identify the specific trace buffer event.  Otherwise, any non-zero
> TRBSR_EL1.EC is treated as an error.
> 
> For error cases, the driver prints an error string and dumps registers
> for debugging.
> 
> No additional checks are required for wrap mode beyond verifying the
> TRBSR_EL1.WRAP bit, even on units with overwrite errata, as this bit
> reliably indicates a buffer wrap.

Should the errata related changes be done in a separate patch instead ?

> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-trbe.c | 60 +++++++++++++++++++++-------
>  drivers/hwtracing/coresight/coresight-trbe.h |  8 ++--
>  2 files changed, 51 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 9e565122816949b37b8ff5e4ba04cfbc317c6f25..28e2bfa68074f19ccaa4a737d00af577aea818fe 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -643,29 +643,61 @@ static void trbe_enable_hw(struct trbe_buf *buf)
>  static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle,
>  						 u64 trbsr)
>  {
> +	const char *err_str;
>  	int ec = get_trbe_ec(trbsr);
>  	int bsc = get_trbe_bsc(trbsr);
> -	struct trbe_buf *buf = etm_perf_sink_config(handle);
> -	struct trbe_cpudata *cpudata = buf->cpudata;
>  
>  	WARN_ON(is_trbe_running(trbsr));
> -	if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
> -		return TRBE_FAULT_ACT_FATAL;
>  
> -	if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
> -		return TRBE_FAULT_ACT_FATAL;
> +	if (is_trbe_abort(trbsr)) {
> +		err_str = "External abort";
> +		goto out_fatal;
> +	}
>  
> -	/*
> -	 * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
> -	 * it might write data after a WRAP event in the fill mode.
> -	 * Thus the check TRBPTR == TRBBASER will not be honored.
> -	 */
> -	if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) &&
> -	    (trbe_may_overwrite_in_fill_mode(cpudata) ||
> -	     get_trbe_write_pointer() == get_trbe_base_pointer()))
> +	switch (ec) {
> +	case TRBE_EC_OTHERS:
> +		break;

No message for this ?

> +	case TRBE_EC_BUF_MGMT_IMPL:
> +		err_str = "Unexpected implemented management";
> +		goto out_fatal;
> +	case TRBE_EC_GP_CHECK_FAULT:
> +		err_str = "Granule Protection Check fault";
> +		goto out_fatal;
> +	case TRBE_EC_STAGE1_ABORT:
> +		err_str = "Stage 1 data abort";
> +		goto out_fatal;
> +	case TRBE_EC_STAGE2_ABORT:
> +		err_str = "Stage 2 data abort";
> +		goto out_fatal;
> +	default:
> +		err_str = "Unknown error code";
> +		goto out_fatal;
> +	}
> +
> +	switch (bsc) {
> +	case TRBE_BSC_NOT_STOPPED:
> +		break;
> +	case TRBE_BSC_FILLED:
> +		break;
> +	case TRBE_BSC_TRIGGERED:
> +		err_str = "Unexpected trigger status";
> +		goto out_fatal;
> +	default:
> +		err_str = "Unexpected buffer status code";
> +		goto out_fatal;
> +	}

Just wondering if it would be cleaner to add a const char * based
static array mapping these EC/BSC codes with above error messages.
 
> +
> +	if (is_trbe_wrap(trbsr))

But what about TRBE affected with trbe_may_overwrite_in_fill_mode()
is that still being taken care of some how ?

>  		return TRBE_FAULT_ACT_WRAP;
>  
>  	return TRBE_FAULT_ACT_SPURIOUS;
> +
> +out_fatal:
> +	pr_err_ratelimited("%s on CPU %d [TRBSR=0x%016llx, TRBPTR=0x%016llx, TRBLIMITR=0x%016llx]\n",
> +			   err_str, smp_processor_id(), trbsr,
> +			   read_sysreg_s(SYS_TRBPTR_EL1),
> +			   read_sysreg_s(SYS_TRBLIMITR_EL1));

Right - comprehensive TRBE configuration register dump when a fatal
error happens.

> +	return TRBE_FAULT_ACT_FATAL;
>  }
>  
>  static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h
> index 45202c48accec7c86ba56130e2737bc2d1830fae..d7f7cd763c0c7139cf322b7336ee563073e3bea0 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.h
> +++ b/drivers/hwtracing/coresight/coresight-trbe.h
> @@ -35,9 +35,11 @@ static inline bool is_trbe_enabled(void)
>  	return trblimitr & TRBLIMITR_EL1_E;
>  }
>  
> -#define TRBE_EC_OTHERS		0
> -#define TRBE_EC_STAGE1_ABORT	36
> -#define TRBE_EC_STAGE2_ABORT	37
> +#define TRBE_EC_OTHERS			0x0
> +#define TRBE_EC_GP_CHECK_FAULT		0X1e
> +#define TRBE_EC_BUF_MGMT_IMPL		0x1f
> +#define TRBE_EC_STAGE1_ABORT		0x24
> +#define TRBE_EC_STAGE2_ABORT		0x25

Please do mention the document source for these new EC codes in the
commit message.

>  
>  static inline int get_trbe_ec(u64 trbsr)
>  {
>
Re: [PATCH 06/19] coresight: trbe: Refactor syndrome decoding
Posted by Leo Yan 2 months ago
On Fri, Dec 05, 2025 at 09:40:33AM +0530, Anshuman Khandual wrote:
> On 01/12/25 4:51 PM, Leo Yan wrote:
> > It gives priority to TRBSR_EL1.EA (external abort); an external abort
> > will immediately bail out and return an error.
> 
> This is an abrupt starting for a commit message without giving context.
> The rationale for the change needs to be explained rather than details
> of implementation.

Agreed.  I will refine the commit log to state that we need to diagnose
the error code based on the hierarchy information (EC and BSC).

> > Next, the syndrome decoding is refactored based on two levels of
> > information: the EC (Event Class) bits and the BSC (Trace Buffer Status
> > Code) bits.
> > 
> > If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to
> > identify the specific trace buffer event.  Otherwise, any non-zero
> > TRBSR_EL1.EC is treated as an error.
> > 
> > For error cases, the driver prints an error string and dumps registers
> > for debugging.
> > 
> > No additional checks are required for wrap mode beyond verifying the
> > TRBSR_EL1.WRAP bit, even on units with overwrite errata, as this bit
> > reliably indicates a buffer wrap.
> 
> Should the errata related changes be done in a separate patch instead ?

Good point.  I will consider extracting the change; this would make
the review clearer.

[...]

> >  static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle,
> >  						 u64 trbsr)
> >  {
> > +	const char *err_str;
> >  	int ec = get_trbe_ec(trbsr);
> >  	int bsc = get_trbe_bsc(trbsr);
> > -	struct trbe_buf *buf = etm_perf_sink_config(handle);
> > -	struct trbe_cpudata *cpudata = buf->cpudata;
> >  
> >  	WARN_ON(is_trbe_running(trbsr));
> > -	if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
> > -		return TRBE_FAULT_ACT_FATAL;
> >  
> > -	if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
> > -		return TRBE_FAULT_ACT_FATAL;
> > +	if (is_trbe_abort(trbsr)) {
> > +		err_str = "External abort";
> > +		goto out_fatal;
> > +	}
> >  
> > -	/*
> > -	 * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
> > -	 * it might write data after a WRAP event in the fill mode.
> > -	 * Thus the check TRBPTR == TRBBASER will not be honored.
> > -	 */
> > -	if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) &&
> > -	    (trbe_may_overwrite_in_fill_mode(cpudata) ||
> > -	     get_trbe_write_pointer() == get_trbe_base_pointer()))
> > +	switch (ec) {
> > +	case TRBE_EC_OTHERS:
> > +		break;
> 
> No message for this ?

No.  EC_OTHERS means "this is a normal maintenance interrupt".

> > +	case TRBE_EC_BUF_MGMT_IMPL:
> > +		err_str = "Unexpected implemented management";
> > +		goto out_fatal;
> > +	case TRBE_EC_GP_CHECK_FAULT:
> > +		err_str = "Granule Protection Check fault";
> > +		goto out_fatal;
> > +	case TRBE_EC_STAGE1_ABORT:
> > +		err_str = "Stage 1 data abort";
> > +		goto out_fatal;
> > +	case TRBE_EC_STAGE2_ABORT:
> > +		err_str = "Stage 2 data abort";
> > +		goto out_fatal;
> > +	default:
> > +		err_str = "Unknown error code";
> > +		goto out_fatal;
> > +	}
> > +
> > +	switch (bsc) {
> > +	case TRBE_BSC_NOT_STOPPED:
> > +		break;
> > +	case TRBE_BSC_FILLED:
> > +		break;
> > +	case TRBE_BSC_TRIGGERED:
> > +		err_str = "Unexpected trigger status";
> > +		goto out_fatal;
> > +	default:
> > +		err_str = "Unexpected buffer status code";
> > +		goto out_fatal;
> > +	}
> 
> Just wondering if it would be cleaner to add a const char * based
> static array mapping these EC/BSC codes with above error messages.

Now we have two level's (EC/BSC) error codes, it would be complex to
maintain a string array and map to two level's error codes.

> > +	if (is_trbe_wrap(trbsr))
> 
> But what about TRBE affected with trbe_may_overwrite_in_fill_mode()
> is that still being taken care of some how ?

The TRBSR_EL1.WRAP bit reliably indicates that the pointer has
wrapped.  I don't think we need any special handling for the overwrite
erratum here; we can simply return TRBE_FAULT_ACT_WRAP.

Afterward, trbe_get_trace_size() consumes the wrap flag and will
compute the trace size appropriately for the overwrite erratum when
the wrap flag is set.

[...]

> > -#define TRBE_EC_OTHERS		0
> > -#define TRBE_EC_STAGE1_ABORT	36
> > -#define TRBE_EC_STAGE2_ABORT	37
> > +#define TRBE_EC_OTHERS			0x0
> > +#define TRBE_EC_GP_CHECK_FAULT		0X1e
> > +#define TRBE_EC_BUF_MGMT_IMPL		0x1f
> > +#define TRBE_EC_STAGE1_ABORT		0x24
> > +#define TRBE_EC_STAGE2_ABORT		0x25
> 
> Please do mention the document source for these new EC codes in the
> commit message.

As James suggested in another reply, I will consider to add sysreg
enum.

Thanks,
Leo
Re: [PATCH 06/19] coresight: trbe: Refactor syndrome decoding
Posted by Suzuki K Poulose 2 months, 1 week ago
Hi Leo

A couple of minor nits below. Otherwise looks like a nice cleanup


On 01/12/2025 11:21, Leo Yan wrote:
> It gives priority to TRBSR_EL1.EA (external abort); an external abort
> will immediately bail out and return an error.
> 
> Next, the syndrome decoding is refactored based on two levels of
> information: the EC (Event Class) bits and the BSC (Trace Buffer Status
> Code) bits.
> 
> If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to
> identify the specific trace buffer event.  Otherwise, any non-zero
> TRBSR_EL1.EC is treated as an error.
> 
> For error cases, the driver prints an error string and dumps registers
> for debugging.

minor nit: Please avoid describing the code, which cod does well.
Briefly mention what you are doing, giving any additional contexts
that may not be evident from the code.

e.g:

"Add support for decoding the syndrome for TRBE, providing a verbose
description of the code for Fatal/unhandled events.
While at it add the new definitions from the latest Arm ARM for
EC and BSC"




> 
> No additional checks are required for wrap mode beyond verifying the
> TRBSR_EL1.WRAP bit, even on units with overwrite errata, as this bit


> reliably indicates a buffer wrap.
> 
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>   drivers/hwtracing/coresight/coresight-trbe.c | 60 +++++++++++++++++++++-------
>   drivers/hwtracing/coresight/coresight-trbe.h |  8 ++--
>   2 files changed, 51 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 9e565122816949b37b8ff5e4ba04cfbc317c6f25..28e2bfa68074f19ccaa4a737d00af577aea818fe 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -643,29 +643,61 @@ static void trbe_enable_hw(struct trbe_buf *buf)
>   static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle,
>   						 u64 trbsr)
>   {
> +	const char *err_str;
>   	int ec = get_trbe_ec(trbsr);
>   	int bsc = get_trbe_bsc(trbsr);
> -	struct trbe_buf *buf = etm_perf_sink_config(handle);
> -	struct trbe_cpudata *cpudata = buf->cpudata;
>   
>   	WARN_ON(is_trbe_running(trbsr));
> -	if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
> -		return TRBE_FAULT_ACT_FATAL;
>   
> -	if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
> -		return TRBE_FAULT_ACT_FATAL;
> +	if (is_trbe_abort(trbsr)) {
> +		err_str = "External abort";
> +		goto out_fatal;
> +	}
>   
> -	/*
> -	 * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
> -	 * it might write data after a WRAP event in the fill mode.
> -	 * Thus the check TRBPTR == TRBBASER will not be honored.
> -	 */
> -	if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) &&
> -	    (trbe_may_overwrite_in_fill_mode(cpudata) ||
> -	     get_trbe_write_pointer() == get_trbe_base_pointer()))
> +	switch (ec) {
> +	case TRBE_EC_OTHERS:
> +		break;
> +	case TRBE_EC_BUF_MGMT_IMPL:
> +		err_str = "Unexpected implemented management";

minor nit: "Unexpected IMPDEF buffer management event"


Suzuki

> +		goto out_fatal;
> +	case TRBE_EC_GP_CHECK_FAULT:
> +		err_str = "Granule Protection Check fault";
> +		goto out_fatal;
> +	case TRBE_EC_STAGE1_ABORT:
> +		err_str = "Stage 1 data abort";
> +		goto out_fatal;
> +	case TRBE_EC_STAGE2_ABORT:
> +		err_str = "Stage 2 data abort";
> +		goto out_fatal;
> +	default:
> +		err_str = "Unknown error code";
> +		goto out_fatal;
> +	}
> +
> +	switch (bsc) {
> +	case TRBE_BSC_NOT_STOPPED:
> +		break;
> +	case TRBE_BSC_FILLED:
> +		break;
> +	case TRBE_BSC_TRIGGERED:
> +		err_str = "Unexpected trigger status";
> +		goto out_fatal;
> +	default:
> +		err_str = "Unexpected buffer status code";
> +		goto out_fatal;
> +	}
> +
> +	if (is_trbe_wrap(trbsr))
>   		return TRBE_FAULT_ACT_WRAP;
>   
>   	return TRBE_FAULT_ACT_SPURIOUS;
> +
> +out_fatal:
> +	pr_err_ratelimited("%s on CPU %d [TRBSR=0x%016llx, TRBPTR=0x%016llx, TRBLIMITR=0x%016llx]\n",
> +			   err_str, smp_processor_id(), trbsr,
> +			   read_sysreg_s(SYS_TRBPTR_EL1),
> +			   read_sysreg_s(SYS_TRBLIMITR_EL1));
> +	return TRBE_FAULT_ACT_FATAL;
>   }
>   
>   static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h
> index 45202c48accec7c86ba56130e2737bc2d1830fae..d7f7cd763c0c7139cf322b7336ee563073e3bea0 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.h
> +++ b/drivers/hwtracing/coresight/coresight-trbe.h
> @@ -35,9 +35,11 @@ static inline bool is_trbe_enabled(void)
>   	return trblimitr & TRBLIMITR_EL1_E;
>   }
>   
> -#define TRBE_EC_OTHERS		0
> -#define TRBE_EC_STAGE1_ABORT	36
> -#define TRBE_EC_STAGE2_ABORT	37
> +#define TRBE_EC_OTHERS			0x0
> +#define TRBE_EC_GP_CHECK_FAULT		0X1e
> +#define TRBE_EC_BUF_MGMT_IMPL		0x1f
> +#define TRBE_EC_STAGE1_ABORT		0x24
> +#define TRBE_EC_STAGE2_ABORT		0x25
>   
>   static inline int get_trbe_ec(u64 trbsr)
>   {
>
Re: [PATCH 06/19] coresight: trbe: Refactor syndrome decoding
Posted by James Clark 2 months ago

On 02/12/2025 11:06 am, Suzuki K Poulose wrote:
> Hi Leo
> 
> A couple of minor nits below. Otherwise looks like a nice cleanup
> 
> 
> On 01/12/2025 11:21, Leo Yan wrote:
>> It gives priority to TRBSR_EL1.EA (external abort); an external abort
>> will immediately bail out and return an error.
>>
>> Next, the syndrome decoding is refactored based on two levels of
>> information: the EC (Event Class) bits and the BSC (Trace Buffer Status
>> Code) bits.
>>
>> If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to
>> identify the specific trace buffer event.  Otherwise, any non-zero
>> TRBSR_EL1.EC is treated as an error.
>>
>> For error cases, the driver prints an error string and dumps registers
>> for debugging.
> 
> minor nit: Please avoid describing the code, which cod does well.
> Briefly mention what you are doing, giving any additional contexts
> that may not be evident from the code.
> 
> e.g:
> 
> "Add support for decoding the syndrome for TRBE, providing a verbose
> description of the code for Fatal/unhandled events.
> While at it add the new definitions from the latest Arm ARM for
> EC and BSC"
> 
> 
> 
> 
>>
>> No additional checks are required for wrap mode beyond verifying the
>> TRBSR_EL1.WRAP bit, even on units with overwrite errata, as this bit
> 
> 
>> reliably indicates a buffer wrap.
>>
>> Signed-off-by: Leo Yan <leo.yan@arm.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-trbe.c | 60 ++++++++++++++++++ 
>> +++-------
>>   drivers/hwtracing/coresight/coresight-trbe.h |  8 ++--
>>   2 files changed, 51 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/ 
>> hwtracing/coresight/coresight-trbe.c
>> index 
>> 9e565122816949b37b8ff5e4ba04cfbc317c6f25..28e2bfa68074f19ccaa4a737d00af577aea818fe 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -643,29 +643,61 @@ static void trbe_enable_hw(struct trbe_buf *buf)
>>   static enum trbe_fault_action trbe_get_fault_act(struct 
>> perf_output_handle *handle,
>>                            u64 trbsr)
>>   {
>> +    const char *err_str;
>>       int ec = get_trbe_ec(trbsr);
>>       int bsc = get_trbe_bsc(trbsr);
>> -    struct trbe_buf *buf = etm_perf_sink_config(handle);
>> -    struct trbe_cpudata *cpudata = buf->cpudata;
>>       WARN_ON(is_trbe_running(trbsr));
>> -    if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
>> -        return TRBE_FAULT_ACT_FATAL;
>> -    if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
>> -        return TRBE_FAULT_ACT_FATAL;
>> +    if (is_trbe_abort(trbsr)) {
>> +        err_str = "External abort";
>> +        goto out_fatal;
>> +    }
>> -    /*
>> -     * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
>> -     * it might write data after a WRAP event in the fill mode.
>> -     * Thus the check TRBPTR == TRBBASER will not be honored.
>> -     */
>> -    if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == 
>> TRBE_BSC_FILLED)) &&
>> -        (trbe_may_overwrite_in_fill_mode(cpudata) ||
>> -         get_trbe_write_pointer() == get_trbe_base_pointer()))
>> +    switch (ec) {
>> +    case TRBE_EC_OTHERS:
>> +        break;
>> +    case TRBE_EC_BUF_MGMT_IMPL:
>> +        err_str = "Unexpected implemented management";
> 
> minor nit: "Unexpected IMPDEF buffer management event"
> 
> 
> Suzuki
> 
>> +        goto out_fatal;
>> +    case TRBE_EC_GP_CHECK_FAULT:
>> +        err_str = "Granule Protection Check fault";
>> +        goto out_fatal;
>> +    case TRBE_EC_STAGE1_ABORT:
>> +        err_str = "Stage 1 data abort";
>> +        goto out_fatal;
>> +    case TRBE_EC_STAGE2_ABORT:
>> +        err_str = "Stage 2 data abort";
>> +        goto out_fatal;
>> +    default:
>> +        err_str = "Unknown error code";
>> +        goto out_fatal;
>> +    }
>> +
>> +    switch (bsc) {
>> +    case TRBE_BSC_NOT_STOPPED:
>> +        break;
>> +    case TRBE_BSC_FILLED:
>> +        break;
>> +    case TRBE_BSC_TRIGGERED:
>> +        err_str = "Unexpected trigger status";
>> +        goto out_fatal;
>> +    default:
>> +        err_str = "Unexpected buffer status code";
>> +        goto out_fatal;
>> +    }
>> +
>> +    if (is_trbe_wrap(trbsr))
>>           return TRBE_FAULT_ACT_WRAP;
>>       return TRBE_FAULT_ACT_SPURIOUS;
>> +
>> +out_fatal:
>> +    pr_err_ratelimited("%s on CPU %d [TRBSR=0x%016llx, 
>> TRBPTR=0x%016llx, TRBLIMITR=0x%016llx]\n",
>> +               err_str, smp_processor_id(), trbsr,
>> +               read_sysreg_s(SYS_TRBPTR_EL1),
>> +               read_sysreg_s(SYS_TRBLIMITR_EL1));

Don't you need to use %p to not leak pointers?

>> +    return TRBE_FAULT_ACT_FATAL;
>>   }
>>   static unsigned long trbe_get_trace_size(struct perf_output_handle 
>> *handle,
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/ 
>> hwtracing/coresight/coresight-trbe.h
>> index 
>> 45202c48accec7c86ba56130e2737bc2d1830fae..d7f7cd763c0c7139cf322b7336ee563073e3bea0 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.h
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.h
>> @@ -35,9 +35,11 @@ static inline bool is_trbe_enabled(void)
>>       return trblimitr & TRBLIMITR_EL1_E;
>>   }
>> -#define TRBE_EC_OTHERS        0
>> -#define TRBE_EC_STAGE1_ABORT    36
>> -#define TRBE_EC_STAGE2_ABORT    37
>> +#define TRBE_EC_OTHERS            0x0
>> +#define TRBE_EC_GP_CHECK_FAULT        0X1e
>> +#define TRBE_EC_BUF_MGMT_IMPL        0x1f
>> +#define TRBE_EC_STAGE1_ABORT        0x24
>> +#define TRBE_EC_STAGE2_ABORT        0x25

I know some were already defined here, but this should probably be an 
enum in sysreg.

>>   static inline int get_trbe_ec(u64 trbsr)
>>   {
>>
> 

Re: [PATCH 06/19] coresight: trbe: Refactor syndrome decoding
Posted by Leo Yan 2 months ago
On Tue, Dec 09, 2025 at 01:17:50PM +0000, James Clark wrote:

[...]

> > > +    pr_err_ratelimited("%s on CPU %d [TRBSR=0x%016llx,
> > > TRBPTR=0x%016llx, TRBLIMITR=0x%016llx]\n",
> > > +               err_str, smp_processor_id(), trbsr,
> > > +               read_sysreg_s(SYS_TRBPTR_EL1),
> > > +               read_sysreg_s(SYS_TRBLIMITR_EL1));
> 
> Don't you need to use %p to not leak pointers?

I will use "%pK" instead for debugging purpose when set
kptr_restrict=0.

> > > +    return TRBE_FAULT_ACT_FATAL;
> > >   }
> > >   static unsigned long trbe_get_trace_size(struct perf_output_handle
> > > *handle,
> > > diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/
> > > hwtracing/coresight/coresight-trbe.h
> > > index 45202c48accec7c86ba56130e2737bc2d1830fae..d7f7cd763c0c7139cf322b7336ee563073e3bea0
> > > 100644
> > > --- a/drivers/hwtracing/coresight/coresight-trbe.h
> > > +++ b/drivers/hwtracing/coresight/coresight-trbe.h
> > > @@ -35,9 +35,11 @@ static inline bool is_trbe_enabled(void)
> > >       return trblimitr & TRBLIMITR_EL1_E;
> > >   }
> > > -#define TRBE_EC_OTHERS        0
> > > -#define TRBE_EC_STAGE1_ABORT    36
> > > -#define TRBE_EC_STAGE2_ABORT    37
> > > +#define TRBE_EC_OTHERS            0x0
> > > +#define TRBE_EC_GP_CHECK_FAULT        0X1e
> > > +#define TRBE_EC_BUF_MGMT_IMPL        0x1f
> > > +#define TRBE_EC_STAGE1_ABORT        0x24
> > > +#define TRBE_EC_STAGE2_ABORT        0x25
> 
> I know some were already defined here, but this should probably be an enum
> in sysreg.

Good point.  Will do.

Thanks,
Leo
Re: [PATCH 06/19] coresight: trbe: Refactor syndrome decoding
Posted by Leo Yan 2 months, 1 week ago
On Tue, Dec 02, 2025 at 11:06:48AM +0000, Suzuki K Poulose wrote:
> Hi Leo
> 
> A couple of minor nits below. Otherwise looks like a nice cleanup
> 
> 
> On 01/12/2025 11:21, Leo Yan wrote:
> > It gives priority to TRBSR_EL1.EA (external abort); an external abort
> > will immediately bail out and return an error.
> > 
> > Next, the syndrome decoding is refactored based on two levels of
> > information: the EC (Event Class) bits and the BSC (Trace Buffer Status
> > Code) bits.
> > 
> > If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to
> > identify the specific trace buffer event.  Otherwise, any non-zero
> > TRBSR_EL1.EC is treated as an error.
> > 
> > For error cases, the driver prints an error string and dumps registers
> > for debugging.
> 
> minor nit: Please avoid describing the code, which cod does well.
> Briefly mention what you are doing, giving any additional contexts
> that may not be evident from the code.

Sure, I will note this later.

> e.g:
> 
> "Add support for decoding the syndrome for TRBE, providing a verbose
> description of the code for Fatal/unhandled events.
> While at it add the new definitions from the latest Arm ARM for
> EC and BSC"

Will do this and followed comment.  Thanks for suggestion!

Leo