[PATCH] coresight-tmc: Add configurable timeout for flush and tmcready

Yuanfang Zhang posted 1 patch 3 months, 1 week ago
drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++--
1 file changed, 41 insertions(+), 2 deletions(-)
[PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
Posted by Yuanfang Zhang 3 months, 1 week ago
The current implementation uses a fixed timeout via
coresight_timeout(), which may be insufficient when multiple
sources are enabled or under heavy load, leading to TMC
readiness or flush completion timeout.

This patch introduces a configurable timeout mechanism for
flush and tmcready.

Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
---
 drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
index 88afb16bb6bec395ba535155228d176250f38625..286d56ce88fe80fbfa022946dc798f0f4e72f961 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-core.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
@@ -8,6 +8,7 @@
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/types.h>
+#include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/idr.h>
 #include <linux/io.h>
@@ -35,13 +36,31 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
 DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
 DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr");
 
+static u32 tmc_timeout;
+
+static void tmc_extend_timeout(struct csdev_access *csa, u32 offset, int pos, int val)
+{
+	int i;
+
+	for (i = tmc_timeout; i > 0; i--) {
+		if (i - 1)
+			udelay(1);
+	}
+}
+
+static int tmc_wait_status(struct csdev_access *csa, u32 offset, int pos, int val)
+{
+	return coresight_timeout_action(csa, offset, pos, val,
+			tmc_extend_timeout);
+}
+
 int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata)
 {
 	struct coresight_device *csdev = drvdata->csdev;
 	struct csdev_access *csa = &csdev->access;
 
 	/* Ensure formatter, unformatter and hardware fifo are empty */
-	if (coresight_timeout(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
+	if (tmc_wait_status(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
 		dev_err(&csdev->dev,
 			"timeout while waiting for TMC to be Ready\n");
 		return -EBUSY;
@@ -61,7 +80,7 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata)
 	ffcr |= BIT(TMC_FFCR_FLUSHMAN_BIT);
 	writel_relaxed(ffcr, drvdata->base + TMC_FFCR);
 	/* Ensure flush completes */
-	if (coresight_timeout(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
+	if (tmc_wait_status(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
 		dev_err(&csdev->dev,
 		"timeout while waiting for completion of Manual Flush\n");
 	}
@@ -561,11 +580,31 @@ static ssize_t stop_on_flush_store(struct device *dev,
 
 static DEVICE_ATTR_RW(stop_on_flush);
 
+static ssize_t timeout_cfg_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	return scnprintf(buf, PAGE_SIZE, "%d\n", tmc_timeout);
+}
+
+static ssize_t timeout_cfg_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t size)
+{
+	unsigned long val;
+
+	if (kstrtoul(buf, 0, &val))
+		return -EINVAL;
+	tmc_timeout = val;
+
+	return size;
+}
+static DEVICE_ATTR_RW(timeout_cfg);
 
 static struct attribute *coresight_tmc_attrs[] = {
 	&dev_attr_trigger_cntr.attr,
 	&dev_attr_buffer_size.attr,
 	&dev_attr_stop_on_flush.attr,
+	&dev_attr_timeout_cfg.attr,
 	NULL,
 };
 

---
base-commit: 408c97c4a5e0b634dcd15bf8b8808b382e888164
change-id: 20250627-flush_timeout-a598b4c0ce7b

Best regards,
-- 
Yuanfang Zhang <quic_yuanfang@quicinc.com>
Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
Posted by James Clark 3 months, 1 week ago

On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
> The current implementation uses a fixed timeout via
> coresight_timeout(), which may be insufficient when multiple
> sources are enabled or under heavy load, leading to TMC
> readiness or flush completion timeout.
> 
> This patch introduces a configurable timeout mechanism for
> flush and tmcready.
> 

What kind of values are you using? Is there a reason to not increase the 
global one?

I don't think it's important what value we choose because it's only to 
stop hangs and make it terminate eventually. As far as I can see it 
wouldn't matter if we set it to a huge value like 1 second. That would 
only cause a big delay when something has actually gone wrong. Under 
normal circumstances the timeout won't be hit so it doesn't really need 
to be configurable.

> Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
> ---
>   drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++--
>   1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
> index 88afb16bb6bec395ba535155228d176250f38625..286d56ce88fe80fbfa022946dc798f0f4e72f961 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
> @@ -8,6 +8,7 @@
>   #include <linux/kernel.h>
>   #include <linux/init.h>
>   #include <linux/types.h>
> +#include <linux/delay.h>
>   #include <linux/device.h>
>   #include <linux/idr.h>
>   #include <linux/io.h>
> @@ -35,13 +36,31 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
>   DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
>   DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr");
>   
> +static u32 tmc_timeout;
> +
> +static void tmc_extend_timeout(struct csdev_access *csa, u32 offset, int pos, int val)
> +{
> +	int i;
> +
> +	for (i = tmc_timeout; i > 0; i--) {
> +		if (i - 1)

I didn't get what the if is for here? Removing it does basically the 
same thing, but if you do want to keep it maybe if (i > 1) is more 
explanatory.

> +			udelay(1);

Can you not do udelay(tmc_timeout)?

> +	}
> +}
> +
> +static int tmc_wait_status(struct csdev_access *csa, u32 offset, int pos, int val)
> +{
> +	return coresight_timeout_action(csa, offset, pos, val,
> +			tmc_extend_timeout);
> +}
> +
>   int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata)
>   {
>   	struct coresight_device *csdev = drvdata->csdev;
>   	struct csdev_access *csa = &csdev->access;
>   
>   	/* Ensure formatter, unformatter and hardware fifo are empty */
> -	if (coresight_timeout(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
> +	if (tmc_wait_status(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>   		dev_err(&csdev->dev,
>   			"timeout while waiting for TMC to be Ready\n");
>   		return -EBUSY;
> @@ -61,7 +80,7 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata)
>   	ffcr |= BIT(TMC_FFCR_FLUSHMAN_BIT);
>   	writel_relaxed(ffcr, drvdata->base + TMC_FFCR);
>   	/* Ensure flush completes */
> -	if (coresight_timeout(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
> +	if (tmc_wait_status(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>   		dev_err(&csdev->dev,
>   		"timeout while waiting for completion of Manual Flush\n");
>   	}
> @@ -561,11 +580,31 @@ static ssize_t stop_on_flush_store(struct device *dev,
>   
>   static DEVICE_ATTR_RW(stop_on_flush);
>   
> +static ssize_t timeout_cfg_show(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	return scnprintf(buf, PAGE_SIZE, "%d\n", tmc_timeout);
> +}
> +
> +static ssize_t timeout_cfg_store(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf, size_t size)
> +{
> +	unsigned long val;
> +
> +	if (kstrtoul(buf, 0, &val))
> +		return -EINVAL;
> +	tmc_timeout = val;
> +
> +	return size;
> +}
> +static DEVICE_ATTR_RW(timeout_cfg);
>   

Seeing as the existing timeout is global for all devices, if we do want 
a configurable one shouldn't we make the global one configurable rather 
than per-device? That seems too fine grained to me.

>   static struct attribute *coresight_tmc_attrs[] = {
>   	&dev_attr_trigger_cntr.attr,
>   	&dev_attr_buffer_size.attr,
>   	&dev_attr_stop_on_flush.attr,
> +	&dev_attr_timeout_cfg.attr,
>   	NULL,
>   };
>   
> 
> ---
> base-commit: 408c97c4a5e0b634dcd15bf8b8808b382e888164
> change-id: 20250627-flush_timeout-a598b4c0ce7b
> 
> Best regards,
Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
Posted by Yuanfang Zhang 3 months, 1 week ago

On 6/27/2025 7:23 PM, James Clark wrote:
> 
> 
> On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
>> The current implementation uses a fixed timeout via
>> coresight_timeout(), which may be insufficient when multiple
>> sources are enabled or under heavy load, leading to TMC
>> readiness or flush completion timeout.
>>
>> This patch introduces a configurable timeout mechanism for
>> flush and tmcready.
>>
> 
> What kind of values are you using? Is there a reason to not increase the global one?
1000, Because only TMC FLUSH will face timeout situations.
> 
> I don't think it's important what value we choose because it's only to stop hangs and make it terminate eventually. As far as I can see it wouldn't matter if we set it to a huge value like 1 second. That would only cause a big delay when something has actually gone wrong. Under normal circumstances the timeout won't be hit so it doesn't really need to be configurable.

But in some cases, TMC doesn't hang up, it just requires a longer waiting time.
> 
>> Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
>> ---
>>   drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++--
>>   1 file changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
>> index 88afb16bb6bec395ba535155228d176250f38625..286d56ce88fe80fbfa022946dc798f0f4e72f961 100644
>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>> @@ -8,6 +8,7 @@
>>   #include <linux/kernel.h>
>>   #include <linux/init.h>
>>   #include <linux/types.h>
>> +#include <linux/delay.h>
>>   #include <linux/device.h>
>>   #include <linux/idr.h>
>>   #include <linux/io.h>
>> @@ -35,13 +36,31 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
>>   DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
>>   DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr");
>>   +static u32 tmc_timeout;
>> +
>> +static void tmc_extend_timeout(struct csdev_access *csa, u32 offset, int pos, int val)
>> +{
>> +    int i;
>> +
>> +    for (i = tmc_timeout; i > 0; i--) {
>> +        if (i - 1)
> 
> I didn't get what the if is for here? Removing it does basically the same thing, but if you do want to keep it maybe if (i > 1) is more explanatory.
sure.
> 
>> +            udelay(1);
> 
> Can you not do udelay(tmc_timeout)?
sure.
> 
>> +    }
>> +}
>> +
>> +static int tmc_wait_status(struct csdev_access *csa, u32 offset, int pos, int val)
>> +{
>> +    return coresight_timeout_action(csa, offset, pos, val,
>> +            tmc_extend_timeout);
>> +}
>> +
>>   int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata)
>>   {
>>       struct coresight_device *csdev = drvdata->csdev;
>>       struct csdev_access *csa = &csdev->access;
>>         /* Ensure formatter, unformatter and hardware fifo are empty */
>> -    if (coresight_timeout(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>> +    if (tmc_wait_status(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>>           dev_err(&csdev->dev,
>>               "timeout while waiting for TMC to be Ready\n");
>>           return -EBUSY;
>> @@ -61,7 +80,7 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata)
>>       ffcr |= BIT(TMC_FFCR_FLUSHMAN_BIT);
>>       writel_relaxed(ffcr, drvdata->base + TMC_FFCR);
>>       /* Ensure flush completes */
>> -    if (coresight_timeout(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>> +    if (tmc_wait_status(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>>           dev_err(&csdev->dev,
>>           "timeout while waiting for completion of Manual Flush\n");
>>       }
>> @@ -561,11 +580,31 @@ static ssize_t stop_on_flush_store(struct device *dev,
>>     static DEVICE_ATTR_RW(stop_on_flush);
>>   +static ssize_t timeout_cfg_show(struct device *dev,
>> +                struct device_attribute *attr, char *buf)
>> +{
>> +    return scnprintf(buf, PAGE_SIZE, "%d\n", tmc_timeout);
>> +}
>> +
>> +static ssize_t timeout_cfg_store(struct device *dev,
>> +                 struct device_attribute *attr,
>> +                 const char *buf, size_t size)
>> +{
>> +    unsigned long val;
>> +
>> +    if (kstrtoul(buf, 0, &val))
>> +        return -EINVAL;
>> +    tmc_timeout = val;
>> +
>> +    return size;
>> +}
>> +static DEVICE_ATTR_RW(timeout_cfg);
>>   
> 
> Seeing as the existing timeout is global for all devices, if we do want a configurable one shouldn't we make the global one configurable rather than per-device? That seems too fine grained to me.
sure.
> 
>>   static struct attribute *coresight_tmc_attrs[] = {
>>       &dev_attr_trigger_cntr.attr,
>>       &dev_attr_buffer_size.attr,
>>       &dev_attr_stop_on_flush.attr,
>> +    &dev_attr_timeout_cfg.attr,
>>       NULL,
>>   };
>>  
>> ---
>> base-commit: 408c97c4a5e0b634dcd15bf8b8808b382e888164
>> change-id: 20250627-flush_timeout-a598b4c0ce7b
>>
>> Best regards,
> 

Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
Posted by James Clark 3 months, 1 week ago

On 30/06/2025 11:03 am, Yuanfang Zhang wrote:
> 
> 
> On 6/27/2025 7:23 PM, James Clark wrote:
>>
>>
>> On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
>>> The current implementation uses a fixed timeout via
>>> coresight_timeout(), which may be insufficient when multiple
>>> sources are enabled or under heavy load, leading to TMC
>>> readiness or flush completion timeout.
>>>
>>> This patch introduces a configurable timeout mechanism for
>>> flush and tmcready.
>>>
>>
>> What kind of values are you using? Is there a reason to not increase the global one?
> 1000, Because only TMC FLUSH will face timeout situations.

How long was the flush taking exactly? You should be able to log the 
time it took to get past the flush. Because if it's 101us then we can 
increase the global timeout to 150us or 200us without too much thought.

I don't think we can justify why 100us was picked over any other value 
anyway. And we've seen a couple of timeouts in our own CI.

>>
>> I don't think it's important what value we choose because it's only to stop hangs and make it terminate eventually. As far as I can see it wouldn't matter if we set it to a huge value like 1 second. That would only cause a big delay when something has actually gone wrong. Under normal circumstances the timeout won't be hit so it doesn't really need to be configurable.
> 
> But in some cases, TMC doesn't hang up, it just requires a longer waiting time.
>>
>>> Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
>>> ---
>>>    drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++--
>>>    1 file changed, 41 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>> index 88afb16bb6bec395ba535155228d176250f38625..286d56ce88fe80fbfa022946dc798f0f4e72f961 100644
>>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>> @@ -8,6 +8,7 @@
>>>    #include <linux/kernel.h>
>>>    #include <linux/init.h>
>>>    #include <linux/types.h>
>>> +#include <linux/delay.h>
>>>    #include <linux/device.h>
>>>    #include <linux/idr.h>
>>>    #include <linux/io.h>
>>> @@ -35,13 +36,31 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb");
>>>    DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf");
>>>    DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr");
>>>    +static u32 tmc_timeout;
>>> +
>>> +static void tmc_extend_timeout(struct csdev_access *csa, u32 offset, int pos, int val)
>>> +{
>>> +    int i;
>>> +
>>> +    for (i = tmc_timeout; i > 0; i--) {
>>> +        if (i - 1)
>>
>> I didn't get what the if is for here? Removing it does basically the same thing, but if you do want to keep it maybe if (i > 1) is more explanatory.
> sure.
>>
>>> +            udelay(1);
>>
>> Can you not do udelay(tmc_timeout)?
> sure.
>>
>>> +    }
>>> +}
>>> +
>>> +static int tmc_wait_status(struct csdev_access *csa, u32 offset, int pos, int val)
>>> +{
>>> +    return coresight_timeout_action(csa, offset, pos, val,
>>> +            tmc_extend_timeout);
>>> +}
>>> +
>>>    int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata)
>>>    {
>>>        struct coresight_device *csdev = drvdata->csdev;
>>>        struct csdev_access *csa = &csdev->access;
>>>          /* Ensure formatter, unformatter and hardware fifo are empty */
>>> -    if (coresight_timeout(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>>> +    if (tmc_wait_status(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) {
>>>            dev_err(&csdev->dev,
>>>                "timeout while waiting for TMC to be Ready\n");
>>>            return -EBUSY;
>>> @@ -61,7 +80,7 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata)
>>>        ffcr |= BIT(TMC_FFCR_FLUSHMAN_BIT);
>>>        writel_relaxed(ffcr, drvdata->base + TMC_FFCR);
>>>        /* Ensure flush completes */
>>> -    if (coresight_timeout(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>>> +    if (tmc_wait_status(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) {
>>>            dev_err(&csdev->dev,
>>>            "timeout while waiting for completion of Manual Flush\n");
>>>        }
>>> @@ -561,11 +580,31 @@ static ssize_t stop_on_flush_store(struct device *dev,
>>>      static DEVICE_ATTR_RW(stop_on_flush);
>>>    +static ssize_t timeout_cfg_show(struct device *dev,
>>> +                struct device_attribute *attr, char *buf)
>>> +{
>>> +    return scnprintf(buf, PAGE_SIZE, "%d\n", tmc_timeout);
>>> +}
>>> +
>>> +static ssize_t timeout_cfg_store(struct device *dev,
>>> +                 struct device_attribute *attr,
>>> +                 const char *buf, size_t size)
>>> +{
>>> +    unsigned long val;
>>> +
>>> +    if (kstrtoul(buf, 0, &val))
>>> +        return -EINVAL;
>>> +    tmc_timeout = val;
>>> +
>>> +    return size;
>>> +}
>>> +static DEVICE_ATTR_RW(timeout_cfg);
>>>    
>>
>> Seeing as the existing timeout is global for all devices, if we do want a configurable one shouldn't we make the global one configurable rather than per-device? That seems too fine grained to me.
> sure.
>>
>>>    static struct attribute *coresight_tmc_attrs[] = {
>>>        &dev_attr_trigger_cntr.attr,
>>>        &dev_attr_buffer_size.attr,
>>>        &dev_attr_stop_on_flush.attr,
>>> +    &dev_attr_timeout_cfg.attr,
>>>        NULL,
>>>    };
>>>   
>>> ---
>>> base-commit: 408c97c4a5e0b634dcd15bf8b8808b382e888164
>>> change-id: 20250627-flush_timeout-a598b4c0ce7b
>>>
>>> Best regards,
>>
> 

Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
Posted by Leo Yan 3 months, 1 week ago
On Fri, Jun 27, 2025 at 12:23:29PM +0100, James Clark wrote:
> 
> 
> On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
> > The current implementation uses a fixed timeout via
> > coresight_timeout(), which may be insufficient when multiple
> > sources are enabled or under heavy load, leading to TMC
> > readiness or flush completion timeout.
> > 
> > This patch introduces a configurable timeout mechanism for
> > flush and tmcready.
> > 
> 
> What kind of values are you using? Is there a reason to not increase the
> global one?

IIUC, this patch is related to patch [1].

It seems to me that both patches aim to address the long latency when
flushing the TMC, but take different approaches. In the earlier patch,
both Mike and I questioned how the timeout occurred in hardware, but
no information provided about the cause.

I would suggest that we first make clear if this is a hardware quirk or
a common issue in Arm CoreSight.

Thanks,
Leo

[1] https://lore.kernel.org/linux-arm-kernel/CAJ9a7Vgre_3mkXB_xeVx5N9BqPTa2Ai4_8E+daDZ-yAUv44A9g@mail.gmail.com/
Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
Posted by Yuanfang Zhang 3 months, 1 week ago

On 6/27/2025 10:17 PM, Leo Yan wrote:
> On Fri, Jun 27, 2025 at 12:23:29PM +0100, James Clark wrote:
>>
>>
>> On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
>>> The current implementation uses a fixed timeout via
>>> coresight_timeout(), which may be insufficient when multiple
>>> sources are enabled or under heavy load, leading to TMC
>>> readiness or flush completion timeout.
>>>
>>> This patch introduces a configurable timeout mechanism for
>>> flush and tmcready.
>>>
>>
>> What kind of values are you using? Is there a reason to not increase the
>> global one?
> 
> IIUC, this patch is related to patch [1].
> 
> It seems to me that both patches aim to address the long latency when
> flushing the TMC, but take different approaches. In the earlier patch,
> both Mike and I questioned how the timeout occurred in hardware, but
> no information provided about the cause.
> 
> I would suggest that we first make clear if this is a hardware quirk or
> a common issue in Arm CoreSight.
> 
> Thanks,
> Leo
> 
sure, now this issue has been found that not only CPU ETM, but also subsystem ETM.
> [1] https://lore.kernel.org/linux-arm-kernel/CAJ9a7Vgre_3mkXB_xeVx5N9BqPTa2Ai4_8E+daDZ-yAUv44A9g@mail.gmail.com/
Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
Posted by Leo Yan 3 months, 1 week ago
On Mon, Jun 30, 2025 at 06:40:53PM +0800, Yuanfang Zhang wrote:

[...]

> >>> The current implementation uses a fixed timeout via
> >>> coresight_timeout(), which may be insufficient when multiple
> >>> sources are enabled or under heavy load, leading to TMC
> >>> readiness or flush completion timeout.
> >
> > I would suggest that we first make clear if this is a hardware quirk or
> > a common issue in Arm CoreSight.

> sure, now this issue has been found that not only CPU ETM, but also subsystem ETM.

As the commit log states, "sources are enabled or under heavy load,
leading to TMC readiness or flush completion timeout." I would like to
confirm how this situation can happen.

When disabling a CoreSight path, the driver follows a sequential
order: the source device is disabled first, followed by flushing and
disabling the TMC. We expect that there should be no contention
between source devices and the sink when disabling the path. For a
subsystem ETM, I expect we should also follow this sequence.

Leo
Re: [PATCH] coresight-tmc: Add configurable timeout for flush and tmcready
Posted by Yuanfang Zhang 3 months, 1 week ago

On 6/30/2025 7:08 PM, Leo Yan wrote:
> On Mon, Jun 30, 2025 at 06:40:53PM +0800, Yuanfang Zhang wrote:
> 
> [...]
> 
>>>>> The current implementation uses a fixed timeout via
>>>>> coresight_timeout(), which may be insufficient when multiple
>>>>> sources are enabled or under heavy load, leading to TMC
>>>>> readiness or flush completion timeout.
>>>
>>> I would suggest that we first make clear if this is a hardware quirk or
>>> a common issue in Arm CoreSight.
> 
>> sure, now this issue has been found that not only CPU ETM, but also subsystem ETM.
> 
> As the commit log states, "sources are enabled or under heavy load,
> leading to TMC readiness or flush completion timeout." I would like to
> confirm how this situation can happen.
> 
Enable all etms, then cat TMC without disable source.
> When disabling a CoreSight path, the driver follows a sequential
> order: the source device is disabled first, followed by flushing and
> disabling the TMC. We expect that there should be no contention
> between source devices and the sink when disabling the path. For a
> subsystem ETM, I expect we should also follow this sequence.
> 
> Leo
this issue is a different case: running cat tmc directly did not disable source.