[PATCH v2 01/28] coresight: Change device mode to atomic type

Leo Yan posted 28 patches 3 months, 1 week ago
[PATCH v2 01/28] coresight: Change device mode to atomic type
Posted by Leo Yan 3 months, 1 week ago
The device mode is defined as local type. This type cannot promise
SMP-safe access.

Change to atomic type and impose relax ordering, which ensures the
SMP-safe synchronisation and the ordering between the mode setting and
relevant operations.

Fixes: 22fd532eaa0c ("coresight: etm3x: adding operation mode for etm_enable()")
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 include/linux/coresight.h | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 4ac65c68bbf44b98db22c3dad2d83a224ce5278e..5fd3d08824e5a91a197aa01daf0fc392392f3e55 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -251,15 +251,11 @@ struct coresight_trace_id_map {
  *		by @coresight_ops.
  * @access:	Device i/o access abstraction for this device.
  * @dev:	The device entity associated to this component.
- * @mode:	This tracer's mode, i.e sysFS, Perf or disabled. This is
- *		actually an 'enum cs_mode', but is stored in an atomic type.
- *		This is always accessed through local_read() and local_set(),
- *		but wherever it's done from within the Coresight device's lock,
- *		a non-atomic read would also work. This is the main point of
- *		synchronisation between code happening inside the sysfs mode's
- *		coresight_mutex and outside when running in Perf mode. A compare
- *		and exchange swap is done to atomically claim one mode or the
- *		other.
+ * @mode:	The device mode, i.e sysFS, Perf or disabled. This is actually
+ *		an 'enum cs_mode' but stored in an atomic type. Access is always
+ *		through atomic APIs, ensuring SMP-safe synchronisation between
+ *		racing from sysFS and Perf mode. A compare-and-exchange
+ *		operation is done to atomically claim one mode or the other.
  * @refcnt:	keep track of what is in use. Only access this outside of the
  *		device's spinlock when the coresight_mutex held and mode ==
  *		CS_MODE_SYSFS. Otherwise it must be accessed from inside the
@@ -288,7 +284,7 @@ struct coresight_device {
 	const struct coresight_ops *ops;
 	struct csdev_access access;
 	struct device dev;
-	local_t	mode;
+	atomic_t mode;
 	int refcnt;
 	bool orphan;
 	/* sink specific fields */
@@ -650,13 +646,14 @@ static inline bool coresight_is_percpu_sink(struct coresight_device *csdev)
 static inline bool coresight_take_mode(struct coresight_device *csdev,
 				       enum cs_mode new_mode)
 {
-	return local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, new_mode) ==
-	       CS_MODE_DISABLED;
+	int curr = CS_MODE_DISABLED;
+
+	return atomic_try_cmpxchg_acquire(&csdev->mode, &curr, new_mode);
 }
 
 static inline enum cs_mode coresight_get_mode(struct coresight_device *csdev)
 {
-	return local_read(&csdev->mode);
+	return atomic_read_acquire(&csdev->mode);
 }
 
 static inline void coresight_set_mode(struct coresight_device *csdev,
@@ -672,7 +669,7 @@ static inline void coresight_set_mode(struct coresight_device *csdev,
 	WARN(new_mode != CS_MODE_DISABLED && current_mode != CS_MODE_DISABLED &&
 	     current_mode != new_mode, "Device already in use\n");
 
-	local_set(&csdev->mode, new_mode);
+	atomic_set_release(&csdev->mode, new_mode);
 }
 
 struct coresight_device *coresight_register(struct coresight_desc *desc);

-- 
2.34.1
Re: [PATCH v2 01/28] coresight: Change device mode to atomic type
Posted by Anshuman Khandual 2 months, 3 weeks ago

On 01/07/25 8:23 PM, Leo Yan wrote:
> The device mode is defined as local type. This type cannot promise
> SMP-safe access.
> 
> Change to atomic type and impose relax ordering, which ensures the
> SMP-safe synchronisation and the ordering between the mode setting and
> relevant operations.


But have we really faced such problems on real systems due to local_t
or this is just an improvement ?

> 
> Fixes: 22fd532eaa0c ("coresight: etm3x: adding operation mode for etm_enable()")
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  include/linux/coresight.h | 25 +++++++++++--------------
>  1 file changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index 4ac65c68bbf44b98db22c3dad2d83a224ce5278e..5fd3d08824e5a91a197aa01daf0fc392392f3e55 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -251,15 +251,11 @@ struct coresight_trace_id_map {
>   *		by @coresight_ops.
>   * @access:	Device i/o access abstraction for this device.
>   * @dev:	The device entity associated to this component.
> - * @mode:	This tracer's mode, i.e sysFS, Perf or disabled. This is
> - *		actually an 'enum cs_mode', but is stored in an atomic type.
> - *		This is always accessed through local_read() and local_set(),
> - *		but wherever it's done from within the Coresight device's lock,
> - *		a non-atomic read would also work. This is the main point of
> - *		synchronisation between code happening inside the sysfs mode's
> - *		coresight_mutex and outside when running in Perf mode. A compare
> - *		and exchange swap is done to atomically claim one mode or the
> - *		other.
> + * @mode:	The device mode, i.e sysFS, Perf or disabled. This is actually
> + *		an 'enum cs_mode' but stored in an atomic type. Access is always
> + *		through atomic APIs, ensuring SMP-safe synchronisation between
> + *		racing from sysFS and Perf mode. A compare-and-exchange
> + *		operation is done to atomically claim one mode or the other.
>   * @refcnt:	keep track of what is in use. Only access this outside of the
>   *		device's spinlock when the coresight_mutex held and mode ==
>   *		CS_MODE_SYSFS. Otherwise it must be accessed from inside the
> @@ -288,7 +284,7 @@ struct coresight_device {
>  	const struct coresight_ops *ops;
>  	struct csdev_access access;
>  	struct device dev;
> -	local_t	mode;
> +	atomic_t mode;
>  	int refcnt;
>  	bool orphan;
>  	/* sink specific fields */
> @@ -650,13 +646,14 @@ static inline bool coresight_is_percpu_sink(struct coresight_device *csdev)
>  static inline bool coresight_take_mode(struct coresight_device *csdev,
>  				       enum cs_mode new_mode)
>  {
> -	return local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, new_mode) ==
> -	       CS_MODE_DISABLED;
> +	int curr = CS_MODE_DISABLED;
> +
> +	return atomic_try_cmpxchg_acquire(&csdev->mode, &curr, new_mode);
>  }
>  
>  static inline enum cs_mode coresight_get_mode(struct coresight_device *csdev)
>  {
> -	return local_read(&csdev->mode);
> +	return atomic_read_acquire(&csdev->mode);
>  }
>  
>  static inline void coresight_set_mode(struct coresight_device *csdev,
> @@ -672,7 +669,7 @@ static inline void coresight_set_mode(struct coresight_device *csdev,
>  	WARN(new_mode != CS_MODE_DISABLED && current_mode != CS_MODE_DISABLED &&
>  	     current_mode != new_mode, "Device already in use\n");
>  
> -	local_set(&csdev->mode, new_mode);
> +	atomic_set_release(&csdev->mode, new_mode);
>  }
>  
>  struct coresight_device *coresight_register(struct coresight_desc *desc);
>
Re: [PATCH v2 01/28] coresight: Change device mode to atomic type
Posted by Leo Yan 2 months ago
On Tue, Jul 15, 2025 at 12:23:22PM +0530, Anshuman Khandual wrote:
> On 01/07/25 8:23 PM, Leo Yan wrote:
> > The device mode is defined as local type. This type cannot promise
> > SMP-safe access.
> > 
> > Change to atomic type and impose relax ordering, which ensures the
> > SMP-safe synchronisation and the ordering between the mode setting and
> > relevant operations.
> 
> 
> But have we really faced such problems on real systems due to local_t
> or this is just an improvement ?

This is an improvement based on reading code. The reason is the
variable is accessed by mutliple CPUs but not per-CPU access.

Thanks,
Leo
Re: [PATCH v2 01/28] coresight: Change device mode to atomic type
Posted by Yeoreum Yun 3 months, 1 week ago
Hi Leo,

>  {
> -	return local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, new_mode) ==
> -	       CS_MODE_DISABLED;
> +	int curr = CS_MODE_DISABLED;
> +
> +	return atomic_try_cmpxchg_acquire(&csdev->mode, &curr, new_mode);
>  }

Just question. why is acquire symentic enough in here?
before this change, local_cmpxchg seems to use full_fenced.

--
Sincerely,
Yeoreum Yun
Re: [PATCH v2 01/28] coresight: Change device mode to atomic type
Posted by Leo Yan 3 months, 1 week ago
Hi Levi,

On Wed, Jul 02, 2025 at 10:49:01AM +0100, Yeoreum Yun wrote:
> Hi Leo,
> 
> >  {
> > -	return local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, new_mode) ==
> > -	       CS_MODE_DISABLED;
> > +	int curr = CS_MODE_DISABLED;
> > +
> > +	return atomic_try_cmpxchg_acquire(&csdev->mode, &curr, new_mode);
> >  }
> 
> Just question. why is acquire symentic enough in here?

My understanding is that acquire semantics ensure ordering between
cmpxchg_acquire() and all memory accesses that follow it. However, it
does not guarantee that memory accesses appearing before the acquire
are ordered as well.

This is exactly what we want in the driver. We must ensure to first grab
an active device mode, then it is safe to proceed later operations (e.g.
set configurations in driver data and access registers).

> before this change, local_cmpxchg seems to use full_fenced.

Not really. Arm64 has atomic instruction for cmpxchg, it does not use
full_fenced. It should run into the path of arch_cmpxchg().

Thanks,
Leo
Re: [PATCH v2 01/28] coresight: Change device mode to atomic type
Posted by Yeoreum Yun 3 months, 1 week ago
Hi,
>
> On Wed, Jul 02, 2025 at 10:49:01AM +0100, Yeoreum Yun wrote:
> > Hi Leo,
> >
> > >  {
> > > -	return local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, new_mode) ==
> > > -	       CS_MODE_DISABLED;
> > > +	int curr = CS_MODE_DISABLED;
> > > +
> > > +	return atomic_try_cmpxchg_acquire(&csdev->mode, &curr, new_mode);
> > >  }
> >
> > Just question. why is acquire symentic enough in here?
>
> My understanding is that acquire semantics ensure ordering between
> cmpxchg_acquire() and all memory accesses that follow it. However, it
> does not guarantee that memory accesses appearing before the acquire
> are ordered as well.
>
> This is exactly what we want in the driver. We must ensure to first grab
> an active device mode, then it is safe to proceed later operations (e.g.
> set configurations in driver data and access registers).
>
> > before this change, local_cmpxchg seems to use full_fenced.
>
> Not really. Arm64 has atomic instruction for cmpxchg, it does not use
> full_fenced. It should run into the path of arch_cmpxchg().

No, It local_cmpxchg backend with atomic_cmpxchg() and it seems include
full fenced.

But, you've explained the what i concerned with below diagram clear

CPU0                                          CPU1
// store csmode in store buffer
atomic_cmpxchg_acquired(csmode)
                                             // couldn't see the contents in store buffer
                                             atomic_read_acquire(csmode);

                                             Send SMP call
              ,-------------------------------/
             v
atomic_cmpxchg_acquired(csmode)  ==> Fail to grab another mode

However, I think it would be good to change the atomic_try_cmpxchg()
instead of atomic_cmpxchg_acquired() to prevent redundnat IPI trigger for
observation failure.

Thanks.

--
Sincerely,
Yeoreum Yun