The perf_cpu_map__merge() function has two arguments, 'orig' and 'other',
and it returns results for three different cases:
Case 1: 'other' is a subset of 'orig'.
Case 2: 'orig' is a subset of 'other'.
Case 3: 'orig' and 'other' are not subsets of each other.
The result combinations are:
+--------+-------------+-----------+-----------+
| Cases | Result | orig | other |
+--------+-------------+-----------+-----------+
| Case1 | orig | No change | No change |
+--------+-------------+-----------+-----------+
| Case2 | other | No change | refcnt++ |
+--------+-------------+-----------+-----------+
| Case3 | New CPU map | refcnt-- | No change |
+--------+-------------+-----------+-----------+
Both Case 1 and Case 3 have a risk of releasing maps unexpectedly. This
is because the reference counter operations are not consistent crossing
different cases and leads to difficulty for callers handling them.
For Case 1, because 'other' is a subset of 'orig', 'orig' is returned as
the merging result, but its refcnt is not incremented. This can lead to
the map being released repeatedly:
struct perf_cpu_map *a = perf_cpu_map__new("1,2");
struct perf_cpu_map *b = perf_cpu_map__new("2");
/* 'm' and 'a' point to the same CPU map */
struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
...
perf_cpu_map__put(m); -> Release the map
perf_cpu_map__put(b);
perf_cpu_map__put(a); -> Release the same merged again
For Case 3, it is possible that the CPU map pointed to by 'orig' can be
released twice: within the function and outside of it.
struct perf_cpu_map *a = perf_cpu_map__new("1,2");
struct perf_cpu_map *b = perf_cpu_map__new("3");
struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
`> 'm' is new allocated map. But 'a' has
been released in the function.
...
perf_cpu_map__put(m);
perf_cpu_map__put(b);
perf_cpu_map__put(a); -> Release the 'a' map again
This commit increases the reference counter if a passed map is returned.
For the case of a newly allocated map, it does not change the reference
counter for passed maps.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
tools/lib/perf/cpumap.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index cae799ad44e1..3f80eade8b1c 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -438,9 +438,7 @@ bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu
/*
* Merge two cpumaps
*
- * orig either gets freed and replaced with a new map, or reused
- * with no reference count change (similar to "realloc")
- * other has its reference count increased.
+ * Return a new map, or reused with its reference count increased.
*/
struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
@@ -452,11 +450,9 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
struct perf_cpu_map *merged;
if (perf_cpu_map__is_subset(orig, other))
- return orig;
- if (perf_cpu_map__is_subset(other, orig)) {
- perf_cpu_map__put(orig);
+ return perf_cpu_map__get(orig);
+ if (perf_cpu_map__is_subset(other, orig))
return perf_cpu_map__get(other);
- }
tmp_len = __perf_cpu_map__nr(orig) + __perf_cpu_map__nr(other);
tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
@@ -483,7 +479,6 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
merged = cpu_map__trim_new(k, tmp_cpus);
free(tmp_cpus);
- perf_cpu_map__put(orig);
return merged;
}
--
2.34.1
On 25/09/24 22:53, Leo Yan wrote: > The perf_cpu_map__merge() function has two arguments, 'orig' and 'other', > and it returns results for three different cases: > > Case 1: 'other' is a subset of 'orig'. > Case 2: 'orig' is a subset of 'other'. > Case 3: 'orig' and 'other' are not subsets of each other. > > The result combinations are: > > +--------+-------------+-----------+-----------+ > | Cases | Result | orig | other | > +--------+-------------+-----------+-----------+ > | Case1 | orig | No change | No change | > +--------+-------------+-----------+-----------+ > | Case2 | other | No change | refcnt++ | > +--------+-------------+-----------+-----------+ > | Case3 | New CPU map | refcnt-- | No change | > +--------+-------------+-----------+-----------+ > > Both Case 1 and Case 3 have a risk of releasing maps unexpectedly. This > is because the reference counter operations are not consistent crossing > different cases and leads to difficulty for callers handling them. > > For Case 1, because 'other' is a subset of 'orig', 'orig' is returned as > the merging result, but its refcnt is not incremented. This can lead to > the map being released repeatedly: > > struct perf_cpu_map *a = perf_cpu_map__new("1,2"); > struct perf_cpu_map *b = perf_cpu_map__new("2"); > > /* 'm' and 'a' point to the same CPU map */ > struct perf_cpu_map *m = perf_cpu_map__merge(a, b); > > ... > > perf_cpu_map__put(m); -> Release the map > perf_cpu_map__put(b); > perf_cpu_map__put(a); -> Release the same merged again > > For Case 3, it is possible that the CPU map pointed to by 'orig' can be > released twice: within the function and outside of it. > > struct perf_cpu_map *a = perf_cpu_map__new("1,2"); > struct perf_cpu_map *b = perf_cpu_map__new("3"); > > struct perf_cpu_map *m = perf_cpu_map__merge(a, b); > `> 'm' is new allocated map. But 'a' has > been released in the function. > ... > > perf_cpu_map__put(m); > perf_cpu_map__put(b); > perf_cpu_map__put(a); -> Release the 'a' map again > > This commit increases the reference counter if a passed map is returned. > For the case of a newly allocated map, it does not change the reference > counter for passed maps. The 2 non-test uses of perf_cpu_map__merge both do: a = perf_cpu_map__merge(a, b); so another way to make the API less misleading would be to introduce: err = perf_cpu_map__merge_in(&a, b); where: int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other) { struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other); if (!result) return -ENOMEM; *orig = result; return 0; } without any changes to perf_cpu_map__merge(). > > Signed-off-by: Leo Yan <leo.yan@arm.com> > --- > tools/lib/perf/cpumap.c | 11 +++-------- > 1 file changed, 3 insertions(+), 8 deletions(-) > > diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c > index cae799ad44e1..3f80eade8b1c 100644 > --- a/tools/lib/perf/cpumap.c > +++ b/tools/lib/perf/cpumap.c > @@ -438,9 +438,7 @@ bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu > /* > * Merge two cpumaps > * > - * orig either gets freed and replaced with a new map, or reused > - * with no reference count change (similar to "realloc") > - * other has its reference count increased. > + * Return a new map, or reused with its reference count increased. > */ > > struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig, > @@ -452,11 +450,9 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig, > struct perf_cpu_map *merged; > > if (perf_cpu_map__is_subset(orig, other)) > - return orig; > - if (perf_cpu_map__is_subset(other, orig)) { > - perf_cpu_map__put(orig); > + return perf_cpu_map__get(orig); > + if (perf_cpu_map__is_subset(other, orig)) > return perf_cpu_map__get(other); > - } > > tmp_len = __perf_cpu_map__nr(orig) + __perf_cpu_map__nr(other); > tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu)); > @@ -483,7 +479,6 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig, > > merged = cpu_map__trim_new(k, tmp_cpus); > free(tmp_cpus); > - perf_cpu_map__put(orig); > return merged; > } >
Hi Adrian, On 10/10/24 18:41, Adrian Hunter wrote:> > On 25/09/24 22:53, Leo Yan wrote: >> The perf_cpu_map__merge() function has two arguments, 'orig' and 'other', >> and it returns results for three different cases: >> >> Case 1: 'other' is a subset of 'orig'. >> Case 2: 'orig' is a subset of 'other'. >> Case 3: 'orig' and 'other' are not subsets of each other. >> >> The result combinations are: >> >> +--------+-------------+-----------+-----------+ >> | Cases | Result | orig | other | >> +--------+-------------+-----------+-----------+ >> | Case1 | orig | No change | No change | >> +--------+-------------+-----------+-----------+ >> | Case2 | other | No change | refcnt++ | >> +--------+-------------+-----------+-----------+ >> | Case3 | New CPU map | refcnt-- | No change | >> +--------+-------------+-----------+-----------+ >> >> Both Case 1 and Case 3 have a risk of releasing maps unexpectedly. This >> is because the reference counter operations are not consistent crossing >> different cases and leads to difficulty for callers handling them. >> >> For Case 1, because 'other' is a subset of 'orig', 'orig' is returned as >> the merging result, but its refcnt is not incremented. This can lead to >> the map being released repeatedly: >> >> struct perf_cpu_map *a = perf_cpu_map__new("1,2"); >> struct perf_cpu_map *b = perf_cpu_map__new("2"); >> >> /* 'm' and 'a' point to the same CPU map */ >> struct perf_cpu_map *m = perf_cpu_map__merge(a, b); >> >> ... >> >> perf_cpu_map__put(m); -> Release the map >> perf_cpu_map__put(b); >> perf_cpu_map__put(a); -> Release the same merged again >> >> For Case 3, it is possible that the CPU map pointed to by 'orig' can be >> released twice: within the function and outside of it. >> >> struct perf_cpu_map *a = perf_cpu_map__new("1,2"); >> struct perf_cpu_map *b = perf_cpu_map__new("3"); >> >> struct perf_cpu_map *m = perf_cpu_map__merge(a, b); >> `> 'm' is new allocated map. But 'a' has >> been released in the function. >> ... >> >> perf_cpu_map__put(m); >> perf_cpu_map__put(b); >> perf_cpu_map__put(a); -> Release the 'a' map again >> >> This commit increases the reference counter if a passed map is returned. >> For the case of a newly allocated map, it does not change the reference >> counter for passed maps. > > The 2 non-test uses of perf_cpu_map__merge both do: > > a = perf_cpu_map__merge(a, b); > > so another way to make the API less misleading would be > to introduce: > > err = perf_cpu_map__merge_in(&a, b); > > where: > > int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other) > { > struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other); > > if (!result) > return -ENOMEM; > > *orig = result; > return 0; > } > > without any changes to perf_cpu_map__merge(). Just wandering why we cannot do the same thing for the perf_cpu_map__merge() function? int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other) This can allow us to avoid any confusion in the first place. And we don't need to maintain two functions for the same thing. Thanks, Leo
On 10/11/24 10:34, Leo Yan wrote: >> The 2 non-test uses of perf_cpu_map__merge both do: >> >> a = perf_cpu_map__merge(a, b); >> >> so another way to make the API less misleading would be >> to introduce: >> >> err = perf_cpu_map__merge_in(&a, b); >> >> where: >> >> int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map >> *other) >> { >> struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other); >> >> if (!result) >> return -ENOMEM; >> >> *orig = result; >> return 0; >> } >> >> without any changes to perf_cpu_map__merge(). > > Just wandering why we cannot do the same thing for the perf_cpu_map__merge() > function? > > int perf_cpu_map__merge_in(struct perf_cpu_map **orig, > struct perf_cpu_map *other) Sorry for typo and spamming. The above suggested definition is for perf_cpu_map__merge(). > This can allow us to avoid any confusion in the first place. And we don't need > to maintain two functions for the same thing. > > Thanks, > Leo > >
On 11/10/24 12:40, Leo Yan wrote: > > > On 10/11/24 10:34, Leo Yan wrote: > >>> The 2 non-test uses of perf_cpu_map__merge both do: >>> >>> a = perf_cpu_map__merge(a, b); >>> >>> so another way to make the API less misleading would be >>> to introduce: >>> >>> err = perf_cpu_map__merge_in(&a, b); >>> >>> where: >>> >>> int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other) >>> { >>> struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other); >>> >>> if (!result) >>> return -ENOMEM; >>> >>> *orig = result; >>> return 0; >>> } >>> >>> without any changes to perf_cpu_map__merge(). >> >> Just wandering why we cannot do the same thing for the perf_cpu_map__merge() >> function? >> >> int perf_cpu_map__merge_in(struct perf_cpu_map **orig, >> struct perf_cpu_map *other) > > Sorry for typo and spamming. The above suggested definition is for perf_cpu_map__merge(). Yes - there is not much reason to have perf_cpu_map__merge() and perf_cpu_map__merge_in(). > > >> This can allow us to avoid any confusion in the first place. And we don't need >> to maintain two functions for the same thing. >> >> Thanks, >> Leo >> >>
On 10/11/24 10:46, Adrian Hunter wrote: [...] >>>> int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other) >>>> { >>>> struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other); >>>> >>>> if (!result) >>>> return -ENOMEM; >>>> >>>> *orig = result; >>>> return 0; >>>> } >>>> >>>> without any changes to perf_cpu_map__merge(). >>> >>> Just wandering why we cannot do the same thing for the perf_cpu_map__merge() >>> function? >>> >>> int perf_cpu_map__merge_in(struct perf_cpu_map **orig, >>> struct perf_cpu_map *other) >> >> Sorry for typo and spamming. The above suggested definition is for perf_cpu_map__merge(). > > Yes - there is not much reason to have perf_cpu_map__merge() > and perf_cpu_map__merge_in(). Thanks for suggestion! Will move towards this. Leo
© 2016 - 2024 Red Hat, Inc.