The current code propagate evsel's cpu map settings to evlist when it's
added to an evlist. But the evlist->all_cpus and each evsel's cpus will
be updated in perf_evlist__set_maps() later. No need to do it before
evlist's cpus are set actually.
Actually we discarded this intermediate all_cpus maps at the beginning
of perf_evlist__set_maps(). Let's not do this. It's only needed when
an evsel is added after the evlist cpu maps are set.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/lib/perf/evlist.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 187129652ab6..cc070c3a134d 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -67,10 +67,6 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
{
struct perf_evsel *evsel;
- /* Recomputing all_cpus, so start with a blank slate. */
- perf_cpu_map__put(evlist->all_cpus);
- evlist->all_cpus = NULL;
-
perf_evlist__for_each_evsel(evlist, evsel)
__perf_evlist__propagate_maps(evlist, evsel);
}
@@ -81,7 +77,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
evsel->idx = evlist->nr_entries;
list_add_tail(&evsel->node, &evlist->entries);
evlist->nr_entries += 1;
- __perf_evlist__propagate_maps(evlist, evsel);
+
+ if (evlist->all_cpus)
+ __perf_evlist__propagate_maps(evlist, evsel);
}
void perf_evlist__remove(struct perf_evlist *evlist,
--
2.37.3.998.g577e59143f-goog
On 24/09/22 19:57, Namhyung Kim wrote:
> The current code propagate evsel's cpu map settings to evlist when it's
> added to an evlist. But the evlist->all_cpus and each evsel's cpus will
> be updated in perf_evlist__set_maps() later. No need to do it before
> evlist's cpus are set actually.
>
> Actually we discarded this intermediate all_cpus maps at the beginning
> of perf_evlist__set_maps(). Let's not do this. It's only needed when
> an evsel is added after the evlist cpu maps are set.
That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles
with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
added to the evlist. It can also remove an evsel from the evlist.
There might be other cases like that, but that was just one that stuck
out.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/lib/perf/evlist.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 187129652ab6..cc070c3a134d 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -67,10 +67,6 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> {
> struct perf_evsel *evsel;
>
> - /* Recomputing all_cpus, so start with a blank slate. */
> - perf_cpu_map__put(evlist->all_cpus);
> - evlist->all_cpus = NULL;
> -
> perf_evlist__for_each_evsel(evlist, evsel)
> __perf_evlist__propagate_maps(evlist, evsel);
> }
> @@ -81,7 +77,9 @@ void perf_evlist__add(struct perf_evlist *evlist,
> evsel->idx = evlist->nr_entries;
> list_add_tail(&evsel->node, &evlist->entries);
> evlist->nr_entries += 1;
> - __perf_evlist__propagate_maps(evlist, evsel);
> +
> + if (evlist->all_cpus)
> + __perf_evlist__propagate_maps(evlist, evsel);
> }
>
> void perf_evlist__remove(struct perf_evlist *evlist,
Hi Adrian, On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 24/09/22 19:57, Namhyung Kim wrote: > > The current code propagate evsel's cpu map settings to evlist when it's > > added to an evlist. But the evlist->all_cpus and each evsel's cpus will > > be updated in perf_evlist__set_maps() later. No need to do it before > > evlist's cpus are set actually. > > > > Actually we discarded this intermediate all_cpus maps at the beginning > > of perf_evlist__set_maps(). Let's not do this. It's only needed when > > an evsel is added after the evlist cpu maps are set. > > That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > added to the evlist. It can also remove an evsel from the evlist. Thanks for your review. I think it's fine to change evsel cpus or to remove an evsel from evlist before calling evlist__create_maps(). The function will take care of setting evlist's all_cpus from the evsels in the evlist. So previous changes in evsel/cpus wouldn't be any special. After this point, adding a new evsel needs to update evlist all cpus by propagating cpu maps. So I think hybrid cpus should be fine. Did I miss something? > > There might be other cases like that, but that was just one that stuck > out. Thanks for sharing your concern. Please let me know if you could come up with another. Namhyung
On 27/09/22 20:28, Namhyung Kim wrote: > Hi Adrian, > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >> >> On 24/09/22 19:57, Namhyung Kim wrote: >>> The current code propagate evsel's cpu map settings to evlist when it's >>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will >>> be updated in perf_evlist__set_maps() later. No need to do it before >>> evlist's cpus are set actually. >>> >>> Actually we discarded this intermediate all_cpus maps at the beginning >>> of perf_evlist__set_maps(). Let's not do this. It's only needed when >>> an evsel is added after the evlist cpu maps are set. >> >> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been >> added to the evlist. It can also remove an evsel from the evlist. > > Thanks for your review. I think it's fine to change evsel cpus or to remove > an evsel from evlist before calling evlist__create_maps(). The function > will take care of setting evlist's all_cpus from the evsels in the evlist. > So previous changes in evsel/cpus wouldn't be any special. > > After this point, adding a new evsel needs to update evlist all cpus by > propagating cpu maps. So I think hybrid cpus should be fine. > Did I miss something? I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the cpus from the target->cpu_list (using perf record -C) , since after this patch all_cpus always starts with the target->cpu_list instead of an empty list. But then, in the hybrid case, it puts a dummy event that uses the target cpu list anyway, so the result is the same. I don't know if there are any cases where all_cpus would actually need to exclude some of the cpus from target->cpu_list. > >> >> There might be other cases like that, but that was just one that stuck >> out. > > Thanks for sharing your concern. Please let me know if you could > come up with another. > > Namhyung
On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 27/09/22 20:28, Namhyung Kim wrote: > > Hi Adrian, > > > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > >> > >> On 24/09/22 19:57, Namhyung Kim wrote: > >>> The current code propagate evsel's cpu map settings to evlist when it's > >>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > >>> be updated in perf_evlist__set_maps() later. No need to do it before > >>> evlist's cpus are set actually. > >>> > >>> Actually we discarded this intermediate all_cpus maps at the beginning > >>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > >>> an evsel is added after the evlist cpu maps are set. > >> > >> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > >> added to the evlist. It can also remove an evsel from the evlist. > > > > Thanks for your review. I think it's fine to change evsel cpus or to remove > > an evsel from evlist before calling evlist__create_maps(). The function > > will take care of setting evlist's all_cpus from the evsels in the evlist. > > So previous changes in evsel/cpus wouldn't be any special. > > > > After this point, adding a new evsel needs to update evlist all cpus by > > propagating cpu maps. So I think hybrid cpus should be fine. > > Did I miss something? > > I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > cpus from the target->cpu_list (using perf record -C) , since after this > patch all_cpus always starts with the target->cpu_list instead of an empty > list. But then, in the hybrid case, it puts a dummy event that uses the > target cpu list anyway, so the result is the same. > > I don't know if there are any cases where all_cpus would actually need to > exclude some of the cpus from target->cpu_list. I'm not aware of other cases to reduce cpu list. I think it'd be fine if it has a cpu in the evlist->all_cpus even if it's not used. The evsel should have a correct list anyway and we mostly use the evsel cpus to do the real work. Thanks, Namhyung
On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > > > On 27/09/22 20:28, Namhyung Kim wrote: > > > Hi Adrian, > > > > > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > >> > > >> On 24/09/22 19:57, Namhyung Kim wrote: > > >>> The current code propagate evsel's cpu map settings to evlist when it's > > >>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > > >>> be updated in perf_evlist__set_maps() later. No need to do it before > > >>> evlist's cpus are set actually. > > >>> > > >>> Actually we discarded this intermediate all_cpus maps at the beginning > > >>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > > >>> an evsel is added after the evlist cpu maps are set. > > >> > > >> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > > >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > > >> added to the evlist. It can also remove an evsel from the evlist. > > > > > > Thanks for your review. I think it's fine to change evsel cpus or to remove > > > an evsel from evlist before calling evlist__create_maps(). The function > > > will take care of setting evlist's all_cpus from the evsels in the evlist. > > > So previous changes in evsel/cpus wouldn't be any special. > > > > > > After this point, adding a new evsel needs to update evlist all cpus by > > > propagating cpu maps. So I think hybrid cpus should be fine. > > > Did I miss something? > > > > I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > > cpus from the target->cpu_list (using perf record -C) , since after this > > patch all_cpus always starts with the target->cpu_list instead of an empty > > list. But then, in the hybrid case, it puts a dummy event that uses the > > target cpu list anyway, so the result is the same. > > > > I don't know if there are any cases where all_cpus would actually need to > > exclude some of the cpus from target->cpu_list. > > I'm not aware of other cases to reduce cpu list. I think it'd be fine > if it has a cpu in the evlist->all_cpus even if it's not used. The evsel > should have a correct list anyway and we mostly use the evsel cpus > to do the real work. > > Thanks, > Namhyung The affinity changes made it so that we use all_cpus probably more often than the evsel CPU maps for real work. The reason being we want to avoid IPIs so we do all the work on 1 CPU and then move to the next CPU in evlist all_cpus. evsel CPU maps are used to make sure the indices are kept accurate - for example, if an uncore event is measured with a CPU event: https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 Thanks, Ian
On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: > > On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > > > On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > > > > > On 27/09/22 20:28, Namhyung Kim wrote: > > > > Hi Adrian, > > > > > > > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > > >> > > > >> On 24/09/22 19:57, Namhyung Kim wrote: > > > >>> The current code propagate evsel's cpu map settings to evlist when it's > > > >>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > > > >>> be updated in perf_evlist__set_maps() later. No need to do it before > > > >>> evlist's cpus are set actually. > > > >>> > > > >>> Actually we discarded this intermediate all_cpus maps at the beginning > > > >>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > > > >>> an evsel is added after the evlist cpu maps are set. > > > >> > > > >> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > > > >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > > > >> added to the evlist. It can also remove an evsel from the evlist. > > > > > > > > Thanks for your review. I think it's fine to change evsel cpus or to remove > > > > an evsel from evlist before calling evlist__create_maps(). The function > > > > will take care of setting evlist's all_cpus from the evsels in the evlist. > > > > So previous changes in evsel/cpus wouldn't be any special. > > > > > > > > After this point, adding a new evsel needs to update evlist all cpus by > > > > propagating cpu maps. So I think hybrid cpus should be fine. > > > > Did I miss something? > > > > > > I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > > > cpus from the target->cpu_list (using perf record -C) , since after this > > > patch all_cpus always starts with the target->cpu_list instead of an empty > > > list. But then, in the hybrid case, it puts a dummy event that uses the > > > target cpu list anyway, so the result is the same. > > > > > > I don't know if there are any cases where all_cpus would actually need to > > > exclude some of the cpus from target->cpu_list. > > > > I'm not aware of other cases to reduce cpu list. I think it'd be fine > > if it has a cpu in the evlist->all_cpus even if it's not used. The evsel > > should have a correct list anyway and we mostly use the evsel cpus > > to do the real work. > > > > Thanks, > > Namhyung > > The affinity changes made it so that we use all_cpus probably more > often than the evsel CPU maps for real work. The reason being we want > to avoid IPIs so we do all the work on 1 CPU and then move to the next > CPU in evlist all_cpus. evsel CPU maps are used to make sure the > indices are kept accurate - for example, if an uncore event is > measured with a CPU event: > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 Right, I meant it'd check the evsel cpus eventually even if it iterates on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a CPU if it's not in the evsel cpus. Thanks, Namhyung
On 29/09/22 08:09, Namhyung Kim wrote:
> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote:
>>
>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
>>>
>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>
>>>> On 27/09/22 20:28, Namhyung Kim wrote:
>>>>> Hi Adrian,
>>>>>
>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>>>
>>>>>> On 24/09/22 19:57, Namhyung Kim wrote:
>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's
>>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will
>>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before
>>>>>>> evlist's cpus are set actually.
>>>>>>>
>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning
>>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when
>>>>>>> an evsel is added after the evlist cpu maps are set.
>>>>>>
>>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles
>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
>>>>>> added to the evlist. It can also remove an evsel from the evlist.
>>>>>
>>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove
>>>>> an evsel from evlist before calling evlist__create_maps(). The function
>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist.
>>>>> So previous changes in evsel/cpus wouldn't be any special.
>>>>>
>>>>> After this point, adding a new evsel needs to update evlist all cpus by
>>>>> propagating cpu maps. So I think hybrid cpus should be fine.
>>>>> Did I miss something?
>>>>
>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
>>>> cpus from the target->cpu_list (using perf record -C) , since after this
>>>> patch all_cpus always starts with the target->cpu_list instead of an empty
>>>> list. But then, in the hybrid case, it puts a dummy event that uses the
>>>> target cpu list anyway, so the result is the same.
>>>>
>>>> I don't know if there are any cases where all_cpus would actually need to
>>>> exclude some of the cpus from target->cpu_list.
>>>
>>> I'm not aware of other cases to reduce cpu list. I think it'd be fine
>>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel
>>> should have a correct list anyway and we mostly use the evsel cpus
>>> to do the real work.
>>>
>>> Thanks,
>>> Namhyung
>>
>> The affinity changes made it so that we use all_cpus probably more
>> often than the evsel CPU maps for real work. The reason being we want
>> to avoid IPIs so we do all the work on 1 CPU and then move to the next
>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the
>> indices are kept accurate - for example, if an uncore event is
>> measured with a CPU event:
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404
>
> Right, I meant it'd check the evsel cpus eventually even if it iterates
> on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a
> CPU if it's not in the evsel cpus.
>
> Thanks,
> Namhyung
Perhaps an alternative is to be explicit about deferring map
propagation e.g.
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 19eaea99aa4f..5ce19e62397d 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -70,6 +70,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
/* Recomputing all_cpus, so start with a blank slate. */
perf_cpu_map__put(evlist->all_cpus);
evlist->all_cpus = NULL;
+ evlist->defer_map_propagation = false;
perf_evlist__for_each_evsel(evlist, evsel)
__perf_evlist__propagate_maps(evlist, evsel);
@@ -81,7 +82,8 @@ void perf_evlist__add(struct perf_evlist *evlist,
evsel->idx = evlist->nr_entries;
list_add_tail(&evsel->node, &evlist->entries);
evlist->nr_entries += 1;
- __perf_evlist__propagate_maps(evlist, evsel);
+ if (!evlist->defer_map_propagation)
+ __perf_evlist__propagate_maps(evlist, evsel);
}
void perf_evlist__remove(struct perf_evlist *evlist,
@@ -177,9 +179,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
evlist->threads = perf_thread_map__get(threads);
}
- if (!evlist->all_cpus && cpus)
- evlist->all_cpus = perf_cpu_map__get(cpus);
-
perf_evlist__propagate_maps(evlist);
}
diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 6f89aec3e608..dbe0b763f597 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -19,6 +19,7 @@ struct perf_evlist {
int nr_entries;
int nr_groups;
bool has_user_cpus;
+ bool defer_map_propagation;
/**
* The cpus passed from the command line or all online CPUs by
* default.
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 52d254b1530c..1c2523d66a14 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -3975,6 +3975,7 @@ int cmd_record(int argc, const char **argv)
rec->evlist = evlist__new();
if (rec->evlist == NULL)
return -ENOMEM;
+ rec->evlist->core.defer_map_propagation = true;
err = perf_config(perf_record_config, rec);
if (err)
On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 29/09/22 08:09, Namhyung Kim wrote:
> > On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote:
> >>
> >> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >>>
> >>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>
> >>>> On 27/09/22 20:28, Namhyung Kim wrote:
> >>>>> Hi Adrian,
> >>>>>
> >>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>>>>
> >>>>>> On 24/09/22 19:57, Namhyung Kim wrote:
> >>>>>>> The current code propagate evsel's cpu map settings to evlist when it's
> >>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will
> >>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before
> >>>>>>> evlist's cpus are set actually.
> >>>>>>>
> >>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning
> >>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when
> >>>>>>> an evsel is added after the evlist cpu maps are set.
> >>>>>>
> >>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles
> >>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been
> >>>>>> added to the evlist. It can also remove an evsel from the evlist.
> >>>>>
> >>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove
> >>>>> an evsel from evlist before calling evlist__create_maps(). The function
> >>>>> will take care of setting evlist's all_cpus from the evsels in the evlist.
> >>>>> So previous changes in evsel/cpus wouldn't be any special.
> >>>>>
> >>>>> After this point, adding a new evsel needs to update evlist all cpus by
> >>>>> propagating cpu maps. So I think hybrid cpus should be fine.
> >>>>> Did I miss something?
> >>>>
> >>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the
> >>>> cpus from the target->cpu_list (using perf record -C) , since after this
> >>>> patch all_cpus always starts with the target->cpu_list instead of an empty
> >>>> list. But then, in the hybrid case, it puts a dummy event that uses the
> >>>> target cpu list anyway, so the result is the same.
> >>>>
> >>>> I don't know if there are any cases where all_cpus would actually need to
> >>>> exclude some of the cpus from target->cpu_list.
> >>>
> >>> I'm not aware of other cases to reduce cpu list. I think it'd be fine
> >>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel
> >>> should have a correct list anyway and we mostly use the evsel cpus
> >>> to do the real work.
> >>>
> >>> Thanks,
> >>> Namhyung
> >>
> >> The affinity changes made it so that we use all_cpus probably more
> >> often than the evsel CPU maps for real work. The reason being we want
> >> to avoid IPIs so we do all the work on 1 CPU and then move to the next
> >> CPU in evlist all_cpus. evsel CPU maps are used to make sure the
> >> indices are kept accurate - for example, if an uncore event is
> >> measured with a CPU event:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366
> >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404
> >
> > Right, I meant it'd check the evsel cpus eventually even if it iterates
> > on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a
> > CPU if it's not in the evsel cpus.
> >
> > Thanks,
> > Namhyung
>
> Perhaps an alternative is to be explicit about deferring map
> propagation e.g.
Thanks for your patch. Yeah, we can use this.
But I still think it'd be better doing it unconditionally
since any propagation before perf_evlist__set_maps
will be discarded anyway. With this change, other
than perf record will collect all cpus before _set_maps
and then discard it. It seems like a waste, no?
Or else, we can have allow_map_propagation initialized
to false and set it to true in perf_evlist__set_maps().
Thanks,
Namhyung
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 19eaea99aa4f..5ce19e62397d 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -70,6 +70,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> /* Recomputing all_cpus, so start with a blank slate. */
> perf_cpu_map__put(evlist->all_cpus);
> evlist->all_cpus = NULL;
> + evlist->defer_map_propagation = false;
>
> perf_evlist__for_each_evsel(evlist, evsel)
> __perf_evlist__propagate_maps(evlist, evsel);
> @@ -81,7 +82,8 @@ void perf_evlist__add(struct perf_evlist *evlist,
> evsel->idx = evlist->nr_entries;
> list_add_tail(&evsel->node, &evlist->entries);
> evlist->nr_entries += 1;
> - __perf_evlist__propagate_maps(evlist, evsel);
> + if (!evlist->defer_map_propagation)
> + __perf_evlist__propagate_maps(evlist, evsel);
> }
>
> void perf_evlist__remove(struct perf_evlist *evlist,
> @@ -177,9 +179,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist,
> evlist->threads = perf_thread_map__get(threads);
> }
>
> - if (!evlist->all_cpus && cpus)
> - evlist->all_cpus = perf_cpu_map__get(cpus);
> -
> perf_evlist__propagate_maps(evlist);
> }
>
> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
> index 6f89aec3e608..dbe0b763f597 100644
> --- a/tools/lib/perf/include/internal/evlist.h
> +++ b/tools/lib/perf/include/internal/evlist.h
> @@ -19,6 +19,7 @@ struct perf_evlist {
> int nr_entries;
> int nr_groups;
> bool has_user_cpus;
> + bool defer_map_propagation;
> /**
> * The cpus passed from the command line or all online CPUs by
> * default.
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 52d254b1530c..1c2523d66a14 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -3975,6 +3975,7 @@ int cmd_record(int argc, const char **argv)
> rec->evlist = evlist__new();
> if (rec->evlist == NULL)
> return -ENOMEM;
> + rec->evlist->core.defer_map_propagation = true;
>
> err = perf_config(perf_record_config, rec);
> if (err)
>
On 29/09/22 23:42, Namhyung Kim wrote: > On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote: >> >> On 29/09/22 08:09, Namhyung Kim wrote: >>> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: >>>> >>>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: >>>>> >>>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >>>>>> >>>>>> On 27/09/22 20:28, Namhyung Kim wrote: >>>>>>> Hi Adrian, >>>>>>> >>>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >>>>>>>> >>>>>>>> On 24/09/22 19:57, Namhyung Kim wrote: >>>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's >>>>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will >>>>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before >>>>>>>>> evlist's cpus are set actually. >>>>>>>>> >>>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning >>>>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when >>>>>>>>> an evsel is added after the evlist cpu maps are set. >>>>>>>> >>>>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles >>>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been >>>>>>>> added to the evlist. It can also remove an evsel from the evlist. >>>>>>> >>>>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove >>>>>>> an evsel from evlist before calling evlist__create_maps(). The function >>>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist. >>>>>>> So previous changes in evsel/cpus wouldn't be any special. >>>>>>> >>>>>>> After this point, adding a new evsel needs to update evlist all cpus by >>>>>>> propagating cpu maps. So I think hybrid cpus should be fine. >>>>>>> Did I miss something? >>>>>> >>>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the >>>>>> cpus from the target->cpu_list (using perf record -C) , since after this >>>>>> patch all_cpus always starts with the target->cpu_list instead of an empty >>>>>> list. But then, in the hybrid case, it puts a dummy event that uses the >>>>>> target cpu list anyway, so the result is the same. >>>>>> >>>>>> I don't know if there are any cases where all_cpus would actually need to >>>>>> exclude some of the cpus from target->cpu_list. >>>>> >>>>> I'm not aware of other cases to reduce cpu list. I think it'd be fine >>>>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel >>>>> should have a correct list anyway and we mostly use the evsel cpus >>>>> to do the real work. >>>>> >>>>> Thanks, >>>>> Namhyung >>>> >>>> The affinity changes made it so that we use all_cpus probably more >>>> often than the evsel CPU maps for real work. The reason being we want >>>> to avoid IPIs so we do all the work on 1 CPU and then move to the next >>>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the >>>> indices are kept accurate - for example, if an uncore event is >>>> measured with a CPU event: >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 >>> >>> Right, I meant it'd check the evsel cpus eventually even if it iterates >>> on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a >>> CPU if it's not in the evsel cpus. >>> >>> Thanks, >>> Namhyung >> >> Perhaps an alternative is to be explicit about deferring map >> propagation e.g. > > Thanks for your patch. Yeah, we can use this. > > But I still think it'd be better doing it unconditionally > since any propagation before perf_evlist__set_maps > will be discarded anyway. With this change, other > than perf record will collect all cpus before _set_maps > and then discard it. It seems like a waste, no? > > Or else, we can have allow_map_propagation initialized > to false and set it to true in perf_evlist__set_maps(). > That sounds fine.
On Fri, Sep 30, 2022 at 5:50 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 29/09/22 23:42, Namhyung Kim wrote: > > On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote: > >> > >> On 29/09/22 08:09, Namhyung Kim wrote: > >>> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: > >>>> > >>>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > >>>>> > >>>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > >>>>>> > >>>>>> On 27/09/22 20:28, Namhyung Kim wrote: > >>>>>>> Hi Adrian, > >>>>>>> > >>>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > >>>>>>>> > >>>>>>>> On 24/09/22 19:57, Namhyung Kim wrote: > >>>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's > >>>>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > >>>>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before > >>>>>>>>> evlist's cpus are set actually. > >>>>>>>>> > >>>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning > >>>>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > >>>>>>>>> an evsel is added after the evlist cpu maps are set. > >>>>>>>> > >>>>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > >>>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > >>>>>>>> added to the evlist. It can also remove an evsel from the evlist. > >>>>>>> > >>>>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove > >>>>>>> an evsel from evlist before calling evlist__create_maps(). The function > >>>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist. > >>>>>>> So previous changes in evsel/cpus wouldn't be any special. > >>>>>>> > >>>>>>> After this point, adding a new evsel needs to update evlist all cpus by > >>>>>>> propagating cpu maps. So I think hybrid cpus should be fine. > >>>>>>> Did I miss something? > >>>>>> > >>>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > >>>>>> cpus from the target->cpu_list (using perf record -C) , since after this > >>>>>> patch all_cpus always starts with the target->cpu_list instead of an empty > >>>>>> list. But then, in the hybrid case, it puts a dummy event that uses the > >>>>>> target cpu list anyway, so the result is the same. > >>>>>> > >>>>>> I don't know if there are any cases where all_cpus would actually need to > >>>>>> exclude some of the cpus from target->cpu_list. > >>>>> > >>>>> I'm not aware of other cases to reduce cpu list. I think it'd be fine > >>>>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel > >>>>> should have a correct list anyway and we mostly use the evsel cpus > >>>>> to do the real work. > >>>>> > >>>>> Thanks, > >>>>> Namhyung > >>>> > >>>> The affinity changes made it so that we use all_cpus probably more > >>>> often than the evsel CPU maps for real work. The reason being we want > >>>> to avoid IPIs so we do all the work on 1 CPU and then move to the next > >>>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the > >>>> indices are kept accurate - for example, if an uncore event is > >>>> measured with a CPU event: > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 > >>> > >>> Right, I meant it'd check the evsel cpus eventually even if it iterates > >>> on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a > >>> CPU if it's not in the evsel cpus. > >>> > >>> Thanks, > >>> Namhyung > >> > >> Perhaps an alternative is to be explicit about deferring map > >> propagation e.g. > > > > Thanks for your patch. Yeah, we can use this. > > > > But I still think it'd be better doing it unconditionally > > since any propagation before perf_evlist__set_maps > > will be discarded anyway. With this change, other > > than perf record will collect all cpus before _set_maps > > and then discard it. It seems like a waste, no? > > > > Or else, we can have allow_map_propagation initialized > > to false and set it to true in perf_evlist__set_maps(). > > > > That sounds fine. Thanks! Arnaldo, how do you want to handle it? I can send v2 for the whole series, but I think you already applied it. Then do you want me to send this change on top? Namhyung
Em Fri, Sep 30, 2022 at 09:44:49AM -0700, Namhyung Kim escreveu: > On Fri, Sep 30, 2022 at 5:50 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > > > On 29/09/22 23:42, Namhyung Kim wrote: > > > On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote: > > >> > > >> On 29/09/22 08:09, Namhyung Kim wrote: > > >>> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: > > >>>> > > >>>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > >>>>> > > >>>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > >>>>>> > > >>>>>> On 27/09/22 20:28, Namhyung Kim wrote: > > >>>>>>> Hi Adrian, > > >>>>>>> > > >>>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > >>>>>>>> > > >>>>>>>> On 24/09/22 19:57, Namhyung Kim wrote: > > >>>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's > > >>>>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > > >>>>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before > > >>>>>>>>> evlist's cpus are set actually. > > >>>>>>>>> > > >>>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning > > >>>>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > > >>>>>>>>> an evsel is added after the evlist cpu maps are set. > > >>>>>>>> > > >>>>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > > >>>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > > >>>>>>>> added to the evlist. It can also remove an evsel from the evlist. > > >>>>>>> > > >>>>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove > > >>>>>>> an evsel from evlist before calling evlist__create_maps(). The function > > >>>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist. > > >>>>>>> So previous changes in evsel/cpus wouldn't be any special. > > >>>>>>> > > >>>>>>> After this point, adding a new evsel needs to update evlist all cpus by > > >>>>>>> propagating cpu maps. So I think hybrid cpus should be fine. > > >>>>>>> Did I miss something? > > >>>>>> > > >>>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > > >>>>>> cpus from the target->cpu_list (using perf record -C) , since after this > > >>>>>> patch all_cpus always starts with the target->cpu_list instead of an empty > > >>>>>> list. But then, in the hybrid case, it puts a dummy event that uses the > > >>>>>> target cpu list anyway, so the result is the same. > > >>>>>> > > >>>>>> I don't know if there are any cases where all_cpus would actually need to > > >>>>>> exclude some of the cpus from target->cpu_list. > > >>>>> > > >>>>> I'm not aware of other cases to reduce cpu list. I think it'd be fine > > >>>>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel > > >>>>> should have a correct list anyway and we mostly use the evsel cpus > > >>>>> to do the real work. > > >>>>> > > >>>>> Thanks, > > >>>>> Namhyung > > >>>> > > >>>> The affinity changes made it so that we use all_cpus probably more > > >>>> often than the evsel CPU maps for real work. The reason being we want > > >>>> to avoid IPIs so we do all the work on 1 CPU and then move to the next > > >>>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the > > >>>> indices are kept accurate - for example, if an uncore event is > > >>>> measured with a CPU event: > > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 > > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 > > >>> > > >>> Right, I meant it'd check the evsel cpus eventually even if it iterates > > >>> on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a > > >>> CPU if it's not in the evsel cpus. > > >>> > > >>> Thanks, > > >>> Namhyung > > >> > > >> Perhaps an alternative is to be explicit about deferring map > > >> propagation e.g. > > > > > > Thanks for your patch. Yeah, we can use this. > > > > > > But I still think it'd be better doing it unconditionally > > > since any propagation before perf_evlist__set_maps > > > will be discarded anyway. With this change, other > > > than perf record will collect all cpus before _set_maps > > > and then discard it. It seems like a waste, no? > > > > > > Or else, we can have allow_map_propagation initialized > > > to false and set it to true in perf_evlist__set_maps(). > > > > > > > That sounds fine. > > Thanks! > > Arnaldo, how do you want to handle it? I can send v2 for the > whole series, but I think you already applied it. Then do you > want me to send this change on top? Send v2 for the whole series, I haven't yet published it so I can replace. - Arnaldo
© 2016 - 2026 Red Hat, Inc.