RE: [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)

There is a newer version of this series
RE: [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
Posted by Luck, Tony 10 months, 1 week ago
> Yes, although there is some hard-to-avoid fuzz about the precise
> meaning of "local" and "total".

Things are only getting fuzzier with mixed DDR and CXL memory.

> As Reinette pointed out, there is also the possibility of adding
> new named events other than "local" and "total" if we find that some
> kinds of event don't fit these categories.

Not just new names, new scopes too. Patches coming later this year
that would present:

$ cd /sys/fs/resctrl
$ cat mon_data/mon_PKG_00/llc_stalls
779762866739

I.e. a way to cheaply collect some "perf"-like events across
all CPUs on a package that executed jobs with a specific RMID.

Of course this can be done with perf today, but the cost to collect
this data from heavily multi-threaded workloads that context switch
rapidly is very high.
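
For illustration only, here is a minimal Python sketch of how userspace
might read such per-scope events, keyed by (resource, domain). It assumes
the mon_<RESOURCE>_<DOMAIN> directory naming shown above. mon_PKG_00 and
llc_stalls come from the proposed patches and are not in current kernels,
so the sketch builds a fake tree in a temporary directory instead of
touching /sys/fs/resctrl. The helper names read_events and make_demo_tree
are made up for this example.

```python
import os
import re
import tempfile

# Directory names under mon_data look like mon_<RESOURCE>_<DOMAIN>, e.g. mon_L3_00.
# ASSUMPTION: "PKG" and "llc_stalls" are taken from the proposal in this thread;
# they are hypothetical until those patches land.
MON_DIR = re.compile(r"^mon_([A-Za-z0-9]+)_(\d+)$")


def read_events(mon_data):
    """Return {(resource, domain_id): {event_name: value}} for one mon_data dir."""
    result = {}
    for entry in sorted(os.listdir(mon_data)):
        match = MON_DIR.match(entry)
        path = os.path.join(mon_data, entry)
        if not match or not os.path.isdir(path):
            continue
        # Key on the resource name as well as the domain id, so that
        # domain 0 of PKG and domain 0 of L3 stay distinct.
        key = (match.group(1), int(match.group(2)))
        events = {}
        for name in sorted(os.listdir(path)):
            event_path = os.path.join(path, name)
            if not os.path.isfile(event_path):
                continue
            with open(event_path) as f:
                text = f.read().strip()
            # Event files can report strings such as "Unavailable";
            # keep only plain counter values.
            if text.isdigit():
                events[name] = int(text)
        result[key] = events
    return result


def make_demo_tree(root):
    """Build a fake mon_data tree so the sketch runs without resctrl mounted."""
    layout = {
        "mon_L3_00": {"mbm_total_bytes": "1024", "mbm_local_bytes": "Unavailable"},
        "mon_PKG_00": {"llc_stalls": "779762866739"},
    }
    for dirname, files in layout.items():
        os.makedirs(os.path.join(root, dirname))
        for fname, value in files.items():
            with open(os.path.join(root, dirname, fname), "w") as f:
                f.write(value + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_demo_tree(tmp)
        for (resource, domain), events in read_events(tmp).items():
            print(resource, domain, events)
```

On a real system the argument to read_events would be a group's mon_data
directory; the point of the sketch is only that the directory name already
carries both the scope and the domain id.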

-Tony
Re: [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
Posted by Reinette Chatre 10 months, 1 week ago
Hi Tony,

On 2/13/25 10:39 AM, Luck, Tony wrote:
>> Yes, although there is some hard-to-avoid fuzz about the precise
>> meaning of "local" and "total".
> 
> Things are only getting fuzzier with mixed DDR and CXL memory.
> 
>> As Reinette pointed out, there is also the possibility of adding
>> new named events other than "local" and "total" if we find that some
>> kinds of event don't fit these categories.
> 
> Not just new names, new scopes too. Patches coming later this year
> that would present:
> 
> $ cd /sys/fs/resctrl
> $ cat mon_data/mon_PKG_00/llc_stalls
> 779762866739

Thank you for catching this. The current plan for mbm_assign_control could
not support this, since it has no way to distinguish domain X of the PKG
resource from domain X of the L3 resource. It sounds like we need to include
the resource name in the mbm_assign_control syntax?

Reinette
Re: [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)
Posted by Reinette Chatre 10 months, 1 week ago

On 2/13/25 10:34 PM, Reinette Chatre wrote:
> Hi Tony,
> 
> On 2/13/25 10:39 AM, Luck, Tony wrote:
>>> Yes, although there is some hard-to-avoid fuzz about the precise
>>> meaning of "local" and "total".
>>
>> Things are only getting fuzzier with mixed DDR and CXL memory.
>>
>>> As Reinette pointed out, there is also the possibility of adding
>>> new named events other than "local" and "total" if we find that some
>>> kinds of event don't fit these categories.
>>
>> Not just new names, new scopes too. Patches coming later this year
>> that would present:
>>
>> $ cd /sys/fs/resctrl
>> $ cat mon_data/mon_PKG_00/llc_stalls
>> 779762866739
> 
> Thank you for catching this. The current plan for mbm_assign_control could
> not support this, since it has no way to distinguish domain X of the PKG
> resource from domain X of the L3 resource. It sounds like we need to include
> the resource name in the mbm_assign_control syntax?

ugh ... please ignore this message. This is not needed since mbm_assign_control
is already associated with the resource.

Reinette