Hi Reinette,
On 7/17/2025 10:50 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 7/8/25 3:17 PM, Babu Moger wrote:
>> When supported, "mbm_event" counter assignment mode allows users to assign
>> a hardware counter to an RMID, event pair and monitor the bandwidth usage
>> as long as it is assigned. The hardware continues to track the assigned
>> counter until it is explicitly unassigned by the user.
>>
>> Introduce the architecture calls resctrl_arch_cntr_read() and
>> resctrl_arch_reset_cntr() to read and reset event counters when "mbm_event"
>> mode is supported.
>
> A reader may wonder about the inconsistent verb placement so I would
> suggest adding something like: "Function names are chosen to match
> existing resctrl_arch_rmid_read() and resctrl_arch_reset_rmid()."
>
Sure.
>>
>> Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v15: New patch to add arch calls resctrl_arch_cntr_read() and resctrl_arch_reset_cntr()
>> with mbm_event mode.
>> https://lore.kernel.org/lkml/b4b14670-9cb0-4f65-abd5-39db996e8da9@intel.com/
>> ---
>> include/linux/resctrl.h | 50 +++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 50 insertions(+)
>>
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index 50e38445183a..96679ad49d66 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -613,6 +613,56 @@ void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> enum resctrl_event_id evtid, u32 rmid, u32 closid,
>> u32 cntr_id, bool assign);
>>
>> +/**
>> + * resctrl_arch_cntr_read() - Read the eventid counter corresponding to rmid
>
> rmid -> RMID (throughout)
>
Sure.
>> + * for this resource and domain.
>> + * @r: resource that the counter should be read from.
>> + * @d: domain that the counter should be read from.
>> + * @closid: closid that matches the rmid. Depending on the architecture, the
>
> closid -> CLOSID (throughout)
>
Sure.
>> + * counter may match traffic of both @closid and @rmid, or @rmid
>> + * only.
>> + * @rmid: rmid of the counter to read.
>> + * @cntr_id: The counter ID whose event data should be reset. Valid when
>
> "should be reset" -> "should be read"?
>
>> + * "mbm_event" mode is enabled and @eventid is MBM event.
>> + * @eventid: eventid to read, e.g. L3 occupancy.
>
> This is resctrl_arch_cntr_read() that explicitly reads a *counter* and
> just the previous line mentions that @cntr_id is only valid when "mbm_event" mode is
> enabled. How could it thus be possible for "L3 occupancy" to be an example of
> an @eventid?
Fixed it now.
>
>> + * @val: result of the counter read in bytes.
>> + * @arch_mon_ctx: An architecture specific value from
>> + * resctrl_arch_mon_ctx_alloc(), for MPAM this identifies
>> + * the hardware monitor allocated for this read request.
>
> This is confusing. @cntr_id is intended to identify a hardware counter that
> is allocated to this event yet @arch_mon_ctx points to an allocated "hardware monitor"?
> What is difference between a "hardware counter" and a "hardware monitor"?
> The @arch_mon_ctx does not seem relevant to reading of the hardware counters managed
> by this series?
It is not relevant. Removed it now.
>
>> + *
>> + * Some architectures need to sleep when first programming some of the counters.
>> + * (specifically: arm64's MPAM cache occupancy counters can return 'not ready'
>
> Again the example of cache occupancy counters for an API that is unique to MBM
> events. Very confusing.
> Would any needed hardware programming not occur when the counter is allocated
> and assigned? What does this "first programming some of the counters" refer to?
Removed this text.
>
>> + * for a short period of time). Call from a non-migrateable process context on
>> + * a CPU that belongs to domain @d. e.g. use smp_call_on_cpu() or
>> + * schedule_work_on(). This function can be called with interrupts masked,
>> + * e.g. using smp_call_function_any(), but may consistently return an error.
>> + *
>> + * Return:
>> + * 0 on success, or -EIO, -EINVAL etc on error.
>> + */
>> +int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + u32 closid, u32 rmid, int cntr_id,
>> + enum resctrl_event_id eventid, u64 *val,
>> + void *arch_mon_ctx);
>> +
>> +/**
>> + * resctrl_arch_reset_cntr() - Reset any private state associated with counter id
>> + * and eventid.
>> + * @r: The domain's resource.
>> + * @d: The rmid's domain.
>> + * @closid: closid that matches the rmid. Depending on the architecture, the
>> + * counter may match traffic of both @closid and @rmid, or @rmid only.
>> + * @rmid: The rmid whose counter values should be reset.
>> + * @cntr_id: The counter ID whose event data should be reset. Valid when
>> + * "mbm_event" mode is enabled and @eventid is MBM event.
>> + * @eventid: The eventid whose counter values should be reset.
>> + *
>> + * This can be called from any CPU.
>> + */
>> +void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + u32 closid, u32 rmid, int cntr_id,
>> + enum resctrl_event_id eventid);
>> +
>> extern unsigned int resctrl_rmid_realloc_threshold;
>> extern unsigned int resctrl_rmid_realloc_limit;
>>
>
> This is new API to create MBM counters. Please use API that is appropriate for
> this purpose. I do not see why it is necessary to copy the RMID read API.
> What am I missing?
No. You are not missing anything.
This is my cut paste issue. Cleaned up the patch now. Only kept the
relevant information.
Thanks
Babu