[PATCH v7 19/24] x86/resctrl: Report "Unassigned" for MBM events in mbm_cntr_assign mode

Babu Moger posted 24 patches 1 year, 3 months ago
There is a newer version of this series
Posted by Babu Moger 1 year, 3 months ago
In mbm_cntr_assign mode, the hardware counter should be assigned to read
the MBM events.

Report "Unassigned" in case the user attempts to read the events without
assigning the counter.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v7: Moved the documentation under "mon_data".
    Updated the text a little bit.

v6: Added more explanation in resctrl.rst.
    Added checks to detect "Unassigned" before reading RMID.

v5: New patch.
---
 Documentation/arch/x86/resctrl.rst        | 10 ++++++++++
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 13 ++++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 3e9302971faf..ff5397d19704 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -417,6 +417,16 @@ When monitoring is enabled all MON groups will also contain:
 	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
 	where "YY" is the node number.
 
+	The mbm_cntr_assign mode allows users to assign a hardware counter
+	to an RMID-event pair, enabling bandwidth monitoring for as long
+	as the counter remains assigned. The hardware will continue tracking
+	the assigned RMID until the user manually unassigns it, ensuring
+	that counters are not reset during this period. With a limited number
+	of counters, the system may run out of assignable resources. In
+	mbm_cntr_assign mode, MBM event counters will return "Unassigned"
+	if the counter is not allocated to the event when read. Users must
+	manually assign a counter to read the events.
+
 "mon_hw_id":
 	Available only with debug option. The identifier used by hardware
 	for the monitor group. On x86 this is the RMID.
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 50fa1fe9a073..fc19b1d131b2 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -562,7 +562,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 	struct rdtgroup *rdtgrp;
 	struct rdt_resource *r;
 	union mon_data_bits md;
-	int ret = 0;
+	int ret = 0, index;
 
 	rdtgrp = rdtgroup_kn_lock_live(of->kn);
 	if (!rdtgrp) {
@@ -576,6 +576,15 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 	evtid = md.u.evtid;
 	r = &rdt_resources_all[resid].r_resctrl;
 
+	if (resctrl_arch_mbm_cntr_assign_enabled(r) && evtid != QOS_L3_OCCUP_EVENT_ID) {
+		index = mon_event_config_index_get(evtid);
+		if (index != INVALID_CONFIG_INDEX &&
+		    rdtgrp->mon.cntr_id[index] == MON_CNTR_UNSET) {
+			rr.err = -ENOENT;
+			goto checkresult;
+		}
+	}
+
 	if (md.u.sum) {
 		/*
 		 * This file requires summing across all domains that share
@@ -613,6 +622,8 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
 		seq_puts(m, "Error\n");
 	else if (rr.err == -EINVAL)
 		seq_puts(m, "Unavailable\n");
+	else if (rr.err == -ENOENT)
+		seq_puts(m, "Unassigned\n");
 	else
 		seq_printf(m, "%llu\n", rr.val);
 
-- 
2.34.1
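
For readers following along, here is a minimal userspace sketch of how a
tool might classify the strings that rdtgroup_mondata_show() emits after
this patch ("Error", "Unavailable", "Unassigned", or a decimal count). The
enum and function names are hypothetical and exist only for this
illustration; they are not part of the series or of any kernel interface.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not part of any kernel ABI. */
enum mbm_read_status {
	MBM_READ_OK,
	MBM_READ_ERROR,
	MBM_READ_UNAVAILABLE,
	MBM_READ_UNASSIGNED,
	MBM_READ_MALFORMED,
};

/*
 * Classify one line as read from a mon_data event file.
 * On MBM_READ_OK the counter value is stored in *val.
 */
static enum mbm_read_status mbm_parse_line(const char *line,
					    unsigned long long *val)
{
	char *end;

	if (!strcmp(line, "Error\n"))
		return MBM_READ_ERROR;
	if (!strcmp(line, "Unavailable\n"))
		return MBM_READ_UNAVAILABLE;
	if (!strcmp(line, "Unassigned\n"))
		return MBM_READ_UNASSIGNED;

	/* Anything else must be an unsigned decimal number. */
	if (line[0] < '0' || line[0] > '9')
		return MBM_READ_MALFORMED;

	errno = 0;
	*val = strtoull(line, &end, 10);
	if (errno || (*end != '\n' && *end != '\0'))
		return MBM_READ_MALFORMED;

	return MBM_READ_OK;
}
```

A caller that sees MBM_READ_UNASSIGNED would need to assign a counter to
the event before retrying the read, as the documentation hunk describes.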
Re: [PATCH v7 19/24] x86/resctrl: Report "Unassigned" for MBM events in mbm_cntr_assign mode
Posted by Reinette Chatre 1 year, 3 months ago
Hi Babu,

On 9/4/24 3:21 PM, Babu Moger wrote:
> In mbm_cntr_assign mode, the hardware counter should be assigned to read
> the MBM events.
> 
> Report "Unassigned" in case the user attempts to read the events without
> assigning the counter.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v7: Moved the documentation under "mon_data".
>     Updated the text a little bit.
> 
> v6: Added more explanation in resctrl.rst.
>     Added checks to detect "Unassigned" before reading RMID.
> 
> v5: New patch.
> ---
>  Documentation/arch/x86/resctrl.rst        | 10 ++++++++++
>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 13 ++++++++++++-
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 3e9302971faf..ff5397d19704 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -417,6 +417,16 @@ When monitoring is enabled all MON groups will also contain:
>  	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
>  	where "YY" is the node number.
>  
> +	The mbm_cntr_assign mode allows users to assign a hardware counter
> +	to an RMID-event pair, enabling bandwidth monitoring for as long
> +	as the counter remains assigned. The hardware will continue tracking
> +	the assigned RMID until the user manually unassigns it, ensuring
> +	that counters are not reset during this period. With a limited number
> +	of counters, the system may run out of assignable resources. In
> +	mbm_cntr_assign mode, MBM event counters will return "Unassigned"
> +	if the counter is not allocated to the event when read. Users must
> +	manually assign a counter to read the events.
> +

Please consider how this text could also be relevant to soft-ABMC.

>  "mon_hw_id":
>  	Available only with debug option. The identifier used by hardware
>  	for the monitor group. On x86 this is the RMID.
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 50fa1fe9a073..fc19b1d131b2 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -562,7 +562,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  	struct rdtgroup *rdtgrp;
>  	struct rdt_resource *r;
>  	union mon_data_bits md;
> -	int ret = 0;
> +	int ret = 0, index;
>  
>  	rdtgrp = rdtgroup_kn_lock_live(of->kn);
>  	if (!rdtgrp) {
> @@ -576,6 +576,15 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  	evtid = md.u.evtid;
>  	r = &rdt_resources_all[resid].r_resctrl;
>  
> +	if (resctrl_arch_mbm_cntr_assign_enabled(r) && evtid != QOS_L3_OCCUP_EVENT_ID) {
> +		index = mon_event_config_index_get(evtid);

This should use MBM_EVENT_ARRAY_INDEX, not the arch index.

> +		if (index != INVALID_CONFIG_INDEX &&
> +		    rdtgrp->mon.cntr_id[index] == MON_CNTR_UNSET) {
> +			rr.err = -ENOENT;
> +			goto checkresult;
> +		}
> +	}
> +
>  	if (md.u.sum) {
>  		/*
>  		 * This file requires summing across all domains that share
> @@ -613,6 +622,8 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>  		seq_puts(m, "Error\n");
>  	else if (rr.err == -EINVAL)
>  		seq_puts(m, "Unavailable\n");
> +	else if (rr.err == -ENOENT)
> +		seq_puts(m, "Unassigned\n");
>  	else
>  		seq_printf(m, "%llu\n", rr.val);
>  

Reinette
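
To illustrate the indexing comment above, here is a standalone sketch of
the "Unassigned" check using the resctrl MBM array index instead of the
arch config index. It is a model, not the kernel code: the struct and the
helper are hypothetical, and the definitions of MBM_EVENT_ARRAY_INDEX and
MON_CNTR_UNSET are assumptions that may not match the series.

```c
#include <stdbool.h>

/* Event IDs as used by resctrl on x86. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID     = 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
};

/* Assumption: the two MBM events map onto array slots 0 and 1. */
#define MBM_EVENT_ARRAY_INDEX(_event)	((_event) - 2)

/* Assumption: sentinel for "no counter assigned"; the real value may differ. */
#define MON_CNTR_UNSET	(-1)

/* Hypothetical stand-in for the per-group monitoring state. */
struct mongroup_sketch {
	int cntr_id[2];
};

/*
 * Return true when a read of @evtid should report "Unassigned".
 * Occupancy events never need a counter, and nothing is reported as
 * unassigned when the assign mode is off.
 */
static bool mbm_event_unassigned(const struct mongroup_sketch *mon,
				 bool assign_mode,
				 enum resctrl_event_id evtid)
{
	if (!assign_mode || evtid == QOS_L3_OCCUP_EVENT_ID)
		return false;

	return mon->cntr_id[MBM_EVENT_ARRAY_INDEX(evtid)] == MON_CNTR_UNSET;
}
```

Because the occupancy event is filtered out first, the index is always 0
or 1 here, so no separate invalid-index check is needed in this model.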
Re: [PATCH v7 19/24] x86/resctrl: Report "Unassigned" for MBM events in mbm_cntr_assign mode
Posted by Moger, Babu 1 year, 2 months ago
Hi Reinette,


On 9/19/24 12:31, Reinette Chatre wrote:
> Hi Babu,
> 
> On 9/4/24 3:21 PM, Babu Moger wrote:
>> In mbm_cntr_assign mode, the hardware counter should be assigned to read
>> the MBM events.
>>
>> Report "Unassigned" in case the user attempts to read the events without
>> assigning the counter.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v7: Moved the documentation under "mon_data".
>>     Updated the text a little bit.
>>
>> v6: Added more explanation in resctrl.rst.
>>     Added checks to detect "Unassigned" before reading RMID.
>>
>> v5: New patch.
>> ---
>>  Documentation/arch/x86/resctrl.rst        | 10 ++++++++++
>>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 13 ++++++++++++-
>>  2 files changed, 22 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index 3e9302971faf..ff5397d19704 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -417,6 +417,16 @@ When monitoring is enabled all MON groups will also contain:
>>  	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
>>  	where "YY" is the node number.
>>  
>> +	The mbm_cntr_assign mode allows users to assign a hardware counter
>> +	to an RMID-event pair, enabling bandwidth monitoring for as long
>> +	as the counter remains assigned. The hardware will continue tracking
>> +	the assigned RMID until the user manually unassigns it, ensuring
>> +	that counters are not reset during this period. With a limited number
>> +	of counters, the system may run out of assignable resources. In
>> +	mbm_cntr_assign mode, MBM event counters will return "Unassigned"
>> +	if the counter is not allocated to the event when read. Users must
>> +	manually assign a counter to read the events.
>> +
> 
> Please consider how this text could also be relevant to soft-ABMC.

It mostly applies to soft-ABMC also. Minor tweaking may be required.
How about?

"When supported the 'mbm_cntr_assign' mode allows users to assign a
hardware counter to RMID, event pair, enabling bandwidth monitoring for as
long as the counter remains assigned. The hardware will continue tracking
the assigned RMID until the user manually unassigns it, ensuring
that counters are not reset during this period. With a limited number
of counters, the system may run out of assignable counters at some point.
In that case, MBM event counters will return "Unassigned" when the event
when read. Users must manually assign a counter to read the events."


> 
>>  "mon_hw_id":
>>  	Available only with debug option. The identifier used by hardware
>>  	for the monitor group. On x86 this is the RMID.
>> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> index 50fa1fe9a073..fc19b1d131b2 100644
>> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
>> @@ -562,7 +562,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>>  	struct rdtgroup *rdtgrp;
>>  	struct rdt_resource *r;
>>  	union mon_data_bits md;
>> -	int ret = 0;
>> +	int ret = 0, index;
>>  
>>  	rdtgrp = rdtgroup_kn_lock_live(of->kn);
>>  	if (!rdtgrp) {
>> @@ -576,6 +576,15 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>>  	evtid = md.u.evtid;
>>  	r = &rdt_resources_all[resid].r_resctrl;
>>  
>> +	if (resctrl_arch_mbm_cntr_assign_enabled(r) && evtid != QOS_L3_OCCUP_EVENT_ID) {
>> +		index = mon_event_config_index_get(evtid);
> 
> This should use MBM_EVENT_ARRAY_INDEX, not the arch index.

Sure.

> 
>> +		if (index != INVALID_CONFIG_INDEX &&
>> +		    rdtgrp->mon.cntr_id[index] == MON_CNTR_UNSET) {
>> +			rr.err = -ENOENT;
>> +			goto checkresult;
>> +		}
>> +	}
>> +
>>  	if (md.u.sum) {
>>  		/*
>>  		 * This file requires summing across all domains that share
>> @@ -613,6 +622,8 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
>>  		seq_puts(m, "Error\n");
>>  	else if (rr.err == -EINVAL)
>>  		seq_puts(m, "Unavailable\n");
>> +	else if (rr.err == -ENOENT)
>> +		seq_puts(m, "Unassigned\n");
>>  	else
>>  		seq_printf(m, "%llu\n", rr.val);
>>  
> 
> Reinette
> 

-- 
Thanks
Babu Moger
Re: [PATCH v7 19/24] x86/resctrl: Report "Unassigned" for MBM events in mbm_cntr_assign mode
Posted by Reinette Chatre 1 year, 2 months ago
Hi Babu,

On 9/26/24 12:16 PM, Moger, Babu wrote:
> On 9/19/24 12:31, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 9/4/24 3:21 PM, Babu Moger wrote:
>>> In mbm_cntr_assign mode, the hardware counter should be assigned to read
>>> the MBM events.
>>>
>>> Report "Unassigned" in case the user attempts to read the events without
>>> assigning the counter.
>>>
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>> v7: Moved the documentation under "mon_data".
>>>     Updated the text a little bit.
>>>
>>> v6: Added more explanation in resctrl.rst.
>>>     Added checks to detect "Unassigned" before reading RMID.
>>>
>>> v5: New patch.
>>> ---
>>>  Documentation/arch/x86/resctrl.rst        | 10 ++++++++++
>>>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 13 ++++++++++++-
>>>  2 files changed, 22 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>>> index 3e9302971faf..ff5397d19704 100644
>>> --- a/Documentation/arch/x86/resctrl.rst
>>> +++ b/Documentation/arch/x86/resctrl.rst
>>> @@ -417,6 +417,16 @@ When monitoring is enabled all MON groups will also contain:
>>>  	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
>>>  	where "YY" is the node number.
>>>  
>>> +	The mbm_cntr_assign mode allows users to assign a hardware counter
>>> +	to an RMID-event pair, enabling bandwidth monitoring for as long
>>> +	as the counter remains assigned. The hardware will continue tracking
>>> +	the assigned RMID until the user manually unassigns it, ensuring
>>> +	that counters are not reset during this period. With a limited number
>>> +	of counters, the system may run out of assignable resources. In
>>> +	mbm_cntr_assign mode, MBM event counters will return "Unassigned"
>>> +	if the counter is not allocated to the event when read. Users must
>>> +	manually assign a counter to read the events.
>>> +
>>
>> Please consider how this text could also be relevant to soft-ABMC.
> 
> It mostly applies to soft-ABMC also. Minor tweaking may be required.

hmmm ... seems that I still have mostly the "soft-RMID" model in my head.

> How about?
> 
> "When supported the 'mbm_cntr_assign' mode allows users to assign a
> hardware counter to RMID, event pair, enabling bandwidth monitoring for as

hmmm ... so soft-ABMC also assigns hardware counters?

Also, we should aim for generic text that will cover how this may look on MPAM
also. Considering this, it may just mean to replace "RMID, event pair" with 
"mon_hw_id, event pair"?

> long as the counter remains assigned. The hardware will continue tracking
> the assigned RMID until the user manually unassigns it, ensuring

Please do double-check all usage of "RMID" in user facing interfaces/docs where
mon_hw_id may be more appropriate.

> that counters are not reset during this period. With a limited number
> of counters, the system may run out of assignable counters at some point.
> In that case, MBM event counters will return "Unassigned" when the event
> when read. Users must manually assign a counter to read the events."

"when the event when read" -> "when the event is read"?

Reinette
Re: [PATCH v7 19/24] x86/resctrl: Report "Unassigned" for MBM events in mbm_cntr_assign mode
Posted by Moger, Babu 1 year, 2 months ago
Hi Reinette,

On 9/26/2024 8:50 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 9/26/24 12:16 PM, Moger, Babu wrote:
>> On 9/19/24 12:31, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 9/4/24 3:21 PM, Babu Moger wrote:
>>>> In mbm_cntr_assign mode, the hardware counter should be assigned to read
>>>> the MBM events.
>>>>
>>>> Report "Unassigned" in case the user attempts to read the events without
>>>> assigning the counter.
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>> v7: Moved the documentation under "mon_data".
>>>>      Updated the text a little bit.
>>>>
>>>> v6: Added more explanation in resctrl.rst.
>>>>      Added checks to detect "Unassigned" before reading RMID.
>>>>
>>>> v5: New patch.
>>>> ---
>>>>   Documentation/arch/x86/resctrl.rst        | 10 ++++++++++
>>>>   arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 13 ++++++++++++-
>>>>   2 files changed, 22 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>>>> index 3e9302971faf..ff5397d19704 100644
>>>> --- a/Documentation/arch/x86/resctrl.rst
>>>> +++ b/Documentation/arch/x86/resctrl.rst
>>>> @@ -417,6 +417,16 @@ When monitoring is enabled all MON groups will also contain:
>>>>   	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
>>>>   	where "YY" is the node number.
>>>>   
>>>> +	The mbm_cntr_assign mode allows users to assign a hardware counter
>>>> +	to an RMID-event pair, enabling bandwidth monitoring for as long
>>>> +	as the counter remains assigned. The hardware will continue tracking
>>>> +	the assigned RMID until the user manually unassigns it, ensuring
>>>> +	that counters are not reset during this period. With a limited number
>>>> +	of counters, the system may run out of assignable resources. In
>>>> +	mbm_cntr_assign mode, MBM event counters will return "Unassigned"
>>>> +	if the counter is not allocated to the event when read. Users must
>>>> +	manually assign a counter to read the events.
>>>> +
>>>
>>> Please consider how this text could also be relevant to soft-ABMC.
>>
>> It mostly applies to soft-ABMC also. Minor tweaking may be required.
> 
> hmmm ... seems that I still have mostly the "soft-RMID" model in my head.
> 
>> How about?
>>
>> "When supported the 'mbm_cntr_assign' mode allows users to assign a
>> hardware counter to RMID, event pair, enabling bandwidth monitoring for as
> 
> hmmm ... so soft-ABMC also assigns hardware counters?


It does not have a hardware counter. I need to change this text.

> Also, we should aim for generic text that will cover how this may look on MPAM
> also. Considering this, it may just mean to replace "RMID, event pair" with
> "mon_hw_id, event pair"?

ok.

> 
>> long as the counter remains assigned. The hardware will continue tracking
>> the assigned RMID until the user manually unassigns it, ensuring
> 
> Please do double-check all usage of "RMID" in user facing interfaces/docs where
> mon_hw_id may be more appropriate.

Sure.

> 
>> that counters are not reset during this period. With a limited number
>> of counters, the system may run out of assignable counters at some point.
>> In that case, MBM event counters will return "Unassigned" when the event
>> when read. Users must manually assign a counter to read the events."
> 
> "when the event when read" -> "when the event is read"?

Sure.

-- 
- Babu Moger