[PATCH v9 07/31] x86,fs/resctrl: Refactor domain_remove_cpu_mon() ready for new domain types

Tony Luck posted 31 patches 1 month ago
There is a newer version of this series
Posted by Tony Luck 1 month ago
All monitoring events are associated with the L3 resource.

The RDT_RESOURCE_L3 resource carries a lot of state in the domain
structures which needs to be dealt with when a domain is taken offline
by removing the last CPU in the domain.

New telemetry events will be associated with a new package scoped
resource with new domain structures.

Refactor domain_remove_cpu_mon() so all the L3 processing is separate
from general actions of clearing the CPU bit in the mask and removing
sub-directories from the mon_data directory.

resctrl_offline_mon_domain() continues to remove domain specific
directories and files from the "mon_data" directories, but skips the
L3 resource specific cleanup when called for other resource types.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/resctrl/core.c | 21 +++++++++++++--------
 fs/resctrl/rdtgroup.c              |  5 ++++-
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 4db46c282b5c..f1b215b72844 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -646,20 +646,25 @@ static void domain_remove_cpu_mon(int cpu, struct rdt_resource *r)
 		return;
 	}
 
-	if (!domain_header_is_valid(hdr, RESCTRL_MON_DOMAIN, r->rid))
+	cpumask_clear_cpu(cpu, &hdr->cpu_mask);
+	if (!cpumask_empty(&hdr->cpu_mask))
 		return;
 
-	d = container_of(hdr, struct rdt_mon_domain, hdr);
-	hw_dom = resctrl_to_arch_mon_dom(d);
+	switch (r->rid) {
+	case RDT_RESOURCE_L3:
+		if (!domain_header_is_valid(hdr, RESCTRL_MON_DOMAIN, RDT_RESOURCE_L3))
+			return;
 
-	cpumask_clear_cpu(cpu, &d->hdr.cpu_mask);
-	if (cpumask_empty(&d->hdr.cpu_mask)) {
+		d = container_of(hdr, struct rdt_mon_domain, hdr);
+		hw_dom = resctrl_to_arch_mon_dom(d);
 		resctrl_offline_mon_domain(r, d);
-		list_del_rcu(&d->hdr.list);
+		list_del_rcu(&hdr->list);
 		synchronize_rcu();
 		mon_domain_free(hw_dom);
-
-		return;
+		break;
+	default:
+		pr_warn_once("Unknown resource rid=%d\n", r->rid);
+		break;
 	}
 }
 
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 77336d5e4915..05438e15e2ca 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -4047,6 +4047,9 @@ void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d
 	if (resctrl_mounted && resctrl_arch_mon_capable())
 		rmdir_mondata_subdir_allrdtgrp(r, d);
 
+	if (r->rid != RDT_RESOURCE_L3)
+		goto out_unlock;
+
 	if (resctrl_is_mbm_enabled())
 		cancel_delayed_work(&d->mbm_over);
 	if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID) && has_busy_rmid(d)) {
@@ -4063,7 +4066,7 @@ void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_domain *d
 	}
 
 	domain_destroy_mon_state(d);
-
+out_unlock:
 	mutex_unlock(&rdtgroup_mutex);
 }
 
-- 
2.50.1
Re: [PATCH v9 07/31] x86,fs/resctrl: Refactor domain_remove_cpu_mon() ready for new domain types
Posted by Reinette Chatre 3 weeks, 2 days ago
Hi Tony,

On 8/29/25 12:33 PM, Tony Luck wrote:
> All monitoring events are associated with the L3 resource.
> 
> The RDT_RESOURCE_L3 resource carries a lot of state in the domain
> structures which needs to be dealt with when a domain is taken offline
> by removing the last CPU in the domain.
> 
> New telemetry events will be associated with a new package scoped
> resource with new domain structures.
> 
> Refactor domain_remove_cpu_mon() so all the L3 processing is separate
> from general actions of clearing the CPU bit in the mask and removing
> sub-directories from the mon_data directory.
> 
> resctrl_offline_mon_domain() continues to remove domain specific
> directories and files from the "mon_data" directories, but skips the
> L3 resource specific cleanup when called for other resource types.

This part does not seem to be related to this change since up to here
this is all about refactoring L3 support while the final part starts to
add support for a new resource under the guise of "L3 refactoring".

The resctrl_offline_mon_domain() change looks more appropriate for
patch #9 "x86,fs/resctrl: Use struct rdt_domain_hdr instead of struct
rdt_mon_domain", where the change makes clear which code applies only to
an RDT_RESOURCE_L3 domain and matches the change to
resctrl_online_mon_domain() in the same patch. This is not ideal though,
since even in patch #9 this seems to be PERF_PKG enabling code (in both
resctrl_online_mon_domain() and resctrl_offline_mon_domain()) added
under another guise of refactoring.

I think it could also be argued that the related changes (code flow changes to
support only files for PERF_PKG) to resctrl_online_mon_domain()
and resctrl_offline_mon_domain() belong in patch #22, since only then is
support for PERF_PKG added, and that patch states explicitly:
"Support the PERF_PKG resource in the CPU online/offline handlers."

Doing so will keep refactoring as-is ... it is just refactoring existing flow,
and then code flow changes come later when actual enabling is done.

Reinette
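
For readers following the thread, the control flow that the patch gives
domain_remove_cpu_mon() can be modeled in userspace roughly as below. This
is only a sketch under stated assumptions: every name prefixed with model_
or MODEL_ is hypothetical and stands in for the kernel types and helpers;
nothing here is the real resctrl code, and RCU, locking and the mon_data
directory handling are left out. It shows the generic step (clear the CPU
bit, stop if CPUs remain) running first, followed by teardown dispatched on
the resource id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real resctrl definitions. */
enum model_rid { MODEL_RID_L3, MODEL_RID_OTHER };

struct model_domain {
	enum model_rid rid;
	uint64_t cpu_mask;	/* one bit per CPU, models hdr->cpu_mask */
	bool on_list;		/* models membership on the resource's domain list */
	bool l3_state_freed;	/* set once the L3-only teardown has run */
};

/* Models the L3-only teardown: offline the domain, unlink it, free it. */
static void model_l3_teardown(struct model_domain *d)
{
	d->on_list = false;
	d->l3_state_freed = true;
}

/*
 * Models the refactored flow. The generic step runs for every resource:
 * clear the CPU bit and return early while other CPUs remain. Only when the
 * last CPU is gone does the switch pick a resource-specific teardown.
 * Returns true when the domain was torn down.
 */
static bool model_remove_cpu(struct model_domain *d, unsigned int cpu)
{
	assert(cpu < 64);

	d->cpu_mask &= ~(UINT64_C(1) << cpu);
	if (d->cpu_mask)
		return false;

	switch (d->rid) {
	case MODEL_RID_L3:
		model_l3_teardown(d);
		return true;
	default:
		/* The patch warns once; this model warns on every call. */
		fprintf(stderr, "Unknown resource rid=%d\n", (int)d->rid);
		return false;
	}
}
```

In this model, removing one of two CPUs from an L3 domain leaves the domain
on the list, removing the second tears it down, and a domain with an
unrecognized resource id has its mask cleared but is otherwise left alone,
which mirrors the default: branch in the diff.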