[PATCH v3 1/9] fs/resctrl: Fix MPAM Partid parsing errors by preserving CDP state during umount

Zeng Heng posted 9 patches 2 weeks, 6 days ago
Posted by Zeng Heng 2 weeks, 6 days ago
This patch fixes a pre-existing issue in the resctrl filesystem teardown
sequence where premature clearing of cdp_enabled could lead to MPAM Partid
parsing errors.

The closid to partid conversion logic inherently depends on the global
cdp_enabled state. However, rdt_disable_ctx() clears this flag early in
the umount path, while later free_rmid() operations still reference it.
This creates a window where partid parsing operates with inconsistent CDP
state, potentially causing monitor reads to use the wrong partid mapping.
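
To illustrate the dependency, here is a minimal user-space sketch, not the
kernel code. It assumes a CDP layout in which each closid expands to a pair
of partids (data at closid * 2, code at closid * 2 + 1). The names
cdp_enabled_model, enum conf_type and closid_to_partid() are hypothetical
stand-ins for the real helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the global CDP flag. */
static bool cdp_enabled_model;

enum conf_type { CONF_NONE, CONF_CODE, CONF_DATA };

/*
 * Simplified closid -> partid conversion (assumed layout):
 * without CDP the partid equals the closid; with CDP each closid
 * owns two partids, data at closid * 2 and code at closid * 2 + 1.
 */
static uint32_t closid_to_partid(uint32_t closid, enum conf_type type)
{
	if (!cdp_enabled_model)
		return closid;

	return closid * 2 + (type == CONF_CODE ? 1 : 0);
}
```

In this model, closid 3 maps to partids 6 and 7 while CDP is on. If the flag
is cleared before the RMID of that group is freed, the same closid resolves
to partid 3, so a monitor read would target a different partition.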

Additionally, rmid_entry items remaining in limbo between mount sessions
may trigger partid out-of-range errors, leading to MPAM fault interrupts
and subsequent MPAM disablement.

Reorder rdt_kill_sb() to delay rdt_disable_ctx() until after
rmdir_all_sub() and resctrl_fs_teardown() complete. This ensures
all rmid-related operations finish with correct CDP state.

Introduce rdt_flush_limbo() to flush and cancel limbo work before the
filesystem teardown completes. An alternative approach would be to cancel
limbo work on umount and restart it on remount with a rebuilt bitmap.
However, this would require substantial changes in the resctrl layer to
handle CDP state transitions across mount sessions, which is beyond the
scope of the reqpartid feature work this patchset focuses on. The current
fix addresses the immediate correctness issue with minimal churn.
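
The flush step can be pictured with the following user-space model. It is a
sketch under stated assumptions, not the kernel implementation: struct
domain_model, check_limbo_model() and flush_limbo_model() are hypothetical
names, and the rule that an entry is released once its occupancy drops below
a threshold is an assumption about how the non-forced check behaves.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_IDX 4	/* hypothetical number of RMID indices per domain */

struct domain_model {
	bool busy[NUM_IDX];		/* entry is still in limbo */
	unsigned int occupancy[NUM_IDX];	/* last occupancy reading */
	bool limbo_work_pending;	/* stands in for the delayed work */
};

static bool has_busy_model(const struct domain_model *d)
{
	for (size_t i = 0; i < NUM_IDX; i++)
		if (d->busy[i])
			return true;
	return false;
}

/*
 * Release limbo entries whose occupancy is below the threshold, or
 * every entry when force_free is set. Returns the number released.
 */
static unsigned int check_limbo_model(struct domain_model *d,
				      unsigned int threshold, bool force_free)
{
	unsigned int released = 0;

	for (size_t i = 0; i < NUM_IDX; i++) {
		if (!d->busy[i])
			continue;
		if (force_free || d->occupancy[i] < threshold) {
			d->busy[i] = false;
			released++;
		}
	}
	return released;
}

/* Model of the flush: force-free all limbo entries, then drop the work. */
static void flush_limbo_model(struct domain_model *d)
{
	if (!has_busy_model(d))
		return;

	check_limbo_model(d, 0, true);
	d->limbo_work_pending = false;
}
```

A normal pass leaves entries that are still above the threshold in limbo,
whereas the forced pass empties limbo unconditionally, so nothing survives
into the next mount session.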

Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
 fs/resctrl/rdtgroup.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 5da305bd36c9..bc0735eef92a 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -3165,6 +3165,25 @@ static void resctrl_fs_teardown(void)
 	rdtgroup_destroy_root();
 }
 
+static void rdt_flush_limbo(void)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+	struct rdt_l3_mon_domain *d;
+
+	if (!IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
+		return;
+
+	if (!resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID))
+		return;
+
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		if (has_busy_rmid(d)) {
+			__check_limbo(d, true);
+			cancel_delayed_work(&d->cqm_limbo);
+		}
+	}
+}
+
 static void rdt_kill_sb(struct super_block *sb)
 {
 	struct rdt_resource *r;
@@ -3172,13 +3191,14 @@ static void rdt_kill_sb(struct super_block *sb)
 	cpus_read_lock();
 	mutex_lock(&rdtgroup_mutex);
 
-	rdt_disable_ctx();
-
 	/* Put everything back to default values. */
 	for_each_alloc_capable_rdt_resource(r)
 		resctrl_arch_reset_all_ctrls(r);
 
 	resctrl_fs_teardown();
+	rdt_flush_limbo();
+	rdt_disable_ctx();
+
 	if (resctrl_arch_alloc_capable())
 		resctrl_arch_disable_alloc();
 	if (resctrl_arch_mon_capable())
-- 
2.25.1
Re: [PATCH v3 1/9] fs/resctrl: Fix MPAM Partid parsing errors by preserving CDP state during umount
Posted by Ben Horgan 2 weeks, 3 days ago
Hi Zeng,

On 3/17/26 13:21, Zeng Heng wrote:
> This patch fixes a pre-existing issue in the resctrl filesystem teardown
> sequence where premature clearing of cdp_enabled could lead to MPAM Partid
> parsing errors.
> 
> The closid to partid conversion logic inherently depends on the global
> cdp_enabled state. However, rdt_disable_ctx() clears this flag early in
> the umount path, while later free_rmid() operations still reference it.
> This creates a window where partid parsing operates with inconsistent CDP
> state, potentially causing monitor reads to use the wrong partid mapping.
> 
> Additionally, rmid_entry items remaining in limbo between mount sessions
> may trigger partid out-of-range errors, leading to MPAM fault interrupts
> and subsequent MPAM disablement.
> 
> Reorder rdt_kill_sb() to delay rdt_disable_ctx() until after
> rmdir_all_sub() and resctrl_fs_teardown() complete. This ensures
> all rmid-related operations finish with correct CDP state.
> 
> Introduce rdt_flush_limbo() to flush and cancel limbo work before the
> filesystem teardown completes. An alternative approach would be to cancel

The code looks correct but it does introduce a subtle change of behaviour which
may or may not be acceptable. A busy rmid may now be allocated after remount.
Clean rmids were never guaranteed, e.g. when a domain goes offline, but this
weakens the guarantee.

> limbo work on umount and restart it on remount with a rebuilt bitmap.
> However, this would require substantial changes in the resctrl layer to
> handle CDP state transitions across mount sessions, which is beyond the
> scope of the reqpartid feature work this patchset focuses on. The current

Another option to consider is whether limbo could be replaced by checking whether
an rmid is busy at allocation.
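
One way to picture this suggestion is the hypothetical sketch below. It is
not code from the series: struct rmid_model and alloc_rmid_model() are
invented names, and treating an RMID as busy when its occupancy is at or
above a threshold is an assumption made for illustration.

```c
#include <stdint.h>

#define NUM_RMID 4	/* hypothetical table size */

/* Hypothetical per-RMID state: in-use flag and last occupancy reading. */
struct rmid_model {
	int in_use;
	uint32_t occupancy;
};

/*
 * Return the first RMID that is neither in use nor still busy
 * (occupancy at or above the threshold), or -1 if none qualifies.
 */
static int alloc_rmid_model(struct rmid_model *tbl, uint32_t threshold)
{
	for (int i = 0; i < NUM_RMID; i++) {
		if (tbl[i].in_use || tbl[i].occupancy >= threshold)
			continue;
		tbl[i].in_use = 1;
		return i;
	}
	return -1;
}
```

With such a check the busy test moves from a periodic worker to the
allocation path, so no limbo state would need to survive across mounts.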

Do your changes here to resctrl_arch_rmid_idx_encode() have an impact on how
limbo works?

Thanks,

Ben

> fix addresses the immediate correctness issue with minimal churn.
> 
> Signed-off-by: Zeng Heng <zengheng4@huawei.com>
> ---
>  fs/resctrl/rdtgroup.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index 5da305bd36c9..bc0735eef92a 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -3165,6 +3165,25 @@ static void resctrl_fs_teardown(void)
>  	rdtgroup_destroy_root();
>  }
>  
> +static void rdt_flush_limbo(void)
> +{
> +	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> +	struct rdt_l3_mon_domain *d;
> +
> +	if (!IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
> +		return;
> +
> +	if (!resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID))
> +		return;
> +
> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +		if (has_busy_rmid(d)) {
> +			__check_limbo(d, true);
> +			cancel_delayed_work(&d->cqm_limbo);
> +		}
> +	}
> +}
> +
>  static void rdt_kill_sb(struct super_block *sb)
>  {
>  	struct rdt_resource *r;
> @@ -3172,13 +3191,14 @@ static void rdt_kill_sb(struct super_block *sb)
>  	cpus_read_lock();
>  	mutex_lock(&rdtgroup_mutex);
>  
> -	rdt_disable_ctx();
> -
>  	/* Put everything back to default values. */
>  	for_each_alloc_capable_rdt_resource(r)
>  		resctrl_arch_reset_all_ctrls(r);
>  
>  	resctrl_fs_teardown();
> +	rdt_flush_limbo();
> +	rdt_disable_ctx();
> +
>  	if (resctrl_arch_alloc_capable())
>  		resctrl_arch_disable_alloc();
>  	if (resctrl_arch_mon_capable())
Re: [PATCH v3 1/9] fs/resctrl: Fix MPAM Partid parsing errors by preserving CDP state during umount
Posted by Zeng Heng 2 weeks, 2 days ago
Hi Ben,

On 2026/3/21 1:07, Ben Horgan wrote:
> Hi Zeng,
> 
> On 3/17/26 13:21, Zeng Heng wrote:
>> This patch fixes a pre-existing issue in the resctrl filesystem teardown
>> sequence where premature clearing of cdp_enabled could lead to MPAM Partid
>> parsing errors.
>>
>> The closid to partid conversion logic inherently depends on the global
>> cdp_enabled state. However, rdt_disable_ctx() clears this flag early in
>> the umount path, while later free_rmid() operations still reference it.
>> This creates a window where partid parsing operates with inconsistent CDP
>> state, potentially causing monitor reads to use the wrong partid mapping.
>>
>> Additionally, rmid_entry items remaining in limbo between mount sessions
>> may trigger partid out-of-range errors, leading to MPAM fault interrupts
>> and subsequent MPAM disablement.
>>
>> Reorder rdt_kill_sb() to delay rdt_disable_ctx() until after
>> rmdir_all_sub() and resctrl_fs_teardown() complete. This ensures
>> all rmid-related operations finish with correct CDP state.
>>
>> Introduce rdt_flush_limbo() to flush and cancel limbo work before the
>> filesystem teardown completes. An alternative approach would be to cancel
> 
> The code looks correct but it does introduce a subtle change of behaviour which
> may or may not be acceptable. A busy rmid may now be allocated after remount.
> Clean rmids were never guaranteed, e.g. when a domain goes offline, but this
> weakens the guarantee.

Yes, this would indeed weaken MPAM's guarantee for clean RMIDs.

Hopefully no one in production repeatedly switches resctrl mount modes
while monitoring workloads (which sounds more like testing to me) and
still expects strict guarantees of clean RMID allocation.

> 
>> limbo work on umount and restart it on remount with a rebuilt bitmap.
>> However, this would require substantial changes in the resctrl layer to
>> handle CDP state transitions across mount sessions, which is beyond the
>> scope of the reqpartid feature work this patchset focuses on. The current
> 
> Another option to consider is whether limbo could be replaced by checking whether
> an rmid is busy at allocation.
> 
> Do your changes here to resctrl_arch_rmid_idx_encode() have an impact on how
> limbo works?


In follow-up patches, resctrl_arch_rmid_idx_encode() also needs to
depend on the CDP state because it has to resolve the intpartid and
reqpartid. RMIDs left in limbo across remount sessions are therefore
also exposed to the parsing error.


Best Regards,
Zeng Heng


> 
> Thanks,
> 
> Ben
> 
>> fix addresses the immediate correctness issue with minimal churn.
>>
>> Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Re: [PATCH v3 1/9] fs/resctrl: Fix MPAM Partid parsing errors by preserving CDP state during umount
Posted by Zeng Heng 2 weeks, 2 days ago

On 2026/3/21 12:11, Zeng Heng wrote:
> Hi Ben,
> 
> On 2026/3/21 1:07, Ben Horgan wrote:
>> Hi Zeng,
>>
>> On 3/17/26 13:21, Zeng Heng wrote:
>>> This patch fixes a pre-existing issue in the resctrl filesystem teardown
>>> sequence where premature clearing of cdp_enabled could lead to MPAM 
>>> Partid
>>> parsing errors.
>>>
>>> The closid to partid conversion logic inherently depends on the global
>>> cdp_enabled state. However, rdt_disable_ctx() clears this flag early in
>>> the umount path, while later free_rmid() operations still reference it.
>>> This creates a window where partid parsing operates with inconsistent CDP
>>> state, potentially causing monitor reads to use the wrong partid mapping.
>>>
>>> Additionally, rmid_entry items remaining in limbo between mount sessions
>>> may trigger partid out-of-range errors, leading to MPAM fault interrupts
>>> and subsequent MPAM disablement.
>>>
>>> Reorder rdt_kill_sb() to delay rdt_disable_ctx() until after
>>> rmdir_all_sub() and resctrl_fs_teardown() complete. This ensures
>>> all rmid-related operations finish with correct CDP state.
>>>
>>> Introduce rdt_flush_limbo() to flush and cancel limbo work before the
>>> filesystem teardown completes. An alternative approach would be to 
>>> cancel
>>
>> The code looks correct but it does introduce a subtle change of 
>> behaviour which
>> may or may not be acceptable. A busy rmid may now be allocated after 
>> remount.
>> Clean rmids were never guaranteed, e.g. when a domain goes offline, 
>> but this
>> weakens the guarantee.
> 
> Yes, this would indeed weaken MPAM's guarantee for clean RMIDs.
> 
> Hopefully no one in production repeatedly switches resctrl mount modes
> while monitoring workloads (which sounds more like testing to me) and
> still expects strict guarantees of clean RMID allocation.
> 
>>
>>> limbo work on umount and restart it on remount with a rebuilt bitmap.
>>> However, this would require substantial changes in the resctrl layer to
>>> handle CDP state transitions across mount sessions, which is beyond the
>>> scope of the reqpartid feature work this patchset focuses on. The 
>>> current
>>
>> Another option to consider is whether limbo could be replaced by 
>> checking whether
>> an rmid is busy at allocation.
>>
>> Do your changes here to resctrl_arch_rmid_idx_encode() have an impact 
>> on how
>> limbo works?
> 
> 
> In follow-up patches, resctrl_arch_rmid_idx_encode() also needs to
> depend on the CDP state because it has to resolve the intpartid and
> reqpartid. RMIDs left in limbo across remount sessions are therefore
> also exposed to the parsing error.
> 
> 

For this reason, I had to make this patch a prerequisite fix in the
patch series.


Best Regards,
Zeng Heng