arch/x86/kernel/cpu/resctrl/monitor.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
The issue was observed during testing on systems with multiple resctrl
domains, where tasks were dynamically moved between domains.
Users can create as many monitoring groups as the number of RMIDs supported
by the hardware. However, on AMD systems, only a limited number of RMIDs
are guaranteed to be actively tracked by the hardware. RMIDs that exceed
this limit are placed in an "Unavailable" state.
When an RMID transitions from "Unavailable" back to active, the hardware
sets the IA32_QM_CTR.Unavailable (bit 62) on the first read from
MSR_IA32_QM_CTR. This indicates that the RMID was not previously tracked.
Once the hardware begins tracking the RMID, subsequent reads from that RMID
will have the Unavailable bit cleared, as long as it remains tracked.
Problem scenario:
1. The resctrl filesystem is mounted, and a task is assigned to a
monitoring group.
$mount -t resctrl resctr /sys/fs/resctrl
$mkdir /sys/fs/resctr/mon_groups/test1
$echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/l3_mon_*/mbm_total_bytes
21323 <- Total bytes on domain 0
"Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
2. The task runs on domain 0 for a while and then moves to domain 1. The
counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/mon_data/l3_mon_*/mbm_total_bytes
7345357 <- Total bytes on domain 0
4545 <- Total bytes on domain 1
3. At some point, the RMID in domain 0 transitions to the "Unavailable"
state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/mon_data/l3_mon_*/mbm_total_bytes
"Unavailable" <- Total bytes on domain 0
434341 <- Total bytes on domain 1
4. Since the task continues to migrate between domains, it may eventually
return to domain 0.
$cat /sys/fs/resctrl/mon_groups/mon_data/l3_mon_*/mbm_total_bytes
17592178699059 <- Overflow on domain 0
3232332 <- Total bytes on domain 1
Because the RMID transitions from the “Unavailable” state to the
active state, the first read sets IA32_QM_CTR.Unavailable (bit 62).
The following read starts counting from zero, which can be lower than
the previously saved MSR value (7345357). Consequently, the kernel’s
overflow logic is triggered—it compares the previous value (7345357)
with the new, smaller value and mistakenly interprets this as a counter
overflow, adding a large delta. In reality, this is a false positive:
the counter didn’t overflow but was simply reset when the RMID
transitioned from “Unavailable” back to active.
Fix the issue by resetting the prev_msr value to zero when hardware
returns IA32_QM_CTR.Unavailable (bit 62) when the RMID becomes active from
"Unavailable".
Here is the text from APM [1].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
first QM_CTR read when it begins tracking an RMID that it was not
previously tracking. The U bit will be zero for all subsequent reads from
that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
read with the U bit set when that RMID is in use by a processor can be
considered 0 when calculating the difference with a subsequent read."
The details are documented in APM [1] available from [2].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
Bandwidth (MBM).
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
---
Tested this on multiple AMD systems, but not on Intel systems.
Need help with that. If everything goes well, this patch needs to
go to all the stable kernels.
---
arch/x86/kernel/cpu/resctrl/monitor.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c8945610d455..a685370dd160 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -242,7 +242,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
u32 unused, u32 rmid, enum resctrl_event_id eventid,
u64 *val, void *ignored)
{
+ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
int cpu = cpumask_any(&d->hdr.cpu_mask);
+ struct arch_mbm_state *am;
u64 msr_val;
u32 prmid;
int ret;
@@ -251,12 +253,21 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
prmid = logical_rmid_to_physical_rmid(cpu, rmid);
ret = __rmid_read_phys(prmid, eventid, &msr_val);
- if (ret)
- return ret;
- *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ switch (ret) {
+ case 0:
+ *val = get_corrected_val(r, d, rmid, eventid, msr_val);
+ break;
+ case -EINVAL:
+ am = get_arch_mbm_state(hw_dom, rmid, eventid);
+ if (am)
+ am->prev_msr = 0;
+ break;
+ default:
+ break;
+ }
- return 0;
+ return ret;
}
static int __cntr_id_read(u32 cntr_id, u64 *val)
--
2.34.1
Hi Babu,
Thank you for catching and preparing a fix for this issue.
On 10/6/25 4:13 PM, Babu Moger wrote:
> The issue was observed during testing on systems with multiple resctrl
> domains, where tasks were dynamically moved between domains.
(please let changelog stand on its own and not be a continuation of subject line)
>
> Users can create as many monitoring groups as the number of RMIDs supported
> by the hardware. However, on AMD systems, only a limited number of RMIDs
> are guaranteed to be actively tracked by the hardware. RMIDs that exceed
> this limit are placed in an "Unavailable" state.
>
> When an RMID transitions from "Unavailable" back to active, the hardware
> sets the IA32_QM_CTR.Unavailable (bit 62) on the first read from
(can drop "the" from "sets the IA32_QM_CTR.Unavailable (bit 62)")
(IA32_QM_CTR -> MSR_IA32_QM_CTR) (learning from touchup made during merge of ABMC work)
> MSR_IA32_QM_CTR. This indicates that the RMID was not previously tracked.
This seems like a contradiction to me. Some times, like above, it reads
that Unavailable bit is set on *first* read after counter returns to
"active" but this text (and the spec) also mentions the "Unavailable state" where
where the Unavailable bit is set on *every* read while RMID is in Unavailable
state.
For example, from the spec:
Potential causes of the U bit being set include (but are not limited to):
- RMID is not currently tracked by the hardware
> Once the hardware begins tracking the RMID, subsequent reads from that RMID
> will have the Unavailable bit cleared, as long as it remains tracked.
Can we thus expect that Unavailable bit is *always* set when RMID is not tracked?
>
> Problem scenario:
> 1. The resctrl filesystem is mounted, and a task is assigned to a
> monitoring group.
>
> $mount -t resctrl resctr /sys/fs/resctrl
(resctr -> resctrl)
> $mkdir /sys/fs/resctr/mon_groups/test1
(resctr -> resctrl)
> $echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
>
> $cat /sys/fs/resctrl/mon_groups/test1/mon_data/l3_mon_*/mbm_total_bytes
(l3_mon* -> mon_L3*)
> 21323 <- Total bytes on domain 0
> "Unavailable" <- Total bytes on domain 1
>
> Task is running on domain 0. Counter on domain 1 is "Unavailable".
This implies that the Unavailable bit is set while RMID is "in Unavailable state/
not tracked" which seems to support that this bit is always set while RMID is not
tracked, not just on first read after counter returns active.
>
> 2. The task runs on domain 0 for a while and then moves to domain 1. The
> counter starts incrementing on domain 1.
>
> $cat /sys/fs/resctrl/mon_groups/mon_data/l3_mon_*/mbm_total_bytes
(l3_mon* -> mon_L3*)
> 7345357 <- Total bytes on domain 0
> 4545 <- Total bytes on domain 1
>
>
> 3. At some point, the RMID in domain 0 transitions to the "Unavailable"
> state because the task is no longer executing in that domain.
>
> $cat /sys/fs/resctrl/mon_groups/mon_data/l3_mon_*/mbm_total_bytes
(l3_mon* -> mon_L3*)
> "Unavailable" <- Total bytes on domain 0
> 434341 <- Total bytes on domain 1
>
> 4. Since the task continues to migrate between domains, it may eventually
> return to domain 0.
>
> $cat /sys/fs/resctrl/mon_groups/mon_data/l3_mon_*/mbm_total_bytes
(l3_mon* -> mon_L3*)
> 17592178699059 <- Overflow on domain 0
> 3232332 <- Total bytes on domain 1
>
> Because the RMID transitions from the “Unavailable” state to the
This paragraph contains a few non-ascii characters ... like the quotes
around Unavailable above (and below) and a few other characters (’ and —)
below.
(This happened in ABMC work also but slipped through with Boris needing
to fix. Now I include a check for it so that it can be addressed sooner.
Please check your tools to prevent these from slipping in.)
> active state, the first read sets IA32_QM_CTR.Unavailable (bit 62).
(IA32_QM_CTR -> MSR_IA32_QM_CTR)
I expected the Unavailable bit to be set the entire time the RMID was
not tracked, not just this "first read"?
> The following read starts counting from zero, which can be lower than
> the previously saved MSR value (7345357). Consequently, the kernel’s
> overflow logic is triggered—it compares the previous value (7345357)
> with the new, smaller value and mistakenly interprets this as a counter
> overflow, adding a large delta. In reality, this is a false positive:
> the counter didn’t overflow but was simply reset when the RMID
> transitioned from “Unavailable” back to active.
>
> Fix the issue by resetting the prev_msr value to zero when hardware
"the prev_msr value" -> "arch_mbm_state::prev_msr"
Although, this can be seen from the code. It may be more useful to describe
the fix as "Reset the stored value of MSR_IA32_QM_CTR used to handle
counter overflow every time the RMID is unavailable ..."
> returns IA32_QM_CTR.Unavailable (bit 62) when the RMID becomes active from
(IA32_QM_CTR -> MSR_IA32_QM_CTR)
> "Unavailable".
Related to earlier comments I do not think "when the RMID becomes active from
Unavailable" is accurate. It seems more accurate to say that the value is
set to zero every time the RMID is unavailable/not tracked in preparation for
the RMID to be reset to zero when it starts to be tracked again.
>
> Here is the text from APM [1].
>
> "In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
> first QM_CTR read when it begins tracking an RMID that it was not
> previously tracking. The U bit will be zero for all subsequent reads from
> that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
> read with the U bit set when that RMID is in use by a processor can be
> considered 0 when calculating the difference with a subsequent read."
Indeed ... and APM also says that Unavailable bit is set on *every* read while
RMID is not tracked? :/
>
> The details are documented in APM [1] available from [2].
Seems unnecessary (copied from ABMC?)? Earlier quote can just be prefixed with:
Here is the text from APM [1] available from [2].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
> Bandwidth (MBM).
(another learning from ABMC work, this can be more readable with indentation)
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
Bandwidth (MBM).
>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
> ---
>
> Tested this on multiple AMD systems, but not on Intel systems.
> Need help with that. If everything goes well, this patch needs to
> go to all the stable kernels.
Needs a "Fixes" and "Cc: stable@vger.kernel.org"?
This patch will only apply to upstream though so would need backporting
support. Will you be able to support this?
> ---
> arch/x86/kernel/cpu/resctrl/monitor.c | 19 +++++++++++++++----
> 1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index c8945610d455..a685370dd160 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -242,7 +242,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
> u32 unused, u32 rmid, enum resctrl_event_id eventid,
> u64 *val, void *ignored)
> {
> + struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
> int cpu = cpumask_any(&d->hdr.cpu_mask);
> + struct arch_mbm_state *am;
> u64 msr_val;
> u32 prmid;
> int ret;
> @@ -251,12 +253,21 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>
> prmid = logical_rmid_to_physical_rmid(cpu, rmid);
> ret = __rmid_read_phys(prmid, eventid, &msr_val);
> - if (ret)
> - return ret;
>
> - *val = get_corrected_val(r, d, rmid, eventid, msr_val);
> + switch (ret) {
> + case 0:
> + *val = get_corrected_val(r, d, rmid, eventid, msr_val);
> + break;
> + case -EINVAL:
> + am = get_arch_mbm_state(hw_dom, rmid, eventid);
> + if (am)
> + am->prev_msr = 0;
> + break;
> + default:
> + break;
> + }
>
> - return 0;
> + return ret;
> }
>
> static int __cntr_id_read(u32 cntr_id, u64 *val)
The fix looks good to me.
Reinette
Hi, On Tue, Oct 07, 2025 at 01:23:36PM -0700, Reinette Chatre wrote: > Hi Babu, > > Thank you for catching and preparing a fix for this issue. > > On 10/6/25 4:13 PM, Babu Moger wrote: > > The issue was observed during testing on systems with multiple resctrl > > domains, where tasks were dynamically moved between domains. > > (please let changelog stand on its own and not be a continuation of subject line) [...] [Quick, drive-by observation:] Can I add that the commit message also seems way too long? I think some of the description of the problem symptom could probably be after the tearoff -- there doesn't seem to be a clear statement of what is actually wrong in the code, or of why the change made in the patch fixes it (or if there is, I struggled to find it.) I puzzled over this for some minutes before I figured out this patch is fixing something that is not upstream, yet. A statement to that effect would have helped. Possibly I didn't read carefully enough... Cheers ---Dave
Hi Dave, On 10/8/25 10:23, Dave Martin wrote: > Hi, > > On Tue, Oct 07, 2025 at 01:23:36PM -0700, Reinette Chatre wrote: >> Hi Babu, >> >> Thank you for catching and preparing a fix for this issue. >> >> On 10/6/25 4:13 PM, Babu Moger wrote: >>> The issue was observed during testing on systems with multiple resctrl >>> domains, where tasks were dynamically moved between domains. >> (please let changelog stand on its own and not be a continuation of subject line) > [...] > > [Quick, drive-by observation:] > > Can I add that the commit message also seems way too long? I agree. But problem is bit complicated, so had to give all the steps. > > I think some of the description of the problem symptom could probably > be after the tearoff -- there doesn't seem to be a clear statement of > what is actually wrong in the code, or of why the change made in the > patch fixes it (or if there is, I struggled to find it.) > > I puzzled over this for some minutes before I figured out this patch is > fixing something that is not upstream, yet. A statement to that effect > would have helped. No. Code is already in upstream. Just that it is not resetting the internal state when RMID reports "Unavailable". Updated the changes log now with brief summary in the beginning. thanks Babu
On 10/7/2025 3:23 PM, Reinette Chatre wrote:
> Hi Babu,
>
> Thank you for catching and preparing a fix for this issue.
>
> On 10/6/25 4:13 PM, Babu Moger wrote:
>> The issue was observed during testing on systems with multiple resctrl
>> domains, where tasks were dynamically moved between domains.
>
> (please let changelog stand on its own and not be a continuation of subject line)
ok. Will remove the above first line.
>
>>
>> Users can create as many monitoring groups as the number of RMIDs supported
>> by the hardware. However, on AMD systems, only a limited number of RMIDs
>> are guaranteed to be actively tracked by the hardware. RMIDs that exceed
>> this limit are placed in an "Unavailable" state.
>>
>> When an RMID transitions from "Unavailable" back to active, the hardware
>> sets the IA32_QM_CTR.Unavailable (bit 62) on the first read from
>
> (can drop "the" from "sets the IA32_QM_CTR.Unavailable (bit 62)")
>
> (IA32_QM_CTR -> MSR_IA32_QM_CTR) (learning from touchup made during merge of ABMC work)
Sure.
>> MSR_IA32_QM_CTR. This indicates that the RMID was not previously tracked.
>
> This seems like a contradiction to me. Some times, like above, it reads
> that Unavailable bit is set on *first* read after counter returns to
> "active" but this text (and the spec) also mentions the "Unavailable state" where
> where the Unavailable bit is set on *every* read while RMID is in Unavailable
> state.
You are right. Will change the text to make that clear.
>
> For example, from the spec:
> Potential causes of the U bit being set include (but are not limited to):
> - RMID is not currently tracked by the hardware
>
>> Once the hardware begins tracking the RMID, subsequent reads from that RMID
>> will have the Unavailable bit cleared, as long as it remains tracked.
>
> Can we thus expect that Unavailable bit is *always* set when RMID is not tracked?
Yes.
>
>>
>> Problem scenario:
>> 1. The resctrl filesystem is mounted, and a task is assigned to a
>> monitoring group.
>>
>> $mount -t resctrl resctr /sys/fs/resctrl
>
> (resctr -> resctrl)
Sure.
>
>> $mkdir /sys/fs/resctr/mon_groups/test1
>
> (resctr -> resctrl)
Sure.
>
>> $echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
>>
>> $cat /sys/fs/resctrl/mon_groups/test1/mon_data/l3_mon_*/mbm_total_bytes
>
> (l3_mon* -> mon_L3*)
Sure.
>
>> 21323 <- Total bytes on domain 0
>> "Unavailable" <- Total bytes on domain 1
>>
>> Task is running on domain 0. Counter on domain 1 is "Unavailable".
>
> This implies that the Unavailable bit is set while RMID is "in Unavailable state/
> not tracked" which seems to support that this bit is always set while RMID is not
> tracked, not just on first read after counter returns active.
Correct.
>
>>
>> 2. The task runs on domain 0 for a while and then moves to domain 1. The
>> counter starts incrementing on domain 1.
>>
>> $cat /sys/fs/resctrl/mon_groups/mon_data/l3_mon_*/mbm_total_bytes
>
> (l3_mon* -> mon_L3*)
Sure.
>
>> 7345357 <- Total bytes on domain 0
>> 4545 <- Total bytes on domain 1
>>
>>
>> 3. At some point, the RMID in domain 0 transitions to the "Unavailable"
>> state because the task is no longer executing in that domain.
>>
>> $cat /sys/fs/resctrl/mon_groups/mon_data/l3_mon_*/mbm_total_bytes
>
> (l3_mon* -> mon_L3*)
Sure.
>
>> "Unavailable" <- Total bytes on domain 0
>> 434341 <- Total bytes on domain 1
>>
>> 4. Since the task continues to migrate between domains, it may eventually
>> return to domain 0.
>>
>> $cat /sys/fs/resctrl/mon_groups/mon_data/l3_mon_*/mbm_total_bytes
>
> (l3_mon* -> mon_L3*)
Sure.
>
>> 17592178699059 <- Overflow on domain 0
>> 3232332 <- Total bytes on domain 1
>>
>> Because the RMID transitions from the “Unavailable” state to the
>
> This paragraph contains a few non-ascii characters ... like the quotes
> around Unavailable above (and below) and a few other characters (’ and —)
> below.
>
> (This happened in ABMC work also but slipped through with Boris needing
> to fix. Now I include a check for it so that it can be addressed sooner.
> Please check your tools to prevent these from slipping in.)
Let me check that.
>
>> active state, the first read sets IA32_QM_CTR.Unavailable (bit 62).
>
> (IA32_QM_CTR -> MSR_IA32_QM_CTR)
>
> I expected the Unavailable bit to be set the entire time the RMID was
> not tracked, not just this "first read"?
Yes. That is correct.
>
>> The following read starts counting from zero, which can be lower than
>> the previously saved MSR value (7345357). Consequently, the kernel’s
>> overflow logic is triggered—it compares the previous value (7345357)
>> with the new, smaller value and mistakenly interprets this as a counter
>> overflow, adding a large delta. In reality, this is a false positive:
>> the counter didn’t overflow but was simply reset when the RMID
>> transitioned from “Unavailable” back to active.
>>
>> Fix the issue by resetting the prev_msr value to zero when hardware
>
> "the prev_msr value" -> "arch_mbm_state::prev_msr"
> Although, this can be seen from the code. It may be more useful to describe
> the fix as "Reset the stored value of MSR_IA32_QM_CTR used to handle
> counter overflow every time the RMID is unavailable ..."
Sure.
>
>> returns IA32_QM_CTR.Unavailable (bit 62) when the RMID becomes active from
>
> (IA32_QM_CTR -> MSR_IA32_QM_CTR)
>
>> "Unavailable".
>
> Related to earlier comments I do not think "when the RMID becomes active from
> Unavailable" is accurate. It seems more accurate to say that the value is
> set to zero every time the RMID is unavailable/not tracked in preparation for
> the RMID to be reset to zero when it starts to be tracked again.
That is correct.
>
>>
>> Here is the text from APM [1].
>>
>> "In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
>> first QM_CTR read when it begins tracking an RMID that it was not
>> previously tracking. The U bit will be zero for all subsequent reads from
>> that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
>> read with the U bit set when that RMID is in use by a processor can be
>> considered 0 when calculating the difference with a subsequent read."
>
> Indeed ... and APM also says that Unavailable bit is set on *every* read while
> RMID is not tracked? :/
Yes.
>
>>
>> The details are documented in APM [1] available from [2].
>
> Seems unnecessary (copied from ABMC?)? Earlier quote can just be prefixed with:
>
> Here is the text from APM [1] available from [2].
Sure.
>
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
>> Bandwidth (MBM).
>
> (another learning from ABMC work, this can be more readable with indentation)
Sure.
>
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
> Bandwidth (MBM).
>
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
>> ---
>>
>> Tested this on multiple AMD systems, but not on Intel systems.
>> Need help with that. If everything goes well, this patch needs to
>> go to all the stable kernels.
>
> Needs a "Fixes" and "Cc: stable@vger.kernel.org"?
Hmm.. Not sure. Which commit to add Fixes in. This is AMD related.
We should have taken care when adding support to AMD. Does this commit
make sense?
Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
> This patch will only apply to upstream though so would need backporting
> support. Will you be able to support this?
Yes. I can do that.
>
>> ---
>> arch/x86/kernel/cpu/resctrl/monitor.c | 19 +++++++++++++++----
>> 1 file changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index c8945610d455..a685370dd160 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -242,7 +242,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>> u32 unused, u32 rmid, enum resctrl_event_id eventid,
>> u64 *val, void *ignored)
>> {
>> + struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
>> int cpu = cpumask_any(&d->hdr.cpu_mask);
>> + struct arch_mbm_state *am;
>> u64 msr_val;
>> u32 prmid;
>> int ret;
>> @@ -251,12 +253,21 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *d,
>>
>> prmid = logical_rmid_to_physical_rmid(cpu, rmid);
>> ret = __rmid_read_phys(prmid, eventid, &msr_val);
>> - if (ret)
>> - return ret;
>>
>> - *val = get_corrected_val(r, d, rmid, eventid, msr_val);
>> + switch (ret) {
>> + case 0:
>> + *val = get_corrected_val(r, d, rmid, eventid, msr_val);
>> + break;
>> + case -EINVAL:
>> + am = get_arch_mbm_state(hw_dom, rmid, eventid);
>> + if (am)
>> + am->prev_msr = 0;
>> + break;
>> + default:
>> + break;
>> + }
>>
>> - return 0;
>> + return ret;
>> }
>>
>> static int __cntr_id_read(u32 cntr_id, u64 *val)
>
> The fix looks good to me.
Thanks
Babu
Hi Babu,
On 10/7/25 4:47 PM, Moger, Babu wrote:
> On 10/7/2025 3:23 PM, Reinette Chatre wrote:
>> On 10/6/25 4:13 PM, Babu Moger wrote:
...
>>> ---
>>>
>>> Tested this on multiple AMD systems, but not on Intel systems.
>>> Need help with that. If everything goes well, this patch needs to
>>> go to all the stable kernels.
>>
>> Needs a "Fixes" and "Cc: stable@vger.kernel.org"?
>
> Hmm.. Not sure. Which commit to add Fixes in. This is AMD related.
> We should have taken care when adding support to AMD. Does this commit make sense?
>
> Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
This seems fair. From what I can tell this spans all LTS kernels so I expect
a couple of different variants of this fix since this code has evolved quite a lot.
Do you want this fix backported to all the LTS kernels?
It may be useful to add a note to the stable team that this patch needs changes to
apply to all kernel versions (see "Point out known problems"
in Documentation/process/stable-kernel-rules.rst)
>> This patch will only apply to upstream though so would need backporting
>> support. Will you be able to support this?
>
> Yes. I can do that.
Thank you very much.
Reinette
Hi Reinette,
On 10/7/25 22:04, Reinette Chatre wrote:
> Hi Babu,
>
> On 10/7/25 4:47 PM, Moger, Babu wrote:
>> On 10/7/2025 3:23 PM, Reinette Chatre wrote:
>>> On 10/6/25 4:13 PM, Babu Moger wrote:
> ...
>
>>>> ---
>>>>
>>>> Tested this on multiple AMD systems, but not on Intel systems.
>>>> Need help with that. If everything goes well, this patch needs to
>>>> go to all the stable kernels.
>>> Needs a "Fixes" and "Cc: stable@vger.kernel.org"?
>> Hmm.. Not sure. Which commit to add Fixes in. This is AMD related.
>> We should have taken care when adding support to AMD. Does this commit make sense?
>>
>> Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
> This seems fair. From what I can tell this spans all LTS kernels so I expect
> a couple of different variants of this fix since this code has evolved quite a lot.
> Do you want this fix backported to all the LTS kernels?
> It may be useful to add a note to the stable team that this patch needs changes to
> apply to all kernel versions (see "Point out known problems"
> in Documentation/process/stable-kernel-rules.rst)
Sure. Sending v2 will all the updates.
--
Thanks
Babu Moger
© 2016 - 2025 Red Hat, Inc.