[PATCH v8 3/7] x86/resctrl: Refactor mbm_update()

Tony Luck posted 7 patches 3 weeks, 5 days ago
There is a newer version of this series
[PATCH v8 3/7] x86/resctrl: Refactor mbm_update()
Posted by Tony Luck 3 weeks, 5 days ago
Computing memory bandwidth for all enabled events resulted in
identical code blocks for total and local bandwidth in mbm_update().

Refactor with a helper function to eliminate code duplication.

No functional change.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 69 ++++++++++-----------------
 1 file changed, 24 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 3ef339e405c2..1b6cb3bbc008 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -829,62 +829,41 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
 	resctrl_arch_update_one(r_mba, dom_mba, closid, CDP_NONE, new_msr_val);
 }
 
-static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
-		       u32 closid, u32 rmid)
+static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
+				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
 {
 	struct rmid_read rr = {0};
 
 	rr.r = r;
 	rr.d = d;
+	rr.evtid = evtid;
+	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
+	if (IS_ERR(rr.arch_mon_ctx)) {
+		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
+				    PTR_ERR(rr.arch_mon_ctx));
+		return;
+	}
+
+	__mon_event_count(closid, rmid, &rr);
+
+	if (is_mba_sc(NULL))
+		mbm_bw_count(closid, rmid, &rr);
+
+	resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
+}
 
+static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
+		       u32 closid, u32 rmid)
+{
 	/*
 	 * This is protected from concurrent reads from user
 	 * as both the user and we hold the global mutex.
 	 */
-	if (is_mbm_total_enabled()) {
-		rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID;
-		rr.val = 0;
-		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
-		if (IS_ERR(rr.arch_mon_ctx)) {
-			pr_warn_ratelimited("Failed to allocate monitor context: %ld",
-					    PTR_ERR(rr.arch_mon_ctx));
-			return;
-		}
-
-		__mon_event_count(closid, rmid, &rr);
-
-		/*
-		 * Call the MBA software controller only for the
-		 * control groups and when user has enabled
-		 * the software controller explicitly.
-		 */
-		if (is_mba_sc(NULL))
-			mbm_bw_count(closid, rmid, &rr);
-
-		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
-	}
-	if (is_mbm_local_enabled()) {
-		rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID;
-		rr.val = 0;
-		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
-		if (IS_ERR(rr.arch_mon_ctx)) {
-			pr_warn_ratelimited("Failed to allocate monitor context: %ld",
-					    PTR_ERR(rr.arch_mon_ctx));
-			return;
-		}
-
-		__mon_event_count(closid, rmid, &rr);
-
-		/*
-		 * Call the MBA software controller only for the
-		 * control groups and when user has enabled
-		 * the software controller explicitly.
-		 */
-		if (is_mba_sc(NULL))
-			mbm_bw_count(closid, rmid, &rr);
+	if (is_mbm_total_enabled())
+		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID);
 
-		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
-	}
+	if (is_mbm_local_enabled())
+		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID);
 }
 
 /*
-- 
2.47.0
Re: [PATCH v8 3/7] x86/resctrl: Refactor mbm_update()
Posted by Reinette Chatre 1 week, 4 days ago
Hi Tony,

On 10/29/24 10:28 AM, Tony Luck wrote:
> Computing memory bandwidth for all enabled events resulted in
> identical code blocks for total and local bandwidth in mbm_update().
> 
> Refactor with a helper function to eliminate code duplication.
> 
> No functional change.
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kernel/cpu/resctrl/monitor.c | 69 ++++++++++-----------------
>  1 file changed, 24 insertions(+), 45 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 3ef339e405c2..1b6cb3bbc008 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -829,62 +829,41 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
>  	resctrl_arch_update_one(r_mba, dom_mba, closid, CDP_NONE, new_msr_val);
>  }
>  
> -static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
> -		       u32 closid, u32 rmid)
> +static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> +				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
>  {
>  	struct rmid_read rr = {0};
>  
>  	rr.r = r;
>  	rr.d = d;
> +	rr.evtid = evtid;
> +	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
> +	if (IS_ERR(rr.arch_mon_ctx)) {
> +		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> +				    PTR_ERR(rr.arch_mon_ctx));
> +		return;
> +	}
> +
> +	__mon_event_count(closid, rmid, &rr);
> +
> +	if (is_mba_sc(NULL))
> +		mbm_bw_count(closid, rmid, &rr);
> +

As I am staring at this more there seems to be an existing issue here ... note how
__mon_event_count()'s return value is not checked before mbm_bw_count() is called.
This means that mbm_bw_count() may run with rr.val of 0 that results in wraparound
inside it resulting in some unexpected bandwidth numbers. Since a counter read can fail
with a "Unavailable"/"Error" from hardware it is not deterministic how frequently this
issue can be encountered.

Skipping mbm_bw_count() if rr.val is 0 is one option ... that would keep the bandwidth
measurement static at whatever was the last successful read and thus not cause dramatic
changes by the software controller ... setting bandwidth to 0 if rr.val is 0 is another
option to reflect that bandwidth data is unavailable, but then the software controller should
perhaps get signal to not make adjustments? I expect there are better options? What do
you think?

Reinette
Re: [PATCH v8 3/7] x86/resctrl: Refactor mbm_update()
Posted by Tony Luck 1 week, 4 days ago
On Wed, Nov 13, 2024 at 02:25:53PM -0800, Reinette Chatre wrote:
> Hi Tony,
> 
> On 10/29/24 10:28 AM, Tony Luck wrote:
> > Computing memory bandwidth for all enabled events resulted in
> > identical code blocks for total and local bandwidth in mbm_update().
> > 
> > Refactor with a helper function to eliminate code duplication.
> > 
> > No functional change.
> > 
> > Signed-off-by: Tony Luck <tony.luck@intel.com>
> > ---
> >  arch/x86/kernel/cpu/resctrl/monitor.c | 69 ++++++++++-----------------
> >  1 file changed, 24 insertions(+), 45 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> > index 3ef339e405c2..1b6cb3bbc008 100644
> > --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> > +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> > @@ -829,62 +829,41 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
> >  	resctrl_arch_update_one(r_mba, dom_mba, closid, CDP_NONE, new_msr_val);
> >  }
> >  
> > -static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
> > -		       u32 closid, u32 rmid)
> > +static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> > +				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
> >  {
> >  	struct rmid_read rr = {0};
> >  
> >  	rr.r = r;
> >  	rr.d = d;
> > +	rr.evtid = evtid;
> > +	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
> > +	if (IS_ERR(rr.arch_mon_ctx)) {
> > +		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> > +				    PTR_ERR(rr.arch_mon_ctx));
> > +		return;
> > +	}
> > +
> > +	__mon_event_count(closid, rmid, &rr);
> > +
> > +	if (is_mba_sc(NULL))
> > +		mbm_bw_count(closid, rmid, &rr);
> > +
> 
> As I am staring at this more there seems to be an existing issue here ... note how
> __mon_event_count()'s return value is not checked before mbm_bw_count() is called.
> This means that mbm_bw_count() may run with rr.val of 0 that results in wraparound
> inside it resulting in some unexpected bandwidth numbers. Since a counter read can fail
> with a "Unavailable"/"Error" from hardware it is not deterministic how frequently this
> issue can be encountered.
> 
> Skipping mbm_bw_count() if rr.val is 0 is one option ... that would keep the bandwidth
> measurement static at whatever was the last successful read and thus not cause dramatic
> changes by the software controller ... setting bandwidth to 0 if rr.val is 0 is another
> option to reflect that bandwidth data is unavailable, but then the software controller should
> perhaps get signal to not make adjustments? I expect there are better options? What do
> you think?

Skipping mbm_bw_count() is also undesirable. If some later
__mon_event_count() does succeed the bandwidth will be computed
based on the last and current values as if they were one second
apart, when actually some longer interval elapsed.

I don't think this is a big issue for current Intel CPU RDT
implementations because I don't think they will return the
bit 62 unavailable value in the IA32_QM_CTR MSR. I'll ask
around to check.

But it does mean that implementing the "summary bandwidth"
file discussed in the other e-mail thread[1] may be more
complex on systems that can return that a counter is
unavailable. We'd have to keep track that two succesful
counter reads occured, with a measure of the interval
between them before reporting a value in the summary file.

-Tony

[1] https://lore.kernel.org/all/CALPaoCjCWZ4ZYfwooFEzMn15jJM7s9Rfq83YhorOGUD=1GdSyw@mail.gmail.com/
Re: [PATCH v8 3/7] x86/resctrl: Refactor mbm_update()
Posted by Peter Newman 1 week, 3 days ago
Hi Tony,

On Wed, Nov 13, 2024 at 11:58 PM Tony Luck <tony.luck@intel.com> wrote:
>
> On Wed, Nov 13, 2024 at 02:25:53PM -0800, Reinette Chatre wrote:
> > Hi Tony,
> >
> > On 10/29/24 10:28 AM, Tony Luck wrote:
> > > Computing memory bandwidth for all enabled events resulted in
> > > identical code blocks for total and local bandwidth in mbm_update().
> > >
> > > Refactor with a helper function to eliminate code duplication.
> > >
> > > No functional change.
> > >
> > > Signed-off-by: Tony Luck <tony.luck@intel.com>
> > > ---
> > >  arch/x86/kernel/cpu/resctrl/monitor.c | 69 ++++++++++-----------------
> > >  1 file changed, 24 insertions(+), 45 deletions(-)
> > >
> > > diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> > > index 3ef339e405c2..1b6cb3bbc008 100644
> > > --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> > > +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> > > @@ -829,62 +829,41 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
> > >     resctrl_arch_update_one(r_mba, dom_mba, closid, CDP_NONE, new_msr_val);
> > >  }
> > >
> > > -static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
> > > -                  u32 closid, u32 rmid)
> > > +static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> > > +                            u32 closid, u32 rmid, enum resctrl_event_id evtid)
> > >  {
> > >     struct rmid_read rr = {0};
> > >
> > >     rr.r = r;
> > >     rr.d = d;
> > > +   rr.evtid = evtid;
> > > +   rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
> > > +   if (IS_ERR(rr.arch_mon_ctx)) {
> > > +           pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> > > +                               PTR_ERR(rr.arch_mon_ctx));
> > > +           return;
> > > +   }
> > > +
> > > +   __mon_event_count(closid, rmid, &rr);
> > > +
> > > +   if (is_mba_sc(NULL))
> > > +           mbm_bw_count(closid, rmid, &rr);
> > > +
> >
> > As I am staring at this more there seems to be an existing issue here ... note how
> > __mon_event_count()'s return value is not checked before mbm_bw_count() is called.
> > This means that mbm_bw_count() may run with rr.val of 0 that results in wraparound
> > inside it resulting in some unexpected bandwidth numbers. Since a counter read can fail
> > with a "Unavailable"/"Error" from hardware it is not deterministic how frequently this
> > issue can be encountered.
> >
> > Skipping mbm_bw_count() if rr.val is 0 is one option ... that would keep the bandwidth
> > measurement static at whatever was the last successful read and thus not cause dramatic
> > changes by the software controller ... setting bandwidth to 0 if rr.val is 0 is another
> > option to reflect that bandwidth data is unavailable, but then the software controller should
> > perhaps get signal to not make adjustments? I expect there are better options? What do
> > you think?
>
> Skipping mbm_bw_count() is also undesirable. If some later
> __mon_event_count() does succeed the bandwidth will be computed
> based on the last and current values as if they were one second
> apart, when actually some longer interval elapsed.
>
> I don't think this is a big issue for current Intel CPU RDT
> implementations because I don't think they will return the
> bit 62 unavailable value in the IA32_QM_CTR MSR. I'll ask
> around to check.
>
> But it does mean that implementing the "summary bandwidth"
> file discussed in the other e-mail thread[1] may be more
> complex on systems that can return that a counter is
> unavailable. We'd have to keep track that two succesful
> counter reads occured, with a measure of the interval
> between them before reporting a value in the summary file.

At least for the purposes of reporting the mbps rate to userspace, my
users will record from the files one per second, so it would be fine
to just report Unavailable (or Unassigned when it's clear that the
error is because counter wasn't unassigned) whenever either the
current or previous reading was not successful. Then they can assume
the value or error reported always refers to the most
recently-completed one-second window.

-Peter
Re: [PATCH v8 3/7] x86/resctrl: Refactor mbm_update()
Posted by Tony Luck 1 week, 3 days ago
On Thu, Nov 14, 2024 at 11:31:25AM +0100, Peter Newman wrote:
> At least for the purposes of reporting the mbps rate to userspace, my
> users will record from the files one per second, so it would be fine
> to just report Unavailable (or Unassigned when it's clear that the
> error is because counter wasn't unassigned) whenever either the
> current or previous reading was not successful. Then they can assume
> the value or error reported always refers to the most
> recently-completed one-second window.

First hack at keeping status for memory bandwidth values. Simple
state machine.  Compile tested.


diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 6345ab3e0890..80de5c6b0754 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -359,14 +359,22 @@ struct rftype {
 			 char *buf, size_t nbytes, loff_t off);
 };
 
+enum mbm_state_status {
+	MBM_INVALID,
+	MBM_ONE_VALUE,
+	MBM_VALID
+};
+
 /**
  * struct mbm_state - status for each MBM counter in each domain
  * @prev_bw_bytes: Previous bytes value read for bandwidth calculation
  * @prev_bw:	The most recent bandwidth in MBps
+ * @status:	Validity of @prev_bw
  */
 struct mbm_state {
-	u64	prev_bw_bytes;
-	u32	prev_bw;
+	u64			prev_bw_bytes;
+	u32			prev_bw;
+	enum mbm_state_status	status;
 };
 
 /**
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index da4ae21350c8..767a526af2f5 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -670,6 +670,17 @@ static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
 	if (WARN_ON_ONCE(!m))
 		return;
 
+	if (rr->err || !rr->val) {
+		m->status = MBM_INVALID;
+		return;
+	}
+
+	if (m->status == MBM_INVALID) {
+		m->status = MBM_ONE_VALUE;
+		m->prev_bw_bytes = rr->val;
+		return;
+	}
+
 	cur_bytes = rr->val;
 	bytes = cur_bytes - m->prev_bw_bytes;
 	m->prev_bw_bytes = cur_bytes;
@@ -677,6 +688,7 @@ static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
 	cur_bw = bytes / SZ_1M;
 
 	m->prev_bw = cur_bw;
+	m->status = MBM_VALID;
 }
 
 /*
@@ -781,6 +793,9 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
 	if (WARN_ON_ONCE(!pmbm_data))
 		return;
 
+	if (pmbm_data->status != MBM_VALID)
+		return;
+
 	dom_mba = get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
 	if (!dom_mba) {
 		pr_warn_once("Failure to get domain for MBA update\n");
@@ -801,6 +816,9 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
 		cmbm_data = get_mbm_state(dom_mbm, entry->closid, entry->mon.rmid, evt_id);
 		if (WARN_ON_ONCE(!cmbm_data))
 			return;
+		if (cmbm_data->status != MBM_VALID)
+			return;
+
 		cur_bw += cmbm_data->prev_bw;
 	}
Re: [PATCH v8 3/7] x86/resctrl: Refactor mbm_update()
Posted by Reinette Chatre 1 week, 4 days ago
Hi Tony,

On 11/13/24 2:58 PM, Tony Luck wrote:
> On Wed, Nov 13, 2024 at 02:25:53PM -0800, Reinette Chatre wrote:
>> Hi Tony,
>>
>> On 10/29/24 10:28 AM, Tony Luck wrote:
>>> Computing memory bandwidth for all enabled events resulted in
>>> identical code blocks for total and local bandwidth in mbm_update().
>>>
>>> Refactor with a helper function to eliminate code duplication.
>>>
>>> No functional change.
>>>
>>> Signed-off-by: Tony Luck <tony.luck@intel.com>
>>> ---
>>>  arch/x86/kernel/cpu/resctrl/monitor.c | 69 ++++++++++-----------------
>>>  1 file changed, 24 insertions(+), 45 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>>> index 3ef339e405c2..1b6cb3bbc008 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>>> @@ -829,62 +829,41 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
>>>  	resctrl_arch_update_one(r_mba, dom_mba, closid, CDP_NONE, new_msr_val);
>>>  }
>>>  
>>> -static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
>>> -		       u32 closid, u32 rmid)
>>> +static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>>> +				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
>>>  {
>>>  	struct rmid_read rr = {0};
>>>  
>>>  	rr.r = r;
>>>  	rr.d = d;
>>> +	rr.evtid = evtid;
>>> +	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
>>> +	if (IS_ERR(rr.arch_mon_ctx)) {
>>> +		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
>>> +				    PTR_ERR(rr.arch_mon_ctx));
>>> +		return;
>>> +	}
>>> +
>>> +	__mon_event_count(closid, rmid, &rr);
>>> +
>>> +	if (is_mba_sc(NULL))
>>> +		mbm_bw_count(closid, rmid, &rr);
>>> +
>>
>> As I am staring at this more there seems to be an existing issue here ... note how
>> __mon_event_count()'s return value is not checked before mbm_bw_count() is called.
>> This means that mbm_bw_count() may run with rr.val of 0 that results in wraparound
>> inside it resulting in some unexpected bandwidth numbers. Since a counter read can fail
>> with a "Unavailable"/"Error" from hardware it is not deterministic how frequently this
>> issue can be encountered.
>>
>> Skipping mbm_bw_count() if rr.val is 0 is one option ... that would keep the bandwidth
>> measurement static at whatever was the last successful read and thus not cause dramatic
>> changes by the software controller ... setting bandwidth to 0 if rr.val is 0 is another
>> option to reflect that bandwidth data is unavailable, but then the software controller should
>> perhaps get signal to not make adjustments? I expect there are better options? What do
>> you think?
> 
> Skipping mbm_bw_count() is also undesirable. If some later
> __mon_event_count() does succeed the bandwidth will be computed
> based on the last and current values as if they were one second
> apart, when actually some longer interval elapsed.

Indeed.

> 
> I don't think this is a big issue for current Intel CPU RDT
> implementations because I don't think they will return the
> bit 62 unavailable value in the IA32_QM_CTR MSR. I'll ask
> around to check.

Thank you very much for confirming this. 

> 
> But it does mean that implementing the "summary bandwidth"
> file discussed in the other e-mail thread[1] may be more
> complex on systems that can return that a counter is
> unavailable. We'd have to keep track that two succesful
> counter reads occured, with a measure of the interval
> between them before reporting a value in the summary file.

Looking at expanding the scope of mbm_bw_count() beyond software
controller as well as beyond Intel to support [1] is indeed why I
am looking at this code more.

Reinette
Re: [PATCH v8 3/7] x86/resctrl: Refactor mbm_update()
Posted by Fenghua Yu 3 weeks, 2 days ago
Hi, Tony,

On 10/29/24 10:28, Tony Luck wrote:
> Computing memory bandwidth for all enabled events resulted in
> identical code blocks for total and local bandwidth in mbm_update().
> 
> Refactor with a helper function to eliminate code duplication.
> 
> No functional change.
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>   arch/x86/kernel/cpu/resctrl/monitor.c | 69 ++++++++++-----------------
>   1 file changed, 24 insertions(+), 45 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 3ef339e405c2..1b6cb3bbc008 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -829,62 +829,41 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
>   	resctrl_arch_update_one(r_mba, dom_mba, closid, CDP_NONE, new_msr_val);
>   }
>   
> -static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
> -		       u32 closid, u32 rmid)
> +static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> +				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
>   {
>   	struct rmid_read rr = {0};
>   
>   	rr.r = r;
>   	rr.d = d;
> +	rr.evtid = evtid;
> +	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
> +	if (IS_ERR(rr.arch_mon_ctx)) {
> +		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> +				    PTR_ERR(rr.arch_mon_ctx));
> +		return;
> +	}
> +
> +	__mon_event_count(closid, rmid, &rr);
> +

The comment (added in patch 2 and legacy code) is removed completely here.

Maybe it's better to remain the comment here?

> +	if (is_mba_sc(NULL))
> +		mbm_bw_count(closid, rmid, &rr);
> +
> +	resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
> +}
>   
> +static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
> +		       u32 closid, u32 rmid)
> +{
>   	/*
>   	 * This is protected from concurrent reads from user
>   	 * as both the user and we hold the global mutex.
>   	 */
> -	if (is_mbm_total_enabled()) {
> -		rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID;
> -		rr.val = 0;
> -		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
> -		if (IS_ERR(rr.arch_mon_ctx)) {
> -			pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> -					    PTR_ERR(rr.arch_mon_ctx));
> -			return;
> -		}
> -
> -		__mon_event_count(closid, rmid, &rr);
> -
> -		/*
> -		 * Call the MBA software controller only for the
> -		 * control groups and when user has enabled
> -		 * the software controller explicitly.
> -		 */

Same comment was added in patch 2 but is removed in the helper. Maybe 
it's better to add back the comment in the helper.

> -		if (is_mba_sc(NULL))
> -			mbm_bw_count(closid, rmid, &rr);
> -
> -		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
> -	}
> -	if (is_mbm_local_enabled()) {
> -		rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID;
> -		rr.val = 0;
> -		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
> -		if (IS_ERR(rr.arch_mon_ctx)) {
> -			pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> -					    PTR_ERR(rr.arch_mon_ctx));
> -			return;
> -		}
> -
> -		__mon_event_count(closid, rmid, &rr);
> -
> -		/*
> -		 * Call the MBA software controller only for the
> -		 * control groups and when user has enabled
> -		 * the software controller explicitly.
> -		 */

This same comment is in legacy code and is removed in the helper.

> -		if (is_mba_sc(NULL))
> -			mbm_bw_count(closid, rmid, &rr);
> +	if (is_mbm_total_enabled())
> +		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID);
>   
> -		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
> -	}
> +	if (is_mbm_local_enabled())
> +		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID);
>   }
>   
>   /*

Thanks.

-Fenghua
Re: [PATCH v8 3/7] x86/resctrl: Refactor mbm_update()
Posted by Tony Luck 3 weeks, 2 days ago
On Fri, Nov 01, 2024 at 03:08:27PM -0700, Fenghua Yu wrote:
> > -		/*
> > -		 * Call the MBA software controller only for the
> > -		 * control groups and when user has enabled
> > -		 * the software controller explicitly.
> > -		 */
> 
> Same comment was added in patch 2 but is removed in the helper. Maybe it's
> better to add back the comment in the helper.
> 
> > -		if (is_mba_sc(NULL))
> > -			mbm_bw_count(closid, rmid, &rr);

In patch 2 I cut & pasted the comment along with the code from
the if (is_mbm_local()) {...}  clause into the if (is_mbm_total()) { ... }
clause. Maybe to make it more obvious that the code was duplicated.

But I had doubts about the usefulness/accuracy when making the
helper function.

Breaking the comment into pieces:

    "Call the MBA software controller"

This code isn't calling the "controller". To me that's the
update_mba_bw() function that adjusts the MBA values. The
call to mbm_bw_count() is simply computing the bandwidth.

    "only for the control groups"

while this is true, the test in the code here "if (is_mba_sc(NULL))"
isn't making any checks about control groups.

    "and when user has enabled the software controller"

Accurate.

    "explicitly"

Redundant (or confusing, there is no "implicit" way to enable
the s/w controller).


If a comment is needed (and I'm not convinced that it is)
maybe something like:

	/*
	 * If the software controller controller is enabled, compute the
	 * bandwidth for this event id.
	 */

-Tony