arm_mpam: Try reading again if MPAM instance returns not ready

[PATCH] arm_mpam: Try reading again if MPAM instance returns not ready

Posted by Zeng Heng 4 months, 3 weeks ago

After updating the monitor configuration, the first read of the monitoring
result requires waiting for the "not ready" duration before an effective
value can be obtained.

Because a component consists of multiple MPAM instances, the driver should
wait for the "not ready" period of each instance after updating that
instance's configuration before a valid monitoring value can be obtained,
rather than waiting for a single interval per component.

Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
It's fine to merge this patch directly into patch 7 of the corresponding
patchset.
---
 drivers/resctrl/mpam_devices.c | 36 +++++++++++++++-------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 2962cd018207..e79a46646863 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -1183,11 +1183,14 @@ static void __ris_msmon_read(void *arg)
 	}

 	*m->val += now;
+	m->err = 0;
 }

 static int _msmon_read(struct mpam_component *comp, struct mon_read *arg)
 {
 	int err, idx;
+	bool read_again;
+	u64 wait_jiffies;
 	struct mpam_msc *msc;
 	struct mpam_vmsc *vmsc;
 	struct mpam_msc_ris *ris;
@@ -1198,10 +1201,22 @@ static int _msmon_read(struct mpam_component *comp, struct mon_read *arg)

 		list_for_each_entry_rcu(ris, &vmsc->ris, vmsc_list) {
 			arg->ris = ris;
+			read_again = false;
+again:

 			err = smp_call_function_any(&msc->accessibility,
 						    __ris_msmon_read, arg,
 						    true);
+			if (arg->err == -EBUSY && !read_again) {
+				read_again = true;
+
+				wait_jiffies = usecs_to_jiffies(comp->class->nrdy_usec);
+				while (wait_jiffies)
+					wait_jiffies = schedule_timeout_uninterruptible(wait_jiffies);
+
+				goto again;
+			}
+
 			if (!err && arg->err)
 				err = arg->err;
 			if (err)
@@ -1218,9 +1233,7 @@ static int _msmon_read(struct mpam_component *comp, struct mon_read *arg)
 int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
 		    enum mpam_device_features type, u64 *val)
 {
-	int err;
 	struct mon_read arg;
-	u64 wait_jiffies = 0;
 	struct mpam_props *cprops = &comp->class->props;

 	might_sleep();
@@ -1237,24 +1250,7 @@ int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
 	arg.val = val;
 	*val = 0;

-	err = _msmon_read(comp, &arg);
-	if (err == -EBUSY && comp->class->nrdy_usec)
-		wait_jiffies = usecs_to_jiffies(comp->class->nrdy_usec);
-
-	while (wait_jiffies)
-		wait_jiffies = schedule_timeout_uninterruptible(wait_jiffies);
-
-	if (err == -EBUSY) {
-		memset(&arg, 0, sizeof(arg));
-		arg.ctx = ctx;
-		arg.type = type;
-		arg.val = val;
-		*val = 0;
-
-		err = _msmon_read(comp, &arg);
-	}
-
-	return err;
+	return _msmon_read(comp, &arg);
 }

 void mpam_msmon_reset_mbwu(struct mpam_component *comp, struct mon_cfg *ctx)
--
2.25.1

Re: [PATCH] arm_mpam: Try reading again if MPAM instance returns not ready

Posted by James Morse 4 months, 3 weeks ago

Hi Zeng,

On 16/09/2025 14:17, Zeng Heng wrote:
> After updating the monitor configuration, the first read of the monitoring
> result requires waiting for the "not ready" duration before an effective
> value can be obtained.

May need to wait - some platforms need to do this, some don't.
Yours is the first I've heard of that does this!


> Because a component consists of multiple MPAM instances, the driver should
> wait for the "not ready" period of each instance after updating that
> instance's configuration before a valid monitoring value can be obtained,
> rather than waiting for a single interval per component.

I'm really trying to avoid that ... if you have ~200 MSC pretending to be one thing, you'd
wait 200x the maximum period. On systems with CMN, the number of MSC scales with the
number of CPUs, so 200x isn't totally crazy.
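
To put rough numbers on that, here is a standalone sketch in plain C (not driver code). The helper names and the figures in the usage note are made up for illustration; the only assumption taken from the thread is that each instance may need up to one "not ready" period after it is reprogrammed.

```c
#include <stdint.h>

/*
 * Hypothetical helper: worst-case sleep when the caller waits out the
 * not-ready period of each instance in turn before moving to the next.
 */
static uint64_t worst_case_wait_per_instance_usec(uint32_t nr_msc,
						  uint32_t nrdy_usec)
{
	return (uint64_t)nr_msc * nrdy_usec;
}

/*
 * Hypothetical helper: worst-case sleep when every instance is
 * reprogrammed first and the caller then waits a single period.
 */
static uint64_t worst_case_wait_once_usec(uint32_t nr_msc,
					  uint32_t nrdy_usec)
{
	return nr_msc ? nrdy_usec : 0;
}
```

With 200 instances and an illustrative 1000us not-ready period, the first helper gives 200000us of sleeping per read and the second gives 1000us.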

I think the real problem here is the driver doesn't go on to reconfigure MSC-2 if MSC-1
returned not-ready, meaning the "I'll only wait once" logic kicks in and returns not-ready
to the user. (which is presumably what you're seeing?)

Does this solve your problem?:
-----------------%<-----------------
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 404bd4c1fd5e..2f39d0339349 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -1395,7 +1395,7 @@ static void __ris_msmon_read(void *arg)

 static int _msmon_read(struct mpam_component *comp, struct mon_read *arg)
 {
-       int err, idx;
+       int err, any_err = 0, idx;
        struct mpam_msc *msc;
        struct mpam_vmsc *vmsc;
        struct mpam_msc_ris *ris;
@@ -1412,15 +1412,19 @@ static int _msmon_read(struct mpam_component *comp, struct mon_read *arg)
                                                    true);
                        if (!err && arg->err)
                                err = arg->err;
+
+                       /*
+                        * Save one error to be returned to the caller, but
+                        * keep reading counters so that they get reprogrammed.
+                        * On platforms with NRDY this lets us wait once.
+                        */
                        if (err)
-                               break;
+                               any_err = err;
                }
-               if (err)
-                       break;
        }
        srcu_read_unlock(&mpam_srcu, idx);

-       return err;
+       return any_err;
 }

 int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
-----------------%<-----------------
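
The difference between breaking out on the first error and saving one error while continuing can be modelled in userspace. This is only a toy sketch: `struct fake_msc`, `fake_read()` and both loop helpers are hypothetical names, and it assumes an instance reports not-ready exactly once, on the read that programs it, and is ready by the next pass.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one MSC instance; not a kernel structure. */
struct fake_msc {
	bool programmed;	/* monitor has been configured */
	unsigned long count;	/* value reported once ready */
};

/* The first read programs the monitor and reports not-ready; later reads succeed. */
static int fake_read(struct fake_msc *m, unsigned long *val)
{
	if (!m->programmed) {
		m->programmed = true;
		return -EBUSY;
	}
	*val += m->count;
	return 0;
}

/* Stop at the first error: later instances are left unprogrammed. */
static int read_all_break(struct fake_msc *m, size_t n, unsigned long *val)
{
	int err = 0;

	*val = 0;
	for (size_t i = 0; i < n; i++) {
		err = fake_read(&m[i], val);
		if (err)
			break;
	}
	return err;
}

/* Remember one error but keep going, so every instance gets programmed. */
static int read_all_keep_going(struct fake_msc *m, size_t n, unsigned long *val)
{
	int any_err = 0;

	*val = 0;
	for (size_t i = 0; i < n; i++) {
		int err = fake_read(&m[i], val);

		if (err)
			any_err = err;
	}
	return any_err;
}
```

In this model, with three instances, a second call to `read_all_break()` still returns -EBUSY because only the first instance was programmed by the first call. A second call to `read_all_keep_going()` succeeds, which is why a single wait in the caller is enough.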


Thanks,

James