Hello Fenghua, Reinette, Ben, James, and to whom it may concern,
The MPAM driver is nearing upstream merge,
but resctrl_test doesn't yet work on the Arm architecture.
I'm actively working on a series to support the CAT/NONCONT_CAT tests on Arm.
(Support for the MBM/MBA tests will be considered in the future.)
While I've modified the resctrl_test code to enable CAT on Arm,
the CAT test is failing in the NVIDIA Grace environment.
(I don't have access to any other environments.)
Am I misunderstanding the CAT tests, or is there something specific
about Grace that I'm overlooking? Any advice would be greatly appreciated.
First of all,
when running CAT on Grace, I observed that cache limiting is working as expected.
I verified this by checking "sudo cat /sys/fs/resctrl/c1/mon_data/mon_L3_*/llc_occupancy".
Furthermore, I noticed that benchmark execution times varied directly with the limited cache size.
I reused the existing Intel CAT test methodology,
which involves collecting cache miss counts via perf_event during a benchmark task and then
verifying a correlation between the cache limit value and these miss counts.
https://lore.kernel.org/lkml/20231215150515.36983-23-ilpo.jarvinen@linux.intel.com/#r
I'm aware that the specific cache miss numbers and CAT's impact can
differ significantly depending on the microarchitecture or SoC.
For Arm, we need to establish an appropriate minimum difference in LLC
misses between a test with an n+1-bit CBM and a test with an n-bit CBM.
However, my experiments with Grace showed that even when I significantly
varied the cache span size, the average LLC miss counts remained nearly unchanged.
Detailed test results are as follows:
# # Starting L3_CAT test ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Cache size :119537664
# # Writing benchmark parameters to resctrl FS
# # Write schema "L3:1=fc0" to resctrl FS
# # Write schema "L3:1=3f" to resctrl FS
# # Write schema "L3:1=fe0" to resctrl FS
# # Write schema "L3:1=1f" to resctrl FS
# # Write schema "L3:1=ff0" to resctrl FS
# # Write schema "L3:1=f" to resctrl FS
# # Write schema "L3:1=ff8" to resctrl FS
# # Write schema "L3:1=7" to resctrl FS
# # Write schema "L3:1=ffc" to resctrl FS
# # Write schema "L3:1=3" to resctrl FS
# # Write schema "L3:1=ffe" to resctrl FS
# # Write schema "L3:1=1" to resctrl FS
# # Checking for pass/fail
# # Number of bits: 6
# # Average LLC val: 1609252
# # Cache span (lines): 933888
# # Fail: Check cache miss rate changed more than 4.0%
# # Percent diff=-0.0
# # Number of bits: 5
# # Average LLC val: 1609038
# # Cache span (lines): 778240
# # Fail: Check cache miss rate changed more than 3.0%
# # Percent diff=0.7
# # Number of bits: 4
# # Average LLC val: 1620802
# # Cache span (lines): 622592
# # Fail: Check cache miss rate changed more than 2.0%
# # Percent diff=1.1
# # Number of bits: 3
# # Average LLC val: 1639214
# # Cache span (lines): 466944
# # Fail: Check cache miss rate changed more than 1.0%
# # Percent diff=0.9
# # Number of bits: 2
# # Average LLC val: 1653470
# # Cache span (lines): 311296
# # Pass: Check cache miss rate changed more than 0.0%
# # Percent diff=1.0
# # Number of bits: 1
# # Average LLC val: 1669618
# # Cache span (lines): 155648
# not ok 4 L3_CAT: test
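For readers following the numbers, here is a minimal C sketch that replays the pass/fail arithmetic visible in the log above. It assumes, based only on the log lines, that the threshold for each step is 1% times (bits - 1) and that the percent diff is taken relative to the previous average. The names cat_sample, portion_size and count_failed_steps are hypothetical and are not the selftest's own functions.

```c
#include <stdio.h>

/* One row of the log: CBM width in bits and the reported average LLC value. */
struct cat_sample {
	int bits;
	long long avg_llc;
};

/* Averages copied from the Grace log above, widest mask first. */
static const struct cat_sample grace_log[] = {
	{ 6, 1609252 }, { 5, 1609038 }, { 4, 1620802 },
	{ 3, 1639214 }, { 2, 1653470 }, { 1, 1669618 },
};

/* Bytes covered by `bits` of a CBM that has `total_bits` equal portions. */
static unsigned long long portion_size(unsigned long long cache_size,
				       int bits, int total_bits)
{
	return cache_size * bits / total_bits;
}

/* Percent change of the current average relative to the previous one. */
static double percent_diff(long long prev, long long cur)
{
	return 100.0 * (double)(cur - prev) / (double)prev;
}

/*
 * Count the steps whose miss-count increase stays below the assumed
 * threshold of (bits - 1) percent. Prints one line per step if verbose.
 */
static int count_failed_steps(const struct cat_sample *s, int n, int verbose)
{
	int failed = 0;

	for (int i = 1; i < n; i++) {
		double min_diff = 1.0 * (s[i].bits - 1);
		double diff = percent_diff(s[i - 1].avg_llc, s[i].avg_llc);
		int fail = diff < min_diff;

		failed += fail;
		if (verbose)
			printf("%d bits: diff=%.1f%% min=%.1f%% %s\n",
			       s[i].bits, diff, min_diff, fail ? "Fail" : "Pass");
	}
	return failed;
}
```

With the 12-bit full mask implied by the schemata (fc0 | 3f), portion_size(119537664, 6, 12) / 64 gives the 933888 lines reported for the 6-bit run. Calling count_failed_steps(grace_log, 6, 1) walks the same five comparisons as the log.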
Additionally, even with a fixed alloc buffer size (span = 119537664),
the Average LLC value remains nearly unchanged regardless of the limited cache size.
Furthermore, it appears that ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL is
mapped to PERF_COUNT_HW_CACHE_MISSES in "./drivers/perf/arm_pmuv3.c".
To work around this, I tried changing the perf_event measurement event
to ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD,
ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL,
and ARMV8_PMUV3_PERFCTR_L3D_CACHE_LMISS_RD.
However, the Average LLC value still remains nearly unchanged.
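The raw-event experiment described above can be sketched in isolation as follows. This is only an illustration: the EV_* values are the event numbers as I understand them from the kernel's arm_pmuv3.h and should be checked against that header, and raw_event_attr_init and raw_event_open are hypothetical helper names. The sketch uses a plain (non-group) read format to stay small, which differs from the selftest.

```c
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed Armv8 PMUv3 event numbers; verify against arm_pmuv3.h. */
#define EV_L3D_CACHE_REFILL	0x002A
#define EV_LL_CACHE_MISS_RD	0x0037
#define EV_L3D_CACHE_LMISS_RD	0x400B

/* Fill in an attr that counts one raw PMU event in user space only. */
static void raw_event_attr_init(struct perf_event_attr *pea,
				unsigned long long config)
{
	memset(pea, 0, sizeof(*pea));
	pea->type = PERF_TYPE_RAW;	/* config is a hardware event number */
	pea->config = config;
	pea->size = sizeof(*pea);
	pea->disabled = 1;		/* enable later via ioctl() */
	pea->exclude_kernel = 1;
	pea->exclude_hv = 1;
}

/* Thin wrapper: glibc provides no perf_event_open() function. */
static long raw_event_open(struct perf_event_attr *pea, pid_t pid, int cpu)
{
	return syscall(SYS_perf_event_open, pea, pid, cpu, -1, 0UL);
}
```

A caller would initialize the attr with one of the EV_* values, open it for the benchmark pid and CPU, and read a single 64-bit count from the returned file descriptor after the run.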
My modifications to resctrl_test (for context):
diff --git a/tools/testing/selftests/resctrl/cache.c
b/tools/testing/selftests/resctrl/cache.c
index 9a4a6c52b14c..9f00680039c6 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -8,7 +8,8 @@ char llc_occup_path[1024];
void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
{
memset(pea, 0, sizeof(*pea));
- pea->type = PERF_TYPE_HARDWARE;
+ //pea->type = PERF_TYPE_HARDWARE;
+ pea->type = PERF_TYPE_RAW;
pea->size = sizeof(*pea);
pea->read_format = PERF_FORMAT_GROUP;
pea->exclude_kernel = 1;
diff --git a/tools/testing/selftests/resctrl/cat_test.c
b/tools/testing/selftests/resctrl/cat_test.c
index 58b1590695d1..3ecf22fa1983 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -8,6 +8,7 @@
* Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
* Fenghua Yu <fenghua.yu@intel.com>
*/
+#include "perf/arm_pmuv3.h"
#include "resctrl.h"
#include <unistd.h>
@@ -181,7 +182,11 @@ static int cat_test(const struct resctrl_test *test,
if (ret)
goto reset_affinity;
perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
+ //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE);
+ //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD);
+ //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL);
+ //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE_LMISS_RD);
perf_event_initialize_read_format(&pe_read);
pe_fd = perf_open(&pea, bm_pid, uparams->cpu);
if (pe_fd < 0) {
@@ -276,6 +281,7 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
};
param.mask = long_mask;
span = cache_portion_size(cache_total_size, start_mask, full_cache_mask);
+ //span = 119537664; //L3 cache size of my machine
remove(param.filename);
Any insights or suggestions would be greatly appreciated.
Best regards,
Shaopeng TAN
---
Shaopeng Tan (5):
kselftests/resctrl: Detect the ARM architecture
kselftests/resctrl: enable noncont_cat for MPAM
kselftests/resctrl: remove unnecessary exclude_idle
kselftests/resctrl: set shareable_mask to zero if all bits are shared
between software and hardware
kselftests/resctrl: Add support for CAT test on ARM
tools/testing/selftests/resctrl/cache.c | 1 -
tools/testing/selftests/resctrl/cat_test.c | 5 +++--
tools/testing/selftests/resctrl/fill_buf.c | 4 ++++
tools/testing/selftests/resctrl/resctrl.h | 1 +
tools/testing/selftests/resctrl/resctrl_tests.c | 7 +++++++
tools/testing/selftests/resctrl/resctrlfs.c | 2 ++
6 files changed, 17 insertions(+), 3 deletions(-)
--
2.47.3
Hi Shaopeng,
On 1/23/26 04:40, Shaopeng Tan wrote:
> Hello Fenghua, Reinette, Ben, James, and to whom it may concern,
>
> The MPAM driver is nearing upstream merge,
> but resctrl_test doesn't work on the Arm architecture.
> I'm actively working on a series to support CAT/NONCONT_CAT tests for the Arm.
> (Support for MBM/MBA tests will be considered in the future.)
Great :) Having MPAM support in the resctrl kselftests will be good.
>
> While I've modified the resctrl_test code to enable CAT on Arm,
> CAT test is failing in the NVIDIA Grace environment.
> (I don't have any other environments.)
> Am I misunderstanding the CAT tests, or is there something specific
> about Grace that I'm overlooking? Any advice would be greatly appreciated.
IIUC, the L3 cache is in the NVIDIA interconnect, so changing the
cache portion bitmap would correlate with events from the NVIDIA
interconnect PMU. However, I don't think you are using events from the
interconnect.
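As a rough illustration of what using a non-core PMU involves: perf identifies such PMUs by a dynamic type number exported in /sys/bus/event_source/devices/<pmu>/type, which goes into perf_event_attr.type in place of PERF_TYPE_RAW. The helper below is a hypothetical sketch that only reads that number. The name of the interconnect PMU on Grace and its event encodings are not given in this thread and would have to come from the platform's documentation.

```c
#include <stdio.h>

/*
 * Read the dynamic PMU type from a sysfs "type" file.
 * Returns the type number, or -1 if the file cannot be read or parsed.
 */
static int read_pmu_type(const char *type_path)
{
	FILE *fp = fopen(type_path, "r");
	int type = -1;

	if (!fp)
		return -1;
	if (fscanf(fp, "%d", &type) != 1)
		type = -1;
	fclose(fp);
	return type;
}
```

Uncore PMUs generally count system-wide rather than per task, so such an event would typically be opened with pid set to -1 and a specific CPU, which is a different setup from the per-task counting the CAT test uses today.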
>
> First of all,
> when running CAT on Grace, I observed that cache limiting is working as expected.
> I verified this by checking "sudo cat /sys/fs/resctrl/c1/mon_data/mon_L3_*/llc_occupancy".
> Furthermore, I noticed that benchmark execution times varied directly with the limited cache size.
Good to know.
>
> I reused the existing Intel CAT test methodology,
> that involves collecting cache miss counts via perf_event during a benchmark task and then
> verifying a correlation between the cache limit value and these miss counts.
> https://lore.kernel.org/lkml/20231215150515.36983-23-ilpo.jarvinen@linux.intel.com/#r
>
> I'm aware that the specific cache miss numbers and CAT's impact can
> differ significantly depending on the microarchitecture or SoC.
> For Arm, we need to establish an appropriate minimum difference in LLC
> misses between a test with n+1 bits CBM to the test with n bits.
>
> However, my experiments with Grace showed that even when I significantly
> varied the cache span size, the average LLC miss counts remained nearly unchanged.
>
> Detailed test results as follows:
>
> # # Starting L3_CAT test ...
> # # Mounting resctrl to "/sys/fs/resctrl"
> # # Cache size :119537664
> # # Writing benchmark parameters to resctrl FS
> # # Write schema "L3:1=fc0" to resctrl FS
> # # Write schema "L3:1=3f" to resctrl FS
> # # Write schema "L3:1=fe0" to resctrl FS
> # # Write schema "L3:1=1f" to resctrl FS
> # # Write schema "L3:1=ff0" to resctrl FS
> # # Write schema "L3:1=f" to resctrl FS
> # # Write schema "L3:1=ff8" to resctrl FS
> # # Write schema "L3:1=7" to resctrl FS
> # # Write schema "L3:1=ffc" to resctrl FS
> # # Write schema "L3:1=3" to resctrl FS
> # # Write schema "L3:1=ffe" to resctrl FS
> # # Write schema "L3:1=1" to resctrl FS
> # # Checking for pass/fail
> # # Number of bits: 6
> # # Average LLC val: 1609252
> # # Cache span (lines): 933888
> # # Fail: Check cache miss rate changed more than 4.0%
> # # Percent diff=-0.0
> # # Number of bits: 5
> # # Average LLC val: 1609038
> # # Cache span (lines): 778240
> # # Fail: Check cache miss rate changed more than 3.0%
> # # Percent diff=0.7
> # # Number of bits: 4
> # # Average LLC val: 1620802
> # # Cache span (lines): 622592
> # # Fail: Check cache miss rate changed more than 2.0%
> # # Percent diff=1.1
> # # Number of bits: 3
> # # Average LLC val: 1639214
> # # Cache span (lines): 466944
> # # Fail: Check cache miss rate changed more than 1.0%
> # # Percent diff=0.9
> # # Number of bits: 2
> # # Average LLC val: 1653470
> # # Cache span (lines): 311296
> # # Pass: Check cache miss rate changed more than 0.0%
> # # Percent diff=1.0
> # # Number of bits: 1
> # # Average LLC val: 1669618
> # # Cache span (lines): 155648
> # not ok 4 L3_CAT: test
>
> Additionally, even with a fixed alloc buffer size(span = 119537664),
> the Average LLC value remains nearly unchanged regardless of the limited cache size.
> Furthermore, it appears that ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL is
> mapped to PERF_COUNT_HW_CACHE_MISSES in "./drivers/perf/arm_pmuv3.c",
> to counteract this, I attempted to use the perf_event measurement event
> to ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD,
> ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL,
> and ARMV8_PMUV3_PERFCTR_L3D_CACHE_LMISS_RD,
> however, the Average LLC value still remains nearly unchanged.
I think these events come from the Neoverse V2 core PMU rather than the interconnect.
>
> My modifications to resctrl_test (for context):
>
> diff --git a/tools/testing/selftests/resctrl/cache.c
> b/tools/testing/selftests/resctrl/cache.c
> index 9a4a6c52b14c..9f00680039c6 100644
> --- a/tools/testing/selftests/resctrl/cache.c
> +++ b/tools/testing/selftests/resctrl/cache.c
> @@ -8,7 +8,8 @@ char llc_occup_path[1024];
> void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
> {
> memset(pea, 0, sizeof(*pea));
> - pea->type = PERF_TYPE_HARDWARE;
> + //pea->type = PERF_TYPE_HARDWARE;
> + pea->type = PERF_TYPE_RAW;
> pea->size = sizeof(*pea);
> pea->read_format = PERF_FORMAT_GROUP;
> pea->exclude_kernel = 1;
> diff --git a/tools/testing/selftests/resctrl/cat_test.c
> b/tools/testing/selftests/resctrl/cat_test.c
> index 58b1590695d1..3ecf22fa1983 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -8,6 +8,7 @@
> * Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
> * Fenghua Yu <fenghua.yu@intel.com>
> */
> +#include "perf/arm_pmuv3.h"
> #include "resctrl.h"
> #include <unistd.h>
>
> @@ -181,7 +182,11 @@ static int cat_test(const struct resctrl_test *test,
> if (ret)
> goto reset_affinity;
>
> perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
> + //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE);
> + //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD);
> + //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL);
> + //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE_LMISS_RD);
> perf_event_initialize_read_format(&pe_read);
> pe_fd = perf_open(&pea, bm_pid, uparams->cpu);
> if (pe_fd < 0) {
> @@ -276,6 +281,7 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
> };
> param.mask = long_mask;
> span = cache_portion_size(cache_total_size, start_mask, full_cache_mask);
> + //span = 119537664; //L3 cache size of my machine
>
> remove(param.filename);
>
> Any insights or suggestions would be greatly appreciated.
>
> Best regards,
> Shaopeng TAN
>
> ---
> Shaopeng Tan (5):
> kselftests/resctrl: Detect the ARM architecture
> kselftests/resctrl: enable noncont_cat for MPAM
> kselftests/resctrl: remove unnecessary exclude_idle
> kselftests/resctrl: set shareable_mask to zero if all bits are shared
> between software and hardware
> kselftests/resctrl: Add support for CAT test on ARM
>
> tools/testing/selftests/resctrl/cache.c | 1 -
> tools/testing/selftests/resctrl/cat_test.c | 5 +++--
> tools/testing/selftests/resctrl/fill_buf.c | 4 ++++
> tools/testing/selftests/resctrl/resctrl.h | 1 +
> tools/testing/selftests/resctrl/resctrl_tests.c | 7 +++++++
> tools/testing/selftests/resctrl/resctrlfs.c | 2 ++
> 6 files changed, 17 insertions(+), 3 deletions(-)
>
Thanks,
Ben
The resctrl test is not enabled for MPAM (Arm Memory System Resource
Partitioning and Monitoring).
Add processing to detect the Arm architecture.
Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
tools/testing/selftests/resctrl/resctrl.h | 1 +
tools/testing/selftests/resctrl/resctrl_tests.c | 7 +++++++
2 files changed, 8 insertions(+)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 3c51bdac2dfa..492d2a1c4033 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -38,6 +38,7 @@
*/
#define ARCH_INTEL 1
#define ARCH_AMD 2
+#define ARCH_ARM 3
#define END_OF_TESTS 1
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 5154ffd821c4..662968d38eca 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -8,6 +8,7 @@
* Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
* Fenghua Yu <fenghua.yu@intel.com>
*/
+#include <sys/utsname.h>
#include "resctrl.h"
/* Volatile memory sink to prevent compiler optimizations */
@@ -26,6 +27,7 @@ static struct resctrl_test *resctrl_tests[] = {
static int detect_vendor(void)
{
FILE *inf = fopen("/proc/cpuinfo", "r");
+ struct utsname system_info;
int vendor_id = 0;
char *s = NULL;
char *res;
@@ -42,6 +44,11 @@ static int detect_vendor(void)
vendor_id = ARCH_INTEL;
else if (s && !strcmp(s, ": AuthenticAMD\n"))
vendor_id = ARCH_AMD;
+ else {
+ uname(&system_info);
+ if (strstr(system_info.machine, "aarch64") != NULL)
+ vendor_id = ARCH_ARM;
+ }
fclose(inf);
free(res);
--
2.47.3
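The detection added by the patch above can be sketched as a standalone helper. This is an illustrative variant, not the patch itself: machine_is_arm64 and detect_arm are hypothetical names, the sketch checks the uname() return value (the patch does not), and it matches the "aarch64" prefix so that a big-endian "aarch64_be" machine string is also accepted.

```c
#include <string.h>
#include <sys/utsname.h>

#define ARCH_ARM 3	/* same value the patch adds to resctrl.h */

/* True if a uname() machine string names a 64-bit Arm system. */
static int machine_is_arm64(const char *machine)
{
	return machine && strncmp(machine, "aarch64", 7) == 0;
}

/* Returns ARCH_ARM on 64-bit Arm, 0 otherwise or if uname() fails. */
static int detect_arm(void)
{
	struct utsname info;

	if (uname(&info) != 0)
		return 0;
	return machine_is_arm64(info.machine) ? ARCH_ARM : 0;
}
```

Splitting the string check from the uname() call keeps the matching logic testable on any host.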
Arm (via the MPAM driver) also supports non-contiguous CBMs,
so enable noncont_cat for Arm.
Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
tools/testing/selftests/resctrl/cat_test.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 94cfdba5308d..e1b30ab4cef5 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -291,7 +291,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
static bool arch_supports_noncont_cat(const struct resctrl_test *test)
{
/* AMD always supports non-contiguous CBM. */
- if (get_vendor() == ARCH_AMD)
+ /* ARM(MPAM driver) also supports non-contiguous CBM. */
+ if (get_vendor() == ARCH_AMD || get_vendor() == ARCH_ARM)
return true;
#if defined(__i386__) || defined(__x86_64__) /* arch */
--
2.47.3
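To make "non-contiguous CBM" concrete, the following sketch tests whether a capacity bitmask consists of a single run of set bits. The function name cbm_is_contiguous is hypothetical and is not part of the selftest. A mask such as 0xf0f would be rejected by hardware that requires contiguous masks, while hardware with non-contiguous support accepts it.

```c
/*
 * Return 1 if the set bits of mask form exactly one contiguous run,
 * 0 if the mask is empty or has a gap between set bits.
 */
static int cbm_is_contiguous(unsigned long mask)
{
	if (!mask)
		return 0;

	/* Drop trailing zero bits so the run starts at bit 0. */
	while (!(mask & 1UL))
		mask >>= 1;

	/* A value of the form 0...01...1 plus one shares no bits with itself. */
	return (mask & (mask + 1)) == 0;
}
```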
The perf_event_open(2) manual page says of exclude_idle: "While you can currently
enable this for any event type, it is ignored for all but software events."
Also, exclude_idle does not appear to be supported on Arm.
Therefore, remove it.
Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
tools/testing/selftests/resctrl/cache.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 1ff1104e6575..9a4a6c52b14c 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -13,7 +13,6 @@ void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
pea->read_format = PERF_FORMAT_GROUP;
pea->exclude_kernel = 1;
pea->exclude_hv = 1;
- pea->exclude_idle = 1;
pea->exclude_callchain_kernel = 1;
pea->inherit = 1;
pea->exclude_guest = 1;
--
2.47.3
When all bits are shared between software and hardware, the CAT test cannot run.
With the MPAM driver, even if all bits are shared between
hardware and software, they can be used as if they were software-exclusive.
To enable CAT in that case, set shareable_mask to zero when all bits are
shared between hardware and software.
Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
tools/testing/selftests/resctrl/resctrlfs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 195f04c4d158..4b9ee803a112 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -495,6 +495,8 @@ int get_mask_no_shareable(const char *cache_type, unsigned long *mask)
return -1;
if (get_shareable_mask(cache_type, &shareable_mask) < 0)
return -1;
+ if (full_mask == shareable_mask)
+ shareable_mask = 0;
len = count_contiguous_bits(full_mask & ~shareable_mask, &start);
if (!len)
--
2.47.3
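The effect of the two added lines on the mask arithmetic can be shown with a small sketch. The helper name exclusive_mask is hypothetical. It only reproduces the `full_mask & ~shareable_mask` step with the new special case, not the contiguous-run search that follows in get_mask_no_shareable().

```c
/*
 * Bits of full_mask that the test may treat as exclusive.
 * If every bit is marked shareable, treat none as shared, as in the patch.
 */
static unsigned long exclusive_mask(unsigned long full_mask,
				    unsigned long shareable_mask)
{
	if (full_mask == shareable_mask)
		shareable_mask = 0;
	return full_mask & ~shareable_mask;
}
```

Without the special case, a fully shared 0xfff mask would leave no usable bits; with it, all 12 bits remain available. A partially shared mask behaves as before.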
Currently, the CAT test is limited to Intel architectures.
Add cache cleaning and enable result checking for Arm architectures.
Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
tools/testing/selftests/resctrl/cat_test.c | 2 +-
tools/testing/selftests/resctrl/fill_buf.c | 4 ++++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index e1b30ab4cef5..58b1590695d1 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -113,7 +113,7 @@ static int check_results(struct resctrl_val_param *param, const char *cache_type
ret = show_results_info(sum_llc_perf_miss, bits,
alloc_size / 64,
MIN_DIFF_PERCENT_PER_BIT * (bits - 1),
- runs, get_vendor() == ARCH_INTEL,
+ runs, (get_vendor() == ARCH_INTEL || get_vendor() == ARCH_ARM),
&prev_avg_llc_val);
if (ret)
fail = 1;
diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
index 19a01a52dc1a..dbbf80d22f42 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -35,6 +35,10 @@ static void cl_flush(void *p)
#if defined(__i386) || defined(__x86_64)
asm volatile("clflush (%0)\n\t"
: : "r"(p) : "memory");
+#elif defined(__aarch64__)
+ __asm__ __volatile__("dc civac, %0\n\t"
+ : : "r" (p) : "memory");
+
#endif
}
--
2.47.3
Hi Shaopeng,
On 1/23/26 04:40, Shaopeng Tan wrote:
> Currently, CAT test is limited to Intel architectures.
> Add cache cleaning and enable result checking for Arm architectures.
>
> Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
> ---
> tools/testing/selftests/resctrl/cat_test.c | 2 +-
> tools/testing/selftests/resctrl/fill_buf.c | 4 ++++
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index e1b30ab4cef5..58b1590695d1 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -113,7 +113,7 @@ static int check_results(struct resctrl_val_param *param, const char *cache_type
> ret = show_results_info(sum_llc_perf_miss, bits,
> alloc_size / 64,
> MIN_DIFF_PERCENT_PER_BIT * (bits - 1),
> - runs, get_vendor() == ARCH_INTEL,
> + runs, (get_vendor() == ARCH_INTEL || get_vendor() == ARCH_ARM),
> &prev_avg_llc_val);
> if (ret)
> fail = 1;
> diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
> index 19a01a52dc1a..dbbf80d22f42 100644
> --- a/tools/testing/selftests/resctrl/fill_buf.c
> +++ b/tools/testing/selftests/resctrl/fill_buf.c
> @@ -35,6 +35,10 @@ static void cl_flush(void *p)
> #if defined(__i386) || defined(__x86_64)
> asm volatile("clflush (%0)\n\t"
> : : "r"(p) : "memory");
> +#elif defined(__aarch64__)
> + __asm__ __volatile__("dc civac, %0\n\t"
> + : : "r" (p) : "memory");
> +
This is only guaranteed to clean and invalidate to the point of
coherence (PoC). On Grace I expect the PoC is the L3/SLC, so the cache line
in the L3/SLC is likely neither invalidated nor pushed to DRAM.
The dsb() for synchronization is also missing for aarch64 in sb().
> #endif
> }
>
Thanks,
Ben
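A minimal sketch of a flush helper that pairs the maintenance instruction with a barrier, as the review comment suggests, might look like the following. The names flush_line and flush_buffer are hypothetical, and the x86 branch pairs clflush with sfence purely for illustration. As noted in the review, `dc civac` is only guaranteed to reach the point of coherence, so this sketch does not by itself ensure that a line leaves an L3/SLC that sits at or beyond that point.

```c
#include <stddef.h>

/* Clean and invalidate one cache line, then wait for completion. */
static inline void flush_line(void *p)
{
#if defined(__aarch64__)
	__asm__ __volatile__("dc civac, %0" : : "r"(p) : "memory");
	/* Barrier so the maintenance completes before later accesses. */
	__asm__ __volatile__("dsb sy" : : : "memory");
#elif defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("clflush (%0)" : : "r"(p) : "memory");
	__asm__ __volatile__("sfence" : : : "memory");
#else
	(void)p;	/* no flush instruction known for this architecture */
#endif
}

/* Flush every line of a buffer, stepping by the given line size. */
static void flush_buffer(void *buf, size_t size, size_t line_size)
{
	unsigned char *cp = buf;

	for (size_t i = 0; i < size; i += line_size)
		flush_line(cp + i);
}
```

Issuing one barrier per line is simple but not the cheapest arrangement. A single `dsb` after the loop would also order the whole sequence.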