Use u64_replace_bits() instead of u64p_replace_bits() to set PMCR.N in
arm64's vPMU counter access test to fudge around what appears to be a gcc
bug. With the recent change to have vcpu_get_reg() return a value in lieu
of an out-param, some versions of gcc completely ignore the operation
performed by set_pmcr_n(), i.e. ignore the output param.
The issue is most easily observed by making set_pmcr_n() noinline and
wrapping the call with printf(), e.g. sans comments, for this code:
printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);
set_pmcr_n(&pmcr, pmcr_n);
printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);
gcc-13 generates:
0000000000401c90 <set_pmcr_n>:
401c90: f9400002 ldr x2, [x0]
401c94: b3751022 bfi x2, x1, #11, #5
401c98: f9000002 str x2, [x0]
401c9c: d65f03c0 ret
0000000000402660 <test_create_vpmu_vm_with_pmcr_n>:
402724: aa1403e3 mov x3, x20
402728: aa1503e2 mov x2, x21
40272c: aa1603e0 mov x0, x22
402730: aa1503e1 mov x1, x21
402734: 940060ff bl 41ab30 <_IO_printf>
402738: aa1403e1 mov x1, x20
40273c: 910183e0 add x0, sp, #0x60
402740: 97fffd54 bl 401c90 <set_pmcr_n>
402744: aa1403e3 mov x3, x20
402748: aa1503e2 mov x2, x21
40274c: aa1503e1 mov x1, x21
402750: aa1603e0 mov x0, x22
402754: 940060f7 bl 41ab30 <_IO_printf>
with the value stored in [sp + 0x60] ignored by both printf() above and
in the test proper, resulting in a false failure due to vcpu_set_reg()
simply storing the original value, not the intended value.
$ ./vpmu_counter_access
Random seed: 0x6b8b4567
orig = 3040, next = 3040, want = 0
orig = 3040, next = 3040, want = 0
==== Test Assertion Failure ====
aarch64/vpmu_counter_access.c:505: pmcr_n == get_pmcr_n(pmcr)
pid=71578 tid=71578 errno=9 - Bad file descriptor
1 0x400673: run_access_test at vpmu_counter_access.c:522
2 (inlined by) main at vpmu_counter_access.c:643
3 0x4132d7: __libc_start_call_main at libc-start.o:0
4 0x413653: __libc_start_main at ??:0
5 0x40106f: _start at ??:0
Failed to update PMCR.N to 0 (received: 6)
Somewhat bizarrely, gcc-11 also exhibits the same behavior, but only if
set_pmcr_n() is marked noinline, whereas gcc-13 fails even if set_pmcr_n()
is inlined in its sole caller.
All signs point to this being a gcc bug, as clang doesn't exhibit the same
issue, the code generated by u64p_replace_bits() is correct, and the error
is somewhat transient, e.g. varies between gcc versions and depends on
surrounding code.
For now, work around the issue to unblock the vcpu_get_reg() cleanup, and
because arguably using u64_replace_bits() makes the code a wee bit more
intuitive.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
index 30d9c9e7ae35..74da8252b884 100644
--- a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
+++ b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
@@ -45,11 +45,6 @@ static uint64_t get_pmcr_n(uint64_t pmcr)
return FIELD_GET(ARMV8_PMU_PMCR_N, pmcr);
}
-static void set_pmcr_n(uint64_t *pmcr, uint64_t pmcr_n)
-{
- u64p_replace_bits((__u64 *) pmcr, pmcr_n, ARMV8_PMU_PMCR_N);
-}
-
static uint64_t get_counters_mask(uint64_t n)
{
uint64_t mask = BIT(ARMV8_PMU_CYCLE_IDX);
@@ -484,13 +479,12 @@ static void test_create_vpmu_vm_with_pmcr_n(uint64_t pmcr_n, bool expect_fail)
vcpu = vpmu_vm.vcpu;
pmcr_orig = vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_PMCR_EL0));
- pmcr = pmcr_orig;
/*
* Setting a larger value of PMCR.N should not modify the field, and
* return a success.
*/
- set_pmcr_n(&pmcr, pmcr_n);
+ pmcr = u64_replace_bits(pmcr_orig, pmcr_n, ARMV8_PMU_PMCR_N);
vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_PMCR_EL0), pmcr);
pmcr = vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_PMCR_EL0));
--
2.46.0.598.g6f2099f65c-goog