target/arm/cpu.h | 6 +++- target/arm/kvm_arm.h | 1 + linux-user/elfload.c | 2 +- target/arm/cpu.c | 4 --- target/arm/helper.c | 2 +- target/arm/kvm.c | 1 + target/arm/kvm32.c | 75 +++++++++++++++++++++++++------------------- target/arm/kvm64.c | 69 ++++++++++++++++++++++++++++++++++++++-- target/arm/machine.c | 3 +- 9 files changed, 120 insertions(+), 43 deletions(-)
My previous patch set for replacing feature bits with id registers failed to consider that these id registers are beginning to control migration, and thus we must fill them in for KVM as well. Thus, we want to initialize these values within CPU from the host. Finally, re-send the T32EE conversion patch, fixing the build failure on an arm32 host in kvm32.c. Changes, v1->v2: * Remove assert that AArch32 sysreg <= UINT32_MAX. * Remove unused local variable. * Add commentary for AArch32 sysregs vs missing AArch32 support. r~ Richard Henderson (5): target/arm: Install ARMISARegisters from kvm host target/arm: Fill in ARMISARegisters for kvm64 target/arm: Introduce read_sys_reg32 for kvm32 target/arm: Fill in ARMISARegisters for kvm32 target/arm: Convert t32ee from feature bit to isar3 test target/arm/cpu.h | 6 +++- target/arm/kvm_arm.h | 1 + linux-user/elfload.c | 2 +- target/arm/cpu.c | 4 --- target/arm/helper.c | 2 +- target/arm/kvm.c | 1 + target/arm/kvm32.c | 75 +++++++++++++++++++++++++------------------- target/arm/kvm64.c | 69 ++++++++++++++++++++++++++++++++++++++-- target/arm/machine.c | 3 +- 9 files changed, 120 insertions(+), 43 deletions(-) -- 2.17.2
On 2 November 2018 at 14:54, Richard Henderson <richard.henderson@linaro.org> wrote: > My previous patch set for replacing feature bits with id registers > failed to consider that these id registers are beginning to control > migration, and thus we must fill them in for KVM as well. > > Thus, we want to initialize these values within CPU from the host. > > Finally, re-send the T32EE conversion patch, fixing the build > failure on an arm32 host in kvm32.c. > > Changes, v1->v2: > * Remove assert that AArch32 sysreg <= UINT32_MAX. > * Remove unused local variable. > * Add commentary for AArch32 sysregs vs missing AArch32 support. As noted on IRC, on my admittedly pretty ancient 4.8.0 kernel some of these ID register reads via KVM_GET_ONE_REG fail ENOENT. strace says: openat(AT_FDCWD, "/dev/kvm", O_RDWR|O_CLOEXEC) = 18 ioctl(18, KVM_CREATE_VM or LOGGER_GET_LOG_BUF_SIZE, 0) = 19 ioctl(19, KVM_CREATE_VCPU, 0) = 20 ioctl(19, KVM_ARM_PREFERRED_TARGET, 0xffffcfeb4e88) = 0 ioctl(20, KVM_ARM_VCPU_INIT, 0xffffcfeb4e88) = 0 ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = -1 ENOENT (No such file or directory) ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = -1 ENOENT (No such file or directory) ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = -1 ENOENT (No such file or directory) ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = -1 ENOENT (No such file or directory) ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = -1 ENOENT (No such file or directory) ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = -1 ENOENT (No such file or directory) ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = -1 ENOENT (No such file or directory) ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = -1 ENOENT (No such file or directory) I added a bit of extra tracing, since strace doesn't print the ID field for the ioctl: peter.maydell@mustang-maydell:~/qemu$ ~/test-images/virtv8-for-nesting/runme-kvm ./build/for-kvm/aarch64-softmmu/qemu-system-aarch64 -enable-kvm -cpu max -machine gic-version=max read_sys_reg64: reading ID 0x603000000013c030...-1 read_sys_reg64: reading ID 0x603000000013c031...-1 read_sys_reg64: reading ID 0x603000000013c020...-1 read_sys_reg64: reading ID 0x603000000013c021...-1 read_sys_reg32: reading ID 0x603000000013c010...0 read_sys_reg32: reading ID 0x603000000013c011...0 read_sys_reg32: reading ID 0x603000000013c012...0 read_sys_reg32: reading ID 0x603000000013c013...0 read_sys_reg32: reading ID 0x603000000013c014...0 read_sys_reg32: reading ID 0x603000000013c015...0 read_sys_reg32: reading ID 0x603000000013c017...-1 read_sys_reg32: reading ID 0x603000000013c018...-1 read_sys_reg32: reading ID 0x603000000013c019...-1 read_sys_reg32: reading ID 0x603000000013c01a...-1 qemu-system-aarch64: Failed to retrieve host CPU features It looks like the kernel can handle reads of ID_ISAR0_EL1 through ID_ISAR5_EL1, but not ID_ISAR6_EL1, any of the MVFR*_EL1 or ID_AA64_ISAR* or ID_AA64PFR*. This is probably because the kernel is way too old to be interestingly supportable for KVM, but we did previously manage to boot on this setup. We should probably at least figure out which version of the kernel fixed this bug and made the ID registers available to userspace... if it's sufficiently ancient we could likely say "not supported", but if it's more recent we need a workaround somehow. I have cc'd a couple of kernel folks who might be able to help with the "which version" question. thanks -- PMM
On Fri, Nov 02, 2018 at 04:36:35PM +0000, Peter Maydell wrote: > On 2 November 2018 at 14:54, Richard Henderson > <richard.henderson@linaro.org> wrote: > > My previous patch set for replacing feature bits with id registers > > failed to consider that these id registers are beginning to control > > migration, and thus we must fill them in for KVM as well. > > > > Thus, we want to initialize these values within CPU from the host. > > > > Finally, re-send the T32EE conversion patch, fixing the build > > failure on an arm32 host in kvm32.c. > > > > Changes, v1->v2: > > * Remove assert that AArch32 sysreg <= UINT32_MAX. > > * Remove unused local variable. > > * Add commentary for AArch32 sysregs vs missing AArch32 support. > > As noted on IRC, on my admittedly pretty ancient 4.8.0 kernel some > of these ID register reads via KVM_GET_ONE_REG fail ENOENT. > strace says: > > openat(AT_FDCWD, "/dev/kvm", O_RDWR|O_CLOEXEC) = 18 > ioctl(18, KVM_CREATE_VM or LOGGER_GET_LOG_BUF_SIZE, 0) = 19 > ioctl(19, KVM_CREATE_VCPU, 0) = 20 > ioctl(19, KVM_ARM_PREFERRED_TARGET, 0xffffcfeb4e88) = 0 > ioctl(20, KVM_ARM_VCPU_INIT, 0xffffcfeb4e88) = 0 > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) > = -1 ENOENT (No such file or directory) > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) > = -1 ENOENT (No such file or directory) > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) > = -1 ENOENT (No such file or directory) > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) > = -1 ENOENT (No such file or directory) > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) = 0 > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) > = -1 ENOENT (No such file or directory) > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) > = -1 ENOENT (No such file or directory) > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) > = -1 ENOENT (No such file or directory) > ioctl(20, KVM_ARM_SET_DEVICE_ADDR or KVM_GET_ONE_REG, 0xffffcfeb4e28) > = -1 ENOENT (No such file or directory) > > > I added a bit of extra tracing, since strace doesn't > print the ID field for the ioctl: > > peter.maydell@mustang-maydell:~/qemu$ > ~/test-images/virtv8-for-nesting/runme-kvm > ./build/for-kvm/aarch64-softmmu/qemu-system-aarch64 -enable-kvm -cpu > max -machine gic-version=max > read_sys_reg64: reading ID 0x603000000013c030...-1 > read_sys_reg64: reading ID 0x603000000013c031...-1 > read_sys_reg64: reading ID 0x603000000013c020...-1 > read_sys_reg64: reading ID 0x603000000013c021...-1 > read_sys_reg32: reading ID 0x603000000013c010...0 > read_sys_reg32: reading ID 0x603000000013c011...0 > read_sys_reg32: reading ID 0x603000000013c012...0 > read_sys_reg32: reading ID 0x603000000013c013...0 > read_sys_reg32: reading ID 0x603000000013c014...0 > read_sys_reg32: reading ID 0x603000000013c015...0 > read_sys_reg32: reading ID 0x603000000013c017...-1 > read_sys_reg32: reading ID 0x603000000013c018...-1 > read_sys_reg32: reading ID 0x603000000013c019...-1 > read_sys_reg32: reading ID 0x603000000013c01a...-1 > qemu-system-aarch64: Failed to retrieve host CPU features > > It looks like the kernel can handle reads of ID_ISAR0_EL1 > through ID_ISAR5_EL1, but not ID_ISAR6_EL1, any of the > MVFR*_EL1 or ID_AA64_ISAR* or ID_AA64PFR*. > > This is probably because the kernel is way too old to be > interestingly supportable for KVM, but we did previously > manage to boot on this setup. I'm a little confused. v4.8 used to work (although it was perhaps not the most stable at that time). What changed? Is this attempting to restore a VM from a newer kernel, or has QEMU been updated to detect this? > > We should probably at least figure out which version of > the kernel fixed this bug and made the ID registers available > to userspace... if it's sufficiently ancient we could > likely say "not supported", but if it's more recent we > need a workaround somehow. I have cc'd a couple of kernel > folks who might be able to help with the "which version" > question. > It appears the support for exposing a bunch of ID registers was introduced with: 93390c0a1b20 (arm64: KVM: Hide unsupported AArch64 CPU features from guests, 2017-10-31) Which Dave (cc'ed) wrote and which was introduced in v4.15. As per my question above, I'm not exactly sure what (if anything) we need to fix on the kernel side? Thanks, Christoffer
On 11/2/18 7:35 PM, Christoffer Dall wrote: >> It looks like the kernel can handle reads of ID_ISAR0_EL1 >> through ID_ISAR5_EL1, but not ID_ISAR6_EL1, any of the >> MVFR*_EL1 or ID_AA64_ISAR* or ID_AA64PFR*. >> >> This is probably because the kernel is way too old to be >> interestingly supportable for KVM, but we did previously >> manage to boot on this setup. > > I'm a little confused. v4.8 used to work (although it was perhaps not > the most stable at that time). What changed? We are changing QEMU to depend on ID registers to define the set of features that are supported, instead of re-defining those same features with the ARM_FEATURE_* enumeration. For KVM, there are only a few of those that are important in the end: those that affect migration. However, it seemed to me to be most future-proof if I just read in all of the relevant ID registers so that should a code path shared between TCG and KVM test a field, we get the right value. Which appeared to work, since I was using a 4.15 kernel. But I hadn't imagined that such support had only just appeared. > 93390c0a1b20 (arm64: KVM: Hide unsupported AArch64 CPU features from guests, 2017-10-31) > > Which Dave (cc'ed) wrote and which was introduced in v4.15. > > As per my question above, I'm not exactly sure what (if anything) we > need to fix on the kernel side? It doesn't sound like there's anything in the kernel that needs fixing. But it does sound like we need a workaround in userspace to deal with old kernels. Peter, what about filling in some defaults via MIDR and friends, using the values we have from the TCG side. Then overwriting that default with the registers that we can read from the kernel. Ignore ENOENT failures and just leave the default. There are currently only two features that affect migration, and so must not be approximated: T2EE and SVE. The former is in ID_ISAR3, which is in the set that is supported by 4.8. The latter you were never going to run on a kernel that old anyway. Although, while I have you, Christopher, is there a way to extract the Limited Ordering Region registers without modifying guest state? Those are multiplexed with LORN_EL1. r~
On Sat, 3 Nov 2018 09:53:44 +0000 Richard Henderson <richard.henderson@linaro.org> wrote: Hi Richard, > Although, while I have you, Christopher, is there a way to extract the Limited > Ordering Region registers without modifying guest state? Those are multiplexed > with LORN_EL1. We actively hide the LORegion feature from the guest since cc33c4e20185a391766ed5e78e2acc97e17ba511 (in the 4.17 time frame), so you shouldn't be able to obtain these on a recent host. I'm not sure this answers your question though... Thanks, M. -- Without deviation from the norm, progress is not possible.
On 11/3/18 12:32 PM, Marc Zyngier wrote: > We actively hide the LORegion feature from the guest since > cc33c4e20185a391766ed5e78e2acc97e17ba511 (in the 4.17 time frame), so > you shouldn't be able to obtain these on a recent host. I don't think that patch is ideal. In particular, LOR is a mandatory requirement of ARMv8.1. The OS can rightly presume it is present in context. Better, I think, is to trap the LOR registers, as you are doing, and have them act as RAZ/WI. This indicates, via LORID_EL1, that there are zero supported LO regions, which is a valid ARMv8.1 configuration. r~
Hi Richard, On Sun, 04 Nov 2018 09:50:29 +0000, Richard Henderson <richard.henderson@linaro.org> wrote: > > On 11/3/18 12:32 PM, Marc Zyngier wrote: > > We actively hide the LORegion feature from the guest since > > cc33c4e20185a391766ed5e78e2acc97e17ba511 (in the 4.17 time frame), so > > you shouldn't be able to obtain these on a recent host. > > I don't think that patch is ideal. > > In particular, LOR is a mandatory requirement of ARMv8.1. > The OS can rightly presume it is present in context. > > Better, I think, is to trap the LOR registers, as you are doing, > and have them act as RAZ/WI. This indicates, via LORID_EL1, that > there are zero supported LO regions, which is a valid ARMv8.1 > configuration. I agree this looks like a much better solution. I seem to remember Mark (now cc'd) battling some half baked implementation that didn't expose the LOR feature, and yet allowed the various LOR registers to be freely used, with unknown consequences. Since we unfortunately need to handle that sorry excuse for a CPU (and assuming I remember the case correctly), I'd suggest the following (untested) change: - We leave the LOR feature visible - We handle the LOR registers as RAZ/WI - We still delivering an UNDEF when we're in the aforementioned case. It would also solve the pathological cases that would result from mixing 8.0 and 8.1+ CPUs in the same system (yeah, we all foolishly thought it would never happen...). Thoughts? M. diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 22fbbdbece3c..d50f912d3f4a 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -314,10 +314,15 @@ static bool trap_raz_wi(struct kvm_vcpu *vcpu, return read_zero(vcpu, p); } -static bool trap_undef(struct kvm_vcpu *vcpu, - struct sys_reg_params *p, - const struct sys_reg_desc *r) +static bool trap_loregion(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) { + u64 val = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); + + if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT)) + return trap_raz_wi(vcpu, p, r); + kvm_inject_undefined(vcpu); return false; } @@ -1040,11 +1045,6 @@ static u64 read_id_reg(struct sys_reg_desc const *r, bool raz) kvm_debug("SVE unsupported for guests, suppressing\n"); val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT); - } else if (id == SYS_ID_AA64MMFR1_EL1) { - if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT)) - kvm_debug("LORegions unsupported for guests, suppressing\n"); - - val &= ~(0xfUL << ID_AA64MMFR1_LOR_SHIFT); } return val; @@ -1330,11 +1330,11 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 }, { SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 }, - { SYS_DESC(SYS_LORSA_EL1), trap_undef }, - { SYS_DESC(SYS_LOREA_EL1), trap_undef }, - { SYS_DESC(SYS_LORN_EL1), trap_undef }, - { SYS_DESC(SYS_LORC_EL1), trap_undef }, - { SYS_DESC(SYS_LORID_EL1), trap_undef }, + { SYS_DESC(SYS_LORSA_EL1), trap_loregion }, + { SYS_DESC(SYS_LOREA_EL1), trap_loregion }, + { SYS_DESC(SYS_LORN_EL1), trap_loregion }, + { SYS_DESC(SYS_LORC_EL1), trap_loregion }, + { SYS_DESC(SYS_LORID_EL1), trap_loregion }, { SYS_DESC(SYS_VBAR_EL1), NULL, reset_val, VBAR_EL1, 0 }, { SYS_DESC(SYS_DISR_EL1), NULL, reset_val, DISR_EL1, 0 }, -- Jazz is not dead, it just smell funny.
On Sun, Nov 04, 2018 at 11:25:00AM +0000, Marc Zyngier wrote: > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c > index 22fbbdbece3c..d50f912d3f4a 100644 > --- a/arch/arm64/kvm/sys_regs.c > +++ b/arch/arm64/kvm/sys_regs.c > @@ -314,10 +314,15 @@ static bool trap_raz_wi(struct kvm_vcpu *vcpu, > return read_zero(vcpu, p); > } > > -static bool trap_undef(struct kvm_vcpu *vcpu, > - struct sys_reg_params *p, > - const struct sys_reg_desc *r) We should probably have a comment here, like: /* * ARMv8.1 mandates at least a trivial LORegion implementation, where * all the RW registers are RES0 (which we can implement as RAZ/WI). On * an ARMv8.0 system, these registers should UNDEF. */ > +static bool trap_loregion(struct kvm_vcpu *vcpu, > + struct sys_reg_params *p, > + const struct sys_reg_desc *r) > { > + u64 val = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); > + > + if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT)) > + return trap_raz_wi(vcpu, p, r); > + > kvm_inject_undefined(vcpu); > return false; > } > @@ -1040,11 +1045,6 @@ static u64 read_id_reg(struct sys_reg_desc const *r, bool raz) > kvm_debug("SVE unsupported for guests, suppressing\n"); > > val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT); > - } else if (id == SYS_ID_AA64MMFR1_EL1) { > - if (val & (0xfUL << ID_AA64MMFR1_LOR_SHIFT)) > - kvm_debug("LORegions unsupported for guests, suppressing\n"); > - > - val &= ~(0xfUL << ID_AA64MMFR1_LOR_SHIFT); > } > > return val; > @@ -1330,11 +1330,11 @@ static const struct sys_reg_desc sys_reg_descs[] = { > { SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 }, > { SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 }, > > - { SYS_DESC(SYS_LORSA_EL1), trap_undef }, > - { SYS_DESC(SYS_LOREA_EL1), trap_undef }, > - { SYS_DESC(SYS_LORN_EL1), trap_undef }, > - { SYS_DESC(SYS_LORC_EL1), trap_undef }, > - { SYS_DESC(SYS_LORID_EL1), trap_undef }, > + { SYS_DESC(SYS_LORSA_EL1), trap_loregion }, > + { SYS_DESC(SYS_LOREA_EL1), trap_loregion }, > + { SYS_DESC(SYS_LORN_EL1), trap_loregion }, > + { SYS_DESC(SYS_LORC_EL1), trap_loregion }, > + { SYS_DESC(SYS_LORID_EL1), trap_loregion }, LORID_EL1 is always RO, so we need to UNDEF on writes. Either we add a separate helper, or special-case LORID_EL1 in trap_loregion(). Otherwise, this looks good to me. Thanks, Mark.
© 2016 - 2024 Red Hat, Inc.