[RFC PATCH v2 0/5] target/arm/kvm: Adjust virtual time

Andrew Jones posted 5 patches 4 years, 3 months ago
Test asan failed
Test checkpatch failed
Test FreeBSD failed
Test docker-mingw@fedora failed
Test docker-clang@ubuntu failed
Test docker-quick@centos7 failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20191212173320.11610-1-drjones@redhat.com
Maintainers: Halil Pasic <pasic@linux.ibm.com>, David Gibson <david@gibson.dropbear.id.au>, Peter Maydell <peter.maydell@linaro.org>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Christian Borntraeger <borntraeger@de.ibm.com>, Richard Henderson <rth@twiddle.net>, Eduardo Habkost <ehabkost@redhat.com>, David Hildenbrand <david@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Cornelia Huck <cohuck@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>
There is a newer version of this series
docs/arm-cpu-features.rst  | 31 +++++++++++++++-
hw/arm/virt.c              | 17 ++++++++-
hw/core/machine.c          |  3 ++
hw/i386/pc.c               |  3 ++
hw/i386/pc_piix.c          | 14 ++++++-
hw/i386/pc_q35.c           | 13 ++++++-
hw/ppc/spapr.c             | 15 +++++++-
hw/s390x/s390-virtio-ccw.c | 15 +++++++-
include/hw/arm/virt.h      |  1 +
include/hw/boards.h        |  3 ++
include/hw/i386/pc.h       |  3 ++
target/arm/cpu.c           |  2 +
target/arm/cpu.h           |  9 +++++
target/arm/cpu64.c         |  1 +
target/arm/kvm.c           | 76 ++++++++++++++++++++++++++++++++++++++
target/arm/kvm32.c         |  3 ++
target/arm/kvm64.c         |  4 ++
target/arm/kvm_arm.h       | 34 +++++++++++++++++
target/arm/monitor.c       |  1 +
tests/arm-cpu-features.c   | 48 +++++++++++++++++++-----
20 files changed, 280 insertions(+), 16 deletions(-)
[RFC PATCH v2 0/5] target/arm/kvm: Adjust virtual time
Posted by Andrew Jones 4 years, 3 months ago
v2:
 - Reworked it enough that I brought back the RFC tag and retitled the
   series. Also had to drop r-b's from a couple of patches, and even
   drop patches.
 - Changed approach from writing the QEMU virtual time to the guest
   vtime counter to saving and restoring the guest vtime counter.
 - Changed the kvm-adjvtime property, which was off by default, to a
   kvm-no-adjvtime property, which is also off by default, meaning the
   effective "adjust vtime" property is now on by default (but only
   for 5.0 virt machine types and later)

v1:
 - move from RFC status to v1
 - put kvm_arm_vm_state_change() in kvm.c to share among kvm32.c and kvm64.c
 - add r-b's from Richard


This series is inspired by a series[1] posted by Bijan Mottahedeh over
a year ago and by the patch[2] posted by Heyi Guo almost a year ago.
The problem described in the cover letter of [1] is easily reproducible
and some users would like to have the option to avoid it. However the
solution, which is to adjust the virtual counter each time the VM
transitions to the running state, introduces a different problem, which
is that the virtual and physical counters diverge. As described in the
cover letter of [1] this divergence is easily observed when comparing
the output of `date` and `hwclock` after suspending the guest, waiting
a while, and then resuming it. Because this different problem may actually
be worse for some users, unlike [1], the series posted here makes the
virtual counter adjustment optional. Besides the adjustment being
optional, this series approaches the needed changes differently to apply
them in more appropriate locations and also integrates some of the
approach posted in [2].

Additional notes
----------------

Note 1
------

As described above, when running a guest with kvm-no-adjtime disabled
it will be less likely the guest OS and guest applications get surprise
time jumps when they use the virtual counter.  However the counter will
no longer reflect real time.  It will lag behind.  If this is a problem
then the guest can resynchronize its time from an external source or
even from its physical counter.  If the suspend/resume is done with
libvirt's virsh, and the guest is running the guest agent, then it's
also possible to use a sequence like this

 $ virsh suspend $GUEST
 $ virsh resume $GUEST
 $ virsh domtime --sync $GUEST

in order to resynchronize a guest right after the resume.  Of course
there will still be time when the clock is not right, possibly creating
confusing timestamps in logs, for example, and the guest must still be
tolerant to the time synchronizations.

Note 2
------

Userspace that wants to set KVM_REG_ARM_TIMER_CNT should beware that
the KVM register ID is not correct.  This cannot be fixed because it's
UAPI and if the UAPI headers are used then it can't be a problem.
However, if a userspace attempts to create the ID themselves from the
register's specification, then they will get KVM_REG_ARM_TIMER_CVAL
instead, as the _CNT and _CVAL definitions have their register
parameters swapped.

Note 3
------

I didn't test this with a 32-bit KVM host, but the changes to kvm32.c
are the same as kvm64.c. So what could go wrong? Test results would be
appreciated.
 

[1] https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg05713.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg03695.html

Thanks,
drew


Andrew Jones (5):
  hw: add compat machines for 5.0
  target/arm/kvm64: kvm64 cpus have timer registers
  target/arm/kvm: Implement virtual time adjustment
  tests/arm-cpu-features: Check feature default values
  target/arm/cpu: Add the kvm-no-adjvtime CPU property

 docs/arm-cpu-features.rst  | 31 +++++++++++++++-
 hw/arm/virt.c              | 17 ++++++++-
 hw/core/machine.c          |  3 ++
 hw/i386/pc.c               |  3 ++
 hw/i386/pc_piix.c          | 14 ++++++-
 hw/i386/pc_q35.c           | 13 ++++++-
 hw/ppc/spapr.c             | 15 +++++++-
 hw/s390x/s390-virtio-ccw.c | 15 +++++++-
 include/hw/arm/virt.h      |  1 +
 include/hw/boards.h        |  3 ++
 include/hw/i386/pc.h       |  3 ++
 target/arm/cpu.c           |  2 +
 target/arm/cpu.h           |  9 +++++
 target/arm/cpu64.c         |  1 +
 target/arm/kvm.c           | 76 ++++++++++++++++++++++++++++++++++++++
 target/arm/kvm32.c         |  3 ++
 target/arm/kvm64.c         |  4 ++
 target/arm/kvm_arm.h       | 34 +++++++++++++++++
 target/arm/monitor.c       |  1 +
 tests/arm-cpu-features.c   | 48 +++++++++++++++++++-----
 20 files changed, 280 insertions(+), 16 deletions(-)

-- 
2.21.0


Re: [RFC PATCH v2 0/5] target/arm/kvm: Adjust virtual time
Posted by Peter Maydell 4 years, 3 months ago
On Thu, 12 Dec 2019 at 17:33, Andrew Jones <drjones@redhat.com> wrote:

> Userspace that wants to set KVM_REG_ARM_TIMER_CNT should beware that
> the KVM register ID is not correct.  This cannot be fixed because it's
> UAPI and if the UAPI headers are used then it can't be a problem.
> However, if a userspace attempts to create the ID themselves from the
> register's specification, then they will get KVM_REG_ARM_TIMER_CVAL
> instead, as the _CNT and _CVAL definitions have their register
> parameters swapped.

So, to be clear, you mean that:

(1) the kernel headers say:

/* EL0 Virtual Timer Registers */
#define KVM_REG_ARM_TIMER_CTL           ARM64_SYS_REG(3, 3, 14, 3, 1)
#define KVM_REG_ARM_TIMER_CNT           ARM64_SYS_REG(3, 3, 14, 3, 2)
#define KVM_REG_ARM_TIMER_CVAL          ARM64_SYS_REG(3, 3, 14, 0, 2)

(2) some of the RHSes of these are wrong

(3) but the kernel internally is using the same 'wrong' value, so
userspace also needs to use that value, ie trust the #defined name
rather than manufacturing one ?

That's awkward. I think it would be worth at least having a kernel
patch to add a comment clearly documenting this bug.

(This error seems to only be in the 64-bit ABI, not 32-bit.)

QEMU does assume that the kernel's ID register values match
the hardware for sysregs in some ways -- we use this when we
construct our mapping from KVM register IDs as returned by
KVM_GET_REG_LIST to QEMU cpreg definitions and thus CPUState
struct fields. I *think* that in this case the only visible
effect will be that gdbstub will show you the CNT value
if you ask it to print the value of the CVAL sysreg.

thanks
-- PMM

Re: [RFC PATCH v2 0/5] target/arm/kvm: Adjust virtual time
Posted by Peter Maydell 4 years, 3 months ago
On Mon, 16 Dec 2019 at 15:33, Peter Maydell <peter.maydell@linaro.org> wrote:
> So, to be clear, you mean that:
>
> (1) the kernel headers say:
>
> /* EL0 Virtual Timer Registers */
> #define KVM_REG_ARM_TIMER_CTL           ARM64_SYS_REG(3, 3, 14, 3, 1)
> #define KVM_REG_ARM_TIMER_CNT           ARM64_SYS_REG(3, 3, 14, 3, 2)
> #define KVM_REG_ARM_TIMER_CVAL          ARM64_SYS_REG(3, 3, 14, 0, 2)
>
> (2) some of the RHSes of these are wrong
>
> (3) but the kernel internally is using the same 'wrong' value, so
> userspace also needs to use that value, ie trust the #defined name
> rather than manufacturing one ?
>
> That's awkward. I think it would be worth at least having a kernel
> patch to add a comment clearly documenting this bug.
>
> (This error seems to only be in the 64-bit ABI, not 32-bit.)
>
> QEMU does assume that the kernel's ID register values match
> the hardware for sysregs in some ways -- we use this when we
> construct our mapping from KVM register IDs as returned by
> KVM_GET_REG_LIST to QEMU cpreg definitions and thus CPUState
> struct fields. I *think* that in this case the only visible
> effect will be that gdbstub will show you the CNT value
> if you ask it to print the value of the CVAL sysreg.

...perhaps we should work around this kernel bug in the
kvm_to_cpreg_id() and cpreg_to_kvm_id() functions. (Need
to think through/test whether that would break migration.)

thanks
-- PMM

Re: [RFC PATCH v2 0/5] target/arm/kvm: Adjust virtual time
Posted by Andrew Jones 4 years, 2 months ago
On Mon, Dec 16, 2019 at 03:44:05PM +0000, Peter Maydell wrote:
> On Mon, 16 Dec 2019 at 15:33, Peter Maydell <peter.maydell@linaro.org> wrote:
> > So, to be clear, you mean that:
> >
> > (1) the kernel headers say:
> >
> > /* EL0 Virtual Timer Registers */
> > #define KVM_REG_ARM_TIMER_CTL           ARM64_SYS_REG(3, 3, 14, 3, 1)
> > #define KVM_REG_ARM_TIMER_CNT           ARM64_SYS_REG(3, 3, 14, 3, 2)
> > #define KVM_REG_ARM_TIMER_CVAL          ARM64_SYS_REG(3, 3, 14, 0, 2)
> >
> > (2) some of the RHSes of these are wrong
> >
> > (3) but the kernel internally is using the same 'wrong' value, so
> > userspace also needs to use that value, ie trust the #defined name
> > rather than manufacturing one ?
> >
> > That's awkward. I think it would be worth at least having a kernel
> > patch to add a comment clearly documenting this bug.
> >
> > (This error seems to only be in the 64-bit ABI, not 32-bit.)
> >
> > QEMU does assume that the kernel's ID register values match
> > the hardware for sysregs in some ways -- we use this when we
> > construct our mapping from KVM register IDs as returned by
> > KVM_GET_REG_LIST to QEMU cpreg definitions and thus CPUState
> > struct fields. I *think* that in this case the only visible
> > effect will be that gdbstub will show you the CNT value
> > if you ask it to print the value of the CVAL sysreg.
> 
> ...perhaps we should work around this kernel bug in the
> kvm_to_cpreg_id() and cpreg_to_kvm_id() functions. (Need
> to think through/test whether that would break migration.)
>

I just did some grepping for this too and, while it's easy to get
lost, I think I've confirmed what you state, that the only visible
effect is in gdb. I'll try to test this, but I think the workaround
in kvm_to_cpreg_id and cpreg_to_kvm_id is a probably a good idea,
because

   1) new qemu will be corrected

   2) migrate old qemu to new qemu (with same machine type)

      gdb cnt and cval swapped until first kvmstate sync

      (maybe too small a window of brokenness to notice/care?)

   3) migrate new qemu to old qemu (with same machine type)

      gdb cnt and cval correct until first kvmstate sync

      (old machine type keeps its old bug - except for same small
       window as for (2))

Thanks,
drew


Re: [RFC PATCH v2 0/5] target/arm/kvm: Adjust virtual time
Posted by Marc Zyngier 4 years, 3 months ago
On 2019-12-16 15:33, Peter Maydell wrote:
> On Thu, 12 Dec 2019 at 17:33, Andrew Jones <drjones@redhat.com> 
> wrote:
>
>> Userspace that wants to set KVM_REG_ARM_TIMER_CNT should beware that
>> the KVM register ID is not correct.  This cannot be fixed because 
>> it's
>> UAPI and if the UAPI headers are used then it can't be a problem.
>> However, if a userspace attempts to create the ID themselves from 
>> the
>> register's specification, then they will get KVM_REG_ARM_TIMER_CVAL
>> instead, as the _CNT and _CVAL definitions have their register
>> parameters swapped.
>
> So, to be clear, you mean that:
>
> (1) the kernel headers say:
>
> /* EL0 Virtual Timer Registers */
> #define KVM_REG_ARM_TIMER_CTL           ARM64_SYS_REG(3, 3, 14, 3, 1)
> #define KVM_REG_ARM_TIMER_CNT           ARM64_SYS_REG(3, 3, 14, 3, 2)
> #define KVM_REG_ARM_TIMER_CVAL          ARM64_SYS_REG(3, 3, 14, 0, 2)
>
> (2) some of the RHSes of these are wrong
>
> (3) but the kernel internally is using the same 'wrong' value, so
> userspace also needs to use that value, ie trust the #defined name
> rather than manufacturing one ?
>
> That's awkward. I think it would be worth at least having a kernel
> patch to add a comment clearly documenting this bug.
>
> (This error seems to only be in the 64-bit ABI, not 32-bit.)

Yeah, this is pretty bad. I wonder how we managed not to notice
this for so long... :-(.

Andrew, could you please write a patch documenting this (both in
the UAPI headers and in the documentation)?

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

Re: [RFC PATCH v2 0/5] target/arm/kvm: Adjust virtual time
Posted by Andrew Jones 4 years, 3 months ago
On Mon, Dec 16, 2019 at 04:18:23PM +0000, Marc Zyngier wrote:
> On 2019-12-16 15:33, Peter Maydell wrote:
> > On Thu, 12 Dec 2019 at 17:33, Andrew Jones <drjones@redhat.com> wrote:
> > 
> > > Userspace that wants to set KVM_REG_ARM_TIMER_CNT should beware that
> > > the KVM register ID is not correct.  This cannot be fixed because
> > > it's
> > > UAPI and if the UAPI headers are used then it can't be a problem.
> > > However, if a userspace attempts to create the ID themselves from
> > > the
> > > register's specification, then they will get KVM_REG_ARM_TIMER_CVAL
> > > instead, as the _CNT and _CVAL definitions have their register
> > > parameters swapped.
> > 
> > So, to be clear, you mean that:
> > 
> > (1) the kernel headers say:
> > 
> > /* EL0 Virtual Timer Registers */
> > #define KVM_REG_ARM_TIMER_CTL           ARM64_SYS_REG(3, 3, 14, 3, 1)
> > #define KVM_REG_ARM_TIMER_CNT           ARM64_SYS_REG(3, 3, 14, 3, 2)
> > #define KVM_REG_ARM_TIMER_CVAL          ARM64_SYS_REG(3, 3, 14, 0, 2)
> > 
> > (2) some of the RHSes of these are wrong
> > 
> > (3) but the kernel internally is using the same 'wrong' value, so
> > userspace also needs to use that value, ie trust the #defined name
> > rather than manufacturing one ?
> > 
> > That's awkward. I think it would be worth at least having a kernel
> > patch to add a comment clearly documenting this bug.
> > 
> > (This error seems to only be in the 64-bit ABI, not 32-bit.)
> 
> Yeah, this is pretty bad. I wonder how we managed not to notice
> this for so long... :-(.
> 
> Andrew, could you please write a patch documenting this (both in
> the UAPI headers and in the documentation)?
>

Will do. I'll try to get to it this week.

Thanks,
drew


Re: [RFC PATCH v2 0/5] target/arm/kvm: Adjust virtual time
Posted by Marc Zyngier 4 years, 3 months ago
On 2019-12-16 16:59, Andrew Jones wrote:
> On Mon, Dec 16, 2019 at 04:18:23PM +0000, Marc Zyngier wrote:
>> On 2019-12-16 15:33, Peter Maydell wrote:
>> > On Thu, 12 Dec 2019 at 17:33, Andrew Jones <drjones@redhat.com> 
>> wrote:
>> >
>> > > Userspace that wants to set KVM_REG_ARM_TIMER_CNT should beware 
>> that
>> > > the KVM register ID is not correct.  This cannot be fixed 
>> because
>> > > it's
>> > > UAPI and if the UAPI headers are used then it can't be a 
>> problem.
>> > > However, if a userspace attempts to create the ID themselves 
>> from
>> > > the
>> > > register's specification, then they will get 
>> KVM_REG_ARM_TIMER_CVAL
>> > > instead, as the _CNT and _CVAL definitions have their register
>> > > parameters swapped.
>> >
>> > So, to be clear, you mean that:
>> >
>> > (1) the kernel headers say:
>> >
>> > /* EL0 Virtual Timer Registers */
>> > #define KVM_REG_ARM_TIMER_CTL           ARM64_SYS_REG(3, 3, 14, 3, 
>> 1)
>> > #define KVM_REG_ARM_TIMER_CNT           ARM64_SYS_REG(3, 3, 14, 3, 
>> 2)
>> > #define KVM_REG_ARM_TIMER_CVAL          ARM64_SYS_REG(3, 3, 14, 0, 
>> 2)
>> >
>> > (2) some of the RHSes of these are wrong
>> >
>> > (3) but the kernel internally is using the same 'wrong' value, so
>> > userspace also needs to use that value, ie trust the #defined name
>> > rather than manufacturing one ?
>> >
>> > That's awkward. I think it would be worth at least having a kernel
>> > patch to add a comment clearly documenting this bug.
>> >
>> > (This error seems to only be in the 64-bit ABI, not 32-bit.)
>>
>> Yeah, this is pretty bad. I wonder how we managed not to notice
>> this for so long... :-(.
>>
>> Andrew, could you please write a patch documenting this (both in
>> the UAPI headers and in the documentation)?
>>
>
> Will do. I'll try to get to it this week.

Thanks a lot.

         M.
-- 
Jazz is not dead. It just smells funny...