(Posting one last time with the header commits split out).

The vmclock device addresses the problem of live migration with
precision clocks. The tolerances of a hardware counter (e.g. TSC) are
typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that
counter against an external source of 'real' time, and track the precise
frequency of the counter as it changes with environmental conditions.

When a guest is live migrated, anything it knows about the frequency of
the underlying counter becomes invalid. It may move from a host where
the counter is running at -50PPM of its nominal frequency, to a host
where it runs at +50PPM. There will also be a step change in the value
of the counter, as the correctness of its absolute value at migration is
limited by the accuracy of the source and destination host's time
synchronization.

The device exposes a shared memory region to guests, which can be mapped
all the way to userspace. In the first phase, this merely advertises a
'disruption_marker', which indicates that the guest should throw away any
NTP synchronization it thinks it has, and start again.

Because the region can be exposed all the way to userspace, applications
can still use time from a fast vDSO 'system call', and check the
disruption marker to be sure that their timestamp is indeed truthful.

The structure also allows for the precise time, as known by the host, to
be exposed directly to guests so that they don't have to wait for NTP to
resync from scratch.

The values and fields are based on the nascent virtio-rtc specification,
and the intent is that a version (hopefully precisely this version) of
this structure will be included as an optional part of that spec. In the
meantime, a simple ACPI device along the lines of VMGENID is perfectly
sufficient and is compatible with what's being shipped in certain
commercial hypervisors.
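To illustrate the userspace check described above, here is a minimal C
sketch. It is not the guest driver's code: struct demo_clock_page,
marker_at_sync() and read_checked_time() are hypothetical names, and the
one-field struct is only a stand-in for the real layout, which is defined
in vmclock-abi.h and is not reproduced here. The idea is simply to sample
the marker on both sides of the clock read and compare it with the value
seen at the last successful synchronization.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical, simplified stand-in for the shared region. The real
 * layout lives in vmclock-abi.h and has more fields than this. */
struct demo_clock_page {
    _Atomic uint64_t disruption_marker;
};

/* A timestamp plus a flag saying whether it can be trusted. */
struct checked_time {
    struct timespec ts;
    bool trusted;
};

/* Record the marker value at the moment the clock was last synchronized. */
static uint64_t marker_at_sync(struct demo_clock_page *page)
{
    assert(page != NULL);
    return atomic_load_explicit(&page->disruption_marker,
                                memory_order_acquire);
}

/* Read the clock and report whether the marker stayed equal to the value
 * recorded at sync time, both before and after the read. */
static struct checked_time read_checked_time(struct demo_clock_page *page,
                                             uint64_t synced_marker)
{
    struct checked_time out;

    assert(page != NULL);
    uint64_t before = atomic_load_explicit(&page->disruption_marker,
                                           memory_order_acquire);
    /* On Linux this call is typically served by the vDSO. */
    clock_gettime(CLOCK_REALTIME, &out.ts);
    uint64_t after = atomic_load_explicit(&page->disruption_marker,
                                          memory_order_acquire);

    out.trusted = (before == synced_marker) && (after == synced_marker);
    return out;
}
```

When trusted comes back false, a caller in this sketch would discard its
synchronization state, resynchronize, and call marker_at_sync() again
before relying on further timestamps.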

Linux guest support was merged into the 6.13-rc1 kernel:
https://git.kernel.org/torvalds/c/205032724226

---
v7:
 • Split update-kernel-headers.sh and the addition of the new header
   file into separate commits, add MAINTAINERS entry.

v6:
 • Rebase for DEFINE_PROP_END_OF_LIST removal and sysemu→system
   rename.

v5:
 • Trivial simplification to AML generation.
 • Import vmclock-abi.h from Linux now the guest support is merged.

v4:
 • Trivial checkpatch fixes and comment improvements.

v3:
 • Add comment that vmclock-abi.h will come from the Linux kernel
   headers once it gets merged there.

v2:
 • Change esterror/maxerror fields to nanoseconds.
 • Change to officially assigned AMZNC10C ACPI HID.
 • Fix little-endian handling of fields in update.

David Woodhouse (3):
  linux-headers: Add vmclock-abi.h
  linux-headers: Update to Linux 6.13-rc7
  hw/acpi: Add vmclock device

 MAINTAINERS                                  |   5 +
 hw/acpi/Kconfig                              |   5 +
 hw/acpi/meson.build                          |   1 +
 hw/acpi/vmclock.c                            | 179 ++++++++++++++++++++++++++
 hw/i386/Kconfig                              |   1 +
 hw/i386/acpi-build.c                         |  10 +-
 include/hw/acpi/vmclock.h                    |  34 +++++
 include/standard-headers/linux/vmclock-abi.h | 182 +++++++++++++++++++++++++++
 linux-headers/linux/iommufd.h                |  31 +++--
 linux-headers/linux/stddef.h                 |  13 +-
 scripts/update-linux-headers.sh              |   1 +
 11 files changed, 447 insertions(+), 15 deletions(-)

On Thu, Jan 16, 2025 at 01:59:40PM +0000, David Woodhouse wrote:
> (Posting one last time with the header commits split out).
>
> The vmclock device addresses the problem of live migration with
> precision clocks. The tolerances of a hardware counter (e.g. TSC) are
> typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that
> counter against an external source of 'real' time, and track the precise
> frequency of the counter as it changes with environmental conditions.
>
> When a guest is live migrated, anything it knows about the frequency of
> the underlying counter becomes invalid. It may move from a host where
> the counter running at -50PPM of its nominal frequency, to a host where
> it runs at +50PPM. There will also be a step change in the value of the
> counter, as the correctness of its absolute value at migration is
> limited by the accuracy of the source and destination host's time
> synchronization.
>
> The device exposes a shared memory region to guests, which can be mapped
> all the way to userspace. In the first phase, this merely advertises a
> 'disruption_marker', which indicates that the guest should throw away any
> NTP synchronization it thinks it has, and start again.
>
> Because the region can be exposed all the way to userspace, applications
> can still use time from a fast vDSO 'system call', and check the
> disruption marker to be sure that their timestamp is indeed truthful.
>
> The structure also allows for the precise time, as known by the host, to
> be exposed directly to guests so that they don't have to wait for NTP to
> resync from scratch.
>
> The values and fields are based on the nascent virtio-rtc specification,
> and the intent is that a version (hopefully precisely this version) of
> this structure will be included as an optional part of that spec. In the
> meantime, a simple ACPI device along the lines of VMGENID is perfectly
> sufficient and is compatible with what's being shipped in certain
> commercial hypervisors.
>
> Linux guest support was merged into the 6.13-rc1 kernel:
> https://git.kernel.org/torvalds/c/205032724226

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

feel free to merge.

> ---
> v7:
>  • Split update-kernel-headers.sh and the addition of the new header
>    file into separate commits, add MAINTAINERS entry.
>
> v6:
>  • Rebase for DEFINE_PROP_END_OF_LIST removal and sysemu→system
>    rename.
>
> v5:
>  • Trivial simplification to AML generation.
>  • Import vmclock-abi.h from Linux now the guest support is merged.
>
> v4:
>  • Trivial checkpatch fixes and comment improvements.
>
> v3:
>  • Add comment that vmclock-abi.h will come from the Linux kernel
>    headers once it gets merged there.
>
> v2:
>  • Change esterror/maxerror fields to nanoseconds.
>  • Change to officially assigned AMZNC10C ACPI HID.
>  • Fix little-endian handling of fields in update.
>
> David Woodhouse (3):
>   linux-headers: Add vmclock-abi.h
>   linux-headers: Update to Linux 6.13-rc7
>   hw/acpi: Add vmclock device
>
>  MAINTAINERS                                  |   5 +
>  hw/acpi/Kconfig                              |   5 +
>  hw/acpi/meson.build                          |   1 +
>  hw/acpi/vmclock.c                            | 179 ++++++++++++++++++++++++++
>  hw/i386/Kconfig                              |   1 +
>  hw/i386/acpi-build.c                         |  10 +-
>  include/hw/acpi/vmclock.h                    |  34 +++++
>  include/standard-headers/linux/vmclock-abi.h | 182 +++++++++++++++++++++++++++
>  linux-headers/linux/iommufd.h                |  31 +++--
>  linux-headers/linux/stddef.h                 |  13 +-
>  scripts/update-linux-headers.sh              |   1 +
>  11 files changed, 447 insertions(+), 15 deletions(-)
>

On Thu, 2025-01-16 at 09:44 -0500, Michael S. Tsirkin wrote:
> On Thu, Jan 16, 2025 at 01:59:40PM +0000, David Woodhouse wrote:
> > (Posting one last time with the header commits split out).
> >
> > The vmclock device addresses the problem of live migration with
> > precision clocks. The tolerances of a hardware counter (e.g. TSC) are
> > typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that
> > counter against an external source of 'real' time, and track the precise
> > frequency of the counter as it changes with environmental conditions.
> >
> > When a guest is live migrated, anything it knows about the frequency of
> > the underlying counter becomes invalid. It may move from a host where
> > the counter running at -50PPM of its nominal frequency, to a host where
> > it runs at +50PPM. There will also be a step change in the value of the
> > counter, as the correctness of its absolute value at migration is
> > limited by the accuracy of the source and destination host's time
> > synchronization.
> >
> > The device exposes a shared memory region to guests, which can be mapped
> > all the way to userspace. In the first phase, this merely advertises a
> > 'disruption_marker', which indicates that the guest should throw away any
> > NTP synchronization it thinks it has, and start again.
> >
> > Because the region can be exposed all the way to userspace, applications
> > can still use time from a fast vDSO 'system call', and check the
> > disruption marker to be sure that their timestamp is indeed truthful.
> >
> > The structure also allows for the precise time, as known by the host, to
> > be exposed directly to guests so that they don't have to wait for NTP to
> > resync from scratch.
> >
> > The values and fields are based on the nascent virtio-rtc specification,
> > and the intent is that a version (hopefully precisely this version) of
> > this structure will be included as an optional part of that spec. In the
> > meantime, a simple ACPI device along the lines of VMGENID is perfectly
> > sufficient and is compatible with what's being shipped in certain
> > commercial hypervisors.
> >
> > Linux guest support was merged into the 6.13-rc1 kernel:
> > https://git.kernel.org/torvalds/c/205032724226
>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>
> feel free to merge.

On doing a final rebase and test, I found it had already been merged via
your tree anyway :)

On Thu, 2025-01-16 at 09:44 -0500, Michael S. Tsirkin wrote:
>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>
> feel free to merge.

Thanks. I've added your R-b to all three (replacing your previous
Acked-by), and will post the PR tomorrow to give others a chance to
comment on the header bits.