target/i386/cpu.c | 10 ++++++++++ target/i386/cpu.h | 3 +++ 2 files changed, 13 insertions(+)
Some versions of Windows hang on reboot if their TSC value is greater
than 2^54. The calibration of the Hyper-V reference time overflows
and fails; as a result the processors' clock sources are out of sync.
As a workaround, reset the TSC to a small value. Do not do this
unconditionally and require a special property to be set.
Reported-by: Vadim Rozenfeld <vrozenfe@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/cpu.c | 10 ++++++++++
target/i386/cpu.h | 3 +++
2 files changed, 13 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ec3b50bf6e..9b29cea8c4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5931,6 +5931,15 @@ static void x86_cpu_reset(DeviceState *dev)
env->xstate_bv = 0;
env->pat = 0x0007040600070406ULL;
+
+ /*
+ * Some versions of Windows hang on reboot if their TSC value is greater
+ * than 2^54. As a workaround, reset the TSC to a small value. Do not use
+ * zero, KVM applies special heuristics for CPU startup when TSC is cleared.
+ */
+ if (cpu->tsc_clear_on_reset) {
+ env->tsc = 1;
+ }
env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
if (env->features[FEAT_1_ECX] & CPUID_EXT_MONITOR) {
env->msr_ia32_misc_enable |= MSR_IA32_MISC_ENABLE_MWAIT;
@@ -6957,6 +6966,7 @@ static Property x86_cpu_properties[] = {
false),
DEFINE_PROP_BOOL("vmware-cpuid-freq", X86CPU, vmware_cpuid_freq, true),
DEFINE_PROP_BOOL("tcg-cpuid", X86CPU, expose_tcg, true),
+ DEFINE_PROP_BOOL("tsc-clear-on-reset", X86CPU, tsc_clear_on_reset, true),
DEFINE_PROP_BOOL("x-migrate-smi-count", X86CPU, migrate_smi_count,
true),
/*
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index e31e6bd8b8..66f7901729 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1864,6 +1864,9 @@ struct ArchCPU {
/* Forcefully disable KVM PV features not exposed in guest CPUIDs */
bool kvm_pv_enforce_cpuid;
+ /* Clear TSC on reset */
+ bool tsc_clear_on_reset;
+
/* Number of physical address bits supported */
uint32_t phys_bits;
--
2.35.1
On Thu, Mar 24, 2022 at 09:23:46AM +0100, Paolo Bonzini wrote: > Some versions of Windows hang on reboot if their TSC value is greater > than 2^54. The calibration of the Hyper-V reference time overflows > and fails; as a result the processors' clock sources are out of sync. > As a workaround, reset the TSC to a small value. Do not do this > unconditionally and require a special property to be set. What's the problem with doing it unconditionally ? Requiring this special niche property means that it'll have to be enabled by management apps. Most will never learn it exists, and of those that do, many will take years to get this enabled and into usage by users, and many won't even bother. IMHO, this is the kind of situation where we need the fix to be enabled by default, or we might as well not bother. > > Reported-by: Vadim Rozenfeld <vrozenfe@redhat.com> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > target/i386/cpu.c | 10 ++++++++++ > target/i386/cpu.h | 3 +++ > 2 files changed, 13 insertions(+) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index ec3b50bf6e..9b29cea8c4 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -5931,6 +5931,15 @@ static void x86_cpu_reset(DeviceState *dev) > env->xstate_bv = 0; > > env->pat = 0x0007040600070406ULL; > + > + /* > + * Some versions of Windows hang on reboot if their TSC value is greater > + * than 2^54. As a workaround, reset the TSC to a small value. Do not use > + * zero, KVM applies special heuristics for CPU startup when TSC is cleared. > + */ > + if (cpu->tsc_clear_on_reset) { > + env->tsc = 1; > + } > env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT; > if (env->features[FEAT_1_ECX] & CPUID_EXT_MONITOR) { > env->msr_ia32_misc_enable |= MSR_IA32_MISC_ENABLE_MWAIT; > @@ -6957,6 +6966,7 @@ static Property x86_cpu_properties[] = { > false), > DEFINE_PROP_BOOL("vmware-cpuid-freq", X86CPU, vmware_cpuid_freq, true), > DEFINE_PROP_BOOL("tcg-cpuid", X86CPU, expose_tcg, true), > + DEFINE_PROP_BOOL("tsc-clear-on-reset", X86CPU, tsc_clear_on_reset, true), > DEFINE_PROP_BOOL("x-migrate-smi-count", X86CPU, migrate_smi_count, > true), > /* > diff --git a/target/i386/cpu.h b/target/i386/cpu.h > index e31e6bd8b8..66f7901729 100644 > --- a/target/i386/cpu.h > +++ b/target/i386/cpu.h > @@ -1864,6 +1864,9 @@ struct ArchCPU { > /* Forcefully disable KVM PV features not exposed in guest CPUIDs */ > bool kvm_pv_enforce_cpuid; > > + /* Clear TSC on reset */ > + bool tsc_clear_on_reset; > + > /* Number of physical address bits supported */ > uint32_t phys_bits; > > -- > 2.35.1 > > With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, Mar 24, 2022 at 09:13:12AM +0000, Daniel P. Berrangé wrote: > On Thu, Mar 24, 2022 at 09:23:46AM +0100, Paolo Bonzini wrote: > > Some versions of Windows hang on reboot if their TSC value is greater > > than 2^54. The calibration of the Hyper-V reference time overflows > > and fails; as a result the processors' clock sources are out of sync. > > As a workaround, reset the TSC to a small value. Do not do this > > unconditionally and require a special property to be set. > > What's the problem with doing it unconditionally ? > > Requiring this special niche property means that it'll have to be > enabled by management apps. Most will never learn it exists, and > of those that do, many will take years to get this enabled and > into usage by users, and many won't even bother. > > IMHO, this is the kind of situation where we need the fix to be > enabled by default, or we might as well not bother. Sigh, hit send too soon. I see the property is actually turned on in the defaults below, so effectively it will always be on unconditionally as no one will bother to add support for turning it off. > > > > > Reported-by: Vadim Rozenfeld <vrozenfe@redhat.com> > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > > --- > > target/i386/cpu.c | 10 ++++++++++ > > target/i386/cpu.h | 3 +++ > > 2 files changed, 13 insertions(+) > > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > > index ec3b50bf6e..9b29cea8c4 100644 > > --- a/target/i386/cpu.c > > +++ b/target/i386/cpu.c > > @@ -5931,6 +5931,15 @@ static void x86_cpu_reset(DeviceState *dev) > > env->xstate_bv = 0; > > > > env->pat = 0x0007040600070406ULL; > > + > > + /* > > + * Some versions of Windows hang on reboot if their TSC value is greater > > + * than 2^54. As a workaround, reset the TSC to a small value. Do not use > > + * zero, KVM applies special heuristics for CPU startup when TSC is cleared. > > + */ > > + if (cpu->tsc_clear_on_reset) { > > + env->tsc = 1; > > + } > > env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT; > > if (env->features[FEAT_1_ECX] & CPUID_EXT_MONITOR) { > > env->msr_ia32_misc_enable |= MSR_IA32_MISC_ENABLE_MWAIT; > > @@ -6957,6 +6966,7 @@ static Property x86_cpu_properties[] = { > > false), > > DEFINE_PROP_BOOL("vmware-cpuid-freq", X86CPU, vmware_cpuid_freq, true), > > DEFINE_PROP_BOOL("tcg-cpuid", X86CPU, expose_tcg, true), > > + DEFINE_PROP_BOOL("tsc-clear-on-reset", X86CPU, tsc_clear_on_reset, true), > > DEFINE_PROP_BOOL("x-migrate-smi-count", X86CPU, migrate_smi_count, > > true), > > /* > > diff --git a/target/i386/cpu.h b/target/i386/cpu.h > > index e31e6bd8b8..66f7901729 100644 > > --- a/target/i386/cpu.h > > +++ b/target/i386/cpu.h > > @@ -1864,6 +1864,9 @@ struct ArchCPU { > > /* Forcefully disable KVM PV features not exposed in guest CPUIDs */ > > bool kvm_pv_enforce_cpuid; > > > > + /* Clear TSC on reset */ > > + bool tsc_clear_on_reset; > > + > > /* Number of physical address bits supported */ > > uint32_t phys_bits; > > > > -- > > 2.35.1 > > > > > > With regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| > > With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 3/24/22 10:15, Daniel P. Berrangé wrote: > On Thu, Mar 24, 2022 at 09:13:12AM +0000, Daniel P. Berrangé wrote: >> On Thu, Mar 24, 2022 at 09:23:46AM +0100, Paolo Bonzini wrote: >>> Some versions of Windows hang on reboot if their TSC value is greater >>> than 2^54. The calibration of the Hyper-V reference time overflows >>> and fails; as a result the processors' clock sources are out of sync. >>> As a workaround, reset the TSC to a small value. Do not do this >>> unconditionally and require a special property to be set. >> >> What's the problem with doing it unconditionally ? >> >> Requiring this special niche property means that it'll have to be >> enabled by management apps. Most will never learn it exists, and >> of those that do, many will take years to get this enabled and >> into usage by users, and many won't even bother. >> >> IMHO, this is the kind of situation where we need the fix to be >> enabled by default, or we might as well not bother. > > Sigh, hit send too soon. I see the property is actually turned > on in the defaults below, so effectively it will always be on > unconditionally as no one will bother to add support for turning > it off. Well, I have a patch to turn it on/off in Libvirt and I also planned to leave it off by default in RHEL patch updates (I'm not tying it to the machine type because it's not a guest ABI change). I am myself conflicted on whether to leave it on or off in QEMU. For example you could use the TSC to measure how long the VM has been up, but this patch makes that not work anymore. Considering that the bug requires literally 2-3 months of VM uptime to manifest itself, it might be better to set up the property in libosinfo and only for Windows guests. Also, since it is a bug in Windows, it will hopefully be fixed sooner or later. Paolo
On Thu, Mar 24, 2022 at 10:42:22AM +0100, Paolo Bonzini wrote: > On 3/24/22 10:15, Daniel P. Berrangé wrote: > > On Thu, Mar 24, 2022 at 09:13:12AM +0000, Daniel P. Berrangé wrote: > > > On Thu, Mar 24, 2022 at 09:23:46AM +0100, Paolo Bonzini wrote: > > > > Some versions of Windows hang on reboot if their TSC value is greater > > > > than 2^54. The calibration of the Hyper-V reference time overflows > > > > and fails; as a result the processors' clock sources are out of sync. > > > > As a workaround, reset the TSC to a small value. Do not do this > > > > unconditionally and require a special property to be set. > > > > > > What's the problem with doing it unconditionally ? > > > > > > Requiring this special niche property means that it'll have to be > > > enabled by management apps. Most will never learn it exists, and > > > of those that do, many will take years to get this enabled and > > > into usage by users, and many won't even bother. > > > > > > IMHO, this is the kind of situation where we need the fix to be > > > enabled by default, or we might as well not bother. > > > > Sigh, hit send too soon. I see the property is actually turned > > on in the defaults below, so effectively it will always be on > > unconditionally as no one will bother to add support for turning > > it off. > > Well, I have a patch to turn it on/off in Libvirt and I also planned to > leave it off by default in RHEL patch updates (I'm not tying it to the > machine type because it's not a guest ABI change). > > I am myself conflicted on whether to leave it on or off in QEMU. For > example you could use the TSC to measure how long the VM has been up, but > this patch makes that not work anymore. Considering that the bug requires > literally 2-3 months of VM uptime to manifest itself, it might be better to > set up the property in libosinfo and only for Windows guests. > > Also, since it is a bug in Windows, it will hopefully be fixed sooner or > later. In the libvirt patches you mentioned VMWare has the same workaround available, and not enabled by default. I looked up this VMWare setting and it is interesting reading: https://kb.vmware.com/s/article/2092807 Of particular note "This only applies to virtual machine hardware version 10 as Windows resets the TSC on all CPUs on virtual machines with older hardware versions (which do not support hypervisor.cpuid.v2)." do you know what they mean when they refer to 'hypervisor.cpuid.v2' here ? I wonder if it gives any hints as to a root cause that could be fixed ? This hardware version 10 is well old - their current hardware version is 19, so it seems to show the implemented some built-in fix in newer hardware versions (their equiv of machine types). The vmware setting dates from 2013, and if I read that kbase correctly isn't needed on their modern hardware versions. Or maybe monitor_control.enable_softResetClearTSC became the default in newer hardware versions ? With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 3/24/22 12:03, Daniel P. Berrangé wrote: > > "This only applies to virtual machine hardware version 10 as Windows > resets the TSC on all CPUs on virtual machines with older hardware > versions (which do not support hypervisor.cpuid.v2)." > > do you know what they mean when they refer to 'hypervisor.cpuid.v2' > here ? I wonder if it gives any hints as to a root cause that could > be fixed ? The difference between hardware versions probably is that older versions do not support Hyper-V enlightenments. The bug does not happen if Windows uses the RTC for timekeeping, which is consistent with the description above. > This hardware version 10 is well old - their current hardware version > is 19, so it seems to show the implemented some built-in fix in newer > hardware versions (their equiv of machine types). The vmware setting > dates from 2013, and if I read that kbase correctly isn't needed on > their modern hardware versions. Or maybe monitor_control.enable_softResetClearTSC > became the default in newer hardware versions ? Yeah, maybe it became the default but they didn't want to change previously-released hardware versions. So that would be basically treating it as an ABI change, and only enable it in 7.0 machine types. That said, the VMware kbase does paint a slightly different picture. It implies that starting with hardware version 11 rebooting Windows is done through a hard reset instead of INIT. I'm not sure how that would be done, but in the meanwhile our fix should take care of do_cpu_init as well. Paolo
On 3/24/22 12:24, Paolo Bonzini wrote: > That said, the VMware kbase does paint a slightly different picture. It > implies that starting with hardware version 11 rebooting Windows is done > through a hard reset instead of INIT. I'm not sure how that would be > done, but in the meanwhile our fix should take care of do_cpu_init as well. Ok, so here are my findings: - Windows resets the system by writing 0xFE to port 0x64. This is a hard reset on QEMU, but presumably it was a soft reset (INIT) on VMware until version 10. - QEMU _does_ try to write 0 to the TSC on hard reset. But KVM special cases 0 as "somebody is trying to hot-plug a new CPU" and wants to help out, so it keeps the CPU synchronized with the previous TSC. So this is a pretty clear-cut QEMU bug. It can be fixed by e.g. writing 1 to the TSC instead of 0. Paolo
On Thu, Mar 24, 2022 at 06:13:03PM +0100, Paolo Bonzini wrote: > On 3/24/22 12:24, Paolo Bonzini wrote: > > That said, the VMware kbase does paint a slightly different picture. It > > implies that starting with hardware version 11 rebooting Windows is done > > through a hard reset instead of INIT. I'm not sure how that would be > > done, but in the meanwhile our fix should take care of do_cpu_init as > > well. > > Ok, so here are my findings: > > - Windows resets the system by writing 0xFE to port 0x64. This is a hard > reset on QEMU, but presumably it was a soft reset (INIT) on VMware until > version 10. > > - QEMU _does_ try to write 0 to the TSC on hard reset. But KVM special > cases 0 as "somebody is trying to hot-plug a new CPU" and wants to help out, > so it keeps the CPU synchronized with the previous TSC. > > So this is a pretty clear-cut QEMU bug. It can be fixed by e.g. writing 1 > to the TSC instead of 0. Ah, excellant findings, so we won't need a config knob after all. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
© 2016 - 2024 Red Hat, Inc.