From: David Woodhouse <dwmw@amazon.co.uk>
In https://lkml.org/lkml/2008/10/1/246 a proposal was made for generic
CPUID leaves, of which only 0x40000010 was defined, to contain the TSC
and local APIC frequencies. The proposal from VMware was mostly shot
down in flames, *but* XNU does unconditionally assume that this leaf
contains the frequency information, if it's present on any hypervisor:
https://github.com/apple/darwin-xnu/blob/main/osfmk/i386/cpuid.c
So does FreeBSD: https://github.com/freebsd/freebsd-src/commit/4a432614f68
So at this point it would be daft for a hypervisor to expose 0x40000010
for any *other* content. KVM might as well adopt it, and fill in the
accurate TSC frequency just as it does for the Xen TSC leaf.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
arch/x86/include/uapi/asm/kvm_para.h | 11 +++++++++++
arch/x86/kvm/cpuid.c | 7 +++++++
2 files changed, 18 insertions(+)
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index a1efa7907a0b..1597c4a2a24a 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -44,6 +44,17 @@
*/
#define KVM_FEATURE_CLOCKSOURCE_STABLE_BIT 24
+
+/*
+ * Proposed by VMware in https://lkml.org/lkml/2008/10/1/246 the timing
+ * information leaf provides the TSC and local APIC timer frequencies:
+ *
+ * # EAX: (Virtual) TSC frequency in kHz.
+ * # EBX: (Virtual) Bus (local apic timer) frequency in kHz.
+ * # ECX, EDX: RESERVED (reserved fields are set to zero).
+ */
+#define KVM_CPUID_TIMING_INFO 0x40000010
+
#define MSR_KVM_WALL_CLOCK 0x11
#define MSR_KVM_SYSTEM_TIME 0x12
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index bcce3a75c3f2..1bd69d9c86b7 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -2029,6 +2029,13 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 			} else if (index == 2) {
 				*eax = vcpu->arch.hw_tsc_khz;
 			}
+		} else if (vcpu->arch.kvm_cpuid.base &&
+			   function <= vcpu->arch.kvm_cpuid.limit &&
+			   function == (vcpu->arch.kvm_cpuid.base | KVM_CPUID_TIMING_INFO)) {
+			if (kvm_check_request(KVM_REQ_CLOCK_UPDATE, vcpu))
+				kvm_guest_time_update(vcpu);
+
+			*eax = vcpu->arch.hw_tsc_khz;
 		}
 	} else {
 		*eax = *ebx = *ecx = *edx = 0;
--
2.49.0
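
For readers following along, here is a minimal guest-side sketch of how the
timing leaf described in the patch above might be consumed. It is not part of
the patch. The struct and function names are hypothetical, and the register
values are passed in rather than read with CPUID so the logic can be checked
on any machine. It assumes the layout from the kvm_para.h comment (EAX and EBX
in kHz) and mirrors the host-side match: the leaf number is the PV CPUID base
ORed with 0x40000010, and it must not exceed the reported limit. Note that the
cpuid.c hunk as posted writes only EAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as KVM_CPUID_TIMING_INFO in the patch. */
#define TIMING_INFO_LEAF 0x40000010u

/* Hypothetical container for the decoded leaf; not a kernel structure. */
struct timing_info {
	uint64_t tsc_hz;	/* EAX (kHz) converted to Hz */
	uint64_t apic_bus_hz;	/* EBX (kHz) converted to Hz */
};

/* Leaf number the patch matches against: PV CPUID base ORed with 0x40000010. */
static uint32_t timing_leaf_for_base(uint32_t base)
{
	return base | TIMING_INFO_LEAF;
}

/*
 * base:  first leaf of the hypervisor's PV CPUID range (assumed non-zero
 *        when the range exists).
 * limit: highest leaf in that range, as reported in EAX of the base leaf.
 * regs:  EAX, EBX, ECX, EDX returned by CPUID for the timing leaf.
 *
 * Returns false if the timing leaf is outside the advertised range.
 */
static bool decode_timing_info(uint32_t base, uint32_t limit,
			       const uint32_t regs[4], struct timing_info *out)
{
	assert(regs != NULL && out != NULL);

	if (base == 0 || timing_leaf_for_base(base) > limit)
		return false;

	/* Widen before multiplying so kHz values above ~4.29 GHz do not wrap. */
	out->tsc_hz = (uint64_t)regs[0] * 1000;
	out->apic_bus_hz = (uint64_t)regs[1] * 1000;
	return true;
}
```

With a base of 0x40000000 the leaf is 0x40000010; with a base of 0x40000100 it
is 0x40000110.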
+Doug and Zach
VMware folks, TL;DR question for you:
Does VMware report TSC and APIC bus frequency in CPUID 0x40000010.{EAX,EBX},
or at the very least pinky swear not to use those outputs for anything else?
On Sat, Aug 16, 2025, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> In https://lkml.org/lkml/2008/10/1/246 a proposal was made for generic
> CPUID leaves, of which only 0x40000010 was defined, to contain the TSC
> and local APIC frequencies. The proposal from VMware was mostly shot
> down in flames, *but* XNU does unconditionally assume that this leaf
> contains the frequency information, if it's present on any hypervisor:
> https://github.com/apple/darwin-xnu/blob/main/osfmk/i386/cpuid.c
>
> So does FreeBSD: https://github.com/freebsd/freebsd-src/commit/4a432614f68
For me, the more convincing argument is following the breadcrumbs from the
changelog for the above commit:
: This speeds up the boot process by 100 ms in EC2 and other systems,
: by allowing the early calibration DELAY to be skipped.
back to QEMU commit 9954a1582e ("x86-KVM: Supply TSC and APIC clock rates to guest
like VMWare"), with an assumption that EC2 enables vmware-cpuid-freq. I.e. the
de facto reference VMM for KVM (QEMU) has utilized CPUID 0x40000010 in this way
for almost 9 years.
> So at this point it would be daft for a hypervisor to expose 0x40000010
> for any *other* content.
My only hesitation is that VMware _does_ put other content in 0x40000010, which
Linux knows as CPUID_VMWARE_FEATURES_LEAF. From arch/x86/kernel/cpu/vmware.c:

  static u8 __init vmware_select_hypercall(void)
  {
  	int eax, ebx, ecx, edx;

  	cpuid(CPUID_VMWARE_FEATURES_LEAF, &eax, &ebx, &ecx, &edx);
  	return (ecx & (CPUID_VMWARE_FEATURES_ECX_VMMCALL |
  		       CPUID_VMWARE_FEATURES_ECX_VMCALL));
  }

And oddly, Linux doesn't use CPUID to get the TSC frequency on VMware:

  	eax = vmware_hypercall3(VMWARE_CMD_GETHZ, UINT_MAX, &ebx, &ecx);

  	if (ebx != UINT_MAX) {
  		lpj = tsc_khz = eax | (((u64)ebx) << 32);
  		do_div(tsc_khz, 1000);
  		WARN_ON(tsc_khz >> 32);
  		pr_info("TSC freq read from hypervisor : %lu.%03lu MHz\n",
  			(unsigned long) tsc_khz / 1000,
  			(unsigned long) tsc_khz % 1000);

  		if (!preset_lpj) {
  			do_div(lpj, HZ);
  			preset_lpj = lpj;
  		}

  		vmware_tsc_khz = tsc_khz;
  		tsc_register_calibration_routines(vmware_get_tsc_khz,
  						  vmware_get_tsc_khz,
  						  TSC_FREQ_KNOWN_AND_RELIABLE);
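
To spell out the arithmetic in the excerpt above: the hypercall returns a
64-bit frequency in Hz split across EAX (low half) and EBX (high half), which
is divided by 1000 to get tsc_khz and by HZ to get loops-per-jiffy. A minimal
standalone sketch follows. The struct and function names are hypothetical,
plain 64-bit division stands in for the kernel's do_div(), and HZ is passed as
a parameter because its value is a build-time kernel setting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result pair; the kernel keeps these in separate variables. */
struct vmware_freq {
	uint64_t tsc_khz;	/* TSC frequency in kHz */
	uint64_t lpj;		/* loops per jiffy derived from the same value */
};

/*
 * eax/ebx: low and high 32 bits of the frequency in Hz, as returned by the
 *          VMWARE_CMD_GETHZ hypercall in the excerpt above.
 * hz:      the kernel tick rate (HZ); must be non-zero.
 */
static struct vmware_freq vmware_freq_from_regs(uint32_t eax, uint32_t ebx,
						uint32_t hz)
{
	struct vmware_freq f;
	uint64_t tsc_hz = eax | ((uint64_t)ebx << 32);

	assert(hz != 0);

	f.tsc_khz = tsc_hz / 1000;	/* Hz -> kHz */
	f.lpj = tsc_hz / hz;		/* TSC cycles per timer tick */
	return f;
}
```

For example, a reported 2,500,000,000 Hz with HZ=250 gives tsc_khz = 2,500,000
and lpj = 10,000,000.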
However, VMware appears to deliberately avoid using EAX and EBX, and the above
FreeBSD commit (and current code) is broken if VMware does NOT populate CPUID
0x40000010 with at least the TSC frequency. Because FreeBSD prioritizes getting
the TSC frequency from CPUID:

  	if (tsc_freq_cpuid_vm()) {
  		if (bootverbose)
  			printf(
  		    "Early TSC frequency %juHz derived from hypervisor CPUID\n",
  			    (uintmax_t)tsc_freq);
  	} else if (vm_guest == VM_GUEST_VMWARE) {
  		tsc_freq_vmware();
  		if (bootverbose)
  			printf(
  		    "Early TSC frequency %juHz derived from VMWare hypercall\n",
  			    (uintmax_t)tsc_freq);
  	}

where tsc_freq_cpuid_vm() only checks if 0x40000010 is available, not if
0x40000010.EAX contains a sane, non-zero frequency.

  static int
  tsc_freq_cpuid_vm(void)
  {
  	u_int regs[4];

  	if (vm_guest == VM_GUEST_NO)
  		return (false);
  	if (hv_high < 0x40000010)
  		return (false);
  	do_cpuid(0x40000010, regs);
  	tsc_freq = (uint64_t)(regs[0]) * 1000;
  	tsc_early_calib_exact = 1;
  	return (true);
  }

I.e. if VMware isn't populating 0x40000010.EAX with the TSC frequency, then I
would think FreeBSD would be getting bug reports when running on VMware, which
AFAICT isn't the case.
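
To make the gap concrete, here is a minimal sketch of a more defensive check.
This is not FreeBSD's code and not a proposed fix. The function name is
hypothetical, and the register values and hv_high are passed in so the logic
can be exercised without running under a hypervisor. The only behavioral
difference from tsc_freq_cpuid_vm() above is that a zero EAX is treated as
"no frequency reported", which would let a caller fall through to another
source such as the VMware hypercall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * hv_high:  highest hypervisor CPUID leaf advertised to the guest.
 * regs:     EAX, EBX, ECX, EDX returned by CPUID leaf 0x40000010.
 * tsc_freq: written with the frequency in Hz only on success.
 *
 * Returns true only if the leaf is in range and EAX is non-zero.
 */
static bool tsc_freq_from_cpuid_checked(uint32_t hv_high,
					const uint32_t regs[4],
					uint64_t *tsc_freq)
{
	assert(regs != NULL && tsc_freq != NULL);

	if (hv_high < 0x40000010u)
		return false;

	/* A leaf that exists but reports 0 kHz carries no usable frequency. */
	if (regs[0] == 0)
		return false;

	*tsc_freq = (uint64_t)regs[0] * 1000;	/* kHz -> Hz */
	return true;
}
```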
So jumping back to my questions for the VMware folks, if VMware enumerates timing
information in CPUID 0x40000010.{EAX,EBX}, or at least doesn't use those outputs
for other purposes, then I 100% agree that reserving CPUID 0x40000010 for timing
information in KVM's PV CPUID leaves is a no-brainer. Even if the answer to both
is "no", I think it still makes sense to carve out 0x40000010; it'll just require
a bit more care and some different context.
On Tue, Dec 16, 2025 at 3:27 PM Sean Christopherson <seanjc@google.com> wrote:
>
> +Doug and Zach
>
> VMware folks, TL;DR question for you:
>
> Does VMware report TSC and APIC bus frequency in CPUID 0x40000010.{EAX,EBX},
> or at the very least pinky swear not to use those outputs for anything else?
Yes, all 32 bits of 0x40000010.EAX are for the TSC frequency and all
32 bits of 0x40000010.EBX are for the APIC bus frequency.
Doug
On Tue, Dec 16, 2025, Doug Covelli wrote:
> On Tue, Dec 16, 2025 at 3:27 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > +Doug and Zach
> >
> > VMware folks, TL;DR question for you:
> >
> > Does VMware report TSC and APIC bus frequency in CPUID 0x40000010.{EAX,EBX},
> > or at the very least pinky swear not to use those outputs for anything else?
>
> Yes, all 32 bits of 0x40000010.EAX are for the TSC frequency and all
> 32 bits of 0x40000010.EBX are for the APIC bus frequency.
Nice, thanks much for the confirmation and quick response!