[PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created

Kai Huang posted 2 patches 3 months ago
There is a newer version of this series
[PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Kai Huang 3 months ago
Reject the KVM_SET_TSC_KHZ VM ioctl when there's vCPU has already been
created.

The VM scope KVM_SET_TSC_KHZ ioctl is used to set up the default TSC
frequency that all subsequent created vCPUs use.  It is only intended to
be called before any vCPU is created.  Allowing it to be called after
that only results in confusion but nothing good.

Note this is an ABI change.  But currently in Qemu (the de facto
userspace VMM) only TDX uses this VM ioctl, and it is only called once
before creating any vCPU, therefore the risk of breaking userspace is
pretty low.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/x86.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 699ca5e74bba..e5e55d549468 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7194,6 +7194,10 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 		u32 user_tsc_khz;
 
 		r = -EINVAL;
+
+		if (kvm->created_vcpus)
+			goto out;
+
 		user_tsc_khz = (u32)arg;
 
 		if (kvm_caps.has_tsc_control &&
-- 
2.50.0
Re: [PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Nikunj A. Dadhania 3 months ago

On 7/9/2025 11:08 AM, Kai Huang wrote:
> Reject the KVM_SET_TSC_KHZ VM ioctl when there's vCPU has already been
> created.

Probably the below is clear:

Reject KVM_SET_TSC_KHZ VM ioctl when vCPUs have been created

> 
> The VM scope KVM_SET_TSC_KHZ ioctl is used to set up the default TSC
> frequency that all subsequent created vCPUs use.  It is only intended to
> be called before any vCPU is created.  Allowing it to be called after
> that only results in confusion but nothing good.
> 
> Note this is an ABI change.  But currently in Qemu (the de facto
> userspace VMM) only TDX uses this VM ioctl, and it is only called once
> before creating any vCPU, therefore the risk of breaking userspace is
> pretty low.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>

Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>

> ---
>  arch/x86/kvm/x86.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 699ca5e74bba..e5e55d549468 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7194,6 +7194,10 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
>  		u32 user_tsc_khz;
>  
>  		r = -EINVAL;
> +
> +		if (kvm->created_vcpus)
> +			goto out;
> +
>  		user_tsc_khz = (u32)arg;
>  
>  		if (kvm_caps.has_tsc_control &&
Re: [PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Huang, Kai 2 months, 4 weeks ago
On Wed, 2025-07-09 at 14:21 +0530, Nikunj A. Dadhania wrote:
> 
> On 7/9/2025 11:08 AM, Kai Huang wrote:
> > Reject the KVM_SET_TSC_KHZ VM ioctl when there's vCPU has already been
> > created.
> 
> Probably the below is clear:
> 
> Reject KVM_SET_TSC_KHZ VM ioctl when vCPUs have been created

Will do.  Thanks.

> 
> > 
> > The VM scope KVM_SET_TSC_KHZ ioctl is used to set up the default TSC
> > frequency that all subsequent created vCPUs use.  It is only intended to
> > be called before any vCPU is created.  Allowing it to be called after
> > that only results in confusion but nothing good.
> > 
> > Note this is an ABI change.  But currently in Qemu (the de facto
> > userspace VMM) only TDX uses this VM ioctl, and it is only called once
> > before creating any vCPU, therefore the risk of breaking userspace is
> > pretty low.
> > 
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Kai Huang <kai.huang@intel.com>
> 
> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>

Thanks.  I'll keep you guys RB anyway after adding the mutex.  Let me know
if you are not OK.
Re: [PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Chao Gao 3 months ago
On Wed, Jul 09, 2025 at 05:38:00PM +1200, Kai Huang wrote:
>Reject the KVM_SET_TSC_KHZ VM ioctl when there's vCPU has already been
>created.
>
>The VM scope KVM_SET_TSC_KHZ ioctl is used to set up the default TSC
>frequency that all subsequent created vCPUs use.  It is only intended to
>be called before any vCPU is created.  Allowing it to be called after
>that only results in confusion but nothing good.
>
>Note this is an ABI change.  But currently in Qemu (the de facto
>userspace VMM) only TDX uses this VM ioctl, and it is only called once
>before creating any vCPU, therefore the risk of breaking userspace is
>pretty low.
>
>Suggested-by: Sean Christopherson <seanjc@google.com>
>Signed-off-by: Kai Huang <kai.huang@intel.com>
>---
> arch/x86/kvm/x86.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 699ca5e74bba..e5e55d549468 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -7194,6 +7194,10 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
> 		u32 user_tsc_khz;
> 
> 		r = -EINVAL;
>+
>+		if (kvm->created_vcpus)
>+			goto out;
>+

shouldn't kvm->lock be held?

> 		user_tsc_khz = (u32)arg;
> 
> 		if (kvm_caps.has_tsc_control &&
>-- 
>2.50.0
>
>
Re: [PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Sean Christopherson 3 months ago
On Wed, Jul 09, 2025, Chao Gao wrote:
> On Wed, Jul 09, 2025 at 05:38:00PM +1200, Kai Huang wrote:
> >Reject the KVM_SET_TSC_KHZ VM ioctl when there's vCPU has already been
> >created.
> >
> >The VM scope KVM_SET_TSC_KHZ ioctl is used to set up the default TSC
> >frequency that all subsequent created vCPUs use.  It is only intended to
> >be called before any vCPU is created.  Allowing it to be called after
> >that only results in confusion but nothing good.
> >
> >Note this is an ABI change.  But currently in Qemu (the de facto
> >userspace VMM) only TDX uses this VM ioctl, and it is only called once
> >before creating any vCPU, therefore the risk of breaking userspace is
> >pretty low.
> >
> >Suggested-by: Sean Christopherson <seanjc@google.com>
> >Signed-off-by: Kai Huang <kai.huang@intel.com>
> >---
> > arch/x86/kvm/x86.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >index 699ca5e74bba..e5e55d549468 100644
> >--- a/arch/x86/kvm/x86.c
> >+++ b/arch/x86/kvm/x86.c
> >@@ -7194,6 +7194,10 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
> > 		u32 user_tsc_khz;
> > 
> > 		r = -EINVAL;
> >+
> >+		if (kvm->created_vcpus)
> >+			goto out;
> >+
> 
> shouldn't kvm->lock be held?

Yep.
Re: [PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Huang, Kai 2 months, 4 weeks ago
On Wed, 2025-07-09 at 06:55 -0700, Sean Christopherson wrote:
> On Wed, Jul 09, 2025, Chao Gao wrote:
> > On Wed, Jul 09, 2025 at 05:38:00PM +1200, Kai Huang wrote:
> > > Reject the KVM_SET_TSC_KHZ VM ioctl when there's vCPU has already been
> > > created.
> > > 
> > > The VM scope KVM_SET_TSC_KHZ ioctl is used to set up the default TSC
> > > frequency that all subsequent created vCPUs use.  It is only intended to
> > > be called before any vCPU is created.  Allowing it to be called after
> > > that only results in confusion but nothing good.
> > > 
> > > Note this is an ABI change.  But currently in Qemu (the de facto
> > > userspace VMM) only TDX uses this VM ioctl, and it is only called once
> > > before creating any vCPU, therefore the risk of breaking userspace is
> > > pretty low.
> > > 
> > > Suggested-by: Sean Christopherson <seanjc@google.com>
> > > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > > ---
> > > arch/x86/kvm/x86.c | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 699ca5e74bba..e5e55d549468 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -7194,6 +7194,10 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
> > > 		u32 user_tsc_khz;
> > > 
> > > 		r = -EINVAL;
> > > +
> > > +		if (kvm->created_vcpus)
> > > +			goto out;
> > > +
> > 
> > shouldn't kvm->lock be held?
> 
> Yep.

My bad.  I'll fixup and send out v2 soon,  together with the doc update.
Re: [PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Huang, Kai 2 months, 4 weeks ago
On Thu, 2025-07-10 at 22:53 +0000, Huang, Kai wrote:
> On Wed, 2025-07-09 at 06:55 -0700, Sean Christopherson wrote:
> > On Wed, Jul 09, 2025, Chao Gao wrote:
> > > On Wed, Jul 09, 2025 at 05:38:00PM +1200, Kai Huang wrote:
> > > > Reject the KVM_SET_TSC_KHZ VM ioctl when there's vCPU has already been
> > > > created.
> > > > 
> > > > The VM scope KVM_SET_TSC_KHZ ioctl is used to set up the default TSC
> > > > frequency that all subsequent created vCPUs use.  It is only intended to
> > > > be called before any vCPU is created.  Allowing it to be called after
> > > > that only results in confusion but nothing good.
> > > > 
> > > > Note this is an ABI change.  But currently in Qemu (the de facto
> > > > userspace VMM) only TDX uses this VM ioctl, and it is only called once
> > > > before creating any vCPU, therefore the risk of breaking userspace is
> > > > pretty low.
> > > > 
> > > > Suggested-by: Sean Christopherson <seanjc@google.com>
> > > > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > > > ---
> > > > arch/x86/kvm/x86.c | 4 ++++
> > > > 1 file changed, 4 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > index 699ca5e74bba..e5e55d549468 100644
> > > > --- a/arch/x86/kvm/x86.c
> > > > +++ b/arch/x86/kvm/x86.c
> > > > @@ -7194,6 +7194,10 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
> > > > 		u32 user_tsc_khz;
> > > > 
> > > > 		r = -EINVAL;
> > > > +
> > > > +		if (kvm->created_vcpus)
> > > > +			goto out;
> > > > +
> > > 
> > > shouldn't kvm->lock be held?
> > 
> > Yep.
> 
> My bad.  I'll fixup and send out v2 soon,  together with the doc update.

Sorry for multiple emails, but I ended up with below.

AFAICT the actual updating of kvm->arch.default_tsc_khz needs to be in the
kvm->lock mutex too.  Please let me know if you found any issue?

diff --git a/Documentation/virt/kvm/api.rst
b/Documentation/virt/kvm/api.rst
index 43ed57e048a8..86ea1e2b2737 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2006,7 +2006,7 @@ frequency is KHz.

 If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also
 be used as a vm ioctl to set the initial tsc frequency of subsequently
-created vCPUs.
+created vCPUs.  It must be called before any vCPU is created.

 4.56 KVM_GET_TSC_KHZ
 --------------------
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2806f7104295..4051c0cacb92 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7199,9 +7199,12 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned
int ioctl, unsigned long arg)
                if (user_tsc_khz == 0)
                        user_tsc_khz = tsc_khz;

-               WRITE_ONCE(kvm->arch.default_tsc_khz, user_tsc_khz);
-               r = 0;
-
+               mutex_lock(&kvm->lock);
+               if (!kvm->created_vcpus) {
+                       WRITE_ONCE(kvm->arch.default_tsc_khz,
user_tsc_khz);
+                       r = 0;
+               }
+               mutex_unlock(&kvm->lock);
                goto out;
        }
        case KVM_GET_TSC_KHZ: {
Re: [PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Chao Gao 2 months, 4 weeks ago
>AFAICT the actual updating of kvm->arch.default_tsc_khz needs to be in the
>kvm->lock mutex too.

Yep.

>Please let me know if you found any issue?
>
>diff --git a/Documentation/virt/kvm/api.rst
>b/Documentation/virt/kvm/api.rst
>index 43ed57e048a8..86ea1e2b2737 100644
>--- a/Documentation/virt/kvm/api.rst
>+++ b/Documentation/virt/kvm/api.rst
>@@ -2006,7 +2006,7 @@ frequency is KHz.
>
> If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also
> be used as a vm ioctl to set the initial tsc frequency of subsequently
>-created vCPUs.
>+created vCPUs.  It must be called before any vCPU is created.

		^^ remove one space here.

"must be" sounds like a mandatory action, but IIUC the vm ioctl is optional for
non-CC VMs. I'm not sure if this is just a problem of my interpretation.

To make the API documentation super clear, how about:

If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also
be used as a vm ioctl to set the initial tsc frequency of vCPUs before
any vCPU is created. Attempting to call this vm ioctl after vCPU creation
will return an EINVAL error.

>
> 4.56 KVM_GET_TSC_KHZ
> --------------------
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 2806f7104295..4051c0cacb92 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -7199,9 +7199,12 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned
>int ioctl, unsigned long arg)
>                if (user_tsc_khz == 0)
>                        user_tsc_khz = tsc_khz;
>
>-               WRITE_ONCE(kvm->arch.default_tsc_khz, user_tsc_khz);
>-               r = 0;
>-
>+               mutex_lock(&kvm->lock);
>+               if (!kvm->created_vcpus) {
>+                       WRITE_ONCE(kvm->arch.default_tsc_khz,
>user_tsc_khz);
>+                       r = 0;
>+               }
>+               mutex_unlock(&kvm->lock);

LGTM.

>                goto out;
>        }
>        case KVM_GET_TSC_KHZ: {
Re: [PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Huang, Kai 2 months, 4 weeks ago
On Fri, 2025-07-11 at 10:17 +0800, Chao Gao wrote:
> > AFAICT the actual updating of kvm->arch.default_tsc_khz needs to be in the
> > kvm->lock mutex too.
> 
> Yep.
> 
> > Please let me know if you found any issue?
> > 
> > diff --git a/Documentation/virt/kvm/api.rst
> > b/Documentation/virt/kvm/api.rst
> > index 43ed57e048a8..86ea1e2b2737 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -2006,7 +2006,7 @@ frequency is KHz.
> > 
> > If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also
> > be used as a vm ioctl to set the initial tsc frequency of subsequently
> > -created vCPUs.
> > +created vCPUs.  It must be called before any vCPU is created.
> 
> 		^^ remove one space here.

OK.

> 
> "must be" sounds like a mandatory action, but IIUC the vm ioctl is optional for
> non-CC VMs. I'm not sure if this is just a problem of my interpretation.

The context of that paragraph has "can also be used ...", so to me it's
implied that "if it is called", i.e., it's implied that it is optional for
non-CC VMs.

> 
> To make the API documentation super clear, how about:
> 
> If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also
> be used as a vm ioctl to set the initial tsc frequency of vCPUs before
> any vCPU is created. Attempting to call this vm ioctl after vCPU creation
> will return an EINVAL error.

I am not sure we need to mention -EINVAL.  This IOCTL is already returning
-EINVAL for other errors (when invalid user_tsc_khz is supplied).

IMHO: Userspace should just care about whether success or not.  Being
explicit is good, but sometimes it's better to have some room here.

But I'll let Sean/Paolo to decide.

> 
> > 
> > 4.56 KVM_GET_TSC_KHZ
> > --------------------
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 2806f7104295..4051c0cacb92 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7199,9 +7199,12 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned
> > int ioctl, unsigned long arg)
> >                if (user_tsc_khz == 0)
> >                        user_tsc_khz = tsc_khz;
> > 
> > -               WRITE_ONCE(kvm->arch.default_tsc_khz, user_tsc_khz);
> > -               r = 0;
> > -
> > +               mutex_lock(&kvm->lock);
> > +               if (!kvm->created_vcpus) {
> > +                       WRITE_ONCE(kvm->arch.default_tsc_khz,
> > user_tsc_khz);
> > +                       r = 0;
> > +               }
> > +               mutex_unlock(&kvm->lock);
> 
> LGTM.
> 
> >                goto out;
> >        }
> >        case KVM_GET_TSC_KHZ: {
Re: [PATCH 2/2] KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPU has been created
Posted by Xiaoyao Li 3 months ago
On 7/9/2025 1:38 PM, Kai Huang wrote:
> Reject the KVM_SET_TSC_KHZ VM ioctl when there's vCPU has already been
> created.
> 
> The VM scope KVM_SET_TSC_KHZ ioctl is used to set up the default TSC
> frequency that all subsequent created vCPUs use.  It is only intended to
> be called before any vCPU is created.  Allowing it to be called after
> that only results in confusion but nothing good.
> 
> Note this is an ABI change.  But currently in Qemu (the de facto
> userspace VMM) only TDX uses this VM ioctl, and it is only called once
> before creating any vCPU, therefore the risk of breaking userspace is
> pretty low.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> ---
>   arch/x86/kvm/x86.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 699ca5e74bba..e5e55d549468 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7194,6 +7194,10 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
>   		u32 user_tsc_khz;
>   
>   		r = -EINVAL;
> +
> +		if (kvm->created_vcpus)
> +			goto out;
> +
>   		user_tsc_khz = (u32)arg;
>   
>   		if (kvm_caps.has_tsc_control &&