From nobody Sat Feb 7 22:21:18 2026
Reply-To: Sean Christopherson
Date: Tue, 30 Dec 2025 12:56:41 -0800
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
X-Mailer: git-send-email 2.52.0.351.gbe84eed79e-goog
Message-ID: <20251230205641.4092235-1-seanjc@google.com>
Subject: [PATCH] KVM: x86: Disallow setting CPUID and/or feature MSRs if L2 is active
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed, Kevin Cheng
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Extend KVM's restriction on CPUID and feature MSR changes to disallow
updates while L2 is active in addition to rejecting updates after the vCPU
has run at least once. Like post-run vCPU model updates, attempting to
react to model changes while L2 is active is practically infeasible, e.g.
KVM would need to do _something_ in response to impossible situations where
userspace has removed a feature that was consumed as part of nested
VM-Enter.

In practice, disallowing vCPU model changes while L2 is active is largely
uninteresting, as the only way for L2 to be active without the vCPU having
run at least once is if userspace stuffed state via KVM_SET_NESTED_STATE.
And because KVM_SET_NESTED_STATE can't put the vCPU into L2 without
userspace first defining the vCPU model, e.g. to enable SVM/VMX, modifying
the vCPU model while L2 is active would require deliberately setting the
vCPU model, then loading nested state, and then changing the model. I.e.
no sane VMM should run afoul of the new restriction, and any VMM that does
encounter problems has likely been running a broken setup for a long time.

Cc: Yosry Ahmed
Cc: Kevin Cheng
Signed-off-by: Sean Christopherson
Reviewed-by: Yosry Ahmed
---
 arch/x86/kvm/cpuid.c   | 19 +++++++++++--------
 arch/x86/kvm/mmu/mmu.c |  6 +-----
 arch/x86/kvm/pmu.c     |  2 +-
 arch/x86/kvm/x86.c     | 13 +++++++------
 arch/x86/kvm/x86.h     |  4 ++--
 5 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 88a5426674a1..f37331ad3ad8 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -534,17 +534,20 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 	BUILD_BUG_ON(sizeof(vcpu_caps) != sizeof(vcpu->arch.cpu_caps));
 
 	/*
-	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
-	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
-	 * tracked in kvm_mmu_page_role. As a result, KVM may miss guest page
-	 * faults due to reusing SPs/SPTEs. In practice no sane VMM mucks with
-	 * the core vCPU model on the fly. It would've been better to forbid any
-	 * KVM_SET_CPUID{,2} calls after KVM_RUN altogether but unfortunately
-	 * some VMMs (e.g. QEMU) reuse vCPU fds for CPU hotplug/unplug and do
+	 * KVM does not correctly handle changing guest CPUID after KVM_RUN or
+	 * while L2 is active, as MAXPHYADDR, GBPAGES support, AMD reserved bit
+	 * behavior, etc. aren't tracked in kvm_mmu_page_role, and L2 state
+	 * can't be adjusted (without breaking L2 in some way). As a result,
+	 * KVM may reuse SPs/SPTEs and/or run L2 with bad/misconfigured state.
+	 *
+	 * In practice, no sane VMM mucks with the core vCPU model on the fly.
+	 * It would've been better to forbid any KVM_SET_CPUID{,2} calls after
+	 * KVM_RUN or KVM_SET_NESTED_STATE altogether, but unfortunately some
+	 * VMMs (e.g. QEMU) reuse vCPU fds for CPU hotplug/unplug and do
 	 * KVM_SET_CPUID{,2} again. To support this legacy behavior, check
 	 * whether the supplied CPUID data is equal to what's already set.
 	 */
-	if (kvm_vcpu_has_run(vcpu)) {
+	if (!kvm_can_set_cpuid_and_feature_msrs(vcpu)) {
 		r = kvm_cpuid_check_equal(vcpu, e2, nent);
 		if (r)
 			goto err;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 02c450686b4a..f17324546900 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6031,11 +6031,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu->arch.nested_mmu.cpu_role.ext.valid = 0;
 	kvm_mmu_reset_context(vcpu);
 
-	/*
-	 * Changing guest CPUID after KVM_RUN is forbidden, see the comment in
-	 * kvm_arch_vcpu_ioctl().
-	 */
-	KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm);
+	KVM_BUG_ON(!kvm_can_set_cpuid_and_feature_msrs(vcpu), vcpu->kvm);
 }
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 487ad19a236e..ff20b4102173 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -853,7 +853,7 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
-	if (KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm))
+	if (KVM_BUG_ON(!kvm_can_set_cpuid_and_feature_msrs(vcpu), vcpu->kvm))
 		return;
 
 	/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ff8812f3a129..211d8c24a4b1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2314,13 +2314,14 @@ static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 	u64 val;
 
 	/*
-	 * Disallow writes to immutable feature MSRs after KVM_RUN. KVM does
-	 * not support modifying the guest vCPU model on the fly, e.g. changing
-	 * the nVMX capabilities while L2 is running is nonsensical. Allow
-	 * writes of the same value, e.g. to allow userspace to blindly stuff
-	 * all MSRs when emulating RESET.
+	 * Reject writes to immutable feature MSRs if the vCPU model is frozen,
+	 * as KVM doesn't support modifying the guest vCPU model on the fly,
+	 * e.g. changing the VMX capabilities MSRs while L2 is active is
+	 * nonsensical. Allow writes of the same value, e.g. so that userspace
+	 * can blindly stuff all MSRs when emulating RESET.
 	 */
-	if (kvm_vcpu_has_run(vcpu) && kvm_is_immutable_feature_msr(index) &&
+	if (!kvm_can_set_cpuid_and_feature_msrs(vcpu) &&
+	    kvm_is_immutable_feature_msr(index) &&
 	    (do_get_msr(vcpu, index, &val) || *data != val))
 		return -EINVAL;
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index fdab0ad49098..9084e0dfa15c 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -172,9 +172,9 @@ static inline void kvm_nested_vmexit_handle_ibrs(struct kvm_vcpu *vcpu)
 	indirect_branch_prediction_barrier();
 }
 
-static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu)
+static inline bool kvm_can_set_cpuid_and_feature_msrs(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.last_vmentry_cpu != -1;
+	return vcpu->arch.last_vmentry_cpu == -1 && !is_guest_mode(vcpu);
 }
 
 static inline void kvm_set_mp_state(struct kvm_vcpu *vcpu, int mp_state)

base-commit: 9448598b22c50c8a5bb77a9103e2d49f134c9578
-- 
2.52.0.351.gbe84eed79e-goog
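
For readers who want to reason about the new predicate outside the kernel tree, here is a minimal userspace sketch of the check, not KVM code. It assumes a hypothetical struct mock_vcpu that carries only the two pieces of state the helper consults, plus a single made-up feature MSR value; mock_can_set_cpuid_and_feature_msrs() and mock_set_feature_msr() are illustrative names that loosely mirror the x86.h helper and the do_set_msr() check in the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vCPU state consulted by the patch. */
struct mock_vcpu {
	int last_vmentry_cpu;	/* -1 until the vCPU has run at least once */
	bool guest_mode;	/* true while L2 is active */
	uint64_t feature_msr;	/* one illustrative immutable feature MSR */
};

/* The model is mutable only before the first run and outside of L2. */
static bool mock_can_set_cpuid_and_feature_msrs(const struct mock_vcpu *vcpu)
{
	return vcpu->last_vmentry_cpu == -1 && !vcpu->guest_mode;
}

/* Once the model is frozen, only writes of the current value are accepted. */
static int mock_set_feature_msr(struct mock_vcpu *vcpu, uint64_t data)
{
	if (!mock_can_set_cpuid_and_feature_msrs(vcpu) &&
	    data != vcpu->feature_msr)
		return -EINVAL;

	vcpu->feature_msr = data;
	return 0;
}
```

In this model, a write on a fresh vCPU succeeds, a write of a different value is rejected once guest_mode is set (the state userspace could reach via KVM_SET_NESTED_STATE) or once last_vmentry_cpu is no longer -1, and a same-value write still succeeds, matching the "blindly stuff all MSRs when emulating RESET" allowance described in the x86.c comment.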