From: Sean Christopherson
Date: Thu, 29 May 2025 16:40:02 -0700
Subject: [PATCH 17/28] KVM: SVM: Manually recalc all MSR intercepts on userspace MSR filter change
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Borislav Petkov, Xin Li, Chao Gao, Dapeng Mi
X-Mailing-List: linux-kernel@vger.kernel.org
Message-ID: <20250529234013.3826933-18-seanjc@google.com>
In-Reply-To: <20250529234013.3826933-1-seanjc@google.com>
References: <20250529234013.3826933-1-seanjc@google.com>
X-Mailer: git-send-email 2.49.0.1204.g71687c7c1d-goog

On a userspace MSR filter change, recalculate all MSR intercepts using the
filter-agnostic logic instead of maintaining a "shadow copy" of KVM's
desired intercepts.  The shadow bitmaps add yet another point of failure,
are confusing (e.g. what does "handled specially" mean!?!?), an eyesore,
and a maintenance burden.

Given that KVM *must* be able to recalculate the correct intercepts at any
given time, and that MSR filter updates are not hot paths, there is zero
benefit to maintaining the shadow bitmaps.

Link: https://lore.kernel.org/all/aCdPbZiYmtni4Bjs@google.com
Link: https://lore.kernel.org/all/20241126180253.GAZ0YNTdXH1UGeqsu6@fat_crate.local
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/svm/sev.c |  16 +-
 arch/x86/kvm/svm/svm.c | 371 +++++++++++------------------------------
 arch/x86/kvm/svm/svm.h |   7 +-
 3 files changed, 105 insertions(+), 289 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 694d38a2327c..800ece58b84c 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4411,9 +4411,12 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
 				    count, in);
 }
 
-static void sev_es_vcpu_after_set_cpuid(struct vcpu_svm *svm)
+void sev_es_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 {
-	struct kvm_vcpu *vcpu = &svm->vcpu;
+	/* Clear intercepts on MSRs that are context switched by hardware. */
+	svm_disable_intercept_for_msr(vcpu, MSR_AMD64_SEV_ES_GHCB, MSR_TYPE_RW);
+	svm_disable_intercept_for_msr(vcpu, MSR_EFER, MSR_TYPE_RW);
+	svm_disable_intercept_for_msr(vcpu, MSR_IA32_CR_PAT, MSR_TYPE_RW);
 
 	if (boot_cpu_has(X86_FEATURE_V_TSC_AUX))
 		svm_set_intercept_for_msr(vcpu, MSR_TSC_AUX, MSR_TYPE_RW,
@@ -4448,16 +4451,12 @@ void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm)
 	best = kvm_find_cpuid_entry(vcpu, 0x8000001F);
 	if (best)
 		vcpu->arch.reserved_gpa_bits &= ~(1UL << (best->ebx & 0x3f));
-
-	if (sev_es_guest(svm->vcpu.kvm))
-		sev_es_vcpu_after_set_cpuid(svm);
 }
 
 static void sev_es_init_vmcb(struct vcpu_svm *svm)
 {
 	struct kvm_sev_info *sev = to_kvm_sev_info(svm->vcpu.kvm);
 	struct vmcb *vmcb = svm->vmcb01.ptr;
-	struct kvm_vcpu *vcpu = &svm->vcpu;
 
 	svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ES_ENABLE;
 
@@ -4511,11 +4510,6 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
 
 	/* Can't intercept XSETBV, HV can't modify XCR0 directly */
 	svm_clr_intercept(svm, INTERCEPT_XSETBV);
-
-	/* Clear intercepts on MSRs that are context switched by hardware. */
-	svm_disable_intercept_for_msr(vcpu, MSR_AMD64_SEV_ES_GHCB, MSR_TYPE_RW);
-	svm_disable_intercept_for_msr(vcpu, MSR_EFER, MSR_TYPE_RW);
-	svm_disable_intercept_for_msr(vcpu, MSR_IA32_CR_PAT, MSR_TYPE_RW);
 }
 
 void sev_init_vmcb(struct vcpu_svm *svm)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c01eda772997..685d9fd4a4e1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -71,8 +71,6 @@ MODULE_DEVICE_TABLE(x86cpu, svm_cpu_id);
 
 static bool erratum_383_found __read_mostly;
 
-u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
-
 /*
  * Set osvw_len to higher value when updated Revision Guides
  * are published and we know what the new status bits are
@@ -81,70 +79,6 @@ static uint64_t osvw_len = 4, osvw_status;
 
 static DEFINE_PER_CPU(u64, current_tsc_ratio);
 
-static const u32 direct_access_msrs[] = {
-	MSR_STAR,
-	MSR_IA32_SYSENTER_CS,
-	MSR_IA32_SYSENTER_EIP,
-	MSR_IA32_SYSENTER_ESP,
-#ifdef CONFIG_X86_64
-	MSR_GS_BASE,
-	MSR_FS_BASE,
-	MSR_KERNEL_GS_BASE,
-	MSR_LSTAR,
-	MSR_CSTAR,
-	MSR_SYSCALL_MASK,
-#endif
-	MSR_IA32_SPEC_CTRL,
-	MSR_IA32_PRED_CMD,
-	MSR_IA32_FLUSH_CMD,
-	MSR_IA32_DEBUGCTLMSR,
-	MSR_IA32_LASTBRANCHFROMIP,
-	MSR_IA32_LASTBRANCHTOIP,
-	MSR_IA32_LASTINTFROMIP,
-	MSR_IA32_LASTINTTOIP,
-	MSR_IA32_XSS,
-	MSR_EFER,
-	MSR_IA32_CR_PAT,
-	MSR_AMD64_SEV_ES_GHCB,
-	MSR_TSC_AUX,
-	X2APIC_MSR(APIC_ID),
-	X2APIC_MSR(APIC_LVR),
-	X2APIC_MSR(APIC_TASKPRI),
-	X2APIC_MSR(APIC_ARBPRI),
-	X2APIC_MSR(APIC_PROCPRI),
-	X2APIC_MSR(APIC_EOI),
-	X2APIC_MSR(APIC_RRR),
-	X2APIC_MSR(APIC_LDR),
-	X2APIC_MSR(APIC_DFR),
-	X2APIC_MSR(APIC_SPIV),
-	X2APIC_MSR(APIC_ISR),
-	X2APIC_MSR(APIC_TMR),
-	X2APIC_MSR(APIC_IRR),
-	X2APIC_MSR(APIC_ESR),
-	X2APIC_MSR(APIC_ICR),
-	X2APIC_MSR(APIC_ICR2),
-
-	/*
-	 * Note:
-	 * AMD does not virtualize APIC TSC-deadline timer mode, but it is
-	 * emulated by KVM. When setting APIC LVTT (0x832) register bit 18,
-	 * the AVIC hardware would generate GP fault. Therefore, always
-	 * intercept the MSR 0x832, and do not setup direct_access_msr.
-	 */
-	X2APIC_MSR(APIC_LVTTHMR),
-	X2APIC_MSR(APIC_LVTPC),
-	X2APIC_MSR(APIC_LVT0),
-	X2APIC_MSR(APIC_LVT1),
-	X2APIC_MSR(APIC_LVTERR),
-	X2APIC_MSR(APIC_TMICT),
-	X2APIC_MSR(APIC_TMCCT),
-	X2APIC_MSR(APIC_TDCR),
-};
-
-static_assert(ARRAY_SIZE(direct_access_msrs) ==
-	      MAX_DIRECT_ACCESS_MSRS - 6 * !IS_ENABLED(CONFIG_X86_64));
-#undef MAX_DIRECT_ACCESS_MSRS
-
 /*
  * These 2 parameters are used to config the controls for Pause-Loop Exiting:
  * pause_filter_count: On processors that support Pause filtering(indicated
@@ -761,44 +695,6 @@ static void clr_dr_intercepts(struct vcpu_svm *svm)
 	recalc_intercepts(svm);
 }
 
-static int direct_access_msr_slot(u32 msr)
-{
-	u32 i;
-
-	for (i = 0; i < ARRAY_SIZE(direct_access_msrs); i++) {
-		if (direct_access_msrs[i] == msr)
-			return i;
-	}
-
-	return -ENOENT;
-}
-
-static void set_shadow_msr_intercept(struct kvm_vcpu *vcpu, u32 msr, int read,
-				     int write)
-{
-	struct vcpu_svm *svm = to_svm(vcpu);
-	int slot = direct_access_msr_slot(msr);
-
-	if (slot == -ENOENT)
-		return;
-
-	/* Set the shadow bitmaps to the desired intercept states */
-	if (read)
-		__set_bit(slot, svm->shadow_msr_intercept.read);
-	else
-		__clear_bit(slot, svm->shadow_msr_intercept.read);
-
-	if (write)
-		__set_bit(slot, svm->shadow_msr_intercept.write);
-	else
-		__clear_bit(slot, svm->shadow_msr_intercept.write);
-}
-
-static bool valid_msr_intercept(u32 index)
-{
-	return direct_access_msr_slot(index) != -ENOENT;
-}
-
 static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
 {
 	/*
@@ -816,62 +712,11 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
 	return svm_test_msr_bitmap_write(msrpm, msr);
 }
 
-static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
-					u32 msr, int read, int write)
-{
-	struct vcpu_svm *svm = to_svm(vcpu);
-	u8 bit_read, bit_write;
-	unsigned long tmp;
-	u32 offset;
-
-	/*
-	 * If this warning triggers extend the direct_access_msrs list at the
-	 * beginning of the file
-	 */
-	WARN_ON(!valid_msr_intercept(msr));
-
-	/* Enforce non allowed MSRs to trap */
-	if (read && !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_READ))
-		read = 0;
-
-	if (write && !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_WRITE))
-		write = 0;
-
-	offset    = svm_msrpm_offset(msr);
-	bit_read  = 2 * (msr & 0x0f);
-	bit_write = 2 * (msr & 0x0f) + 1;
-	tmp       = msrpm[offset];
-
-	if (KVM_BUG_ON(offset == MSR_INVALID, vcpu->kvm))
-		return;
-
-	read  ? __clear_bit(bit_read,  &tmp) : __set_bit(bit_read,  &tmp);
-	write ? __clear_bit(bit_write, &tmp) : __set_bit(bit_write, &tmp);
-
-	if (read)
-		svm_clear_msr_bitmap_read((void *)msrpm, msr);
-	else
-		svm_set_msr_bitmap_read((void *)msrpm, msr);
-
-	if (write)
-		svm_clear_msr_bitmap_write((void *)msrpm, msr);
-	else
-		svm_set_msr_bitmap_write((void *)msrpm, msr);
-
-	WARN_ON_ONCE(msrpm[offset] != (u32)tmp);
-
-	svm_hv_vmcb_dirty_nested_enlightenments(vcpu);
-	svm->nested.force_msr_bitmap_recalc = true;
-}
-
 void svm_disable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	void *msrpm = svm->msrpm;
 
-	/* Note, the shadow intercept bitmaps have inverted polarity. */
-	set_shadow_msr_intercept(vcpu, msr, type & MSR_TYPE_R, type & MSR_TYPE_W);
-
 	/*
 	 * Don't disabled interception for the MSR if userspace wants to
 	 * handle it.
@@ -903,10 +748,6 @@ void svm_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type)
 	struct vcpu_svm *svm = to_svm(vcpu);
 	void *msrpm = svm->msrpm;
 
-
-	set_shadow_msr_intercept(vcpu, msr,
-				 !(type & MSR_TYPE_R), !(type & MSR_TYPE_W));
-
 	if (type & MSR_TYPE_R)
 		svm_set_msr_bitmap_read(msrpm, msr);
 
@@ -932,6 +773,20 @@ u32 *svm_vcpu_alloc_msrpm(void)
 	return msrpm;
 }
 
+static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu)
+{
+	bool intercept = !(to_svm(vcpu)->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK);
+
+	svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHFROMIP, MSR_TYPE_RW, intercept);
+	svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHTOIP, MSR_TYPE_RW, intercept);
+	svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTINTFROMIP, MSR_TYPE_RW, intercept);
+	svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTINTTOIP, MSR_TYPE_RW, intercept);
+
+	if (sev_es_guest(vcpu->kvm))
+		svm_set_intercept_for_msr(vcpu, MSR_IA32_DEBUGCTLMSR, MSR_TYPE_RW, intercept);
+
+}
+
 static void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu)
 {
 	svm_disable_intercept_for_msr(vcpu, MSR_STAR, MSR_TYPE_RW);
@@ -949,6 +804,38 @@ static void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu)
 
 void svm_set_x2apic_msr_interception(struct vcpu_svm *svm, bool intercept)
 {
+	static const u32 x2avic_passthrough_msrs[] = {
+		X2APIC_MSR(APIC_ID),
+		X2APIC_MSR(APIC_LVR),
+		X2APIC_MSR(APIC_TASKPRI),
+		X2APIC_MSR(APIC_ARBPRI),
+		X2APIC_MSR(APIC_PROCPRI),
+		X2APIC_MSR(APIC_EOI),
+		X2APIC_MSR(APIC_RRR),
+		X2APIC_MSR(APIC_LDR),
+		X2APIC_MSR(APIC_DFR),
+		X2APIC_MSR(APIC_SPIV),
+		X2APIC_MSR(APIC_ISR),
+		X2APIC_MSR(APIC_TMR),
+		X2APIC_MSR(APIC_IRR),
+		X2APIC_MSR(APIC_ESR),
+		X2APIC_MSR(APIC_ICR),
+		X2APIC_MSR(APIC_ICR2),
+
+		/*
+		 * Note!  Always intercept LVTT, as TSC-deadline timer mode
+		 * isn't virtualized by hardware, and the CPU will generate a
+		 * #GP instead of a #VMEXIT.
+		 */
+		X2APIC_MSR(APIC_LVTTHMR),
+		X2APIC_MSR(APIC_LVTPC),
+		X2APIC_MSR(APIC_LVT0),
+		X2APIC_MSR(APIC_LVT1),
+		X2APIC_MSR(APIC_LVTERR),
+		X2APIC_MSR(APIC_TMICT),
+		X2APIC_MSR(APIC_TMCCT),
+		X2APIC_MSR(APIC_TDCR),
+	};
 	int i;
 
 	if (intercept == svm->x2avic_msrs_intercepted)
@@ -957,15 +844,9 @@ void svm_set_x2apic_msr_interception(struct vcpu_svm *svm, bool intercept)
 	if (!x2avic_enabled)
 		return;
 
-	for (i = 0; i < ARRAY_SIZE(direct_access_msrs); i++) {
-		int index = direct_access_msrs[i];
-
-		if ((index < APIC_BASE_MSR) ||
-		    (index > APIC_BASE_MSR + 0xff))
-			continue;
-
-		svm_set_intercept_for_msr(&svm->vcpu, index, MSR_TYPE_RW, intercept);
-	}
+	for (i = 0; i < ARRAY_SIZE(x2avic_passthrough_msrs); i++)
+		svm_set_intercept_for_msr(&svm->vcpu, x2avic_passthrough_msrs[i],
+					  MSR_TYPE_RW, intercept);
 
 	svm->x2avic_msrs_intercepted = intercept;
 }
@@ -975,65 +856,53 @@ void svm_vcpu_free_msrpm(u32 *msrpm)
 	__free_pages(virt_to_page(msrpm), get_order(MSRPM_SIZE));
 }
 
+static void svm_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	svm_vcpu_init_msrpm(vcpu);
+
+	if (lbrv)
+		svm_recalc_lbr_msr_intercepts(vcpu);
+
+	if (boot_cpu_has(X86_FEATURE_IBPB))
+		svm_set_intercept_for_msr(vcpu, MSR_IA32_PRED_CMD, MSR_TYPE_W,
+					  !guest_has_pred_cmd_msr(vcpu));
+
+	if (boot_cpu_has(X86_FEATURE_FLUSH_L1D))
+		svm_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
+					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
+
+	/*
+	 * Unconditionally disable interception of SPEC_CTRL if V_SPEC_CTRL is
+	 * supported, i.e. if VMRUN/#VMEXIT context switch MSR_IA32_SPEC_CTRL.
+	 */
+	if (boot_cpu_has(X86_FEATURE_V_SPEC_CTRL))
+		svm_disable_intercept_for_msr(vcpu, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
+	else
+		svm_set_intercept_for_msr(vcpu, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW, !svm->spec_ctrl);
+
+	/*
+	 * Intercept SYSENTER_EIP and SYSENTER_ESP when emulating an Intel CPU,
+	 * as AMD hardware only store 32 bits, whereas Intel CPUs track 64 bits.
+	 */
+	svm_set_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW,
+				  guest_cpuid_is_intel_compatible(vcpu));
+	svm_set_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW,
+				  guest_cpuid_is_intel_compatible(vcpu));
+
+	if (sev_es_guest(vcpu->kvm))
+		sev_es_recalc_msr_intercepts(vcpu);
+
+	/*
+	 * x2APIC intercepts are modified on-demand and cannot be filtered by
+	 * userspace.
+	 */
+}
+
 static void svm_msr_filter_changed(struct kvm_vcpu *vcpu)
 {
-	struct vcpu_svm *svm = to_svm(vcpu);
-	u32 i;
-
-	/*
-	 * Set intercept permissions for all direct access MSRs again. They
-	 * will automatically get filtered through the MSR filter, so we are
-	 * back in sync after this.
-	 */
-	for (i = 0; i < ARRAY_SIZE(direct_access_msrs); i++) {
-		u32 msr = direct_access_msrs[i];
-		u32 read = test_bit(i, svm->shadow_msr_intercept.read);
-		u32 write = test_bit(i, svm->shadow_msr_intercept.write);
-
-		set_msr_interception_bitmap(vcpu, svm->msrpm, msr, read, write);
-	}
-}
-
-static __init int add_msr_offset(u32 offset)
-{
-	int i;
-
-	for (i = 0; i < MSRPM_OFFSETS; ++i) {
-
-		/* Offset already in list? */
-		if (msrpm_offsets[i] == offset)
-			return 0;
-
-		/* Slot used by another offset? */
-		if (msrpm_offsets[i] != MSR_INVALID)
-			continue;
-
-		/* Add offset to list */
-		msrpm_offsets[i] = offset;
-
-		return 0;
-	}
-
-	return -EIO;
-}
-
-static __init int init_msrpm_offsets(void)
-{
-	int i;
-
-	memset(msrpm_offsets, 0xff, sizeof(msrpm_offsets));
-
-	for (i = 0; i < ARRAY_SIZE(direct_access_msrs); i++) {
-		u32 offset;
-
-		offset = svm_msrpm_offset(direct_access_msrs[i]);
-		if (WARN_ON(offset == MSR_INVALID))
-			return -EIO;
-
-		if (WARN_ON_ONCE(add_msr_offset(offset)))
-			return -EIO;
-	}
-	return 0;
+	svm_recalc_msr_intercepts(vcpu);
 }
 
 void svm_copy_lbrs(struct vmcb *to_vmcb, struct vmcb *from_vmcb)
@@ -1052,13 +921,7 @@ void svm_enable_lbrv(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
-	svm_disable_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHFROMIP, MSR_TYPE_RW);
-	svm_disable_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHTOIP, MSR_TYPE_RW);
-	svm_disable_intercept_for_msr(vcpu, MSR_IA32_LASTINTFROMIP, MSR_TYPE_RW);
-	svm_disable_intercept_for_msr(vcpu, MSR_IA32_LASTINTTOIP, MSR_TYPE_RW);
-
-	if (sev_es_guest(vcpu->kvm))
-		svm_disable_intercept_for_msr(vcpu, MSR_IA32_DEBUGCTLMSR, MSR_TYPE_RW);
+	svm_recalc_lbr_msr_intercepts(vcpu);
 
 	/* Move the LBR msrs to the vmcb02 so that the guest can see them. */
 	if (is_guest_mode(vcpu))
@@ -1072,10 +935,7 @@ static void svm_disable_lbrv(struct kvm_vcpu *vcpu)
 	KVM_BUG_ON(sev_es_guest(vcpu->kvm), vcpu->kvm);
 
 	svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
-	svm_enable_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHFROMIP, MSR_TYPE_RW);
-	svm_enable_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHTOIP, MSR_TYPE_RW);
-	svm_enable_intercept_for_msr(vcpu, MSR_IA32_LASTINTFROMIP, MSR_TYPE_RW);
-	svm_enable_intercept_for_msr(vcpu, MSR_IA32_LASTINTTOIP, MSR_TYPE_RW);
+	svm_recalc_lbr_msr_intercepts(vcpu);
 
 	/*
 	 * Move the LBR msrs back to the vmcb01 to avoid copying them
@@ -1258,17 +1118,9 @@ static inline void init_vmcb_after_set_cpuid(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	if (guest_cpuid_is_intel_compatible(vcpu)) {
-		/*
-		 * We must intercept SYSENTER_EIP and SYSENTER_ESP
-		 * accesses because the processor only stores 32 bits.
-		 * For the same reason we cannot use virtual VMLOAD/VMSAVE.
-		 */
 		svm_set_intercept(svm, INTERCEPT_VMLOAD);
 		svm_set_intercept(svm, INTERCEPT_VMSAVE);
 		svm->vmcb->control.virt_ext &= ~VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
-
-		svm_enable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
-		svm_enable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
 	} else {
 		/*
 		 * If hardware supports Virtual VMLOAD VMSAVE then enable it
@@ -1279,10 +1131,9 @@ static inline void init_vmcb_after_set_cpuid(struct kvm_vcpu *vcpu)
 			svm_clr_intercept(svm, INTERCEPT_VMSAVE);
 			svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
 		}
-		/* No need to intercept these MSRs */
-		svm_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
-		svm_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
 	}
+
+	svm_recalc_msr_intercepts(vcpu);
 }
 
 static void init_vmcb(struct kvm_vcpu *vcpu)
@@ -1409,13 +1260,6 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 
 	svm_recalc_instruction_intercepts(vcpu, svm);
 
-	/*
-	 * If the host supports V_SPEC_CTRL then disable the interception
-	 * of MSR_IA32_SPEC_CTRL.
-	 */
-	if (boot_cpu_has(X86_FEATURE_V_SPEC_CTRL))
-		svm_disable_intercept_for_msr(vcpu, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
-
 	if (kvm_vcpu_apicv_active(vcpu))
 		avic_init_vmcb(svm, vmcb);
 
@@ -1446,8 +1290,6 @@ static void __svm_vcpu_reset(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	svm_vcpu_init_msrpm(vcpu);
-
 	svm_init_osvw(vcpu);
 
 	if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_STUFF_FEATURE_MSRS))
@@ -3247,8 +3089,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 
 		/*
 		 * TSC_AUX is usually changed only during boot and never read
-		 * directly.  Intercept TSC_AUX instead of exposing it to the
-		 * guest via direct_access_msrs, and switch it via user return.
+		 * directly.  Intercept TSC_AUX and switch it via user return.
 		 */
 		preempt_disable();
 		ret = kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
@@ -4684,14 +4525,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 	svm_recalc_instruction_intercepts(vcpu, svm);
 
-	if (boot_cpu_has(X86_FEATURE_IBPB))
-		svm_set_intercept_for_msr(vcpu, MSR_IA32_PRED_CMD, MSR_TYPE_W,
-					  !guest_has_pred_cmd_msr(vcpu));
-
-	if (boot_cpu_has(X86_FEATURE_FLUSH_L1D))
-		svm_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
-					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
-
 	if (sev_guest(vcpu->kvm))
 		sev_vcpu_after_set_cpuid(svm);
 
@@ -5559,12 +5392,6 @@ static __init int svm_hardware_setup(void)
 	memset(iopm_va, 0xff, PAGE_SIZE * (1 << order));
 	iopm_base = __sme_page_pa(iopm_pages);
 
-	r = init_msrpm_offsets();
-	if (r) {
-		__free_pages(__sme_pa_to_page(iopm_base), get_order(IOPM_SIZE));
-		return r;
-	}
-
 	kvm_caps.supported_xcr0 &= ~(XFEATURE_MASK_BNDREGS |
 				     XFEATURE_MASK_BNDCSR);
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 32bb1e536dce..23e1e3ae30b0 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -318,12 +318,6 @@ struct vcpu_svm {
 	struct list_head ir_list;
 	spinlock_t ir_list_lock;
 
-	/* Save desired MSR intercept (read: pass-through) state */
-	struct {
-		DECLARE_BITMAP(read, MAX_DIRECT_ACCESS_MSRS);
-		DECLARE_BITMAP(write, MAX_DIRECT_ACCESS_MSRS);
-	} shadow_msr_intercept;
-
 	struct vcpu_sev_es_state sev_es;
 
 	bool guest_state_loaded;
@@ -824,6 +818,7 @@ void sev_init_vmcb(struct vcpu_svm *svm);
 void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm);
 int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
 void sev_es_vcpu_reset(struct vcpu_svm *svm);
+void sev_es_recalc_msr_intercepts(struct kvm_vcpu *vcpu);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
 void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
-- 
2.49.0.1204.g71687c7c1d-goog
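
For readers who want the idea in isolation: the changelog's argument is that a filter change can rerun the same filter-agnostic policy used at setup, so no shadow copy of the "desired" state is needed. The following is a minimal userspace sketch of that pattern, not KVM code. Every name in it (toy_vcpu, toy_recalc_intercepts, the MSR slot numbers, the filter_deny array) is hypothetical and exists only for illustration; the real helpers operate on the hardware MSR permission map and consult kvm_msr_allowed().

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model; none of these names exist in KVM. */
enum { TOY_NR_MSRS = 4 };
enum { TOY_READ = 1, TOY_WRITE = 2, TOY_RW = TOY_READ | TOY_WRITE };

struct toy_vcpu {
	uint8_t intercept[TOY_NR_MSRS];   /* bit set = access exits to the host */
	uint8_t filter_deny[TOY_NR_MSRS]; /* bit set = userspace wants the exit */
	bool lbr_enabled;                 /* stand-in for guest-visible state */
};

/* Pass an MSR through, unless the userspace filter denies the access. */
static void toy_disable_intercept(struct toy_vcpu *v, unsigned int msr, uint8_t type)
{
	v->intercept[msr] &= (uint8_t)~type;
	v->intercept[msr] |= (uint8_t)(v->filter_deny[msr] & type);
}

static void toy_enable_intercept(struct toy_vcpu *v, unsigned int msr, uint8_t type)
{
	v->intercept[msr] |= type;
}

static void toy_set_intercept(struct toy_vcpu *v, unsigned int msr,
			      uint8_t type, bool intercept)
{
	if (intercept)
		toy_enable_intercept(v, msr, type);
	else
		toy_disable_intercept(v, msr, type);
}

/*
 * Re-apply the full policy from current vCPU state.  Each helper consults
 * the filter itself, so nothing remembers what was "desired" earlier.
 */
static void toy_recalc_intercepts(struct toy_vcpu *v)
{
	toy_disable_intercept(v, 0, TOY_RW);              /* always passed through */
	toy_set_intercept(v, 1, TOY_RW, !v->lbr_enabled); /* depends on vCPU state */
}

static void toy_vcpu_init(struct toy_vcpu *v)
{
	memset(v, 0, sizeof(*v));
	memset(v->intercept, TOY_RW, sizeof(v->intercept)); /* default: intercept */
	toy_recalc_intercepts(v);
}

/* A filter update simply reruns the same policy. */
static void toy_filter_changed(struct toy_vcpu *v)
{
	toy_recalc_intercepts(v);
}
```

In this sketch, setting filter_deny[0] to TOY_WRITE and calling toy_filter_changed() leaves only writes to slot 0 intercepted, and clearing the filter and calling it again restores full pass-through, with no state kept beyond the filter and the vCPU flags. The cost is that every policy decision must be reachable from the single recalc function, which is the consolidation the patch performs.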