From: Sean Christopherson
Reply-To: Sean Christopherson
Date: Wed, 28 Jan 2026 17:15:12 -0800
Subject: [RFC PATCH v5 40/45] KVM: x86: Introduce hugepage_set_guest_inhibit()
Message-ID: <20260129011517.3545883-41-seanjc@google.com>
In-Reply-To: <20260129011517.3545883-1-seanjc@google.com>
References: <20260129011517.3545883-1-seanjc@google.com>
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
    x86@kernel.org, Kiryl Shutsemau, Sean Christopherson, Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
    kvm@vger.kernel.org, Kai Huang, Rick Edgecombe, Yan Zhao,
    Vishal Annapurve, Ackerley Tng, Sagi Shahar, Binbin Wu, Xiaoyao Li,
    Isaku Yamahata
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.53.0.rc1.217.geba53bf80e-goog

From: Yan Zhao

TDX requires guests to accept S-EPT mappings created by the host KVM. Due
to the current implementation of the TDX module, if a guest accepts a GFN
at a lower level after KVM maps it at a higher level, the TDX module will
emulate an EPT violation VMExit to KVM instead of returning a size
mismatch error to the guest. If KVM fails to perform page splitting in the
VMExit handler, the guest's accept operation will be triggered again upon
re-entering the guest, causing a repeated EPT violation VMExit.

To facilitate passing the guest's accept level information to the KVM MMU
core and to prevent the repeated mapping of a GFN at different levels due
to different accept levels specified by different vCPUs, introduce the
interface hugepage_set_guest_inhibit(). This interface specifies across
vCPUs that mapping at a certain level is inhibited from the guest.

Intentionally don't provide an API to clear KVM_LPAGE_GUEST_INHIBIT_FLAG
for the time being, as detecting that it's ok to (re)install a hugepage is
tricky (and costly if KVM wants to be 100% accurate), and KVM doesn't
currently support hugepage promotion (only direct installation of
hugepages) for S-EPT. As a result, the only scenario where clearing the
flag would likely allow KVM to install a hugepage is when an entire
2MiB / 1GiB range is converted to shared or private.
But if the guest is accepting at 4KiB granularity, odds are good the guest
is using the memory for something "special" and will never convert the
entire range to shared (and/or back to private). Punt that optimization to
the future, if it's ever needed.

Link: https://lore.kernel.org/all/a6ffe23fb97e64109f512fa43e9f6405236ed40a.camel@intel.com [1]
Suggested-by: Rick Edgecombe
Suggested-by: Sean Christopherson
Signed-off-by: Yan Zhao
[sean: explain *why* the flag is never cleared]
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu.h     |  4 ++++
 arch/x86/kvm/mmu/mmu.c | 21 ++++++++++++++++++---
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 830f46145692..fa6a8daf4b05 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -322,4 +322,8 @@ static inline bool kvm_is_gfn_alias(struct kvm *kvm, gfn_t gfn)
 {
 	return gfn & kvm_gfn_direct_bits(kvm);
 }
+
+void hugepage_set_guest_inhibit(struct kvm_memory_slot *slot, gfn_t gfn, int level);
+bool hugepage_test_guest_inhibit(struct kvm_memory_slot *slot, gfn_t gfn, int level);
+
 #endif
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 45650f70eeab..c2765bfc8492 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -718,12 +718,14 @@ static struct kvm_lpage_info *lpage_info_slot(gfn_t gfn,
 }
 
 /*
- * The most significant bit in disallow_lpage tracks whether or not memory
- * attributes are mixed, i.e. not identical for all gfns at the current level.
+ * The most 2 significant bits in disallow_lpage tracks whether or not memory
+ * attributes are mixed, i.e. not identical for all gfns at the current level,
+ * or whether or not guest inhibits the current level of hugepage at the gfn.
  * The lower order bits are used to refcount other cases where a hugepage is
  * disallowed, e.g. if KVM has shadow a page table at the gfn.
  */
 #define KVM_LPAGE_MIXED_FLAG	BIT(31)
+#define KVM_LPAGE_GUEST_INHIBIT_FLAG	BIT(30)
 
 static void update_gfn_disallow_lpage_count(const struct kvm_memory_slot *slot,
 					    gfn_t gfn, int count)
@@ -736,7 +738,8 @@ static void update_gfn_disallow_lpage_count(const struct kvm_memory_slot *slot,
 
 		old = linfo->disallow_lpage;
 		linfo->disallow_lpage += count;
-		WARN_ON_ONCE((old ^ linfo->disallow_lpage) & KVM_LPAGE_MIXED_FLAG);
+		WARN_ON_ONCE((old ^ linfo->disallow_lpage) &
+			     (KVM_LPAGE_MIXED_FLAG | KVM_LPAGE_GUEST_INHIBIT_FLAG));
 	}
 }
 
@@ -1648,6 +1651,18 @@ static bool __kvm_rmap_zap_gfn_range(struct kvm *kvm,
 				 start, end - 1, can_yield, true, flush);
 }
 
+bool hugepage_test_guest_inhibit(struct kvm_memory_slot *slot, gfn_t gfn, int level)
+{
+	return lpage_info_slot(gfn, slot, level)->disallow_lpage & KVM_LPAGE_GUEST_INHIBIT_FLAG;
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(hugepage_test_guest_inhibit);
+
+void hugepage_set_guest_inhibit(struct kvm_memory_slot *slot, gfn_t gfn, int level)
+{
+	lpage_info_slot(gfn, slot, level)->disallow_lpage |= KVM_LPAGE_GUEST_INHIBIT_FLAG;
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(hugepage_set_guest_inhibit);
+
 bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	bool flush = false;
-- 
2.53.0.rc1.217.geba53bf80e-goog
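
The disallow_lpage encoding that this patch extends (two flag bits on top
of a refcount) can be sketched in plain userspace C. This is only an
illustration under stated assumptions: struct lpage_info, the LPAGE_*
macros and every helper name below are hypothetical stand-ins rather than
the KVM definitions, the field is modeled as an unsigned 32-bit value, and
the sketch assumes that any nonzero value blocks a hugepage at that level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the two flag bits named in the patch. */
#define LPAGE_MIXED_FLAG         (1u << 31)
#define LPAGE_GUEST_INHIBIT_FLAG (1u << 30)
#define LPAGE_FLAGS              (LPAGE_MIXED_FLAG | LPAGE_GUEST_INHIBIT_FLAG)

/* Hypothetical, simplified analogue of the per-gfn, per-level tracking entry. */
struct lpage_info {
	uint32_t disallow_lpage;	/* bits 31:30 are flags, bits 29:0 a refcount */
};

/* Sticky: mirrors the patch in offering no way to clear the flag. */
static inline void set_guest_inhibit(struct lpage_info *li)
{
	li->disallow_lpage |= LPAGE_GUEST_INHIBIT_FLAG;
}

static inline bool test_guest_inhibit(const struct lpage_info *li)
{
	return (li->disallow_lpage & LPAGE_GUEST_INHIBIT_FLAG) != 0;
}

/*
 * Adjust the refcount by 'count'.  Returns false if the adjustment flipped
 * either flag bit, i.e. the condition the patched WARN_ON_ONCE() checks.
 */
static inline bool update_count(struct lpage_info *li, int count)
{
	uint32_t old = li->disallow_lpage;

	li->disallow_lpage += (uint32_t)count;
	return ((old ^ li->disallow_lpage) & LPAGE_FLAGS) == 0;
}

static inline uint32_t lpage_refcount(const struct lpage_info *li)
{
	return li->disallow_lpage & ~LPAGE_FLAGS;
}

/* Assumption: a hugepage is permitted only when no flag and no ref is set. */
static inline bool hugepage_allowed(const struct lpage_info *li)
{
	return li->disallow_lpage == 0;
}

/* Small walk-through: the flag survives refcount churn and keeps blocking. */
static inline void lpage_demo(void)
{
	struct lpage_info li = { 0 };

	assert(hugepage_allowed(&li));
	assert(update_count(&li, 1));	/* e.g. a shadowed page table */
	set_guest_inhibit(&li);		/* guest accepted at a smaller level */
	assert(update_count(&li, -1));	/* refcount drops back to zero */
	assert(lpage_refcount(&li) == 0);
	assert(test_guest_inhibit(&li));
	assert(!hugepage_allowed(&li));	/* still blocked by the sticky flag */
}
```

Calling lpage_demo() from a test harness walks through the case the
changelog describes: once the guest-inhibit bit is set, refcount changes
leave it untouched, so the level stays inhibited for all vCPUs.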