From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:19 -0700
In-Reply-To:
 <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-2-seanjc@google.com>
Subject: [PATCH v3 01/25] KVM: Make support for kvm_arch_vcpu_async_ioctl() mandatory
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
 linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
 Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu
Content-Type: text/plain; charset="utf-8"

Implement kvm_arch_vcpu_async_ioctl() "natively" in x86 and arm64 instead
of relying on an #ifdef'd stub, and drop HAVE_KVM_VCPU_ASYNC_IOCTL in
anticipation of using the API on x86.  Once x86 uses the API, providing a
stub for one architecture and having all other architectures opt in would
require more code than simply implementing the API in the lone holdout.

Eliminating the Kconfig will also reduce churn if the API is renamed in
the future (spoiler alert).

No functional change intended.
Signed-off-by: Sean Christopherson
Acked-by: Claudio Imbrenda
---
 arch/arm64/kvm/arm.c       |  6 ++++++
 arch/loongarch/kvm/Kconfig |  1 -
 arch/mips/kvm/Kconfig      |  1 -
 arch/powerpc/kvm/Kconfig   |  1 -
 arch/riscv/kvm/Kconfig     |  1 -
 arch/s390/kvm/Kconfig      |  1 -
 arch/x86/kvm/x86.c         |  6 ++++++
 include/linux/kvm_host.h   | 10 ----------
 virt/kvm/Kconfig           |  3 ---
 9 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index f21d1b7f20f8..785aaaee6a5d 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1828,6 +1828,12 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	return r;
 }
 
+long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
+			       unsigned long arg)
+{
+	return -ENOIOCTLCMD;
+}
+
 void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
 {
 
diff --git a/arch/loongarch/kvm/Kconfig b/arch/loongarch/kvm/Kconfig
index ae64bbdf83a7..ed4f724db774 100644
--- a/arch/loongarch/kvm/Kconfig
+++ b/arch/loongarch/kvm/Kconfig
@@ -25,7 +25,6 @@ config KVM
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_MSI
 	select HAVE_KVM_READONLY_MEM
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select KVM_COMMON
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	select KVM_GENERIC_HARDWARE_ENABLING
diff --git a/arch/mips/kvm/Kconfig b/arch/mips/kvm/Kconfig
index ab57221fa4dd..cc13cc35f208 100644
--- a/arch/mips/kvm/Kconfig
+++ b/arch/mips/kvm/Kconfig
@@ -22,7 +22,6 @@ config KVM
 	select EXPORT_UASM
 	select KVM_COMMON
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select KVM_MMIO
 	select KVM_GENERIC_MMU_NOTIFIER
 	select KVM_GENERIC_HARDWARE_ENABLING
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 2f2702c867f7..c9a2d50ff1b0 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -20,7 +20,6 @@ if VIRTUALIZATION
 config KVM
 	bool
 	select KVM_COMMON
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select KVM_VFIO
 	select HAVE_KVM_IRQ_BYPASS
 
diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
index c50328212917..77379f77840a 100644
--- a/arch/riscv/kvm/Kconfig
+++ b/arch/riscv/kvm/Kconfig
@@ -23,7 +23,6 @@ config KVM
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_MSI
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select HAVE_KVM_READONLY_MEM
 	select HAVE_KVM_DIRTY_RING_ACQ_REL
 	select KVM_COMMON
diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index cae908d64550..96d16028e8b7 100644
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -20,7 +20,6 @@ config KVM
 	def_tristate y
 	prompt "Kernel-based Virtual Machine (KVM) support"
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select KVM_ASYNC_PF
 	select KVM_ASYNC_PF_SYNC
 	select KVM_COMMON
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4b5d2d09634..ca5ba2caf314 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7240,6 +7240,12 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
 	return 0;
 }
 
+long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
+			       unsigned long arg)
+{
+	return -ENOIOCTLCMD;
+}
+
 int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
 	struct kvm *kvm = filp->private_data;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5bd76cf394fa..7186b2ae4b57 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2437,18 +2437,8 @@ static inline bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
 }
 #endif /* CONFIG_HAVE_KVM_NO_POLL */
 
-#ifdef CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL
 long kvm_arch_vcpu_async_ioctl(struct file *filp,
 			       unsigned int ioctl, unsigned long arg);
-#else
-static inline long kvm_arch_vcpu_async_ioctl(struct file *filp,
-					     unsigned int ioctl,
-					     unsigned long arg)
-{
-	return -ENOIOCTLCMD;
-}
-#endif /* CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL */
-
 void kvm_arch_guest_memory_reclaimed(struct kvm *kvm);
 
 #ifdef CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 5f0015c5dd95..267c7369c765 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -78,9 +78,6 @@ config HAVE_KVM_IRQ_BYPASS
 	tristate
 	select IRQ_BYPASS_MANAGER
 
-config HAVE_KVM_VCPU_ASYNC_IOCTL
-	bool
-
 config HAVE_KVM_VCPU_RUN_PID_CHANGE
 	bool
 
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:20 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-3-seanjc@google.com>
Subject: [PATCH v3 02/25] KVM: Rename kvm_arch_vcpu_async_ioctl() to kvm_arch_vcpu_unlocked_ioctl()
From: Sean Christopherson
Content-Type: text/plain; charset="utf-8"

Rename the "async" ioctl API to "unlocked" so that upcoming usage in x86's
TDX code doesn't result in a massive misnomer.  To avoid having to retry
SEAMCALLs, TDX needs to acquire kvm->lock *and* all vcpu->mutex locks, and
acquiring all of those locks after/inside the current vCPU's mutex is a
non-starter.  However, TDX also needs to acquire the vCPU's mutex and load
the vCPU, i.e. the handling is very much not async to the vCPU.

No functional change intended.
Signed-off-by: Sean Christopherson
Acked-by: Claudio Imbrenda
---
 arch/arm64/kvm/arm.c       | 4 ++--
 arch/loongarch/kvm/vcpu.c  | 4 ++--
 arch/mips/kvm/mips.c       | 4 ++--
 arch/powerpc/kvm/powerpc.c | 4 ++--
 arch/riscv/kvm/vcpu.c      | 4 ++--
 arch/s390/kvm/kvm-s390.c   | 4 ++--
 arch/x86/kvm/x86.c         | 4 ++--
 include/linux/kvm_host.h   | 4 ++--
 virt/kvm/kvm_main.c        | 6 +++---
 9 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 785aaaee6a5d..e8d654024608 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1828,8 +1828,8 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	return r;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
-			       unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	return -ENOIOCTLCMD;
 }
diff --git a/arch/loongarch/kvm/vcpu.c b/arch/loongarch/kvm/vcpu.c
index 30e3b089a596..9a5844e85fd3 100644
--- a/arch/loongarch/kvm/vcpu.c
+++ b/arch/loongarch/kvm/vcpu.c
@@ -1471,8 +1471,8 @@ int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq)
 	return 0;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
 	struct kvm_vcpu *vcpu = filp->private_data;
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index a75587018f44..b0fb92fda4d4 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -895,8 +895,8 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 	return r;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
-			       unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
 	void __user *argp = (void __user *)arg;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 2ba057171ebe..9a89a6d98f97 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -2028,8 +2028,8 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 	return -EINVAL;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
 	void __user *argp = (void __user *)arg;
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index bccb919ca615..a4bd6077eecc 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -238,8 +238,8 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
 	return VM_FAULT_SIGBUS;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
 	void __user *argp = (void __user *)arg;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 16ba04062854..8c4caa5f2fcd 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -5730,8 +5730,8 @@ static long kvm_s390_vcpu_memsida_op(struct kvm_vcpu *vcpu,
 	return r;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
 	void __user *argp = (void __user *)arg;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ca5ba2caf314..b85cb213a336 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7240,8 +7240,8 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
 	return 0;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
-			       unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	return -ENOIOCTLCMD;
 }
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7186b2ae4b57..d93f75b05ae2 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1557,6 +1557,8 @@ long kvm_arch_dev_ioctl(struct file *filp,
 			unsigned int ioctl, unsigned long arg);
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg);
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp,
+				  unsigned int ioctl, unsigned long arg);
 vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf);
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext);
@@ -2437,8 +2439,6 @@ static inline bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
 }
 #endif /* CONFIG_HAVE_KVM_NO_POLL */
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg);
 void kvm_arch_guest_memory_reclaimed(struct kvm *kvm);
 
 #ifdef CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b7a0ae2a7b20..b7db1d5f71a8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4434,10 +4434,10 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		return r;
 
 	/*
-	 * Some architectures have vcpu ioctls that are asynchronous to vcpu
-	 * execution; mutex_lock() would break them.
+	 * Let arch code handle select vCPU ioctls without holding vcpu->mutex,
+	 * e.g. to support ioctls that can run asynchronous to vCPU execution.
 	 */
-	r = kvm_arch_vcpu_async_ioctl(filp, ioctl, arg);
+	r = kvm_arch_vcpu_unlocked_ioctl(filp, ioctl, arg);
 	if (r != -ENOIOCTLCMD)
 		return r;
 
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:21 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-4-seanjc@google.com>
Subject: [PATCH v3 03/25] KVM: TDX: Drop PROVE_MMU=y sanity check on to-be-populated mappings
From: Sean Christopherson
Content-Type: text/plain; charset="utf-8"

Drop TDX's sanity check that a mirror EPT mapping isn't zapped between
creating said mapping and doing TDH.MEM.PAGE.ADD, as the check is
simultaneously superfluous and incomplete.  Per commit 2608f1057601
("KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU"),
the justification for introducing kvm_tdp_mmu_gpa_is_mapped() was to
check that the target gfn was pre-populated, with a link that points to
this snippet:

 : > One small question:
 : >
 : > What if the memory region passed to KVM_TDX_INIT_MEM_REGION hasn't been pre-
 : > populated?  If we want to make KVM_TDX_INIT_MEM_REGION work with these regions,
 : > then we still need to do the real map.  Or we can make KVM_TDX_INIT_MEM_REGION
 : > return error when it finds the region hasn't been pre-populated?
 :
 : Return an error.  I don't love the idea of bleeding so many TDX details into
 : userspace, but I'm pretty sure that ship sailed a long, long time ago.

But that justification makes little sense for the final code, as the
check on nr_premapped after TDH.MEM.PAGE.ADD will detect and return an
error if KVM attempted to zap an S-EPT entry (tdx_sept_zap_private_spte()
will fail on TDH.MEM.RANGE.BLOCK due to the lack of a valid S-EPT entry).
And as evidenced by the "is mapped?" code being guarded with
CONFIG_KVM_PROVE_MMU=y, KVM is NOT relying on the check for general
correctness.

The sanity check is also incomplete in the sense that mmu_lock is dropped
between the check and TDH.MEM.PAGE.ADD, i.e. it will only detect KVM bugs
that zap SPTEs in a very specific window (note, this also applies to the
check on nr_premapped).

Removing the sanity check will allow removing kvm_tdp_mmu_gpa_is_mapped(),
which has no business being exposed to vendor code, and more importantly
will pave the way for eliminating the "pre-map" approach entirely in favor
of doing TDH.MEM.PAGE.ADD under mmu_lock.

Reviewed-by: Ira Weiny
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
---
 arch/x86/kvm/vmx/tdx.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 326db9b9c567..4c3014befe9f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3181,20 +3181,6 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	if (ret < 0)
 		goto out;
 
-	/*
-	 * The private mem cannot be zapped after kvm_tdp_map_page()
-	 * because all paths are covered by slots_lock and the
-	 * filemap invalidate lock.  Check that they are indeed enough.
-	 */
-	if (IS_ENABLED(CONFIG_KVM_PROVE_MMU)) {
-		scoped_guard(read_lock, &kvm->mmu_lock) {
-			if (KVM_BUG_ON(!kvm_tdp_mmu_gpa_is_mapped(vcpu, gpa), kvm)) {
-				ret = -EIO;
-				goto out;
-			}
-		}
-	}
-
 	ret = 0;
 	err = tdh_mem_page_add(&kvm_tdx->td, gpa, pfn_to_page(pfn), src_page,
 			       &entry, &level_state);
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Date: Thu, 16 Oct 2025 17:32:22 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-5-seanjc@google.com>
Subject: [PATCH v3 04/25] KVM: x86/mmu: Add dedicated API to map guest_memfd pfn into TDP MMU
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
 linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
 Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu

Add and use a new API for mapping a private pfn from guest_memfd into the
TDP MMU from TDX's post-populate hook instead of partially open-coding the
functionality into the TDX code.

Sharing code with the pre-fault path sounded good on paper, but it's
fatally flawed as simulating a fault loses the pfn, and calling back into
gmem to re-retrieve the pfn creates locking problems, e.g.
kvm_gmem_populate() already holds the gmem invalidation lock.
Providing a dedicated API also removes several MMU exports that ideally
would not be exposed outside of the MMU, let alone to vendor code.  On
that topic, opportunistically drop the kvm_mmu_load() export.

Leave kvm_tdp_mmu_gpa_is_mapped() alone for now; the entire commit that
added kvm_tdp_mmu_gpa_is_mapped() will be removed in the near future.

Cc: Michael Roth
Cc: Yan Zhao
Cc: Ira Weiny
Cc: Vishal Annapurve
Cc: Rick Edgecombe
Link: https://lore.kernel.org/all/20250709232103.zwmufocd3l7sqk7y@amd.com
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
Reviewed-by: Rick Edgecombe
---
 arch/x86/kvm/mmu.h     |  1 +
 arch/x86/kvm/mmu/mmu.c | 60 +++++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/tdx.c | 10 +++----
 3 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index f63074048ec6..2f108e381959 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -259,6 +259,7 @@ extern bool tdp_mmu_enabled;
 
 bool kvm_tdp_mmu_gpa_is_mapped(struct kvm_vcpu *vcpu, u64 gpa);
 int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code, u8 *level);
+int kvm_tdp_mmu_map_private_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn);
 
 static inline bool kvm_memslots_have_rmaps(struct kvm *kvm)
 {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 18d69d48bc55..ba5cca825a7f 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5014,6 +5014,65 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 	return min(range->size, end - range->gpa);
 }
 
+int kvm_tdp_mmu_map_private_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
+{
+	struct kvm_page_fault fault = {
+		.addr = gfn_to_gpa(gfn),
+		.error_code = PFERR_GUEST_FINAL_MASK | PFERR_PRIVATE_ACCESS,
+		.prefetch = true,
+		.is_tdp = true,
+		.nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(vcpu->kvm),
+
+		.max_level = PG_LEVEL_4K,
+		.req_level = PG_LEVEL_4K,
+		.goal_level = PG_LEVEL_4K,
+		.is_private = true,
+
+		.gfn = gfn,
+		.slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn),
+		.pfn = pfn,
+		.map_writable = true,
+	};
+	struct kvm *kvm = vcpu->kvm;
+	int r;
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	if (KVM_BUG_ON(!tdp_mmu_enabled, kvm))
+		return -EIO;
+
+	if (kvm_gfn_is_write_tracked(kvm, fault.slot, fault.gfn))
+		return -EPERM;
+
+	r = kvm_mmu_reload(vcpu);
+	if (r)
+		return r;
+
+	r = mmu_topup_memory_caches(vcpu, false);
+	if (r)
+		return r;
+
+	do {
+		if (signal_pending(current))
+			return -EINTR;
+
+		if (kvm_test_request(KVM_REQ_VM_DEAD, vcpu))
+			return -EIO;
+
+		cond_resched();
+
+		guard(read_lock)(&kvm->mmu_lock);
+
+		r = kvm_tdp_mmu_map(vcpu, &fault);
+	} while (r == RET_PF_RETRY);
+
+	if (r != RET_PF_FIXED)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_tdp_mmu_map_private_pfn);
+
 static void nonpaging_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = nonpaging_page_fault;
@@ -5997,7 +6056,6 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 out:
 	return r;
 }
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_load);
 
 void kvm_mmu_unload(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 4c3014befe9f..29f344af4cc2 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3157,15 +3157,12 @@ struct tdx_gmem_post_populate_arg {
 static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 				  void __user *src, int order, void *_arg)
 {
-	u64 error_code = PFERR_GUEST_FINAL_MASK | PFERR_PRIVATE_ACCESS;
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 	struct tdx_gmem_post_populate_arg *arg = _arg;
-	struct kvm_vcpu *vcpu = arg->vcpu;
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	u64 err, entry, level_state;
 	gpa_t gpa = gfn_to_gpa(gfn);
-	u8 level = PG_LEVEL_4K;
 	struct page *src_page;
 	int ret, i;
-	u64 err, entry, level_state;
 
 	/*
 	 * Get the source page if it has been faulted in.  Return failure if the
@@ -3177,7 +3174,7 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	if (ret != 1)
 		return -ENOMEM;
 
-	ret = kvm_tdp_map_page(vcpu, gpa, error_code, &level);
+	ret = kvm_tdp_mmu_map_private_pfn(arg->vcpu, gfn, pfn);
 	if (ret < 0)
 		goto out;
 
@@ -3240,7 +3237,6 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd,
 	    !vt_is_tdx_private_gpa(kvm, region.gpa + (region.nr_pages << PAGE_SHIFT) - 1))
 		return -EINVAL;
 
-	kvm_mmu_reload(vcpu);
 	ret = 0;
 	while (region.nr_pages) {
 		if (signal_pending(current)) {
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Date: Thu, 16 Oct 2025 17:32:23 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-6-seanjc@google.com>
Subject: [PATCH v3 05/25] Revert "KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU"
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
 linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
 Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu

Remove the helper and exports that were added to allow TDX code to reuse
kvm_tdp_map_page() for its gmem post-populate flow now that a dedicated
TDP MMU API is provided to install a mapping given a gfn+pfn pair.

This reverts commit 2608f105760115e94a03efd9f12f8fbfd1f9af4b.

Reviewed-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
---
 arch/x86/kvm/mmu.h         |  2 --
 arch/x86/kvm/mmu/mmu.c     |  4 ++--
 arch/x86/kvm/mmu/tdp_mmu.c | 37 +++++--------------------------------
 3 files changed, 7 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 2f108e381959..9e5045a60d8b 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -257,8 +257,6 @@ extern bool tdp_mmu_enabled;
 #define tdp_mmu_enabled false
 #endif
 
-bool kvm_tdp_mmu_gpa_is_mapped(struct kvm_vcpu *vcpu, u64 gpa);
-int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code, u8 *level);
 int kvm_tdp_mmu_map_private_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn);
 
 static inline bool kvm_memslots_have_rmaps(struct kvm *kvm)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index ba5cca825a7f..3711dba92440 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4924,7 +4924,8 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	return direct_page_fault(vcpu, fault);
 }
 
-int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code, u8 *level)
+static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
+			    u8 *level)
 {
 	int r;
 
@@ -4966,7 +4967,6 @@ int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code, u8 *level
 		return -EIO;
 	}
 }
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_tdp_map_page);
 
 long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 				    struct kvm_pre_fault_memory *range)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index c5734ca5c17d..9b4006c2120e 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1939,13 +1939,16 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
  *
  * Must be called between kvm_tdp_mmu_walk_lockless_{begin,end}.
  */
-static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
-				  struct kvm_mmu_page *root)
+int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+			 int *root_level)
 {
+	struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa);
 	struct tdp_iter iter;
 	gfn_t gfn = addr >> PAGE_SHIFT;
 	int leaf = -1;
 
+	*root_level = vcpu->arch.mmu->root_role.level;
+
 	for_each_tdp_pte(iter, vcpu->kvm, root, gfn, gfn + 1) {
 		leaf = iter.level;
 		sptes[leaf] = iter.old_spte;
@@ -1954,36 +1957,6 @@ static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 	return leaf;
 }
 
-int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
-			 int *root_level)
-{
-	struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa);
-	*root_level = vcpu->arch.mmu->root_role.level;
-
-	return __kvm_tdp_mmu_get_walk(vcpu, addr, sptes, root);
-}
-
-bool kvm_tdp_mmu_gpa_is_mapped(struct kvm_vcpu *vcpu, u64 gpa)
-{
-	struct kvm *kvm = vcpu->kvm;
-	bool is_direct = kvm_is_addr_direct(kvm, gpa);
-	hpa_t root = is_direct ? vcpu->arch.mmu->root.hpa :
-				 vcpu->arch.mmu->mirror_root_hpa;
-	u64 sptes[PT64_ROOT_MAX_LEVEL + 1], spte;
-	int leaf;
-
-	lockdep_assert_held(&kvm->mmu_lock);
-	rcu_read_lock();
-	leaf = __kvm_tdp_mmu_get_walk(vcpu, gpa, sptes, root_to_sp(root));
-	rcu_read_unlock();
-	if (leaf < 0)
-		return false;
-
-	spte = sptes[leaf];
-	return is_shadow_present_pte(spte) && is_last_spte(spte, leaf);
-}
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_tdp_mmu_gpa_is_mapped);
-
 /*
  * Returns the last level spte pointer of the shadow page walk for the given
  * gpa, and sets *spte to the spte value. This spte may be non-preset. If no
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Date: Thu, 16 Oct 2025 17:32:24 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-7-seanjc@google.com>
Subject: [PATCH v3 06/25] KVM: x86/mmu: Rename kvm_tdp_map_page() to kvm_tdp_page_prefault()
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
 linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
 Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu

Rename kvm_tdp_map_page() to kvm_tdp_page_prefault() now that it's used
only by kvm_arch_vcpu_pre_fault_memory().
No functional change intended.

Reviewed-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
---
 arch/x86/kvm/mmu/mmu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3711dba92440..94d7f32a03b6 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4924,8 +4924,8 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	return direct_page_fault(vcpu, fault);
 }
 
-static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
-			    u8 *level)
+static int kvm_tdp_page_prefault(struct kvm_vcpu *vcpu, gpa_t gpa,
+				 u64 error_code, u8 *level)
 {
 	int r;
 
@@ -5002,7 +5002,7 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 	 * Shadow paging uses GVA for kvm page fault, so restrict to
 	 * two-dimensional paging.
 	 */
-	r = kvm_tdp_map_page(vcpu, range->gpa | direct_bits, error_code, &level);
+	r = kvm_tdp_page_prefault(vcpu, range->gpa | direct_bits, error_code, &level);
 	if (r < 0)
 		return r;
 
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Date: Thu, 16 Oct 2025 17:32:25 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-8-seanjc@google.com>
Subject: [PATCH v3 07/25] KVM: TDX: Drop superfluous page pinning in S-EPT management
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
 linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
 Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu

From: Yan Zhao

Don't explicitly pin pages when mapping pages into the S-EPT; guest_memfd
doesn't support page migration in any capacity, i.e. there are no migrate
callbacks because guest_memfd pages *can't* be migrated.  See the WARN in
kvm_gmem_migrate_folio().

Eliminating TDX's explicit pinning will also enable guest_memfd to support
in-place conversion between shared and private memory[1][2].  Because KVM
cannot distinguish between speculative/transient refcounts and the
intentional refcount for TDX on private pages[3], failing to release a
private page's refcount in TDX could cause guest_memfd to wait
indefinitely for the refcount to drop when splitting the page.

Under normal conditions, not holding an extra page refcount in TDX is safe
because guest_memfd ensures pages are retained until its invalidation
notification to the KVM MMU is completed.  However, if there are bugs in
KVM or the TDX module, not holding an extra refcount when a page is mapped
in the S-EPT could result in a page being released from guest_memfd while
still mapped in the S-EPT.  But doing work to make a fatal error slightly
less fatal is a net negative when that extra work adds complexity and
confusion.

Several approaches were considered to address the refcount issue,
including:
  - Attempting to modify the KVM unmap operation to return a failure,
    which was deemed too complex and potentially incorrect[4].
  - Increasing the folio reference count only upon S-EPT zapping
    failure[5].
  - Using page flags or page_ext to indicate a page is still used by
    TDX[6], which does not work for HVO (HugeTLB Vmemmap Optimization).
  - Setting the HWPOISON bit or leveraging folio_set_hugetlb_hwpoison()[7].

Due to the complexity or inappropriateness of these approaches, and the
fact that S-EPT zapping failure is currently only possible when there are
bugs in KVM or the TDX module, which is very rare in a production kernel,
the straightforward approach of simply not holding the page reference
count in TDX was chosen[8].

When S-EPT zapping errors occur, KVM_BUG_ON() is invoked to kick all vCPUs
out of the guest and mark the VM as dead.  Although there is a potential
window in which a private page mapped in the S-EPT could be reallocated
and used outside the VM, the loud warning from KVM_BUG_ON() should provide
sufficient debug information.  To be robust against bugs, the user can
enable panic_on_warn as normal.

Link: https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com [1]
Link: https://youtu.be/UnBKahkAon4 [2]
Link: https://lore.kernel.org/all/CAGtprH_ypohFy9TOJ8Emm_roT4XbQUtLKZNFcM6Fr+fhTFkE0Q@mail.gmail.com [3]
Link: https://lore.kernel.org/all/aEEEJbTzlncbRaRA@yzhao56-desk.sh.intel.com [4]
Link: https://lore.kernel.org/all/aE%2Fq9VKkmaCcuwpU@yzhao56-desk.sh.intel.com [5]
Link: https://lore.kernel.org/all/aFkeBtuNBN1RrDAJ@yzhao56-desk.sh.intel.com [6]
Link: https://lore.kernel.org/all/diqzy0tikran.fsf@ackerleytng-ctop.c.googlers.com [7]
Link: https://lore.kernel.org/all/53ea5239f8ef9d8df9af593647243c10435fd219.camel@intel.com [8]
Suggested-by: Vishal Annapurve
Suggested-by: Ackerley Tng
Suggested-by: Rick Edgecombe
Signed-off-by: Yan Zhao
Reviewed-by: Ira Weiny
Reviewed-by: Kai Huang
[sean: extract out of hugepage series, massage changelog accordingly]
Reviewed-by: Binbin Wu
Signed-off-by: Sean Christopherson
Reviewed-by: Rick Edgecombe
---
 arch/x86/kvm/vmx/tdx.c | 28 ++++------------------------
 1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 29f344af4cc2..c3bae6b96dc4 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1583,29 +1583,22 @@ void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa);
 }
 
-static void tdx_unpin(struct kvm *kvm, struct page *page)
-{
-	put_page(page);
-}
-
 static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
-			    enum pg_level level, struct page *page)
+			    enum pg_level level, kvm_pfn_t pfn)
 {
 	int tdx_level = pg_level_to_tdx_sept_level(level);
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	struct page *page = pfn_to_page(pfn);
 	gpa_t gpa = gfn_to_gpa(gfn);
 	u64 entry, level_state;
 	u64 err;
 
 	err = tdh_mem_page_aug(&kvm_tdx->td, gpa, tdx_level, page, &entry, &level_state);
-	if (unlikely(tdx_operand_busy(err))) {
-		tdx_unpin(kvm, page);
+	if (unlikely(tdx_operand_busy(err)))
 		return -EBUSY;
-	}
 
 	if (KVM_BUG_ON(err, kvm)) {
 		pr_tdx_error_2(TDH_MEM_PAGE_AUG, err, entry, level_state);
-		tdx_unpin(kvm, page);
 		return -EIO;
 	}
 
@@ -1639,29 +1632,18 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 					enum pg_level level, kvm_pfn_t pfn)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	struct page *page = pfn_to_page(pfn);
 
 	/* TODO: handle large pages. */
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
 		return -EINVAL;
 
-	/*
-	 * Because guest_memfd doesn't support page migration with
-	 * a_ops->migrate_folio (yet), no callback is triggered for KVM on page
-	 * migration.  Until guest_memfd supports page migration, prevent page
-	 * migration.
-	 * TODO: Once guest_memfd introduces callback on page migration,
-	 * implement it and remove get_page/put_page().
-	 */
-	get_page(page);
-
 	/*
 	 * Read 'pre_fault_allowed' before 'kvm_tdx->state'; see matching
 	 * barrier in tdx_td_finalize().
 	 */
 	smp_rmb();
 	if (likely(kvm_tdx->state == TD_STATE_RUNNABLE))
-		return tdx_mem_page_aug(kvm, gfn, level, page);
+		return tdx_mem_page_aug(kvm, gfn, level, pfn);
 
 	return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
 }
@@ -1712,7 +1694,6 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
 		return -EIO;
 	}
 	tdx_quirk_reset_page(page);
-	tdx_unpin(kvm, page);
 	return 0;
 }
 
@@ -1792,7 +1773,6 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level) &&
 	    !KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm)) {
 		atomic64_dec(&kvm_tdx->nr_premapped);
-		tdx_unpin(kvm, page);
 		return 0;
 	}
 
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=IBJzUfWq; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="IBJzUfWq" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-32eb18b5500so2174269a91.2 for ; Thu, 16 Oct 2025 17:33:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1760661183; x=1761265983; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=EknCUbF2LHWm0fNDNjp+CKNXYZwTUTGxAjt6VludIpM=; b=IBJzUfWqKqgA2dZ0ncXhr7wne+qoBzaru6iwsgi8HGnOM5tJqrFKJaMxZVwObQEmcr TRm4paOnkNVEgamYkQkR7CWTBVlChdOCk7E+gBAAbhIc985IX3j4m/BjMGNM/qg/rj2b 30fR007r6nw3QOMC2LuPnWoun/4+1Nkr6/8zPZh8HGIt2k/VNAeHenLomLq1MiFnkcBb xPIaBXuwq4MBBUBtgIi3CThwXR4lxc2Odba1pIU+IzliYWGy+kn1FNencdLP+C1w5qSi fjzymMuRR8AYmCyXmn8WxO2J4jechOfgSp+LM3QSg8BdZYij3QvyTl+H7NXKb6sI3Z2S 0T/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1760661183; x=1761265983; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=EknCUbF2LHWm0fNDNjp+CKNXYZwTUTGxAjt6VludIpM=; b=bbYdnlwx2K41BYuPH++6fXFr+/uc9LHKT79y/EQwT8wu8vmNCGRVsjfShzaOrTwCsS WubCYksvkQjtbtLegvwnLL0o1Rl4slR4peaKO82cIWqbIwHErBIRzFo/Pnzsmzimej8D KyCaXlUQW+DE6PtQPpERFhsP52/CxyYhyIpC8RAXQ04KXUvqcSjwyfzeE5LDZioCbP+T 6oaY2AqtsTb4DdEzaooYaxTbC5FiUxTupl5PzXCZLZzxxtBAfJAwh0GjiC0lvJK4puUy OGkz9C+t4P3gGI7/xj9JLayAOLe1qNMadSwZfG5Jxwr9Fle5+qkJTfPyv13pW2+bsrrn 9Nsg== X-Forwarded-Encrypted: 
i=1; AJvYcCUA6GRyvyNV3cKIZWgTkuhwdvGfq6+56132aYGz2ujR5RvM0zC4VmQVpwKFlFqh9ccTtl/88raLNtqUrJI=@vger.kernel.org X-Gm-Message-State: AOJu0YzxORCeDGWqHlXQGnjES/9QqtjmMUnFJwuJSdGf6GFTmiUl4J98 +uDWbBzuozr6iB3eIPfWfFDmbmGsf8qYMb74PohxHkxJcx6+mb+IrXNUpnUjasTHOoFF1ll1gya Ql94XAw== X-Google-Smtp-Source: AGHT+IHp0VFuMyv8iYOF0K3G5i6Db3pbnaDLGLxvBIGhE/qWOYmJanCMeHz3uAv5TWFd0pF3pAUnMoqPNCg= X-Received: from pjuy20.prod.google.com ([2002:a17:90a:d714:b0:32f:3fab:c9e7]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90a:d44c:b0:336:9dcf:ed14 with SMTP id 98e67ed59e1d1-33bcf8e3b0emr1878293a91.23.1760661182616; Thu, 16 Oct 2025 17:33:02 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 16 Oct 2025 17:32:26 -0700 In-Reply-To: <20251017003244.186495-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20251017003244.186495-1-seanjc@google.com> X-Mailer: git-send-email 2.51.0.858.gf9c4a03a3a-goog Message-ID: <20251017003244.186495-9-seanjc@google.com> Subject: [PATCH v3 08/25] KVM: TDX: Return -EIO, not -EINVAL, on a KVM_BUG_ON() condition From: Sean Christopherson To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Madhavan Srinivasan , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Christian Borntraeger , Janosch Frank , Claudio Imbrenda , Sean Christopherson , Paolo Bonzini , "Kirill A. 
Shutemov" Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, Ira Weiny , Kai Huang , Michael Roth , Yan Zhao , Vishal Annapurve , Rick Edgecombe , Ackerley Tng , Binbin Wu Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Return -EIO when a KVM_BUG_ON() is tripped, as KVM's ABI is to return -EIO when a VM has been killed due to a KVM bug, not -EINVAL. Note, many (all?) of the affected paths never propagate the error code to userspace, i.e. this is about internal consistency more than anything else. Reviewed-by: Rick Edgecombe Reviewed-by: Ira Weiny Reviewed-by: Binbin Wu Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx/tdx.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index c3bae6b96dc4..c242d73b6a7b 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -1621,7 +1621,7 @@ static int tdx_mem_page_record_premap_cnt(struct kvm = *kvm, gfn_t gfn, struct kvm_tdx *kvm_tdx =3D to_kvm_tdx(kvm); =20 if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm)) - return -EINVAL; + return -EIO; =20 /* nr_premapped will be decreased when tdh_mem_page_add() is called. */ atomic64_inc(&kvm_tdx->nr_premapped); @@ -1635,7 +1635,7 @@ static int tdx_sept_set_private_spte(struct kvm *kvm,= gfn_t gfn, =20 /* TODO: handle large pages. */ if (KVM_BUG_ON(level !=3D PG_LEVEL_4K, kvm)) - return -EINVAL; + return -EIO; =20 /* * Read 'pre_fault_allowed' before 'kvm_tdx->state'; see matching @@ -1658,10 +1658,10 @@ static int tdx_sept_drop_private_spte(struct kvm *k= vm, gfn_t gfn, =20 /* TODO: handle large pages. 
 */
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return -EINVAL;
+		return -EIO;
 
 	if (KVM_BUG_ON(!is_hkid_assigned(kvm_tdx), kvm))
-		return -EINVAL;
+		return -EIO;
 
 	/*
 	 * When zapping private page, write lock is held. So no race condition
@@ -1846,7 +1846,7 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 	 * and slot move/deletion.
 	 */
 	if (KVM_BUG_ON(is_hkid_assigned(kvm_tdx), kvm))
-		return -EINVAL;
+		return -EIO;
 
 	/*
 	 * The HKID assigned to this TD was already freed and cache was
@@ -1867,7 +1867,7 @@ static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	 * there can't be anything populated in the private EPT.
 	 */
 	if (KVM_BUG_ON(!is_hkid_assigned(to_kvm_tdx(kvm)), kvm))
-		return -EINVAL;
+		return -EIO;
 
 	ret = tdx_sept_zap_private_spte(kvm, gfn, level, page);
 	if (ret <= 0)
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:27 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
X-Mailer: git-send-email 2.51.0.858.gf9c4a03a3a-goog
Message-ID: <20251017003244.186495-10-seanjc@google.com>
Subject: [PATCH v3 09/25] KVM: TDX: Fold tdx_sept_drop_private_spte() into tdx_sept_remove_private_spte()
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
    Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
    Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
    Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    kvm@vger.kernel.org, loongarch@lists.linux.dev,
    linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
    x86@kernel.org, linux-coco@lists.linux.dev,
    linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
    Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu
Content-Type: text/plain; charset="utf-8"

Fold tdx_sept_drop_private_spte() into tdx_sept_remove_private_spte() to
avoid having to differentiate between "zap", "drop", and "remove", and to
eliminate dead code due to redundant checks, e.g. on an HKID being
assigned.

No functional change intended.

Reviewed-by: Binbin Wu
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
---
 arch/x86/kvm/vmx/tdx.c | 90 +++++++++++++++++++-----------------------
 1 file changed, 40 insertions(+), 50 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index c242d73b6a7b..abea9b3d08cf 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1648,55 +1648,6 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 	return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
 }
 
-static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
-				      enum pg_level level, struct page *page)
-{
-	int tdx_level = pg_level_to_tdx_sept_level(level);
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	gpa_t gpa = gfn_to_gpa(gfn);
-	u64 err, entry, level_state;
-
-	/* TODO: handle large pages. */
-	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return -EIO;
-
-	if (KVM_BUG_ON(!is_hkid_assigned(kvm_tdx), kvm))
-		return -EIO;
-
-	/*
-	 * When zapping private page, write lock is held. So no race condition
-	 * with other vcpu sept operation.
-	 * Race with TDH.VP.ENTER due to (0-step mitigation) and Guest TDCALLs.
-	 */
-	err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
-				  &level_state);
-
-	if (unlikely(tdx_operand_busy(err))) {
-		/*
-		 * The second retry is expected to succeed after kicking off all
-		 * other vCPUs and prevent them from invoking TDH.VP.ENTER.
-		 */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
-					  &level_state);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error_2(TDH_MEM_PAGE_REMOVE, err, entry, level_state);
-		return -EIO;
-	}
-
-	err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, page);
-
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
-		return -EIO;
-	}
-	tdx_quirk_reset_page(page);
-	return 0;
-}
-
 static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
 				     enum pg_level level, void *private_spt)
 {
@@ -1858,7 +1809,11 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 					enum pg_level level, kvm_pfn_t pfn)
 {
+	int tdx_level = pg_level_to_tdx_sept_level(level);
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 	struct page *page = pfn_to_page(pfn);
+	gpa_t gpa = gfn_to_gpa(gfn);
+	u64 err, entry, level_state;
 	int ret;
 
 	/*
@@ -1869,6 +1824,10 @@ static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (KVM_BUG_ON(!is_hkid_assigned(to_kvm_tdx(kvm)), kvm))
 		return -EIO;
 
+	/* TODO: handle large pages. */
+	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
+		return -EIO;
+
 	ret = tdx_sept_zap_private_spte(kvm, gfn, level, page);
 	if (ret <= 0)
 		return ret;
@@ -1879,7 +1838,38 @@ static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	 */
 	tdx_track(kvm);
 
-	return tdx_sept_drop_private_spte(kvm, gfn, level, page);
+	/*
+	 * When zapping private page, write lock is held. So no race condition
+	 * with other vcpu sept operation.
+	 * Race with TDH.VP.ENTER due to (0-step mitigation) and Guest TDCALLs.
+	 */
+	err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
+				  &level_state);
+
+	if (unlikely(tdx_operand_busy(err))) {
+		/*
+		 * The second retry is expected to succeed after kicking off all
+		 * other vCPUs and prevent them from invoking TDH.VP.ENTER.
+		 */
+		tdx_no_vcpus_enter_start(kvm);
+		err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
+					  &level_state);
+		tdx_no_vcpus_enter_stop(kvm);
+	}
+
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error_2(TDH_MEM_PAGE_REMOVE, err, entry, level_state);
+		return -EIO;
+	}
+
+	err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, page);
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
+		return -EIO;
+	}
+
+	tdx_quirk_reset_page(page);
+	return 0;
 }
 
 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:28 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
X-Mailer: git-send-email 2.51.0.858.gf9c4a03a3a-goog
Message-ID: <20251017003244.186495-11-seanjc@google.com>
Subject: [PATCH v3 10/25] KVM: x86/mmu: Drop the return code from kvm_x86_ops.remove_external_spte()
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
    Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
    Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
    Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    kvm@vger.kernel.org, loongarch@lists.linux.dev,
    linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
    x86@kernel.org, linux-coco@lists.linux.dev,
    linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
    Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu
Content-Type: text/plain; charset="utf-8"

Drop the return code from kvm_x86_ops.remove_external_spte(), a.k.a.
tdx_sept_remove_private_spte(), as KVM simply does a KVM_BUG_ON() on
failure, and that KVM_BUG_ON() is redundant since all error paths in TDX
also do a KVM_BUG_ON().

Opportunistically pass the spte instead of the pfn, as the API is clearly
about removing an spte.

Suggested-by: Rick Edgecombe
Reviewed-by: Binbin Wu
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h |  4 ++--
 arch/x86/kvm/mmu/tdp_mmu.c      |  8 ++------
 arch/x86/kvm/vmx/tdx.c          | 17 ++++++++---------
 3 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 48598d017d6f..7e92aebd07e8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1855,8 +1855,8 @@ struct kvm_x86_ops {
 				    void *external_spt);
 
 	/* Update external page table from spte getting removed, and flush TLB.
 */
-	int (*remove_external_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
-				    kvm_pfn_t pfn_for_gfn);
+	void (*remove_external_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
+				     u64 spte);
 
 	bool (*has_wbinvd_exit)(void);
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 9b4006c2120e..c09c25f3f93b 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -362,9 +362,6 @@ static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 static void remove_external_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
 				 int level)
 {
-	kvm_pfn_t old_pfn = spte_to_pfn(old_spte);
-	int ret;
-
 	/*
 	 * External (TDX) SPTEs are limited to PG_LEVEL_4K, and external
 	 * PTs are removed in a special order, involving free_external_spt().
@@ -377,9 +374,8 @@ static void remove_external_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
 
 	/* Zapping leaf spte is allowed only when write lock is held. */
 	lockdep_assert_held_write(&kvm->mmu_lock);
-	/* Because write lock is held, operation should success. */
-	ret = kvm_x86_call(remove_external_spte)(kvm, gfn, level, old_pfn);
-	KVM_BUG_ON(ret, kvm);
+
+	kvm_x86_call(remove_external_spte)(kvm, gfn, level, old_spte);
 }
 
 /**
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index abea9b3d08cf..f5cbcbf4e663 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1806,12 +1806,12 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 	return tdx_reclaim_page(virt_to_page(private_spt));
 }
 
-static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
-					enum pg_level level, kvm_pfn_t pfn)
+static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
+					 enum pg_level level, u64 spte)
 {
+	struct page *page = pfn_to_page(spte_to_pfn(spte));
 	int tdx_level = pg_level_to_tdx_sept_level(level);
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	struct page *page = pfn_to_page(pfn);
 	gpa_t gpa = gfn_to_gpa(gfn);
 	u64 err, entry, level_state;
 	int ret;
@@ -1822,15 +1822,15 @@ static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	 * there can't be anything populated in the private EPT.
 	 */
 	if (KVM_BUG_ON(!is_hkid_assigned(to_kvm_tdx(kvm)), kvm))
-		return -EIO;
+		return;
 
 	/* TODO: handle large pages. */
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return -EIO;
+		return;
 
 	ret = tdx_sept_zap_private_spte(kvm, gfn, level, page);
 	if (ret <= 0)
-		return ret;
+		return;
 
 	/*
 	 * TDX requires TLB tracking before dropping private page. Do
@@ -1859,17 +1859,16 @@ static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 
 	if (KVM_BUG_ON(err, kvm)) {
 		pr_tdx_error_2(TDH_MEM_PAGE_REMOVE, err, entry, level_state);
-		return -EIO;
+		return;
 	}
 
 	err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, page);
 	if (KVM_BUG_ON(err, kvm)) {
 		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
-		return -EIO;
+		return;
 	}
 
 	tdx_quirk_reset_page(page);
-	return 0;
 }
 
 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:29 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
X-Mailer: git-send-email 2.51.0.858.gf9c4a03a3a-goog
Message-ID: <20251017003244.186495-12-seanjc@google.com>
Subject: [PATCH v3 11/25] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte()
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
    Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
    Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
    Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    kvm@vger.kernel.org, loongarch@lists.linux.dev,
    linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
    x86@kernel.org, linux-coco@lists.linux.dev,
    linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
    Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu
Content-Type: text/plain; charset="utf-8"

Return -EIO immediately from tdx_sept_zap_private_spte() if the number of
to-be-added pages underflows, so that the following "KVM_BUG_ON(err, kvm)"
isn't also triggered.
Isolating the check from the "is premap error" if-statement will also
allow adding a lockdep assertion that premap errors are encountered if and
only if slots_lock is held.

Reviewed-by: Rick Edgecombe
Reviewed-by: Binbin Wu
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
---
 arch/x86/kvm/vmx/tdx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index f5cbcbf4e663..220989a1e085 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1721,8 +1721,10 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
 		tdx_no_vcpus_enter_stop(kvm);
 	}
-	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level) &&
-	    !KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm)) {
+	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level)) {
+		if (KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm))
+			return -EIO;
+
 		atomic64_dec(&kvm_tdx->nr_premapped);
 		return 0;
 	}
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Message-ID: <20251017003244.186495-13-seanjc@google.com>
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Date: Thu, 16 Oct 2025 17:32:30 -0700
From: Sean Christopherson
Subject: [PATCH v3 12/25] KVM: TDX: Use atomic64_dec_return() instead of a poor equivalent

Use atomic64_dec_return() when decrementing the number of "pre-mapped"
S-EPT pages to ensure that the count can't go negative without KVM
noticing.  In theory, checking for '0' and then decrementing in a separate
operation could miss a 0=>-1 transition.  In practice, such a condition is
impossible because nr_premapped is protected by slots_lock, i.e. doesn't
actually need to be an atomic (that wart will be addressed shortly).

Don't bother trying to keep the count non-negative, as the KVM_BUG_ON()
ensures the VM is dead, i.e. there's no point in trying to limp along.
Reviewed-by: Rick Edgecombe
Reviewed-by: Ira Weiny
Reviewed-by: Binbin Wu
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/tdx.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 220989a1e085..6c0adc1b3bd5 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1722,10 +1722,9 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 		tdx_no_vcpus_enter_stop(kvm);
 	}
 	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level)) {
-		if (KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm))
+		if (KVM_BUG_ON(atomic64_dec_return(&kvm_tdx->nr_premapped) < 0, kvm))
 			return -EIO;
 
-		atomic64_dec(&kvm_tdx->nr_premapped);
 		return 0;
 	}
 
@@ -3157,8 +3156,7 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 		goto out;
 	}
 
-	if (!KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm))
-		atomic64_dec(&kvm_tdx->nr_premapped);
+	KVM_BUG_ON(atomic64_dec_return(&kvm_tdx->nr_premapped) < 0, kvm);
 
 	if (arg->flags & KVM_TDX_MEASURE_MEMORY_REGION) {
 		for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Message-ID: <20251017003244.186495-14-seanjc@google.com>
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Date: Thu, 16 Oct 2025 17:32:31 -0700
From: Sean Christopherson
Subject: [PATCH v3 13/25] KVM: TDX: Fold tdx_mem_page_record_premap_cnt() into its sole caller

Fold tdx_mem_page_record_premap_cnt() into tdx_sept_set_private_spte(), as
providing a one-off helper for effectively three lines of code is at best
a wash, and splitting the code makes the comment for smp_rmb() _extremely_
confusing: the comment talks about reading kvm->arch.pre_fault_allowed
before kvm_tdx->state, but the immediately visible code does the exact
opposite.

Opportunistically rewrite the comments to more explicitly explain who is
checking what, as well as _why_ the ordering matters.

No functional change intended.

Reviewed-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
---
 arch/x86/kvm/vmx/tdx.c | 49 ++++++++++++++++++------------------------
 1 file changed, 21 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 6c0adc1b3bd5..c37591730cc5 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1605,29 +1605,6 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
-/*
- * KVM_TDX_INIT_MEM_REGION calls kvm_gmem_populate() to map guest pages; the
- * callback tdx_gmem_post_populate() then maps pages into private memory.
- * through the a seamcall TDH.MEM.PAGE.ADD().  The SEAMCALL also requires the
- * private EPT structures for the page to have been built before, which is
- * done via kvm_tdp_map_page().  nr_premapped counts the number of pages that
- * were added to the EPT structures but not added with TDH.MEM.PAGE.ADD().
- * The counter has to be zero on KVM_TDX_FINALIZE_VM, to ensure that there
- * are no half-initialized shared EPT pages.
- */
-static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
-					  enum pg_level level, kvm_pfn_t pfn)
-{
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-
-	if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm))
-		return -EIO;
-
-	/* nr_premapped will be decreased when tdh_mem_page_add() is called. */
-	atomic64_inc(&kvm_tdx->nr_premapped);
-	return 0;
-}
-
 static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 				     enum pg_level level, kvm_pfn_t pfn)
 {
@@ -1638,14 +1615,30 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 		return -EIO;
 
 	/*
-	 * Read 'pre_fault_allowed' before 'kvm_tdx->state'; see matching
-	 * barrier in tdx_td_finalize().
+	 * Ensure pre_fault_allowed is read by kvm_arch_vcpu_pre_fault_memory()
+	 * before kvm_tdx->state.  Userspace must not be allowed to pre-fault
+	 * arbitrary memory until the initial memory image is finalized.  Pairs
+	 * with the smp_wmb() in tdx_td_finalize().
 	 */
 	smp_rmb();
-	if (likely(kvm_tdx->state == TD_STATE_RUNNABLE))
-		return tdx_mem_page_aug(kvm, gfn, level, pfn);
 
-	return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
+	/*
+	 * If the TD isn't finalized/runnable, then userspace is initializing
+	 * the VM image via KVM_TDX_INIT_MEM_REGION.  Increment the number of
+	 * pages that need to be mapped and initialized via TDH.MEM.PAGE.ADD.
+	 * KVM_TDX_FINALIZE_VM checks the counter to ensure all mapped pages
+	 * have been added to the image, to prevent running the TD with a
+	 * valid mapping in the mirror EPT, but not in the S-EPT.
+	 */
+	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE)) {
+		if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm))
+			return -EIO;
+
+		atomic64_inc(&kvm_tdx->nr_premapped);
+		return 0;
+	}
+
+	return tdx_mem_page_aug(kvm, gfn, level, pfn);
 }
 
 static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Message-ID: <20251017003244.186495-15-seanjc@google.com>
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Date: Thu, 16 Oct 2025 17:32:32 -0700
From: Sean Christopherson
Subject: [PATCH v3 14/25] KVM: TDX: Bug the VM if extending the initial measurement fails

WARN and terminate the VM if TDH_MR_EXTEND fails, as extending the
measurement should fail if and only if there is a KVM bug, or if the S-EPT
mapping is invalid, and it should be impossible for the S-EPT mappings to
be removed between kvm_tdp_mmu_map_private_pfn() and tdh_mr_extend().
Holding slots_lock prevents zaps due to memslot updates,
filemap_invalidate_lock() prevents zaps due to guest_memfd PUNCH_HOLE, and
all usage of kvm_zap_gfn_range() is mutually exclusive with S-EPT entries
that can be used for the initial image.  The call from sev.c is obviously
mutually exclusive, TDX disallows KVM_X86_QUIRK_IGNORE_GUEST_PAT so the
same goes for kvm_noncoherent_dma_assignment_start_or_stop(), and while
__kvm_set_or_clear_apicv_inhibit() can likely be tripped while building
the image, the APIC page has its own non-guest_memfd memslot and so can't
be used for the initial image, which means that too is mutually exclusive.

Opportunistically switch to "goto" to jump around the measurement code,
partly to make it clear that KVM needs to bail entirely if extending the
measurement fails, partly in anticipation of reworking how and when
TDH_MEM_PAGE_ADD is done.

Fixes: d789fa6efac9 ("KVM: TDX: Handle vCPU dissociation")
Signed-off-by: Yan Zhao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/tdx.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index c37591730cc5..f4bab75d3ffb 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3151,14 +3151,22 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 
 	KVM_BUG_ON(atomic64_dec_return(&kvm_tdx->nr_premapped) < 0, kvm);
 
-	if (arg->flags & KVM_TDX_MEASURE_MEMORY_REGION) {
-		for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
-			err = tdh_mr_extend(&kvm_tdx->td, gpa + i, &entry,
-					    &level_state);
-			if (err) {
-				ret = -EIO;
-				break;
-			}
+	if (!(arg->flags & KVM_TDX_MEASURE_MEMORY_REGION))
+		goto out;
+
+	/*
+	 * Note, MR.EXTEND can fail if the S-EPT mapping is somehow removed
+	 * between mapping the pfn and now, but slots_lock prevents memslot
+	 * updates, filemap_invalidate_lock() prevents guest_memfd updates,
+	 * mmu_notifier events can't reach S-EPT entries, and KVM's internal
+	 * zapping flows are mutually exclusive with S-EPT mappings.
+	 */
+	for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
+		err = tdh_mr_extend(&kvm_tdx->td, gpa + i, &entry, &level_state);
+		if (KVM_BUG_ON(err, kvm)) {
+			pr_tdx_error_2(TDH_MR_EXTEND, err, entry, level_state);
+			ret = -EIO;
+			goto out;
 		}
 	}
 
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
X-Received: from pjir1.prod.google.com ([2002:a17:90a:5c81:b0:31f:2a78:943]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:270a:b0:32e:8c14:5d09 with SMTP id 98e67ed59e1d1-33bcf86287fmr1924301a91.7.1760661194542; Thu, 16 Oct 2025 17:33:14 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 16 Oct 2025 17:32:33 -0700 In-Reply-To: <20251017003244.186495-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20251017003244.186495-1-seanjc@google.com> X-Mailer: git-send-email 2.51.0.858.gf9c4a03a3a-goog Message-ID: <20251017003244.186495-16-seanjc@google.com> Subject: [PATCH v3 15/25] KVM: TDX: ADD pages to the TD image while populating mirror EPT entries From: Sean Christopherson To: Marc Zyngier , Oliver Upton , Tianrui Zhao , Bibo Mao , Huacai Chen , Madhavan Srinivasan , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Christian Borntraeger , Janosch Frank , Claudio Imbrenda , Sean Christopherson , Paolo Bonzini , "Kirill A. Shutemov" Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, Ira Weiny , Kai Huang , Michael Roth , Yan Zhao , Vishal Annapurve , Rick Edgecombe , Ackerley Tng , Binbin Wu Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When populating the initial memory image for a TDX guest, ADD pages to the TD as part of establishing the mappings in the mirror EPT, as opposed to creating the mappings and then doing ADD after the fact. Doing ADD in the S-EPT callbacks eliminates the need to track "premapped" pages, as the mirror EPT (M-EPT) and S-EPT are always synchronized, e.g. 
if ADD fails, KVM reverts to the previous M-EPT entry (guaranteed to be !PRESENT). Eliminating the hole where the M-EPT can have a mapping that doesn't exist in the S-EPT in turn obviates the need to handle errors that are unique to encountering a missing S-EPT entry (see tdx_is_sept_zap_err_due_to_premap()= ). Keeping the M-EPT and S-EPT synchronized also eliminates the need to check for unconsumed "premap" entries during tdx_td_finalize(), as there simply can't be any such entries. Dropping that check in particular reduces the overall cognitive load, as the managemented of nr_premapped with respect to removal of S-EPT is _very_ subtle. E.g. successful removal of an S-EPT entry after it completed ADD doesn't adjust nr_premapped, but it's not clear why that's "ok" but having half-baked entries is not (it's not truly "ok" in that removing pages from the image will likely prevent the guest from booting, but from KVM's perspective it's "ok"). Doing ADD in the S-EPT path requires passing an argument via a scratch field, but the current approach of tracking the number of "premapped" pages effectively does the same. And the "premapped" counter is much more dangerous, as it doesn't have a singular lock to protect its usage, since nr_premapped can be modified as soon as mmu_lock is dropped, at least in theory. I.e. nr_premapped is guarded by slots_lock, but only for "happy" paths. Note, this approach was used/tried at various points in TDX development, but was ultimately discarded due to a desire to avoid stashing temporary state in kvm_tdx. But as above, KVM ended up with such state anyways, and fully committing to using temporary state provides better access rules (100% guarded by slots_lock), and makes several edge cases flat out impossible. Note #2, continue to extend the measurement outside of mmu_lock, as it's a slow operation (typically 16 SEAMCALLs per page whose data is included in the measurement), and doesn't *need* to be done under mmu_lock, e.g. 
for consistency purposes. However, MR.EXTEND isn't _that_ slow, e.g. ~1ms
latency to measure a full page, so if it needs to be done under mmu_lock
in the future, e.g. because KVM gains a flow that can remove S-EPT entries
during KVM_TDX_INIT_MEM_REGION, then extending the measurement can also
be moved into the S-EPT mapping path (again, only if absolutely
necessary).

P.S. _If_ MR.EXTEND is moved into the S-EPT path, take care not to return
an error up the stack if TDH_MR_EXTEND fails, as removing the M-EPT entry
but not the S-EPT entry would result in inconsistent state!

Reviewed-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
---
 arch/x86/kvm/vmx/tdx.c | 114 ++++++++++++++--------------------------
 arch/x86/kvm/vmx/tdx.h |   8 ++-
 2 files changed, 45 insertions(+), 77 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index f4bab75d3ffb..76030461c8f7 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1583,6 +1583,32 @@ void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa);
 }
 
+static int tdx_mem_page_add(struct kvm *kvm, gfn_t gfn, enum pg_level level,
+			    kvm_pfn_t pfn)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	u64 err, entry, level_state;
+	gpa_t gpa = gfn_to_gpa(gfn);
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm) ||
+	    KVM_BUG_ON(!kvm_tdx->page_add_src, kvm))
+		return -EIO;
+
+	err = tdh_mem_page_add(&kvm_tdx->td, gpa, pfn_to_page(pfn),
+			       kvm_tdx->page_add_src, &entry, &level_state);
+	if (unlikely(tdx_operand_busy(err)))
+		return -EBUSY;
+
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error_2(TDH_MEM_PAGE_ADD, err, entry, level_state);
+		return -EIO;
+	}
+
+	return 0;
+}
+
 static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn, enum pg_level level,
 			    kvm_pfn_t pfn)
 {
@@ -1624,19 +1650,10 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 
 	/*
 	 * If the TD isn't finalized/runnable, then userspace is initializing
-	 * the VM image via KVM_TDX_INIT_MEM_REGION. Increment the number of
-	 * pages that need to be mapped and initialized via TDH.MEM.PAGE.ADD.
-	 * KVM_TDX_FINALIZE_VM checks the counter to ensure all mapped pages
-	 * have been added to the image, to prevent running the TD with a
-	 * valid mapping in the mirror EPT, but not in the S-EPT.
+	 * the VM image via KVM_TDX_INIT_MEM_REGION; ADD the page to the TD.
 	 */
-	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE)) {
-		if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm))
-			return -EIO;
-
-		atomic64_inc(&kvm_tdx->nr_premapped);
-		return 0;
-	}
+	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE))
+		return tdx_mem_page_add(kvm, gfn, level, pfn);
 
 	return tdx_mem_page_aug(kvm, gfn, level, pfn);
 }
@@ -1662,39 +1679,6 @@ static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
-/*
- * Check if the error returned from a SEPT zap SEAMCALL is due to that a page is
- * mapped by KVM_TDX_INIT_MEM_REGION without tdh_mem_page_add() being called
- * successfully.
- *
- * Since tdh_mem_sept_add() must have been invoked successfully before a
- * non-leaf entry present in the mirrored page table, the SEPT ZAP related
- * SEAMCALLs should not encounter err TDX_EPT_WALK_FAILED. They should instead
- * find TDX_EPT_ENTRY_STATE_INCORRECT due to an empty leaf entry found in the
- * SEPT.
- *
- * Further check if the returned entry from SEPT walking is with RWX permissions
- * to filter out anything unexpected.
- *
- * Note: @level is pg_level, not the tdx_level. The tdx_level extracted from
- * level_state returned from a SEAMCALL error is the same as that passed into
- * the SEAMCALL.
- */
-static int tdx_is_sept_zap_err_due_to_premap(struct kvm_tdx *kvm_tdx, u64 err,
-					     u64 entry, int level)
-{
-	if (!err || kvm_tdx->state == TD_STATE_RUNNABLE)
-		return false;
-
-	if (err != (TDX_EPT_ENTRY_STATE_INCORRECT | TDX_OPERAND_ID_RCX))
-		return false;
-
-	if ((is_last_spte(entry, level) && (entry & VMX_EPT_RWX_MASK)))
-		return false;
-
-	return true;
-}
-
 static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 				     enum pg_level level, struct page *page)
 {
@@ -1714,12 +1698,6 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
 		tdx_no_vcpus_enter_stop(kvm);
 	}
-	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level)) {
-		if (KVM_BUG_ON(atomic64_dec_return(&kvm_tdx->nr_premapped) < 0, kvm))
-			return -EIO;
-
-		return 0;
-	}
 
 	if (KVM_BUG_ON(err, kvm)) {
 		pr_tdx_error_2(TDH_MEM_RANGE_BLOCK, err, entry, level_state);
@@ -2825,12 +2803,6 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 
 	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
-	/*
-	 * Pages are pending for KVM_TDX_INIT_MEM_REGION to issue
-	 * TDH.MEM.PAGE.ADD().
-	 */
-	if (atomic64_read(&kvm_tdx->nr_premapped))
-		return -EINVAL;
 
 	cmd->hw_error = tdh_mr_finalize(&kvm_tdx->td);
 	if (tdx_operand_busy(cmd->hw_error))
@@ -3127,6 +3099,9 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	struct page *src_page;
 	int ret, i;
 
+	if (KVM_BUG_ON(kvm_tdx->page_add_src, kvm))
+		return -EIO;
+
 	/*
 	 * Get the source page if it has been faulted in. Return failure if the
	 * source page has been swapped out or unmapped in primary memory.
@@ -3137,22 +3112,14 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	if (ret != 1)
 		return -ENOMEM;
 
+	kvm_tdx->page_add_src = src_page;
 	ret = kvm_tdp_mmu_map_private_pfn(arg->vcpu, gfn, pfn);
-	if (ret < 0)
-		goto out;
+	kvm_tdx->page_add_src = NULL;
 
-	ret = 0;
-	err = tdh_mem_page_add(&kvm_tdx->td, gpa, pfn_to_page(pfn),
-			       src_page, &entry, &level_state);
-	if (err) {
-		ret = unlikely(tdx_operand_busy(err)) ? -EBUSY : -EIO;
-		goto out;
-	}
+	put_page(src_page);
 
-	KVM_BUG_ON(atomic64_dec_return(&kvm_tdx->nr_premapped) < 0, kvm);
-
-	if (!(arg->flags & KVM_TDX_MEASURE_MEMORY_REGION))
-		goto out;
+	if (ret || !(arg->flags & KVM_TDX_MEASURE_MEMORY_REGION))
+		return ret;
 
 	/*
 	 * Note, MR.EXTEND can fail if the S-EPT mapping is somehow removed
@@ -3165,14 +3132,11 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 		err = tdh_mr_extend(&kvm_tdx->td, gpa + i, &entry, &level_state);
 		if (KVM_BUG_ON(err, kvm)) {
 			pr_tdx_error_2(TDH_MR_EXTEND, err, entry, level_state);
-			ret = -EIO;
-			goto out;
+			return -EIO;
 		}
 	}
 
-out:
-	put_page(src_page);
-	return ret;
+	return 0;
 }
 
 static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd)
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index ca39a9391db1..1b00adbbaf77 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -36,8 +36,12 @@ struct kvm_tdx {
 
 	struct tdx_td td;
 
-	/* For KVM_TDX_INIT_MEM_REGION. */
-	atomic64_t nr_premapped;
+	/*
+	 * Scratch pointer used to pass the source page to tdx_mem_page_add.
+	 * Protected by slots_lock, and non-NULL only when mapping a private
+	 * pfn via tdx_gmem_post_populate().
+	 */
+	struct page *page_add_src;
 
 	/*
 	 * Prevent vCPUs from TD entry to ensure SEPT zap related SEAMCALLs do
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:34 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-17-seanjc@google.com>
Subject: [PATCH v3 16/25] KVM: TDX: Fold tdx_sept_zap_private_spte() into tdx_sept_remove_private_spte()
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda, Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth, Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu

Do TDH_MEM_RANGE_BLOCK directly in tdx_sept_remove_private_spte() instead
of using a one-off helper now that the nr_premapped tracking is gone.

Opportunistically drop the WARN on hugepages, which was dead code (see the
KVM_BUG_ON() in tdx_sept_remove_private_spte()).

No functional change intended.
Reviewed-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
---
 arch/x86/kvm/vmx/tdx.c | 41 +++++++++++------------------------------
 1 file changed, 11 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 76030461c8f7..d77b2de6db8a 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1679,33 +1679,6 @@ static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
-static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
-				     enum pg_level level, struct page *page)
-{
-	int tdx_level = pg_level_to_tdx_sept_level(level);
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	gpa_t gpa = gfn_to_gpa(gfn) & KVM_HPAGE_MASK(level);
-	u64 err, entry, level_state;
-
-	/* For now large page isn't supported yet. */
-	WARN_ON_ONCE(level != PG_LEVEL_4K);
-
-	err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
-
-	if (unlikely(tdx_operand_busy(err))) {
-		/* After no vCPUs enter, the second retry is expected to succeed */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error_2(TDH_MEM_RANGE_BLOCK, err, entry, level_state);
-		return -EIO;
-	}
-	return 1;
-}
-
 /*
  * Ensure shared and private EPTs to be flushed on all vCPUs.
  * tdh_mem_track() is the only caller that increases TD epoch. An increase in
@@ -1786,7 +1759,6 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 	gpa_t gpa = gfn_to_gpa(gfn);
 	u64 err, entry, level_state;
-	int ret;
 
 	/*
 	 * HKID is released after all private pages have been removed, and set
@@ -1800,9 +1772,18 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
 		return;
 
-	ret = tdx_sept_zap_private_spte(kvm, gfn, level, page);
-	if (ret <= 0)
+	err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
+	if (unlikely(tdx_operand_busy(err))) {
+		/* After no vCPUs enter, the second retry is expected to succeed */
+		tdx_no_vcpus_enter_start(kvm);
+		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
+		tdx_no_vcpus_enter_stop(kvm);
+	}
+
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error_2(TDH_MEM_RANGE_BLOCK, err, entry, level_state);
 		return;
+	}
 
 	/*
 	 * TDX requires TLB tracking before dropping private page. Do
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:35 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-18-seanjc@google.com>
Subject: [PATCH v3 17/25] KVM: TDX: Combine KVM_BUG_ON + pr_tdx_error() into TDX_BUG_ON()
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda, Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth, Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu

Add TDX_BUG_ON() macros (with varying numbers of arguments) to deduplicate
the myriad flows that do KVM_BUG_ON()/WARN_ON_ONCE() followed by a call to
pr_tdx_error(). In addition to reducing boilerplate copy+paste code, this
also helps ensure that KVM provides consistent handling of SEAMCALL errors.

Opportunistically convert a handful of bare WARN_ON_ONCE() paths to the
equivalent of KVM_BUG_ON(), i.e. have them terminate the VM. If a SEAMCALL
error is fatal enough to WARN on, it's fatal enough to terminate the TD.
Reviewed-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/tdx.c | 114 +++++++++++++++++------------------------
 1 file changed, 47 insertions(+), 67 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d77b2de6db8a..2d587a38581e 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -24,20 +24,32 @@
 #undef pr_fmt
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
-#define pr_tdx_error(__fn, __err)	\
-	pr_err_ratelimited("SEAMCALL %s failed: 0x%llx\n", #__fn, __err)
+#define __TDX_BUG_ON(__err, __f, __kvm, __fmt, __args...)		\
+({									\
+	struct kvm *_kvm = (__kvm);					\
+	bool __ret = !!(__err);						\
+									\
+	if (WARN_ON_ONCE(__ret && (!_kvm || !_kvm->vm_bugged))) {	\
+		if (_kvm)						\
+			kvm_vm_bugged(_kvm);				\
+		pr_err_ratelimited("SEAMCALL " __f " failed: 0x%llx" __fmt "\n",\
+				   __err, __args);			\
+	}								\
+	unlikely(__ret);						\
+})
 
-#define __pr_tdx_error_N(__fn_str, __err, __fmt, ...)	\
-	pr_err_ratelimited("SEAMCALL " __fn_str " failed: 0x%llx, " __fmt, __err, __VA_ARGS__)
+#define TDX_BUG_ON(__err, __fn, __kvm)					\
+	__TDX_BUG_ON(__err, #__fn, __kvm, "%s", "")
 
-#define pr_tdx_error_1(__fn, __err, __rcx)	\
-	__pr_tdx_error_N(#__fn, __err, "rcx 0x%llx\n", __rcx)
+#define TDX_BUG_ON_1(__err, __fn, __rcx, __kvm)				\
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx", __rcx)
 
-#define pr_tdx_error_2(__fn, __err, __rcx, __rdx)	\
-	__pr_tdx_error_N(#__fn, __err, "rcx 0x%llx, rdx 0x%llx\n", __rcx, __rdx)
+#define TDX_BUG_ON_2(__err, __fn, __rcx, __rdx, __kvm)			\
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx, rdx 0x%llx", __rcx, __rdx)
+
+#define TDX_BUG_ON_3(__err, __fn, __rcx, __rdx, __r8, __kvm)		\
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx, rdx 0x%llx, r8 0x%llx", __rcx, __rdx, __r8)
 
-#define pr_tdx_error_3(__fn, __err, __rcx, __rdx, __r8)	\
-	__pr_tdx_error_N(#__fn, __err, "rcx 0x%llx, rdx 0x%llx, r8 0x%llx\n", __rcx, __rdx, __r8)
 
 bool enable_tdx __ro_after_init;
 module_param_named(tdx, enable_tdx, bool, 0444);
@@ -313,10 +325,9 @@ static int __tdx_reclaim_page(struct page *page)
 	 * before the HKID is released and control pages have also been
 	 * released at this point, so there is no possibility of contention.
 	 */
-	if (WARN_ON_ONCE(err)) {
-		pr_tdx_error_3(TDH_PHYMEM_PAGE_RECLAIM, err, rcx, rdx, r8);
+	if (TDX_BUG_ON_3(err, TDH_PHYMEM_PAGE_RECLAIM, rcx, rdx, r8, NULL))
 		return -EIO;
-	}
+
 	return 0;
 }
 
@@ -404,8 +415,8 @@ static void tdx_flush_vp_on_cpu(struct kvm_vcpu *vcpu)
 		return;
 
 	smp_call_function_single(cpu, tdx_flush_vp, &arg, 1);
-	if (KVM_BUG_ON(arg.err, vcpu->kvm))
-		pr_tdx_error(TDH_VP_FLUSH, arg.err);
+
+	TDX_BUG_ON(arg.err, TDH_VP_FLUSH, vcpu->kvm);
 }
 
 void tdx_disable_virtualization_cpu(void)
@@ -464,8 +475,7 @@ static void smp_func_do_phymem_cache_wb(void *unused)
 	}
 
 out:
-	if (WARN_ON_ONCE(err))
-		pr_tdx_error(TDH_PHYMEM_CACHE_WB, err);
+	TDX_BUG_ON(err, TDH_PHYMEM_CACHE_WB, NULL);
 }
 
 void tdx_mmu_release_hkid(struct kvm *kvm)
@@ -504,8 +514,7 @@ void tdx_mmu_release_hkid(struct kvm *kvm)
 	err = tdh_mng_vpflushdone(&kvm_tdx->td);
 	if (err == TDX_FLUSHVP_NOT_DONE)
 		goto out;
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error(TDH_MNG_VPFLUSHDONE, err);
+	if (TDX_BUG_ON(err, TDH_MNG_VPFLUSHDONE, kvm)) {
 		pr_err("tdh_mng_vpflushdone() failed. HKID %d is leaked.\n",
 		       kvm_tdx->hkid);
 		goto out;
@@ -528,8 +537,7 @@ void tdx_mmu_release_hkid(struct kvm *kvm)
 	 * tdh_mng_key_freeid() will fail.
 	 */
 	err = tdh_mng_key_freeid(&kvm_tdx->td);
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error(TDH_MNG_KEY_FREEID, err);
+	if (TDX_BUG_ON(err, TDH_MNG_KEY_FREEID, kvm)) {
 		pr_err("tdh_mng_key_freeid() failed. HKID %d is leaked.\n",
 		       kvm_tdx->hkid);
 	} else {
@@ -580,10 +588,9 @@ static void tdx_reclaim_td_control_pages(struct kvm *kvm)
 	 * when it is reclaiming TDCS).
 	 */
 	err = tdh_phymem_page_wbinvd_tdr(&kvm_tdx->td);
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
+	if (TDX_BUG_ON(err, TDH_PHYMEM_PAGE_WBINVD, kvm))
 		return;
-	}
+
 	tdx_quirk_reset_page(kvm_tdx->td.tdr_page);
 
 	__free_page(kvm_tdx->td.tdr_page);
@@ -606,11 +613,8 @@ static int tdx_do_tdh_mng_key_config(void *param)
 
 	/* TDX_RND_NO_ENTROPY related retries are handled by sc_retry() */
 	err = tdh_mng_key_config(&kvm_tdx->td);
-
-	if (KVM_BUG_ON(err, &kvm_tdx->kvm)) {
-		pr_tdx_error(TDH_MNG_KEY_CONFIG, err);
+	if (TDX_BUG_ON(err, TDH_MNG_KEY_CONFIG, &kvm_tdx->kvm))
 		return -EIO;
-	}
 
 	return 0;
 }
@@ -1601,10 +1605,8 @@ static int tdx_mem_page_add(struct kvm *kvm, gfn_t gfn, enum pg_level level,
 	if (unlikely(tdx_operand_busy(err)))
 		return -EBUSY;
 
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error_2(TDH_MEM_PAGE_ADD, err, entry, level_state);
+	if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_ADD, entry, level_state, kvm))
 		return -EIO;
-	}
 
 	return 0;
 }
@@ -1623,10 +1625,8 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 	if (unlikely(tdx_operand_busy(err)))
 		return -EBUSY;
 
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error_2(TDH_MEM_PAGE_AUG, err, entry, level_state);
+	if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_AUG, entry, level_state, kvm))
 		return -EIO;
-	}
 
 	return 0;
 }
@@ -1671,10 +1671,8 @@ static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
 	if (unlikely(tdx_operand_busy(err)))
 		return -EBUSY;
 
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error_2(TDH_MEM_SEPT_ADD, err, entry, level_state);
+	if (TDX_BUG_ON_2(err, TDH_MEM_SEPT_ADD, entry, level_state, kvm))
 		return -EIO;
-	}
 
 	return 0;
 }
@@ -1722,8 +1720,7 @@ static void tdx_track(struct kvm *kvm)
 		tdx_no_vcpus_enter_stop(kvm);
 	}
 
-	if (KVM_BUG_ON(err, kvm))
-		pr_tdx_error(TDH_MEM_TRACK, err);
+	TDX_BUG_ON(err, TDH_MEM_TRACK, kvm);
 
 	kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
 }
@@ -1780,10 +1777,8 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 		tdx_no_vcpus_enter_stop(kvm);
 	}
 
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error_2(TDH_MEM_RANGE_BLOCK, err, entry, level_state);
+	if (TDX_BUG_ON_2(err, TDH_MEM_RANGE_BLOCK, entry, level_state, kvm))
 		return;
-	}
 
 	/*
 	 * TDX requires TLB tracking before dropping private page. Do
@@ -1810,16 +1805,12 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 		tdx_no_vcpus_enter_stop(kvm);
 	}
 
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error_2(TDH_MEM_PAGE_REMOVE, err, entry, level_state);
+	if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_REMOVE, entry, level_state, kvm))
 		return;
-	}
 
 	err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, page);
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
+	if (TDX_BUG_ON(err, TDH_PHYMEM_PAGE_WBINVD, kvm))
 		return;
-	}
 
 	tdx_quirk_reset_page(page);
 }
@@ -2459,8 +2450,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params,
 		goto free_packages;
 	}
 
-	if (WARN_ON_ONCE(err)) {
-		pr_tdx_error(TDH_MNG_CREATE, err);
+	if (TDX_BUG_ON(err, TDH_MNG_CREATE, kvm)) {
 		ret = -EIO;
 		goto free_packages;
 	}
@@ -2501,8 +2491,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params,
 			ret = -EAGAIN;
 			goto teardown;
 		}
-		if (WARN_ON_ONCE(err)) {
-			pr_tdx_error(TDH_MNG_ADDCX, err);
+		if (TDX_BUG_ON(err, TDH_MNG_ADDCX, kvm)) {
 			ret = -EIO;
 			goto teardown;
 		}
@@ -2519,8 +2508,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params,
 		*seamcall_err = err;
 		ret = -EINVAL;
 		goto teardown;
-	} else if (WARN_ON_ONCE(err)) {
-		pr_tdx_error_1(TDH_MNG_INIT, err, rcx);
+	} else if (TDX_BUG_ON_1(err, TDH_MNG_INIT, rcx, kvm)) {
 		ret = -EIO;
 		goto teardown;
 	}
@@ -2788,10 +2776,8 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 	cmd->hw_error = tdh_mr_finalize(&kvm_tdx->td);
 	if (tdx_operand_busy(cmd->hw_error))
 		return -EBUSY;
-	if (KVM_BUG_ON(cmd->hw_error, kvm)) {
-		pr_tdx_error(TDH_MR_FINALIZE, cmd->hw_error);
+	if (TDX_BUG_ON(cmd->hw_error, TDH_MR_FINALIZE, kvm))
 		return -EIO;
-	}
 
 	kvm_tdx->state = TD_STATE_RUNNABLE;
 	/* TD_STATE_RUNNABLE must be set before 'pre_fault_allowed' */
@@ -2878,16 +2864,14 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u64 vcpu_rcx)
 	}
 
 	err = tdh_vp_create(&kvm_tdx->td, &tdx->vp);
-	if (KVM_BUG_ON(err, vcpu->kvm)) {
+	if (TDX_BUG_ON(err, TDH_VP_CREATE, vcpu->kvm)) {
 		ret = -EIO;
-		pr_tdx_error(TDH_VP_CREATE, err);
 		goto free_tdcx;
 	}
 
 	for (i = 0; i < kvm_tdx->td.tdcx_nr_pages; i++) {
 		err = tdh_vp_addcx(&tdx->vp, tdx->vp.tdcx_pages[i]);
-		if (KVM_BUG_ON(err, vcpu->kvm)) {
-			pr_tdx_error(TDH_VP_ADDCX, err);
+		if (TDX_BUG_ON(err, TDH_VP_ADDCX, vcpu->kvm)) {
 			/*
 			 * Pages already added are reclaimed by the vcpu_free
 			 * method, but the rest are freed here.
@@ -2901,10 +2885,8 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u64 vcpu_rcx)
 	}
 
 	err = tdh_vp_init(&tdx->vp, vcpu_rcx, vcpu->vcpu_id);
-	if (KVM_BUG_ON(err, vcpu->kvm)) {
-		pr_tdx_error(TDH_VP_INIT, err);
+	if (TDX_BUG_ON(err, TDH_VP_INIT, vcpu->kvm))
 		return -EIO;
-	}
 
 	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 
@@ -3111,10 +3093,8 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	 */
 	for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
 		err = tdh_mr_extend(&kvm_tdx->td, gpa + i, &entry, &level_state);
-		if (KVM_BUG_ON(err, kvm)) {
-			pr_tdx_error_2(TDH_MR_EXTEND, err, entry, level_state);
+		if (TDX_BUG_ON_2(err, TDH_MR_EXTEND, entry, level_state, kvm))
 			return -EIO;
-		}
 	}
 
 	return 0;
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:36 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-19-seanjc@google.com>
Subject: [PATCH v3 18/25]
 KVM: TDX: Derive error argument names from the local variable names
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
	Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
	linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
	linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
	Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu
Content-Type: text/plain; charset="utf-8"

When printing SEAMCALL errors, use the name of the variable holding an
error parameter instead of the register from whence it came, so that
flows which use descriptive variable names will similarly print
descriptive error messages.
Suggested-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/tdx.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 2d587a38581e..e517ad3d5f4f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -41,14 +41,15 @@
 #define TDX_BUG_ON(__err, __fn, __kvm) \
	__TDX_BUG_ON(__err, #__fn, __kvm, "%s", "")
 
-#define TDX_BUG_ON_1(__err, __fn, __rcx, __kvm) \
-	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx", __rcx)
+#define TDX_BUG_ON_1(__err, __fn, a1, __kvm) \
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", " #a1 " 0x%llx", a1)
 
-#define TDX_BUG_ON_2(__err, __fn, __rcx, __rdx, __kvm) \
-	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx, rdx 0x%llx", __rcx, __rdx)
+#define TDX_BUG_ON_2(__err, __fn, a1, a2, __kvm) \
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", " #a1 " 0x%llx, " #a2 " 0x%llx", a1, a2)
 
-#define TDX_BUG_ON_3(__err, __fn, __rcx, __rdx, __r8, __kvm) \
-	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx, rdx 0x%llx, r8 0x%llx", __rcx, __rdx, __r8)
+#define TDX_BUG_ON_3(__err, __fn, a1, a2, a3, __kvm) \
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", " #a1 " 0x%llx, " #a2 " 0x%llx, " #a3 " 0x%llx", \
+		     a1, a2, a3)
 
 
 bool enable_tdx __ro_after_init;
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:37 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-20-seanjc@google.com>
Subject: [PATCH v3 19/25] KVM: TDX: Assert that mmu_lock is held for write
 when removing S-EPT entries
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton,
	Tianrui Zhao, Bibo Mao, Huacai Chen, Madhavan Srinivasan, Anup Patel,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger,
	Janosch Frank, Claudio Imbrenda, Sean Christopherson, Paolo Bonzini,
	"Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
	linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
	linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
	Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu
Content-Type: text/plain; charset="utf-8"

Unconditionally assert that mmu_lock is held for write when removing
S-EPT entries, not just when removing S-EPT entries triggers certain
conditions, e.g. needs to do TDH_MEM_TRACK or kick vCPUs out of the
guest.  Conditionally asserting implies that it's safe to hold mmu_lock
for read when those paths aren't hit, which is simply not true, as KVM
doesn't support removing S-EPT entries under read-lock.

Only two paths lead to remove_external_spte(), and both paths assert
that mmu_lock is held for write (tdp_mmu_set_spte() via lockdep, and
handle_removed_pt() via KVM_BUG_ON()).

Deliberately leave lockdep assertions in the "no vCPUs" helpers to
document that wait_for_sept_zap is guarded by holding mmu_lock for
write.
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/tdx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index e517ad3d5f4f..f6782b0ffa98 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1711,8 +1711,6 @@ static void tdx_track(struct kvm *kvm)
 	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE))
 		return;
 
-	lockdep_assert_held_write(&kvm->mmu_lock);
-
 	err = tdh_mem_track(&kvm_tdx->td);
 	if (unlikely(tdx_operand_busy(err))) {
 		/* After no vCPUs enter, the second retry is expected to succeed */
@@ -1758,6 +1756,8 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	gpa_t gpa = gfn_to_gpa(gfn);
 	u64 err, entry, level_state;
 
+	lockdep_assert_held_write(&kvm->mmu_lock);
+
 	/*
 	 * HKID is released after all private pages have been removed, and set
 	 * before any might be populated. Warn if zapping is attempted when
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:38 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-21-seanjc@google.com>
Subject: [PATCH v3 20/25] KVM: TDX: Add macro to retry SEAMCALLs when
 forcing vCPUs out of guest
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
	Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
	linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
	linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
	Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu
Content-Type: text/plain; charset="utf-8"

Add a macro to handle kicking vCPUs out of the guest and retrying
SEAMCALLs on -EBUSY instead of providing small helpers to be used by
each SEAMCALL.  Wrapping the SEAMCALLs in a macro makes it a little
harder to tease out which SEAMCALL is being made, but significantly
reduces the amount of copy+paste code and makes it all but impossible
to leave an elevated wait_for_sept_zap.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/tdx.c | 72 ++++++++++++++----------------------------
 1 file changed, 23 insertions(+), 49 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index f6782b0ffa98..2e2dab89c98f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -294,25 +294,24 @@ static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
 	vcpu->cpu = -1;
 }
 
-static void tdx_no_vcpus_enter_start(struct kvm *kvm)
-{
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-
-	lockdep_assert_held_write(&kvm->mmu_lock);
-
-	WRITE_ONCE(kvm_tdx->wait_for_sept_zap, true);
-
-	kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
-}
-
-static void tdx_no_vcpus_enter_stop(struct kvm *kvm)
-{
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-
-	lockdep_assert_held_write(&kvm->mmu_lock);
-
-	WRITE_ONCE(kvm_tdx->wait_for_sept_zap, false);
-}
+#define tdh_do_no_vcpus(tdh_func, kvm, args...) \
+({ \
+	struct kvm_tdx *__kvm_tdx = to_kvm_tdx(kvm); \
+	u64 __err; \
+ \
+	lockdep_assert_held_write(&kvm->mmu_lock); \
+ \
+	__err = tdh_func(args); \
+	if (unlikely(tdx_operand_busy(__err))) { \
+		WRITE_ONCE(__kvm_tdx->wait_for_sept_zap, true); \
+		kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE); \
+ \
+		__err = tdh_func(args); \
+ \
+		WRITE_ONCE(__kvm_tdx->wait_for_sept_zap, false); \
+	} \
+	__err; \
+})
 
 /* TDH.PHYMEM.PAGE.RECLAIM is allowed only when destroying the TD. */
 static int __tdx_reclaim_page(struct page *page)
@@ -1711,14 +1710,7 @@ static void tdx_track(struct kvm *kvm)
 	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE))
 		return;
 
-	err = tdh_mem_track(&kvm_tdx->td);
-	if (unlikely(tdx_operand_busy(err))) {
-		/* After no vCPUs enter, the second retry is expected to succeed */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_track(&kvm_tdx->td);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
+	err = tdh_do_no_vcpus(tdh_mem_track, kvm, &kvm_tdx->td);
 	TDX_BUG_ON(err, TDH_MEM_TRACK, kvm);
 
 	kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
@@ -1770,14 +1762,8 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
 		return;
 
-	err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
-	if (unlikely(tdx_operand_busy(err))) {
-		/* After no vCPUs enter, the second retry is expected to succeed */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
+	err = tdh_do_no_vcpus(tdh_mem_range_block, kvm, &kvm_tdx->td, gpa,
+			      tdx_level, &entry, &level_state);
 	if (TDX_BUG_ON_2(err, TDH_MEM_RANGE_BLOCK, entry, level_state, kvm))
 		return;
 
@@ -1792,20 +1778,8 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	 * with other vcpu sept operation.
 	 * Race with TDH.VP.ENTER due to (0-step mitigation) and Guest TDCALLs.
 	 */
-	err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
-				  &level_state);
-
-	if (unlikely(tdx_operand_busy(err))) {
-		/*
-		 * The second retry is expected to succeed after kicking off all
-		 * other vCPUs and prevent them from invoking TDH.VP.ENTER.
-		 */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
-					  &level_state);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
+	err = tdh_do_no_vcpus(tdh_mem_page_remove, kvm, &kvm_tdx->td, gpa,
+			      tdx_level, &entry, &level_state);
 	if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_REMOVE, entry, level_state, kvm))
 		return;
 
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:39 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-22-seanjc@google.com>
Subject: [PATCH v3 21/25] KVM: TDX: Add tdx_get_cmd() helper to get and
 validate sub-ioctl command
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
	Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
	linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
	linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
	Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu
Content-Type: text/plain; charset="utf-8"

Add a helper to copy a kvm_tdx_cmd structure from userspace and verify
that must-be-zero fields are indeed zero.
No functional change intended.

Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
Reviewed-by: Rick Edgecombe
---
 arch/x86/kvm/vmx/tdx.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 2e2dab89c98f..d5f810435f34 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2761,20 +2761,25 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 	return 0;
 }
 
+static int tdx_get_cmd(void __user *argp, struct kvm_tdx_cmd *cmd)
+{
+	if (copy_from_user(cmd, argp, sizeof(*cmd)))
+		return -EFAULT;
+
+	if (cmd->hw_error)
+		return -EINVAL;
+
+	return 0;
+}
+
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_tdx_cmd tdx_cmd;
 	int r;
 
-	if (copy_from_user(&tdx_cmd, argp, sizeof(struct kvm_tdx_cmd)))
-		return -EFAULT;
-
-	/*
-	 * Userspace should never set hw_error. It is used to fill
-	 * hardware-defined error by the kernel.
-	 */
-	if (tdx_cmd.hw_error)
-		return -EINVAL;
+	r = tdx_get_cmd(argp, &tdx_cmd);
+	if (r)
+		return r;
 
 	mutex_lock(&kvm->lock);
 
@@ -3152,11 +3157,9 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
 
-	if (copy_from_user(&cmd, argp, sizeof(cmd)))
-		return -EFAULT;
-
-	if (cmd.hw_error)
-		return -EINVAL;
+	ret = tdx_get_cmd(argp, &cmd);
+	if (ret)
+		return ret;
 
 	switch (cmd.id) {
 	case KVM_TDX_INIT_VCPU:
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 17:32:40 -0700
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251017003244.186495-1-seanjc@google.com>
Message-ID: <20251017003244.186495-23-seanjc@google.com>
Subject: [PATCH v3 22/25] KVM: TDX: Convert INIT_MEM_REGION and INIT_VCPU
 to "unlocked" vCPU
 ioctl
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
	Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
	linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
	linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Michael Roth,
	Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng, Binbin Wu
Content-Type: text/plain; charset="utf-8"

Handle the KVM_TDX_INIT_MEM_REGION and KVM_TDX_INIT_VCPU vCPU sub-ioctls
in the unlocked variant, i.e. outside of vcpu->mutex, in anticipation of
taking kvm->lock along with all other vCPU mutexes, at which point the
sub-ioctls _must_ start without vcpu->mutex held.

No functional change intended.
Co-developed-by: Yan Zhao
Signed-off-by: Yan Zhao
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/vmx/main.c            |  9 +++++++
 arch/x86/kvm/vmx/tdx.c             | 42 +++++++++++++++++++++++++----
 arch/x86/kvm/vmx/x86_ops.h         |  1 +
 arch/x86/kvm/x86.c                 |  7 +++++
 6 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index fdf178443f85..de709fb5bd76 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -128,6 +128,7 @@ KVM_X86_OP(enable_smi_window)
 KVM_X86_OP_OPTIONAL(dev_get_attr)
 KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
 KVM_X86_OP_OPTIONAL(vcpu_mem_enc_ioctl)
+KVM_X86_OP_OPTIONAL(vcpu_mem_enc_unlocked_ioctl)
 KVM_X86_OP_OPTIONAL(mem_enc_register_region)
 KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
 KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7e92aebd07e8..fda24da9e4e5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1914,6 +1914,7 @@ struct kvm_x86_ops {
 	int (*dev_get_attr)(u32 group, u64 attr, u64 *val);
 	int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp);
 	int (*vcpu_mem_enc_ioctl)(struct kvm_vcpu *vcpu, void __user *argp);
+	int (*vcpu_mem_enc_unlocked_ioctl)(struct kvm_vcpu *vcpu, void __user *argp);
 	int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp);
 	int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp);
 	int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 0eb2773b2ae2..a46ccd670785 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -831,6 +831,14 @@ static int vt_vcpu_mem_enc_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 	return tdx_vcpu_ioctl(vcpu, argp);
 }
 
+static int vt_vcpu_mem_enc_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+	if (!is_td_vcpu(vcpu))
+		return -EINVAL;
+
+	return tdx_vcpu_unlocked_ioctl(vcpu, argp);
+}
+
 static int vt_gmem_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn,
 				     bool is_private)
 {
@@ -1005,6 +1013,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 
 	.mem_enc_ioctl = vt_op_tdx_only(mem_enc_ioctl),
 	.vcpu_mem_enc_ioctl = vt_op_tdx_only(vcpu_mem_enc_ioctl),
+	.vcpu_mem_enc_unlocked_ioctl = vt_op_tdx_only(vcpu_mem_enc_unlocked_ioctl),
 
 	.gmem_max_mapping_level = vt_op_tdx_only(gmem_max_mapping_level)
 };
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d5f810435f34..1de5f17a7989 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3148,6 +3148,42 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
 	return ret;
 }
 
+int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+	struct kvm_tdx_cmd cmd;
+	int r;
+
+	r = tdx_get_cmd(argp, &cmd);
+	if (r)
+		return r;
+
+	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
+		return -EINVAL;
+
+	if (mutex_lock_killable(&vcpu->mutex))
+		return -EINTR;
+
+	vcpu_load(vcpu);
+
+	switch (cmd.id) {
+	case KVM_TDX_INIT_MEM_REGION:
+		r = tdx_vcpu_init_mem_region(vcpu, &cmd);
+		break;
+	case KVM_TDX_INIT_VCPU:
+		r = tdx_vcpu_init(vcpu, &cmd);
+		break;
+	default:
+		r = -ENOIOCTLCMD;
+		break;
+	}
+
+	vcpu_put(vcpu);
+
+	mutex_unlock(&vcpu->mutex);
+	return r;
+}
+
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
@@ -3162,12 +3198,6 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 		return ret;
 
 	switch (cmd.id) {
-	case KVM_TDX_INIT_VCPU:
-		ret = tdx_vcpu_init(vcpu, &cmd);
-		break;
-	case KVM_TDX_INIT_MEM_REGION:
-		ret = tdx_vcpu_init_mem_region(vcpu, &cmd);
-		break;
 	case KVM_TDX_GET_CPUID:
 		ret = tdx_vcpu_get_cpuid(vcpu, &cmd);
 		break;
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 77613a44cebf..d09abeac2b56 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -148,6 +148,7 @@ int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
 int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
 
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
+int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
 
 void tdx_flush_tlb_current(struct kvm_vcpu *vcpu);
 void tdx_flush_tlb_all(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b85cb213a336..593fccc9cf1c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7243,6 +7243,13 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
 long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
 				  unsigned long arg)
 {
+	struct kvm_vcpu *vcpu = filp->private_data;
+	void __user *argp = (void __user *)arg;
+
+	if (ioctl == KVM_MEMORY_ENCRYPT_OP &&
+	    kvm_x86_ops.vcpu_mem_enc_unlocked_ioctl)
+		return kvm_x86_call(vcpu_mem_enc_unlocked_ioctl)(vcpu, argp);
+
 	return -ENOIOCTLCMD;
 }
 
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Subject: [PATCH v3 23/25] KVM: TDX: Use guard() to acquire kvm->lock in tdx_vm_ioctl()
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 16 Oct 2025 17:32:41 -0700
Message-ID: <20251017003244.186495-24-seanjc@google.com>
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
Use guard() in tdx_vm_ioctl() to tidy up the code a small amount, but more
importantly to minimize the diff of a future change, which will use
guard-like semantics to acquire and release multiple locks.

No functional change intended.

Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
Reviewed-by: Rick Edgecombe
---
 arch/x86/kvm/vmx/tdx.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 1de5f17a7989..84b5fe654c99 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2781,7 +2781,7 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 	if (r)
 		return r;
 
-	mutex_lock(&kvm->lock);
+	guard(mutex)(&kvm->lock);
 
 	switch (tdx_cmd.id) {
 	case KVM_TDX_CAPABILITIES:
@@ -2794,15 +2794,12 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 		r = tdx_td_finalize(kvm, &tdx_cmd);
 		break;
 	default:
-		r = -EINVAL;
-		goto out;
+		return -EINVAL;
 	}
 
 	if (copy_to_user(argp, &tdx_cmd, sizeof(struct kvm_tdx_cmd)))
-		r = -EFAULT;
+		return -EFAULT;
 
-out:
-	mutex_unlock(&kvm->lock);
 	return r;
 }
 
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Subject: [PATCH v3 24/25] KVM: TDX: Guard VM state transitions with "all" the locks
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 16 Oct 2025 17:32:42 -0700
Message-ID: <20251017003244.186495-25-seanjc@google.com>
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>

Acquire kvm->lock, kvm->slots_lock, and all vcpu->mutex locks when
servicing ioctls that (a) transition the TD to a new state, i.e. when
doing INIT or FINALIZE, or (b) are only valid if the TD is in a specific
state, i.e. when initializing a vCPU or memory region.

Acquiring "all" the locks fixes several KVM_BUG_ON() situations where a
SEAMCALL can fail due to racing actions, e.g. if tdh_vp_create() contends
with either tdh_mr_extend() or tdh_mr_finalize().  For all intents and
purposes, the paths in question are fully serialized, i.e. there's no
reason to try and allow anything remotely interesting to happen.  Smack
'em with a big hammer instead of trying to be "nice".

Acquire kvm->lock to prevent VM-wide things from happening, slots_lock to
prevent kvm_mmu_zap_all_fast(), and _all_ vCPU mutexes to prevent vCPUs
from interfering.  Use the recently-renamed kvm_arch_vcpu_unlocked_ioctl()
to service the vCPU-scoped ioctls to avoid a lock inversion problem, e.g.
due to taking vcpu->mutex outside kvm->lock.
See also commit ecf371f8b02d ("KVM: SVM: Reject SEV{-ES} intra host
migration if vCPU creation is in-flight"), which fixed a similar bug with
SEV intra-host migration where an in-flight vCPU creation could race with
a VM-wide state transition.

Define a fancy new CLASS to handle the lock+check => unlock logic with
guard()-like syntax:

	CLASS(tdx_vm_state_guard, guard)(kvm);
	if (IS_ERR(guard))
		return PTR_ERR(guard);

to simplify juggling the many locks.

Note!  Take kvm->slots_lock *after* all vcpu->mutex locks, as per KVM's
soon-to-be-documented lock ordering rules[1].

Link: https://lore.kernel.org/all/20251016235538.171962-1-seanjc@google.com [1]
Reported-by: Yan Zhao
Closes: https://lore.kernel.org/all/aLFiPq1smdzN3Ary@yzhao56-desk.sh.intel.com
Signed-off-by: Sean Christopherson
Suggested-by: Kai Huang
---
 arch/x86/kvm/vmx/tdx.c | 63 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 53 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 84b5fe654c99..d6541b08423f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2632,6 +2632,46 @@ static int tdx_read_cpuid(struct kvm_vcpu *vcpu, u32 leaf, u32 sub_leaf,
 	return -EIO;
 }
 
+typedef void *tdx_vm_state_guard_t;
+
+static tdx_vm_state_guard_t tdx_acquire_vm_state_locks(struct kvm *kvm)
+{
+	int r;
+
+	mutex_lock(&kvm->lock);
+
+	if (kvm->created_vcpus != atomic_read(&kvm->online_vcpus)) {
+		r = -EBUSY;
+		goto out_err;
+	}
+
+	r = kvm_lock_all_vcpus(kvm);
+	if (r)
+		goto out_err;
+
+	/*
+	 * Note the unintuitive ordering!  vcpu->mutex must be taken outside
+	 * kvm->slots_lock!
+	 */
+	mutex_lock(&kvm->slots_lock);
+	return kvm;
+
+out_err:
+	mutex_unlock(&kvm->lock);
+	return ERR_PTR(r);
+}
+
+static void tdx_release_vm_state_locks(struct kvm *kvm)
+{
+	mutex_unlock(&kvm->slots_lock);
+	kvm_unlock_all_vcpus(kvm);
+	mutex_unlock(&kvm->lock);
+}
+
+DEFINE_CLASS(tdx_vm_state_guard, tdx_vm_state_guard_t,
+	     if (!IS_ERR(_T)) tdx_release_vm_state_locks(_T),
+	     tdx_acquire_vm_state_locks(kvm), struct kvm *kvm);
+
 static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 {
 	struct kvm_tdx_init_vm __user *user_data = u64_to_user_ptr(cmd->data);
@@ -2644,6 +2684,10 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 	BUILD_BUG_ON(sizeof(*init_vm) != 256 + sizeof_field(struct kvm_tdx_init_vm, cpuid));
 	BUILD_BUG_ON(sizeof(struct td_params) != 1024);
 
+	CLASS(tdx_vm_state_guard, guard)(kvm);
+	if (IS_ERR(guard))
+		return PTR_ERR(guard);
+
 	if (kvm_tdx->state != TD_STATE_UNINITIALIZED)
 		return -EINVAL;
 
@@ -2743,7 +2787,9 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 
-	guard(mutex)(&kvm->slots_lock);
+	CLASS(tdx_vm_state_guard, guard)(kvm);
+	if (IS_ERR(guard))
+		return PTR_ERR(guard);
 
 	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
@@ -2781,8 +2827,6 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 	if (r)
 		return r;
 
-	guard(mutex)(&kvm->lock);
-
 	switch (tdx_cmd.id) {
 	case KVM_TDX_CAPABILITIES:
 		r = tdx_get_capabilities(&tdx_cmd);
@@ -3090,8 +3134,6 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
 	if (tdx->state != VCPU_TD_STATE_INITIALIZED)
 		return -EINVAL;
 
-	guard(mutex)(&kvm->slots_lock);
-
 	/* Once TD is finalized, the initial guest memory is fixed. */
 	if (kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
@@ -3147,7 +3189,8 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
 
 int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 {
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 	struct kvm_tdx_cmd cmd;
 	int r;
 
@@ -3155,12 +3198,13 @@ int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 	if (r)
 		return r;
 
+	CLASS(tdx_vm_state_guard, guard)(kvm);
+	if (IS_ERR(guard))
+		return PTR_ERR(guard);
+
 	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
 
-	if (mutex_lock_killable(&vcpu->mutex))
-		return -EINTR;
-
 	vcpu_load(vcpu);
 
 	switch (cmd.id) {
@@ -3177,7 +3221,6 @@ int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 
 	vcpu_put(vcpu);
 
-	mutex_unlock(&vcpu->mutex);
 	return r;
 }
 
-- 
2.51.0.858.gf9c4a03a3a-goog

From nobody Sat Feb 7 21:47:53 2026
Subject: [PATCH v3 25/25] KVM: TDX: Fix list_add corruption during vcpu_load()
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 16 Oct 2025 17:32:43 -0700
Message-ID: <20251017003244.186495-26-seanjc@google.com>
In-Reply-To: <20251017003244.186495-1-seanjc@google.com>
From: Yan Zhao

During vCPU creation, a vCPU may be destroyed immediately after
kvm_arch_vcpu_create() (e.g., due to a vCPU id conflict).  However, the
vcpu_load() inside kvm_arch_vcpu_create() may have associated the vCPU
with a pCPU via "list_add(&tdx->cpu_list, &per_cpu(associated_tdvcpus, cpu))"
before tdx_vcpu_free() is invoked.

Though there's no need to invoke tdh_vp_flush() on the vCPU, failing to
dissociate the vCPU from the pCPU (i.e., "list_del(&to_tdx(vcpu)->cpu_list)")
will corrupt the per-pCPU list associated_tdvcpus.  A later list_add()
during vcpu_load() would then detect the list corruption and print the
calltrace shown below.

Dissociate a vCPU from its associated pCPU in tdx_vcpu_free() for vCPUs
destroyed immediately after creation, which must be in
VCPU_TD_STATE_UNINITIALIZED state.

  kernel BUG at lib/list_debug.c:29!
  Oops: invalid opcode: 0000 [#2] SMP NOPTI
  RIP: 0010:__list_add_valid_or_report+0x82/0xd0
  Call Trace:
   tdx_vcpu_load+0xa8/0x120
   vt_vcpu_load+0x25/0x30
   kvm_arch_vcpu_load+0x81/0x300
   vcpu_load+0x55/0x90
   kvm_arch_vcpu_create+0x24f/0x330
   kvm_vm_ioctl_create_vcpu+0x1b1/0x53
   kvm_vm_ioctl+0xc2/0xa60
   __x64_sys_ioctl+0x9a/0xf0
   x64_sys_call+0x10ee/0x20d0
   do_syscall_64+0xc3/0x470
   entry_SYSCALL_64_after_hwframe+0x77/0x7f

Signed-off-by: Yan Zhao
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/tdx.c | 43 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d6541b08423f..daec88d4b88d 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -833,19 +833,52 @@ void tdx_vcpu_put(struct kvm_vcpu *vcpu)
 	tdx_prepare_switch_to_host(vcpu);
 }
 
+/*
+ * Life cycles for a TD and a vCPU:
+ * 1. KVM_CREATE_VM ioctl.
+ *    TD state is TD_STATE_UNINITIALIZED.
+ *    hkid is not assigned at this stage.
+ * 2. KVM_TDX_INIT_VM ioctl.
+ *    TD transitions to TD_STATE_INITIALIZED.
+ *    hkid is assigned after this stage.
+ * 3. KVM_CREATE_VCPU ioctl. (only when TD is TD_STATE_INITIALIZED).
+ *    3.1 tdx_vcpu_create() transitions vCPU state to VCPU_TD_STATE_UNINITIALIZED.
+ *    3.2 vcpu_load() and vcpu_put() in kvm_arch_vcpu_create().
+ *    3.3 (conditional) if any error encountered after kvm_arch_vcpu_create()
+ *        kvm_arch_vcpu_destroy() --> tdx_vcpu_free().
+ * 4. KVM_TDX_INIT_VCPU ioctl.
+ *    tdx_vcpu_init() transitions vCPU state to VCPU_TD_STATE_INITIALIZED.
+ *    vCPU control structures are allocated at this stage.
+ * 5. kvm_destroy_vm().
+ *    5.1 tdx_mmu_release_hkid(): (1) tdh_vp_flush(), disassociates all vCPUs.
+ *                                (2) puts hkid to !assigned state.
+ *    5.2 kvm_destroy_vcpus() --> tdx_vcpu_free():
+ *        transitions vCPU to VCPU_TD_STATE_UNINITIALIZED state.
+ *    5.3 tdx_vm_destroy()
+ *        transitions TD to TD_STATE_UNINITIALIZED state.
+ *
+ * tdx_vcpu_free() can be invoked only at 3.3 or 5.2.
+ * - If at 3.3, hkid is still assigned, but the vCPU must be in
+ *   VCPU_TD_STATE_UNINITIALIZED state.
+ * - if at 5.2, hkid must be !assigned and all vCPUs must be in
+ *   VCPU_TD_STATE_INITIALIZED state and have been dissociated.
+ */
 void tdx_vcpu_free(struct kvm_vcpu *vcpu)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
 	int i;
 
+	if (vcpu->cpu != -1) {
+		KVM_BUG_ON(tdx->state == VCPU_TD_STATE_INITIALIZED, vcpu->kvm);
+		tdx_disassociate_vp(vcpu);
+		return;
+	}
+
 	/*
 	 * It is not possible to reclaim pages while hkid is assigned. It might
-	 * be assigned if:
-	 * 1. the TD VM is being destroyed but freeing hkid failed, in which
-	 * case the pages are leaked
-	 * 2. TD VCPU creation failed and this on the error path, in which case
-	 * there is nothing to do anyway
+	 * be assigned if the TD VM is being destroyed but freeing hkid failed,
+	 * in which case the pages are leaked.
 	 */
 	if (is_hkid_assigned(kvm_tdx))
 		return;
-- 
2.51.0.858.gf9c4a03a3a-goog