From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:24 -0700
In-Reply-To:
<20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-2-seanjc@google.com>
Subject: [PATCH v4 01/28] KVM: Make support for kvm_arch_vcpu_async_ioctl() mandatory
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda, Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Binbin Wu, Michael Roth, Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng

Implement kvm_arch_vcpu_async_ioctl() "natively" in x86 and arm64 instead
of relying on an #ifdef'd stub, and drop HAVE_KVM_VCPU_ASYNC_IOCTL in
anticipation of using the API on x86.  Once x86 uses the API, providing a
stub for one architecture and having all other architectures opt in
requires more code than simply implementing the API in the lone holdout.
Eliminating the Kconfig will also reduce churn if the API is renamed in
the future (spoiler alert).

No functional change intended.
Acked-by: Claudio Imbrenda
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/arm64/kvm/arm.c       |  6 ++++++
 arch/loongarch/kvm/Kconfig |  1 -
 arch/mips/kvm/Kconfig      |  1 -
 arch/powerpc/kvm/Kconfig   |  1 -
 arch/riscv/kvm/Kconfig     |  1 -
 arch/s390/kvm/Kconfig      |  1 -
 arch/x86/kvm/x86.c         |  6 ++++++
 include/linux/kvm_host.h   | 10 ----------
 virt/kvm/Kconfig           |  3 ---
 9 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 870953b4a8a7..ef5bf57f79b7 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1835,6 +1835,12 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	return r;
 }
 
+long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
+			       unsigned long arg)
+{
+	return -ENOIOCTLCMD;
+}
+
 void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
 {
 
diff --git a/arch/loongarch/kvm/Kconfig b/arch/loongarch/kvm/Kconfig
index ae64bbdf83a7..ed4f724db774 100644
--- a/arch/loongarch/kvm/Kconfig
+++ b/arch/loongarch/kvm/Kconfig
@@ -25,7 +25,6 @@ config KVM
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_MSI
 	select HAVE_KVM_READONLY_MEM
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select KVM_COMMON
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	select KVM_GENERIC_HARDWARE_ENABLING
diff --git a/arch/mips/kvm/Kconfig b/arch/mips/kvm/Kconfig
index ab57221fa4dd..cc13cc35f208 100644
--- a/arch/mips/kvm/Kconfig
+++ b/arch/mips/kvm/Kconfig
@@ -22,7 +22,6 @@ config KVM
 	select EXPORT_UASM
 	select KVM_COMMON
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select KVM_MMIO
 	select KVM_GENERIC_MMU_NOTIFIER
 	select KVM_GENERIC_HARDWARE_ENABLING
diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 2f2702c867f7..c9a2d50ff1b0 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -20,7 +20,6 @@ if VIRTUALIZATION
 config KVM
 	bool
 	select KVM_COMMON
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select KVM_VFIO
 	select HAVE_KVM_IRQ_BYPASS
 
diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
index c50328212917..77379f77840a 100644
--- a/arch/riscv/kvm/Kconfig
+++ b/arch/riscv/kvm/Kconfig
@@ -23,7 +23,6 @@ config KVM
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_IRQ_ROUTING
 	select HAVE_KVM_MSI
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select HAVE_KVM_READONLY_MEM
 	select HAVE_KVM_DIRTY_RING_ACQ_REL
 	select KVM_COMMON
diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index cae908d64550..96d16028e8b7 100644
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -20,7 +20,6 @@ config KVM
 	def_tristate y
 	prompt "Kernel-based Virtual Machine (KVM) support"
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
-	select HAVE_KVM_VCPU_ASYNC_IOCTL
 	select KVM_ASYNC_PF
 	select KVM_ASYNC_PF_SYNC
 	select KVM_COMMON
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4b5d2d09634..ca5ba2caf314 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7240,6 +7240,12 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
 	return 0;
 }
 
+long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
+			       unsigned long arg)
+{
+	return -ENOIOCTLCMD;
+}
+
 int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
 	struct kvm *kvm = filp->private_data;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5bd76cf394fa..7186b2ae4b57 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2437,18 +2437,8 @@ static inline bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
 }
 #endif /* CONFIG_HAVE_KVM_NO_POLL */
 
-#ifdef CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL
 long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
 			       unsigned long arg);
-#else
-static inline long kvm_arch_vcpu_async_ioctl(struct file *filp,
-					     unsigned int ioctl,
-					     unsigned long arg)
-{
-	return -ENOIOCTLCMD;
-}
-#endif /* CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL */
-
 void kvm_arch_guest_memory_reclaimed(struct kvm *kvm);
 
 #ifdef CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 5f0015c5dd95..267c7369c765 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -78,9 +78,6 @@ config HAVE_KVM_IRQ_BYPASS
 	tristate
 	select IRQ_BYPASS_MANAGER
 
-config HAVE_KVM_VCPU_ASYNC_IOCTL
-	bool
-
 config HAVE_KVM_VCPU_RUN_PID_CHANGE
 	bool
 
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:25 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-3-seanjc@google.com>
Subject: [PATCH v4 02/28] KVM: Rename kvm_arch_vcpu_async_ioctl() to kvm_arch_vcpu_unlocked_ioctl()
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda, Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Binbin Wu, Michael Roth, Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng

Rename the "async" ioctl API to "unlocked" so that upcoming usage in x86's
TDX code doesn't result in a massive misnomer.  To avoid having to retry
SEAMCALLs, TDX needs to acquire kvm->lock *and* all vcpu->mutex locks, and
acquiring all of those locks after/inside the current vCPU's mutex is a
non-starter.  However, TDX also needs to acquire the vCPU's mutex and load
the vCPU, i.e.
the handling is very much not async to the vCPU.

No functional change intended.

Acked-by: Claudio Imbrenda
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/arm64/kvm/arm.c       | 4 ++--
 arch/loongarch/kvm/vcpu.c  | 4 ++--
 arch/mips/kvm/mips.c       | 4 ++--
 arch/powerpc/kvm/powerpc.c | 4 ++--
 arch/riscv/kvm/vcpu.c      | 4 ++--
 arch/s390/kvm/kvm-s390.c   | 4 ++--
 arch/x86/kvm/x86.c         | 4 ++--
 include/linux/kvm_host.h   | 4 ++--
 virt/kvm/kvm_main.c        | 6 +++---
 9 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index ef5bf57f79b7..cf23f6b07ec7 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1835,8 +1835,8 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	return r;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
-			       unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	return -ENOIOCTLCMD;
 }
diff --git a/arch/loongarch/kvm/vcpu.c b/arch/loongarch/kvm/vcpu.c
index 30e3b089a596..9a5844e85fd3 100644
--- a/arch/loongarch/kvm/vcpu.c
+++ b/arch/loongarch/kvm/vcpu.c
@@ -1471,8 +1471,8 @@ int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq)
 	return 0;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	void __user *argp = (void __user *)arg;
 	struct kvm_vcpu *vcpu = filp->private_data;
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index a75587018f44..b0fb92fda4d4 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -895,8 +895,8 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 	return r;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
-			       unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
 	void __user *argp = (void __user *)arg;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 2ba057171ebe..9a89a6d98f97 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -2028,8 +2028,8 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 	return -EINVAL;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
 	void __user *argp = (void __user *)arg;
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index bccb919ca615..a4bd6077eecc 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -238,8 +238,8 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
 	return VM_FAULT_SIGBUS;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
 	void __user *argp = (void __user *)arg;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 16ba04062854..8c4caa5f2fcd 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -5730,8 +5730,8 @@ static long kvm_s390_vcpu_memsida_op(struct kvm_vcpu *vcpu,
 	return r;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	struct kvm_vcpu *vcpu = filp->private_data;
 	void __user *argp = (void __user *)arg;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ca5ba2caf314..b85cb213a336 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7240,8 +7240,8 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
 	return 0;
 }
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp, unsigned int ioctl,
-			       unsigned long arg)
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
+				  unsigned long arg)
 {
 	return -ENOIOCTLCMD;
 }
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7186b2ae4b57..d93f75b05ae2 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1557,6 +1557,8 @@ long kvm_arch_dev_ioctl(struct file *filp,
 			unsigned int ioctl, unsigned long arg);
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg);
+long kvm_arch_vcpu_unlocked_ioctl(struct file *filp,
+				  unsigned int ioctl, unsigned long arg);
 vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf);
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext);
@@ -2437,8 +2439,6 @@ static inline bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
 }
 #endif /* CONFIG_HAVE_KVM_NO_POLL */
 
-long kvm_arch_vcpu_async_ioctl(struct file *filp,
-			       unsigned int ioctl, unsigned long arg);
 void kvm_arch_guest_memory_reclaimed(struct kvm *kvm);
 
 #ifdef CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4845e5739436..9eca084bdcbe 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4434,10 +4434,10 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		return r;
 
 	/*
-	 * Some architectures have vcpu ioctls that are asynchronous to vcpu
-	 * execution; mutex_lock() would break them.
+	 * Let arch code handle select vCPU ioctls without holding vcpu->mutex,
+	 * e.g. to support ioctls that can run asynchronous to vCPU execution.
	 */
-	r = kvm_arch_vcpu_async_ioctl(filp, ioctl, arg);
+	r = kvm_arch_vcpu_unlocked_ioctl(filp, ioctl, arg);
 	if (r != -ENOIOCTLCMD)
 		return r;
 
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:26 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-4-seanjc@google.com>
Subject: [PATCH v4 03/28] KVM: TDX: Drop PROVE_MMU=y sanity check on to-be-populated mappings
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda, Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Binbin Wu, Michael Roth, Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng

Drop TDX's sanity check that a mirror EPT mapping isn't zapped between
creating said mapping and doing TDH.MEM.PAGE.ADD, as the check is
simultaneously superfluous and incomplete.  Per commit 2608f1057601
("KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU"), the
justification for introducing kvm_tdp_mmu_gpa_is_mapped() was to check
that the target gfn was pre-populated, with a link that points to this
snippet:

 : > One small question:
 : >
 : > What if the memory region passed to KVM_TDX_INIT_MEM_REGION hasn't
 : > been pre-populated?  If we want to make KVM_TDX_INIT_MEM_REGION work
 : > with these regions, then we still need to do the real map.  Or we can
 : > make KVM_TDX_INIT_MEM_REGION return error when it finds the region
 : > hasn't been pre-populated?
 :
 : Return an error.  I don't love the idea of bleeding so many TDX details
 : into userspace, but I'm pretty sure that ship sailed a long, long time
 : ago.

But that justification makes little sense for the final code, as the check
on nr_premapped after TDH.MEM.PAGE.ADD will detect and return an error if
KVM attempted to zap a S-EPT entry (tdx_sept_zap_private_spte() will fail
on TDH.MEM.RANGE.BLOCK due to the lack of a valid S-EPT entry).  And as
evidenced by the "is mapped?" code being guarded with
CONFIG_KVM_PROVE_MMU=y, KVM is NOT relying on the check for general
correctness.

The sanity check is also incomplete in the sense that mmu_lock is dropped
between the check and TDH.MEM.PAGE.ADD, i.e. will only detect KVM bugs
that zap SPTEs in a very specific window (note, this also applies to the
check on nr_premapped).

Removing the sanity check will allow removing kvm_tdp_mmu_gpa_is_mapped(),
which has no business being exposed to vendor code, and more importantly
will pave the way for eliminating the "pre-map" approach entirely in favor
of doing TDH.MEM.PAGE.ADD under mmu_lock.

Reviewed-by: Ira Weiny
Reviewed-by: Kai Huang
Reviewed-by: Binbin Wu
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 326db9b9c567..4c3014befe9f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3181,20 +3181,6 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	if (ret < 0)
 		goto out;
 
-	/*
-	 * The private mem cannot be zapped after kvm_tdp_map_page()
-	 * because all paths are covered by slots_lock and the
-	 * filemap invalidate lock.  Check that they are indeed enough.
-	 */
-	if (IS_ENABLED(CONFIG_KVM_PROVE_MMU)) {
-		scoped_guard(read_lock, &kvm->mmu_lock) {
-			if (KVM_BUG_ON(!kvm_tdp_mmu_gpa_is_mapped(vcpu, gpa), kvm)) {
-				ret = -EIO;
-				goto out;
-			}
-		}
-	}
-
 	ret = 0;
 	err = tdh_mem_page_add(&kvm_tdx->td, gpa, pfn_to_page(pfn), src_page,
 			       &entry, &level_state);
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:27 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-5-seanjc@google.com>
Subject: [PATCH v4 04/28] KVM: x86/mmu: Add dedicated API to map guest_memfd pfn into TDP MMU
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
    Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
    Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
    Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
    linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
    linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Binbin Wu,
    Michael Roth, Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng

Add and use a new API for mapping a private pfn from guest_memfd into the
TDP MMU from TDX's post-populate hook instead of partially open-coding the
functionality in the TDX code. Sharing code with the pre-fault path
sounded good on paper, but it's fatally flawed: simulating a fault loses
the pfn, and calling back into gmem to re-retrieve the pfn creates locking
problems, e.g.
kvm_gmem_populate() already holds the gmem invalidation lock.

Providing a dedicated API will also allow removing several MMU exports
that ideally would not be exposed outside of the MMU, let alone to vendor
code. On that topic, opportunistically drop the kvm_mmu_load() export.
Leave kvm_tdp_mmu_gpa_is_mapped() alone for now; the entire commit that
added it will be removed in the near future.

Gate the API on CONFIG_KVM_GUEST_MEMFD=y, as private memory _must_ be
backed by guest_memfd. Add a lockdep-only assert that the incoming pfn is
indeed backed by guest_memfd, and that the gmem instance's invalidate lock
is held (which, combined with slots_lock being held, obviates the need to
check for a stale "fault").

Cc: Michael Roth
Cc: Yan Zhao
Cc: Ira Weiny
Cc: Vishal Annapurve
Cc: Rick Edgecombe
Reviewed-by: Rick Edgecombe
Reviewed-by: Kai Huang
Link: https://lore.kernel.org/all/20250709232103.zwmufocd3l7sqk7y@amd.com
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/mmu.h     |  1 +
 arch/x86/kvm/mmu/mmu.c | 81 +++++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/tdx.c | 10 ++----
 3 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index f63074048ec6..2f108e381959 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -259,6 +259,7 @@ extern bool tdp_mmu_enabled;
 
 bool kvm_tdp_mmu_gpa_is_mapped(struct kvm_vcpu *vcpu, u64 gpa);
 int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code, u8 *level);
+int kvm_tdp_mmu_map_private_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn);
 
 static inline bool kvm_memslots_have_rmaps(struct kvm *kvm)
 {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 18d69d48bc55..bad0480bdb0d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5014,6 +5014,86 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 	return min(range->size, end - range->gpa);
 }
 
+#ifdef CONFIG_KVM_GUEST_MEMFD
+static void kvm_assert_gmem_invalidate_lock_held(struct kvm_memory_slot *slot)
+{
+#ifdef CONFIG_PROVE_LOCKING
+	if (WARN_ON_ONCE(!kvm_slot_has_gmem(slot)) ||
+	    WARN_ON_ONCE(!slot->gmem.file) ||
+	    WARN_ON_ONCE(!file_count(slot->gmem.file)))
+		return;
+
+	lockdep_assert_held(&file_inode(slot->gmem.file)->i_mapping->invalidate_lock);
+#endif
+}
+
+int kvm_tdp_mmu_map_private_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
+{
+	struct kvm_page_fault fault = {
+		.addr = gfn_to_gpa(gfn),
+		.error_code = PFERR_GUEST_FINAL_MASK | PFERR_PRIVATE_ACCESS,
+		.prefetch = true,
+		.is_tdp = true,
+		.nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(vcpu->kvm),
+
+		.max_level = PG_LEVEL_4K,
+		.req_level = PG_LEVEL_4K,
+		.goal_level = PG_LEVEL_4K,
+		.is_private = true,
+
+		.gfn = gfn,
+		.slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn),
+		.pfn = pfn,
+		.map_writable = true,
+	};
+	struct kvm *kvm = vcpu->kvm;
+	int r;
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	/*
+	 * Mapping a pre-determined private pfn is intended only for use when
+	 * populating a guest_memfd instance.  Assert that the slot is backed
+	 * by guest_memfd and that the gmem instance's invalidate_lock is held.
+	 */
+	kvm_assert_gmem_invalidate_lock_held(fault.slot);
+
+	if (KVM_BUG_ON(!tdp_mmu_enabled, kvm))
+		return -EIO;
+
+	if (kvm_gfn_is_write_tracked(kvm, fault.slot, fault.gfn))
+		return -EPERM;
+
+	r = kvm_mmu_reload(vcpu);
+	if (r)
+		return r;
+
+	r = mmu_topup_memory_caches(vcpu, false);
+	if (r)
+		return r;
+
+	do {
+		if (signal_pending(current))
+			return -EINTR;
+
+		if (kvm_test_request(KVM_REQ_VM_DEAD, vcpu))
+			return -EIO;
+
+		cond_resched();
+
+		guard(read_lock)(&kvm->mmu_lock);
+
+		r = kvm_tdp_mmu_map(vcpu, &fault);
+	} while (r == RET_PF_RETRY);
+
+	if (r != RET_PF_FIXED)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_tdp_mmu_map_private_pfn);
+#endif
+
 static void nonpaging_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = nonpaging_page_fault;
@@ -5997,7 +6077,6 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 out:
 	return r;
 }
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_load);
 
 void kvm_mmu_unload(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 4c3014befe9f..29f344af4cc2 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3157,15 +3157,12 @@ struct tdx_gmem_post_populate_arg {
 static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 				  void __user *src, int order, void *_arg)
 {
-	u64 error_code = PFERR_GUEST_FINAL_MASK | PFERR_PRIVATE_ACCESS;
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 	struct tdx_gmem_post_populate_arg *arg = _arg;
-	struct kvm_vcpu *vcpu = arg->vcpu;
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	u64 err, entry, level_state;
 	gpa_t gpa = gfn_to_gpa(gfn);
-	u8 level = PG_LEVEL_4K;
 	struct page *src_page;
 	int ret, i;
-	u64 err, entry, level_state;
 
 	/*
 	 * Get the source page if it has been faulted in.  Return failure if the
@@ -3177,7 +3174,7 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	if (ret != 1)
 		return -ENOMEM;
 
-	ret = kvm_tdp_map_page(vcpu, gpa, error_code, &level);
+	ret = kvm_tdp_mmu_map_private_pfn(arg->vcpu, gfn, pfn);
 	if (ret < 0)
 		goto out;
 
@@ -3240,7 +3237,6 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
 	    !vt_is_tdx_private_gpa(kvm, region.gpa + (region.nr_pages << PAGE_SHIFT) - 1))
 		return -EINVAL;
 
-	kvm_mmu_reload(vcpu);
 	ret = 0;
 	while (region.nr_pages) {
 		if (signal_pending(current)) {
-- 
2.51.1.930.gacf6e81ea2-goog
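The new kvm_tdp_mmu_map_private_pfn() above loops while the TDP MMU reports a transient RET_PF_RETRY and collapses any result other than a fixed mapping into -EIO. The shape of that control flow can be sketched as a stand-alone C program; all names here are hypothetical stand-ins, with no KVM internals:

```c
#include <assert.h>

/* Hypothetical stand-ins for KVM's RET_PF_* fault-handler return codes. */
enum fault_ret { RET_PF_RETRY, RET_PF_FIXED, RET_PF_ERROR };

static int attempts;

/* Simulated map attempt: transiently fails twice, then succeeds. */
static enum fault_ret try_map(void)
{
	return (++attempts < 3) ? RET_PF_RETRY : RET_PF_FIXED;
}

/*
 * Mirrors the patch's control flow: retry on transient failure, and
 * treat anything other than a fixed mapping as an I/O error.  The real
 * function additionally checks for pending signals and a dead VM on
 * each iteration, and takes mmu_lock for read around the map attempt.
 */
static int map_private_pfn(void)
{
	enum fault_ret r;

	do {
		r = try_map();
	} while (r == RET_PF_RETRY);

	return r == RET_PF_FIXED ? 0 : -5; /* -EIO */
}
```

The loop terminates as soon as the simulated fault handler stops returning the retry code.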
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:28 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-6-seanjc@google.com>
Subject: [PATCH v4 05/28] KVM: x86/mmu: WARN if KVM attempts to map into an invalid TDP MMU root
From: Sean Christopherson

When mapping into the TDP MMU, WARN (if KVM_PROVE_MMU=y) if the root is
invalid, e.g.
if KVM is attempting to insert a mapping without checking that the
information and MMU context are fresh.

Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/mmu/tdp_mmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index c5734ca5c17d..440fd8f80397 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1273,6 +1273,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	struct kvm_mmu_page *sp;
 	int ret = RET_PF_RETRY;
 
+	KVM_MMU_WARN_ON(!root || root->role.invalid);
+
 	kvm_mmu_hugepage_adjust(vcpu, fault);
 
 	trace_kvm_mmu_spte_requested(fault);
-- 
2.51.1.930.gacf6e81ea2-goog
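KVM_MMU_WARN_ON() fires only when KVM_PROVE_MMU=y, so the new check is free in production builds. A minimal user-space sketch of that compile-time-gated assertion pattern; PROVE_MMU, warn_count, and the struct are hypothetical stand-ins, not KVM code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_KVM_PROVE_MMU: 1 in debug builds. */
#define PROVE_MMU 1

static int warn_count;

/*
 * Debug-only warning in the spirit of KVM_MMU_WARN_ON(): records a
 * warning only when the prove-MMU knob is on; when PROVE_MMU is 0 the
 * compiler can discard the whole statement.
 */
#define MMU_WARN_ON(cond) do { if (PROVE_MMU && (cond)) warn_count++; } while (0)

struct mmu_root { bool invalid; };

static void map_into_root(const struct mmu_root *root)
{
	/* Mirrors the patch: warn if the root is missing or already invalidated. */
	MMU_WARN_ON(!root || root->invalid);
}
```

Mapping into a valid root leaves warn_count untouched; a NULL or invalidated root bumps it.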
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:29 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-7-seanjc@google.com>
Subject: [PATCH v4 06/28] Revert "KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU"
From: Sean Christopherson

Remove the helper and exports that were added to allow TDX code to reuse
kvm_tdp_map_page() for its gmem post-populate flow now that a dedicated
TDP MMU API is provided to install a mapping given a gfn+pfn pair.

This reverts commit 2608f105760115e94a03efd9f12f8fbfd1f9af4b.

Reviewed-by: Rick Edgecombe
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/mmu.h         |  2 --
 arch/x86/kvm/mmu/mmu.c     |  4 ++--
 arch/x86/kvm/mmu/tdp_mmu.c | 37 +++++--------------------------------
 3 files changed, 7 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 2f108e381959..9e5045a60d8b 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -257,8 +257,6 @@ extern bool tdp_mmu_enabled;
 #define tdp_mmu_enabled false
 #endif
 
-bool kvm_tdp_mmu_gpa_is_mapped(struct kvm_vcpu *vcpu, u64 gpa);
-int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code, u8 *level);
 int kvm_tdp_mmu_map_private_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn);
 
 static inline bool kvm_memslots_have_rmaps(struct kvm *kvm)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index bad0480bdb0d..3a5104e4127a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4924,7 +4924,8 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	return direct_page_fault(vcpu, fault);
 }
 
-int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code, u8 *level)
+static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
+			    u8 *level)
 {
 	int r;
 
@@ -4966,7 +4967,6 @@ int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code, u8 *level
 		return -EIO;
 	}
 }
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_tdp_map_page);
 
 long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 				    struct kvm_pre_fault_memory *range)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 440fd8f80397..e735d2f8367b 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1941,13 +1941,16 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
  *
  * Must be called between kvm_tdp_mmu_walk_lockless_{begin,end}.
  */
-static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
-				  struct kvm_mmu_page *root)
+int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
+			 int *root_level)
 {
+	struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa);
 	struct tdp_iter iter;
 	gfn_t gfn = addr >> PAGE_SHIFT;
 	int leaf = -1;
 
+	*root_level = vcpu->arch.mmu->root_role.level;
+
 	for_each_tdp_pte(iter, vcpu->kvm, root, gfn, gfn + 1) {
 		leaf = iter.level;
 		sptes[leaf] = iter.old_spte;
@@ -1956,36 +1959,6 @@ static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 	return leaf;
 }
 
-int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
-			 int *root_level)
-{
-	struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa);
-	*root_level = vcpu->arch.mmu->root_role.level;
-
-	return __kvm_tdp_mmu_get_walk(vcpu, addr, sptes, root);
-}
-
-bool kvm_tdp_mmu_gpa_is_mapped(struct kvm_vcpu *vcpu, u64 gpa)
-{
-	struct kvm *kvm = vcpu->kvm;
-	bool is_direct = kvm_is_addr_direct(kvm, gpa);
-	hpa_t root = is_direct ? vcpu->arch.mmu->root.hpa :
-				 vcpu->arch.mmu->mirror_root_hpa;
-	u64 sptes[PT64_ROOT_MAX_LEVEL + 1], spte;
-	int leaf;
-
-	lockdep_assert_held(&kvm->mmu_lock);
-	rcu_read_lock();
-	leaf = __kvm_tdp_mmu_get_walk(vcpu, gpa, sptes, root_to_sp(root));
-	rcu_read_unlock();
-	if (leaf < 0)
-		return false;
-
-	spte = sptes[leaf];
-	return is_shadow_present_pte(spte) && is_last_spte(spte, leaf);
-}
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_tdp_mmu_gpa_is_mapped);
-
 /*
  * Returns the last level spte pointer of the shadow page walk for the given
  * gpa, and sets *spte to the spte value. This spte may be non-preset. If no
-- 
2.51.1.930.gacf6e81ea2-goog
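The surviving kvm_tdp_mmu_get_walk() records the entry observed at each level of the page-table walk and returns the deepest level reached. The same shape can be sketched over a toy two-level radix table; everything here (LEVELS, FANOUT, struct node, the "present" low bit) is a hypothetical stand-in, not KVM's SPTE layout:

```c
#include <assert.h>
#include <stdint.h>

/* Toy two-level radix table: 4 entries per level, hypothetical layout. */
#define LEVELS 2
#define FANOUT 4

struct node {
	uint64_t entry[FANOUT];     /* low bit set marks "present" */
	struct node *child[FANOUT];
};

/*
 * Walk from the root toward the leaf for @idx[], recording the entry
 * seen at each level, the way the TDP MMU walk records SPTEs.  Returns
 * the deepest level reached (LEVELS - 1 == leaf), or -1 if nothing was
 * present.
 */
static int get_walk(const struct node *root, const int idx[LEVELS],
		    uint64_t entries[LEVELS])
{
	const struct node *n = root;
	int leaf = -1;

	for (int level = 0; n && level < LEVELS; level++) {
		uint64_t e = n->entry[idx[level]];

		if (!(e & 1))
			break;              /* non-present: stop the walk */
		entries[level] = e;
		leaf = level;
		n = n->child[idx[level]];
	}
	return leaf;
}
```

A caller supplies an index per level and gets back the per-level entries plus the leaf level, mirroring how the real helper fills the sptes[] array.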
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:30 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-8-seanjc@google.com>
Subject: [PATCH v4 07/28] KVM: x86/mmu: Rename kvm_tdp_map_page() to kvm_tdp_page_prefault()
From: Sean Christopherson

Rename kvm_tdp_map_page() to kvm_tdp_page_prefault() now that it's used
only by kvm_arch_vcpu_pre_fault_memory(). No functional change intended.

Reviewed-by: Rick Edgecombe
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/mmu/mmu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3a5104e4127a..10e579f8fa8e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4924,8 +4924,8 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	return direct_page_fault(vcpu, fault);
 }
 
-static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
-			    u8 *level)
+static int kvm_tdp_page_prefault(struct kvm_vcpu *vcpu, gpa_t gpa,
+				 u64 error_code, u8 *level)
 {
 	int r;
 
@@ -5002,7 +5002,7 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 	 * Shadow paging uses GVA for kvm page fault, so restrict to
 	 * two-dimensional paging.
 	 */
-	r = kvm_tdp_map_page(vcpu, range->gpa | direct_bits, error_code, &level);
+	r = kvm_tdp_page_prefault(vcpu, range->gpa | direct_bits, error_code, &level);
 	if (r < 0)
 		return r;
 
-- 
2.51.1.930.gacf6e81ea2-goog
Date: Thu, 30 Oct 2025 13:09:31 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-9-seanjc@google.com>
Subject: [PATCH v4 08/28] KVM: TDX: Drop superfluous page pinning in S-EPT management
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
	Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	kvm@vger.kernel.org, loongarch@lists.linux.dev,
	linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
	x86@kernel.org, linux-coco@lists.linux.dev,
	linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Binbin Wu,
	Michael Roth, Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng
Content-Type: text/plain; charset="utf-8"

From: Yan Zhao

Don't explicitly pin pages when mapping them into the S-EPT: guest_memfd
doesn't support page migration in any capacity, i.e. there are no migrate
callbacks because guest_memfd pages *can't* be migrated.  See the WARN in
kvm_gmem_migrate_folio().

Eliminating TDX's explicit pinning will also enable guest_memfd to support
in-place conversion between shared and private memory[1][2].
Because KVM cannot distinguish between speculative/transient refcounts and
the intentional refcount TDX holds on private pages[3], failing to release
the private page refcount in TDX could cause guest_memfd to wait
indefinitely for the refcount to drop before splitting the page.

Under normal conditions, not holding an extra page refcount in TDX is safe
because guest_memfd ensures pages are retained until its invalidation
notification to the KVM MMU is completed.  However, if there are bugs in
KVM or the TDX module, not holding an extra refcount when a page is mapped
in the S-EPT could result in a page being released from guest_memfd while
still mapped in the S-EPT.  But doing work to make a fatal error slightly
less fatal is a net negative when that extra work adds complexity and
confusion.

Several approaches were considered to address the refcount issue,
including:

- Attempting to modify the KVM unmap operation to return a failure, which
  was deemed too complex and potentially incorrect[4].
- Increasing the folio reference count only upon S-EPT zapping failure[5].
- Using page flags or page_ext to indicate a page is still used by TDX[6],
  which does not work for HVO (HugeTLB Vmemmap Optimization).
- Setting the HWPOISON bit or leveraging folio_set_hugetlb_hwpoison()[7].

Due to the complexity or inappropriateness of these approaches, and the
fact that an S-EPT zapping failure is currently only possible when there
are bugs in KVM or the TDX module, which is very rare in a production
kernel, the straightforward approach of simply not holding the page
reference count in TDX was chosen[8].

When S-EPT zapping errors occur, KVM_BUG_ON() is invoked to kick all vCPUs
out of the guest and mark the VM as dead.  Although there is a potential
window in which a private page mapped in the S-EPT could be reallocated
and used outside the VM, the loud warning from KVM_BUG_ON() should provide
sufficient debug information.  To be robust against bugs, the user can
enable panic_on_warn as normal.
Link: https://lore.kernel.org/all/cover.1747264138.git.ackerleytng@google.com [1]
Link: https://youtu.be/UnBKahkAon4 [2]
Link: https://lore.kernel.org/all/CAGtprH_ypohFy9TOJ8Emm_roT4XbQUtLKZNFcM6Fr+fhTFkE0Q@mail.gmail.com [3]
Link: https://lore.kernel.org/all/aEEEJbTzlncbRaRA@yzhao56-desk.sh.intel.com [4]
Link: https://lore.kernel.org/all/aE%2Fq9VKkmaCcuwpU@yzhao56-desk.sh.intel.com [5]
Link: https://lore.kernel.org/all/aFkeBtuNBN1RrDAJ@yzhao56-desk.sh.intel.com [6]
Link: https://lore.kernel.org/all/diqzy0tikran.fsf@ackerleytng-ctop.c.googlers.com [7]
Link: https://lore.kernel.org/all/53ea5239f8ef9d8df9af593647243c10435fd219.camel@intel.com [8]
Suggested-by: Vishal Annapurve
Suggested-by: Ackerley Tng
Suggested-by: Rick Edgecombe
Signed-off-by: Yan Zhao
Reviewed-by: Ira Weiny
Reviewed-by: Kai Huang
[sean: extract out of hugepage series, massage changelog accordingly]
Reviewed-by: Binbin Wu
Reviewed-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 28 ++++------------------------
 1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 29f344af4cc2..c3bae6b96dc4 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1583,29 +1583,22 @@ void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa);
 }
 
-static void tdx_unpin(struct kvm *kvm, struct page *page)
-{
-	put_page(page);
-}
-
 static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
-			    enum pg_level level, struct page *page)
+			    enum pg_level level, kvm_pfn_t pfn)
 {
 	int tdx_level = pg_level_to_tdx_sept_level(level);
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	struct page *page = pfn_to_page(pfn);
 	gpa_t gpa = gfn_to_gpa(gfn);
 	u64 entry, level_state;
 	u64 err;
 
 	err = tdh_mem_page_aug(&kvm_tdx->td, gpa, tdx_level, page, &entry, &level_state);
-
-	if (unlikely(tdx_operand_busy(err))) {
-		tdx_unpin(kvm, page);
+	if (unlikely(tdx_operand_busy(err)))
 		return -EBUSY;
-	}
 
 	if (KVM_BUG_ON(err, kvm)) {
 		pr_tdx_error_2(TDH_MEM_PAGE_AUG, err, entry, level_state);
-		tdx_unpin(kvm, page);
 		return -EIO;
 	}
 
@@ -1639,29 +1632,18 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 				     enum pg_level level, kvm_pfn_t pfn)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	struct page *page = pfn_to_page(pfn);
 
 	/* TODO: handle large pages. */
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
 		return -EINVAL;
 
-	/*
-	 * Because guest_memfd doesn't support page migration with
-	 * a_ops->migrate_folio (yet), no callback is triggered for KVM on page
-	 * migration. Until guest_memfd supports page migration, prevent page
-	 * migration.
-	 * TODO: Once guest_memfd introduces callback on page migration,
-	 * implement it and remove get_page/put_page().
-	 */
-	get_page(page);
-
 	/*
 	 * Read 'pre_fault_allowed' before 'kvm_tdx->state'; see matching
 	 * barrier in tdx_td_finalize().
 	 */
 	smp_rmb();
 	if (likely(kvm_tdx->state == TD_STATE_RUNNABLE))
-		return tdx_mem_page_aug(kvm, gfn, level, page);
+		return tdx_mem_page_aug(kvm, gfn, level, pfn);
 
 	return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
 }
@@ -1712,7 +1694,6 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
 		return -EIO;
 	}
 	tdx_quirk_reset_page(page);
-	tdx_unpin(kvm, page);
 	return 0;
 }
 
@@ -1792,7 +1773,6 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level) &&
 	    !KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm)) {
 		atomic64_dec(&kvm_tdx->nr_premapped);
-		tdx_unpin(kvm, page);
 		return 0;
 	}
 
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Date: Thu, 30 Oct 2025 13:09:32 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-10-seanjc@google.com>
Subject: [PATCH v4 09/28] KVM: TDX: Return -EIO, not -EINVAL, on a KVM_BUG_ON() condition
From: Sean Christopherson
Content-Type: text/plain; charset="utf-8"

Return -EIO when a KVM_BUG_ON() is tripped, as KVM's ABI is to return
-EIO when a VM has been killed due to a KVM bug, not -EINVAL.  Note, many
(all?) of the affected paths never propagate the error code to userspace,
i.e. this is about internal consistency more than anything else.

Reviewed-by: Rick Edgecombe
Reviewed-by: Ira Weiny
Reviewed-by: Binbin Wu
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index c3bae6b96dc4..c242d73b6a7b 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1621,7 +1621,7 @@ static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 
 	if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm))
-		return -EINVAL;
+		return -EIO;
 
 	/* nr_premapped will be decreased when tdh_mem_page_add() is called. */
 	atomic64_inc(&kvm_tdx->nr_premapped);
@@ -1635,7 +1635,7 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 
 	/* TODO: handle large pages. */
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return -EINVAL;
+		return -EIO;
 
 	/*
 	 * Read 'pre_fault_allowed' before 'kvm_tdx->state'; see matching
@@ -1658,10 +1658,10 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
 
 	/* TODO: handle large pages. */
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return -EINVAL;
+		return -EIO;
 
 	if (KVM_BUG_ON(!is_hkid_assigned(kvm_tdx), kvm))
-		return -EINVAL;
+		return -EIO;
 
 	/*
 	 * When zapping private page, write lock is held. So no race condition
@@ -1846,7 +1846,7 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 	 * and slot move/deletion.
 	 */
 	if (KVM_BUG_ON(is_hkid_assigned(kvm_tdx), kvm))
-		return -EINVAL;
+		return -EIO;
 
 	/*
 	 * The HKID assigned to this TD was already freed and cache was
@@ -1867,7 +1867,7 @@ static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	 * there can't be anything populated in the private EPT.
 	 */
 	if (KVM_BUG_ON(!is_hkid_assigned(to_kvm_tdx(kvm)), kvm))
-		return -EINVAL;
+		return -EIO;
 
 	ret = tdx_sept_zap_private_spte(kvm, gfn, level, page);
 	if (ret <= 0)
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Date: Thu, 30 Oct 2025 13:09:33 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-11-seanjc@google.com>
Subject: [PATCH v4 10/28] KVM: TDX: Fold tdx_sept_drop_private_spte() into tdx_sept_remove_private_spte()
From: Sean Christopherson
Content-Type: text/plain; charset="utf-8"

Fold tdx_sept_drop_private_spte() into tdx_sept_remove_private_spte() as
a step towards having "remove" be the one and only function that deals
with removing/zapping/dropping a SPTE, e.g. to avoid having to
differentiate between "zap", "drop", and "remove".

Eliminating the "drop" helper also gets rid of what is effectively dead
code due to redundant checks, e.g. on an HKID being assigned.

No functional change intended.

Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 90 +++++++++++++++++++-----------------------
 1 file changed, 40 insertions(+), 50 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index c242d73b6a7b..abea9b3d08cf 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1648,55 +1648,6 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 	return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
 }
 
-static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
-				      enum pg_level level, struct page *page)
-{
-	int tdx_level = pg_level_to_tdx_sept_level(level);
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	gpa_t gpa = gfn_to_gpa(gfn);
-	u64 err, entry, level_state;
-
-	/* TODO: handle large pages. */
-	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return -EIO;
-
-	if (KVM_BUG_ON(!is_hkid_assigned(kvm_tdx), kvm))
-		return -EIO;
-
-	/*
-	 * When zapping private page, write lock is held. So no race condition
-	 * with other vcpu sept operation.
-	 * Race with TDH.VP.ENTER due to (0-step mitigation) and Guest TDCALLs.
-	 */
-	err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
-				  &level_state);
-
-	if (unlikely(tdx_operand_busy(err))) {
-		/*
-		 * The second retry is expected to succeed after kicking off all
-		 * other vCPUs and prevent them from invoking TDH.VP.ENTER.
-		 */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
-					  &level_state);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error_2(TDH_MEM_PAGE_REMOVE, err, entry, level_state);
-		return -EIO;
-	}
-
-	err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, page);
-
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
-		return -EIO;
-	}
-	tdx_quirk_reset_page(page);
-	return 0;
-}
-
 static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
 				     enum pg_level level, void *private_spt)
 {
@@ -1858,7 +1809,11 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 					enum pg_level level, kvm_pfn_t pfn)
 {
+	int tdx_level = pg_level_to_tdx_sept_level(level);
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 	struct page *page = pfn_to_page(pfn);
+	gpa_t gpa = gfn_to_gpa(gfn);
+	u64 err, entry, level_state;
 	int ret;
 
 	/*
@@ -1869,6 +1824,10 @@ static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (KVM_BUG_ON(!is_hkid_assigned(to_kvm_tdx(kvm)), kvm))
 		return -EIO;
 
+	/* TODO: handle large pages. */
+	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
+		return -EIO;
+
 	ret = tdx_sept_zap_private_spte(kvm, gfn, level, page);
 	if (ret <= 0)
 		return ret;
@@ -1879,7 +1838,38 @@ static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	 */
 	tdx_track(kvm);
 
-	return tdx_sept_drop_private_spte(kvm, gfn, level, page);
+	/*
+	 * When zapping private page, write lock is held. So no race condition
+	 * with other vcpu sept operation.
+	 * Race with TDH.VP.ENTER due to (0-step mitigation) and Guest TDCALLs.
+	 */
+	err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
+				  &level_state);
+
+	if (unlikely(tdx_operand_busy(err))) {
+		/*
+		 * The second retry is expected to succeed after kicking off all
+		 * other vCPUs and prevent them from invoking TDH.VP.ENTER.
+		 */
+		tdx_no_vcpus_enter_start(kvm);
+		err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
+					  &level_state);
+		tdx_no_vcpus_enter_stop(kvm);
+	}
+
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error_2(TDH_MEM_PAGE_REMOVE, err, entry, level_state);
+		return -EIO;
+	}
+
+	err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, page);
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
+		return -EIO;
+	}
+
+	tdx_quirk_reset_page(page);
+	return 0;
 }
 
 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Date: Thu, 30 Oct 2025 13:09:34 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-12-seanjc@google.com>
Subject: [PATCH v4 11/28] KVM: x86/mmu: Drop the return code from kvm_x86_ops.remove_external_spte()
From: Sean Christopherson
Content-Type: text/plain; charset="utf-8"

Drop the return code from kvm_x86_ops.remove_external_spte(), a.k.a.
tdx_sept_remove_private_spte(), as the lone caller simply does a
KVM_BUG_ON() on failure, and that KVM_BUG_ON() is redundant since all
error paths in TDX also do a KVM_BUG_ON().

Opportunistically pass the SPTE instead of the pfn, as the API is clearly
about removing an SPTE.

Suggested-by: Rick Edgecombe
Reviewed-by: Binbin Wu
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/include/asm/kvm_host.h |  4 ++--
 arch/x86/kvm/mmu/tdp_mmu.c      |  8 ++------
 arch/x86/kvm/vmx/tdx.c          | 17 ++++++++---------
 3 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 48598d017d6f..b5867f8fe6ce 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1855,8 +1855,8 @@ struct kvm_x86_ops {
 				     void *external_spt);
 
 	/* Update external page table from spte getting removed, and flush TLB. */
-	int (*remove_external_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
-				    kvm_pfn_t pfn_for_gfn);
+	void (*remove_external_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
+				     u64 mirror_spte);
 
 	bool (*has_wbinvd_exit)(void);
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e735d2f8367b..e1a96e9ea1bb 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -362,9 +362,6 @@ static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 static void remove_external_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
 				 int level)
 {
-	kvm_pfn_t old_pfn = spte_to_pfn(old_spte);
-	int ret;
-
 	/*
 	 * External (TDX) SPTEs are limited to PG_LEVEL_4K, and external
 	 * PTs are removed in a special order, involving free_external_spt().
@@ -377,9 +374,8 @@ static void remove_external_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
 
 	/* Zapping leaf spte is allowed only when write lock is held. */
 	lockdep_assert_held_write(&kvm->mmu_lock);
-	/* Because write lock is held, operation should success. */
-	ret = kvm_x86_call(remove_external_spte)(kvm, gfn, level, old_pfn);
-	KVM_BUG_ON(ret, kvm);
+
+	kvm_x86_call(remove_external_spte)(kvm, gfn, level, old_spte);
 }
 
 /**
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index abea9b3d08cf..7ab67ad83677 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1806,12 +1806,12 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 	return tdx_reclaim_page(virt_to_page(private_spt));
 }
 
-static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
-					enum pg_level level, kvm_pfn_t pfn)
+static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
+					 enum pg_level level, u64 mirror_spte)
 {
+	struct page *page = pfn_to_page(spte_to_pfn(mirror_spte));
 	int tdx_level = pg_level_to_tdx_sept_level(level);
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	struct page *page = pfn_to_page(pfn);
 	gpa_t gpa = gfn_to_gpa(gfn);
 	u64 err, entry, level_state;
 	int ret;
@@ -1822,15 +1822,15 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	 * there can't be anything populated in the private EPT.
 	 */
 	if (KVM_BUG_ON(!is_hkid_assigned(to_kvm_tdx(kvm)), kvm))
-		return -EIO;
+		return;
 
 	/* TODO: handle large pages. */
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return -EIO;
+		return;
 
 	ret = tdx_sept_zap_private_spte(kvm, gfn, level, page);
 	if (ret <= 0)
-		return ret;
+		return;
 
 	/*
 	 * TDX requires TLB tracking before dropping private page.  Do
@@ -1859,17 +1859,16 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 
 	if (KVM_BUG_ON(err, kvm)) {
 		pr_tdx_error_2(TDH_MEM_PAGE_REMOVE, err, entry, level_state);
-		return -EIO;
+		return;
 	}
 
 	err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, page);
 	if (KVM_BUG_ON(err, kvm)) {
 		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err);
-		return -EIO;
+		return;
 	}
 
 	tdx_quirk_reset_page(page);
-	return 0;
 }
 
 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
-- 
2.51.1.930.gacf6e81ea2-goog
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:35 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-13-seanjc@google.com>
Subject: [PATCH v4 12/28] KVM: TDX: WARN if mirror SPTE doesn't have full RWX when creating S-EPT mapping
From: Sean Christopherson

Pass in the mirror_spte to kvm_x86_ops.set_external_spte() to provide
symmetry with .remove_external_spte(), and assert in TDX that the mirror
SPTE is shadow-present with full RWX permissions (the TDX-Module doesn't
allow the hypervisor to control protections).

Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/mmu/tdp_mmu.c      | 3 +--
 arch/x86/kvm/vmx/tdx.c          | 6 +++++-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b5867f8fe6ce..87a5f5100b1d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1848,7 +1848,7 @@ struct kvm_x86_ops {
 					  void *external_spt);
 	/* Update the external page table from spte getting set. */
 	int (*set_external_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
-				 kvm_pfn_t pfn_for_gfn);
+				 u64 mirror_spte);
 
 	/* Update external page tables for page table about to be freed. */
 	int (*free_external_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e1a96e9ea1bb..9c26038f6b77 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -515,7 +515,6 @@ static int __must_check set_external_spte_present(struct kvm *kvm, tdp_ptep_t sp
 	bool was_present = is_shadow_present_pte(old_spte);
 	bool is_present = is_shadow_present_pte(new_spte);
 	bool is_leaf = is_present && is_last_spte(new_spte, level);
-	kvm_pfn_t new_pfn = spte_to_pfn(new_spte);
 	int ret = 0;
 
 	KVM_BUG_ON(was_present, kvm);
@@ -534,7 +533,7 @@ static int __must_check set_external_spte_present(struct kvm *kvm, tdp_ptep_t sp
 	 * external page table, or leaf.
 	 */
 	if (is_leaf) {
-		ret = kvm_x86_call(set_external_spte)(kvm, gfn, level, new_pfn);
+		ret = kvm_x86_call(set_external_spte)(kvm, gfn, level, new_spte);
 	} else {
 		void *external_spt = get_external_spt(gfn, new_spte, level);
 
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 7ab67ad83677..658e0407eb21 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1629,14 +1629,18 @@ static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
 }
 
 static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
-				     enum pg_level level, kvm_pfn_t pfn)
+				     enum pg_level level, u64 mirror_spte)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	kvm_pfn_t pfn = spte_to_pfn(mirror_spte);
 
 	/* TODO: handle large pages. */
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
 		return -EIO;
 
+	WARN_ON_ONCE(!is_shadow_present_pte(mirror_spte) ||
+		     (mirror_spte & VMX_EPT_RWX_MASK) != VMX_EPT_RWX_MASK);
+
 	/*
 	 * Read 'pre_fault_allowed' before 'kvm_tdx->state'; see matching
 	 * barrier in tdx_td_finalize().
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:36 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-14-seanjc@google.com>
Subject: [PATCH v4 13/28] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte()
From: Sean Christopherson

Return -EIO immediately from tdx_sept_zap_private_spte() if the number of
to-be-added pages underflows, so that the following "KVM_BUG_ON(err, kvm)"
isn't also triggered.  Isolating the check from the "is premap error"
if-statement will also allow adding a lockdep assertion that premap errors
are encountered if and only if slots_lock is held.
Reviewed-by: Rick Edgecombe
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 658e0407eb21..db64a9e8c6a5 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1725,8 +1725,10 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
 		tdx_no_vcpus_enter_stop(kvm);
 	}
-	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level) &&
-	    !KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm)) {
+	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level)) {
+		if (KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm))
+			return -EIO;
+
 		atomic64_dec(&kvm_tdx->nr_premapped);
 		return 0;
 	}
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:37 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-15-seanjc@google.com>
Subject: [PATCH v4 14/28] KVM: TDX: Use atomic64_dec_return() instead of a poor equivalent
From: Sean Christopherson

Use atomic64_dec_return() when decrementing the number of "pre-mapped"
S-EPT pages to ensure that the count can't go negative without KVM
noticing.  In theory, checking for '0' and then decrementing in a separate
operation could miss a 0=>-1 transition.  In practice, such a condition is
impossible because nr_premapped is protected by slots_lock, i.e. doesn't
actually need to be an atomic (that wart will be addressed shortly).

Don't bother trying to keep the count non-negative, as the KVM_BUG_ON()
ensures the VM is dead, i.e. there's no point in trying to limp along.
Reviewed-by: Rick Edgecombe
Reviewed-by: Ira Weiny
Reviewed-by: Binbin Wu
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index db64a9e8c6a5..99db19e02cf1 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1726,10 +1726,9 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 		tdx_no_vcpus_enter_stop(kvm);
 	}
 	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level)) {
-		if (KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm))
+		if (KVM_BUG_ON(atomic64_dec_return(&kvm_tdx->nr_premapped) < 0, kvm))
 			return -EIO;
 
-		atomic64_dec(&kvm_tdx->nr_premapped);
 		return 0;
 	}
 
@@ -3161,8 +3160,7 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 		goto out;
 	}
 
-	if (!KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm))
-		atomic64_dec(&kvm_tdx->nr_premapped);
+	KVM_BUG_ON(atomic64_dec_return(&kvm_tdx->nr_premapped) < 0, kvm);
 
 	if (arg->flags & KVM_TDX_MEASURE_MEMORY_REGION) {
 		for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:38 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-16-seanjc@google.com>
Subject: [PATCH v4 15/28] KVM: TDX: Fold tdx_mem_page_record_premap_cnt() into its sole caller
From: Sean Christopherson

Fold tdx_mem_page_record_premap_cnt() into tdx_sept_set_private_spte() as
providing a one-off helper for effectively three lines of code is at best
a wash, and splitting the code makes the comment for smp_rmb() _extremely_
confusing as the comment talks about reading kvm->arch.pre_fault_allowed
before kvm_tdx->state, but the immediately visible code does the exact
opposite.

Opportunistically rewrite the comments to more explicitly explain who is
checking what, as well as _why_ the ordering matters.

No functional change intended.

Reviewed-by: Rick Edgecombe
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 49 ++++++++++++++++++------------------------
 1 file changed, 21 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 99db19e02cf1..a30471c972ba 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1605,29 +1605,6 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
-/*
- * KVM_TDX_INIT_MEM_REGION calls kvm_gmem_populate() to map guest pages; the
- * callback tdx_gmem_post_populate() then maps pages into private memory.
- * through the a seamcall TDH.MEM.PAGE.ADD().  The SEAMCALL also requires the
- * private EPT structures for the page to have been built before, which is
- * done via kvm_tdp_map_page().  nr_premapped counts the number of pages that
- * were added to the EPT structures but not added with TDH.MEM.PAGE.ADD().
- * The counter has to be zero on KVM_TDX_FINALIZE_VM, to ensure that there
- * are no half-initialized shared EPT pages.
- */
-static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
-					  enum pg_level level, kvm_pfn_t pfn)
-{
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-
-	if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm))
-		return -EIO;
-
-	/* nr_premapped will be decreased when tdh_mem_page_add() is called. */
-	atomic64_inc(&kvm_tdx->nr_premapped);
-	return 0;
-}
-
 static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 				     enum pg_level level, u64 mirror_spte)
 {
@@ -1642,14 +1619,30 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 	    (mirror_spte & VMX_EPT_RWX_MASK) != VMX_EPT_RWX_MASK);
 
 	/*
-	 * Read 'pre_fault_allowed' before 'kvm_tdx->state'; see matching
-	 * barrier in tdx_td_finalize().
+	 * Ensure pre_fault_allowed is read by kvm_arch_vcpu_pre_fault_memory()
+	 * before kvm_tdx->state.  Userspace must not be allowed to pre-fault
+	 * arbitrary memory until the initial memory image is finalized.  Pairs
+	 * with the smp_wmb() in tdx_td_finalize().
 	 */
 	smp_rmb();
-	if (likely(kvm_tdx->state == TD_STATE_RUNNABLE))
-		return tdx_mem_page_aug(kvm, gfn, level, pfn);
 
-	return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
+	/*
+	 * If the TD isn't finalized/runnable, then userspace is initializing
+	 * the VM image via KVM_TDX_INIT_MEM_REGION.  Increment the number of
+	 * pages that need to be mapped and initialized via TDH.MEM.PAGE.ADD.
+	 * KVM_TDX_FINALIZE_VM checks the counter to ensure all pre-mapped
+	 * pages have been added to the image, to prevent running the TD with a
+	 * valid mapping in the mirror EPT, but not in the S-EPT.
+	 */
+	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE)) {
+		if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm))
+			return -EIO;
+
+		atomic64_inc(&kvm_tdx->nr_premapped);
+		return 0;
+	}
+
+	return tdx_mem_page_aug(kvm, gfn, level, pfn);
 }
 
 static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Date: Thu, 30 Oct 2025 13:09:39 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-17-seanjc@google.com>
Subject: [PATCH v4 16/28] KVM: TDX: ADD pages to the TD image while populating
 mirror EPT entries
From: Sean Christopherson

When populating the initial memory image for a TDX guest, ADD pages to the
TD as part of establishing the mappings in the mirror EPT, as opposed to
creating the mappings and then doing ADD after the fact.  Doing ADD in the
S-EPT callbacks eliminates the need to track "premapped" pages, as the
mirror EPT (M-EPT) and S-EPT are always synchronized, e.g.
if ADD fails, KVM reverts to the previous M-EPT entry (guaranteed to be
!PRESENT).  Eliminating the hole where the M-EPT can have a mapping that
doesn't exist in the S-EPT in turn obviates the need to handle errors that
are unique to encountering a missing S-EPT entry (see
tdx_is_sept_zap_err_due_to_premap()).

Keeping the M-EPT and S-EPT synchronized also eliminates the need to check
for unconsumed "premap" entries during tdx_td_finalize(), as there simply
can't be any such entries.  Dropping that check in particular reduces the
overall cognitive load, as the management of nr_premapped with respect to
removal of S-EPT entries is _very_ subtle.  E.g. successful removal of an
S-EPT entry after it completed ADD doesn't adjust nr_premapped, but it's
not clear why that's "ok" but having half-baked entries is not (it's not
truly "ok" in that removing pages from the image will likely prevent the
guest from booting, but from KVM's perspective it's "ok").

Doing ADD in the S-EPT path requires passing an argument via a scratch
field, but the current approach of tracking the number of "premapped"
pages effectively does the same.  And the "premapped" counter is much more
dangerous, as it doesn't have a singular lock to protect its usage, since
nr_premapped can be modified as soon as mmu_lock is dropped, at least in
theory.  I.e. nr_premapped is guarded by slots_lock, but only for "happy"
paths.

Note, this approach was used/tried at various points in TDX development,
but was ultimately discarded due to a desire to avoid stashing temporary
state in kvm_tdx.  But as above, KVM ended up with such state anyways, and
fully committing to using temporary state provides better access rules
(100% guarded by slots_lock), and makes several edge cases flat out
impossible.

Note #2, continue to extend the measurement outside of mmu_lock, as it's a
slow operation (typically 16 SEAMCALLs per page whose data is included in
the measurement), and doesn't *need* to be done under mmu_lock, e.g.
for consistency purposes.  However, MR.EXTEND isn't _that_ slow, e.g. ~1ms
latency to measure a full page, so if it needs to be done under mmu_lock
in the future, e.g. because KVM gains a flow that can remove S-EPT entries
during KVM_TDX_INIT_MEM_REGION, then extending the measurement can also be
moved into the S-EPT mapping path (again, only if absolutely necessary).

P.S. _If_ MR.EXTEND is moved into the S-EPT path, take care not to return
an error up the stack if TDH_MR_EXTEND fails, as removing the M-EPT entry
but not the S-EPT entry would result in inconsistent state!

Reviewed-by: Rick Edgecombe
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 106 ++++++++++++++---------------------------
 arch/x86/kvm/vmx/tdx.h |   8 +++-
 2 files changed, 43 insertions(+), 71 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index a30471c972ba..cfdf8d262756 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1583,6 +1583,32 @@ void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa);
 }
 
+static int tdx_mem_page_add(struct kvm *kvm, gfn_t gfn, enum pg_level level,
+			    kvm_pfn_t pfn)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	u64 err, entry, level_state;
+	gpa_t gpa = gfn_to_gpa(gfn);
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm) ||
+	    KVM_BUG_ON(!kvm_tdx->page_add_src, kvm))
+		return -EIO;
+
+	err = tdh_mem_page_add(&kvm_tdx->td, gpa, pfn_to_page(pfn),
+			       kvm_tdx->page_add_src, &entry, &level_state);
+	if (unlikely(tdx_operand_busy(err)))
+		return -EBUSY;
+
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error_2(TDH_MEM_PAGE_ADD, err, entry, level_state);
+		return -EIO;
+	}
+
+	return 0;
+}
+
 static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 			    enum pg_level level, kvm_pfn_t pfn)
 {
@@ -1628,19 +1654,10 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 
 	/*
 	 * If the TD isn't finalized/runnable, then userspace is initializing
-	 * the VM image via KVM_TDX_INIT_MEM_REGION.  Increment the number of
-	 * pages that need to be mapped and initialized via TDH.MEM.PAGE.ADD.
-	 * KVM_TDX_FINALIZE_VM checks the counter to ensure all pre-mapped
-	 * pages have been added to the image, to prevent running the TD with a
-	 * valid mapping in the mirror EPT, but not in the S-EPT.
+	 * the VM image via KVM_TDX_INIT_MEM_REGION; ADD the page to the TD.
 	 */
-	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE)) {
-		if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm))
-			return -EIO;
-
-		atomic64_inc(&kvm_tdx->nr_premapped);
-		return 0;
-	}
+	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE))
+		return tdx_mem_page_add(kvm, gfn, level, pfn);
 
 	return tdx_mem_page_aug(kvm, gfn, level, pfn);
 }
@@ -1666,39 +1683,6 @@ static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
-/*
- * Check if the error returned from a SEPT zap SEAMCALL is due to that a page is
- * mapped by KVM_TDX_INIT_MEM_REGION without tdh_mem_page_add() being called
- * successfully.
- *
- * Since tdh_mem_sept_add() must have been invoked successfully before a
- * non-leaf entry present in the mirrored page table, the SEPT ZAP related
- * SEAMCALLs should not encounter err TDX_EPT_WALK_FAILED. They should instead
- * find TDX_EPT_ENTRY_STATE_INCORRECT due to an empty leaf entry found in the
- * SEPT.
- *
- * Further check if the returned entry from SEPT walking is with RWX permissions
- * to filter out anything unexpected.
- *
- * Note: @level is pg_level, not the tdx_level. The tdx_level extracted from
- * level_state returned from a SEAMCALL error is the same as that passed into
- * the SEAMCALL.
- */
-static int tdx_is_sept_zap_err_due_to_premap(struct kvm_tdx *kvm_tdx, u64 err,
-					     u64 entry, int level)
-{
-	if (!err || kvm_tdx->state == TD_STATE_RUNNABLE)
-		return false;
-
-	if (err != (TDX_EPT_ENTRY_STATE_INCORRECT | TDX_OPERAND_ID_RCX))
-		return false;
-
-	if ((is_last_spte(entry, level) && (entry & VMX_EPT_RWX_MASK)))
-		return false;
-
-	return true;
-}
-
 static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 				     enum pg_level level, struct page *page)
 {
@@ -1718,12 +1702,6 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
 		tdx_no_vcpus_enter_stop(kvm);
 	}
-	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level)) {
-		if (KVM_BUG_ON(atomic64_dec_return(&kvm_tdx->nr_premapped) < 0, kvm))
-			return -EIO;
-
-		return 0;
-	}
 
 	if (KVM_BUG_ON(err, kvm)) {
 		pr_tdx_error_2(TDH_MEM_RANGE_BLOCK, err, entry, level_state);
@@ -2829,12 +2807,6 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 
 	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
-	/*
-	 * Pages are pending for KVM_TDX_INIT_MEM_REGION to issue
-	 * TDH.MEM.PAGE.ADD().
-	 */
-	if (atomic64_read(&kvm_tdx->nr_premapped))
-		return -EINVAL;
 
 	cmd->hw_error = tdh_mr_finalize(&kvm_tdx->td);
 	if (tdx_operand_busy(cmd->hw_error))
@@ -3131,6 +3103,9 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	struct page *src_page;
 	int ret, i;
 
+	if (KVM_BUG_ON(kvm_tdx->page_add_src, kvm))
+		return -EIO;
+
 	/*
 	 * Get the source page if it has been faulted in. Return failure if the
 	 * source page has been swapped out or unmapped in primary memory.
@@ -3141,19 +3116,14 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 	if (ret != 1)
 		return -ENOMEM;
 
+	kvm_tdx->page_add_src = src_page;
 	ret = kvm_tdp_mmu_map_private_pfn(arg->vcpu, gfn, pfn);
-	if (ret < 0)
-		goto out;
+	kvm_tdx->page_add_src = NULL;
 
-	ret = 0;
-	err = tdh_mem_page_add(&kvm_tdx->td, gpa, pfn_to_page(pfn),
-			       src_page, &entry, &level_state);
-	if (err) {
-		ret = unlikely(tdx_operand_busy(err)) ? -EBUSY : -EIO;
-		goto out;
-	}
+	put_page(src_page);
 
-	KVM_BUG_ON(atomic64_dec_return(&kvm_tdx->nr_premapped) < 0, kvm);
+	if (ret)
+		return ret;
 
 	if (arg->flags & KVM_TDX_MEASURE_MEMORY_REGION) {
 		for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
@@ -3166,8 +3136,6 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 		}
 	}
 
-out:
-	put_page(src_page);
 	return ret;
 }
 
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index ca39a9391db1..1b00adbbaf77 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -36,8 +36,12 @@ struct kvm_tdx {
 
 	struct tdx_td td;
 
-	/* For KVM_TDX_INIT_MEM_REGION. */
-	atomic64_t nr_premapped;
+	/*
+	 * Scratch pointer used to pass the source page to tdx_mem_page_add.
+	 * Protected by slots_lock, and non-NULL only when mapping a private
+	 * pfn via tdx_gmem_post_populate().
+	 */
+	struct page *page_add_src;
 
 	/*
 	 * Prevent vCPUs from TD entry to ensure SEPT zap related SEAMCALLs do
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Date: Thu, 30 Oct 2025 13:09:40 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-18-seanjc@google.com>
Subject: [PATCH v4 17/28] KVM: TDX: Fold tdx_sept_zap_private_spte() into
 tdx_sept_remove_private_spte()
From: Sean Christopherson

Do TDH_MEM_RANGE_BLOCK directly in tdx_sept_remove_private_spte() instead
of using a one-off helper now that the nr_premapped tracking is gone.

Opportunistically drop the WARN on hugepages, which was dead code (see the
KVM_BUG_ON() in tdx_sept_remove_private_spte()).

No functional change intended.
Reviewed-by: Rick Edgecombe
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 41 +++++++++++------------------------------
 1 file changed, 11 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index cfdf8d262756..260b569309cf 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1683,33 +1683,6 @@ static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
-static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
-				     enum pg_level level, struct page *page)
-{
-	int tdx_level = pg_level_to_tdx_sept_level(level);
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	gpa_t gpa = gfn_to_gpa(gfn) & KVM_HPAGE_MASK(level);
-	u64 err, entry, level_state;
-
-	/* For now large page isn't supported yet. */
-	WARN_ON_ONCE(level != PG_LEVEL_4K);
-
-	err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
-
-	if (unlikely(tdx_operand_busy(err))) {
-		/* After no vCPUs enter, the second retry is expected to succeed */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error_2(TDH_MEM_RANGE_BLOCK, err, entry, level_state);
-		return -EIO;
-	}
-	return 1;
-}
-
 /*
  * Ensure shared and private EPTs to be flushed on all vCPUs.
  * tdh_mem_track() is the only caller that increases TD epoch.
 * An increase in
@@ -1790,7 +1763,6 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 	gpa_t gpa = gfn_to_gpa(gfn);
 	u64 err, entry, level_state;
-	int ret;
 
 	/*
 	 * HKID is released after all private pages have been removed, and set
@@ -1804,9 +1776,18 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
 		return;
 
-	ret = tdx_sept_zap_private_spte(kvm, gfn, level, page);
-	if (ret <= 0)
+	err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
+	if (unlikely(tdx_operand_busy(err))) {
+		/* After no vCPUs enter, the second retry is expected to succeed */
+		tdx_no_vcpus_enter_start(kvm);
+		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
+		tdx_no_vcpus_enter_stop(kvm);
+	}
+
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error_2(TDH_MEM_RANGE_BLOCK, err, entry, level_state);
 		return;
+	}
 
 	/*
 	 * TDX requires TLB tracking before dropping private page.
	 * Do
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Date: Thu, 30 Oct 2025 13:09:41 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-19-seanjc@google.com>
Subject: [PATCH v4 18/28] KVM: TDX: Combine KVM_BUG_ON + pr_tdx_error() into
 TDX_BUG_ON()
From: Sean Christopherson

Add TDX_BUG_ON() macros (with varying numbers of arguments) to deduplicate
the myriad flows that do KVM_BUG_ON()/WARN_ON_ONCE() followed by a call to
pr_tdx_error().  In addition to reducing boilerplate copy+paste code, this
also helps ensure that KVM provides consistent handling of SEAMCALL errors.

Opportunistically convert a handful of bare WARN_ON_ONCE() paths to the
equivalent of KVM_BUG_ON(), i.e. have them terminate the VM.  If a SEAMCALL
error is fatal enough to WARN on, it's fatal enough to terminate the TD.
Reviewed-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 110 +++++++++++++++++------------------------
 1 file changed, 46 insertions(+), 64 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 260b569309cf..5e6f2d8b6014 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -24,20 +24,32 @@
 #undef pr_fmt
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
-#define pr_tdx_error(__fn, __err)	\
-	pr_err_ratelimited("SEAMCALL %s failed: 0x%llx\n", #__fn, __err)
+#define __TDX_BUG_ON(__err, __f, __kvm, __fmt, __args...)		\
+({									\
+	struct kvm *_kvm = (__kvm);					\
+	bool __ret = !!(__err);						\
+									\
+	if (WARN_ON_ONCE(__ret && (!_kvm || !_kvm->vm_bugged))) {	\
+		if (_kvm)						\
+			kvm_vm_bugged(_kvm);				\
+		pr_err_ratelimited("SEAMCALL " __f " failed: 0x%llx" __fmt "\n",\
+				   __err, __args);			\
+	}								\
+	unlikely(__ret);						\
+})
 
-#define __pr_tdx_error_N(__fn_str, __err, __fmt, ...)
	\
-	pr_err_ratelimited("SEAMCALL " __fn_str " failed: 0x%llx, " __fmt, __err, __VA_ARGS__)
+#define TDX_BUG_ON(__err, __fn, __kvm)					\
+	__TDX_BUG_ON(__err, #__fn, __kvm, "%s", "")
 
-#define pr_tdx_error_1(__fn, __err, __rcx)				\
-	__pr_tdx_error_N(#__fn, __err, "rcx 0x%llx\n", __rcx)
+#define TDX_BUG_ON_1(__err, __fn, __rcx, __kvm)				\
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx", __rcx)
 
-#define pr_tdx_error_2(__fn, __err, __rcx, __rdx)			\
-	__pr_tdx_error_N(#__fn, __err, "rcx 0x%llx, rdx 0x%llx\n", __rcx, __rdx)
+#define TDX_BUG_ON_2(__err, __fn, __rcx, __rdx, __kvm)			\
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx, rdx 0x%llx", __rcx, __rdx)
+
+#define TDX_BUG_ON_3(__err, __fn, __rcx, __rdx, __r8, __kvm)		\
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx, rdx 0x%llx, r8 0x%llx", __rcx, __rdx, __r8)
 
-#define pr_tdx_error_3(__fn, __err, __rcx, __rdx, __r8)			\
-	__pr_tdx_error_N(#__fn, __err, "rcx 0x%llx, rdx 0x%llx, r8 0x%llx\n", __rcx, __rdx, __r8)
 
 bool enable_tdx __ro_after_init;
 module_param_named(tdx, enable_tdx, bool, 0444);
@@ -313,10 +325,9 @@ static int __tdx_reclaim_page(struct page *page)
 	 * before the HKID is released and control pages have also been
 	 * released at this point, so there is no possibility of contention.
	 */
-	if (WARN_ON_ONCE(err)) {
-		pr_tdx_error_3(TDH_PHYMEM_PAGE_RECLAIM, err, rcx, rdx, r8);
+	if (TDX_BUG_ON_3(err, TDH_PHYMEM_PAGE_RECLAIM, rcx, rdx, r8, NULL))
 		return -EIO;
-	}
+
 	return 0;
 }
 
@@ -404,8 +415,8 @@ static void tdx_flush_vp_on_cpu(struct kvm_vcpu *vcpu)
 		return;
 
 	smp_call_function_single(cpu, tdx_flush_vp, &arg, 1);
-	if (KVM_BUG_ON(arg.err, vcpu->kvm))
-		pr_tdx_error(TDH_VP_FLUSH, arg.err);
+
+	TDX_BUG_ON(arg.err, TDH_VP_FLUSH, vcpu->kvm);
 }
 
 void tdx_disable_virtualization_cpu(void)
@@ -464,8 +475,7 @@ static void smp_func_do_phymem_cache_wb(void *unused)
 	}
 
 out:
-	if (WARN_ON_ONCE(err))
-		pr_tdx_error(TDH_PHYMEM_CACHE_WB, err);
+	TDX_BUG_ON(err, TDH_PHYMEM_CACHE_WB, NULL);
 }
 
 void tdx_mmu_release_hkid(struct kvm *kvm)
@@ -504,8 +514,7 @@ void tdx_mmu_release_hkid(struct kvm *kvm)
 	err = tdh_mng_vpflushdone(&kvm_tdx->td);
 	if (err == TDX_FLUSHVP_NOT_DONE)
 		goto out;
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error(TDH_MNG_VPFLUSHDONE, err);
+	if (TDX_BUG_ON(err, TDH_MNG_VPFLUSHDONE, kvm)) {
 		pr_err("tdh_mng_vpflushdone() failed. HKID %d is leaked.\n",
 		       kvm_tdx->hkid);
 		goto out;
@@ -528,8 +537,7 @@ void tdx_mmu_release_hkid(struct kvm *kvm)
 	 * tdh_mng_key_freeid() will fail.
 	 */
 	err = tdh_mng_key_freeid(&kvm_tdx->td);
-	if (KVM_BUG_ON(err, kvm)) {
-		pr_tdx_error(TDH_MNG_KEY_FREEID, err);
+	if (TDX_BUG_ON(err, TDH_MNG_KEY_FREEID, kvm)) {
 		pr_err("tdh_mng_key_freeid() failed. HKID %d is leaked.\n",
 		       kvm_tdx->hkid);
 	} else {
@@ -580,10 +588,9 @@ static void tdx_reclaim_td_control_pages(struct kvm *kvm)
 	 * when it is reclaiming TDCS).
*/ err =3D tdh_phymem_page_wbinvd_tdr(&kvm_tdx->td); - if (KVM_BUG_ON(err, kvm)) { - pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err); + if (TDX_BUG_ON(err, TDH_PHYMEM_PAGE_WBINVD, kvm)) return; - } + tdx_quirk_reset_page(kvm_tdx->td.tdr_page); =20 __free_page(kvm_tdx->td.tdr_page); @@ -606,11 +613,8 @@ static int tdx_do_tdh_mng_key_config(void *param) =20 /* TDX_RND_NO_ENTROPY related retries are handled by sc_retry() */ err =3D tdh_mng_key_config(&kvm_tdx->td); - - if (KVM_BUG_ON(err, &kvm_tdx->kvm)) { - pr_tdx_error(TDH_MNG_KEY_CONFIG, err); + if (TDX_BUG_ON(err, TDH_MNG_KEY_CONFIG, &kvm_tdx->kvm)) return -EIO; - } =20 return 0; } @@ -1601,10 +1605,8 @@ static int tdx_mem_page_add(struct kvm *kvm, gfn_t g= fn, enum pg_level level, if (unlikely(tdx_operand_busy(err))) return -EBUSY; =20 - if (KVM_BUG_ON(err, kvm)) { - pr_tdx_error_2(TDH_MEM_PAGE_ADD, err, entry, level_state); + if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_ADD, entry, level_state, kvm)) return -EIO; - } =20 return 0; } @@ -1623,10 +1625,8 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t g= fn, if (unlikely(tdx_operand_busy(err))) return -EBUSY; =20 - if (KVM_BUG_ON(err, kvm)) { - pr_tdx_error_2(TDH_MEM_PAGE_AUG, err, entry, level_state); + if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_AUG, entry, level_state, kvm)) return -EIO; - } =20 return 0; } @@ -1675,10 +1675,8 @@ static int tdx_sept_link_private_spt(struct kvm *kvm= , gfn_t gfn, if (unlikely(tdx_operand_busy(err))) return -EBUSY; =20 - if (KVM_BUG_ON(err, kvm)) { - pr_tdx_error_2(TDH_MEM_SEPT_ADD, err, entry, level_state); + if (TDX_BUG_ON_2(err, TDH_MEM_SEPT_ADD, entry, level_state, kvm)) return -EIO; - } =20 return 0; } @@ -1726,8 +1724,7 @@ static void tdx_track(struct kvm *kvm) tdx_no_vcpus_enter_stop(kvm); } =20 - if (KVM_BUG_ON(err, kvm)) - pr_tdx_error(TDH_MEM_TRACK, err); + TDX_BUG_ON(err, TDH_MEM_TRACK, kvm); =20 kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE); } @@ -1784,10 +1781,8 @@ static void tdx_sept_remove_private_spte(struct kvm = 
*kvm, gfn_t gfn, tdx_no_vcpus_enter_stop(kvm); } =20 - if (KVM_BUG_ON(err, kvm)) { - pr_tdx_error_2(TDH_MEM_RANGE_BLOCK, err, entry, level_state); + if (TDX_BUG_ON_2(err, TDH_MEM_RANGE_BLOCK, entry, level_state, kvm)) return; - } =20 /* * TDX requires TLB tracking before dropping private page. Do @@ -1814,16 +1809,12 @@ static void tdx_sept_remove_private_spte(struct kvm= *kvm, gfn_t gfn, tdx_no_vcpus_enter_stop(kvm); } =20 - if (KVM_BUG_ON(err, kvm)) { - pr_tdx_error_2(TDH_MEM_PAGE_REMOVE, err, entry, level_state); + if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_REMOVE, entry, level_state, kvm)) return; - } =20 err =3D tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, page); - if (KVM_BUG_ON(err, kvm)) { - pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err); + if (TDX_BUG_ON(err, TDH_PHYMEM_PAGE_WBINVD, kvm)) return; - } =20 tdx_quirk_reset_page(page); } @@ -2463,8 +2454,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_p= arams *td_params, goto free_packages; } =20 - if (WARN_ON_ONCE(err)) { - pr_tdx_error(TDH_MNG_CREATE, err); + if (TDX_BUG_ON(err, TDH_MNG_CREATE, kvm)) { ret =3D -EIO; goto free_packages; } @@ -2505,8 +2495,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_p= arams *td_params, ret =3D -EAGAIN; goto teardown; } - if (WARN_ON_ONCE(err)) { - pr_tdx_error(TDH_MNG_ADDCX, err); + if (TDX_BUG_ON(err, TDH_MNG_ADDCX, kvm)) { ret =3D -EIO; goto teardown; } @@ -2523,8 +2512,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_p= arams *td_params, *seamcall_err =3D err; ret =3D -EINVAL; goto teardown; - } else if (WARN_ON_ONCE(err)) { - pr_tdx_error_1(TDH_MNG_INIT, err, rcx); + } else if (TDX_BUG_ON_1(err, TDH_MNG_INIT, rcx, kvm)) { ret =3D -EIO; goto teardown; } @@ -2792,10 +2780,8 @@ static int tdx_td_finalize(struct kvm *kvm, struct k= vm_tdx_cmd *cmd) cmd->hw_error =3D tdh_mr_finalize(&kvm_tdx->td); if (tdx_operand_busy(cmd->hw_error)) return -EBUSY; - if (KVM_BUG_ON(cmd->hw_error, kvm)) { - pr_tdx_error(TDH_MR_FINALIZE, cmd->hw_error); + if 
(TDX_BUG_ON(cmd->hw_error, TDH_MR_FINALIZE, kvm)) return -EIO; - } =20 kvm_tdx->state =3D TD_STATE_RUNNABLE; /* TD_STATE_RUNNABLE must be set before 'pre_fault_allowed' */ @@ -2882,16 +2868,14 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, = u64 vcpu_rcx) } =20 err =3D tdh_vp_create(&kvm_tdx->td, &tdx->vp); - if (KVM_BUG_ON(err, vcpu->kvm)) { + if (TDX_BUG_ON(err, TDH_VP_CREATE, vcpu->kvm)) { ret =3D -EIO; - pr_tdx_error(TDH_VP_CREATE, err); goto free_tdcx; } =20 for (i =3D 0; i < kvm_tdx->td.tdcx_nr_pages; i++) { err =3D tdh_vp_addcx(&tdx->vp, tdx->vp.tdcx_pages[i]); - if (KVM_BUG_ON(err, vcpu->kvm)) { - pr_tdx_error(TDH_VP_ADDCX, err); + if (TDX_BUG_ON(err, TDH_VP_ADDCX, vcpu->kvm)) { /* * Pages already added are reclaimed by the vcpu_free * method, but the rest are freed here. @@ -2905,10 +2889,8 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u= 64 vcpu_rcx) } =20 err =3D tdh_vp_init(&tdx->vp, vcpu_rcx, vcpu->vcpu_id); - if (KVM_BUG_ON(err, vcpu->kvm)) { - pr_tdx_error(TDH_VP_INIT, err); + if (TDX_BUG_ON(err, TDH_VP_INIT, vcpu->kvm)) return -EIO; - } =20 vcpu->arch.mp_state =3D KVM_MP_STATE_RUNNABLE; =20 --=20 2.51.1.930.gacf6e81ea2-goog From nobody Tue Dec 16 07:27:35 2025 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6B3DD393DE7 for ; Thu, 30 Oct 2025 20:10:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761855050; cv=none; b=KJmzxiPNwBwX/AI/NGO8BDCM64+voMe3QMAfue9zXGe4SvuZaIjBB7dcDgeaB+74jDb0/bKxJEdjDMEV42w2jkJIUAAdizpyENKRRpIofLl3zYkkxkFB8TxcIbFZ0HPMNhfP30ZgntwFv344KvzRtcdU1FF+Huj8cdkIks9iGSM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761855050; c=relaxed/simple; 
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:42 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-20-seanjc@google.com>
Subject: [PATCH v4 19/28] KVM: TDX: Derive error argument names from the local variable names
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda, Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Binbin Wu, Michael Roth, Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng

When printing SEAMCALL errors, use the name of the variable holding an
error parameter instead of the register from whence it came, so that
flows which use descriptive variable names will similarly print
descriptive error messages.

Suggested-by: Rick Edgecombe
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 5e6f2d8b6014..63d4609cc3bc 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -41,14 +41,15 @@
 #define TDX_BUG_ON(__err, __fn, __kvm) \
	__TDX_BUG_ON(__err, #__fn, __kvm, "%s", "")

-#define TDX_BUG_ON_1(__err, __fn, __rcx, __kvm) \
-	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx", __rcx)
+#define TDX_BUG_ON_1(__err, __fn, a1, __kvm) \
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", " #a1 " 0x%llx", a1)

-#define TDX_BUG_ON_2(__err, __fn, __rcx, __rdx, __kvm) \
-	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx, rdx 0x%llx", __rcx, __rdx)
+#define TDX_BUG_ON_2(__err, __fn, a1, a2, __kvm) \
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", " #a1 " 0x%llx, " #a2 " 0x%llx", a1, a2)

-#define TDX_BUG_ON_3(__err, __fn, __rcx, __rdx, __r8, __kvm) \
-	__TDX_BUG_ON(__err, #__fn, __kvm, ", rcx 0x%llx, rdx 0x%llx, r8 0x%llx", __rcx, __rdx, __r8)
+#define TDX_BUG_ON_3(__err, __fn, a1, a2, a3, __kvm) \
+	__TDX_BUG_ON(__err, #__fn, __kvm, ", " #a1 " 0x%llx, " #a2 " 0x%llx, " #a3 " 0x%llx", \
+		     a1, a2, a3)


 bool enable_tdx __ro_after_init;
--
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:43 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-21-seanjc@google.com>
Subject: [PATCH v4 20/28] KVM: TDX: Assert that mmu_lock is held for write when removing S-EPT entries
From: Sean Christopherson

Unconditionally assert that mmu_lock is held for write when removing
S-EPT entries, not just when removing S-EPT entries triggers certain
conditions, e.g. needs to do TDH_MEM_TRACK or kick vCPUs out of the
guest.  Conditionally asserting implies that it's safe to hold mmu_lock
for read when those paths aren't hit, which is simply not true, as KVM
doesn't support removing S-EPT entries under read-lock.
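The rule being enforced here — check the locking requirement unconditionally at function entry instead of only inside the conditional slow paths — can be modeled in a standalone userspace sketch.  All names below (struct mmu_lock, remove_private_spte) are illustrative stand-ins, not the kernel API; returning -EPERM stands in for a lockdep splat:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for KVM's mmu_lock: tracks whether the current
 * "task" holds it for write, the property lockdep_assert_held_write()
 * verifies in the real code. */
struct mmu_lock {
	bool held_for_write;
};

/* The locking check runs unconditionally at entry, mirroring the patch's
 * move of the assertion out of the conditional slow paths; need_track
 * only selects extra work, it no longer gates the locking requirement. */
int remove_private_spte(struct mmu_lock *lock, bool need_track)
{
	if (!lock->held_for_write)
		return -EPERM;	/* lockdep would splat here */

	/* ... zap the entry; optionally do TLB tracking ... */
	(void)need_track;
	return 0;
}
```

The point of the unconditional form is that a caller holding the lock only for read fails on every path, not just on the paths that happen to need tracking.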
Only two paths lead to remove_external_spte(), and both paths assert that
mmu_lock is held for write (tdp_mmu_set_spte() via lockdep, and
handle_removed_pt() via KVM_BUG_ON()).

Deliberately leave lockdep assertions in the "no vCPUs" helpers to
document that wait_for_sept_zap is guarded by holding mmu_lock for write,
and keep the conditional assert in tdx_track() as well, but with a
comment to help explain why holding mmu_lock for write matters (above and
beyond tdx_sept_remove_private_spte()'s requirements).

Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 63d4609cc3bc..999b519494e9 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1715,6 +1715,11 @@ static void tdx_track(struct kvm *kvm)
	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE))
		return;

+	/*
+	 * The full sequence of TDH.MEM.TRACK and forcing vCPUs out of guest
+	 * mode must be serialized, as TDH.MEM.TRACK will fail if the previous
+	 * tracking epoch hasn't completed.
+	 */
	lockdep_assert_held_write(&kvm->mmu_lock);

	err = tdh_mem_track(&kvm_tdx->td);
@@ -1762,6 +1767,8 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
	gpa_t gpa = gfn_to_gpa(gfn);
	u64 err, entry, level_state;

+	lockdep_assert_held_write(&kvm->mmu_lock);
+
	/*
	 * HKID is released after all private pages have been removed, and set
	 * before any might be populated.  Warn if zapping is attempted when
--
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:44 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-22-seanjc@google.com>
Subject: [PATCH v4 21/28] KVM: TDX: Add macro to retry SEAMCALLs when forcing vCPUs out of guest
From: Sean Christopherson

Add a macro to handle kicking vCPUs out of the guest and retrying
SEAMCALLs on -EBUSY instead of providing small helpers to be used by each
SEAMCALL.  Wrapping the SEAMCALLs in a macro makes it a little harder to
tease out which SEAMCALL is being made, but significantly reduces the
amount of copy+paste code and makes it all but impossible to leave an
elevated wait_for_sept_zap.
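The "kick vCPUs and retry exactly once" shape can be sketched in userspace as a helper taking a function pointer rather than a macro.  Everything here (struct fake_vm, do_no_vcpus, FAKE_BUSY, flaky_call) is a hypothetical model of the pattern, not the kernel code; the key property it demonstrates is that wait_for_sept_zap can never be left elevated:

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_BUSY 0x8000020000000000ULL	/* stand-in for an "operand busy" error */

/* Hypothetical VM state: wait_for_sept_zap mirrors the flag the real
 * macro raises around the retry; kicks counts forced vCPU exits. */
struct fake_vm {
	bool wait_for_sept_zap;
	int kicks;
};

/* Single-retry pattern: run the call once; on "busy", block vCPU
 * re-entry, kick vCPUs out, retry exactly once, then unblock.  The
 * flag is lowered on every path out of the busy branch. */
uint64_t do_no_vcpus(struct fake_vm *vm, uint64_t (*call)(void *), void *arg)
{
	uint64_t err = call(arg);

	if (err == FAKE_BUSY) {
		vm->wait_for_sept_zap = true;
		vm->kicks++;		/* kvm_make_all_cpus_request() in the real code */
		err = call(arg);
		vm->wait_for_sept_zap = false;
	}
	return err;
}

/* Test double: fails with "busy" a configurable number of times. */
uint64_t flaky_call(void *arg)
{
	int *busy_left = arg;
	return (*busy_left)-- > 0 ? FAKE_BUSY : 0;
}
```

Centralizing the toggle in one place is what makes leaking an elevated flag structurally impossible, versus open-coding start/stop calls at every SEAMCALL site.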
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 82 +++++++++++++++++-------------------------
 1 file changed, 33 insertions(+), 49 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 999b519494e9..97632fc6b520 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -294,25 +294,34 @@ static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
	vcpu->cpu = -1;
 }

-static void tdx_no_vcpus_enter_start(struct kvm *kvm)
-{
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-
-	lockdep_assert_held_write(&kvm->mmu_lock);
-
-	WRITE_ONCE(kvm_tdx->wait_for_sept_zap, true);
-
-	kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
-}
-
-static void tdx_no_vcpus_enter_stop(struct kvm *kvm)
-{
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-
-	lockdep_assert_held_write(&kvm->mmu_lock);
-
-	WRITE_ONCE(kvm_tdx->wait_for_sept_zap, false);
-}
+/*
+ * Execute a SEAMCALL related to removing/blocking S-EPT entries, with a single
+ * retry (if necessary) after forcing vCPUs to exit and wait for the operation
+ * to complete.  All flows that remove/block S-EPT entries run with mmu_lock
+ * held for write, i.e. are mutually exclusive with each other, but they aren't
+ * mutually exclusive with running vCPUs, and so can fail with "operand busy"
+ * if a vCPU acquires a relevant lock in the TDX-Module, e.g. when doing TDCALL.
+ *
+ * Note, the retry is guaranteed to succeed, absent KVM and/or TDX-Module bugs.
+ */
+#define tdh_do_no_vcpus(tdh_func, kvm, args...)				\
+({									\
+	struct kvm_tdx *__kvm_tdx = to_kvm_tdx(kvm);			\
+	u64 __err;							\
+									\
+	lockdep_assert_held_write(&kvm->mmu_lock);			\
+									\
+	__err = tdh_func(args);						\
+	if (unlikely(tdx_operand_busy(__err))) {			\
+		WRITE_ONCE(__kvm_tdx->wait_for_sept_zap, true);		\
+		kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE); \
+									\
+		__err = tdh_func(args);					\
+									\
+		WRITE_ONCE(__kvm_tdx->wait_for_sept_zap, false);	\
+	}								\
+	__err;								\
+})

 /* TDH.PHYMEM.PAGE.RECLAIM is allowed only when destroying the TD. */
 static int __tdx_reclaim_page(struct page *page)
@@ -1722,14 +1731,7 @@ static void tdx_track(struct kvm *kvm)
	 */
	lockdep_assert_held_write(&kvm->mmu_lock);

-	err = tdh_mem_track(&kvm_tdx->td);
-	if (unlikely(tdx_operand_busy(err))) {
-		/* After no vCPUs enter, the second retry is expected to succeed */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_track(&kvm_tdx->td);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
+	err = tdh_do_no_vcpus(tdh_mem_track, kvm, &kvm_tdx->td);
	TDX_BUG_ON(err, TDH_MEM_TRACK, kvm);

	kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
@@ -1781,14 +1783,8 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
		return;

-	err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
-	if (unlikely(tdx_operand_busy(err))) {
-		/* After no vCPUs enter, the second retry is expected to succeed */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
+	err = tdh_do_no_vcpus(tdh_mem_range_block, kvm, &kvm_tdx->td, gpa,
+			      tdx_level, &entry, &level_state);
	if (TDX_BUG_ON_2(err, TDH_MEM_RANGE_BLOCK, entry, level_state, kvm))
		return;

@@ -1803,20 +1799,8 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
	 * with other vcpu sept operation.
	 * Race with TDH.VP.ENTER due to (0-step mitigation) and Guest TDCALLs.
	 */
-	err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
-				  &level_state);
-
-	if (unlikely(tdx_operand_busy(err))) {
-		/*
-		 * The second retry is expected to succeed after kicking off all
-		 * other vCPUs and prevent them from invoking TDH.VP.ENTER.
-		 */
-		tdx_no_vcpus_enter_start(kvm);
-		err = tdh_mem_page_remove(&kvm_tdx->td, gpa, tdx_level, &entry,
-					  &level_state);
-		tdx_no_vcpus_enter_stop(kvm);
-	}
-
+	err = tdh_do_no_vcpus(tdh_mem_page_remove, kvm, &kvm_tdx->td, gpa,
+			      tdx_level, &entry, &level_state);
	if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_REMOVE, entry, level_state, kvm))
		return;

--
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
header.b=NS337iLP; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="NS337iLP" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-3403e994649so3115954a91.3 for ; Thu, 30 Oct 2025 13:10:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1761855055; x=1762459855; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=HfwppzbkJFTqc1oDUhl9YAWd2H/Ydo3ViOcPjYsyqPw=; b=NS337iLP3IfeG0E03oUgJhNVR1nwPGhjNqGmulSiXufZUa+ypiuqoTv5igcWybbu/w CxjwBZhVof1+fsrJB2PuYKiFELEpHmURNM6SOm8Zktl96J/yOrLkTu0ARmVq5hjgqRUH tQZtAKQ0ev2IvGqOHbPN3gQ9zzfrigqztT6Z50flOV3VR4AMOtbsBPBKNjpq9tCBMFNX 9sLI7L0XH0fgKBXUvZ82IIQeRvwYsZGhAdc/CSfwKFfDkrVl3oGtsJBfjREGWnR/TQU2 wntY4+yJBp9ER+bs2xOxNeWkLcPZV7Q+On/NjebKAbs1YbqOVSqJML3PmP8lu6E7ysuJ nVgQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761855055; x=1762459855; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=HfwppzbkJFTqc1oDUhl9YAWd2H/Ydo3ViOcPjYsyqPw=; b=LHdKRXY3uA2tTlxLFvs+Cf+yEzgrBIttOgG3uGYtq49+g/SkddNb58GUj7yuK8uqql XADsYFxtlyizJQytlV8kKSo/iT3cVHtDfC58473oWwcksEzCIh+CT3BaHFio4zJgeAQ1 rhfTDON0307ZRl++oMzPG/T0M1hxzLUBKk83EsvFNjmAz/vJzu8jZST3wZx0AmsjN+oe TsOeoCJQ20BY70/nvqinDPGgFsfSk0ggb9L5Z4VyeyCRKvIeTp0DnShxHn3qLqeqXqRr OrMxmwgXQdVfsxFQ43cA2PFGBucIQcXxXvVns0xnLLPyxRWcfbh+F9277kRnl2/XqEyW d69g== X-Forwarded-Encrypted: i=1; AJvYcCXd5zldxG8PKRZM9Syo3Wn+suKUrlvgog1FzO5vjw8hqkALkweX4eim68dIGk+acBrq4KLqk9EIOHokZI8=@vger.kernel.org 
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:45 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-23-seanjc@google.com>
Subject: [PATCH v4 22/28] KVM: TDX: Add tdx_get_cmd() helper to get and validate sub-ioctl command
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Madhavan Srinivasan, Anup Patel, Paul Walmsley, Palmer Dabbelt,
 Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 Sean Christopherson, Paolo Bonzini, "Kirill A. Shutemov"
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, x86@kernel.org, linux-coco@lists.linux.dev,
 linux-kernel@vger.kernel.org, Ira Weiny, Kai Huang, Binbin Wu,
 Michael Roth, Yan Zhao, Vishal Annapurve, Rick Edgecombe, Ackerley Tng

Add a helper to copy a kvm_tdx_cmd structure from userspace and verify
that must-be-zero fields are indeed zero.
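As a plain userspace sketch of the validation the helper performs (the struct layout and the `copy_from_user()` stand-in below are illustrative assumptions, not the real KVM ABI):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Illustrative command structure with a kernel-owned hw_error field. */
struct kvm_tdx_cmd {
	uint32_t id;
	uint32_t flags;
	uint64_t data;
	uint64_t hw_error;
};

/* Stand-in for copy_from_user(); in userspace it always succeeds. */
static int copy_from_user_sim(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/* Copy the command and reject it if userspace set hw_error, which is
 * reserved for the kernel to report hardware-defined errors back. */
static int tdx_get_cmd(const void *argp, struct kvm_tdx_cmd *cmd)
{
	if (copy_from_user_sim(cmd, argp, sizeof(*cmd)))
		return -EFAULT;

	if (cmd->hw_error)
		return -EINVAL;

	return 0;
}
```

The point of hoisting this into one helper is that both the VM-scoped and vCPU-scoped ioctl paths get identical copy-and-validate behavior.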
No functional change intended.

Reviewed-by: Rick Edgecombe
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 97632fc6b520..390c934562c1 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2782,20 +2782,29 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 	return 0;
 }
 
+static int tdx_get_cmd(void __user *argp, struct kvm_tdx_cmd *cmd)
+{
+	if (copy_from_user(cmd, argp, sizeof(*cmd)))
+		return -EFAULT;
+
+	/*
+	 * Userspace should never set hw_error. KVM writes hw_error to report
+	 * hardware-defined error back to userspace.
+	 */
+	if (cmd->hw_error)
+		return -EINVAL;
+
+	return 0;
+}
+
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_tdx_cmd tdx_cmd;
 	int r;
 
-	if (copy_from_user(&tdx_cmd, argp, sizeof(struct kvm_tdx_cmd)))
-		return -EFAULT;
-
-	/*
-	 * Userspace should never set hw_error. It is used to fill
-	 * hardware-defined error by the kernel.
-	 */
-	if (tdx_cmd.hw_error)
-		return -EINVAL;
+	r = tdx_get_cmd(argp, &tdx_cmd);
+	if (r)
+		return r;
 
 	mutex_lock(&kvm->lock);
 
@@ -3171,11 +3180,9 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
 
-	if (copy_from_user(&cmd, argp, sizeof(cmd)))
-		return -EFAULT;
-
-	if (cmd.hw_error)
-		return -EINVAL;
+	ret = tdx_get_cmd(argp, &cmd);
+	if (ret)
+		return ret;
 
 	switch (cmd.id) {
 	case KVM_TDX_INIT_VCPU:
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:46 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-24-seanjc@google.com>
Subject: [PATCH v4 23/28] KVM: TDX: Convert INIT_MEM_REGION and INIT_VCPU to "unlocked" vCPU ioctl
From: Sean Christopherson

Handle the KVM_TDX_INIT_MEM_REGION and KVM_TDX_INIT_VCPU vCPU sub-ioctls
in the unlocked variant, i.e.
outside of vcpu->mutex, in anticipation
of taking kvm->lock along with all other vCPU mutexes, at which point
the sub-ioctls _must_ start without vcpu->mutex held.

No functional change intended.

Reviewed-by: Kai Huang
Co-developed-by: Yan Zhao
Signed-off-by: Yan Zhao
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/kvm/vmx/main.c            |  9 +++++++
 arch/x86/kvm/vmx/tdx.c             | 42 +++++++++++++++++++++++++-----
 arch/x86/kvm/vmx/x86_ops.h         |  1 +
 arch/x86/kvm/x86.c                 |  7 +++++
 6 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index fdf178443f85..de709fb5bd76 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -128,6 +128,7 @@ KVM_X86_OP(enable_smi_window)
 KVM_X86_OP_OPTIONAL(dev_get_attr)
 KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
 KVM_X86_OP_OPTIONAL(vcpu_mem_enc_ioctl)
+KVM_X86_OP_OPTIONAL(vcpu_mem_enc_unlocked_ioctl)
 KVM_X86_OP_OPTIONAL(mem_enc_register_region)
 KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
 KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 87a5f5100b1d..2bfae1cfa514 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1914,6 +1914,7 @@ struct kvm_x86_ops {
 	int (*dev_get_attr)(u32 group, u64 attr, u64 *val);
 	int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp);
 	int (*vcpu_mem_enc_ioctl)(struct kvm_vcpu *vcpu, void __user *argp);
+	int (*vcpu_mem_enc_unlocked_ioctl)(struct kvm_vcpu *vcpu, void __user *argp);
 	int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp);
 	int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp);
 	int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
diff --git
a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 0eb2773b2ae2..a46ccd670785 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -831,6 +831,14 @@ static int vt_vcpu_mem_enc_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 	return tdx_vcpu_ioctl(vcpu, argp);
 }
 
+static int vt_vcpu_mem_enc_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+	if (!is_td_vcpu(vcpu))
+		return -EINVAL;
+
+	return tdx_vcpu_unlocked_ioctl(vcpu, argp);
+}
+
 static int vt_gmem_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn,
 				     bool is_private)
 {
@@ -1005,6 +1013,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 
 	.mem_enc_ioctl = vt_op_tdx_only(mem_enc_ioctl),
 	.vcpu_mem_enc_ioctl = vt_op_tdx_only(vcpu_mem_enc_ioctl),
+	.vcpu_mem_enc_unlocked_ioctl = vt_op_tdx_only(vcpu_mem_enc_unlocked_ioctl),
 
 	.gmem_max_mapping_level = vt_op_tdx_only(gmem_max_mapping_level)
 };
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 390c934562c1..d6f40a481487 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3171,6 +3171,42 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
 	return ret;
 }
 
+int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+	struct kvm_tdx_cmd cmd;
+	int r;
+
+	r = tdx_get_cmd(argp, &cmd);
+	if (r)
+		return r;
+
+	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
+		return -EINVAL;
+
+	if (mutex_lock_killable(&vcpu->mutex))
+		return -EINTR;
+
+	vcpu_load(vcpu);
+
+	switch (cmd.id) {
+	case KVM_TDX_INIT_MEM_REGION:
+		r = tdx_vcpu_init_mem_region(vcpu, &cmd);
+		break;
+	case KVM_TDX_INIT_VCPU:
+		r = tdx_vcpu_init(vcpu, &cmd);
+		break;
+	default:
+		r = -ENOIOCTLCMD;
+		break;
+	}
+
+	vcpu_put(vcpu);
+
+	mutex_unlock(&vcpu->mutex);
+	return r;
+}
+
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
@@
-3185,12 +3221,6 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 		return ret;
 
 	switch (cmd.id) {
-	case KVM_TDX_INIT_VCPU:
-		ret = tdx_vcpu_init(vcpu, &cmd);
-		break;
-	case KVM_TDX_INIT_MEM_REGION:
-		ret = tdx_vcpu_init_mem_region(vcpu, &cmd);
-		break;
 	case KVM_TDX_GET_CPUID:
 		ret = tdx_vcpu_get_cpuid(vcpu, &cmd);
 		break;
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 77613a44cebf..d09abeac2b56 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -148,6 +148,7 @@ int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
 int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
 
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
+int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
 
 void tdx_flush_tlb_current(struct kvm_vcpu *vcpu);
 void tdx_flush_tlb_all(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b85cb213a336..593fccc9cf1c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7243,6 +7243,13 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
 long kvm_arch_vcpu_unlocked_ioctl(struct file *filp, unsigned int ioctl,
 				  unsigned long arg)
 {
+	struct kvm_vcpu *vcpu = filp->private_data;
+	void __user *argp = (void __user *)arg;
+
+	if (ioctl == KVM_MEMORY_ENCRYPT_OP &&
+	    kvm_x86_ops.vcpu_mem_enc_unlocked_ioctl)
+		return kvm_x86_call(vcpu_mem_enc_unlocked_ioctl)(vcpu, argp);
+
 	return -ENOIOCTLCMD;
 }
 
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:47 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-25-seanjc@google.com>
Subject: [PATCH v4 24/28] KVM: TDX: Use guard() to acquire kvm->lock in tdx_vm_ioctl()
From: Sean Christopherson

Use guard() in tdx_vm_ioctl() to tidy up the code a small amount, but
more importantly to minimize the diff of a future change, which will use
guard-like semantics to acquire and release multiple locks.

No functional change intended.
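The kernel's guard() machinery is built on the compiler's `cleanup` attribute, so the semantics the changelog relies on, namely that the lock is dropped on every return path, can be sketched in userspace (a minimal sketch with illustrative names; `lock_depth` is instrumentation, not part of any real API):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_depth;	/* observable lock/unlock balance */

/* Cleanup hook: runs automatically when the guarded scope is left. */
static void unlock_cleanup(pthread_mutex_t **m)
{
	lock_depth--;
	pthread_mutex_unlock(*m);
}

/* Scope-based lock: acquire now, release at every scope exit. */
#define scoped_guard_mutex(m)						\
	pthread_mutex_t *guard_ptr_					\
		__attribute__((cleanup(unlock_cleanup))) =		\
		(pthread_mutex_lock(m), lock_depth++, (m))

/* Early returns no longer need a goto-unlock label. */
static int do_ioctl(int cmd)
{
	scoped_guard_mutex(&lock);

	if (cmd < 0)
		return -1;	/* lock is still released here */
	return cmd;
}
```

This is exactly why the error cases in tdx_vm_ioctl() can become bare `return` statements once the function body runs under a guard.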
Reviewed-by: Rick Edgecombe
Reviewed-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d6f40a481487..037429964fd7 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2806,7 +2806,7 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 	if (r)
 		return r;
 
-	mutex_lock(&kvm->lock);
+	guard(mutex)(&kvm->lock);
 
 	switch (tdx_cmd.id) {
 	case KVM_TDX_CAPABILITIES:
@@ -2819,15 +2819,12 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 		r = tdx_td_finalize(kvm, &tdx_cmd);
 		break;
 	default:
-		r = -EINVAL;
-		goto out;
+		return -EINVAL;
 	}
 
 	if (copy_to_user(argp, &tdx_cmd, sizeof(struct kvm_tdx_cmd)))
-		r = -EFAULT;
+		return -EFAULT;
 
-out:
-	mutex_unlock(&kvm->lock);
 	return r;
 }
 
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:48 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID: <20251030200951.3402865-26-seanjc@google.com>
Subject: [PATCH v4 25/28] KVM: TDX: Don't copy "cmd" back to userspace for KVM_TDX_CAPABILITIES
From: Sean Christopherson
Don't copy the kvm_tdx_cmd structure back to userspace when handling
KVM_TDX_CAPABILITIES, as tdx_get_capabilities() doesn't modify hw_error
or any other fields.

Opportunistically hoist the call to tdx_get_capabilities() outside of
the kvm->lock critical section, as getting the capabilities doesn't
touch the VM in any way, e.g. doesn't even take @kvm.

Suggested-by: Kai Huang
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 037429964fd7..57dfddd2a6cf 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2806,12 +2806,12 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 	if (r)
 		return r;
 
+	if (tdx_cmd.id == KVM_TDX_CAPABILITIES)
+		return tdx_get_capabilities(&tdx_cmd);
+
 	guard(mutex)(&kvm->lock);
 
 	switch (tdx_cmd.id) {
-	case KVM_TDX_CAPABILITIES:
-		r = tdx_get_capabilities(&tdx_cmd);
-		break;
 	case KVM_TDX_INIT_VM:
 		r = tdx_td_init(kvm, &tdx_cmd);
 		break;
-- 
2.51.1.930.gacf6e81ea2-goog

From nobody Tue Dec 16 07:27:35 2025
Reply-To: Sean Christopherson
Date: Thu, 30 Oct 2025 13:09:49 -0700
In-Reply-To: <20251030200951.3402865-1-seanjc@google.com>
References: <20251030200951.3402865-1-seanjc@google.com>
Message-ID:
<20251030200951.3402865-27-seanjc@google.com>
Subject: [PATCH v4 26/28] KVM: TDX: Guard VM state transitions with "all" the locks
From: Sean Christopherson

Acquire kvm->lock, kvm->slots_lock, and all vcpu->mutex locks when
servicing ioctls that (a) transition the TD to a new state, i.e. when
doing INIT or FINALIZE or (b) are only valid if the TD is in a specific
state, i.e. when initializing a vCPU or memory region.

Acquiring "all" the locks fixes several KVM_BUG_ON() situations where a
SEAMCALL can fail due to racing actions, e.g. if tdh_vp_create()
contends with either tdh_mr_extend() or tdh_mr_finalize().  For all
intents and purposes, the paths in question are fully serialized, i.e.
there's no reason to try and allow anything remotely interesting to
happen.  Smack 'em with a big hammer instead of trying to be "nice".

Acquire kvm->lock to prevent VM-wide things from happening, slots_lock
to prevent kvm_mmu_zap_all_fast(), and _all_ vCPU mutexes to prevent
vCPUs from interfering.  Use the recently-renamed
kvm_arch_vcpu_unlocked_ioctl() to service the vCPU-scoped ioctls to
avoid a lock inversion problem, e.g. due to taking vcpu->mutex outside
kvm->lock.
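The acquire/release scheme described above can be sketched in userspace as follows (a minimal sketch, not the kernel code: the names, the fixed vCPU count, and the use of trylock-with-rollback for the per-vCPU mutexes are illustrative assumptions):

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define NR_VCPUS 4

static pthread_mutex_t vm_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vcpu_lock[NR_VCPUS] = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
};

/* Take the VM-wide lock, then every vCPU mutex in index order, then
 * slots_lock last.  The fixed acquisition order is what keeps multiple
 * racing callers deadlock-free. */
static int acquire_vm_state_locks(void)
{
	pthread_mutex_lock(&vm_lock);
	for (int i = 0; i < NR_VCPUS; i++) {
		if (pthread_mutex_trylock(&vcpu_lock[i])) {
			/* Contended: roll back in reverse order. */
			while (i--)
				pthread_mutex_unlock(&vcpu_lock[i]);
			pthread_mutex_unlock(&vm_lock);
			return -EBUSY;
		}
	}
	/* slots_lock comes *after* the vCPU mutexes, matching the
	 * ordering rule called out in the changelog. */
	pthread_mutex_lock(&slots_lock);
	return 0;
}

/* Release in strict reverse order of acquisition. */
static void release_vm_state_locks(void)
{
	pthread_mutex_unlock(&slots_lock);
	for (int i = NR_VCPUS; i--; )
		pthread_mutex_unlock(&vcpu_lock[i]);
	pthread_mutex_unlock(&vm_lock);
}
```

A caller that loses the race simply gets -EBUSY back instead of deadlocking, which mirrors how the patch bails out when vCPU creation is still in flight.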
See also commit ecf371f8b02d ("KVM: SVM: Reject SEV{-ES} intra host
migration if vCPU creation is in-flight"), which fixed a similar bug with
SEV intra-host migration where an in-flight vCPU creation could race with
a VM-wide state transition.

Define a fancy new CLASS to handle the lock+check => unlock logic with
guard()-like syntax:

	CLASS(tdx_vm_state_guard, guard)(kvm);
	if (IS_ERR(guard))
		return PTR_ERR(guard);

to simplify juggling the many locks.

Note!  Take kvm->slots_lock *after* all vcpu->mutex locks, as per KVM's
soon-to-be-documented lock ordering rules[1].

Link: https://lore.kernel.org/all/20251016235538.171962-1-seanjc@google.com [1]
Reported-by: Yan Zhao
Closes: https://lore.kernel.org/all/aLFiPq1smdzN3Ary@yzhao56-desk.sh.intel.com
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 59 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 57dfddd2a6cf..8bcdec049ac6 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2653,6 +2653,46 @@ static int tdx_read_cpuid(struct kvm_vcpu *vcpu, u32 leaf, u32 sub_leaf,
 	return -EIO;
 }
 
+typedef void *tdx_vm_state_guard_t;
+
+static tdx_vm_state_guard_t tdx_acquire_vm_state_locks(struct kvm *kvm)
+{
+	int r;
+
+	mutex_lock(&kvm->lock);
+
+	if (kvm->created_vcpus != atomic_read(&kvm->online_vcpus)) {
+		r = -EBUSY;
+		goto out_err;
+	}
+
+	r = kvm_lock_all_vcpus(kvm);
+	if (r)
+		goto out_err;
+
+	/*
+	 * Note the unintuitive ordering!  vcpu->mutex must be taken outside
+	 * kvm->slots_lock!
+	 */
+	mutex_lock(&kvm->slots_lock);
+	return kvm;
+
+out_err:
+	mutex_unlock(&kvm->lock);
+	return ERR_PTR(r);
+}
+
+static void tdx_release_vm_state_locks(struct kvm *kvm)
+{
+	mutex_unlock(&kvm->slots_lock);
+	kvm_unlock_all_vcpus(kvm);
+	mutex_unlock(&kvm->lock);
+}
+
+DEFINE_CLASS(tdx_vm_state_guard, tdx_vm_state_guard_t,
+	     if (!IS_ERR(_T)) tdx_release_vm_state_locks(_T),
+	     tdx_acquire_vm_state_locks(kvm), struct kvm *kvm);
+
 static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 {
 	struct kvm_tdx_init_vm __user *user_data = u64_to_user_ptr(cmd->data);
@@ -2764,8 +2804,6 @@ static int tdx_td_finalize(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 
-	guard(mutex)(&kvm->slots_lock);
-
 	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
 
@@ -2809,7 +2847,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 	if (tdx_cmd.id == KVM_TDX_CAPABILITIES)
 		return tdx_get_capabilities(&tdx_cmd);
 
-	guard(mutex)(&kvm->lock);
+	CLASS(tdx_vm_state_guard, guard)(kvm);
+	if (IS_ERR(guard))
+		return PTR_ERR(guard);
 
 	switch (tdx_cmd.id) {
 	case KVM_TDX_INIT_VM:
@@ -3113,8 +3153,6 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
 	if (tdx->state != VCPU_TD_STATE_INITIALIZED)
 		return -EINVAL;
 
-	guard(mutex)(&kvm->slots_lock);
-
 	/* Once TD is finalized, the initial guest memory is fixed.
	 */
 	if (kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
@@ -3170,7 +3208,8 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
 
 int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 {
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 	struct kvm_tdx_cmd cmd;
 	int r;
 
@@ -3178,12 +3217,13 @@ int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 	if (r)
 		return r;
 
+	CLASS(tdx_vm_state_guard, guard)(kvm);
+	if (IS_ERR(guard))
+		return PTR_ERR(guard);
+
 	if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE)
 		return -EINVAL;
 
-	if (mutex_lock_killable(&vcpu->mutex))
-		return -EINTR;
-
 	vcpu_load(vcpu);
 
 	switch (cmd.id) {
@@ -3200,7 +3240,6 @@ int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 
 	vcpu_put(vcpu);
 
-	mutex_unlock(&vcpu->mutex);
 	return r;
 }
 
-- 
2.51.1.930.gacf6e81ea2-goog
Message-ID: <20251030200951.3402865-28-seanjc@google.com>
Date: Thu, 30 Oct 2025 13:09:50 -0700
Subject: [PATCH v4 27/28] KVM: TDX: Bug the VM if extending the initial measurement fails
From: Sean Christopherson

WARN and terminate the VM if TDH_MR_EXTEND fails, as extending the
measurement should fail if and only if there is a KVM bug, or if the
S-EPT mapping is invalid.  Now that KVM makes all state transitions
mutually exclusive via tdx_vm_state_guard, it should be impossible for
S-EPT mappings to be removed between kvm_tdp_mmu_map_private_pfn() and
tdh_mr_extend().

Holding slots_lock prevents zaps due to memslot updates,
filemap_invalidate_lock() prevents zaps due to guest_memfd PUNCH_HOLE,
vcpu->mutex locks prevent updates from other vCPUs, kvm->lock prevents
VM-scoped ioctls from creating havoc (e.g. by creating new vCPUs), and
all usage of kvm_zap_gfn_range() is mutually exclusive with S-EPT entries
that can be used for the initial image.

For kvm_zap_gfn_range(), the call from sev.c is obviously mutually
exclusive, TDX disallows KVM_X86_QUIRK_IGNORE_GUEST_PAT so the same goes
for kvm_noncoherent_dma_assignment_start_or_stop(), and
__kvm_set_or_clear_apicv_inhibit() is blocked by virtue of holding all VM
and vCPU mutexes (and the APIC page has its own non-guest_memfd memslot
and so can't be used for the initial image, which means that too is
mutually exclusive irrespective of locking).

Opportunistically return early if the region doesn't need to be measured
in order to reduce line lengths and avoid wraps.  Similarly, immediately
and explicitly return if TDH_MR_EXTEND fails to make it clear that KVM
needs to bail entirely if extending the measurement fails.
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 8bcdec049ac6..762f2896547f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3123,21 +3123,23 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 
 	put_page(src_page);
 
-	if (ret)
+	if (ret || !(arg->flags & KVM_TDX_MEASURE_MEMORY_REGION))
 		return ret;
 
-	if (arg->flags & KVM_TDX_MEASURE_MEMORY_REGION) {
-		for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
-			err = tdh_mr_extend(&kvm_tdx->td, gpa + i, &entry,
-					    &level_state);
-			if (err) {
-				ret = -EIO;
-				break;
-			}
-		}
+	/*
+	 * Note, MR.EXTEND can fail if the S-EPT mapping is somehow removed
+	 * between mapping the pfn and now, but slots_lock prevents memslot
+	 * updates, filemap_invalidate_lock() prevents guest_memfd updates,
+	 * mmu_notifier events can't reach S-EPT entries, and KVM's internal
+	 * zapping flows are mutually exclusive with S-EPT mappings.
+	 */
+	for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
+		err = tdh_mr_extend(&kvm_tdx->td, gpa + i, &entry, &level_state);
+		if (TDX_BUG_ON_2(err, TDH_MR_EXTEND, entry, level_state, kvm))
+			return -EIO;
 	}
 
-	return ret;
+	return 0;
 }
 
 static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd)
-- 
2.51.1.930.gacf6e81ea2-goog
Message-ID: <20251030200951.3402865-29-seanjc@google.com>
Date: Thu, 30 Oct 2025 13:09:51 -0700
Subject: [PATCH v4 28/28] KVM: TDX: Fix list_add corruption during vcpu_load()
From: Sean Christopherson

From: Yan Zhao

During vCPU creation, a vCPU may be destroyed immediately after
kvm_arch_vcpu_create() (e.g., due to a vCPU id conflict).  However, the
vcpu_load() inside kvm_arch_vcpu_create() may have associated the vCPU
with a pCPU via "list_add(&tdx->cpu_list, &per_cpu(associated_tdvcpus, cpu))"
before tdx_vcpu_free() is invoked.
Though there's no need to invoke tdh_vp_flush() on such a vCPU, failing
to dissociate the vCPU from its pCPU (i.e., "list_del(&to_tdx(vcpu)->cpu_list)")
will corrupt the per-pCPU list associated_tdvcpus.  A later list_add()
during vcpu_load() would then detect the list corruption and print the
calltrace shown below.

Dissociate a vCPU from its associated pCPU in tdx_vcpu_free() for vCPUs
destroyed immediately after creation, which must be in the
VCPU_TD_STATE_UNINITIALIZED state.

  kernel BUG at lib/list_debug.c:29!
  Oops: invalid opcode: 0000 [#2] SMP NOPTI
  RIP: 0010:__list_add_valid_or_report+0x82/0xd0
  Call Trace:
   tdx_vcpu_load+0xa8/0x120
   vt_vcpu_load+0x25/0x30
   kvm_arch_vcpu_load+0x81/0x300
   vcpu_load+0x55/0x90
   kvm_arch_vcpu_create+0x24f/0x330
   kvm_vm_ioctl_create_vcpu+0x1b1/0x53
   kvm_vm_ioctl+0xc2/0xa60
   __x64_sys_ioctl+0x9a/0xf0
   x64_sys_call+0x10ee/0x20d0
   do_syscall_64+0xc3/0x470
   entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: d789fa6efac9 ("KVM: TDX: Handle vCPU dissociation")
Signed-off-by: Yan Zhao
Signed-off-by: Sean Christopherson
Reviewed-by: Kai Huang
Reviewed-by: Yan Zhao
Tested-by: Kai Huang
Tested-by: Yan Zhao
---
 arch/x86/kvm/vmx/tdx.c | 43 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 762f2896547f..db7ac7955ca1 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -843,19 +843,52 @@ void tdx_vcpu_put(struct kvm_vcpu *vcpu)
 	tdx_prepare_switch_to_host(vcpu);
 }
 
+/*
+ * Life cycles for a TD and a vCPU:
+ * 1. KVM_CREATE_VM ioctl.
+ *    TD state is TD_STATE_UNINITIALIZED.
+ *    hkid is not assigned at this stage.
+ * 2. KVM_TDX_INIT_VM ioctl.
+ *    TD transitions to TD_STATE_INITIALIZED.
+ *    hkid is assigned after this stage.
+ * 3. KVM_CREATE_VCPU ioctl. (only when TD is TD_STATE_INITIALIZED).
+ *    3.1 tdx_vcpu_create() transitions vCPU state to VCPU_TD_STATE_UNINITIALIZED.
+ *    3.2 vcpu_load() and vcpu_put() in kvm_arch_vcpu_create().
+ *    3.3 (conditional) if any error encountered after kvm_arch_vcpu_create()
+ *        kvm_arch_vcpu_destroy() --> tdx_vcpu_free().
+ * 4. KVM_TDX_INIT_VCPU ioctl.
+ *    tdx_vcpu_init() transitions vCPU state to VCPU_TD_STATE_INITIALIZED.
+ *    vCPU control structures are allocated at this stage.
+ * 5. kvm_destroy_vm().
+ *    5.1 tdx_mmu_release_hkid(): (1) tdh_vp_flush(), disassociates all vCPUs.
+ *                                (2) puts hkid to !assigned state.
+ *    5.2 kvm_destroy_vcpus() --> tdx_vcpu_free():
+ *        transitions vCPU to VCPU_TD_STATE_UNINITIALIZED state.
+ *    5.3 tdx_vm_destroy()
+ *        transitions TD to TD_STATE_UNINITIALIZED state.
+ *
+ * tdx_vcpu_free() can be invoked only at 3.3 or 5.2.
+ * - If at 3.3, hkid is still assigned, but the vCPU must be in
+ *   VCPU_TD_STATE_UNINITIALIZED state.
+ * - if at 5.2, hkid must be !assigned and all vCPUs must be in
+ *   VCPU_TD_STATE_INITIALIZED state and have been dissociated.
+ */
 void tdx_vcpu_free(struct kvm_vcpu *vcpu)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
 	int i;
 
+	if (vcpu->cpu != -1) {
+		KVM_BUG_ON(tdx->state == VCPU_TD_STATE_INITIALIZED, vcpu->kvm);
+		tdx_flush_vp_on_cpu(vcpu);
+		return;
+	}
+
 	/*
 	 * It is not possible to reclaim pages while hkid is assigned.  It might
-	 * be assigned if:
-	 * 1. the TD VM is being destroyed but freeing hkid failed, in which
-	 *    case the pages are leaked
-	 * 2. TD VCPU creation failed and this on the error path, in which case
-	 *    there is nothing to do anyway
+	 * be assigned if the TD VM is being destroyed but freeing hkid failed,
+	 * in which case the pages are leaked.
 	 */
 	if (is_hkid_assigned(kvm_tdx))
 		return;
-- 
2.51.1.930.gacf6e81ea2-goog