From: Sagi Shahar
Date: Fri, 7 Apr 2023 20:19:17 +0000
Subject: [RFC PATCH 1/5] KVM: Split tdp_mmu_pages to private and shared lists
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Sean Christopherson, Paolo Bonzini, Isaku Yamahata, Erdem Aktas, David Matlack, Kai Huang, Zhi Wang, Chao Peng, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Sagi Shahar
Message-ID: <20230407201921.2703758-2-sagis@google.com>
In-Reply-To: <20230407201921.2703758-1-sagis@google.com>
References: <20230407201921.2703758-1-sagis@google.com>
charset="utf-8" tdp_mmu_pages holds all the active pages used by the mmu. When we transfer the state during intra-host migration we need to transfer the private pages but not the shared ones. Keeping them in separate counters makes this transfer more efficient. Signed-off-by: Sagi Shahar --- arch/x86/include/asm/kvm_host.h | 5 ++++- arch/x86/kvm/mmu/tdp_mmu.c | 11 +++++++++-- 2 files changed, 13 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index ae377eec81987..5ed70cd9d74bf 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1426,9 +1426,12 @@ struct kvm_arch { struct task_struct *nx_huge_page_recovery_thread; =20 #ifdef CONFIG_X86_64 - /* The number of TDP MMU pages across all roots. */ + /* The number of non-private TDP MMU pages across all roots. */ atomic64_t tdp_mmu_pages; =20 + /* Same as tdp_mmu_pages but only for private pages. */ + atomic64_t tdp_private_mmu_pages; + /* * List of struct kvm_mmu_pages being used as roots. * All struct kvm_mmu_pages in the list should have diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 58a236a69ec72..327dee4f6170e 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -44,6 +44,7 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm) destroy_workqueue(kvm->arch.tdp_mmu_zap_wq); =20 WARN_ON(atomic64_read(&kvm->arch.tdp_mmu_pages)); + WARN_ON(atomic64_read(&kvm->arch.tdp_private_mmu_pages)); WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots)); =20 /* @@ -373,13 +374,19 @@ static void handle_changed_spte_dirty_log(struct kvm = *kvm, int as_id, gfn_t gfn, static void tdp_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) { kvm_account_pgtable_pages((void *)sp->spt, +1); - atomic64_inc(&kvm->arch.tdp_mmu_pages); + if (is_private_sp(sp)) + atomic64_inc(&kvm->arch.tdp_private_mmu_pages); + else + atomic64_inc(&kvm->arch.tdp_mmu_pages); } =20 static void tdp_unaccount_mmu_page(struct kvm *kvm, struct kvm_mmu_page *s= p) { kvm_account_pgtable_pages((void *)sp->spt, -1); - atomic64_dec(&kvm->arch.tdp_mmu_pages); + if (is_private_sp(sp)) + atomic64_dec(&kvm->arch.tdp_private_mmu_pages); + else + atomic64_dec(&kvm->arch.tdp_mmu_pages); } =20 /** --=20 2.40.0.348.gf938b09366-goog From nobody Wed Feb 11 14:04:38 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2498DC77B6C for ; Fri, 7 Apr 2023 20:20:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230467AbjDGUUK (ORCPT ); Fri, 7 Apr 2023 16:20:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45520 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230394AbjDGUUE (ORCPT ); Fri, 7 Apr 2023 16:20:04 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B5F28C65C for ; Fri, 7 Apr 2023 13:20:01 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-54196bfcd5fso422923687b3.4 for ; Fri, 07 Apr 2023 13:20:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1680898801; x=1683490801; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; 

From: Sagi Shahar
Date: Fri, 7 Apr 2023 20:19:18 +0000
Subject: [RFC PATCH 2/5] KVM: SEV: Refactor common code out of sev_vm_move_enc_context_from
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Sean Christopherson, Paolo Bonzini, Isaku Yamahata, Erdem Aktas, David Matlack, Kai Huang, Zhi Wang, Chao Peng, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Sagi Shahar
Message-ID: <20230407201921.2703758-3-sagis@google.com>
In-Reply-To: <20230407201921.2703758-1-sagis@google.com>
References: <20230407201921.2703758-1-sagis@google.com>

Both SEV and TDX are going to use similar flows for intra-host migration.
Move the code that will be shared by both architectures out of sev.c and
into common helpers in x86.c, exposed through x86.h.

Signed-off-by: Sagi Shahar
---
 arch/x86/kvm/svm/sev.c | 175 +++++------------------------------------
 arch/x86/kvm/x86.c     | 166 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.h     |  16 ++++
 3 files changed, 201 insertions(+), 156 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c25aeb550cd97..18831a0b7734e 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1553,116 +1553,6 @@ static bool is_cmd_allowed_from_mirror(u32 cmd_id)
 	return false;
 }
 
-static int sev_lock_two_vms(struct kvm *dst_kvm, struct kvm *src_kvm)
-{
-	struct kvm_sev_info *dst_sev = &to_kvm_svm(dst_kvm)->sev_info;
-	struct kvm_sev_info *src_sev = &to_kvm_svm(src_kvm)->sev_info;
-	int r = -EBUSY;
-
-	if (dst_kvm == src_kvm)
-		return -EINVAL;
-
-	/*
-	 * Bail if these VMs are already involved in a migration to avoid
-	 * deadlock between two VMs trying to migrate to/from each other.
-	 */
-	if (atomic_cmpxchg_acquire(&dst_sev->migration_in_progress, 0, 1))
-		return -EBUSY;
-
-	if (atomic_cmpxchg_acquire(&src_sev->migration_in_progress, 0, 1))
-		goto release_dst;
-
-	r = -EINTR;
-	if (mutex_lock_killable(&dst_kvm->lock))
-		goto release_src;
-	if (mutex_lock_killable_nested(&src_kvm->lock, SINGLE_DEPTH_NESTING))
-		goto unlock_dst;
-	return 0;
-
-unlock_dst:
-	mutex_unlock(&dst_kvm->lock);
-release_src:
-	atomic_set_release(&src_sev->migration_in_progress, 0);
-release_dst:
-	atomic_set_release(&dst_sev->migration_in_progress, 0);
-	return r;
-}
-
-static void sev_unlock_two_vms(struct kvm *dst_kvm, struct kvm *src_kvm)
-{
-	struct kvm_sev_info *dst_sev = &to_kvm_svm(dst_kvm)->sev_info;
-	struct kvm_sev_info *src_sev = &to_kvm_svm(src_kvm)->sev_info;
-
-	mutex_unlock(&dst_kvm->lock);
-	mutex_unlock(&src_kvm->lock);
-	atomic_set_release(&dst_sev->migration_in_progress, 0);
-	atomic_set_release(&src_sev->migration_in_progress, 0);
-}
-
-/* vCPU mutex subclasses. */
-enum sev_migration_role {
-	SEV_MIGRATION_SOURCE = 0,
-	SEV_MIGRATION_TARGET,
-	SEV_NR_MIGRATION_ROLES,
-};
-
-static int sev_lock_vcpus_for_migration(struct kvm *kvm,
-					enum sev_migration_role role)
-{
-	struct kvm_vcpu *vcpu;
-	unsigned long i, j;
-
-	kvm_for_each_vcpu(i, vcpu, kvm) {
-		if (mutex_lock_killable_nested(&vcpu->mutex, role))
-			goto out_unlock;
-
-#ifdef CONFIG_PROVE_LOCKING
-		if (!i)
-			/*
-			 * Reset the role to one that avoids colliding with
-			 * the role used for the first vcpu mutex.
-			 */
-			role = SEV_NR_MIGRATION_ROLES;
-		else
-			mutex_release(&vcpu->mutex.dep_map, _THIS_IP_);
-#endif
-	}
-
-	return 0;
-
-out_unlock:
-
-	kvm_for_each_vcpu(j, vcpu, kvm) {
-		if (i == j)
-			break;
-
-#ifdef CONFIG_PROVE_LOCKING
-		if (j)
-			mutex_acquire(&vcpu->mutex.dep_map, role, 0, _THIS_IP_);
-#endif
-
-		mutex_unlock(&vcpu->mutex);
-	}
-	return -EINTR;
-}
-
-static void sev_unlock_vcpus_for_migration(struct kvm *kvm)
-{
-	struct kvm_vcpu *vcpu;
-	unsigned long i;
-	bool first = true;
-
-	kvm_for_each_vcpu(i, vcpu, kvm) {
-		if (first)
-			first = false;
-		else
-			mutex_acquire(&vcpu->mutex.dep_map,
-				      SEV_NR_MIGRATION_ROLES, 0, _THIS_IP_);
-
-		mutex_unlock(&vcpu->mutex);
-	}
-}
-
 static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
 {
 	struct kvm_sev_info *dst = &to_kvm_svm(dst_kvm)->sev_info;
@@ -1744,25 +1634,6 @@ static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
 	}
 }
 
-static int sev_check_source_vcpus(struct kvm *dst, struct kvm *src)
-{
-	struct kvm_vcpu *src_vcpu;
-	unsigned long i;
-
-	if (!sev_es_guest(src))
-		return 0;
-
-	if (atomic_read(&src->online_vcpus) != atomic_read(&dst->online_vcpus))
-		return -EINVAL;
-
-	kvm_for_each_vcpu(i, src_vcpu, src) {
-		if (!src_vcpu->arch.guest_state_protected)
-			return -EINVAL;
-	}
-
-	return 0;
-}
-
 int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 {
 	struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
@@ -1777,19 +1648,20 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 		ret = -EBADF;
 		goto out_fput;
 	}
-
 	source_kvm = source_kvm_file->private_data;
-	ret = sev_lock_two_vms(kvm, source_kvm);
+	src_sev = &to_kvm_svm(source_kvm)->sev_info;
+
+	ret = pre_move_enc_context_from(kvm, source_kvm,
+					&dst_sev->migration_in_progress,
+					&src_sev->migration_in_progress);
 	if (ret)
 		goto out_fput;
 
-	if (sev_guest(kvm) || !sev_guest(source_kvm)) {
+	if (sev_guest(kvm) || !sev_es_guest(source_kvm)) {
 		ret = -EINVAL;
-		goto out_unlock;
+		goto out_post;
 	}
 
-	src_sev = &to_kvm_svm(source_kvm)->sev_info;
-
 	dst_sev->misc_cg = get_current_misc_cg();
 	cg_cleanup_sev = dst_sev;
 	if (dst_sev->misc_cg != src_sev->misc_cg) {
@@ -1799,34 +1671,21 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 		charged = true;
 	}
 
-	ret = sev_lock_vcpus_for_migration(kvm, SEV_MIGRATION_SOURCE);
-	if (ret)
-		goto out_dst_cgroup;
-	ret = sev_lock_vcpus_for_migration(source_kvm, SEV_MIGRATION_TARGET);
-	if (ret)
-		goto out_dst_vcpu;
-
-	ret = sev_check_source_vcpus(kvm, source_kvm);
-	if (ret)
-		goto out_source_vcpu;
-
 	sev_migrate_from(kvm, source_kvm);
 	kvm_vm_dead(source_kvm);
 	cg_cleanup_sev = src_sev;
 	ret = 0;
 
-out_source_vcpu:
-	sev_unlock_vcpus_for_migration(source_kvm);
-out_dst_vcpu:
-	sev_unlock_vcpus_for_migration(kvm);
 out_dst_cgroup:
 	/* Operates on the source on success, on the destination on failure. */
 	if (charged)
 		sev_misc_cg_uncharge(cg_cleanup_sev);
 	put_misc_cg(cg_cleanup_sev->misc_cg);
 	cg_cleanup_sev->misc_cg = NULL;
-out_unlock:
-	sev_unlock_two_vms(kvm, source_kvm);
+out_post:
+	post_move_enc_context_from(kvm, source_kvm,
+				   &dst_sev->migration_in_progress,
+				   &src_sev->migration_in_progress);
 out_fput:
 	if (source_kvm_file)
 		fput(source_kvm_file);
@@ -2058,7 +1917,11 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 	}
 
 	source_kvm = source_kvm_file->private_data;
-	ret = sev_lock_two_vms(kvm, source_kvm);
+	source_sev = &to_kvm_svm(source_kvm)->sev_info;
+	mirror_sev = &to_kvm_svm(kvm)->sev_info;
+	ret = lock_two_vms_for_migration(kvm, source_kvm,
+					 &mirror_sev->migration_in_progress,
+					 &source_sev->migration_in_progress);
 	if (ret)
 		goto e_source_fput;
 
@@ -2078,9 +1941,7 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 	 * The mirror kvm holds an enc_context_owner ref so its asid can't
 	 * disappear until we're done with it
 	 */
-	source_sev = &to_kvm_svm(source_kvm)->sev_info;
 	kvm_get_kvm(source_kvm);
-	mirror_sev = &to_kvm_svm(kvm)->sev_info;
 	list_add_tail(&mirror_sev->mirror_entry, &source_sev->mirror_vms);
 
 	/* Set enc_context_owner and copy its encryption context over */
@@ -2101,7 +1962,9 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
 	 */
 
 e_unlock:
-	sev_unlock_two_vms(kvm, source_kvm);
+	unlock_two_vms_for_migration(kvm, source_kvm,
+				     &mirror_sev->migration_in_progress,
+				     &source_sev->migration_in_progress);
 e_source_fput:
 	if (source_kvm_file)
 		fput(source_kvm_file);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 870041887ed91..865c434a94899 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13596,6 +13596,172 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
 }
 EXPORT_SYMBOL_GPL(kvm_sev_es_string_io);
 
+/* vCPU mutex subclasses. */
+enum migration_role {
+	MIGRATION_SOURCE = 0,
+	MIGRATION_TARGET,
+	NR_MIGRATION_ROLES,
+};
+
+static int lock_vcpus_for_migration(struct kvm *kvm, enum migration_role role)
+{
+	struct kvm_vcpu *vcpu;
+	unsigned long i, j;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		if (mutex_lock_killable_nested(&vcpu->mutex, role))
+			goto out_unlock;
+
+#ifdef CONFIG_PROVE_LOCKING
+		if (!i)
+			/*
+			 * Reset the role to one that avoids colliding with
+			 * the role used for the first vcpu mutex.
+			 */
+			role = NR_MIGRATION_ROLES;
+		else
+			mutex_release(&vcpu->mutex.dep_map, _THIS_IP_);
+#endif
+	}
+
+	return 0;
+
+out_unlock:
+
+	kvm_for_each_vcpu(j, vcpu, kvm) {
+		if (i == j)
+			break;
+
+#ifdef CONFIG_PROVE_LOCKING
+		if (j)
+			mutex_acquire(&vcpu->mutex.dep_map, role, 0, _THIS_IP_);
+#endif
+
+		mutex_unlock(&vcpu->mutex);
+	}
+	return -EINTR;
+}
+
+static void unlock_vcpus_for_migration(struct kvm *kvm)
+{
+	struct kvm_vcpu *vcpu;
+	unsigned long i;
+	bool first = true;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		if (first)
+			first = false;
+		else
+			mutex_acquire(&vcpu->mutex.dep_map, NR_MIGRATION_ROLES,
+				      0, _THIS_IP_);
+
+		mutex_unlock(&vcpu->mutex);
+	}
+}
+
+int lock_two_vms_for_migration(struct kvm *dst_kvm, struct kvm *src_kvm,
+			       atomic_t *dst_migration_in_progress,
+			       atomic_t *src_migration_in_progress)
+{
+	int r = -EBUSY;
+
+	if (dst_kvm == src_kvm)
+		return -EINVAL;
+
+	/*
+	 * Bail if these VMs are already involved in a migration to avoid
+	 * deadlock between two VMs trying to migrate to/from each other.
+	 */
+	if (atomic_cmpxchg_acquire(dst_migration_in_progress, 0, 1))
+		return -EBUSY;
+
+	if (atomic_cmpxchg_acquire(src_migration_in_progress, 0, 1))
+		goto release_dst;
+
+	r = -EINTR;
+	if (mutex_lock_killable(&dst_kvm->lock))
+		goto release_src;
+	if (mutex_lock_killable_nested(&src_kvm->lock, SINGLE_DEPTH_NESTING))
+		goto unlock_dst;
+	return 0;
+
+unlock_dst:
+	mutex_unlock(&dst_kvm->lock);
+release_src:
+	atomic_set_release(src_migration_in_progress, 0);
+release_dst:
+	atomic_set_release(dst_migration_in_progress, 0);
+	return r;
+}
+EXPORT_SYMBOL_GPL(lock_two_vms_for_migration);
+
+void unlock_two_vms_for_migration(struct kvm *dst_kvm, struct kvm *src_kvm,
+				  atomic_t *dst_migration_in_progress,
+				  atomic_t *src_migration_in_progress)
+{
+	mutex_unlock(&dst_kvm->lock);
+	mutex_unlock(&src_kvm->lock);
+	atomic_set_release(dst_migration_in_progress, 0);
+	atomic_set_release(src_migration_in_progress, 0);
+}
+EXPORT_SYMBOL_GPL(unlock_two_vms_for_migration);
+
+int pre_move_enc_context_from(struct kvm *dst_kvm, struct kvm *src_kvm,
+			      atomic_t *dst_migration_in_progress,
+			      atomic_t *src_migration_in_progress)
+{
+	struct kvm_vcpu *src_vcpu;
+	unsigned long i;
+	int ret = -EINVAL;
+
+	ret = lock_two_vms_for_migration(dst_kvm, src_kvm,
+					 dst_migration_in_progress,
+					 src_migration_in_progress);
+	if (ret)
+		return ret;
+
+	ret = lock_vcpus_for_migration(dst_kvm, MIGRATION_TARGET);
+	if (ret)
+		goto unlock_vms;
+
+	ret = lock_vcpus_for_migration(src_kvm, MIGRATION_SOURCE);
+	if (ret)
+		goto unlock_dst_vcpu;
+
+	if (atomic_read(&dst_kvm->online_vcpus) !=
+	    atomic_read(&src_kvm->online_vcpus))
+		goto unlock_dst_vcpu;
+
+	kvm_for_each_vcpu(i, src_vcpu, src_kvm) {
+		if (!src_vcpu->arch.guest_state_protected)
+			goto unlock_dst_vcpu;
+	}
+
+	return 0;
+
+unlock_dst_vcpu:
+	unlock_vcpus_for_migration(dst_kvm);
unlock_vms:
+	unlock_two_vms_for_migration(dst_kvm, src_kvm,
+				     dst_migration_in_progress,
+				     src_migration_in_progress);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pre_move_enc_context_from);
+
+void post_move_enc_context_from(struct kvm *dst_kvm, struct kvm *src_kvm,
+				atomic_t *dst_migration_in_progress,
+				atomic_t *src_migration_in_progress)
+{
+	unlock_vcpus_for_migration(src_kvm);
+	unlock_vcpus_for_migration(dst_kvm);
+	unlock_two_vms_for_migration(dst_kvm, src_kvm,
+				     dst_migration_in_progress,
+				     src_migration_in_progress);
+}
+EXPORT_SYMBOL_GPL(post_move_enc_context_from);
+
 bool kvm_arch_dirty_log_supported(struct kvm *kvm)
 {
 	return kvm->arch.vm_type != KVM_X86_PROTECTED_VM;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 33a1a5341e788..554c797184994 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -502,4 +502,20 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
 			 unsigned int port, void *data, unsigned int count,
 			 int in);
 
+int lock_two_vms_for_migration(struct kvm *dst_kvm, struct kvm *src_kvm,
+			       atomic_t *dst_migration_in_progress,
+			       atomic_t *src_migration_in_progress);
+
+void unlock_two_vms_for_migration(struct kvm *dst_kvm, struct kvm *src_kvm,
+				  atomic_t *dst_migration_in_progress,
+				  atomic_t *src_migration_in_progress);
+
+int pre_move_enc_context_from(struct kvm *dst_kvm, struct kvm *src_kvm,
+			      atomic_t *dst_migration_in_progress,
+			      atomic_t *src_migration_in_progress);
+
+void post_move_enc_context_from(struct kvm *dst_kvm, struct kvm *src_kvm,
+				atomic_t *dst_migration_in_progress,
+				atomic_t *src_migration_in_progress);
+
 #endif
-- 
2.40.0.348.gf938b09366-goog
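
The refactored helpers preserve the SEV locking protocol: claim both VMs'
migration_in_progress flags via cmpxchg before taking either kvm->lock,
always lock the destination before the source, and unwind in reverse order
on failure. Below is a rough user-space model of that protocol with
pthreads and C11 atomics; struct vm, lock_two_vms and unlock_two_vms are
illustrative names only, not kernel API:

#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/* Illustrative stand-in for a VM with a lock and an in-progress flag. */
struct vm {
	pthread_mutex_t lock;
	atomic_int migration_in_progress;
};

static int lock_two_vms(struct vm *dst, struct vm *src)
{
	int expected = 0;

	if (dst == src)
		return -EINVAL;

	/* Claim both flags first so two racing migrations cannot deadlock. */
	if (!atomic_compare_exchange_strong(&dst->migration_in_progress,
					    &expected, 1))
		return -EBUSY;
	expected = 0;
	if (!atomic_compare_exchange_strong(&src->migration_in_progress,
					    &expected, 1)) {
		atomic_store(&dst->migration_in_progress, 0);
		return -EBUSY;
	}

	/* Always lock the destination before the source. */
	pthread_mutex_lock(&dst->lock);
	pthread_mutex_lock(&src->lock);
	return 0;
}

static void unlock_two_vms(struct vm *dst, struct vm *src)
{
	pthread_mutex_unlock(&dst->lock);
	pthread_mutex_unlock(&src->lock);
	atomic_store(&dst->migration_in_progress, 0);
	atomic_store(&src->migration_in_progress, 0);
}

int main(void)
{
	struct vm a = { PTHREAD_MUTEX_INITIALIZER, 0 };
	struct vm b = { PTHREAD_MUTEX_INITIALIZER, 0 };
	int ret = lock_two_vms(&a, &b);

	if (!ret) {
		/* ... migrate encryption context from b to a here ... */
		unlock_two_vms(&a, &b);
	}
	printf("lock_two_vms returned %d\n", ret);
	return 0;
}
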

From: Sagi Shahar
Date: Fri, 7 Apr 2023 20:19:19 +0000
Subject: [RFC PATCH 3/5] KVM: TDX: Add base implementation for tdx_vm_move_enc_context_from
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Sean Christopherson, Paolo Bonzini, Isaku Yamahata, Erdem Aktas, David Matlack, Kai Huang, Zhi Wang, Chao Peng, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Sagi Shahar
Message-ID: <20230407201921.2703758-4-sagis@google.com>
In-Reply-To: <20230407201921.2703758-1-sagis@google.com>
References: <20230407201921.2703758-1-sagis@google.com>

This mostly matches the logic in sev_vm_move_enc_context_from.

Signed-off-by: Sagi Shahar
---
 arch/x86/kvm/vmx/main.c    | 10 +++++++
 arch/x86/kvm/vmx/tdx.c     | 56 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/tdx.h     |  2 ++
 arch/x86/kvm/vmx/x86_ops.h |  5 ++++
 4 files changed, 73 insertions(+)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 5b64fe5404958..9d5d0ac465bf6 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -979,6 +979,14 @@ static int vt_vcpu_mem_enc_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 	return tdx_vcpu_ioctl(vcpu, argp);
 }
 
+static int vt_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
+{
+	if (!is_td(kvm))
+		return -ENOTTY;
+
+	return tdx_vm_move_enc_context_from(kvm, source_fd);
+}
+
 #define VMX_REQUIRED_APICV_INHIBITS \
 (	\
 	BIT(APICV_INHIBIT_REASON_DISABLE)| \
@@ -1141,6 +1149,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.dev_mem_enc_ioctl = tdx_dev_ioctl,
 	.mem_enc_ioctl = vt_mem_enc_ioctl,
 	.vcpu_mem_enc_ioctl = vt_vcpu_mem_enc_ioctl,
+
+	.vm_move_enc_context_from = vt_move_enc_context_from,
 };
 
 struct kvm_x86_init_ops vt_init_ops __initdata = {
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 8af7e4e81c860..0999a6d827c99 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2826,3 +2826,59 @@ int __init tdx_init(void)
 		INIT_LIST_HEAD(&per_cpu(associated_tdvcpus, cpu));
 	return 0;
 }
+
+static __always_inline bool tdx_guest(struct kvm *kvm)
+{
+	struct kvm_tdx *tdx_kvm = to_kvm_tdx(kvm);
+
+	return tdx_kvm->finalized;
+}
+
+static int tdx_migrate_from(struct kvm *dst, struct kvm *src)
+{
+	return -EINVAL;
+}
+
+int tdx_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
+{
+	struct kvm_tdx *dst_tdx = to_kvm_tdx(kvm);
+	struct file *src_kvm_file;
+	struct kvm_tdx *src_tdx;
+	struct kvm *src_kvm;
+	int ret;
+
+	src_kvm_file = fget(source_fd);
+	if (!file_is_kvm(src_kvm_file)) {
+		ret = -EBADF;
+		goto out_fput;
+	}
+	src_kvm = src_kvm_file->private_data;
+	src_tdx = to_kvm_tdx(src_kvm);
+
+	ret = pre_move_enc_context_from(kvm, src_kvm,
+					&dst_tdx->migration_in_progress,
+					&src_tdx->migration_in_progress);
+	if (ret)
+		goto out_fput;
+
+	if (tdx_guest(kvm) || !tdx_guest(src_kvm)) {
+		ret = -EINVAL;
+		goto out_post;
+	}
+
+	ret = tdx_migrate_from(kvm, src_kvm);
+	if (ret)
+		goto out_post;
+
+	kvm_vm_dead(src_kvm);
+	ret = 0;
+
+out_post:
+	post_move_enc_context_from(kvm, src_kvm,
+				   &dst_tdx->migration_in_progress,
+				   &src_tdx->migration_in_progress);
+out_fput:
+	if (src_kvm_file)
+		fput(src_kvm_file);
+	return ret;
+}
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 71818c5001862..21b7e710be1fd 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -24,6 +24,8 @@ struct kvm_tdx {
 	atomic_t tdh_mem_track;
 
 	u64 tsc_offset;
+
+	atomic_t migration_in_progress;
 };
 
 union tdx_exit_reason {
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index d049e0c72ed0c..275f5d75e9bf1 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -187,6 +187,8 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
 void tdx_flush_tlb(struct kvm_vcpu *vcpu);
 int tdx_sept_tlb_remote_flush(struct kvm *kvm);
 void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level);
+
+int tdx_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd);
 #else
 static inline int tdx_init(void) { return 0; };
 static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return -ENOSYS; }
@@ -241,6 +243,9 @@ static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { ret
 static inline void tdx_flush_tlb(struct kvm_vcpu *vcpu) {}
 static inline int tdx_sept_tlb_remote_flush(struct kvm *kvm) { return 0; }
 static inline void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level) {}
+
+static inline int tdx_vm_move_enc_context_from(struct kvm *kvm,
+						unsigned int source_fd) { return -EOPNOTSUPP; }
 #endif
 
 #if defined(CONFIG_INTEL_TDX_HOST) && defined(CONFIG_KVM_SMM)
-- 
2.40.0.348.gf938b09366-goog
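
This patch only wires up the .vm_move_enc_context_from hook. Assuming the
existing KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM path used by SEV intra-host
migration is what ultimately reaches it, userspace would trigger the move
roughly as sketched below; dst_vm_fd and src_vm_fd are assumed to be open
KVM VM file descriptors:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Ask KVM to move the encryption context of src_vm_fd into dst_vm_fd. */
static int move_enc_context(int dst_vm_fd, int src_vm_fd)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM;
	cap.args[0] = (uint64_t)src_vm_fd;

	/* Returns 0 on success; the source VM is marked dead afterwards. */
	return ioctl(dst_vm_fd, KVM_ENABLE_CAP, &cap);
}
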

From: Sagi Shahar
Date: Fri, 7 Apr 2023 20:19:20 +0000
Subject: [RFC PATCH 4/5] KVM: TDX: Implement moving private pages between 2 TDs
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Sean Christopherson, Paolo Bonzini, Isaku Yamahata, Erdem Aktas, David Matlack, Kai Huang, Zhi Wang, Chao Peng, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Sagi Shahar
Message-ID: <20230407201921.2703758-5-sagis@google.com>
In-Reply-To: <20230407201921.2703758-1-sagis@google.com>
References: <20230407201921.2703758-1-sagis@google.com>

Add functionality for moving the private EPT table from one TD to a new
one. The new function moves the root of the private EPT table from the
source TD and overwrites the root of the destination.

Signed-off-by: Sagi Shahar
---
 arch/x86/kvm/mmu.h         |  2 +
 arch/x86/kvm/mmu/mmu.c     | 60 +++++++++++++++++++++++++++++
 arch/x86/kvm/mmu/tdp_mmu.c | 77 +++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/mmu/tdp_mmu.h |  3 ++
 4 files changed, 137 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index d10b08eeaefee..09bae7fe18a12 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -120,6 +120,8 @@ void kvm_mmu_unload(struct kvm_vcpu *vcpu);
 void kvm_mmu_free_obsolete_roots(struct kvm_vcpu *vcpu);
 void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu);
 void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu);
+int kvm_mmu_move_private_pages_from(struct kvm_vcpu *vcpu,
+				    struct kvm_vcpu *src_vcpu);
 
 static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a35f2e7f9bc70..1acc9338323da 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3789,6 +3789,66 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm)
 	return r;
 }
 
+int kvm_mmu_move_private_pages_from(struct kvm_vcpu *vcpu,
+				    struct kvm_vcpu *src_vcpu)
+{
+	struct kvm_mmu *mmu = vcpu->arch.mmu;
+	struct kvm_mmu *src_mmu = src_vcpu->arch.mmu;
+	gfn_t gfn_shared = kvm_gfn_shared_mask(vcpu->kvm);
+	hpa_t private_root_hpa, shared_root_hpa;
+	int r = -EINVAL;
+
+	// Hold locks for both src and dst. Always take the src lock first.
+	write_lock(&src_vcpu->kvm->mmu_lock);
+	write_lock(&vcpu->kvm->mmu_lock);
+
+	if (!gfn_shared)
+		goto out_unlock;
+
+	WARN_ON_ONCE(!is_tdp_mmu_active(vcpu));
+	WARN_ON_ONCE(!is_tdp_mmu_active(src_vcpu));
+
+	r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->root_role.direct);
+	if (r)
+		goto out_unlock;
+
+	/*
+	 * The private root is moved from the src to the dst and is marked as
+	 * invalid in the src.
+	 */
+	private_root_hpa = kvm_tdp_mmu_move_private_pages_from(vcpu, src_vcpu);
+	if (private_root_hpa == INVALID_PAGE) {
+		/*
+		 * This likely means that the private root was already moved by
+		 * another vCPU.
+		 */
+		private_root_hpa = kvm_tdp_mmu_get_vcpu_root_hpa_no_alloc(vcpu, true);
+		if (private_root_hpa == INVALID_PAGE) {
+			r = -EINVAL;
+			goto out_unlock;
+		}
+	}
+
+	mmu->private_root_hpa = private_root_hpa;
+	src_mmu->private_root_hpa = INVALID_PAGE;
+
+	/*
+	 * The shared root is allocated normally and is not moved from the src.
+	 */
+	shared_root_hpa = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu, false);
+	mmu->root.hpa = shared_root_hpa;
+
+	kvm_mmu_load_pgd(vcpu);
+	static_call(kvm_x86_flush_tlb_current)(vcpu);
+
+out_unlock:
+	write_unlock(&vcpu->kvm->mmu_lock);
+	write_unlock(&src_vcpu->kvm->mmu_lock);
+
+	return r;
+}
+EXPORT_SYMBOL(kvm_mmu_move_private_pages_from);
+
 static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 327dee4f6170e..685528fdc0ad6 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -296,6 +296,23 @@ static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep,
 	trace_kvm_mmu_get_page(sp, true);
 }
 
+static struct kvm_mmu_page *
+kvm_tdp_mmu_get_vcpu_root_no_alloc(struct kvm_vcpu *vcpu, union kvm_mmu_page_role role)
+{
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_mmu_page *root;
+
+	lockdep_assert_held_read(&kvm->mmu_lock);
+
+	for_each_tdp_mmu_root(kvm, root, kvm_mmu_role_as_id(role)) {
+		if (root->role.word == role.word &&
+		    kvm_tdp_mmu_get_root(root))
+			return root;
+	}
+
+	return NULL;
+}
+
 static struct kvm_mmu_page *kvm_tdp_mmu_get_vcpu_root(struct kvm_vcpu *vcpu,
 						      bool private)
 {
@@ -311,11 +328,9 @@ static struct kvm_mmu_page *kvm_tdp_mmu_get_vcpu_root(struct kvm_vcpu *vcpu,
 	 */
 	if (private)
 		kvm_mmu_page_role_set_private(&role);
-	for_each_tdp_mmu_root(kvm, root, kvm_mmu_role_as_id(role)) {
-		if (root->role.word == role.word &&
-		    kvm_tdp_mmu_get_root(root))
-			goto out;
-	}
+	root = kvm_tdp_mmu_get_vcpu_root_no_alloc(vcpu, role);
+	if (!!root)
+		goto out;
 
 	root = tdp_mmu_alloc_sp(vcpu, role);
 	tdp_mmu_init_sp(root, NULL, 0);
@@ -330,6 +345,58 @@ static struct kvm_mmu_page *kvm_tdp_mmu_get_vcpu_root(struct kvm_vcpu *vcpu,
 	return root;
 }
 
+hpa_t kvm_tdp_mmu_move_private_pages_from(struct kvm_vcpu *vcpu,
+					  struct kvm_vcpu *src_vcpu)
+{
+	union kvm_mmu_page_role role = vcpu->arch.mmu->root_role;
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm *src_kvm = src_vcpu->kvm;
+	struct kvm_mmu_page *private_root = NULL;
+	struct kvm_mmu_page *root;
+	s64 num_private_pages, old;
+
+	lockdep_assert_held_write(&vcpu->kvm->mmu_lock);
+	lockdep_assert_held_write(&src_vcpu->kvm->mmu_lock);
+
+	/* Find the private root of the source. */
+	kvm_mmu_page_role_set_private(&role);
+	for_each_tdp_mmu_root(src_kvm, root, kvm_mmu_role_as_id(role)) {
+		if (root->role.word == role.word) {
+			private_root = root;
+			break;
+		}
+	}
+	if (!private_root)
+		return INVALID_PAGE;
+
+	/* Remove the private root from the src kvm and add it to dst kvm. */
+	list_del_rcu(&private_root->link);
+	list_add_rcu(&private_root->link, &kvm->arch.tdp_mmu_roots);
+
+	num_private_pages = atomic64_read(&src_kvm->arch.tdp_private_mmu_pages);
+	old = atomic64_cmpxchg(&kvm->arch.tdp_private_mmu_pages, 0,
+			       num_private_pages);
+	/* The destination VM should have no private pages at this point. */
+	WARN_ON(old);
+	atomic64_set(&src_kvm->arch.tdp_private_mmu_pages, 0);
+
+	return __pa(private_root->spt);
+}
+
+hpa_t kvm_tdp_mmu_get_vcpu_root_hpa_no_alloc(struct kvm_vcpu *vcpu, bool private)
+{
+	struct kvm_mmu_page *root;
+	union kvm_mmu_page_role role = vcpu->arch.mmu->root_role;
+
+	if (private)
+		kvm_mmu_page_role_set_private(&role);
+	root = kvm_tdp_mmu_get_vcpu_root_no_alloc(vcpu, role);
+	if (!root)
+		return INVALID_PAGE;
+
+	return __pa(root->spt);
+}
+
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu, bool private)
 {
 	return __pa(kvm_tdp_mmu_get_vcpu_root(vcpu, private)->spt);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 3ae3c3b8642ac..0e9d38432673d 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -11,6 +11,9 @@ int kvm_mmu_init_tdp_mmu(struct kvm *kvm);
 void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm);
 
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu, bool private);
+hpa_t kvm_tdp_mmu_get_vcpu_root_hpa_no_alloc(struct kvm_vcpu *vcpu, bool private);
+hpa_t kvm_tdp_mmu_move_private_pages_from(struct kvm_vcpu *vcpu,
+					  struct kvm_vcpu *src_vcpu);
 
 __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page *root)
 {
-- 
2.40.0.348.gf938b09366-goog
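
Conceptually the move is a list splice plus a counter handoff: the private
root leaves the source VM's root list, joins the destination's, and the
private page count follows it, with the destination required to start
empty. Below is a simplified, self-contained C model of that invariant;
struct vm, struct root and move_private_root are illustrative, not the
kernel structures:

#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative model of a per-VM list of TDP roots; not kernel code. */
struct root {
	struct root *next;
	int is_private;
};

struct vm {
	struct root *roots;   /* head of the root list */
	long private_pages;   /* mirrors tdp_private_mmu_pages */
};

/* Detach the private root from src and splice it onto dst's list. */
static struct root *move_private_root(struct vm *dst, struct vm *src)
{
	struct root **pp, *found = NULL;

	for (pp = &src->roots; *pp; pp = &(*pp)->next) {
		if ((*pp)->is_private) {
			found = *pp;
			*pp = found->next;
			break;
		}
	}
	if (!found)
		return NULL;

	found->next = dst->roots;
	dst->roots = found;

	/* The destination must not already own private pages. */
	assert(dst->private_pages == 0);
	dst->private_pages = src->private_pages;
	src->private_pages = 0;
	return found;
}

int main(void)
{
	struct root shared = { NULL, 0 }, priv = { &shared, 1 };
	struct vm src = { &priv, 42 }, dst = { NULL, 0 };
	struct root *moved = move_private_root(&dst, &src);

	printf("moved=%p dst_pages=%ld src_pages=%ld\n",
	       (void *)moved, dst.private_pages, src.private_pages);
	return 0;
}
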

From: Sagi Shahar
Date: Fri, 7 Apr 2023 20:19:21 +0000
Subject: [RFC PATCH 5/5] KVM: TDX: Add core logic for TDX intra-host migration
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Sean Christopherson, Paolo Bonzini, Isaku Yamahata, Erdem Aktas, David Matlack, Kai Huang, Zhi Wang, Chao Peng, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Sagi Shahar
Message-ID: <20230407201921.2703758-6-sagis@google.com>
In-Reply-To: <20230407201921.2703758-1-sagis@google.com>
References: <20230407201921.2703758-1-sagis@google.com>

Add the core logic for transferring state between source and destination
TDs during intra-host migration.

Signed-off-by: Sagi Shahar
---
 arch/x86/kvm/vmx/tdx.c | 191 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 190 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 0999a6d827c99..05b164a91437b 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2834,9 +2834,198 @@ static __always_inline bool tdx_guest(struct kvm *kvm)
 	return tdx_kvm->finalized;
 }
 
+#define for_each_memslot_pair(memslots_1, memslots_2, memslot_iter_1, \
+			      memslot_iter_2) \
+	for (memslot_iter_1 = rb_first(&memslots_1->gfn_tree), \
+	     memslot_iter_2 = rb_first(&memslots_2->gfn_tree); \
+	     memslot_iter_1 && memslot_iter_2; \
+	     memslot_iter_1 = rb_next(memslot_iter_1), \
+	     memslot_iter_2 = rb_next(memslot_iter_2))
+
 static int tdx_migrate_from(struct kvm *dst, struct kvm *src)
 {
-	return -EINVAL;
+	struct rb_node *src_memslot_iter, *dst_memslot_iter;
+	struct vcpu_tdx *dst_tdx_vcpu, *src_tdx_vcpu;
+	struct kvm_memslots *src_slots, *dst_slots;
+	struct kvm_vcpu *dst_vcpu, *src_vcpu;
+	struct kvm_tdx *src_tdx, *dst_tdx;
+	unsigned long i, j;
+	int ret;
+
+	src_tdx = to_kvm_tdx(src);
+	dst_tdx = to_kvm_tdx(dst);
+
+	src_slots = __kvm_memslots(src, 0);
+	dst_slots = __kvm_memslots(dst, 0);
+
+	ret = -EINVAL;
+
+	if (!src_tdx->finalized) {
+		pr_warn("Cannot migrate from a non finalized VM\n");
+		goto abort;
+	}
+
+	// Traverse both memslots in gfn order and compare them
+	for_each_memslot_pair(src_slots, dst_slots, src_memslot_iter, dst_memslot_iter) {
+		struct kvm_memory_slot *src_slot, *dst_slot;
+
+		src_slot =
+			container_of(src_memslot_iter, struct kvm_memory_slot,
+				     gfn_node[src_slots->node_idx]);
+		dst_slot =
+			container_of(dst_memslot_iter, struct kvm_memory_slot,
+				     gfn_node[dst_slots->node_idx]);
+
+		if (src_slot->base_gfn != dst_slot->base_gfn ||
+		    src_slot->npages != dst_slot->npages) {
+			pr_warn("Cannot migrate between VMs with different memory slots configurations\n");
+			goto abort;
+		}
+
+		if (src_slot->flags != dst_slot->flags) {
+			pr_warn("Cannot migrate between VMs with different memory slots configurations\n");
+			goto abort;
+		}
+
+		if (src_slot->flags & KVM_MEM_PRIVATE) {
+			if (src_slot->restrictedmem.file->f_inode->i_ino !=
+			    dst_slot->restrictedmem.file->f_inode->i_ino) {
+				pr_warn("Private memslots points to different restricted files\n");
+				goto abort;
+			}
+
+			if (src_slot->restrictedmem.index != dst_slot->restrictedmem.index) {
+				pr_warn("Private memslots points to the restricted file at different offsets\n");
+				goto abort;
+			}
+		}
+	}
+
+	if (src_memslot_iter || dst_memslot_iter) {
+		pr_warn("Cannot migrate between VMs with different memory slots configurations\n");
+		goto abort;
+	}
+
+	dst_tdx->hkid = src_tdx->hkid;
+	dst_tdx->tdr_pa = src_tdx->tdr_pa;
+
+	dst_tdx->tdcs_pa = kcalloc(tdx_info.nr_tdcs_pages, sizeof(*dst_tdx->tdcs_pa),
+				   GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+	if (!dst_tdx->tdcs_pa) {
+		ret = -ENOMEM;
+		goto late_abort;
+	}
+	memcpy(dst_tdx->tdcs_pa, src_tdx->tdcs_pa,
+	       tdx_info.nr_tdcs_pages * sizeof(*dst_tdx->tdcs_pa));
+
+	dst_tdx->tsc_offset = src_tdx->tsc_offset;
+	dst_tdx->attributes = src_tdx->attributes;
+	dst_tdx->xfam = src_tdx->xfam;
+	dst_tdx->kvm.arch.gfn_shared_mask = src_tdx->kvm.arch.gfn_shared_mask;
+
+	kvm_for_each_vcpu(i, src_vcpu, src)
+		tdx_flush_vp_on_cpu(src_vcpu);
+
+	/* Copy per-vCPU state */
+	kvm_for_each_vcpu(i, src_vcpu, src) {
+		src_tdx_vcpu = to_tdx(src_vcpu);
+		dst_vcpu = kvm_get_vcpu(dst, i);
+		dst_tdx_vcpu = to_tdx(dst_vcpu);
+
+		vcpu_load(dst_vcpu);
+
+		memcpy(dst_vcpu->arch.regs, src_vcpu->arch.regs,
+		       NR_VCPU_REGS * sizeof(src_vcpu->arch.regs[0]));
+		dst_vcpu->arch.regs_avail = src_vcpu->arch.regs_avail;
+		dst_vcpu->arch.regs_dirty = src_vcpu->arch.regs_dirty;
+
+		dst_vcpu->arch.tsc_offset = dst_tdx->tsc_offset;
+
+		dst_tdx_vcpu->interrupt_disabled_hlt = src_tdx_vcpu->interrupt_disabled_hlt;
+		dst_tdx_vcpu->buggy_hlt_workaround = src_tdx_vcpu->buggy_hlt_workaround;
+
+		dst_tdx_vcpu->tdvpr_pa = src_tdx_vcpu->tdvpr_pa;
+		dst_tdx_vcpu->tdvpx_pa = kcalloc(tdx_info.nr_tdvpx_pages,
+						 sizeof(*dst_tdx_vcpu->tdvpx_pa),
+						 GFP_KERNEL_ACCOUNT);
+		if (!dst_tdx_vcpu->tdvpx_pa) {
+			ret = -ENOMEM;
+			vcpu_put(dst_vcpu);
+			goto late_abort;
+		}
+		memcpy(dst_tdx_vcpu->tdvpx_pa, src_tdx_vcpu->tdvpx_pa,
+		       tdx_info.nr_tdvpx_pages * sizeof(*dst_tdx_vcpu->tdvpx_pa));
+
+		td_vmcs_write64(dst_tdx_vcpu, POSTED_INTR_DESC_ADDR, __pa(&dst_tdx_vcpu->pi_desc));
+
+		/* Copy private EPT tables */
+		if (kvm_mmu_move_private_pages_from(dst_vcpu, src_vcpu)) {
+			ret = -EINVAL;
+			vcpu_put(dst_vcpu);
+			goto late_abort;
+		}
+
+		for (j = 0; j < tdx_info.nr_tdvpx_pages; j++)
+			src_tdx_vcpu->tdvpx_pa[j] = 0;
+
+		src_tdx_vcpu->tdvpr_pa = 0;
+
+		vcpu_put(dst_vcpu);
+	}
+
+	for_each_memslot_pair(src_slots, dst_slots, src_memslot_iter,
+			      dst_memslot_iter) {
+		struct kvm_memory_slot *src_slot, *dst_slot;
+
+		src_slot = container_of(src_memslot_iter,
+					struct kvm_memory_slot,
+					gfn_node[src_slots->node_idx]);
+		dst_slot = container_of(dst_memslot_iter,
+					struct kvm_memory_slot,
+					gfn_node[dst_slots->node_idx]);
+
+		for (i = 1; i < KVM_NR_PAGE_SIZES; ++i) {
+			unsigned long ugfn;
+			int level = i + 1;
+
+			/*
+			 * If the gfn and userspace address are not aligned wrt each other, then
+			 * large page support should already be disabled at this level.
+			 */
+			ugfn = dst_slot->userspace_addr >> PAGE_SHIFT;
+			if ((dst_slot->base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE(level) - 1))
+				continue;
+
+			dst_slot->arch.lpage_info[i - 1] =
+				src_slot->arch.lpage_info[i - 1];
+			src_slot->arch.lpage_info[i - 1] = NULL;
+		}
+	}
+
+	dst->mem_attr_array.xa_head = src->mem_attr_array.xa_head;
+	src->mem_attr_array.xa_head = NULL;
+
+	dst_tdx->finalized = true;
+
+	/* Clear source VM to avoid freeing the hkid and pages on VM put */
+	src_tdx->hkid = -1;
+	src_tdx->tdr_pa = 0;
+	for (i = 0; i < tdx_info.nr_tdcs_pages; i++)
+		src_tdx->tdcs_pa[i] = 0;
+
+	return 0;
+
+late_abort:
+	/*
+	 * If we aborted after the state transfer already started, the src VM
+	 * is no longer valid.
+	 */
+	kvm_vm_dead(src);
+
+abort:
+	dst_tdx->hkid = -1;
+	dst_tdx->tdr_pa = 0;
+
+	return ret;
 }
 
 int tdx_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
-- 
2.40.0.348.gf938b09366-goog
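
Before any state moves, tdx_migrate_from() walks the two VMs' memslots in
gfn order and insists on identical layouts. Below is a small stand-alone C
sketch of that lockstep comparison; struct slot and slots_match are
simplified stand-ins for the real memslot structures:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative, simplified memslot: just the fields the check cares about. */
struct slot {
	unsigned long base_gfn;
	unsigned long npages;
	unsigned int flags;
};

/*
 * Walk two gfn-ordered slot arrays in lockstep and require identical
 * layouts, mirroring the for_each_memslot_pair() validation above.
 */
static bool slots_match(const struct slot *a, size_t na,
			const struct slot *b, size_t nb)
{
	size_t i;

	if (na != nb)
		return false;

	for (i = 0; i < na; i++) {
		if (a[i].base_gfn != b[i].base_gfn ||
		    a[i].npages != b[i].npages ||
		    a[i].flags != b[i].flags)
			return false;
	}
	return true;
}

int main(void)
{
	struct slot src[] = { { 0x0, 0x100, 0 }, { 0x1000, 0x200, 1 } };
	struct slot dst[] = { { 0x0, 0x100, 0 }, { 0x1000, 0x200, 1 } };

	printf("layouts match: %d\n", slots_match(src, 2, dst, 2));
	return 0;
}
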