From nobody Wed Dec 17 07:59:11 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA7AAE732DB for ; Thu, 28 Sep 2023 15:05:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231262AbjI1PFh (ORCPT ); Thu, 28 Sep 2023 11:05:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47296 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231445AbjI1PFe (ORCPT ); Thu, 28 Sep 2023 11:05:34 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 50D801AC for ; Thu, 28 Sep 2023 08:04:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1695913482; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=E+PIeCSrfaW18zOkmenj4XNWXbjicG5Z3Ntw9Ymuv4Q=; b=LglO15rb2dXS/5/YirtmG2G+eEnZUXFfMfm02jlvOyzdBnd22l3FRzgNWg7N9CALHKylF9 gbbhWmi5gJLQfxhaKI7cRr5OtOfzH0TB+velCCUf9mn475efQdtyjOSwgJWhZvr7lVypKf QMF70tuMCHXaHqQdcV+qLowZrXSIO14= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-448-cZx1yAeFPPeI-OiQmsVxAA-1; Thu, 28 Sep 2023 11:04:37 -0400 X-MC-Unique: cZx1yAeFPPeI-OiQmsVxAA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 8B782858F19; Thu, 28 Sep 2023 
15:04:36 +0000 (UTC) Received: from localhost.localdomain (unknown [10.45.226.141]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0F9CF40C6E76; Thu, 28 Sep 2023 15:04:32 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Will Deacon , Borislav Petkov , Dave Hansen , Suravee Suthikulpanit , Thomas Gleixner , Paolo Bonzini , x86@kernel.org, Robin Murphy , iommu@lists.linux.dev, Ingo Molnar , Joerg Roedel , Sean Christopherson , "H. Peter Anvin" , linux-kernel@vger.kernel.org, Maxim Levitsky , stable@vger.kernel.org Subject: [PATCH 1/5] x86: KVM: SVM: fix for x2avic CVE-2023-5090 Date: Thu, 28 Sep 2023 18:04:24 +0300 Message-Id: <20230928150428.199929-2-mlevitsk@redhat.com> In-Reply-To: <20230928150428.199929-1-mlevitsk@redhat.com> References: <20230928150428.199929-1-mlevitsk@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The following problem has existed since x2AVIC support was enabled in KVM: svm_set_x2apic_msr_interception() is called to enable interception of the x2APIC MSRs. In particular, it is called when the guest resets its APIC. If the guest's APIC was in x2APIC mode, the reset brings it back to xAPIC mode. However, svm_set_x2apic_msr_interception() has an erroneous '!apic_x2apic_mode()' check, which makes it return without doing anything in this case. As a result, all x2APIC MSRs are left unintercepted, which exposes the bare-metal x2APIC (if enabled) to the guest. Oops. Remove the erroneous '!apic_x2apic_mode()' check to fix that. 
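For illustration only, the failure mode can be sketched as a standalone toy model. This is not kernel code: struct toy_vcpu and every toy_* function are hypothetical names made up for this sketch, and the real function walks the MSR list instead of flipping a single flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant vCPU state (not struct vcpu_svm). */
struct toy_vcpu {
	bool x2avic_enabled;          /* models the x2avic_enabled flag */
	bool apic_in_x2apic_mode;     /* models apic_x2apic_mode() */
	bool x2avic_msrs_intercepted; /* models svm->x2avic_msrs_intercepted */
};

/* Old logic: returns early once the APIC has left x2APIC mode. */
static void toy_set_interception_old(struct toy_vcpu *v, bool intercept)
{
	if (intercept == v->x2avic_msrs_intercepted)
		return;
	if (!v->x2avic_enabled || !v->apic_in_x2apic_mode)
		return;
	v->x2avic_msrs_intercepted = intercept;
}

/* Fixed logic: only the x2avic_enabled check remains. */
static void toy_set_interception_new(struct toy_vcpu *v, bool intercept)
{
	if (intercept == v->x2avic_msrs_intercepted)
		return;
	if (!v->x2avic_enabled)
		return;
	v->x2avic_msrs_intercepted = intercept;
}

/*
 * Model a guest that runs in x2APIC mode with the MSRs passed through and
 * then resets its APIC. Returns true if the MSRs stay unintercepted.
 */
static bool toy_reset_leaves_msrs_exposed(bool use_fix)
{
	struct toy_vcpu v = {
		.x2avic_enabled = true,
		.apic_in_x2apic_mode = true,
		.x2avic_msrs_intercepted = false,
	};

	v.apic_in_x2apic_mode = false; /* the reset drops back to xAPIC mode */
	if (use_fix)
		toy_set_interception_new(&v, true);
	else
		toy_set_interception_old(&v, true);

	return !v.x2avic_msrs_intercepted;
}
```

In this model the old check leaves the MSRs exposed after the reset, while the fixed check re-enables interception.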
Cc: stable@vger.kernel.org Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/svm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 9507df93f410a63..acdd0b89e4715a3 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -913,8 +913,7 @@ void svm_set_x2apic_msr_interception(struct vcpu_svm *s= vm, bool intercept) if (intercept =3D=3D svm->x2avic_msrs_intercepted) return; =20 - if (!x2avic_enabled || - !apic_x2apic_mode(svm->vcpu.arch.apic)) + if (!x2avic_enabled) return; =20 for (i =3D 0; i < MAX_DIRECT_ACCESS_MSRS; i++) { --=20 2.26.3 From nobody Wed Dec 17 07:59:11 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 11B04E732E4 for ; Thu, 28 Sep 2023 15:05:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231592AbjI1PFn (ORCPT ); Thu, 28 Sep 2023 11:05:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47316 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231464AbjI1PFi (ORCPT ); Thu, 28 Sep 2023 11:05:38 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5173F1A3 for ; Thu, 28 Sep 2023 08:04:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1695913489; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/zSW4L47PrK+7DJbUcadmA8oL8chw/9RJFYn/fvmJlQ=; b=Y35kKURBpMmvCUw+8sti1NrL94VIZ79XRgwzwhNLXrc9pwinAILrgzldGX51A20nqS6CwU 
iC3EdPVBWxfYvQSGQfkZ7iEiTCqIug57OjcPDJXByZkZ9VWc7nGkurV9LU0idNcnXIdCjw is9Rpvm+Dup8nme8Q4l93bT4yQv3V8s= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-627-UUxYy2WWNlqjcdm6QvOhzQ-1; Thu, 28 Sep 2023 11:04:42 -0400 X-MC-Unique: UUxYy2WWNlqjcdm6QvOhzQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 187D438145AB; Thu, 28 Sep 2023 15:04:41 +0000 (UTC) Received: from localhost.localdomain (unknown [10.45.226.141]) by smtp.corp.redhat.com (Postfix) with ESMTP id E1A0540C6E76; Thu, 28 Sep 2023 15:04:36 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Will Deacon , Borislav Petkov , Dave Hansen , Suravee Suthikulpanit , Thomas Gleixner , Paolo Bonzini , x86@kernel.org, Robin Murphy , iommu@lists.linux.dev, Ingo Molnar , Joerg Roedel , Sean Christopherson , "H. 
Peter Anvin" , linux-kernel@vger.kernel.org, Maxim Levitsky , stable@vger.kernel.org Subject: [PATCH 2/5] x86: KVM: SVM: add support for Invalid IPI Vector interception Date: Thu, 28 Sep 2023 18:04:25 +0300 Message-Id: <20230928150428.199929-3-mlevitsk@redhat.com> In-Reply-To: <20230928150428.199929-1-mlevitsk@redhat.com> References: <20230928150428.199929-1-mlevitsk@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Later revisions of AMD's APM define a new 'incomplete IPI' exit code: "Invalid IPI Vector - The vector for the specified IPI was set to an illegal value (VEC < 16)" Note that tests on a Zen2 machine show that this VM exit doesn't happen; instead, AVIC just does nothing. Add support for this exit code by doing nothing, instead of filling the kernel log with errors. Also replace the unthrottled 'pr_err()' that fires on any other unknown incomplete IPI exit with WARN_ONCE() (e.g. in case AMD adds yet another 'Invalid IPI' exit reason). Cc: Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/svm.h | 1 + arch/x86/kvm/svm/avic.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index 19bf955b67e0da0..3ac0ffc4f3e202b 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -268,6 +268,7 @@ enum avic_ipi_failure_cause { AVIC_IPI_FAILURE_TARGET_NOT_RUNNING, AVIC_IPI_FAILURE_INVALID_TARGET, AVIC_IPI_FAILURE_INVALID_BACKING_PAGE, + AVIC_IPI_FAILURE_INVALID_IPI_VECTOR, }; =20 #define AVIC_PHYSICAL_MAX_INDEX_MASK GENMASK_ULL(8, 0) diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 2092db892d7d052..c44b65af494e3ff 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -529,8 +529,11 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *= vcpu) case 
AVIC_IPI_FAILURE_INVALID_BACKING_PAGE: WARN_ONCE(1, "Invalid backing page\n"); break; + case AVIC_IPI_FAILURE_INVALID_IPI_VECTOR: + /* Invalid IPI with vector < 16 */ + break; default: - pr_err("Unknown IPI interception\n"); + WARN_ONCE(1, "Unknown avic incomplete IPI interception\n"); } =20 return 1; --=20 2.26.3 From nobody Wed Dec 17 07:59:11 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 091E3E732DB for ; Thu, 28 Sep 2023 15:05:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231607AbjI1PFq (ORCPT ); Thu, 28 Sep 2023 11:05:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48958 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231556AbjI1PFj (ORCPT ); Thu, 28 Sep 2023 11:05:39 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C32D8195 for ; Thu, 28 Sep 2023 08:04:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1695913489; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+K85/eS/kylNQpLV/CXk6IqQwJPNNCcpLU6igIUXkXY=; b=YXGNyiGkb0F/lL+tGSYsZHXrvXxE56muEtGNcWIv5r7pCi2qkZj4gHB/b+QSAhpMdzYDBS NuOke5n6Oz1r/qHQ38wFfuOIMAyU+ucG0de/B+qvA5C1yunqE65jEHfTKXBvpZHGT2kCi1 Ay4R9aW8Xe4GDHrctugCq2oTbQuEh9M= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-461-DDx6aR3LPmeCIbUCa9SEdA-1; Thu, 28 Sep 2023 11:04:45 -0400 
X-MC-Unique: DDx6aR3LPmeCIbUCa9SEdA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E8DBD3C1ACEE; Thu, 28 Sep 2023 15:04:44 +0000 (UTC) Received: from localhost.localdomain (unknown [10.45.226.141]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6E86A40C6E76; Thu, 28 Sep 2023 15:04:41 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Will Deacon , Borislav Petkov , Dave Hansen , Suravee Suthikulpanit , Thomas Gleixner , Paolo Bonzini , x86@kernel.org, Robin Murphy , iommu@lists.linux.dev, Ingo Molnar , Joerg Roedel , Sean Christopherson , "H. Peter Anvin" , linux-kernel@vger.kernel.org, Maxim Levitsky , stable@vger.kernel.org Subject: [PATCH 3/5] x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested() Date: Thu, 28 Sep 2023 18:04:26 +0300 Message-Id: <20230928150428.199929-4-mlevitsk@redhat.com> In-Reply-To: <20230928150428.199929-1-mlevitsk@redhat.com> References: <20230928150428.199929-1-mlevitsk@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" svm_leave_nested(), like a nested VM exit, takes the vCPU out of nested mode and thus should end the local inhibition of AVIC on this vCPU. Failure to do so can lead to hangs on guest reboot. Raise the KVM_REQ_APICV_UPDATE request to refresh the AVIC state of the current vCPU in this case. 
Cc: stable@vger.kernel.org Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index dd496c9e5f91f28..3fea8c47679e689 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1253,6 +1253,9 @@ void svm_leave_nested(struct kvm_vcpu *vcpu) =20 nested_svm_uninit_mmu_context(vcpu); vmcb_mark_all_dirty(svm->vmcb); + + if (kvm_apicv_activated(vcpu->kvm)) + kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu); } =20 kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); --=20 2.26.3 From nobody Wed Dec 17 07:59:11 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91049E732E3 for ; Thu, 28 Sep 2023 15:05:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231652AbjI1PFt (ORCPT ); Thu, 28 Sep 2023 11:05:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48988 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231537AbjI1PFl (ORCPT ); Thu, 28 Sep 2023 11:05:41 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D6B23F9 for ; Thu, 28 Sep 2023 08:04:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1695913494; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yyT2AmMvLLxljrFvJMSDRCezFgTSL0+kRV0V1GII1YQ=; b=ZGjROXmvUa7ySlZMcZuEgJw7jwAVZA8duIl4TkUrIfVa5rUxaFpzO5k06p2mWP0H3ue6Ot ZOo/e/2Zr3nB5/D+6vqDyzfciN9rPKlgqJOdD0ZBm6IkAs9OcjDuvuNTF6QaqqwPUkHiRF 
fCfGDMMVBZOh/ZoZLITUNfcEQ2d+bp4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-195-0K3VyQymPgyYVl7SZgYxGA-1; Thu, 28 Sep 2023 11:04:50 -0400 X-MC-Unique: 0K3VyQymPgyYVl7SZgYxGA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 95190101A53B; Thu, 28 Sep 2023 15:04:48 +0000 (UTC) Received: from localhost.localdomain (unknown [10.45.226.141]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4A8B540C6E76; Thu, 28 Sep 2023 15:04:45 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Will Deacon , Borislav Petkov , Dave Hansen , Suravee Suthikulpanit , Thomas Gleixner , Paolo Bonzini , x86@kernel.org, Robin Murphy , iommu@lists.linux.dev, Ingo Molnar , Joerg Roedel , Sean Christopherson , "H. Peter Anvin" , linux-kernel@vger.kernel.org, Maxim Levitsky Subject: [PATCH 4/5] iommu/amd: skip updating the IRTE entry when is_run is already false Date: Thu, 28 Sep 2023 18:04:27 +0300 Message-Id: <20230928150428.199929-5-mlevitsk@redhat.com> In-Reply-To: <20230928150428.199929-1-mlevitsk@redhat.com> References: <20230928150428.199929-1-mlevitsk@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" When the vCPU affinity of an IRTE that already has is_run =3D=3D false is updated, and the update also sets is_run to false, there is nothing to do. The goal of this patch is to make a call to 'amd_iommu_update_ga()' relatively cheap when there is nothing to do. 
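A minimal sketch of the intended effect, as a standalone toy model rather than the driver code: struct toy_irte, toy_update_ga() and the hw_updates counter are hypothetical, the counter stands in for the call to modify_irte_ga(), and the 'cpu >= 0' guard on the destination update is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a guest-mode IRTE (not the driver's struct). */
struct toy_irte {
	bool is_run;
	int destination;
	unsigned int hw_updates; /* counts simulated modify_irte_ga() calls */
};

static int toy_update_ga(struct toy_irte *e, int cpu, bool is_run)
{
	/* Assumption of this sketch: a negative cpu means "no new destination". */
	if (cpu >= 0)
		e->destination = cpu;

	/* The new early return: is_run is already false and stays false. */
	if (!is_run && !e->is_run)
		return 0;

	e->is_run = is_run;
	e->hw_updates++; /* stands in for modify_irte_ga() */
	return 0;
}

/* vCPU load, vCPU put, then a redundant second put. */
static unsigned int toy_count_updates(void)
{
	struct toy_irte e = { .is_run = false, .destination = 0, .hw_updates = 0 };

	toy_update_ga(&e, 3, true);   /* load: one update */
	toy_update_ga(&e, -1, false); /* put: a second update */
	toy_update_ga(&e, -1, false); /* redundant put: skipped */

	return e.hw_updates;
}
```

In this model, only the first two calls reach the simulated IOMMU update; the redundant third call returns early.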
Signed-off-by: Maxim Levitsky Reviewed-by: Joao Martins --- drivers/iommu/amd/iommu.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 95bd7c25ba6f366..10bcd436e984672 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3774,6 +3774,15 @@ int amd_iommu_update_ga(int cpu, bool is_run, void *= data) entry->hi.fields.destination =3D APICID_TO_IRTE_DEST_HI(cpu); } + + if (!is_run && !entry->lo.fields_vapic.is_run) { + /* + * No need to notify the IOMMU about an entry which + * already has is_run =3D=3D False + */ + return 0; + } + entry->lo.fields_vapic.is_run =3D is_run; =20 return modify_irte_ga(ir_data->iommu, ir_data->irq_2_irte.devid, --=20 2.26.3 From nobody Wed Dec 17 07:59:11 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9472BE732E3 for ; Thu, 28 Sep 2023 15:06:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231537AbjI1PGP (ORCPT ); Thu, 28 Sep 2023 11:06:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46998 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231653AbjI1PGK (ORCPT ); Thu, 28 Sep 2023 11:06:10 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CDD161B0 for ; Thu, 28 Sep 2023 08:05:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1695913520; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BNW1QOOJZvi4+G5BXUnncgpsE3RlMF02WmB0kRHObXg=; 
b=gy1IeaU6C5+swcBiYoJklGwsalsl8KYtL94826stvQp5t65A9mcEXu2QL3jzEY0T3n6Ew8 N5IvEKiHWvWPBYRXiWrL9P/R/JGHBic9GDSvpK8JUcsK3b636J3fz6ZJgA3uEBYd2DThAB YgxzVIx6pwpSBHbOtPK8UkSP9Vc7Ypc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-475-HDbD2tdKM2KB0zSaX0me8w-1; Thu, 28 Sep 2023 11:04:53 -0400 X-MC-Unique: HDbD2tdKM2KB0zSaX0me8w-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3B7C2101A550; Thu, 28 Sep 2023 15:04:52 +0000 (UTC) Received: from localhost.localdomain (unknown [10.45.226.141]) by smtp.corp.redhat.com (Postfix) with ESMTP id E7DEB40C6E76; Thu, 28 Sep 2023 15:04:48 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Will Deacon , Borislav Petkov , Dave Hansen , Suravee Suthikulpanit , Thomas Gleixner , Paolo Bonzini , x86@kernel.org, Robin Murphy , iommu@lists.linux.dev, Ingo Molnar , Joerg Roedel , Sean Christopherson , "H. Peter Anvin" , linux-kernel@vger.kernel.org, Maxim Levitsky Subject: [PATCH 5/5] x86: KVM: SVM: workaround for AVIC's errata #1235 Date: Thu, 28 Sep 2023 18:04:28 +0300 Message-Id: <20230928150428.199929-6-mlevitsk@redhat.com> In-Reply-To: <20230928150428.199929-1-mlevitsk@redhat.com> References: <20230928150428.199929-1-mlevitsk@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" On Zen2 (and likely on Zen1 as well), AVIC doesn't reliably detect a change in the 'is_running' bit during ICR write emulation and might skip a VM exit, if that bit was recently cleared. 
The absence of the VM exit leads to KVM not waking up / triggering a nested VM exit on the target(s) of the IPI, which can, in some cases, lead to unbounded delays in guest execution. As I recently discovered, a reasonable workaround exists: make KVM never set the is_running bit. This workaround ensures that (*) all ICR writes always cause a VM exit and are therefore correctly emulated, at the expense of never enjoying VM exit-less ICR emulation. This workaround does carry a performance penalty, but according to my benchmarks it is still much better than not using AVIC at all, because AVIC is still used for the receiving end of the IPIs and for posted interrupts. If the user is aware of the erratum and it doesn't affect their workload, the user can disable the workaround with 'avic_zen2_errata_workaround=3D0'. (*) More precisely, all ICR writes except when the 'Self' shorthand is used: in this case AVIC skips reading the physid table and just sets bits in the IRR of the local APIC. Thankfully, the erratum cannot occur in this case, so an extra workaround for it is not needed. 
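As a rough illustration of how the tri-state module parameter resolves, here is a standalone sketch. toy_resolve_workaround() and TOY_WORKAROUND_AUTO are hypothetical names; the sketch only mirrors the logic this patch adds to avic_hardware_setup(), with the CPU family passed in as a plain argument instead of being read from boot_cpu_data.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical name for the parameter's default value of -1. */
#define TOY_WORKAROUND_AUTO (-1)

/*
 * param:      value of avic_zen2_errata_workaround as given by the user
 * cpu_family: x86 family of the boot CPU (0x17 covers Zen1 and Zen2)
 *
 * Returns whether the workaround ends up enabled.
 */
static bool toy_resolve_workaround(int param, unsigned int cpu_family)
{
	/* Auto mode: assume that family 0x17 parts have erratum #1235. */
	if (param == TOY_WORKAROUND_AUTO)
		return cpu_family == 0x17;

	/* An explicit setting wins: 0 disables, any other value enables. */
	return param != 0;
}
```

For example, toy_resolve_workaround(TOY_WORKAROUND_AUTO, 0x17) enables the workaround, while toy_resolve_workaround(0, 0x17) models a user who opted out on an affected CPU.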
Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/avic.c | 50 +++++++++++++++++++++++++++++------------ 1 file changed, 36 insertions(+), 14 deletions(-) diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index c44b65af494e3ff..df9efa428f86aa9 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -62,6 +62,9 @@ static_assert(__AVIC_GATAG(AVIC_VM_ID_MASK, AVIC_VCPU_ID_= MASK) =3D=3D -1u); static bool force_avic; module_param_unsafe(force_avic, bool, 0444); =20 +static int avic_zen2_errata_workaround =3D -1; +module_param(avic_zen2_errata_workaround, int, 0444); + /* Note: * This hash table is used to map VM_ID to a struct kvm_svm, * when handling AMD IOMMU GALOG notification to schedule in @@ -1027,7 +1030,6 @@ avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu= , int cpu, bool r) =20 void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu) { - u64 entry; int h_physical_id =3D kvm_cpu_get_apicid(cpu); struct vcpu_svm *svm =3D to_svm(vcpu); unsigned long flags; @@ -1056,14 +1058,18 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu) */ spin_lock_irqsave(&svm->ir_list_lock, flags); =20 - entry =3D READ_ONCE(*(svm->avic_physical_id_cache)); - WARN_ON_ONCE(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK); + if (!avic_zen2_errata_workaround) { + u64 entry =3D READ_ONCE(*(svm->avic_physical_id_cache)); =20 - entry &=3D ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK; - entry |=3D (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK); - entry |=3D AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK; + WARN_ON_ONCE(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK); + + entry &=3D ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK; + entry |=3D (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK= ); + entry |=3D AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK; + + WRITE_ONCE(*(svm->avic_physical_id_cache), entry); + } =20 - WRITE_ONCE(*(svm->avic_physical_id_cache), entry); avic_update_iommu_vcpu_affinity(vcpu, h_physical_id, true); =20 
spin_unlock_irqrestore(&svm->ir_list_lock, flags); @@ -1071,7 +1077,7 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu) =20 void avic_vcpu_put(struct kvm_vcpu *vcpu) { - u64 entry; + u64 entry =3D 0; struct vcpu_svm *svm =3D to_svm(vcpu); unsigned long flags; =20 @@ -1084,11 +1090,13 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu) * can't be scheduled out and thus avic_vcpu_{put,load}() can't run * recursively. */ - entry =3D READ_ONCE(*(svm->avic_physical_id_cache)); =20 - /* Nothing to do if IsRunning =3D=3D '0' due to vCPU blocking. */ - if (!(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK)) - return; + if (!avic_zen2_errata_workaround) { + /* Nothing to do if IsRunning =3D=3D '0' due to vCPU blocking. */ + entry =3D READ_ONCE(*(svm->avic_physical_id_cache)); + if (!(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK)) + return; + } =20 /* * Take and hold the per-vCPU interrupt remapping lock while updating @@ -1102,8 +1110,10 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu) =20 avic_update_iommu_vcpu_affinity(vcpu, -1, 0); =20 - entry &=3D ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK; - WRITE_ONCE(*(svm->avic_physical_id_cache), entry); + if (!avic_zen2_errata_workaround) { + entry &=3D ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK; + WRITE_ONCE(*(svm->avic_physical_id_cache), entry); + } =20 spin_unlock_irqrestore(&svm->ir_list_lock, flags); =20 @@ -1217,5 +1227,17 @@ bool avic_hardware_setup(void) =20 amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier); =20 + if (avic_zen2_errata_workaround =3D=3D -1) { + + /* Assume that Zen1 and Zen2 have errata #1235 */ + if (boot_cpu_data.x86 =3D=3D 0x17) + avic_zen2_errata_workaround =3D 1; + else + avic_zen2_errata_workaround =3D 0; + } + + if (avic_zen2_errata_workaround) + pr_info("Workaround for AVIC errata #1235 is enabled\n"); + return true; } --=20 2.26.3