From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Paul Durrant
Subject: [PATCH] pci: cleanup MSI interrupts before removing device from IOMMU
Date: Wed, 21 Oct 2020 10:19:45 +0200
Message-ID: <20201021081945.28425-1-roger.pau@citrix.com>

Doing the MSI cleanup after removing the device from the IOMMU leads
to the following panic on AMD hardware:

Assertion 'table.ptr && (index < intremap_table_entries(table.ptr, iommu))' failed at iommu_intr.c:172

----[ Xen-4.13.1-10.0.3-d  x86_64  debug=y   Not tainted ]----
CPU:    3
RIP:    e008:[] drivers/passthrough/amd/iommu_intr.c#get_intremap_entry+0x52/0x7b
[...]
Xen call trace:
   [] R drivers/passthrough/amd/iommu_intr.c#get_intremap_entry+0x52/0x7b
   [] F drivers/passthrough/amd/iommu_intr.c#update_intremap_entry_from_msi_msg+0xc0/0x342
   [] F amd_iommu_msi_msg_update_ire+0x98/0x129
   [] F iommu_update_ire_from_msi+0x1e/0x21
   [] F msi_free_irq+0x55/0x1a0
   [] F pci_cleanup_msi+0x8c/0xb0
   [] F pci_remove_device+0x1af/0x2da
   [] F do_physdev_op+0xd18/0x1187
   [] F pv_hypercall+0x1f5/0x567
   [] F lstar_enter+0x112/0x120

That's because the call to iommu_remove_device on AMD hardware will
remove the per-device interrupt remapping table, and hence the call to
pci_cleanup_msi done afterwards will find a null intremap table and
crash.

Reorder the calls so that MSI interrupts are torn down before removing
the device from the IOMMU.

Reviewed-by: Jan Beulich
Signed-off-by: Roger Pau Monné
---
I've discussed the issue with Andrew and maybe we should try to avoid
removing the interrupt remapping table on device removal, but then the
tables would have to be sized to support the maximum number of
interrupts instead of the maximum supported by the device currently
plugged in.

I think the currently proposed fix can be easily backported. We can see
about improving the resilience going forward.
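
For illustration only (not part of the patch): a minimal standalone C
sketch of the ordering issue. The mock_* type and functions below are
invented stand-ins for pci_remove_device()'s interaction with
iommu_remove_device() and pci_cleanup_msi(); they only model the
teardown order, not the real Xen code.

/*
 * Standalone sketch of the teardown-order problem. Everything prefixed
 * with mock_ is a stand-in invented for this example; the real Xen
 * functions take different arguments and do far more work.
 *
 * Build: gcc -Wall -o msi-order msi-order.c
 */
#include <stdio.h>
#include <stdlib.h>

struct mock_pdev {
    /*
     * Stand-in for the per-device interrupt remapping table kept by the
     * AMD IOMMU driver, which is freed when the device is removed from
     * the IOMMU.
     */
    void *intremap_table;
};

/*
 * Stand-in for iommu_remove_device(): tears down per-device IOMMU state,
 * including the interrupt remapping table.
 */
static void mock_iommu_remove_device(struct mock_pdev *pdev)
{
    free(pdev->intremap_table);
    pdev->intremap_table = NULL;
}

/*
 * Stand-in for pci_cleanup_msi(): clearing the MSI entries still needs
 * the interrupt remapping table, mirroring the assertion in
 * drivers/passthrough/amd/iommu_intr.c.
 */
static int mock_pci_cleanup_msi(const struct mock_pdev *pdev)
{
    if ( !pdev->intremap_table )
    {
        fprintf(stderr, "BUG: intremap table already gone\n");
        return -1;
    }
    /* ... clear MSI / interrupt remapping entries here ... */
    return 0;
}

int main(void)
{
    struct mock_pdev a = { .intremap_table = malloc(64) };
    struct mock_pdev b = { .intremap_table = malloc(64) };

    /* Old order: IOMMU teardown first, MSI cleanup finds a NULL table. */
    mock_iommu_remove_device(&a);
    printf("old order: %s\n",
           mock_pci_cleanup_msi(&a) ? "fails (as in the panic)" : "ok");

    /* New order (this patch): MSI cleanup runs while the table exists. */
    printf("new order: %s\n",
           mock_pci_cleanup_msi(&b) ? "fails" : "ok");
    mock_iommu_remove_device(&b);

    return 0;
}

With the old order the stand-in cleanup hits a NULL table, matching the
assertion failure quoted above; with the reordered calls it completes
before the table is freed.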
---
 xen/drivers/passthrough/pci.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index b035067975..64b8a77ce0 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -834,10 +834,15 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
     list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
         if ( pdev->bus == bus && pdev->devfn == devfn )
         {
+            /*
+             * Cleanup MSI interrupts before removing the device from the
+             * IOMMU, or else the internal IOMMU data used to track the device
+             * interrupts might be already gone.
+             */
+            pci_cleanup_msi(pdev);
             ret = iommu_remove_device(pdev);
             if ( pdev->domain )
                 list_del(&pdev->domain_list);
-            pci_cleanup_msi(pdev);
             printk(XENLOG_DEBUG "PCI remove device %pp\n", &pdev->sbdf);
             free_pdev(pseg, pdev);
             break;
--
2.29.0