From nobody Sun Dec 28 21:15:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B016C4167B for ; Tue, 5 Dec 2023 01:26:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1343954AbjLEB0n (ORCPT ); Mon, 4 Dec 2023 20:26:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41454 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230128AbjLEB0i (ORCPT ); Mon, 4 Dec 2023 20:26:38 -0500 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5AB83A4 for ; Mon, 4 Dec 2023 17:26:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701739604; x=1733275604; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8FCirIQaoMt87kwAb+y8rC0FuOBeH82W4TnVNvwESEg=; b=MNAO7IeAyN2YgsnfUYtT/kUm8q00bFnUQPavtk4+WVAk6OAa6xCZF44q DeQhnKJtcN5rVgquxYYx68MOj2W9ySghJqz/hegkScM6pQTXYTwwK2ZWI vINxYTkWMVn2PS5I7W/S5DSBpSsu5ttWsKZEp8ihuEhHZjGSWVCKkfgLg oxrKdjf2Rt0OnTW8n5q5OaXx1ObmrZwp0iqoiPY6bp9jgrfsNcNcgCPeA HTfXKL8tD9vjMrMjakdtqbi8R2UaVX8Yr5ETsLtNretLydmiwyAeSOQlh 5nx17XwPK9dyVGE/h2krFjE0GIemsiCc+qaIBw2bpl9A2+DEi47PEeOy3 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="460313330" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="460313330" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 17:26:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="1102276189" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="1102276189" Received: from allen-box.sh.intel.com ([10.239.159.127]) by fmsmga005.fm.intel.com with 
ESMTP; 04 Dec 2023 17:26:41 -0800
From: Lu Baolu
To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v2 1/6] iommu/vt-d: Setup scalable mode context entry in probe path
Date: Tue, 5 Dec 2023 09:21:58 +0800
Message-Id: <20231205012203.244584-2-baolu.lu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231205012203.244584-1-baolu.lu@linux.intel.com>
References: <20231205012203.244584-1-baolu.lu@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

In contrast to legacy mode, the DMA translation table is configured in
the PASID table entry instead of the context entry for scalable mode.
For this reason, it is more appropriate to set up the scalable mode
context entry in the probe_device callback and direct it to the
appropriate PASID table.

The iommu domain attach/detach operations only affect the PASID table
entry. Therefore, there is no need to modify the context entry when
configuring the translation type and page table.

The only exception is the kdump case, where context entry setup is
postponed until the device driver invokes the first DMA interface.
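The probe/kdump split described above can be sketched as a small
userspace model. This is only an illustration, not kernel code: the
helper name should_setup_sm_context() is hypothetical, and the two
booleans stand in for the @deferred argument and the result of
context_copied() that intel_pasid_setup_sm_context() tests in this
patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model (hypothetical helper, not part of the patch).
 *
 * deferred: true when called from a domain attach path,
 *           false when called from the probe path.
 * copied:   true when the context entry was inherited from the
 *           previous kernel (kdump) and has not been replaced yet.
 *
 * Returns true when this call should program the context entry.
 */
static bool should_setup_sm_context(bool deferred, bool copied)
{
	/* The patch returns early when (deferred ^ copied) is true. */
	return !(deferred ^ copied);
}

/* Walk the four combinations once as a self-check. */
static void check_sm_context_policy(void)
{
	assert(should_setup_sm_context(false, false));  /* normal probe */
	assert(!should_setup_sm_context(false, true));  /* kdump probe: postpone */
	assert(should_setup_sm_context(true, true));    /* kdump attach: set up now */
	assert(!should_setup_sm_context(true, false));  /* attach after probe: no-op */
}
```

Under this reading, exactly one of the two paths programs the entry for
any given device: probe in the normal case, the first attach in the
kdump case.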
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/pasid.h |   1 +
 drivers/iommu/intel/iommu.c |  17 +++-
 drivers/iommu/intel/pasid.c | 180 ++++++++++++++++++++++++++++++++++++
 3 files changed, 195 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index 8d40d4c66e31..58d7049081b9 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -319,4 +319,5 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
 				 bool fault_ignore);
 void intel_pasid_setup_page_snoop_control(struct intel_iommu *iommu,
 					  struct device *dev, u32 pasid);
+int intel_pasid_setup_sm_context(struct device *dev, bool deferred);
 #endif /* __INTEL_PASID_H */
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 84b78e42a470..9dc005031dd2 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4270,15 +4270,26 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
 		ret = intel_pasid_alloc_table(dev);
 		if (ret) {
 			dev_err(dev, "PASID table allocation failed\n");
-			dev_iommu_priv_set(dev, NULL);
-			kfree(info);
-			return ERR_PTR(ret);
+			goto err_clear_priv;
+		}
+
+		ret = intel_pasid_setup_sm_context(dev, false);
+		if (ret) {
+			dev_err(dev, "Scalable context entry setup failed\n");
+			goto err_free_table;
 		}
 	}
 
 	intel_iommu_debugfs_create_dev(info);
 
 	return &iommu->iommu;
+err_free_table:
+	intel_pasid_free_table(dev);
+err_clear_priv:
+	dev_iommu_priv_set(dev, NULL);
+	kfree(info);
+
+	return ERR_PTR(ret);
 }
 
 static void intel_iommu_release_device(struct device *dev)
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 3239cefa4c33..9e505060617a 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -304,6 +304,11 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 		return -EINVAL;
 	}
 
+	if (intel_pasid_setup_sm_context(dev, true)) {
+		dev_err(dev, "Context entry is not configured\n");
+		return -ENODEV;
+	}
+
 	spin_lock(&iommu->lock);
 	pte = intel_pasid_get_entry(dev, pasid);
 	if (!pte) {
@@ -384,6 +389,11 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
 		return -EINVAL;
 	}
 
+	if (intel_pasid_setup_sm_context(dev, true)) {
+		dev_err(dev, "Context entry is not configured\n");
+		return -ENODEV;
+	}
+
 	pgd = domain->pgd;
 	agaw = iommu_skip_agaw(domain, iommu, &pgd);
 	if (agaw < 0) {
@@ -505,6 +515,11 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
 	u16 did = FLPT_DEFAULT_DID;
 	struct pasid_entry *pte;
 
+	if (intel_pasid_setup_sm_context(dev, true)) {
+		dev_err(dev, "Context entry is not configured\n");
+		return -ENODEV;
+	}
+
 	spin_lock(&iommu->lock);
 	pte = intel_pasid_get_entry(dev, pasid);
 	if (!pte) {
@@ -623,6 +638,11 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
 		return -EINVAL;
 	}
 
+	if (intel_pasid_setup_sm_context(dev, true)) {
+		dev_err_ratelimited(dev, "Context entry is not configured\n");
+		return -ENODEV;
+	}
+
 	spin_lock(&iommu->lock);
 	pte = intel_pasid_get_entry(dev, pasid);
 	if (!pte) {
@@ -666,3 +686,163 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
 
 	return 0;
 }
+
+/*
+ * Interface to set a pasid table to the scalable mode context table entry:
+ */
+
+/*
+ * Get the PASID directory size for scalable mode context entry.
+ * Value of X in the PDTS field of a scalable mode context entry
+ * indicates PASID directory with 2^(X + 7) entries.
+ */
+static unsigned long context_get_sm_pds(struct pasid_table *table)
+{
+	unsigned long pds, max_pde;
+
+	max_pde = table->max_pasid >> PASID_PDE_SHIFT;
+	pds = find_first_bit(&max_pde, MAX_NR_PASID_BITS);
+	if (pds < 7)
+		return 0;
+
+	return pds - 7;
+}
+
+static int context_entry_set_pasid_table(struct context_entry *context,
+					 struct device *dev)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct pasid_table *table = info->pasid_table;
+	struct intel_iommu *iommu = info->iommu;
+	unsigned long pds;
+
+	context_clear_entry(context);
+
+	pds = context_get_sm_pds(table);
+	context->lo = (u64)virt_to_phys(table->table) | context_pdts(pds);
+	context_set_sm_rid2pasid(context, IOMMU_NO_PASID);
+
+	if (info->ats_supported)
+		context_set_sm_dte(context);
+	if (info->pri_supported)
+		context_set_sm_pre(context);
+	if (info->pasid_supported)
+		context_set_pasid(context);
+
+	context_set_fault_enable(context);
+	context_set_present(context);
+	if (!ecap_coherent(iommu->ecap))
+		clflush_cache_range(context, sizeof(*context));
+
+	return 0;
+}
+
+static int device_pasid_table_setup(struct device *dev, u8 bus, u8 devfn)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct intel_iommu *iommu = info->iommu;
+	struct context_entry *context;
+	int ret = 0;
+
+	spin_lock(&iommu->lock);
+	context = iommu_context_addr(iommu, bus, devfn, true);
+	if (!context) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	if (context_present(context) && !context_copied(iommu, bus, devfn))
+		goto out_unlock;
+
+	/*
+	 * Cache invalidation for changes to a scalable-mode context table
+	 * entry.
+	 *
+	 * Section 6.5.3.3 of the VT-d spec:
+	 * - Device-selective context-cache invalidation;
+	 * - Domain-selective PASID-cache invalidation to affected domains
+	 *   (can be skipped if all PASID entries were not-present);
+	 * - Domain-selective IOTLB invalidation to affected domains;
+	 * - Global Device-TLB invalidation to affected functions.
+	 *
+	 * For kdump cases, old valid entries may be cached due to the
+	 * in-flight DMA and copied pgtable, but there is no unmapping
+	 * behaviour for them, thus we need explicit cache flushes for all
+	 * affected domain IDs and PASIDs used in the copied PASID table.
+	 * Given that we have no idea about which domain IDs and PASIDs were
+	 * used in the copied tables, upgrade them to global PASID and IOTLB
+	 * cache invalidation.
+	 *
+	 * For kdump case, at this point, the device is supposed to finish
+	 * reset at its driver probe stage, so no in-flight DMA will exist,
+	 * and we don't need to worry anymore hereafter.
+	 */
+	if (context_copied(iommu, bus, devfn)) {
+		context_clear_entry(context);
+		clear_context_copied(iommu, bus, devfn);
+		iommu->flush.flush_context(iommu, 0,
+					   (((u16)bus) << 8) | devfn,
+					   DMA_CCMD_MASK_NOBIT,
+					   DMA_CCMD_DEVICE_INVL);
+		qi_flush_pasid_cache(iommu, 0, QI_PC_GLOBAL, 0);
+		iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
+		devtlb_invalidation_with_pasid(iommu, dev, IOMMU_NO_PASID);
+	}
+
+	context_entry_set_pasid_table(context, dev);
+
+	/*
+	 * It's a non-present to present mapping. If hardware doesn't cache
+	 * non-present entry we only need to flush the write-buffer. If the
+	 * hardware _does_ cache non-present entries, then it does so in the
+	 * special domain ID #0, which we have to flush:
+	 */
+	if (cap_caching_mode(iommu->cap)) {
+		iommu->flush.flush_context(iommu, 0,
+					   (((u16)bus) << 8) | devfn,
+					   DMA_CCMD_MASK_NOBIT,
+					   DMA_CCMD_DEVICE_INVL);
+		iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
+	} else {
+		iommu_flush_write_buffer(iommu);
+	}
+
+out_unlock:
+	spin_unlock(&iommu->lock);
+	return ret;
+}
+
+static int pci_pasid_table_setup(struct pci_dev *pdev, u16 alias, void *data)
+{
+	struct device *dev = data;
+
+	if (dev != &pdev->dev)
+		return 0;
+
+	return device_pasid_table_setup(dev, PCI_BUS_NUM(alias), alias & 0xff);
+}
+
+/*
+ * Set the device's PASID table to its context table entry.
+ *
+ * The PASID table is set to the context entries of both device itself
+ * and its alias requester ID for DMA. If it is called in domain attach
+ * paths, set @deferred to true, false in other cases.
+ */
+int intel_pasid_setup_sm_context(struct device *dev, bool deferred)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct intel_iommu *iommu = info->iommu;
+
+	/*
+	 * Skip pasid table setting up if context entry is copied and
+	 * function is not called in deferred attachment context.
+	 */
+	if (deferred ^ context_copied(iommu, info->bus, info->devfn))
+		return 0;
+
+	if (!dev_is_pci(dev))
+		return device_pasid_table_setup(dev, info->bus, info->devfn);
+
+	return pci_for_each_dma_alias(to_pci_dev(dev), pci_pasid_table_setup, dev);
+}
-- 
2.34.1

From nobody Sun Dec 28 21:15:17 2025
From: Lu Baolu
To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v2 2/6] iommu/vt-d: Remove scalable mode context entry setup from attach_dev
Date: Tue, 5 Dec 2023 09:21:59 +0800
Message-Id: <20231205012203.244584-3-baolu.lu@linux.intel.com>
In-Reply-To: <20231205012203.244584-1-baolu.lu@linux.intel.com>
References: <20231205012203.244584-1-baolu.lu@linux.intel.com>

The scalable mode context entry is now set up in the probe_device path,
eliminating the need to configure it in the attach_dev path. Remove the
redundant code from the attach_dev path to avoid dead code.
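To make the resulting attach_dev flow easier to follow, here is a
minimal userspace sketch of the decision that
dmar_domain_attach_device() makes after this change. It is only a
model under stated assumptions: the enum, the struct and
pick_attach_action() are hypothetical names, and the boolean fields
stand in for sm_supported(iommu), hw_pass_through,
domain_type_is_si(domain) and domain->use_first_level.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the four branches of the attach path. */
enum attach_action {
	ATTACH_LEGACY_CONTEXT,     /* domain_context_mapping() */
	ATTACH_PASID_PASSTHROUGH,  /* intel_pasid_setup_pass_through() */
	ATTACH_PASID_FIRST_LEVEL,  /* domain_setup_first_level() */
	ATTACH_PASID_SECOND_LEVEL, /* intel_pasid_setup_second_level() */
};

/* Hypothetical snapshot of the conditions the attach path tests. */
struct attach_inputs {
	bool sm_supported;     /* scalable mode available on this IOMMU */
	bool hw_pass_through;  /* hardware pass-through is usable */
	bool identity_domain;  /* domain is the static identity domain */
	bool use_first_level;  /* domain uses first-level page tables */
};

/*
 * Only legacy mode still touches the context entry at attach time.
 * In scalable mode the attach path programs a PASID table entry and
 * relies on the context entry set up earlier.
 */
static enum attach_action pick_attach_action(const struct attach_inputs *in)
{
	if (!in->sm_supported)
		return ATTACH_LEGACY_CONTEXT;
	if (in->hw_pass_through && in->identity_domain)
		return ATTACH_PASID_PASSTHROUGH;
	if (in->use_first_level)
		return ATTACH_PASID_FIRST_LEVEL;
	return ATTACH_PASID_SECOND_LEVEL;
}
```

The early return for real-DMA subdevices and the shared error handling
(device_block_translation() on failure) are left out of the model.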
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/iommu.c | 156 ++++++++++--------------------------
 1 file changed, 44 insertions(+), 112 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9dc005031dd2..a324b3a3a005 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1727,34 +1727,17 @@ static void domain_exit(struct dmar_domain *domain)
 	kfree(domain);
 }
 
-/*
- * Get the PASID directory size for scalable mode context entry.
- * Value of X in the PDTS field of a scalable mode context entry
- * indicates PASID directory with 2^(X + 7) entries.
- */
-static unsigned long context_get_sm_pds(struct pasid_table *table)
-{
-	unsigned long pds, max_pde;
-
-	max_pde = table->max_pasid >> PASID_PDE_SHIFT;
-	pds = find_first_bit(&max_pde, MAX_NR_PASID_BITS);
-	if (pds < 7)
-		return 0;
-
-	return pds - 7;
-}
-
 static int domain_context_mapping_one(struct dmar_domain *domain,
 				      struct intel_iommu *iommu,
-				      struct pasid_table *table,
 				      u8 bus, u8 devfn)
 {
 	struct device_domain_info *info =
 			domain_lookup_dev_info(domain, iommu, bus, devfn);
 	u16 did = domain_id_iommu(domain, iommu);
 	int translation = CONTEXT_TT_MULTI_LEVEL;
+	struct dma_pte *pgd = domain->pgd;
 	struct context_entry *context;
-	int ret;
+	int agaw, ret;
 
 	if (hw_pass_through && domain_type_is_si(domain))
 		translation = CONTEXT_TT_PASS_THROUGH;
@@ -1797,65 +1780,37 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
 	}
 
 	context_clear_entry(context);
+	context_set_domain_id(context, did);
 
-	if (sm_supported(iommu)) {
-		unsigned long pds;
-
-		/* Setup the PASID DIR pointer: */
-		pds = context_get_sm_pds(table);
-		context->lo = (u64)virt_to_phys(table->table) |
-				context_pdts(pds);
-
-		/* Setup the RID_PASID field: */
-		context_set_sm_rid2pasid(context, IOMMU_NO_PASID);
-
+	if (translation != CONTEXT_TT_PASS_THROUGH) {
 		/*
-		 * Setup the Device-TLB enable bit and Page request
-		 * Enable bit:
+		 * Skip top levels of page tables for iommu which has
+		 * less agaw than default. Unnecessary for PT mode.
 		 */
+		for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
+			ret = -ENOMEM;
+			pgd = phys_to_virt(dma_pte_addr(pgd));
+			if (!dma_pte_present(pgd))
+				goto out_unlock;
+		}
+
 		if (info && info->ats_supported)
-			context_set_sm_dte(context);
-		if (info && info->pri_supported)
-			context_set_sm_pre(context);
-		if (info && info->pasid_supported)
-			context_set_pasid(context);
+			translation = CONTEXT_TT_DEV_IOTLB;
+		else
+			translation = CONTEXT_TT_MULTI_LEVEL;
+
+		context_set_address_root(context, virt_to_phys(pgd));
+		context_set_address_width(context, agaw);
 	} else {
-		struct dma_pte *pgd = domain->pgd;
-		int agaw;
-
-		context_set_domain_id(context, did);
-
-		if (translation != CONTEXT_TT_PASS_THROUGH) {
-			/*
-			 * Skip top levels of page tables for iommu which has
-			 * less agaw than default. Unnecessary for PT mode.
-			 */
-			for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
-				ret = -ENOMEM;
-				pgd = phys_to_virt(dma_pte_addr(pgd));
-				if (!dma_pte_present(pgd))
-					goto out_unlock;
-			}
-
-			if (info && info->ats_supported)
-				translation = CONTEXT_TT_DEV_IOTLB;
-			else
-				translation = CONTEXT_TT_MULTI_LEVEL;
-
-			context_set_address_root(context, virt_to_phys(pgd));
-			context_set_address_width(context, agaw);
-		} else {
-			/*
-			 * In pass through mode, AW must be programmed to
-			 * indicate the largest AGAW value supported by
-			 * hardware. And ASR is ignored by hardware.
-			 */
-			context_set_address_width(context, iommu->msagaw);
-		}
-
-		context_set_translation_type(context, translation);
+		/*
+		 * In pass through mode, AW must be programmed to
+		 * indicate the largest AGAW value supported by
+		 * hardware. And ASR is ignored by hardware.
+		 */
+		context_set_address_width(context, iommu->msagaw);
 	}
 
+	context_set_translation_type(context, translation);
 	context_set_fault_enable(context);
 	context_set_present(context);
 	if (!ecap_coherent(iommu->ecap))
@@ -1885,43 +1840,29 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
 	return ret;
 }
 
-struct domain_context_mapping_data {
-	struct dmar_domain *domain;
-	struct intel_iommu *iommu;
-	struct pasid_table *table;
-};
-
 static int domain_context_mapping_cb(struct pci_dev *pdev,
 				     u16 alias, void *opaque)
 {
-	struct domain_context_mapping_data *data = opaque;
+	struct device_domain_info *info = dev_iommu_priv_get(&pdev->dev);
+	struct intel_iommu *iommu = info->iommu;
+	struct dmar_domain *domain = opaque;
 
-	return domain_context_mapping_one(data->domain, data->iommu,
-					  data->table, PCI_BUS_NUM(alias),
-					  alias & 0xff);
+	return domain_context_mapping_one(domain, iommu,
+					  PCI_BUS_NUM(alias), alias & 0xff);
 }
 
 static int
 domain_context_mapping(struct dmar_domain *domain, struct device *dev)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
-	struct domain_context_mapping_data data;
 	struct intel_iommu *iommu = info->iommu;
 	u8 bus = info->bus, devfn = info->devfn;
-	struct pasid_table *table;
-
-	table = intel_pasid_get_table(dev);
 
 	if (!dev_is_pci(dev))
-		return domain_context_mapping_one(domain, iommu, table,
-						  bus, devfn);
-
-	data.domain = domain;
-	data.iommu = iommu;
-	data.table = table;
+		return domain_context_mapping_one(domain, iommu, bus, devfn);
 
 	return pci_for_each_dma_alias(to_pci_dev(dev),
-				      &domain_context_mapping_cb, &data);
+				      domain_context_mapping_cb, domain);
 }
 
 /* Returns a number of VTD pages, but aligned to MM page size */
@@ -2278,28 +2219,19 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
 	list_add(&info->link, &domain->devices);
 	spin_unlock_irqrestore(&domain->lock, flags);
 
-	/* PASID table is mandatory for a PCI device in scalable mode. */
-	if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev)) {
-		/* Setup the PASID entry for requests without PASID: */
-		if (hw_pass_through && domain_type_is_si(domain))
-			ret = intel_pasid_setup_pass_through(iommu,
-					dev, IOMMU_NO_PASID);
-		else if (domain->use_first_level)
-			ret = domain_setup_first_level(iommu, domain, dev,
-					IOMMU_NO_PASID);
-		else
-			ret = intel_pasid_setup_second_level(iommu, domain,
-					dev, IOMMU_NO_PASID);
-		if (ret) {
-			dev_err(dev, "Setup RID2PASID failed\n");
-			device_block_translation(dev);
-			return ret;
-		}
-	}
+	if (dev_is_real_dma_subdevice(dev))
+		return 0;
+
+	if (!sm_supported(iommu))
+		ret = domain_context_mapping(domain, dev);
+	else if (hw_pass_through && domain_type_is_si(domain))
+		ret = intel_pasid_setup_pass_through(iommu, dev, IOMMU_NO_PASID);
+	else if (domain->use_first_level)
+		ret = domain_setup_first_level(iommu, domain, dev, IOMMU_NO_PASID);
+	else
+		ret = intel_pasid_setup_second_level(iommu, domain, dev, IOMMU_NO_PASID);
 
-	ret = domain_context_mapping(domain, dev);
 	if (ret) {
-		dev_err(dev, "Domain context map failed\n");
 		device_block_translation(dev);
 		return ret;
 	}
-- 
2.34.1

From nobody Sun Dec 28 21:15:17 2025
From: Lu Baolu
To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v2 3/6] iommu/vt-d: Refactor domain_context_mapping_one() to be reusable
Date: Tue, 5 Dec 2023 09:22:00 +0800
Message-Id: <20231205012203.244584-4-baolu.lu@linux.intel.com>
In-Reply-To: <20231205012203.244584-1-baolu.lu@linux.intel.com>
References: <20231205012203.244584-1-baolu.lu@linux.intel.com>

Extract common code from domain_context_mapping_one() into new
functions, making it reusable by other functions such as the upcoming
identity domain implementation. No intentional functional changes.

Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/iommu.c | 99 ++++++++++++++++++++++---------------
 1 file changed, 58 insertions(+), 41 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index a324b3a3a005..605cd1c52e95 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1727,6 +1727,61 @@ static void domain_exit(struct dmar_domain *domain)
 	kfree(domain);
 }
 
+/*
+ * For kdump cases, old valid entries may be cached due to the
+ * in-flight DMA and copied pgtable, but there is no unmapping
+ * behaviour for them, thus we need an explicit cache flush for
+ * the newly-mapped device. For kdump, at this point, the device
+ * is supposed to finish reset at its driver probe stage, so no
+ * in-flight DMA will exist, and we don't need to worry anymore
+ * hereafter.
+ */
+static void copied_context_tear_down(struct intel_iommu *iommu,
+				     struct context_entry *context,
+				     u8 bus, u8 devfn)
+{
+	u16 did_old;
+
+	if (!context_copied(iommu, bus, devfn))
+		return;
+
+	assert_spin_locked(&iommu->lock);
+
+	did_old = context_domain_id(context);
+	context_clear_entry(context);
+
+	if (did_old < cap_ndoms(iommu->cap)) {
+		iommu->flush.flush_context(iommu, did_old,
+					   (((u16)bus) << 8) | devfn,
+					   DMA_CCMD_MASK_NOBIT,
+					   DMA_CCMD_DEVICE_INVL);
+		iommu->flush.flush_iotlb(iommu, did_old, 0, 0,
+					 DMA_TLB_DSI_FLUSH);
+	}
+
+	clear_context_copied(iommu, bus, devfn);
+}
+
+/*
+ * It's a non-present to present mapping. If hardware doesn't cache
+ * non-present entry we only need to flush the write-buffer. If the
+ * hardware _does_ cache non-present entries, then it does so in the
+ * special domain #0, which we have to flush:
+ */
+static void context_present_cache_flush(struct intel_iommu *iommu, u16 did,
+					u8 bus, u8 devfn)
+{
+	if (cap_caching_mode(iommu->cap)) {
+		iommu->flush.flush_context(iommu, 0,
+					   (((u16)bus) << 8) | devfn,
+					   DMA_CCMD_MASK_NOBIT,
+					   DMA_CCMD_DEVICE_INVL);
+		iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
+	} else {
+		iommu_flush_write_buffer(iommu);
+	}
+}
+
 static int domain_context_mapping_one(struct dmar_domain *domain,
 				      struct intel_iommu *iommu,
 				      u8 bus, u8 devfn)
@@ -1755,31 +1810,9 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
 	if (context_present(context) && !context_copied(iommu, bus, devfn))
 		goto out_unlock;
 
-	/*
-	 * For kdump cases, old valid entries may be cached due to the
-	 * in-flight DMA and copied pgtable, but there is no unmapping
-	 * behaviour for them, thus we need an explicit cache flush for
-	 * the newly-mapped device. For kdump, at this point, the device
-	 * is supposed to finish reset at its driver probe stage, so no
-	 * in-flight DMA will exist, and we don't need to worry anymore
-	 * hereafter.
-	 */
-	if (context_copied(iommu, bus, devfn)) {
-		u16 did_old = context_domain_id(context);
-
-		if (did_old < cap_ndoms(iommu->cap)) {
-			iommu->flush.flush_context(iommu, did_old,
-						   (((u16)bus) << 8) | devfn,
-						   DMA_CCMD_MASK_NOBIT,
-						   DMA_CCMD_DEVICE_INVL);
-			iommu->flush.flush_iotlb(iommu, did_old, 0, 0,
-						 DMA_TLB_DSI_FLUSH);
-		}
-
-		clear_context_copied(iommu, bus, devfn);
-	}
-
+	copied_context_tear_down(iommu, context, bus, devfn);
 	context_clear_entry(context);
+
 	context_set_domain_id(context, did);
 
 	if (translation != CONTEXT_TT_PASS_THROUGH) {
@@ -1815,23 +1848,7 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
 	context_set_present(context);
 	if (!ecap_coherent(iommu->ecap))
 		clflush_cache_range(context, sizeof(*context));
-
-	/*
-	 * It's a non-present to present mapping. If hardware doesn't cache
-	 * non-present entry we only need to flush the write-buffer. If the
-	 * _does_ cache non-present entries, then it does so in the special
-	 * domain #0, which we have to flush:
-	 */
-	if (cap_caching_mode(iommu->cap)) {
-		iommu->flush.flush_context(iommu, 0,
-					   (((u16)bus) << 8) | devfn,
-					   DMA_CCMD_MASK_NOBIT,
-					   DMA_CCMD_DEVICE_INVL);
-		iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
-	} else {
-		iommu_flush_write_buffer(iommu);
-	}
-
+	context_present_cache_flush(iommu, did, bus, devfn);
 	ret = 0;
 
 out_unlock:
-- 
2.34.1

From nobody Sun Dec 28 21:15:17 2025
S1343885AbjLEB0p (ORCPT ); Mon, 4 Dec 2023 20:26:45 -0500 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90B5010F for ; Mon, 4 Dec 2023 17:26:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701739611; x=1733275611; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UMkeLFUpYAwPaBYnzxx/I6VFYpGlFn70X93po/+cGE0=; b=lqYLp0A4kNT760HrbdSo1KwKyMJ3G9KFKqYsAxfkMH5JUZ9XZ1slPe6w JTeAgHGnIGEyo0YZl1aaCB8gWxKiYyQFJ+La1l8rF3oy98kQChkehvaC5 8j6ByK7c+2r7Ls8ysl+BzFohww1iZnHM5nNh0T+XqQ2iL4gZExvRIU1V+ 2Ho/49FnS8X21wrzKZ963UDxSPGUwZrPjO4q/wZINQ9gVgNJd+9GZBi9q AHzOzsfAq8jg/9RCvl8wO3mWDTlP8SOlpNG2z3Bnq3uxypVzLE0gPteRr q7AHOfBRoSI/1X4BcDN35fJlgzMHgAMRDnujxfaJ77U4yIpXv+FX7806l Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="460313355" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="460313355" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 17:26:51 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="1102276279" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="1102276279" Received: from allen-box.sh.intel.com ([10.239.159.127]) by fmsmga005.fm.intel.com with ESMTP; 04 Dec 2023 17:26:49 -0800 From: Lu Baolu To: Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Kevin Tian Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v2 4/6] iommu/vt-d: Remove 1:1 mappings from identity domain Date: Tue, 5 Dec 2023 09:22:01 +0800 Message-Id: <20231205012203.244584-5-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231205012203.244584-1-baolu.lu@linux.intel.com> References: <20231205012203.244584-1-baolu.lu@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk 
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Older VT-d hardware implementations did not support pass-through translation mode. The iommu driver relied on a DMA domain with all physical memory addresses identically mapped to the same IOVA to simulate pass-through translation. This workaround is no longer necessary due to the evolution of iommu core. The core has introduced def_domain_type op, allowing the iommu driver to specify its capabilities. Additionally, the identity domain has become a static system domain with "never fail" attach semantics. Eliminate support for the 1:1 mapping domain on older hardware and removes the unused code that created the 1:1 page table. This paves a way for the implementation of a global static identity domain. Signed-off-by: Lu Baolu Suggested-by: Kevin Tian --- drivers/iommu/intel/iommu.c | 118 +++--------------------------------- 1 file changed, 10 insertions(+), 108 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 605cd1c52e95..7022cc183120 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -2146,29 +2146,10 @@ static bool dev_is_real_dma_subdevice(struct device= *dev) pci_real_dma_dev(to_pci_dev(dev)) !=3D to_pci_dev(dev); } =20 -static int iommu_domain_identity_map(struct dmar_domain *domain, - unsigned long first_vpfn, - unsigned long last_vpfn) -{ - /* - * RMRR range might have overlap with physical memory range, - * clear it first - */ - dma_pte_clear_range(domain, first_vpfn, last_vpfn); - - return __domain_mapping(domain, first_vpfn, - first_vpfn, last_vpfn - first_vpfn + 1, - DMA_PTE_READ|DMA_PTE_WRITE, GFP_KERNEL); -} - static int md_domain_init(struct dmar_domain *domain, int guest_width); =20 static int __init si_domain_init(int hw) { - struct dmar_rmrr_unit *rmrr; - struct device *dev; - int i, nid, ret; - si_domain =3D alloc_domain(IOMMU_DOMAIN_IDENTITY); if (!si_domain) return -EFAULT; @@ -2179,44 
+2160,6 @@ static int __init si_domain_init(int hw) return -EFAULT; } =20 - if (hw) - return 0; - - for_each_online_node(nid) { - unsigned long start_pfn, end_pfn; - int i; - - for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) { - ret =3D iommu_domain_identity_map(si_domain, - mm_to_dma_pfn_start(start_pfn), - mm_to_dma_pfn_end(end_pfn)); - if (ret) - return ret; - } - } - - /* - * Identity map the RMRRs so that devices with RMRRs could also use - * the si_domain. - */ - for_each_rmrr_units(rmrr) { - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt, - i, dev) { - unsigned long long start =3D rmrr->base_address; - unsigned long long end =3D rmrr->end_address; - - if (WARN_ON(end < start || - end >> agaw_to_width(si_domain->agaw))) - continue; - - ret =3D iommu_domain_identity_map(si_domain, - mm_to_dma_pfn_start(start >> PAGE_SHIFT), - mm_to_dma_pfn_end(end >> PAGE_SHIFT)); - if (ret) - return ret; - } - } - return 0; } =20 @@ -2301,6 +2244,9 @@ static bool device_rmrr_is_relaxable(struct device *d= ev) */ static int device_def_domain_type(struct device *dev) { + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + if (dev_is_pci(dev)) { struct pci_dev *pdev =3D to_pci_dev(dev); =20 @@ -2311,6 +2257,13 @@ static int device_def_domain_type(struct device *dev) return IOMMU_DOMAIN_IDENTITY; } =20 + /* + * Hardware does not support the passthrough translation mode. + * Always use a dynamic mapping domain. 
+ */ + if (!ecap_pass_through(iommu->ecap)) + return IOMMU_DOMAIN_DMA; + return 0; } =20 @@ -3301,52 +3254,6 @@ int dmar_iommu_notify_scope_dev(struct dmar_pci_noti= fy_info *info) return 0; } =20 -static int intel_iommu_memory_notifier(struct notifier_block *nb, - unsigned long val, void *v) -{ - struct memory_notify *mhp =3D v; - unsigned long start_vpfn =3D mm_to_dma_pfn_start(mhp->start_pfn); - unsigned long last_vpfn =3D mm_to_dma_pfn_end(mhp->start_pfn + - mhp->nr_pages - 1); - - switch (val) { - case MEM_GOING_ONLINE: - if (iommu_domain_identity_map(si_domain, - start_vpfn, last_vpfn)) { - pr_warn("Failed to build identity map for [%lx-%lx]\n", - start_vpfn, last_vpfn); - return NOTIFY_BAD; - } - break; - - case MEM_OFFLINE: - case MEM_CANCEL_ONLINE: - { - struct dmar_drhd_unit *drhd; - struct intel_iommu *iommu; - LIST_HEAD(freelist); - - domain_unmap(si_domain, start_vpfn, last_vpfn, &freelist); - - rcu_read_lock(); - for_each_active_iommu(iommu, drhd) - iommu_flush_iotlb_psi(iommu, si_domain, - start_vpfn, mhp->nr_pages, - list_empty(&freelist), 0); - rcu_read_unlock(); - put_pages_list(&freelist); - } - break; - } - - return NOTIFY_OK; -} - -static struct notifier_block intel_iommu_memory_nb =3D { - .notifier_call =3D intel_iommu_memory_notifier, - .priority =3D 0 -}; - static void intel_disable_iommus(void) { struct intel_iommu *iommu =3D NULL; @@ -3643,12 +3550,7 @@ int __init intel_iommu_init(void) =20 iommu_pmu_register(iommu); } - up_read(&dmar_global_lock); =20 - if (si_domain && !hw_pass_through) - register_memory_notifier(&intel_iommu_memory_nb); - - down_read(&dmar_global_lock); if (probe_acpi_namespace_devices()) pr_warn("ACPI name space devices didn't probe correctly\n"); =20 --=20 2.34.1 From nobody Sun Dec 28 21:15:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with 
ESMTP id 35E82C4167B for ; Tue, 5 Dec 2023 01:27:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346313AbjLEB04 (ORCPT ); Mon, 4 Dec 2023 20:26:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234805AbjLEB0t (ORCPT ); Mon, 4 Dec 2023 20:26:49 -0500 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D546127 for ; Mon, 4 Dec 2023 17:26:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701739614; x=1733275614; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GFKeHdIylAj+4b4uIVEJnbm29OkKM9wsrhT6h9/WUk4=; b=FiXM6H/6pC823aIk9+mjocYa0KQX9YJezgwZgUFoxPg6ArtAgNTX9t82 9flPusC2nA2/SdwtPaNKhnMMvORAvY4l40eoHUuIVwvu11O+m4KbGmA1x yYa8NUOkIGkjaiCPO0uA5Ysy+CVuslkOhzRkurDrBCJ5t4VEbYg+E7L9f uj1o+W2xmx9oNINmOIVRcmX1I2UHBK2aUMFH3ihfzd27CxICxKtEVe/zz /kfGJSbt1GuCaYV/dtLJeeRm4dyFQyRtJ3Ov4DqNCd5b3dAbxwDAMqqyE PN/HUwtUFpV1un5+uhZLKbFEvheR986Db92x3uu+yKu1YqrYb01KulkiU g==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="460313362" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="460313362" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 17:26:53 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="1102276309" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="1102276309" Received: from allen-box.sh.intel.com ([10.239.159.127]) by fmsmga005.fm.intel.com with ESMTP; 04 Dec 2023 17:26:51 -0800 From: Lu Baolu To: Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Kevin Tian Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v2 5/6] iommu/vt-d: Add support for static 
identity domain Date: Tue, 5 Dec 2023 09:22:02 +0800 Message-Id: <20231205012203.244584-6-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231205012203.244584-1-baolu.lu@linux.intel.com> References: <20231205012203.244584-1-baolu.lu@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add a global static identity domain with guaranteed attach semantics. Software determines VT-d hardware support for pass-through translation by inspecting the capability register. If pass-through translation is not supported, the device is instructed to use a DMA domain as its default domain. Most recent VT-d hardware implementations support pass-through translation; the capability is lacking only in some early VT-d implementations. With the static identity domain in place, the domain field of per-device iommu driver data can be either a pointer to a DMA translation domain, or NULL, indicating that the static identity domain is attached. Refine some code to accommodate this change. 
Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 122 +++++++++++++++++++++++++++++++++--- drivers/iommu/intel/svm.c | 2 +- 2 files changed, 116 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 7022cc183120..3c747d19495e 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1282,7 +1282,8 @@ static void iommu_enable_pci_caps(struct device_domai= n_info *info) if (info->ats_supported && pci_ats_page_aligned(pdev) && !pci_enable_ats(pdev, VTD_PAGE_SHIFT)) { info->ats_enabled =3D 1; - domain_update_iotlb(info->domain); + if (info->domain) + domain_update_iotlb(info->domain); } } =20 @@ -1298,7 +1299,8 @@ static void iommu_disable_pci_caps(struct device_doma= in_info *info) if (info->ats_enabled) { pci_disable_ats(pdev); info->ats_enabled =3D 0; - domain_update_iotlb(info->domain); + if (info->domain) + domain_update_iotlb(info->domain); } =20 if (info->pasid_enabled) { @@ -1549,8 +1551,7 @@ static int iommu_init_domains(struct intel_iommu *iom= mu) * second-level or nested translation. We reserve a domain id for * this purpose. 
*/ - if (sm_supported(iommu)) - set_bit(FLPT_DEFAULT_DID, iommu->domain_ids); + set_bit(FLPT_DEFAULT_DID, iommu->domain_ids); =20 return 0; } @@ -3614,6 +3615,9 @@ static void dmar_remove_one_dev_info(struct device *d= ev) domain_context_clear(info); } =20 + if (!domain) + return; + spin_lock_irqsave(&domain->lock, flags); list_del(&info->link); spin_unlock_irqrestore(&domain->lock, flags); @@ -3822,11 +3826,9 @@ int prepare_domain_attach_device(struct iommu_domain= *domain, static int intel_iommu_attach_device(struct iommu_domain *domain, struct device *dev) { - struct device_domain_info *info =3D dev_iommu_priv_get(dev); int ret; =20 - if (info->domain) - device_block_translation(dev); + device_block_translation(dev); =20 ret =3D prepare_domain_attach_device(domain, dev); if (ret) @@ -4431,6 +4433,9 @@ static void intel_iommu_remove_dev_pasid(struct devic= e *dev, ioasid_t pasid) goto out_tear_down; } =20 + if (domain->type =3D=3D IOMMU_DOMAIN_IDENTITY) + goto out_tear_down; + dmar_domain =3D to_dmar_domain(domain); spin_lock_irqsave(&dmar_domain->lock, flags); list_for_each_entry(curr, &dmar_domain->dev_pasids, link_domain) { @@ -4605,8 +4610,111 @@ static const struct iommu_dirty_ops intel_dirty_ops= =3D { .read_and_clear_dirty =3D intel_iommu_read_and_clear_dirty, }; =20 +static int context_setup_pass_through(struct device *dev, u8 bus, u8 devfn) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + struct context_entry *context; + + spin_lock(&iommu->lock); + context =3D iommu_context_addr(iommu, bus, devfn, 1); + if (!context) { + spin_unlock(&iommu->lock); + return -ENOMEM; + } + + if (context_present(context) && !context_copied(iommu, bus, devfn)) { + spin_unlock(&iommu->lock); + return 0; + } + + copied_context_tear_down(iommu, context, bus, devfn); + context_clear_entry(context); + context_set_domain_id(context, FLPT_DEFAULT_DID); + + /* + * In pass through mode, AW must be programmed to indicate 
the largest + * AGAW value supported by hardware. And ASR is ignored by hardware. + */ + context_set_address_width(context, iommu->msagaw); + context_set_translation_type(context, CONTEXT_TT_PASS_THROUGH); + context_set_fault_enable(context); + context_set_present(context); + if (!ecap_coherent(iommu->ecap)) + clflush_cache_range(context, sizeof(*context)); + context_present_cache_flush(iommu, FLPT_DEFAULT_DID, bus, devfn); + spin_unlock(&iommu->lock); + + return 0; +} + +static int context_setup_pass_through_cb(struct pci_dev *pdev, u16 alias, = void *data) +{ + struct device *dev =3D data; + + if (dev !=3D &pdev->dev) + return 0; + + return context_setup_pass_through(dev, PCI_BUS_NUM(alias), alias & 0xff); +} + +static int device_setup_pass_through(struct device *dev) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + + if (!dev_is_pci(dev)) + return context_setup_pass_through(dev, info->bus, info->devfn); + + return pci_for_each_dma_alias(to_pci_dev(dev), + context_setup_pass_through_cb, dev); +} + +static int identity_domain_attach_dev(struct iommu_domain *domain, + struct device *dev) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + int ret; + + device_block_translation(dev); + + if (dev_is_real_dma_subdevice(dev)) + return 0; + + if (sm_supported(iommu)) { + ret =3D intel_pasid_setup_pass_through(iommu, dev, IOMMU_NO_PASID); + if (!ret) + iommu_enable_pci_caps(info); + } else { + ret =3D device_setup_pass_through(dev); + } + + return ret; +} + +static int identity_domain_set_dev_pasid(struct iommu_domain *domain, + struct device *dev, ioasid_t pasid) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + + if (!pasid_supported(iommu) || dev_is_real_dma_subdevice(dev)) + return -EOPNOTSUPP; + + return intel_pasid_setup_pass_through(iommu, dev, pasid); +} + +static struct iommu_domain identity_domain =3D { + .type =3D 
IOMMU_DOMAIN_IDENTITY, + .ops =3D &(const struct iommu_domain_ops) { + .attach_dev =3D identity_domain_attach_dev, + .set_dev_pasid =3D identity_domain_set_dev_pasid, + }, +}; + const struct iommu_ops intel_iommu_ops =3D { .blocked_domain =3D &blocking_domain, + .identity_domain =3D &identity_domain, .capable =3D intel_iommu_capable, .hw_info =3D intel_iommu_hw_info, .domain_alloc =3D intel_iommu_domain_alloc, diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 442ff9905ca9..d78beb132f5b 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -493,7 +493,7 @@ void intel_drain_pasid_prq(struct device *dev, u32 pasi= d) domain =3D info->domain; pdev =3D to_pci_dev(dev); sid =3D PCI_DEVID(info->bus, info->devfn); - did =3D domain_id_iommu(domain, iommu); + did =3D domain ? domain_id_iommu(domain, iommu) : FLPT_DEFAULT_DID; qdep =3D pci_ats_queue_depth(pdev); =20 /* --=20 2.34.1 From nobody Sun Dec 28 21:15:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72912C4167B for ; Tue, 5 Dec 2023 01:27:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346344AbjLEB1H (ORCPT ); Mon, 4 Dec 2023 20:27:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48292 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1346337AbjLEB05 (ORCPT ); Mon, 4 Dec 2023 20:26:57 -0500 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54F1418A for ; Mon, 4 Dec 2023 17:26:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701739617; x=1733275617; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; 
bh=u1gb+xXuSLrs67n4B6W8q40HRhcbFjefDWfMBwrFYbE=; b=WwosC1tF7difF/42duQS89tBujS6p6/yB5253cwVNu2HyBSOTBFuR7k7 JYij1OibK//jlsxs3NkXGi43TGMr94PjhjJLE2vpg4Pcprh7zqBoXvWIY u3rHMGye3cNkQWj/q6yRxpe2F/b75p5XziV4E0ZpOcqxbxMqXnwGF967r VAyRGm68sE/CVoysVWSZM0P7Hwz3cboURO/Evdw4AxNiww8TLplZ3P1Y+ KwHH69epjJuA5FSmBIh/SipqQNpY6mBoTMCZ5MGavpRQrZRrnLdalkc8Z w4F8bb+Z+y1c4mXwgSgMNlbfqibPcBjAVEIJNniJgeEFtvmJqTOugnVpX Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="460313368" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="460313368" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Dec 2023 17:26:55 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10914"; a="1102276337" X-IronPort-AV: E=Sophos;i="6.04,251,1695711600"; d="scan'208";a="1102276337" Received: from allen-box.sh.intel.com ([10.239.159.127]) by fmsmga005.fm.intel.com with ESMTP; 04 Dec 2023 17:26:53 -0800 From: Lu Baolu To: Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe , Kevin Tian Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v2 6/6] iommu/vt-d: Cleanup si_domain Date: Tue, 5 Dec 2023 09:22:03 +0800 Message-Id: <20231205012203.244584-7-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231205012203.244584-1-baolu.lu@linux.intel.com> References: <20231205012203.244584-1-baolu.lu@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The static identity domain has been introduced, rendering the si_domain obsolete. Remove si_domain and clean up the code accordingly. 
Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 118 +++++++----------------------------- 1 file changed, 23 insertions(+), 95 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 3c747d19495e..91443b34111b 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -97,15 +97,6 @@ static phys_addr_t root_entry_uctp(struct root_entry *re) return re->hi & VTD_PAGE_MASK; } =20 -/* - * This domain is a statically identity mapping domain. - * 1. This domain creats a static 1:1 mapping to all usable memory. - * 2. It maps to each iommu if successful. - * 3. Each iommu mapps to this domain if successful. - */ -static struct dmar_domain *si_domain; -static int hw_pass_through =3D 1; - struct dmar_rmrr_unit { struct list_head list; /* list of rmrr units */ struct acpi_dmar_header *hdr; /* ACPI header */ @@ -240,11 +231,6 @@ void free_pgtable_page(void *vaddr) free_page((unsigned long)vaddr); } =20 -static int domain_type_is_si(struct dmar_domain *domain) -{ - return domain->domain.type =3D=3D IOMMU_DOMAIN_IDENTITY; -} - static int domain_pfn_supported(struct dmar_domain *domain, unsigned long = pfn) { int addr_width =3D agaw_to_width(domain->agaw) - VTD_PAGE_SHIFT; @@ -1795,9 +1781,6 @@ static int domain_context_mapping_one(struct dmar_dom= ain *domain, struct context_entry *context; int agaw, ret; =20 - if (hw_pass_through && domain_type_is_si(domain)) - translation =3D CONTEXT_TT_PASS_THROUGH; - pr_debug("Set context mapping for %02x:%02x.%d\n", bus, PCI_SLOT(devfn), PCI_FUNC(devfn)); =20 @@ -1816,34 +1799,24 @@ static int domain_context_mapping_one(struct dmar_d= omain *domain, =20 context_set_domain_id(context, did); =20 - if (translation !=3D CONTEXT_TT_PASS_THROUGH) { - /* - * Skip top levels of page tables for iommu which has - * less agaw than default. Unnecessary for PT mode. 
- */ - for (agaw =3D domain->agaw; agaw > iommu->agaw; agaw--) { - ret =3D -ENOMEM; - pgd =3D phys_to_virt(dma_pte_addr(pgd)); - if (!dma_pte_present(pgd)) - goto out_unlock; - } - - if (info && info->ats_supported) - translation =3D CONTEXT_TT_DEV_IOTLB; - else - translation =3D CONTEXT_TT_MULTI_LEVEL; - - context_set_address_root(context, virt_to_phys(pgd)); - context_set_address_width(context, agaw); - } else { - /* - * In pass through mode, AW must be programmed to - * indicate the largest AGAW value supported by - * hardware. And ASR is ignored by hardware. - */ - context_set_address_width(context, iommu->msagaw); + /* + * Skip top levels of page tables for iommu which has + * less agaw than default. Unnecessary for PT mode. + */ + for (agaw =3D domain->agaw; agaw > iommu->agaw; agaw--) { + ret =3D -ENOMEM; + pgd =3D phys_to_virt(dma_pte_addr(pgd)); + if (!dma_pte_present(pgd)) + goto out_unlock; } =20 + if (info && info->ats_supported) + translation =3D CONTEXT_TT_DEV_IOTLB; + else + translation =3D CONTEXT_TT_MULTI_LEVEL; + + context_set_address_root(context, virt_to_phys(pgd)); + context_set_address_width(context, agaw); context_set_translation_type(context, translation); context_set_fault_enable(context); context_set_present(context); @@ -2077,14 +2050,10 @@ static void domain_context_clear_one(struct device_= domain_info *info, u8 bus, u8 return; } =20 - if (sm_supported(iommu)) { - if (hw_pass_through && domain_type_is_si(info->domain)) - did_old =3D FLPT_DEFAULT_DID; - else - did_old =3D domain_id_iommu(info->domain, iommu); - } else { - did_old =3D context_domain_id(context); - } + if (info->domain) + did_old =3D domain_id_iommu(info->domain, iommu); + else + did_old =3D FLPT_DEFAULT_DID; =20 context_clear_entry(context); __iommu_flush_cache(iommu, context, sizeof(*context)); @@ -2147,23 +2116,6 @@ static bool dev_is_real_dma_subdevice(struct device = *dev) pci_real_dma_dev(to_pci_dev(dev)) !=3D to_pci_dev(dev); } =20 -static int md_domain_init(struct 
dmar_domain *domain, int guest_width); - -static int __init si_domain_init(int hw) -{ - si_domain =3D alloc_domain(IOMMU_DOMAIN_IDENTITY); - if (!si_domain) - return -EFAULT; - - if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) { - domain_exit(si_domain); - si_domain =3D NULL; - return -EFAULT; - } - - return 0; -} - static int dmar_domain_attach_device(struct dmar_domain *domain, struct device *dev) { @@ -2185,8 +2137,6 @@ static int dmar_domain_attach_device(struct dmar_doma= in *domain, =20 if (!sm_supported(iommu)) ret =3D domain_context_mapping(domain, dev); - else if (hw_pass_through && domain_type_is_si(domain)) - ret =3D intel_pasid_setup_pass_through(iommu, dev, IOMMU_NO_PASID); else if (domain->use_first_level) ret =3D domain_setup_first_level(iommu, domain, dev, IOMMU_NO_PASID); else @@ -2197,8 +2147,7 @@ static int dmar_domain_attach_device(struct dmar_doma= in *domain, return ret; } =20 - if (sm_supported(info->iommu) || !domain_type_is_si(info->domain)) - iommu_enable_pci_caps(info); + iommu_enable_pci_caps(info); =20 return 0; } @@ -2548,8 +2497,6 @@ static int __init init_dmars(void) } } =20 - if (!ecap_pass_through(iommu->ecap)) - hw_pass_through =3D 0; intel_svm_check(iommu); } =20 @@ -2572,10 +2519,6 @@ static int __init init_dmars(void) =20 check_tylersburg_isoch(); =20 - ret =3D si_domain_init(hw_pass_through); - if (ret) - goto free_iommu; - /* * for each drhd * enable fault log @@ -2621,10 +2564,6 @@ static int __init init_dmars(void) disable_dmar_iommu(iommu); free_dmar_iommu(iommu); } - if (si_domain) { - domain_exit(si_domain); - si_domain =3D NULL; - } =20 return ret; } @@ -2999,12 +2938,6 @@ static int intel_iommu_add(struct dmar_drhd_unit *dm= aru) if (ret) goto out; =20 - if (hw_pass_through && !ecap_pass_through(iommu->ecap)) { - pr_warn("%s: Doesn't support hardware pass through.\n", - iommu->name); - return -ENXIO; - } - sp =3D domain_update_iommu_superpage(NULL, iommu) - 1; if (sp >=3D 0 && 
!(cap_super_page_val(iommu->cap) & (1 << sp))) { pr_warn("%s: Doesn't support large page.\n", @@ -3718,8 +3651,6 @@ static struct iommu_domain *intel_iommu_domain_alloc(= unsigned type) domain->geometry.force_aperture =3D true; =20 return domain; - case IOMMU_DOMAIN_IDENTITY: - return &si_domain->domain; case IOMMU_DOMAIN_SVA: return intel_svm_domain_alloc(); default: @@ -3779,8 +3710,7 @@ intel_iommu_domain_alloc_user(struct device *dev, u32= flags, =20 static void intel_iommu_domain_free(struct iommu_domain *domain) { - if (domain !=3D &si_domain->domain) - domain_exit(to_dmar_domain(domain)); + domain_exit(to_dmar_domain(domain)); } =20 int prepare_domain_attach_device(struct iommu_domain *domain, @@ -4487,9 +4417,7 @@ static int intel_iommu_set_dev_pasid(struct iommu_dom= ain *domain, if (ret) goto out_free; =20 - if (domain_type_is_si(dmar_domain)) - ret =3D intel_pasid_setup_pass_through(iommu, dev, pasid); - else if (dmar_domain->use_first_level) + if (dmar_domain->use_first_level) ret =3D domain_setup_first_level(iommu, dmar_domain, dev, pasid); else --=20 2.34.1