From nobody Sat Feb 7 21:30:32 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.20]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2C9C78F6C for ; Wed, 10 Apr 2024 02:10:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.20 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712715009; cv=none; b=UfXJKuR73448cR7pNK3RnZnwFXXbKwJ0cQhJp422DwskPsAyXVq8PRd0FCQjvIopkCm18bYPTlvdVgHriVn+A8hcw1fnSe9WZUFt9a6vuhyYZaL2OaGk2OwtYiTZUAVJFi8D2iE5pJrjTdd7hdlU8zCxiWuu0oR09kt7uilTwtQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712715009; c=relaxed/simple; bh=tgJam1AA8RhybLELFvuMJfu2ggMrfQ1btbDCF0y2Kdk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=gbfRvfNpUJGbF8rDl4gkT9w5skOAUhKcJ3Eqd4os5hDDbmgb5gjE2Z6RaexYrR1Bg+Roo1Qlof0R7C7p7c0HQMfTFB2aJb4vY9URmF1N/AbhnD9MGzqmK6+qeWruLVzo5NnDqDftrBInOYH/us4ZhfTK4Y6Mb4FUpPEIVHlZtc4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Y8HsRsfq; arc=none smtp.client-ip=198.175.65.20 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Y8HsRsfq" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1712715008; x=1744251008; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tgJam1AA8RhybLELFvuMJfu2ggMrfQ1btbDCF0y2Kdk=; b=Y8HsRsfqp2ztga+oPv4NU1YPzS95IruZC5Hccz9Bl8iusQ3YnkxhnVa4 5MQ7Y52DsAmg8YPDFuXu1G+WcPcRO663LSo5ZoQziEukk4/V7rkqF1Hm9 tgT71j3J5uU/JmRppMwtKSXEgqfL2YmwE1RWMJoMSgy7kK7czzAIJCDRw FV1yjxd66W4Op+sjERIyZpBETz7UWoX4rwpfHQpJpthlyDYi+ZXmbj7Gq eNd05kGkKPEj8nUxCIxeJ3PB1Aq3jaRBq/44oB9F3Fp0UugA2LWckW46Q JfV3BNnwW2jXTr6KxImZ4rKbUso6KkQgT/U6zg5VM8ax3fyPWH83eJll1 g==; X-CSE-ConnectionGUID: xmIhYvh3SKSgNdk5+Iif1Q== X-CSE-MsgGUID: PmE94UdCSOSwo+4BCZ+BVw== X-IronPort-AV: E=McAfee;i="6600,9927,11039"; a="7918539" X-IronPort-AV: E=Sophos;i="6.07,190,1708416000"; d="scan'208";a="7918539" Received: from fmviesa007.fm.intel.com ([10.60.135.147]) by orvoesa112.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Apr 2024 19:10:07 -0700 X-CSE-ConnectionGUID: wVV8b4STSGO20CXD1PexxA== X-CSE-MsgGUID: MYLMYFFTScqj16e31lr/2g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,190,1708416000"; d="scan'208";a="20478888" Received: from unknown (HELO allen-box.sh.intel.com) ([10.239.159.127]) by fmviesa007.fm.intel.com with ESMTP; 09 Apr 2024 19:10:04 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon , Robin Murphy , Kevin Tian , Jason Gunthorpe Cc: Tina Zhang , Yi Liu , iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v2 02/12] iommu/vt-d: Add cache tag invalidation helpers Date: Wed, 10 Apr 2024 10:08:34 +0800 Message-Id: <20240410020844.253535-3-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240410020844.253535-1-baolu.lu@linux.intel.com> References: <20240410020844.253535-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add several helpers to invalidate the caches after mappings in the affected domain are changed. - cache_tag_flush_range() invalidates a range of caches after mappings within this range are changed. It uses the page-selective cache invalidation methods. - cache_tag_flush_all() invalidates all caches tagged by a domain ID. It uses the domain-selective cache invalidation methods. - cache_tag_flush_range_np() invalidates a range of caches when new mappings are created in the domain and the corresponding page table entries change from non-present to present. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.h | 14 +++ drivers/iommu/intel/cache.c | 195 ++++++++++++++++++++++++++++++++++++ drivers/iommu/intel/iommu.c | 12 --- 3 files changed, 209 insertions(+), 12 deletions(-) diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index 6f6bffc60852..700574421b51 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -35,6 +35,8 @@ #define VTD_PAGE_MASK (((u64)-1) << VTD_PAGE_SHIFT) #define VTD_PAGE_ALIGN(addr) (((addr) + VTD_PAGE_SIZE - 1) & VTD_PAGE_MASK) =20 +#define IOVA_PFN(addr) ((addr) >> PAGE_SHIFT) + #define VTD_STRIDE_SHIFT (9) #define VTD_STRIDE_MASK (((u64)-1) << VTD_STRIDE_SHIFT) =20 @@ -1041,6 +1043,13 @@ static inline void context_set_sm_pre(struct context= _entry *context) context->lo |=3D BIT_ULL(4); } =20 +/* Returns a number of VTD pages, but aligned to MM page size */ +static inline unsigned long aligned_nrpages(unsigned long host_addr, size_= t size) +{ + host_addr &=3D ~PAGE_MASK; + return PAGE_ALIGN(host_addr + size) >> VTD_PAGE_SHIFT; +} + /* Convert value to context PASID directory size field coding. */ #define context_pdts(pds) (((pds) & 0x7) << 9) =20 @@ -1116,6 +1125,11 @@ int cache_tag_assign_domain(struct dmar_domain *doma= in, u16 did, struct device *dev, ioasid_t pasid); void cache_tag_unassign_domain(struct dmar_domain *domain, u16 did, struct device *dev, ioasid_t pasid); +void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start, + unsigned long end, int ih); +void cache_tag_flush_all(struct dmar_domain *domain); +void cache_tag_flush_range_np(struct dmar_domain *domain, unsigned long st= art, + unsigned long end); =20 #ifdef CONFIG_INTEL_IOMMU_SVM void intel_svm_check(struct intel_iommu *iommu); diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c index debbdaeff1c4..b2270dc8a765 100644 --- a/drivers/iommu/intel/cache.c +++ b/drivers/iommu/intel/cache.c @@ -12,6 +12,7 @@ #include #include #include +#include #include =20 #include "iommu.h" @@ -194,3 +195,197 @@ void cache_tag_unassign_domain(struct dmar_domain *do= main, u16 did, if (domain->domain.type =3D=3D IOMMU_DOMAIN_NESTED) __cache_tag_unassign_parent_domain(domain->s2_domain, did, dev, pasid); } + +static unsigned long calculate_psi_aligned_address(unsigned long start, + unsigned long end, + unsigned long *_pages, + unsigned long *_mask) +{ + unsigned long pages =3D aligned_nrpages(start, end - start + 1); + unsigned long aligned_pages =3D __roundup_pow_of_two(pages); + unsigned long bitmask =3D aligned_pages - 1; + unsigned long mask =3D ilog2(aligned_pages); + unsigned long pfn =3D IOVA_PFN(start); + + /* + * PSI masks the low order bits of the base address. If the + * address isn't aligned to the mask, then compute a mask value + * needed to ensure the target range is flushed. + */ + if (unlikely(bitmask & pfn)) { + unsigned long end_pfn =3D pfn + pages - 1, shared_bits; + + /* + * Since end_pfn <=3D pfn + bitmask, the only way bits + * higher than bitmask can differ in pfn and end_pfn is + * by carrying. This means after masking out bitmask, + * high bits starting with the first set bit in + * shared_bits are all equal in both pfn and end_pfn. + */ + shared_bits =3D ~(pfn ^ end_pfn) & ~bitmask; + mask =3D shared_bits ? __ffs(shared_bits) : BITS_PER_LONG; + } + + *_pages =3D aligned_pages; + *_mask =3D mask; + + return ALIGN_DOWN(start, VTD_PAGE_SIZE); +} + +/* + * Invalidates a range of IOVA from @start (inclusive) to @end (inclusive) + * when the memory mappings in the target domain have been modified. + */ +void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start, + unsigned long end, int ih) +{ + unsigned long pages, mask, addr; + struct cache_tag *tag; + unsigned long flags; + + addr =3D calculate_psi_aligned_address(start, end, &pages, &mask); + + spin_lock_irqsave(&domain->cache_lock, flags); + list_for_each_entry(tag, &domain->cache_tags, node) { + struct intel_iommu *iommu =3D tag->iommu; + struct device_domain_info *info; + u16 sid; + + switch (tag->type) { + case CACHE_TAG_IOTLB: + case CACHE_TAG_PARENT_IOTLB: + if (domain->use_first_level) { + qi_flush_piotlb(iommu, tag->domain_id, + tag->pasid, addr, pages, ih); + } else { + /* + * Fallback to domain selective flush if no + * PSI support or the size is too big. + */ + if (!cap_pgsel_inv(iommu->cap) || + mask > cap_max_amask_val(iommu->cap)) + iommu->flush.flush_iotlb(iommu, tag->domain_id, + 0, 0, DMA_TLB_DSI_FLUSH); + else + iommu->flush.flush_iotlb(iommu, tag->domain_id, + addr | ih, mask, + DMA_TLB_PSI_FLUSH); + } + break; + case CACHE_TAG_PARENT_DEVTLB: + /* + * Address translation cache in device side caches the + * result of nested translation. There is no easy way + * to identify the exact set of nested translations + * affected by a change in S2. So just flush the entire + * device cache. + */ + addr =3D 0; + mask =3D MAX_AGAW_PFN_WIDTH; + fallthrough; + case CACHE_TAG_DEVTLB: + info =3D dev_iommu_priv_get(tag->dev); + sid =3D PCI_DEVID(info->bus, info->devfn); + + if (tag->pasid =3D=3D IOMMU_NO_PASID) + qi_flush_dev_iotlb(iommu, sid, info->pfsid, + info->ats_qdep, addr, mask); + else + qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid, + tag->pasid, info->ats_qdep, + addr, mask); + + quirk_extra_dev_tlb_flush(info, addr, mask, tag->pasid, info->ats_qdep); + break; + } + } + spin_unlock_irqrestore(&domain->cache_lock, flags); +} + +/* + * Invalidates all ranges of IOVA when the memory mappings in the target + * domain have been modified. + */ +void cache_tag_flush_all(struct dmar_domain *domain) +{ + struct cache_tag *tag; + unsigned long flags; + + spin_lock_irqsave(&domain->cache_lock, flags); + list_for_each_entry(tag, &domain->cache_tags, node) { + struct intel_iommu *iommu =3D tag->iommu; + struct device_domain_info *info; + u16 sid; + + switch (tag->type) { + case CACHE_TAG_IOTLB: + case CACHE_TAG_PARENT_IOTLB: + if (domain->use_first_level) + qi_flush_piotlb(iommu, tag->domain_id, + tag->pasid, 0, -1, 0); + else + iommu->flush.flush_iotlb(iommu, tag->domain_id, + 0, 0, DMA_TLB_DSI_FLUSH); + break; + case CACHE_TAG_DEVTLB: + case CACHE_TAG_PARENT_DEVTLB: + info =3D dev_iommu_priv_get(tag->dev); + sid =3D PCI_DEVID(info->bus, info->devfn); + + qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, + 0, MAX_AGAW_PFN_WIDTH); + quirk_extra_dev_tlb_flush(info, 0, MAX_AGAW_PFN_WIDTH, + IOMMU_NO_PASID, info->ats_qdep); + break; + } + } + spin_unlock_irqrestore(&domain->cache_lock, flags); +} + +/* + * Invalidate a range of IOVA when new mappings are created in the target + * domain. + * + * - VT-d spec, Section 6.1 Caching Mode: When the CM field is reported as + * Set, any software updates to remapping structures other than first- + * stage mapping requires explicit invalidation of the caches. + * - VT-d spec, Section 6.8 Write Buffer Flushing: For hardware that requi= res + * write buffer flushing, software must explicitly perform write-buffer + * flushing, if cache invalidation is not required. + */ +void cache_tag_flush_range_np(struct dmar_domain *domain, unsigned long st= art, + unsigned long end) +{ + unsigned long pages, mask, addr; + struct cache_tag *tag; + unsigned long flags; + + addr =3D calculate_psi_aligned_address(start, end, &pages, &mask); + + spin_lock_irqsave(&domain->cache_lock, flags); + list_for_each_entry(tag, &domain->cache_tags, node) { + struct intel_iommu *iommu =3D tag->iommu; + + if (!cap_caching_mode(iommu->cap) || domain->use_first_level) { + iommu_flush_write_buffer(iommu); + continue; + } + + if (tag->type =3D=3D CACHE_TAG_IOTLB || + tag->type =3D=3D CACHE_TAG_PARENT_IOTLB) { + /* + * Fallback to domain selective flush if no + * PSI support or the size is too big. + */ + if (!cap_pgsel_inv(iommu->cap) || + mask > cap_max_amask_val(iommu->cap)) + iommu->flush.flush_iotlb(iommu, tag->domain_id, + 0, 0, DMA_TLB_DSI_FLUSH); + else + iommu->flush.flush_iotlb(iommu, tag->domain_id, + addr, mask, + DMA_TLB_PSI_FLUSH); + } + } + spin_unlock_irqrestore(&domain->cache_lock, flags); +} diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 17b8ce4b9154..4572624a275e 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -54,11 +54,6 @@ __DOMAIN_MAX_PFN(gaw), (unsigned long)-1)) #define DOMAIN_MAX_ADDR(gaw) (((uint64_t)__DOMAIN_MAX_PFN(gaw)) << VTD_PAG= E_SHIFT) =20 -/* IO virtual address start page frame number */ -#define IOVA_START_PFN (1) - -#define IOVA_PFN(addr) ((addr) >> PAGE_SHIFT) - static void __init check_tylersburg_isoch(void); static int rwbf_quirk; =20 @@ -1985,13 +1980,6 @@ domain_context_mapping(struct dmar_domain *domain, s= truct device *dev) domain_context_mapping_cb, domain); } =20 -/* Returns a number of VTD pages, but aligned to MM page size */ -static unsigned long aligned_nrpages(unsigned long host_addr, size_t size) -{ - host_addr &=3D ~PAGE_MASK; - return PAGE_ALIGN(host_addr + size) >> VTD_PAGE_SHIFT; -} - /* Return largest possible superpage level for a given mapping */ static int hardware_largepage_caps(struct dmar_domain *domain, unsigned lo= ng iov_pfn, unsigned long phy_pfn, unsigned long pages) --=20 2.34.1