From nobody Fri Dec 19 02:49:51 2025
From: Lu Baolu
To: Joerg Roedel, Will Deacon
Cc: Tina Zhang, Sanjay K Kumar, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 01/14] iommu/vt-d: Require DMA domain if hardware not support passthrough
Date: Mon, 2 Sep 2024 10:27:11 +0800
Message-Id: <20240902022724.67059-2-baolu.lu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com>
References: <20240902022724.67059-1-baolu.lu@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The iommu core defines the def_domain_type callback to query the iommu
driver about hardware capability and quirks. The iommu driver should
declare an IOMMU_DOMAIN_DMA requirement for hardware lacking pass-through
capability.

Earlier VT-d hardware implementations did not support pass-through
translation mode. Before def_domain_type was introduced, the iommu driver
simulated pass-through translation with a paging domain in which all
physical system memory is identity-mapped (each physical address mapped
to an equal IOVA), and that behavior has been kept until now. Adjust this
so that the Intel iommu driver follows the def_domain_type semantics.

Signed-off-by: Lu Baolu
Reviewed-by: Jason Gunthorpe
Reviewed-by: Kevin Tian
Reviewed-by: Jerry Snitselaar
Link: https://lore.kernel.org/r/20240809055431.36513-2-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 4aa070cf56e7..5193986e420b 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2151,6 +2151,16 @@ static bool device_rmrr_is_relaxable(struct device *dev)
 
 static int device_def_domain_type(struct device *dev)
 {
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct intel_iommu *iommu = info->iommu;
+
+	/*
+	 * Hardware does not support the passthrough translation mode.
+	 * Always use a dynamic mapping domain.
+	 */
+	if (!ecap_pass_through(iommu->ecap))
+		return IOMMU_DOMAIN_DMA;
+
 	if (dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(dev);
 
-- 
2.34.1

From nobody Fri Dec 19 02:49:51 2025
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244283;
x=1756780283; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=obfscsfgYiGQYs6L1isOiSoQt2RpBmLfO8bu79MSrgs=; b=Rm2cz1kTydDHpI/VAwS8MWDOUiQc+AlujoD6b408jvlr8/s6BWIS1ZBl h9IK6ZLhR5Mz2CS0TUWZrXOOEJbt2aTpLBuPenxqcPsP6hJLNul886n9X 0X928Modo8FJ4EALbXRkx1SG1vKjYNazAjNUQLwA1T2xPQitWtOVAyauU AZPla81bBmFviyKS5eHAr99hEEXPqk4r6icBh2qUIf8Grsz5fnaKTQRLQ qU/lVt8Pnx7T0qzAypY9+sAYAL633uCghDcUW1+HsN+u8q51hS3EhZ2jc H4yom+vZjnzkQIIkdAqMJtaoWf1ByQhm/ukk9U939m4N0c7Z5Pm/gRbPI Q==; X-CSE-ConnectionGUID: sNU8xGCjQcioLvAkrCprmA== X-CSE-MsgGUID: pEvFzhKKQM2Ct4lkKQk27w== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994276" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994276" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:23 -0700 X-CSE-ConnectionGUID: BmGE2hYqT/WKCuEfCXwSDQ== X-CSE-MsgGUID: Tu2Yp65oTdupXB9gW7A/jQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359277" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:22 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 02/14] iommu/vt-d: Remove identity mappings from si_domain Date: Mon, 2 Sep 2024 10:27:12 +0800 Message-Id: <20240902022724.67059-3-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" As the driver has enforced DMA domains for devices managed by an IOMMU hardware that doesn't support passthrough translation mode, there is no need for static identity mappings in the si_domain. Remove the identity mapping code to avoid dead code. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240809055431.36513-3-baolu.lu@linux.intel= .com --- drivers/iommu/intel/iommu.c | 122 ++---------------------------------- 1 file changed, 4 insertions(+), 118 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 5193986e420b..a92da6375efe 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -167,14 +167,7 @@ static void device_rbtree_remove(struct device_domain_= info *info) spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags); } =20 -/* - * This domain is a statically identity mapping domain. - * 1. This domain creats a static 1:1 mapping to all usable memory. - * 2. It maps to each iommu if successful. - * 3. Each iommu mapps to this domain if successful. 
- */ static struct dmar_domain *si_domain; -static int hw_pass_through =3D 1; =20 struct dmar_rmrr_unit { struct list_head list; /* list of rmrr units */ @@ -1647,7 +1640,7 @@ static int domain_context_mapping_one(struct dmar_dom= ain *domain, struct context_entry *context; int agaw, ret; =20 - if (hw_pass_through && domain_type_is_si(domain)) + if (domain_type_is_si(domain)) translation =3D CONTEXT_TT_PASS_THROUGH; =20 pr_debug("Set context mapping for %02x:%02x.%d\n", @@ -2000,29 +1993,10 @@ static bool dev_is_real_dma_subdevice(struct device= *dev) pci_real_dma_dev(to_pci_dev(dev)) !=3D to_pci_dev(dev); } =20 -static int iommu_domain_identity_map(struct dmar_domain *domain, - unsigned long first_vpfn, - unsigned long last_vpfn) -{ - /* - * RMRR range might have overlap with physical memory range, - * clear it first - */ - dma_pte_clear_range(domain, first_vpfn, last_vpfn); - - return __domain_mapping(domain, first_vpfn, - first_vpfn, last_vpfn - first_vpfn + 1, - DMA_PTE_READ|DMA_PTE_WRITE, GFP_KERNEL); -} - static int md_domain_init(struct dmar_domain *domain, int guest_width); =20 -static int __init si_domain_init(int hw) +static int __init si_domain_init(void) { - struct dmar_rmrr_unit *rmrr; - struct device *dev; - int i, nid, ret; - si_domain =3D alloc_domain(IOMMU_DOMAIN_IDENTITY); if (!si_domain) return -EFAULT; @@ -2033,44 +2007,6 @@ static int __init si_domain_init(int hw) return -EFAULT; } =20 - if (hw) - return 0; - - for_each_online_node(nid) { - unsigned long start_pfn, end_pfn; - int i; - - for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) { - ret =3D iommu_domain_identity_map(si_domain, - mm_to_dma_pfn_start(start_pfn), - mm_to_dma_pfn_end(end_pfn-1)); - if (ret) - return ret; - } - } - - /* - * Identity map the RMRRs so that devices with RMRRs could also use - * the si_domain. 
- */ - for_each_rmrr_units(rmrr) { - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt, - i, dev) { - unsigned long long start =3D rmrr->base_address; - unsigned long long end =3D rmrr->end_address; - - if (WARN_ON(end < start || - end >> agaw_to_width(si_domain->agaw))) - continue; - - ret =3D iommu_domain_identity_map(si_domain, - mm_to_dma_pfn_start(start >> PAGE_SHIFT), - mm_to_dma_pfn_end(end >> PAGE_SHIFT)); - if (ret) - return ret; - } - } - return 0; } =20 @@ -2096,7 +2032,7 @@ static int dmar_domain_attach_device(struct dmar_doma= in *domain, =20 if (!sm_supported(iommu)) ret =3D domain_context_mapping(domain, dev); - else if (hw_pass_through && domain_type_is_si(domain)) + else if (domain_type_is_si(domain)) ret =3D intel_pasid_setup_pass_through(iommu, dev, IOMMU_NO_PASID); else if (domain->use_first_level) ret =3D domain_setup_first_level(iommu, domain, dev, IOMMU_NO_PASID); @@ -2451,8 +2387,6 @@ static int __init init_dmars(void) } } =20 - if (!ecap_pass_through(iommu->ecap)) - hw_pass_through =3D 0; intel_svm_check(iommu); } =20 @@ -2468,7 +2402,7 @@ static int __init init_dmars(void) =20 check_tylersburg_isoch(); =20 - ret =3D si_domain_init(hw_pass_through); + ret =3D si_domain_init(); if (ret) goto free_iommu; =20 @@ -2895,12 +2829,6 @@ static int intel_iommu_add(struct dmar_drhd_unit *dm= aru) if (ret) goto out; =20 - if (hw_pass_through && !ecap_pass_through(iommu->ecap)) { - pr_warn("%s: Doesn't support hardware pass through.\n", - iommu->name); - return -ENXIO; - } - sp =3D domain_update_iommu_superpage(NULL, iommu) - 1; if (sp >=3D 0 && !(cap_super_page_val(iommu->cap) & (1 << sp))) { pr_warn("%s: Doesn't support large page.\n", @@ -3151,43 +3079,6 @@ int dmar_iommu_notify_scope_dev(struct dmar_pci_noti= fy_info *info) return 0; } =20 -static int intel_iommu_memory_notifier(struct notifier_block *nb, - unsigned long val, void *v) -{ - struct memory_notify *mhp =3D v; - unsigned long start_vpfn =3D mm_to_dma_pfn_start(mhp->start_pfn); - unsigned long last_vpfn =3D mm_to_dma_pfn_end(mhp->start_pfn + - mhp->nr_pages - 1); - - switch (val) { - case MEM_GOING_ONLINE: - if (iommu_domain_identity_map(si_domain, - start_vpfn, last_vpfn)) { - pr_warn("Failed to build identity map for [%lx-%lx]\n", - start_vpfn, last_vpfn); - return NOTIFY_BAD; - } - break; - - case MEM_OFFLINE: - case MEM_CANCEL_ONLINE: - { - LIST_HEAD(freelist); - - domain_unmap(si_domain, start_vpfn, last_vpfn, &freelist); - iommu_put_pages_list(&freelist); - } - break; - } - - return NOTIFY_OK; -} - -static struct notifier_block intel_iommu_memory_nb =3D { - .notifier_call =3D intel_iommu_memory_notifier, - .priority =3D 0 -}; - static void intel_disable_iommus(void) { struct intel_iommu *iommu =3D NULL; @@ -3484,12 +3375,7 @@ int __init intel_iommu_init(void) =20 iommu_pmu_register(iommu); } - up_read(&dmar_global_lock); =20 - if (si_domain && !hw_pass_through) - register_memory_notifier(&intel_iommu_memory_nb); - - down_read(&dmar_global_lock); if (probe_acpi_namespace_devices()) pr_warn("ACPI name space devices didn't probe correctly\n"); =20 --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D9FD93BB50 for ; Mon, 2 Sep 2024 02:31:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1725244286; cv=none; b=fwVrwJk4ckMI/eLevYxD9sQdPzdpvHyzMi+Ox7cLfcYEcpSia5H2gwl5sZcOv1OUyVNpIe+l/NgE74GPuEOwtmI43EfHG267UleXToOJN3pqLOBKhgyQkf4dEoK/BZeqi4FEQcel7hyfbjOhNM3G84d4rMAVzMA5E3roiJCdr0c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244286; c=relaxed/simple; bh=0PZrnab4FudM+5CSpi+Okv7WquImKuGdpCZBN2D7HPU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=XnMkXj5mCGO6thVs58MbrwfT8QjDj0UsxgyDJkKJXYfnPnRBf82/pZVCczKAT9Vc5pELanwa9/5BSWuObvWmre0xJbP5LbxeT8zSFUe7mOAGWeDBXukB5qcz3rVQEQ2tHzUhCAuY9YXlNnHuvf5MbYnwda4MvWkjDlRCL+5QmFM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=lxhrV+bF; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="lxhrV+bF" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244285; x=1756780285; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0PZrnab4FudM+5CSpi+Okv7WquImKuGdpCZBN2D7HPU=; b=lxhrV+bFbX4pzHH3YoCSidEmVb2vVBpLVUvFUwr7z2oc97lBazCDrI6b jGhbXLiBl0a+Y1sKLDLzu9ztSxESdN34aBDbZrZonQhid9mjaKqWSV7Xh JxuRwVadxepKd9mjkRuHgKS796BjTOR7+533Cw0NkJTA60n0nk5spGJdN LQsZUbCgXeuCkwgPRw+fVZ6ktqd6Qz3RQ2KfXbCCcX2h02RffxGmLBjEZ f/vTez4rWAydIjmpAwKKeVlm9QxmK3UnW/5Br4wB4ZA5sCc4qM3KbMupI dC5XObg67HepxdLVmM+38skBu5IFt21eTYhhrMgU3986IsLZ3Bu02KQBk g==; X-CSE-ConnectionGUID: OMTx5WfmTrWpVtmtOiTlSQ== X-CSE-MsgGUID: RzcVzVr/SiGZN9Z4BG8xlA== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994280" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994280" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:25 -0700 X-CSE-ConnectionGUID: 1P/ZVzKXRUqkO+nQ/xRdpA== X-CSE-MsgGUID: CKAJlbw+ReCXxPK1Ac2JDQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359283" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:24 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 03/14] iommu/vt-d: Always reserve a domain ID for identity setup Date: Mon, 2 Sep 2024 10:27:13 +0800 Message-Id: <20240902022724.67059-4-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" We will use a global static identity domain. Reserve a static domain ID for it. 
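
As a rough illustration of the idea (a plain user-space sketch; the bitmap
allocator and the macro value here are stand-ins, not the driver's
definitions): setting the reserved bit before any dynamic allocation
guarantees the ID can never be handed out to a paging domain, so it stays
available for the identity setup added later in this series.

  #include <stdio.h>
  #include <string.h>

  #define NDOMAINS         16
  #define FLPT_DEFAULT_DID 0      /* illustrative value; the driver defines its own */

  static unsigned char domain_ids[NDOMAINS];     /* 0 = free, 1 = in use */

  /* Reserve the fixed ID before any dynamic allocation happens. */
  static void init_domains(void)
  {
          memset(domain_ids, 0, sizeof(domain_ids));
          domain_ids[FLPT_DEFAULT_DID] = 1;
  }

  /* First-fit allocator; it can never return the reserved ID. */
  static int alloc_domain_id(void)
  {
          int i;

          for (i = 0; i < NDOMAINS; i++) {
                  if (!domain_ids[i]) {
                          domain_ids[i] = 1;
                          return i;
                  }
          }
          return -1;
  }

  int main(void)
  {
          init_domains();
          printf("first dynamically allocated ID: %d (reserved ID: %d)\n",
                 alloc_domain_id(), FLPT_DEFAULT_DID);
          return 0;
  }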
Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Jerry Snitselaar Link: https://lore.kernel.org/r/20240809055431.36513-4-baolu.lu@linux.intel= .com --- drivers/iommu/intel/iommu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index a92da6375efe..baf79c6a5a84 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1440,10 +1440,10 @@ static int iommu_init_domains(struct intel_iommu *i= ommu) * entry for first-level or pass-through translation modes should * be programmed with a domain id different from those used for * second-level or nested translation. We reserve a domain id for - * this purpose. + * this purpose. This domain id is also used for identity domain + * in legacy mode. */ - if (sm_supported(iommu)) - set_bit(FLPT_DEFAULT_DID, iommu->domain_ids); + set_bit(FLPT_DEFAULT_DID, iommu->domain_ids); =20 return 0; } --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5478342AB4 for ; Mon, 2 Sep 2024 02:31:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244288; cv=none; b=tvkV4CCDBLVmjmD8dMasJScpdmpoWIYTCDnMBfUQ4TLnARD+4K4EK7q2xTmB1u9MNC8RaH6y+HrNFnpGuqF0JNe0m1IE+/lWfbsZuTT2onlgZJzyRBZ00ATVHphjURJkNo4m62eZPqI36GVzF29Zko1bBpt7FH8u9ocPxKZutlE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244288; c=relaxed/simple; bh=s0E6ICdgpmiD25EF4GAVXtGxvzQCqhxtQQwT91zUvSA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=me0FKL7BnZXm6w+nOOM4jMIYZBYzACeNTKQVAop25QfqK6gCllINcSMg1Mh63EpnZVCU1rL5Lt8k9JZOijJL8A5+n24PraqebUp7E81AR5EhfAHGrwwZKzF79bPhNeOmh+J9pfq/WJXQiIRTM+2SoIdN3lHDKdxA8ml9NZPaqdY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=HFHRasqq; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="HFHRasqq" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244287; x=1756780287; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=s0E6ICdgpmiD25EF4GAVXtGxvzQCqhxtQQwT91zUvSA=; b=HFHRasqqnP1Dm91gfXrYQM6XCnjvdrtdXOay8C55jrirH0NROrAg4omn irBAtKHyHVowBxnwtb7nmAkn+S2Zk5KTWSzu4CLhxXxIbngkOZWjLM5Sy BkVlEy+w6y0f7HEdFC1ZY5DlFFBpWI7vCJzJBIrbNRIV4i0N8RhNsynKv KgpEsAekGAcUxQoRbJELJjE6Bag+QWrQMyJ0eKVbx/vbJrGchLtYmUIe2 NDG82fZdwiyLY9SWRvPDIUvi0hyYuO4NNvohcUDG9fmRAShaR36JkmW75 3COA17Q8zd6u8gnSHrgvfyRoi14PR4ISeeC8nKNJzAxYlJQdzgPMMnreB g==; X-CSE-ConnectionGUID: X7MtvuBBQJu4jSWDB1vMEQ== X-CSE-MsgGUID: xrxLJ/1fQ4K0OaMsvRW+FQ== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994286" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994286" Received: 
from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:27 -0700 X-CSE-ConnectionGUID: YIDfa1x5TEuVlJrU8gvHNQ== X-CSE-MsgGUID: H45BLma4QHe0ZRF5+wsVxw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359287" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:25 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 04/14] iommu/vt-d: Remove has_iotlb_device flag Date: Mon, 2 Sep 2024 10:27:14 +0800 Message-Id: <20240902022724.67059-5-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The has_iotlb_device flag was used to indicate if a domain had attached devices with ATS enabled. Domains without this flag didn't require device TLB invalidation during unmap operations, optimizing performance by avoiding unnecessary device iteration. With the introduction of cache tags, this flag is no longer needed. The code to iterate over attached devices was removed by commit 06792d067989 ("iommu/vt-d: Cleanup use of iommu_flush_iotlb_psi()"). Remove has_iotlb_device to avoid unnecessary code. Suggested-by: Jason Gunthorpe Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240809055431.36513-5-baolu.lu@linux.intel= .com --- drivers/iommu/intel/iommu.h | 2 -- drivers/iommu/intel/iommu.c | 34 +--------------------------------- drivers/iommu/intel/nested.c | 2 -- 3 files changed, 1 insertion(+), 37 deletions(-) diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index a969be2258b1..b3b295e60626 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -588,7 +588,6 @@ struct dmar_domain { int nid; /* node id */ struct xarray iommu_array; /* Attached IOMMU array */ =20 - u8 has_iotlb_device: 1; u8 iommu_coherency: 1; /* indicate coherency of iommu access */ u8 force_snooping : 1; /* Create IOPTEs with snoop control */ u8 set_pte_snp:1; @@ -1104,7 +1103,6 @@ int qi_submit_sync(struct intel_iommu *iommu, struct = qi_desc *desc, */ #define QI_OPT_WAIT_DRAIN BIT(0) =20 -void domain_update_iotlb(struct dmar_domain *domain); int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *io= mmu); void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *i= ommu); void device_block_translation(struct device *dev); diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index baf79c6a5a84..b58137ecaf87 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -485,7 +485,6 @@ void domain_update_iommu_cap(struct dmar_domain *domain) domain->domain.geometry.aperture_end =3D __DOMAIN_MAX_ADDR(domain->gaw); =20 domain->domain.pgsize_bitmap |=3D domain_super_pgsize_bitmap(domain); - domain_update_iotlb(domain); } =20 struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus, @@ -1263,32 +1262,6 @@ domain_lookup_dev_info(struct dmar_domain *domain, return NULL; } =20 -void domain_update_iotlb(struct dmar_domain *domain) -{ - struct dev_pasid_info *dev_pasid; - struct 
device_domain_info *info; - bool has_iotlb_device =3D false; - unsigned long flags; - - spin_lock_irqsave(&domain->lock, flags); - list_for_each_entry(info, &domain->devices, link) { - if (info->ats_enabled) { - has_iotlb_device =3D true; - break; - } - } - - list_for_each_entry(dev_pasid, &domain->dev_pasids, link_domain) { - info =3D dev_iommu_priv_get(dev_pasid->dev); - if (info->ats_enabled) { - has_iotlb_device =3D true; - break; - } - } - domain->has_iotlb_device =3D has_iotlb_device; - spin_unlock_irqrestore(&domain->lock, flags); -} - /* * The extra devTLB flush quirk impacts those QAT devices with PCI device * IDs ranging from 0x4940 to 0x4943. It is exempted from risky_device() @@ -1325,10 +1298,8 @@ static void iommu_enable_pci_caps(struct device_doma= in_info *info) info->pasid_enabled =3D 1; =20 if (info->ats_supported && pci_ats_page_aligned(pdev) && - !pci_enable_ats(pdev, VTD_PAGE_SHIFT)) { + !pci_enable_ats(pdev, VTD_PAGE_SHIFT)) info->ats_enabled =3D 1; - domain_update_iotlb(info->domain); - } } =20 static void iommu_disable_pci_caps(struct device_domain_info *info) @@ -1343,7 +1314,6 @@ static void iommu_disable_pci_caps(struct device_doma= in_info *info) if (info->ats_enabled) { pci_disable_ats(pdev); info->ats_enabled =3D 0; - domain_update_iotlb(info->domain); } =20 if (info->pasid_enabled) { @@ -1517,7 +1487,6 @@ static struct dmar_domain *alloc_domain(unsigned int = type) domain->nid =3D NUMA_NO_NODE; if (first_level_by_default(type)) domain->use_first_level =3D true; - domain->has_iotlb_device =3D false; INIT_LIST_HEAD(&domain->devices); INIT_LIST_HEAD(&domain->dev_pasids); INIT_LIST_HEAD(&domain->cache_tags); @@ -3520,7 +3489,6 @@ static struct dmar_domain *paging_domain_alloc(struct= device *dev, bool first_st xa_init(&domain->iommu_array); =20 domain->nid =3D dev_to_node(dev); - domain->has_iotlb_device =3D info->ats_enabled; domain->use_first_level =3D first_stage; =20 /* calculate the address width */ diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c index 16a2bcf5cfeb..36a91b1b52be 100644 --- a/drivers/iommu/intel/nested.c +++ b/drivers/iommu/intel/nested.c @@ -66,8 +66,6 @@ static int intel_nested_attach_dev(struct iommu_domain *d= omain, list_add(&info->link, &dmar_domain->devices); spin_unlock_irqrestore(&dmar_domain->lock, flags); =20 - domain_update_iotlb(dmar_domain); - return 0; unassign_tag: cache_tag_unassign_domain(dmar_domain, dev, IOMMU_NO_PASID); --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0B86342ABE for ; Mon, 2 Sep 2024 02:31:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244290; cv=none; b=RjFGNWTc0YMLFpHaAya0hL8o18jPrrJuhXfGUsL6hq6C9eiz432Modj/rZfwI/VqW+5dh6irEFFbYwVv8gM+PhNZ7TfDZ4DFxOTh/N4m+t1bm7i7TXilAA2YBkXrY/ZPY4zi3usz3fiHOSGSSzyFNTFHNDxZ+R8dxgTTmxVtE8Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244290; c=relaxed/simple; bh=3HzOKvzO9pY4piCmz7QwkmYFBvAbH4HKR8epidlTfh4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=TdUs8zyikuecc4Ne7SZIvp38+CjmZCBsVjwPxTQ2twO3h+UUtwJAGYM/GUN10ty9XicIWFQ+QvC81rAnTFwF7cgn9zEubZfJ1vWQ+8NUK2RydwwT+gc4+AxI/ZWj9NJArMgOBP7yliLAsUq8kWSzcwb80lu6D14r+Jid/shvhSo= 
From: Lu Baolu
To: Joerg Roedel, Will Deacon
Cc: Tina Zhang, Sanjay K Kumar, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 05/14] iommu/vt-d: Factor out helpers from domain_context_mapping_one()
Date: Mon, 2 Sep 2024 10:27:15 +0800
Message-Id: <20240902022724.67059-6-baolu.lu@linux.intel.com>
In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com>
References: <20240902022724.67059-1-baolu.lu@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Extract common code from domain_context_mapping_one() into new helpers,
making it reusable by other functions such as the upcoming identity
domain implementation. No intentional functional changes.
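
The shape of the refactor, sketched in stand-alone user-space C (function
bodies are reduced to prints and the parameters are simplified; only the
helper names mirror the diff below): both the existing context-mapping
path and the identity path added later in the series call the same
teardown and cache-flush helpers instead of open-coding them.

  #include <stdio.h>

  /* Stand-ins for copied_context_tear_down()/context_present_cache_flush(). */
  static void copied_context_tear_down(int bus, int devfn)
  {
          printf("tear down copied context for %02x:%02x\n", bus, devfn);
  }

  static void context_present_cache_flush(int did, int bus, int devfn)
  {
          printf("flush caches for did %d, device %02x:%02x\n", did, bus, devfn);
  }

  /* Existing path: map a device context to a paging domain. */
  static void domain_context_mapping_one(int did, int bus, int devfn)
  {
          copied_context_tear_down(bus, devfn);
          /* ... program page-table root and address width ... */
          context_present_cache_flush(did, bus, devfn);
  }

  /* Later path (patch 06 in this series): program a pass-through context entry. */
  static void context_setup_pass_through(int did, int bus, int devfn)
  {
          copied_context_tear_down(bus, devfn);
          /* ... program pass-through translation type ... */
          context_present_cache_flush(did, bus, devfn);
  }

  int main(void)
  {
          domain_context_mapping_one(1, 0, 0x10);
          context_setup_pass_through(0, 0, 0x10);
          return 0;
  }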
Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Jerry Snitselaar Link: https://lore.kernel.org/r/20240809055431.36513-6-baolu.lu@linux.intel= .com --- drivers/iommu/intel/iommu.c | 99 ++++++++++++++++++++++--------------- 1 file changed, 58 insertions(+), 41 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index b58137ecaf87..41c47410736b 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1597,6 +1597,61 @@ static void domain_exit(struct dmar_domain *domain) kfree(domain); } =20 +/* + * For kdump cases, old valid entries may be cached due to the + * in-flight DMA and copied pgtable, but there is no unmapping + * behaviour for them, thus we need an explicit cache flush for + * the newly-mapped device. For kdump, at this point, the device + * is supposed to finish reset at its driver probe stage, so no + * in-flight DMA will exist, and we don't need to worry anymore + * hereafter. + */ +static void copied_context_tear_down(struct intel_iommu *iommu, + struct context_entry *context, + u8 bus, u8 devfn) +{ + u16 did_old; + + if (!context_copied(iommu, bus, devfn)) + return; + + assert_spin_locked(&iommu->lock); + + did_old =3D context_domain_id(context); + context_clear_entry(context); + + if (did_old < cap_ndoms(iommu->cap)) { + iommu->flush.flush_context(iommu, did_old, + (((u16)bus) << 8) | devfn, + DMA_CCMD_MASK_NOBIT, + DMA_CCMD_DEVICE_INVL); + iommu->flush.flush_iotlb(iommu, did_old, 0, 0, + DMA_TLB_DSI_FLUSH); + } + + clear_context_copied(iommu, bus, devfn); +} + +/* + * It's a non-present to present mapping. If hardware doesn't cache + * non-present entry we only need to flush the write-buffer. If the + * _does_ cache non-present entries, then it does so in the special + * domain #0, which we have to flush: + */ +static void context_present_cache_flush(struct intel_iommu *iommu, u16 did, + u8 bus, u8 devfn) +{ + if (cap_caching_mode(iommu->cap)) { + iommu->flush.flush_context(iommu, 0, + (((u16)bus) << 8) | devfn, + DMA_CCMD_MASK_NOBIT, + DMA_CCMD_DEVICE_INVL); + iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH); + } else { + iommu_flush_write_buffer(iommu); + } +} + static int domain_context_mapping_one(struct dmar_domain *domain, struct intel_iommu *iommu, u8 bus, u8 devfn) @@ -1625,31 +1680,9 @@ static int domain_context_mapping_one(struct dmar_do= main *domain, if (context_present(context) && !context_copied(iommu, bus, devfn)) goto out_unlock; =20 - /* - * For kdump cases, old valid entries may be cached due to the - * in-flight DMA and copied pgtable, but there is no unmapping - * behaviour for them, thus we need an explicit cache flush for - * the newly-mapped device. For kdump, at this point, the device - * is supposed to finish reset at its driver probe stage, so no - * in-flight DMA will exist, and we don't need to worry anymore - * hereafter. 
- */ - if (context_copied(iommu, bus, devfn)) { - u16 did_old =3D context_domain_id(context); - - if (did_old < cap_ndoms(iommu->cap)) { - iommu->flush.flush_context(iommu, did_old, - (((u16)bus) << 8) | devfn, - DMA_CCMD_MASK_NOBIT, - DMA_CCMD_DEVICE_INVL); - iommu->flush.flush_iotlb(iommu, did_old, 0, 0, - DMA_TLB_DSI_FLUSH); - } - - clear_context_copied(iommu, bus, devfn); - } - + copied_context_tear_down(iommu, context, bus, devfn); context_clear_entry(context); + context_set_domain_id(context, did); =20 if (translation !=3D CONTEXT_TT_PASS_THROUGH) { @@ -1685,23 +1718,7 @@ static int domain_context_mapping_one(struct dmar_do= main *domain, context_set_present(context); if (!ecap_coherent(iommu->ecap)) clflush_cache_range(context, sizeof(*context)); - - /* - * It's a non-present to present mapping. If hardware doesn't cache - * non-present entry we only need to flush the write-buffer. If the - * _does_ cache non-present entries, then it does so in the special - * domain #0, which we have to flush: - */ - if (cap_caching_mode(iommu->cap)) { - iommu->flush.flush_context(iommu, 0, - (((u16)bus) << 8) | devfn, - DMA_CCMD_MASK_NOBIT, - DMA_CCMD_DEVICE_INVL); - iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH); - } else { - iommu_flush_write_buffer(iommu); - } - + context_present_cache_flush(iommu, did, bus, devfn); ret =3D 0; =20 out_unlock: --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56D814AEE0 for ; Mon, 2 Sep 2024 02:31:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244292; cv=none; b=n+r1yNy9Fg9u7Eeib7UraZ9EE2OL9698hNtkIdcR9PTl45aJlsNPNAS31i5K2Lk5sBvpMbhPE/Z1kD6wbmBG/OlhsgxGwJsUrjBU8NYGuNIFzGAsLbqzB8BhjDLhU+f+HsSS22pCITexKbncLvESQiYcck+EVZ2oUTZCvsDgXNg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244292; c=relaxed/simple; bh=deFbhjCg3D4MlJfauctTLDOm141eNNZjML2Qbu39cek=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=N/0SG3VK6PhYW8hD5tJ5Zbs7KlTLigzDVh1w3fNW8vI1A81HRP3se1uiGnhZN43LuMFHZyqbkmMdrzpoF7g7BrwU8DiccemCHfAS/AiEOH2aZ4GM+cORBBsUJ9cInhqgrru4/3nSVYTd7HoZuiWGv+7HdqnmiBOh4+ONyGuMY/I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=BCV/iwsl; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="BCV/iwsl" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244291; x=1756780291; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=deFbhjCg3D4MlJfauctTLDOm141eNNZjML2Qbu39cek=; b=BCV/iwsl1qQxTG4lxkQq19QG1uCPAAujhNM+Z4mfqjJ8+2cYCj0GkqI3 xXvIZ+O/LQNVomWnT5XhLKXjOwRt/8XqEGuGQUiaHmY+meDfwv1xMftw5 DQ35CQWuR/DqFjvUh3FM9F08DWidxY9gcz854I5GF5XO+a+oWMukzI1Pr 
iFPvpMVPTJWWRNUrm8BD2daTpqwl9sMt0gCmyVQ9cju24JkvHASYGGlRa FT/rB0xf40CAbQmcim6xHy3oL5N/9I7tTHwx0O+sg6YBYjHSqL60kytKa gtbPbNy2PAiyBXiKDTevAn+vcUqX5+x+q5a0uQiT6tys1qHGHjpyu5kna A==; X-CSE-ConnectionGUID: 95fQleHrQg+AgHle0B/tCg== X-CSE-MsgGUID: Upcr96HPTN2XVbfmDZBuIA== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994303" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994303" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:31 -0700 X-CSE-ConnectionGUID: qSsEiDwkSCuscu0MPJVf8A== X-CSE-MsgGUID: 8Ba3g9LcT2Kmr7UEHlgLjg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359316" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:29 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 06/14] iommu/vt-d: Add support for static identity domain Date: Mon, 2 Sep 2024 10:27:16 +0800 Message-Id: <20240902022724.67059-7-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Software determines VT-d hardware support for passthrough translation by inspecting the capability register. If passthrough translation is not supported, the device is instructed to use DMA domain for its default domain. Add a global static identity domain with guaranteed attach semantics for IOMMUs that support passthrough translation mode. The global static identity domain is a dummy domain without corresponding dmar_domain structure. Consequently, the device's info->domain will be NULL with the identity domain is attached. Refactor the code accordingly. 
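
A minimal user-space sketch of the consequence noted above (the types and
the DID value are simplified stand-ins, not the kernel's): with the static
identity domain attached, info->domain stays NULL, so code that derives a
domain ID must fall back to the reserved default DID instead of
dereferencing the domain, as the svm.c hunk below does.

  #include <stdio.h>
  #include <stddef.h>

  #define FLPT_DEFAULT_DID 0      /* illustrative value */

  struct dmar_domain {
          int did;                        /* per-domain ID of a paging domain */
  };

  struct device_domain_info {
          struct dmar_domain *domain;     /* NULL when identity-attached */
  };

  static int domain_id_for(struct device_domain_info *info)
  {
          /* A NULL domain means the device uses the static identity domain. */
          return info->domain ? info->domain->did : FLPT_DEFAULT_DID;
  }

  int main(void)
  {
          struct dmar_domain paging = { .did = 5 };
          struct device_domain_info dma_dev = { .domain = &paging };
          struct device_domain_info idty_dev = { .domain = NULL };

          printf("paging-attached device uses DID %d\n", domain_id_for(&dma_dev));
          printf("identity-attached device uses DID %d\n", domain_id_for(&idty_dev));
          return 0;
  }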
Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240809055431.36513-7-baolu.lu@linux.intel= .com --- drivers/iommu/intel/iommu.c | 114 ++++++++++++++++++++++++++++++++++-- drivers/iommu/intel/svm.c | 2 +- 2 files changed, 111 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 41c47410736b..df34416ba7bf 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3693,11 +3693,9 @@ int prepare_domain_attach_device(struct iommu_domain= *domain, static int intel_iommu_attach_device(struct iommu_domain *domain, struct device *dev) { - struct device_domain_info *info =3D dev_iommu_priv_get(dev); int ret; =20 - if (info->domain) - device_block_translation(dev); + device_block_translation(dev); =20 ret =3D prepare_domain_attach_device(domain, dev); if (ret) @@ -4305,11 +4303,17 @@ static void intel_iommu_remove_dev_pasid(struct dev= ice *dev, ioasid_t pasid, struct iommu_domain *domain) { struct device_domain_info *info =3D dev_iommu_priv_get(dev); - struct dmar_domain *dmar_domain =3D to_dmar_domain(domain); struct dev_pasid_info *curr, *dev_pasid =3D NULL; struct intel_iommu *iommu =3D info->iommu; + struct dmar_domain *dmar_domain; unsigned long flags; =20 + if (domain->type =3D=3D IOMMU_DOMAIN_IDENTITY) { + intel_pasid_tear_down_entry(iommu, dev, pasid, false); + return; + } + + dmar_domain =3D to_dmar_domain(domain); spin_lock_irqsave(&dmar_domain->lock, flags); list_for_each_entry(curr, &dmar_domain->dev_pasids, link_domain) { if (curr->dev =3D=3D dev && curr->pasid =3D=3D pasid) { @@ -4536,9 +4540,111 @@ static const struct iommu_dirty_ops intel_dirty_ops= =3D { .read_and_clear_dirty =3D intel_iommu_read_and_clear_dirty, }; =20 +static int context_setup_pass_through(struct device *dev, u8 bus, u8 devfn) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + struct context_entry *context; + + spin_lock(&iommu->lock); + context =3D iommu_context_addr(iommu, bus, devfn, 1); + if (!context) { + spin_unlock(&iommu->lock); + return -ENOMEM; + } + + if (context_present(context) && !context_copied(iommu, bus, devfn)) { + spin_unlock(&iommu->lock); + return 0; + } + + copied_context_tear_down(iommu, context, bus, devfn); + context_clear_entry(context); + context_set_domain_id(context, FLPT_DEFAULT_DID); + + /* + * In pass through mode, AW must be programmed to indicate the largest + * AGAW value supported by hardware. And ASR is ignored by hardware. 
+ */ + context_set_address_width(context, iommu->msagaw); + context_set_translation_type(context, CONTEXT_TT_PASS_THROUGH); + context_set_fault_enable(context); + context_set_present(context); + if (!ecap_coherent(iommu->ecap)) + clflush_cache_range(context, sizeof(*context)); + context_present_cache_flush(iommu, FLPT_DEFAULT_DID, bus, devfn); + spin_unlock(&iommu->lock); + + return 0; +} + +static int context_setup_pass_through_cb(struct pci_dev *pdev, u16 alias, = void *data) +{ + struct device *dev =3D data; + + if (dev !=3D &pdev->dev) + return 0; + + return context_setup_pass_through(dev, PCI_BUS_NUM(alias), alias & 0xff); +} + +static int device_setup_pass_through(struct device *dev) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + + if (!dev_is_pci(dev)) + return context_setup_pass_through(dev, info->bus, info->devfn); + + return pci_for_each_dma_alias(to_pci_dev(dev), + context_setup_pass_through_cb, dev); +} + +static int identity_domain_attach_dev(struct iommu_domain *domain, struct = device *dev) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + int ret; + + device_block_translation(dev); + + if (dev_is_real_dma_subdevice(dev)) + return 0; + + if (sm_supported(iommu)) { + ret =3D intel_pasid_setup_pass_through(iommu, dev, IOMMU_NO_PASID); + if (!ret) + iommu_enable_pci_caps(info); + } else { + ret =3D device_setup_pass_through(dev); + } + + return ret; +} + +static int identity_domain_set_dev_pasid(struct iommu_domain *domain, + struct device *dev, ioasid_t pasid) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + + if (!pasid_supported(iommu) || dev_is_real_dma_subdevice(dev)) + return -EOPNOTSUPP; + + return intel_pasid_setup_pass_through(iommu, dev, pasid); +} + +static struct iommu_domain identity_domain =3D { + .type =3D IOMMU_DOMAIN_IDENTITY, + .ops =3D &(const struct iommu_domain_ops) { + .attach_dev =3D identity_domain_attach_dev, + .set_dev_pasid =3D identity_domain_set_dev_pasid, + }, +}; + const struct iommu_ops intel_iommu_ops =3D { .blocked_domain =3D &blocking_domain, .release_domain =3D &blocking_domain, + .identity_domain =3D &identity_domain, .capable =3D intel_iommu_capable, .hw_info =3D intel_iommu_hw_info, .domain_alloc =3D intel_iommu_domain_alloc, diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 0e3a9b38bef2..ef12e95e400a 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -311,7 +311,7 @@ void intel_drain_pasid_prq(struct device *dev, u32 pasi= d) domain =3D info->domain; pdev =3D to_pci_dev(dev); sid =3D PCI_DEVID(info->bus, info->devfn); - did =3D domain_id_iommu(domain, iommu); + did =3D domain ? 
domain_id_iommu(domain, iommu) : FLPT_DEFAULT_DID; qdep =3D pci_ats_queue_depth(pdev); =20 /* --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 09BB75A7AA for ; Mon, 2 Sep 2024 02:31:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244294; cv=none; b=D1wjkFODvfCkxjYW5tYNqKn3ikG1vn+TXOevddowGkPgjtq666Duc5v4zyAL6fKk1anDYvvabiKTFYayg7o03yTu0dM9e8m7+sMGOfe5Ou1Z8n/rW+qiSztw7ZDZF2xf0VPeBhZ4al0//EtSONJl/xTmM0DDR/apoZ1MoD3ZVFk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244294; c=relaxed/simple; bh=1tvzBKq+Ft34pip0KC4KcaOEa/uEFjJ1tKEKuGuNIgc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cU9xpQ9BdzSoi2THubh/UdfFCStOKp0by7yY9dIIyrS6kqQow35GuZJOnN77dPtzxVHHw8fKjw+Q4TrpF5E2HdSZp0tKHT6ADDSw2cYVmUl0kik50f0lWLMOTls4vUZWF+JAHRHRur+aVpgl43W5ZcCDdxqtreIswLG33lE3E5E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=noo6Tr4z; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="noo6Tr4z" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244293; x=1756780293; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1tvzBKq+Ft34pip0KC4KcaOEa/uEFjJ1tKEKuGuNIgc=; b=noo6Tr4zmCMcNr3MMi4cLlsfbPkQNrWRMUGX9mJ1RDSN6ipq1MSTAanl oVJM6K19ISHy7a0k1HweRIG9o4TYzAKFg+ASNsn69djDAWsMnhfzE1ie0 nlCiaI0TXHJDgZDlQImDhNy1Nxu1VLGr0Tiypht4nTKA5WciVU45QReJp ZepECX0Bjw7dz2qa+wnM0VM7rUAUxxcyINpIzrtxxLcdwVl/bUJTerYy3 7GsRzFMPNrlQEqvK+ZovSjrBenx4/NOMW1JQDnKy52FLsi5X/4bmBFlHt OVz2k3NMYuVhPPKWskhmCjSyVENjyTab83ecIRSp/MEZJ2xssRSTC1ahR Q==; X-CSE-ConnectionGUID: UTXhUiUnRy6+Joz8wtKgAQ== X-CSE-MsgGUID: aswagyE9T0a7z+H/gO6dlQ== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994310" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994310" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:33 -0700 X-CSE-ConnectionGUID: 8wSd623JQ666cfEYeBp8RQ== X-CSE-MsgGUID: TfRPOLNpQ/WD9k/NsHM8ag== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359325" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:31 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 07/14] iommu/vt-d: Cleanup si_domain Date: Mon, 2 Sep 2024 10:27:17 +0800 Message-Id: <20240902022724.67059-8-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> 
Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The static identity domain has been introduced, rendering the si_domain obsolete. Remove si_domain and cleanup the code accordingly. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240809055431.36513-8-baolu.lu@linux.intel= .com --- drivers/iommu/intel/iommu.c | 91 ++++++++----------------------------- 1 file changed, 19 insertions(+), 72 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index df34416ba7bf..34006b6e89eb 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -167,8 +167,6 @@ static void device_rbtree_remove(struct device_domain_i= nfo *info) spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags); } =20 -static struct dmar_domain *si_domain; - struct dmar_rmrr_unit { struct list_head list; /* list of rmrr units */ struct acpi_dmar_header *hdr; /* ACPI header */ @@ -286,11 +284,6 @@ static int __init intel_iommu_setup(char *str) } __setup("intel_iommu=3D", intel_iommu_setup); =20 -static int domain_type_is_si(struct dmar_domain *domain) -{ - return domain->domain.type =3D=3D IOMMU_DOMAIN_IDENTITY; -} - static int domain_pfn_supported(struct dmar_domain *domain, unsigned long = pfn) { int addr_width =3D agaw_to_width(domain->agaw) - VTD_PAGE_SHIFT; @@ -1664,9 +1657,6 @@ static int domain_context_mapping_one(struct dmar_dom= ain *domain, struct context_entry *context; int agaw, ret; =20 - if (domain_type_is_si(domain)) - translation =3D CONTEXT_TT_PASS_THROUGH; - pr_debug("Set context mapping for %02x:%02x.%d\n", bus, PCI_SLOT(devfn), PCI_FUNC(devfn)); =20 @@ -1685,34 +1675,24 @@ static int domain_context_mapping_one(struct dmar_d= omain *domain, =20 context_set_domain_id(context, did); =20 - if (translation !=3D CONTEXT_TT_PASS_THROUGH) { - /* - * Skip top levels of page tables for iommu which has - * less agaw than default. Unnecessary for PT mode. - */ - for (agaw =3D domain->agaw; agaw > iommu->agaw; agaw--) { - ret =3D -ENOMEM; - pgd =3D phys_to_virt(dma_pte_addr(pgd)); - if (!dma_pte_present(pgd)) - goto out_unlock; - } - - if (info && info->ats_supported) - translation =3D CONTEXT_TT_DEV_IOTLB; - else - translation =3D CONTEXT_TT_MULTI_LEVEL; - - context_set_address_root(context, virt_to_phys(pgd)); - context_set_address_width(context, agaw); - } else { - /* - * In pass through mode, AW must be programmed to - * indicate the largest AGAW value supported by - * hardware. And ASR is ignored by hardware. - */ - context_set_address_width(context, iommu->msagaw); + /* + * Skip top levels of page tables for iommu which has + * less agaw than default. Unnecessary for PT mode. 
+ */ + for (agaw =3D domain->agaw; agaw > iommu->agaw; agaw--) { + ret =3D -ENOMEM; + pgd =3D phys_to_virt(dma_pte_addr(pgd)); + if (!dma_pte_present(pgd)) + goto out_unlock; } =20 + if (info && info->ats_supported) + translation =3D CONTEXT_TT_DEV_IOTLB; + else + translation =3D CONTEXT_TT_MULTI_LEVEL; + + context_set_address_root(context, virt_to_phys(pgd)); + context_set_address_width(context, agaw); context_set_translation_type(context, translation); context_set_fault_enable(context); context_set_present(context); @@ -1979,23 +1959,6 @@ static bool dev_is_real_dma_subdevice(struct device = *dev) pci_real_dma_dev(to_pci_dev(dev)) !=3D to_pci_dev(dev); } =20 -static int md_domain_init(struct dmar_domain *domain, int guest_width); - -static int __init si_domain_init(void) -{ - si_domain =3D alloc_domain(IOMMU_DOMAIN_IDENTITY); - if (!si_domain) - return -EFAULT; - - if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) { - domain_exit(si_domain); - si_domain =3D NULL; - return -EFAULT; - } - - return 0; -} - static int dmar_domain_attach_device(struct dmar_domain *domain, struct device *dev) { @@ -2018,8 +1981,6 @@ static int dmar_domain_attach_device(struct dmar_doma= in *domain, =20 if (!sm_supported(iommu)) ret =3D domain_context_mapping(domain, dev); - else if (domain_type_is_si(domain)) - ret =3D intel_pasid_setup_pass_through(iommu, dev, IOMMU_NO_PASID); else if (domain->use_first_level) ret =3D domain_setup_first_level(iommu, domain, dev, IOMMU_NO_PASID); else @@ -2028,8 +1989,7 @@ static int dmar_domain_attach_device(struct dmar_doma= in *domain, if (ret) goto out_block_translation; =20 - if (sm_supported(info->iommu) || !domain_type_is_si(info->domain)) - iommu_enable_pci_caps(info); + iommu_enable_pci_caps(info); =20 ret =3D cache_tag_assign_domain(domain, dev, IOMMU_NO_PASID); if (ret) @@ -2388,10 +2348,6 @@ static int __init init_dmars(void) =20 check_tylersburg_isoch(); =20 - ret =3D si_domain_init(); - if (ret) - goto free_iommu; - /* * for each drhd * enable fault log @@ -2437,10 +2393,6 @@ static int __init init_dmars(void) disable_dmar_iommu(iommu); free_dmar_iommu(iommu); } - if (si_domain) { - domain_exit(si_domain); - si_domain =3D NULL; - } =20 return ret; } @@ -3574,8 +3526,6 @@ static struct iommu_domain *intel_iommu_domain_alloc(= unsigned type) domain->geometry.force_aperture =3D true; =20 return domain; - case IOMMU_DOMAIN_IDENTITY: - return &si_domain->domain; default: return NULL; } @@ -3642,8 +3592,7 @@ static void intel_iommu_domain_free(struct iommu_doma= in *domain) =20 WARN_ON(dmar_domain->nested_parent && !list_empty(&dmar_domain->s1_domains)); - if (domain !=3D &si_domain->domain) - domain_exit(dmar_domain); + domain_exit(dmar_domain); } =20 int prepare_domain_attach_device(struct iommu_domain *domain, @@ -4368,9 +4317,7 @@ static int intel_iommu_set_dev_pasid(struct iommu_dom= ain *domain, if (ret) goto out_detach_iommu; =20 - if (domain_type_is_si(dmar_domain)) - ret =3D intel_pasid_setup_pass_through(iommu, dev, pasid); - else if (dmar_domain->use_first_level) + if (dmar_domain->use_first_level) ret =3D domain_setup_first_level(iommu, dmar_domain, dev, pasid); else --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AEFC6768E1 for ; Mon, 2 Sep 2024 02:31:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; 
arc=none smtp.client-ip=192.198.163.14
From: Lu Baolu
To: Joerg Roedel, Will Deacon
Cc: Tina Zhang, Sanjay K Kumar, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 08/14] iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 count
Date: Mon, 2 Sep 2024 10:27:18 +0800
Message-Id: <20240902022724.67059-9-baolu.lu@linux.intel.com>
In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com>
References: <20240902022724.67059-1-baolu.lu@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Sanjay K Kumar

If qi_submit_sync() is invoked with 0 invalidation descriptors (for
instance, for DMA draining purposes), we can run into a bug where a
submitting thread fails to detect the completion of invalidation_wait.
Subsequently, this leads to a soft lockup. Currently, this bug has no impact on existing users because no caller submits invalidations with 0 descriptors. This fix enables future users (such as DMA drain) to call qi_submit_sync() with a 0 count.

Suppose thread T1 invokes qi_submit_sync() with non-zero descriptors, while concurrently, thread T2 calls qi_submit_sync() with zero descriptors. Both threads then enter a while loop, waiting for their respective descriptors to complete. T1 detects its completion (i.e., T1's invalidation_wait status changes to QI_DONE by HW) and proceeds to call reclaim_free_desc() to reclaim all descriptors, potentially including adjacent ones of other threads that are also marked as QI_DONE. During this time, while T2 is waiting to acquire the qi->q_lock, the IOMMU hardware may complete the invalidation for T2, setting its status to QI_DONE. However, if T1's execution of reclaim_free_desc() frees T2's invalidation_wait descriptor and changes its status to QI_FREE, T2 will not observe the QI_DONE status for its invalidation_wait and will remain stuck indefinitely.

This soft lockup does not occur when only non-zero descriptors are submitted. In such cases, invalidation descriptors are interspersed among wait descriptors with the status QI_IN_USE, acting as barriers. These barriers prevent the reclaim code from mistakenly freeing descriptors belonging to other submitters.

Consider the following example timeline:

  T1                        T2
  ========================================
  ID1
  WD1
  while(WD1!=QI_DONE)
  unlock
                            lock
  WD1=QI_DONE*              WD2
                            while(WD2!=QI_DONE)
                            unlock
  lock
  WD1==QI_DONE?
  ID1=QI_DONE               WD2=DONE*
  reclaim()
  ID1=FREE
  WD1=FREE
  WD2=FREE
                            unlock
                            soft lockup! T2 never sees QI_DONE in WD2

Where:
  ID = invalidation descriptor
  WD = wait descriptor
  *  Written by hardware

The root of the problem is that the descriptor status QI_DONE flag is used for two conflicting purposes:

1. signal a descriptor is ready for reclaim (to be freed)
2. signal by the hardware that a wait descriptor is complete

The solution (in this patch) is state separation by using the QI_FREE flag for #1. Once a thread's invalidation descriptors are complete, their status is set to QI_FREE. The reclaim_free_desc() function then only frees descriptors marked as QI_FREE instead of those marked as QI_DONE. This change ensures that T2 (from the previous example) will correctly observe the completion of its invalidation_wait (marked as QI_DONE).
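As a reader's aid (not part of the patch), here is a minimal user-space sketch of the reclaim rule this change introduces, assuming a simplified status ring. QI_LENGTH, QI_FREE, QI_IN_USE and QI_DONE mirror the driver's names; the ring setup in main() is purely hypothetical.

/*
 * Illustrative sketch only, not kernel code: models the descriptor status
 * ring and the new reclaim rule. Reclaim now stops at the first slot that
 * its submitter has not yet marked QI_FREE, even if hardware already set
 * that slot to QI_DONE.
 */
#include <stdio.h>

#define QI_LENGTH 8
enum { QI_FREE, QI_IN_USE, QI_DONE };

struct q_inval {
	int desc_status[QI_LENGTH];
	int free_head;	/* next slot to hand out */
	int free_tail;	/* oldest slot not yet reclaimed */
	int free_cnt;
};

/* New rule: only reclaim slots already marked QI_FREE by their submitter. */
static void reclaim_free_desc(struct q_inval *qi)
{
	while (qi->desc_status[qi->free_tail] == QI_FREE &&
	       qi->free_tail != qi->free_head) {
		qi->free_tail = (qi->free_tail + 1) % QI_LENGTH;
		qi->free_cnt++;
	}
}

int main(void)
{
	struct q_inval qi = { .free_head = 3, .free_tail = 0,
			      .free_cnt = QI_LENGTH - 3 };

	/* T1's ID1/WD1 completed; T1 marked them QI_FREE itself. */
	qi.desc_status[0] = QI_FREE;
	qi.desc_status[1] = QI_FREE;
	/* T2's standalone wait descriptor: hardware wrote QI_DONE, but
	 * T2 has not consumed that status yet. */
	qi.desc_status[2] = QI_DONE;

	reclaim_free_desc(&qi);
	/* Reclaim stops at slot 2, so T2 still observes QI_DONE there. */
	printf("free_tail=%d free_cnt=%d\n", qi.free_tail, qi.free_cnt);
	return 0;
}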
Signed-off-by: Sanjay K Kumar Signed-off-by: Jacob Pan Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240728210059.1964602-1-jacob.jun.pan@linu= x.intel.com Signed-off-by: Lu Baolu --- drivers/iommu/intel/dmar.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c index 1c8d3141cb55..01e157d89a16 100644 --- a/drivers/iommu/intel/dmar.c +++ b/drivers/iommu/intel/dmar.c @@ -1204,9 +1204,7 @@ static void free_iommu(struct intel_iommu *iommu) */ static inline void reclaim_free_desc(struct q_inval *qi) { - while (qi->desc_status[qi->free_tail] =3D=3D QI_DONE || - qi->desc_status[qi->free_tail] =3D=3D QI_ABORT) { - qi->desc_status[qi->free_tail] =3D QI_FREE; + while (qi->desc_status[qi->free_tail] =3D=3D QI_FREE && qi->free_tail != =3D qi->free_head) { qi->free_tail =3D (qi->free_tail + 1) % QI_LENGTH; qi->free_cnt++; } @@ -1463,8 +1461,16 @@ int qi_submit_sync(struct intel_iommu *iommu, struct= qi_desc *desc, raw_spin_lock(&qi->q_lock); } =20 - for (i =3D 0; i < count; i++) - qi->desc_status[(index + i) % QI_LENGTH] =3D QI_DONE; + /* + * The reclaim code can free descriptors from multiple submissions + * starting from the tail of the queue. When count =3D=3D 0, the + * status of the standalone wait descriptor at the tail of the queue + * must be set to QI_FREE to allow the reclaim code to proceed. + * It is also possible that descriptors from one of the previous + * submissions has to be reclaimed by a subsequent submission. + */ + for (i =3D 0; i <=3D count; i++) + qi->desc_status[(index + i) % QI_LENGTH] =3D QI_FREE; =20 reclaim_free_desc(qi); raw_spin_unlock_irqrestore(&qi->q_lock, flags); --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 04E694AEE0 for ; Mon, 2 Sep 2024 02:31:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244298; cv=none; b=MhwQdbkM4/ov6rJrk6PlIXt02frrHzNk4wRpRUYUm0nbQCRj1d5K7DP16NKpOxsT7IDlCt+qGev36XIZmDytgBestYEKfy1kTTqsEInDLKqpB7MOXShoK9cWCj5qte2vxZGe9C4GN4H/XgxAitoSjdagqfZXggcBkxssFFHsxOo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244298; c=relaxed/simple; bh=k1FJp+pHiGoDqzd1Mjnm4T+Pc+0t6PtwzNNj/ffy4JE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=YJJO09DnUD5gfLEaQFF/SQ/Ky9B2H8qTZh+r7V1v7RXthX2bsKDQV0LlhjFIOU3kcEVipJqs4J+VBIDSDMZG0OQs1LNcK1g2wLYUVsG+ASb6D+w6rl6mSNYqZ7lyd4eYTSN5HPD34J4wt8khV423juR5fmrLIegAk83QUMhjlxI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Ss2Gc8i+; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Ss2Gc8i+" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244297; x=1756780297; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=k1FJp+pHiGoDqzd1Mjnm4T+Pc+0t6PtwzNNj/ffy4JE=; b=Ss2Gc8i+R8oo/w8gNChGG/6sB/BXHbqDHw1s7bDYJaImgroh/1n4ALfr SKvb0aMctXYv2iPUPcuD9XfVdTwQFfB1SBQgXwMA6qdEObJGeqNzuC3Ht su6VWbqDgbQa/wqTkROvh4t+wAUiZaz9FTxEexAf+FlLkEP3iJgADL2ZX TIz4cZpJjTGUmF6Fb3Ct/WN8Xwp/xdRdV5+cb3WOz4nqe8n0ZB6PBkCHA pgM7gvvRLmLk0+Nri9yZ6x2WYLi8iTL635AETepzYXPiydVoDTqzvem0A bsIep1A9RbvQsuraB/hVwyWuL90RswkvD5haaUAapbckTrYYpuBo/975P g==; X-CSE-ConnectionGUID: tiw5IcPST0+ZMW79JUl25A== X-CSE-MsgGUID: rIc7bsdQTVWNcYezgXNglQ== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994324" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994324" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:37 -0700 X-CSE-ConnectionGUID: ZJnEKqF6TSOxMvwDZxB+vQ== X-CSE-MsgGUID: yqQTWLeAQb2t91J821Eutg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359338" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:35 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 09/14] iommu/vt-d: Move PCI PASID enablement to probe path Date: Mon, 2 Sep 2024 10:27:19 +0800 Message-Id: <20240902022724.67059-10-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, PCI PASID is enabled alongside PCI ATS when an iommu domain is attached to the device and disabled when the device transitions to block translation mode. This approach is inappropriate as PCI PASID is a device feature independent of the type of the attached domain. Enable PCI PASID during the IOMMU device probe and disables it during the release path. Suggested-by: Yi Liu Signed-off-by: Lu Baolu Reviewed-by: Yi Liu Tested-by: Yi Liu Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20240819051805.116936-1-baolu.lu@linux.inte= l.com --- drivers/iommu/intel/iommu.c | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 34006b6e89eb..eab87a649804 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1281,15 +1281,6 @@ static void iommu_enable_pci_caps(struct device_doma= in_info *info) return; =20 pdev =3D to_pci_dev(info->dev); - - /* The PCIe spec, in its wisdom, declares that the behaviour of - the device if you enable PASID support after ATS support is - undefined. So always enable PASID support on devices which - have it, even if we can't yet know if we're ever going to - use it. 
*/ - if (info->pasid_supported && !pci_enable_pasid(pdev, info->pasid_supporte= d & ~1)) - info->pasid_enabled =3D 1; - if (info->ats_supported && pci_ats_page_aligned(pdev) && !pci_enable_ats(pdev, VTD_PAGE_SHIFT)) info->ats_enabled =3D 1; @@ -1308,11 +1299,6 @@ static void iommu_disable_pci_caps(struct device_dom= ain_info *info) pci_disable_ats(pdev); info->ats_enabled =3D 0; } - - if (info->pasid_enabled) { - pci_disable_pasid(pdev); - info->pasid_enabled =3D 0; - } } =20 static void intel_flush_iotlb_all(struct iommu_domain *domain) @@ -3942,6 +3928,16 @@ static struct iommu_device *intel_iommu_probe_device= (struct device *dev) =20 intel_iommu_debugfs_create_dev(info); =20 + /* + * The PCIe spec, in its wisdom, declares that the behaviour of the + * device is undefined if you enable PASID support after ATS support. + * So always enable PASID support on devices which have it, even if + * we can't yet know if we're ever going to use it. + */ + if (info->pasid_supported && + !pci_enable_pasid(pdev, info->pasid_supported & ~1)) + info->pasid_enabled =3D 1; + return &iommu->iommu; free_table: intel_pasid_free_table(dev); @@ -3958,6 +3954,11 @@ static void intel_iommu_release_device(struct device= *dev) struct device_domain_info *info =3D dev_iommu_priv_get(dev); struct intel_iommu *iommu =3D info->iommu; =20 + if (info->pasid_enabled) { + pci_disable_pasid(to_pci_dev(dev)); + info->pasid_enabled =3D 0; + } + mutex_lock(&iommu->iopf_lock); if (dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev))) device_rbtree_remove(info); --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0E0417E583 for ; Mon, 2 Sep 2024 02:31:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244300; cv=none; b=O3O/+/CtLWujFSsipyF6d+RUw4Rqs2UzC2cUMFraLbwRhWPLzc5jnLaQQP8t0P320DU1Jvwav6SnFStMjHHpmAmrABrZjoU2mO4X4MixqNrcM+PkzhL+acJwJFjeqlGNqYK0TPjLOHWDU1GaoQ8YKfHELGoDa7QAi+XBJRLss44= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244300; c=relaxed/simple; bh=GtutiaUzGKFbzKvh3xEoisjtlmPo8NKbdxyJU2Gn1lA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Ws8GAnwCmUM9IT8bG9VK2u0/H6J92IsdJ7J8ocaZqb1rNHXXWOy+EwUmOWawr49Th4ujJ94VqdT8zqEpcsyLzMVb+nMUdGCp8ZvSdOcTqA6NTKRZyMhe1/+CXGUoXMNtEZK32xaagLFZiNepeXcoqamodTDaLho0iplN7uzNSvk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=aNMeWGzz; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="aNMeWGzz" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244299; x=1756780299; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GtutiaUzGKFbzKvh3xEoisjtlmPo8NKbdxyJU2Gn1lA=; 
b=aNMeWGzzKuRylr2yKf5XOgdwCpYkR23EVQ9IeRrlltH65XrnxieBchPd Wt2WYMqGGWJl0xRhZow1e3N41iod6X8UuSsbjPp3pwoBf5umuaErzCwTn Q9NjnyRea1PPflq1J1wRUXijP3q2+0m2Hdjl1CFIf4XMVZ6lLd/cQ8VYx ldYyDL6aCHlm9XHC1N7JPgAuzAEtPbJdFW/RfdmZM0QMOftm7AGAhxMQg 0ALubS+pAxHBg+tHGpxzDtMC6WjbC1QSs0kw7jfGSMK2KxLQlDtZ+42NS lW2Zq/fuC5cYn0yqBrzxpWVmuQQOE7mkjOUOYm6LJaFX7llBTn1USUveC w==; X-CSE-ConnectionGUID: x7+L/8GZS2m2ppxNKhRHmA== X-CSE-MsgGUID: IaLhWRPqSIWb2pd1IZkGYg== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994333" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994333" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:38 -0700 X-CSE-ConnectionGUID: DN3V7W+7Sx+80mb+F32J2w== X-CSE-MsgGUID: L/mCona4QAuoIG3+JnSisQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359345" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:37 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 10/14] iommu/vt-d: Unconditionally flush device TLB for pasid table updates Date: Mon, 2 Sep 2024 10:27:20 +0800 Message-Id: <20240902022724.67059-11-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The caching mode of an IOMMU is irrelevant to the behavior of the device TLB. Previously, commit <304b3bde24b5> ("iommu/vt-d: Remove caching mode check before device TLB flush") removed this redundant check in the domain unmap path. Checking the caching mode before flushing the device TLB after a pasid table entry is updated is unnecessary and can lead to inconsistent behavior. Extends this consistency by removing the caching mode check in the pasid table update path. Suggested-by: Yi Liu Signed-off-by: Lu Baolu Link: https://lore.kernel.org/r/20240820030208.20020-1-baolu.lu@linux.intel= .com --- drivers/iommu/intel/pasid.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index b51fc268dc84..2e5fa0a23299 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -264,9 +264,7 @@ void intel_pasid_tear_down_entry(struct intel_iommu *io= mmu, struct device *dev, else iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH); =20 - /* Device IOTLB doesn't need to be flushed in caching mode. */ - if (!cap_caching_mode(iommu->cap)) - devtlb_invalidation_with_pasid(iommu, dev, pasid); + devtlb_invalidation_with_pasid(iommu, dev, pasid); } =20 /* @@ -493,9 +491,7 @@ int intel_pasid_setup_dirty_tracking(struct intel_iommu= *iommu, =20 iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH); =20 - /* Device IOTLB doesn't need to be flushed in caching mode. 
*/ - if (!cap_caching_mode(iommu->cap)) - devtlb_invalidation_with_pasid(iommu, dev, pasid); + devtlb_invalidation_with_pasid(iommu, dev, pasid); =20 return 0; } @@ -572,9 +568,7 @@ void intel_pasid_setup_page_snoop_control(struct intel_= iommu *iommu, pasid_cache_invalidation_with_pasid(iommu, did, pasid); qi_flush_piotlb(iommu, did, pasid, 0, -1, 0); =20 - /* Device IOTLB doesn't need to be flushed in caching mode. */ - if (!cap_caching_mode(iommu->cap)) - devtlb_invalidation_with_pasid(iommu, dev, pasid); + devtlb_invalidation_with_pasid(iommu, dev, pasid); } =20 /** --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 376C713AD05 for ; Mon, 2 Sep 2024 02:31:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244302; cv=none; b=B04NePba4hnj2u2drWFVqQqXzNiKhhNgzjoG7cH5i/c3AIYgY6YSbY4NqLLqw9Ripe7Npnnje6tBBeKA1YpT1/wDER+Q4eRAqeKZf1P3KW5UjZ0Qa+CHHq8IIYkxsooG/qFDaMcFe/9hYCs3SKITkyJRU6/nZSNfY18MLUkTbEQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244302; c=relaxed/simple; bh=riBmFPcSBAvJB1OHstBgDS/COM/8vG7xHt/eORYl5DI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=kq2lpyMIa//oGnsnHCuhHECBJqKg0LZnuz09lfWTQLWYLGrpl0s+fK5sVPJef9VHvRW6HkCpF9Lfvdev45ReKg50WPrdKxIjaLyFKjsN9ZlPJffpGadTIfPJLhAM+RuVrZJZYC4UvHA9d5ewf5tTRQpVF6qBUA1xC64t4f3sdrk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=BxO9fAiP; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="BxO9fAiP" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244301; x=1756780301; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=riBmFPcSBAvJB1OHstBgDS/COM/8vG7xHt/eORYl5DI=; b=BxO9fAiPwu/Lh6jzl5MQBRu7gIZnIcB9RFR8u0wLe4S3hIH+yp5Gqah+ fqnFA29LfOFli3Wy/VUqaGXSJ6SPEHuZ+GoB9l+RzUWEeQBIMCAatcZvI f6Fx0A6jgUqKwK5PvMO1keE/ryi1uOETtDf4E1PhOl/oF0QN6Sfxl0fgP 1HB0UPtEpu+FbwLCcx2XVJRYBW5rNicJM3vyfl/EUP6uz+XvSiPcuw9ux 71/HDap7cWgP+WiagDeSfE9KbJAj5DXEV+2iv44rHBYjMELKmsVQjen8s uo8SvEDhDNkUqChkFMzpUgjevp/wM5DfJxM0GSd2vuUFKvSzKJGiVehTx A==; X-CSE-ConnectionGUID: qKZ1HgtkQKSKaJ2PectnIQ== X-CSE-MsgGUID: IXM2SfuvSSWnIsSg0jJjtQ== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994338" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994338" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:40 -0700 X-CSE-ConnectionGUID: 5cVuQdV5SB+sboLY+Ig78Q== X-CSE-MsgGUID: xcO0tSt0S5yqWn80oEy7ug== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359348" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com 
with ESMTP; 01 Sep 2024 19:31:39 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 11/14] iommu/vt-d: Factor out invalidation descriptor composition Date: Mon, 2 Sep 2024 10:27:21 +0800 Message-Id: <20240902022724.67059-12-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Tina Zhang Separate the logic for constructing IOTLB and device TLB invalidation descriptors from the qi_flush interfaces. New helpers, qi_desc(), are introduced to encapsulate this common functionality. Moving descriptor composition code to new helpers enables its reuse in the upcoming qi_batch interfaces. No functional changes are intended. Signed-off-by: Tina Zhang Link: https://lore.kernel.org/r/20240815065221.50328-2-tina.zhang@intel.com Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.h | 109 ++++++++++++++++++++++++++++++++++++ drivers/iommu/intel/dmar.c | 93 ++---------------------------- 2 files changed, 115 insertions(+), 87 deletions(-) diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index b3b295e60626..8884aae56ca7 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -1066,6 +1066,115 @@ static inline unsigned long nrpages_to_size(unsigne= d long npages) return npages << VTD_PAGE_SHIFT; } =20 +static inline void qi_desc_iotlb(struct intel_iommu *iommu, u16 did, u64 a= ddr, + unsigned int size_order, u64 type, + struct qi_desc *desc) +{ + u8 dw =3D 0, dr =3D 0; + int ih =3D 0; + + if (cap_write_drain(iommu->cap)) + dw =3D 1; + + if (cap_read_drain(iommu->cap)) + dr =3D 1; + + desc->qw0 =3D QI_IOTLB_DID(did) | QI_IOTLB_DR(dr) | QI_IOTLB_DW(dw) + | QI_IOTLB_GRAN(type) | QI_IOTLB_TYPE; + desc->qw1 =3D QI_IOTLB_ADDR(addr) | QI_IOTLB_IH(ih) + | QI_IOTLB_AM(size_order); + desc->qw2 =3D 0; + desc->qw3 =3D 0; +} + +static inline void qi_desc_dev_iotlb(u16 sid, u16 pfsid, u16 qdep, u64 add= r, + unsigned int mask, struct qi_desc *desc) +{ + if (mask) { + addr |=3D (1ULL << (VTD_PAGE_SHIFT + mask - 1)) - 1; + desc->qw1 =3D QI_DEV_IOTLB_ADDR(addr) | QI_DEV_IOTLB_SIZE; + } else { + desc->qw1 =3D QI_DEV_IOTLB_ADDR(addr); + } + + if (qdep >=3D QI_DEV_IOTLB_MAX_INVS) + qdep =3D 0; + + desc->qw0 =3D QI_DEV_IOTLB_SID(sid) | QI_DEV_IOTLB_QDEP(qdep) | + QI_DIOTLB_TYPE | QI_DEV_IOTLB_PFSID(pfsid); + desc->qw2 =3D 0; + desc->qw3 =3D 0; +} + +static inline void qi_desc_piotlb(u16 did, u32 pasid, u64 addr, + unsigned long npages, bool ih, + struct qi_desc *desc) +{ + if (npages =3D=3D -1) { + desc->qw0 =3D QI_EIOTLB_PASID(pasid) | + QI_EIOTLB_DID(did) | + QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | + QI_EIOTLB_TYPE; + desc->qw1 =3D 0; + } else { + int mask =3D ilog2(__roundup_pow_of_two(npages)); + unsigned long align =3D (1ULL << (VTD_PAGE_SHIFT + mask)); + + if (WARN_ON_ONCE(!IS_ALIGNED(addr, align))) + addr =3D ALIGN_DOWN(addr, align); + + desc->qw0 =3D QI_EIOTLB_PASID(pasid) | + QI_EIOTLB_DID(did) | + QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) | + QI_EIOTLB_TYPE; + desc->qw1 =3D QI_EIOTLB_ADDR(addr) | + QI_EIOTLB_IH(ih) | + QI_EIOTLB_AM(mask); + } +} + +static inline void qi_desc_dev_iotlb_pasid(u16 sid, u16 pfsid, u32 pasid, + u16 qdep, u64 addr, + 
unsigned int size_order, + struct qi_desc *desc) +{ + unsigned long mask =3D 1UL << (VTD_PAGE_SHIFT + size_order - 1); + + desc->qw0 =3D QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) | + QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE | + QI_DEV_IOTLB_PFSID(pfsid); + + /* + * If S bit is 0, we only flush a single page. If S bit is set, + * The least significant zero bit indicates the invalidation address + * range. VT-d spec 6.5.2.6. + * e.g. address bit 12[0] indicates 8KB, 13[0] indicates 16KB. + * size order =3D 0 is PAGE_SIZE 4KB + * Max Invs Pending (MIP) is set to 0 for now until we have DIT in + * ECAP. + */ + if (!IS_ALIGNED(addr, VTD_PAGE_SIZE << size_order)) + pr_warn_ratelimited("Invalidate non-aligned address %llx, order %d\n", + addr, size_order); + + /* Take page address */ + desc->qw1 =3D QI_DEV_EIOTLB_ADDR(addr); + + if (size_order) { + /* + * Existing 0s in address below size_order may be the least + * significant bit, we must set them to 1s to avoid having + * smaller size than desired. + */ + desc->qw1 |=3D GENMASK_ULL(size_order + VTD_PAGE_SHIFT - 1, + VTD_PAGE_SHIFT); + /* Clear size_order bit to indicate size */ + desc->qw1 &=3D ~mask; + /* Set the S bit to indicate flushing more than 1 page */ + desc->qw1 |=3D QI_DEV_EIOTLB_SIZE; + } +} + /* Convert value to context PASID directory size field coding. */ #define context_pdts(pds) (((pds) & 0x7) << 9) =20 diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c index 01e157d89a16..eaf862e8dea1 100644 --- a/drivers/iommu/intel/dmar.c +++ b/drivers/iommu/intel/dmar.c @@ -1526,24 +1526,9 @@ void qi_flush_context(struct intel_iommu *iommu, u16= did, u16 sid, u8 fm, void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr, unsigned int size_order, u64 type) { - u8 dw =3D 0, dr =3D 0; - struct qi_desc desc; - int ih =3D 0; - - if (cap_write_drain(iommu->cap)) - dw =3D 1; - - if (cap_read_drain(iommu->cap)) - dr =3D 1; - - desc.qw0 =3D QI_IOTLB_DID(did) | QI_IOTLB_DR(dr) | QI_IOTLB_DW(dw) - | QI_IOTLB_GRAN(type) | QI_IOTLB_TYPE; - desc.qw1 =3D QI_IOTLB_ADDR(addr) | QI_IOTLB_IH(ih) - | QI_IOTLB_AM(size_order); - desc.qw2 =3D 0; - desc.qw3 =3D 0; =20 + qi_desc_iotlb(iommu, did, addr, size_order, type, &desc); qi_submit_sync(iommu, &desc, 1, 0); } =20 @@ -1561,20 +1546,7 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u= 16 sid, u16 pfsid, if (!(iommu->gcmd & DMA_GCMD_TE)) return; =20 - if (mask) { - addr |=3D (1ULL << (VTD_PAGE_SHIFT + mask - 1)) - 1; - desc.qw1 =3D QI_DEV_IOTLB_ADDR(addr) | QI_DEV_IOTLB_SIZE; - } else - desc.qw1 =3D QI_DEV_IOTLB_ADDR(addr); - - if (qdep >=3D QI_DEV_IOTLB_MAX_INVS) - qdep =3D 0; - - desc.qw0 =3D QI_DEV_IOTLB_SID(sid) | QI_DEV_IOTLB_QDEP(qdep) | - QI_DIOTLB_TYPE | QI_DEV_IOTLB_PFSID(pfsid); - desc.qw2 =3D 0; - desc.qw3 =3D 0; - + qi_desc_dev_iotlb(sid, pfsid, qdep, addr, mask, &desc); qi_submit_sync(iommu, &desc, 1, 0); } =20 @@ -1594,28 +1566,7 @@ void qi_flush_piotlb(struct intel_iommu *iommu, u16 = did, u32 pasid, u64 addr, return; } =20 - if (npages =3D=3D -1) { - desc.qw0 =3D QI_EIOTLB_PASID(pasid) | - QI_EIOTLB_DID(did) | - QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | - QI_EIOTLB_TYPE; - desc.qw1 =3D 0; - } else { - int mask =3D ilog2(__roundup_pow_of_two(npages)); - unsigned long align =3D (1ULL << (VTD_PAGE_SHIFT + mask)); - - if (WARN_ON_ONCE(!IS_ALIGNED(addr, align))) - addr =3D ALIGN_DOWN(addr, align); - - desc.qw0 =3D QI_EIOTLB_PASID(pasid) | - QI_EIOTLB_DID(did) | - QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) | - QI_EIOTLB_TYPE; - desc.qw1 =3D QI_EIOTLB_ADDR(addr) | - 
QI_EIOTLB_IH(ih) | - QI_EIOTLB_AM(mask); - } - + qi_desc_piotlb(did, pasid, addr, npages, ih, &desc); qi_submit_sync(iommu, &desc, 1, 0); } =20 @@ -1623,7 +1574,6 @@ void qi_flush_piotlb(struct intel_iommu *iommu, u16 d= id, u32 pasid, u64 addr, void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsi= d, u32 pasid, u16 qdep, u64 addr, unsigned int size_order) { - unsigned long mask =3D 1UL << (VTD_PAGE_SHIFT + size_order - 1); struct qi_desc desc =3D {.qw1 =3D 0, .qw2 =3D 0, .qw3 =3D 0}; =20 /* @@ -1635,40 +1585,9 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu *io= mmu, u16 sid, u16 pfsid, if (!(iommu->gcmd & DMA_GCMD_TE)) return; =20 - desc.qw0 =3D QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) | - QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE | - QI_DEV_IOTLB_PFSID(pfsid); - - /* - * If S bit is 0, we only flush a single page. If S bit is set, - * The least significant zero bit indicates the invalidation address - * range. VT-d spec 6.5.2.6. - * e.g. address bit 12[0] indicates 8KB, 13[0] indicates 16KB. - * size order =3D 0 is PAGE_SIZE 4KB - * Max Invs Pending (MIP) is set to 0 for now until we have DIT in - * ECAP. - */ - if (!IS_ALIGNED(addr, VTD_PAGE_SIZE << size_order)) - pr_warn_ratelimited("Invalidate non-aligned address %llx, order %d\n", - addr, size_order); - - /* Take page address */ - desc.qw1 =3D QI_DEV_EIOTLB_ADDR(addr); - - if (size_order) { - /* - * Existing 0s in address below size_order may be the least - * significant bit, we must set them to 1s to avoid having - * smaller size than desired. - */ - desc.qw1 |=3D GENMASK_ULL(size_order + VTD_PAGE_SHIFT - 1, - VTD_PAGE_SHIFT); - /* Clear size_order bit to indicate size */ - desc.qw1 &=3D ~mask; - /* Set the S bit to indicate flushing more than 1 page */ - desc.qw1 |=3D QI_DEV_EIOTLB_SIZE; - } - + qi_desc_dev_iotlb_pasid(sid, pfsid, pasid, + qdep, addr, size_order, + &desc); qi_submit_sync(iommu, &desc, 1, 0); } =20 --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2C0AE13AD27 for ; Mon, 2 Sep 2024 02:31:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244304; cv=none; b=GqJg35p/DF12km4t+L17CjyXwngS72iifmqG6yGlRLE4TnKwAs8+vI48znBjVAgpyOZboOXhr425aiCzXPm/PUjtmZ8REl1SLAL0ke9y5zfjmXZGlTm2zCbkCDsxdSaRWThmse9ju53SmI/xUx0J3+H2nO5ZtHenZu5PLRR5sb0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244304; c=relaxed/simple; bh=4bOekfGoUt0BjvOBLIs3P9Ihuchvum6NOSE4iIcjOUA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=kzbdk49hNTHkcz++wMXgMWTZK0Ouz8zZ3/goL1GUgiA1OhY9elEl+Z+xq8sW9I4OGMgIje21jDW0P0yQrVVI5/dDJSf8ClhjAok1yLL3Bi93R3liSP3B5GO/6I8xgXvjVED4yWf2Qb9r4mw9DmpPkd/bSaq6tUfiREMMzRt+spo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ZoUA5xbI; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ZoUA5xbI" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244303; x=1756780303; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4bOekfGoUt0BjvOBLIs3P9Ihuchvum6NOSE4iIcjOUA=; b=ZoUA5xbI7K9+7xNbKDkaIcN3KqP+ESlY348U5mSDZ4pA2xugmB4o+G3H uvYesl2Pv+NyBrs4b+ftfLtBFK2PBW6v725WQSxFF1lJQ/Bnpi/0h1gA7 gx9zPPXNHiLbhJeBVenl7TJVDVfgO2/7FON9LpJbAlmTc4LlAfrnVKpxP mKj62+pwpnNwnET9P/rlayCAZrb1xpHO+ojHDlL6Ffh/gMN1CpggWYgt+ /YyzmcTMverMkzgwEEfFLEDAhAplp5b43uwqI2DIOPe4HyOCF8nybTvNN t6d4+moYIAe6BHCSyIyZeFcbPeAZAkYluGIVEHSaS8isoRpv+rqqFQF1i A==; X-CSE-ConnectionGUID: uw6/7X7lTsSmz+O8/BCv4g== X-CSE-MsgGUID: B99kizMjQImd+nlqc7iFuw== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994343" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994343" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:42 -0700 X-CSE-ConnectionGUID: KQjXHFpIRM+j72ssQHoQog== X-CSE-MsgGUID: TYMCf4PiSAmUyMZjhAyAPQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359353" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:41 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 12/14] iommu/vt-d: Refactor IOTLB and Dev-IOTLB flush for batching Date: Mon, 2 Sep 2024 10:27:22 +0800 Message-Id: <20240902022724.67059-13-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Tina Zhang Extracts IOTLB and Dev-IOTLB invalidation logic from cache tag flush interfaces into dedicated helper functions. It prepares the codebase for upcoming changes to support batched cache invalidations. To enable direct use of qi_flush helpers in the new functions, iommu->flush.flush_iotlb and quirk_extra_dev_tlb_flush() are opened up. No functional changes are intended. Co-developed-by: Lu Baolu Signed-off-by: Lu Baolu Signed-off-by: Tina Zhang Link: https://lore.kernel.org/r/20240815065221.50328-3-tina.zhang@intel.com --- drivers/iommu/intel/iommu.h | 3 + drivers/iommu/intel/cache.c | 142 ++++++++++++++++++++---------------- drivers/iommu/intel/iommu.c | 5 +- 3 files changed, 83 insertions(+), 67 deletions(-) diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index 8884aae56ca7..ba40db4f8399 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -1206,6 +1206,9 @@ void qi_flush_pasid_cache(struct intel_iommu *iommu, = u16 did, u64 granu, =20 int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc, unsigned int count, unsigned long options); + +void __iommu_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr, + unsigned int size_order, u64 type); /* * Options used in qi_submit_sync: * QI_OPT_WAIT_DRAIN - Wait for PRQ drain completion, spec 6.5.2.8. 
diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c index 44e92638c0cd..08f7ce2c16c3 100644 --- a/drivers/iommu/intel/cache.c +++ b/drivers/iommu/intel/cache.c @@ -255,6 +255,78 @@ static unsigned long calculate_psi_aligned_address(uns= igned long start, return ALIGN_DOWN(start, VTD_PAGE_SIZE << mask); } =20 +static void cache_tag_flush_iotlb(struct dmar_domain *domain, struct cache= _tag *tag, + unsigned long addr, unsigned long pages, + unsigned long mask, int ih) +{ + struct intel_iommu *iommu =3D tag->iommu; + u64 type =3D DMA_TLB_PSI_FLUSH; + + if (domain->use_first_level) { + qi_flush_piotlb(iommu, tag->domain_id, tag->pasid, addr, pages, ih); + return; + } + + /* + * Fallback to domain selective flush if no PSI support or the size + * is too big. + */ + if (!cap_pgsel_inv(iommu->cap) || + mask > cap_max_amask_val(iommu->cap) || pages =3D=3D -1) { + addr =3D 0; + mask =3D 0; + ih =3D 0; + type =3D DMA_TLB_DSI_FLUSH; + } + + if (ecap_qis(iommu->ecap)) + qi_flush_iotlb(iommu, tag->domain_id, addr | ih, mask, type); + else + __iommu_flush_iotlb(iommu, tag->domain_id, addr | ih, mask, type); +} + +static void cache_tag_flush_devtlb_psi(struct dmar_domain *domain, struct = cache_tag *tag, + unsigned long addr, unsigned long mask) +{ + struct intel_iommu *iommu =3D tag->iommu; + struct device_domain_info *info; + u16 sid; + + info =3D dev_iommu_priv_get(tag->dev); + sid =3D PCI_DEVID(info->bus, info->devfn); + + if (tag->pasid =3D=3D IOMMU_NO_PASID) { + qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, + addr, mask); + if (info->dtlb_extra_inval) + qi_flush_dev_iotlb(iommu, sid, info->pfsid, + info->ats_qdep, addr, mask); + return; + } + + qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid, tag->pasid, + info->ats_qdep, addr, mask); + if (info->dtlb_extra_inval) + qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid, tag->pasid, + info->ats_qdep, addr, mask); +} + +static void cache_tag_flush_devtlb_all(struct dmar_domain *domain, struct = cache_tag *tag) +{ + struct intel_iommu *iommu =3D tag->iommu; + struct device_domain_info *info; + u16 sid; + + info =3D dev_iommu_priv_get(tag->dev); + sid =3D PCI_DEVID(info->bus, info->devfn); + + qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, 0, + MAX_AGAW_PFN_WIDTH); + if (info->dtlb_extra_inval) + qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, 0, + MAX_AGAW_PFN_WIDTH); +} + /* * Invalidates a range of IOVA from @start (inclusive) to @end (inclusive) * when the memory mappings in the target domain have been modified. @@ -270,30 +342,10 @@ void cache_tag_flush_range(struct dmar_domain *domain= , unsigned long start, =20 spin_lock_irqsave(&domain->cache_lock, flags); list_for_each_entry(tag, &domain->cache_tags, node) { - struct intel_iommu *iommu =3D tag->iommu; - struct device_domain_info *info; - u16 sid; - switch (tag->type) { case CACHE_TAG_IOTLB: case CACHE_TAG_NESTING_IOTLB: - if (domain->use_first_level) { - qi_flush_piotlb(iommu, tag->domain_id, - tag->pasid, addr, pages, ih); - } else { - /* - * Fallback to domain selective flush if no - * PSI support or the size is too big. 
- */ - if (!cap_pgsel_inv(iommu->cap) || - mask > cap_max_amask_val(iommu->cap)) - iommu->flush.flush_iotlb(iommu, tag->domain_id, - 0, 0, DMA_TLB_DSI_FLUSH); - else - iommu->flush.flush_iotlb(iommu, tag->domain_id, - addr | ih, mask, - DMA_TLB_PSI_FLUSH); - } + cache_tag_flush_iotlb(domain, tag, addr, pages, mask, ih); break; case CACHE_TAG_NESTING_DEVTLB: /* @@ -307,18 +359,7 @@ void cache_tag_flush_range(struct dmar_domain *domain,= unsigned long start, mask =3D MAX_AGAW_PFN_WIDTH; fallthrough; case CACHE_TAG_DEVTLB: - info =3D dev_iommu_priv_get(tag->dev); - sid =3D PCI_DEVID(info->bus, info->devfn); - - if (tag->pasid =3D=3D IOMMU_NO_PASID) - qi_flush_dev_iotlb(iommu, sid, info->pfsid, - info->ats_qdep, addr, mask); - else - qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid, - tag->pasid, info->ats_qdep, - addr, mask); - - quirk_extra_dev_tlb_flush(info, addr, mask, tag->pasid, info->ats_qdep); + cache_tag_flush_devtlb_psi(domain, tag, addr, mask); break; } =20 @@ -338,29 +379,14 @@ void cache_tag_flush_all(struct dmar_domain *domain) =20 spin_lock_irqsave(&domain->cache_lock, flags); list_for_each_entry(tag, &domain->cache_tags, node) { - struct intel_iommu *iommu =3D tag->iommu; - struct device_domain_info *info; - u16 sid; - switch (tag->type) { case CACHE_TAG_IOTLB: case CACHE_TAG_NESTING_IOTLB: - if (domain->use_first_level) - qi_flush_piotlb(iommu, tag->domain_id, - tag->pasid, 0, -1, 0); - else - iommu->flush.flush_iotlb(iommu, tag->domain_id, - 0, 0, DMA_TLB_DSI_FLUSH); + cache_tag_flush_iotlb(domain, tag, 0, -1, 0, 0); break; case CACHE_TAG_DEVTLB: case CACHE_TAG_NESTING_DEVTLB: - info =3D dev_iommu_priv_get(tag->dev); - sid =3D PCI_DEVID(info->bus, info->devfn); - - qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, - 0, MAX_AGAW_PFN_WIDTH); - quirk_extra_dev_tlb_flush(info, 0, MAX_AGAW_PFN_WIDTH, - IOMMU_NO_PASID, info->ats_qdep); + cache_tag_flush_devtlb_all(domain, tag); break; } =20 @@ -399,20 +425,8 @@ void cache_tag_flush_range_np(struct dmar_domain *doma= in, unsigned long start, } =20 if (tag->type =3D=3D CACHE_TAG_IOTLB || - tag->type =3D=3D CACHE_TAG_NESTING_IOTLB) { - /* - * Fallback to domain selective flush if no - * PSI support or the size is too big. 
- */ - if (!cap_pgsel_inv(iommu->cap) || - mask > cap_max_amask_val(iommu->cap)) - iommu->flush.flush_iotlb(iommu, tag->domain_id, - 0, 0, DMA_TLB_DSI_FLUSH); - else - iommu->flush.flush_iotlb(iommu, tag->domain_id, - addr, mask, - DMA_TLB_PSI_FLUSH); - } + tag->type =3D=3D CACHE_TAG_NESTING_IOTLB) + cache_tag_flush_iotlb(domain, tag, addr, pages, mask, 0); =20 trace_cache_tag_flush_range_np(tag, start, end, addr, pages, mask); } diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index eab87a649804..779e258188cd 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1184,9 +1184,8 @@ static void __iommu_flush_context(struct intel_iommu = *iommu, raw_spin_unlock_irqrestore(&iommu->register_lock, flag); } =20 -/* return value determine if we need a write buffer flush */ -static void __iommu_flush_iotlb(struct intel_iommu *iommu, u16 did, - u64 addr, unsigned int size_order, u64 type) +void __iommu_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr, + unsigned int size_order, u64 type) { int tlb_offset =3D ecap_iotlb_offset(iommu->ecap); u64 val =3D 0, val_iva =3D 0; --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 79CDC144D18 for ; Mon, 2 Sep 2024 02:31:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244305; cv=none; b=qtKO3qISe4RDPHIGOJoOohIk8TZoHrWwbmOf3T48GAunGmgqpctr7DUF15CbhdNvkUPRKRe7WXG9KpQX11p4MN/X6pPG4U31x4NouSgOvmiPTe872D6hnsPBCGDS1G9t3prQA+a7mHLgIm3pNL6aLRnr7GcGFlgd0etXeSXoKiA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244305; c=relaxed/simple; bh=xGjTtLzwtzCEOrsJubGNRKuJETMqym6KddPhb61StuM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=RJtFSQDSCSFfCBeyKYvfsXG5ZBM4kJEtOyXK9YIYxwLZwO3VPWAbaJUQV4qdmd7EookzO3DDSn7MUoHxarUjmg1D01uU/56/A2IYu0g5TshXxLb9r8NaVKIX3tuSZNuK2EtQ98MhNRqAyszPRC0EiqTFpV4iI6RzlLrDvKY5VNg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=mqm4B+1O; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="mqm4B+1O" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244304; x=1756780304; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xGjTtLzwtzCEOrsJubGNRKuJETMqym6KddPhb61StuM=; b=mqm4B+1ObpeXI7kHS3tX6k1hrqmX9kSxTNUE7rnXYNSjXfz36Sq2pRGA wBSZMuGywZ+/iIWMaBU+WMPcPGzuPhaUenpxFpWzlgLjq06Q6EMk5n+cJ UB7T3nCpuEBBw1AUzAehBNFrbppgaZ4Ckgp2M1+d9EzYsfzVjJJIN+K1d TY9Y0rHPzeZOKBNnoQtq4MNQXaLzGJxBYEPf0nebFL5JHUyQaW6b/692B jc6VTPtzXmwXwechdeVGTdhN78Axkn5H7tujCY6IPHoAI9erbQf0xt2dc Zxf6BmisHe/tyRap6fz8kOmqFJCbPHTzVB91n8gzwnvi4HLBIXdpZEW9U g==; X-CSE-ConnectionGUID: Njx92mJXQkuWgaEe59g6Nw== X-CSE-MsgGUID: 
b3BgaT7XRUOm9+du7fUFVg== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994348" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994348" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:44 -0700 X-CSE-ConnectionGUID: SiikL3VkQBSnEg1JNZ3G4Q== X-CSE-MsgGUID: ZJPvh4o2Riq1RFQ7G3lY+A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359361" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:43 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 13/14] iommu/vt-d: Add qi_batch for dmar_domain Date: Mon, 2 Sep 2024 10:27:23 +0800 Message-Id: <20240902022724.67059-14-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduces a qi_batch structure to hold batched cache invalidation descriptors on a per-dmar_domain basis. A fixed-size descriptor array is used for simplicity. The qi_batch is allocated when the first cache tag is added to the domain and freed during iommu_free_domain(). Signed-off-by: Lu Baolu Signed-off-by: Tina Zhang Link: https://lore.kernel.org/r/20240815065221.50328-4-tina.zhang@intel.com --- drivers/iommu/intel/iommu.h | 14 ++++++++++++++ drivers/iommu/intel/cache.c | 7 +++++++ drivers/iommu/intel/iommu.c | 1 + drivers/iommu/intel/nested.c | 1 + drivers/iommu/intel/svm.c | 5 ++++- 5 files changed, 27 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index ba40db4f8399..428d253f1348 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -584,6 +584,19 @@ struct iommu_domain_info { * to VT-d spec, section 9.3 */ }; =20 +/* + * We start simply by using a fixed size for the batched descriptors. This + * size is currently sufficient for our needs. Future improvements could + * involve dynamically allocating the batch buffer based on actual demand, + * allowing us to adjust the batch size for optimal performance in differe= nt + * scenarios. + */ +#define QI_MAX_BATCHED_DESC_COUNT 16 +struct qi_batch { + struct qi_desc descs[QI_MAX_BATCHED_DESC_COUNT]; + unsigned int index; +}; + struct dmar_domain { int nid; /* node id */ struct xarray iommu_array; /* Attached IOMMU array */ @@ -608,6 +621,7 @@ struct dmar_domain { =20 spinlock_t cache_lock; /* Protect the cache tag list */ struct list_head cache_tags; /* Cache tag list */ + struct qi_batch *qi_batch; /* Batched QI descriptors */ =20 int iommu_superpage;/* Level of superpages supported: 0 =3D=3D 4KiB (no superpages), 1 =3D=3D 2MiB, diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c index 08f7ce2c16c3..2e997d782beb 100644 --- a/drivers/iommu/intel/cache.c +++ b/drivers/iommu/intel/cache.c @@ -190,6 +190,13 @@ int cache_tag_assign_domain(struct dmar_domain *domain, u16 did =3D domain_get_id_for_dev(domain, dev); int ret; =20 + /* domain->qi_bach will be freed in iommu_free_domain() path. 
*/ + if (!domain->qi_batch) { + domain->qi_batch =3D kzalloc(sizeof(*domain->qi_batch), GFP_KERNEL); + if (!domain->qi_batch) + return -ENOMEM; + } + ret =3D __cache_tag_assign_domain(domain, did, dev, pasid); if (ret || domain->domain.type !=3D IOMMU_DOMAIN_NESTED) return ret; diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 779e258188cd..0500022e789e 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1572,6 +1572,7 @@ static void domain_exit(struct dmar_domain *domain) if (WARN_ON(!list_empty(&domain->devices))) return; =20 + kfree(domain->qi_batch); kfree(domain); } =20 diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c index 36a91b1b52be..433c58944401 100644 --- a/drivers/iommu/intel/nested.c +++ b/drivers/iommu/intel/nested.c @@ -83,6 +83,7 @@ static void intel_nested_domain_free(struct iommu_domain = *domain) spin_lock(&s2_domain->s1_lock); list_del(&dmar_domain->s2_link); spin_unlock(&s2_domain->s1_lock); + kfree(dmar_domain->qi_batch); kfree(dmar_domain); } =20 diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index ef12e95e400a..078d1e32a24e 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -184,7 +184,10 @@ static void intel_mm_release(struct mmu_notifier *mn, = struct mm_struct *mm) =20 static void intel_mm_free_notifier(struct mmu_notifier *mn) { - kfree(container_of(mn, struct dmar_domain, notifier)); + struct dmar_domain *domain =3D container_of(mn, struct dmar_domain, notif= ier); + + kfree(domain->qi_batch); + kfree(domain); } =20 static const struct mmu_notifier_ops intel_mmuops =3D { --=20 2.34.1 From nobody Fri Dec 19 02:49:51 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CD4D014AD20 for ; Mon, 2 Sep 2024 02:31:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244308; cv=none; b=ru9v1tBiFew2th2tjJGXmtoQ6A0uZ3BJI72dWZaT35gQ880xFmo7L2bfHDcgs0SyOTJb2uqCWHBxL8r0U5GzEjQjbeyX67NCUJ6POztpj0IisURr7rqJug8bhiya5Jj7tmVxcHVI5V1KU9ibInBBwdvIVVFcVZwUVaVPJ188egU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725244308; c=relaxed/simple; bh=SuXh1x2aBDH4m7kXr22UkyynOu2sb7NDnWcLtZYzL9g=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cn0DMIZQ9hNl53A7KXyi4AGyLTQwNMhLjKQXyYTaeiDACV6DjTjsp9zZ9MTaCth3xbdM+7cjTv4A4L24EqOITJztPJrabjGllQ4jViDKMaBbnw6dj96DsaiX1RZZX/eOB9bhR5SAkFqeBrs0cN/gIJ3DTjLF3Qx9su7e/wItde0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=KEjavwIk; arc=none smtp.client-ip=192.198.163.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="KEjavwIk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1725244307; x=1756780307; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=SuXh1x2aBDH4m7kXr22UkyynOu2sb7NDnWcLtZYzL9g=; b=KEjavwIk7w/kSO4pBDh0ead4ClgJv0b9eJuUxgXzP3y2p7ZDZ0pHHLCC NP1ltyO6jZ/81vmEIIQFkVggJJwGkC6ssVp8mz3N2gKvgmTLj46gOF9cB ggGVFiHEmmyDSy11yWW5jQeu4T9ogbglFIg0eFS9TnYYagU6cZu4CvHLk aPpqCGM3MDufZmgER2UKf2ixyjG9rbOFl3f3jZo6XS9KJcCvL6zVOFUrP rDaaBHwk5wqRiYme0mkWBwogr2YTeSTHStFZxZ0mnhXFu658/OhafLt0Z oNsoE8Zd49Bn2X6jCcOU+KGn0+ingfqY3HlZHWUg6GTWovATWx4YjZ22R g==; X-CSE-ConnectionGUID: m5OpY+mlRsSmuzIEF0XCag== X-CSE-MsgGUID: FaCwwrpORZuz0I2ThqVimQ== X-IronPort-AV: E=McAfee;i="6700,10204,11182"; a="23994353" X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="23994353" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa108.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Sep 2024 19:31:46 -0700 X-CSE-ConnectionGUID: QCManA9OTjmiSvACN9UoVQ== X-CSE-MsgGUID: TR13/FedQtSK0xkPIqMYMw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,195,1719903600"; d="scan'208";a="69359372" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa004.jf.intel.com with ESMTP; 01 Sep 2024 19:31:45 -0700 From: Lu Baolu To: Joerg Roedel , Will Deacon Cc: Tina Zhang , Sanjay K Kumar , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 14/14] iommu/vt-d: Introduce batched cache invalidation Date: Mon, 2 Sep 2024 10:27:24 +0800 Message-Id: <20240902022724.67059-15-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240902022724.67059-1-baolu.lu@linux.intel.com> References: <20240902022724.67059-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Tina Zhang Converts IOTLB and Dev-IOTLB invalidation to a batched model. Cache tag invalidation requests for a domain are now accumulated in a qi_batch structure before being flushed in bulk. It replaces the previous per- request qi_flush approach with a more efficient batching mechanism. Co-developed-by: Lu Baolu Signed-off-by: Lu Baolu Signed-off-by: Tina Zhang Link: https://lore.kernel.org/r/20240815065221.50328-5-tina.zhang@intel.com --- drivers/iommu/intel/cache.c | 122 +++++++++++++++++++++++++++++++----- 1 file changed, 107 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c index 2e997d782beb..e5b89f728ad3 100644 --- a/drivers/iommu/intel/cache.c +++ b/drivers/iommu/intel/cache.c @@ -262,6 +262,79 @@ static unsigned long calculate_psi_aligned_address(uns= igned long start, return ALIGN_DOWN(start, VTD_PAGE_SIZE << mask); } =20 +static void qi_batch_flush_descs(struct intel_iommu *iommu, struct qi_batc= h *batch) +{ + if (!iommu || !batch->index) + return; + + qi_submit_sync(iommu, batch->descs, batch->index, 0); + + /* Reset the index value and clean the whole batch buffer. 
*/ + memset(batch, 0, sizeof(*batch)); +} + +static void qi_batch_increment_index(struct intel_iommu *iommu, struct qi_= batch *batch) +{ + if (++batch->index =3D=3D QI_MAX_BATCHED_DESC_COUNT) + qi_batch_flush_descs(iommu, batch); +} + +static void qi_batch_add_iotlb(struct intel_iommu *iommu, u16 did, u64 add= r, + unsigned int size_order, u64 type, + struct qi_batch *batch) +{ + qi_desc_iotlb(iommu, did, addr, size_order, type, &batch->descs[batch->in= dex]); + qi_batch_increment_index(iommu, batch); +} + +static void qi_batch_add_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16= pfsid, + u16 qdep, u64 addr, unsigned int mask, + struct qi_batch *batch) +{ + /* + * According to VT-d spec, software is recommended to not submit any Devi= ce-TLB + * invalidation requests while address remapping hardware is disabled. + */ + if (!(iommu->gcmd & DMA_GCMD_TE)) + return; + + qi_desc_dev_iotlb(sid, pfsid, qdep, addr, mask, &batch->descs[batch->inde= x]); + qi_batch_increment_index(iommu, batch); +} + +static void qi_batch_add_piotlb(struct intel_iommu *iommu, u16 did, u32 pa= sid, + u64 addr, unsigned long npages, bool ih, + struct qi_batch *batch) +{ + /* + * npages =3D=3D -1 means a PASID-selective invalidation, otherwise, + * a positive value for Page-selective-within-PASID invalidation. + * 0 is not a valid input. + */ + if (!npages) + return; + + qi_desc_piotlb(did, pasid, addr, npages, ih, &batch->descs[batch->index]); + qi_batch_increment_index(iommu, batch); +} + +static void qi_batch_add_pasid_dev_iotlb(struct intel_iommu *iommu, u16 si= d, u16 pfsid, + u32 pasid, u16 qdep, u64 addr, + unsigned int size_order, struct qi_batch *batch) +{ + /* + * According to VT-d spec, software is recommended to not submit any + * Device-TLB invalidation requests while address remapping hardware + * is disabled. 
+ */ + if (!(iommu->gcmd & DMA_GCMD_TE)) + return; + + qi_desc_dev_iotlb_pasid(sid, pfsid, pasid, qdep, addr, size_order, + &batch->descs[batch->index]); + qi_batch_increment_index(iommu, batch); +} + static void cache_tag_flush_iotlb(struct dmar_domain *domain, struct cache= _tag *tag, unsigned long addr, unsigned long pages, unsigned long mask, int ih) @@ -270,7 +343,8 @@ static void cache_tag_flush_iotlb(struct dmar_domain *d= omain, struct cache_tag * u64 type =3D DMA_TLB_PSI_FLUSH; =20 if (domain->use_first_level) { - qi_flush_piotlb(iommu, tag->domain_id, tag->pasid, addr, pages, ih); + qi_batch_add_piotlb(iommu, tag->domain_id, tag->pasid, addr, + pages, ih, domain->qi_batch); return; } =20 @@ -287,7 +361,8 @@ static void cache_tag_flush_iotlb(struct dmar_domain *d= omain, struct cache_tag * } =20 if (ecap_qis(iommu->ecap)) - qi_flush_iotlb(iommu, tag->domain_id, addr | ih, mask, type); + qi_batch_add_iotlb(iommu, tag->domain_id, addr | ih, mask, type, + domain->qi_batch); else __iommu_flush_iotlb(iommu, tag->domain_id, addr | ih, mask, type); } @@ -303,19 +378,20 @@ static void cache_tag_flush_devtlb_psi(struct dmar_do= main *domain, struct cache_ sid =3D PCI_DEVID(info->bus, info->devfn); =20 if (tag->pasid =3D=3D IOMMU_NO_PASID) { - qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, - addr, mask); + qi_batch_add_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, + addr, mask, domain->qi_batch); if (info->dtlb_extra_inval) - qi_flush_dev_iotlb(iommu, sid, info->pfsid, - info->ats_qdep, addr, mask); + qi_batch_add_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, + addr, mask, domain->qi_batch); return; } =20 - qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid, tag->pasid, - info->ats_qdep, addr, mask); + qi_batch_add_pasid_dev_iotlb(iommu, sid, info->pfsid, tag->pasid, + info->ats_qdep, addr, mask, domain->qi_batch); if (info->dtlb_extra_inval) - qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid, tag->pasid, - info->ats_qdep, addr, mask); + qi_batch_add_pasid_dev_iotlb(iommu, sid, info->pfsid, tag->pasid, + info->ats_qdep, addr, mask, + domain->qi_batch); } =20 static void cache_tag_flush_devtlb_all(struct dmar_domain *domain, struct = cache_tag *tag) @@ -327,11 +403,11 @@ static void cache_tag_flush_devtlb_all(struct dmar_do= main *domain, struct cache_ info =3D dev_iommu_priv_get(tag->dev); sid =3D PCI_DEVID(info->bus, info->devfn); =20 - qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, 0, - MAX_AGAW_PFN_WIDTH); + qi_batch_add_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, 0, + MAX_AGAW_PFN_WIDTH, domain->qi_batch); if (info->dtlb_extra_inval) - qi_flush_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, 0, - MAX_AGAW_PFN_WIDTH); + qi_batch_add_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, 0, + MAX_AGAW_PFN_WIDTH, domain->qi_batch); } =20 /* @@ -341,6 +417,7 @@ static void cache_tag_flush_devtlb_all(struct dmar_doma= in *domain, struct cache_ void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start, unsigned long end, int ih) { + struct intel_iommu *iommu =3D NULL; unsigned long pages, mask, addr; struct cache_tag *tag; unsigned long flags; @@ -349,6 +426,10 @@ void cache_tag_flush_range(struct dmar_domain *domain,= unsigned long start, =20 spin_lock_irqsave(&domain->cache_lock, flags); list_for_each_entry(tag, &domain->cache_tags, node) { + if (iommu && iommu !=3D tag->iommu) + qi_batch_flush_descs(iommu, domain->qi_batch); + iommu =3D tag->iommu; + switch (tag->type) { case CACHE_TAG_IOTLB: case CACHE_TAG_NESTING_IOTLB: @@ -372,6 
+453,7 @@ void cache_tag_flush_range(struct dmar_domain *domain, = unsigned long start, =20 trace_cache_tag_flush_range(tag, start, end, addr, pages, mask); } + qi_batch_flush_descs(iommu, domain->qi_batch); spin_unlock_irqrestore(&domain->cache_lock, flags); } =20 @@ -381,11 +463,16 @@ void cache_tag_flush_range(struct dmar_domain *domain= , unsigned long start, */ void cache_tag_flush_all(struct dmar_domain *domain) { + struct intel_iommu *iommu =3D NULL; struct cache_tag *tag; unsigned long flags; =20 spin_lock_irqsave(&domain->cache_lock, flags); list_for_each_entry(tag, &domain->cache_tags, node) { + if (iommu && iommu !=3D tag->iommu) + qi_batch_flush_descs(iommu, domain->qi_batch); + iommu =3D tag->iommu; + switch (tag->type) { case CACHE_TAG_IOTLB: case CACHE_TAG_NESTING_IOTLB: @@ -399,6 +486,7 @@ void cache_tag_flush_all(struct dmar_domain *domain) =20 trace_cache_tag_flush_all(tag); } + qi_batch_flush_descs(iommu, domain->qi_batch); spin_unlock_irqrestore(&domain->cache_lock, flags); } =20 @@ -416,6 +504,7 @@ void cache_tag_flush_all(struct dmar_domain *domain) void cache_tag_flush_range_np(struct dmar_domain *domain, unsigned long st= art, unsigned long end) { + struct intel_iommu *iommu =3D NULL; unsigned long pages, mask, addr; struct cache_tag *tag; unsigned long flags; @@ -424,7 +513,9 @@ void cache_tag_flush_range_np(struct dmar_domain *domai= n, unsigned long start, =20 spin_lock_irqsave(&domain->cache_lock, flags); list_for_each_entry(tag, &domain->cache_tags, node) { - struct intel_iommu *iommu =3D tag->iommu; + if (iommu && iommu !=3D tag->iommu) + qi_batch_flush_descs(iommu, domain->qi_batch); + iommu =3D tag->iommu; =20 if (!cap_caching_mode(iommu->cap) || domain->use_first_level) { iommu_flush_write_buffer(iommu); @@ -437,5 +528,6 @@ void cache_tag_flush_range_np(struct dmar_domain *domai= n, unsigned long start, =20 trace_cache_tag_flush_range_np(tag, start, end, addr, pages, mask); } + qi_batch_flush_descs(iommu, domain->qi_batch); spin_unlock_irqrestore(&domain->cache_lock, flags); } --=20 2.34.1
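For readers skimming the series, here is a minimal, self-contained sketch (not part of any patch) of the accumulate-and-flush pattern that patches 13 and 14 introduce. QI_MAX_BATCHED_DESC_COUNT and the qi_batch layout follow the patches; submit_batch() is a hypothetical stand-in for qi_submit_sync(), and the demo loop in main() is invented for illustration.

/*
 * Illustrative sketch only: invalidation descriptors are accumulated in a
 * fixed-size per-domain buffer and submitted to the queue in one shot,
 * either when the buffer fills up or when the caller is done.
 */
#include <stdio.h>
#include <string.h>

#define QI_MAX_BATCHED_DESC_COUNT 16

struct qi_desc { unsigned long long qw0, qw1, qw2, qw3; };

struct qi_batch {
	struct qi_desc descs[QI_MAX_BATCHED_DESC_COUNT];
	unsigned int index;
};

/* Stand-in for qi_submit_sync(): one queue submission per batch, not per request. */
static void submit_batch(struct qi_batch *batch)
{
	if (!batch->index)
		return;
	printf("submitting %u descriptors in one shot\n", batch->index);
	memset(batch, 0, sizeof(*batch));	/* reset index and buffer */
}

static void batch_add(struct qi_batch *batch, struct qi_desc desc)
{
	batch->descs[batch->index] = desc;
	if (++batch->index == QI_MAX_BATCHED_DESC_COUNT)
		submit_batch(batch);	/* flush early when the buffer is full */
}

int main(void)
{
	struct qi_batch batch = { .index = 0 };

	/* Accumulate per-cache-tag invalidation requests ... */
	for (int i = 0; i < 20; i++)
		batch_add(&batch, (struct qi_desc){ .qw0 = (unsigned long long)i });

	/* ... and flush whatever remains before dropping the lock. */
	submit_batch(&batch);
	return 0;
}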