From: Lu Baolu
To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
Cc: Huang Jiaqing, Ethan Zhao, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH 1/2] iommu/vt-d: Use rbtree to track iommu probed devices
Date: Thu, 15 Feb 2024 15:22:48 +0800
Message-Id: <20240215072249.4465-2-baolu.lu@linux.intel.com>
In-Reply-To: <20240215072249.4465-1-baolu.lu@linux.intel.com>
References: <20240215072249.4465-1-baolu.lu@linux.intel.com>

Use a red-black tree (rbtree) to track devices probed by the driver's
probe_device callback. These devices need to be looked up quickly by
source ID when the hardware reports a fault, either recoverable or
unrecoverable.

Fault reporting paths are critical. Searching a list in this scenario is
inefficient, with an algorithmic complexity of O(n).
An rbtree is a self-balancing binary search tree whose search time is
O(log(n)) even in the worst case. This significant performance
improvement makes an rbtree the better choice.

Furthermore, one rbtree is maintained per IOMMU, eliminating the need
for global searches and further improving efficiency in critical fault
paths. Each rbtree is protected by a spinlock taken with interrupts
disabled, so that it can be accessed safely even from interrupt context.

Co-developed-by: Huang Jiaqing
Signed-off-by: Huang Jiaqing
Signed-off-by: Lu Baolu
Reviewed-by: Jason Gunthorpe
---
 drivers/iommu/intel/iommu.h |  7 +++++
 drivers/iommu/intel/dmar.c  |  3 +-
 drivers/iommu/intel/iommu.c | 62 +++++++++++++++++++++++++++++++++++--
 3 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index cf9a28c7fab8..54eeaa8e35a9 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -716,6 +716,11 @@ struct intel_iommu {
 	struct q_inval *qi; /* Queued invalidation info */
 	u32 iommu_state[MAX_SR_DMAR_REGS]; /* Store iommu states between suspend and resume.*/
 
+	/* rb tree for all probed devices */
+	struct rb_root device_rbtree;
+	/* protect the device_rbtree */
+	spinlock_t device_rbtree_lock;
+
 #ifdef CONFIG_IRQ_REMAP
 	struct ir_table *ir_table; /* Interrupt remapping info */
 	struct irq_domain *ir_domain;
@@ -749,6 +754,8 @@ struct device_domain_info {
 	struct intel_iommu *iommu; /* IOMMU used by this device */
 	struct dmar_domain *domain; /* pointer to domain */
 	struct pasid_table *pasid_table; /* pasid table */
+	/* device tracking node(lookup by PCI RID) */
+	struct rb_node node;
 #ifdef CONFIG_INTEL_IOMMU_DEBUGFS
 	struct dentry *debugfs_dentry; /* pointer to device directory dentry */
 #endif
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 23cb80d62a9a..f9b63c2875f7 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1095,7 +1095,8 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
 	iommu->agaw = agaw;
 	iommu->msagaw = msagaw;
 	iommu->segment = drhd->segment;
-
+	iommu->device_rbtree = RB_ROOT;
+	spin_lock_init(&iommu->device_rbtree_lock);
 	iommu->node = NUMA_NO_NODE;
 
 	ver = readl(iommu->reg + DMAR_VER_REG);
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index a81a2be9b870..09009d96e553 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -96,6 +96,55 @@ static phys_addr_t root_entry_uctp(struct root_entry *re)
 	return re->hi & VTD_PAGE_MASK;
 }
 
+static int device_rid_cmp_key(const void *key, const struct rb_node *node)
+{
+	struct device_domain_info *info =
+		rb_entry(node, struct device_domain_info, node);
+	const u16 *rid_lhs = key;
+
+	if (*rid_lhs < PCI_DEVID(info->bus, info->devfn))
+		return -1;
+
+	if (*rid_lhs > PCI_DEVID(info->bus, info->devfn))
+		return 1;
+
+	return 0;
+}
+
+static int device_rid_cmp(struct rb_node *lhs, const struct rb_node *rhs)
+{
+	struct device_domain_info *info =
+		rb_entry(lhs, struct device_domain_info, node);
+	u16 key = PCI_DEVID(info->bus, info->devfn);
+
+	return device_rid_cmp_key(&key, rhs);
+}
+
+static int device_rbtree_insert(struct intel_iommu *iommu,
+				struct device_domain_info *info)
+{
+	struct rb_node *curr;
+	unsigned long flags;
+
+	spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
+	curr = rb_find_add(&info->node, &iommu->device_rbtree, device_rid_cmp);
+	spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);
+	if (curr)
+		dev_warn(info->dev, "device already in rbtree\n");
+
+	return curr ? -EEXIST : 0;
+}
+
+static void device_rbtree_remove(struct device_domain_info *info)
+{
+	struct intel_iommu *iommu = info->iommu;
+	unsigned long flags;
+
+	spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
+	rb_erase(&info->node, &iommu->device_rbtree);
+	spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);
+}
+
 /*
  * This domain is a statically identity mapping domain.
  *	1. This domain creats a static 1:1 mapping to all usable memory.
@@ -4264,25 +4313,34 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
 	}
 
 	dev_iommu_priv_set(dev, info);
+	ret = device_rbtree_insert(iommu, info);
+	if (ret)
+		goto free;
 
 	if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev)) {
 		ret = intel_pasid_alloc_table(dev);
 		if (ret) {
 			dev_err(dev, "PASID table allocation failed\n");
-			kfree(info);
-			return ERR_PTR(ret);
+			goto clear_rbtree;
 		}
 	}
 
 	intel_iommu_debugfs_create_dev(info);
 
 	return &iommu->iommu;
+clear_rbtree:
+	device_rbtree_remove(info);
+free:
+	kfree(info);
+
+	return ERR_PTR(ret);
 }
 
 static void intel_iommu_release_device(struct device *dev)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 
+	device_rbtree_remove(info);
 	dmar_remove_one_dev_info(dev);
 	intel_pasid_free_table(dev);
 	intel_iommu_debugfs_remove_dev(info);
-- 
2.34.1

From: Lu Baolu
To: Joerg Roedel, Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian
Cc: Huang Jiaqing, Ethan Zhao, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH 2/2] iommu/vt-d: Use device rbtree in iopf reporting path
Date: Thu, 15 Feb 2024 15:22:49 +0800
Message-Id: <20240215072249.4465-3-baolu.lu@linux.intel.com>
In-Reply-To: <20240215072249.4465-1-baolu.lu@linux.intel.com>
References: <20240215072249.4465-1-baolu.lu@linux.intel.com>

The existing IO page fault handler locates the PCI device by calling
pci_get_domain_bus_and_slot(), which searches the list of all PCI
devices until the desired device is found. To improve lookup efficiency,
introduce a helper, device_rbtree_find(), that searches for the device
in the per-IOMMU rbtree, and use it in place of
pci_get_domain_bus_and_slot() in the IO page fault handling path.

Co-developed-by: Huang Jiaqing
Signed-off-by: Huang Jiaqing
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/iommu.h |  1 +
 drivers/iommu/intel/iommu.c | 29 +++++++++++++++++++++++++++++
 drivers/iommu/intel/svm.c   | 14 ++++++--------
 3 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 54eeaa8e35a9..f13c228924f8 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1081,6 +1081,7 @@ void free_pgtable_page(void *vaddr);
 void iommu_flush_write_buffer(struct intel_iommu *iommu);
 struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *parent,
 					       const struct iommu_user_data *user_data);
+struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid);
 
 #ifdef CONFIG_INTEL_IOMMU_SVM
 void intel_svm_check(struct intel_iommu *iommu);
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 09009d96e553..d92c680bcc96 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -120,6 +120,35 @@ static int device_rid_cmp(struct rb_node *lhs, const struct rb_node *rhs)
 	return device_rid_cmp_key(&key, rhs);
 }
 
+/*
+ * Looks up an IOMMU-probed device using its source ID.
+ *
+ * If the device is found:
+ *  - Increments its reference count.
+ *  - Returns a pointer to the device.
+ *  - The caller must call put_device() after using the pointer.
+ *
+ * If the device is not found, returns NULL.
+ */
+struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid)
+{
+	struct device_domain_info *info;
+	struct device *dev = NULL;
+	struct rb_node *node;
+	unsigned long flags;
+
+	spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
+	node = rb_find(&rid, &iommu->device_rbtree, device_rid_cmp_key);
+	if (node) {
+		info = rb_entry(node, struct device_domain_info, node);
+		dev = info->dev;
+		get_device(dev);
+	}
+	spin_unlock_irqrestore(&iommu->device_rbtree_lock, flags);
+
+	return dev;
+}
+
 static int device_rbtree_insert(struct intel_iommu *iommu,
 				struct device_domain_info *info)
 {
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index b644d57da841..717b7041973c 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -645,7 +645,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 	struct intel_iommu *iommu = d;
 	struct page_req_dsc *req;
 	int head, tail, handled;
-	struct pci_dev *pdev;
+	struct device *dev;
 	u64 address;
 
 	/*
@@ -691,21 +691,19 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 		if (unlikely(req->lpig && !req->rd_req && !req->wr_req))
 			goto prq_advance;
 
-		pdev = pci_get_domain_bus_and_slot(iommu->segment,
-						   PCI_BUS_NUM(req->rid),
-						   req->rid & 0xff);
 		/*
 		 * If prq is to be handled outside iommu driver via receiver of
 		 * the fault notifiers, we skip the page response here.
 		 */
-		if (!pdev)
+		dev = device_rbtree_find(iommu, req->rid);
+		if (!dev)
 			goto bad_req;
 
-		intel_svm_prq_report(iommu, &pdev->dev, req);
-		trace_prq_report(iommu, &pdev->dev, req->qw_0, req->qw_1,
+		intel_svm_prq_report(iommu, dev, req);
+		trace_prq_report(iommu, dev, req->qw_0, req->qw_1,
 				 req->priv_data[0], req->priv_data[1],
 				 iommu->prq_seq_number++);
-		pci_dev_put(pdev);
+		put_device(dev);
 prq_advance:
 		head = (head + sizeof(*req)) & PRQ_RING_MASK;
 	}
-- 
2.34.1
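
Illustrative note, not part of the series: the lookup scheme used by
both patches (a three-way comparison on the 16-bit source ID, used to
descend a search tree) can be tried out in userspace with the sketch
below. It rests on several assumptions. All demo_* names are
hypothetical and do not exist in the kernel. The tree is a plain,
unbalanced binary search tree, so it does not have the O(log(n))
guarantee that the kernel rbtree provides through rebalancing. Locking
and device reference counting are omitted entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct device_domain_info. */
struct demo_info {
	uint8_t bus;
	uint8_t devfn;
	struct demo_info *left, *right;
};

/* Same composition as the kernel's PCI_DEVID(): bus in the high byte. */
static uint16_t demo_devid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

/* Three-way compare of a key RID against a node, as device_rid_cmp_key() does. */
static int demo_rid_cmp_key(uint16_t rid, const struct demo_info *info)
{
	uint16_t node_rid = demo_devid(info->bus, info->devfn);

	if (rid < node_rid)
		return -1;
	if (rid > node_rid)
		return 1;
	return 0;
}

/*
 * Insert into a plain (unbalanced) binary search tree.
 * Returns NULL on success, or the existing node if the RID is already
 * present, which mirrors the return convention of rb_find_add().
 */
static struct demo_info *demo_insert(struct demo_info **root,
				     struct demo_info *info)
{
	struct demo_info **link = root;
	uint16_t rid = demo_devid(info->bus, info->devfn);

	while (*link) {
		int c = demo_rid_cmp_key(rid, *link);

		if (c == 0)
			return *link;
		link = c < 0 ? &(*link)->left : &(*link)->right;
	}
	info->left = info->right = NULL;
	*link = info;
	return NULL;
}

/* Look up a node by RID; returns NULL when no node matches. */
static struct demo_info *demo_find(struct demo_info *root, uint16_t rid)
{
	while (root) {
		int c = demo_rid_cmp_key(rid, root);

		if (c == 0)
			return root;
		root = c < 0 ? root->left : root->right;
	}
	return NULL;
}
```

A caller would start from an empty root (struct demo_info *root = NULL),
add entries with demo_insert(&root, &entry), and resolve a faulting
source ID with demo_find(root, rid). In the patches themselves the same
roles are played by rb_find_add() and rb_find() under
device_rbtree_lock.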