[PATCH 3/7] iommu/vt-d: Use device_block_translation() in dev_attach error path

Lu Baolu posted 7 patches 3 years, 1 month ago
There is a newer version of this series
[PATCH 3/7] iommu/vt-d: Use device_block_translation() in dev_attach error path
Posted by Lu Baolu 3 years, 1 month ago
If domain attaching to device fails, the IOMMU driver should bring the
device to blocking DMA state. The upper layer is expected to recover it
by attaching a new domain. Use device_block_translation() in the error
path of dev_attach to make the behavior specific.

The difference between device_block_translation() and the previous
dmar_remove_one_dev_info() is that the latter disables PCIe ATS and the
related PCIe features. This is unnecessary as these features are not per
domain capabilities, disabling them during domain switching is
unnecessary.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7374a03cbe27..b956c411f2bb 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -277,8 +277,8 @@ static LIST_HEAD(dmar_satc_units);
 #define for_each_rmrr_units(rmrr) \
 	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
 
-static void dmar_remove_one_dev_info(struct device *dev);
 static void intel_iommu_domain_free(struct iommu_domain *domain);
+static void device_block_translation(struct device *dev);
 
 int dmar_disabled = !IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON);
 int intel_iommu_sm = IS_ENABLED(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON);
@@ -2490,7 +2490,7 @@ static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev)
 					dev, PASID_RID2PASID);
 		if (ret) {
 			dev_err(dev, "Setup RID2PASID failed\n");
-			dmar_remove_one_dev_info(dev);
+			device_block_translation(dev);
 			return ret;
 		}
 	}
@@ -2498,7 +2498,7 @@ static int domain_add_dev_info(struct dmar_domain *domain, struct device *dev)
 	ret = domain_context_mapping(domain, dev);
 	if (ret) {
 		dev_err(dev, "Domain context map failed\n");
-		dmar_remove_one_dev_info(dev);
+		device_block_translation(dev);
 		return ret;
 	}
 
@@ -4283,7 +4283,7 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 		struct device_domain_info *info = dev_iommu_priv_get(dev);
 
 		if (info->domain)
-			dmar_remove_one_dev_info(dev);
+			device_block_translation(dev);
 	}
 
 	ret = prepare_domain_attach_device(domain, dev);
-- 
2.34.1
RE: [PATCH 3/7] iommu/vt-d: Use device_block_translation() in dev_attach error path
Posted by Tian, Kevin 3 years, 1 month ago
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Thursday, November 3, 2022 1:53 PM
> 
> If domain attaching to device fails, the IOMMU driver should bring the
> device to blocking DMA state. The upper layer is expected to recover it
> by attaching a new domain. Use device_block_translation() in the error
> path of dev_attach to make the behavior specific.
> 
> The difference between device_block_translation() and the previous
> dmar_remove_one_dev_info() is that the latter disables PCIe ATS and the
> related PCIe features. This is unnecessary as these features are not per
> domain capabilities, disabling them during domain switching is
> unnecessary.

well, the opposite argument is that when the DMA is blocked what is
the point of enabling ATS/PRI on the device.

btw this change is partial. @attach_dev still calls iommu_enable_pci_caps()
which always tries to enable PCI capabilities w/o checking whether they
have been enabled or not. Then user will hit -EBUSY when related PCI
helpers are called.

another difference worthy of pointing out is that in scalable mode it is
the RID2PASID entry instead of context entry being cleared.
Re: [PATCH 3/7] iommu/vt-d: Use device_block_translation() in dev_attach error path
Posted by Baolu Lu 3 years, 1 month ago
On 2022/11/4 10:18, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Thursday, November 3, 2022 1:53 PM
>>
>> If domain attaching to device fails, the IOMMU driver should bring the
>> device to blocking DMA state. The upper layer is expected to recover it
>> by attaching a new domain. Use device_block_translation() in the error
>> path of dev_attach to make the behavior specific.
>>
>> The difference between device_block_translation() and the previous
>> dmar_remove_one_dev_info() is that the latter disables PCIe ATS and the
>> related PCIe features. This is unnecessary as these features are not per
>> domain capabilities, disabling them during domain switching is
>> unnecessary.
> 
> well, the opposite argument is that when the DMA is blocked what is
> the point of enabling ATS/PRI on the device.

It's not "DMA is blocked", but "DMA without PASID is blocked". :-)

As above term implies, it's conceptually incorrect to disable ATS/PRI
only because DMA without PASID is about to blocked.

> 
> btw this change is partial. @attach_dev still calls iommu_enable_pci_caps()
> which always tries to enable PCI capabilities w/o checking whether they
> have been enabled or not. Then user will hit -EBUSY when related PCI
> helpers are called.

Good catch!

How about moving iommu_enable/disable_pci_caps() into
iommu_probe/release_device() path? I may look into details if there's no
significant arch gaps.

> 
> another difference worthy of pointing out is that in scalable mode it is
> the RID2PASID entry instead of context entry being cleared.

Yes. Will update the commit message.

Best regards,
baolu
Re: [PATCH 3/7] iommu/vt-d: Use device_block_translation() in dev_attach error path
Posted by Baolu Lu 3 years, 1 month ago
On 2022/11/5 10:09, Baolu Lu wrote:
>>
>> btw this change is partial. @attach_dev still calls 
>> iommu_enable_pci_caps()
>> which always tries to enable PCI capabilities w/o checking whether they
>> have been enabled or not. Then user will hit -EBUSY when related PCI
>> helpers are called.
> 
> Good catch!
> 
> How about moving iommu_enable/disable_pci_caps() into
> iommu_probe/release_device() path? I may look into details if there's no
> significant arch gaps.

After reconsideration, it seems that this is not a feasible solution.
This changes the order in which PCI devices enable DMA.In addition, for
the kdump kernel, this is not feasible because there may be on-going DMA
on the device.

Perhaps I can make the iommu_enable_pci_caps() reentrant.

Best regards,
baolu