[v2] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

[Patch v2] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

Posted by Ashish Mhetre 4 years ago

Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
entries to not be invalidated correctly. The problem is that the walk
cache index generated for IOVA is not same across translation and
invalidation requests. This is leading to page faults when PMD entry is
released during unmap and populated with new PTE table during subsequent
map request. Disabling large page mappings avoids the release of PMD
entry and avoid translations seeing stale PMD entry in walk cache.
Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
Tegra234 devices. This is recommended fix from Tegra hardware design
team.

Co-developed-by: Pritesh Raithatha <praithatha@nvidia.com>
Signed-off-by: Pritesh Raithatha <praithatha@nvidia.com>
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
---
Changes in v2:
- Using init_context() to override pgsize_bitmap instead of new function

 drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 30 ++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
index 01e9b50b10a1..87bf522b9d2e 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
@@ -258,6 +258,34 @@ static void nvidia_smmu_probe_finalize(struct arm_smmu_device *smmu, struct devi
 			dev_name(dev), err);
 }
 
+static int nvidia_smmu_init_context(struct arm_smmu_domain *smmu_domain,
+				    struct io_pgtable_cfg *pgtbl_cfg,
+				    struct device *dev)
+{
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	const struct device_node *np = smmu->dev->of_node;
+
+	/*
+	 * Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
+	 * entries to not be invalidated correctly. The problem is that the walk
+	 * cache index generated for IOVA is not same across translation and
+	 * invalidation requests. This is leading to page faults when PMD entry
+	 * is released during unmap and populated with new PTE table during
+	 * subsequent map request. Disabling large page mappings avoids the
+	 * release of PMD entry and avoid translations seeing stale PMD entry in
+	 * walk cache.
+	 * Fix this by limiting the page mappings to PAGE_SIZE on Tegra194 and
+	 * Tegra234.
+	 */
+	if (of_device_is_compatible(np, "nvidia,tegra234-smmu") ||
+	    of_device_is_compatible(np, "nvidia,tegra194-smmu")) {
+		smmu->pgsize_bitmap = PAGE_SIZE;
+		pgtbl_cfg->pgsize_bitmap = smmu->pgsize_bitmap;
+	}
+
+	return 0;
+}
+
 static const struct arm_smmu_impl nvidia_smmu_impl = {
 	.read_reg = nvidia_smmu_read_reg,
 	.write_reg = nvidia_smmu_write_reg,
@@ -268,10 +296,12 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
 	.global_fault = nvidia_smmu_global_fault,
 	.context_fault = nvidia_smmu_context_fault,
 	.probe_finalize = nvidia_smmu_probe_finalize,
+	.init_context = nvidia_smmu_init_context,
 };
 
 static const struct arm_smmu_impl nvidia_smmu_single_impl = {
 	.probe_finalize = nvidia_smmu_probe_finalize,
+	.init_context = nvidia_smmu_init_context,
 };
 
 struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
-- 
2.17.1

Re: [Patch v2] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

Posted by Will Deacon 4 years ago

On Thu, 21 Apr 2022 13:45:04 +0530, Ashish Mhetre wrote:
> Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
> entries to not be invalidated correctly. The problem is that the walk
> cache index generated for IOVA is not same across translation and
> invalidation requests. This is leading to page faults when PMD entry is
> released during unmap and populated with new PTE table during subsequent
> map request. Disabling large page mappings avoids the release of PMD
> entry and avoid translations seeing stale PMD entry in walk cache.
> Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
> Tegra234 devices. This is recommended fix from Tegra hardware design
> team.
> 
> [...]

Applied to will (for-joerg/arm-smmu/fixes), thanks!

[1/1] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu
      https://git.kernel.org/will/c/4a25f2ea0e03

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

Re: [Patch v2] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

Posted by Jon Hunter 4 years ago

Hi Will,

On 22/04/2022 11:55, Will Deacon wrote:
> On Thu, 21 Apr 2022 13:45:04 +0530, Ashish Mhetre wrote:
>> Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
>> entries to not be invalidated correctly. The problem is that the walk
>> cache index generated for IOVA is not same across translation and
>> invalidation requests. This is leading to page faults when PMD entry is
>> released during unmap and populated with new PTE table during subsequent
>> map request. Disabling large page mappings avoids the release of PMD
>> entry and avoid translations seeing stale PMD entry in walk cache.
>> Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
>> Tegra234 devices. This is recommended fix from Tegra hardware design
>> team.
>>
>> [...]
> 
> Applied to will (for-joerg/arm-smmu/fixes), thanks!
> 
> [1/1] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu
>        https://git.kernel.org/will/c/4a25f2ea0e03
> 

Thanks for applying. Sorry to be late to the party, but feel free
to add my ...

Reviewed-by: Jon Hunter <jonathanh@nvidia.com>

Also any chance we could tag for stable? Probably the most
appropriate fixes-tag would be ...

Fixes: aab5a1c88276 ("iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage")

Thanks!
Jon

-- 
nvpublic

RE: [Patch v2] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

Posted by Krishna Reddy 4 years ago

> Tegra194 and Tegra234 SoCs have the erratum that causes walk cache entries to
> not be invalidated correctly. The problem is that the walk cache index generated
> for IOVA is not same across translation and invalidation requests. This is leading
> to page faults when PMD entry is released during unmap and populated with
> new PTE table during subsequent map request. Disabling large page mappings
> avoids the release of PMD entry and avoid translations seeing stale PMD entry in
> walk cache.
> Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
> Tegra234 devices. This is recommended fix from Tegra hardware design team.
> 
> Co-developed-by: Pritesh Raithatha <praithatha@nvidia.com>
> Signed-off-by: Pritesh Raithatha <praithatha@nvidia.com>
> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
> ---
> Changes in v2:
> - Using init_context() to override pgsize_bitmap instead of new function
> 
>  drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 30
> ++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> index 01e9b50b10a1..87bf522b9d2e 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> @@ -258,6 +258,34 @@ static void nvidia_smmu_probe_finalize(struct
> arm_smmu_device *smmu, struct devi
>  			dev_name(dev), err);
>  }
> 
> +static int nvidia_smmu_init_context(struct arm_smmu_domain
> *smmu_domain,
> +				    struct io_pgtable_cfg *pgtbl_cfg,
> +				    struct device *dev)
> +{
> +	struct arm_smmu_device *smmu = smmu_domain->smmu;
> +	const struct device_node *np = smmu->dev->of_node;
> +
> +	/*
> +	 * Tegra194 and Tegra234 SoCs have the erratum that causes walk
> cache
> +	 * entries to not be invalidated correctly. The problem is that the walk
> +	 * cache index generated for IOVA is not same across translation and
> +	 * invalidation requests. This is leading to page faults when PMD entry
> +	 * is released during unmap and populated with new PTE table during
> +	 * subsequent map request. Disabling large page mappings avoids the
> +	 * release of PMD entry and avoid translations seeing stale PMD entry in
> +	 * walk cache.
> +	 * Fix this by limiting the page mappings to PAGE_SIZE on Tegra194 and
> +	 * Tegra234.
> +	 */
> +	if (of_device_is_compatible(np, "nvidia,tegra234-smmu") ||
> +	    of_device_is_compatible(np, "nvidia,tegra194-smmu")) {
> +		smmu->pgsize_bitmap = PAGE_SIZE;
> +		pgtbl_cfg->pgsize_bitmap = smmu->pgsize_bitmap;
> +	}
> +
> +	return 0;
> +}
> +
>  static const struct arm_smmu_impl nvidia_smmu_impl = {
>  	.read_reg = nvidia_smmu_read_reg,
>  	.write_reg = nvidia_smmu_write_reg,
> @@ -268,10 +296,12 @@ static const struct arm_smmu_impl
> nvidia_smmu_impl = {
>  	.global_fault = nvidia_smmu_global_fault,
>  	.context_fault = nvidia_smmu_context_fault,
>  	.probe_finalize = nvidia_smmu_probe_finalize,
> +	.init_context = nvidia_smmu_init_context,
>  };
> 
>  static const struct arm_smmu_impl nvidia_smmu_single_impl = {
>  	.probe_finalize = nvidia_smmu_probe_finalize,
> +	.init_context = nvidia_smmu_init_context,
>  };
> 

Reviewed-by: Krishna Reddy <vdumpa@nvidia.com>

-KR

Re: [Patch v2] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

Posted by Robin Murphy 4 years ago

On 2022-04-21 09:15, Ashish Mhetre wrote:
> Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
> entries to not be invalidated correctly. The problem is that the walk
> cache index generated for IOVA is not same across translation and
> invalidation requests. This is leading to page faults when PMD entry is
> released during unmap and populated with new PTE table during subsequent
> map request. Disabling large page mappings avoids the release of PMD
> entry and avoid translations seeing stale PMD entry in walk cache.
> Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
> Tegra234 devices. This is recommended fix from Tegra hardware design
> team.

Acked-by: Robin Murphy <robin.murphy@arm.com>

> Co-developed-by: Pritesh Raithatha <praithatha@nvidia.com>
> Signed-off-by: Pritesh Raithatha <praithatha@nvidia.com>
> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
> ---
> Changes in v2:
> - Using init_context() to override pgsize_bitmap instead of new function
> 
>   drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 30 ++++++++++++++++++++
>   1 file changed, 30 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> index 01e9b50b10a1..87bf522b9d2e 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> @@ -258,6 +258,34 @@ static void nvidia_smmu_probe_finalize(struct arm_smmu_device *smmu, struct devi
>   			dev_name(dev), err);
>   }
>   
> +static int nvidia_smmu_init_context(struct arm_smmu_domain *smmu_domain,
> +				    struct io_pgtable_cfg *pgtbl_cfg,
> +				    struct device *dev)
> +{
> +	struct arm_smmu_device *smmu = smmu_domain->smmu;
> +	const struct device_node *np = smmu->dev->of_node;
> +
> +	/*
> +	 * Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
> +	 * entries to not be invalidated correctly. The problem is that the walk
> +	 * cache index generated for IOVA is not same across translation and
> +	 * invalidation requests. This is leading to page faults when PMD entry
> +	 * is released during unmap and populated with new PTE table during
> +	 * subsequent map request. Disabling large page mappings avoids the
> +	 * release of PMD entry and avoid translations seeing stale PMD entry in
> +	 * walk cache.
> +	 * Fix this by limiting the page mappings to PAGE_SIZE on Tegra194 and
> +	 * Tegra234.
> +	 */
> +	if (of_device_is_compatible(np, "nvidia,tegra234-smmu") ||
> +	    of_device_is_compatible(np, "nvidia,tegra194-smmu")) {
> +		smmu->pgsize_bitmap = PAGE_SIZE;
> +		pgtbl_cfg->pgsize_bitmap = smmu->pgsize_bitmap;
> +	}
> +
> +	return 0;
> +}
> +
>   static const struct arm_smmu_impl nvidia_smmu_impl = {
>   	.read_reg = nvidia_smmu_read_reg,
>   	.write_reg = nvidia_smmu_write_reg,
> @@ -268,10 +296,12 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
>   	.global_fault = nvidia_smmu_global_fault,
>   	.context_fault = nvidia_smmu_context_fault,
>   	.probe_finalize = nvidia_smmu_probe_finalize,
> +	.init_context = nvidia_smmu_init_context,
>   };
>   
>   static const struct arm_smmu_impl nvidia_smmu_single_impl = {
>   	.probe_finalize = nvidia_smmu_probe_finalize,
> +	.init_context = nvidia_smmu_init_context,
>   };
>   
>   struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)