[PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround

Posted by Ashish Mhetre 1 week, 4 days ago
Nvidia Tegra264 SMMUs are affected by an erratum where a TLB entry can
survive an invalidation that races with concurrent traffic targeting
the same entry. The hardware-recommended software workaround is to
issue every CFGI/TLBI command (each followed by CMD_SYNC) twice. The
second issue must execute only after the first issue's CMD_SYNC has
completed, giving the sequence:

    TLBI/CFGI ... CMD_SYNC TLBI/CFGI ... CMD_SYNC

This series implements the workaround by hooking the duplication into
the single chokepoint that every synchronous submission flows through,
arm_smmu_cmdq_issue_cmdlist().

Patch 1 detects affected instances using the existing
"nvidia,tegra264-smmu" compatible string and exposes the condition
via a new ARM_SMMU_OPT_TLBI_TWICE option bit.
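
As a rough illustration of the detection step in Patch 1, here is a
user-space mock rather than the driver code. Only the compatible string
comes from this cover letter; the mock_* names, the MOCK_OPT_TLBI_TWICE
bit position, and the string-array lookup standing in for a device-tree
compatible check are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the new option bit; the real bit value is not shown here. */
#define MOCK_OPT_TLBI_TWICE (1u << 0)

/* Minimal stand-in for the per-instance SMMU state. */
struct mock_smmu {
	unsigned int options;
};

/* Return true if 'want' appears in the instance's compatible list. */
static bool mock_is_compatible(const char *const *compat, size_t n,
			       const char *want)
{
	assert(compat != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (strcmp(compat[i], want) == 0)
			return true;
	return false;
}

/* Set the option bit only on instances that match the affected compatible. */
static void mock_detect_erratum(struct mock_smmu *smmu,
				const char *const *compat, size_t n)
{
	if (mock_is_compatible(compat, n, "nvidia,tegra264-smmu"))
		smmu->options |= MOCK_OPT_TLBI_TWICE;
}
```

Keeping the condition in an option bit means the submission path only
has to test a flag and never needs to look at the compatible string
again.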

Patch 2 wires the option into the CMDQ submission path, which re-issues
the cmdlist when @sync is true and the first command is a CFGI/TLBI.
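
The re-issue rule can be sketched with a small user-space mock of the
submission path. This is not the code from Patch 2: the mock_* names,
the opcode enum, and the log array are hypothetical, and the mock
treats CMD_SYNC as completing the moment it is queued, which is how it
models "the second issue starts only after the first CMD_SYNC has
completed".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcodes; the values are not the hardware encodings. */
enum mock_op { MOCK_OP_PREFETCH, MOCK_OP_CFGI, MOCK_OP_TLBI, MOCK_OP_SYNC };

#define MOCK_OPT_TLBI_TWICE (1u << 0)	/* hypothetical bit position */
#define MOCK_LOG_MAX 32

struct mock_smmu {
	unsigned int options;
	enum mock_op log[MOCK_LOG_MAX];	/* commands in submission order */
	size_t nlog;
};

static void mock_push(struct mock_smmu *smmu, enum mock_op op)
{
	assert(smmu->nlog < MOCK_LOG_MAX);
	smmu->log[smmu->nlog++] = op;
}

/* One pass: queue the batch, then a CMD_SYNC if requested. */
static void mock_issue_once(struct mock_smmu *smmu, const enum mock_op *cmds,
			    size_t n, bool sync)
{
	for (size_t i = 0; i < n; i++)
		mock_push(smmu, cmds[i]);
	if (sync)
		mock_push(smmu, MOCK_OP_SYNC);	/* completes immediately in the mock */
}

static bool mock_is_invalidation(enum mock_op op)
{
	return op == MOCK_OP_CFGI || op == MOCK_OP_TLBI;
}

/* Single chokepoint: repeat the whole batch when the workaround applies. */
static void mock_issue_cmdlist(struct mock_smmu *smmu, const enum mock_op *cmds,
			       size_t n, bool sync)
{
	mock_issue_once(smmu, cmds, n, sync);

	if ((smmu->options & MOCK_OPT_TLBI_TWICE) && sync && n > 0 &&
	    mock_is_invalidation(cmds[0]))
		mock_issue_once(smmu, cmds, n, sync);
}
```

With the option set, a synchronous batch of two TLBIs is logged as
TLBI TLBI CMD_SYNC TLBI TLBI CMD_SYNC, matching the sequence given
above. Without the option, or with @sync false, the batch is logged
once.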

Ashish Mhetre (2):
  iommu/arm-smmu-v3: Detect Tegra264 erratum
  iommu/arm-smmu-v3: Issue CFGI/TLBI twice on Tegra264

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 70 +++++++++++++++++++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  8 +++
 2 files changed, 72 insertions(+), 6 deletions(-)


base-commit: f86b1ac9a67321419fec095ecb27584b2f77e339
-- 
2.50.1
Re: [PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround
Posted by Nicolin Chen 1 week, 3 days ago
On Thu, May 28, 2026 at 10:16:15AM +0000, Ashish Mhetre wrote:
> Nvidia Tegra264 SMMUs are affected by an erratum where a TLB entry can
> survive an invalidation that races with concurrent traffic targeting
> the same entry. The hardware-recommended software workaround is to
> issue every CFGI/TLBI command (each followed by CMD_SYNC) twice. The
> second issue must execute only after the first issue's CMD_SYNC has
> completed, giving the sequence:
> 
>     TLBI/CFGI ... CMD_SYNC TLBI/CFGI ... CMD_SYNC
> 
> This series implements the workaround by hooking the duplication into
> the single chokepoint that every synchronous submission flows through,
> arm_smmu_cmdq_issue_cmdlist().
> 
> Patch 1 detects affected instances using the existing
> "nvidia,tegra264-smmu" compatible string and exposes the condition
> via a new ARM_SMMU_OPT_TLBI_TWICE option bit.
> 
> Patch 2 wires the option into the CMDQ submission path, which re-issues
> the cmdlist when @sync is true and the first command is a CFGI/TLBI.

Which base commit did you format the patches from?

Sashiko failed to apply the series, so it could not run a review:
https://sashiko.dev/#/patchset/20260528101617.4068249-1-amhetre%40nvidia.com

Nicolin
Re: [PATCH 0/2] iommu/arm-smmu-v3: Tegra264 invalidation workaround
Posted by Ashish Mhetre 1 week, 3 days ago

On 5/29/2026 12:11 AM, Nicolin Chen wrote:
> On Thu, May 28, 2026 at 10:16:15AM +0000, Ashish Mhetre wrote:
>> Nvidia Tegra264 SMMUs are affected by an erratum where a TLB entry can
>> survive an invalidation that races with concurrent traffic targeting
>> the same entry. The hardware-recommended software workaround is to
>> issue every CFGI/TLBI command (each followed by CMD_SYNC) twice. The
>> second issue must execute only after the first issue's CMD_SYNC has
>> completed, giving the sequence:
>>
>>      TLBI/CFGI ... CMD_SYNC TLBI/CFGI ... CMD_SYNC
>>
>> This series implements the workaround by hooking the duplication into
>> the single chokepoint that every synchronous submission flows through,
>> arm_smmu_cmdq_issue_cmdlist().
>>
>> Patch 1 detects affected instances using the existing
>> "nvidia,tegra264-smmu" compatible string and exposes the condition
>> via a new ARM_SMMU_OPT_TLBI_TWICE option bit.
>>
>> Patch 2 wires the option into the CMDQ submission path, which re-issues
>> the cmdlist when @sync is true and the first command is a CFGI/TLBI.
> Which base commit did you format the patches from?
>
> Sashiko failed to apply the series, so it could not run a review:
> https://sashiko.dev/#/patchset/20260528101617.4068249-1-amhetre%40nvidia.com
>
> Nicolin

The series is on top of Jason Gunthorpe's "Remove SMMUv3 struct
arm_smmu_cmdq_ent" series [1], which is in iommu/next. I applied
those 9 patches from the lore mbox onto linux-next-20260527
locally, so the base-commit hash recorded in the cover letter is
a local SHA. Sorry for the confusion. I'll repoint base-commit
at the iommu/next tip in v2.

For convenience, the same series is also available in
jgg/iommu_pt_arm64 on GitHub. The tip of the 9-patch series
there is currently 13428b0bf794 ("iommu/arm-smmu-v3: Directly
encode TLBI commands"), and applying these two patches on top of
that SHA reproduces my tree exactly.

[1] https://lore.kernel.org/all/0-v2-47b2bf710ad5+716ac-smmu_no_cmdq_ent_jgg@nvidia.com/

Thanks,
Ashish Mhetre