From: Zhen Lei <thunder.leizhen@huawei.com>
v1 --> v2:
1. Drop patch "iommu/arm-smmu-v3: Add arm_smmu_ecmdq_issue_cmdlist() for non-shared ECMDQ" in v1
2. Drop patch "iommu/arm-smmu-v3: Add support for less than one ECMDQ per core" in v1
3. Replace the rwlock with an IPI, so that the write to the 'ERRACK' bit during
error handling and the read of 'ERRACK' during command insertion are protected
without taking a lock (see the sketch after this changelog).
4. Standardize variable names.
- struct arm_smmu_ecmdq *__percpu *ecmdq;
+ struct arm_smmu_ecmdq *__percpu *ecmdqs;
5. Add member 'iobase' to struct arm_smmu_device to record the physical base
address of the SMMU, replacing the translation (vmalloc_to_pfn(smmu->base) << PAGE_SHIFT).
+ phys_addr_t iobase;
- smmu_dma_base = (vmalloc_to_pfn(smmu->base) << PAGE_SHIFT);
6. Remove the union below. Whether ECMDQ is enabled is now determined solely by 'ecmdq_enabled'.
- union {
- u32 nr_ecmdq;
- u32 ecmdq_enabled;
- };
+ u32 nr_ecmdq;
+ bool ecmdq_enabled;
7. Fix some sparse warnings. For example:
- struct arm_smmu_ecmdq *ecmdq;
+ struct arm_smmu_ecmdq __percpu *ecmdq;
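
To make change 3 above concrete, here is a minimal sketch of the rwlock-to-IPI
idea. It is not the actual patch: all names (ecmdq_state, ecmdq_enter_error,
ecmdq_try_insert) are hypothetical, and the real driver operates on the
hardware ERRACK bit rather than a plain flag.

/*
 * Sketch only: the insertion path reads the error state locklessly with
 * preemption disabled, and the error handler waits for an empty IPI to
 * complete on every CPU, which guarantees nobody is still inside the
 * insertion path using the stale state.
 */
#include <linux/smp.h>
#include <linux/atomic.h>
#include <linux/preempt.h>

struct ecmdq_state {
	atomic_t in_error;		/* stand-in for the ERRACK handshake state */
};

static void ecmdq_ipi_nop(void *info)
{
	/* Empty on purpose: completion of the IPI is the synchronisation. */
}

static void ecmdq_enter_error(struct ecmdq_state *st)
{
	atomic_set(&st->in_error, 1);
	/* wait=1: returns only after every CPU has run the callback, i.e.
	 * after any insertion that read in_error == 0 has finished. */
	smp_call_function(ecmdq_ipi_nop, NULL, 1);
	/* ... acknowledge the error (ERRACK) and recover the queue ... */
	atomic_set(&st->in_error, 0);
}

static bool ecmdq_try_insert(struct ecmdq_state *st)
{
	bool ok;

	preempt_disable();
	ok = !atomic_read(&st->in_error);
	/* ... if ok, publish the commands to this CPU's ECMDQ here ... */
	preempt_enable();
	return ok;
}

The design point is the one the changelog describes: readers never take a lock,
and the writer pays the cost of a one-off IPI broadcast only in the rare error
path.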
Zhen Lei (2):
iommu/arm-smmu-v3: Add support for ECMDQ register mode
iommu/arm-smmu-v3: Ensure that a set of associated commands are
inserted in the same ECMDQ
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 260 +++++++++++++++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 33 +++
2 files changed, 285 insertions(+), 8 deletions(-)
--
2.34.1
On Wed, Aug 09, 2023 at 09:13:01PM +0800, thunder.leizhen@huaweicloud.com wrote:
> From: Zhen Lei <thunder.leizhen@huawei.com>
>
> v1 --> v2:

Jason previously asked about performance numbers for ECMDQ:

https://lore.kernel.org/r/ZL6n3f01yV7tc4yH@ziepe.ca

Do you have any?

Will
On 2023/8/9 21:56, Will Deacon wrote:
> On Wed, Aug 09, 2023 at 09:13:01PM +0800, thunder.leizhen@huaweicloud.com wrote:
>> From: Zhen Lei <thunder.leizhen@huawei.com>
>>
>> v1 --> v2:
>
> Jason previously asked about performance numbers for ECMDQ:
>
> https://lore.kernel.org/r/ZL6n3f01yV7tc4yH@ziepe.ca
>
> Do you have any?

I asked my colleagues in the chip department, and they said that the chip is
not commercially available yet, so the specific data cannot be disclosed. What
is certain is that performance does improve, but not by much: the public
benchmark shows only about 5%. Your optimization patches were so good that
they left us little room for improvement.

However, since Marvell also implements ECMDQ, there are at least two users.
Should we consider making it available first?

> Will

--
Regards,
  Zhen Lei
On Wed, Aug 09, 2023 at 07:18:36PM -0700, Leizhen (ThunderTown) wrote:
> On 2023/8/9 21:56, Will Deacon wrote:
> > On Wed, Aug 09, 2023 at 09:13:01PM +0800, thunder.leizhen@huaweicloud.com wrote:
> >> From: Zhen Lei <thunder.leizhen@huawei.com>
> >>
> >> v1 --> v2:
> >
> > Jason previously asked about performance numbers for ECMDQ:
> >
> > https://lore.kernel.org/r/ZL6n3f01yV7tc4yH@ziepe.ca
> >
> > Do you have any?
>
> I asked my colleagues in the chip department, and they said that the chip is
> not commercially available yet, so the specific data cannot be disclosed. What
> is certain is that performance does improve, but not by much: the public
> benchmark shows only about 5%. Your optimization patches were so good that
> they left us little room for improvement.
>
> However, since Marvell also implements ECMDQ, there are at least two users.
> Should we consider making it available first?

I have seen something similar (~5%) with VCMDQ on NVIDIA Grace, when running
TLB flush benchmark tests concurrently on different CPUs in the host OS.
Although VCMDQ could be slightly different from ECMDQ, both have a multi-queue
feature, and the amount of improvement in my case came from a reduction of
congestion when issuing commands to multiple queues vs. a single queue. I
guess ECMDQ gets its 5% from that too.

If we decide to move ECMDQ forward, perhaps we can converge some of the
functions to support both :)

Thanks
Nicolin
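
To illustrate the multi-queue point made above (contention on one shared
command queue vs. one queue per CPU), here is a hedged sketch of per-CPU queue
selection. The struct and function names are made up, loosely modelled on the
'ecmdqs' member mentioned in the changelog, and the real driver's queue
structures are more involved.

/*
 * Illustration only: with ECMDQ enabled, each CPU dereferences its own
 * per-CPU pointer and submits to "its" queue, so CPUs no longer serialise
 * on the single shared command queue's lock.
 */
#include <linux/percpu.h>
#include <linux/spinlock.h>

struct my_cmdq {
	spinlock_t lock;
	/* ... producer/consumer indexes and the command ring itself ... */
};

struct my_smmu {
	struct my_cmdq *__percpu *ecmdqs;	/* one queue pointer per CPU */
	struct my_cmdq cmdq;			/* the single shared fallback */
	bool ecmdq_enabled;
};

/* Called with preemption disabled, so the per-CPU choice stays stable. */
static struct my_cmdq *my_get_cmdq(struct my_smmu *smmu)
{
	if (smmu->ecmdq_enabled)
		return *this_cpu_ptr(smmu->ecmdqs);
	return &smmu->cmdq;
}

With several CPUs flushing TLBs concurrently, each call lands on a different
lock and a different ring, which is the congestion reduction described above.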