RE: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
Posted by Shameerali Kolothum Thodi 2 weeks ago

> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Monday, April 15, 2024 6:14 PM
> To: Nicolin Chen <nicolinc@nvidia.com>
> Cc: will@kernel.org; robin.murphy@arm.com; joro@8bytes.org;
> thierry.reding@gmail.com; vdumpa@nvidia.com; jonathanh@nvidia.com;
> linux-kernel@vger.kernel.org; iommu@lists.linux.dev; linux-arm-
> kernel@lists.infradead.org; linux-tegra@vger.kernel.org; Jerry Snitselaar
> <jsnitsel@redhat.com>
> Subject: Re: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
> 
> On Fri, Apr 12, 2024 at 08:43:48PM -0700, Nicolin Chen wrote:
> 
> > The user-space support is to provide uAPIs (via IOMMUFD) for hypervisors
> > in user space to pass through VCMDQs to VMs, allowing these VMs to access
> > the VCMDQs directly without trapping, i.e. no VM Exits. This gives huge
> > performance improvements: 70% to 90% reductions in TLB invalidation time
> > were measured by various DMA unmap tests running in a guest OS, compared
> > to a nested SMMU CMDQ (with trapping).
> 
> So everyone is on the same page, this is the primary point of this
> series. The huge speed up of in-VM performance is necessary for the
> workloads this chip is expected to be running. This series is unique
> from all the rest because it runs inside a VM, often in the form of a
> distro release.
> 
> It doesn't need the other series or its own part 2 as it entirely
> stands alone on bare metal hardware or on top of commercial VM cloud
> instances running who-knows-what in their hypervisors.
> 
> The other parts are substantially about enabling qemu and the open
> ecosystem to have fully functional vSMMUv3 virtualization.

Hi,

We do have plans to revive the SMMUv3 ECMDQ series posted a while back [0],
and looking at this series, I am wondering whether it makes sense to have
a similar one for ECMDQ as well. I see that the NVIDIA VCMDQ has a special
bit to restrict the commands that can be issued from user space. If we end
up assigning an ECMDQ to user space, is there any potential risk in doing so?

The SMMUv3 spec does say:
"Arm expects that the Non-secure Stream table, Command queue, Event queue and
PRI queue are controlled by the most privileged Non-secure system software."

It is not clear to me what the major concerns are here; maybe we can come up
with something to address them in the kernel (a rough sketch of what I mean
is below).
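
For instance, purely as a sketch (the function itself is made up; CMDQ_0_OP
and the opcode macros are the existing arm-smmu-v3.h definitions, and
FIELD_GET() comes from <linux/bitfield.h>):

/* Only relay commands that are harmless when issued from user space */
static bool ecmdq_user_cmd_allowed(u64 *cmd)
{
        switch (FIELD_GET(CMDQ_0_OP, cmd[0])) {
        case CMDQ_OP_TLBI_NH_VA:        /* stage 1 TLBI by VA */
        case CMDQ_OP_TLBI_NH_ASID:      /* stage 1 TLBI by ASID */
        case CMDQ_OP_CMD_SYNC:
                return true;
        default:                        /* CFGI_*, PREFETCH_*, stage 2, ... */
                return false;
        }
}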

Please let me know if you have any thoughts on this.

Thanks,
Shameer
[0] https://lore.kernel.org/lkml/20230809131303.1355-1-thunder.leizhen@huaweicloud.com/
RE: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
Posted by Shameerali Kolothum Thodi 2 weeks ago

> -----Original Message-----
> From: Shameerali Kolothum Thodi
> Sent: Wednesday, April 17, 2024 9:01 AM
> To: 'Jason Gunthorpe' <jgg@nvidia.com>; Nicolin Chen <nicolinc@nvidia.com>
> Cc: will@kernel.org; robin.murphy@arm.com; joro@8bytes.org;
> thierry.reding@gmail.com; vdumpa@nvidia.com; jonathanh@nvidia.com; linux-
> kernel@vger.kernel.org; iommu@lists.linux.dev; linux-arm-
> kernel@lists.infradead.org; linux-tegra@vger.kernel.org; Jerry Snitselaar
> <jsnitsel@redhat.com>
> Subject: RE: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
> 
> 
> 
> > -----Original Message-----
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Monday, April 15, 2024 6:14 PM
> > To: Nicolin Chen <nicolinc@nvidia.com>
> > Cc: will@kernel.org; robin.murphy@arm.com; joro@8bytes.org;
> > thierry.reding@gmail.com; vdumpa@nvidia.com; jonathanh@nvidia.com;
> > linux-kernel@vger.kernel.org; iommu@lists.linux.dev; linux-arm-
> > kernel@lists.infradead.org; linux-tegra@vger.kernel.org; Jerry Snitselaar
> > <jsnitsel@redhat.com>
> > Subject: Re: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
> >
> > On Fri, Apr 12, 2024 at 08:43:48PM -0700, Nicolin Chen wrote:
> >
> > > The user-space support is to provide uAPIs (via IOMMUFD) for hypervisors
> > > in user space to pass through VCMDQs to VMs, allowing these VMs to access
> > > the VCMDQs directly without trapping, i.e. no VM Exits. This gives huge
> > > performance improvements: 70% to 90% reductions in TLB invalidation time
> > > were measured by various DMA unmap tests running in a guest OS, compared
> > > to a nested SMMU CMDQ (with trapping).
> >
> > So everyone is on the same page, this is the primary point of this
> > series. The huge speed up of in-VM performance is necessary for the
> > workloads this chip is expected to be running. This series is unique
> > from all the rest because it runs inside a VM, often in the form of a
> > distro release.
> >
> > It doesn't need the other series or its own part 2 as it entirely
> > stands alone on bare metal hardware or on top of commercial VM cloud
> > instances running who-knows-what in their hypervisors.
> >
> > The other parts are substantially about enabling qemu and the open
> > ecosystem to have fully functional vSMMUv3 virtualization.
> 
> Hi,
>
> We do have plans to revive the SMMUv3 ECMDQ series posted a while back [0],
> and looking at this series, I am wondering whether it makes sense to have
> a similar one for ECMDQ as well. I see that the NVIDIA VCMDQ has a special
> bit to restrict the commands that can be issued from user space. If we end
> up assigning an ECMDQ to user space, is there any potential risk in doing so?
>
> The SMMUv3 spec does say:
> "Arm expects that the Non-secure Stream table, Command queue, Event queue and
> PRI queue are controlled by the most privileged Non-secure system software."
>
> It is not clear to me what the major concerns are here; maybe we can come up
> with something to address them in the kernel.

Just to add to that: one idea could be that when ECMDQs are detected, we use
them for issuing a limited set of commands (like stage 1 TLBIs) and use the
normal CMDQ for the rest. Since we use stage 1 for both the host and the guest
nested cases, and TLBIs are the bottleneck in most cases, I think this should
give performance benefits. A rough sketch of what I mean is below.
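
Something along these lines, purely as a sketch (arm_smmu_get_ecmdq() is a
made-up helper; CMDQ_0_OP, the opcode macros and smmu->cmdq are the existing
arm-smmu-v3 ones, and FIELD_GET() comes from <linux/bitfield.h>):

/*
 * Route stage 1 TLB invalidations to an ECMDQ and everything else to the
 * main command queue; the CMD_SYNC for a batch would follow whichever
 * queue the batch went to.
 */
static struct arm_smmu_cmdq *
arm_smmu_select_cmdq(struct arm_smmu_device *smmu, u64 *cmd)
{
        switch (FIELD_GET(CMDQ_0_OP, cmd[0])) {
        case CMDQ_OP_TLBI_NH_VA:
        case CMDQ_OP_TLBI_NH_ASID:
                return arm_smmu_get_ecmdq(smmu);        /* made up */
        default:
                return &smmu->cmdq;
        }
}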

Thanks,
Shameer
Re: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
Posted by Jason Gunthorpe 2 weeks ago
On Wed, Apr 17, 2024 at 09:45:34AM +0000, Shameerali Kolothum Thodi wrote:

> Just to add to that: one idea could be that when ECMDQs are detected, we use
> them for issuing a limited set of commands (like stage 1 TLBIs) and use the
> normal CMDQ for the rest. Since we use stage 1 for both the host and the guest
> nested cases, and TLBIs are the bottleneck in most cases, I think this should
> give performance benefits.

There are definitely options to look at for improving the performance
here.

IMHO the design of the ECMDQ largely seems to expect one queue per CPU,
so that we can move to a lock-less design where each CPU uses its own
private per-CPU queue. In this case a VMM calling the kernel to do
invalidation would often naturally use a thread originating on a pCPU
bound to a vCPU which is substantially exclusive to the VM. A sketch of
that selection path is below.
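
Purely as a sketch (the per-CPU ecmdqs field is made up; the rest is the
existing driver vocabulary):

/*
 * With one ECMDQ per CPU, picking a queue becomes a local lookup and the
 * cross-CPU contention on the shared command queue disappears. A real
 * implementation would need to deal with preemption/migration around the
 * per-CPU access.
 */
static struct arm_smmu_cmdq *
arm_smmu_this_cpu_cmdq(struct arm_smmu_device *smmu)
{
        if (smmu->ecmdqs)                       /* made-up per-CPU array */
                return raw_cpu_ptr(smmu->ecmdqs);
        return &smmu->cmdq;                     /* fallback: shared queue */
}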

Jason
Re: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
Posted by Jason Gunthorpe 2 weeks ago
On Wed, Apr 17, 2024 at 08:01:10AM +0000, Shameerali Kolothum Thodi wrote:
> We do have plans to revive the SMMUv3 ECMDQ series posted a while back [0],
> and looking at this series, I am wondering whether it makes sense to have
> a similar one for ECMDQ as well. I see that the NVIDIA VCMDQ has a special
> bit to restrict the commands that can be issued from user space. If we end
> up assigning an ECMDQ to user space, is there any potential risk in doing so?

I think there is some risk/trouble: ECMDQ needs some enhancement
before it can really be safe to use from less privileged software, and
it wasn't designed to have an isolated doorbell page either.

> It is not clear to me what the major concerns are here; maybe we can come up
> with something to address them in the kernel.

I haven't looked deeply, but my impression has been that the ECMDQ is not
workable for supporting virtualization. At a minimum it has no way to
constrain the command flow to a VMID or to do VSID -> PSID
translation.
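
To illustrate what is missing, this is roughly the per-command fixup a
trapping hypervisor applies today, and which the hardware would have to do
by itself before an ECMDQ could be handed to a guest (sketch only; struct
guest and vsid_to_psid() are made up, the field macros are the existing
arm-smmu-v3 ones):

/* Rewrite a guest ATC invalidation to target the physical StreamID */
static void fix_guest_atc_inv(struct guest *g, u64 *cmd)
{
        u32 vsid = FIELD_GET(CMDQ_ATC_0_SID, cmd[0]);

        cmd[0] &= ~CMDQ_ATC_0_SID;
        cmd[0] |= FIELD_PREP(CMDQ_ATC_0_SID, vsid_to_psid(g, vsid));
}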

I suggest you talk directly to Arm about this if you are interested.

Jason
RE: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
Posted by Shameerali Kolothum Thodi 2 weeks ago

> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, April 17, 2024 1:25 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Nicolin Chen <nicolinc@nvidia.com>; will@kernel.org;
> robin.murphy@arm.com; joro@8bytes.org; thierry.reding@gmail.com;
> vdumpa@nvidia.com; jonathanh@nvidia.com; linux-kernel@vger.kernel.org;
> iommu@lists.linux.dev; linux-arm-kernel@lists.infradead.org; linux-
> tegra@vger.kernel.org; Jerry Snitselaar <jsnitsel@redhat.com>
> Subject: Re: [PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)
> 
> On Wed, Apr 17, 2024 at 08:01:10AM +0000, Shameerali Kolothum Thodi wrote:
> > We do have plans to revive the SMMUv3 ECMDQ series posted a while back [0],
> > and looking at this series, I am wondering whether it makes sense to have
> > a similar one for ECMDQ as well. I see that the NVIDIA VCMDQ has a special
> > bit to restrict the commands that can be issued from user space. If we end
> > up assigning an ECMDQ to user space, is there any potential risk in doing so?
> 
> I think there is some risk/trouble: ECMDQ needs some enhancement
> before it can really be safe to use from less privileged software, and
> it wasn't designed to have an isolated doorbell page either.
> 
> > It is not clear to me what the major concerns are here; maybe we can come up
> > with something to address them in the kernel.
> 
> I haven't looked deeply, but my impression has been that the ECMDQ is not
> workable for supporting virtualization. At a minimum it has no way to
> constrain the command flow to a VMID or to do VSID -> PSID
> translation.

Ok. That makes sense.

> 
> I suggest you talk directly to Arm about this if you are interested.
> 

Sure. Will check.

Thanks,
Shameer