RE: [PATCH 0/7] iommu cleanup and refactoring

Tian, Kevin posted 7 patches 4 years, 5 months ago
Only 0 patches received!
RE: [PATCH 0/7] iommu cleanup and refactoring
Posted by Tian, Kevin 4 years, 5 months ago
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Monday, January 24, 2022 3:11 PM
> 
> Hi,
> 
> The guest pasid and aux-domain related code are dead code in current
> iommu subtree. As we have reached a consensus that all these features
> should be based on the new iommufd framework (which is under active
> development), the first part of this series removes and cleanups all
> the dead code.
> 
> The second part of this series refactors the iommu_domain by moving all
> domain-specific ops from iommu_ops to a new domain_ops. This makes an
> iommu_domain self-contained and represent the abstraction of an I/O
> translation table in the IOMMU subsystem. With different type of
> iommu_domain providing different set of ops, it's easier to support more
> types of I/O translation tables.

You may want to give more background on this end goal. In general there
are four IOPT types in iommufd discussions:

1) The one currently tracked by iommu_domain, with a map/unmap semantics
2) The one managed by mm and shared to iommu via sva_bind/unbind ops
3) The one managed by userspace and bound to iommu via iommufd (require nesting)
4) The one managed by KVM (e.g. EPT) and shared to iommu via a TBD interface

Currently only 1) and 2) are supported. 2) is supported via sva specific 
structures, tracked outside of iommu_domain.

Following current direction 3) and 4) will end up to have separate tracking
structures too (some existing bits for 3 are removed by this series).

And this series is preparatory for extending the domain concept to be a 
general object for all IOPT types. And it is one important design direction
to be aligned in the iommu community.

Thanks
Kevin
Re: [PATCH 0/7] iommu cleanup and refactoring
Posted by Jason Gunthorpe 4 years, 5 months ago
On Mon, Jan 24, 2022 at 09:46:26AM +0000, Tian, Kevin wrote:
> > From: Lu Baolu <baolu.lu@linux.intel.com>
> > Sent: Monday, January 24, 2022 3:11 PM
> > 
> > Hi,
> > 
> > The guest pasid and aux-domain related code are dead code in current
> > iommu subtree. As we have reached a consensus that all these features
> > should be based on the new iommufd framework (which is under active
> > development), the first part of this series removes and cleanups all
> > the dead code.
> > 
> > The second part of this series refactors the iommu_domain by moving all
> > domain-specific ops from iommu_ops to a new domain_ops. This makes an
> > iommu_domain self-contained and represent the abstraction of an I/O
> > translation table in the IOMMU subsystem. With different type of
> > iommu_domain providing different set of ops, it's easier to support more
> > types of I/O translation tables.
> 
> You may want to give more background on this end goal. In general there
> are four IOPT types in iommufd discussions:
> 
> 1) The one currently tracked by iommu_domain, with a map/unmap semantics
> 2) The one managed by mm and shared to iommu via sva_bind/unbind ops
> 3) The one managed by userspace and bound to iommu via iommufd (require nesting)
> 4) The one managed by KVM (e.g. EPT) and shared to iommu via a TBD interface

Yes, at least from an iommufd perspective I'd like to see one struct
for all of these types, mainly so we can have a uniform alloc, attach
and detatch flow for all io page table types. 

If we want to use the iommu_domain, or make iommu_domain a sub-class
of a new struct, can be determined as we go along.

Regardless, I think this cleanup stands on its own. Moving the ops and
purging the dead code is clearly the right thing to do.

Thanks,
Jason
RE: [PATCH 0/7] iommu cleanup and refactoring
Posted by Tian, Kevin 4 years, 5 months ago
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, January 25, 2022 1:44 AM
> 
> On Mon, Jan 24, 2022 at 09:46:26AM +0000, Tian, Kevin wrote:
> > > From: Lu Baolu <baolu.lu@linux.intel.com>
> > > Sent: Monday, January 24, 2022 3:11 PM
> > >
> > > Hi,
> > >
> > > The guest pasid and aux-domain related code are dead code in current
> > > iommu subtree. As we have reached a consensus that all these features
> > > should be based on the new iommufd framework (which is under active
> > > development), the first part of this series removes and cleanups all
> > > the dead code.
> > >
> > > The second part of this series refactors the iommu_domain by moving all
> > > domain-specific ops from iommu_ops to a new domain_ops. This makes
> an
> > > iommu_domain self-contained and represent the abstraction of an I/O
> > > translation table in the IOMMU subsystem. With different type of
> > > iommu_domain providing different set of ops, it's easier to support more
> > > types of I/O translation tables.
> >
> > You may want to give more background on this end goal. In general there
> > are four IOPT types in iommufd discussions:
> >
> > 1) The one currently tracked by iommu_domain, with a map/unmap
> semantics
> > 2) The one managed by mm and shared to iommu via sva_bind/unbind ops
> > 3) The one managed by userspace and bound to iommu via iommufd
> (require nesting)
> > 4) The one managed by KVM (e.g. EPT) and shared to iommu via a TBD
> interface
> 
> Yes, at least from an iommufd perspective I'd like to see one struct
> for all of these types, mainly so we can have a uniform alloc, attach
> and detatch flow for all io page table types.
> 
> If we want to use the iommu_domain, or make iommu_domain a sub-class
> of a new struct, can be determined as we go along.
> 
> Regardless, I think this cleanup stands on its own. Moving the ops and
> purging the dead code is clearly the right thing to do.
> 

Indeed!
Re: [PATCH 0/7] iommu cleanup and refactoring
Posted by Robin Murphy 4 years, 5 months ago
On 2022-01-24 17:44, Jason Gunthorpe wrote:
> On Mon, Jan 24, 2022 at 09:46:26AM +0000, Tian, Kevin wrote:
>>> From: Lu Baolu <baolu.lu@linux.intel.com>
>>> Sent: Monday, January 24, 2022 3:11 PM
>>>
>>> Hi,
>>>
>>> The guest pasid and aux-domain related code are dead code in current
>>> iommu subtree. As we have reached a consensus that all these features
>>> should be based on the new iommufd framework (which is under active
>>> development), the first part of this series removes and cleanups all
>>> the dead code.
>>>
>>> The second part of this series refactors the iommu_domain by moving all
>>> domain-specific ops from iommu_ops to a new domain_ops. This makes an
>>> iommu_domain self-contained and represent the abstraction of an I/O
>>> translation table in the IOMMU subsystem. With different type of
>>> iommu_domain providing different set of ops, it's easier to support more
>>> types of I/O translation tables.
>>
>> You may want to give more background on this end goal. In general there
>> are four IOPT types in iommufd discussions:
>>
>> 1) The one currently tracked by iommu_domain, with a map/unmap semantics
>> 2) The one managed by mm and shared to iommu via sva_bind/unbind ops
>> 3) The one managed by userspace and bound to iommu via iommufd (require nesting)
>> 4) The one managed by KVM (e.g. EPT) and shared to iommu via a TBD interface
> 
> Yes, at least from an iommufd perspective I'd like to see one struct
> for all of these types, mainly so we can have a uniform alloc, attach
> and detatch flow for all io page table types.

Agreed, certainly an IOMMU_DOMAIN_SVA type that can both encapsulate the 
mm and effectively replace iommu_sva seems like a logical and fairly 
small next step. We already have the paradigm of different domain types 
supporting different ops, so initially an SVA domain would simply allow 
bind/unbind rather than attach/detach/map/unmap.

It might then further be possible to hide SVA bind/unbind behind the 
attach/detach interface, but AFAICS we'd still need distinct flows for 
attaching/binding the whole device vs. attaching/binding to a PASID, 
since they are fundamentally different things in their own right, and 
the ideal API should give us the orthogonality to also bind a device to 
an SVA domain without PASID (e.g. for KVM stage 2, or userspace 
assignment of simpler fault/stall-tolerant devices), or attach PASIDs to 
regular iommu_domains.

That distinction could of course be consolidated by flipping to the 
other approach of explicitly allocating the PASID first, then wrapping 
it in a struct device that could then be passed through the same 
attach/detach interfaces and distinguished internally, but although I 
still have a fondness for that approach I know I'm about the only one :)

Cheers,
Robin.

> If we want to use the iommu_domain, or make iommu_domain a sub-class
> of a new struct, can be determined as we go along.
> 
> Regardless, I think this cleanup stands on its own. Moving the ops and
> purging the dead code is clearly the right thing to do.
> 
> Thanks,
> Jason
Re: [PATCH 0/7] iommu cleanup and refactoring
Posted by Jason Gunthorpe 4 years, 5 months ago
On Tue, Jan 25, 2022 at 02:48:02PM +0000, Robin Murphy wrote:
 
> Agreed, certainly an IOMMU_DOMAIN_SVA type that can both encapsulate the mm
> and effectively replace iommu_sva seems like a logical and fairly small next
> step. We already have the paradigm of different domain types supporting
> different ops, so initially an SVA domain would simply allow bind/unbind
> rather than attach/detach/map/unmap.

I hope we can quickly get to a PASID enabled generic attach/detach
scheme - we really need this to do the uAPI part of this interface.

> they are fundamentally different things in their own right, and the ideal
> API should give us the orthogonality to also bind a device to an SVA domain
> without PASID (e.g. for KVM stage 2, or userspace assignment of simpler
> fault/stall-tolerant devices), or attach PASIDs to regular iommu_domains.

Yes, these are orthogonal things. A iommu driver that supports PASID
ideally should support PASID enabled attach/detatch for every
iommu_domain type it supports.

SVA should not be entangled with PASID beyond that SVA is often used
with PASID - a SVA iommu_domain should be fully usable with a RID too.

I'm hoping to see the core iommu code provide some simplified "SVA"
API that under the covers creates a SVA domain and then does a normal
PASID attach using the global PASID in the mm_struct - the
driver should not care what, or even if, PASID is used for a SVA
domain.

Jason
Re: [PATCH 0/7] iommu cleanup and refactoring
Posted by Lu Baolu 4 years, 5 months ago
On 1/25/22 11:16 PM, Jason Gunthorpe wrote:
> On Tue, Jan 25, 2022 at 02:48:02PM +0000, Robin Murphy wrote:
>   
>> Agreed, certainly an IOMMU_DOMAIN_SVA type that can both encapsulate the mm
>> and effectively replace iommu_sva seems like a logical and fairly small next
>> step. We already have the paradigm of different domain types supporting
>> different ops, so initially an SVA domain would simply allow bind/unbind
>> rather than attach/detach/map/unmap.
> 
> I hope we can quickly get to a PASID enabled generic attach/detach
> scheme - we really need this to do the uAPI part of this interface.

Agreed. Jacob is working on kernel DMA with PASID. He needs such
interfaces as well. I have worked out an implementation for vt-d driver.
It could be post for review inside Jacob's series for kernel DMA with
PASID.

> 
>> they are fundamentally different things in their own right, and the ideal
>> API should give us the orthogonality to also bind a device to an SVA domain
>> without PASID (e.g. for KVM stage 2, or userspace assignment of simpler
>> fault/stall-tolerant devices), or attach PASIDs to regular iommu_domains.
> 
> Yes, these are orthogonal things. A iommu driver that supports PASID
> ideally should support PASID enabled attach/detatch for every
> iommu_domain type it supports.
> 
> SVA should not be entangled with PASID beyond that SVA is often used
> with PASID - a SVA iommu_domain should be fully usable with a RID too.

The prototype of PASID enabled attach/detach ops could look like:

        int (*attach_dev_pasid)(struct iommu_domain *domain,
                                struct device *dev, ioasid_t id);
        void (*detach_dev_pasid)(struct iommu_domain *domain,
                                 struct device *dev, ioasid_t id);

But the iommu driver should implement different callbacks for

1) attaching an IOMMU DMA domain to a PASID on device;
    - kernel DMA with PASID
    - mdev-like device passthrough
    - etc.
2) attaching a CPU-shared domain to a PASID on device;
    - SVA
    - guest PASID
    - etc.

> 
> I'm hoping to see the core iommu code provide some simplified "SVA"
> API that under the covers creates a SVA domain and then does a normal
> PASID attach using the global PASID in the mm_struct - the
> driver should not care what, or even if, PASID is used for a SVA
> domain.
> 
> Jason
> 

Best regards,
baolu
Re: [PATCH 0/7] iommu cleanup and refactoring
Posted by Jason Gunthorpe 4 years, 5 months ago
On Wed, Jan 26, 2022 at 09:51:36AM +0800, Lu Baolu wrote:

> > > they are fundamentally different things in their own right, and the ideal
> > > API should give us the orthogonality to also bind a device to an SVA domain
> > > without PASID (e.g. for KVM stage 2, or userspace assignment of simpler
> > > fault/stall-tolerant devices), or attach PASIDs to regular iommu_domains.
> > 
> > Yes, these are orthogonal things. A iommu driver that supports PASID
> > ideally should support PASID enabled attach/detatch for every
> > iommu_domain type it supports.
> > 
> > SVA should not be entangled with PASID beyond that SVA is often used
> > with PASID - a SVA iommu_domain should be fully usable with a RID too.
> 
> The prototype of PASID enabled attach/detach ops could look like:
> 
>        int (*attach_dev_pasid)(struct iommu_domain *domain,
>                                struct device *dev, ioasid_t id);
>        void (*detach_dev_pasid)(struct iommu_domain *domain,
>                                 struct device *dev, ioasid_t id);

It seems reasonable and straightforward to me..

These would be domain ops?
 
> But the iommu driver should implement different callbacks for
> 
> 1) attaching an IOMMU DMA domain to a PASID on device;
>    - kernel DMA with PASID
>    - mdev-like device passthrough
>    - etc.
> 2) attaching a CPU-shared domain to a PASID on device;
>    - SVA
>    - guest PASID
>    - etc.

But this you mean domain->ops would be different? Seems fine, up to
the driver.

I'd hope to see some flow like:

 domain = device->bus->iommu_ops->alloc_sva_domain(dev)
 domain->ops->attach_dev_pasid(domain, dev, current->pasid)

To duplicate the current SVA APIs

Jason
Re: [PATCH 0/7] iommu cleanup and refactoring
Posted by Robin Murphy 4 years, 5 months ago
On 2022-01-26 13:27, Jason Gunthorpe via iommu wrote:
> On Wed, Jan 26, 2022 at 09:51:36AM +0800, Lu Baolu wrote:
> 
>>>> they are fundamentally different things in their own right, and the ideal
>>>> API should give us the orthogonality to also bind a device to an SVA domain
>>>> without PASID (e.g. for KVM stage 2, or userspace assignment of simpler
>>>> fault/stall-tolerant devices), or attach PASIDs to regular iommu_domains.
>>>
>>> Yes, these are orthogonal things. A iommu driver that supports PASID
>>> ideally should support PASID enabled attach/detatch for every
>>> iommu_domain type it supports.
>>>
>>> SVA should not be entangled with PASID beyond that SVA is often used
>>> with PASID - a SVA iommu_domain should be fully usable with a RID too.
>>
>> The prototype of PASID enabled attach/detach ops could look like:
>>
>>         int (*attach_dev_pasid)(struct iommu_domain *domain,
>>                                 struct device *dev, ioasid_t id);
>>         void (*detach_dev_pasid)(struct iommu_domain *domain,
>>                                  struct device *dev, ioasid_t id);
> 
> It seems reasonable and straightforward to me..
> 
> These would be domain ops?
>   
>> But the iommu driver should implement different callbacks for
>>
>> 1) attaching an IOMMU DMA domain to a PASID on device;
>>     - kernel DMA with PASID
>>     - mdev-like device passthrough
>>     - etc.
>> 2) attaching a CPU-shared domain to a PASID on device;
>>     - SVA
>>     - guest PASID
>>     - etc.
> 
> But this you mean domain->ops would be different? Seems fine, up to
> the driver.

Indeed, it makes little practical difference whether the driver provides 
distinct sets of ops or just common callbacks which switch on the domain 
type internally. I expect that's largely going to come down to personal 
preference and how much internal driver code is common between the paths.

> I'd hope to see some flow like:
> 
>   domain = device->bus->iommu_ops->alloc_sva_domain(dev)
>   domain->ops->attach_dev_pasid(domain, dev, current->pasid)
> 
> To duplicate the current SVA APIs

+1 - I'd concur that attach/detach belong as domain ops in general. 
There's quite a nice logical split forming here where domain ops are the 
ones that make sense for external subsystems and endpoint drivers to use 
too, while device ops, with the sole exception of domain_alloc, are 
IOMMU API internals. By that reasoning, attach/bind/etc. are firmly in 
the former category.

Thanks,
Robin.