[PATCH 5/6] iommu: document missing def_domain_type return

Matthew Rosato posted 6 patches 2 weeks, 3 days ago
There is a newer version of this series
[PATCH 5/6] iommu: document missing def_domain_type return
Posted by Matthew Rosato 2 weeks, 3 days ago
In addition to IOMMU_DOMAIN_DMA, def_domain_type can also return
IOMMU_DOMAIN_DMA_FQ when applicable, else flush queues will never be
used.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 05279109c732..d0da1918d2de 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -585,6 +585,7 @@ iommu_copy_struct_from_full_user_array(void *kdst, size_t kdst_entry_size,
  * @def_domain_type: device default domain type, return value:
  *		- IOMMU_DOMAIN_IDENTITY: must use an identity domain
  *		- IOMMU_DOMAIN_DMA: must use a dma domain
+ *              - IOMMU_DOMAIN_DMA_FQ: dma domain with batch invalidation
  *		- 0: use the default setting
  * @default_domain_ops: the default ops for domains
  * @remove_dev_pasid: Remove any translation configurations of a specific
-- 
2.47.0
Re: [PATCH 5/6] iommu: document missing def_domain_type return
Posted by Baolu Lu 2 weeks, 3 days ago
On 12/10/24 03:24, Matthew Rosato wrote:
> In addition to IOMMU_DOMAIN_DMA, def_domain_type can also return
> IOMMU_DOMAIN_DMA_FQ when applicable, else flush queues will never be
> used.
> 
> Signed-off-by: Matthew Rosato<mjrosato@linux.ibm.com>
> ---
>   include/linux/iommu.h | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 05279109c732..d0da1918d2de 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -585,6 +585,7 @@ iommu_copy_struct_from_full_user_array(void *kdst, size_t kdst_entry_size,
>    * @def_domain_type: device default domain type, return value:
>    *		- IOMMU_DOMAIN_IDENTITY: must use an identity domain
>    *		- IOMMU_DOMAIN_DMA: must use a dma domain
> + *              - IOMMU_DOMAIN_DMA_FQ: dma domain with batch invalidation

In which case must an iommu driver return IOMMU_DOMAIN_DMA_FQ?

The flush queue is a policy of "when and how to synchronize the IOTLB"
in dma-iommu.c. The iommu driver actually has no need to understand such
policy.

By the way, "dma domain" in this comment is a bit confusing, it reads
better if we make it like:

- IOMMU_DOMAIN_IDENTITY: must use an identity domain
- IOMMU_DOMAIN_DMA: must use a paging domain

Thanks,
baolu
Re: [PATCH 5/6] iommu: document missing def_domain_type return
Posted by Matthew Rosato 2 weeks, 2 days ago
On 12/9/24 9:57 PM, Baolu Lu wrote:
> On 12/10/24 03:24, Matthew Rosato wrote:
>> In addition to IOMMU_DOMAIN_DMA, def_domain_type can also return
>> IOMMU_DOMAIN_DMA_FQ when applicable, else flush queues will never be
>> used.
>>
>> Signed-off-by: Matthew Rosato<mjrosato@linux.ibm.com>
>> ---
>>   include/linux/iommu.h | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 05279109c732..d0da1918d2de 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -585,6 +585,7 @@ iommu_copy_struct_from_full_user_array(void *kdst, size_t kdst_entry_size,
>>    * @def_domain_type: device default domain type, return value:
>>    *        - IOMMU_DOMAIN_IDENTITY: must use an identity domain
>>    *        - IOMMU_DOMAIN_DMA: must use a dma domain
>> + *              - IOMMU_DOMAIN_DMA_FQ: dma domain with batch invalidation
> 
> In which case must an iommu driver return IOMMU_DOMAIN_DMA_FQ?
> 
> The flush queue is a policy of "when and how to synchronize the IOTLB"
> in dma-iommu.c. The iommu driver actually has no need to understand such
> policy.

If you look ahead to the next patch where I implement def_domain_type for s390, I found that if I only ever return IOMMU_DOMAIN_DMA from ops->def_domain_type then when go through iommu_dma_init_domain() we will never call iommu_dma_init_fq() regardless of IOMMU_CAP_DEFERRED_FLUSH because of the if (domain->type == IOMMU_DOMAIN_DMA_FQ) check.  So something isn't right here.

It looks to me like the following is happening:

We first have the iommu_def_domain_type set in iommu_subsys_init or via one of the set_default routines, e.g.:
	if (!iommu_default_passthrough() && !iommu_dma_strict)
		iommu_def_domain_type = IOMMU_DOMAIN_DMA_FQ;

But when we arrive at iommu_group_alloc_default_domain()...

if we have no ops->def_domain_type() defined we will call __iommu_group_alloc_default_domain using what is in iommu_def_domain_type, which could be IOMMU_DOMAIN_DMA, IOMMU_DOMAIN_DMA_FQ or IOMMU_DOMAIN_IDENTITY based on strict/passthrough settings.  Testing an s390 scenario today without this series applied, we will call __iommu_group_alloc_default_domain with IOMMU_DOMAIN_DMA_FQ, as long as iommu.strict/passthrough is not specified, so then later in dma-iommu:iommu_dma_init_domain() we can use FQ based on IOMMU_CAP_DEFERRED_FLUSH.

but once we add ops->def_domain_type() then we end up calling iommu_group_alloc_default_domain() with a req_type == the return value from ops->def_domain_type(), which by the current definition can only be IOMMU_DOMAIN_DMA or IOMMU_DOMAIN_IDENTITY.  We will then call __iommu_group_alloc_default_domain with that req_type; so without this patch + the DMA_FQ path in patch 6 we would always end up allocating IOMMU_DOMAIN_DMA instead of IOMMU_DOMAIN_DMA_FQ by default, so when we arrive at dma:iommu_dma_init_domain() we won't check for IOMMU_CAP_DEFERRED_FLUSH because of the type.

So unless I'm missing something I think either we have to
1) be more flexible in what ops->default_domain_type() is allowed to return as this patch does 
or 
2) iommu core needs to look at the return from ops->default_domain_type() and decide whether it's OK to convert IOMMU_DOMAIN_DMA->IOMMU_DOMAIN_DMA_FQ based on strict setting.  This removes the decision from the individual drivers and dma-iommu can later decide whether or not to use it or not based on IOMMU_CAP_DEFERRED_FLUSH?  But would also affect other users of def_domain_type() today that perhaps did not want DMA_FQ?  Unsure.  What I mean is something like (untested):

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 6bdede4177ff..275daa7f819d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1744,9 +1744,11 @@ static int iommu_get_def_domain_type(struct iommu_group *group,
                 */
                type = ops->default_domain->type;
        } else {
-               if (ops->def_domain_type)
+               if (ops->def_domain_type) {
                        type = ops->def_domain_type(dev);
-               else
+                       if (type == IOMMU_DOMAIN_DMA && !iommu_dma_strict)
+                               type = IOMMU_DOMAIN_DMA_FQ;
+               } else
                        return cur_type;
        }
        if (!type || cur_type == type)


Re: [PATCH 5/6] iommu: document missing def_domain_type return
Posted by Robin Murphy 2 weeks, 2 days ago
On 10/12/2024 4:26 pm, Matthew Rosato wrote:
> On 12/9/24 9:57 PM, Baolu Lu wrote:
>> On 12/10/24 03:24, Matthew Rosato wrote:
>>> In addition to IOMMU_DOMAIN_DMA, def_domain_type can also return
>>> IOMMU_DOMAIN_DMA_FQ when applicable, else flush queues will never be
>>> used.
>>>
>>> Signed-off-by: Matthew Rosato<mjrosato@linux.ibm.com>
>>> ---
>>>    include/linux/iommu.h | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>> index 05279109c732..d0da1918d2de 100644
>>> --- a/include/linux/iommu.h
>>> +++ b/include/linux/iommu.h
>>> @@ -585,6 +585,7 @@ iommu_copy_struct_from_full_user_array(void *kdst, size_t kdst_entry_size,
>>>     * @def_domain_type: device default domain type, return value:
>>>     *        - IOMMU_DOMAIN_IDENTITY: must use an identity domain
>>>     *        - IOMMU_DOMAIN_DMA: must use a dma domain
>>> + *              - IOMMU_DOMAIN_DMA_FQ: dma domain with batch invalidation
>>
>> In which case must an iommu driver return IOMMU_DOMAIN_DMA_FQ?
>>
>> The flush queue is a policy of "when and how to synchronize the IOTLB"
>> in dma-iommu.c. The iommu driver actually has no need to understand such
>> policy.
> 
> If you look ahead to the next patch where I implement def_domain_type for s390, I found that if I only ever return IOMMU_DOMAIN_DMA from ops->def_domain_type then when go through iommu_dma_init_domain() we will never call iommu_dma_init_fq() regardless of IOMMU_CAP_DEFERRED_FLUSH because of the if (domain->type == IOMMU_DOMAIN_DMA_FQ) check.  So something isn't right here.

Conceptually I don't think it ever makes sense for a driver to *require* 
a device to use deferred invalidation. Furthermore it's been the whole 
design for a while now that drivers should never see nor have to 
acknowledge IOMMU_DOMAIN_DMA_FQ, it's now just an internal type which 
exists largely for the sake of making the sysfs interface work really 
neatly. Also beware that a major reason for overriding 
iommu_def_domain_type with a paging domain is for untrusted devices, so 
massaging the result based on iommu_dma_strict is still not necessarily 
appropriate anyway.

It appears the real underlying issue is that, like everyone else in the 
same situation, you're doing def_domain_type wrong. If and when you 
can't support IOMMU_DOMAIN_IDENTITY, the expectation is that you make 
__iommu_alloc_identity_domain() fail, such that if iommu_def_domain_type 
is then ever set to passthrough, iommu_group_alloc_default_domain() 
falls back to IOMMU_DOMAIN_DMA by itself, and the user gets told they 
did a silly thing.

What you see apple-dart doing is a hack around the old bus-based 
iommu_domain_alloc() API where there wasn't enough information at the 
right point to necessarily do the right thing.

Thanks,
Robin.

> It looks to me like the following is happening:
> 
> We first have the iommu_def_domain_type set in iommu_subsys_init or via one of the set_default routines, e.g.:
> 	if (!iommu_default_passthrough() && !iommu_dma_strict)
> 		iommu_def_domain_type = IOMMU_DOMAIN_DMA_FQ;
> 
> But when we arrive at iommu_group_alloc_default_domain()...
> 
> if we have no ops->def_domain_type() defined we will call __iommu_group_alloc_default_domain using what is in iommu_def_domain_type, which could be IOMMU_DOMAIN_DMA, IOMMU_DOMAIN_DMA_FQ or IOMMU_DOMAIN_IDENTITY based on strict/passthrough settings.  Testing an s390 scenario today without this series applied, we will call __iommu_group_alloc_default_domain with IOMMU_DOMAIN_DMA_FQ, as long as iommu.strict/passthrough is not specified, so then later in dma-iommu:iommu_dma_init_domain() we can use FQ based on IOMMU_CAP_DEFERRED_FLUSH.
> 
> but once we add ops->def_domain_type() then we end up calling iommu_group_alloc_default_domain() with a req_type == the return value from ops->def_domain_type(), which by the current definition can only be IOMMU_DOMAIN_DMA or IOMMU_DOMAIN_IDENTITY.  We will then call __iommu_group_alloc_default_domain with that req_type; so without this patch + the DMA_FQ path in patch 6 we would always end up allocating IOMMU_DOMAIN_DMA instead of IOMMU_DOMAIN_DMA_FQ by default, so when we arrive at dma:iommu_dma_init_domain() we won't check for IOMMU_CAP_DEFERRED_FLUSH because of the type.
> 
> So unless I'm missing something I think either we have to
> 1) be more flexible in what ops->default_domain_type() is allowed to return as this patch does
> or
> 2) iommu core needs to look at the return from ops->default_domain_type() and decide whether it's OK to convert IOMMU_DOMAIN_DMA->IOMMU_DOMAIN_DMA_FQ based on strict setting.  This removes the decision from the individual drivers and dma-iommu can later decide whether or not to use it or not based on IOMMU_CAP_DEFERRED_FLUSH?  But would also affect other users of def_domain_type() today that perhaps did not want DMA_FQ?  Unsure.  What I mean is something like (untested):
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 6bdede4177ff..275daa7f819d 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1744,9 +1744,11 @@ static int iommu_get_def_domain_type(struct iommu_group *group,
>                   */
>                  type = ops->default_domain->type;
>          } else {
> -               if (ops->def_domain_type)
> +               if (ops->def_domain_type) {
>                          type = ops->def_domain_type(dev);
> -               else
> +                       if (type == IOMMU_DOMAIN_DMA && !iommu_dma_strict)
> +                               type = IOMMU_DOMAIN_DMA_FQ;
> +               } else
>                          return cur_type;
>          }
>          if (!type || cur_type == type)
> 
> 
Re: [PATCH 5/6] iommu: document missing def_domain_type return
Posted by Matthew Rosato 2 weeks, 2 days ago
On 12/10/24 1:42 PM, Robin Murphy wrote:
> On 10/12/2024 4:26 pm, Matthew Rosato wrote:
>> On 12/9/24 9:57 PM, Baolu Lu wrote:
>>> On 12/10/24 03:24, Matthew Rosato wrote:
>>>> In addition to IOMMU_DOMAIN_DMA, def_domain_type can also return
>>>> IOMMU_DOMAIN_DMA_FQ when applicable, else flush queues will never be
>>>> used.
>>>>
>>>> Signed-off-by: Matthew Rosato<mjrosato@linux.ibm.com>
>>>> ---
>>>>    include/linux/iommu.h | 1 +
>>>>    1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>> index 05279109c732..d0da1918d2de 100644
>>>> --- a/include/linux/iommu.h
>>>> +++ b/include/linux/iommu.h
>>>> @@ -585,6 +585,7 @@ iommu_copy_struct_from_full_user_array(void *kdst, size_t kdst_entry_size,
>>>>     * @def_domain_type: device default domain type, return value:
>>>>     *        - IOMMU_DOMAIN_IDENTITY: must use an identity domain
>>>>     *        - IOMMU_DOMAIN_DMA: must use a dma domain
>>>> + *              - IOMMU_DOMAIN_DMA_FQ: dma domain with batch invalidation
>>>
>>> In which case must an iommu driver return IOMMU_DOMAIN_DMA_FQ?
>>>
>>> The flush queue is a policy of "when and how to synchronize the IOTLB"
>>> in dma-iommu.c. The iommu driver actually has no need to understand such
>>> policy.
>>
>> If you look ahead to the next patch where I implement def_domain_type for s390, I found that if I only ever return IOMMU_DOMAIN_DMA from ops->def_domain_type then when go through iommu_dma_init_domain() we will never call iommu_dma_init_fq() regardless of IOMMU_CAP_DEFERRED_FLUSH because of the if (domain->type == IOMMU_DOMAIN_DMA_FQ) check.  So something isn't right here.
> 
> Conceptually I don't think it ever makes sense for a driver to *require* a device to use deferred invalidation. Furthermore it's been the whole design for a while now that drivers should never see nor have to acknowledge IOMMU_DOMAIN_DMA_FQ, it's now just an internal type which exists largely for the sake of making the sysfs interface work really neatly. Also beware that a major reason for overriding iommu_def_domain_type with a paging domain is for untrusted devices, so massaging the result based on iommu_dma_strict is still not necessarily appropriate anyway.
> 
> It appears the real underlying issue is that, like everyone else in the same situation, you're doing def_domain_type wrong. If and when you can't support IOMMU_DOMAIN_IDENTITY, the expectation is that you make __iommu_alloc_identity_domain() fail, such that if iommu_def_domain_type is then ever set to passthrough, iommu_group_alloc_default_domain() falls back to IOMMU_DOMAIN_DMA by itself, and the user gets told they did a silly thing.

OK, I almost see where this all fits to throw out def_domain_type for this series...  but looking at __iommu_alloc_identity_domain, the preferred approach is using a static identity domain which turns __iommu_alloc_identity_domain into a nofail case once you define the identity_domain:
 
if (ops->identity_domain)
	return ops->identity_domain;

So it seems to me to be an all-or-nothing thing, whereas what I'm trying to achieve is a device-based decision on whether the group is allowed to use that identity domain.  Which reminds me that this is ultimately why I ended up looking into def_domain_type in the first place.

If I need __iommu_alloc_identity_domain to fail, I guess what I'm looking to do boils down to something like...

if (ops->identity_domain) { 
	if (!ops->allow_identity || ops->allow_identity(dev))
		return ops->identity_domain;
	else
		return ERR_PTR(-EOPNOTSUPP);
}






Re: [PATCH 5/6] iommu: document missing def_domain_type return
Posted by Robin Murphy 2 weeks, 1 day ago
On 10/12/2024 10:06 pm, Matthew Rosato wrote:
> On 12/10/24 1:42 PM, Robin Murphy wrote:
>> On 10/12/2024 4:26 pm, Matthew Rosato wrote:
>>> On 12/9/24 9:57 PM, Baolu Lu wrote:
>>>> On 12/10/24 03:24, Matthew Rosato wrote:
>>>>> In addition to IOMMU_DOMAIN_DMA, def_domain_type can also return
>>>>> IOMMU_DOMAIN_DMA_FQ when applicable, else flush queues will never be
>>>>> used.
>>>>>
>>>>> Signed-off-by: Matthew Rosato<mjrosato@linux.ibm.com>
>>>>> ---
>>>>>     include/linux/iommu.h | 1 +
>>>>>     1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>>> index 05279109c732..d0da1918d2de 100644
>>>>> --- a/include/linux/iommu.h
>>>>> +++ b/include/linux/iommu.h
>>>>> @@ -585,6 +585,7 @@ iommu_copy_struct_from_full_user_array(void *kdst, size_t kdst_entry_size,
>>>>>      * @def_domain_type: device default domain type, return value:
>>>>>      *        - IOMMU_DOMAIN_IDENTITY: must use an identity domain
>>>>>      *        - IOMMU_DOMAIN_DMA: must use a dma domain
>>>>> + *              - IOMMU_DOMAIN_DMA_FQ: dma domain with batch invalidation
>>>>
>>>> In which case must an iommu driver return IOMMU_DOMAIN_DMA_FQ?
>>>>
>>>> The flush queue is a policy of "when and how to synchronize the IOTLB"
>>>> in dma-iommu.c. The iommu driver actually has no need to understand such
>>>> policy.
>>>
>>> If you look ahead to the next patch where I implement def_domain_type for s390, I found that if I only ever return IOMMU_DOMAIN_DMA from ops->def_domain_type then when go through iommu_dma_init_domain() we will never call iommu_dma_init_fq() regardless of IOMMU_CAP_DEFERRED_FLUSH because of the if (domain->type == IOMMU_DOMAIN_DMA_FQ) check.  So something isn't right here.
>>
>> Conceptually I don't think it ever makes sense for a driver to *require* a device to use deferred invalidation. Furthermore it's been the whole design for a while now that drivers should never see nor have to acknowledge IOMMU_DOMAIN_DMA_FQ, it's now just an internal type which exists largely for the sake of making the sysfs interface work really neatly. Also beware that a major reason for overriding iommu_def_domain_type with a paging domain is for untrusted devices, so massaging the result based on iommu_dma_strict is still not necessarily appropriate anyway.
>>
>> It appears the real underlying issue is that, like everyone else in the same situation, you're doing def_domain_type wrong. If and when you can't support IOMMU_DOMAIN_IDENTITY, the expectation is that you make __iommu_alloc_identity_domain() fail, such that if iommu_def_domain_type is then ever set to passthrough, iommu_group_alloc_default_domain() falls back to IOMMU_DOMAIN_DMA by itself, and the user gets told they did a silly thing.
> 
> OK, I almost see where this all fits to throw out def_domain_type for this series...  but looking at __iommu_alloc_identity_domain, the preferred approach is using a static identity domain which turns __iommu_alloc_identity_domain into a nofail case once you define the identity_domain:
>   
> if (ops->identity_domain)
> 	return ops->identity_domain;
> 
> So it seems to me to be an all-or-nothing thing, whereas what I'm trying to achieve is a device-based decision on whether the group is allowed to use that identity domain.  Which reminds me that this is ultimately why I ended up looking into def_domain_type in the first place.

Well, yeah, the static domain is very much designed for the typical case 
where it *is* unconditionally available. Simple options right now would 
be to have two sets of ops and select them per-instance, or if it's a 
system-wide property perhaps have non-const ops and populate/clear 
identity_domain as appropriate, or possibly even just stick with the 
ops->domain_alloc(IOMMU_DOMAIN_IDENTITY) path (perhaps eventually 
evolving into a specialised domain_alloc_identity op to also become 
per-device?). What we should not have is mechanisms like the 
def_domain_type hack where something claims to be available, but then 
some other thing tries to say "oh but you mustn't touch that".

Thanks,
Robin.

> If I need __iommu_alloc_identity_domain to fail, I guess what I'm looking to do boils down to something like...
> 
> if (ops->identity_domain) {
> 	if (!ops->allow_identity || ops->allow_identity(dev))
> 		return ops->identity_domain;
> 	else
> 		return ERR_PTR(-EOPNOTSUPP);
> }
> 
> 
> 
> 
> 
>