[PATCH v5 10/21] intel_iommu: Introduce two helpers vtd_as_from/to_iommu_pasid_locked

Zhenzhong Duan posted 21 patches 2 months, 3 weeks ago
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, Yi Liu <yi.l.liu@intel.com>, "Clément Mathieu--Drif" <clement.mathieu--drif@eviden.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Eduardo Habkost <eduardo@habkost.net>, Alex Williamson <alex.williamson@redhat.com>, "Cédric Le Goater" <clg@redhat.com>, Eric Auger <eric.auger@redhat.com>, Zhenzhong Duan <zhenzhong.duan@intel.com>
[PATCH v5 10/21] intel_iommu: Introduce two helpers vtd_as_from/to_iommu_pasid_locked
Posted by Zhenzhong Duan 2 months, 3 weeks ago
PCI device supports two request types, Requests-without-PASID and
Requests-with-PASID. Requests-without-PASID doesn't include a PASID TLP
prefix, IOMMU fetches rid_pasid from context entry and use it as IOMMU's
pasid to index pasid table.

So we need to translate between PCI's pasid and IOMMU's pasid specially
for Requests-without-PASID, e.g., PCI_NO_PASID(-1) <-> rid_pasid.
For Requests-with-PASID, PCI's pasid and IOMMU's pasid are same value.

vtd_as_from_iommu_pasid_locked() translates from BDF+iommu_pasid to vtd_as
which contains PCI's pasid vtd_as->pasid.

vtd_as_to_iommu_pasid_locked() translates from BDF+vtd_as->pasid to iommu_pasid.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 hw/i386/intel_iommu.c | 58 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 6edd91d94e..1801f1cdf6 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1602,6 +1602,64 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
     return 0;
 }
 
+static int vtd_as_to_iommu_pasid_locked(VTDAddressSpace *vtd_as,
+                                        uint32_t *pasid)
+{
+    VTDContextCacheEntry *cc_entry = &vtd_as->context_cache_entry;
+    IntelIOMMUState *s = vtd_as->iommu_state;
+    uint8_t bus_num = pci_bus_num(vtd_as->bus);
+    uint8_t devfn = vtd_as->devfn;
+    VTDContextEntry ce;
+    int ret;
+
+    /* For Requests-with-PASID, its pasid value is used by vIOMMU directly */
+    if (vtd_as->pasid != PCI_NO_PASID) {
+        *pasid = vtd_as->pasid;
+        return 0;
+    }
+
+    if (cc_entry->context_cache_gen == s->context_cache_gen) {
+        ce = cc_entry->context_entry;
+    } else {
+        ret = vtd_dev_to_context_entry(s, bus_num, devfn, &ce);
+        if (ret) {
+            return ret;
+        }
+    }
+
+    *pasid = VTD_CE_GET_RID2PASID(&ce);
+    return 0;
+}
+
+static gboolean vtd_find_as_by_sid_and_iommu_pasid(gpointer key, gpointer value,
+                                                   gpointer user_data)
+{
+    VTDAddressSpace *vtd_as = (VTDAddressSpace *)value;
+    struct vtd_as_raw_key *target = (struct vtd_as_raw_key *)user_data;
+    uint16_t sid = PCI_BUILD_BDF(pci_bus_num(vtd_as->bus), vtd_as->devfn);
+    uint32_t pasid;
+
+    if (vtd_as_to_iommu_pasid_locked(vtd_as, &pasid)) {
+        return false;
+    }
+
+    return (pasid == target->pasid) && (sid == target->sid);
+}
+
+/* Translate iommu pasid to vtd_as */
+static inline
+VTDAddressSpace *vtd_as_from_iommu_pasid_locked(IntelIOMMUState *s,
+                                                uint16_t sid, uint32_t pasid)
+{
+    struct vtd_as_raw_key key = {
+        .sid = sid,
+        .pasid = pasid
+    };
+
+    return g_hash_table_find(s->vtd_address_spaces,
+                             vtd_find_as_by_sid_and_iommu_pasid, &key);
+}
+
 static int vtd_sync_shadow_page_hook(const IOMMUTLBEvent *event,
                                      void *private)
 {
-- 
2.47.1
Re: [PATCH v5 10/21] intel_iommu: Introduce two helpers vtd_as_from/to_iommu_pasid_locked
Posted by Yi Liu 2 months, 2 weeks ago
On 2025/8/22 14:40, Zhenzhong Duan wrote:
> PCI device supports two request types, Requests-without-PASID and
> Requests-with-PASID. Requests-without-PASID doesn't include a PASID TLP
> prefix, IOMMU fetches rid_pasid from context entry and use it as IOMMU's
> pasid to index pasid table.
> 
> So we need to translate between PCI's pasid and IOMMU's pasid specially
> for Requests-without-PASID, e.g., PCI_NO_PASID(-1) <-> rid_pasid.
> For Requests-with-PASID, PCI's pasid and IOMMU's pasid are same value.
> 
> vtd_as_from_iommu_pasid_locked() translates from BDF+iommu_pasid to vtd_as
> which contains PCI's pasid vtd_as->pasid.
> 
> vtd_as_to_iommu_pasid_locked() translates from BDF+vtd_as->pasid to iommu_pasid.

translate is somehow strange. convert or get might be better? Same to
the translate terms in the patch.

> 
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> ---
>   hw/i386/intel_iommu.c | 58 +++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 58 insertions(+)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 6edd91d94e..1801f1cdf6 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -1602,6 +1602,64 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
>       return 0;
>   }
>   
> +static int vtd_as_to_iommu_pasid_locked(VTDAddressSpace *vtd_as,
> +                                        uint32_t *pasid)
> +{
> +    VTDContextCacheEntry *cc_entry = &vtd_as->context_cache_entry;
> +    IntelIOMMUState *s = vtd_as->iommu_state;
> +    uint8_t bus_num = pci_bus_num(vtd_as->bus);
> +    uint8_t devfn = vtd_as->devfn;
> +    VTDContextEntry ce;
> +    int ret;
> +
> +    /* For Requests-with-PASID, its pasid value is used by vIOMMU directly */
> +    if (vtd_as->pasid != PCI_NO_PASID) {
> +        *pasid = vtd_as->pasid;
> +        return 0;
> +    }
> +
> +    if (cc_entry->context_cache_gen == s->context_cache_gen) {
> +        ce = cc_entry->context_entry;
> +    } else {
> +        ret = vtd_dev_to_context_entry(s, bus_num, devfn, &ce);
> +        if (ret) {
> +            return ret;
> +        }
> +    }
> +    *pasid = VTD_CE_GET_RID2PASID(&ce);

looks like we have quite a few code get rid_pasid from the context
entry. I think we may simplify it by using PASID #0 since vIOMMU does
not report ECAP.RPS bit at all. It could be done as a separate cleanup.

> +    return 0;
> +}
> +
> +static gboolean vtd_find_as_by_sid_and_iommu_pasid(gpointer key, gpointer value,
> +                                                   gpointer user_data)
> +{
> +    VTDAddressSpace *vtd_as = (VTDAddressSpace *)value;
> +    struct vtd_as_raw_key *target = (struct vtd_as_raw_key *)user_data;
> +    uint16_t sid = PCI_BUILD_BDF(pci_bus_num(vtd_as->bus), vtd_as->devfn);
> +    uint32_t pasid;
> +
> +    if (vtd_as_to_iommu_pasid_locked(vtd_as, &pasid)) {
> +        return false;
> +    }
> +
> +    return (pasid == target->pasid) && (sid == target->sid);
> +}
> +
> +/* Translate iommu pasid to vtd_as */
> +static inline
> +VTDAddressSpace *vtd_as_from_iommu_pasid_locked(IntelIOMMUState *s,
> +                                                uint16_t sid, uint32_t pasid)
> +{
> +    struct vtd_as_raw_key key = {
> +        .sid = sid,
> +        .pasid = pasid
> +    };
> +
> +    return g_hash_table_find(s->vtd_address_spaces,
> +                             vtd_find_as_by_sid_and_iommu_pasid, &key);
> +}
> +
>   static int vtd_sync_shadow_page_hook(const IOMMUTLBEvent *event,
>                                        void *private)
>   {

the code looks good to me.

Reviewed-by: Yi Liu <yi.l.liu@intel.com>
RE: [PATCH v5 10/21] intel_iommu: Introduce two helpers vtd_as_from/to_iommu_pasid_locked
Posted by Duan, Zhenzhong 2 months, 2 weeks ago

>-----Original Message-----
>From: Liu, Yi L <yi.l.liu@intel.com>
>Subject: Re: [PATCH v5 10/21] intel_iommu: Introduce two helpers
>vtd_as_from/to_iommu_pasid_locked
>
>On 2025/8/22 14:40, Zhenzhong Duan wrote:
>> PCI device supports two request types, Requests-without-PASID and
>> Requests-with-PASID. Requests-without-PASID doesn't include a PASID TLP
>> prefix, IOMMU fetches rid_pasid from context entry and use it as IOMMU's
>> pasid to index pasid table.
>>
>> So we need to translate between PCI's pasid and IOMMU's pasid specially
>> for Requests-without-PASID, e.g., PCI_NO_PASID(-1) <-> rid_pasid.
>> For Requests-with-PASID, PCI's pasid and IOMMU's pasid are same value.
>>
>> vtd_as_from_iommu_pasid_locked() translates from BDF+iommu_pasid to
>vtd_as
>> which contains PCI's pasid vtd_as->pasid.
>>
>> vtd_as_to_iommu_pasid_locked() translates from BDF+vtd_as->pasid to
>iommu_pasid.
>
>translate is somehow strange. convert or get might be better? Same to
>the translate terms in the patch.

OK, will use 'convert' terminology.

>
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>   hw/i386/intel_iommu.c | 58
>+++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 58 insertions(+)
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 6edd91d94e..1801f1cdf6 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -1602,6 +1602,64 @@ static int
>vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
>>       return 0;
>>   }
>>
>> +static int vtd_as_to_iommu_pasid_locked(VTDAddressSpace *vtd_as,
>> +                                        uint32_t *pasid)
>> +{
>> +    VTDContextCacheEntry *cc_entry = &vtd_as->context_cache_entry;
>> +    IntelIOMMUState *s = vtd_as->iommu_state;
>> +    uint8_t bus_num = pci_bus_num(vtd_as->bus);
>> +    uint8_t devfn = vtd_as->devfn;
>> +    VTDContextEntry ce;
>> +    int ret;
>> +
>> +    /* For Requests-with-PASID, its pasid value is used by vIOMMU
>directly */
>> +    if (vtd_as->pasid != PCI_NO_PASID) {
>> +        *pasid = vtd_as->pasid;
>> +        return 0;
>> +    }
>> +
>> +    if (cc_entry->context_cache_gen == s->context_cache_gen) {
>> +        ce = cc_entry->context_entry;
>> +    } else {
>> +        ret = vtd_dev_to_context_entry(s, bus_num, devfn, &ce);
>> +        if (ret) {
>> +            return ret;
>> +        }
>> +    }
>> +    *pasid = VTD_CE_GET_RID2PASID(&ce);
>
>looks like we have quite a few code get rid_pasid from the context
>entry. I think we may simplify it by using PASID #0 since vIOMMU does
>not report ECAP.RPS bit at all. It could be done as a separate cleanup.

Yes, but we already have all code supporting RPS capability though RPS
isn't enabled in CAP register. In theory we can enable RPS easily by setting
the bit in CAP register. So I would like to be consistent with this instead of
dropping all the existing code about RPS cap.

Thanks
Zhenzhong

>
>> +    return 0;
>> +}
>> +
>> +static gboolean vtd_find_as_by_sid_and_iommu_pasid(gpointer key,
>gpointer value,
>> +                                                   gpointer
>user_data)
>> +{
>> +    VTDAddressSpace *vtd_as = (VTDAddressSpace *)value;
>> +    struct vtd_as_raw_key *target = (struct vtd_as_raw_key *)user_data;
>> +    uint16_t sid = PCI_BUILD_BDF(pci_bus_num(vtd_as->bus),
>vtd_as->devfn);
>> +    uint32_t pasid;
>> +
>> +    if (vtd_as_to_iommu_pasid_locked(vtd_as, &pasid)) {
>> +        return false;
>> +    }
>> +
>> +    return (pasid == target->pasid) && (sid == target->sid);
>> +}
>> +
>> +/* Translate iommu pasid to vtd_as */
>> +static inline
>> +VTDAddressSpace *vtd_as_from_iommu_pasid_locked(IntelIOMMUState
>*s,
>> +                                                uint16_t sid,
>uint32_t pasid)
>> +{
>> +    struct vtd_as_raw_key key = {
>> +        .sid = sid,
>> +        .pasid = pasid
>> +    };
>> +
>> +    return g_hash_table_find(s->vtd_address_spaces,
>> +                             vtd_find_as_by_sid_and_iommu_pasid,
>&key);
>> +}
>> +
>>   static int vtd_sync_shadow_page_hook(const IOMMUTLBEvent *event,
>>                                        void *private)
>>   {
>
>the code looks good to me.
>
>Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Re: [PATCH v5 10/21] intel_iommu: Introduce two helpers vtd_as_from/to_iommu_pasid_locked
Posted by Yi Liu 2 months, 1 week ago
On 2025/9/1 13:33, Duan, Zhenzhong wrote:

>>> +static int vtd_as_to_iommu_pasid_locked(VTDAddressSpace *vtd_as,
>>> +                                        uint32_t *pasid)
>>> +{
>>> +    VTDContextCacheEntry *cc_entry = &vtd_as->context_cache_entry;
>>> +    IntelIOMMUState *s = vtd_as->iommu_state;
>>> +    uint8_t bus_num = pci_bus_num(vtd_as->bus);
>>> +    uint8_t devfn = vtd_as->devfn;
>>> +    VTDContextEntry ce;
>>> +    int ret;
>>> +
>>> +    /* For Requests-with-PASID, its pasid value is used by vIOMMU
>> directly */
>>> +    if (vtd_as->pasid != PCI_NO_PASID) {
>>> +        *pasid = vtd_as->pasid;
>>> +        return 0;
>>> +    }
>>> +
>>> +    if (cc_entry->context_cache_gen == s->context_cache_gen) {
>>> +        ce = cc_entry->context_entry;

just realized, if you don't record the context_entry in the below
branch, then this flow will always go with the below branch for
passthrough device. is it?

>>> +    } else {
>>> +        ret = vtd_dev_to_context_entry(s, bus_num, devfn, &ce);
>>> +        if (ret) {
>>> +            return ret;
>>> +        }
>>> +    }
>>> +    *pasid = VTD_CE_GET_RID2PASID(&ce);
>>
>> looks like we have quite a few code get rid_pasid from the context
>> entry. I think we may simplify it by using PASID #0 since vIOMMU does
>> not report ECAP.RPS bit at all. It could be done as a separate cleanup.
> 
> Yes, but we already have all code supporting RPS capability though RPS
> isn't enabled in CAP register. In theory we can enable RPS easily by setting
> the bit in CAP register. So I would like to be consistent with this instead of
> dropping all the existing code about RPS cap.

right. The code is almost there. But I haven't seen the possibility to
report RPS==1 to guest. It's somehow aligned that pasid#0 would be used
as rid_pasid. You may have noticed Linux even does not check RPS bit. So
such a guest will ignore RPS. This means this reading rid_pasid from ce
entry is not necessary. This is not urgent task anyhow.


Regards,
Yi Liu
RE: [PATCH v5 10/21] intel_iommu: Introduce two helpers vtd_as_from/to_iommu_pasid_locked
Posted by Duan, Zhenzhong 2 months, 1 week ago

>-----Original Message-----
>From: Liu, Yi L <yi.l.liu@intel.com>
>Subject: Re: [PATCH v5 10/21] intel_iommu: Introduce two helpers
>vtd_as_from/to_iommu_pasid_locked
>
>On 2025/9/1 13:33, Duan, Zhenzhong wrote:
>
>>>> +static int vtd_as_to_iommu_pasid_locked(VTDAddressSpace *vtd_as,
>>>> +                                        uint32_t *pasid)
>>>> +{
>>>> +    VTDContextCacheEntry *cc_entry =
>&vtd_as->context_cache_entry;
>>>> +    IntelIOMMUState *s = vtd_as->iommu_state;
>>>> +    uint8_t bus_num = pci_bus_num(vtd_as->bus);
>>>> +    uint8_t devfn = vtd_as->devfn;
>>>> +    VTDContextEntry ce;
>>>> +    int ret;
>>>> +
>>>> +    /* For Requests-with-PASID, its pasid value is used by vIOMMU
>>> directly */
>>>> +    if (vtd_as->pasid != PCI_NO_PASID) {
>>>> +        *pasid = vtd_as->pasid;
>>>> +        return 0;
>>>> +    }
>>>> +
>>>> +    if (cc_entry->context_cache_gen == s->context_cache_gen) {
>>>> +        ce = cc_entry->context_entry;
>
>just realized, if you don't record the context_entry in the below
>branch, then this flow will always go with the below branch for
>passthrough device. is it?

Yes.

>
>>>> +    } else {
>>>> +        ret = vtd_dev_to_context_entry(s, bus_num, devfn, &ce);
>>>> +        if (ret) {
>>>> +            return ret;
>>>> +        }
>>>> +    }
>>>> +    *pasid = VTD_CE_GET_RID2PASID(&ce);
>>>
>>> looks like we have quite a few code get rid_pasid from the context
>>> entry. I think we may simplify it by using PASID #0 since vIOMMU does
>>> not report ECAP.RPS bit at all. It could be done as a separate cleanup.
>>
>> Yes, but we already have all code supporting RPS capability though RPS
>> isn't enabled in CAP register. In theory we can enable RPS easily by setting
>> the bit in CAP register. So I would like to be consistent with this instead of
>> dropping all the existing code about RPS cap.
>
>right. The code is almost there. But I haven't seen the possibility to
>report RPS==1 to guest. It's somehow aligned that pasid#0 would be used
>as rid_pasid. You may have noticed Linux even does not check RPS bit. So
>such a guest will ignore RPS. This means this reading rid_pasid from ce
>entry is not necessary. This is not urgent task anyhow.

OK, if we have no plan to support RPS, a lot of code will be simplified. I'll do it in this series if no objection.

Thanks
Zhenzhong