[PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS

CLEMENT MATHIEU--DRIF posted 22 patches 2 months, 2 weeks ago
[PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS
Posted by CLEMENT MATHIEU--DRIF 2 months, 2 weeks ago
From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>

Devices implementing ATS can send translation requests using
pci_ats_request_translation_pasid.

The invalidation events are sent back to the device using the iommu
notifier managed with pci_register_iommu_tlb_event_notifier and
pci_unregister_iommu_tlb_event_notifier

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
 hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
 include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 96 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 7a483dd05d..93b816aff2 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
     }
 }
 
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+                                          bool priv_req, bool exec_req,
+                                          hwaddr addr, size_t length,
+                                          bool no_write, IOMMUTLBEntry *result,
+                                          size_t result_length,
+                                          uint32_t *err_count)
+{
+    assert(result_length);
+    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+                                                                        pasid);
+    if (!iommu_mr || !pcie_ats_enabled(dev)) {
+        return -EPERM;
+    }
+    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
+                                                       exec_req, addr, length,
+                                                       no_write, result,
+                                                       result_length,
+                                                       err_count);
+}
+
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+                                          IOMMUNotifier *n)
+{
+    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+                                                                        pasid);
+    if (!iommu_mr) {
+        return -EPERM;
+    }
+    return memory_region_register_iommu_notifier(MEMORY_REGION(iommu_mr), n,
+                                                 &error_fatal);
+}
+
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+                                             IOMMUNotifier *n)
+{
+    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+                                                                        pasid);
+    if (!iommu_mr) {
+        return -EPERM;
+    }
+    memory_region_unregister_iommu_notifier(MEMORY_REGION(iommu_mr), n);
+    return 0;
+}
+
 void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
 {
     /*
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index b2a9ed7782..d656f2656a 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -473,6 +473,58 @@ bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
                                    IOMMUNotifier *n, IOMMUNotify fn,
                                    void *opaque);
 
+/**
+ * pci_ats_request_translation_pasid: perform an ATS request
+ *
+ * Return the number of translations stored in @result in case of success,
+ * a negative error code otherwise.
+ * -ENOMEM is returned when the result buffer is not large enough to store
+ * all the translations
+ *
+ * @dev: the ATS-capable PCI device
+ * @pasid: the pasid of the address space in which the translation will be made
+ * @priv_req: privileged mode bit (PASID TLP)
+ * @exec_req: execute request bit (PASID TLP)
+ * @addr: start address of the memory range to be translated
+ * @length: length of the memory range in bytes
+ * @no_write: request a read-only access translation (if supported by the IOMMU)
+ * @result: buffer in which the TLB entries will be stored
+ * @result_length: result buffer length
+ * @err_count: number of untranslated subregions
+ */
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+                                          bool priv_req, bool exec_req,
+                                          hwaddr addr, size_t length,
+                                          bool no_write, IOMMUTLBEntry *result,
+                                          size_t result_length,
+                                          uint32_t *err_count);
+
+/**
+ * pci_register_iommu_tlb_event_notifier: register a notifier for changes to
+ * IOMMU translation entries in a specific address space.
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to get notified
+ * @pasid: the pasid of the address space to track
+ * @n: the notifier to register
+ */
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+                                          IOMMUNotifier *n);
+
+/**
+ * pci_unregister_iommu_tlb_event_notifier: unregister a notifier that has been
+ * registerd with pci_register_iommu_tlb_event_notifier
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to unsubscribe
+ * @pasid: the pasid of the address space to be untracked
+ * @n: the notifier to unregister
+ */
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+                                            IOMMUNotifier *n);
+
 /**
  * pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
  *
-- 
2.45.2
Re: [PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS
Posted by Minwoo Im 2 months, 1 week ago
On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote:
> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
> 
> Devices implementing ATS can send translation requests using
> pci_ats_request_translation_pasid.
> 
> The invalidation events are sent back to the device using the iommu
> notifier managed with pci_register_iommu_tlb_event_notifier and
> pci_unregister_iommu_tlb_event_notifier
> 
> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
> ---
>  hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
>  include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 96 insertions(+)
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 7a483dd05d..93b816aff2 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
>      }
>  }
>  
> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
> +                                          bool priv_req, bool exec_req,
> +                                          hwaddr addr, size_t length,
> +                                          bool no_write, IOMMUTLBEntry *result,
> +                                          size_t result_length,
> +                                          uint32_t *err_count)
> +{
> +    assert(result_length);
> +    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
> +                                                                        pasid);
> +    if (!iommu_mr || !pcie_ats_enabled(dev)) {
> +        return -EPERM;
> +    }
> +    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
> +                                                       exec_req, addr, length,
> +                                                       no_write, result,
> +                                                       result_length,
> +                                                       err_count);
> +}

Can we use this function not from the endpoint PCI device, but inside of the pci
subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from
PCI endpoint device POV?  I guess it would be better to have PCI subsystem to
issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint
side.
Re: [PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS
Posted by CLEMENT MATHIEU--DRIF 2 months, 1 week ago

On 09/07/2024 12:15, Minwoo Im wrote:
> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
>
>
> On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote:
>> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>
>> Devices implementing ATS can send translation requests using
>> pci_ats_request_translation_pasid.
>>
>> The invalidation events are sent back to the device using the iommu
>> notifier managed with pci_register_iommu_tlb_event_notifier and
>> pci_unregister_iommu_tlb_event_notifier
>>
>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>> ---
>>   hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
>>   include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 96 insertions(+)
>>
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index 7a483dd05d..93b816aff2 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
>>       }
>>   }
>>
>> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
>> +                                          bool priv_req, bool exec_req,
>> +                                          hwaddr addr, size_t length,
>> +                                          bool no_write, IOMMUTLBEntry *result,
>> +                                          size_t result_length,
>> +                                          uint32_t *err_count)
>> +{
>> +    assert(result_length);
>> +    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
>> +                                                                        pasid);
>> +    if (!iommu_mr || !pcie_ats_enabled(dev)) {
>> +        return -EPERM;
>> +    }
>> +    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
>> +                                                       exec_req, addr, length,
>> +                                                       no_write, result,
>> +                                                       result_length,
>> +                                                       err_count);
>> +}
> Can we use this function not from the endpoint PCI device, but inside of the pci
> subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from
> PCI endpoint device POV?  I guess it would be better to have PCI subsystem to
> issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint
> side.
Hi,

This series aims to bring support for SVM (we are trying to integrate 
the patches bit by bit).
 From a spec point of view, I don't know if it would make sense to 
implement the SVM logic at the PCI level
as it's supposed to be implemented by endpoint devices.
However, we could consider providing a reference/reusable/encapsulated 
implementation of SVM with a simplified API
that would call the pci_* functions under the hood.

Do you have a specific use case in mind?

 >cmd

>
Re: [PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS
Posted by Minwoo Im 2 months, 1 week ago
On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote:
> 
> 
> On 09/07/2024 12:15, Minwoo Im wrote:
> > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
> >
> >
> > On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote:
> >> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
> >>
> >> Devices implementing ATS can send translation requests using
> >> pci_ats_request_translation_pasid.
> >>
> >> The invalidation events are sent back to the device using the iommu
> >> notifier managed with pci_register_iommu_tlb_event_notifier and
> >> pci_unregister_iommu_tlb_event_notifier
> >>
> >> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
> >> ---
> >>   hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
> >>   include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
> >>   2 files changed, 96 insertions(+)
> >>
> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >> index 7a483dd05d..93b816aff2 100644
> >> --- a/hw/pci/pci.c
> >> +++ b/hw/pci/pci.c
> >> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
> >>       }
> >>   }
> >>
> >> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
> >> +                                          bool priv_req, bool exec_req,
> >> +                                          hwaddr addr, size_t length,
> >> +                                          bool no_write, IOMMUTLBEntry *result,
> >> +                                          size_t result_length,
> >> +                                          uint32_t *err_count)
> >> +{
> >> +    assert(result_length);
> >> +    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
> >> +                                                                        pasid);
> >> +    if (!iommu_mr || !pcie_ats_enabled(dev)) {
> >> +        return -EPERM;
> >> +    }
> >> +    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
> >> +                                                       exec_req, addr, length,
> >> +                                                       no_write, result,
> >> +                                                       result_length,
> >> +                                                       err_count);
> >> +}
> > Can we use this function not from the endpoint PCI device, but inside of the pci
> > subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from
> > PCI endpoint device POV?  I guess it would be better to have PCI subsystem to
> > issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint
> > side.
> Hi,
> 
> This series aims to bring support for SVM (we are trying to integrate 
> the patches bit by bit).
>  From a spec point of view, I don't know if it would make sense to 
> implement the SVM logic at the PCI level
> as it's supposed to be implemented by endpoint devices.

Understood that this series is targeting the SVM usage.  But ATS feature is
something general to PCI devices, not only just for SVM, so I guess it would be
better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem
like pci_dma_rw() to avoid duplicated implementation in the future for the
other PCI enpoint devices.

> However, we could consider providing a reference/reusable/encapsulated 
> implementation of SVM with a simplified API
> that would call the pci_* functions under the hood.

I would prefer that PCI devices which want to request ATS translation has no
additional implementation for ATS, but only pcie_ats_init().

> 
> Do you have a specific use case in mind?

ATS/PRI is the actual use case, and it's not that different what you are
targeting for :)

> 
>  >cmd
> 
> >
Re: [PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS
Posted by CLEMENT MATHIEU--DRIF 2 months, 1 week ago

On 09/07/2024 23:17, Minwoo Im wrote:
> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
>
>
> On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote:
>>
>> On 09/07/2024 12:15, Minwoo Im wrote:
>>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
>>>
>>>
>>> On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote:
>>>> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>>>
>>>> Devices implementing ATS can send translation requests using
>>>> pci_ats_request_translation_pasid.
>>>>
>>>> The invalidation events are sent back to the device using the iommu
>>>> notifier managed with pci_register_iommu_tlb_event_notifier and
>>>> pci_unregister_iommu_tlb_event_notifier
>>>>
>>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>>> ---
>>>>    hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
>>>>    include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
>>>>    2 files changed, 96 insertions(+)
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 7a483dd05d..93b816aff2 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
>>>>        }
>>>>    }
>>>>
>>>> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
>>>> +                                          bool priv_req, bool exec_req,
>>>> +                                          hwaddr addr, size_t length,
>>>> +                                          bool no_write, IOMMUTLBEntry *result,
>>>> +                                          size_t result_length,
>>>> +                                          uint32_t *err_count)
>>>> +{
>>>> +    assert(result_length);
>>>> +    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
>>>> +                                                                        pasid);
>>>> +    if (!iommu_mr || !pcie_ats_enabled(dev)) {
>>>> +        return -EPERM;
>>>> +    }
>>>> +    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
>>>> +                                                       exec_req, addr, length,
>>>> +                                                       no_write, result,
>>>> +                                                       result_length,
>>>> +                                                       err_count);
>>>> +}
>>> Can we use this function not from the endpoint PCI device, but inside of the pci
>>> subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from
>>> PCI endpoint device POV?  I guess it would be better to have PCI subsystem to
>>> issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint
>>> side.
>> Hi,
>>
>> This series aims to bring support for SVM (we are trying to integrate
>> the patches bit by bit).
>>   From a spec point of view, I don't know if it would make sense to
>> implement the SVM logic at the PCI level
>> as it's supposed to be implemented by endpoint devices.
> Understood that this series is targeting the SVM usage.  But ATS feature is
> something general to PCI devices, not only just for SVM, so I guess it would be
> better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem
> like pci_dma_rw() to avoid duplicated implementation in the future for the
> other PCI enpoint devices.

Would we store the ATC directly in the PCI subsytem?
>
>> However, we could consider providing a reference/reusable/encapsulated
>> implementation of SVM with a simplified API
>> that would call the pci_* functions under the hood.
> I would prefer that PCI devices which want to request ATS translation has no
> additional implementation for ATS, but only pcie_ats_init().
Hi,

I think both strategies can coexist.
Keeping control can be interesting for people who use Qemu for hardware 
prototyping and who generally want to experiment.
We can keep the current PCI-level API for devices that want to 
reimplement the logic themselves
and add a kind of "DMA module"/"ATS+PRI module" that works out of the box.
That module could be called "struct PciDmaModule" and expose a simple 
set of functions like pci_dma_module_init, pci_dma_module_read, 
pci_dma_module_write.
I think it's important to keep existing DMA API as is to allow devices 
to do both "with ATS" and "without ATS" operations.

Do you agree with that?
>
>> Do you have a specific use case in mind?
> ATS/PRI is the actual use case, and it's not that different what you are
> targeting for :)
>
>>   >cmd
>>
>>>
Re: [PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS
Posted by Minwoo Im 2 months, 1 week ago
On 24-07-10 05:17:42, CLEMENT MATHIEU--DRIF wrote:
> 
> 
> On 09/07/2024 23:17, Minwoo Im wrote:
> > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
> >
> >
> > On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote:
> >>
> >> On 09/07/2024 12:15, Minwoo Im wrote:
> >>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
> >>>
> >>>
> >>> On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote:
> >>>> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
> >>>>
> >>>> Devices implementing ATS can send translation requests using
> >>>> pci_ats_request_translation_pasid.
> >>>>
> >>>> The invalidation events are sent back to the device using the iommu
> >>>> notifier managed with pci_register_iommu_tlb_event_notifier and
> >>>> pci_unregister_iommu_tlb_event_notifier
> >>>>
> >>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
> >>>> ---
> >>>>    hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
> >>>>    include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
> >>>>    2 files changed, 96 insertions(+)
> >>>>
> >>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >>>> index 7a483dd05d..93b816aff2 100644
> >>>> --- a/hw/pci/pci.c
> >>>> +++ b/hw/pci/pci.c
> >>>> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
> >>>>        }
> >>>>    }
> >>>>
> >>>> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
> >>>> +                                          bool priv_req, bool exec_req,
> >>>> +                                          hwaddr addr, size_t length,
> >>>> +                                          bool no_write, IOMMUTLBEntry *result,
> >>>> +                                          size_t result_length,
> >>>> +                                          uint32_t *err_count)
> >>>> +{
> >>>> +    assert(result_length);
> >>>> +    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
> >>>> +                                                                        pasid);
> >>>> +    if (!iommu_mr || !pcie_ats_enabled(dev)) {
> >>>> +        return -EPERM;
> >>>> +    }
> >>>> +    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
> >>>> +                                                       exec_req, addr, length,
> >>>> +                                                       no_write, result,
> >>>> +                                                       result_length,
> >>>> +                                                       err_count);
> >>>> +}
> >>> Can we use this function not from the endpoint PCI device, but inside of the pci
> >>> subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from
> >>> PCI endpoint device POV?  I guess it would be better to have PCI subsystem to
> >>> issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint
> >>> side.
> >> Hi,
> >>
> >> This series aims to bring support for SVM (we are trying to integrate
> >> the patches bit by bit).
> >>   From a spec point of view, I don't know if it would make sense to
> >> implement the SVM logic at the PCI level
> >> as it's supposed to be implemented by endpoint devices.
> > Understood that this series is targeting the SVM usage.  But ATS feature is
> > something general to PCI devices, not only just for SVM, so I guess it would be
> > better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem
> > like pci_dma_rw() to avoid duplicated implementation in the future for the
> > other PCI enpoint devices.
> 
> Would we store the ATC directly in the PCI subsytem?

Yes, endpoint device (e.g., svm.c) should call pci_* helpers in PCI subsystem
with `PCIDevice *pdev instance` which represents the endpoint device itself.
By the instance, we can look up the IOTLB entry from the ATC in the PCI
subsystem, not the current caller side.

> >
> >> However, we could consider providing a reference/reusable/encapsulated
> >> implementation of SVM with a simplified API
> >> that would call the pci_* functions under the hood.
> > I would prefer that PCI devices which want to request ATS translation has no
> > additional implementation for ATS, but only pcie_ats_init().
> Hi,
> 
> I think both strategies can coexist.
> Keeping control can be interesting for people who use Qemu for hardware 
> prototyping and who generally want to experiment.
> We can keep the current PCI-level API for devices that want to 
> reimplement the logic themselves
> and add a kind of "DMA module"/"ATS+PRI module" that works out of the box.

I think we should proivde hybrid mode on this.  One for a `generic` cache
policy mode for every PCI endpoint devices which can be controlled in the PCI
subsystem for ATC, the other one is that device-specific cache policy mode
which let each device implement their own ATC lookup behaviors to optimize
their own caching impact.

> That module could be called "struct PciDmaModule" and expose a simple 
> set of functions like pci_dma_module_init, pci_dma_module_read, 
> pci_dma_module_write.
> I think it's important to keep existing DMA API as is to allow devices 
> to do both "with ATS" and "without ATS" operations.
> 
> Do you agree with that?

Indeed.  Keeping the existing APIs is a good choice, but I would like to have
endpoint devices code much more simpler for the generic usages :)

> >
> >> Do you have a specific use case in mind?
> > ATS/PRI is the actual use case, and it's not that different what you are
> > targeting for :)
> >
> >>   >cmd
> >>
> >>>
Re: [PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS
Posted by CLEMENT MATHIEU--DRIF 2 months ago

On 11/07/2024 10:04, Minwoo Im wrote:
> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
>
>
> On 24-07-10 05:17:42, CLEMENT MATHIEU--DRIF wrote:
>>
>> On 09/07/2024 23:17, Minwoo Im wrote:
>>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
>>>
>>>
>>> On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote:
>>>> On 09/07/2024 12:15, Minwoo Im wrote:
>>>>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
>>>>>
>>>>>
>>>>> On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote:
>>>>>> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>>>>>
>>>>>> Devices implementing ATS can send translation requests using
>>>>>> pci_ats_request_translation_pasid.
>>>>>>
>>>>>> The invalidation events are sent back to the device using the iommu
>>>>>> notifier managed with pci_register_iommu_tlb_event_notifier and
>>>>>> pci_unregister_iommu_tlb_event_notifier
>>>>>>
>>>>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
>>>>>> ---
>>>>>>     hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
>>>>>>     include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
>>>>>>     2 files changed, 96 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>>> index 7a483dd05d..93b816aff2 100644
>>>>>> --- a/hw/pci/pci.c
>>>>>> +++ b/hw/pci/pci.c
>>>>>> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
>>>>>> +                                          bool priv_req, bool exec_req,
>>>>>> +                                          hwaddr addr, size_t length,
>>>>>> +                                          bool no_write, IOMMUTLBEntry *result,
>>>>>> +                                          size_t result_length,
>>>>>> +                                          uint32_t *err_count)
>>>>>> +{
>>>>>> +    assert(result_length);
>>>>>> +    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
>>>>>> +                                                                        pasid);
>>>>>> +    if (!iommu_mr || !pcie_ats_enabled(dev)) {
>>>>>> +        return -EPERM;
>>>>>> +    }
>>>>>> +    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
>>>>>> +                                                       exec_req, addr, length,
>>>>>> +                                                       no_write, result,
>>>>>> +                                                       result_length,
>>>>>> +                                                       err_count);
>>>>>> +}
>>>>> Can we use this function not from the endpoint PCI device, but inside of the pci
>>>>> subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from
>>>>> PCI endpoint device POV?  I guess it would be better to have PCI subsystem to
>>>>> issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint
>>>>> side.
>>>> Hi,
>>>>
>>>> This series aims to bring support for SVM (we are trying to integrate
>>>> the patches bit by bit).
>>>>    From a spec point of view, I don't know if it would make sense to
>>>> implement the SVM logic at the PCI level
>>>> as it's supposed to be implemented by endpoint devices.
>>> Understood that this series is targeting the SVM usage.  But ATS feature is
>>> something general to PCI devices, not only just for SVM, so I guess it would be
>>> better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem
>>> like pci_dma_rw() to avoid duplicated implementation in the future for the
>>> other PCI enpoint devices.
>> Would we store the ATC directly in the PCI subsytem?
> Yes, endpoint device (e.g., svm.c) should call pci_* helpers in PCI subsystem
> with `PCIDevice *pdev instance` which represents the endpoint device itself.
> By the instance, we can look up the IOTLB entry from the ATC in the PCI
> subsystem, not the current caller side.
>
>>>> However, we could consider providing a reference/reusable/encapsulated
>>>> implementation of SVM with a simplified API
>>>> that would call the pci_* functions under the hood.
>>> I would prefer that PCI devices which want to request ATS translation has no
>>> additional implementation for ATS, but only pcie_ats_init().
>> Hi,
>>
>> I think both strategies can coexist.
>> Keeping control can be interesting for people who use Qemu for hardware
>> prototyping and who generally want to experiment.
>> We can keep the current PCI-level API for devices that want to
>> reimplement the logic themselves
>> and add a kind of "DMA module"/"ATS+PRI module" that works out of the box.
> I think we should proivde hybrid mode on this.  One for a `generic` cache
> policy mode for every PCI endpoint devices which can be controlled in the PCI
> subsystem for ATC, the other one is that device-specific cache policy mode
> which let each device implement their own ATC lookup behaviors to optimize
> their own caching impact.
>
>> That module could be called "struct PciDmaModule" and expose a simple
>> set of functions like pci_dma_module_init, pci_dma_module_read,
>> pci_dma_module_write.
>> I think it's important to keep existing DMA API as is to allow devices
>> to do both "with ATS" and "without ATS" operations.
>>
>> Do you agree with that?
> Indeed.  Keeping the existing APIs is a good choice, but I would like to have
> endpoint devices code much more simpler for the generic usages :)
That's a good point, we will se what we can do once the current work is 
integrated.
Thanks for your comment :)
>
>>>> Do you have a specific use case in mind?
>>> ATS/PRI is the actual use case, and it's not that different what you are
>>> targeting for :)
>>>
>>>>    >cmd
>>>>
>>>>>
Re: [PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS
Posted by Minwoo Im 2 months ago
On 24-07-11 19:00:58, CLEMENT MATHIEU--DRIF wrote:
> 
> 
> On 11/07/2024 10:04, Minwoo Im wrote:
> > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
> >
> >
> > On 24-07-10 05:17:42, CLEMENT MATHIEU--DRIF wrote:
> >>
> >> On 09/07/2024 23:17, Minwoo Im wrote:
> >>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
> >>>
> >>>
> >>> On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote:
> >>>> On 09/07/2024 12:15, Minwoo Im wrote:
> >>>>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.
> >>>>>
> >>>>>
> >>>>> On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote:
> >>>>>> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
> >>>>>>
> >>>>>> Devices implementing ATS can send translation requests using
> >>>>>> pci_ats_request_translation_pasid.
> >>>>>>
> >>>>>> The invalidation events are sent back to the device using the iommu
> >>>>>> notifier managed with pci_register_iommu_tlb_event_notifier and
> >>>>>> pci_unregister_iommu_tlb_event_notifier
> >>>>>>
> >>>>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
> >>>>>> ---
> >>>>>>     hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
> >>>>>>     include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>     2 files changed, 96 insertions(+)
> >>>>>>
> >>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >>>>>> index 7a483dd05d..93b816aff2 100644
> >>>>>> --- a/hw/pci/pci.c
> >>>>>> +++ b/hw/pci/pci.c
> >>>>>> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
> >>>>>>         }
> >>>>>>     }
> >>>>>>
> >>>>>> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
> >>>>>> +                                          bool priv_req, bool exec_req,
> >>>>>> +                                          hwaddr addr, size_t length,
> >>>>>> +                                          bool no_write, IOMMUTLBEntry *result,
> >>>>>> +                                          size_t result_length,
> >>>>>> +                                          uint32_t *err_count)
> >>>>>> +{
> >>>>>> +    assert(result_length);
> >>>>>> +    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
> >>>>>> +                                                                        pasid);
> >>>>>> +    if (!iommu_mr || !pcie_ats_enabled(dev)) {
> >>>>>> +        return -EPERM;
> >>>>>> +    }
> >>>>>> +    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
> >>>>>> +                                                       exec_req, addr, length,
> >>>>>> +                                                       no_write, result,
> >>>>>> +                                                       result_length,
> >>>>>> +                                                       err_count);
> >>>>>> +}
> >>>>> Can we use this function not from the endpoint PCI device, but inside of the pci
> >>>>> subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from
> >>>>> PCI endpoint device POV?  I guess it would be better to have PCI subsystem to
> >>>>> issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint
> >>>>> side.
> >>>> Hi,
> >>>>
> >>>> This series aims to bring support for SVM (we are trying to integrate
> >>>> the patches bit by bit).
> >>>>    From a spec point of view, I don't know if it would make sense to
> >>>> implement the SVM logic at the PCI level
> >>>> as it's supposed to be implemented by endpoint devices.
> >>> Understood that this series is targeting the SVM usage.  But ATS feature is
> >>> something general to PCI devices, not only just for SVM, so I guess it would be
> >>> better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem
> >>> like pci_dma_rw() to avoid duplicated implementation in the future for the
> >>> other PCI enpoint devices.
> >> Would we store the ATC directly in the PCI subsytem?
> > Yes, endpoint device (e.g., svm.c) should call pci_* helpers in PCI subsystem
> > with `PCIDevice *pdev instance` which represents the endpoint device itself.
> > By the instance, we can look up the IOTLB entry from the ATC in the PCI
> > subsystem, not the current caller side.
> >
> >>>> However, we could consider providing a reference/reusable/encapsulated
> >>>> implementation of SVM with a simplified API
> >>>> that would call the pci_* functions under the hood.
> >>> I would prefer that PCI devices which want to request ATS translation has no
> >>> additional implementation for ATS, but only pcie_ats_init().
> >> Hi,
> >>
> >> I think both strategies can coexist.
> >> Keeping control can be interesting for people who use Qemu for hardware
> >> prototyping and who generally want to experiment.
> >> We can keep the current PCI-level API for devices that want to
> >> reimplement the logic themselves
> >> and add a kind of "DMA module"/"ATS+PRI module" that works out of the box.
> > I think we should proivde hybrid mode on this.  One for a `generic` cache
> > policy mode for every PCI endpoint devices which can be controlled in the PCI
> > subsystem for ATC, the other one is that device-specific cache policy mode
> > which let each device implement their own ATC lookup behaviors to optimize
> > their own caching impact.
> >
> >> That module could be called "struct PciDmaModule" and expose a simple
> >> set of functions like pci_dma_module_init, pci_dma_module_read,
> >> pci_dma_module_write.
> >> I think it's important to keep existing DMA API as is to allow devices
> >> to do both "with ATS" and "without ATS" operations.
> >>
> >> Do you agree with that?
> > Indeed.  Keeping the existing APIs is a good choice, but I would like to have
> > endpoint devices code much more simpler for the generic usages :)
> That's a good point, we will se what we can do once the current work is 
> integrated.
> Thanks for your comment :)

Do you have a plan to repost this series soon? I would like to apply your
patches and review/test this series.  It looks like you've been through some of
fix patches for VT-d, but I'm curious your further plan for the actual SVM
feature :)

> >
> >>>> Do you have a specific use case in mind?
> >>> ATS/PRI is the actual use case, and it's not that different what you are
> >>> targeting for :)
> >>>
> >>>>    >cmd
> >>>>
> >>>>>
Re: [PATCH ats_vtd v5 20/22] pci: add a pci-level API for ATS
Posted by CLEMENT MATHIEU--DRIF 2 months ago

On 18/07/2024 01:44, Minwoo Im wrote:

Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.


On 24-07-11 19:00:58, CLEMENT MATHIEU--DRIF wrote:




On 11/07/2024 10:04, Minwoo Im wrote:


Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.


On 24-07-10 05:17:42, CLEMENT MATHIEU--DRIF wrote:



On 09/07/2024 23:17, Minwoo Im wrote:


Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.


On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote:


On 09/07/2024 12:15, Minwoo Im wrote:


Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe.


On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote:


From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com><mailto:clement.mathieu--drif@eviden.com>

Devices implementing ATS can send translation requests using
pci_ats_request_translation_pasid.

The invalidation events are sent back to the device using the iommu
notifier managed with pci_register_iommu_tlb_event_notifier and
pci_unregister_iommu_tlb_event_notifier

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com><mailto:clement.mathieu--drif@eviden.com>
---
    hw/pci/pci.c         | 44 +++++++++++++++++++++++++++++++++++++
    include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
    2 files changed, 96 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 7a483dd05d..93b816aff2 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
        }
    }

+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+                                          bool priv_req, bool exec_req,
+                                          hwaddr addr, size_t length,
+                                          bool no_write, IOMMUTLBEntry *result,
+                                          size_t result_length,
+                                          uint32_t *err_count)
+{
+    assert(result_length);
+    IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+                                                                        pasid);
+    if (!iommu_mr || !pcie_ats_enabled(dev)) {
+        return -EPERM;
+    }
+    return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
+                                                       exec_req, addr, length,
+                                                       no_write, result,
+                                                       result_length,
+                                                       err_count);
+}


Can we use this function not from the endpoint PCI device, but inside of the pci
subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from
PCI endpoint device POV?  I guess it would be better to have PCI subsystem to
issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint
side.


Hi,

This series aims to bring support for SVM (we are trying to integrate
the patches bit by bit).
   From a spec point of view, I don't know if it would make sense to
implement the SVM logic at the PCI level
as it's supposed to be implemented by endpoint devices.


Understood that this series is targeting the SVM usage.  But ATS feature is
something general to PCI devices, not only just for SVM, so I guess it would be
better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem
like pci_dma_rw() to avoid duplicated implementation in the future for the
other PCI enpoint devices.


Would we store the ATC directly in the PCI subsytem?


Yes, endpoint device (e.g., svm.c) should call pci_* helpers in PCI subsystem
with `PCIDevice *pdev instance` which represents the endpoint device itself.
By the instance, we can look up the IOTLB entry from the ATC in the PCI
subsystem, not the current caller side.



However, we could consider providing a reference/reusable/encapsulated
implementation of SVM with a simplified API
that would call the pci_* functions under the hood.


I would prefer that PCI devices which want to request ATS translation has no
additional implementation for ATS, but only pcie_ats_init().


Hi,

I think both strategies can coexist.
Keeping control can be interesting for people who use Qemu for hardware
prototyping and who generally want to experiment.
We can keep the current PCI-level API for devices that want to
reimplement the logic themselves
and add a kind of "DMA module"/"ATS+PRI module" that works out of the box.


I think we should proivde hybrid mode on this.  One for a `generic` cache
policy mode for every PCI endpoint devices which can be controlled in the PCI
subsystem for ATC, the other one is that device-specific cache policy mode
which let each device implement their own ATC lookup behaviors to optimize
their own caching impact.



That module could be called "struct PciDmaModule" and expose a simple
set of functions like pci_dma_module_init, pci_dma_module_read,
pci_dma_module_write.
I think it's important to keep existing DMA API as is to allow devices
to do both "with ATS" and "without ATS" operations.

Do you agree with that?


Indeed.  Keeping the existing APIs is a good choice, but I would like to have
endpoint devices code much more simpler for the generic usages :)


That's a good point, we will se what we can do once the current work is
integrated.
Thanks for your comment :)



Do you have a plan to repost this series soon? I would like to apply your
patches and review/test this series.  It looks like you've been through some of
fix patches for VT-d, but I'm curious your further plan for the actual SVM
feature :)

Hi,

As you may have noticed, my SVM implementation is based on Zhenzhong Duan's FLTS series, which has not yet been integrated.
The VT-d quick fixes were the only fixes without dependencies.
I'm now waiting for FLTS support to be integrated upstream before sending my series again (I'll add you to CC).

If you want to test the features, you can clone the following repos:
- https://github.com/BullSequana/Qemu-in-guest-SVM-demo
- https://github.com/BullSequana/qemu (submodule of the repo above)

>cmd









Do you have a specific use case in mind?


ATS/PRI is the actual use case, and it's not that different what you are
targeting for :)



   >cmd