[PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases

Klaus Jensen posted 6 patches 1 year, 3 months ago
drivers/iommu/Kconfig                |   2 +-
drivers/iommu/intel/Kconfig          |   1 -
drivers/iommu/intel/Makefile         |   2 +-
drivers/iommu/intel/iommu.c          |  29 ++--
drivers/iommu/intel/iommu.h          |  40 ++++-
drivers/iommu/intel/prq.c            | 284 ++++++++++++++++++++++++++++++++
drivers/iommu/intel/svm.c            | 308 -----------------------------------
drivers/iommu/iommu-priv.h           |   3 +
drivers/iommu/iommu.c                |  31 ++++
drivers/iommu/iommufd/fault.c        |  22 +++
drivers/iommu/iommufd/hw_pagetable.c |   3 +-
11 files changed, 389 insertions(+), 336 deletions(-)
Posted by Klaus Jensen 1 year, 3 months ago
This is a Request for Comment series that will hopefully generate
initial feedback on the use of iommufd_hwpt_replace_device to
execute non-nested and non-SVM user space IOPFs. Our main motivation is
to enable user-space-driver-driven device verification with the default
PASID and without nesting or SVM.

What?
  * Enable IO page fault handling in user space in a non-nested, non-svm
    and non-virtualised use case.
  * Remove the relation between IOPF and INTEL_IOMMU_SVM by allowing
    the user to (de)select the IOPF code through Kconfig.
  * Create a new file under iommu/intel (prq.c) that contains all the
    page request queue related logic and is not under intel/svm.c.
  * Add the IOMMU_HWPT_FAULT_ID_VALID to the valid flags used to create
    IOMMU_HWPT_ALLOC allocations.
  * Create a default (zero) pasid handle and insert it into the pasid
    array within the dev->iommu_group when replacing the old HWPT with
    an iopf enabled HWPT.

Why?
  The PCI ATS Extended Capability allows peripheral devices to
  participate in the caching of translations when operating under an
  IOMMU. Further, the ATS Page Request Interface (PRI) Extension allows
  devices to handle missing mappings. Currently, PRI is mainly used in
  the context of Shared Virtual Addressing, requiring support for the
  Process Address Space Identifier (PASID) capability, but other use
  cases such as enabling user-space driver driven device verification
  and reducing memory pinning exist. This patchset sets out to enable
  these use cases.

Testing?
  The non-nested/non-svm IOPF interface is exercised by first
  initializing an iopf enabled ioas and then reading the fault file
  descriptor. Pseudocode for the iopf initialization and handling is in
  [3] and [4] (using libvfn).

  Supplementary repositories supporting this patchset:
    1. A user space library libvfn [1] which is used for testing and
       verification (see examples/iopf.c), and
    2. Basic emulation of PCIe ATS/PRI and Intel VT-d PRQ in QEMU [2].

Notes
  Patches 5/6 and 6/6 are added by Klaus for testing against the QEMU
  test device (which does not support PASID). They are very much RFC.

Comments and feedback are greatly appreciated.

Best

Joel

PS: I'm on PTO, so my answers might be delayed (back September 2nd). But
    I'll give priority to answer any questions or feedback when I see
    it.

[1] https://github.com/SamsungDS/libvfn/tree/iommufd-fault-queue
[2] https://gitlab.com/birkelund/qemu/-/tree/pcie-ats-pri

[3] Initializing
```
  // Pseudocode: error checking of open() and ioctl() is omitted.
  int iopf_init(struct iommu_ioas *ioas, const char *bdf)
  {
      // open the vfio device that corresponds to bdf
      // (VFIO_DEV is a placeholder for the actual cdev name)
      int devfd = open("/dev/vfio/devices/VFIO_DEV", O_RDWR);

      // bind the device to the iommufd context
      struct vfio_device_bind_iommufd bind = {
          .argsz = sizeof(bind),
          .flags = 0,
          .iommufd = __iommufd,
      };
      ioctl(devfd, VFIO_DEVICE_BIND_IOMMUFD, &bind);

      // initial attach to the ioas
      struct vfio_device_attach_iommufd_pt attach_data = {
          .argsz = sizeof(attach_data),
          .flags = 0,
          .pt_id = ioas->id,
      };
      ioctl(devfd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data);

      // allocate the fault queue; fills out_fault_id and out_fault_fd
      struct iommu_fault_alloc fault = {
          .size = sizeof(fault),
          .flags = 0,
      };
      ioctl(__iommufd, IOMMU_FAULT_QUEUE_ALLOC, &fault);

      // allocate an iopf enabled HWPT on top of the ioas
      struct iommu_hwpt_alloc fault_cmd = {
          .size = sizeof(fault_cmd),
          .flags = IOMMU_HWPT_FAULT_ID_VALID,
          .dev_id = bind.out_devid,
          .pt_id = ioas->id,
          .data_len = 0,
          .data_uptr = (uint64_t)NULL,
          .fault_id = fault.out_fault_id,
          .__reserved = 0,
      };
      ioctl(__iommufd, IOMMU_HWPT_ALLOC, &fault_cmd);

      // This is a re-attach: replace the old HWPT with the iopf enabled one
      struct vfio_device_attach_iommufd_pt attach = {
          .argsz = sizeof(attach),
          .flags = 0,
          .pt_id = fault_cmd.out_hwpt_id
      };
      ioctl(devfd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach);

      return 0;
  }
```

[4] Handling
```
  int handle_iopf(void *vaddr, int len, uint64_t iova) {
    // issue the device command that triggers the DMA
    exec_command(CMD);

    // the fd returned by IOMMU_FAULT_QUEUE_ALLOC in [3], not the fault id
    int iopf_fd = fault.out_fault_fd;

    struct iommu_hwpt_pgfault pgfault = {0};
    if (read(iopf_fd, &pgfault, sizeof(pgfault)) == 0)
      return 0; // no page fault

    // map the faulting range; assumes iommu_map_vaddr returns 0 on success
    int ret = iommu_map_vaddr(__iommufd, vaddr, len, &iova);
    struct iommu_hwpt_page_response pgfault_response = {
      .cookie = pgfault.cookie,
      .code = ret ? IOMMUFD_PAGE_RESP_INVALID : IOMMUFD_PAGE_RESP_SUCCESS,
    };

    write(iopf_fd, &pgfault_response, sizeof(pgfault_response));

    return ret;
  }
```
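
The response step in [4] can be isolated into a small helper. The
following is only a sketch: struct page_response and the RESP_* values
are local stand-ins (assumptions) that mirror the cookie and code fields
of struct iommu_hwpt_page_response and the IOMMUFD_PAGE_RESP_* codes,
and make_response is a hypothetical name. It assumes the mapping helper
returns 0 on success.

```c
#include <stdint.h>

/* Local stand-ins for the uAPI response codes (assumption: success is 0). */
enum resp_code { RESP_SUCCESS = 0, RESP_INVALID = 1 };

/* Local stand-in for the two fields of the page response. */
struct page_response {
	uint32_t cookie; /* copied from the fault that is being answered */
	uint32_t code;   /* RESP_SUCCESS or RESP_INVALID */
};

/*
 * Build the response for one fault. map_ret is the return value of the
 * mapping call, where 0 means the mapping was installed.
 */
static struct page_response make_response(uint32_t fault_cookie, int map_ret)
{
	struct page_response resp = {
		.cookie = fault_cookie,
		.code = map_ret == 0 ? RESP_SUCCESS : RESP_INVALID,
	};
	return resp;
}
```

Keeping the cookie from the fault record is what lets the kernel match
the response to the outstanding page request.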

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
---
Joel Granados (4):
      iommu/vt-d: Separate page request queue from SVM
      iommu: Make IOMMU_IOPF selectable in Kconfig
      iommufd: Enable PRI when doing the iommufd_hwpt_alloc
      iommu: init pasid array while doing domain_replace and iopf is active

Klaus Jensen (2):
      iommu/vt-d: drop pasid requirement for prq initialization
      iommu/vt-d: do not require a PASID in page requests

 drivers/iommu/Kconfig                |   2 +-
 drivers/iommu/intel/Kconfig          |   1 -
 drivers/iommu/intel/Makefile         |   2 +-
 drivers/iommu/intel/iommu.c          |  29 ++--
 drivers/iommu/intel/iommu.h          |  40 ++++-
 drivers/iommu/intel/prq.c            | 284 ++++++++++++++++++++++++++++++++
 drivers/iommu/intel/svm.c            | 308 -----------------------------------
 drivers/iommu/iommu-priv.h           |   3 +
 drivers/iommu/iommu.c                |  31 ++++
 drivers/iommu/iommufd/fault.c        |  22 +++
 drivers/iommu/iommufd/hw_pagetable.c |   3 +-
 11 files changed, 389 insertions(+), 336 deletions(-)
---
base-commit: 3d5f968a177d468cd13568ef901c5be84d83d32b
change-id: 20240823-iopf-for-all-3b19075efc32

Best regards,
-- 
Klaus Jensen <k.jensen@samsung.com>
Re: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases
Posted by Jason Gunthorpe 1 year, 3 months ago
On Mon, Aug 26, 2024 at 01:40:26PM +0200, Klaus Jensen wrote:
> This is a Request for Comment series that will hopefully generate
> initial feedback on the use of the iommufd_hwpt_replace_device to
> execute non-nested and non-svm user space IOPFs. Our main motivation is
> to enable user-space driver driven device verification with default
> pasid and without nesting nor SVM.
> 
> What?
>   * Enable IO page fault handling in user space in a non-nested, non-svm
>     and non-virtualised use case.
>   * Removing the relation between IOPF and INTEL_IOMMU_SVM by allowing
>     the user to (de)select the IOPF code through Kconfig.
>   * Create a new file under iommu/intel (prq.c) that contains all the
>     page request queue related logic and is not under intel/svm.c.
>   * Add the IOMMU_HWPT_FAULT_ID_VALID to the valid flags used to create
>     IOMMU_HWPT_ALLOC allocations.
>   * Create a default (zero) pasid handle and insert it to the pasid
>     array within the dev->iommu_group when replacing the old HWPT with
>     an iopf enabled HWPT.
> 
> Why?
>   The PCI ATS Extended Capability allows peripheral devices to
>   participate in the caching of translations when operating under an
>   IOMMU. Further, the ATS Page Request Interface (PRI) Extension allows
>   devices to handle missing mappings. Currently, PRI is mainly used in
>   the context of Shared Virtual Addressing, requiring support for the
>   Process Address Space Identifier (PASID) capability, but other use
>   cases such as enabling user-space driver driven device verification
>   and reducing memory pinning exists. This patchest sets out to enable
>   these use cases.

I definitely expect PRI to work outside PASID and SVA cases, so this
is going in a good direction

>   Supplementary repositories supporting this patchset:
>     1. A user space library libvfn [1] which is used for testing and
>        verification (see examples/iopf.c), and

That's pretty neat, I've been wanting to see some kind of IOMMU test
suite based around a capable widely available device. This is the
closest I've seen..

Jason
Re: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases
Posted by Joel Granados 1 year, 3 months ago
On Mon, Aug 26, 2024 at 10:59:55AM -0300, Jason Gunthorpe wrote:
> On Mon, Aug 26, 2024 at 01:40:26PM +0200, Klaus Jensen wrote:
> > This is a Request for Comment series that will hopefully generate
> > initial feedback on the use of the iommufd_hwpt_replace_device to
> > execute non-nested and non-svm user space IOPFs. Our main motivation is
> > to enable user-space driver driven device verification with default
> > pasid and without nesting nor SVM.
> > 
> > What?
> >   * Enable IO page fault handling in user space in a non-nested, non-svm
> >     and non-virtualised use case.
> >   * Removing the relation between IOPF and INTEL_IOMMU_SVM by allowing
> >     the user to (de)select the IOPF code through Kconfig.
> >   * Create a new file under iommu/intel (prq.c) that contains all the
> >     page request queue related logic and is not under intel/svm.c.
> >   * Add the IOMMU_HWPT_FAULT_ID_VALID to the valid flags used to create
> >     IOMMU_HWPT_ALLOC allocations.
> >   * Create a default (zero) pasid handle and insert it to the pasid
> >     array within the dev->iommu_group when replacing the old HWPT with
> >     an iopf enabled HWPT.
> > 
> > Why?
> >   The PCI ATS Extended Capability allows peripheral devices to
> >   participate in the caching of translations when operating under an
> >   IOMMU. Further, the ATS Page Request Interface (PRI) Extension allows
> >   devices to handle missing mappings. Currently, PRI is mainly used in
> >   the context of Shared Virtual Addressing, requiring support for the
> >   Process Address Space Identifier (PASID) capability, but other use
> >   cases such as enabling user-space driver driven device verification
> >   and reducing memory pinning exists. This patchest sets out to enable
> >   these use cases.
> 
Sorry for the late reply. Slowly getting through my backlog after PTO.

> I definitely expect PRI to work outside PASID and SVA cases, so this
> is going in a good direction
This touches on a detail (at least in Intel's vtd-io spec) that is not
100% clear to me. Second paragraph of section "3.4.3 Scalable Mode
Address Translation" reads:
"
  ... Scalable-mode context-entries support both requests-without-PASID
  and requests-with-PASID. However unlike legacy mode, in scalable-mode,
  requests-without-PASID obtain a PASID value from the RID_PASID field of
  the scalable-mode context-entry and are processed similarly to
  requests-with-PASID. Implementations not supporting RID_PASID capability
  (ECAP_REG.RPS is 0b), use a PASID value of 0 to perform address
  translation for requests without PASID.
"
This basically means that a default PASID is used even though the
request is without PASID. Right? Therefore "outside PASID" means with
the default PASID (at least in Intel's case). Right?

> 
> >   Supplementary repositories supporting this patchset:
> >     1. A user space library libvfn [1] which is used for testing and
> >        verification (see examples/iopf.c), and
> 
> That's pretty neat, I've been wanting to see some kind of IOMMU test
> suite based around a capable widely available device. This is the
> closest I've seen..
Yes! This is an obvious application of libvfn. Do you see it as
something that can be included in tools/testing/selftests/iommu?

Best
-- 

Joel Granados
Re: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases
Posted by Jason Gunthorpe 1 year, 3 months ago
On Mon, Sep 02, 2024 at 12:48:19PM +0200, Joel Granados wrote:
> > >   Supplementary repositories supporting this patchset:
> > >     1. A user space library libvfn [1] which is used for testing and
> > >        verification (see examples/iopf.c), and
> > 
> > That's pretty neat, I've been wanting to see some kind of IOMMU test
> > suite based around a capable widely available device. This is the
> > closest I've seen..
>
> Yes! This is an obvious application of libvfn. Do you see it as a
> something that can be included in tools/selftests/iommu?

Maybe? What would it look like in-kernel?

I've been thinking the same thing with mlx5

Maybe some kind of test runner with a plugin driver that has some kind
of 'do dma', 'generate interrupt', etc sort of operations, IDK.

Jason
Re: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases
Posted by Joel Granados 1 year, 3 months ago
On Wed, Sep 04, 2024 at 01:13:50PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 02, 2024 at 12:48:19PM +0200, Joel Granados wrote:
> > > >   Supplementary repositories supporting this patchset:
> > > >     1. A user space library libvfn [1] which is used for testing and
> > > >        verification (see examples/iopf.c), and
> > > 
> > > That's pretty neat, I've been wanting to see some kind of IOMMU test
> > > suite based around a capable widely available device. This is the
> > > closest I've seen..
> >
> > Yes! This is an obvious application of libvfn. Do you see it as a
> > something that can be included in tools/selftests/iommu?
> 
> Maybe? What would it look like in-kernel?
Having it in-kernel with libvfn might be a bit too much because we would
have to bring libvfn into the kernel sources. But we can have some sort
of DMA test suite that runs as CI, similar to what fstests or blktests do
(dmatests?). And we can automate it all with kdevops.

Here is a very rough rundown of the idea:
1. We create (or use, if there is one already) a DMA test suite that
can run on its own.
2. Use libvfn to create commands that poke at the iommu{,fd}, intel,
amd and arm drivers.
3. Use the qemu iommu implementation as well as pci dma enabled devices
as the test devices.
4. Can use hardware if it is detected.
5. Can use kdevops to bring up the different environments (e.g. kernel and
qemu branch combinations) needed for the test.
6. And finally put the kdevops test targets into some existing kernel CI
like linux-next or 0day (or whatever makes sense).

> 
> I've been thinking the same thing with mlx5
Not too familiar with this driver, but if it makes sense to reuse it (or
part of it) to make the test happen, I'm all for that.

> 
> Maybe some kind of test runner with a plugin driver that has some kind
> of 'do dma', 'generate interrupt', etc sort of operations, IDK.
Yes. If you are up for it, maybe we can discuss it at LPC and
flesh it out a bit more?

Best

-- 

Joel Granados
Re: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases
Posted by Baolu Lu 1 year, 3 months ago
On 2024/9/2 18:48, Joel Granados wrote:
>> I definitely expect PRI to work outside PASID and SVA cases, so this
>> is going in a good direction
> This touches on a detail (at least in Intel's vtd-io spec) that is not
> 100% clear to me. Second paragraph of section "3.4.3 Scalable Mode
> Address Translation" reads:
> "
>    ... Scalable-mode context-entries support both requests-without-PASID
>    and requests-with-PASID. However unlike legacy mode, in scalable-mode,
>    requests-without-PASID obtain a PASID value from the RID_PASID field of
>    the scalable-mode context- entry and are processed similarly to
>    requests-with-PASID.Implementations not supporting RID_PASID capability
>    (ECAP_REG.RPS is 0b), use a PASID value of 0 to perform address
>    translation for requests without PASID.
> "
> This basically means that a default PASID is used even though the
> request is without PASID. Right? Therefore "outside PASID" means with
> the default PASID (at least in Intels case). Right?

Kind of yes.

The PCI specification defines the concept of PASID and its role in
transaction routing. We refer to PCI transactions with a PASID prefix as
"request-with-PASID" and those without a PASID prefix as "request-
without-PASID." Consequently, I understand 'outside PASID' to mean
transactions that do not have a PASID prefix.

The VT-d specification describes how the IOMMU hardware handles request-
without-PASID. It uses a reserved PASID for its internal routing and
handling purposes. If RID_PASID is supported (ECAP_REG.RPS=1), software
can select its own reserved PASID. Otherwise, the IOMMU hardware will
use a default value of 0.
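
The selection rule described above can be written down as a short C
sketch. It is illustrative only: struct sm_context_entry and
pasid_for_request_without_pasid are hypothetical names, and the bit
position used for ECAP_REG.RPS (49) is an assumption that should be
checked against the VT-d specification and the kernel's ecap_rps()
helper.

```c
#include <stdint.h>

/* Assumed bit position of the RID_PASID support flag (RPS) in ECAP_REG. */
#define ECAP_RPS_SHIFT 49

/* Hypothetical stand-in for the relevant part of a scalable-mode context entry. */
struct sm_context_entry {
	uint32_t rid_pasid; /* PASID chosen by software for requests-without-PASID */
};

/*
 * Return the PASID the IOMMU uses internally for a request that carries
 * no PASID prefix: the software-selected RID_PASID when the hardware
 * supports it, otherwise the fixed default of 0.
 */
static uint32_t pasid_for_request_without_pasid(uint64_t ecap,
						const struct sm_context_entry *ce)
{
	if ((ecap >> ECAP_RPS_SHIFT) & 0x1)
		return ce->rid_pasid & 0xfffff; /* PASIDs are 20 bits wide */
	return 0;
}
```

Either way the request itself stays a request-without-PASID on the bus;
the PASID only exists inside the IOMMU's lookup.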

Thanks,
baolu
Re: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases
Posted by Joel Granados 1 year, 3 months ago
On Mon, Sep 02, 2024 at 08:47:21PM +0800, Baolu Lu wrote:
> On 2024/9/2 18:48, Joel Granados wrote:
> >> I definitely expect PRI to work outside PASID and SVA cases, so this
> >> is going in a good direction
> > This touches on a detail (at least in Intel's vtd-io spec) that is not
> > 100% clear to me. Second paragraph of section "3.4.3 Scalable Mode
> > Address Translation" reads:
> > "
> >    ... Scalable-mode context-entries support both requests-without-PASID
> >    and requests-with-PASID. However unlike legacy mode, in scalable-mode,
> >    requests-without-PASID obtain a PASID value from the RID_PASID field of
> >    the scalable-mode context- entry and are processed similarly to
> >    requests-with-PASID.Implementations not supporting RID_PASID capability
> >    (ECAP_REG.RPS is 0b), use a PASID value of 0 to perform address
> >    translation for requests without PASID.
> > "
> > This basically means that a default PASID is used even though the
> > request is without PASID. Right? Therefore "outside PASID" means with
> > the default PASID (at least in Intels case). Right?
> 
> Kind of yes.
> 
> The PCI specification defines the concept of PASID and its role in
> transaction routing. We refer to PCI transactions with a PASID prefix as
> "request-with-PASID" and those without a PASID prefix as "request-
> without-PASID." Consequently, I understand 'outside PASID' to mean
> transactions that do not have a PASID prefix.
> 
> The VT-d specification describes how the IOMMU hardware handles request-
> without-PASID. It uses a reserved PASID for its internal routing and
> handling purposes. If RID_PASID is supported (ECAP_REG.RPS=1), software
> can select its own reserved PASID. Otherwise, the IOMMU hardware will
> use a default value of 0.
> 
Thx for getting back to me. This raises another doubt in my head
regarding the capabilities published by the Intel IOMMU hardware:

So ecap_pasid [1] does not have to be set in scalable-mode. Right? This
allows hardware supporting scalable-mode to reject transactions with
PASID whenever ecap_pasid is *NOT* set, even though internally things
are handled with a PASID. This question is directly related to the last
two patches in the set, 5/6 and 6/6.

Best

[1] I use ecap_pasid like in the kernel `ecap_pasid(e)  (e>>40) & 0x1`
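
As a side note on the macro quoted in [1]: written without the outer
parentheses, an expression such as !ecap_pasid(e) would expand to
!(e>>40) & 0x1, which tests something else. A minimal sketch with full
parenthesization follows; hw_accepts_requests_with_pasid is a
hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/* PASID support bit (bit 40) of the extended capability register. */
#define ecap_pasid(e) (((e) >> 40) & 0x1)

/*
 * Hypothetical helper: nonzero when the hardware advertises PASID
 * support, i.e. when it accepts requests-with-PASID.
 */
static int hw_accepts_requests_with_pasid(uint64_t ecap)
{
	return ecap_pasid(ecap) != 0;
}
```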

-- 

Joel Granados
Re: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases
Posted by Baolu Lu 1 year, 3 months ago
On 9/3/24 9:20 PM, Joel Granados wrote:
> On Mon, Sep 02, 2024 at 08:47:21PM +0800, Baolu Lu wrote:
>> On 2024/9/2 18:48, Joel Granados wrote:
>>>> I definitely expect PRI to work outside PASID and SVA cases, so this
>>>> is going in a good direction
>>> This touches on a detail (at least in Intel's vtd-io spec) that is not
>>> 100% clear to me. Second paragraph of section "3.4.3 Scalable Mode
>>> Address Translation" reads:
>>> "
>>>     ... Scalable-mode context-entries support both requests-without-PASID
>>>     and requests-with-PASID. However unlike legacy mode, in scalable-mode,
>>>     requests-without-PASID obtain a PASID value from the RID_PASID field of
>>>     the scalable-mode context- entry and are processed similarly to
>>>     requests-with-PASID.Implementations not supporting RID_PASID capability
>>>     (ECAP_REG.RPS is 0b), use a PASID value of 0 to perform address
>>>     translation for requests without PASID.
>>> "
>>> This basically means that a default PASID is used even though the
>>> request is without PASID. Right? Therefore "outside PASID" means with
>>> the default PASID (at least in Intels case). Right?
>> Kind of yes.
>>
>> The PCI specification defines the concept of PASID and its role in
>> transaction routing. We refer to PCI transactions with a PASID prefix as
>> "request-with-PASID" and those without a PASID prefix as "request-
>> without-PASID." Consequently, I understand 'outside PASID' to mean
>> transactions that do not have a PASID prefix.
>>
>> The VT-d specification describes how the IOMMU hardware handles request-
>> without-PASID. It uses a reserved PASID for its internal routing and
>> handling purposes. If RID_PASID is supported (ECAP_REG.RPS=1), software
>> can select its own reserved PASID. Otherwise, the IOMMU hardware will
>> use a default value of 0.
>>
> Thx for getting back to me. This generates another doubt in my head
> regarding the published capabilities from the intel IOMMU Hardware:
> 
> So ecap_pasid [1] does not have to be set in scalable-mode. Right? This
> allows hardware supporting scalable-mode to reject transactions with
> PASID whenever ecap_pasid is*NOT*  set; even though internally things
> are handled with a PASID. This question is directly related to the two
> last patches in the set.5/6 and 6/6.

Yes. And 5/6, 6/6 make sense to me. We should remove the PASID
restriction from the code once PRI is split from SVA.

Thanks,
baolu
Re: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases
Posted by Joel Granados 1 year, 3 months ago
On Wed, Sep 04, 2024 at 09:37:35AM +0800, Baolu Lu wrote:
> On 9/3/24 9:20 PM, Joel Granados wrote:
> > On Mon, Sep 02, 2024 at 08:47:21PM +0800, Baolu Lu wrote:
> >> On 2024/9/2 18:48, Joel Granados wrote:
> >>>> I definitely expect PRI to work outside PASID and SVA cases, so this
> >>>> is going in a good direction
> >>> This touches on a detail (at least in Intel's vtd-io spec) that is not
> >>> 100% clear to me. Second paragraph of section "3.4.3 Scalable Mode
> >>> Address Translation" reads:
> >>> "
> >>>     ... Scalable-mode context-entries support both requests-without-PASID
> >>>     and requests-with-PASID. However unlike legacy mode, in scalable-mode,
> >>>     requests-without-PASID obtain a PASID value from the RID_PASID field of
> >>>     the scalable-mode context- entry and are processed similarly to
> >>>     requests-with-PASID.Implementations not supporting RID_PASID capability
> >>>     (ECAP_REG.RPS is 0b), use a PASID value of 0 to perform address
> >>>     translation for requests without PASID.
> >>> "
> >>> This basically means that a default PASID is used even though the
> >>> request is without PASID. Right? Therefore "outside PASID" means with
> >>> the default PASID (at least in Intels case). Right?
> >> Kind of yes.
> >>
> >> The PCI specification defines the concept of PASID and its role in
> >> transaction routing. We refer to PCI transactions with a PASID prefix as
> >> "request-with-PASID" and those without a PASID prefix as "request-
> >> without-PASID." Consequently, I understand 'outside PASID' to mean
> >> transactions that do not have a PASID prefix.
> >>
> >> The VT-d specification describes how the IOMMU hardware handles request-
> >> without-PASID. It uses a reserved PASID for its internal routing and
> >> handling purposes. If RID_PASID is supported (ECAP_REG.RPS=1), software
> >> can select its own reserved PASID. Otherwise, the IOMMU hardware will
> >> use a default value of 0.
> >>
> > Thx for getting back to me. This generates another doubt in my head
> > regarding the published capabilities from the intel IOMMU Hardware:
> > 
> > So ecap_pasid [1] does not have to be set in scalable-mode. Right? This
> > allows hardware supporting scalable-mode to reject transactions with
> > PASID whenever ecap_pasid is*NOT*  set; even though internally things
> > are handled with a PASID. This question is directly related to the two
> > last patches in the set.5/6 and 6/6.
> 
> Yes. And 5/6, 6/6 make sense to me. We should remove the PASID
> restriction from the code once PRI is split from SVA.

Thx for the clarification. I'll make sure to include them in my V1.

Best

-- 

Joel Granados
Re: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in non-nested and non-svm cases
Posted by Joel Granados 1 year, 3 months ago
On Mon, Sep 02, 2024 at 12:48:19PM +0200, Joel Granados wrote:
> On Mon, Aug 26, 2024 at 10:59:55AM -0300, Jason Gunthorpe wrote:
> > On Mon, Aug 26, 2024 at 01:40:26PM +0200, Klaus Jensen wrote:
> > > This is a Request for Comment series that will hopefully generate
> > > initial feedback on the use of the iommufd_hwpt_replace_device to
> > > execute non-nested and non-svm user space IOPFs. Our main motivation is
> > > to enable user-space driver driven device verification with default
> > > pasid and without nesting nor SVM.
> > > 
> > > What?
> > >   * Enable IO page fault handling in user space in a non-nested, non-svm
> > >     and non-virtualised use case.
> > >   * Removing the relation between IOPF and INTEL_IOMMU_SVM by allowing
> > >     the user to (de)select the IOPF code through Kconfig.
> > >   * Create a new file under iommu/intel (prq.c) that contains all the
> > >     page request queue related logic and is not under intel/svm.c.
> > >   * Add the IOMMU_HWPT_FAULT_ID_VALID to the valid flags used to create
> > >     IOMMU_HWPT_ALLOC allocations.
> > >   * Create a default (zero) pasid handle and insert it to the pasid
> > >     array within the dev->iommu_group when replacing the old HWPT with
> > >     an iopf enabled HWPT.
> > > 
> > > Why?
> > >   The PCI ATS Extended Capability allows peripheral devices to
> > >   participate in the caching of translations when operating under an
> > >   IOMMU. Further, the ATS Page Request Interface (PRI) Extension allows
> > >   devices to handle missing mappings. Currently, PRI is mainly used in
> > >   the context of Shared Virtual Addressing, requiring support for the
> > >   Process Address Space Identifier (PASID) capability, but other use
> > >   cases such as enabling user-space driver driven device verification
> > >   and reducing memory pinning exists. This patchest sets out to enable
> > >   these use cases.
> > 
> Sorry for the late reply, Slowly getting through my backlog after PTO
> 
> > I definitely expect PRI to work outside PASID and SVA cases, so this
> > is going in a good direction
> This touches on a detail (at least in Intel's vtd-io spec) that is not
> 100% clear to me. Second paragraph of section "3.4.3 Scalable Mode
> Address Translation" reads:
> "
>   ... Scalable-mode context-entries support both requests-without-PASID
>   and requests-with-PASID. However unlike legacy mode, in scalable-mode,
>   requests-without-PASID obtain a PASID value from the RID_PASID field of
>   the scalable-mode context- entry and are processed similarly to
>   requests-with-PASID.Implementations not supporting RID_PASID capability
>   (ECAP_REG.RPS is 0b), use a PASID value of 0 to perform address
>   translation for requests without PASID.
> "
> This basically means that a default PASID is used even though the
> request is without PASID. Right? Therefore "outside PASID" means with
> the default PASID (at least in Intels case). Right?
This is related to patches 5/6 and 6/6 of this set, and is maybe more
of a question for Lu Baolu.

Best

-- 

Joel Granados