[v6] RE: [PATCH v6 00/33] hw/arm/virt: Add support for user-creatable accelerated SMMUv3

RE: [PATCH v6 00/33] hw/arm/virt: Add support for user-creatable accelerated SMMUv3

Posted by Duan, Zhenzhong 2 months ago


>-----Original Message-----
>From: Shameer Kolothum <skolothumtho@nvidia.com>
>Subject: RE: [PATCH v6 00/33] hw/arm/virt: Add support for user-creatable
>accelerated SMMUv3
>
>
>
>> -----Original Message-----
>> From: Duan, Zhenzhong <zhenzhong.duan@intel.com>
>> Sent: 08 December 2025 10:08
>> To: Shameer Kolothum <skolothumtho@nvidia.com>; qemu-arm@nongnu.org;
>> qemu-devel@nongnu.org
>> Cc: eric.auger@redhat.com; peter.maydell@linaro.org; Jason Gunthorpe
>> <jgg@nvidia.com>; Nicolin Chen <nicolinc@nvidia.com>; ddutile@redhat.com;
>> berrange@redhat.com; Nathan Chen <nathanc@nvidia.com>; Matt Ochs
>> <mochs@nvidia.com>; smostafa@google.com; wangzhou1@hisilicon.com;
>> jiangkunkun@huawei.com; jonathan.cameron@huawei.com;
>> zhangfei.gao@linaro.org; Liu, Yi L <yi.l.liu@intel.com>; Krishnakant Jaju
>> <kjaju@nvidia.com>
>> Subject: RE: [PATCH v6 00/33] hw/arm/virt: Add support for user-creatable
>> accelerated SMMUv3
>>
>>
>> Hi Shameer,
>>
>> >-----Original Message-----
>> >From: Shameer Kolothum <skolothumtho@nvidia.com>
>> >Subject: [PATCH v6 00/33] hw/arm/virt: Add support for user-creatable
>> >accelerated SMMUv3
>> >
>> >Hi,
>> >
>> >Changes from v5:
>> >
>> >https://lore.kernel.org/qemu-devel/20251031105005.24618-1-skolothumtho@nvidia.com/
>> >
>> > - Addressed feedback from v5 and picked up R-by tags. Thanks to all!
>> > - The previously split out _DSM fix mini-series is now accepted [0].
>> > - Improved documentation about the rationale behind the design choice of
>> >   returning an address space aliased to the system address space for
>> >   vfio-pci endpoint devices (patch #10).
>> > - Added error propagation support for smmuv3_cmdq_consume() (patch #13).
>> > - Updated vSTE based HWPT installation to check the SMMU enabled case
>> >   (patch #14).
>> > - Introduced an optional callback to PCIIOMMUOps to retrieve the MSI
>> >   doorbell GPA directly, allowing us to avoid unsafe page table walks for
>> >   MSI translation in accelerated SMMUv3 cases (patch #16).
>> > - GBPA-based vSTE update depends on Nicolin's kernel patch [1].
>> > - VFIO/IOMMUFD has dependency on Zhenzhong's patches: 4/5/8 from the
>> >   pass-through support series [2].
>> >
>> >PATCH organization:
>> > 1–26: Enables accelerated SMMUv3 with features based on default QEMU SMMUv3,
>> >       including IORT RMR based MSI support.
>> > 27–29: Adds options for specifying RIL, ATS, and OAS features.
>> > 30–33: Adds PASID support, including VFIO changes.
>> >
>> >Tests:
>> >Performed basic sanity tests on an NVIDIA GRACE platform with GPU
>> >device assignments. A CUDA test application was used to verify the SVA
>> >use case.
>> >Further tests are always welcome.
>>
>> I see PASID capability is exposed to guest but no pasid attachment in this
>> series.
>> Was the nested hwpt attached to SID instead of pasid?
>
>In the ARM world there is no specific PASID attachment. ARM uses a Context
>Descriptor (CD) table indexed by PASID (substream ID in ARM terms), which is
>owned by the guest. Hence, no specific PASID attach handling is required in QEMU.

I just realized that a nested hwpt on ARM is “stage2 hwpt + guest CD table” rather
than “stage2 hwpt + a guest s1 hwpt”. When creating the nested hwpt, the guest
S1ContextPtr is passed to the host rather than a stage1 TTB. Do I understand this
correctly?
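
To make the question concrete, here is a minimal sketch (not code from this series) of the stage-1 fields a host could read out of word 0 of a guest STE. The bit positions are my reading of the SMMUv3 STE layout (V at bit 0, Config at [3:1], S1Fmt at [5:4], S1ContextPtr at [51:6], S1CDMax at [63:59]); the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* STE word 0 field accessors, assuming the SMMUv3 layout described above. */
#define STE0_V              (1ULL << 0)
#define STE0_CFG(x)         (((x) >> 1) & 0x7ULL)
#define STE0_S1FMT(x)       (((x) >> 4) & 0x3ULL)
#define STE0_S1CTXPTR(x)    ((x) & 0x000FFFFFFFFFFFC0ULL)   /* bits [51:6] */
#define STE0_S1CDMAX(x)     (((x) >> 59) & 0x1FULL)

/* Hypothetical container for what the host learns from the guest STE. */
struct guest_s1_cfg {
    bool     valid;         /* STE.V */
    unsigned config;        /* STE.Config */
    unsigned s1fmt;         /* CD table format (0 means linear) */
    uint64_t cd_table_gpa;  /* S1ContextPtr: guest PA of the CD table */
    unsigned s1cdmax;       /* log2 of the number of CDs in the table */
};

/* Decode only the stage-1 related fields; no stage-1 TTB appears here. */
static struct guest_s1_cfg decode_guest_ste0(uint64_t ste0)
{
    struct guest_s1_cfg cfg = {
        .valid        = (ste0 & STE0_V) != 0,
        .config       = (unsigned)STE0_CFG(ste0),
        .s1fmt        = (unsigned)STE0_S1FMT(ste0),
        .cd_table_gpa = STE0_S1CTXPTR(ste0),
        .s1cdmax      = (unsigned)STE0_S1CDMAX(ste0),
    };
    return cfg;
}
```

The point of the sketch is that the only stage-1 pointer in the STE is the CD table base; the per-context TTBs live in the CDs, which stay in guest memory.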

>
>> How was page fault handled in stage1?
>
>If you meant the PRI capability, that is not supported yet.
>
>However, Zhangfei is maintaining a branch that adds support for devices with
>the SMMUv3 STALL feature and handles S1 page faults:
>https://github.com/Linaro/qemu/commits/master-smmuv3-accel-v6/

Thanks for sharing.

BRs,
Zhenzhong

Re: [PATCH v6 00/33] hw/arm/virt: Add support for user-creatable accelerated SMMUv3

Posted by Yi Liu 2 months ago

On 2025/12/9 10:30, Duan, Zhenzhong wrote:
> 
> 
>> -----Original Message-----
>> From: Shameer Kolothum <skolothumtho@nvidia.com>
>> Subject: RE: [PATCH v6 00/33] hw/arm/virt: Add support for user-creatable
>> accelerated SMMUv3
>>
>>> I see PASID capability is exposed to guest but no pasid attachment in this
>>> series.
>>> Was the nested hwpt attached to SID instead of pasid?
>>
>> In ARM world there is no specific PASID attachment. ARM uses a Context
>> Descriptor (CD) table indexed by PASID(substream in ARM) which is owned by
>> Guest. Hence, no specific PASID attach handling is required in QEMU.
> 
> I just realized that a nested hwpt on ARM is “stage2 hwpt + guest CD table” rather
> than “stage2 hwpt + a guest s1 hwpt”. When creating the nested hwpt, the guest
> S1ContextPtr is passed to the host rather than a stage1 TTB. Do I understand this
> correctly?

Yeah. ARM SMMUv3 and AMD IOMMU do not attach a PASID to the host explicitly.
The guest's whole PASID table is handed to the host at the time nesting is enabled.
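
As a rough illustration of why no per-PASID attach is needed (a sketch, not code from the series): once the CD table base is known, finding the CD for a given substream ID is just an index computation. This assumes the linear CD table format (S1Fmt == 0) and 64-byte CDs; the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CD_SIZE 64u   /* one Context Descriptor is 64 bytes */

/*
 * Compute the guest PA of the CD for a substream ID in a linear CD table.
 * Returns false if ssid does not fit in a table of 2^s1cdmax entries.
 */
static bool cd_gpa_linear(uint64_t cd_table_gpa, unsigned s1cdmax,
                          uint32_t ssid, uint64_t *cd_gpa)
{
    assert(cd_gpa != NULL);

    /* S1CDMax is a 5-bit field, so the shift below cannot overflow. */
    if (s1cdmax > 31 || (uint64_t)ssid >= ((uint64_t)1 << s1cdmax)) {
        return false;
    }
    *cd_gpa = cd_table_gpa + (uint64_t)ssid * CD_SIZE;
    return true;
}
```

For example, with a table base of 0x80001000 and S1CDMax of 5, substream 3 resolves to 0x800010C0, while substream 32 is rejected as out of range.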

Regards,
Yi Liu