[RFC PATCH 00/11] iommu/riscv: Add hardware dirty tracking for second-stage domains

fangyu.yu@linux.alibaba.com posted 11 patches 1 month, 2 weeks ago
There is a newer version of this series
Posted by fangyu.yu@linux.alibaba.com 1 month, 2 weeks ago
From: Fangyu Yu <fangyu.yu@linux.alibaba.com>

The RISC-V IOMMU architecture defines an AMO_HWAD capability (Hardware
Access/Dirty update) that allows the IOMMU to atomically set the A/D bits
in second-stage PTEs on DMA access.  When DC.tc.GADE is asserted, the IOMMU
autonomously sets D on the first write to a page mapped by an iohgatp
domain.  This series wires that capability up to the iommufd dirty-tracking
interface (IOMMU_HWPT_SET_DIRTY_TRACKING / IOMMU_HWPT_GET_DIRTY_BITMAP) and
reports IOMMU_CAP_DIRTY_TRACKING.
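
As a rough illustration of the hardware behaviour described above (a toy
model, not code from this series), the sketch below shows what a DMA write
does to a second-stage leaf PTE with and without GADE.  The bit positions
follow the RISC-V PTE format; the macro names and model_dma_write() are
hypothetical, and permission checks other than V and W are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* RISC-V PTE flag bits; the macro names are illustrative only. */
#define PTE_V (UINT64_C(1) << 0)	/* valid */
#define PTE_W (UINT64_C(1) << 2)	/* writable */
#define PTE_A (UINT64_C(1) << 6)	/* accessed */
#define PTE_D (UINT64_C(1) << 7)	/* dirty */

/*
 * Toy model of a DMA write hitting a second-stage leaf PTE.
 * Returns true if the write proceeds, false if it would fault.
 */
static inline bool model_dma_write(_Atomic uint64_t *pte, bool gade)
{
	uint64_t old = atomic_load(pte);

	if (!(old & PTE_V) || !(old & PTE_W))
		return false;		/* not mapped for writing */
	if ((old & (PTE_A | PTE_D)) == (PTE_A | PTE_D))
		return true;		/* already accessed and dirty */
	if (!gade)
		return false;		/* software has to manage A/D */

	/* GADE set: the IOMMU updates A and D atomically. */
	atomic_fetch_or(pte, PTE_A | PTE_D);
	return true;
}
```

With GADE clear, software must preset A/D to avoid faults, which is why the
series stops presetting D once dirty tracking is turned on.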

Design notes
------------

* The feature is scoped to second-stage (iohgatp) domains only; these are
  the domains created for KVM / VFIO device pass-through when userspace
  allocates an HWPT with IOMMU_HWPT_ALLOC_NEST_PARENT or
  IOMMU_HWPT_ALLOC_DIRTY_TRACKING.  First-stage (iosatp) domains are not
  touched by this series.

* The page-table side plugs into the existing generic_pt dirty hook
  framework (amdv1 / vtdss style).  RISC-V adds the three required PTE
  ops – is_write_dirty / make_write_clean / make_write_dirty.
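
For readers unfamiliar with the dirty hooks, here is a minimal standalone
sketch of what the three ops amount to for a RISC-V leaf PTE.  It is not
the code from this series: the sketch_* names and the flat uint64_t PTE are
assumptions, and setting A together with D in make_write_dirty is a choice
made for the sketch.  Atomic read-modify-write is used because the IOMMU
may update the same PTE concurrently.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* RISC-V PTE accessed/dirty bits; macro names are illustrative only. */
#define SKETCH_PTE_A (UINT64_C(1) << 6)
#define SKETCH_PTE_D (UINT64_C(1) << 7)

/* Report whether a write has dirtied the page behind this PTE. */
static inline bool sketch_is_write_dirty(_Atomic uint64_t *pte)
{
	return (atomic_load(pte) & SKETCH_PTE_D) != 0;
}

/* Clear D so that the next write is recorded again. */
static inline void sketch_make_write_clean(_Atomic uint64_t *pte)
{
	atomic_fetch_and(pte, ~SKETCH_PTE_D);
}

/* Mark the page dirty from software (assumption: A is set as well). */
static inline void sketch_make_write_dirty(_Atomic uint64_t *pte)
{
	atomic_fetch_or(pte, SKETCH_PTE_A | SKETCH_PTE_D);
}
```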

Testing
-------

* Tested on QEMU RISC-V: a virtio-net and an e1000e device were passed
  through to an L2 guest via vfio-pci + iommufd.

* generic_pt KUnit: the existing test_dirty case now runs and passes for
  the RISC-V 64-bit format.

Follow-up work
--------------

* Build a dedicated end-to-end test case that drives the full flow
  (HWPT_ALLOC with DIRTY_TRACKING -> attach -> IOAS_MAP -> generate real
  DMA -> SET_DIRTY_TRACKING -> GET_DIRTY_BITMAP -> verify bitmap against
  expected IOVA footprint) so that the behaviour can be regression-tested
  beyond the KUnit PTE-level coverage.

* If possible, rebase and retest on top of the updated "iommu irqbypass"
  patchset.
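
The "verify bitmap against expected IOVA footprint" step of the end-to-end
test could look roughly like the sketch below.  It assumes one bit per
page, with bit 0 of word 0 covering the first page at the base IOVA, and
that every recorded DMA range has a non-zero length; the helper names are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mark every page touched by [iova, iova + len) in the expected bitmap. */
static inline void expect_dirty(uint64_t *bitmap, uint64_t base,
				uint64_t page_size, uint64_t iova,
				uint64_t len)
{
	uint64_t first = (iova - base) / page_size;
	uint64_t last = (iova + len - 1 - base) / page_size;

	for (uint64_t i = first; i <= last; i++)
		bitmap[i / 64] |= UINT64_C(1) << (i % 64);
}

/* Compare the expected bitmap with the one reported by the kernel. */
static inline bool bitmaps_match(const uint64_t *expected,
				 const uint64_t *reported, size_t npages)
{
	for (size_t i = 0; i < npages; i++) {
		bool e = (expected[i / 64] >> (i % 64)) & 1;
		bool r = (reported[i / 64] >> (i % 64)) & 1;

		if (e != r)
			return false;
	}
	return true;
}
```

A test would call expect_dirty() once per DMA buffer it asked the device to
write, then run bitmaps_match() against the buffer filled in by
GET_DIRTY_BITMAP.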


Fangyu Yu (6):
  iommupt: Add RISC-V Second-stage (iohgatp) page table support
  iommu/riscv: Add domain_alloc_paging_flags for second-stage domain
  iommupt: Don't preset D when RISC-V IOMMU dirty tracking on
  iommu/riscv: Add dirty tracking support for second-stage domains
  iommu/riscv: Add IOTINVAL.GVMA after updating DDT/PDT entries
  iommupt: Add RISC-V dirty tracking PTE ops

Tomasz Jeznach (2):
  iommu/riscv: report iommu capabilities
  RISC-V: KVM: Enable KVM_VFIO interfaces on RISC-V arch

Zong Li (3):
  iommu/riscv: use data structure instead of individual values
  iommu/riscv: support GSCID and GVMA invalidation command
  iommu/riscv: support nested iommu for getting iommu hardware
    information

 arch/riscv/kvm/Kconfig               |   2 +
 drivers/iommu/generic_pt/fmt/riscv.h | 120 ++++++++++++-
 drivers/iommu/riscv/iommu-bits.h     |   7 +
 drivers/iommu/riscv/iommu.c          | 247 +++++++++++++++++++++++----
 include/linux/generic_pt/common.h    |  13 ++
 include/linux/generic_pt/iommu.h     |  17 +-
 include/uapi/linux/iommufd.h         |  18 ++
 7 files changed, 383 insertions(+), 41 deletions(-)

-- 
2.50.1

Re: [RFC PATCH 00/11] iommu/riscv: Add hardware dirty tracking for second-stage domains
Posted by Andrew Jones 1 month, 1 week ago
On Tue, Apr 28, 2026 at 09:13:48PM +0800, fangyu.yu@linux.alibaba.com wrote:
> From: Fangyu Yu <fangyu.yu@linux.alibaba.com>
> 
> The RISC-V IOMMU architecture defines an AMO_HWAD capability (Hardware
> Access/Dirty update) that allows the IOMMU to atomically set the A/D bits
> in second-stage PTEs on DMA access.  When DC.tc.GADE is asserted, the IOMMU
> autonomously sets D on the first write to a page mapped by an iohgatp
> domain.  This series wires that capability up to the iommufd dirty-tracking
> interface (IOMMU_HWPT_SET_DIRTY_TRACKING / IOMMU_HWPT_GET_DIRTY_BITMAP) and
> reports IOMMU_CAP_DIRTY_TRACKING.
> 
> Design notes
> ------------
> 
> * The feature is scoped to second-stage (iohgatp) domains only; these are
>   the domains created for KVM / VFIO device pass-through when userspace
>   allocates an HWPT with IOMMU_HWPT_ALLOC_NEST_PARENT or
>   IOMMU_HWPT_ALLOC_DIRTY_TRACKING.  First-stage (iosatp) domains are not
>   touched by this series.
> 
> * The page-table side plugs into the existing generic_pt dirty hook
>   framework (amdv1 / vtdss style).  RISC-V adds the three required PTE
>   ops – is_write_dirty / make_write_clean / make_write_dirty.
> 
> Testing
> -------
> 
> * Tested on QEMU RISC-V: a virtio-net and an e1000e device were passed
>   through to an L2 guest via vfio-pci + iommufd.
> 
> * generic_pt KUnit: the existing test_dirty case now runs and passes for
>   the RISC-V 64-bit format.
> 
> Follow-up work
> --------------
> * Build a dedicated end-to-end test case that drives the full flow
>   (HWPT_ALLOC with DIRTY_TRACKING -> attach -> IOAS_MAP -> generate real
>   DMA -> SET_DIRTY_TRACKING -> GET_DIRTY_BITMAP -> verify bitmap against
>   expected IOVA footprint) so that the behaviour can be regression-tested
>   beyond the KUnit PTE-level coverage.
> 
> * If possible, rebase and retest on top of the updated "iommu irqbypass"
>   patchset.

Thanks for this series! I was starting to go down a similar road myself
in order to limit irqbypass to IOMMU_HWPT_ALLOC_NEST_PARENT domains since
I wasn't happy with other approaches, e.g. continuing to use s-stage, but
activating g-stage too with identity mappings since the MSI table can't be
activated otherwise. Or, simply using g-stage instead of s-stage in order
to get the MSI table enabled. In the end, I think the best is to require
nested for irqbypass and this series will provide a good base for that.

I'll rebase irqbypass on this series and test it out.

Thanks,
drew
Re: Re: [RFC PATCH 00/11] iommu/riscv: Add hardware dirty tracking for second-stage domains
Posted by fangyu.yu@linux.alibaba.com 1 month, 1 week ago
>> From: Fangyu Yu <fangyu.yu@linux.alibaba.com>
>> 
>> The RISC-V IOMMU architecture defines an AMO_HWAD capability (Hardware
>> Access/Dirty update) that allows the IOMMU to atomically set the A/D bits
>> in second-stage PTEs on DMA access.  When DC.tc.GADE is asserted, the IOMMU
>> autonomously sets D on the first write to a page mapped by an iohgatp
>> domain.  This series wires that capability up to the iommufd dirty-tracking
>> interface (IOMMU_HWPT_SET_DIRTY_TRACKING / IOMMU_HWPT_GET_DIRTY_BITMAP) and
>> reports IOMMU_CAP_DIRTY_TRACKING.
>> 
>> Design notes
>> ------------
>> 
>> * The feature is scoped to second-stage (iohgatp) domains only; these are
>>   the domains created for KVM / VFIO device pass-through when userspace
>>   allocates an HWPT with IOMMU_HWPT_ALLOC_NEST_PARENT or
>>   IOMMU_HWPT_ALLOC_DIRTY_TRACKING.  First-stage (iosatp) domains are not
>>   touched by this series.
>> 
>> * The page-table side plugs into the existing generic_pt dirty hook
>>   framework (amdv1 / vtdss style).  RISC-V adds the three required PTE
>>   ops – is_write_dirty / make_write_clean / make_write_dirty.
>> 
>> Testing
>> -------
>> 
>> * Tested on QEMU RISC-V: a virtio-net and an e1000e device were passed
>>   through to an L2 guest via vfio-pci + iommufd.
>> 
>> * generic_pt KUnit: the existing test_dirty case now runs and passes for
>>   the RISC-V 64-bit format.
>> 
>> Follow-up work
>> --------------
>> * Build a dedicated end-to-end test case that drives the full flow
>>   (HWPT_ALLOC with DIRTY_TRACKING -> attach -> IOAS_MAP -> generate real
>>   DMA -> SET_DIRTY_TRACKING -> GET_DIRTY_BITMAP -> verify bitmap against
>>   expected IOVA footprint) so that the behaviour can be regression-tested
>>   beyond the KUnit PTE-level coverage.
>> 
>> * If possible, rebase and retest on top of the updated "iommu irqbypass"
>>   patchset.
>
>Thanks for this series! I was starting to go down a similar road myself
>in order to limit irqbypass to IOMMU_HWPT_ALLOC_NEST_PARENT domains since
>I wasn't happy with other approaches, e.g. continuing to use s-stage, but
>activating g-stage too with identity mappings since the MSI table can't be
>activated otherwise. Or, simply using g-stage instead of s-stage in order
>to get the MSI table enabled. In the end, I think the best is to require
>nested for irqbypass and this series will provide a good base for that.
>
>I'll rebase irqbypass on this series and test it out.
>

Thanks for the feedback. Jason has provided some helpful suggestions on this
series, and I am in the process of updating it. I expect to send out a new
version in the coming days.

Fangyu

>Thanks,
>drew