[RFC v2 0/2] hw/misc: Introduce a new SMMUv3 test framework

tangtao1634 posted 2 patches 1 month, 2 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20250930165340.42788-1-tangtao1634@phytium.com.cn
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Fabiano Rosas <farosas@suse.de>, Laurent Vivier <lvivier@redhat.com>
docs/specs/index.rst             |   1 +
docs/specs/smmu-testdev.rst      |  45 ++
hw/misc/Kconfig                  |   5 +
hw/misc/meson.build              |   1 +
hw/misc/smmu-testdev.c           | 943 +++++++++++++++++++++++++++++++
include/hw/misc/smmu-testdev.h   | 402 +++++++++++++
tests/qtest/meson.build          |   1 +
tests/qtest/smmu-testdev-qtest.c | 238 ++++++++
8 files changed, 1636 insertions(+)
create mode 100644 docs/specs/smmu-testdev.rst
create mode 100644 hw/misc/smmu-testdev.c
create mode 100644 include/hw/misc/smmu-testdev.h
create mode 100644 tests/qtest/smmu-testdev-qtest.c
[RFC v2 0/2] hw/misc: Introduce a new SMMUv3 test framework
Posted by tangtao1634 1 month, 2 weeks ago
From: Tao Tang <tangtao1634@phytium.com.cn>

This patch series (V2) introduces several cleanups and improvements to the smmu-testdev device. The main goals are to refactor shared code, enhance robustness, and significantly clarify the complex page table construction used for testing.

Motivation
----------

Currently, thoroughly testing the SMMUv3 emulation requires a significant
software stack. We need to boot a full guest operating system (like Linux)
with the appropriate drivers (e.g., IOMMUFD) and rely on firmware (e.g.,
ACPI with IORT tables or Hafnium) to correctly configure the SMMU and
orchestrate DMA from a peripheral device.

This dependency on a complex software stack presents several challenges:

* High Barrier to Entry: Writing targeted tests for specific SMMU
    features (like fault handling, specific translation regimes, etc.)
    becomes cumbersome.

* Difficult to Debug: It's hard to distinguish whether a bug originates
    from the SMMU emulation itself, the guest driver, the firmware
    tables, or the guest kernel's configuration.

* Slow Iteration: The need to boot a full guest OS slows down the
    development and testing cycle.

The primary goal of this work is to create a lightweight, self-contained
testing environment that allows us to exercise the SMMU's core logic
directly at the qtest level, removing the need for any guest-side software.

Our Approach: A Dedicated Test Device
-------------------------------------

To achieve this, we introduce two main components:

* A new, minimal hardware device: smmu-testdev.
* A corresponding qtest that drives this device to generate SMMU-bound
    traffic.

A key question is, "Why introduce a new smmu-testdev instead of using an
existing PCIe or platform device?"

The answer lies in our goal to minimize complexity. Standard devices,
whether PCIe or platform, come with their own intricate initialization
protocols and often require a complex driver state machine to function.
Using them would re-introduce the very driver-level complexity we aim to
avoid.

The smmu-testdev is intentionally not a conformant, general-purpose PCIe
or platform device. It is a purpose-built, highly simplified "DMA engine."
I've designed it to be analogous to a minimal PCIe Root Complex that
bypasses the full, realistic topology (Host Bridges, Switches, Endpoints)
to provide a direct, programmable path for a DMA request to reach the SMMU.
Its sole purpose is to trigger a DMA transaction when its registers are
written to, making it perfectly suited for direct control from a test
environment like qtest.

The Qtest Framework
-------------------

The new qtest (smmu-testdev-qtest.c) serves as the "bare-metal driver"
for both the SMMU and the smmu-testdev. It manually performs all the
setup that would typically be handled by the guest kernel and firmware,
but in a completely controlled and predictable manner:

1.  SMMU Configuration: It directly initializes the SMMU's registers to a
    known state.

2.  Translation Structure Setup: It manually constructs the necessary
    translation structures in memory, including Stream Table Entries
    (STEs), Context Descriptors (CDs), and Page Tables (PTEs).

3.  DMA Trigger: It programs the smmu-testdev to initiate a DMA operation
    targeting a specific IOVA.

4.  Verification: It waits for the transaction to complete and verifies
    that the memory was accessed correctly after address translation by
    the SMMU.
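
The four steps above can be sketched with a toy model. This is only an
illustration under stated assumptions: it uses a made-up single-level
table and an in-process byte array instead of real SMMUv3 structures
(STEs, CDs, multi-level page tables) and instead of the qtest memory
accessors. Every toy_* name is hypothetical and does not appear in the
series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1u << TOY_PAGE_SHIFT)
#define TOY_NUM_PAGES  8
#define TOY_PTE_VALID  1u

/* Toy "guest memory" and a toy single-level table indexed by IOVA page. */
static uint8_t  toy_mem[TOY_NUM_PAGES * TOY_PAGE_SIZE];
static uint32_t toy_table[TOY_NUM_PAGES];

/* Step 1: put the (toy) translation unit into a known state. */
static void toy_reset(void)
{
    memset(toy_mem, 0, sizeof(toy_mem));
    memset(toy_table, 0, sizeof(toy_table));
}

/* Step 2: build the translation structure, one IOVA page -> one PA page. */
static void toy_map(uint32_t iova, uint32_t pa)
{
    assert((iova >> TOY_PAGE_SHIFT) < TOY_NUM_PAGES);
    assert((pa >> TOY_PAGE_SHIFT) < TOY_NUM_PAGES);
    toy_table[iova >> TOY_PAGE_SHIFT] =
        (pa & ~(TOY_PAGE_SIZE - 1)) | TOY_PTE_VALID;
}

/* Table walk: returns 0 on success, -1 on a translation fault. */
static int toy_translate(uint32_t iova, uint32_t *pa)
{
    uint32_t idx = iova >> TOY_PAGE_SHIFT;

    if (idx >= TOY_NUM_PAGES || !(toy_table[idx] & TOY_PTE_VALID)) {
        return -1;
    }
    *pa = (toy_table[idx] & ~(TOY_PAGE_SIZE - 1)) |
          (iova & (TOY_PAGE_SIZE - 1));
    return 0;
}

/* Step 3: the "device" writes a 32-bit pattern to an IOVA. */
static int toy_dma_write32(uint32_t iova, uint32_t val)
{
    uint32_t pa;

    if (toy_translate(iova, &pa) < 0 ||
        pa + sizeof(val) > sizeof(toy_mem)) {
        return -1;
    }
    memcpy(&toy_mem[pa], &val, sizeof(val));
    return 0;
}

/* Step 4: the test reads the physical address directly to verify. */
static uint32_t toy_read32(uint32_t pa)
{
    uint32_t val;

    assert(pa + sizeof(val) <= sizeof(toy_mem));
    memcpy(&val, &toy_mem[pa], sizeof(val));
    return val;
}
```

A test would call toy_reset(), map IOVA 0x2000 to PA 0x5000, issue
toy_dma_write32(0x2010, pattern) and then check toy_read32(0x5010). A
write to an unmapped IOVA returns -1, which is the toy analogue of a
translation fault.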

This framework provides a solid and extensible foundation for validating
the SMMU's core translation paths. The initial test included in this
series covers a basic DMA completion path in the Non-Secure bank,
serving as a smoke test and a proof of concept.

It is worth noting that this series currently only includes tests for the
Non-Secure SMMU. I am aware of the ongoing discussions and RFC patches
for Secure SMMU support. To avoid a dependency on unmerged work, this
submission does not include tests for the Secure world. However, I have
already implemented these tests locally, and I am prepared to submit
them for review as soon as the core Secure SMMU support is merged
upstream.


Changes from v1 RFC:
- Clarify Page Table Construction:
Detailed comments have been added to the page table construction logic. This is a key improvement, as the test setup extensively re-uses the same set of page tables for multiple translation stages and purposes (e.g., nested S1/S2 walks, CD fetch). The new comments explain this sharing mechanism, which can otherwise be confusing to follow.

- Refactor Shared Helpers:
The helper functions std_space_offset and std_space_to_str are now moved to a common header file. This allows them to be used by both the main device implementation (hw/misc/smmu-testdev.c) and its qtest (tests/qtest/smmu-testdev-qtest.c), improving code re-use and maintainability.

- Enhance Robustness:
Assertions have been added to ensure the device operates only in the expected Non-secure context. Additional conditional checks are also included to prevent potential runtime errors and make the test device more stable.

- Code Simplification and Cleanup:
Several functions that were redundant with existing macros for constructing Context Descriptors (CD) and Stream Table Entries (STE) have been removed. This simplifies the test data setup and reduces code duplication.

Other unused code fragments have also been removed to improve overall code clarity and hygiene.

Tao Tang (2):
  hw/misc/smmu-testdev: introduce minimal SMMUv3 test device
  tests/qtest: add SMMUv3 smoke test using smmu-testdev DMA source

 docs/specs/index.rst             |   1 +
 docs/specs/smmu-testdev.rst      |  45 ++
 hw/misc/Kconfig                  |   5 +
 hw/misc/meson.build              |   1 +
 hw/misc/smmu-testdev.c           | 943 +++++++++++++++++++++++++++++++
 include/hw/misc/smmu-testdev.h   | 402 +++++++++++++
 tests/qtest/meson.build          |   1 +
 tests/qtest/smmu-testdev-qtest.c | 238 ++++++++
 8 files changed, 1636 insertions(+)
 create mode 100644 docs/specs/smmu-testdev.rst
 create mode 100644 hw/misc/smmu-testdev.c
 create mode 100644 include/hw/misc/smmu-testdev.h
 create mode 100644 tests/qtest/smmu-testdev-qtest.c

-- 
2.49.0
Re: [RFC v2 0/2] hw/misc: Introduce a new SMMUv3 test framework
Posted by Peter Maydell 2 weeks, 4 days ago
On Tue, 30 Sept 2025 at 17:53, tangtao1634 <tangtao1634@phytium.com.cn> wrote:
> The smmu-testdev is intentionally not a conformant, general-purpose PCIe
> or platform device. It is a purpose-built, highly simplified "DMA engine."
> I've designed it to be analogous to a minimal PCIe Root Complex that
> bypasses the full, realistic topology (Host Bridges, Switches, Endpoints)
> to provide a direct, programmable path for a DMA request to reach the SMMU.
> Its sole purpose is to trigger a DMA transaction when its registers are
> written to, making it perfectly suited for direct control from a test
> environment like qtest.

This makes sense to me. But looking at the code it looks like the
device itself has a lot of code for setting up IOMMU page tables in
guest memory when the test code writes to its registers. That
surprised me, as I was expecting the test device to essentially
be "do DMA on command". Is there a reason why we can't have the
test code do the setting up of the IOMMU page tables itself
using the qtest functions for writing guest memory? (Obviously
you'd abstract this out into functions for the purpose in
libqos/ somewhere.)

If we did it that way, we could use the same test device as
part of non-SMMUv3 iommu emulation tests too -- the qtest
test case code would just set up the different IOMMU in
the way that IOMMU expects before triggering DMA.

thanks
-- PMM
Re: [RFC v2 0/2] hw/misc: Introduce a new SMMUv3 test framework
Posted by Tao Tang 2 weeks, 3 days ago
Hi Alex, Peter, Pierrick:


Thank you all again for the outstanding feedback.


As Peter said in mail 
[1](https://lore.kernel.org/qemu-devel/CAFEAcA92dTDn+Zf-GZVv9zQ3_mwJHZY5hrkdgrRyE7XUio4Sjw@mail.gmail.com/): 


On 2025/10/27 21:58, Peter Maydell wrote:
> On Tue, 30 Sept 2025 at 17:53, tangtao1634 <tangtao1634@phytium.com.cn> wrote:
>> The smmu-testdev is intentionally not a conformant, general-purpose PCIe
>> or platform device. It is a purpose-built, highly simplified "DMA engine."
>> I've designed it to be analogous to a minimal PCIe Root Complex that
>> bypasses the full, realistic topology (Host Bridges, Switches, Endpoints)
>> to provide a direct, programmable path for a DMA request to reach the SMMU.
>> Its sole purpose is to trigger a DMA transaction when its registers are
>> written to, making it perfectly suited for direct control from a test
>> environment like qtest.
> This makes sense to me. But looking at the code it looks like the
> device itself has a lot of code for setting up IOMMU page tables in
> guest memory when the test code writes to its registers. That
> surprised me, as I was expecting the test device to essentially
> be "do DMA on command". Is there a reason why we can't have the
> test code do the setting up of the IOMMU page tables itself
> using the qtest functions for writing guest memory? (Obviously
> you'd abstract this out into functions for the purpose in
> libqos/ somewhere.)
>
> If we did it that way, we could use the same test device as
> part of non-SMMUv3 iommu emulation tests too -- the qtest
> test case code would just set up the different IOMMU in
> the way that IOMMU expects before triggering DMA.
>
> thanks
> -- PMM


And Alex's guidance in another 
mail [2](https://lore.kernel.org/qemu-devel/87jz0gxw01.fsf@draig.linaro.org/):

> Yes - generally I think having a single test device that can be used to
> test multiple models will be useful. I guess each qtest will be very
> tied to the SMMU it is modelling as it needs to program both sides but
> if we take care to encapsulate the programming of the test device and
> verification of the results we should be able to ensure good code
> re-use.
>
>> I'll admit this is an area I haven't looked into. I'm very
>> open to ideas—do you or others have suggestions on how this
>> test-device pattern could be generalized or what would be needed to
>> make it useful across different architectures?
> My only initial thought is the device might be better called
> iommu-testdev (as in a device to test IOMMUs, of which the SMMU is one).
>
You nailed the core problem. I hadn't properly thought through the 
separation of concerns, leading to a device that was doing work it 
shouldn't. As Alex pointed out in [2], this architectural refactoring 
allows us to build a generic iommu-testdev to ensure good code re-use. 
This elevates the work from a single-purpose tool into a framework that 
can benefit all IOMMU implementations, which I understand is a far more 
valuable contribution.


Furthermore, Pierrick's comment in mail
[3](https://lore.kernel.org/qemu-devel/792a06cd-302c-46a5-997c-026cb67f8f2e@linaro.org/):

> We have to start somewhere, so something simple and not trying to 
> solve all use cases is the right approach. It can even just be 
> read/write config/registers before trying to add any DMA scenario. 

Also as Alex said in mail 
[4](https://lore.kernel.org/qemu-devel/87ecqoxohg.fsf@draig.linaro.org/):

> We should be thinking of targeted unit tests. The difference between
> this and a full OS is we don't need to manage multiple shifting memory
> maps over time. Setup a page (or two) with the permissions you expect
> and check that works.

The goal is to avoid "accidentally rewriting a driver." Instead, we 
should start simple and provide "targeted unit tests." The idea of 
setting up a simple, static state ("a page or two") to verify atomic 
features and edge cases that are hard to trigger in a dynamic OS is 
exactly the right philosophy for this framework.


Based on this, my plan for V3 is now much clearer:

- Refactor the device: It will be renamed iommu-testdev and become a 
"dumb," generic DMA engine. All architecture-specific logic, including 
the construction of page tables and other structures, will be moved into 
the qtest.

- Abstract for reuse: Following Peter's and Alex's advice, the 
table-building logic will be abstracted into reusable helper functions 
within the libqos/ library.

- Limit the initial scope: As you all suggested, the first set of tests
will be simple unit tests, focusing on core paths such as different
security states and translation stages.
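
To make the intended split concrete, here is a minimal sketch of what
a "dumb" DMA-on-command device could look like. It is an assumption
for illustration only: the register offsets, the ToyTestDev type and
the callback standing in for "the bus behind an IOMMU" are all
hypothetical and are not the V3 design. The point is that the device
holds no knowledge of page tables; whatever sits behind dma_write
decides whether the access translates or faults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register offsets; not taken from any posted patch. */
enum {
    TOY_REG_DMA_ADDR    = 0x0, /* IOVA the device will write to */
    TOY_REG_DMA_VALUE   = 0x4, /* 32-bit pattern to write */
    TOY_REG_DMA_TRIGGER = 0x8, /* any write starts the DMA */
    TOY_REG_DMA_STATUS  = 0xc, /* read-only result of the last DMA */
};

enum { TOY_DMA_IDLE = 0, TOY_DMA_OK = 1, TOY_DMA_FAULT = 2 };

/* Stand-in for the IOMMU-mediated bus: 0 on success, -1 on fault. */
typedef int (*toy_dma_write_fn)(void *opaque, uint32_t iova, uint32_t value);

typedef struct ToyTestDev {
    uint32_t addr;
    uint32_t value;
    uint32_t status;
    toy_dma_write_fn dma_write;
    void *opaque;
} ToyTestDev;

/* Register write handler: latch parameters, do DMA only on TRIGGER. */
static void toy_testdev_write(ToyTestDev *d, uint32_t reg, uint32_t val)
{
    assert(d && d->dma_write);
    switch (reg) {
    case TOY_REG_DMA_ADDR:
        d->addr = val;
        break;
    case TOY_REG_DMA_VALUE:
        d->value = val;
        break;
    case TOY_REG_DMA_TRIGGER:
        d->status = d->dma_write(d->opaque, d->addr, d->value) == 0
                    ? TOY_DMA_OK : TOY_DMA_FAULT;
        break;
    default:
        break; /* STATUS is read-only; unknown offsets are ignored */
    }
}

/* Register read handler. */
static uint32_t toy_testdev_read(const ToyTestDev *d, uint32_t reg)
{
    switch (reg) {
    case TOY_REG_DMA_ADDR:   return d->addr;
    case TOY_REG_DMA_VALUE:  return d->value;
    case TOY_REG_DMA_STATUS: return d->status;
    default:                 return 0;
    }
}

/* Example backend with no translation: a flat array of 32-bit words. */
typedef struct ToyFlatMem {
    uint32_t *words;
    size_t nwords;
} ToyFlatMem;

static int toy_flat_write(void *opaque, uint32_t iova, uint32_t value)
{
    ToyFlatMem *m = opaque;

    if (iova % 4 != 0 || iova / 4 >= m->nwords) {
        return -1;
    }
    m->words[iova / 4] = value;
    return 0;
}
```

In this sketch, swapping toy_flat_write for a backend that performs an
SMMUv3 or VT-d style walk would not require touching the device code,
which is the kind of reuse discussed above.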


One final question to manage the scope of this large refactoring: my 
plan is to implement the generic iommu-testdev framework in V3, but 
provide only the SMMUv3-specific qtest helpers and tests for now. We can 
leave the implementation for other architectures (like VT-d) to future 
work. Does this seem like a reasonable approach?

Thanks again for helping to shape this work.

Best regards,

Tao


[1] (https://lore.kernel.org/qemu-devel/CAFEAcA92dTDn+Zf-GZVv9zQ3_mwJHZY5hrkdgrRyE7XUio4Sjw@mail.gmail.com/)
[2] (https://lore.kernel.org/qemu-devel/87jz0gxw01.fsf@draig.linaro.org/)
[3] (https://lore.kernel.org/qemu-devel/792a06cd-302c-46a5-997c-026cb67f8f2e@linaro.org/)
[4] (https://lore.kernel.org/qemu-devel/87ecqoxohg.fsf@draig.linaro.org/)


Re: [RFC v2 0/2] hw/misc: Introduce a new SMMUv3 test framework
Posted by Alex Bennée 3 weeks, 1 day ago
tangtao1634 <tangtao1634@phytium.com.cn> writes:

> From: Tao Tang <tangtao1634@phytium.com.cn>
>
> This patch series (V2) introduces several cleanups and improvements to the smmu-testdev device. The main goals are to refactor shared code, enhance robustness, and significantly clarify the complex page table construction used for testing.
>
> Motivation
> ----------
>
> Currently, thoroughly testing the SMMUv3 emulation requires a significant
> software stack. We need to boot a full guest operating system (like Linux)
> with the appropriate drivers (e.g., IOMMUFD) and rely on firmware (e.g.,
> ACPI with IORT tables or Hafnium) to correctly configure the SMMU and
> orchestrate DMA from a peripheral device.
>
> This dependency on a complex software stack presents several challenges:
>
> * High Barrier to Entry: Writing targeted tests for specific SMMU
>     features (like fault handling, specific translation regimes, etc.)
>     becomes cumbersome.
>
> * Difficult to Debug: It's hard to distinguish whether a bug originates
>     from the SMMU emulation itself, the guest driver, the firmware
>     tables, or the guest kernel's configuration.
>
> * Slow Iteration: The need to boot a full guest OS slows down the
>     development and testing cycle.
>
> The primary goal of this work is to create a lightweight, self-contained
> testing environment that allows us to exercise the SMMU's core logic
> directly at the qtest level, removing the need for any guest-side
> software.

I agree, an excellent motivation.

>
> Our Approach: A Dedicated Test Device
> -------------------------------------
>
> To achieve this, we introduce two main components:
>
> * A new, minimal hardware device: smmu-testdev.
> * A corresponding qtest that drives this device to generate SMMU-bound
>     traffic.
>
> A key question is, "Why introduce a new smmu-testdev instead of using an
> existing PCIe or platform device?"

I'm curious what the split between PCIe and platform devices that need an
SMMU is. I suspect there is a strong split between the virtualisation
case and the emulation case.

> The answer lies in our goal to minimize complexity. Standard devices,
> whether PCIe or platform, come with their own intricate initialization
> protocols and often require a complex driver state machine to function.
> Using them would re-introduce the very driver-level complexity we aim to
> avoid.
>
> The smmu-testdev is intentionally not a conformant, general-purpose PCIe
> or platform device. It is a purpose-built, highly simplified "DMA engine."
> I've designed it to be analogous to a minimal PCIe Root Complex that
> bypasses the full, realistic topology (Host Bridges, Switches, Endpoints)
> to provide a direct, programmable path for a DMA request to reach the SMMU.
> Its sole purpose is to trigger a DMA transaction when its registers are
> written to, making it perfectly suited for direct control from a test
> environment like qtest.
>
> The Qtest Framework
> -------------------
>
> The new qtest (smmu-testdev-qtest.c) serves as the "bare-metal driver"
> for both the SMMU and the smmu-testdev. It manually performs all the
> setup that would typically be handled by the guest kernel and firmware,
> but in a completely controlled and predictable manner:
>
> 1.  SMMU Configuration: It directly initializes the SMMU's registers to a
>     known state.
>
> 2.  Translation Structure Setup: It manually constructs the necessary
>     translation structures in memory, including Stream Table Entries
>     (STEs), Context Descriptors (CDs), and Page Tables (PTEs).
>
> 3.  DMA Trigger: It programs the smmu-testdev to initiate a DMA operation
>     targeting a specific IOVA.
>
> 4.  Verification: It waits for the transaction to complete and verifies
>     that the memory was accessed correctly after address translation by
>     the SMMU.
>
> This framework provides a solid and extensible foundation for validating
> the SMMU's core translation paths. The initial test included in this
> series covers a basic DMA completion path in the Non-Secure bank,
> serving as a smoke test and a proof of concept.
>
> It is worth noting that this series currently only includes tests for the
> Non-Secure SMMU. I am aware of the ongoing discussions and RFC patches
> for Secure SMMU support. To avoid a dependency on unmerged work, this
> submission does not include tests for the Secure world. However, I have
> already implemented these tests locally, and I am prepared to submit
> them for review as soon as the core Secure SMMU support is merged
> upstream.

What about other IOMMUs? Are there any other bus mediating devices
modelled in QEMU that could also benefit from the ability to trigger DMA
transactions?

>
>
> Changes from v1 RFC:
> - Clarify Page Table Construction:
> Detailed comments have been added to the page table construction logic. This is a key improvement, as the test setup extensively re-uses the same set of page tables for multiple translation stages and purposes (e.g., nested S1/S2 walks, CD fetch). The new comments explain this sharing mechanism, which can otherwise be confusing to follow.
>
> - Refactor Shared Helpers:
> The helper functions std_space_offset and std_space_to_str are now moved to a common header file. This allows them to be used by both the main device implementation (hw/misc/smmu-testdev.c) and its qtest (tests/qtest/smmu-testdev-qtest.c), improving code re-use and maintainability.
>
> - Enhance Robustness:
> Assertions have been added to ensure the device operates only in the expected Non-secure context. Additional conditional checks are also included to prevent potential runtime errors and make the test device more stable.
>
> - Code Simplification and Cleanup:
> Several functions that were redundant with existing macros for constructing Context Descriptors (CD) and Stream Table Entries (STE) have been removed. This simplifies the test data setup and reduces code duplication.
>
> Other unused code fragments have also been removed to improve overall code clarity and hygiene.
>
> Tao Tang (2):
>   hw/misc/smmu-testdev: introduce minimal SMMUv3 test device
>   tests/qtest: add SMMUv3 smoke test using smmu-testdev DMA source
>
>  docs/specs/index.rst             |   1 +
>  docs/specs/smmu-testdev.rst      |  45 ++
>  hw/misc/Kconfig                  |   5 +
>  hw/misc/meson.build              |   1 +
>  hw/misc/smmu-testdev.c           | 943 +++++++++++++++++++++++++++++++
>  include/hw/misc/smmu-testdev.h   | 402 +++++++++++++
>  tests/qtest/meson.build          |   1 +
>  tests/qtest/smmu-testdev-qtest.c | 238 ++++++++
>  8 files changed, 1636 insertions(+)
>  create mode 100644 docs/specs/smmu-testdev.rst
>  create mode 100644 hw/misc/smmu-testdev.c
>  create mode 100644 include/hw/misc/smmu-testdev.h
>  create mode 100644 tests/qtest/smmu-testdev-qtest.c

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
Re: [RFC v2 0/2] hw/misc: Introduce a new SMMUv3 test framework
Posted by Tao Tang 2 weeks, 4 days ago
Hi Alex,

On 2025/10/23 18:06, Alex Bennée wrote:
> tangtao1634 <tangtao1634@phytium.com.cn> writes:
>
>> From: Tao Tang <tangtao1634@phytium.com.cn>
>>
>> This patch series (V2) introduces several cleanups and improvements to the smmu-testdev device. The main goals are to refactor shared code, enhance robustness, and significantly clarify the complex page table construction used for testing.
>>
>> Motivation
>> ----------
>>
>> Currently, thoroughly testing the SMMUv3 emulation requires a significant
>> software stack. We need to boot a full guest operating system (like Linux)
>> with the appropriate drivers (e.g., IOMMUFD) and rely on firmware (e.g.,
>> ACPI with IORT tables or Hafnium) to correctly configure the SMMU and
>> orchestrate DMA from a peripheral device.
>>
>> This dependency on a complex software stack presents several challenges:
>>
>> * High Barrier to Entry: Writing targeted tests for specific SMMU
>>      features (like fault handling, specific translation regimes, etc.)
>>      becomes cumbersome.
>>
>> * Difficult to Debug: It's hard to distinguish whether a bug originates
>>      from the SMMU emulation itself, the guest driver, the firmware
>>      tables, or the guest kernel's configuration.
>>
>> * Slow Iteration: The need to boot a full guest OS slows down the
>>      development and testing cycle.
>>
>> The primary goal of this work is to create a lightweight, self-contained
>> testing environment that allows us to exercise the SMMU's core logic
>> directly at the qtest level, removing the need for any guest-side
>> software.
> I agree, an excellent motivation.
>
>> Our Approach: A Dedicated Test Device
>> -------------------------------------
>>
>> To achieve this, we introduce two main components:
>>
>> * A new, minimal hardware device: smmu-testdev.
>> * A corresponding qtest that drives this device to generate SMMU-bound
>>      traffic.
>>
>> A key question is, "Why introduce a new smmu-testdev instead of using an
>> existing PCIe or platform device?"
> I'm curious what the split between PCIe and platform devices that need an
> SMMU is. I suspect there is a strong split between the virtualisation
> case and the emulation case.


Thanks again for the insightful questions and for sparking this valuable 
discussion.


From my observation of real-world, commercially available SoCs, the
SMMU is almost exclusively designed for and used with PCIe. Of course,
you're right that architecturally, the SMMU specification certainly
allows for non-PCIe clients. Peter's point about using the SMMU for the
GIC ITS in a Realm context is an excellent example. I've also personally
seen similar setups in the TF-A-Tests+FVP software stack, where a platform
device named SMMUv3TestEngine was used to test the SMMU.

However, from the perspective of QEMU's current implementation, there 
are some significant limitations that guided the design of smmu-testdev. 
At the moment, the SMMU model does not really support non-PCIe devices. 
Two key issues are:

- The IOMMU MemoryRegion, which is the primary mechanism for routing DMA
traffic through the IOMMU and is used with PCIe devices, cannot be used
with a platform device.

- Internally, the SMMU code makes assumptions about its clients. For 
instance, the smmu_get_sid() function explicitly expects a PCIe device 
and has no path to acquire a StreamID for a platform device.
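
As a rough illustration of the second point, the sketch below assumes
that the StreamID is simply the 16-bit PCI requester ID, with the bus
number in the upper 8 bits and devfn (5-bit slot, 3-bit function) in
the lower 8. That is my understanding of what smmu_get_sid() computes,
but the code is a standalone approximation with hypothetical toy_*
names, not a quotation of the QEMU source.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a PCI slot (0..31) and function (0..7) into an 8-bit devfn. */
static uint8_t toy_pci_devfn(unsigned slot, unsigned func)
{
    assert(slot < 32 && func < 8);
    return (uint8_t)((slot << 3) | func);
}

/* Assumed StreamID layout: bus number in bits [15:8], devfn in [7:0]. */
static uint16_t toy_pci_sid(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)(((uint16_t)bus << 8) | devfn);
}
```

Under this assumption, a device at bus 1, slot 2, function 3 gets
StreamID 0x0113. A platform device has no bus number or devfn to feed
into such a computation, which is why a separate path would be needed
to assign it a StreamID.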

Given this, the decision to model smmu-testdev as a minimal, PCI-like 
device is a pragmatic one. It aligns with the most common real-world use 
case while also working within the constraints of QEMU's current SMMU 
implementation.

>> The answer lies in our goal to minimize complexity. Standard devices,
>> whether PCIe or platform, come with their own intricate initialization
>> protocols and often require a complex driver state machine to function.
>> Using them would re-introduce the very driver-level complexity we aim to
>> avoid.
>>
>> The smmu-testdev is intentionally not a conformant, general-purpose PCIe
>> or platform device. It is a purpose-built, highly simplified "DMA engine."
>> I've designed it to be analogous to a minimal PCIe Root Complex that
>> bypasses the full, realistic topology (Host Bridges, Switches, Endpoints)
>> to provide a direct, programmable path for a DMA request to reach the SMMU.
>> Its sole purpose is to trigger a DMA transaction when its registers are
>> written to, making it perfectly suited for direct control from a test
>> environment like qtest.
>>
>> The Qtest Framework
>> -------------------
>>
>> The new qtest (smmu-testdev-qtest.c) serves as the "bare-metal driver"
>> for both the SMMU and the smmu-testdev. It manually performs all the
>> setup that would typically be handled by the guest kernel and firmware,
>> but in a completely controlled and predictable manner:
>>
>> 1.  SMMU Configuration: It directly initializes the SMMU's registers to a
>>      known state.
>>
>> 2.  Translation Structure Setup: It manually constructs the necessary
>>      translation structures in memory, including Stream Table Entries
>>      (STEs), Context Descriptors (CDs), and Page Tables (PTEs).
>>
>> 3.  DMA Trigger: It programs the smmu-testdev to initiate a DMA operation
>>      targeting a specific IOVA.
>>
>> 4.  Verification: It waits for the transaction to complete and verifies
>>      that the memory was accessed correctly after address translation by
>>      the SMMU.
>>
>> This framework provides a solid and extensible foundation for validating
>> the SMMU's core translation paths. The initial test included in this
>> series covers a basic DMA completion path in the Non-Secure bank,
>> serving as a smoke test and a proof of concept.
>>
>> It is worth noting that this series currently only includes tests for the
>> Non-Secure SMMU. I am aware of the ongoing discussions and RFC patches
>> for Secure SMMU support. To avoid a dependency on unmerged work, this
>> submission does not include tests for the Secure world. However, I have
>> already implemented these tests locally, and I am prepared to submit
>> them for review as soon as the core Secure SMMU support is merged
>> upstream.
> What about other IOMMUs? Are there any other bus mediating devices
> modelled in QEMU that could also benefit from the ability to trigger DMA
> transactions?


This is a great point that I haven't fully considered. To make sure I 
understand correctly, are you referring to IOMMU implementations for 
other architectures, such as VT-d on x86 or the ongoing IOMMU work for 
RISC-V? I'll admit this is an area I haven't looked into. I'm very open 
to ideas—do you or others have suggestions on how this test-device 
pattern could be generalized or what would be needed to make it useful 
across different architectures?

Thanks again for the great feedback.

Best regards,

Tao


Re: [RFC v2 0/2] hw/misc: Introduce a new SMMUv3 test framework
Posted by Alex Bennée 2 weeks, 4 days ago
Tao Tang <tangtao1634@phytium.com.cn> writes:

> Hi Alex,
>
> On 2025/10/23 18:06, Alex Bennée wrote:
>> tangtao1634 <tangtao1634@phytium.com.cn> writes:
>>
>>> From: Tao Tang <tangtao1634@phytium.com.cn>
>>>
>>> This patch series (V2) introduces several cleanups and improvements
>>> to the smmu-testdev device. The main goals are to refactor shared
>>> code, enhance robustness, and significantly clarify the complex
>>> page table construction used for testing.
>>>
>>> Motivation
>>> ----------
>>>
>>> Currently, thoroughly testing the SMMUv3 emulation requires a significant
>>> software stack. We need to boot a full guest operating system (like Linux)
>>> with the appropriate drivers (e.g., IOMMUFD) and rely on firmware (e.g.,
>>> ACPI with IORT tables or Hafnium) to correctly configure the SMMU and
>>> orchestrate DMA from a peripheral device.
>>>
>>> This dependency on a complex software stack presents several challenges:
>>>
>>> * High Barrier to Entry: Writing targeted tests for specific SMMU
>>>      features (like fault handling, specific translation regimes, etc.)
>>>      becomes cumbersome.
>>>
>>> * Difficult to Debug: It's hard to distinguish whether a bug originates
>>>      from the SMMU emulation itself, the guest driver, the firmware
>>>      tables, or the guest kernel's configuration.
>>>
>>> * Slow Iteration: The need to boot a full guest OS slows down the
>>>      development and testing cycle.
>>>
>>> The primary goal of this work is to create a lightweight, self-contained
>>> testing environment that allows us to exercise the SMMU's core logic
>>> directly at the qtest level, removing the need for any guest-side
>>> software.
>> I agree, an excellent motivation.
>>
>>> Our Approach: A Dedicated Test Device
>>> -------------------------------------
>>>
>>> To achieve this, we introduce two main components:
>>>
>>> * A new, minimal hardware device: smmu-testdev.
>>> * A corresponding qtest that drives this device to generate SMMU-bound
>>>      traffic.
>>>
>>> A key question is, "Why introduce a new smmu-testdev instead of using an
>>> existing PCIe or platform device?"
>> I'm curious what the split between PCIe and platform devices that need an
>> SMMU is. I suspect there is a strong split between the virtualisation
>> case and the emulation case.
>
>
> Thanks again for the insightful questions and for sparking this
> valuable discussion.
>
>
> From my observation of real-world, commercially available SoCs, the
> SMMU is almost exclusively designed for and used with PCIe. Of course,
> you're right that architecturally, the SMMU specification certainly
> allows for non-PCIe clients. Peter's point about using the SMMU for
> the GIC ITS in a Realm context is an excellent example. I've also
> personally seen similar setups in the TF-A-Tests+FVP software stack,
> where a platform device named SMMUv3TestEngine was used to test the
> SMMU.
>
> However, from the perspective of QEMU's current implementation, there
> are some significant limitations that guided the design of
> smmu-testdev. At the moment, the SMMU model does not really support
> non-PCIe devices. Two key issues are:
>
> - The IOMMU MemoryRegion, which is the primary mechanism for routing
>   DMA traffic through the IOMMU and is used with PCIe devices, cannot
>   be used with a platform device.
>
> - Internally, the SMMU code makes assumptions about its clients. For
>   instance, the smmu_get_sid() function explicitly expects a PCIe
>   device and has no path to acquire a StreamID for a platform device.
>
> Given this, the decision to model smmu-testdev as a minimal, PCI-like
> device is a pragmatic one. It aligns with the most common real-world
> use case while also working within the constraints of QEMU's current
> SMMU implementation.

OK that makes sense. One of the main use cases for a modelled SMMU in
QEMU is for developing and testing FEAT_RME (Arm's Confidential
Computing Realm implementation). In that case everything I've seen so
far expects PCI. I guess we can put off any generalisation until we
actually have some use cases that might need it.
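
To illustrate the PCIe assumption discussed above, here is a minimal
sketch, assuming the StreamID of a PCIe client is simply its 16-bit
requester ID (bus number in the upper byte, devfn in the lower byte).
The helper names sketch_devfn() and sketch_pci_sid() are hypothetical
and are not taken from the series. A platform device has no bus/devfn
pair, which is why this kind of derivation gives it no StreamID.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.
 * Assumption: StreamID == PCI requester ID (bus << 8 | devfn).
 */

/* Pack a PCI slot (5 bits) and function (3 bits) into a devfn byte. */
static inline uint8_t sketch_devfn(uint8_t slot, uint8_t func)
{
    assert(slot < 32 && func < 8);
    return (uint8_t)((slot << 3) | func);
}

/* Build the StreamID: bus in bits [15:8], devfn in bits [7:0]. */
static inline uint16_t sketch_pci_sid(uint8_t bus, uint8_t devfn)
{
    return (uint16_t)(((uint16_t)bus << 8) | devfn);
}
```

For example, a device at bus 0, slot 1, function 0 would get StreamID
0x0008 under this assumption.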

>>> The answer lies in our goal to minimize complexity. Standard devices,
>>> whether PCIe or platform, come with their own intricate initialization
>>> protocols and often require a complex driver state machine to function.
>>> Using them would re-introduce the very driver-level complexity we aim to
>>> avoid.
>>>
>>> The smmu-testdev is intentionally not a conformant, general-purpose PCIe
>>> or platform device. It is a purpose-built, highly simplified "DMA engine."
>>> I've designed it to be analogous to a minimal PCIe Root Complex that
>>> bypasses the full, realistic topology (Host Bridges, Switches, Endpoints)
>>> to provide a direct, programmable path for a DMA request to reach the SMMU.
>>> Its sole purpose is to trigger a DMA transaction when its registers are
>>> written to, making it perfectly suited for direct control from a test
>>> environment like qtest.
>>>
>>> The Qtest Framework
>>> -------------------
>>>
>>> The new qtest (smmu-testdev-qtest.c) serves as the "bare-metal driver"
>>> for both the SMMU and the smmu-testdev. It manually performs all the
>>> setup that would typically be handled by the guest kernel and firmware,
>>> but in a completely controlled and predictable manner:
>>>
>>> 1.  SMMU Configuration: It directly initializes the SMMU's registers to a
>>>      known state.
>>>
>>> 2.  Translation Structure Setup: It manually constructs the necessary
>>>      translation structures in memory, including Stream Table Entries
>>>      (STEs), Context Descriptors (CDs), and Page Tables (PTEs).
>>>
>>> 3.  DMA Trigger: It programs the smmu-testdev to initiate a DMA operation
>>>      targeting a specific IOVA.
>>>
>>> 4.  Verification: It waits for the transaction to complete and verifies
>>>      that the memory was accessed correctly after address translation by
>>>      the SMMU.
>>>
>>> This framework provides a solid and extensible foundation for validating
>>> the SMMU's core translation paths. The initial test included in this
>>> series covers a basic DMA completion path in the Non-Secure bank,
>>> serving as a smoke test and a proof of concept.
>>>
>>> It is worth noting that this series currently only includes tests for the
>>> Non-Secure SMMU. I am aware of the ongoing discussions and RFC patches
>>> for Secure SMMU support. To avoid a dependency on unmerged work, this
>>> submission does not include tests for the Secure world. However, I have
>>> already implemented these tests locally, and I am prepared to submit
>>> them for review as soon as the core Secure SMMU support is merged
>>> upstream.
>> What about other IOMMUs? Are there any other bus mediating devices
>> modelled in QEMU that could also benefit from the ability to trigger DMA
>> transactions?
>
>
> This is a great point that I haven't fully considered. To make sure I
> understand correctly, are you referring to IOMMU implementations for
> other architectures, such as VT-d on x86 or the ongoing IOMMU work for
> RISC-V?

Yes. Generally I think having a single test device that can be used to
test multiple models will be useful. I guess each qtest will be closely
tied to the IOMMU it exercises, as it needs to program both sides, but
if we take care to encapsulate the programming of the test device and
the verification of the results we should be able to ensure good code
re-use.
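
One possible shape for that encapsulation, as a rough sketch only: the
IOMMU-specific part sits behind a small ops table, while the "trigger
DMA, then verify" sequence is shared. All names here (SketchIommuOps,
ToyIommu, sketch_run_dma_test, ...) are hypothetical and none of this
is taken from the series. The toy translate hook stands in for the
IOMMU model under test; in a real qtest the translation would be done
by QEMU, and the helpers would use the libqtest accessors instead of a
local array.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MEM_SIZE 4096u

/* Toy stand-in for guest RAM (assumption: flat, 4 KiB). */
typedef struct {
    uint8_t ram[SKETCH_MEM_SIZE];
} SketchGuest;

/* Per-IOMMU hooks: the only part that differs between models. */
typedef struct {
    const char *name;
    /* Program the IOMMU so that 'iova' maps to 'pa'. */
    void (*map)(void *iommu, uint64_t iova, uint64_t pa);
    /* Stand-in for the model's translation; false means a fault. */
    bool (*translate)(void *iommu, uint64_t iova, uint64_t *pa);
} SketchIommuOps;

/* Toy IOMMU holding a single exact-match mapping. */
typedef struct {
    uint64_t iova, pa;
    bool valid;
} ToyIommu;

static void toy_map(void *opaque, uint64_t iova, uint64_t pa)
{
    ToyIommu *m = opaque;
    m->iova = iova;
    m->pa = pa;
    m->valid = true;
}

static bool toy_translate(void *opaque, uint64_t iova, uint64_t *pa)
{
    ToyIommu *m = opaque;
    if (!m->valid || iova != m->iova) {
        return false;
    }
    *pa = m->pa;
    return true;
}

static const SketchIommuOps toy_ops = { "toy", toy_map, toy_translate };

/* Shared: the test device writes a 32-bit pattern to 'iova'. */
static bool sketch_testdev_dma_write(SketchGuest *g,
                                     const SketchIommuOps *ops, void *iommu,
                                     uint64_t iova, uint32_t pattern)
{
    uint64_t pa;

    if (!ops->translate(iommu, iova, &pa) ||
        pa > SKETCH_MEM_SIZE - sizeof(pattern)) {
        return false;
    }
    memcpy(&g->ram[pa], &pattern, sizeof(pattern));
    return true;
}

/* Shared: map, trigger the DMA, then verify the data landed at 'pa'. */
static bool sketch_run_dma_test(SketchGuest *g,
                                const SketchIommuOps *ops, void *iommu,
                                uint64_t iova, uint64_t pa, uint32_t pattern)
{
    uint32_t readback;

    if (pa > SKETCH_MEM_SIZE - sizeof(pattern)) {
        return false;
    }
    ops->map(iommu, iova, pa);
    if (!sketch_testdev_dma_write(g, ops, iommu, iova, pattern)) {
        return false;
    }
    memcpy(&readback, &g->ram[pa], sizeof(readback));
    return readback == pattern;
}
```

With this split, supporting another IOMMU would mean providing a new
ops table, while sketch_run_dma_test() and the test-device helper stay
unchanged.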

> I'll admit this is an area I haven't looked into. I'm very
> open to ideas—do you or others have suggestions on how this
> test-device pattern could be generalized or what would be needed to
> make it useful across different architectures?

My only initial thought is the device might be better called
iommu-testdev (as in a device to test IOMMUs, of which the SMMU is one).

>
> Thanks again for the great feedback.
>
> Best regards,
>
> Tao

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
Re: [RFC v2 0/2] hw/misc: Introduce a new SMMUv3 test framework
Posted by Peter Maydell 3 weeks, 1 day ago
On Thu, 23 Oct 2025 at 11:06, Alex Bennée <alex.bennee@linaro.org> wrote:
> tangtao1634 <tangtao1634@phytium.com.cn> writes:
> > From: Tao Tang <tangtao1634@phytium.com.cn>
> > A key question is, "Why introduce a new smmu-testdev instead of using an
> > existing PCIe or platform device?"
>
> I'm curious what the split is between PCIe and platform devices that
> need an SMMU. I suspect there is a strong split between the
> virtualisation case and the emulation case.

I don't think emulation vs virtualization matters much here.
My impression is that SMMU is almost entirely for PCI/PCIe;
the exception is that for Realm emulation we need to do
granule checks for non-PCI devices like the GIC ITS, and the
easy way to do that is to have the ITS accesses go through
the SMMU as a "NoStreamID" client.

(Linaro has a JIRA card for the Realm/GIC interaction:
 https://linaro.atlassian.net/browse/QEMU-606 )

thanks
-- PMM
Re: [RFC v2 0/2] hw/misc: Introduce a new SMMUv3 test framework
Posted by Tao Tang 3 weeks, 1 day ago
Gentle ping.

Any feedback on this patch series would be appreciated.


V2: 
  https://lore.kernel.org/qemu-devel/20250930165340.42788-1-tangtao1634@phytium.com.cn/

V1: 
  https://lore.kernel.org/qemu-devel/20250925153550.105915-1-tangtao1634@phytium.com.cn/


On 2025/10/1 00:53, tangtao1634 wrote:
> From: Tao Tang <tangtao1634@phytium.com.cn>
>
> This patch series (V2) introduces several cleanups and improvements to the smmu-testdev device. The main goals are to refactor shared code, enhance robustness, and significantly clarify the complex page table construction used for testing.
>
> Motivation
> ----------
>
> Currently, thoroughly testing the SMMUv3 emulation requires a significant
> software stack. We need to boot a full guest operating system (like Linux)
> with the appropriate drivers (e.g., IOMMUFD) and rely on firmware (e.g.,
> ACPI with IORT tables or Hafnium) to correctly configure the SMMU and
> orchestrate DMA from a peripheral device.
>
> This dependency on a complex software stack presents several challenges:
>
> * High Barrier to Entry: Writing targeted tests for specific SMMU
>      features (like fault handling, specific translation regimes, etc.)
>      becomes cumbersome.
>
> * Difficult to Debug: It's hard to distinguish whether a bug originates
>      from the SMMU emulation itself, the guest driver, the firmware
>      tables, or the guest kernel's configuration.
>
> * Slow Iteration: The need to boot a full guest OS slows down the
>      development and testing cycle.
>
> The primary goal of this work is to create a lightweight, self-contained
> testing environment that allows us to exercise the SMMU's core logic
> directly at the qtest level, removing the need for any guest-side software.
>
> Our Approach: A Dedicated Test Device
> -------------------------------------
>
> To achieve this, we introduce two main components:
>
> * A new, minimal hardware device: smmu-testdev.
> * A corresponding qtest that drives this device to generate SMMU-bound
>      traffic.
>
> A key question is, "Why introduce a new smmu-testdev instead of using an
> existing PCIe or platform device?"
>
> The answer lies in our goal to minimize complexity. Standard devices,
> whether PCIe or platform, come with their own intricate initialization
> protocols and often require a complex driver state machine to function.
> Using them would re-introduce the very driver-level complexity we aim to
> avoid.
>
> The smmu-testdev is intentionally not a conformant, general-purpose PCIe
> or platform device. It is a purpose-built, highly simplified "DMA engine."
> I've designed it to be analogous to a minimal PCIe Root Complex that
> bypasses the full, realistic topology (Host Bridges, Switches, Endpoints)
> to provide a direct, programmable path for a DMA request to reach the SMMU.
> Its sole purpose is to trigger a DMA transaction when its registers are
> written to, making it perfectly suited for direct control from a test
> environment like qtest.
>
> The Qtest Framework
> -------------------
>
> The new qtest (smmu-testdev-qtest.c) serves as the "bare-metal driver"
> for both the SMMU and the smmu-testdev. It manually performs all the
> setup that would typically be handled by the guest kernel and firmware,
> but in a completely controlled and predictable manner:
>
> 1.  SMMU Configuration: It directly initializes the SMMU's registers to a
>      known state.
>
> 2.  Translation Structure Setup: It manually constructs the necessary
>      translation structures in memory, including Stream Table Entries
>      (STEs), Context Descriptors (CDs), and Page Tables (PTEs).
>
> 3.  DMA Trigger: It programs the smmu-testdev to initiate a DMA operation
>      targeting a specific IOVA.
>
> 4.  Verification: It waits for the transaction to complete and verifies
>      that the memory was accessed correctly after address translation by
>      the SMMU.
>
> This framework provides a solid and extensible foundation for validating
> the SMMU's core translation paths. The initial test included in this
> series covers a basic DMA completion path in the Non-Secure bank,
> serving as a smoke test and a proof of concept.
>
> It is worth noting that this series currently only includes tests for the
> Non-Secure SMMU. I am aware of the ongoing discussions and RFC patches
> for Secure SMMU support. To avoid a dependency on unmerged work, this
> submission does not include tests for the Secure world. However, I have
> already implemented these tests locally, and I am prepared to submit
> them for review as soon as the core Secure SMMU support is merged
> upstream.
>
>
> Changes from v1 RFC:
> - Clarify Page Table Construction:
>   Detailed comments have been added to the page table construction
>   logic. This is a key improvement, as the test setup extensively
>   re-uses the same set of page tables for multiple translation stages
>   and purposes (e.g., nested S1/S2 walks, CD fetch). The new comments
>   explain this sharing mechanism, which can otherwise be confusing to
>   follow.
>
> - Refactor Shared Helpers:
>   The helper functions std_space_offset and std_space_to_str are now
>   moved to a common header file. This allows them to be used by both
>   the main device implementation (hw/misc/smmu-testdev.c) and its qtest
>   (tests/qtest/smmu-testdev-qtest.c), improving code re-use and
>   maintainability.
>
> - Enhance Robustness:
>   Assertions have been added to ensure the device operates only in the
>   expected Non-secure context. Additional conditional checks are also
>   included to prevent potential runtime errors and make the test device
>   more stable.
>
> - Code Simplification and Cleanup:
>   Several functions that were redundant with existing macros for
>   constructing Context Descriptors (CD) and Stream Table Entries (STE)
>   have been removed. This simplifies the test data setup and reduces
>   code duplication.
>
>   Other unused code fragments have also been removed to improve overall
>   code clarity and hygiene.
>
> Tao Tang (2):
>    hw/misc/smmu-testdev: introduce minimal SMMUv3 test device
>    tests/qtest: add SMMUv3 smoke test using smmu-testdev DMA source
>
>   docs/specs/index.rst             |   1 +
>   docs/specs/smmu-testdev.rst      |  45 ++
>   hw/misc/Kconfig                  |   5 +
>   hw/misc/meson.build              |   1 +
>   hw/misc/smmu-testdev.c           | 943 +++++++++++++++++++++++++++++++
>   include/hw/misc/smmu-testdev.h   | 402 +++++++++++++
>   tests/qtest/meson.build          |   1 +
>   tests/qtest/smmu-testdev-qtest.c | 238 ++++++++
>   8 files changed, 1636 insertions(+)
>   create mode 100644 docs/specs/smmu-testdev.rst
>   create mode 100644 hw/misc/smmu-testdev.c
>   create mode 100644 include/hw/misc/smmu-testdev.h
>   create mode 100644 tests/qtest/smmu-testdev-qtest.c
>