[PATCH] hw/riscv/riscv-iommu: Fix Svnapot 64KB pages

Andrew Jones posted 1 patch 2 weeks, 2 days ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20260508205129.377032-1-andrew.jones@oss.qualcomm.com
Maintainers: Palmer Dabbelt <palmer@dabbelt.com>, Alistair Francis <alistair.francis@wdc.com>, Weiwei Li <liwei1518@gmail.com>, Daniel Henrique Barboza <daniel.barboza@oss.qualcomm.com>, Liu Zhiwei <zhiwei_liu@linux.alibaba.com>, Chao Liu <chao.liu.zevorn@gmail.com>
hw/riscv/riscv-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 40 insertions(+), 4 deletions(-)
[PATCH] hw/riscv/riscv-iommu: Fix Svnapot 64KB pages
Posted by Andrew Jones 2 weeks, 2 days ago
The Svnapot extension encodes a 64KB leaf PTE by setting PTE_N and
storing bits [3:0] of the PPN as a NAPOT size indicator. The IOMMU
model wasn't checking PTE_N and therefore was using the raw (NAPOT-
encoded) PPN directly in the physical address, yielding an address
32 KB above the correct base.

Fix both riscv_iommu_spa_fetch() and pdt_memory_read() by mirroring
the Svnapot handling already present in target/riscv/cpu_helper.c:

  napot_bits  = ctz64(ppn) + 1          /* 4 for 64KB */
  napot_mask  = (1 << napot_bits) - 1   /* 0xF */
  phys_base   = PPN_PHYS(ppn & ~napot_mask)
  page_offset = addr & (PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1))

The spec only defines napot_bits == 4 (64KB); any other value is
treated as a reserved encoding.

This is a fix, rather than new feature support, because the spec
says "IOMMU implementations must support the Svnapot standard
extension for NAPOT Translation Contiguity."

Fixes: 0c54acb8243d ("hw/riscv: add RISC-V IOMMU base emulation")
Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
---
 hw/riscv/riscv-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
index 25a356d1d366..eb09cc974812 100644
--- a/hw/riscv/riscv-iommu.c
+++ b/hw/riscv/riscv-iommu.c
@@ -237,6 +237,25 @@ static bool riscv_iommu_msi_check(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
     return true;
 }
 
+/* Returns the NAPOT page mask, or 0 for reserved encodings. */
+static hwaddr riscv_iommu_napot_page_mask(hwaddr ppn, hwaddr addr, hwaddr *out)
+{
+    int napot_bits = ctz64(ppn) + 1;
+    hwaddr napot_mask, page_mask;
+
+    /* The spec only defines 64KB (napot_bits == 4) */
+    if (napot_bits != 4) {
+        return 0;
+    }
+
+    napot_mask = (1ULL << napot_bits) - 1;
+    page_mask = PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1);
+
+    *out = PPN_PHYS(ppn & ~napot_mask) | (addr & page_mask);
+
+    return page_mask;
+}
+
 /*
  * RISCV IOMMU Address Translation Lookup - Page Table Walk
  *
@@ -458,9 +477,20 @@ static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
         } else {
             /* Leaf PTE, translation completed. */
             sc[pass].step = sc[pass].levels;
-            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
-            /* Update address mask based on smallest translation granularity */
-            iotlb->addr_mask &= (1ULL << va_skip) - 1;
+
+            if (pte & PTE_N) {
+                hwaddr mask = riscv_iommu_napot_page_mask(ppn, addr, &base);
+
+                if (!mask) {
+                    break;
+                }
+                iotlb->addr_mask &= mask;
+            } else {
+                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
+                /* Update address mask based on smallest translation granularity */
+                iotlb->addr_mask &= (1ULL << va_skip) - 1;
+            }
+
             /* Continue with S-Stage translation? */
             if (pass && sc[0].step != sc[0].levels) {
                 pass = S_STAGE;
@@ -997,7 +1027,13 @@ static MemTxResult pdt_memory_read(RISCVIOMMUState *s,
             return MEMTX_ACCESS_ERROR; /* Misaligned PPN */
         } else {
             /* Leaf PTE, translation completed. */
-            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
+            if (pte & PTE_N) {
+                if (!riscv_iommu_napot_page_mask(ppn, addr, &base)) {
+                    return MEMTX_ACCESS_ERROR;
+                }
+            } else {
+                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
+            }
             break;
         }
 
-- 
2.43.0
Re: [PATCH] hw/riscv/riscv-iommu: Fix Svnapot 64KB pages
Posted by Alistair Francis 1 week, 5 days ago
On Sat, May 9, 2026 at 6:52 AM Andrew Jones
<andrew.jones@oss.qualcomm.com> wrote:
>
> The Svnapot extension encodes a 64KB leaf PTE by setting PTE_N and
> storing bits [3:0] of the PPN as a NAPOT size indicator. The IOMMU
> model wasn't checking PTE_N and therefore was using the raw (NAPOT-
> encoded) PPN directly in the physical address, yielding an address
> 32 KB above the correct base.
>
> Fix both riscv_iommu_spa_fetch() and pdt_memory_read() by mirroring
> the Svnapot handling already present in target/riscv/cpu_helper.c:
>
>   napot_bits  = ctz64(ppn) + 1          /* 4 for 64KB */
>   napot_mask  = (1 << napot_bits) - 1   /* 0xF */
>   phys_base   = PPN_PHYS(ppn & ~napot_mask)
>   page_offset = addr & (PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1))
>
> The spec only defines napot_bits == 4 (64KB); any other value is
> treated as a reserved encoding.
>
> This is a fix, rather than new feature support, because the spec
> says "IOMMU implementations must support the Svnapot standard
> extension for NAPOT Translation Contiguity."
>
> Fixes: 0c54acb8243d ("hw/riscv: add RISC-V IOMMU base emulation")
> Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>

Thanks!

Applied to riscv-to-apply.next

Alistair

> ---
>  hw/riscv/riscv-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 40 insertions(+), 4 deletions(-)
>
> diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
> index 25a356d1d366..eb09cc974812 100644
> --- a/hw/riscv/riscv-iommu.c
> +++ b/hw/riscv/riscv-iommu.c
> @@ -237,6 +237,25 @@ static bool riscv_iommu_msi_check(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
>      return true;
>  }
>
> +/* Returns the NAPOT page mask, or 0 for reserved encodings. */
> +static hwaddr riscv_iommu_napot_page_mask(hwaddr ppn, hwaddr addr, hwaddr *out)
> +{
> +    int napot_bits = ctz64(ppn) + 1;
> +    hwaddr napot_mask, page_mask;
> +
> +    /* The spec only defines 64KB (napot_bits == 4) */
> +    if (napot_bits != 4) {
> +        return 0;
> +    }
> +
> +    napot_mask = (1ULL << napot_bits) - 1;
> +    page_mask = PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1);
> +
> +    *out = PPN_PHYS(ppn & ~napot_mask) | (addr & page_mask);
> +
> +    return page_mask;
> +}
> +
>  /*
>   * RISCV IOMMU Address Translation Lookup - Page Table Walk
>   *
> @@ -458,9 +477,20 @@ static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
>          } else {
>              /* Leaf PTE, translation completed. */
>              sc[pass].step = sc[pass].levels;
> -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> -            /* Update address mask based on smallest translation granularity */
> -            iotlb->addr_mask &= (1ULL << va_skip) - 1;
> +
> +            if (pte & PTE_N) {
> +                hwaddr mask = riscv_iommu_napot_page_mask(ppn, addr, &base);
> +
> +                if (!mask) {
> +                    break;
> +                }
> +                iotlb->addr_mask &= mask;
> +            } else {
> +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +                /* Update address mask based on smallest translation granularity */
> +                iotlb->addr_mask &= (1ULL << va_skip) - 1;
> +            }
> +
>              /* Continue with S-Stage translation? */
>              if (pass && sc[0].step != sc[0].levels) {
>                  pass = S_STAGE;
> @@ -997,7 +1027,13 @@ static MemTxResult pdt_memory_read(RISCVIOMMUState *s,
>              return MEMTX_ACCESS_ERROR; /* Misaligned PPN */
>          } else {
>              /* Leaf PTE, translation completed. */
> -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +            if (pte & PTE_N) {
> +                if (!riscv_iommu_napot_page_mask(ppn, addr, &base)) {
> +                    return MEMTX_ACCESS_ERROR;
> +                }
> +            } else {
> +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +            }
>              break;
>          }
>
> --
> 2.43.0
>
>
Re: [PATCH] hw/riscv/riscv-iommu: Fix Svnapot 64KB pages
Posted by Nutty.Liu 1 week, 6 days ago
On 5/9/2026 4:51 AM, Andrew Jones wrote:
> The Svnapot extension encodes a 64KB leaf PTE by setting PTE_N and
> storing bits [3:0] of the PPN as a NAPOT size indicator. The IOMMU
> model wasn't checking PTE_N and therefore was using the raw (NAPOT-
> encoded) PPN directly in the physical address, yielding an address
> 32 KB above the correct base.
>
> Fix both riscv_iommu_spa_fetch() and pdt_memory_read() by mirroring
> the Svnapot handling already present in target/riscv/cpu_helper.c:
>
>    napot_bits  = ctz64(ppn) + 1          /* 4 for 64KB */
>    napot_mask  = (1 << napot_bits) - 1   /* 0xF */
>    phys_base   = PPN_PHYS(ppn & ~napot_mask)
>    page_offset = addr & (PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1))
>
> The spec only defines napot_bits == 4 (64KB); any other value is
> treated as a reserved encoding.
>
> This is a fix, rather than new feature support, because the spec
> says "IOMMU implementations must support the Svnapot standard
> extension for NAPOT Translation Contiguity."
>
> Fixes: 0c54acb8243d ("hw/riscv: add RISC-V IOMMU base emulation")
> Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>

Thanks,
Nutty
> ---
>   hw/riscv/riscv-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++----
>   1 file changed, 40 insertions(+), 4 deletions(-)
>
> diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
> index 25a356d1d366..eb09cc974812 100644
> --- a/hw/riscv/riscv-iommu.c
> +++ b/hw/riscv/riscv-iommu.c
> @@ -237,6 +237,25 @@ static bool riscv_iommu_msi_check(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
>       return true;
>   }
>   
> +/* Returns the NAPOT page mask, or 0 for reserved encodings. */
> +static hwaddr riscv_iommu_napot_page_mask(hwaddr ppn, hwaddr addr, hwaddr *out)
> +{
> +    int napot_bits = ctz64(ppn) + 1;
> +    hwaddr napot_mask, page_mask;
> +
> +    /* The spec only defines 64KB (napot_bits == 4) */
> +    if (napot_bits != 4) {
> +        return 0;
> +    }
> +
> +    napot_mask = (1ULL << napot_bits) - 1;
> +    page_mask = PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1);
> +
> +    *out = PPN_PHYS(ppn & ~napot_mask) | (addr & page_mask);
> +
> +    return page_mask;
> +}
> +
>   /*
>    * RISCV IOMMU Address Translation Lookup - Page Table Walk
>    *
> @@ -458,9 +477,20 @@ static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
>           } else {
>               /* Leaf PTE, translation completed. */
>               sc[pass].step = sc[pass].levels;
> -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> -            /* Update address mask based on smallest translation granularity */
> -            iotlb->addr_mask &= (1ULL << va_skip) - 1;
> +
> +            if (pte & PTE_N) {
> +                hwaddr mask = riscv_iommu_napot_page_mask(ppn, addr, &base);
> +
> +                if (!mask) {
> +                    break;
> +                }
> +                iotlb->addr_mask &= mask;
> +            } else {
> +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +                /* Update address mask based on smallest translation granularity */
> +                iotlb->addr_mask &= (1ULL << va_skip) - 1;
> +            }
> +
>               /* Continue with S-Stage translation? */
>               if (pass && sc[0].step != sc[0].levels) {
>                   pass = S_STAGE;
> @@ -997,7 +1027,13 @@ static MemTxResult pdt_memory_read(RISCVIOMMUState *s,
>               return MEMTX_ACCESS_ERROR; /* Misaligned PPN */
>           } else {
>               /* Leaf PTE, translation completed. */
> -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +            if (pte & PTE_N) {
> +                if (!riscv_iommu_napot_page_mask(ppn, addr, &base)) {
> +                    return MEMTX_ACCESS_ERROR;
> +                }
> +            } else {
> +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +            }
>               break;
>           }
>
Re: [PATCH] hw/riscv/riscv-iommu: Fix Svnapot 64KB pages
Posted by Daniel Henrique Barboza 2 weeks ago

On 5/8/2026 5:51 PM, Andrew Jones wrote:
> The Svnapot extension encodes a 64KB leaf PTE by setting PTE_N and
> storing bits [3:0] of the PPN as a NAPOT size indicator. The IOMMU
> model wasn't checking PTE_N and therefore was using the raw (NAPOT-
> encoded) PPN directly in the physical address, yielding an address
> 32 KB above the correct base.
> 
> Fix both riscv_iommu_spa_fetch() and pdt_memory_read() by mirroring
> the Svnapot handling already present in target/riscv/cpu_helper.c:
> 
>    napot_bits  = ctz64(ppn) + 1          /* 4 for 64KB */
>    napot_mask  = (1 << napot_bits) - 1   /* 0xF */
>    phys_base   = PPN_PHYS(ppn & ~napot_mask)
>    page_offset = addr & (PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1))
> 
> The spec only defines napot_bits == 4 (64KB); any other value is
> treated as a reserved encoding.
> 
> This is a fix, rather than new feature support, because the spec
> says "IOMMU implementations must support the Svnapot standard
> extension for NAPOT Translation Contiguity."
> 
> Fixes: 0c54acb8243d ("hw/riscv: add RISC-V IOMMU base emulation")
> Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
> ---

Reviewed-by: Daniel Henrique Barboza <daniel.barboza@oss.qualcomm.com>

>   hw/riscv/riscv-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++----
>   1 file changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
> index 25a356d1d366..eb09cc974812 100644
> --- a/hw/riscv/riscv-iommu.c
> +++ b/hw/riscv/riscv-iommu.c
> @@ -237,6 +237,25 @@ static bool riscv_iommu_msi_check(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
>       return true;
>   }
>   
> +/* Returns the NAPOT page mask, or 0 for reserved encodings. */
> +static hwaddr riscv_iommu_napot_page_mask(hwaddr ppn, hwaddr addr, hwaddr *out)
> +{
> +    int napot_bits = ctz64(ppn) + 1;
> +    hwaddr napot_mask, page_mask;
> +
> +    /* The spec only defines 64KB (napot_bits == 4) */
> +    if (napot_bits != 4) {
> +        return 0;
> +    }
> +
> +    napot_mask = (1ULL << napot_bits) - 1;
> +    page_mask = PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1);
> +
> +    *out = PPN_PHYS(ppn & ~napot_mask) | (addr & page_mask);
> +
> +    return page_mask;
> +}
> +
>   /*
>    * RISCV IOMMU Address Translation Lookup - Page Table Walk
>    *
> @@ -458,9 +477,20 @@ static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
>           } else {
>               /* Leaf PTE, translation completed. */
>               sc[pass].step = sc[pass].levels;
> -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> -            /* Update address mask based on smallest translation granularity */
> -            iotlb->addr_mask &= (1ULL << va_skip) - 1;
> +
> +            if (pte & PTE_N) {
> +                hwaddr mask = riscv_iommu_napot_page_mask(ppn, addr, &base);
> +
> +                if (!mask) {
> +                    break;
> +                }
> +                iotlb->addr_mask &= mask;
> +            } else {
> +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +                /* Update address mask based on smallest translation granularity */
> +                iotlb->addr_mask &= (1ULL << va_skip) - 1;
> +            }
> +
>               /* Continue with S-Stage translation? */
>               if (pass && sc[0].step != sc[0].levels) {
>                   pass = S_STAGE;
> @@ -997,7 +1027,13 @@ static MemTxResult pdt_memory_read(RISCVIOMMUState *s,
>               return MEMTX_ACCESS_ERROR; /* Misaligned PPN */
>           } else {
>               /* Leaf PTE, translation completed. */
> -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +            if (pte & PTE_N) {
> +                if (!riscv_iommu_napot_page_mask(ppn, addr, &base)) {
> +                    return MEMTX_ACCESS_ERROR;
> +                }
> +            } else {
> +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +            }
>               break;
>           }
>
Re: [PATCH] hw/riscv/riscv-iommu: Fix Svnapot 64KB pages
Posted by Andrew Jones 2 weeks, 2 days ago
On Fri, May 08, 2026 at 03:51:29PM -0500, Andrew Jones wrote:
> The Svnapot extension encodes a 64KB leaf PTE by setting PTE_N and
> storing bits [3:0] of the PPN as a NAPOT size indicator. The IOMMU
> model wasn't checking PTE_N and therefore was using the raw (NAPOT-
> encoded) PPN directly in the physical address, yielding an address
> 32 KB above the correct base.
> 
> Fix both riscv_iommu_spa_fetch() and pdt_memory_read() by mirroring
> the Svnapot handling already present in target/riscv/cpu_helper.c:
> 
>   napot_bits  = ctz64(ppn) + 1          /* 4 for 64KB */
>   napot_mask  = (1 << napot_bits) - 1   /* 0xF */
>   phys_base   = PPN_PHYS(ppn & ~napot_mask)
>   page_offset = addr & (PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1))
> 
> The spec only defines napot_bits == 4 (64KB); any other value is
> treated as a reserved encoding.
> 
> This is a fix, rather than new feature support, because the spec
> says "IOMMU implementations must support the Svnapot standard
> extension for NAPOT Translation Contiguity."
> 
> Fixes: 0c54acb8243d ("hw/riscv: add RISC-V IOMMU base emulation")
> Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>

I should have added

Cc: qemu-stable@nongnu.org

because as soon as [1] is merged into Linux Linux will no longer boot
on any QEMU build without this patch.

[1] https://lore.kernel.org/all/20260508212339.381933-1-andrew.jones@oss.qualcomm.com/

Thanks,
drew

> ---
>  hw/riscv/riscv-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
> index 25a356d1d366..eb09cc974812 100644
> --- a/hw/riscv/riscv-iommu.c
> +++ b/hw/riscv/riscv-iommu.c
> @@ -237,6 +237,25 @@ static bool riscv_iommu_msi_check(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
>      return true;
>  }
>  
> +/* Returns the NAPOT page mask, or 0 for reserved encodings. */
> +static hwaddr riscv_iommu_napot_page_mask(hwaddr ppn, hwaddr addr, hwaddr *out)
> +{
> +    int napot_bits = ctz64(ppn) + 1;
> +    hwaddr napot_mask, page_mask;
> +
> +    /* The spec only defines 64KB (napot_bits == 4) */
> +    if (napot_bits != 4) {
> +        return 0;
> +    }
> +
> +    napot_mask = (1ULL << napot_bits) - 1;
> +    page_mask = PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1);
> +
> +    *out = PPN_PHYS(ppn & ~napot_mask) | (addr & page_mask);
> +
> +    return page_mask;
> +}
> +
>  /*
>   * RISCV IOMMU Address Translation Lookup - Page Table Walk
>   *
> @@ -458,9 +477,20 @@ static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
>          } else {
>              /* Leaf PTE, translation completed. */
>              sc[pass].step = sc[pass].levels;
> -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> -            /* Update address mask based on smallest translation granularity */
> -            iotlb->addr_mask &= (1ULL << va_skip) - 1;
> +
> +            if (pte & PTE_N) {
> +                hwaddr mask = riscv_iommu_napot_page_mask(ppn, addr, &base);
> +
> +                if (!mask) {
> +                    break;
> +                }
> +                iotlb->addr_mask &= mask;
> +            } else {
> +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +                /* Update address mask based on smallest translation granularity */
> +                iotlb->addr_mask &= (1ULL << va_skip) - 1;
> +            }
> +
>              /* Continue with S-Stage translation? */
>              if (pass && sc[0].step != sc[0].levels) {
>                  pass = S_STAGE;
> @@ -997,7 +1027,13 @@ static MemTxResult pdt_memory_read(RISCVIOMMUState *s,
>              return MEMTX_ACCESS_ERROR; /* Misaligned PPN */
>          } else {
>              /* Leaf PTE, translation completed. */
> -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +            if (pte & PTE_N) {
> +                if (!riscv_iommu_napot_page_mask(ppn, addr, &base)) {
> +                    return MEMTX_ACCESS_ERROR;
> +                }
> +            } else {
> +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> +            }
>              break;
>          }
>  
> -- 
> 2.43.0
>
Re: [PATCH] hw/riscv/riscv-iommu: Fix Svnapot 64KB pages
Posted by Tomasz Jeznach 2 weeks ago
Hi Drew,

On Fri, May 8, 2026 at 2:29 PM Andrew Jones
<andrew.jones@oss.qualcomm.com> wrote:
>
> On Fri, May 08, 2026 at 03:51:29PM -0500, Andrew Jones wrote:
> > The Svnapot extension encodes a 64KB leaf PTE by setting PTE_N and
> > storing bits [3:0] of the PPN as a NAPOT size indicator. The IOMMU
> > model wasn't checking PTE_N and therefore was using the raw (NAPOT-
> > encoded) PPN directly in the physical address, yielding an address
> > 32 KB above the correct base.
> >
> > Fix both riscv_iommu_spa_fetch() and pdt_memory_read() by mirroring
> > the Svnapot handling already present in target/riscv/cpu_helper.c:
> >
> >   napot_bits  = ctz64(ppn) + 1          /* 4 for 64KB */
> >   napot_mask  = (1 << napot_bits) - 1   /* 0xF */
> >   phys_base   = PPN_PHYS(ppn & ~napot_mask)
> >   page_offset = addr & (PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1))
> >
> > The spec only defines napot_bits == 4 (64KB); any other value is
> > treated as a reserved encoding.
> >
> > This is a fix, rather than new feature support, because the spec
> > says "IOMMU implementations must support the Svnapot standard
> > extension for NAPOT Translation Contiguity."
> >
> > Fixes: 0c54acb8243d ("hw/riscv: add RISC-V IOMMU base emulation")
> > Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
>
> I should have added
>
> Cc: qemu-stable@nongnu.org
>
> because as soon as [1] is merged into Linux Linux will no longer boot
> on any QEMU build without this patch.
>
> [1] https://lore.kernel.org/all/20260508212339.381933-1-andrew.jones@oss.qualcomm.com/
>
> Thanks,
> drew
>
> > ---
> >  hw/riscv/riscv-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 40 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
> > index 25a356d1d366..eb09cc974812 100644
> > --- a/hw/riscv/riscv-iommu.c
> > +++ b/hw/riscv/riscv-iommu.c
> > @@ -237,6 +237,25 @@ static bool riscv_iommu_msi_check(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
> >      return true;
> >  }
> >
> > +/* Returns the NAPOT page mask, or 0 for reserved encodings. */
> > +static hwaddr riscv_iommu_napot_page_mask(hwaddr ppn, hwaddr addr, hwaddr *out)
> > +{
> > +    int napot_bits = ctz64(ppn) + 1;
> > +    hwaddr napot_mask, page_mask;
> > +
> > +    /* The spec only defines 64KB (napot_bits == 4) */
> > +    if (napot_bits != 4) {
> > +        return 0;
> > +    }
> > +
> > +    napot_mask = (1ULL << napot_bits) - 1;
> > +    page_mask = PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1);
> > +
> > +    *out = PPN_PHYS(ppn & ~napot_mask) | (addr & page_mask);
> > +
> > +    return page_mask;
> > +}
> > +
> >  /*
> >   * RISCV IOMMU Address Translation Lookup - Page Table Walk
> >   *
> > @@ -458,9 +477,20 @@ static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
> >          } else {
> >              /* Leaf PTE, translation completed. */
> >              sc[pass].step = sc[pass].levels;
> > -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> > -            /* Update address mask based on smallest translation granularity */
> > -            iotlb->addr_mask &= (1ULL << va_skip) - 1;
> > +
> > +            if (pte & PTE_N) {

Would it be worth checking PTE_N only on the last level page table
entry (sc[pass].step == sc[pass].levels at the start of the block)?

> > +                hwaddr mask = riscv_iommu_napot_page_mask(ppn, addr, &base);
> > +
> > +                if (!mask) {
> > +                    break;
> > +                }
> > +                iotlb->addr_mask &= mask;
> > +            } else {
> > +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> > +                /* Update address mask based on smallest translation granularity */
> > +                iotlb->addr_mask &= (1ULL << va_skip) - 1;
> > +            }
> > +
> >              /* Continue with S-Stage translation? */
> >              if (pass && sc[0].step != sc[0].levels) {
> >                  pass = S_STAGE;
> > @@ -997,7 +1027,13 @@ static MemTxResult pdt_memory_read(RISCVIOMMUState *s,
> >              return MEMTX_ACCESS_ERROR; /* Misaligned PPN */
> >          } else {
> >              /* Leaf PTE, translation completed. */
> > -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> > +            if (pte & PTE_N) {
> > +                if (!riscv_iommu_napot_page_mask(ppn, addr, &base)) {
> > +                    return MEMTX_ACCESS_ERROR;
> > +                }
> > +            } else {
> > +                base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> > +            }
> >              break;
> >          }
> >
> > --
> > 2.43.0
> >


Thanks for the fix otherwise.
- Tomasz
Re: [PATCH] hw/riscv/riscv-iommu: Fix Svnapot 64KB pages
Posted by Andrew Jones 1 week, 6 days ago
On Mon, May 11, 2026 at 10:20:32AM -0700, Tomasz Jeznach wrote:
> Hi Drew,
> 
> On Fri, May 8, 2026 at 2:29 PM Andrew Jones
> <andrew.jones@oss.qualcomm.com> wrote:
> >
> > On Fri, May 08, 2026 at 03:51:29PM -0500, Andrew Jones wrote:
> > > The Svnapot extension encodes a 64KB leaf PTE by setting PTE_N and
> > > storing bits [3:0] of the PPN as a NAPOT size indicator. The IOMMU
> > > model wasn't checking PTE_N and therefore was using the raw (NAPOT-
> > > encoded) PPN directly in the physical address, yielding an address
> > > 32 KB above the correct base.
> > >
> > > Fix both riscv_iommu_spa_fetch() and pdt_memory_read() by mirroring
> > > the Svnapot handling already present in target/riscv/cpu_helper.c:
> > >
> > >   napot_bits  = ctz64(ppn) + 1          /* 4 for 64KB */
> > >   napot_mask  = (1 << napot_bits) - 1   /* 0xF */
> > >   phys_base   = PPN_PHYS(ppn & ~napot_mask)
> > >   page_offset = addr & (PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1))
> > >
> > > The spec only defines napot_bits == 4 (64KB); any other value is
> > > treated as a reserved encoding.
> > >
> > > This is a fix, rather than new feature support, because the spec
> > > says "IOMMU implementations must support the Svnapot standard
> > > extension for NAPOT Translation Contiguity."
> > >
> > > Fixes: 0c54acb8243d ("hw/riscv: add RISC-V IOMMU base emulation")
> > > Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
> >
> > I should have added
> >
> > Cc: qemu-stable@nongnu.org
> >
> > because as soon as [1] is merged into Linux Linux will no longer boot
> > on any QEMU build without this patch.
> >
> > [1] https://lore.kernel.org/all/20260508212339.381933-1-andrew.jones@oss.qualcomm.com/
> >
> > Thanks,
> > drew
> >
> > > ---
> > >  hw/riscv/riscv-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++----
> > >  1 file changed, 40 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
> > > index 25a356d1d366..eb09cc974812 100644
> > > --- a/hw/riscv/riscv-iommu.c
> > > +++ b/hw/riscv/riscv-iommu.c
> > > @@ -237,6 +237,25 @@ static bool riscv_iommu_msi_check(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
> > >      return true;
> > >  }
> > >
> > > +/* Returns the NAPOT page mask, or 0 for reserved encodings. */
> > > +static hwaddr riscv_iommu_napot_page_mask(hwaddr ppn, hwaddr addr, hwaddr *out)
> > > +{
> > > +    int napot_bits = ctz64(ppn) + 1;
> > > +    hwaddr napot_mask, page_mask;
> > > +
> > > +    /* The spec only defines 64KB (napot_bits == 4) */
> > > +    if (napot_bits != 4) {
> > > +        return 0;
> > > +    }
> > > +
> > > +    napot_mask = (1ULL << napot_bits) - 1;
> > > +    page_mask = PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1);
> > > +
> > > +    *out = PPN_PHYS(ppn & ~napot_mask) | (addr & page_mask);
> > > +
> > > +    return page_mask;
> > > +}
> > > +
> > >  /*
> > >   * RISCV IOMMU Address Translation Lookup - Page Table Walk
> > >   *
> > > @@ -458,9 +477,20 @@ static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
> > >          } else {
> > >              /* Leaf PTE, translation completed. */
> > >              sc[pass].step = sc[pass].levels;
> > > -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> > > -            /* Update address mask based on smallest translation granularity */
> > > -            iotlb->addr_mask &= (1ULL << va_skip) - 1;
> > > +
> > > +            if (pte & PTE_N) {
> 
> Would it be worth checking PTE_N only on the last level page table
> entry (sc[pass].step == sc[pass].levels at the start of the block)?
>

Hi Tomasz,

I don't think we need it since we can't get here with non-zero napot bits
on a superpage due to the "Misaligned PPN" check above. And, if we get
past that and PTE_N is erroneously set, then we'll break from the loop
when riscv_iommu_napot_page_mask() returns zero, allowing us to fault on
the bad napot entry.

Thanks,
drew

Re: [PATCH] hw/riscv/riscv-iommu: Fix Svnapot 64KB pages
Posted by Tomasz Jeznach 1 week, 6 days ago
Hi Drew,

On Mon, May 11, 2026 at 2:07 PM Andrew Jones
<andrew.jones@oss.qualcomm.com> wrote:
>
> On Mon, May 11, 2026 at 10:20:32AM -0700, Tomasz Jeznach wrote:
> > Hi Drew,
> >
> > On Fri, May 8, 2026 at 2:29 PM Andrew Jones
> > <andrew.jones@oss.qualcomm.com> wrote:
> > >
> > > On Fri, May 08, 2026 at 03:51:29PM -0500, Andrew Jones wrote:
> > > > The Svnapot extension encodes a 64KB leaf PTE by setting PTE_N and
> > > > storing bits [3:0] of the PPN as a NAPOT size indicator. The IOMMU
> > > > model wasn't checking PTE_N and therefore was using the raw (NAPOT-
> > > > encoded) PPN directly in the physical address, yielding an address
> > > > 32 KB above the correct base.
> > > >
> > > > Fix both riscv_iommu_spa_fetch() and pdt_memory_read() by mirroring
> > > > the Svnapot handling already present in target/riscv/cpu_helper.c:
> > > >
> > > >   napot_bits  = ctz64(ppn) + 1          /* 4 for 64KB */
> > > >   napot_mask  = (1 << napot_bits) - 1   /* 0xF */
> > > >   phys_base   = PPN_PHYS(ppn & ~napot_mask)
> > > >   page_offset = addr & (PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1))
> > > >
> > > > The spec only defines napot_bits == 4 (64KB); any other value is
> > > > treated as a reserved encoding.
> > > >
> > > > This is a fix, rather than new feature support, because the spec
> > > > says "IOMMU implementations must support the Svnapot standard
> > > > extension for NAPOT Translation Contiguity."
> > > >
> > > > Fixes: 0c54acb8243d ("hw/riscv: add RISC-V IOMMU base emulation")
> > > > Signed-off-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
> > >
> > > I should have added
> > >
> > > Cc: qemu-stable@nongnu.org
> > >
> > > because as soon as [1] is merged into Linux Linux will no longer boot
> > > on any QEMU build without this patch.
> > >
> > > [1] https://lore.kernel.org/all/20260508212339.381933-1-andrew.jones@oss.qualcomm.com/
> > >
> > > Thanks,
> > > drew
> > >
> > > > ---
> > > >  hw/riscv/riscv-iommu.c | 44 ++++++++++++++++++++++++++++++++++++++----
> > > >  1 file changed, 40 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
> > > > index 25a356d1d366..eb09cc974812 100644
> > > > --- a/hw/riscv/riscv-iommu.c
> > > > +++ b/hw/riscv/riscv-iommu.c
> > > > @@ -237,6 +237,25 @@ static bool riscv_iommu_msi_check(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
> > > >      return true;
> > > >  }
> > > >
> > > > +/* Returns the NAPOT page mask, or 0 for reserved encodings. */
> > > > +static hwaddr riscv_iommu_napot_page_mask(hwaddr ppn, hwaddr addr, hwaddr *out)
> > > > +{
> > > > +    int napot_bits = ctz64(ppn) + 1;
> > > > +    hwaddr napot_mask, page_mask;
> > > > +
> > > > +    /* The spec only defines 64KB (napot_bits == 4) */
> > > > +    if (napot_bits != 4) {
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    napot_mask = (1ULL << napot_bits) - 1;
> > > > +    page_mask = PPN_PHYS(napot_mask) | (TARGET_PAGE_SIZE - 1);
> > > > +
> > > > +    *out = PPN_PHYS(ppn & ~napot_mask) | (addr & page_mask);
> > > > +
> > > > +    return page_mask;
> > > > +}
> > > > +
> > > >  /*
> > > >   * RISCV IOMMU Address Translation Lookup - Page Table Walk
> > > >   *
> > > > @@ -458,9 +477,20 @@ static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
> > > >          } else {
> > > >              /* Leaf PTE, translation completed. */
> > > >              sc[pass].step = sc[pass].levels;
> > > > -            base = PPN_PHYS(ppn) | (addr & ((1ULL << va_skip) - 1));
> > > > -            /* Update address mask based on smallest translation granularity */
> > > > -            iotlb->addr_mask &= (1ULL << va_skip) - 1;
> > > > +
> > > > +            if (pte & PTE_N) {
> >
> > Would it be worth checking PTE_N only on the last level page table
> > entry (sc[pass].step == sc[pass].levels at the start of the block)?
> >
>
> Hi Tomasz,
>
> I don't think we need it since we can't get here with non-zero napot bits
> on a superpage due to the "Misaligned PPN" check above. And, if we get
> past that and PTE_N is erroneously set, then we'll break from the loop
> when riscv_iommu_napot_page_mask() returns zero, allowing us to fault on
> the bad napot entry.
>
> Thanks,
> drew

Good point.

Reviewed-by: Tomasz Jeznach <tjeznach@rivosinc.com>

Thanks,
- Tomasz