[PATCH v1 03/14] xen/riscv: introduce ioremap()

Posted by Oleksii Kurochko 8 months, 1 week ago
Based on the RISC-V unprivileged spec (version 20240411):
```
For implementations that conform to the RISC-V Unix Platform Specification,
I/O devices and DMA operations are required to access memory coherently and
via strongly ordered I/O channels. Therefore, accesses to regular main memory
regions that are concurrently accessed by external devices can also use the
standard synchronization mechanisms. Implementations that do not conform
to the Unix Platform Specification and/or in which devices do not access
memory coherently will need to use mechanisms
(which are currently platform-specific or device-specific) to enforce
coherency.

I/O regions in the address space should be considered non-cacheable
regions in the PMAs for those regions. Such regions can be considered coherent
by the PMA if they are not cached by any agent.
```
and [1]:
```
The current riscv linux implementation requires SOC system to support
memory coherence between all I/O devices and CPUs. But some SOC systems
cannot maintain the coherence and they need support cache clean/invalid
operations to synchronize data.

Current implementation is no problem with SiFive FU540, because FU540
keeps all IO devices and DMA master devices coherence with CPU. But to a
traditional SOC vendor, it may already have a stable non-coherency SOC
system, the need is simply to replace the CPU with RV CPU and rebuild
the whole system with IO-coherency is very expensive.
```

and the fact that all CPUs known to me that support the H-extension, and
that are going to be supported by Xen, have memory coherency between all
I/O devices and CPUs, it is currently safe to use the PAGE_HYPERVISOR
attribute.
However, where a platform does not provide memory coherency, it should
support the CMO extensions and Svpbmt; in that scenario, ioremap() will
need to be updated.
For now, a compilation error is generated to ensure that the need to
update ioremap() is not overlooked.

[1] https://patchwork.kernel.org/project/linux-riscv/patch/1555947870-23014-1-git-send-email-guoren@kernel.org/

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 xen/arch/riscv/Kconfig | 12 ++++++++++++
 xen/arch/riscv/pt.c    | 19 +++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/xen/arch/riscv/Kconfig b/xen/arch/riscv/Kconfig
index d882e0a059..27086cca9c 100644
--- a/xen/arch/riscv/Kconfig
+++ b/xen/arch/riscv/Kconfig
@@ -15,6 +15,18 @@ config ARCH_DEFCONFIG
 	string
 	default "arch/riscv/configs/tiny64_defconfig"
 
+config HAS_SVPBMT
+	bool
+	help
+	  This config enables usage of the Svpbmt ISA extension (Supervisor-mode:
+	  page-based memory types).
+
+	  The memory type for a page contains a combination of attributes
+	  that indicate the cacheability, idempotency, and ordering
+	  properties for access to that page.
+
+	  The Svpbmt extension is only available on 64-bit CPUs.
+
 menu "Architecture Features"
 
 source "arch/Kconfig"
diff --git a/xen/arch/riscv/pt.c b/xen/arch/riscv/pt.c
index 857619d48d..e2f49e2f97 100644
--- a/xen/arch/riscv/pt.c
+++ b/xen/arch/riscv/pt.c
@@ -7,6 +7,7 @@
 #include <xen/pfn.h>
 #include <xen/pmap.h>
 #include <xen/spinlock.h>
+#include <xen/vmap.h>
 
 #include <asm/fixmap.h>
 #include <asm/flushtlb.h>
@@ -548,3 +549,21 @@ void clear_fixmap(unsigned int map)
                               FIXMAP_ADDR(map) + PAGE_SIZE) != 0 )
         BUG();
 }
+
+void *ioremap(paddr_t pa, size_t len)
+{
+    mfn_t mfn = _mfn(PFN_DOWN(pa));
+    unsigned int offs = pa & (PAGE_SIZE - 1);
+    unsigned int nr = PFN_UP(offs + len);
+
+#ifdef CONFIG_HAS_SVPBMT
+    #error "an introduction of PAGE_HYPERVISOR_IOREMAP is needed for __vmap()"
+#endif
+
+    void *ptr = __vmap(&mfn, nr, 1, 1, PAGE_HYPERVISOR, VMAP_DEFAULT);
+
+    if ( !ptr )
+        return NULL;
+
+    return ptr + offs;
+}
-- 
2.49.0
Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Jan Beulich 8 months, 1 week ago
On 08.04.2025 17:57, Oleksii Kurochko wrote:
> Based on the RISC-V unprivileged spec (version 20240411):
> [...]

But MMIO access correctness isn't just a matter of coherency. There may not
be any caching involved in most cases, or else you may observe significantly
delayed or even dropped (folded with later ones) writes, and reads may be
serviced from the cache instead of going to actual MMIO. Therefore ...

> --- a/xen/arch/riscv/Kconfig
> +++ b/xen/arch/riscv/Kconfig
> @@ -15,6 +15,18 @@ config ARCH_DEFCONFIG
>  	string
>  	default "arch/riscv/configs/tiny64_defconfig"
>  
> +config HAS_SVPBMT
> +	bool
> +	help
> +	  This config enables usage of the Svpbmt ISA extension (Supervisor-mode:
> +	  page-based memory types).
> +
> +	  The memory type for a page contains a combination of attributes
> +	  that indicate the cacheability, idempotency, and ordering
> +	  properties for access to that page.
> +
> +	  The Svpbmt extension is only available on 64-bit CPUs.

... I kind of expect this extension (or anything else that there might be) will need
making use of.

> @@ -548,3 +549,21 @@ void clear_fixmap(unsigned int map)
>                                FIXMAP_ADDR(map) + PAGE_SIZE) != 0 )
>          BUG();
>  }
> +
> +void *ioremap(paddr_t pa, size_t len)
> +{
> +    mfn_t mfn = _mfn(PFN_DOWN(pa));
> +    unsigned int offs = pa & (PAGE_SIZE - 1);
> +    unsigned int nr = PFN_UP(offs + len);
> +
> +#ifdef CONFIG_HAS_SVPBMT
> +    #error "an introduction of PAGE_HYPERVISOR_IOREMAP is needed for __vmap()"
> +#endif

While, as per above, I don't think this can stay, just in case: As indicated
earlier, pre-processor directives want to have the # in the first column.

Jan
Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Oleksii Kurochko 8 months ago
On 4/10/25 5:13 PM, Jan Beulich wrote:
> On 08.04.2025 17:57, Oleksii Kurochko wrote:
>> [...]
> But MMIO access correctness isn't just a matter of coherency. There may not
> be any caching involved in most cases, or else you may observe significantly
> delayed or even dropped (folded with later ones) writes, and reads may be
> serviced from the cache instead of going to actual MMIO. Therefore ...
>
>> +config HAS_SVPBMT
>> [...]
> ... I kind of expect this extension (or anything else that there might be) will need
> making use of.

In cases where the Svpbmt extension isn't available, PMA (Physical Memory Attributes)
is used to control which memory regions are cacheable, non-cacheable, readable, writable,
etc. PMA is configured in M-mode by the firmware (e.g., OpenSBI), as is done in Andes
cores, or it can be fixed at design time, as in SiFive cores.

In the case of QEMU, I assume it is QEMU's responsibility to properly emulate accesses
to device memory regions. Since QEMU does not appear to provide registers for configuring
PMA, it seems that PMA is not emulated. Additionally, QEMU does not emulate caches.

Based on that, I expect that it is the responsibility of the firmware or the hardware
itself to provide the correct PMA configuration.

I want to note that even when Svpbmt is available, the PMA settings could still be used,
or be overwritten by Svpbmt's attribute value.

I will update the commit message for more clarity.

>
>> [...]
>> +#ifdef CONFIG_HAS_SVPBMT
>> +    #error "an introduction of PAGE_HYPERVISOR_IOREMAP is needed for __vmap()"
>> +#endif
> While, as per above, I don't think this can stay, just in case: As indicated
> earlier, pre-processor directives want to have the # in the first column.

I think it can be safely dropped now; PAGE_HYPERVISOR_IOREMAP could be introduced, and
the PTE's PBMT bits can simply be ignored in pt_update_entry() if Svpbmt isn't implemented.
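
A minimal sketch of that idea, assuming the Svpbmt encoding from the privileged spec
(a 2-bit PBMT field in PTE bits 62:61, with PMA=0, NC=1, IO=2); cpu_has_svpbmt and
pt_sanitise_flags() are made-up names, not existing Xen interfaces:
```
#include <stdbool.h>

/* Sketch only: Svpbmt's PBMT field lives in PTE bits 62:61. */
#define PTE_PBMT_SHIFT  61
#define PTE_PBMT_MASK   (3UL << PTE_PBMT_SHIFT)
#define PTE_PBMT_PMA    (0UL << PTE_PBMT_SHIFT) /* follow the platform PMAs */
#define PTE_PBMT_NC     (1UL << PTE_PBMT_SHIFT) /* non-cacheable, idempotent */
#define PTE_PBMT_IO     (2UL << PTE_PBMT_SHIFT) /* non-cacheable, strongly ordered I/O */

/* I/O mappings would request the IO memory type on top of the usual flags. */
#define PAGE_HYPERVISOR_IOREMAP (PAGE_HYPERVISOR | PTE_PBMT_IO)

extern bool cpu_has_svpbmt;  /* hypothetical feature flag */

/* pt_update_entry() could mask the PBMT bits off when the extension is
 * absent, so the PTE stays architecturally valid and the PMAs apply. */
static unsigned long pt_sanitise_flags(unsigned long flags)
{
    if ( !cpu_has_svpbmt )
        flags &= ~PTE_PBMT_MASK;
    return flags;
}
```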

~ Oleksii
Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Jan Beulich 8 months ago
On 15.04.2025 12:29, Oleksii Kurochko wrote:
> 
> On 4/10/25 5:13 PM, Jan Beulich wrote:
>>> [...]
>> ... I kind of expect this extension (or anything else that there might be) will need
>> making use of.
> 
> In cases where the Svpbmt extension isn't available, PMA (Physical Memory Attributes)
> is used to control which memory regions are cacheable, non-cacheable, readable, writable,
> etc. PMA is configured in M-mode by the firmware (e.g., OpenSBI), as is done in Andes
> cores, or it can be fixed at design time, as in SiFive cores.

How would things work if there was a need to map a RAM page uncacheable (via
ioremap() or otherwise)?

Jan
Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Oleksii Kurochko 8 months ago
On 4/15/25 1:02 PM, Jan Beulich wrote:
> On 15.04.2025 12:29, Oleksii Kurochko wrote:
>> On 4/10/25 5:13 PM, Jan Beulich wrote:
>>>> [...]
>>> ... I kind of expect this extension (or anything else that there might be) will need
>>> making use of.
>> In cases where the Svpbmt extension isn't available, PMA (Physical Memory Attributes)
>> is used to control which memory regions are cacheable, non-cacheable, readable, writable,
>> etc. PMA is configured in M-mode by the firmware (e.g., OpenSBI), as is done in Andes
>> cores, or it can be fixed at design time, as in SiFive cores.
> How would things work if there was a need to map a RAM page uncacheable (via
> ioremap() or otherwise)?

My understanding is that Svpbmt is only needed when someone wants to change the memory
attribute of a page set by PMA.

The question is whether a non-cacheable RAM page is really needed if we have coherency?

~ Oleksii
Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Jan Beulich 8 months ago
On 17.04.2025 16:20, Oleksii Kurochko wrote:
> On 4/15/25 1:02 PM, Jan Beulich wrote:
>> On 15.04.2025 12:29, Oleksii Kurochko wrote:
>>> On 4/10/25 5:13 PM, Jan Beulich wrote:
>>>>> [...]
>>>> ... I kind of expect this extension (or anything else that there might be) will need
>>>> making use of.
>>> In cases where the Svpbmt extension isn't available, PMA (Physical Memory Attributes)
>>> is used to control which memory regions are cacheable, non-cacheable, readable, writable,
>>> etc. PMA is configured in M-mode by the firmware (e.g., OpenSBI), as is done in Andes
>>> cores, or it can be fixed at design time, as in SiFive cores.
>> How would things work if there was a need to map a RAM page uncacheable (via
>> ioremap() or otherwise)?
> 
> My understanding is that Svpbmt is only needed when someone wants to change the memory
> attribute of a page set by PMA.
> 
> The question is whether a non-cacheable RAM page is really needed if we have coherency?

Aiui coherency here is among CPUs. Properties of devices in the system are
largely unknown? (Beyond this there may also be special situations in which
one really cares about data going directly to RAM.)

Jan
Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Oleksii Kurochko 8 months ago
On 4/17/25 4:24 PM, Jan Beulich wrote:
> On 17.04.2025 16:20, Oleksii Kurochko wrote:
>> On 4/15/25 1:02 PM, Jan Beulich wrote:
>>> On 15.04.2025 12:29, Oleksii Kurochko wrote:
>>>>> [...]
>>>> In cases where the Svpbmt extension isn't available, PMA (Physical Memory Attributes)
>>>> is used to control which memory regions are cacheable, non-cacheable, readable, writable,
>>>> etc. PMA is configured in M-mode by the firmware (e.g., OpenSBI), as is done in Andes
>>>> cores, or it can be fixed at design time, as in SiFive cores.
>>> How would things work if there was a need to map a RAM page uncacheable (via
>>> ioremap() or otherwise)?
>> My understanding is that Svpbmt is only needed when someone wants to change the memory
>> attribute of a page set by PMA.
>>
>> The question is whether a non-cacheable RAM page is really needed if we have coherency?
> Aiui coherency here is among CPUs.

```
For implementations that conform to the RISC-V Unix Platform Specification,
I/O devices and DMA operations are required to access memory coherently and
via strongly ordered I/O channels. Therefore, accesses to regular main memory
regions that are concurrently accessed by external devices can also use the
standard synchronization mechanisms. Implementations that do not conform
to the Unix Platform Specification and/or in which devices do not access
memory coherently will need to use mechanisms
(which are currently platform-specific or device-specific) to enforce
coherency.
```
Based on this from the spec, coherency here is not only among CPUs.


> Properties of devices in the system are
> largely unknown?

Yes, but I'm still not sure what kind of property requires an ioremap() that won't work
without Svpbmt. Could you give me an example?

> (Beyond this there may also be special situations in which
> one really cares about data going directly to RAM.)

If there are such special cases, I assume that the firmware or hardware (in the case
of fixed PMA) will provide a non-cacheable region. In that case, the user should be
aware of this region and use it for those specific scenarios.

~ Oleksii
Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Jan Beulich 8 months ago
On 17.04.2025 16:37, Oleksii Kurochko wrote:
> 
> On 4/17/25 4:24 PM, Jan Beulich wrote:
>>> [...]
>> Aiui coherency here is among CPUs.
> 
> ```
> For implementations that conform to the RISC-V Unix Platform Specification,
> I/O devices and DMA operations are required to access memory coherently and
> via strongly ordered I/O channels. Therefore, accesses to regular main memory
> regions that are concurrently accessed by external devices can also use the
> standard synchronization mechanisms. Implementations that do not conform
> to the Unix Platform Specification and/or in which devices do not access
> memory coherently will need to use mechanisms
> (which are currently platform-specific or device-specific) to enforce
> coherency.
> ```
> Based on this from the spec, coherency here is not only among CPUs.
> 
> 
>> Properties of devices in the system are
>> largely unknown?
> 
> Yes, but I'm still not sure what kind of property requires an ioremap() that won't work
> without Svpbmt. Could you give me an example?

Well, above you said they all need to access memory coherently. That's the
"property" I was referring to.

>> (Beyond this there may also be special situations in which
>> one really cares about data going directly to RAM.)
> 
> If there are such special cases, I assume that the firmware or hardware (in the case
> of fixed PMA) will provide a non-cacheable region.

How could they? Firmware may be unaware of specific properties of specific
devices a user adds to a system.

Jan

Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Oleksii Kurochko 7 months, 3 weeks ago
On 4/17/25 4:49 PM, Jan Beulich wrote:
> On 17.04.2025 16:37, Oleksii Kurochko wrote:
>> On 4/17/25 4:24 PM, Jan Beulich wrote:
>> [...]
>> Based on this from the spec, coherency here is not only among CPUs.
>>
>>
>>> Properties of devices in the system are
>>> largely unknown?
>> Yes, but I'm still not sure what kind of property requires an ioremap() that won't work
>> without Svpbmt. Could you give me an example?
> Well, above you said they all need to access memory coherently. That's the
> "property" I was referring to.

Do you mean that a device could have a property which tells that it would like to have a non-cacheable
region used for it? I haven't seen such a property in device tree files.

Do we have cases in Xen where Xen wants to map part of RAM as non-cacheable and that is the only
option?

I am also wondering why a cacheable region + barrier can't be used (if we don't have memory coherency
for everything).

Anyway, if a cacheable mapping + barrier isn't an option, then there is no choice and support
for Svpbmt is required.
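
For reference, a rough sketch of what CMO-based maintenance could look like on a
non-coherent platform, assuming the Zicbom extension and a toolchain that accepts its
mnemonics; cbom_block_size would come from the riscv,cbom-block-size DT property, and
all names here are illustrative:
```
#include <stddef.h>
#include <stdint.h>

/* Assumed to be filled in from the riscv,cbom-block-size DT property. */
static size_t cbom_block_size = 64;

/* Write back (clean) every cache block covering [va, va + size). */
static void dcache_clean_range(const void *va, size_t size)
{
    uintptr_t p = (uintptr_t)va & ~(uintptr_t)(cbom_block_size - 1);
    uintptr_t end = (uintptr_t)va + size;

    for ( ; p < end; p += cbom_block_size )
        asm volatile ( "cbo.clean (%0)" :: "r" (p) : "memory" );

    /* Make the write-back globally visible before any subsequent MMIO kick. */
    asm volatile ( "fence rw, rw" ::: "memory" );
}
```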

>
>>> (Beyond this there may also be special situations in which
>>> one really cares about data going directly to RAM.)
>> If there are such special cases, I assume that the firmware or hardware (in the case
>> of fixed PMA) will provide a non-cacheable region.
> How could they? Firmware may be unaware of specific properties of specific
> devices a user adds to a system.

(This is not a real case, just thoughts.) Firmware could by default provide part of RAM as a
non-cacheable region, and the hypervisor/kernel would then use this region for such allocations.
But I agree that it isn't the best way to manage that.


~ Oleksii

Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Jan Beulich 7 months, 3 weeks ago
On 22.04.2025 10:40, Oleksii Kurochko wrote:
> 
> On 4/17/25 4:49 PM, Jan Beulich wrote:
>> On 17.04.2025 16:37, Oleksii Kurochko wrote:
>>> On 4/17/25 4:24 PM, Jan Beulich wrote:
>>> [...]
>>>> Properties of devices in the system are
>>>> largely unknown?
>>> Yes, but I'm still not sure what kind of property requires an ioremap() that won't work
>>> without Svpbmt. Could you give me an example?
>> Well, above you said they all need to access memory coherently. That's the
>> "property" I was referring to.
> 
> Do you mean that a device could have a property which tells that it would like to have a non-cacheable
> region used for it? I haven't seen such a property in device tree files.
> 
> Do we have cases in Xen where Xen wants to map part of RAM as non-cacheable and that is the only
> option?

On x86 we have the case that IOMMUs may access memory non-coherently. This in
particular means that IOMMU page table updates (which necessarily live in RAM)
need to be done quite carefully. As it's all our code, we deal with the
situation by issuing cache flushes, avoiding the need for UC mappings.

Graphics engines may have similar constraints, aiui. With the driver code not
being part of Xen, we wouldn't be able to use a similar "simplification" there.
UC mappings would be pretty much unavoidable.
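
As a rough illustration of the write-then-flush pattern described above (hypothetical
names, not Xen's actual helpers; on x86 the write-back would be a CLFLUSH-style
operation):
```
#include <stdbool.h>
#include <stdint.h>

extern bool iommu_snoop;                  /* does the IOMMU snoop CPU caches? */
void cache_writeback(const void *va, unsigned int size); /* CLFLUSH-like helper */

static void set_iommu_pte(uint64_t *pte, uint64_t new_pte)
{
    /* The page table lives in ordinary cacheable RAM... */
    *pte = new_pte;

    /* ...so when the IOMMU reads it non-coherently, push the update out
     * of the cache rather than mapping the whole table uncacheable. */
    if ( !iommu_snoop )
        cache_writeback(pte, sizeof(*pte));
}
```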

> I am also wondering why a cacheable region + barrier can't be used (if we don't have memory coherency
> for everything).

Not sure what exactly you're asking here (if anything). An answer would very
likely depend on the specific kind of barrier you're thinking about. The
question would be what, if any, effect a barrier would have on the cache(s).

> Anyway, if a cacheable mapping + barrier isn't an option, then there is no choice
> and support for Svpbmt is required.

Quite possible.

Jan
Re: [PATCH v1 03/14] xen/riscv: introduce ioremap()
Posted by Oleksii Kurochko 7 months, 3 weeks ago
On 4/22/25 11:14 AM, Jan Beulich wrote:
> On 22.04.2025 10:40, Oleksii Kurochko wrote:
>>> [...]
>> Do you mean that a device could have a property which tells that it would like to have a non-cacheable
>> region used for it? I haven't seen such a property in device tree files.
>>
>> Do we have cases in Xen where Xen wants to map part of RAM as non-cacheable and that is the only
>> option?
> On x86 we have the case that IOMMUs may access memory non-coherently. This in
> particular means that IOMMU page table updates (which necessarily live in RAM)
> need to be done quite carefully. As it's all our code, we deal with the
> situation by issuing cache flushes, avoiding the need for UC mappings.
>
> Graphics engines may have similar constraints, aiui. With the driver code not
> being part of Xen, we wouldn't be able to use a similar "simplification" there.
> UC mappings would be pretty much unavoidable.

For this case, it would be better to have Svpbmt.

I would like to note that Svpbmt isn't supported by RV32 architectures. For such cases, it will still
be necessary to play with PMA.
Today I found a patch in the Linux kernel which does something similar to what I wrote in one of my
previous replies:
   [0] https://lore.kernel.org/all/20241102000843.1301099-1-samuel.holland@sifive.com/
The cover letter [0] mentions the following:
   On some RISC-V platforms, including StarFive JH7100 and ESWIN EIC7700,
   RAM is mapped to multiple physical address ranges, with each alias
   having a different set of statically-determined Physical Memory
   Attributes (PMAs). Software selects the PMAs for a page by choosing a
   PFN from the corresponding physical address range. On these platforms,
   this is the only way to allocate noncached memory for use with
   noncoherent DMA.

So the firmware should configure PMAs so that some part of RAM is non-cached, and the kernel then
gets this info based on the binding:
   https://patchew.org/linux/20241102000843.1301099-1-samuel.holland@sifive.com/20241102000843.1301099-2-samuel.holland@sifive.com/

Considering that this feature isn't available even in the Linux kernel, we can start with the
assumption that all our SoCs will support Svpbmt.

We don't really care about the StarFive JH7100 as it doesn't support the H extension, but we should
potentially care about the ESWIN EIC7700, which supports the H extension but not the Svpbmt
extension, according to a publicly available datasheet:
   Each EIC7700X core is configured to support the RV64I base ISA, as well as the Multiply (M), Atomic(A),
   Single-Precision Floating Point (F), Double-Precision Floating Point (D), Compressed (C), CSR
   Instructions (Zicsr), Instruction-Fetch Fence (Zifencei), Address Calculation (Zba), Basic Bit
   Manipulation (Zbb), and Count Overflow and Mode-Based Filtering (Sscofpmf) RISC‑V extensions. This
   is captured by the RISC‑V extension string: RV64GC_Zba_Zbb_Sscofpmf.

>
>> I am also wondering why a cacheable region + barrier can't be used (if we don't have memory coherency
>> for everything).
> Not sure what exactly you're asking here (if anything). An answer would very
> likely depend on the specific kind of barrier you're thinking about. The
> question would be what, if any, effect a barrier would have on the cache(s).

I confused barriers with cache flushes (when I wrote that, I was thinking about the DMA case: we
shouldn't really require non-cacheable memory for DMA, as a memory fence between the use of the DMA
memory and the MMIO access that triggers the DMA should be enough); basically I meant what you wrote
above about x86's IOMMUs.
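
A small sketch of that coherent-DMA pattern on RISC-V: a fence with predecessor set W and
successor set O orders the descriptor writes to RAM before the MMIO doorbell write, so the
device sees a complete descriptor. The descriptor/doorbell layout below is made up:
```
#include <stdint.h>

/* Made-up descriptor and doorbell layout, purely for illustration. */
struct dma_desc {
    uint64_t addr;
    uint32_t len;
    uint32_t flags;
};

static void kick_dma(volatile struct dma_desc *desc,
                     volatile uint32_t *doorbell,
                     uint64_t buf_pa, uint32_t len)
{
    desc->addr  = buf_pa;   /* publish the descriptor in (coherent) RAM */
    desc->len   = len;
    desc->flags = 1;        /* e.g. "owned by device" */

    /* Order the RAM writes (W) before the following MMIO write (O). */
    asm volatile ( "fence w, o" ::: "memory" );

    *doorbell = 1;          /* tell the device to fetch the descriptor */
}
```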

~ Oleksii
