[PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers

Oleksii Kurochko posted 17 patches 4 months, 3 weeks ago
There is a newer version of this series
[PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Oleksii Kurochko 4 months, 3 weeks ago
Implement the mfn_valid() macro to verify whether a given MFN is valid by
checking that it falls within the range [start_page, max_page).
These bounds are initialized based on the start and end addresses of RAM.

As part of this patch, start_page is introduced and initialized with the
PFN of the first RAM page.

Also, after providing a non-stub implementation of the mfn_valid() macro,
the following compilation errors started to occur:
  riscv64-linux-gnu-ld: prelink.o: in function `__next_node':
  /build/xen/./include/xen/nodemask.h:202: undefined reference to `page_is_ram_type'
  riscv64-linux-gnu-ld: prelink.o: in function `get_free_buddy':
  /build/xen/common/page_alloc.c:881: undefined reference to `page_is_ram_type'
  riscv64-linux-gnu-ld: prelink.o: in function `alloc_heap_pages':
  /build/xen/common/page_alloc.c:1043: undefined reference to `page_get_owner_and_reference'
  riscv64-linux-gnu-ld: /build/xen/common/page_alloc.c:1098: undefined reference to `page_is_ram_type'
  riscv64-linux-gnu-ld: prelink.o: in function `ns16550_interrupt':
  /build/xen/drivers/char/ns16550.c:205: undefined reference to `get_page'
  riscv64-linux-gnu-ld: ./.xen-syms.0: hidden symbol `page_get_owner_and_reference' isn't defined
  riscv64-linux-gnu-ld: final link failed: bad value
  make[2]: *** [arch/riscv/Makefile:35: xen-syms] Error 1
To resolve these errors, the following functions have also been introduced,
based on their Arm counterparts:
- page_get_owner_and_reference() and its variant to safely acquire a
  reference to a page and retrieve its owner.
- put_page() and put_page_nr() to release page references and free the page
  when the count drops to zero.
  For put_page_nr(), code related to static memory configuration is wrapped
  with CONFIG_STATIC_MEMORY, as this configuration has not yet been moved to
  common code. Therefore, PGC_static and free_domstatic_page() are not
  introduced for RISC-V. However, since this configuration could be useful
  in the future, the relevant code is retained and conditionally compiled.
- A stub for page_is_ram_type() that currently always returns 0 and asserts
  unreachable, as RAM type checking is not yet implemented.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V2:
 - New patch.
---
 xen/arch/riscv/include/asm/mm.h |  9 ++-
 xen/arch/riscv/mm.c             | 97 +++++++++++++++++++++++++++++++--
 2 files changed, 99 insertions(+), 7 deletions(-)

diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 41bf9002d7..bd8511e5f9 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -5,6 +5,7 @@
 
 #include <public/xen.h>
 #include <xen/bug.h>
+#include <xen/compiler.h>
 #include <xen/const.h>
 #include <xen/mm-frame.h>
 #include <xen/pdx.h>
@@ -288,8 +289,12 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
 #define page_get_owner(p)    (p)->v.inuse.domain
 #define page_set_owner(p, d) ((p)->v.inuse.domain = (d))
 
-/* TODO: implement */
-#define mfn_valid(mfn) ({ (void)(mfn); 0; })
+extern unsigned long start_page;
+
+#define mfn_valid(mfn) ({                                   \
+    unsigned long mfn__ = mfn_x(mfn);                       \
+    likely((mfn__ >= start_page) && (mfn__ < max_page));    \
+})
 
 #define domain_set_alloc_bitsize(d) ((void)(d))
 #define domain_clamp_alloc_bitsize(d, b) ((void)(d), (b))
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index 4047d67c0e..c88908d4f0 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -361,11 +361,6 @@ unsigned long __init calc_phys_offset(void)
     return phys_offset;
 }
 
-void put_page(struct page_info *page)
-{
-    BUG_ON("unimplemented");
-}
-
 void arch_dump_shared_mem_info(void)
 {
     BUG_ON("unimplemented");
@@ -525,6 +520,8 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
 #error setup_{directmap,frametable}_mapping() should be implemented for RV_32
 #endif
 
+unsigned long __read_mostly start_page;
+
 /*
  * Setup memory management
  *
@@ -577,6 +574,8 @@ void __init setup_mm(void)
     }
 
     setup_frametable_mappings(ram_start, ram_end);
+
+    start_page = PFN_DOWN(ram_start);
     max_page = PFN_DOWN(ram_end);
 }
 
@@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
 {
     return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
 }
+
+int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
+{
+    ASSERT_UNREACHABLE();
+
+    return 0;
+}
+
+static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
+                                                      unsigned long nr)
+{
+    unsigned long x, y = page->count_info;
+    struct domain *owner;
+
+    /* Restrict nr to avoid "double" overflow */
+    if ( nr >= PGC_count_mask )
+    {
+        ASSERT_UNREACHABLE();
+        return NULL;
+    }
+
+    do {
+        x = y;
+        /*
+         * Count ==  0: Page is not allocated, so we cannot take a reference.
+         * Count == -1: Reference count would wrap, which is invalid.
+         */
+        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
+            return NULL;
+    }
+    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
+
+    owner = page_get_owner(page);
+    ASSERT(owner);
+
+    return owner;
+}
+
+struct domain *page_get_owner_and_reference(struct page_info *page)
+{
+    return page_get_owner_and_nr_reference(page, 1);
+}
+
+void put_page_nr(struct page_info *page, unsigned long nr)
+{
+    unsigned long nx, x, y = page->count_info;
+
+    do {
+        ASSERT((y & PGC_count_mask) >= nr);
+        x  = y;
+        nx = x - nr;
+    }
+    while ( unlikely((y = cmpxchg(&page->count_info, x, nx)) != x) );
+
+    if ( unlikely((nx & PGC_count_mask) == 0) )
+    {
+#ifdef CONFIG_STATIC_MEMORY
+        if ( unlikely(nx & PGC_static) )
+            free_domstatic_page(page);
+        else
+#endif
+            free_domheap_page(page);
+    }
+}
+
+void put_page(struct page_info *page)
+{
+    put_page_nr(page, 1);
+}
+
+bool get_page_nr(struct page_info *page, const struct domain *domain,
+                 unsigned long nr)
+{
+    const struct domain *owner = page_get_owner_and_nr_reference(page, nr);
+
+    if ( likely(owner == domain) )
+        return true;
+
+    if ( owner != NULL )
+        put_page_nr(page, nr);
+
+    return false;
+}
+
+bool get_page(struct page_info *page, const struct domain *domain)
+{
+    return get_page_nr(page, domain, 1);
+}
-- 
2.49.0
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Jan Beulich 4 months ago
On 10.06.2025 15:05, Oleksii Kurochko wrote:
> Implement the mfn_valid() macro to verify whether a given MFN is valid by
> checking that it falls within the range [start_page, max_page).
> These bounds are initialized based on the start and end addresses of RAM.
> 
> As part of this patch, start_page is introduced and initialized with the
> PFN of the first RAM page.
> 
> Also, after providing a non-stub implementation of the mfn_valid() macro,
> the following compilation errors started to occur:
>   riscv64-linux-gnu-ld: prelink.o: in function `__next_node':
>   /build/xen/./include/xen/nodemask.h:202: undefined reference to `page_is_ram_type'
>   riscv64-linux-gnu-ld: prelink.o: in function `get_free_buddy':
>   /build/xen/common/page_alloc.c:881: undefined reference to `page_is_ram_type'
>   riscv64-linux-gnu-ld: prelink.o: in function `alloc_heap_pages':
>   /build/xen/common/page_alloc.c:1043: undefined reference to `page_get_owner_and_reference'
>   riscv64-linux-gnu-ld: /build/xen/common/page_alloc.c:1098: undefined reference to `page_is_ram_type'
>   riscv64-linux-gnu-ld: prelink.o: in function `ns16550_interrupt':
>   /build/xen/drivers/char/ns16550.c:205: undefined reference to `get_page'
>   riscv64-linux-gnu-ld: ./.xen-syms.0: hidden symbol `page_get_owner_and_reference' isn't defined
>   riscv64-linux-gnu-ld: final link failed: bad value
>   make[2]: *** [arch/riscv/Makefile:35: xen-syms] Error 1
> To resolve these errors, the following functions have also been introduced,
> based on their Arm counterparts:
> - page_get_owner_and_reference() and its variant to safely acquire a
>   reference to a page and retrieve its owner.
> - put_page() and put_page_nr() to release page references and free the page
>   when the count drops to zero.
>   For put_page_nr(), code related to static memory configuration is wrapped
>   with CONFIG_STATIC_MEMORY, as this configuration has not yet been moved to
>   common code. Therefore, PGC_static and free_domstatic_page() are not
>   introduced for RISC-V. However, since this configuration could be useful
>   in the future, the relevant code is retained and conditionally compiled.
> - A stub for page_is_ram_type() that currently always returns 0 and asserts
>   unreachable, as RAM type checking is not yet implemented.

How does this end up working when common code references the function?

> @@ -288,8 +289,12 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
>  #define page_get_owner(p)    (p)->v.inuse.domain
>  #define page_set_owner(p, d) ((p)->v.inuse.domain = (d))
>  
> -/* TODO: implement */
> -#define mfn_valid(mfn) ({ (void)(mfn); 0; })
> +extern unsigned long start_page;
> +
> +#define mfn_valid(mfn) ({                                   \
> +    unsigned long mfn__ = mfn_x(mfn);                       \
> +    likely((mfn__ >= start_page) && (mfn__ < max_page));    \
> +})

I don't think you should try to be clever and avoid using __mfn_valid() here,
at least not without an easily identifiable TODO. Surely you've seen that both
Arm and x86 use it.

Also, according to all I know, likely() doesn't work very well when used like
this, except for architectures supporting conditionally executed insns (like
Arm32 or IA-64, i.e. beyond conditional branches). I.e. if you want to use
likely() here, I think you need two of them.

> @@ -525,6 +520,8 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
>  #error setup_{directmap,frametable}_mapping() should be implemented for RV_32
>  #endif
>  
> +unsigned long __read_mostly start_page;

Memory hotplug question again: __read_mostly or __ro_after_init?

> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>  {
>      return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>  }
> +
> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
> +{
> +    ASSERT_UNREACHABLE();
> +
> +    return 0;
> +}
> +
> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
> +                                                      unsigned long nr)
> +{
> +    unsigned long x, y = page->count_info;
> +    struct domain *owner;
> +
> +    /* Restrict nr to avoid "double" overflow */
> +    if ( nr >= PGC_count_mask )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return NULL;
> +    }

I question the validity of this, already in the Arm original: I can't spot
how the caller guarantees to stay below that limit. Without such an
(attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
any limit check.

> +    do {
> +        x = y;
> +        /*
> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
> +         * Count == -1: Reference count would wrap, which is invalid.
> +         */

May I once again ask that you look carefully at comments (as much as at code)
you copy. Clearly this comment wasn't properly updated when the bumping by 1
was changed to bumping by nr.

> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
> +            return NULL;
> +    }
> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
> +
> +    owner = page_get_owner(page);
> +    ASSERT(owner);
> +
> +    return owner;
> +}
> +
> +struct domain *page_get_owner_and_reference(struct page_info *page)
> +{
> +    return page_get_owner_and_nr_reference(page, 1);
> +}
> +
> +void put_page_nr(struct page_info *page, unsigned long nr)
> +{
> +    unsigned long nx, x, y = page->count_info;
> +
> +    do {
> +        ASSERT((y & PGC_count_mask) >= nr);
> +        x  = y;
> +        nx = x - nr;
> +    }
> +    while ( unlikely((y = cmpxchg(&page->count_info, x, nx)) != x) );
> +
> +    if ( unlikely((nx & PGC_count_mask) == 0) )
> +    {
> +#ifdef CONFIG_STATIC_MEMORY
> +        if ( unlikely(nx & PGC_static) )
> +            free_domstatic_page(page);
> +        else
> +#endif

Such #ifdef-ed-out code is liable to go stale. Minimally use IS_ENABLED().
Even better would imo be if you introduced a "stub" PGC_static, resolving
to 0 (i.e. for now unconditionally).

Jan
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Oleksii Kurochko 3 months, 2 weeks ago
On 7/2/25 12:09 PM, Jan Beulich wrote:
> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>> Implement the mfn_valid() macro to verify whether a given MFN is valid by
>> checking that it falls within the range [start_page, max_page).
>> These bounds are initialized based on the start and end addresses of RAM.
>>
>> As part of this patch, start_page is introduced and initialized with the
>> PFN of the first RAM page.
>>
>> Also, after providing a non-stub implementation of the mfn_valid() macro,
>> the following compilation errors started to occur:
>>    riscv64-linux-gnu-ld: prelink.o: in function `__next_node':
>>    /build/xen/./include/xen/nodemask.h:202: undefined reference to `page_is_ram_type'
>>    riscv64-linux-gnu-ld: prelink.o: in function `get_free_buddy':
>>    /build/xen/common/page_alloc.c:881: undefined reference to `page_is_ram_type'
>>    riscv64-linux-gnu-ld: prelink.o: in function `alloc_heap_pages':
>>    /build/xen/common/page_alloc.c:1043: undefined reference to `page_get_owner_and_reference'
>>    riscv64-linux-gnu-ld: /build/xen/common/page_alloc.c:1098: undefined reference to `page_is_ram_type'
>>    riscv64-linux-gnu-ld: prelink.o: in function `ns16550_interrupt':
>>    /build/xen/drivers/char/ns16550.c:205: undefined reference to `get_page'
>>    riscv64-linux-gnu-ld: ./.xen-syms.0: hidden symbol `page_get_owner_and_reference' isn't defined
>>    riscv64-linux-gnu-ld: final link failed: bad value
>>    make[2]: *** [arch/riscv/Makefile:35: xen-syms] Error 1
>> To resolve these errors, the following functions have also been introduced,
>> based on their Arm counterparts:
>> - page_get_owner_and_reference() and its variant to safely acquire a
>>    reference to a page and retrieve its owner.
>> - put_page() and put_page_nr() to release page references and free the page
>>    when the count drops to zero.
>>    For put_page_nr(), code related to static memory configuration is wrapped
>>    with CONFIG_STATIC_MEMORY, as this configuration has not yet been moved to
>>    common code. Therefore, PGC_static and free_domstatic_page() are not
>>    introduced for RISC-V. However, since this configuration could be useful
>>    in the future, the relevant code is retained and conditionally compiled.
>> - A stub for page_is_ram_type() that currently always returns 0 and asserts
>>    unreachable, as RAM type checking is not yet implemented.
> How does this end up working when common code references the function?

Based on the following commit message:
     Callers are VT-d (so x86 specific) and various bits of page offlining
     support, which although it looks generic (and is in xen/common) does
     things like diving into page_info->count_info which is not generic.
     
     In any case on this is only reachable via XEN_SYSCTL_page_offline_op,
     which clearly shouldn't be called on ARM just yet.

There is no callers for page_is_ram_type() for Arm now, and I expect something similar
for RISC-V. As we don't even introduce hypercalls for RISC-V, we can just live
without it.

>
>> @@ -288,8 +289,12 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
>>   #define page_get_owner(p)    (p)->v.inuse.domain
>>   #define page_set_owner(p, d) ((p)->v.inuse.domain = (d))
>>   
>> -/* TODO: implement */
>> -#define mfn_valid(mfn) ({ (void)(mfn); 0; })
>> +extern unsigned long start_page;
>> +
>> +#define mfn_valid(mfn) ({                                   \
>> +    unsigned long mfn__ = mfn_x(mfn);                       \
>> +    likely((mfn__ >= start_page) && (mfn__ < max_page));    \
>> +})
> I don't think you should try to be clever and avoid using __mfn_valid() here,
> at least not without an easily identifiable TODO. Surely you've seen that both
> Arm and x86 use it.

Overlooked that pdx.c is compiled unconditionally, so I thought that __mfn_valid() common
implementation isn't avaiable as, at the moment, RISC-V doesn't support PDX_COMPRESSION...

> Also, according to all I know, likely() doesn't work very well when used like
> this, except for architectures supporting conditionally executed insns (like
> Arm32 or IA-64, i.e. beyond conditional branches). I.e. if you want to use
> likely() here, I think you need two of them.

... I will update mfn_valid() definition according to your recommendations.

>
>> @@ -525,6 +520,8 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
>>   #error setup_{directmap,frametable}_mapping() should be implemented for RV_32
>>   #endif
>>   
>> +unsigned long __read_mostly start_page;
> Memory hotplug question again: __read_mostly or __ro_after_init?
>
>> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>>   {
>>       return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>>   }
>> +
>> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
>> +{
>> +    ASSERT_UNREACHABLE();
>> +
>> +    return 0;
>> +}
>> +
>> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
>> +                                                      unsigned long nr)
>> +{
>> +    unsigned long x, y = page->count_info;
>> +    struct domain *owner;
>> +
>> +    /* Restrict nr to avoid "double" overflow */
>> +    if ( nr >= PGC_count_mask )
>> +    {
>> +        ASSERT_UNREACHABLE();
>> +        return NULL;
>> +    }
> I question the validity of this, already in the Arm original: I can't spot
> how the caller guarantees to stay below that limit. Without such an
> (attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
> is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
> any limit check.

Agree, it could be really dropped.

>
>> +    do {
>> +        x = y;
>> +        /*
>> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
>> +         * Count == -1: Reference count would wrap, which is invalid.
>> +         */
> May I once again ask that you look carefully at comments (as much as at code)
> you copy. Clearly this comment wasn't properly updated when the bumping by 1
> was changed to bumping by nr.
>
>> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
>> +            return NULL;
>> +    }
>> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
>> +
>> +    owner = page_get_owner(page);
>> +    ASSERT(owner);
>> +
>> +    return owner;
>> +}
>> +
>> +struct domain *page_get_owner_and_reference(struct page_info *page)
>> +{
>> +    return page_get_owner_and_nr_reference(page, 1);
>> +}
>> +
>> +void put_page_nr(struct page_info *page, unsigned long nr)
>> +{
>> +    unsigned long nx, x, y = page->count_info;
>> +
>> +    do {
>> +        ASSERT((y & PGC_count_mask) >= nr);
>> +        x  = y;
>> +        nx = x - nr;
>> +    }
>> +    while ( unlikely((y = cmpxchg(&page->count_info, x, nx)) != x) );
>> +
>> +    if ( unlikely((nx & PGC_count_mask) == 0) )
>> +    {
>> +#ifdef CONFIG_STATIC_MEMORY
>> +        if ( unlikely(nx & PGC_static) )
>> +            free_domstatic_page(page);
>> +        else
>> +#endif
> Such #ifdef-ed-out code is liable to go stale. Minimally use IS_ENABLED().
> Even better would imo be if you introduced a "stub" PGC_static, resolving
> to 0 (i.e. for now unconditionally).

An introduction of PGC_static would be better.

Thanks.

~ Oleksii
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Jan Beulich 3 months, 1 week ago
On 18.07.2025 16:49, Oleksii Kurochko wrote:
> On 7/2/25 12:09 PM, Jan Beulich wrote:
>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>> Implement the mfn_valid() macro to verify whether a given MFN is valid by
>>> checking that it falls within the range [start_page, max_page).
>>> These bounds are initialized based on the start and end addresses of RAM.
>>>
>>> As part of this patch, start_page is introduced and initialized with the
>>> PFN of the first RAM page.
>>>
>>> Also, after providing a non-stub implementation of the mfn_valid() macro,
>>> the following compilation errors started to occur:
>>>    riscv64-linux-gnu-ld: prelink.o: in function `__next_node':
>>>    /build/xen/./include/xen/nodemask.h:202: undefined reference to `page_is_ram_type'
>>>    riscv64-linux-gnu-ld: prelink.o: in function `get_free_buddy':
>>>    /build/xen/common/page_alloc.c:881: undefined reference to `page_is_ram_type'
>>>    riscv64-linux-gnu-ld: prelink.o: in function `alloc_heap_pages':
>>>    /build/xen/common/page_alloc.c:1043: undefined reference to `page_get_owner_and_reference'
>>>    riscv64-linux-gnu-ld: /build/xen/common/page_alloc.c:1098: undefined reference to `page_is_ram_type'
>>>    riscv64-linux-gnu-ld: prelink.o: in function `ns16550_interrupt':
>>>    /build/xen/drivers/char/ns16550.c:205: undefined reference to `get_page'
>>>    riscv64-linux-gnu-ld: ./.xen-syms.0: hidden symbol `page_get_owner_and_reference' isn't defined
>>>    riscv64-linux-gnu-ld: final link failed: bad value
>>>    make[2]: *** [arch/riscv/Makefile:35: xen-syms] Error 1
>>> To resolve these errors, the following functions have also been introduced,
>>> based on their Arm counterparts:
>>> - page_get_owner_and_reference() and its variant to safely acquire a
>>>    reference to a page and retrieve its owner.
>>> - put_page() and put_page_nr() to release page references and free the page
>>>    when the count drops to zero.
>>>    For put_page_nr(), code related to static memory configuration is wrapped
>>>    with CONFIG_STATIC_MEMORY, as this configuration has not yet been moved to
>>>    common code. Therefore, PGC_static and free_domstatic_page() are not
>>>    introduced for RISC-V. However, since this configuration could be useful
>>>    in the future, the relevant code is retained and conditionally compiled.
>>> - A stub for page_is_ram_type() that currently always returns 0 and asserts
>>>    unreachable, as RAM type checking is not yet implemented.
>> How does this end up working when common code references the function?
> 
> Based on the following commit message:
>      Callers are VT-d (so x86 specific) and various bits of page offlining
>      support, which although it looks generic (and is in xen/common) does
>      things like diving into page_info->count_info which is not generic.
>      
>      In any case on this is only reachable via XEN_SYSCTL_page_offline_op,
>      which clearly shouldn't be called on ARM just yet.

Assuming this is from an old commit, then I have to question this justification.
I see nothing preventing XEN_SYSCTL_page_offline_op to be invoked on an Arm
system. Hence (unless I'm overlooking somthing) ASSERT_UNREACHABLE() is simply
inappropriate (and wants fixing). Luckily it being sysctl-s only, there's no
need for an XSA. In no case should known flawed code be copied into another
port.

Jan
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Jan Beulich 3 months, 1 week ago
On 18.07.2025 16:49, Oleksii Kurochko wrote:
> On 7/2/25 12:09 PM, Jan Beulich wrote:
>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>> Implement the mfn_valid() macro to verify whether a given MFN is valid by
>>> checking that it falls within the range [start_page, max_page).
>>> These bounds are initialized based on the start and end addresses of RAM.
>>>
>>> As part of this patch, start_page is introduced and initialized with the
>>> PFN of the first RAM page.
>>>
>>> Also, after providing a non-stub implementation of the mfn_valid() macro,
>>> the following compilation errors started to occur:
>>>    riscv64-linux-gnu-ld: prelink.o: in function `__next_node':
>>>    /build/xen/./include/xen/nodemask.h:202: undefined reference to `page_is_ram_type'
>>>    riscv64-linux-gnu-ld: prelink.o: in function `get_free_buddy':
>>>    /build/xen/common/page_alloc.c:881: undefined reference to `page_is_ram_type'
>>>    riscv64-linux-gnu-ld: prelink.o: in function `alloc_heap_pages':
>>>    /build/xen/common/page_alloc.c:1043: undefined reference to `page_get_owner_and_reference'
>>>    riscv64-linux-gnu-ld: /build/xen/common/page_alloc.c:1098: undefined reference to `page_is_ram_type'
>>>    riscv64-linux-gnu-ld: prelink.o: in function `ns16550_interrupt':
>>>    /build/xen/drivers/char/ns16550.c:205: undefined reference to `get_page'
>>>    riscv64-linux-gnu-ld: ./.xen-syms.0: hidden symbol `page_get_owner_and_reference' isn't defined
>>>    riscv64-linux-gnu-ld: final link failed: bad value
>>>    make[2]: *** [arch/riscv/Makefile:35: xen-syms] Error 1
>>> To resolve these errors, the following functions have also been introduced,
>>> based on their Arm counterparts:
>>> - page_get_owner_and_reference() and its variant to safely acquire a
>>>    reference to a page and retrieve its owner.
>>> - put_page() and put_page_nr() to release page references and free the page
>>>    when the count drops to zero.
>>>    For put_page_nr(), code related to static memory configuration is wrapped
>>>    with CONFIG_STATIC_MEMORY, as this configuration has not yet been moved to
>>>    common code. Therefore, PGC_static and free_domstatic_page() are not
>>>    introduced for RISC-V. However, since this configuration could be useful
>>>    in the future, the relevant code is retained and conditionally compiled.
>>> - A stub for page_is_ram_type() that currently always returns 0 and asserts
>>>    unreachable, as RAM type checking is not yet implemented.
>> How does this end up working when common code references the function?
> 
> Based on the following commit message:
>      Callers are VT-d (so x86 specific) and various bits of page offlining
>      support, which although it looks generic (and is in xen/common) does
>      things like diving into page_info->count_info which is not generic.
>      
>      In any case on this is only reachable via XEN_SYSCTL_page_offline_op,
>      which clearly shouldn't be called on ARM just yet.

What commit message are you talking about? Nothing like the above is anywhere
in this patch.

> There is no callers for page_is_ram_type() for Arm now, and I expect something similar
> for RISC-V. As we don't even introduce hypercalls for RISC-V, we can just live
> without it.

If there's no caller, why the stub?

Jan
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Oleksii Kurochko 3 months, 1 week ago
On 7/21/25 3:42 PM, Jan Beulich wrote:
> On 18.07.2025 16:49, Oleksii Kurochko wrote:
>> On 7/2/25 12:09 PM, Jan Beulich wrote:
>>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>>> Implement the mfn_valid() macro to verify whether a given MFN is valid by
>>>> checking that it falls within the range [start_page, max_page).
>>>> These bounds are initialized based on the start and end addresses of RAM.
>>>>
>>>> As part of this patch, start_page is introduced and initialized with the
>>>> PFN of the first RAM page.
>>>>
>>>> Also, after providing a non-stub implementation of the mfn_valid() macro,
>>>> the following compilation errors started to occur:
>>>>     riscv64-linux-gnu-ld: prelink.o: in function `__next_node':
>>>>     /build/xen/./include/xen/nodemask.h:202: undefined reference to `page_is_ram_type'
>>>>     riscv64-linux-gnu-ld: prelink.o: in function `get_free_buddy':
>>>>     /build/xen/common/page_alloc.c:881: undefined reference to `page_is_ram_type'
>>>>     riscv64-linux-gnu-ld: prelink.o: in function `alloc_heap_pages':
>>>>     /build/xen/common/page_alloc.c:1043: undefined reference to `page_get_owner_and_reference'
>>>>     riscv64-linux-gnu-ld: /build/xen/common/page_alloc.c:1098: undefined reference to `page_is_ram_type'
>>>>     riscv64-linux-gnu-ld: prelink.o: in function `ns16550_interrupt':
>>>>     /build/xen/drivers/char/ns16550.c:205: undefined reference to `get_page'
>>>>     riscv64-linux-gnu-ld: ./.xen-syms.0: hidden symbol `page_get_owner_and_reference' isn't defined
>>>>     riscv64-linux-gnu-ld: final link failed: bad value
>>>>     make[2]: *** [arch/riscv/Makefile:35: xen-syms] Error 1
>>>> To resolve these errors, the following functions have also been introduced,
>>>> based on their Arm counterparts:
>>>> - page_get_owner_and_reference() and its variant to safely acquire a
>>>>     reference to a page and retrieve its owner.
>>>> - put_page() and put_page_nr() to release page references and free the page
>>>>     when the count drops to zero.
>>>>     For put_page_nr(), code related to static memory configuration is wrapped
>>>>     with CONFIG_STATIC_MEMORY, as this configuration has not yet been moved to
>>>>     common code. Therefore, PGC_static and free_domstatic_page() are not
>>>>     introduced for RISC-V. However, since this configuration could be useful
>>>>     in the future, the relevant code is retained and conditionally compiled.
>>>> - A stub for page_is_ram_type() that currently always returns 0 and asserts
>>>>     unreachable, as RAM type checking is not yet implemented.
>>> How does this end up working when common code references the function?
>> Based on the following commit message:
>>       Callers are VT-d (so x86 specific) and various bits of page offlining
>>       support, which although it looks generic (and is in xen/common) does
>>       things like diving into page_info->count_info which is not generic.
>>       
>>       In any case on this is only reachable via XEN_SYSCTL_page_offline_op,
>>       which clearly shouldn't be called on ARM just yet.
> What commit message are you talking about? Nothing like the above is anywhere
> in this patch.

It's pretty old commit:
   commit 214c4cd94a80bcaf042f25158eaa7d0e5bbc3b5b
   Author: Ian Campbell<ian.campbell@citrix.com>
   Date:   Wed Dec 19 14:16:23 2012 +0000

But since that time page_is_ram_type() hasn't been changed for Arm.

>
>> There is no callers for page_is_ram_type() for Arm now, and I expect something similar
>> for RISC-V. As we don't even introduce hypercalls for RISC-V, we can just live
>> without it.
> If there's no caller, why the stub?

Because that parts of common code which are using it aren't under ifdef, for example, this
one function:
   int offline_page(mfn_t mfn, int broken, uint32_t *status)

And specifically this function is called when XEN_SYSCTL_page_offline_op is handled.

But I agree that it seems like nothing prevents to call XEN_SYSCTL_page_offline_op.

~ Oleksii
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Orzel, Michal 4 months ago

On 02/07/2025 12:09, Jan Beulich wrote:
> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>> Implement the mfn_valid() macro to verify whether a given MFN is valid by
>> checking that it falls within the range [start_page, max_page).
>> These bounds are initialized based on the start and end addresses of RAM.
>>
>> As part of this patch, start_page is introduced and initialized with the
>> PFN of the first RAM page.
>>
>> Also, after providing a non-stub implementation of the mfn_valid() macro,
>> the following compilation errors started to occur:
>>   riscv64-linux-gnu-ld: prelink.o: in function `__next_node':
>>   /build/xen/./include/xen/nodemask.h:202: undefined reference to `page_is_ram_type'
>>   riscv64-linux-gnu-ld: prelink.o: in function `get_free_buddy':
>>   /build/xen/common/page_alloc.c:881: undefined reference to `page_is_ram_type'
>>   riscv64-linux-gnu-ld: prelink.o: in function `alloc_heap_pages':
>>   /build/xen/common/page_alloc.c:1043: undefined reference to `page_get_owner_and_reference'
>>   riscv64-linux-gnu-ld: /build/xen/common/page_alloc.c:1098: undefined reference to `page_is_ram_type'
>>   riscv64-linux-gnu-ld: prelink.o: in function `ns16550_interrupt':
>>   /build/xen/drivers/char/ns16550.c:205: undefined reference to `get_page'
>>   riscv64-linux-gnu-ld: ./.xen-syms.0: hidden symbol `page_get_owner_and_reference' isn't defined
>>   riscv64-linux-gnu-ld: final link failed: bad value
>>   make[2]: *** [arch/riscv/Makefile:35: xen-syms] Error 1
>> To resolve these errors, the following functions have also been introduced,
>> based on their Arm counterparts:
>> - page_get_owner_and_reference() and its variant to safely acquire a
>>   reference to a page and retrieve its owner.
>> - put_page() and put_page_nr() to release page references and free the page
>>   when the count drops to zero.
>>   For put_page_nr(), code related to static memory configuration is wrapped
>>   with CONFIG_STATIC_MEMORY, as this configuration has not yet been moved to
>>   common code. Therefore, PGC_static and free_domstatic_page() are not
>>   introduced for RISC-V. However, since this configuration could be useful
>>   in the future, the relevant code is retained and conditionally compiled.
>> - A stub for page_is_ram_type() that currently always returns 0 and asserts
>>   unreachable, as RAM type checking is not yet implemented.
> 
> How does this end up working when common code references the function?
> 
>> @@ -288,8 +289,12 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
>>  #define page_get_owner(p)    (p)->v.inuse.domain
>>  #define page_set_owner(p, d) ((p)->v.inuse.domain = (d))
>>  
>> -/* TODO: implement */
>> -#define mfn_valid(mfn) ({ (void)(mfn); 0; })
>> +extern unsigned long start_page;
>> +
>> +#define mfn_valid(mfn) ({                                   \
>> +    unsigned long mfn__ = mfn_x(mfn);                       \
>> +    likely((mfn__ >= start_page) && (mfn__ < max_page));    \
>> +})
> 
> I don't think you should try to be clever and avoid using __mfn_valid() here,
> at least not without an easily identifiable TODO. Surely you've seen that both
> Arm and x86 use it.
> 
> Also, according to all I know, likely() doesn't work very well when used like
> this, except for architectures supporting conditionally executed insns (like
> Arm32 or IA-64, i.e. beyond conditional branches). I.e. if you want to use
> likely() here, I think you need two of them.
> 
>> @@ -525,6 +520,8 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
>>  #error setup_{directmap,frametable}_mapping() should be implemented for RV_32
>>  #endif
>>  
>> +unsigned long __read_mostly start_page;
> 
> Memory hotplug question again: __read_mostly or __ro_after_init?
> 
>> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>>  {
>>      return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>>  }
>> +
>> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
>> +{
>> +    ASSERT_UNREACHABLE();
>> +
>> +    return 0;
>> +}
>> +
>> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
>> +                                                      unsigned long nr)
>> +{
>> +    unsigned long x, y = page->count_info;
>> +    struct domain *owner;
>> +
>> +    /* Restrict nr to avoid "double" overflow */
>> +    if ( nr >= PGC_count_mask )
>> +    {
>> +        ASSERT_UNREACHABLE();
>> +        return NULL;
>> +    }
> 
> I question the validity of this, already in the Arm original: I can't spot
> how the caller guarantees to stay below that limit. Without such an
> (attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
> is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
> any limit check.
Honestly I don't know why this assert was placed here. I checked the code and we
don't limit nr_shm_borrowers in any place, so in theory it's possible to end up
here.

~Michal

> 
>> +    do {
>> +        x = y;
>> +        /*
>> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
>> +         * Count == -1: Reference count would wrap, which is invalid.
>> +         */
> 
> May I once again ask that you look carefully at comments (as much as at code)
> you copy. Clearly this comment wasn't properly updated when the bumping by 1
> was changed to bumping by nr.
> 
>> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
>> +            return NULL;
>> +    }
>> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
>> +
>> +    owner = page_get_owner(page);
>> +    ASSERT(owner);
>> +
>> +    return owner;
>> +}
>> +
>> +struct domain *page_get_owner_and_reference(struct page_info *page)
>> +{
>> +    return page_get_owner_and_nr_reference(page, 1);
>> +}
>> +
>> +void put_page_nr(struct page_info *page, unsigned long nr)
>> +{
>> +    unsigned long nx, x, y = page->count_info;
>> +
>> +    do {
>> +        ASSERT((y & PGC_count_mask) >= nr);
>> +        x  = y;
>> +        nx = x - nr;
>> +    }
>> +    while ( unlikely((y = cmpxchg(&page->count_info, x, nx)) != x) );
>> +
>> +    if ( unlikely((nx & PGC_count_mask) == 0) )
>> +    {
>> +#ifdef CONFIG_STATIC_MEMORY
>> +        if ( unlikely(nx & PGC_static) )
>> +            free_domstatic_page(page);
>> +        else
>> +#endif
> 
> Such #ifdef-ed-out code is liable to go stale. Minimally use IS_ENABLED().
> Even better would imo be if you introduced a "stub" PGC_static, resolving
> to 0 (i.e. for now unconditionally).
> 
> Jan
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Jan Beulich 4 months ago
On 02.07.2025 12:09, Jan Beulich wrote:
> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>>  {
>>      return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>>  }
>> +
>> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
>> +{
>> +    ASSERT_UNREACHABLE();
>> +
>> +    return 0;
>> +}
>> +
>> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
>> +                                                      unsigned long nr)
>> +{
>> +    unsigned long x, y = page->count_info;
>> +    struct domain *owner;
>> +
>> +    /* Restrict nr to avoid "double" overflow */
>> +    if ( nr >= PGC_count_mask )
>> +    {
>> +        ASSERT_UNREACHABLE();
>> +        return NULL;
>> +    }
> 
> I question the validity of this, already in the Arm original: I can't spot
> how the caller guarantees to stay below that limit. Without such an
> (attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
> is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
> any limit check.
> 
>> +    do {
>> +        x = y;
>> +        /*
>> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
>> +         * Count == -1: Reference count would wrap, which is invalid.
>> +         */
> 
> May I once again ask that you look carefully at comments (as much as at code)
> you copy. Clearly this comment wasn't properly updated when the bumping by 1
> was changed to bumping by nr.
> 
>> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
>> +            return NULL;
>> +    }
>> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
>> +
>> +    owner = page_get_owner(page);
>> +    ASSERT(owner);
>> +
>> +    return owner;
>> +}

There also looks to be a dead code concern here (towards the "nr" parameters
here and elsewhere, when STATIC_SHM=n). Just that apparently we decided to
leave out Misra rule 2.2 entirely.

Jan
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Oleksii Kurochko 3 months, 2 weeks ago
On 7/2/25 12:28 PM, Jan Beulich wrote:
> On 02.07.2025 12:09, Jan Beulich wrote:
>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>>>   {
>>>       return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>>>   }
>>> +
>>> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
>>> +{
>>> +    ASSERT_UNREACHABLE();
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
>>> +                                                      unsigned long nr)
>>> +{
>>> +    unsigned long x, y = page->count_info;
>>> +    struct domain *owner;
>>> +
>>> +    /* Restrict nr to avoid "double" overflow */
>>> +    if ( nr >= PGC_count_mask )
>>> +    {
>>> +        ASSERT_UNREACHABLE();
>>> +        return NULL;
>>> +    }
>> I question the validity of this, already in the Arm original: I can't spot
>> how the caller guarantees to stay below that limit. Without such an
>> (attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
>> is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
>> any limit check.
>>
>>> +    do {
>>> +        x = y;
>>> +        /*
>>> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
>>> +         * Count == -1: Reference count would wrap, which is invalid.
>>> +         */
>> May I once again ask that you look carefully at comments (as much as at code)
>> you copy. Clearly this comment wasn't properly updated when the bumping by 1
>> was changed to bumping by nr.
>>
>>> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
>>> +            return NULL;
>>> +    }
>>> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
>>> +
>>> +    owner = page_get_owner(page);
>>> +    ASSERT(owner);
>>> +
>>> +    return owner;
>>> +}
> There also looks to be a dead code concern here (towards the "nr" parameters
> here and elsewhere, when STATIC_SHM=n). Just that apparently we decided to
> leave out Misra rule 2.2 entirely.

I think that I didn't get what is an issue when STATIC_SHM=n, functions is still
going to be called through page_get_owner_and_reference(), at least, in page_alloc.c .

~ Oleksii
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Jan Beulich 3 months, 1 week ago
On 18.07.2025 16:37, Oleksii Kurochko wrote:
> 
> On 7/2/25 12:28 PM, Jan Beulich wrote:
>> On 02.07.2025 12:09, Jan Beulich wrote:
>>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>>> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>>>>   {
>>>>       return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>>>>   }
>>>> +
>>>> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
>>>> +{
>>>> +    ASSERT_UNREACHABLE();
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
>>>> +                                                      unsigned long nr)
>>>> +{
>>>> +    unsigned long x, y = page->count_info;
>>>> +    struct domain *owner;
>>>> +
>>>> +    /* Restrict nr to avoid "double" overflow */
>>>> +    if ( nr >= PGC_count_mask )
>>>> +    {
>>>> +        ASSERT_UNREACHABLE();
>>>> +        return NULL;
>>>> +    }
>>> I question the validity of this, already in the Arm original: I can't spot
>>> how the caller guarantees to stay below that limit. Without such an
>>> (attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
>>> is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
>>> any limit check.
>>>
>>>> +    do {
>>>> +        x = y;
>>>> +        /*
>>>> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
>>>> +         * Count == -1: Reference count would wrap, which is invalid.
>>>> +         */
>>> May I once again ask that you look carefully at comments (as much as at code)
>>> you copy. Clearly this comment wasn't properly updated when the bumping by 1
>>> was changed to bumping by nr.
>>>
>>>> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
>>>> +            return NULL;
>>>> +    }
>>>> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
>>>> +
>>>> +    owner = page_get_owner(page);
>>>> +    ASSERT(owner);
>>>> +
>>>> +    return owner;
>>>> +}
>> There also looks to be a dead code concern here (towards the "nr" parameters
>> here and elsewhere, when STATIC_SHM=n). Just that apparently we decided to
>> leave out Misra rule 2.2 entirely.
> 
> I think that I didn't get what is an issue when STATIC_SHM=n, functions is still
> going to be called through page_get_owner_and_reference(), at least, in page_alloc.c .

Yes, but will "nr" ever be anything other than 1 then? IOW omitting the parameter
would be fine. And that's what "dead code" is about.

Jan
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Oleksii Kurochko 3 months, 1 week ago
On 7/21/25 3:39 PM, Jan Beulich wrote:
> On 18.07.2025 16:37, Oleksii Kurochko wrote:
>> On 7/2/25 12:28 PM, Jan Beulich wrote:
>>> On 02.07.2025 12:09, Jan Beulich wrote:
>>>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>>>> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>>>>>    {
>>>>>        return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>>>>>    }
>>>>> +
>>>>> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
>>>>> +{
>>>>> +    ASSERT_UNREACHABLE();
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
>>>>> +                                                      unsigned long nr)
>>>>> +{
>>>>> +    unsigned long x, y = page->count_info;
>>>>> +    struct domain *owner;
>>>>> +
>>>>> +    /* Restrict nr to avoid "double" overflow */
>>>>> +    if ( nr >= PGC_count_mask )
>>>>> +    {
>>>>> +        ASSERT_UNREACHABLE();
>>>>> +        return NULL;
>>>>> +    }
>>>> I question the validity of this, already in the Arm original: I can't spot
>>>> how the caller guarantees to stay below that limit. Without such an
>>>> (attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
>>>> is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
>>>> any limit check.
>>>>
>>>>> +    do {
>>>>> +        x = y;
>>>>> +        /*
>>>>> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
>>>>> +         * Count == -1: Reference count would wrap, which is invalid.
>>>>> +         */
>>>> May I once again ask that you look carefully at comments (as much as at code)
>>>> you copy. Clearly this comment wasn't properly updated when the bumping by 1
>>>> was changed to bumping by nr.
>>>>
>>>>> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
>>>>> +            return NULL;
>>>>> +    }
>>>>> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
>>>>> +
>>>>> +    owner = page_get_owner(page);
>>>>> +    ASSERT(owner);
>>>>> +
>>>>> +    return owner;
>>>>> +}
>>> There also looks to be a dead code concern here (towards the "nr" parameters
>>> here and elsewhere, when STATIC_SHM=n). Just that apparently we decided to
>>> leave out Misra rule 2.2 entirely.
>> I think that I didn't get what is an issue when STATIC_SHM=n, functions is still
>> going to be called through page_get_owner_and_reference(), at least, in page_alloc.c .
> Yes, but will "nr" ever be anything other than 1 then? IOW omitting the parameter
> would be fine. And that's what "dead code" is about.

Got it.

So we don't have any SAF-x tag to mark this function as safe. What is the best one
solution for now if nr argument will be needed in the future for STATIC_SHM=y?

~ Oleksii
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Jan Beulich 3 months, 1 week ago
On 22.07.2025 14:03, Oleksii Kurochko wrote:
> On 7/21/25 3:39 PM, Jan Beulich wrote:
>> On 18.07.2025 16:37, Oleksii Kurochko wrote:
>>> On 7/2/25 12:28 PM, Jan Beulich wrote:
>>>> On 02.07.2025 12:09, Jan Beulich wrote:
>>>>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>>>>> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>>>>>>    {
>>>>>>        return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>>>>>>    }
>>>>>> +
>>>>>> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
>>>>>> +{
>>>>>> +    ASSERT_UNREACHABLE();
>>>>>> +
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
>>>>>> +                                                      unsigned long nr)
>>>>>> +{
>>>>>> +    unsigned long x, y = page->count_info;
>>>>>> +    struct domain *owner;
>>>>>> +
>>>>>> +    /* Restrict nr to avoid "double" overflow */
>>>>>> +    if ( nr >= PGC_count_mask )
>>>>>> +    {
>>>>>> +        ASSERT_UNREACHABLE();
>>>>>> +        return NULL;
>>>>>> +    }
>>>>> I question the validity of this, already in the Arm original: I can't spot
>>>>> how the caller guarantees to stay below that limit. Without such an
>>>>> (attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
>>>>> is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
>>>>> any limit check.
>>>>>
>>>>>> +    do {
>>>>>> +        x = y;
>>>>>> +        /*
>>>>>> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
>>>>>> +         * Count == -1: Reference count would wrap, which is invalid.
>>>>>> +         */
>>>>> May I once again ask that you look carefully at comments (as much as at code)
>>>>> you copy. Clearly this comment wasn't properly updated when the bumping by 1
>>>>> was changed to bumping by nr.
>>>>>
>>>>>> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
>>>>>> +            return NULL;
>>>>>> +    }
>>>>>> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
>>>>>> +
>>>>>> +    owner = page_get_owner(page);
>>>>>> +    ASSERT(owner);
>>>>>> +
>>>>>> +    return owner;
>>>>>> +}
>>>> There also looks to be a dead code concern here (towards the "nr" parameters
>>>> here and elsewhere, when STATIC_SHM=n). Just that apparently we decided to
>>>> leave out Misra rule 2.2 entirely.
>>> I think that I didn't get what is an issue when STATIC_SHM=n, functions is still
>>> going to be called through page_get_owner_and_reference(), at least, in page_alloc.c .
>> Yes, but will "nr" ever be anything other than 1 then? IOW omitting the parameter
>> would be fine. And that's what "dead code" is about.
> 
> Got it.
> 
> So we don't have any SAF-x tag to mark this function as safe. What is the best one
> solution for now if nr argument will be needed in the future for STATIC_SHM=y?

Add the parameter at that point. Just like was done for Arm.

Jan
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Oleksii Kurochko 3 months ago
On 7/22/25 2:05 PM, Jan Beulich wrote:
> On 22.07.2025 14:03, Oleksii Kurochko wrote:
>> On 7/21/25 3:39 PM, Jan Beulich wrote:
>>> On 18.07.2025 16:37, Oleksii Kurochko wrote:
>>>> On 7/2/25 12:28 PM, Jan Beulich wrote:
>>>>> On 02.07.2025 12:09, Jan Beulich wrote:
>>>>>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>>>>>> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>>>>>>>     {
>>>>>>>         return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>>>>>>>     }
>>>>>>> +
>>>>>>> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
>>>>>>> +{
>>>>>>> +    ASSERT_UNREACHABLE();
>>>>>>> +
>>>>>>> +    return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
>>>>>>> +                                                      unsigned long nr)
>>>>>>> +{
>>>>>>> +    unsigned long x, y = page->count_info;
>>>>>>> +    struct domain *owner;
>>>>>>> +
>>>>>>> +    /* Restrict nr to avoid "double" overflow */
>>>>>>> +    if ( nr >= PGC_count_mask )
>>>>>>> +    {
>>>>>>> +        ASSERT_UNREACHABLE();
>>>>>>> +        return NULL;
>>>>>>> +    }
>>>>>> I question the validity of this, already in the Arm original: I can't spot
>>>>>> how the caller guarantees to stay below that limit. Without such an
>>>>>> (attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
>>>>>> is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
>>>>>> any limit check.
>>>>>>
>>>>>>> +    do {
>>>>>>> +        x = y;
>>>>>>> +        /*
>>>>>>> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
>>>>>>> +         * Count == -1: Reference count would wrap, which is invalid.
>>>>>>> +         */
>>>>>> May I once again ask that you look carefully at comments (as much as at code)
>>>>>> you copy. Clearly this comment wasn't properly updated when the bumping by 1
>>>>>> was changed to bumping by nr.
>>>>>>
>>>>>>> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
>>>>>>> +            return NULL;
>>>>>>> +    }
>>>>>>> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
>>>>>>> +
>>>>>>> +    owner = page_get_owner(page);
>>>>>>> +    ASSERT(owner);
>>>>>>> +
>>>>>>> +    return owner;
>>>>>>> +}
>>>>> There also looks to be a dead code concern here (towards the "nr" parameters
>>>>> here and elsewhere, when STATIC_SHM=n). Just that apparently we decided to
>>>>> leave out Misra rule 2.2 entirely.
>>>> I think that I didn't get what is an issue when STATIC_SHM=n, functions is still
>>>> going to be called through page_get_owner_and_reference(), at least, in page_alloc.c .
>>> Yes, but will "nr" ever be anything other than 1 then? IOW omitting the parameter
>>> would be fine. And that's what "dead code" is about.
>> Got it.
>>
>> So we don't have any SAF-x tag to mark this function as safe. What is the best one
>> solution for now if nr argument will be needed in the future for STATIC_SHM=y?
> Add the parameter at that point. Just like was done for Arm.

Hmm, it seems like I am confusing something... Arm has the same defintion and declaration
of page_get_owner_and_nr_reference().

~ Oleksii
Re: [PATCH v2 16/17] xen/riscv: implement mfn_valid() and page reference, ownership handling helpers
Posted by Jan Beulich 3 months ago
On 29.07.2025 15:47, Oleksii Kurochko wrote:
> 
> On 7/22/25 2:05 PM, Jan Beulich wrote:
>> On 22.07.2025 14:03, Oleksii Kurochko wrote:
>>> On 7/21/25 3:39 PM, Jan Beulich wrote:
>>>> On 18.07.2025 16:37, Oleksii Kurochko wrote:
>>>>> On 7/2/25 12:28 PM, Jan Beulich wrote:
>>>>>> On 02.07.2025 12:09, Jan Beulich wrote:
>>>>>>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>>>>>>> @@ -613,3 +612,91 @@ void __iomem *ioremap(paddr_t pa, size_t len)
>>>>>>>>     {
>>>>>>>>         return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>>>>>>>>     }
>>>>>>>> +
>>>>>>>> +int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
>>>>>>>> +{
>>>>>>>> +    ASSERT_UNREACHABLE();
>>>>>>>> +
>>>>>>>> +    return 0;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static struct domain *page_get_owner_and_nr_reference(struct page_info *page,
>>>>>>>> +                                                      unsigned long nr)
>>>>>>>> +{
>>>>>>>> +    unsigned long x, y = page->count_info;
>>>>>>>> +    struct domain *owner;
>>>>>>>> +
>>>>>>>> +    /* Restrict nr to avoid "double" overflow */
>>>>>>>> +    if ( nr >= PGC_count_mask )
>>>>>>>> +    {
>>>>>>>> +        ASSERT_UNREACHABLE();
>>>>>>>> +        return NULL;
>>>>>>>> +    }
>>>>>>> I question the validity of this, already in the Arm original: I can't spot
>>>>>>> how the caller guarantees to stay below that limit. Without such an
>>>>>>> (attempted) guarantee, ASSERT_UNREACHABLE() is wrong to use. All I can see
>>>>>>> is process_shm_node() incrementing shmem_extra[].nr_shm_borrowers, without
>>>>>>> any limit check.
>>>>>>>
>>>>>>>> +    do {
>>>>>>>> +        x = y;
>>>>>>>> +        /*
>>>>>>>> +         * Count ==  0: Page is not allocated, so we cannot take a reference.
>>>>>>>> +         * Count == -1: Reference count would wrap, which is invalid.
>>>>>>>> +         */
>>>>>>> May I once again ask that you look carefully at comments (as much as at code)
>>>>>>> you copy. Clearly this comment wasn't properly updated when the bumping by 1
>>>>>>> was changed to bumping by nr.
>>>>>>>
>>>>>>>> +        if ( unlikely(((x + nr) & PGC_count_mask) <= nr) )
>>>>>>>> +            return NULL;
>>>>>>>> +    }
>>>>>>>> +    while ( (y = cmpxchg(&page->count_info, x, x + nr)) != x );
>>>>>>>> +
>>>>>>>> +    owner = page_get_owner(page);
>>>>>>>> +    ASSERT(owner);
>>>>>>>> +
>>>>>>>> +    return owner;
>>>>>>>> +}
>>>>>> There also looks to be a dead code concern here (towards the "nr" parameters
>>>>>> here and elsewhere, when STATIC_SHM=n). Just that apparently we decided to
>>>>>> leave out Misra rule 2.2 entirely.
>>>>> I think that I didn't get what is an issue when STATIC_SHM=n, functions is still
>>>>> going to be called through page_get_owner_and_reference(), at least, in page_alloc.c .
>>>> Yes, but will "nr" ever be anything other than 1 then? IOW omitting the parameter
>>>> would be fine. And that's what "dead code" is about.
>>> Got it.
>>>
>>> So we don't have any SAF-x tag to mark this function as safe. What is the best one
>>> solution for now if nr argument will be needed in the future for STATIC_SHM=y?
>> Add the parameter at that point. Just like was done for Arm.
> 
> Hmm, it seems like I am confusing something... Arm has the same defintion and declaration
> of page_get_owner_and_nr_reference().

But it didn't always have it. And there is at least one pending issue there.
Hence my request to use the simpler variant until someone actually makes
STATIC_SHM work on RISC-V. And hopefully by then the issue in Arm code is
sorted, and you can clone the code without raising questions.

Jan