[PATCH v3 08/14] xen/riscv: imsic_init() implementation

Oleksii Kurochko posted 14 patches 6 months, 4 weeks ago
There is a newer version of this series
[PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Oleksii Kurochko 6 months, 4 weeks ago
imsic_init() is introduced to parse device tree node, which has the following
bindings [2], and based on the parsed information update IMSIC configuration
which is stored in imsic_cfg.

The following helpers are introduces for imsic_init() usage:
  - imsic_parse_node() parses IMSIC node from DTS
  - imsic_get_parent_hartid() returns the hart ( CPU ) ID of the given device
    tree node.

This patch is based on the code from [1].

Since Microchip originally developed imsic.{c,h}, an internal discussion with
them led to the decision to use the MIT license.

[1] https://gitlab.com/xen-project/people/olkur/xen/-/commit/0b1a94f2bc3bb1a81cd26bb75f0bf578f84cb4d4
[2] https://elixir.bootlin.com/linux/v6.12/source/Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml

Co-developed-by: Romain Caritey <Romain.Caritey@microchip.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V3:
 - Drop year in imsic.h in copyrights.
 - Correct identation in imsic_parse_node() and imsic_init()
   where for imsic_cfg.base_addr a mask is applied.
 - Use unsigned int istead of uint32_t for local variable nr_parent_irqs,
   index, nr_handlers in imsic_init().
 - Fix a leakage of ealiers successfull allocations in case if imsic_init()
   returns with an error.
 - Excess blank in printk() message: "%s: unable to parse MMIO regset %d\n".
 - Introduce hartid_to_cpuid() and use it in the check:
     if ( hardid_to_cpuid(cpuid) >= num_possible_cpus() )
   in imsic_init().
 - Use "%u" for unsigned int in printk(...).
 - Fix for loop condition: nr_mmios -> "j < nr_mmios".
 - [imsic_init()] Drop usage of j in nested loop. It is enough to have only
   index.
 - Change 0x%lx to %#lx for formatting of an address in printk().
 - [imsic_init()] Rename local variable cpuid to hartid.
 - s/imsic_get_parent_cpuid/imsic_get_parent_hartid, s/cpuid/hartid for an
   imsic_get_parent_hartid()'s argument.
 - Declare cpus member of struct imsic_mmios as cpumask_t.
 - [imsic_init()] Allocate imsic_mmios.cpus by using of alloc_cpumask_var().
 - [imsic_init()] Use cpumask_set_cpu() instead of bitmap_set().
 - [imsic_parse_node()] add check for correctness of "interrupt-extended" property.
 - [imsic_parse_node()] Use dt_node_name() or dt_full_node_name() instead of
   derefence of struct dt_node.
 - [imsic_parse_node()] Add cleanup label and update 'rc = XXX; goto cleanup'
   instead of 'return rc' as now we have to cleanup dynamically allocated irq_range
   array.
 - Add comments above imsic_init() and imsic_parse_node().
 - s/xen/arch/riscv/imsic.h/xen/arch/riscv/include/asm/imsic.h in the comment of
   imsic.h.
---
Changes in V2:
 - Drop years in copyrights.
 - s/riscv_of_processor_hartid/dt_processor_cpuid.
 - s/imsic_get_parent_hartid/imsic_get_parent_hartid.
   Rename argument hartid to cpuid.
   Make node argument const.
   Return res instead of -EINVAL for the failure case of dt_processor_cpuid().
   Drop local variable hart and use cpuid argument instead.
   Drop useless return res;
 - imsic_parse_node() changes:
   - Make node argument const.
   - Check the return value of dt_property_read_u32() directly instead of
     saving it to rc variable.
   - Update tmp usage, use short form "-=".
   - Update a check (imsic_cfg.nr_ids >= IMSIC_MAX_ID) to (imsic_cfg.nr_ids > IMSIC_MAX_ID)
     as IMSIC_MAX_ID is changed to maximum valid value, not just the firsr out-of-range.
   - Use `rc` to return value instead of explicitly use -EINVAL.
   - Use do {} while() to find number of MMIO register sets.
 - Set IMSIC_MAX_ID to 2047 (maximum possible IRQ number).
 - imsic_init() changes:
   - Use unsigned int in for's expression1.
   - s/xfree/XFEE.
   - Allocate msi and cpus array dynamically.
 - Drop forward declaration before declaration of imsic_get_config() in asm/imsic.h
   as it is not used as parameter type.
 - Align declaration of imisic_init with defintion.
 - s/harts/cpus in imisic_mmios.
   Also, change type from bool harts[NR_CPUS] to unsigned long *cpus.
 - Allocate msi member of imsic_config dynamically to save some memory.
 - Code style fixes.
 - Update the commit message.
---
 xen/arch/riscv/Makefile            |   1 +
 xen/arch/riscv/imsic.c             | 354 +++++++++++++++++++++++++++++
 xen/arch/riscv/include/asm/imsic.h |  65 ++++++
 xen/arch/riscv/include/asm/smp.h   |  13 ++
 4 files changed, 433 insertions(+)
 create mode 100644 xen/arch/riscv/imsic.c
 create mode 100644 xen/arch/riscv/include/asm/imsic.h

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index a1c145c506..e2b8aa42c8 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -2,6 +2,7 @@ obj-y += aplic.o
 obj-y += cpufeature.o
 obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
 obj-y += entry.o
+obj-y += imsic.o
 obj-y += intc.o
 obj-y += irq.o
 obj-y += mm.o
diff --git a/xen/arch/riscv/imsic.c b/xen/arch/riscv/imsic.c
new file mode 100644
index 0000000000..9f8b492e97
--- /dev/null
+++ b/xen/arch/riscv/imsic.c
@@ -0,0 +1,354 @@
+/* SPDX-License-Identifier: MIT */
+
+/*
+ * xen/arch/riscv/imsic.c
+ *
+ * RISC-V Incoming MSI Controller support
+ *
+ * (c) Microchip Technology Inc.
+ * (c) Vates
+ */
+
+#include <xen/bitops.h>
+#include <xen/const.h>
+#include <xen/cpumask.h>
+#include <xen/device_tree.h>
+#include <xen/errno.h>
+#include <xen/init.h>
+#include <xen/macros.h>
+#include <xen/smp.h>
+#include <xen/spinlock.h>
+#include <xen/xmalloc.h>
+
+#include <asm/imsic.h>
+
+static struct imsic_config imsic_cfg;
+
+/* Callers aren't expected to changed imsic_cfg so return const. */
+const struct imsic_config *imsic_get_config(void)
+{
+    return &imsic_cfg;
+}
+
+static int __init imsic_get_parent_hartid(const struct dt_device_node *node,
+                                          unsigned int index,
+                                          unsigned long *hartid)
+{
+    int res;
+    struct dt_phandle_args args;
+
+    res = dt_parse_phandle_with_args(node, "interrupts-extended",
+                                     "#interrupt-cells", index, &args);
+    if ( !res )
+        res = dt_processor_hartid(args.np->parent, hartid);
+
+    return res;
+}
+
+/*
+ * Parses IMSIC DT node.
+ *
+ * Returns 0 if initialization is successful, a negative value on failure,
+ * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
+ * which should be ignored by the hypervisor.
+ */
+static int imsic_parse_node(const struct dt_device_node *node,
+                            unsigned int *nr_parent_irqs)
+{
+    int rc;
+    unsigned int tmp;
+    paddr_t base_addr;
+    uint32_t *irq_range;
+
+    *nr_parent_irqs = dt_number_of_irq(node);
+    if ( !*nr_parent_irqs )
+        panic("%s: irq_num can be 0. Check %s node\n", __func__,
+              dt_node_full_name(node));
+
+    irq_range = xzalloc_array(uint32_t, *nr_parent_irqs * 2);
+    if ( !irq_range )
+        panic("%s: irq_range[] allocation failed\n", __func__);
+
+    if ( (rc = dt_property_read_u32_array(node, "interrupts-extended",
+                                    irq_range, *nr_parent_irqs * 2)) )
+        panic("%s: unable to find interrupt-extended in %s node: %d\n",
+              __func__, dt_node_full_name(node), rc);
+
+    if ( irq_range[1] == IRQ_M_EXT )
+    {
+        /* Machine mode imsic node, ignore it. */
+        rc = IRQ_M_EXT;
+        goto cleanup;
+    }
+
+    /* Check that interrupts-extended property is well-formed. */
+    for ( unsigned int i = 2; i < (*nr_parent_irqs * 2); i += 2 )
+    {
+        if ( irq_range[i + 1] != irq_range[1] )
+            panic("%s: mode[%d] != %d\n", __func__, i + 1, irq_range[1]);
+    }
+
+    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
+                               &imsic_cfg.guest_index_bits) )
+        imsic_cfg.guest_index_bits = 0;
+    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
+    if ( tmp < imsic_cfg.guest_index_bits )
+    {
+        printk(XENLOG_ERR "%s: guest index bits too big\n",
+               dt_node_name(node));
+        rc = -ENOENT;
+        goto cleanup;
+    }
+
+    /* Find number of HART index bits */
+    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
+                               &imsic_cfg.hart_index_bits) )
+    {
+        /* Assume default value */
+        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
+        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
+            imsic_cfg.hart_index_bits++;
+    }
+    tmp -= imsic_cfg.guest_index_bits;
+    if ( tmp < imsic_cfg.hart_index_bits )
+    {
+        printk(XENLOG_ERR "%s: HART index bits too big\n",
+               dt_node_name(node));
+        rc = -ENOENT;
+        goto cleanup;
+    }
+
+    /* Find number of group index bits */
+    if ( !dt_property_read_u32(node, "riscv,group-index-bits",
+                               &imsic_cfg.group_index_bits) )
+        imsic_cfg.group_index_bits = 0;
+    tmp -= imsic_cfg.hart_index_bits;
+    if ( tmp < imsic_cfg.group_index_bits )
+    {
+        printk(XENLOG_ERR "%s: group index bits too big\n",
+               dt_node_name(node));
+        rc = -ENOENT;
+        goto cleanup;
+    }
+
+    /* Find first bit position of group index */
+    tmp = IMSIC_MMIO_PAGE_SHIFT * 2;
+    if ( !dt_property_read_u32(node, "riscv,group-index-shift",
+                               &imsic_cfg.group_index_shift) )
+        imsic_cfg.group_index_shift = tmp;
+    if ( imsic_cfg.group_index_shift < tmp )
+    {
+        printk(XENLOG_ERR "%s: group index shift too small\n",
+               dt_node_name(node));
+        rc = -ENOENT;
+        goto cleanup;
+    }
+    tmp = imsic_cfg.group_index_bits + imsic_cfg.group_index_shift - 1;
+    if ( tmp >= BITS_PER_LONG )
+    {
+        printk(XENLOG_ERR "%s: group index shift too big\n",
+               dt_node_name(node));
+        rc = -EINVAL;
+        goto cleanup;
+    }
+
+    /* Find number of interrupt identities */
+    if ( !dt_property_read_u32(node, "riscv,num-ids", &imsic_cfg.nr_ids) )
+    {
+        printk(XENLOG_ERR "%s: number of interrupt identities not found\n",
+               node->name);
+        rc = -ENOENT;
+        goto cleanup;
+    }
+
+    if ( (imsic_cfg.nr_ids < IMSIC_MIN_ID) ||
+         (imsic_cfg.nr_ids > IMSIC_MAX_ID) ||
+         ((imsic_cfg.nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID) )
+    {
+        printk(XENLOG_ERR "%s: invalid number of interrupt identities\n",
+               node->name);
+        rc = -EINVAL;
+        goto cleanup;
+    }
+
+    /* Compute base address */
+    imsic_cfg.nr_mmios = 0;
+    rc = dt_device_get_address(node, imsic_cfg.nr_mmios, &base_addr, NULL);
+    if ( rc )
+    {
+        printk(XENLOG_ERR "%s: first MMIO resource not found: %d\n",
+               dt_node_name(node), rc);
+        goto cleanup;
+    }
+
+    imsic_cfg.base_addr = base_addr;
+    imsic_cfg.base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
+                                 imsic_cfg.hart_index_bits +
+                                 IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
+    imsic_cfg.base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
+                             imsic_cfg.group_index_shift);
+
+    /* Find number of MMIO register sets */
+    do {
+        imsic_cfg.nr_mmios++;
+    } while ( !dt_device_get_address(node, imsic_cfg.nr_mmios, &base_addr, NULL) );
+
+ cleanup:
+    xfree(irq_range);
+
+    return rc;
+}
+
+/*
+ * Initialize the imsic_cfg structure based on the IMSIC DT node.
+ *
+ * Returns 0 if initialization is successful, a negative value on failure,
+ * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
+ * which should be ignored by the hypervisor.
+ */
+int __init imsic_init(const struct dt_device_node *node)
+{
+    int rc;
+    unsigned long reloff, hartid;
+    unsigned int nr_parent_irqs, index, nr_handlers = 0;
+    paddr_t base_addr;
+    unsigned int nr_mmios;
+
+    /* Parse IMSIC node */
+    rc = imsic_parse_node(node, &nr_parent_irqs);
+    /*
+     * If machine mode imsic node => ignore it.
+     * If rc < 0 => parsing of IMSIC DT node failed.
+     */
+    if ( (rc == IRQ_M_EXT) || rc )
+        return rc;
+
+    nr_mmios = imsic_cfg.nr_mmios;
+
+    /* Allocate MMIO resource array */
+    imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
+    if ( !imsic_cfg.mmios )
+    {
+        rc = -ENOMEM;
+        goto imsic_init_err;
+    }
+
+    imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
+    if ( !imsic_cfg.msi )
+    {
+        rc = -ENOMEM;
+        goto imsic_init_err;
+    }
+
+    /* Check MMIO register sets */
+    for ( unsigned int i = 0; i < nr_mmios; i++ )
+    {
+        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
+        {
+            rc = -ENOMEM;
+            goto imsic_init_err;
+        }
+
+        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
+                                   &imsic_cfg.mmios[i].size);
+        if ( rc )
+        {
+            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
+                   node->name, i);
+            goto imsic_init_err;
+        }
+
+        base_addr = imsic_cfg.mmios[i].base_addr;
+        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
+                           imsic_cfg.hart_index_bits +
+                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
+        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
+                       imsic_cfg.group_index_shift);
+        if ( base_addr != imsic_cfg.base_addr )
+        {
+            rc = -EINVAL;
+            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
+                   node->name, i);
+            goto imsic_init_err;
+        }
+    }
+
+    /* Configure handlers for target CPUs */
+    for ( unsigned int i = 0; i < nr_parent_irqs; i++ )
+    {
+        unsigned long xen_cpuid;
+
+        rc = imsic_get_parent_hartid(node, i, &hartid);
+        if ( rc )
+        {
+            printk(XENLOG_WARNING "%s: cpu ID for parent irq%u not found\n",
+                   node->name, i);
+            continue;
+        }
+
+        xen_cpuid = hartid_to_cpuid(hartid);
+
+        if ( xen_cpuid >= num_possible_cpus() )
+        {
+            printk(XENLOG_WARNING "%s: unsupported cpu ID=%lu for parent irq%u\n",
+                   node->name, hartid, i);
+            continue;
+        }
+
+        /* Find MMIO location of MSI page */
+        reloff = i * BIT(imsic_cfg.guest_index_bits, UL) * IMSIC_MMIO_PAGE_SZ;
+        for ( index = 0; index < nr_mmios; index++ )
+        {
+            if ( reloff < imsic_cfg.mmios[index].size )
+                break;
+
+            /*
+             * MMIO region size may not be aligned to
+             * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
+             * if holes are present.
+             */
+            reloff -= ROUNDUP(imsic_cfg.mmios[index].size,
+                BIT(imsic_cfg.guest_index_bits, UL) * IMSIC_MMIO_PAGE_SZ);
+        }
+
+        if ( index == nr_mmios )
+        {
+            printk(XENLOG_WARNING "%s: MMIO not found for parent irq%u\n",
+                   node->name, i);
+            continue;
+        }
+
+        if ( !IS_ALIGNED(imsic_cfg.msi[xen_cpuid].base_addr + reloff, PAGE_SIZE) )
+        {
+            printk(XENLOG_WARNING "%s: MMIO address %#lx is not aligned on a page\n",
+                   node->name, imsic_cfg.msi[xen_cpuid].base_addr + reloff);
+            imsic_cfg.msi[xen_cpuid].offset = 0;
+            imsic_cfg.msi[xen_cpuid].base_addr = 0;
+            continue;
+        }
+
+        cpumask_set_cpu(xen_cpuid, imsic_cfg.mmios[index].cpus);
+
+        imsic_cfg.msi[xen_cpuid].base_addr = imsic_cfg.mmios[index].base_addr;
+        imsic_cfg.msi[xen_cpuid].offset = reloff;
+
+        nr_handlers++;
+    }
+
+    if ( !nr_handlers )
+    {
+        printk(XENLOG_ERR "%s: No CPU handlers found\n", node->name);
+        rc = -ENODEV;
+        goto imsic_init_err;
+    }
+
+    return 0;
+
+ imsic_init_err:
+    for ( unsigned int i = 0; i < nr_mmios; i++ )
+        free_cpumask_var(imsic_cfg.mmios[i].cpus);
+    XFREE(imsic_cfg.mmios);
+    XFREE(imsic_cfg.msi);
+
+    return rc;
+}
diff --git a/xen/arch/riscv/include/asm/imsic.h b/xen/arch/riscv/include/asm/imsic.h
new file mode 100644
index 0000000000..0d17881884
--- /dev/null
+++ b/xen/arch/riscv/include/asm/imsic.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: MIT */
+
+/*
+ * xen/arch/riscv/include/asm/imsic.h
+ *
+ * RISC-V Incoming MSI Controller support
+ *
+ * (c) Microchip Technology Inc.
+ */
+
+#ifndef ASM__RISCV__IMSIC_H
+#define ASM__RISCV__IMSIC_H
+
+#include <xen/types.h>
+
+#define IMSIC_MMIO_PAGE_SHIFT   12
+#define IMSIC_MMIO_PAGE_SZ      (1UL << IMSIC_MMIO_PAGE_SHIFT)
+
+#define IMSIC_MIN_ID            63
+#define IMSIC_MAX_ID            2047
+
+struct imsic_msi {
+    paddr_t base_addr;
+    unsigned long offset;
+};
+
+struct imsic_mmios {
+    paddr_t base_addr;
+    unsigned long size;
+    cpumask_var_t cpus;
+};
+
+struct imsic_config {
+    /* Base address */
+    paddr_t base_addr;
+
+    /* Bits representing Guest index, HART index, and Group index */
+    unsigned int guest_index_bits;
+    unsigned int hart_index_bits;
+    unsigned int group_index_bits;
+    unsigned int group_index_shift;
+
+    /* IMSIC phandle */
+    unsigned int phandle;
+
+    /* Number of parent irq */
+    unsigned int nr_parent_irqs;
+
+    /* Number off interrupt identities */
+    unsigned int nr_ids;
+
+    /* MMIOs */
+    unsigned int nr_mmios;
+    struct imsic_mmios *mmios;
+
+    /* MSI */
+    struct imsic_msi *msi;
+};
+
+struct dt_device_node;
+int imsic_init(const struct dt_device_node *node);
+
+const struct imsic_config *imsic_get_config(void);
+
+#endif /* ASM__RISCV__IMSIC_H */
diff --git a/xen/arch/riscv/include/asm/smp.h b/xen/arch/riscv/include/asm/smp.h
index eb58b6576b..33ee5ec06b 100644
--- a/xen/arch/riscv/include/asm/smp.h
+++ b/xen/arch/riscv/include/asm/smp.h
@@ -3,6 +3,7 @@
 #define ASM__RISCV__SMP_H
 
 #include <xen/cpumask.h>
+#include <xen/macros.h>
 #include <xen/percpu.h>
 
 #include <asm/current.h>
@@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
     return pcpu_info[cpuid].hart_id;
 }
 
+static inline unsigned long hartid_to_cpuid(unsigned long hartid)
+{
+    for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
+    {
+        if ( hartid == cpuid_to_hartid(cpuid) )
+            return cpuid;
+    }
+
+    /* hartid isn't valid for some reason */
+    return NR_CPUS;
+}
+
 static inline void set_cpuid_to_hartid(unsigned long cpuid,
                                        unsigned long hartid)
 {
-- 
2.49.0
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Jan Beulich 6 months, 3 weeks ago
On 21.05.2025 18:03, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/imsic.c
> @@ -0,0 +1,354 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +/*
> + * xen/arch/riscv/imsic.c
> + *
> + * RISC-V Incoming MSI Controller support
> + *
> + * (c) Microchip Technology Inc.
> + * (c) Vates
> + */
> +
> +#include <xen/bitops.h>
> +#include <xen/const.h>
> +#include <xen/cpumask.h>
> +#include <xen/device_tree.h>
> +#include <xen/errno.h>
> +#include <xen/init.h>
> +#include <xen/macros.h>
> +#include <xen/smp.h>
> +#include <xen/spinlock.h>
> +#include <xen/xmalloc.h>
> +
> +#include <asm/imsic.h>
> +
> +static struct imsic_config imsic_cfg;
> +
> +/* Callers aren't expected to changed imsic_cfg so return const. */
> +const struct imsic_config *imsic_get_config(void)
> +{
> +    return &imsic_cfg;
> +}

Minor remark regarding the comment: Consider replacing "expected" by "supposed"
or "intended"?

> +/*
> + * Parses IMSIC DT node.
> + *
> + * Returns 0 if initialization is successful, a negative value on failure,
> + * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
> + * which should be ignored by the hypervisor.
> + */
> +static int imsic_parse_node(const struct dt_device_node *node,
> +                            unsigned int *nr_parent_irqs)
> +{
> +    int rc;
> +    unsigned int tmp;
> +    paddr_t base_addr;
> +    uint32_t *irq_range;
> +
> +    *nr_parent_irqs = dt_number_of_irq(node);
> +    if ( !*nr_parent_irqs )
> +        panic("%s: irq_num can be 0. Check %s node\n", __func__,
> +              dt_node_full_name(node));

DYM "can't be"?

> +    irq_range = xzalloc_array(uint32_t, *nr_parent_irqs * 2);
> +    if ( !irq_range )
> +        panic("%s: irq_range[] allocation failed\n", __func__);
> +
> +    if ( (rc = dt_property_read_u32_array(node, "interrupts-extended",
> +                                    irq_range, *nr_parent_irqs * 2)) )
> +        panic("%s: unable to find interrupt-extended in %s node: %d\n",
> +              __func__, dt_node_full_name(node), rc);
> +
> +    if ( irq_range[1] == IRQ_M_EXT )
> +    {
> +        /* Machine mode imsic node, ignore it. */
> +        rc = IRQ_M_EXT;
> +        goto cleanup;
> +    }

Wouldn't this better be done ...

> +    /* Check that interrupts-extended property is well-formed. */
> +    for ( unsigned int i = 2; i < (*nr_parent_irqs * 2); i += 2 )
> +    {
> +        if ( irq_range[i + 1] != irq_range[1] )
> +            panic("%s: mode[%d] != %d\n", __func__, i + 1, irq_range[1]);
> +    }

... after this consistency check?

Also %u please when you log unsigned values.

> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
> +                               &imsic_cfg.guest_index_bits) )
> +        imsic_cfg.guest_index_bits = 0;
> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
> +    if ( tmp < imsic_cfg.guest_index_bits )
> +    {
> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
> +               dt_node_name(node));
> +        rc = -ENOENT;
> +        goto cleanup;
> +    }
> +
> +    /* Find number of HART index bits */
> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
> +                               &imsic_cfg.hart_index_bits) )
> +    {
> +        /* Assume default value */
> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
> +            imsic_cfg.hart_index_bits++;

Since fls() returns a 1-based bit number, isn't it rather that in the
exact-power-of-2 case you'd need to subtract 1?

> +    }
> +    tmp -= imsic_cfg.guest_index_bits;
> +    if ( tmp < imsic_cfg.hart_index_bits )
> +    {
> +        printk(XENLOG_ERR "%s: HART index bits too big\n",
> +               dt_node_name(node));
> +        rc = -ENOENT;
> +        goto cleanup;
> +    }
> +
> +    /* Find number of group index bits */
> +    if ( !dt_property_read_u32(node, "riscv,group-index-bits",
> +                               &imsic_cfg.group_index_bits) )
> +        imsic_cfg.group_index_bits = 0;
> +    tmp -= imsic_cfg.hart_index_bits;
> +    if ( tmp < imsic_cfg.group_index_bits )
> +    {
> +        printk(XENLOG_ERR "%s: group index bits too big\n",
> +               dt_node_name(node));
> +        rc = -ENOENT;
> +        goto cleanup;
> +    }
> +
> +    /* Find first bit position of group index */
> +    tmp = IMSIC_MMIO_PAGE_SHIFT * 2;
> +    if ( !dt_property_read_u32(node, "riscv,group-index-shift",
> +                               &imsic_cfg.group_index_shift) )
> +        imsic_cfg.group_index_shift = tmp;
> +    if ( imsic_cfg.group_index_shift < tmp )
> +    {
> +        printk(XENLOG_ERR "%s: group index shift too small\n",
> +               dt_node_name(node));
> +        rc = -ENOENT;
> +        goto cleanup;
> +    }
> +    tmp = imsic_cfg.group_index_bits + imsic_cfg.group_index_shift - 1;
> +    if ( tmp >= BITS_PER_LONG )
> +    {
> +        printk(XENLOG_ERR "%s: group index shift too big\n",
> +               dt_node_name(node));
> +        rc = -EINVAL;
> +        goto cleanup;
> +    }
> +
> +    /* Find number of interrupt identities */
> +    if ( !dt_property_read_u32(node, "riscv,num-ids", &imsic_cfg.nr_ids) )
> +    {
> +        printk(XENLOG_ERR "%s: number of interrupt identities not found\n",
> +               node->name);
> +        rc = -ENOENT;
> +        goto cleanup;
> +    }
> +
> +    if ( (imsic_cfg.nr_ids < IMSIC_MIN_ID) ||
> +         (imsic_cfg.nr_ids > IMSIC_MAX_ID) ||
> +         ((imsic_cfg.nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID) )

Now that you've explained to me what the deal is with these constants: Isn't
the 1st of the three checks redundant with the last one?

> +    {
> +        printk(XENLOG_ERR "%s: invalid number of interrupt identities\n",
> +               node->name);
> +        rc = -EINVAL;
> +        goto cleanup;
> +    }
> +
> +    /* Compute base address */
> +    imsic_cfg.nr_mmios = 0;
> +    rc = dt_device_get_address(node, imsic_cfg.nr_mmios, &base_addr, NULL);
> +    if ( rc )
> +    {
> +        printk(XENLOG_ERR "%s: first MMIO resource not found: %d\n",
> +               dt_node_name(node), rc);
> +        goto cleanup;
> +    }
> +
> +    imsic_cfg.base_addr = base_addr;
> +    imsic_cfg.base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
> +                                 imsic_cfg.hart_index_bits +
> +                                 IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
> +    imsic_cfg.base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
> +                             imsic_cfg.group_index_shift);
> +
> +    /* Find number of MMIO register sets */
> +    do {
> +        imsic_cfg.nr_mmios++;
> +    } while ( !dt_device_get_address(node, imsic_cfg.nr_mmios, &base_addr, NULL) );
> +
> + cleanup:
> +    xfree(irq_range);

Afacit you could free this array way earlier. That would then simplify quite
a few of the error paths, I think.

> +/*
> + * Initialize the imsic_cfg structure based on the IMSIC DT node.
> + *
> + * Returns 0 if initialization is successful, a negative value on failure,
> + * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
> + * which should be ignored by the hypervisor.
> + */
> +int __init imsic_init(const struct dt_device_node *node)
> +{
> +    int rc;
> +    unsigned long reloff, hartid;
> +    unsigned int nr_parent_irqs, index, nr_handlers = 0;
> +    paddr_t base_addr;
> +    unsigned int nr_mmios;
> +
> +    /* Parse IMSIC node */
> +    rc = imsic_parse_node(node, &nr_parent_irqs);
> +    /*
> +     * If machine mode imsic node => ignore it.
> +     * If rc < 0 => parsing of IMSIC DT node failed.
> +     */
> +    if ( (rc == IRQ_M_EXT) || rc )
> +        return rc;

The former of the checks is redundant with the latter. Did you perhaps mean
"rc < 0" for that one?

> +    nr_mmios = imsic_cfg.nr_mmios;
> +
> +    /* Allocate MMIO resource array */
> +    imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);

How large can this and ...

> +    if ( !imsic_cfg.mmios )
> +    {
> +        rc = -ENOMEM;
> +        goto imsic_init_err;
> +    }
> +
> +    imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);

... this array grow (in principle)? I think you're aware that in principle
new code is expected to use xvmalloc() and friends unless there are specific
reasons speaking against that.

> +    if ( !imsic_cfg.msi )
> +    {
> +        rc = -ENOMEM;
> +        goto imsic_init_err;
> +    }
> +
> +    /* Check MMIO register sets */
> +    for ( unsigned int i = 0; i < nr_mmios; i++ )
> +    {
> +        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
> +        {
> +            rc = -ENOMEM;
> +            goto imsic_init_err;
> +        }
> +
> +        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
> +                                   &imsic_cfg.mmios[i].size);
> +        if ( rc )
> +        {
> +            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
> +                   node->name, i);
> +            goto imsic_init_err;
> +        }
> +
> +        base_addr = imsic_cfg.mmios[i].base_addr;
> +        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
> +                           imsic_cfg.hart_index_bits +
> +                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
> +        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
> +                       imsic_cfg.group_index_shift);
> +        if ( base_addr != imsic_cfg.base_addr )
> +        {
> +            rc = -EINVAL;
> +            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
> +                   node->name, i);
> +            goto imsic_init_err;
> +        }

Maybe just for my own understanding: There's no similar check for the
sizes to match / be consistent wanted / needed?

> +    }
> +
> +    /* Configure handlers for target CPUs */
> +    for ( unsigned int i = 0; i < nr_parent_irqs; i++ )
> +    {
> +        unsigned long xen_cpuid;
> +
> +        rc = imsic_get_parent_hartid(node, i, &hartid);
> +        if ( rc )
> +        {
> +            printk(XENLOG_WARNING "%s: cpu ID for parent irq%u not found\n",
> +                   node->name, i);
> +            continue;
> +        }
> +
> +        xen_cpuid = hartid_to_cpuid(hartid);

I'm probably biased by "cpuid" having different meaning on x86, but: To
be consistent with variable names elsewhere, couldn't this variable simply
be named "cpu"? With the other item named "hartid" there's no ambiguity
there anymore.

> +        if ( xen_cpuid >= num_possible_cpus() )
> +        {
> +            printk(XENLOG_WARNING "%s: unsupported cpu ID=%lu for parent irq%u\n",
> +                   node->name, hartid, i);

The message continues to be ambiguous (to me as a non-RISC-V person at
least): You log a hart ID, while you say "cpu ID". Also, as I think I
said elsewhere already, the hart ID may better be logged using %#lx.

> +            continue;
> +        }
> +
> +        /* Find MMIO location of MSI page */
> +        reloff = i * BIT(imsic_cfg.guest_index_bits, UL) * IMSIC_MMIO_PAGE_SZ;
> +        for ( index = 0; index < nr_mmios; index++ )
> +        {
> +            if ( reloff < imsic_cfg.mmios[index].size )
> +                break;
> +
> +            /*
> +             * MMIO region size may not be aligned to
> +             * BIT(global->guest_index_bits) * IMSIC_MMIO_PAGE_SZ
> +             * if holes are present.
> +             */
> +            reloff -= ROUNDUP(imsic_cfg.mmios[index].size,
> +                BIT(imsic_cfg.guest_index_bits, UL) * IMSIC_MMIO_PAGE_SZ);

Nit: Indentation once again.

> +        }
> +
> +        if ( index == nr_mmios )
> +        {
> +            printk(XENLOG_WARNING "%s: MMIO not found for parent irq%u\n",
> +                   node->name, i);
> +            continue;
> +        }
> +
> +        if ( !IS_ALIGNED(imsic_cfg.msi[xen_cpuid].base_addr + reloff, PAGE_SIZE) )

If this is the crucial thing to check, ...

> +        {
> +            printk(XENLOG_WARNING "%s: MMIO address %#lx is not aligned on a page\n",
> +                   node->name, imsic_cfg.msi[xen_cpuid].base_addr + reloff);
> +            imsic_cfg.msi[xen_cpuid].offset = 0;
> +            imsic_cfg.msi[xen_cpuid].base_addr = 0;
> +            continue;
> +        }
> +
> +        cpumask_set_cpu(xen_cpuid, imsic_cfg.mmios[index].cpus);
> +
> +        imsic_cfg.msi[xen_cpuid].base_addr = imsic_cfg.mmios[index].base_addr;
> +        imsic_cfg.msi[xen_cpuid].offset = reloff;

... why is it that the two parts are stored separately? If their sum needs to
be page-aligned, I'd kind of expect it's only ever the sum which is used?

Also is it really PAGE_SHIFT or rather IMSIC_MMIO_PAGE_SHIFT that needs
chacking against?

Further please pay attention to line length limits - there are at least two
violations around my earlier comment here.

> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/imsic.h
> @@ -0,0 +1,65 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +/*
> + * xen/arch/riscv/include/asm/imsic.h
> + *
> + * RISC-V Incoming MSI Controller support
> + *
> + * (c) Microchip Technology Inc.
> + */
> +
> +#ifndef ASM__RISCV__IMSIC_H
> +#define ASM__RISCV__IMSIC_H

Please update according to the most recent naming rules change (all it takes
may be to shrink the double underscores).

> +#include <xen/types.h>
> +
> +#define IMSIC_MMIO_PAGE_SHIFT   12
> +#define IMSIC_MMIO_PAGE_SZ      (1UL << IMSIC_MMIO_PAGE_SHIFT)
> +
> +#define IMSIC_MIN_ID            63
> +#define IMSIC_MAX_ID            2047
> +
> +struct imsic_msi {
> +    paddr_t base_addr;
> +    unsigned long offset;
> +};
> +
> +struct imsic_mmios {
> +    paddr_t base_addr;
> +    unsigned long size;
> +    cpumask_var_t cpus;
> +};
> +
> +struct imsic_config {
> +    /* Base address */
> +    paddr_t base_addr;
> +
> +    /* Bits representing Guest index, HART index, and Group index */
> +    unsigned int guest_index_bits;
> +    unsigned int hart_index_bits;
> +    unsigned int group_index_bits;
> +    unsigned int group_index_shift;
> +
> +    /* IMSIC phandle */
> +    unsigned int phandle;
> +
> +    /* Number of parent irq */
> +    unsigned int nr_parent_irqs;
> +
> +    /* Number off interrupt identities */
> +    unsigned int nr_ids;
> +
> +    /* MMIOs */
> +    unsigned int nr_mmios;
> +    struct imsic_mmios *mmios;

Are the contents of this and ...

> +    /* MSI */
> +    struct imsic_msi *msi;

... this array ever changing post-init? If not, the pointers here may want
to be pointer-to-const (requiring local variables in the function populating
the field).

> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>      return pcpu_info[cpuid].hart_id;
>  }
>  
> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
> +{
> +    for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
> +    {
> +        if ( hartid == cpuid_to_hartid(cpuid) )
> +            return cpuid;
> +    }
> +
> +    /* hartid isn't valid for some reason */
> +    return NR_CPUS;
> +}

Considering the values being returned, why's the function's return type
"unsigned long"?

Why the use of ARRAY_SIZE() in the loop header? You don't use pcpu_info[]
in the loop body.

Finally - on big systems this is going to be pretty inefficient a lookup.
This may want to at least have a TODO comment attached.

Jan
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Oleksii Kurochko 6 months, 3 weeks ago
On 5/22/25 4:46 PM, Jan Beulich wrote:
> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>> --- /dev/null
>> +++ b/xen/arch/riscv/imsic.c
>> @@ -0,0 +1,354 @@
>> +/* SPDX-License-Identifier: MIT */
>> +
>> +/*
>> + * xen/arch/riscv/imsic.c
>> + *
>> + * RISC-V Incoming MSI Controller support
>> + *
>> + * (c) Microchip Technology Inc.
>> + * (c) Vates
>> + */
>> +
>> +#include <xen/bitops.h>
>> +#include <xen/const.h>
>> +#include <xen/cpumask.h>
>> +#include <xen/device_tree.h>
>> +#include <xen/errno.h>
>> +#include <xen/init.h>
>> +#include <xen/macros.h>
>> +#include <xen/smp.h>
>> +#include <xen/spinlock.h>
>> +#include <xen/xmalloc.h>
>> +
>> +#include <asm/imsic.h>
>> +
>> +static struct imsic_config imsic_cfg;
>> +
>> +    irq_range = xzalloc_array(uint32_t, *nr_parent_irqs * 2);
>> +    if ( !irq_range )
>> +        panic("%s: irq_range[] allocation failed\n", __func__);
>> +
>> +    if ( (rc = dt_property_read_u32_array(node, "interrupts-extended",
>> +                                    irq_range, *nr_parent_irqs * 2)) )
>> +        panic("%s: unable to find interrupt-extended in %s node: %d\n",
>> +              __func__, dt_node_full_name(node), rc);
>> +
>> +    if ( irq_range[1] == IRQ_M_EXT )
>> +    {
>> +        /* Machine mode imsic node, ignore it. */
>> +        rc = IRQ_M_EXT;
>> +        goto cleanup;
>> +    }
> Wouldn't this better be done ...
>
>> +    /* Check that interrupts-extended property is well-formed. */
>> +    for ( unsigned int i = 2; i < (*nr_parent_irqs * 2); i += 2 )
>> +    {
>> +        if ( irq_range[i + 1] != irq_range[1] )
>> +            panic("%s: mode[%d] != %d\n", __func__, i + 1, irq_range[1]);
>> +    }
> ... after this consistency check?

My intention was: if the first|irq_range| isn't|IRQ_M_EXT|, then there's no
point in parsing the full|interrupts-extended| property.

However, on the other hand, it might be important to ensure that the
|interrupts-extended| property is properly formed.

So yes, it makes sense to move the check above, before the|for| loop.

> Also %u please when you log unsigned values.
>
>> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>> +                               &imsic_cfg.guest_index_bits) )
>> +        imsic_cfg.guest_index_bits = 0;
>> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>> +    if ( tmp < imsic_cfg.guest_index_bits )
>> +    {
>> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
>> +               dt_node_name(node));
>> +        rc = -ENOENT;
>> +        goto cleanup;
>> +    }
>> +
>> +    /* Find number of HART index bits */
>> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>> +                               &imsic_cfg.hart_index_bits) )
>> +    {
>> +        /* Assume default value */
>> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>> +            imsic_cfg.hart_index_bits++;
> Since fls() returns a 1-based bit number, isn't it rather that in the
> exact-power-of-2 case you'd need to subtract 1?

Agree, in this case, -1 should be taken into account.


>> +    }
>> +    tmp -= imsic_cfg.guest_index_bits;
>> +    if ( tmp < imsic_cfg.hart_index_bits )
>> +    {
>> +        printk(XENLOG_ERR "%s: HART index bits too big\n",
>> +               dt_node_name(node));
>> +        rc = -ENOENT;
>> +        goto cleanup;
>> +    }
>> +
>> +    /* Find number of group index bits */
>> +    if ( !dt_property_read_u32(node, "riscv,group-index-bits",
>> +                               &imsic_cfg.group_index_bits) )
>> +        imsic_cfg.group_index_bits = 0;
>> +    tmp -= imsic_cfg.hart_index_bits;
>> +    if ( tmp < imsic_cfg.group_index_bits )
>> +    {
>> +        printk(XENLOG_ERR "%s: group index bits too big\n",
>> +               dt_node_name(node));
>> +        rc = -ENOENT;
>> +        goto cleanup;
>> +    }
>> +
>> +    /* Find first bit position of group index */
>> +    tmp = IMSIC_MMIO_PAGE_SHIFT * 2;
>> +    if ( !dt_property_read_u32(node, "riscv,group-index-shift",
>> +                               &imsic_cfg.group_index_shift) )
>> +        imsic_cfg.group_index_shift = tmp;
>> +    if ( imsic_cfg.group_index_shift < tmp )
>> +    {
>> +        printk(XENLOG_ERR "%s: group index shift too small\n",
>> +               dt_node_name(node));
>> +        rc = -ENOENT;
>> +        goto cleanup;
>> +    }
>> +    tmp = imsic_cfg.group_index_bits + imsic_cfg.group_index_shift - 1;
>> +    if ( tmp >= BITS_PER_LONG )
>> +    {
>> +        printk(XENLOG_ERR "%s: group index shift too big\n",
>> +               dt_node_name(node));
>> +        rc = -EINVAL;
>> +        goto cleanup;
>> +    }
>> +
>> +    /* Find number of interrupt identities */
>> +    if ( !dt_property_read_u32(node, "riscv,num-ids", &imsic_cfg.nr_ids) )
>> +    {
>> +        printk(XENLOG_ERR "%s: number of interrupt identities not found\n",
>> +               node->name);
>> +        rc = -ENOENT;
>> +        goto cleanup;
>> +    }
>> +
>> +    if ( (imsic_cfg.nr_ids < IMSIC_MIN_ID) ||
>> +         (imsic_cfg.nr_ids > IMSIC_MAX_ID) ||
>> +         ((imsic_cfg.nr_ids & IMSIC_MIN_ID) != IMSIC_MIN_ID) )
> Now that you've explained to me what the deal is with these constants: Isn't
> the 1st of the three checks redundant with the last one?

Agree, one of them could be dropped.

>> +/*
>> + * Initialize the imsic_cfg structure based on the IMSIC DT node.
>> + *
>> + * Returns 0 if initialization is successful, a negative value on failure,
>> + * or IRQ_M_EXT if the IMSIC node corresponds to a machine-mode IMSIC,
>> + * which should be ignored by the hypervisor.
>> + */
>> +int __init imsic_init(const struct dt_device_node *node)
>> +{
>> +    int rc;
>> +    unsigned long reloff, hartid;
>> +    unsigned int nr_parent_irqs, index, nr_handlers = 0;
>> +    paddr_t base_addr;
>> +    unsigned int nr_mmios;
>> +
>> +    /* Parse IMSIC node */
>> +    rc = imsic_parse_node(node, &nr_parent_irqs);
>> +    /*
>> +     * If machine mode imsic node => ignore it.
>> +     * If rc < 0 => parsing of IMSIC DT node failed.
>> +     */
>> +    if ( (rc == IRQ_M_EXT) || rc )
>> +        return rc;
> The former of the checks is redundant with the latter. Did you perhaps mean
> "rc < 0" for that one?

Yes, like the comment is correct but in code I did a mistake.

>
>> +    nr_mmios = imsic_cfg.nr_mmios;
>> +
>> +    /* Allocate MMIO resource array */
>> +    imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
> How large can this and ...
>
>> +    if ( !imsic_cfg.mmios )
>> +    {
>> +        rc = -ENOMEM;
>> +        goto imsic_init_err;
>> +    }
>> +
>> +    imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
> ... this array grow (in principle)?

Roughly speaking, this is the number of processors. The highests amount of processors
on the market I saw it was 32. But it was over a year ago when I last checked this.

>   I think you're aware that in principle
> new code is expected to use xvmalloc() and friends unless there are specific
> reasons speaking against that.

Oh, missed 'v'...

>
>> +    if ( !imsic_cfg.msi )
>> +    {
>> +        rc = -ENOMEM;
>> +        goto imsic_init_err;
>> +    }
>> +
>> +    /* Check MMIO register sets */
>> +    for ( unsigned int i = 0; i < nr_mmios; i++ )
>> +    {
>> +        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>> +        {
>> +            rc = -ENOMEM;
>> +            goto imsic_init_err;
>> +        }
>> +
>> +        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>> +                                   &imsic_cfg.mmios[i].size);
>> +        if ( rc )
>> +        {
>> +            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>> +                   node->name, i);
>> +            goto imsic_init_err;
>> +        }
>> +
>> +        base_addr = imsic_cfg.mmios[i].base_addr;
>> +        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>> +                           imsic_cfg.hart_index_bits +
>> +                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>> +        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>> +                       imsic_cfg.group_index_shift);
>> +        if ( base_addr != imsic_cfg.base_addr )
>> +        {
>> +            rc = -EINVAL;
>> +            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>> +                   node->name, i);
>> +            goto imsic_init_err;
>> +        }
> Maybe just for my own understanding: There's no similar check for the
> sizes to match / be consistent wanted / needed?

If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
provide, IMO.
So I don't what is possible range for imsic_cfg.mmios[i].size.

>> +    }
>> +
>> +    /* Configure handlers for target CPUs */
>> +    for ( unsigned int i = 0; i < nr_parent_irqs; i++ )
>> +    {
>> +        unsigned long xen_cpuid;
>> +
>> +        rc = imsic_get_parent_hartid(node, i, &hartid);
>> +        if ( rc )
>> +        {
>> +            printk(XENLOG_WARNING "%s: cpu ID for parent irq%u not found\n",
>> +                   node->name, i);
>> +            continue;
>> +        }
>> +
>> +        xen_cpuid = hartid_to_cpuid(hartid);
> I'm probably biased by "cpuid" having different meaning on x86, but: To
> be consistent with variable names elsewhere, couldn't this variable simply
> be named "cpu"? With the other item named "hartid" there's no ambiguity
> there anymore.

Sure, I will use "cpu"/"xen_cpu" for instead of xen_cpuid.

>
>> +        if ( xen_cpuid >= num_possible_cpus() )
>> +        {
>> +            printk(XENLOG_WARNING "%s: unsupported cpu ID=%lu for parent irq%u\n",
>> +                   node->name, hartid, i);
> The message continues to be ambiguous (to me as a non-RISC-V person at
> least): You log a hart ID, while you say "cpu ID". Also, as I think I
> said elsewhere already, the hart ID may better be logged using %#lx.

I will correct the message.

>> +        }
>> +
>> +        if ( index == nr_mmios )
>> +        {
>> +            printk(XENLOG_WARNING "%s: MMIO not found for parent irq%u\n",
>> +                   node->name, i);
>> +            continue;
>> +        }
>> +
>> +        if ( !IS_ALIGNED(imsic_cfg.msi[xen_cpuid].base_addr + reloff, PAGE_SIZE) )
> If this is the crucial thing to check, ...
>
>> +        {
>> +            printk(XENLOG_WARNING "%s: MMIO address %#lx is not aligned on a page\n",
>> +                   node->name, imsic_cfg.msi[xen_cpuid].base_addr + reloff);
>> +            imsic_cfg.msi[xen_cpuid].offset = 0;
>> +            imsic_cfg.msi[xen_cpuid].base_addr = 0;
>> +            continue;
>> +        }
>> +
>> +        cpumask_set_cpu(xen_cpuid, imsic_cfg.mmios[index].cpus);
>> +
>> +        imsic_cfg.msi[xen_cpuid].base_addr = imsic_cfg.mmios[index].base_addr;
>> +        imsic_cfg.msi[xen_cpuid].offset = reloff;
> ... why is it that the two parts are stored separately? If their sum needs to
> be page-aligned, I'd kind of expect it's only ever the sum which is used?

Because in APLIC's target register it is used only base_addr and which one interrupt
file to use is chosen by hart/guest index:
   static void vaplic_update_target(const struct imsic_config *imsic,
                                    int guest_id, unsigned long hart_id,
                                    uint32_t *value)
   {
       unsigned long group_index;
       uint32_t hhxw = imsic->group_index_bits;
       uint32_t lhxw = imsic->hart_index_bits;
       uint32_t hhxs = imsic->group_index_shift - IMSIC_MMIO_PAGE_SHIFT * 2;
       unsigned long base_ppn = imsic->msi[hart_id].base_addr >> IMSIC_MMIO_PAGE_SHIFT;

       group_index = (base_ppn >> (hhxs + 12)) & (BIT(hhxw, UL) - 1);

       *value &= APLIC_TARGET_EIID_MASK;
       *value |= guest_id << APLIC_TARGET_GUEST_IDX_SHIFT;
       *value |= hart_id << APLIC_TARGET_HART_IDX_SHIFT;
       *value |= group_index << (lhxw + APLIC_TARGET_HART_IDX_SHIFT) ;
   }


> Also is it really PAGE_SHIFT or rather IMSIC_MMIO_PAGE_SHIFT that needs
> chacking against?

I think more correct will be IMSIC_MMIO_PAGE_SHIFT.

>
> Further please pay attention to line length limits - there are at least two
> violations around my earlier comment here.
>
>> --- /dev/null
>> +++ b/xen/arch/riscv/include/asm/imsic.h
>> @@ -0,0 +1,65 @@
>> +/* SPDX-License-Identifier: MIT */
>> +
>> +/*
>> + * xen/arch/riscv/include/asm/imsic.h
>> + *
>> + * RISC-V Incoming MSI Controller support
>> + *
>> + * (c) Microchip Technology Inc.
>> + */
>> +
>> +#ifndef ASM__RISCV__IMSIC_H
>> +#define ASM__RISCV__IMSIC_H
> Please update according to the most recent naming rules change (all it takes
> may be to shrink the double underscores).
>
>> +#include <xen/types.h>
>> +
>> +#define IMSIC_MMIO_PAGE_SHIFT   12
>> +#define IMSIC_MMIO_PAGE_SZ      (1UL << IMSIC_MMIO_PAGE_SHIFT)
>> +
>> +#define IMSIC_MIN_ID            63
>> +#define IMSIC_MAX_ID            2047
>> +
>> +struct imsic_msi {
>> +    paddr_t base_addr;
>> +    unsigned long offset;
>> +};
>> +
>> +struct imsic_mmios {
>> +    paddr_t base_addr;
>> +    unsigned long size;
>> +    cpumask_var_t cpus;
>> +};
>> +
>> +struct imsic_config {
>> +    /* Base address */
>> +    paddr_t base_addr;
>> +
>> +    /* Bits representing Guest index, HART index, and Group index */
>> +    unsigned int guest_index_bits;
>> +    unsigned int hart_index_bits;
>> +    unsigned int group_index_bits;
>> +    unsigned int group_index_shift;
>> +
>> +    /* IMSIC phandle */
>> +    unsigned int phandle;
>> +
>> +    /* Number of parent irq */
>> +    unsigned int nr_parent_irqs;
>> +
>> +    /* Number off interrupt identities */
>> +    unsigned int nr_ids;
>> +
>> +    /* MMIOs */
>> +    unsigned int nr_mmios;
>> +    struct imsic_mmios *mmios;
> Are the contents of this and ...
>
>> +    /* MSI */
>> +    struct imsic_msi *msi;
> ... this array ever changing post-init? If not, the pointers here may want
> to be pointer-to-const (requiring local variables in the function populating
> the field).

No, they will be initialized once.

Even more I think we can drop  *mmios and nr_mmios from this struct as it is used only
in imsic_init(), so could be allocated and freed there.

Only *msi is used in the function (vaplic_update_target) mentioned above.

>
>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>>       return pcpu_info[cpuid].hart_id;
>>   }
>>   
>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>> +{
>> +    for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>> +    {
>> +        if ( hartid == cpuid_to_hartid(cpuid) )
>> +            return cpuid;
>> +    }
>> +
>> +    /* hartid isn't valid for some reason */
>> +    return NR_CPUS;
>> +}
> Considering the values being returned, why's the function's return type
> "unsigned long"?

mhartid register has MXLEN length, so theoretically we could have from 0 to MXLEN-1
Harts and so we could have the same amount of Xen's CPUIDs. and MXLEN is 32 for RV32
and MXLEN is 64 for RV64.

>
> Why the use of ARRAY_SIZE() in the loop header? You don't use pcpu_info[]
> in the loop body.

I will chang to NR_CPUs. I thought that it would be more generic if pcpu_info will
be initialized with something else, not NR_CPUs, but I don't rembember why I think
it would be better.

>
> Finally - on big systems this is going to be pretty inefficient a lookup.
> This may want to at least have a TODO comment attached.

Sure, I will add.

Thanks.

~ Oleksii
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Jan Beulich 6 months, 2 weeks ago
On 26.05.2025 20:44, Oleksii Kurochko wrote:
> On 5/22/25 4:46 PM, Jan Beulich wrote:
>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>> +    /* Allocate MMIO resource array */
>>> +    imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
>> How large can this and ...
>>
>>> +    if ( !imsic_cfg.mmios )
>>> +    {
>>> +        rc = -ENOMEM;
>>> +        goto imsic_init_err;
>>> +    }
>>> +
>>> +    imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
>> ... this array grow (in principle)?
> 
> Roughly speaking, this is the number of processors. The highests amount of processors
> on the market I saw it was 32. But it was over a year ago when I last checked this.

Unless there's an architectural limit, I don't think it's a good idea to
take as reference what's available at present. But yes, ...

>>   I think you're aware that in principle
>> new code is expected to use xvmalloc() and friends unless there are specific
>> reasons speaking against that.
> 
> Oh, missed 'v'...

... adding the missing 'v' will take care of my concern. Provided of
course this isn't running to early for vmalloc() to be usable just yet.

>>> +    if ( !imsic_cfg.msi )
>>> +    {
>>> +        rc = -ENOMEM;
>>> +        goto imsic_init_err;
>>> +    }
>>> +
>>> +    /* Check MMIO register sets */
>>> +    for ( unsigned int i = 0; i < nr_mmios; i++ )
>>> +    {
>>> +        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>> +        {
>>> +            rc = -ENOMEM;
>>> +            goto imsic_init_err;
>>> +        }
>>> +
>>> +        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>> +                                   &imsic_cfg.mmios[i].size);
>>> +        if ( rc )
>>> +        {
>>> +            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>> +                   node->name, i);
>>> +            goto imsic_init_err;
>>> +        }
>>> +
>>> +        base_addr = imsic_cfg.mmios[i].base_addr;
>>> +        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>> +                           imsic_cfg.hart_index_bits +
>>> +                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>> +        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>> +                       imsic_cfg.group_index_shift);
>>> +        if ( base_addr != imsic_cfg.base_addr )
>>> +        {
>>> +            rc = -EINVAL;
>>> +            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>> +                   node->name, i);
>>> +            goto imsic_init_err;
>>> +        }
>> Maybe just for my own understanding: There's no similar check for the
>> sizes to match / be consistent wanted / needed?
> 
> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
> provide, IMO.
> So I don't what is possible range for imsic_cfg.mmios[i].size.

Well, all I can say is that's it feels odd that you sanity check base_addr
but permit effectively any size.

>>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>>>       return pcpu_info[cpuid].hart_id;
>>>   }
>>>   
>>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>>> +{
>>> +    for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>>> +    {
>>> +        if ( hartid == cpuid_to_hartid(cpuid) )
>>> +            return cpuid;
>>> +    }
>>> +
>>> +    /* hartid isn't valid for some reason */
>>> +    return NR_CPUS;
>>> +}
>> Considering the values being returned, why's the function's return type
>> "unsigned long"?
> 
> mhartid register has MXLEN length, so theoretically we could have from 0 to MXLEN-1
> Harts and so we could have the same amount of Xen's CPUIDs. and MXLEN is 32 for RV32
> and MXLEN is 64 for RV64.

Yet the return value here is always constrained by NR_CPUS, isn't it?

Jan
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Oleksii Kurochko 6 months, 2 weeks ago
On 6/2/25 12:21 PM, Jan Beulich wrote:
> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>> +    /* Allocate MMIO resource array */
>>>> +    imsic_cfg.mmios = xzalloc_array(struct imsic_mmios, nr_mmios);
>>> How large can this and ...
>>>
>>>> +    if ( !imsic_cfg.mmios )
>>>> +    {
>>>> +        rc = -ENOMEM;
>>>> +        goto imsic_init_err;
>>>> +    }
>>>> +
>>>> +    imsic_cfg.msi = xzalloc_array(struct imsic_msi, nr_parent_irqs);
>>> ... this array grow (in principle)?
>> Roughly speaking, this is the number of processors. The highests amount of processors
>> on the market I saw it was 32. But it was over a year ago when I last checked this.
> Unless there's an architectural limit, I don't think it's a good idea to
> take as reference what's available at present. But yes, ...

This (32) is not an architectural limit.
I assume that if mhartd id accepts a range from 0 to  2^64-1 for RV64 then I assume
that the *theoretical* limit for amount of cpus is  2^64-1. And in RISC-V spec. I can't
find if it is theoretical limit or not.
But if look into AIA (interrupt controller) specification then it tells explicitly that limit
is 16,384:
   1.2 Limits
   In its current version, the RISC-V Advanced Interrupt Architecture can support RISC-V symmet-ric
   multiprocessing (SMP) systems with up to 16,384 harts. If the harts are 64-bit (RV64) and implement
   the hypervisor extension, and if all features of the Advanced Interrupt Architecture are fully
   implemented as well, then for each physical hart there may be up to 63 active virtual harts and
   potentially thousands of additional idle (swapped-out) virtual harts, where each virtual hart has
   direct control of one or more physical devices.
Also 16,384 is used as a maximum for nr_parent_irqs from DTS point of view.

>
>>>    I think you're aware that in principle
>>> new code is expected to use xvmalloc() and friends unless there are specific
>>> reasons speaking against that.
>> Oh, missed 'v'...
> ... adding the missing 'v' will take care of my concern. Provided of
> course this isn't running to early for vmalloc() to be usable just yet.
>
>>>> +    if ( !imsic_cfg.msi )
>>>> +    {
>>>> +        rc = -ENOMEM;
>>>> +        goto imsic_init_err;
>>>> +    }
>>>> +
>>>> +    /* Check MMIO register sets */
>>>> +    for ( unsigned int i = 0; i < nr_mmios; i++ )
>>>> +    {
>>>> +        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>>> +        {
>>>> +            rc = -ENOMEM;
>>>> +            goto imsic_init_err;
>>>> +        }
>>>> +
>>>> +        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>>> +                                   &imsic_cfg.mmios[i].size);
>>>> +        if ( rc )
>>>> +        {
>>>> +            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>>> +                   node->name, i);
>>>> +            goto imsic_init_err;
>>>> +        }
>>>> +
>>>> +        base_addr = imsic_cfg.mmios[i].base_addr;
>>>> +        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>>> +                           imsic_cfg.hart_index_bits +
>>>> +                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>>> +        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>>> +                       imsic_cfg.group_index_shift);
>>>> +        if ( base_addr != imsic_cfg.base_addr )
>>>> +        {
>>>> +            rc = -EINVAL;
>>>> +            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>>> +                   node->name, i);
>>>> +            goto imsic_init_err;
>>>> +        }
>>> Maybe just for my own understanding: There's no similar check for the
>>> sizes to match / be consistent wanted / needed?
>> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
>> provide, IMO.
>> So I don't what is possible range for imsic_cfg.mmios[i].size.
> Well, all I can say is that's it feels odd that you sanity check base_addr
> but permit effectively any size.

Okay, I think I have two ideas how to check the size:
1. Based on guest bits from IMSIC's DT node. QEMU calculates a size as:
     for (socket = 0; socket < socket_count; socket++) {
         imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
         imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
                      s->soc[socket].num_harts;
     ...
    where:
      #define IMSIC_MMIO_PAGE_SHIFT          12
      #define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
      
      #define IMSIC_HART_NUM_GUESTS(__guest_bits)           \
              (1U << (__guest_bits))
      #define IMSIC_HART_SIZE(__guest_bits)                 \
              (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)

2. Just take a theoretical maximum for S-mode IMSIC's node:
     16,384 * 64 1(S-mode interrupt file) + 63(max guest interrupt files)) * 4 KiB
    Where,
      16,384 - maximum possible amount of harts according to AIA spec
      64 - a maximum amount of possible interrupt file for S-mode IMSIC node:
           1 - S interupt file + 63 guest interrupt files.
      4 Kib - a maximum size of one interrupt file.

Which option is preferred?

The specification doesn’t seem to mention (or I couldn’t find) that all platforms
must calculate the MMIO size in the same way QEMU does. Therefore, it’s probably
better to use the approach described in option 2.

On the other hand, I don't think a platform should be considered correct if it
provides slightly more than needed but still less than the theoretical maximum.

>
>>>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>>>>        return pcpu_info[cpuid].hart_id;
>>>>    }
>>>>    
>>>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>>>> +{
>>>> +    for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>>>> +    {
>>>> +        if ( hartid == cpuid_to_hartid(cpuid) )
>>>> +            return cpuid;
>>>> +    }
>>>> +
>>>> +    /* hartid isn't valid for some reason */
>>>> +    return NR_CPUS;
>>>> +}
>>> Considering the values being returned, why's the function's return type
>>> "unsigned long"?
>> mhartid register has MXLEN length, so theoretically we could have from 0 to MXLEN-1
>> Harts and so we could have the same amount of Xen's CPUIDs. and MXLEN is 32 for RV32
>> and MXLEN is 64 for RV64.
> Yet the return value here is always constrained by NR_CPUS, isn't it?

I am not 100% sure that I get your point, but I put NR_CPUS here because of:
   /*
    * tp points to one of these per cpu.
    *
    * hart_id would be valid (no matter which value) if its
    * processor_id field is valid (less than NR_CPUS).
    */
   struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
       .processor_id = NR_CPUS,
   }};

As an option we could use ULONG_MAX. Would it be better?

~ Oleksii
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Jan Beulich 6 months, 1 week ago
On 04.06.2025 17:36, Oleksii Kurochko wrote:
> On 6/2/25 12:21 PM, Jan Beulich wrote:
>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>>> +    /* Check MMIO register sets */
>>>>> +    for ( unsigned int i = 0; i < nr_mmios; i++ )
>>>>> +    {
>>>>> +        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>>>> +        {
>>>>> +            rc = -ENOMEM;
>>>>> +            goto imsic_init_err;
>>>>> +        }
>>>>> +
>>>>> +        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>>>> +                                   &imsic_cfg.mmios[i].size);
>>>>> +        if ( rc )
>>>>> +        {
>>>>> +            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>>>> +                   node->name, i);
>>>>> +            goto imsic_init_err;
>>>>> +        }
>>>>> +
>>>>> +        base_addr = imsic_cfg.mmios[i].base_addr;
>>>>> +        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>>>> +                           imsic_cfg.hart_index_bits +
>>>>> +                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>>>> +        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>>>> +                       imsic_cfg.group_index_shift);
>>>>> +        if ( base_addr != imsic_cfg.base_addr )
>>>>> +        {
>>>>> +            rc = -EINVAL;
>>>>> +            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>>>> +                   node->name, i);
>>>>> +            goto imsic_init_err;
>>>>> +        }
>>>> Maybe just for my own understanding: There's no similar check for the
>>>> sizes to match / be consistent wanted / needed?
>>> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
>>> provide, IMO.
>>> So I don't what is possible range for imsic_cfg.mmios[i].size.
>> Well, all I can say is that's it feels odd that you sanity check base_addr
>> but permit effectively any size.
> 
> Okay, I think I have two ideas how to check the size:
> 1. Based on guest bits from IMSIC's DT node. QEMU calculates a size as:
>      for (socket = 0; socket < socket_count; socket++) {
>          imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
>          imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
>                       s->soc[socket].num_harts;
>      ...
>     where:
>       #define IMSIC_MMIO_PAGE_SHIFT          12
>       #define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
>       
>       #define IMSIC_HART_NUM_GUESTS(__guest_bits)           \
>               (1U << (__guest_bits))
>       #define IMSIC_HART_SIZE(__guest_bits)                 \
>               (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)
> 
> 2. Just take a theoretical maximum for S-mode IMSIC's node:
>      16,384 * 64 1(S-mode interrupt file) + 63(max guest interrupt files)) * 4 KiB
>     Where,
>       16,384 - maximum possible amount of harts according to AIA spec
>       64 - a maximum amount of possible interrupt file for S-mode IMSIC node:
>            1 - S interupt file + 63 guest interrupt files.
>       4 Kib - a maximum size of one interrupt file.
> 
> Which option is preferred?

I would have said 2, if your outline used "actual" rather than "maximum" values.

> The specification doesn’t seem to mention (or I couldn’t find) that all platforms
> must calculate the MMIO size in the same way QEMU does. Therefore, it’s probably
> better to use the approach described in option 2.
> 
> On the other hand, I don't think a platform should be considered correct if it
> provides slightly more than needed but still less than the theoretical maximum.
> 
>>
>>>>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>>>>>        return pcpu_info[cpuid].hart_id;
>>>>>    }
>>>>>    
>>>>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>>>>> +{
>>>>> +    for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>>>>> +    {
>>>>> +        if ( hartid == cpuid_to_hartid(cpuid) )
>>>>> +            return cpuid;
>>>>> +    }
>>>>> +
>>>>> +    /* hartid isn't valid for some reason */
>>>>> +    return NR_CPUS;
>>>>> +}
>>>> Considering the values being returned, why's the function's return type
>>>> "unsigned long"?
>>> mhartid register has MXLEN length, so theoretically we could have from 0 to MXLEN-1
>>> Harts and so we could have the same amount of Xen's CPUIDs. and MXLEN is 32 for RV32
>>> and MXLEN is 64 for RV64.
>> Yet the return value here is always constrained by NR_CPUS, isn't it?
> 
> I am not 100% sure that I get your point,

NR_CPUS is guaranteed to fit in a unsigned int. Furthermore variables / parameters
holding Xen-internal CPU numbers also are unsigned int everywhere else.

> but I put NR_CPUS here because of:
>    /*
>     * tp points to one of these per cpu.
>     *
>     * hart_id would be valid (no matter which value) if its
>     * processor_id field is valid (less than NR_CPUS).
>     */
>    struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
>        .processor_id = NR_CPUS,
>    }};
> 
> As an option we could use ULONG_MAX. Would it be better?

No. NR_CPUS is the appropriate value to use here, imo.

Jan

Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Oleksii Kurochko 6 months, 1 week ago
On 6/5/25 8:50 AM, Jan Beulich wrote:
> On 04.06.2025 17:36, Oleksii Kurochko wrote:
>> On 6/2/25 12:21 PM, Jan Beulich wrote:
>>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>>>> +    /* Check MMIO register sets */
>>>>>> +    for ( unsigned int i = 0; i < nr_mmios; i++ )
>>>>>> +    {
>>>>>> +        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>>>>> +        {
>>>>>> +            rc = -ENOMEM;
>>>>>> +            goto imsic_init_err;
>>>>>> +        }
>>>>>> +
>>>>>> +        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>>>>> +                                   &imsic_cfg.mmios[i].size);
>>>>>> +        if ( rc )
>>>>>> +        {
>>>>>> +            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>>>>> +                   node->name, i);
>>>>>> +            goto imsic_init_err;
>>>>>> +        }
>>>>>> +
>>>>>> +        base_addr = imsic_cfg.mmios[i].base_addr;
>>>>>> +        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>>>>> +                           imsic_cfg.hart_index_bits +
>>>>>> +                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>>>>> +        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>>>>> +                       imsic_cfg.group_index_shift);
>>>>>> +        if ( base_addr != imsic_cfg.base_addr )
>>>>>> +        {
>>>>>> +            rc = -EINVAL;
>>>>>> +            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>>>>> +                   node->name, i);
>>>>>> +            goto imsic_init_err;
>>>>>> +        }
>>>>> Maybe just for my own understanding: There's no similar check for the
>>>>> sizes to match / be consistent wanted / needed?
>>>> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
>>>> provide, IMO.
>>>> So I don't what is possible range for imsic_cfg.mmios[i].size.
>>> Well, all I can say is that's it feels odd that you sanity check base_addr
>>> but permit effectively any size.
>> Okay, I think I have two ideas how to check the size:
>> 1. Based on guest bits from IMSIC's DT node. QEMU calculates a size as:
>>       for (socket = 0; socket < socket_count; socket++) {
>>           imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
>>           imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
>>                        s->soc[socket].num_harts;
>>       ...
>>      where:
>>        #define IMSIC_MMIO_PAGE_SHIFT          12
>>        #define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
>>        
>>        #define IMSIC_HART_NUM_GUESTS(__guest_bits)           \
>>                (1U << (__guest_bits))
>>        #define IMSIC_HART_SIZE(__guest_bits)                 \
>>                (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)
>>
>> 2. Just take a theoretical maximum for S-mode IMSIC's node:
>>       16,384 * 64 1(S-mode interrupt file) + 63(max guest interrupt files)) * 4 KiB
>>      Where,
>>        16,384 - maximum possible amount of harts according to AIA spec
>>        64 - a maximum amount of possible interrupt file for S-mode IMSIC node:
>>             1 - S interupt file + 63 guest interrupt files.
>>        4 Kib - a maximum size of one interrupt file.
>>
>> Which option is preferred?
> I would have said 2, if your outline used "actual" rather than "maximum" values.

In option 2 maximum possible values are used. If we want to use "actual" values then
the option 1 should be used. At the moment, I did in the following way that S-mode
IMSIC node, at least, should contain 1 S-mode interrupt file + an amount of guest
interrupts files (based on imsic_cfg.guest_index_bits):

+#define IMSIC_HART_SIZE(guest_bits_) (BIT(guest_bits_, U) * IMSIC_MMIO_PAGE_SZ)
+
  #define IMSIC_DISABLE_EIDELIVERY    0
  #define IMSIC_ENABLE_EIDELIVERY     1
  #define IMSIC_DISABLE_EITHRESHOLD   1
@@ -359,6 +356,10 @@ int __init imsic_init(const struct dt_device_node *node)
      /* Check MMIO register sets */
      for ( unsigned int i = 0; i < nr_mmios; i++ )
      {
+        unsigned int guest_bits = imsic_cfg.guest_index_bits;
+        unsigned long expected_size =
+            IMSIC_HART_SIZE(guest_bits) * nr_parent_irqs;
+
          rc = dt_device_get_address(node, i, &mmios[i].base_addr,
                                     &mmios[i].size);
          if ( rc )
@@ -369,7 +370,7 @@ int __init imsic_init(const struct dt_device_node *node)
          }
  
          base_addr = mmios[i].base_addr;
-        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
+        base_addr &= ~(BIT(guest_bits +
                             imsic_cfg.hart_index_bits +
                             IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
          base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
@@ -381,6 +382,14 @@ int __init imsic_init(const struct dt_device_node *node)
                     node->name, i);
              goto imsic_init_err;
          }
+
+        if ( mmios[i].size != expected_size )
+        {
+            rc = -EINVAL;
+            printk(XENLOG_ERR "%s: IMSIC MMIO size is incorrect %ld, "
+                   "max:%ld\n", node->name, mmios[i].size, max_size);
+            goto imsic_init_err;
+        }
      }

>
>> The specification doesn’t seem to mention (or I couldn’t find) that all platforms
>> must calculate the MMIO size in the same way QEMU does. Therefore, it’s probably
>> better to use the approach described in option 2.
>>
>> On the other hand, I don't think a platform should be considered correct if it
>> provides slightly more than needed but still less than the theoretical maximum.
>>
>>>>>> @@ -18,6 +19,18 @@ static inline unsigned long cpuid_to_hartid(unsigned long cpuid)
>>>>>>         return pcpu_info[cpuid].hart_id;
>>>>>>     }
>>>>>>     
>>>>>> +static inline unsigned long hartid_to_cpuid(unsigned long hartid)
>>>>>> +{
>>>>>> +    for ( unsigned int cpuid = 0; cpuid < ARRAY_SIZE(pcpu_info); cpuid++ )
>>>>>> +    {
>>>>>> +        if ( hartid == cpuid_to_hartid(cpuid) )
>>>>>> +            return cpuid;
>>>>>> +    }
>>>>>> +
>>>>>> +    /* hartid isn't valid for some reason */
>>>>>> +    return NR_CPUS;
>>>>>> +}
>>>>> Considering the values being returned, why's the function's return type
>>>>> "unsigned long"?
>>>> mhartid register has MXLEN length, so theoretically we could have from 0 to MXLEN-1
>>>> Harts and so we could have the same amount of Xen's CPUIDs. and MXLEN is 32 for RV32
>>>> and MXLEN is 64 for RV64.
>>> Yet the return value here is always constrained by NR_CPUS, isn't it?
>> I am not 100% sure that I get your point,
> NR_CPUS is guaranteed to fit in a unsigned int. Furthermore variables / parameters
> holding Xen-internal CPU numbers also are unsigned int everywhere else.

Okay, then agree, hartid_to_cpuid() should return unsigned int. I'll update correspondingly.

Thanks.

~ Oleksii

>
>> but I put NR_CPUS here because of:
>>     /*
>>      * tp points to one of these per cpu.
>>      *
>>      * hart_id would be valid (no matter which value) if its
>>      * processor_id field is valid (less than NR_CPUS).
>>      */
>>     struct pcpu_info pcpu_info[NR_CPUS] = { [0 ... NR_CPUS - 1] = {
>>         .processor_id = NR_CPUS,
>>     }};
>>
>> As an option we could use ULONG_MAX. Would it be better?
> No. NR_CPUS is the appropriate value to use here, imo.
>
> Jan
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Jan Beulich 6 months, 1 week ago
On 05.06.2025 11:13, Oleksii Kurochko wrote:
> 
> On 6/5/25 8:50 AM, Jan Beulich wrote:
>> On 04.06.2025 17:36, Oleksii Kurochko wrote:
>>> On 6/2/25 12:21 PM, Jan Beulich wrote:
>>>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>>>>> +    /* Check MMIO register sets */
>>>>>>> +    for ( unsigned int i = 0; i < nr_mmios; i++ )
>>>>>>> +    {
>>>>>>> +        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>>>>>> +        {
>>>>>>> +            rc = -ENOMEM;
>>>>>>> +            goto imsic_init_err;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>>>>>> +                                   &imsic_cfg.mmios[i].size);
>>>>>>> +        if ( rc )
>>>>>>> +        {
>>>>>>> +            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>>>>>> +                   node->name, i);
>>>>>>> +            goto imsic_init_err;
>>>>>>> +        }
>>>>>>> +
>>>>>>> +        base_addr = imsic_cfg.mmios[i].base_addr;
>>>>>>> +        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>>>>>> +                           imsic_cfg.hart_index_bits +
>>>>>>> +                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>>>>>> +        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>>>>>> +                       imsic_cfg.group_index_shift);
>>>>>>> +        if ( base_addr != imsic_cfg.base_addr )
>>>>>>> +        {
>>>>>>> +            rc = -EINVAL;
>>>>>>> +            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>>>>>> +                   node->name, i);
>>>>>>> +            goto imsic_init_err;
>>>>>>> +        }
>>>>>> Maybe just for my own understanding: There's no similar check for the
>>>>>> sizes to match / be consistent wanted / needed?
>>>>> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
>>>>> provide, IMO.
>>>>> So I don't what is possible range for imsic_cfg.mmios[i].size.
>>>> Well, all I can say is that's it feels odd that you sanity check base_addr
>>>> but permit effectively any size.
>>> Okay, I think I have two ideas how to check the size:
>>> 1. Based on guest bits from IMSIC's DT node. QEMU calculates a size as:
>>>       for (socket = 0; socket < socket_count; socket++) {
>>>           imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
>>>           imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
>>>                        s->soc[socket].num_harts;
>>>       ...
>>>      where:
>>>        #define IMSIC_MMIO_PAGE_SHIFT          12
>>>        #define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
>>>        
>>>        #define IMSIC_HART_NUM_GUESTS(__guest_bits)           \
>>>                (1U << (__guest_bits))
>>>        #define IMSIC_HART_SIZE(__guest_bits)                 \
>>>                (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)
>>>
>>> 2. Just take a theoretical maximum for S-mode IMSIC's node:
>>>       16,384 * 64 1(S-mode interrupt file) + 63(max guest interrupt files)) * 4 KiB
>>>      Where,
>>>        16,384 - maximum possible amount of harts according to AIA spec
>>>        64 - a maximum amount of possible interrupt file for S-mode IMSIC node:
>>>             1 - S interupt file + 63 guest interrupt files.
>>>        4 Kib - a maximum size of one interrupt file.
>>>
>>> Which option is preferred?
>> I would have said 2, if your outline used "actual" rather than "maximum" values.
> 
> In option 2 maximum possible values are used. If we want to use "actual" values then
> the option 1 should be used.

Actually I was wrong with request "actual" uniformly. It's only the hart count where
I meant to ask for that. For interrupts we should allow the maximum possible unless
we already know their count.

Jan
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Oleksii Kurochko 6 months, 1 week ago
On 6/5/25 11:42 AM, Jan Beulich wrote:
> On 05.06.2025 11:13, Oleksii Kurochko wrote:
>> On 6/5/25 8:50 AM, Jan Beulich wrote:
>>> On 04.06.2025 17:36, Oleksii Kurochko wrote:
>>>> On 6/2/25 12:21 PM, Jan Beulich wrote:
>>>>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>>>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>>>>>> +    /* Check MMIO register sets */
>>>>>>>> +    for ( unsigned int i = 0; i < nr_mmios; i++ )
>>>>>>>> +    {
>>>>>>>> +        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>>>>>>> +        {
>>>>>>>> +            rc = -ENOMEM;
>>>>>>>> +            goto imsic_init_err;
>>>>>>>> +        }
>>>>>>>> +
>>>>>>>> +        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>>>>>>> +                                   &imsic_cfg.mmios[i].size);
>>>>>>>> +        if ( rc )
>>>>>>>> +        {
>>>>>>>> +            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>>>>>>> +                   node->name, i);
>>>>>>>> +            goto imsic_init_err;
>>>>>>>> +        }
>>>>>>>> +
>>>>>>>> +        base_addr = imsic_cfg.mmios[i].base_addr;
>>>>>>>> +        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>>>>>>> +                           imsic_cfg.hart_index_bits +
>>>>>>>> +                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>>>>>>> +        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>>>>>>> +                       imsic_cfg.group_index_shift);
>>>>>>>> +        if ( base_addr != imsic_cfg.base_addr )
>>>>>>>> +        {
>>>>>>>> +            rc = -EINVAL;
>>>>>>>> +            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>>>>>>> +                   node->name, i);
>>>>>>>> +            goto imsic_init_err;
>>>>>>>> +        }
>>>>>>> Maybe just for my own understanding: There's no similar check for the
>>>>>>> sizes to match / be consistent wanted / needed?
>>>>>> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
>>>>>> provide, IMO.
>>>>>> So I don't what is possible range for imsic_cfg.mmios[i].size.
>>>>> Well, all I can say is that's it feels odd that you sanity check base_addr
>>>>> but permit effectively any size.
>>>> Okay, I think I have two ideas how to check the size:
>>>> 1. Based on guest bits from IMSIC's DT node. QEMU calculates a size as:
>>>>        for (socket = 0; socket < socket_count; socket++) {
>>>>            imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
>>>>            imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
>>>>                         s->soc[socket].num_harts;
>>>>        ...
>>>>       where:
>>>>         #define IMSIC_MMIO_PAGE_SHIFT          12
>>>>         #define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
>>>>         
>>>>         #define IMSIC_HART_NUM_GUESTS(__guest_bits)           \
>>>>                 (1U << (__guest_bits))
>>>>         #define IMSIC_HART_SIZE(__guest_bits)                 \
>>>>                 (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)
>>>>
>>>> 2. Just take a theoretical maximum for S-mode IMSIC's node:
>>>>        16,384 * 64 1(S-mode interrupt file) + 63(max guest interrupt files)) * 4 KiB
>>>>       Where,
>>>>         16,384 - maximum possible amount of harts according to AIA spec
>>>>         64 - a maximum amount of possible interrupt file for S-mode IMSIC node:
>>>>              1 - S interupt file + 63 guest interrupt files.
>>>>         4 Kib - a maximum size of one interrupt file.
>>>>
>>>> Which option is preferred?
>>> I would have said 2, if your outline used "actual" rather than "maximum" values.
>> In option 2 maximum possible values are used. If we want to use "actual" values then
>> the option 1 should be used.
> Actually I was wrong with request "actual" uniformly. It's only the hart count where
> I meant to ask for that. For interrupts we should allow the maximum possible unless
> we already know their count.

Do you mean 'interrupt file' here? If yes, then an amount of them shouldn't be bigger
then 1 + BIT(guest_bits).

~ Oleksii
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Jan Beulich 6 months, 1 week ago
On 05.06.2025 12:15, Oleksii Kurochko wrote:
> 
> On 6/5/25 11:42 AM, Jan Beulich wrote:
>> On 05.06.2025 11:13, Oleksii Kurochko wrote:
>>> On 6/5/25 8:50 AM, Jan Beulich wrote:
>>>> On 04.06.2025 17:36, Oleksii Kurochko wrote:
>>>>> On 6/2/25 12:21 PM, Jan Beulich wrote:
>>>>>> On 26.05.2025 20:44, Oleksii Kurochko wrote:
>>>>>>> On 5/22/25 4:46 PM, Jan Beulich wrote:
>>>>>>>> On 21.05.2025 18:03, Oleksii Kurochko wrote:
>>>>>>>>> +    /* Check MMIO register sets */
>>>>>>>>> +    for ( unsigned int i = 0; i < nr_mmios; i++ )
>>>>>>>>> +    {
>>>>>>>>> +        if ( !alloc_cpumask_var(&imsic_cfg.mmios[i].cpus) )
>>>>>>>>> +        {
>>>>>>>>> +            rc = -ENOMEM;
>>>>>>>>> +            goto imsic_init_err;
>>>>>>>>> +        }
>>>>>>>>> +
>>>>>>>>> +        rc = dt_device_get_address(node, i, &imsic_cfg.mmios[i].base_addr,
>>>>>>>>> +                                   &imsic_cfg.mmios[i].size);
>>>>>>>>> +        if ( rc )
>>>>>>>>> +        {
>>>>>>>>> +            printk(XENLOG_ERR "%s: unable to parse MMIO regset %u\n",
>>>>>>>>> +                   node->name, i);
>>>>>>>>> +            goto imsic_init_err;
>>>>>>>>> +        }
>>>>>>>>> +
>>>>>>>>> +        base_addr = imsic_cfg.mmios[i].base_addr;
>>>>>>>>> +        base_addr &= ~(BIT(imsic_cfg.guest_index_bits +
>>>>>>>>> +                           imsic_cfg.hart_index_bits +
>>>>>>>>> +                           IMSIC_MMIO_PAGE_SHIFT, UL) - 1);
>>>>>>>>> +        base_addr &= ~((BIT(imsic_cfg.group_index_bits, UL) - 1) <<
>>>>>>>>> +                       imsic_cfg.group_index_shift);
>>>>>>>>> +        if ( base_addr != imsic_cfg.base_addr )
>>>>>>>>> +        {
>>>>>>>>> +            rc = -EINVAL;
>>>>>>>>> +            printk(XENLOG_ERR "%s: address mismatch for regset %u\n",
>>>>>>>>> +                   node->name, i);
>>>>>>>>> +            goto imsic_init_err;
>>>>>>>>> +        }
>>>>>>>> Maybe just for my own understanding: There's no similar check for the
>>>>>>>> sizes to match / be consistent wanted / needed?
>>>>>>> If you are speaking about imsic_cfg.mmios[i].size then it depends fully on h/w will
>>>>>>> provide, IMO.
>>>>>>> So I don't what is possible range for imsic_cfg.mmios[i].size.
>>>>>> Well, all I can say is that's it feels odd that you sanity check base_addr
>>>>>> but permit effectively any size.
>>>>> Okay, I think I have two ideas how to check the size:
>>>>> 1. Based on guest bits from IMSIC's DT node. QEMU calculates a size as:
>>>>>        for (socket = 0; socket < socket_count; socket++) {
>>>>>            imsic_addr = base_addr + socket * VIRT_IMSIC_GROUP_MAX_SIZE;
>>>>>            imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
>>>>>                         s->soc[socket].num_harts;
>>>>>        ...
>>>>>       where:
>>>>>         #define IMSIC_MMIO_PAGE_SHIFT          12
>>>>>         #define IMSIC_MMIO_PAGE_SZ             (1UL << IMSIC_MMIO_PAGE_SHIFT)
>>>>>         
>>>>>         #define IMSIC_HART_NUM_GUESTS(__guest_bits)           \
>>>>>                 (1U << (__guest_bits))
>>>>>         #define IMSIC_HART_SIZE(__guest_bits)                 \
>>>>>                 (IMSIC_HART_NUM_GUESTS(__guest_bits) * IMSIC_MMIO_PAGE_SZ)
>>>>>
>>>>> 2. Just take a theoretical maximum for S-mode IMSIC's node:
>>>>>        16,384 * 64 1(S-mode interrupt file) + 63(max guest interrupt files)) * 4 KiB
>>>>>       Where,
>>>>>         16,384 - maximum possible amount of harts according to AIA spec
>>>>>         64 - a maximum amount of possible interrupt file for S-mode IMSIC node:
>>>>>              1 - S interupt file + 63 guest interrupt files.
>>>>>         4 Kib - a maximum size of one interrupt file.
>>>>>
>>>>> Which option is preferred?
>>>> I would have said 2, if your outline used "actual" rather than "maximum" values.
>>> In option 2 maximum possible values are used. If we want to use "actual" values then
>>> the option 1 should be used.
>> Actually I was wrong with request "actual" uniformly. It's only the hart count where
>> I meant to ask for that. For interrupts we should allow the maximum possible unless
>> we already know their count.
> 
> Do you mean 'interrupt file' here?

Yes, I do. Sorry for getting the terminology wrong.

Jan

> If yes, then an amount of them shouldn't be bigger
> then 1 + BIT(guest_bits).
> 
> ~ Oleksii
>
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Oleksii Kurochko 6 months, 3 weeks ago
On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>> +                               &imsic_cfg.guest_index_bits) )
>>> +        imsic_cfg.guest_index_bits = 0;
>>> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>> +    if ( tmp < imsic_cfg.guest_index_bits )
>>> +    {
>>> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
>>> +               dt_node_name(node));
>>> +        rc = -ENOENT;
>>> +        goto cleanup;
>>> +    }
>>> +
>>> +    /* Find number of HART index bits */
>>> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>> +                               &imsic_cfg.hart_index_bits) )
>>> +    {
>>> +        /* Assume default value */
>>> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>> +            imsic_cfg.hart_index_bits++;
>> Since fls() returns a 1-based bit number, isn't it rather that in the
>> exact-power-of-2 case you'd need to subtract 1?
> Agree, in this case, -1 should be taken into account.

Hmm, it seems like in case of fls() returns a 1-based bit number there
is not need for the check:
  (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )

We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
checking *nr_parent_irqs is power-of-two or not, and then just leave the
check (2).
And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
I amn't mistaken something. And if I'm not mistaken, then probably it make
sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
case is so special.

Does it make sense?

~ Oleksii
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Jan Beulich 6 months, 2 weeks ago
On 27.05.2025 13:30, Oleksii Kurochko wrote:
> 
> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>> +                               &imsic_cfg.guest_index_bits) )
>>>> +        imsic_cfg.guest_index_bits = 0;
>>>> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>> +    if ( tmp < imsic_cfg.guest_index_bits )
>>>> +    {
>>>> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>> +               dt_node_name(node));
>>>> +        rc = -ENOENT;
>>>> +        goto cleanup;
>>>> +    }
>>>> +
>>>> +    /* Find number of HART index bits */
>>>> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>> +                               &imsic_cfg.hart_index_bits) )
>>>> +    {
>>>> +        /* Assume default value */
>>>> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>> +            imsic_cfg.hart_index_bits++;
>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>> exact-power-of-2 case you'd need to subtract 1?
>> Agree, in this case, -1 should be taken into account.
> 
> Hmm, it seems like in case of fls() returns a 1-based bit number there
> is not need for the check:
>   (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
> 
> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
> checking *nr_parent_irqs is power-of-two or not, and then just leave the
> check (2).
> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
> I amn't mistaken something. And if I'm not mistaken, then probably it make
> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
> case is so special.
> 
> Does it make sense?

Can't easily tell; I'd like to see the resulting code instead of the textual
description.

Jan
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Oleksii Kurochko 6 months, 2 weeks ago
On 6/2/25 12:22 PM, Jan Beulich wrote:
> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>> +                               &imsic_cfg.guest_index_bits) )
>>>>> +        imsic_cfg.guest_index_bits = 0;
>>>>> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>> +    if ( tmp < imsic_cfg.guest_index_bits )
>>>>> +    {
>>>>> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>> +               dt_node_name(node));
>>>>> +        rc = -ENOENT;
>>>>> +        goto cleanup;
>>>>> +    }
>>>>> +
>>>>> +    /* Find number of HART index bits */
>>>>> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>> +                               &imsic_cfg.hart_index_bits) )
>>>>> +    {
>>>>> +        /* Assume default value */
>>>>> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>> +            imsic_cfg.hart_index_bits++;
>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>> exact-power-of-2 case you'd need to subtract 1?
>>> Agree, in this case, -1 should be taken into account.
>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>> is not need for the check:
>>    (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>
>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>> check (2).
>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>> case is so special.
>>
>> Does it make sense?
> Can't easily tell; I'd like to see the resulting code instead of the textual
> description.

Here is the code:
     /* Find number of HART index bits */
     if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
                                &imsic_cfg.hart_index_bits) )
     {
         /* Assume default value */
         imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
                                     (*nr_parent_irqs == 1);
     }

It seems like it covers all the cases.

~ Oleksii
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Jan Beulich 6 months, 2 weeks ago
On 04.06.2025 15:42, Oleksii Kurochko wrote:
> 
> On 6/2/25 12:22 PM, Jan Beulich wrote:
>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>> +                               &imsic_cfg.guest_index_bits) )
>>>>>> +        imsic_cfg.guest_index_bits = 0;
>>>>>> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>> +    if ( tmp < imsic_cfg.guest_index_bits )
>>>>>> +    {
>>>>>> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>> +               dt_node_name(node));
>>>>>> +        rc = -ENOENT;
>>>>>> +        goto cleanup;
>>>>>> +    }
>>>>>> +
>>>>>> +    /* Find number of HART index bits */
>>>>>> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>> +                               &imsic_cfg.hart_index_bits) )
>>>>>> +    {
>>>>>> +        /* Assume default value */
>>>>>> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>> +            imsic_cfg.hart_index_bits++;
>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>> Agree, in this case, -1 should be taken into account.
>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>> is not need for the check:
>>>    (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>
>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>> check (2).
>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>> case is so special.
>>>
>>> Does it make sense?
>> Can't easily tell; I'd like to see the resulting code instead of the textual
>> description.
> 
> Here is the code:
>      /* Find number of HART index bits */
>      if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>                                 &imsic_cfg.hart_index_bits) )
>      {
>          /* Assume default value */
>          imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
>                                      (*nr_parent_irqs == 1);
>      }
> 
> It seems like it covers all the cases.

*nr_parent_irqs		imsic_cfg.hart_index_bits
	 1			1 (0 + 1)
	 2			1
	 3			2
	 4			2
	 5			3
	 6			3

IOW why the special casing of *nr_parent_irqs == 1?

Jan
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Oleksii Kurochko 6 months, 2 weeks ago
On 6/4/25 5:05 PM, Jan Beulich wrote:
> On 04.06.2025 15:42, Oleksii Kurochko wrote:
>> On 6/2/25 12:22 PM, Jan Beulich wrote:
>>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>>> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>>> +                               &imsic_cfg.guest_index_bits) )
>>>>>>> +        imsic_cfg.guest_index_bits = 0;
>>>>>>> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>>> +    if ( tmp < imsic_cfg.guest_index_bits )
>>>>>>> +    {
>>>>>>> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>>> +               dt_node_name(node));
>>>>>>> +        rc = -ENOENT;
>>>>>>> +        goto cleanup;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    /* Find number of HART index bits */
>>>>>>> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>>> +                               &imsic_cfg.hart_index_bits) )
>>>>>>> +    {
>>>>>>> +        /* Assume default value */
>>>>>>> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>>> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>>> +            imsic_cfg.hart_index_bits++;
>>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>>> Agree, in this case, -1 should be taken into account.
>>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>>> is not need for the check:
>>>>     (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>
>>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>>> check (2).
>>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>>> case is so special.
>>>>
>>>> Does it make sense?
>>> Can't easily tell; I'd like to see the resulting code instead of the textual
>>> description.
>> Here is the code:
>>       /* Find number of HART index bits */
>>       if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>                                  &imsic_cfg.hart_index_bits) )
>>       {
>>           /* Assume default value */
>>           imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
>>                                       (*nr_parent_irqs == 1);
>>       }
>>
>> It seems like it covers all the cases.
> *nr_parent_irqs		imsic_cfg.hart_index_bits
> 	 1			1 (0 + 1)
> 	 2			1
> 	 3			2
> 	 4			2
> 	 5			3
> 	 6			3
>
> IOW why the special casing of *nr_parent_irqs == 1?

If we don't have "... + (*nr_parent_irqs == 1)" then for the case when *nr_parent_irqs == 1,
we will have imsic_cfg.hart_index_bits = fls(1-1) = fls(0) = 0 because:
   #define arch_fls(x)     ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
and imsic_cfg.hart_index_bits = 0 doesn't seem correct because I expect that if I have only
1 hart then at least 1 bit will be needed to point to it.

~ Oleksii



Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Jan Beulich 6 months, 1 week ago
On 04.06.2025 17:41, Oleksii Kurochko wrote:
> 
> On 6/4/25 5:05 PM, Jan Beulich wrote:
>> On 04.06.2025 15:42, Oleksii Kurochko wrote:
>>> On 6/2/25 12:22 PM, Jan Beulich wrote:
>>>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>>>> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>>>> +                               &imsic_cfg.guest_index_bits) )
>>>>>>>> +        imsic_cfg.guest_index_bits = 0;
>>>>>>>> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>>>> +    if ( tmp < imsic_cfg.guest_index_bits )
>>>>>>>> +    {
>>>>>>>> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>>>> +               dt_node_name(node));
>>>>>>>> +        rc = -ENOENT;
>>>>>>>> +        goto cleanup;
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +    /* Find number of HART index bits */
>>>>>>>> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>>>> +                               &imsic_cfg.hart_index_bits) )
>>>>>>>> +    {
>>>>>>>> +        /* Assume default value */
>>>>>>>> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>>>> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>>>> +            imsic_cfg.hart_index_bits++;
>>>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>>>> Agree, in this case, -1 should be taken into account.
>>>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>>>> is not need for the check:
>>>>>     (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>
>>>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>>>> check (2).
>>>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>>>> case is so special.
>>>>>
>>>>> Does it make sense?
>>>> Can't easily tell; I'd like to see the resulting code instead of the textual
>>>> description.
>>> Here is the code:
>>>       /* Find number of HART index bits */
>>>       if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>                                  &imsic_cfg.hart_index_bits) )
>>>       {
>>>           /* Assume default value */
>>>           imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
>>>                                       (*nr_parent_irqs == 1);
>>>       }
>>>
>>> It seems like it covers all the cases.
>> *nr_parent_irqs		imsic_cfg.hart_index_bits
>> 	 1			1 (0 + 1)
>> 	 2			1
>> 	 3			2
>> 	 4			2
>> 	 5			3
>> 	 6			3
>>
>> IOW why the special casing of *nr_parent_irqs == 1?
> 
> If we don't have "... + (*nr_parent_irqs == 1)" then for the case when *nr_parent_irqs == 1,
> we will have imsic_cfg.hart_index_bits = fls(1-1) = fls(0) = 0 because:
>    #define arch_fls(x)     ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
> and imsic_cfg.hart_index_bits = 0 doesn't seem correct because I expect that if I have only
> 1 hart then at least 1 bit will be needed to point to it.

No, why? To pick 1 out of 1 you need no bits at all to represent.

Jan
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Oleksii Kurochko 6 months, 1 week ago
On 6/5/25 8:52 AM, Jan Beulich wrote:
> On 04.06.2025 17:41, Oleksii Kurochko wrote:
>> On 6/4/25 5:05 PM, Jan Beulich wrote:
>>> On 04.06.2025 15:42, Oleksii Kurochko wrote:
>>>> On 6/2/25 12:22 PM, Jan Beulich wrote:
>>>>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>>>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>>>>> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>>>>> +                               &imsic_cfg.guest_index_bits) )
>>>>>>>>> +        imsic_cfg.guest_index_bits = 0;
>>>>>>>>> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>>>>> +    if ( tmp < imsic_cfg.guest_index_bits )
>>>>>>>>> +    {
>>>>>>>>> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>>>>> +               dt_node_name(node));
>>>>>>>>> +        rc = -ENOENT;
>>>>>>>>> +        goto cleanup;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    /* Find number of HART index bits */
>>>>>>>>> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>>>>> +                               &imsic_cfg.hart_index_bits) )
>>>>>>>>> +    {
>>>>>>>>> +        /* Assume default value */
>>>>>>>>> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>>>>> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>>>>> +            imsic_cfg.hart_index_bits++;
>>>>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>>>>> Agree, in this case, -1 should be taken into account.
>>>>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>>>>> is not need for the check:
>>>>>>      (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>>
>>>>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>>>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>>>>> check (2).
>>>>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>>>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>>>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>>>>> case is so special.
>>>>>>
>>>>>> Does it make sense?
>>>>> Can't easily tell; I'd like to see the resulting code instead of the textual
>>>>> description.
>>>> Here is the code:
>>>>        /* Find number of HART index bits */
>>>>        if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>                                   &imsic_cfg.hart_index_bits) )
>>>>        {
>>>>            /* Assume default value */
>>>>            imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
>>>>                                        (*nr_parent_irqs == 1);
>>>>        }
>>>>
>>>> It seems like it covers all the cases.
>>> *nr_parent_irqs		imsic_cfg.hart_index_bits
>>> 	 1			1 (0 + 1)
>>> 	 2			1
>>> 	 3			2
>>> 	 4			2
>>> 	 5			3
>>> 	 6			3
>>>
>>> IOW why the special casing of *nr_parent_irqs == 1?
>> If we don't have "... + (*nr_parent_irqs == 1)" then for the case when *nr_parent_irqs == 1,
>> we will have imsic_cfg.hart_index_bits = fls(1-1) = fls(0) = 0 because:
>>     #define arch_fls(x)     ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>> and imsic_cfg.hart_index_bits = 0 doesn't seem correct because I expect that if I have only
>> 1 hart then at least 1 bit will be needed to point to it.
> No, why? To pick 1 out of 1 you need no bits at all to represent.

It seems you are right, I thought that DT's binding requires it to be minimum 1, but it could
be zero.
Then it is okay just to initialize hart_index_bits without extra " + (*nr_parent_irqs == 1)":
   imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1);

~ Oleksii
Re: [PATCH v3 08/14] xen/riscv: imsic_init() implementation
Posted by Jan Beulich 6 months, 2 weeks ago
On 04.06.2025 15:42, Oleksii Kurochko wrote:
> 
> On 6/2/25 12:22 PM, Jan Beulich wrote:
>> On 27.05.2025 13:30, Oleksii Kurochko wrote:
>>> On 5/26/25 8:44 PM, Oleksii Kurochko wrote:
>>>>>> +    if ( !dt_property_read_u32(node, "riscv,guest-index-bits",
>>>>>> +                               &imsic_cfg.guest_index_bits) )
>>>>>> +        imsic_cfg.guest_index_bits = 0;
>>>>>> +    tmp = BITS_PER_LONG - IMSIC_MMIO_PAGE_SHIFT;
>>>>>> +    if ( tmp < imsic_cfg.guest_index_bits )
>>>>>> +    {
>>>>>> +        printk(XENLOG_ERR "%s: guest index bits too big\n",
>>>>>> +               dt_node_name(node));
>>>>>> +        rc = -ENOENT;
>>>>>> +        goto cleanup;
>>>>>> +    }
>>>>>> +
>>>>>> +    /* Find number of HART index bits */
>>>>>> +    if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>>>>>> +                               &imsic_cfg.hart_index_bits) )
>>>>>> +    {
>>>>>> +        /* Assume default value */
>>>>>> +        imsic_cfg.hart_index_bits = fls(*nr_parent_irqs);
>>>>>> +        if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>>>> +            imsic_cfg.hart_index_bits++;
>>>>> Since fls() returns a 1-based bit number, isn't it rather that in the
>>>>> exact-power-of-2 case you'd need to subtract 1?
>>>> Agree, in this case, -1 should be taken into account.
>>> Hmm, it seems like in case of fls() returns a 1-based bit number there
>>> is not need for the check:
>>>    (2) if ( BIT(imsic_cfg.hart_index_bits, UL) < *nr_parent_irqs )
>>>
>>> We could do imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) (1) without
>>> checking *nr_parent_irqs is power-of-two or not, and then just leave the
>>> check (2).
>>> And with (1), the check (2) is only needed for the case *nr_parent_irqs=1, if
>>> I amn't mistaken something. And if I'm not mistaken, then probably it make
>>> sense to change (2) to if ( *nr_parent_irqs == 1 ) + some comment why this
>>> case is so special.
>>>
>>> Does it make sense?
>> Can't easily tell; I'd like to see the resulting code instead of the textual
>> description.
> 
> Here is the code:
>      /* Find number of HART index bits */
>      if ( !dt_property_read_u32(node, "riscv,hart-index-bits",
>                                 &imsic_cfg.hart_index_bits) )
>      {
>          /* Assume default value */
>          imsic_cfg.hart_index_bits = fls(*nr_parent_irqs - 1) +
>                                      (*nr_parent_irqs == 1);
>      }
> 
> It seems like it covers all the cases.

*nr_parent_irqs		imsic_cfg.hart_index_bits
	 1			0
	 2			2 (1 + 1)
	 3			2
	 4			2
	 5			3
	 6			3

IOW why the special casing of *nr_parent_irqs == 1?

Jan