:p
atchew
Login
The patch series is based on top of 'RISCV basic exception handling implementation' patch series. [1] The patch series introduces the following things: 1. Functionality to build the page tables for Xen that map link-time to physical-time location. 2. Check that Xen is less then page size. 3. Check that load addresses don't overlap with linker addresses. 4. Prepare things for proper switch to virtual memory world. 5. Load the built page table into the SATP 6. Enable MMU. [1] https://lore.kernel.org/xen-devel/cover.1678976127.git.oleksii.kurochko@gmail.com/T/#t --- Changes in V2: * Remove {ZEROETH,FIRST,...}_{SHIFT,MASK, SIZE,...} and introduce instead of them XEN_PT_LEVEL_*() and LEVEL_* * Rework pt_linear_offset() and pt_index based on XEN_PT_LEVEL_*() * Remove clear_pagetables() functions as pagetables were zeroed during .bss initialization * Rename _setup_initial_pagetables() to setup_initial_mapping() * Make PTE_DEFAULT equal to RX. * Update prototype of setup_initial_mapping(..., bool writable) -> setup_initial_mapping(..., UL flags) * Update calls of setup_initial_mapping according to new prototype * Remove unnecessary call of: _setup_initial_pagetables(..., load_addr_start, load_addr_end, load_addr_start, ...) * Define index* in the loop of setup_initial_mapping * Remove attribute "__attribute__((section(".entry")))" for setup_initial_pagetables() as we don't have such section * make arguments of paddr_to_pte() and pte_is_valid() as const. * use <xen/kernel.h> instead of declaring extern unsigned long _stext, 0etext, _srodata, _erodata * update 'extern unsigned long __init_begin' to 'extern unsigned long __init_begin[]' * use aligned() instead of "__attribute__((__aligned__(PAGE_SIZE)))" * set __section(".bss.page_aligned") for page tables arrays * fix identatations * Change '__attribute__((section(".entry")))' to '__init' * Remove alignment of {map, pa}_start &= XEN_PT_LEVEL_MAP_MASK(0); in setup_inital_mapping() as they should be already aligned. * Remove clear_pagetables() as initial pagetables will be zeroed during bss initialization * Remove __attribute__((section(".entry")) for setup_initial_pagetables() as there is no such section in xen.lds.S * Update the argument of pte_is_valid() to "const pte_t *p" * Remove patch "[PATCH v1 3/3] automation: update RISC-V smoke test" from the patch series as it was introduced simplified approach for RISC-V smoke test by Andrew Cooper * Add patch [[xen/riscv: remove dummy_bss variable] as there is no any sense in dummy_bss variable after introduction of inittial page tables. --- Oleksii Kurochko (3): xen/riscv: introduce setup_initial_pages xen/riscv: setup initial pagetables xen/riscv: remove dummy_bss variable xen/arch/riscv/Makefile | 1 + xen/arch/riscv/include/asm/mm.h | 8 ++ xen/arch/riscv/include/asm/page.h | 67 +++++++++++++++++ xen/arch/riscv/mm.c | 121 ++++++++++++++++++++++++++++++ xen/arch/riscv/riscv64/head.S | 65 ++++++++++++++++ xen/arch/riscv/setup.c | 13 ++-- xen/arch/riscv/xen.lds.S | 2 + 7 files changed, 269 insertions(+), 8 deletions(-) create mode 100644 xen/arch/riscv/include/asm/mm.h create mode 100644 xen/arch/riscv/include/asm/page.h create mode 100644 xen/arch/riscv/mm.c -- 2.39.2
Mostly the code for setup_initial_pages was taken from Bobby's repo except for the following changes: * Use only a minimal part of the code enough to enable MMU * rename {_}setup_initial_pagetables functions * add an argument for setup_initial_mapping to have an opportunity to make set PTE flags. * update setup_initial_pagetables function to map sections with correct PTE flags. * introduce separate enable_mmu() to be able for proper handling the case when load start address isn't equal to linker start address. * map linker addresses range to load addresses range without 1:1 mapping. * add safety checks such as: * Xen size is less than page size * linker addresses range doesn't overlap load addresses range * Rework macros {THIRD,SECOND,FIRST,ZEROETH}_{SHIFT,MASK} * change PTE_LEAF_DEFAULT to RX instead of RWX. * Remove phys_offset as it isn't used now. * Remove alignment of {map, pa}_start &= XEN_PT_LEVEL_MAP_MASK(0); in setup_inital_mapping() as they should be already aligned. * Remove clear_pagetables() as initial pagetables will be zeroed during bss initialization * Remove __attribute__((section(".entry")) for setup_initial_pagetables() as there is no such section in xen.lds.S * Update the argument of pte_is_valid() to "const pte_t *p" Origin: https://gitlab.com/xen-on-risc-v/xen/-/tree/riscv-rebase 4af165b468af Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in V2: * update the commit message: * Remove {ZEROETH,FIRST,...}_{SHIFT,MASK, SIZE,...} and introduce instead of them XEN_PT_LEVEL_*() and LEVEL_* * Rework pt_linear_offset() and pt_index based on XEN_PT_LEVEL_*() * Remove clear_pagetables() functions as pagetables were zeroed during .bss initialization * Rename _setup_initial_pagetables() to setup_initial_mapping() * Make PTE_DEFAULT equal to RX. * Update prototype of setup_initial_mapping(..., bool writable) -> setup_initial_mapping(..., UL flags) * Update calls of setup_initial_mapping according to new prototype * Remove unnecessary call of: _setup_initial_pagetables(..., load_addr_start, load_addr_end, load_addr_start, ...) * Define index* in the loop of setup_initial_mapping * Remove attribute "__attribute__((section(".entry")))" for setup_initial_pagetables() as we don't have such section * make arguments of paddr_to_pte() and pte_is_valid() as const. * make xen_second_pagetable static. * use <xen/kernel.h> instead of declaring extern unsigned long _stext, 0etext, _srodata, _erodata * update 'extern unsigned long __init_begin' to 'extern unsigned long __init_begin[]' * use aligned() instead of "__attribute__((__aligned__(PAGE_SIZE)))" * set __section(".bss.page_aligned") for page tables arrays * fix identatations * Change '__attribute__((section(".entry")))' to '__init' * Remove phys_offset as it isn't used now. * Remove alignment of {map, pa}_start &= XEN_PT_LEVEL_MAP_MASK(0); in setup_inital_mapping() as they should be already aligned. * Remove clear_pagetables() as initial pagetables will be zeroed during bss initialization * Remove __attribute__((section(".entry")) for setup_initial_pagetables() as there is no such section in xen.lds.S * Update the argument of pte_is_valid() to "const pte_t *p" --- xen/arch/riscv/Makefile | 1 + xen/arch/riscv/include/asm/mm.h | 8 ++ xen/arch/riscv/include/asm/page.h | 67 +++++++++++++++++ xen/arch/riscv/mm.c | 121 ++++++++++++++++++++++++++++++ xen/arch/riscv/riscv64/head.S | 65 ++++++++++++++++ xen/arch/riscv/xen.lds.S | 2 + 6 files changed, 264 insertions(+) create mode 100644 xen/arch/riscv/include/asm/mm.h create mode 100644 xen/arch/riscv/include/asm/page.h create mode 100644 xen/arch/riscv/mm.c diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/Makefile +++ b/xen/arch/riscv/Makefile @@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_EARLY_PRINTK) += early_printk.o obj-y += entry.o +obj-y += mm.o obj-$(CONFIG_RISCV_64) += riscv64/ obj-y += sbi.o obj-y += setup.o diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/xen/arch/riscv/include/asm/mm.h @@ -XXX,XX +XXX,XX @@ +#ifndef _ASM_RISCV_MM_H +#define _ASM_RISCV_MM_H + +void setup_initial_pagetables(void); + +extern void enable_mmu(void); + +#endif /* _ASM_RISCV_MM_H */ diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/xen/arch/riscv/include/asm/page.h @@ -XXX,XX +XXX,XX @@ +#ifndef _ASM_RISCV_PAGE_H +#define _ASM_RISCV_PAGE_H + +#include <xen/const.h> +#include <xen/types.h> + +#define PAGE_ENTRIES (1 << PAGETABLE_ORDER) +#define VPN_MASK ((unsigned long)(PAGE_ENTRIES - 1)) + +#define PAGE_ORDER (12) + +#ifdef CONFIG_RISCV_64 +#define PAGETABLE_ORDER (9) +#else /* CONFIG_RISCV_32 */ +#define PAGETABLE_ORDER (10) +#endif + +#define LEVEL_ORDER(lvl) (lvl * PAGETABLE_ORDER) +#define LEVEL_SHIFT(lvl) (LEVEL_ORDER(lvl) + PAGE_ORDER) +#define LEVEL_SIZE(lvl) (_AT(paddr_t, 1) << LEVEL_SHIFT(lvl)) + +#define XEN_PT_LEVEL_SHIFT(lvl) LEVEL_SHIFT(lvl) +#define XEN_PT_LEVEL_ORDER(lvl) LEVEL_ORDER(lvl) +#define XEN_PT_LEVEL_SIZE(lvl) LEVEL_SIZE(lvl) +#define XEN_PT_LEVEL_MAP_MASK(lvl) (~(XEN_PT_LEVEL_SIZE(lvl) - 1)) +#define XEN_PT_LEVEL_MASK(lvl) (VPN_MASK << XEN_PT_LEVEL_SHIFT(lvl)) + +#define PTE_SHIFT 10 + +#define PTE_VALID BIT(0, UL) +#define PTE_READABLE BIT(1, UL) +#define PTE_WRITABLE BIT(2, UL) +#define PTE_EXECUTABLE BIT(3, UL) +#define PTE_USER BIT(4, UL) +#define PTE_GLOBAL BIT(5, UL) +#define PTE_ACCESSED BIT(6, UL) +#define PTE_DIRTY BIT(7, UL) +#define PTE_RSW (BIT(8, UL) | BIT(9, UL)) + +#define PTE_LEAF_DEFAULT (PTE_VALID | PTE_READABLE | PTE_EXECUTABLE) +#define PTE_TABLE (PTE_VALID) + +/* Calculate the offsets into the pagetables for a given VA */ +#define pt_linear_offset(lvl, va) ((va) >> XEN_PT_LEVEL_SHIFT(lvl)) + +#define pt_index(lvl, va) pt_linear_offset(lvl, (va) & XEN_PT_LEVEL_MASK(lvl)) + +/* Page Table entry */ +typedef struct { + uint64_t pte; +} pte_t; + +/* Shift the VPN[x] or PPN[x] fields of a virtual or physical address + * to become the shifted PPN[x] fields of a page table entry */ +#define addr_to_ppn(x) (((x) >> PAGE_SHIFT) << PTE_SHIFT) + +static inline pte_t paddr_to_pte(const unsigned long paddr) +{ + return (pte_t) { .pte = addr_to_ppn(paddr) }; +} + +static inline bool pte_is_valid(const pte_t *p) +{ + return p->pte & PTE_VALID; +} + +#endif /* _ASM_RISCV_PAGE_H */ diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/xen/arch/riscv/mm.c @@ -XXX,XX +XXX,XX @@ +#include <xen/compiler.h> +#include <xen/init.h> +#include <xen/kernel.h> +#include <xen/lib.h> +#include <xen/page-size.h> + +#include <asm/boot-info.h> +#include <asm/config.h> +#include <asm/csr.h> +#include <asm/mm.h> +#include <asm/page.h> +#include <asm/traps.h> + +/* + * xen_second_pagetable is indexed with the VPN[2] page table entry field + * xen_first_pagetable is accessed from the VPN[1] page table entry field + * xen_zeroeth_pagetable is accessed from the VPN[0] page table entry field + */ +pte_t __section(".bss.page_aligned") __aligned(PAGE_SIZE) + xen_second_pagetable[PAGE_ENTRIES]; +static pte_t __section(".bss.page_aligned") __aligned(PAGE_SIZE) + xen_first_pagetable[PAGE_ENTRIES]; +static pte_t __section(".bss.page_aligned") __aligned(PAGE_SIZE) + xen_zeroeth_pagetable[PAGE_ENTRIES]; + +extern unsigned long __init_begin[]; +extern unsigned long __init_end[]; +extern unsigned char cpu0_boot_stack[STACK_SIZE]; + +static void __init +setup_initial_mapping(pte_t *second, pte_t *first, pte_t *zeroeth, + unsigned long map_start, + unsigned long map_end, + unsigned long pa_start, + unsigned long flags) +{ + unsigned long page_addr; + + // /* align start addresses */ + // map_start &= XEN_PT_LEVEL_MAP_MASK(0); + // pa_start &= XEN_PT_LEVEL_MAP_MASK(0); + + page_addr = map_start; + while ( page_addr < map_end ) + { + unsigned long index2 = pt_index(2, page_addr); + unsigned long index1 = pt_index(1, page_addr); + unsigned long index0 = pt_index(0, page_addr); + + /* Setup level2 table */ + second[index2] = paddr_to_pte((unsigned long)first); + second[index2].pte |= PTE_TABLE; + + /* Setup level1 table */ + first[index1] = paddr_to_pte((unsigned long)zeroeth); + first[index1].pte |= PTE_TABLE; + + + /* Setup level0 table */ + if ( !pte_is_valid(&zeroeth[index0]) ) + { + /* Update level0 table */ + zeroeth[index0] = paddr_to_pte((page_addr - map_start) + pa_start); + zeroeth[index0].pte |= flags; + } + + /* Point to next page */ + page_addr += XEN_PT_LEVEL_SIZE(0); + } +} + +/* + * setup_initial_pagetables: + * + * Build the page tables for Xen that map the following: + * load addresses to linker addresses + */ +void __init setup_initial_pagetables(void) +{ + pte_t *second; + pte_t *first; + pte_t *zeroeth; + + unsigned long load_addr_start = boot_info.load_start; + unsigned long load_addr_end = boot_info.load_end; + unsigned long linker_addr_start = boot_info.linker_start; + unsigned long linker_addr_end = boot_info.linker_end; + + BUG_ON(load_addr_start % (PAGE_ENTRIES * PAGE_SIZE) != 0); + if (load_addr_start != linker_addr_start) + BUG_ON((linker_addr_end > load_addr_start && load_addr_end > linker_addr_start)); + + /* Get the addresses where the page tables were loaded */ + second = (pte_t *)(&xen_second_pagetable); + first = (pte_t *)(&xen_first_pagetable); + zeroeth = (pte_t *)(&xen_zeroeth_pagetable); + + setup_initial_mapping(second, first, zeroeth, + LOAD_TO_LINK((unsigned long)&_stext), + LOAD_TO_LINK((unsigned long)&_etext), + (unsigned long)&_stext, + PTE_LEAF_DEFAULT); + + setup_initial_mapping(second, first, zeroeth, + LOAD_TO_LINK((unsigned long)&__init_begin), + LOAD_TO_LINK((unsigned long)&__init_end), + (unsigned long)&__init_begin, + PTE_LEAF_DEFAULT | PTE_WRITABLE); + + setup_initial_mapping(second, first, zeroeth, + LOAD_TO_LINK((unsigned long)&_srodata), + LOAD_TO_LINK((unsigned long)&_erodata), + (unsigned long)(&_srodata), + PTE_VALID | PTE_READABLE); + + setup_initial_mapping(second, first, zeroeth, + linker_addr_start, + linker_addr_end, + load_addr_start, + PTE_LEAF_DEFAULT | PTE_READABLE); +} diff --git a/xen/arch/riscv/riscv64/head.S b/xen/arch/riscv/riscv64/head.S index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/riscv64/head.S +++ b/xen/arch/riscv/riscv64/head.S @@ -XXX,XX +XXX,XX @@ #include <asm/asm.h> +#include <asm/asm-offsets.h> #include <asm/riscv_encoding.h> .section .text.header, "ax", %progbits @@ -XXX,XX +XXX,XX @@ ENTRY(start) add sp, sp, t0 tail start_xen + +ENTRY(enable_mmu) + /* Calculate physical offset between linker and load addresses */ + la t0, boot_info + REG_L t1, BI_LINKER_START(t0) + REG_L t2, BI_LOAD_START(t0) + sub t1, t1, t2 + + /* + * Calculate and update a linker time address of the .L_mmu_is_enabled + * label and update CSR_STVEC with it. + * MMU is configured in a way where linker addresses are mapped + * on load addresses so case when linker addresses are not equal to + * load addresses, and thereby, after MMU is enabled, it will cause + * an exception and jump to linker time addresses + */ + la t3, .L_mmu_is_enabled + add t3, t3, t1 + csrw CSR_STVEC, t3 + + /* Calculate a value for SATP register */ + li t5, SATP_MODE_SV39 + li t6, SATP_MODE_SHIFT + sll t5, t5, t6 + + la t4, xen_second_pagetable + srl t4, t4, PAGE_SHIFT + or t4, t4, t5 + sfence.vma + csrw CSR_SATP, t4 + + .align 2 +.L_mmu_is_enabled: + /* + * Stack should be re-inited as: + * 1. Right now an address of the stack is relative to load time + * addresses what will cause an issue in case of load start address + * isn't equal to linker start address. + * 2. Addresses in stack are all load time relative which can be an + * issue in case when load start address isn't equal to linker + * start address. + */ + la sp, cpu0_boot_stack + li t0, STACK_SIZE + add sp, sp, t0 + + /* + * Re-init an address of exception handler as it was overwritten with + * the address of the .L_mmu_is_enabled label. + * Before jump to trap_init save return address of enable_mmu() to + * know where we should back after enable_mmu() will be finished. + */ + mv s0, ra + lla t0, trap_init + jalr ra, t0 + + /* + * Re-calculate the return address of enable_mmu() function for case + * when linker start address isn't equal to load start address + */ + add s0, s0, t1 + mv ra, s0 + + ret diff --git a/xen/arch/riscv/xen.lds.S b/xen/arch/riscv/xen.lds.S index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/xen.lds.S +++ b/xen/arch/riscv/xen.lds.S @@ -XXX,XX +XXX,XX @@ SECTIONS ASSERT(!SIZEOF(.got), ".got non-empty") ASSERT(!SIZEOF(.got.plt), ".got.plt non-empty") + +ASSERT(_end - _start <= MB(2), "Xen too large for early-boot assumptions") -- 2.39.2
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in V2: * Update the commit message --- xen/arch/riscv/setup.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/setup.c +++ b/xen/arch/riscv/setup.c @@ -XXX,XX +XXX,XX @@ #include <asm/boot-info.h> #include <asm/csr.h> #include <asm/early_printk.h> +#include <asm/mm.h> #include <asm/traps.h> /* Xen stack for bringing up the first CPU. */ @@ -XXX,XX +XXX,XX @@ void __init noreturn start_xen(unsigned long bootcpu_id, test_macros_from_bug_h(); + setup_initial_pagetables(); + + enable_mmu(); + early_printk("All set up\n"); for ( ;; ) -- 2.39.2
After introduction of initial pagetables there is no any sense in dummy_bss variable as bss section will not be empty anymore. Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in V2: * patch was introduced in the current one patch series (v2). --- xen/arch/riscv/setup.c | 8 -------- 1 file changed, 8 deletions(-) diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/setup.c +++ b/xen/arch/riscv/setup.c @@ -XXX,XX +XXX,XX @@ unsigned char __initdata cpu0_boot_stack[STACK_SIZE] struct boot_info boot_info = { (unsigned long)&_start, (unsigned long)&_end, 0UL, 0UL }; -/* - * To be sure that .bss isn't zero. It will simplify code of - * .bss initialization. - * TODO: - * To be deleted when the first real .bss user appears - */ -int dummy_bss __attribute__((unused)); - static void fill_boot_info(void) { boot_info.load_start = (unsigned long)_start; -- 2.39.2
The patch series introduces the following things: 1. Functionality to build the page tables for Xen that map link-time to physical-time location. 2. Check that Xen is less then page size. 3. Check that load addresses don't overlap with linker addresses. 4. Prepare things for proper switch to virtual memory world. 5. Load the built page table into the SATP 6. Enable MMU. --- Changes in V5: * rebase the patch series on top of current staging * Update cover letter: it was removed the info about the patches on which MMU patch series is based as they were merged to staging * add new patch with description of VM layout for RISC-V2 * Indent fields of pte_t struct * Rename addr_to_pte() and ppn_to_paddr() to match their content --- Changes in V4: * use GB() macros instead of defining SZ_1G * hardcode XEN_VIRT_START and add comment (ADDRESS_SPACE_END + 1 - GB(1)) * remove unnecessary 'asm' word at the end of #error * encapsulate pte_t definition in a struct * rename addr_to_ppn() to ppn_to_paddr(). * change type of paddr argument from const unsigned long to paddr_t * pte_to_paddr() update prototype. * calculate size of Xen binary based on an amount of page tables * use unsgined int instead of 'uint32_t' instead of uint32_t as their use isn't warranted. * remove extern of bss_{start,end} as they aren't used in mm.c anymore * fix code style * add argument for HANDLE_PGTBL macros instead of curr_lvl_num variable * make enable_mmu() as noinline to prevent under link-time optimization because of the nature of enable_mmu() * add function to check that SATP_MODE is supported. * update the commit message * update setup_initial_pagetables to set correct PTE flags in one pass instead of calling setup_pte_permissions after setup_initial_pagetables() as setup_initial_pagetables() isn't used to change permission flags. --- Changes in V3: * Update the cover letter message: the patch series isn't depend on [ RISC-V basic exception handling implementation ] as it was decied to enable MMU before implementation of exception handling. Also MMU patch series is based on two other patches which weren't merged [1] and [2] - Update the commit message for [ [PATCH v3 1/3] xen/riscv: introduce setup_initial_pages ]. - update definition of pte_t structure to have a proper size of pte_t in case of RV32. - update asm/mm.h with new functions and remove unnecessary 'extern'. - remove LEVEL_* macros as only XEN_PT_LEVEL_* are enough. - update paddr_to_pte() to receive permissions as an argument. - add check that map_start & pa_start is properly aligned. - move defines PAGETABLE_ORDER, PAGETABLE_ENTRIES, PTE_PPN_SHIFT to <asm/page-bits.h> - Rename PTE_SHIFT to PTE_PPN_SHIFT - refactor setup_initial_pagetables: map all LINK addresses to LOAD addresses and after setup PTEs permission for sections; update check that linker and load addresses don't overlap. - refactor setup_initial_mapping: allocate pagetable 'dynamically' if it is necessary. - rewrite enable_mmu in C; add the check that map_start and pa_start are aligned on 4k boundary. - update the comment for setup_initial_pagetable funcion - Add RV_STAGE1_MODE to support different MMU modes. - update the commit message that MMU is also enabled here - set XEN_VIRT_START very high to not overlap with load address range - align bss section --- Changes in V2: * Remove {ZEROETH,FIRST,...}_{SHIFT,MASK, SIZE,...} and introduce instead of them XEN_PT_LEVEL_*() and LEVEL_* * Rework pt_linear_offset() and pt_index based on XEN_PT_LEVEL_*() * Remove clear_pagetables() functions as pagetables were zeroed during .bss initialization * Rename _setup_initial_pagetables() to setup_initial_mapping() * Make PTE_DEFAULT equal to RX. * Update prototype of setup_initial_mapping(..., bool writable) -> setup_initial_mapping(..., UL flags) * Update calls of setup_initial_mapping according to new prototype * Remove unnecessary call of: _setup_initial_pagetables(..., load_addr_start, load_addr_end, load_addr_start, ...) * Define index* in the loop of setup_initial_mapping * Remove attribute "__attribute__((section(".entry")))" for setup_initial_pagetables() as we don't have such section * make arguments of paddr_to_pte() and pte_is_valid() as const. * use <xen/kernel.h> instead of declaring extern unsigned long _stext, 0etext, _srodata, _erodata * update 'extern unsigned long __init_begin' to 'extern unsigned long __init_begin[]' * use aligned() instead of "__attribute__((__aligned__(PAGE_SIZE)))" * set __section(".bss.page_aligned") for page tables arrays * fix identatations * Change '__attribute__((section(".entry")))' to '__init' * Remove alignment of {map, pa}_start &= XEN_PT_LEVEL_MAP_MASK(0); in setup_inital_mapping() as they should be already aligned. * Remove clear_pagetables() as initial pagetables will be zeroed during bss initialization * Remove __attribute__((section(".entry")) for setup_initial_pagetables() as there is no such section in xen.lds.S * Update the argument of pte_is_valid() to "const pte_t *p" * Remove patch "[PATCH v1 3/3] automation: update RISC-V smoke test" from the patch series as it was introduced simplified approach for RISC-V smoke test by Andrew Cooper * Add patch [[xen/riscv: remove dummy_bss variable] as there is no any sense in dummy_bss variable after introduction of inittial page tables. --- Oleksii Kurochko (4): xen/riscv: add VM space layout xen/riscv: introduce setup_initial_pages xen/riscv: setup initial pagetables xen/riscv: remove dummy_bss variable xen/arch/riscv/Makefile | 1 + xen/arch/riscv/include/asm/config.h | 43 +++- xen/arch/riscv/include/asm/mm.h | 9 + xen/arch/riscv/include/asm/page-bits.h | 10 + xen/arch/riscv/include/asm/page.h | 63 +++++ xen/arch/riscv/mm.c | 319 +++++++++++++++++++++++++ xen/arch/riscv/riscv64/head.S | 2 + xen/arch/riscv/setup.c | 22 +- xen/arch/riscv/xen.lds.S | 4 + 9 files changed, 464 insertions(+), 9 deletions(-) create mode 100644 xen/arch/riscv/include/asm/mm.h create mode 100644 xen/arch/riscv/include/asm/page.h create mode 100644 xen/arch/riscv/mm.c -- 2.39.2
Also it was added explanation about ignoring of top VA bits Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in V5: * the patch was introduced in the current patch series. --- xen/arch/riscv/include/asm/config.h | 31 +++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/include/asm/config.h +++ b/xen/arch/riscv/include/asm/config.h @@ -XXX,XX +XXX,XX @@ #include <xen/const.h> #include <xen/page-size.h> +/* + * RISC-V64 Layout: + * + * From the riscv-privileged doc: + * When mapping between narrower and wider addresses, + * RISC-V zero-extends a narrower physical address to a wider size. + * The mapping between 64-bit virtual addresses and the 39-bit usable + * address space of Sv39 is not based on zero-extension but instead + * follows an entrenched convention that allows an OS to use one or + * a few of the most-significant bits of a full-size (64-bit) virtual + * address to quickly distinguish user and supervisor address regions. + * + * It means that: + * top VA bits are simply ignored for the purpose of translating to PA. + * + * The similar is true for other Sv{32, 39, 48, 57}. + * + * ============================================================================ + * Start addr | End addr | Size | VM area description + * ============================================================================ + * FFFFFFFFC0000000 | FFFFFFFFC0200000 | 2 MB | Xen + * FFFFFFFFC0200000 | FFFFFFFFC0600000 | 4 MB | FDT + * FFFFFFFFC0600000 | FFFFFFFFC0800000 | 2 MB | Fixmap + * .................. unused .................. + * 0000003200000000 | 0000007f40000000 | 331 GB | Direct map(L2 slot: 200-509) + * 0000003100000000 | 0000003140000000 | 1 GB | Frametable(L2 slot: 196-197) + * 0000003080000000 | 00000030c0000000 | 1 GB | VMAP (L2 slot: 194-195) + * .................. unused .................. + * ============================================================================ + */ + #if defined(CONFIG_RISCV_64) # define LONG_BYTEORDER 3 # define ELFSIZE 64 -- 2.39.2
The idea was taken from xvisor but the following changes were done: * Use only a minimal part of the code enough to enable MMU * rename {_}setup_initial_pagetables functions * add an argument for setup_initial_mapping to have an opportunity to make set PTE flags. * update setup_initial_pagetables function to map sections with correct PTE flags. * Rewrite enable_mmu() to C. * map linker addresses range to load addresses range without 1:1 mapping. It will be 1:1 only in case when load_start_addr is equal to linker_start_addr. * add safety checks such as: * Xen size is less than page size * linker addresses range doesn't overlap load addresses range * Rework macros {THIRD,SECOND,FIRST,ZEROETH}_{SHIFT,MASK} * change PTE_LEAF_DEFAULT to RW instead of RWX. * Remove phys_offset as it is not used now * Remove alignment of {map, pa}_start &= XEN_PT_LEVEL_MAP_MASK(0); in setup_inital_mapping() as they should be already aligned. Make a check that {map_pa}_start are aligned. * Remove clear_pagetables() as initial pagetables will be zeroed during bss initialization * Remove __attribute__((section(".entry")) for setup_initial_pagetables() as there is no such section in xen.lds.S * Update the argument of pte_is_valid() to "const pte_t *p" * Add check that Xen's load address is aligned at 4k boundary * Refactor setup_initial_pagetables() so it is mapping linker address range to load address range. After setup needed permissions for specific section ( such as .text, .rodata, etc ) otherwise RW permission will be set by default. * Add function to check that requested SATP_MODE is supported Origin: git@github.com:xvisor/xvisor.git 9be2fdd7 Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in V5: * Indent fields of pte_t struct * Rename addr_to_pte() and ppn_to_paddr() to match their content --- Changes in V4: * use GB() macros instead of defining SZ_1G * hardcode XEN_VIRT_START and add comment (ADDRESS_SPACE_END + 1 - GB(1)) * remove unnecessary 'asm' word at the end of #error * encapsulate pte_t definition in a struct * rename addr_to_ppn() to ppn_to_paddr(). * change type of paddr argument from const unsigned long to paddr_t * pte_to_paddr() update prototype. * calculate size of Xen binary based on an amount of page tables * use unsgined int instead of 'uint32_t' instead of uint32_t as their use isn't warranted. * remove extern of bss_{start,end} as they aren't used in mm.c anymore * fix code style * add argument for HANDLE_PGTBL macros instead of curr_lvl_num variable * make enable_mmu() as noinline to prevent under link-time optimization because of the nature of enable_mmu() * add function to check that SATP_MODE is supported. * update the commit message * update setup_initial_pagetables to set correct PTE flags in one pass instead of calling setup_pte_permissions after setup_initial_pagetables() as setup_initial_pagetables() isn't used to change permission flags. --- Changes in V3: - update definition of pte_t structure to have a proper size of pte_t in case of RV32. - update asm/mm.h with new functions and remove unnecessary 'extern'. - remove LEVEL_* macros as only XEN_PT_LEVEL_* are enough. - update paddr_to_pte() to receive permissions as an argument. - add check that map_start & pa_start is properly aligned. - move defines PAGETABLE_ORDER, PAGETABLE_ENTRIES, PTE_PPN_SHIFT to <asm/page-bits.h> - Rename PTE_SHIFT to PTE_PPN_SHIFT - refactor setup_initial_pagetables: map all LINK addresses to LOAD addresses and after setup PTEs permission for sections; update check that linker and load addresses don't overlap. - refactor setup_initial_mapping: allocate pagetable 'dynamically' if it is necessary. - rewrite enable_mmu in C; add the check that map_start and pa_start are aligned on 4k boundary. - update the comment for setup_initial_pagetable funcion - Add RV_STAGE1_MODE to support different MMU modes - set XEN_VIRT_START very high to not overlap with load address range - align bss section --- Changes in V2: * update the commit message: * Remove {ZEROETH,FIRST,...}_{SHIFT,MASK, SIZE,...} and introduce instead of them XEN_PT_LEVEL_*() and LEVEL_* * Rework pt_linear_offset() and pt_index based on XEN_PT_LEVEL_*() * Remove clear_pagetables() functions as pagetables were zeroed during .bss initialization * Rename _setup_initial_pagetables() to setup_initial_mapping() * Make PTE_DEFAULT equal to RX. * Update prototype of setup_initial_mapping(..., bool writable) -> setup_initial_mapping(..., UL flags) * Update calls of setup_initial_mapping according to new prototype * Remove unnecessary call of: _setup_initial_pagetables(..., load_addr_start, load_addr_end, load_addr_start, ...) * Define index* in the loop of setup_initial_mapping * Remove attribute "__attribute__((section(".entry")))" for setup_initial_pagetables() as we don't have such section * make arguments of paddr_to_pte() and pte_is_valid() as const. * make xen_second_pagetable static. * use <xen/kernel.h> instead of declaring extern unsigned long _stext, 0etext, _srodata, _erodata * update 'extern unsigned long __init_begin' to 'extern unsigned long __init_begin[]' * use aligned() instead of "__attribute__((__aligned__(PAGE_SIZE)))" * set __section(".bss.page_aligned") for page tables arrays * fix identatations * Change '__attribute__((section(".entry")))' to '__init' * Remove phys_offset as it isn't used now. * Remove alignment of {map, pa}_start &= XEN_PT_LEVEL_MAP_MASK(0); in setup_inital_mapping() as they should be already aligned. * Remove clear_pagetables() as initial pagetables will be zeroed during bss initialization * Remove __attribute__((section(".entry")) for setup_initial_pagetables() as there is no such section in xen.lds.S * Update the argument of pte_is_valid() to "const pte_t *p" --- xen/arch/riscv/Makefile | 1 + xen/arch/riscv/include/asm/config.h | 12 +- xen/arch/riscv/include/asm/mm.h | 9 + xen/arch/riscv/include/asm/page-bits.h | 10 + xen/arch/riscv/include/asm/page.h | 63 +++++ xen/arch/riscv/mm.c | 319 +++++++++++++++++++++++++ xen/arch/riscv/riscv64/head.S | 2 + xen/arch/riscv/setup.c | 11 + xen/arch/riscv/xen.lds.S | 4 + 9 files changed, 430 insertions(+), 1 deletion(-) create mode 100644 xen/arch/riscv/include/asm/mm.h create mode 100644 xen/arch/riscv/include/asm/page.h create mode 100644 xen/arch/riscv/mm.c diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/Makefile +++ b/xen/arch/riscv/Makefile @@ -XXX,XX +XXX,XX @@ obj-$(CONFIG_EARLY_PRINTK) += early_printk.o obj-y += entry.o +obj-y += mm.o obj-$(CONFIG_RISCV_64) += riscv64/ obj-y += sbi.o obj-y += setup.o diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/include/asm/config.h +++ b/xen/arch/riscv/include/asm/config.h @@ -XXX,XX +XXX,XX @@ name: #endif -#define XEN_VIRT_START _AT(UL, 0x80200000) +#ifdef CONFIG_RISCV_64 +#define XEN_VIRT_START 0xFFFFFFFFC0000000 /* (_AC(-1, UL) + 1 - GB(1)) */ +#else +#error "RV32 isn't supported" +#endif #define SMP_CACHE_BYTES (1 << 6) #define STACK_SIZE PAGE_SIZE +#ifdef CONFIG_RISCV_64 +#define RV_STAGE1_MODE SATP_MODE_SV39 +#else +#define RV_STAGE1_MODE SATP_MODE_SV32 +#endif + #endif /* __RISCV_CONFIG_H__ */ /* * Local variables: diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/xen/arch/riscv/include/asm/mm.h @@ -XXX,XX +XXX,XX @@ +#ifndef _ASM_RISCV_MM_H +#define _ASM_RISCV_MM_H + +void setup_initial_pagetables(void); + +void enable_mmu(void); +void cont_after_mmu_is_enabled(void); + +#endif /* _ASM_RISCV_MM_H */ diff --git a/xen/arch/riscv/include/asm/page-bits.h b/xen/arch/riscv/include/asm/page-bits.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/include/asm/page-bits.h +++ b/xen/arch/riscv/include/asm/page-bits.h @@ -XXX,XX +XXX,XX @@ #ifndef __RISCV_PAGE_BITS_H__ #define __RISCV_PAGE_BITS_H__ +#ifdef CONFIG_RISCV_64 +#define PAGETABLE_ORDER (9) +#else /* CONFIG_RISCV_32 */ +#define PAGETABLE_ORDER (10) +#endif + +#define PAGETABLE_ENTRIES (1 << PAGETABLE_ORDER) + +#define PTE_PPN_SHIFT 10 + #define PAGE_SHIFT 12 /* 4 KiB Pages */ #define PADDR_BITS 56 /* 44-bit PPN */ diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/xen/arch/riscv/include/asm/page.h @@ -XXX,XX +XXX,XX @@ +#ifndef _ASM_RISCV_PAGE_H +#define _ASM_RISCV_PAGE_H + +#include <xen/const.h> +#include <xen/types.h> + +#define VPN_MASK ((unsigned long)(PAGETABLE_ENTRIES - 1)) + +#define XEN_PT_LEVEL_ORDER(lvl) ((lvl) * PAGETABLE_ORDER) +#define XEN_PT_LEVEL_SHIFT(lvl) (XEN_PT_LEVEL_ORDER(lvl) + PAGE_SHIFT) +#define XEN_PT_LEVEL_SIZE(lvl) (_AT(paddr_t, 1) << XEN_PT_LEVEL_SHIFT(lvl)) +#define XEN_PT_LEVEL_MAP_MASK(lvl) (~(XEN_PT_LEVEL_SIZE(lvl) - 1)) +#define XEN_PT_LEVEL_MASK(lvl) (VPN_MASK << XEN_PT_LEVEL_SHIFT(lvl)) + +#define PTE_VALID BIT(0, UL) +#define PTE_READABLE BIT(1, UL) +#define PTE_WRITABLE BIT(2, UL) +#define PTE_EXECUTABLE BIT(3, UL) +#define PTE_USER BIT(4, UL) +#define PTE_GLOBAL BIT(5, UL) +#define PTE_ACCESSED BIT(6, UL) +#define PTE_DIRTY BIT(7, UL) +#define PTE_RSW (BIT(8, UL) | BIT(9, UL)) + +#define PTE_LEAF_DEFAULT (PTE_VALID | PTE_READABLE | PTE_WRITABLE) +#define PTE_TABLE (PTE_VALID) + +/* Calculate the offsets into the pagetables for a given VA */ +#define pt_linear_offset(lvl, va) ((va) >> XEN_PT_LEVEL_SHIFT(lvl)) + +#define pt_index(lvl, va) pt_linear_offset(lvl, (va) & XEN_PT_LEVEL_MASK(lvl)) + +/* Page Table entry */ +typedef struct { +#ifdef CONFIG_RISCV_64 + uint64_t pte; +#else + uint32_t pte; +#endif +} pte_t; + +#define pte_to_addr(x) (((x) >> PTE_PPN_SHIFT) << PAGE_SHIFT) + +#define addr_to_pte(x) (((x) >> PAGE_SHIFT) << PTE_PPN_SHIFT) + +static inline pte_t paddr_to_pte(const paddr_t paddr, + const unsigned long permissions) +{ + unsigned long tmp = addr_to_pte(paddr); + return (pte_t) { .pte = tmp | permissions }; +} + +static inline paddr_t pte_to_paddr(const pte_t pte) +{ + return pte_to_addr(pte.pte); +} + +static inline bool pte_is_valid(const pte_t p) +{ + return p.pte & PTE_VALID; +} + +#endif /* _ASM_RISCV_PAGE_H */ diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/xen/arch/riscv/mm.c @@ -XXX,XX +XXX,XX @@ +#include <xen/compiler.h> +#include <xen/init.h> +#include <xen/kernel.h> + +#include <asm/early_printk.h> +#include <asm/config.h> +#include <asm/csr.h> +#include <asm/mm.h> +#include <asm/page.h> +#include <asm/processor.h> + +struct mmu_desc { + unsigned long num_levels; + unsigned int pgtbl_count; + pte_t *next_pgtbl; + pte_t *pgtbl_base; +}; + +extern unsigned char cpu0_boot_stack[STACK_SIZE]; + +#define PHYS_OFFSET ((unsigned long)_start - XEN_VIRT_START) +#define LOAD_TO_LINK(addr) ((addr) - PHYS_OFFSET) +#define LINK_TO_LOAD(addr) ((addr) + PHYS_OFFSET) + + +/* + * It is expected that Xen won't be more then 2 MB. + * The check in xen.lds.S guarantees that. + * At least 4 page (in case when Sv48 or Sv57 are used ) + * tables are needed to cover 2 MB. One for each page level + * table with PAGE_SIZE = 4 Kb + * + * One L0 page table can cover 2 MB + * (512 entries of one page table * PAGE_SIZE). + * + * It might be needed one more page table in case when + * Xen load address isn't 2 MB aligned. + * + */ +#define PGTBL_INITIAL_COUNT (5) + +#define PGTBL_ENTRY_AMOUNT (PAGE_SIZE / sizeof(pte_t)) + +pte_t __section(".bss.page_aligned") __aligned(PAGE_SIZE) +stage1_pgtbl_root[PGTBL_ENTRY_AMOUNT]; + +pte_t __section(".bss.page_aligned") __aligned(PAGE_SIZE) +stage1_pgtbl_nonroot[PGTBL_INITIAL_COUNT * PGTBL_ENTRY_AMOUNT]; + +#define HANDLE_PGTBL(curr_lvl_num) \ + index = pt_index(curr_lvl_num, page_addr); \ + if ( pte_is_valid(pgtbl[index]) ) \ + { \ + /* Find L{ 0-3 } table */ \ + pgtbl = (pte_t *)pte_to_paddr(pgtbl[index]); \ + } \ + else \ + { \ + /* Allocate new L{0-3} page table */ \ + if ( mmu_desc->pgtbl_count == PGTBL_INITIAL_COUNT ) \ + { \ + early_printk("(XEN) No initial table available\n"); \ + /* panic(), BUG() or ASSERT() aren't ready now. */ \ + die(); \ + } \ + mmu_desc->pgtbl_count++; \ + pgtbl[index] = paddr_to_pte((unsigned long)mmu_desc->next_pgtbl, \ + PTE_VALID); \ + pgtbl = mmu_desc->next_pgtbl; \ + mmu_desc->next_pgtbl += PGTBL_ENTRY_AMOUNT; \ + } + +static void __init setup_initial_mapping(struct mmu_desc *mmu_desc, + unsigned long map_start, + unsigned long map_end, + unsigned long pa_start, + unsigned long permissions) +{ + unsigned int index; + pte_t *pgtbl; + unsigned long page_addr; + pte_t pte_to_be_written; + unsigned long paddr; + unsigned long tmp_permissions; + + if ( (unsigned long)_start % XEN_PT_LEVEL_SIZE(0) ) + { + early_printk("(XEN) Xen should be loaded at 4k boundary\n"); + die(); + } + + if ( map_start & ~XEN_PT_LEVEL_MAP_MASK(0) || + pa_start & ~XEN_PT_LEVEL_MAP_MASK(0) ) + { + early_printk("(XEN) map and pa start addresses should be aligned\n"); + /* panic(), BUG() or ASSERT() aren't ready now. */ + die(); + } + + page_addr = map_start; + while ( page_addr < map_end ) + { + pgtbl = mmu_desc->pgtbl_base; + + switch (mmu_desc->num_levels) + { + case 4: /* Level 3 */ + HANDLE_PGTBL(3); + case 3: /* Level 2 */ + HANDLE_PGTBL(2); + case 2: /* Level 1 */ + HANDLE_PGTBL(1); + case 1: /* Level 0 */ + index = pt_index(0, page_addr); + paddr = (page_addr - map_start) + pa_start; + + tmp_permissions = permissions; + + if ( is_kernel_text(LINK_TO_LOAD(page_addr)) || + is_kernel_inittext(LINK_TO_LOAD(page_addr)) ) + tmp_permissions = + PTE_EXECUTABLE | PTE_READABLE | PTE_VALID; + + if ( is_kernel_rodata(LINK_TO_LOAD(page_addr)) ) + tmp_permissions = PTE_READABLE | PTE_VALID; + + pte_to_be_written = paddr_to_pte(paddr, tmp_permissions); + + if ( !pte_is_valid(pgtbl[index]) ) + pgtbl[index] = pte_to_be_written; + else + { + /* + * get an adresses of current pte and that one to + * be written without permission flags + */ + unsigned long curr_pte = + pgtbl[index].pte & ~((1 << PTE_PPN_SHIFT) - 1); + + pte_to_be_written.pte &= ~((1 << PTE_PPN_SHIFT) - 1); + + if ( curr_pte != pte_to_be_written.pte ) + { + early_printk("PTE that is intended to write isn't the" + "same that the once are overwriting\n"); + /* panic(), <asm/bug.h> aren't ready now. */ + die(); + } + } + } + + /* Point to next page */ + page_addr += XEN_PT_LEVEL_SIZE(0); + } +} + +static void __init calc_pgtbl_lvls_num(struct mmu_desc *mmu_desc) +{ + unsigned long satp_mode = RV_STAGE1_MODE; + + /* Number of page table levels */ + switch (satp_mode) + { + case SATP_MODE_SV32: + mmu_desc->num_levels = 2; + break; + case SATP_MODE_SV39: + mmu_desc->num_levels = 3; + break; + case SATP_MODE_SV48: + mmu_desc->num_levels = 4; + break; + default: + early_printk("(XEN) Unsupported SATP_MODE\n"); + die(); + } +} + +static bool __init check_pgtbl_mode_support(struct mmu_desc *mmu_desc, + unsigned long load_start, + unsigned long satp_mode) +{ + bool is_mode_supported = false; + unsigned int index; + unsigned int page_table_level = (mmu_desc->num_levels - 1); + unsigned level_map_mask = XEN_PT_LEVEL_MAP_MASK(page_table_level); + + unsigned long aligned_load_start = load_start & level_map_mask; + unsigned long aligned_page_size = XEN_PT_LEVEL_SIZE(page_table_level); + unsigned long xen_size = (unsigned long)(_end - start); + + if ( (load_start + xen_size) > (aligned_load_start + aligned_page_size) ) + { + early_printk("please place Xen to be in range of PAGE_SIZE " + "where PAGE_SIZE is XEN_PT_LEVEL_SIZE( {L3 | L2 | L1} ) " + "depending on expected SATP_MODE \n" + "XEN_PT_LEVEL_SIZE is defined in <asm/page.h>\n"); + die(); + } + + index = pt_index(page_table_level, aligned_load_start); + stage1_pgtbl_root[index] = paddr_to_pte(aligned_load_start, + PTE_LEAF_DEFAULT | PTE_EXECUTABLE); + + asm volatile("sfence.vma"); + csr_write(CSR_SATP, + ((unsigned long)stage1_pgtbl_root >> PAGE_SHIFT) | + satp_mode << SATP_MODE_SHIFT); + + if ( (csr_read(CSR_SATP) >> SATP_MODE_SHIFT) == satp_mode ) + is_mode_supported = true; + + /* Clean MMU root page table and disable MMU */ + stage1_pgtbl_root[index] = paddr_to_pte(0x0, 0x0); + + csr_write(CSR_SATP, 0); + asm volatile("sfence.vma"); + + return is_mode_supported; +} + +/* + * setup_initial_pagetables: + * + * Build the page tables for Xen that map the following: + * 1. Calculate page table's level numbers. + * 2. Init mmu description structure. + * 3. Check that linker addresses range doesn't overlap + * with load addresses range + * 4. Map all linker addresses and load addresses ( it shouldn't + * be 1:1 mapped and will be 1:1 mapped only in case if + * linker address is equal to load address ) with + * RW permissions by default. + * 5. Setup proper PTE permissions for each section. + */ +void __init setup_initial_pagetables(void) +{ + struct mmu_desc mmu_desc = { 0, 0, NULL, 0 }; + + /* + * Access to _{stard, end } is always PC-relative + * thereby when access them we will get load adresses + * of start and end of Xen + * To get linker addresses LOAD_TO_LINK() is required + * to use + */ + unsigned long load_start = (unsigned long)_start; + unsigned long load_end = (unsigned long)_end; + unsigned long linker_start = LOAD_TO_LINK(load_start); + unsigned long linker_end = LOAD_TO_LINK(load_end); + + if ( (linker_start != load_start) && + (linker_start <= load_end) && (load_start <= linker_end) ) { + early_printk("(XEN) linker and load address ranges overlap\n"); + die(); + } + + calc_pgtbl_lvls_num(&mmu_desc); + + if ( !check_pgtbl_mode_support(&mmu_desc, load_start, RV_STAGE1_MODE) ) + { + early_printk("requested MMU mode isn't supported by CPU\n" + "Please choose different in <asm/config.h>\n"); + die(); + } + + mmu_desc.pgtbl_base = stage1_pgtbl_root; + mmu_desc.next_pgtbl = stage1_pgtbl_nonroot; + + setup_initial_mapping(&mmu_desc, + linker_start, + linker_end, + load_start, + PTE_LEAF_DEFAULT); +} + +void __init noinline enable_mmu() +{ + /* + * Calculate a linker time address of the mmu_is_enabled + * label and update CSR_STVEC with it. + * MMU is configured in a way where linker addresses are mapped + * on load addresses so in a case when linker addresses are not equal + * to load addresses, after MMU is enabled, it will cause + * an exception and jump to linker time addresses. + * Otherwise if load addresses are equal to linker addresses the code + * after mmu_is_enabled label will be executed without exception. + */ + csr_write(CSR_STVEC, LOAD_TO_LINK((unsigned long)&&mmu_is_enabled)); + + /* Ensure page table writes precede loading the SATP */ + asm volatile("sfence.vma"); + + /* Enable the MMU and load the new pagetable for Xen */ + csr_write(CSR_SATP, + ((unsigned long)stage1_pgtbl_root >> PAGE_SHIFT) | + RV_STAGE1_MODE << SATP_MODE_SHIFT); + + asm volatile(".align 2"); + mmu_is_enabled: + /* + * Stack should be re-inited as: + * 1. Right now an address of the stack is relative to load time + * addresses what will cause an issue in case of load start address + * isn't equal to linker start address. + * 2. Addresses in stack are all load time relative which can be an + * issue in case when load start address isn't equal to linker + * start address. + */ + asm volatile ("mv sp, %0" : : "r"((unsigned long)cpu0_boot_stack + STACK_SIZE)); + + /* + * We can't return to the caller because the stack was reseted + * and it may have stash some variable on the stack. + * Jump to a brand new function as the stack was reseted + */ + cont_after_mmu_is_enabled(); +} + diff --git a/xen/arch/riscv/riscv64/head.S b/xen/arch/riscv/riscv64/head.S index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/riscv64/head.S +++ b/xen/arch/riscv/riscv64/head.S @@ -XXX,XX +XXX,XX @@ #include <asm/asm.h> +#include <asm/asm-offsets.h> #include <asm/riscv_encoding.h> .section .text.header, "ax", %progbits @@ -XXX,XX +XXX,XX @@ ENTRY(start) add sp, sp, t0 tail start_xen + diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/setup.c +++ b/xen/arch/riscv/setup.c @@ -XXX,XX +XXX,XX @@ #include <xen/init.h> #include <asm/early_printk.h> +#include <asm/mm.h> /* Xen stack for bringing up the first CPU. */ unsigned char __initdata cpu0_boot_stack[STACK_SIZE] @@ -XXX,XX +XXX,XX @@ void __init noreturn start_xen(unsigned long bootcpu_id, unreachable(); } + +void __init noreturn cont_after_mmu_is_enabled(void) +{ + early_printk("All set up\n"); + + for ( ;; ) + asm volatile ("wfi"); + + unreachable(); +} diff --git a/xen/arch/riscv/xen.lds.S b/xen/arch/riscv/xen.lds.S index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/xen.lds.S +++ b/xen/arch/riscv/xen.lds.S @@ -XXX,XX +XXX,XX @@ SECTIONS . = ALIGN(POINTER_ALIGN); __init_end = .; + . = ALIGN(PAGE_SIZE); .bss : { /* BSS */ __bss_start = .; *(.bss.stack_aligned) @@ -XXX,XX +XXX,XX @@ ASSERT(IS_ALIGNED(__bss_end, POINTER_ALIGN), "__bss_end is misaligned") ASSERT(!SIZEOF(.got), ".got non-empty") ASSERT(!SIZEOF(.got.plt), ".got.plt non-empty") + +ASSERT(_end - _start <= MB(2), "Xen too large for early-boot assumptions") + -- 2.39.2
The patch does two thing: 1. Setup initial pagetables. 2. Enable MMU which end up with code in cont_after_mmu_is_enabled() Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in V5: - Nothing changed. Only rebase --- Changes in V4: - Nothing changed. Only rebase --- Changes in V3: - update the commit message that MMU is also enabled here - remove early_printk("All set up\n") as it was moved to cont_after_mmu_is_enabled() function after MMU is enabled. --- Changes in V2: * Update the commit message --- xen/arch/riscv/setup.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/setup.c +++ b/xen/arch/riscv/setup.c @@ -XXX,XX +XXX,XX @@ void __init noreturn start_xen(unsigned long bootcpu_id, { early_printk("Hello from C env\n"); - early_printk("All set up\n"); + setup_initial_pagetables(); + + enable_mmu(); + for ( ;; ) asm volatile ("wfi"); -- 2.39.2
After introduction of initial pagetables there is no any sense in dummy_bss variable as bss section will not be empty anymore. Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in V5: - Nothing changed. Only rebase --- Changes in V4: - Nothing changed. Only rebase --- Changes in V3: * patch was introduced in the current one patch series (v3). --- Changes in V2: * patch was introduced in the current one patch series (v2). --- xen/arch/riscv/setup.c | 8 -------- 1 file changed, 8 deletions(-) diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/setup.c +++ b/xen/arch/riscv/setup.c @@ -XXX,XX +XXX,XX @@ unsigned char __initdata cpu0_boot_stack[STACK_SIZE] __aligned(STACK_SIZE); -/* - * To be sure that .bss isn't zero. It will simplify code of - * .bss initialization. - * TODO: - * To be deleted when the first real .bss user appears - */ -int dummy_bss __attribute__((unused)); - void __init noreturn start_xen(unsigned long bootcpu_id, paddr_t dtb_addr) { -- 2.39.2