arch/riscv/kernel/head.S | 1 + 1 file changed, 1 insertion(+)
From: Chen Pei <cp0613@linux.alibaba.com>
During SMP boot, the secondary_start_sbi address is passed to the
slave core via sbi_hsm_hart_start. In OpenSBI, this address is
written to STVEC in sbi_hart_switch_mode.
According to the privileged specification, the BASE field of STVEC
must always be aligned on a 4-byte boundary. However, System.map
reveals that secondary_start_sbi is not a 4-byte aligned address.
For example, the address of secondary_start_sbi is
0xffffffff80001066, and the disassembly is as follows:
Dump of assembler code from 0xffffffff80001052 to 0xffffffff8000107a:
0xffffffff80001052 <_start+4178>: c.nop -11
0xffffffff80001054 <_start+4180>: auipc gp,0x1a1f
0xffffffff80001058 <_start+4184>: addi gp,gp,84
0xffffffff8000105c <_start+4188>: csrw satp,a2
0xffffffff80001060 <_start+4192>: sfence.vma
0xffffffff80001064 <_start+4196>: ret
0xffffffff80001066 <_start+4198>: csrw sie,zero
0xffffffff8000106a <_start+4202>: csrw sip,zero
0xffffffff8000106e <_start+4206>: li t0,2
0xffffffff80001070 <_start+4208>: csrw scounteren,t0
0xffffffff80001074 <_start+4212>: auipc gp,0x1a1f
0xffffffff80001078 <_start+4216>: addi gp,gp,52
When writing to STVEC at address 0xffffffff80001066, the actual
write address is 0xffffffff80001064, corresponding to the address
of the previous ret instruction. This is unexpected, and if an
interrupt occurs at this point, it will cause unpredictable results.
However, secondary_start_sbi immediately masks all interrupts and
updates STVEC, so no problems occurred.
In summary, it is more reasonable to make secondary_start_sbi
satisfy 4-byte alignment.
Changes in v2:
- Place `.align 2` inside `#ifdef CONFIG_SMP`, above the tag.
- Add two Reviewed-by tags.
- Based on Linux 7.0.
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Reviewed-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
---
arch/riscv/kernel/head.S | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 9c99c5ad6fe8..9f33be6260e1 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -127,6 +127,7 @@ relocate_enable_mmu:
#endif /* CONFIG_MMU */
#ifdef CONFIG_SMP
.global secondary_start_sbi
+ .align 2
secondary_start_sbi:
/* Mask all interrupts */
csrw CSR_IE, zero
--
2.50.1
[ +cc opensbi, Anup: What was the deal with setting stvec in sbi_hart_switch_mode? ] On 4/13/26 21:20, cp0613@linux.alibaba.com wrote: > From: Chen Pei <cp0613@linux.alibaba.com> > > During SMP boot, the secondary_start_sbi address is passed to the > slave core via sbi_hsm_hart_start. In OpenSBI, this address is > written to STVEC in sbi_hart_switch_mode. This is certainly odd (OpenSBI git blames to initial git commit. No idea why this was needed.) but technically permitted by the SBI HSM spec (see below). * Anup and other opensbi folks: Do you know/remember why it was there? It's not really a bug, but some historical context would be appreciated. > According to the privileged specification, the BASE field of STVEC > must always be aligned on a 4-byte boundary. However, System.map > reveals that secondary_start_sbi is not a 4-byte aligned address. > > For example, the address of secondary_start_sbi is > 0xffffffff80001066, and the disassembly is as follows: > > Dump of assembler code from 0xffffffff80001052 to 0xffffffff8000107a: > 0xffffffff80001052 <_start+4178>: c.nop -11 > 0xffffffff80001054 <_start+4180>: auipc gp,0x1a1f > 0xffffffff80001058 <_start+4184>: addi gp,gp,84 > 0xffffffff8000105c <_start+4188>: csrw satp,a2 > 0xffffffff80001060 <_start+4192>: sfence.vma > 0xffffffff80001064 <_start+4196>: ret > 0xffffffff80001066 <_start+4198>: csrw sie,zero > 0xffffffff8000106a <_start+4202>: csrw sip,zero > 0xffffffff8000106e <_start+4206>: li t0,2 > 0xffffffff80001070 <_start+4208>: csrw scounteren,t0 > 0xffffffff80001074 <_start+4212>: auipc gp,0x1a1f > 0xffffffff80001078 <_start+4216>: addi gp,gp,52 > > When writing to STVEC at address 0xffffffff80001066, the actual > write address is 0xffffffff80001064, corresponding to the address > of the previous ret instruction. Also technically permitted. > This is unexpected, and if an > interrupt occurs at this point, it will cause unpredictable results. > > However, secondary_start_sbi immediately masks all interrupts and > updates STVEC, so no problems occurred. An interrupt cannot trap in between secondary_start_sbi starting and the interrupts getting masked individually. The only arch state guaranteed by HSM is [1]: |Register Name | Register Value |satp | 0 |sstatus.SIE | 0 |a0 | hartid |a1 | `opaque` parameter |All other registers remain in an undefined state. Of which sstatus.SIE = 0 means that interrupts are masked in Supervisor mode. > In summary, it is more reasonable to make secondary_start_sbi > satisfy 4-byte alignment. Thus, it is unnecessary. If this fixes a bug where an interrupt can trap into the wrong stvec for you, then your firmware is broken. Michael's suggestion about SYM_CODE_START still stands, but if you do that, that's just a really minor refactoring, and in any case the current commit message cannot stand. Vivian "dramforever" Wang [1]: https://github.com/riscv-non-isa/riscv-sbi-doc/blob/v3.0/src/ext-hsm.adoc#function-hart-start-fid-0
On Wed, 15 Apr 2026 11:39:22 +0800, wangruikang@iscas.ac.cn wrote: > > During SMP boot, the secondary_start_sbi address is passed to the > > slave core via sbi_hsm_hart_start. In OpenSBI, this address is > > written to STVEC in sbi_hart_switch_mode. > > This is certainly odd (OpenSBI git blames to initial git commit. No idea > why this was needed.) but technically permitted by the SBI HSM spec (see > below). > > * Anup and other opensbi folks: Do you know/remember why it was there? > It's not really a bug, but some historical context would be appreciated. Yes, looking forward to responses from Anup and other opensbi folks. > > According to the privileged specification, the BASE field of STVEC > > must always be aligned on a 4-byte boundary. However, System.map > > reveals that secondary_start_sbi is not a 4-byte aligned address. > > > > For example, the address of secondary_start_sbi is > > 0xffffffff80001066, and the disassembly is as follows: > > > > Dump of assembler code from 0xffffffff80001052 to 0xffffffff8000107a: > > 0xffffffff80001052 <_start+4178>: c.nop -11 > > 0xffffffff80001054 <_start+4180>: auipc gp,0x1a1f > > 0xffffffff80001058 <_start+4184>: addi gp,gp,84 > > 0xffffffff8000105c <_start+4188>: csrw satp,a2 > > 0xffffffff80001060 <_start+4192>: sfence.vma > > 0xffffffff80001064 <_start+4196>: ret > > 0xffffffff80001066 <_start+4198>: csrw sie,zero > > 0xffffffff8000106a <_start+4202>: csrw sip,zero > > 0xffffffff8000106e <_start+4206>: li t0,2 > > 0xffffffff80001070 <_start+4208>: csrw scounteren,t0 > > 0xffffffff80001074 <_start+4212>: auipc gp,0x1a1f > > 0xffffffff80001078 <_start+4216>: addi gp,gp,52 > > > > When writing to STVEC at address 0xffffffff80001066, the actual > > write address is 0xffffffff80001064, corresponding to the address > > of the previous ret instruction. > Also technically permitted. Yes, but this is unintended behavior, and we should avoid it. > > This is unexpected, and if an > > interrupt occurs at this point, it will cause unpredictable results. > > > > However, secondary_start_sbi immediately masks all interrupts and > > updates STVEC, so no problems occurred. > > An interrupt cannot trap in between secondary_start_sbi starting and the > interrupts getting masked individually. The only arch state guaranteed > by HSM is [1]: > > |Register Name | Register Value > |satp | 0 > |sstatus.SIE | 0 > |a0 | hartid > |a1 | `opaque` parameter > |All other registers remain in an undefined state. > > Of which sstatus.SIE = 0 means that interrupts are masked in Supervisor > mode. > > In summary, it is more reasonable to make secondary_start_sbi > > satisfy 4-byte alignment. > > Thus, it is unnecessary. > > If this fixes a bug where an interrupt can trap into the wrong stvec for > you, then your firmware is broken. This isn't about fixing a bug (as mentioned earlier, it's highly unlikely to happen), but rather the STVEC check mechanism added to QEMU discovered this issue. Regarding the problem itself, I would appreciate any clarification or modifications from OpenSBI, but currently this modification is the simplest and most feasible. > Michael's suggestion about SYM_CODE_START still stands, but if you do > that, that's just a really minor refactoring, and in any case the > current commit message cannot stand. > > Vivian "dramforever" Wang > > [1]: > https://github.com/riscv-non-isa/riscv-sbi-doc/blob/v3.0/src/ext-hsm.adoc#function-hart-start-fid-0 Thanks, Pei
On 4/15/26 20:37, Chen Pei wrote: [...] >>> In summary, it is more reasonable to make secondary_start_sbi >>> satisfy 4-byte alignment. >> Thus, it is unnecessary. >> >> If this fixes a bug where an interrupt can trap into the wrong stvec for >> you, then your firmware is broken. > This isn't about fixing a bug (as mentioned earlier, it's highly unlikely > to happen), It is *impossible* given correct HW and FW, but okay... > but rather the STVEC check mechanism added to QEMU discovered > this issue. Regarding the problem itself, I would appreciate any clarification > or modifications from OpenSBI, but currently this modification is the simplest > and most feasible. The stvec CSR is WARL. If the "STVEC check mechanism" is complaining about usage of CSRs as specified, then the check is a false positive and is not something fixable. Vivian "dramforever" Wang
On 13/4/2026 23:20, cp0613@linux.alibaba.com wrote: > From: Chen Pei <cp0613@linux.alibaba.com> > > During SMP boot, the secondary_start_sbi address is passed to the > slave core via sbi_hsm_hart_start. In OpenSBI, this address is > written to STVEC in sbi_hart_switch_mode. ... > diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S > index 9c99c5ad6fe8..9f33be6260e1 100644 > --- a/arch/riscv/kernel/head.S > +++ b/arch/riscv/kernel/head.S > @@ -127,6 +127,7 @@ relocate_enable_mmu: > #endif /* CONFIG_MMU */ > #ifdef CONFIG_SMP > .global secondary_start_sbi > + .align 2 > secondary_start_sbi: > /* Mask all interrupts */ > csrw CSR_IE, zero Minor nit, but IMHO .balign is preferable in new code. It always byte aligns, whereas .align has different meanings across architectures, and on some arches (including riscv) requires the reader to do the math to convert to a byte alignment. These days there's also the SYM_CODE_START etc. macros defined via linkage.h, which use __ALIGN from arch/riscv/include/asm/linkage.h, which is already .balign 4. cheers
On Wed, 15 Apr 2026 12:06:28 +1000, mpe@kernel.org wrote: > > diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S > > index 9c99c5ad6fe8..9f33be6260e1 100644 > > --- a/arch/riscv/kernel/head.S > > +++ b/arch/riscv/kernel/head.S > > @@ -127,6 +127,7 @@ relocate_enable_mmu: > > #endif /* CONFIG_MMU */ > > #ifdef CONFIG_SMP > > .global secondary_start_sbi > > + .align 2 > > secondary_start_sbi: > > /* Mask all interrupts */ > > csrw CSR_IE, zero > > > Minor nit, but IMHO .balign is preferable in new code. It always byte > aligns, whereas .align has different meanings across architectures, and > on some arches (including riscv) requires the reader to do the math to > convert to a byte alignment. > > These days there's also the SYM_CODE_START etc. macros defined via > linkage.h, which use __ALIGN from arch/riscv/include/asm/linkage.h, > which is already .balign 4. Hi Michael, Thank you for your review and reminder. If this patch is ultimately approved (because the behavior of OpenSBI still needs clarification based on the new reviewer's comments), I will update it using .balign. Thanks, Pei
© 2016 - 2026 Red Hat, Inc.