arch/loongarch/kernel/kprobes.c | 1 + 1 file changed, 1 insertion(+)
Currently, interrupts need to be disabled before single-step mode is set,
it requires that the CSR_PRMD_PIE must be cleared in save_local_irqflag()
which is called by setup_singlestep(), this is reasonable.
But in the first kprobe breakpoint exception, if the irq is enabled at the
beginning of do_bp(), it will not be disabled at the end of do_bp() due to
the CSR_PRMD_PIE has been cleared in save_local_irqflag(). For this case,
it may corrupt exception context when restoring exception after do_bp() in
handle_bp(), this is not reasonable.
Based on the above analysis, in order to make sure the irq is disabled at
the end of do_bp() for the first kprobe breakpoint exception, it is proper
to disable irq first before clearing CSR_PRMD_PIE in save_local_irqflag().
Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support")
Co-developed-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
arch/loongarch/kernel/kprobes.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/loongarch/kernel/kprobes.c b/arch/loongarch/kernel/kprobes.c
index 8ba391cfabb0..6eab97636e6b 100644
--- a/arch/loongarch/kernel/kprobes.c
+++ b/arch/loongarch/kernel/kprobes.c
@@ -113,6 +113,7 @@ NOKPROBE_SYMBOL(set_current_kprobe);
static void save_local_irqflag(struct kprobe_ctlblk *kcb,
struct pt_regs *regs)
{
+ local_irq_disable();
kcb->saved_status = regs->csr_prmd;
regs->csr_prmd &= ~CSR_PRMD_PIE;
}
--
2.42.0
On 2025-04-08 17:27, Tiezhu Yang wrote: > Currently, interrupts need to be disabled before single-step mode is set, > it requires that the CSR_PRMD_PIE must be cleared in save_local_irqflag() > which is called by setup_singlestep(), this is reasonable. > > But in the first kprobe breakpoint exception, if the irq is enabled at the > beginning of do_bp(), it will not be disabled at the end of do_bp() due to > the CSR_PRMD_PIE has been cleared in save_local_irqflag(). For this case, > it may corrupt exception context when restoring exception after do_bp() in > handle_bp(), this is not reasonable. > > Based on the above analysis, in order to make sure the irq is disabled at > the end of do_bp() for the first kprobe breakpoint exception, it is proper > to disable irq first before clearing CSR_PRMD_PIE in save_local_irqflag(). > > Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support") > Co-developed-by: Tianyang Zhang <zhangtianyang@loongson.cn> > Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> > Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> > --- > arch/loongarch/kernel/kprobes.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/arch/loongarch/kernel/kprobes.c b/arch/loongarch/kernel/kprobes.c > index 8ba391cfabb0..6eab97636e6b 100644 > --- a/arch/loongarch/kernel/kprobes.c > +++ b/arch/loongarch/kernel/kprobes.c > @@ -113,6 +113,7 @@ NOKPROBE_SYMBOL(set_current_kprobe); > static void save_local_irqflag(struct kprobe_ctlblk *kcb, > struct pt_regs *regs) > { > + local_irq_disable(); > kcb->saved_status = regs->csr_prmd; > regs->csr_prmd &= ~CSR_PRMD_PIE; > } Hi, Tiezhu, I think the carsh is caused by "irq-triggered re-re-enter" clear the previous_kprobe status. An example things like, ... static void setup_singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb, int reenter) { union loongarch_instruction insn; if (reenter) { save_previous_kprobe(kcb); =================== <- irq and trigger re-re-enter in its handler set_current_kprobe(p); kcb->kprobe_status = KPROBE_REENTER; } else { kcb->kprobe_status = KPROBE_HIT_SS; } ... We should assure the previous_kprobe status not be changed after re-enter. So this `local_irq_disable` should be set in reenter block begin. And for !reenter block, `local_irq_disable` may be not needed. Jinyang
On 04/09/2025 10:17 AM, Jinyang He wrote: > On 2025-04-08 17:27, Tiezhu Yang wrote: > >> Currently, interrupts need to be disabled before single-step mode is set, >> it requires that the CSR_PRMD_PIE must be cleared in save_local_irqflag() >> which is called by setup_singlestep(), this is reasonable. >> >> But in the first kprobe breakpoint exception, if the irq is enabled at >> the >> beginning of do_bp(), it will not be disabled at the end of do_bp() >> due to >> the CSR_PRMD_PIE has been cleared in save_local_irqflag(). For this case, >> it may corrupt exception context when restoring exception after >> do_bp() in >> handle_bp(), this is not reasonable. >> >> Based on the above analysis, in order to make sure the irq is disabled at >> the end of do_bp() for the first kprobe breakpoint exception, it is >> proper >> to disable irq first before clearing CSR_PRMD_PIE in >> save_local_irqflag(). >> >> Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support") >> Co-developed-by: Tianyang Zhang <zhangtianyang@loongson.cn> >> Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> >> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> >> --- >> arch/loongarch/kernel/kprobes.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/arch/loongarch/kernel/kprobes.c >> b/arch/loongarch/kernel/kprobes.c >> index 8ba391cfabb0..6eab97636e6b 100644 >> --- a/arch/loongarch/kernel/kprobes.c >> +++ b/arch/loongarch/kernel/kprobes.c >> @@ -113,6 +113,7 @@ NOKPROBE_SYMBOL(set_current_kprobe); >> static void save_local_irqflag(struct kprobe_ctlblk *kcb, >> struct pt_regs *regs) >> { >> + local_irq_disable(); >> kcb->saved_status = regs->csr_prmd; >> regs->csr_prmd &= ~CSR_PRMD_PIE; >> } > > Hi, Tiezhu, > > I think the carsh is caused by "irq-triggered re-re-enter" clear > the previous_kprobe status. An example things like, > > ... > static void setup_singlestep(struct kprobe *p, struct pt_regs *regs, > struct kprobe_ctlblk *kcb, int reenter) > { > union loongarch_instruction insn; > > if (reenter) { > save_previous_kprobe(kcb); > =================== <- irq and trigger re-re-enter in its handler > set_current_kprobe(p); > kcb->kprobe_status = KPROBE_REENTER; > } else { > kcb->kprobe_status = KPROBE_HIT_SS; > } > ... > > We should assure the previous_kprobe status not be changed after re-enter. > So this `local_irq_disable` should be set in reenter block begin. > And for !reenter block, `local_irq_disable` may be not needed. If you mean to do the following change: diff --git a/arch/loongarch/kernel/kprobes.c b/arch/loongarch/kernel/kprobes.c index 8ba391cfabb0..1ad67a3c7b70 100644 --- a/arch/loongarch/kernel/kprobes.c +++ b/arch/loongarch/kernel/kprobes.c @@ -158,6 +158,7 @@ static void setup_singlestep(struct kprobe *p, struct pt_regs *regs, union loongarch_instruction insn; if (reenter) { + local_irq_disable(); save_previous_kprobe(kcb); set_current_kprobe(p); kcb->kprobe_status = KPROBE_REENTER; then it can not fix the case Tianyang and I have met. With the above change, the issue of kernel hang can still be reproduced after a few hours. With the original change of this patch, the issue of kernel hang can not be reproduced for several days, it works well. Maybe what you said is another case, but anyway, the original change of this patch is safe for any case because its coverage is more extensive, so just keep it as is. Thanks, Tiezhu
On 2025-04-11 10:48, Tiezhu Yang wrote: > On 04/09/2025 10:17 AM, Jinyang He wrote: >> On 2025-04-08 17:27, Tiezhu Yang wrote: >> >>> Currently, interrupts need to be disabled before single-step mode is >>> set, >>> it requires that the CSR_PRMD_PIE must be cleared in >>> save_local_irqflag() >>> which is called by setup_singlestep(), this is reasonable. >>> >>> But in the first kprobe breakpoint exception, if the irq is enabled at >>> the >>> beginning of do_bp(), it will not be disabled at the end of do_bp() >>> due to >>> the CSR_PRMD_PIE has been cleared in save_local_irqflag(). For this >>> case, >>> it may corrupt exception context when restoring exception after >>> do_bp() in >>> handle_bp(), this is not reasonable. >>> >>> Based on the above analysis, in order to make sure the irq is >>> disabled at >>> the end of do_bp() for the first kprobe breakpoint exception, it is >>> proper >>> to disable irq first before clearing CSR_PRMD_PIE in >>> save_local_irqflag(). >>> >>> Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support") >>> Co-developed-by: Tianyang Zhang <zhangtianyang@loongson.cn> >>> Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> >>> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> >>> --- >>> arch/loongarch/kernel/kprobes.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/arch/loongarch/kernel/kprobes.c >>> b/arch/loongarch/kernel/kprobes.c >>> index 8ba391cfabb0..6eab97636e6b 100644 >>> --- a/arch/loongarch/kernel/kprobes.c >>> +++ b/arch/loongarch/kernel/kprobes.c >>> @@ -113,6 +113,7 @@ NOKPROBE_SYMBOL(set_current_kprobe); >>> static void save_local_irqflag(struct kprobe_ctlblk *kcb, >>> struct pt_regs *regs) >>> { >>> + local_irq_disable(); >>> kcb->saved_status = regs->csr_prmd; >>> regs->csr_prmd &= ~CSR_PRMD_PIE; >>> } >> >> Hi, Tiezhu, >> >> I think the carsh is caused by "irq-triggered re-re-enter" clear >> the previous_kprobe status. An example things like, >> >> ... >> static void setup_singlestep(struct kprobe *p, struct pt_regs *regs, >> struct kprobe_ctlblk *kcb, int reenter) >> { >> union loongarch_instruction insn; >> >> if (reenter) { >> save_previous_kprobe(kcb); >> =================== <- irq and trigger re-re-enter in its handler >> set_current_kprobe(p); >> kcb->kprobe_status = KPROBE_REENTER; >> } else { >> kcb->kprobe_status = KPROBE_HIT_SS; >> } >> ... >> >> We should assure the previous_kprobe status not be changed after >> re-enter. >> So this `local_irq_disable` should be set in reenter block begin. >> And for !reenter block, `local_irq_disable` may be not needed. > > If you mean to do the following change: > > diff --git a/arch/loongarch/kernel/kprobes.c > b/arch/loongarch/kernel/kprobes.c > index 8ba391cfabb0..1ad67a3c7b70 100644 > --- a/arch/loongarch/kernel/kprobes.c > +++ b/arch/loongarch/kernel/kprobes.c > @@ -158,6 +158,7 @@ static void setup_singlestep(struct kprobe *p, > struct pt_regs *regs, > union loongarch_instruction insn; > > if (reenter) { > + local_irq_disable(); > save_previous_kprobe(kcb); > set_current_kprobe(p); > kcb->kprobe_status = KPROBE_REENTER; > > then it can not fix the case Tianyang and I have met. > > With the above change, the issue of kernel hang can > still be reproduced after a few hours. > > With the original change of this patch, the issue of > kernel hang can not be reproduced for several days, > it works well. > > Maybe what you said is another case, but anyway, the > original change of this patch is safe for any case > because its coverage is more extensive, so just keep > it as is. > static void setup_singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb, int reenter) { union loongarch_instruction insn; if (reenter) { -> My meaning, local_irq_disable(); LabelA: save_previous_kprobe(kcb); set_current_kprobe(p); kcb->kprobe_status = KPROBE_REENTER; LabelC: } else { LabelD: kcb->kprobe_status = KPROBE_HIT_SS; } if (p->ainsn.insn) { -> Yours, local_irq_disable(); LabelB: /* IRQs and single stepping do not mix well */ save_local_irqflag(kcb, regs); /* set ip register to prepare for single stepping */ regs->csr_era = (unsigned long)p->ainsn.insn; } else { /* simulate single steping */ insn.word = p->opcode; arch_simulate_insn(insn, regs); /* now go for post processing */ post_kprobe_handler(p, kcb, regs); } } I have just explained the previous status may be broken by IRQ. That's why we should protected LabelA->LabelC. For the case `p->ainsn.insn`, my meaning is just not protecte LabelD. If that is wrong, the change should be changed to static void setup_singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb, int reenter) { union loongarch_instruction insn; + if (reenter || p->ainsn.insn) local_irq_disable(); ... other codes. } On the other hand, have you tried only fix do_bp weather cause hang? bool this_bp_ie = regs->csr_prmd & CSR_PRMD_PIE; if (this_bp_ie) local_irq_enable(); ... if (this_bp_ie) local_irq_disable();
On 04/11/2025 12:46 PM, Jinyang He wrote: > > On 2025-04-11 10:48, Tiezhu Yang wrote: >> On 04/09/2025 10:17 AM, Jinyang He wrote: >>> On 2025-04-08 17:27, Tiezhu Yang wrote: ... > I have just explained the previous status may be broken by IRQ. The initial aim is to make sure the irq is disabled at the end of do_bp(), so let us narrow down the scope. > On the other hand, have you tried only fix do_bp weather cause hang? > > bool this_bp_ie = regs->csr_prmd & CSR_PRMD_PIE; > if (this_bp_ie) > local_irq_enable(); > ... > if (this_bp_ie) > local_irq_disable(); This is a good idea, thank you. I will test the following change, if it works well and no more comments, I will send v2 in the next week. ---8<--- diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c index 2ec3106c0da3..68cc4165578a 100644 --- a/arch/loongarch/kernel/traps.c +++ b/arch/loongarch/kernel/traps.c @@ -710,11 +710,12 @@ asmlinkage void noinstr do_bce(struct pt_regs *regs) asmlinkage void noinstr do_bp(struct pt_regs *regs) { bool user = user_mode(regs); + bool pie = regs->csr_prmd & CSR_PRMD_PIE; unsigned int opcode, bcode; unsigned long era = exception_era(regs); irqentry_state_t state = irqentry_enter(regs); - if (regs->csr_prmd & CSR_PRMD_PIE) + if (pie) local_irq_enable(); if (__get_inst(&opcode, (u32 *)era, user)) @@ -780,7 +781,7 @@ asmlinkage void noinstr do_bp(struct pt_regs *regs) } out: - if (regs->csr_prmd & CSR_PRMD_PIE) + if (pie) local_irq_disable(); irqentry_exit(regs, state); Thanks, Tiezhu
Hi, Tiezhu, On Fri, Apr 11, 2025 at 7:37 PM Tiezhu Yang <yangtiezhu@loongson.cn> wrote: > > On 04/11/2025 12:46 PM, Jinyang He wrote: > > > > On 2025-04-11 10:48, Tiezhu Yang wrote: > >> On 04/09/2025 10:17 AM, Jinyang He wrote: > >>> On 2025-04-08 17:27, Tiezhu Yang wrote: > > ... > > > I have just explained the previous status may be broken by IRQ. > > The initial aim is to make sure the irq is disabled at the end of > do_bp(), so let us narrow down the scope. > > > On the other hand, have you tried only fix do_bp weather cause hang? > > > > bool this_bp_ie = regs->csr_prmd & CSR_PRMD_PIE; > > if (this_bp_ie) > > local_irq_enable(); > > ... > > if (this_bp_ie) > > local_irq_disable(); > > This is a good idea, thank you. > > I will test the following change, if it works well and no more comments, > I will send v2 in the next week. If it works, I think there is a more fundamental problem, all do_xyz() should be fixed because PRMD may change during handling. And you can use bool pie = regs_irqs_disabled(regs) instead of bool pie = regs->csr_prmd & CSR_PRMD_PIE Huacai > > ---8<--- > diff --git a/arch/loongarch/kernel/traps.c b/arch/loongarch/kernel/traps.c > index 2ec3106c0da3..68cc4165578a 100644 > --- a/arch/loongarch/kernel/traps.c > +++ b/arch/loongarch/kernel/traps.c > @@ -710,11 +710,12 @@ asmlinkage void noinstr do_bce(struct pt_regs *regs) > asmlinkage void noinstr do_bp(struct pt_regs *regs) > { > bool user = user_mode(regs); > + bool pie = regs->csr_prmd & CSR_PRMD_PIE; > unsigned int opcode, bcode; > unsigned long era = exception_era(regs); > irqentry_state_t state = irqentry_enter(regs); > > - if (regs->csr_prmd & CSR_PRMD_PIE) > + if (pie) > local_irq_enable(); > > if (__get_inst(&opcode, (u32 *)era, user)) > @@ -780,7 +781,7 @@ asmlinkage void noinstr do_bp(struct pt_regs *regs) > } > > out: > - if (regs->csr_prmd & CSR_PRMD_PIE) > + if (pie) > local_irq_disable(); > > irqentry_exit(regs, state); > > Thanks, > Tiezhu > >
© 2016 - 2025 Red Hat, Inc.