Disabling preemption in xen_irq_enable() is not needed. There is no
risk of missing events due to preemption, as preemption can happen
only in case an event is being received, which is just the opposite
of missing an event.
Signed-off-by: Juergen Gross <jgross@suse.com>
---
arch/x86/xen/irq.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index dfa091d79c2e..ba9b14a97109 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
 {
 	struct vcpu_info *vcpu;
 
-	/*
-	 * We may be preempted as soon as vcpu->evtchn_upcall_mask is
-	 * cleared, so disable preemption to ensure we check for
-	 * events on the VCPU we are still running on.
-	 */
-	preempt_disable();
-
 	vcpu = this_cpu_read(xen_vcpu);
 	vcpu->evtchn_upcall_mask = 0;
 
-	/* Doesn't matter if we get preempted here, because any
-	   pending event will get dealt with anyway. */
+	/*
+	 * Now preemption could happen, but this is only possible if an event
+	 * was handled, so missing an event due to preemption is not
+	 * possible at all.
+	 * The worst possible case is to be preempted and then check events
+	 * pending on the old vcpu, but this is not problematic.
+	 */
 
 	barrier(); /* unmask then check (avoid races) */
 	if (unlikely(vcpu->evtchn_upcall_pending))
 		xen_force_evtchn_callback();
-
-	preempt_enable();
 }
 PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable);
--
2.26.2
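
For reference, this is how xen_irq_enable() would read with the hunk
above applied — reconstructed here from the diff for easier review, not
an authoritative tree excerpt:

	asmlinkage __visible void xen_irq_enable(void)
	{
		struct vcpu_info *vcpu;

		vcpu = this_cpu_read(xen_vcpu);
		vcpu->evtchn_upcall_mask = 0;

		/*
		 * Now preemption could happen, but this is only possible if an event
		 * was handled, so missing an event due to preemption is not
		 * possible at all.
		 * The worst possible case is to be preempted and then check events
		 * pending on the old vcpu, but this is not problematic.
		 */

		barrier(); /* unmask then check (avoid races) */
		if (unlikely(vcpu->evtchn_upcall_pending))
			xen_force_evtchn_callback();
	}
	PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable);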
On 21.09.2021 09:02, Juergen Gross wrote:
> --- a/arch/x86/xen/irq.c
> +++ b/arch/x86/xen/irq.c
> @@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
> [...]
> +	/*
> +	 * Now preemption could happen, but this is only possible if an event
> +	 * was handled, so missing an event due to preemption is not
> +	 * possible at all.
> +	 * The worst possible case is to be preempted and then check events
> +	 * pending on the old vcpu, but this is not problematic.
> +	 */

I agree this isn't problematic from a functional perspective, but ...

> 	barrier(); /* unmask then check (avoid races) */
> 	if (unlikely(vcpu->evtchn_upcall_pending))
> 		xen_force_evtchn_callback();

... is a stray call here cheaper than ...

> -
> -	preempt_enable();

... the preempt_{dis,en}able() pair?

Jan
On 21.09.21 09:53, Jan Beulich wrote:
> On 21.09.2021 09:02, Juergen Gross wrote:
>> [...]
>
> I agree this isn't problematic from a functional perspective, but ...
>
>> 	barrier(); /* unmask then check (avoid races) */
>> 	if (unlikely(vcpu->evtchn_upcall_pending))
>> 		xen_force_evtchn_callback();
>
> ... is a stray call here cheaper than ...
>
>> -
>> -	preempt_enable();
>
> ... the preempt_{dis,en}able() pair?

The question is if a stray call in case of preemption (very unlikely)
is cheaper than the preempt_{dis|en}able() pair on each IRQ enabling.

I'm quite sure removing the preempt_*() calls will be a net benefit.

Juergen
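
For a sense of the cost being weighed here, this is roughly what the
two primitives expand to on a CONFIG_PREEMPT build (paraphrased from
include/linux/preempt.h; exact details vary by config and kernel
version):

	#define preempt_disable() \
	do { \
		preempt_count_inc();	/* bump the per-cpu preempt count */ \
		barrier(); \
	} while (0)

	#define preempt_enable() \
	do { \
		barrier(); \
		/* drop the count and reschedule if preemption was requested */ \
		if (unlikely(preempt_count_dec_and_test())) \
			__preempt_schedule(); \
	} while (0)

That is, every xen_irq_enable() pays an increment plus a
decrement-and-test (plus tracing hooks on debug configs), while the
stray xen_force_evtchn_callback() is only paid in the rare case of
being preempted right in that window.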
On 21.09.2021 09:58, Juergen Gross wrote:
> On 21.09.21 09:53, Jan Beulich wrote:
>> [...]
>> ... is a stray call here cheaper than ...
>> [...]
>> ... the preempt_{dis,en}able() pair?
>
> The question is if a stray call in case of preemption (very unlikely)
> is cheaper than the preempt_{dis|en}able() pair on each IRQ enabling.
>
> I'm quite sure removing the preempt_*() calls will be a net benefit.

Well, yes, I agree. It would have been nice if the description pointed
out the fact that preemption kicking in precisely here is very unlikely.
But perhaps that's considered rather obvious ... The issue I'm having
is with the prior comments: They indicated that preemption happening
before the "pending" check would be okay, _despite_ the
preempt_{dis,en}able() pair. One could view this as an indication that
this pair was put there for another reason (e.g. to avoid the stray
calls). But it may of course also be that the comment simply was stale.

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
On 21.09.21 10:11, Jan Beulich wrote:
> On 21.09.2021 09:58, Juergen Gross wrote:
>> [...]
>> I'm quite sure removing the preempt_*() calls will be a net benefit.
>
> Well, yes, I agree. It would have been nice if the description pointed
> out the fact that preemption kicking in precisely here is very unlikely.
> But perhaps that's considered rather obvious ... The issue I'm having
> is with the prior comments: They indicated that preemption happening
> before the "pending" check would be okay, _despite_ the
> preempt_{dis,en}able() pair. One could view this as an indication that
> this pair was put there for another reason (e.g. to avoid the stray
> calls). But it may of course also be that the comment simply was stale.

The comment is older than the preempt_*() calls. Those were added 8
years ago claiming they'd prevent lost events, but at the same time at
least one other patch was added which really prevented lost events, so
adding the preempt_*() calls might just have been a guess at that time.

> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks,

Juergen
On 21.09.21 09:02, Juergen Gross wrote:
> Disabling preemption in xen_irq_enable() is not needed. There is no
> risk of missing events due to preemption, as preemption can happen
> only in case an event is being received, which is just the opposite
> of missing an event.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> [...]

Please ignore this patch, it is superseded now by
"[PATCH v2 0/2] x86/xen: simplify irq pvops".

Juergen
On Tue, Sep 21, 2021 at 09:02:26AM +0200, Juergen Gross wrote:
> Disabling preemption in xen_irq_enable() is not needed. There is no
> risk of missing events due to preemption, as preemption can happen
> only in case an event is being received, which is just the opposite
> of missing an event.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> [...]

So the reason I asked about this is:

  vmlinux.o: warning: objtool: xen_irq_disable()+0xa: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: xen_irq_enable()+0xb: call to preempt_count_add() leaves .noinstr.text section

as reported by sfr here:

  https://lkml.kernel.org/r/20210920113809.18b9b70c@canb.auug.org.au

(I'm still not entirely sure why I didn't see them in my build, or why
0day didn't either)

Anyway, I can 'fix' xen_irq_disable(), see below, but I'm worried about
that still having a hole vs the preempt model. Consider:

  xen_irq_disable()
    preempt_disable();
    <IRQ>
      set_tif_need_resched()
    </IRQ no preemption because preempt_count!=0>
    this_cpu_read(xen_vcpu)->evtchn_upcall_mask = 1; // IRQs are actually disabled
    preempt_enable_no_resched(); // can't resched because IRQs are disabled

  ...

  xen_irq_enable()
    preempt_disable();
    vcpu->evtchn_upcall_mask = 0; // IRQs are on
    preempt_enable(); // catches the resched from above

Now your patch removes that preempt_enable() and we'll have a missing
preemption.

Trouble is, because this is noinstr, we can't do schedule() ...
catch-22.

---
Subject: x86/xen: Fixup noinstr in xen_irq_{en,dis}able()
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon Sep 20 13:46:19 CEST 2021

vmlinux.o: warning: objtool: xen_irq_disable()+0xa: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: xen_irq_enable()+0xb: call to preempt_count_add() leaves .noinstr.text section

XXX, trades it for:

vmlinux.o: warning: objtool: xen_irq_enable()+0x5c: call to __SCT__preempt_schedule_notrace() leaves .noinstr.text section

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/xen/irq.c |   24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -44,12 +44,18 @@ __PV_CALLEE_SAVE_REGS_THUNK(xen_save_fl,
 
 asmlinkage __visible noinstr void xen_irq_disable(void)
 {
-	/* There's a one instruction preempt window here. We need to
-	   make sure we're don't switch CPUs between getting the vcpu
-	   pointer and updating the mask. */
-	preempt_disable();
+	/*
+	 * There's a one instruction preempt window here. We need to
+	 * make sure we're don't switch CPUs between getting the vcpu
+	 * pointer and updating the mask.
+	 */
+	preempt_disable_notrace();
 	this_cpu_read(xen_vcpu)->evtchn_upcall_mask = 1;
-	preempt_enable_no_resched();
+	/*
+	 * We have IRQs disabled at this point, rescheduling isn't going to
+	 * happen, so no point calling into the scheduler for it.
+	 */
+	preempt_enable_no_resched_notrace();
 }
 __PV_CALLEE_SAVE_REGS_THUNK(xen_irq_disable, ".noinstr.text");
 
@@ -62,7 +68,7 @@ asmlinkage __visible noinstr void xen_ir
 	 * cleared, so disable preemption to ensure we check for
 	 * events on the VCPU we are still running on.
 	 */
-	preempt_disable();
+	preempt_disable_notrace();
 
 	vcpu = this_cpu_read(xen_vcpu);
 	vcpu->evtchn_upcall_mask = 0;
@@ -74,7 +80,11 @@ asmlinkage __visible noinstr void xen_ir
 	if (unlikely(vcpu->evtchn_upcall_pending))
 		xen_force_evtchn_callback();
 
-	preempt_enable();
+	/*
+	 * XXX if we noinstr we shouldn't be calling schedule(), OTOH we also
+	 * cannot not schedule() as that would violate PREEMPT.
+	 */
+	preempt_enable_notrace();
 }
 __PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable, ".noinstr.text");
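
For context, the _notrace variants used above differ from the plain
ones in that they manipulate the preempt count via the raw
__preempt_count ops, skipping the instrumented preempt_count_add()/
preempt_count_sub() paths that are not noinstr-safe — roughly (again
paraphrased from include/linux/preempt.h; details vary by config):

	#define preempt_disable_notrace() \
	do { \
		__preempt_count_inc();	/* raw count bump, no tracing hooks */ \
		barrier(); \
	} while (0)

	#define preempt_enable_notrace() \
	do { \
		barrier(); \
		if (unlikely(__preempt_count_dec_and_test())) \
			__preempt_schedule_notrace(); \
	} while (0)

That last call is what turns the original preempt_count_add() warnings
into the __SCT__preempt_schedule_notrace() warning mentioned in the
changelog above.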
On 21.09.21 10:27, Peter Zijlstra wrote:
> On Tue, Sep 21, 2021 at 09:02:26AM +0200, Juergen Gross wrote:
>> [...]
>
> Anyway, I can 'fix' xen_irq_disable(), see below, but I'm worried about
> that still having a hole vs the preempt model. Consider:
>
>   xen_irq_disable()
>     preempt_disable();
>     <IRQ>
>       set_tif_need_resched()
>     </IRQ no preemption because preempt_count!=0>
>     this_cpu_read(xen_vcpu)->evtchn_upcall_mask = 1; // IRQs are actually disabled
>     preempt_enable_no_resched(); // can't resched because IRQs are disabled
>
>   ...
>
>   xen_irq_enable()
>     preempt_disable();
>     vcpu->evtchn_upcall_mask = 0; // IRQs are on
>     preempt_enable(); // catches the resched from above
>
> Now your patch removes that preempt_enable() and we'll have a missing
> preemption.
>
> Trouble is, because this is noinstr, we can't do schedule() ... catch-22.

I think it is even worse. Looking at xen_save_fl() there is clearly a
missing preempt_disable().

But I think this all can be resolved by avoiding the need to disable
preemption in those calls (xen_save_fl(), xen_irq_disable() and
xen_irq_enable()).

Right now disabling preemption is needed because the flag to be tested
or modified is reached via a pointer (xen_vcpu) stored in the percpu
area. Looking at where it might point reveals that the target address
is either an array indexed by smp_processor_id() or a percpu variable
of the local cpu (xen_vcpu_info).

Nowadays (since Xen 3.4, which is older than our minimal supported Xen
version) the array indexed by smp_processor_id() is used only during
early boot (interrupts are always off, only the boot cpu is running)
and just after coming back from suspending the system (e.g. when being
live migrated). Early boot should be no problem, and the suspend case
isn't either, as that is happening under control of stop_machine()
(interrupts off on all cpus).

So I think I can switch the whole mess to only need to work on the
local percpu xen_vcpu_info instance, which will always access the
"correct" area via %gs.

Let me have a try ...

Juergen
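
A minimal sketch of what that direction could look like, assuming the
whole access can go through the local CPU's xen_vcpu_info percpu
variable — hypothetical code; the actual v2 series may differ:

	noinstr void xen_irq_enable(void)
	{
		/*
		 * A single %gs-relative store: there is no window between
		 * computing the per-cpu address and writing to it, so there
		 * is nothing for preemption to break and no need for
		 * preempt_disable().
		 */
		this_cpu_write(xen_vcpu_info.evtchn_upcall_mask, 0);

		barrier(); /* unmask then check (avoid races) */
		if (unlikely(this_cpu_read(xen_vcpu_info.evtchn_upcall_pending)))
			xen_force_evtchn_callback();
	}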