Moves sti directly after the cr2 read and immediately after the #PF
handler.

While in the area, remove the redundant q suffix from a movq in entry.S

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
I don't think this is a bug as much as an accident about to happen. Even if
there's no cases at the moment in which the IRQ handler may page fault, that
might change in the future.
Note: I haven't tested it extensively beyond running it on GitLab.
pipeline:
https://gitlab.com/xen-project/people/agvallejo/xen/-/pipelines/1449182525
---
xen/arch/x86/traps.c | 2 ++
xen/arch/x86/x86_64/entry.S | 11 +++++++++--
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 708136f625..1c04c03d9f 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1600,6 +1600,8 @@ void asmlinkage do_page_fault(struct cpu_user_regs *regs)
addr = read_cr2();
+ local_irq_enable();
+
/* fixup_page_fault() might change regs->error_code, so cache it here. */
error_code = regs->error_code;
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index b8482de8ee..ef803f6288 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -844,8 +844,7 @@ handle_exception_saved:
#elif !defined(CONFIG_PV)
ASSERT_CONTEXT_IS_XEN
#endif /* CONFIG_PV */
- sti
-1: movq %rsp,%rdi
+1: mov %rsp,%rdi
movzbl UREGS_entry_vector(%rsp),%eax
#ifdef CONFIG_PERF_COUNTERS
lea per_cpu__perfcounters(%rip), %rcx
@@ -866,7 +865,15 @@ handle_exception_saved:
jmp .L_exn_dispatch_done; \
.L_ ## vec ## _done:
+ /*
+ * IRQs kept off to derisk being hit by a nested interrupt before
+ * reading %cr2. Otherwise a page fault in the nested interrupt hadnler
+ * would corrupt %cr2.
+ */
DISPATCH(X86_EXC_PF, do_page_fault)
+
+ sti
+
DISPATCH(X86_EXC_GP, do_general_protection)
DISPATCH(X86_EXC_UD, do_invalid_op)
DISPATCH(X86_EXC_NM, do_device_not_available)
--
2.46.0
On 11/09/2024 3:58 pm, Alejandro Vallejo wrote:
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index b8482de8ee..ef803f6288 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -844,8 +844,7 @@ handle_exception_saved:
> #elif !defined(CONFIG_PV)
> ASSERT_CONTEXT_IS_XEN
> #endif /* CONFIG_PV */
> - sti
> -1: movq %rsp,%rdi
> +1: mov %rsp,%rdi
> movzbl UREGS_entry_vector(%rsp),%eax
> #ifdef CONFIG_PERF_COUNTERS
> lea per_cpu__perfcounters(%rip), %rcx

I'm afraid this isn't correct. The STI is only on one of two paths to
the dispatch logic.

Right now, you're re-enabling interrupts even if #PF hits an irqs-off
region in Xen.

You must not enable IRQs if going via the exception_with_ints_disabled
path, which is the user of that 1: label immediately after STI.

~Andrew
On Thu Sep 12, 2024 at 10:49 AM BST, Andrew Cooper wrote:
> On 11/09/2024 3:58 pm, Alejandro Vallejo wrote:
> > diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> > index b8482de8ee..ef803f6288 100644
> > --- a/xen/arch/x86/x86_64/entry.S
> > +++ b/xen/arch/x86/x86_64/entry.S
> > @@ -844,8 +844,7 @@ handle_exception_saved:
> > #elif !defined(CONFIG_PV)
> > ASSERT_CONTEXT_IS_XEN
> > #endif /* CONFIG_PV */
> > - sti
> > -1: movq %rsp,%rdi
> > +1: mov %rsp,%rdi
> > movzbl UREGS_entry_vector(%rsp),%eax
> > #ifdef CONFIG_PERF_COUNTERS
> > lea per_cpu__perfcounters(%rip), %rcx
>
> I'm afraid this isn't correct. The STI is only on one of two paths to
> the dispatch logic.
>
> Right now, you're re-enabling interrupts even if #PF hits an irqs-off
> region in Xen.
>
> You must not enable IRQs if going via the exception_with_ints_disabled
> path, which is the user of that 1: label immediately after STI.
>
> ~Andrew

Well, darn. That's a well-hidden Waldo. I'll send a v2 with conditional
enables in C and assembly, and a change of that label from "1" to ".Lfoo"
to clearly imply the control flow might take a backflip from several miles
down the file.

Cheers,
Alejandro
On Wed, Sep 11, 2024 at 03:58:23PM +0100, Alejandro Vallejo wrote:
> Moves sti directly after the cr2 read and immediately after the #PF
> handler.
>
> While in the area, remove the redundant q suffix from a movq in entry.S
>
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> I don't think this is a bug as much as an accident about to happen. Even if
> there's no cases at the moment in which the IRQ handler may page fault, that
> might change in the future.
>
> Note: I haven't tested it extensively beyond running it on GitLab.
>
> pipeline:
> https://gitlab.com/xen-project/people/agvallejo/xen/-/pipelines/1449182525
>
> ---
> xen/arch/x86/traps.c | 2 ++
> xen/arch/x86/x86_64/entry.S | 11 +++++++++--
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 708136f625..1c04c03d9f 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1600,6 +1600,8 @@ void asmlinkage do_page_fault(struct cpu_user_regs *regs)
>
> addr = read_cr2();
>
> + local_irq_enable();

I would maybe add an ASSERT(!local_irq_is_enabled()); at the top of the
function, just to make sure the context is as expected.

> +
> /* fixup_page_fault() might change regs->error_code, so cache it here. */
> error_code = regs->error_code;
>
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index b8482de8ee..ef803f6288 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -844,8 +844,7 @@ handle_exception_saved:
> #elif !defined(CONFIG_PV)
> ASSERT_CONTEXT_IS_XEN
> #endif /* CONFIG_PV */
> - sti
> -1: movq %rsp,%rdi
> +1: mov %rsp,%rdi

Since you are modifying this already - we usually add a space between
the comma and the next operand.

Thanks, Roger.
On Wed, Sep 11, 2024 at 3:58 PM Alejandro Vallejo <alejandro.vallejo@cloud.com> wrote:
>
> Moves sti directly after the cr2 read and immediately after the #PF
> handler.
>
> While in the area, remove the redundant q suffix from a movq in entry.S
>
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> I don't think this is a bug as much as an accident about to happen. Even if
> there's no cases at the moment in which the IRQ handler may page fault, that
> might change in the future.
>
> Note: I haven't tested it extensively beyond running it on GitLab.
>
> pipeline:
> https://gitlab.com/xen-project/people/agvallejo/xen/-/pipelines/1449182525
>
> ---
> xen/arch/x86/traps.c | 2 ++
> xen/arch/x86/x86_64/entry.S | 11 +++++++++--
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 708136f625..1c04c03d9f 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1600,6 +1600,8 @@ void asmlinkage do_page_fault(struct cpu_user_regs *regs)
>
> addr = read_cr2();
>
> + local_irq_enable();
> +
> /* fixup_page_fault() might change regs->error_code, so cache it here. */
> error_code = regs->error_code;
>
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index b8482de8ee..ef803f6288 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -844,8 +844,7 @@ handle_exception_saved:
> #elif !defined(CONFIG_PV)
> ASSERT_CONTEXT_IS_XEN
> #endif /* CONFIG_PV */
> - sti
> -1: movq %rsp,%rdi
> +1: mov %rsp,%rdi
> movzbl UREGS_entry_vector(%rsp),%eax
> #ifdef CONFIG_PERF_COUNTERS
> lea per_cpu__perfcounters(%rip), %rcx
> @@ -866,7 +865,15 @@ handle_exception_saved:
> jmp .L_exn_dispatch_done; \
> .L_ ## vec ## _done:
>
> + /*
> + * IRQs kept off to derisk being hit by a nested interrupt before
> + * reading %cr2. Otherwise a page fault in the nested interrupt hadnler

Minor, typo: hadnler -> handler

> + * would corrupt %cr2.
> + */
> DISPATCH(X86_EXC_PF, do_page_fault)
> +
> + sti
> +
> DISPATCH(X86_EXC_GP, do_general_protection)
> DISPATCH(X86_EXC_UD, do_invalid_op)
> DISPATCH(X86_EXC_NM, do_device_not_available)
>

Frediano
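
For illustration only, the conditional enable discussed in this thread could be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the v2 patch and not Xen code: `model_regs`, `model_cpu` and `model_page_fault_entry()` are hypothetical stand-ins invented here, and the only architectural fact relied on is that the interrupt-enable flag is bit 9 of the saved flags register. It combines the two review points: the handler is expected to be entered with IRQs off (Roger's assertion), and IRQs are turned back on after the %cr2 read only if the interrupted context had them on (Andrew's objection).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural interrupt-enable flag: bit 9 of EFLAGS/RFLAGS. */
#define MODEL_EFLAGS_IF (UINT64_C(1) << 9)

/* Hypothetical stand-in for the saved register frame (not Xen's struct). */
struct model_regs {
    uint64_t rflags;    /* flags of the context that took the fault */
};

/* Hypothetical stand-in for the state of the faulting CPU. */
struct model_cpu {
    bool irqs_enabled;  /* current interrupt-enable state */
    uint64_t cr2;       /* faulting linear address */
};

/*
 * Model of the start of a page fault handler: capture the fault address
 * while IRQs are still off, then restore the interrupted context's IRQ
 * state rather than enabling unconditionally.
 */
static uint64_t model_page_fault_entry(struct model_cpu *cpu,
                                       const struct model_regs *regs)
{
    uint64_t addr;

    /* The entry path is assumed to deliver the fault with IRQs off. */
    assert(!cpu->irqs_enabled);

    /* Read the fault address before any nested fault could overwrite it. */
    addr = cpu->cr2;

    /* Re-enable only if the faulting context itself ran with IRQs on. */
    if (regs->rflags & MODEL_EFLAGS_IF)
        cpu->irqs_enabled = true;

    return addr;
}
```

With a frame whose saved flags have bit 9 set, the model leaves IRQs enabled on return; with the bit clear (a fault taken inside an irqs-off region), IRQs stay off, which is the behaviour the unconditional enable in v1 would have broken.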