xen/arch/x86/crash.c | 10 ---------- xen/arch/x86/machine_kexec.c | 10 ++++++++++ 2 files changed, 10 insertions(+), 10 deletions(-)
The kexec_reloc() asm has an indirect jump to relocate onto the identity
trampoline. While we clear CET in machine_crash_shutdown(), we fail to clear
CET for the non-crash path. This in turn highlights that the same is true of
resetting the CPUID masking/faulting.
Move both pieces of logic from machine_crash_shutdown() to machine_kexec(),
the latter being common for all kexec transitions. Adjust the condition for
CET being considered active to check in CR4, which is simpler and more robust.
Fixes: 311434bfc9d1 ("x86/setup: Rework MSR_S_CET handling for CET-IBT")
Fixes: b60ab42db2f0 ("x86/shstk: Activate Supervisor Shadow Stacks")
Fixes: 5ab9564c6fa1 ("x86/cpu: Context switch cpuid masks and faulting state in context_switch()")
Reported-by: David Vrabel (XXX which alias to use?)
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: David Vrabel <dvrabel@cantab.net>
---
xen/arch/x86/crash.c | 10 ----------
xen/arch/x86/machine_kexec.c | 10 ++++++++++
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c
index 003222c0f1ac..99089f77a7eb 100644
--- a/xen/arch/x86/crash.c
+++ b/xen/arch/x86/crash.c
@@ -187,16 +187,6 @@ void machine_crash_shutdown(void)
nmi_shootdown_cpus();
- /* Reset CPUID masking and faulting to the host's default. */
- ctxt_switch_levelling(NULL);
-
- /* Disable CET. */
- if ( cpu_has_xen_shstk || cpu_has_xen_ibt )
- {
- wrmsrl(MSR_S_CET, 0);
- write_cr4(read_cr4() & ~X86_CR4_CET);
- }
-
info = kexec_crash_save_info();
info->xen_phys_start = xen_phys_start;
info->dom0_pfn_to_mfn_frame_list_list =
diff --git a/xen/arch/x86/machine_kexec.c b/xen/arch/x86/machine_kexec.c
index 751a9efcaf6a..d83aa4e7e93b 100644
--- a/xen/arch/x86/machine_kexec.c
+++ b/xen/arch/x86/machine_kexec.c
@@ -156,6 +156,16 @@ void machine_kexec(struct kexec_image *image)
*/
local_irq_disable();
+ /* Reset CPUID masking and faulting to the host's default. */
+ ctxt_switch_levelling(NULL);
+
+ /* Disable CET. */
+ if ( read_cr4() & X86_CR4_CET )
+ {
+ wrmsrl(MSR_S_CET, 0);
+ write_cr4(read_cr4() & ~X86_CR4_CET);
+ }
+
/* Now regular interrupts are disabled, we need to reduce the impact
* of interrupts not disabled by 'cli'.
*
--
2.11.0
On 07.03.2022 21:53, Andrew Cooper wrote: > --- a/xen/arch/x86/machine_kexec.c > +++ b/xen/arch/x86/machine_kexec.c > @@ -156,6 +156,16 @@ void machine_kexec(struct kexec_image *image) > */ > local_irq_disable(); > > + /* Reset CPUID masking and faulting to the host's default. */ > + ctxt_switch_levelling(NULL); > + > + /* Disable CET. */ > + if ( read_cr4() & X86_CR4_CET ) > + { > + wrmsrl(MSR_S_CET, 0); > + write_cr4(read_cr4() & ~X86_CR4_CET); > + } It just occurred to me: Isn't using read_cr4() here somewhat risky? If we crashed on the CR4 write in write_cr4(), we'd try to load CR4 here again with a value which may cause another fault. Jan
On 07.03.2022 21:53, Andrew Cooper wrote: > --- a/xen/arch/x86/machine_kexec.c > +++ b/xen/arch/x86/machine_kexec.c > @@ -156,6 +156,16 @@ void machine_kexec(struct kexec_image *image) > */ > local_irq_disable(); > > + /* Reset CPUID masking and faulting to the host's default. */ > + ctxt_switch_levelling(NULL); > + > + /* Disable CET. */ > + if ( read_cr4() & X86_CR4_CET ) > + { > + wrmsrl(MSR_S_CET, 0); > + write_cr4(read_cr4() & ~X86_CR4_CET); > + } > + > /* Now regular interrupts are disabled, we need to reduce the impact > * of interrupts not disabled by 'cli'. > * Besides introducing somewhat of a disconnect between the comment in context here and the earlier local_irq_disable(), is it really necessary to do both actions with IRQs off? Jan
On 08/03/2022 08:15, Jan Beulich wrote: > On 07.03.2022 21:53, Andrew Cooper wrote: >> --- a/xen/arch/x86/machine_kexec.c >> +++ b/xen/arch/x86/machine_kexec.c >> @@ -156,6 +156,16 @@ void machine_kexec(struct kexec_image *image) >> */ >> local_irq_disable(); >> >> + /* Reset CPUID masking and faulting to the host's default. */ >> + ctxt_switch_levelling(NULL); >> + >> + /* Disable CET. */ >> + if ( read_cr4() & X86_CR4_CET ) >> + { >> + wrmsrl(MSR_S_CET, 0); >> + write_cr4(read_cr4() & ~X86_CR4_CET); >> + } >> + >> /* Now regular interrupts are disabled, we need to reduce the impact >> * of interrupts not disabled by 'cli'. >> * > Besides introducing somewhat of a disconnect between the comment in > context here and the earlier local_irq_disable(), is it really > necessary to do both actions with IRQs off? We are a handful of instructions away from discarding Xen's context entirely. IRQs are not a relevant concern. If we're nitpicking, irqs want to be off before kexecing gets set, because absolutely nothing good can come of handling interrupts later than that point. ~Andrew
On 08.03.2022 17:22, Andrew Cooper wrote: > On 08/03/2022 08:15, Jan Beulich wrote: >> On 07.03.2022 21:53, Andrew Cooper wrote: >>> --- a/xen/arch/x86/machine_kexec.c >>> +++ b/xen/arch/x86/machine_kexec.c >>> @@ -156,6 +156,16 @@ void machine_kexec(struct kexec_image *image) >>> */ >>> local_irq_disable(); >>> >>> + /* Reset CPUID masking and faulting to the host's default. */ >>> + ctxt_switch_levelling(NULL); >>> + >>> + /* Disable CET. */ >>> + if ( read_cr4() & X86_CR4_CET ) >>> + { >>> + wrmsrl(MSR_S_CET, 0); >>> + write_cr4(read_cr4() & ~X86_CR4_CET); >>> + } >>> + >>> /* Now regular interrupts are disabled, we need to reduce the impact >>> * of interrupts not disabled by 'cli'. >>> * >> Besides introducing somewhat of a disconnect between the comment in >> context here and the earlier local_irq_disable(), is it really >> necessary to do both actions with IRQs off? > > We are a handful of instructions away from discarding Xen's context > entirely. IRQs are not a relevant concern. Well, as said - the comment was what caught my eye. But as you appear to think that slight disconnect is not an issue: I don't mean my remark to be an objection. Feel free to commit with David's R-b. Jan
On 07/03/2022 20:53, Andrew Cooper wrote: > The kexec_reloc() asm has an indirect jump to relocate onto the identity > trampoline. While we clear CET in machine_crash_shutdown(), we fail to clear > CET for the non-crash path. This in turn highlights that the same is true of > resetting the CPUID masking/faulting. > > Move both pieces of logic from machine_crash_shutdown() to machine_kexec(), > the latter being common for all kexec transitions. Adjust the condition for > CET being considered active to check in CR4, which is simpler and more robust. Reviewed-by: David Vrabel <dvrabel@amazon.co.uk> > Fixes: 311434bfc9d1 ("x86/setup: Rework MSR_S_CET handling for CET-IBT") > Fixes: b60ab42db2f0 ("x86/shstk: Activate Supervisor Shadow Stacks") > Fixes: 5ab9564c6fa1 ("x86/cpu: Context switch cpuid masks and faulting state in context_switch()") > Reported-by: David Vrabel (XXX which alias to use?) Amazon, please. David
© 2016 - 2024 Red Hat, Inc.