[PATCH] x86: Fix check_ist_exit() assertions

Andrew Cooper posted 1 patch 7 months, 1 week ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20230919103514.1076888-1-andrew.cooper3@citrix.com
xen/arch/x86/include/asm/current.h | 1 +
1 file changed, 1 insertion(+)
[PATCH] x86: Fix check_ist_exit() assertions
Posted by Andrew Cooper 7 months, 1 week ago
The patch adding check_ist_exit() neglected to consider reset_stack_and_jump()
leaving C and entering one of the Xen exit paths.  The value in %r12 is stale,
and depending on compiler decisions may not be 0.

This shows up in Gitlab CI for the Clang build:

  https://gitlab.com/xen-project/people/andyhhp/xen/-/jobs/5112783827

and in OSSTest for GCC 8:

  http://logs.test-lab.xenproject.org/osstest/logs/183045/test-amd64-amd64-xl-qemuu-debianhvm-amd64/serial-pinot0.log

The justification for ensuring ist_exit is accurate in the exit paths still
stands, so zero %r12 in reset_stack_and_jump() to indicate a non-IST exit.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/include/asm/current.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/x86/include/asm/current.h b/xen/arch/x86/include/asm/current.h
index da5e152a10cc..2ce43e275784 100644
--- a/xen/arch/x86/include/asm/current.h
+++ b/xen/arch/x86/include/asm/current.h
@@ -178,6 +178,7 @@ unsigned long get_stack_dump_bottom (unsigned long sp);
             SHADOW_STACK_WORK                                           \
             "mov %[stk], %%rsp;"                                        \
             CHECK_FOR_LIVEPATCH_WORK                                    \
+            "xor %%r12d, %%r12d;" /* non-IST exit */                    \
             instr "[fun]"                                               \
             : [val] "=&r" (tmp),                                        \
               [ssp] "=&r" (tmp)                                         \

base-commit: ea36ac0de27c2a7c847a2a52c3e0f97a45864d81
-- 
2.30.2


Re: [PATCH] x86: Fix check_ist_exit() assertions
Posted by Jan Beulich 7 months, 1 week ago
On 19.09.2023 12:35, Andrew Cooper wrote:
> The patch adding check_ist_exit() neglected to consider reset_stack_and_jump()
> leaving C and entering one of the Xen exit paths.  The value in %r12 is stale,
> and depending on compiler decisions may not be 0.
> 
> This shows up in Gitlab CI for the Clang build:
> 
>   https://gitlab.com/xen-project/people/andyhhp/xen/-/jobs/5112783827
> 
> and in OSSTest for GCC 8:
> 
>   http://logs.test-lab.xenproject.org/osstest/logs/183045/test-amd64-amd64-xl-qemuu-debianhvm-amd64/serial-pinot0.log
> 
> The justification for ensuring ist_exit is accurate in the exit paths still
> stands, so zero %r12 in reset_stack_and_jump() to indicate a non-IST exit.

I did think of this as an option, but I don't think this covers all cases.
If we take #DB while in a PV guest, that'll be an IST entry. Assume further
that we re-schedule before re-entering the guest. Upon the vCPU being
scheduled back in we'll have %r12 clear with an on-stack indication of
having taken an IST guest exit.

Jan
Re: [PATCH] x86: Fix check_ist_exit() assertions
Posted by Andrew Cooper 7 months, 1 week ago
On 19/09/2023 12:46 pm, Jan Beulich wrote:
> On 19.09.2023 12:35, Andrew Cooper wrote:
>> The patch adding check_ist_exit() neglected to consider reset_stack_and_jump()
>> leaving C and entering one of the Xen exit paths.  The value in %r12 is stale,
>> and depending on compiler decisions may not be 0.
>>
>> This shows up in Gitlab CI for the Clang build:
>>
>>   https://gitlab.com/xen-project/people/andyhhp/xen/-/jobs/5112783827
>>
>> and in OSSTest for GCC 8:
>>
>>   http://logs.test-lab.xenproject.org/osstest/logs/183045/test-amd64-amd64-xl-qemuu-debianhvm-amd64/serial-pinot0.log
>>
>> The justification for ensuring ist_exit is accurate in the exit paths still
>> stands, so zero %r12 in reset_stack_and_jump() to indicate a non-IST exit.
> I did think of this as an option, but I don't think this covers all cases.
> If we take #DB while in a PV guest, that'll be an IST entry. Assume further
> that we re-schedule before re-entering the guest. Upon the vCPU being
> scheduled back in we'll have %r12 clear with an on-stack indication of
> having taken an IST guest exit.

This is nasty, and we would easily have that behaviour if e.g. GDBSX was
attached to the vCPU in question.

In that case, technically it's the other scheduled vCPU which is
undergoing the ist_exit, but regs->entry_vector will be different and
invalidate the check.

I guess for now I'll have to relent on checking on the exit-to-guest
paths.  I don't see any other feasible option.

~Andrew