From: Andrew Cooper
To: Xen-devel
Cc: Andrew Cooper, Jan Beulich, Jan Beulich, Roger Pau Monné
Subject: [PATCH v4 12/14] x86/pv: System call handling in FRED mode
Date: Fri, 27 Feb 2026 23:16:34 +0000
Message-Id: <20260227231636.3955109-13-andrew.cooper3@citrix.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20260227231636.3955109-1-andrew.cooper3@citrix.com>
References: <20260227231636.3955109-1-andrew.cooper3@citrix.com>

Under FRED, entry_from_pv() handles everything, even system call instructions.
This means more of our logic is written in C now, rather than assembly.

In order to facilitate this, introduce pv_inject_callback(), which reuses
struct trap_bounce infrastructure to inject the syscall/sysenter callbacks.
This in turn requires some !PV compatibility for pv_inject_callback() and
pv_hypercall(), which can both be ASSERT_UNREACHABLE().

For each of INT $N, SYSCALL and SYSENTER, FRED gives us interrupted context
which was previously lost. As the guest can't see FRED, Xen has to lose state
in the same way to maintain the prior behaviour.

Signed-off-by: Andrew Cooper
Reviewed-by: Jan Beulich
---
CC: Jan Beulich
CC: Roger Pau Monné

v3:
 * Simplify DCE handling.
 * Add ASSERT_UNREACHABLE() to pv_inject_callback().
 * Adjust comment for X86_ET_SW_INT

v2:
 * New
---
 xen/arch/x86/include/asm/domain.h    |   2 +
 xen/arch/x86/include/asm/hypercall.h |   2 -
 xen/arch/x86/pv/traps.c              |  39 ++++++++
 xen/arch/x86/traps.c                 | 131 +++++++++++++++++++++++++++
 4 files changed, 172 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index 94b0cf7f1d95..ad7f6adb2cb9 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -725,6 +725,8 @@ void arch_vcpu_regs_init(struct vcpu *v);
 struct vcpu_hvm_context;
 int arch_set_info_hvm_guest(struct vcpu *v, const struct vcpu_hvm_context *ctx);
 
+void pv_inject_callback(unsigned int type);
+
 #ifdef CONFIG_PV
 void pv_inject_event(const struct x86_event *event);
 #else
diff --git a/xen/arch/x86/include/asm/hypercall.h b/xen/arch/x86/include/asm/hypercall.h
index bf2f0e169aef..d042a61d1702 100644
--- a/xen/arch/x86/include/asm/hypercall.h
+++ b/xen/arch/x86/include/asm/hypercall.h
@@ -18,9 +18,7 @@
 
 #define __HYPERVISOR_paging_domctl_cont __HYPERVISOR_arch_1
 
-#ifdef CONFIG_PV
 void pv_hypercall(struct cpu_user_regs *regs);
-#endif
 
 void pv_ring1_init_hypercall_page(void *p);
 void pv_ring3_init_hypercall_page(void *p);
diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index b0395b99145a..c863ab9d372a 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -20,6 +20,8 @@
 #include
 #include
 
+#include
+
 void pv_inject_event(const struct x86_event *event)
 {
     struct vcpu *curr = current;
@@ -96,6 +98,43 @@ void pv_inject_event(const struct x86_event *event)
     }
 }
 
+void pv_inject_callback(unsigned int type)
+{
+    struct vcpu *curr = current;
+    struct trap_bounce *tb = &curr->arch.pv.trap_bounce;
+    unsigned long rip;
+    bool irq;
+
+    ASSERT(is_pv_64bit_vcpu(curr));
+
+    switch ( type )
+    {
+    case CALLBACKTYPE_syscall:
+        rip = curr->arch.pv.syscall_callback_eip;
+        irq = curr->arch.pv.vgc_flags & VGCF_syscall_disables_events;
+        break;
+
+    case CALLBACKTYPE_syscall32:
+        rip = curr->arch.pv.syscall32_callback_eip;
+        irq = curr->arch.pv.syscall32_disables_events;
+        break;
+
+    case CALLBACKTYPE_sysenter:
+        rip = curr->arch.pv.sysenter_callback_eip;
+        irq = curr->arch.pv.sysenter_disables_events;
+        break;
+
+    default:
+        ASSERT_UNREACHABLE();
+        rip = 0;
+        irq = false;
+        break;
+    }
+
+    tb->flags = TBF_EXCEPTION | (irq ? TBF_INTERRUPT : 0);
+    tb->eip = rip;
+}
+
 /*
  * Called from asm to set up the MCE trapbounce info.
  * Returns false no callback is set up, else true.
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 2f40f628cbff..e2c35a046e6b 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -51,6 +52,8 @@
 #include
 #include
 
+#include
+
 /*
  * opt_nmi: one of 'ignore', 'dom0', or 'fatal'.
  * fatal: Xen prints diagnostic message and then hangs.
@@ -2267,6 +2270,7 @@ void asmlinkage check_ist_exit(const struct cpu_user_regs *regs, bool ist_exit)
 void asmlinkage entry_from_pv(struct cpu_user_regs *regs)
 {
     struct fred_info *fi = cpu_regs_fred_info(regs);
+    struct vcpu *curr = current;
     uint8_t type = regs->fred_ss.type;
     uint8_t vec = regs->fred_ss.vector;
 
@@ -2309,6 +2313,38 @@ void asmlinkage entry_from_pv(struct cpu_user_regs *regs)
 
     switch ( type )
     {
+    case X86_ET_SW_INT:
+        /*
+         * For better or worse, Xen writes IDT vectors 3 and 4 with DPL3 (so
+         * INT3/INTO work), making INT $3/4 indistinguishable, and the guest
+         * choice of DPL for these vectors is ignored.
+         *
+         * Have them fall through into X86_ET_HW_EXC, as #BP in particular
+         * needs handling by do_int3() in case an external debugger is
+         * attached.
+         *
+         * As the event type is provided, INT $N instructions don't need #GP
+         * tricks to spot, and INT $0x80 doesn't need a fastpath. As the
+         * guest is necessarily PV64, INT $0x82 has no special meaning either.
+         *
+         * When converting to a fault, hardware finally gives us enough
+         * information to account for prefixes, so provide the more correct
+         * behaviour rather than assuming the instruction was two bytes long.
+         */
+        if ( vec != X86_EXC_BP && vec != X86_EXC_OF )
+        {
+            const struct trap_info *ti = &curr->arch.pv.trap_ctxt[vec];
+
+            if ( permit_softint(TI_GET_DPL(ti), curr, regs) )
+                pv_inject_sw_interrupt(vec);
+            else
+            {
+                regs->rip -= regs->fred_ss.insnlen;
+                pv_inject_hw_exception(X86_EXC_GP, (vec << 3) | X86_XEC_IDT);
+            }
+            break;
+        }
+        fallthrough;
     case X86_ET_HW_EXC:
     case X86_ET_PRIV_SW_EXC:
     case X86_ET_SW_EXC:
@@ -2338,6 +2374,101 @@ void asmlinkage entry_from_pv(struct cpu_user_regs *regs)
         }
         break;
 
+    case X86_ET_OTHER:
+        switch ( regs->fred_ss.vector )
+        {
+        case 1: /* SYSCALL */
+        {
+            /*
+             * FRED delivery preserves the interrupted %cs/%ss, but previously
+             * SYSCALL lost the interrupted selectors, and SYSRET forced the
+             * use of the ones in MSR_STAR.
+             *
+             * The guest isn't aware of FRED, so recreate the legacy
+             * behaviour.
+             *
+             * The non-FRED SYSCALL path sets TRAP_syscall in entry_vector to
+             * signal that SYSRET can be used, but this isn't relevant in FRED
+             * mode.
+             *
+             * When setting the selectors, clear all upper metadata again for
+             * backwards compatibility. In particular fred_ss.swint becomes
+             * pend_DB on ERETx, and nothing else in pv_hypercall() would
+             * clean up.
+             *
+             * When converting to a fault, hardware finally gives us enough
+             * information to account for prefixes, so provide the more
+             * correct behaviour rather than assuming the instruction was two
+             * bytes long.
+             */
+            bool l = regs->fred_ss.l;
+            unsigned int len = regs->fred_ss.insnlen;
+
+            regs->ssx = l ? FLAT_KERNEL_SS : FLAT_USER_SS32;
+            regs->csx = l ? FLAT_KERNEL_CS64 : FLAT_USER_CS32;
+
+            if ( guest_kernel_mode(curr, regs) )
+                pv_hypercall(regs);
+            else if ( (l ? curr->arch.pv.syscall_callback_eip
+                         : curr->arch.pv.syscall32_callback_eip) == 0 )
+            {
+                regs->rip -= len;
+                pv_inject_hw_exception(X86_EXC_UD, X86_EVENT_NO_EC);
+            }
+            else
+            {
+                /*
+                 * The PV ABI, given no virtual SYSCALL_MASK, hardcodes that
+                 * DF is cleared. Other flags are handled in the same way as
+                 * interrupts and exceptions in create_bounce_frame().
+                 */
+                regs->eflags &= ~X86_EFLAGS_DF;
+                pv_inject_callback(l ? CALLBACKTYPE_syscall
+                                     : CALLBACKTYPE_syscall32);
+            }
+            break;
+        }
+
+        case 2: /* SYSENTER */
+        {
+            /*
+             * FRED delivery preserves the interrupted state, but previously
+             * SYSENTER discarded almost everything.
+             *
+             * The guest isn't aware of FRED, so recreate the legacy
+             * behaviour.
+             *
+             * When setting the selectors, clear all upper metadata. In
+             * particular fred_ss.swint becomes pend_DB on ERETx.
+             *
+             * When converting to a fault, hardware finally gives us enough
+             * information to account for prefixes, so provide the more
+             * correct behaviour rather than assuming the instruction was two
+             * bytes long.
+             */
+            unsigned int len = regs->fred_ss.insnlen;
+
+            regs->ssx = FLAT_USER_SS;
+            regs->rsp = 0;
+            regs->eflags &= ~(X86_EFLAGS_VM | X86_EFLAGS_IF);
+            regs->csx = 3;
+            regs->rip = 0;
+
+            if ( !curr->arch.pv.sysenter_callback_eip )
+            {
+                regs->rip -= len;
+                pv_inject_hw_exception(X86_EXC_GP, 0);
+            }
+            else
+                pv_inject_callback(CALLBACKTYPE_sysenter);
+            break;
+        }
+
+        default:
+            goto fatal;
+        }
+        break;
+
     default:
         goto fatal;
     }
-- 
2.39.5
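
Illustrative note, not part of the patch: the "convert to a fault" reasoning in the comments above can be modelled outside of Xen. The sketch below is a standalone approximation under stated assumptions. struct model_regs, enum model_action, model_rewind_to_fault() and model_syscall() are hypothetical names invented for this note, and the model assumes (as the patch's "regs->rip -= len" implies) that the reported rip points past the instruction and that insnlen includes any prefixes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few cpu_user_regs fields used here. */
struct model_regs {
    uint64_t rip;     /* assumed: address of the instruction after SYSCALL */
    uint8_t insnlen;  /* assumed: instruction length, prefixes included */
};

/* Hypothetical labels for the three outcomes of the SYSCALL path. */
enum model_action {
    MODEL_HYPERCALL,        /* guest kernel mode: treat as a hypercall */
    MODEL_INJECT_UD,        /* no callback registered: raise #UD */
    MODEL_INJECT_CALLBACK,  /* bounce to the registered callback */
};

/* Rewind rip so an injected fault points at the start of the instruction. */
static inline void model_rewind_to_fault(struct model_regs *regs)
{
    /* x86 instructions are at most 15 bytes long. */
    assert(regs->insnlen <= 15);
    regs->rip -= regs->insnlen;
}

/*
 * Three-way decision modelled on the SYSCALL case above.  Only the fault
 * path rewinds rip; the other two leave it after the instruction.
 */
static inline enum model_action model_syscall(struct model_regs *regs,
                                              bool kernel_mode,
                                              uint64_t callback)
{
    if ( kernel_mode )
        return MODEL_HYPERCALL;

    if ( callback == 0 )
    {
        model_rewind_to_fault(regs);
        return MODEL_INJECT_UD;
    }

    return MODEL_INJECT_CALLBACK;
}
```

With a SYSCALL carrying one redundant prefix (insnlen of 3) the rewind lands on the prefix byte, whereas subtracting a fixed two bytes would leave rip one byte past the start of the instruction.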