Reply-To: Sean Christopherson
Date: Tue, 15 Jul 2025 12:06:38 -0700
X-Mailing-List: linux-kernel@vger.kernel.org
Message-ID: <20250715190638.1899116-1-seanjc@google.com>
Subject: [PATCH] KVM: x86: Don't (re)check L1 intercepts when completing userspace I/O
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jim Mattson

When completing emulation of an instruction that generated a userspace exit
for I/O, don't recheck L1 intercepts as KVM has already finished that phase
of instruction execution, i.e. has already committed to allowing L2 to
perform I/O.  If L1 (or host userspace) modifies the I/O permission bitmaps
during the exit to userspace, KVM will treat the access as being
intercepted despite already having emulated the I/O access.

Pivot on EMULTYPE_NO_DECODE to detect that KVM is completing emulation.
Of the three users of EMULTYPE_NO_DECODE, only complete_emulated_io() (the
intended "recipient") can reach the code in question.  gp_interception()'s
use is mutually exclusive with is_guest_mode(), and
complete_emulated_insn_gp() unconditionally pairs EMULTYPE_NO_DECODE with
EMULTYPE_SKIP.

The bad behavior was detected by a syzkaller program that toggles port I/O
interception during the userspace I/O exit, ultimately resulting in a WARN
on vcpu->arch.pio.count being non-zero due to KVM not completing emulation
of the I/O instruction.

  WARNING: CPU: 23 PID: 1083 at arch/x86/kvm/x86.c:8039 emulator_pio_in_out+0x154/0x170 [kvm]
  Modules linked in: kvm_intel kvm irqbypass
  CPU: 23 UID: 1000 PID: 1083 Comm: repro Not tainted 6.16.0-rc5-c1610d2d66b1-next-vm #74 NONE
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:emulator_pio_in_out+0x154/0x170 [kvm]
  PKRU: 55555554
  Call Trace:
   kvm_fast_pio+0xd6/0x1d0 [kvm]
   vmx_handle_exit+0x149/0x610 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0xda8/0x1ac0 [kvm]
   kvm_vcpu_ioctl+0x244/0x8c0 [kvm]
   __x64_sys_ioctl+0x8a/0xd0
   do_syscall_64+0x5d/0xc60
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: 8a76d7f25f8f ("KVM: x86: Add x86 callback for intercept check")
Cc: stable@vger.kernel.org
Cc: Jim Mattson
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/emulate.c     |  9 ++++-----
 arch/x86/kvm/kvm_emulate.h |  3 +--
 arch/x86/kvm/x86.c         | 15 ++++++++-------
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 1349e278cd2a..542d3664afa3 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5107,12 +5107,11 @@ void init_decode_cache(struct x86_emulate_ctxt *ctxt)
 	ctxt->mem_read.end = 0;
 }
 
-int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
+int x86_emulate_insn(struct x86_emulate_ctxt *ctxt, bool check_intercepts)
 {
 	const struct x86_emulate_ops *ops = ctxt->ops;
 	int rc = X86EMUL_CONTINUE;
 	int saved_dst_type = ctxt->dst.type;
-	bool is_guest_mode = ctxt->ops->is_guest_mode(ctxt);
 
 	ctxt->mem_read.pos = 0;
 
@@ -5160,7 +5159,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
 				fetch_possible_mmx_operand(&ctxt->dst);
 		}
 
-		if (unlikely(is_guest_mode) && ctxt->intercept) {
+		if (unlikely(check_intercepts) && ctxt->intercept) {
 			rc = emulator_check_intercept(ctxt, ctxt->intercept,
 						      X86_ICPT_PRE_EXCEPT);
 			if (rc != X86EMUL_CONTINUE)
@@ -5189,7 +5188,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
 				goto done;
 		}
 
-		if (unlikely(is_guest_mode) && (ctxt->d & Intercept)) {
+		if (unlikely(check_intercepts) && (ctxt->d & Intercept)) {
 			rc = emulator_check_intercept(ctxt, ctxt->intercept,
 						      X86_ICPT_POST_EXCEPT);
 			if (rc != X86EMUL_CONTINUE)
@@ -5243,7 +5242,7 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
 
 special_insn:
 
-	if (unlikely(is_guest_mode) && (ctxt->d & Intercept)) {
+	if (unlikely(check_intercepts) && (ctxt->d & Intercept)) {
 		rc = emulator_check_intercept(ctxt, ctxt->intercept,
 					      X86_ICPT_POST_MEMACCESS);
 		if (rc != X86EMUL_CONTINUE)
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index c1df5acfacaf..7b5ddb787a25 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -235,7 +235,6 @@ struct x86_emulate_ops {
 	void (*set_nmi_mask)(struct x86_emulate_ctxt *ctxt, bool masked);
 
 	bool (*is_smm)(struct x86_emulate_ctxt *ctxt);
-	bool (*is_guest_mode)(struct x86_emulate_ctxt *ctxt);
 	int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
 	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
 	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
@@ -521,7 +520,7 @@ bool x86_page_table_writing_insn(struct x86_emulate_ctxt *ctxt);
 #define EMULATION_RESTART 1
 #define EMULATION_INTERCEPTED 2
 void init_decode_cache(struct x86_emulate_ctxt *ctxt);
-int x86_emulate_insn(struct x86_emulate_ctxt *ctxt);
+int x86_emulate_insn(struct x86_emulate_ctxt *ctxt, bool check_intercepts);
 int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
 			 u16 tss_selector, int idt_index, int reason,
 			 bool has_error_code, u32 error_code);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index de51dbd85a58..44ef3492bfd2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8609,11 +8609,6 @@ static bool emulator_is_smm(struct x86_emulate_ctxt *ctxt)
 	return is_smm(emul_to_vcpu(ctxt));
 }
 
-static bool emulator_is_guest_mode(struct x86_emulate_ctxt *ctxt)
-{
-	return is_guest_mode(emul_to_vcpu(ctxt));
-}
-
 #ifndef CONFIG_KVM_SMM
 static int emulator_leave_smm(struct x86_emulate_ctxt *ctxt)
 {
@@ -8697,7 +8692,6 @@ static const struct x86_emulate_ops emulate_ops = {
 	.guest_cpuid_is_intel_compatible = emulator_guest_cpuid_is_intel_compatible,
 	.set_nmi_mask = emulator_set_nmi_mask,
 	.is_smm = emulator_is_smm,
-	.is_guest_mode = emulator_is_guest_mode,
 	.leave_smm = emulator_leave_smm,
 	.triple_fault = emulator_triple_fault,
 	.set_xcr = emulator_set_xcr,
@@ -9282,7 +9276,14 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		ctxt->exception.address = 0;
 	}
 
-	r = x86_emulate_insn(ctxt);
+	/*
+	 * Check L1's instruction intercepts when emulating instructions for
+	 * L2, unless KVM is re-emulating a previously decoded instruction,
+	 * e.g. to complete userspace I/O, in which case KVM has already
+	 * checked the intercepts.
+	 */
+	r = x86_emulate_insn(ctxt, is_guest_mode(vcpu) &&
+				   !(emulation_type & EMULTYPE_NO_DECODE));
 
 	if (r == EMULATION_INTERCEPTED)
 		return 1;

base-commit: 4578a747f3c7950be3feb93c2db32eb597a3e55b
-- 
2.50.0.727.gbf7dc18ff4-goog