Reply-To: Sean Christopherson
Date: Thu, 14 Aug 2025 17:11:45 -0700
In-Reply-To: <20250815001205.2370711-1-seanjc@google.com>
References: <20250815001205.2370711-1-seanjc@google.com>
Message-ID: <20250815001205.2370711-2-seanjc@google.com>
Subject: [PATCH 6.1.y 01/21] KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI shadow
From: Sean Christopherson
To: stable@vger.kernel.org, Greg Kroah-Hartman, Sasha Levin
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Bonzini

[ Upstream commit be45bc4eff33d9a7dae84a2150f242a91a617402 ]

Enable/disable local IRQs, i.e. set/clear RFLAGS.IF, in the common
svm_vcpu_enter_exit() just after/before guest_state_{enter,exit}_irqoff()
so that VMRUN is not executed in an STI shadow.  AMD CPUs have a quirk
(some would say "bug"), where the STI shadow bleeds into the guest's
intr_state field if a #VMEXIT occurs during injection of an event, i.e.
if the VMRUN doesn't complete before the subsequent #VMEXIT.

The spurious "interrupts masked" state is relatively benign, as it only
occurs during event injection and is transient.  Because KVM is already
injecting an event, the guest can't be in HLT, and if KVM is querying IRQ
blocking for injection, then KVM would need to force an immediate exit
anyways since injecting multiple events is impossible.

However, because KVM copies int_state verbatim from vmcb02 to vmcb12, the
spurious STI shadow is visible to L1 when running a nested VM, which can
trip sanity checks, e.g. in VMware's VMM.

Hoist the STI+CLI all the way to C code, as the aforementioned calls to
guest_state_{enter,exit}_irqoff() already inform lockdep that IRQs are
enabled/disabled, and taking a fault on VMRUN with RFLAGS.IF=1 is already
possible.  I.e. if there's kernel code that is confused by running with
RFLAGS.IF=1, then it's already a problem.  In practice, since GIF=0 also
blocks NMIs, the only change in exposure to non-KVM code (relative to
surrounding VMRUN with STI+CLI) is exception handling code, and except
for the kvm_rebooting=1 case, all exceptions in the core
VM-Enter/VM-Exit path are fatal.

Use the "raw" variants to enable/disable IRQs to avoid tracing in the
"no instrumentation" code; the guest state helpers also take care of
tracing IRQ state.

Opportunistically document why KVM needs to do STI in the first place.

Reported-by: Doug Covelli
Closes: https://lore.kernel.org/all/CADH9ctBs1YPmE4aCfGPNBwA10cA8RuAk2gO7542DjMZgs4uzJQ@mail.gmail.com
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Cc: stable@vger.kernel.org
Reviewed-by: Jim Mattson
Link: https://lore.kernel.org/r/20250224165442.2338294-2-seanjc@google.com
Signed-off-by: Sean Christopherson
[sean: resolve minor syntactic conflict in __svm_sev_es_vcpu_run()]
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/svm/svm.c     | 14 ++++++++++++++
 arch/x86/kvm/svm/vmenter.S |  9 +--------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b6bbd0dc4e65..c95a84afc35f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3982,6 +3982,18 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
 
 	guest_state_enter_irqoff();
 
+	/*
+	 * Set RFLAGS.IF prior to VMRUN, as the host's RFLAGS.IF at the time of
+	 * VMRUN controls whether or not physical IRQs are masked (KVM always
+	 * runs with V_INTR_MASKING_MASK).  Toggle RFLAGS.IF here to avoid the
+	 * temptation to do STI+VMRUN+CLI, as AMD CPUs bleed the STI shadow
+	 * into guest state if delivery of an event during VMRUN triggers a
+	 * #VMEXIT, and the guest_state transitions already tell lockdep that
+	 * IRQs are being enabled/disabled.  Note!  GIF=0 for the entirety of
+	 * this path, so IRQs aren't actually unmasked while running host code.
+	 */
+	raw_local_irq_enable();
+
 	amd_clear_divider();
 
 	if (sev_es_guest(vcpu->kvm))
@@ -3989,6 +4001,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
 	else
 		__svm_vcpu_run(svm, spec_ctrl_intercepted);
 
+	raw_local_irq_disable();
+
 	guest_state_exit_irqoff();
 }
 
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 42824f9b06a2..48b72625cc45 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -170,12 +170,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	VM_CLEAR_CPU_BUFFERS
 
 	/* Enter guest mode */
-	sti
-
 3:	vmrun %_ASM_AX
 4:
-	cli
-
 	/* Pop @svm to RAX while it's the only available register. */
 	pop %_ASM_AX
 
@@ -343,11 +339,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	VM_CLEAR_CPU_BUFFERS
 
 	/* Enter guest mode */
-	sti
-
 1:	vmrun %_ASM_AX
-
-2:	cli
+2:
 
 	/* Pop @svm to RDI, guest registers have been saved already. */
 	pop %_ASM_DI
-- 
2.51.0.rc1.163.g2494970778-goog
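
For reference, here is a sketch of svm_vcpu_enter_exit() as it reads with
the hunks above applied.  It is assembled from the diff context, so treat
it as an illustration of the resulting ordering rather than a verbatim
copy of the 6.1.y source: the "svm" local and the __svm_sev_es_vcpu_run()
call sit between the two svm.c hunks and are assumed here.

static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
					bool spec_ctrl_intercepted)
{
	/* Assumed: not visible in the hunks above. */
	struct vcpu_svm *svm = to_svm(vcpu);

	guest_state_enter_irqoff();

	/*
	 * RFLAGS.IF is now raised in C, after lockdep has already been told
	 * (via guest_state_enter_irqoff()) that IRQs are "on", so VMRUN no
	 * longer executes inside an STI shadow.  GIF=0 across this entire
	 * stretch, so physical IRQs stay blocked for host code.
	 */
	raw_local_irq_enable();

	amd_clear_divider();

	if (sev_es_guest(vcpu->kvm))
		/* Assumed signature, mirroring the non-SEV-ES call below. */
		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
	else
		__svm_vcpu_run(svm, spec_ctrl_intercepted);

	/* Lower RFLAGS.IF again before lockdep is told IRQs are "off". */
	raw_local_irq_disable();

	guest_state_exit_irqoff();
}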