From: Sean Christopherson
Reply-To: Sean Christopherson
Date: Fri, 23 Jan 2026 14:45:11 -0800
Subject: [PATCH v2 1/4] KVM: SVM: Fix clearing IRQ window inhibit with nested guests
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Naveen N Rao
Message-ID: <20260123224514.2509129-2-seanjc@google.com>
In-Reply-To: <20260123224514.2509129-1-seanjc@google.com>
References: <20260123224514.2509129-1-seanjc@google.com>
X-Mailer: git-send-email 2.52.0.457.g6b5491de43-goog

Clearing IRQ window inhibit today relies on interrupt window
interception, but that is not always reachable when nested guests are
involved.

If L1 is intercepting IRQs, then interrupt_window_interception() will
never be reached while L2 is active, because the only reason KVM would
set the V_IRQ intercept in vmcb02 would be on behalf of L1, i.e. because
of vmcb12. svm_clear_vintr() always operates on (at least) vmcb01, and
VMRUN unconditionally sets GIF=1, which means that
enter_svm_guest_mode() will always do svm_clear_vintr() via
svm_set_gif(svm, true). I.e. KVM will keep the VM-wide inhibit set
until control transfers back to L1 *and* an interrupt window is
triggered.

If L1 is not intercepting IRQs, KVM may immediately inject L1's ExtINT
into L2 if IRQs are enabled in L2 without taking an interrupt window
interception.

Address this by clearing the IRQ window inhibit when KVM actually
injects an interrupt and there are no further injectable interrupts.
That way, if L1 isn't intercepting IRQs, KVM will drop the inhibit as
soon as an interrupt is injected into L2. And if L1 is intercepting
IRQs, KVM will keep the inhibit until the IRQ is injected into L2. So,
AVIC won't be left inhibited.

Note, somewhat blindly invoking kvm_clear_apicv_inhibit() is both wrong
and suboptimal. If the IRQWIN inhibit isn't set, then the vCPU will
unnecessarily take apicv_update_lock for write.
And if a _different_ vCPU has an injectable IRQ, clearing IRQWIN may
block that vCPU's ability to inject its IRQ. Defer fixing both issues
to a future commit, as fixing one problem without also fixing the other
would also leave KVM in a temporarily bad state, as would fixing both
issues without fixing _this_ bug. I.e. it's not feasible to fix each
bug independently without there being some remaining flaw in KVM.

Co-developed-by: Naveen N Rao (AMD)
Signed-off-by: Naveen N Rao (AMD)
Tested-by: Naveen N Rao (AMD)
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/svm/svm.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7803d2781144..24b9c2275821 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3130,20 +3130,6 @@ static int interrupt_window_interception(struct kvm_vcpu *vcpu)
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 	svm_clear_vintr(to_svm(vcpu));
 
-	/*
-	 * If not running nested, for AVIC, the only reason to end up here is ExtINTs.
-	 * In this case AVIC was temporarily disabled for
-	 * requesting the IRQ window and we have to re-enable it.
-	 *
-	 * If running nested, still remove the VM wide AVIC inhibit to
-	 * support case in which the interrupt window was requested when the
-	 * vCPU was not running nested.
-
-	 * All vCPUs which run still run nested, will remain to have their
-	 * AVIC still inhibited due to per-cpu AVIC inhibition.
-	 */
-	kvm_clear_apicv_inhibit(vcpu->kvm, APICV_INHIBIT_REASON_IRQWIN);
-
 	++vcpu->stat.irq_window_exits;
 	return 1;
 }
@@ -3732,6 +3718,20 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
 		type = SVM_EVTINJ_TYPE_INTR;
 	}
 
+	/*
+	 * If AVIC was inhibited in order to detect an IRQ window, and there's
+	 * no other injectable interrupts pending or L2 is active (see below),
+	 * then drop the inhibit as the window has served its purpose.
+	 *
+	 * If L2 is active, this path is reachable if L1 is not intercepting
+	 * IRQs, i.e. if KVM is injecting L1 IRQs into L2.  AVIC is locally
+	 * inhibited while L2 is active; drop the VM-wide inhibit to optimize
+	 * the case in which the interrupt window was requested while L1 was
+	 * active (the vCPU was not running nested).
+	 */
+	if (!kvm_cpu_has_injectable_intr(vcpu) || is_guest_mode(vcpu))
+		kvm_clear_apicv_inhibit(vcpu->kvm, APICV_INHIBIT_REASON_IRQWIN);
+
 	trace_kvm_inj_virq(intr->nr, intr->soft, reinjected);
 	++vcpu->stat.irq_injections;
 
-- 
2.52.0.457.g6b5491de43-goog
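
For anyone who wants to sanity-check the condition added to
svm_inject_irq() outside the kernel, here is a standalone userspace
sketch. It is not kernel code: struct toy_vcpu and
should_clear_irqwin_inhibit() are hypothetical names invented for
illustration, and the two booleans merely stand in for the results of
kvm_cpu_has_injectable_intr() and is_guest_mode(). It models only the
boolean decision, not locking or the inhibit bookkeeping.

```c
#include <stdbool.h>

/* Hypothetical model of the two pieces of vCPU state the patch consults. */
struct toy_vcpu {
	bool has_injectable_intr;	/* stand-in for kvm_cpu_has_injectable_intr() */
	bool guest_mode;		/* stand-in for is_guest_mode(), i.e. L2 is active */
};

/*
 * Mirrors the condition in the patch: drop the IRQWIN inhibit once no
 * further interrupt is injectable, or unconditionally if L2 is active.
 */
static bool should_clear_irqwin_inhibit(const struct toy_vcpu *vcpu)
{
	return !vcpu->has_injectable_intr || vcpu->guest_mode;
}
```

The only combination that keeps the inhibit is "L1 active with another
injectable interrupt still pending"; with L2 active the result is true
regardless of pending interrupts, matching the comment's note that AVIC
is already locally inhibited while the vCPU runs nested.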