From nobody Tue Apr 15 19:54:28 2025 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6BAB2205AD7 for ; Tue, 1 Apr 2025 20:46:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540411; cv=none; b=ulgBSmt5HAjk5J2ErN8NwzykvNXsyTw06cZ3rHPEZpSK91BQP8cUaiwc8miusHPLpbhuFdkwHsz2VqPnJRlW6CHASkbZQSTaDI4K1ljIsR5yqA0LZUWsqSG/v5vWtQfLECQD/GLeO12C3r9LHc2ujGhgneY6JVI0dq3Gr3kb6PU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540411; c=relaxed/simple; bh=4JC9vDgpuY3Y2FxwMbdW8RzZ4m05ZUKUwd4bktvUCtE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Ny2w2dES6uYNDbgAmGONb+x6Fspww0guivdBbYQ7CL+E+oTKbxgYMIyVaYRYivWl3GSXeeK0dkOOtEJ5WW7Jfn/i25DIVyzoTvrdqiIVL+V8jC3WD8gBYbZgHeQrULaScMPxZeumZzZ1+/dgnkTtNnLcF0vsgpu5DUwlh9o7JTk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=3dX3bz5V; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="3dX3bz5V" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-22403329f9eso106195035ad.3 for ; Tue, 01 Apr 2025 13:46:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1743540409; x=1744145209; 
darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=amBezYkVNcZuyH4EZC510WvPXbGr3Dq+FrBRP4RH3+k=; b=3dX3bz5VEA/pJrMPZVhhcQfCsIDcsN+HdIJ8UgiZXJJL4RfH/6YH8hTV5scUkDsJpW im5O3ChpXTa4T92pbre/rl8eq65TfEcUzXAxKlKYTqfazEOskKdKvAAX4uqQM9iS58y/ kYk2T9wz8nlexg1pwvufUDlNzQIumw0PMQqQxb6qcS+tHYns5a5PbXpAd94dQM/aNKQ6 vo89loV5K2XX5LfnK98nY4czttRQx0VhFa/cHZdQntwi6prRSxweKz23QHa5mwN+yPTF 2bklvKsk+Ik387qktbBUeivNycw7Nl/TuTQGm8z7UBqGxEy+6e+OcbKFYDUX8q0R4x/y mYBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1743540409; x=1744145209; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=amBezYkVNcZuyH4EZC510WvPXbGr3Dq+FrBRP4RH3+k=; b=MfBVIs26a5TbtNvrRmoU92MAEioZEYUfo2UPcFQlP/PSGw2EZ9KedYe4m2e94hwxbl sTfcB6qwO3RfluUppJysBZYFHg3nGR4ACF3aMyikBwpgPTvDfak3/qRfIUZ5MbSdLu5r 9VJq4L+y6GEQttxDO2/u5Clj5L+IrP0mqdCsIkUDx1pwl5nlzy1O2n4s0g7zt2YTx+vu zGsc9kncgTIyr61FWxeZzwv02YekYE5oiNc5hHfx2mcdoziU6iGbMnW9admc9B463AdC p1ZB2Cv9DVJxfc4u+QKzPCq/fWHi62jGz9/rgilUeN0RqRxTvDQgELBqMrAkWjKhiJ4V Gw2g== X-Forwarded-Encrypted: i=1; AJvYcCWqbvg5YtfMe4zFM2T3J3s3NjmjbJi0eeoM4EQR4LC+ghJ7BwuG+F5TlXeatE3kXzBT6lqyAOXCAAU5sYI=@vger.kernel.org X-Gm-Message-State: AOJu0YxXc72BcCIiWQ6DU/UvXPdpN507jn1QnA3R5EoeK6yXOEwMchVc ZalnvseROvdKOZCy1tkCLkLfjVjt5DSL7EK1VE2LxJNeD/U8SCgC5Zf5GConlfQL4NXdG4KUH87 wSg== X-Google-Smtp-Source: AGHT+IEa5pU5FztleJgZ0CTC7h18PjwGLO8G4GBjaSUUoGrBKG+0/CQ77Eky5VEgolWpGWv0RRLt0hcHxXQ= X-Received: from pfblb18.prod.google.com ([2002:a05:6a00:4f12:b0:736:b315:f15e]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:174b:b0:736:4d44:8b77 with SMTP id d2e1a72fcca58-7398035e5a0mr24221681b3a.8.1743540409565; Tue, 01 Apr 2025 13:46:49 -0700 (PDT) Reply-To: Sean Christopherson Date: Tue, 1 Apr 2025 13:44:13 -0700 In-Reply-To: 
<20250401204425.904001-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250401204425.904001-1-seanjc@google.com> X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog Message-ID: <20250401204425.904001-2-seanjc@google.com> Subject: [PATCH 01/12] KVM: Use a local struct to do the initial vfs_poll() on an irqfd From: Sean Christopherson To: Paolo Bonzini , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Marc Zyngier , Oliver Upton , Sean Christopherson , Paul Walmsley , Palmer Dabbelt , Albert Ou Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack , Juergen Gross , Stefano Stabellini , Oleksandr Tyshchenko Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Use a function-local struct for the poll_table passed to vfs_poll(), as nothing in the vfs_poll() callchain grabs a long-term reference to the structure, i.e. its lifetime doesn't need to be tied to the irqfd. Using a local structure will also allow propagating failures out of the polling callback without further polluting kvm_kernel_irqfd. Opportunistically rename irqfd_ptable_queue_proc() to kvm_irqfd_register() to capture what it actually does.
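For readers unfamiliar with the pattern, here is a minimal userspace sketch of a function-local wrapper recovered via container_of() inside a synchronous callback. It is not the kernel code: demo_poll_table, demo_irqfd, demo_irqfd_pt, demo_poll() and demo_assign() are hypothetical stand-ins for poll_table, kvm_kernel_irqfd, kvm_irqfd_pt, vfs_poll() and kvm_irqfd_assign(), and the container_of() macro is a simplified userspace equivalent.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical analogue of poll_table: only a callback slot. */
struct demo_poll_table {
	void (*queue_proc)(struct demo_poll_table *pt);
};

/* Hypothetical long-lived object, standing in for kvm_kernel_irqfd. */
struct demo_irqfd {
	int registered;
};

/* Short-lived wrapper, mirroring the shape of struct kvm_irqfd_pt. */
struct demo_irqfd_pt {
	struct demo_irqfd *irqfd;
	struct demo_poll_table pt;
};

/* Callback: recover the wrapper from the embedded table, then the irqfd. */
static void demo_register(struct demo_poll_table *pt)
{
	struct demo_irqfd_pt *p = container_of(pt, struct demo_irqfd_pt, pt);

	p->irqfd->registered = 1;
}

/* Stand-in for vfs_poll(): calls back synchronously, keeps no reference. */
static unsigned int demo_poll(struct demo_poll_table *pt)
{
	pt->queue_proc(pt);
	return 0;
}

/* The wrapper lives on the stack and is dead once demo_poll() returns. */
static int demo_assign(struct demo_irqfd *irqfd)
{
	struct demo_irqfd_pt irqfd_pt = { .irqfd = irqfd };

	irqfd_pt.pt.queue_proc = demo_register;
	demo_poll(&irqfd_pt.pt);
	return irqfd->registered;
}
```

The stack allocation is safe only because the callee never stashes the pointer past the call, which is the property the commit message relies on for vfs_poll().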
Signed-off-by: Sean Christopherson --- include/linux/kvm_irqfd.h | 1 - virt/kvm/eventfd.c | 26 +++++++++++++++++--------- 2 files changed, 17 insertions(+), 10 deletions(-) diff --git a/include/linux/kvm_irqfd.h b/include/linux/kvm_irqfd.h index 8ad43692e3bb..44fd2a20b09e 100644 --- a/include/linux/kvm_irqfd.h +++ b/include/linux/kvm_irqfd.h @@ -55,7 +55,6 @@ struct kvm_kernel_irqfd { /* Used for setup/shutdown */ struct eventfd_ctx *eventfd; struct list_head list; - poll_table pt; struct work_struct shutdown; struct irq_bypass_consumer consumer; struct irq_bypass_producer *producer; diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c index 249ba5b72e9b..01c6eb4dceb8 100644 --- a/virt/kvm/eventfd.c +++ b/virt/kvm/eventfd.c @@ -245,12 +245,17 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode,= int sync, void *key) return ret; } =20 -static void -irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh, - poll_table *pt) +struct kvm_irqfd_pt { + struct kvm_kernel_irqfd *irqfd; + poll_table pt; +}; + +static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh, + poll_table *pt) { - struct kvm_kernel_irqfd *irqfd =3D - container_of(pt, struct kvm_kernel_irqfd, pt); + struct kvm_irqfd_pt *p =3D container_of(pt, struct kvm_irqfd_pt, pt); + struct kvm_kernel_irqfd *irqfd =3D p->irqfd; + add_wait_queue_priority(wqh, &irqfd->wait); } =20 @@ -305,6 +310,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *arg= s) { struct kvm_kernel_irqfd *irqfd, *tmp; struct eventfd_ctx *eventfd =3D NULL, *resamplefd =3D NULL; + struct kvm_irqfd_pt irqfd_pt; int ret; __poll_t events; int idx; @@ -394,7 +400,6 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *arg= s) * a callback whenever someone signals the underlying eventfd */ init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup); - init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc); =20 spin_lock_irq(&kvm->irqfds.lock); =20 @@ -416,11 +421,14 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd 
*a= rgs) spin_unlock_irq(&kvm->irqfds.lock); =20 /* - * Check if there was an event already pending on the eventfd - * before we registered, and trigger it as if we didn't miss it. + * Register the irqfd with the eventfd by polling on the eventfd. If + * there was en event pending on the eventfd prior to registering, + * manually trigger IRQ injection. */ - events =3D vfs_poll(fd_file(f), &irqfd->pt); + irqfd_pt.irqfd =3D irqfd; + init_poll_funcptr(&irqfd_pt.pt, kvm_irqfd_register); =20 + events =3D vfs_poll(fd_file(f), &irqfd_pt.pt); if (events & EPOLLIN) schedule_work(&irqfd->inject); =20 --=20 2.49.0.504.g3bcea36a83-goog From nobody Tue Apr 15 19:54:28 2025 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C8C4321420C for ; Tue, 1 Apr 2025 20:46:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540413; cv=none; b=nDgVImP9UG9hfM9UJ/65UqjqWh9xJ1Y7sqdgrlhJ0NAFVz2/KXc642CuHkxWY4xuH3RY0akckXLWrTP0leYZmeAJ07Cu2hEjZ1fG1dVqeFwraWeF+tU8oV6iveV/QqHvgncIywEmfn/g+HntVbA27ZbfXqB00YwD6GylWFFfavo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540413; c=relaxed/simple; bh=NEcVsWkrtlEatK+Y5SErb33iB70xl7AybVaIH/Yq9FU=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Mq4UuTKbcHv2/LOtlTpJEyBZ6/Qnz+9IHL8ITas8d7AQFxxDD3V9jOPUonnEKzvobdsY/nVeBbc2EyMWe5FE81nWZVxwnEmyCa8VROJuVLN8I/h2sqKe1k6bVL+AiQDiSL2KNPx2qJbqP24MmAUWTnhdpti9Mcyk1BUab8NEXwI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com 
header.b=QJRmlX0m; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="QJRmlX0m" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-227ed471999so95586995ad.3 for ; Tue, 01 Apr 2025 13:46:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1743540411; x=1744145211; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=j8aM8OQNKmiMYC4fpWY1Tcz4TDYVMXdrg7MtQvLe+/k=; b=QJRmlX0mTrHvzBAE9nceVI/BKw2gf4+m9Ju2Ej37vUwGOO3w0G7DT9WiDAoNQM/ZYU PanPZy7adQotXtjpoPJhz6s57JNS7Lar+KWzOxlgIwGyFmQHfeXglTOnDZlyjZf7k+vC Wd5jJbMniavzjs8OiUFteiqP6r8HTKPjDnnyf0t1ZfGyz6ylGBYbq5hHn10lK18OCl42 xBYOsZYpDicGhS4Cw1KGPZ5qEKoYf2yVuqG9pW3htw8beAHvkKWKuOpAiD/Ym9n6TWwa Tmbi7bGfN9mNsOvgh8U4kPFFD2uc7hDYywN58D8QgzV4C0cBinQT91jabx7BLjYzfRoj zjfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1743540411; x=1744145211; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=j8aM8OQNKmiMYC4fpWY1Tcz4TDYVMXdrg7MtQvLe+/k=; b=A6mqBRDB9RzDvtVyawHq7ijG15XXRTki907ugPDRrwDJI9//n257lOUOY3k+8+rBTR L9MLiMSE0htJQGAsNxE/U8w1rGJKGIpIk/wdslRwOU0Tw6N4x90HrWMBr3Cqe6uC92b5 tMaQQd8f+ev1yfQZe8Y5aSYjilTXZlEbEkVop7otzkB04zDKjPrhUXHCHEa/GD+/vXCf jhg6khEWNJtdpudl4HXgi64/l5Xi/FINGJjh+G+eORV+MfporXaS0QNPY+iFM8t6igdW 3uEfs9bJbUIG147CdsyrRkJdFIOTWUH9bdFGAaZkfRhJQd4IS6DMZdPhwzFwnohjZsGJ wg6w== X-Forwarded-Encrypted: i=1; AJvYcCUTJSTS2uxTAIkFtp3iDv3Alw5JxURN4oygmmPc9A2foNBBxAVcMUuUogP9E4+8BaG9nqjRUmnkX1/IkmY=@vger.kernel.org 
X-Gm-Message-State: AOJu0Yw216CqlmUu/FjJz6yN9oWmnPOrdRbVe6YtYRbw7aIDdLrDz69/ 0gajeBcaWiroGIEjaaToSm3ZJsNz0BiDlgybF+WeZN9qAE6sD0prTF5Xd6AxD/FUW9uqiYJoPHc wVA== X-Google-Smtp-Source: AGHT+IHk4c/1IjO1MwiDzaYInvb39GIpeBXBReykLE1mG0z4QyNEUAZvNPkX1TcPikaWCjInx7JyJFujPu0= X-Received: from pfbcw10.prod.google.com ([2002:a05:6a00:450a:b0:736:3a40:5df5]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:c411:b0:220:f7bb:842 with SMTP id d9443c01a7336-2295c0ed243mr67188835ad.42.1743540411361; Tue, 01 Apr 2025 13:46:51 -0700 (PDT) Reply-To: Sean Christopherson Date: Tue, 1 Apr 2025 13:44:14 -0700 In-Reply-To: <20250401204425.904001-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250401204425.904001-1-seanjc@google.com> X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog Message-ID: <20250401204425.904001-3-seanjc@google.com> Subject: [PATCH 02/12] KVM: Acquire SRCU lock outside of irqfds.lock during assignment From: Sean Christopherson To: Paolo Bonzini , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Marc Zyngier , Oliver Upton , Sean Christopherson , Paul Walmsley , Palmer Dabbelt , Albert Ou Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack , Juergen Gross , Stefano Stabellini , Oleksandr Tyshchenko Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Acquire SRCU outside of irqfds.lock so that the locking is symmetrical, and add a comment explaining why on earth KVM holds SRCU for so long.
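To illustrate what "symmetrical" means here, a rough userspace sketch follows. It models the two locks as plain counters rather than real SRCU or spinlock primitives; demo_locks, outer_lock(), inner_lock() and demo_assign() are hypothetical names, with "outer" playing the role of the SRCU read lock and "inner" the role of irqfds.lock.

```c
#include <assert.h>
#include <errno.h>

/* Counters standing in for the SRCU read lock (outer) and irqfds.lock (inner). */
struct demo_locks {
	int outer_held;
	int inner_held;
};

static void outer_lock(struct demo_locks *l)
{
	assert(!l->inner_held);	/* outer is never taken under inner */
	l->outer_held++;
}

static void outer_unlock(struct demo_locks *l)
{
	assert(!l->inner_held);	/* inner must already be dropped */
	l->outer_held--;
}

static void inner_lock(struct demo_locks *l)
{
	assert(l->outer_held);	/* inner always nests inside outer */
	l->inner_held++;
}

static void inner_unlock(struct demo_locks *l)
{
	l->inner_held--;
}

/* Acquire outer then inner; every exit path unwinds in the reverse order. */
static int demo_assign(struct demo_locks *l, int duplicate)
{
	int ret;

	outer_lock(l);
	inner_lock(l);

	if (duplicate) {
		ret = -EBUSY;
		goto fail_duplicate;
	}

	inner_unlock(l);
	/* ... work that needs only the outer lock would go here ... */
	outer_unlock(l);
	return 0;

fail_duplicate:
	inner_unlock(l);
	outer_unlock(l);
	return ret;
}
```

With the outer lock taken first, the error path no longer needs a special case that drops only the inner lock, which is what the fail_duplicate label in the diff below reflects.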
Signed-off-by: Sean Christopherson --- virt/kvm/eventfd.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c index 01c6eb4dceb8..e47b7b6df94f 100644 --- a/virt/kvm/eventfd.c +++ b/virt/kvm/eventfd.c @@ -401,6 +401,18 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *ar= gs) */ init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup); =20 + /* + * Set the irqfd routing and add it to KVM's list before registering + * the irqfd with the eventfd, so that the routing information is valid + * and stays valid, e.g. if there are GSI routing changes, prior to + * making the irqfd visible, i.e. before it might be signaled. + * + * Note, holding SRCU ensures a stable read of routing information, and + * also prevents irqfd_shutdown() from freeing the irqfd before it's + * fully initialized. + */ + idx =3D srcu_read_lock(&kvm->irq_srcu); + spin_lock_irq(&kvm->irqfds.lock); =20 ret =3D 0; @@ -409,11 +421,9 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *ar= gs) continue; /* This fd is used for another irq already. 
*/ ret =3D -EBUSY; - spin_unlock_irq(&kvm->irqfds.lock); - goto fail; + goto fail_duplicate; } =20 - idx =3D srcu_read_lock(&kvm->irq_srcu); irqfd_update(kvm, irqfd); =20 list_add_tail(&irqfd->list, &kvm->irqfds.items); @@ -449,6 +459,9 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *arg= s) srcu_read_unlock(&kvm->irq_srcu, idx); return 0; =20 +fail_duplicate: + spin_unlock_irq(&kvm->irqfds.lock); + srcu_read_unlock(&kvm->irq_srcu, idx); fail: if (irqfd->resampler) irqfd_resampler_shutdown(irqfd); --=20 2.49.0.504.g3bcea36a83-goog From nobody Tue Apr 15 19:54:28 2025 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A0AF414AD20 for ; Tue, 1 Apr 2025 20:46:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540415; cv=none; b=jsgv1YZMyd92plQZSwcw5e48M2kybLUuzi2g8yI/L0SMDUftJE0F0CbPO+MT9/TihZmj/chpPA1O7KmgSOfITz47x17722rlI/9JxsLH0+vaWh/bxiMBSWobP+qKmqfGHuZCQ5sNjP7CDNEp8a/DiiCXDXaVfckfj+PWLDtSOeE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540415; c=relaxed/simple; bh=n5kNCwE5Iky7W7cF5KbIIZcTecdXBfhV99PPLi3pmUQ=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=pkSnVGKskZhJijsJnV5bXOynDc6hj/osJyA94oRZjYt4sF9FI3KSvJ5phSFwJ1EHrXg6nLuapf+7oGL1PtZKLks3NLL/fhMT9S0gvoQHftRlIt+icoRFPI2njGSk2B2U2yRNk0/8qacwmIdyeiEY9WTl5BfXOKusr6MwbB25WL4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=az1JtS7x; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; 
dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="az1JtS7x" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-30566e34290so1442953a91.3 for ; Tue, 01 Apr 2025 13:46:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1743540413; x=1744145213; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=HjN0v0W9m0IvqgYola2hOx5XLKml5O0udJOiHGE/eik=; b=az1JtS7xhDSs7uBWcG0wzhOyjV5Kt49ItVvyz9xEA1/S+qnR2c2GiK/D97nYUGCFwC qAO84UnBptl5VHR2QfzxZMzhBwLDwe5VPhL+iNv4061C48MeFJivaNwyTZnUfyNSL5m2 axeR+JAxMGZHF6an5cIwctkV4AEfZKO36ad2KVLxJblI/oJLEy2gfaTEfJCZUNqa/+Av Mcy8i+HPusLFxS1PaQqhwXLVvpqj/i+HdbamO7/j8nW8BUH0VaOoGbV2Tpb0y5+jStDx Bw8fb50tLIKwEaO2t7SrvVE//+yPb/eNTj5E87vmAqSdDV1UfrcD3VdtJgz6TnoInGyN fDjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1743540413; x=1744145213; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=HjN0v0W9m0IvqgYola2hOx5XLKml5O0udJOiHGE/eik=; b=S0G3AXglAx/bEKV3wlbYI5VRRD/sUyjmx3Phjwak7KZqt0oVE0P+bBPufZ4L/2KVd3 A1WKfFfkyoxUEeZXx8ZSwHi16LLodzZgfDjh4O4LEEVtgMf1MnzQTViUq6bjbVNhj5OF HZ8pLFUyHQBLUU3xZOlq1ujw2p0NgaTZXi8cGoqDYzTTeFziri/kWC+ePMjm5Tz1iySS NbaFoVje83gWjxDHLvHnL4irt6p1NkUrHEdkHrT+Ks7Tj9pi47KK6gj/gXMAoN8T3lah IseN/N4hH7kchB3wRKMm7CyacRmMPhjKI5TqeFfp090r8QlP4MJab/ZLgkzxir5r1geE caTw== X-Forwarded-Encrypted: i=1; AJvYcCXNmMa4WmKyqgYbnWdHzDNc3G66GD3Esw8v5QrJGgO7ykklmGMF9KH6yIz22JaqS2zFabT/USVpsHWC/mE=@vger.kernel.org X-Gm-Message-State: AOJu0Ywo9PHpiWP8Q8XTv3xrcVdlavGjLNZIuEvU1Pov+BFb32GQX5GS 
rJpKIrwGVEuwayehVbcZayS3bI9QoUi3XyulV7CuW218MSsE0SGEcjsQIkOWYfDvTkKE2590Ki8 1yg== X-Google-Smtp-Source: AGHT+IHcYPdS+ntkX0QFkx6mSIlTj1AO51n1TbjklGmZnx1XnyN+enbKGYXER1TOaq4aVQk11+wCSyWBHf8= X-Received: from pjbsw3.prod.google.com ([2002:a17:90b:2c83:b0:2f9:c349:2f84]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:2dcc:b0:2fa:157e:c790 with SMTP id 98e67ed59e1d1-3056ee181a6mr177513a91.5.1743540413076; Tue, 01 Apr 2025 13:46:53 -0700 (PDT) Reply-To: Sean Christopherson Date: Tue, 1 Apr 2025 13:44:15 -0700 In-Reply-To: <20250401204425.904001-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250401204425.904001-1-seanjc@google.com> X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog Message-ID: <20250401204425.904001-4-seanjc@google.com> Subject: [PATCH 03/12] KVM: Initialize irqfd waitqueue callback when adding to the queue From: Sean Christopherson To: Paolo Bonzini , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Marc Zyngier , Oliver Upton , Sean Christopherson , Paul Walmsley , Palmer Dabbelt , Albert Ou Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack , Juergen Gross , Stefano Stabellini , Oleksandr Tyshchenko Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Initialize the irqfd waitqueue callback immediately prior to inserting the irqfd into the eventfd's waitqueue. Pre-initializing the state in a completely different context is all kinds of confusing, and incorrectly suggests that the waitqueue function needs to be initialized prior to vfs_poll().
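As a rough illustration of "initialize where you enqueue", the userspace sketch below sets a wait entry's wake-up function in the same helper that links it into the queue. The types and helpers (demo_wait_entry, demo_wait_head, demo_register(), demo_signal()) are hypothetical and far simpler than the kernel's wait_queue_entry_t, init_waitqueue_func_entry() and add_wait_queue_priority().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified wait queue entry and head. */
struct demo_wait_entry {
	int (*func)(struct demo_wait_entry *we);
	struct demo_wait_entry *next;
	int wakeups;
};

struct demo_wait_head {
	struct demo_wait_entry *first;
};

/* Custom wake-up handler, standing in for irqfd_wakeup(). */
static int demo_wakeup(struct demo_wait_entry *we)
{
	we->wakeups++;
	return 0;
}

/*
 * Set the callback immediately before linking the entry into the queue, so
 * the two steps that make the entry "live" sit next to each other.
 */
static void demo_register(struct demo_wait_head *wqh, struct demo_wait_entry *we)
{
	we->func = demo_wakeup;
	we->next = wqh->first;
	wqh->first = we;
}

/* Signal the queue: invoke every registered entry's callback once. */
static void demo_signal(struct demo_wait_head *wqh)
{
	struct demo_wait_entry *we;

	for (we = wqh->first; we; we = we->next)
		we->func(we);
}
```

Nothing reads the callback before the entry is on the queue, so setting it any earlier buys nothing and only obscures when the entry becomes reachable.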
Signed-off-by: Sean Christopherson --- virt/kvm/eventfd.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c index e47b7b6df94f..69bf2881635e 100644 --- a/virt/kvm/eventfd.c +++ b/virt/kvm/eventfd.c @@ -256,6 +256,13 @@ static void kvm_irqfd_register(struct file *file, wait= _queue_head_t *wqh, struct kvm_irqfd_pt *p =3D container_of(pt, struct kvm_irqfd_pt, pt); struct kvm_kernel_irqfd *irqfd =3D p->irqfd; =20 + /* + * Add the irqfd as a priority waiter on the eventfd, with a custom + * wake-up handler, so that KVM *and only KVM* is notified whenever the + * underlying eventfd is signaled. + */ + init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup); + add_wait_queue_priority(wqh, &irqfd->wait); } =20 @@ -395,12 +402,6 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *ar= gs) mutex_unlock(&kvm->irqfds.resampler_lock); } =20 - /* - * Install our own custom wake-up handling so we are notified via - * a callback whenever someone signals the underlying eventfd - */ - init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup); - /* * Set the irqfd routing and add it to KVM's list before registering * the irqfd with the eventfd, so that the routing information is valid --=20 2.49.0.504.g3bcea36a83-goog From nobody Tue Apr 15 19:54:28 2025 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7E89C21A438 for ; Tue, 1 Apr 2025 20:46:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540417; cv=none; b=C/tqc0Tsvi/9n8Oe2Z5R9i+EskQS+e2NHgcVPrVeP9ibneEvv39ZeNNwZu6CQ+xqYYC89kFf9HRGXJgtQrgPzqqgGZ5cOp2eQj7dcpIJpHITmhKQGLC53QoRxTwELRwUH1c/ikV//R3RWpYG3eQpt+yVQekNkfd3JzW3BQJoJFU= ARC-Message-Signature: 
i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540417; c=relaxed/simple; bh=nil5itzKYVmVDoYv7gEuftrnT2Lqo+z+CJed2C4O7A4=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=omsFfmC5aedynh7wIA84OjH1TdJZCDFmR5hpB4It/njA1Tux46kaCW8RfBtu0DF90g/fnExqrCrGBgmLiiUd/1iB4fK6gZmX/aEamfriSg66qQQQc/8nAN6cvdLnJK8fSducFwy1M879r0EIpSSv1mSfy4ku6t0wf3aA+wIyP+Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=kKvTTwLS; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="kKvTTwLS" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-227a8cdd272so99650325ad.2 for ; Tue, 01 Apr 2025 13:46:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1743540415; x=1744145215; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=Lx4LCcdlopyvqbPN/IhBMgG93mXqJFxiV3H6RMakbuM=; b=kKvTTwLSi/UMGKuRPpo3BHTT1rLPJNOskAIYbfPiENj/bc3RmVyYhBpGhQMajZbdlL zNUECKlXlZJbwaKK/xLNs2Mr/9qta5UllrlSKC1e9+APZ3FZ0O/IVsMq2dzdOM3+XC+G bNFdX/JYL5yzN+gV9Vw0sCbiHk+FonYj7+a664nEbqiAOUifkw6HZyGY7W20bcUqiMXK CpzgtoPHZYDx3l8+yVMcUvT2YFLr2ncwlcs6JJKHU9Eiyi9ytGdxS8CrCbaJjdSsMHTf dvuCLRWE/5yom4sibM+vOGNbzcZx+WTZ1iJCDML4k6ta+j3b3fTW43bfG+O84i4rgEcr mayA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1743540415; x=1744145215; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Lx4LCcdlopyvqbPN/IhBMgG93mXqJFxiV3H6RMakbuM=; b=npPKzW7SWOsTurIvmOR48sZzAk9Lq2iOl2VGR3sHUp0mbeWO+x28rZRitIJSZElLVx j/Pk8W+hwDihVrEh5YLihp5WDpiekNwKBWToW16GG5Ppa05WnEvyEMU4lfyWVO36WfFj uME9IODuuYeWs/jqxkYnVViX1bzdpX8sLJVBzdk1e2mTER1qfmQ35uMPL7gss4xzh/tf 1q9aNNTOt2Iq9oUi00oZou8aMUxuXJ6LERQykd9uQDaRAQu6ckYhmQkH+a6chBiMl+rK 90nZ+VeGLWhOOznXzMDW8hEES02ry3sN8F2WGed+yte7kntBvBE4y2Xfq5MxASUVbOvs 37JA== X-Forwarded-Encrypted: i=1; AJvYcCUzowfTUEmVypwuz5ULNQa4MgIOP+aKVtkLG3yOJ5nt2cTz60/yyJV6fJhH1FUxhqkQqFEYfYWjPAM8QGM=@vger.kernel.org X-Gm-Message-State: AOJu0YyEqbmyDc4qq08dCshjgoj6vOBbFINKDKPYBNjsYijn79+Ea1xO PM+XC6pNZgNQYJIgoqzv4mvL7t8uKXjO+Y04jRmWQ3Y24cyBLH559YJ5YAa15gcJoHae+Xf+Tvr veg== X-Google-Smtp-Source: AGHT+IHW0sXHX4wMGJ15kvjoFqaPE9awBB7bIk66pU6GUBUvNZBfoLxZ3yipmWjxsb6XuZDLmoik4bMgDVc= X-Received: from pjbsi11.prod.google.com ([2002:a17:90b:528b:b0:301:1bf5:2efc]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:2bcc:b0:224:912:153 with SMTP id d9443c01a7336-2292f942acbmr235734395ad.5.1743540414753; Tue, 01 Apr 2025 13:46:54 -0700 (PDT) Reply-To: Sean Christopherson Date: Tue, 1 Apr 2025 13:44:16 -0700 In-Reply-To: <20250401204425.904001-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250401204425.904001-1-seanjc@google.com> X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog Message-ID: <20250401204425.904001-5-seanjc@google.com> Subject: [PATCH 04/12] KVM: Add irqfd to KVM's list via the vfs_poll() callback From: Sean Christopherson To: Paolo Bonzini , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Marc Zyngier , Oliver Upton , Sean Christopherson , Paul Walmsley , Palmer Dabbelt , Albert Ou Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, 
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack , Juergen Gross , Stefano Stabellini , Oleksandr Tyshchenko Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add the irqfd structure to KVM's list of irqfds in kvm_irqfd_register(), i.e. via the vfs_poll() callback. This will allow taking irqfds.lock across the entire registration sequence (add to waitqueue, add to list), and more importantly will allow inserting into KVM's list if and only if adding to the waitqueue succeeds (spoiler alert), without needing to juggle return codes in weird ways. Signed-off-by: Sean Christopherson --- virt/kvm/eventfd.c | 102 +++++++++++++++++++++++++-------------------- 1 file changed, 57 insertions(+), 45 deletions(-) diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c index 69bf2881635e..01ae5835c8ba 100644 --- a/virt/kvm/eventfd.c +++ b/virt/kvm/eventfd.c @@ -245,34 +245,14 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode,= int sync, void *key) return ret; } =20 -struct kvm_irqfd_pt { - struct kvm_kernel_irqfd *irqfd; - poll_table pt; -}; - -static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh, - poll_table *pt) -{ - struct kvm_irqfd_pt *p =3D container_of(pt, struct kvm_irqfd_pt, pt); - struct kvm_kernel_irqfd *irqfd =3D p->irqfd; - - /* - * Add the irqfd as a priority waiter on the eventfd, with a custom - * wake-up handler, so that KVM *and only KVM* is notified whenever the - * underlying eventfd is signaled. 
- */ - init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup); - - add_wait_queue_priority(wqh, &irqfd->wait); -} - -/* Must be called under irqfds.lock */ static void irqfd_update(struct kvm *kvm, struct kvm_kernel_irqfd *irqfd) { struct kvm_kernel_irq_routing_entry *e; struct kvm_kernel_irq_routing_entry entries[KVM_NR_IRQCHIPS]; int n_entries; =20 + lockdep_assert_held(&kvm->irqfds.lock); + n_entries =3D kvm_irq_map_gsi(kvm, entries, irqfd->gsi); =20 write_seqcount_begin(&irqfd->irq_entry_sc); @@ -286,6 +266,49 @@ static void irqfd_update(struct kvm *kvm, struct kvm_k= ernel_irqfd *irqfd) write_seqcount_end(&irqfd->irq_entry_sc); } =20 +struct kvm_irqfd_pt { + struct kvm_kernel_irqfd *irqfd; + struct kvm *kvm; + poll_table pt; + int ret; +}; + +static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh, + poll_table *pt) +{ + struct kvm_irqfd_pt *p =3D container_of(pt, struct kvm_irqfd_pt, pt); + struct kvm_kernel_irqfd *irqfd =3D p->irqfd; + struct kvm_kernel_irqfd *tmp; + struct kvm *kvm =3D p->kvm; + + spin_lock_irq(&kvm->irqfds.lock); + + list_for_each_entry(tmp, &kvm->irqfds.items, list) { + if (irqfd->eventfd !=3D tmp->eventfd) + continue; + /* This fd is used for another irq already. */ + p->ret =3D -EBUSY; + spin_unlock_irq(&kvm->irqfds.lock); + return; + } + + irqfd_update(kvm, irqfd); + + list_add_tail(&irqfd->list, &kvm->irqfds.items); + + spin_unlock_irq(&kvm->irqfds.lock); + + /* + * Add the irqfd as a priority waiter on the eventfd, with a custom + * wake-up handler, so that KVM *and only KVM* is notified whenever the + * underlying eventfd is signaled. 
+ */ + init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup); + + add_wait_queue_priority(wqh, &irqfd->wait); + p->ret =3D 0; +} + #ifdef CONFIG_HAVE_KVM_IRQ_BYPASS void __attribute__((weak)) kvm_arch_irq_bypass_stop( struct irq_bypass_consumer *cons) @@ -315,7 +338,7 @@ bool __attribute__((weak)) kvm_arch_irqfd_route_changed( static int kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args) { - struct kvm_kernel_irqfd *irqfd, *tmp; + struct kvm_kernel_irqfd *irqfd; struct eventfd_ctx *eventfd =3D NULL, *resamplefd =3D NULL; struct kvm_irqfd_pt irqfd_pt; int ret; @@ -414,32 +437,22 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *a= rgs) */ idx =3D srcu_read_lock(&kvm->irq_srcu); =20 - spin_lock_irq(&kvm->irqfds.lock); - - ret =3D 0; - list_for_each_entry(tmp, &kvm->irqfds.items, list) { - if (irqfd->eventfd !=3D tmp->eventfd) - continue; - /* This fd is used for another irq already. */ - ret =3D -EBUSY; - goto fail_duplicate; - } - - irqfd_update(kvm, irqfd); - - list_add_tail(&irqfd->list, &kvm->irqfds.items); - - spin_unlock_irq(&kvm->irqfds.lock); - /* - * Register the irqfd with the eventfd by polling on the eventfd. If - * there was en event pending on the eventfd prior to registering, - * manually trigger IRQ injection. + * Register the irqfd with the eventfd by polling on the eventfd, and + * simultaneously add the irqfd to KVM's list. If there was an event + * pending on the eventfd prior to registering, manually trigger IRQ + * injection.
*/ irqfd_pt.irqfd =3D irqfd; + irqfd_pt.kvm =3D kvm; init_poll_funcptr(&irqfd_pt.pt, kvm_irqfd_register); =20 events =3D vfs_poll(fd_file(f), &irqfd_pt.pt); + + ret =3D irqfd_pt.ret; + if (ret) + goto fail_poll; + if (events & EPOLLIN) schedule_work(&irqfd->inject); =20 @@ -460,8 +473,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *arg= s) srcu_read_unlock(&kvm->irq_srcu, idx); return 0; =20 -fail_duplicate: - spin_unlock_irq(&kvm->irqfds.lock); +fail_poll: srcu_read_unlock(&kvm->irq_srcu, idx); fail: if (irqfd->resampler) --=20 2.49.0.504.g3bcea36a83-goog From nobody Tue Apr 15 19:54:28 2025 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2B8D221C167 for ; Tue, 1 Apr 2025 20:46:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540418; cv=none; b=L9OQZ30MF3SDjJICHCYOq0bJwzecxtJR4aUi4lIypOsfLMg859pe9n9MLaJPKmMya+/5dZhWLzAhqRl0XIqC8L0ZHHN9ZLm5745AfKONjOHqWkjgRxu4r0F3xa8CqmynQPhMK8egSrjmgK4SPLrLoV205U+L4xHrha4qWA1sskk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540418; c=relaxed/simple; bh=SQCk7dl/an2O85TPN3SHobLIgSZpdPBWw0r/t+ZuWrA=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=aGxfOuXeyjzoWUcKzG/y2uNzZKJIiC0gT5n0CJM7BhSiaduwpkuzev9oAFdR/EWx7+NXkvYuUi8JG+v+y0tO6kFH9C0MXm9ejIdwdID2u1nQsDq2j7mkambC9vwB4Dtek1TBhfTNvoQt1lJrPug9/uG+7SxtdSmsqzB4NFlhnCo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=4np9ppca; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="4np9ppca" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-3032f4eca83so10215664a91.3 for ; Tue, 01 Apr 2025 13:46:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1743540416; x=1744145216; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=y1emrfR/t8xEof5u39JxRR5LEtGKo3o+GNsYerS4PAw=; b=4np9ppcaVZEu1w4o0C4C19sw5pZryiZMApBAyI/XqXcmtmxNNsAOrNZXJXJz1y1WgO Q91kiydg+GeWmIDE1Wr49nKb8GhUjiTtYcw97rcYtQ5T96xmQoSBtqeRXkO8A1SRN4po SOHEx70SJpfFPHRePmRB975JExMnf8XMq5MDZ0r7YbEDHVW7DGrbFlEnhwE9TVb3FhqK lGBbYAszhJWYuTqf5X7Gr3hZYsRJbKrJ9nIqR7jTk+utmWi9645OWgfF+mKKyBBVd0GR 5eBNqstbgwU5mQQFm/k6ntuVlP6Q3EKaUlZWQ/7z2//9PjKly2SjtsB75Z6kijZSdFMs pUpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1743540416; x=1744145216; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=y1emrfR/t8xEof5u39JxRR5LEtGKo3o+GNsYerS4PAw=; b=Y21dlWZ/ifd3QOdKp8SE0sov1ZzowrTGmqpAsOfe94MxlpWLbDavplL1bFpaiag3HY z6CXF7JYZ7rSx8CzjLTivH1MCOLhOOjOaTZ3IvfHg2OQdLxdrvUXfeJ+Z0Uf/6tBJRNz qvQhFyxycI033CLlHsrZ0cLxXaZp8nOUSnGXrTwEWkz0nq2/ewqPActKudpFKJgdpYOQ DbFYL98HhfWi/YFvy2xrnSl8JZCoXmjiZDrgQ61JF/M765v5BImGxG7oJ3npmQ1Eqgsu KeENjoTNcqD3QwgxJADl9WrF16IOf8/BSuSeVkoc7/50bthhQ0KDYfcBI6RNxju5dRef w3Yw== X-Forwarded-Encrypted: i=1; AJvYcCU+lsEdntgq+pvi4RzBiKrVnyI2zh/XGkhpRoTsFSlk2A/7M5sKbtr7AQWwt7R2kKSSEXVDguMozKKUKGU=@vger.kernel.org X-Gm-Message-State: AOJu0YyFwQJ4SLhWjKJMk+8qwwwbVY/HNHmSexhlgi8AcaXF0C758T14 
Qlx+9RNiEK3m979ZWEct3zU+CkQ3ANMbM6awG91iKBb9g70FZCza9jhfp7ZVNOHPTWkuw0F3BNM 7UQ==
X-Google-Smtp-Source: AGHT+IGO2fBqWvuDL6Vwrbo2kzqtang4rRNKtiORi8KQTXOklAN9xTuLpqS5Vs1VfeAZVw4/UWdyo4ySkHQ=
X-Received: from pjbsw7.prod.google.com ([2002:a17:90b:2c87:b0:2fe:800f:23a]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:38ce:b0:2fe:ba7f:8032 with SMTP id 98e67ed59e1d1-30531f948c8mr21050786a91.9.1743540416529; Tue, 01 Apr 2025 13:46:56 -0700 (PDT)
Reply-To: Sean Christopherson
Date: Tue, 1 Apr 2025 13:44:17 -0700
In-Reply-To: <20250401204425.904001-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20250401204425.904001-1-seanjc@google.com>
X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog
Message-ID: <20250401204425.904001-6-seanjc@google.com>
Subject: [PATCH 05/12] KVM: Add irqfd to eventfd's waitqueue while holding irqfds.lock
From: Sean Christopherson
To: Paolo Bonzini, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Marc Zyngier, Oliver Upton, Sean Christopherson, Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add an irqfd to its target eventfd's waitqueue while holding irqfds.lock,
which is mildly terrifying but functionally safe.  irqfds.lock is taken
inside the waitqueue's lock, but if and only if the eventfd is being
released, i.e. that path is mutually exclusive with registration as KVM
holds a reference to the eventfd (and obviously must do so to avoid UAF).

This will allow using the eventfd's waitqueue to enforce KVM's requirement
that an eventfd is assigned to at most one irqfd, without introducing races.

Signed-off-by: Sean Christopherson
---
 virt/kvm/eventfd.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 01ae5835c8ba..a33c10bd042a 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -204,6 +204,11 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
 	int ret = 0;
 
 	if (flags & EPOLLIN) {
+		/*
+		 * WARNING: Do NOT take irqfds.lock in any path except EPOLLHUP,
+		 * as KVM holds irqfds.lock when registering the irqfd with the
+		 * eventfd.
+		 */
 		u64 cnt;
 		eventfd_ctx_do_read(irqfd->eventfd, &cnt);
 
@@ -225,6 +230,11 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
 		/* The eventfd is closing, detach from KVM */
 		unsigned long iflags;
 
+		/*
+		 * Taking irqfds.lock is safe here, as KVM holds a reference to
+		 * the eventfd when registering the irqfd, i.e. this path can't
+		 * be reached while kvm_irqfd_add() is running.
+		 */
 		spin_lock_irqsave(&kvm->irqfds.lock, iflags);
 
 		/*
@@ -296,16 +306,21 @@ static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh,
 
 	list_add_tail(&irqfd->list, &kvm->irqfds.items);
 
-	spin_unlock_irq(&kvm->irqfds.lock);
-
 	/*
 	 * Add the irqfd as a priority waiter on the eventfd, with a custom
 	 * wake-up handler, so that KVM *and only KVM* is notified whenever the
-	 * underlying eventfd is signaled.
+	 * underlying eventfd is signaled.  Temporarily lie to lockdep about
+	 * holding irqfds.lock to avoid a false positive regarding potential
+	 * deadlock with irqfd_wakeup() (see irqfd_wakeup() for details).
 	 */
 	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
 
+	spin_release(&kvm->irqfds.lock.dep_map, _RET_IP_);
 	add_wait_queue_priority(wqh, &irqfd->wait);
+	spin_acquire(&kvm->irqfds.lock.dep_map, 0, 0, _RET_IP_);
+
+	spin_unlock_irq(&kvm->irqfds.lock);
+
 	p->ret = 0;
 }
 
-- 
2.49.0.504.g3bcea36a83-goog

From nobody Tue Apr 15 19:54:28 2025
Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E863D21CC79 for ; Tue, 1 Apr 2025 20:46:58 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540420; cv=none; b=TnukqSovK2kIcfkgzgnUZegndKS25wc86OtY17lHnab4JJ+sCcW70t8qtrSIctjUQY0EpHQpZ+Ci9KUhBgGJofwFafPTl3hQjskyXWmtUpWotuJ7bSeIIKU9aV675b/av7DYQMPUNx9CLwhR3kpn5remvH6cKGXfVD7v0FpqZrE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540420; c=relaxed/simple; bh=9Y4jQuNr81blf0kL+x4SZ577lsTpKSVZZgUjgP8+H94=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=E1IILmiQxPhyWep/icPUkMch8kESeMW0CG9YBRybMkpdpxW8RY93RsfNadJxN0okyFIt6uH/weXord9wK6sswTwlottt0DEeztvPAemFWoCkwKHdhAdUVSbS3Ypi391FOWPE7zY/haRqwLfwjp9zSGX6p4HBaBFY2udtUvqQp1c=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=DyouEwtd; arc=none smtp.client-ip=209.85.216.74
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit
key) header.d=google.com header.i=@google.com header.b="DyouEwtd" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-3032ea03448so11185575a91.2 for ; Tue, 01 Apr 2025 13:46:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1743540418; x=1744145218; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=XAZ6HcZyPVHZNs/NiQeU2YPtsx7wIdUpIqdNXxdK3gA=; b=DyouEwtdq9yHvK0FwVUiAsw9Dozv6SbcJPwqoqEuYa1kyrPC5COKuksUahOSqDZM7N wm2c6flS7j8aczykkShGZSGuR/Bea3iEqkd54eQnJQQP0uQ1tXuRZGViFsrrUJ1pdQI8 d4rYi4e5YbiRBLEdlL4nZ0hiGLfKwQ9tZqVleXm66N7rLnSMXFDhJ75vRHIp4SNSFx02 Xgk3GwQ6VmF/YsBJ7vf1zG7prF1MT7wweHeg5ic9+bJvUjgE3lbD+6ME+bEohWKHufKe ennvaLaPtuSgfuM785vTewTt9qwneaKZwIbvo6djhNCWYVNIHHsx2HTvFun0IHfPwTcY 5iCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1743540418; x=1744145218; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=XAZ6HcZyPVHZNs/NiQeU2YPtsx7wIdUpIqdNXxdK3gA=; b=JKL6PDN7oXDLCmTbrmvhGuVkpqBJzd6bDnD3KXthvf1NgoLnhL+hES4gYoBybpV0g/ 5HPLMldGM9BJ9KYCZc4+24e3VjH/spCfI9X+AKqeDxG+ZTLOXzmhvonLA5IuNp9ZX2MV SNJmC6G3SbTVsi6kH4mi9hEK4AqXVXPSmDCrawj3/5Bk8ChXvxIjMv0jMK4+FHaO7PYC MjWnwdHB9w9oGDLfenwlK+BswO4XwlHeE+sYJtcD6S8/2DluqYSRn7sn5KvRBN3qq5Jn rTGdpPxMBgwt4+uVahLu1wrIF9Uf8aHTpiSynnbWtDnoCP+aRp7ErbN95Mrzd/B9btiT JpNQ== X-Forwarded-Encrypted: i=1; AJvYcCWqp4FQYbYR8G/HctAb4fVB1qyOfpA9RyPopl9CHOaiejd8JHLfEQOdYcEpwr8JPSmy70juy6RlZhNu+8g=@vger.kernel.org X-Gm-Message-State: AOJu0Ywjn7yiMoV8iAzk2jhW3mn3vjwelvI8vPlhahP55L6qoWKENVbz XxuxflQH7x1CCiWAEFYu2rl/LthZwPRqm11gBWQZSE+2lCPYdwG/mE/eRqIrkJKfFYYTnv8zVt2 kvg== X-Google-Smtp-Source: AGHT+IFdQSCduwNTHLV/nec9/i/BvW+thtXD97niBCIJ/k+4mEqw+xFFVLw5YdqLnoDpAq+BRV3lhigqiIM= X-Received: from pjk16.prod.google.com 
([2002:a17:90b:5590:b0:2ea:5084:5297]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:574b:b0:2ee:9b09:7d3d with SMTP id 98e67ed59e1d1-305320af3e0mr18907665a91.19.1743540418245; Tue, 01 Apr 2025 13:46:58 -0700 (PDT)
Reply-To: Sean Christopherson
Date: Tue, 1 Apr 2025 13:44:18 -0700
In-Reply-To: <20250401204425.904001-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20250401204425.904001-1-seanjc@google.com>
X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog
Message-ID: <20250401204425.904001-7-seanjc@google.com>
Subject: [PATCH 06/12] sched/wait: Add a waitqueue helper for fully exclusive priority waiters
From: Sean Christopherson
To: Paolo Bonzini, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Marc Zyngier, Oliver Upton, Sean Christopherson, Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add a waitqueue helper to add a priority waiter that requires exclusive
wakeups, i.e. that requires that it be the _only_ priority waiter.  The
API will be used by KVM to ensure that at most one of KVM's irqfds is
bound to a single eventfd (across the entire kernel).

Open code the helper instead of using __add_wait_queue() so that the
common path doesn't need to "handle" impossible failures.

Note, the priority_exclusive() name is obviously confusing as the plain
priority() API also sets WQ_FLAG_EXCLUSIVE.
This will be remedied once KVM switches to
add_wait_queue_priority_exclusive(), as the only other user of
add_wait_queue_priority(), Xen's privcmd, doesn't actually operate in
exclusive mode (more than likely, the detail was overlooked when privcmd
copy-pasted (sorry, "was inspired by") KVM's implementation).

Signed-off-by: Sean Christopherson
---
 include/linux/wait.h |  2 ++
 kernel/sched/wait.c  | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 6d90ad974408..5fe082c9e52b 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -164,6 +164,8 @@ static inline bool wq_has_sleeper(struct wait_queue_head *wq_head)
 extern void add_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry);
 extern void add_wait_queue_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry);
 extern void add_wait_queue_priority(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry);
+extern int add_wait_queue_priority_exclusive(struct wait_queue_head *wq_head,
+					     struct wait_queue_entry *wq_entry);
 extern void remove_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry);
 
 static inline void __add_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 51e38f5f4701..80d90d1dc24d 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -47,6 +47,26 @@ void add_wait_queue_priority(struct wait_queue_head *wq_head, struct wait_queue_
 }
 EXPORT_SYMBOL_GPL(add_wait_queue_priority);
 
+int add_wait_queue_priority_exclusive(struct wait_queue_head *wq_head,
+				      struct wait_queue_entry *wq_entry)
+{
+	struct list_head *head = &wq_head->head;
+	unsigned long flags;
+	int r = 0;
+
+	wq_entry->flags |= WQ_FLAG_EXCLUSIVE | WQ_FLAG_PRIORITY;
+	spin_lock_irqsave(&wq_head->lock, flags);
+	if (!list_empty(head) &&
+	    (list_first_entry(head, typeof(*wq_entry), entry)->flags & WQ_FLAG_PRIORITY))
+		r = -EBUSY;
+	else
+		list_add(&wq_entry->entry, head);
+	spin_unlock_irqrestore(&wq_head->lock, flags);
+
+	return r;
+}
+EXPORT_SYMBOL(add_wait_queue_priority_exclusive);
+
 void remove_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
 {
 	unsigned long flags;
-- 
2.49.0.504.g3bcea36a83-goog

From nobody Tue Apr 15 19:54:28 2025
Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5164321D5A7 for ; Tue, 1 Apr 2025 20:47:00 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540421; cv=none; b=IxNbHWl6pVa4EprGTq3daE06FrooFL9hNPHxPhMo8C6tbAM9qIRRxnQ54K7GcNgkvIrtZ1BMGqEdoY5Z5imqRebC9FDA1R/8I1SJ1h1qLD628Qbvk8VIFeMIyOIGcZeWgYMAT5GrJj5VVEW6JXMGxDVHwXVSnXEHZ0PrSTmptdg=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540421; c=relaxed/simple; bh=ADxRVRx091FTCCf4bvhC0j46BfdY/59y0ru1KU1eFuo=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=ASsG7QZfIxw+e8vcuLNsB85KrYzPdcDyjbzNMVSHx2C+4LkHWqOCvJ04gGPLqUP0Bpl11Rvw/boXknVGeGO3LEUWrTa4W1pBTvwJeBKnAJS205oP7AywTNyDdggyaAnSFNksWCJe2gpxmEi7nFhC+JjZK+yXjYkw/4DvOgDY4GY=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=c63/vB/l; arc=none smtp.client-ip=209.85.216.73
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="c63/vB/l" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-2ff55176edcso11243420a91.1 for ; Tue, 01 Apr 2025 13:47:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1743540419; x=1744145219; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=aA8/rbjSPW3y49ht4xFwJaGU+DA9JsbqwVRp3IHMGug=; b=c63/vB/lO+N+ME4x7fVEM29Xt89P2/DBo/6Ggd1ctj2mIoLXBT5bDobtoJ7Dp40Lh1 DPuWbS3CbCdJRmc4m8zxm6EVOvD28z0ITmZcEAso+PQClG1oXrCmRkiekiqKZ6OLMAlr 53FoMhXLD8Vz5jnptYnuKjELys+URLlkfPBy7Z9OPGDo/JgSuMlLXxKXCA74xaasJ8E6 kgbGDhTkGHdz017/pQS+lU3V5m2CrbRI1kqNVcgVU4xNaXA3Uoeiia0nMJSW7gOJ62Qp F1MRlrDZkn81PimnWWODRYtQ5U3kyXcbqyQM//AXei0de2xlqQnW3GcEWtmfPCAM1wXZ D5aA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1743540419; x=1744145219; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=aA8/rbjSPW3y49ht4xFwJaGU+DA9JsbqwVRp3IHMGug=; b=OgiHiEdHK7d0EAlZrRVaoy2l52Gv3kI6hXh85MFtryG362jBwUQcKKSwPmMUzQScyY 2Qq16ZV3gS9gDwhx0UBUst38Xx0djpizcIXDxJG2qSu/f9wBBAFXGui1+ozOqFWpZIq7 ttY+BZJugVPVFq5ffrCTMEbTkVNCGe+CB88494u6MTK/+sSSCHrXl8HJbzHq8fa2QmWZ TbhXdheMXxh3THO4K9WhhaKsWSpLYX8mbQfv8ljLzojlVAEHw2NF6iLRAjaT3MHQW2K/ qlZnIOeMHjLPFpSrGbSd5npV2lgCBTb+HCDNo1/A9ZhbCYGtFOTWVMQgdDqxvrB405P+ uROA== X-Forwarded-Encrypted: i=1; AJvYcCXpmFkfywK6jKNM4PTvcre/AChsFijqeAHPrRb+jYd3VLGX5cjiDa63Y3Z2W2WjBE6OPmDnfIieUXQhY0A=@vger.kernel.org X-Gm-Message-State: AOJu0Yz6X0apOh80TXXs/iHQJ7LGjKOaxazFsXAJsw11BZn93Fo0/RVF eV5xZhJ1D//BasvQs32UNrHzxMchaEoxTZbY9SBcuRTiM5azhkmkrfutMyD+mtL/e3dOWR+aC4B KMA== X-Google-Smtp-Source: AGHT+IGGwFEdDMuZ0xntNe6yk2g8zeyEv/XFu/sxHNOeaskOniT4X4IpuqAn0QXf3B4CSI533+/1l7AS/qA= 
X-Received: from pfbby7.prod.google.com ([2002:a05:6a00:4007:b0:737:6066:fee8]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:9f07:b0:1f5:709d:e0b7 with SMTP id adf61e73a8af0-2009f5b5002mr26259502637.6.1743540419700; Tue, 01 Apr 2025 13:46:59 -0700 (PDT)
Reply-To: Sean Christopherson
Date: Tue, 1 Apr 2025 13:44:19 -0700
In-Reply-To: <20250401204425.904001-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20250401204425.904001-1-seanjc@google.com>
X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog
Message-ID: <20250401204425.904001-8-seanjc@google.com>
Subject: [PATCH 07/12] KVM: Disallow binding multiple irqfds to an eventfd with a priority waiter
From: Sean Christopherson
To: Paolo Bonzini, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Marc Zyngier, Oliver Upton, Sean Christopherson, Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Disallow binding an irqfd to an eventfd that already has a priority waiter,
i.e. to an eventfd that already has an attached irqfd.  KVM always
operates in exclusive mode for EPOLLIN (unconditionally returns '1'),
i.e. only the first waiter will be notified.

KVM already disallows binding multiple irqfds to an eventfd in a single
VM, but doesn't guard against multiple VMs binding to an eventfd.  Adding
the extra protection reduces the pain of a userspace VMM bug, e.g. if
userspace fails to de-assign before re-assigning when transferring state
for intra-host migration, then the migration will explicitly fail as
opposed to dropping IRQs on the destination VM.

Temporarily keep KVM's manual check on irqfds.items, but add a WARN, e.g.
to allow sanity checking the waitqueue enforcement.

Cc: Oliver Upton
Cc: David Matlack
Signed-off-by: Sean Christopherson
---
 virt/kvm/eventfd.c | 54 +++++++++++++++++++++++++++++++---------------
 1 file changed, 37 insertions(+), 17 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index a33c10bd042a..25c360ed2e1e 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -291,37 +291,57 @@ static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh,
 	struct kvm_kernel_irqfd *tmp;
 	struct kvm *kvm = p->kvm;
 
+	/*
+	 * Note, irqfds.lock protects the irqfd's irq_entry, i.e. its routing,
+	 * and irqfds.items.  It does NOT protect registering with the eventfd.
+	 */
 	spin_lock_irq(&kvm->irqfds.lock);
 
-	list_for_each_entry(tmp, &kvm->irqfds.items, list) {
-		if (irqfd->eventfd != tmp->eventfd)
-			continue;
-		/* This fd is used for another irq already. */
-		p->ret = -EBUSY;
-		spin_unlock_irq(&kvm->irqfds.lock);
-		return;
-	}
-
+	/*
+	 * Initialize the routing information prior to adding the irqfd to the
+	 * eventfd's waitqueue, as irqfd_wakeup() can be invoked as soon as the
+	 * irqfd is registered.
+	 */
 	irqfd_update(kvm, irqfd);
 
-	list_add_tail(&irqfd->list, &kvm->irqfds.items);
-
 	/*
 	 * Add the irqfd as a priority waiter on the eventfd, with a custom
 	 * wake-up handler, so that KVM *and only KVM* is notified whenever the
-	 * underlying eventfd is signaled.  Temporarily lie to lockdep about
-	 * holding irqfds.lock to avoid a false positive regarding potential
-	 * deadlock with irqfd_wakeup() (see irqfd_wakeup() for details).
+	 * underlying eventfd is signaled.
 	 */
 	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
 
+	/*
+	 * Temporarily lie to lockdep about holding irqfds.lock to avoid a
+	 * false positive regarding potential deadlock with irqfd_wakeup()
+	 * (see irqfd_wakeup() for details).
+	 *
+	 * Adding to the wait queue will fail if there is already a priority
+	 * waiter, i.e. if the eventfd is associated with another irqfd (in any
+	 * VM).  Note, kvm_irqfd_deassign() waits for all in-flight shutdown
+	 * jobs to complete, i.e. ensures the irqfd has been removed from the
+	 * eventfd's waitqueue before returning to userspace.
+	 */
 	spin_release(&kvm->irqfds.lock.dep_map, _RET_IP_);
-	add_wait_queue_priority(wqh, &irqfd->wait);
+	p->ret = add_wait_queue_priority_exclusive(wqh, &irqfd->wait);
 	spin_acquire(&kvm->irqfds.lock.dep_map, 0, 0, _RET_IP_);
+	if (p->ret)
+		goto out;
 
+	list_for_each_entry(tmp, &kvm->irqfds.items, list) {
+		if (irqfd->eventfd != tmp->eventfd)
+			continue;
+
+		WARN_ON_ONCE(1);
+		/* This fd is used for another irq already. */
+		p->ret = -EBUSY;
+		goto out;
+	}
+
+	list_add_tail(&irqfd->list, &kvm->irqfds.items);
+
+out:
 	spin_unlock_irq(&kvm->irqfds.lock);
-
-	p->ret = 0;
 }
 
 #ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
-- 
2.49.0.504.g3bcea36a83-goog

From nobody Tue Apr 15 19:54:28 2025
Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9560D21E0BB for ; Tue, 1 Apr 2025 20:47:01 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540423; cv=none; b=JHqN9jpFK4hpsNZEu7MHbTPkmv2p9gY1YsFxiTfMe7Zo1JtgCqF2KQBLsYPiIkTaY0H7cH+sfq3quK+3dt0efmesTKsfg/FA5N/Z1Ho2p3qVC65lsVjRgPejjgCjY5Fw9qSKc4Q/9AtpOWcmSKoEyns0bLGAIZrlOpnUEK5gSSs=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540423; c=relaxed/simple; bh=V7OL3Q/3J2xWKwnmIlSC3LoKokZ3iR0JrDzpSaKTdhE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type;
b=DSMWFzpGEMIoIurUgdQoG85KnqSt1HVYxID2LfoiVCZSJyTIzvw4/a2PKAnwYSzBG4/zo/PeuYisoqAYVOp9LLqKXn7k+Zi/K8iTMehkbWzo6iaaPKVX3u0hET1cCxY9Q3kF65fqdfJNtE4xqoinsnwUd7ljc+dyU7XAVZ8SRm8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=qGfT9SO9; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="qGfT9SO9" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-30364fc706fso10340067a91.3 for ; Tue, 01 Apr 2025 13:47:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1743540421; x=1744145221; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=hAOHDZe9hVA1LcrHBl8ILklpk5SqTyz1Lg9oNtA6EC8=; b=qGfT9SO9RBnPGlw4gIUZCxTZpC07aSiJWDJT4WjMDM64dl+zIYdeV56USeA7rEij6v pmvTivzaioWglX5Vs3GwKiUlfDH1IXnBkXC7vnHnneaNalAvoQmLUbIp4csryWG5+tjZ UnnzWVhOhc7M2qdv085XnlWkZx7MLmjBFcysL4azU42Z0yG+v2TLZk5SlEGN2YYDTlPC VXDsSxQcaybJyld7Uf+9I2BShpImty6NaOEK3NUmZUn46zt945XknHRLxA1IcnXvVjKZ Ub5Q5a8oGGe3+nEJlEJ2GQl9AB2mGMxwUNJBsDKv1wH0aOFzMgQte1afncVlcelu8hpd 8gvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1743540421; x=1744145221; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=hAOHDZe9hVA1LcrHBl8ILklpk5SqTyz1Lg9oNtA6EC8=; b=k0aCTw5ecV1VPLH0/k1K5psWCCneZ7DL6JgFCqvzbeTvPjRVBudwikJ0WUpRnfVf6C 
QKheK/z1pBZv9Bwx5Px3mWOffWRWOv98GntgMWQyWk7Oh0+nGmWn+QsIhiZ7kVaQXpUK nAXKMqE/4E46sJBFirqzIXvR5QFZ3hWOMCa4viWALriVH3Cw2k6AcwX1r3l/KaHOKM8V +jPj9O4Ibm6r9/az5WIQNvFRsNrAgxOv4xDgxNqAps0EJKTUCZsxPL2OGnR8YehiTDyJ xqq/UeNR9hFMBA26hWPVHo0uNFV8k9VcdORL5MJxoVNuo8bowS25vzlzc9tKIGp5Xm7d 8fHw==
X-Forwarded-Encrypted: i=1; AJvYcCW8pZJnzAwlI1/Q3PMhYlL2IxxWWr1oIkcYsVOVJN9m+7xaWMZbgSdK273EWq1+C2soKJHWGb8p4bRJUu0=@vger.kernel.org
X-Gm-Message-State: AOJu0YyYXmnsgyGsQc2Fn9yIuhJg6u8LTApin2hWLuoc4V7uCDhUCtov 5fucQHWYlV3NLlhnuMmsykX+oFEcrFxSWI+G/xyL2EAuPTVwnmJekRhTyzhJbxNNdmoLVDnltmX /Yw==
X-Google-Smtp-Source: AGHT+IHU4ej+9xtm5sDkWeeRPmRGXEXSjICfBn2mdGglMLGC+glAIOtURcbT/pNs0Pm5Q7kEH8DQAnKQF70=
X-Received: from pjbqb11.prod.google.com ([2002:a17:90b:280b:b0:2fe:d556:ec6e]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:1344:b0:2ff:6788:cc67 with SMTP id 98e67ed59e1d1-3056096950cmr5544504a91.34.1743540421162; Tue, 01 Apr 2025 13:47:01 -0700 (PDT)
Reply-To: Sean Christopherson
Date: Tue, 1 Apr 2025 13:44:20 -0700
In-Reply-To: <20250401204425.904001-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20250401204425.904001-1-seanjc@google.com>
X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog
Message-ID: <20250401204425.904001-9-seanjc@google.com>
Subject: [PATCH 08/12] sched/wait: Drop WQ_FLAG_EXCLUSIVE from add_wait_queue_priority()
From: Sean Christopherson
To: Paolo Bonzini, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Marc Zyngier, Oliver Upton, Sean Christopherson, Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Drop the setting of
WQ_FLAG_EXCLUSIVE from add_wait_queue_priority() to differentiate it from
add_wait_queue_priority_exclusive().  The one and only user of
add_wait_queue_priority(), Xen privcmd's irqfd_wakeup(), unconditionally
returns '0', i.e. doesn't actually operate in exclusive mode.

Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Oleksandr Tyshchenko
Signed-off-by: Sean Christopherson
---
 kernel/sched/wait.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 80d90d1dc24d..2af0fc92e5d3 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -40,7 +40,7 @@ void add_wait_queue_priority(struct wait_queue_head *wq_head, struct wait_queue_
 {
 	unsigned long flags;
 
-	wq_entry->flags |= WQ_FLAG_EXCLUSIVE | WQ_FLAG_PRIORITY;
+	wq_entry->flags |= WQ_FLAG_PRIORITY;
 	spin_lock_irqsave(&wq_head->lock, flags);
 	__add_wait_queue(wq_head, wq_entry);
 	spin_unlock_irqrestore(&wq_head->lock, flags);
@@ -84,7 +84,7 @@ EXPORT_SYMBOL(remove_wait_queue);
  * the non-exclusive tasks. Normally, exclusive tasks will be at the end of
  * the list and any non-exclusive tasks will be woken first. A priority task
  * may be at the head of the list, and can consume the event without any other
- * tasks being woken.
+ * tasks being woken if it's also an exclusive task.
  *
  * There are circumstances in which we can try to wake a task which has already
  * started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
-- 
2.49.0.504.g3bcea36a83-goog

From nobody Tue Apr 15 19:54:28 2025
Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AFE0822155C for ; Tue, 1 Apr 2025 20:47:03 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540425; cv=none; b=OGMIf0VXozVM8+uzgTh9qH/JiX1qJDdUiAvy5qv7rrjxMBs5Aa+ljtUI0KrjTV0KYll0qTgY/Txg9H6ts+0x8N2FIhtMGoAUV0OIuIbVWwT+mrP63YMnevNHPmer7VwoqxAIZxQd24Jk/6ktKGqpSa/FmAstVSRoNMMUxeugn34=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743540425; c=relaxed/simple; bh=p6FosNapp0YaaT9fRT5YA+80ZKTNpXl7IVghKZMWIQo=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=QBooQS/ZbxLDTQMRHf1wyer7Ieuw8iQ8ApYxqQSxi1Zuc/DSZh0hwi3nUB+TEoSfPhkGjpOUVzwwAIsSHyKDG/vEyoU6Rjw6XmWPjKPOUKHBXxQaqvnHSmomPg1PyQS62cRX5n6JoxC4wJvoGgUp5sl8PEL06LIVTdBLf0kg7Bk=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=qbTy8TQN; arc=none smtp.client-ip=209.85.216.73
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="qbTy8TQN"
Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-3032f4eca83so10215773a91.3 for ; Tue, 01 Apr 2025 13:47:03 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
Reply-To: Sean Christopherson
Date: Tue, 1 Apr 2025 13:44:21 -0700
In-Reply-To: <20250401204425.904001-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250401204425.904001-1-seanjc@google.com>
X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog
Message-ID: <20250401204425.904001-10-seanjc@google.com>
Subject: [PATCH 09/12] KVM: Drop sanity check that per-VM list of irqfds is unique
From: Sean Christopherson
To: Paolo Bonzini, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Marc Zyngier, Oliver Upton, Sean Christopherson, Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko
Content-Type: text/plain; charset="utf-8"

Now that the eventfd's waitqueue ensures it has at most one priority
waiter, i.e. prevents KVM from binding multiple irqfds to one eventfd,
drop KVM's sanity check that eventfds are unique for a single VM.

Signed-off-by: Sean Christopherson
---
 virt/kvm/eventfd.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 25c360ed2e1e..d21b956e7daa 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -288,7 +288,6 @@ static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh,
 {
 	struct kvm_irqfd_pt *p = container_of(pt, struct kvm_irqfd_pt, pt);
 	struct kvm_kernel_irqfd *irqfd = p->irqfd;
-	struct kvm_kernel_irqfd *tmp;
 	struct kvm *kvm = p->kvm;
 
 	/*
@@ -328,16 +327,6 @@ static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh,
 	if (p->ret)
 		goto out;
 
-	list_for_each_entry(tmp, &kvm->irqfds.items, list) {
-		if (irqfd->eventfd != tmp->eventfd)
-			continue;
-
-		WARN_ON_ONCE(1);
-		/* This fd is used for another irq already. */
-		p->ret = -EBUSY;
-		goto out;
-	}
-
 	list_add_tail(&irqfd->list, &kvm->irqfds.items);
 
 out:
-- 
2.49.0.504.g3bcea36a83-goog

From nobody Tue Apr 15 19:54:28 2025
Reply-To: Sean Christopherson
Date: Tue, 1 Apr 2025 13:44:22 -0700
In-Reply-To: <20250401204425.904001-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250401204425.904001-1-seanjc@google.com>
X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog
Message-ID: <20250401204425.904001-11-seanjc@google.com>
Subject: [PATCH 10/12] KVM: selftests: Assert that eventfd() succeeds in Xen shinfo test
From: Sean Christopherson
To: Paolo Bonzini, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Marc Zyngier, Oliver Upton, Sean Christopherson, Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko
Content-Type: text/plain; charset="utf-8"

Assert that eventfd() succeeds in the Xen shinfo test instead of skipping
the associated testcase.  While eventfd() is outside the scope of KVM, KVM
unconditionally selects EVENTFD, i.e. the syscall should always succeed.
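As a rough illustration only (not part of this patch), the standalone
sketch below shows the eventfd(2) behavior the test depends on: the
descriptor is created with the same arguments the selftest uses, and a
counter value is written and read back.  The helper name
eventfd_roundtrip() is hypothetical; only the plain <sys/eventfd.h>
interface is assumed.

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/*
 * Hypothetical helper (not from the patch): create an eventfd, add 'val' to
 * its counter, and read the counter back.  Returns 0 on success, -1 if any
 * step fails or the value read back does not match.
 */
int eventfd_roundtrip(uint64_t val)
{
	uint64_t out = 0;
	int ret = -1;
	int fd;

	/*
	 * A zero counter would make the blocking read() below wait forever,
	 * and UINT64_MAX is rejected by write(), so refuse both up front.
	 */
	if (val == 0 || val == UINT64_MAX)
		return -1;

	/* Same call the selftest makes: initial counter of 0, no flags. */
	fd = eventfd(0, 0);
	if (fd < 0)
		return -1;

	/* write() adds to the counter; read() returns it and resets it to 0. */
	if (write(fd, &val, sizeof(val)) == (ssize_t)sizeof(val) &&
	    read(fd, &out, sizeof(out)) == (ssize_t)sizeof(out) &&
	    out == val)
		ret = 0;

	close(fd);
	return ret;
}
```

On a kernel built with EVENTFD, calling eventfd_roundtrip(1) from a small
main() is expected to return 0, which is why the test can reasonably
assert instead of skip.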

Signed-off-by: Sean Christopherson
---
 tools/testing/selftests/kvm/x86/xen_shinfo_test.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86/xen_shinfo_test.c
index 287829f850f7..34d180cf4eed 100644
--- a/tools/testing/selftests/kvm/x86/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86/xen_shinfo_test.c
@@ -548,14 +548,11 @@ int main(int argc, char *argv[])
 
 	if (do_eventfd_tests) {
 		irq_fd[0] = eventfd(0, 0);
+		TEST_ASSERT(irq_fd[0] >= 0, __KVM_SYSCALL_ERROR("eventfd()", irq_fd[0]));
+
 		irq_fd[1] = eventfd(0, 0);
+		TEST_ASSERT(irq_fd[1] >= 0, __KVM_SYSCALL_ERROR("eventfd()", irq_fd[1]));
 
-		/* Unexpected, but not a KVM failure */
-		if (irq_fd[0] == -1 || irq_fd[1] == -1)
-			do_evtchn_tests = do_eventfd_tests = false;
-	}
-
-	if (do_eventfd_tests) {
 		irq_routes.info.nr = 2;
 
 		irq_routes.entries[0].gsi = 32;
-- 
2.49.0.504.g3bcea36a83-goog

From nobody Tue Apr 15 19:54:28 2025
Reply-To: Sean Christopherson
Date: Tue, 1 Apr 2025 13:44:23 -0700
In-Reply-To: <20250401204425.904001-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250401204425.904001-1-seanjc@google.com>
X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog
Message-ID: <20250401204425.904001-12-seanjc@google.com>
Subject: [PATCH 11/12] KVM: selftests: Add utilities to create eventfds and do KVM_IRQFD
From: Sean Christopherson
To: Paolo Bonzini, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Marc Zyngier, Oliver Upton, Sean Christopherson, Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko
Content-Type: text/plain; charset="utf-8"

Add helpers to create eventfds and to (de)assign eventfds via KVM_IRQFD.

Signed-off-by: Sean Christopherson
---
 tools/testing/selftests/kvm/arm64/vgic_irq.c  | 12 ++----
 .../testing/selftests/kvm/include/kvm_util.h  | 40 +++++++++++++++++++
 .../selftests/kvm/x86/xen_shinfo_test.c       | 18 ++-------
 3 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/tools/testing/selftests/kvm/arm64/vgic_irq.c b/tools/testing/selftests/kvm/arm64/vgic_irq.c
index f4ac28d53747..a09dd423c2d7 100644
--- a/tools/testing/selftests/kvm/arm64/vgic_irq.c
+++ b/tools/testing/selftests/kvm/arm64/vgic_irq.c
@@ -620,18 +620,12 @@ static void kvm_routing_and_irqfd_check(struct kvm_vm *vm,
 	 * that no actual interrupt was injected for those cases.
 	 */
 
-	for (f = 0, i = intid; i < (uint64_t)intid + num; i++, f++) {
-		fd[f] = eventfd(0, 0);
-		TEST_ASSERT(fd[f] != -1, __KVM_SYSCALL_ERROR("eventfd()", fd[f]));
-	}
+	for (f = 0, i = intid; i < (uint64_t)intid + num; i++, f++)
+		fd[f] = kvm_new_eventfd();
 
 	for (f = 0, i = intid; i < (uint64_t)intid + num; i++, f++) {
-		struct kvm_irqfd irqfd = {
-			.fd = fd[f],
-			.gsi = i - MIN_SPI,
-		};
 		assert(i <= (uint64_t)UINT_MAX);
-		vm_ioctl(vm, KVM_IRQFD, &irqfd);
+		kvm_assign_irqfd(vm, i - MIN_SPI, fd[f]);
 	}
 
 	for (f = 0, i = intid; i < (uint64_t)intid + num; i++, f++) {
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 373912464fb4..4f7bf8f000bb 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -18,6 +18,7 @@
 #include
 #include
 
+#include
 #include
 
 #include "kvm_util_arch.h"
@@ -496,6 +497,45 @@ static inline int vm_get_stats_fd(struct kvm_vm *vm)
 	return fd;
 }
 
+static inline int __kvm_irqfd(struct kvm_vm *vm, uint32_t gsi, int eventfd,
+			      uint32_t flags)
+{
+	struct kvm_irqfd irqfd = {
+		.fd = eventfd,
+		.gsi = gsi,
+		.flags = flags,
+		.resamplefd = -1,
+	};
+
+	return __vm_ioctl(vm, KVM_IRQFD, &irqfd);
+}
+
+static inline void kvm_irqfd(struct kvm_vm *vm, uint32_t gsi, int eventfd,
+			      uint32_t flags)
+{
+	int ret = __kvm_irqfd(vm, gsi, eventfd, flags);
+
+	TEST_ASSERT_VM_VCPU_IOCTL(!ret, KVM_IRQFD, ret, vm);
+}
+
+static inline void kvm_assign_irqfd(struct kvm_vm *vm, uint32_t gsi, int eventfd)
+{
+	kvm_irqfd(vm, gsi, eventfd, 0);
+}
+
+static inline void kvm_deassign_irqfd(struct kvm_vm *vm, uint32_t gsi, int eventfd)
+{
+	kvm_irqfd(vm, gsi, eventfd, KVM_IRQFD_FLAG_DEASSIGN);
+}
+
+static inline int kvm_new_eventfd(void)
+{
+	int fd = eventfd(0, 0);
+
+	TEST_ASSERT(fd >= 0, __KVM_SYSCALL_ERROR("eventfd()", fd));
+	return fd;
+}
+
 static inline void read_stats_header(int stats_fd, struct kvm_stats_header *header)
 {
 	ssize_t ret;
diff --git a/tools/testing/selftests/kvm/x86/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86/xen_shinfo_test.c
index 34d180cf4eed..23909b501ac2 100644
--- a/tools/testing/selftests/kvm/x86/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86/xen_shinfo_test.c
@@ -547,11 +547,8 @@ int main(int argc, char *argv[])
 	int irq_fd[2] = { -1, -1 };
 
 	if (do_eventfd_tests) {
-		irq_fd[0] = eventfd(0, 0);
-		TEST_ASSERT(irq_fd[0] >= 0, __KVM_SYSCALL_ERROR("eventfd()", irq_fd[0]));
-
-		irq_fd[1] = eventfd(0, 0);
-		TEST_ASSERT(irq_fd[1] >= 0, __KVM_SYSCALL_ERROR("eventfd()", irq_fd[1]));
+		irq_fd[0] = kvm_new_eventfd();
+		irq_fd[1] = kvm_new_eventfd();
 
 		irq_routes.info.nr = 2;
 
@@ -569,15 +566,8 @@ int main(int argc, char *argv[])
 
 		vm_ioctl(vm, KVM_SET_GSI_ROUTING, &irq_routes.info);
 
-		struct kvm_irqfd ifd = { };
-
-		ifd.fd = irq_fd[0];
-		ifd.gsi = 32;
-		vm_ioctl(vm, KVM_IRQFD, &ifd);
-
-		ifd.fd = irq_fd[1];
-		ifd.gsi = 33;
-		vm_ioctl(vm, KVM_IRQFD, &ifd);
+		kvm_assign_irqfd(vm, 32, irq_fd[0]);
+		kvm_assign_irqfd(vm, 33, irq_fd[1]);
 
 		struct sigaction sa = { };
 		sa.sa_handler = handle_alrm;
-- 
2.49.0.504.g3bcea36a83-goog

From nobody Tue Apr 15 19:54:28 2025
Reply-To: Sean Christopherson
Date: Tue, 1 Apr 2025 13:44:24 -0700
In-Reply-To: <20250401204425.904001-1-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250401204425.904001-1-seanjc@google.com>
X-Mailer: git-send-email 2.49.0.504.g3bcea36a83-goog
Message-ID: <20250401204425.904001-13-seanjc@google.com>
Subject: [PATCH 12/12] KVM: selftests: Add a KVM_IRQFD test to verify uniqueness requirements
From: Sean Christopherson
To: Paolo Bonzini, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Marc Zyngier, Oliver Upton, Sean Christopherson, Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-riscv@lists.infradead.org, David Matlack, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko
Content-Type: text/plain; charset="utf-8"

Add a selftest to verify that eventfd+irqfd bindings are globally unique,
i.e. that KVM doesn't allow multiple irqfds to bind to a single eventfd,
even across VMs.
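For readers without the selftest framework at hand, the single-VM half of
the check can be sketched with raw ioctls (illustration only, not part of
this patch).  The sketch assumes an x86 host where KVM_CREATE_IRQCHIP is
available and /dev/kvm is accessible, and it reports "skipped" when that
setup cannot be established.  The function names, result codes, and the
choice of GSIs 10 and 11 (mirroring the test) are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical result codes, used by this sketch only. */
enum { IRQFD_UNIQUE = 0, IRQFD_SKIPPED = 1, IRQFD_UNEXPECTED = -1 };

/* Bind 'efd' to 'gsi' on the VM.  Returns 0, or -1 with errno set. */
static int assign_irqfd(int vm_fd, int efd, unsigned int gsi)
{
	struct kvm_irqfd irqfd = { .fd = efd, .gsi = gsi };

	return ioctl(vm_fd, KVM_IRQFD, &irqfd);
}

/*
 * Assign one eventfd to GSI 10, then try to assign the same eventfd to
 * GSI 11 and expect EBUSY.  Any failure to set up the precondition
 * (no /dev/kvm, no VM, no in-kernel IRQ chip, ...) is reported as skipped.
 */
int check_irqfd_unique(void)
{
	int ret = IRQFD_SKIPPED;
	int kvm_fd, vm_fd = -1, efd = -1;

	kvm_fd = open("/dev/kvm", O_RDWR);
	if (kvm_fd < 0)
		return IRQFD_SKIPPED;

	vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, 0);
	if (vm_fd < 0)
		goto out;

	/* KVM_IRQFD needs an in-kernel IRQ chip (assumption: x86 host). */
	if (ioctl(vm_fd, KVM_CREATE_IRQCHIP, 0))
		goto out;

	efd = eventfd(0, 0);
	if (efd < 0)
		goto out;

	/* The first binding establishes the precondition. */
	if (assign_irqfd(vm_fd, efd, 10))
		goto out;

	/* A second binding of the same eventfd should be rejected. */
	if (assign_irqfd(vm_fd, efd, 11) == -1 && errno == EBUSY)
		ret = IRQFD_UNIQUE;
	else
		ret = IRQFD_UNEXPECTED;

out:
	if (efd >= 0)
		close(efd);
	if (vm_fd >= 0)
		close(vm_fd);
	close(kvm_fd);
	return ret;
}
```

The cross-VM case and the racing close/recreate paths are only exercised
by the selftest itself; this sketch covers just the simplest binding
conflict.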

Signed-off-by: Sean Christopherson
---
 tools/testing/selftests/kvm/Makefile.kvm |   4 +
 tools/testing/selftests/kvm/irqfd_test.c | 130 +++++++++++++++++++++++
 2 files changed, 134 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/irqfd_test.c

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index f773f8f99249..9e5128d9f22c 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -125,6 +125,7 @@ TEST_GEN_PROGS_x86 += dirty_log_perf_test
 TEST_GEN_PROGS_x86 += guest_memfd_test
 TEST_GEN_PROGS_x86 += guest_print_test
 TEST_GEN_PROGS_x86 += hardware_disable_test
+TEST_GEN_PROGS_x86 += irqfd_test
 TEST_GEN_PROGS_x86 += kvm_create_max_vcpus
 TEST_GEN_PROGS_x86 += kvm_page_table_test
 TEST_GEN_PROGS_x86 += memslot_modification_stress_test
@@ -163,6 +164,7 @@ TEST_GEN_PROGS_arm64 += dirty_log_test
 TEST_GEN_PROGS_arm64 += dirty_log_perf_test
 TEST_GEN_PROGS_arm64 += guest_print_test
 TEST_GEN_PROGS_arm64 += get-reg-list
+TEST_GEN_PROGS_arm64 += irqfd_test
 TEST_GEN_PROGS_arm64 += kvm_create_max_vcpus
 TEST_GEN_PROGS_arm64 += kvm_page_table_test
 TEST_GEN_PROGS_arm64 += memslot_modification_stress_test
@@ -185,6 +187,7 @@ TEST_GEN_PROGS_s390 += s390/ucontrol_test
 TEST_GEN_PROGS_s390 += demand_paging_test
 TEST_GEN_PROGS_s390 += dirty_log_test
 TEST_GEN_PROGS_s390 += guest_print_test
+TEST_GEN_PROGS_s390 += irqfd_test
 TEST_GEN_PROGS_s390 += kvm_create_max_vcpus
 TEST_GEN_PROGS_s390 += kvm_page_table_test
 TEST_GEN_PROGS_s390 += rseq_test
@@ -199,6 +202,7 @@ TEST_GEN_PROGS_riscv += demand_paging_test
 TEST_GEN_PROGS_riscv += dirty_log_test
 TEST_GEN_PROGS_riscv += get-reg-list
 TEST_GEN_PROGS_riscv += guest_print_test
+TEST_GEN_PROGS_riscv += irqfd_test
 TEST_GEN_PROGS_riscv += kvm_binary_stats_test
 TEST_GEN_PROGS_riscv += kvm_create_max_vcpus
 TEST_GEN_PROGS_riscv += kvm_page_table_test
diff --git a/tools/testing/selftests/kvm/irqfd_test.c b/tools/testing/selftests/kvm/irqfd_test.c
new file mode 100644
index 000000000000..286f2b15fde6
--- /dev/null
+++ b/tools/testing/selftests/kvm/irqfd_test.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "kvm_util.h"
+
+static struct kvm_vm *vm1;
+static struct kvm_vm *vm2;
+static int __eventfd;
+static bool done;
+
+/*
+ * KVM de-assigns based on eventfd *and* GSI, but requires unique eventfds when
+ * assigning (the API isn't symmetrical).  Abuse the oddity and use a per-task
+ * GSI base to avoid false failures due to cross-task de-assign, i.e. so that
+ * the secondary doesn't de-assign the primary's eventfd and cause assign to
+ * unexpectedly succeed on the primary.
+ */
+#define GSI_BASE_PRIMARY	0x20
+#define GSI_BASE_SECONDARY	0x30
+
+static void juggle_eventfd_secondary(struct kvm_vm *vm, int eventfd)
+{
+	int r, i;
+
+	/*
+	 * The secondary task can encounter EBADF since the primary can close
+	 * the eventfd at any time.  And because the primary can recreate the
+	 * eventfd, at the same fd in the file table, the secondary can also
+	 * encounter "unexpected" success, e.g. if the close+recreate happens
+	 * between the first and second assignments.  The secondary's role is
+	 * mostly to antagonize KVM, not to detect bugs.
+	 */
+	for (i = 0; i < 2; i++) {
+		r = __kvm_irqfd(vm, GSI_BASE_SECONDARY, eventfd, 0);
+		TEST_ASSERT(!r || errno == EBUSY || errno == EBADF,
+			    "Wanted success, EBUSY, or EBADF, r = %d, errno = %d",
+			    r, errno);
+
+		/* De-assign should succeed unless the eventfd was closed. */
+		r = __kvm_irqfd(vm, GSI_BASE_SECONDARY + i, eventfd, KVM_IRQFD_FLAG_DEASSIGN);
+		TEST_ASSERT(!r || errno == EBADF,
+			    "De-assign should succeed unless the fd was closed");
+	}
+}
+
+static void *secondary_irqfd_juggler(void *ign)
+{
+	while (!READ_ONCE(done)) {
+		juggle_eventfd_secondary(vm1, READ_ONCE(__eventfd));
+		juggle_eventfd_secondary(vm2, READ_ONCE(__eventfd));
+	}
+
+	return NULL;
+}
+
+static void juggle_eventfd_primary(struct kvm_vm *vm, int eventfd)
+{
+	int r1, r2;
+
+	/*
+	 * At least one of the assigns should fail.  KVM disallows assigning a
+	 * single eventfd to multiple GSIs (or VMs), so it's possible that both
+	 * assignments can fail, too.
+	 */
+	r1 = __kvm_irqfd(vm, GSI_BASE_PRIMARY, eventfd, 0);
+	TEST_ASSERT(!r1 || errno == EBUSY,
+		    "Wanted success or EBUSY, r = %d, errno = %d", r1, errno);
+
+	r2 = __kvm_irqfd(vm, GSI_BASE_PRIMARY + 1, eventfd, 0);
+	TEST_ASSERT(r1 || (r2 && errno == EBUSY),
+		    "Wanted failure (EBUSY), r1 = %d, r2 = %d, errno = %d",
+		    r1, r2, errno);
+
+	/*
+	 * De-assign should always succeed, even if the corresponding assign
+	 * failed.
+	 */
+	kvm_irqfd(vm, GSI_BASE_PRIMARY, eventfd, KVM_IRQFD_FLAG_DEASSIGN);
+	kvm_irqfd(vm, GSI_BASE_PRIMARY + 1, eventfd, KVM_IRQFD_FLAG_DEASSIGN);
+}
+
+int main(int argc, char *argv[])
+{
+	pthread_t racing_thread;
+	int r, i;
+
+	/* Create "full" VMs, as KVM_IRQFD requires an in-kernel IRQ chip. */
+	vm1 = vm_create(1);
+	vm2 = vm_create(1);
+
+	WRITE_ONCE(__eventfd, kvm_new_eventfd());
+
+	kvm_irqfd(vm1, 10, __eventfd, 0);
+
+	r = __kvm_irqfd(vm1, 11, __eventfd, 0);
+	TEST_ASSERT(r && errno == EBUSY,
+		    "Wanted EBUSY, r = %d, errno = %d", r, errno);
+
+	r = __kvm_irqfd(vm2, 12, __eventfd, 0);
+	TEST_ASSERT(r && errno == EBUSY,
+		    "Wanted EBUSY, r = %d, errno = %d", r, errno);
+
+	kvm_irqfd(vm1, 11, READ_ONCE(__eventfd), KVM_IRQFD_FLAG_DEASSIGN);
+	kvm_irqfd(vm1, 12, READ_ONCE(__eventfd), KVM_IRQFD_FLAG_DEASSIGN);
+	kvm_irqfd(vm1, 13, READ_ONCE(__eventfd), KVM_IRQFD_FLAG_DEASSIGN);
+	kvm_irqfd(vm1, 14, READ_ONCE(__eventfd), KVM_IRQFD_FLAG_DEASSIGN);
+	kvm_irqfd(vm1, 10, READ_ONCE(__eventfd), KVM_IRQFD_FLAG_DEASSIGN);
+
+	close(__eventfd);
+
+	pthread_create(&racing_thread, NULL, secondary_irqfd_juggler, vm2);
+
+	for (i = 0; i < 10000; i++) {
+		WRITE_ONCE(__eventfd, kvm_new_eventfd());
+
+		juggle_eventfd_primary(vm1, __eventfd);
+		juggle_eventfd_primary(vm2, __eventfd);
+		close(__eventfd);
+	}
+
+	WRITE_ONCE(done, true);
+	pthread_join(racing_thread, NULL);
+}
-- 
2.49.0.504.g3bcea36a83-goog