From nobody Sun May 19 01:26:42 2024 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3BCC9156983; Wed, 27 Mar 2024 12:16:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711541778; cv=none; b=mdgDlMuzcS9mSDwDiUCD77phvfoar0tP9QTSwNCg0rDJfyud4nQQUXLYjWRHpG1jNy4rAEwuGwsxMtPS/ZyIguQeqTPadafHmaHUqalLhVkNYJ2JFN3Et+2mESjBBV3u1gH/DQWxF5gSdjeJtRisYIi5+RsFYfeTg0RcNIh7wr4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711541778; c=relaxed/simple; bh=OJjaGxWgsTN2/YoXy26ieVqg7aWOy2agEtwCnHDFf8s=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version; b=gYO/afzTegzoYVwcWO3Qqf6KrtMo9UQC10gL1QIHbW9xSzYP2vZ7d+ZZji5ALbdT0ZVVOqsaOwwEQ6pLlzvmMl29GWjhrZPV+v2FxsBkvqXUWK5HX5KyE2LUuK0SUfv0mm9NFy7WX26r66bTfDoJVr1VrwyU2rpT2fLQT3GHCgs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JA2ma3Hk; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="JA2ma3Hk" Received: by smtp.kernel.org (Postfix) with ESMTPSA id EFD87C43394; Wed, 27 Mar 2024 12:16:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1711541777; bh=OJjaGxWgsTN2/YoXy26ieVqg7aWOy2agEtwCnHDFf8s=; h=From:To:Cc:Subject:Date:From; b=JA2ma3HkkCxjSqLjfbaP4Y+yLWewHhASUONVhk7hnX+Lc2Zl/hidqJ0IwlpeVC0/O xQ5u4lSgqD5Ylw4Iw8b0k4GqIQPS0DXvxxMwhy2hY/F0OVkpiRbmo5ZGMqlIYSBx0A 8NszS822iSZF+oKaOcLW6u0NPSZ/PSgdgmr5z3JrFDi3NlCvZp3KIlWAkLsLMdGgaZ beDEnh3gdLftX6TMj/MvZBCuJFsTx9hWGUcHy99NruLVQfvMAjokcC1jOxVUDUk8Hm MuPzHQQIOvDwdYJX+704Nzz6G86JGduMqe1vcaDyImEpv8vFnqw81HHhsXRZDRBeir KLKZ9tuT1adnA== From: Sasha Levin To: stable@vger.kernel.org, alex.williamson@redhat.com Cc: Eric Auger , Kevin Tian , kvm@vger.kernel.org, linux-kernel@vger.kernel.org Subject: FAILED: Patch "vfio/platform: Create persistent IRQ handlers" failed to apply to 5.15-stable tree Date: Wed, 27 Mar 2024 08:16:15 -0400 Message-ID: <20240327121616.2832578-1-sashal@kernel.org> X-Mailer: git-send-email 2.43.0 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Hint: ignore X-stable: review Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to . Thanks, Sasha Reviewed-by: Eric Auger Reviewed-by: Kevin Tian ------------------ original commit in Linus's tree ------------------ From 675daf435e9f8e5a5eab140a9864dfad6668b375 Mon Sep 17 00:00:00 2001 From: Alex Williamson Date: Fri, 8 Mar 2024 16:05:27 -0700 Subject: [PATCH] vfio/platform: Create persistent IRQ handlers The vfio-platform SET_IRQS ioctl currently allows loopback triggering of an interrupt before a signaling eventfd has been configured by the user, which thereby allows a NULL pointer dereference. Rather than register the IRQ relative to a valid trigger, register all IRQs in a disabled state in the device open path. This allows mask operations on the IRQ to nest within the overall enable state governed by a valid eventfd signal. This decouples @masked, protected by the @locked spinlock from @trigger, protected via the @igate mutex. In doing so, it's guaranteed that changes to @trigger cannot race the IRQ handlers because the IRQ handler is synchronously disabled before modifying the trigger, and loopback triggering of the IRQ via ioctl is safe due to serialization with trigger changes via igate. For compatibility, request_irq() failures are maintained to be local to the SET_IRQS ioctl rather than a fatal error in the open device path. This allows, for example, a userspace driver with polling mode support to continue to work regardless of moving the request_irq() call site. This necessarily blocks all SET_IRQS access to the failed index. Cc: Eric Auger Cc: Fixes: 57f972e2b341 ("vfio/platform: trigger an interrupt via eventfd") Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Link: https://lore.kernel.org/r/20240308230557.805580-7-alex.williamson@red= hat.com Signed-off-by: Alex Williamson --- drivers/vfio/platform/vfio_platform_irq.c | 100 +++++++++++++++------- 1 file changed, 68 insertions(+), 32 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platf= orm/vfio_platform_irq.c index e5dcada9e86c4..ef41ecef83af1 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -136,6 +136,16 @@ static int vfio_platform_set_irq_unmask(struct vfio_pl= atform_device *vdev, return 0; } =20 +/* + * The trigger eventfd is guaranteed valid in the interrupt path + * and protected by the igate mutex when triggered via ioctl. + */ +static void vfio_send_eventfd(struct vfio_platform_irq *irq_ctx) +{ + if (likely(irq_ctx->trigger)) + eventfd_signal(irq_ctx->trigger); +} + static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id) { struct vfio_platform_irq *irq_ctx =3D dev_id; @@ -155,7 +165,7 @@ static irqreturn_t vfio_automasked_irq_handler(int irq,= void *dev_id) spin_unlock_irqrestore(&irq_ctx->lock, flags); =20 if (ret =3D=3D IRQ_HANDLED) - eventfd_signal(irq_ctx->trigger); + vfio_send_eventfd(irq_ctx); =20 return ret; } @@ -164,52 +174,40 @@ static irqreturn_t vfio_irq_handler(int irq, void *de= v_id) { struct vfio_platform_irq *irq_ctx =3D dev_id; =20 - eventfd_signal(irq_ctx->trigger); + vfio_send_eventfd(irq_ctx); =20 return IRQ_HANDLED; } =20 static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, - int fd, irq_handler_t handler) + int fd) { struct vfio_platform_irq *irq =3D &vdev->irqs[index]; struct eventfd_ctx *trigger; - int ret; =20 if (irq->trigger) { - irq_clear_status_flags(irq->hwirq, IRQ_NOAUTOEN); - free_irq(irq->hwirq, irq); - kfree(irq->name); + disable_irq(irq->hwirq); eventfd_ctx_put(irq->trigger); irq->trigger =3D NULL; } =20 if (fd < 0) /* Disable only */ return 0; - irq->name =3D kasprintf(GFP_KERNEL_ACCOUNT, "vfio-irq[%d](%s)", - irq->hwirq, vdev->name); - if (!irq->name) - return -ENOMEM; =20 trigger =3D eventfd_ctx_fdget(fd); - if (IS_ERR(trigger)) { - kfree(irq->name); + if (IS_ERR(trigger)) return PTR_ERR(trigger); - } =20 irq->trigger =3D trigger; =20 - irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN); - ret =3D request_irq(irq->hwirq, handler, 0, irq->name, irq); - if (ret) { - kfree(irq->name); - eventfd_ctx_put(trigger); - irq->trigger =3D NULL; - return ret; - } - - if (!irq->masked) - enable_irq(irq->hwirq); + /* + * irq->masked effectively provides nested disables within the overall + * enable relative to trigger. Specifically request_irq() is called + * with NO_AUTOEN, therefore the IRQ is initially disabled. The user + * may only further disable the IRQ with a MASK operations because + * irq->masked is initially false. + */ + enable_irq(irq->hwirq); =20 return 0; } @@ -228,7 +226,7 @@ static int vfio_platform_set_irq_trigger(struct vfio_pl= atform_device *vdev, handler =3D vfio_irq_handler; =20 if (!count && (flags & VFIO_IRQ_SET_DATA_NONE)) - return vfio_set_trigger(vdev, index, -1, handler); + return vfio_set_trigger(vdev, index, -1); =20 if (start !=3D 0 || count !=3D 1) return -EINVAL; @@ -236,7 +234,7 @@ static int vfio_platform_set_irq_trigger(struct vfio_pl= atform_device *vdev, if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { int32_t fd =3D *(int32_t *)data; =20 - return vfio_set_trigger(vdev, index, fd, handler); + return vfio_set_trigger(vdev, index, fd); } =20 if (flags & VFIO_IRQ_SET_DATA_NONE) { @@ -260,6 +258,14 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_= device *vdev, unsigned start, unsigned count, uint32_t flags, void *data) =3D NULL; =20 + /* + * For compatibility, errors from request_irq() are local to the + * SET_IRQS path and reflected in the name pointer. This allows, + * for example, polling mode fallback for an exclusive IRQ failure. + */ + if (IS_ERR(vdev->irqs[index].name)) + return PTR_ERR(vdev->irqs[index].name); + switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) { case VFIO_IRQ_SET_ACTION_MASK: func =3D vfio_platform_set_irq_mask; @@ -280,7 +286,7 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_d= evice *vdev, =20 int vfio_platform_irq_init(struct vfio_platform_device *vdev) { - int cnt =3D 0, i; + int cnt =3D 0, i, ret =3D 0; =20 while (vdev->get_irq(vdev, cnt) >=3D 0) cnt++; @@ -292,29 +298,54 @@ int vfio_platform_irq_init(struct vfio_platform_devic= e *vdev) =20 for (i =3D 0; i < cnt; i++) { int hwirq =3D vdev->get_irq(vdev, i); + irq_handler_t handler =3D vfio_irq_handler; =20 - if (hwirq < 0) + if (hwirq < 0) { + ret =3D -EINVAL; goto err; + } =20 spin_lock_init(&vdev->irqs[i].lock); =20 vdev->irqs[i].flags =3D VFIO_IRQ_INFO_EVENTFD; =20 - if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK) + if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK) { vdev->irqs[i].flags |=3D VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED; + handler =3D vfio_automasked_irq_handler; + } =20 vdev->irqs[i].count =3D 1; vdev->irqs[i].hwirq =3D hwirq; vdev->irqs[i].masked =3D false; + vdev->irqs[i].name =3D kasprintf(GFP_KERNEL_ACCOUNT, + "vfio-irq[%d](%s)", hwirq, + vdev->name); + if (!vdev->irqs[i].name) { + ret =3D -ENOMEM; + goto err; + } + + ret =3D request_irq(hwirq, handler, IRQF_NO_AUTOEN, + vdev->irqs[i].name, &vdev->irqs[i]); + if (ret) { + kfree(vdev->irqs[i].name); + vdev->irqs[i].name =3D ERR_PTR(ret); + } } =20 vdev->num_irqs =3D cnt; =20 return 0; err: + for (--i; i >=3D 0; i--) { + if (!IS_ERR(vdev->irqs[i].name)) { + free_irq(vdev->irqs[i].hwirq, &vdev->irqs[i]); + kfree(vdev->irqs[i].name); + } + } kfree(vdev->irqs); - return -EINVAL; + return ret; } =20 void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev) @@ -324,7 +355,12 @@ void vfio_platform_irq_cleanup(struct vfio_platform_de= vice *vdev) for (i =3D 0; i < vdev->num_irqs; i++) { vfio_virqfd_disable(&vdev->irqs[i].mask); vfio_virqfd_disable(&vdev->irqs[i].unmask); - vfio_set_trigger(vdev, i, -1, NULL); + if (!IS_ERR(vdev->irqs[i].name)) { + free_irq(vdev->irqs[i].hwirq, &vdev->irqs[i]); + if (vdev->irqs[i].trigger) + eventfd_ctx_put(vdev->irqs[i].trigger); + kfree(vdev->irqs[i].name); + } } =20 vdev->num_irqs =3D 0; --=20 2.43.0