drivers/gpu/drm/virtio/virtgpu_drv.h | 1 + drivers/gpu/drm/virtio/virtgpu_gem.c | 17 +++++++++++++++++ drivers/gpu/drm/virtio/virtgpu_plane.c | 10 ++++++++-- 3 files changed, 26 insertions(+), 2 deletions(-)
virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock
the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and
ignore its return value. The function can fail with -EINTR from
dma_resv_lock_interruptible() (signal during lock wait) or with
-ENOMEM from dma_resv_reserve_fences() (fence slot allocation),
leaving the resv lock not held. The queue path then walks the object
array and calls dma_resv_add_fence(), which requires the lock held;
with lockdep enabled this trips dma_resv_assert_held():
WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840
Call Trace:
virtio_gpu_array_add_fence
virtio_gpu_queue_ctrl_sgs
virtio_gpu_queue_fenced_ctrl_buffer
virtio_gpu_cursor_plane_update
drm_atomic_helper_commit_planes
drm_atomic_helper_commit_tail
commit_tail
drm_atomic_helper_commit
drm_atomic_commit
drm_atomic_helper_update_plane
__setplane_atomic
drm_mode_cursor_universal
drm_mode_cursor_common
drm_mode_cursor_ioctl
drm_ioctl
__x64_sys_ioctl
Beyond the WARN, mutating the dma_resv fence list without the lock
races with concurrent readers/writers and can corrupt the list.
Both call sites run inside the .atomic_update plane callback, which
DRM atomic helpers do not allow to fail (by the time it runs, the
commit has been signed off to userspace and there is no clean
rollback path). Moving the lock acquisition to .prepare_fb was
rejected because the broader lock scope deadlocks against other BO
locking paths in the same atomic commit.
Introduce virtio_gpu_lock_one_resv_uninterruptible() that uses
dma_resv_lock() instead of dma_resv_lock_interruptible(). This
eliminates the -EINTR failure mode -- the realistic syzbot trigger
-- without extending the lock hold across the commit. The helper
locks a single BO and rejects nents > 1 with -EINVAL; both fix
sites lock exactly one BO.
Use it from virtio_gpu_cursor_plane_update() and
virtio_gpu_resource_flush(); check the return value to handle the
remaining -ENOMEM case from dma_resv_reserve_fences() by freeing
the objs and skipping the plane update for that frame. The
framebuffer BOs touched here are not shared with other contexts
and lock contention is expected to be brief, so the loss of
signal-interruptibility is acceptable.
Other callers of virtio_gpu_array_lock_resv() (the ioctl paths)
continue to use the interruptible variant.
The bug was reported by syzbot, triggered via fault injection
(fail_nth) on the DRM_IOCTL_MODE_CURSOR path, which forces the
-ENOMEM branch in dma_resv_reserve_fences().
Reported-by: syzbot+72bd3dd3a5d5f39a0271@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=72bd3dd3a5d5f39a0271
Fixes: 5cfd31c5b3a3 ("drm/virtio: fix virtio_gpu_cursor_plane_update().")
Cc: stable@vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
v4: Rename the helper to virtio_gpu_lock_one_resv_uninterruptible()
and reject objs->nents > 1 with -EINVAL. The v3 helper's
multi-object branch used drm_gem_lock_reservations(), which is
interruptible, contradicting the "uninterruptible" name; both
fix sites lock a single BO so the multi-object path is dropped.
(Dmitry Osipenko)
v3: Drop the prepare_fb/cleanup_fb approach from v2 (it deadlocked
against virtio_gpu_resource_flush(), which also locks the BO in
the same atomic commit). Instead add an uninterruptible variant
of the resv lock helper and use it in both
virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush().
(Dmitry Osipenko)
v2: Move resv lock acquisition from .atomic_update (which must not
fail) to .prepare_fb (which may), per maintainer review of v1.
The v1 approach of silently skipping the cursor update on lock
failure violated the atomic-commit contract with userspace.
---
drivers/gpu/drm/virtio/virtgpu_drv.h | 1 +
drivers/gpu/drm/virtio/virtgpu_gem.c | 17 +++++++++++++++++
drivers/gpu/drm/virtio/virtgpu_plane.c | 10 ++++++++--
3 files changed, 26 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index f17660a71a3e..2f3531950aa4 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -317,6 +317,7 @@ virtio_gpu_array_from_handles(struct drm_file *drm_file, u32 *handles, u32 nents
void virtio_gpu_array_add_obj(struct virtio_gpu_object_array *objs,
struct drm_gem_object *obj);
int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs);
+int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs);
void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs);
void virtio_gpu_array_add_fence(struct virtio_gpu_object_array *objs,
struct dma_fence *fence);
diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
index f22dc5c21cd4..435d37d36034 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -238,6 +238,23 @@ int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
return ret;
}
+int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs)
+{
+ int ret;
+
+ if (objs->nents != 1)
+ return -EINVAL;
+
+ dma_resv_lock(objs->objs[0]->resv, NULL);
+
+ ret = dma_resv_reserve_fences(objs->objs[0]->resv, 1);
+ if (ret) {
+ virtio_gpu_array_unlock_resv(objs);
+ return ret;
+ }
+ return 0;
+}
+
void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs)
{
if (objs->nents == 1) {
diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
index a126d1b25f46..652352424744 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -215,7 +215,10 @@ static void virtio_gpu_resource_flush(struct drm_plane *plane,
if (!objs)
return;
virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
- virtio_gpu_array_lock_resv(objs);
+ if (virtio_gpu_lock_one_resv_uninterruptible(objs)) {
+ virtio_gpu_array_put_free(objs);
+ return;
+ }
virtio_gpu_cmd_resource_flush(vgdev, bo->hw_res_handle, x, y,
width, height, objs,
vgplane_st->fence);
@@ -459,7 +462,10 @@ static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
if (!objs)
return;
virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
- virtio_gpu_array_lock_resv(objs);
+ if (virtio_gpu_lock_one_resv_uninterruptible(objs)) {
+ virtio_gpu_array_put_free(objs);
+ return;
+ }
virtio_gpu_cmd_transfer_to_host_2d
(vgdev, 0,
plane->state->crtc_w,
--
2.43.0
On 5/19/26 11:22, Deepanshu Kartikey wrote:
> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock
> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and
> ignore its return value. The function can fail with -EINTR from
> dma_resv_lock_interruptible() (signal during lock wait) or with
> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation),
> leaving the resv lock not held. The queue path then walks the object
> array and calls dma_resv_add_fence(), which requires the lock held;
> with lockdep enabled this trips dma_resv_assert_held():
>
> WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840
> Call Trace:
> virtio_gpu_array_add_fence
> virtio_gpu_queue_ctrl_sgs
> virtio_gpu_queue_fenced_ctrl_buffer
> virtio_gpu_cursor_plane_update
> drm_atomic_helper_commit_planes
> drm_atomic_helper_commit_tail
> commit_tail
> drm_atomic_helper_commit
> drm_atomic_commit
> drm_atomic_helper_update_plane
> __setplane_atomic
> drm_mode_cursor_universal
> drm_mode_cursor_common
> drm_mode_cursor_ioctl
> drm_ioctl
> __x64_sys_ioctl
>
> Beyond the WARN, mutating the dma_resv fence list without the lock
> races with concurrent readers/writers and can corrupt the list.
>
> Both call sites run inside the .atomic_update plane callback, which
> DRM atomic helpers do not allow to fail (by the time it runs, the
> commit has been signed off to userspace and there is no clean
> rollback path). Moving the lock acquisition to .prepare_fb was
> rejected because the broader lock scope deadlocks against other BO
> locking paths in the same atomic commit.
>
> Introduce virtio_gpu_lock_one_resv_uninterruptible() that uses
> dma_resv_lock() instead of dma_resv_lock_interruptible(). This
> eliminates the -EINTR failure mode -- the realistic syzbot trigger
> -- without extending the lock hold across the commit. The helper
> locks a single BO and rejects nents > 1 with -EINVAL; both fix
> sites lock exactly one BO.
>
> Use it from virtio_gpu_cursor_plane_update() and
> virtio_gpu_resource_flush(); check the return value to handle the
> remaining -ENOMEM case from dma_resv_reserve_fences() by freeing
> the objs and skipping the plane update for that frame. The
> framebuffer BOs touched here are not shared with other contexts
> and lock contention is expected to be brief, so the loss of
> signal-interruptibility is acceptable.
>
> Other callers of virtio_gpu_array_lock_resv() (the ioctl paths)
> continue to use the interruptible variant.
>
> The bug was reported by syzbot, triggered via fault injection
> (fail_nth) on the DRM_IOCTL_MODE_CURSOR path, which forces the
> -ENOMEM branch in dma_resv_reserve_fences().
>
> Reported-by: syzbot+72bd3dd3a5d5f39a0271@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=72bd3dd3a5d5f39a0271
> Fixes: 5cfd31c5b3a3 ("drm/virtio: fix virtio_gpu_cursor_plane_update().")
> Cc: stable@vger.kernel.org
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> ---
> v4: Rename the helper to virtio_gpu_lock_one_resv_uninterruptible()
> and reject objs->nents > 1 with -EINVAL. The v3 helper's
> multi-object branch used drm_gem_lock_reservations(), which is
> interruptible, contradicting the "uninterruptible" name; both
> fix sites lock a single BO so the multi-object path is dropped.
> (Dmitry Osipenko)
> v3: Drop the prepare_fb/cleanup_fb approach from v2 (it deadlocked
> against virtio_gpu_resource_flush(), which also locks the BO in
> the same atomic commit). Instead add an uninterruptible variant
> of the resv lock helper and use it in both
> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush().
> (Dmitry Osipenko)
> v2: Move resv lock acquisition from .atomic_update (which must not
> fail) to .prepare_fb (which may), per maintainer review of v1.
> The v1 approach of silently skipping the cursor update on lock
> failure violated the atomic-commit contract with userspace.
> ---
> drivers/gpu/drm/virtio/virtgpu_drv.h | 1 +
> drivers/gpu/drm/virtio/virtgpu_gem.c | 17 +++++++++++++++++
> drivers/gpu/drm/virtio/virtgpu_plane.c | 10 ++++++++--
> 3 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
> index f17660a71a3e..2f3531950aa4 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> @@ -317,6 +317,7 @@ virtio_gpu_array_from_handles(struct drm_file *drm_file, u32 *handles, u32 nents
> void virtio_gpu_array_add_obj(struct virtio_gpu_object_array *objs,
> struct drm_gem_object *obj);
> int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs);
> +int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs);
> void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs);
> void virtio_gpu_array_add_fence(struct virtio_gpu_object_array *objs,
> struct dma_fence *fence);
> diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
> index f22dc5c21cd4..435d37d36034 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_gem.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
> @@ -238,6 +238,23 @@ int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
> return ret;
> }
>
> +int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs)
> +{
> + int ret;
> +
> + if (objs->nents != 1)
> + return -EINVAL;
> +
> + dma_resv_lock(objs->objs[0]->resv, NULL);
> +
> + ret = dma_resv_reserve_fences(objs->objs[0]->resv, 1);
> + if (ret) {
> + virtio_gpu_array_unlock_resv(objs);
> + return ret;
> + }
> + return 0;
> +}
> +
> void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs)
> {
> if (objs->nents == 1) {
> diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
> index a126d1b25f46..652352424744 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_plane.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
> @@ -215,7 +215,10 @@ static void virtio_gpu_resource_flush(struct drm_plane *plane,
> if (!objs)
> return;
> virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
> - virtio_gpu_array_lock_resv(objs);
> + if (virtio_gpu_lock_one_resv_uninterruptible(objs)) {
> + virtio_gpu_array_put_free(objs);
> + return;
> + }
> virtio_gpu_cmd_resource_flush(vgdev, bo->hw_res_handle, x, y,
> width, height, objs,
> vgplane_st->fence);
> @@ -459,7 +462,10 @@ static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
> if (!objs)
> return;
> virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
> - virtio_gpu_array_lock_resv(objs);
> + if (virtio_gpu_lock_one_resv_uninterruptible(objs)) {
> + virtio_gpu_array_put_free(objs);
> + return;
> + }
> virtio_gpu_cmd_transfer_to_host_2d
> (vgdev, 0,
> plane->state->crtc_w,
Applied to misc-next, thanks
--
Best regards,
Dmitry
On 5/20/26 18:04, Dmitry Osipenko wrote:
> On 5/19/26 11:22, Deepanshu Kartikey wrote:
>> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock
>> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and
>> ignore its return value. The function can fail with -EINTR from
>> dma_resv_lock_interruptible() (signal during lock wait) or with
>> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation),
>> leaving the resv lock not held. The queue path then walks the object
>> array and calls dma_resv_add_fence(), which requires the lock held;
>> with lockdep enabled this trips dma_resv_assert_held():
>>
>> WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840
>> Call Trace:
>> virtio_gpu_array_add_fence
>> virtio_gpu_queue_ctrl_sgs
>> virtio_gpu_queue_fenced_ctrl_buffer
>> virtio_gpu_cursor_plane_update
>> drm_atomic_helper_commit_planes
>> drm_atomic_helper_commit_tail
>> commit_tail
>> drm_atomic_helper_commit
>> drm_atomic_commit
>> drm_atomic_helper_update_plane
>> __setplane_atomic
>> drm_mode_cursor_universal
>> drm_mode_cursor_common
>> drm_mode_cursor_ioctl
>> drm_ioctl
>> __x64_sys_ioctl
>>
>> Beyond the WARN, mutating the dma_resv fence list without the lock
>> races with concurrent readers/writers and can corrupt the list.
>>
>> Both call sites run inside the .atomic_update plane callback, which
>> DRM atomic helpers do not allow to fail (by the time it runs, the
>> commit has been signed off to userspace and there is no clean
>> rollback path). Moving the lock acquisition to .prepare_fb was
>> rejected because the broader lock scope deadlocks against other BO
>> locking paths in the same atomic commit.
>>
>> Introduce virtio_gpu_lock_one_resv_uninterruptible() that uses
>> dma_resv_lock() instead of dma_resv_lock_interruptible(). This
>> eliminates the -EINTR failure mode -- the realistic syzbot trigger
>> -- without extending the lock hold across the commit. The helper
>> locks a single BO and rejects nents > 1 with -EINVAL; both fix
>> sites lock exactly one BO.
>>
>> Use it from virtio_gpu_cursor_plane_update() and
>> virtio_gpu_resource_flush(); check the return value to handle the
>> remaining -ENOMEM case from dma_resv_reserve_fences() by freeing
>> the objs and skipping the plane update for that frame. The
>> framebuffer BOs touched here are not shared with other contexts
>> and lock contention is expected to be brief, so the loss of
>> signal-interruptibility is acceptable.
>>
>> Other callers of virtio_gpu_array_lock_resv() (the ioctl paths)
>> continue to use the interruptible variant.
>>
>> The bug was reported by syzbot, triggered via fault injection
>> (fail_nth) on the DRM_IOCTL_MODE_CURSOR path, which forces the
>> -ENOMEM branch in dma_resv_reserve_fences().
>>
>> Reported-by: syzbot+72bd3dd3a5d5f39a0271@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=72bd3dd3a5d5f39a0271
>> Fixes: 5cfd31c5b3a3 ("drm/virtio: fix virtio_gpu_cursor_plane_update().")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
>> ---
>> v4: Rename the helper to virtio_gpu_lock_one_resv_uninterruptible()
>> and reject objs->nents > 1 with -EINVAL. The v3 helper's
>> multi-object branch used drm_gem_lock_reservations(), which is
>> interruptible, contradicting the "uninterruptible" name; both
>> fix sites lock a single BO so the multi-object path is dropped.
>> (Dmitry Osipenko)
>> v3: Drop the prepare_fb/cleanup_fb approach from v2 (it deadlocked
>> against virtio_gpu_resource_flush(), which also locks the BO in
>> the same atomic commit). Instead add an uninterruptible variant
>> of the resv lock helper and use it in both
>> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush().
>> (Dmitry Osipenko)
>> v2: Move resv lock acquisition from .atomic_update (which must not
>> fail) to .prepare_fb (which may), per maintainer review of v1.
>> The v1 approach of silently skipping the cursor update on lock
>> failure violated the atomic-commit contract with userspace.
>> ---
>> drivers/gpu/drm/virtio/virtgpu_drv.h | 1 +
>> drivers/gpu/drm/virtio/virtgpu_gem.c | 17 +++++++++++++++++
>> drivers/gpu/drm/virtio/virtgpu_plane.c | 10 ++++++++--
>> 3 files changed, 26 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
>> index f17660a71a3e..2f3531950aa4 100644
>> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
>> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
>> @@ -317,6 +317,7 @@ virtio_gpu_array_from_handles(struct drm_file *drm_file, u32 *handles, u32 nents
>> void virtio_gpu_array_add_obj(struct virtio_gpu_object_array *objs,
>> struct drm_gem_object *obj);
>> int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs);
>> +int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs);
>> void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs);
>> void virtio_gpu_array_add_fence(struct virtio_gpu_object_array *objs,
>> struct dma_fence *fence);
>> diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
>> index f22dc5c21cd4..435d37d36034 100644
>> --- a/drivers/gpu/drm/virtio/virtgpu_gem.c
>> +++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
>> @@ -238,6 +238,23 @@ int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
>> return ret;
>> }
>>
>> +int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs)
>> +{
>> + int ret;
>> +
>> + if (objs->nents != 1)
>> + return -EINVAL;
>> +
>> + dma_resv_lock(objs->objs[0]->resv, NULL);
>> +
>> + ret = dma_resv_reserve_fences(objs->objs[0]->resv, 1);
>> + if (ret) {
>> + virtio_gpu_array_unlock_resv(objs);
>> + return ret;
>> + }
>> + return 0;
>> +}
>> +
>> void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs)
>> {
>> if (objs->nents == 1) {
>> diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
>> index a126d1b25f46..652352424744 100644
>> --- a/drivers/gpu/drm/virtio/virtgpu_plane.c
>> +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
>> @@ -215,7 +215,10 @@ static void virtio_gpu_resource_flush(struct drm_plane *plane,
>> if (!objs)
>> return;
>> virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
>> - virtio_gpu_array_lock_resv(objs);
>> + if (virtio_gpu_lock_one_resv_uninterruptible(objs)) {
>> + virtio_gpu_array_put_free(objs);
>> + return;
>> + }
>> virtio_gpu_cmd_resource_flush(vgdev, bo->hw_res_handle, x, y,
>> width, height, objs,
>> vgplane_st->fence);
>> @@ -459,7 +462,10 @@ static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
>> if (!objs)
>> return;
>> virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
>> - virtio_gpu_array_lock_resv(objs);
>> + if (virtio_gpu_lock_one_resv_uninterruptible(objs)) {
>> + virtio_gpu_array_put_free(objs);
>> + return;
>> + }
>> virtio_gpu_cmd_transfer_to_host_2d
>> (vgdev, 0,
>> plane->state->crtc_w,
>
> Applied to misc-next, thanks
Realized patche should go to -fixes, applied to misc-fixes too
--
Best regards,
Dmitry
On 5/19/26 10:22, Deepanshu Kartikey wrote:
> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock
> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and
> ignore its return value. The function can fail with -EINTR from
> dma_resv_lock_interruptible() (signal during lock wait) or with
> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation),
> leaving the resv lock not held. The queue path then walks the object
> array and calls dma_resv_add_fence(), which requires the lock held;
> with lockdep enabled this trips dma_resv_assert_held():
>
> WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840
> Call Trace:
> virtio_gpu_array_add_fence
> virtio_gpu_queue_ctrl_sgs
> virtio_gpu_queue_fenced_ctrl_buffer
> virtio_gpu_cursor_plane_update
> drm_atomic_helper_commit_planes
> drm_atomic_helper_commit_tail
> commit_tail
> drm_atomic_helper_commit
> drm_atomic_commit
> drm_atomic_helper_update_plane
> __setplane_atomic
> drm_mode_cursor_universal
> drm_mode_cursor_common
> drm_mode_cursor_ioctl
> drm_ioctl
> __x64_sys_ioctl
>
> Beyond the WARN, mutating the dma_resv fence list without the lock
> races with concurrent readers/writers and can corrupt the list.
Well why are you trying to add a fence on an atomic mode set in the first place?
That is usually an illegal operation here.
Regards,
Christian.
>
> Both call sites run inside the .atomic_update plane callback, which
> DRM atomic helpers do not allow to fail (by the time it runs, the
> commit has been signed off to userspace and there is no clean
> rollback path). Moving the lock acquisition to .prepare_fb was
> rejected because the broader lock scope deadlocks against other BO
> locking paths in the same atomic commit.
>
> Introduce virtio_gpu_lock_one_resv_uninterruptible() that uses
> dma_resv_lock() instead of dma_resv_lock_interruptible(). This
> eliminates the -EINTR failure mode -- the realistic syzbot trigger
> -- without extending the lock hold across the commit. The helper
> locks a single BO and rejects nents > 1 with -EINVAL; both fix
> sites lock exactly one BO.
>
> Use it from virtio_gpu_cursor_plane_update() and
> virtio_gpu_resource_flush(); check the return value to handle the
> remaining -ENOMEM case from dma_resv_reserve_fences() by freeing
> the objs and skipping the plane update for that frame. The
> framebuffer BOs touched here are not shared with other contexts
> and lock contention is expected to be brief, so the loss of
> signal-interruptibility is acceptable.
>
> Other callers of virtio_gpu_array_lock_resv() (the ioctl paths)
> continue to use the interruptible variant.
>
> The bug was reported by syzbot, triggered via fault injection
> (fail_nth) on the DRM_IOCTL_MODE_CURSOR path, which forces the
> -ENOMEM branch in dma_resv_reserve_fences().
>
> Reported-by: syzbot+72bd3dd3a5d5f39a0271@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=72bd3dd3a5d5f39a0271
> Fixes: 5cfd31c5b3a3 ("drm/virtio: fix virtio_gpu_cursor_plane_update().")
> Cc: stable@vger.kernel.org
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> ---
> v4: Rename the helper to virtio_gpu_lock_one_resv_uninterruptible()
> and reject objs->nents > 1 with -EINVAL. The v3 helper's
> multi-object branch used drm_gem_lock_reservations(), which is
> interruptible, contradicting the "uninterruptible" name; both
> fix sites lock a single BO so the multi-object path is dropped.
> (Dmitry Osipenko)
> v3: Drop the prepare_fb/cleanup_fb approach from v2 (it deadlocked
> against virtio_gpu_resource_flush(), which also locks the BO in
> the same atomic commit). Instead add an uninterruptible variant
> of the resv lock helper and use it in both
> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush().
> (Dmitry Osipenko)
> v2: Move resv lock acquisition from .atomic_update (which must not
> fail) to .prepare_fb (which may), per maintainer review of v1.
> The v1 approach of silently skipping the cursor update on lock
> failure violated the atomic-commit contract with userspace.
> ---
> drivers/gpu/drm/virtio/virtgpu_drv.h | 1 +
> drivers/gpu/drm/virtio/virtgpu_gem.c | 17 +++++++++++++++++
> drivers/gpu/drm/virtio/virtgpu_plane.c | 10 ++++++++--
> 3 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
> index f17660a71a3e..2f3531950aa4 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> @@ -317,6 +317,7 @@ virtio_gpu_array_from_handles(struct drm_file *drm_file, u32 *handles, u32 nents
> void virtio_gpu_array_add_obj(struct virtio_gpu_object_array *objs,
> struct drm_gem_object *obj);
> int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs);
> +int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs);
> void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs);
> void virtio_gpu_array_add_fence(struct virtio_gpu_object_array *objs,
> struct dma_fence *fence);
> diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
> index f22dc5c21cd4..435d37d36034 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_gem.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
> @@ -238,6 +238,23 @@ int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
> return ret;
> }
>
> +int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs)
> +{
> + int ret;
> +
> + if (objs->nents != 1)
> + return -EINVAL;
> +
> + dma_resv_lock(objs->objs[0]->resv, NULL);
> +
> + ret = dma_resv_reserve_fences(objs->objs[0]->resv, 1);
> + if (ret) {
> + virtio_gpu_array_unlock_resv(objs);
> + return ret;
> + }
> + return 0;
> +}
> +
> void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs)
> {
> if (objs->nents == 1) {
> diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
> index a126d1b25f46..652352424744 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_plane.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
> @@ -215,7 +215,10 @@ static void virtio_gpu_resource_flush(struct drm_plane *plane,
> if (!objs)
> return;
> virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
> - virtio_gpu_array_lock_resv(objs);
> + if (virtio_gpu_lock_one_resv_uninterruptible(objs)) {
> + virtio_gpu_array_put_free(objs);
> + return;
> + }
> virtio_gpu_cmd_resource_flush(vgdev, bo->hw_res_handle, x, y,
> width, height, objs,
> vgplane_st->fence);
> @@ -459,7 +462,10 @@ static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
> if (!objs)
> return;
> virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
> - virtio_gpu_array_lock_resv(objs);
> + if (virtio_gpu_lock_one_resv_uninterruptible(objs)) {
> + virtio_gpu_array_put_free(objs);
> + return;
> + }
> virtio_gpu_cmd_transfer_to_host_2d
> (vgdev, 0,
> plane->state->crtc_w,
On 5/19/26 11:27, Christian König wrote: > On 5/19/26 10:22, Deepanshu Kartikey wrote: >> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock >> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and >> ignore its return value. The function can fail with -EINTR from >> dma_resv_lock_interruptible() (signal during lock wait) or with >> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation), >> leaving the resv lock not held. The queue path then walks the object >> array and calls dma_resv_add_fence(), which requires the lock held; >> with lockdep enabled this trips dma_resv_assert_held(): >> >> WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840 >> Call Trace: >> virtio_gpu_array_add_fence >> virtio_gpu_queue_ctrl_sgs >> virtio_gpu_queue_fenced_ctrl_buffer >> virtio_gpu_cursor_plane_update >> drm_atomic_helper_commit_planes >> drm_atomic_helper_commit_tail >> commit_tail >> drm_atomic_helper_commit >> drm_atomic_commit >> drm_atomic_helper_update_plane >> __setplane_atomic >> drm_mode_cursor_universal >> drm_mode_cursor_common >> drm_mode_cursor_ioctl >> drm_ioctl >> __x64_sys_ioctl >> >> Beyond the WARN, mutating the dma_resv fence list without the lock >> races with concurrent readers/writers and can corrupt the list. > > Well why are you trying to add a fence on an atomic mode set in the first place? > > That is usually an illegal operation here. That is pre-existing in the driver. It performs draw operation and in some cases waits for the completion during atomic. Whether all that syncing is correct is hard to say immediately as some of it may be historical edge cases. -- Best regards, Dmitry
On 5/20/26 08:50, Dmitry Osipenko wrote: > On 5/19/26 11:27, Christian König wrote: >> On 5/19/26 10:22, Deepanshu Kartikey wrote: >>> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock >>> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and >>> ignore its return value. The function can fail with -EINTR from >>> dma_resv_lock_interruptible() (signal during lock wait) or with >>> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation), >>> leaving the resv lock not held. The queue path then walks the object >>> array and calls dma_resv_add_fence(), which requires the lock held; >>> with lockdep enabled this trips dma_resv_assert_held(): >>> >>> WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840 >>> Call Trace: >>> virtio_gpu_array_add_fence >>> virtio_gpu_queue_ctrl_sgs >>> virtio_gpu_queue_fenced_ctrl_buffer >>> virtio_gpu_cursor_plane_update >>> drm_atomic_helper_commit_planes >>> drm_atomic_helper_commit_tail >>> commit_tail >>> drm_atomic_helper_commit >>> drm_atomic_commit >>> drm_atomic_helper_update_plane >>> __setplane_atomic >>> drm_mode_cursor_universal >>> drm_mode_cursor_common >>> drm_mode_cursor_ioctl >>> drm_ioctl >>> __x64_sys_ioctl >>> >>> Beyond the WARN, mutating the dma_resv fence list without the lock >>> races with concurrent readers/writers and can corrupt the list. >> >> Well why are you trying to add a fence on an atomic mode set in the first place? >> >> That is usually an illegal operation here. > That is pre-existing in the driver. It performs draw operation and in > some cases waits for the completion during atomic. Whether all that > syncing is correct is hard to say immediately as some of it may be > historical edge cases. I'm not not so deeply in the atomic mode setting stuff but it strongly sounds like that this is seriously broken. The background is that the atomic mode set framework allows an output dma_fence which is signaled when the commit is finished. So when you allocate a fence slot and add a new fence to finish the atomic commit it is trivially possible that this cycles back and waits for the atomic commit to finish. In other words you have a deadlock. You probably need specially crafted userspace with the right timing to trigger that, but such issues are usually a rather big no-no and need to be fixed in the long term. Try to add dma_fence_begin_signaling() and dma_fence_end_signaling() annotation and enable lockdep, the tool should be able to point out if and what exactly goes wrong. The usual fix is to prepare everything before commit_tail is called (alloc memory, create, reserve slot, add dma_fence etc....) and then just send out the prepared commands later on. Regards, Christian.
On 5/20/26 10:05, Christian König wrote: > On 5/20/26 08:50, Dmitry Osipenko wrote: >> On 5/19/26 11:27, Christian König wrote: >>> On 5/19/26 10:22, Deepanshu Kartikey wrote: >>>> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock >>>> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and >>>> ignore its return value. The function can fail with -EINTR from >>>> dma_resv_lock_interruptible() (signal during lock wait) or with >>>> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation), >>>> leaving the resv lock not held. The queue path then walks the object >>>> array and calls dma_resv_add_fence(), which requires the lock held; >>>> with lockdep enabled this trips dma_resv_assert_held(): >>>> >>>> WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840 >>>> Call Trace: >>>> virtio_gpu_array_add_fence >>>> virtio_gpu_queue_ctrl_sgs >>>> virtio_gpu_queue_fenced_ctrl_buffer >>>> virtio_gpu_cursor_plane_update >>>> drm_atomic_helper_commit_planes >>>> drm_atomic_helper_commit_tail >>>> commit_tail >>>> drm_atomic_helper_commit >>>> drm_atomic_commit >>>> drm_atomic_helper_update_plane >>>> __setplane_atomic >>>> drm_mode_cursor_universal >>>> drm_mode_cursor_common >>>> drm_mode_cursor_ioctl >>>> drm_ioctl >>>> __x64_sys_ioctl >>>> >>>> Beyond the WARN, mutating the dma_resv fence list without the lock >>>> races with concurrent readers/writers and can corrupt the list. >>> >>> Well why are you trying to add a fence on an atomic mode set in the first place? >>> >>> That is usually an illegal operation here. >> That is pre-existing in the driver. It performs draw operation and in >> some cases waits for the completion during atomic. Whether all that >> syncing is correct is hard to say immediately as some of it may be >> historical edge cases. > > I'm not not so deeply in the atomic mode setting stuff but it strongly sounds like that this is seriously broken. > > The background is that the atomic mode set framework allows an output dma_fence which is signaled when the commit is finished. > > So when you allocate a fence slot and add a new fence to finish the atomic commit it is trivially possible that this cycles back and waits for the atomic commit to finish. In other words you have a deadlock. > > You probably need specially crafted userspace with the right timing to trigger that, but such issues are usually a rather big no-no and need to be fixed in the long term. > > Try to add dma_fence_begin_signaling() and dma_fence_end_signaling() annotation and enable lockdep, the tool should be able to point out if and what exactly goes wrong. > > The usual fix is to prepare everything before commit_tail is called (alloc memory, create, reserve slot, add dma_fence etc....) and then just send out the prepared commands later on. We tried with moving resv alloc to prepare_fb() in a previous patch version, it resulted in a non-trivial deadlocks. The goal of this patch is to fix immediate problem with a minimal code change. What you're saying is correct, but it may require a rather big refactoring of the code. In general, everything works okay today, so not really an urgent problem. -- Best regards, Dmitry
On 5/20/26 10:12, Dmitry Osipenko wrote: > On 5/20/26 10:05, Christian König wrote: >> On 5/20/26 08:50, Dmitry Osipenko wrote: >>> On 5/19/26 11:27, Christian König wrote: >>>> On 5/19/26 10:22, Deepanshu Kartikey wrote: >>>>> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock >>>>> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and >>>>> ignore its return value. The function can fail with -EINTR from >>>>> dma_resv_lock_interruptible() (signal during lock wait) or with >>>>> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation), >>>>> leaving the resv lock not held. The queue path then walks the object >>>>> array and calls dma_resv_add_fence(), which requires the lock held; >>>>> with lockdep enabled this trips dma_resv_assert_held(): >>>>> >>>>> WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840 >>>>> Call Trace: >>>>> virtio_gpu_array_add_fence >>>>> virtio_gpu_queue_ctrl_sgs >>>>> virtio_gpu_queue_fenced_ctrl_buffer >>>>> virtio_gpu_cursor_plane_update >>>>> drm_atomic_helper_commit_planes >>>>> drm_atomic_helper_commit_tail >>>>> commit_tail >>>>> drm_atomic_helper_commit >>>>> drm_atomic_commit >>>>> drm_atomic_helper_update_plane >>>>> __setplane_atomic >>>>> drm_mode_cursor_universal >>>>> drm_mode_cursor_common >>>>> drm_mode_cursor_ioctl >>>>> drm_ioctl >>>>> __x64_sys_ioctl >>>>> >>>>> Beyond the WARN, mutating the dma_resv fence list without the lock >>>>> races with concurrent readers/writers and can corrupt the list. >>>> >>>> Well why are you trying to add a fence on an atomic mode set in the first place? >>>> >>>> That is usually an illegal operation here. >>> That is pre-existing in the driver. It performs draw operation and in >>> some cases waits for the completion during atomic. Whether all that >>> syncing is correct is hard to say immediately as some of it may be >>> historical edge cases. >> >> I'm not not so deeply in the atomic mode setting stuff but it strongly sounds like that this is seriously broken. >> >> The background is that the atomic mode set framework allows an output dma_fence which is signaled when the commit is finished. >> >> So when you allocate a fence slot and add a new fence to finish the atomic commit it is trivially possible that this cycles back and waits for the atomic commit to finish. In other words you have a deadlock. >> >> You probably need specially crafted userspace with the right timing to trigger that, but such issues are usually a rather big no-no and need to be fixed in the long term. >> >> Try to add dma_fence_begin_signaling() and dma_fence_end_signaling() annotation and enable lockdep, the tool should be able to point out if and what exactly goes wrong. >> >> The usual fix is to prepare everything before commit_tail is called (alloc memory, create, reserve slot, add dma_fence etc....) and then just send out the prepared commands later on. > > We tried with moving resv alloc to prepare_fb() in a previous patch > version, it resulted in a non-trivial deadlocks. The goal of this patch > is to fix immediate problem with a minimal code change. Yeah, totally fine with me to get that fixed first. > What you're saying is correct, but it may require a rather big > refactoring of the code. In general, everything works okay today, so not > really an urgent problem. It's just a potential issue and when the AI bots keep evolving like they already do they will sooner or later start to point that out as well. Regards, Christian.
20.05.2026 14:00, Christian König пишет: > On 5/20/26 10:12, Dmitry Osipenko wrote: >> On 5/20/26 10:05, Christian König wrote: >>> On 5/20/26 08:50, Dmitry Osipenko wrote: >>>> On 5/19/26 11:27, Christian König wrote: >>>>> On 5/19/26 10:22, Deepanshu Kartikey wrote: >>>>>> virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock >>>>>> the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and >>>>>> ignore its return value. The function can fail with -EINTR from >>>>>> dma_resv_lock_interruptible() (signal during lock wait) or with >>>>>> -ENOMEM from dma_resv_reserve_fences() (fence slot allocation), >>>>>> leaving the resv lock not held. The queue path then walks the object >>>>>> array and calls dma_resv_add_fence(), which requires the lock held; >>>>>> with lockdep enabled this trips dma_resv_assert_held(): >>>>>> >>>>>> WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840 >>>>>> Call Trace: >>>>>> virtio_gpu_array_add_fence >>>>>> virtio_gpu_queue_ctrl_sgs >>>>>> virtio_gpu_queue_fenced_ctrl_buffer >>>>>> virtio_gpu_cursor_plane_update >>>>>> drm_atomic_helper_commit_planes >>>>>> drm_atomic_helper_commit_tail >>>>>> commit_tail >>>>>> drm_atomic_helper_commit >>>>>> drm_atomic_commit >>>>>> drm_atomic_helper_update_plane >>>>>> __setplane_atomic >>>>>> drm_mode_cursor_universal >>>>>> drm_mode_cursor_common >>>>>> drm_mode_cursor_ioctl >>>>>> drm_ioctl >>>>>> __x64_sys_ioctl >>>>>> >>>>>> Beyond the WARN, mutating the dma_resv fence list without the lock >>>>>> races with concurrent readers/writers and can corrupt the list. >>>>> >>>>> Well why are you trying to add a fence on an atomic mode set in the first place? >>>>> >>>>> That is usually an illegal operation here. >>>> That is pre-existing in the driver. It performs draw operation and in >>>> some cases waits for the completion during atomic. Whether all that >>>> syncing is correct is hard to say immediately as some of it may be >>>> historical edge cases. >>> >>> I'm not not so deeply in the atomic mode setting stuff but it strongly sounds like that this is seriously broken. >>> >>> The background is that the atomic mode set framework allows an output dma_fence which is signaled when the commit is finished. >>> >>> So when you allocate a fence slot and add a new fence to finish the atomic commit it is trivially possible that this cycles back and waits for the atomic commit to finish. In other words you have a deadlock. >>> >>> You probably need specially crafted userspace with the right timing to trigger that, but such issues are usually a rather big no-no and need to be fixed in the long term. >>> >>> Try to add dma_fence_begin_signaling() and dma_fence_end_signaling() annotation and enable lockdep, the tool should be able to point out if and what exactly goes wrong. >>> >>> The usual fix is to prepare everything before commit_tail is called (alloc memory, create, reserve slot, add dma_fence etc....) and then just send out the prepared commands later on. >> >> We tried with moving resv alloc to prepare_fb() in a previous patch >> version, it resulted in a non-trivial deadlocks. The goal of this patch >> is to fix immediate problem with a minimal code change. > > Yeah, totally fine with me to get that fixed first. > >> What you're saying is correct, but it may require a rather big >> refactoring of the code. In general, everything works okay today, so not >> really an urgent problem. > > It's just a potential issue and when the AI bots keep evolving like they already do they will sooner or later start to point that out as well. Don't mind if bots will produce something useful -- Best regards, Dmitry
© 2016 - 2026 Red Hat, Inc.