[PATCH v2] drm/prime: fix dangling dmabuf entries after handle release

w15303746062@163.com posted 1 patch 1 week, 4 days ago
drivers/gpu/drm/drm_prime.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
[PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
Posted by w15303746062@163.com 1 week, 4 days ago
From: Mingyu Wang <25181214217@stu.xidian.edu.cn>

When a GEM handle already exists in the drm_prime_file_private, repeated
calls to DRM_IOCTL_PRIME_HANDLE_TO_FD can cause drm_prime_add_buf_handle()
to insert multiple entries with the same handle into the handles rb_tree.
Because the insertion walk moves left on equality, duplicate keys are
structurally accepted by the tree.

Later, when the handle is released via drm_gem_release() ->
drm_gem_object_release_handle() -> drm_prime_remove_buf_handle(), the
latter iterates the handles tree, removes the first matching node, and
breaks out of the loop. Any remaining duplicate nodes that share the
same handle are left orphaned in the dmabufs tree - they are no longer
reachable through the handles tree and are never freed.

When the drm file is finally closed, drm_prime_destroy_file_private()
triggers:

	WARN_ON(!RB_EMPTY_ROOT(&prime_fpriv->dmabufs));

because the dmabufs tree is still non-empty. With CONFIG_PANIC_ON_WARN
this becomes a kernel panic:

	------------[ cut here ]------------
	WARNING: CPU: 0 PID: 19739 at drivers/gpu/drm/drm_prime.c:223 drm_prime_destroy_file_private+0x43/0x60
	...
	Kernel panic - not syncing: kernel: panic_on_warn set ...

Fix this by restarting the lookup from the root of the handles tree
after each successful removal, so that all duplicate nodes for the given
handle are erased. The caller (drm_gem_object_release_handle) already
holds prime_fpriv->lock, so this does not change the locking strategy.

Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
---
Changes in v2:
 - Drop the unnecessary mutex_lock addition, as the caller (drm_gem_object_release_handle) already holds the lock.
 - Rewrite the commit message to accurately reflect the root cause (duplicate handle insertions) rather than an assumed lack of synchronization.
 - Restart the rb_tree lookup from the root instead of breaking the loop to ensure all orphaned duplicate nodes are thoroughly removed.

 drivers/gpu/drm/drm_prime.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 9b44c78cd77f..dc28df1c6698 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -202,7 +202,10 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
 
 			dma_buf_put(member->dma_buf);
 			kfree(member);
-			break;
+			/* Duplicate handles may exist; restart search from root
+			 * to guarantee removal of all matching entries.
+			 */
+			rb = prime_fpriv->handles.rb_node;
 		} else if (member->handle < handle) {
 			rb = rb->rb_right;
 		} else {
-- 
2.34.1
Re: [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
Posted by Christian König 1 week, 4 days ago
On 5/28/26 15:29, w15303746062@163.com wrote:
> From: Mingyu Wang <25181214217@stu.xidian.edu.cn>
> 
> When a GEM handle already exists in the drm_prime_file_private, repeated
> calls to DRM_IOCTL_PRIME_HANDLE_TO_FD can cause drm_prime_add_buf_handle()
> to insert multiple entries with the same handle into the handles rb_tree.
> Because the insertion walk moves left on equality, duplicate keys are
> structurally accepted by the tree.

That should never happen and would be a major bug.

All callers should check if a handler exists before calling drm_prime_add_buf_handle().

How do you see that a handle is added twice?

Regards,
Christian.

> 
> Later, when the handle is released via drm_gem_release() ->
> drm_gem_object_release_handle() -> drm_prime_remove_buf_handle(), the
> latter iterates the handles tree, removes the first matching node, and
> breaks out of the loop. Any remaining duplicate nodes that share the
> same handle are left orphaned in the dmabufs tree - they are no longer
> reachable through the handles tree and are never freed.
> 
> When the drm file is finally closed, drm_prime_destroy_file_private()
> triggers:
> 
>         WARN_ON(!RB_EMPTY_ROOT(&prime_fpriv->dmabufs));
> 
> because the dmabufs tree is still non-empty. With CONFIG_PANIC_ON_WARN
> this becomes a kernel panic:
> 
>         ------------[ cut here ]------------
>         WARNING: CPU: 0 PID: 19739 at drivers/gpu/drm/drm_prime.c:223 drm_prime_destroy_file_private+0x43/0x60
>         ...
>         Kernel panic - not syncing: kernel: panic_on_warn set ...
> 
> Fix this by restarting the lookup from the root of the handles tree
> after each successful removal, so that all duplicate nodes for the given
> handle are erased. The caller (drm_gem_object_release_handle) already
> holds prime_fpriv->lock, so this does not change the locking strategy.
> 
> Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
> ---
> Changes in v2:
>  - Drop the unnecessary mutex_lock addition, as the caller (drm_gem_object_release_handle) already holds the lock.
>  - Rewrite the commit message to accurately reflect the root cause (duplicate handle insertions) rather than an assumed lack of synchronization.
>  - Restart the rb_tree lookup from the root instead of breaking the loop to ensure all orphaned duplicate nodes are thoroughly removed.
> 
>  drivers/gpu/drm/drm_prime.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 9b44c78cd77f..dc28df1c6698 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -202,7 +202,10 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
> 
>                         dma_buf_put(member->dma_buf);
>                         kfree(member);
> -                       break;
> +                       /* Duplicate handles may exist; restart search from root
> +                        * to guarantee removal of all matching entries.
> +                        */
> +                       rb = prime_fpriv->handles.rb_node;
>                 } else if (member->handle < handle) {
>                         rb = rb->rb_right;
>                 } else {
> --
> 2.34.1
>
Re:Re: [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
Posted by w15303746062 1 week, 4 days ago
Hi Christian,

Thank you for insisting on this. I've now gone through all callers
of drm_prime_add_buf_handle() in drm_prime.c.

You are absolutely right: both drm_gem_prime_fd_to_handle() and
drm_gem_prime_handle_to_dmabuf() perform the lookup under
prime_fpriv->lock before adding, so a duplicate handle should indeed
never be inserted through those paths.

That said, the syzkaller report clearly shows that the dmabufs tree
is not empty when drm_prime_destroy_file_private() runs, which means
some entry wasn't removed. Given that the normal add/remove paths
appear correct, the trigger might be something more subtle — perhaps
a driver-specific callback that bypasses the generic helpers, or an
error path that leaves an orphan in the dmabufs tree. I haven't been
able to identify the exact trigger yet.

The proposed change to drm_prime_remove_buf_handle() (restart search
instead of break) is intended as a small robustness improvement, not
a fix for a confirmed race. In the normal case it will still execute
only once, but if the trees ever become inconsistent for any reason,
it will clean up all entries for the given handle and prevent the
WARNING.

Would you be okay with such a defensive approach, or would you prefer
that we first track down the precise trigger (e.g. with additional
WARNs or tracing)?

Thanks,
Mingyu
Re: [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
Posted by Christian König 1 week, 3 days ago
Hi Mingyu,

On 5/28/26 15:49, w15303746062 wrote:
> Hi Christian,
> 
> Thank you for insisting on this. I've now gone through all callers
> of drm_prime_add_buf_handle() in drm_prime.c.
> 
> You are absolutely right: both drm_gem_prime_fd_to_handle() and
> drm_gem_prime_handle_to_dmabuf() perform the lookup under
> prime_fpriv->lock before adding, so a duplicate handle should indeed
> never be inserted through those paths.
> 
> That said, the syzkaller report clearly shows that the dmabufs tree
> is not empty when drm_prime_destroy_file_private() runs, which means
> some entry wasn't removed. Given that the normal add/remove paths
> appear correct, the trigger might be something more subtle — perhaps
> a driver-specific callback that bypasses the generic helpers, or an
> error path that leaves an orphan in the dmabufs tree. I haven't been
> able to identify the exact trigger yet.
> 
> The proposed change to drm_prime_remove_buf_handle() (restart search
> instead of break) is intended as a small robustness improvement, not
> a fix for a confirmed race. In the normal case it will still execute
> only once, but if the trees ever become inconsistent for any reason,
> it will clean up all entries for the given handle and prevent the
> WARNING.
> 
> Would you be okay with such a defensive approach, or would you prefer
> that we first track down the precise trigger (e.g. with additional
> WARNs or tracing)?

I don't think so. As far as I can see this is not a robustness improvement but just papering over an issue.

Leaking memory is usually only a very minor problem, things like use after free or random memory corruption is much more worse.

And such things is exactly what starts to happens when you start papering over issues.

So I would say find the root cause of what is going on here, you have certainly stumbled over something, and then we can look into how to fix that.

But just sending out random patches where a bit of simple code reading can prove them incorrect is not really helpful.

Regards,
Christian.

> 
> Thanks,
> Mingyu

Re:Re: [PATCH v2] drm/prime: fix dangling dmabuf entries after handle release
Posted by w15303746062 1 week, 3 days ago
Hi Christian,

Thank you for your guidance and patience throughout this discussion.

After further investigation, I realize that identifying the precise
root cause requires a deeper understanding of the DRM subsystem and
access to the specific syzkaller reproducer, which I currently lack.

To avoid wasting your time with incomplete patches, I'll step back
from this issue for now and continue learning the codebase. If I
manage to reproduce the problem locally or find more concrete
evidence, I'll follow up with a proper analysis.

Thank you again for the review and the valuable lessons.

Regards,
Mingyu