[PATCH 6.18.y 0/5] drm/vkms: Backport generic vblank timer to fix ABBA deadlock

w15303746062@163.com posted 5 patches 2 weeks ago
There is a newer version of this series
Documentation/gpu/drm-kms-helpers.rst    |  12 ++
drivers/gpu/drm/Makefile                 |   3 +-
drivers/gpu/drm/drm_atomic_helper.c      |   2 +-
drivers/gpu/drm/drm_vblank.c             | 172 +++++++++++++++++++++-
drivers/gpu/drm/drm_vblank_helper.c      | 176 +++++++++++++++++++++++
drivers/gpu/drm/vkms/vkms_crtc.c         |  83 +----------
drivers/gpu/drm/vkms/vkms_drv.h          |   2 -
include/drm/drm_modeset_helper_vtables.h |  12 ++
include/drm/drm_vblank.h                 |  32 +++++
include/drm/drm_vblank_helper.h          |  56 ++++++++
10 files changed, 468 insertions(+), 82 deletions(-)
create mode 100644 drivers/gpu/drm/drm_vblank_helper.c
create mode 100644 include/drm/drm_vblank_helper.h
[PATCH 6.18.y 0/5] drm/vkms: Backport generic vblank timer to fix ABBA deadlock
Posted by w15303746062@163.com 2 weeks ago
From: Mingyu Wang <25181214217@stu.xidian.edu.cn>

Hi Greg and all,

This patch series backports the generic DRM vblank timer infrastructure
and converts the vkms driver to use it, specifically targeting the 
6.18.y stable branch.

During local fuzzing with Syzkaller, an RCU preempt stall (ABBA deadlock)
was consistently observed in the 6.18.y vkms driver. This deadlock occurs
between the legacy drm_vblank_disable_and_save() function and the
vkms_vblank_simulate() hrtimer callback. 

A previous localized patch was submitted to address this in 6.18.y using
hrtimer_try_to_cancel. However, as discussed with Greg KH and Maarten
Lankhorst on the mailing list, the correct and most maintainable approach
is to backport the mainline commits that inherently resolve this by
removing the custom vkms hrtimer entirely.

Following Maarten's roadmap, this series cherry-picks the exact
dependency chain from mainline to introduce the drm_vblank_helper
infrastructure and migrate vkms to it. 

The series applies smoothly to 6.18.y and completely resolves the soft
lockup in the fuzzing environment.

Thanks,
Mingyu Wang

Thomas Zimmermann (5):
  drm/vblank: Add vblank timer
  drm/vblank: Add CRTC helpers for simple use cases
  drm/vkms: Convert to DRM's vblank timer
  drm/atomic: Increase timeout in drm_atomic_helper_wait_for_vblanks()
  drm/vblank: Fix kernel docs for vblank timer

 Documentation/gpu/drm-kms-helpers.rst    |  12 ++
 drivers/gpu/drm/Makefile                 |   3 +-
 drivers/gpu/drm/drm_atomic_helper.c      |   2 +-
 drivers/gpu/drm/drm_vblank.c             | 172 +++++++++++++++++++++-
 drivers/gpu/drm/drm_vblank_helper.c      | 176 +++++++++++++++++++++++
 drivers/gpu/drm/vkms/vkms_crtc.c         |  83 +----------
 drivers/gpu/drm/vkms/vkms_drv.h          |   2 -
 include/drm/drm_modeset_helper_vtables.h |  12 ++
 include/drm/drm_vblank.h                 |  32 +++++
 include/drm/drm_vblank_helper.h          |  56 ++++++++
 10 files changed, 468 insertions(+), 82 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_vblank_helper.c
 create mode 100644 include/drm/drm_vblank_helper.h

-- 
2.34.1
Re: [PATCH 6.18.y 0/5] drm/vkms: Backport generic vblank timer to fix ABBA deadlock
Posted by Sasha Levin 1 week, 6 days ago
> [PATCH 6.18.y 0/5] drm/vkms: Backport generic vblank timer to fix ABBA deadlock
>
> 1/5 drm/vblank: Add vblank timer (74afeb812850)
> 2/5 drm/vblank: Add CRTC helpers for simple use cases (d54dbb5963bd)
> 3/5 drm/vkms: Convert to DRM's vblank timer (02e2681ffe1a)
> 4/5 drm/atomic: Increase timeout in drm_atomic_helper_wait_for_vblanks() (79ae8510b5b8)
> 5/5 drm/vblank: Fix kernel docs for vblank timer (3946d3ba9934)

Thanks for putting this together.

Looking at the five commits:

  - 1/5 (74afeb812850) is the one that actually fixes the ABBA
    deadlock you observed under Syzkaller; it adds the generic vblank
    timer that replaces the open-coded vkms hrtimer path.

  - 2/5 (d54dbb5963bd) adds new CRTC helpers for "simple use cases".
    No Fixes:/Cc:stable, no described bug.

  - 3/5 (02e2681ffe1a) is a refactor that converts vkms to the new
    helpers. No Fixes:/Cc:stable, no described bug.

  - 4/5 (79ae8510b5b8) is a v7.1-rc1 timeout bump that depends on 1/5.
    It is not yet in any released stable, so applying it to 6.18.y
    would put it on an LTS before any LTS contains it.

  - 5/5 (3946d3ba9934) is a doc fix for 1/5.

Per stable-kernel-rules, what I need to queue is the minimum set that
fixes the bug. Could you explain, per patch, why 2/5..5/5 are required
to make 1/5 work / are required to actually fix the deadlock? If only
1/5 is needed, please resend just that one with your Signed-off-by
added (the carried patches today only have Thomas's S-o-b, which
breaks the chain of custody on a stable submission).

--
Thanks,
Sasha
Re:Re: [PATCH 6.18.y 0/5] drm/vkms: Backport generic vblank timer to fix ABBA deadlock
Posted by w15303746062 1 week, 6 days ago
Hi Sasha,

>Looking at the five commits:
>
>  - 1/5 (74afeb812850) is the one that actually fixes the ABBA
>    deadlock you observed under Syzkaller; it adds the generic vblank
>    timer that replaces the open-coded vkms hrtimer path.
>
>  - 2/5 (d54dbb5963bd) adds new CRTC helpers for "simple use cases".
>    No Fixes:/Cc:stable, no described bug.
>
>  - 3/5 (02e2681ffe1a) is a refactor that converts vkms to the new
>    helpers. No Fixes:/Cc:stable, no described bug.
>
>  - 4/5 (79ae8510b5b8) is a v7.1-rc1 timeout bump that depends on 1/5.
>    It is not yet in any released stable, so applying it to 6.18.y
>    would put it on an LTS before any LTS contains it.
>
>  - 5/5 (3946d3ba9934) is a doc fix for 1/5.
>
>Per stable-kernel-rules, what I need to queue is the minimum set that
>fixes the bug. Could you explain, per patch, why 2/5..5/5 are required
>to make 1/5 work / are required to actually fix the deadlock? If only
>1/5 is needed, please resend just that one with your Signed-off-by
>added (the carried patches today only have Thomas's S-o-b, which
>breaks the chain of custody on a stable submission).

Thanks for the quick review and for pointing out the missing Signed-off-by. I apologize for that omission; it was my mistake during the cherry-pick process.

Regarding the dependency chain, I would like to clarify why commit 1/5 alone cannot fix the issue:

Commits 1/5 and 2/5 introduce the new generic vblank timer infrastructure to the DRM core but do *not* touch the vkms driver at all. 
Commit 3/5 (02e2681ffe1a) is the actual fix that modifies `drivers/gpu/drm/vkms/vkms_crtc.c`. It removes the buggy open-coded hrtimer that causes the ABBA deadlock and switches vkms to use the new infrastructure introduced in 1/5 and 2/5. 

Therefore, 1/5, 2/5, and 3/5 form an indivisible set. Applying only 1/5 would leave the deadlock in vkms completely unpatched.

As for 4/5 and 5/5 (the timeout bump and doc fix), Maarten Lankhorst (DRM maintainer) explicitly recommended pulling in this exact 5-commit list as the proper upstream fix for this specific vkms issue (see the mailing list link in this thread). 

However, if you feel 4/5 and 5/5 introduce unnecessary risk for the 6.18.y stable tree, I can absolutely drop them and only submit 1/5, 2/5, and 3/5. 

I am preparing a v2 patch series now with my Signed-off-by added to the chain of custody. Could you let me know if you prefer the full 5-patch series as recommended by DRM maintainers, or just the minimal 3-patch series?

Best regards,
Mingyu
Re: [PATCH 6.18.y 0/5] drm/vkms: Backport generic vblank timer to fix ABBA deadlock
Posted by Maarten Lankhorst 1 week, 6 days ago
Hello,

Den 2026-05-26 kl. 14:06, skrev w15303746062:
> 
> Hi Sasha,
> 
>> Looking at the five commits:
>>
>>  - 1/5 (74afeb812850) is the one that actually fixes the ABBA
>>    deadlock you observed under Syzkaller; it adds the generic vblank
>>    timer that replaces the open-coded vkms hrtimer path.
>>
>>  - 2/5 (d54dbb5963bd) adds new CRTC helpers for "simple use cases".
>>    No Fixes:/Cc:stable, no described bug.
>>
>>  - 3/5 (02e2681ffe1a) is a refactor that converts vkms to the new
>>    helpers. No Fixes:/Cc:stable, no described bug.
>>
>>  - 4/5 (79ae8510b5b8) is a v7.1-rc1 timeout bump that depends on 1/5.
>>    It is not yet in any released stable, so applying it to 6.18.y
>>    would put it on an LTS before any LTS contains it.
>>
>>  - 5/5 (3946d3ba9934) is a doc fix for 1/5.
>>
>> Per stable-kernel-rules, what I need to queue is the minimum set that
>> fixes the bug. Could you explain, per patch, why 2/5..5/5 are required
>> to make 1/5 work / are required to actually fix the deadlock? If only
>> 1/5 is needed, please resend just that one with your Signed-off-by
>> added (the carried patches today only have Thomas's S-o-b, which
>> breaks the chain of custody on a stable submission).
> 
> Thanks for the quick review and for pointing out the missing Signed-off-by. I apologize for that omission; it was my mistake during the cherry-pick process.
> 
> Regarding the dependency chain, I would like to clarify why commit 1/5 alone cannot fix the issue:
> 
> Commits 1/5 and 2/5 introduce the new generic vblank timer infrastructure to the DRM core but do *not* touch the vkms driver at all. 
> Commit 3/5 (02e2681ffe1a) is the actual fix that modifies `drivers/gpu/drm/vkms/vkms_crtc.c`. It removes the buggy open-coded hrtimer that causes the ABBA deadlock and switches vkms to use the new infrastructure introduced in 1/5 and 2/5. 
> 
> Therefore, 1/5, 2/5, and 3/5 form an indivisible set. Applying only 1/5 would leave the deadlock in vkms completely unpatched.
> 
> As for 4/5 and 5/5 (the timeout bump and doc fix), Maarten Lankhorst (DRM maintainer) explicitly recommended pulling in this exact 5-commit list as the proper upstream fix for this specific vkms issue (see the mailing list link in this thread). 
> 
> However, if you feel 4/5 and 5/5 introduce unnecessary risk for the 6.18.y stable tree, I can absolutely drop them and only submit 1/5, 2/5, and 3/5. 
> 
> I am preparing a v2 patch series now with my Signed-off-by added to the chain of custody. Could you let me know if you prefer the full 5-patch series as recommended by DRM maintainers, or just the minimal 3-patch series?
> 
> Best regards,
> Mingyu

5/5 might strictly speaking not be needed as it's a documentation fix and I have no idea of the policy about those.

The reporter made a bug report of an ABBA deadlock that was fixed in upstream by the first 4 patches, perhaps it's good to those attach here to this discussion.

Kind regards,
~Maarten Lankhorst
Re: [PATCH 6.18.y 0/5] drm/vkms: Backport generic vblank timer to fix ABBA deadlock
Posted by Sasha Levin 1 week, 6 days ago
On Tue, May 26, 2026 at 02:48:55PM +0200, Maarten Lankhorst wrote:
>Hello,
>
>Den 2026-05-26 kl. 14:06, skrev w15303746062:
>>
>> Hi Sasha,
>>
>>> Looking at the five commits:
>>>
>>>  - 1/5 (74afeb812850) is the one that actually fixes the ABBA
>>>    deadlock you observed under Syzkaller; it adds the generic vblank
>>>    timer that replaces the open-coded vkms hrtimer path.
>>>
>>>  - 2/5 (d54dbb5963bd) adds new CRTC helpers for "simple use cases".
>>>    No Fixes:/Cc:stable, no described bug.
>>>
>>>  - 3/5 (02e2681ffe1a) is a refactor that converts vkms to the new
>>>    helpers. No Fixes:/Cc:stable, no described bug.
>>>
>>>  - 4/5 (79ae8510b5b8) is a v7.1-rc1 timeout bump that depends on 1/5.
>>>    It is not yet in any released stable, so applying it to 6.18.y
>>>    would put it on an LTS before any LTS contains it.
>>>
>>>  - 5/5 (3946d3ba9934) is a doc fix for 1/5.
>>>
>>> Per stable-kernel-rules, what I need to queue is the minimum set that
>>> fixes the bug. Could you explain, per patch, why 2/5..5/5 are required
>>> to make 1/5 work / are required to actually fix the deadlock? If only
>>> 1/5 is needed, please resend just that one with your Signed-off-by
>>> added (the carried patches today only have Thomas's S-o-b, which
>>> breaks the chain of custody on a stable submission).
>>
>> Thanks for the quick review and for pointing out the missing Signed-off-by. I apologize for that omission; it was my mistake during the cherry-pick process.
>>
>> Regarding the dependency chain, I would like to clarify why commit 1/5 alone cannot fix the issue:
>>
>> Commits 1/5 and 2/5 introduce the new generic vblank timer infrastructure to the DRM core but do *not* touch the vkms driver at all.
>> Commit 3/5 (02e2681ffe1a) is the actual fix that modifies `drivers/gpu/drm/vkms/vkms_crtc.c`. It removes the buggy open-coded hrtimer that causes the ABBA deadlock and switches vkms to use the new infrastructure introduced in 1/5 and 2/5.
>>
>> Therefore, 1/5, 2/5, and 3/5 form an indivisible set. Applying only 1/5 would leave the deadlock in vkms completely unpatched.
>>
>> As for 4/5 and 5/5 (the timeout bump and doc fix), Maarten Lankhorst (DRM maintainer) explicitly recommended pulling in this exact 5-commit list as the proper upstream fix for this specific vkms issue (see the mailing list link in this thread).
>>
>> However, if you feel 4/5 and 5/5 introduce unnecessary risk for the 6.18.y stable tree, I can absolutely drop them and only submit 1/5, 2/5, and 3/5.
>>
>> I am preparing a v2 patch series now with my Signed-off-by added to the chain of custody. Could you let me know if you prefer the full 5-patch series as recommended by DRM maintainers, or just the minimal 3-patch series?
>>
>> Best regards,
>> Mingyu
>
>5/5 might strictly speaking not be needed as it's a documentation fix and I have no idea of the policy about those.
>
>The reporter made a bug report of an ABBA deadlock that was fixed in upstream by the first 4 patches, perhaps it's good to those attach here to this discussion.

I have no objection to taking all 5 if you're okay with it.

-- 
Thanks,
Sasha
Re: [PATCH 6.18.y 0/5] drm/vkms: Backport generic vblank timer to fix ABBA deadlock
Posted by Maarten Lankhorst 1 week, 6 days ago

Den 2026-05-26 kl. 14:50, skrev Sasha Levin:
> On Tue, May 26, 2026 at 02:48:55PM +0200, Maarten Lankhorst wrote:
>> Hello,
>>
>> Den 2026-05-26 kl. 14:06, skrev w15303746062:
>>>
>>> Hi Sasha,
>>>
>>>> Looking at the five commits:
>>>>
>>>>  - 1/5 (74afeb812850) is the one that actually fixes the ABBA
>>>>    deadlock you observed under Syzkaller; it adds the generic vblank
>>>>    timer that replaces the open-coded vkms hrtimer path.
>>>>
>>>>  - 2/5 (d54dbb5963bd) adds new CRTC helpers for "simple use cases".
>>>>    No Fixes:/Cc:stable, no described bug.
>>>>
>>>>  - 3/5 (02e2681ffe1a) is a refactor that converts vkms to the new
>>>>    helpers. No Fixes:/Cc:stable, no described bug.
>>>>
>>>>  - 4/5 (79ae8510b5b8) is a v7.1-rc1 timeout bump that depends on 1/5.
>>>>    It is not yet in any released stable, so applying it to 6.18.y
>>>>    would put it on an LTS before any LTS contains it.
>>>>
>>>>  - 5/5 (3946d3ba9934) is a doc fix for 1/5.
>>>>
>>>> Per stable-kernel-rules, what I need to queue is the minimum set that
>>>> fixes the bug. Could you explain, per patch, why 2/5..5/5 are required
>>>> to make 1/5 work / are required to actually fix the deadlock? If only
>>>> 1/5 is needed, please resend just that one with your Signed-off-by
>>>> added (the carried patches today only have Thomas's S-o-b, which
>>>> breaks the chain of custody on a stable submission).
>>>
>>> Thanks for the quick review and for pointing out the missing Signed-off-by. I apologize for that omission; it was my mistake during the cherry-pick process.
>>>
>>> Regarding the dependency chain, I would like to clarify why commit 1/5 alone cannot fix the issue:
>>>
>>> Commits 1/5 and 2/5 introduce the new generic vblank timer infrastructure to the DRM core but do *not* touch the vkms driver at all.
>>> Commit 3/5 (02e2681ffe1a) is the actual fix that modifies `drivers/gpu/drm/vkms/vkms_crtc.c`. It removes the buggy open-coded hrtimer that causes the ABBA deadlock and switches vkms to use the new infrastructure introduced in 1/5 and 2/5.
>>>
>>> Therefore, 1/5, 2/5, and 3/5 form an indivisible set. Applying only 1/5 would leave the deadlock in vkms completely unpatched.
>>>
>>> As for 4/5 and 5/5 (the timeout bump and doc fix), Maarten Lankhorst (DRM maintainer) explicitly recommended pulling in this exact 5-commit list as the proper upstream fix for this specific vkms issue (see the mailing list link in this thread).
>>>
>>> However, if you feel 4/5 and 5/5 introduce unnecessary risk for the 6.18.y stable tree, I can absolutely drop them and only submit 1/5, 2/5, and 3/5.
>>>
>>> I am preparing a v2 patch series now with my Signed-off-by added to the chain of custody. Could you let me know if you prefer the full 5-patch series as recommended by DRM maintainers, or just the minimal 3-patch series?
>>>
>>> Best regards,
>>> Mingyu
>>
>> 5/5 might strictly speaking not be needed as it's a documentation fix and I have no idea of the policy about those.
>>
>> The reporter made a bug report of an ABBA deadlock that was fixed in upstream by the first 4 patches, perhaps it's good to those attach here to this discussion.
> 
> I have no objection to taking all 5 if you're okay with it.
> 

Mingyu made an effort to reproduce, check if commits fixed it and then checked if other
drivers needed the same bug fixed. I'm ok with the series.

Kind regards,
~Maarten Lankhorst