[PATCH v2] drm/vblank: downgrade vblank wait timeout from WARN to debug

Chintan Patel posted 1 patch 2 months, 2 weeks ago
Posted by Chintan Patel 2 months, 2 weeks ago
When wait_event_timeout() in drm_wait_one_vblank() times out, the
current WARN can cause unnecessary kernel panics in environments
with panic_on_warn set (e.g. CI, fuzzing). These timeouts can happen
under scheduler pressure or from invalid userspace calls, so they are
not always a kernel bug.

Replace the WARN with drm_dbg_kms() messages that provide useful
context (last and current vblank counters) without crashing the
system. Developers can still enable drm.debug to diagnose genuine
problems.

Reported-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=147ba789658184f0ce04
Tested-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com

Signed-off-by: Chintan Patel <chintanlike@gmail.com>

v2:
 - Drop unnecessary in-code comment (suggested by Thomas Zimmermann)
 - Remove else branch, only log timeout case
---
 drivers/gpu/drm/drm_vblank.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 46f59883183d..a94570668cba 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -1289,7 +1289,7 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
 {
 	struct drm_vblank_crtc *vblank = drm_vblank_crtc(dev, pipe);
 	int ret;
-	u64 last;
+	u64 last, curr;
 
 	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
 		return;
@@ -1305,7 +1305,12 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
 				 last != drm_vblank_count(dev, pipe),
 				 msecs_to_jiffies(100));
 
-	drm_WARN(dev, ret == 0, "vblank wait timed out on crtc %i\n", pipe);
+	curr = drm_vblank_count(dev, pipe);
+
+	if (ret == 0) {
+		drm_dbg_kms(dev, "WAIT_VBLANK: timeout crtc=%d, last=%llu, curr=%llu\n",
+			pipe, last, curr);
+	}
 
 	drm_vblank_put(dev, pipe);
 }
-- 
2.43.0
Re: [PATCH v2] drm/vblank: downgrade vblank wait timeout from WARN to debug
Posted by Ville Syrjälä 2 months, 2 weeks ago
On Wed, Oct 01, 2025 at 07:57:23PM -0700, Chintan Patel wrote:
> When wait_event_timeout() in drm_wait_one_vblank() times out, the
> current WARN can cause unnecessary kernel panics in environments
> with panic_on_warn set (e.g. CI, fuzzing). These timeouts can happen
> under scheduler pressure or from invalid userspace calls, so they are
> not always a kernel bug.

"invalid userspace calls" should never reach this far.
That would be a kernel bug.

> 
> Replace the WARN with drm_dbg_kms() messages that provide useful
> context (last and current vblank counters) without crashing the
> system. Developers can still enable drm.debug to diagnose genuine
> problems.
> 
> Reported-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=147ba789658184f0ce04
> Tested-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
> 
> Signed-off-by: Chintan Patel <chintanlike@gmail.com>
> 
> v2:
>  - Drop unnecessary in-code comment (suggested by Thomas Zimmermann)
>  - Remove else branch, only log timeout case
> ---
>  drivers/gpu/drm/drm_vblank.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index 46f59883183d..a94570668cba 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -1289,7 +1289,7 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
>  {
>  	struct drm_vblank_crtc *vblank = drm_vblank_crtc(dev, pipe);
>  	int ret;
> -	u64 last;
> +	u64 last, curr;
>  
>  	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
>  		return;
> @@ -1305,7 +1305,12 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
>  				 last != drm_vblank_count(dev, pipe),
>  				 msecs_to_jiffies(100));
>  
> -	drm_WARN(dev, ret == 0, "vblank wait timed out on crtc %i\n", pipe);
> +	curr = drm_vblank_count(dev, pipe);
> +
> +	if (ret == 0) {
> +		drm_dbg_kms(dev, "WAIT_VBLANK: timeout crtc=%d, last=%llu, curr=%llu\n",
> +			pipe, last, curr);

It should at the very least be a drm_err(). Though the backtrace can
be useful in figuring out where the problem is coming from, so not
too happy about this change.

> +	}
>  
>  	drm_vblank_put(dev, pipe);
>  }
> -- 
> 2.43.0

-- 
Ville Syrjälä
Intel
Re: [PATCH v2] drm/vblank: downgrade vblank wait timeout from WARN to debug
Posted by Chintan Patel 2 months, 2 weeks ago

On 10/2/25 04:40, Ville Syrjälä wrote:
> On Wed, Oct 01, 2025 at 07:57:23PM -0700, Chintan Patel wrote:
>> When wait_event_timeout() in drm_wait_one_vblank() times out, the
>> current WARN can cause unnecessary kernel panics in environments
>> with panic_on_warn set (e.g. CI, fuzzing). These timeouts can happen
>> under scheduler pressure or from invalid userspace calls, so they are
>> not always a kernel bug.
> 
> "invalid userspace calls" should never reach this far.
> That would be a kernel bug.
> 
>>
>> Replace the WARN with drm_dbg_kms() messages that provide useful
>> context (last and current vblank counters) without crashing the
>> system. Developers can still enable drm.debug to diagnose genuine
>> problems.
>>
>> Reported-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=147ba789658184f0ce04
>> Tested-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
>>
>> Signed-off-by: Chintan Patel <chintanlike@gmail.com>
>>
>> v2:
>>   - Drop unnecessary in-code comment (suggested by Thomas Zimmermann)
>>   - Remove else branch, only log timeout case
>> ---
>>   drivers/gpu/drm/drm_vblank.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
>> index 46f59883183d..a94570668cba 100644
>> --- a/drivers/gpu/drm/drm_vblank.c
>> +++ b/drivers/gpu/drm/drm_vblank.c
>> @@ -1289,7 +1289,7 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
>>   {
>>   	struct drm_vblank_crtc *vblank = drm_vblank_crtc(dev, pipe);
>>   	int ret;
>> -	u64 last;
>> +	u64 last, curr;
>>   
>>   	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
>>   		return;
>> @@ -1305,7 +1305,12 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
>>   				 last != drm_vblank_count(dev, pipe),
>>   				 msecs_to_jiffies(100));
>>   
>> -	drm_WARN(dev, ret == 0, "vblank wait timed out on crtc %i\n", pipe);
>> +	curr = drm_vblank_count(dev, pipe);
>> +
>> +	if (ret == 0) {
>> +		drm_dbg_kms(dev, "WAIT_VBLANK: timeout crtc=%d, last=%llu, curr=%llu\n",
>> +			pipe, last, curr);
> 
> It should at the very least be a drm_err(). Though the backtrace can
> be useful in figuring out where the problem is coming from, so not
> too happy about this change.


Thanks, Ville, for the feedback. I am still learning, as I am new here!

You’re right, “invalid userspace calls” was a poor choice of wording —
I’ll drop that from the commit message. The main goal is to avoid
unnecessary panics in fuzzing/CI with panic_on_warn, while still
reporting the error clearly.

I’ll update the patch to use drm_err() instead of drm_dbg_kms(), and
drop the extra drm_vblank_count() call per Thomas’ earlier comment.

Best regards,
Chintan
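
For readers following the thread, here is a minimal userspace sketch of the direction described in this reply: log at error level only when the wait timed out, and do not re-read the vblank counter afterwards. It is not the actual v3 patch. The mock_* helpers are hypothetical stand-ins for drm_err() and wait_event_timeout(), assumed here only so the control flow can be compiled and exercised outside the kernel.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for drm_err(): prints to stderr and counts calls. */
static int mock_err_calls;

static void mock_drm_err(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	mock_err_calls++;
}

/*
 * Hypothetical stand-in for wait_event_timeout(): returns 0 when the
 * condition is still false once the timeout has elapsed, otherwise the
 * remaining time (at least 1).
 */
static long mock_wait_event_timeout(int condition, long timeout)
{
	assert(timeout > 0);
	return condition ? timeout : 0;
}

/* Models only the tail of drm_wait_one_vblank() discussed in this thread. */
static void mock_wait_one_vblank(unsigned long long last,
				 unsigned long long now,
				 unsigned int pipe)
{
	long ret = mock_wait_event_timeout(last != now, 100);

	/* Timeout: report an error, but no WARN and no extra counter read. */
	if (!ret)
		mock_drm_err("vblank wait timed out on crtc %u\n", pipe);
}
```

Calling mock_wait_one_vblank(42, 42, 1) models a counter that never advanced and logs once; mock_wait_one_vblank(41, 42, 0) models a vblank that arrived in time and logs nothing. Unlike drm_WARN(), an error-level message carries no backtrace, which is the trade-off Ville raises above.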
Re: [PATCH v2] drm/vblank: downgrade vblank wait timeout from WARN to debug
Posted by Ville Syrjälä 2 months, 2 weeks ago
On Thu, Oct 02, 2025 at 02:40:05PM +0300, Ville Syrjälä wrote:
> On Wed, Oct 01, 2025 at 07:57:23PM -0700, Chintan Patel wrote:
> > When wait_event_timeout() in drm_wait_one_vblank() times out, the
> > current WARN can cause unnecessary kernel panics in environments
> > with panic_on_warn set (e.g. CI, fuzzing). These timeouts can happen
> > under scheduler pressure or from invalid userspace calls, so they are
> > not always a kernel bug.
> 
> "invalid userspace calls" should never reach this far.
> That would be a kernel bug.

I was also wondering how you could get this due to some scheduler
screwup, but I suppose that could theoretically happen with threaded 
irqs, or whatever work/etc is used to update the vblank count on
drivers that don't have hardware interrupts for it. 100+ msec
hw interrupt latency sounds excessive to me though.

But since you reference some syzbot reports below, are you
actually trying to hide real kernel bugs that syzbot found?

> 
> > 
> > Replace the WARN with drm_dbg_kms() messages that provide useful
> > context (last and current vblank counters) without crashing the
> > system. Developers can still enable drm.debug to diagnose genuine
> > problems.
> > 
> > Reported-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=147ba789658184f0ce04
> > Tested-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
> > 
> > Signed-off-by: Chintan Patel <chintanlike@gmail.com>
> > 
> > v2:
> >  - Drop unnecessary in-code comment (suggested by Thomas Zimmermann)
> >  - Remove else branch, only log timeout case
> > ---
> >  drivers/gpu/drm/drm_vblank.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> > index 46f59883183d..a94570668cba 100644
> > --- a/drivers/gpu/drm/drm_vblank.c
> > +++ b/drivers/gpu/drm/drm_vblank.c
> > @@ -1289,7 +1289,7 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
> >  {
> >  	struct drm_vblank_crtc *vblank = drm_vblank_crtc(dev, pipe);
> >  	int ret;
> > -	u64 last;
> > +	u64 last, curr;
> >  
> >  	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
> >  		return;
> > @@ -1305,7 +1305,12 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
> >  				 last != drm_vblank_count(dev, pipe),
> >  				 msecs_to_jiffies(100));
> >  
> > -	drm_WARN(dev, ret == 0, "vblank wait timed out on crtc %i\n", pipe);
> > +	curr = drm_vblank_count(dev, pipe);
> > +
> > +	if (ret == 0) {
> > +		drm_dbg_kms(dev, "WAIT_VBLANK: timeout crtc=%d, last=%llu, curr=%llu\n",
> > +			pipe, last, curr);
> 
> It should at the very least be a drm_err(). Though the backtrace can
> be useful in figuring out where the problem is coming from, so not
> too happy about this change.
> 
> > +	}
> >  
> >  	drm_vblank_put(dev, pipe);
> >  }
> > -- 
> > 2.43.0
> 
> -- 
> Ville Syrjälä
> Intel

-- 
Ville Syrjälä
Intel
Re: [PATCH v2] drm/vblank: downgrade vblank wait timeout from WARN to debug
Posted by Thomas Zimmermann 2 months, 2 weeks ago
Hi

On 02.10.25 at 04:57, Chintan Patel wrote:
> When wait_event_timeout() in drm_wait_one_vblank() times out, the
> current WARN can cause unnecessary kernel panics in environments
> with panic_on_warn set (e.g. CI, fuzzing). These timeouts can happen
> under scheduler pressure or from invalid userspace calls, so they are
> not always a kernel bug.
>
> Replace the WARN with drm_dbg_kms() messages that provide useful
> context (last and current vblank counters) without crashing the
> system. Developers can still enable drm.debug to diagnose genuine
> problems.
>
> Reported-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=147ba789658184f0ce04
> Tested-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
>
> Signed-off-by: Chintan Patel <chintanlike@gmail.com>

There should be no empty lines among those tags

>
> v2:
>   - Drop unnecessary in-code comment (suggested by Thomas Zimmermann)
>   - Remove else branch, only log timeout case
> ---
>   drivers/gpu/drm/drm_vblank.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index 46f59883183d..a94570668cba 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -1289,7 +1289,7 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
>   {
>   	struct drm_vblank_crtc *vblank = drm_vblank_crtc(dev, pipe);
>   	int ret;
> -	u64 last;
> +	u64 last, curr;
>   
>   	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
>   		return;
> @@ -1305,7 +1305,12 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
>   				 last != drm_vblank_count(dev, pipe),
>   				 msecs_to_jiffies(100));
>   
> -	drm_WARN(dev, ret == 0, "vblank wait timed out on crtc %i\n", pipe);
> +	curr = drm_vblank_count(dev, pipe);

Please don't call drm_vblank_count() here. It's not necessary for 
regular operation. Simply keep the debug message as-is.

> +
> +	if (ret == 0) {

"if (!ret)" is the preferred style.

> +		drm_dbg_kms(dev, "WAIT_VBLANK: timeout crtc=%d, last=%llu, curr=%llu\n",
> +			pipe, last, curr);

Aligning the pipe argument with dev from the previous line is the 
preferred style.

Best regards
Thomas

> +	}
>   
>   	drm_vblank_put(dev, pipe);
>   }

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
Re: [PATCH v2] drm/vblank: downgrade vblank wait timeout from WARN to debug
Posted by Chintan Patel 2 months, 2 weeks ago

On 10/1/25 23:34, Thomas Zimmermann wrote:
> Hi
> 
> On 02.10.25 at 04:57, Chintan Patel wrote:
>> When wait_event_timeout() in drm_wait_one_vblank() times out, the
>> current WARN can cause unnecessary kernel panics in environments
>> with panic_on_warn set (e.g. CI, fuzzing). These timeouts can happen
>> under scheduler pressure or from invalid userspace calls, so they are
>> not always a kernel bug.
>>
>> Replace the WARN with drm_dbg_kms() messages that provide useful
>> context (last and current vblank counters) without crashing the
>> system. Developers can still enable drm.debug to diagnose genuine
>> problems.
>>
>> Reported-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=147ba789658184f0ce04
>> Tested-by: syzbot+147ba789658184f0ce04@syzkaller.appspotmail.com
>>
>> Signed-off-by: Chintan Patel <chintanlike@gmail.com>
> 
> There should be no empty lines among those tags
> 
>>
>> v2:
>>   - Drop unnecessary in-code comment (suggested by Thomas Zimmermann)
>>   - Remove else branch, only log timeout case
>> ---
>>   drivers/gpu/drm/drm_vblank.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
>> index 46f59883183d..a94570668cba 100644
>> --- a/drivers/gpu/drm/drm_vblank.c
>> +++ b/drivers/gpu/drm/drm_vblank.c
>> @@ -1289,7 +1289,7 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
>>   {
>>       struct drm_vblank_crtc *vblank = drm_vblank_crtc(dev, pipe);
>>       int ret;
>> -    u64 last;
>> +    u64 last, curr;
>>       if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
>>           return;
>> @@ -1305,7 +1305,12 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
>>                    last != drm_vblank_count(dev, pipe),
>>                    msecs_to_jiffies(100));
>> -    drm_WARN(dev, ret == 0, "vblank wait timed out on crtc %i\n", pipe);
>> +    curr = drm_vblank_count(dev, pipe);
> 
> Please don't call drm_vblank_count() here. It's not necessary for 
> regular operation. Simply keep the debug message as-is.
> 
>> +
>> +    if (ret == 0) {
> 
> "if (!ret)" is the preferred style.
> 
>> +        drm_dbg_kms(dev, "WAIT_VBLANK: timeout crtc=%d, last=%llu, curr=%llu\n",
>> +            pipe, last, curr);
> 
> Aligning the pipe argument with dev from the previous line is the 
> preferred style.
> 

Hi Thomas,

Thank you for your review and helpful suggestions.
I’ll drop the unnecessary comment and remove the else branch as you 
recommended.

I’ll send a v3 with these changes.

Best regards,
Chintan