[PATCH 6.1.y] drm/amdgpu: remove two invalid BUG_ON()s

Posted by Robert Garcia 1 month, 4 weeks ago
From: Christian König <christian.koenig@amd.com>

[ Upstream commit 5d55ed19d4190d2c210ac05ac7a53f800a8c6fe5 ]

Those can be triggered trivially by userspace.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[ Modified to gfx_v11_0.c only. ]
Signed-off-by: Robert Garcia <rob_garcia@163.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 37f793f7d4d2..6e3a32779168 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -5380,8 +5380,6 @@ static void gfx_v11_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 header, control = 0;
 
-	BUG_ON(ib->flags & AMDGPU_IB_FLAG_CE);
-
 	header = PACKET3(PACKET3_INDIRECT_BUFFER, 2);
 
 	control |= ib->length_dw | (vmid << 24);
-- 
2.34.1

Re: [PATCH 6.1.y] drm/amdgpu: remove two invalid BUG_ON()s
Posted by Timur Kristóf 1 month, 3 weeks ago
Hi,

In my opinion, this BUG_ON should NOT be removed.

Using the CE was never well-supported by amdgpu and can lead to serious 
issues, so we are planning to remove it entirely. Userspace isn't using it, so 
there is no loss of functionality here.

Mesa (the official userspace drivers) has never used the CE and never will.

Best regards,
Timur

On Friday, April 17, 2026 9:40:10 AM Central European Summer Time Robert 
Garcia wrote:
> From: Christian König <christian.koenig@amd.com>
> 
> [ Upstream commit 5d55ed19d4190d2c210ac05ac7a53f800a8c6fe5 ]
> 
> Those can be triggered trivially by userspace.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> Acked-by: Timur Kristóf <timur.kristof@gmail.com>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> [ Modified to gfx_v11_0.c only. ]
> Signed-off-by: Robert Garcia <rob_garcia@163.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> index 37f793f7d4d2..6e3a32779168 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
> @@ -5380,8 +5380,6 @@ static void gfx_v11_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
>  	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
>  	u32 header, control = 0;
> 
> -	BUG_ON(ib->flags & AMDGPU_IB_FLAG_CE);
> -
>  	header = PACKET3(PACKET3_INDIRECT_BUFFER, 2);
> 
>  	control |= ib->length_dw | (vmid << 24);

Re: [PATCH 6.1.y] drm/amdgpu: remove two invalid BUG_ON()s
Posted by Christian König 1 month, 3 weeks ago
Those points are certainly valid.

I've also upstreamed a patch which completely rejects userspace submissions that try to use the CE.

The problem is that those BUG_ON()s can lead to a denial of service because they crash the whole kernel.

A BUG_ON() is only justified if it prevents even worse things from happening, e.g. data corruption, or if the kernel would crash later on anyway, just with a less obvious cause.

Otherwise we should use WARN_ON().
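
As an illustration of the "reject the submission" approach mentioned above, here is a minimal userspace sketch, not the actual upstream patch. All names (`SKETCH_IB_FLAG_CE`, `struct sketch_ib`, `sketch_validate_ib`, `sketch_validate_submission`) and the flag value are assumptions made up for this example. The idea is that the untrusted flag is checked once at submission time and an error is returned to the caller, so the emit path needs neither BUG_ON() nor WARN_ON().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CE flag; the value is an assumption. */
#define SKETCH_IB_FLAG_CE (1u << 0)

/* Hypothetical, simplified view of one indirect buffer from userspace. */
struct sketch_ib {
	uint32_t flags;
	uint32_t length_dw;
};

/*
 * Check a single userspace-supplied IB.
 * Returns 0 if it is acceptable, -EINVAL if it asks for the CE path.
 */
static int sketch_validate_ib(const struct sketch_ib *ib)
{
	if (ib == NULL)
		return -EINVAL;
	if (ib->flags & SKETCH_IB_FLAG_CE)
		return -EINVAL;
	return 0;
}

/*
 * Check every IB of a submission before anything is emitted.
 * Fails with the first error found; nothing crashes on bad input.
 */
static int sketch_validate_submission(const struct sketch_ib *ibs, size_t count)
{
	if (ibs == NULL && count > 0)
		return -EINVAL;

	for (size_t i = 0; i < count; i++) {
		int ret = sketch_validate_ib(&ibs[i]);

		if (ret)
			return ret;
	}
	return 0;
}
```

With a check like this in the submission path, a malicious or buggy client only gets an error code back, which also holds on systems where warnings are configured to panic.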

Regards,
Christian.

On 4/22/26 16:03, Timur Kristóf wrote:
> Hi,
> 
> In my opinion, this BUG_ON should NOT be removed.
> 
> Using the CE was never well-supported by amdgpu and can lead to serious 
> issues, so we are planning to remove it entirely. Userspace isn't using it, so 
> there is no loss of functionality here.
> 
> Mesa (the official userspace drivers) has never used the CE and never will.
> 
> Best regards,
> Timur
> 
> On Friday, April 17, 2026 9:40:10 AM Central European Summer Time Robert 
> Garcia wrote:
>> From: Christian König <christian.koenig@amd.com>
>>
>> [ Upstream commit 5d55ed19d4190d2c210ac05ac7a53f800a8c6fe5 ]
>>
>> Those can be triggered trivially by userspace.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>> Acked-by: Timur Kristóf <timur.kristof@gmail.com>
>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> [ Modified to gfx_v11_0.c only. ]
>> Signed-off-by: Robert Garcia <rob_garcia@163.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> index 37f793f7d4d2..6e3a32779168 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> @@ -5380,8 +5380,6 @@ static void gfx_v11_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
>>  	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
>>  	u32 header, control = 0;
>>
>> -	BUG_ON(ib->flags & AMDGPU_IB_FLAG_CE);
>> -
>>  	header = PACKET3(PACKET3_INDIRECT_BUFFER, 2);
>>
>>  	control |= ib->length_dw | (vmid << 24);
> 
> 
> 
> 

Re: [PATCH 6.1.y] drm/amdgpu: remove two invalid BUG_ON()s
Posted by Greg Kroah-Hartman 1 month, 3 weeks ago
On Wed, Apr 22, 2026 at 04:11:15PM +0200, Christian König wrote:
> Those points are certainly valid.
> 
> I've also upstreamed a patch which completely rejects userspace submissions that try to use the CE.
> 
> The problem is that those BUG_ON()s can lead to a denial of service because they crash the whole kernel.
> 
> A BUG_ON() is only justified if it prevents even worse things from happening, e.g. data corruption, or if the kernel would crash later on anyway, just with a less obvious cause.
> 
> Otherwise we should use WARN_ON().

WARN_ON() crashes the kernel as well when panic-on-warn is enabled, as
it is in a few billion Linux systems :(

As this commit is upstream, and in other stable trees, I'll apply this
as it's not nice to have a simple way for userspace to crash the system.

thanks,

greg k-h

Re: [PATCH 6.1.y] drm/amdgpu: remove two invalid BUG_ON()s
Posted by Timur Kristóf 1 month, 3 weeks ago
On Thursday, April 23, 2026 1:22:22 PM Central European Summer Time Greg 
Kroah-Hartman wrote:
> On Wed, Apr 22, 2026 at 04:11:15PM +0200, Christian König wrote:
> > Those points are certainly valid.
> > 
> > I've also upstreamed a patch which completely rejects userspace
> > submissions that try to use the CE.
> > 
> > The problem is that those BUG_ON()s can lead to a denial of service
> > because they crash the whole kernel.
> > 
> > A BUG_ON() is only justified if it prevents even worse things from
> > happening, e.g. data corruption, or if the kernel would crash later on
> > anyway, just with a less obvious cause.
> > 
> > Otherwise we should use WARN_ON().
> 
> WARN_ON() crashes the kernel as well when panic-on-warn is enabled, as
> it is in a few billion Linux systems :(
> 
> As this commit is upstream, and in other stable trees, I'll apply this
> as it's not nice to have a simple way for userspace to crash the system.
> 
> thanks,
> 
> greg k-h

Sounds reasonable, if you feel this improves stability.

That being said, there are many other ways besides this one for userspace to 
crash the system equally easily.

Timur

Re: [PATCH 6.1.y] drm/amdgpu: remove two invalid BUG_ON()s
Posted by Greg Kroah-Hartman 1 month, 3 weeks ago
On Thu, Apr 23, 2026 at 01:34:42PM +0200, Timur Kristóf wrote:
> On Thursday, April 23, 2026 1:22:22 PM Central European Summer Time Greg 
> Kroah-Hartman wrote:
> > On Wed, Apr 22, 2026 at 04:11:15PM +0200, Christian König wrote:
> > > Those points are certainly valid.
> > > 
> > > I've also upstreamed a patch which completely rejects userspace
> > > submissions that try to use the CE.
> > > 
> > > The problem is that those BUG_ON()s can lead to a denial of service
> > > because they crash the whole kernel.
> > > 
> > > A BUG_ON() is only justified if it prevents even worse things from
> > > happening, e.g. data corruption, or if the kernel would crash later
> > > on anyway, just with a less obvious cause.
> > > 
> > > Otherwise we should use WARN_ON().
> > 
> > WARN_ON() crashes the kernel as well when panic-on-warn is enabled, as
> > it is in a few billion Linux systems :(
> > 
> > As this commit is upstream, and in other stable trees, I'll apply this
> > as it's not nice to have a simple way for userspace to crash the system.
> > 
> > thanks,
> > 
> > greg k-h
> 
> Sounds reasonable, if you feel this improves stability.
> 
> That being said, there are many other ways besides this one for userspace to 
> crash the system equally easily.

Great, please fix up those as well :)

Re: [PATCH 6.1.y] drm/amdgpu: remove two invalid BUG_ON()s
Posted by Christian König 1 month, 3 weeks ago
On 4/23/26 13:40, Greg Kroah-Hartman wrote:
> On Thu, Apr 23, 2026 at 01:34:42PM +0200, Timur Kristóf wrote:
>> On Thursday, April 23, 2026 1:22:22 PM Central European Summer Time Greg 
>> Kroah-Hartman wrote:
>>> On Wed, Apr 22, 2026 at 04:11:15PM +0200, Christian König wrote:
>>>> Those points are certainly valid.
>>>>
>>>> I've also upstreamed a patch which completely rejects userspace
>>>> submissions that try to use the CE.
>>>>
>>>> The problem is that those BUG_ON()s can lead to a denial of service
>>>> because they crash the whole kernel.
>>>>
>>>> A BUG_ON() is only justified if it prevents even worse things from
>>>> happening, e.g. data corruption, or if the kernel would crash later on
>>>> anyway, just with a less obvious cause.
>>>>
>>>> Otherwise we should use WARN_ON().
>>>
>>> WARN_ON() crashes the kernel as well when panic-on-warn is enabled, as
>>> it is in a few billion Linux systems :(
>>>
>>> As this commit is upstream, and in other stable trees, I'll apply this
>>> as it's not nice to have a simple way for userspace to crash the system.
>>>
>>> thanks,
>>>
>>> greg k-h
>>
>> Sounds reasonable, if you feel this improves stability.
>>
>> That being said, there are many other ways besides this one for userspace to 
>> crash the system equally easily.
> 
> Great, please fix up those as well :)

Yeah, I've been trying to do so for the last 30 years or so, but it's like fighting windmills.

But how does the saying go? Security is not a state but a process.

Cheers,
Christian.