.../drm/amd/amdkfd/kfd_device_queue_manager.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-)
If dqm->ops.initialize() fails, add deallocate_hiq_sdma_mqd()
to release the memory allocated by allocate_hiq_sdma_mqd().
Move deallocate_hiq_sdma_mqd() up to ensure proper function
visibility at the point of use.
Fixes: 11614c36bc8f ("drm/amdkfd: Allocate MQD trunk for HIQ and SDMA")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
---
Changes in v2:
- Move deallocate_hiq_sdma_mqd() up. Thanks, Felix!
- Add a Fixes tag.
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index d7a2e7178ea9..8af0929ca40a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -2919,6 +2919,14 @@ static int allocate_hiq_sdma_mqd(struct device_queue_manager *dqm)
return retval;
}
+static void deallocate_hiq_sdma_mqd(struct kfd_node *dev,
+ struct kfd_mem_obj *mqd)
+{
+ WARN(!mqd, "No hiq sdma mqd trunk to free");
+
+ amdgpu_amdkfd_free_gtt_mem(dev->adev, &mqd->gtt_mem);
+}
+
struct device_queue_manager *device_queue_manager_init(struct kfd_node *dev)
{
struct device_queue_manager *dqm;
@@ -3042,19 +3050,14 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_node *dev)
return dqm;
}
+ if (!dev->kfd->shared_resources.enable_mes)
+ deallocate_hiq_sdma_mqd(dev, &dqm->hiq_sdma_mqd);
+
out_free:
kfree(dqm);
return NULL;
}
-static void deallocate_hiq_sdma_mqd(struct kfd_node *dev,
- struct kfd_mem_obj *mqd)
-{
- WARN(!mqd, "No hiq sdma mqd trunk to free");
-
- amdgpu_amdkfd_free_gtt_mem(dev->adev, &mqd->gtt_mem);
-}
-
void device_queue_manager_uninit(struct device_queue_manager *dqm)
{
dqm->ops.stop(dqm);
--
2.25.1
…
> Move deallocate_hiq_sdma_mqd() up to ensure proper function
> visibility at the point of use.
…
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -2919,6 +2919,14 @@ static int allocate_hiq_sdma_mqd(struct device_queue_manager *dqm)
> return retval;
> }
>
> +static void deallocate_hiq_sdma_mqd(struct kfd_node *dev,
> + struct kfd_mem_obj *mqd)
> +{
> + WARN(!mqd, "No hiq sdma mqd trunk to free");
> +
> + amdgpu_amdkfd_free_gtt_mem(dev->adev, &mqd->gtt_mem);
> +}
Is there also a need to reconsider the implementation of the applied
null pointer check here?
Regards,
Markus
On 2026-01-06 04:30, Markus Elfring wrote:
> …
>> Move deallocate_hiq_sdma_mqd() up to ensure proper function
>> visibility at the point of use.
> …
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -2919,6 +2919,14 @@ static int allocate_hiq_sdma_mqd(struct device_queue_manager *dqm)
>> return retval;
>> }
>>
>> +static void deallocate_hiq_sdma_mqd(struct kfd_node *dev,
>> + struct kfd_mem_obj *mqd)
>> +{
>> + WARN(!mqd, "No hiq sdma mqd trunk to free");
>> +
>> + amdgpu_amdkfd_free_gtt_mem(dev->adev, &mqd->gtt_mem);
>> +}
> Is there also a need to reconsider the implementation of the applied
> null pointer check here?
Yeah, we have a WARN if mqd is NULL but then we still call
amdgpu_amdkfd_free_gtt_mem. There is a NULL pointer check in
amdgpu_amdkfd_free_gtt_mem, but &mqd->gtt_mem won't be NULL because it's
not at the start of struct kfd_mem_obj. So you'd get a kernel oops if
mqd is ever NULL here.
That said, I've never seen anyone complain about this WARN and a
subsequent kernel oops.
The only other place that calls deallocate_hiq_sdma_mqd is
device_queue_manager_uninit (only if not using MES). It can only be
called with a valid dqm if device_queue_manager_init succeeded, which
always ends with a valid dqm->hiq_sdma_mqd if not using MES.
My conclusion is that this WARN is just unnecessary. But it's also harmless.
Regards,
Felix
>
> Regards,
> Markus
> My conclusion is that this WARN is just unnecessary. Would you like to omit such a questionable macro call then? > But it's also harmless. How do you think about to avoid special development concerns here? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v6.19-rc4#n1242 Regards, Markus
On 2026-01-09 03:37, Markus Elfring wrote: >> My conclusion is that this WARN is just unnecessary. > Would you like to omit such a questionable macro call then? I don't feel strongly about it. I already submitted Haoxiang's patch. > > >> But it's also harmless. > How do you think about to avoid special development concerns here? > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v6.19-rc4#n1242 No. I think the WARN is used exactly as it was meant to be here: to check for something that should never happen. Regards, Felix > > Regards, > Markus
>>> But it's also harmless. >> How do you think about to avoid special development concerns here? >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v6.19-rc4#n1242 > > No. I think the WARN is used exactly as it was meant to be here: to check for something that should never happen. Do we stumble on another target conflict at such a source code place? Would you like to avoid undefined behaviour here? https://wiki.sei.cmu.edu/confluence/spaces/c/pages/87152449/EXP34-C.+Do+not+dereference+null+pointers Regards, Markus
© 2016 - 2026 Red Hat, Inc.