From: Fan Wu
To: Alex Deucher, Christian König
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
    linux-kernel@vger.kernel.org, Fan Wu
Subject: [PATCH] drm/amdgpu: fix PASID task_info lookup race
Date: Mon, 9 Mar 2026 16:04:03 +0000
Message-Id: <20260309160403.599472-1-fanwu01@zju.edu.cn>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The amdgpu_vm_get_task_info_pasid() function previously called
amdgpu_vm_get_vm_from_pasid() which returns a raw VM pointer after
releasing the pasids xarray lock. The caller then dereferences
vm->task_info without any lifetime protection.

Race condition:

  CPU 0 (lookup)                      CPU 1 (release)
  ------------------                  ------------------
  amdgpu_vm_get_task_info_pasid()
    xa_lock()
    vm = xa_load(pasids)
    xa_unlock()
                                      amdgpu_vm_fini()
                                        xa_erase_irq(pasids)
                                      // teardown continues
                                      kfree(fpriv)
                                      // VM freed (embedded in fpriv)
    vm->task_info // potential UAF

This can leave the VM pointer dangling because struct amdgpu_vm is
embedded in struct amdgpu_fpriv which is freed via kfree(fpriv) in
amdgpu_file_release_kms() after amdgpu_vm_fini() returns.

Fix this by acquiring the task_info reference while holding the xarray
lock. This avoids the window where the VM could be freed between the
lookup and the dereference. Cache vm->task_info in a local variable
before attempting to take a reference, which keeps the lookup
straightforward inside the locked section. Use kref_get_unless_zero()
to safely handle the case where task_info's refcount is already being
decremented to zero by another thread in the teardown path.

Note: An RCU-based approach was considered but is not currently
feasible because: (1) the pasids xarray is initialized without
XA_FLAGS_RCU, and (2) struct amdgpu_fpriv is freed with kfree() rather
than kfree_rcu(). A future refactoring could enable RCU if needed for
performance.

Also remove the unsafe helper function amdgpu_vm_get_vm_from_pasid() to
prevent future misuse.

Fixes: b8f67b9ddf4f ("drm/amdgpu: change vm->task_info handling")
Signed-off-by: Fan Wu
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 40 ++++++++++++++++----------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index f2beb980e3c3..7e8621c9b661 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2468,19 +2468,6 @@ static void amdgpu_vm_destroy_task_info(struct kref *kref)
 	kfree(ti);
 }
 
-static inline struct amdgpu_vm *
-amdgpu_vm_get_vm_from_pasid(struct amdgpu_device *adev, u32 pasid)
-{
-	struct amdgpu_vm *vm;
-	unsigned long flags;
-
-	xa_lock_irqsave(&adev->vm_manager.pasids, flags);
-	vm = xa_load(&adev->vm_manager.pasids, pasid);
-	xa_unlock_irqrestore(&adev->vm_manager.pasids, flags);
-
-	return vm;
-}
-
 /**
  * amdgpu_vm_put_task_info - reference down the vm task_info ptr
  *
@@ -2527,8 +2514,31 @@ amdgpu_vm_get_task_info_vm(struct amdgpu_vm *vm)
 struct amdgpu_task_info *
 amdgpu_vm_get_task_info_pasid(struct amdgpu_device *adev, u32 pasid)
 {
-	return amdgpu_vm_get_task_info_vm(
-			amdgpu_vm_get_vm_from_pasid(adev, pasid));
+	struct amdgpu_vm *vm;
+	unsigned long flags;
+	struct amdgpu_task_info *ti = NULL;
+
+	/*
+	 * Acquire the task_info reference while holding the pasids xarray
+	 * lock to prevent a race with amdgpu_vm_fini() which removes the
+	 * PASID mapping before freeing the VM (embedded in struct amdgpu_fpriv).
+	 * Without this, the VM could be freed between xa_load() return and
+	 * the task_info dereference.
+	 */
+	xa_lock_irqsave(&adev->vm_manager.pasids, flags);
+	vm = xa_load(&adev->vm_manager.pasids, pasid);
+	if (vm) {
+		/*
+		 * Cache vm->task_info in a local variable before
+		 * attempting to take a reference.
+		 */
+		ti = vm->task_info;
+		if (ti && !kref_get_unless_zero(&ti->refcount))
+			ti = NULL;
+	}
+	xa_unlock_irqrestore(&adev->vm_manager.pasids, flags);
+
+	return ti;
 }
 
 static int amdgpu_vm_create_task_info(struct amdgpu_vm *vm)
-- 
2.34.1
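
Illustrative note (not part of the patch): the locking pattern described
in the commit message can be modelled in userspace. The sketch below is
only an approximation under stated assumptions: a pthread mutex stands
in for the pasids xarray lock, a fixed array stands in for the xarray,
and C11 atomics stand in for struct kref. All names in it (struct vm,
struct task_info, vm_publish, vm_release, get_task_info_by_pasid,
ref_get_unless_zero, MAX_PASID) are hypothetical and do not exist in the
driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_PASID 16 /* hypothetical table size */

/* Hypothetical stand-in for struct amdgpu_task_info. */
struct task_info {
	atomic_int refcount;
};

/* Hypothetical stand-in for struct amdgpu_vm. */
struct vm {
	struct task_info *task_info;
};

/* The mutex models the xarray lock; the array models the pasids xarray. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct vm *table[MAX_PASID];

/* Userspace analog of kref_get_unless_zero(): never revive a zero count. */
static bool ref_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference and free the object when the last one goes away. */
static void put_task_info(struct task_info *ti)
{
	if (ti && atomic_fetch_sub(&ti->refcount, 1) == 1)
		free(ti);
}

/* Create a vm holding one task_info reference and publish it by PASID. */
static int vm_publish(unsigned int pasid)
{
	struct vm *vm = malloc(sizeof(*vm));
	struct task_info *ti = malloc(sizeof(*ti));

	assert(pasid < MAX_PASID);
	if (!vm || !ti) {
		free(vm);
		free(ti);
		return -1;
	}
	atomic_init(&ti->refcount, 1);
	vm->task_info = ti;

	pthread_mutex_lock(&table_lock);
	table[pasid] = vm;
	pthread_mutex_unlock(&table_lock);
	return 0;
}

/* Lookup side: take the reference before the lock is dropped. */
static struct task_info *get_task_info_by_pasid(unsigned int pasid)
{
	struct task_info *ti = NULL;
	struct vm *vm;

	if (pasid >= MAX_PASID)
		return NULL;

	pthread_mutex_lock(&table_lock);
	vm = table[pasid];
	if (vm) {
		ti = vm->task_info;
		if (ti && !ref_get_unless_zero(&ti->refcount))
			ti = NULL;
	}
	pthread_mutex_unlock(&table_lock);
	return ti;
}

/* Release side: unpublish under the lock, then tear down the vm. */
static void vm_release(unsigned int pasid)
{
	struct vm *vm;

	assert(pasid < MAX_PASID);
	pthread_mutex_lock(&table_lock);
	vm = table[pasid];
	table[pasid] = NULL;
	pthread_mutex_unlock(&table_lock);

	if (vm) {
		put_task_info(vm->task_info);
		free(vm);
	}
}
```

In this model a lookup that finds the vm in the table is guaranteed that
vm_release() has not yet passed its unpublish step, so reading
vm->task_info inside the locked section is safe. The caller never sees
the vm pointer itself, only a task_info reference that it must drop with
put_task_info().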