From nobody Fri Oct 10 19:51:00 2025
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id EDEDF1AAA2F;
	Fri, 13 Jun 2025 01:11:31 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1749777092; cv=none;
	b=ajv1G85E6iuY/UlLBkwEu7/XKulEfxCzw76pnV3cHMG7aRT4QzDTsSWRY+I97CLCV243vmgUeGwqK5XsQOyUrFkGCn/eKhtUh8lSiK3OEMurfmsuA8Z1Ody8hOAjGuI9URAOipuX4E8jxY5pinFpikHQNeZD+I7K/1nUcC1gCAw=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1749777092; c=relaxed/simple;
	bh=woWKYitU+ZTuRE8At30QAD+trikncUB1NYbv7oKpxSE=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
	b=YfOr70i+t1gl0/R9SRx428ofRCbq/y6jVCUuHfjbx/rK2DCV+ljjS0aaEnVXTqJoxfBuCRzi1vhbdZQwO5AQopdSnXCZjR0k4DKs8nUKnrvHJb437+udg99FYGY3Am+a77TQzRPc1ZvLKduokAHWMAfyHzxnf0Hurum0qplcXjc=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=kYAKXL2x;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="kYAKXL2x"
Received: by smtp.kernel.org (Postfix) with ESMTPS id 91554C4CEF1;
	Fri, 13 Jun 2025 01:11:31 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1749777091;
	bh=woWKYitU+ZTuRE8At30QAD+trikncUB1NYbv7oKpxSE=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From;
	b=kYAKXL2xGUAA2muouJo4wdPci1/dXCq3Lws0XGaIYFBFtUT1GzC+hjKFxoj+oVs3B
	 YdyS5hwbfdqsqSUOPaAHz0nka8/33YGFA1S5fX2YvSRwlwPWFy4NYV/848n9mjN1xC
	 3byBlwJj0MHepj/cBkAFomkL8FeKti5BRYq3Q3cJS/svFVC6nNL8vfdJuoejbTnO6q
	 nRiYFEddd5N0IORwsk4O3KVP+SD693zPVXJLRldyvvhrMYqU7U09NrO/5b/HXt1HLA
	 5X1HjlFEoflP/8gwn2Sfs7AT3r5MW5zGskEyCav8Q3zSRQHLLM3KicivCm7tPp9iF3
	 b6zBuk3pBxWgg==
Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 88D45C71148;
	Fri, 13 Jun 2025 01:11:31 +0000 (UTC)
From: Mingcong Bai via B4 Relay
Date: Fri, 13 Jun 2025 09:11:29 +0800
Subject: [PATCH v2 1/5] drm/xe/bo: fix alignment with non-4KiB kernel page sizes
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io>
References: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io>
In-Reply-To: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io>
To: Lucas De Marchi, Thomas Hellström, Rodrigo Vivi, David Airlie,
	Simona Vetter, Francois Dugast, Zbigniew Kempczyński,
	José Roberto de Souza, Mauro Carvalho Chehab, Matthew Brost,
	Zhanjun Dong, Matt Roper, Alan Previn, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Mateusz Naklicki
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org, Kexy Biscuit,
	Shang Yatsen <429839446@qq.com>, Mingcong Bai, Wenbin Fang,
	Haien Liang <27873200@qq.com>, Jianfeng Liu, Shirong Liu, Haofeng Wu
X-Mailer: b4 0.14.2
X-Developer-Signature: v=1; a=ed25519-sha256; t=1749777090; l=7779;
	i=jeffbai@aosc.io; s=20250604; h=from:subject:message-id;
	bh=xSGwCQGvwAiX+QXGpf7BRIVGfqz2xRhzgtsh+/oMkpM=;
	b=ndJA9s+Q0CEASwC5edH5pMZL19PX8z0RS8kJtbZB1FLGIobXAQISANdb2DrbHhbWpNTalI3vZ
	 Xscw2wmWnb5CmBeSlc+d0ZrDuzPmFy9bVu8V43q0T9kkWxs80yzNS1F
X-Developer-Key: i=jeffbai@aosc.io; a=ed25519;
	pk=MJdgklflDF+Xz9x2Lp+ogEnEyk8HRosMGiqLgWbFctY=
X-Endpoint-Received: by B4
	Relay for jeffbai@aosc.io/20250604 with auth_id=422
X-Original-From: Mingcong Bai
Reply-To: jeffbai@aosc.io

From: Mingcong Bai

The bo/ttm code interfaces with kernel memory mappings from dedicated GPU
memory. It is not correct to assume that SZ_4K suffices for page
alignment, as a few hardware platforms commonly use non-4KiB pages. For
instance, 16KiB is the most commonly used kernel page size on Loongson
devices (of the LoongArch architecture).

Per our testing, the Intel Xe/Alchemist/Battlemage families of GPUs work
on Loongson platforms so long as "Above 4G Decoding" is enabled and
"Resizable BAR" is set to auto in the UEFI firmware settings.

Without this fix, the kernel hangs at a kernel BUG():

[ 7.425445] ------------[ cut here ]------------
[ 7.430032] kernel BUG at drivers/gpu/drm/drm_gem.c:181!
[ 7.435330] Oops - BUG[#1]:
[ 7.438099] CPU: 0 UID: 0 PID: 102 Comm: kworker/0:4 Tainted: G E 6.13.3-aosc-main-00336-g60829239b300-dirty #3
[ 7.449511] Tainted: [E]=UNSIGNED_MODULE
[ 7.453402] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[ 7.467144] Workqueue: events work_for_cpu_fn
[ 7.471472] pc 9000000001045fa4 ra ffff8000025331dc tp 90000001010c8000 sp 90000001010cb960
[ 7.479770] a0 900000012a3e8000 a1 900000010028c000 a2 000000000005d000 a3 0000000000000000
[ 7.488069] a4 0000000000000000 a5 0000000000000000 a6 0000000000000000 a7 0000000000000001
[ 7.496367] t0 0000000000001000 t1 9000000001045000 t2 0000000000000000 t3 0000000000000000
[ 7.504665] t4 0000000000000000 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
[ 7.504667] t8 0000000000000000 u0 90000000029ea7d8 s9 900000012a3e9360 s0 900000010028c000
[ 7.504668] s1 ffff800002744000 s2 0000000000000000 s3 0000000000000000 s4 0000000000000001
[ 7.504669] s5 900000012a3e8000 s6 0000000000000001 s7 0000000000022022 s8 0000000000000000
[ 7.537855] ra: ffff8000025331dc ___xe_bo_create_locked+0x158/0x3b0 [xe]
[ 7.544893] ERA: 9000000001045fa4 drm_gem_private_object_init+0xcc/0xd0
[ 7.551639] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 7.557785] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 7.562111] EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[ 7.566870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[ 7.571628] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[ 7.577163] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[ 7.583128] Modules linked in: xe(E+) drm_gpuvm(E) drm_exec(E) drm_buddy(E) gpu_sched(E) drm_suballoc_helper(E) drm_display_helper(E) loongson(E) r8169(E) cec(E) rc_core(E) realtek(E) i2c_algo_bit(E) tpm_tis_spi(E) led_class(E) hid_generic(E) drm_ttm_helper(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) la_ow_syscall(E) i2c_dev(E)
[ 7.613049] Process kworker/0:4 (pid: 102, threadinfo=00000000bc26ebd1, task=0000000055480707)
[ 7.621606] Stack : 0000000000000000 3030303a6963702b 000000000005d000 0000000000000000
[ 7.629563]         0000000000000001 0000000000000000 0000000000000000 8e1bfae42b2f7877
[ 7.637519]         000000000005d000 900000012a3e8000 900000012a3e9360 0000000000000000
[ 7.645475]         ffffffffffffffff 0000000000000000 0000000000022022 0000000000000000
[ 7.653431]         0000000000000001 ffff800002533660 0000000000022022 9000000000234470
[ 7.661386]         90000001010cba28 0000000000001000 0000000000000000 000000000005c300
[ 7.669342]         900000012a3e8000 0000000000000000 0000000000000001 900000012a3e8000
[ 7.677298]         ffffffffffffffff 0000000000022022 900000012a3e9498 ffff800002533a14
[ 7.685254]         0000000000022022 0000000000000000 900000000209c000 90000000010589e0
[ 7.693209]         90000001010cbab8 ffff8000027c78c0 fffffffffffff000 900000012a3e8000
[ 7.701165]         ...
[ 7.703588] Call Trace:
[ 7.703590] [<9000000001045fa4>] drm_gem_private_object_init+0xcc/0xd0
[ 7.712496] [] ___xe_bo_create_locked+0x154/0x3b0 [xe]
[ 7.719268] [] __xe_bo_create_locked+0x228/0x304 [xe]
[ 7.725951] [] xe_bo_create_pin_map_at_aligned+0x70/0x1b0 [xe]
[ 7.733410] [] xe_managed_bo_create_pin_map+0x34/0xcc [xe]
[ 7.740522] [] xe_managed_bo_create_from_data+0x44/0xb0 [xe]
[ 7.747807] [] xe_uc_fw_init+0x3ec/0x904 [xe]
[ 7.753814] [] xe_guc_init+0x30/0x3dc [xe]
[ 7.759553] [] xe_uc_init+0x20/0xf0 [xe]
[ 7.765121] [] xe_gt_init_hwconfig+0x5c/0xd0 [xe]
[ 7.771461] [] xe_device_probe+0x240/0x588 [xe]
[ 7.777627] [] xe_pci_probe+0x6c0/0xa6c [xe]
[ 7.783540] [<9000000000e9828c>] local_pci_probe+0x4c/0xb4
[ 7.788989] [<90000000002aa578>] work_for_cpu_fn+0x20/0x40
[ 7.794436] [<90000000002aeb50>] process_one_work+0x1a4/0x458
[ 7.800143] [<90000000002af5a0>] worker_thread+0x304/0x3fc
[ 7.805591] [<90000000002bacac>] kthread+0x114/0x138
[ 7.810520] [<9000000000241f64>] ret_from_kernel_thread+0x8/0xa4
[ 7.816489]
[ 7.817961] Code: 4c000020 29c3e2f9 53ff93ff <002a0001> 0015002c 03400000 02ff8063 29c04077 001500f7
[ 7.827651]
[ 7.829140] ---[ end trace 0000000000000000 ]---

Replace all instances of `SZ_4K' with `PAGE_SIZE', and change the call to
`drm_gem_private_object_init()' in `*___xe_bo_create_locked()' (the last
call before the BUG()) to use the `size_t aligned_size' calculated from
`PAGE_SIZE', to fix the above error.
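
For reviewers who want to check the arithmetic behind the BUG() above,
here is a small userspace sketch (not driver code). It assumes that a2
(0x5d000) in the register dump is the size argument that reached
drm_gem_private_object_init(), and that the 0x5c300 visible on the stack
is the unaligned request; the trace suggests both but does not state
them. align_up() and is_page_aligned() are hypothetical stand-ins for the
kernel's ALIGN() and for a "size must be a multiple of PAGE_SIZE" check.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's ALIGN().
 * `a` must be a power of two.
 */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical stand-in for a page-alignment check: true when `size`
 * is a whole multiple of `page_size` (a power of two).
 */
static bool is_page_aligned(size_t size, size_t page_size)
{
	return (size & (page_size - 1)) == 0;
}
```

With these helpers, align_up(0x5c300, 0x1000) gives 0x5d000, which
is_page_aligned() rejects for a 16KiB (0x4000) page, while
align_up(0x5c300, 0x4000) gives 0x60000, which it accepts.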
Cc:
Fixes: 4e03b584143e ("drm/xe/uapi: Reject bo creation of unaligned size")
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Tested-by: Mingcong Bai
Tested-by: Wenbin Fang
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu
Tested-by: Shirong Liu
Tested-by: Haofeng Wu
Link: https://github.com/FanFansfan/loongson-linux/commit/22c55ab3931c32410a077b3ddb6dca3f28223360
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai
---
 drivers/gpu/drm/xe/xe_bo.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index d99d91fe8aa98a2bfc901a998c9fc78fcb146e15..0767df4aebbab18283ba74deb5e984d5d847812c 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -1837,9 +1837,9 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 		flags |= XE_BO_FLAG_INTERNAL_64K;
 		alignment = align >> PAGE_SHIFT;
 	} else {
-		aligned_size = ALIGN(size, SZ_4K);
+		aligned_size = ALIGN(size, PAGE_SIZE);
 		flags &= ~XE_BO_FLAG_INTERNAL_64K;
-		alignment = SZ_4K >> PAGE_SHIFT;
+		alignment = PAGE_SIZE >> PAGE_SHIFT;
 	}
 
 	if (type == ttm_bo_type_device && aligned_size != size)
@@ -1853,7 +1853,7 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 
 	bo->ccs_cleared = false;
 	bo->tile = tile;
-	bo->size = size;
+	bo->size = aligned_size;
 	bo->flags = flags;
 	bo->cpu_caching = cpu_caching;
 	bo->ttm.base.funcs = &xe_gem_object_funcs;
@@ -1864,7 +1864,7 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
 #endif
 	INIT_LIST_HEAD(&bo->vram_userfault_link);
 
-	drm_gem_private_object_init(&xe->drm, &bo->ttm.base, size);
+	drm_gem_private_object_init(&xe->drm, &bo->ttm.base, aligned_size);
 
 	if (resv) {
 		ctx.allow_res_evict = !(flags & XE_BO_FLAG_NO_RESV_EVICT);
-- 
2.49.0
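
As an aside on the first hunk, the sketch below models only the else
branch before and after the change, with the page shift passed in as a
parameter so that 4KiB and 16KiB kernels can be compared in userspace.
Everything in it (struct bo_sizing, sketch_align(), sizing_before(),
sizing_after(), SKETCH_SZ_4K) is a hypothetical illustration and not
part of the xe driver.

```c
#include <stddef.h>

/* Hypothetical userspace copy of the 4KiB constant. */
#define SKETCH_SZ_4K ((size_t)0x1000)

/* Hypothetical container for the two values the else branch computes. */
struct bo_sizing {
	size_t aligned_size;	/* size rounded up for the GEM object */
	size_t alignment;	/* placement alignment, counted in pages */
};

/* Stand-in for the kernel's ALIGN(); `a` must be a power of two. */
static size_t sketch_align(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Pre-patch arithmetic: both values are derived from SZ_4K. */
static struct bo_sizing sizing_before(size_t size, unsigned int page_shift)
{
	struct bo_sizing s = {
		.aligned_size = sketch_align(size, SKETCH_SZ_4K),
		.alignment = SKETCH_SZ_4K >> page_shift,
	};

	return s;
}

/* Post-patch arithmetic: both values are derived from the page size. */
static struct bo_sizing sizing_after(size_t size, unsigned int page_shift)
{
	size_t page_size = (size_t)1 << page_shift;
	struct bo_sizing s = {
		.aligned_size = sketch_align(size, page_size),
		.alignment = page_size >> page_shift,
	};

	return s;
}
```

For a 16KiB kernel (page_shift = 14), sizing_before() yields an
alignment of 0 pages because 0x1000 >> 14 is 0, whereas sizing_after()
yields 1 page; with page_shift = 12 the two functions agree.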