From nobody Fri Oct 10 17:32:57 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EDEDF1AAA2F; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; cv=none; b=ajv1G85E6iuY/UlLBkwEu7/XKulEfxCzw76pnV3cHMG7aRT4QzDTsSWRY+I97CLCV243vmgUeGwqK5XsQOyUrFkGCn/eKhtUh8lSiK3OEMurfmsuA8Z1Ody8hOAjGuI9URAOipuX4E8jxY5pinFpikHQNeZD+I7K/1nUcC1gCAw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; c=relaxed/simple; bh=woWKYitU+ZTuRE8At30QAD+trikncUB1NYbv7oKpxSE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=YfOr70i+t1gl0/R9SRx428ofRCbq/y6jVCUuHfjbx/rK2DCV+ljjS0aaEnVXTqJoxfBuCRzi1vhbdZQwO5AQopdSnXCZjR0k4DKs8nUKnrvHJb437+udg99FYGY3Am+a77TQzRPc1ZvLKduokAHWMAfyHzxnf0Hurum0qplcXjc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=kYAKXL2x; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="kYAKXL2x" Received: by smtp.kernel.org (Postfix) with ESMTPS id 91554C4CEF1; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1749777091; bh=woWKYitU+ZTuRE8At30QAD+trikncUB1NYbv7oKpxSE=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=kYAKXL2xGUAA2muouJo4wdPci1/dXCq3Lws0XGaIYFBFtUT1GzC+hjKFxoj+oVs3B YdyS5hwbfdqsqSUOPaAHz0nka8/33YGFA1S5fX2YvSRwlwPWFy4NYV/848n9mjN1xC 3byBlwJj0MHepj/cBkAFomkL8FeKti5BRYq3Q3cJS/svFVC6nNL8vfdJuoejbTnO6q 
nRiYFEddd5N0IORwsk4O3KVP+SD693zPVXJLRldyvvhrMYqU7U09NrO/5b/HXt1HLA 5X1HjlFEoflP/8gwn2Sfs7AT3r5MW5zGskEyCav8Q3zSRQHLLM3KicivCm7tPp9iF3 b6zBuk3pBxWgg== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88D45C71148; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) From: Mingcong Bai via B4 Relay Date: Fri, 13 Jun 2025 09:11:29 +0800 Subject: [PATCH v2 1/5] drm/xe/bo: fix alignment with non-4KiB kernel page sizes Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250613-upstream-xe-non-4k-v2-v2-1-934f82249f8a@aosc.io> References: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> In-Reply-To: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> To: Lucas De Marchi , =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Rodrigo Vivi , David Airlie , Simona Vetter , Francois Dugast , =?utf-8?q?Zbigniew_Kempczy=C5=84ski?= , =?utf-8?q?Jos=C3=A9_Roberto_de_Souza?= , Mauro Carvalho Chehab , Matthew Brost , Zhanjun Dong , Matt Roper , Alan Previn , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Mateusz Naklicki Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Kexy Biscuit , Shang Yatsen <429839446@qq.com>, Mingcong Bai , Wenbin Fang , Haien Liang <27873200@qq.com>, Jianfeng Liu , Shirong Liu , Haofeng Wu X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1749777090; l=7779; i=jeffbai@aosc.io; s=20250604; h=from:subject:message-id; bh=xSGwCQGvwAiX+QXGpf7BRIVGfqz2xRhzgtsh+/oMkpM=; b=ndJA9s+Q0CEASwC5edH5pMZL19PX8z0RS8kJtbZB1FLGIobXAQISANdb2DrbHhbWpNTalI3vZ Xscw2wmWnb5CmBeSlc+d0ZrDuzPmFy9bVu8V43q0T9kkWxs80yzNS1F X-Developer-Key: i=jeffbai@aosc.io; a=ed25519; pk=MJdgklflDF+Xz9x2Lp+ogEnEyk8HRosMGiqLgWbFctY= X-Endpoint-Received: by B4 
Relay for jeffbai@aosc.io/20250604 with auth_id=422 X-Original-From: Mingcong Bai Reply-To: jeffbai@aosc.io From: Mingcong Bai The bo/ttm interfaces with kernel memory mapping from dedicated GPU memory. It is not correct to assume that SZ_4K would suffice for page alignment as there are a few hardware platforms that commonly use non-4KiB pages - for instance, 16KiB is the most commonly used kernel page size on Loongson devices (of the LoongArch architecture). Per our testing, Intel Xe/Alchemist/Battlemage families of GPUs work on Loongson platforms so long as "Above 4G Decoding" is enabled and "Resizable BAR" is set to auto in the UEFI firmware settings. Without this fix, the kernel will hang at a kernel BUG(): [ 7.425445] ------------[ cut here ]------------ [ 7.430032] kernel BUG at drivers/gpu/drm/drm_gem.c:181! [ 7.435330] Oops - BUG[#1]: [ 7.438099] CPU: 0 UID: 0 PID: 102 Comm: kworker/0:4 Tainted: G = E 6.13.3-aosc-main-00336-g60829239b300-dirty #3 [ 7.449511] Tainted: [E]=3DUNSIGNED_MODULE [ 7.453402] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EV= B/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-= prestab [ 7.467144] Workqueue: events work_for_cpu_fn [ 7.471472] pc 9000000001045fa4 ra ffff8000025331dc tp 90000001010c8000 = sp 90000001010cb960 [ 7.479770] a0 900000012a3e8000 a1 900000010028c000 a2 000000000005d000 = a3 0000000000000000 [ 7.488069] a4 0000000000000000 a5 0000000000000000 a6 0000000000000000 = a7 0000000000000001 [ 7.496367] t0 0000000000001000 t1 9000000001045000 t2 0000000000000000 = t3 0000000000000000 [ 7.504665] t4 0000000000000000 t5 0000000000000000 t6 0000000000000000 = t7 0000000000000000 [ 7.504667] t8 0000000000000000 u0 90000000029ea7d8 s9 900000012a3e9360 = s0 900000010028c000 [ 7.504668] s1 ffff800002744000 s2 0000000000000000 s3 0000000000000000 = s4 0000000000000001 [ 7.504669] s5 900000012a3e8000 s6 0000000000000001 s7 0000000000022022 = s8 0000000000000000 [ 7.537855] ra:
ffff8000025331dc ___xe_bo_create_locked+0x158/0x3b0 [= xe] [ 7.544893] ERA: 9000000001045fa4 drm_gem_private_object_init+0xcc/0xd0 [ 7.551639] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=3DCC DACM=3DCC -WE) [ 7.557785] PRMD: 00000004 (PPLV0 +PIE -PWE) [ 7.562111] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) [ 7.566870] ECFG: 00071c1d (LIE=3D0,2-4,10-12 VS=3D7) [ 7.571628] ESTAT: 000c0000 [BRK] (IS=3D ECode=3D12 EsubCode=3D0) [ 7.577163] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV) [ 7.583128] Modules linked in: xe(E+) drm_gpuvm(E) drm_exec(E) drm_buddy= (E) gpu_sched(E) drm_suballoc_helper(E) drm_display_helper(E) loongson(E) r= 8169(E) cec(E) rc_core(E) realtek(E) i2c_algo_bit(E) tpm_tis_spi(E) led_cla= ss(E) hid_generic(E) drm_ttm_helper(E) ttm(E) drm_client_lib(E) drm_kms_hel= per(E) sunrpc(E) la_ow_syscall(E) i2c_dev(E) [ 7.613049] Process kworker/0:4 (pid: 102, threadinfo=3D00000000bc26ebd1= , task=3D0000000055480707) [ 7.621606] Stack : 0000000000000000 3030303a6963702b 000000000005d000 0= 000000000000000 [ 7.629563] 0000000000000001 0000000000000000 0000000000000000 8= e1bfae42b2f7877 [ 7.637519] 000000000005d000 900000012a3e8000 900000012a3e9360 0= 000000000000000 [ 7.645475] ffffffffffffffff 0000000000000000 0000000000022022 0= 000000000000000 [ 7.653431] 0000000000000001 ffff800002533660 0000000000022022 9= 000000000234470 [ 7.661386] 90000001010cba28 0000000000001000 0000000000000000 0= 00000000005c300 [ 7.669342] 900000012a3e8000 0000000000000000 0000000000000001 9= 00000012a3e8000 [ 7.677298] ffffffffffffffff 0000000000022022 900000012a3e9498 f= fff800002533a14 [ 7.685254] 0000000000022022 0000000000000000 900000000209c000 9= 0000000010589e0 [ 7.693209] 90000001010cbab8 ffff8000027c78c0 fffffffffffff000 9= 00000012a3e8000 [ 7.701165] ... 
[ 7.703588] Call Trace: [ 7.703590] [<9000000001045fa4>] drm_gem_private_object_init+0xcc/0xd0 [ 7.712496] [] ___xe_bo_create_locked+0x154/0x3b0 [xe] [ 7.719268] [] __xe_bo_create_locked+0x228/0x304 [xe] [ 7.725951] [] xe_bo_create_pin_map_at_aligned+0x70/0x= 1b0 [xe] [ 7.733410] [] xe_managed_bo_create_pin_map+0x34/0xcc = [xe] [ 7.740522] [] xe_managed_bo_create_from_data+0x44/0xb= 0 [xe] [ 7.747807] [] xe_uc_fw_init+0x3ec/0x904 [xe] [ 7.753814] [] xe_guc_init+0x30/0x3dc [xe] [ 7.759553] [] xe_uc_init+0x20/0xf0 [xe] [ 7.765121] [] xe_gt_init_hwconfig+0x5c/0xd0 [xe] [ 7.771461] [] xe_device_probe+0x240/0x588 [xe] [ 7.777627] [] xe_pci_probe+0x6c0/0xa6c [xe] [ 7.783540] [<9000000000e9828c>] local_pci_probe+0x4c/0xb4 [ 7.788989] [<90000000002aa578>] work_for_cpu_fn+0x20/0x40 [ 7.794436] [<90000000002aeb50>] process_one_work+0x1a4/0x458 [ 7.800143] [<90000000002af5a0>] worker_thread+0x304/0x3fc [ 7.805591] [<90000000002bacac>] kthread+0x114/0x138 [ 7.810520] [<9000000000241f64>] ret_from_kernel_thread+0x8/0xa4 [ 7.816489] [ 7.817961] Code: 4c000020 29c3e2f9 53ff93ff <002a0001> 0015002c 0340= 0000 02ff8063 29c04077 001500f7 [ 7.827651] [ 7.829140] ---[ end trace 0000000000000000 ]--- Replace all instances of `SZ_4K' with `PAGE_SIZE', and revise the call to `drm_gem_private_object_init()' in `___xe_bo_create_locked()' (the last call before the BUG()) to pass `aligned_size', which is calculated from `PAGE_SIZE', to fix the above error.
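As a rough userspace illustration of the failure mode (a sketch, not the driver code): register `a2' in the dump above holds 0x5d000, which appears to be the size passed as the third argument. The sketch assumes that the BUG at drm_gem.c:181 is a check that the size is a multiple of the kernel page size. The `ex_*' and `EX_*' names are hypothetical stand-ins for the kernel's `ALIGN()', `SZ_4K' and a 16KiB `PAGE_SIZE'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for SZ_4K and a 16KiB PAGE_SIZE; not kernel headers. */
#define EX_SZ_4K  ((size_t)0x1000)
#define EX_SZ_16K ((size_t)0x4000)

/* Round size up to a power-of-two alignment, in the spirit of ALIGN(). */
static size_t ex_align(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Assumed shape of the failing check: size must be a page-size multiple. */
static int ex_is_page_aligned(size_t size, size_t page_size)
{
	return (size & (page_size - 1)) == 0;
}
```

With a 16KiB page size, `ex_align(0x5d000, EX_SZ_4K)' leaves the size at 0x5d000, which `ex_is_page_aligned()' rejects, whereas `ex_align(0x5d000, EX_SZ_16K)' yields 0x60000, which passes.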
Cc: Fixes: 4e03b584143e ("drm/xe/uapi: Reject bo creation of unaligned size") Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Tested-by: Mingcong Bai Tested-by: Wenbin Fang Tested-by: Haien Liang <27873200@qq.com> Tested-by: Jianfeng Liu Tested-by: Shirong Liu Tested-by: Haofeng Wu Link: https://github.com/FanFansfan/loongson-linux/commit/22c55ab3931c32410= a077b3ddb6dca3f28223360 Link: https://t.me/c/1109254909/768552 Co-developed-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Mingcong Bai Suggested-by: Kexy Biscuit --- drivers/gpu/drm/xe/xe_bo.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index d99d91fe8aa98a2bfc901a998c9fc78fcb146e15..0767df4aebbab18283ba74deb5e= 984d5d847812c 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -1837,9 +1837,9 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device= *xe, struct xe_bo *bo, flags |=3D XE_BO_FLAG_INTERNAL_64K; alignment =3D align >> PAGE_SHIFT; } else { - aligned_size =3D ALIGN(size, SZ_4K); + aligned_size =3D ALIGN(size, PAGE_SIZE); flags &=3D ~XE_BO_FLAG_INTERNAL_64K; - alignment =3D SZ_4K >> PAGE_SHIFT; + alignment =3D PAGE_SIZE >> PAGE_SHIFT; } =20 if (type =3D=3D ttm_bo_type_device && aligned_size !=3D size) @@ -1853,7 +1853,7 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device= *xe, struct xe_bo *bo, =20 bo->ccs_cleared =3D false; bo->tile =3D tile; - bo->size =3D size; + bo->size =3D aligned_size; bo->flags =3D flags; bo->cpu_caching =3D cpu_caching; bo->ttm.base.funcs =3D &xe_gem_object_funcs; @@ -1864,7 +1864,7 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device= *xe, struct xe_bo *bo, #endif INIT_LIST_HEAD(&bo->vram_userfault_link); =20 - drm_gem_private_object_init(&xe->drm, &bo->ttm.base, size); + drm_gem_private_object_init(&xe->drm, &bo->ttm.base, aligned_size); =20 if (resv) { ctx.allow_res_evict =3D 
!(flags & XE_BO_FLAG_NO_RESV_EVICT); --=20 2.49.0 From nobody Fri Oct 10 17:32:57 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2D1BD1D516A; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; cv=none; b=ar1mmOJHIK7Cfg6ioJL+OpJPf8CUix6LUhYuqX2zngcpfRVzl0Ef8zqqIFL7Ol8p7r2+psAFSPn0wT5r7INOS/t0KjBHAjLTLRqUIkDc94PGMvf5sovBN+Iw/nQZiATgxNCGBWmBV6ArODr/q9xH4DU2swV8dmsiOca79xPq9As= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; c=relaxed/simple; bh=MpayDCKyTR3MKb8+B3TDNmywonEooa+Q+4sUF0/pSGw=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=V/CmDoVwb6K9p0uIe+K1yzuOV834Bb+dEQ0wGyThqOCF9l9fomB856XeysjTEEIZ3ic1sDWHo8qFg6fwtwqbpcfU5jRNRs8uo10j14eBlyzfaNA3Ozi2xZdZ2yhfR1fOK1C3mERYcd3jSbv5+XTsWUD0Pbr2fJge9OPzYfZAfx8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MRr00Wrm; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MRr00Wrm" Received: by smtp.kernel.org (Postfix) with ESMTPS id A7234C4CEF2; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1749777091; bh=MpayDCKyTR3MKb8+B3TDNmywonEooa+Q+4sUF0/pSGw=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=MRr00WrmnndL6uqvORqPcT13pNVe3pMpjIIZxPUkZzlu5+cDi+tLWacZ1xYtSLh1h 1CU0mvKFMkvKeHHUylFE51+2Q51fNkKy5RgN3B3j0uH72KqwVVhX6wb3ChbB87LrFY 
ftEKXNUmkCQVQcEOYupiFJvMKmj+NLJoMKIbTwG2zhnOLMpaQG1LJd5I8vR1wOAPOO yn2dS9zMyTsmdpdUZuaWl3loP3j3QvU4fIuS1EQz83kDXqZN/3KuugpDFhLMXUZ8Ae 8SN4UoogYs78O3DaZRI5dDuaMqAIJR13/fr+0D+X0vN9J5CVVdM5RWy5VUoT5PssZK ChY8esV+brrIQ== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E58AC61CE8; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) From: Mingcong Bai via B4 Relay Date: Fri, 13 Jun 2025 09:11:30 +0800 Subject: [PATCH v2 2/5] drm/xe/guc: use GUC_SIZE (SZ_4K) for alignment Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250613-upstream-xe-non-4k-v2-v2-2-934f82249f8a@aosc.io> References: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> In-Reply-To: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> To: Lucas De Marchi , =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Rodrigo Vivi , David Airlie , Simona Vetter , Francois Dugast , =?utf-8?q?Zbigniew_Kempczy=C5=84ski?= , =?utf-8?q?Jos=C3=A9_Roberto_de_Souza?= , Mauro Carvalho Chehab , Matthew Brost , Zhanjun Dong , Matt Roper , Alan Previn , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Mateusz Naklicki Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Kexy Biscuit , Shang Yatsen <429839446@qq.com>, Mingcong Bai , Wenbin Fang , Haien Liang <27873200@qq.com>, Jianfeng Liu , Shirong Liu , Haofeng Wu X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1749777090; l=13299; i=jeffbai@aosc.io; s=20250604; h=from:subject:message-id; bh=sRr3dSRId7ulE9WtvTdpX9Xbe8lkMHy7jLflEBNaSgY=; b=oee/63Vf95TzWEwaRMKXEYtAcSWFUxVqk/DtYvjp1yo0iDfpBFI2VvNVX5z805WdgnhywFFLc toTCpdltZhrD1SJ+XxT5Af2Px2ix3F/dzjkhLlu4RBkn1/27KvWHWaN X-Developer-Key: i=jeffbai@aosc.io; a=ed25519; 
pk=MJdgklflDF+Xz9x2Lp+ogEnEyk8HRosMGiqLgWbFctY= X-Endpoint-Received: by B4 Relay for jeffbai@aosc.io/20250604 with auth_id=422 X-Original-From: Mingcong Bai Reply-To: jeffbai@aosc.io From: Mingcong Bai Per the "Firmware" chapter in "drm/xe Intel GFX Driver", as well as "Volume 8: Command Stream Programming" in "Intel=C2=AE Arc=E2=84=A2 A-Serie= s Graphics and Intel Data Center GPU Flex Series Open-Source Programmer's Reference Manual For the discrete GPUs code named "Alchemist" and "Arctic Sound-M"" and "Intel=C2=AE Iris=C2=AE Xe MAX Graphics Open Source Programmer's Refere= nce Manual For the 2020 Discrete GPU formerly named "DG1"": "The RINGBUF register sets (defined in Memory Interface Registers) are used to specify the ring buffer memory areas. The ring buffer must start on a 4KB boundary and be allocated in linear memory. The length of any one ring buffer is limited to 2MB." The Graphics micro (=CE=BC) Controller (GuC) really expects command buffers aligned to 4KiB boundaries. The current implementation uses `PAGE_SIZE' as the assumed alignment reference, but a 4KiB kernel page size is by no means guaranteed. On 16KiB-paged kernels, this causes driver failures after loading the GuC firmware: [ 7.398317] xe 0000:09:00.0: [drm] Found dg2/g10 (device ID 56a1) displa= y version 13.00 stepping C0 [ 7.410429] xe 0000:09:00.0: [drm] Using GuC firmware from i915/dg2_guc_= 70.bin version 70.36.0 [ 10.719989] xe 0000:09:00.0: [drm] *ERROR* GT0: load failed: status =3D = 0x800001EC, time =3D 3297ms, freq =3D 2400MHz (req 2400MHz), done =3D 0 [ 10.732106] xe 0000:09:00.0: [drm] *ERROR* GT0: load failed: status: Res= et =3D 0, BootROM =3D 0x76, UKernel =3D 0x01, MIA =3D 0x00, Auth =3D 0x02 [ 10.744214] xe 0000:09:00.0: [drm] *ERROR* CRITICAL: Xe has declared dev= ice 0000:09:00.0 as wedged.
Please file a _new_ bug report at https://gitlab.freedesktop= .org/drm/xe/kernel/issues/new [ 10.828908] xe 0000:09:00.0: [drm] *ERROR* GT0: GuC mmio request 0x4100:= no reply 0x4100 Correct this by defining `GUC_ALIGN' as `SZ_4K' in accordance with the references above, and revising all instances of `PAGE_SIZE' as `GUC_ALIGN'. Then, revise `PAGE_ALIGN()' calls as `ALIGN()' with `GUC_ALIGN' as their second argument (overriding `PAGE_SIZE'). Cc: stable@vger.kernel.org Fixes: 84d15f426110 ("drm/xe/guc: Add capture size check in GuC log buffer") Fixes: 9c8c7a7e6f1f ("drm/xe/guc: Prepare GuC register list and update ADS = size for error capture") Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Tested-by: Mingcong Bai Tested-by: Wenbin Fang Tested-by: Haien Liang <27873200@qq.com> Tested-by: Jianfeng Liu Tested-by: Shirong Liu Tested-by: Haofeng Wu Link: https://github.com/FanFansfan/loongson-linux/commit/22c55ab3931c32410= a077b3ddb6dca3f28223360 Link: https://t.me/c/1109254909/768552 Co-developed-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Mingcong Bai Suggested-by: Kexy Biscuit --- drivers/gpu/drm/xe/xe_guc.c | 4 ++-- drivers/gpu/drm/xe/xe_guc.h | 3 +++ drivers/gpu/drm/xe/xe_guc_ads.c | 32 ++++++++++++++++---------------- drivers/gpu/drm/xe/xe_guc_capture.c | 8 ++++---- drivers/gpu/drm/xe/xe_guc_ct.c | 2 +- drivers/gpu/drm/xe/xe_guc_log.c | 5 +++-- drivers/gpu/drm/xe/xe_guc_pc.c | 4 ++-- 7 files changed, 31 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c index bac5471a1a7806ed7e41a241145666834a5e0eb8..95aedf9449c8c36435f963206db= df3c86a839338 100644 --- a/drivers/gpu/drm/xe/xe_guc.c +++ b/drivers/gpu/drm/xe/xe_guc.c @@ -90,7 +90,7 @@ static u32 guc_ctl_feature_flags(struct xe_guc *guc) =20 static u32 guc_ctl_log_params_flags(struct xe_guc *guc) { - u32 offset =3D guc_bo_ggtt_addr(guc, guc->log.bo) >> PAGE_SHIFT; + u32 offset =3D 
guc_bo_ggtt_addr(guc, guc->log.bo) >> XE_PTE_SHIFT; u32 flags; =20 #if (((CRASH_BUFFER_SIZE) % SZ_1M) =3D=3D 0) @@ -143,7 +143,7 @@ static u32 guc_ctl_log_params_flags(struct xe_guc *guc) =20 static u32 guc_ctl_ads_flags(struct xe_guc *guc) { - u32 ads =3D guc_bo_ggtt_addr(guc, guc->ads.bo) >> PAGE_SHIFT; + u32 ads =3D guc_bo_ggtt_addr(guc, guc->ads.bo) >> XE_PTE_SHIFT; u32 flags =3D ads << GUC_ADS_ADDR_SHIFT; =20 return flags; diff --git a/drivers/gpu/drm/xe/xe_guc.h b/drivers/gpu/drm/xe/xe_guc.h index 58338be4455856994df1d7e026b3f0fa7cc03fe9..5b30215ac5616728351d77dd028= ed9f3b495cfd8 100644 --- a/drivers/gpu/drm/xe/xe_guc.h +++ b/drivers/gpu/drm/xe/xe_guc.h @@ -23,6 +23,9 @@ #define GUC_FIRMWARE_VER(guc) \ MAKE_GUC_VER_STRUCT((guc)->fw.versions.found[XE_UC_FW_VER_RELEASE]) =20 +/* GuC really expects command buffers aligned to 4K boundaries. */ +#define GUC_ALIGN SZ_4K + struct drm_printer; =20 void xe_guc_comm_init_early(struct xe_guc *guc); diff --git a/drivers/gpu/drm/xe/xe_guc_ads.c b/drivers/gpu/drm/xe/xe_guc_ad= s.c index 44c1fa2fe7c857556708290a25ea1bdfcf674449..4f47809aa794843128221c5d265= 3b6f61dab202b 100644 --- a/drivers/gpu/drm/xe/xe_guc_ads.c +++ b/drivers/gpu/drm/xe/xe_guc_ads.c @@ -143,17 +143,17 @@ static size_t guc_ads_regset_size(struct xe_guc_ads *= ads) =20 static size_t guc_ads_golden_lrc_size(struct xe_guc_ads *ads) { - return PAGE_ALIGN(ads->golden_lrc_size); + return ALIGN(ads->golden_lrc_size, GUC_ALIGN); } =20 static u32 guc_ads_waklv_size(struct xe_guc_ads *ads) { - return PAGE_ALIGN(ads->ads_waklv_size); + return ALIGN(ads->ads_waklv_size, GUC_ALIGN); } =20 static size_t guc_ads_capture_size(struct xe_guc_ads *ads) { - return PAGE_ALIGN(ads->capture_size); + return ALIGN(ads->capture_size, GUC_ALIGN); } =20 static size_t guc_ads_um_queues_size(struct xe_guc_ads *ads) @@ -168,7 +168,7 @@ static size_t guc_ads_um_queues_size(struct xe_guc_ads = *ads) =20 static size_t guc_ads_private_data_size(struct xe_guc_ads *ads) { - return 
PAGE_ALIGN(ads_to_guc(ads)->fw.private_data_size); + return ALIGN(ads_to_guc(ads)->fw.private_data_size, GUC_ALIGN); } =20 static size_t guc_ads_regset_offset(struct xe_guc_ads *ads) @@ -183,7 +183,7 @@ static size_t guc_ads_golden_lrc_offset(struct xe_guc_a= ds *ads) offset =3D guc_ads_regset_offset(ads) + guc_ads_regset_size(ads); =20 - return PAGE_ALIGN(offset); + return ALIGN(offset, GUC_ALIGN); } =20 static size_t guc_ads_waklv_offset(struct xe_guc_ads *ads) @@ -193,7 +193,7 @@ static size_t guc_ads_waklv_offset(struct xe_guc_ads *a= ds) offset =3D guc_ads_golden_lrc_offset(ads) + guc_ads_golden_lrc_size(ads); =20 - return PAGE_ALIGN(offset); + return ALIGN(offset, GUC_ALIGN); } =20 static size_t guc_ads_capture_offset(struct xe_guc_ads *ads) @@ -203,7 +203,7 @@ static size_t guc_ads_capture_offset(struct xe_guc_ads = *ads) offset =3D guc_ads_waklv_offset(ads) + guc_ads_waklv_size(ads); =20 - return PAGE_ALIGN(offset); + return ALIGN(offset, GUC_ALIGN); } =20 static size_t guc_ads_um_queues_offset(struct xe_guc_ads *ads) @@ -213,7 +213,7 @@ static size_t guc_ads_um_queues_offset(struct xe_guc_ad= s *ads) offset =3D guc_ads_capture_offset(ads) + guc_ads_capture_size(ads); =20 - return PAGE_ALIGN(offset); + return ALIGN(offset, GUC_ALIGN); } =20 static size_t guc_ads_private_data_offset(struct xe_guc_ads *ads) @@ -223,7 +223,7 @@ static size_t guc_ads_private_data_offset(struct xe_guc= _ads *ads) offset =3D guc_ads_um_queues_offset(ads) + guc_ads_um_queues_size(ads); =20 - return PAGE_ALIGN(offset); + return ALIGN(offset, GUC_ALIGN); } =20 static size_t guc_ads_size(struct xe_guc_ads *ads) @@ -276,7 +276,7 @@ static size_t calculate_golden_lrc_size(struct xe_guc_a= ds *ads) continue; =20 real_size =3D xe_gt_lrc_size(gt, class); - alloc_size =3D PAGE_ALIGN(real_size); + alloc_size =3D ALIGN(real_size, GUC_ALIGN); total_size +=3D alloc_size; } =20 @@ -646,12 +646,12 @@ static int guc_capture_prep_lists(struct xe_guc_ads *= ads) offsetof(struct __guc_ads_blob, 
system_info)); =20 /* first, set aside the first page for a capture_list with zero descripto= rs */ - total_size =3D PAGE_SIZE; + total_size =3D GUC_ALIGN; if (!xe_guc_capture_getnullheader(guc, &ptr, &size)) xe_map_memcpy_to(ads_to_xe(ads), ads_to_map(ads), capture_offset, ptr, s= ize); =20 null_ggtt =3D ads_ggtt + capture_offset; - capture_offset +=3D PAGE_SIZE; + capture_offset +=3D GUC_ALIGN; =20 /* * Populate capture list : at this point adps is already allocated and @@ -715,10 +715,10 @@ static int guc_capture_prep_lists(struct xe_guc_ads *= ads) } } =20 - if (ads->capture_size !=3D PAGE_ALIGN(total_size)) + if (ads->capture_size !=3D ALIGN(total_size, GUC_ALIGN)) xe_gt_dbg(gt, "Updated ADS capture size %d (was %d)\n", - PAGE_ALIGN(total_size), ads->capture_size); - return PAGE_ALIGN(total_size); + ALIGN(total_size, GUC_ALIGN), ads->capture_size); + return ALIGN(total_size, GUC_ALIGN); } =20 static void guc_mmio_regset_write_one(struct xe_guc_ads *ads, @@ -966,7 +966,7 @@ static void guc_golden_lrc_populate(struct xe_guc_ads *= ads) xe_gt_assert(gt, gt->default_lrc[class]); =20 real_size =3D xe_gt_lrc_size(gt, class); - alloc_size =3D PAGE_ALIGN(real_size); + alloc_size =3D ALIGN(real_size, GUC_ALIGN); total_size +=3D alloc_size; =20 xe_map_memcpy_to(xe, ads_to_map(ads), offset, diff --git a/drivers/gpu/drm/xe/xe_guc_capture.c b/drivers/gpu/drm/xe/xe_gu= c_capture.c index 859a3ba91be54f562ea835e949f1d141ed89d486..34e9ea9b2935136fa46fbb6aac7= 944eb844b7fae 100644 --- a/drivers/gpu/drm/xe/xe_guc_capture.c +++ b/drivers/gpu/drm/xe/xe_guc_capture.c @@ -591,8 +591,8 @@ guc_capture_getlistsize(struct xe_guc *guc, u32 owner, = u32 type, return -ENODATA; =20 if (size) - *size =3D PAGE_ALIGN((sizeof(struct guc_debug_capture_list)) + - (num_regs * sizeof(struct guc_mmio_reg))); + *size =3D ALIGN((sizeof(struct guc_debug_capture_list)) + + (num_regs * sizeof(struct guc_mmio_reg)), GUC_ALIGN); =20 return 0; } @@ -739,7 +739,7 @@ size_t 
xe_guc_capture_ads_input_worst_size(struct xe_gu= c *guc) * sequence, that is, during the pre-hwconfig phase before we have * the exact engine fusing info. */ - total_size =3D PAGE_SIZE; /* Pad a page in front for empty lists */ + total_size =3D GUC_ALIGN; /* Pad a page in front for empty lists */ for (i =3D 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) { for (j =3D 0; j < GUC_CAPTURE_LIST_CLASS_MAX; j++) { if (xe_guc_capture_getlistsize(guc, i, @@ -759,7 +759,7 @@ size_t xe_guc_capture_ads_input_worst_size(struct xe_gu= c *guc) total_size +=3D global_size; } =20 - return PAGE_ALIGN(total_size); + return ALIGN(total_size, GUC_ALIGN); } =20 static int guc_capture_output_size_est(struct xe_guc *guc) diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c index 2447de0ebedf45759351fd6ce03a363a9459fe1a..6bd624d071e721638aa29b57dc3= 0733089ce7a9a 100644 --- a/drivers/gpu/drm/xe/xe_guc_ct.c +++ b/drivers/gpu/drm/xe/xe_guc_ct.c @@ -212,7 +212,7 @@ int xe_guc_ct_init(struct xe_guc_ct *ct) struct xe_bo *bo; int err; =20 - xe_gt_assert(gt, !(guc_ct_size() % PAGE_SIZE)); + xe_gt_assert(gt, !(guc_ct_size() % GUC_ALIGN)); =20 ct->g2h_wq =3D alloc_ordered_workqueue("xe-g2h-wq", WQ_MEM_RECLAIM); if (!ct->g2h_wq) diff --git a/drivers/gpu/drm/xe/xe_guc_log.c b/drivers/gpu/drm/xe/xe_guc_lo= g.c index 38039c4113878007a4278d9581155158f20812ae..cd01d1033e8eefab3f49c179d18= 65c23771cdec1 100644 --- a/drivers/gpu/drm/xe/xe_guc_log.c +++ b/drivers/gpu/drm/xe/xe_guc_log.c @@ -15,6 +15,7 @@ #include "xe_force_wake.h" #include "xe_gt.h" #include "xe_gt_printk.h" +#include "xe_guc.h" #include "xe_map.h" #include "xe_mmio.h" #include "xe_module.h" @@ -58,7 +59,7 @@ static size_t guc_log_size(void) * | Capture logs | * +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D+ + CAPTURE_SIZE */ - return PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE + + return GUC_ALIGN + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE + CAPTURE_BUFFER_SIZE; } =20 
@@ -328,7 +329,7 @@ u32 xe_guc_get_log_buffer_size(struct xe_guc_log *log, = enum guc_log_buffer_type u32 xe_guc_get_log_buffer_offset(struct xe_guc_log *log, enum guc_log_buff= er_type type) { enum guc_log_buffer_type i; - u32 offset =3D PAGE_SIZE;/* for the log_buffer_states */ + u32 offset =3D GUC_ALIGN; /* for the log_buffer_states */ =20 for (i =3D GUC_LOG_BUFFER_CRASH_DUMP; i < GUC_LOG_BUFFER_TYPE_MAX; ++i) { if (i =3D=3D type) diff --git a/drivers/gpu/drm/xe/xe_guc_pc.c b/drivers/gpu/drm/xe/xe_guc_pc.c index 18c623992035520ec78646240512220abee07935..eae0fccf2a76a19e03b596ea6f8= 2aa415b07df43 100644 --- a/drivers/gpu/drm/xe/xe_guc_pc.c +++ b/drivers/gpu/drm/xe/xe_guc_pc.c @@ -1044,7 +1044,7 @@ int xe_guc_pc_start(struct xe_guc_pc *pc) { struct xe_device *xe =3D pc_to_xe(pc); struct xe_gt *gt =3D pc_to_gt(pc); - u32 size =3D PAGE_ALIGN(sizeof(struct slpc_shared_data)); + u32 size =3D ALIGN(sizeof(struct slpc_shared_data), GUC_ALIGN); unsigned int fw_ref; ktime_t earlier; int ret; @@ -1172,7 +1172,7 @@ int xe_guc_pc_init(struct xe_guc_pc *pc) struct xe_tile *tile =3D gt_to_tile(gt); struct xe_device *xe =3D gt_to_xe(gt); struct xe_bo *bo; - u32 size =3D PAGE_ALIGN(sizeof(struct slpc_shared_data)); + u32 size =3D ALIGN(sizeof(struct slpc_shared_data), GUC_ALIGN); int err; =20 if (xe->info.skip_guc_pc) --=20 2.49.0 From nobody Fri Oct 10 17:32:57 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3CBCE1D5178; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; cv=none; 
b=O9uUY7DyJmQGKMibDcW4jCXKNuW7Ktozp6RyfYBGR4khE/xx0LrEg4kxoD4lflPFbO35tXLGDC6ezavsoqDeZ0NxCLli4BRd7kB72Cb0rjBm1DUZ9nLQHJPF0fj+KItnPML2WTghgeH6V7J/90btmmb1EgyYu/rZi54pPvJL6pM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; c=relaxed/simple; bh=+NYye4MNUBzt+8gl9CH+0mAJaYkWbn1vnUcsS4wSDTY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Df2eItvNz2UU1xiyILMcqPWD9YeNDmxfC98kMfLnfdZN3WZhVo0ft36h3IPB/M6CXuTCFgwoihrmyVkKpQFkJ/a21np70ozmezQjp8X0IEf9N0kLAPXxRanCcOkDn+wAm0PiUQJ8ypwzY8x78dfxFPBWvlyjGXRvmK7odBB5ko4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=HTix3quG; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="HTix3quG" Received: by smtp.kernel.org (Postfix) with ESMTPS id C113DC4CEF4; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1749777091; bh=+NYye4MNUBzt+8gl9CH+0mAJaYkWbn1vnUcsS4wSDTY=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=HTix3quG9jysMHH/ardx70hcNXVKzPW5rAo2n53Zo5gcttooUXwhpBWAgRzVEpqc8 33j6imvtD49t7cl5UNIHG2wuSS8S1DTtRDnvyYNWq62HmkhuWER4LWGckRkZ8lZzlo nrLKftaHnOxPf6at6z+EUfMh7wcoR82sJccIGhNEA+16kjR6YrYq3ezMSRz9F5I9Dg jtIwb2FBlVT+6CWIK/RHgiAlQMq9px285CKIeo80SxTyqITl1//aMzOMlSUeMluIIX 5qOCAYGgT30L95eGDcpV/5cZyXNAaS0qeM2aGUQKToypEQOS2z7qzD7U5+ZWEUmNUe ZVtU9rGu4W4Nw== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id B67AAC7114C; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) From: Mingcong Bai via B4 Relay Date: Fri, 13 Jun 2025 09:11:31 +0800 Subject: [PATCH v2 3/5] drm/xe/regs: fix RING_CTL_SIZE(size) calculation Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: 
List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250613-upstream-xe-non-4k-v2-v2-3-934f82249f8a@aosc.io> References: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> In-Reply-To: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> To: Lucas De Marchi , =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Rodrigo Vivi , David Airlie , Simona Vetter , Francois Dugast , =?utf-8?q?Zbigniew_Kempczy=C5=84ski?= , =?utf-8?q?Jos=C3=A9_Roberto_de_Souza?= , Mauro Carvalho Chehab , Matthew Brost , Zhanjun Dong , Matt Roper , Alan Previn , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Mateusz Naklicki Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Kexy Biscuit , Shang Yatsen <429839446@qq.com>, Mingcong Bai , Wenbin Fang , Haien Liang <27873200@qq.com>, Jianfeng Liu , Shirong Liu , Haofeng Wu X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1749777090; l=6855; i=jeffbai@aosc.io; s=20250604; h=from:subject:message-id; bh=3xAgeSx/kQOKvU32cZAgSjMp+7iADD0ecjGV9IQ2UqA=; b=T2CI7xLa2+GyufMpZGOBEUr5PG/VRusl/QURmgjEkWfSVErjRUXfYoofFNSvv27T1+7ypecOL 3UJUBsTTLk8Akiwo6cLwSIWOffAIqIF+PDE4McTNUJHUkmCPqfNhd9+ X-Developer-Key: i=jeffbai@aosc.io; a=ed25519; pk=MJdgklflDF+Xz9x2Lp+ogEnEyk8HRosMGiqLgWbFctY= X-Endpoint-Received: by B4 Relay for jeffbai@aosc.io/20250604 with auth_id=422 X-Original-From: Mingcong Bai Reply-To: jeffbai@aosc.io From: Mingcong Bai Similar to the preceding patch for GuC (and with the same references), Intel GPUs expect command buffers to be aligned to 4KiB boundaries. The current code uses `PAGE_SIZE' as the assumed alignment reference, but a 4KiB kernel page size is by no means guaranteed.
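To illustrate why the unit matters, here is a small userspace sketch under stated assumptions; it is not the driver's macro. It assumes a ring-size field that encodes the ring length as "bytes minus one 4KiB unit", which is one plausible reading of the 4KB wording in the references but is not spelled out in this message. All `ex_*' and `EX_*' names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for SZ_4K and a 16KiB PAGE_SIZE; not kernel headers. */
#define EX_SZ_4K  0x1000u
#define EX_SZ_16K 0x4000u

/* Hypothetical size-field encoder: ring length minus one unit. */
static uint32_t ex_ring_ctl_size(uint32_t ring_bytes, uint32_t unit)
{
	assert(ring_bytes >= unit);
	return ring_bytes - unit;
}

/* Length the hardware would decode under the assumed 4KiB-unit encoding. */
static uint32_t ex_hw_ring_bytes(uint32_t field)
{
	return field + EX_SZ_4K;
}
```

Under these assumptions, encoding a 16KiB ring with `EX_SZ_4K' round-trips to 16KiB, while encoding it with `EX_SZ_16K' (what a `PAGE_SIZE'-based calculation would do on a 16KiB-paged kernel) makes the hardware see only a 4KiB ring.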
On 16KiB-paged kernels, this causes driver failures during boot up: [ 14.018975] ------------[ cut here ]------------ [ 14.023562] xe 0000:09:00.0: [drm] GT0: Kernel-submitted job timed out [ 14.030084] WARNING: CPU: 3 PID: 564 at drivers/gpu/drm/xe/xe_guc_submit= .c:1181 guc_exec_queue_timedout_job+0x1c0/0xacc [xe] [ 14.041300] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_b= roadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_= reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) = nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6tabl= e_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf= _defrag_ipv4(E) rfkill(E) iptable_mangle(E) iptable_raw(E) iptable_security= (E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(= E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E)= nls_iso8859_1(E) snd_hda_intel(E) snd_intel_dspcfg(E) qrtr(E) nls_cp437(E)= snd_hda_codec(E) spi_loongson_pci(E) rtc_efi(E) snd_hda_core(E) loongson3_= cpufreq(E) spi_loongson_core(E) snd_hwdep(E) snd_pcm(E) snd_timer(E) snd(E)= soundcore(E) gpio_loongson_64bit(E) input_leds(E) rtc_loongson(E) i2c_ls2x= (E) mousedev(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables= (E) x_tables(E) xe(E) d rm_gpuvm(E) drm_buddy(E) gpu_sched(E) [ 14.041369] drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) ce= c(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) realtek(E) led_class= (E) loongson(E) i2c_algo_bit(E) drm_ttm_helper(E) ttm(E) drm_client_lib(E) = drm_kms_helper(E) sunrpc(E) i2c_dev(E) [ 14.153910] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G = E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7 [ 14.165325] Tainted: [E]=3DUNSIGNED_MODULE [ 14.169220] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EV= B/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-= prestab [ 14.182970] Workqueue: gt-ordered-wq 
drm_sched_job_timedout [gpu_sched] [ 14.189549] pc ffff8000024f3760 ra ffff8000024f3760 tp 900000012f150000 = sp 900000012f153ca0 [ 14.197853] a0 0000000000000000 a1 0000000000000000 a2 0000000000000000 = a3 0000000000000000 [ 14.206156] a4 0000000000000000 a5 0000000000000000 a6 0000000000000000 = a7 0000000000000000 [ 14.214458] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 = t3 0000000000000000 [ 14.222761] t4 0000000000000000 t5 0000000000000000 t6 0000000000000000 = t7 0000000000000000 [ 14.231064] t8 0000000000000000 u0 900000000195c0c8 s9 900000012e4dcf48 = s0 90000001285f3640 [ 14.239368] s1 90000001004f8000 s2 ffff8000026ec000 s3 0000000000000000 = s4 900000012e4dc028 [ 14.247672] s5 90000001009f5e00 s6 000000000000137e s7 0000000000000001 = s8 900000012f153ce8 [ 14.255975] ra: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0x= acc [xe] [ 14.263379] ERA: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0x= acc [xe] [ 14.270777] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=3DCC DACM=3DCC -WE) [ 14.276927] PRMD: 00000004 (PPLV0 +PIE -PWE) [ 14.281258] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) [ 14.286024] ECFG: 00071c1d (LIE=3D0,2-4,10-12 VS=3D7) [ 14.290790] ESTAT: 000c0000 [BRK] (IS=3D ECode=3D12 EsubCode=3D0) [ 14.296329] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV) [ 14.302299] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G = E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7 [ 14.302302] Tainted: [E]=3DUNSIGNED_MODULE [ 14.302302] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EV= B/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-= prestab [ 14.302304] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched] [ 14.302307] Stack : 900000012f153928 d84a6232d48f1ac7 900000000023eb34 9= 00000012f150000 [ 14.302310] 900000012f153900 0000000000000000 900000012f153908 9= 000000001c31c70 [ 14.302313] 0000000000000000 0000000000000000 0000000000000000 0= 000000000000000 [ 14.302315] 0000000000000000 
d84a6232d48f1ac7 0000000000000000 0= 000000000000000 [ 14.302318] 0000000000000000 0000000000000000 0000000000000000 0= 000000000000000 [ 14.302320] 0000000000000000 0000000000000000 00000000072b4000 9= 00000012e4dcf48 [ 14.302323] 9000000001eb8000 0000000000000000 9000000001c31c70 0= 000000000000004 [ 14.302325] 0000000000000004 0000000000000000 000000000000137e 0= 000000000000001 [ 14.302328] 900000012f153ce8 9000000001c31c70 9000000000244174 0= 000555581840b98 [ 14.302331] 00000000000000b0 0000000000000004 0000000000000000 0= 000000000071c1d [ 14.302333] ... [ 14.302335] Call Trace: [ 14.302336] [<9000000000244174>] show_stack+0x3c/0x16c [ 14.302341] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0 [ 14.302346] [<9000000000288208>] __warn+0x8c/0x174 [ 14.302350] [<90000000017c1918>] report_bug+0x1c0/0x22c [ 14.302354] [<90000000017f66e8>] do_bp+0x280/0x344 [ 14.302359] [ 14.302360] ---[ end trace 0000000000000000 ]--- Revise calculation of `RING_CTL_SIZE(size)' to use `SZ_4K' to fix the aforementioned issue. 
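The arithmetic behind this change can be sketched in user space. This is an illustration only, not driver code: the helper names `ring_ctl_size_old()', `ring_ctl_size_new()' and `ring_ctl_size_demo()' and the 16KiB ring size are hypothetical, and only the two subtraction expressions are taken from the diff in this patch.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K  0x1000u
#define SZ_16K 0x4000u

/* Old macro body: (size) - PAGE_SIZE, with the kernel page size passed in. */
static inline uint32_t ring_ctl_size_old(uint32_t size, uint32_t page_size)
{
	return size - page_size;
}

/* New macro body: (size) - SZ_4K, independent of the kernel page size. */
static inline uint32_t ring_ctl_size_new(uint32_t size)
{
	return size - SZ_4K;
}

/* Compare both forms for a hypothetical 16KiB ring. */
static inline void ring_ctl_size_demo(void)
{
	/* 4KiB kernel pages: old and new forms agree. */
	assert(ring_ctl_size_old(SZ_16K, SZ_4K) == ring_ctl_size_new(SZ_16K));
	/* 16KiB kernel pages: the old form collapses to 0. */
	assert(ring_ctl_size_old(SZ_16K, SZ_16K) == 0);
}
```

With 16KiB kernel pages, the old form yields a different register value than on a 4KiB-paged kernel for the same ring, which is consistent with the timed-out job shown in the log above.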
Cc: stable@vger.kernel.org Fixes: b79e8fd954c4 ("drm/xe: Remove dependency on intel_engine_regs.h") Tested-by: Mingcong Bai Tested-by: Wenbin Fang Tested-by: Haien Liang <27873200@qq.com> Tested-by: Jianfeng Liu Tested-by: Shirong Liu Tested-by: Haofeng Wu Link: https://github.com/FanFansfan/loongson-linux/commit/22c55ab3931c32410= a077b3ddb6dca3f28223360 Link: https://t.me/c/1109254909/768552 Co-developed-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Mingcong Bai Suggested-by: Kexy Biscuit --- drivers/gpu/drm/xe/regs/xe_engine_regs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h b/drivers/gpu/drm/xe/= regs/xe_engine_regs.h index 7ade41e2b7b3b4438a1955eb73e7d929d6bec24c..a7608c50c907e704404d029d909= f4b39130d29c7 100644 --- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h @@ -56,7 +56,7 @@ #define RING_START(base) XE_REG((base) + 0x38) =20 #define RING_CTL(base) XE_REG((base) + 0x3c) -#define RING_CTL_SIZE(size) ((size) - PAGE_SIZE) /* in bytes -> pages = */ +#define RING_CTL_SIZE(size) ((size) - SZ_4K) /* in bytes -> pages */ =20 #define RING_START_UDW(base) XE_REG((base) + 0x48) =20 --=20 2.49.0 From nobody Fri Oct 10 17:32:57 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2D2101D5175; Fri, 13 Jun 2025 01:11:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; cv=none; b=b/laHpG221BeILEqI069t6uuaQcmwhLAq5VKwf7HUGg9GXU/rye5BGS8/c3hjqoTBJecpRUbZND4LqH8E3U2nGLibUAWddUIYURE3VFkGB96pUygxS/gw9l2evv/YW/wKnwcyOaMT1Oa1od6htEQwSyOHITK5fbcd34x+HXro/M= ARC-Message-Signature: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; c=relaxed/simple; bh=22pOkiaoYKkKE9+FYGPOdRNTGjENCtQxx9n9TpcRYQw=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=iF7iWdrosm3nSyneokYsHN+AhGv/wYt4ex8rRUDByNPbqYqG8WpRi7dIaySv9rKmuSXZgLniq/Rfa78U6lV7FIHybwoEUyrCiShs7eea6QJ6PmH6WqVy2akBzqwY00+tJDNKbV4jqd1sWfPq0tza7ox8jdqpswbc9L9z7849/Ro= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=K7xmcNgG; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="K7xmcNgG" Received: by smtp.kernel.org (Postfix) with ESMTPS id D1FB5C4CEEB; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1749777091; bh=22pOkiaoYKkKE9+FYGPOdRNTGjENCtQxx9n9TpcRYQw=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=K7xmcNgGJceP5Ea/EI6S4Ae9f69XPMUzhS3hCUNJGwOh8mFkn3EiLevIQVzFOJm8P srXqvU+UuZSAcRmxl5f7544ZFCkX4hXdB/bv2JtaUQd+sCVN/zaCwJLnrB6VTHeM7t AuDznSMSaBbfZyWgq1SrBQQ4qE091QSW4C9aQBJmJIM2KEJdr5tMI7PF6D7Wtr7jc0 Gq9F/0TUtwG9JBV0LPS85JlmhIUzGmQCgYvfpTFYFDdc2/T7+1bKoCzUGl142n7mkj v86mwNjhTLCeERrgcx0T6opX9XOlmpHZSKVY+9IBTIWlbAZggiLspRehb5Ma93ayM5 TuKwfgD8OPS2g== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id C8653C71136; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) From: Mingcong Bai via B4 Relay Date: Fri, 13 Jun 2025 09:11:32 +0800 Subject: [PATCH v2 4/5] drm/xe: use 4KiB alignment for cursor jumps Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250613-upstream-xe-non-4k-v2-v2-4-934f82249f8a@aosc.io> 
References: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> In-Reply-To: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> To: Lucas De Marchi , =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Rodrigo Vivi , David Airlie , Simona Vetter , Francois Dugast , =?utf-8?q?Zbigniew_Kempczy=C5=84ski?= , =?utf-8?q?Jos=C3=A9_Roberto_de_Souza?= , Mauro Carvalho Chehab , Matthew Brost , Zhanjun Dong , Matt Roper , Alan Previn , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Mateusz Naklicki Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Kexy Biscuit , Shang Yatsen <429839446@qq.com>, Mingcong Bai , Wenbin Fang , Haien Liang <27873200@qq.com>, Jianfeng Liu , Shirong Liu , Haofeng Wu X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1749777090; l=7981; i=jeffbai@aosc.io; s=20250604; h=from:subject:message-id; bh=kV2+HI26h6dKnEaSLYBIKXKOHnIj4zNMp7DXKD0IyuY=; b=48LfUtXhEPCUahqI3iDktGvLSziHXj/NCM1hwuPhImZqPlmB5YP6R8E3mVMIvkAfuP1Sbe2z8 TkRywyRsSiYAyzrBxkKRGqHhreKqY3N3LJcwC3M1EmyVP4Nhbsd3YOw X-Developer-Key: i=jeffbai@aosc.io; a=ed25519; pk=MJdgklflDF+Xz9x2Lp+ogEnEyk8HRosMGiqLgWbFctY= X-Endpoint-Received: by B4 Relay for jeffbai@aosc.io/20250604 with auth_id=422 X-Original-From: Mingcong Bai Reply-To: jeffbai@aosc.io From: Mingcong Bai It appears that the xe_res_cursor also assumes 4KiB alignment. The current implementation uses `PAGE_SIZE' as an assumed alignment reference, but a 4KiB kernel page size is by no means guaranteed. 
On 16KiB-paged kernels, this causes driver failures during boot up: [ 23.242757] ------------[ cut here ]------------ [ 23.247363] WARNING: CPU: 0 PID: 2036 at drivers/gpu/drm/xe/xe_res_curso= r.h:182 emit_pte+0x394/0x3b0 [xe] [ 23.256962] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_b= roadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_= reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) = rfkill(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(= E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_= ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security= (E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(= E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E)= snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) nls_iso8859_1(E) qrt= r(E) nls_cp437(E) snd_hda_core(E) loongson3_cpufreq(E) rtc_efi(E) snd_hwdep= (E) snd_pcm(E) spi_loongson_pci(E) snd_timer(E) snd(E) spi_loongson_core(E)= soundcore(E) gpio_loongson_64bit(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E= ) input_leds(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables= (E) x_tables(E) xe(E) d rm_gpuvm(E) drm_buddy(E) gpu_sched(E) [ 23.257034] drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) ce= c(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) loongson(E) i2c_algo= _bit(E) realtek(E) drm_ttm_helper(E) led_class(E) ttm(E) drm_client_lib(E) = drm_kms_helper(E) sunrpc(E) i2c_dev(E) [ 23.369697] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G = E 6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8 [ 23.381640] Tainted: [E]=3DUNSIGNED_MODULE [ 23.385534] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EV= B/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-= prestab [ 23.399319] pc ffff80000251efc0 ra ffff80000251eddc tp 900000011fe3c000 = sp 900000011fe3f7e0 [ 23.407632] a0 
0000000000000001 a1 0000000000000000 a2 0000000000000000 = a3 0000000000000000 [ 23.415938] a4 0000000000000000 a5 0000000000000000 a6 0000000000060000 = a7 900000010c947b00 [ 23.424240] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 = t3 900000012e456230 [ 23.432543] t4 0000000000000035 t5 0000000000004000 t6 00000001fbc40403 = t7 0000000000004000 [ 23.440845] t8 9000000100e688a8 u0 5cc06cee8ef0edee s9 9000000100024420 = s0 0000000000000047 [ 23.449147] s1 0000000000004000 s2 0000000000000001 s3 900000012adba000 = s4 ffffffffffffc000 [ 23.457450] s5 9000000108939428 s6 0000000000000000 s7 0000000000000000 = s8 900000011fe3f8e0 [ 23.465851] ra: ffff80000251eddc emit_pte+0x1b0/0x3b0 [xe] [ 23.471761] ERA: ffff80000251efc0 emit_pte+0x394/0x3b0 [xe] [ 23.477557] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=3DCC DACM=3DCC -WE) [ 23.483732] PRMD: 00000004 (PPLV0 +PIE -PWE) [ 23.488068] EUEN: 00000003 (+FPE +SXE -ASXE -BTE) [ 23.492832] ECFG: 00071c1d (LIE=3D0,2-4,10-12 VS=3D7) [ 23.497594] ESTAT: 000c0000 [BRK] (IS=3D ECode=3D12 EsubCode=3D0) [ 23.503133] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV) [ 23.509164] CPU: 0 UID: 1000 PID: 2036 Comm: QSGRenderThread Tainted: G = E 6.14.0-rc4-aosc-main-g7cc07e6e50b0-dirty #8 [ 23.509168] Tainted: [E]=3DUNSIGNED_MODULE [ 23.509168] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EV= B/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-= prestab [ 23.509170] Stack : ffffffffffffffff ffffffffffffffff 900000000023eb34 9= 00000011fe3c000 [ 23.509176] 900000011fe3f440 0000000000000000 900000011fe3f448 9= 000000001c31c70 [ 23.509181] 0000000000000000 0000000000000000 0000000000000000 0= 000000000000000 [ 23.509185] 0000000000000000 5cc06cee8ef0edee 0000000000000000 0= 000000000000000 [ 23.509190] 0000000000000000 0000000000000000 0000000000000000 0= 000000000000000 [ 23.509193] 0000000000000000 0000000000000000 00000000066b4000 9= 000000100024420 [ 23.509197] 9000000001eb8000 
0000000000000000 9000000001c31c70 0= 000000000000004 [ 23.509202] 0000000000000004 0000000000000000 0000000000000000 0= 000000000000000 [ 23.509206] 900000011fe3f8e0 9000000001c31c70 9000000000244174 0= 0007fffac097534 [ 23.509211] 00000000000000b0 0000000000000004 0000000000000003 0= 000000000071c1d [ 23.509216] ... [ 23.509218] Call Trace: [ 23.509220] [<9000000000244174>] show_stack+0x3c/0x16c [ 23.509226] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0 [ 23.509230] [<9000000000288208>] __warn+0x8c/0x174 [ 23.509234] [<90000000017c1918>] report_bug+0x1c0/0x22c [ 23.509238] [<90000000017f66e8>] do_bp+0x280/0x344 [ 23.509243] [<90000000002428a0>] handle_bp+0x120/0x1c0 [ 23.509247] [] emit_pte+0x394/0x3b0 [xe] [ 23.509295] [] xe_migrate_clear+0x2d8/0xa54 [xe] [ 23.509341] [] xe_bo_move+0x324/0x930 [xe] [ 23.509387] [] ttm_bo_handle_move_mem+0xd0/0x194 [ttm] [ 23.509392] [] ttm_bo_validate+0xd4/0x1cc [ttm] [ 23.509396] [] ttm_bo_init_reserved+0x184/0x1dc [ttm] [ 23.509399] [] ___xe_bo_create_locked+0x1e8/0x3d4 [xe] [ 23.509445] [] __xe_bo_create_locked+0x2cc/0x390 [xe] [ 23.509489] [] xe_bo_create_user+0x34/0xe4 [xe] [ 23.509533] [] xe_gem_create_ioctl+0x154/0x4d8 [xe] [ 23.509578] [<9000000001062784>] drm_ioctl_kernel+0xe0/0x14c [ 23.509582] [<9000000001062c10>] drm_ioctl+0x420/0x5f4 [ 23.509585] [] xe_drm_ioctl+0x64/0xac [xe] [ 23.509630] [<9000000000653504>] sys_ioctl+0x2b8/0xf98 [ 23.509634] [<90000000017f684c>] do_syscall+0xa0/0x140 [ 23.509637] [<9000000000241e38>] handle_syscall+0xb8/0x158 [ 23.509640] [ 23.509644] ---[ end trace 0000000000000000 ]--- Revise the calls to `xe_res_dma()' and `xe_res_next()' in `emit_pte()' to use `XE_PTE_MASK' and `XE_PAGE_SIZE' (4KiB) to fix this potentially confused use of `PAGE_SIZE' in the relevant code. 
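The effect of the two substitutions in the diff (`~XE_PTE_MASK' for `PAGE_MASK', and `XE_PAGE_SIZE' for `PAGE_SIZE') can be sketched in user space. This is an illustration only: the helper names and the sample DMA address are hypothetical, and the `XE_PTE_SHIFT', `XE_PAGE_SIZE' and `XE_PTE_MASK' definitions below are assumed to mirror the driver's headers rather than copied from them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the xe definitions: a 4KiB GPU page. */
#define XE_PTE_SHIFT 12
#define XE_PAGE_SIZE (1ull << XE_PTE_SHIFT)
#define XE_PTE_MASK  (XE_PAGE_SIZE - 1)

/* Old behaviour: round the DMA address down to a kernel page boundary. */
static inline uint64_t pte_addr_old(uint64_t dma, uint64_t page_size)
{
	return dma & ~(page_size - 1);
}

/* New behaviour: round down to a 4KiB GPU page boundary. */
static inline uint64_t pte_addr_new(uint64_t dma)
{
	return dma & ~XE_PTE_MASK;
}

/* Cursor advance per emitted PTE: min(remaining, unit). */
static inline uint64_t cursor_step(uint64_t remaining, uint64_t unit)
{
	return remaining < unit ? remaining : unit;
}
```

For a hypothetical address such as 0x10003000, which is 4KiB-aligned but not 16KiB-aligned, the old masking on a 16KiB-paged kernel drops the low 0x3000, and the old step skips 16KiB of the resource for every 4KiB PTE written.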
Cc: stable@vger.kernel.org Fixes: e89b384cde62 ("drm/xe/migrate: Update emit_pte to cope with a size l= evel than 4k") Tested-by: Mingcong Bai Tested-by: Wenbin Fang Tested-by: Haien Liang <27873200@qq.com> Tested-by: Jianfeng Liu Tested-by: Shirong Liu Tested-by: Haofeng Wu Link: https://github.com/FanFansfan/loongson-linux/commit/22c55ab3931c32410= a077b3ddb6dca3f28223360 Link: https://t.me/c/1109254909/768552 Co-developed-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Mingcong Bai Suggested-by: Kexy Biscuit --- drivers/gpu/drm/xe/xe_migrate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrat= e.c index 8f8e9fdfb2a813dfc7619f626f919c3f70441527..74b887ec4edccccd65b8923b8c1= 70477ca28ed43 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c @@ -592,7 +592,7 @@ static void emit_pte(struct xe_migrate *m, u64 addr, flags =3D 0; bool devmem =3D false; =20 - addr =3D xe_res_dma(cur) & PAGE_MASK; + addr =3D xe_res_dma(cur) & ~XE_PTE_MASK; if (is_vram) { if (vm->flags & XE_VM_FLAG_64K) { u64 va =3D cur_ofs * XE_PAGE_SIZE / 8; @@ -613,7 +613,7 @@ static void emit_pte(struct xe_migrate *m, bb->cs[bb->len++] =3D lower_32_bits(addr); bb->cs[bb->len++] =3D upper_32_bits(addr); =20 - xe_res_next(cur, min_t(u32, size, PAGE_SIZE)); + xe_res_next(cur, min_t(u32, size, XE_PAGE_SIZE)); cur_ofs +=3D 8; } } --=20 2.49.0 From nobody Fri Oct 10 17:32:57 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 74A3A1D7E4A; Fri, 13 Jun 2025 01:11:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; 
cv=none; b=VG7SXRFmG/IqHaHFOHcISmL8drKw5hte0GB1jFllHrIuduBkxuYq3oYVLfjUoXWzRCnn/YeowssL5AsmAz1VRbu2Pgct36WEgI0JC5BTYfkmYAk9Rwp2s8JO2RJvsf4eD3q7vjPMgQ0E9axfpbtXu4l/oTh6kw/w0Lmsqrb9pp4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749777092; c=relaxed/simple; bh=fYjZRivWn2Fn+oWmzNXQqN7xuN63eEwOfSVBHQpA9yc=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=GOHhusVfQH31aCv5ywFY+7o8iF98Y6sBuz2DOLvmVPIh+dVV0NwwnZOxQAfqZ7UopNlEnOZCnIYCukeImJTuZtF56QShu5CgjmkwXMG035UWAzBbCnZSF1FJukuedC5zCXYp75QPzlVo6yBLC0WarS28pFh72DThZa/VQAk5ois= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=rtpVAedd; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="rtpVAedd" Received: by smtp.kernel.org (Postfix) with ESMTPS id E4E2EC4CEF7; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1749777091; bh=fYjZRivWn2Fn+oWmzNXQqN7xuN63eEwOfSVBHQpA9yc=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=rtpVAedd66sPrM+hY2oxPRDpPvlpiJObu0GevyQv0K3W7FSGgwJf3jgI8nZqCFvtn oP1zqeQTAbTSqS9MPDhf9sPqOKcu4RzJkIP7ONPyUw3P+Gv4RyH3rH4I4jep+dkIUS zsNn4uqIfj/4C/dbFOSZU74AoaSEk1d6hXgY9f9FSKJQbXDbUbXLQZDm4dqgkG9tHG 1Jze3d93xNlrVPXrYkr2Rbr13b/ksCl9g0MvOWHpFwuk+A1xQS/GUw7aNDDtmbxRAl 81EKT/BtkCE0/zEFEQ6BsfNfz9h8/XdQzkz4bpo8tenLzglELRAmLY9NvEaoNWBKP5 1mf0qMJJVmefg== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id D92E7C71148; Fri, 13 Jun 2025 01:11:31 +0000 (UTC) From: Mingcong Bai via B4 Relay Date: Fri, 13 Jun 2025 09:11:33 +0800 Subject: [PATCH v2 5/5] drm/xe/query: use PAGE_SIZE as the minimum page alignment Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org 
List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250613-upstream-xe-non-4k-v2-v2-5-934f82249f8a@aosc.io> References: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> In-Reply-To: <20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io> To: Lucas De Marchi , =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Rodrigo Vivi , David Airlie , Simona Vetter , Francois Dugast , =?utf-8?q?Zbigniew_Kempczy=C5=84ski?= , =?utf-8?q?Jos=C3=A9_Roberto_de_Souza?= , Mauro Carvalho Chehab , Matthew Brost , Zhanjun Dong , Matt Roper , Alan Previn , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Mateusz Naklicki Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Kexy Biscuit , Shang Yatsen <429839446@qq.com>, Mingcong Bai , Wenbin Fang , Haien Liang <27873200@qq.com>, Jianfeng Liu , Shirong Liu , Haofeng Wu X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1749777090; l=3034; i=jeffbai@aosc.io; s=20250604; h=from:subject:message-id; bh=gWL8RWOcZ+T5Uids7FBARXCOIx0PC8BeNHiC9X6HMCs=; b=Q0CPOLyStGLjhT6+x+gKKShN9k6MBkZEp9SYnxxZoRQbNmntY5WlakCsLQqecRGX4JzCl7Kmi swRCsXXAkwoCPO5anbatDazKDXA2TuDS/6jE0TkX9SncSqJk4cINnRb X-Developer-Key: i=jeffbai@aosc.io; a=ed25519; pk=MJdgklflDF+Xz9x2Lp+ogEnEyk8HRosMGiqLgWbFctY= X-Endpoint-Received: by B4 Relay for jeffbai@aosc.io/20250604 with auth_id=422 X-Original-From: Mingcong Bai Reply-To: jeffbai@aosc.io From: Mingcong Bai As this component hooks into userspace API, it should be assumed that it will play well with non-4KiB/64KiB pages. Use `PAGE_SIZE' as the final reference for page alignment instead. 
Cc: stable@vger.kernel.org Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Fixes: 801989b08aff ("drm/xe/uapi: Make constant comments visible in kernel= doc") Tested-by: Mingcong Bai Tested-by: Wenbin Fang Tested-by: Haien Liang <27873200@qq.com> Tested-by: Jianfeng Liu Tested-by: Shirong Liu Tested-by: Haofeng Wu Link: https://github.com/FanFansfan/loongson-linux/commit/22c55ab3931c32410= a077b3ddb6dca3f28223360 Link: https://t.me/c/1109254909/768552 Co-developed-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Shang Yatsen <429839446@qq.com> Signed-off-by: Mingcong Bai Suggested-by: Kexy Biscuit --- drivers/gpu/drm/xe/xe_query.c | 2 +- include/uapi/drm/xe_drm.h | 7 +++++-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c index 2dbf4066d86ff225eee002d352e1233c8d9519b9..fe94a7781fa04fb277d5cc8b973= b145293d3d066 100644 --- a/drivers/gpu/drm/xe/xe_query.c +++ b/drivers/gpu/drm/xe/xe_query.c @@ -346,7 +346,7 @@ static int query_config(struct xe_device *xe, struct dr= m_xe_device_query *query) config->info[DRM_XE_QUERY_CONFIG_FLAGS] |=3D DRM_XE_QUERY_CONFIG_FLAG_HAS_LOW_LATENCY; config->info[DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT] =3D - xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ? SZ_64K : SZ_4K; + xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ? 
SZ_64K : PAGE_SIZE; config->info[DRM_XE_QUERY_CONFIG_VA_BITS] =3D xe->info.va_bits; config->info[DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY] =3D xe_exec_queue_device_get_max_priority(xe); diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h index 9c08738c3b918ee387f51a68ba080057c6d5716f..f92eb8c3317a09baad4550024bb= 3beea02850010 100644 --- a/include/uapi/drm/xe_drm.h +++ b/include/uapi/drm/xe_drm.h @@ -397,8 +397,11 @@ struct drm_xe_query_mem_regions { * has low latency hint support * - %DRM_XE_QUERY_CONFIG_FLAG_HAS_CPU_ADDR_MIRROR - Flag is set if the * device has CPU address mirroring support - * - %DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - Minimal memory alignment - * required by this device, typically SZ_4K or SZ_64K + * - %DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - Minimal memory alignment requir= ed + * by this device and the CPU. The minimum page size for the device is + * usually SZ_4K or SZ_64K, while for the CPU, it is PAGE_SIZE. This va= lue + * is calculated by max(min_gpu_page_size, PAGE_SIZE). This alignment is + * enforced on buffer object allocations and VM binds. * - %DRM_XE_QUERY_CONFIG_VA_BITS - Maximum bits of a virtual address * - %DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY - Value of the highest * available exec queue priority --=20 2.49.0
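As a rough illustration of the value reported through `DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT' after this patch, the sketch below compares the ternary from `query_config()' with the max() wording in the updated uAPI comment. It is a user-space approximation only: the helper names `min_alignment()' and `min_alignment_max()' are hypothetical, and the kernel page size is passed in as a parameter instead of coming from `PAGE_SIZE'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K  0x1000u
#define SZ_16K 0x4000u
#define SZ_64K 0x10000u

/* Mirrors the patched expression: NEED64K ? SZ_64K : PAGE_SIZE. */
static inline uint32_t min_alignment(bool need_64k, uint32_t page_size)
{
	return need_64k ? SZ_64K : page_size;
}

/* The max(min_gpu_page_size, PAGE_SIZE) wording from the uAPI comment. */
static inline uint32_t min_alignment_max(uint32_t min_gpu_page_size,
					 uint32_t page_size)
{
	return min_gpu_page_size > page_size ? min_gpu_page_size : page_size;
}
```

For kernel page sizes of 4KiB, 16KiB and 64KiB the two forms give the same result, so userspace on a 16KiB-paged kernel without the 64KiB VRAM requirement would see 16KiB instead of 4KiB.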