From nobody Mon Feb 9 02:45:57 2026
From: Tomeu Vizoso
Date: Wed, 14 Jan 2026 09:46:48 +0100
Subject: [PATCH v2 1/5] arm64: dts: ti: k3-j722s-ti-ipc-firmware: Add memory pool for DSP i/o buffers
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20260114-thames-v2-1-e94a6636e050@tomeuvizoso.net>
References: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net>
In-Reply-To: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net>
To: Nishanth Menon, "Andrew F. Davis", Randolph Sapp, Jonathan Humphreys,
 Andrei Aldea, Chirag Shilwant, Vignesh Raghavendra, Tero Kristo,
 Rob Herring, Krzysztof Kozlowski, Conor Dooley, Oded Gabbay,
 Jonathan Corbet, Sumit Semwal, Christian König, Robert Nelson,
 David Airlie, Simona Vetter, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann
Cc: linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
 linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
 linux-doc@vger.kernel.org, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, Tomeu Vizoso
X-Mailer: b4 0.14.2

This memory region is used by the DRM/accel driver to allocate addresses for
buffers that are used for communication with the DSP cores and for their
intermediate results.

Signed-off-by: Tomeu Vizoso
---
 arch/arm64/boot/dts/ti/k3-j722s-ti-ipc-firmware.dtsi | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/ti/k3-j722s-ti-ipc-firmware.dtsi b/arch/arm64/boot/dts/ti/k3-j722s-ti-ipc-firmware.dtsi
index 3fbff927c4c08bce741555aa2753a394b751144f..b80d2a5a157ad59eaed8e57b22f1f4bce4765a85 100644
--- a/arch/arm64/boot/dts/ti/k3-j722s-ti-ipc-firmware.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-j722s-ti-ipc-firmware.dtsi
@@ -42,6 +42,11 @@ c7x_0_memory_region: memory@a3100000 {
 		no-map;
 	};
 
+	c7x_iova_pool: iommu-pool@a7000000 {
+		reg = <0x00 0xa7000000 0x00 0x18200000>;
+		no-map;
+	};
+
 	c7x_1_dma_memory_region: memory@a4000000 {
 		compatible = "shared-dma-pool";
 		reg = <0x00 0xa4000000 0x00 0x100000>;
@@ -151,13 +156,15 @@ &main_r5fss0_core0 {
 &c7x_0 {
 	mboxes = <&mailbox0_cluster2 &mbox_c7x_0>;
 	memory-region = <&c7x_0_dma_memory_region>,
-			<&c7x_0_memory_region>;
+			<&c7x_0_memory_region>,
+			<&c7x_iova_pool>;
 	status = "okay";
 };
 
 &c7x_1 {
 	mboxes = <&mailbox0_cluster3 &mbox_c7x_1>;
 	memory-region = <&c7x_1_dma_memory_region>,
-			<&c7x_1_memory_region>;
+			<&c7x_1_memory_region>,
+			<&c7x_iova_pool>;
 	status = "okay";
 };
-- 
2.52.0

From nobody Mon Feb 9 02:45:57 2026
From: Tomeu Vizoso
Date: Wed, 14 Jan 2026 09:46:49 +0100
Subject: [PATCH v2 2/5] accel/thames: Add driver for the C7x DSPs in TI SoCs
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20260114-thames-v2-2-e94a6636e050@tomeuvizoso.net>
References: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net>
In-Reply-To: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net>
To: Nishanth Menon, "Andrew F. Davis", Randolph Sapp, Jonathan Humphreys,
 Andrei Aldea, Chirag Shilwant, Vignesh Raghavendra, Tero Kristo,
 Rob Herring, Krzysztof Kozlowski, Conor Dooley, Oded Gabbay,
 Jonathan Corbet, Sumit Semwal, Christian König, Robert Nelson,
 David Airlie, Simona Vetter, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann
Cc: linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
 linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
 linux-doc@vger.kernel.org, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, Tomeu Vizoso
X-Mailer: b4 0.14.2

Some SoCs from Texas Instruments contain DSPs that can be used for general
compute tasks.

This driver provides a drm/accel UABI to userspace for submitting jobs to
the DSP cores and managing the input, output and intermediate memory.

Signed-off-by: Tomeu Vizoso
---
 Documentation/accel/thames/index.rst |  28 +++++
 MAINTAINERS                          |   9 ++
 drivers/accel/Kconfig                |   1 +
 drivers/accel/Makefile               |   3 +-
 drivers/accel/thames/Kconfig         |  26 +++++
 drivers/accel/thames/Makefile        |   9 ++
 drivers/accel/thames/thames_core.c   | 155 ++++++++++++++++++++++++++
 drivers/accel/thames/thames_core.h   |  53 +++++++++
 drivers/accel/thames/thames_device.c |  93 ++++++++++++++++
 drivers/accel/thames/thames_device.h |  46 ++++++++
 drivers/accel/thames/thames_drv.c    | 155 ++++++++++++++++++++++++++
 drivers/accel/thames/thames_drv.h    |  21 ++++
 drivers/accel/thames/thames_ipc.h    | 204 +++++++++++++++++++++++++++++++++++
 drivers/accel/thames/thames_rpmsg.c  | 155 ++++++++++++++++++++++++++
 drivers/accel/thames/thames_rpmsg.h  |  27 +++++
 15 files changed, 984 insertions(+), 1 deletion(-)

diff --git a/Documentation/accel/thames/index.rst b/Documentation/accel/thames/index.rst
new file mode 100644
index 0000000000000000000000000000000000000000..ca8391031f226f7ef1dc210a356c86acbe126c6f
--- /dev/null
+++ b/Documentation/accel/thames/index.rst
@@ -0,0 +1,28 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+=============================================================
+ accel/thames Driver for the C7x DSPs from Texas Instruments
+=============================================================
+
+The accel/thames driver supports the C7x DSPs inside some Texas Instruments SoCs
+such as the J722S. These can be used as accelerators for various workloads,
+including machine learning inference.
+
+This driver controls the power state of the hardware via :doc:`remoteproc `
+and communicates with the firmware running on the DSP via :doc:`rpmsg_virtio `.
+The kernel driver itself allocates buffers, manages contexts, and submits jobs
+to the DSP firmware. Buffers are mapped by the DSP itself using its MMU,
+providing memory isolation among different clients.
+
+The source code for the firmware running on the DSP is available at:
+https://gitlab.freedesktop.org/tomeu/thames_firmware/.
+
+Everything else is done in userspace, as a Gallium driver (also called thames)
+that is part of the Mesa3D project: https://docs.mesa3d.org/teflon.html
+
+If there is more than one core that advertises the same rpmsg_virtio service
+name, the driver will load balance jobs between them with drm-gpu-scheduler.
+
+Hardware currently supported:
+
+* J722S
diff --git a/MAINTAINERS b/MAINTAINERS
index dc731d37c8feeff25613c59fe9c929927dadaa7e..a3fc809c797269d0792dfe5202cc1b49f6ff57e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7731,6 +7731,15 @@ F:	Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
 F:	drivers/accel/rocket/
 F:	include/uapi/drm/rocket_accel.h
 
+DRM ACCEL DRIVER FOR TI C7x DSPS
+M:	Tomeu Vizoso
+L:	dri-devel@lists.freedesktop.org
+S:	Supported
+T:	git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F:	Documentation/accel/thames/
+F:	drivers/accel/thames/
+F:	include/uapi/drm/thames_accel.h
+
 DRM COMPUTE ACCELERATORS DRIVERS AND FRAMEWORK
 M:	Oded Gabbay
 L:	dri-devel@lists.freedesktop.org
diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
index bdf48ccafcf21b2fd685ec963e39e256196e6e17..cb49c71cd4e4a4220624f7041a75ba950a1a2ee1 100644
--- a/drivers/accel/Kconfig
+++ b/drivers/accel/Kconfig
@@ -30,5 +30,6 @@ source "drivers/accel/habanalabs/Kconfig"
 source "drivers/accel/ivpu/Kconfig"
 source "drivers/accel/qaic/Kconfig"
 source "drivers/accel/rocket/Kconfig"
+source "drivers/accel/thames/Kconfig"
 
 endif
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
index 1d3a7251b950f39e2ae600a2fc07a3ef7e41831e..8472989cbe22746f1e7292d2401fa0f7424a6c15 100644
--- a/drivers/accel/Makefile
+++ b/drivers/accel/Makefile
@@ -5,4 +5,5 @@ obj-$(CONFIG_DRM_ACCEL_ARM_ETHOSU)	+= ethosu/
 obj-$(CONFIG_DRM_ACCEL_HABANALABS)	+= habanalabs/
 obj-$(CONFIG_DRM_ACCEL_IVPU)		+= ivpu/
 obj-$(CONFIG_DRM_ACCEL_QAIC)		+= qaic/
-obj-$(CONFIG_DRM_ACCEL_ROCKET)		+= rocket/
\ No newline at end of file
+obj-$(CONFIG_DRM_ACCEL_ROCKET)		+= rocket/
+obj-$(CONFIG_DRM_ACCEL_THAMES)		+= thames/
\ No newline at end of file
diff --git a/drivers/accel/thames/Kconfig b/drivers/accel/thames/Kconfig
new file mode 100644
index 0000000000000000000000000000000000000000..50e0b6ac2a16a942ba8463333991f5b0161b99ac
--- /dev/null
+++ b/drivers/accel/thames/Kconfig
@@ -0,0 +1,26 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config DRM_ACCEL_THAMES
+	tristate "Thames (support for TI C7x DSP accelerators)"
+	depends on DRM_ACCEL
+	depends on TI_K3_R5_REMOTEPROC || COMPILE_TEST
+	depends on RPMSG
+	depends on MMU
+	select DRM_SCHED
+	select DRM_GEM_SHMEM_HELPER
+	help
+	  Choose this option if you have a Texas Instruments SoC that contains
+	  C7x DSP cores that can be used as compute accelerators. This includes
+	  SoCs such as the AM62A, J721E, J721S2, and J784S4.
+
+	  The C7x DSP cores can be used for general-purpose compute acceleration
+	  and are exposed through the DRM accel subsystem.
+
+	  The interface exposed to userspace is described in
+	  include/uapi/drm/thames_accel.h and is used by the Thames userspace
+	  driver in Mesa3D.
+
+	  If unsure, say N.
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called thames.
diff --git a/drivers/accel/thames/Makefile b/drivers/accel/thames/Makefile
new file mode 100644
index 0000000000000000000000000000000000000000..7ccd8204f0f5ea800f30e84b319f355be948109d
--- /dev/null
+++ b/drivers/accel/thames/Makefile
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_DRM_ACCEL_THAMES) := thames.o
+
+thames-y := \
+	thames_core.o \
+	thames_device.o \
+	thames_drv.o \
+	thames_rpmsg.o
diff --git a/drivers/accel/thames/thames_core.c b/drivers/accel/thames/thames_core.c
new file mode 100644
index 0000000000000000000000000000000000000000..92af1d68063116bcfa28a33960cbe829029fc1bf
--- /dev/null
+++ b/drivers/accel/thames/thames_core.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */
+
+#include "linux/remoteproc.h"
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "thames_core.h"
+#include "thames_device.h"
+#include "thames_rpmsg.h"
+
+/* Shift to convert bytes to megabytes (divide by 1048576) */
+#define THAMES_BYTES_TO_MB_SHIFT 20
+
+int thames_core_get_iova_range(struct rpmsg_device *rpdev, u64 *iova_start, u64 *iova_size)
+{
+	struct rproc *rproc;
+	struct device_node *of_node;
+	struct device_node *mem_node;
+	struct resource mem_res;
+	int err;
+
+	if (!iova_start || !iova_size)
+		return -EINVAL;
+
+	rproc = rproc_get_by_child(&rpdev->dev);
+	if (!rproc) {
+		dev_err(&rpdev->dev, "Failed to get rproc device\n");
+		return -ENODEV;
+	}
+
+	of_node = rproc->dev.parent->of_node;
+	put_device(&rproc->dev);
+
+	if (!of_node) {
+		dev_err(&rpdev->dev, "No device tree node found on rproc parent\n");
+		return -ENODEV;
+	}
+
+	/*
+	 * Read the IOVA pool range from the device tree node.
+	 * The third memory-region (index 2) defines the virtual address range.
+	 * The first two regions are typically:
+	 *   [0] = DMA memory region for remoteproc (physically contiguous)
+	 *   [1] = Code/data memory region for remoteproc (physically contiguous)
+	 *   [2] = Virtual address pool for BO mappings (firmware-managed MMU)
+	 */
+	mem_node = of_parse_phandle(of_node, "memory-region", 2);
+	if (!mem_node) {
+		dev_err(&rpdev->dev, "Missing third memory-region (DSP VA pool) in device tree\n");
+		return -EINVAL;
+	}
+
+	err = of_address_to_resource(mem_node, 0, &mem_res);
+	of_node_put(mem_node);
+	if (err) {
+		dev_err(&rpdev->dev, "Failed to get DSP VA pool range from memory-region[2]: %d\n",
+			err);
+		return err;
+	}
+
+	*iova_start = mem_res.start;
+	*iova_size = resource_size(&mem_res);
+
+	if (!*iova_size) {
+		dev_err(&rpdev->dev, "Invalid DSP VA pool size: 0\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int thames_core_validate_iova_range(struct thames_core *core)
+{
+	struct thames_device *tdev = core->tdev;
+	u64 iova_start, iova_size;
+	int err;
+
+	err = thames_core_get_iova_range(core->rpdev, &iova_start, &iova_size);
+	if (err)
+		return err;
+
+	if (iova_start != tdev->iova_start || iova_size != tdev->iova_size) {
+		dev_err(core->dev,
+			"Core %d IOVA range mismatch! Expected 0x%llx-0x%llx, got 0x%llx-0x%llx\n",
+			core->index, tdev->iova_start, tdev->iova_start + tdev->iova_size - 1,
+			iova_start, iova_start + iova_size - 1);
+		dev_err(core->dev,
+			"All cores must have the same memory-region[2] (IOVA pool) in device tree\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+int thames_core_init(struct thames_core *core)
+{
+	int err = 0;
+
+	err = thames_core_validate_iova_range(core);
+	if (err)
+		return err;
+
+	err = thames_rpmsg_init(core);
+	if (err)
+		return err;
+
+	err = thames_rpmsg_ping_test(core);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+void thames_core_fini(struct thames_core *core)
+{
+	thames_rpmsg_fini(core);
+}
+
+void thames_core_reset(struct thames_core *core)
+{
+	struct rpmsg_device *rpdev = core->rpdev;
+	struct rproc *rproc;
+	int ret;
+
+	dev_warn(core->dev, "Resetting DSP core %d", core->index);
+
+	if (!atomic_read(&core->reset.pending))
+		dev_warn(core->dev, "Reset called without reset.pending set\n");
+
+	rproc = rproc_get_by_child(&rpdev->dev);
+	if (!rproc) {
+		dev_err(core->dev, "Failed to get rproc for reset\n");
+		return;
+	}
+
+	ret = rproc_shutdown(rproc);
+	if (ret) {
+		dev_err(&rproc->dev, "Failed to shut down DSP: %d\n", ret);
+		goto put_rproc;
+	}
+
+	ret = rproc_boot(rproc);
+	if (ret)
+		dev_err(&rproc->dev, "Failed to boot DSP: %d\n", ret);
+
+put_rproc:
+	put_device(&rproc->dev);
+}
diff --git a/drivers/accel/thames/thames_core.h b/drivers/accel/thames/thames_core.h
new file mode 100644
index 0000000000000000000000000000000000000000..72c3d3d6c575f56cc1d8731d1c9dc958486dbf7f
--- /dev/null
+++ b/drivers/accel/thames/thames_core.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */
+
+#ifndef __THAMES_CORE_H__
+#define __THAMES_CORE_H__
+
+#include
+#include
+#include
+#include
+
+struct thames_msg_buffer_op;
+
+struct thames_core {
+	struct rpmsg_device *rpdev;
+	struct device *dev;
+	struct thames_device *tdev;
+	unsigned int index;
+
+	/* RPMSG communication context */
+	struct {
+		struct rpmsg_endpoint *endpoint;
+
+		struct {
+			u32 sequence;
+			u32 expected_data;
+			bool success;
+			struct completion completion;
+		} ping_test;
+	} rpmsg_ctx;
+
+	struct mutex job_lock;
+	struct thames_job *in_flight_job;
+
+	spinlock_t fence_lock;
+
+	struct {
+		struct workqueue_struct *wq;
+		struct work_struct work;
+		atomic_t pending;
+	} reset;
+
+	struct drm_gpu_scheduler sched;
+	u64 fence_context;
+	u64 emit_seqno;
+};
+
+int thames_core_init(struct thames_core *core);
+void thames_core_fini(struct thames_core *core);
+void thames_core_reset(struct thames_core *core);
+int thames_core_get_iova_range(struct rpmsg_device *rpdev, u64 *iova_start, u64 *iova_size);
+
+#endif
diff --git a/drivers/accel/thames/thames_device.c b/drivers/accel/thames/thames_device.c
new file mode 100644
index 0000000000000000000000000000000000000000..2b2aa32b07ee361ea388ab5ec781a13ff4359e5f
--- /dev/null
+++ b/drivers/accel/thames/thames_device.c
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "thames_device.h"
+
+/* Shift to convert bytes to megabytes (divide by 1048576) */
+#define THAMES_BYTES_TO_MB_SHIFT 20
+
+struct thames_device *thames_device_init(struct platform_device *pdev,
+
+					 const struct drm_driver *thames_drm_driver, u64 iova_start,
+					 u64 iova_size)
+{
+	struct device *dev = &pdev->dev;
+	struct thames_device *tdev;
+	struct drm_device *ddev;
+	int err;
+
+	tdev = devm_drm_dev_alloc(dev, thames_drm_driver, struct thames_device, ddev);
+	if (IS_ERR(tdev))
+		return tdev;
+
+	tdev->num_cores = 0;
+	ddev = &tdev->ddev;
+	dev_set_drvdata(dev, tdev);
+
+	dma_set_max_seg_size(dev, UINT_MAX);
+
+	err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
+	if (err)
+		return ERR_PTR(err);
+
+	err = devm_mutex_init(dev, &tdev->sched_lock);
+	if (err)
+		return ERR_PTR(-ENOMEM);
+
+	ida_init(&tdev->bo_ida);
+	ida_init(&tdev->ctx_ida);
+	ida_init(&tdev->job_ida);
+	ida_init(&tdev->ipc_seq_ida);
+
+	/*
+	 * Initialize shared virtual address space for all DSP cores.
+	 *
+	 * IMPORTANT: This driver does NOT use Linux IOMMU. The TI C7x DSP cores
+	 * have their own MMUs that are managed entirely by the DSP firmware.
+	 * The VA space is shared across all cores - userspace receives VAs that
+	 * work on all cores. Each core's firmware programs its own MMU to map
+	 * the same VA to the same PA.
+	 *
+	 * The Linux driver's role is only to:
+	 * 1. Allocate non-overlapping virtual addresses from a safe range
+	 * 2. Provide physical addresses to each DSP firmware via IPC
+	 * 3. Let each firmware program its own MMU to map VA -> PA
+	 */
+	if (!iova_size) {
+		dev_err(dev, "Invalid DSP VA pool size: 0\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	tdev->iova_start = iova_start;
+	tdev->iova_size = iova_size;
+
+	drm_mm_init(&tdev->mm, iova_start, iova_size);
+	err = devm_mutex_init(dev, &tdev->mm_lock);
+	if (err)
+		return ERR_PTR(-ENOMEM);
+
+	err = drm_dev_register(ddev, 0);
+	if (err)
+		return ERR_PTR(err);
+
+	return tdev;
+}
+
+void thames_device_fini(struct thames_device *tdev)
+{
+	WARN_ON(tdev->num_cores > 0);
+
+	ida_destroy(&tdev->bo_ida);
+	ida_destroy(&tdev->ctx_ida);
+	ida_destroy(&tdev->job_ida);
+	ida_destroy(&tdev->ipc_seq_ida);
+	drm_mm_takedown(&tdev->mm);
+	drm_dev_unregister(&tdev->ddev);
+}
diff --git a/drivers/accel/thames/thames_device.h b/drivers/accel/thames/thames_device.h
new file mode 100644
index 0000000000000000000000000000000000000000..c7d8e521d4323122134e8c8e8d256d957c89ae5f
--- /dev/null
+++ b/drivers/accel/thames/thames_device.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */
+
+#ifndef __THAMES_DEVICE_H__
+#define __THAMES_DEVICE_H__
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "thames_core.h"
+
+#define MAX_CORES 8
+
+struct thames_device {
+	struct drm_device ddev;
+
+	struct mutex sched_lock;
+
+	struct thames_core cores[MAX_CORES];
+	unsigned int num_cores;
+
+	struct ida bo_ida;
+	struct ida ctx_ida;
+	struct ida job_ida;
+	struct ida ipc_seq_ida;
+
+	struct drm_mm mm;
+	struct mutex mm_lock;
+
+	u64 iova_start;
+	u64 iova_size;
+};
+
+struct thames_device *thames_device_init(struct platform_device *pdev,
+					 const struct drm_driver *thames_drm_driver, u64 iova_start,
+					 u64 iova_size);
+void thames_device_fini(struct thames_device *rdev);
+
+#define to_thames_device(drm_dev) \
+	((struct thames_device *)(container_of((drm_dev), struct thames_device, ddev)))
+
+#endif /* __THAMES_DEVICE_H__ */
diff --git a/drivers/accel/thames/thames_drv.c b/drivers/accel/thames/thames_drv.c
new file mode 100644
index 0000000000000000000000000000000000000000..473498dd6f0135f346b0986a2a17fc4411417f52
--- /dev/null
+++ b/drivers/accel/thames/thames_drv.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "thames_drv.h"
+#include "thames_core.h"
+#include "thames_ipc.h"
+
+static struct platform_device *drm_dev;
+static struct thames_device *tdev;
+
+static int thames_open(struct drm_device *dev, struct drm_file *file)
+{
+	struct thames_device *tdev = to_thames_device(dev);
+	struct thames_file_priv *thames_priv;
+	int ret;
+
+	if (!try_module_get(THIS_MODULE))
+		return -EINVAL;
+
+	thames_priv = kzalloc(sizeof(*thames_priv), GFP_KERNEL);
+	if (!thames_priv) {
+		ret = -ENOMEM;
+		goto err_put_mod;
+	}
+
+	thames_priv->tdev = tdev;
+
+	file->driver_priv = thames_priv;
+
+	return 0;
+
+err_put_mod:
+	module_put(THIS_MODULE);
+	return ret;
+}
+
+static void thames_postclose(struct drm_device *dev, struct drm_file *file)
+{
+	struct thames_file_priv *thames_priv = file->driver_priv;
+
+	kfree(thames_priv);
+	module_put(THIS_MODULE);
+}
+
+static const struct drm_ioctl_desc thames_drm_driver_ioctls[] = {
+#define THAMES_IOCTL(n, func) DRM_IOCTL_DEF_DRV(THAMES_##n, thames_ioctl_##func, 0)
+
+};
+
+DEFINE_DRM_ACCEL_FOPS(thames_accel_driver_fops);
+
+static const struct drm_driver thames_drm_driver = {
+	.driver_features = DRIVER_COMPUTE_ACCEL | DRIVER_GEM,
+	.open = thames_open,
+	.postclose = thames_postclose,
+	.ioctls = thames_drm_driver_ioctls,
+	.num_ioctls = ARRAY_SIZE(thames_drm_driver_ioctls),
+	.fops = &thames_accel_driver_fops,
+	.name = "thames",
+	.desc = "thames DRM",
+};
+
+static int thames_probe(struct rpmsg_device *rpdev)
+{
+	u64 iova_start, iova_size;
+	unsigned int core;
+	int err;
+
+	if (!tdev) {
+		err = thames_core_get_iova_range(rpdev, &iova_start, &iova_size);
+		if (err)
+			return err;
+
+		tdev = thames_device_init(drm_dev, &thames_drm_driver, iova_start, iova_size);
+		if (IS_ERR(tdev)) {
+			dev_err(&rpdev->dev, "failed to initialize thames device\n");
+			return PTR_ERR(tdev);
+		}
+	}
+
+	core = tdev->num_cores;
+
+	tdev->cores[core].tdev = tdev;
+	tdev->cores[core].rpdev = rpdev;
+	tdev->cores[core].dev = &rpdev->dev;
+	tdev->cores[core].index = core;
+
+	tdev->num_cores++;
+
+	return thames_core_init(&tdev->cores[core]);
+}
+
+static void thames_remove(struct rpmsg_device *rpdev)
+{
+	unsigned int core;
+
+	for (core = 0; core < tdev->num_cores; core++) {
+		if (tdev->cores[core].rpdev == rpdev) {
+			thames_core_fini(&tdev->cores[core]);
+			tdev->num_cores--;
+			break;
+		}
+	}
+
+	if (!tdev->num_cores) {
+		thames_device_fini(tdev);
+		tdev = NULL;
+	}
+}
+
+static const struct rpmsg_device_id thames_rpmsg_id_table[] = { { .name = THAMES_SERVICE_NAME },
+								{} };
+
+static struct rpmsg_driver thames_rpmsg_driver = {
+	.drv = {
+		.name = "thames",
+		.owner = THIS_MODULE,
+	},
+	.id_table = thames_rpmsg_id_table,
+	.probe = thames_probe,
+	.remove = thames_remove,
+};
+
+static int __init thames_register(void)
+{
+	drm_dev = platform_device_register_simple("thames", -1, NULL, 0);
+	if (IS_ERR(drm_dev))
+		return PTR_ERR(drm_dev);
+
+	return register_rpmsg_driver(&thames_rpmsg_driver);
+}
+
+static void __exit thames_unregister(void)
+{
+	unregister_rpmsg_driver(&thames_rpmsg_driver);
+
+	platform_device_unregister(drm_dev);
+}
+
+module_init(thames_register);
+module_exit(thames_unregister);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("DRM driver for Texas Instrument's C7x accelerator cores");
+MODULE_AUTHOR("Tomeu Vizoso");
+MODULE_ALIAS("rpmsg:" THAMES_SERVICE_NAME);
diff --git a/drivers/accel/thames/thames_drv.h b/drivers/accel/thames/thames_drv.h
new file mode 100644
index 0000000000000000000000000000000000000000..e03203eab8b88686ca91c10b45e55df1ea3d2e77
--- /dev/null
+++ b/drivers/accel/thames/thames_drv.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */
+
+#ifndef __THAMES_DRV_H__
+#define __THAMES_DRV_H__
+
+#include
+#include
+
+#include "thames_device.h"
+
+struct thames_file_priv {
+	struct thames_device *tdev;
+
+	struct drm_sched_entity sched_entity;
+
+	u32 context_id;
+	bool context_valid;
+};
+
+#endif
diff --git a/drivers/accel/thames/thames_ipc.h b/drivers/accel/thames/thames_ipc.h
new file mode 100644
index 0000000000000000000000000000000000000000..60297b4bc2ffd990315cb735a96a23429d390f43
--- /dev/null
+++ b/drivers/accel/thames/thames_ipc.h
@@ -0,0 +1,204 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/
+ *
+ * This header defines the RPMSG message structures exchanged between
+ * the Linux kernel (host) and the C7x DSP (remote) firmware for the
+ * Thames DRM/accel driver.
+ */
+
+#ifndef _THAMES_IPC_H
+#define _THAMES_IPC_H
+
+#ifdef __KERNEL__
+#include
+#else
+#include
+typedef uint8_t __u8;
+typedef uint16_t __u16;
+typedef uint32_t __u32;
+typedef uint64_t __u64;
+#endif
+
+#define THAMES_SERVICE_NAME "thames-service"
+
+/**
+ * @THAMES_MSG_TYPE: Simplified message type enumeration
+ */
+enum thames_msg_type {
+	/* --- Host (Kernel) -> Remote (DSP) --- */
+	THAMES_MSG_PING = 0x100, /* Ping message to test communication */
+	THAMES_MSG_CONTEXT_OP, /* Create/destroy context */
+	THAMES_MSG_BO_OP, /* Map/unmap buffer objects */
+	THAMES_MSG_SUBMIT_JOB, /* Submit job for execution */
+
+	/* --- Remote (DSP) -> Host (Kernel) --- */
+	THAMES_MSG_PING_RESPONSE = 0x200,
+	THAMES_MSG_CONTEXT_OP_RESPONSE,
+	THAMES_MSG_BO_OP_RESPONSE,
+	THAMES_MSG_SUBMIT_JOB_RESPONSE,
+};
+
+/**
+ * @THAMES_CONTEXT_OP: Context operation types
+ */
+enum thames_context_op {
+	THAMES_CONTEXT_CREATE = 0,
+	THAMES_CONTEXT_DESTROY,
+};
+
+/**
+ * @THAMES_BO_OP: Buffer Object operation types
+ */
+enum thames_bo_op {
+	THAMES_BO_MAP = 0,
+	THAMES_BO_UNMAP,
+};
+
+/**
+ * @THAMES_RESP_STATUS: Response status codes
+ */
+enum thames_resp_status {
+	THAMES_RESP_SUCCESS = 0,
+	THAMES_RESP_ERR_GENERIC = 1,
+	THAMES_RESP_ERR_NOMEM = 2,
+	THAMES_RESP_ERR_INVAL = 3,
+	THAMES_RESP_ERR_NO_CTX = 4,
+	THAMES_RESP_ERR_MMU = 5,
+	THAMES_RESP_ERR_JOB_TIMEOUT = 6,
+};
+
+/**
+ * struct thames_msg_hdr - Common header for all RPMSG messages
+ * @type: Message type from enum thames_msg_type
+ * @seq: Sequence number for request/response matching
+ * @len: Total message length including header
+ */
+struct thames_msg_hdr {
+	__u32 type;
+	__u32 seq;
+	__u32 len;
+	__u32 reserved;
+};
+
+/*
+ * ===================================================================
+ * Host (Kernel) -> Remote (DSP) Messages
+ * ===================================================================
+ */
+
+/**
+ * struct thames_msg_ping - Ping message to test communication
+ * @hdr: Common message header
+ * @ping_data: Optional ping data (timestamp, sequence, etc.)
+ */
+struct thames_msg_ping {
+	struct thames_msg_hdr hdr;
+	__u32 ping_data;
+};
+
+/**
+ * struct thames_msg_context_op - Context create/destroy operations
+ * @hdr: Common message header
+ * @op: Operation type (CREATE/DESTROY)
+ * @context_id: Context ID
+ */
+struct thames_msg_context_op {
+	struct thames_msg_hdr hdr;
+	uint32_t op; /* enum thames_context_op */
+	uint32_t context_id;
+};
+
+/**
+ * struct thames_msg_bo_op - Buffer Object map/unmap operations
+ * @hdr: Common message header
+ * @op: Operation type (MAP/UNMAP)
+ * @context_id: Context ID that this BO belongs to
+ * @bo_id: Buffer Object ID for tracking
+ * @vaddr: Virtual address where BO should be mapped on DSP
+ * @paddr: Physical address of the BO
+ * @size: Size of the BO in bytes
+ */
+struct thames_msg_bo_op {
+	struct thames_msg_hdr hdr;
+	uint32_t op; /* enum thames_bo_op */
+	uint32_t context_id;
+	uint32_t bo_id;
+	uint64_t vaddr;
+	uint64_t paddr;
+	uint64_t size;
+};
+
+/**
+ * struct thames_msg_submit_job - Submit job for execution
+ * @hdr: Common message header
+ * @context_id: Context to run job in
+ * @job_id: Host-generated job tracking ID
+ * @kernel_iova: IOVA of kernel code BO (first byte = first instruction)
+ * @kernel_size: Size of kernel code in bytes
+ * @args_iova: IOVA of arguments BO (array of uint64_t values)
+ * @args_size: Size of arguments BO in bytes
+ */
+struct thames_msg_submit_job {
+	struct thames_msg_hdr hdr;
+	uint32_t context_id;
+	uint32_t job_id;
+	uint64_t kernel_iova;
+	uint64_t kernel_size;
+	uint64_t args_iova;
+	uint64_t args_size;
+};
+
+/*
+ * ===================================================================
+ * Remote (DSP) -> Host (Kernel) Messages
+ * ===================================================================
+ */
+
+/**
+ * struct thames_msg_response - Generic response to commands
+ * @hdr: Common message header (seq matches request)
+ * @status: Status code from enum thames_resp_status
+ * @data: Optional response data (context-dependent)
+ */
+struct thames_msg_response {
+	struct thames_msg_hdr hdr;
+	uint32_t status;
+	uint32_t data;
+};
+
+/*
+ * ===================================================================
+ * Buffer Size Calculations
+ * ===================================================================
+ */
+
+/* Calculate the maximum message size by finding the largest structure */
+#define THAMES_MSG_SIZE_PING sizeof(struct thames_msg_ping)
+#define THAMES_MSG_SIZE_CONTEXT_OP sizeof(struct thames_msg_context_op)
+#define THAMES_MSG_SIZE_BO_OP sizeof(struct thames_msg_bo_op)
+#define THAMES_MSG_SIZE_SUBMIT_JOB sizeof(struct thames_msg_submit_job)
+#define THAMES_MSG_SIZE_RESPONSE sizeof(struct thames_msg_response)
+
+/* Helper macros to find maximum of multiple values */
+#define THAMES_MAX2(a, b) ((a) > (b) ?
(a) : (b)) +#define THAMES_MAX3(a, b, c) THAMES_MAX2(THAMES_MAX2(a, b), c) +#define THAMES_MAX5(a, b, c, d, e) THAMES_MAX2(THAMES_MAX3(a, b, c), THAME= S_MAX2(d, e)) + +/* Maximum size of any Thames IPC message */ +#define THAMES_IPC_MAX_MSG_SIZE = \ + THAMES_MAX5(THAMES_MSG_SIZE_PING, THAMES_MSG_SIZE_CONTEXT_OP, THAMES_MSG_= SIZE_BO_OP, \ + THAMES_MSG_SIZE_SUBMIT_JOB, THAMES_MSG_SIZE_RESPONSE) + +/* RPMSG buffer size - should accommodate largest message + some padding */ +#define THAMES_RPMSG_BUFFER_SIZE ((THAMES_IPC_MAX_MSG_SIZE + 15) & ~15) /*= 16-byte aligned */ + +/* Compile-time size checks - use BUILD_BUG_ON in kernel code */ +#ifdef __KERNEL__ +#define THAMES_ASSERT_MSG_SIZE(msg_type) BUILD_BUG_ON(sizeof(struct msg_ty= pe) > 64) +#else +#define THAMES_ASSERT_MSG_SIZE(msg_type) \ + _Static_assert(sizeof(struct msg_type) <=3D 64, #msg_type " too large") +#endif + +#endif /* _THAMES_IPC_H */ diff --git a/drivers/accel/thames/thames_rpmsg.c b/drivers/accel/thames/tha= mes_rpmsg.c new file mode 100644 index 0000000000000000000000000000000000000000..ebc34f49353e5e7959734da8e8a= 935573c130e79 --- /dev/null +++ b/drivers/accel/thames/thames_rpmsg.c @@ -0,0 +1,155 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */ + +#include +#include +#include +#include +#include + +#include "thames_rpmsg.h" +#include "thames_core.h" +#include "thames_device.h" +#include "thames_ipc.h" + +#define THAMES_PING_TEST_PATTERN 0xDEADBEEF +#define THAMES_PING_TIMEOUT_MS 5000 + +static int thames_rpmsg_callback(struct rpmsg_device *rpdev, void *data, i= nt len, void *priv, + u32 src) +{ + struct thames_msg_hdr *hdr =3D (struct thames_msg_hdr *)data; + struct thames_core *core =3D priv; + + dev_dbg(&rpdev->dev, "Received response on core %d with length %d\n", cor= e->index, len); + + if (len < sizeof(struct thames_msg_hdr)) { + dev_err(&rpdev->dev, "Received message too short: %d bytes", len); + return -EINVAL; + } + + 
switch (hdr->type) { + case THAMES_MSG_PING_RESPONSE: { + struct thames_msg_response *response =3D (struct thames_msg_response *)d= ata; + + dev_dbg(&rpdev->dev, + "Received PING response: status=3D%u, data=3D0x%x, expected_data=3D0x%x= , seq=3D%u, expected_seq=3D%u\n", + response->status, response->data, core->rpmsg_ctx.ping_test.expected_da= ta, + hdr->seq, core->rpmsg_ctx.ping_test.sequence); + + if (hdr->seq !=3D core->rpmsg_ctx.ping_test.sequence) { + dev_err(&rpdev->dev, + "PING response sequence mismatch: got %u, expected %u\n", hdr->seq, + core->rpmsg_ctx.ping_test.sequence); + ida_free(&core->tdev->ipc_seq_ida, hdr->seq); + return -EINVAL; + } + + if (response->data !=3D core->rpmsg_ctx.ping_test.expected_data) { + dev_err(&rpdev->dev, + "PING response data mismatch: got 0x%x, expected 0x%x\n", + response->data, core->rpmsg_ctx.ping_test.expected_data); + core->rpmsg_ctx.ping_test.success =3D false; + complete(&core->rpmsg_ctx.ping_test.completion); + ida_free(&core->tdev->ipc_seq_ida, hdr->seq); + return -EINVAL; + } + + core->rpmsg_ctx.ping_test.success =3D (response->status =3D=3D THAMES_RE= SP_SUCCESS); + complete(&core->rpmsg_ctx.ping_test.completion); + + ida_free(&core->tdev->ipc_seq_ida, hdr->seq); + + break; + } + + default: + dev_warn(&rpdev->dev, "Unknown message type: %u\n", hdr->type); + break; + } + + return 0; +} + +static int thames_rpmsg_send_raw(struct thames_core *core, const void *dat= a, size_t len) +{ + if (!core->rpmsg_ctx.endpoint) { + dev_err(core->dev, "RPMSG endpoint not available"); + return -ENODEV; + } + + return rpmsg_send(core->rpmsg_ctx.endpoint, (void *)data, len); +} + +int thames_rpmsg_init(struct thames_core *core) +{ + struct rpmsg_device *rpdev =3D core->rpdev; + struct rpmsg_channel_info chinfo =3D {}; + + strscpy(chinfo.name, rpdev->id.name, sizeof(chinfo.name)); + chinfo.src =3D RPMSG_ADDR_ANY; /* Let rpmsg assign an address */ + chinfo.dst =3D RPMSG_ADDR_ANY; + + core->rpmsg_ctx.endpoint =3D 
rpmsg_create_ept(rpdev, thames_rpmsg_callbac= k, core, chinfo); + if (!core->rpmsg_ctx.endpoint) { + dev_err(core->dev, "Failed to create RPMSG endpoint for core %d", core->= index); + return -ENODEV; + } + + return 0; +} + +void thames_rpmsg_fini(struct thames_core *core) +{ + if (core->rpmsg_ctx.endpoint) { + rpmsg_destroy_ept(core->rpmsg_ctx.endpoint); + core->rpmsg_ctx.endpoint =3D NULL; + } +} + +int thames_rpmsg_send_ping(struct thames_core *core, u32 ping_data, u32 *s= equence) +{ + struct thames_msg_ping ping_msg =3D {}; + + ping_msg.hdr.type =3D THAMES_MSG_PING; + ping_msg.hdr.seq =3D ida_alloc(&core->tdev->ipc_seq_ida, GFP_KERNEL); + ping_msg.hdr.len =3D sizeof(ping_msg); + ping_msg.hdr.reserved =3D 0; + ping_msg.ping_data =3D ping_data; + + *sequence =3D ping_msg.hdr.seq; + + return thames_rpmsg_send_raw(core, &ping_msg, sizeof(ping_msg)); +} + +int thames_rpmsg_ping_test(struct thames_core *core) +{ + const u32 test_data =3D THAMES_PING_TEST_PATTERN; + int ret; + unsigned long timeout; + + core->rpmsg_ctx.ping_test.expected_data =3D test_data; + core->rpmsg_ctx.ping_test.success =3D false; + init_completion(&core->rpmsg_ctx.ping_test.completion); + + ret =3D thames_rpmsg_send_ping(core, test_data, &core->rpmsg_ctx.ping_tes= t.sequence); + if (ret) { + dev_err(core->dev, "Failed to send PING message to core %d: %d", core->i= ndex, ret); + return ret; + } + + timeout =3D msecs_to_jiffies(THAMES_PING_TIMEOUT_MS); + ret =3D wait_for_completion_timeout(&core->rpmsg_ctx.ping_test.completion= , timeout); + if (ret =3D=3D 0) { + dev_err(core->dev, "PING test timed out - DSP core %d not responding", c= ore->index); + return -ETIMEDOUT; + } + + if (!core->rpmsg_ctx.ping_test.success) { + dev_err(core->dev, "PING test failed - incorrect PONG response from DSP = core %d", + core->index); + return -EIO; + } + + return 0; +} diff --git a/drivers/accel/thames/thames_rpmsg.h b/drivers/accel/thames/tha= mes_rpmsg.h new file mode 100644 index 
0000000000000000000000000000000000000000..6d5195453b8d3eac2c333b7ac03= e469b2744fb78 --- /dev/null +++ b/drivers/accel/thames/thames_rpmsg.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */ + +#ifndef __THAMES_RPMSG_H__ +#define __THAMES_RPMSG_H__ + +#include +#include + +struct thames_core; + +int thames_rpmsg_init(struct thames_core *core); +void thames_rpmsg_fini(struct thames_core *core); + +int thames_rpmsg_send_ping(struct thames_core *core, u32 ping_data, u32 *s= equence); +int thames_rpmsg_send_create_context(struct thames_core *core, u32 context= _id); +int thames_rpmsg_send_destroy_context(struct thames_core *core, u32 contex= t_id); +int thames_rpmsg_send_map_bo(struct thames_core *core, u32 context_id, u32= bo_id, u64 vaddr, + u64 paddr, u64 size); +int thames_rpmsg_send_unmap_bo(struct thames_core *core, u32 context_id, u= 32 bo_id); +int thames_rpmsg_send_submit_job(struct thames_core *core, u32 context_id,= u32 job_id, + u64 kernel_iova, u64 kernel_size, u64 args_iova, u64 args_size, + u32 *sequence); + +int thames_rpmsg_ping_test(struct thames_core *core); + +#endif /* __THAMES_RPMSG_H__ */ --=20 2.52.0
From nobody Mon Feb 9 02:45:57 2026 From: Tomeu Vizoso Date: Wed, 14 Jan 2026 09:46:50 +0100 Subject: [PATCH v2 3/5] accel/thames: Add IOCTLs for BO creation and mapping X-Mailing-List: linux-kernel@vger.kernel.org MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260114-thames-v2-3-e94a6636e050@tomeuvizoso.net> References: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net> In-Reply-To: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net> To: Nishanth Menon , "Andrew F.
Davis", Randolph Sapp, Jonathan Humphreys, Andrei Aldea, Chirag Shilwant, Vignesh Raghavendra, Tero Kristo, Rob Herring, Krzysztof Kozlowski, Conor Dooley, Oded Gabbay, Jonathan Corbet, Sumit Semwal, Christian König, Robert Nelson, David Airlie, Simona Vetter, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann
Cc: linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Tomeu Vizoso
X-Mailer: b4 0.14.2

Use the DRM SHMEM helpers and map each buffer to the device at creation time, as all created buffers are expected to be accessed by the DSPs. We map to all DSPs because we cannot know upfront which DSP cores will run a given job.

Buffers are mapped for the device by the DSPs themselves, as each contains an MMU.

Buffers belong to a context, which the DSP uses to switch to the page table that maps the buffers of the user whose job is about to execute.

v2:
- Add thames_accel.h UAPI header (Robert Nelson).
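For readers who want to see how userspace might drive the two IOCTLs described above, here is a minimal sketch in C. It is an illustration under stated assumptions, not part of the patch: the sketch_* structs are local mirrors of struct drm_thames_bo_create and struct drm_thames_bo_mmap_offset, and the request numbers are computed by hand assuming the usual DRM ioctl base 'd' and DRM_COMMAND_BASE 0x40. Real code would include the thames_accel.h UAPI header and use DRM_IOCTL_THAMES_BO_CREATE and DRM_IOCTL_THAMES_BO_MMAP_OFFSET instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Hypothetical local mirrors of the UAPI structs added by this patch. */
struct sketch_bo_create {
	uint64_t size;   /* in: requested size, out: page-aligned size */
	uint64_t iova;   /* out: address in the DSPs' address space */
	uint32_t handle; /* out: GEM handle (nonzero) */
	uint32_t pad;    /* must be zero */
};

struct sketch_bo_mmap_offset {
	uint32_t handle; /* in: GEM handle */
	uint32_t pad;    /* must be zero */
	uint64_t offset; /* out: fake offset to pass to mmap() */
};

/* The layouts must match the UAPI header: 24 and 16 bytes. */
static_assert(sizeof(struct sketch_bo_create) == 24, "unexpected layout");
static_assert(sizeof(struct sketch_bo_mmap_offset) == 16, "unexpected layout");

/* Assumption: DRM ioctl type 'd', driver commands start at 0x40. */
#define SKETCH_IOCTL_BO_CREATE \
	_IOWR('d', 0x40 + 0, struct sketch_bo_create)
#define SKETCH_IOCTL_BO_MMAP_OFFSET \
	_IOWR('d', 0x40 + 1, struct sketch_bo_mmap_offset)

/*
 * Create a buffer object of at least @size bytes and map it into the
 * calling process. On success, returns the CPU mapping and fills @out
 * with the handle, aligned size and DSP-side IOVA. Returns MAP_FAILED
 * on any error, with errno set by the failing call.
 */
static void *sketch_bo_alloc_and_map(int fd, uint64_t size,
				     struct sketch_bo_create *out)
{
	struct sketch_bo_create create;
	struct sketch_bo_mmap_offset off;
	void *ptr;

	/* handle and iova must be zero on input. */
	memset(&create, 0, sizeof(create));
	create.size = size;
	if (ioctl(fd, SKETCH_IOCTL_BO_CREATE, &create) < 0)
		return MAP_FAILED;

	memset(&off, 0, sizeof(off));
	off.handle = create.handle;
	if (ioctl(fd, SKETCH_IOCTL_BO_MMAP_OFFSET, &off) < 0)
		return MAP_FAILED;

	ptr = mmap(NULL, create.size, PROT_READ | PROT_WRITE, MAP_SHARED,
		   fd, (off_t)off.offset);
	if (ptr != MAP_FAILED)
		*out = create;

	return ptr;
}
```

A caller would open the accel device node (presumably something like /dev/accel/accel0, which is an assumption here), call sketch_bo_alloc_and_map(fd, 4096, &bo), fill the mapping from the CPU, and later hand bo.iova to the DSP. The sketch leaves out cleanup (munmap and closing the GEM handle) to stay short.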
Signed-off-by: Tomeu Vizoso --- drivers/accel/thames/Makefile | 1 + drivers/accel/thames/thames_drv.c | 6 +- drivers/accel/thames/thames_gem.c | 353 ++++++++++++++++++++++++++++++++= ++++ drivers/accel/thames/thames_gem.h | 41 +++++ drivers/accel/thames/thames_rpmsg.c | 69 +++++++ include/uapi/drm/thames_accel.h | 104 +++++++++++ 6 files changed, 573 insertions(+), 1 deletion(-) diff --git a/drivers/accel/thames/Makefile b/drivers/accel/thames/Makefile index 7ccd8204f0f5ea800f30e84b319f355be948109d..0051e319f2e4966de72bc342d5b= 6e40b2890c006 100644 --- a/drivers/accel/thames/Makefile +++ b/drivers/accel/thames/Makefile @@ -6,4 +6,5 @@ thames-y :=3D \ thames_core.o \ thames_device.o \ thames_drv.o \ + thames_gem.o \ thames_rpmsg.o diff --git a/drivers/accel/thames/thames_drv.c b/drivers/accel/thames/thame= s_drv.c index 473498dd6f0135f346b0986a2a17fc4411417f52..d9ea2cab80e89cd13b1422a1763= 5a15b7f16fa4f 100644 --- a/drivers/accel/thames/thames_drv.c +++ b/drivers/accel/thames/thames_drv.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include #include @@ -12,6 +13,7 @@ =20 #include "thames_drv.h" #include "thames_core.h" +#include "thames_gem.h" #include "thames_ipc.h" =20 static struct platform_device *drm_dev; @@ -53,7 +55,8 @@ static void thames_postclose(struct drm_device *dev, stru= ct drm_file *file) =20 static const struct drm_ioctl_desc thames_drm_driver_ioctls[] =3D { #define THAMES_IOCTL(n, func) DRM_IOCTL_DEF_DRV(THAMES_##n, thames_ioctl_#= #func, 0) - + THAMES_IOCTL(BO_CREATE, bo_create), + THAMES_IOCTL(BO_MMAP_OFFSET, bo_mmap_offset), }; =20 DEFINE_DRM_ACCEL_FOPS(thames_accel_driver_fops); @@ -62,6 +65,7 @@ static const struct drm_driver thames_drm_driver =3D { .driver_features =3D DRIVER_COMPUTE_ACCEL | DRIVER_GEM, .open =3D thames_open, .postclose =3D thames_postclose, + .gem_create_object =3D thames_gem_create_object, .ioctls =3D thames_drm_driver_ioctls, .num_ioctls =3D ARRAY_SIZE(thames_drm_driver_ioctls), .fops =3D 
&thames_accel_driver_fops, diff --git a/drivers/accel/thames/thames_gem.c b/drivers/accel/thames/thame= s_gem.c new file mode 100644 index 0000000000000000000000000000000000000000..5a01ddaeb2448117d400a79e53d= 2c6123ecb5390 --- /dev/null +++ b/drivers/accel/thames/thames_gem.c @@ -0,0 +1,353 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */ + +#include "drm/drm_gem_shmem_helper.h" +#include +#include +#include +#include +#include +#include +#include + +#include "thames_gem.h" +#include "thames_device.h" +#include "thames_drv.h" +#include "thames_rpmsg.h" + +/* + * DSP MMU permission flags for buffer object mappings. + * These control read/write/execute permissions in the DSP's address space. + */ +#define THAMES_BO_PERM_READ (1 << 0) +#define THAMES_BO_PERM_WRITE (1 << 1) +#define THAMES_BO_PERM_EXEC (1 << 2) +#define THAMES_BO_PERM_RWX (THAMES_BO_PERM_READ | THAMES_BO_PERM_WRITE | T= HAMES_BO_PERM_EXEC) + +static u64 thames_alloc_vaddr(struct thames_device *tdev, struct thames_ge= m_object *bo, size_t size) +{ + int ret; + + size =3D ALIGN(size, SZ_1M); + + mutex_lock(&tdev->mm_lock); + ret =3D drm_mm_insert_node(&tdev->mm, &bo->mm, size); + mutex_unlock(&tdev->mm_lock); + + if (ret) + return 0; + + return bo->mm.start; +} + +static void thames_free_vaddr(struct thames_device *tdev, struct thames_ge= m_object *bo) +{ + if (!drm_mm_node_allocated(&bo->mm)) + return; + + mutex_lock(&tdev->mm_lock); + drm_mm_remove_node(&bo->mm); + mutex_unlock(&tdev->mm_lock); +} + +static int thames_context_destroy_on_core(struct thames_file_priv *priv, s= truct thames_core *core) +{ + struct thames_device *tdev =3D priv->tdev; + int ret; + + ret =3D thames_rpmsg_send_destroy_context(core, priv->context_id); + if (ret) + dev_warn(tdev->ddev.dev, "Failed to destroy context on core %d: %d", cor= e->index, + ret); + + return ret; +} + +static int thames_context_create_on_core(struct thames_file_priv *priv, st= ruct 
thames_core *core) +{ + struct thames_device *tdev =3D priv->tdev; + int ret; + + ret =3D thames_rpmsg_send_create_context(core, priv->context_id); + if (ret) + dev_warn(tdev->ddev.dev, "Failed to create context on core %d: %d", core= ->index, + ret); + + return ret; +} + +int thames_context_create(struct thames_file_priv *priv) +{ + struct thames_device *tdev =3D priv->tdev; + int i, ret; + + ret =3D ida_alloc_min(&tdev->ctx_ida, 1, GFP_KERNEL); + if (ret < 0) + return ret; + + priv->context_id =3D ret; + priv->context_valid =3D false; + + if (!tdev->num_cores) { + dev_err(tdev->ddev.dev, "No cores available\n"); + ret =3D -ENODEV; + goto err_free_id; + } + + for (i =3D 0; i < tdev->num_cores; i++) { + ret =3D thames_context_create_on_core(priv, &tdev->cores[i]); + if (ret) { + dev_err(tdev->ddev.dev, "Failed to create context on core %d: %d\n", i, + ret); + goto err_destroy_contexts; + } + } + + priv->context_valid =3D true; + return 0; + +err_destroy_contexts: + for (i =3D i - 1; i >=3D 0; i--) + thames_context_destroy_on_core(priv, &tdev->cores[i]); +err_free_id: + ida_free(&tdev->ctx_ida, priv->context_id); + return ret; +} + +void thames_context_destroy(struct thames_file_priv *priv) +{ + struct thames_device *tdev =3D priv->tdev; + int i; + + if (!priv->context_valid) + return; + + for (i =3D 0; i < tdev->num_cores; i++) + thames_context_destroy_on_core(priv, &tdev->cores[i]); + + ida_free(&tdev->ctx_ida, priv->context_id); + priv->context_valid =3D false; +} + +static int thames_bo_map_to_core(struct thames_gem_object *bo, struct tham= es_file_priv *file_priv, + struct thames_core *core, u64 vaddr, u64 paddr, u64 size, + u32 flags) +{ + struct thames_device *tdev =3D file_priv->tdev; + int ret; + + ret =3D thames_rpmsg_send_map_bo(core, file_priv->context_id, bo->id, vad= dr, paddr, size); + if (ret) + dev_warn(tdev->ddev.dev, "Failed to map buffer on core %d: %d", core->in= dex, ret); + + return ret; +} + +static int thames_bo_map_to_device(struct 
thames_gem_object *bo, struct th= ames_file_priv *file_priv) +{ + struct thames_device *tdev =3D file_priv->tdev; + struct sg_table *sgt; + dma_addr_t dma_addr; + int i, ret; + + if (bo->iova) + return 0; + + if (!file_priv->context_valid) + return -EINVAL; + + if (!tdev->num_cores) + return -ENODEV; + + sgt =3D drm_gem_shmem_get_pages_sgt(&bo->base); + if (IS_ERR(sgt)) + return PTR_ERR(sgt); + + dma_addr =3D sg_dma_address(sgt->sgl); + if (!dma_addr) { + ret =3D -EINVAL; + goto err_put_pages; + } + + bo->iova =3D thames_alloc_vaddr(tdev, bo, bo->base.base.size); + if (!bo->iova) { + ret =3D -ENOMEM; + goto err_put_pages; + } + + bo->context_id =3D file_priv->context_id; + + for (i =3D 0; i < tdev->num_cores; i++) { + ret =3D thames_bo_map_to_core(bo, file_priv, &tdev->cores[i], bo->iova, = dma_addr, + bo->base.base.size, THAMES_BO_PERM_RWX); + if (ret) { + while (--i >=3D 0) + thames_rpmsg_send_unmap_bo(&tdev->cores[i], bo->context_id, bo->id); + goto err_free_vaddr; + } + } + + return 0; + +err_free_vaddr: + thames_free_vaddr(tdev, bo); + bo->iova =3D 0; +err_put_pages: + dma_resv_lock(bo->base.base.resv, NULL); + drm_gem_shmem_put_pages_locked(&bo->base); + dma_resv_unlock(bo->base.base.resv); + return ret; +} + +static void thames_bo_unmap_from_device(struct thames_gem_object *bo, stru= ct thames_device *tdev) +{ + int i, ret, failed_unmaps =3D 0; + + if (!bo->iova) + return; + + for (i =3D 0; i < tdev->num_cores; i++) { + ret =3D thames_rpmsg_send_unmap_bo(&tdev->cores[i], bo->context_id, bo->= id); + if (ret) { + dev_err(tdev->ddev.dev, "Failed to unmap BO %u from core %d: %d\n", bo-= >id, + i, ret); + failed_unmaps++; + } + } + + if (failed_unmaps) + drm_WARN(&tdev->ddev, failed_unmaps > 0, + "BO %u: %d core(s) failed unmap, potential DSP-side leak\n", bo->id, + failed_unmaps); + + thames_free_vaddr(tdev, bo); + bo->iova =3D 0; + + dma_resv_lock(bo->base.base.resv, NULL); + drm_gem_shmem_put_pages_locked(&bo->base); + dma_resv_unlock(bo->base.base.resv); +} 
+ +static void thames_gem_bo_free(struct drm_gem_object *obj) +{ + struct thames_gem_object *bo =3D to_thames_bo(obj); + struct thames_device *tdev =3D to_thames_device(obj->dev); + + drm_WARN_ON(obj->dev, refcount_read(&bo->base.pages_use_count) > 1); + + if (bo->iova) + thames_bo_unmap_from_device(bo, tdev); + + ida_free(&tdev->bo_ida, bo->id); + + drm_gem_free_mmap_offset(&bo->base.base); + drm_gem_shmem_free(&bo->base); +} + +static const struct drm_gem_object_funcs thames_gem_funcs =3D { + .free =3D thames_gem_bo_free, + .print_info =3D drm_gem_shmem_object_print_info, + .pin =3D drm_gem_shmem_object_pin, + .unpin =3D drm_gem_shmem_object_unpin, + .get_sg_table =3D drm_gem_shmem_object_get_sg_table, + .vmap =3D drm_gem_shmem_object_vmap, + .vunmap =3D drm_gem_shmem_object_vunmap, + .mmap =3D drm_gem_shmem_object_mmap, + .vm_ops =3D &drm_gem_shmem_vm_ops, +}; + +struct drm_gem_object *thames_gem_create_object(struct drm_device *dev, si= ze_t size) +{ + struct thames_device *tdev =3D to_thames_device(dev); + struct thames_gem_object *obj; + int bo_id; + + obj =3D kzalloc(sizeof(*obj), GFP_KERNEL); + if (!obj) + return ERR_PTR(-ENOMEM); + + obj->base.base.funcs =3D &thames_gem_funcs; + + bo_id =3D ida_alloc_min(&tdev->bo_ida, 1, GFP_KERNEL); + if (bo_id < 0) { + kfree(obj); + return ERR_PTR(bo_id); + } + obj->id =3D bo_id; + + return &obj->base.base; +} + +int thames_ioctl_bo_create(struct drm_device *ddev, void *data, struct drm= _file *file) +{ + struct thames_file_priv *file_priv =3D file->driver_priv; + struct drm_thames_bo_create *args =3D data; + struct drm_gem_shmem_object *mem; + struct thames_gem_object *bo; + int cookie, ret; + + if (!drm_dev_enter(ddev, &cookie)) + return -ENODEV; + + if (args->handle || args->iova) { + ret =3D -EINVAL; + goto err_exit; + } + + if (!args->size) { + ret =3D -EINVAL; + goto err_exit; + } + + mem =3D drm_gem_shmem_create(ddev, args->size); + if (IS_ERR(mem)) + return PTR_ERR(mem); + + bo =3D to_thames_bo(&mem->base); + + 
ret =3D drm_gem_handle_create(file, &mem->base, &args->handle); + drm_gem_object_put(&mem->base); + if (ret) { + dev_err(ddev->dev, "Failed to create gem handle: %d", ret); + goto err_free; + } + + ret =3D thames_bo_map_to_device(bo, file_priv); + if (ret) { + dev_err(ddev->dev, "Failed to map buffer to DSP on creation: %d", ret); + goto err_free; + } + + args->size =3D bo->base.base.size; + args->iova =3D bo->iova; + + drm_dev_exit(cookie); + + return 0; + +err_free: + drm_gem_shmem_object_free(&mem->base); + +err_exit: + drm_dev_exit(cookie); + + return ret; +} + +int thames_ioctl_bo_mmap_offset(struct drm_device *ddev, void *data, struc= t drm_file *file) +{ + struct drm_thames_bo_mmap_offset *args =3D data; + struct drm_gem_object *obj; + + if (args->pad) + return -EINVAL; + + obj =3D drm_gem_object_lookup(file, args->handle); + if (!obj) + return -ENOENT; + + args->offset =3D drm_vma_node_offset_addr(&obj->vma_node); + drm_gem_object_put(obj); + + return 0; +} diff --git a/drivers/accel/thames/thames_gem.h b/drivers/accel/thames/thame= s_gem.h new file mode 100644 index 0000000000000000000000000000000000000000..785843c40a89a9e84ab634aad77= e9ec46111693e --- /dev/null +++ b/drivers/accel/thames/thames_gem.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */ + +#ifndef __THAMES_GEM_H__ +#define __THAMES_GEM_H__ + +#include +#include + +struct thames_device; + +struct thames_gem_object { + struct drm_gem_shmem_object base; + + struct thames_file_priv *driver_priv; + + struct drm_mm_node mm; + + u32 id; + u32 context_id; + u64 iova; + size_t size; + size_t offset; +}; + +struct drm_gem_object *thames_gem_create_object(struct drm_device *dev, si= ze_t size); + +int thames_ioctl_bo_create(struct drm_device *ddev, void *data, struct drm= _file *file); + +int thames_ioctl_bo_mmap_offset(struct drm_device *ddev, void *data, struc= t drm_file *file); + +int thames_context_create(struct 
thames_file_priv *priv); + +void thames_context_destroy(struct thames_file_priv *priv); + +static inline struct thames_gem_object *to_thames_bo(struct drm_gem_object= *obj) +{ + return container_of(to_drm_gem_shmem_obj(obj), struct thames_gem_object, = base); +} + +#endif diff --git a/drivers/accel/thames/thames_rpmsg.c b/drivers/accel/thames/tha= mes_rpmsg.c index ebc34f49353e5e7959734da8e8a935573c130e79..a25465295a177877c5ca2b3c93f= 52d8288863797 100644 --- a/drivers/accel/thames/thames_rpmsg.c +++ b/drivers/accel/thames/thames_rpmsg.c @@ -63,6 +63,14 @@ static int thames_rpmsg_callback(struct rpmsg_device *rp= dev, void *data, int len break; } =20 + case THAMES_MSG_CONTEXT_OP_RESPONSE: + ida_free(&core->tdev->ipc_seq_ida, hdr->seq); + break; + + case THAMES_MSG_BO_OP_RESPONSE: + ida_free(&core->tdev->ipc_seq_ida, hdr->seq); + break; + default: dev_warn(&rpdev->dev, "Unknown message type: %u\n", hdr->type); break; @@ -122,6 +130,67 @@ int thames_rpmsg_send_ping(struct thames_core *core, u= 32 ping_data, u32 *sequenc return thames_rpmsg_send_raw(core, &ping_msg, sizeof(ping_msg)); } =20 +int thames_rpmsg_send_create_context(struct thames_core *core, u32 context= _id) +{ + struct thames_msg_context_op msg =3D {}; + + msg.hdr.type =3D THAMES_MSG_CONTEXT_OP; + msg.hdr.seq =3D ida_alloc(&core->tdev->ipc_seq_ida, GFP_KERNEL); + msg.hdr.len =3D sizeof(msg); + msg.op =3D THAMES_CONTEXT_CREATE; + msg.context_id =3D context_id; + + return thames_rpmsg_send_raw(core, &msg, sizeof(msg)); +} + +int thames_rpmsg_send_destroy_context(struct thames_core *core, u32 contex= t_id) +{ + struct thames_msg_context_op msg =3D {}; + + msg.hdr.type =3D THAMES_MSG_CONTEXT_OP; + msg.hdr.seq =3D ida_alloc(&core->tdev->ipc_seq_ida, GFP_KERNEL); + msg.hdr.len =3D sizeof(msg); + msg.op =3D THAMES_CONTEXT_DESTROY; + msg.context_id =3D context_id; + + return thames_rpmsg_send_raw(core, &msg, sizeof(msg)); +} + +int thames_rpmsg_send_map_bo(struct thames_core *core, u32 context_id, u32= bo_id, 
u64 vaddr, + u64 paddr, u64 size) +{ + struct thames_msg_bo_op msg =3D {}; + + msg.hdr.type =3D THAMES_MSG_BO_OP; + msg.hdr.seq =3D ida_alloc(&core->tdev->ipc_seq_ida, GFP_KERNEL); + msg.hdr.len =3D sizeof(msg); + msg.op =3D THAMES_BO_MAP; + msg.context_id =3D context_id; + msg.bo_id =3D bo_id; + msg.vaddr =3D vaddr; + msg.paddr =3D paddr; + msg.size =3D size; + + return thames_rpmsg_send_raw(core, &msg, sizeof(msg)); +} + +int thames_rpmsg_send_unmap_bo(struct thames_core *core, u32 context_id, u= 32 bo_id) +{ + struct thames_msg_bo_op msg =3D {}; + + msg.hdr.type =3D THAMES_MSG_BO_OP; + msg.hdr.seq =3D ida_alloc(&core->tdev->ipc_seq_ida, GFP_KERNEL); + msg.hdr.len =3D sizeof(msg); + msg.op =3D THAMES_BO_UNMAP; + msg.context_id =3D context_id; + msg.bo_id =3D bo_id; + msg.vaddr =3D 0; + msg.paddr =3D 0; + msg.size =3D 0; + + return thames_rpmsg_send_raw(core, &msg, sizeof(msg)); +} + int thames_rpmsg_ping_test(struct thames_core *core) { const u32 test_data =3D THAMES_PING_TEST_PATTERN; diff --git a/include/uapi/drm/thames_accel.h b/include/uapi/drm/thames_acce= l.h new file mode 100644 index 0000000000000000000000000000000000000000..0a5a5e5f6637ab474e9effbb6db= 29c1dd95e56b5 --- /dev/null +++ b/include/uapi/drm/thames_accel.h @@ -0,0 +1,104 @@ +/* SPDX-License-Identifier: MIT */ +/* Copyright (C) 2026 Texas Instruments Incorporated - https://www.ti.com/= */ +#ifndef _THAMES_DRM_H_ +#define _THAMES_DRM_H_ + +#include "drm.h" + +#if defined(__cplusplus) +extern "C" { +#endif + +/** + * DOC: IOCTL IDs + * + * enum drm_thames_ioctl_id - IOCTL IDs + * + * Place new ioctls at the end, don't re-order, don't replace or remove en= tries. + * + * These IDs are not meant to be used directly. Use the DRM_IOCTL_THAMES_x= xx + * definitions instead. + */ +enum drm_thames_ioctl_id { + /** @DRM_THAMES_BO_CREATE: Create a buffer object. */ + DRM_THAMES_BO_CREATE, + + /** + * @DRM_THAMES_BO_MMAP_OFFSET: Get the file offset to pass to + * mmap to map a GEM object. 
+	 */
+	DRM_THAMES_BO_MMAP_OFFSET,
+};
+
+/**
+ * DOC: IOCTL arguments
+ */
+
+/**
+ * struct drm_thames_bo_create - Arguments passed to DRM_IOCTL_THAMES_BO_CREATE.
+ */
+struct drm_thames_bo_create {
+	/**
+	 * @size: Requested size for the object
+	 *
+	 * The (page-aligned) allocated size for the object will be returned.
+	 */
+	__u64 size;
+
+	/**
+	 * @iova: Returned IOVA for the object, in the DSPs' address space.
+	 */
+	__u64 iova;
+
+	/**
+	 * @handle: Returned handle for the object.
+	 *
+	 * Object handles are nonzero.
+	 */
+	__u32 handle;
+
+	/** @pad: MBZ. */
+	__u32 pad;
+};
+
+/**
+ * struct drm_thames_bo_mmap_offset - Arguments passed to DRM_IOCTL_THAMES_BO_MMAP_OFFSET.
+ */
+struct drm_thames_bo_mmap_offset {
+	/** @handle: Handle of the object we want an mmap offset for. */
+	__u32 handle;
+
+	/** @pad: MBZ. */
+	__u32 pad;
+
+	/** @offset: The fake offset to use for subsequent mmap calls. */
+	__u64 offset;
+};
+
+/**
+ * DRM_IOCTL_THAMES() - Build a thames IOCTL number
+ * @__access: Access type. Must be R, W or RW.
+ * @__id: One of the DRM_THAMES_xxx id.
+ * @__type: Suffix of the type being passed to the IOCTL.
+ *
+ * Don't use this macro directly, use the DRM_IOCTL_THAMES_xxx
+ * values instead.
+ *
+ * Return: An IOCTL number to be passed to ioctl() from userspace.
+ */
+#define DRM_IOCTL_THAMES(__access, __id, __type) \
+	DRM_IO ## __access(DRM_COMMAND_BASE + DRM_THAMES_ ## __id, \
+			   struct drm_thames_ ## __type)
+
+enum {
+	DRM_IOCTL_THAMES_BO_CREATE =
+		DRM_IOCTL_THAMES(WR, BO_CREATE, bo_create),
+	DRM_IOCTL_THAMES_BO_MMAP_OFFSET =
+		DRM_IOCTL_THAMES(WR, BO_MMAP_OFFSET, bo_mmap_offset),
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* _THAMES_DRM_H_ */
-- 
2.52.0

From nobody Mon Feb 9 02:45:57 2026
From: Tomeu Vizoso
Date: Wed, 14 Jan 2026 09:46:51 +0100
Subject: [PATCH v2 4/5] accel/thames: Add IOCTL for job submission
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20260114-thames-v2-4-e94a6636e050@tomeuvizoso.net>
References: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net>
In-Reply-To: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net>
To: Nishanth Menon, "Andrew F. Davis", Randolph Sapp, Jonathan Humphreys,
 Andrei Aldea, Chirag Shilwant, Vignesh Raghavendra, Tero Kristo,
 Rob Herring, Krzysztof Kozlowski, Conor Dooley, Oded Gabbay,
 Jonathan Corbet, Sumit Semwal, Christian König, Robert Nelson,
 David Airlie, Simona Vetter, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann
Cc: linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
 linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
 linux-doc@vger.kernel.org, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, Tomeu Vizoso
X-Mailer: b4 0.14.2

Job submission uses the DRM GPU scheduler infrastructure, with one
scheduler per core.

Contexts are created on all cores, and buffers are mapped on all of them
as well, so every core is ready to execute any job.

The job submission code was initially based on Panfrost.

v2:
- Add thames_accel.h UAPI header (Robert Nelson).

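For illustration only, a minimal userspace sketch of how the submit arguments added by this patch fit together. The two structs mirror the UAPI definitions in this patch, with fixed-width C types standing in for __u32/__u64 so the sketch builds without kernel headers. The helper names thames_fill_job() and thames_fill_submit() are hypothetical and not part of the series.

```c
#include <stdint.h>
#include <string.h>

/* Mirror of struct drm_thames_job from this patch (uintN_t for __uN). */
struct drm_thames_job {
	uint32_t kernel;		/* BO handle of the compiled kernel */
	uint32_t kernel_size;		/* size of the kernel in bytes */
	uint32_t params;		/* BO handle of the params BO */
	uint32_t params_size;		/* size of the params in bytes */
	uint64_t in_bo_handles;		/* pointer to u32 array of BOs read */
	uint64_t out_bo_handles;	/* pointer to u32 array of BOs written */
	uint32_t in_bo_handle_count;
	uint32_t out_bo_handle_count;
};

/* Mirror of struct drm_thames_submit from this patch. */
struct drm_thames_submit {
	uint64_t jobs;			/* pointer to array of drm_thames_job */
	uint32_t job_count;
	uint32_t pad;			/* must be zero */
};

/* Hypothetical helper: describe one job and the BOs it reads and writes. */
static void thames_fill_job(struct drm_thames_job *job,
			    uint32_t kernel_bo, uint32_t kernel_size,
			    uint32_t params_bo, uint32_t params_size,
			    const uint32_t *in_bos, uint32_t in_count,
			    const uint32_t *out_bos, uint32_t out_count)
{
	memset(job, 0, sizeof(*job));
	job->kernel = kernel_bo;
	job->kernel_size = kernel_size;
	job->params = params_bo;
	job->params_size = params_size;
	/* User pointers travel as 64-bit integers in the UAPI structs. */
	job->in_bo_handles = (uint64_t)(uintptr_t)in_bos;
	job->out_bo_handles = (uint64_t)(uintptr_t)out_bos;
	job->in_bo_handle_count = in_count;
	job->out_bo_handle_count = out_count;
}

/* Hypothetical helper: wrap an array of jobs into the ioctl argument. */
static void thames_fill_submit(struct drm_thames_submit *submit,
			       const struct drm_thames_job *jobs,
			       uint32_t job_count)
{
	/* memset leaves pad at zero; the handler rejects a nonzero pad. */
	memset(submit, 0, sizeof(*submit));
	submit->jobs = (uint64_t)(uintptr_t)jobs;
	submit->job_count = job_count;
}
```

The filled struct drm_thames_submit would then be passed to ioctl() with DRM_IOCTL_THAMES_SUBMIT on the accel device node, which is left out here because it needs the hardware. The handler in this patch returns -EINVAL for a job_count of 0 or above THAMES_MAX_JOBS_PER_SUBMIT (256).
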
Signed-off-by: Tomeu Vizoso
---
 drivers/accel/thames/Makefile       |   1 +
 drivers/accel/thames/thames_core.c  |   6 +
 drivers/accel/thames/thames_drv.c   |  19 ++
 drivers/accel/thames/thames_job.c   | 463 ++++++++++++++++++++++++++++++++++++
 drivers/accel/thames/thames_job.h   |  51 ++++
 drivers/accel/thames/thames_rpmsg.c |  52 ++++
 include/uapi/drm/thames_accel.h     |  54 +++++
 7 files changed, 646 insertions(+)

diff --git a/drivers/accel/thames/Makefile b/drivers/accel/thames/Makefile
index 0051e319f2e4966de72bc342d5b6e40b2890c006..b6c4516f8250e3d442f22e80d609cb1be2970128 100644
--- a/drivers/accel/thames/Makefile
+++ b/drivers/accel/thames/Makefile
@@ -7,4 +7,5 @@ thames-y := \
 	thames_device.o \
 	thames_drv.o \
 	thames_gem.o \
+	thames_job.o \
 	thames_rpmsg.o
diff --git a/drivers/accel/thames/thames_core.c b/drivers/accel/thames/thames_core.c
index 92af1d68063116bcfa28a33960cbe829029fc1bf..5b96b25d287096803e034fcd4261d51795871543 100644
--- a/drivers/accel/thames/thames_core.c
+++ b/drivers/accel/thames/thames_core.c
@@ -13,6 +13,7 @@
 
 #include "thames_core.h"
 #include "thames_device.h"
+#include "thames_job.h"
 #include "thames_rpmsg.h"
 
 /* Shift to convert bytes to megabytes (divide by 1048576) */
@@ -115,11 +116,16 @@ int thames_core_init(struct thames_core *core)
 	if (err)
 		return err;
 
+	err = thames_job_init(core);
+	if (err)
+		return err;
+
 	return 0;
 }
 
 void thames_core_fini(struct thames_core *core)
 {
+	thames_job_fini(core);
 	thames_rpmsg_fini(core);
 }
 
diff --git a/drivers/accel/thames/thames_drv.c b/drivers/accel/thames/thames_drv.c
index d9ea2cab80e89cd13b1422a17635a15b7f16fa4f..1ff01428e6c80765cb741ae45c67971b7b0f28c8 100644
--- a/drivers/accel/thames/thames_drv.c
+++ b/drivers/accel/thames/thames_drv.c
@@ -14,6 +14,7 @@
 #include "thames_drv.h"
 #include "thames_core.h"
 #include "thames_gem.h"
+#include "thames_job.h"
 #include "thames_ipc.h"
 
 static struct platform_device *drm_dev;
@@ -38,8 +39,22 @@ static int thames_open(struct drm_device
*dev, struct drm_file *file)
 
 	file->driver_priv = thames_priv;
 
+	ret = thames_job_open(thames_priv);
+	if (ret)
+		goto err_free;
+
+	ret = thames_context_create(thames_priv);
+	if (ret) {
+		dev_err(dev->dev, "Failed to create context for client: %d", ret);
+		goto err_close_job;
+	}
+
 	return 0;
 
+err_close_job:
+	thames_job_close(thames_priv);
+err_free:
+	kfree(thames_priv);
 err_put_mod:
 	module_put(THIS_MODULE);
 	return ret;
@@ -49,6 +64,9 @@ static void thames_postclose(struct drm_device *dev, struct drm_file *file)
 {
 	struct thames_file_priv *thames_priv = file->driver_priv;
 
+	thames_context_destroy(thames_priv);
+
+	thames_job_close(thames_priv);
 	kfree(thames_priv);
 	module_put(THIS_MODULE);
 }
@@ -57,6 +75,7 @@ static const struct drm_ioctl_desc thames_drm_driver_ioctls[] = {
 #define THAMES_IOCTL(n, func) DRM_IOCTL_DEF_DRV(THAMES_##n, thames_ioctl_##func, 0)
 	THAMES_IOCTL(BO_CREATE, bo_create),
 	THAMES_IOCTL(BO_MMAP_OFFSET, bo_mmap_offset),
+	THAMES_IOCTL(SUBMIT, submit),
 };
 
 DEFINE_DRM_ACCEL_FOPS(thames_accel_driver_fops);
diff --git a/drivers/accel/thames/thames_job.c b/drivers/accel/thames/thames_job.c
new file mode 100644
index 0000000000000000000000000000000000000000..bd8f8fa1783cf10c5e71c8f2ce5fcc880a9b150b
--- /dev/null
+++ b/drivers/accel/thames/thames_job.c
@@ -0,0 +1,463 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2019 Linaro, Ltd, Rob Herring */
+/* Copyright 2019 Collabora ltd.
*/
+/* Copyright 2024-2025 Tomeu Vizoso */
+/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */
+
+#include "linux/dev_printk.h"
+#include
+#include
+#include
+#include
+#include
+
+#include "thames_core.h"
+#include "thames_device.h"
+#include "thames_drv.h"
+#include "thames_gem.h"
+#include "thames_job.h"
+#include "thames_rpmsg.h"
+
+#define JOB_TIMEOUT_MS 500
+
+static struct thames_job *to_thames_job(struct drm_sched_job *sched_job)
+{
+	return container_of(sched_job, struct thames_job, base);
+}
+
+static const char *thames_fence_get_driver_name(struct dma_fence *fence)
+{
+	return "thames";
+}
+
+static const char *thames_fence_get_timeline_name(struct dma_fence *fence)
+{
+	return "thames";
+}
+
+static const struct dma_fence_ops thames_fence_ops = {
+	.get_driver_name = thames_fence_get_driver_name,
+	.get_timeline_name = thames_fence_get_timeline_name,
+};
+
+static struct dma_fence *thames_fence_create(struct thames_core *core)
+{
+	struct dma_fence *fence;
+
+	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+	if (!fence)
+		return ERR_PTR(-ENOMEM);
+
+	dma_fence_init(fence, &thames_fence_ops, &core->fence_lock, core->fence_context,
+		       ++core->emit_seqno);
+
+	return fence;
+}
+
+static void thames_job_hw_submit(struct thames_core *core, struct thames_job *job)
+{
+	int ret;
+
+	/* Don't queue the job if a reset is in progress */
+	if (atomic_read(&core->reset.pending))
+		return;
+
+	ret = thames_rpmsg_send_submit_job(core, job->file_priv->context_id, job->job_id,
+					   to_thames_bo(job->kernel)->iova, job->kernel_size,
+					   to_thames_bo(job->params)->iova, job->params_size,
+					   &job->ipc_sequence);
+
+	if (ret) {
+		dev_err(core->dev, "Failed to submit kernel to DSP core %d\n", core->index);
+		return;
+	}
+}
+
+static int thames_acquire_object_fences(struct drm_gem_object **bos, int bo_count,
+					struct drm_sched_job *job, bool is_write)
+{
+	int i, ret;
+
+	for (i = 0; i < bo_count; i++) {
+		ret =
dma_resv_reserve_fences(bos[i]->resv, 1);
+		if (ret)
+			return ret;
+
+		ret = drm_sched_job_add_implicit_dependencies(job, bos[i], is_write);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static void thames_attach_object_fences(struct drm_gem_object **bos, int bo_count,
+					struct dma_fence *fence)
+{
+	int i;
+
+	for (i = 0; i < bo_count; i++)
+		dma_resv_add_fence(bos[i]->resv, fence, DMA_RESV_USAGE_WRITE);
+}
+
+static int thames_job_push(struct thames_job *job)
+{
+	struct thames_device *tdev = job->tdev;
+	struct drm_gem_object **bos;
+	struct ww_acquire_ctx acquire_ctx;
+	int ret = 0;
+
+	dev_dbg(tdev->ddev.dev, "Pushing job with %u in BOs and %u out BOs\n", job->in_bo_count,
+		job->out_bo_count);
+	bos = kvmalloc_array(job->in_bo_count + job->out_bo_count, sizeof(void *), GFP_KERNEL);
+	memcpy(bos, job->in_bos, job->in_bo_count * sizeof(void *));
+	memcpy(&bos[job->in_bo_count], job->out_bos, job->out_bo_count * sizeof(void *));
+
+	ret = drm_gem_lock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
+	if (ret)
+		goto err;
+
+	scoped_guard(mutex, &tdev->sched_lock)
+	{
+		drm_sched_job_arm(&job->base);
+
+		job->inference_done_fence = dma_fence_get(&job->base.s_fence->finished);
+
+		ret = thames_acquire_object_fences(job->in_bos, job->in_bo_count, &job->base,
+						   false);
+		if (ret)
+			goto err_unlock;
+
+		ret = thames_acquire_object_fences(job->out_bos, job->out_bo_count, &job->base,
+						   true);
+		if (ret)
+			goto err_unlock;
+
+		kref_get(&job->refcount); /* put by scheduler job completion */
+
+		drm_sched_entity_push_job(&job->base);
+	}
+
+	thames_attach_object_fences(job->out_bos, job->out_bo_count, job->inference_done_fence);
+
+err_unlock:
+	drm_gem_unlock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
+err:
+	kvfree(bos);
+
+	return ret;
+}
+
+static void thames_job_cleanup(struct kref *ref)
+{
+	struct thames_job *job = container_of(ref, struct thames_job, refcount);
+	struct
thames_device *tdev = job->tdev;
+	unsigned int i;
+
+	dma_fence_put(job->done_fence);
+	dma_fence_put(job->inference_done_fence);
+
+	ida_free(&tdev->job_ida, job->job_id);
+
+	if (job->kernel)
+		drm_gem_object_put(job->kernel);
+
+	if (job->params)
+		drm_gem_object_put(job->params);
+
+	if (job->in_bos) {
+		for (i = 0; i < job->in_bo_count; i++)
+			drm_gem_object_put(job->in_bos[i]);
+
+		kvfree(job->in_bos);
+	}
+
+	if (job->out_bos) {
+		for (i = 0; i < job->out_bo_count; i++)
+			drm_gem_object_put(job->out_bos[i]);
+
+		kvfree(job->out_bos);
+	}
+
+	kfree(job);
+}
+
+static void thames_job_put(struct thames_job *job)
+{
+	kref_put(&job->refcount, thames_job_cleanup);
+}
+
+static void thames_job_free(struct drm_sched_job *sched_job)
+{
+	struct thames_job *job = to_thames_job(sched_job);
+
+	drm_sched_job_cleanup(sched_job);
+
+	thames_job_put(job);
+}
+
+static struct thames_core *sched_to_core(struct thames_device *tdev,
+					 struct drm_gpu_scheduler *sched)
+{
+	unsigned int core;
+
+	for (core = 0; core < tdev->num_cores; core++) {
+		if (&tdev->cores[core].sched == sched)
+			return &tdev->cores[core];
+	}
+
+	return NULL;
+}
+
+static struct dma_fence *thames_job_run(struct drm_sched_job *sched_job)
+{
+	struct thames_job *job = to_thames_job(sched_job);
+	struct thames_device *tdev = job->tdev;
+	struct thames_core *core = sched_to_core(tdev, sched_job->sched);
+	struct dma_fence *fence = NULL;
+
+	if (unlikely(job->base.s_fence->finished.error))
+		return NULL;
+
+	fence = thames_fence_create(core);
+	if (IS_ERR(fence))
+		return fence;
+
+	if (job->done_fence)
+		dma_fence_put(job->done_fence);
+	job->done_fence = dma_fence_get(fence);
+
+	scoped_guard(mutex, &core->job_lock)
+	{
+		core->in_flight_job = job;
+		thames_job_hw_submit(core, job);
+	}
+
+	return fence;
+}
+
+static void thames_reset(struct thames_core *core, struct drm_sched_job *bad)
+{
+	if (!atomic_read(&core->reset.pending))
+		return;
+
+	drm_sched_stop(&core->sched,
bad);
+	scoped_guard(mutex, &core->job_lock) core->in_flight_job = NULL;
+	thames_core_reset(core);
+	atomic_set(&core->reset.pending, 0);
+	drm_sched_start(&core->sched, 0);
+}
+
+static enum drm_gpu_sched_stat thames_job_timedout(struct drm_sched_job *sched_job)
+{
+	struct thames_job *job = to_thames_job(sched_job);
+	struct thames_device *tdev = job->tdev;
+	struct thames_core *core = sched_to_core(tdev, sched_job->sched);
+
+	if (!core) {
+		dev_err(tdev->ddev.dev, "Failed to find core for timed out job\n");
+		return DRM_GPU_SCHED_STAT_NONE;
+	}
+
+	dev_err(core->dev, "Job %u timed out on DSP core %d\n", job->job_id, core->index);
+
+	atomic_set(&core->reset.pending, 1);
+	thames_reset(core, sched_job);
+
+	return DRM_GPU_SCHED_STAT_RESET;
+}
+
+static void thames_reset_work(struct work_struct *work)
+{
+	struct thames_core *core;
+
+	core = container_of(work, struct thames_core, reset.work);
+	thames_reset(core, NULL);
+}
+
+static const struct drm_sched_backend_ops thames_sched_ops = { .run_job = thames_job_run,
+							       .timedout_job = thames_job_timedout,
+							       .free_job = thames_job_free };
+
+int thames_job_init(struct thames_core *core)
+{
+	struct drm_sched_init_args args = {
+		.ops = &thames_sched_ops,
+		.num_rqs = DRM_SCHED_PRIORITY_COUNT,
+		.credit_limit = 1,
+		.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
+		.name = dev_name(core->dev),
+		.dev = core->dev,
+	};
+	int ret;
+
+	INIT_WORK(&core->reset.work, thames_reset_work);
+	spin_lock_init(&core->fence_lock);
+	mutex_init(&core->job_lock);
+
+	core->reset.wq = alloc_ordered_workqueue("thames-reset-%d", 0, core->index);
+	if (!core->reset.wq)
+		return -ENOMEM;
+
+	core->fence_context = dma_fence_context_alloc(1);
+
+	args.timeout_wq = core->reset.wq;
+	ret = drm_sched_init(&core->sched, &args);
+	if (ret) {
+		dev_err(core->dev, "Failed to create scheduler: %d.", ret);
+		destroy_workqueue(core->reset.wq);
+		return ret;
+	}
+
+	return 0;
+}
+
+void
thames_job_fini(struct thames_core *core)
+{
+	drm_sched_fini(&core->sched);
+
+	cancel_work_sync(&core->reset.work);
+	destroy_workqueue(core->reset.wq);
+}
+
+int thames_job_open(struct thames_file_priv *thames_priv)
+{
+	struct thames_device *tdev = thames_priv->tdev;
+	struct drm_gpu_scheduler **scheds =
+		kmalloc_array(tdev->num_cores, sizeof(*scheds), GFP_KERNEL);
+	unsigned int core;
+	int ret;
+
+	for (core = 0; core < tdev->num_cores; core++)
+		scheds[core] = &tdev->cores[core].sched;
+
+	ret = drm_sched_entity_init(&thames_priv->sched_entity, DRM_SCHED_PRIORITY_NORMAL, scheds,
+				    tdev->num_cores, NULL);
+	if (WARN_ON(ret))
+		return ret;
+
+	return 0;
+}
+
+void thames_job_close(struct thames_file_priv *thames_priv)
+{
+	struct drm_sched_entity *entity = &thames_priv->sched_entity;
+
+	kfree(entity->sched_list);
+	drm_sched_entity_destroy(entity);
+}
+
+static int thames_ioctl_submit_job(struct drm_device *dev, struct drm_file *file,
+				   struct drm_thames_job *job)
+{
+	struct thames_device *tdev = to_thames_device(dev);
+	struct thames_file_priv *file_priv = file->driver_priv;
+	struct thames_job *tjob = NULL;
+	int ret = 0;
+
+	tjob = kzalloc(sizeof(*tjob), GFP_KERNEL);
+	if (!tjob)
+		return -ENOMEM;
+
+	kref_init(&tjob->refcount);
+
+	tjob->tdev = tdev;
+	tjob->file_priv = file_priv;
+
+	tjob->job_id = ida_alloc_min(&tdev->job_ida, 1, GFP_KERNEL);
+	if (tjob->job_id < 0)
+		goto out_put_job;
+
+	ret = drm_sched_job_init(&tjob->base, &file_priv->sched_entity, 1, NULL, file->client_id);
+	if (ret)
+		goto out_put_job;
+
+	tjob->kernel = drm_gem_object_lookup(file, job->kernel);
+	if (!tjob->kernel) {
+		ret = -ENOENT;
+		goto out_cleanup_job;
+	}
+
+	tjob->kernel_size = job->kernel_size;
+
+	if (job->params) {
+		tjob->params = drm_gem_object_lookup(file, job->params);
+		if (!tjob->params) {
+			ret = -ENOENT;
+			goto out_cleanup_job;
+		}
+		tjob->params_size = job->params_size;
+	}
+
+	ret =
drm_gem_objects_lookup(file, u64_to_user_ptr(job->in_bo_handles),
+				     job->in_bo_handle_count, &tjob->in_bos);
+	if (ret)
+		goto out_cleanup_job;
+
+	tjob->in_bo_count = job->in_bo_handle_count;
+
+	ret = drm_gem_objects_lookup(file, u64_to_user_ptr(job->out_bo_handles),
+				     job->out_bo_handle_count, &tjob->out_bos);
+	if (ret)
+		goto out_cleanup_job;
+
+	tjob->out_bo_count = job->out_bo_handle_count;
+
+	ret = thames_job_push(tjob);
+
+out_cleanup_job:
+	if (ret)
+		drm_sched_job_cleanup(&tjob->base);
+out_put_job:
+	thames_job_put(tjob);
+
+	return ret;
+}
+
+#define THAMES_MAX_JOBS_PER_SUBMIT 256
+
+int thames_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_thames_submit *args = data;
+	struct drm_thames_job *jobs;
+	size_t jobs_size;
+	int ret = 0;
+	unsigned int i = 0;
+
+	if (args->pad)
+		return -EINVAL;
+
+	if (args->job_count == 0)
+		return -EINVAL;
+
+	if (args->job_count > THAMES_MAX_JOBS_PER_SUBMIT) {
+		dev_err(dev->dev, "Job count %u exceeds maximum %u\n", args->job_count,
+			THAMES_MAX_JOBS_PER_SUBMIT);
+		return -EINVAL;
+	}
+
+	jobs_size = array_size(args->job_count, sizeof(*jobs));
+	if (jobs_size == SIZE_MAX)
+		return -EINVAL;
+
+	jobs = kvmalloc_array(args->job_count, sizeof(*jobs), GFP_KERNEL);
+	if (!jobs)
+		return -ENOMEM;
+
+	if (copy_from_user(jobs, u64_to_user_ptr(args->jobs), jobs_size)) {
+		ret = -EFAULT;
+		drm_dbg(dev, "Failed to copy incoming job array\n");
+		goto exit;
+	}
+
+	for (i = 0; i < args->job_count; i++) {
+		ret = thames_ioctl_submit_job(dev, file, &jobs[i]);
+		if (ret)
+			break;
+	}
+
+exit:
+	kvfree(jobs);
+
+	return ret;
+}
diff --git a/drivers/accel/thames/thames_job.h b/drivers/accel/thames/thames_job.h
new file mode 100644
index 0000000000000000000000000000000000000000..3bfd2c779d9b783624a25e6d06368f3e1daf569e
--- /dev/null
+++ b/drivers/accel/thames/thames_job.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025
Tomeu Vizoso */
+/* Copyright 2026 Texas Instruments Incorporated - https://www.ti.com/ */
+
+#ifndef __THAMES_JOB_H__
+#define __THAMES_JOB_H__
+
+#include
+#include
+
+#include "thames_core.h"
+#include "thames_drv.h"
+
+struct thames_job {
+	struct drm_sched_job base;
+
+	struct thames_device *tdev;
+	struct thames_file_priv *file_priv;
+
+	u32 job_id;
+	u32 ipc_sequence;
+
+	struct drm_gem_object *kernel;
+	size_t kernel_size;
+
+	struct drm_gem_object *params;
+	size_t params_size;
+
+	struct drm_gem_object **in_bos;
+	u32 in_bo_count;
+
+	struct drm_gem_object **out_bos;
+	u32 out_bo_count;
+
+	/* Fence to be signaled by drm-sched once it's done with the job */
+	struct dma_fence *inference_done_fence;
+
+	/* Fence to be signaled by rpmsg handler when the job is complete. */
+	struct dma_fence *done_fence;
+
+	struct kref refcount;
+};
+
+int thames_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file);
+
+int thames_job_init(struct thames_core *core);
+void thames_job_fini(struct thames_core *core);
+int thames_job_open(struct thames_file_priv *thames_priv);
+void thames_job_close(struct thames_file_priv *thames_priv);
+
+#endif
diff --git a/drivers/accel/thames/thames_rpmsg.c b/drivers/accel/thames/thames_rpmsg.c
index a25465295a177877c5ca2b3c93f52d8288863797..9747690e0f84fe00d605ad0e708d597da2240d97 100644
--- a/drivers/accel/thames/thames_rpmsg.c
+++ b/drivers/accel/thames/thames_rpmsg.c
@@ -11,6 +11,7 @@
 #include "thames_core.h"
 #include "thames_device.h"
 #include "thames_ipc.h"
+#include "thames_job.h"
 
 #define THAMES_PING_TEST_PATTERN 0xDEADBEEF
 #define THAMES_PING_TIMEOUT_MS 5000
@@ -71,6 +72,36 @@ static int thames_rpmsg_callback(struct rpmsg_device *rpdev, void *data, int len
 		ida_free(&core->tdev->ipc_seq_ida, hdr->seq);
 		break;
 
+	case THAMES_MSG_SUBMIT_JOB_RESPONSE: {
+		struct thames_job *job;
+
+		scoped_guard(mutex, &core->job_lock)
+		{
+			job = core->in_flight_job;
+			if (!job) {
+				dev_err(&rpdev->dev,
+					"Received
job response but no job in flight\n");
+				ida_free(&core->tdev->ipc_seq_ida, hdr->seq);
+				return -EINVAL;
+			}
+
+			if (hdr->seq != job->ipc_sequence) {
+				dev_err(&rpdev->dev,
+					"Job response sequence mismatch: got %u, expected %u\n",
+					hdr->seq, job->ipc_sequence);
+				ida_free(&core->tdev->ipc_seq_ida, hdr->seq);
+				return -EINVAL;
+			}
+
+			dma_fence_signal(job->done_fence);
+			core->in_flight_job = NULL;
+		}
+
+		ida_free(&core->tdev->ipc_seq_ida, hdr->seq);
+
+		break;
+	}
+
 	default:
 		dev_warn(&rpdev->dev, "Unknown message type: %u\n", hdr->type);
 		break;
@@ -191,6 +222,27 @@ int thames_rpmsg_send_unmap_bo(struct thames_core *core, u32 context_id, u32 bo_
 	return thames_rpmsg_send_raw(core, &msg, sizeof(msg));
 }
 
+int thames_rpmsg_send_submit_job(struct thames_core *core, u32 context_id, u32 job_id,
+				 u64 kernel_iova, u64 kernel_size, u64 args_iova, u64 args_size,
+				 u32 *sequence)
+{
+	struct thames_msg_submit_job msg = {};
+
+	msg.hdr.type = THAMES_MSG_SUBMIT_JOB;
+	msg.hdr.seq = ida_alloc(&core->tdev->ipc_seq_ida, GFP_KERNEL);
+	msg.hdr.len = sizeof(msg);
+	msg.context_id = context_id;
+	msg.job_id = job_id;
+	msg.kernel_iova = kernel_iova;
+	msg.kernel_size = kernel_size;
+	msg.args_iova = args_iova;
+	msg.args_size = args_size;
+
+	*sequence = msg.hdr.seq;
+
+	return thames_rpmsg_send_raw(core, &msg, sizeof(msg));
+}
+
 int thames_rpmsg_ping_test(struct thames_core *core)
 {
 	const u32 test_data = THAMES_PING_TEST_PATTERN;
diff --git a/include/uapi/drm/thames_accel.h b/include/uapi/drm/thames_accel.h
index 0a5a5e5f6637ab474e9effbb6db29c1dd95e56b5..5b35e50826ed95bfcc3709bef33416d2b6d11c70 100644
--- a/include/uapi/drm/thames_accel.h
+++ b/include/uapi/drm/thames_accel.h
@@ -28,6 +28,9 @@ enum drm_thames_ioctl_id {
 	 * mmap to map a GEM object.
 	 */
 	DRM_THAMES_BO_MMAP_OFFSET,
+
+	/** @DRM_THAMES_SUBMIT: Submit a job and BOs to run.
*/
+	DRM_THAMES_SUBMIT,
 };
 
 /**
@@ -75,6 +78,55 @@ struct drm_thames_bo_mmap_offset {
 	__u64 offset;
 };
 
+/**
+ * struct drm_thames_job - A job to be run on the NPU
+ *
+ * The kernel will schedule the execution of this job taking into account its
+ * dependencies with other jobs. All tasks in the same job will be executed
+ * sequentially on the same core, to benefit from memory residency in SRAM.
+ */
+struct drm_thames_job {
+	/** Input: BO handle for kernel. */
+	__u32 kernel;
+
+	/** Input: Size in bytes of the compiled kernel. */
+	__u32 kernel_size;
+
+	/** Input: BO handle for params BO. */
+	__u32 params;
+
+	/** Input: Size in bytes of the params BO. */
+	__u32 params_size;
+
+	/** Input: Pointer to a u32 array of the BOs that are read by the job. */
+	__u64 in_bo_handles;
+
+	/** Input: Pointer to a u32 array of the BOs that are written to by the job. */
+	__u64 out_bo_handles;
+
+	/** Input: Number of input BO handles passed in (size is that times 4). */
+	__u32 in_bo_handle_count;
+
+	/** Input: Number of output BO handles passed in (size is that times 4). */
+	__u32 out_bo_handle_count;
+};
+
+/**
+ * struct drm_thames_submit - ioctl argument for submitting commands to the NPU.
+ *
+ * The kernel will schedule the execution of these jobs in dependency order.
+ */
+struct drm_thames_submit {
+	/** Input: Pointer to an array of struct drm_thames_job. */
+	__u64 jobs;
+
+	/** Input: Number of jobs passed in. */
+	__u32 job_count;
+
+	/** Reserved, must be zero. */
+	__u32 pad;
+};
+
 /**
  * DRM_IOCTL_THAMES() - Build a thames IOCTL number
  * @__access: Access type. Must be R, W or RW.
@@ -95,6 +147,8 @@ enum {
 		DRM_IOCTL_THAMES(WR, BO_CREATE, bo_create),
 	DRM_IOCTL_THAMES_BO_MMAP_OFFSET =
 		DRM_IOCTL_THAMES(WR, BO_MMAP_OFFSET, bo_mmap_offset),
+	DRM_IOCTL_THAMES_SUBMIT =
+		DRM_IOCTL_THAMES(WR, SUBMIT, submit),
 };
 
 #if defined(__cplusplus)
-- 
2.52.0

From nobody Mon Feb 9 02:45:57 2026
From: Tomeu Vizoso
Date: Wed, 14 Jan 2026 09:46:52 +0100
Subject: [PATCH v2 5/5] accel/thames: Add IOCTL for memory synchronization
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20260114-thames-v2-5-e94a6636e050@tomeuvizoso.net>
References: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net>
In-Reply-To: <20260114-thames-v2-0-e94a6636e050@tomeuvizoso.net>
To: Nishanth Menon, "Andrew F. Davis", Randolph Sapp, Jonathan Humphreys,
 Andrei Aldea, Chirag Shilwant, Vignesh Raghavendra, Tero Kristo,
 Rob Herring, Krzysztof Kozlowski, Conor Dooley, Oded Gabbay,
 Jonathan Corbet, Sumit Semwal, Christian König, Robert Nelson,
 David Airlie, Simona Vetter, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann
Cc: linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
 linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
 linux-doc@vger.kernel.org, linux-media@vger.kernel.org,
 linaro-mm-sig@lists.linaro.org, Tomeu Vizoso
X-Mailer: b4 0.14.2

The DSP cores have their own access to the memory bus, which is not
cache-coherent with the CPUs.

Add IOCTLs so userspace can mark when the caches need to be flushed, and
when a writer job needs to be waited for before the buffer can be
accessed from the CPU.

Initially based on the same IOCTLs from the Etnaviv driver.

v2:
- Add thames_accel.h UAPI header (Robert Nelson).

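As a rough illustration of the wait half of that interface, the sketch below restates in plain userspace C how the BO_PREP handler in this patch turns the result of the reservation wait into its return value. The function name thames_prep_wait_result() is hypothetical and exists only for this example; the mapping itself follows the handler.

```c
#include <errno.h>

/*
 * Hypothetical restatement of the result mapping in the BO_PREP handler.
 *
 * wait_ret models the value returned by the reservation wait:
 *   > 0   the writer fences signaled
 *   == 0  they did not signal within the timeout
 *   < 0   the wait failed or was interrupted
 * timeout is the relative timeout handed to the wait; zero means the
 * caller only polled.
 */
static long thames_prep_wait_result(long wait_ret, unsigned long timeout)
{
	if (wait_ret == 0)
		return timeout ? -ETIMEDOUT : -EBUSY;

	/* Positive and negative values are passed through unchanged. */
	return wait_ret;
}
```

A zero timeout therefore acts as a poll: a buffer that still has a pending writer reports -EBUSY rather than -ETIMEDOUT.
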
Signed-off-by: Tomeu Vizoso
---
 drivers/accel/thames/thames_drv.c |  2 ++
 drivers/accel/thames/thames_gem.c | 52 +++++++++++++++++++++++++++++++++++++++
 drivers/accel/thames/thames_gem.h |  4 +++
 include/uapi/drm/thames_accel.h   | 31 +++++++++++++++++++++++
 4 files changed, 89 insertions(+)

diff --git a/drivers/accel/thames/thames_drv.c b/drivers/accel/thames/thames_drv.c
index 1ff01428e6c80765cb741ae45c67971b7b0f28c8..6993d503139d3aaef830cdf5cfcf38476c5f9d99 100644
--- a/drivers/accel/thames/thames_drv.c
+++ b/drivers/accel/thames/thames_drv.c
@@ -76,6 +76,8 @@ static const struct drm_ioctl_desc thames_drm_driver_ioctls[] = {
 	THAMES_IOCTL(BO_CREATE, bo_create),
 	THAMES_IOCTL(BO_MMAP_OFFSET, bo_mmap_offset),
 	THAMES_IOCTL(SUBMIT, submit),
+	THAMES_IOCTL(BO_PREP, bo_prep),
+	THAMES_IOCTL(BO_FINI, bo_fini),
 };
 
 DEFINE_DRM_ACCEL_FOPS(thames_accel_driver_fops);
diff --git a/drivers/accel/thames/thames_gem.c b/drivers/accel/thames/thames_gem.c
index 5a01ddaeb2448117d400a79e53d2c6123ecb5390..4ded8fab0f3ff6f75a1446c5661fdbc68f1f2ac7 100644
--- a/drivers/accel/thames/thames_gem.c
+++ b/drivers/accel/thames/thames_gem.c
@@ -351,3 +351,55 @@ int thames_ioctl_bo_mmap_offset(struct drm_device *ddev, void *data, struct drm_
 
 	return 0;
 }
+
+int thames_ioctl_bo_prep(struct drm_device *ddev, void *data, struct drm_file *file)
+{
+	struct drm_thames_bo_prep *args = data;
+	struct drm_gem_object *gem_obj;
+	struct drm_gem_shmem_object *shmem_obj;
+	unsigned long timeout = drm_timeout_abs_to_jiffies(args->timeout_ns);
+	long ret = 0;
+
+	if (args->reserved != 0)
+		return -EINVAL;
+
+	gem_obj = drm_gem_object_lookup(file, args->handle);
+	if (!gem_obj)
+		return -ENOENT;
+
+	ret = dma_resv_wait_timeout(gem_obj->resv, DMA_RESV_USAGE_WRITE, true, timeout);
+	if (!ret)
+		ret = timeout ? -ETIMEDOUT : -EBUSY;
+
+	shmem_obj = &to_thames_bo(gem_obj)->base;
+
+	dma_sync_sgtable_for_cpu(ddev->dev, shmem_obj->sgt, DMA_FROM_DEVICE);
+
+	drm_gem_object_put(gem_obj);
+
+	return ret;
+}
+
+int thames_ioctl_bo_fini(struct drm_device *ddev, void *data, struct drm_file *file)
+{
+	struct drm_thames_bo_fini *args = data;
+	struct drm_gem_shmem_object *shmem_obj;
+	struct thames_gem_object *thames_obj;
+	struct drm_gem_object *gem_obj;
+
+	if (args->reserved != 0)
+		return -EINVAL;
+
+	gem_obj = drm_gem_object_lookup(file, args->handle);
+	if (!gem_obj)
+		return -ENOENT;
+
+	thames_obj = to_thames_bo(gem_obj);
+	shmem_obj = &thames_obj->base;
+
+	dma_sync_sgtable_for_device(ddev->dev, shmem_obj->sgt, DMA_TO_DEVICE);
+
+	drm_gem_object_put(gem_obj);
+
+	return 0;
+}
diff --git a/drivers/accel/thames/thames_gem.h b/drivers/accel/thames/thames_gem.h
index 785843c40a89a9e84ab634aad77e9ec46111693e..e5a8278e98c578c2903cf23aea1bf887be0389e8 100644
--- a/drivers/accel/thames/thames_gem.h
+++ b/drivers/accel/thames/thames_gem.h
@@ -29,6 +29,10 @@ int thames_ioctl_bo_create(struct drm_device *ddev, void *data, struct drm_file
 
 int thames_ioctl_bo_mmap_offset(struct drm_device *ddev, void *data, struct drm_file *file);
 
+int thames_ioctl_bo_prep(struct drm_device *ddev, void *data, struct drm_file *file);
+
+int thames_ioctl_bo_fini(struct drm_device *ddev, void *data, struct drm_file *file);
+
 int thames_context_create(struct thames_file_priv *priv);
 
 void thames_context_destroy(struct thames_file_priv *priv);
diff --git a/include/uapi/drm/thames_accel.h b/include/uapi/drm/thames_accel.h
index 5b35e50826ed95bfcc3709bef33416d2b6d11c70..07477087211c14721298ff52a1f3d253a6e65d58 100644
--- a/include/uapi/drm/thames_accel.h
+++ b/include/uapi/drm/thames_accel.h
@@ -31,6 +31,12 @@ enum drm_thames_ioctl_id {
 
 	/** @DRM_THAMES_SUBMIT: Submit a job and BOs to run. */
 	DRM_THAMES_SUBMIT,
+
+	/** @DRM_THAMES_BO_PREP: Prepare a BO for CPU access after DSP writes. */
+	DRM_THAMES_BO_PREP,
+
+	/** @DRM_THAMES_BO_FINI: Finish CPU access and prepare BO for DSP access. */
+	DRM_THAMES_BO_FINI,
 };
 
 /**
@@ -127,6 +133,27 @@ struct drm_thames_submit {
 	__u32 pad;
 };
 
+/**
+ * struct drm_thames_bo_prep - ioctl argument for preparing a BO for CPU access.
+ *
+ * This invalidates CPU caches and waits for pending DSP operations to complete.
+ */
+struct drm_thames_bo_prep {
+	__u32 handle;
+	__u32 reserved;
+	__s64 timeout_ns;	/* absolute */
+};
+
+/**
+ * struct drm_thames_bo_fini - ioctl argument for finishing CPU access to a BO.
+ *
+ * This flushes CPU caches to make CPU writes visible to the DSP.
+ */
+struct drm_thames_bo_fini {
+	__u32 handle;
+	__u32 reserved;
+};
+
 /**
  * DRM_IOCTL_THAMES() - Build a thames IOCTL number
  * @__access: Access type. Must be R, W or RW.
@@ -149,6 +176,10 @@ enum {
 		DRM_IOCTL_THAMES(WR, BO_MMAP_OFFSET, bo_mmap_offset),
 	DRM_IOCTL_THAMES_SUBMIT =
 		DRM_IOCTL_THAMES(WR, SUBMIT, submit),
+	DRM_IOCTL_THAMES_BO_PREP =
+		DRM_IOCTL_THAMES(WR, BO_PREP, bo_prep),
+	DRM_IOCTL_THAMES_BO_FINI =
+		DRM_IOCTL_THAMES(WR, BO_FINI, bo_fini),
 };
 
 #if defined(__cplusplus)
-- 
2.52.0
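
For illustration only, here is a rough userspace sketch of how a CPU access
to a mapped BO might be bracketed with the two new IOCTLs. It is not part of
the patch. The struct mirrors, the bo_sync_ops hooks, the stub callbacks and
every function name below are hypothetical; a real caller would include
thames_accel.h and issue DRM_IOCTL_THAMES_BO_PREP and
DRM_IOCTL_THAMES_BO_FINI on the device fd. The deadline helper assumes that
the absolute timeout is measured against CLOCK_MONOTONIC, which is what
drm_timeout_abs_to_jiffies() compares it with.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Local mirrors of the two structs this patch adds to thames_accel.h. */
struct bo_prep_args {
	uint32_t handle;
	uint32_t reserved;	/* must be 0, the kernel returns -EINVAL otherwise */
	int64_t timeout_ns;	/* absolute deadline */
};

struct bo_fini_args {
	uint32_t handle;
	uint32_t reserved;	/* must be 0 */
};

static_assert(sizeof(struct bo_prep_args) == 16, "prep args must have no padding");
static_assert(sizeof(struct bo_fini_args) == 8, "fini args must have no padding");

/*
 * Hypothetical hooks. A real implementation would wrap the two IOCTLs and
 * return 0 on success or a negative errno on failure.
 */
struct bo_sync_ops {
	int (*prep)(void *ctx, const struct bo_prep_args *args);
	int (*fini)(void *ctx, const struct bo_fini_args *args);
	void *ctx;
};

/* Convert a relative wait into an absolute CLOCK_MONOTONIC deadline. */
static int64_t deadline_ns(int64_t relative_ns)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (int64_t)now.tv_sec * 1000000000LL + now.tv_nsec + relative_ns;
}

/* PREP, touch the mapping, then FINI. Skips the access if PREP fails. */
static int bo_cpu_access(const struct bo_sync_ops *ops, uint32_t handle,
			 int64_t wait_ns, void (*access)(void *map), void *map)
{
	struct bo_prep_args prep = {
		.handle = handle,
		.timeout_ns = deadline_ns(wait_ns),
	};
	struct bo_fini_args fini = { .handle = handle };
	int ret = ops->prep(ops->ctx, &prep);

	if (ret < 0)
		return ret;	/* timed out or failed: leave the buffer alone */

	access(map);

	return ops->fini(ops->ctx, &fini);
}

/* Stand-ins that only record calls, so the sketch runs without hardware. */
struct call_log {
	int prep_calls;
	int fini_calls;
	uint32_t handle;
	int prep_result;
};

static int stub_prep(void *ctx, const struct bo_prep_args *args)
{
	struct call_log *log = ctx;

	log->prep_calls++;
	log->handle = args->handle;
	return log->prep_result;
}

static int stub_fini(void *ctx, const struct bo_fini_args *args)
{
	struct call_log *log = ctx;

	(void)args;
	log->fini_calls++;
	return 0;
}

static void write_first_byte(void *map)
{
	*(unsigned char *)map = 1;
}
```

Keeping the IOCTL calls behind function pointers is only a convenience for
this sketch: it lets the PREP/access/FINI ordering be exercised with the
stubs before the UAPI header is available on the build machine.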