From: Lizhi Hou
CC: Lizhi Hou
Subject: [PATCH V7 06/10] accel/amdxdna: Add GEM buffer object management
Date: Thu, 7 Nov 2024 20:34:44 -0800
Message-ID: <20241108043448.449314-7-lizhi.hou@amd.com>
In-Reply-To: <20241108043448.449314-1-lizhi.hou@amd.com>
References: <20241108043448.449314-1-lizhi.hou@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The following types of BOs are supported:

- shmem
  A user application uses shmem BOs as input/output for its workload
  running on the NPU.

- device memory heap
  The fixed-size buffer dedicated to the device.

- device buffer
  A buffer object allocated from the device memory heap.

- command buffer
  A buffer object created for delivering commands. The command buffer
  object is small and pinned on creation.

New IOCTLs are added: CREATE_BO, GET_BO_INFO, SYNC_BO. SYNC_BO is used
to explicitly flush the CPU cache for BO memory.

Co-developed-by: Min Ma Signed-off-by: Min Ma Reviewed-by: Jeffrey Hugo Signed-off-by: Lizhi Hou --- drivers/accel/amdxdna/Makefile | 1 + drivers/accel/amdxdna/aie2_ctx.c | 85 +++- drivers/accel/amdxdna/aie2_message.c | 80 +++ drivers/accel/amdxdna/aie2_pci.h | 3 + drivers/accel/amdxdna/amdxdna_ctx.h | 10 + drivers/accel/amdxdna/amdxdna_gem.c | 621 ++++++++++++++++++++++++ drivers/accel/amdxdna/amdxdna_gem.h | 65 +++ drivers/accel/amdxdna/amdxdna_pci_drv.c | 12 + drivers/accel/amdxdna/amdxdna_pci_drv.h | 6 + include/uapi/drm/amdxdna_accel.h | 77 +++ 10 files changed, 959 insertions(+), 1 deletion(-) create mode 100644 drivers/accel/amdxdna/amdxdna_gem.c create mode 100644 drivers/accel/amdxdna/amdxdna_gem.h diff --git a/drivers/accel/amdxdna/Makefile b/drivers/accel/amdxdna/Makefile index c86c90dfd303..a688c378761f 100644 --- a/drivers/accel/amdxdna/Makefile +++ b/drivers/accel/amdxdna/Makefile @@ -8,6 +8,7 @@ amdxdna-y :=3D \ aie2_smu.o \ aie2_solver.o \ amdxdna_ctx.o \ + amdxdna_gem.o \ amdxdna_mailbox.o \ amdxdna_mailbox_helper.o \ amdxdna_pci_drv.o \ diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_= ctx.c index 022b2b0b015d..ae8a91dad042 100644 --- a/drivers/accel/amdxdna/aie2_ctx.c +++ b/drivers/accel/amdxdna/aie2_ctx.c @@ -5,12 +5,15 @@ =20 #include #include +#include +#include #include #include =20 #include "aie2_pci.h" #include "aie2_solver.h" #include "amdxdna_ctx.h" +#include "amdxdna_gem.h" #include "amdxdna_mailbox.h" #include "amdxdna_pci_drv.h" =20 @@ -128,6 +131,7 @@ int aie2_hwctx_init(struct amdxdna_hwctx *hwctx) struct amdxdna_client *client =3D hwctx->client; struct amdxdna_dev *xdna =3D client->xdna; struct amdxdna_hwctx_priv *priv; + struct amdxdna_gem_obj *heap; int ret; =20 priv =3D kzalloc(sizeof(*hwctx->priv), GFP_KERNEL); @@ -135,10 +139,28 @@ int aie2_hwctx_init(struct amdxdna_hwctx *hwctx) return -ENOMEM; hwctx->priv =3D priv; =20 + mutex_lock(&client->mm_lock); + heap =3D client->dev_heap; + if (!heap) { + 
XDNA_ERR(xdna, "The client dev heap object not exist"); + mutex_unlock(&client->mm_lock); + ret =3D -ENOENT; + goto free_priv; + } + drm_gem_object_get(to_gobj(heap)); + mutex_unlock(&client->mm_lock); + priv->heap =3D heap; + + ret =3D amdxdna_gem_pin(heap); + if (ret) { + XDNA_ERR(xdna, "Dev heap pin failed, ret %d", ret); + goto put_heap; + } + ret =3D aie2_hwctx_col_list(hwctx); if (ret) { XDNA_ERR(xdna, "Create col list failed, ret %d", ret); - goto free_priv; + goto unpin; } =20 ret =3D aie2_alloc_resource(hwctx); @@ -147,14 +169,26 @@ int aie2_hwctx_init(struct amdxdna_hwctx *hwctx) goto free_col_list; } =20 + ret =3D aie2_map_host_buf(xdna->dev_handle, hwctx->fw_ctx_id, + heap->mem.userptr, heap->mem.size); + if (ret) { + XDNA_ERR(xdna, "Map host buffer failed, ret %d", ret); + goto release_resource; + } hwctx->status =3D HWCTX_STAT_INIT; =20 XDNA_DBG(xdna, "hwctx %s init completed", hwctx->name); =20 return 0; =20 +release_resource: + aie2_release_resource(hwctx); free_col_list: kfree(hwctx->col_list); +unpin: + amdxdna_gem_unpin(heap); +put_heap: + drm_gem_object_put(to_gobj(heap)); free_priv: kfree(priv); return ret; @@ -164,11 +198,59 @@ void aie2_hwctx_fini(struct amdxdna_hwctx *hwctx) { aie2_release_resource(hwctx); =20 + amdxdna_gem_unpin(hwctx->priv->heap); + drm_gem_object_put(to_gobj(hwctx->priv->heap)); + kfree(hwctx->col_list); kfree(hwctx->priv); kfree(hwctx->cus); } =20 +static int aie2_hwctx_cu_config(struct amdxdna_hwctx *hwctx, void *buf, u3= 2 size) +{ + struct amdxdna_hwctx_param_config_cu *config =3D buf; + struct amdxdna_dev *xdna =3D hwctx->client->xdna; + u32 total_size; + int ret; + + XDNA_DBG(xdna, "Config %d CU to %s", config->num_cus, hwctx->name); + if (hwctx->status !=3D HWCTX_STAT_INIT) { + XDNA_ERR(xdna, "Not support re-config CU"); + return -EINVAL; + } + + if (!config->num_cus) { + XDNA_ERR(xdna, "Number of CU is zero"); + return -EINVAL; + } + + total_size =3D struct_size(config, cu_configs, config->num_cus); + if 
(total_size > size) { + XDNA_ERR(xdna, "CU config larger than size"); + return -EINVAL; + } + + hwctx->cus =3D kmemdup(config, total_size, GFP_KERNEL); + if (!hwctx->cus) + return -ENOMEM; + + ret =3D aie2_config_cu(hwctx); + if (ret) { + XDNA_ERR(xdna, "Config CU to firmware failed, ret %d", ret); + goto free_cus; + } + + wmb(); /* To avoid locking in command submit when check status */ + hwctx->status =3D HWCTX_STAT_READY; + + return 0; + +free_cus: + kfree(hwctx->cus); + hwctx->cus =3D NULL; + return ret; +} + int aie2_hwctx_config(struct amdxdna_hwctx *hwctx, u32 type, u64 value, vo= id *buf, u32 size) { struct amdxdna_dev *xdna =3D hwctx->client->xdna; @@ -176,6 +258,7 @@ int aie2_hwctx_config(struct amdxdna_hwctx *hwctx, u32 = type, u64 value, void *bu drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); switch (type) { case DRM_AMDXDNA_HWCTX_CONFIG_CU: + return aie2_hwctx_cu_config(hwctx, buf, size); case DRM_AMDXDNA_HWCTX_ASSIGN_DBG_BUF: case DRM_AMDXDNA_HWCTX_REMOVE_DBG_BUF: return -EOPNOTSUPP; diff --git a/drivers/accel/amdxdna/aie2_message.c b/drivers/accel/amdxdna/a= ie2_message.c index 4b8a71bf4fae..40d9e4261e8b 100644 --- a/drivers/accel/amdxdna/aie2_message.c +++ b/drivers/accel/amdxdna/aie2_message.c @@ -5,7 +5,10 @@ =20 #include #include +#include +#include #include +#include #include #include #include @@ -13,6 +16,7 @@ #include "aie2_msg_priv.h" #include "aie2_pci.h" #include "amdxdna_ctx.h" +#include "amdxdna_gem.h" #include "amdxdna_mailbox.h" #include "amdxdna_mailbox_helper.h" #include "amdxdna_pci_drv.h" @@ -282,3 +286,79 @@ int aie2_destroy_context(struct amdxdna_dev_hdl *ndev,= struct amdxdna_hwctx *hwc =20 return ret; } + +int aie2_map_host_buf(struct amdxdna_dev_hdl *ndev, u32 context_id, u64 ad= dr, u64 size) +{ + DECLARE_AIE2_MSG(map_host_buffer, MSG_OP_MAP_HOST_BUFFER); + struct amdxdna_dev *xdna =3D ndev->xdna; + int ret; + + req.context_id =3D context_id; + req.buf_addr =3D addr; + req.buf_size =3D size; + ret =3D 
aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) + return ret; + + XDNA_DBG(xdna, "fw ctx %d map host buf addr 0x%llx size 0x%llx", + context_id, addr, size); + + return 0; +} + +int aie2_config_cu(struct amdxdna_hwctx *hwctx) +{ + struct mailbox_channel *chann =3D hwctx->priv->mbox_chann; + struct amdxdna_dev *xdna =3D hwctx->client->xdna; + u32 shift =3D xdna->dev_info->dev_mem_buf_shift; + DECLARE_AIE2_MSG(config_cu, MSG_OP_CONFIG_CU); + struct drm_gem_object *gobj; + struct amdxdna_gem_obj *abo; + int ret, i; + + if (!chann) + return -ENODEV; + + if (hwctx->cus->num_cus > MAX_NUM_CUS) { + XDNA_DBG(xdna, "Exceed maximum CU %d", MAX_NUM_CUS); + return -EINVAL; + } + + for (i =3D 0; i < hwctx->cus->num_cus; i++) { + struct amdxdna_cu_config *cu =3D &hwctx->cus->cu_configs[i]; + + gobj =3D drm_gem_object_lookup(hwctx->client->filp, cu->cu_bo); + if (!gobj) { + XDNA_ERR(xdna, "Lookup GEM object failed"); + return -EINVAL; + } + abo =3D to_xdna_obj(gobj); + + if (abo->type !=3D AMDXDNA_BO_DEV) { + drm_gem_object_put(gobj); + XDNA_ERR(xdna, "Invalid BO type"); + return -EINVAL; + } + + req.cfgs[i] =3D FIELD_PREP(AIE2_MSG_CFG_CU_PDI_ADDR, + abo->mem.dev_addr >> shift); + req.cfgs[i] |=3D FIELD_PREP(AIE2_MSG_CFG_CU_FUNC, cu->cu_func); + XDNA_DBG(xdna, "CU %d full addr 0x%llx, cfg 0x%x", i, + abo->mem.dev_addr, req.cfgs[i]); + drm_gem_object_put(gobj); + } + req.num_cus =3D hwctx->cus->num_cus; + + ret =3D xdna_send_msg_wait(xdna, chann, &msg); + if (ret =3D=3D -ETIME) + aie2_destroy_context(xdna->dev_handle, hwctx); + + if (resp.status =3D=3D AIE2_STATUS_SUCCESS) { + XDNA_DBG(xdna, "Configure %d CUs, ret %d", req.num_cus, ret); + return 0; + } + + XDNA_ERR(xdna, "Command opcode 0x%x failed, status 0x%x ret %d", + msg.opcode, resp.status, ret); + return ret; +} diff --git a/drivers/accel/amdxdna/aie2_pci.h b/drivers/accel/amdxdna/aie2_= pci.h index b789286bc9d4..3ac936e2c9d1 100644 --- a/drivers/accel/amdxdna/aie2_pci.h +++ b/drivers/accel/amdxdna/aie2_pci.h @@ -119,6 
+119,7 @@ struct rt_config { }; =20 struct amdxdna_hwctx_priv { + struct amdxdna_gem_obj *heap; void *mbox_chann; }; =20 @@ -196,6 +197,8 @@ int aie2_query_firmware_version(struct amdxdna_dev_hdl = *ndev, struct amdxdna_fw_ver *fw_ver); int aie2_create_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx= *hwctx); int aie2_destroy_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwct= x *hwctx); +int aie2_map_host_buf(struct amdxdna_dev_hdl *ndev, u32 context_id, u64 ad= dr, u64 size); +int aie2_config_cu(struct amdxdna_hwctx *hwctx); =20 /* aie2_hwctx.c */ int aie2_hwctx_init(struct amdxdna_hwctx *hwctx); diff --git a/drivers/accel/amdxdna/amdxdna_ctx.h b/drivers/accel/amdxdna/am= dxdna_ctx.h index 00b96cf2e9a7..b409d0731ab8 100644 --- a/drivers/accel/amdxdna/amdxdna_ctx.h +++ b/drivers/accel/amdxdna/amdxdna_ctx.h @@ -6,6 +6,16 @@ #ifndef _AMDXDNA_CTX_H_ #define _AMDXDNA_CTX_H_ =20 +/* Exec buffer command header format */ +#define AMDXDNA_CMD_STATE GENMASK(3, 0) +#define AMDXDNA_CMD_EXTRA_CU_MASK GENMASK(11, 10) +#define AMDXDNA_CMD_COUNT GENMASK(22, 12) +#define AMDXDNA_CMD_OPCODE GENMASK(27, 23) +struct amdxdna_cmd { + u32 header; + u32 data[]; +}; + struct amdxdna_hwctx { struct amdxdna_client *client; struct amdxdna_hwctx_priv *priv; diff --git a/drivers/accel/amdxdna/amdxdna_gem.c b/drivers/accel/amdxdna/am= dxdna_gem.c new file mode 100644 index 000000000000..f2ba86ae9e1a --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_gem.c @@ -0,0 +1,621 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. 
+ */ + +#include +#include +#include +#include +#include +#include +#include + +#include "amdxdna_ctx.h" +#include "amdxdna_gem.h" +#include "amdxdna_pci_drv.h" + +#define XDNA_MAX_CMD_BO_SIZE SZ_32K + +static int +amdxdna_gem_insert_node_locked(struct amdxdna_gem_obj *abo, bool use_vmap) +{ + struct amdxdna_client *client =3D abo->client; + struct amdxdna_dev *xdna =3D client->xdna; + struct amdxdna_mem *mem =3D &abo->mem; + u64 offset; + u32 align; + int ret; + + align =3D 1 << max(PAGE_SHIFT, xdna->dev_info->dev_mem_buf_shift); + ret =3D drm_mm_insert_node_generic(&abo->dev_heap->mm, &abo->mm_node, + mem->size, align, + 0, DRM_MM_INSERT_BEST); + if (ret) { + XDNA_ERR(xdna, "Failed to alloc dev bo memory, ret %d", ret); + return ret; + } + + mem->dev_addr =3D abo->mm_node.start; + offset =3D mem->dev_addr - abo->dev_heap->mem.dev_addr; + mem->userptr =3D abo->dev_heap->mem.userptr + offset; + mem->pages =3D &abo->dev_heap->base.pages[offset >> PAGE_SHIFT]; + mem->nr_pages =3D mem->size >> PAGE_SHIFT; + + if (use_vmap) { + mem->kva =3D vmap(mem->pages, mem->nr_pages, VM_MAP, PAGE_KERNEL); + if (!mem->kva) { + XDNA_ERR(xdna, "Failed to vmap"); + drm_mm_remove_node(&abo->mm_node); + return -EFAULT; + } + } + + return 0; +} + +static void amdxdna_gem_obj_free(struct drm_gem_object *gobj) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(gobj->dev); + struct amdxdna_gem_obj *abo =3D to_xdna_obj(gobj); + struct iosys_map map =3D IOSYS_MAP_INIT_VADDR(abo->mem.kva); + + XDNA_DBG(xdna, "BO type %d xdna_addr 0x%llx", abo->type, abo->mem.dev_add= r); + if (abo->pinned) + amdxdna_gem_unpin(abo); + + if (abo->type =3D=3D AMDXDNA_BO_DEV) { + mutex_lock(&abo->client->mm_lock); + drm_mm_remove_node(&abo->mm_node); + mutex_unlock(&abo->client->mm_lock); + + vunmap(abo->mem.kva); + drm_gem_object_put(to_gobj(abo->dev_heap)); + drm_gem_object_release(gobj); + mutex_destroy(&abo->lock); + kfree(abo); + return; + } + + if (abo->type =3D=3D AMDXDNA_BO_DEV_HEAP) + 
drm_mm_takedown(&abo->mm); + + drm_gem_vunmap_unlocked(gobj, &map); + mutex_destroy(&abo->lock); + drm_gem_shmem_free(&abo->base); +} + +static const struct drm_gem_object_funcs amdxdna_gem_dev_obj_funcs =3D { + .free =3D amdxdna_gem_obj_free, +}; + +static bool amdxdna_hmm_invalidate(struct mmu_interval_notifier *mni, + const struct mmu_notifier_range *range, + unsigned long cur_seq) +{ + struct amdxdna_gem_obj *abo =3D container_of(mni, struct amdxdna_gem_obj, + mem.notifier); + struct amdxdna_dev *xdna =3D to_xdna_dev(to_gobj(abo)->dev); + + XDNA_DBG(xdna, "Invalid range 0x%llx, 0x%lx, type %d", + abo->mem.userptr, abo->mem.size, abo->type); + + if (!mmu_notifier_range_blockable(range)) + return false; + + xdna->dev_info->ops->hmm_invalidate(abo, cur_seq); + + return true; +} + +static const struct mmu_interval_notifier_ops amdxdna_hmm_ops =3D { + .invalidate =3D amdxdna_hmm_invalidate, +}; + +static void amdxdna_hmm_unregister(struct amdxdna_gem_obj *abo) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(to_gobj(abo)->dev); + + if (!xdna->dev_info->ops->hmm_invalidate) + return; + + mmu_interval_notifier_remove(&abo->mem.notifier); + kvfree(abo->mem.pfns); + abo->mem.pfns =3D NULL; +} + +static int amdxdna_hmm_register(struct amdxdna_gem_obj *abo, unsigned long= addr, + size_t len) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(to_gobj(abo)->dev); + u32 nr_pages; + int ret; + + if (!xdna->dev_info->ops->hmm_invalidate) + return 0; + + if (abo->mem.pfns) + return -EEXIST; + + nr_pages =3D (PAGE_ALIGN(addr + len) - (addr & PAGE_MASK)) >> PAGE_SHIFT; + abo->mem.pfns =3D kvcalloc(nr_pages, sizeof(*abo->mem.pfns), + GFP_KERNEL); + if (!abo->mem.pfns) + return -ENOMEM; + + ret =3D mmu_interval_notifier_insert_locked(&abo->mem.notifier, + current->mm, + addr, + len, + &amdxdna_hmm_ops); + if (ret) { + XDNA_ERR(xdna, "Insert mmu notifier failed, ret %d", ret); + kvfree(abo->mem.pfns); + } + abo->mem.userptr =3D addr; + + return ret; +} + +static int 
amdxdna_gem_obj_mmap(struct drm_gem_object *gobj, + struct vm_area_struct *vma) +{ + struct amdxdna_gem_obj *abo =3D to_xdna_obj(gobj); + unsigned long num_pages; + int ret; + + ret =3D amdxdna_hmm_register(abo, vma->vm_start, gobj->size); + if (ret) + return ret; + + ret =3D drm_gem_shmem_mmap(&abo->base, vma); + if (ret) + goto hmm_unreg; + + num_pages =3D gobj->size >> PAGE_SHIFT; + /* Try to insert the pages */ + vm_flags_mod(vma, VM_MIXEDMAP, VM_PFNMAP); + ret =3D vm_insert_pages(vma, vma->vm_start, abo->base.pages, &num_pages); + if (ret) + XDNA_ERR(abo->client->xdna, "Failed insert pages, ret %d", ret); + + return 0; + +hmm_unreg: + amdxdna_hmm_unregister(abo); + return ret; +} + +static vm_fault_t amdxdna_gem_vm_fault(struct vm_fault *vmf) +{ + return drm_gem_shmem_vm_ops.fault(vmf); +} + +static void amdxdna_gem_vm_open(struct vm_area_struct *vma) +{ + drm_gem_shmem_vm_ops.open(vma); +} + +static void amdxdna_gem_vm_close(struct vm_area_struct *vma) +{ + struct drm_gem_object *gobj =3D vma->vm_private_data; + + amdxdna_hmm_unregister(to_xdna_obj(gobj)); + drm_gem_shmem_vm_ops.close(vma); +} + +static const struct vm_operations_struct amdxdna_gem_vm_ops =3D { + .fault =3D amdxdna_gem_vm_fault, + .open =3D amdxdna_gem_vm_open, + .close =3D amdxdna_gem_vm_close, +}; + +static const struct drm_gem_object_funcs amdxdna_gem_shmem_funcs =3D { + .free =3D amdxdna_gem_obj_free, + .print_info =3D drm_gem_shmem_object_print_info, + .pin =3D drm_gem_shmem_object_pin, + .unpin =3D drm_gem_shmem_object_unpin, + .get_sg_table =3D drm_gem_shmem_object_get_sg_table, + .vmap =3D drm_gem_shmem_object_vmap, + .vunmap =3D drm_gem_shmem_object_vunmap, + .mmap =3D amdxdna_gem_obj_mmap, + .vm_ops =3D &amdxdna_gem_vm_ops, +}; + +static struct amdxdna_gem_obj * +amdxdna_gem_create_obj(struct drm_device *dev, size_t size) +{ + struct amdxdna_gem_obj *abo; + + abo =3D kzalloc(sizeof(*abo), GFP_KERNEL); + if (!abo) + return ERR_PTR(-ENOMEM); + + abo->pinned =3D false; + 
abo->assigned_hwctx =3D AMDXDNA_INVALID_CTX_HANDLE; + mutex_init(&abo->lock); + + abo->mem.userptr =3D AMDXDNA_INVALID_ADDR; + abo->mem.dev_addr =3D AMDXDNA_INVALID_ADDR; + abo->mem.size =3D size; + + return abo; +} + +/* For drm_driver->gem_create_object callback */ +struct drm_gem_object * +amdxdna_gem_create_object_cb(struct drm_device *dev, size_t size) +{ + struct amdxdna_gem_obj *abo; + + abo =3D amdxdna_gem_create_obj(dev, size); + if (IS_ERR(abo)) + return ERR_CAST(abo); + + to_gobj(abo)->funcs =3D &amdxdna_gem_shmem_funcs; + + return to_gobj(abo); +} + +static struct amdxdna_gem_obj * +amdxdna_drm_alloc_shmem(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp) +{ + struct amdxdna_client *client =3D filp->driver_priv; + struct drm_gem_shmem_object *shmem; + struct amdxdna_gem_obj *abo; + + shmem =3D drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem)) + return ERR_CAST(shmem); + + shmem->map_wc =3D false; + + abo =3D to_xdna_obj(&shmem->base); + abo->client =3D client; + abo->type =3D AMDXDNA_BO_SHMEM; + + return abo; +} + +static struct amdxdna_gem_obj * +amdxdna_drm_create_dev_heap(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp) +{ + struct amdxdna_client *client =3D filp->driver_priv; + struct amdxdna_dev *xdna =3D to_xdna_dev(dev); + struct drm_gem_shmem_object *shmem; + struct amdxdna_gem_obj *abo; + int ret; + + if (args->size > xdna->dev_info->dev_mem_size) { + XDNA_DBG(xdna, "Invalid dev heap size 0x%llx, limit 0x%lx", + args->size, xdna->dev_info->dev_mem_size); + return ERR_PTR(-EINVAL); + } + + mutex_lock(&client->mm_lock); + if (client->dev_heap) { + XDNA_DBG(client->xdna, "dev heap is already created"); + ret =3D -EBUSY; + goto mm_unlock; + } + + shmem =3D drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem)) { + ret =3D PTR_ERR(shmem); + goto mm_unlock; + } + + shmem->map_wc =3D false; + abo =3D to_xdna_obj(&shmem->base); + + abo->type =3D 
AMDXDNA_BO_DEV_HEAP; + abo->client =3D client; + abo->mem.dev_addr =3D client->xdna->dev_info->dev_mem_base; + drm_mm_init(&abo->mm, abo->mem.dev_addr, abo->mem.size); + + client->dev_heap =3D abo; + drm_gem_object_get(to_gobj(abo)); + mutex_unlock(&client->mm_lock); + + return abo; + +mm_unlock: + mutex_unlock(&client->mm_lock); + return ERR_PTR(ret); +} + +struct amdxdna_gem_obj * +amdxdna_drm_alloc_dev_bo(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp, bool use_vmap) +{ + struct amdxdna_client *client =3D filp->driver_priv; + struct amdxdna_dev *xdna =3D to_xdna_dev(dev); + size_t aligned_sz =3D PAGE_ALIGN(args->size); + struct amdxdna_gem_obj *abo, *heap; + int ret; + + mutex_lock(&client->mm_lock); + heap =3D client->dev_heap; + if (!heap) { + ret =3D -EINVAL; + goto mm_unlock; + } + + if (heap->mem.userptr =3D=3D AMDXDNA_INVALID_ADDR) { + XDNA_ERR(xdna, "Invalid dev heap userptr"); + ret =3D -EINVAL; + goto mm_unlock; + } + + if (args->size > heap->mem.size) { + XDNA_ERR(xdna, "Invalid dev bo size 0x%llx, limit 0x%lx", + args->size, heap->mem.size); + ret =3D -EINVAL; + goto mm_unlock; + } + + abo =3D amdxdna_gem_create_obj(&xdna->ddev, aligned_sz); + if (IS_ERR(abo)) { + ret =3D PTR_ERR(abo); + goto mm_unlock; + } + to_gobj(abo)->funcs =3D &amdxdna_gem_dev_obj_funcs; + abo->type =3D AMDXDNA_BO_DEV; + abo->client =3D client; + abo->dev_heap =3D heap; + ret =3D amdxdna_gem_insert_node_locked(abo, use_vmap); + if (ret) { + XDNA_ERR(xdna, "Failed to alloc dev bo memory, ret %d", ret); + goto mm_unlock; + } + + drm_gem_object_get(to_gobj(heap)); + drm_gem_private_object_init(&xdna->ddev, to_gobj(abo), aligned_sz); + + mutex_unlock(&client->mm_lock); + return abo; + +mm_unlock: + mutex_unlock(&client->mm_lock); + return ERR_PTR(ret); +} + +static struct amdxdna_gem_obj * +amdxdna_drm_create_cmd_bo(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp) +{ + struct amdxdna_dev *xdna =3D 
to_xdna_dev(dev);
+	struct drm_gem_shmem_object *shmem;
+	struct amdxdna_gem_obj *abo;
+	struct iosys_map map;
+	int ret;
+
+	if (args->size > XDNA_MAX_CMD_BO_SIZE) {
+		XDNA_ERR(xdna, "Command bo size 0x%llx too large", args->size);
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (args->size < sizeof(struct amdxdna_cmd)) {
+		XDNA_DBG(xdna, "Command BO size 0x%llx too small", args->size);
+		return ERR_PTR(-EINVAL);
+	}
+
+	shmem = drm_gem_shmem_create(dev, args->size);
+	if (IS_ERR(shmem))
+		return ERR_CAST(shmem);
+
+	shmem->map_wc = false;
+	abo = to_xdna_obj(&shmem->base);
+
+	abo->type = AMDXDNA_BO_CMD;
+	abo->client = filp->driver_priv;
+
+	ret = drm_gem_vmap_unlocked(to_gobj(abo), &map);
+	if (ret) {
+		XDNA_ERR(xdna, "Vmap cmd bo failed, ret %d", ret);
+		goto release_obj;
+	}
+	abo->mem.kva = map.vaddr;
+
+	return abo;
+
+release_obj:
+	drm_gem_shmem_free(shmem);
+	return ERR_PTR(ret);
+}
+
+int amdxdna_drm_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+{
+	struct amdxdna_dev *xdna = to_xdna_dev(dev);
+	struct amdxdna_drm_create_bo *args = data;
+	struct amdxdna_gem_obj *abo;
+	int ret;
+
+	if (args->flags || args->vaddr || !args->size)
+		return -EINVAL;
+
+	XDNA_DBG(xdna, "BO arg type %d vaddr 0x%llx size 0x%llx flags 0x%llx",
+		 args->type, args->vaddr, args->size, args->flags);
+	switch (args->type) {
+	case AMDXDNA_BO_SHMEM:
+		abo = amdxdna_drm_alloc_shmem(dev, args, filp);
+		break;
+	case AMDXDNA_BO_DEV_HEAP:
+		abo = amdxdna_drm_create_dev_heap(dev, args, filp);
+		break;
+	case AMDXDNA_BO_DEV:
+		abo = amdxdna_drm_alloc_dev_bo(dev, args, filp, false);
+		break;
+	case AMDXDNA_BO_CMD:
+		abo = amdxdna_drm_create_cmd_bo(dev, args, filp);
+		break;
+	default:
+		return -EINVAL;
+	}
+	if (IS_ERR(abo))
+		return PTR_ERR(abo);
+
+	/* ready to publish object to userspace */
+	ret = drm_gem_handle_create(filp, to_gobj(abo), &args->handle);
+	if (ret) {
+		XDNA_ERR(xdna, "Create handle failed");
+		goto put_obj;
+	}
+
+	XDNA_DBG(xdna, "BO hdl %d type %d userptr 0x%llx xdna_addr 0x%llx size 0x%lx",
+		 args->handle, args->type, abo->mem.userptr,
+		 abo->mem.dev_addr, abo->mem.size);
+put_obj:
+	/* Dereference object reference. Handle holds it now. */
+	drm_gem_object_put(to_gobj(abo));
+	return ret;
+}
+
+int amdxdna_gem_pin_nolock(struct amdxdna_gem_obj *abo)
+{
+	struct amdxdna_dev *xdna = to_xdna_dev(to_gobj(abo)->dev);
+	int ret;
+
+	switch (abo->type) {
+	case AMDXDNA_BO_SHMEM:
+	case AMDXDNA_BO_DEV_HEAP:
+		ret = drm_gem_shmem_pin(&abo->base);
+		break;
+	case AMDXDNA_BO_DEV:
+		ret = drm_gem_shmem_pin(&abo->dev_heap->base);
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+	}
+
+	XDNA_DBG(xdna, "BO type %d ret %d", abo->type, ret);
+	return ret;
+}
+
+int amdxdna_gem_pin(struct amdxdna_gem_obj *abo)
+{
+	int ret;
+
+	if (abo->type == AMDXDNA_BO_DEV)
+		abo = abo->dev_heap;
+
+	mutex_lock(&abo->lock);
+	ret = amdxdna_gem_pin_nolock(abo);
+	mutex_unlock(&abo->lock);
+
+	return ret;
+}
+
+void amdxdna_gem_unpin(struct amdxdna_gem_obj *abo)
+{
+	if (abo->type == AMDXDNA_BO_DEV)
+		abo = abo->dev_heap;
+
+	mutex_lock(&abo->lock);
+	drm_gem_shmem_unpin(&abo->base);
+	mutex_unlock(&abo->lock);
+}
+
+struct amdxdna_gem_obj *amdxdna_gem_get_obj(struct amdxdna_client *client,
+					    u32 bo_hdl, u8 bo_type)
+{
+	struct amdxdna_dev *xdna = client->xdna;
+	struct amdxdna_gem_obj *abo;
+	struct drm_gem_object *gobj;
+
+	gobj = drm_gem_object_lookup(client->filp, bo_hdl);
+	if (!gobj) {
+		XDNA_DBG(xdna, "Can not find bo %d", bo_hdl);
+		return NULL;
+	}
+
+	abo = to_xdna_obj(gobj);
+	if (bo_type == AMDXDNA_BO_INVALID || abo->type == bo_type)
+		return abo;
+
+	drm_gem_object_put(gobj);
+	return NULL;
+}
+
+int amdxdna_drm_get_bo_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+{
+	struct amdxdna_drm_get_bo_info *args = data;
+	struct amdxdna_dev *xdna = to_xdna_dev(dev);
+	struct amdxdna_gem_obj *abo;
+	struct drm_gem_object *gobj;
+	int ret = 0;
+
+	if (args->ext || args->ext_flags)
+		return -EINVAL;
+
+	gobj = drm_gem_object_lookup(filp, args->handle);
+	if (!gobj) {
+		XDNA_DBG(xdna, "Lookup GEM object %d failed", args->handle);
+		return -ENOENT;
+	}
+
+	abo = to_xdna_obj(gobj);
+	args->vaddr = abo->mem.userptr;
+	args->xdna_addr = abo->mem.dev_addr;
+
+	if (abo->type != AMDXDNA_BO_DEV)
+		args->map_offset = drm_vma_node_offset_addr(&gobj->vma_node);
+	else
+		args->map_offset = AMDXDNA_INVALID_ADDR;
+
+	XDNA_DBG(xdna, "BO hdl %d map_offset 0x%llx vaddr 0x%llx xdna_addr 0x%llx",
+		 args->handle, args->map_offset, args->vaddr, args->xdna_addr);
+
+	drm_gem_object_put(gobj);
+	return ret;
+}
+
+/*
+ * The sync bo ioctl is to make sure the CPU cache is in sync with memory.
+ * This is required because NPU is not cache coherent device. CPU cache
+ * flushing/invalidation is expensive so it is best to handle this outside
+ * of the command submission path. This ioctl allows explicit cache
+ * flushing/invalidation outside of the critical path.
+ */
+int amdxdna_drm_sync_bo_ioctl(struct drm_device *dev,
+			      void *data, struct drm_file *filp)
+{
+	struct amdxdna_dev *xdna = to_xdna_dev(dev);
+	struct amdxdna_drm_sync_bo *args = data;
+	struct amdxdna_gem_obj *abo;
+	struct drm_gem_object *gobj;
+	int ret;
+
+	gobj = drm_gem_object_lookup(filp, args->handle);
+	if (!gobj) {
+		XDNA_ERR(xdna, "Lookup GEM object failed");
+		return -ENOENT;
+	}
+	abo = to_xdna_obj(gobj);
+
+	ret = amdxdna_gem_pin(abo);
+	if (ret) {
+		XDNA_ERR(xdna, "Pin BO %d failed, ret %d", args->handle, ret);
+		goto put_obj;
+	}
+
+	if (abo->type == AMDXDNA_BO_DEV)
+		drm_clflush_pages(abo->mem.pages, abo->mem.nr_pages);
+	else
+		drm_clflush_pages(abo->base.pages, gobj->size >> PAGE_SHIFT);
+
+	amdxdna_gem_unpin(abo);
+
+	XDNA_DBG(xdna, "Sync bo %d offset 0x%llx, size 0x%llx\n",
+		 args->handle, args->offset, args->size);
+
+put_obj:
+	drm_gem_object_put(gobj);
+	return ret;
+}
diff --git a/drivers/accel/amdxdna/amdxdna_gem.h b/drivers/accel/amdxdna/amdxdna_gem.h
new file mode 100644
index 000000000000..8ccc0375dd9d
--- /dev/null
+++ b/drivers/accel/amdxdna/amdxdna_gem.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2024, Advanced Micro Devices, Inc.
+ */
+
+#ifndef _AMDXDNA_GEM_H_
+#define _AMDXDNA_GEM_H_
+
+struct amdxdna_mem {
+	u64 userptr;
+	void *kva;
+	u64 dev_addr;
+	size_t size;
+	struct page **pages;
+	u32 nr_pages;
+	struct mmu_interval_notifier notifier;
+	unsigned long *pfns;
+	bool map_invalid;
+};
+
+struct amdxdna_gem_obj {
+	struct drm_gem_shmem_object base;
+	struct amdxdna_client *client;
+	u8 type;
+	bool pinned;
+	struct mutex lock; /* Protects: pinned */
+	struct amdxdna_mem mem;
+
+	/* Below members is uninitialized when needed */
+	struct drm_mm mm; /* For AMDXDNA_BO_DEV_HEAP */
+	struct amdxdna_gem_obj *dev_heap; /* For AMDXDNA_BO_DEV */
+	struct drm_mm_node mm_node; /* For AMDXDNA_BO_DEV */
+	u32 assigned_hwctx;
+};
+
+#define to_gobj(obj) (&(obj)->base.base)
+
+static inline struct amdxdna_gem_obj *to_xdna_obj(struct drm_gem_object *gobj)
+{
+	return container_of(gobj, struct amdxdna_gem_obj, base.base);
+}
+
+struct amdxdna_gem_obj *amdxdna_gem_get_obj(struct amdxdna_client *client,
+					    u32 bo_hdl, u8 bo_type);
+static inline void amdxdna_gem_put_obj(struct amdxdna_gem_obj *abo)
+{
+	drm_gem_object_put(to_gobj(abo));
+}
+
+struct drm_gem_object *
+amdxdna_gem_create_object_cb(struct drm_device *dev, size_t size);
+struct amdxdna_gem_obj *
+amdxdna_drm_alloc_dev_bo(struct drm_device *dev,
+			 struct amdxdna_drm_create_bo *args,
+			 struct drm_file *filp, bool use_vmap);
+
+int amdxdna_gem_pin_nolock(struct amdxdna_gem_obj *abo);
+int amdxdna_gem_pin(struct amdxdna_gem_obj *abo);
+void amdxdna_gem_unpin(struct amdxdna_gem_obj *abo);
+
+int amdxdna_drm_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *filp);
+int amdxdna_drm_get_bo_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp);
+int amdxdna_drm_sync_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *filp);
+
+#endif /* _AMDXDNA_GEM_H_ */
diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.c b/drivers/accel/amdxdna/amdxdna_pci_drv.c
index dfe682df5640..172109cc9617 100644
--- a/drivers/accel/amdxdna/amdxdna_pci_drv.c
+++ b/drivers/accel/amdxdna/amdxdna_pci_drv.c
@@ -7,12 +7,14 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
 #include
 
 #include "amdxdna_ctx.h"
+#include "amdxdna_gem.h"
 #include "amdxdna_pci_drv.h"
 
 /*
@@ -63,6 +65,7 @@ static int amdxdna_drm_open(struct drm_device *ddev, struct drm_file *filp)
 	}
 	mutex_init(&client->hwctx_lock);
 	idr_init_base(&client->hwctx_idr, AMDXDNA_INVALID_CTX_HANDLE + 1);
+	mutex_init(&client->mm_lock);
 
 	mutex_lock(&xdna->dev_lock);
 	list_add_tail(&client->node, &xdna->client_list);
@@ -91,6 +94,9 @@ static void amdxdna_drm_close(struct drm_device *ddev, struct drm_file *filp)
 
 	idr_destroy(&client->hwctx_idr);
 	mutex_destroy(&client->hwctx_lock);
+	mutex_destroy(&client->mm_lock);
+	if (client->dev_heap)
+		drm_gem_object_put(to_gobj(client->dev_heap));
 
 	iommu_sva_unbind_device(client->sva);
 
@@ -123,6 +129,10 @@ static const struct drm_ioctl_desc amdxdna_drm_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(AMDXDNA_CREATE_HWCTX, amdxdna_drm_create_hwctx_ioctl, 0),
 	DRM_IOCTL_DEF_DRV(AMDXDNA_DESTROY_HWCTX, amdxdna_drm_destroy_hwctx_ioctl, 0),
 	DRM_IOCTL_DEF_DRV(AMDXDNA_CONFIG_HWCTX, amdxdna_drm_config_hwctx_ioctl, 0),
+	/* BO */
+	DRM_IOCTL_DEF_DRV(AMDXDNA_CREATE_BO, amdxdna_drm_create_bo_ioctl, 0),
+	DRM_IOCTL_DEF_DRV(AMDXDNA_GET_BO_INFO, amdxdna_drm_get_bo_info_ioctl, 0),
+	DRM_IOCTL_DEF_DRV(AMDXDNA_SYNC_BO, amdxdna_drm_sync_bo_ioctl, 0),
 };
 
 static const struct file_operations amdxdna_fops = {
@@ -149,6 +159,8 @@ const struct drm_driver amdxdna_drm_drv = {
 	.postclose = amdxdna_drm_close,
 	.ioctls = amdxdna_drm_ioctls,
 	.num_ioctls = ARRAY_SIZE(amdxdna_drm_ioctls),
+
+	.gem_create_object = amdxdna_gem_create_object_cb,
 };
 
 static const struct amdxdna_dev_info *
diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.h b/drivers/accel/amdxdna/amdxdna_pci_drv.h
index 5ec7fe168406..3dddde4ac12a 100644
--- a/drivers/accel/amdxdna/amdxdna_pci_drv.h
+++ b/drivers/accel/amdxdna/amdxdna_pci_drv.h
@@ -18,6 +18,7 @@
 extern const struct drm_driver amdxdna_drm_drv;
 
 struct amdxdna_dev;
+struct amdxdna_gem_obj;
 struct amdxdna_hwctx;
 
 /*
@@ -29,6 +30,7 @@ struct amdxdna_dev_ops {
 	int (*hwctx_init)(struct amdxdna_hwctx *hwctx);
 	void (*hwctx_fini)(struct amdxdna_hwctx *hwctx);
 	int (*hwctx_config)(struct amdxdna_hwctx *hwctx, u32 type, u64 value, void *buf, u32 size);
+	void (*hmm_invalidate)(struct amdxdna_gem_obj *abo, unsigned long cur_seq);
 };
 
 /*
@@ -89,6 +91,10 @@ struct amdxdna_client {
 	struct idr hwctx_idr;
 	struct amdxdna_dev *xdna;
 	struct drm_file *filp;
+
+	struct mutex mm_lock; /* protect memory related */
+	struct amdxdna_gem_obj *dev_heap;
+
 	struct iommu_sva *sva;
 	int pasid;
 };
diff --git a/include/uapi/drm/amdxdna_accel.h b/include/uapi/drm/amdxdna_accel.h
index a0dc821c1363..e3e78b79a8e7 100644
--- a/include/uapi/drm/amdxdna_accel.h
+++ b/include/uapi/drm/amdxdna_accel.h
@@ -13,7 +13,9 @@
 extern "C" {
 #endif
 
+#define AMDXDNA_INVALID_ADDR (~0UL)
 #define AMDXDNA_INVALID_CTX_HANDLE 0
+#define AMDXDNA_INVALID_BO_HANDLE 0
 
 enum amdxdna_device_type {
 	AMDXDNA_DEV_TYPE_UNKNOWN = -1,
@@ -24,6 +26,9 @@ enum amdxdna_drm_ioctl_id {
 	DRM_AMDXDNA_CREATE_HWCTX,
 	DRM_AMDXDNA_DESTROY_HWCTX,
 	DRM_AMDXDNA_CONFIG_HWCTX,
+	DRM_AMDXDNA_CREATE_BO,
+	DRM_AMDXDNA_GET_BO_INFO,
+	DRM_AMDXDNA_SYNC_BO,
 };
 
 /**
@@ -136,6 +141,66 @@ struct amdxdna_drm_config_hwctx {
 	__u32 pad;
 };
 
+enum amdxdna_bo_type {
+	AMDXDNA_BO_INVALID = 0,
+	AMDXDNA_BO_SHMEM,
+	AMDXDNA_BO_DEV_HEAP,
+	AMDXDNA_BO_DEV,
+	AMDXDNA_BO_CMD,
+};
+
+/**
+ * struct amdxdna_drm_create_bo - Create a buffer object.
+ * @flags: Buffer flags. MBZ.
+ * @vaddr: User VA of buffer if applied. MBZ.
+ * @size: Size in bytes.
+ * @type: Buffer type.
+ * @handle: Returned DRM buffer object handle.
+ */
+struct amdxdna_drm_create_bo {
+	__u64 flags;
+	__u64 vaddr;
+	__u64 size;
+	__u32 type;
+	__u32 handle;
+};
+
+/**
+ * struct amdxdna_drm_get_bo_info - Get buffer object information.
+ * @ext: MBZ.
+ * @ext_flags: MBZ.
+ * @handle: DRM buffer object handle.
+ * @pad: Structure padding.
+ * @map_offset: Returned DRM fake offset for mmap().
+ * @vaddr: Returned user VA of buffer. 0 in case user needs mmap().
+ * @xdna_addr: Returned XDNA device virtual address.
+ */
+struct amdxdna_drm_get_bo_info {
+	__u64 ext;
+	__u64 ext_flags;
+	__u32 handle;
+	__u32 pad;
+	__u64 map_offset;
+	__u64 vaddr;
+	__u64 xdna_addr;
+};
+
+/**
+ * struct amdxdna_drm_sync_bo - Sync buffer object.
+ * @handle: Buffer object handle.
+ * @direction: Direction of sync, can be from device or to device.
+ * @offset: Offset in the buffer to sync.
+ * @size: Size in bytes.
+ */
+struct amdxdna_drm_sync_bo {
+	__u32 handle;
+#define SYNC_DIRECT_TO_DEVICE 0U
+#define SYNC_DIRECT_FROM_DEVICE 1U
+	__u32 direction;
+	__u64 offset;
+	__u64 size;
+};
+
 #define DRM_IOCTL_AMDXDNA_CREATE_HWCTX \
 	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_CREATE_HWCTX, \
 		 struct amdxdna_drm_create_hwctx)
@@ -148,6 +213,18 @@ struct amdxdna_drm_config_hwctx {
 	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_CONFIG_HWCTX, \
 		 struct amdxdna_drm_config_hwctx)
 
+#define DRM_IOCTL_AMDXDNA_CREATE_BO \
+	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_CREATE_BO, \
+		 struct amdxdna_drm_create_bo)
+
+#define DRM_IOCTL_AMDXDNA_GET_BO_INFO \
+	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_GET_BO_INFO, \
+		 struct amdxdna_drm_get_bo_info)
+
+#define DRM_IOCTL_AMDXDNA_SYNC_BO \
+	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_SYNC_BO, \
+		 struct amdxdna_drm_sync_bo)
+
 #if defined(__cplusplus)
 } /* extern c end */
 #endif
-- 
2.34.1
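
For readers working from the uapi side, the following is a minimal userspace sketch of how the argument structs added above might be prepared. It is not part of the patch. The struct layouts are local copies of the ones in include/uapi/drm/amdxdna_accel.h (real code would include that header), and the three example_* helpers are hypothetical names introduced only for illustration. The check helper mirrors only the generic validation in amdxdna_drm_create_bo_ioctl(); type-specific limits such as XDNA_MAX_CMD_BO_SIZE are not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Local copies of the uapi definitions added by this patch, duplicated so
 * the sketch builds without kernel headers. Real code would include
 * <drm/amdxdna_accel.h> instead.
 */
enum amdxdna_bo_type {
	AMDXDNA_BO_INVALID = 0,
	AMDXDNA_BO_SHMEM,
	AMDXDNA_BO_DEV_HEAP,
	AMDXDNA_BO_DEV,
	AMDXDNA_BO_CMD,
};

struct amdxdna_drm_create_bo {
	uint64_t flags;		/* MBZ */
	uint64_t vaddr;		/* MBZ */
	uint64_t size;
	uint32_t type;
	uint32_t handle;	/* returned by the kernel */
};

struct amdxdna_drm_sync_bo {
	uint32_t handle;
#define SYNC_DIRECT_TO_DEVICE	0U
#define SYNC_DIRECT_FROM_DEVICE	1U
	uint32_t direction;
	uint64_t offset;
	uint64_t size;
};

/*
 * Hypothetical helper (not in the patch): zero the whole struct so the
 * must-be-zero fields are 0, then fill in type and size.
 */
void example_init_create_bo(struct amdxdna_drm_create_bo *args,
			    uint32_t type, uint64_t size)
{
	assert(args);
	memset(args, 0, sizeof(*args));
	args->type = type;
	args->size = size;
}

/*
 * Hypothetical helper (not in the patch): mirrors the generic checks in
 * amdxdna_drm_create_bo_ioctl(). Returns 0 if the request would pass
 * them, -EINVAL otherwise.
 */
int example_check_create_bo(const struct amdxdna_drm_create_bo *args)
{
	if (args->flags || args->vaddr || !args->size)
		return -EINVAL;

	switch (args->type) {
	case AMDXDNA_BO_SHMEM:
	case AMDXDNA_BO_DEV_HEAP:
	case AMDXDNA_BO_DEV:
	case AMDXDNA_BO_CMD:
		return 0;
	default:
		return -EINVAL;
	}
}

/* Hypothetical helper (not in the patch): fill a sync request. */
void example_init_sync_bo(struct amdxdna_drm_sync_bo *args, uint32_t handle,
			  uint32_t direction, uint64_t offset, uint64_t size)
{
	assert(args);
	memset(args, 0, sizeof(*args));
	args->handle = handle;
	args->direction = direction;
	args->offset = offset;
	args->size = size;
}
```

A caller would then pass the filled struct to ioctl() with DRM_IOCTL_AMDXDNA_CREATE_BO or DRM_IOCTL_AMDXDNA_SYNC_BO on an open file descriptor for the device. Note that, as posted, amdxdna_drm_sync_bo_ioctl() flushes every page of the BO and uses offset and size only in its debug print, so the sketch does not try to validate those fields.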