From: Lizhi Hou <lizhi.hou@amd.com>
Subject: [PATCH V3 06/11] accel/amdxdna: Add GEM buffer object management
Date: Wed, 11 Sep 2024 11:05:59 -0700
Message-ID: <20240911180604.1834434-7-lizhi.hou@amd.com>
In-Reply-To: <20240911180604.1834434-1-lizhi.hou@amd.com>
References: <20240911180604.1834434-1-lizhi.hou@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Different types of BOs are supported:
- shmem
  A user application uses shmem BOs as input/output for its workload
  running on the NPU.
- device memory heap
  The fixed-size buffer dedicated to the device.
- device buffer
  The buffer object allocated from the device memory heap.
- command buffer
  The buffer object created for delivering commands. The command buffer
  object is small and pinned on creation.

New IOCTLs are added: CREATE_BO, GET_BO_INFO, SYNC_BO. SYNC_BO is used
to explicitly flush the CPU cache for BO memory.
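For illustration, a minimal userspace sketch of the new ioctls follows. It
is not part of this patch, and it assumes the UAPI header is installed as
<drm/amdxdna_accel.h> and that the device exposes the usual accel node
(e.g. /dev/accel/accel0); the heap size used here is an arbitrary example.

  #include <fcntl.h>
  #include <sys/ioctl.h>
  #include <drm/amdxdna_accel.h>

  /* Allocate a device heap BO, query it, then flush it to the device. */
  int example(void)
  {
  	struct amdxdna_drm_create_bo create = {
  		.type = AMDXDNA_BO_DEV_HEAP,
  		.size = 48 * 1024 * 1024, /* example size; must not exceed dev_mem_size */
  	};
  	struct amdxdna_drm_get_bo_info info = {};
  	struct amdxdna_drm_sync_bo sync = {};
  	int fd;

  	fd = open("/dev/accel/accel0", O_RDWR);
  	if (fd < 0)
  		return -1;

  	/* flags and vaddr must be zero, size must be non-zero */
  	if (ioctl(fd, DRM_IOCTL_AMDXDNA_CREATE_BO, &create))
  		return -1;

  	/* map_offset is the fake offset to pass to mmap() for shmem-backed BOs */
  	info.handle = create.handle;
  	if (ioctl(fd, DRM_IOCTL_AMDXDNA_GET_BO_INFO, &info))
  		return -1;

  	/* the NPU is not cache coherent, so flush the CPU cache explicitly */
  	sync.handle = create.handle;
  	sync.direction = SYNC_DIRECT_TO_DEVICE;
  	sync.offset = 0;
  	sync.size = create.size;
  	return ioctl(fd, DRM_IOCTL_AMDXDNA_SYNC_BO, &sync);
  }

Note that, as implemented in this patch, SYNC_BO flushes the whole BO with
drm_clflush_pages() regardless of the offset/size passed in.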
Co-developed-by: Min Ma Signed-off-by: Min Ma Signed-off-by: Lizhi Hou --- drivers/accel/amdxdna/Makefile | 1 + drivers/accel/amdxdna/aie2_ctx.c | 85 +++- drivers/accel/amdxdna/aie2_message.c | 79 +++ drivers/accel/amdxdna/aie2_pci.h | 3 + drivers/accel/amdxdna/amdxdna_ctx.h | 10 + drivers/accel/amdxdna/amdxdna_gem.c | 627 ++++++++++++++++++++++++ drivers/accel/amdxdna/amdxdna_gem.h | 65 +++ drivers/accel/amdxdna/amdxdna_pci_drv.c | 12 + drivers/accel/amdxdna/amdxdna_pci_drv.h | 6 + include/uapi/drm/amdxdna_accel.h | 81 +++ 10 files changed, 968 insertions(+), 1 deletion(-) create mode 100644 drivers/accel/amdxdna/amdxdna_gem.c create mode 100644 drivers/accel/amdxdna/amdxdna_gem.h diff --git a/drivers/accel/amdxdna/Makefile b/drivers/accel/amdxdna/Makefile index c86c90dfd303..a688c378761f 100644 --- a/drivers/accel/amdxdna/Makefile +++ b/drivers/accel/amdxdna/Makefile @@ -8,6 +8,7 @@ amdxdna-y :=3D \ aie2_smu.o \ aie2_solver.o \ amdxdna_ctx.o \ + amdxdna_gem.o \ amdxdna_mailbox.o \ amdxdna_mailbox_helper.o \ amdxdna_pci_drv.o \ diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_= ctx.c index 52a71661f887..7904ccd36518 100644 --- a/drivers/accel/amdxdna/aie2_ctx.c +++ b/drivers/accel/amdxdna/aie2_ctx.c @@ -5,6 +5,8 @@ =20 #include #include +#include +#include #include #include =20 @@ -13,6 +15,7 @@ #include "amdxdna_pci_drv.h" #include "aie2_pci.h" #include "aie2_solver.h" +#include "amdxdna_gem.h" =20 static int aie2_hwctx_col_list(struct amdxdna_hwctx *hwctx) { @@ -128,6 +131,7 @@ int aie2_hwctx_init(struct amdxdna_hwctx *hwctx) struct amdxdna_client *client =3D hwctx->client; struct amdxdna_dev *xdna =3D client->xdna; struct amdxdna_hwctx_priv *priv; + struct amdxdna_gem_obj *heap; int ret; =20 priv =3D kzalloc(sizeof(*hwctx->priv), GFP_KERNEL); @@ -135,10 +139,28 @@ int aie2_hwctx_init(struct amdxdna_hwctx *hwctx) return -ENOMEM; hwctx->priv =3D priv; =20 + mutex_lock(&client->mm_lock); + heap =3D client->dev_heap; + if (!heap) { + XDNA_ERR(xdna, "The client dev heap object not exist"); + mutex_unlock(&client->mm_lock); + ret =3D -ENOENT; + goto free_priv; + } + drm_gem_object_get(to_gobj(heap)); + mutex_unlock(&client->mm_lock); + priv->heap =3D heap; + + ret =3D amdxdna_gem_pin(heap); + if (ret) { + XDNA_ERR(xdna, "Dev heap pin failed, ret %d", ret); + goto put_heap; + } + ret =3D aie2_hwctx_col_list(hwctx); if (ret) { XDNA_ERR(xdna, "Create col list failed, ret %d", ret); - goto free_priv; + goto unpin; } =20 ret =3D aie2_alloc_resource(hwctx); @@ -147,14 +169,26 @@ int aie2_hwctx_init(struct amdxdna_hwctx *hwctx) goto free_col_list; } =20 + ret =3D aie2_map_host_buf(xdna->dev_handle, hwctx->fw_ctx_id, + heap->mem.userptr, heap->mem.size); + if (ret) { + XDNA_ERR(xdna, "Map host buffer failed, ret %d", ret); + goto release_resource; + } hwctx->status =3D HWCTX_STAT_INIT; =20 XDNA_DBG(xdna, "hwctx %s init completed", hwctx->name); =20 return 0; =20 +release_resource: + aie2_release_resource(hwctx); free_col_list: kfree(hwctx->col_list); +unpin: + amdxdna_gem_unpin(heap); +put_heap: + drm_gem_object_put(to_gobj(heap)); free_priv: kfree(priv); return ret; @@ -164,11 +198,59 @@ void aie2_hwctx_fini(struct amdxdna_hwctx *hwctx) { aie2_release_resource(hwctx); =20 + amdxdna_gem_unpin(hwctx->priv->heap); + drm_gem_object_put(to_gobj(hwctx->priv->heap)); + kfree(hwctx->col_list); kfree(hwctx->priv); kfree(hwctx->cus); } =20 +static int aie2_hwctx_cu_config(struct amdxdna_hwctx *hwctx, void *buf, u3= 2 size) +{ + struct amdxdna_hwctx_param_config_cu *config =3D buf; + 
struct amdxdna_dev *xdna =3D hwctx->client->xdna; + u32 total_size; + int ret; + + XDNA_DBG(xdna, "Config %d CU to %s", config->num_cus, hwctx->name); + if (hwctx->status !=3D HWCTX_STAT_INIT) { + XDNA_ERR(xdna, "Not support re-config CU"); + return -EINVAL; + } + + if (!config->num_cus) { + XDNA_ERR(xdna, "Number of CU is zero"); + return -EINVAL; + } + + total_size =3D struct_size(config, cu_configs, config->num_cus); + if (total_size > size) { + XDNA_ERR(xdna, "CU config larger than size"); + return -EINVAL; + } + + hwctx->cus =3D kmemdup(config, total_size, GFP_KERNEL); + if (!hwctx->cus) + return -ENOMEM; + + ret =3D aie2_config_cu(hwctx); + if (ret) { + XDNA_ERR(xdna, "Configu CU to firmware failed, ret %d", ret); + goto free_cus; + } + + wmb(); /* To avoid locking in command submit when check status */ + hwctx->status =3D HWCTX_STAT_READY; + + return 0; + +free_cus: + kfree(hwctx->cus); + hwctx->cus =3D NULL; + return ret; +} + int aie2_hwctx_config(struct amdxdna_hwctx *hwctx, u32 type, u64 value, vo= id *buf, u32 size) { struct amdxdna_dev *xdna =3D hwctx->client->xdna; @@ -176,6 +258,7 @@ int aie2_hwctx_config(struct amdxdna_hwctx *hwctx, u32 = type, u64 value, void *bu drm_WARN_ON(&xdna->ddev, !mutex_is_locked(&xdna->dev_lock)); switch (type) { case DRM_AMDXDNA_HWCTX_CONFIG_CU: + return aie2_hwctx_cu_config(hwctx, buf, size); case DRM_AMDXDNA_HWCTX_ASSIGN_DBG_BUF: case DRM_AMDXDNA_HWCTX_REMOVE_DBG_BUF: return -EOPNOTSUPP; diff --git a/drivers/accel/amdxdna/aie2_message.c b/drivers/accel/amdxdna/a= ie2_message.c index 83a749e10e4f..75332b049357 100644 --- a/drivers/accel/amdxdna/aie2_message.c +++ b/drivers/accel/amdxdna/aie2_message.c @@ -5,12 +5,15 @@ =20 #include #include +#include +#include #include #include #include #include =20 #include "amdxdna_ctx.h" +#include "amdxdna_gem.h" #include "amdxdna_mailbox.h" #include "amdxdna_mailbox_helper.h" #include "amdxdna_pci_drv.h" @@ -282,3 +285,79 @@ int aie2_destroy_context(struct amdxdna_dev_hdl *ndev,= struct amdxdna_hwctx *hwc =20 return ret; } + +int aie2_map_host_buf(struct amdxdna_dev_hdl *ndev, u32 context_id, u64 ad= dr, u64 size) +{ + DECLARE_AIE2_MSG(map_host_buffer, MSG_OP_MAP_HOST_BUFFER); + struct amdxdna_dev *xdna =3D ndev->xdna; + int ret; + + req.context_id =3D context_id; + req.buf_addr =3D addr; + req.buf_size =3D size; + ret =3D aie2_send_mgmt_msg_wait(ndev, &msg); + if (ret) + return ret; + + XDNA_DBG(xdna, "fw ctx %d map host buf addr 0x%llx size 0x%llx", + context_id, addr, size); + + return 0; +} + +int aie2_config_cu(struct amdxdna_hwctx *hwctx) +{ + struct mailbox_channel *chann =3D hwctx->priv->mbox_chann; + struct amdxdna_dev *xdna =3D hwctx->client->xdna; + u32 shift =3D xdna->dev_info->dev_mem_buf_shift; + DECLARE_AIE2_MSG(config_cu, MSG_OP_CONFIG_CU); + struct drm_gem_object *gobj; + struct amdxdna_gem_obj *abo; + int ret, i; + + if (!chann) + return -ENODEV; + + if (hwctx->cus->num_cus > MAX_NUM_CUS) { + XDNA_DBG(xdna, "Exceed maximum CU %d", MAX_NUM_CUS); + return -EINVAL; + } + + for (i =3D 0; i < hwctx->cus->num_cus; i++) { + struct amdxdna_cu_config *cu =3D &hwctx->cus->cu_configs[i]; + + gobj =3D drm_gem_object_lookup(hwctx->client->filp, cu->cu_bo); + if (!gobj) { + XDNA_ERR(xdna, "Lookup GEM object failed"); + return -EINVAL; + } + abo =3D to_xdna_obj(gobj); + + if (abo->type !=3D AMDXDNA_BO_DEV) { + drm_gem_object_put(gobj); + XDNA_ERR(xdna, "Invalid BO type"); + return -EINVAL; + } + + req.cfgs[i] =3D FIELD_PREP(AIE2_MSG_CFG_CU_PDI_ADDR, + abo->mem.dev_addr >> shift); + req.cfgs[i] |=3D 
FIELD_PREP(AIE2_MSG_CFG_CU_FUNC, cu->cu_func); + XDNA_DBG(xdna, "CU %d full addr 0x%llx, cfg 0x%x", i, + abo->mem.dev_addr, req.cfgs[i]); + drm_gem_object_put(gobj); + } + req.num_cus =3D hwctx->cus->num_cus; + + ret =3D xdna_send_msg_wait(xdna, chann, &msg); + if (ret =3D=3D -ETIME) + aie2_destroy_context(xdna->dev_handle, hwctx); + + if (resp.status =3D=3D AIE2_STATUS_SUCCESS) { + XDNA_DBG(xdna, "Configure %d CUs, ret %d", req.num_cus, ret); + return 0; + } + + XDNA_ERR(xdna, "Command opcode 0x%x failed, status 0x%x ret %d", + msg.opcode, resp.status, ret); + return ret; +} diff --git a/drivers/accel/amdxdna/aie2_pci.h b/drivers/accel/amdxdna/aie2_= pci.h index a5354e84d533..aa0a1940a545 100644 --- a/drivers/accel/amdxdna/aie2_pci.h +++ b/drivers/accel/amdxdna/aie2_pci.h @@ -116,6 +116,7 @@ struct rt_config { }; =20 struct amdxdna_hwctx_priv { + struct amdxdna_gem_obj *heap; void *mbox_chann; }; =20 @@ -193,6 +194,8 @@ int aie2_query_firmware_version(struct amdxdna_dev_hdl = *ndev, struct amdxdna_fw_ver *fw_ver); int aie2_create_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwctx= *hwctx); int aie2_destroy_context(struct amdxdna_dev_hdl *ndev, struct amdxdna_hwct= x *hwctx); +int aie2_map_host_buf(struct amdxdna_dev_hdl *ndev, u32 context_id, u64 ad= dr, u64 size); +int aie2_config_cu(struct amdxdna_hwctx *hwctx); =20 /* aie2_hwctx.c */ int aie2_hwctx_init(struct amdxdna_hwctx *hwctx); diff --git a/drivers/accel/amdxdna/amdxdna_ctx.h b/drivers/accel/amdxdna/am= dxdna_ctx.h index 3627770e5a98..ee647b98218e 100644 --- a/drivers/accel/amdxdna/amdxdna_ctx.h +++ b/drivers/accel/amdxdna/amdxdna_ctx.h @@ -6,6 +6,16 @@ #ifndef _AMDXDNA_CTX_H_ #define _AMDXDNA_CTX_H_ =20 +/* Exec buffer command header format */ +#define AMDXDNA_CMD_STATE GENMASK(3, 0) +#define AMDXDNA_CMD_EXTRA_CU_MASK GENMASK(11, 10) +#define AMDXDNA_CMD_COUNT GENMASK(22, 12) +#define AMDXDNA_CMD_OPCODE GENMASK(27, 23) +struct amdxdna_cmd { + u32 header; + u32 data[] __counted_by(count); +}; + struct amdxdna_hwctx { struct amdxdna_client *client; struct amdxdna_hwctx_priv *priv; diff --git a/drivers/accel/amdxdna/amdxdna_gem.c b/drivers/accel/amdxdna/am= dxdna_gem.c new file mode 100644 index 000000000000..9526ff390a44 --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_gem.c @@ -0,0 +1,627 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. 
+ */ + +#include +#include +#include +#include +#include +#include +#include + +#include "amdxdna_ctx.h" +#include "amdxdna_gem.h" +#include "amdxdna_pci_drv.h" + +#define XDNA_MAX_CMD_BO_SIZE SZ_32K + +static int +amdxdna_gem_insert_node_locked(struct amdxdna_gem_obj *abo, bool use_vmap) +{ + struct amdxdna_client *client =3D abo->client; + struct amdxdna_dev *xdna =3D client->xdna; + struct amdxdna_mem *mem =3D &abo->mem; + u64 offset; + u32 align; + int ret; + + align =3D 1 << max(PAGE_SHIFT, xdna->dev_info->dev_mem_buf_shift); + ret =3D drm_mm_insert_node_generic(&abo->dev_heap->mm, &abo->mm_node, + mem->size, align, + 0, DRM_MM_INSERT_BEST); + if (ret) { + XDNA_ERR(xdna, "Failed to alloc dev bo memory, ret %d", ret); + return ret; + } + + mem->dev_addr =3D abo->mm_node.start; + offset =3D mem->dev_addr - abo->dev_heap->mem.dev_addr; + mem->userptr =3D abo->dev_heap->mem.userptr + offset; + mem->pages =3D &abo->dev_heap->base.pages[offset >> PAGE_SHIFT]; + mem->nr_pages =3D mem->size >> PAGE_SHIFT; + + if (use_vmap) { + mem->kva =3D vmap(mem->pages, mem->nr_pages, VM_MAP, PAGE_KERNEL); + if (!mem->kva) { + XDNA_ERR(xdna, "Failed to vmap"); + drm_mm_remove_node(&abo->mm_node); + return -EFAULT; + } + } + + return 0; +} + +static void amdxdna_gem_obj_free(struct drm_gem_object *gobj) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(gobj->dev); + struct amdxdna_gem_obj *abo =3D to_xdna_obj(gobj); + struct iosys_map map =3D IOSYS_MAP_INIT_VADDR(abo->mem.kva); + + XDNA_DBG(xdna, "BO type %d xdna_addr 0x%llx", abo->type, abo->mem.dev_add= r); + if (abo->pinned) + amdxdna_gem_unpin(abo); + + if (abo->type =3D=3D AMDXDNA_BO_DEV) { + mutex_lock(&abo->client->mm_lock); + drm_mm_remove_node(&abo->mm_node); + mutex_unlock(&abo->client->mm_lock); + + vunmap(abo->mem.kva); + drm_gem_object_put(to_gobj(abo->dev_heap)); + drm_gem_object_release(gobj); + mutex_destroy(&abo->lock); + kfree(abo); + return; + } + + if (abo->type =3D=3D AMDXDNA_BO_DEV_HEAP) + drm_mm_takedown(&abo->mm); + + drm_gem_vunmap_unlocked(gobj, &map); + mutex_destroy(&abo->lock); + drm_gem_shmem_free(&abo->base); +} + +static const struct drm_gem_object_funcs amdxdna_gem_dev_obj_funcs =3D { + .free =3D amdxdna_gem_obj_free, +}; + +static bool amdxdna_hmm_invalidate(struct mmu_interval_notifier *mni, + const struct mmu_notifier_range *range, + unsigned long cur_seq) +{ + struct amdxdna_gem_obj *abo =3D container_of(mni, struct amdxdna_gem_obj, + mem.notifier); + struct amdxdna_dev *xdna =3D to_xdna_dev(to_gobj(abo)->dev); + + XDNA_DBG(xdna, "Invalid range 0x%llx, 0x%lx, type %d", + abo->mem.userptr, abo->mem.size, abo->type); + + if (!mmu_notifier_range_blockable(range)) + return false; + + xdna->dev_info->ops->hmm_invalidate(abo, cur_seq); + + return true; +} + +static const struct mmu_interval_notifier_ops amdxdna_hmm_ops =3D { + .invalidate =3D amdxdna_hmm_invalidate, +}; + +static void amdxdna_hmm_unregister(struct amdxdna_gem_obj *abo) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(to_gobj(abo)->dev); + + if (!xdna->dev_info->ops->hmm_invalidate) + return; + + mmu_interval_notifier_remove(&abo->mem.notifier); + kvfree(abo->mem.pfns); + abo->mem.pfns =3D NULL; +} + +static int amdxdna_hmm_register(struct amdxdna_gem_obj *abo, unsigned long= addr, + size_t len) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(to_gobj(abo)->dev); + u32 nr_pages; + int ret; + + if (!xdna->dev_info->ops->hmm_invalidate) + return 0; + + if (abo->mem.pfns) + return -EEXIST; + + nr_pages =3D (PAGE_ALIGN(addr + len) - (addr & PAGE_MASK)) >> PAGE_SHIFT; + 
abo->mem.pfns =3D kvcalloc(nr_pages, sizeof(*abo->mem.pfns), + GFP_KERNEL); + if (!abo->mem.pfns) + return -ENOMEM; + + ret =3D mmu_interval_notifier_insert_locked(&abo->mem.notifier, + current->mm, + addr, + len, + &amdxdna_hmm_ops); + if (ret) { + XDNA_ERR(xdna, "Insert mmu notifier failed, ret %d", ret); + kvfree(abo->mem.pfns); + } + abo->mem.userptr =3D addr; + + return ret; +} + +static int amdxdna_gem_obj_mmap(struct drm_gem_object *gobj, + struct vm_area_struct *vma) +{ + struct amdxdna_gem_obj *abo =3D to_xdna_obj(gobj); + unsigned long num_pages; + int ret; + + ret =3D amdxdna_hmm_register(abo, vma->vm_start, gobj->size); + if (ret) + return ret; + + ret =3D drm_gem_shmem_mmap(&abo->base, vma); + if (ret) + goto hmm_unreg; + + num_pages =3D gobj->size >> PAGE_SHIFT; + /* Try to insert the pages */ + vm_flags_mod(vma, VM_MIXEDMAP, VM_PFNMAP); + ret =3D vm_insert_pages(vma, vma->vm_start, abo->base.pages, &num_pages); + if (ret) + XDNA_ERR(abo->client->xdna, "Failed insert pages, ret %d", ret); + + return 0; + +hmm_unreg: + amdxdna_hmm_unregister(abo); + return ret; +} + +static vm_fault_t amdxdna_gem_vm_fault(struct vm_fault *vmf) +{ + return drm_gem_shmem_vm_ops.fault(vmf); +} + +static void amdxdna_gem_vm_open(struct vm_area_struct *vma) +{ + drm_gem_shmem_vm_ops.open(vma); +} + +static void amdxdna_gem_vm_close(struct vm_area_struct *vma) +{ + struct drm_gem_object *gobj =3D vma->vm_private_data; + + amdxdna_hmm_unregister(to_xdna_obj(gobj)); + drm_gem_shmem_vm_ops.close(vma); +} + +static const struct vm_operations_struct amdxdna_gem_vm_ops =3D { + .fault =3D amdxdna_gem_vm_fault, + .open =3D amdxdna_gem_vm_open, + .close =3D amdxdna_gem_vm_close, +}; + +static const struct drm_gem_object_funcs amdxdna_gem_shmem_funcs =3D { + .free =3D amdxdna_gem_obj_free, + .print_info =3D drm_gem_shmem_object_print_info, + .pin =3D drm_gem_shmem_object_pin, + .unpin =3D drm_gem_shmem_object_unpin, + .get_sg_table =3D drm_gem_shmem_object_get_sg_table, + .vmap =3D drm_gem_shmem_object_vmap, + .vunmap =3D drm_gem_shmem_object_vunmap, + .mmap =3D amdxdna_gem_obj_mmap, + .vm_ops =3D &amdxdna_gem_vm_ops, +}; + +static struct amdxdna_gem_obj * +amdxdna_gem_create_obj(struct drm_device *dev, size_t size) +{ + struct amdxdna_gem_obj *abo; + + abo =3D kzalloc(sizeof(*abo), GFP_KERNEL); + if (!abo) + return ERR_PTR(-ENOMEM); + + abo->pinned =3D false; + abo->assigned_hwctx =3D AMDXDNA_INVALID_CTX_HANDLE; + mutex_init(&abo->lock); + + abo->mem.userptr =3D AMDXDNA_INVALID_ADDR; + abo->mem.dev_addr =3D AMDXDNA_INVALID_ADDR; + abo->mem.size =3D size; + + return abo; +} + +/* For drm_driver->gem_create_object callback */ +struct drm_gem_object * +amdxdna_gem_create_object_cb(struct drm_device *dev, size_t size) +{ + struct amdxdna_gem_obj *abo; + + abo =3D amdxdna_gem_create_obj(dev, size); + if (IS_ERR(abo)) + return ERR_CAST(abo); + + to_gobj(abo)->funcs =3D &amdxdna_gem_shmem_funcs; + + return to_gobj(abo); +} + +static struct amdxdna_gem_obj * +amdxdna_drm_alloc_shmem(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp) +{ + struct amdxdna_client *client =3D filp->driver_priv; + struct drm_gem_shmem_object *shmem; + struct amdxdna_gem_obj *abo; + + shmem =3D drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem)) + return ERR_CAST(shmem); + + shmem->map_wc =3D false; + + abo =3D to_xdna_obj(&shmem->base); + abo->client =3D client; + abo->type =3D AMDXDNA_BO_SHMEM; + + return abo; +} + +static struct amdxdna_gem_obj * +amdxdna_drm_create_dev_heap(struct drm_device 
*dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp) +{ + struct amdxdna_client *client =3D filp->driver_priv; + struct amdxdna_dev *xdna =3D to_xdna_dev(dev); + struct drm_gem_shmem_object *shmem; + struct amdxdna_gem_obj *abo; + int ret; + + if (args->size > xdna->dev_info->dev_mem_size) { + XDNA_DBG(xdna, "Invalid dev heap size 0x%llx, limit 0x%lx", + args->size, xdna->dev_info->dev_mem_size); + return ERR_PTR(-EINVAL); + } + + mutex_lock(&client->mm_lock); + if (client->dev_heap) { + XDNA_DBG(client->xdna, "dev heap is already created"); + ret =3D -EBUSY; + goto mm_unlock; + } + + shmem =3D drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem)) { + ret =3D PTR_ERR(shmem); + goto mm_unlock; + } + + shmem->map_wc =3D false; + abo =3D to_xdna_obj(&shmem->base); + + abo->type =3D AMDXDNA_BO_DEV_HEAP; + abo->client =3D client; + abo->mem.dev_addr =3D client->xdna->dev_info->dev_mem_base; + drm_mm_init(&abo->mm, abo->mem.dev_addr, abo->mem.size); + + client->dev_heap =3D abo; + drm_gem_object_get(to_gobj(abo)); + mutex_unlock(&client->mm_lock); + + return abo; + +mm_unlock: + mutex_unlock(&client->mm_lock); + return ERR_PTR(ret); +} + +struct amdxdna_gem_obj * +amdxdna_drm_alloc_dev_bo(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp, bool use_vmap) +{ + struct amdxdna_client *client =3D filp->driver_priv; + struct amdxdna_dev *xdna =3D to_xdna_dev(dev); + size_t aligned_sz =3D PAGE_ALIGN(args->size); + struct amdxdna_gem_obj *abo, *heap; + int ret; + + mutex_lock(&client->mm_lock); + heap =3D client->dev_heap; + if (!heap) { + ret =3D -EINVAL; + goto mm_unlock; + } + + if (heap->mem.userptr =3D=3D AMDXDNA_INVALID_ADDR) { + XDNA_ERR(xdna, "Invalid dev heap userptr"); + ret =3D -EINVAL; + goto mm_unlock; + } + + if (args->size > heap->mem.size) { + XDNA_ERR(xdna, "Invalid dev bo size 0x%llx, limit 0x%lx", + args->size, heap->mem.size); + ret =3D -EINVAL; + goto mm_unlock; + } + + abo =3D amdxdna_gem_create_obj(&xdna->ddev, aligned_sz); + if (IS_ERR(abo)) { + ret =3D PTR_ERR(abo); + goto mm_unlock; + } + to_gobj(abo)->funcs =3D &amdxdna_gem_dev_obj_funcs; + abo->type =3D AMDXDNA_BO_DEV; + abo->client =3D client; + abo->dev_heap =3D heap; + ret =3D amdxdna_gem_insert_node_locked(abo, use_vmap); + if (ret) { + XDNA_ERR(xdna, "Failed to alloc dev bo memory, ret %d", ret); + goto mm_unlock; + } + + drm_gem_object_get(to_gobj(heap)); + drm_gem_private_object_init(&xdna->ddev, to_gobj(abo), aligned_sz); + + mutex_unlock(&client->mm_lock); + return abo; + +mm_unlock: + mutex_unlock(&client->mm_lock); + return ERR_PTR(ret); +} + +static struct amdxdna_gem_obj * +amdxdna_drm_create_cmd_bo(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(dev); + struct drm_gem_shmem_object *shmem; + struct amdxdna_gem_obj *abo; + struct iosys_map map; + int ret; + + if (args->size > XDNA_MAX_CMD_BO_SIZE) { + XDNA_ERR(xdna, "Command bo size 0x%llx too large", args->size); + return ERR_PTR(-EINVAL); + } + + if (args->size < sizeof(struct amdxdna_cmd)) { + XDNA_DBG(xdna, "Command BO size 0x%llx too small", args->size); + return ERR_PTR(-EINVAL); + } + + shmem =3D drm_gem_shmem_create(dev, args->size); + if (IS_ERR(shmem)) + return ERR_CAST(shmem); + + shmem->map_wc =3D false; + abo =3D to_xdna_obj(&shmem->base); + + abo->type =3D AMDXDNA_BO_CMD; + abo->client =3D filp->driver_priv; + + ret =3D drm_gem_vmap_unlocked(to_gobj(abo), &map); + if (ret) { + XDNA_ERR(xdna, "Vmap cmd bo 
failed, ret %d", ret); + goto release_obj; + } + abo->mem.kva =3D map.vaddr; + + return abo; + +release_obj: + drm_gem_shmem_free(shmem); + return ERR_PTR(ret); +} + +int amdxdna_drm_create_bo_ioctl(struct drm_device *dev, void *data, struct= drm_file *filp) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(dev); + struct amdxdna_drm_create_bo *args =3D data; + struct amdxdna_gem_obj *abo; + int ret; + + if (args->flags || args->vaddr || !args->size) + return -EINVAL; + + XDNA_DBG(xdna, "BO arg type %d vaddr 0x%llx size 0x%llx flags 0x%llx", + args->type, args->vaddr, args->size, args->flags); + switch (args->type) { + case AMDXDNA_BO_SHMEM: + abo =3D amdxdna_drm_alloc_shmem(dev, args, filp); + break; + case AMDXDNA_BO_DEV_HEAP: + abo =3D amdxdna_drm_create_dev_heap(dev, args, filp); + break; + case AMDXDNA_BO_DEV: + abo =3D amdxdna_drm_alloc_dev_bo(dev, args, filp, false); + break; + case AMDXDNA_BO_CMD: + abo =3D amdxdna_drm_create_cmd_bo(dev, args, filp); + break; + default: + return -EINVAL; + } + if (IS_ERR(abo)) + return PTR_ERR(abo); + + /* ready to publish object to userspace */ + ret =3D drm_gem_handle_create(filp, to_gobj(abo), &args->handle); + if (ret) { + XDNA_ERR(xdna, "Create handle failed"); + goto put_obj; + } + + XDNA_DBG(xdna, "BO hdl %d type %d userptr 0x%llx xdna_addr 0x%llx size 0x= %lx", + args->handle, args->type, abo->mem.userptr, + abo->mem.dev_addr, abo->mem.size); +put_obj: + /* Dereference object reference. Handle holds it now. */ + drm_gem_object_put(to_gobj(abo)); + return ret; +} + +int amdxdna_gem_pin_nolock(struct amdxdna_gem_obj *abo) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(to_gobj(abo)->dev); + int ret; + + switch (abo->type) { + case AMDXDNA_BO_SHMEM: + case AMDXDNA_BO_DEV_HEAP: + ret =3D drm_gem_shmem_pin(&abo->base); + break; + case AMDXDNA_BO_DEV: + ret =3D amdxdna_gem_pin(abo->dev_heap); + break; + default: + ret =3D -EOPNOTSUPP; + } + + XDNA_DBG(xdna, "BO type %d ret %d", abo->type, ret); + return ret; +} + +int amdxdna_gem_pin(struct amdxdna_gem_obj *abo) +{ + int ret; + + mutex_lock(&abo->lock); + ret =3D amdxdna_gem_pin_nolock(abo); + mutex_unlock(&abo->lock); + + return ret; +} + +void amdxdna_gem_unpin(struct amdxdna_gem_obj *abo) +{ + mutex_lock(&abo->lock); + XDNA_DBG(abo->client->xdna, "BO type %d", abo->type); + switch (abo->type) { + case AMDXDNA_BO_SHMEM: + case AMDXDNA_BO_DEV_HEAP: + drm_gem_shmem_unpin(&abo->base); + break; + case AMDXDNA_BO_DEV: + amdxdna_gem_unpin(abo->dev_heap); + break; + default: + /* Should never go here */ + WARN_ONCE(1, "Unexpected BO type %d\n", abo->type); + } + mutex_unlock(&abo->lock); +} + +struct amdxdna_gem_obj *amdxdna_gem_get_obj(struct amdxdna_client *client, + u32 bo_hdl, u8 bo_type) +{ + struct amdxdna_dev *xdna =3D client->xdna; + struct amdxdna_gem_obj *abo; + struct drm_gem_object *gobj; + + gobj =3D drm_gem_object_lookup(client->filp, bo_hdl); + if (!gobj) { + XDNA_DBG(xdna, "Can not find bo %d", bo_hdl); + return NULL; + } + + abo =3D to_xdna_obj(gobj); + if (bo_type =3D=3D AMDXDNA_BO_INVALID || abo->type =3D=3D bo_type) + return abo; + + drm_gem_object_put(gobj); + return NULL; +} + +int amdxdna_drm_get_bo_info_ioctl(struct drm_device *dev, void *data, stru= ct drm_file *filp) +{ + struct amdxdna_drm_get_bo_info *args =3D data; + struct amdxdna_dev *xdna =3D to_xdna_dev(dev); + struct amdxdna_gem_obj *abo; + struct drm_gem_object *gobj; + int ret =3D 0; + + if (args->ext_flags) + return -EINVAL; + + gobj =3D drm_gem_object_lookup(filp, args->handle); + if (!gobj) { + XDNA_DBG(xdna, "Lookup 
GEM object %d failed", args->handle); + return -ENOENT; + } + + abo =3D to_xdna_obj(gobj); + args->vaddr =3D abo->mem.userptr; + args->xdna_addr =3D abo->mem.dev_addr; + + if (abo->type !=3D AMDXDNA_BO_DEV) + args->map_offset =3D drm_vma_node_offset_addr(&gobj->vma_node); + else + args->map_offset =3D AMDXDNA_INVALID_ADDR; + + XDNA_DBG(xdna, "BO hdl %d map_offset 0x%llx vaddr 0x%llx xdna_addr 0x%llx= ", + args->handle, args->map_offset, args->vaddr, args->xdna_addr); + + drm_gem_object_put(gobj); + return ret; +} + +/* + * The sync bo ioctl is to make sure the CPU cache is in sync with memory. + * This is required because NPU is not cache coherent device. CPU cache + * flushing/invalidation is expensive so it is best to handle this outside + * of the command submission path. This ioctl allows explicit cache + * flushing/invalidation outside of the critical path. + */ +int amdxdna_drm_sync_bo_ioctl(struct drm_device *dev, + void *data, struct drm_file *filp) +{ + struct amdxdna_dev *xdna =3D to_xdna_dev(dev); + struct amdxdna_drm_sync_bo *args =3D data; + struct amdxdna_gem_obj *abo; + struct drm_gem_object *gobj; + int ret; + + gobj =3D drm_gem_object_lookup(filp, args->handle); + if (!gobj) { + XDNA_ERR(xdna, "Lookup GEM object failed"); + return -ENOENT; + } + abo =3D to_xdna_obj(gobj); + + ret =3D amdxdna_gem_pin(abo); + if (ret) { + XDNA_ERR(xdna, "Pin BO %d failed, ret %d", args->handle, ret); + goto put_obj; + } + + if (abo->type =3D=3D AMDXDNA_BO_DEV) + drm_clflush_pages(abo->mem.pages, abo->mem.nr_pages); + else + drm_clflush_pages(abo->base.pages, gobj->size >> PAGE_SHIFT); + + amdxdna_gem_unpin(abo); + + XDNA_DBG(xdna, "Sync bo %d offset 0x%llx, size 0x%llx\n", + args->handle, args->offset, args->size); + +put_obj: + drm_gem_object_put(gobj); + return ret; +} diff --git a/drivers/accel/amdxdna/amdxdna_gem.h b/drivers/accel/amdxdna/am= dxdna_gem.h new file mode 100644 index 000000000000..8ccc0375dd9d --- /dev/null +++ b/drivers/accel/amdxdna/amdxdna_gem.h @@ -0,0 +1,65 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024, Advanced Micro Devices, Inc. 
+ */ + +#ifndef _AMDXDNA_GEM_H_ +#define _AMDXDNA_GEM_H_ + +struct amdxdna_mem { + u64 userptr; + void *kva; + u64 dev_addr; + size_t size; + struct page **pages; + u32 nr_pages; + struct mmu_interval_notifier notifier; + unsigned long *pfns; + bool map_invalid; +}; + +struct amdxdna_gem_obj { + struct drm_gem_shmem_object base; + struct amdxdna_client *client; + u8 type; + bool pinned; + struct mutex lock; /* Protects: pinned */ + struct amdxdna_mem mem; + + /* Below members is uninitialized when needed */ + struct drm_mm mm; /* For AMDXDNA_BO_DEV_HEAP */ + struct amdxdna_gem_obj *dev_heap; /* For AMDXDNA_BO_DEV */ + struct drm_mm_node mm_node; /* For AMDXDNA_BO_DEV */ + u32 assigned_hwctx; +}; + +#define to_gobj(obj) (&(obj)->base.base) + +static inline struct amdxdna_gem_obj *to_xdna_obj(struct drm_gem_object *g= obj) +{ + return container_of(gobj, struct amdxdna_gem_obj, base.base); +} + +struct amdxdna_gem_obj *amdxdna_gem_get_obj(struct amdxdna_client *client, + u32 bo_hdl, u8 bo_type); +static inline void amdxdna_gem_put_obj(struct amdxdna_gem_obj *abo) +{ + drm_gem_object_put(to_gobj(abo)); +} + +struct drm_gem_object * +amdxdna_gem_create_object_cb(struct drm_device *dev, size_t size); +struct amdxdna_gem_obj * +amdxdna_drm_alloc_dev_bo(struct drm_device *dev, + struct amdxdna_drm_create_bo *args, + struct drm_file *filp, bool use_vmap); + +int amdxdna_gem_pin_nolock(struct amdxdna_gem_obj *abo); +int amdxdna_gem_pin(struct amdxdna_gem_obj *abo); +void amdxdna_gem_unpin(struct amdxdna_gem_obj *abo); + +int amdxdna_drm_create_bo_ioctl(struct drm_device *dev, void *data, struct= drm_file *filp); +int amdxdna_drm_get_bo_info_ioctl(struct drm_device *dev, void *data, stru= ct drm_file *filp); +int amdxdna_drm_sync_bo_ioctl(struct drm_device *dev, void *data, struct d= rm_file *filp); + +#endif /* _AMDXDNA_GEM_H_ */ diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.c b/drivers/accel/amdxdn= a/amdxdna_pci_drv.c index 9ca62901fcf1..4fc6677a7973 100644 --- a/drivers/accel/amdxdna/amdxdna_pci_drv.c +++ b/drivers/accel/amdxdna/amdxdna_pci_drv.c @@ -7,12 +7,14 @@ #include #include #include +#include #include #include #include #include =20 #include "amdxdna_ctx.h" +#include "amdxdna_gem.h" #include "amdxdna_pci_drv.h" =20 /* @@ -63,6 +65,7 @@ static int amdxdna_drm_open(struct drm_device *ddev, stru= ct drm_file *filp) } mutex_init(&client->hwctx_lock); idr_init_base(&client->hwctx_idr, AMDXDNA_INVALID_CTX_HANDLE + 1); + mutex_init(&client->mm_lock); =20 mutex_lock(&xdna->dev_lock); list_add_tail(&client->node, &xdna->client_list); @@ -91,6 +94,9 @@ static void amdxdna_drm_close(struct drm_device *ddev, st= ruct drm_file *filp) =20 idr_destroy(&client->hwctx_idr); mutex_destroy(&client->hwctx_lock); + mutex_destroy(&client->mm_lock); + if (client->dev_heap) + drm_gem_object_put(to_gobj(client->dev_heap)); =20 iommu_sva_unbind_device(client->sva); =20 @@ -123,6 +129,10 @@ static const struct drm_ioctl_desc amdxdna_drm_ioctls[= ] =3D { DRM_IOCTL_DEF_DRV(AMDXDNA_CREATE_HWCTX, amdxdna_drm_create_hwctx_ioctl, 0= ), DRM_IOCTL_DEF_DRV(AMDXDNA_DESTROY_HWCTX, amdxdna_drm_destroy_hwctx_ioctl,= 0), DRM_IOCTL_DEF_DRV(AMDXDNA_CONFIG_HWCTX, amdxdna_drm_config_hwctx_ioctl, 0= ), + /* BO */ + DRM_IOCTL_DEF_DRV(AMDXDNA_CREATE_BO, amdxdna_drm_create_bo_ioctl, 0), + DRM_IOCTL_DEF_DRV(AMDXDNA_GET_BO_INFO, amdxdna_drm_get_bo_info_ioctl, 0), + DRM_IOCTL_DEF_DRV(AMDXDNA_SYNC_BO, amdxdna_drm_sync_bo_ioctl, 0), }; =20 static const struct file_operations amdxdna_fops =3D { @@ -147,6 +157,8 @@ const struct drm_driver 
amdxdna_drm_drv =3D { .postclose =3D amdxdna_drm_close, .ioctls =3D amdxdna_drm_ioctls, .num_ioctls =3D ARRAY_SIZE(amdxdna_drm_ioctls), + + .gem_create_object =3D amdxdna_gem_create_object_cb, }; =20 static const struct amdxdna_dev_info * diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.h b/drivers/accel/amdxdn= a/amdxdna_pci_drv.h index 5ec7fe168406..3dddde4ac12a 100644 --- a/drivers/accel/amdxdna/amdxdna_pci_drv.h +++ b/drivers/accel/amdxdna/amdxdna_pci_drv.h @@ -18,6 +18,7 @@ extern const struct drm_driver amdxdna_drm_drv; =20 struct amdxdna_dev; +struct amdxdna_gem_obj; struct amdxdna_hwctx; =20 /* @@ -29,6 +30,7 @@ struct amdxdna_dev_ops { int (*hwctx_init)(struct amdxdna_hwctx *hwctx); void (*hwctx_fini)(struct amdxdna_hwctx *hwctx); int (*hwctx_config)(struct amdxdna_hwctx *hwctx, u32 type, u64 value, voi= d *buf, u32 size); + void (*hmm_invalidate)(struct amdxdna_gem_obj *abo, unsigned long cur_seq= ); }; =20 /* @@ -89,6 +91,10 @@ struct amdxdna_client { struct idr hwctx_idr; struct amdxdna_dev *xdna; struct drm_file *filp; + + struct mutex mm_lock; /* protect memory related */ + struct amdxdna_gem_obj *dev_heap; + struct iommu_sva *sva; int pasid; }; diff --git a/include/uapi/drm/amdxdna_accel.h b/include/uapi/drm/amdxdna_ac= cel.h index 371a12f36cac..d42f0a0ea794 100644 --- a/include/uapi/drm/amdxdna_accel.h +++ b/include/uapi/drm/amdxdna_accel.h @@ -13,7 +13,9 @@ extern "C" { #endif =20 +#define AMDXDNA_INVALID_ADDR (~0UL) #define AMDXDNA_INVALID_CTX_HANDLE 0 +#define AMDXDNA_INVALID_BO_HANDLE 0 =20 enum amdxdna_device_type { AMDXDNA_DEV_TYPE_UNKNOWN =3D -1, @@ -24,6 +26,9 @@ enum amdxdna_drm_ioctl_id { DRM_AMDXDNA_CREATE_HWCTX, DRM_AMDXDNA_DESTROY_HWCTX, DRM_AMDXDNA_CONFIG_HWCTX, + DRM_AMDXDNA_CREATE_BO, + DRM_AMDXDNA_GET_BO_INFO, + DRM_AMDXDNA_SYNC_BO, }; =20 /** @@ -136,6 +141,70 @@ struct amdxdna_drm_config_hwctx { __u32 pad; }; =20 +enum amdxdna_bo_type { + AMDXDNA_BO_INVALID =3D 0, + AMDXDNA_BO_SHMEM, + AMDXDNA_BO_DEV_HEAP, + AMDXDNA_BO_DEV, + AMDXDNA_BO_CMD, +}; + +/** + * struct amdxdna_drm_create_bo - Create a buffer object. + * @flags: Buffer flags. MBZ. + * @type: Buffer type. + * @pad1: MBZ. + * @vaddr: User VA of buffer if applied. MBZ. + * @size: Size in bytes. + * @handle: Returned DRM buffer object handle. + * @pad2: MBZ. + */ +struct amdxdna_drm_create_bo { + __u64 flags; + __u32 type; + __u32 pad1; + __u64 vaddr; + __u64 size; + __u32 handle; + __u32 pad2; +}; + +/** + * struct amdxdna_drm_get_bo_info - Get buffer object information. + * @ext: MBZ. + * @ext_flags: MBZ. + * @handle: DRM buffer object handle. + * @pad: MBZ. + * @map_offset: Returned DRM fake offset for mmap(). + * @vaddr: Returned user VA of buffer. 0 in case user needs mmap(). + * @xdna_addr: Returned XDNA device virtual address. + */ +struct amdxdna_drm_get_bo_info { + __u64 ext; + __u64 ext_flags; + __u32 handle; + __u32 pad; + __u64 map_offset; + __u64 vaddr; + __u64 xdna_addr; +}; + +/** + * struct amdxdna_drm_sync_bo - Sync buffer object. + * @handle: Buffer object handle. + * @direction: Direction of sync, can be from device or to device. + * @offset: Offset in the buffer to sync. + * @size: Size in bytes. 
+ */ +struct amdxdna_drm_sync_bo { + __u32 handle; +#define SYNC_DIRECT_TO_DEVICE 0U +#define SYNC_DIRECT_FROM_DEVICE 1U + __u32 direction; + __u64 offset; + __u64 size; +}; + #define DRM_IOCTL_AMDXDNA_CREATE_HWCTX \ DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_CREATE_HWCTX, \ struct amdxdna_drm_create_hwctx) @@ -148,6 +217,18 @@ struct amdxdna_drm_config_hwctx { DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_CONFIG_HWCTX, \ struct amdxdna_drm_config_hwctx) =20 +#define DRM_IOCTL_AMDXDNA_CREATE_BO \ + DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_CREATE_BO, \ + struct amdxdna_drm_create_bo) + +#define DRM_IOCTL_AMDXDNA_GET_BO_INFO \ + DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_GET_BO_INFO, \ + struct amdxdna_drm_get_bo_info) + +#define DRM_IOCTL_AMDXDNA_SYNC_BO \ + DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDXDNA_SYNC_BO, \ + struct amdxdna_drm_sync_bo) + #if defined(__cplusplus) } /* extern c end */ #endif --=20 2.34.1