From: Lizhi Hou
To: , ,
CC: Lizhi Hou , , , , ,
Subject: [PATCH V7 04/10] accel/amdxdna: Add hardware resource solver
Date: Thu, 7 Nov 2024 20:34:42 -0800
Message-ID: <20241108043448.449314-5-lizhi.hou@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241108043448.449314-1-lizhi.hou@amd.com>
References: <20241108043448.449314-1-lizhi.hou@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The AI Engine consists of a 2D array of tiles arranged as columns. Provide
the basic column allocation and release functions for the tile columns.

Co-developed-by: Min Ma
Signed-off-by: Min Ma
Reviewed-by: Jeffrey Hugo
Signed-off-by: Lizhi Hou
---
 drivers/accel/amdxdna/Makefile          |   1 +
 drivers/accel/amdxdna/aie2_pci.c        |  23 +-
 drivers/accel/amdxdna/aie2_solver.c     | 330 ++++++++++++++++++++++++
 drivers/accel/amdxdna/aie2_solver.h     | 154 +++++++++++
 drivers/accel/amdxdna/amdxdna_pci_drv.h |   1 +
 5 files changed, 508 insertions(+), 1 deletion(-)
 create mode 100644 drivers/accel/amdxdna/aie2_solver.c
 create mode 100644 drivers/accel/amdxdna/aie2_solver.h

diff --git a/drivers/accel/amdxdna/Makefile b/drivers/accel/amdxdna/Makefile
index 1b4e78b43b44..39d3404fbc8f 100644
--- a/drivers/accel/amdxdna/Makefile
+++ b/drivers/accel/amdxdna/Makefile
@@ -5,6 +5,7 @@ amdxdna-y := \
 	aie2_pci.o \
 	aie2_psp.o \
 	aie2_smu.o \
+	aie2_solver.o \
 	amdxdna_mailbox.o \
 	amdxdna_mailbox_helper.o \
 	amdxdna_pci_drv.o \
diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
index 73dfcbae2ea7..27e92621c05b 100644
--- a/drivers/accel/amdxdna/aie2_pci.c
+++ b/drivers/accel/amdxdna/aie2_pci.c
@@ -14,9 +14,14 @@
 
 #include "aie2_msg_priv.h"
 #include "aie2_pci.h"
+#include "aie2_solver.h"
 #include "amdxdna_mailbox.h"
 #include "amdxdna_pci_drv.h"
 
+int aie2_max_col = XRS_MAX_COL;
+module_param(aie2_max_col, uint, 0600);
+MODULE_PARM_DESC(aie2_max_col, "Maximum number of columns that can be used");
+
 /*
  * The management mailbox channel is allocated by firmware.
  * The related register and ring buffer information is on SRAM BAR.
@@ -307,6 +312,7 @@ static int aie2_init(struct amdxdna_dev *xdna)
 {
 	struct pci_dev *pdev = to_pci_dev(xdna->ddev.dev);
 	void __iomem *tbl[PCI_NUM_RESOURCES] = {0};
+	struct init_config xrs_cfg = { 0 };
 	struct amdxdna_dev_hdl *ndev;
 	struct psp_config psp_conf;
 	const struct firmware *fw;
@@ -402,7 +408,22 @@ static int aie2_init(struct amdxdna_dev *xdna)
 		XDNA_ERR(xdna, "Query firmware failed, ret %d", ret);
 		goto stop_hw;
 	}
-	ndev->total_col = ndev->metadata.cols;
+	ndev->total_col = min(aie2_max_col, ndev->metadata.cols);
+
+	xrs_cfg.clk_list.num_levels = 3;
+	xrs_cfg.clk_list.cu_clk_list[0] = 0;
+	xrs_cfg.clk_list.cu_clk_list[1] = 800;
+	xrs_cfg.clk_list.cu_clk_list[2] = 1000;
+	xrs_cfg.sys_eff_factor = 1;
+	xrs_cfg.ddev = &xdna->ddev;
+	xrs_cfg.total_col = ndev->total_col;
+
+	xdna->xrs_hdl = xrsm_init(&xrs_cfg);
+	if (!xdna->xrs_hdl) {
+		XDNA_ERR(xdna, "Initialize resolver failed");
+		ret = -EINVAL;
+		goto stop_hw;
+	}
 
 	release_firmware(fw);
 	return 0;
diff --git a/drivers/accel/amdxdna/aie2_solver.c b/drivers/accel/amdxdna/aie2_solver.c
new file mode 100644
index 000000000000..a537c66589a4
--- /dev/null
+++ b/drivers/accel/amdxdna/aie2_solver.c
@@ -0,0 +1,330 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022-2024, Advanced Micro Devices, Inc.
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+#include "aie2_solver.h"
+
+struct partition_node {
+	struct list_head	list;
+	u32			nshared;	/* # shared requests */
+	u32			start_col;	/* start column */
+	u32			ncols;		/* # columns */
+	bool			exclusive;	/* cannot be shared if set */
+};
+
+struct solver_node {
+	struct list_head	list;
+	u64			rid;		/* Request ID from consumer */
+
+	struct partition_node	*pt_node;
+	void			*cb_arg;
+	u32			cols_len;
+	u32			start_cols[] __counted_by(cols_len);
+};
+
+struct solver_rgroup {
+	u32				rgid;
+	u32				nnode;
+	u32				npartition_node;
+
+	DECLARE_BITMAP(resbit, XRS_MAX_COL);
+	struct list_head		node_list;
+	struct list_head		pt_node_list;
+};
+
+struct solver_state {
+	struct solver_rgroup		rgp;
+	struct init_config		cfg;
+	struct xrs_action_ops		*actions;
+};
+
+static u32 calculate_gops(struct aie_qos *rqos)
+{
+	u32 service_rate = 0;
+
+	if (rqos->latency)
+		service_rate = (1000 / rqos->latency);
+
+	if (rqos->fps > service_rate)
+		return rqos->fps * rqos->gops;
+
+	return service_rate * rqos->gops;
+}
+
+/*
+ * qos_meet() - Check whether the QoS request can be met.
+ */
+static int qos_meet(struct solver_state *xrs, struct aie_qos *rqos, u32 cgops)
+{
+	u32 request_gops = calculate_gops(rqos) * xrs->cfg.sys_eff_factor;
+
+	if (request_gops <= cgops)
+		return 0;
+
+	return -EINVAL;
+}
+
+/*
+ * sanity_check() - Do a basic sanity check on allocation request.
+ */
+static int sanity_check(struct solver_state *xrs, struct alloc_requests *req)
+{
+	struct cdo_parts *cdop = &req->cdo;
+	struct aie_qos *rqos = &req->rqos;
+	u32 cu_clk_freq;
+
+	if (cdop->ncols > xrs->cfg.total_col)
+		return -EINVAL;
+
+	/*
+	 * Check that at least one CDO group can meet the
+	 * GOPs requirement.
+	 */
+	cu_clk_freq = xrs->cfg.clk_list.cu_clk_list[xrs->cfg.clk_list.num_levels - 1];
+
+	if (qos_meet(xrs, rqos, cdop->qos_cap.opc * cu_clk_freq / 1000))
+		return -EINVAL;
+
+	return 0;
+}
+
+static struct solver_node *rg_search_node(struct solver_rgroup *rgp, u64 rid)
+{
+	struct solver_node *node;
+
+	list_for_each_entry(node, &rgp->node_list, list) {
+		if (node->rid == rid)
+			return node;
+	}
+
+	return NULL;
+}
+
+static void remove_partition_node(struct solver_rgroup *rgp,
+				  struct partition_node *pt_node)
+{
+	pt_node->nshared--;
+	if (pt_node->nshared > 0)
+		return;
+
+	list_del(&pt_node->list);
+	rgp->npartition_node--;
+
+	bitmap_clear(rgp->resbit, pt_node->start_col, pt_node->ncols);
+	kfree(pt_node);
+}
+
+static void remove_solver_node(struct solver_rgroup *rgp,
+			       struct solver_node *node)
+{
+	list_del(&node->list);
+	rgp->nnode--;
+
+	if (node->pt_node)
+		remove_partition_node(rgp, node->pt_node);
+
+	kfree(node);
+}
+
+static int get_free_partition(struct solver_state *xrs,
+			      struct solver_node *snode,
+			      struct alloc_requests *req)
+{
+	struct partition_node *pt_node;
+	u32 ncols = req->cdo.ncols;
+	u32 col, i;
+
+	for (i = 0; i < snode->cols_len; i++) {
+		col = snode->start_cols[i];
+		if (find_next_bit(xrs->rgp.resbit, XRS_MAX_COL, col) >= col + ncols)
+			break;
+	}
+
+	if (i == snode->cols_len)
+		return -ENODEV;
+
+	pt_node = kzalloc(sizeof(*pt_node), GFP_KERNEL);
+	if (!pt_node)
+		return -ENOMEM;
+
+	pt_node->nshared = 1;
+	pt_node->start_col = col;
+	pt_node->ncols = ncols;
+
+	/*
+	 * Until latency is fully supported in QoS, a request that
+	 * specifies a non-zero latency value will not share
+	 * the partition with other requests.
+	 */
+	if (req->rqos.latency)
+		pt_node->exclusive = true;
+
+	list_add_tail(&pt_node->list, &xrs->rgp.pt_node_list);
+	xrs->rgp.npartition_node++;
+	bitmap_set(xrs->rgp.resbit, pt_node->start_col, pt_node->ncols);
+
+	snode->pt_node = pt_node;
+
+	return 0;
+}
+
+static int allocate_partition(struct solver_state *xrs,
+			      struct solver_node *snode,
+			      struct alloc_requests *req)
+{
+	struct partition_node *pt_node, *rpt_node = NULL;
+	int idx, ret;
+
+	ret = get_free_partition(xrs, snode, req);
+	if (!ret)
+		return ret;
+
+	/* try to get a share-able partition */
+	list_for_each_entry(pt_node, &xrs->rgp.pt_node_list, list) {
+		if (pt_node->exclusive)
+			continue;
+
+		if (rpt_node && pt_node->nshared >= rpt_node->nshared)
+			continue;
+
+		for (idx = 0; idx < snode->cols_len; idx++) {
+			if (snode->start_cols[idx] != pt_node->start_col)
+				continue;
+
+			if (req->cdo.ncols != pt_node->ncols)
+				continue;
+
+			rpt_node = pt_node;
+			break;
+		}
+	}
+
+	if (!rpt_node)
+		return -ENODEV;
+
+	rpt_node->nshared++;
+	snode->pt_node = rpt_node;
+
+	return 0;
+}
+
+static struct solver_node *create_solver_node(struct solver_state *xrs,
+					      struct alloc_requests *req)
+{
+	struct cdo_parts *cdop = &req->cdo;
+	struct solver_node *node;
+	int ret;
+
+	node = kzalloc(struct_size(node, start_cols, cdop->cols_len), GFP_KERNEL);
+	if (!node)
+		return ERR_PTR(-ENOMEM);
+
+	node->rid = req->rid;
+	node->cols_len = cdop->cols_len;
+	memcpy(node->start_cols, cdop->start_cols, cdop->cols_len * sizeof(u32));
+
+	ret = allocate_partition(xrs, node, req);
+	if (ret)
+		goto free_node;
+
+	list_add_tail(&node->list, &xrs->rgp.node_list);
+	xrs->rgp.nnode++;
+	return node;
+
+free_node:
+	kfree(node);
+	return ERR_PTR(ret);
+}
+
+static void fill_load_action(struct solver_state *xrs,
+			     struct solver_node *snode,
+			     struct xrs_action_load *action)
+{
+	action->rid = snode->rid;
+	action->part.start_col = snode->pt_node->start_col;
+	action->part.ncols = snode->pt_node->ncols;
+}
+
+int xrs_allocate_resource(void *hdl, struct alloc_requests *req, void *cb_arg)
+{
+	struct xrs_action_load load_act;
+	struct solver_node *snode;
+	struct solver_state *xrs;
+	int ret;
+
+	xrs = (struct solver_state *)hdl;
+
+	ret = sanity_check(xrs, req);
+	if (ret) {
+		drm_err(xrs->cfg.ddev, "invalid request");
+		return ret;
+	}
+
+	if (rg_search_node(&xrs->rgp, req->rid)) {
+		drm_err(xrs->cfg.ddev, "rid %llu is in use", req->rid);
+		return -EEXIST;
+	}
+
+	snode = create_solver_node(xrs, req);
+	if (IS_ERR(snode))
+		return PTR_ERR(snode);
+
+	fill_load_action(xrs, snode, &load_act);
+	ret = xrs->cfg.actions->load(cb_arg, &load_act);
+	if (ret)
+		goto free_node;
+
+	snode->cb_arg = cb_arg;
+
+	drm_dbg(xrs->cfg.ddev, "start col %d ncols %d\n",
+		snode->pt_node->start_col, snode->pt_node->ncols);
+
+	return 0;
+
+free_node:
+	remove_solver_node(&xrs->rgp, snode);
+
+	return ret;
+}
+
+int xrs_release_resource(void *hdl, u64 rid)
+{
+	struct solver_state *xrs = hdl;
+	struct solver_node *node;
+
+	node = rg_search_node(&xrs->rgp, rid);
+	if (!node) {
+		drm_err(xrs->cfg.ddev, "node does not exist");
+		return -ENODEV;
+	}
+
+	xrs->cfg.actions->unload(node->cb_arg);
+	remove_solver_node(&xrs->rgp, node);
+
+	return 0;
+}
+
+void *xrsm_init(struct init_config *cfg)
+{
+	struct solver_rgroup *rgp;
+	struct solver_state *xrs;
+
+	xrs = drmm_kzalloc(cfg->ddev, sizeof(*xrs), GFP_KERNEL);
+	if (!xrs)
+		return NULL;
+
+	memcpy(&xrs->cfg, cfg, sizeof(*cfg));
+
+	rgp = &xrs->rgp;
+	INIT_LIST_HEAD(&rgp->node_list);
+	INIT_LIST_HEAD(&rgp->pt_node_list);
+
+	return xrs;
+}
diff --git a/drivers/accel/amdxdna/aie2_solver.h b/drivers/accel/amdxdna/aie2_solver.h
new file mode 100644
index 000000000000..9b1847bb46a6
--- /dev/null
+++ b/drivers/accel/amdxdna/aie2_solver.h
@@ -0,0 +1,154 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023-2024, Advanced Micro Devices, Inc.
+ */
+
+#ifndef _AIE2_SOLVER_H
+#define _AIE2_SOLVER_H
+
+#define XRS_MAX_COL 128
+
+/*
+ * Structure used to describe a partition. A partition is a column-based
+ * allocation unit described by its start column and number of columns.
+ */
+struct aie_part {
+	u32	start_col;
+	u32	ncols;
+};
+
+/*
+ * The QoS capabilities of a given AIE partition.
+ */
+struct aie_qos_cap {
+	u32	opc;		/* operations per cycle */
+	u32	dma_bw;		/* DMA bandwidth */
+};
+
+/*
+ * QoS requirement of a resource allocation.
+ */
+struct aie_qos {
+	u32	gops;		/* Giga operations */
+	u32	fps;		/* Frames per second */
+	u32	dma_bw;		/* DMA bandwidth */
+	u32	latency;	/* Frame response latency */
+	u32	exec_time;	/* Frame execution time */
+	u32	priority;	/* Request priority */
+};
+
+/*
+ * Structure used to describe a relocatable CDO (Configuration Data Object).
+ */
+struct cdo_parts {
+	u32			*start_cols;	/* Start column array */
+	u32			cols_len;	/* Length of start column array */
+	u32			ncols;		/* # of columns */
+	struct aie_qos_cap	qos_cap;	/* CDO QoS capabilities */
+};
+
+/*
+ * Structure used to describe a request to allocate.
+ */
+struct alloc_requests {
+	u64			rid;
+	struct cdo_parts	cdo;
+	struct aie_qos		rqos;		/* Requested QoS */
+};
+
+/*
+ * Load callback argument
+ */
+struct xrs_action_load {
+	u32			rid;
+	struct aie_part		part;
+};
+
+/*
+ * Define the available power levels
+ *
+ * POWER_LEVEL_MIN:
+ *     Lowest power level. Usually set when all actions are unloaded.
+ *
+ * POWER_LEVEL_n
+ *     Each of power levels 0 - n is a step increase in system frequencies
+ */
+enum power_level {
+	POWER_LEVEL_MIN = 0x0,
+	POWER_LEVEL_0   = 0x1,
+	POWER_LEVEL_1   = 0x2,
+	POWER_LEVEL_2   = 0x3,
+	POWER_LEVEL_3   = 0x4,
+	POWER_LEVEL_4   = 0x5,
+	POWER_LEVEL_5   = 0x6,
+	POWER_LEVEL_6   = 0x7,
+	POWER_LEVEL_7   = 0x8,
+	POWER_LEVEL_NUM,
+};
+
+/*
+ * Structure used to describe the frequency table.
+ * Resource solver chooses the frequency from the table
+ * to meet the QoS requirements.
+ */
+struct clk_list_info {
+	u32	num_levels;			/* available power levels */
+	u32	cu_clk_list[POWER_LEVEL_NUM];	/* available aie clock frequencies in MHz */
+};
+
+struct xrs_action_ops {
+	int (*load)(void *cb_arg, struct xrs_action_load *action);
+	int (*unload)(void *cb_arg);
+};
+
+/*
+ * Structure used to describe information for solver during initialization.
+ */
+struct init_config {
+	u32			total_col;
+	u32			sys_eff_factor;	/* system efficiency factor */
+	u32			latency_adj;	/* latency adjustment in ms */
+	struct clk_list_info	clk_list;	/* List of frequencies available in system */
+	struct drm_device	*ddev;
+	struct xrs_action_ops	*actions;
+};
+
+/*
+ * xrsm_init() - Register resource solver. Resource solver client needs
+ *               to call this function to register itself.
+ *
+ * @cfg:	The system metrics for resource solver to use
+ *
+ * Return:	A resource solver handle
+ *
+ * Note: We should only create one handle per AIE array to be managed.
+ */
+void *xrsm_init(struct init_config *cfg);
+
+/*
+ * xrs_allocate_resource() - Request to allocate resources for a given context
+ *                           and a partition metadata. (See struct part_meta)
+ *
+ * @hdl:	Resource solver handle obtained from xrsm_init()
+ * @req:	Input to the Resource solver including request id
+ *		and partition metadata.
+ * @cb_arg:	callback argument pointer
+ *
+ * Return:	0 when successful.
+ *		Or standard error number when failing
+ *
+ * Note:
+ *      There is no lock mechanism inside resource solver. So it is
+ *      the caller's responsibility to lock down XCLBINs and grab
+ *      necessary lock.
+ */
+int xrs_allocate_resource(void *hdl, struct alloc_requests *req, void *cb_arg);
+
+/*
+ * xrs_release_resource() - Request to free resources for a given context.
+ *
+ * @hdl:	Resource solver handle obtained from xrsm_init()
+ * @rid:	The Request ID to identify the requesting context
+ */
+int xrs_release_resource(void *hdl, u64 rid);
+#endif /* _AIE2_SOLVER_H */
diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.h b/drivers/accel/amdxdna/amdxdna_pci_drv.h
index 64bce970514b..c0710d3130fd 100644
--- a/drivers/accel/amdxdna/amdxdna_pci_drv.h
+++ b/drivers/accel/amdxdna/amdxdna_pci_drv.h
@@ -58,6 +58,7 @@ struct amdxdna_dev {
 	struct drm_device		ddev;
 	struct amdxdna_dev_hdl		*dev_handle;
 	const struct amdxdna_dev_info	*dev_info;
+	void				*xrs_hdl;
 
 	struct mutex			dev_lock; /* per device lock */
 	struct amdxdna_fw_ver		fw_ver;
-- 
2.34.1
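
For readers who want to try the column search outside the kernel, the free-partition scan in get_free_partition() can be approximated in user space. This is only a sketch under stated assumptions, not the driver code: a plain bool array stands in for the kernel bitmap, and the names SKETCH_MAX_COL, struct col_map, next_busy(), col_alloc() and col_release() are hypothetical and do not appear in the patch. Partition sharing, the exclusive flag and the QoS check are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_COL 128	/* assumption: mirrors XRS_MAX_COL */

/* Hypothetical stand-in for the kernel bitmap: one flag per column. */
struct col_map {
	bool busy[SKETCH_MAX_COL];
};

/*
 * Index of the first busy column at or after 'from', or SKETCH_MAX_COL
 * when none is busy. Same contract as the kernel's find_next_bit().
 */
static uint32_t next_busy(const struct col_map *m, uint32_t from)
{
	uint32_t c;

	for (c = from; c < SKETCH_MAX_COL; c++) {
		if (m->busy[c])
			return c;
	}

	return SKETCH_MAX_COL;
}

/*
 * Walk the candidate start columns in order and claim the first window
 * [col, col + ncols) that contains no busy column.
 * Returns the chosen start column, or -1 when no candidate fits.
 */
static int col_alloc(struct col_map *m, const uint32_t *start_cols,
		     uint32_t cols_len, uint32_t ncols)
{
	uint32_t i, c;

	for (i = 0; i < cols_len; i++) {
		uint32_t col = start_cols[i];

		/* A busy column inside the window rules this candidate out. */
		if (next_busy(m, col) < col + ncols)
			continue;

		/* next_busy() never exceeds SKETCH_MAX_COL, so the window fits. */
		assert(col + ncols <= SKETCH_MAX_COL);
		for (c = col; c < col + ncols; c++)
			m->busy[c] = true;

		return (int)col;
	}

	return -1;
}

/* Give the columns of a window back, like bitmap_clear() in the patch. */
static void col_release(struct col_map *m, uint32_t start, uint32_t ncols)
{
	uint32_t c;

	for (c = start; c < start + ncols && c < SKETCH_MAX_COL; c++)
		m->busy[c] = false;
}
```

With candidate start columns {0, 4, 8} and ncols = 4, three consecutive col_alloc() calls on an empty map pick columns 0, 4 and 8, and a fourth call fails until one window is released. In the patch, that failure case falls through to the search for a share-able partition in allocate_partition().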