From nobody Wed Oct 8 03:53:53 2025 Received: from NAM04-MW2-obe.outbound.protection.outlook.com (mail-mw2nam04on2053.outbound.protection.outlook.com [40.107.101.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6EA692F3C36 for ; Wed, 2 Jul 2025 16:12:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.101.53 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751472773; cv=fail; b=nBqbw3D0l6RgPSFb8LTIdLo48lKyqRSzrDzDJvzPS42vBlcO9+NbzqrinAhdIYmcq7OpvJ1AppD+GoAkLBSfAHczTkaniDuak9rPEOs3X0Inf5F5lNdkqc13Rs5omoMIcTg5pGD9GIjfHgO/MBOsPVr5H6Bb5HDyiiEMfqXkcYE= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751472773; c=relaxed/simple; bh=ltDLSJGNPwzYmSdEmu0oibSK/fSiX0id49x+J7nHeGM=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=T8cUSFr/+AuF5JPLxgcxDJUatAW3Na5AH+RN+OOdeWwfaZheBAHAArUbUZiloNt56to2Fn/EHa8sAJERSOtXflrjStaONsqemzOCAHH4K+vl7fjWuJ8bca6dpRK6pPo+rLWBKIw/s9y37aolfAqYoJFpIB9OdTMN18sEYs1ejfM= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com; spf=fail smtp.mailfrom=amd.com; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b=SIa2e93G; arc=fail smtp.client-ip=40.107.101.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=amd.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b="SIa2e93G" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=wLNVneX8HjZkm1mQ71NsGyQshIp5c9Fl2MGVdSGRGuAxa4oKpoWbe9c889yiA08L3Gp+BR69erx3uOqM1p79mWvFEe2AAIUlj69i1ed8nCO/SuT48idXZLR2XS1e795IC5A5p/YXgrfNh7YQFuJnOCcqu6p5hDufQ8F8Jlc6O+Evx8CibAO4WNTDjdSXntXmAujHh+T9ZTwb/G8TLfDQ5Wn4J1C2yoFkL5bRdwNSNLbkh7If/Shj48X59C2X+A0sjE6MEVdkdOYShvwkehqAOJgmsrBuQIwz7+VRgV7K5tAUULShQp9AlDNW0GcB52TGkKTFCtxUITO/IlKZPwr2NA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=IgDSJVhc+/W5LVX3K/iGY+J3F2lI8LWtPFYZt3Eij9w=; b=XZ0KXBMCezaT4zZj94lZ4oSC2vkjjpDs7kt6Qa58RAJrhgsz6K6WUCpiZB4X8UTq86iD2UuZKuZiWP6pNhtG5W/XjEhXtP4jCfDCAbWYcGl7zmpvnBc4Z6DUgwFVddDTY2UHmqWt3DaKr3CaXiitTuL1VUDrqe2gtjnPt5mo7KzFVOXINt3m8oP5xMv5AlsVZYb5fkyDsDEPVyYQzI4/Ue6l2+mReycJp84m4p2B8kgvm/4w0vf2SF4bI3oM8mDqXC20RmgOVzhS32JgaQs8CqDb5DRlvi98ZcztPTa/xmK6BbkvcGHVBTTC0dWelniLehykGCx89obX4yJb9H/+pA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=gmail.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=IgDSJVhc+/W5LVX3K/iGY+J3F2lI8LWtPFYZt3Eij9w=; b=SIa2e93GDOrnk9dldUIX8HsteYDMkk6TcqKKpzxvcEY8JUXMXCzZ7nKAkSknWWI+iE/ZZ4za4uJgFmWEGfDzrQMKcIkITnW+W4t+NgeoqUipxA5dMp7kmpu79Aq1cK/6AS6cJ+LVyXt9Ua4QDm8B/loUWg+1YJlfYcOBQjE6ff4= Received: from CY5PR15CA0054.namprd15.prod.outlook.com (2603:10b6:930:1b::20) by DM6PR12MB4156.namprd12.prod.outlook.com (2603:10b6:5:218::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8835.30; Wed, 2 Jul 2025 16:12:49 +0000 Received: from 
CY4PEPF0000FCC4.namprd03.prod.outlook.com (2603:10b6:930:1b:cafe::e8) by CY5PR15CA0054.outlook.office365.com (2603:10b6:930:1b::20) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8901.20 via Frontend Transport; Wed, 2 Jul 2025 16:12:49 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CY4PEPF0000FCC4.mail.protection.outlook.com (10.167.242.106) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8901.15 via Frontend Transport; Wed, 2 Jul 2025 16:12:48 +0000 Received: from FRAPPELLOUX01.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Wed, 2 Jul 2025 11:12:45 -0500 From: Pierre-Eric Pelloux-Prayer To: Alex Deucher , =?UTF-8?q?Christian=20K=C3=B6nig?= , David Airlie , Simona Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Matthew Auld , Arunpravin Paneer Selvam CC: Pierre-Eric Pelloux-Prayer , , , Subject: [PATCH v1 1/3] drm/buddy: add a flag to disable trimming of non cleared blocks Date: Wed, 2 Jul 2025 18:12:02 +0200 Message-ID: <20250702161208.25188-2-pierre-eric.pelloux-prayer@amd.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250702161208.25188-1-pierre-eric.pelloux-prayer@amd.com> References: <20250702161208.25188-1-pierre-eric.pelloux-prayer@amd.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: 
SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CY4PEPF0000FCC4:EE_|DM6PR12MB4156:EE_ X-MS-Office365-Filtering-Correlation-Id: ce347b89-2b8b-4449-3184-08ddb98349d1 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|376014|36860700013|1800799024; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?rKZHtIJ6C4FpcmrkQuWuC6W4e4+WkZTzS8HKq3Iq/rcgcCSqVQuLbSl8j5mi?= =?us-ascii?Q?w1XrJBPWhM6azVMIDyhOPnjUFmzARYyrqe4XjfsIBUWQm7gSNtMq/qxmP7xD?= =?us-ascii?Q?/67v5k6LoUihzUa2OliWUHb40bI1sMMDqvzUAMDs+h14CdypC0ma2FYcKFVS?= =?us-ascii?Q?a2aUTHcpoJDNLRvszKUWGbDw3SLaDMzs/laWoyB+/ZZr3bGKn/H6Ja0FR9ep?= =?us-ascii?Q?jECpij0N9Rs5XGeOg/n6FuZj/M5OwlLX7w3s6uSTL86TZmNuqXMW6BAhfiPv?= =?us-ascii?Q?UA1RP4SN/9CIxmMDlji6yYKhd0PohlggnlhJRauDgR2OKc8wkfsbej1puT7f?= =?us-ascii?Q?nw5bpVxvXoEUei6tAPRznoTVz9dFzPSfWDLvgLn9VrKQRNg8Zw4pedDCMtUx?= =?us-ascii?Q?YX16C0ounrK1z2jX61vT4rjM1vU26BFmxRDaTzOmrPWiyB4usI+X2bsOaqg9?= =?us-ascii?Q?uwcWLG3OkpyxC/m3NzYpobbwBskjbZfboooAY4CQ9ILqyTU2YiEKoxkklP8H?= =?us-ascii?Q?fK9uSzqA78efEGuJVdEc4ohKB5CYC+/aZeKKHF89V4BegXiiZWE0zVBbUKpx?= =?us-ascii?Q?mqoJzPaSqHS6i6xZxXoQURRpjNlE+6hph1Lucx7Eo/B8ums1ysq108vClhvR?= =?us-ascii?Q?RW+7UJjAFR9xOb+QFFrhwh2MrYY3yKtBHeYuOroO1R9/pmy/BPIpwcWuMxKH?= =?us-ascii?Q?yRzEMI5j+dvCAIDX69GUNMea3/YHB04Cn/qotIHA/glWhzfh/WW5DfYrVhV0?= =?us-ascii?Q?5vIjKx/pcsSGs2M4Un5YRJFoab9eBOmWD1m05SR1oHhMk/hMCL/ZZhOi3CrU?= =?us-ascii?Q?CVfviAZ2viEErGutQQvtP8Kz9XKoYYUtVRdWmMFg8wSIjm6nMFS938LPUKFT?= =?us-ascii?Q?G/42fipdvxEyVxm9mPE2JTax0hFP9zXdUDrxEMXetrYhpPFMGDgCNb2MWi5q?= =?us-ascii?Q?pSI5cywlc8MzDbyGyrpI6jA82ff/AarGcs4dDEwtE3+9aDu77b5PtsX9WnWR?= =?us-ascii?Q?MmlcwpZTCtKbaFyKKVmwiGTI6nlb6JV2IjMQLxv1Zwj2HdaNV3Fh8FwcyNYc?= =?us-ascii?Q?2g8EsOr05ybckuyshUsI3iVckocOTAV2cEhAoPn2/+pT1Q4c0ZyrdEFbbmA1?= =?us-ascii?Q?l9/YlkNuwFbrWwt+9/p+9N415pnH10dEiz6m/x9VzQesefGglmnR+1qSA/5d?= 
=?us-ascii?Q?KlaQDrxh7rDhaqepP0JAVviP6vBT7JHjADdLKlK/mfP4tLHdnUabGag2G7kb?= =?us-ascii?Q?v/mYlrbebFJtKYQUPivZXckFsRxao+rzsEKwZ6S5RaGyYKIudK+YB9hZLQZ1?= =?us-ascii?Q?PIayoFIubUXeFRAijHEjE+jfGGyKVN6bymfYh0EFDtr1lp9X1ZWJNfbQvCJo?= =?us-ascii?Q?FkqOM8moad5Wue0cXanR3AZZV1GFqwjN7kFPy4Iks425DpJTpSkEKgNtBVfY?= =?us-ascii?Q?jTjniS2ORbQkEQVV9AuoepF5SRlj2sqsfqmliC094W3zRAAnzHbooactg3IE?= =?us-ascii?Q?V4ZA33YTDN89v4GvdFGvJs5yM6IZ65b2WhXn?= X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(82310400026)(376014)(36860700013)(1800799024);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 02 Jul 2025 16:12:48.5779 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: ce347b89-2b8b-4449-3184-08ddb98349d1 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CY4PEPF0000FCC4.namprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR12MB4156 Content-Type: text/plain; charset="utf-8"

A vkcts test case triggers a pattern in which the drm_buddy allocator wastes
a lot of memory and performs badly:

  dEQP-VK.memory.allocation.basic.size_8KiB.reverse.count_4000

For each memory pool type, the test allocates 4000 8KiB objects and then
releases them. The requested alignment is 256KiB.

For each object, the allocator selects a 256KiB block (to match the
alignment) and then trims it to 8KiB, adding many free entries to the
free_lists of orders 5 to 1.
On deallocation, none of these objects is merged with its buddy because
their "clear status" differs: only the block that was handed over to the
driver might come back cleared. Also, since the test doesn't allocate much
memory, the allocator doesn't need to force the merge process, so it repeats
the same logic on each run.

As a result, after the first run (which takes about 6 sec), the free_lists
look like this:

   chunk_size: 4KiB, total: 16368MiB, free: 15354MiB, clear_free: 397MiB
   [...]
   order- 5 free:      1914 MiB, blocks: 15315
   order- 4 free:       957 MiB, blocks: 15325
   order- 3 free:       480 MiB, blocks: 15360
   order- 2 free:       239 MiB, blocks: 15347
   order- 1 free:       238 MiB, blocks: 30489

After the second run (19 sec):

   chunk_size: 4KiB, total: 16368MiB, free: 15374MiB, clear_free: 537MiB
   [...]
   order- 5 free:      3326 MiB, blocks: 26615
   order- 4 free:      1663 MiB, blocks: 26619
   order- 3 free:       833 MiB, blocks: 26659
   order- 2 free:       416 MiB, blocks: 26643
   order- 1 free:       414 MiB, blocks: 53071

list_insert_sorted is part of the problem here, since it iterates over the
free_list to figure out where to insert each new block, which gets slower as
the lists grow.

To fix this while keeping the clear tracking information, a new bit
(DRM_BUDDY_TRIM_IF_CLEAR) is exposed to drivers, allowing them to disable
trimming for blocks that aren't "clear". This bit is used by amdgpu because
it always returns cleared memory to drm_buddy.

With this bit set, the "merge buddies on deallocation" logic works again,
and the free_lists no longer grow indefinitely. After a run we get:

   chunk_size: 4KiB, total: 16368MiB, free: 15306MiB, clear_free: 1734MiB
   [...]
   order- 5 free:         2 MiB, blocks: 17
   order- 4 free:         2 MiB, blocks: 35
   order- 3 free:         1 MiB, blocks: 41
   order- 2 free:       656 KiB, blocks: 41
   order- 1 free:       256 KiB, blocks: 32

The runtime is better (2 sec) and stable across multiple runs, and the
reported "clear_free" amount is also larger than without the patch.
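
As a rough illustration only (not part of the patch), the leftover-block
arithmetic described above can be sketched in plain C. The helper names
order_for_size() and count_trim_leftovers(), and the CHUNK_SIZE_KIB and
NUM_ORDERS constants, are hypothetical. The sketch only models how halving a
256KiB block down to 8KiB leaves one free block per intermediate order; it
does not use the drm_buddy API and ignores clear tracking and merging:

```c
#include <assert.h>

#define CHUNK_SIZE_KIB 4u	/* chunk_size reported in the dumps above */
#define NUM_ORDERS 16u		/* hypothetical bound, for this sketch only */

/* Smallest order whose block size (CHUNK_SIZE_KIB << order) covers size_kib. */
static unsigned int order_for_size(unsigned int size_kib)
{
	unsigned int order = 0;

	while ((CHUNK_SIZE_KIB << order) < size_kib)
		order++;

	return order;
}

/*
 * Model nr_objects allocations that each take a block of block_kib and trim
 * it down to object_kib. Every halving step keeps one half and leaves the
 * other half on the free list, so free_count[o] grows by one for each order
 * o strictly below the block's order and not below the object's order.
 */
static void count_trim_leftovers(unsigned int block_kib, unsigned int object_kib,
				 unsigned int nr_objects,
				 unsigned int free_count[NUM_ORDERS])
{
	unsigned int top = order_for_size(block_kib);
	unsigned int bottom = order_for_size(object_kib);
	unsigned int i, order;

	assert(top < NUM_ORDERS && bottom <= top);

	for (i = 0; i < nr_objects; i++)
		for (order = top; order > bottom; order--)
			free_count[order - 1]++;
}
```

Calling count_trim_leftovers(256, 8, 4000, counts) on a zeroed array leaves
4000 entries at each of orders 5 down to 1 for a single pool. The dumps above
show roughly twice as many order-1 blocks as blocks of the other orders,
presumably because the freed 8KiB objects also sit unmerged at order 1, which
this sketch does not model.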
Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")
Signed-off-by: Pierre-Eric Pelloux-Prayer
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 8 ++++++++
 drivers/gpu/drm/drm_buddy.c                  | 1 +
 include/drm/drm_buddy.h                      | 1 +
 3 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index abdc52b0895a..dbbaa15a973e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -499,6 +499,14 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
 
 	INIT_LIST_HEAD(&vres->blocks);
 
+	/* Trimming creates smaller blocks that may never be given to the driver.
+	 * Such blocks won't be cleared until being seen by the driver, which might
+	 * never occur (for instance UMD might request large alignment) => in such
+	 * case, upon release of the block, the drm_buddy allocator won't merge them
+	 * back, because their clear status is different.
+	 */
+	vres->flags = DRM_BUDDY_TRIM_IF_CLEAR;
+
 	if (place->flags & TTM_PL_FLAG_TOPDOWN)
 		vres->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
 
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index a1e652b7631d..555c72abce4c 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -1092,6 +1092,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 
 	/* Trim the allocated block to the required size */
 	if (!(flags & DRM_BUDDY_TRIM_DISABLE) &&
+	    (!(flags & DRM_BUDDY_TRIM_IF_CLEAR) || drm_buddy_block_is_clear(block)) &&
 	    original_size != size) {
 		struct list_head *trim_list;
 		LIST_HEAD(temp);
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 9689a7c5dd36..c338d03028c3 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -28,6 +28,7 @@
 #define DRM_BUDDY_CLEAR_ALLOCATION		BIT(3)
 #define DRM_BUDDY_CLEARED			BIT(4)
 #define DRM_BUDDY_TRIM_DISABLE			BIT(5)
+#define DRM_BUDDY_TRIM_IF_CLEAR			BIT(6)
 
 struct drm_buddy_block {
 #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
-- 
2.43.0

From nobody Wed Oct 8 03:53:53 2025 Received: from NAM04-DM6-obe.outbound.protection.outlook.com (mail-dm6nam04on2049.outbound.protection.outlook.com [40.107.102.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1F28F2D46BE for ; Wed, 2 Jul 2025 16:13:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.102.49 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751472799; cv=fail; b=g3SbnZjorsNl+yrh2AiucZmdBHqKaphZ1x7Z87GSuo8JWeiQMmxZdN5VbwqyaykO/edtWJqDAAWIUBuZROVe8lxbcsliyg4vlcJiEPOlXZZQ8Oaz8dcLOh9LUJObuMF+cicxmlAsBuo05arUEIm27CFvUdtyFGl3dY0XIxC2eDg= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751472799; c=relaxed/simple; bh=awEfWSpZpHIFdnepieYVnLdufT5Vegp9loauOfuVAEU=; 
h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=DHsqDlA/2m5BMohgqZqLEmV4EL4LabETrcuyQp6DHAZLopqwGXHifjA/v0OHp+WqqQIh8NtwDax7kRk+jS0yplsQ4gcbWflcirRfHLUKF2/0gLI3r3glPLqW173kwYlX7mLGs7xeW7xvBmS2Cg7roE1KSpMA4ToHzyKqtKDwzZ8= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com; spf=fail smtp.mailfrom=amd.com; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b=ZN3BtJ+9; arc=fail smtp.client-ip=40.107.102.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=amd.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b="ZN3BtJ+9" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=CUblJQ8JlKHcpYLIgXBUtxTJJf7PfBDFdygw5lrSVSg7tIIHoIgzv72OVj3Y1jtsxj7e69qmivXwqlaSfJJUkGd8wwn2fhcAC+yu26M6pdawy7kucPuDEye9fhUOgfhnr3NAmfnL3pS2n2QgWtHc2pOnS9D9C0OyqxmjRxCThKbTBzHypENStBCHGCgaMIcd10VayirHYn4gWYktzQa9is1o8L+kDCJMnc+TEN6tQGSEaWhKKw34NfOHoHP6jpX6QhHID+X1Ur1OGm6fczf8yXNtPQFmGzJobgIvsix+tFhkIxRUIn02u6yYt8TnZYMv46vyURqlmepS4SaBJbxrgg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=AuJ+Tr08weuEJgrrKgI6T2Jy2JZS3PuD7ZscwTWw2Ys=; b=jELla4zbf4cgVadCNI2VtaJmcCmgtd0hT6o1DjuMjvgrHEVNCKyX/5MeMlgxTu7CkNcbQhyC/XtjpFTyiy4KZmv8Lg4eJlYkcIxIhFknBir2CZ5v28QT9JiSUJPB5luGx4KkIlQHCrWE5hOr5X07IalN9JsN0eM2PB/J2m/9Rz5lvy/S3ne5oYVQOHLOtLmPpxsy5kolM1MNJ23lnFLDYezoQY3Uo0k+HobjZ6UxpiW7NXNzkmX8U0LbWxFhF31+1hIP8w0kQbeNLZm15faMjYmlPWeFHhUKoHRs21uug9mcyGbkdD7fSqzo4lhGdnOpLnSL71VK9I6pAgXuYV3A9A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass 
(sender ip is 165.204.84.17) smtp.rcpttodomain=gmail.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=AuJ+Tr08weuEJgrrKgI6T2Jy2JZS3PuD7ZscwTWw2Ys=; b=ZN3BtJ+9C7ogepgzOVg/GlWlTo5fc+jsIJRv7ffNKYuKZh7nvZp8/RYC/iNi5w3QGnNDRBT/I/RydY8ghRlMf9PQjwD/5xQlAxxyfAXZu73d7MOPG74n7l88QwloMRMJAWjm3M7lLamVTzgbom68gF4S3Nn4yNXeaeGwwRv7ka0= Received: from MW4PR03CA0197.namprd03.prod.outlook.com (2603:10b6:303:b8::22) by IA0PR12MB7507.namprd12.prod.outlook.com (2603:10b6:208:441::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8901.20; Wed, 2 Jul 2025 16:13:12 +0000 Received: from SJ1PEPF00001CE1.namprd05.prod.outlook.com (2603:10b6:303:b8::4) by MW4PR03CA0197.outlook.office365.com (2603:10b6:303:b8::22) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8901.19 via Frontend Transport; Wed, 2 Jul 2025 16:13:11 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by SJ1PEPF00001CE1.mail.protection.outlook.com (10.167.242.9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8901.15 via Frontend Transport; Wed, 2 Jul 2025 16:13:10 +0000 Received: from FRAPPELLOUX01.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Wed, 2 Jul 2025 11:13:07 -0500 From: Pierre-Eric Pelloux-Prayer To: Alex Deucher , =?UTF-8?q?Christian=20K=C3=B6nig?= , David Airlie , Simona Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann CC: Pierre-Eric Pelloux-Prayer , , , Subject: [PATCH v1 2/3] drm/buddy: use DRM_BUDDY_CLEAR_ALLOCATION as a hint, not a hard req Date: Wed, 2 Jul 2025 18:12:03 +0200 Message-ID: <20250702161208.25188-3-pierre-eric.pelloux-prayer@amd.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250702161208.25188-1-pierre-eric.pelloux-prayer@amd.com> References: <20250702161208.25188-1-pierre-eric.pelloux-prayer@amd.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ1PEPF00001CE1:EE_|IA0PR12MB7507:EE_ X-MS-Office365-Filtering-Correlation-Id: 94cf7d47-799c-4bfa-441d-08ddb983571c X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|376014|36860700013|1800799024; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?CmdVIrrOAVgJULBwk620Ji6GuW/crZWFu6hmrAvoXVDrHswtVVXw1ViZ6Lq9?= =?us-ascii?Q?x0S34yQHkuSi5Q86N03CtAMqkpSSHSHqMa/9doOfE2RU5vpi9mLO0Mm79SW4?= =?us-ascii?Q?HVrnrtuhiuBvj/8W2JUl+suPjKo12ZQYx7IzjcuDpGAUEfjyxRSmfSE0bEqO?= =?us-ascii?Q?SjAHjcyVrYaEVpdjuWn91y9nfjG8HxnjhMDiI2PWSfFX+j66NfuhylBsS5Ud?= =?us-ascii?Q?aAYST9oJlx87nCJqnt0fN0tyI8EYTzDUZy/DWHCjv7qjtXyE0GGc5JGh43PV?= =?us-ascii?Q?m642JH1cvOzYRidDNidizOIQil4rEY/m96E2DOvKbHbgvl3dXbBWAgACmVt9?= =?us-ascii?Q?LD/ii6tlDazVQrJSzXMYJzLEWLUMEAOp3e4WZuAy29TkQGmdvGrd64rYO/8P?= =?us-ascii?Q?szyMQiMPg5DcSk926YiBJBWJScjVAem2WM0HvKHN2pKh86KCCX9nGC8vWzZD?= 
=?us-ascii?Q?t+KWjO2AfoZKVy9aLqmofyrjkBjoIP8SWgJk/t3pu6ophI/qrtp9trmi7vfk?= =?us-ascii?Q?X7UpqzRqUF9tXoQFlRhpKhqkhxD72nbMNumJoYhKJu9waKRVrJ9DlrF9WfQS?= =?us-ascii?Q?6jZXiccct6IsBNM7QGdvnVSGUFLmlWhdC84zYc4HUAn+FB80Ljexe7Fg5j4C?= =?us-ascii?Q?qmGSmi8c5pvPYdBox6bSrTWcgePqGX97ZvdcVjDrBsru3qiX4K38AvYCW6H3?= =?us-ascii?Q?kyxreyy4gtL990VlLnXWyQ8w7whH+5TcBFcFhhxZRIr9ATIM2ikJEHIDtmCn?= =?us-ascii?Q?C0ME3QRlOhsndL2K9lWfVDdYfwPzrEZoC9GkZZDZaHPdJ81vGx7SCaoVYrqw?= =?us-ascii?Q?KuIR6jatKs13Pr4zhtw/TPhzcKbz/wXeWBVM/Y7NPAElZRbGsofhknbuvtNS?= =?us-ascii?Q?QTJ+mdexqFsVUn8NOqtPeBwfnpxJ4rPgKFQflUTclG6zOFCFr7am+sQMvEsQ?= =?us-ascii?Q?mtFy/5Ki8Agfhk0OY9hGScS6N3qOrRQWsBFq5xIBCbyP4Og1kpnG78h/QrAP?= =?us-ascii?Q?zPSjoWFOLObYIWHOWb4OVGn9yMrxA7lI7WC+gyH1S9GRx8mm9TxEruX7iSpv?= =?us-ascii?Q?fX1FK5sMvGJ53Jc+T7SEpZT2GstcMSjRd811Bo9iNCJ9iYG6M8AMObF7hM75?= =?us-ascii?Q?Ukg65y6URQlAkmVqeurlVKVFjjjrS08xwqfi0UlFJlrRp1UsezxbL6uS1Ggx?= =?us-ascii?Q?q/hxR0aZvAi2XctD8uOmgwA4DW36lLUKrp/qTbKniBrXs+yz+9LFql82hI0T?= =?us-ascii?Q?IXoKD/Cvtx5NRWV92QJLowgCnpzM/s4U9CXq/cGhqGn0S+/x2zGuwP1l6WXn?= =?us-ascii?Q?JURKAPo6/d0+yp/+O+n3EH8JmaLrh/l9RdXdUnsgazRj9O4+fb6WDgydyDWJ?= =?us-ascii?Q?wkcHuhEFOpnmq3j9uhvtXeUN2L1OQyzguEXC2jBlZFGbkPsSsh38LURwjq8H?= =?us-ascii?Q?iWjlt87bdbH0pjo2A5NPF3Q6DflqAVy/h5M9GnQL2vZuGB5t/ufaQD3kYVn1?= =?us-ascii?Q?U8szG08fjaJHi+So8ewfVQMvyCZPaFGgsm36?= X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230040)(82310400026)(376014)(36860700013)(1800799024);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 02 Jul 2025 16:13:10.8446 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 94cf7d47-799c-4bfa-441d-08ddb983571c X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] 
X-MS-Exchange-CrossTenant-AuthSource: SJ1PEPF00001CE1.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA0PR12MB7507 Content-Type: text/plain; charset="utf-8"

The rationale for this change is that it's preferable to return non-cleared
memory instead of splitting up higher-order blocks, as splitting leads to
more fragmented memory.

The driver can clear the memory itself if required, and the clear tracking
avoids the need for useless clearing jobs.

This commit renames DRM_BUDDY_CLEAR_ALLOCATION to
DRM_BUDDY_PREFER_CLEAR_ALLOCATION to make the intent clearer, and deletes
the tests that expected this flag to return cleared memory.

Signed-off-by: Pierre-Eric Pelloux-Prayer
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c |  2 +-
 drivers/gpu/drm/drm_buddy.c                  | 43 ++++++-----
 drivers/gpu/drm/tests/drm_buddy_test.c       | 75 +++-----------------
 include/drm/drm_buddy.h                      |  2 +-
 4 files changed, 35 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index dbbaa15a973e..24dd094eac84 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -514,7 +514,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
 		vres->flags |= DRM_BUDDY_CONTIGUOUS_ALLOCATION;
 
 	if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED)
-		vres->flags |= DRM_BUDDY_CLEAR_ALLOCATION;
+		vres->flags |= DRM_BUDDY_PREFER_CLEAR_ALLOCATION;
 
 	if (fpfn || lpfn != mgr->mm.size)
 		/* Allocate blocks in desired range */
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 555c72abce4c..fd31322b3d41 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -473,7 +473,7 @@ EXPORT_SYMBOL(drm_buddy_free_list);
 
 static bool block_incompatible(struct drm_buddy_block *block, unsigned int flags)
 {
-	bool needs_clear = flags & DRM_BUDDY_CLEAR_ALLOCATION;
+	bool needs_clear = flags & DRM_BUDDY_PREFER_CLEAR_ALLOCATION;
 
 	return needs_clear != drm_buddy_block_is_clear(block);
 }
@@ -593,21 +593,30 @@ get_maxblock(struct drm_buddy *mm, unsigned int order,
 	     unsigned long flags)
 {
 	struct drm_buddy_block *max_block = NULL, *block = NULL;
+	bool wants_clear;
 	unsigned int i;
 
 	for (i = order; i <= mm->max_order; ++i) {
 		struct drm_buddy_block *tmp_block;
 
+		wants_clear = flags & DRM_BUDDY_PREFER_CLEAR_ALLOCATION;
+
+retry:
 		list_for_each_entry_reverse(tmp_block, &mm->free_list[i], link) {
-			if (block_incompatible(tmp_block, flags))
+			if (wants_clear && !drm_buddy_block_is_clear(tmp_block))
 				continue;
 
 			block = tmp_block;
 			break;
 		}
 
-		if (!block)
+		if (!block) {
+			if (wants_clear) {
+				wants_clear = false;
+				goto retry;
+			}
 			continue;
+		}
 
 		if (!max_block) {
 			max_block = block;
@@ -630,6 +639,7 @@ alloc_from_freelist(struct drm_buddy *mm,
 {
 	struct drm_buddy_block *block = NULL;
 	unsigned int tmp;
+	bool wants_clear;
 	int err;
 
 	if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) {
@@ -640,9 +650,11 @@ alloc_from_freelist(struct drm_buddy *mm,
 	} else {
 		for (tmp = order; tmp <= mm->max_order; ++tmp) {
 			struct drm_buddy_block *tmp_block;
+			wants_clear = flags & DRM_BUDDY_PREFER_CLEAR_ALLOCATION;
 
+retry:
 			list_for_each_entry_reverse(tmp_block, &mm->free_list[tmp], link) {
-				if (block_incompatible(tmp_block, flags))
+				if (wants_clear && !drm_buddy_block_is_clear(tmp_block))
 					continue;
 
 				block = tmp_block;
@@ -651,25 +663,20 @@ alloc_from_freelist(struct drm_buddy *mm,
 
 			if (block)
 				break;
-		}
-	}
 
-	if (!block) {
-		/* Fallback method */
-		for (tmp = order; tmp <= mm->max_order; ++tmp) {
-			if (!list_empty(&mm->free_list[tmp])) {
-				block = list_last_entry(&mm->free_list[tmp],
-							struct drm_buddy_block,
-							link);
-				if (block)
-					break;
+			if (wants_clear) {
+				/* Relax this requirement to avoid splitting up higher order
+				 * blocks.
+				 */
+				wants_clear = false;
+				goto retry;
 			}
 		}
-
-		if (!block)
-			return ERR_PTR(-ENOSPC);
 	}
 
+	if (!block)
+		return ERR_PTR(-ENOSPC);
+
 	BUG_ON(!drm_buddy_block_is_free(block));
 
 	while (tmp != order) {
diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c b/drivers/gpu/drm/tests/drm_buddy_test.c
index 7a0e523651f0..7ae65d93adb0 100644
--- a/drivers/gpu/drm/tests/drm_buddy_test.c
+++ b/drivers/gpu/drm/tests/drm_buddy_test.c
@@ -240,7 +240,7 @@ static void drm_test_buddy_alloc_range_bias(struct kunit *test)
 		bias_end = max(bias_end, bias_start + ps);
 		bias_rem = bias_end - bias_start;
 
-		flags = DRM_BUDDY_CLEAR_ALLOCATION | DRM_BUDDY_RANGE_ALLOCATION;
+		flags = DRM_BUDDY_PREFER_CLEAR_ALLOCATION | DRM_BUDDY_RANGE_ALLOCATION;
 		size = max(round_up(prandom_u32_state(&prng) % bias_rem, ps), ps);
 
 		KUNIT_ASSERT_FALSE_MSG(test,
@@ -272,67 +272,9 @@ static void drm_test_buddy_alloc_clear(struct kunit *test)
 	LIST_HEAD(clean);
 
 	mm_size = SZ_4K << max_order;
-	KUNIT_EXPECT_FALSE(test, drm_buddy_init(&mm, mm_size, ps));
-
-	KUNIT_EXPECT_EQ(test, mm.max_order, max_order);
-
-	/*
-	 * Idea is to allocate and free some random portion of the address space,
-	 * returning those pages as non-dirty and randomly alternate between
-	 * requesting dirty and non-dirty pages (not going over the limit
-	 * we freed as non-dirty), putting that into two separate lists.
-	 * Loop over both lists at the end checking that the dirty list
-	 * is indeed all dirty pages and vice versa. Free it all again,
-	 * keeping the dirty/clear status.
-	 */
-	KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
-							    5 * ps, ps, &allocated,
-							    DRM_BUDDY_TOPDOWN_ALLOCATION),
-				"buddy_alloc hit an error size=%lu\n", 5 * ps);
-	drm_buddy_free_list(&mm, &allocated, DRM_BUDDY_CLEARED);
-
-	n_pages = 10;
-	do {
-		unsigned long flags;
-		struct list_head *list;
-		int slot = i % 2;
-
-		if (slot == 0) {
-			list = &dirty;
-			flags = 0;
-		} else {
-			list = &clean;
-			flags = DRM_BUDDY_CLEAR_ALLOCATION;
-		}
-
-		KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
-								    ps, ps, list,
-								    flags),
-					"buddy_alloc hit an error size=%lu\n", ps);
-	} while (++i < n_pages);
-
-	list_for_each_entry(block, &clean, link)
-		KUNIT_EXPECT_EQ(test, drm_buddy_block_is_clear(block), true);
-
-	list_for_each_entry(block, &dirty, link)
-		KUNIT_EXPECT_EQ(test, drm_buddy_block_is_clear(block), false);
-
-	drm_buddy_free_list(&mm, &clean, DRM_BUDDY_CLEARED);
-
-	/*
-	 * Trying to go over the clear limit for some allocation.
-	 * The allocation should never fail with reasonable page-size.
-	 */
-	KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
-							    10 * ps, ps, &clean,
-							    DRM_BUDDY_CLEAR_ALLOCATION),
-				"buddy_alloc hit an error size=%lu\n", 10 * ps);
-
-	drm_buddy_free_list(&mm, &clean, DRM_BUDDY_CLEARED);
-	drm_buddy_free_list(&mm, &dirty, 0);
-	drm_buddy_fini(&mm);
 
 	KUNIT_EXPECT_FALSE(test, drm_buddy_init(&mm, mm_size, ps));
+	KUNIT_EXPECT_EQ(test, mm.max_order, max_order);
 
 	/*
 	 * Create a new mm. Intentionally fragment the address space by creating
@@ -366,14 +308,13 @@ static void drm_test_buddy_alloc_clear(struct kunit *test)
 	do {
 		size = SZ_4K << order;
 
-		KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
-								    size, size, &allocated,
-								    DRM_BUDDY_CLEAR_ALLOCATION),
-					"buddy_alloc hit an error size=%u\n", size);
+		KUNIT_ASSERT_FALSE_MSG(
+			test, drm_buddy_alloc_blocks(&mm, 0, mm_size,
+						     size, size, &allocated,
+						     DRM_BUDDY_PREFER_CLEAR_ALLOCATION),
+			"buddy_alloc hit an error size=%u\n", size);
 		total = 0;
 		list_for_each_entry(block, &allocated, link) {
-			if (size != mm_size)
-				KUNIT_EXPECT_EQ(test, drm_buddy_block_is_clear(block), false);
 			total += drm_buddy_block_size(&mm, block);
 		}
 		KUNIT_EXPECT_EQ(test, total, size);
@@ -399,7 +340,7 @@ static void drm_test_buddy_alloc_clear(struct kunit *test)
 	drm_buddy_free_list(&mm, &allocated, DRM_BUDDY_CLEARED);
 	KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, SZ_4K << max_order,
 							    2 * ps, ps, &allocated,
-							    DRM_BUDDY_CLEAR_ALLOCATION),
+							    DRM_BUDDY_PREFER_CLEAR_ALLOCATION),
 				"buddy_alloc hit an error size=%lu\n", 2 * ps);
 	drm_buddy_free_list(&mm, &allocated, DRM_BUDDY_CLEARED);
 	KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, SZ_4K << max_order, mm_size,
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index c338d03028c3..ed06be63a770 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -25,7 +25,7 @@
 #define DRM_BUDDY_RANGE_ALLOCATION		BIT(0)
 #define DRM_BUDDY_TOPDOWN_ALLOCATION		BIT(1)
 #define DRM_BUDDY_CONTIGUOUS_ALLOCATION		BIT(2)
-#define DRM_BUDDY_CLEAR_ALLOCATION		BIT(3)
+#define DRM_BUDDY_PREFER_CLEAR_ALLOCATION	BIT(3)
 #define DRM_BUDDY_CLEARED			BIT(4)
 #define DRM_BUDDY_TRIM_DISABLE			BIT(5)
 #define DRM_BUDDY_TRIM_IF_CLEAR			BIT(6)
-- 
2.43.0

From nobody Wed Oct 8 03:53:53 2025 Received: from NAM11-BN8-obe.outbound.protection.outlook.com (mail-bn8nam11on2059.outbound.protection.outlook.com [40.107.236.59]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E7B512D46BE for ; Wed, 2 Jul 2025 16:13:42 +0000 (UTC)
From: Pierre-Eric Pelloux-Prayer
To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter
CC: Pierre-Eric Pelloux-Prayer
Subject: [PATCH v1 3/3] drm/buddy: dont go over the higher orders multiple times
Date: Wed, 2 Jul 2025 18:12:04 +0200
Message-ID: <20250702161208.25188-4-pierre-eric.pelloux-prayer@amd.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250702161208.25188-1-pierre-eric.pelloux-prayer@amd.com>
References: <20250702161208.25188-1-pierre-eric.pelloux-prayer@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

AFAICT the rationale for the loop is to:
1) try to allocate from the preferred order
2) if it fails, try higher orders (order + 1 -> max order)
3) if it fails, try smaller orders (order - 1 -> min order)

Steps 1 and 2 are covered by the loop going through [order, max_order].
Currently step 3 tries again [order, max_order] but with decreasing
values of order.
This is wasteful, so change it to evaluate only order.

Signed-off-by: Pierre-Eric Pelloux-Prayer
---
 drivers/gpu/drm/drm_buddy.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index fd31322b3d41..9d3723f2cff9 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -590,13 +590,14 @@ __drm_buddy_alloc_range_bias(struct drm_buddy *mm,
 
 static struct drm_buddy_block *
 get_maxblock(struct drm_buddy *mm, unsigned int order,
+	     unsigned int max_order,
 	     unsigned long flags)
 {
 	struct drm_buddy_block *max_block = NULL, *block = NULL;
 	bool wants_clear;
 	unsigned int i;
 
-	for (i = order; i <= mm->max_order; ++i) {
+	for (i = order; i <= max_order; ++i) {
 		struct drm_buddy_block *tmp_block;
 
 		wants_clear = flags & DRM_BUDDY_PREFER_CLEAR_ALLOCATION;
@@ -635,6 +636,7 @@ get_maxblock(struct drm_buddy *mm, unsigned int order,
 static struct drm_buddy_block *
 alloc_from_freelist(struct drm_buddy *mm,
 		    unsigned int order,
+		    unsigned int max_order,
 		    unsigned long flags)
 {
 	struct drm_buddy_block *block = NULL;
@@ -643,12 +645,12 @@ alloc_from_freelist(struct drm_buddy *mm,
 	int err;
 
 	if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) {
-		block = get_maxblock(mm, order, flags);
+		block = get_maxblock(mm, order, max_order, flags);
 		if (block)
 			/* Store the obtained block order */
 			tmp = drm_buddy_block_order(block);
 	} else {
-		for (tmp = order; tmp <= mm->max_order; ++tmp) {
+		for (tmp = order; tmp <= max_order; ++tmp) {
 			struct drm_buddy_block *tmp_block;
 			wants_clear = flags & DRM_BUDDY_PREFER_CLEAR_ALLOCATION;
 
@@ -956,6 +958,7 @@ static struct drm_buddy_block *
 __drm_buddy_alloc_blocks(struct drm_buddy *mm,
 			 u64 start, u64 end,
 			 unsigned int order,
+			 unsigned int max_order,
 			 unsigned long flags)
 {
 	if (flags & DRM_BUDDY_RANGE_ALLOCATION)
@@ -964,7 +967,7 @@ __drm_buddy_alloc_blocks(struct drm_buddy *mm,
 						     order, flags);
 	else
 		/* Allocate from freelist */
-		return alloc_from_freelist(mm, order, flags);
+		return alloc_from_freelist(mm, order, max_order, flags);
 }
 
 /**
@@ -995,7 +998,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 {
 	struct drm_buddy_block *block = NULL;
 	u64 original_size, original_min_size;
-	unsigned int min_order, order;
+	unsigned int min_order, max_order, order;
 	LIST_HEAD(allocated);
 	unsigned long pages;
 	int err;
@@ -1044,6 +1047,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 
 	do {
 		order = min(order, (unsigned int)fls(pages) - 1);
+		max_order = mm->max_order;
 		BUG_ON(order > mm->max_order);
 		BUG_ON(order < min_order);
 
@@ -1051,6 +1055,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 			block = __drm_buddy_alloc_blocks(mm, start,
 							 end,
 							 order,
+							 max_order,
 							 flags);
 			if (!IS_ERR(block))
 				break;
@@ -1062,6 +1067,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 					block = __drm_buddy_alloc_blocks(mm, start,
 									 end,
 									 min_order,
+									 mm->max_order,
 									 flags);
 					if (!IS_ERR(block)) {
 						order = min_order;
@@ -1082,6 +1088,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm,
 				err = -ENOSPC;
 				goto err_free;
 			}
+			max_order = order;
 		} while (1);
 
 		mark_allocated(block);
-- 
2.43.0
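
For illustration only: a minimal user-space sketch of the search order that the commit message of patch 3/3 describes. It is not kernel code and does not touch drm_buddy. The function names scans_before() and scans_after() and the parameter mm_max_order are hypothetical, and the model assumes the worst case where every free-list lookup fails. It only counts how many orders get examined, and it leaves out the force-merge retry, which still uses mm->max_order in the patch.

```c
#include <assert.h>

/*
 * Hypothetical model, not kernel code: count how many free-list orders
 * are examined when no lookup ever succeeds.
 */

/* Before the patch: every retry rescans [order, mm_max_order]. */
unsigned int scans_before(unsigned int order, unsigned int min_order,
			  unsigned int mm_max_order)
{
	unsigned int scans = 0;

	assert(min_order <= order && order <= mm_max_order);

	for (;;) {
		scans += mm_max_order - order + 1;
		/* Same exit test as the inner loop: stop once min_order failed. */
		if (order-- == min_order)
			break;
	}
	return scans;
}

/*
 * After the patch: the first pass covers [order, mm_max_order], every
 * later pass examines only the newly lowered order.
 */
unsigned int scans_after(unsigned int order, unsigned int min_order,
			 unsigned int mm_max_order)
{
	unsigned int max_order = mm_max_order;
	unsigned int scans = 0;

	assert(min_order <= order && order <= mm_max_order);

	for (;;) {
		scans += max_order - order + 1;
		if (order-- == min_order)
			break;
		/* Higher orders were already tried; clamp the upper bound. */
		max_order = order;
	}
	return scans;
}
```

Under these assumptions, with order = 4, min_order = 0 and mm_max_order = 10, the old loop examines 7 + 8 + 9 + 10 + 11 = 45 orders, while the new one examines 7 + 1 + 1 + 1 + 1 = 11.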