From nobody Thu Oct 2 07:49:10 2025 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C366E2D3756 for ; Thu, 18 Sep 2025 20:25:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758227116; cv=none; b=Hf8SOhhuV6JZ4EtUFJjh30IUYRfYBIB2hM7AAFVwE+kuOIUCk4pZY8MgA2OUJ6gasV5WGfOoaEv+AXlOSW7Jb+Wst1vz8mvS2mLirkPrPTWjwDKjmav84OKo3bXlF+oyiddL60JPEtICQQ4mEqX9DPEQDF02EES9wWuKttKasJI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758227116; c=relaxed/simple; bh=9aV22t7bk0UoXv46bvhEAZ57WjI7exArwzm9mKqjLrM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=DGuXxB5VwctqZjbmAqwKdIHnYglnCxOIR/33CmmyN7y6cYBoASqZNe+73Hbga7aEU7FzDmKwz7kDLLAGK+rPZvX7M3df6DqOQXKsGfue8X+AqeubjOvxTV34rw+5HI3285+r9YRn5RQz9/icBSw2CYKI39K+WdsLfIyK26+8w1Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=CcC1IRdg; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="CcC1IRdg" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Cc:To:In-Reply-To:References:Message-Id: Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From:Sender: Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender 
	:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:
	List-Subscribe:List-Post:List-Owner:List-Archive;
	bh=ViZFP+fI0QUjz41MCfkYglrEpSnPBiVf0zcZpjHQNag=; b=CcC1IRdgN/vL0VAeTkcmPM5Bie
	EUDpdBKBACIc7ceGeW2cLgHijzkek/V8CAUBbJWqTPL76EELT+Op32ZkUDvisHTSspAigPu2ruE1y
	kctGVOBiovJRF1MZvrHdHajqhRxu6uK2Th3Mk+5Jevx4Bi0BRtAwTyOX3E27xKvbbGuQqJMcODIy6
	Yel7B4n+EcnReL22baVazXCHGI/ZcuIbrNU5Fhg1bcOYAInTvqpNyJtobe3/WhQA2e9UBcP4Fi5w1
	AResCDeZX3VPYxUMoU7cIoE7lWwSrZEHyGQ3y+k4abP7P61RAz7dBIIp1zWO55nOQ1G3M5WAdj+7f
	GZxHcWgA==;
Received: from 179-125-87-227-dinamico.pombonet.net.br ([179.125.87.227] helo=[127.0.0.1])
	by fanzine2.igalia.com with esmtpsa
	(Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim)
	id 1uzLBl-00Dp3q-01; Thu, 18 Sep 2025 22:25:09 +0200
From: Thadeu Lima de Souza Cascardo
Date: Thu, 18 Sep 2025 17:09:24 -0300
Subject: [PATCH RFC v2 1/3] ttm: pool: allow requests to prefer latency over throughput
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20250918-ttm_pool_no_direct_reclaim-v2-1-135294e1f8a2@igalia.com>
References: <20250918-ttm_pool_no_direct_reclaim-v2-0-135294e1f8a2@igalia.com>
In-Reply-To: <20250918-ttm_pool_no_direct_reclaim-v2-0-135294e1f8a2@igalia.com>
To: Christian Koenig, Michel Dänzer, Huang Rui, Matthew Auld,
 Matthew Brost, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
 David Airlie, Simona Vetter
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org, kernel-dev@igalia.com, Tvrtko Ursulin,
 Sergey Senozhatsky, Thadeu Lima de Souza Cascardo
X-Mailer: b4 0.14.2

The TTM pool allocator prefers to allocate higher order pages such that the
GPU will spend less time walking page tables and provide better throughput.
There were cases where heavily fragmented memory led to a 30% change in the
throughput of a given GPU workload in a datacenter.

On a desktop workload on a low-memory system, though, allocating such higher
order pages might put the system under memory pressure, triggering direct
reclaim and adding latency to certain desktop operations, even though
allocating lower order pages would be possible and would avoid such reclaim.

This was seen on ChromeOS when opening multiple tabs and switching desktops,
leading to high latency in such operations.

Add an option to the TTM operation context that allows the behavior to be set
system wide or per TTM object.

Signed-off-by: Thadeu Lima de Souza Cascardo
---
 drivers/gpu/drm/ttm/ttm_pool.c | 11 +++++++----
 include/drm/ttm/ttm_bo.h       |  5 +++++
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index baf27c70a4193a121fbc8b4e67cd6feb4c612b85..02c622a103fcece003bd70ce6b5833ada70f5228 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -133,7 +133,8 @@ static DECLARE_RWSEM(pool_shrink_rwsem);
 
 /* Allocate pages of size 1 << order with the given gfp_flags */
 static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
-					unsigned int order)
+					unsigned int order,
+					const struct ttm_operation_ctx *ctx)
 {
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
 	struct ttm_pool_dma *dma;
@@ -144,9 +145,12 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 	 * Mapping pages directly into an userspace process and calling
 	 * put_page() on a TTM allocated page is illegal.
 	 */
-	if (order)
+	if (order) {
 		gfp_flags |= __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN |
 			__GFP_THISNODE;
+		if (ctx->alloc_method == ttm_op_alloc_latency)
+			gfp_flags &= ~__GFP_DIRECT_RECLAIM;
+	}
 
 	if (!pool->use_dma_alloc) {
 		p = alloc_pages_node(pool->nid, gfp_flags, order);
@@ -745,7 +749,7 @@ static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 		if (!p) {
 			page_caching = ttm_cached;
 			allow_pools = false;
-			p = ttm_pool_alloc_page(pool, gfp_flags, order);
+			p = ttm_pool_alloc_page(pool, gfp_flags, order, ctx);
 		}
 		/* If that fails, lower the order if possible and retry. */
 		if (!p) {
@@ -815,7 +819,6 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 		return -EINVAL;
 
 	ttm_pool_alloc_state_init(tt, &alloc);
-
 	return __ttm_pool_alloc(pool, tt, ctx, &alloc, NULL);
 }
 EXPORT_SYMBOL(ttm_pool_alloc);
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 479b7ed075c0ffba21df971db7fef914c531a51d..8531f8e8bb9b079927d0e4759a12819303542f62 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -184,6 +184,11 @@ struct ttm_operation_ctx {
 	bool no_wait_gpu;
 	bool gfp_retry_mayfail;
 	bool allow_res_evict;
+	enum {
+		ttm_op_alloc_default = 0,
+		ttm_op_alloc_latency = 2,
+		ttm_op_alloc_throughput = 3,
+	} alloc_method;
 	struct dma_resv *resv;
 	uint64_t bytes_moved;
 };
-- 
2.47.3

From nobody Thu Oct 2 07:49:10 2025
Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id C286C31D37A
	for ; Thu, 18 Sep 2025 20:25:16 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1758227119; cv=none;
b=nvmcCkmMb9pG/pwazR1Hd2IA+MXE4R4XfJrP+X5oblXgkgFYW4sTDXWVWCZ4RwDRFa+8vcVJytcjqID34dKo+QPKEOTiSRuVUv6/wLYqnHOsHDaG3QTrV6vKyIxNqREvnkB3va3cae/wtlMxOCKEHrUyoiSt5/4i0WuYRpn2/rs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758227119; c=relaxed/simple; bh=BGec5hRHCdYo1tz3XrXo1BbiWzyDNTSvVdqyIAcrfYA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=NcmyjQzFZdE2dgRXf897Ib9ByybqhmcLK6OyIgskgWwbB+j7GsECY7k3Ed9ZLOFIVOWcR5BnatjWyrreyyM6VVc8GC+RjmZRE3ajimY165pCyb/A4cAxduXueDxFKB28Fyd4iKF6FBQfVQkLXdi4Hb6U8qVYKe/SW9L2BwuEqqo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=BPICQt7z; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="BPICQt7z" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Cc:To:In-Reply-To:References:Message-Id: Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From:Sender: Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender :Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=woVS27yiYLORVThWbfEQkqZ5Ig2PmX/E6zj72wklklk=; b=BPICQt7zw+lEmchMwkXG065zLd WlcCBQbhDtkeMqpBEs6nQSofJmdnkAAJR5jUoXx9nncI/hK/5fVFySTy42O5NvLwRZGcJ6lRg8K2r V3OozTQ51zkuklzHsRMcRqTuT7JdQSkxdalVFBNsTs7qzN/w5M5L7w4DDhAMe5IcM4BspmU4H8WNU kmj7Gr/+DPHV6Em9EvhQSgYCO5IASqzpIrAIrJvCbVyhF07NyrUglOZkqI4PVIAwQ9Ol/xBu9hsbW nQYMTg619AtU3y56khX7KpGI62dBbrsU/7l0Ucfwqt1IiMkikNBjdk0+JOXCVt6Zo9YnqIS/z6rJo 
	wCzWHfFA==;
Received: from 179-125-87-227-dinamico.pombonet.net.br ([179.125.87.227] helo=[127.0.0.1])
	by fanzine2.igalia.com with esmtpsa
	(Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim)
	id 1uzLBp-00Dp3q-I8; Thu, 18 Sep 2025 22:25:13 +0200
From: Thadeu Lima de Souza Cascardo
Date: Thu, 18 Sep 2025 17:09:25 -0300
Subject: [PATCH RFC v2 2/3] ttm: pool: add a module parameter to set latency preference
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20250918-ttm_pool_no_direct_reclaim-v2-2-135294e1f8a2@igalia.com>
References: <20250918-ttm_pool_no_direct_reclaim-v2-0-135294e1f8a2@igalia.com>
In-Reply-To: <20250918-ttm_pool_no_direct_reclaim-v2-0-135294e1f8a2@igalia.com>
To: Christian Koenig, Michel Dänzer, Huang Rui, Matthew Auld,
 Matthew Brost, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
 David Airlie, Simona Vetter
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org, kernel-dev@igalia.com, Tvrtko Ursulin,
 Sergey Senozhatsky, Thadeu Lima de Souza Cascardo
X-Mailer: b4 0.14.2

Add a module parameter that provides a system-wide setting for allocations
of higher order pages not to use direct reclaim.

The default keeps the existing behavior and allows direct reclaim when
allocating higher order pages.
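
As a rough illustration of how the system-wide setting is meant to combine
with a per-request preference, the check can be modeled in standalone C as
below. This is a sketch under assumptions, not the kernel code:
resolve_alloc_method() and its module_param_value argument are hypothetical
names standing in for the check and the alloc_method module parameter added
by this patch; the enum values are the ones introduced in patch 1/3.

```c
#include <assert.h>

/* Values mirror the enum added to struct ttm_operation_ctx in patch 1/3. */
enum alloc_method {
	ttm_op_alloc_default = 0,
	ttm_op_alloc_latency = 2,
	ttm_op_alloc_throughput = 3,
};

/*
 * Hypothetical helper (not part of the series): models the check this patch
 * adds at the top of __ttm_pool_alloc(). module_param_value stands in for
 * the ttm_alloc_method module parameter (0 - throughput, 1 - latency).
 */
enum alloc_method resolve_alloc_method(enum alloc_method requested,
				       unsigned int module_param_value)
{
	/* Only requests without an explicit preference pick up the default. */
	if (requested == ttm_op_alloc_default && module_param_value == 1)
		return ttm_op_alloc_latency;

	/* An explicit latency or throughput request is left untouched. */
	return requested;
}
```

Under this model, a request that already carries an explicit preference
keeps it, and only requests left at the default inherit the system-wide
latency setting.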

Signed-off-by: Thadeu Lima de Souza Cascardo
---
 drivers/gpu/drm/ttm/ttm_pool.c | 12 ++++++++++--
 drivers/gpu/drm/ttm/ttm_tt.c   |  2 +-
 include/drm/ttm/ttm_pool.h     |  2 +-
 include/drm/ttm/ttm_tt.h       |  2 +-
 4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index 02c622a103fcece003bd70ce6b5833ada70f5228..39203f2c247a36b0389682d7fb841088f4c8a95b 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -56,6 +56,11 @@ static DECLARE_FAULT_ATTR(backup_fault_inject);
 #define should_fail(...) false
 #endif
 
+static unsigned int ttm_alloc_method;
+
+MODULE_PARM_DESC(alloc_method, "TTM allocation method (0 - throughput, 1 - latency)");
+module_param_named(alloc_method, ttm_alloc_method, uint, 0644);
+
 /**
  * struct ttm_pool_dma - Helper object for coherent DMA mappings
  *
@@ -702,7 +707,7 @@ static unsigned int ttm_pool_alloc_find_order(unsigned int highest,
 }
 
 static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
-			    const struct ttm_operation_ctx *ctx,
+			    struct ttm_operation_ctx *ctx,
 			    struct ttm_pool_alloc_state *alloc,
 			    struct ttm_pool_tt_restore *restore)
 {
@@ -717,6 +722,9 @@ static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
 	WARN_ON(!alloc->remaining_pages || ttm_tt_is_populated(tt));
 	WARN_ON(alloc->dma_addr && !pool->dev);
 
+	if (ctx->alloc_method == ttm_op_alloc_default && ttm_alloc_method == 1)
+		ctx->alloc_method = ttm_op_alloc_latency;
+
 	if (tt->page_flags & TTM_TT_FLAG_ZERO_ALLOC)
 		gfp_flags |= __GFP_ZERO;
 
@@ -837,7 +845,7 @@ EXPORT_SYMBOL(ttm_pool_alloc);
  * Returns: 0 on successe, negative error code otherwise.
  */
 int ttm_pool_restore_and_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
-			       const struct ttm_operation_ctx *ctx)
+			       struct ttm_operation_ctx *ctx)
 {
 	struct ttm_pool_alloc_state alloc;
 
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 506e257dfba8501815f8416e808f437e5f17aa8f..e1975d740b948f9b7fe1d35d913a458026d2c783 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -294,7 +294,7 @@ long ttm_tt_backup(struct ttm_device *bdev, struct ttm_tt *tt,
 }
 
 int ttm_tt_restore(struct ttm_device *bdev, struct ttm_tt *tt,
-		   const struct ttm_operation_ctx *ctx)
+		   struct ttm_operation_ctx *ctx)
 {
 	int ret = ttm_pool_restore_and_alloc(&bdev->pool, tt, ctx);
 
diff --git a/include/drm/ttm/ttm_pool.h b/include/drm/ttm/ttm_pool.h
index 54cd34a6e4c0ac5e17844b50fd08e72143b460c1..08f9a1388754fac352058ac2beb2b59bb944477c 100644
--- a/include/drm/ttm/ttm_pool.h
+++ b/include/drm/ttm/ttm_pool.h
@@ -95,7 +95,7 @@ void ttm_pool_drop_backed_up(struct ttm_tt *tt);
 long ttm_pool_backup(struct ttm_pool *pool, struct ttm_tt *ttm,
 		     const struct ttm_backup_flags *flags);
 int ttm_pool_restore_and_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
-			       const struct ttm_operation_ctx *ctx);
+			       struct ttm_operation_ctx *ctx);
 
 int ttm_pool_mgr_init(unsigned long num_pages);
 void ttm_pool_mgr_fini(void);
diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index 406437ad674bf1a96527b45c5a81c58a747271c1..3575e20b77f3ccbc3d9aad0afbb762055b3cb139 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -296,7 +296,7 @@ long ttm_tt_backup(struct ttm_device *bdev, struct ttm_tt *tt,
 		   const struct ttm_backup_flags flags);
 
 int ttm_tt_restore(struct ttm_device *bdev, struct ttm_tt *tt,
-		   const struct ttm_operation_ctx *ctx);
+		   struct ttm_operation_ctx *ctx);
 
 int ttm_tt_setup_backup(struct ttm_tt *tt);
 
-- 
2.47.3

From nobody Thu Oct 2 07:49:10 2025
Received: from fanzine2.igalia.com
(fanzine2.igalia.com [213.97.179.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B1FB231D37A for ; Thu, 18 Sep 2025 20:25:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758227124; cv=none; b=Piscm/QJ+TCLrL6wExP/GHu1777TRBpCoue7kaucAhHTXnu+4o9+fAbJI4fKAyPGS+4mH8oey2UZWmrXbzZmsHxMpTM3xJdEQdfwo583EThg/3VToA2FwJfjuwsbJiDWZjlYwjboi6lAY67ZfDE8eAbgJ1Nx5713Hq7JS8r+GoE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758227124; c=relaxed/simple; bh=oPOfnEc3fKY86qUChaHtQCAhClYdLf7PDqq1u/77YT4=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Rwgkoj7as2RLwMiu3TWDzWTOSJBi6ij1lAk1TLGaYhUXP/i9z+/TwGuERWM6X5Io+RbwE6ATyhS9qSDnbEDFHI1p05Jrz8kjytuFL9Pzb/kniKC0t1gIv327MUv1CDAEyUskEuV32hpA5ceIeS49HaTxUbTaGckaBqFusM7XnX4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=dL5GL0et; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="dL5GL0et" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Cc:To:In-Reply-To:References:Message-Id: Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From:Sender: Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender :Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: 
	List-Subscribe:List-Post:List-Owner:List-Archive;
	bh=WL6Kw9YUWCG0W8YCQWVk37aQPOOB+kRzsZn1Ep3RQ9E=; b=dL5GL0etsWf87csUiJUWIysuyb
	8NGFGem8xV12Rc4e5BNHP74oOwMmbfW/kFPjw1xU35v81QjAQZv4FCP4qhq5GWM1UDq5gNvQwD8dd
	bqWJbsl2EHJfxtcc0v2n4dFT4ss+XU0codakbqRYSO64j0Q3s4S6yIkM/tVF5XZMK9ha0S09t6cez
	RtDJYqNxYhlTq1viV5FQ8uyyyW1g+idVZ8mzPGlOIWw/T4sqc2iF4LeNfWvFmqvqB/lsPYD6YVVhI
	ZkpgiCmXCduHtdFGfwEU+IhkJe9wmGqih7CReNYgCxNSXGul1/dza0OdU1cVC0SQKqhResx1r14pU
	EZCxTu8w==;
Received: from 179-125-87-227-dinamico.pombonet.net.br ([179.125.87.227] helo=[127.0.0.1])
	by fanzine2.igalia.com with esmtpsa
	(Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim)
	id 1uzLBu-00Dp3q-CC; Thu, 18 Sep 2025 22:25:18 +0200
From: Thadeu Lima de Souza Cascardo
Date: Thu, 18 Sep 2025 17:09:26 -0300
Subject: [PATCH RFC v2 3/3] drm/amdgpu: allow allocation preferences when creating GEM object
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20250918-ttm_pool_no_direct_reclaim-v2-3-135294e1f8a2@igalia.com>
References: <20250918-ttm_pool_no_direct_reclaim-v2-0-135294e1f8a2@igalia.com>
In-Reply-To: <20250918-ttm_pool_no_direct_reclaim-v2-0-135294e1f8a2@igalia.com>
To: Christian Koenig, Michel Dänzer, Huang Rui, Matthew Auld,
 Matthew Brost, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
 David Airlie, Simona Vetter
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org, kernel-dev@igalia.com, Tvrtko Ursulin,
 Sergey Senozhatsky, Thadeu Lima de Souza Cascardo
X-Mailer: b4 0.14.2

When creating a GEM object on amdgpu, userspace may request that latency
during allocation be preferred over throughput when processing.
That preference is carried into the TTM operation context. When throughput
is preferred, direct reclaim is used for higher order pages even if the
system is configured to prefer latency.

When latency is preferred, no direct reclaim is used for higher order pages,
which might lead to more use of lower order pages and can in turn compromise
throughput.

Signed-off-by: Thadeu Lima de Souza Cascardo
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
 include/uapi/drm/amdgpu_drm.h              | 9 +++++++++
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d1ccbfcf21fa62a8d4fe1b8f020cf00d34efe1ab..0a0333e7ed1a45de63fedfbc161094f6de7fda00 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -451,7 +451,8 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
 		      AMDGPU_GEM_CREATE_EXPLICIT_SYNC |
 		      AMDGPU_GEM_CREATE_ENCRYPTED |
 		      AMDGPU_GEM_CREATE_GFX12_DCC |
-		      AMDGPU_GEM_CREATE_DISCARDABLE))
+		      AMDGPU_GEM_CREATE_DISCARDABLE |
+		      AMDGPU_GEM_ALLOCATION_MASK))
 		return -EINVAL;
 
 	/* reject invalid gem domains */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 122a882948839464dc197d40ff8e46cf161f7b42..54350460bb41e4bc057eb61d7bb6014457e56c6e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -632,7 +632,8 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 		/* We opt to avoid OOM on system pages allocations */
 		.gfp_retry_mayfail = true,
 		.allow_res_evict = bp->type != ttm_bo_type_kernel,
-		.resv = bp->resv
+		.resv = bp->resv,
+		.alloc_method = AMDGPU_GEM_ALLOCATION(bp->flags)
 	};
 	struct amdgpu_bo *bo;
 	unsigned long page_align, size = bp->size;
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index bdedbaccf776db0c86cec939725a435c37f09f77..b796744abeba2bf4b14556251b36938ba0905c1e 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -180,6 +180,15 @@ extern "C" {
 /* Set PTE.D and recompress during GTT->VRAM moves according to TILING flags. */
 #define AMDGPU_GEM_CREATE_GFX12_DCC		(1 << 16)
 
+/* Prioritize allocation latency or high-order allocations that favor
+ * throughput */
+#define AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT	(17)
+#define AMDGPU_GEM_ALLOCATION_DEFAULT		(0 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+#define AMDGPU_GEM_ALLOCATION_LATENCY		(2 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+#define AMDGPU_GEM_ALLOCATION_THROUGHPUT	(3 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+#define AMDGPU_GEM_ALLOCATION_MASK		(3 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+#define AMDGPU_GEM_ALLOCATION(flags)		((flags & AMDGPU_GEM_ALLOCATION_MASK) >> AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+
 struct drm_amdgpu_gem_create_in {
 	/** the requested memory size */
 	__u64 bo_size;
-- 
2.47.3
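
To make the flag encoding in this patch easier to follow, the standalone C
sketch below decodes the two-bit allocation field from a GEM create flags
word and then applies the higher-order rule from patch 1/3. It is only a
model under stated assumptions: the AMDGPU_GEM_* macros are copied from the
hunk above (with the macro argument parenthesized, which is an addition of
this sketch), while gem_flags_to_alloc_method(), sketch_adjust_gfp() and
SKETCH_GFP_DIRECT_RECLAIM are hypothetical names that do not exist in the
kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the include/uapi/drm/amdgpu_drm.h hunk of this patch. */
#define AMDGPU_GEM_CREATE_GFX12_DCC		(1 << 16)
#define AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT	(17)
#define AMDGPU_GEM_ALLOCATION_DEFAULT		(0 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
#define AMDGPU_GEM_ALLOCATION_LATENCY		(2 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
#define AMDGPU_GEM_ALLOCATION_THROUGHPUT	(3 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
#define AMDGPU_GEM_ALLOCATION_MASK		(3 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
/* Argument parenthesized here; the patch uses a bare "flags". */
#define AMDGPU_GEM_ALLOCATION(flags) \
	(((flags) & AMDGPU_GEM_ALLOCATION_MASK) >> AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)

/* Values of the enum added to struct ttm_operation_ctx in patch 1/3. */
enum {
	ttm_op_alloc_default = 0,
	ttm_op_alloc_latency = 2,
	ttm_op_alloc_throughput = 3,
};

/* Hypothetical stand-in for the kernel's __GFP_DIRECT_RECLAIM bit. */
#define SKETCH_GFP_DIRECT_RECLAIM 0x1u

/* Hypothetical helper: the decode amdgpu_bo_create() performs on bp->flags. */
unsigned int gem_flags_to_alloc_method(uint64_t flags)
{
	return (unsigned int)AMDGPU_GEM_ALLOCATION(flags);
}

/*
 * Hypothetical helper: models the rule in ttm_pool_alloc_page(). Direct
 * reclaim is only stripped for order > 0 requests that prefer latency.
 */
unsigned int sketch_adjust_gfp(unsigned int gfp, unsigned int order,
			       unsigned int method)
{
	if (order && method == ttm_op_alloc_latency)
		gfp &= ~SKETCH_GFP_DIRECT_RECLAIM;
	return gfp;
}
```

The shifted field values (0, 2, 3) line up with the ttm_op_alloc_* enum
values, which is what lets amdgpu_bo_create() assign the decoded field to
.alloc_method directly. Order-0 allocations keep direct reclaim in this
model regardless of the preference.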