From: Thadeu Lima de Souza Cascardo
Date: Thu, 18 Sep 2025 17:09:26 -0300
Subject: [PATCH RFC v2 3/3] drm/amdgpu: allow allocation preferences when creating GEM object
Message-Id: <20250918-ttm_pool_no_direct_reclaim-v2-3-135294e1f8a2@igalia.com>
References: <20250918-ttm_pool_no_direct_reclaim-v2-0-135294e1f8a2@igalia.com>
In-Reply-To: <20250918-ttm_pool_no_direct_reclaim-v2-0-135294e1f8a2@igalia.com>
To: Christian Koenig, Michel Dänzer, Huang Rui, Matthew Auld,
 Matthew Brost, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
 David Airlie, Simona Vetter
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org, kernel-dev@igalia.com, Tvrtko Ursulin,
 Sergey Senozhatsky, Thadeu Lima de Souza Cascardo
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: b4 0.14.2

When creating a GEM object on amdgpu, the caller may specify that
allocation latency should be preferred over throughput when processing.
That preference is reflected in the TTM operation. When throughput is
preferred, direct reclaim is used for higher-order pages, even if the
system is configured to prefer latency. When latency is preferred, no
direct reclaim is used for higher-order pages, which might lead to more
use of lower-order pages, which can also compromise throughput.

Signed-off-by: Thadeu Lima de Souza Cascardo
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    | 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
 include/uapi/drm/amdgpu_drm.h              | 9 +++++++++
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d1ccbfcf21fa62a8d4fe1b8f020cf00d34efe1ab..0a0333e7ed1a45de63fedfbc161094f6de7fda00 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -451,7 +451,8 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
 		      AMDGPU_GEM_CREATE_EXPLICIT_SYNC |
 		      AMDGPU_GEM_CREATE_ENCRYPTED |
 		      AMDGPU_GEM_CREATE_GFX12_DCC |
-		      AMDGPU_GEM_CREATE_DISCARDABLE))
+		      AMDGPU_GEM_CREATE_DISCARDABLE |
+		      AMDGPU_GEM_ALLOCATION_MASK))
 		return -EINVAL;
 
 	/* reject invalid gem domains */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 122a882948839464dc197d40ff8e46cf161f7b42..54350460bb41e4bc057eb61d7bb6014457e56c6e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -632,7 +632,8 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 		/* We opt to avoid OOM on system pages allocations */
 		.gfp_retry_mayfail = true,
 		.allow_res_evict = bp->type != ttm_bo_type_kernel,
-		.resv = bp->resv
+		.resv = bp->resv,
+		.alloc_method = AMDGPU_GEM_ALLOCATION(bp->flags)
 	};
 	struct amdgpu_bo *bo;
 	unsigned long page_align, size = bp->size;
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index bdedbaccf776db0c86cec939725a435c37f09f77..b796744abeba2bf4b14556251b36938ba0905c1e 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -180,6 +180,15 @@ extern "C" {
 /* Set PTE.D and recompress during GTT->VRAM moves according to TILING flags. */
 #define AMDGPU_GEM_CREATE_GFX12_DCC		(1 << 16)
 
+/* Prioritize allocation latency or high-order allocations that favor
+ * throughput */
+#define AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT	(17)
+#define AMDGPU_GEM_ALLOCATION_DEFAULT		(0 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+#define AMDGPU_GEM_ALLOCATION_LATENCY		(2 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+#define AMDGPU_GEM_ALLOCATION_THROUGHPUT	(3 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+#define AMDGPU_GEM_ALLOCATION_MASK		(3 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+#define AMDGPU_GEM_ALLOCATION(flags)		((flags & AMDGPU_GEM_ALLOCATION_MASK) >> AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
+
 struct drm_amdgpu_gem_create_in  {
 	/** the requested memory size */
 	__u64 bo_size;
-- 
2.47.3
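
Illustration only, not part of the patch: a minimal standalone sketch of
how the two-bit allocation preference introduced above could be packed
into and read back from the GEM creation flags. The macros are copied
from the uapi hunk; the helpers example_set_alloc_pref() and
example_get_alloc_pref() are hypothetical names that do not exist in the
kernel or in libdrm, and treating the flags as a 64-bit value is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Definitions copied from the uapi hunk in this patch. */
#define AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT	(17)
#define AMDGPU_GEM_ALLOCATION_DEFAULT		(0 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
#define AMDGPU_GEM_ALLOCATION_LATENCY		(2 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
#define AMDGPU_GEM_ALLOCATION_THROUGHPUT	(3 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
#define AMDGPU_GEM_ALLOCATION_MASK		(3 << AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)
#define AMDGPU_GEM_ALLOCATION(flags)		((flags & AMDGPU_GEM_ALLOCATION_MASK) >> AMDGPU_GEM_OVERRIDE_ALLOCATION_SHIFT)

/* Neighbouring flag shown as context in the same hunk, repeated here so
 * the sketch compiles on its own. */
#define AMDGPU_GEM_CREATE_GFX12_DCC		(1 << 16)

/* The new field occupies bits 17-18 and must not overlap bit 16. */
static_assert((AMDGPU_GEM_ALLOCATION_MASK & AMDGPU_GEM_CREATE_GFX12_DCC) == 0,
	      "allocation field overlaps an existing flag");

/* Hypothetical helper: replace the preference bits and leave every
 * other creation flag untouched. */
static inline uint64_t example_set_alloc_pref(uint64_t flags, uint64_t pref)
{
	return (flags & ~(uint64_t)AMDGPU_GEM_ALLOCATION_MASK) |
	       (pref & AMDGPU_GEM_ALLOCATION_MASK);
}

/* Hypothetical helper: return the decoded field value, i.e. what the
 * patch assigns to .alloc_method in amdgpu_bo_create(). */
static inline unsigned int example_get_alloc_pref(uint64_t flags)
{
	return (unsigned int)AMDGPU_GEM_ALLOCATION(flags);
}
```

With these definitions, a caller that wants a latency-friendly buffer
would OR AMDGPU_GEM_ALLOCATION_LATENCY into its creation flags, and the
decoded field value would be 2. The field value 1 has no name in the
patch, so this sketch never produces it.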