From: Daniel Colascione
To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
    Christian Koenig, Huang Rui, Matthew Auld, Matthew Brost,
    Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
    Simona Vetter, Thomas Hellström, linux-kernel@vger.kernel.org
Subject: [RFC PATCH] Limit reclaim to avoid TTM desktop stutter under mem pressure
Date: Tue, 31 Mar 2026 22:08:58 -0400
Message-ID: <87341fsa85.fsf@dancol.org>

TTM seems to be too eager to kick off reclaim while kwin is drawing

I've noticed that in 7.0-rc6, and since at least 6.17, kwin_wayland
stalls in DRM ioctls to xe when the system is under memory pressure,
causing missed frames, cursor-movement stutter, and general
sluggishness.

The root cause seems to be synchronous and asynchronous reclaim in
ttm_pool_alloc_page as TTM tries, and fails, to allocate progressively
lower-order pages in response to pool-cache misses when allocating
graphics buffers. Memory is fragmented enough that the compaction fails
(as I can see in compact_fail and compact_stall in /proc/vmstat; extfrag
says the normal pool is unusable for large allocations too).
Additionally, compaction seems to be emptying the ttm pool, since
page_pool in TTM debugfs reports all the buckets are empty while I'm
seeing the kwin_wayland sluggishness. In profiles, I see time dominated
by copy_pages and clear_pages in the TTM paging code. kswapd runs
constantly despite the system as a whole having plenty of free memory.

I can reproduce the problem on my 32GB-RAM X1C Gen 13 by booting with
kernelcore=8G (not needed, but makes the repro happen sooner), running a
find / >/dev/null (to fragment memory), and doing general web browsing.
The stalls seem self-perpetuating once they start; they persist even
after killing the find. I've noticed this stall in ordinary use too,
even without the kernelcore= zone tweak, but without kernelcore, it
usually takes a while (hours?) after boot for memory to become
fragmented enough that higher-order allocations fail.

The patch below fixes the issue for me. To be clear, I'm not sure it's
the _right_ fix, but it works for me. I'm guessing that even if the
approach is right, a new module parameter isn't warranted.

With the patch below, when I set my new max_reclaim_order ttm module
parameter to zero, the kwin_wayland stalls under memory pressure stop.
(To be clear, this setting inhibits sync or async reclaim except for
order-zero pages.)

TTM allocation occurs in latency-critical paths (e.g. Wayland frame
commit): do you think we _should_ reclaim here?

BTW, I also tried having xe pass a beneficial order of 9, but it didn't
help: we end up doing a lot of compaction work below this order anyway.
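
For readers who want to see the effect of max_reclaim_order on each
allocation order, here is a minimal userspace sketch of the flag masking
in the patched ttm_pool_alloc_page(). It is an illustration under stated
assumptions, not kernel code: the SK_* bit values are placeholders rather
than the kernel's real __GFP_* values, and sk_adjust_flags() and
sk_self_check() are hypothetical names invented for this sketch. The one
kernel fact it relies on is that __GFP_RECLAIM covers both the direct and
the kswapd reclaim bits.

```c
#include <assert.h>

/*
 * Placeholder bits for this sketch only. They are NOT the kernel's
 * real __GFP_* values.
 */
#define SK_DIRECT_RECLAIM 0x1u /* stands in for __GFP_DIRECT_RECLAIM */
#define SK_KSWAPD_RECLAIM 0x2u /* stands in for __GFP_KSWAPD_RECLAIM */
#define SK_RECLAIM        (SK_DIRECT_RECLAIM | SK_KSWAPD_RECLAIM)
#define SK_NORETRY        0x4u /* stands in for __GFP_NORETRY */

/*
 * Hypothetical helper mirroring the masking in the patch: order-zero
 * requests keep their flags; higher orders lose direct reclaim above
 * beneficial_order and all reclaim above max_reclaim_order.
 */
static unsigned int sk_adjust_flags(unsigned int flags, unsigned int order,
				    unsigned int beneficial_order,
				    unsigned int max_reclaim_order)
{
	if (order) {
		flags |= SK_NORETRY;
		if (beneficial_order && order > beneficial_order)
			flags &= ~SK_DIRECT_RECLAIM;
		if (order > max_reclaim_order)
			flags &= ~SK_RECLAIM;
	}
	return flags;
}

/* Checks the max_reclaim_order = 0 case described in the text. */
static void sk_self_check(void)
{
	unsigned int order;

	/* Order 0 still allows both sync and async reclaim. */
	assert(sk_adjust_flags(SK_RECLAIM, 0, 0, 0) == SK_RECLAIM);

	/* Every higher order has both reclaim bits cleared. */
	for (order = 1; order <= 9; order++)
		assert(!(sk_adjust_flags(SK_RECLAIM, order, 0, 0) & SK_RECLAIM));
}
```

Calling sk_self_check() from a small main() exercises the
max_reclaim_order = 0 setting; passing a larger max_reclaim_order to
sk_adjust_flags() shows which orders would keep the existing behavior.
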

Signed-off-by: Daniel Colascione

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index c0d95559197c..fd255914c0d3 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -115,9 +115,13 @@ struct ttm_pool_tt_restore {
 };
 
 static unsigned long page_pool_size;
+static unsigned int max_reclaim_order;
 
 MODULE_PARM_DESC(page_pool_size, "Number of pages in the WC/UC/DMA pool");
 module_param(page_pool_size, ulong, 0644);
+MODULE_PARM_DESC(max_reclaim_order,
+		 "Maximum order that keeps upstream reclaim behavior");
+module_param(max_reclaim_order, uint, 0644);
 
 static atomic_long_t allocated_pages;
 
@@ -146,16 +150,14 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 	 * Mapping pages directly into an userspace process and calling
 	 * put_page() on a TTM allocated page is illegal.
 	 */
-	if (order)
+	if (order) {
 		gfp_flags |= __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN |
 			__GFP_THISNODE;
-
-	/*
-	 * Do not add latency to the allocation path for allocations orders
-	 * device tolds us do not bring them additional performance gains.
-	 */
-	if (beneficial_order && order > beneficial_order)
-		gfp_flags &= ~__GFP_DIRECT_RECLAIM;
+		if (beneficial_order && order > beneficial_order)
+			gfp_flags &= ~__GFP_DIRECT_RECLAIM;
+		if (order > max_reclaim_order)
+			gfp_flags &= ~__GFP_RECLAIM;
+	}
 
 	if (!ttm_pool_uses_dma_alloc(pool)) {
 		p = alloc_pages_node(pool->nid, gfp_flags, order);