From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org, hughd@google.com
Cc: patryk@kowalczyk.ws, ville.syrjala@linux.intel.com, david@redhat.com,
    willy@infradead.org, maarten.lankhorst@linux.intel.com, mripard@kernel.org,
    tzimmermann@suse.de, airlied@gmail.com, simona@ffwll.ch,
    jani.nikula@linux.intel.com, joonas.lahtinen@linux.intel.com,
    rodrigo.vivi@intel.com, tursulin@ursulin.net, christian.koenig@amd.com,
    ray.huang@amd.com, matthew.auld@intel.com, matthew.brost@intel.com,
    dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH] mm: shmem: fix the shmem large folio allocation for the i915 driver
Date: Mon, 28 Jul 2025 16:03:53 +0800
Message-ID: <0d734549d5ed073c80b11601da3abdd5223e1889.1753689802.git.baolin.wang@linux.alibaba.com>

After commit acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs"),
the 'huge=' option was extended to allow any sized large folios for tmpfs.
This means tmpfs derives a highest-order hint from the size of write() and
fallocate() requests, and then tries each allowable large order. However,
when the i915 driver allocates shmem memory, it does not provide any hint
about the size of the large folio to be allocated, so PMD-sized shmem
cannot be allocated, which in turn hurts GPU performance.
To fix this issue, add an 'end' parameter to shmem_read_folio_gfp() to help
allocate PMD-sized large folios. Additionally, use the maximum allocation
chunk (via mapping_max_folio_size()) to determine the size of the large
folios to allocate in the i915 driver.

Fixes: acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")
Reported-by: Patryk Kowalczyk <patryk@kowalczyk.ws>
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Patryk Kowalczyk <patryk@kowalczyk.ws>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 drivers/gpu/drm/drm_gem.c                 | 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 7 ++++++-
 drivers/gpu/drm/ttm/ttm_backup.c          | 2 +-
 include/linux/shmem_fs.h                  | 4 ++--
 mm/shmem.c                                | 7 ++++---
 5 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 4bf0a76bb35e..5ed34a9211a4 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -627,7 +627,7 @@ struct page **drm_gem_get_pages(struct drm_gem_object *obj)
 	i = 0;
 	while (i < npages) {
 		long nr;
-		folio = shmem_read_folio_gfp(mapping, i,
+		folio = shmem_read_folio_gfp(mapping, i, 0,
 				mapping_gfp_mask(mapping));
 		if (IS_ERR(folio))
 			goto fail;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index f263615f6ece..778290f49853 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -69,6 +69,7 @@ int shmem_sg_alloc_table(struct drm_i915_private *i915, struct sg_table *st,
 	struct scatterlist *sg;
 	unsigned long next_pfn = 0;	/* suppress gcc warning */
 	gfp_t noreclaim;
+	size_t chunk;
 	int ret;
 
 	if (overflows_type(size / PAGE_SIZE, page_count))
@@ -94,6 +95,7 @@ int shmem_sg_alloc_table(struct drm_i915_private *i915, struct sg_table *st,
 	mapping_set_unevictable(mapping);
 	noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
 	noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
+	chunk = mapping_max_folio_size(mapping);
 
 	sg = st->sgl;
 	st->nents = 0;
@@ -105,10 +107,13 @@ int shmem_sg_alloc_table(struct drm_i915_private *i915, struct sg_table *st,
 			0,
 		}, *s = shrink;
 		gfp_t gfp = noreclaim;
+		loff_t bytes = (page_count - i) << PAGE_SHIFT;
+		loff_t pos = i << PAGE_SHIFT;
 
+		bytes = min_t(loff_t, chunk, bytes);
 		do {
 			cond_resched();
-			folio = shmem_read_folio_gfp(mapping, i, gfp);
+			folio = shmem_read_folio_gfp(mapping, i, pos + bytes, gfp);
 			if (!IS_ERR(folio))
 				break;
 
diff --git a/drivers/gpu/drm/ttm/ttm_backup.c b/drivers/gpu/drm/ttm/ttm_backup.c
index 6f2e58be4f3e..0c90ae338afb 100644
--- a/drivers/gpu/drm/ttm/ttm_backup.c
+++ b/drivers/gpu/drm/ttm/ttm_backup.c
@@ -100,7 +100,7 @@ ttm_backup_backup_page(struct file *backup, struct page *page,
 	struct folio *to_folio;
 	int ret;
 
-	to_folio = shmem_read_folio_gfp(mapping, idx, alloc_gfp);
+	to_folio = shmem_read_folio_gfp(mapping, idx, 0, alloc_gfp);
 	if (IS_ERR(to_folio))
 		return PTR_ERR(to_folio);
 
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 6d0f9c599ff7..203eebad6b38 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -156,12 +156,12 @@ enum sgp_type {
 int shmem_get_folio(struct inode *inode, pgoff_t index, loff_t write_end,
 		struct folio **foliop, enum sgp_type sgp);
 struct folio *shmem_read_folio_gfp(struct address_space *mapping,
-		pgoff_t index, gfp_t gfp);
+		pgoff_t index, loff_t end, gfp_t gfp);
 
 static inline struct folio *shmem_read_folio(struct address_space *mapping,
 		pgoff_t index)
 {
-	return shmem_read_folio_gfp(mapping, index, mapping_gfp_mask(mapping));
+	return shmem_read_folio_gfp(mapping, index, 0, mapping_gfp_mask(mapping));
 }
 
 static inline struct page *shmem_read_mapping_page(
diff --git a/mm/shmem.c b/mm/shmem.c
index e6cdfda08aed..c79f5760cfc9 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -5960,6 +5960,7 @@ int shmem_zero_setup(struct vm_area_struct *vma)
  * shmem_read_folio_gfp - read into page cache, using specified page allocation flags.
  * @mapping:	the folio's address_space
  * @index:	the folio index
+ * @end:	end of a read if allocating a new folio
  * @gfp:	the page allocator flags to use if allocating
  *
  * This behaves as a tmpfs "read_cache_page_gfp(mapping, index, gfp)",
@@ -5972,14 +5973,14 @@ int shmem_zero_setup(struct vm_area_struct *vma)
  * with the mapping_gfp_mask(), to avoid OOMing the machine unnecessarily.
  */
 struct folio *shmem_read_folio_gfp(struct address_space *mapping,
-		pgoff_t index, gfp_t gfp)
+		pgoff_t index, loff_t end, gfp_t gfp)
 {
 #ifdef CONFIG_SHMEM
 	struct inode *inode = mapping->host;
 	struct folio *folio;
 	int error;
 
-	error = shmem_get_folio_gfp(inode, index, 0, &folio, SGP_CACHE,
+	error = shmem_get_folio_gfp(inode, index, end, &folio, SGP_CACHE,
 			gfp, NULL, NULL);
 	if (error)
 		return ERR_PTR(error);
@@ -5998,7 +5999,7 @@ EXPORT_SYMBOL_GPL(shmem_read_folio_gfp);
 struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
 					 pgoff_t index, gfp_t gfp)
 {
-	struct folio *folio = shmem_read_folio_gfp(mapping, index, gfp);
+	struct folio *folio = shmem_read_folio_gfp(mapping, index, 0, gfp);
 	struct page *page;
 
 	if (IS_ERR(folio))
-- 
2.43.5