From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org
Subject: [PATCH v2 01/13] mm: Support order-1 folios in the page cache
Date: Fri, 1 Mar 2024 17:44:32 +0100
Message-ID: <20240301164444.3799288-2-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>

From: "Matthew Wilcox (Oracle)"

Folios of order 1 have no space to store the deferred list.  This is
not a problem for the page cache as file-backed folios are never
placed on the deferred list.
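A userspace sketch of the rule this patch enforces (Python, illustration only — not part of the diff): `_deferred_list` lives in a tail page that only exists for folios of order 2 and above, so order-0 and order-1 folios must never touch the deferred split queue.

```python
# Userspace model (not kernel code) of which folios may use the
# deferred split list: the list head is stored in the second tail
# page, which an order-1 folio (2 pages) does not have.

def can_use_deferred_list(order: int) -> bool:
    """An order-n folio has 2**n pages; at least 4 pages (order 2)
    are needed for the tail page holding _deferred_list."""
    return order >= 2

def deferred_split_folio(order: int, queue: list) -> None:
    """Model of the new guard: silently skip order <= 1 folios."""
    if order <= 1:
        return
    queue.append(order)
```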
All we need to do is prevent the core MM from touching the deferred
list for order-1 folios, and remove the code which prevented us from
allocating order-1 folios.

Link: https://lore.kernel.org/linux-mm/90344ea7-4eec-47ee-5996-0c22f42d6a6a@google.com/
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Hannes Reinecke
---
 include/linux/huge_mm.h |  7 +++++--
 mm/filemap.c            |  2 --
 mm/huge_memory.c        | 23 ++++++++++++++++++-----
 mm/internal.h           |  4 +---
 mm/readahead.c          |  3 ---
 5 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 5adb86af35fc..916a2a539517 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -263,7 +263,7 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
 unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags);
 
-void folio_prep_large_rmappable(struct folio *folio);
+struct folio *folio_prep_large_rmappable(struct folio *folio);
 bool can_split_folio(struct folio *folio, int *pextra_pins);
 int split_huge_page_to_list(struct page *page, struct list_head *list);
 static inline int split_huge_page(struct page *page)
@@ -410,7 +410,10 @@ static inline unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
 	return 0;
 }
 
-static inline void folio_prep_large_rmappable(struct folio *folio) {}
+static inline struct folio *folio_prep_large_rmappable(struct folio *folio)
+{
+	return folio;
+}
 
 #define transparent_hugepage_flags 0UL
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 750e779c23db..2b00442b9d19 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1912,8 +1912,6 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 			gfp_t alloc_gfp = gfp;
 
 			err = -ENOMEM;
-			if (order == 1)
-				order = 0;
 			if (order > 0)
 				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
 			folio = filemap_alloc_folio(alloc_gfp, order);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 94c958f7ebb5..81fd1ba57088 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -788,11 +788,15 @@ struct deferred_split *get_deferred_split_queue(struct folio *folio)
 }
 #endif
 
-void folio_prep_large_rmappable(struct folio *folio)
+struct folio *folio_prep_large_rmappable(struct folio *folio)
 {
-	VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
-	INIT_LIST_HEAD(&folio->_deferred_list);
+	if (!folio || !folio_test_large(folio))
+		return folio;
+	if (folio_order(folio) > 1)
+		INIT_LIST_HEAD(&folio->_deferred_list);
 	folio_set_large_rmappable(folio);
+
+	return folio;
 }
 
 static inline bool is_transparent_hugepage(struct folio *folio)
@@ -3082,7 +3086,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	/* Prevent deferred_split_scan() touching ->_refcount */
 	spin_lock(&ds_queue->split_queue_lock);
 	if (folio_ref_freeze(folio, 1 + extra_pins)) {
-		if (!list_empty(&folio->_deferred_list)) {
+		if (folio_order(folio) > 1 &&
+		    !list_empty(&folio->_deferred_list)) {
 			ds_queue->split_queue_len--;
 			list_del(&folio->_deferred_list);
 		}
@@ -3133,6 +3138,9 @@ void folio_undo_large_rmappable(struct folio *folio)
 	struct deferred_split *ds_queue;
 	unsigned long flags;
 
+	if (folio_order(folio) <= 1)
+		return;
+
 	/*
	 * At this point, there is no one trying to add the folio to
 	 * deferred_list. If folio is not in deferred_list, it's safe
@@ -3158,7 +3166,12 @@ void deferred_split_folio(struct folio *folio)
 #endif
 	unsigned long flags;
 
-	VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
+	/*
+	 * Order 1 folios have no space for a deferred list, but we also
+	 * won't waste much memory by not adding them to the deferred list.
+	 */
+	if (folio_order(folio) <= 1)
+		return;
 
 	/*
 	 * The try_to_unmap() in page reclaim path might reach here too,
diff --git a/mm/internal.h b/mm/internal.h
index f309a010d50f..5174b5b0c344 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -419,9 +419,7 @@ static inline struct folio *page_rmappable_folio(struct page *page)
 {
 	struct folio *folio = (struct folio *)page;
 
-	if (folio && folio_order(folio) > 1)
-		folio_prep_large_rmappable(folio);
-	return folio;
+	return folio_prep_large_rmappable(folio);
 }
 
 static inline void prep_compound_head(struct page *page, unsigned int order)
diff --git a/mm/readahead.c b/mm/readahead.c
index 2648ec4f0494..369c70e2be42 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -516,9 +516,6 @@ void page_cache_ra_order(struct readahead_control *ractl,
 		/* Don't allocate pages past EOF */
 		while (index + (1UL << order) - 1 > limit)
 			order--;
-		/* THP machinery does not support order-1 */
-		if (order == 1)
-			order = 0;
 		err = ra_alloc_folio(ractl, index, mark, order, gfp);
 		if (err)
 			break;
-- 
2.43.0
From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org,
 Pankaj Raghav
Subject: [PATCH v2 02/13] fs: Allow fine-grained control of folio sizes
Date: Fri, 1 Mar 2024 17:44:33 +0100
Message-ID: <20240301164444.3799288-3-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>

From: "Matthew Wilcox (Oracle)"

Some filesystems want to be able to ensure that folios that are added to
the page cache are at least a certain size.  Add
mapping_set_folio_min_order() to allow this level of control.
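The flag packing this patch introduces can be sanity-checked with a small userspace Python model (illustration only, not kernel code; constants copied from the patch, assuming the non-THP `MAX_PAGECACHE_ORDER` of 8): the minimum order occupies bits 8-12 of `mapping->flags` and the maximum order bits 13-17.

```python
# Userspace model of the AS_FOLIO_ORDER_* packing from this patch.
AS_FOLIO_ORDER_MIN = 8
AS_FOLIO_ORDER_MAX = 13
AS_FOLIO_ORDER_MIN_MASK = 0x1f << AS_FOLIO_ORDER_MIN   # bits 8-12
AS_FOLIO_ORDER_MAX_MASK = 0x1f << AS_FOLIO_ORDER_MAX   # bits 13-17
AS_FOLIO_ORDER_MASK = AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK
MAX_PAGECACHE_ORDER = 8  # the !CONFIG_TRANSPARENT_HUGEPAGE value

def set_folio_min_order(flags: int, min_order: int) -> int:
    """Model of mapping_set_folio_min_order(): clamp, clear the
    order field, then pack min and max orders, preserving other bits."""
    min_order = min(min_order, MAX_PAGECACHE_ORDER)
    return ((flags & ~AS_FOLIO_ORDER_MASK) |
            (min_order << AS_FOLIO_ORDER_MIN) |
            (MAX_PAGECACHE_ORDER << AS_FOLIO_ORDER_MAX))

def min_folio_order(flags: int) -> int:
    return (flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN

def max_folio_order(flags: int) -> int:
    return (flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX
```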
Signed-off-by: Matthew Wilcox (Oracle)
Co-developed-by: Pankaj Raghav
Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
---
 include/linux/pagemap.h | 100 ++++++++++++++++++++++++++++++++--------
 1 file changed, 80 insertions(+), 20 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 2df35e65557d..fc8eb9c94e9c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -202,13 +202,18 @@ enum mapping_flags {
 	AS_EXITING	= 4, 	/* final truncate in progress */
 	/* writeback related tags are not used */
 	AS_NO_WRITEBACK_TAGS = 5,
-	AS_LARGE_FOLIO_SUPPORT = 6,
-	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
-	AS_STABLE_WRITES,	/* must wait for writeback before modifying
+	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
+	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
 				   folio contents */
-	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
+	AS_FOLIO_ORDER_MIN = 8,
+	AS_FOLIO_ORDER_MAX = 13,	/* Bit 8-17 are used for FOLIO_ORDER */
+	AS_UNMOVABLE = 18,		/* The mapping cannot be moved, ever */
 };
 
+#define AS_FOLIO_ORDER_MIN_MASK 0x00001f00
+#define AS_FOLIO_ORDER_MAX_MASK 0x0003e000
+#define AS_FOLIO_ORDER_MASK (AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK)
+
 /**
  * mapping_set_error - record a writeback error in the address_space
  * @mapping: the mapping in which an error should be set
@@ -344,9 +349,47 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
 	m->gfp_mask = mask;
 }
 
+/*
+ * There are some parts of the kernel which assume that PMD entries
+ * are exactly HPAGE_PMD_ORDER.  Those should be fixed, but until then,
+ * limit the maximum allocation order to PMD size.  I'm not aware of any
+ * assumptions about maximum order if THP are disabled, but 8 seems like
+ * a good order (that's 1MB if you're using 4kB pages)
+ */
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define MAX_PAGECACHE_ORDER	HPAGE_PMD_ORDER
+#else
+#define MAX_PAGECACHE_ORDER	8
+#endif
+
+/*
+ * mapping_set_folio_min_order() - Set the minimum folio order
+ * @mapping: The address_space.
+ * @min: Minimum folio order (between 0-MAX_PAGECACHE_ORDER inclusive).
+ *
+ * The filesystem should call this function in its inode constructor to
+ * indicate which base size of folio the VFS can use to cache the contents
+ * of the file.  This should only be used if the filesystem needs special
+ * handling of folio sizes (ie there is something the core cannot know).
+ * Do not tune it based on, eg, i_size.
+ *
+ * Context: This should not be called while the inode is active as it
+ * is non-atomic.
+ */
+static inline void mapping_set_folio_min_order(struct address_space *mapping,
+					       unsigned int min)
+{
+	if (min > MAX_PAGECACHE_ORDER)
+		min = MAX_PAGECACHE_ORDER;
+
+	mapping->flags = (mapping->flags & ~AS_FOLIO_ORDER_MASK) |
+			 (min << AS_FOLIO_ORDER_MIN) |
+			 (MAX_PAGECACHE_ORDER << AS_FOLIO_ORDER_MAX);
+}
+
 /**
  * mapping_set_large_folios() - Indicate the file supports large folios.
- * @mapping: The file.
+ * @mapping: The address_space.
  *
  * The filesystem should call this function in its inode constructor to
  * indicate that the VFS can use large folios to cache the contents of
@@ -357,7 +400,37 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
  */
 static inline void mapping_set_large_folios(struct address_space *mapping)
 {
-	__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	mapping_set_folio_min_order(mapping, 0);
+}
+
+static inline unsigned int mapping_max_folio_order(struct address_space *mapping)
+{
+	return (mapping->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX;
+}
+
+static inline unsigned int mapping_min_folio_order(struct address_space *mapping)
+{
+	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
+}
+
+static inline unsigned long mapping_min_folio_nrpages(struct address_space *mapping)
+{
+	return 1UL << mapping_min_folio_order(mapping);
+}
+
+/**
+ * mapping_align_start_index() - Align starting index based on the min
+ * folio order of the page cache.
+ * @mapping: The address_space.
+ *
+ * Ensure the index used is aligned to the minimum folio order when adding
+ * new folios to the page cache by rounding down to the nearest minimum
+ * folio number of pages.
+ */
+static inline pgoff_t mapping_align_start_index(struct address_space *mapping,
+						pgoff_t index)
+{
+	return round_down(index, mapping_min_folio_nrpages(mapping));
 }
 
 /*
@@ -367,7 +440,7 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
 static inline bool mapping_large_folio_support(struct address_space *mapping)
 {
 	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+		(mapping_max_folio_order(mapping) > 0);
 }
 
 static inline int filemap_nr_thps(struct address_space *mapping)
@@ -528,19 +601,6 @@ static inline void *detach_page_private(struct page *page)
 	return folio_detach_private(page_folio(page));
 }
 
-/*
- * There are some parts of the kernel which assume that PMD entries
- * are exactly HPAGE_PMD_ORDER.  Those should be fixed, but until then,
- * limit the maximum allocation order to PMD size.  I'm not aware of any
- * assumptions about maximum order if THP are disabled, but 8 seems like
- * a good order (that's 1MB if you're using 4kB pages)
- */
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define MAX_PAGECACHE_ORDER	HPAGE_PMD_ORDER
-#else
-#define MAX_PAGECACHE_ORDER	8
-#endif
-
 #ifdef CONFIG_NUMA
 struct folio *filemap_alloc_folio(gfp_t gfp, unsigned int order);
 #else
-- 
2.43.0
From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org,
 Pankaj Raghav
Subject: [PATCH v2 03/13] filemap: align the index to mapping_min_order in
 the page cache
Date: Fri, 1 Mar 2024 17:44:34 +0100
Message-ID: <20240301164444.3799288-4-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>

From: Luis Chamberlain

Supporting mapping_min_order implies that we guarantee each folio in the
page cache has at least an order of mapping_min_order.  So when adding
new folios to the page cache, we must ensure the index used is aligned
to mapping_min_order, as the page cache requires the index to be aligned
to the order of the folio.

A folio of higher order than min_order is by definition a multiple of
the min_order.  So if an index is aligned to an order higher than
min_order, it is also aligned to min_order.

This effectively introduces no functional change when min order is not
set, other than a few rounding computations that should result in the
same value.
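The alignment rule described above can be sketched as a small userspace Python model (illustration only, not kernel code; the kernel helper uses `round_down()` with `mapping_min_folio_nrpages()`):

```python
def mapping_align_start_index(min_order: int, index: int) -> int:
    """Model of the new helper: round a page index down to a
    multiple of the minimum folio size in pages (a power of two)."""
    nrpages = 1 << min_order           # mapping_min_folio_nrpages()
    return index & ~(nrpages - 1)      # round_down(index, nrpages)
```

Note that an index aligned to a higher order is automatically aligned to any lower min_order, since the larger power of two is a multiple of the smaller one.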
Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
---
 include/linux/pagemap.h | 10 +++++++++-
 mm/filemap.c            | 16 ++++++++++------
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index fc8eb9c94e9c..b3cf8ef89826 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1328,6 +1328,14 @@ struct readahead_control {
 		._index = i,						\
 	}
 
+#define DEFINE_READAHEAD_ALIGNED(ractl, f, r, m, i)		\
+	struct readahead_control ractl = {			\
+		.file = f,					\
+		.mapping = m,					\
+		.ra = r,					\
+		._index = mapping_align_start_index(m, i),	\
+	}
+
 #define VM_READAHEAD_PAGES	(SZ_128K / PAGE_SIZE)
 
 void page_cache_ra_unbounded(struct readahead_control *,
@@ -1356,7 +1364,7 @@ void page_cache_sync_readahead(struct address_space *mapping,
 		struct file_ra_state *ra, struct file *file, pgoff_t index,
 		unsigned long req_count)
 {
-	DEFINE_READAHEAD(ractl, file, ra, mapping, index);
+	DEFINE_READAHEAD_ALIGNED(ractl, file, ra, mapping, index);
 	page_cache_sync_ra(&ractl, req_count);
 }
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 2b00442b9d19..96fe5c7fe094 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2416,11 +2416,13 @@ static int filemap_update_page(struct kiocb *iocb,
 }
 
 static int filemap_create_folio(struct file *file,
-		struct address_space *mapping, pgoff_t index,
+		struct address_space *mapping, loff_t pos,
 		struct folio_batch *fbatch)
 {
 	struct folio *folio;
 	int error;
+	unsigned int min_order = mapping_min_folio_order(mapping);
+	pgoff_t index;
 
 	folio = filemap_alloc_folio(mapping_gfp_mask(mapping), 0);
 	if (!folio)
@@ -2440,6 +2442,8 @@ static int filemap_create_folio(struct file *file,
 	 * well to keep locking rules simple.
 	 */
 	filemap_invalidate_lock_shared(mapping);
+	/* index in PAGE units but aligned to min_order number of pages. */
+	index = (pos >> (PAGE_SHIFT + min_order)) << min_order;
 	error = filemap_add_folio(mapping, folio, index,
 			mapping_gfp_constraint(mapping, GFP_KERNEL));
 	if (error == -EEXIST)
@@ -2500,8 +2504,7 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
 	if (!folio_batch_count(fbatch)) {
 		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ))
 			return -EAGAIN;
-		err = filemap_create_folio(filp, mapping,
-				iocb->ki_pos >> PAGE_SHIFT, fbatch);
+		err = filemap_create_folio(filp, mapping, iocb->ki_pos, fbatch);
 		if (err == AOP_TRUNCATED_PAGE)
 			goto retry;
 		return err;
@@ -3093,7 +3096,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 	struct file *file = vmf->vma->vm_file;
 	struct file_ra_state *ra = &file->f_ra;
 	struct address_space *mapping = file->f_mapping;
-	DEFINE_READAHEAD(ractl, file, ra, mapping, vmf->pgoff);
+	DEFINE_READAHEAD_ALIGNED(ractl, file, ra, mapping, vmf->pgoff);
 	struct file *fpin = NULL;
 	unsigned long vm_flags = vmf->vma->vm_flags;
 	unsigned int mmap_miss;
@@ -3147,7 +3150,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
 	ra->start = max_t(long, 0, vmf->pgoff - ra->ra_pages / 2);
 	ra->size = ra->ra_pages;
 	ra->async_size = ra->ra_pages / 4;
-	ractl._index = ra->start;
+	ractl._index = mapping_align_start_index(mapping, ra->start);
 	page_cache_ra_order(&ractl, ra, 0);
 	return fpin;
 }
@@ -3162,7 +3165,7 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf,
 {
 	struct file *file = vmf->vma->vm_file;
 	struct file_ra_state *ra = &file->f_ra;
-	DEFINE_READAHEAD(ractl, file, ra, file->f_mapping, vmf->pgoff);
+	DEFINE_READAHEAD_ALIGNED(ractl, file, ra, file->f_mapping, vmf->pgoff);
 	struct file *fpin = NULL;
 	unsigned int mmap_miss;
 
@@ -3657,6 +3660,7 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
 	struct folio *folio;
 	int err;
 
+	index = mapping_align_start_index(mapping, index);
 	if (!filler)
 		filler = mapping->a_ops->read_folio;
 repeat:
-- 
2.43.0
From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org,
 Pankaj Raghav
Subject: [PATCH v2 04/13] filemap: use mapping_min_order while allocating
 folios
Date: Fri, 1 Mar 2024 17:44:35 +0100
Message-ID: <20240301164444.3799288-5-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>

From: Pankaj Raghav

filemap_create_folio() and do_read_cache_folio() were always allocating
folios of order 0.  __filemap_get_folio() would try to allocate higher
order folios when fgp_flags had a higher order hint set, but it fell
back to an order-0 folio if the higher order memory allocation failed.
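The fallback loop in question can be sketched as a userspace Python model (illustration only, not kernel code): on allocation failure the order steps down one at a time, and with this patch the loop stops at min_order rather than at order 0.

```python
def alloc_folio_with_fallback(order, min_order, try_alloc):
    """Model of the `do { ... } while (order-- > min_order)` retry
    loop in __filemap_get_folio(): try_alloc(order) returns True on
    success.  Returns the order actually allocated, or None when
    even min_order fails (the -ENOMEM case)."""
    order = max(order, min_order)     # never go below the mapping minimum
    while True:
        if try_alloc(order):
            return order
        if order == min_order:
            return None               # no smaller order is permitted
        order -= 1
```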
As we bring the notion of mapping_min_order, make sure these functions allocate at least folio of mapping_min_order as we need to guarantee it in the page cache. Add some additional VM_BUG_ON() in __filemap_add_folio to catch errors where we add folios that has order less than min_order. Signed-off-by: Pankaj Raghav Signed-off-by: Luis Chamberlain Reviewed-by: Hannes Reinecke Acked-by: Darrick J. Wong --- mm/filemap.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 96fe5c7fe094..3e621c6344f7 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -849,6 +849,8 @@ noinline int __filemap_add_folio(struct address_space *= mapping, =20 VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); VM_BUG_ON_FOLIO(folio_test_swapbacked(folio), folio); + VM_BUG_ON_FOLIO(folio_order(folio) < mapping_min_folio_order(mapping), + folio); mapping_set_update(&xas, mapping); =20 if (!huge) { @@ -1886,8 +1888,10 @@ struct folio *__filemap_get_folio(struct address_spa= ce *mapping, pgoff_t index, folio_wait_stable(folio); no_page: if (!folio && (fgp_flags & FGP_CREAT)) { - unsigned order =3D FGF_GET_ORDER(fgp_flags); + unsigned int min_order =3D mapping_min_folio_order(mapping); + unsigned int order =3D max(min_order, FGF_GET_ORDER(fgp_flags)); int err; + index =3D mapping_align_start_index(mapping, index); =20 if ((fgp_flags & FGP_WRITE) && mapping_can_writeback(mapping)) gfp |=3D __GFP_WRITE; @@ -1912,8 +1916,11 @@ struct folio *__filemap_get_folio(struct address_spa= ce *mapping, pgoff_t index, gfp_t alloc_gfp =3D gfp; =20 err =3D -ENOMEM; + if (order < min_order) + order =3D min_order; if (order > 0) alloc_gfp |=3D __GFP_NORETRY | __GFP_NOWARN; + folio =3D filemap_alloc_folio(alloc_gfp, order); if (!folio) continue; @@ -1927,7 +1934,7 @@ struct folio *__filemap_get_folio(struct address_spac= e *mapping, pgoff_t index, break; folio_put(folio); folio =3D NULL; - } while (order-- > 0); + } while (order-- > min_order); =20 if 
		if (err == -EEXIST)
			goto repeat;
@@ -2424,7 +2431,8 @@ static int filemap_create_folio(struct file *file,
 	unsigned int min_order = mapping_min_folio_order(mapping);
 	pgoff_t index;

-	folio = filemap_alloc_folio(mapping_gfp_mask(mapping), 0);
+	folio = filemap_alloc_folio(mapping_gfp_mask(mapping),
+				    min_order);
 	if (!folio)
 		return -ENOMEM;

@@ -3666,7 +3674,8 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
 repeat:
 	folio = filemap_get_folio(mapping, index);
 	if (IS_ERR(folio)) {
-		folio = filemap_alloc_folio(gfp, 0);
+		folio = filemap_alloc_folio(gfp,
+				mapping_min_folio_order(mapping));
 		if (!folio)
 			return ERR_PTR(-ENOMEM);
 		err = filemap_add_folio(mapping, folio, index, gfp);
-- 
2.43.0

From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org,
 Pankaj Raghav
Subject: [PATCH v2 05/13] readahead: round up file_ra_state->ra_pages to
 mapping_min_nrpages
Date: Fri, 1 Mar 2024 17:44:36 +0100
Message-ID: <20240301164444.3799288-6-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>

From: Luis Chamberlain

As we will be adding multiples of mapping_min_nrpages to the page
cache, initialize file_ra_state->ra_pages with bdi->ra_pages rounded up
to the nearest multiple of mapping_min_nrpages.

Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
---
 mm/readahead.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 369c70e2be42..6336c1736cc9 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -138,7 +138,8 @@ void
 file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
 {
-	ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
+	ra->ra_pages = round_up(inode_to_bdi(mapping->host)->ra_pages,
+				mapping_min_folio_nrpages(mapping));
 	ra->prev_pos = -1;
 }
 EXPORT_SYMBOL_GPL(file_ra_state_init);
-- 
2.43.0

From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org,
 Pankaj Raghav
Subject: [PATCH v2 06/13] readahead: align index to mapping_min_order in
 ondemand_ra and force_ra
Date: Fri, 1 Mar 2024 17:44:37 +0100
Message-ID: <20240301164444.3799288-7-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>

From: Luis Chamberlain

Align ra->start and ra->size to mapping_min_order in
ondemand_readahead(), and align the index to mapping_min_order in
force_page_cache_ra(). This ensures that the folios allocated for
readahead and added to the page cache are aligned to mapping_min_order.
Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
---
 mm/readahead.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 6336c1736cc9..0197cb91cf85 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -310,7 +310,9 @@ void force_page_cache_ra(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	struct file_ra_state *ra = ractl->ra;
 	struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
-	unsigned long max_pages, index;
+	unsigned long max_pages;
+	pgoff_t index, new_index;
+	unsigned long min_nrpages = mapping_min_folio_nrpages(mapping);

 	if (unlikely(!mapping->a_ops->read_folio && !mapping->a_ops->readahead))
 		return;
@@ -320,7 +322,14 @@ void force_page_cache_ra(struct readahead_control *ractl,
 	 * be up to the optimal hardware IO size
 	 */
 	index = readahead_index(ractl);
+	new_index = mapping_align_start_index(mapping, index);
+	if (new_index != index) {
+		nr_to_read += index - new_index;
+		index = new_index;
+	}
+
 	max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
+	max_pages = max_t(unsigned long, max_pages, min_nrpages);
 	nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
 	while (nr_to_read) {
 		unsigned long this_chunk = (2 * 1024 * 1024) / PAGE_SIZE;
@@ -328,6 +337,7 @@ void force_page_cache_ra(struct readahead_control *ractl,
 		if (this_chunk > nr_to_read)
 			this_chunk = nr_to_read;
 		ractl->_index = index;
+		VM_BUG_ON(!IS_ALIGNED(index, min_nrpages));
 		do_page_cache_ra(ractl, this_chunk, 0);

 		index += this_chunk;
@@ -554,8 +564,11 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	unsigned long add_pages;
 	pgoff_t index = readahead_index(ractl);
 	pgoff_t expected, prev_index;
-	unsigned int order = folio ? folio_order(folio) : 0;
+	unsigned int min_order = mapping_min_folio_order(ractl->mapping);
+	unsigned int min_nrpages = mapping_min_folio_nrpages(ractl->mapping);
+	unsigned int order = folio ? folio_order(folio) : min_order;

+	VM_BUG_ON(!IS_ALIGNED(index, min_nrpages));
 	/*
 	 * If the request exceeds the readahead window, allow the read to
 	 * be up to the optimal hardware IO size
@@ -577,7 +590,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 			1UL << order);
 	if (index == expected || index == (ra->start + ra->size)) {
 		ra->start += ra->size;
-		ra->size = get_next_ra_size(ra, max_pages);
+		ra->size = max(get_next_ra_size(ra, max_pages), min_nrpages);
 		ra->async_size = ra->size;
 		goto readit;
 	}
@@ -602,7 +615,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 		ra->start = start;
 		ra->size = start - index;	/* old async_size */
 		ra->size += req_size;
-		ra->size = get_next_ra_size(ra, max_pages);
+		ra->size = max(get_next_ra_size(ra, max_pages), min_nrpages);
 		ra->async_size = ra->size;
 		goto readit;
 	}
@@ -639,7 +652,7 @@ static void ondemand_readahead(struct readahead_control *ractl,

 initial_readahead:
 	ra->start = index;
-	ra->size = get_init_ra_size(req_size, max_pages);
+	ra->size = max(min_nrpages, get_init_ra_size(req_size, max_pages));
 	ra->async_size = ra->size > req_size ? ra->size - req_size : ra->size;

 readit:
@@ -650,7 +663,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	 * Take care of maximum IO pages as above.
 	 */
 	if (index == ra->start && ra->size == ra->async_size) {
-		add_pages = get_next_ra_size(ra, max_pages);
+		add_pages = max(get_next_ra_size(ra, max_pages), min_nrpages);
 		if (ra->size + add_pages <= max_pages) {
 			ra->async_size = add_pages;
 			ra->size += add_pages;
@@ -660,7 +673,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 		}
 	}

-	ractl->_index = ra->start;
+	ractl->_index = mapping_align_start_index(ractl->mapping, ra->start);
 	page_cache_ra_order(ractl, ra, order);
 }

-- 
2.43.0

From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org,
 Pankaj Raghav
Subject: [PATCH v2 07/13] readahead: rework loop in page_cache_ra_unbounded()
Date: Fri, 1 Mar 2024 17:44:38 +0100
Message-ID: <20240301164444.3799288-8-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>
From: Hannes Reinecke

Rework the loop in page_cache_ra_unbounded() to advance by the number
of pages in a folio instead of just one page at a time.

Signed-off-by: Hannes Reinecke
Co-developed-by: Pankaj Raghav
Signed-off-by: Pankaj Raghav
Acked-by: Darrick J. Wong
---
 mm/readahead.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 0197cb91cf85..65fbb9e78615 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -209,7 +209,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
-	unsigned long i;
+	unsigned long i = 0;

 	/*
 	 * Partway through the readahead operation, we will have added
@@ -227,7 +227,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	/*
 	 * Preallocate as many pages as we will need.
 	 */
-	for (i = 0; i < nr_to_read; i++) {
+	while (i < nr_to_read) {
 		struct folio *folio = xa_load(&mapping->i_pages, index + i);

 		if (folio && !xa_is_value(folio)) {
@@ -240,8 +240,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			 * not worth getting one just for that.
 			 */
 			read_pages(ractl);
-			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			ractl->_index += folio_nr_pages(folio);
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}

@@ -253,13 +253,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			folio_put(folio);
 			read_pages(ractl);
 			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 		if (i == nr_to_read - lookahead_size)
 			folio_set_readahead(folio);
 		ractl->_workingset |= folio_test_workingset(folio);
-		ractl->_nr_pages++;
+		ractl->_nr_pages += folio_nr_pages(folio);
+		i += folio_nr_pages(folio);
 	}

 	/*
-- 
2.43.0

From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org,
 Pankaj Raghav
Subject: [PATCH v2 08/13] readahead: allocate folios with mapping_min_order
 in ra_(unbounded|order)
Date: Fri, 1 Mar 2024 17:44:39 +0100
Message-ID: <20240301164444.3799288-9-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>

From: Pankaj Raghav

Allocate folios with at least mapping_min_order in
page_cache_ra_unbounded() and page_cache_ra_order(), as we need to
guarantee a minimum order in the page cache.

Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
Acked-by: Darrick J. Wong
Reviewed-by: Hannes Reinecke
---
 mm/readahead.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 65fbb9e78615..4e3a6f763f5c 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -210,6 +210,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
 	unsigned long i = 0;
+	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);

 	/*
 	 * Partway through the readahead operation, we will have added
@@ -231,6 +232,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 		struct folio *folio = xa_load(&mapping->i_pages, index + i);

 		if (folio && !xa_is_value(folio)) {
+			long nr_pages = folio_nr_pages(folio);
+
 			/*
 			 * Page already present? Kick off the current batch
 			 * of contiguous pages before continuing with the
@@ -240,19 +243,31 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			 * not worth getting one just for that.
 			 */
 			read_pages(ractl);
-			ractl->_index += folio_nr_pages(folio);
+
+			/*
+			 * Move the ractl->_index by at least min_pages
+			 * if the folio got truncated to respect the
+			 * alignment constraint in the page cache.
+			 *
+			 */
+			if (mapping != folio->mapping)
+				nr_pages = min_nrpages;
+
+			VM_BUG_ON_FOLIO(nr_pages < min_nrpages, folio);
+			ractl->_index += nr_pages;
 			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}

-		folio = filemap_alloc_folio(gfp_mask, 0);
+		folio = filemap_alloc_folio(gfp_mask,
+					    mapping_min_folio_order(mapping));
 		if (!folio)
 			break;
 		if (filemap_add_folio(mapping, folio, index + i, gfp_mask) < 0) {
 			folio_put(folio);
 			read_pages(ractl);
-			ractl->_index++;
+			ractl->_index += min_nrpages;
 			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
@@ -500,6 +515,7 @@ void page_cache_ra_order(struct readahead_control *ractl,
 {
 	struct address_space *mapping = ractl->mapping;
 	pgoff_t index = readahead_index(ractl);
+	unsigned int min_order = mapping_min_folio_order(mapping);
 	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
 	pgoff_t mark = index + ra->size - ra->async_size;
 	int err = 0;
@@ -526,8 +542,13 @@ void page_cache_ra_order(struct readahead_control *ractl,
 		if (index & ((1UL << order) - 1))
 			order = __ffs(index);
 		/* Don't allocate pages past EOF */
-		while (index + (1UL << order) - 1 > limit)
+		while (order > min_order && index + (1UL << order) - 1 > limit)
 			order--;
+
+		if (order < min_order)
+			order = min_order;
+
+		VM_BUG_ON(index & ((1UL << order) - 1));
 		err = ra_alloc_folio(ractl, index, mark, order, gfp);
 		if (err)
 			break;
-- 
2.43.0

From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org,
 Pankaj Raghav
Subject: [PATCH v2 09/13] mm: do not split a folio if it has minimum folio
 order requirement
Date: Fri, 1 Mar 2024 17:44:40 +0100
Message-ID: <20240301164444.3799288-10-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>

From: Pankaj Raghav

As we do not yet have a way to split a folio to any given lower folio
order, avoid splitting the folio in split_huge_page_to_list() if it has
a minimum folio order requirement.

Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
---
 mm/huge_memory.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 81fd1ba57088..6ec3417638a1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3030,6 +3030,19 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 			goto out;
 		}

+		/*
+		 * Do not split if mapping has minimum folio order
+		 * requirement.
+		 *
+		 * XXX: Once we have support for splitting to any lower
+		 * folio order, then it could be split based on the
+		 * min_folio_order.
+		 */
+		if (mapping_min_folio_order(mapping)) {
+			ret = -EAGAIN;
+			goto out;
+		}
+
 		gfp = current_gfp_context(mapping_gfp_mask(mapping) &
 					  GFP_RECLAIM_MASK);

-- 
2.43.0

From nobody Sun Feb 8 07:08:07 2026
From: "Pankaj Raghav (Samsung)"
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 david@fromorbit.com, akpm@linux-foundation.org, gost.dev@samsung.com,
 linux-kernel@vger.kernel.org, chandan.babu@oracle.com, willy@infradead.org,
 Pankaj Raghav
Subject: [PATCH v2 10/13] iomap: fix iomap_dio_zero() for fs bs > system
 page size
Date: Fri, 1 Mar 2024 17:44:41 +0100
Message-ID: <20240301164444.3799288-11-kernel@pankajraghav.com>
In-Reply-To: <20240301164444.3799288-1-kernel@pankajraghav.com>
References: <20240301164444.3799288-1-kernel@pankajraghav.com>
From: Pankaj Raghav

iomap_dio_zero() will pad a fs block with zeroes if the direct IO size <
fs block size. iomap_dio_zero() has an implicit assumption that fs block
size < page_size. This is true for most filesystems at the moment.

If the block size > page size, this will send the contents of the page
next to the zero page (as len > PAGE_SIZE) to the underlying block
device, causing FS corruption.

iomap is a generic infrastructure and it should not make any assumptions
about the fs block size and the page size of the system.

Signed-off-by: Pankaj Raghav
Reviewed-by: Darrick J. Wong
---
 fs/iomap/direct-io.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index bcd3f8cf5ea4..04f6c5548136 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -239,14 +239,23 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
 	struct page *page = ZERO_PAGE(0);
 	struct bio *bio;
 
-	bio = iomap_dio_alloc_bio(iter, dio, 1, REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
+	WARN_ON_ONCE(len > (BIO_MAX_VECS * PAGE_SIZE));
+
+	bio = iomap_dio_alloc_bio(iter, dio, BIO_MAX_VECS,
+				  REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
 	fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
 				  GFP_KERNEL);
+
 	bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos);
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
-	__bio_add_page(bio, page, len, 0);
+	while (len) {
+		unsigned int io_len = min_t(unsigned int, len, PAGE_SIZE);
+
+		__bio_add_page(bio, page, io_len, 0);
+		len -= io_len;
+	}
+
 	iomap_dio_submit_bio(iter, dio, bio, pos);
 }

-- 
2.43.0
From: "Pankaj Raghav (Samsung)"
Subject: [PATCH v2 11/13] xfs: expose block size in stat
Date: Fri, 1 Mar 2024 17:44:42 +0100
Message-ID: <20240301164444.3799288-12-kernel@pankajraghav.com>

From: Pankaj Raghav

For block size larger than page size, the unit of efficient IO is the
block size, not the page size. Leaving stat() to report PAGE_SIZE as the
block size causes test programs like fsx to issue illegal ranges for
operations that require block size alignment (e.g. fallocate() insert
range). Hence update the preferred IO size to reflect the block size in
this case.

This change is based on a patch originally from Dave Chinner. [1]

[1] https://lwn.net/ml/linux-fsdevel/20181107063127.3902-16-david@fromorbit.com/

Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
---
 fs/xfs/xfs_iops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index a0d77f5f512e..7ee829f7d708 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -543,7 +543,7 @@ xfs_stat_blksize(
 		return 1U << mp->m_allocsize_log;
 	}
 
-	return PAGE_SIZE;
+	return max_t(uint32_t, PAGE_SIZE, mp->m_sb.sb_blocksize);
 }
 
 STATIC int
-- 
2.43.0
From: "Pankaj Raghav (Samsung)"
Subject: [PATCH v2 12/13] xfs: make the calculation generic in xfs_sb_validate_fsb_count()
Date: Fri, 1 Mar 2024 17:44:43 +0100
Message-ID: <20240301164444.3799288-13-kernel@pankajraghav.com>

From: Pankaj Raghav

Instead of assuming that PAGE_SHIFT is always higher than the blocklog,
make the calculation generic so that the page cache count can be
calculated correctly for LBS.

Signed-off-by: Pankaj Raghav
---
 fs/xfs/xfs_mount.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index aabb25dc3efa..9cf800586da7 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -133,9 +133,16 @@ xfs_sb_validate_fsb_count(
 	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
 	ASSERT(sbp->sb_blocklog >= BBSHIFT);
+	uint64_t max_index;
+	uint64_t max_bytes;
+
+	if (check_shl_overflow(nblocks, sbp->sb_blocklog, &max_bytes))
+		return -EFBIG;
 
 	/* Limited by ULONG_MAX of page cache index */
-	if (nblocks >> (PAGE_SHIFT - sbp->sb_blocklog) > ULONG_MAX)
+	max_index = max_bytes >> PAGE_SHIFT;
+
+	if (max_index > ULONG_MAX)
 		return -EFBIG;
 	return 0;
 }
-- 
2.43.0
From: "Pankaj Raghav (Samsung)"
Subject: [PATCH v2 13/13] xfs: enable block size larger than page size support
Date: Fri, 1 Mar 2024 17:44:44 +0100
Message-ID: <20240301164444.3799288-14-kernel@pankajraghav.com>

From: Pankaj Raghav

The page cache now has the ability to enforce a minimum order when
allocating a folio, which is a prerequisite for supporting block size >
page size.

Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
---
 fs/xfs/libxfs/xfs_ialloc.c |  5 +++++
 fs/xfs/libxfs/xfs_shared.h |  3 +++
 fs/xfs/xfs_icache.c        |  6 ++++--
 fs/xfs/xfs_mount.c         |  1 -
 fs/xfs/xfs_super.c         | 10 ++--------
 5 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index 2361a22035b0..c040bd6271fd 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -2892,6 +2892,11 @@ xfs_ialloc_setup_geometry(
 		igeo->ialloc_align = mp->m_dalign;
 	else
 		igeo->ialloc_align = 0;
+
+	if (mp->m_sb.sb_blocksize > PAGE_SIZE)
+		igeo->min_folio_order = mp->m_sb.sb_blocklog - PAGE_SHIFT;
+	else
+		igeo->min_folio_order = 0;
 }
 
 /* Compute the location of the root directory inode that is laid out by mkfs. */
diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index 4220d3584c1b..67ed406e7a81 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -188,6 +188,9 @@ struct xfs_ino_geometry {
 	/* precomputed value for di_flags2 */
 	uint64_t new_diflags2;
 
+	/* minimum folio order of a page cache allocation */
+	unsigned int min_folio_order;
+
 };
 
 #endif /* __XFS_SHARED_H__ */
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index dba514a2c84d..a1857000e2cd 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -88,7 +88,8 @@ xfs_inode_alloc(
 	/* VFS doesn't initialise i_mode or i_state! */
 	VFS_I(ip)->i_mode = 0;
 	VFS_I(ip)->i_state = 0;
-	mapping_set_large_folios(VFS_I(ip)->i_mapping);
+	mapping_set_folio_min_order(VFS_I(ip)->i_mapping,
+				    M_IGEO(mp)->min_folio_order);
 
 	XFS_STATS_INC(mp, vn_active);
 	ASSERT(atomic_read(&ip->i_pincount) == 0);
@@ -323,7 +324,8 @@ xfs_reinit_inode(
 	inode->i_rdev = dev;
 	inode->i_uid = uid;
 	inode->i_gid = gid;
-	mapping_set_large_folios(inode->i_mapping);
+	mapping_set_folio_min_order(inode->i_mapping,
+				    M_IGEO(mp)->min_folio_order);
 	return error;
 }
 
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 9cf800586da7..a77e927807e5 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -131,7 +131,6 @@ xfs_sb_validate_fsb_count(
 	xfs_sb_t	*sbp,
 	uint64_t	nblocks)
 {
-	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
 	ASSERT(sbp->sb_blocklog >= BBSHIFT);
 	uint64_t max_index;
 	uint64_t max_bytes;
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 5a2512d20bd0..685ce7bf7324 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1625,16 +1625,10 @@ xfs_fs_fill_super(
 		goto out_free_sb;
 	}
 
-	/*
-	 * Until this is fixed only page-sized or smaller data blocks work.
-	 */
 	if (mp->m_sb.sb_blocksize > PAGE_SIZE) {
 		xfs_warn(mp,
-		"File system with blocksize %d bytes. "
-		"Only pagesize (%ld) or less will currently work.",
-			mp->m_sb.sb_blocksize, PAGE_SIZE);
-		error = -ENOSYS;
-		goto out_free_sb;
+"EXPERIMENTAL: Filesystem with Large Block Size (%d bytes) enabled.",
+			mp->m_sb.sb_blocksize);
 	}
 
 	/* Ensure this filesystem fits in the page cache limits */
-- 
2.43.0