From nobody Sun Nov 24 02:40:09 2024 Received: from out30-112.freemail.mail.aliyun.com (out30-112.freemail.mail.aliyun.com [115.124.30.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BEA6C1BF810 for ; Thu, 10 Oct 2024 09:58:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.112 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728554310; cv=none; b=FrM1Y/Xx6WPKn40QhbkVYeaypXA82EwSOyS8F54DLKAXLybv8FxrMV5UqOUS1iMKhoyZWynyQgQEoKAOnYDBO7yIfXYmlQH3FSalV7s/t8Nu6AMh/TX6n/0ltzsKq8/nML6aVMidXKDpP7cNYw1K7jxE0TOh05fcQeHAG1ondwo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728554310; c=relaxed/simple; bh=aW+mk2ePZEaHmiZzVO05BzffwyUJkOfhEC+jcUFwaoc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JWoqYRtR6uFRDlk2dlkuxxY9q/Vp6z6SVBy5Bu3uz9J/v+Ssc0E6YAJiqiLVyh5j5kM+Mb0HBPMWGTQiHuXg0a7HsL9UiXbDSpNa1i/jLVqauH1mhId9Lp5d3BsfG/cHN6FanfdcjS1shlBMu1PRUk28svCvZPaOmrKZrS5f7vU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=Xy+jETAc; arc=none smtp.client-ip=115.124.30.112 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="Xy+jETAc" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1728554300; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=GjeDj7oC1iiZ/gkwEo4XJZE2Kubqr3JwCuLq6Y0HOAU=; 
b=Xy+jETAcI31K/++s8bwE1Q96Nr06WGfscMPMiMRzKj5dO9zg2qeAWkOqhF2frMQYp1zojA9Pavh/WBGSZ+jrDwA0siqKNCxghp5D0iFByplhyIK3J3j8Cl9Mh0dSdFKEHOnft+Ng6fEHgYsOm/sQlLr3JC4gcSq416/OjWHk1kc= Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0WGm1Wg-_1728554298 cluster:ay36) by smtp.aliyun-inc.com; Thu, 10 Oct 2024 17:58:19 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, 21cnbao@gmail.com, ryan.roberts@arm.com, ioworker0@gmail.com, da.gomez@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 1/4] mm: factor out the order calculation into a new helper Date: Thu, 10 Oct 2024 17:58:11 +0800 Message-Id: <382e820c4805dcac80723d8f1707088dbcea2e85.1728548374.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Factor out the order calculation into a new helper, which can be reused by shmem in the following patch. Suggested-by: Matthew Wilcox Signed-off-by: Baolin Wang --- include/linux/pagemap.h | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index bcf0865a38ae..d796c8a33647 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -727,6 +727,16 @@ typedef unsigned int __bitwise fgf_t; =20 #define FGP_WRITEBEGIN (FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE) =20 +static inline unsigned int filemap_get_order(size_t size) +{ + unsigned int shift =3D ilog2(size); + + if (shift <=3D PAGE_SHIFT) + return 0; + + return shift - PAGE_SHIFT; +} + /** * fgf_set_order - Encode a length in the fgf_t flags. * @size: The suggested size of the folio to create. 
@@ -740,11 +750,11 @@ typedef unsigned int __bitwise fgf_t; */ static inline fgf_t fgf_set_order(size_t size) { - unsigned int shift =3D ilog2(size); + unsigned int order =3D filemap_get_order(size); =20 - if (shift <=3D PAGE_SHIFT) + if (!order) return 0; - return (__force fgf_t)((shift - PAGE_SHIFT) << 26); + return (__force fgf_t)(order << 26); } =20 void *filemap_get_entry(struct address_space *mapping, pgoff_t index); --=20 2.39.3 From nobody Sun Nov 24 02:40:09 2024 Received: from out30-98.freemail.mail.aliyun.com (out30-98.freemail.mail.aliyun.com [115.124.30.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D37E61BFDEB for ; Thu, 10 Oct 2024 09:58:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.98 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728554311; cv=none; b=AsBMlBKt5WvcfC8kiGcK+gi8VC3FXBYFdzd2CT1LL7YuLT1SYrkEMTHTxEMLatUWBNfhpCodJYEAULFH+AEbqlyis9Tk5QDYkaD8bMQg5SpGcW36Apz7bkdRhD7PPMV2Y9GwLmgz07MBmGaAODT3gXpv1XQcA3esA761mhqp4Q0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728554311; c=relaxed/simple; bh=fipWSVqfJTIQ/t8ZyHaxcVq5Em65bazNxVMdGA1/PJ8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=n1AzmfgxGLM+a9gBmPyjNFUMOfHNChiCl4b5lOmn4COC4FWKER0gOmcni9MQsbC/rYvyEtgAABf/3LpuwEi1TqaU9iKCdAUgzrL0UxJuKN/J/0ar8K9GRDycZuZJYuxTSaBjaFZ5RVo83JGk7XO1uoXXd6p2tGMk43AAJbEScNw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=liuRLKXC; arc=none smtp.client-ip=115.124.30.98 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: 
smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="liuRLKXC" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1728554301; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=UsgmxRS9TKs/0ogcQ76wiAy1j6N7MLMM/FjqAS25YQo=; b=liuRLKXCh3Z8VQnKAEfRp8eTtKweIxMqov8MQ5rrJWv7TfXTalyE5xxey29UzYimsMr+eh+44lKG5gfG3uLJqN6V5l3fEcuUW5wviI0GKw8+VQV/hGTZkN5IEwhRnEWls/AF18Kw5vFo0/gU3Ki7/VQ4WN6vgfG5R2Qla/AR28w= Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0WGlzQXz_1728554299 cluster:ay36) by smtp.aliyun-inc.com; Thu, 10 Oct 2024 17:58:20 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, 21cnbao@gmail.com, ryan.roberts@arm.com, ioworker0@gmail.com, da.gomez@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 2/4] mm: shmem: change shmem_huge_global_enabled() to return huge order bitmap Date: Thu, 10 Oct 2024 17:58:12 +0800 Message-Id: X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Change shmem_huge_global_enabled() to return a bitmap of the suitable huge orders, or 0 if huge pages are not allowed. This prepares for a new huge option, added in the following patch, that supports allocating various huge orders. No functional changes. 
Signed-off-by: Baolin Wang --- mm/shmem.c | 43 ++++++++++++++++++++++--------------------- 1 file changed, 22 insertions(+), 21 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index 0613421e09e7..f04935722457 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -548,48 +548,48 @@ static bool shmem_confirm_swap(struct address_space *= mapping, =20 static int shmem_huge __read_mostly =3D SHMEM_HUGE_NEVER; =20 -static bool __shmem_huge_global_enabled(struct inode *inode, pgoff_t index, - loff_t write_end, bool shmem_huge_force, - struct vm_area_struct *vma, - unsigned long vm_flags) +static unsigned int __shmem_huge_global_enabled(struct inode *inode, pgoff= _t index, + loff_t write_end, bool shmem_huge_force, + struct vm_area_struct *vma, + unsigned long vm_flags) { struct mm_struct *mm =3D vma ? vma->vm_mm : NULL; loff_t i_size; =20 if (!S_ISREG(inode->i_mode)) - return false; + return 0; if (mm && ((vm_flags & VM_NOHUGEPAGE) || test_bit(MMF_DISABLE_THP, &mm->f= lags))) - return false; + return 0; if (shmem_huge =3D=3D SHMEM_HUGE_DENY) - return false; + return 0; if (shmem_huge_force || shmem_huge =3D=3D SHMEM_HUGE_FORCE) - return true; + return BIT(HPAGE_PMD_ORDER); =20 switch (SHMEM_SB(inode->i_sb)->huge) { case SHMEM_HUGE_ALWAYS: - return true; + return BIT(HPAGE_PMD_ORDER); case SHMEM_HUGE_WITHIN_SIZE: index =3D round_up(index + 1, HPAGE_PMD_NR); i_size =3D max(write_end, i_size_read(inode)); i_size =3D round_up(i_size, PAGE_SIZE); if (i_size >> PAGE_SHIFT >=3D index) - return true; + return BIT(HPAGE_PMD_ORDER); fallthrough; case SHMEM_HUGE_ADVISE: if (mm && (vm_flags & VM_HUGEPAGE)) - return true; + return BIT(HPAGE_PMD_ORDER); fallthrough; default: - return false; + return 0; } } =20 -static bool shmem_huge_global_enabled(struct inode *inode, pgoff_t index, +static unsigned int shmem_huge_global_enabled(struct inode *inode, pgoff_t= index, loff_t write_end, bool shmem_huge_force, struct vm_area_struct *vma, unsigned long vm_flags) { if (HPAGE_PMD_ORDER > 
MAX_PAGECACHE_ORDER) - return false; + return 0; =20 return __shmem_huge_global_enabled(inode, index, write_end, shmem_huge_force, vma, vm_flags); @@ -771,11 +771,11 @@ static unsigned long shmem_unused_huge_shrink(struct = shmem_sb_info *sbinfo, return 0; } =20 -static bool shmem_huge_global_enabled(struct inode *inode, pgoff_t index, +static unsigned int shmem_huge_global_enabled(struct inode *inode, pgoff_t= index, loff_t write_end, bool shmem_huge_force, struct vm_area_struct *vma, unsigned long vm_flags) { - return false; + return 0; } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ =20 @@ -1170,7 +1170,8 @@ static int shmem_getattr(struct mnt_idmap *idmap, generic_fillattr(idmap, request_mask, inode, stat); inode_unlock_shared(inode); =20 - if (shmem_huge_global_enabled(inode, 0, 0, false, NULL, 0)) + if (shmem_huge_global_enabled(inode, 0, 0, false, NULL, 0) =3D=3D + BIT(HPAGE_PMD_ORDER)) stat->blksize =3D HPAGE_PMD_SIZE; =20 if (request_mask & STATX_BTIME) { @@ -1679,7 +1680,7 @@ unsigned long shmem_allowable_huge_orders(struct inod= e *inode, unsigned long mask =3D READ_ONCE(huge_shmem_orders_always); unsigned long within_size_orders =3D READ_ONCE(huge_shmem_orders_within_s= ize); unsigned long vm_flags =3D vma ? vma->vm_flags : 0; - bool global_huge; + unsigned int global_order; loff_t i_size; int order; =20 @@ -1691,14 +1692,14 @@ unsigned long shmem_allowable_huge_orders(struct in= ode *inode, if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED)) return 0; =20 - global_huge =3D shmem_huge_global_enabled(inode, index, write_end, + global_order =3D shmem_huge_global_enabled(inode, index, write_end, shmem_huge_force, vma, vm_flags); if (!vma || !vma_is_anon_shmem(vma)) { /* * For tmpfs, we now only support PMD sized THP if huge page * is enabled, otherwise fallback to order 0. */ - return global_huge ? 
BIT(HPAGE_PMD_ORDER) : 0; + return global_order; } =20 /* @@ -1731,7 +1732,7 @@ unsigned long shmem_allowable_huge_orders(struct inod= e *inode, if (vm_flags & VM_HUGEPAGE) mask |=3D READ_ONCE(huge_shmem_orders_madvise); =20 - if (global_huge) + if (global_order > 0) mask |=3D READ_ONCE(huge_shmem_orders_inherit); =20 return THP_ORDERS_ALL_FILE_DEFAULT & mask; --=20 2.39.3 From nobody Sun Nov 24 02:40:09 2024 Received: from out30-133.freemail.mail.aliyun.com (out30-133.freemail.mail.aliyun.com [115.124.30.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 389ED1BE86F for ; Thu, 10 Oct 2024 09:58:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728554306; cv=none; b=kh6u8uRj9m1dEWd0mT3xuIWMHJVu8tt9n6v51evGH44oW1b/0qQv4jkdFS/1NP76ewHL/RSfgOklmwUTBqMwBovW3vccCDof5GaDjw4hwldvV+ZycOV+G8t9zG/eQyyFMtKZjBvp8cF0V4gwPs8IvP5ipSMsDIBQMr7+PL5/A0Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728554306; c=relaxed/simple; bh=5+gX2Wp3dxdEacjMq8pP3jKsECs5CoHoKOL9Jaxyz3A=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Zvv0Hw75Ex7M6XHA98kdpsDD5z3oJ7tcfmLrUW+p79T7raGGC/3TtDzFIBFQH/C4NGyB21ulW2yfCzXGMyYV1TUREEtjryyLlK3C+mQF/SlTxNM9yqPX+Z9/0OQFfe2wERWL5AdoI4MV9Doy/8a2f8LJCiplbctN3E2OhIPTzp0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=nmGiYb/Z; arc=none smtp.client-ip=115.124.30.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="nmGiYb/Z" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1728554302; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=0/z9eme036ELYbZ0x9xP/AM4BDPvXGK6kIU0Y8j+uE0=; b=nmGiYb/ZdFE63ymZzgWaWErPpAh66/ltmnC91kyf7T+qw/Inw7qb2iwe/Wl4P/e2lLmGQ/R9CNyWHxpBHphYUKAWUBYf4oz8RywWiBYPL2g44COvtIejM6LxWbUgp5tD7b+9qCtj7jP6WEyrTMyzurZALDQ2uqtZmcecK73MzqE= Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0WGm.dgu_1728554300 cluster:ay36) by smtp.aliyun-inc.com; Thu, 10 Oct 2024 17:58:21 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, 21cnbao@gmail.com, ryan.roberts@arm.com, ioworker0@gmail.com, da.gomez@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 3/4] mm: shmem: add large folio support to the write and fallocate paths for tmpfs Date: Thu, 10 Oct 2024 17:58:13 +0800 Message-Id: <252c5999f8789d4f511e8e1466414238990f7e18.1728548374.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add large folio support to the tmpfs write and fallocate paths, using the same high-order preference mechanism that the iomap buffered IO path uses via __filemap_get_folio(). Add shmem_mapping_size_order() to get a hint for the folio order based on the file size, taking the mapping requirements into account. 
tmpfs already has the 'huge=3D' option to control huge page allocation, so compatibility with it must be maintained, and the 'deny' and 'force' settings of '/sys/kernel/mm/transparent_hugepage/shmem_enabled' must be respected as well. Add a new huge option, 'write_size', to support large folio allocation based on the write size in the tmpfs write and fallocate paths. The resulting huge page allocation strategy for tmpfs is: if 'huge=3D' is set to always, within_size or advise, or 'shmem_enabled' is 'force', allow only PMD-sized THP, which keeps backward compatibility. If 'huge=3D' is never or 'shmem_enabled' is 'deny', large folio allocation stays disabled. Only when 'huge=3D' is 'write_size' are large folios allocated based on the write size. Co-developed-by: Daniel Gomez Signed-off-by: Daniel Gomez Signed-off-by: Baolin Wang --- mm/shmem.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 55 insertions(+), 7 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index f04935722457..66f1cf5b1645 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -523,12 +523,15 @@ static bool shmem_confirm_swap(struct address_space *= mapping, * also respect fadvise()/madvise() hints; * SHMEM_HUGE_ADVISE: * only allocate huge pages if requested with fadvise()/madvise(); + * SHMEM_HUGE_WRITE_SIZE: + * only allocate huge pages based on the write size. */ =20 #define SHMEM_HUGE_NEVER 0 #define SHMEM_HUGE_ALWAYS 1 #define SHMEM_HUGE_WITHIN_SIZE 2 #define SHMEM_HUGE_ADVISE 3 +#define SHMEM_HUGE_WRITE_SIZE 4 =20 /* * Special values. @@ -548,12 +551,46 @@ static bool shmem_confirm_swap(struct address_space *= mapping, =20 static int shmem_huge __read_mostly =3D SHMEM_HUGE_NEVER; =20 +/** + * shmem_mapping_size_order - Get maximum folio order for the given file s= ize. + * @mapping: Target address_space. 
+ * @index: The page index. + * @size: The suggested size of the folio to create. + * + * This returns a high order for folios (when supported) based on the file= size + * which the mapping currently allows at the given index. The index is rel= evant + * due to alignment considerations the mapping might have. The returned or= der + * may be less than the size passed. + * + * Return: The order. + */ +static inline unsigned int +shmem_mapping_size_order(struct address_space *mapping, pgoff_t index, siz= e_t size) +{ + unsigned int order; + + if (!mapping_large_folio_support(mapping)) + return 0; + + order =3D filemap_get_order(size); + if (!order) + return 0; + + /* If we're not aligned, allocate a smaller folio */ + if (index & ((1UL << order) - 1)) + order =3D __ffs(index); + + return min_t(size_t, order, MAX_PAGECACHE_ORDER); +} + static unsigned int __shmem_huge_global_enabled(struct inode *inode, pgoff= _t index, loff_t write_end, bool shmem_huge_force, struct vm_area_struct *vma, unsigned long vm_flags) { struct mm_struct *mm =3D vma ? vma->vm_mm : NULL; + unsigned int order; + size_t len; loff_t i_size; =20 if (!S_ISREG(inode->i_mode)) @@ -568,6 +605,17 @@ static unsigned int __shmem_huge_global_enabled(struct= inode *inode, pgoff_t ind switch (SHMEM_SB(inode->i_sb)->huge) { case SHMEM_HUGE_ALWAYS: return BIT(HPAGE_PMD_ORDER); + /* + * If the huge option is SHMEM_HUGE_WRITE_SIZE, it will allow + * getting a highest order hint based on the size of write and + * fallocate paths, then will try each allowable huge orders. + */ + case SHMEM_HUGE_WRITE_SIZE: + if (!write_end) + return 0; + len =3D write_end - (index << PAGE_SHIFT); + order =3D shmem_mapping_size_order(inode->i_mapping, index, len); + return order > 0 ? 
BIT(order + 1) - 1 : 0; case SHMEM_HUGE_WITHIN_SIZE: index =3D round_up(index + 1, HPAGE_PMD_NR); i_size =3D max(write_end, i_size_read(inode)); @@ -624,6 +672,8 @@ static const char *shmem_format_huge(int huge) return "always"; case SHMEM_HUGE_WITHIN_SIZE: return "within_size"; + case SHMEM_HUGE_WRITE_SIZE: + return "write_size"; case SHMEM_HUGE_ADVISE: return "advise"; case SHMEM_HUGE_DENY: @@ -1694,13 +1744,9 @@ unsigned long shmem_allowable_huge_orders(struct ino= de *inode, =20 global_order =3D shmem_huge_global_enabled(inode, index, write_end, shmem_huge_force, vma, vm_flags); - if (!vma || !vma_is_anon_shmem(vma)) { - /* - * For tmpfs, we now only support PMD sized THP if huge page - * is enabled, otherwise fallback to order 0. - */ + /* Tmpfs huge pages allocation? */ + if (!vma || !vma_is_anon_shmem(vma)) return global_order; - } =20 /* * Following the 'deny' semantics of the top level, force the huge @@ -2851,7 +2897,8 @@ static struct inode *__shmem_get_inode(struct mnt_idm= ap *idmap, cache_no_acl(inode); if (sbinfo->noswap) mapping_set_unevictable(inode->i_mapping); - mapping_set_large_folios(inode->i_mapping); + if (sbinfo->huge) + mapping_set_large_folios(inode->i_mapping); =20 switch (mode & S_IFMT) { default: @@ -4224,6 +4271,7 @@ static const struct constant_table shmem_param_enums_= huge[] =3D { {"always", SHMEM_HUGE_ALWAYS }, {"within_size", SHMEM_HUGE_WITHIN_SIZE }, {"advise", SHMEM_HUGE_ADVISE }, + {"write_size", SHMEM_HUGE_WRITE_SIZE }, {} }; =20 --=20 2.39.3 From nobody Sun Nov 24 02:40:09 2024 Received: from out30-112.freemail.mail.aliyun.com (out30-112.freemail.mail.aliyun.com [115.124.30.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CE6261C0DC1 for ; Thu, 10 Oct 2024 09:58:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.112 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1728554312; cv=none; b=HTQa6gA9iTI/tYuctb1jZz4UzfOfo27DLMS1q0bE0qMKhwfhBE3hk0qFniMwNv0TAN+VwsxVR6jGWPwbCPF5jeoFeklbLZ5xP8OO7OhgcnccuWeZayt6OEhNU5NgiY6c17tl+lXBf2ft6ajfFt8fF8h6IfQfRanoWLjNLr16lL0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728554312; c=relaxed/simple; bh=DHFw5q2YloFtaqbqYnA+n3xUeQwrKYdZTTMc6I80SuQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=khX2t1WgG9tvYtxE2Iuf3xRtyDu2IcdQED5Mtx2xkyyjnjAmQxoH2B/ai35GsaA50PXvC0nirw0ycDykpKhNUtCOJzRrLEdROgh927ltx2e1xorTWzTm13rq5FwdpRp4X3cvOn76Iyj8S2vaYSkP3P9rYHwqccREmIqtkyK86VM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=A8LhrUy/; arc=none smtp.client-ip=115.124.30.112 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="A8LhrUy/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1728554303; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=TAqUYxnDAtyT0H6gtJz0ppazair4o9uzMNjBuGwGaBM=; b=A8LhrUy/2USDey0puPf9V5n74XjZquA/tmCUcu+ROxLj7M/ZcWxwJjSKxr0qb+GEUfrLu50w4Jfp2IAOmn0hOTQ+xpx14VvEdzfheEogI+mn164lAILwR2I37PzZJqqB19WlqpEGlX3+2i9rJ61QOlM4EjXVNpnPNzuQBoUXi60= Received: from localhost(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0WGlzQYu_1728554301 cluster:ay36) by smtp.aliyun-inc.com; Thu, 10 Oct 2024 17:58:22 +0800 From: Baolin Wang To: akpm@linux-foundation.org, hughd@google.com Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com, 21cnbao@gmail.com, 
ryan.roberts@arm.com, ioworker0@gmail.com, da.gomez@samsung.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [RFC PATCH v3 4/4] docs: tmpfs: add documentation for 'write_size' huge option Date: Thu, 10 Oct 2024 17:58:14 +0800 Message-Id: <556e9944f0328ccb0025e8582894ae01efd98842.1728548374.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.39.3 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add documentation for the 'write_size' huge option, and clarify the descriptions of the existing huge options. Signed-off-by: Baolin Wang --- Documentation/filesystems/tmpfs.rst | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystem= s/tmpfs.rst index 56a26c843dbe..be998ff47018 100644 --- a/Documentation/filesystems/tmpfs.rst +++ b/Documentation/filesystems/tmpfs.rst @@ -115,10 +115,11 @@ The mount options for this are: =20 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D huge=3Dnever Do not allocate huge pages. This is the default. -huge=3Dalways Attempt to allocate huge page every time a new page is = needed. -huge=3Dwithin_size Only allocate huge page if it will be fully within i_si= ze. +huge=3Dalways Attempt to allocate PMD sized huge page every time a ne= w page is needed. +huge=3Dwithin_size Only allocate PMD sized huge page if it will be fully w= ithin i_size. Also respect madvise(2) hints. -huge=3Dadvise Only allocate huge page if requested with madvise(2). +huge=3Dadvise Only allocate PMD sized huge page if requested with mad= vise(2). 
+huge=3Dwrite_size Allocate huge pages of various sizes based on the write= size. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D =20 See also Documentation/admin-guide/mm/transhuge.rst, which describes the --=20 2.39.3
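
For readers following the series without a kernel tree at hand, the order calculation from patches 1 and 3 can be approximated in user space. This is a minimal sketch under stated assumptions, not the kernel code: all sketch_* names are hypothetical, the 4 KiB page size and the order cap of 9 are values picked for illustration, the mapping_large_folio_support() check is left out, and a size of 0 is simply treated as order 0.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only; the kernel derives both per architecture. */
#define SKETCH_PAGE_SHIFT 12u          /* 4 KiB base pages */
#define SKETCH_MAX_PAGECACHE_ORDER 9u  /* hypothetical cap on the folio order */

/* floor(log2(v)); stands in for ilog2(). Returns 0 for v == 0 in this sketch. */
static unsigned int sketch_ilog2(size_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Index of the lowest set bit; stands in for __ffs(). The caller passes v != 0. */
static unsigned int sketch_ffs(unsigned long v)
{
	unsigned int r = 0;

	assert(v != 0);
	while (!(v & 1UL)) {
		v >>= 1;
		r++;
	}
	return r;
}

/* Follows filemap_get_order(): 0 unless size spans at least two base pages. */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int shift = sketch_ilog2(size);

	if (shift <= SKETCH_PAGE_SHIFT)
		return 0;

	return shift - SKETCH_PAGE_SHIFT;
}

/* Follows shmem_mapping_size_order(), minus the large folio support check. */
static unsigned int sketch_size_order(unsigned long index, size_t size)
{
	unsigned int order = sketch_get_order(size);

	if (!order)
		return 0;

	/* Misaligned index: drop to the largest order the index is aligned to. */
	if (index & ((1UL << order) - 1))
		order = sketch_ffs(index);

	return order < SKETCH_MAX_PAGECACHE_ORDER ? order : SKETCH_MAX_PAGECACHE_ORDER;
}

/* Follows the SHMEM_HUGE_WRITE_SIZE case: bitmap with bits 0..order set, or 0. */
static unsigned long sketch_order_bitmap(unsigned int order)
{
	return order > 0 ? (1UL << (order + 1)) - 1 : 0;
}
```

With these assumed values, a 64 KiB write at page index 0 gives order 4 and an allowable-order bitmap of 0x1f, while the same write at page index 4 drops to order 2 because that index is only aligned to four pages.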