From: Zhang Yi <yi.zhang@huawei.com>
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
	ritesh.list@gmail.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com,
	chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [PATCH v3 01/10] ext4: remove writable userspace mappings before truncating page cache
Date: Thu, 10 Oct 2024 21:33:24 +0800
Message-Id: <20241010133333.146793-2-yi.zhang@huawei.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20241010133333.146793-1-yi.zhang@huawei.com>
References: <20241010133333.146793-1-yi.zhang@huawei.com>

When zeroing a range of folios on a filesystem whose block size is less
than the page size, the file's mapped partial blocks within one page
will be marked as unwritten, so we should remove writable userspace
mappings to ensure that ext4_page_mkwrite() can be called during
subsequent write access to these folios. Otherwise, data written by
subsequent mmap writes may not be saved to disk.

$mkfs.ext4 -b 1024 /dev/vdb
$mount /dev/vdb /mnt
$xfs_io -t -f -c "pwrite -S 0x58 0 4096" -c "mmap -rw 0 4096" \
	-c "mwrite -S 0x5a 2048 2048" -c "fzero 2048 2048" \
	-c "mwrite -S 0x59 2048 2048" -c "close" /mnt/foo
$od -Ax -t x1z /mnt/foo
000000 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
*
000800 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59
*
001000

$umount /mnt && mount /dev/vdb /mnt
$od -Ax -t x1z /mnt/foo
000000 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
*
000800 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
001000

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/ext4.h    |  2 ++
 fs/ext4/extents.c |  1 +
 fs/ext4/inode.c   | 41 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 44b0d418143c..6d0267afd4c1 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3020,6 +3020,8 @@ extern int ext4_inode_attach_jinode(struct inode *inode);
 extern int ext4_can_truncate(struct inode *inode);
 extern int ext4_truncate(struct inode *);
 extern int ext4_break_layouts(struct inode *);
+extern void ext4_truncate_folios_range(struct inode *inode, loff_t start,
+				       loff_t end);
 extern int ext4_punch_hole(struct file *file, loff_t offset, loff_t length);
 extern void ext4_set_inode_flags(struct inode *, bool init);
 extern int ext4_alloc_da_blocks(struct inode *inode);
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 34e25eee6521..2a054c3689f0 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4677,6 +4677,7 @@ static long ext4_zero_range(struct file *file, loff_t offset,
 	}
 
 	/* Now release the pages and zero block aligned part of pages */
+	ext4_truncate_folios_range(inode, start, end);
 	truncate_pagecache_range(inode, start, end - 1);
 	inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode));
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 54bdd4884fe6..8b34e79112d5 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -31,6 +31,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -3870,6 +3871,46 @@ int ext4_update_disksize_before_punch(struct inode *inode, loff_t offset,
 	return ret;
 }
 
+static inline void ext4_truncate_folio(struct inode *inode,
+				       loff_t start, loff_t end)
+{
+	unsigned long blocksize = i_blocksize(inode);
+	struct folio *folio;
+
+	if (round_up(start, blocksize) >= round_down(end, blocksize))
+		return;
+
+	folio = filemap_lock_folio(inode->i_mapping, start >> PAGE_SHIFT);
+	if (IS_ERR(folio))
+		return;
+
+	if (folio_mkclean(folio))
+		folio_mark_dirty(folio);
+	folio_unlock(folio);
+	folio_put(folio);
+}
+
+/*
+ * When truncating a range of folios, if the block size is less than the
+ * page size, the file's mapped partial blocks within one page could be
+ * freed or converted to unwritten. We should call this function to remove
+ * writable userspace mappings so that ext4_page_mkwrite() can be called
+ * during subsequent write access to these folios.
+ */
+void ext4_truncate_folios_range(struct inode *inode, loff_t start, loff_t end)
+{
+	unsigned long blocksize = i_blocksize(inode);
+
+	if (end > inode->i_size)
+		end = inode->i_size;
+	if (start >= end || blocksize >= PAGE_SIZE)
+		return;
+
+	ext4_truncate_folio(inode, start, min(round_up(start, PAGE_SIZE), end));
+	if (end > round_up(start, PAGE_SIZE))
+		ext4_truncate_folio(inode, round_down(end, PAGE_SIZE), end);
+}
+
 static void ext4_wait_dax_page(struct inode *inode)
 {
 	filemap_invalidate_unlock(inode->i_mapping);
-- 
2.39.2
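
For reference, the xfs_io reproducer in the commit message can also be
written as a standalone C program. The sketch below is illustrative only
and not part of the patch; it assumes /mnt is an ext4 mount created with
a 1024-byte block size and uses the hypothetical scratch file /mnt/foo.

#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/mnt/foo";	/* hypothetical scratch file */
	char buf[4096];
	char *p;
	int fd;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* pwrite -S 0x58 0 4096: allocate and fill the first page */
	memset(buf, 0x58, sizeof(buf));
	if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) {
		perror("pwrite");
		return 1;
	}

	/* mmap -rw 0 4096: establish a writable shared mapping */
	p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* mwrite -S 0x5a 2048 2048: dirty the second half through the mapping */
	memset(p + 2048, 0x5a, 2048);

	/* fzero 2048 2048: convert those partial blocks to unwritten */
	if (fallocate(fd, FALLOC_FL_ZERO_RANGE, 2048, 2048) < 0) {
		perror("fallocate");
		return 1;
	}

	/*
	 * mwrite -S 0x59 2048 2048: without the fix, this write reuses the
	 * stale writable mapping, ext4_page_mkwrite() is never invoked, and
	 * the 0x59 data is lost once the page cache is dropped.
	 */
	memset(p + 2048, 0x59, 2048);

	munmap(p, 4096);
	close(fd);
	return 0;
}

After running it, unmounting and remounting the filesystem and dumping
/mnt/foo with od should show the 0x59 bytes on an unpatched kernel being
replaced by zeroes, matching the output shown in the commit message.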