From: Konstantin Komarov
Subject: [PATCH v2] fs/ntfs3: add delayed-allocation (delalloc) support
Date: Mon, 16 Feb 2026 17:26:08 +0100
Message-ID: <20260216162608.4351-1-almaz.alexandrovich@paragon-software.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This patch implements delayed allocation (delalloc) in the ntfs3 driver.
It introduces an in-memory delayed runlist (run_da) and helpers to
track, reserve, and later convert those delayed reservations into real
clusters at writeback time. The change keeps the on-disk format
untouched and focuses on page-cache integration, correctness, and safe
interaction with the fallocate, truncate, and dio/iomap paths.
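The reserve-then-convert bookkeeping described above can be sketched in plain C. This is a hypothetical userspace illustration of the idea only, not code from the patch: the names da_reserve, da_commit, and da_cancel are invented, and the real driver tracks per-inode reservations in the run_da runs tree rather than in a single pair of counters.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Minimal sketch of delayed-allocation accounting: writes first take a
 * reservation against free space; real clusters are consumed only when
 * the reservation is converted at "writeback", and a reservation can be
 * dropped again (truncate/punch) without ever touching the "disk".
 */
struct da_state {
	long free_clusters;     /* clusters still free on the volume */
	long reserved_clusters; /* delayed (reserved, not yet allocated) */
};

/* Reserve @len clusters; fail rather than overcommit the volume. */
static bool da_reserve(struct da_state *s, long len)
{
	if (s->free_clusters - s->reserved_clusters < len)
		return false; /* not enough unreserved space left */
	s->reserved_clusters += len;
	return true;
}

/* Convert @len reserved clusters into really-allocated ones. */
static bool da_commit(struct da_state *s, long len)
{
	if (len > s->reserved_clusters)
		return false;
	s->reserved_clusters -= len;
	s->free_clusters -= len; /* now truly consumed on disk */
	return true;
}

/* Drop a reservation, e.g. on truncate before writeback ever ran. */
static void da_cancel(struct da_state *s, long len)
{
	if (len > s->reserved_clusters)
		len = s->reserved_clusters;
	s->reserved_clusters -= len;
}
```

The key property the free-space check must preserve is visible in da_reserve: reserved-but-unwritten clusters count against free space, so a later conversion can never fail with ENOSPC.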
Key points:
- add run_da (delay-allocated run tree) and bookkeeping for delayed clusters.
- mark ranges as delalloc (DELALLOC_LCN) instead of immediately allocating.
  Actual allocation performed later (writeback / attr_set_size_ex /
  explicit flush paths).
- direct i/o / iomap paths updated to avoid dio collisions with delalloc:
  dio falls back or forces allocation of delayed blocks before proceeding.
- punch/collapse/truncate/fallocate check and cancel delay-alloc reservations.
  Sparse/compressed files handled specially.
- free-space checks updated (ntfs_check_free_space) to account for reserved
  delalloc clusters and MFT record budgeting.
- delayed allocations are committed on last writer (file release) and on
  explicit allocation flush paths.

Tested-by: syzbot@syzkaller.appspotmail.com
Reported-by: syzbot+2bd8e813c7f767aa9bb1@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov
---
v2: removed warning on compressed inode.

 fs/ntfs3/attrib.c   | 333 ++++++++++++++++++++++++++++++++------------
 fs/ntfs3/attrlist.c |   8 +-
 fs/ntfs3/file.c     | 306 +++++++++++++++++++++-------------------
 fs/ntfs3/frecord.c  |  72 +++++++++-
 fs/ntfs3/fsntfs.c   |  53 +++++--
 fs/ntfs3/index.c    |  23 ++-
 fs/ntfs3/inode.c    | 161 ++++++++++++++-------
 fs/ntfs3/ntfs.h     |   3 +
 fs/ntfs3/ntfs_fs.h  |  91 ++++++++++--
 fs/ntfs3/run.c      | 150 ++++++++++++++++++--
 fs/ntfs3/super.c    |  28 +++-
 fs/ntfs3/xattr.c    |   2 +-
 12 files changed, 886 insertions(+), 344 deletions(-)

diff --git a/fs/ntfs3/attrib.c b/fs/ntfs3/attrib.c
index aa745fb226f5..6cb9bc5d605c 100644
--- a/fs/ntfs3/attrib.c
+++ b/fs/ntfs3/attrib.c
@@ -91,7 +91,8 @@ static int attr_load_runs(struct ATTRIB *attr, struct ntfs_inode *ni,
 * run_deallocate_ex - Deallocate clusters.
*/ static int run_deallocate_ex(struct ntfs_sb_info *sbi, struct runs_tree *r= un, - CLST vcn, CLST len, CLST *done, bool trim) + CLST vcn, CLST len, CLST *done, bool trim, + struct runs_tree *run_da) { int err =3D 0; CLST vcn_next, vcn0 =3D vcn, lcn, clen, dn =3D 0; @@ -120,6 +121,16 @@ static int run_deallocate_ex(struct ntfs_sb_info *sbi,= struct runs_tree *run, if (sbi) { /* mark bitmap range [lcn + clen) as free and trim clusters. */ mark_as_free_ex(sbi, lcn, clen, trim); + + if (run_da) { + CLST da_len; + if (!run_remove_range(run_da, vcn, clen, + &da_len)) { + err =3D -ENOMEM; + goto failed; + } + ntfs_sub_da(sbi, da_len); + } } dn +=3D clen; } @@ -147,9 +158,10 @@ static int run_deallocate_ex(struct ntfs_sb_info *sbi,= struct runs_tree *run, * attr_allocate_clusters - Find free space, mark it as used and store in = @run. */ int attr_allocate_clusters(struct ntfs_sb_info *sbi, struct runs_tree *run, - CLST vcn, CLST lcn, CLST len, CLST *pre_alloc, - enum ALLOCATE_OPT opt, CLST *alen, const size_t fr, - CLST *new_lcn, CLST *new_len) + struct runs_tree *run_da, CLST vcn, CLST lcn, + CLST len, CLST *pre_alloc, enum ALLOCATE_OPT opt, + CLST *alen, const size_t fr, CLST *new_lcn, + CLST *new_len) { int err; CLST flen, vcn0 =3D vcn, pre =3D pre_alloc ? *pre_alloc : 0; @@ -185,12 +197,21 @@ int attr_allocate_clusters(struct ntfs_sb_info *sbi, = struct runs_tree *run, =20 /* Add new fragment into run storage. 
*/ if (!run_add_entry(run, vcn, lcn, flen, opt & ALLOCATE_MFT)) { +undo_alloc: /* Undo last 'ntfs_look_for_free_space' */ mark_as_free_ex(sbi, lcn, len, false); err =3D -ENOMEM; goto out; } =20 + if (run_da) { + CLST da_len; + if (!run_remove_range(run_da, vcn, flen, &da_len)) { + goto undo_alloc; + } + ntfs_sub_da(sbi, da_len); + } + if (opt & ALLOCATE_ZERO) { u8 shift =3D sbi->cluster_bits - SECTOR_SHIFT; =20 @@ -205,7 +226,7 @@ int attr_allocate_clusters(struct ntfs_sb_info *sbi, st= ruct runs_tree *run, vcn +=3D flen; =20 if (flen >=3D len || (opt & ALLOCATE_MFT) || - (fr && run->count - cnt >=3D fr)) { + (opt & ALLOCATE_ONE_FR) || (fr && run->count - cnt >=3D fr)) { *alen =3D vcn - vcn0; return 0; } @@ -216,7 +237,8 @@ int attr_allocate_clusters(struct ntfs_sb_info *sbi, st= ruct runs_tree *run, out: /* Undo 'ntfs_look_for_free_space' */ if (vcn - vcn0) { - run_deallocate_ex(sbi, run, vcn0, vcn - vcn0, NULL, false); + run_deallocate_ex(sbi, run, vcn0, vcn - vcn0, NULL, false, + run_da); run_truncate(run, vcn0); } =20 @@ -281,7 +303,7 @@ int attr_make_nonresident(struct ntfs_inode *ni, struct= ATTRIB *attr, } else { const char *data =3D resident_data(attr); =20 - err =3D attr_allocate_clusters(sbi, run, 0, 0, len, NULL, + err =3D attr_allocate_clusters(sbi, run, NULL, 0, 0, len, NULL, ALLOCATE_DEF, &alen, 0, NULL, NULL); if (err) @@ -397,7 +419,7 @@ static int attr_set_size_res(struct ntfs_inode *ni, str= uct ATTRIB *attr, } =20 /* - * attr_set_size - Change the size of attribute. + * attr_set_size_ex - Change the size of attribute. * * Extend: * - Sparse/compressed: No allocated clusters. @@ -405,24 +427,28 @@ static int attr_set_size_res(struct ntfs_inode *ni, s= truct ATTRIB *attr, * Shrink: * - No deallocate if @keep_prealloc is set. 
*/ -int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE type, - const __le16 *name, u8 name_len, struct runs_tree *run, - u64 new_size, const u64 *new_valid, bool keep_prealloc, - struct ATTRIB **ret) +int attr_set_size_ex(struct ntfs_inode *ni, enum ATTR_TYPE type, + const __le16 *name, u8 name_len, struct runs_tree *run, + u64 new_size, const u64 *new_valid, bool keep_prealloc, + struct ATTRIB **ret, bool no_da) { int err =3D 0; struct ntfs_sb_info *sbi =3D ni->mi.sbi; u8 cluster_bits =3D sbi->cluster_bits; bool is_mft =3D ni->mi.rno =3D=3D MFT_REC_MFT && type =3D=3D ATTR_DATA && !name_len; - u64 old_valid, old_size, old_alloc, new_alloc, new_alloc_tmp; + u64 old_valid, old_size, old_alloc, new_alloc_tmp; + u64 new_alloc =3D 0; struct ATTRIB *attr =3D NULL, *attr_b; struct ATTR_LIST_ENTRY *le, *le_b; struct mft_inode *mi, *mi_b; CLST alen, vcn, lcn, new_alen, old_alen, svcn, evcn; CLST next_svcn, pre_alloc =3D -1, done =3D 0; - bool is_ext, is_bad =3D false; + bool is_ext =3D false, is_bad =3D false; bool dirty =3D false; + struct runs_tree *run_da =3D run =3D=3D &ni->file.run ? &ni->file.run_da : + NULL; + bool da =3D !is_mft && sbi->options->delalloc && run_da && !no_da; u32 align; struct MFT_REC *rec; =20 @@ -457,6 +483,7 @@ int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE= type, if (is_ext) { align <<=3D attr_b->nres.c_unit; keep_prealloc =3D false; + da =3D false; } =20 old_valid =3D le64_to_cpu(attr_b->nres.valid_size); @@ -475,6 +502,37 @@ int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYP= E type, goto ok; } =20 + if (da && + (vcn =3D old_alen + run_len(&ni->file.run_da), new_alen > vcn)) { + /* Resize up normal file. Delay new clusters allocation. 
*/ + alen =3D new_alen - vcn; + + if (ntfs_check_free_space(sbi, alen, 0, true)) { + if (!run_add_entry(&ni->file.run_da, vcn, SPARSE_LCN, + alen, false)) { + err =3D -ENOMEM; + goto out; + } + + ntfs_add_da(sbi, alen); + goto ok1; + } + } + + if (!keep_prealloc && run_da && run_da->count && + (vcn =3D run_get_max_vcn(run_da), new_alen < vcn)) { + /* Shrink delayed clusters. */ + + /* Try to remove fragment from delay allocated run. */ + if (!run_remove_range(run_da, new_alen, vcn - new_alen, + &alen)) { + err =3D -ENOMEM; + goto out; + } + + ntfs_sub_da(sbi, alen); + } + vcn =3D old_alen - 1; =20 svcn =3D le64_to_cpu(attr_b->nres.svcn); @@ -580,7 +638,8 @@ int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE= type, } else { /* ~3 bytes per fragment. */ err =3D attr_allocate_clusters( - sbi, run, vcn, lcn, to_allocate, &pre_alloc, + sbi, run, run_da, vcn, lcn, to_allocate, + &pre_alloc, is_mft ? ALLOCATE_MFT : ALLOCATE_DEF, &alen, is_mft ? 0 : (sbi->record_size - @@ -759,14 +818,14 @@ int attr_set_size(struct ntfs_inode *ni, enum ATTR_TY= PE type, mi_b->dirty =3D dirty =3D true; =20 err =3D run_deallocate_ex(sbi, run, vcn, evcn - vcn + 1, &dlen, - true); + true, run_da); if (err) goto out; =20 if (is_ext) { /* dlen - really deallocated clusters. */ le64_sub_cpu(&attr_b->nres.total_size, - ((u64)dlen << cluster_bits)); + (u64)dlen << cluster_bits); } =20 run_truncate(run, vcn); @@ -821,14 +880,14 @@ int attr_set_size(struct ntfs_inode *ni, enum ATTR_TY= PE type, if (((type =3D=3D ATTR_DATA && !name_len) || (type =3D=3D ATTR_ALLOC && name =3D=3D I30_NAME))) { /* Update inode_set_bytes. 
*/ - if (attr_b->non_res) { - new_alloc =3D le64_to_cpu(attr_b->nres.alloc_size); - if (inode_get_bytes(&ni->vfs_inode) !=3D new_alloc) { - inode_set_bytes(&ni->vfs_inode, new_alloc); - dirty =3D true; - } + if (attr_b->non_res && + inode_get_bytes(&ni->vfs_inode) !=3D new_alloc) { + inode_set_bytes(&ni->vfs_inode, new_alloc); + dirty =3D true; } =20 + i_size_write(&ni->vfs_inode, new_size); + /* Don't forget to update duplicate information in parent. */ if (dirty) { ni->ni_flags |=3D NI_FLAG_UPDATE_PARENT; @@ -869,7 +928,7 @@ int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE= type, is_bad =3D true; =20 undo_1: - run_deallocate_ex(sbi, run, vcn, alen, NULL, false); + run_deallocate_ex(sbi, run, vcn, alen, NULL, false, run_da); =20 run_truncate(run, vcn); out: @@ -892,20 +951,9 @@ int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYP= E type, * - new allocated clusters are zeroed via blkdev_issue_zeroout. */ int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *= lcn, - CLST *len, bool *new, bool zero, void **res) + CLST *len, bool *new, bool zero, void **res, bool no_da) { - int err =3D 0; - struct runs_tree *run =3D &ni->file.run; - struct ntfs_sb_info *sbi; - u8 cluster_bits; - struct ATTRIB *attr, *attr_b; - struct ATTR_LIST_ENTRY *le, *le_b; - struct mft_inode *mi, *mi_b; - CLST hint, svcn, to_alloc, evcn1, next_svcn, asize, end, vcn0, alen; - CLST alloc, evcn; - unsigned fr; - u64 total_size, total_size0; - int step =3D 0; + int err; =20 if (new) *new =3D false; @@ -914,23 +962,63 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST v= cn, CLST clen, CLST *lcn, =20 /* Try to find in cache. */ down_read(&ni->file.run_lock); - if (!run_lookup_entry(run, vcn, lcn, len, NULL)) + if (!no_da && run_lookup_entry(&ni->file.run_da, vcn, lcn, len, NULL)) { + /* The requested vcn is delay allocated. */ + *lcn =3D DELALLOC_LCN; + } else if (run_lookup_entry(&ni->file.run, vcn, lcn, len, NULL)) { + /* The requested vcn is known in current run. 
*/ + } else { *len =3D 0; + } up_read(&ni->file.run_lock); =20 if (*len && (*lcn !=3D SPARSE_LCN || !new)) return 0; /* Fast normal way without allocation. */ =20 /* No cluster in cache or we need to allocate cluster in hole. */ - sbi =3D ni->mi.sbi; - cluster_bits =3D sbi->cluster_bits; - ni_lock(ni); down_write(&ni->file.run_lock); =20 - /* Repeat the code above (under write lock). */ - if (!run_lookup_entry(run, vcn, lcn, len, NULL)) + err =3D attr_data_get_block_locked(ni, vcn, clen, lcn, len, new, zero, + res, no_da); + + up_write(&ni->file.run_lock); + ni_unlock(ni); + + return err; +} + +/* + * attr_data_get_block_locked - Helper for attr_data_get_block. + */ +int attr_data_get_block_locked(struct ntfs_inode *ni, CLST vcn, CLST clen, + CLST *lcn, CLST *len, bool *new, bool zero, + void **res, bool no_da) +{ + int err =3D 0; + struct ntfs_sb_info *sbi =3D ni->mi.sbi; + struct runs_tree *run =3D &ni->file.run; + struct runs_tree *run_da =3D &ni->file.run_da; + bool da =3D sbi->options->delalloc && !no_da; + u8 cluster_bits; + struct ATTRIB *attr, *attr_b; + struct ATTR_LIST_ENTRY *le, *le_b; + struct mft_inode *mi, *mi_b; + CLST hint, svcn, to_alloc, evcn1, next_svcn, asize, end, vcn0; + CLST alloc, evcn; + unsigned fr; + u64 total_size, total_size0; + int step; + +again: + if (da && run_lookup_entry(run_da, vcn, lcn, len, NULL)) { + /* The requested vcn is delay allocated. */ + *lcn =3D DELALLOC_LCN; + } else if (run_lookup_entry(run, vcn, lcn, len, NULL)) { + /* The requested vcn is known in current run. 
*/ + } else { *len =3D 0; + } =20 if (*len) { if (*lcn !=3D SPARSE_LCN || !new) @@ -939,6 +1027,9 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST vc= n, CLST clen, CLST *lcn, clen =3D *len; } =20 + cluster_bits =3D sbi->cluster_bits; + step =3D 0; + le_b =3D NULL; attr_b =3D ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, &mi_b); if (!attr_b) { @@ -1061,11 +1152,38 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST= vcn, CLST clen, CLST *lcn, if (err) goto out; } + da =3D false; /* no delalloc for compressed file. */ } =20 if (vcn + to_alloc > asize) to_alloc =3D asize - vcn; =20 + if (da) { + CLST rlen1, rlen2; + if (!ntfs_check_free_space(sbi, to_alloc, 0, true)) { + err =3D ni_allocate_da_blocks_locked(ni); + if (err) + goto out; + /* Layout of records may be changed. Start again without 'da'. */ + da =3D false; + goto again; + } + + /* run_add_entry consolidates existed ranges. */ + rlen1 =3D run_len(run_da); + if (!run_add_entry(run_da, vcn, SPARSE_LCN, to_alloc, false)) { + err =3D -ENOMEM; + goto out; + } + rlen2 =3D run_len(run_da); + + /* new added delay clusters =3D rlen2 - rlen1. */ + ntfs_add_da(sbi, rlen2 - rlen1); + *len =3D to_alloc; + *lcn =3D DELALLOC_LCN; + goto ok; + } + /* Get the last LCN to allocate from. */ hint =3D 0; =20 @@ -1080,18 +1198,19 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST= vcn, CLST clen, CLST *lcn, } =20 /* Allocate and zeroout new clusters. */ - err =3D attr_allocate_clusters(sbi, run, vcn, hint + 1, to_alloc, NULL, - zero ? ALLOCATE_ZERO : ALLOCATE_DEF, &alen, - fr, lcn, len); + err =3D attr_allocate_clusters(sbi, run, run_da, vcn, hint + 1, to_alloc, + NULL, + zero ? ALLOCATE_ZERO : ALLOCATE_ONE_FR, + len, fr, lcn, len); if (err) goto out; *new =3D true; step =3D 1; =20 - end =3D vcn + alen; + end =3D vcn + *len; /* Save 'total_size0' to restore if error. 
*/ total_size0 =3D le64_to_cpu(attr_b->nres.total_size); - total_size =3D total_size0 + ((u64)alen << cluster_bits); + total_size =3D total_size0 + ((u64)*len << cluster_bits); =20 if (vcn !=3D vcn0) { if (!run_lookup_entry(run, vcn0, lcn, len, NULL)) { @@ -1157,7 +1276,7 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST v= cn, CLST clen, CLST *lcn, * in 'ni_insert_nonresident'. * Return in advance -ENOSPC here if there are no free cluster and no fre= e MFT. */ - if (!ntfs_check_for_free_space(sbi, 1, 1)) { + if (!ntfs_check_free_space(sbi, 1, 1, false)) { /* Undo step 1. */ err =3D -ENOSPC; goto undo1; @@ -1242,8 +1361,6 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST v= cn, CLST clen, CLST *lcn, /* Too complex to restore. */ _ntfs_bad_inode(&ni->vfs_inode); } - up_write(&ni->file.run_lock); - ni_unlock(ni); =20 return err; =20 @@ -1252,8 +1369,8 @@ int attr_data_get_block(struct ntfs_inode *ni, CLST v= cn, CLST clen, CLST *lcn, attr_b->nres.total_size =3D cpu_to_le64(total_size0); inode_set_bytes(&ni->vfs_inode, total_size0); =20 - if (run_deallocate_ex(sbi, run, vcn, alen, NULL, false) || - !run_add_entry(run, vcn, SPARSE_LCN, alen, false) || + if (run_deallocate_ex(sbi, run, vcn, *len, NULL, false, run_da) || + !run_add_entry(run, vcn, SPARSE_LCN, *len, false) || mi_pack_runs(mi, attr, run, max(end, evcn1) - svcn)) { _ntfs_bad_inode(&ni->vfs_inode); } @@ -1688,7 +1805,7 @@ int attr_allocate_frame(struct ntfs_inode *ni, CLST f= rame, size_t compr_size, =20 if (len < clst_data) { err =3D run_deallocate_ex(sbi, run, vcn + len, clst_data - len, - NULL, true); + NULL, true, NULL); if (err) goto out; =20 @@ -1708,7 +1825,7 @@ int attr_allocate_frame(struct ntfs_inode *ni, CLST f= rame, size_t compr_size, hint =3D -1; } =20 - err =3D attr_allocate_clusters(sbi, run, vcn + clst_data, + err =3D attr_allocate_clusters(sbi, run, NULL, vcn + clst_data, hint + 1, len - clst_data, NULL, ALLOCATE_DEF, &alen, 0, NULL, NULL); @@ -1863,6 +1980,7 @@ int 
attr_collapse_range(struct ntfs_inode *ni, u64 vb= o, u64 bytes) CLST vcn, end; u64 valid_size, data_size, alloc_size, total_size; u32 mask; + u64 i_size; __le16 a_flags; =20 if (!bytes) @@ -1878,52 +1996,79 @@ int attr_collapse_range(struct ntfs_inode *ni, u64 = vbo, u64 bytes) return 0; } =20 - data_size =3D le64_to_cpu(attr_b->nres.data_size); - alloc_size =3D le64_to_cpu(attr_b->nres.alloc_size); - a_flags =3D attr_b->flags; - - if (is_attr_ext(attr_b)) { - total_size =3D le64_to_cpu(attr_b->nres.total_size); - mask =3D (sbi->cluster_size << attr_b->nres.c_unit) - 1; - } else { - total_size =3D alloc_size; - mask =3D sbi->cluster_mask; - } - - if ((vbo & mask) || (bytes & mask)) { + mask =3D is_attr_ext(attr_b) ? + ((sbi->cluster_size << attr_b->nres.c_unit) - 1) : + sbi->cluster_mask; + if ((vbo | bytes) & mask) { /* Allow to collapse only cluster aligned ranges. */ return -EINVAL; } =20 - if (vbo > data_size) + /* i_size - size of file with delay allocated clusters. */ + i_size =3D ni->vfs_inode.i_size; + + if (vbo > i_size) return -EINVAL; =20 down_write(&ni->file.run_lock); =20 - if (vbo + bytes >=3D data_size) { - u64 new_valid =3D min(ni->i_valid, vbo); + if (vbo + bytes >=3D i_size) { + valid_size =3D min(ni->i_valid, vbo); =20 /* Simple truncate file at 'vbo'. */ truncate_setsize(&ni->vfs_inode, vbo); err =3D attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, vbo, - &new_valid, true, NULL); + &valid_size, true); =20 - if (!err && new_valid < ni->i_valid) - ni->i_valid =3D new_valid; + if (!err && valid_size < ni->i_valid) + ni->i_valid =3D valid_size; =20 goto out; } =20 - /* - * Enumerate all attribute segments and collapse. - */ - alen =3D alloc_size >> sbi->cluster_bits; vcn =3D vbo >> sbi->cluster_bits; len =3D bytes >> sbi->cluster_bits; end =3D vcn + len; dealloc =3D 0; done =3D 0; =20 + /* + * Check delayed clusters. 
+ */ + if (ni->file.run_da.count) { + struct runs_tree *run_da =3D &ni->file.run_da; + if (run_is_mapped_full(run_da, vcn, end - 1)) { + /* + * The requested range is full in delayed clusters. + */ + err =3D attr_set_size_ex(ni, ATTR_DATA, NULL, 0, run, + i_size - bytes, NULL, false, + NULL, true); + goto out; + } + + /* Collapse request crosses real and delayed clusters. */ + err =3D ni_allocate_da_blocks_locked(ni); + if (err) + goto out; + + /* Layout of records maybe changed. */ + le_b =3D NULL; + attr_b =3D ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, + &mi_b); + if (!attr_b || !attr_b->non_res) { + err =3D -ENOENT; + goto out; + } + } + + data_size =3D le64_to_cpu(attr_b->nres.data_size); + alloc_size =3D le64_to_cpu(attr_b->nres.alloc_size); + total_size =3D is_attr_ext(attr_b) ? + le64_to_cpu(attr_b->nres.total_size) : + alloc_size; + alen =3D alloc_size >> sbi->cluster_bits; + a_flags =3D attr_b->flags; svcn =3D le64_to_cpu(attr_b->nres.svcn); evcn1 =3D le64_to_cpu(attr_b->nres.evcn) + 1; =20 @@ -1946,6 +2091,9 @@ int attr_collapse_range(struct ntfs_inode *ni, u64 vb= o, u64 bytes) goto out; } =20 + /* + * Enumerate all attribute segments and collapse. + */ for (;;) { CLST vcn1, eat, next_svcn; =20 @@ -1973,13 +2121,13 @@ int attr_collapse_range(struct ntfs_inode *ni, u64 = vbo, u64 bytes) vcn1 =3D vcn + done; /* original vcn in attr/run. */ eat =3D min(end, evcn1) - vcn1; =20 - err =3D run_deallocate_ex(sbi, run, vcn1, eat, &dealloc, true); + err =3D run_deallocate_ex(sbi, run, vcn1, eat, &dealloc, true, + NULL); if (err) goto out; =20 if (svcn + eat < evcn1) { /* Collapse a part of this attribute segment. */ - if (!run_collapse_range(run, vcn1, eat, done)) { err =3D -ENOMEM; goto out; @@ -2160,9 +2308,9 @@ int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u= 64 bytes, u32 *frame_size) bytes =3D alloc_size; bytes -=3D vbo; =20 - if ((vbo & mask) || (bytes & mask)) { + if ((vbo | bytes) & mask) { /* We have to zero a range(s). 
*/ - if (frame_size =3D=3D NULL) { + if (!frame_size) { /* Caller insists range is aligned. */ return -EINVAL; } @@ -2221,7 +2369,8 @@ int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u= 64 bytes, u32 *frame_size) * Calculate how many clusters there are. * Don't do any destructive actions. */ - err =3D run_deallocate_ex(NULL, run, vcn1, zero, &hole2, false); + err =3D run_deallocate_ex(NULL, run, vcn1, zero, &hole2, false, + NULL); if (err) goto done; =20 @@ -2259,7 +2408,8 @@ int attr_punch_hole(struct ntfs_inode *ni, u64 vbo, u= 64 bytes, u32 *frame_size) } =20 /* Real deallocate. Should not fail. */ - run_deallocate_ex(sbi, &run2, vcn1, zero, &hole, true); + run_deallocate_ex(sbi, &run2, vcn1, zero, &hole, true, + &ni->file.run_da); =20 next_attr: /* Free all allocated memory. */ @@ -2371,7 +2521,7 @@ int attr_insert_range(struct ntfs_inode *ni, u64 vbo,= u64 bytes) return -EINVAL; } =20 - if ((vbo & mask) || (bytes & mask)) { + if ((vbo | bytes) & mask) { /* Allow to insert only frame aligned ranges. */ return -EINVAL; } @@ -2390,7 +2540,7 @@ int attr_insert_range(struct ntfs_inode *ni, u64 vbo,= u64 bytes) =20 if (!attr_b->non_res) { err =3D attr_set_size(ni, ATTR_DATA, NULL, 0, run, - data_size + bytes, NULL, false, NULL); + data_size + bytes, NULL, false); =20 le_b =3D NULL; attr_b =3D ni_find_attr(ni, NULL, &le_b, ATTR_DATA, NULL, 0, NULL, @@ -2413,7 +2563,7 @@ int attr_insert_range(struct ntfs_inode *ni, u64 vbo,= u64 bytes) goto done; } =20 - /* Resident files becomes nonresident. */ + /* Resident file becomes nonresident. 
*/ data_size =3D le64_to_cpu(attr_b->nres.data_size); alloc_size =3D le64_to_cpu(attr_b->nres.alloc_size); } @@ -2450,10 +2600,13 @@ int attr_insert_range(struct ntfs_inode *ni, u64 vb= o, u64 bytes) if (err) goto out; =20 - if (!run_insert_range(run, vcn, len)) { - err =3D -ENOMEM; + err =3D run_insert_range(run, vcn, len); + if (err) + goto out; + + err =3D run_insert_range_da(&ni->file.run_da, vcn, len); + if (err) goto out; - } =20 /* Try to pack in current record as much as possible. */ err =3D mi_pack_runs(mi, attr, run, evcn1 + len - svcn); diff --git a/fs/ntfs3/attrlist.c b/fs/ntfs3/attrlist.c index 098bd7e8c3d6..270a29323530 100644 --- a/fs/ntfs3/attrlist.c +++ b/fs/ntfs3/attrlist.c @@ -345,8 +345,8 @@ int al_add_le(struct ntfs_inode *ni, enum ATTR_TYPE typ= e, const __le16 *name, le->id =3D id; memcpy(le->name, name, sizeof(short) * name_len); =20 - err =3D attr_set_size(ni, ATTR_LIST, NULL, 0, &al->run, new_size, - &new_size, true, &attr); + err =3D attr_set_size_ex(ni, ATTR_LIST, NULL, 0, &al->run, new_size, + &new_size, true, &attr, false); if (err) { /* Undo memmove above. */ memmove(le, Add2Ptr(le, sz), old_size - off); @@ -404,8 +404,8 @@ int al_update(struct ntfs_inode *ni, int sync) * Attribute list increased on demand in al_add_le. * Attribute list decreased here. */ - err =3D attr_set_size(ni, ATTR_LIST, NULL, 0, &al->run, al->size, NULL, - false, &attr); + err =3D attr_set_size_ex(ni, ATTR_LIST, NULL, 0, &al->run, al->size, NULL, + false, &attr, false); if (err) goto out; =20 diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c index 1be77f865d78..79e4c7a78c26 100644 --- a/fs/ntfs3/file.c +++ b/fs/ntfs3/file.c @@ -26,6 +26,38 @@ */ #define NTFS3_IOC_SHUTDOWN _IOR('X', 125, __u32) =20 +/* + * Helper for ntfs_should_use_dio. + */ +static u32 ntfs_dio_alignment(struct inode *inode) +{ + struct ntfs_inode *ni =3D ntfs_i(inode); + + if (is_resident(ni)) { + /* Check delalloc. 
*/ + if (!ni->file.run_da.count) + return 0; + } + + /* In most cases this is bdev_logical_block_size(bdev). */ + return ni->mi.sbi->bdev_blocksize; +} + +/* + * Returns %true if the given DIO request should be attempted with DIO, or + * %false if it should fall back to buffered I/O. + */ +static bool ntfs_should_use_dio(struct kiocb *iocb, struct iov_iter *iter) +{ + struct inode *inode =3D file_inode(iocb->ki_filp); + u32 dio_align =3D ntfs_dio_alignment(inode); + + if (!dio_align) + return false; + + return IS_ALIGNED(iocb->ki_pos | iov_iter_alignment(iter), dio_align); +} + static int ntfs_ioctl_fitrim(struct ntfs_sb_info *sbi, unsigned long arg) { struct fstrim_range __user *user_range; @@ -186,10 +218,10 @@ int ntfs_getattr(struct mnt_idmap *idmap, const struc= t path *path, =20 static int ntfs_extend_initialized_size(struct file *file, struct ntfs_inode *ni, - const loff_t valid, const loff_t new_valid) { struct inode *inode =3D &ni->vfs_inode; + const loff_t valid =3D ni->i_valid; int err; =20 if (valid >=3D new_valid) @@ -200,8 +232,6 @@ static int ntfs_extend_initialized_size(struct file *fi= le, return 0; } =20 - WARN_ON(is_compressed(ni)); - err =3D iomap_zero_range(inode, valid, new_valid - valid, NULL, &ntfs_iomap_ops, &ntfs_iomap_folio_ops, NULL); if (err) { @@ -291,7 +321,7 @@ static int ntfs_file_mmap_prepare(struct vm_area_desc *= desc) for (; vcn < end; vcn +=3D len) { err =3D attr_data_get_block(ni, vcn, 1, &lcn, &len, &new, true, - NULL); + NULL, false); if (err) goto out; } @@ -302,8 +332,7 @@ static int ntfs_file_mmap_prepare(struct vm_area_desc *= desc) err =3D -EAGAIN; goto out; } - err =3D ntfs_extend_initialized_size(file, ni, - ni->i_valid, to); + err =3D ntfs_extend_initialized_size(file, ni, to); inode_unlock(inode); if (err) goto out; @@ -333,55 +362,23 @@ static int ntfs_extend(struct inode *inode, loff_t po= s, size_t count, ntfs_set_state(ni->mi.sbi, NTFS_DIRTY_DIRTY); =20 if (end > inode->i_size) { + /* + * Normal files: increase 
file size, allocate space. + * Sparse/Compressed: increase file size. No space allocated. + */ err =3D ntfs_set_size(inode, end); if (err) goto out; } =20 if (extend_init && !is_compressed(ni)) { - err =3D ntfs_extend_initialized_size(file, ni, ni->i_valid, pos); + err =3D ntfs_extend_initialized_size(file, ni, pos); if (err) goto out; } else { err =3D 0; } =20 - if (file && is_sparsed(ni)) { - /* - * This code optimizes large writes to sparse file. - * TODO: merge this fragment with fallocate fragment. - */ - struct ntfs_sb_info *sbi =3D ni->mi.sbi; - CLST vcn =3D pos >> sbi->cluster_bits; - CLST cend =3D bytes_to_cluster(sbi, end); - CLST cend_v =3D bytes_to_cluster(sbi, ni->i_valid); - CLST lcn, clen; - bool new; - - if (cend_v > cend) - cend_v =3D cend; - - /* - * Allocate and zero new clusters. - * Zeroing these clusters may be too long. - */ - for (; vcn < cend_v; vcn +=3D clen) { - err =3D attr_data_get_block(ni, vcn, cend_v - vcn, &lcn, - &clen, &new, true, NULL); - if (err) - goto out; - } - /* - * Allocate but not zero new clusters. 
- */ - for (; vcn < cend; vcn +=3D clen) { - err =3D attr_data_get_block(ni, vcn, cend - vcn, &lcn, - &clen, &new, false, NULL); - if (err) - goto out; - } - } - inode_set_mtime_to_ts(inode, inode_set_ctime_current(inode)); mark_inode_dirty(inode); =20 @@ -414,8 +411,9 @@ static int ntfs_truncate(struct inode *inode, loff_t ne= w_size) ni_lock(ni); =20 down_write(&ni->file.run_lock); - err =3D attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, new_size, - &new_valid, ni->mi.sbi->options->prealloc, NULL); + err =3D attr_set_size_ex(ni, ATTR_DATA, NULL, 0, &ni->file.run, new_size, + &new_valid, ni->mi.sbi->options->prealloc, NULL, + false); up_write(&ni->file.run_lock); =20 ni->i_valid =3D new_valid; @@ -507,7 +505,7 @@ static long ntfs_fallocate(struct file *file, int mode,= loff_t vbo, loff_t len) =20 if (mode & FALLOC_FL_PUNCH_HOLE) { u32 frame_size; - loff_t mask, vbo_a, end_a, tmp; + loff_t mask, vbo_a, end_a, tmp, from; =20 err =3D filemap_write_and_wait_range(mapping, vbo_down, LLONG_MAX); @@ -527,28 +525,24 @@ static long ntfs_fallocate(struct file *file, int mod= e, loff_t vbo, loff_t len) =20 /* Process not aligned punch. */ err =3D 0; + if (end > i_size) + end =3D i_size; mask =3D frame_size - 1; vbo_a =3D (vbo + mask) & ~mask; end_a =3D end & ~mask; =20 tmp =3D min(vbo_a, end); - if (tmp > vbo) { - err =3D iomap_zero_range(inode, vbo, tmp - vbo, NULL, - &ntfs_iomap_ops, - &ntfs_iomap_folio_ops, NULL); - if (err) - goto out; - } - - if (vbo < end_a && end_a < end) { - err =3D iomap_zero_range(inode, end_a, end - end_a, NULL, + from =3D min_t(loff_t, ni->i_valid, vbo); + /* Zero head of punch. */ + if (tmp > from) { + err =3D iomap_zero_range(inode, from, tmp - from, NULL, &ntfs_iomap_ops, &ntfs_iomap_folio_ops, NULL); if (err) goto out; } =20 - /* Aligned punch_hole */ + /* Aligned punch_hole. Deallocate clusters. 
*/ if (end_a > vbo_a) { ni_lock(ni); err =3D attr_punch_hole(ni, vbo_a, end_a - vbo_a, NULL); @@ -556,6 +550,15 @@ static long ntfs_fallocate(struct file *file, int mode= , loff_t vbo, loff_t len) if (err) goto out; } + + /* Zero tail of punch. */ + if (vbo < end_a && end_a < end) { + err =3D iomap_zero_range(inode, end_a, end - end_a, NULL, + &ntfs_iomap_ops, + &ntfs_iomap_folio_ops, NULL); + if (err) + goto out; + } } else if (mode & FALLOC_FL_COLLAPSE_RANGE) { /* * Write tail of the last page before removed range since @@ -653,17 +656,26 @@ static long ntfs_fallocate(struct file *file, int mod= e, loff_t vbo, loff_t len) for (; vcn < cend_v; vcn +=3D clen) { err =3D attr_data_get_block(ni, vcn, cend_v - vcn, &lcn, &clen, &new, - true, NULL); + true, NULL, false); if (err) goto out; } + + /* + * Moving up 'valid size'. + */ + err =3D ntfs_extend_initialized_size( + file, ni, (u64)cend_v << cluster_bits); + if (err) + goto out; + /* * Allocate but not zero new clusters. */ for (; vcn < cend; vcn +=3D clen) { err =3D attr_data_get_block(ni, vcn, cend - vcn, &lcn, &clen, &new, - false, NULL); + false, NULL, false); if (err) goto out; } @@ -674,7 +686,7 @@ static long ntfs_fallocate(struct file *file, int mode,= loff_t vbo, loff_t len) /* True - Keep preallocated. */ err =3D attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, i_size, &ni->i_valid, - true, NULL); + true); ni_unlock(ni); if (err) goto out; @@ -816,6 +828,8 @@ static ssize_t ntfs_file_read_iter(struct kiocb *iocb, = struct iov_iter *iter) struct inode *inode =3D file_inode(file); struct ntfs_inode *ni =3D ntfs_i(inode); size_t bytes =3D iov_iter_count(iter); + loff_t valid, i_size, vbo, end; + unsigned int dio_flags; ssize_t err; =20 err =3D check_read_restriction(inode); @@ -835,62 +849,63 @@ static ssize_t ntfs_file_read_iter(struct kiocb *iocb= , struct iov_iter *iter) file->f_ra.ra_pages =3D 0; } =20 - /* Check minimum alignment for dio. 
*/ - if ((iocb->ki_flags & IOCB_DIRECT) && - (is_resident(ni) || ((iocb->ki_pos | iov_iter_alignment(iter)) & - ni->mi.sbi->bdev_blocksize_mask))) { - /* Fallback to buffered I/O */ + /* Fallback to buffered I/O if the inode does not support direct I/O. */ + if (!(iocb->ki_flags & IOCB_DIRECT) || + !ntfs_should_use_dio(iocb, iter)) { iocb->ki_flags &=3D ~IOCB_DIRECT; + return generic_file_read_iter(iocb, iter); } =20 - if (iocb->ki_flags & IOCB_DIRECT) { - loff_t valid, i_size; - loff_t vbo =3D iocb->ki_pos; - loff_t end =3D vbo + bytes; - unsigned int dio_flags =3D IOMAP_DIO_PARTIAL; - - if (iocb->ki_flags & IOCB_NOWAIT) { - if (!inode_trylock_shared(inode)) - return -EAGAIN; - } else { - inode_lock_shared(inode); - } - - valid =3D ni->i_valid; - i_size =3D inode->i_size; + if (iocb->ki_flags & IOCB_NOWAIT) { + if (!inode_trylock_shared(inode)) + return -EAGAIN; + } else { + inode_lock_shared(inode); + } =20 - if (vbo < valid) { - if (valid < end) { - /* read cross 'valid' size. */ - dio_flags |=3D IOMAP_DIO_FORCE_WAIT; - } + vbo =3D iocb->ki_pos; + end =3D vbo + bytes; + dio_flags =3D 0; + valid =3D ni->i_valid; + i_size =3D inode->i_size; =20 - err =3D iomap_dio_rw(iocb, iter, &ntfs_iomap_ops, NULL, - dio_flags, NULL, 0); + if (vbo < valid) { + if (valid < end) { + /* read cross 'valid' size. */ + dio_flags |=3D IOMAP_DIO_FORCE_WAIT; + } =20 - if (err > 0) { - end =3D vbo + err; - if (valid < end) { - size_t to_zero =3D end - valid; - /* Fix iter. */ - iov_iter_revert(iter, to_zero); - iov_iter_zero(to_zero, iter); - } - } - } else if (vbo < i_size) { - if (end > i_size) - bytes =3D i_size - vbo; - iov_iter_zero(bytes, iter); - iocb->ki_pos +=3D bytes; - err =3D bytes; + if (ni->file.run_da.count) { + /* Direct I/O is not compatible with delalloc. 
*/ + err =3D ni_allocate_da_blocks(ni); + if (err) + goto out; } =20 - inode_unlock_shared(inode); - file_accessed(iocb->ki_filp); - return err; + err =3D iomap_dio_rw(iocb, iter, &ntfs_iomap_ops, NULL, dio_flags, + NULL, 0); + + if (err <=3D 0) + goto out; + end =3D vbo + err; + if (valid < end) { + size_t to_zero =3D end - valid; + /* Fix iter. */ + iov_iter_revert(iter, to_zero); + iov_iter_zero(to_zero, iter); + } + } else if (vbo < i_size) { + if (end > i_size) + bytes =3D i_size - vbo; + iov_iter_zero(bytes, iter); + iocb->ki_pos +=3D bytes; + err =3D bytes; } =20 - return generic_file_read_iter(iocb, iter); +out: + inode_unlock_shared(inode); + file_accessed(iocb->ki_filp); + return err; } =20 /* @@ -1011,17 +1026,13 @@ static ssize_t ntfs_compress_write(struct kiocb *io= cb, struct iov_iter *from) off =3D valid & (frame_size - 1); =20 err =3D attr_data_get_block(ni, frame << NTFS_LZNT_CUNIT, 1, &lcn, - &clen, NULL, false, NULL); + &clen, NULL, false, NULL, false); if (err) goto out; =20 if (lcn =3D=3D SPARSE_LCN) { - valid =3D frame_vbo + ((u64)clen << sbi->cluster_bits); - if (ni->i_valid =3D=3D valid) { - err =3D -EINVAL; - goto out; - } - ni->i_valid =3D valid; + ni->i_valid =3D valid =3D + frame_vbo + ((u64)clen << sbi->cluster_bits); continue; } =20 @@ -1207,6 +1218,9 @@ static int check_write_restriction(struct inode *inod= e) return -EOPNOTSUPP; } =20 + if (unlikely(IS_IMMUTABLE(inode))) + return -EPERM; + return 0; } =20 @@ -1218,8 +1232,6 @@ static ssize_t ntfs_file_write_iter(struct kiocb *ioc= b, struct iov_iter *from) struct file *file =3D iocb->ki_filp; struct inode *inode =3D file_inode(file); struct ntfs_inode *ni =3D ntfs_i(inode); - struct super_block *sb =3D inode->i_sb; - struct ntfs_sb_info *sbi =3D sb->s_fs_info; ssize_t ret, err; =20 if (!inode_trylock(inode)) { @@ -1263,15 +1275,11 @@ static ssize_t ntfs_file_write_iter(struct kiocb *i= ocb, struct iov_iter *from) goto out; } =20 - /* Check minimum alignment for dio. 
*/ - if ((iocb->ki_flags & IOCB_DIRECT) && - (is_resident(ni) || ((iocb->ki_pos | iov_iter_alignment(from)) & - sbi->bdev_blocksize_mask))) { - /* Fallback to buffered I/O */ + /* Fallback to buffered I/O if the inode does not support direct I/O. */ + if (!(iocb->ki_flags & IOCB_DIRECT) || + !ntfs_should_use_dio(iocb, from)) { iocb->ki_flags &=3D ~IOCB_DIRECT; - } =20 - if (!(iocb->ki_flags & IOCB_DIRECT)) { ret =3D iomap_file_buffered_write(iocb, from, &ntfs_iomap_ops, &ntfs_iomap_folio_ops, NULL); inode_unlock(inode); @@ -1282,8 +1290,14 @@ static ssize_t ntfs_file_write_iter(struct kiocb *io= cb, struct iov_iter *from) return ret; } =20 - ret =3D iomap_dio_rw(iocb, from, &ntfs_iomap_ops, NULL, IOMAP_DIO_PARTIAL, - NULL, 0); + if (ni->file.run_da.count) { + /* Direct I/O is not compatible with delalloc. */ + ret =3D ni_allocate_da_blocks(ni); + if (ret) + goto out; + } + + ret =3D iomap_dio_rw(iocb, from, &ntfs_iomap_ops, NULL, 0, NULL, 0); =20 if (ret =3D=3D -ENOTBLK) { /* Returns -ENOTBLK in case of a page invalidation failure for writes.*/ @@ -1370,34 +1384,42 @@ int ntfs_file_open(struct inode *inode, struct file= *file) =20 /* * ntfs_file_release - file_operations::release + * + * Called when an inode is released. Note that this is different + * from ntfs_file_open: open gets called at every open, but release + * gets called only when /all/ the files are closed. */ static int ntfs_file_release(struct inode *inode, struct file *file) { - struct ntfs_inode *ni =3D ntfs_i(inode); - struct ntfs_sb_info *sbi =3D ni->mi.sbi; - int err =3D 0; - - /* If we are last writer on the inode, drop the block reservation. */ - if (sbi->options->prealloc && - ((file->f_mode & FMODE_WRITE) && - atomic_read(&inode->i_writecount) =3D=3D 1) - /* - * The only file when inode->i_fop =3D &ntfs_file_operations and - * init_rwsem(&ni->file.run_lock) is not called explicitly is MFT. - * - * Add additional check here. 
- */ - && inode->i_ino !=3D MFT_REC_MFT) { + int err; + struct ntfs_inode *ni; + + if (!(file->f_mode & FMODE_WRITE) || + atomic_read(&inode->i_writecount) !=3D 1 || + inode->i_ino =3D=3D MFT_REC_MFT) { + return 0; + } + + /* Close the last writer on the inode. */ + ni =3D ntfs_i(inode); + + /* Allocate delayed blocks (clusters). */ + err =3D ni_allocate_da_blocks(ni); + if (err) + goto out; + + if (ni->mi.sbi->options->prealloc) { ni_lock(ni); down_write(&ni->file.run_lock); =20 + /* Deallocate preallocated. */ err =3D attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, - i_size_read(inode), &ni->i_valid, false, - NULL); + inode->i_size, &ni->i_valid, false); =20 up_write(&ni->file.run_lock); ni_unlock(ni); } +out: return err; } =20 @@ -1506,7 +1528,7 @@ static loff_t ntfs_llseek(struct file *file, loff_t o= ffset, int whence) =20 if (whence =3D=3D SEEK_DATA || whence =3D=3D SEEK_HOLE) { inode_lock_shared(inode); - /* Scan fragments for hole or data. */ + /* Scan file for hole or data. */ ret =3D ni_seek_data_or_hole(ni, offset, whence =3D=3D SEEK_DATA); inode_unlock_shared(inode); =20 diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c index 0dc28815331e..bd0fa481e4b3 100644 --- a/fs/ntfs3/frecord.c +++ b/fs/ntfs3/frecord.c @@ -123,6 +123,8 @@ void ni_clear(struct ntfs_inode *ni) indx_clear(&ni->dir); else { run_close(&ni->file.run); + ntfs_sub_da(ni->mi.sbi, run_len(&ni->file.run_da)); + run_close(&ni->file.run_da); #ifdef CONFIG_NTFS3_LZX_XPRESS if (ni->file.offs_folio) { /* On-demand allocated page for offsets. 
*/ @@ -2014,7 +2016,8 @@ int ni_decompress_file(struct ntfs_inode *ni) =20 for (vcn =3D vbo >> sbi->cluster_bits; vcn < end; vcn +=3D clen) { err =3D attr_data_get_block(ni, vcn, cend - vcn, &lcn, - &clen, &new, false, NULL); + &clen, &new, false, NULL, + false); if (err) goto out; } @@ -2235,7 +2238,7 @@ int ni_read_frame(struct ntfs_inode *ni, u64 frame_vb= o, struct page **pages, struct runs_tree *run =3D &ni->file.run; u64 valid_size =3D ni->i_valid; u64 vbo_disk; - size_t unc_size; + size_t unc_size =3D 0; u32 frame_size, i, ondisk_size; struct page *pg; struct ATTRIB *attr; @@ -2846,7 +2849,7 @@ loff_t ni_seek_data_or_hole(struct ntfs_inode *ni, lo= ff_t offset, bool data) /* Enumerate all fragments. */ for (vcn =3D offset >> cluster_bits;; vcn +=3D clen) { err =3D attr_data_get_block(ni, vcn, 1, &lcn, &clen, NULL, false, - NULL); + NULL, false); if (err) { return err; } @@ -2886,9 +2889,9 @@ loff_t ni_seek_data_or_hole(struct ntfs_inode *ni, lo= ff_t offset, bool data) } } else { /* - * Adjust the file offset to the next hole in the file greater than or=20 + * Adjust the file offset to the next hole in the file greater than or * equal to offset. If offset points into the middle of a hole, then the - * file offset is set to offset. If there is no hole past offset, then = the=20 + * file offset is set to offset. If there is no hole past offset, then = the * file offset is adjusted to the end of the file * (i.e., there is an implicit hole at the end of any file). */ @@ -3235,3 +3238,62 @@ int ni_write_inode(struct inode *inode, int sync, co= nst char *hint) =20 return 0; } + +/* + * Force to allocate all delay allocated clusters. + */ +int ni_allocate_da_blocks(struct ntfs_inode *ni) +{ + int err; + + ni_lock(ni); + down_write(&ni->file.run_lock); + + err =3D ni_allocate_da_blocks_locked(ni); + + up_write(&ni->file.run_lock); + ni_unlock(ni); + + return err; +} + +/* + * Force to allocate all delay allocated clusters. 
+ */ +int ni_allocate_da_blocks_locked(struct ntfs_inode *ni) +{ + int err; + + if (!ni->file.run_da.count) + return 0; + + if (is_sparsed(ni)) { + CLST vcn, lcn, clen, alen; + bool new; + + /* + * Sparse file allocates clusters in 'attr_data_get_block_locked' + */ + while (run_get_entry(&ni->file.run_da, 0, &vcn, &lcn, &clen)) { + /* TODO: zero=3Dtrue? */ + err =3D attr_data_get_block_locked(ni, vcn, clen, &lcn, + &alen, &new, true, + NULL, true); + if (err) + break; + if (!new) { + err =3D -EINVAL; + break; + } + } + } else { + /* + * Normal file allocates clusters in 'attr_set_size' + */ + err =3D attr_set_size_ex(ni, ATTR_DATA, NULL, 0, &ni->file.run, + ni->vfs_inode.i_size, &ni->i_valid, + false, NULL, true); + } + + return err; +} diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c index 2ef500f1a9fa..5f44e91d7997 100644 --- a/fs/ntfs3/fsntfs.c +++ b/fs/ntfs3/fsntfs.c @@ -445,36 +445,59 @@ int ntfs_look_for_free_space(struct ntfs_sb_info *sbi= , CLST lcn, CLST len, } =20 /* - * ntfs_check_for_free_space + * ntfs_check_free_space * * Check if it is possible to allocate 'clen' clusters and 'mlen' Mft reco= rds */ -bool ntfs_check_for_free_space(struct ntfs_sb_info *sbi, CLST clen, CLST m= len) +bool ntfs_check_free_space(struct ntfs_sb_info *sbi, CLST clen, CLST mlen, + bool da) { size_t free, zlen, avail; struct wnd_bitmap *wnd; + CLST da_clusters =3D ntfs_get_da(sbi); =20 wnd =3D &sbi->used.bitmap; down_read_nested(&wnd->rw_lock, BITMAP_MUTEX_CLUSTERS); free =3D wnd_zeroes(wnd); + + if (free >=3D da_clusters) { + free -=3D da_clusters; + } else { + free =3D 0; + } + zlen =3D min_t(size_t, NTFS_MIN_MFT_ZONE, wnd_zone_len(wnd)); up_read(&wnd->rw_lock); =20 - if (free < zlen + clen) + if (free < zlen + clen) { return false; + } =20 avail =3D free - (zlen + clen); =20 - wnd =3D &sbi->mft.bitmap; - down_read_nested(&wnd->rw_lock, BITMAP_MUTEX_MFT); - free =3D wnd_zeroes(wnd); - zlen =3D wnd_zone_len(wnd); - up_read(&wnd->rw_lock); + /*=20 + * When delalloc is 
active then keep in mind some reserved space. + * The worst case: 1 mft record per each ~500 clusters. + */ + if (da) { + /* 1 mft record per each 1024 clusters. */ + mlen +=3D da_clusters >> 10; + } + + if (mlen || !avail) { + wnd =3D &sbi->mft.bitmap; + down_read_nested(&wnd->rw_lock, BITMAP_MUTEX_MFT); + free =3D wnd_zeroes(wnd); + zlen =3D wnd_zone_len(wnd); + up_read(&wnd->rw_lock); =20 - if (free >=3D zlen + mlen) - return true; + if (free < zlen + mlen && + avail < bytes_to_cluster(sbi, mlen << sbi->record_bits)) { + return false; + } + } =20 - return avail >=3D bytes_to_cluster(sbi, mlen << sbi->record_bits); + return true; } =20 /* @@ -509,8 +532,8 @@ static int ntfs_extend_mft(struct ntfs_sb_info *sbi) =20 /* Step 1: Resize $MFT::DATA. */ down_write(&ni->file.run_lock); - err =3D attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, - new_mft_bytes, NULL, false, &attr); + err =3D attr_set_size_ex(ni, ATTR_DATA, NULL, 0, &ni->file.run, + new_mft_bytes, NULL, false, &attr, false); =20 if (err) { up_write(&ni->file.run_lock); @@ -525,7 +548,7 @@ static int ntfs_extend_mft(struct ntfs_sb_info *sbi) new_bitmap_bytes =3D ntfs3_bitmap_size(new_mft_total); =20 err =3D attr_set_size(ni, ATTR_BITMAP, NULL, 0, &sbi->mft.bitmap.run, - new_bitmap_bytes, &new_bitmap_bytes, true, NULL); + new_bitmap_bytes, &new_bitmap_bytes, true); =20 /* Refresh MFT Zone if necessary. 
*/ down_write_nested(&sbi->used.bitmap.rw_lock, BITMAP_MUTEX_CLUSTERS); @@ -2191,7 +2214,7 @@ int ntfs_insert_security(struct ntfs_sb_info *sbi, if (new_sds_size > ni->vfs_inode.i_size) { err =3D attr_set_size(ni, ATTR_DATA, SDS_NAME, ARRAY_SIZE(SDS_NAME), &ni->file.run, - new_sds_size, &new_sds_size, false, NULL); + new_sds_size, &new_sds_size, false); if (err) goto out; } diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c index d08bee3c20fa..2416c61050f1 100644 --- a/fs/ntfs3/index.c +++ b/fs/ntfs3/index.c @@ -1446,8 +1446,8 @@ static int indx_create_allocate(struct ntfs_index *in= dx, struct ntfs_inode *ni, =20 run_init(&run); =20 - err =3D attr_allocate_clusters(sbi, &run, 0, 0, len, NULL, ALLOCATE_DEF, - &alen, 0, NULL, NULL); + err =3D attr_allocate_clusters(sbi, &run, NULL, 0, 0, len, NULL, + ALLOCATE_DEF, &alen, 0, NULL, NULL); if (err) goto out; =20 @@ -1531,8 +1531,7 @@ static int indx_add_allocate(struct ntfs_index *indx,= struct ntfs_inode *ni, /* Increase bitmap. */ err =3D attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len, &indx->bitmap_run, - ntfs3_bitmap_size(bit + 1), NULL, true, - NULL); + ntfs3_bitmap_size(bit + 1), NULL, true); if (err) goto out1; } @@ -1553,8 +1552,7 @@ static int indx_add_allocate(struct ntfs_index *indx,= struct ntfs_inode *ni, =20 /* Increase allocation. */ err =3D attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len, - &indx->alloc_run, data_size, &data_size, true, - NULL); + &indx->alloc_run, data_size, &data_size, true); if (err) { if (bmp) goto out2; @@ -1572,7 +1570,7 @@ static int indx_add_allocate(struct ntfs_index *indx,= struct ntfs_inode *ni, out2: /* Ops. No space? 
*/ attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len, - &indx->bitmap_run, bmp_size, &bmp_size_v, false, NULL); + &indx->bitmap_run, bmp_size, &bmp_size_v, false); =20 out1: return err; @@ -2106,7 +2104,7 @@ static int indx_shrink(struct ntfs_index *indx, struc= t ntfs_inode *ni, new_data =3D (u64)bit << indx->index_bits; =20 err =3D attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len, - &indx->alloc_run, new_data, &new_data, false, NULL); + &indx->alloc_run, new_data, &new_data, false); if (err) return err; =20 @@ -2118,7 +2116,7 @@ static int indx_shrink(struct ntfs_index *indx, struc= t ntfs_inode *ni, return 0; =20 err =3D attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len, - &indx->bitmap_run, bpb, &bpb, false, NULL); + &indx->bitmap_run, bpb, &bpb, false); =20 return err; } @@ -2333,6 +2331,7 @@ int indx_delete_entry(struct ntfs_index *indx, struct= ntfs_inode *ni, hdr =3D &root->ihdr; e =3D fnd->root_de; n =3D NULL; + ib =3D NULL; } =20 e_size =3D le16_to_cpu(e->size); @@ -2355,7 +2354,7 @@ int indx_delete_entry(struct ntfs_index *indx, struct= ntfs_inode *ni, * Check to see if removing that entry made * the leaf empty. 
*/ - if (ib_is_leaf(ib) && ib_is_empty(ib)) { + if (ib && ib_is_leaf(ib) && ib_is_empty(ib)) { fnd_pop(fnd); fnd_push(fnd2, n, e); } @@ -2603,7 +2602,7 @@ int indx_delete_entry(struct ntfs_index *indx, struct= ntfs_inode *ni, in =3D &s_index_names[indx->type]; =20 err =3D attr_set_size(ni, ATTR_ALLOC, in->name, in->name_len, - &indx->alloc_run, 0, NULL, false, NULL); + &indx->alloc_run, 0, NULL, false); if (in->name =3D=3D I30_NAME) i_size_write(&ni->vfs_inode, 0); =20 @@ -2612,7 +2611,7 @@ int indx_delete_entry(struct ntfs_index *indx, struct= ntfs_inode *ni, run_close(&indx->alloc_run); =20 err =3D attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len, - &indx->bitmap_run, 0, NULL, false, NULL); + &indx->bitmap_run, 0, NULL, false); err =3D ni_remove_attr(ni, ATTR_BITMAP, in->name, in->name_len, false, NULL); run_close(&indx->bitmap_run); diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c index 2147fce8e0b2..aca774f1aed1 100644 --- a/fs/ntfs3/inode.c +++ b/fs/ntfs3/inode.c @@ -40,7 +40,7 @@ static struct inode *ntfs_read_mft(struct inode *inode, u32 rp_fa =3D 0, asize, t32; u16 roff, rsize, names =3D 0, links =3D 0; const struct ATTR_FILE_NAME *fname =3D NULL; - const struct INDEX_ROOT *root; + const struct INDEX_ROOT *root =3D NULL; struct REPARSE_DATA_BUFFER rp; // 0x18 bytes u64 t64; struct MFT_REC *rec; @@ -556,6 +556,25 @@ struct inode *ntfs_iget5(struct super_block *sb, const= struct MFT_REF *ref, =20 static sector_t ntfs_bmap(struct address_space *mapping, sector_t block) { + struct inode *inode =3D mapping->host; + struct ntfs_inode *ni =3D ntfs_i(inode); + + /* + * We can get here for an inline file via the FIBMAP ioctl + */ + if (is_resident(ni)) + return 0; + + if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) && + !run_is_empty(&ni->file.run_da)) { + /* + * With delalloc data we want to sync the file so + * that we can make sure we allocate blocks for file and data + * is in place for the user to see it + */ + ni_allocate_da_blocks(ni); + } + return 
iomap_bmap(mapping, block, &ntfs_iomap_ops); } =20 @@ -722,7 +741,7 @@ int ntfs_set_size(struct inode *inode, u64 new_size) down_write(&ni->file.run_lock); =20 err =3D attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, new_size, - &ni->i_valid, true, NULL); + &ni->i_valid, true); =20 if (!err) { i_size_write(inode, new_size); @@ -735,6 +754,10 @@ int ntfs_set_size(struct inode *inode, u64 new_size) return err; } =20 +/* + * Special value to detect ntfs_writeback_range call + */ +#define WB_NO_DA (struct iomap *)1 /* * Function to get mapping vbo -> lbo. * used with: @@ -760,22 +783,40 @@ static int ntfs_iomap_begin(struct inode *inode, loff= _t offset, loff_t length, loff_t endbyte =3D offset + length; void *res =3D NULL; int err; - CLST lcn, clen, clen_max; + CLST lcn, clen, clen_max =3D 1; bool new_clst =3D false; + bool no_da; + bool zero =3D false; if (unlikely(ntfs3_forced_shutdown(sbi->sb))) return -EIO; =20 - if ((flags & IOMAP_REPORT) && offset > ntfs_get_maxbytes(ni)) { - /* called from fiemap/bmap. */ - return -EINVAL; + if (flags & IOMAP_REPORT) { + if (offset > ntfs_get_maxbytes(ni)) { + /* called from fiemap/bmap. */ + return -EINVAL; + } + + if (offset >=3D inode->i_size) { + /* special code for report. */ + return -ENOENT; + } } =20 - clen_max =3D rw ? (bytes_to_cluster(sbi, endbyte) - vcn) : 1; + if (IOMAP_ZERO =3D=3D flags && (endbyte & sbi->cluster_mask)) { + rw =3D true; + } else if (rw) { + clen_max =3D bytes_to_cluster(sbi, endbyte) - vcn; + } =20 - err =3D attr_data_get_block( - ni, vcn, clen_max, &lcn, &clen, rw ? &new_clst : NULL, - flags =3D=3D IOMAP_WRITE && (off || (endbyte & sbi->cluster_mask)), - &res); + /*=20 + * Force to allocate clusters if directIO(write) or writeback_range. + * NOTE: attr_data_get_block allocates clusters only for sparse file. + * Normal file allocates clusters in attr_set_size. 
+ */ + no_da =3D flags =3D=3D (IOMAP_DIRECT | IOMAP_WRITE) || srcmap =3D=3D WB_N= O_DA; + + err =3D attr_data_get_block(ni, vcn, clen_max, &lcn, &clen, + rw ? &new_clst : NULL, zero, &res, no_da); =20 if (err) { return err; @@ -795,6 +836,8 @@ static int ntfs_iomap_begin(struct inode *inode, loff_t= offset, loff_t length, lcn =3D SPARSE_LCN; } =20 + iomap->flags =3D new_clst ? IOMAP_F_NEW : 0; + if (lcn =3D=3D RESIDENT_LCN) { if (offset >=3D clen) { kfree(res); @@ -809,7 +852,6 @@ static int ntfs_iomap_begin(struct inode *inode, loff_t= offset, loff_t length, iomap->type =3D IOMAP_INLINE; iomap->offset =3D 0; iomap->length =3D clen; /* resident size in bytes. */ - iomap->flags =3D 0; return 0; } =20 @@ -818,42 +860,52 @@ static int ntfs_iomap_begin(struct inode *inode, loff= _t offset, loff_t length, return -EINVAL; } =20 + iomap->bdev =3D inode->i_sb->s_bdev; + iomap->offset =3D offset; + iomap->length =3D ((loff_t)clen << cluster_bits) - off; + if (lcn =3D=3D COMPRESSED_LCN) { /* should never be here. */ return -EOPNOTSUPP; } =20 - iomap->flags =3D new_clst ? IOMAP_F_NEW : 0; - iomap->bdev =3D inode->i_sb->s_bdev; - - /* Translate clusters into bytes. */ - iomap->offset =3D offset; - iomap->addr =3D ((loff_t)lcn << cluster_bits) + off; - iomap->length =3D ((loff_t)clen << cluster_bits) - off; - if (length && iomap->length > length) - iomap->length =3D length; - else - endbyte =3D offset + iomap->length; - - if (lcn =3D=3D SPARSE_LCN) { + if (lcn =3D=3D DELALLOC_LCN) { + iomap->type =3D IOMAP_DELALLOC; iomap->addr =3D IOMAP_NULL_ADDR; - iomap->type =3D IOMAP_HOLE; - } else if (endbyte <=3D ni->i_valid) { - iomap->type =3D IOMAP_MAPPED; - } else if (offset < ni->i_valid) { - iomap->type =3D IOMAP_MAPPED; - if (flags & IOMAP_REPORT) - iomap->length =3D ni->i_valid - offset; - } else if (rw || (flags & IOMAP_ZERO)) { - iomap->type =3D IOMAP_MAPPED; } else { - iomap->type =3D IOMAP_UNWRITTEN; + + /* Translate clusters into bytes. 
*/ + iomap->addr =3D ((loff_t)lcn << cluster_bits) + off; + if (length && iomap->length > length) + iomap->length =3D length; + else + endbyte =3D offset + iomap->length; + + if (lcn =3D=3D SPARSE_LCN) { + iomap->addr =3D IOMAP_NULL_ADDR; + iomap->type =3D IOMAP_HOLE; + // if (IOMAP_ZERO =3D=3D flags && !off) { + // iomap->length =3D (endbyte - offset) & + // sbi->cluster_mask_inv; + // } + } else if (endbyte <=3D ni->i_valid) { + iomap->type =3D IOMAP_MAPPED; + } else if (offset < ni->i_valid) { + iomap->type =3D IOMAP_MAPPED; + if (flags & IOMAP_REPORT) + iomap->length =3D ni->i_valid - offset; + } else if (rw || (flags & IOMAP_ZERO)) { + iomap->type =3D IOMAP_MAPPED; + } else { + iomap->type =3D IOMAP_UNWRITTEN; + } } =20 - if ((flags & IOMAP_ZERO) && iomap->type =3D=3D IOMAP_MAPPED) { + if ((flags & IOMAP_ZERO) && + (iomap->type =3D=3D IOMAP_MAPPED || iomap->type =3D=3D IOMAP_DELALLOC= )) { /* Avoid too large requests. */ u32 tail; - u32 off_a =3D iomap->addr & (PAGE_SIZE - 1); + u32 off_a =3D offset & (PAGE_SIZE - 1); if (off_a) tail =3D PAGE_SIZE - off_a; else @@ -904,7 +956,9 @@ static int ntfs_iomap_end(struct inode *inode, loff_t p= os, loff_t length, } } =20 - if ((flags & IOMAP_ZERO) && iomap->type =3D=3D IOMAP_MAPPED) { + if ((flags & IOMAP_ZERO) && + (iomap->type =3D=3D IOMAP_MAPPED || iomap->type =3D=3D IOMAP_DELALLOC= )) { + /* Pair for code in ntfs_iomap_begin. */ balance_dirty_pages_ratelimited(inode->i_mapping); cond_resched(); } @@ -933,7 +987,7 @@ static void ntfs_iomap_put_folio(struct inode *inode, l= off_t pos, loff_t f_pos =3D folio_pos(folio); loff_t f_end =3D f_pos + f_size; =20 - if (ni->i_valid < end && end < f_end) { + if (ni->i_valid <=3D end && end < f_end) { /* zero range [end - f_end). */ /* The only thing ntfs_iomap_put_folio used for. 
*/ folio_zero_segment(folio, offset_in_folio(folio, end), f_size); @@ -942,23 +996,31 @@ static void ntfs_iomap_put_folio(struct inode *inode,= loff_t pos, folio_put(folio); } =20 +/* + * iomap_writeback_ops::writeback_range + */ static ssize_t ntfs_writeback_range(struct iomap_writepage_ctx *wpc, struct folio *folio, u64 offset, unsigned int len, u64 end_pos) { struct iomap *iomap =3D &wpc->iomap; - struct inode *inode =3D wpc->inode; - /* Check iomap position. */ - if (!(iomap->offset <=3D offset && - offset < iomap->offset + iomap->length)) { + if (iomap->offset + iomap->length <=3D offset || offset < iomap->offset) { int err; + struct inode *inode =3D wpc->inode; + struct ntfs_inode *ni =3D ntfs_i(inode); struct ntfs_sb_info *sbi =3D ntfs_sb(inode->i_sb); loff_t i_size_up =3D ntfs_up_cluster(sbi, inode->i_size); loff_t len_max =3D i_size_up - offset; =20 - err =3D ntfs_iomap_begin(inode, offset, len_max, IOMAP_WRITE, - iomap, NULL); + err =3D ni->file.run_da.count ? ni_allocate_da_blocks(ni) : 0; + + if (!err) { + /* Use local special value 'WB_NO_DA' to disable delalloc. */ + err =3D ntfs_iomap_begin(inode, offset, len_max, + IOMAP_WRITE, iomap, WB_NO_DA); + } + if (err) { ntfs_set_state(sbi, NTFS_DIRTY_DIRTY); return err; @@ -1532,9 +1594,10 @@ int ntfs_create_inode(struct mnt_idmap *idmap, struc= t inode *dir, attr->nres.alloc_size =3D cpu_to_le64(ntfs_up_cluster(sbi, nsize)); =20 - err =3D attr_allocate_clusters(sbi, &ni->file.run, 0, 0, - clst, NULL, ALLOCATE_DEF, - &alen, 0, NULL, NULL); + err =3D attr_allocate_clusters(sbi, &ni->file.run, NULL, + 0, 0, clst, NULL, + ALLOCATE_DEF, &alen, 0, + NULL, NULL); if (err) goto out5; =20 @@ -1675,7 +1738,7 @@ int ntfs_create_inode(struct mnt_idmap *idmap, struct= inode *dir, /* Delete ATTR_EA, if non-resident. 
*/ struct runs_tree run; run_init(&run); - attr_set_size(ni, ATTR_EA, NULL, 0, &run, 0, NULL, false, NULL); + attr_set_size(ni, ATTR_EA, NULL, 0, &run, 0, NULL, false); run_close(&run); } =20 diff --git a/fs/ntfs3/ntfs.h b/fs/ntfs3/ntfs.h index ae0a6ba102c0..892f13e65d42 100644 --- a/fs/ntfs3/ntfs.h +++ b/fs/ntfs3/ntfs.h @@ -77,11 +77,14 @@ static_assert(sizeof(size_t) =3D=3D 8); typedef u32 CLST; #endif =20 +/* On-disk sparsed cluster is marked as -1. */ #define SPARSE_LCN64 ((u64)-1) #define SPARSE_LCN ((CLST)-1) +/* Below is virtual (not on-disk) values. */ #define RESIDENT_LCN ((CLST)-2) #define COMPRESSED_LCN ((CLST)-3) #define EOF_LCN ((CLST)-4) +#define DELALLOC_LCN ((CLST)-5) =20 enum RECORD_NUM { MFT_REC_MFT =3D 0, diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h index b7017dd4d7cd..a705923de75e 100644 --- a/fs/ntfs3/ntfs_fs.h +++ b/fs/ntfs3/ntfs_fs.h @@ -108,6 +108,7 @@ struct ntfs_mount_options { unsigned force : 1; /* RW mount dirty volume. */ unsigned prealloc : 1; /* Preallocate space when file is growing. */ unsigned nocase : 1; /* case insensitive. */ + unsigned delalloc : 1; /* delay allocation. */ }; =20 /* Special value to unpack and deallocate. */ @@ -132,7 +133,8 @@ struct ntfs_buffers { enum ALLOCATE_OPT { ALLOCATE_DEF =3D 0, // Allocate all clusters. ALLOCATE_MFT =3D 1, // Allocate for MFT. - ALLOCATE_ZERO =3D 2, // Zeroout new allocated clusters + ALLOCATE_ZERO =3D 2, // Zeroout new allocated clusters. + ALLOCATE_ONE_FR =3D 4, // Allocate one fragment only. 
}; =20 enum bitmap_mutex_classes { @@ -213,7 +215,7 @@ struct ntfs_sb_info { =20 u32 discard_granularity; u64 discard_granularity_mask_inv; // ~(discard_granularity_mask_inv-1) - u32 bdev_blocksize_mask; // bdev_logical_block_size(bdev) - 1; + u32 bdev_blocksize; // bdev_logical_block_size(bdev) =20 u32 cluster_size; // bytes per cluster u32 cluster_mask; // =3D=3D cluster_size - 1 @@ -272,6 +274,12 @@ struct ntfs_sb_info { struct { struct wnd_bitmap bitmap; // $Bitmap::Data CLST next_free_lcn; + /* Total sum of delay allocated clusters in all files. */ +#ifdef CONFIG_NTFS3_64BIT_CLUSTER + atomic64_t da; +#else + atomic_t da; +#endif } used; =20 struct { @@ -379,7 +387,7 @@ struct ntfs_inode { */ u8 mi_loaded; =20 - /*=20 + /* * Use this field to avoid any write(s). * If inode is bad during initialization - use make_bad_inode * If inode is bad during operations - use this field @@ -390,7 +398,14 @@ struct ntfs_inode { struct ntfs_index dir; struct { struct rw_semaphore run_lock; + /* Unpacked runs from just one record. */ struct runs_tree run; + /*=20 + * Pairs [vcn, len] for all delay allocated clusters. + * Normal file always contains delayed clusters in one fragment. + * TODO: use 2 CLST per pair instead of 3. 
+ */ + struct runs_tree run_da; #ifdef CONFIG_NTFS3_LZX_XPRESS struct folio *offs_folio; #endif @@ -430,19 +445,32 @@ enum REPARSE_SIGN { =20 /* Functions from attrib.c */ int attr_allocate_clusters(struct ntfs_sb_info *sbi, struct runs_tree *run, - CLST vcn, CLST lcn, CLST len, CLST *pre_alloc, - enum ALLOCATE_OPT opt, CLST *alen, const size_t fr, - CLST *new_lcn, CLST *new_len); + struct runs_tree *run_da, CLST vcn, CLST lcn, + CLST len, CLST *pre_alloc, enum ALLOCATE_OPT opt, + CLST *alen, const size_t fr, CLST *new_lcn, + CLST *new_len); int attr_make_nonresident(struct ntfs_inode *ni, struct ATTRIB *attr, struct ATTR_LIST_ENTRY *le, struct mft_inode *mi, u64 new_size, struct runs_tree *run, struct ATTRIB **ins_attr, struct page *page); -int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE type, - const __le16 *name, u8 name_len, struct runs_tree *run, - u64 new_size, const u64 *new_valid, bool keep_prealloc, - struct ATTRIB **ret); +int attr_set_size_ex(struct ntfs_inode *ni, enum ATTR_TYPE type, + const __le16 *name, u8 name_len, struct runs_tree *run, + u64 new_size, const u64 *new_valid, bool keep_prealloc, + struct ATTRIB **ret, bool no_da); +static inline int attr_set_size(struct ntfs_inode *ni, enum ATTR_TYPE type, + const __le16 *name, u8 name_len, + struct runs_tree *run, u64 new_size, + const u64 *new_valid, bool keep_prealloc) +{ + return attr_set_size_ex(ni, type, name, name_len, run, new_size, + new_valid, keep_prealloc, NULL, false); +} int attr_data_get_block(struct ntfs_inode *ni, CLST vcn, CLST clen, CLST *= lcn, - CLST *len, bool *new, bool zero, void **res); + CLST *len, bool *new, bool zero, void **res, + bool no_da); +int attr_data_get_block_locked(struct ntfs_inode *ni, CLST vcn, CLST clen, + CLST *lcn, CLST *len, bool *new, bool zero, + void **res, bool no_da); int attr_data_write_resident(struct ntfs_inode *ni, struct folio *folio); int attr_load_runs_vcn(struct ntfs_inode *ni, enum ATTR_TYPE type, const __le16 *name, u8 name_len, 
struct runs_tree *run, @@ -590,6 +618,8 @@ int ni_rename(struct ntfs_inode *dir_ni, struct ntfs_in= ode *new_dir_ni, bool ni_is_dirty(struct inode *inode); loff_t ni_seek_data_or_hole(struct ntfs_inode *ni, loff_t offset, bool dat= a); int ni_write_parents(struct ntfs_inode *ni, int sync); +int ni_allocate_da_blocks(struct ntfs_inode *ni); +int ni_allocate_da_blocks_locked(struct ntfs_inode *ni); =20 /* Globals from fslog.c */ bool check_index_header(const struct INDEX_HDR *hdr, size_t bytes); @@ -605,7 +635,8 @@ int ntfs_loadlog_and_replay(struct ntfs_inode *ni, stru= ct ntfs_sb_info *sbi); int ntfs_look_for_free_space(struct ntfs_sb_info *sbi, CLST lcn, CLST len, CLST *new_lcn, CLST *new_len, enum ALLOCATE_OPT opt); -bool ntfs_check_for_free_space(struct ntfs_sb_info *sbi, CLST clen, CLST m= len); +bool ntfs_check_free_space(struct ntfs_sb_info *sbi, CLST clen, CLST mlen, + bool da); int ntfs_look_free_mft(struct ntfs_sb_info *sbi, CLST *rno, bool mft, struct ntfs_inode *ni, struct mft_inode **mi); void ntfs_mark_rec_free(struct ntfs_sb_info *sbi, CLST rno, bool is_mft); @@ -831,7 +862,8 @@ void run_truncate_around(struct runs_tree *run, CLST vc= n); bool run_add_entry(struct runs_tree *run, CLST vcn, CLST lcn, CLST len, bool is_mft); bool run_collapse_range(struct runs_tree *run, CLST vcn, CLST len, CLST su= b); -bool run_insert_range(struct runs_tree *run, CLST vcn, CLST len); +int run_insert_range(struct runs_tree *run, CLST vcn, CLST len); +int run_insert_range_da(struct runs_tree *run, CLST vcn, CLST len); bool run_get_entry(const struct runs_tree *run, size_t index, CLST *vcn, CLST *lcn, CLST *len); bool run_is_mapped_full(const struct runs_tree *run, CLST svcn, CLST evcn); @@ -851,6 +883,9 @@ int run_unpack_ex(struct runs_tree *run, struct ntfs_sb= _info *sbi, CLST ino, #endif int run_get_highest_vcn(CLST vcn, const u8 *run_buf, u64 *highest_vcn); int run_clone(const struct runs_tree *run, struct runs_tree *new_run); +bool run_remove_range(struct runs_tree 
*run, CLST vcn, CLST len, CLST *don= e); +CLST run_len(const struct runs_tree *run); +CLST run_get_max_vcn(const struct runs_tree *run); =20 /* Globals from super.c */ void *ntfs_set_shared(void *ptr, u32 bytes); @@ -1027,6 +1062,36 @@ static inline int ntfs3_forced_shutdown(struct super= _block *sb) return test_bit(NTFS_FLAGS_SHUTDOWN_BIT, &ntfs_sb(sb)->flags); } =20 +/* Returns total sum of delay allocated clusters in all files. */ +static inline CLST ntfs_get_da(struct ntfs_sb_info *sbi) +{ +#ifdef CONFIG_NTFS3_64BIT_CLUSTER + return atomic64_read(&sbi->used.da); +#else + return atomic_read(&sbi->used.da); +#endif +} + +/* Update total count of delay allocated clusters. */ +static inline void ntfs_add_da(struct ntfs_sb_info *sbi, CLST da) +{ +#ifdef CONFIG_NTFS3_64BIT_CLUSTER + atomic64_add(da, &sbi->used.da); +#else + atomic_add(da, &sbi->used.da); +#endif +} + +/* Update total count of delay allocated clusters. */ +static inline void ntfs_sub_da(struct ntfs_sb_info *sbi, CLST da) +{ +#ifdef CONFIG_NTFS3_64BIT_CLUSTER + atomic64_sub(da, &sbi->used.da); +#else + atomic_sub(da, &sbi->used.da); +#endif +} + /* * ntfs_up_cluster - Align up on cluster boundary. */ diff --git a/fs/ntfs3/run.c b/fs/ntfs3/run.c index dc59cad4fa37..c0324cdc174d 100644 --- a/fs/ntfs3/run.c +++ b/fs/ntfs3/run.c @@ -454,7 +454,7 @@ bool run_add_entry(struct runs_tree *run, CLST vcn, CLS= T lcn, CLST len, =20 /* * If existing range fits then were done. - * Otherwise extend found one and fall back to range jocode. + * Otherwise extend found one and fall back to range join code. */ if (r->vcn + r->len < vcn + len) r->len +=3D len - ((r->vcn + r->len) - vcn); @@ -482,7 +482,8 @@ bool run_add_entry(struct runs_tree *run, CLST vcn, CLS= T lcn, CLST len, return true; } =20 -/* run_collapse_range +/* + * run_collapse_range * * Helper for attr_collapse_range(), * which is helper for fallocate(collapse_range). 
@@ -493,8 +494,9 @@ bool run_collapse_range(struct runs_tree *run, CLST vcn, CLST len, CLST sub)
 	struct ntfs_run *r, *e, *eat_start, *eat_end;
 	CLST end;
 
-	if (WARN_ON(!run_lookup(run, vcn, &index)))
-		return true; /* Should never be here. */
+	if (!run_lookup(run, vcn, &index) && index >= run->count) {
+		return true;
+	}
 
 	e = run->runs + run->count;
 	r = run->runs + index;
@@ -560,13 +562,13 @@ bool run_collapse_range(struct runs_tree *run, CLST vcn, CLST len, CLST sub)
  * Helper for attr_insert_range(),
  * which is helper for fallocate(insert_range).
  */
-bool run_insert_range(struct runs_tree *run, CLST vcn, CLST len)
+int run_insert_range(struct runs_tree *run, CLST vcn, CLST len)
 {
 	size_t index;
 	struct ntfs_run *r, *e;
 
 	if (WARN_ON(!run_lookup(run, vcn, &index)))
-		return false; /* Should never be here. */
+		return -EINVAL; /* Should never be here. */
 
 	e = run->runs + run->count;
 	r = run->runs + index;
@@ -588,13 +590,49 @@ bool run_insert_range(struct runs_tree *run, CLST vcn, CLST len)
 		r->len = len1;
 
 		if (!run_add_entry(run, vcn + len, lcn2, len2, false))
-			return false;
+			return -ENOMEM;
 	}
 
 	if (!run_add_entry(run, vcn, SPARSE_LCN, len, false))
-		return false;
+		return -ENOMEM;
 
-	return true;
+	return 0;
+}
+
+/*
+ * run_insert_range_da
+ *
+ * Helper for attr_insert_range(),
+ * which is helper for fallocate(insert_range).
+ */
+int run_insert_range_da(struct runs_tree *run, CLST vcn, CLST len)
+{
+	struct ntfs_run *r, *r0 = NULL, *e = run->runs + run->count;
+
+	for (r = run->runs; r < e; r++) {
+		CLST end = r->vcn + r->len;
+
+		if (vcn >= end)
+			continue;
+
+		if (!r0 && r->vcn < vcn) {
+			r0 = r;
+		} else {
+			r->vcn += len;
+		}
+	}
+
+	if (r0) {
+		/* Split fragment. */
+		CLST len1 = vcn - r0->vcn;
+		CLST len2 = r0->len - len1;
+
+		r0->len = len1;
+		if (!run_add_entry(run, vcn + len, SPARSE_LCN, len2, false))
+			return -ENOMEM;
+	}
+
+	return 0;
 }
 
 /*
@@ -1209,3 +1247,96 @@ int run_clone(const struct runs_tree *run, struct runs_tree *new_run)
 	new_run->count = run->count;
 	return 0;
 }
+
+/*
+ * run_remove_range
+ *
+ * Remove clusters [vcn, vcn + len) from the run, count removed in *done.
+ */
+bool run_remove_range(struct runs_tree *run, CLST vcn, CLST len, CLST *done)
+{
+	size_t index, eat;
+	struct ntfs_run *r, *e, *eat_start, *eat_end;
+	CLST end, d;
+
+	*done = 0;
+
+	/* Fast check. */
+	if (!run->count)
+		return true;
+
+	if (!run_lookup(run, vcn, &index) && index >= run->count) {
+		/* No entries in this run. */
+		return true;
+	}
+
+	e = run->runs + run->count;
+	r = run->runs + index;
+	end = vcn + len;
+
+	if (vcn > r->vcn) {
+		CLST r_end = r->vcn + r->len;
+		d = vcn - r->vcn;
+
+		if (r_end > end) {
+			/* Remove a middle part, split. */
+			*done += len;
+			r->len = d;
+			return run_add_entry(run, end, r->lcn, r_end - end,
+					     false);
+		}
+		/* Remove tail of run. */
+		*done += r->len - d;
+		r->len = d;
+		r += 1;
+	}
+
+	eat_start = r;
+	eat_end = r;
+
+	for (; r < e; r++) {
+		if (r->vcn >= end)
+			continue;
+
+		if (r->vcn + r->len <= end) {
+			/* Eat this run. */
+			*done += r->len;
+			eat_end = r + 1;
+			continue;
+		}
+
+		d = end - r->vcn;
+		*done += d;
+		if (r->lcn != SPARSE_LCN)
+			r->lcn += d;
+		r->len -= d;
+		r->vcn = end;
+	}
+
+	eat = eat_end - eat_start;
+	memmove(eat_start, eat_end, (e - eat_end) * sizeof(*r));
+	run->count -= eat;
+
+	return true;
+}
+
+CLST run_len(const struct runs_tree *run)
+{
+	const struct ntfs_run *r, *e;
+	CLST len = 0;
+
+	for (r = run->runs, e = r + run->count; r < e; r++) {
+		len += r->len;
+	}
+
+	return len;
+}
+
+CLST run_get_max_vcn(const struct runs_tree *run)
+{
+	const struct ntfs_run *r;
+	if (!run->count)
+		return 0;
+
+	r = run->runs + run->count - 1;
+	return r->vcn + r->len;
+}
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index a3c07f2b604f..27411203082a 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -269,6 +269,8 @@ enum Opt {
 	Opt_prealloc,
 	Opt_prealloc_bool,
 	Opt_nocase,
+	Opt_delalloc,
+	Opt_delalloc_bool,
 	Opt_err,
 };
 
@@ -293,6 +295,8 @@ static const struct fs_parameter_spec ntfs_fs_parameters[] = {
 	fsparam_flag("prealloc", Opt_prealloc),
 	fsparam_bool("prealloc", Opt_prealloc_bool),
 	fsparam_flag("nocase", Opt_nocase),
+	fsparam_flag("delalloc", Opt_delalloc),
+	fsparam_bool("delalloc", Opt_delalloc_bool),
 	{}
 };
 // clang-format on
@@ -410,6 +414,12 @@ static int ntfs_fs_parse_param(struct fs_context *fc,
 	case Opt_nocase:
 		opts->nocase = 1;
 		break;
+	case Opt_delalloc:
+		opts->delalloc = 1;
+		break;
+	case Opt_delalloc_bool:
+		opts->delalloc = result.boolean;
+		break;
 	default:
 		/* Should not be here unless we forget add case. */
 		return -EINVAL;
@@ -726,14 +736,22 @@ static int ntfs_statfs(struct dentry *dentry, struct kstatfs *buf)
 	struct super_block *sb = dentry->d_sb;
 	struct ntfs_sb_info *sbi = sb->s_fs_info;
 	struct wnd_bitmap *wnd = &sbi->used.bitmap;
+	CLST da_clusters = ntfs_get_da(sbi);
 
 	buf->f_type = sb->s_magic;
-	buf->f_bsize = sbi->cluster_size;
+	buf->f_bsize = buf->f_frsize = sbi->cluster_size;
 	buf->f_blocks = wnd->nbits;
 
-	buf->f_bfree = buf->f_bavail = wnd_zeroes(wnd);
+	buf->f_bfree = wnd_zeroes(wnd);
+	if (buf->f_bfree > da_clusters) {
+		buf->f_bfree -= da_clusters;
+	} else {
+		buf->f_bfree = 0;
+	}
+	buf->f_bavail = buf->f_bfree;
+
 	buf->f_fsid.val[0] = sbi->volume.ser_num;
-	buf->f_fsid.val[1] = (sbi->volume.ser_num >> 32);
+	buf->f_fsid.val[1] = sbi->volume.ser_num >> 32;
 	buf->f_namelen = NTFS_NAME_LEN;
 
 	return 0;
@@ -778,6 +796,8 @@ static int ntfs_show_options(struct seq_file *m, struct dentry *root)
 		seq_puts(m, ",prealloc");
 	if (opts->nocase)
 		seq_puts(m, ",nocase");
+	if (opts->delalloc)
+		seq_puts(m, ",delalloc");
 
 	return 0;
 }
@@ -1088,7 +1108,7 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
 		dev_size += sector_size - 1;
 	}
 
-	sbi->bdev_blocksize_mask = max(boot_sector_size, sector_size) - 1;
+	sbi->bdev_blocksize = max(boot_sector_size, sector_size);
 	sbi->mft.lbo = mlcn << cluster_bits;
 	sbi->mft.lbo2 = mlcn2 << cluster_bits;
 
diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
index c93df55e98d0..2302539852ef 100644
--- a/fs/ntfs3/xattr.c
+++ b/fs/ntfs3/xattr.c
@@ -460,7 +460,7 @@ static noinline int ntfs_set_ea(struct inode *inode, const char *name,
 
 	new_sz = size;
 	err = attr_set_size(ni, ATTR_EA, NULL, 0, &ea_run, new_sz, &new_sz,
-			    false, NULL);
+			    false);
 	if (err)
 		goto out;
 
-- 
2.43.0