From: Namjae Jeon
To: viro@zeniv.linux.org.uk, brauner@kernel.org, hch@lst.de,
	tytso@mit.edu, willy@infradead.org, jack@suse.cz, djwong@kernel.org,
	dsterba@suse.com, pali@kernel.org, amir73il@gmail.com, xiang@kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Namjae Jeon, Hyunchul Lee
Subject: [PATCH v8 11/17] ntfs: update runlist handling and cluster allocator
Date: Fri, 6 Feb 2026 16:18:54 +0900
Message-Id: <20260206071900.6800-12-linkinjeon@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20260206071900.6800-1-linkinjeon@kernel.org>
References: <20260206071900.6800-1-linkinjeon@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Update runlist handling and cluster allocation to support contiguous
allocations and filesystem trimming. Improve the runlist API to handle
allocation failures and introduce discard support.

Acked-by: Christoph Hellwig
Signed-off-by: Hyunchul Lee
Signed-off-by: Namjae Jeon
---
 fs/ntfs/bitmap.c   |  196 +++++--
 fs/ntfs/lcnalloc.c |  753 ++++++++++++------------
 fs/ntfs/runlist.c  | 1353 +++++++++++++++++++++++++-------------------
 3 files changed, 1315 insertions(+), 987 deletions(-)

diff --git a/fs/ntfs/bitmap.c b/fs/ntfs/bitmap.c
index 0675b2400873..ed7770853fa8 100644
--- a/fs/ntfs/bitmap.c
+++ b/fs/ntfs/bitmap.c
@@ -1,20 +1,107 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
- * bitmap.c - NTFS kernel bitmap handling.  Part of the Linux-NTFS project.
+ * NTFS kernel bitmap handling.
  *
  * Copyright (c) 2004-2005 Anton Altaparmakov
+ * Copyright (c) 2025 LG Electronics Co., Ltd.
  */
 
-#ifdef NTFS_RW
-
-#include
+#include
+#include
 
 #include "bitmap.h"
-#include "debug.h"
-#include "aops.h"
 #include "ntfs.h"
 
-/**
+int ntfs_trim_fs(struct ntfs_volume *vol, struct fstrim_range *range)
+{
+	size_t buf_clusters;
+	pgoff_t index, start_index, end_index;
+	struct file_ra_state *ra;
+	struct folio *folio;
+	unsigned long *bitmap;
+	char *kaddr;
+	u64 end, trimmed = 0, start_buf, end_buf, end_cluster;
+	u64 start_cluster = ntfs_bytes_to_cluster(vol, range->start);
+	u32 dq = bdev_discard_granularity(vol->sb->s_bdev);
+	int ret = 0;
+
+	if (!dq)
+		dq = vol->cluster_size;
+
+	if (start_cluster >= vol->nr_clusters)
+		return -EINVAL;
+
+	if (range->len == (u64)-1)
+		end_cluster = vol->nr_clusters;
+	else {
+		end_cluster = ntfs_bytes_to_cluster(vol,
+				(range->start + range->len + vol->cluster_size - 1));
+		if (end_cluster > vol->nr_clusters)
+			end_cluster = vol->nr_clusters;
+	}
+
+	ra = kzalloc(sizeof(*ra), GFP_NOFS);
+	if (!ra)
+		return -ENOMEM;
+
+	buf_clusters = PAGE_SIZE * 8;
+	start_index = start_cluster >> 15;
+	end_index = (end_cluster + buf_clusters - 1) >> 15;
+
+	for (index = start_index; index < end_index; index++) {
+		folio = ntfs_get_locked_folio(vol->lcnbmp_ino->i_mapping,
+					      index, end_index, ra);
+		if (IS_ERR(folio)) {
+			ret = PTR_ERR(folio);
+			goto out_free;
+		}
+
+		kaddr = kmap_local_folio(folio, 0);
+		bitmap = (unsigned long *)kaddr;
+
+		start_buf = max_t(u64, index * buf_clusters, start_cluster);
+		end_buf = min_t(u64, (index + 1) * buf_clusters, end_cluster);
+
+		end = start_buf;
+		while (end < end_buf) {
+			u64 aligned_start, aligned_count;
+			u64 start = find_next_zero_bit(bitmap, end_buf - start_buf,
+						       end - start_buf) + start_buf;
+			if (start >= end_buf)
+				break;
+
+			end = find_next_bit(bitmap, end_buf - start_buf,
+					    start - start_buf) + start_buf;
+
+			aligned_start =
+				ALIGN(ntfs_cluster_to_bytes(vol, start), dq);
+			aligned_count =
+				ALIGN_DOWN(ntfs_cluster_to_bytes(vol, end - start), dq);
+			if (aligned_count >= range->minlen) {
+				ret = blkdev_issue_discard(vol->sb->s_bdev, aligned_start >> 9,
+							   aligned_count >> 9, GFP_NOFS);
+				if (ret)
+					goto out_unmap;
+				trimmed += aligned_count;
+			}
+		}
+
+out_unmap:
+		kunmap_local(kaddr);
+		folio_unlock(folio);
+		folio_put(folio);
+
+		if (ret)
+			goto out_free;
+	}
+
+	range->len = trimmed;
+
+out_free:
+	kfree(ra);
+	return ret;
+}
+
+/*
  * __ntfs_bitmap_set_bits_in_run - set a run of bits in a bitmap to a value
  * @vi:		vfs inode describing the bitmap
  * @start_bit:	first bit to set
@@ -36,19 +123,21 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 	s64 cnt = count;
 	pgoff_t index, end_index;
 	struct address_space *mapping;
-	struct page *page;
+	struct folio *folio;
 	u8 *kaddr;
 	int pos, len;
 	u8 bit;
+	struct ntfs_inode *ni = NTFS_I(vi);
+	struct ntfs_volume *vol = ni->vol;
 
-	BUG_ON(!vi);
-	ntfs_debug("Entering for i_ino 0x%lx, start_bit 0x%llx, count 0x%llx, "
-			"value %u.%s", vi->i_ino, (unsigned long long)start_bit,
+	ntfs_debug("Entering for i_ino 0x%lx, start_bit 0x%llx, count 0x%llx, value %u.%s",
+			vi->i_ino, (unsigned long long)start_bit,
 			(unsigned long long)cnt, (unsigned int)value,
 			is_rollback ? " (rollback)" : "");
-	BUG_ON(start_bit < 0);
-	BUG_ON(cnt < 0);
-	BUG_ON(value > 1);
+
+	if (start_bit < 0 || cnt < 0 || value > 1)
+		return -EINVAL;
+
 	/*
 	 * Calculate the indices for the pages containing the first and last
 	 * bits, i.e. @start_bit and @start_bit + @cnt - 1, respectively.
@@ -58,14 +147,17 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 
 	/* Get the page containing the first bit (@start_bit). */
 	mapping = vi->i_mapping;
-	page = ntfs_map_page(mapping, index);
-	if (IS_ERR(page)) {
+	folio = read_mapping_folio(mapping, index, NULL);
+	if (IS_ERR(folio)) {
 		if (!is_rollback)
-			ntfs_error(vi->i_sb, "Failed to map first page (error "
-					"%li), aborting.", PTR_ERR(page));
-		return PTR_ERR(page);
+			ntfs_error(vi->i_sb,
+				   "Failed to map first page (error %li), aborting.",
+				   PTR_ERR(folio));
+		return PTR_ERR(folio);
 	}
-	kaddr = page_address(page);
+
+	folio_lock(folio);
+	kaddr = kmap_local_folio(folio, 0);
 
 	/* Set @pos to the position of the byte containing @start_bit. */
 	pos = (start_bit >> 3) & ~PAGE_MASK;
@@ -76,6 +168,9 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 	/* If the first byte is partial, modify the appropriate bits in it. */
 	if (bit) {
 		u8 *byte = kaddr + pos;
+
+		if (ni->mft_no == FILE_Bitmap)
+			ntfs_set_lcn_empty_bits(vol, index, value, min_t(s64, 8 - bit, cnt));
 		while ((bit & 7) && cnt) {
 			cnt--;
 			if (value)
@@ -97,6 +192,8 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 	len = min_t(s64, cnt >> 3, PAGE_SIZE - pos);
 	memset(kaddr + pos, value ? 0xff : 0, len);
 	cnt -= len << 3;
+	if (ni->mft_no == FILE_Bitmap)
+		ntfs_set_lcn_empty_bits(vol, index, value, len << 3);
 
 	/* Update @len to point to the first not-done byte in the page. */
 	if (cnt < 8)
@@ -104,16 +201,24 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 
 	/* If we are not in the last page, deal with all subsequent pages. */
 	while (index < end_index) {
-		BUG_ON(cnt <= 0);
-
-		/* Update @index and get the next page. */
-		flush_dcache_page(page);
-		set_page_dirty(page);
-		ntfs_unmap_page(page);
-		page = ntfs_map_page(mapping, ++index);
-		if (IS_ERR(page))
+		if (cnt <= 0)
+			goto rollback;
+
+		/* Update @index and get the next folio. */
+		folio_mark_dirty(folio);
+		folio_unlock(folio);
+		kunmap_local(kaddr);
+		folio_put(folio);
+		folio = read_mapping_folio(mapping, ++index, NULL);
+		if (IS_ERR(folio)) {
+			ntfs_error(vi->i_sb,
+				   "Failed to map subsequent page (error %li), aborting.",
+				   PTR_ERR(folio));
 			goto rollback;
-		kaddr = page_address(page);
+		}
+
+		folio_lock(folio);
+		kaddr = kmap_local_folio(folio, 0);
 		/*
 		 * Depending on @value, modify all remaining whole bytes in the
 		 * page up to @cnt.
@@ -121,6 +226,8 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 		len = min_t(s64, cnt >> 3, PAGE_SIZE);
 		memset(kaddr, value ? 0xff : 0, len);
 		cnt -= len << 3;
+		if (ni->mft_no == FILE_Bitmap)
+			ntfs_set_lcn_empty_bits(vol, index, value, len << 3);
 	}
 	/*
 	 * The currently mapped page is the last one.  If the last byte is
@@ -130,10 +237,12 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 	if (cnt) {
 		u8 *byte;
 
-		BUG_ON(cnt > 7);
+		WARN_ON(cnt > 7);
 
 		bit = cnt;
 		byte = kaddr + len;
+		if (ni->mft_no == FILE_Bitmap)
+			ntfs_set_lcn_empty_bits(vol, index, value, bit);
 		while (bit--) {
 			if (value)
 				*byte |= 1 << bit;
@@ -142,10 +251,11 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 		}
 	}
 done:
-	/* We are done.  Unmap the page and return success. */
-	flush_dcache_page(page);
-	set_page_dirty(page);
-	ntfs_unmap_page(page);
+	/* We are done.  Unmap the folio and return success. */
+	folio_mark_dirty(folio);
+	folio_unlock(folio);
+	kunmap_local(kaddr);
+	folio_put(folio);
 	ntfs_debug("Done.");
 	return 0;
 rollback:
@@ -155,7 +265,7 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 	 *	- @count - @cnt is the number of bits that have been modified
 	 */
 	if (is_rollback)
-		return PTR_ERR(page);
+		return PTR_ERR(folio);
 	if (count != cnt)
 		pos = __ntfs_bitmap_set_bits_in_run(vi, start_bit, count - cnt,
 				value ? 0 : 1, true);
@@ -163,17 +273,15 @@ int __ntfs_bitmap_set_bits_in_run(struct inode *vi, const s64 start_bit,
 		pos = 0;
 	if (!pos) {
 		/* Rollback was successful. */
-		ntfs_error(vi->i_sb, "Failed to map subsequent page (error "
-				"%li), aborting.", PTR_ERR(page));
+		ntfs_error(vi->i_sb,
+			   "Failed to map subsequent page (error %li), aborting.",
+			   PTR_ERR(folio));
 	} else {
 		/* Rollback failed. */
-		ntfs_error(vi->i_sb, "Failed to map subsequent page (error "
-				"%li) and rollback failed (error %i).  "
-				"Aborting and leaving inconsistent metadata.  "
-				"Unmount and run chkdsk.", PTR_ERR(page), pos);
+		ntfs_error(vi->i_sb,
+			   "Failed to map subsequent page (error %li) and rollback failed (error %i). Aborting and leaving inconsistent metadata. Unmount and run chkdsk.",
+			   PTR_ERR(folio), pos);
 		NVolSetErrors(NTFS_SB(vi->i_sb));
 	}
-	return PTR_ERR(page);
+	return PTR_ERR(folio);
 }
-
-#endif /* NTFS_RW */
diff --git a/fs/ntfs/lcnalloc.c b/fs/ntfs/lcnalloc.c
index eda9972e6159..6f4df07e3726 100644
--- a/fs/ntfs/lcnalloc.c
+++ b/fs/ntfs/lcnalloc.c
@@ -1,25 +1,25 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
- * lcnalloc.c - Cluster (de)allocation code.  Part of the Linux-NTFS project.
+ * Cluster (de)allocation code.
  *
  * Copyright (c) 2004-2005 Anton Altaparmakov
+ * Copyright (c) 2025 LG Electronics Co., Ltd.
+ *
+ * Part of this file is based on code from the NTFS-3G project
+ * and is copyrighted by the respective authors below:
+ * Copyright (c) 2002-2004 Anton Altaparmakov
+ * Copyright (c) 2004 Yura Pakhuchiy
+ * Copyright (c) 2004-2008 Szabolcs Szakacsits
+ * Copyright (c) 2008-2009 Jean-Pierre Andre
  */
 
-#ifdef NTFS_RW
-
-#include
+#include
 
 #include "lcnalloc.h"
-#include "debug.h"
 #include "bitmap.h"
-#include "inode.h"
-#include "volume.h"
-#include "attrib.h"
-#include "malloc.h"
-#include "aops.h"
 #include "ntfs.h"
 
-/**
+/*
  * ntfs_cluster_free_from_rl_nolock - free clusters from runlist
  * @vol:	mounted ntfs volume on which to free the clusters
  * @rl:		runlist describing the clusters to free
@@ -33,15 +33,20 @@
  * Locking: - The volume lcn bitmap must be locked for writing on entry and is
  *	      left locked on return.
  */
-int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol,
-		const runlist_element *rl)
+int ntfs_cluster_free_from_rl_nolock(struct ntfs_volume *vol,
+		const struct runlist_element *rl)
 {
 	struct inode *lcnbmp_vi = vol->lcnbmp_ino;
 	int ret = 0;
+	s64 nr_freed = 0;
 
 	ntfs_debug("Entering.");
 	if (!rl)
 		return 0;
+
+	if (!NVolFreeClusterKnown(vol))
+		wait_event(vol->free_waitq, NVolFreeClusterKnown(vol));
+
 	for (; rl->length; rl++) {
 		int err;
 
@@ -50,19 +55,77 @@ int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol,
 		err = ntfs_bitmap_clear_run(lcnbmp_vi, rl->lcn, rl->length);
 		if (unlikely(err && (!ret || ret == -ENOMEM) && ret != err))
 			ret = err;
+		else
+			nr_freed += rl->length;
 	}
+	ntfs_inc_free_clusters(vol, nr_freed);
 	ntfs_debug("Done.");
 	return ret;
 }
 
-/**
+static s64 max_empty_bit_range(unsigned char *buf, int size)
+{
+	int i, j, run = 0;
+	int max_range = 0;
+	s64 start_pos = -1;
+
+	ntfs_debug("Entering\n");
+
+	i = 0;
+	while (i < size) {
+		switch (*buf) {
+		case 0:
+			do {
+				buf++;
+				run += 8;
+				i++;
+			} while ((i < size) && !*buf);
+			break;
+		case 255:
+			if (run > max_range) {
+				max_range = run;
+				start_pos = (s64)i * 8 - run;
+			}
+			run = 0;
+			do {
+				buf++;
+				i++;
+			} while ((i < size) && (*buf == 255));
+			break;
+		default:
+			for (j = 0; j < 8; j++) {
+				int bit = *buf & (1 << j);
+
+				if (bit) {
+					if (run > max_range) {
+						max_range = run;
+						start_pos = (s64)i * 8 + (j - run);
+					}
+					run = 0;
+				} else
+					run++;
+			}
+			i++;
+			buf++;
+		}
+	}
+
+	if (run > max_range)
+		start_pos = (s64)i * 8 - run;
+
+	return start_pos;
+}
+
+/*
  * ntfs_cluster_alloc - allocate clusters on an ntfs volume
- * @vol:	mounted ntfs volume on which to allocate the clusters
- * @start_vcn:	vcn to use for the first allocated cluster
- * @count:	number of clusters to allocate
- * @start_lcn:	starting lcn at which to allocate the clusters (or -1 if none)
- * @zone:	zone from which to allocate the clusters
- * @is_extension:	if 'true', this is an attribute extension
+ * @vol:	mounted ntfs volume on which to allocate clusters
+ * @start_vcn:	vcn of the first allocated cluster
+ * @count:	number of clusters to allocate
+ * @start_lcn:	starting lcn at which to allocate the clusters or -1 if none
+ * @zone:	zone from which to allocate (MFT_ZONE or DATA_ZONE)
+ * @is_extension:	if true, the caller is extending an attribute
+ * @is_contig:	if true, require contiguous allocation
+ * @is_dealloc:	if true, the allocation is for deallocation purposes
  *
  * Allocate @count clusters preferably starting at cluster @start_lcn or at the
 * current allocator position if @start_lcn is -1, on the mounted ntfs volume
@@ -109,62 +172,65 @@ int ntfs_cluster_free_from_rl_nolock(ntfs_volume *vol,
 * for speed, but the algorithm is, so further speed improvements are probably
 * possible).
 *
- * FIXME: We should be monitoring cluster allocation and increment the MFT zone
- * size dynamically but this is something for the future.  We will just cause
- * heavier fragmentation by not doing it and I am not even sure Windows would
- * grow the MFT zone dynamically, so it might even be correct not to do this.
- * The overhead in doing dynamic MFT zone expansion would be very large and
- * unlikely worth the effort. (AIA)
- *
- * TODO: I have added in double the required zone position pointer wrap around
- * logic which can be optimized to having only one of the two logic sets.
- * However, having the double logic will work fine, but if we have only one of
- * the sets and we get it wrong somewhere, then we get into trouble, so
- * removing the duplicate logic requires _very_ careful consideration of _all_
- * possible code paths.  So at least for now, I am leaving the double logic -
- * better safe than sorry... (AIA)
- *
 * Locking: - The volume lcn bitmap must be unlocked on entry and is unlocked
 *	      on return.
 *	    - This function takes the volume lcn bitmap lock for writing and
 *	      modifies the bitmap contents.
+ *
+ * Return: Runlist describing the allocated cluster(s) on success, error pointer
+ *	   on failure.
 */
-runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
-		const s64 count, const LCN start_lcn,
-		const NTFS_CLUSTER_ALLOCATION_ZONES zone,
-		const bool is_extension)
+struct runlist_element *ntfs_cluster_alloc(struct ntfs_volume *vol, const s64 start_vcn,
+		const s64 count, const s64 start_lcn,
+		const int zone,
+		const bool is_extension,
+		const bool is_contig,
+		const bool is_dealloc)
 {
-	LCN zone_start, zone_end, bmp_pos, bmp_initial_pos, last_read_pos, lcn;
-	LCN prev_lcn = 0, prev_run_len = 0, mft_zone_size;
-	s64 clusters;
+	s64 zone_start, zone_end, bmp_pos, bmp_initial_pos, last_read_pos, lcn;
+	s64 prev_lcn = 0, prev_run_len = 0, mft_zone_size;
+	s64 clusters, free_clusters;
 	loff_t i_size;
 	struct inode *lcnbmp_vi;
-	runlist_element *rl = NULL;
+	struct runlist_element *rl = NULL;
 	struct address_space *mapping;
-	struct page *page = NULL;
-	u8 *buf, *byte;
-	int err = 0, rlpos, rlsize, buf_size;
+	struct folio *folio = NULL;
+	u8 *buf = NULL, *byte;
+	int err = 0, rlpos, rlsize, buf_size, pg_off;
 	u8 pass, done_zones, search_zone, need_writeback = 0, bit;
+	unsigned int memalloc_flags;
+	u8 has_guess, used_zone_pos;
+	pgoff_t index;
 
-	ntfs_debug("Entering for start_vcn 0x%llx, count 0x%llx, start_lcn "
-			"0x%llx, zone %s_ZONE.", (unsigned long long)start_vcn,
-			(unsigned long long)count,
-			(unsigned long long)start_lcn,
+	ntfs_debug("Entering for start_vcn 0x%llx, count 0x%llx, start_lcn 0x%llx, zone %s_ZONE.",
+			start_vcn, count, start_lcn,
 			zone == MFT_ZONE ? "MFT" : "DATA");
-	BUG_ON(!vol);
+
 	lcnbmp_vi = vol->lcnbmp_ino;
-	BUG_ON(!lcnbmp_vi);
-	BUG_ON(start_vcn < 0);
-	BUG_ON(count < 0);
-	BUG_ON(start_lcn < -1);
-	BUG_ON(zone < FIRST_ZONE);
-	BUG_ON(zone > LAST_ZONE);
+	if (start_vcn < 0 || start_lcn < LCN_HOLE ||
+	    zone < FIRST_ZONE || zone > LAST_ZONE)
+		return ERR_PTR(-EINVAL);
 
 	/* Return NULL if @count is zero. */
-	if (!count)
-		return NULL;
+	if (count < 0 || !count)
+		return ERR_PTR(-EINVAL);
+
+	memalloc_flags = memalloc_nofs_save();
+
+	if (!NVolFreeClusterKnown(vol))
+		wait_event(vol->free_waitq, NVolFreeClusterKnown(vol));
+	free_clusters = atomic64_read(&vol->free_clusters);
+
 	/* Take the lcnbmp lock for writing. */
 	down_write(&vol->lcnbmp_lock);
+	if (is_dealloc == false)
+		free_clusters -= atomic64_read(&vol->dirty_clusters);
+
+	if (free_clusters < count) {
+		err = -ENOSPC;
+		goto out_restore;
+	}
+
 	/*
 	 * If no specific @start_lcn was requested, use the current data zone
 	 * position, otherwise use the requested @start_lcn but make sure it
@@ -183,7 +249,9 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 	 * volume) and 4 for data zone 2 (start of volume till start of mft
 	 * zone).
 	 */
+	has_guess = 1;
 	zone_start = start_lcn;
+
 	if (zone_start < 0) {
 		if (zone == DATA_ZONE)
 			zone_start = vol->data1_zone_pos;
@@ -196,39 +264,30 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 			 */
 			pass = 2;
 		}
-	} else if (zone == DATA_ZONE && zone_start >= vol->mft_zone_start &&
-			zone_start < vol->mft_zone_end) {
-		zone_start = vol->mft_zone_end;
-		/*
-		 * Starting at beginning of data1_zone which means a single
-		 * pass in this zone is sufficient.
-		 */
-		pass = 2;
-	} else if (zone == MFT_ZONE && (zone_start < vol->mft_zone_start ||
-			zone_start >= vol->mft_zone_end)) {
-		zone_start = vol->mft_lcn;
-		if (!vol->mft_zone_end)
-			zone_start = 0;
-		/*
-		 * Starting at beginning of volume which means a single pass
-		 * is sufficient.
-		 */
-		pass = 2;
+		has_guess = 0;
 	}
-	if (zone == MFT_ZONE) {
+
+	used_zone_pos = has_guess ? 0 : 1;
+
+	if (!zone_start || zone_start == vol->mft_zone_start ||
+	    zone_start == vol->mft_zone_end)
+		pass = 2;
+
+	if (zone_start < vol->mft_zone_start) {
+		zone_end = vol->mft_zone_start;
+		search_zone = 4;
+		/* Skip searching the mft zone. */
+		done_zones |= 1;
+	} else if (zone_start < vol->mft_zone_end) {
 		zone_end = vol->mft_zone_end;
 		search_zone = 1;
-	} else /* if (zone == DATA_ZONE) */ {
+	} else {
+		zone_end = vol->nr_clusters;
+		search_zone = 2;
 		/* Skip searching the mft zone. */
 		done_zones |= 1;
-		if (zone_start >= vol->mft_zone_end) {
-			zone_end = vol->nr_clusters;
-			search_zone = 2;
-		} else {
-			zone_end = vol->mft_zone_start;
-			search_zone = 4;
-		}
 	}
+
 	/*
 	 * bmp_pos is the current bit position inside the bitmap.  We use
 	 * bmp_initial_pos to determine whether or not to do a zone switch.
@@ -241,100 +300,96 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 	mapping = lcnbmp_vi->i_mapping;
 	i_size = i_size_read(lcnbmp_vi);
 	while (1) {
-		ntfs_debug("Start of outer while loop: done_zones 0x%x, "
-				"search_zone %i, pass %i, zone_start 0x%llx, "
-				"zone_end 0x%llx, bmp_initial_pos 0x%llx, "
-				"bmp_pos 0x%llx, rlpos %i, rlsize %i.",
+		ntfs_debug("Start of outer while loop: done_zones 0x%x, search_zone %i, pass %i, zone_start 0x%llx, zone_end 0x%llx, bmp_initial_pos 0x%llx, bmp_pos 0x%llx, rlpos %i, rlsize %i.",
 				done_zones, search_zone, pass,
-				(unsigned long long)zone_start,
-				(unsigned long long)zone_end,
-				(unsigned long long)bmp_initial_pos,
-				(unsigned long long)bmp_pos, rlpos, rlsize);
+				zone_start, zone_end, bmp_initial_pos,
+				bmp_pos, rlpos, rlsize);
 		/* Loop until we run out of free clusters. */
 		last_read_pos = bmp_pos >> 3;
-		ntfs_debug("last_read_pos 0x%llx.",
-				(unsigned long long)last_read_pos);
-		if (last_read_pos > i_size) {
-			ntfs_debug("End of attribute reached.  "
-					"Skipping to zone_pass_done.");
+		ntfs_debug("last_read_pos 0x%llx.", last_read_pos);
+		if (last_read_pos >= i_size) {
+			ntfs_debug("End of attribute reached. Skipping to zone_pass_done.");
 			goto zone_pass_done;
 		}
-		if (likely(page)) {
+		if (likely(folio)) {
 			if (need_writeback) {
 				ntfs_debug("Marking page dirty.");
-				flush_dcache_page(page);
-				set_page_dirty(page);
+				folio_mark_dirty(folio);
 				need_writeback = 0;
 			}
-			ntfs_unmap_page(page);
-		}
-		page = ntfs_map_page(mapping, last_read_pos >>
-				PAGE_SHIFT);
-		if (IS_ERR(page)) {
-			err = PTR_ERR(page);
-			ntfs_error(vol->sb, "Failed to map page.");
-			goto out;
+			folio_unlock(folio);
+			kunmap_local(buf);
+			folio_put(folio);
+			folio = NULL;
 		}
-		buf_size = last_read_pos & ~PAGE_MASK;
-		buf = page_address(page) + buf_size;
-		buf_size = PAGE_SIZE - buf_size;
+
+		index = last_read_pos >> PAGE_SHIFT;
+		pg_off = last_read_pos & ~PAGE_MASK;
+		buf_size = PAGE_SIZE - pg_off;
 		if (unlikely(last_read_pos + buf_size > i_size))
 			buf_size = i_size - last_read_pos;
 		buf_size <<= 3;
 		lcn = bmp_pos & 7;
-		bmp_pos &= ~(LCN)7;
-		ntfs_debug("Before inner while loop: buf_size %i, lcn 0x%llx, "
-				"bmp_pos 0x%llx, need_writeback %i.", buf_size,
-				(unsigned long long)lcn,
-				(unsigned long long)bmp_pos, need_writeback);
+		bmp_pos &= ~(s64)7;
+
+		if (vol->lcn_empty_bits_per_page[index] == 0)
+			goto next_bmp_pos;
+
+		folio = read_mapping_folio(mapping, index, NULL);
+		if (IS_ERR(folio)) {
+			err = PTR_ERR(folio);
+			ntfs_error(vol->sb, "Failed to map page.");
+			goto out;
+		}
+
+		folio_lock(folio);
+		buf = kmap_local_folio(folio, 0) + pg_off;
+		ntfs_debug("Before inner while loop: buf_size %i, lcn 0x%llx, bmp_pos 0x%llx, need_writeback %i.",
+				buf_size, lcn, bmp_pos, need_writeback);
 		while (lcn < buf_size && lcn + bmp_pos < zone_end) {
 			byte = buf + (lcn >> 3);
-			ntfs_debug("In inner while loop: buf_size %i, "
-					"lcn 0x%llx, bmp_pos 0x%llx, "
-					"need_writeback %i, byte ofs 0x%x, "
-					"*byte 0x%x.", buf_size,
-					(unsigned long long)lcn,
-					(unsigned long long)bmp_pos,
-					need_writeback,
+			ntfs_debug("In inner while loop: buf_size %i, lcn 0x%llx, bmp_pos 0x%llx, need_writeback %i, byte ofs 0x%x, *byte 0x%x.",
+					buf_size, lcn, bmp_pos, need_writeback,
 					(unsigned int)(lcn >> 3),
 					(unsigned int)*byte);
-			/* Skip full bytes. */
-			if (*byte == 0xff) {
-				lcn = (lcn + 8) & ~(LCN)7;
-				ntfs_debug("Continuing while loop 1.");
-				continue;
-			}
 			bit = 1 << (lcn & 7);
 			ntfs_debug("bit 0x%x.", bit);
-			/* If the bit is already set, go onto the next one. */
-			if (*byte & bit) {
-				lcn++;
-				ntfs_debug("Continuing while loop 2.");
+
+			if (has_guess) {
+				if (*byte & bit) {
+					if (is_contig == true && prev_run_len > 0)
+						goto done;
+
+					has_guess = 0;
+					break;
+				}
+			} else {
+				lcn = max_empty_bit_range(buf, buf_size >> 3);
+				if (lcn < 0)
+					break;
+				has_guess = 1;
 				continue;
 			}
 			/*
 			 * Allocate more memory if needed, including space for
 			 * the terminator element.
-			 * ntfs_malloc_nofs() operates on whole pages only.
+			 * kvzalloc() operates on whole pages only.
 			 */
 			if ((rlpos + 2) * sizeof(*rl) > rlsize) {
-				runlist_element *rl2;
+				struct runlist_element *rl2;
 
 				ntfs_debug("Reallocating memory.");
 				if (!rl)
-					ntfs_debug("First free bit is at LCN "
-							"0x%llx.",
-							(unsigned long long)
-							(lcn + bmp_pos));
-				rl2 = ntfs_malloc_nofs(rlsize + (int)PAGE_SIZE);
+					ntfs_debug("First free bit is at s64 0x%llx.",
+							lcn + bmp_pos);
+				rl2 = kvzalloc(rlsize + PAGE_SIZE, GFP_NOFS);
 				if (unlikely(!rl2)) {
 					err = -ENOMEM;
-					ntfs_error(vol->sb, "Failed to "
-							"allocate memory.");
+					ntfs_error(vol->sb, "Failed to allocate memory.");
 					goto out;
 				}
 				memcpy(rl2, rl, rlsize);
-				ntfs_free(rl);
+				kvfree(rl);
 				rl = rl2;
 				rlsize += PAGE_SIZE;
 				ntfs_debug("Reallocated memory, rlsize 0x%x.",
@@ -346,50 +401,33 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 			need_writeback = 1;
 			ntfs_debug("*byte 0x%x, need_writeback is set.",
 					(unsigned int)*byte);
+			ntfs_dec_free_clusters(vol, 1);
+			ntfs_set_lcn_empty_bits(vol, index, 1, 1);
+
 			/*
 			 * Coalesce with previous run if adjacent LCNs.
 			 * Otherwise, append a new run.
 			 */
-			ntfs_debug("Adding run (lcn 0x%llx, len 0x%llx), "
-					"prev_lcn 0x%llx, lcn 0x%llx, "
-					"bmp_pos 0x%llx, prev_run_len 0x%llx, "
-					"rlpos %i.",
-					(unsigned long long)(lcn + bmp_pos),
-					1ULL, (unsigned long long)prev_lcn,
-					(unsigned long long)lcn,
-					(unsigned long long)bmp_pos,
-					(unsigned long long)prev_run_len,
-					rlpos);
+			ntfs_debug("Adding run (lcn 0x%llx, len 0x%llx), prev_lcn 0x%llx, lcn 0x%llx, bmp_pos 0x%llx, prev_run_len 0x%llx, rlpos %i.",
+					lcn + bmp_pos, 1ULL, prev_lcn,
+					lcn, bmp_pos, prev_run_len, rlpos);
 			if (prev_lcn == lcn + bmp_pos - prev_run_len && rlpos) {
-				ntfs_debug("Coalescing to run (lcn 0x%llx, "
-						"len 0x%llx).",
-						(unsigned long long)
+				ntfs_debug("Coalescing to run (lcn 0x%llx, len 0x%llx).",
						rl[rlpos - 1].lcn,
-						(unsigned long long)
 						rl[rlpos - 1].length);
 				rl[rlpos - 1].length = ++prev_run_len;
-				ntfs_debug("Run now (lcn 0x%llx, len 0x%llx), "
-						"prev_run_len 0x%llx.",
-						(unsigned long long)
+				ntfs_debug("Run now (lcn 0x%llx, len 0x%llx), prev_run_len 0x%llx.",
 						rl[rlpos - 1].lcn,
-						(unsigned long long)
 						rl[rlpos - 1].length,
-						(unsigned long long)
 						prev_run_len);
 			} else {
 				if (likely(rlpos)) {
-					ntfs_debug("Adding new run, (previous "
-							"run lcn 0x%llx, "
-							"len 0x%llx).",
-							(unsigned long long)
-							rl[rlpos - 1].lcn,
-							(unsigned long long)
-							rl[rlpos - 1].length);
+					ntfs_debug("Adding new run, (previous run lcn 0x%llx, len 0x%llx).",
							rl[rlpos - 1].lcn, rl[rlpos - 1].length);
 					rl[rlpos].vcn = rl[rlpos - 1].vcn +
 							prev_run_len;
 				} else {
-					ntfs_debug("Adding new run, is first "
-							"run.");
+					ntfs_debug("Adding new run, is first run.");
 					rl[rlpos].vcn = start_vcn;
 				}
 				rl[rlpos].lcn = prev_lcn = lcn + bmp_pos;
@@ -398,24 +436,21 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 			}
 			/* Done? */
 			if (!--clusters) {
-				LCN tc;
+				s64 tc;
+done:
+				if (!used_zone_pos)
+					goto out;
 				/*
 				 * Update the current zone position.  Positions
 				 * of already scanned zones have been updated
 				 * during the respective zone switches.
 				 */
 				tc = lcn + bmp_pos + 1;
-				ntfs_debug("Done.  Updating current zone "
-						"position, tc 0x%llx, "
-						"search_zone %i.",
-						(unsigned long long)tc,
-						search_zone);
+				ntfs_debug("Done. Updating current zone position, tc 0x%llx, search_zone %i.",
+						tc, search_zone);
 				switch (search_zone) {
 				case 1:
-					ntfs_debug("Before checks, "
-							"vol->mft_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("Before checks, vol->mft_zone_pos 0x%llx.",
 							vol->mft_zone_pos);
 					if (tc >= vol->mft_zone_end) {
 						vol->mft_zone_pos =
@@ -427,17 +462,11 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 							tc > vol->mft_zone_pos) &&
 							tc >= vol->mft_lcn)
 						vol->mft_zone_pos = tc;
-					ntfs_debug("After checks, "
-							"vol->mft_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("After checks, vol->mft_zone_pos 0x%llx.",
 							vol->mft_zone_pos);
 					break;
 				case 2:
-					ntfs_debug("Before checks, "
-							"vol->data1_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("Before checks, vol->data1_zone_pos 0x%llx.",
 							vol->data1_zone_pos);
 					if (tc >= vol->nr_clusters)
 						vol->data1_zone_pos =
@@ -447,17 +476,11 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 							tc > vol->data1_zone_pos) &&
 							tc >= vol->mft_zone_end)
 						vol->data1_zone_pos = tc;
-					ntfs_debug("After checks, "
-							"vol->data1_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("After checks, vol->data1_zone_pos 0x%llx.",
 							vol->data1_zone_pos);
 					break;
 				case 4:
-					ntfs_debug("Before checks, "
-							"vol->data2_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("Before checks, vol->data2_zone_pos 0x%llx.",
 							vol->data2_zone_pos);
 					if (tc >= vol->mft_zone_start)
 						vol->data2_zone_pos = 0;
@@ -465,30 +488,41 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 							vol->data2_zone_pos ||
 							tc > vol->data2_zone_pos)
 						vol->data2_zone_pos = tc;
-					ntfs_debug("After checks, "
-							"vol->data2_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("After checks, vol->data2_zone_pos 0x%llx.",
 							vol->data2_zone_pos);
 					break;
 				default:
-					BUG();
+					WARN_ON(1);
 				}
 				ntfs_debug("Finished.  Going to out.");
 				goto out;
 			}
 			lcn++;
 		}
-		bmp_pos += buf_size;
-		ntfs_debug("After inner while loop: buf_size 0x%x, lcn "
-				"0x%llx, bmp_pos 0x%llx, need_writeback %i.",
-				buf_size, (unsigned long long)lcn,
-				(unsigned long long)bmp_pos, need_writeback);
+
+		if (!used_zone_pos) {
+			used_zone_pos = 1;
+			if (search_zone == 1)
+				zone_start = vol->mft_zone_pos;
+			else if (search_zone == 2)
+				zone_start = vol->data1_zone_pos;
+			else
+				zone_start = vol->data2_zone_pos;
+
+			if (!zone_start || zone_start == vol->mft_zone_start ||
+			    zone_start == vol->mft_zone_end)
+				pass = 2;
+			bmp_pos = zone_start;
+		} else {
+next_bmp_pos:
+			bmp_pos += buf_size;
+		}
+
+		ntfs_debug("After inner while loop: buf_size 0x%x, lcn 0x%llx, bmp_pos 0x%llx, need_writeback %i.",
+				buf_size, lcn, bmp_pos, need_writeback);
 		if (bmp_pos < zone_end) {
-			ntfs_debug("Continuing outer while loop, "
-					"bmp_pos 0x%llx, zone_end 0x%llx.",
-					(unsigned long long)bmp_pos,
-					(unsigned long long)zone_end);
+			ntfs_debug("Continuing outer while loop, bmp_pos 0x%llx, zone_end 0x%llx.",
+					bmp_pos, zone_end);
 			continue;
 		}
zone_pass_done:	/* Finished with the current zone pass. */
@@ -511,23 +545,18 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 				zone_start = 0;
 				break;
 			default:
-				BUG();
+				WARN_ON(1);
 			}
 			/* Sanity check. */
 			if (zone_end < zone_start)
 				zone_end = zone_start;
 			bmp_pos = zone_start;
-			ntfs_debug("Continuing outer while loop, pass 2, "
-					"zone_start 0x%llx, zone_end 0x%llx, "
-					"bmp_pos 0x%llx.",
-					(unsigned long long)zone_start,
-					(unsigned long long)zone_end,
-					(unsigned long long)bmp_pos);
+			ntfs_debug("Continuing outer while loop, pass 2, zone_start 0x%llx, zone_end 0x%llx, bmp_pos 0x%llx.",
+					zone_start, zone_end, bmp_pos);
 			continue;
 		} /* pass == 2 */
done_zones_check:
-		ntfs_debug("At done_zones_check, search_zone %i, done_zones "
-				"before 0x%x, done_zones after 0x%x.",
+		ntfs_debug("At done_zones_check, search_zone %i, done_zones before 0x%x, done_zones after 0x%x.",
 				search_zone, done_zones,
 				done_zones | search_zone);
 		done_zones |= search_zone;
@@ -537,16 +566,12 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 			pass = 1;
 			switch (search_zone) {
 			case 1:
-				ntfs_debug("Switching from mft zone to data1 "
-						"zone.");
+				ntfs_debug("Switching from mft zone to data1 zone.");
 				/* Update mft zone position. */
-				if (rlpos) {
-					LCN tc;
+				if (rlpos && used_zone_pos) {
+					s64 tc;
 
-					ntfs_debug("Before checks, "
-							"vol->mft_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("Before checks, vol->mft_zone_pos 0x%llx.",
 							vol->mft_zone_pos);
 					tc = rl[rlpos - 1].lcn +
 							rl[rlpos - 1].length;
@@ -560,10 +585,7 @@ runlist_element *ntfs_cluster_alloc(ntfs_volume *vol, const VCN start_vcn,
 							tc > vol->mft_zone_pos) &&
 							tc >= vol->mft_lcn)
 						vol->mft_zone_pos = tc;
-					ntfs_debug("After checks, "
-							"vol->mft_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("After checks, vol->mft_zone_pos 0x%llx.",
 							vol->mft_zone_pos);
 				}
 				/* Switch from mft zone to data1 zone. */
@@ -580,16 +602,12 @@ switch_to_data1_zone:		search_zone = 2;
 				}
 				break;
 			case 2:
-				ntfs_debug("Switching from data1 zone to "
-						"data2 zone.");
+				ntfs_debug("Switching from data1 zone to data2 zone.");
 				/* Update data1 zone position. */
-				if (rlpos) {
-					LCN tc;
+				if (rlpos && used_zone_pos) {
+					s64 tc;
 
-					ntfs_debug("Before checks, "
-							"vol->data1_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("Before checks, vol->data1_zone_pos 0x%llx.",
 							vol->data1_zone_pos);
 					tc = rl[rlpos - 1].lcn +
 							rl[rlpos - 1].length;
@@ -601,10 +619,7 @@ switch_to_data1_zone:		search_zone = 2;
 							tc > vol->data1_zone_pos) &&
 							tc >= vol->mft_zone_end)
 						vol->data1_zone_pos = tc;
-					ntfs_debug("After checks, "
-							"vol->data1_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("After checks, vol->data1_zone_pos 0x%llx.",
 							vol->data1_zone_pos);
 				}
 				/* Switch from data1 zone to data2 zone. */
@@ -621,16 +636,12 @@ switch_to_data1_zone:		search_zone = 2;
 				}
 				break;
 			case 4:
-				ntfs_debug("Switching from data2 zone to "
-						"data1 zone.");
+				ntfs_debug("Switching from data2 zone to data1 zone.");
 				/* Update data2 zone position. */
-				if (rlpos) {
-					LCN tc;
+				if (rlpos && used_zone_pos) {
+					s64 tc;
 
-					ntfs_debug("Before checks, "
-							"vol->data2_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("Before checks, vol->data2_zone_pos 0x%llx.",
 							vol->data2_zone_pos);
 					tc = rl[rlpos - 1].lcn +
 							rl[rlpos - 1].length;
@@ -640,28 +651,22 @@ switch_to_data1_zone:		search_zone = 2;
 							vol->data2_zone_pos ||
 							tc > vol->data2_zone_pos)
 						vol->data2_zone_pos = tc;
-					ntfs_debug("After checks, "
-							"vol->data2_zone_pos "
-							"0x%llx.",
-							(unsigned long long)
+					ntfs_debug("After checks, vol->data2_zone_pos 0x%llx.",
 							vol->data2_zone_pos);
 				}
 				/* Switch from data2 zone to data1 zone. */
 				goto switch_to_data1_zone;
 			default:
-				BUG();
+				WARN_ON(1);
 			}
-			ntfs_debug("After zone switch, search_zone %i, "
-					"pass %i, bmp_initial_pos 0x%llx, "
-					"zone_start 0x%llx, zone_end 0x%llx.",
+			ntfs_debug("After zone switch, search_zone %i, pass %i, bmp_initial_pos 0x%llx, zone_start 0x%llx, zone_end 0x%llx.",
 					search_zone, pass,
-					(unsigned long long)bmp_initial_pos,
-					(unsigned long long)zone_start,
-					(unsigned long long)zone_end);
+					bmp_initial_pos,
+					zone_start,
+					zone_end);
 			bmp_pos = zone_start;
 			if (zone_start == zone_end) {
-				ntfs_debug("Empty zone, going to "
-						"done_zones_check.");
+				ntfs_debug("Empty zone, going to done_zones_check.");
 				/* Empty zone.  Don't bother searching it. */
 				goto done_zones_check;
 			}
@@ -674,11 +679,9 @@ switch_to_data1_zone:		search_zone = 2;
 		 * MFT_ZONE, we have really run out of space.
 		 */
 		mft_zone_size = vol->mft_zone_end - vol->mft_zone_start;
-		ntfs_debug("vol->mft_zone_start 0x%llx, vol->mft_zone_end "
-				"0x%llx, mft_zone_size 0x%llx.",
-				(unsigned long long)vol->mft_zone_start,
-				(unsigned long long)vol->mft_zone_end,
-				(unsigned long long)mft_zone_size);
+		ntfs_debug("vol->mft_zone_start 0x%llx, vol->mft_zone_end 0x%llx, mft_zone_size 0x%llx.",
+				vol->mft_zone_start, vol->mft_zone_end,
+				mft_zone_size);
 		if (zone == MFT_ZONE || mft_zone_size <= 0) {
 			ntfs_debug("No free clusters left, going to out.");
 			/* Really no more space left on device. */
@@ -703,20 +706,11 @@ switch_to_data1_zone:		search_zone = 2;
 		search_zone = 2;
 		pass = 2;
 		done_zones &= ~2;
-		ntfs_debug("After shrinking mft zone, mft_zone_size 0x%llx, "
-				"vol->mft_zone_start 0x%llx, "
-				"vol->mft_zone_end 0x%llx, "
-				"vol->mft_zone_pos 0x%llx, search_zone 2, "
-				"pass 2, dones_zones 0x%x, zone_start 0x%llx, "
-				"zone_end 0x%llx, vol->data1_zone_pos 0x%llx, "
-				"continuing outer while loop.",
-				(unsigned long long)mft_zone_size,
-				(unsigned long long)vol->mft_zone_start,
-				(unsigned long long)vol->mft_zone_end,
-				(unsigned long long)vol->mft_zone_pos,
-				done_zones, (unsigned long long)zone_start,
-				(unsigned long long)zone_end,
-				(unsigned long long)vol->data1_zone_pos);
+		ntfs_debug("After shrinking mft zone, mft_zone_size 0x%llx, vol->mft_zone_start 0x%llx, vol->mft_zone_end 0x%llx, vol->mft_zone_pos 0x%llx, search_zone 2, pass 2, dones_zones 0x%x, zone_start 0x%llx, zone_end 0x%llx, vol->data1_zone_pos 0x%llx, continuing outer while loop.",
+				mft_zone_size, vol->mft_zone_start,
+				vol->mft_zone_end, vol->mft_zone_pos,
+				done_zones, zone_start, zone_end,
+				vol->data1_zone_pos);
 	}
 	ntfs_debug("After outer while loop.");
out:
@@ -727,52 +721,58 @@ switch_to_data1_zone:		search_zone = 2;
 		rl[rlpos].lcn = is_extension ?
 				LCN_ENOENT : LCN_RL_NOT_MAPPED;
 		rl[rlpos].length = 0;
 	}
-	if (likely(page && !IS_ERR(page))) {
+	if (likely(folio && !IS_ERR(folio))) {
 		if (need_writeback) {
 			ntfs_debug("Marking page dirty.");
-			flush_dcache_page(page);
-			set_page_dirty(page);
+			folio_mark_dirty(folio);
 			need_writeback = 0;
 		}
-		ntfs_unmap_page(page);
+		folio_unlock(folio);
+		kunmap_local(buf);
+		folio_put(folio);
 	}
 	if (likely(!err)) {
-		up_write(&vol->lcnbmp_lock);
+		if (is_dealloc == true)
+			ntfs_release_dirty_clusters(vol, rl->length);
 		ntfs_debug("Done.");
-		return rl;
+		if (rl == NULL)
+			err = -EIO;
+		goto out_restore;
 	}
-	ntfs_error(vol->sb, "Failed to allocate clusters, aborting "
-			"(error %i).", err);
+	if (err != -ENOSPC)
+		ntfs_error(vol->sb,
+			   "Failed to allocate clusters, aborting (error %i).",
+			   err);
 	if (rl) {
 		int err2;
 
 		if (err == -ENOSPC)
-			ntfs_debug("Not enough space to complete allocation, "
-					"err -ENOSPC, first free lcn 0x%llx, "
-					"could allocate up to 0x%llx "
-					"clusters.",
-					(unsigned long long)rl[0].lcn,
-					(unsigned long long)(count - clusters));
+			ntfs_debug("Not enough space to complete allocation, err -ENOSPC, first free lcn 0x%llx, could allocate up to 0x%llx clusters.",
+					rl[0].lcn, count - clusters);
 		/* Deallocate all allocated clusters. */
 		ntfs_debug("Attempting rollback...");
 		err2 = ntfs_cluster_free_from_rl_nolock(vol, rl);
 		if (err2) {
-			ntfs_error(vol->sb, "Failed to rollback (error %i).  "
-					"Leaving inconsistent metadata!  "
-					"Unmount and run chkdsk.", err2);
+			ntfs_error(vol->sb,
+				   "Failed to rollback (error %i). Leaving inconsistent metadata! Unmount and run chkdsk.",
+				   err2);
 			NVolSetErrors(vol);
 		}
 		/* Free the runlist. */
-		ntfs_free(rl);
+		kvfree(rl);
 	} else if (err == -ENOSPC)
-		ntfs_debug("No space left at all, err = -ENOSPC, first free "
-				"lcn = 0x%llx.",
-				(long long)vol->data1_zone_pos);
+		ntfs_debug("No space left at all, err = -ENOSPC, first free lcn = 0x%llx.",
+				vol->data1_zone_pos);
+	atomic64_set(&vol->dirty_clusters, 0);
+
+out_restore:
 	up_write(&vol->lcnbmp_lock);
-	return ERR_PTR(err);
+	memalloc_nofs_restore(memalloc_flags);
+
+	return err < 0 ? ERR_PTR(err) : rl;
 }
 
-/**
+/*
  * __ntfs_cluster_free - free clusters on an ntfs volume
  * @ni:		ntfs inode whose runlist describes the clusters to free
  * @start_vcn:	vcn in the runlist of @ni at which to start freeing clusters
@@ -801,8 +801,8 @@ switch_to_data1_zone:		search_zone = 2;
  * you will probably want to do:
  *	m = ctx->mrec;
  *	a = ctx->attr;
- * Assuming you cache ctx->attr in a variable @a of type ATTR_RECORD * and that
- * you cache ctx->mrec in a variable @m of type MFT_RECORD *.
+ * Assuming you cache ctx->attr in a variable @a of type attr_record * and that
+ * you cache ctx->mrec in a variable @m of type struct mft_record *.
 *
 * @is_rollback should always be 'false', it is for internal use to rollback
 * errors.  You probably want to use ntfs_cluster_free() instead.
@@ -832,25 +832,27 @@ switch_to_data1_zone:		search_zone = 2;
 *	    - If @ctx is not NULL, the base mft record must be mapped on entry
 *	      and it will be left mapped on return.
 */
-s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
-		ntfs_attr_search_ctx *ctx, const bool is_rollback)
+s64 __ntfs_cluster_free(struct ntfs_inode *ni, const s64 start_vcn, s64 count,
+		struct ntfs_attr_search_ctx *ctx, const bool is_rollback)
 {
 	s64 delta, to_free, total_freed, real_freed;
-	ntfs_volume *vol;
+	struct ntfs_volume *vol;
 	struct inode *lcnbmp_vi;
-	runlist_element *rl;
+	struct runlist_element *rl;
 	int err;
+	unsigned int memalloc_flags;
 
-	BUG_ON(!ni);
-	ntfs_debug("Entering for i_ino 0x%lx, start_vcn 0x%llx, count "
-			"0x%llx.%s", ni->mft_no, (unsigned long long)start_vcn,
-			(unsigned long long)count,
+	ntfs_debug("Entering for i_ino 0x%lx, start_vcn 0x%llx, count 0x%llx.%s",
+			ni->mft_no, start_vcn, count,
 			is_rollback ? " (rollback)" : "");
 	vol = ni->vol;
 	lcnbmp_vi = vol->lcnbmp_ino;
-	BUG_ON(!lcnbmp_vi);
-	BUG_ON(start_vcn < 0);
-	BUG_ON(count < -1);
+	if (start_vcn < 0 || count < -1)
+		return -EINVAL;
+
+	if (!NVolFreeClusterKnown(vol))
+		wait_event(vol->free_waitq, NVolFreeClusterKnown(vol));
+
 	/*
 	 * Lock the lcn bitmap for writing but only if not rolling back.  We
 	 * must hold the lock all the way including through rollback otherwise
@@ -858,24 +860,33 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
 	 * dropped the lock, anyone could have set the bit again, thus
 	 * allocating the cluster for another use.
 	 */
-	if (likely(!is_rollback))
+	if (likely(!is_rollback)) {
+		memalloc_flags = memalloc_nofs_save();
 		down_write(&vol->lcnbmp_lock);
+	}
 
 	total_freed = real_freed = 0;
 
 	rl = ntfs_attr_find_vcn_nolock(ni, start_vcn, ctx);
 	if (IS_ERR(rl)) {
-		if (!is_rollback)
-			ntfs_error(vol->sb, "Failed to find first runlist "
-					"element (error %li), aborting.",
-					PTR_ERR(rl));
 		err = PTR_ERR(rl);
+		if (err == -ENOENT) {
+			if (likely(!is_rollback)) {
+				up_write(&vol->lcnbmp_lock);
+				memalloc_nofs_restore(memalloc_flags);
+			}
+			return 0;
+		}
+
+		if (!is_rollback)
+			ntfs_error(vol->sb,
+				   "Failed to find first runlist element (error %d), aborting.",
+				   err);
 		goto err_out;
 	}
 	if (unlikely(rl->lcn < LCN_HOLE)) {
 		if (!is_rollback)
-			ntfs_error(vol->sb, "First runlist element has "
-					"invalid lcn, aborting.");
+			ntfs_error(vol->sb, "First runlist element has invalid lcn, aborting.");
 		err = -EIO;
 		goto err_out;
 	}
@@ -893,13 +904,14 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
 				to_free, likely(!is_rollback) ? 0 : 1);
 		if (unlikely(err)) {
 			if (!is_rollback)
-				ntfs_error(vol->sb, "Failed to clear first run "
-						"(error %i), aborting.", err);
+				ntfs_error(vol->sb,
+					   "Failed to clear first run (error %i), aborting.",
+					   err);
 			goto err_out;
 		}
 		/* We have freed @to_free real clusters. */
 		real_freed = to_free;
-	};
+	}
 	/* Go to the next run and adjust the number of clusters left to free. */
 	++rl;
 	if (count >= 0)
@@ -913,7 +925,7 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
 	 */
 	for (; rl->length && count != 0; ++rl) {
 		if (unlikely(rl->lcn < LCN_HOLE)) {
-			VCN vcn;
+			s64 vcn;
 
 			/* Attempt to map runlist. */
 			vcn = rl->vcn;
 			rl = ntfs_attr_find_vcn_nolock(ni, vcn, ctx);
 			if (IS_ERR(rl)) {
 				err = PTR_ERR(rl);
 				if (!is_rollback)
-					ntfs_error(vol->sb, "Failed to map "
-							"runlist fragment or "
-							"failed to find "
-							"subsequent runlist "
-							"element.");
+					ntfs_error(vol->sb,
+						   "Failed to map runlist fragment or failed to find subsequent runlist element.");
 				goto err_out;
 			}
 			if (unlikely(rl->lcn < LCN_HOLE)) {
 				if (!is_rollback)
-					ntfs_error(vol->sb, "Runlist element "
-							"has invalid lcn "
-							"(0x%llx).",
-							(unsigned long long)
-							rl->lcn);
+					ntfs_error(vol->sb,
						   "Runlist element has invalid lcn (0x%llx).",
+						   rl->lcn);
 				err = -EIO;
 				goto err_out;
 			}
@@ -950,8 +957,7 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
 					to_free, likely(!is_rollback) ? 0 : 1);
 			if (unlikely(err)) {
 				if (!is_rollback)
-					ntfs_error(vol->sb, "Failed to clear "
-							"subsequent run.");
+					ntfs_error(vol->sb, "Failed to clear subsequent run.");
 				goto err_out;
 			}
 			/* We have freed @to_free real clusters. */
@@ -960,14 +966,54 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
 		/* Adjust the number of clusters left to free. */
 		if (count >= 0)
 			count -= to_free;
-	
+
 		/* Update the total done clusters. */
 		total_freed += to_free;
 	}
-	if (likely(!is_rollback))
+	ntfs_inc_free_clusters(vol, real_freed);
+	if (likely(!is_rollback)) {
 		up_write(&vol->lcnbmp_lock);
+		memalloc_nofs_restore(memalloc_flags);
+	}
+
+	WARN_ON(count > 0);
+
+	if (NVolDiscard(vol) && !is_rollback) {
+		s64 total_discarded = 0, rl_off;
+		u32 gran = bdev_discard_granularity(vol->sb->s_bdev);
+
+		rl = ntfs_attr_find_vcn_nolock(ni, start_vcn, ctx);
+		if (IS_ERR(rl))
+			return real_freed;
+		rl_off = start_vcn - rl->vcn;
+		while (rl->length && total_discarded < total_freed) {
+			s64 to_discard = rl->length - rl_off;
+
+			if (to_discard + total_discarded > total_freed)
+				to_discard = total_freed - total_discarded;
+			if (rl->lcn >= 0) {
+				sector_t start_sector, end_sector;
+				int ret;
 
-	BUG_ON(count > 0);
+				start_sector = ALIGN(NTFS_CLU_TO_B(vol, rl->lcn + rl_off),
+						     gran) >> SECTOR_SHIFT;
+				end_sector = ALIGN_DOWN(NTFS_CLU_TO_B(vol,
+							rl->lcn + rl_off + to_discard),
+							gran) >> SECTOR_SHIFT;
+				if (start_sector < end_sector) {
+					ret = blkdev_issue_discard(vol->sb->s_bdev, start_sector,
+							end_sector - start_sector,
+							GFP_NOFS);
+					if (ret)
+						break;
+				}
+			}
+
+			total_discarded += to_discard;
+			++rl;
+			rl_off = 0;
+		}
+	}
 
 	/* We are done.  Return the number of actually freed clusters. */
 	ntfs_debug("Done.");
@@ -978,6 +1024,7 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
 	/* If no real clusters were freed, no need to rollback. */
 	if (!real_freed) {
 		up_write(&vol->lcnbmp_lock);
+		memalloc_nofs_restore(memalloc_flags);
 		return err;
 	}
 	/*
@@ -987,14 +1034,14 @@ s64 __ntfs_cluster_free(ntfs_inode *ni, const VCN start_vcn, s64 count,
 	 */
 	delta = __ntfs_cluster_free(ni, start_vcn, total_freed, ctx, true);
 	if (delta < 0) {
-		ntfs_error(vol->sb, "Failed to rollback (error %i).  Leaving "
-				"inconsistent metadata!  Unmount and run "
-				"chkdsk.", (int)delta);
+		ntfs_error(vol->sb,
+			   "Failed to rollback (error %i). Leaving inconsistent metadata! Unmount and run chkdsk.",
+			   (int)delta);
 		NVolSetErrors(vol);
 	}
+	ntfs_dec_free_clusters(vol, delta);
 	up_write(&vol->lcnbmp_lock);
+	memalloc_nofs_restore(memalloc_flags);
 	ntfs_error(vol->sb, "Aborting (error %i).", err);
 	return err;
 }
-
-#endif /* NTFS_RW */
diff --git a/fs/ntfs/runlist.c b/fs/ntfs/runlist.c
index 0d448e9881f7..fd6595067a25 100644
--- a/fs/ntfs/runlist.c
+++ b/fs/ntfs/runlist.c
@@ -1,43 +1,57 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
- * runlist.c - NTFS runlist handling code.  Part of the Linux-NTFS project.
+ * NTFS runlist handling code.
  *
  * Copyright (c) 2001-2007 Anton Altaparmakov
 * Copyright (c) 2002-2005 Richard Russon
+ * Copyright (c) 2025 LG Electronics Co., Ltd.
+ *
+ * Part of this file is based on code from the NTFS-3G project
+ * and is copyrighted by the respective authors below:
+ * Copyright (c) 2002-2005 Anton Altaparmakov
+ * Copyright (c) 2002-2005 Richard Russon
+ * Copyright (c) 2002-2008 Szabolcs Szakacsits
+ * Copyright (c) 2004 Yura Pakhuchiy
+ * Copyright (c) 2007-2022 Jean-Pierre Andre
 */
 
-#include "debug.h"
-#include "dir.h"
-#include "endian.h"
-#include "malloc.h"
 #include "ntfs.h"
+#include "attrib.h"
 
-/**
+/*
 * ntfs_rl_mm - runlist memmove
+ * @base:	base runlist array
+ * @dst:	destination index in @base
+ * @src:	source index in @base
+ * @size:	number of elements to move
 *
 * It is up to the caller to serialize access to the runlist @base.
 */
-static inline void ntfs_rl_mm(runlist_element *base, int dst, int src,
-		int size)
+static inline void ntfs_rl_mm(struct runlist_element *base, int dst, int src, int size)
{
	if (likely((dst != src) && (size > 0)))
		memmove(base + dst, base + src, size * sizeof(*base));
}
 
-/**
+/*
 * ntfs_rl_mc - runlist memory copy
+ * @dstbase:	destination runlist array
+ * @dst:	destination index in @dstbase
+ * @srcbase:	source runlist array
+ * @src:	source index in @srcbase
+ * @size:	number of elements to copy
 *
 * It is up to the caller to serialize access to the runlists @dstbase and
 * @srcbase.
 */
-static inline void ntfs_rl_mc(runlist_element *dstbase, int dst,
-		runlist_element *srcbase, int src, int size)
+static inline void ntfs_rl_mc(struct runlist_element *dstbase, int dst,
+		struct runlist_element *srcbase, int src, int size)
{
	if (likely(size > 0))
		memcpy(dstbase + dst, srcbase + src, size * sizeof(*dstbase));
}
 
-/**
+/*
 * ntfs_rl_realloc - Reallocate memory for runlists
 * @rl:		original runlist
 * @old_size:	number of runlist elements in the original runlist @rl
@@ -53,21 +67,19 @@ static inline void ntfs_rl_mc(runlist_element *dstbase, int dst,
 * memory, the function will return the original pointer.
 *
 * On success, return a pointer to the newly allocated, or recycled, memory.
- * On error, return -errno.  The following error codes are defined:
- *	-ENOMEM	- Not enough memory to allocate runlist array.
- *	-EINVAL	- Invalid parameters were passed in.
+ * On error, return -errno.
 */
-static inline runlist_element *ntfs_rl_realloc(runlist_element *rl,
+struct runlist_element *ntfs_rl_realloc(struct runlist_element *rl,
 		int old_size, int new_size)
 {
-	runlist_element *new_rl;
+	struct runlist_element *new_rl;
 
-	old_size = PAGE_ALIGN(old_size * sizeof(*rl));
-	new_size = PAGE_ALIGN(new_size * sizeof(*rl));
+	old_size = old_size * sizeof(*rl);
+	new_size = new_size * sizeof(*rl);
 	if (old_size == new_size)
 		return rl;
 
-	new_rl = ntfs_malloc_nofs(new_size);
+	new_rl = kvzalloc(new_size, GFP_NOFS);
 	if (unlikely(!new_rl))
 		return ERR_PTR(-ENOMEM);
 
@@ -75,12 +87,12 @@ static inline runlist_element *ntfs_rl_realloc(runlist_element *rl,
 		if (unlikely(old_size > new_size))
 			old_size = new_size;
 		memcpy(new_rl, rl, old_size);
-		ntfs_free(rl);
+		kvfree(rl);
 	}
 	return new_rl;
 }
 
-/**
+/*
 * ntfs_rl_realloc_nofail - Reallocate memory for runlists
 * @rl:		original runlist
 * @old_size:	number of runlist elements in the original runlist @rl
@@ -99,33 +111,29 @@ static inline runlist_element *ntfs_rl_realloc(runlist_element *rl,
 * memory, the function will return the original pointer.
 *
 * On success, return a pointer to the newly allocated, or recycled, memory.
- * On error, return -errno.  The following error codes are defined:
- *	-ENOMEM	- Not enough memory to allocate runlist array.
- *	-EINVAL	- Invalid parameters were passed in.
+ * On error, return -errno.
 */
-static inline runlist_element *ntfs_rl_realloc_nofail(runlist_element *rl,
+static inline struct runlist_element *ntfs_rl_realloc_nofail(struct runlist_element *rl,
 		int old_size, int new_size)
 {
-	runlist_element *new_rl;
+	struct runlist_element *new_rl;
 
-	old_size = PAGE_ALIGN(old_size * sizeof(*rl));
-	new_size = PAGE_ALIGN(new_size * sizeof(*rl));
+	old_size = old_size * sizeof(*rl);
+	new_size = new_size * sizeof(*rl);
 	if (old_size == new_size)
 		return rl;
 
-	new_rl = ntfs_malloc_nofs_nofail(new_size);
-	BUG_ON(!new_rl);
-
+	new_rl = kvmalloc(new_size, GFP_NOFS | __GFP_NOFAIL);
 	if (likely(rl != NULL)) {
 		if (unlikely(old_size > new_size))
 			old_size = new_size;
 		memcpy(new_rl, rl, old_size);
-		ntfs_free(rl);
+		kvfree(rl);
 	}
 	return new_rl;
 }
 
-/**
+/*
 * ntfs_are_rl_mergeable - test if two runlists can be joined together
 * @dst:	original runlist
 * @src:	new runlist to test for mergeability with @dst
@@ -138,12 +146,9 @@ static inline runlist_element *ntfs_rl_realloc_nofail(runlist_element *rl,
 * Return: true   Success, the runlists can be merged.
 *	   false  Failure, the runlists cannot be merged.
 */
-static inline bool ntfs_are_rl_mergeable(runlist_element *dst,
-		runlist_element *src)
+static inline bool ntfs_are_rl_mergeable(struct runlist_element *dst,
+		struct runlist_element *src)
 {
-	BUG_ON(!dst);
-	BUG_ON(!src);
-
 	/* We can merge unmapped regions even if they are misaligned. */
 	if ((dst->lcn == LCN_RL_NOT_MAPPED) && (src->lcn == LCN_RL_NOT_MAPPED))
 		return true;
@@ -157,11 +162,14 @@ static inline bool ntfs_are_rl_mergeable(runlist_element *dst,
 	/* If we are merging two holes, we can merge them. */
 	if ((dst->lcn == LCN_HOLE) && (src->lcn == LCN_HOLE))
 		return true;
+	/* If we are merging two dealloc, we can merge them. */
+	if ((dst->lcn == LCN_DELALLOC) && (src->lcn == LCN_DELALLOC))
+		return true;
 	/* Cannot merge. */
 	return false;
 }
 
-/**
+/*
 * __ntfs_rl_merge - merge two runlists without testing if they can be merged
 * @dst:	original, destination runlist
 * @src:	new runlist to merge with @dst
@@ -172,18 +180,19 @@ static inline bool ntfs_are_rl_mergeable(runlist_element *dst,
 *
 * It is up to the caller to serialize access to the runlists @dst and @src.
 */
-static inline void __ntfs_rl_merge(runlist_element *dst, runlist_element *src)
+static inline void __ntfs_rl_merge(struct runlist_element *dst, struct runlist_element *src)
{
	dst->length += src->length;
}
 
-/**
+/*
 * ntfs_rl_append - append a runlist after a given element
- * @dst:	original runlist to be worked on
- * @dsize:	number of elements in @dst (including end marker)
- * @src:	runlist to be inserted into @dst
- * @ssize:	number of elements in @src (excluding end marker)
- * @loc:	append the new runlist @src after this element in @dst
+ * @dst:	destination runlist to append to
+ * @dsize:	number of elements in @dst
+ * @src:	source runlist to append from
+ * @ssize:	number of elements in @src
+ * @loc:	index in @dst after which to append @src
+ * @new_size:	on success, set to the new combined size
 *
 * Append the runlist @src after element @loc in @dst.  Merge the right end of
 * the new runlist, if necessary.  Adjust the size of the hole before the
@@ -196,20 +205,15 @@ static inline void __ntfs_rl_merge(runlist_element *dst, runlist_element *src)
 * the pointers for anything any more. (Strictly speaking the returned runlist
 * may be the same as @dst but this is irrelevant.)
 *
- * On error, return -errno. Both runlists are left unmodified. The following
- * error codes are defined:
- *	-ENOMEM	- Not enough memory to allocate runlist array.
- *	-EINVAL	- Invalid parameters were passed in.
+ * On error, return -errno. Both runlists are left unmodified.
 */
-static inline runlist_element *ntfs_rl_append(runlist_element *dst,
-		int dsize, runlist_element *src, int ssize, int loc)
+static inline struct runlist_element *ntfs_rl_append(struct runlist_element *dst,
+		int dsize, struct runlist_element *src, int ssize, int loc,
+		size_t *new_size)
 {
 	bool right = false;	/* Right end of @src needs merging. */
 	int marker;		/* End of the inserted runs. */
 
-	BUG_ON(!dst);
-	BUG_ON(!src);
-
 	/* First, check if the right hand end needs merging. */
 	if ((loc + 1) < dsize)
 		right = ntfs_are_rl_mergeable(src + ssize - 1, dst + loc + 1);
@@ -218,6 +222,8 @@ static inline runlist_element *ntfs_rl_append(runlist_element *dst,
 	dst = ntfs_rl_realloc(dst, dsize, dsize + ssize - right);
 	if (IS_ERR(dst))
 		return dst;
+
+	*new_size = dsize + ssize - right;
 	/*
 	 * We are guaranteed to succeed from here so can start modifying the
 	 * original runlists.
@@ -244,13 +250,14 @@ static inline runlist_element *ntfs_rl_append(runlist_element *dst,
 	return dst;
 }
 
-/**
+/*
 * ntfs_rl_insert - insert a runlist into another
- * @dst:	original runlist to be worked on
- * @dsize:	number of elements in @dst (including end marker)
- * @src:	new runlist to be inserted
- * @ssize:	number of elements in @src (excluding end marker)
- * @loc:	insert the new runlist @src before this element in @dst
+ * @dst:	destination runlist to insert into
+ * @dsize:	number of elements in @dst
+ * @src:	source runlist to insert from
+ * @ssize:	number of elements in @src
+ * @loc:	index in @dst at which to insert @src
+ * @new_size:	on success, set to the new combined size
 *
 * Insert the runlist @src before element @loc in the runlist @dst.  Merge the
Merge = the * left end of the new runlist, if necessary. Adjust the size of the hole @@ -263,21 +270,16 @@ static inline runlist_element *ntfs_rl_append(runlist= _element *dst, * the pointers for anything any more. (Strictly speaking the returned run= list * may be the same as @dst but this is irrelevant.) * - * On error, return -errno. Both runlists are left unmodified. The followi= ng - * error codes are defined: - * -ENOMEM - Not enough memory to allocate runlist array. - * -EINVAL - Invalid parameters were passed in. + * On error, return -errno. Both runlists are left unmodified. */ -static inline runlist_element *ntfs_rl_insert(runlist_element *dst, - int dsize, runlist_element *src, int ssize, int loc) +static inline struct runlist_element *ntfs_rl_insert(struct runlist_elemen= t *dst, + int dsize, struct runlist_element *src, int ssize, int loc, + size_t *new_size) { bool left =3D false; /* Left end of @src needs merging. */ bool disc =3D false; /* Discontinuity between @dst and @src. */ int marker; /* End of the inserted runs. */ =20 - BUG_ON(!dst); - BUG_ON(!src); - /* * disc =3D> Discontinuity between the end of @dst and the start of @src. * This means we might need to insert a "not mapped" run. @@ -302,6 +304,8 @@ static inline runlist_element *ntfs_rl_insert(runlist_e= lement *dst, dst =3D ntfs_rl_realloc(dst, dsize, dsize + ssize - left + disc); if (IS_ERR(dst)) return dst; + + *new_size =3D dsize + ssize - left + disc; /* * We are guaranteed to succeed from here so can start modifying the * original runlist. @@ -324,7 +328,8 @@ static inline runlist_element *ntfs_rl_insert(runlist_e= lement *dst, /* Adjust the VCN of the first run after the insertion... */ dst[marker].vcn =3D dst[marker - 1].vcn + dst[marker - 1].length; /* ... and the length. */ - if (dst[marker].lcn =3D=3D LCN_HOLE || dst[marker].lcn =3D=3D LCN_RL_NOT_= MAPPED) + if (dst[marker].lcn =3D=3D LCN_HOLE || dst[marker].lcn =3D=3D LCN_RL_NOT_= MAPPED || + dst[marker].lcn =3D=3D LCN_DELALLOC) dst[marker].length =3D dst[marker + 1].vcn - dst[marker].vcn; =20 /* Writing beyond the end of the file and there is a discontinuity. */ @@ -341,13 +346,14 @@ static inline runlist_element *ntfs_rl_insert(runlist= _element *dst, return dst; } =20 -/** +/* * ntfs_rl_replace - overwrite a runlist element with another runlist - * @dst: original runlist to be worked on - * @dsize: number of elements in @dst (including end marker) - * @src: new runlist to be inserted - * @ssize: number of elements in @src (excluding end marker) - * @loc: index in runlist @dst to overwrite with @src + * @dst: destination runlist to replace in + * @dsize: number of elements in @dst + * @src: source runlist to replace with + * @ssize: number of elements in @src + * @loc: index in @dst to replace + * @new_size: on success, set to the new combined size * * Replace the runlist element @dst at @loc with @src. Merge the left and * right ends of the inserted runlist, if necessary. @@ -359,23 +365,18 @@ static inline runlist_element *ntfs_rl_insert(runlist= _element *dst, * the pointers for anything any more. (Strictly speaking the returned run= list * may be the same as @dst but this is irrelevant.) * - * On error, return -errno. Both runlists are left unmodified. The followi= ng - * error codes are defined: - * -ENOMEM - Not enough memory to allocate runlist array. - * -EINVAL - Invalid parameters were passed in. + * On error, return -errno. Both runlists are left unmodified. 
*/ -static inline runlist_element *ntfs_rl_replace(runlist_element *dst, - int dsize, runlist_element *src, int ssize, int loc) +static inline struct runlist_element *ntfs_rl_replace(struct runlist_eleme= nt *dst, + int dsize, struct runlist_element *src, int ssize, int loc, + size_t *new_size) { - signed delta; + int delta; bool left =3D false; /* Left end of @src needs merging. */ bool right =3D false; /* Right end of @src needs merging. */ int tail; /* Start of tail of @dst. */ int marker; /* End of the inserted runs. */ =20 - BUG_ON(!dst); - BUG_ON(!src); - /* First, see if the left and right ends need merging. */ if ((loc + 1) < dsize) right =3D ntfs_are_rl_mergeable(src + ssize - 1, dst + loc + 1); @@ -391,6 +392,8 @@ static inline runlist_element *ntfs_rl_replace(runlist_= element *dst, if (IS_ERR(dst)) return dst; } + + *new_size =3D dsize + delta; /* * We are guaranteed to succeed from here so can start modifying the * original runlists. @@ -429,13 +432,14 @@ static inline runlist_element *ntfs_rl_replace(runlis= t_element *dst, return dst; } =20 -/** +/* * ntfs_rl_split - insert a runlist into the centre of a hole - * @dst: original runlist to be worked on - * @dsize: number of elements in @dst (including end marker) - * @src: new runlist to be inserted - * @ssize: number of elements in @src (excluding end marker) - * @loc: index in runlist @dst at which to split and insert @src + * @dst: destination runlist with a hole + * @dsize: number of elements in @dst + * @src: source runlist to insert + * @ssize: number of elements in @src + * @loc: index in @dst of the hole to split + * @new_size: on success, set to the new combined size * * Split the runlist @dst at @loc into two and insert @new in between the = two * fragments. No merging of runlists is necessary. Adjust the size of the @@ -448,21 +452,18 @@ static inline runlist_element *ntfs_rl_replace(runlis= t_element *dst, * the pointers for anything any more. (Strictly speaking the returned run= list * may be the same as @dst but this is irrelevant.) * - * On error, return -errno. Both runlists are left unmodified. The followi= ng - * error codes are defined: - * -ENOMEM - Not enough memory to allocate runlist array. - * -EINVAL - Invalid parameters were passed in. + * On error, return -errno. Both runlists are left unmodified. */ -static inline runlist_element *ntfs_rl_split(runlist_element *dst, int dsi= ze, - runlist_element *src, int ssize, int loc) +static inline struct runlist_element *ntfs_rl_split(struct runlist_element= *dst, int dsize, + struct runlist_element *src, int ssize, int loc, + size_t *new_size) { - BUG_ON(!dst); - BUG_ON(!src); - /* Space required: @dst size + @src size + one new hole. */ dst =3D ntfs_rl_realloc(dst, dsize, dsize + ssize + 1); if (IS_ERR(dst)) return dst; + + *new_size =3D dsize + ssize + 1; /* * We are guaranteed to succeed from here so can start modifying the * original runlists. @@ -480,10 +481,12 @@ static inline runlist_element *ntfs_rl_split(runlist_= element *dst, int dsize, return dst; } =20 -/** +/* * ntfs_runlists_merge - merge two runlists into one - * @drl: original runlist to be worked on - * @srl: new runlist to be merged into @drl + * @d_runlist: destination runlist structure to merge into + * @srl: source runlist to merge from + * @s_rl_count: number of elements in @srl (0 to auto-detect) + * @new_rl_count: on success, set to the new combined runlist size * * First we sanity check the two runlists @srl and @drl to make sure that = they * are sensible and can be merged. 
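To picture what the split primitive does, here is a hypothetical userspace sketch of the splice, assuming the caller has already grown the destination array to dsize + ssize + 1 elements (toy types, no bounds handling):

#include <stdint.h>
#include <string.h>

struct toy_run { int64_t vcn, lcn, length; };

/* Split the hole dst[loc] around src[0..ssize): duplicate the tail
 * (hole included) ssize + 1 slots up, drop src into the gap, then
 * re-derive the two hole fragments from their neighbours. */
static void toy_split(struct toy_run *dst, int dsize,
                      const struct toy_run *src, int ssize, int loc)
{
        int m = loc + ssize + 1;              /* right hole fragment */

        memmove(dst + m, dst + loc, (dsize - loc) * sizeof(*dst));
        memcpy(dst + loc + 1, src, ssize * sizeof(*src));
        /* left fragment ends where the inserted runs begin */
        dst[loc].length = dst[loc + 1].vcn - dst[loc].vcn;
        /* right fragment starts after them and runs to the next run */
        dst[m].vcn = dst[m - 1].vcn + dst[m - 1].length;
        dst[m].length = dst[m + 1].vcn - dst[m].vcn;
}

The hole element is deliberately duplicated by the memmove and then trimmed into the two fragments, which is why no extra bookkeeping for the original hole length is needed.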
The runlist @srl must be either after t= he @@ -508,23 +511,20 @@ static inline runlist_element *ntfs_rl_split(runlist_= element *dst, int dsize, * the pointers for anything any more. (Strictly speaking the returned run= list * may be the same as @dst but this is irrelevant.) * - * On error, return -errno. Both runlists are left unmodified. The followi= ng - * error codes are defined: - * -ENOMEM - Not enough memory to allocate runlist array. - * -EINVAL - Invalid parameters were passed in. - * -ERANGE - The runlists overlap and cannot be merged. + * On error, return -errno. Both runlists are left unmodified. */ -runlist_element *ntfs_runlists_merge(runlist_element *drl, - runlist_element *srl) +struct runlist_element *ntfs_runlists_merge(struct runlist *d_runlist, + struct runlist_element *srl, size_t s_rl_count, + size_t *new_rl_count) { int di, si; /* Current index into @[ds]rl. */ int sstart; /* First index with lcn > LCN_RL_NOT_MAPPED. */ int dins; /* Index into @drl at which to insert @srl. */ int dend, send; /* Last index into @[ds]rl. */ - int dfinal, sfinal; /* The last index into @[ds]rl with - lcn >=3D LCN_HOLE. */ + int dfinal, sfinal; /* The last index into @[ds]rl with lcn >=3D LCN_HOLE= . */ int marker =3D 0; - VCN marker_vcn =3D 0; + s64 marker_vcn =3D 0; + struct runlist_element *drl =3D d_runlist->rl, *rl; =20 #ifdef DEBUG ntfs_debug("dst:"); @@ -539,27 +539,36 @@ runlist_element *ntfs_runlists_merge(runlist_element = *drl, if (IS_ERR(srl) || IS_ERR(drl)) return ERR_PTR(-EINVAL); =20 + if (s_rl_count =3D=3D 0) { + for (; srl[s_rl_count].length; s_rl_count++) + ; + s_rl_count++; + } + /* Check for the case where the first mapping is being done now. */ if (unlikely(!drl)) { drl =3D srl; /* Complete the source runlist if necessary. */ if (unlikely(drl[0].vcn)) { /* Scan to the end of the source runlist. */ - for (dend =3D 0; likely(drl[dend].length); dend++) - ; - dend++; - drl =3D ntfs_rl_realloc(drl, dend, dend + 1); + drl =3D ntfs_rl_realloc(drl, s_rl_count, s_rl_count + 1); if (IS_ERR(drl)) return drl; /* Insert start element at the front of the runlist. */ - ntfs_rl_mm(drl, 1, 0, dend); + ntfs_rl_mm(drl, 1, 0, s_rl_count); drl[0].vcn =3D 0; drl[0].lcn =3D LCN_RL_NOT_MAPPED; drl[0].length =3D drl[1].vcn; + s_rl_count++; } + + *new_rl_count =3D s_rl_count; goto finished; } =20 + if (d_runlist->count < 1 || s_rl_count < 2) + return ERR_PTR(-EINVAL); + si =3D di =3D 0; =20 /* Skip any unmapped start element(s) in the source runlist. */ @@ -567,7 +576,7 @@ runlist_element *ntfs_runlists_merge(runlist_element *d= rl, si++; =20 /* Can't have an entirely unmapped source runlist. */ - BUG_ON(!srl[si].length); + WARN_ON(!srl[si].length); =20 /* Record the starting points. */ sstart =3D si; @@ -577,10 +586,11 @@ runlist_element *ntfs_runlists_merge(runlist_element = *drl, * be inserted. If we reach the end of @drl, @srl just needs to be * appended to @drl. */ - for (; drl[di].length; di++) { - if (drl[di].vcn + drl[di].length > srl[sstart].vcn) - break; - } + rl =3D __ntfs_attr_find_vcn_nolock(d_runlist, srl[sstart].vcn); + if (IS_ERR(rl)) + di =3D (int)d_runlist->count - 1; + else + di =3D (int)(rl - d_runlist->rl); dins =3D di; =20 /* Sanity check for illegal overlaps. */ @@ -591,10 +601,8 @@ runlist_element *ntfs_runlists_merge(runlist_element *= drl, } =20 /* Scan to the end of both runlists in order to know their sizes. 
*/ - for (send =3D si; srl[send].length; send++) - ; - for (dend =3D di; drl[dend].length; dend++) - ; + send =3D (int)s_rl_count - 1; + dend =3D (int)d_runlist->count - 1; =20 if (srl[send].lcn =3D=3D LCN_ENOENT) marker_vcn =3D srl[marker =3D send].vcn; @@ -622,28 +630,23 @@ runlist_element *ntfs_runlists_merge(runlist_element = *drl, ss++; if (marker && (drl[dins].vcn + drl[dins].length > srl[send - 1].vcn)) finish =3D false; -#if 0 - ntfs_debug("dfinal =3D %i, dend =3D %i", dfinal, dend); - ntfs_debug("sstart =3D %i, sfinal =3D %i, send =3D %i", sstart, sfinal, s= end); - ntfs_debug("start =3D %i, finish =3D %i", start, finish); - ntfs_debug("ds =3D %i, ss =3D %i, dins =3D %i", ds, ss, dins); -#endif + if (start) { if (finish) - drl =3D ntfs_rl_replace(drl, ds, srl + sstart, ss, dins); + drl =3D ntfs_rl_replace(drl, ds, srl + sstart, ss, dins, new_rl_count); else - drl =3D ntfs_rl_insert(drl, ds, srl + sstart, ss, dins); + drl =3D ntfs_rl_insert(drl, ds, srl + sstart, ss, dins, new_rl_count); } else { if (finish) - drl =3D ntfs_rl_append(drl, ds, srl + sstart, ss, dins); + drl =3D ntfs_rl_append(drl, ds, srl + sstart, ss, dins, new_rl_count); else - drl =3D ntfs_rl_split(drl, ds, srl + sstart, ss, dins); + drl =3D ntfs_rl_split(drl, ds, srl + sstart, ss, dins, new_rl_count); } if (IS_ERR(drl)) { ntfs_error(NULL, "Merge failed."); return drl; } - ntfs_free(srl); + kvfree(srl); if (marker) { ntfs_debug("Triggering marker code."); for (ds =3D dend; drl[ds].length; ds++) @@ -653,9 +656,7 @@ runlist_element *ntfs_runlists_merge(runlist_element *d= rl, int slots =3D 0; =20 if (drl[ds].vcn =3D=3D marker_vcn) { - ntfs_debug("Old marker =3D 0x%llx, replacing " - "with LCN_ENOENT.", - (unsigned long long) + ntfs_debug("Old marker =3D 0x%llx, replacing with LCN_ENOENT.", drl[ds].lcn); drl[ds].lcn =3D LCN_ENOENT; goto finished; @@ -675,6 +676,7 @@ runlist_element *ntfs_runlists_merge(runlist_element *d= rl, drl =3D ntfs_rl_realloc_nofail(drl, ds, ds + 2); slots =3D 2; + *new_rl_count +=3D 2; } ds++; /* Need to set vcn if it isn't set already. */ @@ -688,8 +690,10 @@ runlist_element *ntfs_runlists_merge(runlist_element *= drl, drl[ds].length =3D marker_vcn - drl[ds].vcn; /* Finally add the ENOENT terminator. */ ds++; - if (!slots) + if (!slots) { drl =3D ntfs_rl_realloc_nofail(drl, ds, ds + 1); + *new_rl_count +=3D 1; + } drl[ds].vcn =3D marker_vcn; drl[ds].lcn =3D LCN_ENOENT; drl[ds].length =3D (s64)0; @@ -704,11 +708,12 @@ runlist_element *ntfs_runlists_merge(runlist_element = *drl, return drl; } =20 -/** +/* * ntfs_mapping_pairs_decompress - convert mapping pairs array to runlist - * @vol: ntfs volume on which the attribute resides - * @attr: attribute record whose mapping pairs array to decompress - * @old_rl: optional runlist in which to insert @attr's runlist + * @vol: ntfs volume + * @attr: attribute record whose mapping pairs to decompress + * @old_runlist: optional runlist to merge the decompressed runlist into + * @new_rl_count: on success, set to the new runlist size * * It is up to the caller to serialize access to the runlist @old_rl. * @@ -720,58 +725,44 @@ runlist_element *ntfs_runlists_merge(runlist_element = *drl, * returned. The original @old_rl is deallocated. * * On error, return -errno. @old_rl is left unmodified in that case. - * - * The following error codes are defined: - * -ENOMEM - Not enough memory to allocate runlist array. - * -EIO - Corrupt runlist. - * -EINVAL - Invalid parameters were passed in. - * -ERANGE - The two runlists overlap. 
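The choice between the four splice primitives comes down to two booleans, roughly whether the new runs start exactly at the insertion run and whether they reach its end. A minimal sketch of that dispatch (names invented for illustration):

#include <stdbool.h>

enum splice_op { SPLICE_REPLACE, SPLICE_INSERT, SPLICE_APPEND, SPLICE_SPLIT };

/* start:  the new runs begin exactly at dst[dins]'s first vcn
 * finish: the new runs extend to (or past) the end of dst[dins] */
static enum splice_op choose_splice(bool start, bool finish)
{
        if (start)
                return finish ? SPLICE_REPLACE : SPLICE_INSERT;
        return finish ? SPLICE_APPEND : SPLICE_SPLIT;
}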
- * - * FIXME: For now we take the conceptionally simplest approach of creating= the - * new runlist disregarding the already existing one and then splicing the - * two into one, if that is possible (we check for overlap and discard the= new - * runlist if overlap present before returning ERR_PTR(-ERANGE)). */ -runlist_element *ntfs_mapping_pairs_decompress(const ntfs_volume *vol, - const ATTR_RECORD *attr, runlist_element *old_rl) +struct runlist_element *ntfs_mapping_pairs_decompress(const struct ntfs_vo= lume *vol, + const struct attr_record *attr, struct runlist *old_runlist, + size_t *new_rl_count) { - VCN vcn; /* Current vcn. */ - LCN lcn; /* Current lcn. */ + s64 vcn; /* Current vcn. */ + s64 lcn; /* Current lcn. */ s64 deltaxcn; /* Change in [vl]cn. */ - runlist_element *rl; /* The output runlist. */ + struct runlist_element *rl, *new_rl; /* The output runlist. */ u8 *buf; /* Current position in mapping pairs array. */ u8 *attr_end; /* End of attribute. */ int rlsize; /* Size of runlist buffer. */ - u16 rlpos; /* Current runlist position in units of - runlist_elements. */ + u16 rlpos; /* Current runlist position in units of struct runlist_elemen= ts. */ u8 b; /* Current byte offset in buf. */ =20 #ifdef DEBUG /* Make sure attr exists and is non-resident. */ - if (!attr || !attr->non_resident || sle64_to_cpu( - attr->data.non_resident.lowest_vcn) < (VCN)0) { + if (!attr || !attr->non_resident) { ntfs_error(vol->sb, "Invalid arguments."); return ERR_PTR(-EINVAL); } #endif /* Start at vcn =3D lowest_vcn and lcn 0. */ - vcn =3D sle64_to_cpu(attr->data.non_resident.lowest_vcn); + vcn =3D le64_to_cpu(attr->data.non_resident.lowest_vcn); lcn =3D 0; /* Get start of the mapping pairs array. */ - buf =3D (u8*)attr + le16_to_cpu( - attr->data.non_resident.mapping_pairs_offset); - attr_end =3D (u8*)attr + le32_to_cpu(attr->length); - if (unlikely(buf < (u8*)attr || buf > attr_end)) { + buf =3D (u8 *)attr + + le16_to_cpu(attr->data.non_resident.mapping_pairs_offset); + attr_end =3D (u8 *)attr + le32_to_cpu(attr->length); + if (unlikely(buf < (u8 *)attr || buf > attr_end)) { ntfs_error(vol->sb, "Corrupt attribute."); return ERR_PTR(-EIO); } - /* If the mapping pairs array is valid but empty, nothing to do. */ - if (!vcn && !*buf) - return old_rl; + /* Current position in runlist array. */ rlpos =3D 0; /* Allocate first page and set current runlist size to one page. */ - rl =3D ntfs_malloc_nofs(rlsize =3D PAGE_SIZE); + rl =3D kvzalloc(rlsize =3D PAGE_SIZE, GFP_NOFS); if (unlikely(!rl)) return ERR_PTR(-ENOMEM); /* Insert unmapped starting element if necessary. */ @@ -784,19 +775,19 @@ runlist_element *ntfs_mapping_pairs_decompress(const = ntfs_volume *vol, while (buf < attr_end && *buf) { /* * Allocate more memory if needed, including space for the - * not-mapped and terminator elements. ntfs_malloc_nofs() + * not-mapped and terminator elements. kvzalloc() * operates on whole pages only. 
*/ - if (((rlpos + 3) * sizeof(*old_rl)) > rlsize) { - runlist_element *rl2; + if (((rlpos + 3) * sizeof(*rl)) > rlsize) { + struct runlist_element *rl2; =20 - rl2 =3D ntfs_malloc_nofs(rlsize + (int)PAGE_SIZE); + rl2 =3D kvzalloc(rlsize + PAGE_SIZE, GFP_NOFS); if (unlikely(!rl2)) { - ntfs_free(rl); + kvfree(rl); return ERR_PTR(-ENOMEM); } memcpy(rl2, rl, rlsize); - ntfs_free(rl); + kvfree(rl); rl =3D rl2; rlsize +=3D PAGE_SIZE; } @@ -816,8 +807,7 @@ runlist_element *ntfs_mapping_pairs_decompress(const nt= fs_volume *vol, for (deltaxcn =3D (s8)buf[b--]; b; b--) deltaxcn =3D (deltaxcn << 8) + buf[b]; } else { /* The length entry is compulsory. */ - ntfs_error(vol->sb, "Missing length entry in mapping " - "pairs array."); + ntfs_error(vol->sb, "Missing length entry in mapping pairs array."); deltaxcn =3D (s64)-1; } /* @@ -825,8 +815,7 @@ runlist_element *ntfs_mapping_pairs_decompress(const nt= fs_volume *vol, * hence clean-up and return NULL. */ if (unlikely(deltaxcn < 0)) { - ntfs_error(vol->sb, "Invalid length in mapping pairs " - "array."); + ntfs_error(vol->sb, "Invalid length in mapping pairs array."); goto err_out; } /* @@ -846,6 +835,7 @@ runlist_element *ntfs_mapping_pairs_decompress(const nt= fs_volume *vol, else { /* Get the lcn change which really can be negative. */ u8 b2 =3D *buf & 0xf; + b =3D b2 + ((*buf >> 4) & 0xf); if (buf + b > attr_end) goto io_error; @@ -855,30 +845,39 @@ runlist_element *ntfs_mapping_pairs_decompress(const = ntfs_volume *vol, lcn +=3D deltaxcn; #ifdef DEBUG /* - * On NTFS 1.2-, apparently can have lcn =3D=3D -1 to + * On NTFS 1.2-, apparently can have lcn =3D=3D LCN_HOLE to * indicate a hole. But we haven't verified ourselves * whether it is really the lcn or the deltaxcn that is - * -1. So if either is found give us a message so we + * LCN_HOLE. So if either is found give us a message so we * can investigate it further! */ if (vol->major_ver < 3) { - if (unlikely(deltaxcn =3D=3D (LCN)-1)) - ntfs_error(vol->sb, "lcn delta =3D=3D -1"); - if (unlikely(lcn =3D=3D (LCN)-1)) - ntfs_error(vol->sb, "lcn =3D=3D -1"); + if (unlikely(deltaxcn =3D=3D LCN_HOLE)) + ntfs_error(vol->sb, "lcn delta =3D=3D LCN_HOLE"); + if (unlikely(lcn =3D=3D LCN_HOLE)) + ntfs_error(vol->sb, "lcn =3D=3D LCN_HOLE"); } #endif - /* Check lcn is not below -1. */ - if (unlikely(lcn < (LCN)-1)) { - ntfs_error(vol->sb, "Invalid LCN < -1 in " - "mapping pairs array."); + /* Check lcn is not below LCN_HOLE. */ + if (unlikely(lcn < LCN_HOLE)) { + ntfs_error(vol->sb, "Invalid s64 < -2 in mapping pairs array."); goto err_out; } + + /* chkdsk accepts zero-sized runs only for holes */ + if ((lcn !=3D LCN_HOLE) && !rl[rlpos].length) { + ntfs_error(vol->sb, + "Invalid zero-sized data run(lcn : %lld).\n", + lcn); + goto err_out; + } + /* Enter the current lcn into the runlist element. */ rl[rlpos].lcn =3D lcn; } - /* Get to the next runlist element. */ - rlpos++; + /* Get to the next runlist element, skipping zero-sized holes */ + if (rl[rlpos].length) + rlpos++; /* Increment the buffer position to the next mapping pair. */ buf +=3D (*buf & 0xf) + ((*buf >> 4) & 0xf) + 1; } @@ -888,19 +887,17 @@ runlist_element *ntfs_mapping_pairs_decompress(const = ntfs_volume *vol, * If there is a highest_vcn specified, it must be equal to the final * vcn in the runlist - 1, or something has gone badly wrong. 
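The byte format being decoded here: each pair starts with a header byte whose low nibble gives the size of the run-length field and whose high nibble gives the size of the lcn-delta field; both fields are little-endian and sign-extended from their top byte. A self-contained userspace sketch of one pair's decode, mirroring that logic but omitting the driver's bounds and sanity checks (toy_* names are illustrative):

#include <stddef.h>
#include <stdint.h>

/* Returns bytes consumed, or 0 at the terminator / on a missing length. */
static size_t toy_decode_pair(const uint8_t *p, int64_t *len, int64_t *delta)
{
        size_t len_sz = *p & 0xf, lcn_sz = (*p >> 4) & 0xf;
        int64_t v;
        size_t i;

        if (!*p || !len_sz)
                return 0;                    /* length entry is compulsory */

        v = (int8_t)p[len_sz];               /* sign-extend the top byte */
        for (i = len_sz - 1; i >= 1; i--)
                v = (v << 8) | p[i];
        *len = v;                            /* must come out positive */

        if (lcn_sz) {
                v = (int8_t)p[len_sz + lcn_sz];
                for (i = lcn_sz - 1; i >= 1; i--)
                        v = (v << 8) | p[len_sz + i];
                *delta = v;                  /* applied to the running lcn */
        } else {
                *delta = 0;                  /* no lcn stored: sparse run */
        }
        return 1 + len_sz + lcn_sz;
}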
*/ - deltaxcn =3D sle64_to_cpu(attr->data.non_resident.highest_vcn); + deltaxcn =3D le64_to_cpu(attr->data.non_resident.highest_vcn); if (unlikely(deltaxcn && vcn - 1 !=3D deltaxcn)) { mpa_err: - ntfs_error(vol->sb, "Corrupt mapping pairs array in " - "non-resident attribute."); + ntfs_error(vol->sb, "Corrupt mapping pairs array in non-resident attribu= te."); goto err_out; } /* Setup not mapped runlist element if this is the base extent. */ if (!attr->data.non_resident.lowest_vcn) { - VCN max_cluster; + s64 max_cluster; =20 - max_cluster =3D ((sle64_to_cpu( - attr->data.non_resident.allocated_size) + + max_cluster =3D ((le64_to_cpu(attr->data.non_resident.allocated_size) + vol->cluster_size - 1) >> vol->cluster_size_bits) - 1; /* @@ -915,24 +912,17 @@ runlist_element *ntfs_mapping_pairs_decompress(const = ntfs_volume *vol, * this one. */ if (deltaxcn < max_cluster) { - ntfs_debug("More extents to follow; deltaxcn " - "=3D 0x%llx, max_cluster =3D " - "0x%llx", - (unsigned long long)deltaxcn, - (unsigned long long) - max_cluster); + ntfs_debug("More extents to follow; deltaxcn =3D 0x%llx, max_cluster = =3D 0x%llx", + deltaxcn, max_cluster); rl[rlpos].vcn =3D vcn; vcn +=3D rl[rlpos].length =3D max_cluster - deltaxcn; rl[rlpos].lcn =3D LCN_RL_NOT_MAPPED; rlpos++; } else if (unlikely(deltaxcn > max_cluster)) { - ntfs_error(vol->sb, "Corrupt attribute. " - "deltaxcn =3D 0x%llx, " - "max_cluster =3D 0x%llx", - (unsigned long long)deltaxcn, - (unsigned long long) - max_cluster); + ntfs_error(vol->sb, + "Corrupt attribute. deltaxcn =3D 0x%llx, max_cluster =3D 0x%llx", + deltaxcn, max_cluster); goto mpa_err; } } @@ -944,26 +934,27 @@ runlist_element *ntfs_mapping_pairs_decompress(const = ntfs_volume *vol, rl[rlpos].vcn =3D vcn; rl[rlpos].length =3D (s64)0; /* If no existing runlist was specified, we are done. */ - if (!old_rl) { + if (!old_runlist || !old_runlist->rl) { + *new_rl_count =3D rlpos + 1; ntfs_debug("Mapping pairs array successfully decompressed:"); ntfs_debug_dump_runlist(rl); return rl; } /* Now combine the new and old runlists checking for overlaps. */ - old_rl =3D ntfs_runlists_merge(old_rl, rl); - if (!IS_ERR(old_rl)) - return old_rl; - ntfs_free(rl); + new_rl =3D ntfs_runlists_merge(old_runlist, rl, rlpos + 1, new_rl_count); + if (!IS_ERR(new_rl)) + return new_rl; + kvfree(rl); ntfs_error(vol->sb, "Failed to merge runlists."); - return old_rl; + return new_rl; io_error: ntfs_error(vol->sb, "Corrupt attribute."); err_out: - ntfs_free(rl); + kvfree(rl); return ERR_PTR(-EIO); } =20 -/** +/* * ntfs_rl_vcn_to_lcn - convert a vcn into a lcn given a runlist * @rl: runlist to use for conversion * @vcn: vcn to convert @@ -987,11 +978,10 @@ runlist_element *ntfs_mapping_pairs_decompress(const = ntfs_volume *vol, * - This function does not touch the lock, nor does it modify the * runlist. */ -LCN ntfs_rl_vcn_to_lcn(const runlist_element *rl, const VCN vcn) +s64 ntfs_rl_vcn_to_lcn(const struct runlist_element *rl, const s64 vcn) { int i; =20 - BUG_ON(vcn < 0); /* * If rl is NULL, assume that we have found an unmapped runlist. 
The * caller can then attempt to map it and fail appropriately if @@ -1005,8 +995,8 @@ LCN ntfs_rl_vcn_to_lcn(const runlist_element *rl, cons= t VCN vcn) return LCN_ENOENT; =20 for (i =3D 0; likely(rl[i].length); i++) { - if (unlikely(vcn < rl[i+1].vcn)) { - if (likely(rl[i].lcn >=3D (LCN)0)) + if (vcn < rl[i+1].vcn) { + if (likely(rl[i].lcn >=3D 0)) return rl[i].lcn + (vcn - rl[i].vcn); return rl[i].lcn; } @@ -1015,15 +1005,13 @@ LCN ntfs_rl_vcn_to_lcn(const runlist_element *rl, c= onst VCN vcn) * The terminator element is setup to the correct value, i.e. one of * LCN_HOLE, LCN_RL_NOT_MAPPED, or LCN_ENOENT. */ - if (likely(rl[i].lcn < (LCN)0)) + if (likely(rl[i].lcn < 0)) return rl[i].lcn; /* Just in case... We could replace this with BUG() some day. */ return LCN_ENOENT; } =20 -#ifdef NTFS_RW - -/** +/* * ntfs_rl_find_vcn_nolock - find a vcn in a runlist * @rl: runlist to search * @vcn: vcn to find @@ -1036,9 +1024,8 @@ LCN ntfs_rl_vcn_to_lcn(const runlist_element *rl, con= st VCN vcn) * * Locking: The runlist must be locked on entry. */ -runlist_element *ntfs_rl_find_vcn_nolock(runlist_element *rl, const VCN vc= n) +struct runlist_element *ntfs_rl_find_vcn_nolock(struct runlist_element *rl= , const s64 vcn) { - BUG_ON(vcn < 0); if (unlikely(!rl || vcn < rl[0].vcn)) return NULL; while (likely(rl->length)) { @@ -1054,7 +1041,7 @@ runlist_element *ntfs_rl_find_vcn_nolock(runlist_elem= ent *rl, const VCN vcn) return NULL; } =20 -/** +/* * ntfs_get_nr_significant_bytes - get number of bytes needed to store a n= umber * @n: number for which to get the number of bytes for * @@ -1085,12 +1072,13 @@ static inline int ntfs_get_nr_significant_bytes(con= st s64 n) return i; } =20 -/** +/* * ntfs_get_size_for_mapping_pairs - get bytes needed for mapping pairs ar= ray - * @vol: ntfs volume (needed for the ntfs version) - * @rl: locked runlist to determine the size of the mapping pairs of - * @first_vcn: first vcn which to include in the mapping pairs array - * @last_vcn: last vcn which to include in the mapping pairs array + * @vol: ntfs volume + * @rl: runlist to calculate the mapping pairs array size for + * @first_vcn: first vcn which to include in the mapping pairs array + * @last_vcn: last vcn which to include in the mapping pairs array + * @max_mp_size: maximum size to return (0 or less means unlimited) * * Walk the locked runlist @rl and calculate the size in bytes of the mapp= ing * pairs array corresponding to the runlist @rl, starting at vcn @first_vc= n and @@ -1106,30 +1094,28 @@ static inline int ntfs_get_nr_significant_bytes(con= st s64 n) * If @rl is NULL, just return 1 (for the single terminator byte). * * Return the calculated size in bytes on success. On error, return -errn= o. - * The following error codes are defined: - * -EINVAL - Run list contains unmapped elements. Make sure to only pass - * fully mapped runlists to this function. - * -EIO - The runlist is corrupt. - * - * Locking: @rl must be locked on entry (either for reading or writing), it - * remains locked throughout, and is left locked upon return. 
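A condensed userspace sketch of the vcn-to-lcn walk above (toy types, and the sentinel value is illustrative rather than the driver's definition):

#include <stdint.h>

enum { TOY_LCN_ENOENT = -3 };

struct toy_run { int64_t vcn, lcn, length; };

/* Walk to the run containing @vcn; real lcns get the offset into the
 * run added, negative sentinels are passed through unchanged. */
static int64_t toy_vcn_to_lcn(const struct toy_run *rl, int64_t vcn)
{
        size_t i;

        for (i = 0; rl[i].length; i++)
                if (vcn < rl[i + 1].vcn)
                        return rl[i].lcn >= 0 ?
                               rl[i].lcn + (vcn - rl[i].vcn) : rl[i].lcn;
        /* the terminator carries the right sentinel for "past the end" */
        return rl[i].lcn < 0 ? rl[i].lcn : TOY_LCN_ENOENT;
}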
*/ -int ntfs_get_size_for_mapping_pairs(const ntfs_volume *vol, - const runlist_element *rl, const VCN first_vcn, - const VCN last_vcn) +int ntfs_get_size_for_mapping_pairs(const struct ntfs_volume *vol, + const struct runlist_element *rl, const s64 first_vcn, + const s64 last_vcn, int max_mp_size) { - LCN prev_lcn; + s64 prev_lcn; int rls; bool the_end =3D false; =20 - BUG_ON(first_vcn < 0); - BUG_ON(last_vcn < -1); - BUG_ON(last_vcn >=3D 0 && first_vcn > last_vcn); + if (first_vcn < 0 || last_vcn < -1) + return -EINVAL; + + if (last_vcn >=3D 0 && first_vcn > last_vcn) + return -EINVAL; + if (!rl) { - BUG_ON(first_vcn); - BUG_ON(last_vcn > 0); + WARN_ON(first_vcn); + WARN_ON(last_vcn > 0); return 1; } + if (max_mp_size <=3D 0) + max_mp_size =3D INT_MAX; /* Skip to runlist element containing @first_vcn. */ while (rl->length && first_vcn >=3D rl[1].vcn) rl++; @@ -1152,6 +1138,7 @@ int ntfs_get_size_for_mapping_pairs(const ntfs_volume= *vol, */ if (unlikely(last_vcn >=3D 0 && rl[1].vcn > last_vcn)) { s64 s1 =3D last_vcn + 1; + if (unlikely(rl[1].vcn > s1)) length =3D s1 - rl->vcn; the_end =3D true; @@ -1188,6 +1175,7 @@ int ntfs_get_size_for_mapping_pairs(const ntfs_volume= *vol, */ if (unlikely(last_vcn >=3D 0 && rl[1].vcn > last_vcn)) { s64 s1 =3D last_vcn + 1; + if (unlikely(rl[1].vcn > s1)) length =3D s1 - rl->vcn; the_end =3D true; @@ -1207,6 +1195,9 @@ int ntfs_get_size_for_mapping_pairs(const ntfs_volume= *vol, prev_lcn); prev_lcn =3D rl->lcn; } + + if (rls > max_mp_size) + break; } return rls; err_out: @@ -1217,7 +1208,7 @@ int ntfs_get_size_for_mapping_pairs(const ntfs_volume= *vol, return rls; } =20 -/** +/* * ntfs_write_significant_bytes - write the significant bytes of a number * @dst: destination buffer to write to * @dst_max: pointer to last byte of destination buffer for bounds checking @@ -1268,15 +1259,17 @@ static inline int ntfs_write_significant_bytes(s8 *= dst, const s8 *dst_max, return -ENOSPC; } =20 -/** +/* * ntfs_mapping_pairs_build - build the mapping pairs array from a runlist - * @vol: ntfs volume (needed for the ntfs version) - * @dst: destination buffer to which to write the mapping pairs array - * @dst_len: size of destination buffer @dst in bytes - * @rl: locked runlist for which to build the mapping pairs array - * @first_vcn: first vcn which to include in the mapping pairs array - * @last_vcn: last vcn which to include in the mapping pairs array - * @stop_vcn: first vcn outside destination buffer on success or -ENOSPC + * @vol: ntfs volume + * @dst: destination buffer to build mapping pairs array into + * @dst_len: size of @dst in bytes + * @rl: runlist to build the mapping pairs array from + * @first_vcn: first vcn which to include in the mapping pairs array + * @last_vcn: last vcn which to include in the mapping pairs array + * @stop_vcn: on return, set to the first vcn outside the destination buff= er + * @stop_rl: on return, set to the runlist element where encoding stopped + * @de_cluster_count: on return, set to the number of clusters encoded * * Create the mapping pairs array from the locked runlist @rl, starting at= vcn * @first_vcn and finishing with vcn @last_vcn and save the array in @dst. @@ -1306,23 +1299,25 @@ static inline int ntfs_write_significant_bytes(s8 *= dst, const s8 *dst_max, * Locking: @rl must be locked on entry (either for reading or writing), it * remains locked throughout, and is left locked upon return. 
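The per-field byte count behind these size calculations works like this: a value needs another byte as long as it cannot be recovered by sign-extending what has been stored so far. A minimal sketch (hypothetical helper name):

#include <stdint.h>

static int toy_significant_bytes(int64_t n)
{
        int i = 1;

        while (n < -0x80 || n > 0x7f) {
                n >>= 8;                     /* arithmetic shift keeps the sign */
                i++;
        }
        return i;
}

For example 0x7f takes one byte, but 0x80 takes two (80 00), since a lone 0x80 would sign-extend back to -128.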
*/ -int ntfs_mapping_pairs_build(const ntfs_volume *vol, s8 *dst, - const int dst_len, const runlist_element *rl, - const VCN first_vcn, const VCN last_vcn, VCN *const stop_vcn) +int ntfs_mapping_pairs_build(const struct ntfs_volume *vol, s8 *dst, + const int dst_len, const struct runlist_element *rl, + const s64 first_vcn, const s64 last_vcn, s64 *const stop_vcn, + struct runlist_element **stop_rl, unsigned int *de_cluster_count) { - LCN prev_lcn; + s64 prev_lcn; s8 *dst_max, *dst_next; int err =3D -ENOSPC; bool the_end =3D false; s8 len_len, lcn_len; + unsigned int de_cnt =3D 0; + + if (first_vcn < 0 || last_vcn < -1 || dst_len < 1) + return -EINVAL; + if (last_vcn >=3D 0 && first_vcn > last_vcn) + return -EINVAL; =20 - BUG_ON(first_vcn < 0); - BUG_ON(last_vcn < -1); - BUG_ON(last_vcn >=3D 0 && first_vcn > last_vcn); - BUG_ON(dst_len < 1); if (!rl) { - BUG_ON(first_vcn); - BUG_ON(last_vcn > 0); + WARN_ON(first_vcn || last_vcn > 0); if (stop_vcn) *stop_vcn =3D 0; /* Terminator byte. */ @@ -1354,6 +1349,7 @@ int ntfs_mapping_pairs_build(const ntfs_volume *vol, = s8 *dst, */ if (unlikely(last_vcn >=3D 0 && rl[1].vcn > last_vcn)) { s64 s1 =3D last_vcn + 1; + if (unlikely(rl[1].vcn > s1)) length =3D s1 - rl->vcn; the_end =3D true; @@ -1368,10 +1364,7 @@ int ntfs_mapping_pairs_build(const ntfs_volume *vol,= s8 *dst, * If the logical cluster number (lcn) denotes a hole and we * are on NTFS 3.0+, we don't store it at all, i.e. we need * zero space. On earlier NTFS versions we just write the lcn - * change. FIXME: Do we need to write the lcn change or just - * the lcn in that case? Not sure as I have never seen this - * case on NT4. - We assume that we just need to write the lcn - * change until someone tells us otherwise... (AIA) + * change. */ if (likely(rl->lcn >=3D 0 || vol->major_ver < 3)) { prev_lcn =3D rl->lcn; @@ -1406,6 +1399,7 @@ int ntfs_mapping_pairs_build(const ntfs_volume *vol, = s8 *dst, */ if (unlikely(last_vcn >=3D 0 && rl[1].vcn > last_vcn)) { s64 s1 =3D last_vcn + 1; + if (unlikely(rl[1].vcn > s1)) length =3D s1 - rl->vcn; the_end =3D true; @@ -1419,10 +1413,7 @@ int ntfs_mapping_pairs_build(const ntfs_volume *vol,= s8 *dst, * If the logical cluster number (lcn) denotes a hole and we * are on NTFS 3.0+, we don't store it at all, i.e. we need * zero space. On earlier NTFS versions we just write the lcn - * change. FIXME: Do we need to write the lcn change or just - * the lcn in that case? Not sure as I have never seen this - * case on NT4. - We assume that we just need to write the lcn - * change until someone tells us otherwise... (AIA) + * change. */ if (likely(rl->lcn >=3D 0 || vol->major_ver < 3)) { /* Write change in lcn. */ @@ -1431,8 +1422,11 @@ int ntfs_mapping_pairs_build(const ntfs_volume *vol,= s8 *dst, if (unlikely(lcn_len < 0)) goto size_err; prev_lcn =3D rl->lcn; - } else + } else { + if (rl->lcn =3D=3D LCN_DELALLOC) + de_cnt +=3D rl->length; lcn_len =3D 0; + } dst_next =3D dst + len_len + lcn_len + 1; if (unlikely(dst_next > dst_max)) goto size_err; @@ -1442,11 +1436,15 @@ int ntfs_mapping_pairs_build(const ntfs_volume *vol= , s8 *dst, dst =3D dst_next; } /* Success. */ + if (de_cluster_count) + *de_cluster_count =3D de_cnt; err =3D 0; size_err: /* Set stop vcn. */ if (stop_vcn) *stop_vcn =3D rl->vcn; + if (stop_rl) + *stop_rl =3D (struct runlist_element *)rl; /* Add terminator byte. 
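And the matching encode side, as a sketch: the header nibbles first, then the length field, then the optional lcn delta, all little-endian. Here store_lcn == 0 models the NTFS 3.0+ hole case discussed above; the helper reuses toy_significant_bytes() from the previous sketch, and buffer bounds checks are omitted:

#include <stddef.h>
#include <stdint.h>

static size_t toy_encode_pair(uint8_t *dst, int64_t len, int64_t delta,
                              int store_lcn)
{
        int len_sz = toy_significant_bytes(len);
        int lcn_sz = store_lcn ? toy_significant_bytes(delta) : 0;
        int i;

        dst[0] = (uint8_t)((lcn_sz << 4) | len_sz);
        for (i = 1; i <= len_sz; i++, len >>= 8)
                dst[i] = (uint8_t)len;       /* little-endian length */
        for (i = 1; i <= lcn_sz; i++, delta >>= 8)
                dst[len_sz + i] = (uint8_t)delta;
        return 1 + (size_t)len_sz + (size_t)lcn_sz;
}

Round-tripping through toy_decode_pair() above recovers the original values, including negative deltas such as -129 (encoded as 7F FF).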
*/ *dst =3D 0; return err; @@ -1458,7 +1456,7 @@ int ntfs_mapping_pairs_build(const ntfs_volume *vol, = s8 *dst, return err; } =20 -/** +/* * ntfs_rl_truncate_nolock - truncate a runlist starting at a specified vcn * @vol: ntfs volume (needed for error output) * @runlist: runlist to truncate @@ -1482,42 +1480,21 @@ int ntfs_mapping_pairs_build(const ntfs_volume *vol= , s8 *dst, * * Locking: The caller must hold @runlist->lock for writing. */ -int ntfs_rl_truncate_nolock(const ntfs_volume *vol, runlist *const runlist, +int ntfs_rl_truncate_nolock(const struct ntfs_volume *vol, struct runlist = *const runlist, const s64 new_length) { - runlist_element *rl; + struct runlist_element *rl; int old_size; =20 ntfs_debug("Entering for new_length 0x%llx.", (long long)new_length); - BUG_ON(!runlist); - BUG_ON(new_length < 0); + + if (!runlist || new_length < 0) + return -EINVAL; + rl =3D runlist->rl; - if (!new_length) { - ntfs_debug("Freeing runlist."); - runlist->rl =3D NULL; - if (rl) - ntfs_free(rl); - return 0; - } - if (unlikely(!rl)) { - /* - * Create a runlist consisting of a sparse runlist element of - * length @new_length followed by a terminator runlist element. - */ - rl =3D ntfs_malloc_nofs(PAGE_SIZE); - if (unlikely(!rl)) { - ntfs_error(vol->sb, "Not enough memory to allocate " - "runlist element buffer."); - return -ENOMEM; - } - runlist->rl =3D rl; - rl[1].length =3D rl->vcn =3D 0; - rl->lcn =3D LCN_HOLE; - rl[1].vcn =3D rl->length =3D new_length; - rl[1].lcn =3D LCN_ENOENT; - return 0; - } - BUG_ON(new_length < rl->vcn); + if (new_length < rl->vcn) + return -EINVAL; + /* Find @new_length in the runlist. */ while (likely(rl->length && new_length >=3D rl[1].vcn)) rl++; @@ -1526,7 +1503,7 @@ int ntfs_rl_truncate_nolock(const ntfs_volume *vol, r= unlist *const runlist, * If at the end of the runlist we need to expand it. */ if (rl->length) { - runlist_element *trl; + struct runlist_element *trl; bool is_end; =20 ntfs_debug("Shrinking runlist."); @@ -1550,16 +1527,15 @@ int ntfs_rl_truncate_nolock(const ntfs_volume *vol,= runlist *const runlist, rl->length =3D 0; } rl->lcn =3D LCN_ENOENT; + runlist->count =3D rl - runlist->rl + 1; /* Reallocate memory if necessary. */ if (!is_end) { int new_size =3D rl - runlist->rl + 1; + rl =3D ntfs_rl_realloc(runlist->rl, old_size, new_size); if (IS_ERR(rl)) - ntfs_warning(vol->sb, "Failed to shrink " - "runlist buffer. This just " - "wastes a bit of memory " - "temporarily so we ignore it " - "and return success."); + ntfs_warning(vol->sb, + "Failed to shrink runlist buffer. This just wastes a bit of memory t= emporarily so we ignore it and return success."); else runlist->rl =3D rl; } @@ -1579,8 +1555,7 @@ int ntfs_rl_truncate_nolock(const ntfs_volume *vol, r= unlist *const runlist, rl =3D ntfs_rl_realloc(runlist->rl, old_size, old_size + 1); if (IS_ERR(rl)) { - ntfs_error(vol->sb, "Failed to expand runlist " - "buffer, aborting."); + ntfs_error(vol->sb, "Failed to expand runlist buffer, aborting."); return PTR_ERR(rl); } runlist->rl =3D rl; @@ -1595,6 +1570,7 @@ int ntfs_rl_truncate_nolock(const ntfs_volume *vol, r= unlist *const runlist, /* Add a new terminator runlist element. 
*/ rl++; rl->length =3D 0; + runlist->count =3D old_size + 1; } rl->vcn =3D new_length; rl->lcn =3D LCN_ENOENT; @@ -1606,288 +1582,485 @@ int ntfs_rl_truncate_nolock(const ntfs_volume *vo= l, runlist *const runlist, return 0; } =20 -/** - * ntfs_rl_punch_nolock - punch a hole into a runlist - * @vol: ntfs volume (needed for error output) - * @runlist: runlist to punch a hole into - * @start: starting VCN of the hole to be created - * @length: size of the hole to be created in units of clusters - * - * Punch a hole into the runlist @runlist starting at VCN @start and of si= ze - * @length clusters. - * - * Return 0 on success and -errno on error, in which case @runlist has not= been - * modified. - * - * If @start and/or @start + @length are outside the runlist return error = code - * -ENOENT. +/* + * ntfs_rl_sparse - check whether a runlist has sparse regions or not. + * @rl: runlist to check * - * If the runlist contains unmapped or error elements between @start and @= start - * + @length return error code -EINVAL. + * Return 1 if it has, 0 if not, and -errno on error. + */ +int ntfs_rl_sparse(struct runlist_element *rl) +{ + struct runlist_element *rlc; + + if (!rl) + return -EINVAL; + + for (rlc =3D rl; rlc->length; rlc++) + if (rlc->lcn < 0) { + if (rlc->lcn !=3D LCN_HOLE && rlc->lcn !=3D LCN_DELALLOC) { + pr_err("%s: bad runlist", __func__); + return -EINVAL; + } + return 1; + } + return 0; +} + +/* + * ntfs_rl_get_compressed_size - calculate the length of the non-sparse regions + * @vol: ntfs volume (needed for the cluster size) + * @rl: runlist to calculate for * - * Locking: The caller must hold @runlist->lock for writing. + * Return compressed size or -errno on error. */ -int ntfs_rl_punch_nolock(const ntfs_volume *vol, runlist *const runlist, - const VCN start, const s64 length) +s64 ntfs_rl_get_compressed_size(struct ntfs_volume *vol, struct runlist_el= ement *rl) { - const VCN end =3D start + length; - s64 delta; - runlist_element *rl, *rl_end, *rl_real_end, *trl; - int old_size; - bool lcn_fixup =3D false; - - ntfs_debug("Entering for start 0x%llx, length 0x%llx.", - (long long)start, (long long)length); - BUG_ON(!runlist); - BUG_ON(start < 0); - BUG_ON(length < 0); - BUG_ON(end < 0); - rl =3D runlist->rl; - if (unlikely(!rl)) { - if (likely(!start && !length)) - return 0; - return -ENOENT; + struct runlist_element *rlc; + s64 ret =3D 0; + + if (!rl) + return -EINVAL; + + for (rlc =3D rl; rlc->length; rlc++) { + if (rlc->lcn < 0) { + if (rlc->lcn !=3D LCN_HOLE && rlc->lcn !=3D LCN_DELALLOC) { + ntfs_error(vol->sb, "%s: bad runlist, rlc->lcn : %lld", + __func__, rlc->lcn); + return -EINVAL; + } + } else + ret +=3D rlc->length; } - /* Find @start in the runlist. */ - while (likely(rl->length && start >=3D rl[1].vcn)) - rl++; - rl_end =3D rl; - /* Find @end in the runlist. */ - while (likely(rl_end->length && end >=3D rl_end[1].vcn)) { - /* Verify there are no unmapped or error elements.
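A userspace sketch of the same accounting (toy types; the sentinel validation and error reporting are dropped): only runs backed by a real lcn contribute to the compressed size, while holes and delalloc runs contribute nothing.

#include <stdint.h>

struct toy_run { int64_t vcn, lcn, length; };

static int64_t toy_compressed_bytes(const struct toy_run *rl,
                                    int64_t cluster_size)
{
        int64_t clusters = 0;
        size_t i;

        for (i = 0; rl[i].length; i++)
                if (rl[i].lcn >= 0)
                        clusters += rl[i].length;
        return clusters * cluster_size;
}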
*/ - if (unlikely(rl_end->lcn < LCN_HOLE)) - return -EINVAL; - rl_end++; + return NTFS_CLU_TO_B(vol, ret); +} + +static inline bool ntfs_rle_lcn_contiguous(struct runlist_element *left_rl= e, + struct runlist_element *right_rle) +{ + if (left_rle->lcn > LCN_HOLE && + left_rle->lcn + left_rle->length =3D=3D right_rle->lcn) + return true; + else if (left_rle->lcn =3D=3D LCN_HOLE && right_rle->lcn =3D=3D LCN_HOLE) + return true; + else + return false; +} + +static inline bool ntfs_rle_contain(struct runlist_element *rle, s64 vcn) +{ + if (rle->length > 0 && + vcn >=3D rle->vcn && vcn < rle->vcn + rle->length) + return true; + else + return false; +} + +struct runlist_element *ntfs_rl_insert_range(struct runlist_element *dst_r= l, int dst_cnt, + struct runlist_element *src_rl, int src_cnt, + size_t *new_rl_cnt) +{ + struct runlist_element *i_rl, *new_rl, *src_rl_origin =3D src_rl; + struct runlist_element dst_rl_split; + s64 start_vcn =3D src_rl[0].vcn; + int new_1st_cnt, new_2nd_cnt, new_3rd_cnt, new_cnt; + + if (!dst_rl || !src_rl || !new_rl_cnt) + return ERR_PTR(-EINVAL); + if (dst_cnt <=3D 0 || src_cnt <=3D 0) + return ERR_PTR(-EINVAL); + if (!(dst_rl[dst_cnt - 1].lcn =3D=3D LCN_ENOENT && + dst_rl[dst_cnt - 1].length =3D=3D 0) || + src_rl[src_cnt - 1].lcn < LCN_HOLE) + return ERR_PTR(-EINVAL); + + start_vcn =3D src_rl[0].vcn; + + i_rl =3D ntfs_rl_find_vcn_nolock(dst_rl, start_vcn); + if (!i_rl || + (i_rl->lcn =3D=3D LCN_ENOENT && i_rl->vcn !=3D start_vcn) || + (i_rl->lcn !=3D LCN_ENOENT && !ntfs_rle_contain(i_rl, start_vcn))) + return ERR_PTR(-EINVAL); + + new_1st_cnt =3D (int)(i_rl - dst_rl); + if (new_1st_cnt > dst_cnt) + return ERR_PTR(-EINVAL); + new_3rd_cnt =3D dst_cnt - new_1st_cnt; + if (new_3rd_cnt < 1) + return ERR_PTR(-EINVAL); + + if (i_rl[0].vcn !=3D start_vcn) { + if (i_rl[0].lcn =3D=3D LCN_HOLE && src_rl[0].lcn =3D=3D LCN_HOLE) + goto merge_src_rle; + + /* split @i_rl[0] and create @dst_rl_split */ + dst_rl_split.vcn =3D i_rl[0].vcn; + dst_rl_split.length =3D start_vcn - i_rl[0].vcn; + dst_rl_split.lcn =3D i_rl[0].lcn; + + i_rl[0].vcn =3D start_vcn; + i_rl[0].length -=3D dst_rl_split.length; + i_rl[0].lcn +=3D dst_rl_split.length; + } else { + struct runlist_element *dst_rle, *src_rle; +merge_src_rle: + + /* not split @i_rl[0] */ + dst_rl_split.lcn =3D LCN_ENOENT; + + /* merge @src_rl's first run and @i_rl[0]'s left run if possible */ + dst_rle =3D &dst_rl[new_1st_cnt - 1]; + src_rle =3D &src_rl[0]; + if (new_1st_cnt > 0 && ntfs_rle_lcn_contiguous(dst_rle, src_rle)) { + WARN_ON(dst_rle->vcn + dst_rle->length !=3D src_rle->vcn); + dst_rle->length +=3D src_rle->length; + src_rl++; + src_cnt--; + } else { + /* merge @src_rl's last run and @i_rl[0]'s right if possible */ + dst_rle =3D &dst_rl[new_1st_cnt]; + src_rle =3D &src_rl[src_cnt - 1]; + + if (ntfs_rle_lcn_contiguous(dst_rle, src_rle)) { + dst_rle->length +=3D src_rle->length; + src_cnt--; + } + } } - /* Check the last element. */ - if (unlikely(rl_end->length && rl_end->lcn < LCN_HOLE)) - return -EINVAL; - /* This covers @start being out of bounds, too. */ - if (!rl_end->length && end > rl_end->vcn) - return -ENOENT; - if (!length) - return 0; - if (!rl->length) - return -ENOENT; - rl_real_end =3D rl_end; - /* Determine the runlist size. */ - while (likely(rl_real_end->length)) - rl_real_end++; - old_size =3D rl_real_end - runlist->rl + 1; - /* If @start is in a hole simply extend the hole. */ - if (rl->lcn =3D=3D LCN_HOLE) { - /* - * If both @start and @end are in the same sparse run, we are - * done. 
+ + new_2nd_cnt =3D src_cnt; + new_cnt =3D new_1st_cnt + new_2nd_cnt + new_3rd_cnt; + new_cnt +=3D dst_rl_split.lcn >=3D LCN_HOLE ? 1 : 0; + new_rl =3D kvcalloc(new_cnt, sizeof(*new_rl), GFP_NOFS); + if (!new_rl) + return ERR_PTR(-ENOMEM); + + /* Copy the @dst_rl's first half to @new_rl */ + ntfs_rl_mc(new_rl, 0, dst_rl, 0, new_1st_cnt); + if (dst_rl_split.lcn >=3D LCN_HOLE) { + ntfs_rl_mc(new_rl, new_1st_cnt, &dst_rl_split, 0, 1); + new_1st_cnt++; + } + /* Copy the @src_rl to @new_rl */ + ntfs_rl_mc(new_rl, new_1st_cnt, src_rl, 0, new_2nd_cnt); + /* Copy the @dst_rl's second half to @new_rl */ + if (new_3rd_cnt >=3D 1) { + struct runlist_element *rl, *rl_3rd; + int dst_1st_cnt =3D dst_rl_split.lcn >=3D LCN_HOLE ? + new_1st_cnt - 1 : new_1st_cnt; + + ntfs_rl_mc(new_rl, new_1st_cnt + new_2nd_cnt, + dst_rl, dst_1st_cnt, new_3rd_cnt); + /* Update vcn of the @dst_rl's second half runs to reflect + * appended @src_rl. */ - if (end <=3D rl[1].vcn) { - ntfs_debug("Done (requested hole is already sparse)."); - return 0; - } -extend_hole: - /* Extend the hole. */ - rl->length =3D end - rl->vcn; - /* If @end is in a hole, merge it with the current one. */ - if (rl_end->lcn =3D=3D LCN_HOLE) { - rl_end++; - rl->length =3D rl_end->vcn - rl->vcn; - } - /* We have done the hole. Now deal with the remaining tail. */ - rl++; - /* Cut out all runlist elements up to @end. */ - if (rl < rl_end) - memmove(rl, rl_end, (rl_real_end - rl_end + 1) * - sizeof(*rl)); - /* Adjust the beginning of the tail if necessary. */ - if (end > rl->vcn) { - delta =3D end - rl->vcn; - rl->vcn =3D end; - rl->length -=3D delta; - /* Only adjust the lcn if it is real. */ - if (rl->lcn >=3D 0) - rl->lcn +=3D delta; - } -shrink_allocation: - /* Reallocate memory if the allocation changed. */ - if (rl < rl_end) { - rl =3D ntfs_rl_realloc(runlist->rl, old_size, - old_size - (rl_end - rl)); - if (IS_ERR(rl)) - ntfs_warning(vol->sb, "Failed to shrink " - "runlist buffer. This just " - "wastes a bit of memory " - "temporarily so we ignore it " - "and return success."); - else - runlist->rl =3D rl; + if (new_1st_cnt + new_2nd_cnt =3D=3D 0) { + rl_3rd =3D &new_rl[new_1st_cnt + new_2nd_cnt + 1]; + rl =3D &new_rl[new_1st_cnt + new_2nd_cnt]; + } else { + rl_3rd =3D &new_rl[new_1st_cnt + new_2nd_cnt]; + rl =3D &new_rl[new_1st_cnt + new_2nd_cnt - 1]; } - ntfs_debug("Done (extend hole)."); - return 0; + do { + rl_3rd->vcn =3D rl->vcn + rl->length; + if (rl_3rd->length <=3D 0) + break; + rl =3D rl_3rd; + rl_3rd++; + } while (1); } - /* - * If @start is at the beginning of a run things are easier as there is - * no need to split the first run. - */ - if (start =3D=3D rl->vcn) { - /* - * @start is at the beginning of a run. - * - * If the previous run is sparse, extend its hole. - * - * If @end is not in the same run, switch the run to be sparse - * and extend the newly created hole. - * - * Thus both of these cases reduce the problem to the above - * case of "@start is in a hole". 
+ *new_rl_cnt =3D new_1st_cnt + new_2nd_cnt + new_3rd_cnt; + + kvfree(dst_rl); + kvfree(src_rl_origin); + return new_rl; +} + +struct runlist_element *ntfs_rl_punch_hole(struct runlist_element *dst_rl,= int dst_cnt, + s64 start_vcn, s64 len, + struct runlist_element **punch_rl, + size_t *new_rl_cnt) +{ + struct runlist_element *s_rl, *e_rl, *new_rl, *dst_3rd_rl, hole_rl[1]; + s64 end_vcn; + int new_1st_cnt, dst_3rd_cnt, new_cnt, punch_cnt, merge_cnt; + bool begin_split, end_split, one_split_3; + + if (dst_cnt < 2 || + !(dst_rl[dst_cnt - 1].lcn =3D=3D LCN_ENOENT && + dst_rl[dst_cnt - 1].length =3D=3D 0)) + return ERR_PTR(-EINVAL); + + end_vcn =3D min(start_vcn + len - 1, + dst_rl[dst_cnt - 2].vcn + dst_rl[dst_cnt - 2].length - 1); + + s_rl =3D ntfs_rl_find_vcn_nolock(dst_rl, start_vcn); + if (!s_rl || + s_rl->lcn <=3D LCN_ENOENT || + !ntfs_rle_contain(s_rl, start_vcn)) + return ERR_PTR(-EINVAL); + + begin_split =3D s_rl->vcn !=3D start_vcn ? true : false; + + e_rl =3D ntfs_rl_find_vcn_nolock(dst_rl, end_vcn); + if (!e_rl || + e_rl->lcn <=3D LCN_ENOENT || + !ntfs_rle_contain(e_rl, end_vcn)) + return ERR_PTR(-EINVAL); + + end_split =3D e_rl->vcn + e_rl->length - 1 !=3D end_vcn ? true : false; + + /* @s_rl has to be split into left, punched hole, and right */ + one_split_3 =3D e_rl =3D=3D s_rl && begin_split && end_split ? true : fal= se; + + punch_cnt =3D (int)(e_rl - s_rl) + 1; + + *punch_rl =3D kvcalloc(punch_cnt + 1, sizeof(struct runlist_element), + GFP_NOFS); + if (!*punch_rl) + return ERR_PTR(-ENOMEM); + + new_cnt =3D dst_cnt - (int)(e_rl - s_rl + 1) + 3; + new_rl =3D kvcalloc(new_cnt, sizeof(struct runlist_element), GFP_NOFS); + if (!new_rl) { + kvfree(*punch_rl); + *punch_rl =3D NULL; + return ERR_PTR(-ENOMEM); + } + + new_1st_cnt =3D (int)(s_rl - dst_rl) + 1; + ntfs_rl_mc(*punch_rl, 0, dst_rl, new_1st_cnt - 1, punch_cnt); + + (*punch_rl)[punch_cnt].lcn =3D LCN_ENOENT; + (*punch_rl)[punch_cnt].length =3D 0; + + if (!begin_split) + new_1st_cnt--; + dst_3rd_rl =3D e_rl; + dst_3rd_cnt =3D (int)(&dst_rl[dst_cnt - 1] - e_rl) + 1; + if (!end_split) { + dst_3rd_rl++; + dst_3rd_cnt--; + } + + /* Copy the 1st part of @dst_rl into @new_rl */ + ntfs_rl_mc(new_rl, 0, dst_rl, 0, new_1st_cnt); + if (begin_split) { + /* the @e_rl has to be splited and copied into the last of @new_rl + * and the first of @punch_rl */ - if (rl > runlist->rl && (rl - 1)->lcn =3D=3D LCN_HOLE) { - rl--; - goto extend_hole; - } - if (end >=3D rl[1].vcn) { - rl->lcn =3D LCN_HOLE; - goto extend_hole; - } - /* - * The final case is when @end is in the same run as @start. - * For this need to split the run into two. One run for the - * sparse region between the beginning of the old run, i.e. - * @start, and @end and one for the remaining non-sparse - * region, i.e. between @end and the end of the old run. 
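The vcn fix-up that the insert-range path performs on the shifted tail generalizes to a small helper; a toy version of the same loop (assumes first_tail >= 1 and a length-0 terminator):

#include <stddef.h>
#include <stdint.h>

struct toy_run { int64_t vcn, lcn, length; };

/* After the splice, each run in the shifted tail must start where its
 * predecessor ends; the terminator gets a vcn too. */
static void toy_fixup_tail_vcns(struct toy_run *rl, size_t first_tail)
{
        size_t i = first_tail;

        for (;;) {
                rl[i].vcn = rl[i - 1].vcn + rl[i - 1].length;
                if (rl[i].length <= 0)
                        break;               /* terminator reached */
                i++;
        }
}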
+ s64 first_cnt =3D start_vcn - dst_rl[new_1st_cnt - 1].vcn; + + if (new_1st_cnt) + new_rl[new_1st_cnt - 1].length =3D first_cnt; + + (*punch_rl)[0].vcn =3D start_vcn; + (*punch_rl)[0].length -=3D first_cnt; + if ((*punch_rl)[0].lcn > LCN_HOLE) + (*punch_rl)[0].lcn +=3D first_cnt; + } + + /* Copy a hole into @new_rl */ + hole_rl[0].vcn =3D start_vcn; + hole_rl[0].length =3D (s64)len; + hole_rl[0].lcn =3D LCN_HOLE; + ntfs_rl_mc(new_rl, new_1st_cnt, hole_rl, 0, 1); + + /* Copy the 3rd part of @dst_rl into @new_rl */ + ntfs_rl_mc(new_rl, new_1st_cnt + 1, dst_3rd_rl, 0, dst_3rd_cnt); + if (end_split) { + /* the @e_rl has to be splited and copied into the first of + * @new_rl and the last of @punch_rl */ - trl =3D ntfs_rl_realloc(runlist->rl, old_size, old_size + 1); - if (IS_ERR(trl)) - goto enomem_out; - old_size++; - if (runlist->rl !=3D trl) { - rl =3D trl + (rl - runlist->rl); - rl_end =3D trl + (rl_end - runlist->rl); - rl_real_end =3D trl + (rl_real_end - runlist->rl); - runlist->rl =3D trl; - } -split_end: - /* Shift all the runs up by one. */ - memmove(rl + 1, rl, (rl_real_end - rl + 1) * sizeof(*rl)); - /* Finally, setup the two split runs. */ - rl->lcn =3D LCN_HOLE; - rl->length =3D length; - rl++; - rl->vcn +=3D length; - /* Only adjust the lcn if it is real. */ - if (rl->lcn >=3D 0 || lcn_fixup) - rl->lcn +=3D length; - rl->length -=3D length; - ntfs_debug("Done (split one)."); - return 0; + s64 first_cnt =3D end_vcn - dst_3rd_rl[0].vcn + 1; + + new_rl[new_1st_cnt + 1].vcn =3D end_vcn + 1; + new_rl[new_1st_cnt + 1].length -=3D first_cnt; + if (new_rl[new_1st_cnt + 1].lcn > LCN_HOLE) + new_rl[new_1st_cnt + 1].lcn +=3D first_cnt; + + if (one_split_3) + (*punch_rl)[punch_cnt - 1].length -=3D + new_rl[new_1st_cnt + 1].length; + else + (*punch_rl)[punch_cnt - 1].length =3D first_cnt; } - /* - * @start is neither in a hole nor at the beginning of a run. - * - * If @end is in a hole, things are easier as simply truncating the run - * @start is in to end at @start - 1, deleting all runs after that up - * to @end, and finally extending the beginning of the run @end is in - * to be @start is all that is needed. + + /* Merge left and hole, or hole and right in @new_rl, if left or right + * consists of holes. */ - if (rl_end->lcn =3D=3D LCN_HOLE) { - /* Truncate the run containing @start. */ - rl->length =3D start - rl->vcn; - rl++; - /* Cut out all runlist elements up to @end. */ - if (rl < rl_end) - memmove(rl, rl_end, (rl_real_end - rl_end + 1) * - sizeof(*rl)); - /* Extend the beginning of the run @end is in to be @start. */ - rl->vcn =3D start; - rl->length =3D rl[1].vcn - start; - goto shrink_allocation; + merge_cnt =3D 0; + if (new_1st_cnt > 0 && new_rl[new_1st_cnt - 1].lcn =3D=3D LCN_HOLE) { + /* Merge right and hole */ + s_rl =3D &new_rl[new_1st_cnt - 1]; + s_rl->length +=3D s_rl[1].length; + merge_cnt =3D 1; + /* Merge left and right */ + if (new_1st_cnt + 1 < new_cnt && + new_rl[new_1st_cnt + 1].lcn =3D=3D LCN_HOLE) { + s_rl->length +=3D s_rl[2].length; + merge_cnt++; + } + } else if (new_1st_cnt + 1 < new_cnt && + new_rl[new_1st_cnt + 1].lcn =3D=3D LCN_HOLE) { + /* Merge left and hole */ + s_rl =3D &new_rl[new_1st_cnt]; + s_rl->length +=3D s_rl[1].length; + merge_cnt =3D 1; } - /*=20 - * If @end is not in a hole there are still two cases to distinguish. - * Either @end is or is not in the same run as @start. - * - * The second case is easier as it can be reduced to an already solved - * problem by truncating the run @start is in to end at @start - 1. 
- * Then, if @end is in the next run need to split the run into a sparse - * run followed by a non-sparse run (already covered above) and if @end - * is not in the next run switching it to be sparse, again reduces the - * problem to the already covered case of "@start is in a hole". - */ - if (end >=3D rl[1].vcn) { - /* - * If @end is not in the next run, reduce the problem to the - * case of "@start is in a hole". + if (merge_cnt) { + struct runlist_element *d_rl, *src_rl; + + d_rl =3D s_rl + 1; + src_rl =3D s_rl + 1 + merge_cnt; + ntfs_rl_mm(new_rl, (int)(d_rl - new_rl), (int)(src_rl - new_rl), + (int)(&new_rl[new_cnt - 1] - src_rl) + 1); + } + + (*punch_rl)[punch_cnt].vcn =3D (*punch_rl)[punch_cnt - 1].vcn + + (*punch_rl)[punch_cnt - 1].length; + + /* punch_cnt elements of dst are replaced with one hole */ + *new_rl_cnt =3D dst_cnt - (punch_cnt - (int)begin_split - (int)end_split)= + + 1 - merge_cnt; + kvfree(dst_rl); + return new_rl; +} + +struct runlist_element *ntfs_rl_collapse_range(struct runlist_element *dst= _rl, int dst_cnt, + s64 start_vcn, s64 len, + struct runlist_element **punch_rl, + size_t *new_rl_cnt) +{ + struct runlist_element *s_rl, *e_rl, *new_rl, *dst_3rd_rl; + s64 end_vcn; + int new_1st_cnt, dst_3rd_cnt, new_cnt, punch_cnt, merge_cnt, i; + bool begin_split, end_split, one_split_3; + + if (dst_cnt < 2 || + !(dst_rl[dst_cnt - 1].lcn =3D=3D LCN_ENOENT && + dst_rl[dst_cnt - 1].length =3D=3D 0)) + return ERR_PTR(-EINVAL); + + end_vcn =3D min(start_vcn + len - 1, + dst_rl[dst_cnt - 1].vcn - 1); + + s_rl =3D ntfs_rl_find_vcn_nolock(dst_rl, start_vcn); + if (!s_rl || + s_rl->lcn <=3D LCN_ENOENT || + !ntfs_rle_contain(s_rl, start_vcn)) + return ERR_PTR(-EINVAL); + + begin_split =3D s_rl->vcn !=3D start_vcn ? true : false; + + e_rl =3D ntfs_rl_find_vcn_nolock(dst_rl, end_vcn); + if (!e_rl || + e_rl->lcn <=3D LCN_ENOENT || + !ntfs_rle_contain(e_rl, end_vcn)) + return ERR_PTR(-EINVAL); + + end_split =3D e_rl->vcn + e_rl->length - 1 !=3D end_vcn ? true : false; + + /* @s_rl has to be split into left, collapsed, and right */ + one_split_3 =3D e_rl =3D=3D s_rl && begin_split && end_split ? true : fal= se; + + punch_cnt =3D (int)(e_rl - s_rl) + 1; + *punch_rl =3D kvcalloc(punch_cnt + 1, sizeof(struct runlist_element), + GFP_NOFS); + if (!*punch_rl) + return ERR_PTR(-ENOMEM); + + new_cnt =3D dst_cnt - (int)(e_rl - s_rl + 1) + 3; + new_rl =3D kvcalloc(new_cnt, sizeof(struct runlist_element), GFP_NOFS); + if (!new_rl) { + kvfree(*punch_rl); + *punch_rl =3D NULL; + return ERR_PTR(-ENOMEM); + } + + new_1st_cnt =3D (int)(s_rl - dst_rl) + 1; + ntfs_rl_mc(*punch_rl, 0, dst_rl, new_1st_cnt - 1, punch_cnt); + (*punch_rl)[punch_cnt].lcn =3D LCN_ENOENT; + (*punch_rl)[punch_cnt].length =3D 0; + + if (!begin_split) + new_1st_cnt--; + dst_3rd_rl =3D e_rl; + dst_3rd_cnt =3D (int)(&dst_rl[dst_cnt - 1] - e_rl) + 1; + if (!end_split) { + dst_3rd_rl++; + dst_3rd_cnt--; + } + + /* Copy the 1st part of @dst_rl into @new_rl */ + ntfs_rl_mc(new_rl, 0, dst_rl, 0, new_1st_cnt); + if (begin_split) { + /* the @e_rl has to be splited and copied into the last of @new_rl + * and the first of @punch_rl */ - if (rl[1].length && end >=3D rl[2].vcn) { - /* Truncate the run containing @start. 
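The element-count bookkeeping for punching can be read off directly from the update above; a hypothetical helper plus a worked case: with dst_cnt = 6 (five runs plus terminator) and a hole exactly covering three whole runs (punch_cnt = 3, no splits, no merges), the result is 6 - 3 + 1 = 4 elements.

#include <stdbool.h>
#include <stddef.h>

/* The covered runs go away, one hole comes back, split fragments stay,
 * and each hole-merge shaves one element. */
static size_t toy_punched_count(size_t dst_cnt, size_t punch_cnt,
                                bool begin_split, bool end_split,
                                size_t merge_cnt)
{
        return dst_cnt - (punch_cnt - begin_split - end_split)
                       + 1 - merge_cnt;
}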
*/ - rl->length =3D start - rl->vcn; - rl++; - rl->vcn =3D start; - rl->lcn =3D LCN_HOLE; - goto extend_hole; - } - trl =3D ntfs_rl_realloc(runlist->rl, old_size, old_size + 1); - if (IS_ERR(trl)) - goto enomem_out; - old_size++; - if (runlist->rl !=3D trl) { - rl =3D trl + (rl - runlist->rl); - rl_end =3D trl + (rl_end - runlist->rl); - rl_real_end =3D trl + (rl_real_end - runlist->rl); - runlist->rl =3D trl; - } - /* Truncate the run containing @start. */ - rl->length =3D start - rl->vcn; - rl++; - /* - * @end is in the next run, reduce the problem to the case - * where "@start is at the beginning of a run and @end is in - * the same run as @start". + s64 first_cnt =3D start_vcn - dst_rl[new_1st_cnt - 1].vcn; + + new_rl[new_1st_cnt - 1].length =3D first_cnt; + + (*punch_rl)[0].vcn =3D start_vcn; + (*punch_rl)[0].length -=3D first_cnt; + if ((*punch_rl)[0].lcn > LCN_HOLE) + (*punch_rl)[0].lcn +=3D first_cnt; + } + + /* Copy the 3rd part of @dst_rl into @new_rl */ + ntfs_rl_mc(new_rl, new_1st_cnt, dst_3rd_rl, 0, dst_3rd_cnt); + if (end_split) { + /* the @e_rl has to be splited and copied into the first of + * @new_rl and the last of @punch_rl */ - delta =3D rl->vcn - start; - rl->vcn =3D start; - if (rl->lcn >=3D 0) { - rl->lcn -=3D delta; - /* Need this in case the lcn just became negative. */ - lcn_fixup =3D true; - } - rl->length +=3D delta; - goto split_end; + s64 first_cnt =3D end_vcn - dst_3rd_rl[0].vcn + 1; + + new_rl[new_1st_cnt].vcn =3D end_vcn + 1; + new_rl[new_1st_cnt].length -=3D first_cnt; + if (new_rl[new_1st_cnt].lcn > LCN_HOLE) + new_rl[new_1st_cnt].lcn +=3D first_cnt; + + if (one_split_3) + (*punch_rl)[punch_cnt - 1].length -=3D + new_rl[new_1st_cnt].length; + else + (*punch_rl)[punch_cnt - 1].length =3D first_cnt; } - /* - * The first case from above, i.e. @end is in the same run as @start. - * We need to split the run into three. One run for the non-sparse - * region between the beginning of the old run and @start, one for the - * sparse region between @start and @end, and one for the remaining - * non-sparse region, i.e. between @end and the end of the old run. + + /* Adjust vcn */ + if (new_1st_cnt =3D=3D 0) + new_rl[new_1st_cnt].vcn =3D 0; + for (i =3D new_1st_cnt =3D=3D 0 ? 1 : new_1st_cnt; new_rl[i].length; i++) + new_rl[i].vcn =3D new_rl[i - 1].vcn + new_rl[i - 1].length; + new_rl[i].vcn =3D new_rl[i - 1].vcn + new_rl[i - 1].length; + + /* Merge left and hole, or hole and right in @new_rl, if left or right + * consists of holes. */ - trl =3D ntfs_rl_realloc(runlist->rl, old_size, old_size + 2); - if (IS_ERR(trl)) - goto enomem_out; - old_size +=3D 2; - if (runlist->rl !=3D trl) { - rl =3D trl + (rl - runlist->rl); - rl_end =3D trl + (rl_end - runlist->rl); - rl_real_end =3D trl + (rl_real_end - runlist->rl); - runlist->rl =3D trl; + merge_cnt =3D 0; + i =3D new_1st_cnt =3D=3D 0 ? 1 : new_1st_cnt; + if (ntfs_rle_lcn_contiguous(&new_rl[i - 1], &new_rl[i])) { + /* Merge right and left */ + s_rl =3D &new_rl[new_1st_cnt - 1]; + s_rl->length +=3D s_rl[1].length; + merge_cnt =3D 1; + } + if (merge_cnt) { + struct runlist_element *d_rl, *src_rl; + + d_rl =3D s_rl + 1; + src_rl =3D s_rl + 1 + merge_cnt; + ntfs_rl_mm(new_rl, (int)(d_rl - new_rl), (int)(src_rl - new_rl), + (int)(&new_rl[new_cnt - 1] - src_rl) + 1); } - /* Shift all the runs up by two. */ - memmove(rl + 2, rl, (rl_real_end - rl + 1) * sizeof(*rl)); - /* Finally, setup the three split runs. 
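Collapsing uses the same bookkeeping as the punch case minus the inserted hole, as the count update further below shows; a hypothetical helper for contrast with toy_punched_count() above:

#include <stdbool.h>
#include <stddef.h>

/* The removed range simply disappears and the tail vcns close up, so
 * there is no "+ 1" for a replacement hole. */
static size_t toy_collapsed_count(size_t dst_cnt, size_t punch_cnt,
                                  bool begin_split, bool end_split,
                                  size_t merge_cnt)
{
        return dst_cnt - (punch_cnt - begin_split - end_split)
                       - merge_cnt;
}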
*/ - rl->length =3D start - rl->vcn; - rl++; - rl->vcn =3D start; - rl->lcn =3D LCN_HOLE; - rl->length =3D length; - rl++; - delta =3D end - rl->vcn; - rl->vcn =3D end; - rl->lcn +=3D delta; - rl->length -=3D delta; - ntfs_debug("Done (split both)."); - return 0; -enomem_out: - ntfs_error(vol->sb, "Not enough memory to extend runlist buffer."); - return -ENOMEM; -} =20 -#endif /* NTFS_RW */ + (*punch_rl)[punch_cnt].vcn =3D (*punch_rl)[punch_cnt - 1].vcn + + (*punch_rl)[punch_cnt - 1].length; + + /* punch_cnt elements of dst are extracted */ + *new_rl_cnt =3D dst_cnt - (punch_cnt - (int)begin_split - (int)end_split)= - + merge_cnt; + + kvfree(dst_rl); + return new_rl; +} --=20 2.25.1