fs/xfs/libxfs/xfs_bmap.c | 377 +++++++++++++++++++++++++-------------- fs/xfs/libxfs/xfs_bmap.h | 7 +- fs/xfs/xfs_reflink.c | 24 +-- 3 files changed, 252 insertions(+), 156 deletions(-)
Hi folks,
Currently, xfs_reflink_end_cow_extent() will unconditionally unmap an
extent from DATA fork and then remap an extent from COW fork. It seems
somewhat ineffective since for many cases we could update real bmbt
records directly by sightly enhancing old
xfs_bmap_add_extent_unwritten_real() implementation, thus reduce some
measurable extra metadata overhead.
It's important to us since, actually, we're planing to use a modified
alway-cow like atomic write approach internally for database
applications, therefore it'd be nice to do some optimization over
simple end COW approach. Also I think it's still generic and can
benefit other reflink use cases as well.
I did some tests with ramdisk in order to measure metadata overhead:
echo 1 > /sys/fs/xfs/debug/always_cow
mkfs.xfs -f -mreflink=1 /dev/ram0
mount /dev/ram0 testdir
fio -filename=testdir/1 -size=1G -ioengine=psync -bs=4k -rw=randwrite -overwrite=1 -direct=1 -end_fsync=1 -name=job1
Test results as below:
Vanilla:
(1) iops : min= 7986, max=16434, avg=12505.76, stdev=2400.05, samples=41
(2) iops : min= 7636, max=16376, avg=12474.19, stdev=2258.18, samples=42
(3) iops : min= 8346, max=16439, avg=12227.95, stdev=2432.12, samples=42
(4) iops : min= 8580, max=16496, avg=12779.41, stdev=2297.42, samples=41
(5) iops : min= 8286, max=16556, avg=12500.76, stdev=2123.90, samples=41
Patched:
(1) iops : min= 7086, max=17132, avg=12931.20, stdev=2729.10, samples=40
(2) iops : min= 7704, max=17508, avg=13204.62, stdev=2507.70, samples=39
(3) iops : min= 8736, max=17634, avg=13253.08, stdev=2545.18, samples=39
(4) iops : min= 7188, max=17550, avg=12928.40, stdev=2633.64, samples=40
(5) iops : min= 8268, max=17446, avg=12837.55, stdev=2717.98, samples=40
xfstests seems survived. Comments are much welcomed and
thanks for your time!
Thanks,
Gao Xiang
Changes since v1:
- fix missing tmp_logflags initialization;
- drop unnecessary realtime inode check pointed out by Darrick.
Gao Xiang (3):
xfs: get rid of LEFT, RIGHT, PREV in
xfs_bmap_add_extent_unwritten_real()
xfs: introduce xfs_bmap_update_extent_real()
xfs: introduce xfs_bremapi_from_cowfork()
fs/xfs/libxfs/xfs_bmap.c | 377 +++++++++++++++++++++++++--------------
fs/xfs/libxfs/xfs_bmap.h | 7 +-
fs/xfs/xfs_reflink.c | 24 +--
3 files changed, 252 insertions(+), 156 deletions(-)
--
2.24.4
On Wed, Feb 16, 2022 at 11:08:51AM +0800, Gao Xiang wrote: > Hi folks, > > Currently, xfs_reflink_end_cow_extent() will unconditionally unmap an > extent from DATA fork and then remap an extent from COW fork. It seems > somewhat ineffective since for many cases we could update real bmbt > records directly by sightly enhancing old > xfs_bmap_add_extent_unwritten_real() implementation, thus reduce some > measurable extra metadata overhead. Does it work with rmap enabled? Reading between the lines, I'm guessing the performance boost might come from avoiding a transaction roll and (possibly) reducing the need to log bmbt updates? Particularly in the worst case where we split the bmbt only to rejoin the blocks immediately after. Recently, Dave and Allison have been pondering making an addition to the deferred log item code so that we could ->finish_item the first defer op in the same transaction that logs the caller's deferred operations. Might that get you most of the speed advantage that you're seeking? > It's important to us since, actually, we're planing to use a modified > alway-cow like atomic write approach internally for database > applications, therefore it'd be nice to do some optimization over > simple end COW approach. Also I think it's still generic and can > benefit other reflink use cases as well. Hmm, that sounds /awfully/ similar to what sqlite does with f2fs' atomic write ioctls. Alternately, would this[1] feature that's been sitting around in djwong-dev since late 2019 help? [1] https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=atomic-file-updates --D > > I did some tests with ramdisk in order to measure metadata overhead: > > echo 1 > /sys/fs/xfs/debug/always_cow > mkfs.xfs -f -mreflink=1 /dev/ram0 > mount /dev/ram0 testdir > fio -filename=testdir/1 -size=1G -ioengine=psync -bs=4k -rw=randwrite -overwrite=1 -direct=1 -end_fsync=1 -name=job1 > > Test results as below: > Vanilla: > (1) iops : min= 7986, max=16434, avg=12505.76, stdev=2400.05, samples=41 > (2) iops : min= 7636, max=16376, avg=12474.19, stdev=2258.18, samples=42 > (3) iops : min= 8346, max=16439, avg=12227.95, stdev=2432.12, samples=42 > (4) iops : min= 8580, max=16496, avg=12779.41, stdev=2297.42, samples=41 > (5) iops : min= 8286, max=16556, avg=12500.76, stdev=2123.90, samples=41 > > Patched: > (1) iops : min= 7086, max=17132, avg=12931.20, stdev=2729.10, samples=40 > (2) iops : min= 7704, max=17508, avg=13204.62, stdev=2507.70, samples=39 > (3) iops : min= 8736, max=17634, avg=13253.08, stdev=2545.18, samples=39 > (4) iops : min= 7188, max=17550, avg=12928.40, stdev=2633.64, samples=40 > (5) iops : min= 8268, max=17446, avg=12837.55, stdev=2717.98, samples=40 > > xfstests seems survived. Comments are much welcomed and > thanks for your time! > > Thanks, > Gao Xiang > > Changes since v1: > - fix missing tmp_logflags initialization; > - drop unnecessary realtime inode check pointed out by Darrick. > > Gao Xiang (3): > xfs: get rid of LEFT, RIGHT, PREV in > xfs_bmap_add_extent_unwritten_real() > xfs: introduce xfs_bmap_update_extent_real() > xfs: introduce xfs_bremapi_from_cowfork() > > fs/xfs/libxfs/xfs_bmap.c | 377 +++++++++++++++++++++++++-------------- > fs/xfs/libxfs/xfs_bmap.h | 7 +- > fs/xfs/xfs_reflink.c | 24 +-- > 3 files changed, 252 insertions(+), 156 deletions(-) > > -- > 2.24.4 >
Hi Darrick, On Mon, Mar 21, 2022 at 03:21:25PM -0700, Darrick J. Wong wrote: > On Wed, Feb 16, 2022 at 11:08:51AM +0800, Gao Xiang wrote: > > Hi folks, > > > > Currently, xfs_reflink_end_cow_extent() will unconditionally unmap an > > extent from DATA fork and then remap an extent from COW fork. It seems > > somewhat ineffective since for many cases we could update real bmbt > > records directly by sightly enhancing old > > xfs_bmap_add_extent_unwritten_real() implementation, thus reduce some > > measurable extra metadata overhead. > > Does it work with rmap enabled? Yeah, I've tested with rmap enabled. But for now, I have to prioritize `erofs over fscache' stuff since it's really needed for cloud vendors and cloud-native ecosystem to achieve high-dense and proformance image solution for runC and Kata containers. And this patchset still left some work to do (since our tester found some cases could cause performance degration but I haven't checked/confirm it yet.) > Reading between the lines, I'm guessing the performance boost might > come from avoiding a transaction roll and (possibly) reducing the need > to log bmbt updates? Particularly in the worst case where we split the > bmbt only to rejoin the blocks immediately after. > > Recently, Dave and Allison have been pondering making an addition to the > deferred log item code so that we could ->finish_item the first defer op > in the same transaction that logs the caller's deferred operations. > Might that get you most of the speed advantage that you're seeking? Ok, I will recheck later after my container stuff are almost done.. > > > It's important to us since, actually, we're planing to use a modified > > alway-cow like atomic write approach internally for database > > applications, therefore it'd be nice to do some optimization over > > simple end COW approach. Also I think it's still generic and can > > benefit other reflink use cases as well. > > Hmm, that sounds /awfully/ similar to what sqlite does with f2fs' atomic > write ioctls. > > Alternately, would this[1] feature that's been sitting around in > djwong-dev since late 2019 help? > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=atomic-file-updates > The problem is that, for example, mysql directly uses direct I/O to write data, and we don't want to touch applications (since there are still old e.g. mysql versions in production.) so a transparent COW atomic write would be much helpful for products (yeah, I admit it's a bit tricky, but until applications choose to use swapext, we still need some way that we don't need to touch database apps). Thanks for the reply and the time! Thanks, Gao Xiang > --D > > > > > I did some tests with ramdisk in order to measure metadata overhead: > > > > echo 1 > /sys/fs/xfs/debug/always_cow > > mkfs.xfs -f -mreflink=1 /dev/ram0 > > mount /dev/ram0 testdir > > fio -filename=testdir/1 -size=1G -ioengine=psync -bs=4k -rw=randwrite -overwrite=1 -direct=1 -end_fsync=1 -name=job1 > > > > Test results as below: > > Vanilla: > > (1) iops : min= 7986, max=16434, avg=12505.76, stdev=2400.05, samples=41 > > (2) iops : min= 7636, max=16376, avg=12474.19, stdev=2258.18, samples=42 > > (3) iops : min= 8346, max=16439, avg=12227.95, stdev=2432.12, samples=42 > > (4) iops : min= 8580, max=16496, avg=12779.41, stdev=2297.42, samples=41 > > (5) iops : min= 8286, max=16556, avg=12500.76, stdev=2123.90, samples=41 > > > > Patched: > > (1) iops : min= 7086, max=17132, avg=12931.20, stdev=2729.10, samples=40 > > (2) iops : min= 7704, max=17508, avg=13204.62, stdev=2507.70, samples=39 > > (3) iops : min= 8736, max=17634, avg=13253.08, stdev=2545.18, samples=39 > > (4) iops : min= 7188, max=17550, avg=12928.40, stdev=2633.64, samples=40 > > (5) iops : min= 8268, max=17446, avg=12837.55, stdev=2717.98, samples=40 > > > > xfstests seems survived. Comments are much welcomed and > > thanks for your time! > > > > Thanks, > > Gao Xiang > > > > Changes since v1: > > - fix missing tmp_logflags initialization; > > - drop unnecessary realtime inode check pointed out by Darrick. > > > > Gao Xiang (3): > > xfs: get rid of LEFT, RIGHT, PREV in > > xfs_bmap_add_extent_unwritten_real() > > xfs: introduce xfs_bmap_update_extent_real() > > xfs: introduce xfs_bremapi_from_cowfork() > > > > fs/xfs/libxfs/xfs_bmap.c | 377 +++++++++++++++++++++++++-------------- > > fs/xfs/libxfs/xfs_bmap.h | 7 +- > > fs/xfs/xfs_reflink.c | 24 +-- > > 3 files changed, 252 insertions(+), 156 deletions(-) > > > > -- > > 2.24.4 > >
© 2016 - 2026 Red Hat, Inc.