[PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier

Ojaswin Mujoo posted 11 patches 1 month, 3 weeks ago
There is a newer version of this series
Posted by Ojaswin Mujoo 1 month, 3 weeks ago
From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>

We brute force all possible blocksize & clustersize combinations on
a bigalloc filesystem to stress atomic writes using the fio data crc
verifier. We run nproc * 2 * $LOAD_FACTOR threads in parallel, all
writing to a single $SCRATCH_MNT/test-file. With atomic writes, this
test ensures that we never see a mix of data contents from different
threads within any fio blocksize-sized range.

This test may issue overlapping atomic writes, but that should be okay
since overlapping parallel hardware atomic writes don't cause tearing as
long as the I/O size is the same for all writes.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/ext4/061     | 130 +++++++++++++++++++++++++++++++++++++++++++++
 tests/ext4/061.out |   2 +
 2 files changed, 132 insertions(+)
 create mode 100755 tests/ext4/061
 create mode 100644 tests/ext4/061.out

diff --git a/tests/ext4/061 b/tests/ext4/061
new file mode 100755
index 00000000..a0e49249
--- /dev/null
+++ b/tests/ext4/061
@@ -0,0 +1,130 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test 061
+#
+# Brute force all possible blocksize and clustersize combinations on a bigalloc
+# filesystem to stress atomic writes using the fio data crc verifier. We run
+# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
+# $SCRATCH_MNT/test-file. With fio aio-dio atomic writes, this test ensures that
+# we never see a mix of data contents from different threads within any
+# given fio blocksize.
+#
+
+. ./common/preamble
+. ./common/atomicwrites
+
+_begin_fstest auto rw stress atomicwrites
+
+_require_scratch_write_atomic
+_require_aiodio
+
+FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
+SIZE=$((100*1024*1024))
+fiobsize=4096
+
+# Calculate fsblocksize as per bdev atomic write units.
+bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
+bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
+fsblocksize=$(_max 4096 "$bdev_awu_min")
+
+function create_fio_configs()
+{
+	create_fio_aw_config
+	create_fio_verify_config
+}
+
+function create_fio_verify_config()
+{
+cat >$fio_verify_config <<EOF
+	[aio-dio-aw-verify]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=native
+	filename=$SCRATCH_MNT/test-file
+	size=$SIZE
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	atomic=1
+	group_reporting=1
+
+	verify_only=1
+	verify_state_save=0
+	verify=crc32c
+	verify_fatal=1
+	verify_write_sequence=0
+EOF
+}
+
+function create_fio_aw_config()
+{
+cat >$fio_aw_config <<EOF
+	[aio-dio-aw]
+	direct=1
+	ioengine=libaio
+	rw=randwrite
+	bs=$fiobsize
+	fallocate=native
+	filename=$SCRATCH_MNT/test-file
+	size=$SIZE
+	iodepth=$FIO_LOAD
+	numjobs=$FIO_LOAD
+	group_reporting=1
+	atomic=1
+
+	verify_state_save=0
+	verify=crc32c
+	do_verify=0
+
+EOF
+}
+
+# Let's create a sample fio config to check whether fio supports all options.
+fio_aw_config=$tmp.aw.fio
+fio_verify_config=$tmp.verify.fio
+fio_out=$tmp.fio.out
+
+create_fio_configs
+_require_fio $fio_aw_config
+
+for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
+	# Cluster sizes above 16 x blocksize are experimental, so avoid them.
+	# Also, cap the cluster size at 128KiB to keep it reasonable for large
+	# block sizes.
+	fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
+
+	for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
+		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
+			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
+			_scratch_mkfs_ext4  >> $seqres.full 2>&1 || continue
+			if _try_scratch_mount >> $seqres.full 2>&1; then
+				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
+
+				touch $SCRATCH_MNT/f1
+				create_fio_configs
+
+				cat $fio_aw_config >> $seqres.full
+				echo >> $seqres.full
+				cat $fio_verify_config >> $seqres.full
+
+				$FIO_PROG $fio_aw_config >> $seqres.full
+				ret1=$?
+
+				$FIO_PROG $fio_verify_config >> $seqres.full
+				ret2=$?
+
+				_scratch_unmount
+
+				[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
+			fi
+		done
+	done
+done
+
+# success, all done
+echo Silence is golden
+status=0
+exit
diff --git a/tests/ext4/061.out b/tests/ext4/061.out
new file mode 100644
index 00000000..273be9e0
--- /dev/null
+++ b/tests/ext4/061.out
@@ -0,0 +1,2 @@
+QA output created by 061
+Silence is golden
-- 
2.49.0
Re: [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier
Posted by John Garry 1 month, 3 weeks ago
On 10/08/2025 14:42, Ojaswin Mujoo wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> We brute force all possible blocksize & clustersize combinations on
> a bigalloc filesystem for stressing atomic write using fio data crc
> verifier. 

You seem to run mkfs per block size. Why not just run mkfs once for the
largest blocksize, which will support all block sizes?

> We run nproc * $LOAD_FACTOR threads in parallel writing to
> a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
> that we never see the mix of data contents from different threads on
> a given bsrange.
> 
> This test might do overlapping atomic writes but that should be okay
> since overlapping parallel hardware atomic writes don't cause tearing as
> long as io size is the same for all writes.

Please mention that serializing racing writes is not guaranteed for 
RWF_ATOMIC, and that NVMe and SCSI provide this guarantee as an 
inseparable feature of power-fail atomicity.

Please also mention that the value of the test is that it checks that we
split no bios or, equivalently, only generate a single bio per write syscall.

> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>   tests/ext4/061     | 130 +++++++++++++++++++++++++++++++++++++++++++++
>   tests/ext4/061.out |   2 +
>   2 files changed, 132 insertions(+)
>   create mode 100755 tests/ext4/061
>   create mode 100644 tests/ext4/061.out
> 
> diff --git a/tests/ext4/061 b/tests/ext4/061
> new file mode 100755
> index 00000000..a0e49249
> --- /dev/null
> +++ b/tests/ext4/061
> @@ -0,0 +1,130 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test 061
> +#
> +# Brute force all possible blocksize clustersize combination on a bigalloc
> +# filesystem for stressing atomic write using fio data crc verifier. We run
> +# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
> +# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
> +# we should never see the mix of data contents from different threads for any
> +# given fio blocksize.
> +#
> +
> +. ./common/preamble
> +. ./common/atomicwrites
> +
> +_begin_fstest auto rw stress atomicwrites
> +
> +_require_scratch_write_atomic
> +_require_aiodio

Do you require a certain fio version somewhere?

> +
> +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> +SIZE=$((100*1024*1024))
> +fiobsize=4096
> +
> +# Calculate fsblocksize as per bdev atomic write units.
> +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> +fsblocksize=$(_max 4096 "$bdev_awu_min")
> +
> +function create_fio_configs()
> +{
> +	create_fio_aw_config
> +	create_fio_verify_config
> +}
> +
> +function create_fio_verify_config()
> +{
> +cat >$fio_verify_config <<EOF
> +	[aio-dio-aw-verify]
> +	direct=1
> +	ioengine=libaio
> +	rw=randwrite

It probably makes sense to just use read here, but I guess that with
verify_only=1 this makes no difference.

> +	bs=$fiobsize
> +	fallocate=native
> +	filename=$SCRATCH_MNT/test-file
> +	size=$SIZE
> +	iodepth=$FIO_LOAD
> +	numjobs=$FIO_LOAD
> +	atomic=1
> +	group_reporting=1
> +
> +	verify_only=1
> +	verify_state_save=0
> +	verify=crc32c
> +	verify_fatal=1
> +	verify_write_sequence=0
> +EOF
> +}
> +
> +function create_fio_aw_config()
> +{
> +cat >$fio_aw_config <<EOF
> +	[aio-dio-aw]
> +	direct=1
> +	ioengine=libaio
> +	rw=randwrite
> +	bs=$fiobsize
> +	fallocate=native
> +	filename=$SCRATCH_MNT/test-file
> +	size=$SIZE
> +	iodepth=$FIO_LOAD
> +	numjobs=$FIO_LOAD
> +	group_reporting=1
> +	atomic=1
> +
> +	verify_state_save=0
> +	verify=crc32c
> +	do_verify=0
> +
> +EOF
> +}
> +
> +# Let's create a sample fio config to check whether fio supports all options.
> +fio_aw_config=$tmp.aw.fio
> +fio_verify_config=$tmp.verify.fio
> +fio_out=$tmp.fio.out
> +
> +create_fio_configs
> +_require_fio $fio_aw_config
> +
> +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> +	# cluster sizes above 16 x blocksize are experimental so avoid them
> +	# Also, cap cluster size at 128kb to keep it reasonable for large
> +	# blocks size
> +	fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
> +
> +	for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
> +		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> +			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"

This is quite heavy indentation. Maybe the steps below can be put into a 
separate routine (to make the code more readable).


> +			_scratch_mkfs_ext4  >> $seqres.full 2>&1 || continue
> +			if _try_scratch_mount >> $seqres.full 2>&1; then
> +				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> +
> +				touch $SCRATCH_MNT/f1
> +				create_fio_configs
> +
> +				cat $fio_aw_config >> $seqres.full
> +				echo >> $seqres.full
> +				cat $fio_verify_config >> $seqres.full
> +
> +				$FIO_PROG $fio_aw_config >> $seqres.full
> +				ret1=$?
> +
> +				$FIO_PROG $fio_verify_config >> $seqres.full
> +				ret2=$?
> +
> +				_scratch_unmount
> +
> +				[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> +			fi
> +		done
> +	done
> +done
> +
> +# success, all done
> +echo Silence is golden
> +status=0
> +exit
> diff --git a/tests/ext4/061.out b/tests/ext4/061.out
> new file mode 100644
> index 00000000..273be9e0
> --- /dev/null
> +++ b/tests/ext4/061.out
> @@ -0,0 +1,2 @@
> +QA output created by 061
> +Silence is golden
Re: [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier
Posted by Ojaswin Mujoo 1 month, 3 weeks ago
On Tue, Aug 12, 2025 at 09:08:59AM +0100, John Garry wrote:
> On 10/08/2025 14:42, Ojaswin Mujoo wrote:
> > From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> > 
> > We brute force all possible blocksize & clustersize combinations on
> > a bigalloc filesystem for stressing atomic write using fio data crc
> > verifier.
> 
> you seem to run mkfs per block size. Why not just mkfs for largest blocksize
> once, which will support all block sizes?

We are just stressing all the possible combinations to shake out any
bugs. This is marked as stress so I feel the extra loops should be okay.
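
For a sense of scale, below is a rough dry-run sketch of the same loop
structure that only prints the (fsblocksize, fsclustersize, fiobsize)
triples instead of running mkfs and fio. The awu_min, awu_max and
page_size values and the min/enumerate_combos helpers are illustrative
assumptions; the real test reads the limits from $SCRATCH_DEV through the
fstests helpers.

```shell
#!/bin/bash
# Dry run: list the (fsblocksize, fsclustersize, fiobsize) triples that the
# nested loops would visit. The inputs below are assumed example values,
# not read from a real device.
awu_min=4096          # assumed bdev atomic write unit min
awu_max=65536         # assumed bdev atomic write unit max
page_size=4096        # assumed page size
cap=$((128 * 1024))   # cluster size cap used by the test

# Print the smallest of the given integers.
min() {
	local m=$1 v
	for v in "$@"; do
		((v < m)) && m=$v
	done
	echo "$m"
}

enumerate_combos() {
	local bs cs fbs max_cs start
	# Start at 4096 or the device minimum, whichever is larger.
	start=$((awu_min > 4096 ? awu_min : 4096))
	for ((bs = start; bs <= page_size; bs <<= 1)); do
		max_cs=$(min $((16 * bs)) "$awu_max" "$cap")
		for ((cs = bs; cs <= max_cs; cs <<= 1)); do
			for ((fbs = bs; fbs <= cs; fbs <<= 1)); do
				echo "$bs $cs $fbs"
			done
		done
	done
}

enumerate_combos
echo "total: $(enumerate_combos | wc -l) mkfs+fio rounds"
```

With these assumed values only the 4k blocksize is visited; a larger page
size or larger atomic write limits add more rounds.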

> 
> > We run nproc * $LOAD_FACTOR threads in parallel writing to
> > a single $SCRATCH_MNT/test-file. With atomic writes this test ensures
> > that we never see the mix of data contents from different threads on
> > a given bsrange.
> > 
> > This test might do overlapping atomic writes but that should be okay
> > since overlapping parallel hardware atomic writes don't cause tearing as
> > long as io size is the same for all writes.
> 
> Please mention that serializing racing writes is not guaranteed for
> RWF_ATOMIC, and that NVMe and SCSI provide this guarantee as an inseparable
> feature to power-fail atomicity.
> 
> Please also mention that the value is that we test that we split no bios or
> only generate a single bio per write syscall.

Got it, will do.
> 
> > 
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > ---
> >   tests/ext4/061     | 130 +++++++++++++++++++++++++++++++++++++++++++++
> >   tests/ext4/061.out |   2 +
> >   2 files changed, 132 insertions(+)
> >   create mode 100755 tests/ext4/061
> >   create mode 100644 tests/ext4/061.out
> > 
> > diff --git a/tests/ext4/061 b/tests/ext4/061
> > new file mode 100755
> > index 00000000..a0e49249
> > --- /dev/null
> > +++ b/tests/ext4/061
> > @@ -0,0 +1,130 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test 061
> > +#
> > +# Brute force all possible blocksize clustersize combination on a bigalloc
> > +# filesystem for stressing atomic write using fio data crc verifier. We run
> > +# nproc * 2 * $LOAD_FACTOR threads in parallel writing to a single
> > +# $SCRATCH_MNT/test-file. With fio aio-dio atomic write this test ensures that
> > +# we should never see the mix of data contents from different threads for any
> > +# given fio blocksize.
> > +#
> > +
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +
> > +_begin_fstest auto rw stress atomicwrites
> > +
> > +_require_scratch_write_atomic
> > +_require_aiodio
> 
> do you require fio with a certain version somewhere?

Oh right, you mentioned that atomic=1 was broken on some older fio versions.
Would you happen to know which version fixed it?

> 
> > +
> > +FIO_LOAD=$(($(nproc) * 2 * LOAD_FACTOR))
> > +SIZE=$((100*1024*1024))
> > +fiobsize=4096
> > +
> > +# Calculate fsblocksize as per bdev atomic write units.
> > +bdev_awu_min=$(_get_atomic_write_unit_min $SCRATCH_DEV)
> > +bdev_awu_max=$(_get_atomic_write_unit_max $SCRATCH_DEV)
> > +fsblocksize=$(_max 4096 "$bdev_awu_min")
> > +
> > +function create_fio_configs()
> > +{
> > +	create_fio_aw_config
> > +	create_fio_verify_config
> > +}
> > +
> > +function create_fio_verify_config()
> > +{
> > +cat >$fio_verify_config <<EOF
> > +	[aio-dio-aw-verify]
> > +	direct=1
> > +	ioengine=libaio
> > +	rw=randwrite
> 
> it prob makes sense to just have read, but I guess with verify_only=1 that
> this makes no difference

Right, but I can change it in the next revision.

> 
> > +	bs=$fiobsize
> > +	fallocate=native
> > +	filename=$SCRATCH_MNT/test-file
> > +	size=$SIZE
> > +	iodepth=$FIO_LOAD
> > +	numjobs=$FIO_LOAD
> > +	atomic=1
> > +	group_reporting=1
> > +
> > +	verify_only=1
> > +	verify_state_save=0
> > +	verify=crc32c
> > +	verify_fatal=1
> > +	verify_write_sequence=0
> > +EOF
> > +}
> > +
> > +function create_fio_aw_config()
> > +{
> > +cat >$fio_aw_config <<EOF
> > +	[aio-dio-aw]
> > +	direct=1
> > +	ioengine=libaio
> > +	rw=randwrite
> > +	bs=$fiobsize
> > +	fallocate=native
> > +	filename=$SCRATCH_MNT/test-file
> > +	size=$SIZE
> > +	iodepth=$FIO_LOAD
> > +	numjobs=$FIO_LOAD
> > +	group_reporting=1
> > +	atomic=1
> > +
> > +	verify_state_save=0
> > +	verify=crc32c
> > +	do_verify=0
> > +
> > +EOF
> > +}
> > +
> > +# Let's create a sample fio config to check whether fio supports all options.
> > +fio_aw_config=$tmp.aw.fio
> > +fio_verify_config=$tmp.verify.fio
> > +fio_out=$tmp.fio.out
> > +
> > +create_fio_configs
> > +_require_fio $fio_aw_config
> > +
> > +for ((fsblocksize=$fsblocksize; fsblocksize <= $(_get_page_size); fsblocksize = $fsblocksize << 1)); do
> > +	# cluster sizes above 16 x blocksize are experimental so avoid them
> > +	# Also, cap cluster size at 128kb to keep it reasonable for large
> > +	# blocks size
> > +	fs_max_clustersize=$(_min $((16 * fsblocksize)) "$bdev_awu_max" $((128 * 1024)))
> > +
> > +	for ((fsclustersize=$fsblocksize; fsclustersize <= $fs_max_clustersize; fsclustersize = $fsclustersize << 1)); do
> > +		for ((fiobsize = $fsblocksize; fiobsize <= $fsclustersize; fiobsize = $fiobsize << 1)); do
> > +			MKFS_OPTIONS="-O bigalloc -b $fsblocksize -C $fsclustersize"
> 
> this is quite heavy indentation. Maybe the below steps can be put into a
> separate routine (to make the code more readable).

Got it.
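
One possible shape for that split is sketched below, with the per-round
steps moved into a single routine so the loops only pick the sizes.
run_one_combo and run_fio are hypothetical names, and the fstests helpers
are replaced by echo stand-ins so the control flow can be run outside the
harness; this is not the code for the next revision.

```shell
#!/bin/bash
# Sketch only: stand-ins for the fstests helpers and for $FIO_PROG.
_scratch_mkfs_ext4() { echo "mkfs.ext4 $MKFS_OPTIONS"; }
_try_scratch_mount() { echo "mount"; }
_scratch_unmount()   { echo "umount"; }
run_fio()            { echo "fio $1"; }

# One mkfs/mount/write/verify round for a single size combination.
# Returns 0 when the round passed or was skipped, non-zero when fio failed.
run_one_combo() {
	local bs=$1 cs=$2 fbs=$3 ret1 ret2

	MKFS_OPTIONS="-O bigalloc -b $bs -C $cs"
	_scratch_mkfs_ext4 || return 0   # unsupported combination: skip
	_try_scratch_mount || return 0   # cannot mount: skip

	run_fio "aw bs=$fbs"; ret1=$?
	run_fio "verify bs=$fbs"; ret2=$?
	_scratch_unmount

	[[ $ret1 -eq 0 && $ret2 -eq 0 ]]
}

# The loops now only choose the sizes (bounds simplified for the sketch).
for ((bs = 4096; bs <= 4096; bs <<= 1)); do
	for ((cs = bs; cs <= 16 * bs; cs <<= 1)); do
		for ((fbs = bs; fbs <= cs; fbs <<= 1)); do
			run_one_combo "$bs" "$cs" "$fbs" ||
				{ echo "fio with atomic write failed"; exit 1; }
		done
	done
done
```

In the real test the failure branch would call _fail instead of echo and
exit, and run_fio would invoke $FIO_PROG on the generated config files.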

> 
> 
> > +			_scratch_mkfs_ext4  >> $seqres.full 2>&1 || continue
> > +			if _try_scratch_mount >> $seqres.full 2>&1; then
> > +				echo "== FIO test for fsblocksize=$fsblocksize fsclustersize=$fsclustersize fiobsize=$fiobsize ==" >> $seqres.full
> > +
> > +				touch $SCRATCH_MNT/f1
> > +				create_fio_configs
> > +
> > +				cat $fio_aw_config >> $seqres.full
> > +				echo >> $seqres.full
> > +				cat $fio_verify_config >> $seqres.full
> > +
> > +				$FIO_PROG $fio_aw_config >> $seqres.full
> > +				ret1=$?
> > +
> > +				$FIO_PROG $fio_verify_config >> $seqres.full
> > +				ret2=$?
> > +
> > +				_scratch_unmount
> > +
> > +				[[ $ret1 -eq 0 && $ret2 -eq 0 ]] || _fail "fio with atomic write failed"
> > +			fi
> > +		done
> > +	done
> > +done
> > +
> > +# success, all done
> > +echo Silence is golden
> > +status=0
> > +exit
> > diff --git a/tests/ext4/061.out b/tests/ext4/061.out
> > new file mode 100644
> > index 00000000..273be9e0
> > --- /dev/null
> > +++ b/tests/ext4/061.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 061
> > +Silence is golden
>
Re: [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier
Posted by John Garry 1 month, 3 weeks ago
On 13/08/2025 08:08, Ojaswin Mujoo wrote:
>>> +_begin_fstest auto rw stress atomicwrites
>>> +
>>> +_require_scratch_write_atomic
>>> +_require_aiodio
>> do you require fio with a certain version somewhere?
> Oh right you mentioned that atomic=1 was broken on some older fios.

It was not broken - it just did nothing. I suppose that amounts to the 
same thing.

> Would you happen to know which version fixed it?

fio 3.38

thanks,
John
Re: [PATCH v4 09/11] ext4: Atomic writes stress test for bigalloc using fio crc verifier
Posted by Ojaswin Mujoo 1 month, 2 weeks ago
On Wed, Aug 13, 2025 at 08:33:44AM +0100, John Garry wrote:
> On 13/08/2025 08:08, Ojaswin Mujoo wrote:
> > > > +_begin_fstest auto rw stress atomicwrites
> > > > +
> > > > +_require_scratch_write_atomic
> > > > +_require_aiodio
> > > do you require fio with a certain version somewhere?
> > Oh right you mentioned that atomic=1 was broken on some older fios.
> 
> It was not broken - it just did nothing. I suppose that they are the same
> thing.
> 
> > Would you happen to know which version fixed it?
> 
> fio 3.38

Thanks, I'll add the version check.
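
Such a check could look roughly like the sketch below. fio_version_ge is
a hypothetical helper written for illustration, not an existing fstests
function, and it assumes a sort that supports -V (as GNU coreutils does)
and that "fio --version" prints a string of the form "fio-3.38".

```shell
#!/bin/bash
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on "sort -V" placing the smaller version first.
fio_version_ge() {
	local have=$1 want=$2
	[ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n1)" = "$want" ]
}

# Strip the "fio-" prefix from the version string; fall back to 0 when
# fio is not installed so the comparison below simply fails.
fio_ver=$(fio --version 2>/dev/null | sed -e 's/^fio-//')
fio_ver=${fio_ver:-0}

if fio_version_ge "$fio_ver" 3.38; then
	echo "fio $fio_ver: atomic=1 is honoured"
else
	echo "fio $fio_ver: too old, atomic=1 would silently do nothing"
fi
```

Inside the test the else branch would presumably call _notrun rather than
echo, so that older fio builds skip the test instead of passing vacuously.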
> 
> thanks,
> John