[PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes

Ojaswin Mujoo posted 11 patches 5 months, 2 weeks ago
There is a newer version of this series
[PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes
Posted by Ojaswin Mujoo 5 months, 2 weeks ago
This test is intended to ensure that multi blocks atomic writes
maintain atomic guarantees across sudden FS shutdowns.

The way we work is that we lay out a file with random mix of written,
unwritten and hole extents. Then we start performing atomic writes
sequentially on the file while we parallely shutdown the FS. Then we
note the last offset where the atomic write happened just before shut
down and then make sure blocks around it either have completely old
data or completely new data, ie the write was not torn during shutdown.

We repeat the same with completely written, completely unwritten and completely
empty file to ensure these cases are not torn either.  Finally, we have a
similar test for append atomic writes

Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/1230     | 397 +++++++++++++++++++++++++++++++++++++++++
 tests/generic/1230.out |   2 +
 2 files changed, 399 insertions(+)
 create mode 100755 tests/generic/1230
 create mode 100644 tests/generic/1230.out

diff --git a/tests/generic/1230 b/tests/generic/1230
new file mode 100755
index 00000000..cff5adc0
--- /dev/null
+++ b/tests/generic/1230
@@ -0,0 +1,397 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
+#
+# FS QA Test No. 1230
+#
+# Test multi block atomic writes with sudden FS shutdowns to ensure
+# the FS is not tearing the write operation
+. ./common/preamble
+. ./common/atomicwrites
+_begin_fstest auto atomicwrites
+
+_require_scratch_write_atomic_multi_fsblock
+_require_atomic_write_test_commands
+_require_scratch_shutdown
+_require_xfs_io_command "truncate"
+
+_scratch_mkfs >> $seqres.full 2>&1
+_scratch_mount >> $seqres.full
+
+testfile=$SCRATCH_MNT/testfile
+touch $testfile
+
+awu_max=$(_get_atomic_write_unit_max $testfile)
+blksz=$(_get_block_size $SCRATCH_MNT)
+echo "Awu max: $awu_max" >> $seqres.full
+
+num_blocks=$((awu_max / blksz))
+# keep initial value high for dry run. This will be
+# tweaked in dry_run() based on device write speed.
+filesize=$(( 10 * 1024 * 1024 * 1024 ))
+
+_cleanup() {
+	[ -n "$awloop_pid" ] && kill $awloop_pid &> /dev/null
+	wait
+}
+
+atomic_write_loop() {
+	local off=0
+	local size=$awu_max
+	for ((i=0; i<$((filesize / $size )); i++)); do
+		# Due to sudden shutdown this can produce errors so just
+		# redirect them to seqres.full
+		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
+		echo "Written to offset: $off" >> $tmp.aw
+		off=$((off + $size))
+	done
+}
+
+# This test has the following flow:
+# 1. Start doing sequential atomic writes in bg, upto $filesize
+# 2. Sleep for 0.2s and shutdown the FS
+# 3. kill the atomic write process
+# 4. verify the writes were not torn
+#
+# We ideally want the shutdown to happen while an atomic write is ongoing
+# but this gets tricky since faster devices can actually finish the whole
+# atomic write loop before sleep 0.2s completes, resulting in the shutdown
+# happening after the write loop which is not what we want. A simple solution
+# to this is to increase $filesize so step 1 takes long enough but a big
+# $filesize leads to create_mixed_mappings() taking very long, which is not
+# ideal.
+#
+# Hence, use the dry_run function to figure out the rough device speed and set
+# $filesize accordingly.
+dry_run() {
+	echo >> $seqres.full
+	echo "# Estimating ideal filesize..." >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	local i=0
+	# Wait for atleast first write to be recorded or 10s
+	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
+
+	if [[ $i -gt 50 ]]
+	then
+		_fail "atomic write process took too long to start"
+	fi
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid 2>/dev/null  # the process might have finished already
+	wait $awloop_pid
+	unset $awloop_pid
+
+	bytes_written=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	echo "# Bytes written in 0.2s: $bytes_written" >> $seqres.full
+
+	filesize=$((bytes_written * 3))
+	echo "# Setting \$filesize=$filesize" >> $seqres.full
+
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+
+}
+
+create_mixed_mappings() {
+	local file=$1
+	local size_bytes=$2
+
+	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
+	#Fill the file with alternate written and unwritten blocks
+	local off=0
+	local operations=("W" "U")
+
+	for ((i=0; i<$((size_bytes / blksz )); i++)); do
+		index=$(($i % ${#operations[@]}))
+		map="${operations[$index]}"
+
+		case "$map" in
+		    "W")
+			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null
+			;;
+		    "U")
+			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
+			;;
+		esac
+		off=$((off + blksz))
+	done
+
+	sync $file
+}
+
+populate_expected_data() {
+	# create a dummy file with expected old data for different cases
+	create_mixed_mappings $testfile.exp_old_mixed $awu_max
+	expected_data_old_mixed=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mixed)
+
+	$XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
+	expected_data_old_zeroes=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_zeroes)
+
+	$XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
+	expected_data_old_mapped=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mapped)
+
+	# create a dummy file with expected new data
+	$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
+	expected_data_new=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_new)
+}
+
+verify_data_blocks() {
+	local verify_start=$1
+	local verify_end=$2
+	local expected_data_old="$3"
+	local expected_data_new="$4"
+
+	echo >> $seqres.full
+	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
+
+	# After an atomic write, for every chunk we ensure that the underlying
+	# data is either the old data or new data as writes shouldn't get torn.
+	local off=$verify_start
+	while [[ "$off" -lt "$verify_end" ]]
+	do
+		#actual_data=$(xxd -s $off -l $awu_max -p $testfile)
+		actual_data=$(od -An -t x1 -j $off -N $awu_max $testfile)
+		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
+		then
+			echo "Checksum match failed at off: $off size: $awu_max"
+			echo "Expected contents: (Either of the 2 below):"
+			echo
+			echo "Expected old: "
+			echo "$expected_data_old"
+			echo
+			echo "Expected new: "
+			echo "$expected_data_new"
+			echo
+			echo "Actual contents: "
+			echo "$actual_data"
+
+			_fail
+		fi
+		echo -n "Check at offset $off suceeded! " >> $seqres.full
+		if [[ "$actual_data" == "$expected_data_new" ]]
+		then
+			echo "matched new" >> $seqres.full
+		elif [[ "$actual_data" == "$expected_data_old" ]]
+		then
+			echo "matched old" >> $seqres.full
+		fi
+		off=$(( off + awu_max ))
+	done
+}
+
+# test data integrity for file by shutting down in between atomic writes
+test_data_integrity() {
+	echo >> $seqres.full
+	echo "# Writing atomically to file in background" >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	local i=0
+	# Wait for atleast first write to be recorded or 10s
+	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
+
+	if [[ $i -gt 50 ]]
+	then
+		_fail "atomic write process took too long to start"
+	fi
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid 2>/dev/null  # the process might have finished already
+	wait $awloop_pid
+	unset $awloop_pid
+
+	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	if [[ -z $last_offset ]]
+	then
+		last_offset=0
+	fi
+
+	echo >> $seqres.full
+	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+
+	# we want to verify all blocks around which the shutdown happended
+	verify_start=$(( last_offset - (awu_max * 5)))
+	if [[ $verify_start < 0 ]]
+	then
+		verify_start=0
+	fi
+
+	verify_end=$(( last_offset + (awu_max * 5)))
+	if [[ "$verify_end" -gt "$filesize" ]]
+	then
+		verify_end=$filesize
+	fi
+}
+
+# test data integrity for file wiht written and unwritten mappings
+test_data_integrity_mixed() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with mixed mappings" >> $seqres.full
+	create_mixed_mappings $testfile $filesize
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
+}
+
+# test data integrity for file with completely written mappings
+test_data_integrity_writ() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with fully written mapping" >> $seqres.full
+	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
+}
+
+# test data integrity for file with completely unwritten mappings
+test_data_integrity_unwrit() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
+	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+}
+
+# test data integrity for file with no mappings
+test_data_integrity_hole() {
+	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Creating testfile with no mappings" >> $seqres.full
+	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
+	sync $testfile
+
+	test_data_integrity
+
+	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
+}
+
+test_filesize_integrity() {
+	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Performing extending atomic writes over file in background" >> $seqres.full
+	atomic_write_loop &
+	awloop_pid=$!
+
+	local i=0
+	# Wait for atleast first write to be recorded or 10s
+	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
+
+	if [[ $i -gt 50 ]]
+	then
+		_fail "atomic write process took too long to start"
+	fi
+
+	echo >> $seqres.full
+	echo "# Shutting down filesystem while write is running" >> $seqres.full
+	_scratch_shutdown
+
+	kill $awloop_pid 2>/dev/null  # the process might have finished already
+	wait $awloop_pid
+	unset $awloop_pid
+
+	local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
+	if [[ -z $last_offset ]]
+	then
+		last_offset=0
+	fi
+
+	echo >> $seqres.full
+	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
+	rm $tmp.aw
+	sleep 0.5
+
+	_scratch_cycle_mount
+	local filesize=$(_get_filesize $testfile)
+	echo >> $seqres.full
+	echo "# Filesize after shutdown: $filesize" >> $seqres.full
+
+	# To confirm that the write went atomically, we check:
+	# 1. The last block should be a multiple of awu_max
+	# 2. The last block should be the completely new data
+
+	if (( $filesize % $awu_max ))
+	then
+		echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
+	fi
+
+	verify_start=$(( filesize - (awu_max * 5)))
+	if [[ $verify_start < 0 ]]
+	then
+		verify_start=0
+	fi
+
+	local verify_end=$filesize
+
+	# Here the blocks should always match new data hence, for simplicity of
+	# code, just corrupt the $expected_data_old buffer so it never matches
+	local expected_data_old="POISON"
+	verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
+}
+
+$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
+
+dry_run
+
+echo >> $seqres.full
+echo "# Populating expected data buffers" >> $seqres.full
+populate_expected_data
+
+# Loop 20 times to shake out any races due to shutdown
+for ((iter=0; iter<20; iter++))
+do
+	echo >> $seqres.full
+	echo "------ Iteration $iter ------" >> $seqres.full
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
+	test_data_integrity_mixed
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
+	test_data_integrity_writ
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
+	test_data_integrity_unwrit
+
+	echo >> $seqres.full
+	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
+	test_data_integrity_hole
+
+	echo >> $seqres.full
+	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
+	test_filesize_integrity
+done
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/generic/1230.out b/tests/generic/1230.out
new file mode 100644
index 00000000..d01f54ea
--- /dev/null
+++ b/tests/generic/1230.out
@@ -0,0 +1,2 @@
+QA output created by 1230
+Silence is golden
-- 
2.49.0
Re: [PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes
Posted by John Garry 5 months, 1 week ago
On 22/08/2025 09:02, Ojaswin Mujoo wrote:
> This test is intended to ensure that multi blocks atomic writes
> maintain atomic guarantees across sudden FS shutdowns.
> 
> The way we work is that we lay out a file with random mix of written,
> unwritten and hole extents. Then we start performing atomic writes
> sequentially on the file while we parallely shutdown the FS. Then we
> note the last offset where the atomic write happened just before shut
> down and then make sure blocks around it either have completely old
> data or completely new data, ie the write was not torn during shutdown.
> 
> We repeat the same with completely written, completely unwritten and completely
> empty file to ensure these cases are not torn either.  Finally, we have a
> similar test for append atomic writes
> 
> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>

Please check comments, below, thanks!

> ---
>   tests/generic/1230     | 397 +++++++++++++++++++++++++++++++++++++++++
>   tests/generic/1230.out |   2 +
>   2 files changed, 399 insertions(+)
>   create mode 100755 tests/generic/1230
>   create mode 100644 tests/generic/1230.out
> 
> diff --git a/tests/generic/1230 b/tests/generic/1230
> new file mode 100755
> index 00000000..cff5adc0
> --- /dev/null
> +++ b/tests/generic/1230
> @@ -0,0 +1,397 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> +#
> +# FS QA Test No. 1230
> +#
> +# Test multi block atomic writes with sudden FS shutdowns to ensure
> +# the FS is not tearing the write operation
> +. ./common/preamble
> +. ./common/atomicwrites
> +_begin_fstest auto atomicwrites
> +
> +_require_scratch_write_atomic_multi_fsblock
> +_require_atomic_write_test_commands
> +_require_scratch_shutdown
> +_require_xfs_io_command "truncate"

is a similar fallocate test needed?

> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_scratch_mount >> $seqres.full
> +
> +testfile=$SCRATCH_MNT/testfile
> +touch $testfile
> +
> +awu_max=$(_get_atomic_write_unit_max $testfile)
> +blksz=$(_get_block_size $SCRATCH_MNT)
> +echo "Awu max: $awu_max" >> $seqres.full
> +
> +num_blocks=$((awu_max / blksz))
> +# keep initial value high for dry run. This will be
> +# tweaked in dry_run() based on device write speed.
> +filesize=$(( 10 * 1024 * 1024 * 1024 ))

could this cause some out-of-space issue? That's 10GB, right?

> +
> +_cleanup() {
> +	[ -n "$awloop_pid" ] && kill $awloop_pid &> /dev/null
> +	wait
> +}
> +
> +atomic_write_loop() {
> +	local off=0
> +	local size=$awu_max
> +	for ((i=0; i<$((filesize / $size )); i++)); do
> +		# Due to sudden shutdown this can produce errors so just
> +		# redirect them to seqres.full
> +		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
> +		echo "Written to offset: $off" >> $tmp.aw
> +		off=$((off + $size))
> +	done
> +}
> +
> +# This test has the following flow:
> +# 1. Start doing sequential atomic writes in bg, upto $filesize

bg?

> +# 2. Sleep for 0.2s and shutdown the FS
> +# 3. kill the atomic write process
> +# 4. verify the writes were not torn
> +#
> +# We ideally want the shutdown to happen while an atomic write is ongoing
> +# but this gets tricky since faster devices can actually finish the whole
> +# atomic write loop before sleep 0.2s completes, resulting in the shutdown
> +# happening after the write loop which is not what we want. A simple solution
> +# to this is to increase $filesize so step 1 takes long enough but a big
> +# $filesize leads to create_mixed_mappings() taking very long, which is not
> +# ideal.
> +#
> +# Hence, use the dry_run function to figure out the rough device speed and set
> +# $filesize accordingly.
> +dry_run() {
> +	echo >> $seqres.full
> +	echo "# Estimating ideal filesize..." >> $seqres.full
> +	atomic_write_loop &
> +	awloop_pid=$!
> +
> +	local i=0
> +	# Wait for atleast first write to be recorded or 10s
> +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> +
> +	if [[ $i -gt 50 ]]
> +	then
> +		_fail "atomic write process took too long to start"
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> +	_scratch_shutdown
> +
> +	kill $awloop_pid 2>/dev/null  # the process might have finished already
> +	wait $awloop_pid
> +	unset $awloop_pid
> +
> +	bytes_written=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> +	echo "# Bytes written in 0.2s: $bytes_written" >> $seqres.full
> +
> +	filesize=$((bytes_written * 3))
> +	echo "# Setting \$filesize=$filesize" >> $seqres.full
> +
> +	rm $tmp.aw
> +	sleep 0.5
> +
> +	_scratch_cycle_mount
> +
> +}
> +
> +create_mixed_mappings() {

Is this same as patch 08/12?

> +	local file=$1
> +	local size_bytes=$2
> +
> +	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
> +	#Fill the file with alternate written and unwritten blocks
> +	local off=0
> +	local operations=("W" "U")
> +
> +	for ((i=0; i<$((size_bytes / blksz )); i++)); do
> +		index=$(($i % ${#operations[@]}))
> +		map="${operations[$index]}"
> +
> +		case "$map" in
> +		    "W")
> +			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null

does this just write random data? I don't see any pattern being set.

> +			;;
> +		    "U")
> +			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
> +			;;
> +		esac
> +		off=$((off + blksz))
> +	done
> +
> +	sync $file
> +}
> +
> +populate_expected_data() {
> +	# create a dummy file with expected old data for different cases
> +	create_mixed_mappings $testfile.exp_old_mixed $awu_max
> +	expected_data_old_mixed=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mixed)
> +
> +	$XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
> +	expected_data_old_zeroes=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_zeroes)
> +
> +	$XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
> +	expected_data_old_mapped=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mapped)
> +
> +	# create a dummy file with expected new data
> +	$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
> +	expected_data_new=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_new)
> +}
> +
> +verify_data_blocks() {
> +	local verify_start=$1
> +	local verify_end=$2
> +	local expected_data_old="$3"
> +	local expected_data_new="$4"
> +
> +	echo >> $seqres.full
> +	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
> +
> +	# After an atomic write, for every chunk we ensure that the underlying
> +	# data is either the old data or new data as writes shouldn't get torn.
> +	local off=$verify_start
> +	while [[ "$off" -lt "$verify_end" ]]
> +	do
> +		#actual_data=$(xxd -s $off -l $awu_max -p $testfile)
> +		actual_data=$(od -An -t x1 -j $off -N $awu_max $testfile)
> +		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
> +		then
> +			echo "Checksum match failed at off: $off size: $awu_max"
> +			echo "Expected contents: (Either of the 2 below):"
> +			echo
> +			echo "Expected old: "
> +			echo "$expected_data_old"


it would be nice if this was deterministic - see comment in 
create_mixed_mappings

> +			echo
> +			echo "Expected new: "
> +			echo "$expected_data_new"

nit: I am not sure what is meant by "expected". I would just have "new 
data". We don't know what to expect, as it could be old or new, right?

> +			echo
> +			echo "Actual contents: "
> +			echo "$actual_data"
> +
> +			_fail
> +		fi
> +		echo -n "Check at offset $off suceeded! " >> $seqres.full
> +		if [[ "$actual_data" == "$expected_data_new" ]]
> +		then
> +			echo "matched new" >> $seqres.full
> +		elif [[ "$actual_data" == "$expected_data_old" ]]
> +		then
> +			echo "matched old" >> $seqres.full
> +		fi
> +		off=$(( off + awu_max ))
> +	done
> +}
> +
> +# test data integrity for file by shutting down in between atomic writes
> +test_data_integrity() {
> +	echo >> $seqres.full
> +	echo "# Writing atomically to file in background" >> $seqres.full
> +	atomic_write_loop &
> +	awloop_pid=$!
> +

from here ...

> +	local i=0
> +	# Wait for atleast first write to be recorded or 10s
> +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> +
> +	if [[ $i -gt 50 ]]
> +	then
> +		_fail "atomic write process took too long to start"
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> +	_scratch_shutdown
> +
> +	kill $awloop_pid 2>/dev/null  # the process might have finished already
> +	wait $awloop_pid
> +	unset $awloop_pid

... to here looks similar in many functions. Can we factor it out?

> +
> +	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> +	if [[ -z $last_offset ]]
> +	then
> +		last_offset=0
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> +
> +	rm $tmp.aw
> +	sleep 0.5
> +
> +	_scratch_cycle_mount
> +
> +	# we want to verify all blocks around which the shutdown happended
> +	verify_start=$(( last_offset - (awu_max * 5)))
> +	if [[ $verify_start < 0 ]]
> +	then
> +		verify_start=0
> +	fi
> +
> +	verify_end=$(( last_offset + (awu_max * 5)))
> +	if [[ "$verify_end" -gt "$filesize" ]]
> +	then
> +		verify_end=$filesize
> +	fi
> +}
> +
> +# test data integrity for file wiht written and unwritten mappings

with

> +test_data_integrity_mixed() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with mixed mappings" >> $seqres.full
> +	create_mixed_mappings $testfile $filesize
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
> +}
> +
> +# test data integrity for file with completely written mappings
> +test_data_integrity_writ() {

please spell "writ" out fully, which I think should be "written"

> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with fully written mapping" >> $seqres.full
> +	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
> +	sync $testfile
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
> +}
> +
> +# test data integrity for file with completely unwritten mappings
> +test_data_integrity_unwrit() {

same as above

> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
> +	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
> +	sync $testfile
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> +}
> +
> +# test data integrity for file with no mappings
> +test_data_integrity_hole() {
> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Creating testfile with no mappings" >> $seqres.full
> +	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
> +	sync $testfile
> +
> +	test_data_integrity
> +
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> +}
> +
> +test_filesize_integrity() {
> +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Performing extending atomic writes over file in background" >> $seqres.full
> +	atomic_write_loop &
> +	awloop_pid=$!
> +
> +	local i=0
> +	# Wait for atleast first write to be recorded or 10s
> +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> +
> +	if [[ $i -gt 50 ]]
> +	then
> +		_fail "atomic write process took too long to start"
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> +	_scratch_shutdown
> +
> +	kill $awloop_pid 2>/dev/null  # the process might have finished already
> +	wait $awloop_pid
> +	unset $awloop_pid
> +
> +	local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> +	if [[ -z $last_offset ]]
> +	then
> +		last_offset=0
> +	fi
> +
> +	echo >> $seqres.full
> +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> +	rm $tmp.aw
> +	sleep 0.5
> +
> +	_scratch_cycle_mount
> +	local filesize=$(_get_filesize $testfile)
> +	echo >> $seqres.full
> +	echo "# Filesize after shutdown: $filesize" >> $seqres.full
> +
> +	# To confirm that the write went atomically, we check:
> +	# 1. The last block should be a multiple of awu_max
> +	# 2. The last block should be the completely new data
> +
> +	if (( $filesize % $awu_max ))
> +	then
> +		echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
> +	fi
> +
> +	verify_start=$(( filesize - (awu_max * 5)))
> +	if [[ $verify_start < 0 ]]
> +	then
> +		verify_start=0
> +	fi
> +
> +	local verify_end=$filesize
> +
> +	# Here the blocks should always match new data hence, for simplicity of
> +	# code, just corrupt the $expected_data_old buffer so it never matches
> +	local expected_data_old="POISON"
> +	verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
> +}
> +
> +$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> +
> +dry_run
> +
> +echo >> $seqres.full
> +echo "# Populating expected data buffers" >> $seqres.full
> +populate_expected_data
> +
> +# Loop 20 times to shake out any races due to shutdown
> +for ((iter=0; iter<20; iter++))
> +do
> +	echo >> $seqres.full
> +	echo "------ Iteration $iter ------" >> $seqres.full
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
> +	test_data_integrity_mixed
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
> +	test_data_integrity_writ
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
> +	test_data_integrity_unwrit
> +
> +	echo >> $seqres.full
> +	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
> +	test_data_integrity_hole
> +
> +	echo >> $seqres.full
> +	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full

what does "Starting filesize integrity test" mean?

> +	test_filesize_integrity
> +done
> +
> +echo "Silence is golden"
> +status=0
> +exit
> diff --git a/tests/generic/1230.out b/tests/generic/1230.out
> new file mode 100644
> index 00000000..d01f54ea
> --- /dev/null
> +++ b/tests/generic/1230.out
> @@ -0,0 +1,2 @@
> +QA output created by 1230
> +Silence is golden
Re: [PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes
Posted by Ojaswin Mujoo 5 months ago
On Tue, Sep 02, 2025 at 04:49:26PM +0100, John Garry wrote:
> On 22/08/2025 09:02, Ojaswin Mujoo wrote:
> > This test is intended to ensure that multi blocks atomic writes
> > maintain atomic guarantees across sudden FS shutdowns.
> > 
> > The way we work is that we lay out a file with random mix of written,
> > unwritten and hole extents. Then we start performing atomic writes
> > sequentially on the file while we parallely shutdown the FS. Then we
> > note the last offset where the atomic write happened just before shut
> > down and then make sure blocks around it either have completely old
> > data or completely new data, ie the write was not torn during shutdown.
> > 
> > We repeat the same with completely written, completely unwritten and completely
> > empty file to ensure these cases are not torn either.  Finally, we have a
> > similar test for append atomic writes
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> 
> Please check comments, below, thanks!
> 
> > ---
> >   tests/generic/1230     | 397 +++++++++++++++++++++++++++++++++++++++++
> >   tests/generic/1230.out |   2 +
> >   2 files changed, 399 insertions(+)
> >   create mode 100755 tests/generic/1230
> >   create mode 100644 tests/generic/1230.out
> > 
> > diff --git a/tests/generic/1230 b/tests/generic/1230
> > new file mode 100755
> > index 00000000..cff5adc0
> > --- /dev/null
> > +++ b/tests/generic/1230
> > @@ -0,0 +1,397 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test No. 1230
> > +#
> > +# Test multi block atomic writes with sudden FS shutdowns to ensure
> > +# the FS is not tearing the write operation
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest auto atomicwrites
> > +
> > +_require_scratch_write_atomic_multi_fsblock
> > +_require_atomic_write_test_commands
> > +_require_scratch_shutdown
> > +_require_xfs_io_command "truncate"
> 
> is a similar fallocate test needed?

Hey John, we run the test for the following cases:

- file has mixed mapping
- file has only hole (trucate case)
- file has only unwritten blocks (falloc case)
- file has only written blocks 

So we already do that. It's just that we don't need the
_require_xfs_io_command "falloc" since it is already included in
_require_atomic_write_test_commnads.

> 
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +_scratch_mount >> $seqres.full
> > +
> > +testfile=$SCRATCH_MNT/testfile
> > +touch $testfile
> > +
> > +awu_max=$(_get_atomic_write_unit_max $testfile)
> > +blksz=$(_get_block_size $SCRATCH_MNT)
> > +echo "Awu max: $awu_max" >> $seqres.full
> > +
> > +num_blocks=$((awu_max / blksz))
> > +# keep initial value high for dry run. This will be
> > +# tweaked in dry_run() based on device write speed.
> > +filesize=$(( 10 * 1024 * 1024 * 1024 ))
> 
> could this cause some out-of-space issue? That's 10GB, right?

Hey John, yes this is just a dummy value. We tune the filesize later
based on how fast the device is. That will usually be around 3 * (bytes
written in 0.2s) (check dry_run function). 

Generally this will be a smaller size. ( 3GB on a 5GB/s SSD.) But yes
I should probably add a _notrun if our ssd fast enough to fill up the
full FS in 0.2s.

> 
> > +
> > +_cleanup() {
> > +	[ -n "$awloop_pid" ] && kill $awloop_pid &> /dev/null
> > +	wait
> > +}
> > +
> > +atomic_write_loop() {
> > +	local off=0
> > +	local size=$awu_max
> > +	for ((i=0; i<$((filesize / $size )); i++)); do
> > +		# Due to sudden shutdown this can produce errors so just
> > +		# redirect them to seqres.full
> > +		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
> > +		echo "Written to offset: $off" >> $tmp.aw
> > +		off=$((off + $size))
> > +	done
> > +}
> > +
> > +# This test has the following flow:
> > +# 1. Start doing sequential atomic writes in bg, upto $filesize
> 
> bg?

background*. I'll change it.
> 
> > +# 2. Sleep for 0.2s and shutdown the FS
> > +# 3. kill the atomic write process
> > +# 4. verify the writes were not torn
> > +#
> > +# We ideally want the shutdown to happen while an atomic write is ongoing
> > +# but this gets tricky since faster devices can actually finish the whole
> > +# atomic write loop before sleep 0.2s completes, resulting in the shutdown
> > +# happening after the write loop which is not what we want. A simple solution
> > +# to this is to increase $filesize so step 1 takes long enough but a big
> > +# $filesize leads to create_mixed_mappings() taking very long, which is not
> > +# ideal.
> > +#
> > +# Hence, use the dry_run function to figure out the rough device speed and set
> > +# $filesize accordingly.
> > +dry_run() {
> > +	echo >> $seqres.full
> > +	echo "# Estimating ideal filesize..." >> $seqres.full
> > +	atomic_write_loop &
> > +	awloop_pid=$!
> > +
> > +	local i=0
> > +	# Wait for atleast first write to be recorded or 10s
> > +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> > +
> > +	if [[ $i -gt 50 ]]
> > +	then
> > +		_fail "atomic write process took too long to start"
> > +	fi
> > +
> > +	echo >> $seqres.full
> > +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> > +	_scratch_shutdown
> > +
> > +	kill $awloop_pid 2>/dev/null  # the process might have finished already
> > +	wait $awloop_pid
> > +	unset $awloop_pid
> > +
> > +	bytes_written=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> > +	echo "# Bytes written in 0.2s: $bytes_written" >> $seqres.full
> > +
> > +	filesize=$((bytes_written * 3))
> > +	echo "# Setting \$filesize=$filesize" >> $seqres.full
> > +
> > +	rm $tmp.aw
> > +	sleep 0.5
> > +
> > +	_scratch_cycle_mount
> > +
> > +}
> > +
> > +create_mixed_mappings() {
> 
> Is this same as patch 08/12?

I believe you mean the [D]SYNC tests, yes it is the same.

> 
> > +	local file=$1
> > +	local size_bytes=$2
> > +
> > +	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
> > +	#Fill the file with alternate written and unwritten blocks
> > +	local off=0
> > +	local operations=("W" "U")
> > +
> > +	for ((i=0; i<$((size_bytes / blksz )); i++)); do
> > +		index=$(($i % ${#operations[@]}))
> > +		map="${operations[$index]}"
> > +
> > +		case "$map" in
> > +		    "W")
> > +			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null
> 
> does this just write random data? I don't see any pattern being set.

By default pwrite writes 0xcd if no patterns is specified. This helps us
reliably check the data back.

> 
> > +			;;
> > +		    "U")
> > +			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
> > +			;;
> > +		esac
> > +		off=$((off + blksz))
> > +	done
> > +
> > +	sync $file
> > +}
> > +
> > +populate_expected_data() {
> > +	# create a dummy file with expected old data for different cases
> > +	create_mixed_mappings $testfile.exp_old_mixed $awu_max
> > +	expected_data_old_mixed=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mixed)
> > +
> > +	$XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
> > +	expected_data_old_zeroes=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_zeroes)
> > +
> > +	$XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
> > +	expected_data_old_mapped=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mapped)
> > +
> > +	# create a dummy file with expected new data
> > +	$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
> > +	expected_data_new=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_new)
> > +}
> > +
> > +verify_data_blocks() {
> > +	local verify_start=$1
> > +	local verify_end=$2
> > +	local expected_data_old="$3"
> > +	local expected_data_new="$4"
> > +
> > +	echo >> $seqres.full
> > +	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
> > +
> > +	# After an atomic write, for every chunk we ensure that the underlying
> > +	# data is either the old data or new data as writes shouldn't get torn.
> > +	local off=$verify_start
> > +	while [[ "$off" -lt "$verify_end" ]]
> > +	do
> > +		#actual_data=$(xxd -s $off -l $awu_max -p $testfile)
> > +		actual_data=$(od -An -t x1 -j $off -N $awu_max $testfile)
> > +		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
> > +		then
> > +			echo "Checksum match failed at off: $off size: $awu_max"
> > +			echo "Expected contents: (Either of the 2 below):"
> > +			echo
> > +			echo "Expected old: "
> > +			echo "$expected_data_old"
> 
> 
> it would be nice if this was deterministic - see comment in
> create_mixed_mappings

Yes, it is. It will be 0xcdcdcdcd
> 
> > +			echo
> > +			echo "Expected new: "
> > +			echo "$expected_data_new"
> 
> nit: I am not sure what is meant by "expected". I would just have "new
> data". We don't know what to expect, as it could be old or new, right?

Yes, so the I was thinking of it this way:

We either expect the data to be the full new (named expected_new) or
fully old (named expected_old). Else renaming it to new vs old vs actual
makese it a bit more confusing imo

> 
> > +			echo
> > +			echo "Actual contents: "
> > +			echo "$actual_data"
> > +
> > +			_fail
> > +		fi
> > +		echo -n "Check at offset $off suceeded! " >> $seqres.full
> > +		if [[ "$actual_data" == "$expected_data_new" ]]
> > +		then
> > +			echo "matched new" >> $seqres.full
> > +		elif [[ "$actual_data" == "$expected_data_old" ]]
> > +		then
> > +			echo "matched old" >> $seqres.full
> > +		fi
> > +		off=$(( off + awu_max ))
> > +	done
> > +}
> > +
> > +# test data integrity for file by shutting down in between atomic writes
> > +test_data_integrity() {
> > +	echo >> $seqres.full
> > +	echo "# Writing atomically to file in background" >> $seqres.full
> > +	atomic_write_loop &
> > +	awloop_pid=$!
> > +
> 
> from here ...
> 
> > +	local i=0
> > +	# Wait for atleast first write to be recorded or 10s
> > +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> > +
> > +	if [[ $i -gt 50 ]]
> > +	then
> > +		_fail "atomic write process took too long to start"
> > +	fi
> > +
> > +	echo >> $seqres.full
> > +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> > +	_scratch_shutdown
> > +
> > +	kill $awloop_pid 2>/dev/null  # the process might have finished already
> > +	wait $awloop_pid
> > +	unset $awloop_pid
> 
> ... to here looks similar in many functions. Can we factor it out?

Right thats true, I'll factor this out. Thanks for pointing it out.
> 
> > +
> > +	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> > +	if [[ -z $last_offset ]]
> > +	then
> > +		last_offset=0
> > +	fi
> > +
> > +	echo >> $seqres.full
> > +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> > +
> > +	rm $tmp.aw
> > +	sleep 0.5
> > +
> > +	_scratch_cycle_mount
> > +
> > +	# we want to verify all blocks around which the shutdown happended
> > +	verify_start=$(( last_offset - (awu_max * 5)))
> > +	if [[ $verify_start < 0 ]]
> > +	then
> > +		verify_start=0
> > +	fi
> > +
> > +	verify_end=$(( last_offset + (awu_max * 5)))
> > +	if [[ "$verify_end" -gt "$filesize" ]]
> > +	then
> > +		verify_end=$filesize
> > +	fi
> > +}
> > +
> > +# test data integrity for file wiht written and unwritten mappings
> 
> with
> 
> > +test_data_integrity_mixed() {
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with mixed mappings" >> $seqres.full
> > +	create_mixed_mappings $testfile $filesize
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
> > +}
> > +
> > +# test data integrity for file with completely written mappings
> > +test_data_integrity_writ() {
> 
> please spell "writ" out fully, which I think should be "written"

Yes, will do.

> 
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with fully written mapping" >> $seqres.full
> > +	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
> > +	sync $testfile
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
> > +}
> > +
> > +# test data integrity for file with completely unwritten mappings
> > +test_data_integrity_unwrit() {
> 
> same as above
> 
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
> > +	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
> > +	sync $testfile
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> > +}
> > +
> > +# test data integrity for file with no mappings
> > +test_data_integrity_hole() {
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with no mappings" >> $seqres.full
> > +	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
> > +	sync $testfile
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> > +}
> > +
> > +test_filesize_integrity() {
> > +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Performing extending atomic writes over file in background" >> $seqres.full
> > +	atomic_write_loop &
> > +	awloop_pid=$!
> > +
> > +	local i=0
> > +	# Wait for atleast first write to be recorded or 10s
> > +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> > +
> > +	if [[ $i -gt 50 ]]
> > +	then
> > +		_fail "atomic write process took too long to start"
> > +	fi
> > +
> > +	echo >> $seqres.full
> > +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> > +	_scratch_shutdown
> > +
> > +	kill $awloop_pid 2>/dev/null  # the process might have finished already
> > +	wait $awloop_pid
> > +	unset $awloop_pid
> > +
> > +	local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> > +	if [[ -z $last_offset ]]
> > +	then
> > +		last_offset=0
> > +	fi
> > +
> > +	echo >> $seqres.full
> > +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> > +	rm $tmp.aw
> > +	sleep 0.5
> > +
> > +	_scratch_cycle_mount
> > +	local filesize=$(_get_filesize $testfile)
> > +	echo >> $seqres.full
> > +	echo "# Filesize after shutdown: $filesize" >> $seqres.full
> > +
> > +	# To confirm that the write went atomically, we check:
> > +	# 1. The last block should be a multiple of awu_max
> > +	# 2. The last block should be the completely new data
> > +
> > +	if (( $filesize % $awu_max ))
> > +	then
> > +		echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
> > +	fi
> > +
> > +	verify_start=$(( filesize - (awu_max * 5)))
> > +	if [[ $verify_start < 0 ]]
> > +	then
> > +		verify_start=0
> > +	fi
> > +
> > +	local verify_end=$filesize
> > +
> > +	# Here the blocks should always match new data hence, for simplicity of
> > +	# code, just corrupt the $expected_data_old buffer so it never matches
> > +	local expected_data_old="POISON"
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
> > +}
> > +
> > +$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +dry_run
> > +
> > +echo >> $seqres.full
> > +echo "# Populating expected data buffers" >> $seqres.full
> > +populate_expected_data
> > +
> > +# Loop 20 times to shake out any races due to shutdown
> > +for ((iter=0; iter<20; iter++))
> > +do
> > +	echo >> $seqres.full
> > +	echo "------ Iteration $iter ------" >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
> > +	test_data_integrity_mixed
> > +
> > +	echo >> $seqres.full
> > +	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
> > +	test_data_integrity_writ
> > +
> > +	echo >> $seqres.full
> > +	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
> > +	test_data_integrity_unwrit
> > +
> > +	echo >> $seqres.full
> > +	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
> > +	test_data_integrity_hole
> > +
> > +	echo >> $seqres.full
> > +	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
> 
> what does "Starting filesize integrity test" mean?

Basically other tests already truncate the file to a higher value and
then perform the shut down test. Here we actually do append atomic
writes since we want to also stress the i_size update paths during
shutdown to ensure that doesn't cause any tearing with atomic writes.

I can maybe rename it to:


echo "# Starting data integrity test for atomic append writes" >> $seqres.full

Thanks for the review!

Regards,
ojaswin

> 
> > +	test_filesize_integrity
> > +done
> > +
> > +echo "Silence is golden"
> > +status=0
> > +exit
> > diff --git a/tests/generic/1230.out b/tests/generic/1230.out
> > new file mode 100644
> > index 00000000..d01f54ea
> > --- /dev/null
> > +++ b/tests/generic/1230.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 1230
> > +Silence is golden
>
Re: [PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes
Posted by John Garry 5 months ago
On 05/09/2025 18:06, Ojaswin Mujoo wrote:
> On Tue, Sep 02, 2025 at 04:49:26PM +0100, John Garry wrote:
>> On 22/08/2025 09:02, Ojaswin Mujoo wrote:
>>> This test is intended to ensure that multi blocks atomic writes
>>> maintain atomic guarantees across sudden FS shutdowns.
>>>
>>> The way we work is that we lay out a file with random mix of written,
>>> unwritten and hole extents. Then we start performing atomic writes
>>> sequentially on the file while we parallely shutdown the FS. Then we
>>> note the last offset where the atomic write happened just before shut
>>> down and then make sure blocks around it either have completely old
>>> data or completely new data, ie the write was not torn during shutdown.
>>>
>>> We repeat the same with completely written, completely unwritten and completely
>>> empty file to ensure these cases are not torn either.  Finally, we have a
>>> similar test for append atomic writes
>>>
>>> Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>>> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
>>> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>>
>> Please check comments, below, thanks!
>>
>>> ---
>>>    tests/generic/1230     | 397 +++++++++++++++++++++++++++++++++++++++++
>>>    tests/generic/1230.out |   2 +
>>>    2 files changed, 399 insertions(+)
>>>    create mode 100755 tests/generic/1230
>>>    create mode 100644 tests/generic/1230.out
>>>
>>> diff --git a/tests/generic/1230 b/tests/generic/1230
>>> new file mode 100755
>>> index 00000000..cff5adc0
>>> --- /dev/null
>>> +++ b/tests/generic/1230
>>> @@ -0,0 +1,397 @@
>>> +#! /bin/bash
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
>>> +#
>>> +# FS QA Test No. 1230
>>> +#
>>> +# Test multi block atomic writes with sudden FS shutdowns to ensure
>>> +# the FS is not tearing the write operation
>>> +. ./common/preamble
>>> +. ./common/atomicwrites
>>> +_begin_fstest auto atomicwrites
>>> +
>>> +_require_scratch_write_atomic_multi_fsblock
>>> +_require_atomic_write_test_commands
>>> +_require_scratch_shutdown
>>> +_require_xfs_io_command "truncate"
>>
>> is a similar fallocate test needed?
> 
> Hey John, we run the test for the following cases:
> 
> - file has mixed mapping
> - file has only hole (trucate case)
> - file has only unwritten blocks (falloc case)
> - file has only written blocks
> 
> So we already do that. It's just that we don't need the
> _require_xfs_io_command "falloc" since it is already included in
> _require_atomic_write_test_commnads.

ok

> 
>>
>>> +
>>> +_scratch_mkfs >> $seqres.full 2>&1
>>> +_scratch_mount >> $seqres.full
>>> +
>>> +testfile=$SCRATCH_MNT/testfile
>>> +touch $testfile
>>> +
>>> +awu_max=$(_get_atomic_write_unit_max $testfile)
>>> +blksz=$(_get_block_size $SCRATCH_MNT)
>>> +echo "Awu max: $awu_max" >> $seqres.full
>>> +
>>> +num_blocks=$((awu_max / blksz))
>>> +# keep initial value high for dry run. This will be
>>> +# tweaked in dry_run() based on device write speed.
>>> +filesize=$(( 10 * 1024 * 1024 * 1024 ))
>>
>> could this cause some out-of-space issue? That's 10GB, right?
> 
> Hey John, yes this is just a dummy value. We tune the filesize later
> based on how fast the device is. That will usually be around 3 * (bytes
> written in 0.2s) (check dry_run function).
> 
> Generally this will be a smaller size. ( 3GB on a 5GB/s SSD.) But yes
> I should probably add a _notrun if our ssd fast enough to fill up the
> full FS in 0.2s.

ok, good

> 
>>
>>> +
>>> +_cleanup() {
>>> +	[ -n "$awloop_pid" ] && kill $awloop_pid &> /dev/null
>>> +	wait
>>> +}
>>> +
>>> +atomic_write_loop() {
>>> +	local off=0
>>> +	local size=$awu_max
>>> +	for ((i=0; i<$((filesize / $size )); i++)); do
>>> +		# Due to sudden shutdown this can produce errors so just
>>> +		# redirect them to seqres.full
>>> +		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
>>> +		echo "Written to offset: $off" >> $tmp.aw
>>> +		off=$((off + $size))
>>> +	done
>>> +}
>>> +
>>> +# This test has the following flow:
>>> +# 1. Start doing sequential atomic writes in bg, upto $filesize
>>
>> bg?
> 
> background*. I'll change it.
>>
>>> +# 2. Sleep for 0.2s and shutdown the FS
>>> +# 3. kill the atomic write process
>>> +# 4. verify the writes were not torn
>>> +#
>>> +# We ideally want the shutdown to happen while an atomic write is ongoing
>>> +# but this gets tricky since faster devices can actually finish the whole
>>> +# atomic write loop before sleep 0.2s completes, resulting in the shutdown
>>> +# happening after the write loop which is not what we want. A simple solution
>>> +# to this is to increase $filesize so step 1 takes long enough but a big
>>> +# $filesize leads to create_mixed_mappings() taking very long, which is not
>>> +# ideal.
>>> +#
>>> +# Hence, use the dry_run function to figure out the rough device speed and set
>>> +# $filesize accordingly.
>>> +dry_run() {
>>> +	echo >> $seqres.full
>>> +	echo "# Estimating ideal filesize..." >> $seqres.full
>>> +	atomic_write_loop &
>>> +	awloop_pid=$!
>>> +
>>> +	local i=0
>>> +	# Wait for atleast first write to be recorded or 10s
>>> +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
>>> +
>>> +	if [[ $i -gt 50 ]]
>>> +	then
>>> +		_fail "atomic write process took too long to start"
>>> +	fi
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
>>> +	_scratch_shutdown
>>> +
>>> +	kill $awloop_pid 2>/dev/null  # the process might have finished already
>>> +	wait $awloop_pid
>>> +	unset $awloop_pid
>>> +
>>> +	bytes_written=$(tail -n 1 $tmp.aw | cut -d" " -f4)
>>> +	echo "# Bytes written in 0.2s: $bytes_written" >> $seqres.full
>>> +
>>> +	filesize=$((bytes_written * 3))
>>> +	echo "# Setting \$filesize=$filesize" >> $seqres.full
>>> +
>>> +	rm $tmp.aw
>>> +	sleep 0.5
>>> +
>>> +	_scratch_cycle_mount
>>> +
>>> +}
>>> +
>>> +create_mixed_mappings() {
>>
>> Is this same as patch 08/12?
> 
> I believe you mean the [D]SYNC tests, yes it is the same.

then maybe factor out the test, if possible. I assume that this sort of 
approach is taken for xfstests.

> 
>>
>>> +	local file=$1
>>> +	local size_bytes=$2
>>> +
>>> +	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
>>> +	#Fill the file with alternate written and unwritten blocks
>>> +	local off=0
>>> +	local operations=("W" "U")
>>> +
>>> +	for ((i=0; i<$((size_bytes / blksz )); i++)); do
>>> +		index=$(($i % ${#operations[@]}))
>>> +		map="${operations[$index]}"
>>> +
>>> +		case "$map" in
>>> +		    "W")
>>> +			$XFS_IO_PROG -fc "pwrite -b $blksz $off $blksz" $file  >> /dev/null
>>
>> does this just write random data? I don't see any pattern being set.
> 
> By default pwrite writes 0xcd if no patterns is specified. This helps us
> reliably check the data back.

ok, understood

> 
>>
>>> +			;;
>>> +		    "U")
>>> +			$XFS_IO_PROG -fc "falloc $off $blksz" $file >> /dev/null
>>> +			;;
>>> +		esac
>>> +		off=$((off + blksz))
>>> +	done
>>> +
>>> +	sync $file
>>> +}
>>> +
>>> +populate_expected_data() {
>>> +	# create a dummy file with expected old data for different cases
>>> +	create_mixed_mappings $testfile.exp_old_mixed $awu_max
>>> +	expected_data_old_mixed=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mixed)
>>> +
>>> +	$XFS_IO_PROG -fc "falloc 0 $awu_max" $testfile.exp_old_zeroes >> $seqres.full
>>> +	expected_data_old_zeroes=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_zeroes)
>>> +
>>> +	$XFS_IO_PROG -fc "pwrite -b $awu_max 0 $awu_max" $testfile.exp_old_mapped >> $seqres.full
>>> +	expected_data_old_mapped=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_old_mapped)
>>> +
>>> +	# create a dummy file with expected new data
>>> +	$XFS_IO_PROG -fc "pwrite -S 0x61 -b $awu_max 0 $awu_max" $testfile.exp_new >> $seqres.full
>>> +	expected_data_new=$(od -An -t x1 -j 0 -N $awu_max $testfile.exp_new)
>>> +}
>>> +
>>> +verify_data_blocks() {
>>> +	local verify_start=$1
>>> +	local verify_end=$2
>>> +	local expected_data_old="$3"
>>> +	local expected_data_new="$4"
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
>>> +
>>> +	# After an atomic write, for every chunk we ensure that the underlying
>>> +	# data is either the old data or new data as writes shouldn't get torn.
>>> +	local off=$verify_start
>>> +	while [[ "$off" -lt "$verify_end" ]]
>>> +	do
>>> +		#actual_data=$(xxd -s $off -l $awu_max -p $testfile)
>>> +		actual_data=$(od -An -t x1 -j $off -N $awu_max $testfile)
>>> +		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
>>> +		then
>>> +			echo "Checksum match failed at off: $off size: $awu_max"
>>> +			echo "Expected contents: (Either of the 2 below):"
>>> +			echo
>>> +			echo "Expected old: "
>>> +			echo "$expected_data_old"
>>
>>
>> it would be nice if this was deterministic - see comment in
>> create_mixed_mappings
> 
> Yes, it is. It will be 0xcdcdcdcd
>>
>>> +			echo
>>> +			echo "Expected new: "
>>> +			echo "$expected_data_new"
>>
>> nit: I am not sure what is meant by "expected". I would just have "new
>> data". We don't know what to expect, as it could be old or new, right?
> 
> Yes, so the I was thinking of it this way:
> 
> We either expect the data to be the full new (named expected_new) or
> fully old (named expected_old). Else renaming it to new vs old vs actual
> makese it a bit more confusing imo

ok, good

> 
>>
>>> +			echo
>>> +			echo "Actual contents: "
>>> +			echo "$actual_data"
>>> +
>>> +			_fail
>>> +		fi
>>> +		echo -n "Check at offset $off suceeded! " >> $seqres.full
>>> +		if [[ "$actual_data" == "$expected_data_new" ]]
>>> +		then
>>> +			echo "matched new" >> $seqres.full
>>> +		elif [[ "$actual_data" == "$expected_data_old" ]]
>>> +		then
>>> +			echo "matched old" >> $seqres.full
>>> +		fi
>>> +		off=$(( off + awu_max ))
>>> +	done
>>> +}
>>> +
>>> +# test data integrity for file by shutting down in between atomic writes
>>> +test_data_integrity() {
>>> +	echo >> $seqres.full
>>> +	echo "# Writing atomically to file in background" >> $seqres.full
>>> +	atomic_write_loop &
>>> +	awloop_pid=$!
>>> +
>>
>> from here ...
>>
>>> +	local i=0
>>> +	# Wait for atleast first write to be recorded or 10s
>>> +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
>>> +
>>> +	if [[ $i -gt 50 ]]
>>> +	then
>>> +		_fail "atomic write process took too long to start"
>>> +	fi
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
>>> +	_scratch_shutdown
>>> +
>>> +	kill $awloop_pid 2>/dev/null  # the process might have finished already
>>> +	wait $awloop_pid
>>> +	unset $awloop_pid
>>
>> ... to here looks similar in many functions. Can we factor it out?
> 
> Right thats true, I'll factor this out. Thanks for pointing it out.

ok

>>
>>> +
>>> +	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
>>> +	if [[ -z $last_offset ]]
>>> +	then
>>> +		last_offset=0
>>> +	fi
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
>>> +
>>> +	rm $tmp.aw
>>> +	sleep 0.5
>>> +
>>> +	_scratch_cycle_mount
>>> +
>>> +	# we want to verify all blocks around which the shutdown happended
>>> +	verify_start=$(( last_offset - (awu_max * 5)))
>>> +	if [[ $verify_start < 0 ]]
>>> +	then
>>> +		verify_start=0
>>> +	fi
>>> +
>>> +	verify_end=$(( last_offset + (awu_max * 5)))
>>> +	if [[ "$verify_end" -gt "$filesize" ]]
>>> +	then
>>> +		verify_end=$filesize
>>> +	fi
>>> +}
>>> +
>>> +# test data integrity for file wiht written and unwritten mappings
>>
>> with
>>
>>> +test_data_integrity_mixed() {
>>> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Creating testfile with mixed mappings" >> $seqres.full
>>> +	create_mixed_mappings $testfile $filesize
>>> +
>>> +	test_data_integrity
>>> +
>>> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
>>> +}
>>> +
>>> +# test data integrity for file with completely written mappings
>>> +test_data_integrity_writ() {
>>
>> please spell "writ" out fully, which I think should be "written"
> 
> Yes, will do.
> 
>>
>>> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Creating testfile with fully written mapping" >> $seqres.full
>>> +	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
>>> +	sync $testfile
>>> +
>>> +	test_data_integrity
>>> +
>>> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
>>> +}
>>> +
>>> +# test data integrity for file with completely unwritten mappings
>>> +test_data_integrity_unwrit() {
>>
>> same as above
>>
>>> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
>>> +	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
>>> +	sync $testfile
>>> +
>>> +	test_data_integrity
>>> +
>>> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
>>> +}
>>> +
>>> +# test data integrity for file with no mappings
>>> +test_data_integrity_hole() {
>>> +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Creating testfile with no mappings" >> $seqres.full
>>> +	$XFS_IO_PROG -c "truncate $filesize" $testfile >> $seqres.full
>>> +	sync $testfile
>>> +
>>> +	test_data_integrity
>>> +
>>> +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
>>> +}
>>> +
>>> +test_filesize_integrity() {
>>> +	$XFS_IO_PROG -c "truncate 0" $testfile >> $seqres.full
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Performing extending atomic writes over file in background" >> $seqres.full
>>> +	atomic_write_loop &
>>> +	awloop_pid=$!
>>> +
>>> +	local i=0
>>> +	# Wait for atleast first write to be recorded or 10s
>>> +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
>>> +
>>> +	if [[ $i -gt 50 ]]
>>> +	then
>>> +		_fail "atomic write process took too long to start"
>>> +	fi
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Shutting down filesystem while write is running" >> $seqres.full
>>> +	_scratch_shutdown
>>> +
>>> +	kill $awloop_pid 2>/dev/null  # the process might have finished already
>>> +	wait $awloop_pid
>>> +	unset $awloop_pid
>>> +
>>> +	local last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
>>> +	if [[ -z $last_offset ]]
>>> +	then
>>> +		last_offset=0
>>> +	fi
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
>>> +	rm $tmp.aw
>>> +	sleep 0.5
>>> +
>>> +	_scratch_cycle_mount
>>> +	local filesize=$(_get_filesize $testfile)
>>> +	echo >> $seqres.full
>>> +	echo "# Filesize after shutdown: $filesize" >> $seqres.full
>>> +
>>> +	# To confirm that the write went atomically, we check:
>>> +	# 1. The last block should be a multiple of awu_max
>>> +	# 2. The last block should be the completely new data
>>> +
>>> +	if (( $filesize % $awu_max ))
>>> +	then
>>> +		echo "Filesize after shutdown ($filesize) not a multiple of atomic write unit ($awu_max)"
>>> +	fi
>>> +
>>> +	verify_start=$(( filesize - (awu_max * 5)))
>>> +	if [[ $verify_start < 0 ]]
>>> +	then
>>> +		verify_start=0
>>> +	fi
>>> +
>>> +	local verify_end=$filesize
>>> +
>>> +	# Here the blocks should always match new data hence, for simplicity of
>>> +	# code, just corrupt the $expected_data_old buffer so it never matches
>>> +	local expected_data_old="POISON"
>>> +	verify_data_blocks $verify_start $verify_end "$expected_data_old" "$expected_data_new"
>>> +}
>>> +
>>> +$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
>>> +
>>> +dry_run
>>> +
>>> +echo >> $seqres.full
>>> +echo "# Populating expected data buffers" >> $seqres.full
>>> +populate_expected_data
>>> +
>>> +# Loop 20 times to shake out any races due to shutdown
>>> +for ((iter=0; iter<20; iter++))
>>> +do
>>> +	echo >> $seqres.full
>>> +	echo "------ Iteration $iter ------" >> $seqres.full
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
>>> +	test_data_integrity_mixed
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
>>> +	test_data_integrity_writ
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
>>> +	test_data_integrity_unwrit
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
>>> +	test_data_integrity_hole
>>> +
>>> +	echo >> $seqres.full
>>> +	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
>>
>> what does "Starting filesize integrity test" mean?
> 
> Basically other tests already truncate the file to a higher value and
> then perform the shut down test. Here we actually do append atomic
> writes since we want to also stress the i_size update paths during
> shutdown to ensure that doesn't cause any tearing with atomic writes.
> 
> I can maybe rename it to:
> 
> 
> echo "# Starting data integrity test for atomic append writes" >> $seqres.full
> 
> Thanks for the review!
> 

It's just the name "integrity" that throws me a bit..
Re: [PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes
Posted by Ojaswin Mujoo 5 months ago
On Mon, Sep 08, 2025 at 03:27:57PM +0100, John Garry wrote:
> On 05/09/2025 18:06, Ojaswin Mujoo wrote:
> > On Tue, Sep 02, 2025 at 04:49:26PM +0100, John Garry wrote:
> > > On 22/08/2025 09:02, Ojaswin Mujoo wrote:
> > > > ---
> > > >    tests/generic/1230     | 397 +++++++++++++++++++++++++++++++++++++++++
> > > >    tests/generic/1230.out |   2 +
> > > >    2 files changed, 399 insertions(+)
> > > >    create mode 100755 tests/generic/1230
> > > >    create mode 100644 tests/generic/1230.out
> > > > 

<...>

> > > > +
> > > > +	bytes_written=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> > > > +	echo "# Bytes written in 0.2s: $bytes_written" >> $seqres.full
> > > > +
> > > > +	filesize=$((bytes_written * 3))
> > > > +	echo "# Setting \$filesize=$filesize" >> $seqres.full
> > > > +
> > > > +	rm $tmp.aw
> > > > +	sleep 0.5
> > > > +
> > > > +	_scratch_cycle_mount
> > > > +
> > > > +}
> > > > +
> > > > +create_mixed_mappings() {
> > > 
> > > Is this same as patch 08/12?
> > 
> > I believe you mean the [D]SYNC tests, yes it is the same.
> 
> then maybe factor out the test, if possible. I assume that this sort of
> approach is taken for xfstests.
> 

I'm not sure what you mean by factor out the *test*. We are testing
different things there and the only thing common in the tests is
creation of mixed mapping files and the check to ensure we didn't tear
data.

In case you mean to factor out the create_mixed_mappings() helper into
common/rc, sure I can do that but I'm unsure if at this point it would
be very useful for other tests.

> > 
> > > 
> > > > +	local file=$1
> > > > +	local size_bytes=$2
> > > > +
> > > > +	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
> > > > +	#Fill the file with alternate written and unwritten blocks
> > > > +	local off=0
> > > > +	local operations=("W" "U")
> > > > +
> > > > +	for ((i=0; i<$((size_bytes / blksz )); i++)); do
> > > > +		index=$(($i % ${#operations[@]}))
> > > > +		map="${operations[$index]}"
> > > > +

<...>

> > > > +# Loop 20 times to shake out any races due to shutdown
> > > > +for ((iter=0; iter<20; iter++))
> > > > +do
> > > > +	echo >> $seqres.full
> > > > +	echo "------ Iteration $iter ------" >> $seqres.full
> > > > +
> > > > +	echo >> $seqres.full
> > > > +	echo "# Starting data integrity test for atomic writes over mixed mapping" >> $seqres.full
> > > > +	test_data_integrity_mixed
> > > > +
> > > > +	echo >> $seqres.full
> > > > +	echo "# Starting data integrity test for atomic writes over fully written mapping" >> $seqres.full
> > > > +	test_data_integrity_writ
> > > > +
> > > > +	echo >> $seqres.full
> > > > +	echo "# Starting data integrity test for atomic writes over fully unwritten mapping" >> $seqres.full
> > > > +	test_data_integrity_unwrit
> > > > +
> > > > +	echo >> $seqres.full
> > > > +	echo "# Starting data integrity test for atomic writes over holes" >> $seqres.full
> > > > +	test_data_integrity_hole
> > > > +
> > > > +	echo >> $seqres.full
> > > > +	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
> > > 
> > > what does "Starting filesize integrity test" mean?
> > 
> > Basically other tests already truncate the file to a higher value and
> > then perform the shut down test. Here we actually do append atomic
> > writes since we want to also stress the i_size update paths during
> > shutdown to ensure that doesn't cause any tearing with atomic writes.
> > 
> > I can maybe rename it to:
> > 
> > 
> > echo "# Starting data integrity test for atomic append writes" >> $seqres.full
> > 
> > Thanks for the review!
> > 
> 
> It's just the name "integrity" that throws me a bit..

So I mean integrity as in writes are not tearing after the shutdown.
That's how we have worded the other sub-tests above.

Regards,
ojaswin
Re: [PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes
Posted by John Garry 5 months ago
On 09/09/2025 07:44, Ojaswin Mujoo wrote:
>>>>> +create_mixed_mappings() {
>>>> Is this same as patch 08/12?
>>> I believe you mean the [D]SYNC tests, yes it is the same.
>> then maybe factor out the test, if possible. I assume that this sort of
>> approach is taken for xfstests.
>>
> I'm not sure what you mean by factor out the*test*. We are testing
> different things there and the only thing common in the tests is
> creation of mixed mapping files and the check to ensure we didn't tear
> data.
> 
> In case you mean to factor out the create_mixed_mappings() helper into
> common/rc, sure I can do that but I'm unsure if at this point it would
> be very useful for other tests.

above it was mentioned that the code in create_mixed_mappings was 
common, so that is why I suggested to factor it out. If it does not make 
sense, then fine (and don't).

> 
>>>>> +	local file=$1
>>>>> +	local size_bytes=$2
>>>>> +
>>>>> +	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
>>>>> +	#Fill the file with alternate written and unwritten blocks
>>>>> +	local off=0
>>>>> +	local operations=("W" "U")
>>>>> +
>>>>> +	for ((i=0; i<$((size_bytes / blksz )); i++)); do
>>>>> +		index=$(($i % ${#operations[@]}))
>>>>> +		map="${operations[$index]}"
>>>>> +
> <...>


>>>>> +	echo >> $seqres.full
>>>>> +	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
>>>> what does "Starting filesize integrity test" mean?
>>> Basically other tests already truncate the file to a higher value and
>>> then perform the shut down test. Here we actually do append atomic
>>> writes since we want to also stress the i_size update paths during
>>> shutdown to ensure that doesn't cause any tearing with atomic writes.
>>>
>>> I can maybe rename it to:
>>>
>>>
>>> echo "# Starting data integrity test for atomic append writes" >> $seqres.full
>>>
>>> Thanks for the review!
>>>
>> It's just the name "integrity" that throws me a bit..
> So I mean integrity as in writes are not tearing after the shutdown.
> That's how we have worded the other sub-tests above.

you could mention "shutdown" also in the print.

Thanks!
Re: [PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes
Posted by Ojaswin Mujoo 5 months ago
On Tue, Sep 09, 2025 at 08:49:18AM +0100, John Garry wrote:
> On 09/09/2025 07:44, Ojaswin Mujoo wrote:
> > > > > > +create_mixed_mappings() {
> > > > > Is this same as patch 08/12?
> > > > I believe you mean the [D]SYNC tests, yes it is the same.
> > > then maybe factor out the test, if possible. I assume that this sort of
> > > approach is taken for xfstests.
> > > 
> > I'm not sure what you mean by factor out the*test*. We are testing
> > different things there and the only thing common in the tests is
> > creation of mixed mapping files and the check to ensure we didn't tear
> > data.
> > 
> > In case you mean to factor out the create_mixed_mappings() helper into
> > common/rc, sure I can do that but I'm unsure if at this point it would
> > be very useful for other tests.
> 
> above it was mentioned that the code in create_mixed_mappings was common, so
> that is why I suggested to factor it out. If it does not make sense, then
> fine (and don't).

Yes I mean Im unsure it will be useful for tests other than the two
tests in this patchset.

> 
> > 
> > > > > > +	local file=$1
> > > > > > +	local size_bytes=$2
> > > > > > +
> > > > > > +	echo "# Filling file $file with alternate mappings till size $size_bytes" >> $seqres.full
> > > > > > +	#Fill the file with alternate written and unwritten blocks
> > > > > > +	local off=0
> > > > > > +	local operations=("W" "U")
> > > > > > +
> > > > > > +	for ((i=0; i<$((size_bytes / blksz )); i++)); do
> > > > > > +		index=$(($i % ${#operations[@]}))
> > > > > > +		map="${operations[$index]}"
> > > > > > +
> > <...>
> 
> 
> > > > > > +	echo >> $seqres.full
> > > > > > +	echo "# Starting filesize integrity test for atomic writes" >> $seqres.full
> > > > > what does "Starting filesize integrity test" mean?
> > > > Basically other tests already truncate the file to a higher value and
> > > > then perform the shut down test. Here we actually do append atomic
> > > > writes since we want to also stress the i_size update paths during
> > > > shutdown to ensure that doesn't cause any tearing with atomic writes.
> > > > 
> > > > I can maybe rename it to:
> > > > 
> > > > 
> > > > echo "# Starting data integrity test for atomic append writes" >> $seqres.full
> > > > 
> > > > Thanks for the review!
> > > > 
> > > It's just the name "integrity" that throws me a bit..
> > So I mean integrity as in writes are not tearing after the shutdown.
> > That's how we have worded the other sub-tests above.
> 
> you could mention "shutdown" also in the print.

Umm, do you mean something like:

 "Starting shutdown data integrity tests ..."

 Regards,
 ojaswin
> 
> Thanks!
>
Re: [PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes
Posted by John Garry 5 months ago
On 09/09/2025 10:01, Ojaswin Mujoo wrote:
>> you could mention "shutdown" also in the print.
> Umm, do you mean something like:
> 
>   "Starting shutdown data integrity tests ..."

yes :)
Re: [PATCH v5 09/12] generic: Add sudden shutdown tests for multi block atomic writes
Posted by Ojaswin Mujoo 5 months ago
On Tue, Sep 09, 2025 at 10:04:09AM +0100, John Garry wrote:
> On 09/09/2025 10:01, Ojaswin Mujoo wrote:
> > > you could mention "shutdown" also in the print.
> > Umm, do you mean something like:
> > 
> >   "Starting shutdown data integrity tests ..."
> 
> yes :)

Sure, will do thanks!