[PATCH v3 2/3] iotests/298: add testcase for async writes with preallocation filter

Andrey Drobyshev posted 3 patches 3 months ago
Maintainers: Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, John Snow <jsnow@redhat.com>, Cleber Rosa <crosa@redhat.com>
[PATCH v3 2/3] iotests/298: add testcase for async writes with preallocation filter
Posted by Andrey Drobyshev 3 months ago
The testcase simply creates a 64G image with 1M clusters, generates a list
of 1M aligned offsets and feeds aio_write commands with those offsets to
qemu-io run with '--aio native --nocache'.  Then we check the data
written at each of the offsets.  Before the previous commit this could
result into a race within the preallocation filter which would zeroize
some clusters after actually writing data to them.

Note: the test doesn't fail in 100% cases as there's a race involved,
but the failures are pretty consistent so it should be good enough for
detecting the problem.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
---
 tests/qemu-iotests/298     | 49 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/298.out |  4 ++--
 2 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/298 b/tests/qemu-iotests/298
index 09c9290711..b7126e9e15 100755
--- a/tests/qemu-iotests/298
+++ b/tests/qemu-iotests/298
@@ -20,8 +20,10 @@
 
 import os
 import iotests
+import random
 
 MiB = 1024 * 1024
+GiB = MiB * 1024
 disk = os.path.join(iotests.test_dir, 'disk')
 overlay = os.path.join(iotests.test_dir, 'overlay')
 refdisk = os.path.join(iotests.test_dir, 'refdisk')
@@ -176,5 +178,52 @@ class TestTruncate(iotests.QMPTestCase):
         self.do_test('off', '150M')
 
 
+class TestPreallocAsyncWrites(iotests.QMPTestCase):
+    def setUp(self):
+        # Make sure we get reproducible write patterns on each run
+        random.seed(42)
+        iotests.qemu_img_create('-f', iotests.imgfmt, disk, '-o',
+                                f'cluster_size={MiB},lazy_refcounts=on',
+                                str(64 * GiB))
+
+    def tearDown(self):
+        os.remove(disk)
+
+    def test_prealloc_async_writes(self):
+        def gen_write_pattern():
+            n = 0
+            while True:
+                yield '-P 0xaa' if n else '-z'
+                n = 1 - n
+
+        def gen_read_pattern():
+            n = 0
+            while True:
+                yield '-P 0xaa' if n else '-P 0x00'
+                n = 1 - n
+
+        requests = 2048 # Number of write/read requests to feed to qemu-io
+        total_clusters = 64 * 1024 # 64G / 1M
+
+        wpgen = gen_write_pattern()
+        rpgen = gen_read_pattern()
+
+        offsets = random.sample(range(0, total_clusters), requests)
+        aio_write_cmds = [f'aio_write {next(wpgen)} {off}M 1M' for off in offsets]
+        read_cmds = [f'read {next(rpgen)} {off}M 1M' for off in offsets]
+
+        proc = iotests.QemuIoInteractive('--aio', 'native', '--nocache',
+                                         '--image-opts', drive_opts)
+        for cmd in aio_write_cmds:
+            proc.cmd(cmd)
+        proc.close()
+
+        proc = iotests.QemuIoInteractive('-f', iotests.imgfmt, disk)
+        for cmd in read_cmds:
+            out = proc.cmd(cmd)
+            self.assertFalse('Pattern verification failed' in str(out))
+        proc.close()
+
+
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2'], required_fmts=['preallocate'])
diff --git a/tests/qemu-iotests/298.out b/tests/qemu-iotests/298.out
index fa16b5ccef..6323079e08 100644
--- a/tests/qemu-iotests/298.out
+++ b/tests/qemu-iotests/298.out
@@ -1,5 +1,5 @@
-.............
+..............
 ----------------------------------------------------------------------
-Ran 13 tests
+Ran 14 tests
 
 OK
-- 
2.39.3
Re: [PATCH v3 2/3] iotests/298: add testcase for async writes with preallocation filter
Posted by Kevin Wolf 2 months, 1 week ago
Am 16.07.2024 um 16:41 hat Andrey Drobyshev geschrieben:
> The testcase simply creates a 64G image with 1M clusters, generates a list
> of 1M aligned offsets and feeds aio_write commands with those offsets to
> qemu-io run with '--aio native --nocache'.  Then we check the data
> written at each of the offsets.  Before the previous commit this could
> result into a race within the preallocation filter which would zeroize
> some clusters after actually writing data to them.
> 
> Note: the test doesn't fail in 100% cases as there's a race involved,
> but the failures are pretty consistent so it should be good enough for
> detecting the problem.
> 
> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>

I left it running in a loop for a while, but couldn't reproduce the bug
with this test.

>  tests/qemu-iotests/298     | 49 ++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/298.out |  4 ++--
>  2 files changed, 51 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/qemu-iotests/298 b/tests/qemu-iotests/298
> index 09c9290711..b7126e9e15 100755
> --- a/tests/qemu-iotests/298
> +++ b/tests/qemu-iotests/298
> @@ -20,8 +20,10 @@
>  
>  import os
>  import iotests
> +import random
>  
>  MiB = 1024 * 1024
> +GiB = MiB * 1024
>  disk = os.path.join(iotests.test_dir, 'disk')
>  overlay = os.path.join(iotests.test_dir, 'overlay')
>  refdisk = os.path.join(iotests.test_dir, 'refdisk')
> @@ -176,5 +178,52 @@ class TestTruncate(iotests.QMPTestCase):
>          self.do_test('off', '150M')
>  
>  
> +class TestPreallocAsyncWrites(iotests.QMPTestCase):
> +    def setUp(self):
> +        # Make sure we get reproducible write patterns on each run
> +        random.seed(42)
> +        iotests.qemu_img_create('-f', iotests.imgfmt, disk, '-o',
> +                                f'cluster_size={MiB},lazy_refcounts=on',
> +                                str(64 * GiB))
> +
> +    def tearDown(self):
> +        os.remove(disk)
> +
> +    def test_prealloc_async_writes(self):
> +        def gen_write_pattern():
> +            n = 0
> +            while True:
> +                yield '-P 0xaa' if n else '-z'
> +                n = 1 - n

This looks like a complicated way to write the following?

    # Alternate between write_zeroes and writing data
    def gen_write_pattern():
        while True:
            yield '-z'
            yield '-P 0xaa'

> +        def gen_read_pattern():
> +            n = 0
> +            while True:
> +                yield '-P 0xaa' if n else '-P 0x00'
> +                n = 1 - n

Same here.

Kevin
Re: [PATCH v3 2/3] iotests/298: add testcase for async writes with preallocation filter
Posted by Andrey Drobyshev 2 months, 1 week ago
On 8/5/24 3:04 PM, Kevin Wolf wrote:
> Am 16.07.2024 um 16:41 hat Andrey Drobyshev geschrieben:
>> The testcase simply creates a 64G image with 1M clusters, generates a list
>> of 1M aligned offsets and feeds aio_write commands with those offsets to
>> qemu-io run with '--aio native --nocache'.  Then we check the data
>> written at each of the offsets.  Before the previous commit this could
>> result into a race within the preallocation filter which would zeroize
>> some clusters after actually writing data to them.
>>
>> Note: the test doesn't fail in 100% cases as there's a race involved,
>> but the failures are pretty consistent so it should be good enough for
>> detecting the problem.
>>
>> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> 
> I left it running in a loop for a while, but couldn't reproduce the bug
> with this test.
> 

Hmmm, it seems to have stopped failing on my setup as well, no matter
how I increase the number of requests.  And it seems to be related to
the interleaving 'write-zeroes' requests.  My initial attempt was to
cover the case described by Vladimir here:
https://lists.nongnu.org/archive/html/qemu-block/2024-07/msg00415.html
Maybe we just leave it and try reproducing the corruption with just
regular write requests?  At least with this version it seems to be
failing pretty stably on my setup:

> +    def test_prealloc_async_writes(self):
> +        requests = 2048 # Number of write/read requests to feed to qemu-io
> +        total_clusters = 64 * 1024 # 64G / 1M
> +
> +        offsets = random.sample(range(0, total_clusters), requests)
> +        aio_write_cmds = [f'aio_write -P 0xaa  {off}M 1M' for off in offsets]
> +        read_cmds = [f'read -P 0xaa {off}M 1M' for off in offsets]
> +
> +        proc = iotests.QemuIoInteractive('--aio', 'native', '--nocache',
> +                                         '--image-opts', drive_opts)
> +        for cmd in aio_write_cmds:
> +            proc.cmd(cmd)
> +        proc.close()
> +
> +        proc = iotests.QemuIoInteractive('-f', iotests.imgfmt, disk)
> +        for cmd in read_cmds:
> +            out = proc.cmd(cmd)
> +            self.assertFalse('Pattern verification failed' in str(out))
> +        proc.close()
> +



>>  tests/qemu-iotests/298     | 49 ++++++++++++++++++++++++++++++++++++++
>>  tests/qemu-iotests/298.out |  4 ++--
>>  2 files changed, 51 insertions(+), 2 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/298 b/tests/qemu-iotests/298
>> index 09c9290711..b7126e9e15 100755
>> --- a/tests/qemu-iotests/298
>> +++ b/tests/qemu-iotests/298
>> @@ -20,8 +20,10 @@
>>  
>>  import os
>>  import iotests
>> +import random
>>  
>>  MiB = 1024 * 1024
>> +GiB = MiB * 1024
>>  disk = os.path.join(iotests.test_dir, 'disk')
>>  overlay = os.path.join(iotests.test_dir, 'overlay')
>>  refdisk = os.path.join(iotests.test_dir, 'refdisk')
>> @@ -176,5 +178,52 @@ class TestTruncate(iotests.QMPTestCase):
>>          self.do_test('off', '150M')
>>  
>>  
>> +class TestPreallocAsyncWrites(iotests.QMPTestCase):
>> +    def setUp(self):
>> +        # Make sure we get reproducible write patterns on each run
>> +        random.seed(42)
>> +        iotests.qemu_img_create('-f', iotests.imgfmt, disk, '-o',
>> +                                f'cluster_size={MiB},lazy_refcounts=on',
>> +                                str(64 * GiB))
>> +
>> +    def tearDown(self):
>> +        os.remove(disk)
>> +
>> +    def test_prealloc_async_writes(self):
>> +        def gen_write_pattern():
>> +            n = 0
>> +            while True:
>> +                yield '-P 0xaa' if n else '-z'
>> +                n = 1 - n
> 
> This looks like a complicated way to write the following?
> 
>     # Alternate between write_zeroes and writing data
>     def gen_write_pattern():
>         while True:
>             yield '-z'
>             yield '-P 0xaa'
>

Agreed, thank you :)  Won't need this chunk though if we end up adopting
the version I posted above.

>> +        def gen_read_pattern():
>> +            n = 0
>> +            while True:
>> +                yield '-P 0xaa' if n else '-P 0x00'
>> +                n = 1 - n
> 
> Same here.
> 
> Kevin
>
Re: [PATCH v3 2/3] iotests/298: add testcase for async writes with preallocation filter
Posted by Kevin Wolf 2 months, 1 week ago
Am 05.08.2024 um 14:56 hat Andrey Drobyshev geschrieben:
> On 8/5/24 3:04 PM, Kevin Wolf wrote:
> > Am 16.07.2024 um 16:41 hat Andrey Drobyshev geschrieben:
> >> The testcase simply creates a 64G image with 1M clusters, generates a list
> >> of 1M aligned offsets and feeds aio_write commands with those offsets to
> >> qemu-io run with '--aio native --nocache'.  Then we check the data
> >> written at each of the offsets.  Before the previous commit this could
> >> result into a race within the preallocation filter which would zeroize
> >> some clusters after actually writing data to them.
> >>
> >> Note: the test doesn't fail in 100% cases as there's a race involved,
> >> but the failures are pretty consistent so it should be good enough for
> >> detecting the problem.
> >>
> >> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> > 
> > I left it running in a loop for a while, but couldn't reproduce the bug
> > with this test.
> > 
> 
> Hmmm, it seems to have stopped failing on my setup as well, no matter
> how I increase the number of requests.  And it seems to be related to
> the interleaving 'write-zeroes' requests.  My initial attempt was to
> cover the case described by Vladimir here:
> https://lists.nongnu.org/archive/html/qemu-block/2024-07/msg00415.html
> Maybe we just leave it and try reproducing the corruption with just
> regular write requests?  At least with this version it seems to be
> failing pretty stably on my setup:
> 
> > +    def test_prealloc_async_writes(self):
> > +        requests = 2048 # Number of write/read requests to feed to qemu-io
> > +        total_clusters = 64 * 1024 # 64G / 1M
> > +
> > +        offsets = random.sample(range(0, total_clusters), requests)
> > +        aio_write_cmds = [f'aio_write -P 0xaa  {off}M 1M' for off in offsets]
> > +        read_cmds = [f'read -P 0xaa {off}M 1M' for off in offsets]
> > +
> > +        proc = iotests.QemuIoInteractive('--aio', 'native', '--nocache',
> > +                                         '--image-opts', drive_opts)
> > +        for cmd in aio_write_cmds:
> > +            proc.cmd(cmd)
> > +        proc.close()
> > +
> > +        proc = iotests.QemuIoInteractive('-f', iotests.imgfmt, disk)
> > +        for cmd in read_cmds:
> > +            out = proc.cmd(cmd)
> > +            self.assertFalse('Pattern verification failed' in str(out))
> > +        proc.close()
> > +

This doesn't seem to fail for me either. Neither on tmpfs nor in my home
directory (which is XFS on an encrypted LVM volume).

Are you using some more complicated setup than "./check -qcow2 298"?

Do you think we could use blkdebug to deterministically trigger the case
instead of trying to brute force it? If we can suspend the write_zeroes
request on the child node of preallocate, I think that's all we need to
reproduce the bug as described in the commit message of the fix.

Kevin