From: Andrey Drobyshev
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, hreitz@redhat.com, kwolf@redhat.com,
 vsementsov@yandex-team.ru, pbonzini@redhat.com, eesposit@redhat.com,
 andrey.drobyshev@virtuozzo.com, den@virtuozzo.com
Subject: [PATCH 1/2] block: zero data data corruption using prealloc-filter
Date: Thu, 11 Jul 2024 16:32:41 +0300
Message-Id: <20240711133242.251061-2-andrey.drobyshev@virtuozzo.com>
In-Reply-To: <20240711133242.251061-1-andrey.drobyshev@virtuozzo.com>
References: <20240711133242.251061-1-andrey.drobyshev@virtuozzo.com>

From: "Denis V. Lunev"

We have observed that some clusters in QCOW2 files are zeroed when the
preallocation filter is used.

We were able to trace the following sequence when prealloc-filter is
used:
    co=0x55e7cbed7680 qcow2_co_pwritev_task()
    co=0x55e7cbed7680 preallocate_co_pwritev_part()
    co=0x55e7cbed7680 handle_write()
    co=0x55e7cbed7680 bdrv_co_do_pwrite_zeroes()
    co=0x55e7cbed7680 raw_do_pwrite_zeroes()
    co=0x7f9edb7fe500 do_fallocate()

Here coroutine 0x55e7cbed7680 is blocked, waiting for coroutine
0x7f9edb7fe500 to finish the fallocate of the file area. OK. It is time
to handle the next coroutine, which goes through the same call chain:
    co=0x55e7cbee91b0 qcow2_co_pwritev_task()
    co=0x55e7cbee91b0 preallocate_co_pwritev_part()
    co=0x55e7cbee91b0 handle_write()
    co=0x55e7cbee91b0 bdrv_co_do_pwrite_zeroes()
    co=0x55e7cbee91b0 raw_do_pwrite_zeroes()
    co=0x7f9edb7deb00 do_fallocate()

The trouble comes here. Coroutine 0x55e7cbed7680 has not advanced
file_end yet, so coroutine 0x55e7cbee91b0 starts fallocate() for the
same area. This means that once fallocate is started inside
0x7f9edb7deb00, the original fallocate could end and the real write
will be executed. In that case the write() request is handled at the
same time as fallocate().

s->file_end should be protected from the moment it is detected that
preallocation is needed.
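The interleaving described above can be reproduced in miniature outside
QEMU. What follows is only a toy model written with Python's asyncio, not
QEMU code: FakeFile, NoLock, write_cluster, PREALLOC and the tick counts
are hypothetical names and values chosen for illustration, and
asyncio.Lock merely stands in for a coroutine mutex. Two writers both see
a stale file_end, and the slower zeroing pass lands after the first
writer's data.

```python
import asyncio

PREALLOC = 4  # hypothetical preallocation step, in clusters


class FakeFile:
    """Toy stand-in for the image file: one state string per cluster."""

    def __init__(self):
        self.clusters = {}  # cluster index -> "zero" or "data"
        self.file_end = 0   # end of the preallocated area, in clusters

    async def fallocate_zero(self, start, end, ticks):
        # Yield to the event loop `ticks` times to model a slow fallocate,
        # then zero the whole range in one go when it "completes".
        for _ in range(ticks):
            await asyncio.sleep(0)
        for i in range(start, end):
            self.clusters[i] = "zero"


class NoLock:
    """Async context manager that does nothing (the unprotected case)."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc):
        return False


async def write_cluster(f, idx, ticks, lock):
    async with lock:
        if idx >= f.file_end:
            # file_end is read before the yield and updated only after it.
            start, end = f.file_end, idx + 1 + PREALLOC
            await f.fallocate_zero(start, end, ticks)
            f.file_end = max(f.file_end, end)
    f.clusters[idx] = "data"  # the real write


async def run(use_lock):
    f = FakeFile()
    lock = asyncio.Lock() if use_lock else NoLock()
    # Writer A (cluster 0) has a fast fallocate, writer B (cluster 1) a slow one.
    await asyncio.gather(write_cluster(f, 0, 1, lock),
                         write_cluster(f, 1, 5, lock))
    return f.clusters[0], f.clusters[1]


def simulate(use_lock):
    return asyncio.run(run(use_lock))


if __name__ == "__main__":
    print("without lock:", simulate(False))
    print("with lock:   ", simulate(True))
```

With these tick counts, the unlocked run ends with cluster 0 reading back
as zeroes even though data was written to it. The locked run keeps both
writes, because the second writer checks file_end only after the first
one has updated it.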
The patch introduces file_end_lock to protect s->file_end when running
in coroutine context.

Note: the lock is taken only once it is detected that preallocation is
really required. By the nature of preallocation this is not a frequent
case, so the patch should not have a performance impact.

Originally-by: Denis V. Lunev
Co-authored-by: Andrey Drobyshev
Signed-off-by: Denis V. Lunev
Signed-off-by: Andrey Drobyshev
---
 block/preallocate.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/preallocate.c b/block/preallocate.c
index d215bc5d6d..9cb2c97635 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -78,6 +78,8 @@ typedef struct BDRVPreallocateState {
 
     /* Gives up the resize permission on children when parents don't need it */
     QEMUBH *drop_resize_bh;
+
+    CoMutex file_end_lock;
 } BDRVPreallocateState;
 
 static int preallocate_drop_resize(BlockDriverState *bs, Error **errp);
@@ -170,6 +172,8 @@ static int preallocate_open(BlockDriverState *bs, QDict *options, int flags,
         ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
             bs->file->bs->supported_zero_flags);
 
+    qemu_co_mutex_init(&s->file_end_lock);
+
     return 0;
 }
 
@@ -342,6 +346,7 @@ handle_write(BlockDriverState *bs, int64_t offset, int64_t bytes,
             return false;
         }
 
+        QEMU_LOCK_GUARD(&s->file_end_lock);
         if (s->file_end < 0) {
             s->file_end = s->data_end;
         }
@@ -353,6 +358,8 @@ handle_write(BlockDriverState *bs, int64_t offset, int64_t bytes,
 
     /* We have valid s->data_end, and request writes beyond it. */
 
+    QEMU_LOCK_GUARD(&s->file_end_lock);
+
     s->data_end = end;
     if (s->zero_start < 0 || !want_merge_zero) {
         s->zero_start = end;
@@ -428,6 +435,8 @@ preallocate_co_truncate(BlockDriverState *bs, int64_t offset,
     BDRVPreallocateState *s = bs->opaque;
     int ret;
 
+    QEMU_LOCK_GUARD(&s->file_end_lock);
+
     if (s->data_end >= 0 && offset > s->data_end) {
         if (s->file_end < 0) {
             s->file_end = bdrv_co_getlength(bs->file->bs);
@@ -501,6 +510,8 @@ preallocate_co_getlength(BlockDriverState *bs)
         return s->data_end;
     }
 
+    QEMU_LOCK_GUARD(&s->file_end_lock);
+
     ret = bdrv_co_getlength(bs->file->bs);
 
     if (has_prealloc_perms(bs)) {
-- 
2.39.3


From: Andrey Drobyshev
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, hreitz@redhat.com, kwolf@redhat.com,
 vsementsov@yandex-team.ru, pbonzini@redhat.com, eesposit@redhat.com,
 andrey.drobyshev@virtuozzo.com, den@virtuozzo.com
Subject: [PATCH 2/2] iotests/298: add testcase for async writes with preallocation filter
Date: Thu, 11 Jul 2024 16:32:42 +0300
Message-Id: <20240711133242.251061-3-andrey.drobyshev@virtuozzo.com>
In-Reply-To: <20240711133242.251061-1-andrey.drobyshev@virtuozzo.com>
References: <20240711133242.251061-1-andrey.drobyshev@virtuozzo.com>

The testcase creates a 64G image with 1M clusters, generates a list of
1M-aligned offsets and feeds aio_write commands with those offsets to
qemu-io run with '--aio native --nocache'. Then we check the data
written at each of the offsets. Before the previous commit this could
result in a race within the preallocation filter which would zero some
clusters after data had actually been written to them.

Note: the test doesn't fail in 100% of cases as there's a race involved,
but the failures are pretty consistent so it should be good enough for
detecting the problem.
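As a rough illustration of the command stream described above, the sketch
below builds the offsets and the qemu-io command strings in plain Python,
without the iotests harness. build_commands and the two constants are
hypothetical names that are not part of the test, and the sketch uses a
private random.Random instance rather than the module-level generator.

```python
import random

CLUSTERS_PER_GIB = 1024  # with 1M clusters
IMAGE_GIB = 64


def build_commands(requests=1024, seed=42):
    """Return (offsets, aio_write commands, read commands) for qemu-io."""
    rng = random.Random(seed)  # fixed seed: same offsets on every run
    total_clusters = IMAGE_GIB * CLUSTERS_PER_GIB  # 64G / 1M
    # sample() picks distinct cluster indices, so no offset is written twice
    offsets = rng.sample(range(total_clusters), requests)
    writes = [f"aio_write -P 0xaa {off}M 1M" for off in offsets]
    reads = [f"read -P 0xaa {off}M 1M" for off in offsets]
    return offsets, writes, reads


if __name__ == "__main__":
    offsets, writes, reads = build_commands(requests=4)
    for w, r in zip(writes, reads):
        print(w, "|", r)
```

Because the clusters are 1M in size, each sampled cluster index doubles as
the offset in megabytes, which is why the index can be used directly in
the "{off}M" argument.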

Signed-off-by: Andrey Drobyshev
---
 tests/qemu-iotests/298     | 34 ++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/298.out |  4 ++--
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/298 b/tests/qemu-iotests/298
index 09c9290711..d1bf5ee0df 100755
--- a/tests/qemu-iotests/298
+++ b/tests/qemu-iotests/298
@@ -20,8 +20,10 @@
 
 import os
 import iotests
+import random
 
 MiB = 1024 * 1024
+GiB = MiB * 1024
 disk = os.path.join(iotests.test_dir, 'disk')
 overlay = os.path.join(iotests.test_dir, 'overlay')
 refdisk = os.path.join(iotests.test_dir, 'refdisk')
@@ -176,5 +178,37 @@ class TestTruncate(iotests.QMPTestCase):
         self.do_test('off', '150M')
 
 
+class TestPreallocAsyncWrites(iotests.QMPTestCase):
+    def setUp(self):
+        # Make sure we get reproducible write patterns on each run
+        random.seed(42)
+        iotests.qemu_img_create('-f', iotests.imgfmt, disk, '-o',
+                                f'cluster_size={MiB},lazy_refcounts=on',
+                                str(64 * GiB))
+
+    def tearDown(self):
+        os.remove(disk)
+
+    def test_prealloc_async_writes(self):
+        requests = 1024 # Number of write/read requests to feed to qemu-io
+        total_clusters = 64 * 1024 # 64G / 1M
+
+        offsets = random.sample(range(0, total_clusters), requests)
+        aio_write_cmds = [f'aio_write -P 0xaa {off}M 1M' for off in offsets]
+        read_cmds = [f'read -P 0xaa {off}M 1M' for off in offsets]
+
+        proc = iotests.QemuIoInteractive('--aio', 'native', '--nocache',
+                                         '--image-opts', drive_opts)
+        for cmd in aio_write_cmds:
+            proc.cmd(cmd)
+        proc.close()
+
+        proc = iotests.QemuIoInteractive('-f', iotests.imgfmt, disk)
+        for cmd in read_cmds:
+            out = proc.cmd(cmd)
+            self.assertFalse('Pattern verification failed' in str(out))
+        proc.close()
+
+
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2'], required_fmts=['preallocate'])
diff --git a/tests/qemu-iotests/298.out b/tests/qemu-iotests/298.out
index fa16b5ccef..6323079e08 100644
--- a/tests/qemu-iotests/298.out
+++ b/tests/qemu-iotests/298.out
@@ -1,5 +1,5 @@
-.............
+..............
 ----------------------------------------------------------------------
-Ran 13 tests
+Ran 14 tests
 
 OK
-- 
2.39.3