From: zhanghailiang
To: ,
Date: Wed, 12 Apr 2017 22:05:20 +0800
Message-ID: <1492005921-15664-6-git-send-email-zhang.zhanghailiang@huawei.com>
In-Reply-To:
 <1492005921-15664-1-git-send-email-zhang.zhanghailiang@huawei.com>
References: <1492005921-15664-1-git-send-email-zhang.zhanghailiang@huawei.com>
Subject: [Qemu-devel] [PATCH v4 5/6] replication: Implement block replication for shared disk case
Cc: kwolf@redhat.com, xiecl.fnst@cn.fujitsu.com, zhangchen.fnst@cn.fujitsu.com, Wen Congyang, qemu-block@nongnu.org, zhanghailiang

As in the non-shared disk case, block replication for a shared disk is
built from basic building blocks that already exist in QEMU.
The architecture is:

         virtio-blk                     ||                               .----------
             /                          ||                               | Secondary
            /                           ||                               '----------
           /                            ||                                 virtio-blk
          /                             ||                                      |
          |                             ||                               replication(5)
          |                    NBD  -------->   NBD   (2)                       |
          |                  client     ||    server ---> hidden disk <-- active disk(4)
          |                     ^       ||                     |
          |              replication(1) ||                     |
          |                     |       ||                     |
          |   +-----------------'       ||                     |
         (3)  |drive-backup sync=none   ||                     |
--------. |   +-----------------+       ||                     |
Primary | |                     |       ||          backing    |
--------' |                     |       ||                     |
          V                     |                              |
      +-------------------------------------------+            |
      |                shared disk                | <----------+
      +-------------------------------------------+

    1) On a Primary write, the original data is read first and forwarded
       to the Secondary QEMU.
    2) The hidden-disk is created automatically. It buffers the original
       content that is modified by the primary VM. It should also be an
       empty disk, and its driver must support bdrv_make_empty() and
       backing files.
    3) Primary write requests are written to the shared disk.
    4) Secondary write requests are buffered in the active disk,
       overwriting any existing sector content in the buffer.

Signed-off-by: zhanghailiang
Signed-off-by: Wen Congyang
Signed-off-by: Zhang Chen
---
v4:
- Call bdrv_invalidate_cache() when doing a checkpoint for the shared
  disk case
---
 block/replication.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 52 insertions(+), 6 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 3a35471..fb604e5 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -253,7 +253,7 @@ static coroutine_fn int replication_co_readv(BlockDriverState *bs,
                                              QEMUIOVector *qiov)
 {
     BDRVReplicationState *s = bs->opaque;
-    BdrvChild *child = s->secondary_disk;
+    BdrvChild *child = s->is_shared_disk ? s->primary_disk : s->secondary_disk;
     BlockJob *job = NULL;
     CowRequest req;
     int ret;
@@ -435,7 +435,12 @@ static void backup_job_completed(void *opaque, int ret)
         s->error = -EIO;
     }
 
-    backup_job_cleanup(bs);
+    if (s->mode == REPLICATION_MODE_PRIMARY) {
+        s->replication_state = BLOCK_REPLICATION_DONE;
+        s->error = 0;
+    } else {
+        backup_job_cleanup(bs);
+    }
 }
 
 static bool check_top_bs(BlockDriverState *top_bs, BlockDriverState *bs)
@@ -487,6 +492,19 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
 
     switch (s->mode) {
     case REPLICATION_MODE_PRIMARY:
+        if (s->is_shared_disk) {
+            job = backup_job_create(NULL, s->primary_disk->bs, bs, 0,
+                MIRROR_SYNC_MODE_NONE, NULL, false, BLOCKDEV_ON_ERROR_REPORT,
+                BLOCKDEV_ON_ERROR_REPORT, BLOCK_JOB_INTERNAL,
+                backup_job_completed, bs, NULL, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                backup_job_cleanup(bs);
+                aio_context_release(aio_context);
+                return;
+            }
+            block_job_start(job);
+        }
         break;
     case REPLICATION_MODE_SECONDARY:
         s->active_disk = bs->file;
@@ -505,7 +523,8 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
         }
 
         s->secondary_disk = s->hidden_disk->bs->backing;
-        if (!s->secondary_disk->bs || !bdrv_has_blk(s->secondary_disk->bs)) {
+        if (!s->secondary_disk->bs ||
+            (!s->is_shared_disk && !bdrv_has_blk(s->secondary_disk->bs))) {
             error_setg(errp, "The secondary disk doesn't have block backend");
             aio_context_release(aio_context);
             return;
@@ -600,11 +619,24 @@ static void replication_do_checkpoint(ReplicationState *rs, Error **errp)
 
     switch (s->mode) {
     case REPLICATION_MODE_PRIMARY:
+        if (s->is_shared_disk) {
+            if (!s->primary_disk->bs->job) {
+                error_setg(errp, "Primary backup job was cancelled"
+                           " unexpectedly");
+                break;
+            }
+
+            backup_do_checkpoint(s->primary_disk->bs->job, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+            }
+        }
         break;
     case REPLICATION_MODE_SECONDARY:
         if (!s->is_shared_disk) {
             if (!s->secondary_disk->bs->job) {
-                error_setg(errp, "Backup job was cancelled unexpectedly");
+                error_setg(errp, "Secondary backup job was cancelled"
+                           " unexpectedly");
                 break;
             }
             backup_do_checkpoint(s->secondary_disk->bs->job, &local_err);
@@ -612,6 +644,16 @@ static void replication_do_checkpoint(ReplicationState *rs, Error **errp)
                 error_propagate(errp, local_err);
                 break;
             }
+        } else {
+            /*
+             * For shared disk, we need to force SVM to re-read metadata
+             * that is loaded in memory, or it will be inconsistent.
+             */
+            bdrv_invalidate_cache(s->secondary_disk->bs, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                break;
+            }
         }
         secondary_do_checkpoint(s, errp);
         break;
@@ -683,8 +725,12 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
 
     switch (s->mode) {
     case REPLICATION_MODE_PRIMARY:
-        s->replication_state = BLOCK_REPLICATION_DONE;
-        s->error = 0;
+        if (s->is_shared_disk && s->primary_disk->bs->job) {
+            block_job_cancel(s->primary_disk->bs->job);
+        } else {
+            s->replication_state = BLOCK_REPLICATION_DONE;
+            s->error = 0;
+        }
         break;
     case REPLICATION_MODE_SECONDARY:
         /*
-- 
1.8.3.1