From: Stefan Hajnoczi
Date: Sat, 12 May 2018 10:28:23 +0100
Message-Id: <20180512092824.13848-8-stefanha@redhat.com>
In-Reply-To: <20180512092824.13848-1-stefanha@redhat.com>
References: <20180512092824.13848-1-stefanha@redhat.com>
Subject: [Qemu-devel] [PULL 7/8] block/file-posix: implement bdrv_co_invalidate_cache() on Linux
Cc: Peter Maydell, qemu-block@nongnu.org

On Linux posix_fadvise(POSIX_FADV_DONTNEED) invalidates pages*. Use
this to drop page cache on the destination host during shared storage
migration. This way the destination host will read the latest copy of
the data and will not use stale data from the page cache.

The flow is as follows:

1. Source host writes out all dirty pages and inactivates drives.
2. QEMU_VM_EOF is sent on migration stream.
3. Destination host invalidates caches before accessing drives.

This patch enables live migration even with -drive cache.direct=off.

* Terms and conditions may apply, please see patch for details.

Signed-off-by: Stefan Hajnoczi
Reviewed-by: Fam Zheng
Message-id: 20180427162312.18583-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi
---
 block/file-posix.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index 3794c0007a..3707ea2d1c 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2236,6 +2236,49 @@ static int coroutine_fn raw_co_block_status(BlockDriverState *bs,
     return ret | BDRV_BLOCK_OFFSET_VALID;
 }
 
+static void coroutine_fn raw_co_invalidate_cache(BlockDriverState *bs,
+                                                 Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    int ret;
+
+    ret = fd_open(bs);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "The file descriptor is not open");
+        return;
+    }
+
+    if (s->open_flags & O_DIRECT) {
+        return; /* No host kernel page cache */
+    }
+
+#if defined(__linux__)
+    /* This sets the scene for the next syscall... */
+    ret = bdrv_co_flush(bs);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "flush failed");
+        return;
+    }
+
+    /* Linux does not invalidate pages that are dirty, locked, or mmapped by a
+     * process. These limitations are okay because we just fsynced the file,
+     * we don't use mmap, and the file should not be in use by other processes.
+     */
+    ret = posix_fadvise(s->fd, 0, 0, POSIX_FADV_DONTNEED);
+    if (ret != 0) { /* the return value is a positive errno */
+        error_setg_errno(errp, ret, "fadvise failed");
+        return;
+    }
+#else /* __linux__ */
+    /* Do nothing. Live migration to a remote host with cache.direct=off is
+     * unsupported on other host operating systems. Cache consistency issues
+     * may occur but no error is reported here, partly because that's the
+     * historical behavior and partly because it's hard to differentiate valid
+     * configurations that should not cause errors.
+     */
+#endif /* !__linux__ */
+}
+
 static coroutine_fn BlockAIOCB *raw_aio_pdiscard(BlockDriverState *bs,
     int64_t offset, int bytes,
     BlockCompletionFunc *cb, void *opaque)
@@ -2328,6 +2371,7 @@ BlockDriver bdrv_file = {
     .bdrv_co_create_opts = raw_co_create_opts,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,
     .bdrv_co_block_status = raw_co_block_status,
+    .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
     .bdrv_co_pwrite_zeroes = raw_co_pwrite_zeroes,
 
     .bdrv_co_preadv = raw_co_preadv,
@@ -2805,6 +2849,7 @@ static BlockDriver bdrv_host_device = {
     .bdrv_reopen_abort = raw_reopen_abort,
     .bdrv_co_create_opts = hdev_co_create_opts,
     .create_opts = &raw_create_opts,
+    .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
     .bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes,
 
     .bdrv_co_preadv = raw_co_preadv,
@@ -2927,6 +2972,7 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_reopen_abort = raw_reopen_abort,
     .bdrv_co_create_opts = hdev_co_create_opts,
     .create_opts = &raw_create_opts,
+    .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
 
 
     .bdrv_co_preadv = raw_co_preadv,
-- 
2.17.0
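
For readers who want to try the flush-then-fadvise sequence outside of QEMU, here is a minimal standalone sketch. It is not QEMU code: drop_page_cache() is a hypothetical helper name, and plain fsync() stands in for bdrv_co_flush(), on the assumption that the caller owns an ordinary file descriptor opened without O_DIRECT. As in the patch, the posix_fadvise() result is treated as a positive errno value.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose posix_fadvise() and fsync() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not part of QEMU): write back
 * and then ask the kernel to drop the page cache for an open file.
 * Returns 0 on success or a positive errno value on failure.
 */
static int drop_page_cache(int fd)
{
    /* Dirty pages are not invalidated by POSIX_FADV_DONTNEED, so write
     * them back first. fsync() reports failure as -1 plus errno. */
    if (fsync(fd) < 0) {
        return errno;
    }

    /* Offset 0 with length 0 means "from the start to the end of the
     * file". posix_fadvise() returns a positive errno directly and does
     * not set errno. */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

A caller would invoke drop_page_cache(fd) before re-reading data that another host may have changed. The advice is only a hint, and pages that are locked or mmapped by another process can remain cached, which matches the caveat in the patch comment.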