Recent QEMU changes around preallocate_set_perm() mandate that it is no
longer possible to poll on the aio_context inside this function. Thus the
truncate operation has been moved into a bottom half. This bottom half is
scheduled from preallocate_set_perm() and that is all.

This approach has proven to be problematic in a lot of places once
additional operations are executed over the preallocate filter in
production. The code validates that permissions have really been changed
just after the call to the set operation.

All permission operations and block driver graph changes are performed in
the quiescent state in terms of the block layer. This means that there are
no in-flight requests, which is guaranteed by passing through a
bdrv_drain() section.

The idea is that we should effectively disable the preallocate filter
inside bdrv_drain() and unblock permission changes. This section is
definitely not on the hot path and a single additional truncate operation
will not hurt.

Unfortunately, according to the documentation, the bdrv_drain_begin()
callback also disallows waiting inside. Thus the original approach with
the bottom half is not changed. bdrv_drain_begin() schedules the operation
and, in order to ensure that it has really been executed before the
section completes, increments the number of in-flight requests.

In addition to this, we should disallow lifting the WRITE permission while
the truncate() operation has not fully completed yet.

Changes from v1:
- rebased to the latest master

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Kevin Wolf <kwolf@redhat.com>
RW permissions must not be lifted from the preallocation filter while the
truncate operation has not finished. Otherwise a WRITE operation (image
truncate) would be issued after the return from the inactivate call. This
is definitely a contract violation.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Kevin Wolf <kwolf@redhat.com>
---
 block/preallocate.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/block/preallocate.c b/block/preallocate.c
index XXXXXXX..XXXXXXX 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -XXX,XX +XXX,XX @@ static void preallocate_child_perm(BlockDriverState *bs, BdrvChild *c,
     }
 }

+static int preallocate_check_perm(BlockDriverState *bs, uint64_t perm,
+                                  uint64_t shared, Error **errp)
+{
+    BDRVPreallocateState *s = bs->opaque;
+    if (!can_write_resize(perm) && s->data_end != -EINVAL) {
+        error_setg_errno(errp, EPERM, "Write access is required for truncate");
+        return -EPERM;
+    }
+    return 0;
+}
+
 static BlockDriver bdrv_preallocate_filter = {
     .format_name = "preallocate",
     .instance_size = sizeof(BDRVPreallocateState),
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_preallocate_filter = {

     .bdrv_set_perm = preallocate_set_perm,
     .bdrv_child_perm = preallocate_child_perm,
+    .bdrv_check_perm = preallocate_check_perm,

     .is_filter = true,
 };
-- 
2.43.5
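The gating condition in preallocate_check_perm() can be exercised in
isolation. What follows is only a reduced sketch, not QEMU code: ModelState,
the MODEL_PERM_* bits and the model_* functions are hypothetical stand-ins,
and model_can_write_resize() assumes that can_write_resize() requires both
the write and the resize bits.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's BLK_PERM_* bits; values are arbitrary. */
enum { MODEL_PERM_WRITE = 1u << 0, MODEL_PERM_RESIZE = 1u << 1 };

/* Hypothetical, reduced model of BDRVPreallocateState. */
typedef struct ModelState {
    int64_t data_end;   /* -EINVAL while the filter is inactive */
} ModelState;

/* Assumption: both bits are required, as the helper name suggests. */
static bool model_can_write_resize(uint64_t perm)
{
    return (perm & MODEL_PERM_WRITE) && (perm & MODEL_PERM_RESIZE);
}

/* Mirrors the condition added by the patch: refuse to drop write/resize
 * while the filter still tracks preallocated data. */
static int model_check_perm(const ModelState *s, uint64_t perm)
{
    if (!model_can_write_resize(perm) && s->data_end != -EINVAL) {
        return -EPERM;
    }
    return 0;
}
```

In this model an inactive filter (data_end == -EINVAL) accepts any
permission set, while an active one rejects every set that lacks either
bit.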
Recent QEMU changes around preallocate_set_perm() mandate that it is no
longer possible to poll on the aio_context inside this function. Thus the
truncate operation has been moved into a bottom half. This bottom half is
scheduled from preallocate_set_perm() and that is all.

This approach has proven to be problematic in a lot of places once
additional operations are executed over the preallocate filter in
production. The code validates that permissions have really been changed
just after the call to the set operation.

All permission operations and block driver graph changes are performed in
the quiescent state in terms of the block layer. This means that there are
no in-flight requests, which is guaranteed by passing through a
bdrv_drain() section.

The idea is that we should effectively disable the preallocate filter
inside bdrv_drain() and unblock permission changes. This section is
definitely not on the hot path and a single additional truncate operation
will not hurt.

Unfortunately, according to the documentation, the bdrv_drain_begin()
callback also disallows waiting inside. Thus the original approach with
the bottom half is not changed. bdrv_drain_begin() schedules the operation
and, in order to ensure that it has really been executed before the
section completes, increments the number of in-flight requests.

Signed-off-by: Denis V.
Lunev <den@openvz.org> CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> CC: Kevin Wolf <kwolf@redhat.com> --- block/preallocate.c | 42 ++++++++++++++++++++++++++++++++++++++---- tests/qemu-iotests/298 | 6 ++++-- 2 files changed, 42 insertions(+), 6 deletions(-) diff --git a/block/preallocate.c b/block/preallocate.c index XXXXXXX..XXXXXXX 100644 --- a/block/preallocate.c +++ b/block/preallocate.c @@ -XXX,XX +XXX,XX @@ typedef struct BDRVPreallocateState { /* Gives up the resize permission on children when parents don't need it */ QEMUBH *drop_resize_bh; + bool drop_resize_armed; } BDRVPreallocateState; static int preallocate_drop_resize(BlockDriverState *bs, Error **errp); @@ -XXX,XX +XXX,XX @@ static int preallocate_open(BlockDriverState *bs, QDict *options, int flags, */ s->file_end = s->zero_start = s->data_end = -EINVAL; s->drop_resize_bh = qemu_bh_new(preallocate_drop_resize_bh, bs); + s->drop_resize_armed = false; ret = bdrv_open_file_child(NULL, options, "file", bs, errp); if (ret < 0) { @@ -XXX,XX +XXX,XX @@ static void preallocate_close(BlockDriverState *bs) GLOBAL_STATE_CODE(); GRAPH_RDLOCK_GUARD_MAINLOOP(); - qemu_bh_cancel(s->drop_resize_bh); + assert(!s->drop_resize_armed); qemu_bh_delete(s->drop_resize_bh); if (s->data_end >= 0) { @@ -XXX,XX +XXX,XX @@ preallocate_drop_resize(BlockDriverState *bs, Error **errp) BDRVPreallocateState *s = bs->opaque; int ret; + s->drop_resize_armed = false; + if (s->data_end < 0) { return 0; } @@ -XXX,XX +XXX,XX @@ preallocate_drop_resize(BlockDriverState *bs, Error **errp) static void preallocate_drop_resize_bh(void *opaque) { + BlockDriverState *bs = opaque; + + /* + * In case of errors, we'll simply keep the exclusive lock on the image + * indefinitely. 
+ */ GLOBAL_STATE_CODE(); GRAPH_RDLOCK_GUARD_MAINLOOP(); @@ -XXX,XX +XXX,XX @@ static void preallocate_drop_resize_bh(void *opaque) * In case of errors, we'll simply keep the exclusive lock on the image * indefinitely. */ - preallocate_drop_resize(opaque, NULL); + preallocate_drop_resize(bs, NULL); + + bdrv_dec_in_flight(bs); } static void GRAPH_RDLOCK @@ -XXX,XX +XXX,XX @@ preallocate_set_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared) BDRVPreallocateState *s = bs->opaque; if (can_write_resize(perm)) { - qemu_bh_cancel(s->drop_resize_bh); if (s->data_end < 0) { s->data_end = s->file_end = s->zero_start = bs->file->bs->total_sectors * BDRV_SECTOR_SIZE; } } else { - qemu_bh_schedule(s->drop_resize_bh); + assert(!s->drop_resize_armed); + assert(s->data_end < 0); } } @@ -XXX,XX +XXX,XX @@ static int preallocate_check_perm(BlockDriverState *bs, uint64_t perm, return 0; } +static void preallocate_drain_begin(BlockDriverState *bs) +{ + BDRVPreallocateState *s = bs->opaque; + + if (s->data_end < 0) { + return; + } + if (s->drop_resize_armed) { + return; + } + if (s->data_end == s->file_end) { + s->file_end = s->zero_start = s->data_end = -EINVAL; + return; + } + + s->drop_resize_armed = true; + bdrv_inc_in_flight(bs); + qemu_bh_schedule(s->drop_resize_bh); +} + static BlockDriver bdrv_preallocate_filter = { .format_name = "preallocate", .instance_size = sizeof(BDRVPreallocateState), @@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_preallocate_filter = { .bdrv_open = preallocate_open, .bdrv_close = preallocate_close, + .bdrv_drain_begin = preallocate_drain_begin, + .bdrv_reopen_prepare = preallocate_reopen_prepare, .bdrv_reopen_commit = preallocate_reopen_commit, .bdrv_reopen_abort = preallocate_reopen_abort, diff --git a/tests/qemu-iotests/298 b/tests/qemu-iotests/298 index XXXXXXX..XXXXXXX 100755 --- a/tests/qemu-iotests/298 +++ b/tests/qemu-iotests/298 @@ -XXX,XX +XXX,XX @@ class TestPreallocateFilter(TestPreallocateBase): self.vm.cmd('block-commit', 
device='overlay') self.complete_and_wait() - # commit of new megabyte should trigger preallocation - self.check_big() + # commit of new megabyte should trigger preallocation, but drain + # will make file smaller + self.check_small() + def test_reopen_opts(self): self.vm.cmd('blockdev-reopen', options=[{ -- 2.43.5
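The interplay between preallocate_drain_begin(), the bottom half and the
in-flight counter can be illustrated with a toy model. This is a sketch
under stated assumptions, not the QEMU implementation: ModelFilter and all
model_* names are hypothetical, the bottom half is a plain flag, and
"truncate" is reduced to assigning disk_len.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced model of the filter node and its state. */
typedef struct ModelFilter {
    int64_t data_end;    /* < 0 means the filter is inactive */
    int64_t file_end;    /* end of the preallocated area, < 0 if inactive */
    int64_t disk_len;    /* length of the underlying file */
    bool armed;          /* models drop_resize_armed */
    bool bh_pending;     /* models a scheduled bottom half */
    unsigned in_flight;  /* models the node's in-flight request counter */
} ModelFilter;

static void model_disable(ModelFilter *f)
{
    f->data_end = f->file_end = -1;
}

/* Models drain_begin: it never waits, it only schedules work. */
static void model_drain_begin(ModelFilter *f)
{
    if (f->data_end < 0 || f->armed) {
        return;
    }
    if (f->data_end == f->file_end) {
        model_disable(f);          /* nothing to truncate */
        return;
    }
    f->armed = true;
    f->in_flight++;                /* keeps the drained section open */
    f->bh_pending = true;
}

/* Models the bottom half: truncate, disable, drop the in-flight ref. */
static void model_run_bh(ModelFilter *f)
{
    if (!f->bh_pending) {
        return;
    }
    f->bh_pending = false;
    f->armed = false;
    if (f->data_end >= 0) {
        f->disk_len = f->data_end; /* stands in for the truncate */
        model_disable(f);
    }
    assert(f->in_flight > 0);
    f->in_flight--;
}

/* Models the polling done by the drain caller, outside the callback. */
static void model_drain_wait(ModelFilter *f)
{
    while (f->in_flight > 0) {
        model_run_bh(f);
    }
}
```

The point of the counter is that the waiting happens in the caller of the
drained section rather than inside the callback: the section cannot
complete until the scheduled truncate has run and released its reference.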
Recent QEMU changes around preallocate_set_perm() mandate that it is no
longer possible to poll on the aio_context inside this function. Thus the
truncate operation has been moved into a bottom half. This bottom half is
scheduled from preallocate_set_perm() and that is all.

This approach has proven to be problematic in a lot of places once
additional operations are executed over the preallocate filter in
production. The code validates that permissions have really been changed
just after the call to the set operation.

A lot of time has been spent since the previous series stabilizing the
changes against a whole-product regression run. This is reflected in the
unit tests added. Without this series, 2 out of 3 of them are broken.

In general, the approach has been changed. We should not have an image
truncate inside permission handling routines. That would be incorrect:
e.g. truncate() could be executed after we have returned from
bdrv_inactivate(), when the ownership of the image does not belong to us
anymore.

It should be noted that the life cycle of the image length is quite
similar to that of the CBT, and thus the places where CBT is handled
should provide a good hint. This is just a guideline note.

Thus the most noticeable change is the addition of the
preallocate_inactivate() callback and the cleanup of all asynchronous
stuff.

Changes from v2:
- the series has been completely rethought

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
Let us assume that we have the following quite usual chain:

    QCOW2 -> preallocate-filter -> raw-file

In this case, during migration over shared storage, e.g. NFS, we go
through:

    bdrv_inactivate()
      bdrv_inactivate_recurse()
        qcow2_inactivate() <- writes a lot of data, e.g. CBT
        bdrv_get_cumulative_perm(qcow2)
        qcow2->open_flags |= BDRV_O_INACTIVE;
        bdrv_refresh_perms(qcow2)
        bdrv_inactivate_recurse(preallocate)
          preallocate_inactivate()
          bdrv_get_cumulative_perm(preallocate)
          preallocate->open_flags |= BDRV_O_INACTIVE;
          bdrv_refresh_perms(preallocate)

Right now preallocate_set_perm() is called through
bdrv_refresh_perms(qcow2) and this is the only moment when permissions for
the entire chain could be changed. If we deny write permissions here for
the sake of the next step, the only place left for a whole set of
permission changes would be the preallocate_inactivate() callback. This
all looks really terrible.

In addition to this, we are in trouble because the truncate() operation
requested inside preallocate_set_perm() would in reality be called
considerably later, potentially once we have returned from
bdrv_inactivate() to the caller and control has been passed to the
migration target.

This patch truncates the image inside preallocate_inactivate(), thus
making further work inside preallocate_drop_resize_bh() a no-op.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
---
 block/preallocate.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/preallocate.c b/block/preallocate.c
index XXXXXXX..XXXXXXX 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -XXX,XX +XXX,XX @@ static void preallocate_child_perm(BlockDriverState *bs, BdrvChild *c,
     }
 }

+static int GRAPH_RDLOCK preallocate_inactivate(BlockDriverState *bs)
+{
+    return preallocate_drop_resize(bs, NULL);
+}
+
 static BlockDriver bdrv_preallocate_filter = {
     .format_name = "preallocate",
     .instance_size = sizeof(BDRVPreallocateState),
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_preallocate_filter = {
     .bdrv_set_perm = preallocate_set_perm,
     .bdrv_child_perm = preallocate_child_perm,

+    .bdrv_inactivate = preallocate_inactivate,
+
     .is_filter = true,
 };
-- 
2.45.2
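Why the late bottom half becomes a no-op after this patch can be shown
with a small model. It is a sketch only: ModelNode and the model_* names
are hypothetical, the truncate is reduced to an assignment plus a counter,
and the state reset inside model_drop_resize() assumes that the drop-resize
helper invalidates data_end once it has truncated.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced model of the filter state. */
typedef struct ModelNode {
    int64_t data_end;   /* < 0 means the filter is inactive */
    int64_t disk_len;   /* length of the underlying file */
    int truncates;      /* how many times the file was truncated */
} ModelNode;

/* Models the shared helper: truncate once, then mark the state invalid. */
static int model_drop_resize(ModelNode *n)
{
    if (n->data_end < 0) {
        return 0;                 /* already dropped: nothing to do */
    }
    n->disk_len = n->data_end;    /* stands in for the truncate */
    n->truncates++;
    n->data_end = -1;
    return 0;
}

/* Models the new inactivate callback: synchronous, before ownership of
 * the image is handed over. */
static int model_inactivate(ModelNode *n)
{
    return model_drop_resize(n);
}

/* Models the bottom half that may still fire later. */
static void model_drop_resize_bh(ModelNode *n)
{
    model_drop_resize(n);
}
```

Calling model_inactivate() followed by model_drop_resize_bh() truncates
exactly once in this model, which is the property the commit message
relies on: no write reaches the image after inactivation has returned.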
Let us auto-enable the filter inside handle_write() and truncate(), i.e.
on the actual write operation, rather than inside
preallocate_co_getlength(). This just makes things more relaxed.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
---
 block/preallocate.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/block/preallocate.c b/block/preallocate.c
index XXXXXXX..XXXXXXX 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn GRAPH_RDLOCK preallocate_co_flush(BlockDriverState *bs)
 static int64_t coroutine_fn GRAPH_RDLOCK
 preallocate_co_getlength(BlockDriverState *bs)
 {
-    int64_t ret;
     BDRVPreallocateState *s = bs->opaque;

     if (s->data_end >= 0) {
         return s->data_end;
     }

-    ret = bdrv_co_getlength(bs->file->bs);
-
-    if (has_prealloc_perms(bs)) {
-        s->file_end = s->zero_start = s->data_end = ret;
-    }
-
-    return ret;
+    return bdrv_co_getlength(bs->file->bs);
 }

 static int GRAPH_RDLOCK
-- 
2.45.2
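The effect of the change is that querying the length no longer has side
effects on the filter state. A minimal sketch, assuming a reduced state
with only data_end: ModelState and the model_* functions are hypothetical,
and model_handle_write() is merely one plausible shape of "enable on the
first write", not the code of handle_write() in block/preallocate.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced model of the filter state. */
typedef struct ModelState {
    int64_t data_end;   /* < 0 means the filter is inactive */
} ModelState;

/* Models the patched getlength: a pure query, the state is const. */
static int64_t model_getlength(const ModelState *s, int64_t child_len)
{
    if (s->data_end >= 0) {
        return s->data_end;
    }
    return child_len;
}

/* Assumed write path: the filter is enabled lazily by the first write
 * and data_end then grows with writes past the current end. */
static void model_handle_write(ModelState *s, int64_t child_len,
                               int64_t offset, int64_t bytes)
{
    if (s->data_end < 0) {
        s->data_end = child_len;
    }
    if (offset + bytes > s->data_end) {
        s->data_end = offset + bytes;
    }
}
```

With this split, a length query on an inactive filter simply forwards the
child length and leaves the filter inactive.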
The filter is not enabled if s->data_end is negative. In this case it
would be completely useless to initialize s->file_end inside
preallocate_truncate_to_real_size() without setting s->data_end.

Here we are going to reset the state and disable the filter, as we are
either in the process of switching to the read-only state or the driver is
being closed. Now the driver is disabled unconditionally, even on error,
and this is pretty much correct. In the worst case the image would be a
bit longer and that is all.

The patch also adds a redundant check of bs->open_flags to this helper for
convenience.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
---
 block/preallocate.c | 21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/block/preallocate.c b/block/preallocate.c
index XXXXXXX..XXXXXXX 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -XXX,XX +XXX,XX @@ preallocate_truncate_to_real_size(BlockDriverState *bs, Error **errp)
     BDRVPreallocateState *s = bs->opaque;
     int ret;

-    if (s->file_end < 0) {
-        s->file_end = bdrv_getlength(bs->file->bs);
-        if (s->file_end < 0) {
-            error_setg_errno(errp, -s->file_end, "Failed to get file length");
-            return s->file_end;
-        }
+    if (!(bs->open_flags & BDRV_O_RDWR)) {
+        return 0;
+    }
+    if (s->data_end < 0) {
+        return 0;
     }

     if (s->data_end < s->file_end) {
@@ -XXX,XX +XXX,XX @@ preallocate_truncate_to_real_size(BlockDriverState *bs, Error **errp)
                             NULL);
         if (ret < 0) {
             error_setg_errno(errp, -ret, "Failed to drop preallocation");
-            s->file_end = ret;
-            return ret;
         }
-        s->file_end = s->data_end;
     }

+    s->data_end = s->file_end = s->zero_start = -EINVAL;
     return 0;
 }

@@ -XXX,XX +XXX,XX @@ static void preallocate_close(BlockDriverState *bs)
     qemu_bh_cancel(s->drop_resize_bh);
     qemu_bh_delete(s->drop_resize_bh);

-    if (s->data_end >= 0) {
-        preallocate_truncate_to_real_size(bs, NULL);
-    }
+    preallocate_truncate_to_real_size(bs, NULL);
 }


@@ -XXX,XX +XXX,XX @@ preallocate_drop_resize(BlockDriverState *bs, Error **errp)
      * change the child, so mark all states invalid. We'll regain control if a
      * parent requests write access again.
      */
-    s->data_end = s->file_end = s->zero_start = -EINVAL;
-
     bdrv_child_refresh_perms(bs, bs->file, NULL);

     return 0;
-- 
2.45.2
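The "disable unconditionally, even on error" behaviour of the reworked
helper can be checked with a reduced model. This is a sketch, not the
driver code: ModelState, model_truncate_fn, the two truncate stubs and the
reported_err out-parameter (standing in for errp) are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced model of the filter state. */
typedef struct ModelState {
    int64_t data_end, file_end, zero_start;
} ModelState;

/* Hypothetical truncate backend, so that failures can be injected. */
typedef int (*model_truncate_fn)(int64_t new_len);

static int model_truncate_ok(int64_t new_len)
{
    (void)new_len;
    return 0;
}

static int model_truncate_fail(int64_t new_len)
{
    (void)new_len;
    return -EIO;
}

/* Models the reworked helper: a failed truncate is reported through
 * reported_err, but the filter state is reset and 0 is returned anyway. */
static int model_truncate_to_real_size(ModelState *s, bool writable,
                                       model_truncate_fn do_truncate,
                                       int *reported_err)
{
    *reported_err = 0;
    if (!writable) {
        return 0;                 /* models the BDRV_O_RDWR check */
    }
    if (s->data_end < 0) {
        return 0;                 /* filter not enabled */
    }
    if (s->data_end < s->file_end) {
        int ret = do_truncate(s->data_end);
        if (ret < 0) {
            *reported_err = ret;  /* reported, not propagated */
        }
    }
    s->data_end = s->file_end = s->zero_start = -EINVAL;
    return 0;
}
```

In the failing case the model leaves the file longer than needed but the
filter is still disabled, matching the worst case described in the commit
message.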
IO operations like truncate inside preallocate_set_perm() look like insane
complexity which should not be seen in reality. The preallocate filter
lifecycle is very close to the lifecycle of CBT, which is handled well
inside the QCOW2 driver. The approach should be the same inside the
preallocation filter: it should be enough to handle this inside
preallocate_reopen_prepare() and preallocate_inactivate().

There is no more need to have the preallocate_set_perm() and
preallocate_child_perm() permission callbacks.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
---
 block/preallocate.c | 39 ---------------------------------------
 1 file changed, 39 deletions(-)

diff --git a/block/preallocate.c b/block/preallocate.c
index XXXXXXX..XXXXXXX 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -XXX,XX +XXX,XX @@ typedef struct BDRVPreallocateState {
      * be invalid (< 0) when we don't have both exclusive BLK_PERM_RESIZE and
      * BLK_PERM_WRITE permissions on file child.
      */
-
-    /* Gives up the resize permission on children when parents don't need it */
-    QEMUBH *drop_resize_bh;
 } BDRVPreallocateState;

 static int preallocate_drop_resize(BlockDriverState *bs, Error **errp);
-static void preallocate_drop_resize_bh(void *opaque);

 #define PREALLOCATE_OPT_PREALLOC_ALIGN "prealloc-align"
 #define PREALLOCATE_OPT_PREALLOC_SIZE "prealloc-size"
@@ -XXX,XX +XXX,XX @@ static int preallocate_open(BlockDriverState *bs, QDict *options, int flags,
      * For this to work, mark them invalid.
      */
     s->file_end = s->zero_start = s->data_end = -EINVAL;
-    s->drop_resize_bh = qemu_bh_new(preallocate_drop_resize_bh, bs);

     ret = bdrv_open_file_child(NULL, options, "file", bs, errp);
     if (ret < 0) {
@@ -XXX,XX +XXX,XX @@ preallocate_truncate_to_real_size(BlockDriverState *bs, Error **errp)

 static void preallocate_close(BlockDriverState *bs)
 {
-    BDRVPreallocateState *s = bs->opaque;
-
     GLOBAL_STATE_CODE();
     GRAPH_RDLOCK_GUARD_MAINLOOP();

-    qemu_bh_cancel(s->drop_resize_bh);
-    qemu_bh_delete(s->drop_resize_bh);
-
     preallocate_truncate_to_real_size(bs, NULL);
 }

@@ -XXX,XX +XXX,XX @@ preallocate_drop_resize(BlockDriverState *bs, Error **errp)
     return 0;
 }

-static void preallocate_drop_resize_bh(void *opaque)
-{
-    GLOBAL_STATE_CODE();
-    GRAPH_RDLOCK_GUARD_MAINLOOP();
-
-    /*
-     * In case of errors, we'll simply keep the exclusive lock on the image
-     * indefinitely.
-     */
-    preallocate_drop_resize(opaque, NULL);
-}
-
-static void GRAPH_RDLOCK
-preallocate_set_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared)
-{
-    BDRVPreallocateState *s = bs->opaque;
-
-    if (can_write_resize(perm)) {
-        qemu_bh_cancel(s->drop_resize_bh);
-        if (s->data_end < 0) {
-            s->data_end = s->file_end = s->zero_start =
-                bs->file->bs->total_sectors * BDRV_SECTOR_SIZE;
-        }
-    } else {
-        qemu_bh_schedule(s->drop_resize_bh);
-    }
-}
-
 static void preallocate_child_perm(BlockDriverState *bs, BdrvChild *c,
     BdrvChildRole role, BlockReopenQueue *reopen_queue,
     uint64_t perm, uint64_t shared, uint64_t *nperm, uint64_t *nshared)
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_preallocate_filter = {
     .bdrv_co_flush = preallocate_co_flush,
     .bdrv_co_truncate = preallocate_co_truncate,

-    .bdrv_set_perm = preallocate_set_perm,
     .bdrv_child_perm = preallocate_child_perm,

     .bdrv_inactivate = preallocate_inactivate,
-- 
2.45.2
Once the permission change process is normalized, there is no need for a
permission update inside preallocate_drop_resize(), and
preallocate_truncate_to_real_size() can be merged into its caller.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
---
 block/preallocate.c | 36 ++----------------------------------
 1 file changed, 2 insertions(+), 34 deletions(-)

diff --git a/block/preallocate.c b/block/preallocate.c
index XXXXXXX..XXXXXXX 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -XXX,XX +XXX,XX @@ typedef struct BDRVPreallocateState {
      */
 } BDRVPreallocateState;

-static int preallocate_drop_resize(BlockDriverState *bs, Error **errp);
-
 #define PREALLOCATE_OPT_PREALLOC_ALIGN "prealloc-align"
 #define PREALLOCATE_OPT_PREALLOC_SIZE "prealloc-size"
 static QemuOptsList runtime_opts = {
@@ -XXX,XX +XXX,XX @@ static int preallocate_open(BlockDriverState *bs, QDict *options, int flags,
 }

 static int GRAPH_RDLOCK
-preallocate_truncate_to_real_size(BlockDriverState *bs, Error **errp)
+preallocate_drop_resize(BlockDriverState *bs, Error **errp)
 {
     BDRVPreallocateState *s = bs->opaque;
     int ret;
@@ -XXX,XX +XXX,XX @@ static void preallocate_close(BlockDriverState *bs)
     GLOBAL_STATE_CODE();
     GRAPH_RDLOCK_GUARD_MAINLOOP();

-    preallocate_truncate_to_real_size(bs, NULL);
+    preallocate_drop_resize(bs, NULL);
 }


@@ -XXX,XX +XXX,XX @@ preallocate_co_getlength(BlockDriverState *bs)
     return bdrv_co_getlength(bs->file->bs);
 }

-static int GRAPH_RDLOCK
-preallocate_drop_resize(BlockDriverState *bs, Error **errp)
-{
-    BDRVPreallocateState *s = bs->opaque;
-    int ret;
-
-    if (s->data_end < 0) {
-        return 0;
-    }
-
-    /*
-     * Before switching children to be read-only, truncate them to remove
-     * the preallocation and let them have the real size.
-     */
-    ret = preallocate_truncate_to_real_size(bs, errp);
-    if (ret < 0) {
-        return ret;
-    }
-
-    /*
-     * We'll drop our permissions and will allow other users to take write and
-     * resize permissions (see preallocate_child_perm). Anyone will be able to
-     * change the child, so mark all states invalid. We'll regain control if a
-     * parent requests write access again.
-     */
-    bdrv_child_refresh_perms(bs, bs->file, NULL);
-
-    return 0;
-}
-
 static void preallocate_child_perm(BlockDriverState *bs, BdrvChild *c,
     BdrvChildRole role, BlockReopenQueue *reopen_queue,
     uint64_t perm, uint64_t shared, uint64_t *nperm, uint64_t *nshared)
-- 
2.45.2
From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>

This test summarizes the cases encountered inside Virtuozzo with the
preallocation filter that are worth adding to the unit tests:

1. Launch a VM whose block graph has a preallocate filter node and
   migrate it locally into a file.
2. Same, but make sure the preallocate filter is activated by performing
   a write op to it beyond the current disk length (which is zero).
3. Add a testcase which performs a write operation to the VM disk to make
   sure the preallocation filter is active, and then runs the
   'blockdev-snapshot' command to turn another image (also with
   preallocation) into an external snapshot and make it the active disk.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
---
 tests/qemu-iotests/tests/prealloc-checks     | 222 +++++++++++++++++++
 tests/qemu-iotests/tests/prealloc-checks.out |  81 +++++++
 2 files changed, 303 insertions(+)
 create mode 100644 tests/qemu-iotests/tests/prealloc-checks
 create mode 100644 tests/qemu-iotests/tests/prealloc-checks.out

diff --git a/tests/qemu-iotests/tests/prealloc-checks b/tests/qemu-iotests/tests/prealloc-checks
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/tests/qemu-iotests/tests/prealloc-checks
@@ -XXX,XX +XXX,XX @@
+#!/usr/bin/env bash
+# group: rw
+#
+# Checks for preallocate filter.
+#
+# Copyright (c) 2024 Virtuozzo International GmbH. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. +# + +# creator +owner=andrey.drobyshev@virtuozzo.com + +seq=`basename $0` +echo "QA output created by $seq" + +status=1 # failure is the default! + +_cleanup() +{ + _cleanup_qemu + rm -f $SOCK_DIR/nbd.sock + rm -f $TEST_IMG.snap + _cleanup_test_img +} +trap "_cleanup; exit \$status" 0 1 2 3 15 + +# get standard environment, filters and checks +. ../common.rc +. ../common.filter +. ../common.qemu + +_supported_fmt qcow2 + +_recreate_test_img() +{ + local size="1M" + local imgopts="cluster_size=$size,extended_l2=on,lazy_refcounts=on" + local image="$TEST_IMG" + + if test -n "$1" ; then + image="$1" + fi + + rm -f $image + TEST_IMG=$image _make_test_img -o "$imgopts" $size +} + +blkopts="node-name=disk,driver=qcow2,file.driver=preallocate," +blkopts+="file.node-name=prealloc,file.file.driver=file," +blkopts+="file.file.filename=$TEST_IMG,file.file.node-name=storage" + +# +# 1. Launch a VM so that its block graph contains preallocate filter node, +# and perform its local migration to a file. That is similar to doing +# "virsh save VM /path/to/vmsave". +# + +echo +echo === 1. Migration to a local file === +echo + +echo "# Create image and start VM with preallocate filter:" +echo +_recreate_test_img +qemu_comm_method="monitor" _launch_qemu -blockdev "$blkopts" +handle=$QEMU_HANDLE +_send_qemu_cmd $handle "" "(qemu)" + +echo +echo "# Migrate VM to a local file (/dev/null):" +echo +_send_qemu_cmd $handle "migrate \"exec: cat > '/dev/null'\"" "(qemu)" + +echo +echo "# Exit VM:" +echo +_send_qemu_cmd $handle "quit" "" +wait=yes _cleanup_qemu + +# +# 2. 
Same as 1st, but this time we make sure that preallocate filter is +# actually active. To do that we perform a write op beyond current length +# (which is 0 as the image's just created). Then migrate VM to a local +# file (/dev/null). +# + +echo +echo === 2. Migration to a local file after a write operation === +echo + +echo "# Create image and start VM with preallocate filter:" +echo +_recreate_test_img +qemu_comm_method="monitor" _launch_qemu -blockdev "$blkopts" +handle=$QEMU_HANDLE +_send_qemu_cmd $handle "" "(qemu)" + +echo +echo "# Perform write op to the image to activate preallocate filter:" +echo +_send_qemu_cmd $handle 'qemu-io disk "write -P 0xff 0 1M"' "1 MiB" + +echo +echo "# Migrate VM to a local file (/dev/null):" +echo +_send_qemu_cmd $handle "migrate \"exec: cat > '/dev/null'\"" "(qemu)" + +echo +echo "# Exit VM:" +echo +_send_qemu_cmd $handle "quit" "" +wait=yes _cleanup_qemu + +# +# 3. Add another overlay image (with preallocation filter as well), launch +# VM, export its disk via nbd to perform a write operation and activate the +# preallocation filter, and then run 'blockdev-snapshot' to turn the overlay +# image into an external snapshot of the disk. +# + +echo +echo === 3. 
Taking external snapshot after a write operation === +echo + +snapblkopts="node-name=snap,driver=qcow2,file.driver=preallocate," +snapblkopts+="file.node-name=snap-prealloc,file.file.driver=file," +snapblkopts+="file.file.filename=$TEST_IMG.snap," +snapblkopts+="file.file.node-name=snap-storage" + +echo "# Create disk and snapshot images and start VM with preallocate filter:" +echo +_recreate_test_img +_recreate_test_img $TEST_IMG.snap + +qemu_comm_method="qmp" qmp_pretty= \ + _launch_qemu -blockdev "$snapblkopts" -blockdev "$blkopts" +handle=$QEMU_HANDLE +_send_qemu_cmd $handle "{ 'execute': 'qmp_capabilities' }" "return" + +silent=yes + +echo +echo "# Start nbd server:" +echo +_send_qemu_cmd $handle \ + "{ 'execute': 'nbd-server-start', + 'arguments': { 'addr': { 'type': 'unix', + 'data': { 'path': '$SOCK_DIR/nbd.sock' }}}}" +_send_qemu_cmd $handle "" "return" + +echo +echo "# Export 'disk' node via nbd server:" +echo +_send_qemu_cmd $handle \ + "{ 'execute': 'block-export-add', + 'arguments': { 'type': 'nbd', 'node-name': 'disk', 'id': 'nbdexp', + 'name': 'nbdexp', 'writable': true }}" +_send_qemu_cmd $handle "" "return" + +echo +echo "# Perform write op to the nbd-exported disk:" +echo +silent= +$QEMU_IO_PROG -f raw -c "write -P 0xff 0 1M" \ + "nbd+unix:///nbdexp?socket=$SOCK_DIR/nbd.sock" 2>&1 \ + | _filter_qemu_io | _filter_nbd + +echo +echo "# Delete nbd export:" +echo +silent=yes +_send_qemu_cmd $handle \ + "{ 'execute': 'block-export-del', 'arguments': { 'id': 'nbdexp' }}" +_send_qemu_cmd $handle "" "return" + +echo +echo "# Stop nbd server:" +echo +_send_qemu_cmd $handle \ + "{ 'execute': 'nbd-server-stop' }" +_send_qemu_cmd $handle "" "return" + +echo +echo "# Turn 'snap' node into the external snapshot of 'disk' node:" +echo +silent= +_send_qemu_cmd $handle \ + "{ 'execute': 'blockdev-snapshot', + 'arguments': { 'node': 'disk', 'overlay': 'snap' }}" +_send_qemu_cmd $handle "" "return" + +echo +echo "# Check block graph:" +echo +_send_qemu_cmd $handle \ 
+ "{ 'execute': 'x-debug-query-block-graph' }" +_send_qemu_cmd $handle "" "return" + +echo +echo "# Exit VM:" +echo +qmp_pretty= +silent=yes +_send_qemu_cmd $handle "{ 'execute': 'quit' }" "qmp-quit" +wait=yes _cleanup_qemu + +# success, all done +echo "*** done" +rm -f $seq.full +status=0 diff --git a/tests/qemu-iotests/tests/prealloc-checks.out b/tests/qemu-iotests/tests/prealloc-checks.out new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/tests/qemu-iotests/tests/prealloc-checks.out @@ -XXX,XX +XXX,XX @@ +QA output created by prealloc-checks + +=== 1. Migration to a local file === + +# Create image and start VM with preallocate filter: + +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 +QEMU X.Y.Z monitor - type 'help' for more information +(qemu) + +# Migrate VM to a local file (/dev/null): + +(qemu) migrate "exec: cat > '/dev/null'" + +# Exit VM: + +(qemu) quit + +=== 2. Migration to a local file after a write operation === + +# Create image and start VM with preallocate filter: + +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 +QEMU X.Y.Z monitor - type 'help' for more information +(qemu) + +# Perform write op to the image to activate preallocate filter: + +(qemu) qemu-io disk "write -P 0xff 0 1M" +wrote 1048576/1048576 bytes at offset 0 +1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + +# Migrate VM to a local file (/dev/null): + +(qemu) migrate "exec: cat > '/dev/null'" + +# Exit VM: + +(qemu) quit + +=== 3. 
Taking external snapshot after a write operation === + +# Create disk and snapshot images and start VM with preallocate filter: + +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 +Formatting 'TEST_DIR/t.IMGFMT.snap', fmt=IMGFMT size=1048576 +{ 'execute': 'qmp_capabilities' } +{"return": {}} + +# Start nbd server: + + +# Export 'disk' node via nbd server: + + +# Perform write op to the nbd-exported disk: + +wrote 1048576/1048576 bytes at offset 0 +1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + +# Delete nbd export: + + +# Stop nbd server: + + +# Turn 'snap' node into the external snapshot of 'disk' node: + +{ 'execute': 'blockdev-snapshot', + 'arguments': { 'node': 'disk', 'overlay': 'snap' }} +{"return": {}} + +# Check block graph: + +{ 'execute': 'x-debug-query-block-graph' } +{"return": {"edges": [{"name": "file", "parent": 4, "shared-perm": ["write-unchanged", "consistent-read"], "perm": ["consistent-read"], "child": 6}, {"name": "file", "parent": 6, "shared-perm": ["write-unchanged", "consistent-read"], "perm": ["consistent-read"], "child": 5}, {"name": "file", "parent": 3, "shared-perm": ["write-unchanged", "consistent-read"], "perm": ["resize", "write", "consistent-read"], "child": 2}, {"name": "backing", "parent": 3, "shared-perm": ["resize", "write-unchanged", "write", "consistent-read"], "perm": [], "child": 4}, {"name": "file", "parent": 2, "shared-perm": ["write-unchanged", "consistent-read"], "perm": ["resize", "write", "consistent-read"], "child": 1}], "nodes": [{"name": "disk", "type": "block-driver", "id": 4}, {"name": "prealloc", "type": "block-driver", "id": 6}, {"name": "storage", "type": "block-driver", "id": 5}, {"name": "snap", "type": "block-driver", "id": 3}, {"name": "snap-prealloc", "type": "block-driver", "id": 2}, {"name": "snap-storage", "type": "block-driver", "id": 1}]}} + +# Exit VM: + +{"return": {}} +*** done -- 2.45.2