1
The following changes since commit 36f87b4513373b3cd79c87c9197d17face95d4ac:
1
The following changes since commit 67c1115edd98f388ca89dd38322ea3fadf034523:
2
2
3
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170630' into staging (2017-06-30 11:58:49 +0100)
3
Merge remote-tracking branch 'remotes/kraxel/tags/ui-20210323-pull-request' into staging (2021-03-23 23:47:30 +0000)
4
4
5
are available in the git repository at:
5
are available in the Git repository at:
6
6
7
git://github.com/famz/qemu.git tags/block-pull-request
7
https://gitlab.com/stefanha/qemu.git tags/block-pull-request
8
8
9
for you to fetch changes up to c61e684e44272f2acb2bef34cf2aa234582a73a9:
9
for you to fetch changes up to 3460fd7f3959d1fa7bcc255796844aa261c805a4:
10
10
11
block: Exploit BDRV_BLOCK_EOF for larger zero blocks (2017-06-30 21:48:06 +0800)
11
migrate-bitmaps-postcopy-test: check that we can't remove in-flight bitmaps (2021-03-24 13:41:19 +0000)
12
13
----------------------------------------------------------------
14
Pull request
15
16
This dirty bitmap fix solves a crash that can be triggered in the destination
17
QEMU process during live migration.
12
18
13
----------------------------------------------------------------
19
----------------------------------------------------------------
14
20
15
Hi Peter,
21
Vladimir Sementsov-Ogievskiy (2):
22
migration/block-dirty-bitmap: make incoming disabled bitmaps busy
23
migrate-bitmaps-postcopy-test: check that we can't remove in-flight
24
bitmaps
16
25
17
Here are Eric Blake's enhancements to the block layer API. Thanks!
26
migration/block-dirty-bitmap.c | 6 ++++++
18
27
tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test | 10 ++++++++++
19
----------------------------------------------------------------
28
2 files changed, 16 insertions(+)
20
21
Eric Blake (2):
22
block: Add BDRV_BLOCK_EOF to bdrv_get_block_status()
23
block: Exploit BDRV_BLOCK_EOF for larger zero blocks
24
25
block/io.c | 42 +++++++++++++++++++++++++++++++++---------
26
include/block/block.h | 2 ++
27
tests/qemu-iotests/154 | 4 ----
28
tests/qemu-iotests/154.out | 12 ++++++------
29
4 files changed, 41 insertions(+), 19 deletions(-)
30
29
31
--
30
--
32
2.9.4
31
2.30.2
33
32
34
diff view generated by jsdifflib
1
From: Eric Blake <eblake@redhat.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
Just as the block layer already sets BDRV_BLOCK_ALLOCATED as a
3
Incoming enabled bitmaps are busy, because we do
4
shortcut for subsequent operations, there are also some optimizations
4
bdrv_dirty_bitmap_create_successor() for them. But disabled bitmaps
5
that are made easier if we can quickly tell that *pnum will advance
5
being migrated are not marked busy, and the user can remove them during the
6
us to the end of a file, via a new BDRV_BLOCK_EOF which gets set
6
incoming migration. Then we may crash in cancel_incoming_locked() when
7
by the block layer.
7
we try to remove a bitmap that was already removed by the user, like this:
8
8
9
This just plumbs up the new bit; subsequent patches will make use
9
#0 qemu_mutex_lock_impl (mutex=0x5593d88c50d1, file=0x559680554b20
10
of it.
10
"../block/dirty-bitmap.c", line=64) at ../util/qemu-thread-posix.c:77
11
#1 bdrv_dirty_bitmaps_lock (bs=0x5593d88c0ee9)
12
at ../block/dirty-bitmap.c:64
13
#2 bdrv_release_dirty_bitmap (bitmap=0x5596810e9570)
14
at ../block/dirty-bitmap.c:362
15
#3 cancel_incoming_locked (s=0x559680be8208 <dbm_state+40>)
16
at ../migration/block-dirty-bitmap.c:918
17
#4 dirty_bitmap_load (f=0x559681d02b10, opaque=0x559680be81e0
18
<dbm_state>, version_id=1) at ../migration/block-dirty-bitmap.c:1194
19
#5 vmstate_load (f=0x559681d02b10, se=0x559680fb5810)
20
at ../migration/savevm.c:908
21
#6 qemu_loadvm_section_part_end (f=0x559681d02b10,
22
mis=0x559680fb4a30) at ../migration/savevm.c:2473
23
#7 qemu_loadvm_state_main (f=0x559681d02b10, mis=0x559680fb4a30)
24
at ../migration/savevm.c:2626
25
#8 postcopy_ram_listen_thread (opaque=0x0)
26
at ../migration/savevm.c:1871
27
#9 qemu_thread_start (args=0x5596817ccd10)
28
at ../util/qemu-thread-posix.c:521
29
#10 start_thread () at /lib64/libpthread.so.0
30
#11 clone () at /lib64/libc.so.6
11
31
12
Signed-off-by: Eric Blake <eblake@redhat.com>
32
Note the bs pointer taken from the bitmap: it is clearly misaligned. That's
13
Message-Id: <20170505021500.19315-2-eblake@redhat.com>
33
because this is a use-after-free: the bitmap has already been freed.
14
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
34
15
Signed-off-by: Fam Zheng <famz@redhat.com>
35
So, let's make disabled bitmaps (being migrated) busy during incoming
36
migration.
37
38
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
39
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
40
Message-Id: <20210322094906.5079-2-vsementsov@virtuozzo.com>
16
---
41
---
17
block/io.c | 15 +++++++++++----
42
migration/block-dirty-bitmap.c | 6 ++++++
18
include/block/block.h | 2 ++
43
1 file changed, 6 insertions(+)
19
2 files changed, 13 insertions(+), 4 deletions(-)
20
44
21
diff --git a/block/io.c b/block/io.c
45
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
22
index XXXXXXX..XXXXXXX 100644
46
index XXXXXXX..XXXXXXX 100644
23
--- a/block/io.c
47
--- a/migration/block-dirty-bitmap.c
24
+++ b/block/io.c
48
+++ b/migration/block-dirty-bitmap.c
25
@@ -XXX,XX +XXX,XX @@ typedef struct BdrvCoGetBlockStatusData {
49
@@ -XXX,XX +XXX,XX @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
26
* Drivers not implementing the functionality are assumed to not support
50
error_report_err(local_err);
27
* backing files, hence all their sectors are reported as allocated.
51
return -EINVAL;
28
*
52
}
29
- * If 'sector_num' is beyond the end of the disk image the return value is 0
53
+ } else {
30
- * and 'pnum' is set to 0.
54
+ bdrv_dirty_bitmap_set_busy(s->bitmap, true);
31
+ * If 'sector_num' is beyond the end of the disk image the return value is
32
+ * BDRV_BLOCK_EOF and 'pnum' is set to 0.
33
*
34
* 'pnum' is set to the number of sectors (including and immediately following
35
* the specified sector) that are known to be in the same
36
* allocated/unallocated state.
37
*
38
* 'nb_sectors' is the max value 'pnum' should be set to. If nb_sectors goes
39
- * beyond the end of the disk image it will be clamped.
40
+ * beyond the end of the disk image it will be clamped; if 'pnum' is set to
41
+ * the end of the image, then the returned value will include BDRV_BLOCK_EOF.
42
*
43
* If returned value is positive and BDRV_BLOCK_OFFSET_VALID bit is set, 'file'
44
* points to the BDS which the sector range is allocated in.
45
@@ -XXX,XX +XXX,XX @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
46
47
if (sector_num >= total_sectors) {
48
*pnum = 0;
49
- return 0;
50
+ return BDRV_BLOCK_EOF;
51
}
55
}
52
56
53
n = total_sectors - sector_num;
57
b = g_new(LoadBitmapState, 1);
54
@@ -XXX,XX +XXX,XX @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
58
@@ -XXX,XX +XXX,XX @@ static void cancel_incoming_locked(DBMLoadState *s)
55
if (!bs->drv->bdrv_co_get_block_status) {
59
assert(!s->before_vm_start_handled || !b->migrated);
56
*pnum = nb_sectors;
60
if (bdrv_dirty_bitmap_has_successor(b->bitmap)) {
57
ret = BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED;
61
bdrv_reclaim_dirty_bitmap(b->bitmap, &error_abort);
58
+ if (sector_num + nb_sectors == total_sectors) {
62
+ } else {
59
+ ret |= BDRV_BLOCK_EOF;
63
+ bdrv_dirty_bitmap_set_busy(b->bitmap, false);
60
+ }
61
if (bs->drv->protocol_name) {
62
ret |= BDRV_BLOCK_OFFSET_VALID | (sector_num * BDRV_SECTOR_SIZE);
63
}
64
}
64
@@ -XXX,XX +XXX,XX @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
65
bdrv_release_dirty_bitmap(b->bitmap);
65
66
}
66
out:
67
@@ -XXX,XX +XXX,XX @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
67
bdrv_dec_in_flight(bs);
68
68
+ if (ret >= 0 && sector_num + *pnum == total_sectors) {
69
if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
69
+ ret |= BDRV_BLOCK_EOF;
70
bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
70
+ }
71
+ } else {
71
return ret;
72
+ bdrv_dirty_bitmap_set_busy(s->bitmap, false);
72
}
73
}
73
74
74
diff --git a/include/block/block.h b/include/block/block.h
75
for (item = s->bitmaps; item; item = g_slist_next(item)) {
75
index XXXXXXX..XXXXXXX 100644
76
--- a/include/block/block.h
77
+++ b/include/block/block.h
78
@@ -XXX,XX +XXX,XX @@ typedef struct HDGeometry {
79
* BDRV_BLOCK_OFFSET_VALID: an associated offset exists for accessing raw data
80
* BDRV_BLOCK_ALLOCATED: the content of the block is determined by this
81
* layer (short for DATA || ZERO), set by block layer
82
+ * BDRV_BLOCK_EOF: the returned pnum covers through end of file for this layer
83
*
84
* Internal flag:
85
* BDRV_BLOCK_RAW: used internally to indicate that the request was
86
@@ -XXX,XX +XXX,XX @@ typedef struct HDGeometry {
87
#define BDRV_BLOCK_OFFSET_VALID 0x04
88
#define BDRV_BLOCK_RAW 0x08
89
#define BDRV_BLOCK_ALLOCATED 0x10
90
+#define BDRV_BLOCK_EOF 0x20
91
#define BDRV_BLOCK_OFFSET_MASK BDRV_SECTOR_MASK
92
93
typedef QSIMPLEQ_HEAD(BlockReopenQueue, BlockReopenQueueEntry) BlockReopenQueue;
94
--
76
--
95
2.9.4
77
2.30.2
96
78
97
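The busy-bitmap fix above can be illustrated with a small model. This is only a sketch in Python, not QEMU code: `Node`, `Bitmap`, `load_start`, `load_complete` and `BitmapBusyError` are hypothetical names, and the model marks every incoming bitmap busy directly, whereas the real code relies on a successor bitmap for enabled bitmaps and on bdrv_dirty_bitmap_set_busy() for disabled ones. Only the error text is taken from the iotest in this series.

```python
class BitmapBusyError(Exception):
    """Raised when a busy bitmap is about to be removed (hypothetical name)."""


class Bitmap:
    def __init__(self, name, enabled):
        self.name = name
        self.enabled = enabled
        self.busy = False


class Node:
    """Toy stand-in for a block node that owns named dirty bitmaps."""

    def __init__(self):
        self.bitmaps = {}

    def remove(self, name):
        bitmap = self.bitmaps[name]
        if bitmap.busy:
            # Refuse removal instead of leaving a dangling reference behind.
            raise BitmapBusyError(
                f"Bitmap '{name}' is currently in use by "
                "another operation and cannot be used")
        del self.bitmaps[name]


def load_start(node, name, enabled):
    """Incoming side: create the bitmap and keep it busy while it migrates."""
    bitmap = Bitmap(name, enabled)
    bitmap.busy = True
    node.bitmaps[name] = bitmap
    return bitmap


def load_complete(bitmap):
    """Migration finished (or was cancelled): the bitmap may be removed again."""
    bitmap.busy = False
```

In this model, removing a bitmap between `load_start()` and `load_complete()` raises an error rather than freeing an object that the migration code still references, which is the property the new iotest check exercises.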
1
From: Eric Blake <eblake@redhat.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
When we have a BDS with unallocated clusters, but asking the status
3
Check that we can't remove bitmaps being migrated on the destination VM.
4
of its underlying bs->file or backing layer encounters an end-of-file
4
The new check proves that the previous commit helps.
5
condition, we know that the rest of the unallocated area will read as
6
zeroes. However, pre-patch, this required two separate calls to
7
bdrv_get_block_status(), as the first call stops at the point where
8
the underlying file ends. Thanks to BDRV_BLOCK_EOF, we can now widen
9
the results of the primary status if the secondary status already
10
includes BDRV_BLOCK_ZERO.
11
5
12
In turn, this fixes a TODO mentioned in iotest 154, where we can now
6
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
13
see that all sectors in a partial cluster at the end of a file read
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
as zero when coupling the shorter backing file's status along with our
8
Message-Id: <20210322094906.5079-3-vsementsov@virtuozzo.com>
15
knowledge that the remaining sectors came from an unallocated cluster.
9
---
10
tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test | 10 ++++++++++
11
1 file changed, 10 insertions(+)
16
12
17
Also, note that the loop in bdrv_co_get_block_status_above() had an
13
diff --git a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
18
inefficient exit: in cases where the active layer sets BDRV_BLOCK_ZERO
14
index XXXXXXX..XXXXXXX 100755
19
but does NOT set BDRV_BLOCK_ALLOCATED (namely, where we know we read
15
--- a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
20
zeroes merely because our unallocated clusters lie beyond the backing
16
+++ b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
21
file's shorter length), we still ended up probing the backing layer
17
@@ -XXX,XX +XXX,XX @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
22
even though we already had a good answer.
18
self.start_postcopy()
19
20
self.vm_b_events += self.vm_b.get_qmp_events()
21
+
22
+ # While being here, let's check that we can't remove in-flight bitmaps.
23
+ for vm in (self.vm_a, self.vm_b):
24
+ for i in range(0, nb_bitmaps):
25
+ result = vm.qmp('block-dirty-bitmap-remove', node='drive0',
26
+ name=f'bitmap{i}')
27
+ self.assert_qmp(result, 'error/desc',
28
+ f"Bitmap 'bitmap{i}' is currently in use by "
29
+ "another operation and cannot be used")
30
+
31
self.vm_b.shutdown()
32
# recreate vm_b, so there is no incoming option, which prevents
33
# loading bitmaps from disk
34
--
35
2.30.2
23
36
24
Signed-off-by: Eric Blake <eblake@redhat.com>
25
Message-Id: <20170505021500.19315-3-eblake@redhat.com>
26
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
27
Signed-off-by: Fam Zheng <famz@redhat.com>
28
---
29
block/io.c | 27 ++++++++++++++++++++++-----
30
tests/qemu-iotests/154 | 4 ----
31
tests/qemu-iotests/154.out | 12 ++++++------
32
3 files changed, 28 insertions(+), 15 deletions(-)
33
34
diff --git a/block/io.c b/block/io.c
35
index XXXXXXX..XXXXXXX 100644
36
--- a/block/io.c
37
+++ b/block/io.c
38
@@ -XXX,XX +XXX,XX @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
39
/* Ignore errors. This is just providing extra information, it
40
* is useful but not necessary.
41
*/
42
- if (!file_pnum) {
43
- /* !file_pnum indicates an offset at or beyond the EOF; it is
44
- * perfectly valid for the format block driver to point to such
45
- * offsets, so catch it and mark everything as zero */
46
+ if (ret2 & BDRV_BLOCK_EOF &&
47
+ (!file_pnum || ret2 & BDRV_BLOCK_ZERO)) {
48
+ /*
49
+ * It is valid for the format block driver to read
50
+ * beyond the end of the underlying file's current
51
+ * size; such areas read as zero.
52
+ */
53
ret |= BDRV_BLOCK_ZERO;
54
} else {
55
/* Limit request to the range reported by the protocol driver */
56
@@ -XXX,XX +XXX,XX @@ static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,
57
{
58
BlockDriverState *p;
59
int64_t ret = 0;
60
+ bool first = true;
61
62
assert(bs != base);
63
for (p = bs; p != base; p = backing_bs(p)) {
64
ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum, file);
65
- if (ret < 0 || ret & BDRV_BLOCK_ALLOCATED) {
66
+ if (ret < 0) {
67
+ break;
68
+ }
69
+ if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
70
+ /*
71
+ * Reading beyond the end of the file continues to read
72
+ * zeroes, but we can only widen the result to the
73
+ * unallocated length we learned from an earlier
74
+ * iteration.
75
+ */
76
+ *pnum = nb_sectors;
77
+ }
78
+ if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
79
break;
80
}
81
/* [sector_num, pnum] unallocated on this layer, which could be only
82
* the first part of [sector_num, nb_sectors]. */
83
nb_sectors = MIN(nb_sectors, *pnum);
84
+ first = false;
85
}
86
return ret;
87
}
88
diff --git a/tests/qemu-iotests/154 b/tests/qemu-iotests/154
89
index XXXXXXX..XXXXXXX 100755
90
--- a/tests/qemu-iotests/154
91
+++ b/tests/qemu-iotests/154
92
@@ -XXX,XX +XXX,XX @@ $QEMU_IO -c "alloc $size 2048" "$TEST_IMG" | _filter_qemu_io
93
$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
94
95
# Repeat with backing file holding unallocated cluster.
96
-# TODO: Note that this forces an allocation, because we aren't yet able to
97
-# quickly detect that reads beyond EOF of the backing file are always zero
98
CLUSTER_SIZE=2048 TEST_IMG="$TEST_IMG.base" _make_test_img $((size + 1024))
99
100
# Write at the front: sector-wise, the request is:
101
@@ -XXX,XX +XXX,XX @@ $QEMU_IO -c "alloc $size 2048" "$TEST_IMG" | _filter_qemu_io
102
$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
103
104
# Repeat with backing file holding zero'd cluster
105
-# TODO: Note that this forces an allocation, because we aren't yet able to
106
-# quickly detect that reads beyond EOF of the backing file are always zero
107
$QEMU_IO -c "write -z $size 512" "$TEST_IMG.base" | _filter_qemu_io
108
109
# Write at the front: sector-wise, the request is:
110
diff --git a/tests/qemu-iotests/154.out b/tests/qemu-iotests/154.out
111
index XXXXXXX..XXXXXXX 100644
112
--- a/tests/qemu-iotests/154.out
113
+++ b/tests/qemu-iotests/154.out
114
@@ -XXX,XX +XXX,XX @@ wrote 512/512 bytes at offset 134217728
115
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
116
2048/2048 bytes allocated at offset 128 MiB
117
[{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
118
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
119
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
120
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
121
wrote 512/512 bytes at offset 134219264
122
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
123
2048/2048 bytes allocated at offset 128 MiB
124
[{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
125
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
126
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
127
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
128
wrote 1024/1024 bytes at offset 134218240
129
1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
130
2048/2048 bytes allocated at offset 128 MiB
131
[{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
132
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
133
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
134
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
135
wrote 2048/2048 bytes at offset 134217728
136
2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
137
@@ -XXX,XX +XXX,XX @@ wrote 512/512 bytes at offset 134217728
138
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
139
2048/2048 bytes allocated at offset 128 MiB
140
[{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
141
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
142
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
143
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
144
wrote 512/512 bytes at offset 134219264
145
512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
146
2048/2048 bytes allocated at offset 128 MiB
147
[{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
148
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
149
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
150
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
151
wrote 1024/1024 bytes at offset 134218240
152
1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
153
2048/2048 bytes allocated at offset 128 MiB
154
[{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
155
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
156
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
157
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
158
wrote 2048/2048 bytes at offset 134217728
159
2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
160
--
161
2.9.4
162
163
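The BDRV_BLOCK_EOF behaviour described in the two 2017 patches can be sketched with a toy model. This is an illustration in Python, not the QEMU implementation: each layer is a string with one character per sector ('D' data, 'Z' explicit zeroes, 'U' unallocated), the function names merely echo the C ones, error returns are left out, and the DATA and ZERO flag values are assumed to be 0x01 and 0x02 since the diff shows only the other flags.

```python
# Flag values; ALLOCATED and EOF match the block.h hunk, DATA/ZERO are assumed.
DATA, ZERO, ALLOCATED, EOF = 0x01, 0x02, 0x10, 0x20

STATE_FLAGS = {"D": DATA | ALLOCATED, "Z": ZERO | ALLOCATED, "U": 0}


def block_status(layer, sector_num, nb_sectors):
    """Status of one layer: returns (flags, pnum)."""
    total = len(layer)
    if sector_num >= total:
        # Query starts at or beyond end of file: report EOF with pnum 0.
        return EOF, 0
    n = min(nb_sectors, total - sector_num)  # clamp to the image end
    state = layer[sector_num]
    pnum = 0
    while pnum < n and layer[sector_num + pnum] == state:
        pnum += 1
    ret = STATE_FLAGS[state]
    if sector_num + pnum == total:
        ret |= EOF  # the reported run reaches the end of this layer
    return ret, pnum


def block_status_above(layers, sector_num, nb_sectors):
    """Walk the chain from the active layer (layers[0]) down its backing files."""
    ret, pnum = 0, 0
    first = True
    for layer in layers:
        ret, pnum = block_status(layer, sector_num, nb_sectors)
        if ret & ZERO and ret & EOF and not first:
            # Past the end of a shorter backing file everything reads as
            # zero, so widen to the unallocated length learned earlier.
            pnum = nb_sectors
        if ret & (ZERO | DATA):
            break
        # Unallocated here: look further down, limited to this run.
        nb_sectors = min(nb_sectors, pnum)
        first = False
    return ret, pnum
```

For example, `block_status_above(["UUUU", "ZZ"], 0, 4)` reports all four sectors as zero in a single call, while `block_status_above(["UUUU", "DD"], 0, 4)` stops after the two data sectors of the shorter backing layer.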