The following changes since commit 9a7beaad3dbba982f7a461d676b55a5c3851d312:

  Merge remote-tracking branch 'remotes/alistair/tags/pull-riscv-to-apply-20210304' into staging (2021-03-05 10:47:46 +0000)

are available in the Git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 67bedc3aed5c455b629c2cb5f523b536c46adff9:

  docs: qsd: Explain --export nbd,name=... default (2021-03-05 17:09:46 +0100)

----------------------------------------------------------------
Block layer patches

- qemu-storage-daemon: add --pidfile option
- qemu-storage-daemon: CLI error messages include the option name now
- vhost-user-blk export: Misc fixes, added test cases
- docs: Improvements for qemu-storage-daemon documentation
- parallels: load bitmap extension
- backup-top: Don't crash on post-finalize accesses
- iotests improvements

----------------------------------------------------------------
Alberto Garcia (1):
      iotests: Drop deprecated 'props' from object-add

Coiby Xu (1):
      test: new qTest case to test the vhost-user-blk-server

Eric Blake (1):
      iotests: Fix up python style in 300

Kevin Wolf (1):
      docs: qsd: Explain --export nbd,name=... default

Max Reitz (3):
      backup: Remove nodes from job in .clean()
      backup-top: Refuse I/O in inactive state
      iotests/283: Check that finalize drops backup-top

Paolo Bonzini (2):
      storage-daemon: report unexpected arguments on the fly
      storage-daemon: include current command line option in the errors

Stefan Hajnoczi (14):
      qemu-storage-daemon: add --pidfile option
      docs: show how to spawn qemu-storage-daemon with fd passing
      docs: replace insecure /tmp examples in qsd docs
      vhost-user-blk: fix blkcfg->num_queues endianness
      libqtest: add qtest_socket_server()
      libqtest: add qtest_kill_qemu()
      libqtest: add qtest_remove_abrt_handler()
      tests/qtest: add multi-queue test case to vhost-user-blk-test
      block/export: fix blk_size double byteswap
      block/export: use VIRTIO_BLK_SECTOR_BITS
      block/export: fix vhost-user-blk export sector number calculation
      block/export: port virtio-blk discard/write zeroes input validation
      vhost-user-blk-test: test discard/write zeroes invalid inputs
      block/export: port virtio-blk read/write range check

Stefano Garzarella (1):
      blockjob: report a better error message

Vladimir Sementsov-Ogievskiy (7):
      qcow2-bitmap: make bytes_covered_by_bitmap_cluster() public
      parallels.txt: fix bitmap L1 table description
      block/parallels: BDRVParallelsState: add cluster_size field
      parallels: support bitmap extension for read-only mode
      iotests.py: add unarchive_sample_image() helper
      iotests: add parallels-read-bitmap test
      MAINTAINERS: update parallels block driver

 docs/interop/parallels.txt                          |  28 +-
 docs/tools/qemu-storage-daemon.rst                  |  68 +-
 block/parallels.h                                   |   7 +-
 include/block/dirty-bitmap.h                        |   2 +
 tests/qtest/libqos/libqtest.h                       |  37 +
 tests/qtest/libqos/vhost-user-blk.h                 |  48 +
 block/backup-top.c                                  |  10 +
 block/backup.c                                      |   1 +
 block/dirty-bitmap.c                                |  13 +
 block/export/vhost-user-blk-server.c                | 150 +++-
 block/parallels-ext.c                               | 300 +++++++
 block/parallels.c                                   |  26 +-
 block/qcow2-bitmap.c                                |  16 +-
 blockjob.c                                          |  10 +-
 hw/block/vhost-user-blk.c                           |   7 +-
 storage-daemon/qemu-storage-daemon.c                |  56 +-
 tests/qtest/libqos/vhost-user-blk.c                 | 130 +++
 tests/qtest/libqtest.c                              |  82 +-
 tests/qtest/vhost-user-blk-test.c                   | 983 +++++++++++++++++++++
 tests/qemu-iotests/iotests.py                       |  10 +
 MAINTAINERS                                         |   5 +
 block/meson.build                                   |   3 +-
 tests/qemu-iotests/087                              |   8 +-
 tests/qemu-iotests/184                              |  18 +-
 tests/qemu-iotests/218                              |   2 +-
 tests/qemu-iotests/235                              |   2 +-
 tests/qemu-iotests/245                              |   4 +-
 tests/qemu-iotests/258                              |   6 +-
 tests/qemu-iotests/258.out                          |   4 +-
 tests/qemu-iotests/283                              |  53 ++
 tests/qemu-iotests/283.out                          |  15 +
 tests/qemu-iotests/295                              |   2 +-
 tests/qemu-iotests/296                              |   2 +-
 tests/qemu-iotests/300                              |  10 +-
 .../sample_images/parallels-with-bitmap.bz2         | Bin 0 -> 203 bytes
 .../sample_images/parallels-with-bitmap.sh          |  51 ++
 tests/qemu-iotests/tests/parallels-read-bitmap      |  55 ++
 tests/qemu-iotests/tests/parallels-read-bitmap.out  |   6 +
 tests/qtest/libqos/meson.build                      |   1 +
 tests/qtest/meson.build                             |   4 +
 40 files changed, 2098 insertions(+), 137 deletions(-)
 create mode 100644 tests/qtest/libqos/vhost-user-blk.h
 create mode 100644 block/parallels-ext.c
 create mode 100644 tests/qtest/libqos/vhost-user-blk.c
 create mode 100644 tests/qtest/vhost-user-blk-test.c
 create mode 100644 tests/qemu-iotests/sample_images/parallels-with-bitmap.bz2
 create mode 100755 tests/qemu-iotests/sample_images/parallels-with-bitmap.sh
 create mode 100755 tests/qemu-iotests/tests/parallels-read-bitmap
 create mode 100644 tests/qemu-iotests/tests/parallels-read-bitmap.out
diff view generated by jsdifflib
Deleted patch

commit_complete() can't assume that after its block_job_completed() the
job is actually immediately freed; someone else may still be holding
references. In this case, the op blockers on the intermediate nodes make
the graph reconfiguration in the completion code fail.

Call block_job_remove_all_bdrv() manually so that we know for sure that
any blockers on intermediate nodes are given up.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/commit.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/commit.c b/block/commit.c
index XXXXXXX..XXXXXXX 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -XXX,XX +XXX,XX @@ static void commit_complete(BlockJob *job, void *opaque)
     }
     g_free(s->backing_file_str);
     blk_unref(s->top);
+
+    /* If there is more than one reference to the job (e.g. if called from
+     * block_job_finish_sync()), block_job_completed() won't free it and
+     * therefore the blockers on the intermediate nodes remain. This would
+     * cause bdrv_set_backing_hd() to fail. */
+    block_job_remove_all_bdrv(job);
+
     block_job_completed(&s->common, ret);
     g_free(data);

-- 
1.8.3.1
38
diff view generated by jsdifflib
1
From: Alberto Garcia <berto@igalia.com>

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20210222115737.2993-1-berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/qemu-iotests/087     |  8 ++------
 tests/qemu-iotests/184     | 18 ++++++------------
 tests/qemu-iotests/218     |  2 +-
 tests/qemu-iotests/235     |  2 +-
 tests/qemu-iotests/245     |  4 ++--
 tests/qemu-iotests/258     |  6 +++---
 tests/qemu-iotests/258.out |  4 ++--
 tests/qemu-iotests/295     |  2 +-
 tests/qemu-iotests/296     |  2 +-
 9 files changed, 19 insertions(+), 29 deletions(-)

diff --git a/tests/qemu-iotests/087 b/tests/qemu-iotests/087
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/087
+++ b/tests/qemu-iotests/087
@@ -XXX,XX +XXX,XX @@ run_qemu <<EOF
      "arguments": {
          "qom-type": "secret",
          "id": "sec0",
-         "props": {
-             "data": "123456"
-         }
+         "data": "123456"
      }
 }
 { "execute": "blockdev-add",
@@ -XXX,XX +XXX,XX @@ run_qemu <<EOF
      "arguments": {
          "qom-type": "secret",
          "id": "sec0",
-         "props": {
-             "data": "123456"
-         }
+         "data": "123456"
      }
 }
 { "execute": "blockdev-add",
diff --git a/tests/qemu-iotests/184 b/tests/qemu-iotests/184
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/184
+++ b/tests/qemu-iotests/184
@@ -XXX,XX +XXX,XX @@ run_qemu <<EOF
      "arguments": {
          "qom-type": "throttle-group",
          "id": "group0",
-         "props": {
-             "limits" : {
-                 "iops-total": 1000
-             }
+         "limits" : {
+             "iops-total": 1000
          }
      }
 }
@@ -XXX,XX +XXX,XX @@ run_qemu <<EOF
      "arguments": {
          "qom-type": "throttle-group",
          "id": "group0",
-         "props" : {
-             "limits": {
-                 "iops-total": 1000
-             }
+         "limits": {
+             "iops-total": 1000
          }
      }
 }
@@ -XXX,XX +XXX,XX @@ run_qemu <<EOF
      "arguments": {
          "qom-type": "throttle-group",
          "id": "group0",
-         "props" : {
-             "limits": {
-                 "iops-total": 1000
-             }
+         "limits": {
+             "iops-total": 1000
          }
      }
 }
diff --git a/tests/qemu-iotests/218 b/tests/qemu-iotests/218
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/218
+++ b/tests/qemu-iotests/218
@@ -XXX,XX +XXX,XX @@ with iotests.VM() as vm, \
     vm.launch()

     ret = vm.qmp('object-add', qom_type='throttle-group', id='tg',
-                 props={'x-bps-read': 4096})
+                 limits={'bps-read': 4096})
     assert ret['return'] == {}

     ret = vm.qmp('blockdev-add',
diff --git a/tests/qemu-iotests/235 b/tests/qemu-iotests/235
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/235
+++ b/tests/qemu-iotests/235
@@ -XXX,XX +XXX,XX @@ vm.add_args('-drive', 'id=src,file=' + disk)
 vm.launch()

 log(vm.qmp('object-add', qom_type='throttle-group', id='tg0',
-           props={ 'x-bps-total': size }))
+           limits={'bps-total': size}))

 log(vm.qmp('blockdev-add',
            **{ 'node-name': 'target',
diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/245
+++ b/tests/qemu-iotests/245
@@ -XXX,XX +XXX,XX @@ class TestBlockdevReopen(iotests.QMPTestCase):
         ###### throttle ######
         ######################
         opts = { 'qom-type': 'throttle-group', 'id': 'group0',
-                 'props': { 'limits': { 'iops-total': 1000 } } }
+                 'limits': { 'iops-total': 1000 } }
         result = self.vm.qmp('object-add', conv_keys = False, **opts)
         self.assert_qmp(result, 'return', {})

         opts = { 'qom-type': 'throttle-group', 'id': 'group1',
-                 'props': { 'limits': { 'iops-total': 2000 } } }
+                 'limits': { 'iops-total': 2000 } }
         result = self.vm.qmp('object-add', conv_keys = False, **opts)
         self.assert_qmp(result, 'return', {})

diff --git a/tests/qemu-iotests/258 b/tests/qemu-iotests/258
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/258
+++ b/tests/qemu-iotests/258
@@ -XXX,XX +XXX,XX @@ def test_concurrent_finish(write_to_stream_node):
     vm.qmp_log('object-add',
                qom_type='throttle-group',
                id='tg',
-               props={
-                   'x-iops-write': 1,
-                   'x-iops-write-max': 1
+               limits={
+                   'iops-write': 1,
+                   'iops-write-max': 1
                })

     vm.qmp_log('blockdev-add',
diff --git a/tests/qemu-iotests/258.out b/tests/qemu-iotests/258.out
index XXXXXXX..XXXXXXX 100644
--- a/tests/qemu-iotests/258.out
+++ b/tests/qemu-iotests/258.out
@@ -XXX,XX +XXX,XX @@ Running tests:

 === Commit and stream finish concurrently (letting stream write) ===

-{"execute": "object-add", "arguments": {"id": "tg", "props": {"x-iops-write": 1, "x-iops-write-max": 1}, "qom-type": "throttle-group"}}
+{"execute": "object-add", "arguments": {"id": "tg", "limits": {"iops-write": 1, "iops-write-max": 1}, "qom-type": "throttle-group"}}
 {"return": {}}
 {"execute": "blockdev-add", "arguments": {"backing": {"backing": {"backing": {"backing": {"driver": "raw", "file": {"driver": "file", "filename": "TEST_DIR/PID-node0.img"}, "node-name": "node0"}, "driver": "IMGFMT", "file": {"driver": "file", "filename": "TEST_DIR/PID-node1.img"}, "node-name": "node1"}, "driver": "IMGFMT", "file": {"driver": "file", "filename": "TEST_DIR/PID-node2.img"}, "node-name": "node2"}, "driver": "IMGFMT", "file": {"driver": "file", "filename": "TEST_DIR/PID-node3.img"}, "node-name": "node3"}, "driver": "IMGFMT", "file": {"driver": "throttle", "file": {"driver": "file", "filename": "TEST_DIR/PID-node4.img"}, "throttle-group": "tg"}, "node-name": "node4"}}
 {"return": {}}
@@ -XXX,XX +XXX,XX @@ Running tests:

 === Commit and stream finish concurrently (letting commit write) ===

-{"execute": "object-add", "arguments": {"id": "tg", "props": {"x-iops-write": 1, "x-iops-write-max": 1}, "qom-type": "throttle-group"}}
+{"execute": "object-add", "arguments": {"id": "tg", "limits": {"iops-write": 1, "iops-write-max": 1}, "qom-type": "throttle-group"}}
 {"return": {}}
 {"execute": "blockdev-add", "arguments": {"backing": {"backing": {"backing": {"backing": {"driver": "raw", "file": {"driver": "throttle", "file": {"driver": "file", "filename": "TEST_DIR/PID-node0.img"}, "throttle-group": "tg"}, "node-name": "node0"}, "driver": "IMGFMT", "file": {"driver": "file", "filename": "TEST_DIR/PID-node1.img"}, "node-name": "node1"}, "driver": "IMGFMT", "file": {"driver": "file", "filename": "TEST_DIR/PID-node2.img"}, "node-name": "node2"}, "driver": "IMGFMT", "file": {"driver": "file", "filename": "TEST_DIR/PID-node3.img"}, "node-name": "node3"}, "driver": "IMGFMT", "file": {"driver": "file", "filename": "TEST_DIR/PID-node4.img"}, "node-name": "node4"}}
 {"return": {}}
diff --git a/tests/qemu-iotests/295 b/tests/qemu-iotests/295
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/295
+++ b/tests/qemu-iotests/295
@@ -XXX,XX +XXX,XX @@ class Secret:

     def to_qmp_object(self):
         return { "qom_type" : "secret", "id": self.id(),
-                 "props": { "data": self.secret() } }
+                 "data": self.secret() }

 ################################################################################
 class EncryptionSetupTestCase(iotests.QMPTestCase):
diff --git a/tests/qemu-iotests/296 b/tests/qemu-iotests/296
index XXXXXXX..XXXXXXX 100755
--- a/tests/qemu-iotests/296
+++ b/tests/qemu-iotests/296
@@ -XXX,XX +XXX,XX @@ class Secret:

     def to_qmp_object(self):
         return { "qom_type" : "secret", "id": self.id(),
-                 "props": { "data": self.secret() } }
+                 "data": self.secret() }

 ################################################################################

-- 
2.29.2
51
199
52
200
diff view generated by jsdifflib
1
From: Max Reitz <mreitz@redhat.com>

The block job holds a reference to the backup-top node (because it is
passed as the main job BDS to block_job_create()). Therefore,
bdrv_backup_top_drop() cannot delete the backup-top node (replacing it
by its child does not affect the job parent, because that has
.stay_at_node set). That is a problem, because all of its I/O functions
assume the BlockCopyState (s->bcs) to be valid and that it has a
filtered child; but after bdrv_backup_top_drop(), neither of those
things is true.

It does not make sense to add new parents to backup-top after
backup_clean(), so we should detach it from the job before
bdrv_backup_top_drop(). Because there is no function to do that for a
single node, just detach all of the job's nodes -- the job does not do
anything past backup_clean() anyway.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210219153348.41861-2-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/backup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/backup.c b/block/backup.c
index XXXXXXX..XXXXXXX 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -XXX,XX +XXX,XX @@ static void backup_abort(Job *job)
 static void backup_clean(Job *job)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
+    block_job_remove_all_bdrv(&s->common);
     bdrv_backup_top_drop(s->backup_top);
 }

-- 
2.29.2
39
44
40
diff view generated by jsdifflib
1
From: Max Reitz <mreitz@redhat.com>

When the backup-top node transitions from active to inactive in
bdrv_backup_top_drop(), the BlockCopyState is freed and the filtered
child is removed, so the node effectively becomes unusable.

However, no one told its I/O functions this, so they will happily
continue accessing bs->backing and s->bcs. Prevent that by aborting
early when s->active is false.

(After the preceding patch, the node should be gone after
bdrv_backup_top_drop(), so this should largely be a theoretical problem.
But still, better to be safe than sorry, and also I think it just makes
sense to check s->active in the I/O functions.)

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210219153348.41861-3-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/backup-top.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/block/backup-top.c b/block/backup-top.c
index XXXXXXX..XXXXXXX 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int backup_top_co_preadv(
         BlockDriverState *bs, uint64_t offset, uint64_t bytes,
         QEMUIOVector *qiov, int flags)
 {
+    BDRVBackupTopState *s = bs->opaque;
+
+    if (!s->active) {
+        return -EIO;
+    }
+
     return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
 }

@@ -XXX,XX +XXX,XX @@ static coroutine_fn int backup_top_cbw(BlockDriverState *bs, uint64_t offset,
     BDRVBackupTopState *s = bs->opaque;
     uint64_t off, end;

+    if (!s->active) {
+        return -EIO;
+    }
+
     if (flags & BDRV_REQ_WRITE_UNCHANGED) {
         return 0;
     }

-- 
2.29.2
53
89
54
diff view generated by jsdifflib
1
From: Max Reitz <mreitz@redhat.com>
1
From: Max Reitz <mreitz@redhat.com>
2
2
3
The bs->exact_filename field may not be sufficient to store the full
3
Without any of HEAD^ or HEAD^^ applied, qemu will most likely crash on
4
blkdebug node filename. In this case, we should not generate a filename
4
the qemu-io invocation, for a variety of immediate reasons. The
5
at all instead of an unusable one.
5
underlying problem is generally a use-after-free access into
6
backup-top's BlockCopyState.
6
7
7
Cc: qemu-stable@nongnu.org
8
With only HEAD^ applied, qemu-io will run into an EIO (which is not
8
Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
9
capture by the output, but you can see that the qemu-io invocation will
10
be accepted (i.e., qemu-io will run) in contrast to the reference
11
output, where the node name cannot be found), and qemu will then crash
12
in query-named-block-nodes: bdrv_get_allocated_file_size() detects
13
backup-top to be a filter and passes the request through to its child.
14
However, after bdrv_backup_top_drop(), that child is NULL, so the
15
recursive call crashes.
16
17
With HEAD^^ applied, this test should pass.
18
9
Signed-off-by: Max Reitz <mreitz@redhat.com>
19
Signed-off-by: Max Reitz <mreitz@redhat.com>
10
Message-id: 20170613172006.19685-2-mreitz@redhat.com
20
Message-Id: <20210219153348.41861-4-mreitz@redhat.com>
11
Reviewed-by: Alberto Garcia <berto@igalia.com>
21
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
12
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Signed-off-by: Max Reitz <mreitz@redhat.com>
14
---
22
---
15
block/blkdebug.c | 10 +++++++---
23
tests/qemu-iotests/283 | 53 ++++++++++++++++++++++++++++++++++++++
16
1 file changed, 7 insertions(+), 3 deletions(-)
24
tests/qemu-iotests/283.out | 15 +++++++++++
25
2 files changed, 68 insertions(+)
17
26
18
diff --git a/block/blkdebug.c b/block/blkdebug.c
27
diff --git a/tests/qemu-iotests/283 b/tests/qemu-iotests/283
28
index XXXXXXX..XXXXXXX 100755
29
--- a/tests/qemu-iotests/283
30
+++ b/tests/qemu-iotests/283
31
@@ -XXX,XX +XXX,XX @@ vm.qmp_log('blockdev-add', **{
32
vm.qmp_log('blockdev-backup', sync='full', device='source', target='target')
33
34
vm.shutdown()
35
+
36
+
37
+print('\n=== backup-top should be gone after job-finalize ===\n')
38
+
39
+# Check that the backup-top node is gone after job-finalize.
40
+#
41
+# During finalization, the node becomes inactive and can no longer
42
+# function. If it is still present, new parents might be attached, and
43
+# there would be no meaningful way to handle their I/O requests.
44
+
45
+vm = iotests.VM()
46
+vm.launch()
47
+
48
+vm.qmp_log('blockdev-add', **{
49
+ 'node-name': 'source',
50
+ 'driver': 'null-co',
51
+})
52
+
53
+vm.qmp_log('blockdev-add', **{
54
+ 'node-name': 'target',
55
+ 'driver': 'null-co',
56
+})
57
+
58
+vm.qmp_log('blockdev-backup',
59
+ job_id='backup',
60
+ device='source',
61
+ target='target',
62
+ sync='full',
63
+ filter_node_name='backup-filter',
64
+ auto_finalize=False,
65
+ auto_dismiss=False)
66
+
67
+vm.event_wait('BLOCK_JOB_PENDING', 5.0)
68
+
69
+# The backup-top filter should still be present prior to finalization
70
+assert vm.node_info('backup-filter') is not None
71
+
72
+vm.qmp_log('job-finalize', id='backup')
73
+vm.event_wait('BLOCK_JOB_COMPLETED', 5.0)
74
+
75
+# The filter should be gone now. Check that by trying to access it
76
+# with qemu-io (which will most likely crash qemu if it is still
77
+# there.).
78
+vm.qmp_log('human-monitor-command',
79
+ command_line='qemu-io backup-filter "write 0 1M"')
80
+
81
+# (Also, do an explicit check.)
82
+assert vm.node_info('backup-filter') is None
83
+
84
+vm.qmp_log('job-dismiss', id='backup')
85
+vm.event_wait('JOB_STATUS_CHANGE', 5.0, {'data': {'status': 'null'}})
86
+
87
+vm.shutdown()
88
diff --git a/tests/qemu-iotests/283.out b/tests/qemu-iotests/283.out
19
index XXXXXXX..XXXXXXX 100644
89
index XXXXXXX..XXXXXXX 100644
20
--- a/block/blkdebug.c
90
--- a/tests/qemu-iotests/283.out
21
+++ b/block/blkdebug.c
91
+++ b/tests/qemu-iotests/283.out
22
@@ -XXX,XX +XXX,XX @@ static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
92
@@ -XXX,XX +XXX,XX @@
23
}
93
{"return": {}}
24
94
{"execute": "blockdev-backup", "arguments": {"device": "source", "sync": "full", "target": "target"}}
25
if (!force_json && bs->file->bs->exact_filename[0]) {
95
{"error": {"class": "GenericError", "desc": "Cannot set permissions for backup-top filter: Conflicts with use by other as 'image', which uses 'write' on base"}}
26
- snprintf(bs->exact_filename, sizeof(bs->exact_filename),
96
+
27
- "blkdebug:%s:%s", s->config_file ?: "",
97
+=== backup-top should be gone after job-finalize ===
28
- bs->file->bs->exact_filename);
98
+
29
+ int ret = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
99
+{"execute": "blockdev-add", "arguments": {"driver": "null-co", "node-name": "source"}}
30
+ "blkdebug:%s:%s", s->config_file ?: "",
100
+{"return": {}}
31
+ bs->file->bs->exact_filename);
101
+{"execute": "blockdev-add", "arguments": {"driver": "null-co", "node-name": "target"}}
32
+ if (ret >= sizeof(bs->exact_filename)) {
102
+{"return": {}}
33
+ /* An overflow makes the filename unusable, so do not report any */
103
+{"execute": "blockdev-backup", "arguments": {"auto-dismiss": false, "auto-finalize": false, "device": "source", "filter-node-name": "backup-filter", "job-id": "backup", "sync": "full", "target": "target"}}
34
+ bs->exact_filename[0] = 0;
104
+{"return": {}}
35
+ }
105
+{"execute": "job-finalize", "arguments": {"id": "backup"}}
36
}
106
+{"return": {}}
37
107
+{"execute": "human-monitor-command", "arguments": {"command-line": "qemu-io backup-filter \"write 0 1M\""}}
38
opts = qdict_new();
108
+{"return": "Error: Cannot find device= nor node_name=backup-filter\r\n"}
109
+{"execute": "job-dismiss", "arguments": {"id": "backup"}}
110
+{"return": {}}
39
--
111
--
40
1.8.3.1
112
2.29.2
41
113
42
114
diff view generated by jsdifflib
1
All functions that are marked coroutine_fn can directly call the
1
From: Eric Blake <eblake@redhat.com>
2
bdrv_co_* version of functions instead of going through the wrapper.
3
2
3
Break some long lines, and relax our type hints to be more generic to
4
any JSON, in order to more easily permit the additional JSON depth now
5
possible in migration parameters. Detected by iotest 297.
6
7
Fixes: ca4bfec41d56
8
(qemu-iotests: 300: Add test case for modifying persistence of bitmap)
9
Reported-by: Kevin Wolf <kwolf@redhat.com>
10
Signed-off-by: Eric Blake <eblake@redhat.com>
11
Message-Id: <20210215220518.1745469-1-eblake@redhat.com>
12
Reviewed-by: John Snow <jsnow@redhat.com>
13
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
4
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
14
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
5
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
6
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
15
---
8
block/qed.c | 16 +++++++++-------
16
tests/qemu-iotests/300 | 10 ++++++----
9
1 file changed, 9 insertions(+), 7 deletions(-)
17
1 file changed, 6 insertions(+), 4 deletions(-)
10
18
11
diff --git a/block/qed.c b/block/qed.c
19
diff --git a/tests/qemu-iotests/300 b/tests/qemu-iotests/300
12
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100755
13
--- a/block/qed.c
21
--- a/tests/qemu-iotests/300
14
+++ b/block/qed.c
22
+++ b/tests/qemu-iotests/300
15
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn qed_write_header(BDRVQEDState *s)
23
@@ -XXX,XX +XXX,XX @@
16
};
24
import os
17
qemu_iovec_init_external(&qiov, &iov, 1);
25
import random
18
26
import re
19
- ret = bdrv_preadv(s->bs->file, 0, &qiov);
27
-from typing import Dict, List, Optional, Union
20
+ ret = bdrv_co_preadv(s->bs->file, 0, qiov.size, &qiov, 0);
28
+from typing import Dict, List, Optional
21
if (ret < 0) {
29
22
goto out;
30
import iotests
23
}
31
24
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn qed_write_header(BDRVQEDState *s)
32
@@ -XXX,XX +XXX,XX @@ import iotests
25
/* Update header */
33
# pylint: disable=wrong-import-order
26
qed_header_cpu_to_le(&s->header, (QEDHeader *) buf);
34
import qemu
27
35
28
- ret = bdrv_pwritev(s->bs->file, 0, &qiov);
36
-BlockBitmapMapping = List[Dict[str, Union[str, List[Dict[str, str]]]]]
29
+ ret = bdrv_co_pwritev(s->bs->file, 0, qiov.size, &qiov, 0);
37
+BlockBitmapMapping = List[Dict[str, object]]
30
if (ret < 0) {
38
31
goto out;
39
mig_sock = os.path.join(iotests.sock_dir, 'mig_sock')
32
}
40
33
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
41
@@ -XXX,XX +XXX,XX @@ class TestCrossAliasMigration(TestDirtyBitmapMigration):
34
qemu_iovec_concat(*backing_qiov, qiov, 0, size);
42
35
43
class TestAliasTransformMigration(TestDirtyBitmapMigration):
36
BLKDBG_EVENT(s->bs->file, BLKDBG_READ_BACKING_AIO);
44
"""
37
- ret = bdrv_preadv(s->bs->backing, pos, *backing_qiov);
45
- Tests the 'transform' option which modifies bitmap persistence on migration.
38
+ ret = bdrv_co_preadv(s->bs->backing, pos, size, *backing_qiov, 0);
46
+ Tests the 'transform' option which modifies bitmap persistence on
39
if (ret < 0) {
47
+ migration.
40
return ret;
48
"""
41
}
49
42
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn qed_copy_from_backing_file(BDRVQEDState *s,
50
src_node_name = 'node-a'
43
}
51
@@ -XXX,XX +XXX,XX @@ class TestAliasTransformMigration(TestDirtyBitmapMigration):
44
52
bitmaps = self.vm_b.query_bitmaps()
45
BLKDBG_EVENT(s->bs->file, BLKDBG_COW_WRITE);
53
46
- ret = bdrv_pwritev(s->bs->file, offset, &qiov);
54
for node in bitmaps:
47
+ ret = bdrv_co_pwritev(s->bs->file, offset, qiov.size, &qiov, 0);
55
- bitmaps[node] = sorted(((bmap['name'], bmap['persistent']) for bmap in bitmaps[node]))
48
if (ret < 0) {
56
+ bitmaps[node] = sorted(((bmap['name'], bmap['persistent'])
49
goto out;
57
+ for bmap in bitmaps[node]))
50
}
58
51
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn qed_aio_write_main(QEDAIOCB *acb)
59
self.assertEqual(bitmaps,
52
trace_qed_aio_write_main(s, acb, 0, offset, acb->cur_qiov.size);
60
{'node-a': [('bmap-a', True), ('bmap-b', False)],
53
54
BLKDBG_EVENT(s->bs->file, BLKDBG_WRITE_AIO);
55
- ret = bdrv_pwritev(s->bs->file, offset, &acb->cur_qiov);
56
+ ret = bdrv_co_pwritev(s->bs->file, offset, acb->cur_qiov.size,
57
+ &acb->cur_qiov, 0);
58
if (ret < 0) {
59
return ret;
60
}
61
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn qed_aio_write_main(QEDAIOCB *acb)
62
* region. The solution is to flush after writing a new data
63
* cluster and before updating the L2 table.
64
*/
65
- ret = bdrv_flush(s->bs->file->bs);
66
+ ret = bdrv_co_flush(s->bs->file->bs);
67
if (ret < 0) {
68
return ret;
69
}
70
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn qed_aio_read_data(void *opaque, int ret,
71
}
72
73
BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
74
- ret = bdrv_preadv(bs->file, offset, &acb->cur_qiov);
75
+ ret = bdrv_co_preadv(bs->file, offset, acb->cur_qiov.size,
76
+ &acb->cur_qiov, 0);
77
if (ret < 0) {
78
return ret;
79
}
80
--
61
--
81
1.8.3.1
62
2.29.2
82
63
83
64
diff view generated by jsdifflib
1
From: "sochin.jiang" <sochin.jiang@huawei.com>
1
From: Stefano Garzarella <sgarzare@redhat.com>
2
2
3
img_commit could fall into an infinite loop calling run_block_job() if
3
When a block job fails, we report strerror(-job->job.ret) error
4
its blockjob fails on any I/O error, fix this already known problem.
4
message, also if the job set an error object.
5
Let's report a better error message using error_get_pretty(job->job.err).
5
6
6
Signed-off-by: sochin.jiang <sochin.jiang@huawei.com>
7
If an error object was not set, strerror(-job->ret) is used as fallback,
7
Message-id: 1497509253-28941-1-git-send-email-sochin.jiang@huawei.com
8
as explained in include/qemu/job.h:
8
Signed-off-by: Max Reitz <mreitz@redhat.com>
9
10
typedef struct Job {
11
...
12
/**
13
* Error object for a failed job.
14
* If job->ret is nonzero and an error object was not set, it will be set
15
* to strerror(-job->ret) during job_completed.
16
*/
17
Error *err;
18
}
19
20
In block_job_query() there can be a transient where 'job.err' is not set
21
by a scheduled bottom half. In that case we use strerror(-job->ret) as it
22
was before.
23
24
Suggested-by: Kevin Wolf <kwolf@redhat.com>
25
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
26
Message-Id: <20210225103633.76746-1-sgarzare@redhat.com>
27
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9
---
28
---
10
blockjob.c | 4 ++--
29
blockjob.c | 10 +++++++---
11
include/block/blockjob.h | 18 ++++++++++++++++++
30
1 file changed, 7 insertions(+), 3 deletions(-)
12
qemu-img.c | 20 +++++++++++++-------
13
3 files changed, 33 insertions(+), 9 deletions(-)
14
31
15
diff --git a/blockjob.c b/blockjob.c
32
diff --git a/blockjob.c b/blockjob.c
16
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
17
--- a/blockjob.c
34
--- a/blockjob.c
18
+++ b/blockjob.c
35
+++ b/blockjob.c
19
@@ -XXX,XX +XXX,XX @@ static void block_job_resume(BlockJob *job)
36
@@ -XXX,XX +XXX,XX @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
20
block_job_enter(job);
37
info->status = job->job.status;
38
info->auto_finalize = job->job.auto_finalize;
39
info->auto_dismiss = job->job.auto_dismiss;
40
- info->has_error = job->job.ret != 0;
41
- info->error = job->job.ret ? g_strdup(strerror(-job->job.ret)) : NULL;
42
+ if (job->job.ret) {
43
+ info->has_error = true;
44
+ info->error = job->job.err ?
45
+ g_strdup(error_get_pretty(job->job.err)) :
46
+ g_strdup(strerror(-job->job.ret));
47
+ }
48
return info;
21
}
49
}
22
50
23
-static void block_job_ref(BlockJob *job)
51
@@ -XXX,XX +XXX,XX @@ static void block_job_event_completed(Notifier *n, void *opaque)
24
+void block_job_ref(BlockJob *job)
52
}
25
{
53
26
++job->refcnt;
54
if (job->job.ret < 0) {
27
}
55
- msg = strerror(-job->job.ret);
28
@@ -XXX,XX +XXX,XX @@ static void block_job_attached_aio_context(AioContext *new_context,
56
+ msg = error_get_pretty(job->job.err);
29
void *opaque);
57
}
30
static void block_job_detach_aio_context(void *opaque);
58
31
59
qapi_event_send_block_job_completed(job_type(&job->job),
32
-static void block_job_unref(BlockJob *job)
33
+void block_job_unref(BlockJob *job)
34
{
35
if (--job->refcnt == 0) {
36
BlockDriverState *bs = blk_bs(job->blk);
37
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
38
index XXXXXXX..XXXXXXX 100644
39
--- a/include/block/blockjob.h
40
+++ b/include/block/blockjob.h
41
@@ -XXX,XX +XXX,XX @@ void block_job_iostatus_reset(BlockJob *job);
42
BlockJobTxn *block_job_txn_new(void);
43
44
/**
45
+ * block_job_ref:
46
+ *
47
+ * Add a reference to BlockJob refcnt, it will be decreased with
48
+ * block_job_unref, and then be freed if it comes to be the last
49
+ * reference.
50
+ */
51
+void block_job_ref(BlockJob *job);
52
+
53
+/**
54
+ * block_job_unref:
55
+ *
56
+ * Release a reference that was previously acquired with block_job_ref
57
+ * or block_job_create. If it's the last reference to the object, it will be
58
+ * freed.
59
+ */
60
+void block_job_unref(BlockJob *job);
61
+
62
+/**
63
* block_job_txn_unref:
64
*
65
* Release a reference that was previously acquired with block_job_txn_add_job
66
diff --git a/qemu-img.c b/qemu-img.c
67
index XXXXXXX..XXXXXXX 100644
68
--- a/qemu-img.c
69
+++ b/qemu-img.c
70
@@ -XXX,XX +XXX,XX @@ static void common_block_job_cb(void *opaque, int ret)
71
static void run_block_job(BlockJob *job, Error **errp)
72
{
73
AioContext *aio_context = blk_get_aio_context(job->blk);
74
+ int ret = 0;
75
76
- /* FIXME In error cases, the job simply goes away and we access a dangling
77
- * pointer below. */
78
aio_context_acquire(aio_context);
79
+ block_job_ref(job);
80
do {
81
aio_poll(aio_context, true);
82
qemu_progress_print(job->len ?
83
((float)job->offset / job->len * 100.f) : 0.0f, 0);
84
- } while (!job->ready);
85
+ } while (!job->ready && !job->completed);
86
87
- block_job_complete_sync(job, errp);
88
+ if (!job->completed) {
89
+ ret = block_job_complete_sync(job, errp);
90
+ } else {
91
+ ret = job->ret;
92
+ }
93
+ block_job_unref(job);
94
aio_context_release(aio_context);
95
96
- /* A block job may finish instantaneously without publishing any progress,
97
- * so just signal completion here */
98
- qemu_progress_print(100.f, 0);
99
+ /* publish completion progress only when success */
100
+ if (!ret) {
101
+ qemu_progress_print(100.f, 0);
102
+ }
103
}
104
105
static int img_commit(int argc, char **argv)
106
--
60
--
107
1.8.3.1
61
2.29.2
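As a side note, the reporting rule described in the commit message (prefer a
descriptive error object when one exists, otherwise fall back to
strerror(-ret)) can be sketched outside of QEMU as a small standalone C
program. The struct and helper below are made up for the illustration and are
not QEMU APIs:

    /* Minimal sketch: prefer a rich error string, fall back to strerror(). */
    #include <stdio.h>
    #include <string.h>

    struct job_result {
        int ret;          /* 0 on success, negative errno on failure */
        const char *err;  /* optional human-readable error, may be NULL */
    };

    static const char *job_error_message(const struct job_result *r)
    {
        if (r->ret == 0) {
            return NULL;                     /* success: nothing to report */
        }
        return r->err ? r->err : strerror(-r->ret);
    }

    int main(void)
    {
        struct job_result with_msg = { -5, "Failed to get shared \"write\" lock" };
        struct job_result without_msg = { -5, NULL };

        printf("%s\n", job_error_message(&with_msg));    /* rich message */
        printf("%s\n", job_error_message(&without_msg)); /* strerror fallback */
        return 0;
    }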
108
62
109
63
diff view generated by jsdifflib
1
From: Paolo Bonzini <pbonzini@redhat.com>

If the first character of optstring is '-', then each nonoption argv
element is handled as if it were the argument of an option with character
code 1. This removes the reordering of the argv array, and enables usage
of loc_set_cmdline to provide better error messages.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210301152844.291799-2-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 storage-daemon/qemu-storage-daemon.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/storage-daemon/qemu-storage-daemon.c b/storage-daemon/qemu-storage-daemon.c
index XXXXXXX..XXXXXXX 100644
--- a/storage-daemon/qemu-storage-daemon.c
+++ b/storage-daemon/qemu-storage-daemon.c
@@ -XXX,XX +XXX,XX @@ static void process_options(int argc, char *argv[])
      * they are given on the command lines. This means that things must be
      * defined first before they can be referenced in another option.
      */
-    while ((c = getopt_long(argc, argv, "hT:V", long_options, NULL)) != -1) {
+    while ((c = getopt_long(argc, argv, "-hT:V", long_options, NULL)) != -1) {
         switch (c) {
         case '?':
             exit(EXIT_FAILURE);
@@ -XXX,XX +XXX,XX @@ static void process_options(int argc, char *argv[])
             qobject_unref(args);
             break;
         }
+        case 1:
+            error_report("Unexpected argument: %s", optarg);
+            exit(EXIT_FAILURE);
         default:
             g_assert_not_reached();
         }
     }
-    if (optind != argc) {
-        error_report("Unexpected argument: %s", argv[optind]);
-        exit(EXIT_FAILURE);
-    }
 }

 int main(int argc, char *argv[])

-- 
2.29.2
48
2.29.2
227
49
228
50
diff view generated by jsdifflib
1
From: Paolo Bonzini <pbonzini@redhat.com>

Use the location management facilities that the emulator uses, so that
the current command line option appears in the error message.

Before:

  $ storage-daemon/qemu-storage-daemon --nbd key..=
  qemu-storage-daemon: Invalid parameter 'key..'

After:

  $ storage-daemon/qemu-storage-daemon --nbd key..=
  qemu-storage-daemon: --nbd key..=: Invalid parameter 'key..'

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210301152844.291799-3-pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 storage-daemon/qemu-storage-daemon.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
24
23
25
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
24
diff --git a/storage-daemon/qemu-storage-daemon.c b/storage-daemon/qemu-storage-daemon.c
26
index XXXXXXX..XXXXXXX 100644
25
index XXXXXXX..XXXXXXX 100644
27
--- a/hw/block/nvme.c
26
--- a/storage-daemon/qemu-storage-daemon.c
28
+++ b/hw/block/nvme.c
27
+++ b/storage-daemon/qemu-storage-daemon.c
29
@@ -XXX,XX +XXX,XX @@
28
@@ -XXX,XX +XXX,XX @@ static void init_qmp_commands(void)
30
* cmb_size_mb=<cmb_size_mb[optional]>
29
qmp_marshal_qmp_capabilities, QCO_ALLOW_PRECONFIG);
31
*
32
* Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
33
- * offset 0 in BAR2 and supports SQS only for now.
34
+ * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
35
*/
36
37
#include "qemu/osdep.h"
38
@@ -XXX,XX +XXX,XX @@ static void nvme_isr_notify(NvmeCtrl *n, NvmeCQueue *cq)
39
}
40
}
30
}
41
31
42
-static uint16_t nvme_map_prp(QEMUSGList *qsg, uint64_t prp1, uint64_t prp2,
32
+static int getopt_set_loc(int argc, char **argv, const char *optstring,
43
- uint32_t len, NvmeCtrl *n)
33
+ const struct option *longopts)
44
+static uint16_t nvme_map_prp(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t prp1,
34
+{
45
+ uint64_t prp2, uint32_t len, NvmeCtrl *n)
35
+ int c, save_index;
36
+
37
+ optarg = NULL;
38
+ save_index = optind;
39
+ c = getopt_long(argc, argv, optstring, longopts, NULL);
40
+ if (optarg) {
41
+ loc_set_cmdline(argv, save_index, MAX(1, optind - save_index));
42
+ }
43
+ return c;
44
+}
45
+
46
static void process_options(int argc, char *argv[])
46
{
47
{
47
hwaddr trans_len = n->page_size - (prp1 % n->page_size);
48
int c;
48
trans_len = MIN(len, trans_len);
49
@@ -XXX,XX +XXX,XX @@ static void process_options(int argc, char *argv[])
49
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, uint64_t prp1, uint64_t prp2,
50
* they are given on the command lines. This means that things must be
50
51
* defined first before they can be referenced in another option.
51
if (!prp1) {
52
*/
52
return NVME_INVALID_FIELD | NVME_DNR;
53
- while ((c = getopt_long(argc, argv, "-hT:V", long_options, NULL)) != -1) {
53
+ } else if (n->cmbsz && prp1 >= n->ctrl_mem.addr &&
54
+ while ((c = getopt_set_loc(argc, argv, "-hT:V", long_options)) != -1) {
54
+ prp1 < n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size)) {
55
switch (c) {
55
+ qsg->nsg = 0;
56
case '?':
56
+ qemu_iovec_init(iov, num_prps);
57
exit(EXIT_FAILURE);
57
+ qemu_iovec_add(iov, (void *)&n->cmbuf[prp1 - n->ctrl_mem.addr], trans_len);
58
@@ -XXX,XX +XXX,XX @@ static void process_options(int argc, char *argv[])
58
+ } else {
59
break;
59
+ pci_dma_sglist_init(qsg, &n->parent_obj, num_prps);
60
+ qemu_sglist_add(qsg, prp1, trans_len);
61
}
62
-
63
- pci_dma_sglist_init(qsg, &n->parent_obj, num_prps);
64
- qemu_sglist_add(qsg, prp1, trans_len);
65
len -= trans_len;
66
if (len) {
67
if (!prp2) {
68
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, uint64_t prp1, uint64_t prp2,
69
70
nents = (len + n->page_size - 1) >> n->page_bits;
71
prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
72
- pci_dma_read(&n->parent_obj, prp2, (void *)prp_list, prp_trans);
73
+ nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
74
while (len != 0) {
75
uint64_t prp_ent = le64_to_cpu(prp_list[i]);
76
77
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, uint64_t prp1, uint64_t prp2,
78
i = 0;
79
nents = (len + n->page_size - 1) >> n->page_bits;
80
prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
81
- pci_dma_read(&n->parent_obj, prp_ent, (void *)prp_list,
82
+ nvme_addr_read(n, prp_ent, (void *)prp_list,
83
prp_trans);
84
prp_ent = le64_to_cpu(prp_list[i]);
85
}
86
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, uint64_t prp1, uint64_t prp2,
87
}
88
89
trans_len = MIN(len, n->page_size);
90
- qemu_sglist_add(qsg, prp_ent, trans_len);
91
+ if (qsg->nsg){
92
+ qemu_sglist_add(qsg, prp_ent, trans_len);
93
+ } else {
94
+ qemu_iovec_add(iov, (void *)&n->cmbuf[prp_ent - n->ctrl_mem.addr], trans_len);
95
+ }
96
len -= trans_len;
97
i++;
98
}
60
}
99
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_map_prp(QEMUSGList *qsg, uint64_t prp1, uint64_t prp2,
61
case 1:
100
if (prp2 & (n->page_size - 1)) {
62
- error_report("Unexpected argument: %s", optarg);
101
goto unmap;
63
+ error_report("Unexpected argument");
102
}
64
exit(EXIT_FAILURE);
103
- qemu_sglist_add(qsg, prp2, len);
65
default:
104
+ if (qsg->nsg) {
66
g_assert_not_reached();
105
+ qemu_sglist_add(qsg, prp2, len);
106
+ } else {
107
+ qemu_iovec_add(iov, (void *)&n->cmbuf[prp2 - n->ctrl_mem.addr], trans_len);
108
+ }
109
}
67
}
110
}
68
}
111
return NVME_SUCCESS;
69
+ loc_set_none();
112
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_dma_read_prp(NvmeCtrl *n, uint8_t *ptr, uint32_t len,
113
uint64_t prp1, uint64_t prp2)
114
{
115
QEMUSGList qsg;
116
+ QEMUIOVector iov;
117
+ uint16_t status = NVME_SUCCESS;
118
119
- if (nvme_map_prp(&qsg, prp1, prp2, len, n)) {
120
+ if (nvme_map_prp(&qsg, &iov, prp1, prp2, len, n)) {
121
return NVME_INVALID_FIELD | NVME_DNR;
122
}
123
- if (dma_buf_read(ptr, len, &qsg)) {
124
+ if (qsg.nsg > 0) {
125
+ if (dma_buf_read(ptr, len, &qsg)) {
126
+ status = NVME_INVALID_FIELD | NVME_DNR;
127
+ }
128
qemu_sglist_destroy(&qsg);
129
- return NVME_INVALID_FIELD | NVME_DNR;
130
+ } else {
131
+ if (qemu_iovec_to_buf(&iov, 0, ptr, len) != len) {
132
+ status = NVME_INVALID_FIELD | NVME_DNR;
133
+ }
134
+ qemu_iovec_destroy(&iov);
135
}
136
- qemu_sglist_destroy(&qsg);
137
- return NVME_SUCCESS;
138
+ return status;
139
}
70
}
140
71
141
static void nvme_post_cqes(void *opaque)
72
int main(int argc, char *argv[])
142
@@ -XXX,XX +XXX,XX @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeNamespace *ns, NvmeCmd *cmd,
143
return NVME_LBA_RANGE | NVME_DNR;
144
}
145
146
- if (nvme_map_prp(&req->qsg, prp1, prp2, data_size, n)) {
147
+ if (nvme_map_prp(&req->qsg, &req->iov, prp1, prp2, data_size, n)) {
148
block_acct_invalid(blk_get_stats(n->conf.blk), acct);
149
return NVME_INVALID_FIELD | NVME_DNR;
150
}
151
152
- assert((nlb << data_shift) == req->qsg.size);
153
-
154
- req->has_sg = true;
155
dma_acct_start(n->conf.blk, &req->acct, &req->qsg, acct);
156
- req->aiocb = is_write ?
157
- dma_blk_write(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR_SIZE,
158
- nvme_rw_cb, req) :
159
- dma_blk_read(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR_SIZE,
160
- nvme_rw_cb, req);
161
+ if (req->qsg.nsg > 0) {
162
+ req->has_sg = true;
163
+ req->aiocb = is_write ?
164
+ dma_blk_write(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR_SIZE,
165
+ nvme_rw_cb, req) :
166
+ dma_blk_read(n->conf.blk, &req->qsg, data_offset, BDRV_SECTOR_SIZE,
167
+ nvme_rw_cb, req);
168
+ } else {
169
+ req->has_sg = false;
170
+ req->aiocb = is_write ?
171
+ blk_aio_pwritev(n->conf.blk, data_offset, &req->iov, 0, nvme_rw_cb,
172
+ req) :
173
+ blk_aio_preadv(n->conf.blk, data_offset, &req->iov, 0, nvme_rw_cb,
174
+ req);
175
+ }
176
177
return NVME_NO_COMPLETE;
178
}
179
@@ -XXX,XX +XXX,XX @@ static int nvme_init(PCIDevice *pci_dev)
180
NVME_CMBSZ_SET_SQS(n->bar.cmbsz, 1);
181
NVME_CMBSZ_SET_CQS(n->bar.cmbsz, 0);
182
NVME_CMBSZ_SET_LISTS(n->bar.cmbsz, 0);
183
- NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 0);
184
- NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 0);
185
+ NVME_CMBSZ_SET_RDS(n->bar.cmbsz, 1);
186
+ NVME_CMBSZ_SET_WDS(n->bar.cmbsz, 1);
187
NVME_CMBSZ_SET_SZU(n->bar.cmbsz, 2); /* MBs */
188
NVME_CMBSZ_SET_SZ(n->bar.cmbsz, n->cmb_size_mb);
189
190
+ n->cmbloc = n->bar.cmbloc;
191
+ n->cmbsz = n->bar.cmbsz;
192
+
193
n->cmbuf = g_malloc0(NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
194
memory_region_init_io(&n->ctrl_mem, OBJECT(n), &nvme_cmb_ops, n,
195
"nvme-cmb", NVME_CMBSZ_GETSIZE(n->bar.cmbsz));
196
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
197
index XXXXXXX..XXXXXXX 100644
198
--- a/hw/block/nvme.h
199
+++ b/hw/block/nvme.h
200
@@ -XXX,XX +XXX,XX @@ typedef struct NvmeRequest {
201
NvmeCqe cqe;
202
BlockAcctCookie acct;
203
QEMUSGList qsg;
204
+ QEMUIOVector iov;
205
QTAILQ_ENTRY(NvmeRequest)entry;
206
} NvmeRequest;
207
208
--
73
--
209
1.8.3.1
74
2.29.2
210
75
211
76
1
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
just return an error code and let the caller handle it.
3
2
4
While refactoring qed_aio_write_alloc() to accommodate the change,
3
Daemons often have a --pidfile option where the pid is written to a file
5
qed_aio_write_zero_cluster() ended up with a single line, so I chose to
4
so that scripts can stop the daemon by sending a signal.
6
inline that line and remove the function completely.
7
5
6
The pid file also acts as a lock to prevent multiple instances of the
7
daemon from launching for a given pid file.
8
9
QEMU, qemu-nbd, qemu-ga, virtiofsd, and qemu-pr-helper all support the
10
--pidfile option. Add it to qemu-storage-daemon too.
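As a usage sketch (only --pidfile comes from this patch; the blockdev, NBD
server and export options are illustrative):

    $ qemu-storage-daemon \
        --blockdev driver=file,node-name=file0,filename=disk.img \
        --nbd-server addr.type=unix,addr.path=/var/run/qsd-nbd.sock \
        --export type=nbd,id=export0,node-name=file0,writable=on \
        --pidfile /var/run/qsd.pid &
    $ kill -SIGTERM "$(cat /var/run/qsd.pid)"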
11
12
Reported-by: Richard W.M. Jones <rjones@redhat.com>
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Message-Id: <20210302142746.170535-1-stefanha@redhat.com>
15
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
8
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
16
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
17
---
11
block/qed.c | 58 +++++++++++++++++++++-------------------------------------
18
docs/tools/qemu-storage-daemon.rst | 14 +++++++++++
12
1 file changed, 21 insertions(+), 37 deletions(-)
19
storage-daemon/qemu-storage-daemon.c | 36 ++++++++++++++++++++++++++++
20
2 files changed, 50 insertions(+)
13
21
14
diff --git a/block/qed.c b/block/qed.c
22
diff --git a/docs/tools/qemu-storage-daemon.rst b/docs/tools/qemu-storage-daemon.rst
15
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
16
--- a/block/qed.c
24
--- a/docs/tools/qemu-storage-daemon.rst
17
+++ b/block/qed.c
25
+++ b/docs/tools/qemu-storage-daemon.rst
18
@@ -XXX,XX +XXX,XX @@ static int qed_aio_write_main(QEDAIOCB *acb)
26
@@ -XXX,XX +XXX,XX @@ Standard options:
19
/**
27
List object properties with ``<type>,help``. See the :manpage:`qemu(1)`
20
* Populate untouched regions of new data cluster
28
manual page for a description of the object properties.
21
*/
29
22
-static void qed_aio_write_cow(void *opaque, int ret)
30
+.. option:: --pidfile PATH
23
+static int qed_aio_write_cow(QEDAIOCB *acb)
24
{
25
- QEDAIOCB *acb = opaque;
26
BDRVQEDState *s = acb_to_s(acb);
27
uint64_t start, len, offset;
28
+ int ret;
29
30
/* Populate front untouched region of new data cluster */
31
start = qed_start_of_cluster(s, acb->cur_pos);
32
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_cow(void *opaque, int ret)
33
34
trace_qed_aio_write_prefill(s, acb, start, len, acb->cur_cluster);
35
ret = qed_copy_from_backing_file(s, start, len, acb->cur_cluster);
36
- if (ret) {
37
- qed_aio_complete(acb, ret);
38
- return;
39
+ if (ret < 0) {
40
+ return ret;
41
}
42
43
/* Populate back untouched region of new data cluster */
44
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_cow(void *opaque, int ret)
45
46
trace_qed_aio_write_postfill(s, acb, start, len, offset);
47
ret = qed_copy_from_backing_file(s, start, len, offset);
48
- if (ret) {
49
- qed_aio_complete(acb, ret);
50
- return;
51
- }
52
-
53
- ret = qed_aio_write_main(acb);
54
if (ret < 0) {
55
- qed_aio_complete(acb, ret);
56
- return;
57
+ return ret;
58
}
59
- qed_aio_next_io(acb, 0);
60
+
31
+
61
+ return qed_aio_write_main(acb);
32
+ is the path to a file where the daemon writes its pid. This allows scripts to
33
+ stop the daemon by sending a signal::
34
+
35
+ $ kill -SIGTERM $(<path/to/qsd.pid)
36
+
37
+ A file lock is applied to the file so only one instance of the daemon can run
38
+ with a given pid file path. The daemon unlinks its pid file when terminating.
39
+
40
+ The pid file is written after chardevs, exports, and NBD servers have been
41
+ created but before accepting connections. The daemon has started successfully
42
+ when the pid file is written and clients may begin connecting.
43
+
44
Examples
45
--------
46
Launch the daemon with QMP monitor socket ``qmp.sock`` so clients can execute
47
diff --git a/storage-daemon/qemu-storage-daemon.c b/storage-daemon/qemu-storage-daemon.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/storage-daemon/qemu-storage-daemon.c
50
+++ b/storage-daemon/qemu-storage-daemon.c
51
@@ -XXX,XX +XXX,XX @@
52
#include "sysemu/runstate.h"
53
#include "trace/control.h"
54
55
+static const char *pid_file;
56
static volatile bool exit_requested = false;
57
58
void qemu_system_killed(int signal, pid_t pid)
59
@@ -XXX,XX +XXX,XX @@ static void help(void)
60
" See the qemu(1) man page for documentation of the\n"
61
" objects that can be added.\n"
62
"\n"
63
+" --pidfile <path> write process ID to a file after startup\n"
64
+"\n"
65
QEMU_HELP_BOTTOM "\n",
66
error_get_progname());
62
}
67
}
63
68
@@ -XXX,XX +XXX,XX @@ enum {
64
/**
69
OPTION_MONITOR,
65
@@ -XXX,XX +XXX,XX @@ static bool qed_should_set_need_check(BDRVQEDState *s)
70
OPTION_NBD_SERVER,
66
return !(s->header.features & QED_F_NEED_CHECK);
71
OPTION_OBJECT,
72
+ OPTION_PIDFILE,
73
};
74
75
extern QemuOptsList qemu_chardev_opts;
76
@@ -XXX,XX +XXX,XX @@ static void process_options(int argc, char *argv[])
77
{"monitor", required_argument, NULL, OPTION_MONITOR},
78
{"nbd-server", required_argument, NULL, OPTION_NBD_SERVER},
79
{"object", required_argument, NULL, OPTION_OBJECT},
80
+ {"pidfile", required_argument, NULL, OPTION_PIDFILE},
81
{"trace", required_argument, NULL, 'T'},
82
{"version", no_argument, NULL, 'V'},
83
{0, 0, 0, 0}
84
@@ -XXX,XX +XXX,XX @@ static void process_options(int argc, char *argv[])
85
qobject_unref(args);
86
break;
87
}
88
+ case OPTION_PIDFILE:
89
+ pid_file = optarg;
90
+ break;
91
case 1:
92
error_report("Unexpected argument");
93
exit(EXIT_FAILURE);
94
@@ -XXX,XX +XXX,XX @@ static void process_options(int argc, char *argv[])
95
loc_set_none();
67
}
96
}
68
97
69
-static void qed_aio_write_zero_cluster(void *opaque, int ret)
98
+static void pid_file_cleanup(void)
70
-{
99
+{
71
- QEDAIOCB *acb = opaque;
100
+ unlink(pid_file);
72
-
101
+}
73
- if (ret) {
102
+
74
- qed_aio_complete(acb, ret);
103
+static void pid_file_init(void)
75
- return;
104
+{
76
- }
105
+ Error *err = NULL;
77
-
106
+
78
- ret = qed_aio_write_l2_update(acb, 1);
107
+ if (!pid_file) {
79
- if (ret < 0) {
108
+ return;
80
- qed_aio_complete(acb, ret);
81
- return;
82
- }
83
- qed_aio_next_io(acb, 0);
84
-}
85
-
86
/**
87
* Write new data cluster
88
*
89
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_zero_cluster(void *opaque, int ret)
90
static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
91
{
92
BDRVQEDState *s = acb_to_s(acb);
93
- BlockCompletionFunc *cb;
94
int ret;
95
96
/* Cancel timer when the first allocating request comes in */
97
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
98
qed_aio_start_io(acb);
99
return;
100
}
101
-
102
- cb = qed_aio_write_zero_cluster;
103
} else {
104
- cb = qed_aio_write_cow;
105
acb->cur_cluster = qed_alloc_clusters(s, acb->cur_nclusters);
106
}
107
108
if (qed_should_set_need_check(s)) {
109
s->header.features |= QED_F_NEED_CHECK;
110
ret = qed_write_header(s);
111
- cb(acb, ret);
112
+ if (ret < 0) {
113
+ qed_aio_complete(acb, ret);
114
+ return;
115
+ }
116
+ }
109
+ }
117
+
110
+
118
+ if (acb->flags & QED_AIOCB_ZERO) {
111
+ if (!qemu_write_pidfile(pid_file, &err)) {
119
+ ret = qed_aio_write_l2_update(acb, 1);
112
+ error_reportf_err(err, "cannot create PID file: ");
120
} else {
113
+ exit(EXIT_FAILURE);
121
- cb(acb, 0);
114
+ }
122
+ ret = qed_aio_write_cow(acb);
115
+
116
+ atexit(pid_file_cleanup);
117
+}
118
+
119
int main(int argc, char *argv[])
120
{
121
#ifdef CONFIG_POSIX
122
@@ -XXX,XX +XXX,XX @@ int main(int argc, char *argv[])
123
qemu_init_main_loop(&error_fatal);
124
process_options(argc, argv);
125
126
+ /*
127
+ * Write the pid file after creating chardevs, exports, and NBD servers but
128
+ * before accepting connections. This ordering is documented. Do not change
129
+ * it.
130
+ */
131
+ pid_file_init();
132
+
133
while (!exit_requested) {
134
main_loop_wait(false);
123
}
135
}
124
+ if (ret < 0) {
125
+ qed_aio_complete(acb, ret);
126
+ return;
127
+ }
128
+ qed_aio_next_io(acb, 0);
129
}
130
131
/**
132
--
136
--
133
1.8.3.1
137
2.29.2
134
138
135
139
diff view generated by jsdifflib
1
From: Stefan Hajnoczi <stefanha@redhat.com>
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
2
3
Old kvm.ko versions only supported a tiny number of ioeventfds so
3
The QMP monitor, NBD server, and vhost-user-blk export all support file
4
virtio-pci avoids ioeventfds when kvm_has_many_ioeventfds() returns 0.
4
descriptor passing. This is a useful technique because it allows the
5
parent process to spawn and wait for qemu-storage-daemon without busy
6
waiting, which may delay startup due to arbitrary sleep() calls.
5
7
6
Do not check kvm_has_many_ioeventfds() when KVM is disabled since it
8
This Python example is inspired by the test case written for libnbd by
7
always returns 0. Since commit 8c56c1a592b5092d91da8d8943c17777d6462a6f
9
Richard W.M. Jones <rjones@redhat.com>:
8
("memory: emulate ioeventfd") it has been possible to use ioeventfds in
10
https://gitlab.com/nbdkit/libnbd/-/commit/89113f484effb0e6c322314ba75c1cbe07a04543
9
qtest or TCG mode.
10
11
11
This patch makes -device virtio-blk-pci,iothread=iothread0 work even
12
Thanks to Daniel P. Berrangé <berrange@redhat.com> for suggestions on
12
when KVM is disabled.
13
how to get this working. Now let's document it!
13
14
14
I have tested that virtio-blk-pci works under TCG both with and without
15
Reported-by: Richard W.M. Jones <rjones@redhat.com>
15
iothread.
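A minimal command line of the kind exercised here (the disk image name is
illustrative):

    $ qemu-system-x86_64 -machine accel=tcg \
        -object iothread,id=iothread0 \
        -drive if=none,id=drive0,file=disk.img,format=raw \
        -device virtio-blk-pci,drive=drive0,iothread=iothread0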
16
Cc: Kevin Wolf <kwolf@redhat.com>
16
17
Cc: Daniel P. Berrangé <berrange@redhat.com>
17
Cc: Michael S. Tsirkin <mst@redhat.com>
18
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
18
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
19
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
19
Message-Id: <20210301172728.135331-2-stefanha@redhat.com>
20
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
21
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
20
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
22
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
21
---
23
---
22
hw/virtio/virtio-pci.c | 2 +-
24
docs/tools/qemu-storage-daemon.rst | 42 ++++++++++++++++++++++++++++--
23
1 file changed, 1 insertion(+), 1 deletion(-)
25
1 file changed, 40 insertions(+), 2 deletions(-)
24
26
25
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
27
diff --git a/docs/tools/qemu-storage-daemon.rst b/docs/tools/qemu-storage-daemon.rst
26
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
27
--- a/hw/virtio/virtio-pci.c
29
--- a/docs/tools/qemu-storage-daemon.rst
28
+++ b/hw/virtio/virtio-pci.c
30
+++ b/docs/tools/qemu-storage-daemon.rst
29
@@ -XXX,XX +XXX,XX @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error **errp)
31
@@ -XXX,XX +XXX,XX @@ Standard options:
30
bool pcie_port = pci_bus_is_express(pci_dev->bus) &&
32
31
!pci_bus_is_root(pci_dev->bus);
33
.. option:: --nbd-server addr.type=inet,addr.host=<host>,addr.port=<port>[,tls-creds=<id>][,tls-authz=<id>][,max-connections=<n>]
32
34
--nbd-server addr.type=unix,addr.path=<path>[,tls-creds=<id>][,tls-authz=<id>][,max-connections=<n>]
33
- if (!kvm_has_many_ioeventfds()) {
35
+ --nbd-server addr.type=fd,addr.str=<fd>[,tls-creds=<id>][,tls-authz=<id>][,max-connections=<n>]
34
+ if (kvm_enabled() && !kvm_has_many_ioeventfds()) {
36
35
proxy->flags &= ~VIRTIO_PCI_FLAG_USE_IOEVENTFD;
37
is a server for NBD exports. Both TCP and UNIX domain sockets are supported.
36
}
38
- TLS encryption can be configured using ``--object`` tls-creds-* and authz-*
37
39
- secrets (see below).
40
+ A listen socket can be provided via file descriptor passing (see Examples
41
+ below). TLS encryption can be configured using ``--object`` tls-creds-* and
42
+ authz-* secrets (see below).
43
44
To configure an NBD server on UNIX domain socket path ``/tmp/nbd.sock``::
45
46
@@ -XXX,XX +XXX,XX @@ QMP commands::
47
--chardev socket,path=qmp.sock,server=on,wait=off,id=char1 \
48
--monitor chardev=char1
49
50
+Launch the daemon from Python with a QMP monitor socket using file descriptor
51
+passing so there is no need to busy wait for the QMP monitor to become
52
+available::
53
+
54
+ #!/usr/bin/env python3
55
+ import subprocess
56
+ import socket
57
+
58
+ sock_path = '/var/run/qmp.sock'
59
+
60
+ with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listen_sock:
61
+ listen_sock.bind(sock_path)
62
+ listen_sock.listen()
63
+
64
+ fd = listen_sock.fileno()
65
+
66
+ subprocess.Popen(
67
+ ['qemu-storage-daemon',
68
+ '--chardev', f'socket,fd={fd},server=on,id=char1',
69
+ '--monitor', 'chardev=char1'],
70
+ pass_fds=[fd],
71
+ )
72
+
73
+ # listen_sock was automatically closed when leaving the 'with' statement
74
+ # body. If the daemon process terminated early then the following connect()
75
+ # will fail with "Connection refused" because no process has the listen
76
+ # socket open anymore. Launch errors can be detected this way.
77
+
78
+ qmp_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
79
+ qmp_sock.connect(sock_path)
80
+ ...QMP interaction...
81
+
82
+The same socket spawning approach also works with the ``--nbd-server
83
+addr.type=fd,addr.str=<fd>`` and ``--export
84
+type=vhost-user-blk,addr.type=fd,addr.str=<fd>`` options.
85
+
86
Export raw image file ``disk.img`` over NBD UNIX domain socket ``nbd.sock``::
87
88
$ qemu-storage-daemon \
38
--
89
--
39
1.8.3.1
90
2.29.2
40
91
41
92
1
From: Stefan Hajnoczi <stefanha@redhat.com>
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
2
3
AioContext was designed to allow nested acquire/release calls. It uses
3
World-writable directories have security issues. Avoid showing them in
4
a recursive mutex so callers don't need to worry about nesting...or so
4
the documentation since someone might accidentally use them in
5
we thought.
5
situations where they are insecure.
6
6
7
BDRV_POLL_WHILE() is used to wait for block I/O requests. It releases
7
There tend to be 3 security problems:
8
the AioContext temporarily around aio_poll(). This gives IOThreads a
8
1. Denial of service. An adversary may be able to create the file
9
chance to acquire the AioContext to process I/O completions.
9
beforehand, consume all space/inodes, etc to sabotage us.
10
2. Impersonation. An adversary may be able to create a listen socket and
11
accept incoming connections that were meant for us.
12
3. Unauthenticated client access. An adversary may be able to connect to
13
us if we did not set the uid/gid and permissions correctly.
10
14
11
It turns out that recursive locking and BDRV_POLL_WHILE() don't mix.
15
These can be prevented or mitigated with private /tmp, carefully setting
12
BDRV_POLL_WHILE() only releases the AioContext once, so the IOThread
16
the umask, etc but that requires special action and does not apply to
13
will not be able to acquire the AioContext if it was acquired
17
all situations. Just avoid using /tmp in examples.
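For example, a per-user runtime directory that is not world-writable can be
used instead (the path and mode here are only an illustration):

    $ install -d -m 0700 "$XDG_RUNTIME_DIR/qsd"
    $ qemu-storage-daemon ... \
        --nbd-server addr.type=unix,addr.path="$XDG_RUNTIME_DIR/qsd/nbd.sock"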
14
multiple times.
15
18
16
Instead of trying to release AioContext n times in BDRV_POLL_WHILE(),
19
Reported-by: Richard W.M. Jones <rjones@redhat.com>
17
this patch simply avoids nested locking in save_vmstate(). It's the
20
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
18
simplest fix and we should step back to consider the big picture with
19
all the recent changes to block layer threading.
20
21
This patch is the final fix to solve 'savevm' hanging with -object
22
iothread.
23
24
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
21
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
25
Reviewed-by: Eric Blake <eblake@redhat.com>
22
Message-Id: <20210301172728.135331-3-stefanha@redhat.com>
26
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
23
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
24
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
27
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
25
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
28
---
26
---
29
migration/savevm.c | 12 +++++++++++-
27
docs/tools/qemu-storage-daemon.rst | 7 ++++---
30
1 file changed, 11 insertions(+), 1 deletion(-)
28
1 file changed, 4 insertions(+), 3 deletions(-)
31
29
32
diff --git a/migration/savevm.c b/migration/savevm.c
30
diff --git a/docs/tools/qemu-storage-daemon.rst b/docs/tools/qemu-storage-daemon.rst
33
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
34
--- a/migration/savevm.c
32
--- a/docs/tools/qemu-storage-daemon.rst
35
+++ b/migration/savevm.c
33
+++ b/docs/tools/qemu-storage-daemon.rst
36
@@ -XXX,XX +XXX,XX @@ int save_snapshot(const char *name, Error **errp)
34
@@ -XXX,XX +XXX,XX @@ Standard options:
37
goto the_end;
35
a description of character device properties. A common character device
38
}
36
definition configures a UNIX domain socket::
39
37
40
+ /* The bdrv_all_create_snapshot() call that follows acquires the AioContext
38
- --chardev socket,id=char1,path=/tmp/qmp.sock,server=on,wait=off
41
+ * for itself. BDRV_POLL_WHILE() does not support nested locking because
39
+ --chardev socket,id=char1,path=/var/run/qsd-qmp.sock,server=on,wait=off
42
+ * it only releases the lock once. Therefore synchronous I/O will deadlock
40
43
+ * unless we release the AioContext before bdrv_all_create_snapshot().
41
.. option:: --export [type=]nbd,id=<id>,node-name=<node-name>[,name=<export-name>][,writable=on|off][,bitmap=<name>]
44
+ */
42
--export [type=]vhost-user-blk,id=<id>,node-name=<node-name>,addr.type=unix,addr.path=<socket-path>[,writable=on|off][,logical-block-size=<block-size>][,num-queues=<num-queues>]
45
+ aio_context_release(aio_context);
43
@@ -XXX,XX +XXX,XX @@ Standard options:
46
+ aio_context = NULL;
44
below). TLS encryption can be configured using ``--object`` tls-creds-* and
47
+
45
authz-* secrets (see below).
48
ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, &bs);
46
49
if (ret < 0) {
47
- To configure an NBD server on UNIX domain socket path ``/tmp/nbd.sock``::
50
error_setg(errp, "Error while creating snapshot on '%s'",
48
+ To configure an NBD server on UNIX domain socket path
51
@@ -XXX,XX +XXX,XX @@ int save_snapshot(const char *name, Error **errp)
49
+ ``/var/run/qsd-nbd.sock``::
52
ret = 0;
50
53
51
- --nbd-server addr.type=unix,addr.path=/tmp/nbd.sock
54
the_end:
52
+ --nbd-server addr.type=unix,addr.path=/var/run/qsd-nbd.sock
55
- aio_context_release(aio_context);
53
56
+ if (aio_context) {
54
.. option:: --object help
57
+ aio_context_release(aio_context);
55
--object <type>,help
58
+ }
59
if (saved_vm_running) {
60
vm_start();
61
}
62
--
56
--
63
1.8.3.1
57
2.29.2
64
58
65
59
1
All callers pass ret = 0, so we can just remove it.
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
2
3
Treat the num_queues field as virtio-endian. On big-endian hosts the
4
vhost-user-blk num_queues field was in the wrong endianness.
5
6
Move the blkcfg.num_queues store operation from realize to
7
vhost_user_blk_update_config() so feature negotiation has finished and
8
we know the endianness of the device. VIRTIO 1.0 devices are
9
little-endian, but in case someone wants to use legacy VIRTIO we support
10
all endianness cases.
11
12
Cc: qemu-stable@nongnu.org
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
15
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
16
Message-Id: <20210223144653.811468-2-stefanha@redhat.com>
3
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
17
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
5
---
18
---
6
block/qed.c | 17 ++++++-----------
19
hw/block/vhost-user-blk.c | 7 +++----
7
1 file changed, 6 insertions(+), 11 deletions(-)
20
1 file changed, 3 insertions(+), 4 deletions(-)
8
21
9
diff --git a/block/qed.c b/block/qed.c
22
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
10
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
11
--- a/block/qed.c
24
--- a/hw/block/vhost-user-blk.c
12
+++ b/block/qed.c
25
+++ b/hw/block/vhost-user-blk.c
13
@@ -XXX,XX +XXX,XX @@ static CachedL2Table *qed_new_l2_table(BDRVQEDState *s)
26
@@ -XXX,XX +XXX,XX @@ static void vhost_user_blk_update_config(VirtIODevice *vdev, uint8_t *config)
14
return l2_table;
27
{
28
VHostUserBlk *s = VHOST_USER_BLK(vdev);
29
30
+ /* Our num_queues overrides the device backend */
31
+ virtio_stw_p(vdev, &s->blkcfg.num_queues, s->num_queues);
32
+
33
memcpy(config, &s->blkcfg, sizeof(struct virtio_blk_config));
15
}
34
}
16
35
17
-static void qed_aio_next_io(QEDAIOCB *acb, int ret);
36
@@ -XXX,XX +XXX,XX @@ reconnect:
18
+static void qed_aio_next_io(QEDAIOCB *acb);
37
goto reconnect;
19
20
static void qed_aio_start_io(QEDAIOCB *acb)
21
{
22
- qed_aio_next_io(acb, 0);
23
+ qed_aio_next_io(acb);
24
}
25
26
static void qed_plug_allocating_write_reqs(BDRVQEDState *s)
27
@@ -XXX,XX +XXX,XX @@ static int qed_aio_read_data(void *opaque, int ret, uint64_t offset, size_t len)
28
/**
29
* Begin next I/O or complete the request
30
*/
31
-static void qed_aio_next_io(QEDAIOCB *acb, int ret)
32
+static void qed_aio_next_io(QEDAIOCB *acb)
33
{
34
BDRVQEDState *s = acb_to_s(acb);
35
uint64_t offset;
36
size_t len;
37
+ int ret;
38
39
- trace_qed_aio_next_io(s, acb, ret, acb->cur_pos + acb->cur_qiov.size);
40
+ trace_qed_aio_next_io(s, acb, 0, acb->cur_pos + acb->cur_qiov.size);
41
42
if (acb->backing_qiov) {
43
qemu_iovec_destroy(acb->backing_qiov);
44
@@ -XXX,XX +XXX,XX @@ static void qed_aio_next_io(QEDAIOCB *acb, int ret)
45
acb->backing_qiov = NULL;
46
}
38
}
47
39
48
- /* Handle I/O error */
40
- if (s->blkcfg.num_queues != s->num_queues) {
49
- if (ret) {
41
- s->blkcfg.num_queues = s->num_queues;
50
- qed_aio_complete(acb, ret);
51
- return;
52
- }
42
- }
53
-
43
-
54
acb->qiov_offset += acb->cur_qiov.size;
44
return;
55
acb->cur_pos += acb->cur_qiov.size;
45
56
qemu_iovec_reset(&acb->cur_qiov);
46
virtio_err:
57
@@ -XXX,XX +XXX,XX @@ static void qed_aio_next_io(QEDAIOCB *acb, int ret)
58
}
59
return;
60
}
61
- qed_aio_next_io(acb, 0);
62
+ qed_aio_next_io(acb);
63
}
64
65
static BlockAIOCB *qed_aio_setup(BlockDriverState *bs,
66
--
47
--
67
1.8.3.1
48
2.29.2
68
49
69
50
1
From: Stefan Hajnoczi <stefanha@redhat.com>
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
2
3
Perform the savevm/loadvm test with both iothread on and off. This
3
Add an API that returns a new UNIX domain socket in the listen state.
4
covers the recently found savevm/loadvm hang when iothread is enabled.
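The updated test can be run from the source tree in the usual way (qcow2 is
just the common format choice):

    $ cd tests/qemu-iotests
    $ ./check -qcow2 068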
4
The code for this was already there but only used internally in
5
init_socket().
6
7
This new API will be used by vhost-user-blk-test.
5
8
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Reviewed-by: Thomas Huth <thuth@redhat.com>
11
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
12
Message-Id: <20210223144653.811468-3-stefanha@redhat.com>
7
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
13
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8
---
14
---
9
tests/qemu-iotests/068 | 23 ++++++++++++++---------
15
tests/qtest/libqos/libqtest.h | 8 +++++++
10
tests/qemu-iotests/068.out | 11 ++++++++++-
16
tests/qtest/libqtest.c | 40 ++++++++++++++++++++---------------
11
2 files changed, 24 insertions(+), 10 deletions(-)
17
2 files changed, 31 insertions(+), 17 deletions(-)
12
18
13
diff --git a/tests/qemu-iotests/068 b/tests/qemu-iotests/068
19
diff --git a/tests/qtest/libqos/libqtest.h b/tests/qtest/libqos/libqtest.h
14
index XXXXXXX..XXXXXXX 100755
20
index XXXXXXX..XXXXXXX 100644
15
--- a/tests/qemu-iotests/068
21
--- a/tests/qtest/libqos/libqtest.h
16
+++ b/tests/qemu-iotests/068
22
+++ b/tests/qtest/libqos/libqtest.h
17
@@ -XXX,XX +XXX,XX @@ _supported_os Linux
23
@@ -XXX,XX +XXX,XX @@ void qtest_qmp_send(QTestState *s, const char *fmt, ...)
18
IMGOPTS="compat=1.1"
24
void qtest_qmp_send_raw(QTestState *s, const char *fmt, ...)
19
IMG_SIZE=128K
25
GCC_FMT_ATTR(2, 3);
20
26
21
-echo
27
+/**
22
-echo "=== Saving and reloading a VM state to/from a qcow2 image ==="
28
+ * qtest_socket_server:
23
-echo
29
+ * @socket_path: the UNIX domain socket path
24
-_make_test_img $IMG_SIZE
30
+ *
31
+ * Create and return a listen socket file descriptor, or abort on failure.
32
+ */
33
+int qtest_socket_server(const char *socket_path);
34
+
35
/**
36
* qtest_vqmp_fds:
37
* @s: #QTestState instance to operate on.
38
diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/tests/qtest/libqtest.c
41
+++ b/tests/qtest/libqtest.c
42
@@ -XXX,XX +XXX,XX @@ static void qtest_client_set_rx_handler(QTestState *s, QTestRecvFn recv);
43
44
static int init_socket(const char *socket_path)
45
{
46
- struct sockaddr_un addr;
47
- int sock;
48
- int ret;
25
-
49
-
26
case "$QEMU_DEFAULT_MACHINE" in
50
- sock = socket(PF_UNIX, SOCK_STREAM, 0);
27
s390-ccw-virtio)
51
- g_assert_cmpint(sock, !=, -1);
28
platform_parm="-no-shutdown"
52
-
29
@@ -XXX,XX +XXX,XX @@ _qemu()
53
- addr.sun_family = AF_UNIX;
30
_filter_qemu | _filter_hmp
54
- snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", socket_path);
55
+ int sock = qtest_socket_server(socket_path);
56
qemu_set_cloexec(sock);
57
-
58
- do {
59
- ret = bind(sock, (struct sockaddr *)&addr, sizeof(addr));
60
- } while (ret == -1 && errno == EINTR);
61
- g_assert_cmpint(ret, !=, -1);
62
- ret = listen(sock, 1);
63
- g_assert_cmpint(ret, !=, -1);
64
-
65
return sock;
31
}
66
}
32
67
33
-# Give qemu some time to boot before saving the VM state
68
@@ -XXX,XX +XXX,XX @@ QDict *qtest_qmp_receive_dict(QTestState *s)
34
-bash -c 'sleep 1; echo -e "savevm 0\nquit"' | _qemu
69
return qmp_fd_receive(s->qmp_fd);
35
-# Now try to continue from that VM state (this should just work)
70
}
36
-echo quit | _qemu -loadvm 0
71
37
+for extra_args in \
72
+int qtest_socket_server(const char *socket_path)
38
+ "" \
73
+{
39
+ "-object iothread,id=iothread0 -set device.hba0.iothread=iothread0"; do
74
+ struct sockaddr_un addr;
40
+ echo
75
+ int sock;
41
+ echo "=== Saving and reloading a VM state to/from a qcow2 image ($extra_args) ==="
76
+ int ret;
42
+ echo
43
+
77
+
44
+ _make_test_img $IMG_SIZE
78
+ sock = socket(PF_UNIX, SOCK_STREAM, 0);
79
+ g_assert_cmpint(sock, !=, -1);
45
+
80
+
46
+ # Give qemu some time to boot before saving the VM state
81
+ addr.sun_family = AF_UNIX;
47
+ bash -c 'sleep 1; echo -e "savevm 0\nquit"' | _qemu $extra_args
82
+ snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", socket_path);
48
+ # Now try to continue from that VM state (this should just work)
49
+ echo quit | _qemu $extra_args -loadvm 0
50
+done
51
52
# success, all done
53
echo "*** done"
54
diff --git a/tests/qemu-iotests/068.out b/tests/qemu-iotests/068.out
55
index XXXXXXX..XXXXXXX 100644
56
--- a/tests/qemu-iotests/068.out
57
+++ b/tests/qemu-iotests/068.out
58
@@ -XXX,XX +XXX,XX @@
59
QA output created by 068
60
61
-=== Saving and reloading a VM state to/from a qcow2 image ===
62
+=== Saving and reloading a VM state to/from a qcow2 image () ===
63
+
83
+
64
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072
84
+ do {
65
+QEMU X.Y.Z monitor - type 'help' for more information
85
+ ret = bind(sock, (struct sockaddr *)&addr, sizeof(addr));
66
+(qemu) savevm 0
86
+ } while (ret == -1 && errno == EINTR);
67
+(qemu) quit
87
+ g_assert_cmpint(ret, !=, -1);
68
+QEMU X.Y.Z monitor - type 'help' for more information
88
+ ret = listen(sock, 1);
69
+(qemu) quit
89
+ g_assert_cmpint(ret, !=, -1);
70
+
90
+
71
+=== Saving and reloading a VM state to/from a qcow2 image (-object iothread,id=iothread0 -set device.hba0.iothread=iothread0) ===
91
+ return sock;
72
92
+}
73
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072
93
+
74
QEMU X.Y.Z monitor - type 'help' for more information
94
/**
95
* Allow users to send a message without waiting for the reply,
96
* in the case that they choose to discard all replies up until
75
--
97
--
76
1.8.3.1
98
2.29.2
77
99
78
100
1
From: Alberto Garcia <berto@igalia.com>
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
2
3
Instead of calling perform_cow() twice with a different COW region
3
Tests that manage multiple processes may wish to kill QEMU before
4
each time, call it just once and make perform_cow() handle both
4
destroying the QTestState. Expose a function to do that.
5
regions.
6
5
7
This patch simply moves code around. The next one will do the actual
6
The vhost-user-blk-test testcase will need this.
8
reordering of the COW operations.
9
7
10
Signed-off-by: Alberto Garcia <berto@igalia.com>
8
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Reviewed-by: Eric Blake <eblake@redhat.com>
9
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
12
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
10
Message-Id: <20210223144653.811468-4-stefanha@redhat.com>
13
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
11
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
14
---
12
---
15
block/qcow2-cluster.c | 36 ++++++++++++++++++++++--------------
13
tests/qtest/libqos/libqtest.h | 11 +++++++++++
16
1 file changed, 22 insertions(+), 14 deletions(-)
14
tests/qtest/libqtest.c | 7 ++++---
15
2 files changed, 15 insertions(+), 3 deletions(-)
17
16
18
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
17
diff --git a/tests/qtest/libqos/libqtest.h b/tests/qtest/libqos/libqtest.h
19
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
20
--- a/block/qcow2-cluster.c
19
--- a/tests/qtest/libqos/libqtest.h
21
+++ b/block/qcow2-cluster.c
20
+++ b/tests/qtest/libqos/libqtest.h
22
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn do_perform_cow(BlockDriverState *bs,
21
@@ -XXX,XX +XXX,XX @@ QTestState *qtest_init_without_qmp_handshake(const char *extra_args);
23
struct iovec iov;
22
*/
24
int ret;
23
QTestState *qtest_init_with_serial(const char *extra_args, int *sock_fd);
25
24
26
+ if (bytes == 0) {
25
+/**
27
+ return 0;
26
+ * qtest_kill_qemu:
28
+ }
27
+ * @s: #QTestState instance to operate on.
28
+ *
29
+ * Kill the QEMU process and wait for it to terminate. It is safe to call this
30
+ * function multiple times. Normally qtest_quit() is used instead because it
31
+ * also frees QTestState. Use qtest_kill_qemu() when you just want to kill QEMU
32
+ * and qtest_quit() will be called later.
33
+ */
34
+void qtest_kill_qemu(QTestState *s);
29
+
35
+
30
iov.iov_len = bytes;
36
/**
31
iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
37
* qtest_quit:
32
if (iov.iov_base == NULL) {
38
* @s: #QTestState instance to operate on.
33
@@ -XXX,XX +XXX,XX @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
39
diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
34
return cluster_offset;
40
index XXXXXXX..XXXXXXX 100644
41
--- a/tests/qtest/libqtest.c
42
+++ b/tests/qtest/libqtest.c
43
@@ -XXX,XX +XXX,XX @@ void qtest_set_expected_status(QTestState *s, int status)
44
s->expected_status = status;
35
}
45
}
36
46
37
-static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
47
-static void kill_qemu(QTestState *s)
38
+static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
48
+void qtest_kill_qemu(QTestState *s)
39
{
49
{
40
BDRVQcow2State *s = bs->opaque;
50
pid_t pid = s->qemu_pid;
41
+ Qcow2COWRegion *start = &m->cow_start;
51
int wstatus;
42
+ Qcow2COWRegion *end = &m->cow_end;
52
@@ -XXX,XX +XXX,XX @@ static void kill_qemu(QTestState *s)
43
int ret;
53
kill(pid, SIGTERM);
44
54
TFR(pid = waitpid(s->qemu_pid, &s->wstatus, 0));
45
- if (r->nb_bytes == 0) {
55
assert(pid == s->qemu_pid);
46
+ if (start->nb_bytes == 0 && end->nb_bytes == 0) {
56
+ s->qemu_pid = -1;
47
return 0;
48
}
57
}
49
58
50
qemu_co_mutex_unlock(&s->lock);
51
- ret = do_perform_cow(bs, m->offset, m->alloc_offset, r->offset, r->nb_bytes);
52
- qemu_co_mutex_lock(&s->lock);
53
-
54
+ ret = do_perform_cow(bs, m->offset, m->alloc_offset,
55
+ start->offset, start->nb_bytes);
56
if (ret < 0) {
57
- return ret;
58
+ goto fail;
59
}
60
61
+ ret = do_perform_cow(bs, m->offset, m->alloc_offset,
62
+ end->offset, end->nb_bytes);
63
+
64
+fail:
65
+ qemu_co_mutex_lock(&s->lock);
66
+
67
/*
59
/*
68
* Before we update the L2 table to actually point to the new cluster, we
60
@@ -XXX,XX +XXX,XX @@ static void kill_qemu(QTestState *s)
69
* need to be sure that the refcounts have been increased and COW was
61
70
* handled.
62
static void kill_qemu_hook_func(void *s)
71
*/
63
{
72
- qcow2_cache_depends_on_flush(s->l2_table_cache);
64
- kill_qemu(s);
73
+ if (ret == 0) {
65
+ qtest_kill_qemu(s);
74
+ qcow2_cache_depends_on_flush(s->l2_table_cache);
75
+ }
76
77
- return 0;
78
+ return ret;
79
}
66
}
80
67
81
int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
68
static void sigabrt_handler(int signo)
82
@@ -XXX,XX +XXX,XX @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
69
@@ -XXX,XX +XXX,XX @@ void qtest_quit(QTestState *s)
83
}
70
/* Uninstall SIGABRT handler on last instance */
84
71
cleanup_sigabrt_handler();
85
/* copy content of unmodified sectors */
72
86
- ret = perform_cow(bs, m, &m->cow_start);
73
- kill_qemu(s);
87
- if (ret < 0) {
74
+ qtest_kill_qemu(s);
88
- goto err;
75
close(s->fd);
89
- }
76
close(s->qmp_fd);
90
-
77
g_string_free(s->rx, true);
91
- ret = perform_cow(bs, m, &m->cow_end);
92
+ ret = perform_cow(bs, m);
93
if (ret < 0) {
94
goto err;
95
}
96
--
78
--
97
1.8.3.1
79
2.29.2
98
80
99
81
1
From: Stefan Hajnoczi <stefanha@redhat.com>
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
2
3
Avoid duplicating the QEMU command-line.
3
Add a function to remove previously-added abrt handler functions.
4
5
Now that a symmetric pair of add/remove functions exists we can also
6
balance the SIGABRT handler installation. The signal handler was
7
installed each time qtest_add_abrt_handler() was called. Now it is
8
installed when the abrt handler list becomes non-empty and removed again
9
when the list becomes empty.
10
11
The qtest_remove_abrt_handler() function will be used by
12
vhost-user-blk-test.
4
13
5
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
15
Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
16
Message-Id: <20210223144653.811468-5-stefanha@redhat.com>
6
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
17
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
7
---
18
---
8
tests/qemu-iotests/068 | 15 +++++++++------
19
tests/qtest/libqos/libqtest.h | 18 ++++++++++++++++++
9
1 file changed, 9 insertions(+), 6 deletions(-)
20
tests/qtest/libqtest.c | 35 +++++++++++++++++++++++++++++------
21
2 files changed, 47 insertions(+), 6 deletions(-)
10
22
11
diff --git a/tests/qemu-iotests/068 b/tests/qemu-iotests/068
23
diff --git a/tests/qtest/libqos/libqtest.h b/tests/qtest/libqos/libqtest.h
12
index XXXXXXX..XXXXXXX 100755
24
index XXXXXXX..XXXXXXX 100644
13
--- a/tests/qemu-iotests/068
25
--- a/tests/qtest/libqos/libqtest.h
14
+++ b/tests/qemu-iotests/068
26
+++ b/tests/qtest/libqos/libqtest.h
15
@@ -XXX,XX +XXX,XX @@ case "$QEMU_DEFAULT_MACHINE" in
27
@@ -XXX,XX +XXX,XX @@ void qtest_add_data_func_full(const char *str, void *data,
16
;;
28
g_free(path); \
17
esac
29
} while (0)
18
30
19
-# Give qemu some time to boot before saving the VM state
31
+/**
20
-bash -c 'sleep 1; echo -e "savevm 0\nquit"' |\
32
+ * qtest_add_abrt_handler:
21
- $QEMU $platform_parm -nographic -monitor stdio -serial none -hda "$TEST_IMG" |\
33
+ * @fn: Handler function
22
+_qemu()
34
+ * @data: Argument that is passed to the handler
35
+ *
36
+ * Add a handler function that is invoked on SIGABRT. This can be used to
37
+ * terminate processes and perform other cleanup. The handler can be removed
38
+ * with qtest_remove_abrt_handler().
39
+ */
40
void qtest_add_abrt_handler(GHookFunc fn, const void *data);
41
42
+/**
43
+ * qtest_remove_abrt_handler:
44
+ * @data: Argument previously passed to qtest_add_abrt_handler()
45
+ *
46
+ * Remove an abrt handler that was previously added with
47
+ * qtest_add_abrt_handler().
48
+ */
49
+void qtest_remove_abrt_handler(void *data);
50
+
51
/**
52
* qtest_qmp_assert_success:
53
* @qts: QTestState instance to operate on
54
diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
55
index XXXXXXX..XXXXXXX 100644
56
--- a/tests/qtest/libqtest.c
57
+++ b/tests/qtest/libqtest.c
58
@@ -XXX,XX +XXX,XX @@ static void cleanup_sigabrt_handler(void)
59
sigaction(SIGABRT, &sigact_old, NULL);
60
}
61
62
+static bool hook_list_is_empty(GHookList *hook_list)
23
+{
63
+{
24
+ $QEMU $platform_parm -nographic -monitor stdio -serial none -hda "$TEST_IMG" \
64
+ GHook *hook = g_hook_first_valid(hook_list, TRUE);
25
+ "$@" |\
65
+
26
_filter_qemu | _filter_hmp
66
+ if (!hook) {
67
+ return false;
68
+ }
69
+
70
+ g_hook_unref(hook_list, hook);
71
+ return true;
27
+}
72
+}
28
+
73
+
29
+# Give qemu some time to boot before saving the VM state
74
void qtest_add_abrt_handler(GHookFunc fn, const void *data)
30
+bash -c 'sleep 1; echo -e "savevm 0\nquit"' | _qemu
75
{
31
# Now try to continue from that VM state (this should just work)
76
GHook *hook;
32
-echo quit |\
77
33
- $QEMU $platform_parm -nographic -monitor stdio -serial none -hda "$TEST_IMG" -loadvm 0 |\
78
- /* Only install SIGABRT handler once */
34
- _filter_qemu | _filter_hmp
79
if (!abrt_hooks.is_setup) {
35
+echo quit | _qemu -loadvm 0
80
g_hook_list_init(&abrt_hooks, sizeof(GHook));
36
81
}
37
# success, all done
82
- setup_sigabrt_handler();
38
echo "*** done"
83
+
84
+ /* Only install SIGABRT handler once */
85
+ if (hook_list_is_empty(&abrt_hooks)) {
86
+ setup_sigabrt_handler();
87
+ }
88
89
hook = g_hook_alloc(&abrt_hooks);
90
hook->func = fn;
91
@@ -XXX,XX +XXX,XX @@ void qtest_add_abrt_handler(GHookFunc fn, const void *data)
92
g_hook_prepend(&abrt_hooks, hook);
93
}
94
95
+void qtest_remove_abrt_handler(void *data)
96
+{
97
+ GHook *hook = g_hook_find_data(&abrt_hooks, TRUE, data);
98
+ g_hook_destroy_link(&abrt_hooks, hook);
99
+
100
+ /* Uninstall SIGABRT handler on last instance */
101
+ if (hook_list_is_empty(&abrt_hooks)) {
102
+ cleanup_sigabrt_handler();
103
+ }
104
+}
105
+
106
static const char *qtest_qemu_binary(void)
107
{
108
const char *qemu_bin;
109
@@ -XXX,XX +XXX,XX @@ QTestState *qtest_init_with_serial(const char *extra_args, int *sock_fd)
110
111
void qtest_quit(QTestState *s)
112
{
113
- g_hook_destroy_link(&abrt_hooks, g_hook_find_data(&abrt_hooks, TRUE, s));
114
-
115
- /* Uninstall SIGABRT handler on last instance */
116
- cleanup_sigabrt_handler();
117
+ qtest_remove_abrt_handler(s);
118
119
qtest_kill_qemu(s);
120
close(s->fd);
39
--
121
--
40
1.8.3.1
122
2.29.2
41
123
42
124
1
Now that we stay in coroutine context for the whole request when doing
1
From: Coiby Xu <coiby.xu@gmail.com>
2
reads or writes, we can add coroutine_fn annotations to many functions
3
that can do I/O or yield directly.
4
2
3
This test case has the same tests as tests/virtio-blk-test.c except for
4
the tests that use block_resize. Since the vhost-user-blk export serves only
5
a single client connection, two exports are started by qemu-storage-daemon for the
6
hotplug test.
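The new case is registered with the qtest suite, so it runs together with the
other qtests; a sketch, assuming an x86_64 build tree (the direct invocation
and its environment variables follow the usual qtest conventions and may
differ for your setup):

    $ make check-qtest-x86_64
    $ QTEST_QEMU_BINARY=./qemu-system-x86_64 \
      QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
      ./tests/qtest/vhost-user-blk-test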
7
8
Suggested-by: Thomas Huth <thuth@redhat.com>
9
Signed-off-by: Coiby Xu <coiby.xu@gmail.com>
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Message-Id: <20210223144653.811468-6-stefanha@redhat.com>
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
12
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
13
---
8
block/qed-cluster.c | 5 +++--
14
tests/qtest/libqos/vhost-user-blk.h | 48 ++
9
block/qed.c | 44 ++++++++++++++++++++++++--------------------
15
tests/qtest/libqos/vhost-user-blk.c | 130 +++++
10
block/qed.h | 5 +++--
16
tests/qtest/vhost-user-blk-test.c | 788 ++++++++++++++++++++++++++++
11
3 files changed, 30 insertions(+), 24 deletions(-)
17
MAINTAINERS | 2 +
18
tests/qtest/libqos/meson.build | 1 +
19
tests/qtest/meson.build | 4 +
20
6 files changed, 973 insertions(+)
21
create mode 100644 tests/qtest/libqos/vhost-user-blk.h
22
create mode 100644 tests/qtest/libqos/vhost-user-blk.c
23
create mode 100644 tests/qtest/vhost-user-blk-test.c
12
24
13
diff --git a/block/qed-cluster.c b/block/qed-cluster.c
25
diff --git a/tests/qtest/libqos/vhost-user-blk.h b/tests/qtest/libqos/vhost-user-blk.h
26
new file mode 100644
27
index XXXXXXX..XXXXXXX
28
--- /dev/null
29
+++ b/tests/qtest/libqos/vhost-user-blk.h
30
@@ -XXX,XX +XXX,XX @@
31
+/*
32
+ * libqos driver framework
33
+ *
34
+ * Based on tests/qtest/libqos/virtio-blk.c
35
+ *
36
+ * Copyright (c) 2020 Coiby Xu <coiby.xu@gmail.com>
37
+ *
38
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
39
+ *
40
+ * This library is free software; you can redistribute it and/or
41
+ * modify it under the terms of the GNU Lesser General Public
42
+ * License version 2 as published by the Free Software Foundation.
43
+ *
44
+ * This library is distributed in the hope that it will be useful,
45
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
46
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
47
+ * Lesser General Public License for more details.
48
+ *
49
+ * You should have received a copy of the GNU Lesser General Public
50
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
51
+ */
52
+
53
+#ifndef TESTS_LIBQOS_VHOST_USER_BLK_H
54
+#define TESTS_LIBQOS_VHOST_USER_BLK_H
55
+
56
+#include "qgraph.h"
57
+#include "virtio.h"
58
+#include "virtio-pci.h"
59
+
60
+typedef struct QVhostUserBlk QVhostUserBlk;
61
+typedef struct QVhostUserBlkPCI QVhostUserBlkPCI;
62
+typedef struct QVhostUserBlkDevice QVhostUserBlkDevice;
63
+
64
+struct QVhostUserBlk {
65
+ QVirtioDevice *vdev;
66
+};
67
+
68
+struct QVhostUserBlkPCI {
69
+ QVirtioPCIDevice pci_vdev;
70
+ QVhostUserBlk blk;
71
+};
72
+
73
+struct QVhostUserBlkDevice {
74
+ QOSGraphObject obj;
75
+ QVhostUserBlk blk;
76
+};
77
+
78
+#endif
79
diff --git a/tests/qtest/libqos/vhost-user-blk.c b/tests/qtest/libqos/vhost-user-blk.c
80
new file mode 100644
81
index XXXXXXX..XXXXXXX
82
--- /dev/null
83
+++ b/tests/qtest/libqos/vhost-user-blk.c
84
@@ -XXX,XX +XXX,XX @@
85
+/*
86
+ * libqos driver framework
87
+ *
88
+ * Based on tests/qtest/libqos/virtio-blk.c
89
+ *
90
+ * Copyright (c) 2020 Coiby Xu <coiby.xu@gmail.com>
91
+ *
92
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
93
+ *
94
+ * This library is free software; you can redistribute it and/or
95
+ * modify it under the terms of the GNU Lesser General Public
96
+ * License version 2.1 as published by the Free Software Foundation.
97
+ *
98
+ * This library is distributed in the hope that it will be useful,
99
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
100
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
101
+ * Lesser General Public License for more details.
102
+ *
103
+ * You should have received a copy of the GNU Lesser General Public
104
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
105
+ */
106
+
107
+#include "qemu/osdep.h"
108
+#include "libqtest.h"
109
+#include "qemu/module.h"
110
+#include "standard-headers/linux/virtio_blk.h"
111
+#include "vhost-user-blk.h"
112
+
113
+#define PCI_SLOT 0x04
114
+#define PCI_FN 0x00
115
+
116
+/* virtio-blk-device */
117
+static void *qvhost_user_blk_get_driver(QVhostUserBlk *v_blk,
118
+ const char *interface)
119
+{
120
+ if (!g_strcmp0(interface, "vhost-user-blk")) {
121
+ return v_blk;
122
+ }
123
+ if (!g_strcmp0(interface, "virtio")) {
124
+ return v_blk->vdev;
125
+ }
126
+
127
+ fprintf(stderr, "%s not present in vhost-user-blk-device\n", interface);
128
+ g_assert_not_reached();
129
+}
130
+
131
+static void *qvhost_user_blk_device_get_driver(void *object,
132
+ const char *interface)
133
+{
134
+ QVhostUserBlkDevice *v_blk = object;
135
+ return qvhost_user_blk_get_driver(&v_blk->blk, interface);
136
+}
137
+
138
+static void *vhost_user_blk_device_create(void *virtio_dev,
139
+ QGuestAllocator *t_alloc,
140
+ void *addr)
141
+{
142
+ QVhostUserBlkDevice *vhost_user_blk = g_new0(QVhostUserBlkDevice, 1);
143
+ QVhostUserBlk *interface = &vhost_user_blk->blk;
144
+
145
+ interface->vdev = virtio_dev;
146
+
147
+ vhost_user_blk->obj.get_driver = qvhost_user_blk_device_get_driver;
148
+
149
+ return &vhost_user_blk->obj;
150
+}
151
+
152
+/* virtio-blk-pci */
153
+static void *qvhost_user_blk_pci_get_driver(void *object, const char *interface)
154
+{
155
+ QVhostUserBlkPCI *v_blk = object;
156
+ if (!g_strcmp0(interface, "pci-device")) {
157
+ return v_blk->pci_vdev.pdev;
158
+ }
159
+ return qvhost_user_blk_get_driver(&v_blk->blk, interface);
160
+}
161
+
162
+static void *vhost_user_blk_pci_create(void *pci_bus, QGuestAllocator *t_alloc,
163
+ void *addr)
164
+{
165
+ QVhostUserBlkPCI *vhost_user_blk = g_new0(QVhostUserBlkPCI, 1);
166
+ QVhostUserBlk *interface = &vhost_user_blk->blk;
167
+ QOSGraphObject *obj = &vhost_user_blk->pci_vdev.obj;
168
+
169
+ virtio_pci_init(&vhost_user_blk->pci_vdev, pci_bus, addr);
170
+ interface->vdev = &vhost_user_blk->pci_vdev.vdev;
171
+
172
+ g_assert_cmphex(interface->vdev->device_type, ==, VIRTIO_ID_BLOCK);
173
+
174
+ obj->get_driver = qvhost_user_blk_pci_get_driver;
175
+
176
+ return obj;
177
+}
178
+
179
+static void vhost_user_blk_register_nodes(void)
180
+{
181
+ /*
182
+ * FIXME: every test using these two nodes needs to setup a
183
+ * -drive,id=drive0 otherwise QEMU is not going to start.
184
+ * Therefore, we do not include "produces" edge for virtio
185
+ * and pci-device yet.
186
+ */
187
+
188
+ char *arg = g_strdup_printf("id=drv0,chardev=char1,addr=%x.%x",
189
+ PCI_SLOT, PCI_FN);
190
+
191
+ QPCIAddress addr = {
192
+ .devfn = QPCI_DEVFN(PCI_SLOT, PCI_FN),
193
+ };
194
+
195
+ QOSGraphEdgeOptions opts = { };
196
+
197
+ /* virtio-blk-device */
198
+ /** opts.extra_device_opts = "drive=drive0"; */
199
+ qos_node_create_driver("vhost-user-blk-device",
200
+ vhost_user_blk_device_create);
201
+ qos_node_consumes("vhost-user-blk-device", "virtio-bus", &opts);
202
+ qos_node_produces("vhost-user-blk-device", "vhost-user-blk");
203
+
204
+ /* virtio-blk-pci */
205
+ opts.extra_device_opts = arg;
206
+ add_qpci_address(&opts, &addr);
207
+ qos_node_create_driver("vhost-user-blk-pci", vhost_user_blk_pci_create);
208
+ qos_node_consumes("vhost-user-blk-pci", "pci-bus", &opts);
209
+ qos_node_produces("vhost-user-blk-pci", "vhost-user-blk");
210
+
211
+ g_free(arg);
212
+}
213
+
214
+libqos_init(vhost_user_blk_register_nodes);
215
diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
216
new file mode 100644
217
index XXXXXXX..XXXXXXX
218
--- /dev/null
219
+++ b/tests/qtest/vhost-user-blk-test.c
220
@@ -XXX,XX +XXX,XX @@
221
+/*
222
+ * QTest testcase for Vhost-user Block Device
223
+ *
224
+ * Based on tests/qtest//virtio-blk-test.c
225
+
226
+ * Copyright (c) 2014 SUSE LINUX Products GmbH
227
+ * Copyright (c) 2014 Marc Marí
228
+ * Copyright (c) 2020 Coiby Xu
229
+ *
230
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
231
+ * See the COPYING file in the top-level directory.
232
+ */
233
+
234
+#include "qemu/osdep.h"
235
+#include "libqtest-single.h"
236
+#include "qemu/bswap.h"
237
+#include "qemu/module.h"
238
+#include "standard-headers/linux/virtio_blk.h"
239
+#include "standard-headers/linux/virtio_pci.h"
240
+#include "libqos/qgraph.h"
241
+#include "libqos/vhost-user-blk.h"
242
+#include "libqos/libqos-pc.h"
243
+
244
+#define TEST_IMAGE_SIZE (64 * 1024 * 1024)
245
+#define QVIRTIO_BLK_TIMEOUT_US (30 * 1000 * 1000)
246
+#define PCI_SLOT_HP 0x06
247
+
248
+typedef struct {
249
+ pid_t pid;
250
+} QemuStorageDaemonState;
251
+
252
+typedef struct QVirtioBlkReq {
253
+ uint32_t type;
254
+ uint32_t ioprio;
255
+ uint64_t sector;
256
+ char *data;
257
+ uint8_t status;
258
+} QVirtioBlkReq;
259
+
260
+#ifdef HOST_WORDS_BIGENDIAN
261
+static const bool host_is_big_endian = true;
262
+#else
263
+static const bool host_is_big_endian; /* false */
264
+#endif
265
+
266
+static inline void virtio_blk_fix_request(QVirtioDevice *d, QVirtioBlkReq *req)
267
+{
268
+ if (qvirtio_is_big_endian(d) != host_is_big_endian) {
269
+ req->type = bswap32(req->type);
270
+ req->ioprio = bswap32(req->ioprio);
271
+ req->sector = bswap64(req->sector);
272
+ }
273
+}
274
+
275
+static inline void virtio_blk_fix_dwz_hdr(QVirtioDevice *d,
276
+ struct virtio_blk_discard_write_zeroes *dwz_hdr)
277
+{
278
+ if (qvirtio_is_big_endian(d) != host_is_big_endian) {
279
+ dwz_hdr->sector = bswap64(dwz_hdr->sector);
280
+ dwz_hdr->num_sectors = bswap32(dwz_hdr->num_sectors);
281
+ dwz_hdr->flags = bswap32(dwz_hdr->flags);
282
+ }
283
+}
284
+
285
+static uint64_t virtio_blk_request(QGuestAllocator *alloc, QVirtioDevice *d,
286
+ QVirtioBlkReq *req, uint64_t data_size)
287
+{
288
+ uint64_t addr;
289
+ uint8_t status = 0xFF;
290
+ QTestState *qts = global_qtest;
291
+
292
+ switch (req->type) {
293
+ case VIRTIO_BLK_T_IN:
294
+ case VIRTIO_BLK_T_OUT:
295
+ g_assert_cmpuint(data_size % 512, ==, 0);
296
+ break;
297
+ case VIRTIO_BLK_T_DISCARD:
298
+ case VIRTIO_BLK_T_WRITE_ZEROES:
299
+ g_assert_cmpuint(data_size %
300
+ sizeof(struct virtio_blk_discard_write_zeroes), ==, 0);
301
+ break;
302
+ default:
303
+ g_assert_cmpuint(data_size, ==, 0);
304
+ }
305
+
306
+ addr = guest_alloc(alloc, sizeof(*req) + data_size);
307
+
308
+ virtio_blk_fix_request(d, req);
309
+
310
+ qtest_memwrite(qts, addr, req, 16);
311
+ qtest_memwrite(qts, addr + 16, req->data, data_size);
312
+ qtest_memwrite(qts, addr + 16 + data_size, &status, sizeof(status));
313
+
314
+ return addr;
315
+}
316
+
317
+/* Returns the request virtqueue so the caller can perform further tests */
318
+static QVirtQueue *test_basic(QVirtioDevice *dev, QGuestAllocator *alloc)
319
+{
320
+ QVirtioBlkReq req;
321
+ uint64_t req_addr;
322
+ uint64_t capacity;
323
+ uint64_t features;
324
+ uint32_t free_head;
325
+ uint8_t status;
326
+ char *data;
327
+ QTestState *qts = global_qtest;
328
+ QVirtQueue *vq;
329
+
330
+ features = qvirtio_get_features(dev);
331
+ features = features & ~(QVIRTIO_F_BAD_FEATURE |
332
+ (1u << VIRTIO_RING_F_INDIRECT_DESC) |
333
+ (1u << VIRTIO_RING_F_EVENT_IDX) |
334
+ (1u << VIRTIO_BLK_F_SCSI));
335
+ qvirtio_set_features(dev, features);
336
+
337
+ capacity = qvirtio_config_readq(dev, 0);
338
+ g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512);
339
+
340
+ vq = qvirtqueue_setup(dev, alloc, 0);
341
+
342
+ qvirtio_set_driver_ok(dev);
343
+
344
+ /* Write and read with 3 descriptor layout */
345
+ /* Write request */
346
+ req.type = VIRTIO_BLK_T_OUT;
347
+ req.ioprio = 1;
348
+ req.sector = 0;
349
+ req.data = g_malloc0(512);
350
+ strcpy(req.data, "TEST");
351
+
352
+ req_addr = virtio_blk_request(alloc, dev, &req, 512);
353
+
354
+ g_free(req.data);
355
+
356
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
357
+ qvirtqueue_add(qts, vq, req_addr + 16, 512, false, true);
358
+ qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
359
+
360
+ qvirtqueue_kick(qts, dev, vq, free_head);
361
+
362
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
363
+ QVIRTIO_BLK_TIMEOUT_US);
364
+ status = readb(req_addr + 528);
365
+ g_assert_cmpint(status, ==, 0);
366
+
367
+ guest_free(alloc, req_addr);
368
+
369
+ /* Read request */
370
+ req.type = VIRTIO_BLK_T_IN;
371
+ req.ioprio = 1;
372
+ req.sector = 0;
373
+ req.data = g_malloc0(512);
374
+
375
+ req_addr = virtio_blk_request(alloc, dev, &req, 512);
376
+
377
+ g_free(req.data);
378
+
379
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
380
+ qvirtqueue_add(qts, vq, req_addr + 16, 512, true, true);
381
+ qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
382
+
383
+ qvirtqueue_kick(qts, dev, vq, free_head);
384
+
385
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
386
+ QVIRTIO_BLK_TIMEOUT_US);
387
+ status = readb(req_addr + 528);
388
+ g_assert_cmpint(status, ==, 0);
389
+
390
+ data = g_malloc0(512);
391
+ qtest_memread(qts, req_addr + 16, data, 512);
392
+ g_assert_cmpstr(data, ==, "TEST");
393
+ g_free(data);
394
+
395
+ guest_free(alloc, req_addr);
396
+
397
+ if (features & (1u << VIRTIO_BLK_F_WRITE_ZEROES)) {
398
+ struct virtio_blk_discard_write_zeroes dwz_hdr;
399
+ void *expected;
400
+
401
+ /*
402
+ * WRITE_ZEROES request on the same sector as the previous test, where
403
+ * we wrote "TEST".
404
+ */
405
+ req.type = VIRTIO_BLK_T_WRITE_ZEROES;
406
+ req.data = (char *) &dwz_hdr;
407
+ dwz_hdr.sector = 0;
408
+ dwz_hdr.num_sectors = 1;
409
+ dwz_hdr.flags = 0;
410
+
411
+ virtio_blk_fix_dwz_hdr(dev, &dwz_hdr);
412
+
413
+ req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr));
414
+
415
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
416
+ qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr), false, true);
417
+ qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr), 1, true,
418
+ false);
419
+
420
+ qvirtqueue_kick(qts, dev, vq, free_head);
421
+
422
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
423
+ QVIRTIO_BLK_TIMEOUT_US);
424
+ status = readb(req_addr + 16 + sizeof(dwz_hdr));
425
+ g_assert_cmpint(status, ==, 0);
426
+
427
+ guest_free(alloc, req_addr);
428
+
429
+ /* Read request to check if the sector contains all zeroes */
430
+ req.type = VIRTIO_BLK_T_IN;
431
+ req.ioprio = 1;
432
+ req.sector = 0;
433
+ req.data = g_malloc0(512);
434
+
435
+ req_addr = virtio_blk_request(alloc, dev, &req, 512);
436
+
437
+ g_free(req.data);
438
+
439
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
440
+ qvirtqueue_add(qts, vq, req_addr + 16, 512, true, true);
441
+ qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
442
+
443
+ qvirtqueue_kick(qts, dev, vq, free_head);
444
+
445
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
446
+ QVIRTIO_BLK_TIMEOUT_US);
447
+ status = readb(req_addr + 528);
448
+ g_assert_cmpint(status, ==, 0);
449
+
450
+ data = g_malloc(512);
451
+ expected = g_malloc0(512);
452
+ qtest_memread(qts, req_addr + 16, data, 512);
453
+ g_assert_cmpmem(data, 512, expected, 512);
454
+ g_free(expected);
455
+ g_free(data);
456
+
457
+ guest_free(alloc, req_addr);
458
+ }
459
+
460
+ if (features & (1u << VIRTIO_BLK_F_DISCARD)) {
461
+ struct virtio_blk_discard_write_zeroes dwz_hdr;
462
+
463
+ req.type = VIRTIO_BLK_T_DISCARD;
464
+ req.data = (char *) &dwz_hdr;
465
+ dwz_hdr.sector = 0;
466
+ dwz_hdr.num_sectors = 1;
467
+ dwz_hdr.flags = 0;
468
+
469
+ virtio_blk_fix_dwz_hdr(dev, &dwz_hdr);
470
+
471
+ req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr));
472
+
473
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
474
+ qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr), false, true);
475
+ qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr),
476
+ 1, true, false);
477
+
478
+ qvirtqueue_kick(qts, dev, vq, free_head);
479
+
480
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
481
+ QVIRTIO_BLK_TIMEOUT_US);
482
+ status = readb(req_addr + 16 + sizeof(dwz_hdr));
483
+ g_assert_cmpint(status, ==, 0);
484
+
485
+ guest_free(alloc, req_addr);
486
+ }
487
+
488
+ if (features & (1u << VIRTIO_F_ANY_LAYOUT)) {
489
+ /* Write and read with 2 descriptor layout */
490
+ /* Write request */
491
+ req.type = VIRTIO_BLK_T_OUT;
492
+ req.ioprio = 1;
493
+ req.sector = 1;
494
+ req.data = g_malloc0(512);
495
+ strcpy(req.data, "TEST");
496
+
497
+ req_addr = virtio_blk_request(alloc, dev, &req, 512);
498
+
499
+ g_free(req.data);
500
+
501
+ free_head = qvirtqueue_add(qts, vq, req_addr, 528, false, true);
502
+ qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
503
+ qvirtqueue_kick(qts, dev, vq, free_head);
504
+
505
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
506
+ QVIRTIO_BLK_TIMEOUT_US);
507
+ status = readb(req_addr + 528);
508
+ g_assert_cmpint(status, ==, 0);
509
+
510
+ guest_free(alloc, req_addr);
511
+
512
+ /* Read request */
513
+ req.type = VIRTIO_BLK_T_IN;
514
+ req.ioprio = 1;
515
+ req.sector = 1;
516
+ req.data = g_malloc0(512);
517
+
518
+ req_addr = virtio_blk_request(alloc, dev, &req, 512);
519
+
520
+ g_free(req.data);
521
+
522
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
523
+ qvirtqueue_add(qts, vq, req_addr + 16, 513, true, false);
524
+
525
+ qvirtqueue_kick(qts, dev, vq, free_head);
526
+
527
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
528
+ QVIRTIO_BLK_TIMEOUT_US);
529
+ status = readb(req_addr + 528);
530
+ g_assert_cmpint(status, ==, 0);
531
+
532
+ data = g_malloc0(512);
533
+ qtest_memread(qts, req_addr + 16, data, 512);
534
+ g_assert_cmpstr(data, ==, "TEST");
535
+ g_free(data);
536
+
537
+ guest_free(alloc, req_addr);
538
+ }
539
+
540
+ return vq;
541
+}
542
+
543
+static void basic(void *obj, void *data, QGuestAllocator *t_alloc)
544
+{
545
+ QVhostUserBlk *blk_if = obj;
546
+ QVirtQueue *vq;
547
+
548
+ vq = test_basic(blk_if->vdev, t_alloc);
549
+ qvirtqueue_cleanup(blk_if->vdev->bus, vq, t_alloc);
550
+
551
+}
552
+
553
+static void indirect(void *obj, void *u_data, QGuestAllocator *t_alloc)
554
+{
555
+ QVirtQueue *vq;
556
+ QVhostUserBlk *blk_if = obj;
557
+ QVirtioDevice *dev = blk_if->vdev;
558
+ QVirtioBlkReq req;
559
+ QVRingIndirectDesc *indirect;
560
+ uint64_t req_addr;
561
+ uint64_t capacity;
562
+ uint64_t features;
563
+ uint32_t free_head;
564
+ uint8_t status;
565
+ char *data;
566
+ QTestState *qts = global_qtest;
567
+
568
+ features = qvirtio_get_features(dev);
569
+ g_assert_cmphex(features & (1u << VIRTIO_RING_F_INDIRECT_DESC), !=, 0);
570
+ features = features & ~(QVIRTIO_F_BAD_FEATURE |
571
+ (1u << VIRTIO_RING_F_EVENT_IDX) |
572
+ (1u << VIRTIO_BLK_F_SCSI));
573
+ qvirtio_set_features(dev, features);
574
+
575
+ capacity = qvirtio_config_readq(dev, 0);
576
+ g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512);
577
+
578
+ vq = qvirtqueue_setup(dev, t_alloc, 0);
579
+ qvirtio_set_driver_ok(dev);
580
+
581
+ /* Write request */
582
+ req.type = VIRTIO_BLK_T_OUT;
583
+ req.ioprio = 1;
584
+ req.sector = 0;
585
+ req.data = g_malloc0(512);
586
+ strcpy(req.data, "TEST");
587
+
588
+ req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
589
+
590
+ g_free(req.data);
591
+
592
+ indirect = qvring_indirect_desc_setup(qts, dev, t_alloc, 2);
593
+ qvring_indirect_desc_add(dev, qts, indirect, req_addr, 528, false);
594
+ qvring_indirect_desc_add(dev, qts, indirect, req_addr + 528, 1, true);
595
+ free_head = qvirtqueue_add_indirect(qts, vq, indirect);
596
+ qvirtqueue_kick(qts, dev, vq, free_head);
597
+
598
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
599
+ QVIRTIO_BLK_TIMEOUT_US);
600
+ status = readb(req_addr + 528);
601
+ g_assert_cmpint(status, ==, 0);
602
+
603
+ g_free(indirect);
604
+ guest_free(t_alloc, req_addr);
605
+
606
+ /* Read request */
607
+ req.type = VIRTIO_BLK_T_IN;
608
+ req.ioprio = 1;
609
+ req.sector = 0;
610
+ req.data = g_malloc0(512);
611
+ strcpy(req.data, "TEST");
612
+
613
+ req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
614
+
615
+ g_free(req.data);
616
+
617
+ indirect = qvring_indirect_desc_setup(qts, dev, t_alloc, 2);
618
+ qvring_indirect_desc_add(dev, qts, indirect, req_addr, 16, false);
619
+ qvring_indirect_desc_add(dev, qts, indirect, req_addr + 16, 513, true);
620
+ free_head = qvirtqueue_add_indirect(qts, vq, indirect);
621
+ qvirtqueue_kick(qts, dev, vq, free_head);
622
+
623
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
624
+ QVIRTIO_BLK_TIMEOUT_US);
625
+ status = readb(req_addr + 528);
626
+ g_assert_cmpint(status, ==, 0);
627
+
628
+ data = g_malloc0(512);
629
+ qtest_memread(qts, req_addr + 16, data, 512);
630
+ g_assert_cmpstr(data, ==, "TEST");
631
+ g_free(data);
632
+
633
+ g_free(indirect);
634
+ guest_free(t_alloc, req_addr);
635
+ qvirtqueue_cleanup(dev->bus, vq, t_alloc);
636
+}
637
+
638
+static void idx(void *obj, void *u_data, QGuestAllocator *t_alloc)
639
+{
640
+ QVirtQueue *vq;
641
+ QVhostUserBlkPCI *blk = obj;
642
+ QVirtioPCIDevice *pdev = &blk->pci_vdev;
643
+ QVirtioDevice *dev = &pdev->vdev;
644
+ QVirtioBlkReq req;
645
+ uint64_t req_addr;
646
+ uint64_t capacity;
647
+ uint64_t features;
648
+ uint32_t free_head;
649
+ uint32_t write_head;
650
+ uint32_t desc_idx;
651
+ uint8_t status;
652
+ char *data;
653
+ QOSGraphObject *blk_object = obj;
654
+ QPCIDevice *pci_dev = blk_object->get_driver(blk_object, "pci-device");
655
+ QTestState *qts = global_qtest;
656
+
657
+ if (qpci_check_buggy_msi(pci_dev)) {
658
+ return;
659
+ }
660
+
661
+ qpci_msix_enable(pdev->pdev);
662
+ qvirtio_pci_set_msix_configuration_vector(pdev, t_alloc, 0);
663
+
664
+ features = qvirtio_get_features(dev);
665
+ features = features & ~(QVIRTIO_F_BAD_FEATURE |
666
+ (1u << VIRTIO_RING_F_INDIRECT_DESC) |
667
+ (1u << VIRTIO_F_NOTIFY_ON_EMPTY) |
668
+ (1u << VIRTIO_BLK_F_SCSI));
669
+ qvirtio_set_features(dev, features);
670
+
671
+ capacity = qvirtio_config_readq(dev, 0);
672
+ g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512);
673
+
674
+ vq = qvirtqueue_setup(dev, t_alloc, 0);
675
+ qvirtqueue_pci_msix_setup(pdev, (QVirtQueuePCI *)vq, t_alloc, 1);
676
+
677
+ qvirtio_set_driver_ok(dev);
678
+
679
+ /* Write request */
680
+ req.type = VIRTIO_BLK_T_OUT;
681
+ req.ioprio = 1;
682
+ req.sector = 0;
683
+ req.data = g_malloc0(512);
684
+ strcpy(req.data, "TEST");
685
+
686
+ req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
687
+
688
+ g_free(req.data);
689
+
690
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
691
+ qvirtqueue_add(qts, vq, req_addr + 16, 512, false, true);
692
+ qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
693
+ qvirtqueue_kick(qts, dev, vq, free_head);
694
+
695
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
696
+ QVIRTIO_BLK_TIMEOUT_US);
697
+
698
+ /* Write request */
699
+ req.type = VIRTIO_BLK_T_OUT;
700
+ req.ioprio = 1;
701
+ req.sector = 1;
702
+ req.data = g_malloc0(512);
703
+ strcpy(req.data, "TEST");
704
+
705
+ req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
706
+
707
+ g_free(req.data);
708
+
709
+ /* Notify after processing the third request */
710
+ qvirtqueue_set_used_event(qts, vq, 2);
711
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
712
+ qvirtqueue_add(qts, vq, req_addr + 16, 512, false, true);
713
+ qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
714
+ qvirtqueue_kick(qts, dev, vq, free_head);
715
+ write_head = free_head;
716
+
717
+ /* No notification expected */
718
+ status = qvirtio_wait_status_byte_no_isr(qts, dev,
719
+ vq, req_addr + 528,
720
+ QVIRTIO_BLK_TIMEOUT_US);
721
+ g_assert_cmpint(status, ==, 0);
722
+
723
+ guest_free(t_alloc, req_addr);
724
+
725
+ /* Read request */
726
+ req.type = VIRTIO_BLK_T_IN;
727
+ req.ioprio = 1;
728
+ req.sector = 1;
729
+ req.data = g_malloc0(512);
730
+
731
+ req_addr = virtio_blk_request(t_alloc, dev, &req, 512);
732
+
733
+ g_free(req.data);
734
+
735
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
736
+ qvirtqueue_add(qts, vq, req_addr + 16, 512, true, true);
737
+ qvirtqueue_add(qts, vq, req_addr + 528, 1, true, false);
738
+
739
+ qvirtqueue_kick(qts, dev, vq, free_head);
740
+
741
+ /* We get just one notification for both requests */
742
+ qvirtio_wait_used_elem(qts, dev, vq, write_head, NULL,
743
+ QVIRTIO_BLK_TIMEOUT_US);
744
+ g_assert(qvirtqueue_get_buf(qts, vq, &desc_idx, NULL));
745
+ g_assert_cmpint(desc_idx, ==, free_head);
746
+
747
+ status = readb(req_addr + 528);
748
+ g_assert_cmpint(status, ==, 0);
749
+
750
+ data = g_malloc0(512);
751
+ qtest_memread(qts, req_addr + 16, data, 512);
752
+ g_assert_cmpstr(data, ==, "TEST");
753
+ g_free(data);
754
+
755
+ guest_free(t_alloc, req_addr);
756
+
757
+ /* End test */
758
+ qpci_msix_disable(pdev->pdev);
759
+
760
+ qvirtqueue_cleanup(dev->bus, vq, t_alloc);
761
+}
762
+
763
+static void pci_hotplug(void *obj, void *data, QGuestAllocator *t_alloc)
764
+{
765
+ QVirtioPCIDevice *dev1 = obj;
766
+ QVirtioPCIDevice *dev;
767
+ QTestState *qts = dev1->pdev->bus->qts;
768
+
769
+ /* plug secondary disk */
770
+ qtest_qmp_device_add(qts, "vhost-user-blk-pci", "drv1",
771
+ "{'addr': %s, 'chardev': 'char2'}",
772
+ stringify(PCI_SLOT_HP) ".0");
773
+
774
+ dev = virtio_pci_new(dev1->pdev->bus,
775
+ &(QPCIAddress) { .devfn = QPCI_DEVFN(PCI_SLOT_HP, 0)
776
+ });
777
+ g_assert_nonnull(dev);
778
+ g_assert_cmpint(dev->vdev.device_type, ==, VIRTIO_ID_BLOCK);
779
+ qvirtio_pci_device_disable(dev);
780
+ qos_object_destroy((QOSGraphObject *)dev);
781
+
782
+ /* unplug secondary disk */
783
+ qpci_unplug_acpi_device_test(qts, "drv1", PCI_SLOT_HP);
784
+}
785
+
786
+/*
787
+ * Check that setting the vring addr on a non-existent virtqueue does
788
+ * not crash.
789
+ */
790
+static void test_nonexistent_virtqueue(void *obj, void *data,
791
+ QGuestAllocator *t_alloc)
792
+{
793
+ QVhostUserBlkPCI *blk = obj;
794
+ QVirtioPCIDevice *pdev = &blk->pci_vdev;
795
+ QPCIBar bar0;
796
+ QPCIDevice *dev;
797
+
798
+ dev = qpci_device_find(pdev->pdev->bus, QPCI_DEVFN(4, 0));
799
+ g_assert(dev != NULL);
800
+ qpci_device_enable(dev);
801
+
802
+ bar0 = qpci_iomap(dev, 0, NULL);
803
+
804
+ qpci_io_writeb(dev, bar0, VIRTIO_PCI_QUEUE_SEL, 2);
805
+ qpci_io_writel(dev, bar0, VIRTIO_PCI_QUEUE_PFN, 1);
806
+
807
+ g_free(dev);
808
+}
809
+
810
+static const char *qtest_qemu_storage_daemon_binary(void)
811
+{
812
+ const char *qemu_storage_daemon_bin;
813
+
814
+ qemu_storage_daemon_bin = getenv("QTEST_QEMU_STORAGE_DAEMON_BINARY");
815
+ if (!qemu_storage_daemon_bin) {
816
+ fprintf(stderr, "Environment variable "
817
+ "QTEST_QEMU_STORAGE_DAEMON_BINARY required\n");
818
+ exit(0);
819
+ }
820
+
821
+ return qemu_storage_daemon_bin;
822
+}
823
+
824
+/* g_test_queue_destroy() cleanup function for files */
825
+static void destroy_file(void *path)
826
+{
827
+ unlink(path);
828
+ g_free(path);
829
+ qos_invalidate_command_line();
830
+}
831
+
832
+static char *drive_create(void)
833
+{
834
+ int fd, ret;
835
+ /** vhost-user-blk won't recognize a drive located in /tmp */
836
+ char *t_path = g_strdup("qtest.XXXXXX");
837
+
838
+ /** Create a temporary raw image */
839
+ fd = mkstemp(t_path);
840
+ g_assert_cmpint(fd, >=, 0);
841
+ ret = ftruncate(fd, TEST_IMAGE_SIZE);
842
+ g_assert_cmpint(ret, ==, 0);
843
+ close(fd);
844
+
845
+ g_test_queue_destroy(destroy_file, t_path);
846
+ return t_path;
847
+}
848
+
849
+static char *create_listen_socket(int *fd)
850
+{
851
+ int tmp_fd;
852
+ char *path;
853
+
854
+ /* No race because our pid makes the path unique */
855
+ path = g_strdup_printf("/tmp/qtest-%d-sock.XXXXXX", getpid());
856
+ tmp_fd = mkstemp(path);
857
+ g_assert_cmpint(tmp_fd, >=, 0);
858
+ close(tmp_fd);
859
+ unlink(path);
860
+
861
+ *fd = qtest_socket_server(path);
862
+ g_test_queue_destroy(destroy_file, path);
863
+ return path;
864
+}
865
+
866
+/*
867
+ * g_test_queue_destroy() and qtest_add_abrt_handler() cleanup function for
868
+ * qemu-storage-daemon.
869
+ */
870
+static void quit_storage_daemon(void *data)
871
+{
872
+ QemuStorageDaemonState *qsd = data;
873
+ int wstatus;
874
+ pid_t pid;
875
+
876
+ /*
877
+ * If we were invoked as a g_test_queue_destroy() cleanup function we need
878
+ * to remove the abrt handler to avoid being called again if the code below
879
+ * aborts. Also, we must not leave the abrt handler installed after
880
+ * cleanup.
881
+ */
882
+ qtest_remove_abrt_handler(data);
883
+
884
+ /* Before quitting storage-daemon, quit qemu to avoid dubious messages */
885
+ qtest_kill_qemu(global_qtest);
886
+
887
+ kill(qsd->pid, SIGTERM);
888
+ pid = waitpid(qsd->pid, &wstatus, 0);
889
+ g_assert_cmpint(pid, ==, qsd->pid);
890
+ if (!WIFEXITED(wstatus)) {
891
+ fprintf(stderr, "%s: expected qemu-storage-daemon to exit\n",
892
+ __func__);
893
+ abort();
894
+ }
895
+ if (WEXITSTATUS(wstatus) != 0) {
896
+ fprintf(stderr, "%s: expected qemu-storage-daemon to exit "
897
+ "successfully, got %d\n",
898
+ __func__, WEXITSTATUS(wstatus));
899
+ abort();
900
+ }
901
+
902
+ g_free(data);
903
+}
904
+
905
+static void start_vhost_user_blk(GString *cmd_line, int vus_instances)
906
+{
907
+ const char *vhost_user_blk_bin = qtest_qemu_storage_daemon_binary();
908
+ int i;
909
+ gchar *img_path;
910
+ GString *storage_daemon_command = g_string_new(NULL);
911
+ QemuStorageDaemonState *qsd;
912
+
913
+ g_string_append_printf(storage_daemon_command,
914
+ "exec %s ",
915
+ vhost_user_blk_bin);
916
+
917
+ g_string_append_printf(cmd_line,
918
+ " -object memory-backend-memfd,id=mem,size=256M,share=on "
919
+ " -M memory-backend=mem -m 256M ");
920
+
921
+ for (i = 0; i < vus_instances; i++) {
922
+ int fd;
923
+ char *sock_path = create_listen_socket(&fd);
924
+
925
+ /* create image file */
926
+ img_path = drive_create();
927
+ g_string_append_printf(storage_daemon_command,
928
+ "--blockdev driver=file,node-name=disk%d,filename=%s "
929
+ "--export type=vhost-user-blk,id=disk%d,addr.type=unix,addr.path=%s,"
930
+ "node-name=disk%i,writable=on ",
931
+ i, img_path, i, sock_path, i);
932
+
933
+ g_string_append_printf(cmd_line, "-chardev socket,id=char%d,path=%s ",
934
+ i + 1, sock_path);
935
+ }
936
+
937
+ g_test_message("starting vhost-user backend: %s",
938
+ storage_daemon_command->str);
939
+ pid_t pid = fork();
940
+ if (pid == 0) {
941
+ /*
942
+ * Close standard file descriptors so tap-driver.pl pipe detects when
943
+ * our parent terminates.
944
+ */
945
+ close(0);
946
+ close(1);
947
+ open("/dev/null", O_RDONLY);
948
+ open("/dev/null", O_WRONLY);
949
+
950
+ execlp("/bin/sh", "sh", "-c", storage_daemon_command->str, NULL);
951
+ exit(1);
952
+ }
953
+ g_string_free(storage_daemon_command, true);
954
+
955
+ qsd = g_new(QemuStorageDaemonState, 1);
956
+ qsd->pid = pid;
957
+
958
+ /* Make sure qemu-storage-daemon is stopped */
959
+ qtest_add_abrt_handler(quit_storage_daemon, qsd);
960
+ g_test_queue_destroy(quit_storage_daemon, qsd);
961
+}
962
+
963
+static void *vhost_user_blk_test_setup(GString *cmd_line, void *arg)
964
+{
965
+ start_vhost_user_blk(cmd_line, 1);
966
+ return arg;
967
+}
968
+
969
+/*
970
+ * Setup for hotplug.
971
+ *
972
+ * Since the vhost-user server only serves one vhost-user client at a time,
973
+ * another export is needed for the hotplugged device.
974
+ *
975
+ */
976
+static void *vhost_user_blk_hotplug_test_setup(GString *cmd_line, void *arg)
977
+{
978
+ /* "-chardev socket,id=char2" is used for pci_hotplug*/
979
+ start_vhost_user_blk(cmd_line, 2);
980
+ return arg;
981
+}
982
+
983
+static void register_vhost_user_blk_test(void)
984
+{
985
+ QOSGraphTestOptions opts = {
986
+ .before = vhost_user_blk_test_setup,
987
+ };
988
+
989
+ /*
990
+ * tests for vhost-user-blk and vhost-user-blk-pci
991
+ * The tests are borrowed from tests/virtio-blk-test.c, but tests that
992
+ * rely on block_resize don't work for vhost-user-blk:
993
+ * the vhost-user-blk device doesn't have -drive, so the following
994
+ * tests are omitted:
995
+ * - config
996
+ * - resize
997
+ */
998
+ qos_add_test("basic", "vhost-user-blk", basic, &opts);
999
+ qos_add_test("indirect", "vhost-user-blk", indirect, &opts);
1000
+ qos_add_test("idx", "vhost-user-blk-pci", idx, &opts);
1001
+ qos_add_test("nxvirtq", "vhost-user-blk-pci",
1002
+ test_nonexistent_virtqueue, &opts);
1003
+
1004
+ opts.before = vhost_user_blk_hotplug_test_setup;
1005
+ qos_add_test("hotplug", "vhost-user-blk-pci", pci_hotplug, &opts);
1006
+}
1007
+
1008
+libqos_init(register_vhost_user_blk_test);
1009
diff --git a/MAINTAINERS b/MAINTAINERS
14
index XXXXXXX..XXXXXXX 100644
1010
index XXXXXXX..XXXXXXX 100644
15
--- a/block/qed-cluster.c
1011
--- a/MAINTAINERS
16
+++ b/block/qed-cluster.c
1012
+++ b/MAINTAINERS
17
@@ -XXX,XX +XXX,XX @@ static unsigned int qed_count_contiguous_clusters(BDRVQEDState *s,
1013
@@ -XXX,XX +XXX,XX @@ F: block/export/vhost-user-blk-server.c
18
* On failure QED_CLUSTER_L2 or QED_CLUSTER_L1 is returned for missing L2 or L1
1014
F: block/export/vhost-user-blk-server.h
19
* table offset, respectively. len is number of contiguous unallocated bytes.
1015
F: include/qemu/vhost-user-server.h
20
*/
1016
F: tests/qtest/libqos/vhost-user-blk.c
21
-int qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
1017
+F: tests/qtest/libqos/vhost-user-blk.h
22
- size_t *len, uint64_t *img_offset)
1018
+F: tests/qtest/vhost-user-blk-test.c
23
+int coroutine_fn qed_find_cluster(BDRVQEDState *s, QEDRequest *request,
1019
F: util/vhost-user-server.c
24
+ uint64_t pos, size_t *len,
1020
25
+ uint64_t *img_offset)
1021
FUSE block device exports
26
{
1022
diff --git a/tests/qtest/libqos/meson.build b/tests/qtest/libqos/meson.build
27
uint64_t l2_offset;
28
uint64_t offset = 0;
29
diff --git a/block/qed.c b/block/qed.c
30
index XXXXXXX..XXXXXXX 100644
1023
index XXXXXXX..XXXXXXX 100644
31
--- a/block/qed.c
1024
--- a/tests/qtest/libqos/meson.build
32
+++ b/block/qed.c
1025
+++ b/tests/qtest/libqos/meson.build
33
@@ -XXX,XX +XXX,XX @@ int qed_write_header_sync(BDRVQEDState *s)
1026
@@ -XXX,XX +XXX,XX @@ libqos_srcs = files('../libqtest.c',
34
* This function only updates known header fields in-place and does not affect
1027
'virtio-9p.c',
35
* extra data after the QED header.
1028
'virtio-balloon.c',
36
*/
1029
'virtio-blk.c',
37
-static int qed_write_header(BDRVQEDState *s)
1030
+ 'vhost-user-blk.c',
38
+static int coroutine_fn qed_write_header(BDRVQEDState *s)
1031
'virtio-mmio.c',
39
{
1032
'virtio-net.c',
40
/* We must write full sectors for O_DIRECT but cannot necessarily generate
1033
'virtio-pci.c',
41
* the data following the header if an unrecognized compat feature is
1034
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
42
@@ -XXX,XX +XXX,XX @@ static void qed_unplug_allocating_write_reqs(BDRVQEDState *s)
43
qemu_co_enter_next(&s->allocating_write_reqs);
44
}
45
46
-static void qed_need_check_timer_entry(void *opaque)
47
+static void coroutine_fn qed_need_check_timer_entry(void *opaque)
48
{
49
BDRVQEDState *s = opaque;
50
int ret;
51
@@ -XXX,XX +XXX,XX @@ static BDRVQEDState *acb_to_s(QEDAIOCB *acb)
52
* This function reads qiov->size bytes starting at pos from the backing file.
53
* If there is no backing file then zeroes are read.
54
*/
55
-static int qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
56
- QEMUIOVector *qiov,
57
- QEMUIOVector **backing_qiov)
58
+static int coroutine_fn qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
59
+ QEMUIOVector *qiov,
60
+ QEMUIOVector **backing_qiov)
61
{
62
uint64_t backing_length = 0;
63
size_t size;
64
@@ -XXX,XX +XXX,XX @@ static int qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
65
* @len: Number of bytes
66
* @offset: Byte offset in image file
67
*/
68
-static int qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
69
- uint64_t len, uint64_t offset)
70
+static int coroutine_fn qed_copy_from_backing_file(BDRVQEDState *s,
71
+ uint64_t pos, uint64_t len,
72
+ uint64_t offset)
73
{
74
QEMUIOVector qiov;
75
QEMUIOVector *backing_qiov = NULL;
76
@@ -XXX,XX +XXX,XX @@ out:
77
* The cluster offset may be an allocated byte offset in the image file, the
78
* zero cluster marker, or the unallocated cluster marker.
79
*/
80
-static void qed_update_l2_table(BDRVQEDState *s, QEDTable *table, int index,
81
- unsigned int n, uint64_t cluster)
82
+static void coroutine_fn qed_update_l2_table(BDRVQEDState *s, QEDTable *table,
83
+ int index, unsigned int n,
84
+ uint64_t cluster)
85
{
86
int i;
87
for (i = index; i < index + n; i++) {
88
@@ -XXX,XX +XXX,XX @@ static void qed_update_l2_table(BDRVQEDState *s, QEDTable *table, int index,
89
}
90
}
91
92
-static void qed_aio_complete(QEDAIOCB *acb)
93
+static void coroutine_fn qed_aio_complete(QEDAIOCB *acb)
94
{
95
BDRVQEDState *s = acb_to_s(acb);
96
97
@@ -XXX,XX +XXX,XX @@ static void qed_aio_complete(QEDAIOCB *acb)
98
/**
99
* Update L1 table with new L2 table offset and write it out
100
*/
101
-static int qed_aio_write_l1_update(QEDAIOCB *acb)
102
+static int coroutine_fn qed_aio_write_l1_update(QEDAIOCB *acb)
103
{
104
BDRVQEDState *s = acb_to_s(acb);
105
CachedL2Table *l2_table = acb->request.l2_table;
106
@@ -XXX,XX +XXX,XX @@ static int qed_aio_write_l1_update(QEDAIOCB *acb)
107
/**
108
* Update L2 table with new cluster offsets and write them out
109
*/
110
-static int qed_aio_write_l2_update(QEDAIOCB *acb, uint64_t offset)
111
+static int coroutine_fn qed_aio_write_l2_update(QEDAIOCB *acb, uint64_t offset)
112
{
113
BDRVQEDState *s = acb_to_s(acb);
114
bool need_alloc = acb->find_cluster_ret == QED_CLUSTER_L1;
115
@@ -XXX,XX +XXX,XX @@ static int qed_aio_write_l2_update(QEDAIOCB *acb, uint64_t offset)
116
/**
117
* Write data to the image file
118
*/
119
-static int qed_aio_write_main(QEDAIOCB *acb)
120
+static int coroutine_fn qed_aio_write_main(QEDAIOCB *acb)
121
{
122
BDRVQEDState *s = acb_to_s(acb);
123
uint64_t offset = acb->cur_cluster +
124
@@ -XXX,XX +XXX,XX @@ static int qed_aio_write_main(QEDAIOCB *acb)
125
/**
126
* Populate untouched regions of new data cluster
127
*/
128
-static int qed_aio_write_cow(QEDAIOCB *acb)
129
+static int coroutine_fn qed_aio_write_cow(QEDAIOCB *acb)
130
{
131
BDRVQEDState *s = acb_to_s(acb);
132
uint64_t start, len, offset;
133
@@ -XXX,XX +XXX,XX @@ static bool qed_should_set_need_check(BDRVQEDState *s)
134
*
135
* This path is taken when writing to previously unallocated clusters.
136
*/
137
-static int qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
138
+static int coroutine_fn qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
139
{
140
BDRVQEDState *s = acb_to_s(acb);
141
int ret;
142
@@ -XXX,XX +XXX,XX @@ static int qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
143
*
144
* This path is taken when writing to already allocated clusters.
145
*/
146
-static int qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
147
+static int coroutine_fn qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset,
148
+ size_t len)
149
{
150
/* Allocate buffer for zero writes */
151
if (acb->flags & QED_AIOCB_ZERO) {
152
@@ -XXX,XX +XXX,XX @@ static int qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
153
* @offset: Cluster offset in bytes
154
* @len: Length in bytes
155
*/
156
-static int qed_aio_write_data(void *opaque, int ret,
157
- uint64_t offset, size_t len)
158
+static int coroutine_fn qed_aio_write_data(void *opaque, int ret,
159
+ uint64_t offset, size_t len)
160
{
161
QEDAIOCB *acb = opaque;
162
163
@@ -XXX,XX +XXX,XX @@ static int qed_aio_write_data(void *opaque, int ret,
164
* @offset: Cluster offset in bytes
165
* @len: Length in bytes
166
*/
167
-static int qed_aio_read_data(void *opaque, int ret, uint64_t offset, size_t len)
168
+static int coroutine_fn qed_aio_read_data(void *opaque, int ret,
169
+ uint64_t offset, size_t len)
170
{
171
QEDAIOCB *acb = opaque;
172
BDRVQEDState *s = acb_to_s(acb);
173
@@ -XXX,XX +XXX,XX @@ static int qed_aio_read_data(void *opaque, int ret, uint64_t offset, size_t len)
174
/**
175
* Begin next I/O or complete the request
176
*/
177
-static int qed_aio_next_io(QEDAIOCB *acb)
178
+static int coroutine_fn qed_aio_next_io(QEDAIOCB *acb)
179
{
180
BDRVQEDState *s = acb_to_s(acb);
181
uint64_t offset;
182
diff --git a/block/qed.h b/block/qed.h
183
index XXXXXXX..XXXXXXX 100644
1035
index XXXXXXX..XXXXXXX 100644
184
--- a/block/qed.h
1036
--- a/tests/qtest/meson.build
185
+++ b/block/qed.h
1037
+++ b/tests/qtest/meson.build
186
@@ -XXX,XX +XXX,XX @@ int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
1038
@@ -XXX,XX +XXX,XX @@ if have_virtfs
187
/**
1039
qos_test_ss.add(files('virtio-9p-test.c'))
188
* Cluster functions
1040
endif
189
*/
1041
qos_test_ss.add(when: 'CONFIG_VHOST_USER', if_true: files('vhost-user-test.c'))
190
-int qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
1042
+if have_vhost_user_blk_server
191
- size_t *len, uint64_t *img_offset);
1043
+ qos_test_ss.add(files('vhost-user-blk-test.c'))
192
+int coroutine_fn qed_find_cluster(BDRVQEDState *s, QEDRequest *request,
1044
+endif
193
+ uint64_t pos, size_t *len,
1045
194
+ uint64_t *img_offset);
1046
tpmemu_files = ['tpm-emu.c', 'tpm-util.c', 'tpm-tests.c']
195
1047
196
/**
1048
@@ -XXX,XX +XXX,XX @@ foreach dir : target_dirs
197
* Consistency check
1049
endif
1050
qtest_env.set('G_TEST_DBUS_DAEMON', meson.source_root() / 'tests/dbus-vmstate-daemon.sh')
1051
qtest_env.set('QTEST_QEMU_BINARY', './qemu-system-' + target_base)
1052
+ qtest_env.set('QTEST_QEMU_STORAGE_DAEMON_BINARY', './storage-daemon/qemu-storage-daemon')
1053
1054
foreach test : target_qtests
1055
# Executables are shared across targets, declare them only the first time we
198
--
1056
--
199
1.8.3.1
1057
2.29.2
200
1058
201
1059
1
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
just return an error code and let the caller handle it.
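
The dispatcher then ends up with roughly this shape (condensed sketch of the
change below; -EINPROGRESS marks a request that was queued behind an existing
allocating write):

    ret = qed_aio_write_alloc(acb, len);
    if (ret < 0) {
        if (ret != -EINPROGRESS) {
            qed_aio_complete(acb, ret);  /* only the dispatcher completes */
        }
        return;                          /* queued request resumes later */
    }
    qed_aio_next_io(acb, 0);
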
3
2
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
4
Message-Id: <20210223144653.811468-7-stefanha@redhat.com>
4
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
5
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
6
---
7
block/qed.c | 43 ++++++++++++++++++++-----------------------
7
tests/qtest/vhost-user-blk-test.c | 81 +++++++++++++++++++++++++++++--
8
1 file changed, 20 insertions(+), 23 deletions(-)
8
1 file changed, 76 insertions(+), 5 deletions(-)
9
9
10
diff --git a/block/qed.c b/block/qed.c
10
diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
11
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
12
--- a/block/qed.c
12
--- a/tests/qtest/vhost-user-blk-test.c
13
+++ b/block/qed.c
13
+++ b/tests/qtest/vhost-user-blk-test.c
14
@@ -XXX,XX +XXX,XX @@ static bool qed_should_set_need_check(BDRVQEDState *s)
14
@@ -XXX,XX +XXX,XX @@ static void pci_hotplug(void *obj, void *data, QGuestAllocator *t_alloc)
15
*
15
qpci_unplug_acpi_device_test(qts, "drv1", PCI_SLOT_HP);
16
* This path is taken when writing to previously unallocated clusters.
16
}
17
*/
17
18
-static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
18
+static void multiqueue(void *obj, void *data, QGuestAllocator *t_alloc)
19
+static int qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
19
+{
20
+ QVirtioPCIDevice *pdev1 = obj;
21
+ QVirtioDevice *dev1 = &pdev1->vdev;
22
+ QVirtioPCIDevice *pdev8;
23
+ QVirtioDevice *dev8;
24
+ QTestState *qts = pdev1->pdev->bus->qts;
25
+ uint64_t features;
26
+ uint16_t num_queues;
27
+
28
+ /*
29
+ * The primary device has 1 queue and VIRTIO_BLK_F_MQ is not enabled. The
30
+ * VIRTIO specification allows VIRTIO_BLK_F_MQ to be enabled when there is
31
+ * only 1 virtqueue, but --device vhost-user-blk-pci doesn't do this (which
32
+ * is also spec-compliant).
33
+ */
34
+ features = qvirtio_get_features(dev1);
35
+ g_assert_cmpint(features & (1u << VIRTIO_BLK_F_MQ), ==, 0);
36
+ features = features & ~(QVIRTIO_F_BAD_FEATURE |
37
+ (1u << VIRTIO_RING_F_INDIRECT_DESC) |
38
+ (1u << VIRTIO_F_NOTIFY_ON_EMPTY) |
39
+ (1u << VIRTIO_BLK_F_SCSI));
40
+ qvirtio_set_features(dev1, features);
41
+
42
+ /* Hotplug a secondary device with 8 queues */
43
+ qtest_qmp_device_add(qts, "vhost-user-blk-pci", "drv1",
44
+ "{'addr': %s, 'chardev': 'char2', 'num-queues': 8}",
45
+ stringify(PCI_SLOT_HP) ".0");
46
+
47
+ pdev8 = virtio_pci_new(pdev1->pdev->bus,
48
+ &(QPCIAddress) {
49
+ .devfn = QPCI_DEVFN(PCI_SLOT_HP, 0)
50
+ });
51
+ g_assert_nonnull(pdev8);
52
+ g_assert_cmpint(pdev8->vdev.device_type, ==, VIRTIO_ID_BLOCK);
53
+
54
+ qos_object_start_hw(&pdev8->obj);
55
+
56
+ dev8 = &pdev8->vdev;
57
+ features = qvirtio_get_features(dev8);
58
+ g_assert_cmpint(features & (1u << VIRTIO_BLK_F_MQ),
59
+ ==,
60
+ (1u << VIRTIO_BLK_F_MQ));
61
+ features = features & ~(QVIRTIO_F_BAD_FEATURE |
62
+ (1u << VIRTIO_RING_F_INDIRECT_DESC) |
63
+ (1u << VIRTIO_F_NOTIFY_ON_EMPTY) |
64
+ (1u << VIRTIO_BLK_F_SCSI) |
65
+ (1u << VIRTIO_BLK_F_MQ));
66
+ qvirtio_set_features(dev8, features);
67
+
68
+ num_queues = qvirtio_config_readw(dev8,
69
+ offsetof(struct virtio_blk_config, num_queues));
70
+ g_assert_cmpint(num_queues, ==, 8);
71
+
72
+ qvirtio_pci_device_disable(pdev8);
73
+ qos_object_destroy(&pdev8->obj);
74
+
75
+ /* unplug secondary disk */
76
+ qpci_unplug_acpi_device_test(qts, "drv1", PCI_SLOT_HP);
77
+}
78
+
79
/*
80
* Check that setting the vring addr on a non-existent virtqueue does
81
* not crash.
82
@@ -XXX,XX +XXX,XX @@ static void quit_storage_daemon(void *data)
83
g_free(data);
84
}
85
86
-static void start_vhost_user_blk(GString *cmd_line, int vus_instances)
87
+static void start_vhost_user_blk(GString *cmd_line, int vus_instances,
88
+ int num_queues)
20
{
89
{
21
BDRVQEDState *s = acb_to_s(acb);
90
const char *vhost_user_blk_bin = qtest_qemu_storage_daemon_binary();
22
int ret;
91
int i;
23
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
92
@@ -XXX,XX +XXX,XX @@ static void start_vhost_user_blk(GString *cmd_line, int vus_instances)
24
}
93
g_string_append_printf(storage_daemon_command,
25
if (acb != QSIMPLEQ_FIRST(&s->allocating_write_reqs) ||
94
"--blockdev driver=file,node-name=disk%d,filename=%s "
26
s->allocating_write_reqs_plugged) {
95
"--export type=vhost-user-blk,id=disk%d,addr.type=unix,addr.path=%s,"
27
- return; /* wait for existing request to finish */
96
- "node-name=disk%i,writable=on ",
28
+ return -EINPROGRESS; /* wait for existing request to finish */
97
- i, img_path, i, sock_path, i);
29
}
98
+ "node-name=disk%i,writable=on,num-queues=%d ",
30
99
+ i, img_path, i, sock_path, i, num_queues);
31
acb->cur_nclusters = qed_bytes_to_clusters(s,
100
32
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
101
g_string_append_printf(cmd_line, "-chardev socket,id=char%d,path=%s ",
33
if (acb->flags & QED_AIOCB_ZERO) {
102
i + 1, sock_path);
34
/* Skip ahead if the clusters are already zero */
103
@@ -XXX,XX +XXX,XX @@ static void start_vhost_user_blk(GString *cmd_line, int vus_instances)
35
if (acb->find_cluster_ret == QED_CLUSTER_ZERO) {
104
36
- qed_aio_start_io(acb);
105
static void *vhost_user_blk_test_setup(GString *cmd_line, void *arg)
37
- return;
106
{
38
+ return 0;
107
- start_vhost_user_blk(cmd_line, 1);
39
}
108
+ start_vhost_user_blk(cmd_line, 1, 1);
40
} else {
109
return arg;
41
acb->cur_cluster = qed_alloc_clusters(s, acb->cur_nclusters);
42
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
43
s->header.features |= QED_F_NEED_CHECK;
44
ret = qed_write_header(s);
45
if (ret < 0) {
46
- qed_aio_complete(acb, ret);
47
- return;
48
+ return ret;
49
}
50
}
51
52
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
53
ret = qed_aio_write_cow(acb);
54
}
55
if (ret < 0) {
56
- qed_aio_complete(acb, ret);
57
- return;
58
+ return ret;
59
}
60
- qed_aio_next_io(acb, 0);
61
+ return 0;
62
}
110
}
63
111
64
/**
112
@@ -XXX,XX +XXX,XX @@ static void *vhost_user_blk_test_setup(GString *cmd_line, void *arg)
65
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
113
static void *vhost_user_blk_hotplug_test_setup(GString *cmd_line, void *arg)
66
*
67
* This path is taken when writing to already allocated clusters.
68
*/
69
-static void qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
70
+static int qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
71
{
114
{
72
- int ret;
115
/* "-chardev socket,id=char2" is used for pci_hotplug*/
73
-
116
- start_vhost_user_blk(cmd_line, 2);
74
/* Allocate buffer for zero writes */
117
+ start_vhost_user_blk(cmd_line, 2, 1);
75
if (acb->flags & QED_AIOCB_ZERO) {
118
+ return arg;
76
struct iovec *iov = acb->qiov->iov;
119
+}
77
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
120
+
78
if (!iov->iov_base) {
121
+static void *vhost_user_blk_multiqueue_test_setup(GString *cmd_line, void *arg)
79
iov->iov_base = qemu_try_blockalign(acb->common.bs, iov->iov_len);
122
+{
80
if (iov->iov_base == NULL) {
123
+ start_vhost_user_blk(cmd_line, 2, 8);
81
- qed_aio_complete(acb, -ENOMEM);
124
return arg;
82
- return;
83
+ return -ENOMEM;
84
}
85
memset(iov->iov_base, 0, iov->iov_len);
86
}
87
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
88
qemu_iovec_concat(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
89
90
/* Do the actual write */
91
- ret = qed_aio_write_main(acb);
92
- if (ret < 0) {
93
- qed_aio_complete(acb, ret);
94
- return;
95
- }
96
- qed_aio_next_io(acb, 0);
97
+ return qed_aio_write_main(acb);
98
}
125
}
99
126
100
/**
127
@@ -XXX,XX +XXX,XX @@ static void register_vhost_user_blk_test(void)
101
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_data(void *opaque, int ret,
128
102
129
opts.before = vhost_user_blk_hotplug_test_setup;
103
switch (ret) {
130
qos_add_test("hotplug", "vhost-user-blk-pci", pci_hotplug, &opts);
104
case QED_CLUSTER_FOUND:
105
- qed_aio_write_inplace(acb, offset, len);
106
+ ret = qed_aio_write_inplace(acb, offset, len);
107
break;
108
109
case QED_CLUSTER_L2:
110
case QED_CLUSTER_L1:
111
case QED_CLUSTER_ZERO:
112
- qed_aio_write_alloc(acb, len);
113
+ ret = qed_aio_write_alloc(acb, len);
114
break;
115
116
default:
117
- qed_aio_complete(acb, ret);
118
+ assert(ret < 0);
119
break;
120
}
121
+
131
+
122
+ if (ret < 0) {
132
+ opts.before = vhost_user_blk_multiqueue_test_setup;
123
+ if (ret != -EINPROGRESS) {
133
+ qos_add_test("multiqueue", "vhost-user-blk-pci", multiqueue, &opts);
124
+ qed_aio_complete(acb, ret);
125
+ }
126
+ return;
127
+ }
128
+ qed_aio_next_io(acb, 0);
129
}
134
}
130
135
131
/**
136
libqos_init(register_vhost_user_blk_test);
132
--
137
--
133
1.8.3.1
138
2.29.2
134
139
135
140
1
From: Stefan Hajnoczi <stefanha@redhat.com>
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
2
3
Calling aio_poll() directly may have been fine previously, but this is
3
The config->blk_size field is little-endian. Use the native-endian
4
the future, man! The difference between an aio_poll() loop and
4
blk_size variable to avoid double byteswapping.
5
BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
6
around aio_poll().
7
5
8
This allows the IOThread to run fd handlers or BHs to complete the
6
Fixes: 11f60f7eaee2630dd6fa0c3a8c49f792e46c4cf1 ("block/export: make vhost-user-blk config space little-endian")
9
request. Failure to release the AioContext causes deadlocks.
10
11
Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
12
iothread.
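
For comparison, the two waiting styles look roughly like this (illustrative
fragment only; data.ret follows the code being patched):

    /* Holding the AioContext lock for the whole wait can block an IOThread
     * that needs the lock to complete the request:
     */
    while (data.ret == -EINPROGRESS) {
        aio_poll(bdrv_get_aio_context(bs), true);
    }

    /* BDRV_POLL_WHILE() releases the AioContext around each aio_poll(), so
     * the IOThread can run its fd handlers and BHs:
     */
    BDRV_POLL_WHILE(bs, data.ret == -EINPROGRESS);
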
13
14
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
15
Reviewed-by: Eric Blake <eblake@redhat.com>
8
Message-Id: <20210223144653.811468-8-stefanha@redhat.com>
16
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
17
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
18
---
10
---
19
block/io.c | 4 +---
11
block/export/vhost-user-blk-server.c | 2 +-
20
1 file changed, 1 insertion(+), 3 deletions(-)
12
1 file changed, 1 insertion(+), 1 deletion(-)
21
13
22
diff --git a/block/io.c b/block/io.c
14
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
23
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
24
--- a/block/io.c
16
--- a/block/export/vhost-user-blk-server.c
25
+++ b/block/io.c
17
+++ b/block/export/vhost-user-blk-server.c
26
@@ -XXX,XX +XXX,XX @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
18
@@ -XXX,XX +XXX,XX @@ vu_blk_initialize_config(BlockDriverState *bs,
27
Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
19
config->num_queues = cpu_to_le16(num_queues);
28
20
config->max_discard_sectors = cpu_to_le32(32768);
29
bdrv_coroutine_enter(bs, co);
21
config->max_discard_seg = cpu_to_le32(1);
30
- while (data.ret == -EINPROGRESS) {
22
- config->discard_sector_alignment = cpu_to_le32(config->blk_size >> 9);
31
- aio_poll(bdrv_get_aio_context(bs), true);
23
+ config->discard_sector_alignment = cpu_to_le32(blk_size >> 9);
32
- }
24
config->max_write_zeroes_sectors = cpu_to_le32(32768);
33
+ BDRV_POLL_WHILE(bs, data.ret == -EINPROGRESS);
25
config->max_write_zeroes_seg = cpu_to_le32(1);
34
return data.ret;
35
}
36
}
26
}
37
--
27
--
38
1.8.3.1
28
2.29.2
39
29
40
30
1
From: Stefan Hajnoczi <stefanha@redhat.com>
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
2
3
blk/bdrv_drain_all() only takes effect for a single instant and then
3
Use VIRTIO_BLK_SECTOR_BITS and VIRTIO_BLK_SECTOR_SIZE when dealing with
4
resumes block jobs, guest devices, and other external clients like the
4
virtio-blk sector numbers. Although the values happen to be the same as
5
NBD server. This can be handy when performing a synchronous drain
5
BDRV_SECTOR_BITS and BDRV_SECTOR_SIZE, they are conceptually different.
6
before terminating the program, for example.
6
This makes it clearer when we are dealing with virtio-blk sector units.
7
7
8
Monitor commands usually need to quiesce I/O across an entire code
8
Use VIRTIO_BLK_SECTOR_BITS in vu_blk_initialize_config(). Later patches
9
region so blk/bdrv_drain_all() is not suitable. They must use
9
will use the new constants in the virtqueue request processing code
10
bdrv_drain_all_begin/end() to mark the region. This prevents new I/O
10
path.
11
requests from slipping in or worse - block jobs completing and modifying
12
the graph.
13
11
14
I audited other blk/bdrv_drain_all() callers but did not find anything
12
Suggested-by: Max Reitz <mreitz@redhat.com>
15
that needs a similar fix. This patch fixes the savevm/loadvm commands.
16
Although I haven't encountered a real-world issue, this makes the code
17
safer.
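
The resulting pattern for such commands is (minimal sketch; do_quiesced_work()
is a placeholder for e.g. writing or loading the snapshot):

    bdrv_drain_all_begin();      /* no new I/O, block jobs stay quiesced */

    ret = do_quiesced_work();    /* hypothetical: operate on stable disks */

    bdrv_drain_all_end();        /* resume jobs, guest devices, NBD clients */
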
18
19
Suggested-by: Kevin Wolf <kwolf@redhat.com>
20
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
21
Reviewed-by: Eric Blake <eblake@redhat.com>
14
Message-Id: <20210223144653.811468-9-stefanha@redhat.com>
22
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
15
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
23
---
16
---
24
migration/savevm.c | 18 +++++++++++++++---
17
block/export/vhost-user-blk-server.c | 15 ++++++++++++---
25
1 file changed, 15 insertions(+), 3 deletions(-)
18
1 file changed, 12 insertions(+), 3 deletions(-)
26
19
27
diff --git a/migration/savevm.c b/migration/savevm.c
20
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
28
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
29
--- a/migration/savevm.c
22
--- a/block/export/vhost-user-blk-server.c
30
+++ b/migration/savevm.c
23
+++ b/block/export/vhost-user-blk-server.c
31
@@ -XXX,XX +XXX,XX @@ int save_snapshot(const char *name, Error **errp)
24
@@ -XXX,XX +XXX,XX @@
25
#include "sysemu/block-backend.h"
26
#include "util/block-helpers.h"
27
28
+/*
29
+ * Sector units are 512 bytes regardless of the
30
+ * virtio_blk_config->blk_size value.
31
+ */
32
+#define VIRTIO_BLK_SECTOR_BITS 9
33
+#define VIRTIO_BLK_SECTOR_SIZE (1ull << VIRTIO_BLK_SECTOR_BITS)
34
+
35
enum {
36
VHOST_USER_BLK_NUM_QUEUES_DEFAULT = 1,
37
};
38
@@ -XXX,XX +XXX,XX @@ vu_blk_initialize_config(BlockDriverState *bs,
39
uint32_t blk_size,
40
uint16_t num_queues)
41
{
42
- config->capacity = cpu_to_le64(bdrv_getlength(bs) >> BDRV_SECTOR_BITS);
43
+ config->capacity =
44
+ cpu_to_le64(bdrv_getlength(bs) >> VIRTIO_BLK_SECTOR_BITS);
45
config->blk_size = cpu_to_le32(blk_size);
46
config->size_max = cpu_to_le32(0);
47
config->seg_max = cpu_to_le32(128 - 2);
48
@@ -XXX,XX +XXX,XX @@ vu_blk_initialize_config(BlockDriverState *bs,
49
config->num_queues = cpu_to_le16(num_queues);
50
config->max_discard_sectors = cpu_to_le32(32768);
51
config->max_discard_seg = cpu_to_le32(1);
52
- config->discard_sector_alignment = cpu_to_le32(blk_size >> 9);
53
+ config->discard_sector_alignment =
54
+ cpu_to_le32(blk_size >> VIRTIO_BLK_SECTOR_BITS);
55
config->max_write_zeroes_sectors = cpu_to_le32(32768);
56
config->max_write_zeroes_seg = cpu_to_le32(1);
57
}
58
@@ -XXX,XX +XXX,XX @@ static int vu_blk_exp_create(BlockExport *exp, BlockExportOptions *opts,
59
if (vu_opts->has_logical_block_size) {
60
logical_block_size = vu_opts->logical_block_size;
61
} else {
62
- logical_block_size = BDRV_SECTOR_SIZE;
63
+ logical_block_size = VIRTIO_BLK_SECTOR_SIZE;
32
}
64
}
33
vm_stop(RUN_STATE_SAVE_VM);
65
check_block_size(exp->id, "logical-block-size", logical_block_size,
34
66
&local_err);
35
+ bdrv_drain_all_begin();
36
+
37
aio_context_acquire(aio_context);
38
39
memset(sn, 0, sizeof(*sn));
40
@@ -XXX,XX +XXX,XX @@ int save_snapshot(const char *name, Error **errp)
41
if (aio_context) {
42
aio_context_release(aio_context);
43
}
44
+
45
+ bdrv_drain_all_end();
46
+
47
if (saved_vm_running) {
48
vm_start();
49
}
50
@@ -XXX,XX +XXX,XX @@ int load_snapshot(const char *name, Error **errp)
51
}
52
53
/* Flush all IO requests so they don't interfere with the new state. */
54
- bdrv_drain_all();
55
+ bdrv_drain_all_begin();
56
57
ret = bdrv_all_goto_snapshot(name, &bs);
58
if (ret < 0) {
59
error_setg(errp, "Error %d while activating snapshot '%s' on '%s'",
60
ret, name, bdrv_get_device_name(bs));
61
- return ret;
62
+ goto err_drain;
63
}
64
65
/* restore the VM state */
66
f = qemu_fopen_bdrv(bs_vm_state, 0);
67
if (!f) {
68
error_setg(errp, "Could not open VM state file");
69
- return -EINVAL;
70
+ ret = -EINVAL;
71
+ goto err_drain;
72
}
73
74
qemu_system_reset(SHUTDOWN_CAUSE_NONE);
75
@@ -XXX,XX +XXX,XX @@ int load_snapshot(const char *name, Error **errp)
76
ret = qemu_loadvm_state(f);
77
aio_context_release(aio_context);
78
79
+ bdrv_drain_all_end();
80
+
81
migration_incoming_state_destroy();
82
if (ret < 0) {
83
error_setg(errp, "Error %d while loading VM state", ret);
84
@@ -XXX,XX +XXX,XX @@ int load_snapshot(const char *name, Error **errp)
85
}
86
87
return 0;
88
+
89
+err_drain:
90
+ bdrv_drain_all_end();
91
+ return ret;
92
}
93
94
void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
95
--
67
--
96
1.8.3.1
68
2.29.2
97
69
98
70
1
From: Stefan Hajnoczi <stefanha@redhat.com>
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
2
3
migration_incoming_state_destroy() uses qemu_fclose() on the vmstate
3
The driver is supposed to honor the blk_size field but the protocol
4
file. Make sure to call it inside an AioContext acquire/release region.
4
still uses 512-byte sector numbers. It is incorrect to multiply
5
req->sector_num by blk_size.
5
6
6
This fixes an 'qemu: qemu_mutex_unlock: Operation not permitted' abort
7
VIRTIO 1.1 5.2.5 Device Initialization says:
7
in loadvm.
8
8
9
This patch closes the vmstate file before ending the drained region.
9
blk_size can be read to determine the optimal sector size for the
10
Previously we closed the vmstate file after ending the drained region.
10
driver to use. This does not affect the units used in the protocol
11
The order does not matter.
11
(always 512 bytes), but awareness of the correct value can affect
12
performance.
12
13
14
Fixes: 3578389bcf76c824a5d82e6586a6f0c71e56f2aa ("block/export: vhost-user block device backend server")
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
Message-Id: <20210223144653.811468-10-stefanha@redhat.com>
14
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
17
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
15
---
18
---
16
migration/savevm.c | 2 +-
19
block/export/vhost-user-blk-server.c | 2 +-
17
1 file changed, 1 insertion(+), 1 deletion(-)
20
1 file changed, 1 insertion(+), 1 deletion(-)
18
21
19
diff --git a/migration/savevm.c b/migration/savevm.c
22
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
20
index XXXXXXX..XXXXXXX 100644
23
index XXXXXXX..XXXXXXX 100644
21
--- a/migration/savevm.c
24
--- a/block/export/vhost-user-blk-server.c
22
+++ b/migration/savevm.c
25
+++ b/block/export/vhost-user-blk-server.c
23
@@ -XXX,XX +XXX,XX @@ int load_snapshot(const char *name, Error **errp)
26
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
24
27
break;
25
aio_context_acquire(aio_context);
28
}
26
ret = qemu_loadvm_state(f);
29
27
+ migration_incoming_state_destroy();
30
- int64_t offset = req->sector_num * vexp->blk_size;
28
aio_context_release(aio_context);
31
+ int64_t offset = req->sector_num << VIRTIO_BLK_SECTOR_BITS;
29
32
QEMUIOVector qiov;
30
bdrv_drain_all_end();
33
if (is_write) {
31
34
qemu_iovec_init_external(&qiov, out_iov, out_num);
32
- migration_incoming_state_destroy();
33
if (ret < 0) {
34
error_setg(errp, "Error %d while loading VM state", ret);
35
return ret;
36
--
35
--
37
1.8.3.1
36
2.29.2
38
37
39
38
1
Most of the qed code is now synchronous and matches the coroutine model.
1
From: Stefan Hajnoczi <stefanha@redhat.com>
2
One notable exception is the serialisation between requests which can
3
still schedule a callback. Before we can replace this with coroutine
4
locks, let's convert the driver's external interfaces to the coroutine
5
versions.
6
2
7
We need to be careful to handle both requests that call the completion
3
Validate discard/write zeroes the same way we do for virtio-blk. Some of
8
callback directly from the calling coroutine (i.e. fully synchronous
4
these checks are mandated by the VIRTIO specification, others are
9
code) and requests that involve some callback, so that we need to yield
5
internal to QEMU.
10
and wait for the completion callback coming from outside the coroutine.
11
6
7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Message-Id: <20210223144653.811468-11-stefanha@redhat.com>
12
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
13
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
14
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
15
---
10
---
16
block/qed.c | 97 ++++++++++++++++++++++++++-----------------------------------
11
block/export/vhost-user-blk-server.c | 116 +++++++++++++++++++++------
17
1 file changed, 42 insertions(+), 55 deletions(-)
12
1 file changed, 93 insertions(+), 23 deletions(-)
18
13
19
diff --git a/block/qed.c b/block/qed.c
14
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
20
index XXXXXXX..XXXXXXX 100644
15
index XXXXXXX..XXXXXXX 100644
21
--- a/block/qed.c
16
--- a/block/export/vhost-user-blk-server.c
22
+++ b/block/qed.c
17
+++ b/block/export/vhost-user-blk-server.c
23
@@ -XXX,XX +XXX,XX @@ static void qed_aio_next_io(QEDAIOCB *acb)
18
@@ -XXX,XX +XXX,XX @@
24
}
19
20
enum {
21
VHOST_USER_BLK_NUM_QUEUES_DEFAULT = 1,
22
+ VHOST_USER_BLK_MAX_DISCARD_SECTORS = 32768,
23
+ VHOST_USER_BLK_MAX_WRITE_ZEROES_SECTORS = 32768,
24
};
25
struct virtio_blk_inhdr {
26
unsigned char status;
27
@@ -XXX,XX +XXX,XX @@ static void vu_blk_req_complete(VuBlkReq *req)
28
free(req);
25
}
29
}
26
30
27
-static BlockAIOCB *qed_aio_setup(BlockDriverState *bs,
31
+static bool vu_blk_sect_range_ok(VuBlkExport *vexp, uint64_t sector,
28
- int64_t sector_num,
32
+ size_t size)
29
- QEMUIOVector *qiov, int nb_sectors,
33
+{
30
- BlockCompletionFunc *cb,
34
+ uint64_t nb_sectors = size >> BDRV_SECTOR_BITS;
31
- void *opaque, int flags)
35
+ uint64_t total_sectors;
32
+typedef struct QEDRequestCo {
33
+ Coroutine *co;
34
+ bool done;
35
+ int ret;
36
+} QEDRequestCo;
37
+
36
+
38
+static void qed_co_request_cb(void *opaque, int ret)
37
+ if (nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
39
{
38
+ return false;
40
- QEDAIOCB *acb = qemu_aio_get(&qed_aiocb_info, bs, cb, opaque);
39
+ }
41
+ QEDRequestCo *co = opaque;
40
+ if ((sector << VIRTIO_BLK_SECTOR_BITS) % vexp->blk_size) {
42
41
+ return false;
43
- trace_qed_aio_setup(bs->opaque, acb, sector_num, nb_sectors,
42
+ }
44
- opaque, flags);
43
+ blk_get_geometry(vexp->export.blk, &total_sectors);
45
+ co->done = true;
44
+ if (sector > total_sectors || nb_sectors > total_sectors - sector) {
46
+ co->ret = ret;
45
+ return false;
47
+ qemu_coroutine_enter_if_inactive(co->co);
46
+ }
47
+ return true;
48
+}
48
+}
49
+
49
+
50
+static int coroutine_fn qed_co_request(BlockDriverState *bs, int64_t sector_num,
50
static int coroutine_fn
51
+ QEMUIOVector *qiov, int nb_sectors,
51
-vu_blk_discard_write_zeroes(BlockBackend *blk, struct iovec *iov,
52
+ int flags)
52
+vu_blk_discard_write_zeroes(VuBlkExport *vexp, struct iovec *iov,
53
+{
53
uint32_t iovcnt, uint32_t type)
54
+ QEDRequestCo co = {
54
{
55
+ .co = qemu_coroutine_self(),
55
+ BlockBackend *blk = vexp->export.blk;
56
+ .done = false,
56
struct virtio_blk_discard_write_zeroes desc;
57
+ };
57
- ssize_t size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
58
+ QEDAIOCB *acb = qemu_aio_get(&qed_aiocb_info, bs, qed_co_request_cb, &co);
58
+ ssize_t size;
59
+ uint64_t sector;
60
+ uint32_t num_sectors;
61
+ uint32_t max_sectors;
62
+ uint32_t flags;
63
+ int bytes;
59
+
64
+
60
+ trace_qed_aio_setup(bs->opaque, acb, sector_num, nb_sectors, &co, flags);
65
+ /* Only one desc is currently supported */
61
66
+ if (unlikely(iov_size(iov, iovcnt) > sizeof(desc))) {
62
acb->flags = flags;
67
+ return VIRTIO_BLK_S_UNSUPP;
63
acb->qiov = qiov;
64
@@ -XXX,XX +XXX,XX @@ static BlockAIOCB *qed_aio_setup(BlockDriverState *bs,
65
66
/* Start request */
67
qed_aio_start_io(acb);
68
- return &acb->common;
69
-}
70
71
-static BlockAIOCB *bdrv_qed_aio_readv(BlockDriverState *bs,
72
- int64_t sector_num,
73
- QEMUIOVector *qiov, int nb_sectors,
74
- BlockCompletionFunc *cb,
75
- void *opaque)
76
-{
77
- return qed_aio_setup(bs, sector_num, qiov, nb_sectors, cb, opaque, 0);
78
+ if (!co.done) {
79
+ qemu_coroutine_yield();
80
+ }
68
+ }
81
+
69
+
82
+ return co.ret;
70
+ size = iov_to_buf(iov, iovcnt, 0, &desc, sizeof(desc));
71
if (unlikely(size != sizeof(desc))) {
72
- error_report("Invalid size %zd, expect %zu", size, sizeof(desc));
73
- return -EINVAL;
74
+ error_report("Invalid size %zd, expected %zu", size, sizeof(desc));
75
+ return VIRTIO_BLK_S_IOERR;
76
}
77
78
- uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
79
- le32_to_cpu(desc.num_sectors) << 9 };
80
- if (type == VIRTIO_BLK_T_DISCARD) {
81
- if (blk_co_pdiscard(blk, range[0], range[1]) == 0) {
82
- return 0;
83
+ sector = le64_to_cpu(desc.sector);
84
+ num_sectors = le32_to_cpu(desc.num_sectors);
85
+ flags = le32_to_cpu(desc.flags);
86
+ max_sectors = (type == VIRTIO_BLK_T_WRITE_ZEROES) ?
87
+ VHOST_USER_BLK_MAX_WRITE_ZEROES_SECTORS :
88
+ VHOST_USER_BLK_MAX_DISCARD_SECTORS;
89
+
90
+ /* This check ensures that 'bytes' fits in an int */
91
+ if (unlikely(num_sectors > max_sectors)) {
92
+ return VIRTIO_BLK_S_IOERR;
93
+ }
94
+
95
+ bytes = num_sectors << VIRTIO_BLK_SECTOR_BITS;
96
+
97
+ if (unlikely(!vu_blk_sect_range_ok(vexp, sector, bytes))) {
98
+ return VIRTIO_BLK_S_IOERR;
99
+ }
100
+
101
+ /*
102
+ * The device MUST set the status byte to VIRTIO_BLK_S_UNSUPP for discard
103
+ * and write zeroes commands if any unknown flag is set.
104
+ */
105
+ if (unlikely(flags & ~VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP)) {
106
+ return VIRTIO_BLK_S_UNSUPP;
107
+ }
108
+
109
+ if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
110
+ int blk_flags = 0;
111
+
112
+ if (flags & VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP) {
113
+ blk_flags |= BDRV_REQ_MAY_UNMAP;
114
+ }
115
+
116
+ if (blk_co_pwrite_zeroes(blk, sector << VIRTIO_BLK_SECTOR_BITS,
117
+ bytes, blk_flags) == 0) {
118
+ return VIRTIO_BLK_S_OK;
119
}
120
- } else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
121
- if (blk_co_pwrite_zeroes(blk, range[0], range[1], 0) == 0) {
122
- return 0;
123
+ } else if (type == VIRTIO_BLK_T_DISCARD) {
124
+ /*
125
+ * The device MUST set the status byte to VIRTIO_BLK_S_UNSUPP for
126
+ * discard commands if the unmap flag is set.
127
+ */
128
+ if (unlikely(flags & VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP)) {
129
+ return VIRTIO_BLK_S_UNSUPP;
130
+ }
131
+
132
+ if (blk_co_pdiscard(blk, sector << VIRTIO_BLK_SECTOR_BITS,
133
+ bytes) == 0) {
134
+ return VIRTIO_BLK_S_OK;
135
}
136
}
137
138
- return -EINVAL;
139
+ return VIRTIO_BLK_S_IOERR;
83
}
140
}
84
141
85
-static BlockAIOCB *bdrv_qed_aio_writev(BlockDriverState *bs,
142
static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
86
- int64_t sector_num,
143
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
87
- QEMUIOVector *qiov, int nb_sectors,
144
}
88
- BlockCompletionFunc *cb,
145
case VIRTIO_BLK_T_DISCARD:
89
- void *opaque)
146
case VIRTIO_BLK_T_WRITE_ZEROES: {
90
+static int coroutine_fn bdrv_qed_co_readv(BlockDriverState *bs,
147
- int rc;
91
+ int64_t sector_num, int nb_sectors,
148
-
92
+ QEMUIOVector *qiov)
149
if (!vexp->writable) {
93
{
150
req->in->status = VIRTIO_BLK_S_IOERR;
94
- return qed_aio_setup(bs, sector_num, qiov, nb_sectors, cb,
151
break;
95
- opaque, QED_AIOCB_WRITE);
152
}
96
+ return qed_co_request(bs, sector_num, qiov, nb_sectors, 0);
153
154
- rc = vu_blk_discard_write_zeroes(blk, &elem->out_sg[1], out_num, type);
155
- if (rc == 0) {
156
- req->in->status = VIRTIO_BLK_S_OK;
157
- } else {
158
- req->in->status = VIRTIO_BLK_S_IOERR;
159
- }
160
+ req->in->status = vu_blk_discard_write_zeroes(vexp, out_iov, out_num,
161
+ type);
162
break;
163
}
164
default:
165
@@ -XXX,XX +XXX,XX @@ vu_blk_initialize_config(BlockDriverState *bs,
166
config->min_io_size = cpu_to_le16(1);
167
config->opt_io_size = cpu_to_le32(1);
168
config->num_queues = cpu_to_le16(num_queues);
169
- config->max_discard_sectors = cpu_to_le32(32768);
170
+ config->max_discard_sectors =
171
+ cpu_to_le32(VHOST_USER_BLK_MAX_DISCARD_SECTORS);
172
config->max_discard_seg = cpu_to_le32(1);
173
config->discard_sector_alignment =
174
cpu_to_le32(blk_size >> VIRTIO_BLK_SECTOR_BITS);
175
- config->max_write_zeroes_sectors = cpu_to_le32(32768);
176
+ config->max_write_zeroes_sectors
177
+ = cpu_to_le32(VHOST_USER_BLK_MAX_WRITE_ZEROES_SECTORS);
178
config->max_write_zeroes_seg = cpu_to_le32(1);
97
}
179
}
98
180
99
-typedef struct {
100
- Coroutine *co;
101
- int ret;
102
- bool done;
103
-} QEDWriteZeroesCB;
104
-
105
-static void coroutine_fn qed_co_pwrite_zeroes_cb(void *opaque, int ret)
106
+static int coroutine_fn bdrv_qed_co_writev(BlockDriverState *bs,
107
+ int64_t sector_num, int nb_sectors,
108
+ QEMUIOVector *qiov)
109
{
110
- QEDWriteZeroesCB *cb = opaque;
111
-
112
- cb->done = true;
113
- cb->ret = ret;
114
- if (cb->co) {
115
- aio_co_wake(cb->co);
116
- }
117
+ return qed_co_request(bs, sector_num, qiov, nb_sectors, QED_AIOCB_WRITE);
118
}
119
120
static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
121
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
122
int count,
123
BdrvRequestFlags flags)
124
{
125
- BlockAIOCB *blockacb;
126
BDRVQEDState *s = bs->opaque;
127
- QEDWriteZeroesCB cb = { .done = false };
128
QEMUIOVector qiov;
129
struct iovec iov;
130
131
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
132
iov.iov_len = count;
133
134
qemu_iovec_init_external(&qiov, &iov, 1);
135
- blockacb = qed_aio_setup(bs, offset >> BDRV_SECTOR_BITS, &qiov,
136
- count >> BDRV_SECTOR_BITS,
137
- qed_co_pwrite_zeroes_cb, &cb,
138
- QED_AIOCB_WRITE | QED_AIOCB_ZERO);
139
- if (!blockacb) {
140
- return -EIO;
141
- }
142
- if (!cb.done) {
143
- cb.co = qemu_coroutine_self();
144
- qemu_coroutine_yield();
145
- }
146
- assert(cb.done);
147
- return cb.ret;
148
+ return qed_co_request(bs, offset >> BDRV_SECTOR_BITS, &qiov,
149
+ count >> BDRV_SECTOR_BITS,
150
+ QED_AIOCB_WRITE | QED_AIOCB_ZERO);
151
}
152
153
static int bdrv_qed_truncate(BlockDriverState *bs, int64_t offset, Error **errp)
154
@@ -XXX,XX +XXX,XX @@ static BlockDriver bdrv_qed = {
155
.bdrv_create = bdrv_qed_create,
156
.bdrv_has_zero_init = bdrv_has_zero_init_1,
157
.bdrv_co_get_block_status = bdrv_qed_co_get_block_status,
158
- .bdrv_aio_readv = bdrv_qed_aio_readv,
159
- .bdrv_aio_writev = bdrv_qed_aio_writev,
160
+ .bdrv_co_readv = bdrv_qed_co_readv,
161
+ .bdrv_co_writev = bdrv_qed_co_writev,
162
.bdrv_co_pwrite_zeroes = bdrv_qed_co_pwrite_zeroes,
163
.bdrv_truncate = bdrv_qed_truncate,
164
.bdrv_getlength = bdrv_qed_getlength,
165
--
1.8.3.1

--
2.29.2

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/qed-cluster.c | 94 ++++++++++++++++++-----------------------------------
 block/qed-table.c   | 15 +++------
 block/qed.h         |  3 +-
 3 files changed, 36 insertions(+), 76 deletions(-)

From: Stefan Hajnoczi <stefanha@redhat.com>

Exercise input validation code paths in
block/export/vhost-user-blk-server.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210223144653.811468-12-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/qtest/vhost-user-blk-test.c | 124 ++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)
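For readers who want the request layout the new test drives, here is a rough
standalone sketch (not QEMU or qtest code) of the discard/write-zeroes payload
it submits. The struct mirrors the virtio-blk descriptor from the virtio
specification; the values correspond to two of the invalid cases exercised by
the test (oversized num_sectors, reserved flag bits set). Endianness handling
is omitted for brevity.

    /* Standalone sketch of the discard/write-zeroes payload used by the test. */
    #include <stdint.h>
    #include <stdio.h>

    struct dwz_payload {            /* mirrors struct virtio_blk_discard_write_zeroes */
        uint64_t sector;            /* first sector of the range */
        uint32_t num_sectors;       /* number of 512-byte sectors */
        uint32_t flags;             /* bit 0 = unmap, all other bits reserved */
    };

    int main(void)
    {
        /* num_sectors above max_write_zeroes_sectors: device must report an error */
        struct dwz_payload too_big = { .sector = 0, .num_sectors = 0xffffffff, .flags = 0 };

        /* reserved flag bits set: device must answer VIRTIO_BLK_S_UNSUPP */
        struct dwz_payload bad_flags = { .sector = 0, .num_sectors = 1, .flags = ~1u };

        printf("too_big: %u sectors, bad_flags: 0x%x\n",
               too_big.num_sectors, bad_flags.flags);
        return 0;
    }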
10
diff --git a/block/qed-cluster.c b/block/qed-cluster.c
13
diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c
11
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
12
--- a/block/qed-cluster.c
15
--- a/tests/qtest/vhost-user-blk-test.c
13
+++ b/block/qed-cluster.c
16
+++ b/tests/qtest/vhost-user-blk-test.c
14
@@ -XXX,XX +XXX,XX @@ static unsigned int qed_count_contiguous_clusters(BDRVQEDState *s,
17
@@ -XXX,XX +XXX,XX @@ static uint64_t virtio_blk_request(QGuestAllocator *alloc, QVirtioDevice *d,
15
return i - index;
18
return addr;
16
}
19
}
17
20
18
-typedef struct {
21
+static void test_invalid_discard_write_zeroes(QVirtioDevice *dev,
19
- BDRVQEDState *s;
22
+ QGuestAllocator *alloc,
20
- uint64_t pos;
23
+ QTestState *qts,
21
- size_t len;
24
+ QVirtQueue *vq,
22
-
25
+ uint32_t type)
23
- QEDRequest *request;
26
+{
24
-
27
+ QVirtioBlkReq req;
25
- /* User callback */
28
+ struct virtio_blk_discard_write_zeroes dwz_hdr;
26
- QEDFindClusterFunc *cb;
29
+ struct virtio_blk_discard_write_zeroes dwz_hdr2[2];
27
- void *opaque;
30
+ uint64_t req_addr;
28
-} QEDFindClusterCB;
31
+ uint32_t free_head;
29
-
32
+ uint8_t status;
30
-static void qed_find_cluster_cb(void *opaque, int ret)
33
+
31
-{
34
+ /* More than one dwz is not supported */
32
- QEDFindClusterCB *find_cluster_cb = opaque;
35
+ req.type = type;
33
- BDRVQEDState *s = find_cluster_cb->s;
36
+ req.data = (char *) dwz_hdr2;
34
- QEDRequest *request = find_cluster_cb->request;
37
+ dwz_hdr2[0].sector = 0;
35
- uint64_t offset = 0;
38
+ dwz_hdr2[0].num_sectors = 1;
36
- size_t len = 0;
39
+ dwz_hdr2[0].flags = 0;
37
- unsigned int index;
40
+ dwz_hdr2[1].sector = 1;
38
- unsigned int n;
41
+ dwz_hdr2[1].num_sectors = 1;
39
-
42
+ dwz_hdr2[1].flags = 0;
40
- qed_acquire(s);
43
+
41
- if (ret) {
44
+ virtio_blk_fix_dwz_hdr(dev, &dwz_hdr2[0]);
42
- goto out;
45
+ virtio_blk_fix_dwz_hdr(dev, &dwz_hdr2[1]);
43
- }
46
+
44
-
47
+ req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr2));
45
- index = qed_l2_index(s, find_cluster_cb->pos);
48
+
46
- n = qed_bytes_to_clusters(s,
49
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
47
- qed_offset_into_cluster(s, find_cluster_cb->pos) +
50
+ qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr2), false, true);
48
- find_cluster_cb->len);
51
+ qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr2), 1, true,
49
- n = qed_count_contiguous_clusters(s, request->l2_table->table,
52
+ false);
50
- index, n, &offset);
53
+
51
-
54
+ qvirtqueue_kick(qts, dev, vq, free_head);
52
- if (qed_offset_is_unalloc_cluster(offset)) {
55
+
53
- ret = QED_CLUSTER_L2;
56
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
54
- } else if (qed_offset_is_zero_cluster(offset)) {
57
+ QVIRTIO_BLK_TIMEOUT_US);
55
- ret = QED_CLUSTER_ZERO;
58
+ status = readb(req_addr + 16 + sizeof(dwz_hdr2));
56
- } else if (qed_check_cluster_offset(s, offset)) {
59
+ g_assert_cmpint(status, ==, VIRTIO_BLK_S_UNSUPP);
57
- ret = QED_CLUSTER_FOUND;
60
+
58
- } else {
61
+ guest_free(alloc, req_addr);
59
- ret = -EINVAL;
62
+
60
- }
63
+ /* num_sectors must be less than config->max_write_zeroes_sectors */
61
-
64
+ req.type = type;
62
- len = MIN(find_cluster_cb->len, n * s->header.cluster_size -
65
+ req.data = (char *) &dwz_hdr;
63
- qed_offset_into_cluster(s, find_cluster_cb->pos));
66
+ dwz_hdr.sector = 0;
64
-
67
+ dwz_hdr.num_sectors = 0xffffffff;
65
-out:
68
+ dwz_hdr.flags = 0;
66
- find_cluster_cb->cb(find_cluster_cb->opaque, ret, offset, len);
69
+
67
- qed_release(s);
70
+ virtio_blk_fix_dwz_hdr(dev, &dwz_hdr);
68
- g_free(find_cluster_cb);
71
+
69
-}
72
+ req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr));
70
-
73
+
71
/**
74
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
72
* Find the offset of a data cluster
75
+ qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr), false, true);
73
*
76
+ qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr), 1, true,
74
@@ -XXX,XX +XXX,XX @@ out:
77
+ false);
75
void qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
78
+
76
size_t len, QEDFindClusterFunc *cb, void *opaque)
79
+ qvirtqueue_kick(qts, dev, vq, free_head);
80
+
81
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
82
+ QVIRTIO_BLK_TIMEOUT_US);
83
+ status = readb(req_addr + 16 + sizeof(dwz_hdr));
84
+ g_assert_cmpint(status, ==, VIRTIO_BLK_S_IOERR);
85
+
86
+ guest_free(alloc, req_addr);
87
+
88
+ /* sector must be less than the device capacity */
89
+ req.type = type;
90
+ req.data = (char *) &dwz_hdr;
91
+ dwz_hdr.sector = TEST_IMAGE_SIZE / 512 + 1;
92
+ dwz_hdr.num_sectors = 1;
93
+ dwz_hdr.flags = 0;
94
+
95
+ virtio_blk_fix_dwz_hdr(dev, &dwz_hdr);
96
+
97
+ req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr));
98
+
99
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
100
+ qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr), false, true);
101
+ qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr), 1, true,
102
+ false);
103
+
104
+ qvirtqueue_kick(qts, dev, vq, free_head);
105
+
106
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
107
+ QVIRTIO_BLK_TIMEOUT_US);
108
+ status = readb(req_addr + 16 + sizeof(dwz_hdr));
109
+ g_assert_cmpint(status, ==, VIRTIO_BLK_S_IOERR);
110
+
111
+ guest_free(alloc, req_addr);
112
+
113
+ /* reserved flag bits must be zero */
114
+ req.type = type;
115
+ req.data = (char *) &dwz_hdr;
116
+ dwz_hdr.sector = 0;
117
+ dwz_hdr.num_sectors = 1;
118
+ dwz_hdr.flags = ~VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP;
119
+
120
+ virtio_blk_fix_dwz_hdr(dev, &dwz_hdr);
121
+
122
+ req_addr = virtio_blk_request(alloc, dev, &req, sizeof(dwz_hdr));
123
+
124
+ free_head = qvirtqueue_add(qts, vq, req_addr, 16, false, true);
125
+ qvirtqueue_add(qts, vq, req_addr + 16, sizeof(dwz_hdr), false, true);
126
+ qvirtqueue_add(qts, vq, req_addr + 16 + sizeof(dwz_hdr), 1, true,
127
+ false);
128
+
129
+ qvirtqueue_kick(qts, dev, vq, free_head);
130
+
131
+ qvirtio_wait_used_elem(qts, dev, vq, free_head, NULL,
132
+ QVIRTIO_BLK_TIMEOUT_US);
133
+ status = readb(req_addr + 16 + sizeof(dwz_hdr));
134
+ g_assert_cmpint(status, ==, VIRTIO_BLK_S_UNSUPP);
135
+
136
+ guest_free(alloc, req_addr);
137
+}
138
+
139
/* Returns the request virtqueue so the caller can perform further tests */
140
static QVirtQueue *test_basic(QVirtioDevice *dev, QGuestAllocator *alloc)
77
{
141
{
78
- QEDFindClusterCB *find_cluster_cb;
142
@@ -XXX,XX +XXX,XX @@ static QVirtQueue *test_basic(QVirtioDevice *dev, QGuestAllocator *alloc)
79
uint64_t l2_offset;
143
g_free(data);
80
+ uint64_t offset = 0;
144
81
+ unsigned int index;
145
guest_free(alloc, req_addr);
82
+ unsigned int n;
146
+
83
+ int ret;
147
+ test_invalid_discard_write_zeroes(dev, alloc, qts, vq,
84
148
+ VIRTIO_BLK_T_WRITE_ZEROES);
85
/* Limit length to L2 boundary. Requests are broken up at the L2 boundary
86
* so that a request acts on one L2 table at a time.
87
@@ -XXX,XX +XXX,XX @@ void qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
88
return;
89
}
149
}
90
150
91
- find_cluster_cb = g_malloc(sizeof(*find_cluster_cb));
151
if (features & (1u << VIRTIO_BLK_F_DISCARD)) {
92
- find_cluster_cb->s = s;
152
@@ -XXX,XX +XXX,XX @@ static QVirtQueue *test_basic(QVirtioDevice *dev, QGuestAllocator *alloc)
93
- find_cluster_cb->pos = pos;
153
g_assert_cmpint(status, ==, 0);
94
- find_cluster_cb->len = len;
154
95
- find_cluster_cb->cb = cb;
155
guest_free(alloc, req_addr);
96
- find_cluster_cb->opaque = opaque;
97
- find_cluster_cb->request = request;
98
+ ret = qed_read_l2_table(s, request, l2_offset);
99
+ qed_acquire(s);
100
+ if (ret) {
101
+ goto out;
102
+ }
103
+
156
+
104
+ index = qed_l2_index(s, pos);
157
+ test_invalid_discard_write_zeroes(dev, alloc, qts, vq,
105
+ n = qed_bytes_to_clusters(s,
158
+ VIRTIO_BLK_T_DISCARD);
106
+ qed_offset_into_cluster(s, pos) + len);
107
+ n = qed_count_contiguous_clusters(s, request->l2_table->table,
108
+ index, n, &offset);
109
+
110
+ if (qed_offset_is_unalloc_cluster(offset)) {
111
+ ret = QED_CLUSTER_L2;
112
+ } else if (qed_offset_is_zero_cluster(offset)) {
113
+ ret = QED_CLUSTER_ZERO;
114
+ } else if (qed_check_cluster_offset(s, offset)) {
115
+ ret = QED_CLUSTER_FOUND;
116
+ } else {
117
+ ret = -EINVAL;
118
+ }
119
+
120
+ len = MIN(len,
121
+ n * s->header.cluster_size - qed_offset_into_cluster(s, pos));
122
123
- qed_read_l2_table(s, request, l2_offset,
124
- qed_find_cluster_cb, find_cluster_cb);
125
+out:
126
+ cb(opaque, ret, offset, len);
127
+ qed_release(s);
128
}
129
diff --git a/block/qed-table.c b/block/qed-table.c
130
index XXXXXXX..XXXXXXX 100644
131
--- a/block/qed-table.c
132
+++ b/block/qed-table.c
133
@@ -XXX,XX +XXX,XX @@ int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
134
return ret;
135
}
136
137
-void qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset,
138
- BlockCompletionFunc *cb, void *opaque)
139
+int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
140
{
141
int ret;
142
143
@@ -XXX,XX +XXX,XX @@ void qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset,
144
/* Check for cached L2 entry */
145
request->l2_table = qed_find_l2_cache_entry(&s->l2_cache, offset);
146
if (request->l2_table) {
147
- cb(opaque, 0);
148
- return;
149
+ return 0;
150
}
159
}
151
160
152
request->l2_table = qed_alloc_l2_cache_entry(&s->l2_cache);
161
if (features & (1u << VIRTIO_F_ANY_LAYOUT)) {
153
@@ -XXX,XX +XXX,XX @@ void qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset,
154
}
155
qed_release(s);
156
157
- cb(opaque, ret);
158
+ return ret;
159
}
160
161
int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
162
{
163
- int ret = -EINPROGRESS;
164
-
165
- qed_read_l2_table(s, request, offset, qed_sync_cb, &ret);
166
- BDRV_POLL_WHILE(s->bs, ret == -EINPROGRESS);
167
-
168
- return ret;
169
+ return qed_read_l2_table(s, request, offset);
170
}
171
172
void qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
173
diff --git a/block/qed.h b/block/qed.h
174
index XXXXXXX..XXXXXXX 100644
175
--- a/block/qed.h
176
+++ b/block/qed.h
177
@@ -XXX,XX +XXX,XX @@ int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
178
unsigned int n);
179
int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
180
uint64_t offset);
181
-void qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset,
182
- BlockCompletionFunc *cb, void *opaque);
183
+int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset);
184
void qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
185
unsigned int index, unsigned int n, bool flush,
186
BlockCompletionFunc *cb, void *opaque);
187
--
1.8.3.1

--
2.29.2

From: Alberto Garcia <berto@igalia.com>

If the guest tries to write data that results in the allocation of a
new cluster, instead of writing the guest data first and then the data
from the COW regions, write everything together using one single I/O
operation.

This can improve the write performance by 25% or more, depending on
several factors such as the media type, the cluster size and the I/O
request size.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c | 40 ++++++++++++++++++++++++--------
 block/qcow2.c         | 64 +++++++++++++++++++++++++++++++++++++++++++--------
 block/qcow2.h         |  7 ++++++
 3 files changed, 91 insertions(+), 20 deletions(-)

From: Stefan Hajnoczi <stefanha@redhat.com>

Check that the sector number and byte count are valid.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210223144653.811468-13-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/export/vhost-user-blk-server.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)
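As a rough illustration of the check this patch adds (a standalone sketch, not
the QEMU code; the names, the sector cap and the example geometry below are
made up for the example), the validation boils down to rejecting oversized,
misaligned and out-of-range requests without risking integer overflow:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    #define SECTOR_BITS 9
    #define MAX_REQUEST_SECTORS (1u << 15)   /* illustrative cap on one request */

    static bool sect_range_ok(uint64_t sector, size_t bytes,
                              uint32_t blk_size, uint64_t capacity_sectors)
    {
        uint64_t nb_sectors = bytes >> SECTOR_BITS;

        if (nb_sectors > MAX_REQUEST_SECTORS) {
            return false;                    /* request too large */
        }
        if ((sector << SECTOR_BITS) % blk_size) {
            return false;                    /* not aligned to the logical block size */
        }
        /* written this way so that sector + nb_sectors cannot overflow */
        if (sector > capacity_sectors || nb_sectors > capacity_sectors - sector) {
            return false;                    /* extends beyond the end of the device */
        }
        return true;
    }

    int main(void)
    {
        /* 1 MiB request at sector 0 of a 10 MiB device with 512-byte blocks */
        printf("%d\n", sect_range_ok(0, 1 << 20, 512, 20480));
        return 0;
    }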
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
12
diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
22
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
23
--- a/block/qcow2-cluster.c
14
--- a/block/export/vhost-user-blk-server.c
24
+++ b/block/qcow2-cluster.c
15
+++ b/block/export/vhost-user-blk-server.c
25
@@ -XXX,XX +XXX,XX @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
16
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
26
assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
17
switch (type & ~VIRTIO_BLK_T_BARRIER) {
27
assert(start->nb_bytes + end->nb_bytes <= UINT_MAX - data_bytes);
18
case VIRTIO_BLK_T_IN:
28
assert(start->offset + start->nb_bytes <= end->offset);
19
case VIRTIO_BLK_T_OUT: {
29
+ assert(!m->data_qiov || m->data_qiov->size == data_bytes);
20
+ QEMUIOVector qiov;
30
21
+ int64_t offset;
31
if (start->nb_bytes == 0 && end->nb_bytes == 0) {
22
ssize_t ret = 0;
32
return 0;
23
bool is_write = type & VIRTIO_BLK_T_OUT;
33
@@ -XXX,XX +XXX,XX @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
24
req->sector_num = le64_to_cpu(req->out.sector);
34
/* The part of the buffer where the end region is located */
25
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
35
end_buffer = start_buffer + buffer_size - end->nb_bytes;
26
break;
36
37
- qemu_iovec_init(&qiov, 1);
38
+ qemu_iovec_init(&qiov, 2 + (m->data_qiov ? m->data_qiov->niov : 0));
39
40
qemu_co_mutex_unlock(&s->lock);
41
/* First we read the existing data from both COW regions. We
42
@@ -XXX,XX +XXX,XX @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
43
}
27
}
44
}
28
45
29
- int64_t offset = req->sector_num << VIRTIO_BLK_SECTOR_BITS;
46
- /* And now we can write everything */
30
- QEMUIOVector qiov;
47
- qemu_iovec_reset(&qiov);
31
if (is_write) {
48
- qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
32
qemu_iovec_init_external(&qiov, out_iov, out_num);
49
- ret = do_perform_cow_write(bs, m->alloc_offset, start->offset, &qiov);
33
- ret = blk_co_pwritev(blk, offset, qiov.size, &qiov, 0);
50
- if (ret < 0) {
34
} else {
51
- goto fail;
35
qemu_iovec_init_external(&qiov, in_iov, in_num);
52
+ /* And now we can write everything. If we have the guest data we
53
+ * can write everything in one single operation */
54
+ if (m->data_qiov) {
55
+ qemu_iovec_reset(&qiov);
56
+ if (start->nb_bytes) {
57
+ qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
58
+ }
59
+ qemu_iovec_concat(&qiov, m->data_qiov, 0, data_bytes);
60
+ if (end->nb_bytes) {
61
+ qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
62
+ }
63
+ /* NOTE: we have a write_aio blkdebug event here followed by
64
+ * a cow_write one in do_perform_cow_write(), but there's only
65
+ * one single I/O operation */
66
+ BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
67
+ ret = do_perform_cow_write(bs, m->alloc_offset, start->offset, &qiov);
68
+ } else {
69
+ /* If there's no guest data then write both COW regions separately */
70
+ qemu_iovec_reset(&qiov);
71
+ qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
72
+ ret = do_perform_cow_write(bs, m->alloc_offset, start->offset, &qiov);
73
+ if (ret < 0) {
74
+ goto fail;
75
+ }
36
+ }
76
+
37
+
77
+ qemu_iovec_reset(&qiov);
38
+ if (unlikely(!vu_blk_sect_range_ok(vexp,
78
+ qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
39
+ req->sector_num,
79
+ ret = do_perform_cow_write(bs, m->alloc_offset, end->offset, &qiov);
40
+ qiov.size))) {
80
}
41
+ req->in->status = VIRTIO_BLK_S_IOERR;
81
42
+ break;
82
- qemu_iovec_reset(&qiov);
83
- qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
84
- ret = do_perform_cow_write(bs, m->alloc_offset, end->offset, &qiov);
85
fail:
86
qemu_co_mutex_lock(&s->lock);
87
88
diff --git a/block/qcow2.c b/block/qcow2.c
89
index XXXXXXX..XXXXXXX 100644
90
--- a/block/qcow2.c
91
+++ b/block/qcow2.c
92
@@ -XXX,XX +XXX,XX @@ fail:
93
return ret;
94
}
95
96
+/* Check if it's possible to merge a write request with the writing of
97
+ * the data from the COW regions */
98
+static bool merge_cow(uint64_t offset, unsigned bytes,
99
+ QEMUIOVector *hd_qiov, QCowL2Meta *l2meta)
100
+{
101
+ QCowL2Meta *m;
102
+
103
+ for (m = l2meta; m != NULL; m = m->next) {
104
+ /* If both COW regions are empty then there's nothing to merge */
105
+ if (m->cow_start.nb_bytes == 0 && m->cow_end.nb_bytes == 0) {
106
+ continue;
107
+ }
43
+ }
108
+
44
+
109
+ /* The data (middle) region must be immediately after the
45
+ offset = req->sector_num << VIRTIO_BLK_SECTOR_BITS;
110
+ * start region */
111
+ if (l2meta_cow_start(m) + m->cow_start.nb_bytes != offset) {
112
+ continue;
113
+ }
114
+
46
+
115
+ /* The end region must be immediately after the data (middle)
47
+ if (is_write) {
116
+ * region */
48
+ ret = blk_co_pwritev(blk, offset, qiov.size, &qiov, 0);
117
+ if (m->offset + m->cow_end.offset != offset + bytes) {
49
+ } else {
118
+ continue;
50
ret = blk_co_preadv(blk, offset, qiov.size, &qiov, 0);
119
+ }
120
+
121
+ /* Make sure that adding both COW regions to the QEMUIOVector
122
+ * does not exceed IOV_MAX */
123
+ if (hd_qiov->niov > IOV_MAX - 2) {
124
+ continue;
125
+ }
126
+
127
+ m->data_qiov = hd_qiov;
128
+ return true;
129
+ }
130
+
131
+ return false;
132
+}
133
+
134
static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
135
uint64_t bytes, QEMUIOVector *qiov,
136
int flags)
137
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
138
goto fail;
139
}
51
}
140
52
if (ret >= 0) {
141
- qemu_co_mutex_unlock(&s->lock);
142
- BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
143
- trace_qcow2_writev_data(qemu_coroutine_self(),
144
- cluster_offset + offset_in_cluster);
145
- ret = bdrv_co_pwritev(bs->file,
146
- cluster_offset + offset_in_cluster,
147
- cur_bytes, &hd_qiov, 0);
148
- qemu_co_mutex_lock(&s->lock);
149
- if (ret < 0) {
150
- goto fail;
151
+ /* If we need to do COW, check if it's possible to merge the
152
+ * writing of the guest data together with that of the COW regions.
153
+ * If it's not possible (or not necessary) then write the
154
+ * guest data now. */
155
+ if (!merge_cow(offset, cur_bytes, &hd_qiov, l2meta)) {
156
+ qemu_co_mutex_unlock(&s->lock);
157
+ BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
158
+ trace_qcow2_writev_data(qemu_coroutine_self(),
159
+ cluster_offset + offset_in_cluster);
160
+ ret = bdrv_co_pwritev(bs->file,
161
+ cluster_offset + offset_in_cluster,
162
+ cur_bytes, &hd_qiov, 0);
163
+ qemu_co_mutex_lock(&s->lock);
164
+ if (ret < 0) {
165
+ goto fail;
166
+ }
167
}
168
169
while (l2meta != NULL) {
170
diff --git a/block/qcow2.h b/block/qcow2.h
171
index XXXXXXX..XXXXXXX 100644
172
--- a/block/qcow2.h
173
+++ b/block/qcow2.h
174
@@ -XXX,XX +XXX,XX @@ typedef struct QCowL2Meta
175
*/
176
Qcow2COWRegion cow_end;
177
178
+ /**
179
+ * The I/O vector with the data from the actual guest write request.
180
+ * If non-NULL, this is meant to be merged together with the data
181
+ * from @cow_start and @cow_end into one single write operation.
182
+ */
183
+ QEMUIOVector *data_qiov;
184
+
185
/** Pointer to next L2Meta of the same write request */
186
struct QCowL2Meta *next;
187
188
--
1.8.3.1

--
2.29.2

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/qed.c | 32 ++++++++++++--------------------
 1 file changed, 12 insertions(+), 20 deletions(-)

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Rename bytes_covered_by_bitmap_cluster() to
bdrv_dirty_bitmap_serialization_coverage() and make it public.
It is needed because we are going to share it with bitmap loading in
the parallels format.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <20210224104707.88430-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/dirty-bitmap.h |  2 ++
 block/dirty-bitmap.c         | 13 +++++++++++++
 block/qcow2-bitmap.c         | 16 ++--------------
 3 files changed, 17 insertions(+), 14 deletions(-)
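For reference, the helper being made public computes how much of the disk one
chunk of serialized bitmap data covers: each serialized byte holds 8 bits and
each bit covers one granularity-sized piece of the disk. A minimal standalone
illustration of that arithmetic (the 64 KiB figures are only an example):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Coverage of one serialized chunk:
     * granularity * serialized_chunk_size * 8 bytes of disk. */
    static uint64_t serialization_coverage(uint64_t serialized_chunk_size,
                                           uint64_t granularity)
    {
        return granularity * (serialized_chunk_size << 3);
    }

    int main(void)
    {
        /* example: 64 KiB bitmap clusters and 64 KiB bitmap granularity */
        uint64_t limit = serialization_coverage(64 * 1024, 64 * 1024);
        printf("one 64 KiB bitmap cluster covers %" PRIu64 " bytes (32 GiB)\n", limit);
        return 0;
    }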
diff --git a/block/qed.c b/block/qed.c
19
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
8
index XXXXXXX..XXXXXXX 100644
20
index XXXXXXX..XXXXXXX 100644
9
--- a/block/qed.c
21
--- a/include/block/dirty-bitmap.h
10
+++ b/block/qed.c
22
+++ b/include/block/dirty-bitmap.h
11
@@ -XXX,XX +XXX,XX @@ int qed_write_header_sync(BDRVQEDState *s)
23
@@ -XXX,XX +XXX,XX @@ void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);
12
* This function only updates known header fields in-place and does not affect
24
uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
13
* extra data after the QED header.
25
uint64_t offset, uint64_t bytes);
14
*/
26
uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap);
15
-static void qed_write_header(BDRVQEDState *s, BlockCompletionFunc cb,
27
+uint64_t bdrv_dirty_bitmap_serialization_coverage(int serialized_chunk_size,
16
- void *opaque)
28
+ const BdrvDirtyBitmap *bitmap);
17
+static int qed_write_header(BDRVQEDState *s)
29
void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
18
{
30
uint8_t *buf, uint64_t offset,
19
/* We must write full sectors for O_DIRECT but cannot necessarily generate
31
uint64_t bytes);
20
* the data following the header if an unrecognized compat feature is
32
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
21
@@ -XXX,XX +XXX,XX @@ static void qed_write_header(BDRVQEDState *s, BlockCompletionFunc cb,
33
index XXXXXXX..XXXXXXX 100644
22
ret = 0;
34
--- a/block/dirty-bitmap.c
23
out:
35
+++ b/block/dirty-bitmap.c
24
qemu_vfree(buf);
36
@@ -XXX,XX +XXX,XX @@ uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap)
25
- cb(opaque, ret);
37
return hbitmap_serialization_align(bitmap->bitmap);
26
+ return ret;
27
}
38
}
28
39
29
static uint64_t qed_max_image_size(uint32_t cluster_size, uint32_t table_size)
40
+/* Return the disk size covered by a chunk of serialized bitmap data. */
30
@@ -XXX,XX +XXX,XX @@ static void qed_unplug_allocating_write_reqs(BDRVQEDState *s)
41
+uint64_t bdrv_dirty_bitmap_serialization_coverage(int serialized_chunk_size,
31
}
42
+ const BdrvDirtyBitmap *bitmap)
43
+{
44
+ uint64_t granularity = bdrv_dirty_bitmap_granularity(bitmap);
45
+ uint64_t limit = granularity * (serialized_chunk_size << 3);
46
+
47
+ assert(QEMU_IS_ALIGNED(limit,
48
+ bdrv_dirty_bitmap_serialization_align(bitmap)));
49
+ return limit;
50
+}
51
+
52
+
53
void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
54
uint8_t *buf, uint64_t offset,
55
uint64_t bytes)
56
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
57
index XXXXXXX..XXXXXXX 100644
58
--- a/block/qcow2-bitmap.c
59
+++ b/block/qcow2-bitmap.c
60
@@ -XXX,XX +XXX,XX @@ static int free_bitmap_clusters(BlockDriverState *bs, Qcow2BitmapTable *tb)
61
return 0;
32
}
62
}
33
63
34
-static void qed_finish_clear_need_check(void *opaque, int ret)
64
-/* Return the disk size covered by a single qcow2 cluster of bitmap data. */
65
-static uint64_t bytes_covered_by_bitmap_cluster(const BDRVQcow2State *s,
66
- const BdrvDirtyBitmap *bitmap)
35
-{
67
-{
36
- /* Do nothing */
68
- uint64_t granularity = bdrv_dirty_bitmap_granularity(bitmap);
69
- uint64_t limit = granularity * (s->cluster_size << 3);
70
-
71
- assert(QEMU_IS_ALIGNED(limit,
72
- bdrv_dirty_bitmap_serialization_align(bitmap)));
73
- return limit;
37
-}
74
-}
38
-
75
-
39
-static void qed_flush_after_clear_need_check(void *opaque, int ret)
76
/* load_bitmap_data
40
-{
77
* @bitmap_table entries must satisfy specification constraints.
41
- BDRVQEDState *s = opaque;
78
* @bitmap must be cleared */
42
-
79
@@ -XXX,XX +XXX,XX @@ static int load_bitmap_data(BlockDriverState *bs,
43
- bdrv_aio_flush(s->bs, qed_finish_clear_need_check, s);
44
-
45
- /* No need to wait until flush completes */
46
- qed_unplug_allocating_write_reqs(s);
47
-}
48
-
49
static void qed_clear_need_check(void *opaque, int ret)
50
{
51
BDRVQEDState *s = opaque;
52
@@ -XXX,XX +XXX,XX @@ static void qed_clear_need_check(void *opaque, int ret)
53
}
80
}
54
81
55
s->header.features &= ~QED_F_NEED_CHECK;
82
buf = g_malloc(s->cluster_size);
56
- qed_write_header(s, qed_flush_after_clear_need_check, s);
83
- limit = bytes_covered_by_bitmap_cluster(s, bitmap);
57
+ ret = qed_write_header(s);
84
+ limit = bdrv_dirty_bitmap_serialization_coverage(s->cluster_size, bitmap);
58
+ (void) ret;
85
for (i = 0, offset = 0; i < tab_size; ++i, offset += limit) {
59
+
86
uint64_t count = MIN(bm_size - offset, limit);
60
+ qed_unplug_allocating_write_reqs(s);
87
uint64_t entry = bitmap_table[i];
61
+
88
@@ -XXX,XX +XXX,XX @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
62
+ ret = bdrv_flush(s->bs);
63
+ (void) ret;
64
}
65
66
static void qed_need_check_timer_cb(void *opaque)
67
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
68
{
69
BDRVQEDState *s = acb_to_s(acb);
70
BlockCompletionFunc *cb;
71
+ int ret;
72
73
/* Cancel timer when the first allocating request comes in */
74
if (QSIMPLEQ_EMPTY(&s->allocating_write_reqs)) {
75
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
76
77
if (qed_should_set_need_check(s)) {
78
s->header.features |= QED_F_NEED_CHECK;
79
- qed_write_header(s, cb, acb);
80
+ ret = qed_write_header(s);
81
+ cb(acb, ret);
82
} else {
83
cb(acb, 0);
84
}
89
}
90
91
buf = g_malloc(s->cluster_size);
92
- limit = bytes_covered_by_bitmap_cluster(s, bitmap);
93
+ limit = bdrv_dirty_bitmap_serialization_coverage(s->cluster_size, bitmap);
94
assert(DIV_ROUND_UP(bm_size, limit) == tb_size);
95
96
offset = 0;
85
--
1.8.3.1

--
2.29.2

From: Alberto Garcia <berto@igalia.com>

Qcow2COWRegion has two attributes:

- The offset of the COW region from the start of the first cluster
  touched by the I/O request. Since it's always going to be positive
  and the maximum request size is at most INT_MAX, we can use a
  regular unsigned int to store this offset.

- The size of the COW region in bytes. This is guaranteed to be >= 0,
  so we should use an unsigned type instead.

On x86_64 this reduces the size of Qcow2COWRegion from 16 to 8 bytes.
It will also help keep some assertions simpler now that we know that
there are no negative numbers.

The prototype of do_perform_cow() is also updated to reflect these
changes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c | 4 ++--
 block/qcow2.h         | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Actually, the L1 table entry offset is in 512-byte sectors. Fix the spec.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210224104707.88430-3-vsementsov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 docs/interop/parallels.txt | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)
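To make the corrected mapping concrete, here is a standalone sketch (not the
parallels driver itself) of turning a byte offset within the bitmap data into
an offset in the image file, following the rules described above; cluster_size
is whatever the image uses, for example 64 KiB:

    #include <stdint.h>

    #define SECTOR_SIZE 512

    /* Returns the image-file offset holding the given byte of bitmap data,
     * or -1 when that bitmap cluster is not stored (entry 0 = all bits zero,
     * entry 1 = all bits one). Other entries are offsets in 512-byte sectors. */
    int64_t bitmap_byte_to_image_offset(const uint64_t *l1_table,
                                        uint64_t bitmap_offset,
                                        uint32_t cluster_size)
    {
        uint64_t entry = l1_table[bitmap_offset / cluster_size];

        if (entry == 0 || entry == 1) {
            return -1;
        }
        return (int64_t)(entry * SECTOR_SIZE + bitmap_offset % cluster_size);
    }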
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
13
diff --git a/docs/interop/parallels.txt b/docs/interop/parallels.txt
30
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
31
--- a/block/qcow2-cluster.c
15
--- a/docs/interop/parallels.txt
32
+++ b/block/qcow2-cluster.c
16
+++ b/docs/interop/parallels.txt
33
@@ -XXX,XX +XXX,XX @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
17
@@ -XXX,XX +XXX,XX @@ of its data area are:
34
static int coroutine_fn do_perform_cow(BlockDriverState *bs,
18
28 - 31: l1_size
35
uint64_t src_cluster_offset,
19
The number of entries in the L1 table of the bitmap.
36
uint64_t cluster_offset,
20
37
- int offset_in_cluster,
21
- variable: l1_table (8 * l1_size bytes)
38
- int bytes)
22
- L1 offset table (in bytes)
39
+ unsigned offset_in_cluster,
23
+ variable: L1 offset table (l1_table), size: 8 * l1_size bytes
40
+ unsigned bytes)
24
41
{
25
-A dirty bitmap is stored using a one-level structure for the mapping to host
42
BDRVQcow2State *s = bs->opaque;
26
-clusters - an L1 table.
43
QEMUIOVector qiov;
27
+The dirty bitmap described by this feature extension is stored in a set of
44
diff --git a/block/qcow2.h b/block/qcow2.h
28
+clusters inside the Parallels image file. The offsets of these clusters are
45
index XXXXXXX..XXXXXXX 100644
29
+saved in the L1 offset table specified by the feature extension. Each L1 table
46
--- a/block/qcow2.h
30
+entry is a 64 bit integer as described below:
47
+++ b/block/qcow2.h
31
48
@@ -XXX,XX +XXX,XX @@ typedef struct Qcow2COWRegion {
32
-Given an offset in bytes into the bitmap data, the offset in bytes into the
49
* Offset of the COW region in bytes from the start of the first cluster
33
-image file can be obtained as follows:
50
* touched by the request.
34
+Given an offset in bytes into the bitmap data, corresponding L1 entry is
51
*/
35
52
- uint64_t offset;
36
- offset = l1_table[offset / cluster_size] + (offset % cluster_size)
53
+ unsigned offset;
37
+ l1_table[offset / cluster_size]
54
38
55
/** Number of bytes to copy */
39
-If an L1 table entry is 0, the corresponding cluster of the bitmap is assumed
56
- int nb_bytes;
40
-to be zero.
57
+ unsigned nb_bytes;
41
+If an L1 table entry is 0, all bits in the corresponding cluster of the bitmap
58
} Qcow2COWRegion;
42
+are assumed to be 0.
59
43
60
/**
44
-If an L1 table entry is 1, the corresponding cluster of the bitmap is assumed
45
-to have all bits set.
46
+If an L1 table entry is 1, all bits in the corresponding cluster of the bitmap
47
+are assumed to be 1.
48
49
-If an L1 table entry is not 0 or 1, it allocates a cluster from the data area.
50
+If an L1 table entry is not 0 or 1, it contains the corresponding cluster
51
+offset (in 512b sectors). Given an offset in bytes into the bitmap data the
52
+offset in bytes into the image file can be obtained as follows:
53
+
54
+ offset = l1_table[offset / cluster_size] * 512 + (offset % cluster_size)
61
--
1.8.3.1

--
2.29.2

From: Alberto Garcia <berto@igalia.com>

Instead of passing a single buffer pointer to do_perform_cow_write(),
pass a QEMUIOVector. This will allow us to merge the write requests
for the COW regions and the actual data into a single one.

Although do_perform_cow_read() does not strictly need to change its
API, we're doing it here as well for consistency.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c | 51 ++++++++++++++++++++++++---------------------------
 1 file changed, 24 insertions(+), 27 deletions(-)

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

We are going to use it in more places; calculating
"s->tracks << BDRV_SECTOR_BITS" doesn't look good.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210224104707.88430-4-vsementsov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/parallels.h | 1 +
 block/parallels.c | 8 ++++----
 2 files changed, 5 insertions(+), 4 deletions(-)
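As a rough illustration of why the qcow2 patch above switches from a
(buffer, size) pair to an I/O vector: with a vectored interface the caller can
hand several discontiguous buffers to one request instead of copying them into
a bounce buffer first. A minimal standalone sketch using the plain POSIX
equivalent, writev() (QEMU's QEMUIOVector plays the same role inside the block
layer); the file name is made up:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/uio.h>
    #include <unistd.h>

    int main(void)
    {
        char head[512] = "head", data[4096] = "data", tail[512] = "tail";
        struct iovec iov[3] = {
            { .iov_base = head, .iov_len = sizeof(head) },   /* e.g. COW start region */
            { .iov_base = data, .iov_len = sizeof(data) },   /* guest data */
            { .iov_base = tail, .iov_len = sizeof(tail) },   /* e.g. COW end region */
        };

        int fd = open("example.img", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* one system call, one contiguous range in the file, no bounce buffer */
        if (writev(fd, iov, 3) < 0) {
            perror("writev");
        }
        close(fd);
        return 0;
    }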
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
15
diff --git a/block/parallels.h b/block/parallels.h
18
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
19
--- a/block/qcow2-cluster.c
17
--- a/block/parallels.h
20
+++ b/block/qcow2-cluster.c
18
+++ b/block/parallels.h
21
@@ -XXX,XX +XXX,XX @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
19
@@ -XXX,XX +XXX,XX @@ typedef struct BDRVParallelsState {
22
static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
20
ParallelsPreallocMode prealloc_mode;
23
uint64_t src_cluster_offset,
21
24
unsigned offset_in_cluster,
22
unsigned int tracks;
25
- uint8_t *buffer,
23
+ unsigned int cluster_size;
26
- unsigned bytes)
24
27
+ QEMUIOVector *qiov)
25
unsigned int off_multiplier;
28
{
26
Error *migration_blocker;
29
- QEMUIOVector qiov;
27
diff --git a/block/parallels.c b/block/parallels.c
30
- struct iovec iov = { .iov_base = buffer, .iov_len = bytes };
28
index XXXXXXX..XXXXXXX 100644
29
--- a/block/parallels.c
30
+++ b/block/parallels.c
31
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn parallels_co_check(BlockDriverState *bs,
31
int ret;
32
int ret;
32
33
uint32_t i;
33
- if (bytes == 0) {
34
bool flush_bat = false;
34
+ if (qiov->size == 0) {
35
- int cluster_size = s->tracks << BDRV_SECTOR_BITS;
35
return 0;
36
37
size = bdrv_getlength(bs->file->bs);
38
if (size < 0) {
39
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn parallels_co_check(BlockDriverState *bs,
40
high_off = off;
41
}
42
43
- if (prev_off != 0 && (prev_off + cluster_size) != off) {
44
+ if (prev_off != 0 && (prev_off + s->cluster_size) != off) {
45
res->bfi.fragmented_clusters++;
46
}
47
prev_off = off;
48
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn parallels_co_check(BlockDriverState *bs,
49
}
36
}
50
}
37
51
38
- qemu_iovec_init_external(&qiov, &iov, 1);
52
- res->image_end_offset = high_off + cluster_size;
39
-
53
+ res->image_end_offset = high_off + s->cluster_size;
40
BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
54
if (size > res->image_end_offset) {
41
55
int64_t count;
42
if (!bs->drv) {
56
- count = DIV_ROUND_UP(size - res->image_end_offset, cluster_size);
43
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
57
+ count = DIV_ROUND_UP(size - res->image_end_offset, s->cluster_size);
44
* which can lead to deadlock when block layer copy-on-read is enabled.
58
fprintf(stderr, "%s space leaked at the end of the image %" PRId64 "\n",
45
*/
59
fix & BDRV_FIX_LEAKS ? "Repairing" : "ERROR",
46
ret = bs->drv->bdrv_co_preadv(bs, src_cluster_offset + offset_in_cluster,
60
size - res->image_end_offset);
47
- bytes, &qiov, 0);
61
@@ -XXX,XX +XXX,XX @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
48
+ qiov->size, qiov, 0);
62
ret = -EFBIG;
49
if (ret < 0) {
50
return ret;
51
}
52
@@ -XXX,XX +XXX,XX @@ static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
53
static int coroutine_fn do_perform_cow_write(BlockDriverState *bs,
54
uint64_t cluster_offset,
55
unsigned offset_in_cluster,
56
- uint8_t *buffer,
57
- unsigned bytes)
58
+ QEMUIOVector *qiov)
59
{
60
- QEMUIOVector qiov;
61
- struct iovec iov = { .iov_base = buffer, .iov_len = bytes };
62
int ret;
63
64
- if (bytes == 0) {
65
+ if (qiov->size == 0) {
66
return 0;
67
}
68
69
- qemu_iovec_init_external(&qiov, &iov, 1);
70
-
71
ret = qcow2_pre_write_overlap_check(bs, 0,
72
- cluster_offset + offset_in_cluster, bytes);
73
+ cluster_offset + offset_in_cluster, qiov->size);
74
if (ret < 0) {
75
return ret;
76
}
77
78
BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
79
ret = bdrv_co_pwritev(bs->file, cluster_offset + offset_in_cluster,
80
- bytes, &qiov, 0);
81
+ qiov->size, qiov, 0);
82
if (ret < 0) {
83
return ret;
84
}
85
@@ -XXX,XX +XXX,XX @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
86
unsigned data_bytes = end->offset - (start->offset + start->nb_bytes);
87
bool merge_reads;
88
uint8_t *start_buffer, *end_buffer;
89
+ QEMUIOVector qiov;
90
int ret;
91
92
assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
93
@@ -XXX,XX +XXX,XX @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
94
/* The part of the buffer where the end region is located */
95
end_buffer = start_buffer + buffer_size - end->nb_bytes;
96
97
+ qemu_iovec_init(&qiov, 1);
98
+
99
qemu_co_mutex_unlock(&s->lock);
100
/* First we read the existing data from both COW regions. We
101
* either read the whole region in one go, or the start and end
102
* regions separately. */
103
if (merge_reads) {
104
- ret = do_perform_cow_read(bs, m->offset, start->offset,
105
- start_buffer, buffer_size);
106
+ qemu_iovec_add(&qiov, start_buffer, buffer_size);
107
+ ret = do_perform_cow_read(bs, m->offset, start->offset, &qiov);
108
} else {
109
- ret = do_perform_cow_read(bs, m->offset, start->offset,
110
- start_buffer, start->nb_bytes);
111
+ qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
112
+ ret = do_perform_cow_read(bs, m->offset, start->offset, &qiov);
113
if (ret < 0) {
114
goto fail;
115
}
116
117
- ret = do_perform_cow_read(bs, m->offset, end->offset,
118
- end_buffer, end->nb_bytes);
119
+ qemu_iovec_reset(&qiov);
120
+ qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
121
+ ret = do_perform_cow_read(bs, m->offset, end->offset, &qiov);
122
}
123
if (ret < 0) {
124
goto fail;
125
@@ -XXX,XX +XXX,XX @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
126
}
127
128
/* And now we can write everything */
129
- ret = do_perform_cow_write(bs, m->alloc_offset, start->offset,
130
- start_buffer, start->nb_bytes);
131
+ qemu_iovec_reset(&qiov);
132
+ qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
133
+ ret = do_perform_cow_write(bs, m->alloc_offset, start->offset, &qiov);
134
if (ret < 0) {
135
goto fail;
63
goto fail;
136
}
64
}
137
65
+ s->cluster_size = s->tracks << BDRV_SECTOR_BITS;
138
- ret = do_perform_cow_write(bs, m->alloc_offset, end->offset,
66
139
- end_buffer, end->nb_bytes);
67
s->bat_size = le32_to_cpu(ph.bat_entries);
140
+ qemu_iovec_reset(&qiov);
68
if (s->bat_size > INT_MAX / sizeof(uint32_t)) {
141
+ qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
142
+ ret = do_perform_cow_write(bs, m->alloc_offset, end->offset, &qiov);
143
fail:
144
qemu_co_mutex_lock(&s->lock);
145
146
@@ -XXX,XX +XXX,XX @@ fail:
147
}
148
149
qemu_vfree(start_buffer);
150
+ qemu_iovec_destroy(&qiov);
151
return ret;
152
}
153
154
--
1.8.3.1

--
2.29.2

From: Alberto Garcia <berto@igalia.com>

This patch splits do_perform_cow() into three separate functions to
read, encrypt and write the COW regions.

perform_cow() can now read both regions first, then encrypt them and
finally write them to disk. The memory allocation is also done in
this function now, using one single buffer large enough to hold both
regions.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c | 117 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 87 insertions(+), 30 deletions(-)

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210224104707.88430-5-vsementsov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/parallels.h     |   6 +-
 block/parallels-ext.c | 300 ++++++++++++++++++++++++++++++++++++++++++
 block/parallels.c     |  18 +++
 block/meson.build     |   3 +-
 4 files changed, 325 insertions(+), 2 deletions(-)
 create mode 100644 block/parallels-ext.c
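The new block/parallels-ext.c code locates the dirty bitmap by walking a chain
of feature headers inside the format-extension cluster. A rough standalone
sketch of that walk (not the driver code): the magic values come from the
patch, while the 8-byte alignment of each feature's data area and the omitted
little-endian byte swapping are assumptions made for this sketch.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define END_OF_FEATURES_MAGIC 0x0ULL
    #define DIRTY_BITMAP_MAGIC    0x20385FAE252CB34AULL

    struct feature_header {          /* mirrors ParallelsFeatureHeader */
        uint64_t magic;
        uint64_t flags;
        uint32_t data_size;
        uint32_t unused;
    } __attribute__((packed));

    void walk_features(const uint8_t *pos, size_t remaining)
    {
        while (remaining >= sizeof(struct feature_header)) {
            struct feature_header fh;
            memcpy(&fh, pos, sizeof(fh));        /* headers may be unaligned */

            if (fh.magic == END_OF_FEATURES_MAGIC) {
                break;
            }
            if (fh.magic == DIRTY_BITMAP_MAGIC) {
                printf("dirty bitmap feature, %u bytes of data\n", fh.data_size);
            }

            /* skip the header plus its data area (assumed 8-byte aligned) */
            size_t step = sizeof(fh) + (((size_t)fh.data_size + 7) & ~(size_t)7);
            if (step > remaining) {
                break;                           /* corrupted extension */
            }
            pos += step;
            remaining -= step;
        }
    }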
17
14
18
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
15
diff --git a/block/parallels.h b/block/parallels.h
19
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
20
--- a/block/qcow2-cluster.c
17
--- a/block/parallels.h
21
+++ b/block/qcow2-cluster.c
18
+++ b/block/parallels.h
22
@@ -XXX,XX +XXX,XX @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
19
@@ -XXX,XX +XXX,XX @@ typedef struct ParallelsHeader {
23
return 0;
20
uint64_t nb_sectors;
24
}
21
uint32_t inuse;
25
22
uint32_t data_off;
26
-static int coroutine_fn do_perform_cow(BlockDriverState *bs,
23
- char padding[12];
27
- uint64_t src_cluster_offset,
24
+ uint32_t flags;
28
- uint64_t cluster_offset,
25
+ uint64_t ext_off;
29
- unsigned offset_in_cluster,
26
} QEMU_PACKED ParallelsHeader;
30
- unsigned bytes)
27
31
+static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
28
typedef enum ParallelsPreallocMode {
32
+ uint64_t src_cluster_offset,
29
@@ -XXX,XX +XXX,XX @@ typedef struct BDRVParallelsState {
33
+ unsigned offset_in_cluster,
30
Error *migration_blocker;
34
+ uint8_t *buffer,
31
} BDRVParallelsState;
35
+ unsigned bytes)
32
36
{
33
+int parallels_read_format_extension(BlockDriverState *bs,
37
- BDRVQcow2State *s = bs->opaque;
34
+ int64_t ext_off, Error **errp);
38
QEMUIOVector qiov;
35
+
39
- struct iovec iov;
36
#endif
40
+ struct iovec iov = { .iov_base = buffer, .iov_len = bytes };
37
diff --git a/block/parallels-ext.c b/block/parallels-ext.c
41
int ret;
38
new file mode 100644
42
39
index XXXXXXX..XXXXXXX
43
if (bytes == 0) {
40
--- /dev/null
44
return 0;
41
+++ b/block/parallels-ext.c
45
}
42
@@ -XXX,XX +XXX,XX @@
46
43
+/*
47
- iov.iov_len = bytes;
44
+ * Support of Parallels Format Extension. It's a part of Parallels format
48
- iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
45
+ * driver.
49
- if (iov.iov_base == NULL) {
46
+ *
50
- return -ENOMEM;
47
+ * Copyright (c) 2021 Virtuozzo International GmbH
51
- }
48
+ *
52
-
49
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
53
qemu_iovec_init_external(&qiov, &iov, 1);
50
+ * of this software and associated documentation files (the "Software"), to deal
54
51
+ * in the Software without restriction, including without limitation the rights
55
BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
52
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
56
53
+ * copies of the Software, and to permit persons to whom the Software is
57
if (!bs->drv) {
54
+ * furnished to do so, subject to the following conditions:
58
- ret = -ENOMEDIUM;
55
+ *
59
- goto out;
56
+ * The above copyright notice and this permission notice shall be included in
60
+ return -ENOMEDIUM;
57
+ * all copies or substantial portions of the Software.
61
}
58
+ *
62
59
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
63
/* Call .bdrv_co_readv() directly instead of using the public block-layer
60
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
64
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn do_perform_cow(BlockDriverState *bs,
61
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
65
ret = bs->drv->bdrv_co_preadv(bs, src_cluster_offset + offset_in_cluster,
62
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
66
bytes, &qiov, 0);
63
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
67
if (ret < 0) {
64
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
68
- goto out;
65
+ * THE SOFTWARE.
69
+ return ret;
66
+ */
70
}
67
+
71
68
+#include "qemu/osdep.h"
72
- if (bs->encrypted) {
69
+#include "qapi/error.h"
73
+ return 0;
70
+#include "block/block_int.h"
71
+#include "parallels.h"
72
+#include "crypto/hash.h"
73
+#include "qemu/uuid.h"
74
+
75
+#define PARALLELS_FORMAT_EXTENSION_MAGIC 0xAB234CEF23DCEA87ULL
76
+
77
+#define PARALLELS_END_OF_FEATURES_MAGIC 0x0ULL
78
+#define PARALLELS_DIRTY_BITMAP_FEATURE_MAGIC 0x20385FAE252CB34AULL
79
+
80
+typedef struct ParallelsFormatExtensionHeader {
81
+ uint64_t magic; /* PARALLELS_FORMAT_EXTENSION_MAGIC */
82
+ uint8_t check_sum[16];
83
+} QEMU_PACKED ParallelsFormatExtensionHeader;
84
+
85
+typedef struct ParallelsFeatureHeader {
86
+ uint64_t magic;
87
+ uint64_t flags;
88
+ uint32_t data_size;
89
+ uint32_t _unused;
90
+} QEMU_PACKED ParallelsFeatureHeader;
91
+
92
+typedef struct ParallelsDirtyBitmapFeature {
93
+ uint64_t size;
94
+ uint8_t id[16];
95
+ uint32_t granularity;
96
+ uint32_t l1_size;
97
+ /* L1 table follows */
98
+} QEMU_PACKED ParallelsDirtyBitmapFeature;
99
+
100
+/* Given L1 table read bitmap data from the image and populate @bitmap */
101
+static int parallels_load_bitmap_data(BlockDriverState *bs,
102
+ const uint64_t *l1_table,
103
+ uint32_t l1_size,
104
+ BdrvDirtyBitmap *bitmap,
105
+ Error **errp)
106
+{
107
+ BDRVParallelsState *s = bs->opaque;
108
+ int ret = 0;
109
+ uint64_t offset, limit;
110
+ uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
111
+ uint8_t *buf = NULL;
112
+ uint64_t i, tab_size =
113
+ DIV_ROUND_UP(bdrv_dirty_bitmap_serialization_size(bitmap, 0, bm_size),
114
+ s->cluster_size);
115
+
116
+ if (tab_size != l1_size) {
117
+ error_setg(errp, "Bitmap table size %" PRIu32 " does not correspond "
118
+ "to bitmap size and cluster size. Expected %" PRIu64,
119
+ l1_size, tab_size);
120
+ return -EINVAL;
121
+ }
122
+
123
+ buf = qemu_blockalign(bs, s->cluster_size);
124
+ limit = bdrv_dirty_bitmap_serialization_coverage(s->cluster_size, bitmap);
125
+ for (i = 0, offset = 0; i < tab_size; ++i, offset += limit) {
126
+ uint64_t count = MIN(bm_size - offset, limit);
127
+ uint64_t entry = l1_table[i];
128
+
129
+ if (entry == 0) {
130
+ /* No need to deserialize zeros because @bitmap is cleared. */
131
+ continue;
132
+ }
133
+
134
+ if (entry == 1) {
135
+ bdrv_dirty_bitmap_deserialize_ones(bitmap, offset, count, false);
136
+ } else {
137
+ ret = bdrv_pread(bs->file, entry << BDRV_SECTOR_BITS, buf,
138
+ s->cluster_size);
139
+ if (ret < 0) {
140
+ error_setg_errno(errp, -ret,
141
+ "Failed to read bitmap data cluster");
142
+ goto finish;
143
+ }
144
+ bdrv_dirty_bitmap_deserialize_part(bitmap, buf, offset, count,
145
+ false);
146
+ }
147
+ }
148
+ ret = 0;
149
+
150
+ bdrv_dirty_bitmap_deserialize_finish(bitmap);
151
+
152
+finish:
153
+ qemu_vfree(buf);
154
+
155
+ return ret;
74
+}
156
+}
75
+
157
+
76
+static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
158
+/*
77
+ uint64_t src_cluster_offset,
159
+ * @data buffer (of @data_size size) is the Dirty bitmaps feature which
78
+ unsigned offset_in_cluster,
160
+ * consists of ParallelsDirtyBitmapFeature followed by L1 table.
79
+ uint8_t *buffer,
161
+ */
80
+ unsigned bytes)
162
+static BdrvDirtyBitmap *parallels_load_bitmap(BlockDriverState *bs,
163
+ uint8_t *data,
164
+ size_t data_size,
165
+ Error **errp)
81
+{
166
+{
82
+ if (bytes && bs->encrypted) {
167
+ int ret;
83
+ BDRVQcow2State *s = bs->opaque;
168
+ ParallelsDirtyBitmapFeature bf;
84
int64_t sector = (src_cluster_offset + offset_in_cluster)
169
+ g_autofree uint64_t *l1_table = NULL;
85
>> BDRV_SECTOR_BITS;
170
+ BdrvDirtyBitmap *bitmap;
86
assert(s->cipher);
171
+ QemuUUID uuid;
87
assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
172
+ char uuidstr[UUID_FMT_LEN + 1];
88
assert((bytes & ~BDRV_SECTOR_MASK) == 0);
173
+ int i;
89
- if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
174
+
90
+ if (qcow2_encrypt_sectors(s, sector, buffer, buffer,
175
+ if (data_size < sizeof(bf)) {
91
bytes >> BDRV_SECTOR_BITS, true, NULL) < 0) {
176
+ error_setg(errp, "Too small Bitmap Feature area in Parallels Format "
92
- ret = -EIO;
177
+ "Extension: %zu bytes, expected at least %zu bytes",
93
- goto out;
178
+ data_size, sizeof(bf));
94
+ return false;
179
+ return NULL;
95
}
180
+ }
96
}
181
+ memcpy(&bf, data, sizeof(bf));
97
+ return true;
182
+ bf.size = le64_to_cpu(bf.size);
183
+ bf.granularity = le32_to_cpu(bf.granularity) << BDRV_SECTOR_BITS;
184
+ bf.l1_size = le32_to_cpu(bf.l1_size);
185
+ data += sizeof(bf);
186
+ data_size -= sizeof(bf);
187
+
188
+ if (bf.size != bs->total_sectors) {
189
+ error_setg(errp, "Bitmap size (in sectors) %" PRId64 " differs from "
190
+ "disk size in sectors %" PRId64, bf.size, bs->total_sectors);
191
+ return NULL;
192
+ }
193
+
194
+ if (bf.l1_size * sizeof(uint64_t) > data_size) {
195
+ error_setg(errp, "Bitmaps feature corrupted: l1 table exceeds "
196
+ "extension data_size");
197
+ return NULL;
198
+ }
199
+
200
+ memcpy(&uuid, bf.id, sizeof(uuid));
201
+ qemu_uuid_unparse(&uuid, uuidstr);
202
+ bitmap = bdrv_create_dirty_bitmap(bs, bf.granularity, uuidstr, errp);
203
+ if (!bitmap) {
204
+ return NULL;
205
+ }
206
+
207
+ l1_table = g_new(uint64_t, bf.l1_size);
208
+ for (i = 0; i < bf.l1_size; i++, data += sizeof(uint64_t)) {
209
+ l1_table[i] = ldq_le_p(data);
210
+ }
211
+
212
+ ret = parallels_load_bitmap_data(bs, l1_table, bf.l1_size, bitmap, errp);
213
+ if (ret < 0) {
214
+ bdrv_release_dirty_bitmap(bitmap);
215
+ return NULL;
216
+ }
217
+
218
+ /* We support format extension only for RO parallels images. */
219
+ assert(!(bs->open_flags & BDRV_O_RDWR));
220
+ bdrv_dirty_bitmap_set_readonly(bitmap, true);
221
+
222
+ return bitmap;
98
+}
223
+}
99
+
224
+
100
+static int coroutine_fn do_perform_cow_write(BlockDriverState *bs,
225
+static int parallels_parse_format_extension(BlockDriverState *bs,
101
+ uint64_t cluster_offset,
226
+ uint8_t *ext_cluster, Error **errp)
102
+ unsigned offset_in_cluster,
103
+ uint8_t *buffer,
104
+ unsigned bytes)
105
+{
227
+{
106
+ QEMUIOVector qiov;
228
+ BDRVParallelsState *s = bs->opaque;
107
+ struct iovec iov = { .iov_base = buffer, .iov_len = bytes };
108
+ int ret;
229
+ int ret;
109
+
230
+ int remaining = s->cluster_size;
110
+ if (bytes == 0) {
231
+ uint8_t *pos = ext_cluster;
111
+ return 0;
232
+ ParallelsFormatExtensionHeader eh;
112
+ }
233
+ g_autofree uint8_t *hash = NULL;
113
+
234
+ size_t hash_len = 0;
114
+ qemu_iovec_init_external(&qiov, &iov, 1);
235
+ GSList *bitmaps = NULL, *el;
115
236
+
116
ret = qcow2_pre_write_overlap_check(bs, 0,
237
+ memcpy(&eh, pos, sizeof(eh));
117
cluster_offset + offset_in_cluster, bytes);
238
+ eh.magic = le64_to_cpu(eh.magic);
118
if (ret < 0) {
239
+ pos += sizeof(eh);
119
- goto out;
240
+ remaining -= sizeof(eh);
120
+ return ret;
241
+
121
}
242
+ if (eh.magic != PARALLELS_FORMAT_EXTENSION_MAGIC) {
122
243
+ error_setg(errp, "Wrong parallels Format Extension magic: 0x%" PRIx64
123
BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
244
+ ", expected: 0x%llx", eh.magic,
124
ret = bdrv_co_pwritev(bs->file, cluster_offset + offset_in_cluster,
245
+ PARALLELS_FORMAT_EXTENSION_MAGIC);
125
bytes, &qiov, 0);
246
+ goto fail;
126
if (ret < 0) {
247
+ }
127
- goto out;
248
+
128
+ return ret;
249
+ ret = qcrypto_hash_bytes(QCRYPTO_HASH_ALG_MD5, (char *)pos, remaining,
129
}
250
+ &hash, &hash_len, errp);
130
131
- ret = 0;
132
-out:
133
- qemu_vfree(iov.iov_base);
134
- return ret;
135
+ return 0;
136
}
137
138
139
@@ -XXX,XX +XXX,XX @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
140
BDRVQcow2State *s = bs->opaque;
141
Qcow2COWRegion *start = &m->cow_start;
142
Qcow2COWRegion *end = &m->cow_end;
143
+ unsigned buffer_size;
144
+ uint8_t *start_buffer, *end_buffer;
145
int ret;
146
147
+ assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
148
+
149
if (start->nb_bytes == 0 && end->nb_bytes == 0) {
150
return 0;
151
}
152
153
+ /* Reserve a buffer large enough to store the data from both the
154
+ * start and end COW regions. Add some padding in the middle if
155
+ * necessary to make sure that the end region is optimally aligned */
156
+ buffer_size = QEMU_ALIGN_UP(start->nb_bytes, bdrv_opt_mem_align(bs)) +
157
+ end->nb_bytes;
158
+ start_buffer = qemu_try_blockalign(bs, buffer_size);
159
+ if (start_buffer == NULL) {
160
+ return -ENOMEM;
161
+ }
162
+ /* The part of the buffer where the end region is located */
163
+ end_buffer = start_buffer + buffer_size - end->nb_bytes;
164
+
165
qemu_co_mutex_unlock(&s->lock);
166
- ret = do_perform_cow(bs, m->offset, m->alloc_offset,
167
- start->offset, start->nb_bytes);
168
+ /* First we read the existing data from both COW regions */
169
+ ret = do_perform_cow_read(bs, m->offset, start->offset,
170
+ start_buffer, start->nb_bytes);
171
if (ret < 0) {
172
goto fail;
173
}
174
175
- ret = do_perform_cow(bs, m->offset, m->alloc_offset,
176
- end->offset, end->nb_bytes);
177
+ ret = do_perform_cow_read(bs, m->offset, end->offset,
178
+ end_buffer, end->nb_bytes);
179
+ if (ret < 0) {
251
+ if (ret < 0) {
180
+ goto fail;
252
+ goto fail;
181
+ }
253
+ }
182
+
254
+
183
+ /* Encrypt the data if necessary before writing it */
255
+ if (hash_len != sizeof(eh.check_sum) ||
184
+ if (bs->encrypted) {
256
+ memcmp(hash, eh.check_sum, sizeof(eh.check_sum)) != 0) {
185
+ if (!do_perform_cow_encrypt(bs, m->offset, start->offset,
257
+ error_setg(errp, "Wrong checksum in Format Extension header. Format "
186
+ start_buffer, start->nb_bytes) ||
258
+ "extension is corrupted.");
187
+ !do_perform_cow_encrypt(bs, m->offset, end->offset,
259
+ goto fail;
188
+ end_buffer, end->nb_bytes)) {
260
+ }
189
+ ret = -EIO;
261
+
262
+ while (true) {
263
+ ParallelsFeatureHeader fh;
264
+ BdrvDirtyBitmap *bitmap;
265
+
266
+ if (remaining < sizeof(fh)) {
267
+ error_setg(errp, "Can not read feature header, as remaining bytes "
268
+ "(%d) in Format Extension is less than Feature header "
269
+ "size (%zu)", remaining, sizeof(fh));
190
+ goto fail;
270
+ goto fail;
191
+ }
271
+ }
192
+ }
272
+
193
+
273
+ memcpy(&fh, pos, sizeof(fh));
194
+ /* And now we can write everything */
274
+ pos += sizeof(fh);
195
+ ret = do_perform_cow_write(bs, m->alloc_offset, start->offset,
275
+ remaining -= sizeof(fh);
196
+ start_buffer, start->nb_bytes);
276
+
277
+ fh.magic = le64_to_cpu(fh.magic);
278
+ fh.flags = le64_to_cpu(fh.flags);
279
+ fh.data_size = le32_to_cpu(fh.data_size);
280
+
281
+ if (fh.flags) {
282
+ error_setg(errp, "Flags for extension feature are unsupported");
283
+ goto fail;
284
+ }
285
+
286
+ if (fh.data_size > remaining) {
287
+ error_setg(errp, "Feature data_size exceedes Format Extension "
288
+ "cluster");
289
+ goto fail;
290
+ }
291
+
292
+ switch (fh.magic) {
293
+ case PARALLELS_END_OF_FEATURES_MAGIC:
294
+ return 0;
295
+
296
+ case PARALLELS_DIRTY_BITMAP_FEATURE_MAGIC:
297
+ bitmap = parallels_load_bitmap(bs, pos, fh.data_size, errp);
298
+ if (!bitmap) {
299
+ goto fail;
300
+ }
301
+ bitmaps = g_slist_append(bitmaps, bitmap);
302
+ break;
303
+
304
+ default:
305
+ error_setg(errp, "Unknown feature: 0x%" PRIu64, fh.magic);
306
+ goto fail;
307
+ }
308
+
309
+ pos = ext_cluster + QEMU_ALIGN_UP(pos + fh.data_size - ext_cluster, 8);
310
+ }
311
+
312
+fail:
313
+ for (el = bitmaps; el; el = el->next) {
314
+ bdrv_release_dirty_bitmap(el->data);
315
+ }
316
+ g_slist_free(bitmaps);
317
+
318
+ return -EINVAL;
319
+}
320
+
321
+int parallels_read_format_extension(BlockDriverState *bs,
322
+ int64_t ext_off, Error **errp)
323
+{
324
+ BDRVParallelsState *s = bs->opaque;
325
+ int ret;
326
+ uint8_t *ext_cluster = qemu_blockalign(bs, s->cluster_size);
327
+
328
+ assert(ext_off > 0);
329
+
330
+ ret = bdrv_pread(bs->file, ext_off, ext_cluster, s->cluster_size);
197
+ if (ret < 0) {
331
+ if (ret < 0) {
198
+ goto fail;
332
+ error_setg_errno(errp, -ret, "Failed to read Format Extension cluster");
199
+ }
333
+ goto out;
200
334
+ }
201
+ ret = do_perform_cow_write(bs, m->alloc_offset, end->offset,
335
+
202
+ end_buffer, end->nb_bytes);
336
+ ret = parallels_parse_format_extension(bs, ext_cluster, errp);
203
fail:
337
+
204
qemu_co_mutex_lock(&s->lock);
338
+out:
205
339
+ qemu_vfree(ext_cluster);
206
@@ -XXX,XX +XXX,XX @@ fail:
340
+
207
qcow2_cache_depends_on_flush(s->l2_table_cache);
341
+ return ret;
342
+}
343
diff --git a/block/parallels.c b/block/parallels.c
344
index XXXXXXX..XXXXXXX 100644
345
--- a/block/parallels.c
346
+++ b/block/parallels.c
347
@@ -XXX,XX +XXX,XX @@
348
*/
349
350
#include "qemu/osdep.h"
351
+#include "qemu/error-report.h"
352
#include "qapi/error.h"
353
#include "block/block_int.h"
354
#include "block/qdict.h"
355
@@ -XXX,XX +XXX,XX @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
356
goto fail_options;
208
}
357
}
209
358
210
+ qemu_vfree(start_buffer);
359
+ if (ph.ext_off) {
211
return ret;
360
+ if (flags & BDRV_O_RDWR) {
212
}
361
+ /*
213
362
+ * It's unsafe to open image RW if there is an extension (as we
363
+ * don't support it). But the parallels driver in QEMU historically
364
+ * ignores the extension, so print a warning and carry on.
365
+ */
366
+ warn_report("Format Extension ignored in RW mode");
367
+ } else {
368
+ ret = parallels_read_format_extension(
369
+ bs, le64_to_cpu(ph.ext_off) << BDRV_SECTOR_BITS, errp);
370
+ if (ret < 0) {
371
+ goto fail;
372
+ }
373
+ }
374
+ }
375
+
376
if ((flags & BDRV_O_RDWR) && !(flags & BDRV_O_INACTIVE)) {
377
s->header->inuse = cpu_to_le32(HEADER_INUSE_MAGIC);
378
ret = parallels_update_header(bs);
379
diff --git a/block/meson.build b/block/meson.build
380
index XXXXXXX..XXXXXXX 100644
381
--- a/block/meson.build
382
+++ b/block/meson.build
383
@@ -XXX,XX +XXX,XX @@ block_ss.add(when: 'CONFIG_QED', if_true: files(
384
'qed-table.c',
385
'qed.c',
386
))
387
-block_ss.add(when: [libxml2, 'CONFIG_PARALLELS'], if_true: files('parallels.c'))
388
+block_ss.add(when: [libxml2, 'CONFIG_PARALLELS'],
389
+ if_true: files('parallels.c', 'parallels-ext.c'))
390
block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 'win32-aio.c'))
391
block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, iokit])
392
block_ss.add(when: libiscsi, if_true: files('iscsi-opts.c'))
214
--
393
--
215
1.8.3.1
394
2.29.2
216
395
217
396
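
As a reading aid for the parallels-ext.c hunks above: parallels_read_format_extension() loads one cluster that begins with a checksummed extension header and is followed by a sequence of feature headers, each with its own data area and 8-byte alignment between features. The sketch below reconstructs the packed, little-endian on-disk layout purely from the fields that parallels_parse_format_extension() and parallels_load_bitmap() read; the exact field order and the trailing padding field are assumptions for illustration, not taken from the Parallels specification.

    /* Illustrative layout only; in QEMU these structures are declared packed
     * and all integer fields are stored little-endian on disk. */
    #include <stdint.h>

    typedef struct ParallelsFormatExtensionHeader {
        uint64_t magic;          /* PARALLELS_FORMAT_EXTENSION_MAGIC */
        uint8_t  check_sum[16];  /* MD5 over the rest of the cluster */
    } ParallelsFormatExtensionHeader;

    typedef struct ParallelsFeatureHeader {
        uint64_t magic;          /* feature type, or END_OF_FEATURES to stop */
        uint64_t flags;          /* must be 0; unknown flags are rejected */
        uint32_t data_size;      /* bytes of feature data that follow */
        uint32_t unused;         /* assumed padding to an 8-byte multiple */
    } ParallelsFeatureHeader;

    typedef struct ParallelsDirtyBitmapFeature {
        uint64_t size;           /* disk size in sectors, must match the image */
        uint8_t  id[16];         /* UUID; its string form becomes the bitmap name */
        uint32_t granularity;    /* in sectors; shifted to bytes by the loader */
        uint32_t l1_size;        /* number of uint64_t L1 entries that follow */
        /* L1 table of bitmap-data cluster offsets follows */
    } ParallelsDirtyBitmapFeature;

The walk stops at a feature header whose magic is PARALLELS_END_OF_FEATURES_MAGIC, which is why the parsing loop above returns 0 for that case.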
1
From: Alberto Garcia <berto@igalia.com>
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
2
3
There used to be throttle_timers_{detach,attach}_aio_context() calls
3
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
4
in bdrv_set_aio_context(), but since 7ca7f0f6db1fedd28d490795d778cf239
4
Message-Id: <20210224104707.88430-6-vsementsov@virtuozzo.com>
5
they are now in blk_set_aio_context().
5
Reviewed-by: Denis V. Lunev <den@openvz.org>
6
7
Signed-off-by: Alberto Garcia <berto@igalia.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10
---
7
---
11
block/throttle-groups.c | 2 +-
8
tests/qemu-iotests/iotests.py | 10 ++++++++++
12
1 file changed, 1 insertion(+), 1 deletion(-)
9
1 file changed, 10 insertions(+)
13
10
14
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
11
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
15
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
16
--- a/block/throttle-groups.c
13
--- a/tests/qemu-iotests/iotests.py
17
+++ b/block/throttle-groups.c
14
+++ b/tests/qemu-iotests/iotests.py
18
@@ -XXX,XX +XXX,XX @@
15
@@ -XXX,XX +XXX,XX @@
19
* Again, all this is handled internally and is mostly transparent to
16
#
20
* the outside. The 'throttle_timers' field however has an additional
17
21
* constraint because it may be temporarily invalid (see for example
18
import atexit
22
- * bdrv_set_aio_context()). Therefore in this file a thread will
19
+import bz2
23
+ * blk_set_aio_context()). Therefore in this file a thread will
20
from collections import OrderedDict
24
* access some other BlockBackend's timers only after verifying that
21
import faulthandler
25
* that BlockBackend has throttled requests in the queue.
22
import io
26
*/
23
@@ -XXX,XX +XXX,XX @@
24
import logging
25
import os
26
import re
27
+import shutil
28
import signal
29
import struct
30
import subprocess
31
@@ -XXX,XX +XXX,XX @@
32
os.environ.get('IMGKEYSECRET', '')
33
luks_default_key_secret_opt = 'key-secret=keysec0'
34
35
+sample_img_dir = os.environ['SAMPLE_IMG_DIR']
36
+
37
+
38
+def unarchive_sample_image(sample, fname):
39
+ sample_fname = os.path.join(sample_img_dir, sample + '.bz2')
40
+ with bz2.open(sample_fname) as f_in, open(fname, 'wb') as f_out:
41
+ shutil.copyfileobj(f_in, f_out)
42
+
43
44
def qemu_tool_pipe_and_status(tool: str, args: Sequence[str],
45
connect_stderr: bool = True) -> Tuple[str, int]:
27
--
46
--
28
1.8.3.1
47
2.29.2
29
48
30
49
1
When qemu exits, all running jobs should be cancelled successfully.
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
This adds a test covering all types of block jobs that currently
3
exist in qemu.
4
2
3
Test support for reading bitmap from parallels image format.
4
parallels-with-bitmap.bz2 is generated on Virtuozzo by
5
parallels-with-bitmap.sh
6
7
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
8
Message-Id: <20210224104707.88430-7-vsementsov@virtuozzo.com>
9
Reviewed-by: Denis V. Lunev <den@openvz.org>
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
10
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6
Reviewed-by: Eric Blake <eblake@redhat.com>
7
---
11
---
8
tests/qemu-iotests/185 | 206 +++++++++++++++++++++++++++++++++++++++++++++
12
.../sample_images/parallels-with-bitmap.bz2 | Bin 0 -> 203 bytes
9
tests/qemu-iotests/185.out | 59 +++++++++++++
13
.../sample_images/parallels-with-bitmap.sh | 51 ++++++++++++++++
10
tests/qemu-iotests/group | 1 +
14
.../qemu-iotests/tests/parallels-read-bitmap | 55 ++++++++++++++++++
11
3 files changed, 266 insertions(+)
15
.../tests/parallels-read-bitmap.out | 6 ++
12
create mode 100755 tests/qemu-iotests/185
16
4 files changed, 112 insertions(+)
13
create mode 100644 tests/qemu-iotests/185.out
17
create mode 100644 tests/qemu-iotests/sample_images/parallels-with-bitmap.bz2
18
create mode 100755 tests/qemu-iotests/sample_images/parallels-with-bitmap.sh
19
create mode 100755 tests/qemu-iotests/tests/parallels-read-bitmap
20
create mode 100644 tests/qemu-iotests/tests/parallels-read-bitmap.out
14
21
15
diff --git a/tests/qemu-iotests/185 b/tests/qemu-iotests/185
22
diff --git a/tests/qemu-iotests/sample_images/parallels-with-bitmap.bz2 b/tests/qemu-iotests/sample_images/parallels-with-bitmap.bz2
23
new file mode 100644
24
index XXXXXXX..XXXXXXX
25
GIT binary patch
26
literal 203
27
zcmV;+05tzXT4*^jL0KkKS@=;0bpT+Hf7|^?Km<xfFyKQJ7=Y^F-%vt;00~Ysa6|-=
28
zk&7Szk`SoS002EkfMftPG<ipnsiCK}K_sNmm}me3FiZr%Oaf_u5F8kD;mB_~cxD-r
29
z5P$(X{&Tq5C`<xK02D?NNdN+t$~z$m00O|zFh^ynq*yaCtkn+NZzWom<#OEoF`?zb
30
zv(i3x^K~wt!aLPcRBP+PckUsIh6*LgjYSh0`}#7hMC9NR5D)+W0d&8Mxgwk>NPH-R
31
Fx`3oHQ9u9y
32
33
literal 0
34
HcmV?d00001
35
36
diff --git a/tests/qemu-iotests/sample_images/parallels-with-bitmap.sh b/tests/qemu-iotests/sample_images/parallels-with-bitmap.sh
16
new file mode 100755
37
new file mode 100755
17
index XXXXXXX..XXXXXXX
38
index XXXXXXX..XXXXXXX
18
--- /dev/null
39
--- /dev/null
19
+++ b/tests/qemu-iotests/185
40
+++ b/tests/qemu-iotests/sample_images/parallels-with-bitmap.sh
20
@@ -XXX,XX +XXX,XX @@
41
@@ -XXX,XX +XXX,XX @@
21
+#!/bin/bash
42
+#!/bin/bash
22
+#
43
+#
23
+# Test exiting qemu while jobs are still running
44
+# Test parallels load bitmap
24
+#
45
+#
25
+# Copyright (C) 2017 Red Hat, Inc.
46
+# Copyright (c) 2021 Virtuozzo International GmbH.
26
+#
47
+#
27
+# This program is free software; you can redistribute it and/or modify
48
+# This program is free software; you can redistribute it and/or modify
28
+# it under the terms of the GNU General Public License as published by
49
+# it under the terms of the GNU General Public License as published by
29
+# the Free Software Foundation; either version 2 of the License, or
50
+# the Free Software Foundation; either version 2 of the License, or
30
+# (at your option) any later version.
51
+# (at your option) any later version.
...
...
36
+#
57
+#
37
+# You should have received a copy of the GNU General Public License
58
+# You should have received a copy of the GNU General Public License
38
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
59
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
39
+#
60
+#
40
+
61
+
41
+# creator
62
+CT=parallels-with-bitmap-ct
42
+owner=kwolf@redhat.com
63
+DIR=$PWD/parallels-with-bitmap-dir
64
+IMG=$DIR/root.hds
65
+XML=$DIR/DiskDescriptor.xml
66
+TARGET=parallels-with-bitmap.bz2
43
+
67
+
44
+seq=`basename $0`
68
+rm -rf $DIR
45
+echo "QA output created by $seq"
46
+
69
+
47
+here=`pwd`
70
+prlctl create $CT --vmtype ct
48
+status=1 # failure is the default!
71
+prlctl set $CT --device-add hdd --image $DIR --recreate --size 2G
49
+
72
+
50
+MIG_SOCKET="${TEST_DIR}/migrate"
73
+# cleanup the image
74
+qemu-img create -f parallels $IMG 64G
51
+
75
+
52
+_cleanup()
76
+# create bitmap
53
+{
77
+prlctl backup $CT
54
+ rm -f "${TEST_IMG}.mid"
55
+ rm -f "${TEST_IMG}.copy"
56
+ _cleanup_test_img
57
+ _cleanup_qemu
58
+}
59
+trap "_cleanup; exit \$status" 0 1 2 3 15
60
+
78
+
61
+# get standard environment, filters and checks
79
+prlctl set $CT --device-del hdd1
62
+. ./common.rc
80
+prlctl destroy $CT
63
+. ./common.filter
64
+. ./common.qemu
65
+
81
+
66
+_supported_fmt qcow2
82
+dev=$(ploop mount $XML | sed -n 's/^Adding delta dev=\(\/dev\/ploop[0-9]\+\).*/\1/p')
67
+_supported_proto file
83
+dd if=/dev/zero of=$dev bs=64K seek=5 count=2 oflag=direct
68
+_supported_os Linux
84
+dd if=/dev/zero of=$dev bs=64K seek=30 count=1 oflag=direct
85
+dd if=/dev/zero of=$dev bs=64K seek=10 count=3 oflag=direct
86
+ploop umount $XML # bitmap name will be in the output
69
+
87
+
70
+size=64M
88
+bzip2 -z $IMG
71
+TEST_IMG="${TEST_IMG}.base" _make_test_img $size
72
+
89
+
73
+echo
90
+mv $IMG.bz2 $TARGET
74
+echo === Starting VM ===
75
+echo
76
+
91
+
77
+qemu_comm_method="qmp"
92
+rm -rf $DIR
93
diff --git a/tests/qemu-iotests/tests/parallels-read-bitmap b/tests/qemu-iotests/tests/parallels-read-bitmap
94
new file mode 100755
95
index XXXXXXX..XXXXXXX
96
--- /dev/null
97
+++ b/tests/qemu-iotests/tests/parallels-read-bitmap
98
@@ -XXX,XX +XXX,XX @@
99
+#!/usr/bin/env python3
100
+#
101
+# Test parallels load bitmap
102
+#
103
+# Copyright (c) 2021 Virtuozzo International GmbH.
104
+#
105
+# This program is free software; you can redistribute it and/or modify
106
+# it under the terms of the GNU General Public License as published by
107
+# the Free Software Foundation; either version 2 of the License, or
108
+# (at your option) any later version.
109
+#
110
+# This program is distributed in the hope that it will be useful,
111
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
112
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
113
+# GNU General Public License for more details.
114
+#
115
+# You should have received a copy of the GNU General Public License
116
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
117
+#
78
+
118
+
79
+_launch_qemu \
119
+import json
80
+ -drive file="${TEST_IMG}.base",cache=$CACHEMODE,driver=$IMGFMT,id=disk
120
+import iotests
81
+h=$QEMU_HANDLE
121
+from iotests import qemu_nbd_popen, qemu_img_pipe, log, file_path
82
+_send_qemu_cmd $h "{ 'execute': 'qmp_capabilities' }" 'return'
83
+
122
+
84
+echo
123
+iotests.script_initialize(supported_fmts=['parallels'])
85
+echo === Creating backing chain ===
86
+echo
87
+
124
+
88
+_send_qemu_cmd $h \
125
+nbd_sock = file_path('nbd-sock', base_dir=iotests.sock_dir)
89
+ "{ 'execute': 'blockdev-snapshot-sync',
126
+disk = iotests.file_path('disk')
90
+ 'arguments': { 'device': 'disk',
127
+bitmap = 'e4f2eed0-37fe-4539-b50b-85d2e7fd235f'
91
+ 'snapshot-file': '$TEST_IMG.mid',
128
+nbd_opts = f'driver=nbd,server.type=unix,server.path={nbd_sock}' \
92
+ 'format': '$IMGFMT',
129
+ f',x-dirty-bitmap=qemu:dirty-bitmap:{bitmap}'
93
+ 'mode': 'absolute-paths' } }" \
94
+ "return"
95
+
130
+
96
+_send_qemu_cmd $h \
97
+ "{ 'execute': 'human-monitor-command',
98
+ 'arguments': { 'command-line':
99
+ 'qemu-io disk \"write 0 4M\"' } }" \
100
+ "return"
101
+
131
+
102
+_send_qemu_cmd $h \
132
+iotests.unarchive_sample_image('parallels-with-bitmap', disk)
103
+ "{ 'execute': 'blockdev-snapshot-sync',
104
+ 'arguments': { 'device': 'disk',
105
+ 'snapshot-file': '$TEST_IMG',
106
+ 'format': '$IMGFMT',
107
+ 'mode': 'absolute-paths' } }" \
108
+ "return"
109
+
133
+
110
+echo
111
+echo === Start commit job and exit qemu ===
112
+echo
113
+
134
+
114
+# Note that the reference output intentionally includes the 'offset' field in
135
+with qemu_nbd_popen('--read-only', f'--socket={nbd_sock}',
115
+# BLOCK_JOB_CANCELLED events for all of the following block jobs. They are
136
+ f'--bitmap={bitmap}', '-f', iotests.imgfmt, disk):
116
+# predictable and any change in the offsets would hint at a bug in the job
137
+ out = qemu_img_pipe('map', '--output=json', '--image-opts', nbd_opts)
117
+# throttling code.
138
+ chunks = json.loads(out)
118
+#
139
+ cluster = 64 * 1024
119
+# In order to achieve these predictable offsets, all of the following tests
120
+# use speed=65536. Each job will perform exactly one iteration before it has
121
+# to sleep at least for a second, which is plenty of time for the 'quit' QMP
122
+# command to be received (after receiving the command, the rest runs
123
+# synchronously, so jobs can arbitrarily continue or complete).
124
+#
125
+# The buffer size for commit and streaming is 512k (waiting for 8 seconds after
126
+# the first request), for active commit and mirror it's large enough to cover
127
+# the full 4M, and for backup it's the qcow2 cluster size, which we know is
128
+# 64k. As all of these are at least as large as the speed, we are sure that the
129
+# offset doesn't advance after the first iteration before qemu exits.
130
+
140
+
131
+_send_qemu_cmd $h \
141
+ log('dirty clusters (cluster size is 64K):')
132
+ "{ 'execute': 'block-commit',
142
+ for c in chunks:
133
+ 'arguments': { 'device': 'disk',
143
+ assert c['start'] % cluster == 0
134
+ 'base':'$TEST_IMG.base',
144
+ assert c['length'] % cluster == 0
135
+ 'top': '$TEST_IMG.mid',
145
+ if c['data']:
136
+ 'speed': 65536 } }" \
146
+ continue
137
+ "return"
138
+
147
+
139
+_send_qemu_cmd $h "{ 'execute': 'quit' }" "return"
148
+ a = c['start'] // cluster
140
+wait=1 _cleanup_qemu
149
+ b = (c['start'] + c['length']) // cluster
141
+
150
+ if b - a > 1:
142
+echo
151
+ log(f'{a}-{b-1}')
143
+echo === Start active commit job and exit qemu ===
152
+ else:
144
+echo
153
+ log(a)
145
+
154
diff --git a/tests/qemu-iotests/tests/parallels-read-bitmap.out b/tests/qemu-iotests/tests/parallels-read-bitmap.out
146
+_launch_qemu \
147
+ -drive file="${TEST_IMG}",cache=$CACHEMODE,driver=$IMGFMT,id=disk
148
+h=$QEMU_HANDLE
149
+_send_qemu_cmd $h "{ 'execute': 'qmp_capabilities' }" 'return'
150
+
151
+_send_qemu_cmd $h \
152
+ "{ 'execute': 'block-commit',
153
+ 'arguments': { 'device': 'disk',
154
+ 'base':'$TEST_IMG.base',
155
+ 'speed': 65536 } }" \
156
+ "return"
157
+
158
+_send_qemu_cmd $h "{ 'execute': 'quit' }" "return"
159
+wait=1 _cleanup_qemu
160
+
161
+echo
162
+echo === Start mirror job and exit qemu ===
163
+echo
164
+
165
+_launch_qemu \
166
+ -drive file="${TEST_IMG}",cache=$CACHEMODE,driver=$IMGFMT,id=disk
167
+h=$QEMU_HANDLE
168
+_send_qemu_cmd $h "{ 'execute': 'qmp_capabilities' }" 'return'
169
+
170
+_send_qemu_cmd $h \
171
+ "{ 'execute': 'drive-mirror',
172
+ 'arguments': { 'device': 'disk',
173
+ 'target': '$TEST_IMG.copy',
174
+ 'format': '$IMGFMT',
175
+ 'sync': 'full',
176
+ 'speed': 65536 } }" \
177
+ "return"
178
+
179
+_send_qemu_cmd $h "{ 'execute': 'quit' }" "return"
180
+wait=1 _cleanup_qemu
181
+
182
+echo
183
+echo === Start backup job and exit qemu ===
184
+echo
185
+
186
+_launch_qemu \
187
+ -drive file="${TEST_IMG}",cache=$CACHEMODE,driver=$IMGFMT,id=disk
188
+h=$QEMU_HANDLE
189
+_send_qemu_cmd $h "{ 'execute': 'qmp_capabilities' }" 'return'
190
+
191
+_send_qemu_cmd $h \
192
+ "{ 'execute': 'drive-backup',
193
+ 'arguments': { 'device': 'disk',
194
+ 'target': '$TEST_IMG.copy',
195
+ 'format': '$IMGFMT',
196
+ 'sync': 'full',
197
+ 'speed': 65536 } }" \
198
+ "return"
199
+
200
+_send_qemu_cmd $h "{ 'execute': 'quit' }" "return"
201
+wait=1 _cleanup_qemu
202
+
203
+echo
204
+echo === Start streaming job and exit qemu ===
205
+echo
206
+
207
+_launch_qemu \
208
+ -drive file="${TEST_IMG}",cache=$CACHEMODE,driver=$IMGFMT,id=disk
209
+h=$QEMU_HANDLE
210
+_send_qemu_cmd $h "{ 'execute': 'qmp_capabilities' }" 'return'
211
+
212
+_send_qemu_cmd $h \
213
+ "{ 'execute': 'block-stream',
214
+ 'arguments': { 'device': 'disk',
215
+ 'speed': 65536 } }" \
216
+ "return"
217
+
218
+_send_qemu_cmd $h "{ 'execute': 'quit' }" "return"
219
+wait=1 _cleanup_qemu
220
+
221
+_check_test_img
222
+
223
+# success, all done
224
+echo "*** done"
225
+rm -f $seq.full
226
+status=0
227
diff --git a/tests/qemu-iotests/185.out b/tests/qemu-iotests/185.out
228
new file mode 100644
155
new file mode 100644
229
index XXXXXXX..XXXXXXX
156
index XXXXXXX..XXXXXXX
230
--- /dev/null
157
--- /dev/null
231
+++ b/tests/qemu-iotests/185.out
158
+++ b/tests/qemu-iotests/tests/parallels-read-bitmap.out
232
@@ -XXX,XX +XXX,XX @@
159
@@ -XXX,XX +XXX,XX @@
233
+QA output created by 185
160
+Start NBD server
234
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864
161
+dirty clusters (cluster size is 64K):
235
+
162
+5-6
236
+=== Starting VM ===
163
+10-12
237
+
164
+30
238
+{"return": {}}
165
+Kill NBD server
239
+
240
+=== Creating backing chain ===
241
+
242
+Formatting 'TEST_DIR/t.qcow2.mid', fmt=qcow2 size=67108864 backing_file=TEST_DIR/t.qcow2.base backing_fmt=qcow2 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
243
+{"return": {}}
244
+wrote 4194304/4194304 bytes at offset 0
245
+4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
246
+{"return": ""}
247
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 backing_file=TEST_DIR/t.qcow2.mid backing_fmt=qcow2 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
248
+{"return": {}}
249
+
250
+=== Start commit job and exit qemu ===
251
+
252
+{"return": {}}
253
+{"return": {}}
254
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
255
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 67108864, "offset": 524288, "speed": 65536, "type": "commit"}}
256
+
257
+=== Start active commit job and exit qemu ===
258
+
259
+{"return": {}}
260
+{"return": {}}
261
+{"return": {}}
262
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
263
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset": 4194304, "speed": 65536, "type": "commit"}}
264
+
265
+=== Start mirror job and exit qemu ===
266
+
267
+{"return": {}}
268
+Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
269
+{"return": {}}
270
+{"return": {}}
271
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
272
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset": 4194304, "speed": 65536, "type": "mirror"}}
273
+
274
+=== Start backup job and exit qemu ===
275
+
276
+{"return": {}}
277
+Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
278
+{"return": {}}
279
+{"return": {}}
280
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
281
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 67108864, "offset": 65536, "speed": 65536, "type": "backup"}}
282
+
283
+=== Start streaming job and exit qemu ===
284
+
285
+{"return": {}}
286
+{"return": {}}
287
+{"return": {}}
288
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
289
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 67108864, "offset": 524288, "speed": 65536, "type": "stream"}}
290
+No errors were found on the image.
291
+*** done
292
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
293
index XXXXXXX..XXXXXXX 100644
294
--- a/tests/qemu-iotests/group
295
+++ b/tests/qemu-iotests/group
296
@@ -XXX,XX +XXX,XX @@
297
181 rw auto migration
298
182 rw auto quick
299
183 rw auto migration
300
+185 rw auto
301
--
166
--
302
1.8.3.1
167
2.29.2
303
168
304
169
1
After _cleanup_qemu(), test cases should be able to start the next qemu
1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2
process and call _cleanup_qemu() for that one as well. For this to work
3
cleanly, we need to improve the cleanup so that the second invocation
4
doesn't try to kill the qemu instances from the first invocation a
5
second time (which would result in error messages).
6
2
3
Add new parallels-ext.c and myself as co-maintainer.
4
5
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
6
Message-Id: <20210304095151.19358-1-vsementsov@virtuozzo.com>
7
Reviewed-by: Denis V. Lunev <den@openvz.org>
7
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8
Reviewed-by: Eric Blake <eblake@redhat.com>
9
Reviewed-by: Max Reitz <mreitz@redhat.com>
10
---
9
---
11
tests/qemu-iotests/common.qemu | 3 +++
10
MAINTAINERS | 3 +++
12
1 file changed, 3 insertions(+)
11
1 file changed, 3 insertions(+)
13
12
14
diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
13
diff --git a/MAINTAINERS b/MAINTAINERS
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/tests/qemu-iotests/common.qemu
15
--- a/MAINTAINERS
17
+++ b/tests/qemu-iotests/common.qemu
16
+++ b/MAINTAINERS
18
@@ -XXX,XX +XXX,XX @@ function _cleanup_qemu()
17
@@ -XXX,XX +XXX,XX @@ F: block/dmg.c
19
rm -f "${QEMU_FIFO_IN}_${i}" "${QEMU_FIFO_OUT}_${i}"
18
parallels
20
eval "exec ${QEMU_IN[$i]}<&-" # close file descriptors
19
M: Stefan Hajnoczi <stefanha@redhat.com>
21
eval "exec ${QEMU_OUT[$i]}<&-"
20
M: Denis V. Lunev <den@openvz.org>
22
+
21
+M: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
23
+ unset QEMU_IN[$i]
22
L: qemu-block@nongnu.org
24
+ unset QEMU_OUT[$i]
23
S: Supported
25
done
24
F: block/parallels.c
26
}
25
+F: block/parallels-ext.c
26
F: docs/interop/parallels.txt
27
+T: git https://src.openvz.org/scm/~vsementsov/qemu.git parallels
28
29
qed
30
M: Stefan Hajnoczi <stefanha@redhat.com>
27
--
31
--
28
1.8.3.1
32
2.29.2
29
33
30
34
Deleted patch
1
This adds documentation for the -blockdev options that apply to all
2
nodes independent of the block driver used.
3
1
4
All options that are shared by -blockdev and -drive are now explained in
5
the section for -blockdev. The documentation of -drive mentions that all
6
-blockdev options are accepted as well.
7
8
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9
Reviewed-by: Eric Blake <eblake@redhat.com>
10
Reviewed-by: Max Reitz <mreitz@redhat.com>
11
---
12
qemu-options.hx | 108 +++++++++++++++++++++++++++++++++++++++++---------------
13
1 file changed, 79 insertions(+), 29 deletions(-)
14
15
diff --git a/qemu-options.hx b/qemu-options.hx
16
index XXXXXXX..XXXXXXX 100644
17
--- a/qemu-options.hx
18
+++ b/qemu-options.hx
19
@@ -XXX,XX +XXX,XX @@ DEF("blockdev", HAS_ARG, QEMU_OPTION_blockdev,
20
" [,read-only=on|off][,detect-zeroes=on|off|unmap]\n"
21
" [,driver specific parameters...]\n"
22
" configure a block backend\n", QEMU_ARCH_ALL)
23
+STEXI
24
+@item -blockdev @var{option}[,@var{option}[,@var{option}[,...]]]
25
+@findex -blockdev
26
+
27
+Define a new block driver node.
28
+
29
+@table @option
30
+@item Valid options for any block driver node:
31
+
32
+@table @code
33
+@item driver
34
+Specifies the block driver to use for the given node.
35
+@item node-name
36
+This defines the name of the block driver node by which it will be referenced
37
+later. The name must be unique, i.e. it must not match the name of a different
38
+block driver node, or (if you use @option{-drive} as well) the ID of a drive.
39
+
40
+If no node name is specified, it is automatically generated. The generated node
41
+name is not intended to be predictable and changes between QEMU invocations.
42
+For the top level, an explicit node name must be specified.
43
+@item read-only
44
+Open the node read-only. Guest write attempts will fail.
45
+@item cache.direct
46
+The host page cache can be avoided with @option{cache.direct=on}. This will
47
+attempt to do disk IO directly to the guest's memory. QEMU may still perform an
48
+internal copy of the data.
49
+@item cache.no-flush
50
+In case you don't care about data integrity over host failures, you can use
51
+@option{cache.no-flush=on}. This option tells QEMU that it never needs to write
52
+any data to the disk but can instead keep things in cache. If anything goes
53
+wrong, like your host losing power, the disk storage getting disconnected
54
+accidentally, etc. your image will most probably be rendered unusable.
55
+@item discard=@var{discard}
56
+@var{discard} is one of "ignore" (or "off") or "unmap" (or "on") and controls
57
+whether @code{discard} (also known as @code{trim} or @code{unmap}) requests are
58
+ignored or passed to the filesystem. Some machine types may not support
59
+discard requests.
60
+@item detect-zeroes=@var{detect-zeroes}
61
+@var{detect-zeroes} is "off", "on" or "unmap" and enables the automatic
62
+conversion of plain zero writes by the OS to driver specific optimized
63
+zero write commands. You may even choose "unmap" if @var{discard} is set
64
+to "unmap" to allow a zero write to be converted to an @code{unmap} operation.
65
+@end table
66
+
67
+@end table
68
+
69
+ETEXI
70
71
DEF("drive", HAS_ARG, QEMU_OPTION_drive,
72
"-drive [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n"
73
@@ -XXX,XX +XXX,XX @@ STEXI
74
@item -drive @var{option}[,@var{option}[,@var{option}[,...]]]
75
@findex -drive
76
77
-Define a new drive. Valid options are:
78
+Define a new drive. This includes creating a block driver node (the backend) as
79
+well as a guest device, and is mostly a shortcut for defining the corresponding
80
+@option{-blockdev} and @option{-device} options.
81
+
82
+@option{-drive} accepts all options that are accepted by @option{-blockdev}. In
83
+addition, it knows the following options:
84
85
@table @option
86
@item file=@var{file}
87
@@ -XXX,XX +XXX,XX @@ These options have the same definition as they have in @option{-hdachs}.
88
@var{snapshot} is "on" or "off" and controls snapshot mode for the given drive
89
(see @option{-snapshot}).
90
@item cache=@var{cache}
91
-@var{cache} is "none", "writeback", "unsafe", "directsync" or "writethrough" and controls how the host cache is used to access block data.
92
+@var{cache} is "none", "writeback", "unsafe", "directsync" or "writethrough"
93
+and controls how the host cache is used to access block data. This is a
94
+shortcut that sets the @option{cache.direct} and @option{cache.no-flush}
95
+options (as in @option{-blockdev}), and additionally @option{cache.writeback},
96
+which provides a default for the @option{write-cache} option of block guest
97
+devices (as in @option{-device}). The modes correspond to the following
98
+settings:
99
+
100
+@c Our texi2pod.pl script doesn't support @multitable, so fall back to using
101
+@c plain ASCII art (well, UTF-8 art really). This looks okay both in the manpage
102
+@c and the HTML output.
103
+@example
104
+@ │ cache.writeback cache.direct cache.no-flush
105
+─────────────┼─────────────────────────────────────────────────
106
+writeback │ on off off
107
+none │ on on off
108
+writethrough │ off off off
109
+directsync │ off on off
110
+unsafe │ on off on
111
+@end example
112
+
113
+The default mode is @option{cache=writeback}.
114
+
115
@item aio=@var{aio}
116
@var{aio} is "threads", or "native" and selects between pthread based disk I/O and native Linux AIO.
117
-@item discard=@var{discard}
118
-@var{discard} is one of "ignore" (or "off") or "unmap" (or "on") and controls whether @dfn{discard} (also known as @dfn{trim} or @dfn{unmap}) requests are ignored or passed to the filesystem. Some machine types may not support discard requests.
119
@item format=@var{format}
120
Specify which disk @var{format} will be used rather than detecting
121
the format. Can be used to specify format=raw to avoid interpreting
122
@@ -XXX,XX +XXX,XX @@ Specify which @var{action} to take on write and read errors. Valid actions are:
123
"report" (report the error to the guest), "enospc" (pause QEMU only if the
124
host disk is full; report the error to the guest otherwise).
125
The default setting is @option{werror=enospc} and @option{rerror=report}.
126
-@item readonly
127
-Open drive @option{file} as read-only. Guest write attempts will fail.
128
@item copy-on-read=@var{copy-on-read}
129
@var{copy-on-read} is "on" or "off" and enables whether to copy read backing
130
file sectors into the image file.
131
-@item detect-zeroes=@var{detect-zeroes}
132
-@var{detect-zeroes} is "off", "on" or "unmap" and enables the automatic
133
-conversion of plain zero writes by the OS to driver specific optimized
134
-zero write commands. You may even choose "unmap" if @var{discard} is set
135
-to "unmap" to allow a zero write to be converted to an UNMAP operation.
136
@item bps=@var{b},bps_rd=@var{r},bps_wr=@var{w}
137
Specify bandwidth throttling limits in bytes per second, either for all request
138
types or for reads or writes only. Small values can lead to timeouts or hangs
139
@@ -XXX,XX +XXX,XX @@ prevent guests from circumventing throttling limits by using many small disks
140
instead of a single larger disk.
141
@end table
142
143
-By default, the @option{cache=writeback} mode is used. It will report data
144
+By default, the @option{cache.writeback=on} mode is used. It will report data
145
writes as completed as soon as the data is present in the host page cache.
146
This is safe as long as your guest OS makes sure to correctly flush disk caches
147
where needed. If your guest OS does not handle volatile disk write caches
148
correctly and your host crashes or loses power, then the guest may experience
149
data corruption.
150
151
-For such guests, you should consider using @option{cache=writethrough}. This
152
+For such guests, you should consider using @option{cache.writeback=off}. This
153
means that the host page cache will be used to read and write data, but write
154
notification will be sent to the guest only after QEMU has made sure to flush
155
each write to the disk. Be aware that this has a major impact on performance.
156
157
-The host page cache can be avoided entirely with @option{cache=none}. This will
158
-attempt to do disk IO directly to the guest's memory. QEMU may still perform
159
-an internal copy of the data. Note that this is considered a writeback mode and
160
-the guest OS must handle the disk write cache correctly in order to avoid data
161
-corruption on host crashes.
162
-
163
-The host page cache can be avoided while only sending write notifications to
164
-the guest when the data has been flushed to the disk using
165
-@option{cache=directsync}.
166
-
167
-In case you don't care about data integrity over host failures, use
168
-@option{cache=unsafe}. This option tells QEMU that it never needs to write any
169
-data to the disk but can instead keep things in cache. If anything goes wrong,
170
-like your host losing power, the disk storage getting disconnected accidentally,
171
-etc. your image will most probably be rendered unusable. When using
172
-the @option{-snapshot} option, unsafe caching is always used.
173
+When using the @option{-snapshot} option, unsafe caching is always used.
174
175
Copy-on-read avoids accessing the same backing file sectors repeatedly and is
176
useful when the backing file is over a slow network. By default copy-on-read
177
--
178
1.8.3.1
179
180
Deleted patch
1
This documents the driver-specific options for the raw, qcow2 and file
2
block drivers for the man page. For everything else, we refer to the
3
QAPI documentation.
4
1
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6
Reviewed-by: Eric Blake <eblake@redhat.com>
7
Reviewed-by: Max Reitz <mreitz@redhat.com>
8
---
9
qemu-options.hx | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
10
1 file changed, 114 insertions(+), 1 deletion(-)
11
12
diff --git a/qemu-options.hx b/qemu-options.hx
13
index XXXXXXX..XXXXXXX 100644
14
--- a/qemu-options.hx
15
+++ b/qemu-options.hx
16
@@ -XXX,XX +XXX,XX @@ STEXI
17
@item -blockdev @var{option}[,@var{option}[,@var{option}[,...]]]
18
@findex -blockdev
19
20
-Define a new block driver node.
21
+Define a new block driver node. Some of the options apply to all block drivers,
22
+other options are only accepted for a specific block driver. See below for a
23
+list of generic options and options for the most common block drivers.
24
+
25
+Options that expect a reference to another node (e.g. @code{file}) can be
26
+given in two ways. Either you specify the node name of an already existing node
27
+(file=@var{node-name}), or you define a new node inline, adding options
28
+for the referenced node after a dot (file.filename=@var{path},file.aio=native).
29
+
30
+A block driver node created with @option{-blockdev} can be used for a guest
31
+device by specifying its node name for the @code{drive} property in a
32
+@option{-device} argument that defines a block device.
33
34
@table @option
35
@item Valid options for any block driver node:
36
@@ -XXX,XX +XXX,XX @@ zero write commands. You may even choose "unmap" if @var{discard} is set
37
to "unmap" to allow a zero write to be converted to an @code{unmap} operation.
38
@end table
39
40
+@item Driver-specific options for @code{file}
41
+
42
+This is the protocol-level block driver for accessing regular files.
43
+
44
+@table @code
45
+@item filename
46
+The path to the image file in the local filesystem
47
+@item aio
48
+Specifies the AIO backend (threads/native, default: threads)
49
+@end table
50
+Example:
51
+@example
52
+-blockdev driver=file,node-name=disk,filename=disk.img
53
+@end example
54
+
55
+@item Driver-specific options for @code{raw}
56
+
57
+This is the image format block driver for raw images. It is usually
58
+stacked on top of a protocol level block driver such as @code{file}.
59
+
60
+@table @code
61
+@item file
62
+Reference to or definition of the data source block driver node
63
+(e.g. a @code{file} driver node)
64
+@end table
65
+Example 1:
66
+@example
67
+-blockdev driver=file,node-name=disk_file,filename=disk.img
68
+-blockdev driver=raw,node-name=disk,file=disk_file
69
+@end example
70
+Example 2:
71
+@example
72
+-blockdev driver=raw,node-name=disk,file.driver=file,file.filename=disk.img
73
+@end example
74
+
75
+@item Driver-specific options for @code{qcow2}
76
+
77
+This is the image format block driver for qcow2 images. It is usually
78
+stacked on top of a protocol level block driver such as @code{file}.
79
+
80
+@table @code
81
+@item file
82
+Reference to or definition of the data source block driver node
83
+(e.g. a @code{file} driver node)
84
+
85
+@item backing
86
+Reference to or definition of the backing file block device (default is taken
87
+from the image file). It is allowed to pass an empty string here in order to
88
+disable the default backing file.
89
+
90
+@item lazy-refcounts
91
+Whether to enable the lazy refcounts feature (on/off; default is taken from the
92
+image file)
93
+
94
+@item cache-size
95
+The maximum total size of the L2 table and refcount block caches in bytes
96
+(default: 1048576 bytes or 8 clusters, whichever is larger)
97
+
98
+@item l2-cache-size
99
+The maximum size of the L2 table cache in bytes
100
+(default: 4/5 of the total cache size)
101
+
102
+@item refcount-cache-size
103
+The maximum size of the refcount block cache in bytes
104
+(default: 1/5 of the total cache size)
105
+
106
+@item cache-clean-interval
107
+Clean unused entries in the L2 and refcount caches. The interval is in seconds.
108
+The default value is 0 and it disables this feature.
109
+
110
+@item pass-discard-request
111
+Whether discard requests to the qcow2 device should be forwarded to the data
112
+source (on/off; default: on if discard=unmap is specified, off otherwise)
113
+
114
+@item pass-discard-snapshot
115
+Whether discard requests for the data source should be issued when a snapshot
116
+operation (e.g. deleting a snapshot) frees clusters in the qcow2 file (on/off;
117
+default: on)
118
+
119
+@item pass-discard-other
120
+Whether discard requests for the data source should be issued on other
121
+occasions where a cluster gets freed (on/off; default: off)
122
+
123
+@item overlap-check
124
+Which overlap checks to perform for writes to the image
125
+(none/constant/cached/all; default: cached). For details or finer
126
+granularity control refer to the QAPI documentation of @code{blockdev-add}.
127
+@end table
128
+
129
+Example 1:
130
+@example
131
+-blockdev driver=file,node-name=my_file,filename=/tmp/disk.qcow2
132
+-blockdev driver=qcow2,node-name=hda,file=my_file,overlap-check=none,cache-size=16777216
133
+@end example
134
+Example 2:
135
+@example
136
+-blockdev driver=qcow2,node-name=disk,file.driver=http,file.filename=http://example.com/image.qcow2
137
+@end example
138
+
139
+@item Driver-specific options for other drivers
140
+Please refer to the QAPI documentation of the @code{blockdev-add} QMP command.
141
+
142
@end table
143
144
ETEXI
145
--
146
1.8.3.1
147
148
Deleted patch
1
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
3
---
4
block/qed-cluster.c | 39 ++++++++++++++++++++++-----------------
5
block/qed.c | 24 +++++++++++-------------
6
block/qed.h | 4 ++--
7
3 files changed, 35 insertions(+), 32 deletions(-)
8
1
9
diff --git a/block/qed-cluster.c b/block/qed-cluster.c
10
index XXXXXXX..XXXXXXX 100644
11
--- a/block/qed-cluster.c
12
+++ b/block/qed-cluster.c
13
@@ -XXX,XX +XXX,XX @@ static unsigned int qed_count_contiguous_clusters(BDRVQEDState *s,
14
* @s: QED state
15
* @request: L2 cache entry
16
* @pos: Byte position in device
17
- * @len: Number of bytes
18
- * @cb: Completion function
19
- * @opaque: User data for completion function
20
+ * @len: Number of bytes (may be shortened on return)
21
+ * @img_offset: Contains offset in the image file on success
22
*
23
* This function translates a position in the block device to an offset in the
24
- * image file. It invokes the cb completion callback to report back the
25
- * translated offset or unallocated range in the image file.
26
+ * image file. The translated offset or unallocated range in the image file is
27
+ * reported back in *img_offset and *len.
28
*
29
* If the L2 table exists, request->l2_table points to the L2 table cache entry
30
* and the caller must free the reference when they are finished. The cache
31
* entry is exposed in this way to avoid callers having to read the L2 table
32
* again later during request processing. If request->l2_table is non-NULL it
33
* will be unreferenced before taking on the new cache entry.
34
+ *
35
+ * On success QED_CLUSTER_FOUND is returned and img_offset/len are a contiguous
36
+ * range in the image file.
37
+ *
38
+ * On failure QED_CLUSTER_L2 or QED_CLUSTER_L1 is returned for missing L2 or L1
39
+ * table offset, respectively. len is number of contiguous unallocated bytes.
40
*/
41
-void qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
42
- size_t len, QEDFindClusterFunc *cb, void *opaque)
43
+int qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
44
+ size_t *len, uint64_t *img_offset)
45
{
46
uint64_t l2_offset;
47
uint64_t offset = 0;
48
@@ -XXX,XX +XXX,XX @@ void qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
49
/* Limit length to L2 boundary. Requests are broken up at the L2 boundary
50
* so that a request acts on one L2 table at a time.
51
*/
52
- len = MIN(len, (((pos >> s->l1_shift) + 1) << s->l1_shift) - pos);
53
+ *len = MIN(*len, (((pos >> s->l1_shift) + 1) << s->l1_shift) - pos);
54
55
l2_offset = s->l1_table->offsets[qed_l1_index(s, pos)];
56
if (qed_offset_is_unalloc_cluster(l2_offset)) {
57
- cb(opaque, QED_CLUSTER_L1, 0, len);
58
- return;
59
+ *img_offset = 0;
60
+ return QED_CLUSTER_L1;
61
}
62
if (!qed_check_table_offset(s, l2_offset)) {
63
- cb(opaque, -EINVAL, 0, 0);
64
- return;
65
+ *img_offset = *len = 0;
66
+ return -EINVAL;
67
}
68
69
ret = qed_read_l2_table(s, request, l2_offset);
70
@@ -XXX,XX +XXX,XX @@ void qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
71
}
72
73
index = qed_l2_index(s, pos);
74
- n = qed_bytes_to_clusters(s,
75
- qed_offset_into_cluster(s, pos) + len);
76
+ n = qed_bytes_to_clusters(s, qed_offset_into_cluster(s, pos) + *len);
77
n = qed_count_contiguous_clusters(s, request->l2_table->table,
78
index, n, &offset);
79
80
@@ -XXX,XX +XXX,XX @@ void qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
81
ret = -EINVAL;
82
}
83
84
- len = MIN(len,
85
- n * s->header.cluster_size - qed_offset_into_cluster(s, pos));
86
+ *len = MIN(*len,
87
+ n * s->header.cluster_size - qed_offset_into_cluster(s, pos));
88
89
out:
90
- cb(opaque, ret, offset, len);
91
+ *img_offset = offset;
92
qed_release(s);
93
+ return ret;
94
}
95
diff --git a/block/qed.c b/block/qed.c
96
index XXXXXXX..XXXXXXX 100644
97
--- a/block/qed.c
98
+++ b/block/qed.c
99
@@ -XXX,XX +XXX,XX @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
100
.file = file,
101
};
102
QEDRequest request = { .l2_table = NULL };
103
+ uint64_t offset;
104
+ int ret;
105
106
- qed_find_cluster(s, &request, cb.pos, len, qed_is_allocated_cb, &cb);
107
+ ret = qed_find_cluster(s, &request, cb.pos, &len, &offset);
108
+ qed_is_allocated_cb(&cb, ret, offset, len);
109
110
- /* Now sleep if the callback wasn't invoked immediately */
111
- while (cb.status == BDRV_BLOCK_OFFSET_MASK) {
112
- cb.co = qemu_coroutine_self();
113
- qemu_coroutine_yield();
114
- }
115
+ /* The callback was invoked immediately */
116
+ assert(cb.status != BDRV_BLOCK_OFFSET_MASK);
117
118
qed_unref_l2_cache_entry(request.l2_table);
119
120
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
121
* or -errno
122
* @offset: Cluster offset in bytes
123
* @len: Length in bytes
124
- *
125
- * Callback from qed_find_cluster().
126
*/
127
static void qed_aio_write_data(void *opaque, int ret,
128
uint64_t offset, size_t len)
129
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_data(void *opaque, int ret,
130
* or -errno
131
* @offset: Cluster offset in bytes
132
* @len: Length in bytes
133
- *
134
- * Callback from qed_find_cluster().
135
*/
136
static void qed_aio_read_data(void *opaque, int ret,
137
uint64_t offset, size_t len)
138
@@ -XXX,XX +XXX,XX @@ static void qed_aio_next_io(QEDAIOCB *acb, int ret)
139
BDRVQEDState *s = acb_to_s(acb);
140
QEDFindClusterFunc *io_fn = (acb->flags & QED_AIOCB_WRITE) ?
141
qed_aio_write_data : qed_aio_read_data;
142
+ uint64_t offset;
143
+ size_t len;
144
145
trace_qed_aio_next_io(s, acb, ret, acb->cur_pos + acb->cur_qiov.size);
146
147
@@ -XXX,XX +XXX,XX @@ static void qed_aio_next_io(QEDAIOCB *acb, int ret)
148
}
149
150
/* Find next cluster and start I/O */
151
- qed_find_cluster(s, &acb->request,
152
- acb->cur_pos, acb->end_pos - acb->cur_pos,
153
- io_fn, acb);
154
+ len = acb->end_pos - acb->cur_pos;
155
+ ret = qed_find_cluster(s, &acb->request, acb->cur_pos, &len, &offset);
156
+ io_fn(acb, ret, offset, len);
157
}
158
159
static BlockAIOCB *qed_aio_setup(BlockDriverState *bs,
160
diff --git a/block/qed.h b/block/qed.h
161
index XXXXXXX..XXXXXXX 100644
162
--- a/block/qed.h
163
+++ b/block/qed.h
164
@@ -XXX,XX +XXX,XX @@ int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
165
/**
166
* Cluster functions
167
*/
168
-void qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
169
- size_t len, QEDFindClusterFunc *cb, void *opaque);
170
+int qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
171
+ size_t *len, uint64_t *img_offset);
172
173
/**
174
* Consistency check
175
--
176
1.8.3.1
177
178
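
For the qed_find_cluster() conversion above, the new synchronous contract (status return value, *len shortened in place, image offset reported through *img_offset) is easiest to see from the caller's side. Below is a minimal sketch following the pattern of the two call sites converted in the patch; the wrapper function name is made up for illustration.

    /* Hypothetical caller of the now-synchronous qed_find_cluster();
     * the declarations come from block/qed.h in the QEMU tree. */
    #include "qed.h"

    static int example_map_range(BDRVQEDState *s, QEDRequest *request,
                                 uint64_t pos, size_t len)
    {
        uint64_t img_offset;
        int ret = qed_find_cluster(s, request, pos, &len, &img_offset);

        if (ret == QED_CLUSTER_FOUND) {
            /* len was clamped to a contiguous allocated range at img_offset */
        } else if (ret == QED_CLUSTER_L1 || ret == QED_CLUSTER_L2) {
            /* missing L1/L2 entry: len covers a contiguous unallocated range */
        }
        /* negative values are errors, e.g. -EINVAL for a bad table offset */
        return ret;
    }

Compared with the old callback style, the caller now invokes the completion logic directly with the returned values instead of passing a QEDFindClusterFunc, and, as before, it remains responsible for releasing the cached L2 table reference left in request->l2_table.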
Deleted patch
1
With this change, qed_aio_write_prefill() and qed_aio_write_postfill()
2
collapse into a single function. This is reflected by a rename of the
3
combined function to qed_aio_write_cow().
4
1
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6
Reviewed-by: Eric Blake <eblake@redhat.com>
7
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
8
---
9
block/qed.c | 57 +++++++++++++++++++++++----------------------------------
10
1 file changed, 23 insertions(+), 34 deletions(-)
11
12
diff --git a/block/qed.c b/block/qed.c
13
index XXXXXXX..XXXXXXX 100644
14
--- a/block/qed.c
15
+++ b/block/qed.c
16
@@ -XXX,XX +XXX,XX @@ static int qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
17
* @pos: Byte position in device
18
* @len: Number of bytes
19
* @offset: Byte offset in image file
20
- * @cb: Completion function
21
- * @opaque: User data for completion function
22
*/
23
-static void qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
24
- uint64_t len, uint64_t offset,
25
- BlockCompletionFunc *cb,
26
- void *opaque)
27
+static int qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
28
+ uint64_t len, uint64_t offset)
29
{
30
QEMUIOVector qiov;
31
QEMUIOVector *backing_qiov = NULL;
32
@@ -XXX,XX +XXX,XX @@ static void qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
33
34
/* Skip copy entirely if there is no work to do */
35
if (len == 0) {
36
- cb(opaque, 0);
37
- return;
38
+ return 0;
39
}
40
41
iov = (struct iovec) {
42
@@ -XXX,XX +XXX,XX @@ static void qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
43
ret = 0;
44
out:
45
qemu_vfree(iov.iov_base);
46
- cb(opaque, ret);
47
+ return ret;
48
}
49
50
/**
51
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_main(void *opaque, int ret)
52
}
53
54
/**
55
- * Populate back untouched region of new data cluster
56
+ * Populate untouched regions of new data cluster
57
*/
58
-static void qed_aio_write_postfill(void *opaque, int ret)
59
+static void qed_aio_write_cow(void *opaque, int ret)
60
{
61
QEDAIOCB *acb = opaque;
62
BDRVQEDState *s = acb_to_s(acb);
63
- uint64_t start = acb->cur_pos + acb->cur_qiov.size;
64
- uint64_t len =
65
- qed_start_of_cluster(s, start + s->header.cluster_size - 1) - start;
66
- uint64_t offset = acb->cur_cluster +
67
- qed_offset_into_cluster(s, acb->cur_pos) +
68
- acb->cur_qiov.size;
69
+ uint64_t start, len, offset;
70
+
71
+ /* Populate front untouched region of new data cluster */
72
+ start = qed_start_of_cluster(s, acb->cur_pos);
73
+ len = qed_offset_into_cluster(s, acb->cur_pos);
74
75
+ trace_qed_aio_write_prefill(s, acb, start, len, acb->cur_cluster);
76
+ ret = qed_copy_from_backing_file(s, start, len, acb->cur_cluster);
77
if (ret) {
78
qed_aio_complete(acb, ret);
79
return;
80
}
81
82
- trace_qed_aio_write_postfill(s, acb, start, len, offset);
83
- qed_copy_from_backing_file(s, start, len, offset,
84
- qed_aio_write_main, acb);
85
-}
86
+ /* Populate back untouched region of new data cluster */
87
+ start = acb->cur_pos + acb->cur_qiov.size;
88
+ len = qed_start_of_cluster(s, start + s->header.cluster_size - 1) - start;
89
+ offset = acb->cur_cluster +
90
+ qed_offset_into_cluster(s, acb->cur_pos) +
91
+ acb->cur_qiov.size;
92
93
-/**
94
- * Populate front untouched region of new data cluster
95
- */
96
-static void qed_aio_write_prefill(void *opaque, int ret)
97
-{
98
- QEDAIOCB *acb = opaque;
99
- BDRVQEDState *s = acb_to_s(acb);
100
- uint64_t start = qed_start_of_cluster(s, acb->cur_pos);
101
- uint64_t len = qed_offset_into_cluster(s, acb->cur_pos);
102
+ trace_qed_aio_write_postfill(s, acb, start, len, offset);
103
+ ret = qed_copy_from_backing_file(s, start, len, offset);
104
105
- trace_qed_aio_write_prefill(s, acb, start, len, acb->cur_cluster);
106
- qed_copy_from_backing_file(s, start, len, acb->cur_cluster,
107
- qed_aio_write_postfill, acb);
108
+ qed_aio_write_main(acb, ret);
109
}
110
111
/**
112
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
113
114
cb = qed_aio_write_zero_cluster;
115
} else {
116
- cb = qed_aio_write_prefill;
117
+ cb = qed_aio_write_cow;
118
acb->cur_cluster = qed_alloc_clusters(s, acb->cur_nclusters);
119
}
120
121
--
122
1.8.3.1
123
124
Deleted patch
1
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
3
---
4
block/qed-table.c | 47 ++++++++++++-----------------------------------
5
block/qed.c | 12 +++++++-----
6
block/qed.h | 8 +++-----
7
3 files changed, 22 insertions(+), 45 deletions(-)
8
1
9
diff --git a/block/qed-table.c b/block/qed-table.c
10
index XXXXXXX..XXXXXXX 100644
11
--- a/block/qed-table.c
12
+++ b/block/qed-table.c
13
@@ -XXX,XX +XXX,XX @@ out:
14
* @index: Index of first element
15
* @n: Number of elements
16
* @flush: Whether or not to sync to disk
17
- * @cb: Completion function
18
- * @opaque: Argument for completion function
19
*/
20
-static void qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
21
- unsigned int index, unsigned int n, bool flush,
22
- BlockCompletionFunc *cb, void *opaque)
23
+static int qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
24
+ unsigned int index, unsigned int n, bool flush)
25
{
26
unsigned int sector_mask = BDRV_SECTOR_SIZE / sizeof(uint64_t) - 1;
27
unsigned int start, end, i;
28
@@ -XXX,XX +XXX,XX @@ static void qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
29
ret = 0;
30
out:
31
qemu_vfree(new_table);
32
- cb(opaque, ret);
33
-}
34
-
35
-/**
36
- * Propagate return value from async callback
37
- */
38
-static void qed_sync_cb(void *opaque, int ret)
39
-{
40
- *(int *)opaque = ret;
41
+ return ret;
42
}
43
44
int qed_read_l1_table_sync(BDRVQEDState *s)
45
@@ -XXX,XX +XXX,XX @@ int qed_read_l1_table_sync(BDRVQEDState *s)
46
return qed_read_table(s, s->header.l1_table_offset, s->l1_table);
47
}
48
49
-void qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n,
50
- BlockCompletionFunc *cb, void *opaque)
51
+int qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n)
52
{
53
BLKDBG_EVENT(s->bs->file, BLKDBG_L1_UPDATE);
54
- qed_write_table(s, s->header.l1_table_offset,
55
- s->l1_table, index, n, false, cb, opaque);
56
+ return qed_write_table(s, s->header.l1_table_offset,
57
+ s->l1_table, index, n, false);
58
}
59
60
int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
61
unsigned int n)
62
{
63
- int ret = -EINPROGRESS;
64
-
65
- qed_write_l1_table(s, index, n, qed_sync_cb, &ret);
66
- BDRV_POLL_WHILE(s->bs, ret == -EINPROGRESS);
67
-
68
- return ret;
69
+ return qed_write_l1_table(s, index, n);
70
}
71
72
int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
73
@@ -XXX,XX +XXX,XX @@ int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset
74
return qed_read_l2_table(s, request, offset);
75
}
76
77
-void qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
78
- unsigned int index, unsigned int n, bool flush,
79
- BlockCompletionFunc *cb, void *opaque)
80
+int qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
81
+ unsigned int index, unsigned int n, bool flush)
82
{
83
BLKDBG_EVENT(s->bs->file, BLKDBG_L2_UPDATE);
84
- qed_write_table(s, request->l2_table->offset,
85
- request->l2_table->table, index, n, flush, cb, opaque);
86
+ return qed_write_table(s, request->l2_table->offset,
87
+ request->l2_table->table, index, n, flush);
88
}
89
90
int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
91
unsigned int index, unsigned int n, bool flush)
92
{
93
- int ret = -EINPROGRESS;
94
-
95
- qed_write_l2_table(s, request, index, n, flush, qed_sync_cb, &ret);
96
- BDRV_POLL_WHILE(s->bs, ret == -EINPROGRESS);
97
-
98
- return ret;
99
+ return qed_write_l2_table(s, request, index, n, flush);
100
}
101
diff --git a/block/qed.c b/block/qed.c
102
index XXXXXXX..XXXXXXX 100644
103
--- a/block/qed.c
104
+++ b/block/qed.c
105
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_l1_update(void *opaque, int ret)
106
index = qed_l1_index(s, acb->cur_pos);
107
s->l1_table->offsets[index] = acb->request.l2_table->offset;
108
109
- qed_write_l1_table(s, index, 1, qed_commit_l2_update, acb);
110
+ ret = qed_write_l1_table(s, index, 1);
111
+ qed_commit_l2_update(acb, ret);
112
}
113
114
/**
115
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_l2_update(QEDAIOCB *acb, int ret, uint64_t offset)
116
117
if (need_alloc) {
118
/* Write out the whole new L2 table */
119
- qed_write_l2_table(s, &acb->request, 0, s->table_nelems, true,
120
- qed_aio_write_l1_update, acb);
121
+ ret = qed_write_l2_table(s, &acb->request, 0, s->table_nelems, true);
122
+ qed_aio_write_l1_update(acb, ret);
123
} else {
124
/* Write out only the updated part of the L2 table */
125
- qed_write_l2_table(s, &acb->request, index, acb->cur_nclusters, false,
126
- qed_aio_next_io_cb, acb);
127
+ ret = qed_write_l2_table(s, &acb->request, index, acb->cur_nclusters,
128
+ false);
129
+ qed_aio_next_io(acb, ret);
130
}
131
return;
132
133
diff --git a/block/qed.h b/block/qed.h
134
index XXXXXXX..XXXXXXX 100644
135
--- a/block/qed.h
136
+++ b/block/qed.h
137
@@ -XXX,XX +XXX,XX @@ void qed_commit_l2_cache_entry(L2TableCache *l2_cache, CachedL2Table *l2_table);
138
* Table I/O functions
139
*/
140
int qed_read_l1_table_sync(BDRVQEDState *s);
141
-void qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n,
142
- BlockCompletionFunc *cb, void *opaque);
143
+int qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n);
144
int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
145
unsigned int n);
146
int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
147
uint64_t offset);
148
int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset);
149
-void qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
150
- unsigned int index, unsigned int n, bool flush,
151
- BlockCompletionFunc *cb, void *opaque);
152
+int qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
153
+ unsigned int index, unsigned int n, bool flush);
154
int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
155
unsigned int index, unsigned int n, bool flush);
156
157
--
158
1.8.3.1
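The conversion above follows one pattern throughout: a function that used to report its status through a completion callback now simply returns the status, so the synchronous wrapper that smuggled the result out via qed_sync_cb() and polled for it collapses into a direct call. A standalone sketch of that shape, with made-up names and an invented error value; no QEMU APIs or event loop are involved.

#include <stdio.h>

/* Toy stand-ins, not QEMU APIs */
typedef void CompletionFunc(void *opaque, int ret);

/* Before: the worker reports its status through a completion callback. */
static void write_table_async(int fail, CompletionFunc *cb, void *opaque)
{
    int ret = fail ? -5 : 0;        /* pretend -EIO on failure */
    cb(opaque, ret);                /* caller continues inside the callback */
}

/* The old synchronous wrapper had to smuggle the result out of the callback. */
static void sync_cb(void *opaque, int ret)
{
    *(int *)opaque = ret;
}

static int write_table_sync_old(int fail)
{
    int ret = -1;                   /* sentinel, like -EINPROGRESS in the patch */
    write_table_async(fail, sync_cb, &ret);
    /* the real code polled the event loop here until ret changed */
    return ret;
}

/* After: the worker simply returns its status; the wrapper collapses. */
static int write_table(int fail)
{
    return fail ? -5 : 0;
}

static int write_table_sync_new(int fail)
{
    return write_table(fail);
}

int main(void)
{
    printf("old: %d, new: %d\n", write_table_sync_old(1), write_table_sync_new(1));
    return 0;
}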
Deleted patch
1
Note that this code is generally not running in coroutine context, so
2
this is an actual blocking synchronous operation. We'll fix this in a
3
moment.
4
1
5
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
8
block/qed.c | 61 +++++++++++++++++++------------------------------------------
9
1 file changed, 19 insertions(+), 42 deletions(-)
10
11
diff --git a/block/qed.c b/block/qed.c
12
index XXXXXXX..XXXXXXX 100644
13
--- a/block/qed.c
14
+++ b/block/qed.c
15
@@ -XXX,XX +XXX,XX @@ static void qed_aio_start_io(QEDAIOCB *acb)
16
qed_aio_next_io(acb, 0);
17
}
18
19
-static void qed_aio_next_io_cb(void *opaque, int ret)
20
-{
21
- QEDAIOCB *acb = opaque;
22
-
23
- qed_aio_next_io(acb, ret);
24
-}
25
-
26
static void qed_plug_allocating_write_reqs(BDRVQEDState *s)
27
{
28
assert(!s->allocating_write_reqs_plugged);
29
@@ -XXX,XX +XXX,XX @@ err:
30
qed_aio_complete(acb, ret);
31
}
32
33
-static void qed_aio_write_l2_update_cb(void *opaque, int ret)
34
-{
35
- QEDAIOCB *acb = opaque;
36
- qed_aio_write_l2_update(acb, ret, acb->cur_cluster);
37
-}
38
-
39
-/**
40
- * Flush new data clusters before updating the L2 table
41
- *
42
- * This flush is necessary when a backing file is in use. A crash during an
43
- * allocating write could result in empty clusters in the image. If the write
44
- * only touched a subregion of the cluster, then backing image sectors have
45
- * been lost in the untouched region. The solution is to flush after writing a
46
- * new data cluster and before updating the L2 table.
47
- */
48
-static void qed_aio_write_flush_before_l2_update(void *opaque, int ret)
49
-{
50
- QEDAIOCB *acb = opaque;
51
- BDRVQEDState *s = acb_to_s(acb);
52
-
53
- if (!bdrv_aio_flush(s->bs->file->bs, qed_aio_write_l2_update_cb, opaque)) {
54
- qed_aio_complete(acb, -EIO);
55
- }
56
-}
57
-
58
/**
59
* Write data to the image file
60
*/
61
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_main(void *opaque, int ret)
62
BDRVQEDState *s = acb_to_s(acb);
63
uint64_t offset = acb->cur_cluster +
64
qed_offset_into_cluster(s, acb->cur_pos);
65
- BlockCompletionFunc *next_fn;
66
67
trace_qed_aio_write_main(s, acb, ret, offset, acb->cur_qiov.size);
68
69
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_main(void *opaque, int ret)
70
return;
71
}
72
73
+ BLKDBG_EVENT(s->bs->file, BLKDBG_WRITE_AIO);
74
+ ret = bdrv_pwritev(s->bs->file, offset, &acb->cur_qiov);
75
+ if (ret >= 0) {
76
+ ret = 0;
77
+ }
78
+
79
if (acb->find_cluster_ret == QED_CLUSTER_FOUND) {
80
- next_fn = qed_aio_next_io_cb;
81
+ qed_aio_next_io(acb, ret);
82
} else {
83
if (s->bs->backing) {
84
- next_fn = qed_aio_write_flush_before_l2_update;
85
- } else {
86
- next_fn = qed_aio_write_l2_update_cb;
87
+ /*
88
+ * Flush new data clusters before updating the L2 table
89
+ *
90
+ * This flush is necessary when a backing file is in use. A crash
91
+ * during an allocating write could result in empty clusters in the
92
+ * image. If the write only touched a subregion of the cluster,
93
+ * then backing image sectors have been lost in the untouched
94
+ * region. The solution is to flush after writing a new data
95
+ * cluster and before updating the L2 table.
96
+ */
97
+ ret = bdrv_flush(s->bs->file->bs);
98
}
99
+ qed_aio_write_l2_update(acb, ret, acb->cur_cluster);
100
}
101
-
102
- BLKDBG_EVENT(s->bs->file, BLKDBG_WRITE_AIO);
103
- bdrv_aio_writev(s->bs->file, offset / BDRV_SECTOR_SIZE,
104
- &acb->cur_qiov, acb->cur_qiov.size / BDRV_SECTOR_SIZE,
105
- next_fn, acb);
106
}
107
108
/**
109
--
110
1.8.3.1
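The comment introduced above ("flush after writing a new data cluster and before updating the L2 table") states a general crash-consistency rule: newly written data must be stable before any metadata that points at it. A minimal POSIX sketch of that ordering, with an invented file name and invented offsets standing in for the data cluster and the L2 table, so it is an illustration rather than qed's actual on-disk layout:

#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("toy.img", O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    const char data[] = "new data cluster";
    if (pwrite(fd, data, sizeof(data), 4096) < 0 ||   /* 1. write the data */
        fsync(fd) < 0) {                              /* 2. make it durable */
        perror("data write/flush");
        close(fd);
        return 1;
    }

    const char l2[] = "L2 entry -> offset 4096";
    if (pwrite(fd, l2, sizeof(l2), 0) < 0) {          /* 3. only now update metadata */
        perror("metadata write");
        close(fd);
        return 1;
    }

    close(fd);
    return 0;
}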
Deleted patch

qed_commit_l2_update() is unconditionally called at the end of
qed_aio_write_l1_update(). Inline it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/qed.c | 36 ++++++++++++++----------------------
1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -XXX,XX +XXX,XX @@ static void qed_aio_complete(QEDAIOCB *acb, int ret)
}

/**
- * Commit the current L2 table to the cache
+ * Update L1 table with new L2 table offset and write it out
*/
-static void qed_commit_l2_update(void *opaque, int ret)
+static void qed_aio_write_l1_update(void *opaque, int ret)
{
QEDAIOCB *acb = opaque;
BDRVQEDState *s = acb_to_s(acb);
CachedL2Table *l2_table = acb->request.l2_table;
uint64_t l2_offset = l2_table->offset;
+ int index;
+
+ if (ret) {
+ qed_aio_complete(acb, ret);
+ return;
+ }

+ index = qed_l1_index(s, acb->cur_pos);
+ s->l1_table->offsets[index] = l2_table->offset;
+
+ ret = qed_write_l1_table(s, index, 1);
+
+ /* Commit the current L2 table to the cache */
qed_commit_l2_cache_entry(&s->l2_cache, l2_table);

/* This is guaranteed to succeed because we just committed the entry to the
@@ -XXX,XX +XXX,XX @@ static void qed_commit_l2_update(void *opaque, int ret)
qed_aio_next_io(acb, ret);
}

-/**
- * Update L1 table with new L2 table offset and write it out
- */
-static void qed_aio_write_l1_update(void *opaque, int ret)
-{
- QEDAIOCB *acb = opaque;
- BDRVQEDState *s = acb_to_s(acb);
- int index;
-
- if (ret) {
- qed_aio_complete(acb, ret);
- return;
- }
-
- index = qed_l1_index(s, acb->cur_pos);
- s->l1_table->offsets[index] = acb->request.l2_table->offset;
-
- ret = qed_write_l1_table(s, index, 1);
- qed_commit_l2_update(acb, ret);
-}

/**
* Update L2 table with new cluster offsets and write them out
--
1.8.3.1
Deleted patch

Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
just return an error code and let the caller handle it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/qed.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -XXX,XX +XXX,XX @@ static void qed_aio_complete(QEDAIOCB *acb, int ret)
/**
* Update L1 table with new L2 table offset and write it out
*/
-static void qed_aio_write_l1_update(void *opaque, int ret)
+static int qed_aio_write_l1_update(QEDAIOCB *acb)
{
- QEDAIOCB *acb = opaque;
BDRVQEDState *s = acb_to_s(acb);
CachedL2Table *l2_table = acb->request.l2_table;
uint64_t l2_offset = l2_table->offset;
- int index;
-
- if (ret) {
- qed_aio_complete(acb, ret);
- return;
- }
+ int index, ret;

index = qed_l1_index(s, acb->cur_pos);
s->l1_table->offsets[index] = l2_table->offset;
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_l1_update(void *opaque, int ret)
acb->request.l2_table = qed_find_l2_cache_entry(&s->l2_cache, l2_offset);
assert(acb->request.l2_table != NULL);

- qed_aio_next_io(acb, ret);
+ return ret;
}

@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_l2_update(QEDAIOCB *acb, int ret, uint64_t offset)
if (need_alloc) {
/* Write out the whole new L2 table */
ret = qed_write_l2_table(s, &acb->request, 0, s->table_nelems, true);
- qed_aio_write_l1_update(acb, ret);
+ if (ret) {
+ goto err;
+ }
+ ret = qed_aio_write_l1_update(acb);
+ qed_aio_next_io(acb, ret);
+
} else {
/* Write out only the updated part of the L2 table */
ret = qed_write_l2_table(s, &acb->request, index, acb->cur_nclusters,
--
1.8.3.1
Deleted patch
1
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
2
just return an error code and let the caller handle it.
3
1
4
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
5
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
6
---
7
block/qed.c | 43 ++++++++++++++++++++++++++-----------------
8
1 file changed, 26 insertions(+), 17 deletions(-)
9
10
diff --git a/block/qed.c b/block/qed.c
11
index XXXXXXX..XXXXXXX 100644
12
--- a/block/qed.c
13
+++ b/block/qed.c
14
@@ -XXX,XX +XXX,XX @@ static int qed_aio_write_l1_update(QEDAIOCB *acb)
15
/**
16
* Update L2 table with new cluster offsets and write them out
17
*/
18
-static void qed_aio_write_l2_update(QEDAIOCB *acb, int ret, uint64_t offset)
19
+static int qed_aio_write_l2_update(QEDAIOCB *acb, uint64_t offset)
20
{
21
BDRVQEDState *s = acb_to_s(acb);
22
bool need_alloc = acb->find_cluster_ret == QED_CLUSTER_L1;
23
- int index;
24
-
25
- if (ret) {
26
- goto err;
27
- }
28
+ int index, ret;
29
30
if (need_alloc) {
31
qed_unref_l2_cache_entry(acb->request.l2_table);
32
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_l2_update(QEDAIOCB *acb, int ret, uint64_t offset)
33
/* Write out the whole new L2 table */
34
ret = qed_write_l2_table(s, &acb->request, 0, s->table_nelems, true);
35
if (ret) {
36
- goto err;
37
+ return ret;
38
}
39
- ret = qed_aio_write_l1_update(acb);
40
- qed_aio_next_io(acb, ret);
41
-
42
+ return qed_aio_write_l1_update(acb);
43
} else {
44
/* Write out only the updated part of the L2 table */
45
ret = qed_write_l2_table(s, &acb->request, index, acb->cur_nclusters,
46
false);
47
- qed_aio_next_io(acb, ret);
48
+ if (ret) {
49
+ return ret;
50
+ }
51
}
52
- return;
53
-
54
-err:
55
- qed_aio_complete(acb, ret);
56
+ return 0;
57
}
58
59
/**
60
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_main(void *opaque, int ret)
61
*/
62
ret = bdrv_flush(s->bs->file->bs);
63
}
64
- qed_aio_write_l2_update(acb, ret, acb->cur_cluster);
65
+ if (ret) {
66
+ goto err;
67
+ }
68
+ ret = qed_aio_write_l2_update(acb, acb->cur_cluster);
69
+ if (ret) {
70
+ goto err;
71
+ }
72
+ qed_aio_next_io(acb, 0);
73
}
74
+ return;
75
+
76
+err:
77
+ qed_aio_complete(acb, ret);
78
}
79
80
/**
81
@@ -XXX,XX +XXX,XX @@ static void qed_aio_write_zero_cluster(void *opaque, int ret)
82
return;
83
}
84
85
- qed_aio_write_l2_update(acb, 0, 1);
86
+ ret = qed_aio_write_l2_update(acb, 1);
87
+ if (ret < 0) {
88
+ qed_aio_complete(acb, ret);
89
+ return;
90
+ }
91
+ qed_aio_next_io(acb, 0);
92
}
93
94
/**
95
--
96
1.8.3.1
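After this change the allocating write path has a single shape: every step returns 0 or a negative error, and only the top-level function completes the request through one error label. A standalone model of that control flow, with invented step names standing in for the data write, flush and table updates (not the real qed functions):

#include <stdio.h>

/* Toy pipeline, stand-ins for the qed write steps, not QEMU code. */
static int write_data(int fail)      { return fail == 1 ? -5 : 0; }
static int flush_data(int fail)      { return fail == 2 ? -5 : 0; }
static int update_l2_table(int fail) { return fail == 3 ? -5 : 0; }

static void complete(int ret)
{
    printf("request completed with %d\n", ret);
}

/*
 * Each step only reports success or failure; the caller owns the single
 * completion/error path, mirroring the goto err pattern in the patch above.
 */
static void write_request(int fail)
{
    int ret;

    ret = write_data(fail);
    if (ret) {
        goto err;
    }
    ret = flush_data(fail);
    if (ret) {
        goto err;
    }
    ret = update_l2_table(fail);
    if (ret) {
        goto err;
    }
    complete(0);
    return;

err:
    complete(ret);
}

int main(void)
{
    write_request(0);   /* success */
    write_request(3);   /* fails in the metadata update step */
    return 0;
}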
Deleted patch
1
Now that we're running in coroutine context, the ad-hoc serialisation
2
code (which drops a request that has to wait out of coroutine context)
3
can be replaced by a CoQueue.
4
1
5
This means that when we resume a serialised request, it is running in
6
coroutine context again and its I/O isn't blocking any more.
7
8
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
---
11
block/qed.c | 49 +++++++++++++++++--------------------------------
12
block/qed.h | 3 ++-
13
2 files changed, 19 insertions(+), 33 deletions(-)
14
15
diff --git a/block/qed.c b/block/qed.c
16
index XXXXXXX..XXXXXXX 100644
17
--- a/block/qed.c
18
+++ b/block/qed.c
19
@@ -XXX,XX +XXX,XX @@ static void qed_plug_allocating_write_reqs(BDRVQEDState *s)
20
21
static void qed_unplug_allocating_write_reqs(BDRVQEDState *s)
22
{
23
- QEDAIOCB *acb;
24
-
25
assert(s->allocating_write_reqs_plugged);
26
27
s->allocating_write_reqs_plugged = false;
28
-
29
- acb = QSIMPLEQ_FIRST(&s->allocating_write_reqs);
30
- if (acb) {
31
- qed_aio_start_io(acb);
32
- }
33
+ qemu_co_enter_next(&s->allocating_write_reqs);
34
}
35
36
static void qed_clear_need_check(void *opaque, int ret)
37
@@ -XXX,XX +XXX,XX @@ static void qed_need_check_timer_cb(void *opaque)
38
BDRVQEDState *s = opaque;
39
40
/* The timer should only fire when allocating writes have drained */
41
- assert(!QSIMPLEQ_FIRST(&s->allocating_write_reqs));
42
+ assert(!s->allocating_acb);
43
44
trace_qed_need_check_timer_cb(s);
45
46
@@ -XXX,XX +XXX,XX @@ static int bdrv_qed_do_open(BlockDriverState *bs, QDict *options, int flags,
47
int ret;
48
49
s->bs = bs;
50
- QSIMPLEQ_INIT(&s->allocating_write_reqs);
51
+ qemu_co_queue_init(&s->allocating_write_reqs);
52
53
ret = bdrv_pread(bs->file, 0, &le_header, sizeof(le_header));
54
if (ret < 0) {
55
@@ -XXX,XX +XXX,XX @@ static void qed_aio_complete_bh(void *opaque)
56
qed_release(s);
57
}
58
59
-static void qed_resume_alloc_bh(void *opaque)
60
-{
61
- qed_aio_start_io(opaque);
62
-}
63
-
64
static void qed_aio_complete(QEDAIOCB *acb, int ret)
65
{
66
BDRVQEDState *s = acb_to_s(acb);
67
@@ -XXX,XX +XXX,XX @@ static void qed_aio_complete(QEDAIOCB *acb, int ret)
68
* next request in the queue. This ensures that we don't cycle through
69
* requests multiple times but rather finish one at a time completely.
70
*/
71
- if (acb == QSIMPLEQ_FIRST(&s->allocating_write_reqs)) {
72
- QEDAIOCB *next_acb;
73
- QSIMPLEQ_REMOVE_HEAD(&s->allocating_write_reqs, next);
74
- next_acb = QSIMPLEQ_FIRST(&s->allocating_write_reqs);
75
- if (next_acb) {
76
- aio_bh_schedule_oneshot(bdrv_get_aio_context(acb->common.bs),
77
- qed_resume_alloc_bh, next_acb);
78
+ if (acb == s->allocating_acb) {
79
+ s->allocating_acb = NULL;
80
+ if (!qemu_co_queue_empty(&s->allocating_write_reqs)) {
81
+ qemu_co_enter_next(&s->allocating_write_reqs);
82
} else if (s->header.features & QED_F_NEED_CHECK) {
83
qed_start_need_check_timer(s);
84
}
85
@@ -XXX,XX +XXX,XX @@ static int qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
86
int ret;
87
88
/* Cancel timer when the first allocating request comes in */
89
- if (QSIMPLEQ_EMPTY(&s->allocating_write_reqs)) {
90
+ if (s->allocating_acb == NULL) {
91
qed_cancel_need_check_timer(s);
92
}
93
94
/* Freeze this request if another allocating write is in progress */
95
- if (acb != QSIMPLEQ_FIRST(&s->allocating_write_reqs)) {
96
- QSIMPLEQ_INSERT_TAIL(&s->allocating_write_reqs, acb, next);
97
- }
98
- if (acb != QSIMPLEQ_FIRST(&s->allocating_write_reqs) ||
99
- s->allocating_write_reqs_plugged) {
100
- return -EINPROGRESS; /* wait for existing request to finish */
101
+ if (s->allocating_acb != acb || s->allocating_write_reqs_plugged) {
102
+ if (s->allocating_acb != NULL) {
103
+ qemu_co_queue_wait(&s->allocating_write_reqs, NULL);
104
+ assert(s->allocating_acb == NULL);
105
+ }
106
+ s->allocating_acb = acb;
107
+ return -EAGAIN; /* start over with looking up table entries */
108
}
109
110
acb->cur_nclusters = qed_bytes_to_clusters(s,
111
@@ -XXX,XX +XXX,XX @@ static void qed_aio_next_io(QEDAIOCB *acb)
112
ret = qed_aio_read_data(acb, ret, offset, len);
113
}
114
115
- if (ret < 0) {
116
- if (ret != -EINPROGRESS) {
117
- qed_aio_complete(acb, ret);
118
- }
119
+ if (ret < 0 && ret != -EAGAIN) {
120
+ qed_aio_complete(acb, ret);
121
return;
122
}
123
}
124
diff --git a/block/qed.h b/block/qed.h
125
index XXXXXXX..XXXXXXX 100644
126
--- a/block/qed.h
127
+++ b/block/qed.h
128
@@ -XXX,XX +XXX,XX @@ typedef struct {
129
uint32_t l2_mask;
130
131
/* Allocating write request queue */
132
- QSIMPLEQ_HEAD(, QEDAIOCB) allocating_write_reqs;
133
+ QEDAIOCB *allocating_acb;
134
+ CoQueue allocating_write_reqs;
135
bool allocating_write_reqs_plugged;
136
137
/* Periodic flush and clear need check flag */
138
--
139
1.8.3.1
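The CoQueue introduced above serialises allocating writes: one request owns the allocation path, later ones queue up and are woken one at a time. The same idea can be modelled with ordinary threads, shown here as a sketch only; QEMU uses coroutines rather than threads for this, and unlike CoQueue a plain condition variable does not promise FIFO wake-up order. Build with -pthread.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  next_waiter = PTHREAD_COND_INITIALIZER;
static int allocating_busy;

static void *allocating_write(void *arg)
{
    int id = *(int *)arg;

    pthread_mutex_lock(&lock);
    while (allocating_busy) {
        pthread_cond_wait(&next_waiter, &lock);   /* "queue" this request */
    }
    allocating_busy = 1;
    pthread_mutex_unlock(&lock);

    printf("request %d: allocating cluster and updating tables\n", id);
    usleep(1000);                                  /* pretend I/O */

    pthread_mutex_lock(&lock);
    allocating_busy = 0;
    pthread_cond_signal(&next_waiter);             /* resume one waiter */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    int id[4];

    for (int i = 0; i < 4; i++) {
        id[i] = i;
        pthread_create(&t[i], NULL, allocating_write, &id[i]);
    }
    for (int i = 0; i < 4; i++) {
        pthread_join(t[i], NULL);
    }
    return 0;
}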
Old patch

This fixes the last place where we degraded from AIO to actual blocking
synchronous I/O requests. Putting it into a coroutine means that instead
of blocking, the coroutine simply yields while doing I/O.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/qed.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index XXXXXXX..XXXXXXX 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -XXX,XX +XXX,XX @@ static void qed_unplug_allocating_write_reqs(BDRVQEDState *s)
qemu_co_enter_next(&s->allocating_write_reqs);
}

-static void qed_clear_need_check(void *opaque, int ret)
+static void qed_need_check_timer_entry(void *opaque)
{
BDRVQEDState *s = opaque;
+ int ret;

- if (ret) {
+ /* The timer should only fire when allocating writes have drained */
+ assert(!s->allocating_acb);
+
+ trace_qed_need_check_timer_cb(s);
+
+ qed_acquire(s);
+ qed_plug_allocating_write_reqs(s);
+
+ /* Ensure writes are on disk before clearing flag */
+ ret = bdrv_co_flush(s->bs->file->bs);
+ qed_release(s);
+ if (ret < 0) {
qed_unplug_allocating_write_reqs(s);
return;
}
@@ -XXX,XX +XXX,XX @@ static void qed_clear_need_check(void *opaque, int ret)

qed_unplug_allocating_write_reqs(s);

- ret = bdrv_flush(s->bs);
+ ret = bdrv_co_flush(s->bs);
(void) ret;
}

static void qed_need_check_timer_cb(void *opaque)
{
- BDRVQEDState *s = opaque;
-
- /* The timer should only fire when allocating writes have drained */
- assert(!s->allocating_acb);
-
- trace_qed_need_check_timer_cb(s);
-
- qed_acquire(s);
- qed_plug_allocating_write_reqs(s);
-
- /* Ensure writes are on disk before clearing flag */
- bdrv_aio_flush(s->bs->file->bs, qed_clear_need_check, s);
- qed_release(s);
+ Coroutine *co = qemu_coroutine_create(qed_need_check_timer_entry, opaque);
+ qemu_coroutine_enter(co);
}

void qed_acquire(BDRVQEDState *s)
--
1.8.3.1

New patch

The 'name' option for NBD exports is optional. Add a note that the
default for the option is the node name (people could otherwise expect
that it's the empty string like for qemu-nbd).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210305094856.18964-1-kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
docs/tools/qemu-storage-daemon.rst | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/tools/qemu-storage-daemon.rst b/docs/tools/qemu-storage-daemon.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/tools/qemu-storage-daemon.rst
+++ b/docs/tools/qemu-storage-daemon.rst
@@ -XXX,XX +XXX,XX @@ Standard options:
requests for modifying data (the default is off).

The ``nbd`` export type requires ``--nbd-server`` (see below). ``name`` is
- the NBD export name. ``bitmap`` is the name of a dirty bitmap reachable from
- the block node, so the NBD client can use NBD_OPT_SET_META_CONTEXT with the
+ the NBD export name (if not specified, it defaults to the given
+ ``node-name``). ``bitmap`` is the name of a dirty bitmap reachable from the
+ block node, so the NBD client can use NBD_OPT_SET_META_CONTEXT with the
metadata context name "qemu:dirty-bitmap:BITMAP" to inspect the bitmap.

The ``vhost-user-blk`` export type takes a vhost-user socket address on which
--
2.29.2